Martin Hitz, Gerhard Leitner, Rudolf Melcher (Eds.)
ISCH’08
Interactive Systems for Cultural Heritage
Lakeside Science and Technology Park
Seminar Conference
16 January 2009
Seminar aus Interaktive Systeme
Alpen-Adria-Universität Klagenfurt
Winter Semester 2008/2009
Seminar aus Interaktive Systeme: Format and Schedule
Conference seminars have a tradition in computer science at Alpen-Adria-Universität Klagenfurt. The
Interactive Systems research group has joined this tradition and, over several years, has developed its
own format, which has by now proven itself but is naturally still regarded as a work in progress.
At this point in the development process, the current course format is recorded here so that it is
available as a well-defined starting point for further optimization.
The goal of conference seminars is to familiarize students not only with the primary training in searching for
and reading original scientific literature (generally in English), but also with the process of developing a
contribution to a scientific conference. To this end, several relevant roles are to be tried out over the course
of the semester, with the emphasis on the role of author of one's own (survey) paper.
We start as early in the semester as possible (WS 2008/09: 2 October, 18:00) with a preliminary session in which
the overall topic and the schedule are explained. A short keynote talk is given and a call for papers is
discussed, which, like all teaching materials, is made available on the e-learning platform
(https://elearning.uni-klu.ac.at/moodle/course/view.php?id=2384). The keynote talk is complemented by a set of
introductory papers (original articles, possibly also textbook chapters), which are likewise made available on
the platform. This basic literature is intended to give students an anchor into the relevant literature on the
one hand and a model for the quality level of literature considered relevant on the other. The preliminary
session concludes with an introduction to sources and tools for literature search and with pointers to citation
rules and to good scientific practice in general.
This preliminary session is followed by a literature search phase (WS 2008/09: 3 to 22 October), in which the
students read up on the field and eventually find a thematic niche for their own survey paper. Every paper that
is read and considered relevant must be recorded on the platform, in any case with a complete reference, with
the student's own summary (1-2 paragraphs) pointing out aspects relevant to the conference topic and, if
possible, with a link to the full text of the paper discussed. The resulting literature collection is available
to all participants and is intended to speed up the literature search as a whole (see footnote 1). In this
phase, the weekly plenary sessions of the course serve to present such "literature finds": each participant
takes about 10 minutes to explain the essence of an article they have read and rated as particularly relevant,
with at least one such contribution per person being mandatory (see footnote 2). First ideas about the
students' concrete paper topics emerge from these discussions.
These paper ideas are made concrete in the subsequent conception phase (WS 2008/09: 23 October to 7 November),
during which a preliminary title and an abstract must be worked out, as well as at least three "research
questions" to be answered in the paper. At the same time, the literature work continues (in a targeted manner).
At the end of this phase (WS 2008/09: 13 November), the abstracts are distributed to everyone, and the topics
are coordinated and finalized in a plenary session.
Footnote 1: Since summer semester 2007, a wiki has been set up for this purpose and initialized with a sample
entry. An entry by a seminar participant is shown below. The article can be obtained in full text via the link
in the reference. It is also planned to establish a glossary of important terms via the same wiki (not visible
in the example). Usually between 3 and 15 such entries per person are recorded (mean about 5 per person).
Footnote 2: Because of the discussion that generally takes place, about six such contributions can be
accommodated per 90 minutes.
Next comes the writing phase (WS 2008/09: 14 November to 3 December), in which the first versions of the
papers are produced. The literature work continues. At the end of this phase, all first versions are available
on the platform as PDF files (conforming to the format guidelines).
In a plenary session at the beginning of the review phase (WS 2008/09: 4 to 14 December), two to three
reviewers are assigned to each paper and the criteria for a constructive review are presented, supported by a
real example (a review from the review process of a conference paper by a member of the research group) and a
form for the numerical assessment of a set of standard criteria. Apart from rating these standard criteria, the
student reviewers are asked to give an overall recommendation by classifying the papers assigned to them as
work-in-progress papers ("not quite mature yet") or full papers ("proper publication"). Since this
classification ultimately has an effect (albeit a very weak one) on the author's grade (see footnote 3), this
phase is fairly demanding in terms of group dynamics. The final classification of the papers takes place in a
concluding plenary session (Program Committee Meeting, WS 2008/09: 18 December), in which the individual
reviews (stored on the platform) are presented and discussed by the reviewers.
The finalization phase (WS 2008/09: 19 December to 7 January) serves to revise one's own paper and to produce
the camera-ready copy. In addition to the improved paper, a short statement must be submitted on how the
reviewers' suggestions were addressed. In a plenary session, the individual authors present this statement, and
the reviewers comment on the quality of the revision. Last optimizations to the final version of the paper can
still be made. Finally, the PDF version uploaded to the platform is incorporated by the seminar organizers into
the proceedings (and furnished with a cover page, page numbers, table of contents, headers and footers), of
which a copy is prepared for each participant in advance of the conference date.
The presentation phase (WS 2008/09: 16 January, 8:30-17:30, possibly also over two days) corresponds to the
simulated conference. It typically takes place outside the usual teaching rooms (Lakeside Demoraum, B01) and is
"garnished" with refreshments and drinks during the breaks as well as a joint conference dinner. Presentations
are limited to 20 minutes (full paper) or 15 minutes (work in progress); time for discussion and speaker
changeover is added, so that 30 minutes gross must be budgeted per contribution. The participants are asked to
vote on the three best presentations, and the awarding of the Best Presentation Award concludes the seminar.
Prepared Materials
The platform is initialized by the seminar organizers with the following artifacts:
• Strict templates for LaTeX and Word (based on the ACM SIGCHI publication templates)
• Sources for literature search (in particular digital libraries, Google Scholar, CiteSeer…)
• Basic literature
• Sample survey paper (generally from ACM Computing Surveys)
• Sample entry in the literature wiki
• Sample reviewer feedback on a real conference paper
• Form (Excel) for the numerical assessment of standard criteria
Experiences 2007/08
In the past winter semester 2007/08, 13 students enrolled in the seminar. During the semester, three of the
students withdrew because of overload; the remaining ten passed (2 × A, 3 × B, 5 × C). Because the course
already ends in mid-January, students can concentrate fully on completing their other courses at the end of the
semester.
Klagenfurt, 12 January 2009
Martin Hitz
Footnote 3: The standard value (= grade for a proper performance) is 2 for full papers and 3 for
work-in-progress papers. Particularly good overall performance can improve this standard value by one grade,
and likewise it can be worsened by one grade, which yields a grading interval of 1-3 or 2-4, respectively. The
grade 5 is awarded only if necessary conditions are not met.
Contents
Sustainability of Cultural Artifacts
Anton Pum
Long Time Archiving of Digital Data in Context of Digital Preservation of Cultural Heritage
Sudheer Karumanchi
Data Management of Cultural Artifacts
Andreas Stiglitz
Copyrights, Storage & Retrieval of Digital Cultural Artifacts
Carmen Volina
Libraries in the digital age
Augmentation of Cultural Objects
Bonifaz Kaufmann
Using Narrative Augmented Reality Outdoor Games in Order to Attract Cultural Heritage Sites
BS Reddy Jaggavarapu
Personalized Touring with Augmented Reality
Claus Liebenberger
Augmented reality telescopes
Guntram Kircher
Interactive museum guides: Identification and recognition techniques of objects and images
Digitalization Techniques
Manfred A. Heimgartner
Photorealistic vs. non-photorealistic rendering in AR applications
Helmut Kropfberger
Digitalizing Intangible Cultural Artefacts
Stefan Melanscheg
Technologien zur digitalen Aufbereitung historischer Bauwerke und Denkmäler
Christian Blackert
Digital Scanning Techniques and Their Utility for Mobile Augmented Reality Applications
Interacting with Digital Environments
Daniel Finke
Möglichkeiten einer digitalen Umwelt
René Scheidenberger
Psychological Aspects of User Experience in Virtual Environments
Simon Urabl
Indoor Tracking Techniques
Lakeside Science & Technologie Park, 16.1.2009
ISCH'09 - Interactive Systems for Cultural Heritage
Long Time Archiving of Digital Data in Context of Digital
Preservation of Cultural Heritage
Anton Pum
Alpen-Adria Universität Klagenfurt
9020 Klagenfurt
apum@edu.uni-klu.ac.at
ABSTRACT
Time is an unstoppable factor in human life. Every life form grows older as time passes. While it is
(currently) not possible to store human beings for posterity, it is possible for almost every kind of data.
We live in a very fast-paced and dynamic world, where technology evolves faster than one can follow. This
includes new technologies, e.g. new formats for data storage and new ways of obtaining data in the first
place. Sometimes newly developed technologies are so radically new that old data storage becomes obsolete.
Either there are new ways of saving data or even new ways of gathering it, which may lead to entirely new
devices, formats and ways of presenting information.

This development stands in conflict with the wish to keep data for a long time and to have indestructible
data storage that stays consistent over decades, maybe even centuries. This paper shows the challenges of
long-term archiving of digital data and presents existing initiatives and solutions.

Author Keywords
Cultural Preservation, Long Term Archiving, Challenges, Initiatives

INTRODUCTION
This paper deals with the long-term preservation of data, in general and especially in terms of cultural
heritage as the chosen scientific field. In times of fast-paced technological development, choosing a way to
store data persistently is not easy. There are several initiatives that deal with the preservation of
cultural heritage. I want to discuss their efforts together with the different challenges they face. First,
there is a short description of cultural preservation and its benefits. Then the challenges of digital
cultural preservation are discussed, followed by an explanation of the resulting paradox. Afterwards, some
methods and possibilities for effective digital cultural preservation are shown in detail.

CULTURAL PRESERVATION
Cultural preservation can be described as the process of saving all sorts of material of cultural relevance
for the future. This includes selecting the material to be saved and searching for a proper way of long-term
archiving.

In general, archiving digital data is more complicated than archiving printed information because of some
important differences. [1] A printed document is a physical object; a digitally saved one is a logical
object. The physical object is always readable, just by using the human eye, but to read and understand a
logical object (which is only saved on a physical data storage device) it is necessary to have the right
equipment, be it software or hardware. Also, physical objects (especially documents) are usually still
readable and understandable if they have been slightly damaged. Digital data is more likely to be completely
unreadable if it has been damaged in any way.

So it becomes clear that the digital preservation of cultural heritage poses some challenges. The challenge
becomes even greater when it comes to long-term archiving, as discussed later in this paper.
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise,
or republish, to post on servers or to redistribute to lists, requires prior
specific permission and/or a fee.
Seminar aus Interaktive Systeme WS07/08, 18-19 January, 2008,
Klagenfurt, Austria
Copyright 2008 Klagenfurt University
Seminar aus Interaktive Systeme, WS 08/09

BENEFITS
So why should one think about preserving cultural heritage? I want to show the aspects of cultural
preservation that have a positive influence on today’s life. This includes not only digital preservation but
also the care for natural habitats and other “real life” heritage (such as buildings). Several areas benefit
from cultural preservation: [2]
- Area of environment: Significant efforts in cultural heritage preservation have a positive effect on the
  appearance of city centers and also reduce air pollution. This is mainly caused by improvements in the
  planning of local transport systems. Both the cultural heritage and the people in the local area benefit
  from better air conditions.
- Area of education: It helps to stimulate the social and material vitality of areas around museums. Such
  places are generally used to obtain information, not only by school classes but also by other institutions.
  So there are clear improvements in teaching and in access to information.
- Area of construction: Places of cultural heritage are often places of social inclusion at the same time.
  This raises the quality of life of the local inhabitants. Also, when restoring current buildings and other
  objects, cultural goods may serve as a model, and experience gathered by refurbishing an old building, book
  or other item of cultural relevance can be helpful with current objects as well.
- Local economy: The economy is affected in a positive way too, because of greater attractiveness for tourism
  in the respective area. Furthermore, the preservation of cultural heritage is often the origin of newly
  developed technologies and may stimulate small and medium-sized enterprises to develop new markets.
So it should be clear that cultural heritage has great
influence on our everyday life and affects many different
areas and businesses.
WHAT CULTURAL HERITAGE IS PRESERVED?
So the next question may be what kind of cultural heritage is preserved for posterity.
LONG TERM PRESERVATION
Now we turn specifically to digital data. We spend countless hours producing digital information. This
requires much effort and often great investment, both economic and personal, so we do not want to lose the
outcome of all this work. This is why we need to preserve digital data over a long time. The same holds for
cultural heritage that is saved in logical form as data on a storage device. Take, for example, a book from
the Middle Ages that is dangerous to touch because it might fall apart. Another way is needed to keep the
contents of the book persistent, which today means saving the information digitally. So a certain storage
device is chosen to save the data, of course one using the newest technology. But here the problem starts:
how long will this technology be “new”? This leads us to the paradox of long-term archiving.
THE PARADOX
So we need to save our data over a long time. This could mean several decades to hundreds of years, so we
need a way to save digital data permanently and consistently.

Look back over the last 30 years. In this time, several new and influential technologies have been
developed. We had the CD, the DVD, some other storage formats that underwent significant changes (especially
in personal computer hardware) and maybe the most important of all: the internet. Think about how these
technologies influenced people's lives and the way things are done. Look back 60 years, when people in the
computer business (if something like that existed in those days) said that no more than 5 computers would
ever be necessary worldwide (this is a famous misquote attributed to Thomas J. Watson in 1943, although the
author was unable to find any specific citation). As for technology changes, we can assume that there will be
more such massive changes in the coming decades, as there were in the past. So technology changes no matter
what, but we want our data to be persistent over a long time.

This causes an inevitable paradox. [3] On the one hand we want to keep our old data unchanged, and on the
other hand we want to access this data and interact with it using the newest technology and tools.

This means there are several great challenges in this field, which need further discussion.
CHALLENGES
There are some main challenges to list when talking about long-term digital preservation. They concern
cultural heritage preservation, but in general they are valid for all types of digital archiving.

There are generally three different digital preservation requirements for preserved data. Those requirements
are determined by the creation context of the preserved data, that is, whether it was originally digital data
or whether it is data converted from a real-life object. [3]
- The content of the preserved data must be stored consistently. [3] If one accesses the data a year later,
  the same complete and valid information must be returned. This can be a difficult task for dynamic content
  that is, for example, linked to other dynamic content, with both changing over time. Nonetheless,
  consistency must be guaranteed. In the field of cultural heritage, data can be very complex, widespread and
  distributed, which leads to consistency problems.
- The preserved data is stored in a certain format and style, which can be problematic with regard to the
  tools needed to access it. [3] One may not have the proper software or hardware to get information from the
  respective storage device. Standards are generally used to overcome this problem, but even standards become
  outdated over time.
- Sometimes the context in which the data was created is important as well. [3] This always requires further
  research into the circumstances under which the data was generated. For cultural heritage this is very
  important too, for example if one needs to know why a certain ancient book was written or a ship was built.

Lorie presents a typical case of data access in the future. [4] A file (F2000) was stored in the year 2000
on a storage device (M2000). If one wants to obtain the data in the year 2100, the following conditions must
be met: [4]
- F2000 must be physically intact. This requires bit-stream preservation.
- A device must be available to read the bit stream. Of course, this would not be the same device as the one
  used when the data was stored 100 years earlier.
- The bit stream must be correctly interpreted.

Lorie discusses the conditions as follows: [4]
1. The first condition can only be met if the data is periodically copied to a newer storage device, due to
   the finite lifetime of any physical device.
2. The second condition has the same requirement as condition 1: one must copy the data from old media to new
   media so that it can be used by the latest technology. [4] This is also vital to benefit from the
   advantages of the newest technology.
3. The third condition is probably the hardest one to meet. [4] A bit stream must be interpreted in the future
   in the same way as it is today, despite new findings and technology. Several sketches of solutions to this
   problem are discussed later on.

One general problem of digital cultural heritage data is its scale and diversity [5] in terms of connections
to other information, complexity and correct interpretation.

The “Task Force on Archiving of Digital Information” also mentions the problem of the longevity of physical
data storage. [5] Even under the best conditions, any physical medium can be fragile and has a limited shelf
life, so it has to be replaced sooner or later. Technology also becomes obsolete over time because new
systems and technologies replace it. Migration of the data is said to be the better and more general
solution, though it is not always possible to migrate the data in exactly the form in which it existed
before, and it is a time-consuming, costly and much more complex process than simply refreshing the data from
one medium to another of the same kind. Legal and organizational issues are also relevant when thinking about
how to distribute one’s knowledge and data (in terms of price and availability), so that it is neither too
easy nor too hard to obtain.

Margulies et al. say that it is also necessary to administer the metadata of the preserved data. [6]
Metadata must be preserved with the same care as the data it describes. There are two ways of presenting the
data (after it has been found): [6]
- Data-format migration: The data is copied to a storage device of the latest technology and then read by
  current drives.
- Emulation: A new machine is restricted to the old presentation processes.

Those processes are based on the fact that digital data in its essence cannot be read and understood by the
human eye (as would be the case with written paper, books and documents). [6] So a process is needed that
transforms the information into digital data, which can only be understood by machines. The information
(Daten) is digitized (Digitalisierung) by a machine and then saved to a storage device (Speichermedium).

Fig. 1 (modified from [6]): Transformation of information to digital data. Diagram labels: Information,
Data, Machine (Digitalisation), Data, Storage device.

To read and understand this saved information again, machines are needed that are able to interpret the
saved information correctly. The machine grants access to the storage device, and the user of the archive
(Archivbenutzer) can then understand the data correctly, as it was saved.
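
Lorie's first and third conditions can be illustrated with a minimal Python sketch. It is not taken from [4]: the in-memory dictionaries standing in for the media M2000 and M2100, the helper names `digest` and `refresh`, and the sample text are all assumptions made for illustration. The sketch copies a bit stream to a newer medium, checks with a checksum that the bits arrived unchanged, and then shows that intact bits are still misread when the interpreting machine assumes the wrong encoding.

```python
import hashlib


def digest(bits: bytes) -> str:
    """Return a SHA-256 fingerprint of a bit stream."""
    return hashlib.sha256(bits).hexdigest()


def refresh(old_medium: dict, new_medium: dict, name: str) -> None:
    """Copy a bit stream to a newer medium and verify that it arrived intact."""
    bits = old_medium[name]
    new_medium[name] = bytes(bits)
    if digest(new_medium[name]) != digest(bits):
        raise IOError(f"bit stream {name!r} was damaged during refresh")


# Hypothetical media, modelled as in-memory dictionaries (an assumption).
m2000 = {"F2000": "16. Jänner 2009".encode("latin-1")}
m2100 = {}
refresh(m2000, m2100, "F2000")

# Condition 1: the bit stream is physically intact after the copy.
bits_intact = digest(m2100["F2000"]) == digest(m2000["F2000"])

# Condition 3: the same bits, read with the documented encoding and with a wrong one.
as_saved = m2100["F2000"].decode("latin-1")
misread = m2100["F2000"].decode("cp437")

print(bits_intact, as_saved, misread)
```

The checksum only guards the bits themselves. The information about how to interpret them (here the encoding name) is metadata that has to be preserved alongside the data.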
Fig. 2 (modified from [6]): Interpretation of digital data. Diagram labels: Storage device, Data, Machine
(Access), Data, Archive User.

This process also shows some properties of digital data: [6] independence from the storage device (the data
can be saved to any kind of storage device), dependence on the machine that interprets the information, and
the possibility of a lossless process. Based on these properties, some promising solutions are possible, as
seen later on.

Chen [3] says, “the solutions, which facilitate migrating digital records across generations of both,
technology and people, must be capable of interfacing with the digital access systems” shown in Figure 3.
So one main task will be to provide proper interfaces that handle the connection between user and machine,
considering both the actual data and the metadata.

Fig. 3 [3]: A digital appraisal, preservation and access model. It is one suggestion that permits migration
of digital recordings across generations of technology and people.

METHODS, TOOLS AND INITIATIVES
As shown in Figure 3, there are some general proposals for solutions regarding long-term archiving of digital
data. Lorie makes an interesting distinction between the preservation of data and the preservation of
programs, including their behavior. [4] Converting the data at every system change may be appropriate for
business data that is needed nearly every day, but for long-term storage of data that may not be accessed for
decades, such conversions are a big burden and not free of risk. On the other hand, emulation of the old
systems can be very expensive (not just in terms of money) and is often overkill: even when one just wants to
see a record again, the whole program functionality of the past is needed. In some cases it might not even be
possible to access the old data via emulation, because the old program only displayed the data in a certain
way but grants no access to it.

In my opinion these are many intolerable drawbacks, so advanced solutions have to be found. Lorie [4] makes a
proposal. This is where the distinction between archiving data and archiving programs becomes relevant.

Data archiving:
In this case the system must be able to extract the data from a bit stream and present it in the way it was
originally saved. [4] To interpret the data correctly, the user is also supported by metadata, which explains
the meaning of the actual data. The logical model for data archiving must be simple in order to minimize the
effort required when one needs metadata to understand the model and the data. It is only used to restore the
data, not to work with it. In his example [4], Lorie uses a Universal Virtual Computer (UVC), which is
explained later.

The data is stored as a bit stream in 2000, represented by Ri. [4] In 2100 the client accesses the data,
which the client sees as a set of data elements that follow a certain data schema Sd. The method is the
decoding algorithm that returns the data to the client in the appropriate form, based on Sd. The metadata,
that is, the description of how to decode the data, is saved in 2000 using a language L. In addition, a
mechanism Ss allows the client to read Sd as if it were data. This mechanism Ss is a schema for reading
schemas and should be simple and intuitive, so that it will not change over a long time and remains known.

Fig. 4: Overall mechanism for data archival [4]

Lorie shows an example of a practical implementation of such a decoding system. The whole idea of the
implementation is based on the UVC machine. [4] It is a computer that exists only virtually; it has the
functionality of a computer, and its definition is so basic that it will endure forever (universal).
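
The data-archiving mechanism (bit stream Ri, data schema Sd, schema-reading mechanism Ss and decoding method) can be made concrete with a small Python sketch. It is a simplified illustration under stated assumptions, not Lorie's UVC: the one-line-per-field schema notation `name:length:kind`, the function names `read_schema` and `decode`, and the sample record are all hypothetical.

```python
def read_schema(sd_text: str) -> list[tuple[str, int, str]]:
    """Ss: the 'schema to read schemas'. One field per line, written as name:length:kind."""
    fields = []
    for line in sd_text.strip().splitlines():
        name, length, kind = line.split(":")
        fields.append((name, int(length), kind))
    return fields


def decode(ri: bytes, sd_text: str) -> dict:
    """The method: turn the bit stream Ri into data elements that follow the schema Sd."""
    record, offset = {}, 0
    for name, length, kind in read_schema(sd_text):
        chunk = ri[offset:offset + length]
        if kind == "int":
            record[name] = int.from_bytes(chunk, "big")
        else:
            record[name] = chunk.decode("ascii").rstrip()
        offset += length
    return record


# Archived in 2000 (hypothetical example): the schema Sd, stored as metadata,
# and the bit stream Ri holding a 12-byte title and a 2-byte year.
SD = """
title:12:text
year:2:int
"""
RI = b"Chronicle   " + (1350).to_bytes(2, "big")

print(decode(RI, SD))
```

Because Sd is itself stored as plain data, a future client only has to know the very simple convention Ss in order to recover both the schema and the record.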
4
The exchange of the data is supported via metadata which
are also part of the data in Distarnet.
My only complaint is, that I’m not sure if even such a UVC
would survive a technology change so drastically that it
could make all other computers on the market obsolete.
Look back a hundret years, no one knew that there will ever
be computers. So what will be in a hundret years, we cannot
imagine now.
Chronopolis
Chronopolis is a project for a preservation model facility
that for long-term support of irreplacable and important
national data collection.[7] It ensures that, (1) standard
reference data sets remain available to provide critical
science reference material; (2) Collections can expand and
evolve over time, as well as withstand evolution in the
underlying technologies and (3) preservation “of last resort”
is available for critical disciplinary and interdisciplinary
digital resources at risk of being lost.
Program Archival
In terms of program archival it is also proposed to use the
UVC approach. It is extended to serve as a program
archival mechanism too. [4] Instead of archiving the UVC
method to decode the data, the whole program, saved on the
actual machine, will be saved together with the UVC code
that emulates the instruction set of the actual machine. So
an emulator is written in the present.
The project reacts on many calls from different experts that
preservation must be done now not some time in the
future.[7] The preservation architecture of Chronopolis
includes a digital library system, data grid technology, a
research and development laboratory and an administration
and policy effort, as shown in figure 6.
To keep it possible to adapt the data to whatever output
device there will be in a hundret years, it is a proposal to
keep the data structure produced by an output processor
simple and well documented.[4] In this way it will be easy
to write a mapping for the output data according to the
actual output device of the future.
Distarnet
Margulies [6] provides a sketch for a solution based on the
processes of migration and emulation. They say that one
must eliminate as many costly factors out of the process as
possible to make it efficient. This is solved via
automatization.
Distarnet is a project with the goal for a development of a
communicationsprotocol for a distributed and automated
archiving system. [6] They have a process model to assure
that data keeps unchanged and consistent, which is of great
use when building a distributed system. A P2P networl is
used to communicate, where every node saves a certain
amount of redundant information. The nodes communicate
with each other via internet technologies.
Fig. 6 Chronopolis components
Moore et.al. mentions in [7] that preservation environments
have to support the following set of archival processes:
Fig. 5: Distarnet node processes
Figure 5 shows the processes of one node of the
network. Evaluation steps are repeated continually to
check that the stored information has the right degree of
redundancy. The algorithm can delete, copy and check the
information and initiate the corresponding process.
- appraisal
- accession
- description
- arrangement
- storage
- preservation
- access
The main idea behind Chronopolis is the federation of
several systems to minimize the typical risks. So the
preservation facility provides three functionalities: [7]
The solution provides an automated way to migrate data
and to keep the data consistent across a distributed system.
- Core center: supports user access to the digital data; it is
  the primary access resource.
- Replication center: provides mirror copies of
  the digital holdings to guarantee user access if
  the core center is unavailable.
To raise the chances of successfully preserving cultural
heritage data that is stored digitally, we also have to think
about simplifying the way data is saved, including its
metadata, which is indispensable. With the right
combination of saved data and metadata and an effective
way to keep data consistent from system change to system
change, it may be possible to preserve all important data
almost indefinitely.
- Deep archive: this is the most complex functionality. It
  stages all submitted data that comes from
  archivist-controlled processes, allows no
  remote user changes, and handles all changes
  to the data via versions.
REFERENCES
1. Steenbakkers, Johan F. Digital Archiving in the 21st
century: Practice and the national library of the
Netherlands. Library Trends Vol. 54, No. 1 2005, p. 33-56.
Through the combination of a core center and a replication
center, the system can handle the risks of media failure,
natural disasters (if the replication center is geographically
separated from the core center), local operational errors (if
different teams handle different centers) and systemic
system failures (if the centers are built from different
vendors' products). [7]
2. Cassar, May. Evaluating the benefits of cultural heritage
preservation: An overview of international initiatives.
Center for Sustainable Heritage, London 2006.
Finally, the deep archive can handle malicious users.
By evolving the replication center into an autonomous center,
other things can be replicated as well, in particular
namespaces for identifying archivists and storage resources
and for managing access constraints. [7] This leads to
complete independence from the storage technology. This is
one of the key components of such preservation facilities,
as mentioned before.
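How the three functionalities divide the work can be sketched in a few
lines. The sketch below is only an illustration under assumptions: the
classes Center and DeepArchive, the exception CenterUnavailable and the
function read are hypothetical names, not part of Chronopolis as
described in [7].

```python
class CenterUnavailable(Exception):
    """Raised when a (hypothetical) center cannot serve a request."""


class Center:
    """A core or replication center holding object_id -> data."""

    def __init__(self, name, holdings, online=True):
        self.name = name
        self.holdings = dict(holdings)
        self.online = online

    def fetch(self, object_id):
        if not self.online:
            raise CenterUnavailable(self.name)
        return self.holdings[object_id]


class DeepArchive:
    """Append-only store: every change is kept as a new version."""

    def __init__(self):
        self._versions = {}

    def submit(self, object_id, data):
        """Store data as the next version and return its 1-based number."""
        self._versions.setdefault(object_id, []).append(data)
        return len(self._versions[object_id])

    def get(self, object_id, version=None):
        """Return the latest version, or a specific 1-based version."""
        history = self._versions[object_id]
        return history[-1] if version is None else history[version - 1]


def read(object_id, core, replica):
    """Serve from the core center; fall back to the replication center."""
    try:
        return core.fetch(object_id), core.name
    except CenterUnavailable:
        return replica.fetch(object_id), replica.name
```

Because the deep archive never overwrites a version, an unwanted change
can always be undone by reading an earlier version.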
There are other initiatives, such as DANA (Digital
Archive Network for Anthropology), which focuses less on
the long-term preservation of digital data than on the
effective and distributed preservation of, and access to,
cultural heritage objects. [8] Even the PDF format is being
considered as an archiving technology, since it was
originally intended as an independent format
for document representation. [9]
CONCLUSION
Long-term preservation of digital data is very complex in
terms of the distribution of data, the longevity of physical
storage devices, and the consistency and integrity of the data
itself, so that it can be accessed in the same way today and
in the far future. We have seen that some key aspects are
the lossless exchange of data, independence
from the storage device, and the dependency on the output
machine.
3. Chen, Su-Shing. The Paradox of Digital Preservation.
IEEE Computer 34 (3), p. 24-28 (2001).
4. Lorie, Raymond A. Long Term Preservation of Digital
Information, Proceedings of the 1st ACM/IEEE-CS joint
conference on Digital Libraries, p. 346-352, January
2001, Roanoke, Virginia, USA.
5. Hedstrom, M. It’s about time: Research Challenges in
Digital Archiving and Long Term Preservation: Final
Report. The National Science Foundation and the
Library of Congress, August 2003.
6. Margulies, S., Subotic, I., Rosenthaler, L.
Langzeitarchivierung digitaler Daten DISTributed
ARchiving NETwork DISTARNET, IS & T Conference
San Antonio, 2004.
7. Moore, Reagan W., et al. CHRONOPOLIS – Federated
Digital Preservation Across Time and Space. In: Local
to Global Interoperability – Challenges and
Technologies, Sardinia, Italy, p. 171-176, June 2005.
8. Clark, Jeffrey T. et al. Preservation and Access of
Cultural Heritage Objects Through a Digital Archive
Network for Anthropology. In: Proceedings of the
Seventh International Conference on Virtual Systems
and Multimedia. Enhanced Realities: Augmented and
Unplugged, IEEE Computer Society Press (2001), pp.
253-262.
9. Davies, A.N. Adobe PDF use in analytical information
storage and archiving, Spectroscopyeurope, vol. 19, no. 5,
2007, p. 20-25.
Data Management of Cultural Artifacts
Sudheer Karumanchi
skaruman@edu.uni-klu.ac.at
Institut für Informatik-Systeme
Universität Klagenfurt
ABSTRACT
opportunities to reconstruct and preserve cultural and
historical artifacts in new contexts with meaningful
relationships. In the last few years museums have been
exploring the use of virtual environments to make their
content available in digital form. By digitizing, museums
can save space, prevent damage to data, and transmit data easily.
Virtual Reality is a tool to describe, visualize, record and
store data in a visual form that is easier to
understand and interpret, and to work on sites without
interfering with the physical environment. Virtual Reality
applications in archaeology and heritage are frequently
identified with the reconstruction of ancient sites in the
form of 3D models for desktop viewing.
Virtual museums provide simulations of real museums. They
represent a virtual environment that exhibits the physical
museum in a variety of ways over the Internet and gives
access to a wide range of users all over the world. Users
can thus visit museums without restrictions of time, location,
safety or expense. Virtual museums also give users more
information by allowing them to examine the
artifacts. Museums contain recorded information such as
images, paintings, films, sound files, publications, text
documents and other data about historical, scientific or
cultural artifacts. To develop a virtual environment for a
museum, the entire museum collection must be digitized up
front.
We can summarize the objectives of virtual cultural
heritage projects in three words: first, reconstructing the
data (“representation”); second, presenting and visualizing
the virtual environment (“experience”); and third, providing
the ability to gain insights and modify the experience
(“interaction”). Almost all virtual heritage projects aim to
include all three characteristics in their implementations, yet
very few examples exist in which all three are achieved.
This is due to many factors, such as the increasing amount of
heterogeneous data and decreasing storage space.
This paper reviews the challenges that virtual museums
face in making their content available. It
explores the current state of virtual reality for cultural
heritage and discusses the issues involved in using
state-of-the-art interactive virtual environments for learning,
historic research, and entertainment.
Digital technologies cover a wide range and are replacing
earlier alternatives in animation, audio, film, graphics,
television, video, etc. Applications such as databases
and search engines make content more accessible. The World
Wide Web and multimedia provide access to a range of digital
resources and support a variety of learning styles over
the Internet. Many museums run some type of
intranet to provide a dedicated and limited resource that is
functionally similar.
Keywords
Virtual Museums, Virtual Reality, Content Production,
Content Visualization, Content Management.
The design and development of virtual heritage systems is
mainly divided into three stages, namely input data
collection, data processing and data visualization,
which can be called Content Production,
Content Management and Content Visualization [11]. This
paper focuses on the latest technologies for the
digital reconstruction of cultural and historical artifacts,
covering in turn how they are stored, archived, retrieved and
visualized. The following sections briefly discuss the
architecture and the different technologies by which cultural
artifacts are represented in virtual museums.
INTRODUCTION
Cultural heritage is an important resource for any country
from several perspectives, whether it concerns paintings,
historical monuments, sculptures, etc. Virtual Reality is the
simulation of a real or imagined environment that can be
experienced visually. Virtual reconstructions are the most
commonly used applications of virtual reality in cultural
heritage. Virtual environments present exciting
Klagenfurt, Austria
Copyright 2008 Klagenfurt University...$5.00.
PROPOSITION
Figure 1 illustrates the overall architecture, consisting of
Content Production, Content Management and Content
Visualization. Traditionally, most virtual museum systems
use digital technologies for data acquisition, 3D modeling
and refinement to present a virtual environment. With the
advent of digital photography and photogrammetry,
museums can record 2D pictures of artifacts and
complement them with text descriptions. This whole process
may be called Content Production. Technology is advancing
towards 3D in cultural heritage in order to record artifacts
with higher precision and in a more attractive form, replacing
the old rule “do not touch” with “touch and feel”.
RELATED WORK
The Ark of Refugee Heirlooms [12], a cultural database
mainly focused on Greek cultural heritage
Figure 2: A web based database user interface [10]
The Ark is a database containing more than 4000 artifacts
that document the cultural identity of the Greek population
affected by the population exchange between Greece and Turkey.
Figure 1: Overall architecture [11]
The 3D-ArCAD System [13], a multimedia database for
archaeological ceramics and glass artifacts
After successful digitization, the focus shifts to importing
the digitized artifacts into a database, that is, to the
storage and management of digitized artifacts. To make the
information available, several problems with data storage
have to be addressed:
• Intelligent mechanisms to turn the raw data into
  useful information
• Uncertainty, data classification: it is important to
  know the relevant information about an artifact, such as
  when and where it was created.
• Access to data storage
• Developing a user database query system: several
  constraints arise in developing a query system to
  access uncertain information, such as dynamic querying
  with graphical queries (i.e., using GUI widgets) and
  visual querying based on a timeline,
  as artifacts are represented on a timeline.
• And finally, providing a user-friendly environment.
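A timeline query over uncertain dating information can be sketched as
an interval-overlap test. The example below is a minimal illustration,
not the query system of any database cited here; the Artifact record,
its field names and the function query_timeline are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    """Hypothetical record; the date is stored as an uncertainty interval."""
    name: str
    earliest_year: int  # lower bound of the estimated creation date
    latest_year: int    # upper bound of the estimated creation date
    place: str


def query_timeline(artifacts, start_year, end_year, place=None):
    """Return artifacts whose possible creation interval overlaps the query range."""
    hits = []
    for a in artifacts:
        overlaps = a.earliest_year <= end_year and a.latest_year >= start_year
        if overlaps and (place is None or a.place == place):
            hits.append(a)
    # List the most precisely dated artifacts first.
    return sorted(hits, key=lambda a: a.latest_year - a.earliest_year)
```

Storing a date range instead of a single year keeps the uncertainty
explicit, so that a timeline widget can show every artifact that might
fall into the selected period.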
The visualization of digital museum artifacts is
performed through Virtual Reality interfaces [5]. Several
virtual museums make use of existing Web3D technology to
present their artifacts. Some virtual museums use
3D technologies such as VRML or QuickTime VR. Examples
are the SCULPTEUR project, the World Heritage Tour and
several virtual cities, such as Saint Petersburg 300 or Virtual
Rome. Recent Internet trends towards higher
bandwidth and browser plug-ins that handle large datasets and
high-quality rendering have enabled further applications. Some
examples are the 3D of Piazza dei Miracoli and the Pure
Form Museum, which use the XVR technology;
Playing the Past, which uses the Quest3D technology; and
Keops, based on the Virtools technology. Besides web
applications, there are also interesting applications of
real-time 3D graphics to cultural heritage that need dedicated
clients, the most famous being Second Life-based virtual
museums, which are either replicas of real museums or fully
virtual environments with no correspondence to real
institutions.
All of the issues raised above can be met only by Internet
databases that provide a vast amount of information
distributed all over the world. The visualization of cultural
artifacts is discussed in the next section.
Research has increasingly moved towards realistic environments
that resemble real museums, replacing the existing
ones; therefore 3D modeling environments have been developed.
Compared with 2D models, 3D environments give visitors a
better look and feel,
because they actually see the museum. The problem of
capturing 3D objects and information has been overcome with
the advent of the latest tools. Several commercial
interactive software tools for developing accurate 3D
environments are available. The ActiveWorlds system allows
users to design their homes in a 3D environment. Outline3D
allows drawing an interior. The Virtual Exhibitor, a commercial
tool, aims to construct and simulate exhibition items in
3D. QTVR is a good example of the 3D presentation of digital
artifacts in 2D space. Nowadays personal computers with
graphics hardware are also capable of displaying 3D
artifacts. Using these modeling tools, digital artifacts can
be developed easily and placed online. Although the
results are visually realistic, they are always less
detailed than the real artifacts.
Figure 4: Web based visualization of cultural objects [11]
PRESENTATION OF MULTIMEDIA DATABASE
The three-stage system for retrieving information is
shown in the following figure. In the first stage the user is
prompted to use the database search engine to
locate an object matching the desired search criteria.
Figure 3: The user interface of the 3D-ArCAD database
system [10]
The final step in data visualization is placing the 3D-modeled
artifacts on the web, so that visitors can view the
cultural heritage content remotely (over the Internet),
locally (in museums, e.g. on a touch screen display) or in
another suitable environment (interfaces). The main question
here is how visitors will search for these 3D artifacts. To
make the 3D artifacts accessible via the World Wide Web with
realistic appearance and good navigation performance, 3D web
modeling languages are available, in particular VRML [4]. The
Virtual Reality Modeling Language is a standard file format
for representing interactive 3D worlds and objects. X3D is the
successor to VRML, an XML-based file format used to
represent 3D computer graphics. XSL (Extensible Stylesheet
Language) style sheets can be used to present the
content in different visualization modes and styles
for different visitors. The basic problem in presenting
artifacts on the web is file size, which affects
loading time and navigation. In addition, the ease of
navigation depends on the complexity of the 3D objects.
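Because X3D is XML-based, a scene can be generated with any XML
library. The following sketch builds a minimal X3D document with one
coloured box using Python's standard xml.etree.ElementTree module. The
function name box_scene and its default values are assumptions; a real
artifact would be described by a mesh rather than a Box primitive.

```python
import xml.etree.ElementTree as ET


def box_scene(size=(1.0, 1.0, 1.0), color=(0.8, 0.6, 0.4)):
    """Build a minimal X3D document containing one coloured box."""
    x3d = ET.Element("X3D", {"profile": "Interchange", "version": "3.0"})
    scene = ET.SubElement(x3d, "Scene")
    shape = ET.SubElement(scene, "Shape")
    appearance = ET.SubElement(shape, "Appearance")
    # X3D stores vectors as space-separated numbers in attributes.
    ET.SubElement(appearance, "Material",
                  {"diffuseColor": " ".join(str(c) for c in color)})
    ET.SubElement(shape, "Box", {"size": " ".join(str(s) for s in size)})
    return ET.tostring(x3d, encoding="unicode")


if __name__ == "__main__":
    print(box_scene())
```

Since the result is plain XML, the same document can be passed through
an XSL style sheet to derive different presentations for different
visitors.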
Figure 5: Three stage interaction system procedure [10]
The database generates a report of all matching artifacts and
prompts the user to select one from the list. In the second
stage the user receives the specific artifact to inspect,
together with a 3D-modeled representation in VRML. In the
third stage the user is prompted to interact with the 3D
object by panning, zooming, clicking on the surface, and
using other measurements provided in the database.
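The three stages can be sketched as three small functions. The
catalogue records, field names and function names below are
hypothetical and only illustrate the flow from search, to selection,
to interaction; they do not reproduce the system in [10].

```python
# Hypothetical catalogue records; "model" names a VRML file for the artifact.
CATALOGUE = [
    {"id": 1, "title": "Red-figure vase", "material": "ceramic", "model": "vase.wrl"},
    {"id": 2, "title": "Perfume flask", "material": "glass", "model": "flask.wrl"},
    {"id": 3, "title": "Drinking bowl", "material": "ceramic", "model": "bowl.wrl"},
]


def stage1_search(catalogue, **criteria):
    """Stage 1: return all records matching every search criterion."""
    return [r for r in catalogue
            if all(r.get(k) == v for k, v in criteria.items())]


def stage2_select(results, artifact_id):
    """Stage 2: pick one artifact from the result list for inspection."""
    for record in results:
        if record["id"] == artifact_id:
            return record
    raise KeyError(artifact_id)


def stage3_interact(view, action, amount):
    """Stage 3: return the new view state after a pan or zoom action."""
    new_view = dict(view)
    if action == "zoom":
        new_view["zoom"] = view["zoom"] * amount
    elif action == "pan":
        dx, dy = amount
        x, y = view["pan"]
        new_view["pan"] = (x + dx, y + dy)
    else:
        raise ValueError(f"unknown action: {action}")
    return new_view
```

A session would call stage1_search with the user's criteria, pass the
chosen id to stage2_select, load the referenced model in a viewer, and
then feed each user gesture to stage3_interact.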
Presentation and visualization of cultural artifacts on the
web offers [9]:
• User-friendliness and comprehensibility
• Customization
• Flexible and efficient use
Several issues have to be addressed
in the visualization of cultural artifacts on the web in order
to enhance the virtual museum experience, mainly:
• Consistency
• Standards of GUI interfaces
• Navigation
• Functionality and quality of graphic elements
• Documentation – virtual museums must address
  users in user-friendly language and help
  inexperienced visitors to find their way around the museum
  with documentation in simple vocabulary.
• Access rights – virtual museum systems must
  describe clearly the privileges and access rights that
  users have for accessing the
  information and inventories provided in
  museums. Rights for administrators and users
  differ in accessing the artifacts. The virtual
  museums system must
REFERENCES
1. ARCO (Augmented Representation of Cultural
Objects). http://www.arco-web.org/
• Visibility – virtual museums must focus on how to
  communicate the information to the user.
• Searching – helps users to find exactly
  what they are looking for. A search engine works
  with key words or with an index.
  Search engines also help users to mine data
  available in databases and newspapers.
2. ARCOLite: An XML Based System for Building and
Presenting Virtual Museum Exhibitions Using Web3D and
Augmented Reality. In Proceedings of the Theory and
Practice of Computer Graphics 2004 (TPCG'04) - Volume
00 (June 08 - 10, 2004).
TPCG. IEEE Computer Society, Washington, DC, 94-101.
3. Walczak, K. and Cellary, W. Building database
applications of virtual reality with X-VRML. In
Proceedings of the Seventh international Conference on 3D
Web Technology (Tempe, Arizona, USA, February 24 - 28,
2002). Web3D '02. ACM, New York, NY, 111-120.
4. Walczak, K. and Cellary, W. X-VRML - XML Based
Modeling of Virtual Reality. In Proceedings of the 2002
Symposium on Applications and the internet (January 28 -
February 01, 2002). IEEE Computer Society, Washington,
DC, 204-213.
5. Hemminger, B. M., Bolas, G., Carr, D., Jones, P., Schiff,
D., and England, N. Capturing content for virtual museums:
from pieces to exhibits. In Proceedings of the 4th
ACM/IEEE-CS Joint Conference on Digital Libraries
(Tucson, AZ, USA, June 07 - 11, 2004). JCDL '04. ACM,
New York, NY, 379-379.
• Content accessibility – virtual museums must
  provide quick, reliable and easy access to
  information about their digital
  artifacts. For example, some users are interested in
  thumbnails representing the images of objects,
  while others are interested in voice recognition for
  posing queries, which makes participation easier.
6. Reitmayr, G. and Schmalstieg, D. Location based
applications for mobile augmented reality. In Proceedings
of the Fourth Australasian User interface Conference on
User interfaces 2003 - Volume 18 (Adelaide, Australia). R.
Biddle and B. Thomas, Eds. ACM International Conference
Proceeding Series, vol. 36. Australian Computer Society,
Darlinghurst, Australia, 65-73.
• Aesthetic issues – in addition to providing visual
  and auditory effects, virtual
  museums may enable visitors to touch
  and feel the artifacts, offering a
  multi-sensory experience.
7. Hansen, F. A. Ubiquitous annotation systems:
technologies and challenges. In Proceedings of the
Seventeenth Conference on Hypertext and Hypermedia
(Odense, Denmark, August 22 - 25, 2006). HYPERTEXT
'06. ACM, New York, NY, 121-132.
CONCLUSION
This paper has described how virtual museums
represent different cultural artifacts. It presented
different digital technologies and an overall architecture
comprising content production, management and visualization.
It showed how cultural artifacts are presented and how the
presented data is visualized according to the end user's
choices. A multimedia representation for digitized information
was described, and the issues that have to be
addressed in developing it were outlined. The use of
different technologies for modeling and presenting
artifacts in virtual museums was discussed. Finally, a
three-stage interaction system was shown that describes how an
end user can interact with virtual objects over the Internet.
8. H. Chen, C. Chen, 'Metadata Development for Digital
Libraries and Museums – Taiwan’s Experience',
International Conference on Dublin Core and Metadata
Applications 2001.
9. Sylaiou, S., Economou, M., Karoulis, A., and White, M.
2008. The evaluation of ARCO: a lesson in curatorial
competence and intuition with new technology. Comput.
Entertain. 6, 2 (Jul. 2008), 1-18.
10. Tsirliganis, N., Pavlidis, G., Koutsoudis, A., Politou,
E., Tsompanopoulos, A., Stavroglou, K., Chamzas, C.,
NEW WAYS IN DIGITIZATION AND
VISUALIZATION OF CULTURAL OBJECTS,
11. Wojciechowski, R., Walczak, K., White, M., and
Cellary, W. 2004. Building Virtual and Augmented Reality
museum exhibitions. In Proceedings of the Ninth
international Conference on 3D Web Technology
(Monterey, California, April 05 - 08, 2004). Web3D '04.
ACM, New York, NY, 135-144.
12. Politou E., Tsevremes I., Tsompanopoulos A., Pavlidis
G., Kazakis A., Chamzas C., "Ark of Refugee Heirloom" -
A Cultural Heritage Database, EVA 2002: Conference of
Electronic Imaging and the Visual Arts, March 25-29,
2002, Florence, Italy.
13. Tsirliganis N., Pavlidis G., Koutsoudis A.,
Papadopoulou D., Tsompanopoulos A., Stavroglou
K., Loukou Z., Chamzas C., "Archiving 3D Cultural
Objects with Surface Point-Wise Database Information",
First International Symposium on 3D Data Processing
Visualization and Transmission, June 19-21, 2002, Padova,
Italy.
Copyrights, Storage & Retrieval of Digital Cultural Artifacts
Andreas Stiglitz
stige@edu.uni-klu.ac.at
ABSTRACT
This paper describes how access can be provided to
a huge repository that can be used by almost everybody
for viewing and managing digitized cultural artifacts. It
also explains how such a system can be implemented and
how it deals with copyrights. Furthermore, the paper
focuses on different retrieval aspects of such an artifact
repository. Thus, we try to answer the following questions:
Which mechanisms can be used for entering data and
browsing through such a huge database of artifacts? Are
there user-friendly ways to search for and find artifacts in
the repository? How can such a system be maintained, and
how well can copyrights and licenses be handled
in conformity with the law?
Different systems are already in development or in use. But
these systems can’t be implemented without dealing with
legal aspects like the copyright.
Long term digital preservation of digital memory is an
emerging technological and policy issue. To offer access to
the cultural heritage to the whole world through the Internet
the protection and management of copyrights for available
digital artifacts have to be considered.
This paper explains how the problems raised before can be
solved by different methods, technologies and theoretical
approaches.
Keywords
Repository, digital cultural artifacts, retrieval, storage,
browsing mechanisms, maintenance, copyrights, licenses.
COPYRIGHT
Before we can digitize cultural artifacts, there is one aspect
to be considered, which is very important and essential,
because it restricts the digitization itself and the further
usage of the digital content through the Internet and
different other Web applications: the issue of copyright
protection and management [1].
INTRODUCTION
As interest in the preservation of our cultural
heritage has increased during the last few years, electronic
presentation of and access to the huge volumes of relevant
data have become more and more important. Information
technologies now make it possible for specialists and also
for average citizens worldwide to access information and
galleries of cultural artifacts.
Cultural organizations that want to use digital cultural
artifacts have to be aware that technological solutions for
copyright protection and
management already exist, such as watermarking, encryption,
meta-data and digital rights management systems. There are
technical guidelines that describe systems, software and
hardware solutions, and commercial applications that
focus on the copyright protection and management issue.
They advise cultural organizations on
how to choose, among the existing solutions, one that fits
their needs. The next sections describe the interplay
between the components needed to implement such
copyright solutions [1].
If we want to use a huge repository for viewing and
working with data concerning digital cultural artifacts, we
have to consider several important aspects. The
implementation of such a database has to offer easy
retrieval of information for average users. The browsing
mechanisms have to be intelligent enough to be easy to use.
The insertion of new data should also be convenient. Another
requirement is the user-friendliness of the system, and we
have to solve the problems of practical maintenance and
administration of such a repository. Finally, we have to deal
with the management of copyrights and licenses.
DRMS
One of these solutions is the Digital Rights Management
System, which protects the copyright of digital images
through robust watermarking techniques. It uses
multi-bit watermarks, which are embedded in the digital
images that are commercially exploited and delivered to
the buyers [1].
Everybody who needs access to such a repository wants
his or her requirements to be fulfilled. Users can be
curators, exhibitors, critics, and viewers. To meet their
needs, a repository can be implemented as an open-source
Web portal system, providing opportunities for
customization and integration.
CPS
The Copyright Protection Subsystem is an intermediate
layer between an e-commerce application and a digital
artifact library. It protects the copyright of the stored
digital artifacts and is used by the DRMS. It acts like a
black box that creates a watermarked image from the
original digital image. This process automatically produces
watermarked versions of the images whenever a new
original image is stored in the digital library. The
watermark includes a copyright owner id and other information
for copy control, digital signature, tracking of unauthorized
use, and transaction management [1].
Watermarking Algorithm
Watermarking is typically used whenever digital content
needs copyright protection and the data is
available to anyone, including people who know that
specific hidden meta-data exists and want to remove it for
illegal purposes. To counter this, watermarking provides proof
of ownership of the digital data by embedding copyright
statements which, unlike meta-data, cannot be removed easily
[1].
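The idea of embedding an owner id in the image data itself can be
illustrated with a deliberately simple least-significant-bit scheme.
Note that this is a swapped-in toy technique: it is fragile and is
not the robust multi-bit watermarking that [1] refers to. The
function names and the 16-bit id length are assumptions.

```python
def embed_owner_id(pixels, owner_id, bits=16):
    """Return a copy of pixels with owner_id written into the lowest
    bit of the first `bits` pixel values (most significant bit first)."""
    if len(pixels) < bits:
        raise ValueError("image too small for the watermark")
    if not 0 <= owner_id < 2 ** bits:
        raise ValueError("owner_id does not fit into the watermark")
    marked = list(pixels)
    for i in range(bits):
        bit = (owner_id >> (bits - 1 - i)) & 1
        marked[i] = (marked[i] & ~1) | bit  # overwrite the lowest bit only
    return marked


def extract_owner_id(pixels, bits=16):
    """Read the owner id back from the lowest bits of the first pixels."""
    value = 0
    for p in pixels[:bits]:
        value = (value << 1) | (p & 1)
    return value
```

Each pixel value changes by at most one, so the mark is invisible, but
any re-encoding of the image would destroy it. A robust scheme has to
survive such operations, which is why the CPS relies on more elaborate
techniques.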
Effective Licensing Mechanisms
The difference between purchasing and licensing is also
very important for cultural
organizations. Purchasing a copy of a work is the most
common transaction model of the copyright legislative
framework. The process transfers the ownership rights from
the creator to the buyer. Thanks to the copyright directive
it is possible to loan, rent or resell the purchased copy.
Licensing is the restricted transfer of rights to the work for
a limited use under defined and declared circumstances. To
obtain a license, a contract has to be signed that includes the
terms and conditions for using the copy of the work. Licenses
are also used to store information about the terms and
conditions for purchasing and using cultural heritage
artifacts. Licenses have specific advantages, such as easy
implementation, but their use for repositories with digital
content that should be accessible to the public is limited,
because it is not easy to handle a large number of owners.
Furthermore, creating a license can take a long time [1].
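A license in this sense, a restricted transfer of rights for a limited
use under declared circumstances, can be modelled as a small record
plus a check. The sketch below is only an illustration; the License
fields and the function is_permitted are hypothetical and do not
describe the DRMS in [1].

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class License:
    """Hypothetical license record for one digital artifact."""
    licensee: str
    permitted_uses: frozenset  # e.g. {"view", "print"}
    valid_from: date
    valid_until: date


def is_permitted(lic, use, on_day):
    """True if the use is covered by the license and the license is in force."""
    in_force = lic.valid_from <= on_day <= lic.valid_until
    return in_force and use in lic.permitted_uses
```

A purchase, by contrast, would need no such check, because ownership
rights of the copy pass to the buyer.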
The DRMS provides [1]:
- Services for creation, management and long term
preservation of digital cultural artifacts
- Digital management of rights and the copyright of the
content
- Copyright proof and protection of the digital content
through technical means
- Direct and effective mechanism for licensing digital
content
- Added value services for the end user
The effective combination, customization and integration of
these technologies into an information system enable an
effective protection and management of the copyright of the
digital cultural artifacts [1].
Archiving of digital cultural artifacts also needs to be
permanent. But how can it be possible that the wide range
of different formats can be stored long enough? Is it
possible to migrate information across the formats? How
can the growing impact of IT requirements and the
complexity of copyright and other rights in digital artifacts
be handled [2]?
Owners of artifacts are far more aware of their potential value
than a person who wants to see the artifact, and so the
owners often do not want to hand over any rights to the
artifacts. This makes the policing of access and copyright
control an increasingly demanding part of providing public
access to collections of cultural artifacts. If money can be
earned by providing access to or promoting a collection, then
it is obvious that commercial providers want a share of it [2].
Today more and more cultural artifacts are being
digitized. The variety of media and technologies on
which the output appears and by which it is supported means
that we have to be more selective about what is
represented in our cultural archives. We have to think
about more factors than just the content, for example
playback technologies, access software, availability of
hardware, long-term preservation issues, supporting
documentation, and difficult copyright and access issues.
These points have to be considered in the final and
complex decision as to which artifacts should be selected for
presentation [2].
The huge amount of materials available, the different
formats used, the need for copying or migrating the data,
the digitization of older artifacts, the need for special
software and hardware to access and copy, the control over
copyright and other ownership related rights and the
increased demand from users for online access lead to high
costs that cannot be paid by one cultural organization alone
[2].
They need permissions for images to be digitized and
placed in online repositories but how do they reassure
copyright owners that their images will not be downloaded
and used for purposes they may not agree with? And how
can the organizations protect themselves if this happens
[2]?
So, what are the challenges they have to face in the near
future? The digitization of existing cultural artifacts is a
slow, on-going process that will concentrate on materials
out of copyright control or small repositories for specialists.
Digitizing the artifacts will be more difficult and cost more
money because the digital repositories have to be
maintained for many years [2].
Anti-Copyright
A constant concern of cultural producers about
the anti-copyright movement, which advocates the
use of creative work without copyright and
ownership, is how they would still benefit from their
work [5].
Digital Repositories need to support [3]:
- Metadata for the management of data about the digital
artifacts description and copyright
- Global standards for the unique identification of the
digital images
- Watermarking for the management and protection of
copyrights, as discussed before
The oldest argument for why information should not be
privatized is that experimentation and invention would be
hindered by a lack of access to the building blocks of culture.
If cultural artifacts (images or language) are privatized, they
become cultural capital, which, like any other form of
capital, leads to hierarchical social classes.
Privatization of culture is a process that
stabilizes opinions within ideological frameworks in order to
preserve the existing situation, the status quo [5].
An important metadata set relates to intellectual
property rights and includes information about the
copyright, the creator and right holder, the restrictions of
use, and contact information [3].
Intellectual Property and Ownership
Issues of ownership are also becoming more important and
complex, because of existing restrictions on the appropriate
use of content in the repositories and because cultural
organizations need to develop protocols for the creative
ownership of digitized artifacts. Copyright laws are
difficult to understand because several parties can be
owners. In the non-profit sector this often leads to problems,
because museums often have different policies for the sale
of artifacts and for leasing them to commercial entities [4].
Privatizing cultural artifacts is part of the market, and
participating in privatization is a sellout to market demands.
Individual cultural producers worry that unauthorized
duplication will deprive them of income from their work.
According to [5], this fear is justified only once an artist
has been transformed into an institution, which can produce
further works only by earning enough money.
Copyright encourages creativity and innovation by
ensuring that artists benefit financially from their
ideas for a finite period of time before these ideas join
the unprotected commons. But copyright protections can
also limit creativity by being too rigid, because
innovation is often inspired by existing creative
work [4].
For example, Elvis was transformed from an individual into
an institution. Elvis is now the merchandising of his videos,
films, records and all kinds of other goods. Elvis as an
individual is so irrelevant that even his death does not
stop the existence of the product Elvis. Celebrities in
whatever cultural field are not people as we see them;
they are institutions that need copyright to protect their
income. For those who are still individual producers,
copyright is not necessary; often it is even harmful to
them [5].
One way to avoid these restrictions while keeping
copyright is to use peer-to-peer networks as an alternative
to unrestricted file sharing. An institution takes on the
role of a leader and encourages others to join the
community and create content, while providing mutually
beneficial conditions of fair use for content. This
encourages innovation and gives access to a huge audience,
and cultural institutions can provide the infrastructure to
develop communities [4].
STORAGE & RETRIEVAL
After discussing the problems and different aspects of the
copyright issue, this paper now turns to the storage and
retrieval of copyright-protected cultural artifacts in
repositories and presents different approaches as
solutions [6].
A more general way to protect intellectual property
is Creative Commons. It allows copyright holders to
grant some of their rights to the public while retaining
others through a variety of licensing and contract schemes,
including dedication to the public domain or open content
licensing terms. For example, the British Broadcasting
Corporation's Creative Archive uses this approach. But
there are also some problems with this approach, including
the view, raised in the cultural debate, that
intellectual property decisions have a cultural dimension.
The question still to be solved is how to provide
access to the content so that users can use existing
works and create new works in compliance with the
intellectual property rules of Creative Commons [4].
DAM
The first approach is a Digital Asset Management (DAM)
system, which provides the means for storage, annotation,
retrieval and reuse of any multimedia data such as 3D
objects, images, sound, video and text. The most important
aspects of the system are the efficient combination of
semantic aspects with multimedia data and the retrieval
engine, which combines the newest 3D content-based
algorithms with semantic-driven relevance feedback
methods [6].
Some organizations have already started adopting Digital
Asset Management software solutions for secure and
efficient storage, search and retrieval of digital
artifacts/assets. The main features of DAM systems are
efficient storage, a quick and accurate search and
retrieval function, and the reusability of digital
artifacts. Metadata plays a very important role in the
management of non-text-based assets [6].
The retrieval system offers a refinement through relevance
feedback methods. By marking the relevant results of a
query, the system delivers more relevant artifacts, which are
similar to the marked artifacts and are not included in the
first result [6].
One of the DAM approaches is dedicated to television
production companies and focuses on metadata annotation,
retrieval and exchange. A study was conducted to test the
effectiveness of media asset management and to show the
possibilities and requirements for using a specific file
format for media exchange [6].
The search is done through user-defined aspects of the
semantic network, which means that users can give every
artifact a specific meaning that can be compared to another,
which leads to good retrieval performance. A search
through the Google search engine is also possible, which
extends the search capabilities [6].
The features of every commercial DAM system are
efficient storage capabilities, where every digital artifact
is stored along with its metadata in an appropriate database,
a search and retrieval module, and a security module for
preventing unauthorized access to the digital assets. The
digital assets can also be converted to many formats [6].
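The idea of storing each artifact together with its metadata and retrieving it by metadata can be sketched with Python's built-in `sqlite3` module. This is a toy illustration, not the design of any DAM product described in [6]: the table layout, the function names and the sample records are assumptions, and the security and format-conversion modules are left out.

```python
import json
import sqlite3
from typing import List


def open_store() -> sqlite3.Connection:
    """Create an in-memory store; a real system would use a persistent database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE asset (id INTEGER PRIMARY KEY, media_type TEXT, "
        "payload BLOB, metadata TEXT)"
    )
    return conn


def store_asset(conn: sqlite3.Connection, media_type: str,
                payload: bytes, metadata: dict) -> int:
    """Store the artifact together with its metadata and return its id."""
    cur = conn.execute(
        "INSERT INTO asset (media_type, payload, metadata) VALUES (?, ?, ?)",
        (media_type, payload, json.dumps(metadata)),
    )
    conn.commit()
    return cur.lastrowid


def search_assets(conn: sqlite3.Connection, field: str, value: str) -> List[int]:
    """Return the ids of assets whose metadata field equals the given value."""
    rows = conn.execute("SELECT id, metadata FROM asset ORDER BY id").fetchall()
    return [row_id for row_id, meta in rows if json.loads(meta).get(field) == value]


# Hypothetical sample data.
conn = open_store()
vase = store_asset(conn, "3d-object", b"mesh-bytes", {"title": "Vase", "period": "Roman"})
coin = store_asset(conn, "image", b"image-bytes", {"title": "Coin", "period": "Roman"})
print(search_assets(conn, "period", "Roman"))
```

Keeping the metadata next to the binary payload is what makes non-text assets searchable at all; the linear scan in `search_assets` is only acceptable for a sketch of this size.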
The greatest improvement in retrieval adequacy is achieved
through relevance feedback algorithms. The user
becomes an active part of the retrieval system: the user
enters a query and scores the relevant and non-relevant
results. The system then refines the search and delivers a
result with artifacts that correspond more closely to what
the user searched for.
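The feedback loop described above can be illustrated with the classic Rocchio update, which moves the query towards results marked relevant and away from those marked non-relevant. The source does not say which relevance feedback algorithm the DAM system uses, so Rocchio is swapped in here purely as a well-known example; the function name, the weights and the toy term vectors are assumptions.

```python
from collections import defaultdict
from typing import Dict, List

Vector = Dict[str, float]  # term -> weight


def refine_query(query: Vector, relevant: List[Vector], non_relevant: List[Vector],
                 alpha: float = 1.0, beta: float = 0.75, gamma: float = 0.15) -> Vector:
    """Rocchio update: alpha*query + beta*mean(relevant) - gamma*mean(non_relevant)."""
    refined = defaultdict(float)
    for term, weight in query.items():
        refined[term] += alpha * weight
    for docs, factor in ((relevant, beta), (non_relevant, -gamma)):
        for doc in docs:
            for term, weight in doc.items():
                refined[term] += factor * weight / len(docs)
    # Terms whose weight is not positive are dropped, a common simplification.
    return {term: weight for term, weight in refined.items() if weight > 0}


# Hypothetical feedback: one result marked relevant, one marked non-relevant.
query = {"vase": 1.0}
relevant = [{"vase": 1.0, "roman": 1.0}]
non_relevant = [{"vase": 1.0, "plastic": 1.0}]
refined = refine_query(query, relevant, non_relevant)
print(sorted(refined))
```

In this toy run the refined query gains the term "roman" from the relevant result, while "plastic", which only occurs in the non-relevant result, is dropped.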
The Digital Asset Management system provides
modularity, safety and customizability. It offers high
retrieval accuracy and is able to store and retrieve any kind
of multimedia data, such as 3D objects, images, sound, video
and text. It is also compatible with many different
database and ontology standards [6].
The prototype of the DAM system uses a conventional database
concept with content-based retrieval (CBR) methods that
are ontology compatible, which means it is possible to
search for and to compare similar content, and it supports
relevance feedback (RF) algorithms. Furthermore, there is
the option of exploiting an innovative content-free retrieval
(CFR) algorithm. Content-free retrieval algorithms could
soon become the state of the art in information retrieval. By
combining the three kinds of algorithm above (CBR, CFR and
RF), the system makes it possible to retrieve geometrically
similar 3D objects and also to refine the results by
different variations. Furthermore, the system supports
thesaurus, text and Web search techniques to reduce
irrelevant results in a query and to modify the results
according to the needs of the users [6].
An advantage of the DAM system is that information
retrieval can be accomplished by comparing multimedia
data, by means of text/thesaurus and documentation data
queries, and even by a combination of them. The proposed
system can handle every known type of multimedia in order
to store, annotate and retrieve the relevant data,
although the content-based and content-free retrieval
methods are 3D-oriented only. The system is very
customizable to fit the needs of diverse application
fields. It is highly generic and delivers specialized
expert support for 3D-data retrieval [6].
Multimedia storage, annotation, and retrieval are possible
through classical text-based and thesaurus search. In
addition, advanced retrieval algorithms for 3D-object
retrieval are available, such as content-based and
semantic-based methods and relevance feedback algorithms.
The working prototype uses a client-server model to enable
Web and network support, which allows access to the
database through a network [6].
COLLATE
The EU-funded project COLLATE (Collaboratory for
Annotation, Indexing and Retrieval of Digitized Historical
Archive Material) started in fall 2000 and ran for three
years. An international team worked together to develop a
new type of collaboratory in the domain of cultural
heritage. The system offers access to a digital repository of
historic text archive material documenting film censorship
practices for several thousand European films from the
1920s and 1930s. For the most important films it
provides enriched context documentation including selected
press articles, film advertising material, digitized photos
and film fragments. Major film archives from Germany,
Austria, and the Czech Republic deliver the sources and
work with the system as pilot users [7].
The system exploits the user-generated metadata and
annotations by using advanced XML-based content
management and retrieval methods. The final version of the
online repository should offer the possibility to integrate
cutting-edge document pre-processing and management
facilities, XML-based document handling and
semi-automatic segmentation, categorization and indexing of
digitized text documents and pictorial material, which
could be implemented only partially. Combining results from
the manual and automatic indexing procedures, elaborate
content- and context-based information retrieval
mechanisms can be applied. An XML-based content
manager is responsible for the integration of knowledge
processing methodology and retrieval functionality in the
system [7].
The following modules are part of the COLLATE system
[7]:
- Three document pre-processing modules: digital
watermarking of the documents (copyright and integrity
watermarks); intelligent, automatic document structure
analysis and classification; and automatic, concept-based
picture indexing and retrieval.
- A distributed multimedia data repository comprising
digitised text material, pictorial material like photos and
posters, and digital video fragments.
- Tools for the representation and management of the
metadata, the XML-based content manager incorporating
an ontology manager and a retrieval engine.
- A collaborative task manager for complex individual and
collaborative tasks, such as indexing, annotation,
comparison, interlinking and information retrieval,
including tools for online communication and collaborative
discourse between the domain experts and other system
users.
- The Web-based user interface of COLLATE, comprising
several workspaces for different tasks performed by
distributed user groups and user types, allowing for
different access rights and offered interface functions. The
final system version should be generated semi-automatically
by exploiting knowledge from the underlying task model and
the user-specific dialogue history.
Context-based Retrieval of Documents
An appropriate search and retrieval function is an essential
requirement for giving the user community reasonable
access to a cultural digital repository.
The artifacts in the digital repository must be indexed by
content and subject matter to enable an advanced
content- and context-based search. In context-based
retrieval, a document does not stand on its own; its actual
context is considered as well. It is therefore necessary to
know the specific type of an annotation with respect to its
context [7].
The COLLATE system represents a new type of repository
that supports content- and concept-based work with digital
document sources. Innovative task-based interfaces support
professional domain experts in their individual and
collaborative scholarly work, like analyzing, interpreting,
indexing and annotating the sources. The metadata is
managed by the XML-based content manager and an
intelligent content- and context-based retrieval system,
which can ultimately also be used for digital cultural
artifacts [7].
Storage problem of cultural artifacts
An emerging problem of storing cultural artifacts is
managing the amount of data involved. Current,
so-called traditional archiving environments might not be
able to manage, archive and view huge amounts of cultural
artifacts in an appropriate way. New artifacts, called social
media, like blogs or multiplayer online role-playing games,
will in the near future be as important for historians as
the archiving of complete castles, statues or other
historical artifacts is for us now. And maybe the
distributed network systems that are used by these media
can solve the storage problem [8].
The problem of sufficient storage space is very
important. From a technological point of view it can be
solved, because technological progress leads to an
increase in the capacity of storage media [8].
Signs of success
New social artifacts differ from the usual cultural
artifacts. They are connected with a community and are
related to the interactions of the users. They also make it
possible to document our modern culture in great detail.
The most difficult differences from traditional artifacts
are that they are dynamic and always in progress, which
means that it cannot be said definitively whether a final
state will be reached. Furthermore, these artifacts of social
media are connected by one or more huge networks through
the Web. Especially for these newer cultural artifacts it is
difficult to determine the correct storage requirements. The
best condition for storing such amounts of media is that
memory is getting cheaper and transmission speeds are
getting faster every day. The technology of distributed
networks and peer-to-peer applications may be the key to
solving the storage problem by building a huge
repository for storing the artifacts and accessing them
globally [8].
To imagine how the concept for the repository could work,
it can be compared with Napster, the application for music
file exchange. As is needed for the storage of cultural
artifacts, Napster broke the limits of physical capacity
by digitizing an enormous amount of music, storing it and
distributing it with the help of its users. The network of
users was used to distribute the tasks of digitization,
storage and access, so that the costs and the effort
were minimal. This type of decentralized network is the
concept behind all social media and could be the solution
for building a decentralized memory institution for digital
cultural artifacts. It would include redundancies, checks,
and quality controls like those used in similar existing
applications, which are already working examples of this
concept [8].
Such an archive for cultural artifacts requires new policies.
Digital networks and their communities pose new
challenges to producers of cultural heritage and to the
archives containing the artifacts. New producers and new
users are also a result, as are new content and new
collaborators. But they also deliver solutions, like storage,
distribution and dynamic capture, which could be used to
improve the capacities and logics of the archive. Combining
these solutions with current ones can improve the
traditional repositories for digital cultural artifacts [8].
Costs
A current approach to selecting and preserving cultural
heritage is for every individual repository to be
responsible for itself and to bear the costs itself,
including the costs for acquiring the digital artifact, for
converting it to a manageable standard format, and for
storing and maintaining that artifact over a sufficiently
long period. The costs, furthermore, include the
determination of technical requirements for the
maintenance of the artifact and the strategies for
representing, migrating and emulating the artifact [8].
Centrally managed, network-accessible storage costs less
than locally managed institutional storage of cultural
artifacts. It gives more users access, and local
institutions save money, which allows them to
preserve their infrastructure [8].
Today the technologies of digital scanning and the costs of
storage have reached the point where every national
library can afford the digital capture of all its print media
[8].
The amount of photographs, films, and unpublished
materials is a greater challenge given today's possibilities.
We cannot estimate the correct storage requirements, and
the current digital formats are unsatisfactory. In the near
future this will change, and capturing images and film
through digital scanning will be the next step after
capturing print and sound media successfully. However,
every kind of artifact needs a storage database for extra
metadata, which is necessary for building such a digital
collection [8].
Not only the content in digital form is necessary, but
also appropriate permissions for presenting it. In the best
case no further costs arise beyond providing appropriate
storage space and permissions [8].
Digital artifacts that are in the public domain, or artifacts
whose owners want to make them available under certain
conditions, are stored and maintained by the Internet
Archive, which is a trusted institution that preserves and
provides access to digital heritage. Libraries can contribute
digitized textual resources to build a huge collective
repository, which could not be built by one single library.
Publishers and authors can donate their works if the
copyright has not expired and if they are willing to allow
copies to be made under certain circumstances. This results
in wide access to and preservation of their work [8].
Key aspects of preservation
Preserving the digital heritage is difficult. In the current
situation, three important dimensions of preservation
need to be defined: functions, responsibilities, and funding.
Functions are the different activities that are necessary to
preserve digital cultural artifacts. The main functional
categories are the selection and storage of data in
repositories and archives, maintenance such as conversion,
migration and emulation, and service provision such as
indexing and access services [8].
The selection and storage of cultural heritage is the main
responsibility of institutions that are supported by the
government. Leaving the responsibility with the originators
and copyright holders was also discussed. Several reasons
speak against this solution: once a copyright expires, a
collection may become fragmented, and the costs may be too
high to guarantee continued maintenance and access. This
is why data collection and maintenance should remain
part of the responsibility of such government-funded
institutions. But the responsibility for indexing and access
could be moved to the private sector. As a result, the
information industry could develop products
and services based on repositories for cultural artifacts. The
users would become responsible for indexing, access and
other services by paying for these specific products
and services. It is also possible to implement a cultural
heritage repository by shifting the responsibility from the
government to a more market-driven model, where the
users are responsible for funding all aspects of heritage
preservation [8].
Preservation of cultural heritage is the storage and
maintenance of digital artifacts and the capturing of
dynamic processes and patterns of use. To preserve the
digital cultural heritage, we need to distribute the
responsibility between the public and the private sectors,
involving the information industry. This can lead to the
implementation of so-called digital heritage repositories [8].
CONCLUSION
In conclusion, repositories for digital cultural artifacts
can be implemented with different methods to achieve
user-friendly and powerful storage and retrieval of the
data. The key is to find or invent the approach that fits
the intended use of such repositories. Even if this is done
well, the issue of copyright also has to be solved
sufficiently. This leads to good preservation of the
cultural heritage of past, present and future societies.
REFERENCES
1. Tsolis, D.K., Sioutas, S. and Papatheodorou, T.
Copyright protection and management of digital cultural
objects and data. The annual conference of the
International Documentation Committee of the
International Council of Museums 2008, CIDOC 2008
Athens (2008), 1-11.
2. Pymm, B., Dr. Keeping the culture: archiving and the
21st century. VALA2000, VALA2000 (2000), 1-10.
3. Tsolis, D.K., Tsolis, G.K., Karatzas, E.G. and
Papatheodorou, T. Copyright Protection and
Management and A Web Based Library for Digital
Images of the Hellenic Cultural Heritage. Association
for Computing Machinery 2002, ACM (2002), 53-60.
4. Russo, A. and Watkins, J. Digital Cultural
Communication: Enabling new media and co-creation in
South-East Asia. International Journal of Education and
Development using Information and Communication
Technology 2005, IJEDICT Vol. 1 (2005), 4-17.
5. Critical Art Ensemble. The Financial Advantages of
Anti-copyright. Digital Resistance: Exploration in
Tactical Media. Libres Enfants du Savior Numerique
Anthologie du Libre, Autonomedia (2001), 149-151.
6. Onasoglou, E., Katsikas, D., Mademlis, A., Daras, P. and
Strintzis, M.G. A novel prototype for documentation
and retrieval of 3D objects. International Workshop on
Visual and Multimedia Digital Libraries 2007,
VMDL07 (2007), 1-6.
7. Thiel, U., Brocks, H., Dirsch-Weigand, A., Keiper, J.
and Stein, A. A Collaborative Archive Supporting
Research on European Historic Films - the COLLATE
Project. Proceedings of the DLM-Forum 2002, DLM
(2002), 228-235.
8. De Jong, A., Wintermans, V., Abid, A., Uricchio, W.,
Bearman, D. and Mackenzie, J.O. Preserving the digital
heritage. Netherlands National Commission for
UNESCO 2007, UNESCO (2007), 1-64.
“Libraries in the digital age”
Carmen Volina
Alpen-Adria-Universität Klagenfurt
9020 Klagenfurt am Wörthersee
cvolina@edu.uni-klu.ac.at
ABSTRACT
This paper deals with the design of digital libraries in
the context of cultural heritage management. First, the term
“Digital Library”, abbreviated as “DL” in the following,
is explained. Then the challenges which arise
when building a digital library are described. The other
parts of the paper concentrate on state-of-the-art approaches
to displaying digital library collections and making physical
libraries more attractive. Finally, an example is
presented of how visitors can be edutained while visiting an
exhibition.
The paper will demonstrate the challenges which arise
when building DLs. Furthermore, presentation and
browsing techniques of DLs (which already exist globally)
are presented. In the end, an example is shown of how
people can be edutained while they are visiting exhibitions.
Visualization approaches will be shown using two specific
existing projects (I have chosen these two approaches because
I considered them appropriate for our context of
cultural heritage): The Perseus Digital Library and the
InfoGallery, which is a web-based infrastructure for
enriching the physical library space with informative art
“exhibitions” of digital library material and other relevant
information, such as RSS news streams, event
announcements etc.
Author Keywords
Cultural Heritage, Digital Library, Challenges,
Edutainment, Perseus, InfoGallery, PDLib, Cohibit.
Klagenfurt, Austria
Copyright 2008 Klagenfurt University...
INTRODUCTION
New technological advances in computing, networking, and
data processing lead the way into the digital era, where all
kinds of objects in various digital forms are adopted for
creation, preservation, and application.[8] The movement
toward digital heritage has been strongly supported by
increasing interest and resources from government and
academia. Many projects have contributed to the actual
development of digital libraries and museums. Building
digital heritage requires substantial resources in materials,
expertise, tools, and cost.
Definition
The term “Digital Library” can be used in many different
ways.[6] Nowadays a DL is known as a portal for digital
objects, documents and data sources. DLs can consist of
- digital objects themselves,
- metadata of digital and conventional documents,
- access to external data resources (e.g. search
engines, virtual specialized libraries),
- personalized components (e.g. customer
information, stored searches),
- locally offered services (e.g. online tutorials, chat
services, e-mail services) as well as
- community components, which can be used for
communication among the users.
CHALLENGES
Information Seeking Needs
Cultural heritage is a vast domain consisting of museums,
archives, libraries and (non-)governmental institutions.[3]
Searching for information in this domain is often
challenging because the sources are rich and heterogeneous,
combining highly structured, semi-structured and
unstructured information, combining authorized and
unauthorized sources, and combining both text and other
media.
Digital archiving of cultural heritage involves multimedia
documentation and dissemination on selected subjects. The
integrated use of image, video, sound and text provides a
rich context for preserving, learning, and appreciating the
documented subjects.[9] Multimedia materials are powerful
in conveying information and experiences of intellectual
and cultural heritage.
However, the creation, manipulation, and presentation of
multimedia materials require special skills and expertise. In
addition, subject documentation involves selection,
compilation, and interpretation of subject materials.[9]
These activities require subject domain knowledge and
access to material sources. After the subject materials are
acquired, they need to be checked, categorized, annotated,
and organized. In the end, a digital archive is to be built as
an information database system.
These tasks should be done by persons who have proper
training in library science. Those specialists can be grouped
into five types:[9]
1. Subject domain specialists: These persons have
sufficient knowledge of a specific domain and
have access to material sources. They decide
how the subject will be illustrated and
interpret the content of the materials.
2. Digital media specialists: These persons are
familiar with digital media equipment and tools.
Their task is to digitize the physical subject
materials with the required quality and specifications.
3. Data management specialists: These persons have
adequate librarian training. They take over the tasks
of collecting, categorizing and annotating the
subject materials. Additionally, they check
the information on the materials.
4. Graphic interface specialists: These persons
specialize in graphic design and web interfaces.
They create the multimedia presentation
format for the subject materials.
5. Software engineering specialists: These persons
program the software system. The
system includes databases, web page interfaces
and system management.
To perform their daily work, domain experts need to access
and exploit cultural heritage information in its full
richness.[3] Their search tasks are dominated by a range of
different (relatively complex and high level) information
gathering tasks, while the tools tend to be geared towards
support for (relatively simple and low level) fact finding
tasks. Many search tasks require experts to use and combine
results from multiple sources, while the tools typically
provide access to a single source.
The following figure shows a classification of information
task behavior for cultural heritage expert users. It includes
these task categories:[3]
1. Fact Finding: asking goal-oriented and focused
questions. Fact finding questions vary from
simple to very complex. An example of a simple
question could be: “To which tribe/culture does
this object belong?” Complex queries typically
combine several constraints.
2. Information Gathering: carrying out different
search tasks to fulfill a higher goal, e.g. collecting
information to make a decision. The information
gathering task is the main task of the specialists. It
has many sub-tasks:
   a. Comparison of differences and similarities
   between objects or a set of objects.
   b. Relationship search between individual
   pieces of information.
   c. Topic search queries to learn more about an
   object.
   d. Exploration search is done when there is no
   identified source for the queried subject.
   Hence the experts look for related topics for
   suggestions.
   e. Combinations to find out matches among
   pieces of information from different
   sources.
3. Keeping Up-to-date: not goal-driven, just to find
out what is new. The specialists can keep up-to-date
in two different ways:[3]
   a. Active: going to the sources and scanning
   them for changes, e.g. browsing.
   b. Passive: using technology to automatically
   deliver new information from sources, e.g.
   RSS feeds or community mailing lists.
4. Communication: an information exchange task,
e.g. email.
5. Transaction: an information exchange task, e.g.
online auction.
6. Maintenance: involves organizing information, e.g.
updating bookmarks.
Figure 1 [3]: Classification of information task behavior for
expert users
My opinion is that the classification presented covers the
main tasks of cultural heritage expert users. I think that
some more minor tasks need to be done, but all in all
these categories are complete. You have to bear in mind
that cultural heritage is an active tradition. Therefore, the
cultural heritage experts have to actively take part in
A study was carried out by Amin, A. et al. to
find out what the information searching needs of cultural
heritage experts are.[3] The evaluation of the study showed
that experts need to compare, relate and combine pieces of
information manually or ask their colleagues. The tools they
currently use provide insufficient interface support for
formulating the simple and complex questions they have.
Additionally, most experts’
search tasks require information from many different
sources, while their tools tend to support search in only one
source at a time.
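The gap between single-source tools and multi-source search tasks can be made concrete with a small federated search sketch: one query is sent to several sources and the hits are merged, recording which sources returned each hit. The function name and the two in-memory "sources" are hypothetical and stand in for real catalogues; this is not a tool evaluated in the study.

```python
from typing import Callable, Dict, List

Source = Callable[[str], List[str]]  # a source maps a query to a list of hits


def federated_search(query: str, sources: Dict[str, Source]) -> Dict[str, List[str]]:
    """Send one query to every source; map each hit to the sources that returned it."""
    merged: Dict[str, List[str]] = {}
    for name, search in sources.items():
        for hit in search(query):
            merged.setdefault(hit, []).append(name)
    return merged


# Two hypothetical in-memory sources.
museum = {"mask": ["Mask A", "Mask B"]}
archive = {"mask": ["Mask B", "Letter on masks"]}
sources = {
    "museum": lambda q: museum.get(q, []),
    "archive": lambda q: archive.get(q, []),
}
results = federated_search("mask", sources)
print(results)
```

Hits that several sources agree on, such as "Mask B" here, are the ones an expert would otherwise have to match by hand when comparing and combining results.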
Universal Access
Having explained the difficulties specialists face in
information seeking, I would like to point out
another challenge for the digital library itself.[3] Due to
the increasing importance of wireless communication
systems, it is vital for digital libraries to offer a universal
access architecture so that users stay
connected to the network and have access to the digital
library at any time and anywhere (mobile environments).
To provide universal access¹, a system must be designed for
mobile environments. One system which has been designed
for mobile computers is PDLib.[3] This prototype provides
library services to several device types (e.g. desktop, laptop,
PDA and mobile phone) with multiple operating systems
(e.g. Windows, Linux, Mac OS, Palm OS). The system
consists of three layers:
1. Client Tier: this layer includes a variety of devices
with which a user can interact with PDLib.
2. Server Tier: this layer comprises the server system
infrastructure that provides services to clients:
Data Server, Mobile Connection Middleware and
Web Front-end.
3. Interoperability Tier: this layer includes other
(PDLib) data servers.
The devices of the client tier communicate with the server
tier to access PDLib digital library services. The type of
access from the client tier to the server tier varies
according to the client device’s capabilities.
PDLib is an ongoing research and development effort
in which interesting challenges related to digital libraries
and mobile computing are being addressed.
Figure 2 [1]: PDLib Overview
¹ Universal access is known as “facilitating access to
complex multimedia digital library objects that suits the
users' requirements”.[1]
Another approach to providing universal access is proposed
by Adam et al.[1] It includes three components:
1. the Digital Library Object Model
2. the Object Manifestation Model
3. the Object Delivery Model
The approach is based on the self-manifestation of digital
library objects, accomplished by using a component called
an oblet, which is a small piece of software that installs
itself on the client and renders the digital library objects
based on user and system profiles. A Petri net model, which
can model synchronization and fidelity constraints, is used
to represent the objects. I do not want to go into more
detail on this approach. For further information please refer
to the paper of Adam et al.
Usability
Another challenge I would like to discuss is the usability of
digital libraries, which has to be considered from the outset
when building the digital library.[4] From the beginning, the
builders of digital libraries have to bear in mind that the
end users are individuals who have no particular skills in
information retrieval and are accessing library resources
from their own desks, without support from a librarian. The
typical thing users do in digital libraries is to search for
interesting articles. Searching is not just a case of
entering a search term and viewing a list of results. It is
more extended and iterative, i.e. the searching evolves over
a period of time and relies on users being able to follow new
paths as they appear.
People do not just search for items, they also browse for
them. Jones et al. characterize this distinction as follows:[8]
1. Browsing: users navigate information structures to
identify required information.
2. Searching: users specify terms of interest, and
information matching those terms is returned by an
indexing and retrieval system. Users may, in turn,
browse these results in an iterative manner.
3
21
3.
2.
Skimming: users get a quick impression of a text
by searching for and alighting on key words or
sentences.[12] If these passages fit into the context
for what they are looking for they shift into deep
reading.2
To find out how users interact with digital libraries within a
single session, but not necessarily using a single library,
Blandford et al. carried out a study with five users.[4] The
users were asked to find one paper on their own research
topic using their choice of libraries from a given set (ACM
Digital Library – www.acm.org/dl/, IDEAL –
www.ideallibrary.com, NZDL – www.nzdl.org, EBSCO –
www.uk.ebsco.com, Emerald – www.emerald-library.com,
Ingenta – www.ingenta.com).
The most important design issues identified in the study
were:[4]
1.
Familiarity: users need to be able to rapidly
acquire understanding of core library features,
content and structures, if they are to gain a sense of
progress while working with the library.
2.
Blind alleys: many interaction sequences did not
achieve the user’s objectives. This is most obvious
when a search returns “no matches”, but occurs in
some more extended interactions.
3.
Discriminability: forming understandings of the
content and possibilities in a collection relies on
being able to discriminate between possibilities.
4.
Serendipity: finding unexpected interesting results
seems to give users a particular sense of making
progress. Serendipity depends on users being
easily able to identify interesting information,
which is one aspect of discriminability.
5.
Working across boundaries: transition events,
where one agent changes context, can often be a
source of interactional difficulties.
Figure 3 [15]: Model of user acceptance of a digital library
Another classification of usability factors has been made by
Thong et al.[15] They have developed a model of user
acceptance of digital libraries. They describe three
categories of external factors which affect user
acceptance:
1. Interface Characteristics: the interface is the door
through which users access a digital library
   a. Terminology: words, sentences and
   abbreviations used by a system
   b. Screen design: the way information is
   presented on the screen
   c. Navigation: the ease with which users can
   move around the system
2. Organizational Context: the context in which the
digital library operates
   a. Relevance: the match between the system
   content and the individual user’s information
   needs
   b. System accessibility: the ease with which
   people can access specific systems
   c. System visibility: observability, or the degree
   to which the results of an innovation are visible
   and communicable to others
3. Individual Differences: these vary from user to user,
e.g. domain knowledge or computer experience
   a. Computer self-efficacy: individual
   judgment of one’s capability to use a new
   system
   b. Computer experience: exposure to different
   types of applications and familiarity with
   various software packages
   c. Domain knowledge: the user’s knowledge of
   the subject domain
The paper of Thong et al. gives some recommendations on
how the user acceptance of digital libraries can be
increased, e.g. avoiding jargon so that the context is clear to
general users. For further information please have a look
at their paper.
I assume that the classifications of Blandford et al. and
Thong et al., taken together, cover the design issues of
digital libraries. If all of these factors are taken into
account when designing a digital library, user acceptance
should be assured.
Additionally I would like to mention that the study of
Blandford et al. came up with several cases where decisions
taken by computer scientists and librarians have had
unanticipated consequences.[4] Therefore, the specialists
who are building a digital library should be aware that
every decision they make in the design process has
consequences for the usability of the digital library.
2 E.g. XLibris, an “active reading machine”, offers a
“skimming mode” to support the user’s activities.[12]
In this paper I have mentioned only some of the challenges
that digital libraries face. There are of course many
more challenges which have to be met when building
digital libraries. One of them is copyright management,
which is dealt with in the paper of Andreas Stiglitz.[14] To
learn how this challenge can be treated please read his paper.
The next section of the paper deals with the possibilities for
visualizing the collections of digital libraries. These are
illustrated with three specific examples.
VISUALIZATION APPROACHES
Perseus digital library
The Perseus Project is a digital library that has been under
continuous development since 1987.[5] The aim of
Perseus is to bring a wide range of source materials to as
many people as possible.[11] In the beginning the team of
the Perseus Project concentrated only on ancient Greek
culture. Meanwhile they have begun to cover Arabic,
German, Roman and Renaissance materials. Furthermore,
they established an Art & Archeology Artifact Browser
where you can find coins, vases, sculptures, sites, gems and
buildings. In the following I would like to focus on the 3D
Vase Museum, which differs from the Perseus web
site in one major point: on the web site, if you request an
object, the data (texts, graphics, etc.) of the object you
looked at before disappear from view. With the 3D Vase
Museum the project team wanted to achieve a view of the
whole available data while pursuing detailed analysis of a
part of it.
The project team of the Perseus Project used 2D interaction
techniques in the Perseus digital library.[11] With
the 3D Vase Museum the team moved on to 3D
interaction techniques, e.g. direct manipulation3, zooming
user interfaces4 as well as techniques for information
visualization. In the 3D Vase Museum the user can learn in
virtual reality using non-WIMP5 and lightweight
interactions.
Figure 4 [11]: London Collection Vases in thumbnail view
With a browser you can query the digital library of Perseus
for a certain vase and receive a list of links or thumbnails
for any selected subset of the vases.[13] You can then
follow the link or view the photograph(s),
including information, for the selected vase.
The museum is a 3D virtual environment that presents each
vase as a 3D object and places it in a 3D space which looks
like a room within a museum. The user can view the 3D
world and navigate anywhere in it with a VRML6-enabled
web browser or a head-mounted virtual reality display.
Figure 5 [11]: The 3D Vase Museum in eye-level view
In addition to viewing the graphics and moving
around in the room it is possible to bring further textual
information into the view without leaving the context of the
3D room. Secondary information about a vase appears on
the virtual screen when the user navigates to an area of
visual interest, as shown in Figure 6.
3 Direct manipulation describes interactive systems where
the user physically interacts with their operating system;
http://www.cs.umd.edu/class/fall2002/cmsc838s/tichi/dirman.html
4 In zooming user interfaces users can change the scale of
the viewed area in order to see more detail or less;
http://en.wikipedia.org/wiki/Zooming_user_interface
5 WIMP stands for “window, icon, menu, pointing device”
– the style of interaction uses a physical input device to
control the position of a cursor and presents information
organized in windows and represented with icons;
http://www.cc.gatech.edu/classes/cs6751_97_winter/Topics/dialog-wimp/
6 VRML is the Virtual Reality Modelling Language, used to
display 3D scenes; http://www.debacher.de/vrml/vrml.htm
Figure 6 [11]: Secondary information about a vase
The whole 3D Vase Museum approach was evaluated in a
user study.[13] The project team wanted to find out whether
the aim of the project – to increase the speed and accuracy in
learning the general context of the London vase collection –
could be achieved. They wanted to know whether it is
possible to learn as much as possible about a whole
collection as quickly as possible. Therefore they gave some
tasks as homework to students from many different college
courses in archeology and developed ten questions which
covered all areas of the whole vase collection (color,
themes, shapes, etc.).
The evaluation (for details of the procedure please read
the referenced paper) showed a large and significant
difference in speed as well as a significant difference in
accuracy between the 3D Vase Museum and the Perseus
interface.[13] The students who did the tasks with the Vase
Museum performed 33 % better on the tasks and completed
them nearly three times faster than the students who used
the Perseus interface.
Thus the 3D Vase Museum is an example of a solution to
the problem of focus-plus-context in information display, as
the user can focus on an individual vase without losing the
overall context of the collection. After this successful
project the project team would like to develop the 3D Vase
Museum into a fully immersive virtual environment, in which
a user can browse using innate skills such as turning the
head and walking.
InfoGallery
InfoGallery is the second visualization example which I
would like to introduce. It is a web-based
infrastructure for enriching the physical library space with
informative art “exhibitions” of digital library material.[6]
The aim of InfoGallery is to introduce informative art
applications in the physical library space to support
serendipity for digital library resources. This goal arose
because visitors of physical libraries never find out
that the library possesses a lot of digital resources unless
they have a targeted need and ask a librarian or perform a
targeted search. In the beginning the Hybrid Library project
was developed.
An InfoColumn, which can be seen as a digital version of a
poster column where librarians can post announcements
via a web interface, was built.[6] The announcements
appear as animated objects. If a visitor is interested in a
piece of digital material, he or she can place a
Bluetooth-enabled mobile phone on specific locations on the
shelf surrounding the column. Selected references to library
materials are then pushed to the phone via an established
Bluetooth connection. As many librarians asked for this
kind of physical exhibition, a general infrastructure for
web-based informative art, named InfoGallery, was
designed.
Library visitors typically experience InfoGallery on large
flat panels or projection surfaces on walls, floors, or
ceilings.[6] A collection of animated InfoObjects is
available on the screen. The visitors can click or tap on the
touch-sensitive surface to explore a piece of displayed
information in depth. Finally, references to the information
may be dragged to a Bluetooth phone or sent to an email
address supplied by the visitor.
Figure 7 [6]: InfoGallery in use on an InfoColumn
The overall goal of informative art and the InfoGallery
system is to make people discover useful information or
inspirational material.[6] To make this discovery happen, an
InfoGallery needs to draw and maintain the attention of
potential users via its placement, shape and aesthetic
expression. Hence the InfoGallery displays should be
positioned in popular places like:
• Refreshment areas: coffee rooms and printer
rooms, where people relax and have time to watch
the InfoGallery information
• Queues and lines: e.g. when returning books
at a counter
• Entrances and hallways: here many people pass by
(the InfoGallery object should not block traffic)
• Section squares of a library: e.g. newspaper
section, children’s section
• The city: InfoGalleries can be integrated into city
surfaces.
The InfoGallery can be equipped with new issues of digital
periodical subscriptions, e.g. RSS feeds from selected IT
news sites, published papers from local authors, events and
other news from the library. Furthermore, an interaction
technique called iFloor is available and can be used: iFloor
provides an interaction technique based on camera tracking
of its users’ shadows from the ceiling.
Figure 8 [6]: iFloor in use
These two examples show that learning about cultural
artifacts can be fun and educational at once. This leads me to
my next topic: an existing project that aims to engage
visitors at exhibitions before they fall asleep from
boredom.
EDUTAINMENT
Everybody knows that while visiting exhibitions people get
bored and tired at some point, even if the exhibition is
very interesting. Hence exhibitors may be motivated
to entertain people in order to make them
stay longer at their fair. The entertainment part can
also be combined with educational aspects. Therefore I
would like to present an example of how edutainment in
exhibits can be achieved.
Cohibit
COHIBIT is an acronym for COnversational Helpers in an
Immersive exhiBIt with a Tangible interface.[10]
COHIBIT is an ambient intelligence edutainment
installation which is used in theme parks. These theme
parks are visited by millions of people of different ages,
different interests and different skills. As a result the
edutainment installation has to be easy to use, simple to
experience and robust to handle.
What does COHIBIT look like?[10] The visitors of the
theme parks find a set of 3D puzzle pieces which are easily
identified as car parts. The motivation for the visitors is
now to assemble a car from these pieces. In the
background of the puzzle, there are two life-like characters.
While assembling the car, the visitor is tracked and the two
characters comment on the visitor’s activities.
The Ambient Intelligence Environment consists of ten
tangible 3D puzzle pieces. They contain invisible
passive RFID tags and are used as a tangible interface to
control the virtual character system. There are two front
ends, one driver’s cab, two middle parts and five rear ends
(which allows the visitors to construct five different car
models). These objects represent the car-model pieces at
a scale of 1:5.
The car is assembled on a workbench with RFID readers.
This workbench has five different areas where the car
pieces can be placed. Each area holds exactly one object.
The visitors can build the car beginning with the front on the
left and the rear on the right or the other way around.
Figure 9 [10]: Overview of the interactive installation prototype
Two virtual characters are projected in life-size on a
screen.[10] Three cameras are installed to recognize
the presence of visitors. The installation runs in two modes.
The OFF mode is intended to attract visitors, whereas the
ON mode reacts to user input and supports the visitors.
The installation automatically switches to the ON mode when
visitors enter the installation area. The virtual
characters welcome the guests, present the idea of the
exhibit and encourage them to begin with the assembly
task. The two virtual guides assist the visitors in the
construction phase by commenting on the visitor’s
actions.
To act in a credible way, the COHIBIT system identifies the
current construction status and the actions the visitor
has performed.[10] The two virtual characters should be
perceived as being alive. Furthermore the guests of the
exhibit should also be entertained by these two characters.
The installation’s behavior is modeled as a finite state
machine (94 nodes and 157 transitions).[10] The state
machine determines the order of scenes at runtime. Each
node and each transition can have a playable scene attached
to it. There are 418 stored scenes. The order is determined by
logical or temporal conditions as well as randomization.
There exist 802,370 different combinations derived from
the ten instrumented pieces, the five positions on the
workbench and the two different directions in which the
objects can be placed.[10] These combinations are
classified into five categories because not every configuration
can be addressed when planning the character behavior:
1. Car completed: the construction of the car has been
completed.
2. Valid construction: the construction can be
completed by adding elements.
3. Invalid configuration: an invalid combination of
elements (e.g. driver’s cab behind rear element).
4. Completion impossible: the construction cannot be
completed without backtracking.
5. Wrong direction: the last piece was placed in the
opposite direction with respect to the remaining
elements.
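How such categories could drive a scene-selecting state machine can be illustrated with a small sketch. It is not the COHIBIT implementation: the node names, the scene identifiers and the transition table are hypothetical, and the real machine is far larger (94 nodes and 157 transitions).[10] The sketch only shows the idea that nodes and transitions may carry playable scenes and that one of the attached scenes is chosen at random.

```python
import random
from enum import Enum


class Category(Enum):
    """The five configuration categories listed above."""
    CAR_COMPLETED = 1
    VALID_CONSTRUCTION = 2
    INVALID_CONFIGURATION = 3
    COMPLETION_IMPOSSIBLE = 4
    WRONG_DIRECTION = 5


# Hypothetical transition table: (current node, category) -> next node.
TRANSITIONS = {
    ("welcome", Category.VALID_CONSTRUCTION): "assist",
    ("assist", Category.VALID_CONSTRUCTION): "assist",
    ("assist", Category.INVALID_CONFIGURATION): "correct",
    ("assist", Category.COMPLETION_IMPOSSIBLE): "correct",
    ("assist", Category.WRONG_DIRECTION): "correct",
    ("correct", Category.VALID_CONSTRUCTION): "assist",
    ("assist", Category.CAR_COMPLETED): "congratulate",
}

# Hypothetical scenes attached to nodes and to transitions.
NODE_SCENES = {
    "assist": ["praise_progress", "hint_next_piece"],
    "correct": ["explain_problem"],
    "congratulate": ["celebrate_car"],
}
TRANSITION_SCENES = {
    ("assist", "correct"): ["oops_comment"],
}


def step(node, category, rng=random):
    """Return (next_node, scenes_to_play) for one visitor action."""
    # Unknown (node, category) pairs keep the machine in its current node.
    next_node = TRANSITIONS.get((node, category), node)
    scenes = []
    # A scene attached to the transition is played first, if there is one.
    options = TRANSITION_SCENES.get((node, next_node))
    if options:
        scenes.append(rng.choice(options))
    # Then one randomly chosen scene of the node that was reached.
    options = NODE_SCENES.get(next_node)
    if options:
        scenes.append(rng.choice(options))
    return next_node, scenes
```

Calling step("assist", Category.WRONG_DIRECTION) moves this toy machine to the hypothetical node "correct" and returns the scene attached to that transition followed by a scene of the new node. Logical and temporal conditions, which the real system also evaluates, are left out for brevity.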
For every configuration various text modules are stored
separately.[10] Together with the texts the system
calculates the simulation of the two virtual agents. The
combination of the text and the simulation produces a scene.
Figure 10 [10]: Visitor's actions and virtual guides
Although COHIBIT is currently used in theme parks, it is also
appropriate for other exhibitions (indoor and outdoor). To
use it in the context of cultural heritage, visitors could
assemble various statues or other objects of the
cultural heritage sector according to the context of the
exhibition.
CONCLUSION
In this paper I have discussed the challenges which are
faced when building digital libraries. Many specialists have
to work together in order to establish such libraries.
Therefore, it is very important that there is a lot of
communication between those experts. They have to
consider that every decision they make influences
the usability of the digital library in the future.
The topic of mobile computing has gained a lot of
importance recently. Hence, it should be kept in mind
when designing a digital library. This will lead to customer
satisfaction, considering that everybody would like to have
access to the internet and network services 24/7 everywhere
in the world.
Afterwards, the projects “3D Vase Museum” and
“InfoGallery” were introduced as examples of visualization
approaches for digital libraries.
Finally, COHIBIT was presented in order to show that
the educational aspects of digital libraries can be fun.
REFERENCES
1. Adam, N. R. et al.: A Dynamic Manifestation Approach
for Providing Universal Access to Digital Library
Objects. IEEE Transactions on Knowledge and Data
Engineering, Vol. 13, No. 4, July/August 2001.
2. Alvarez-Cavazos, F. et al.: Universal Access
Architecture for Digital Libraries. ITESM, Campus
Monterrey, 2005.
3. Amin, A. et al.: Understanding Cultural Heritage
Experts’ Information Seeking Needs. Proceedings of the
JCDL’08, Pittsburgh, Pennsylvania, USA, June 16-20,
2008, 39-47.
4. Blandford, A., Stelmaszewska, H., Bryan-Kinns, N.:
Use of Multiple Digital Libraries: A Case Study.
Proceedings of the JCDL’01, Roanoke, Virginia, USA,
June 24-28, 2001, 179-188.
5. Crane, G.: The Perseus Project and Beyond. D-Lib
Magazine, 1998.
6. Grønbæk, K. et al.: InfoGallery: informative art
services for physical library spaces. Proceedings of the
6th ACM/IEEE-CS joint conference on Digital libraries,
Chapel Hill, NC, USA, June 11-15, 2006, 21-30.
7. Hapke, T.: ‘In-formation’ – Informationskompetenz und
Lernen im Zeitalter digitaler Bibliotheken. Preprint aus:
Bibliothekswissenschaft – quo vadis, hrsg. von Petra
Hauke. München, 2005, 115-130.
8. Jones, S., McInnes, S., Staveley, M.: A Graphical User
Interface For Boolean Query Specification. International
Journal on Digital Libraries, Volume 2, Issue 2/3, 1999,
207-223.
9. Liu, J.-S., Tseng, M.-H., Huang, T.-K.: Mediating Team
Work for Digital Heritage Archiving. Proceedings of the
JCDL’04, Tucson, Arizona, USA, June 7-11, 2004,
259-267.
10. Ndiaye, A. et al.: Intelligent technologies for interactive
entertainment. First international conference, Madonna
di Campiglio, November 30-December 2, 2005,
104-113.
11. Perseus Digital Library:
http://perseus.mpiwg-berlin.mpg.de/ and
http://www.perseus.tufts.edu/, Tufts University,
Medford, MA, 2003.
12. Schilit, B. N., Price, M. N., Golovchinsky, G.: Digital
Library Information Appliances. Digital Libraries,
Pittsburgh, USA, 1998, 217-226.
13. Shiaw, H., Jacob, R., Crane, G.: The 3D-Vase Museum: A
new approach to context in a digital library.
Proceedings of the 5th ACM/IEEE-CS joint conference
on Digital libraries, Denver, CO, USA, June 7-11, 2005,
125 – 13
14. Stiglitz, A.: Copyrights, storage & retrieval of digital
cultural artifacts. Klagenfurt, 2008.
15. Thong, J. Y. L., Hong, W., Tam, K. Y.: What leads to
user acceptance of digital libraries? Communications of
the ACM, Volume 47, Issue 11, November 2004, 78-83.
Using Narrative Augmented Reality Outdoor Games in
Order to Attract Cultural Heritage Sites
Bonifaz Kaufmann
Institute of Informatics Systems
University of Klagenfurt
bkaufman@edu.uni-klu.ac.at
ABSTRACT
To many people, whether they are young or old, history as
well as archaeological and cultural heritage sites seem to be
dead matter, since it takes a lot of background knowledge
and imagination to bring history to life. Upcoming
technology provides new possibilities to
send people back in time or even into the future in order to
gain an immersive experience of history. Advanced mixed
and augmented reality applications could help visitors to
experience historical events on their own, as if they were
themselves part of them. Instead of being a silent observer,
the user might act as an important figure in a game story
based on scientific data about the historical context of an
ancient site.
This paper wants to encourage the use of narrative
augmented reality games based on historical information
arranged around the environment of a cultural heritage site
to help interested visitors to understand and feel history.
To this end, a game concept will be outlined, providing an
impression of what an augmented reality outdoor
application could look like, using the example of the
long-term project “Burgbau zu Friesach” in Carinthia, where a
castle will be built during the next thirty years by making
use of medieval tools and processes.
Author Keywords
Augmented Reality, Cultural Heritage, Ancient Sites,
Mixed Reality, Games, Edutainment
INTRODUCTION
Not only in Europe but all over the world, there are many
cultural heritage sites telling a thrilling story about our past.
However, these stories are not always tangible for visitors.
Many archaeological sites are mainly ruins, where visitors
can only see the basic layout of a settlement or building.
Furthermore, ancient items like tools or domestic hardware
are often preserved only as fragments. That is the reason
why history as well as archaeological and cultural heritage
sites seem to be dead matter to many people. It takes a lot
of background knowledge and imagination
to bring history to life. Moreover, while in the past
tourists were satisfied just by being at such sites, it can be
observed that nowadays tourists not only want to get
information but also want to be entertained while visiting
cultural heritage places, according to [18] as cited by [17].
In order to attract a broader range of people to historical
events, or to history at all, some entertainment aspects are
needed as well as possibilities to revive history. On the one
hand, incorporating information into entertainment, which
is well known as edutainment, is a proper approach to
increase the interest in history; in fact, most museums
have already installed a lot of educational multimedia
experiences. On the other hand, augmented reality (AR)
equipment would bring life to the quiet scenery around
ancient places. Using such equipment, a visitor can travel
back in time or even into the future in order to gain an
immersive experience of history or historical events.
Moreover, a visitor can be part of a narrative taking place at
a cultural heritage site, driven by a story enriched by
scientific data about the historical context of an ancient
place. While location-aware mobile games are already
installed at some cultural heritage sites, a technologically
more advanced game should be the next step. The
adoption of augmented reality would provide a much
stronger experience and sense of presence, supporting the
imagination of players, and would turn the visit into an
adventure. Combining these two aspects desirable for
historical sightseeing tours, namely edutainment and
augmented reality, results in narrative augmented reality
outdoor games.
This article discusses why games should assist
the learning of the history of ancient sites at all. It highlights
how augmented reality outdoor games must be designed to
meet the requirements of game-based learning at cultural
heritage sites. Furthermore, this paper provides a draft of
what an augmented reality outdoor game could look like,
using the example of the long-term project “Burgbau zu
Friesach” in Carinthia, Austria. This project is about
building a castle over a period of thirty years using only
medieval tools and processes. It is difficult
for visitors to comprehend the entire process of constructing
a castle just by visiting the site at a certain point in time.
Since the construction lasts for decades, a visitor would
only observe the current stage of the construction site
instead of getting an impression of the whole process or the
different steps of the construction. Actually, using
augmented reality it does not really matter whether the user
is looking into the future or into the past. Whether there is a
ruin which is augmented to show the former complete
building, or there are foundation walls of a castle which
are augmented to show the whole castle, either way uses
the same method – augmentation of missing parts.
Figure 1: Matching events when timelines are folded
For that reason, the project “Burgbau zu Friesach”,
even though it is not a cultural heritage site but rather an
ongoing construction site, is a useful example to show how
an augmented reality outdoor game would fit into real
ancient sites.
This paper is organised as follows: In the next
section, research related to this approach is presented.
Section 3 highlights the complementary
partnership of history and edutainment, while section 4
focuses on equipment and implementation hints for
augmented reality outdoor games. Section 5 outlines a basic
game concept for the project “Burgbau zu Friesach”, before
the conclusion is drawn in the last section.
RELATED WORK
This section aims at providing an overview of projects
and research work related to elaborate systems installed at
cultural heritage sites as well as augmented reality outdoor
environment settings at such places. Several solutions exist
where mobile devices and technological services are in use
at cultural heritage sites to support visitors in learning about
the place and its history. The first subsection will briefly
review some projects heading that way. In addition,
augmented reality equipment is widely used in various
cultural heritage outdoor scenarios. The second part of this
section will describe some AR outdoor implementations
relevant to the design approach proposed in this paper.
Advanced Cultural Heritage Experience
The idea of exploiting technical devices to enrich the visit
of cultural heritage sites is not really new. Doyun Park et al.
[17] were looking for new immersive tour experiences to
meet the upcoming needs of tourists for more elaborate tour
guides. Because of this objective they chose augmented
reality technology to gain as much immersiveness
as possible. Using a post which is fixed at an interesting
position at the historical place, they were able to provide an
augmented vision and audio of that location to a user. Their
prototype consisted of a Head Mounted Display (HMD)
attached to the top of the post, a camera on it and a laptop
computer.
Figure 2: Concept augmented reality post and prototype [17]
ARToolkit1 was used to register the actual viewport to the
scene, while animated 3D VRML (Virtual Reality
Modelling Language) models were used to overlay the
real-time video capture. A user study was conducted to
evaluate four characteristics of the system, namely
immersiveness, interest level, understandability and
intention of use. In all four categories they observed a
positive outcome. However, the evaluation also revealed
negative aspects such as a lack of realism of the 3D models,
and users said the usability of the hardware should be
improved.
Instead of having fixed installed binoculars, Ardito et al. [2]
used the visitors’ mobile phones to run their application.
They developed Explore!, a game aiming at supporting
middle school students in learning about the history of
cultural heritage sites in Southern Italy. The game, which
already existed as a paper-based version, is played in groups
of 4 or 5 students, where each group has one mobile phone.
The group plays a Roman family that has just arrived at the
ancient settlement. The mission of the game is to collect
information about the site as well as to identify places which
have to be marked on a map. If a group has difficulties
finding a requested place, they can ask the virtual game
master on their mobile. Some tasks require finding invisible
monuments, such as a civil basilica that no longer exists; in
such cases the mobile game helps by showing a virtual 3D
representation of the building on its screen. For educational
reasons the game consists of three stages: the introduction,
the game phase and the debriefing. The latter is held by
a game master, i.e. a teacher, and utilizes a master
application installed on a notebook computer. All log files
from the mobile devices are transferred to the master
application to interpret the results and to determine the
winning team. The educational aspects are well described in
[1], while the master application is presented in [3]. An
extensive evaluation of the concept as well as a comparison
between the paper-based version of the game and the mobile
game version can be found in [9].
1 http://www.hitl.washington.edu/artoolkit
personal user profiles had been used to customize the
multimedia content provided to the user. Additionally, in
order to reduce necessary user interaction, the content was
presented to the user according to her position and
orientation, although multi-modal user interfaces were also
implemented to let the user intervene if she wants to change
the course. LIFEPLUS does not only present buildings and
other artefacts indoors as well as outdoors, but also ancient
life, using high-quality 3D animated avatars with
ancient-like clothes and natural hair. Nevertheless, the
authors of [24] stated that they had difficulties coping with
changing light conditions when moving from indoors to
outdoors and the other way around. HMDs capable of high
contrast levels would be required. In addition, the very high
processing power needed for 3D modelling, tracking and
rendering forced them to use two notebook computers
instead of one lightweight device. Moreover, battery life was
also a big issue; in future work they therefore aim to improve
it to achieve at least one hour of nonstop operation.
A completely different approach was taken for a historical
site in Norway. A Norwegian team developed a 3D collaborative virtual environment (CVE) and built an interactive
educational game based on the “Battle of Stiklestad”,
which took place in 1030 [19]. The landscape, the battlefield,
soldiers, farmers as well as farmhouses and other buildings are
modelled as virtual 3D content of the game using Active
Worlds (footnote 2), with attention to historical authenticity. The players
of the game can act as soldiers and perform certain quests
while learning about the battle. This was the
original idea of the project; however, the team extended the
system by connecting offsite but online players sitting in
front of the CVE with onsite players equipped with PDAs.
This combination adds a new facet of reality to the immersive experience. Onsite players can play together in teams
with online players while meeting in the virtual world. An
onsite player is represented as an avatar inside the CVE,
whose position is mapped from real-world GPS coordinates. All
players can communicate via public chat or private
messages, which are delivered by a server
infrastructure.
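The mapping from GPS coordinates to an avatar position can be illustrated with a small sketch. The source does not describe how the Stiklestad system performs this mapping; the following assumes a flat local frame anchored at a reference point and an equirectangular approximation, which is usually adequate over the few hundred metres of a game area. The function name `gps_to_local` and the coordinates in the usage note are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def gps_to_local(lat_deg, lon_deg, origin_lat_deg, origin_lon_deg):
    """Return (east, north) offsets in metres from a chosen origin.

    Equirectangular approximation: assumed here for illustration,
    not taken from the described system.
    """
    lat0 = math.radians(origin_lat_deg)
    d_lat = math.radians(lat_deg - origin_lat_deg)
    d_lon = math.radians(lon_deg - origin_lon_deg)
    east = EARTH_RADIUS_M * d_lon * math.cos(lat0)
    north = EARTH_RADIUS_M * d_lat
    return east, north
```

For example, `gps_to_local(63.001, 11.0, 63.0, 11.0)` places the avatar roughly 111 m north of the origin, since one thousandth of a degree of latitude corresponds to about 111 m.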
A strongly narrative-driven AR environment is GEIST [15].
In GEIST players can meet ghosts living at cultural heritage
places such as castles or small alleys. The player’s mission is to
release a ghost by visiting former times and exploring the
ghost’s concern in order to help end his suffering. The
researchers built a thrilling interactive drama using AR
equipment, hybrid tracking to estimate the player’s line of sight,
several content databases accessed by a query
engine, and a story engine. As in the
approaches described before, tracking is supported by natural
feature extraction to increase accuracy, although it uses
different algorithms. For visualisation, semi-transparent
glasses are used. The story engine’s job is to tell the story,
which is fictional but consistent with historical facts.
The story engine relies on plot elements such as facts and
events, but does not restrict the player’s freedom to
act and move.
These projects demonstrate quite diverse methodologies for
supplementing historical places with edutainment features.
Whether they use a collaborative virtual environment connected to
PDAs, mobile phones or a fixed post, they all
share an educational purpose rather than
mere fun. The next subsection concentrates more on augmented reality aspects.
Mobile Augmented Reality Environment
Augmented Reality (AR) fits cultural heritage sites quite well,
since normally only
fragments of buildings remain. The strength of AR,
completing parts into an ensemble or adding something to a
scene, helps visitors of such sites to imagine the
former appearance of what they are looking at. Indeed, many projects
have transferred this idea to real-world
applications.
A more recently published project is TimeWarp [13].
TimeWarp is a mobile outdoor mixed reality edutainment
game. The game takes place in Cologne, where the player
has to free Heinzelmännchen, little creatures from the history of
Cologne, who fell into time holes
and are therefore trapped in different time periods. To
play the game, each player is equipped with a PDA
acting as an information device and a mobile AR system that
superimposes virtual content on the environment. The AR system
consists of a lightweight monocular SVGA head-worn
optical see-through display, an ultra-mobile personal
computer, and GPS as well as an inertial sensor for tracking
position and orientation. Furthermore, the developers
used multimodal interaction such as focus-and-click interaction,
location awareness and actions for placing virtual items
using a gyroscopic mouse. The authors developed a Mixed
Reality Interface Modelling Language (MRIML), which
eases location-aware event handling and the positioning of
multimedia content.
One of these projects is ARCHEOGUIDE [26], which
allows tourists to visit Olympia in Greece in an augmented
reality manner, viewing reconstructions of ruined buildings and
watching some Olympic sports disciplines in the original
ancient stadium while wearing video see-through AR
binoculars. Although ARCHEOGUIDE employs mobile
devices, the AR experience is restricted to predefined
viewpoints. However, besides differential GPS and a digital
compass, the scientists used an optical tracking algorithm based on predefined calibrated images to calculate the
position and orientation of the viewer more precisely for
better registration of the 3D models to the real-time video
stream [23]. A more advanced AR system called
LIFEPLUS was tested at Pompeii in 2004 [24]. This system
is able to guide visitors continuously, rather than being limited to the
predefined viewpoints mentioned before. Furthermore,
2
http://www.activeworlds.com
exhibitions. Many institutions such as museums, explorations
or exhibitions have already enriched their installations with
interactive audiovisual multimedia equipment that provides
easy access to complex information. Such installations are
often crowd-pullers that excite young and old alike.
The interactions are normally not very challenging,
which is not their purpose anyway, but when historical content is
presented this way, the visit often degenerates into trying it out and
leaving.
Figure 3: Augmented views with Heinzelmännchen [14]
A user study with 24 participants uncovered some hints for
further improvements. Unlike the projects described before,
TimeWarp did not use enhanced natural feature tracking
methods in its first version. As a result, users did not feel
really immersed, because virtual objects were misaligned and of low
quality. Additionally, they had to pay so much
attention to the system that they were often unaware of
safety issues. On the other hand, wearing a vest that
carries the equipment was never mentioned negatively.
Although TimeWarp is an outdoor-only application, it also has
to deal with changing light conditions. Problems in bright
sunshine were reported, which sometimes made the superimposed content impossible to recognise on the
see-through device.
As shown, there are different proposals that try to
combine historical knowledge with entertainment.
The next section discusses the relationship between
historical facts and edutainment and answers the question
of why it is a fruitful symbiosis.
HISTORY AND EDUTAINMENT
Many visitors of museums or cultural heritage sites
have neither the interest nor the time to read long passages of
information about each ancient piece; they want to get a
quick overview of the most interesting parts. Often
there is just a plate in front of an old artefact with some
background information written on it. Mostly, visitors are
not allowed to touch such invaluable antique artefacts; even
if in some cases this adds extra interest, it can
become boring rather quickly, in particular for younger
people.
Museums and other cultural heritage institutions are
looking for new ways to attract visitors. Many
museums have already supplemented their human
guides with digital audio or multimedia guides. Mobile
multimedia devices allow personalised content delivery
depending on the age of the visitor, her personal preferences or learning capabilities, while providing the
information in an audiovisual manner [16]. Nonetheless,
these technologies still supply information in a rather
passive way. The user is still just reading, watching or
listening, albeit to content prepared as multimedia. A
higher learning outcome is more likely if visitors are
involved more actively. This is the reason why one can find
highly interactive applications with multi-touch panels and
other attractive interaction paradigms at several current
Figure 4: Interactive table "floating.numbers" for exhibition
at the Jewish Museum in Berlin (image by myhd.org)
However, history could be perfectly integrated into
narratives and storylines. An adventure based on scientific
findings may be a welcome change for people with an
inquisitive mind; it links the knowledge to their own
experience and thus makes it part of their long-term
memory. If a visitor is part of a narrative based on
historical data and can change the flow of the story
interactively, he is much more involved than in any of
the settings mentioned before. This is where history can
show its strength: as a foundation for narratives. Games,
on the other hand, can be designed to tell interactive stories.
This combination of narratives and games lets a visitor
become an actor who solves challenging tasks while
learning along the way. According to Stapleton’s mixed
reality continuum [22], in passive settings such as
old-fashioned museums imagination is left largely to the
audience, whereas in interactive games the player’s
imagination is mediated by an interactive story. This
helps visitors to relive historical events.
An interactive multimedia exhibition installation is
generally not suitable for outdoor areas, hence cultural
heritage sites still have these little plates placed in front of
points of interest to describe the artefacts a visitor is looking at.
Narrative role-play games, by contrast, are well suited
to outdoor scenarios where people can move
freely, run around and explore different places. A game
design that incorporates the historical context and the
location-specific setting of a cultural heritage site satisfies
the emerging demands of tourists and other visitors.
Augmented reality technology enables the implementation
of narrative games at cultural heritage sites, allowing
visitors to gain an immersive experience while being part of
living history.
AUGMENTED REALITY OUTDOOR GAMES
system can be seen as a crucial component. It is responsible
for accurately detecting the position and orientation of the user’s
viewport and consists of several cooperating modules.
Normally a GPS receiver provides rough position data.
However, differential GPS should be preferred, as it reduces
the positioning error by a factor of ten to an accuracy of 0.5
metres [12]. The orientation can be determined by a
digital compass or an inertial sensor. Both position and
orientation data need further refinement by
optical tracking algorithms, which can be subdivided into
marker-less and marker-based tracking methods [6]. For the
latter, markers have to be attached to objects or buildings.
They must be visible in the camera’s field of view and can
be used to calculate the viewpoint of the camera in
real time. In contrast, a marker-less method does not need
any preinstalled markers; instead it analyses the video stream
for features such as edges, colours, textures and key points
to obtain orientation information. Robust real-time marker-less
tracking algorithms can be expected in the near
future, since this is an active research topic with
remarkable recent results, which are described in detail in
[20] and [7].
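The refinement step can be sketched as follows. The factor of ten quoted above implies an error of roughly 5 m for uncorrected GPS, which is why a coarse pose from GPS and compass is combined with an optical estimate whenever one is available. The weighted blend below is an assumption made purely for illustration; none of the cited systems is claimed to work this way, and the names `Pose`, `refine_pose` and `optical_weight` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pose:
    east_m: float
    north_m: float
    heading_deg: float  # 0 = north, increasing clockwise


def refine_pose(coarse: Pose, optical: Optional[Pose],
                optical_weight: float = 0.8) -> Pose:
    """Blend a coarse GPS/compass pose with an optical tracking estimate.

    If the optical tracker found no markers or features, the coarse
    pose is returned unchanged.
    """
    if optical is None:
        return coarse
    w = optical_weight
    # Interpolate the heading along the shortest angular path,
    # so that 350 deg and 10 deg meet at 0 deg rather than 180 deg.
    diff = (optical.heading_deg - coarse.heading_deg + 180.0) % 360.0 - 180.0
    heading = (coarse.heading_deg + w * diff) % 360.0
    return Pose(
        east_m=(1.0 - w) * coarse.east_m + w * optical.east_m,
        north_m=(1.0 - w) * coarse.north_m + w * optical.north_m,
        heading_deg=heading,
    )
```

A production tracker would more likely use a proper filter (for example a Kalman filter) instead of a fixed weight; the sketch only shows where the coarse and the optical estimates meet.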
While the previous section discussed edutainment in
general as a vehicle for historical facts, this section looks
in detail at how AR technology could do it even better. In
essence, it highlights the most important
objectives for designing an immersive augmented reality
outdoor game.
Many approaches are conceivable for introducing visitors to
historical knowledge using mobile phone games. However,
a mobile phone game leaves a less lasting impression and
is not really spectacular. According to Stapleton et al. [22],
one way to achieve a fully immersive game experience is
to combine reality and virtuality in a way that allows
people’s imagination to bridge the gap to full
immersion. For that reason, augmented reality games
are a welcome alternative for providing a unique
experience. Examples of AR outdoor games can
be found in [8] and [4].
In [25] an overview is given of the components
required to implement an AR system. The following list
is based on [25] but extended by four additional
components (communication infrastructure, game story
engine, multimedia content database and game
orchestration tool) to cover the gaming aspect. An
AR outdoor game therefore consists of eight components:
• at least one visualisation and audio device
• an accurate tracking system
• a high performance processing unit
• a high speed communication infrastructure
• a multimodal interaction module
• a sophisticated game story engine
• a multimedia content database
• and a game orchestration tool
Tracking as well as rendering and animation of 3D models
need a lot of processing power, hence a fast 3D
graphics card and a state-of-the-art processing unit are
required. Moreover, the unit should be lightweight and
compact, since the user has to carry it during the
game. Different architectural setups are possible. If
the entire application is not installed on the wearable
computer and a distributed architecture is chosen instead, a
powerful communication infrastructure has to be available
to transfer large amounts of data, because multimedia content
is high-volume data.
Without any interaction a game would be rather boring,
thus multimodal interaction is essential. Especially in AR
applications, different interaction paradigms
are helpful. One based on location awareness comes
nearly for free because of continuous position tracking. On
the other hand, classical desktop user interfaces are mostly
not applicable to AR settings, so other interaction
methods should be considered. They are intensively
discussed within the 3D user interface community, and
many solutions are presented in [10]. The interaction
techniques should be selected carefully, since interaction is
directly related to the usability of such a system and hence
important for acceptance.
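Why location awareness comes nearly for free can be seen in a minimal sketch: once the tracking system delivers a position, a location-based trigger is little more than a distance test. The sketch assumes positions already expressed in a local metric frame; the function name `zones_in_range`, the zone names and all coordinates are hypothetical.

```python
import math


def zones_in_range(player_xy, zones):
    """Return the names of all zones whose trigger radius contains the player."""
    px, py = player_xy
    hits = []
    for name, (zx, zy, radius_m) in zones.items():
        if math.hypot(px - zx, py - zy) <= radius_m:
            hits.append(name)
    return hits


# Hypothetical trigger zones: name -> (east_m, north_m, radius_m)
ZONES = {
    "forge": (10.0, 5.0, 8.0),
    "quarry": (120.0, 40.0, 15.0),
}
```

The trigger radius should be chosen larger than the expected positioning error, otherwise a player standing at the right spot may never activate the zone.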
The visualisation device is normally the most obvious
component from the user’s point of view. Whether it is an
optical see-through device or a video display, it either
superimposes virtual items on the user’s real view or
shows a real-time video stream augmented by
virtual content. It can be head-worn, a fixed installation
or a hand-held device, e.g. a PDA or a mobile phone.
Whatever device is used, it should be equipped with earphones
or speakers to deliver audio content. A camera must also be
attached to the visualisation device in order to
capture the user’s viewport. Because
the user has to wear it throughout the tour
or game, it should not be cumbersome but easy
to use and lightweight. Since the visualisation device
will be passed on to many users, hygiene and
robustness should also be taken into account.
Similar to the approaches in [15] and [5], a story engine is
necessary to control the game flow. This engine is a core
component of a game-based application. The game story
engine communicates with the interaction module and
the tracking system to determine which task to activate
next. It consults the content database to load 3D models,
audio files and additional information related to the context
the player is working on and delivers them to the visualisation
device. Game story engines for AR games are
quite similar to ordinary computer game engines.
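The task selection described above can be made concrete with a small sketch. It assumes that each task is bound to one location, may depend on earlier tasks, and names its media by keys into the content database. This data model is an illustration only and is not taken from [15] or [5]; `Task`, `next_task` and `load_assets` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    zone: str                                   # place where the task can start
    requires: set = field(default_factory=set)  # tasks that must be done first
    assets: tuple = ()                          # keys into the content database


def next_task(tasks, completed, current_zone):
    """Pick the first open task available at the player's current location.

    `completed` is a set of task names; `current_zone` would come from
    the tracking system.
    """
    for task in tasks:
        if task.name in completed:
            continue
        if task.zone == current_zone and task.requires <= completed:
            return task
    return None


def load_assets(task, content_db):
    """Look up the media the visualisation device should present for a task."""
    return [content_db[key] for key in task.assets]
```

Keeping the story as data rather than code is one way to let the same engine serve different sites, at the cost of a less expressive plot structure.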
Because an AR application has to overlay
virtual content exactly on real-world scenes, the tracking
The game engine does not necessarily have to distinguish
between real and virtual content; everything can be
modelled as virtual from the game engine’s point of view.
This means that, in addition to virtual objects, the real-world
objects and their states must be part of the game engine’s
world representation. The game engine does not need to
differentiate between simulated and real sensor events, which
are normally triggered by collisions, time-based actions,
proximity or user intention. Such a mapping is of
great benefit during testing, when real-world events can
easily be simulated [6].
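The testing benefit can be illustrated with a short sketch in which the game logic never inspects where an event came from. The class and field names are hypothetical and do not reproduce the architecture of [6].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SensorEvent:
    kind: str             # e.g. "collision", "timer", "proximity", "user"
    subject: str          # object or place the event refers to
    source: str = "real"  # "real" or "simulated"; never read by the game logic


class GameWorld:
    def __init__(self):
        self.last_event = {}  # subject -> kind of the most recent event

    def handle(self, event):
        # Only kind and subject are used, so a test can inject
        # simulated events through exactly the same code path.
        self.last_event[event.subject] = event.kind
```

A test harness can therefore replay a recorded or scripted walk through the site without any hardware attached.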
Designing an AR outdoor game that is
independent of a particular site requires an appropriate tool for
game orchestration to customise the game
engine for individual locations. For games at cultural
heritage sites, historical facts, characters, artefacts and
buildings as well as the location itself have to relate to the
game story. Broll et al. [6] highlight the basic steps of
pre-game orchestration as follows:
1. appropriate registration of the game area into real
world coordinates
2. virtual representations of real world objects
3. initialisation and positioning of the game items
4. initialisation of the players including the setup of
the individual equipment
5. initialisation of the game state
it is better not to be too sparing when including virtual
artefacts. In addition, atmospheric effects and transparency
for text or icons can enhance realism. Spatial sound should
not be forgotten either: it can enhance the atmosphere dramatically
and deepen the immersive experience even further. Lastly,
the game story has to be thrilling, concise and informative
in order to appeal to a wide range of people.
This section gave an overview of technical and environmental issues as well as implementation hints for AR
outdoor games. The following section focuses on game
storytelling and proposes a draft concept of an AR
outdoor game for a location in Carinthia where a medieval
castle will be constructed over the next three decades.
ADVENTURE “BURGBAU ZU FRIESACH”
“Burgbau zu Friesach” is a recently started long-term historical
project in Carinthia, in southern Austria.
A castle will be built over a period of thirty
years using only medieval tools and processes,
similar to a project at Guédelon in France. Craftsmen will
wear period clothes when building the castle. They will
forge and carve the tools they need themselves, and wood and
stones will be carried by horses and oxen to the construction
site. The project is scientifically supervised by the
Institute of History of the University of Klagenfurt, which
verifies the authenticity of the tools and processes used.
Therefore the game orchestration software should support
the import and registration of maps and images to real-world
coordinates as well as the import and positioning of 2D and
3D models of real-world items to build the overall game
area. It should also provide facilities to alter the game
settings and the game state [6].
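The five pre-game orchestration steps attributed to Broll et al. [6] can be mirrored in a small setup routine. This is only a sketch of the order of the steps; the dictionary layout, the key names and the function `orchestrate` are assumptions for illustration, not the interface of any existing orchestration tool.

```python
def orchestrate(site):
    """Run the five pre-game steps in order and return the initial setup."""
    setup = {}
    # 1. register the game area in real-world coordinates
    setup["area"] = {"origin": site["origin"], "extent_m": site["extent_m"]}
    # 2. create virtual representations of real-world objects
    setup["world_objects"] = {o["name"]: o["position"] for o in site["real_objects"]}
    # 3. initialise and position the game items
    setup["items"] = {i["name"]: i["position"] for i in site["items"]}
    # 4. initialise the players, including their individual equipment
    setup["players"] = {name: {"equipment": list(site["equipment"])}
                        for name in site["players"]}
    # 5. initialise the game state
    setup["state"] = {"phase": "ready", "completed_tasks": []}
    return setup
```

Because everything site-specific enters through the `site` description, the same routine could in principle be reused for a different location by supplying a different description.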
Considering the lessons learnt from [13], [11] and [16], there
are some pitfalls that should be avoided whenever
possible. There are many visualisation devices on
the market, but most still have shortcomings such as
low resolution, weak contrast and a narrow
field of view. Especially in outdoor scenarios these
properties are important for a satisfying experience and
should be examined carefully, mainly because they determine
what the user sees. Furthermore, a lack of realism was mentioned
several times as a factor that decreases immersion.
Thus realistic augmentation is very important for AR
outdoor games. High-fidelity 3D models and precise
positioning of the models with respect to lighting, shading
and shadows, like the models proposed in [24], should be taken
into account. If the processing power is not sufficient for
real-time rendering of such high-quality models, one might
consider a special game challenge in which a player
has to match a high-quality virtual 3D model with its real-world
placeholder to solve a certain task. Many evaluations
revealed that too little virtual content was provided,
which disturbed the feeling of immersion; therefore
Figure 5: Similar project to “Burgbau zu Friesach” started at
Guédelon, France in 1997 (image by guedelon.fr)
Tourists can visit the site and watch the ongoing work of
stonemasons, blacksmiths, rope makers and other
craftsmen. Besides the touristic benefit, the aim of the
project itself is to gain more insight into medieval processes
and medieval life. However, it is difficult for tourists to
comprehend the entire process of constructing a castle just
by visiting the site at a single point in time, since the
construction progresses extremely
slowly. A visitor would only observe the current stage of the
construction site instead of getting an impression of the
whole process or its different steps. As
already stated in the introduction, the project
“Burgbau zu Friesach” is not a proper
cultural heritage site but rather an ongoing construction site;
nevertheless, this example can show how an augmented reality outdoor
game would fit into real ancient sites, as argued by Figure 1.
Assuming that equipment as described in
would be to find little hidden messages carved into the
castle’s walls 10 years forward in time to get some
additional hints about the conspirator. If a player is not able
to solve a certain task, he can ask the magician for help.
The magician is a virtual avatar that appears in front of the
player when consulted. The powerful
magician can give hints or even solve the task by using a
magic formula. A player can also talk to virtual characters
living in the future to get information. At the end of
the game, if the player has completed all tasks
successfully and uncovered the conspirator, he can
watch a great celebration at the castle in the future.
Otherwise the castle will be empty, since the conspiracy
has succeeded and the inhabitants have been banished.
the section before is used, a brief outline of a game concept
using the backdrop of the castle construction should offer an
insight into what a concrete augmented reality outdoor game
story could look like, in which the player is part of a historic
event.
The story is set around the castle construction site,
and the user has to fulfil different tasks to drive the
story. The visitor plays the role of an adviser to the lord of
the castle. The lord tells him that he suspects a
conspiracy against the completion of the castle.
He further suspects that some craftsmen have been bribed
and forced to build traps into the castle, e.g.
secret passages, doors or weak walls. The
player is asked to uncover the conspiracy in order to ensure the
correct completion of the castle. The lord’s magician
provides a special “magic” AR tool enabling the player to
travel forward or backward in time and to see the construction
site’s progress in different time periods with his own eyes, like
a time traveller. As the “magic” AR visualisation device, a
handheld portable computer (Figure 6) should be preferred
to an HMD, since a handheld device would meet the
hygiene and robustness requirements of AR
outdoor games.
This brief outline of a game story is by no
means a replacement for a full game concept. It simply shows
that a game at a historical site could be thrilling and informative in an inspiring way. The
history of the surroundings as well as storytelling know-how
as proposed by [5] should be included when designing a
full concept for such a game. The aim should be to
design not just any story but a historically plausible one.
CONCLUSION
This paper has shown that new ways of providing historical
content can be realised using narrative augmented reality
outdoor games, in particular at cultural heritage sites.
Arguments were given for why visitors of cultural heritage
places should be involved more actively in learning about
historical facts. History has therefore been proposed
as a basic building block for interactive narrative game
design. Some approaches have been highlighted that
make use of modern technology to foster learning at
ancient sites. Furthermore, augmented reality equipment
has been shown to work quite well at cultural
heritage sites. An overview of the components necessary for
augmented reality outdoor games has been given with
regard to designing narrative augmented reality games. Finally,
a brief game concept for a castle construction site
was outlined.
Figure 6: Ergonomic handheld Augmented Reality device
designed around an ultra-mobile PC [21]
During the game the player has to collect information
about the technical details and methods of building a castle. He
can ask real craftsmen how they do their job or query a
database for additional information. With this information
the player can travel into the future and compare the
collected knowledge with the future construction to
check whether someone is doing something wrong. Using the “magic” tool, the
player sees what the castle would look like in the future if
the flaws were built in. He may find a secret door
or see that a wall is not as thick as the
craftsmen explained. The task is to find inconsistencies
between the gathered knowledge and the castle that he sees
through his “magic” time-travel tool. By solving
important tasks the user changes the future. Feedback is
given by showing the corrected future state of the castle to the
user, again through the “magic” tool. Another challenge
REFERENCES
1. Ardito, C., Buono, P., Costabile, M. F., Lanzilotti, R.,
and Pederson, T. 2007. Mobile games to foster the
learning of history at archaeological sites. In Proceedings
of the IEEE Symposium on Visual Languages and
Human-Centric Computing (September 23 - 27, 2007).
VLHCC. IEEE Computer Society, Washington, DC, 81-86.
2. Ardito, C., Costabile, M. F., Lanzilotti, R., and
Pederson, T. 2007. Making dead history come alive
through mobile game-play. In CHI '07 Extended
Abstracts on Human Factors in Computing Systems
(San Jose, CA, USA, April 28 - May 03, 2007). CHI '07.
3. Ardito, C. and Lanzilotti, R. 2008. "Isn't this
archaeological site exciting!": a mobile system
enhancing school trips. In Proceedings of the Working
Conference on Advanced Visual interfaces (Napoli,
Italy, May 28 - 30, 2008). AVI '08. ACM, New York,
NY, 488-489.
4. Avery, B., Thomas, B. H., Velikovsky, J., and Piekarski,
W. 2005. Outdoor augmented reality gaming on five
dollars a day. In Proceedings of the Sixth Australasian
Conference on User interface - Volume 40 (Newcastle,
Australia, January 30 - February 03, 2005). M.
Billinghurst and A. Cockburn, Eds. ACM International
Conference Proceeding Series, vol. 104. Australian
Computer Society, Darlinghurst, Australia, 79-88.
5. Braun, N. 2003. Storytelling in Collaborative Augmented Reality Environments. In Proceedings of the
WSCG, (Plzen, Czech, 2003).
6. Broll, W., Ohlenburg, J., Lindt, I., Herbst, I., and Braun,
A. 2006. Meeting technology challenges of pervasive
augmented reality games. In Proceedings of 5th ACM
SIGCOMM Workshop on Network and System Support
For Games (Singapore, October 30 - 31, 2006).
NetGames '06. ACM, New York, NY, 28.
7. Castle, R. O., Klein, G. and Murray, D.W. 2008. Video-rate
Localization in Multiple Maps for Wearable Augmented
Reality. In Proceedings of the 6th IEEE
international Symposium on Wearable Computers (Sept
28 - Oct 1, 2008) ISWC. Pittsburgh PA, 15-22.
8. Cheok, A., Goh, K. H., Liu, W., Farbiz, F., Fong, S. W.,
Teo, S. L., Li, Y., and Yang, X. 2004. Human Pacman: a
mobile, wide-area entertainment system based on physical, social, and ubiquitous computing. Personal
Ubiquitous Comput. 8, 2 (May. 2004), 71-81.
9. Costabile, M. F., De Angeli, A., Lanzilotti, R., Ardito,
C., Buono, P., and Pederson, T. 2008. Explore! possibilities and challenges of mobile learning. In Proceeding
of the Twenty-Sixth Annual SIGCHI Conference on Human Factors in Computing Systems (Florence, Italy,
April 05 - 10, 2008). CHI '08. ACM, New York, NY,
10. Bowman, A. D., Kruijff, E., LaViola, J. J. jr., Poupyrev,
I. 2005. 3D User Interfaces: theory and practice.
Addison-Wesley.
11. Dow, S., Mehta, M., Lausier, A., MacIntyre, B., and
Mateas, M. 2006. Initial lessons from AR Façade, an
interactive augmented reality drama. In Proceedings of
the 2006 ACM SIGCHI international Conference on
Advances in Computer Entertainment Technology
(Hollywood, California, June 14 - 16, 2006). ACE '06,
vol. 266. ACM, New York, NY, 28.
12. Gleue, T., Dähne, P. 2001. Design and implementation
of a mobile device for outdoor augmented reality in the
archeoguide project. In Proceedings of the 2001
Conference on Virtual Reality, Archeology, and Cultural
Heritage (Glyfada, Greece, November 28 - 30,
2001). VAST '01. ACM, New York, NY, 161-168.
13. Herbst, I., Braun, A., McCall, R., and Broll, W. 2008.
TimeWarp: interactive time travel with a mobile mixed
reality game. In Proceedings of the 10th international
Conference on Human Computer interaction with Mobile Devices and Services (Amsterdam, The
Netherlands, September 02 - 05, 2008). MobileHCI '08.
14. Herbst, I., Ghellah, S., and Braun, A. 2007. TimeWarp:
an explorative outdoor mixed reality game. In ACM
SIGGRAPH 2007 Posters (San Diego, California,
August 05 - 09, 2007). SIGGRAPH '07. ACM, New
York, NY, 149.
15. Kretschmer, U., Coors, V., Spierling, U., Grasbon, D.,
Schneider, K., Rojas, I., and Malaka, R. 2001. Meeting
the spirit of history. In Proceedings of the 2001
Conference on Virtual Reality, Archeology, and
Cultural Heritage (Glyfada, Greece, November 28 - 30,
2001).
16. Liarokapis, F. and Newman, R. M. 2007. Design experiences of multimodal mixed reality interfaces. In
Proceedings of the 25th Annual ACM international
Conference on Design of Communication (El Paso,
Texas, USA, October 22 - 24, 2007). SIGDOC '07.
17. Park, D., Nam, T., and Shi, C. 2006. Designing an
immersive tour experience system for cultural tour sites.
In CHI '06 Extended Abstracts on Human Factors in
Computing Systems (Montréal, Québec, Canada, April
22 - 27, 2006). CHI '06. ACM, New York, NY, 1193-1198.
18. Poon, A. 1994. The ‘new tourism’ revolution, Tourism
Management, Volume 15, Issue 2, April 1994, Pages
91-92.
19. Prasolova-Førland, E., Wyeld, T. G., and Lindås, A. E.
2008. Developing Virtual Heritage Application with 3D
Collaborative Virtual Environments and Mobile Devices
in a Multi-cultural Team: Experiences and Challenges.
In Proceedings of the Third international Conference on
Systems (Icons 2008) - Volume 00 (April 13 - 18, 2008).
ICONS. IEEE Computer Society, Washington, DC, 108-113.
20. Reitmayr, G., Drummond, T. 2006. Going out: robust
model-based tracking for outdoor augmented reality.
Mixed and Augmented Reality, IEEE / ACM International Symposium on, vol. 0, no. 0, pp. 109-118, 2006
Fifth IEEE and ACM International Symposium on
Mixed and Augmented Reality (ISMAR'06).
21. Schall G. et al. 2008. Handheld Augmented Reality for
Underground Infrastructure Visualization, To appear in
Journal on Personal and Ubiquitous Computing,
Special Issue on Mobile Spatial Interaction
22. Stapleton, C. B., Hughes, C. E., Moshell, M. 2002.
Mixed Reality and the interactive imagination, Swedish
American Simulation Conference, 2002.
23. Stricker, D., Kettenbach, T. 2001. Real-Time and
Markerless Vision-Based Tracking for Outdoor Augmented
Reality Applications. In Proceedings of the
IEEE and ACM international Symposium on Augmented
Reality (Isar'01) (October 29 - 30, 2001). ISAR. IEEE
Computer Society, Washington, DC, 189.
24. Sundstedt, V., Chalmers, A., and Martinez, P. 2004.
High fidelity reconstruction of the ancient Egyptian
temple of Kalabsha. In Proceedings of the 3rd international
Conference on Computer Graphics, Virtual
Reality, Visualisation and interaction in Africa
(Stellenbosch, South Africa, November 03 - 05, 2004).
AFRIGRAPH '04. ACM, New York, NY, 107-113.
25. Vlahakis, V. et al. 2004. Experiences in applying augmented
reality techniques to adaptive, continuous
guided tours. Tourism Review (26 - 28 January 2004)
IFITT. ENTER2004. Cairo.
26. Vlahakis, V., Karigiannis, J., Tsotros, M., Ioannidis, N.,
Stricker, D. 2002. Personalized Augmented Reality
Touring of Archaeological Sites with Wearable and
Mobile Computers. In Proceedings of the 6th IEEE
international Symposium on Wearable Computers
(October 07 - 10, 2002). ISWC. IEEE Computer
Society, Washington, DC, 15.
Personalized Touring with Augmented Reality
BS Reddy Jaggavarapu
Institut für Informatik-Systeme
bjaggava@edu.uni-klu.ac.at
ABSTRACT
Systems have been developed to provide personalized touring with Augmented Reality (AR); they are discussed in the related work section. These systems use the context of the user and information about the environment to conduct the tour. Many problems remain in this area, and researchers are working on efficient and adaptable systems that provide effective personalized tours. This paper discusses how data is presented to the user with the use of context, how contexts influence dynamic presentations, how user emotions are estimated and what role they play in presenting content, and the interactive support that lets users change their profiles at any time and synchronize these changes with the system. The paper also presents the problems involved in analyzing and extracting the user context.
Author Keywords
Context, User Emotions, Personalized Touring.
ACM Classification Keywords
H5.m. Information interfaces and presentation (e.g., HCI):
INTRODUCTION
Personalized touring means guiding a user with the information that he/she expects from a point of interest. For example, a user visiting a historical place expects different types of information, such as history, culture, and habits. We therefore need systems that provide this data to users.
Augmented reality is used to provide data to the user in an effective manner. AR content is shown on the user's display device, so that users can view overlaid images and videos of a particular environment or object. With augmented reality, users are navigated from one place to another while content is presented to them.
Nowadays users have different devices with which to visualize environments, for example PCs, laptops, PDAs, mobile phones, and see-through displays. Visualizing the environment involves many constraints, for example information about the environment, the position of the user, and the angular movements of the user. Whenever the user wants information about the environment, pre-stored information is provided at that time. To provide it, the system needs to know the user's location, the point of interest, and the kind of information the user wants. To determine the user's position there are GPS (Global Positioning System), DGPS (Differential GPS), algorithms that calculate angular changes (changes in view), ultrasonic signals, and vision algorithms. Once the position is known, data is transmitted using WLAN (Wireless Local Area Network), ad-hoc networks (networks without a base station, in which each node passes data on to the other nodes), streaming, and other transmission techniques. Until now, systems have provided information according to the context of the user's position, view of interest, and language, together with the digital information (images/narration/video) of a particular object. Users are given general information, which some of them may find boring or unnecessary. Systems have been developed to solve this problem. They obtain the profile (interests or preferences) of a user and produce information according to it. The user profile varies with the user's hobbies, knowledge of the environment, age, gender, area of interest, place of origin, and so on. Whenever a user enters the system, he/she is allowed to choose preferences, and the tour is conducted according to them.
Combining AR with personalized touring is advantageous. Systems have been developed that provide personalized touring both indoors and outdoors, and a variety of devices (hands-free, portable, handheld, etc.) has been introduced.
Klagenfurt, Austria
RELATED WORK
A great deal of research is going on in the area of context-aware and personalized touring. To provide users with a personalized tour, different types of systems have been developed. Based on the user location, those systems provide content in different ways, and many techniques have been introduced to provide and present it.
Some research has also concentrated on the service context [2]. The user's location context is used to discover the variety of available services and offer them to the user. The aim is to provide a Dynamic Tour Guide (DTG). This system uses the user's interests as the basic ontology and semantic matching algorithms to provide the data. After the services are discovered, the navigation and the tour are updated. The DTG is thus driven by the user's context, the location context (information about the environment), and the service context. The system guides the user by audio.
Figure 1: Overview of the context interpretation [2]
Figure 1 shows how the contexts are interpreted to provide information to the user.
Researchers have also developed systems with Augmented Reality (AR). The environment, monuments, ruins, and artifacts are presented to the user by augmenting images, videos, etc. The different ways of presenting data are:
• Using virtual assistants, audio
• Images and corresponding text documents
• Streaming videos
For personalized touring, user preferences or interests are key to presenting the corresponding information. ARCHEOGUIDE [1] has been developed to provide personalized touring. The system considers the user's location, orientation, and preferences. With this context as input, it provides the user with suitable content. Users are equipped with a laptop and see-through device, a pen-PC, or a palmtop, and can thus access information with heterogeneous devices. The visuals are generated depending on the user preferences. Figure 3 shows the system architecture of ARCHEOGUIDE. From Figure 3 it can be seen that the system supports three different types of devices and provides information to the users.
Figure 3: ARCHEOGUIDE architecture [1]
In MARA [9], the researchers developed a virtual assistant to guide the user. This system considers user location and orientation as the context. Figure 2 shows the virtual assistant, which is viewed through see-through devices.
Figure 2: Virtual assistant [9]
Researchers have focused on different areas: analyzing the context, context awareness, environmental issues, content presentation to the user, visualization of the content, and many more.
Adaptability is needed to support a user whenever he/she is in an environment and accessing the system. To provide data to the user, the context has to be analyzed. For this there are GPS systems, which provide the location of the user. Location is the most important context in these systems: for every system it is the major context, and accurate values are provided by GPS and DGPS (Differential GPS). After finding the location, systems generally provide data to the users. Because every user was flooded with data, researchers introduced user preferences or interests: the systems take information from the user, analyze it, and finally present filtered information. Later, context came to be analyzed using the objects in the environment as well, which means that content is provided by considering the views of the user's points of interest. Cameras were introduced to obtain the user's point of interest. They capture images of the objects in the environment and transmit them to the system, which then filters the data using the user's view as context and provides the filtered data to the user.
For transmission, systems use WLAN, the internet, etc. Systems interpret the data and transmit it to the users. After transmission, the user can select data, which is visualized on the user's device. Many rendering techniques, both 2D and 3D, have been developed to visualize the data.
The following sections explain how contexts are analyzed, how data is retrieved and presented, the influence of context on personalized touring, and the role of user emotions in presenting tours, followed by a discussion and conclusions.
TECHNOLOGY
The systems developed use different types of technologies to provide data to users. Every system that supports personalized tours with augmented reality needs a database. Data about the environment, museums, artifacts, and ruins is stored in databases. Research has been done on storing this data. The repositories store not only data about the environments but also historical image collections, videos, and external information.
The systems support different kinds of devices for presenting data to users. As mentioned earlier in this paper, users nowadays have different types of devices. Systems should be able to support all of them: PDAs, mobile phones, laptops, palmtops, etc. Systems should also support indoor and outdoor use as well as users who are sitting at home or in the office and connect through the web.
Research on recommendation models [10] also aims at personalized tours. Systems of this type use the user's interests or preferences to recommend suitable services. The history of the user is also stored for future use.
How the User Profile Plays a Role in Personalized Tours
In the systems mentioned above (ARCHEOGUIDE and MARA) we have seen user profiles, preferences, and interests. Some systems obtain the user profile before the tour starts and present the tour according to it. Other systems, for example recommender systems, analyze the user profile while presenting the data.
The user profile plays the major role in personalized tours. Examples of user profiles were given in the section above. The question now is how user preferences actually influence the tour. For example, a family is visiting a museum. Assume that the family includes kids, teenagers, middle-aged people, and old people, all equipped with PDAs. If a kid and an old man are interested in the same monument, they get different types of information. Before the tour starts, the profiles/preferences of the users are taken: the kid and the old man enter or choose their preferences.
Kids are generally entertained when they have a funny cartoon as tour guide on their display device. So a funny cartoon entertains the kid with a voice suited to kids and provides images that match the kid's level of understanding. In this way the tour is conducted for the kid.
The old man gets a formal tour guide. Old people generally want to know much more about particular points of interest, and the formal tour guide presents the information according to the old man's preferences.
HOW TO RETRIEVE AND PRESENT DATA
The user selects preferences, and depending on these selections systems should retrieve and present the data. The user preferences determine the context of the user. Information about the objects on which the user is concentrating is also part of the context. Together these are called the context representation [5]. The context representation, which represents the context of the user and the information about objects, provides the contextual information. The context representation alone is not sufficient to retrieve data, because its representations are unified and complex to compare. We therefore need context transformation, which converts context into Boolean or probability values to make comparisons easier.
With context transformation, the objects relevant to the user's context are selected by a personalized object filter, which filters objects by probabilistic selection. The object is now selected, but systems must present only the preferred data. Presenting the preferred data involves a personalized information filter and personalized information presentation.
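The retrieval steps described in this section (context transformation followed by a probabilistic personalized object filter) can be illustrated with a minimal sketch. It is not the framework of [5]: the attribute names, the weights, and the function names transform, relevance, and object_filter are assumptions made only for this illustration. Context attributes are compared as Boolean matches, the matches are combined into a probability-like score, and each object is kept with that probability.

```python
import random


def transform(user_ctx, obj_ctx):
    """Context transformation: map each shared attribute to a Boolean match (1.0 or 0.0)."""
    return {k: float(user_ctx[k] == obj_ctx[k]) for k in user_ctx if k in obj_ctx}


def relevance(matches, weights):
    """Combine the Boolean matches into one weighted score in [0, 1]."""
    total = sum(weights[k] for k in matches)
    if total == 0:
        return 0.0
    return sum(weights[k] * m for k, m in matches.items()) / total


def object_filter(user_ctx, objects, weights, rng):
    """Personalized object filter: keep each object with probability equal to its relevance."""
    kept = []
    for name, obj_ctx in objects.items():
        p = relevance(transform(user_ctx, obj_ctx), weights)
        if rng.random() < p:
            kept.append((name, p))
    # Most relevant objects first
    return sorted(kept, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical contexts, loosely following the who/when/where/what/how/why idea
    user = {"where": "hall_1", "what": "painting", "why": "history"}
    objects = {
        "fresco": {"where": "hall_1", "what": "painting", "why": "history"},
        "statue": {"where": "hall_2", "what": "sculpture", "why": "art"},
    }
    weights = {"where": 1.0, "what": 1.0, "why": 1.0}
    print(object_filter(user, objects, weights, random.Random(0)))
```

An object whose context matches the user context on every attribute is always kept, one that matches on none is never kept, and partial matches are kept only some of the time. A personalized information filter could then be applied to the data of the kept objects.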
Figure 4: Personalized Information Retrieval [5]
Figure 4 shows how information is presented to users. Who, when, where, what, how, and why are used to represent the context of the user. Depending on the user context, objects from the environment are selected, filtered, and presented to the user.
To illustrate the figure, here is one example. Two users A and B are at the same location and viewing the same object, but they get different types of information because user A's context differs from user B's context.
The problems involved in this system are:
• Boolean and probability functions are used to compare contexts. Research is still going on into techniques for transforming user and object contextual information into numerical values for better and more effective comparisons.
• Effective delivery and presentation of the data.
• User interests vary with location and time; how can the system support the user as time changes?
Some systems even let the user write text into the system. This type of service helps users learn about data missing from the presentations, and users can extend their knowledge after the presentation by finding out more about what they wrote on the objects of interest. The problem with this type of service is how to provide the user with information in response to such dynamically written questions.
Techniques have also been developed that present interesting data to users without relying only on the user's context and the object information. For example, a user in an exhibition who is looking at interesting paintings is helped by a system that provides different formats of information (text, image, audio, etc.) about those paintings and about related paintings. Systems of this type [12] have been developed; they also consider the user's interests when presenting information about the paintings, and they guide users to the next interesting place by means of location information.
Developing such systems involves a visual cue matching technique. The user is equipped with a camera, an inertia tracker (which tracks objects within a short distance), and a UMPC (Ultra-Mobile PC). When the user looks at a painting, the system matches the current view with the stored image and determines the position of the user. After measuring the position, the system presents directions to the next interesting point.
Figure 5: Visual cue matching process [12]
Figure 5 shows the matching process (this process keeps the user in the reachable region for the next painting). After this process the user is provided with the relevant information, i.e. the directions to the next painting.
The problems involved in this type of system are:
• Light illumination conditions, because the main technique relies on visual cue matching.
• Presenting information effectively to a particular user based on his/her preferences. One user is very interested in a particular object while another shows less interest in information about the paintings. The system should therefore be dynamic so that it can provide different types of information to any kind of user. User interest also changes from painting to painting.
Until now we have seen the use of images in analyzing the position and providing content to the user at a high level. The following section presents how images play a role at a low level in generating the context.
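The visual cue matching step described in this section can be sketched as follows. The text does not say how [12] compares views, so this sketch assumes that each stored view is reduced to a short numeric descriptor and uses a nearest-neighbour comparison; the descriptors, positions, and the names STORED_VIEWS, match_view, and direction_to are hypothetical.

```python
import math

# Hypothetical reference database: each stored view has a feature descriptor
# (a short tuple of numbers) and the known position (x, y in metres)
# from which it was captured.
STORED_VIEWS = [
    {"painting": "A", "descriptor": (0.9, 0.1, 0.3), "position": (2.0, 5.0)},
    {"painting": "B", "descriptor": (0.2, 0.8, 0.5), "position": (6.0, 5.0)},
    {"painting": "C", "descriptor": (0.4, 0.4, 0.9), "position": (10.0, 5.0)},
]


def match_view(descriptor, stored_views, max_distance=0.5):
    """Return the stored view closest to the current descriptor, or None if none is close enough."""
    best = min(stored_views, key=lambda v: math.dist(descriptor, v["descriptor"]))
    if math.dist(descriptor, best["descriptor"]) > max_distance:
        return None  # no reliable match, e.g. under poor illumination
    return best


def direction_to(current_position, target_position):
    """Bearing in degrees (0 = +x axis, counter-clockwise) and distance to the target."""
    dx = target_position[0] - current_position[0]
    dy = target_position[1] - current_position[1]
    return math.degrees(math.atan2(dy, dx)), math.hypot(dx, dy)


if __name__ == "__main__":
    view = match_view((0.85, 0.15, 0.3), STORED_VIEWS)
    if view is not None:
        # Guide the user from the matched position to painting B
        print(direction_to(view["position"], STORED_VIEWS[1]["position"]))
```

The max_distance threshold makes the illumination problem listed above visible in the sketch: when lighting changes the descriptor too much, no stored view is accepted and the position cannot be determined.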
INFLUENCE OF CONTEXT ON PERSONALIZED TOURING
Before going into detail we should know how to exploit the context. An architecture for exploiting context has been proposed in [4]; Figure 6 shows it. The figure shows two types of processing, high-level and low-level, and two types of context, preliminary and final.
In the low-level processing, the context-awareness framework gathers data from a sensor that captures video. This video data is the preliminary context for the context-aware framework, which then produces the final context. The data captured by the sensor is passed to marker detection. In the marker detection stage, the incoming image frames are searched for markers: the frames are transformed into binary images, and markers are detected in these binary images. The threshold value used to generate the binary image is calculated from the change in light intensity. Changing the threshold is necessary because the marker illuminations pre-stored in the system are static. When the illumination changes, the threshold value is changed so that marker detection and recognition remain reliable. The detected markers are recognized at the marker recognition stage, where the pose of the marker is also calculated. The pose determines the specific view of the image. The recognized markers are assigned unique IDs, which are passed to the high-level processing.
Figure 6: Context-aware system architecture [4]
In the high-level processing, the user selects content through interactions. The marker IDs recognized in the low-level processing are used to generate contents from the database. The system presents only the contents selected through user interaction. The selected contents are presented as an augmented scene, rendered in 3D for the particular pose of the camera.
This system supports the user by presenting dynamic content effectively.
The advantage of this system is that it addresses the light illumination problem encountered in the previous work.
The problems involved are:
• The major problem is time consumption in the low-level processing. If marker detection and recognition take too long, the preliminary context is sent directly to the context-aware framework. The performance of the low-level processing needs to be improved.
• How helpful the generated context is for the user.
This section discussed how context influences personalized tours. In the next section we see how the user's facial emotions play a major role in supporting the personalized tour.
ESTIMATING THE USER'S FACE EMOTIONS
Apart from user preferences, other mechanisms are available for presenting dynamic data to the user: face recognition, estimation of expressions and emotions, and eye tracking. The main idea behind introducing face emotions into personalized touring is to present data much more effectively. Face emotions are also considered in order to increase entertainment.
Algorithms have been developed to recognize the emotions of a face. In [7], human face emotions are classified using Genetic Algorithms, with lip and eye features used for the classification. Figure 7 shows the image processing steps.
Figure 7: Process of Image Processing [7]
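As a rough illustration of classifying emotions from lip and eye features, the sketch below uses a nearest-centroid rule. This is a deliberately simpler stand-in and not the Genetic Algorithm method of [7]; the two features, the training values, the emotion labels, and the names TRAINING, centroids, and classify are all hypothetical.

```python
import math

# Hypothetical training data: (lip_width_ratio, eye_opening_ratio) per emotion.
TRAINING = {
    "happy": [(0.80, 0.45), (0.75, 0.50)],
    "sad": [(0.40, 0.30), (0.45, 0.25)],
    "surprised": [(0.55, 0.90), (0.60, 0.85)],
}


def centroids(training):
    """Average the feature vectors of each emotion class."""
    result = {}
    for label, samples in training.items():
        n = len(samples)
        dims = len(samples[0])
        result[label] = tuple(sum(s[i] for s in samples) / n for i in range(dims))
    return result


def classify(features, class_centroids):
    """Return the emotion whose centroid is closest to the feature vector."""
    return min(class_centroids, key=lambda label: math.dist(features, class_centroids[label]))


if __name__ == "__main__":
    model = centroids(TRAINING)
    print(classify((0.78, 0.48), model))
```

The predicted label could then serve as one more context attribute when selecting content for the user.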
The facial expressions are recognized from real-time video. The system continuously tracks the face and detects facial expressions using facial feature detection techniques. In [8], Support Vector Machines (SVM) are used to classify the emotions.
Figure 8: Facial expression recognition [8]
Figure 8 shows that facial expressions are recognized and displayed with an image. In general, the systems mentioned above use the context of the user and the information about the object to optimize the content provided to the user. If the system also treats user emotions as context, the corresponding content is returned depending on that context, and the user is provided with dynamic content. This content may be of different types:
• Depending on the emotion context, the system may simply visualize an image that represents the user emotion.
• Audio presentations, which are pre-stored and related to the emotion contexts.
• Video clips and narrations. These may not all be computed dynamically; instead, pre-stored data related to the user's emotions is presented.
Developing this type of system with pre-stored data is not very complex, but developing a system that provides dynamic data by considering the user's emotions is much more complex.
The research questions in this area are:
• How to integrate this type of algorithm into a system that provides personalized tours?
• Adaptability to current systems would make implementation easier. How is this adaptability achieved?
• How to transform these emotion features into context of the user?
• User emotions are not constant; they change very rapidly over time.
• Problems in generating and managing the data.
• How to support a dynamic tour.
• How to increase performance? With all the above-mentioned constraints it is not easy to produce data very rapidly.
• Narrations to support different emotions. Narrations are generated depending on the object information and the user's context. The question is how to support dynamic narrations when user emotions are considered.
• How useful the generated content is to the user.
• Visualization techniques are needed that support different types of emotions and also the context; for example, when the user's emotion changes, the image or video clip currently shown should reflect the change in context.
DISCUSSION AND CONCLUSIONS
In this paper we have seen different types of systems that provide personalized tours. Context plays the major role in providing them. Some systems showed how context influences the retrieval and presentation of information. We have also seen different types of problems involved in transforming context in the low-level processing of the system. There is a need to increase the performance of the system.
We have also seen how human face emotions are detected and how those emotions influence the presentation or tour of the user.
REFERENCES
1. Dähne, P. and Karigiannis, J. N. 2002. "Archeoguide: System Architecture of a Mobile Outdoor Augmented Reality System". In Proceedings of the 1st International Symposium on Mixed and Augmented Reality (September 30 - October 01, 2002). IEEE Computer Society, Washington, DC, 263.
2. Klaus ten Hagen, Marko Modsching, Ronny Kramer, "A Location Aware Mobile Tourist Guide Selecting and Interpreting Sights and Services by Context Matching," pp. 293-304, The Second Annual International Conference on Mobile and Ubiquitous Systems: Networking and Services (MobiQuitous), 2005.
3. Pietro Mazzoleni, Elisa Bertino, Elena Ferrari, Stefano Valtolina, "CiVeDi: A Customized Virtual Environment for Database Interaction".
4. Wonwoo Lee, Woontack Woo, "Exploiting Context-Awareness in Augmented Reality Applications," pp. 51-54, 2008 International Symposium on Ubiquitous Virtual Reality (ISUVR), 2008.
5. Dongpyo Hong, Yun-Kyung Park, Jeongwon Lee, Vladimir Shin, and Woontack Woo, "Personalized Information Retrieval Framework", pages 81-90, ubiPCMM 2005.
6. Marian Stewart Bartlett, Gwen Littlewort, Ian Fasel, Javier R. Movellan, "Real Time Face Detection and Facial Expression Recognition: Development and Applications to Human Computer Interaction," p. 53, 2003 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW), Volume 5, 2003.
7. M. Karthigayan, R. Nagarajan, M. Rizon, Sazali Yaacob, "Personalized Face Emotion Classification Using Optimized Data of Three Features," pp. 57-60, Third International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2007), 2007.
8. Byungsung Lee, Junchul Chun, Peom Park, "Classification of Facial Expression Using SVM for Emotion Care Service System," pp. 8-12, 2008 Ninth ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing (SNPD), 2008.
9. Schmeil, A. and Broll, W. 2006. MARA: an augmented personal assistant and companion. In ACM SIGGRAPH 2006 Sketches (Boston, Massachusetts, July 30 - August 03, 2006). SIGGRAPH '06. ACM, New York, NY, 141.
10. Xueping Peng, Zhendong Niu, "The Research of the Personalization," pp. 107-110, 2007 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology - Workshops (WI-IATW), 2007.
11. Hong, J., et al. "Context-aware system for proactive personalized service based on context history". Expert Systems with Applications (2008), doi:10.1016/j.eswa.2008.09.002.
12. Dong-Hyun Lee, Jun Park, "Augmented Reality Based Museum Guidance System for Selective Viewings," pp. 379-382, Second Workshop on Digital Media and its Application in Museum & Heritage (DMAMH 2007), 2007.
Augmented reality telescopes
Claus Liebenberger
Student
claus.liebenberger@edu.uni-klu.ac.at
ABSTRACT
This paper investigates 5 projects which deal with presentation interfaces, which augment a real life scene with additional information like images, video, audio and even 3-dimensional virtual objects. Within those projects either a telescope or binocular like device has been developed and used in different environments to deliver a spectator an enriched viewing experience of an actual point of interest.
While not all of the devices are real telescopes, the utilized concepts are based on a telescope metaphor.
The investigated display devices mainly have been developed by scientific institutes, but some of them are already available on the market. In this case the construction has been done in cooperation with commercial institutions. Utilized technologies are explained with respect to tracking and visualization. Further, projects are mentioned, in which the devices have been set up and used.
Apart from the primary design principles of the different devices, a key point of investigation is the implication for possible user interaction, restricted by the design of the different devices, with respect to its relevance for the use at the project "Burgenbau in Friesach".
Keywords: Augmented reality, telescopes, survey
INTRODUCTION
Augmented reality (AR) systems are defined by Azuma in [3]. They have the following three characteristics:
• They combine real and virtual
• They are interactive in real time
• The virtual information is 3-dimensionally linked to a real environment [6]
Taking this definition, AR devices enrich real 3-dimensional environments with virtual content in real time.
The investigated display devices which provide this functionality either are real telescopes or are similar to telescopes in terms of functionality and hardware or user interface design.
With one exception, all investigated solutions have in common that additional information like images, videos, or even 3D objects are superimposed on real world images (Figure 1, center image) to finally achieve an enriched or augmented view of the scene (Figure 1, right image).
Figure 1: Augmented reality visualization system [26], [11]
In one project however, the real scene is completely replaced with an image from the past. Although this device (TimeScope) might not provide augmented reality as defined by Azuma, it is a mature existing solution which would be a good candidate for the monitoring of a castle building process. At least the basic idea of enabling the spectator to view past sceneries is applicable to our possible field of use for such a device. Therefore I found it worth including this project in the paper.
The devices and projects, which will be discussed in more detail are:
• The Meade LX200 GPS 10 Augmented Astronomical Telescope has been developed by Andre Lintu at the MPI Informatik in Saarbrücken, Germany. Its primary purpose is to support the use of telescopes in educational environments by adding relevant information to viewed astronomical objects. [14, 16]
• GeoScope has been developed by C. Brenner et al. at the University of Hannover, Germany. While also stationary, like the other telescopes, the main difference is that GeoScope uses a macro display instead of a micro display, which softens the restriction of single user operation compared to other solutions. [6]
• Reeves et al. from the University of Nottingham in England placed a telescope at the One Rock exhibition in the Morecambe Bay. They specifically investigated the user interaction of the visitors of the exhibition with the AR device. [20]
• TimeTraveller and its successor TimeScope are devices with which the spectator can go back in time and view the visible area in front of the telescope like it has been in the past. The TimeScope has been set up in Berlin and Munich and was jointly constructed by büro+staubach, WALL AG and ART+COM in Berlin. [1]
• AR telescope (XC-01), developed at the Fraunhofer Institute in Darmstadt, Germany. The device has been in use at the projects "Grube Messel", an archeological site nearby Darmstadt, and for monitoring the construction of the "INI-GraphicsNet building". [21, 26]
DESIGN CONSIDERATIONS FOR AUGMENTED
REALITY DISPLAYS
Figure 2: Optical-See-Through [13]
Before taking a closer look at different devices, a few
design considerations are discussed to provide means of
demarcation of the investigated projects.
The name of projects and devices indicate towards
telescopes (AR Telescope, GeoScope, TimeScope), but
some of them have little in common with real telescopes as
for instance defined in Wikipedia, or on Dictionary.com.
Nevertheless, they take advantage of the telescope
metaphor with respect to user interaction.
Therefore, I clarify what I mean by “telescope” by
classifying the devices using the dimensions of a taxonomy
of AR displays, which has originally been proposed in [19].
Also certain features, which the investigated devices have
in common, are stated.
Figure 3: Video-See-Through [13]
combined with computer generated images or objects and
presented to the spectator on a monitor. Most of the
binoculars and telescopes, which have been developed in
the environment of tourism, apply this technology.
Within this paper, I specifically talk about optical
telescopes in contrast to Radio and X-Ray or gamma-ray
telescopes.
Brenner states in [6] that, “The main benefits of this
approach are the ease of practical implementation, the
relatively low cost using commodity hardware components,
the possible use in mobile settings and the reduced
registration requirements compared to optical-see-through
solutions. Key limitations of video-see-through solutions
are the reduced optical resolution of the "real" environment
(limited to the video resolution) and the restricted field of
view“.
A more detailed discussion about the benefits and
drawbacks of each method can be found in [3] and [19].
Another very important restriction is that this paper focuses
on stationary devices. In the literature, a certain type of device
– namely the head mounted device (HMD) – is clearly
accepted as being an AR display. HMDs are not stationary,
and they certainly meet the requirement of linking virtual
information to a 3D environment. Although telescopes do
have some concepts in common with HMDs, the question
is whether and when a telescope that mixes real
with virtual can be called an augmented reality display.
In [19], Milgram et al. describe a number of dimensions
which can help to classify and compare different AR
displays or to describe the used approaches.
The Reality-Virtuality Continuum
Another (continuous) dimension is called the Reality-Virtuality Continuum (Figure 4). It deals with the
comparison of Virtual Reality, Mixed Reality and
Augmented Reality systems.
Optical-See-Through vs. Video-See-Through
The first dimension categorizes AR displays into Optical
See-Through and Monitor based displays. The latter class is
also called Video-See-Through [13].
Milgram et al. do not see the two concepts of augmented
reality and virtual reality as antitheses. In [3], they describe
the inherent attributes of a real and a virtual environment,
whereby the real environment – in contrast to the virtual
one – is strictly constrained by laws of physics. On the
other hand, the virtual environment is strictly synthetic
and may or may not mimic a real scene. These two
systems are placed at the opposite ends of a continuum with
all possible deviations in between.
With Optical-See-Through (Figure 2) systems, the spectator
can view the real world through a semi-transparent (or half-silvered) mirror, onto which computer generated images or
objects are projected. The investigated astronomical
telescope utilizes this technology.
With Video-See-Through systems (Figure 3), the real world
is synthesized (recorded using a video camera), before it is
The calculation of the correct position and view point can
either be achieved through positioning information,
gathered by hardware-sensors, like GPS or inertial sensors
as well as rotation encoders, or through image recognition
techniques (also called vision based tracking). The latter
can either be performed using special markers or by
identifying other recognizable objects in the target scene
(markerless tracking). Nearly all investigated systems use
hardware tracking techniques, but as vision based tracking
yields increasingly good results, these methods are also being
integrated into the tracking process.
Figure 4: Reality-Virtuality Continuum [19]
In the project descriptions, we will see that most devices
can easily be placed into the augmented reality
section of this continuum by placing virtual objects into a
real world picture. However, some devices (like
TimeScope) come close to the right side of this continuum,
by completely replacing the real world with pictures or
videos. Nonetheless, they place this information into a real
world 3-dimensional context and can therefore be called
augmented reality systems according to Azuma’s definition.
OCCLUSION
One major challenge in the field of augmented reality is to
overcome the problem of occlusion. Real objects (cars) or
subjects (people) might move into the scene in front of a
virtual object. As the exact position of such an object cannot be
measured (at least not in the investigated systems), it may
happen that the object disappears behind the projected 3D
model, although in reality it should be in front of the
model. This is especially true when displaying sceneries at
crowded cultural heritage sites, where a lot of people may
walk around.
Further dimensions
Further classification dimensions for AR (or MR) displays
stated in [19] are:
• Exocentric or egocentric reference: Like HMDs,
stationary telescopes also display a scene from an
egocentric point of view, especially when the scene is
watched through eye-pieces (micro-displays, or directly
through lenses).
As can be seen in Figure 5 from one of the investigated
projects (the AR telescope), the problem of occlusion
can already be solved for static and solid objects in the
scene. The new augmented building is nicely rendered
behind the wall which is visible in the picture. However,
movable objects or even trees cause problems. Note that
the leaves of the red tree are cut off at the edge of the new
building.
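The handling of static occluders can be sketched as a per-pixel depth comparison. This is a minimal illustration, assuming that a depth value is available for every pixel of the modeled static scene and of the virtual object; the function name `composite` and the tiny sample data are hypothetical and are not taken from the AR telescope implementation.

```python
def composite(real_rgb, real_depth, virt_rgb, virt_depth):
    """Overlay a virtual image on a real one, respecting known scene depth.

    All arguments are equally sized 2D lists. A virtual pixel with depth
    None is empty; otherwise it is drawn only where it is closer to the
    viewer than the modeled real geometry at that pixel.
    """
    out = []
    for y, row in enumerate(real_rgb):
        out_row = []
        for x, real_px in enumerate(row):
            vd = virt_depth[y][x]
            if vd is not None and vd < real_depth[y][x]:
                out_row.append(virt_rgb[y][x])
            else:
                out_row.append(real_px)
        out.append(out_row)
    return out


# One image row: a modeled wall at 10 m on the left, open view (100 m) on the right.
real_rgb = [["wall", "sky"]]
real_depth = [[10.0, 100.0]]
# The virtual building stands 30 m away in both pixels (hypothetical values).
virt_rgb = [["bldg", "bldg"]]
virt_depth = [[30.0, 30.0]]

result = composite(real_rgb, real_depth, virt_rgb, virt_depth)
```

A moving object that is not part of the depth model, such as a passer-by or a swaying tree, has no entry in `real_depth` and is therefore overdrawn, which corresponds to the artifact visible in Figure 5.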
• Extent of world knowledge: This dimension describes
to which extent the viewed scene is placed inside a 3-dimensional model. Naturally, the scene viewed by a
spectator through a telescope is not modeled at all.
However, to correctly deal with occlusion, this will be
partly necessary.
• Extent of presence metaphor: The extent of presence
denotes how much a spectator has the feeling of being
inside a viewed scene. Because of their stationary design
and the concept of viewing something “far away”,
telescopes can hardly provide this impression by nature.
However, this does not prevent the feeling that the viewed
scene actually happens at the time of viewing.
In [25], Zöllner et al. pursue a new approach of “reality
filtering” to partly overcome this problem. They try to filter
out movable objects or persons which could interfere with
the scene. Using object recognition techniques, unwanted
objects are “erased” from the scene by replacing the
affected areas with previous versions of the scene. However,
this approach has not yet been implemented in any of the
investigated projects.
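The idea of replacing affected areas with previous versions of the scene can be illustrated as follows. The sketch assumes that an object recognizer has already produced a boolean mask of unwanted pixels and that a stored image of the empty scene exists; both inputs and the function name `filter_reality` are assumptions made for illustration and do not reproduce the method described in [25].

```python
def filter_reality(frame, background, unwanted_mask):
    """Return a copy of `frame` in which masked pixels come from `background`."""
    return [
        [bg if masked else px
         for px, bg, masked in zip(f_row, b_row, m_row)]
        for f_row, b_row, m_row in zip(frame, background, unwanted_mask)
    ]


# Hypothetical grey values: a bright object crosses the middle column.
frame = [[5, 200, 7], [5, 210, 7]]
background = [[5, 6, 7], [5, 6, 7]]   # earlier view of the empty scene
mask = [[False, True, False], [False, True, False]]

cleaned = filter_reality(frame, background, mask)
```

In this toy example the middle column of `cleaned` is taken from the stored background, so the crossing object no longer appears in the output.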
TRACKING
PROJECTS AND PRODUCTS
Every augmented reality system faces the challenge to
correctly measure or calculate the current location of the
spectator (or the AR device), as well as its pointing
direction and angle of the view, in order to accurately
overlay a virtual object over the real world image.
I found projects in two areas in which telescopes with
augmented reality features have been applied in the past:
Astronomy and tourism. The visualization technique used
in astronomical AR telescopes is based on Optical-See-Through techniques [4].
While devices, which are worn by the user (like HMDs)
operate in a 3d environment with 7 external degrees of
freedom (position + pointing direction of the device – each
with 3 dimensions, plus tilt), the use of stationary
telescopes can help to reduce complexity concerning this
task. Because the location of the telescope is fixed, only 3
external degrees of freedom remain (as long as zooming is
not taken into account).
One of the investigated devices (TimeScope) even reduces
this to 1 degree – it can only rotate around one axis.
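With the position fixed, the viewing direction follows from the measured rotation angles alone. The sketch below assumes an east-north-up coordinate frame, a pan angle measured clockwise from north and a tilt angle measured upward from the horizon; these conventions and the function name `view_direction` are illustrative choices, not taken from any of the investigated systems.

```python
import math


def view_direction(pan_deg, tilt_deg):
    """Unit vector (east, north, up) for a pan/tilt telescope mount."""
    pan = math.radians(pan_deg)
    tilt = math.radians(tilt_deg)
    east = math.cos(tilt) * math.sin(pan)
    north = math.cos(tilt) * math.cos(pan)
    up = math.sin(tilt)
    return (east, north, up)


# Telescope turned 90 degrees from north, looking at the horizon.
looking_east = view_direction(90.0, 0.0)
```

A device that rotates about a single axis only would correspond to keeping `tilt_deg` constant.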
Figure 5: Occlusion, INI-GraphicsNET building [21]
The preferred method used in touristic AR telescopes is the
video see-through technique.
In this section, the projects, in which telescopes either have
been developed or used, are described in more detail.
Augmented Astronomical Telescope
Among all investigated projects, only two devices utilize a
real telescope in its original definition of a light gathering
and magnifying device. This is the first example.
The Meade LX200 GPS 10” Augmented Astronomical
Telescope has been developed by Andre Lintu et al. at the
MPI Informatik in Saarbrücken, Germany (Figure 6). Its
primary purpose is to support the use of telescopes in
educational environments by adding relevant information to
viewed astronomical objects. [14][16] It is also the only
project which uses an Optical-See-Through approach.
Figure 6: Augmented Astronomical Telescope [4]
Figure 7 shows a schematic description of the components
which form the complete system. The system is based on
the commercially available telescope MEADE LX200 GPS
10” (type: Schmidt-Cassegrain, www.meadeshop.de), which costs about €5500 [18]. The
telescope is motorized and includes a GPS system.
The original eye-piece of the telescope has been replaced
by a projection unit, which overlays computer generated
images onto the original image gathered from the telescope
(Figure 6). The team developed the unit in-house. It uses a
DMD (Digital Mirror Device) of 0.7” diagonal with XGA
resolution (1024 x 768 pixels). The DMD was developed by
OpSys Project Consulting. It features a custom projection
lens and a special color LED projection head (see Figure 8).
This system has been chosen because it
provides a very low black luminance. LCD displays, in contrast, also
emit light in dark areas. Especially when watching the sky,
where the target objects are difficult to spot, a permanently
overlaid “grey” area could severely disturb the
spectator.
Figure 7: Augmented Astronomical Telescope – Schema [15]
A beam splitter is used to combine both images [16]. The
computer generated (CG) image is obtained from a standard
laptop computer running a modified version of the open
source planetarium software Stellarium [10].
One noteworthy issue arises from the fact that the
geometric shape of the display (a 4:3 rectangle)
does not match the shape of the image from the telescope
itself (which is circular). The issue was solved by matching
the horizontal extents of the two images. With that, the
CG image cannot reach the very upper and lower parts of the
image obtained from the telescope, and only small
areas in the corners of the CG image cannot be displayed in
the final picture.
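The consequence of matching the horizontal extents can be estimated with a little geometry. The sketch assumes an ideal circle whose diameter equals the width of a 4:3 rectangle centered on it; the function name `overlap_fractions` is hypothetical, and the numbers describe this idealized setup rather than measurements from the project.

```python
import math


def overlap_fractions(aspect_w=4.0, aspect_h=3.0):
    """Overlap of a rectangle and a circle whose diameter equals the rectangle width.

    Assumes aspect_w >= aspect_h. Returns a tuple:
    (fraction of the circular telescope image covered by the display,
     fraction of the rectangular display that falls outside the circle).
    """
    r = 1.0                               # circle radius, arbitrary unit
    half_h = r * aspect_h / aspect_w      # half height of the rectangle
    # Area of the circle inside the horizontal strip |y| <= half_h.
    strip = 2.0 * (half_h * math.sqrt(r * r - half_h * half_h)
                   + r * r * math.asin(half_h / r))
    circle_area = math.pi * r * r
    rect_area = (2.0 * r) * (2.0 * half_h)
    return strip / circle_area, (rect_area - strip) / rect_area


covered, lost = overlap_fractions()
```

Under these assumptions, `covered` is roughly 0.86 and `lost` is about 0.10: the display reaches most of the circular image, and about a tenth of the CG image lies in the unusable corners.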
Figure 8: Projection schema [4]
During fast motion of the telescope, tracking is not relevant.
As soon as the telescope reaches its final position, a feature
called high precision – provided by the telescope itself – is
used to exactly determine the current pointing position.
With high precision, the telescope calibrates itself by first
slewing to a bright known star and then slewing back to the
original position. However, it is important to take the
sidereal tracking (compensating for the rotation of the earth about its own axis) into
account and to reach a high accuracy in this respect.
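To give a feeling for why sidereal motion matters, the following sketch computes how far a star appears to move for a telescope that is not tracking. It uses the standard length of the sidereal day; the function name `sidereal_drift_arcsec` and the simple cosine scaling with declination are illustrative assumptions and are not part of the described system.

```python
import math

SIDEREAL_DAY_S = 86164.0905  # one rotation of the earth relative to the stars


def sidereal_drift_arcsec(seconds, declination_deg=0.0):
    """Apparent angular motion of a star for a telescope that is not tracking."""
    rate = 360.0 * 3600.0 / SIDEREAL_DAY_S      # arcseconds per second of time
    return rate * seconds * math.cos(math.radians(declination_deg))


drift_per_second = sidereal_drift_arcsec(1.0)
```

At the celestial equator this amounts to about 15 arcseconds per second, so an overlay that ignores the motion would visibly separate from its target within a short time at high magnification.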
A remote controller is used to position the telescope. The
positioning information of the remote controller is
used both for the telescope and as tracking coordinates for the
software.
The blended-in textual information can certainly be
very useful for the spectator. However, as soon as textures
of the target objects are blended into the scene, one
can hardly tell whether the original image has been completely
replaced by the virtual image, or whether the real image is really
“only” augmented and still visible.
GeoScope
The GeoScope has been developed by C. Brenner et al. at
the University of Hannover, Germany. At first glance, the
device does not look like a telescope. While it is also stationary,
like the other telescopes, the main difference is that
GeoScope uses a macro display instead of a micro display,
which relaxes the restriction of single user operation
compared to other solutions. The originally intended areas of
application were city planning and public participation
therein.
Figure 9: Original and augmented view of the sky [4]
Brenner explains in [6]:
“The GeoScope is mounted at a fixed position and consists
of a display, oriented towards the user, and a camera,
which points at the surroundings. Just as a telescope, the
GeoScope can be turned around two axes, the two angles
being captured by measuring devices. Together with the
known position, a fast and highly precise tracking of the
current view direction is possible, allowing the
superposition of the real scene, as delivered by the camera,
and virtually generated information.”
Figure 10: GeoScope in operation [6]
The necessary precision could be achieved by performing
periodic error correction. Finally the exact position of the
telescope can be derived from the device itself (approx.
every second). [14]
A 10.4” TFT touch screen display with a resolution of
1024x768 pixels is used for user interaction, and the
computer generating the CG image is attached directly to the back
of the screen. The devices used have an operating temperature range
from -20 to +50°C, which makes them appropriate for
outdoor usage. Initially, a webcam was used as the image
gathering device. Because this camera does not have
zooming functionality, the intention was to replace the
camera.
The telescope's location and pointing position are then
mapped to the coordinate system of the Stellarium software,
which also knows the date, time and location from which
the object – stored in the database – originally has been
observed.
The MEADE telescope can use different focal points,
which makes it necessary to calculate a correct scaling
factor for the objects in the Stellarium database. A simple
approach has been chosen to calculate this scaling factor:
Pictures of the moon have been taken using the different eye-pieces, both directly from the telescope and from the overlaid
texture from the database. The sizes of the two objects
(original moon and moon object from Stellarium) have been
compared to calculate the scaling factor.
Tracking is based solely on hardware sensors. Due to the
nature of telescope type devices, with which distant target
objects are viewed, the tracking mechanism must be
precise. Absolute shaft encoders could have been used for
this purpose. They have the advantage that they
do not need to be calibrated using an endpoint and that they are
durable. On the other hand, encoders which meet the
necessary precision requirements (14-17 bit) are expensive
(400-600€). Therefore, cheaper analog industrial sensor potentiometers have been used. They have a resolution of
about 20 bit and are also durable enough. The analog
position information is converted using an A/D converter
and directly fed into the computer’s USB interface. While
the resolution did not cause any problems, the team
expected issues concerning linearity and temperature drift.
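The quoted bit depths can be translated into angles. The sketch treats the bit depth as the number of distinguishable steps per full turn, which is an assumption; the viewing distance of 1000 m and the function names are likewise hypothetical and only serve to illustrate the order of magnitude.

```python
import math


def angular_resolution_deg(bits):
    """Smallest angle step of a sensor with 2**bits positions per full turn."""
    return 360.0 / (2 ** bits)


def lateral_error_m(bits, distance_m):
    """Sideways offset at `distance_m` caused by a single angle step."""
    return distance_m * math.tan(math.radians(angular_resolution_deg(bits)))


# Offsets at an assumed viewing distance of 1000 m for the bit depths in the text.
offsets = {bits: lateral_error_m(bits, 1000.0) for bits in (14, 17, 20)}
```

Under these assumptions, one step of a 14-bit sensor shifts an overlay by roughly 0.38 m at a distance of 1000 m, while a 17-bit sensor reduces this to about 5 cm.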
Figure 9 shows the original (not augmented) image from
the telescope on the left, and the augmented counterpart on
the right. Some additional information about the target
object can be seen in the upper left area. In addition, the original
object has been overlaid by a texture from the Stellarium
software.
Animations of observed objects can also be blended into the
view. These objects can be rotated and their projected
evolution over time can be displayed. [14] Further, the user
has the possibility to switch the augmented textures on and
off to compare the real image with the CG image.
Concerning user interaction, well known interaction
concepts can be applied by using a macro display with
touch screen functionality. Apart from “selecting” the
desired window of the world by rotating the GeoScope in 2
dimensions, the device provides point and click
functionality. Menus or buttons can be displayed, through
which the user can obtain additional information about a
viewed object.
Other concepts like scrollbars, sliders, control boxes etc.,
which exist on standard PCs, can be utilized and the user
does not have to learn new interfaces to use the device. [7]
Another important aspect is the fact that, while only one
person at a time can use a telescope through an eye-piece,
several people can collaborate using the GeoScope.
Therefore, transition issues (handing the device over to
another person), as described by Reeves in [20], can also
be avoided.
Figure 11: Augmentation - real image (left) - masked (center) - virtual object (right) [7]
One might argue that the implementation of head tracking
would enhance the experience for a single observer. This
would result in a more “see through” device. However, the
device was originally intended for
collaboration. As soon as two or more observers use the
device at the same time, head tracking does not make sense,
because a different image would have to be displayed for every
single observer. On a single standard LCD display, this is
not feasible.
Figure 12: Setup at the One Rock exhibition
The ONE ROCK project
For the purpose of planning, two main aspects are
important:
Reeves et al. from the University of Nottingham, England,
placed a telescope at the One Rock exhibition in
Morecambe Bay. Within this project, an indoor installation
has been set up, and the telescope acts more like a
microscope by displaying small organisms inside an
incubator (see Figure 12).
• A solid geometric base: This is gathered from GIS
databases.
• A 3D-model of the surroundings of the target object:
This model is obtained using outdoor scanning
techniques.
In [20], the Telescope construction is described as shown in
Figure 13. Looking into the viewing tube (1) reveals the
contents of the screen (2), which displays a processed video
feed from a webcam located at the front of the body (3).
The Telescope can be moved using the handles (4) which
rotate the entire body section about the pivot of the tripod
(5). The light switch on the right handle triggers a halogen
lamp attached next to the webcam. The device allows
zooming.
The target object itself is also modeled in 3D. The
software used is ArcObjects from ESRI Inc. ArcObjects
can be used with licenses for ArcGIS Desktop or ArcGIS
Server, which are large and expensive GIS software
products. However, for a custom application as needed for
the GeoScope a cost-effective deployment is offered with
the ArcEngine Developer Kit and the ArcEngine Runtime
[20].
For tracking, a digital compass (6) is attached to the
underside of the viewing tube, and detects changes in the
heading and pitch of the Telescope’s upper section.
Rotation of the tube is calculated from the roll of the
compass as it is rotated by the viewing tube. The compass
heading readings allowed a 360-degree range, whereas both
pitch and roll were limited to ±40 degrees. Due to difficult
lighting in the setup (the incubator was not illuminated all
the time), the digital compass was the only means of
tracking (as opposed to marker or vision based
approaches).
Figure 11 shows a resulting scene as displayed on the
GeoScope. The left image displays the original scene. In the
center image, some buildings have been masked out. The
right image shows the scene augmented with the 3d model
of a new building.
Although one of the objectives of this project has been to
build a robust (i.e. vandal-proof) system, the
mounting and construction seem to be a bit fragile,
especially when compared to the AR telescopes XC-01 and
XC-02 (see chapters “AR Telescope…”).
• Although it might initially look like a drawback that
only one person at a time can watch the scene through
the telescope, the transition (hand over) of the device to
somebody else is relatively painless. With HMDs, the
viewpoint cannot be maintained during the transition
phase, which means that the next spectator will always
view the scene from a different perspective. Moreover, if
a target augmented region is no longer focused, the
new spectator might not even see the same information.
• Due to the separation of target and device – there is space
between the telescope and the viewed incubator – the
possibility of interference exists. People might walk into
view, for instance. The telescope system is not aware of
this, and the augmented object would still be visible. This
distracted visitors for a short moment, but the problem
was quickly recognized by the spectators by looking up
for a second. Afterwards they continued to watch the
scene.
Figure 13: Telescope at the One Rock exhibition
• Five different levels of engagement have been identified in
the setup: (a) Augmented user, (b) Disaugmented user:
the user controlling/holding the device, but not looking
through it, (c) Co-Visitor: standing nearby and related to
the currently augmented user, (d) Observer: Other
visitors grouped around the telescope, watching the
augmented user, (e) Bystander: Those not currently
engaged with the device or target.
Collaboration among the different groups has been noticed
to be an important part of the experience, and the roles have
been exchanged repeatedly. This might be problematic with
more permanently worn devices like HMDs. [20]
Figure 14: Resulting image of the One Rock telescope
This raised issues, because magnetic disturbances
occurred and reading the current compass
position was rather slow, so it took some time until
augmented areas could be “spotted” by the spectator.
TimeTraveller and TimeScope
TimeTraveller and its successor TimeScope are commercial
solutions with which the spectator can go back in time and
view the visible area in front of the telescope as it
was in the past (Figure 15).
The software, which has been developed, uses the Java
Communications API to gain compass position information.
It also provides the video handling and display service
using the Java Media Framework API and OpenPTC
graphics library.
Little information about the used technologies has been
published. What can be seen from the product descriptions
of the different web pages of the developing companies
WALL AG, ART+COM and büro+staudach [1][8][2], is
the exterior design and the general functionality. (Figure
15)
Augmented areas initially are surrounded by green
rectangles to ease the spotting. When such a region comes
into view, a video or image is rendered into the scene.
Figure 14 shows the resulting image, when a user looks
through the telescope.
A 6.4” flat panel display is used, which is connected to an
embedded PC with a VIA C3 processor. Instead of hard
drives, flash memory acts as the mass data storage. Below
the computer, a high precision rotary encoder measures the
current pointing direction of the telescope, which can be
rotated by 120°. In addition, a cooling/heating unit is
included in the housing.
During the exhibition, the usage of the telescope has been
monitored by external cameras and by simultaneously
tracking the current position of the telescope to have a
synchronized view of the image, which the user currently
sees.
A number of different aspects like Sharing and Stability,
Viewing and vicinity, levels of engagement and transitions
have been investigated. [20]
On the outside, a live cam (1) is used to gather the real
image. Thus, a video-see-through approach is
used here as well. To the left and right of the binoculars (3), buttons are
placed, with which the user can step back and forth in time
(2). The housing is made of brushed stainless steel.
The most interesting findings are listed in the following.
The TimeScope has been set up in Berlin and Munich.
Sample pictures on the project web page indicate that the
device has also been used in indoor exhibitions, fairs or
museums.
In a special business model, WALL AG offers the telescope
for free, when an advertising medium can be placed
together with the telescope. Apart from that, the telescope
can be purchased normally.
AR Telescope (XC-01)
The AR telescope has been developed by D. Stricker et al.
at the Fraunhofer IGD institute in Darmstadt, Germany. It
also takes a video-see-through approach and is able to
overlay the real world with geo-tags or 3d modeled objects.
Figure 18 shows the device installed at one project at Grube
Messel.
Figure 15: TimeScope internals and design
A vandal-proof metal case contains a high resolution
camera and an LCD display, a precise hardware tracking
system, air conditioning and a coin-acceptor unit. On the
front, a camera records the real scene. Augmented with
digital overlays, this scene is displayed on the LCD panel at
the back of the telescope. Visualization is performed by a
standard PC, which is mounted inside the base plate of
the device. The only thing needed for operation is external
power. Optionally, the XC-01 can be equipped with a coin-acceptor unit, which starts the application as soon as the user
inserts a coin [21]. For outdoor usage, the device can
also be equipped with an air-conditioning unit.
The XC-01 also solely relies on hardware sensors which
gather information about the pointing direction of the
device. [17]
Figure 16: TimeScope view [2]
A standard software package has been adopted by IGD to
create applications for the XC-01. This software package
had to be purchased separately but will be shipped directly
with the new version XC-02 (see following chapter). Using
this software, simple objects like images, text and videos
can be placed into the scene by using a drag and drop
editor. A function which shall be performed for a certain
tag, can be selected from a predefined pool of functions.
[24]
Such functions can be:
• Tracking configuration of the camera
• Animated or interactive scenarios
• Video and 2D animations
Figure 17: AR telescope schema [22]
• Light simulation
As depicted in Figure 15, the telescope can only be rotated
along its vertical axis – no up and down movement is
possible. By pressing the time-zoom buttons (Figure 15 – 2)
images from different times in the past are displayed. These
pictures completely replace the real image, but are placed at
the exact same position. Additional interaction elements
(Figure 15 – 6) may be used for further unspecified
functionalities. Figure 16 shows the image as a spectator
can view it through TimeScope.
• Occlusion objects
3D models can be provided in VRML format or in its
successor, the X3D standard.
The XC-01 was first set up in summer 2005 at
Grube Messel, a UNESCO Natural World Heritage Area
near Darmstadt. An application has been developed to let
the visitor explore the site’s fossils, geology and industrial
history in an attractive and entertaining way. Figure 18
shows an example view of this application. [21, 12]
According to Dr. Zöllner, the IGD institute calculates a
project timeframe of about 4 to 6 months. The time
primarily depends on the type of CG information which
is to be overlaid.
Another installation was located right next to the
IGD institute, where the new building of the INI-GraphicsNet Foundation was being constructed.
Through the XC-01, visitors could see
the final building in its real environment before it had actually
been built. Behind the semi-transparent 3D model, the
progress of the construction could be seen live. When the
construction has finished, the real building and the virtual
one will have grown into each other. An example picture is
provided in section OCCLUSION in Figure 5. [21]
While geo-tagging¹ is implemented rather easily (when the
base material is provided by the customer), 3D models
naturally need more attention. IGD has staff available to fine
tune 3D models provided by customers, in order to reduce
complexity and optimize them for projection. If very
detailed and textured 3D models are to be displayed,
special computer hardware might be necessary to
cope with the computational requirements of correctly
displaying the model in real time without (or with acceptable)
latency.
AR Telescope (XC-02)
Leaving the TimeScope aside – as one might argue about whether it
really is an “augmented reality” solution – the AR telescope
XC-02 seems to be the most mature device available.
The XC-02 is a new version of the XC-01, which will be
released at the beginning of 2009. Most of the information
about this new version has been gathered through a phone
conversation with Dr. Zöllner on Dec. 1st, 2008. During the
previous installations of the XC-01, the following findings
could be made:
1) The tracking, which has been based solely on hardware
sensors, runs into problems when the mounting of the
telescope gets loose, for instance due to loosened screws.
2) Although built to be robust and vandal-proof, the
two movable elements (pivots) to rotate the telescope in
horizontal and vertical direction are susceptible to wear under heavy
usage of the device.
3) Initial calibration and modeling must be done on-site,
which raises project costs.
Figure 18: AR Telescope at Grube Messel [21]
IGD tried to tackle these shortcomings in the new version
XC-02.
The design of the telescope has been modified towards a
periscope to reduce the number of externally accessible
movable elements (Figure 19).
A new vision based tracking mechanism, as described in
[25] is now used in combination with the hardware sensors.
This addresses two of the shortcomings mentioned above
(tracking and calibration):
Now, the modeling can be done off-site using panoramic
images of the scene on which the tracker routine can be
trained. Also, the device does not solely rely on the
hardware sensor information and therefore the precision of
these sensors is not that important anymore. The final
position and pointing direction is gathered primarily by the
vision based system, yet by means of sensor fusion, also the
hardware sensors are still taken into account. Therefore,
when the system is set up at its final location, it can nearly
run “out of the box” with only minimal calibration efforts.
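How two heading estimates might be combined can be sketched with a weighted circular mean. The concrete fusion method of the XC-02 is not described here, so the weighting scheme, the default weight of 0.8 in favor of the vision based estimate and the function name `fuse_heading` are assumptions chosen purely for illustration.

```python
import math


def fuse_heading(vision_deg, sensor_deg, vision_weight=0.8):
    """Weighted circular mean of two heading estimates, result in [0, 360)."""
    w_v = vision_weight
    w_s = 1.0 - vision_weight
    x = (w_v * math.cos(math.radians(vision_deg))
         + w_s * math.cos(math.radians(sensor_deg)))
    y = (w_v * math.sin(math.radians(vision_deg))
         + w_s * math.sin(math.radians(sensor_deg)))
    return math.degrees(math.atan2(y, x)) % 360.0


# Hypothetical readings on both sides of north: 359 degrees and 3 degrees.
fused = fuse_heading(359.0, 3.0)
```

Averaging on the unit circle avoids the jump at 0°/360°: the two readings above yield a heading close to 359.8° rather than a value near 288°, which a naive weighted arithmetic mean would give.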
Figure 19: XC-02 Augmented reality telescope [24]
The price for the XC-02 is not yet defined, but will be in the
same range as the XC-01 (approx. € 17.000), except that the
software will now be included in this price.
¹ Geo-tagging: placing icons at certain positions within the
scene, which can provide further information after
activation.
which are encountered in general, when dealing with
freely movable devices.
FURTHER PROJECTS
I have found some other projects related to mixed reality
and the telescope metaphor. They have not been covered in
more detail in this work, either because the projects do not
have a scientific or commercial background, or because the basic
concept does not adhere to the “stationary” requirement.
The latter moves the concept of the developed devices closer
to the area of head mounted devices.
• The user does not have to carry any equipment himself/herself. [5]
• The devices can be operated without attendance by site staff.
• The metaphor of a telescope is well understood, so no
detailed explanation is necessary on how to operate the
device.
Nevertheless, the projects, which have been found, are
briefly listed in this chapter.
However, a few drawbacks can also be identified:
• None of the investigated solutions provides a stereoscopic
view of the scenery. This might be acceptable, because
nobody would expect a stereo image when peeking
through a telescope. In general (especially for outdoor
usage), the target objects are far away from the
spectator.
Drift (artistic project)
In [9], the artist Alex Davis presents an installation of a
telescope on a ferry boat, with which the visitor can view
the real-time 180° panorama. When the user examines
different aspects of the environment, they are not only
changing the spatial view, but also the temporal one.
• The target object can only be viewed from one side – it
is not possible to move around the object to view it from
a different perspective. This shortcoming can be reduced
by placing multiple devices at different locations.
Initially this might seem costly, but equipping
visitors of a cultural site with HMDs, for instance,
also requires a number of devices to be acquired.
The project does not claim to be scientific.
Because of that, the aspects of augmented reality,
tracking issues or the design of the telescope itself are not
explained, if tackled at all. Therefore, this project is only
mentioned briefly in this work.
However, one aspect of the installation is interesting in
terms of user interaction: The user can “control” the time
by rotating the telescope towards the back end of the ferry
boat. The more the telescope is rotated in this direction, the
more “blurry” the view becomes, and videos of the scene
from the past are displayed in the telescope. Moving back
towards the front of the craft, the time moves “forward”
again and the real environment is shown.
The majority of the investigated projects utilize hardware
sensor tracking, which can be sufficient due to the fact that
the investigated telescopes all have had a fixed position (at
least during one session). However, with the recent progress
in image based tracking, these technologies could be used
together with the hardware based tracking to improve the
robustness of
AugureScope
The projects in which the devices have been tested by the
originators are (partly) comparable to the project
Burgenbau in Friesach. In particular, placing augmented
reality telescopes in the surroundings of the castle which
will be built seems to be feasible and therefore interesting.
Schnädelbach et al. developed the AugureScope, an
augmented reality telescope mounted on a movable
platform. The same implications as for worn devices such
as HMDs therefore apply. Similar to the GeoScope, the
system also uses a macro display, so multiple visitors can
use the AugureScope simultaneously, in contrast to HMDs.
The authors created real augmented reality applications
using avatars. [23]
REFERENCES
1. ART+COM AG and WALL AG. Timescope,
http://www.timescope.de/default.asp?lang=de, last
accessed on 07.11.2008
CONCLUSIONS
2. ART+COM AG. TimeScope brochure, Berlin, 2005,
available from http://www.artcom.de/images/stories/
2_pro_timescope/timescope_e.pdf, last accessed on
02.12.2008
We have seen some examples of AR devices which either
use telescopes or at least take advantage of the telescope
metaphor.
In the following, the benefits of using the concept of
telescopes are listed:
3. Azuma R. A Survey of Augmented Reality. In
Presence: Teleoperators and Virtual Environments,
volume 6, 355–385, August 1997, also available from
https://eprints.kfupm.edu.sa/21390/, last accessed on
07.01.2009
 As one can easily observe by the examples of
TimeScope and AR telescope, the technology is mature
enough, to be already available on the market and has
some advantages compared to alternative augmented
reality hardware, like head mounted devices. Especially,
telescopes are stationary. This eases tracking issues,
4. Bimber, O. and Raskar, R. 2006. Modern approaches to
augmented reality. In ACM SIGGRAPH 2006 Courses
(Boston, Massachusetts, July 30 - August 03, 2006).
SIGGRAPH '06. ACM, New York, NY
5. Bleser G., Becker, M. and Stricker, D. Real-time vision-based
tracking and reconstruction, In Journal of real-time image
processing 2 (2007), no. 2-3, pp. 161-175,
Springer Berlin/Heidelberg, Germany.
17. Lutz B., Roth D., Weidenhausen J., Mueller P., Gora S.,
Vereenooghe T., Stricker D. and Van Gool, L. EPOCH
Showcase: On Site Experience. In 5th International
Symposium on Virtual Reality, Archaeology and
Intelligent Cultural Heritage - VAST2004, Brussels,
Belgium, December 2004, also available from
http://public-repository.epoch-net.org/deliverables/
D2.4.1-Showcases.pdf, last accessed on 07.01.2009
6. Brenner C., Paelke V., Haunert J. and Ripperda N. The
GeoScope - A Mixed-Reality System for Planning and
Public Participation. In UDMS'06, Proc. of the 25th
Urban Data Management Symposium, Aalborg, 2006.
18. MEADE LX200 Product Description. Available from
http://www.astroshop-sachsen.de/teleskope/teleskope
/pdf_datasheet.php?products_id=2002&osCsid=201339
4bb1020b1d276cb09051b04d17, last accessed on
02.12.2008
7. Brenner C. and Paelke, V. Das GeoScope - Ein Mixed-Reality-Ein-Ausgabegerät für die Geovisualisierung. In
Aktuelle Entwicklungen in Geoinformation und
Visualisierung, GEOVIS 2006, Kartographische
Schriften Band 10, Kirschbaum Verlag. Potsdam, April
2006,
19. Milgram P., Takemura H., Utsumi A. and Kishino F.
Augmented Reality: A class of displays on the reality-virtuality continuum, In SPIE Vol. 2351,
Telemanipulator and Telepresence Technologies, pp.
282-292, 1994
8. büro+staubach. TimeTraveller and TimeScope,
http://www.buero-staubach.de/index.php?id=211, last
9. Davis A. Drift (Project Website). available from
http://schizophonia.com/installation/index.htm, last
20. Reeves S., Fraser M., Schnädelbach H., O’Malley C.
and Steve Benford. Engaging augmented reality in
public places. In Adjunct proceedings of SIGCHI
Conference on Human Factors in Computing Systems
CHI 2005, ACM Press, April 2005.
10. Fabien Chéreau. Stellarium. available from
http://stellarium.sourceforge.net, 2005.
11. Fritz F., Susperregui A. and Linaza M.T. Enhancing
Cultural Tourism experiences with Augmented Reality
Technologies, In The 6th International Symposium on
Virtual Reality, Archaeology and Cultural Heritage
VAST (2005), M. Mudge, N. Ryan, R. Scopigno
(Editors), Pisa, Italy, 2005, also available from
http://public-repository.epoch-net.org/publications/
VAST2005/shortpapers/short2005.pdf
21. Stricker, D. Virtualität und Realität. In Games and
Edutainment Nr 01|2005, INI-GraphicsNet, Darmstadt,
2005, available from
http://www.inigraphics.org/press/brochures/games_broc
h/games/Games_2005.pdf, last accessed on 07.11.2008
22. Stricker, D. Virtual and Augmented Reality Lecture
Notes, Fraunhofer IGD, 2008, available from
www.igd.fhg.de/~hwuest/vorlesung/Vorlesung1_Sommer
semester_2008.pdf, last accessed on 02.12.2008
12. Grube Messel project web page. http://www.grubemessel.de, last accessed on 03.12.2008
23. Schnädelbach H., Koleva B., Flintham M., Fraser M.
and Chandler P. The Augurscope: A Mixed Reality
Interface for Outdoors, In Proc. CHI 2002, pp. 1-8,
ACM Press, April 2001
13. Kiyokawa K., Kurata Y. and Ohno H. An Optical See-through Display for Enhanced Augmented Reality,
http://www.lab.ime.cmc.osakau.ac.jp/~kiyo/cr/kiyokawa-2000-05-GI2000/kiyokawa2000-05-GI2000.pdf, last accessed on 30.11.2008
24. Tourismus Medien Produktion. Product web page of the
XC-01 and XC-02 AR Telescope, 2007,
http://www.tourismus-medien.de, last accessed on
02.12.2008
14. Lintu A. and Magnor M. An Augmented Reality System
for Astronomical Observations. IN IEEE Virtual Reality
2006, pp. 119-126, IEEE, Piscataway, USA, March
2006
25. Zoellner M., Pagani A., Pastarmov Y., Wuest H. and
Stricker D. Reality Filtering: A Visual Time Machine in
Augmented Reality. In The 9th International Symposium
on Virtual Reality, Archaeology and Cultural Heritage
VAST (2008)
15. Lintu A. and Magnor M. Augmented Astronomical
Telescope - project web page, http://www.mpiinf.mpg.de/~lintu/projects/aat.html, last accessed on
02.12.2008
26. Zoellner M., Stricker D. and Bockholt U. AR
Telescope - Taking Augmented Reality to a large
Audience. COMPUTER GRAPHIK topics, 17(1):19–20,
2005, also available from
http://www.inigraphics.net/press/topics/2005, last
16. Lintu A. and Magnor M. Augmented Astronomical
Telescope. In Virtuelle und erweiterte Realität: 2.
Workshop der GI-Fachgruppe VR/AR, Aachen,
Germany, 2005, also available from
http://www.cg.cs.tubs.de/people/magnor/publications/arvr05.pdf, last
Interactive museum guides: Identification and recognition
techniques of objects and images
Guntram Kircher
Waldhorngasse 17, A-9020 Klagenfurt
guntram.kircher@gmail.com
ABSTRACT
identification. Therefore this paper discusses some
techniques like barcodes, optical card readers and RFID.
This paper addresses three questions: Why do we need image
and object recognition for cultural artifacts? What is
required for image/object recognition and identification,
and which techniques make sense? What possibilities exist
to enhance the interaction between the user and the images
and objects once they are identified? First, the different
fields of application of interactive museum guides and of
image recognition in general are described and illustrated
with a state-of-the-art example. The paper then gives an
overview of image/object recognition techniques such as the
SURF and SIFT algorithms, followed by identification
techniques such as RFID, together with their advantages and
disadvantages. The third part discusses possibilities to
enhance the interaction between the visitor and the objects
of interest. Finally, a study by the institute FIT is
presented, which analyzes a museum guide from the user’s
point of view.
THE NEEDS OF IMAGE/OBJECT RECOGNITION AND
IDENTIFICATION
The needs span many domains, such as industry, the military,
medicine and everyday life.
In the industrial sector it is useful to be able to
automatically inspect manufactured components such as
machine parts, food products etc.
The military sector wants to automatically recognize
weapons systems such as planes, ships, tanks and missiles
and to distinguish friend from foe.
In the medical sector it would be very useful to diagnose
diseases automatically and to interpret medical images
such as X-rays and CAT scans automatically.
The last domain is everyday life, which is the focus of
this paper. Here it would be desirable to have a system
that lets users interact with images and objects of
cultural artifacts such as buildings, statues and
pictures. [4] To illustrate this domain, I present the
PhoneGuide, which was developed at the Bauhaus-University
Weimar in Germany. This guide uses camera-equipped mobile
phones with on-device object recognition. Its main
technical achievement is a simple object recognition
approach in which all computations are carried out
directly on the phone. This reduces the cost of online
time and avoids network traffic. Communicating information
about cultural artifacts to the visitor is very important
for museums. The most common mobile devices for museum
visitors are audio guides. These devices have some
disadvantages: they can present only auditory information,
and the user has to look up and type in the
identification number of the object. [2]
Author Keywords
Image recognition, object recognition, object identification,
SURF, SIFT, RFID, barcodes, visual code
INTRODUCTION
These days it is necessary for museums to look at new
technologies and possibilities to enhance the attractiveness
of their cultural artifacts. [4] This is important because
many museums present their exhibits in a rather passive and
non-engaging way. It is often very difficult for the museum
visitor to get the information of interest. Moreover, the
information found does not always meet the specific
interest of the visitor. [1] One of these new technologies
addresses the task of finding correspondences between two
images of the same scene or object. In this context the
paper describes the SURF and SIFT algorithms as examples of
such a technology. [5] The ability to identify objects of
interest is an important requirement for any tour guide.
The idea is to reproduce the act of pointing to an object
such as a building, statue or image, without having to ask
a human tour guide: “What is that?” [3] One such technology
is object
REQUIREMENTS OF IMAGE AND OBJECT RECOGNITION
Digital imaging is often used as a tool to extract
information. There are three areas of digital imaging:
• Image Processing
• Computer Graphics
• Machine Vision
Image processing is concerned with improving the
appearance of an image. Examples are filtering out noise,
correcting distortion and improving contrast so that the
image can be better understood. [4] It helps us to make
sense of the vast amount of data acquired every day in
real-life scenarios. In a medical image, for instance,
image processing can be used to highlight differences and
make it easier for a human being to detect cancer. Another
application might be to restore an old film that has been
damaged by dust and decay. [11] The second area, computer
graphics, is concerned with creating an image of a scene,
object or phenomenon from a given description of it. [4]
This field deals with the problem of image synthesis. The
process of producing the image data from the scene model is
called rendering. Image synthesis involves the physics of
light, the properties of materials etc. Animated imagery
additionally draws on simulation theory, finite element
analysis, kinematics, sampling theory and other
mathematically based fields. Computer graphics is therefore
very closely connected to mathematics. [7] The third area,
machine vision, tries to create a description of a scene,
object or phenomenon from a given image of it. [4]
It aims to duplicate human vision by electronically
perceiving and understanding an image. Machine vision
constructs explicit, meaningful descriptions of physical
objects from images. It comprises techniques for estimating
features in images, relating feature measurements to the
geometry of objects in space and interpreting this
geometric information. [10]
TECHNIQUES OF IMAGE AND OBJECT RECOGNITION
The search for discrete image correspondences can be
divided into three steps. In the first step, “points of
interest” are selected at distinctive locations in the
image, such as corners, blobs and T-junctions. In the second
step, the neighborhood of every “point of interest” is
represented by a feature vector. In the third step, the
descriptor vectors are matched between different images.
This matching is often based on the distance between two
vectors.
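The third step can be illustrated with a minimal sketch. It assumes that each image has already been reduced to a list of descriptor vectors (plain lists of numbers), uses the Euclidean distance, and accepts a match only if the nearest neighbor is clearly closer than the second nearest. The function names and the ratio value 0.8 are illustrative assumptions and are not taken from the cited papers.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_descriptors(query, reference, max_ratio=0.8):
    """Return (query_index, reference_index) pairs of accepted matches.

    A query descriptor is matched to its nearest reference descriptor,
    but only if that neighbor is clearly closer than the second nearest
    one (the ratio heuristic is an assumption of this sketch).
    """
    matches = []
    for i, q in enumerate(query):
        ranked = sorted((euclidean(q, r), j) for j, r in enumerate(reference))
        if not ranked:
            continue
        best_dist, best_j = ranked[0]
        if len(ranked) == 1 or best_dist < max_ratio * ranked[1][0]:
            matches.append((i, best_j))
    return matches
```

With two clearly separated descriptors per image, every query vector finds its counterpart; when two reference vectors are equally close, the match is rejected as ambiguous.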
SURF (SPEEDED-UP ROBUST FEATURES)
The first technique presented in this paper is the SURF
(Speeded-Up Robust Features) algorithm, a high-performance
scale- and rotation-invariant interest point detector and
descriptor. The algorithm tries to find the most important
interest points of images or objects so that images or
objects of interest can be recognized very fast. These
interest points are selected at distinctive locations in
the image, such as corners, blobs and T-junctions. SURF
approaches or even outperforms previously proposed schemes
with respect to repeatability, distinctiveness and
robustness. It achieves these goals by using integral
images for image convolutions. Furthermore, it builds on
the strengths of the leading existing descriptors and
simplifies these methods to the essentials.
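Why integral images speed up convolutions with box filters can be shown with a small sketch. It assumes a grayscale image given as a list of rows; the helper names are illustrative and this is not the implementation from the SURF paper. Once the integral image has been built, the sum over any rectangle needs only four lookups, regardless of the rectangle size.

```python
def integral_image(img):
    """Build an integral image with one extra row and column of zeros.

    ii[y][x] holds the sum of all pixels above and to the left of (y, x),
    i.e. of img[0..y-1][0..x-1].
    """
    height = len(img)
    width = len(img[0]) if height else 0
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]
```

A box filter response is then a weighted combination of a few such box sums, which is why its cost does not grow with the filter size.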
Figure 1: Gaussian derivatives with box filters
First, the algorithm constructs a circular region around
each detected “point of interest” in order to assign a
unique orientation and thereby gain invariance to image
rotations. The Haar wavelet filters are shown in the middle
of Figure 2.
Figure 2: Left: detected “points of interest”; Middle: Haar
wavelet filter; Right: size of the descriptor window at different
scales.
On the left side, there is a homogeneous region, all values
are relatively low. [1]
Several versions of SURF are available: the standard
SURF-64, the alternatives SURF-36 and SURF-128, and the
upright counterparts U-SURF-64, U-SURF-36 and U-SURF-128.
SURF and its variants differ in the dimension of the
descriptor. SURF-36, for example, has only 3x3 subregions,
while SURF-128 is an extended version of SURF. The very
fast matching speed of the algorithm is achieved by a
single additional indexing step based on the sign of the
Laplacian of the interest point. This minimal information
makes it possible to almost double the matching speed, and
it comes at no extra cost because it has already been
computed in the interest point detection step. The main
advantages of the algorithm are its speed and its
recognition rate. [1]
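The indexing step based on the sign of the Laplacian can be sketched as a cheap pre-filter: descriptors are only compared when the signs agree, so roughly half of the distance computations can be skipped. The data structure and function below are hypothetical illustrations, not code from the SURF authors.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class InterestPoint:
    laplacian_sign: int          # +1 or -1, stored during detection
    descriptor: Sequence[float]  # e.g. 64 values for SURF-64


def nearest_same_sign(query: InterestPoint,
                      candidates: Sequence[InterestPoint]) -> Optional[int]:
    """Index of the nearest candidate with the same Laplacian sign, or None."""
    best_index, best_dist = None, math.inf
    for index, cand in enumerate(candidates):
        if cand.laplacian_sign != query.laplacian_sign:
            continue  # skipped without computing any distance
        dist = math.sqrt(sum((a - b) ** 2
                             for a, b in zip(query.descriptor, cand.descriptor)))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index
```

In this sketch a candidate with an identical descriptor but the opposite sign is never returned, which reflects the idea that the sign comparison comes before any descriptor comparison.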
SIFT (SCALE INVARIANT FEATURE TRANSFORM)
are robust against scaling and rotation and partially robust
against changes in lighting conditions. A definite
disadvantage of the algorithm is its speed. [8]
Experimental results have shown that the newer SURF
algorithm is faster and achieves a better recognition rate
for the objects of interest. [1]
The algorithm was developed by the Canadian academic
David G. Lowe. The method is, as far as possible, invariant
to variations of the image or the object. The algorithm
originates from computer vision; one field of application
is photogrammetry, where it is used for matching images.
The algorithm consists of four major stages: scale-space
peak selection, key point localization, orientation
assignment and key point descriptor. In the first stage,
potential interest points are identified by scanning the
image over location and scale. Then key points are
localized to sub-pixel accuracy and eliminated if found to
be unstable. [8]
TECHNIQUES OF AUTOMATIC OBJECT IDENTIFICATION
In the next few years the use of computers will expand
further. In this connection, everybody should consider the
opportunities offered by techniques based on multimedia,
the internet and other communication instruments. The
techniques for the automatic identification of objects can
be summarized under the name “auto-ID”. In this context I
will present RFID (Radio Frequency Identification),
barcodes and visual codes. Other techniques, such as
biometric and card reader techniques, are not presented
because they are less relevant to the theme of cultural
heritage.
RFID (RADIO FREQUENCY IDENTIFICATION)
RFID is an object identification technique that transmits
the identity of an object wirelessly using radio waves. In
our case the objects are the cultural artifacts. The
identity is given as a unique serial number. RFID belongs
to the Auto-ID technologies, which also include barcodes,
optical card readers and some biometric technologies. The
basic technology consists of a reader and an RFID tag. The
reader is a device that emits radio waves and receives
signals back from the tag. The tag is a microchip attached
to a radio antenna and mounted on an underlying layer.
Some advantages of RFID are:
• Non-line-of-sight (possibly built into or placed
inside containers)
• Long range
• Many tags read out at once
• Robust (not as fragile as a printed bar code)
• Gives a path from simple identification of objects
to locating objects
• Almost as cheap as a printed barcode
Figure 3: This figure shows the stages of key point selection.
(a) The 233x189 pixel original image. (b) The initial 832 key
point locations at maxima and minima of the
difference-of-Gaussian function. Key points are displayed as
vectors indicating scale, orientation, and location. (c) After
applying a threshold on minimum contrast, 729 key points
remain. (d) The final 536 key points that remain following an
additional threshold on ratio of principal curvatures. [9]
In the third stage, the dominant orientations of each key
point are identified based on its local image patch. [8] The
final stage constructs a local image descriptor for each key
point. The patch is centered on the key point’s location,
rotated according to its dominant orientation and scaled to
the appropriate size. The goal is a descriptor that is
compact, highly distinctive and yet robust to changes in
illumination and camera viewpoint. The standard SIFT key
point descriptor is notable in several respects: the
representation is carefully designed to avoid problems due
to boundary effects; it is compact, expressing the patch
of pixels using a 128-element vector; and the representation
is resilient to deformations such as those caused by
perspective effects. On the other hand, this descriptor is
complicated and the choices behind its specific design are
not clear. Another, possibly better, approach that reduces
the complexity of the standard SIFT descriptor is the
PCA-SIFT descriptor, which takes the same input as the
standard SIFT descriptor. The main advantage of the SIFT
algorithm is its robustness. That means that the key points
There are two main categories of RFID systems: active and
passive. Active RFID tags have their own transmitter and
power source and are write-once or rewritable. They use
the power source to broadcast a signal that transmits the
data stored on the microchip. Passive tags have neither
their own transmitter nor a power source. They are also
write-once or rewritable and reflect back the radio
waves coming from the reader antenna. One interesting
application in our context of cultural heritage is “Mobile
RFID”, in which a mobile phone is used as the RFID reader.
Figure 4 shows an example of such an application.
books, and even luggage is tagged at the airport. The
technique was invented in 1949 by Bernard Silver and
Norman Woodland. Barcodes were not used in retail
until 1967, when the first barcode scanner was
installed by RCA in a Kroger store in Cincinnati. The
basic principle of barcodes is to encode alphanumeric
characters in bars of varying widths using a
one-dimensional coding scheme. The height of the barcode
provides added redundancy when parts of the code are
damaged. The UPC code, for example, contains a code
indicating the manufacturer of the product and a number
that identifies the product family and product. No price
information is included; instead, the product information
is passed to the in-store product database after scanning
and the appropriate price is retrieved. The code consists
of twelve digits from 0 to 9. The first eleven digits are
data and the last one is a “check digit” that ensures the
scan was correct. Smaller products may carry a code with
only eight digits. Each digit is encoded using two bars
and two spaces. Figure 5 shows an example of a UPC
barcode.
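How the check digit protects the eleven data digits can be sketched as follows. The weighting rule used here (digits in odd positions count three times, and the check digit brings the total to a multiple of ten) is the standard UPC-A rule and is not spelled out in the text above; the function names are illustrative.

```python
def upc_check_digit(data_digits: str) -> int:
    """Compute the check digit for the eleven data digits of a UPC-A code."""
    if len(data_digits) != 11 or not (data_digits.isascii() and data_digits.isdigit()):
        raise ValueError("expected exactly eleven decimal digits")
    odd_positions = sum(int(d) for d in data_digits[0::2])   # 1st, 3rd, ... digit
    even_positions = sum(int(d) for d in data_digits[1::2])  # 2nd, 4th, ... digit
    total = 3 * odd_positions + even_positions
    return (10 - total % 10) % 10


def is_valid_upc(code: str) -> bool:
    """Check whether a twelve-digit code carries the correct check digit."""
    if len(code) != 12 or not (code.isascii() and code.isdigit()):
        return False
    return upc_check_digit(code[:11]) == int(code[11])
```

A scanner can apply the same computation to the digits it has read and reject the scan if the last digit does not match.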
Figure 4: an example of ticket purchasing service using mobile
RFID
Transferred to our domain, it should be possible to
purchase tickets for several museums in this way, and
instead of the poster one can think of a cultural artifact
with an RFID tag. The user then gets information about the
object of interest directly on his or her private mobile
phone. There are also several RFID standards, such as ISO
14443 (for contactless systems), ISO 15693 (for vicinity
systems, such as ID badges) and ISO 18000 (which specifies
the air interface for a variety of RFID applications).
Another one is the EPC standard, developed by EPCglobal,
which is used for product identification. [12] The
main advantage of RFID systems is that no line of sight is
required. They are also very robust: the transponders can
be read despite interference such as snow, fog, dirt or
other structurally difficult conditions. Another advantage
is the speed of an RFID system: a speed of less than
100 milliseconds can be achieved. [14] RFID systems also
have disadvantages. The main one is that purchasing such a
system is expensive. Furthermore, introducing the system
is time-consuming. [15] Data protection specialists in
particular are wary, because once RFID chips are
mass-produced, people can be monitored, since it becomes
possible to localize individual persons. [16]
Figure 5: UPC barcode
VISUAL CODES
Visual codes are one- or two-dimensional symbols that are
attached to the objects of interest as visual tags. The
information is stored visually and can be interpreted by
taking a picture of the code.
BARCODES
Figure 6: process of visual code
Barcodes have become omnipresent in the world around us.
The most common barcodes are the Universal Product Codes
(UPC), which are used to tag merchandise in local
grocery stores. Other examples are seen on library
In the first step, shown on the left side of the picture,
the code is converted into a binary code. The next step is
to find the datum line. Afterwards the edges are identified
to compensate for noise. Finally the information is read out.
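The first step can be illustrated with a minimal sketch, under the assumption that “converting into a binary code” means binarizing the captured grayscale picture. The global mean threshold and the function name are illustrative choices and are not taken from the cited visual code system.

```python
def binarize(gray, threshold=None):
    """Turn a grayscale image (rows of 0..255 values) into rows of 0/1.

    Dark pixels (below the threshold) become 1, bright pixels become 0.
    If no threshold is given, the mean gray value is used (an assumption
    of this sketch).
    """
    pixels = [p for row in gray for p in row]
    if not pixels:
        return []
    if threshold is None:
        threshold = sum(pixels) / len(pixels)
    return [[1 if p < threshold else 0 for p in row] for row in gray]
```

The later steps (finding the datum line and reading out the bits) would then operate on this binary image.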
There are two variations of visual codes: the active code,
which is shown on a display, and the passive code, which
is printed. The advantages of this technique are that it
is cheap, applicable everywhere and requires no additional
technology. The disadvantages are that little storage
capacity is available and that environmental conditions
can be problematic. [13]
about these real objects of interest. There are several
ways to interact with the object of interest. The first is
direct entry. With this alternative there is no direct
interaction between the user and the object. Each object
is assigned a unique identifier such as an ID number,
which the user types manually into the device. Another
possibility is acoustic interaction with the object. In
this case an acoustic signal is sent and the object can
interpret the received signal. The next alternative is
camera-based interaction with the object. The camera can
be used in two ways: for continuous video recording or for
photographing the object of interest. There are also two
different technologies. The first is pure image
recognition, for which a PDA with a WLAN card and an
integrated webcam can be used. A common problem is that
the photograph is not identified because of the location
and the lighting conditions. The second is camera-based
interaction with visual codes. In 2005 Elliot Malkin
started the digital “Graffitiprojekt”, in which every
station of the closed Third Avenue tramway had a visual
code. Visitors could take a picture of the visual code and
hear true stories about the location. They could also
record their own experiences. A further alternative is
display interaction with active visual codes. An example
is a soda machine, which is why I consider this alternative
less important in our context. The next alternative,
scanning the environment, is much more interesting in our
context of cultural artifacts. An example is a PDA with an
RFID reader. Several real objects can be captured in one
environment, and it is also possible to localize these
objects. Furthermore, tags can be used that permit
interaction from afar. The next application is pointing to
a distant object of interest. A known prototype, called
ScanMe, has an integrated laser pointer. The objects of
interest can be equipped with a sensor chip that reacts to
the laser pointer. The problem with this method is the
aiming accuracy. The last method is near contact with the
object of interest. Again, technologies such as RFID can
be used. A possible application is the familiar mobile
phone ticketing for bus or tram. In our context I think
this is very difficult, because it requires near contact
with the objects and many cultural artifacts, such as the
Mona Lisa, are very sensitive.
[13]
POSSIBILITIES TO ENHANCE THE INTERACTION
BETWEEN THE USER AND THE OBJECT OF INTEREST
To introduce this chapter I want to give a state-of-the-art
example of an application that enhances the interaction
between the user and the objects of interest. This
application is called the Interactive Museum Guide and was
developed by the Computer Vision Laboratory in Zürich. The
object recognition system was implemented on a Tablet PC
using a simple USB webcam for image acquisition, as can be
seen in Figure 7. With this device the visitor can simply
take a picture of an object of interest from any position,
and a description of the object is then presented on the
screen.
Figure 7: Tablet PC with USB webcam
An early prototype of this museum guide was shown to the
public during the 150th anniversary celebration of the
Federal Institute of Technology (ETH) in Zürich,
Switzerland. A synthetic computer voice installed on the
Tablet PC reads out the information shown on the screen, so
the user can focus on the object and does not have to read
the text. For the demonstration, 20 objects from the
Landesmuseum in Zürich were chosen. The object recognition
system used by this device is based on the SURF algorithm
presented above. The visitor takes a picture of the object,
and the object of interest is thereby identified. [1]
The next section deals with two studies of a museum guide
that was tested for usefulness and usability. The first
study compared a sub-notebook version with conventional
media. The second tested a PDA for the museum visit. The
prototype used for this study was the system called
“Hippie”, which was developed by the institute FIT and
several project partners. In the first study the mobile
device was a “Toshiba Libretto 100”. Handling this device
was very difficult, but at the time (1997 – 2000) it was
the smallest device on the market. The second study
Until now, interaction has taken place between the user
and, for example, the mobile end device. The further
development of technologies opens up new interaction
possibilities that allow interaction with the real
environment. The real environment can be divided into
physical objects such as images or statues, rooms and
locations, and creatures. The goal of the interaction is
to find out more
used a PDA. For the first study, 60 people who had been
invited to a museum in Bonn carried out the tests.
Two-thirds of them were between 20 and 39 years old and
one-third were between 60 and 69. The conventional media
used for comparison were an audio guide and a booklet.
First, the results of the study with the sub-notebook are
presented. The study measured, on a scale from 1 (never)
to 5 (always), how well the information could be found,
and how long the participants needed to complete the
tasks. The sub-notebook turned out to be inferior in this
respect, as Table 1 illustrates.

characteristic                  notebook   audio guide   booklet
Information found at first go   3.55       4.70          4.50
Processing time (min)           90         64            22

Table 1: Effort of the different media

Then it was tested how effectively the information was
communicated. This test was relatively balanced between
the media. Table 2 illustrates the results:

characteristic                  notebook   audio guide   booklet
Achieved points                 22.9       22.6          22.8
Points relative to the time     1.19       1.65          1.58

Table 2: Effectiveness of information transfer in points

Another interesting test concerned the applicability of
the media. The notebook was rated worst. The results are
shown in Table 3.

characteristic                      notebook   audio guide   booklet
Applicability for art exhibition    3.53       2.35          1.93
Preference for use                  12%        43%           30%

Table 3: Applicability of the different media

The handling test was also very negative for the notebook.
The two conventional media hardly differed; the booklet
was the winner of this test. The main result of this first
study was that the average visitor was less satisfied with
the electronic system than with the audio guide and the
booklet. This was explained by the poor handling of the
notebook: its functionality was not easy enough to operate
during a museum visit.

In the further development of the system the hardware was
changed. In place of the sub-notebook, a PDA was used for
the museum guide, and further tests were made with the new
input device. New test participants were also invited,
this time only 7 people between the ages of 20 and 30.
The users' attention was focused predominantly on the
exhibit; only 41% of it was focused on the PDA. Quick
understanding of the device played an important role here.
Furthermore, using the PDA during the visit was easy and
handling the device was not felt to be stressful; only the
manual input was felt to be a disturbance. The comparison
between the four media, namely the PDA, the audio guide,
the booklet and guidance, yielded different results from
those with the sub-notebook. The PDA alternative was the
winner in all categories except ease of use. Table 4
illustrates the results of this test.

characteristic                          PDA   audio guide   booklet   guidance
Support of enjoyment                    2.0   2.4           3.6       3.3
Handling                                2.4   2.8           2.6       1.6
Examination deepen                      2.0   3.0           3.3       2.4
Spark interest in cultural artifacts    2.6   2.8           3.4       3.0
Applicability for cultural exhibit      1.9   2.4           3.4       2.9

Table 4: Evaluation of the PDA alternative

In conclusion, it is evident that input devices with easy
handling and object identification can be a very good way
for museums to make a visit more attractive and
interesting. [17]
CONCLUSION AND OUTLOOK
In this paper, I described the requirements, functionalities, possibilities and technologies of several interactive museum guides. A fast and efficient point detection scheme was also presented. [5] Furthermore, some alternatives were given for improving the interaction between the user and the object when identifying the object of interest. At the end of the paper a study was presented in which some input devices were compared. The study is not the most recent one, but I think it shows the trend towards using interactive museum guides in the future. The most important points for such guides are simple handling, usability, speed and the degree of automation of the system.
Finally, the recognition rate decreases with an increasing number of objects. In the future it would be possible to investigate the combination of object recognition on mobile devices with a grid of local emitters, such as infrared, Bluetooth or RFID.
Another possibility is to investigate the capabilities of the high recognition performance. [2]
Future tour guides, which I think will mostly run on mobile devices, should support both exploratory and browsing modes of information discovery. The systems should answer two main questions, namely “What is that?” and “What is here?” [3]
REFERENCES
1. Bay H., Fasel B., Van Gool L.: Interactive Museum Guide: Fast and Robust Recognition of Museum Objects. (2006)
2. Föckler P., Zeidler T., Brombach B., Bruns E., Bimber O.: PhoneGuide: Museum Guidance Supported by On-Device Object Recognition on Mobile Phones. (2005)
3. Davies N., Cheverst K., Dix A., Hesse A.: Understanding the Role of Image Recognition in Mobile Tour Guides. (2005)
4. Rome J.: Introduction to Image Recognition. (2002)
5. Bay H., Tuytelaars T., Van Gool L.: SURF: Speeded Up Robust Features. In: ECCV. (2006)
6. Sali S.: Image Recognition on Mobile Phones. (2007)
7. Howland J. E.: Computer Graphics; Department of Computer Science; Trinity University. (2005)
8. Ke Y., Sukthankar R.: PCA-SIFT: A More Distinctive Representation for Local Image Descriptors. (2004)
9. Lowe D. G.: Distinctive Image Features from Scale-Invariant Keypoints. (2004)
10. COMP3411/9414: Computer Vision; School of Computer Science and Engineering.
11. Bollinger B. J.: Digital Image Processing – Image Processing and Restoration; Department of Electrical and Computer Engineering; University of Tennessee. (2007)
12. Stevanovic S.: Radio Frequency Identification (RFID). (2007)
13. Aust J.: Mobile Interaktionen und Mobile Medien – Mobile Interaktion mit der realen Umwelt. (2005)
14. Brooks Automation GmbH: http://www.brooksrfid.com/de/rfid-grundlagen/rfid-vorteile.html
15. Ludwig-Maximilians-Universität München: http://www.medien.ifi.lmu.de/lehre/ws0607/mmi1/essays/Sebastian-Loehmann.xhtml
16. http://www.a-v-j.de/seo/wissen-news/vor-undnachteile-von-rfid-chips/
17. Oppermann R.: Ein nomadischer Museumsführer aus Sicht der Benutzer. (2003)
Photorealistic vs. non-photorealistic rendering in AR
applications
Manfred A. Heimgartner, Bakk.
Student
Laudonstraße 35a, 9020 Klagenfurt
m.a.heimgartner@edu.uni-klu.ac.at
ABSTRACT
In this paper two general approaches to rendering AR applications are presented and examined: a photorealistic and a non-photorealistic approach. The photorealistic approach tries to create virtual objects that resemble real-world objects by applying plastic shading, very realistic shadowing and textures, including bump mapping to simulate surface roughness. The non-photorealistic approach, on the other hand, focuses on decreasing the detail level of the real-world image so that virtual objects do not have to be extremely detailed. Usually shadows and lighting are as important and detailed as in a photorealistic approach, but details like textures and bump mapping are reduced. AR images are then often stylized to look cartoon-like or like paintings. Finally, some implications for the project "Burgbau zu Friesach" are given.
Author Keywords
AR, photorealistic, non-photorealistic, rendering, shading, shadowing, lighting, bump mapping, cartoon style, watercolor style, painted style.
INTRODUCTION
The digitalization of objects or buildings is quite a complex task. But once a 3D model is available, it is also very important that the virtual object fits into the real environment, including (partial) buildings and objects.
Generally, this can be done either by creating a very realistic look of the virtual object alone, i.e. by shading/lighting, shadowing, texturing and bump-mapping the model realistically and fitting it to the surroundings, or by reducing the details and the realistic look of the model as well as of the environment [7]. A common method to reduce detail in non-photorealistic rendering (NPR) is to create a hand-drawn or comic-like look, which is called cel-shading or toon shading (see figure 1) [5]. When creating an NPR look of an object, details like lighting, texturing and shadowing are still very important, but they can be done in less detail and/or without special techniques like bump mapping.
The basis of an appropriate virtual object is of course the accurate measurement and construction of a 3D model, for example with 3D scanners, time-of-flight scanning or triangulation [4]. Once the model is available, the work on the details of either a realistic or a non-realistic look can begin.
Figure 1. Comparison of realistic plastic shader and cartoony cel-shading. [13]
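The difference between the two shaders compared in figure 1 can be illustrated with a small Python sketch. It is only a simplified model under my own assumptions: the smooth "plastic" look is approximated by plain diffuse (Lambert) shading, and the cel-shaded look is obtained by quantizing that intensity into a few flat tones. The function names and the number of bands are illustrative and not taken from the cited sources.

```python
import math


def normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def plastic_shade(normal, light_dir):
    """Smooth diffuse (Lambert) intensity in [0, 1].

    Assumption: this stands in for the 'plastic' shader; a real one
    would also add specular highlights.
    """
    n = normalize(normal)
    l = normalize(light_dir)
    return max(0.0, sum(a * b for a, b in zip(n, l)))


def cel_shade(normal, light_dir, bands=3):
    """Quantize the smooth intensity into `bands` flat tones (bands >= 2)."""
    intensity = plastic_shade(normal, light_dir)
    step = min(int(intensity * bands), bands - 1)
    return step / (bands - 1)


# Same surface normal, light coming from an increasingly flat angle.
for light in [(0, 0, 1), (1, 0, 1), (2, 0, 1), (5, 0, 1)]:
    print(light, round(plastic_shade((0, 0, 1), light), 3), cel_shade((0, 0, 1), light))
```

The smooth variant changes continuously with the light angle, while the quantized variant jumps between a few tones, which produces the flat, comic-like regions typical of toon shading.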
Klagenfurt, Austria
PHOTOREALISTIC RENDERING (PR)
To recreate an object realistically, many different aspects have to be considered. The shading gives the object a three-dimensional look [7]. Next, a correct shadow that may overlap with other virtual and non-virtual objects has to be created. These two techniques are of course even more challenging in outdoor scenarios, where the light comes from a different angle depending on the time of day. Another important aspect is choosing the right textures for the model and perhaps even adding so-called bump mapping, which simulates realistic surfaces like walls with roughness and dents.
Shading
When the 3D model of an object is available, one of the first steps towards a realistic look is to add shading to the model (see figure 2).
Plastic shading ultimately creates a quite realistic three-dimensional look by adding lighting information from a specific source direction to the model, thus also creating a kind of self-shadow.
Figure 2. Object in different stages of shading. [1]
Shadowing
Creating a realistic shadow can be quite challenging, especially in outdoor scenarios with large virtual objects that may cast a shadow on other virtual and real objects, and vice versa. Naemura et al. [9] considered four types of Virtual Shadows, as seen in table 1.
                 Real World   Virtual World
Real Object      RR Shadow    RV Shadow
Virtual Object   VR Shadow    VV Shadow
Table 1 – Types of Virtual Shadows [9]
The classification also implies that there can be two different light sources, a real and a virtual one. Haller [7] extended this classification by merging the real and the virtual light source into one single light source, thus also merging the RV and VR Shadow into one type. He also added a new type, which occurs in Augmented Reality applications where one light source causes real and virtual shadows that are cast on real and virtual objects alike.
He developed a real-time shadow volume algorithm that can be used in Augmented and Mixed Reality applications and already gives good results. Shadows are rendered quite realistically, whether they are cast by virtual objects or onto virtual and real objects (see figure 3). The first image shows a relatively simple VR Shadow: the shadow is cast by the virtual object onto a real object at the same angle at which the real shadow is cast by the real object. The other three images show the combination of virtual and real shadows, which have one common light source and overlap quite realistically with each other.
Figure 3. Examples of different Virtual Shadows, created with the real-time shadow volume algorithm. [7]
The shadows are not 100 % accurate. Mixed shadows are still tricky; for example, the virtual shadow is cast completely although the virtual object is partially covered by a real object. But the algorithm is already more than sufficient to create a good illusion of realistic shadowing in AR applications.
To create a realistic shadow in outdoor AR applications, the solar altitude also needs to be known. A common approach for this is Image Based Lighting (IBL) [11].
An omni-directional photograph of the real scene (environmental map, see figure 4) is taken, and each pixel is then associated with a direction; its value is interpreted as the amount of light coming from that direction.
Figure 4 – Environmental mapping formats [11]
There are different ways to acquire such images. Common methods are using a fish-eye lens, using spherical cameras, or taking a number of pictures and then stitching them together.
Bump Mapping
A very effective way to make textures look as if they have structure, bumps and scratches is bump mapping [7]. This technique adds a lot of realism to images without adding much geometry (see figure 5).
Furthermore, to reduce the required hardware power, not every little detail should be built into the virtual model. For example, carved surfaces do not have to be modeled geometrically but can be shown by creating a proper texture [8].
Figure 5. Example of Bump Mapping – notice that the wall model itself has a flat surface; only the bump-mapped texture creates the look of stones with deep joints. [2]
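The idea behind bump mapping, as described in the Bump Mapping section, can be sketched in a few lines of Python. This is a generic illustration under my own assumptions, not the implementation used in the cited works: the surface stays geometrically flat (base normal pointing along +z), a small height map supplies the bumps, and only the normal used for lighting is tilted according to the local slope of the height map. The function names and the `strength` parameter are hypothetical.

```python
import math


def bump_normal(height, x, y, strength=1.0):
    """Perturbed unit normal of a flat surface at texel (x, y).

    `height` is a 2D list of height values. Differences between the
    neighbouring texels (clamped at the border) approximate the slope.
    """
    rows, cols = len(height), len(height[0])
    left = height[y][max(x - 1, 0)]
    right = height[y][min(x + 1, cols - 1)]
    up = height[max(y - 1, 0)][x]
    down = height[min(y + 1, rows - 1)][x]
    dx = (right - left) * strength
    dy = (down - up) * strength
    n = (-dx, -dy, 1.0)
    length = math.sqrt(sum(c * c for c in n))
    return tuple(c / length for c in n)


def diffuse(normal, light_dir):
    """Diffuse intensity for a unit normal and an arbitrary light direction."""
    length = math.sqrt(sum(c * c for c in light_dir))
    l = tuple(c / length for c in light_dir)
    return max(0.0, sum(a * b for a, b in zip(normal, l)))


# Illustrative height map: a flat wall with one vertical groove at x = 2.
height = [[1, 1, 0, 1, 1] for _ in range(3)]
light = (1.0, 0.0, 1.0)
for x in range(5):
    print(x, round(diffuse(bump_normal(height, x, 1), light), 3))
```

Although every texel lies on the same flat plane, the texels next to the groove receive clearly different brightness values depending on which way the slope faces the light, which is what creates the impression of depth.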
Texturing
Another important aspect of a realistic look is the use of adequate textures. Often a texture consists of a rather small graphic "tile" which is then repeated across the surface of the model. With modern graphics hardware, the maximum texture size is between 2048x2048px and 8192x8192px, depending on the video card.
For relatively simple AR objects, current mid-range hardware should be more than sufficient, as simple surfaces like bricks or stone walls should already look very good with smaller tile sizes like 256x256px or 512x512px.
To realistically illuminate and texture large objects in outdoor scenarios, two promising methods have been proposed by Trocolli and Allen [12]. As the two approaches are very formal and mathematical, I refrain from explaining them here and refer to their paper.
NON-PHOTOREALISTIC RENDERING (NPR)
NPR basically shares the same graphics techniques with PR. Objects are shaded and often also correctly shadowed, textured and maybe even bump-mapped. The big distinction, however, is the type of shading. Whereas PR relies on plastic shading with a realistic and detailed three-dimensional appearance, NPR tries to reduce the details and even reduces the degree of detail in the real environment (for an example see figure 6).
Nowadays more and more algorithms that create an NPR look (like filters for simulating pen & ink, watercolor or impressionistic images) can be processed in real time [7].
NPR is often used in AR when abstract information that cannot be represented directly, or a special detail, needs to be described. For example, the presentation of a complex machine in which some special parts should be highlighted benefits from the abstraction of unneeded details and a different visualisation of the important details. On the other hand, this technique is of course less useful in applications where the detail of the image is important, as in medical or security-critical scenarios [5].
Figure 6. NPR filter used in the movie "A Scanner Darkly" [10]
Common styles in NPR for AR applications are cartoon-like, sketch-like [6], and painted [7] or watercolor-like [3].
Cartoon-/sketch-like stylization
Basically, the method by Fischer et al. [6] consists of applying a painterly filter to the camera image and rendering the virtual object non-photorealistically.
The painterly filter performs two separate steps (see figure 7), resulting in two different images. In one step, the degree of detail of the real-world image is dramatically reduced by generating large, uniformly colored regions through basic color segmentation. The other step itself consists of two stages: first, high-contrast edges are detected, and then a post-processing filter is applied to generate thick lines for the stylized look. After these steps the two images are merged, which completes the painterly filter.
Figure 7. Overview of the painterly filter. [6]
To render the virtual objects, two steps are again needed. Silhouette Rendering creates a silhouette model of the object (again with distinct edges), and Non-linear Shading applies a three-dimensional look that lies between no shading, which is too simple, and Plastic Shading, which is too realistic.
Figure 8. Comparison of conventional AR, cartoon-like stylization and sketch-like stylization. [6]
If the sketch-like stylization is needed, the color segmentation step is skipped, resulting in an image with a white background and black silhouettes.
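The structure of the painterly filter (detail reduction, edge detection with thickened lines, and merging of the two images) can be sketched in Python on a tiny grayscale image. This is a strongly simplified illustration under my own assumptions and not the GPU implementation of Fischer et al.: quantizing gray values stands in for the color segmentation, edges are found by comparing neighbouring pixels, and the threshold and function names are arbitrary.

```python
def quantize(image, levels=3):
    """Stand-in for color segmentation: map gray values 0..255 to a few flat tones."""
    step = 256 // levels
    return [[min(v // step, levels - 1) * 255 // (levels - 1) for v in row]
            for row in image]


def detect_edges(image, threshold=60):
    """Mark pixels whose right or lower neighbour differs strongly."""
    h, w = len(image), len(image[0])
    edges = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if x + 1 < w and abs(image[y][x] - image[y][x + 1]) > threshold:
                edges[y][x] = True
            if y + 1 < h and abs(image[y][x] - image[y + 1][x]) > threshold:
                edges[y][x] = True
    return edges


def thicken(edges):
    """Post-processing: grow every edge pixel by one pixel to get thick lines."""
    h, w = len(edges), len(edges[0])
    out = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if edges[y][x]:
                for ny in range(max(y - 1, 0), min(y + 2, h)):
                    for nx in range(max(x - 1, 0), min(x + 2, w)):
                        out[ny][nx] = True
    return out


def painterly_filter(image, sketch=False):
    """Merge flat regions and black lines; sketch mode uses a white background."""
    h, w = len(image), len(image[0])
    background = [[255] * w for _ in range(h)] if sketch else quantize(image)
    lines = thicken(detect_edges(image))
    return [[0 if lines[y][x] else background[y][x] for x in range(w)]
            for y in range(h)]


# Illustrative input: a gray region on the left, a bright region on the right.
image = [[100, 110, 105, 220, 230, 225] for _ in range(4)]
cartoon = painterly_filter(image)
sketch = painterly_filter(image, sketch=True)
print(cartoon[0])
print(sketch[0])
```

In cartoon mode the small variations inside each region disappear and a thick black line separates the two regions; in sketch mode only the line remains on a white background, which corresponds to skipping the segmentation step.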
Painted-like stylization
The basis for the method of Haller [7] is the rendering technique Painterly Rendering for Animation by B. J. Meier, which was modified to support real-time environments by using modern 3D hardware and can be divided into at least three steps. Before the rendering can start, there is a pre-processing step in which the polygon meshes are converted into 3D particles. Next, the rendering starts with at least two passes.
In the first pass, two reference images with color and depth information are rendered by an arbitrary rendering algorithm and stored in textures. In the second pass, which is the third step, an optimal number of brushstrokes is rendered using billboards. The reference images from the first pass are then used to define color and visibility, and the surface normal determines the orientation of each stroke.
Figure 9. Comparison of Van Gogh’s Bedroom original painting and painterly-rendered simulation. [7]
Watercolored-like stylization
Chen et al. [3] propose rendering AR scenes by using Voronoi diagrams (see figure 10) to simulate watercolor effects, adjusting the cells along strong silhouette edges to maintain visual coherence at a relatively low temporal cost.
Figure 10. Simple Voronoi diagram with easy-to-distinguish colored regions. [3]
The method can be divided into three steps. First, the image is tiled with a Voronoi diagram and colored based on the original image. Next, the dark strokes that highlight object silhouettes in watercolor paintings are simulated by detecting and drawing strong edges in the image. Then, to maintain visual coherence, the Voronoi cells are re-tiled along the strong edges in the scene, and thus a new Voronoi diagram is created; otherwise the output video would have an undesired "shower door" look. This also simulates color bleeding between regions, because the cells will overlap two regions that are separated by an edge.
As generating a new pattern for each frame was too hardware-expensive at that time (the test system was a 2.2 GHz CPU with an nVidia G7950 GPU), the re-tiling was only done every 10 frames.
Figure 11. Example of a watercolor-stylized image. [3]
CONCLUSION AND IMPLICATIONS FOR THE PROJECT "BURGBAU ZU FRIESACH"
The possibilities in rendering AR applications are manifold. The main question at first is of course whether a more realistic and detailed or a less realistic and even cartoon-like rendering approach is appropriate (and technically possible with the available hardware). Generally, with today’s hardware the possibilities are vast, and (maybe with some cutbacks in the display of fine details) the augmentation of even very detailed structures should be no problem.
The project "Burgbau zu Friesach" aims at augmenting a castle ruin, in particular showing how castles were built in former times and what the complete buildings would look like. As the project seems to place an emphasis on historical correctness, the photorealistic rendering approach may be more adequate. With an NPR approach, many details would get lost, such as decorative wall structures, tapestry or special patterns in a wall. On the other hand, by using NPR such details could be specifically pointed out, and the seamless integration of augmented structures onto ruins should also be easier. Also, if an AR game is to be included at the ruins, a comic-like look might fit the gaming environment better and make important objects or structures easier to distinguish.
So it is really hard to give a general recommendation; the question is what the primary intention of the stakeholders is.
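As a supplement to the watercolor-like stylization described earlier, the first step of that method (tiling the image with a Voronoi diagram and coloring the cells based on the original image) can be sketched in Python. This is only an illustration under my own assumptions, not the implementation of Chen et al.: the image is a tiny grayscale grid, the seed points are chosen by hand, each cell is filled with the average value of the pixels it covers, and the re-tiling along strong edges is not included.

```python
def voronoi_tile(image, seeds):
    """Tile `image` (2D list of gray values) with the Voronoi cells of `seeds`.

    Every pixel is assigned to its nearest seed (x, y); each cell is then
    filled with the average value of the pixels it covers.
    Returns the label map and the tiled image.
    """
    h, w = len(image), len(image[0])

    def nearest(x, y):
        return min(range(len(seeds)),
                   key=lambda i: (seeds[i][0] - x) ** 2 + (seeds[i][1] - y) ** 2)

    labels = [[nearest(x, y) for x in range(w)] for y in range(h)]
    sums = [0.0] * len(seeds)
    counts = [0] * len(seeds)
    for y in range(h):
        for x in range(w):
            sums[labels[y][x]] += image[y][x]
            counts[labels[y][x]] += 1
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    tiled = [[means[labels[y][x]] for x in range(w)] for y in range(h)]
    return labels, tiled


# Illustrative data: dark pixels on the left, bright pixels on the right.
image = [[10, 20, 200, 220],
         [30, 40, 210, 230]]
seeds = [(0, 0), (3, 1)]  # hand-picked seed positions (x, y)
labels, tiled = voronoi_tile(image, seeds)
print(labels)
print(tiled)
```

If the seed positions were kept fixed from frame to frame, the cell pattern would stay in place while the scene moves behind it, which is presumably the cause of the "shower door" look that the re-tiling along strong edges is meant to avoid.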
REFERENCES
1. About.com, http://www.about.com, 03.12.2008.
2. Arekkusu, http://homepage.mac.com/arekkusu/main.html, 03.12.2008.
3. Chen, J., Turk, G. and MacIntyre, B. Watercolor inspired non-photorealistic rendering for augmented reality. Virtual Reality Software and Technology. Proceedings of the 2008 ACM symposium on Virtual reality software and technology. ACM Press (2008), 231 – 234.
4. Cignoni, P. and Scopigno, R. Sampled 3D models for CH applications: A viable and enabling new medium or just a technological exercise? Journal on Computing and Cultural Heritage (JOCCH), Vol. 1, Issue 1. ACM, New York, NY, USA. Article No. 2 (2008).
5. Fischer, J. and Bartz, D. Real-time Cartoon-like Stylization of AR Video Streams on the GPU. WSI/GRIS – VCM. University of Tübingen (2005).
6. Fischer, J., Bartz, D. and Straßer, W. Stylized Augmented Reality for Improved Immersion. Proceedings of IEEE Virtual Reality. IEEE Computer Society (2005), 195 – 202, 325.
7. Haller, M. Photorealism or/and non-photorealism in augmented reality. Virtual Reality Continuum And Its Applications - Proceedings of the 2004 ACM SIGGRAPH international conference on Virtual Reality continuum and its applications in industry 4-1 (2004), ACM Press (2004), 189 – 196.
8. Magnenat-Thalmann, N., Foni, A. E. and Cadi-Yazli, N. Real-time animation of ancient Roman sites. Computer graphics and interactive techniques in Australasia and South East Asia. Proceedings of the 4th international conference on Computer graphics and interactive techniques in Australasia and Southeast Asia. ACM Press (2006), 19 – 30.
9. Naemura, T., Nitta, T., Mimura, A. and Harashima, H. Virtual Shadows - Enhanced Interaction in Mixed Reality Environment. Proceedings of the IEEE Virtual Reality Conference 2002. IEEE Computer Society (2002).
10. Nature.com, http://www.nature.com/, 03.12.2008.
11. Santos, P., Gierlinger, T., Stork, A. and McIntyre, D. Display and rendering technologies for virtual and mixed reality design review. Fraunhofer Publica (2007).
12. Trocolli, A. and Allen, P. Building Illumination Coherent 3D Models of Large-Scale Outdoor Scenes. International Journal of Computer Vision, Vol. 78, Issue 2-3. Kluwer Academic Publishers (2008), 261 – 280.
13. Wikipedia. http://en.wikipedia.org/, 03.12.2008.
Digitalizing Intangible Cultural Artefacts
Bakk. Helmut Kropfberger
hkropfbe@edu.uni-klu.ac.at
ABSTRACT
This paper aims to give the reader a basic understanding of the state-of-the-art approaches for digitalizing Intangible Cultural Artefacts. It discusses each kind of Intangible Cultural Artefact (as defined by UNESCO) on its own and presents the possibilities for creating digital representations of these artefacts. It then explores the strengths and weaknesses of the different capturing approaches and promotes a multimedia-based combination of several recording techniques, which is discussed in detail in the second part of the paper.
Author Keywords
Intangible Cultural Artefacts, Intangible Cultural Heritage, ICH, digitalizing, multimedia-based approach
INTRODUCTION
Intangible Cultural Artefacts are cultural artefacts that have no physical form, for example dance, fighting styles, craftsmanship, songs, stories, rituals etc. They are an important part of any culture, but they might disappear because they cannot be passed on or recorded for following generations as easily as physical objects.
We now face a world that is becoming more and more globalized, not only from a market view but also from a cultural view. Global mass media promote standardized cultural practices and some skills are becoming obsolete [8], but we still have the chance to safeguard fading cultural artefacts, either by trying to stop their disappearance or by trying to archive them for future generations.
The following text deals with different kinds of intangible cultural artefacts, their characteristics and the possibilities we have (or need) to capture them.
(Please note that I regard the techniques of digitally capturing something by means of written, visual or audio recording as trivial and will not go into depth about them, unless they are in some way special or enriched for the purpose in question.)
INTANGIBLE CULTURAL ARTEFACTS
The following list of different kinds of intangible cultural artefacts follows the definitions of the UNESCO Convention for the Safeguarding of Intangible Cultural Heritage (ICH for short). [8] I omitted the category “Knowledge and practices concerning nature and the universe”, as this information has to be transported either orally, through the teaching of skills or within rituals, all of which are described in the following text.
The description of each kind of ICH is followed by a short proposal for the use of different techniques to capture and/or digitalize the respective artefacts.
Please note that UNESCO mainly focuses on the cultural aspect of ICH and therefore promotes interpersonal transmission and teaching rather than digitalizing and archiving ICH.
Oral Tradition and Expressions
Oral Tradition and Expressions include not only stories, legends, proverbs, rhymes, riddles etc. but also the languages themselves and their pronunciation. Oral artefacts also play a major role in songs, theatre, the performing arts in general and rituals.
While the content of oral artefacts can, most of the time, be stored as written text, which can then be digitally represented, this is not true for the presentation and the context of the oral artefact. For example, the storyline of a legend can be captured in written text, either in the respective (written) language or in a translation into a different language. In contrast, the act of telling the legend, the improvisation, intonation and enactment of the performer or storyteller and the reaction of the audience cannot be captured by the written word alone but must be captured by a multimedia-based approach. Even so, the multimedia recording of a story-telling session lacks the interactivity and originality of an actual live performance.
At this point the archiving of languages themselves has to be taken into account as well. As some languages disappear or change, the need arises to capture, archive and digitalize them. This can be done by producing dictionaries of the words and phrases, plus their meanings. Furthermore, the rules (e.g. grammar) have to be recorded, as well as the pronunciation of the vocabulary, or at least the basic rules of pronunciation. In some cases
gestures also play an important role and have to be captured as well, if possible in visual form. Most of the time languages also encompass some kind of writing system, which has to be archived too, along with its rules and meanings. Especially in this field a multimedia-based approach is needed, because written text (and its digital representation) is hardly enough to give future generations the possibility to study and understand a language. [4]
Performing Arts
Performing Arts include, but are not limited to, music, dance and theatre performance. It is also noteworthy that Performing Arts on the one hand often consist of other kinds of ICH (e.g. the rhymes and lyrics of a song, which also belong to Oral Traditions) and on the other hand play an important role in other ICH (e.g. music or dance as part of a ritual).
Due to the different characteristics of the Performing Arts I will address them one at a time.
Music
Music is an integral part of nearly every culture and therefore has many different roles and functions in different contexts. As it is a mainly audio-based medium, the approach to digitalizing and archiving it is quite straightforward: either by digitalizing an actual recording or by using a written sign language (e.g. notes) to represent it. An even better understanding can be reached by doing both, creating a written representation and a recording.
But as simple as it might seem, there are still problems that need to be addressed when archiving music. Firstly, the performance of a piece of music can vary greatly (e.g. different musicians interpret the same piece differently). Secondly, some cultures use different scale and rhythm systems, which cannot be represented satisfactorily in a standard written music system. Thirdly, the instruments and the mastery of them can be a very important aspect of the music and therefore have to be archived as well (see Craftsmanship). And lastly, as with Oral Tradition and also the following ICH, improvisation might play a major role in the cultural aspect of music, so a recording can only be seen as a sample, which may be more or less representative.
Dance
Dance might be defined as “ordered bodily expression, often with musical accompaniment, sung or instrumental” [8]. Like music, dance comes in many different varieties: some dances are performed by one person alone, others in pairs or groups, and sometimes while wearing masks, costumes or other ornaments. The importance and the form of the music to which the dance is performed might also vary from culture to culture or be influenced by the context in which it is performed.
Figure 1: Selection of basic Labanotation symbols and corresponding gestures [5]
In order to capture a dance in digital media we can choose between different approaches, each with its own benefits and shortcomings. There exist formal sign languages, for example “Labanotation”, which can describe the different body movements. They either describe the position and following movements of each body part at a given time (see Figure 1), or they break dances down into common (body) movements and then describe a dance by these building blocks. [1] These “step-by-step” instructions can even be used to learn the dance or to create an interactive 3D model of a dancer, which reacts to changes of the dance step sequence. [6] But this whole process is very demanding and time-consuming for the creator of the written documentation of a dance. Another possibility is to simply capture the dance on video, thereby creating appearance data. With a motion-capturing approach using multiple cameras this can even yield a 3D model, after the collected data has been refined by mathematical algorithms. [7] A variation of this is motion capturing with passive or active markers fixed on the body or clothing of the dancer. [7] These techniques provide motion data, which can afterwards be animated with different hulls and textures overlaying the resulting “stick figure” (as used in many modern animated movies and computer games; see Figure 2). The problem with both approaches is that they produce either appearance data or motion data, and combining the two can become a problem. This is especially true if the dancer is wearing a costume that is curiously shaped or hides movements from the cameras. [3, 7]
Figure 2: Dancer wearing a suit with passive markers (left), motion data “stick figure” overlaid with an untextured hull (right) [7]
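The idea of describing a dance by reusable building blocks, mentioned above for the formal notation approach, can be illustrated with a small Python sketch. It is purely hypothetical: the block names, body parts and durations are invented for illustration and do not correspond to actual Labanotation symbols or to any of the cited systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Movement:
    body_part: str   # e.g. "left foot"
    direction: str   # e.g. "forward"
    beats: int       # duration in beats


# Hypothetical building blocks; names and contents are illustrative only.
BLOCKS = {
    "basic_step": [Movement("left foot", "forward", 1),
                   Movement("right foot", "close", 1)],
    "turn": [Movement("body", "rotate right", 2)],
}


def expand(sequence):
    """Expand a dance given as block names into a timed list of movements."""
    timeline, beat = [], 0
    for name in sequence:
        for move in BLOCKS[name]:
            timeline.append((beat, move))
            beat += move.beats
    return timeline, beat


timeline, total_beats = expand(["basic_step", "basic_step", "turn"])
for start, move in timeline:
    print(start, move.body_part, move.direction, move.beats)
```

Changing the order of the block names changes the resulting timeline, which is roughly how a model driven by such a description could react to changes in the dance step sequence.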
As for the music, it has to be captured as well, in correspondence with the dance (as already described), or it may be omitted if it can be recreated and archived at another time. [2]
Personally, I think that in order to complete this list of Performing Arts one might also include formal fighting styles (e.g. Karate, Fencing, Wrestling etc.), as they are also part of many different cultures and are not that different from dance. They are a sequence of body movements that may or may not have corresponding body movements by a (sparring) partner and might even follow a defined rhythm or music (e.g. Capoeira). As with dancing, a motion-capturing approach might be the most promising option for digitalization.
Traditional theatre performance
Traditional theatre performance might be considered a combination of the aforementioned ICH, as it includes oral elements and movements, which have to be captured in parallel by a multimedia approach. Because props, scenery and stages play an important role in theatrical performances, we are definitely crossing the line to digitalizing tangible artefacts and thereby leaving the focus of this paper. We have to keep in mind, though, that the combination of the digital representations of intangible and tangible artefacts plays a crucial role in this context, because each set of data might not be complete without the other. [6]
Traditional Craftsmanship
The digital archiving of a tangible object can be achieved quite easily, depending on the requirements on the data. By contrast, archiving the act of creating said object can be a laborious task, especially if the skills needed to produce the good are highly specialized and largely based on the experience and knowledge of the artisan. For example, the Pyramids at Giza are one of the Seven Wonders of the Ancient World and are already largely mapped and studied, but what intrigues archaeologists most is how they were actually built, which tools were used and how they were used. And even when they actually find hints in scriptures as to how it was done, they have a hard time recreating it.
Using tools to create a complex object, as well as playing an instrument, requires learned skills, knowledge of the materials and experience gained through practice. This can hardly be transformed into digital data. The textual description of a task and the visual recording of it lack the aspect of experience; even motion data recorded from an expert cannot convey the mental and haptic skills needed to perform a specialized task. Furthermore, we still lack the techniques to capture motion data with a satisfactory granularity for this task with reasonable effort.
Social Practices, Rituals and Festive Events
“Social practices, rituals and festive events are habitual activities that structure the lives of communities and groups and that are shared by and relevant for large parts of them. They take their meaning from the fact that they reaffirm the identity of practitioners as a group or community. [...] Social practices shape everyday life and are known, if not shared, by all members of a community.” [8]
Rituals are a combination of all or some of the aforementioned ICH. Therefore every technique suitable for capturing one or more of them might be used to capture the corresponding aspect of a ritual. One might take into account that some aspects of a ritual are more important than others (e.g. the written or audio recording of a Catholic mass, rather than the video of the priest reading it) and choose the means of capturing and archiving accordingly. Again the need arises to digitally archive material elements (e.g. tools, scenery etc.).
What strikes me as important and distinct from the other ICH is that rituals, especially religious ones, require a lot of meta-information. A ritual can only be recorded in such a way that an “outsider” can understand it if the context, background and meanings of its different aspects are provided. (E.g.: Why is the person performing this ritual? What do the actions and objects represent? Is the ritual only performed at a special time or place? Etc.) This not only calls for an extensive approach that captures as many aspects of a ritual as possible, but also requires the possibility to interweave and connect the digital representations of cultural artefacts.
MULTIMEDIA-BASED APPROACH
The need for a multimedia approach to capturing Intangible Cultural Artefacts is undeniable. This is especially true if the aim of the capturing is to preserve all (or as many as possible) aspects of a still existing Intangible Cultural Artefact. In the following chapter I will sum up the basic aspects of the different media representations of an ICH. (For a structural overview of the proposed multimedia-based approach see Figure 3.)
Textual Representation
The most basic representation of an ICH is a textual description. This straightforward task lays the basis for any archiving approach, as it is also the basic means for searching for and categorising an artefact in a database or archiving system in general.
But in addition to the basic written description of an Intangible Cultural Artefact I want to promote two further applications of textual representations.
Figure 3: Proposed Structural Overview for the Multimedia-Based Approach
Written Symbolic Representation
In order to convey an intangible artefact and to enable other people
to recreate it, humanity has developed a multitude of special written
symbolic languages. Musical notation and the aforementioned
Labanotation, for example, are complex abstract systems for writing
down music and body movement. Such systems exist, in more or less
developed form, for a wide variety of art forms.
The use of such systems is most important in the archiving of ICH,
because it gives the future audience a chance not only to watch or
listen to a performance (see Audio Representation and Video
Representation below) but also to re-enact and recreate it. To use
such a system efficiently, however, a dictionary or general how-to
has to be provided that teaches its correct use.
Furthermore, the same ICH may have to be symbolized in several
language systems in order to represent the whole artefact. A song,
for example, may need different notations for the melody, the beat
and the lyrics.
Cultural Meta-Information
A cultural artefact can rarely convey its whole meaning by itself.
Often it is the context that makes a cultural artefact significant.
Finding an ancient knife is one thing, but knowing that it was used
in a special ritual to sacrifice animals to the gods of fertility
gives the artefact a totally different and richer meaning. The same
is true for any ICH. A recorded dance, for example, is interesting on
its own, but we also need further information: Who were the dancers?
Is it performed by a single person or by a group? Was everybody
allowed to dance this special dance? When was the dance performed,
and especially why was it performed? Was it part of a ritual? What is
the mythological background of the ritual? Is there a connection to
other dances or rituals?
This representation of an ICH can become very broad and complex, but
it enables us not only to recreate an ICH but also to understand its
original meaning.
Audio Representation
An audio recording can achieve something that plain text hardly can:
it evokes feelings in the listener and breathes life into a
description of an ICH. It is furthermore one of the simplest ways to
convey the sound aspect of an intangible artefact. This aspect
comprises not only the sound of voices or of instruments playing
music, but also, for instance, the sound of the materials or tools
used in craftsmanship (e.g. the sound of a hammer's impact on metal
tells the smith about the metal's quality and structure). Sound can
also capture aspects of the context in which an intangible artefact
is set (e.g. the cheering of the audience, or natural sounds such as
waves, which may play a role in a ritual). Because an audio recording
cannot capture as selectively as a video recording, the problem of
different angles does not apply to the same extent as it does with
video (see below). What does hold true is that an audio recording of
an intangible artefact is always just a snapshot and can stand for
the whole ICH only in a limited way. Therefore the practice of making
multiple recordings should be promoted. These recordings may differ
from each other in some aspects and be similar in others (e.g.
different performers, different settings). On the one hand this may
lead to redundancies, which increase the effort of archiving ICH; on
the other hand it enriches the value of the archive (see below).
Finally, one has to consider that there is a strong bond between
audio and video recordings of an intangible cultural artefact.
Sometimes the one is nearly useless without the other. Therefore
strategies have to be developed to show the connections between audio
and video representations (or even motion capture data).
Video Representation
It should be noted that in this paper, capturing an ICH on video
always refers to “moving pictures”.
As with audio, an actual recording of a performance can convey a lot
of information that the written representation cannot. It is the
human factor that gives an ICH its uniqueness and importance.
Therefore the video recording of a performance is an essential part
of the archiving of an intangible cultural artefact. But it has to be
kept in mind that, like every recording, a video of one performance
is just a “snapshot” or example of a particular intangible cultural
artefact. Therefore a video recording cannot stand by itself but
needs the written symbolic representation in order to be analysable
for its actual content.
With video we have to face a further shortcoming. A single recording
can only show the performance from a specific point of view or angle.
So the person who wants to archive an ICH has to decide which aspect
to focus on. Is it sufficient to film a dancer at an angle at which
the whole body is seen, or would it be better to zoom in on the feet
and legs in order to better preserve the dance steps? Do facial
expressions or hand gestures play an important role in the dance, and
can they be sufficiently captured together with the rest of the body
movements? Should the camera be fixed, or should it change its angle
as the dancer spins around and changes position? And what is the best
focus when filming a person performing craftsmanship?
The key word in this context, and in the problem of “snapshots”, is
redundancy. The more often an intangible cultural artefact is
recorded from a different angle, with a different focus or performed
by a different artist, the better the understanding and information
that can be derived from the sum of the recordings. This may lead to
an overwhelming amount of time and storage space being needed to
archive a single artefact. Therefore a balance has to be found
between the completeness of the recording and the resources consumed.
Motion Data Representation
Sometimes the video representation of an artefact is not enough,
especially when the intangible artefact is to be represented in a
digital 3D environment (e.g. augmented reality, animation). There are
several ways to create a Motion Data Representation. With the right
system, the Written Symbolic Representation can be translated into
motion data, or the motion data can be recorded using motion capture
techniques. Most of the time these techniques use multiple cameras to
record the motion of a moving object. The collected data is then
compared and a 3D model is created using special algorithms. These
multi-angle techniques may support the call for redundancy when
creating a video representation of an intangible cultural artefact,
as the data collected by the cameras may be used for both
representations. One might also argue that creating a Motion Data
Representation is sufficient and may replace the Video
Representation, as an animation created from the motion data can be
viewed instead of a classic video. Against this proposition one may
argue that the granularity and accuracy of a video is much higher
than that of an animation created from motion data, because we still
lack exact motion capturing techniques, especially when multiple
aspects have to be captured. For example, the combination of body
movement, hand gestures or finger movements, and facial expressions
can hardly be captured by the same technique in one go. Especially
when capturing craftsmanship, a highly accurate capturing technique
is needed.
As a topic for further research I would like to suggest the
development of a system that can translate motion data into a written
symbolic language (e.g. motion data into Labanotation). This could
reduce the amount of time needed to create an archive of intangible
cultural artefacts and make it easier for future users to recreate
the captured artefacts.
Connected Artefacts
With this heading I just want to recall that many intangible and
tangible cultural artefacts are connected to each other. These
connections have to be represented in some way as well. After all, a
dance is nothing without music, the music is played on an instrument
by a musician, and the instrument itself was created using special
craftsmanship.
CONCLUSION
This paper, although not very technical in nature, tries to
illustrate that the current techniques for digitizing intangible
artefacts are not fit for this task. Regardless of the technical
specifications of each approach, the underlying paradigms always
introduce shortcomings in one area or another. Therefore the only
acceptable solution is a multimedia-based approach that provides
recordings of an intangible cultural artefact in as many ways as
possible (e.g. written, visual, audio, motion data, meta-data). And
even then we have to face the fact that a digital archive entry of a
performance can never live up to the actual performance or event, as
it lacks so much of the context, interaction and emotion, which can
hardly be conveyed by digital means.
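The multimedia-based approach proposed in this paper could be
sketched as a simple record type that bundles the different
representations of one artefact. The following is only a minimal
illustration: the class names, field names and sample values are
hypothetical and are not taken from Figure 3 or from any existing
archiving system.

```python
from dataclasses import dataclass, field


@dataclass
class Recording:
    """One audio, video or motion-data recording of a performance."""
    medium: str           # "audio", "video" or "motion"
    performer: str
    viewpoint: str = ""   # camera angle or focus; left empty for audio


@dataclass
class ArchiveEntry:
    """Hypothetical archive record bundling all representations of one ICH."""
    name: str
    description: str                                 # textual representation
    notations: list = field(default_factory=list)    # e.g. ["Labanotation"]
    meta: dict = field(default_factory=dict)         # cultural meta-information
    recordings: list = field(default_factory=list)   # redundant "snapshots"
    connected: list = field(default_factory=list)    # names of related artefacts

    def media_covered(self):
        """Return the sorted list of media for which a recording exists."""
        return sorted({r.medium for r in self.recordings})


# Placeholder values, purely for illustration.
dance = ArchiveEntry(
    name="Example dance",
    description="Placeholder description of a traditional dance.",
    notations=["Labanotation"],
    meta={"occasion": "placeholder occasion"},
    connected=["Example drum"],
)
dance.recordings.append(Recording("video", "performer A", "full body"))
dance.recordings.append(Recording("video", "performer B", "feet and legs"))
dance.recordings.append(Recording("audio", "performer A"))
print(dance.media_covered())
```

Keeping several recordings per entry reflects the call for
redundancy, while the list of connected artefacts is one simple way
to express the links between tangible and intangible artefacts.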
REFERENCES
1. Beck, J.; Reiser, J. (1998): “Moving Notation: A Handbook of
Musical Rhythm and Elementary Labanotation for the Dancer”, Taylor &
Francis, 1998
2. Bevilacqua, F. et al. (2002): “3D motion capture data: motion
analysis and mapping to music”, Symposium on Sensing and Input for
Media-centric Systems, 2002
3. Cheng, X.; Davis, J. (2000): “Camera Placement Considering
Occlusion for Robust Motion Capture”, Stanford Computer Science
Technical Report, 2000
4. Gippert, J. et al. (2006): “Essentials of Language Documentation”,
Walter de Gruyter, 2006
5. Griesbeck, C. (1996): “Introduction to Labanotation”, University
of Frankfurt, 1996.
http://user.uni-frankfurt.de/~griesbec/LABANE.HTML, 04.01.2009
6. Hachimura, K. (2006): “Digital Archiving of Dancing”. IPSJ SIG
Technical Reports, Z0031B, 2007
7. Kamon, Y. et al. (2005): “Coordination of Appearance and Motion
Data for Virtual View Generation of Traditional Dances”. IEEE
Computer Society, 2005
8. UNESCO (2003): “Intangible Heritage domains in the 2003
Convention”, http://www.unesco.org/culture/ich/index.php?pg=00052,
02.12.2008
Further Readings
1. Czermark, K. et al. (2003): “Preserving intangible cultural
heritage in Indonesia”, UNESCO Jakarta, 2003
2. Hachimura, K. (2001): “Generation of Labanotation Dance Score from
Motion-captured Data”, Joho Shori Gakkai Kenkyu Hokoku, VOL.2001;
NO.66 (CVIM-128); PAGE.103-110, 2001
3. Skrydstrup, M. (2006): “Towards Intellectual Property Guidelines
and Best Practices for Recording and Digitizing Intangible Cultural
Heritage”, WIPO, 2006
4. Smeets, R. (2004): Transcript of “Intangible Cultural Heritage and
Its Link to Tangible Cultural and Natural Heritage”, Okinawa
International Forum, 2004
5. Stovel, H. (2004): Transcript of “The World Heritage Convention
and the Convention for Intangible Cultural Heritage: Implications for
Protection of Living Heritage at the Local Level”, Okinawa
International Forum, 2004
6. Uesedo, T. (2004): Transcript of “Efforts to Pass Down the
Traditional Performing Arts of Taketomi Island”, Okinawa
International Forum, 2004
7. Yin, T. (2006): “Museum and the Safeguarding of Intangible
Cultural Heritage”, The Ethic Arts, Issue 6, 2006
Technologies for the Digital Processing of Historic
Buildings and Monuments
Stefan Melanscheg
MNr.: 0360279
Bahnhofsstraße 38/c, A-9020 Klagenfurt
smelansc@edu.uni-klu.ac.at
A-(0)-650/2054004
ABSTRACT
This paper presents existing technologies for digitizing historic
buildings and monuments. Basic concepts are introduced first,
followed by a hybrid combination. Techniques for digitizing historic
objects under laboratory conditions are not covered in detail.
Finally, all technologies presented are compared and assessed for
their usefulness in digitizing a castle ruin.
KEYWORDS
Manual measurement, tacheometry, photogrammetry, laser scanning,
hybrid laser scanning, comparison of digitization technologies.
INTRODUCTION
Several technologies are already available today for digitizing a
building or a monument. With a single technique, however, it is
usually not possible to capture all desired details with adequate
accuracy. Laser scanning, for example, is excellently suited to
recording a building's outer shell as a point cloud. If the
inscriptions on the building's archway also have to be captured,
however, laser scanning alone will not fully achieve this. In other
words, the techniques used to digitize a building or monument always
depend on the required level of accuracy.
The intended further use of the result also plays an essential role
in choosing the technique. If the digital model is to serve as a
basis for more in-depth research, and if individual segments are to
be made comparable on a meta level, a higher level of detail is
certainly necessary than for a representation in “Google Earth”,
where photogrammetry could be used without hesitation. If the aim is
to capture the simplest possible basis for processing in a CAD
program, the time-tested concept of tacheometry may also be
sufficient.
personal or classroom use is granted without fee provided that copies
are not made or distributed for profit or commercial advantage and that
copies bear this notice and the full citation on the first page. To copy
otherwise, or republish, to post on servers or to redistribute to lists,
requires prior specific permission and/or a fee.
Seminar aus Interaktive Systeme WS08/09, January, 2009,
Klagenfurt, Austria
In summary, the question of the intended further use must always be
settled first. The answer usually also yields part of the answer
regarding the necessary accuracy and thus the digitization technology
to be used.
This paper presents the 3D scanning technologies available today.
Conversely, the accuracies and sample applications discussed allow
conclusions about how each technology is appropriately used.
TERMINOLOGY
In our field of application, four basic measuring methods can be used
for measuring three-dimensional objects, depending on the
requirements regarding accuracy and intended use: [4]
• Manual measurement
• Tacheometry
• Photogrammetry
• Laser scanning
An essential difference lies in the selection of the points to be
measured. In manual measurement and tacheometry they are chosen
deliberately; in photogrammetry and laser scanning they are captured
indiscriminately and selected only during post-processing.
Photogrammetry uses several images of the object taken from different
viewing angles and combines them, by means of chosen reference
points, into a spatial object. Laser scanning, by contrast, bombards
the object, so to speak, with a large number of measuring points in a
defined grid. Tacheometry is fundamentally related to laser scanning,
but the measuring points on the object are chosen by the surveyor.
Manual measurement is self-explanatory. A combined application of
laser scanning and photogrammetry is called hybrid laser scanning.
The individual methods are described in more detail on the following
pages. Each method is accompanied by at least one application
example, which is presented with figures. The following diagram
illustrates once more how the individual technologies relate to each
other.
Figure 1. Relationship between the digitization technologies
MANUAL MEASUREMENT
Manual measurement means measuring directly on the object with
analogue or digital measuring instruments of any kind. In a further
step, the results can be reproduced digitally using CAD software. For
digitizing entire buildings, however, this method is not practical.
Manual measurement is still very much in use at archaeological
excavations, in the form of sketches and the localization of finds,
which are transferred to the drawing via a grid. The
three-dimensionality of a digitized object is lost here, however,
although practised archaeologists manage to sketch an isometric view
on paper.
TACHEOMETRY
Tacheometry works with two essential quantities (direction and
distance), by which points on the object are captured and stored
relative to the position of the measuring instrument.
The instrument is usually operated manually, and the measuring points
are selected according to a well-considered scheme at the desired
granularity. Figure 2 illustrates this principle using a control
measurement on a bridge arch as an example. Here the arch is to be
checked for deformation. There are fixed measuring points on the
object, and the measuring instrument keeps its position. The
measurement itself is carried out at predefined intervals, so that
changes in the position of the measuring points over time can be
determined easily. [8]
Figure 2. Tacheometry as a monitoring instrument [8]
The tacheometric method is well known and proven in the construction
industry. A simplified variant is, for example, the levelling
instrument. But to what extent can tacheometry be useful for
digitization in our field?
If tacheometry is considered as a tool for sampling the entire outer
shell of a building in order to obtain a three-dimensional model for
further research, a correspondingly large number of measuring points
would have to be captured manually. This would be extremely
time-consuming and therefore uneconomical. If, however, the object is
needed only as a three-dimensional polygon model with a low level of
detail, tacheometry is sufficient. The digital model would bear no
resemblance to the original, though, because the polygons carry no
textures corresponding to the original. [8]
Compared with the other methods, tacheometry has a great advantage in
measuring accuracy. The concept is therefore perfectly suited to
capturing reference points on the building exactly. These reference
points can subsequently serve as control points for photogrammetric
images or for the point clouds of 3D laser scans. Tacheometry
therefore still plays an important and central part in the success of
digitizing a historic building. [1]
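The principle described in this section, capturing a point from a
direction and a distance relative to the instrument and comparing
fixed points between measurement epochs, can be sketched as follows.
This is a minimal illustration and not the procedure of any
particular instrument: the function names and the angle conventions
(horizontal angle clockwise from the local Y axis, zenith angle from
the vertical) are assumptions.

```python
import math


def polar_to_cartesian(station, horizontal_deg, zenith_deg, slope_distance):
    """Convert one direction-and-distance observation into X/Y/Z.

    station is the (x, y, z) position of the instrument. Assumed
    conventions: horizontal angle clockwise from the local Y axis,
    zenith angle measured from the vertical, both in degrees.
    """
    hz = math.radians(horizontal_deg)
    zen = math.radians(zenith_deg)
    horizontal_distance = slope_distance * math.sin(zen)
    x = station[0] + horizontal_distance * math.sin(hz)
    y = station[1] + horizontal_distance * math.cos(hz)
    z = station[2] + slope_distance * math.cos(zen)
    return (x, y, z)


def displacement(point_epoch_a, point_epoch_b):
    """Distance a fixed control point has moved between two epochs."""
    return math.dist(point_epoch_a, point_epoch_b)


# Illustrative values only: the same target observed in two epochs.
station = (100.0, 200.0, 10.0)
first = polar_to_cartesian(station, 90.0, 90.0, 10.0)
second = polar_to_cartesian(station, 90.0, 90.0, 10.004)
print(first, displacement(first, second))
```

Because the instrument keeps its position, a change in the computed
coordinates of a fixed point can be read directly as movement of the
object.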
PHOTOGRAMMETRY
In photogrammetry, points, lines and areas are measured directly from
two-dimensional photographs. If the position and orientation as well
as the imaging characteristics of a digital camera are known, the
principles of projective geometry can be exploited.
By intersecting the rays from two images of an object taken from
different directions, the object can be reconstructed digitally.
These reconstructions can in turn be transferred into a superordinate
coordinate system and merged, which already hints at the wide range
of applications of photogrammetry. For this, however, accurate
control points have to be surveyed on the real object, and
tacheometry can be, and is, used for that purpose. If a historic
building is captured in this way, as in our case, the method is
called multi-image photogrammetry.
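The ray intersection mentioned above can be illustrated with a small
sketch. It assumes that each image has already been reduced to a
camera centre and a viewing direction towards the same object point
(in practice these follow from the known position, orientation and
imaging characteristics of the camera). Since measured rays rarely
meet exactly, the sketch returns the midpoint of the shortest segment
between them. The function name and the sample values are
illustrative assumptions.

```python
def intersect_rays(c1, d1, c2, d2):
    """Return the midpoint of the shortest segment between two 3D rays.

    c1, c2 are camera centres; d1, d2 are direction vectors pointing
    from each camera towards the same object point.
    """
    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    w = sub(c1, c2)
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; no unique intersection")
    s = (b * e - c * d) / denom   # parameter along the first ray
    t = (a * e - b * d) / denom   # parameter along the second ray
    p1 = tuple(ci + s * di for ci, di in zip(c1, d1))
    p2 = tuple(ci + t * di for ci, di in zip(c2, d2))
    return tuple((u + v) / 2 for u, v in zip(p1, p2))


# Two camera positions looking at the same point (illustrative values).
point = intersect_rays((0.0, 0.0, 0.0), (1.0, 2.0, 5.0),
                       (4.0, 0.0, 0.0), (-3.0, 2.0, 5.0))
print(point)
```

The closer the two viewing directions are to parallel, the less
stable the result becomes, which is one reason why the images should
be taken from clearly different directions.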
Using the principles of projective geometry, true-to-scale planes of
the object can also be derived from a single image with four known
points. This is done by rectifying the image. By combining these
planes, true-to-scale plans can be produced. In this case the method
is called single-image photogrammetry. [4]
Photogrammetry has a large field of application in the collection of
geographic information. Applied to the acquisition of information for
maps, one distinguishes land information systems, topographic
information systems and geographic information systems, with
photogrammetry being used mainly in the area of topographic
information systems (TIS). [6]
Photogrammetry is also used in the digitization of historic monuments
and buildings, and there are some interesting projects to mention. A
first example is the digital archiving of the pyramids of Túcume,
Peru. The archaeologists set themselves the requirement of
representing the state of the monument as cultural heritage as
authentically as possible in digital form. Aerial photographs from
1949 served as the basis. [9]
Figure 3. 3D view of the Túcume adobe complex in Peru based on aerial
photographs from 1949 [9]
Another very interesting application example for photogrammetry is
the digitization of a pre-Inca settlement at Pinchango Alto, likewise
in Peru. Aerial photographs were used here as well, though not from a
historical source but taken at the time of the project using a model
helicopter and an off-the-shelf SLR camera with a wide-angle lens.
The research group tried to pursue an inexpensive yet successful
approach.
Figure 4. Model helicopter from the company Helicam [3]
Figure 5. Flight control software weGCS [3]
The helicopter was equipped with a GPS module and followed a
previously fixed flight plan autonomously. At the defined points,
images were then taken in the transverse and longitudinal directions.
In total, 85 photographs were taken in this way.
An Austrian company had previously surveyed this area by terrestrial
laser scanning, and a comparison of the results of the two methods
showed only small deviations. The photogrammetric approach is
therefore highly recommendable for a cost-effective and fast survey
of a historic building of this size, provided that a resolution of
3 cm, as here in Peru, is sufficient. [3]
In summary, photogrammetry in combination with tacheometry offers
great potential for digitization. In addition to determining the
control points by tacheometry, manual measurement could also be put
to good use if rectified photographs form the basis for it. [4]
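The rectification used in single-image photogrammetry, deriving a
true-to-scale plane from one image with four known points, can be
sketched as the estimation of a projective transformation
(homography). The following pure-Python sketch is an illustration
under assumptions, not the software used in the cited work: the
function names are hypothetical, and the four control points are
assumed to lie in one plane with no three of them on a line.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("control points are degenerate")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def homography_from_points(image_pts, plane_pts):
    """Estimate the 3x3 projective transform from four point pairs."""
    rows, rhs = [], []
    for (u, v), (x, y) in zip(image_pts, plane_pts):
        rows.append([u, v, 1, 0, 0, 0, -u * x, -v * x])
        rhs.append(x)
        rows.append([0, 0, 0, u, v, 1, -u * y, -v * y])
        rhs.append(y)
    h = solve_linear(rows, rhs)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def rectify(h, u, v):
    """Map an image coordinate onto the true-to-scale object plane."""
    w = h[2][0] * u + h[2][1] * v + h[2][2]
    return ((h[0][0] * u + h[0][1] * v + h[0][2]) / w,
            (h[1][0] * u + h[1][1] * v + h[1][2]) / w)


# Illustrative values: four image points of a facade whose corners
# are known to form a 10 m x 6 m rectangle.
image_pts = [(0, 0), (4, 1), (5, 5), (-1, 4)]
plane_pts = [(0, 0), (10, 0), (10, 6), (0, 6)]
H = homography_from_points(image_pts, plane_pts)
print(rectify(H, 5, 5))
```

Once the transformation is known, every pixel of the photograph can
be mapped onto the plane, which is what makes manual measurement on
the rectified image possible.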
Figure 6. Single-image photogrammetry. Left: original photograph;
right: rectification onto the plane of the house front [4]
LASER SCANNING
Terrestrial laser scanners work in a similar way to tacheometers:
they too measure the distance and direction to each measuring point.
The difference, as already described in the introductory part, lies
in the selection of the measuring points, which in laser scanning is
made indiscriminately and usually covers a certain area. The result
is a large collection of measuring points, referred to as a point
cloud. In tacheometry, too, there are instruments that capture
several measuring points automatically by means of servomotors. Laser
scanners, however, deflect the beam via a mirror assembly inside the
scanner. One million measuring points is regarded as a guide value
for a 360° scan. As with the methods discussed above, the resulting
point clouds can be combined with control points.
Laser scanning has the advantage that a large amount of information
is captured within a short time. A disadvantage, however, is the
rather high post-processing effort, in which either sets of rules are
derived semi-automatically from the point clouds, or regular surfaces
are inferred by triangular meshing of the points. Irregular shapes
can be captured very well in this way. In addition to the high
post-processing effort, the high purchase price and the unwieldiness
of the equipment are also disadvantages. [4]
“The Digital Michelangelo Project” by a research team from Stanford
University, the University of Washington and the company Cyberware
Inc. can be cited as a practical use case for the digitization of
monuments: ten statues by Michelangelo were measured and captured
with a laser scanner designed specifically for this purpose. Both
shape and colour were recorded. Colour was captured with a camera,
however, with the correct colour finally being obtained from the
difference between an illuminated image and a normal image.
Figure 7. The Digital Michelangelo Project [7]
The project started in 1997 and, including all post-processing and
preparation work, lasted until the summer of 2004. Besides the long
and intensive project work, the logistics of handling the necessary
equipment was a challenge as well, quite apart from gaining access to
the artworks themselves. [7]
With regard to the use of laser scanning technologies for buildings,
a research report from FH Mainz should be mentioned here. Using the
“Porta Nigra” monument in Trier as an example, various scanning
methods were applied, including laser scanning. A Cyrax 2500 from
Leica Geosystems was used, which captures 1000 points per second with
an accuracy of 6 mm at 50 m. Deviations due to colour changes were
limited, but the influence of reflective surfaces turned out to be a
problem. Since the distance is measured via the reflected laser beam,
reflected beams with erroneously low energy can lead to wrong
conclusions about the distance. This problem occurred especially in
dark, shaded areas of the building.
Figure 8. Cleaned point cloud of the Porta Nigra [1]
Nevertheless, the entire outer shell of the building was surveyed. At
a total of 51 scan positions, 132 million points were captured, which
corresponds to a data volume of 5.1 GB. These figures already give an
idea of how resource-intensive a laser survey can become, not to
mention the processing and reconstruction of this amount of data.
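To make these figures more tangible, the short calculation below
derives a few rough quantities from the numbers quoted in the text.
It is only a back-of-the-envelope sketch: it assumes decimal
gigabytes and that all points were captured at the scanner's stated
rate of 1000 points per second, and it ignores set-up time between
the scan positions.

```python
# Figures quoted in the text for the Porta Nigra survey.
points = 132_000_000          # points captured in total
scan_positions = 51
data_volume_bytes = 5.1e9     # 5.1 GB, assuming decimal gigabytes
scan_rate = 1000              # stated points per second of the scanner

bytes_per_point = data_volume_bytes / points
points_per_position = points / scan_positions
pure_scan_hours = points / scan_rate / 3600

print(f"{bytes_per_point:.1f} bytes per point")
print(f"{points_per_position / 1e6:.2f} million points per position")
print(f"{pure_scan_hours:.1f} hours of pure scanning time")
```

Under these assumptions the survey amounts to roughly 39 bytes per
point, about 2.6 million points per scan position and almost 37 hours
of pure scanning time.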
A further problem remained hidden at first but made itself felt at
the latest when the point clouds were combined. The measurements on
the east side of the monument were considerably less accurate than
those on the west side. It turned out that public through traffic had
impaired the measurement through vibrations.
In the next step the measured data were processed with a whole series
of software applications until finally an image of the object's outer
shell remained, with the relevant points as a cloud. The removal of
faulty measuring points also reduced the accuracy, however. When a
laser beam hits an edge of the building exactly, the reflection is
ambiguous. Despite all the automated algorithms, the scientists were
not spared laborious manual post-processing.
And even then the finished 3D model was not fully sufficient as a
research basis for archaeologists and historians. The problem lay, as
so often, in the detail: ornaments, archways and other important
details could not be extracted from the resulting model accurately
enough. [1]
HYBRID LASER SCANNERS
Hybrid laser scanning aims to combine the advantages of
photogrammetry and laser scanning. To this end, a digital photograph
is additionally taken with a camera from the position of the laser
scan. In this way the point cloud gives the photograph depth
information and, conversely, the photograph gives the point cloud a
texture for the polygon model.
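How a photograph taken from the scanner position can lend colour to
the point cloud may be sketched with a simple pinhole camera model.
This is an illustrative assumption rather than the method of any
specific device: the camera is taken to sit at the scanner origin
with the same orientation (Z pointing forward), lens distortion is
ignored, and the function names are hypothetical.

```python
import math


def project_point(point, focal_px, cx, cy):
    """Project a 3D point (camera coordinates, Z forward) to a pixel."""
    x, y, z = point
    if z <= 0:
        return None               # behind the camera, not visible
    return (focal_px * x / z + cx, focal_px * y / z + cy)


def colorize(points, image, focal_px):
    """Pair each visible scan point with the colour of its pixel.

    image is a list of rows, each row a list of colour values; the
    principal point is assumed to be the image centre.
    """
    height, width = len(image), len(image[0])
    cx, cy = width / 2, height / 2
    coloured = []
    for p in points:
        uv = project_point(p, focal_px, cx, cy)
        if uv is None:
            continue
        col, row = math.floor(uv[0]), math.floor(uv[1])
        if 0 <= col < width and 0 <= row < height:
            coloured.append((p, image[row][col]))
    return coloured


# Tiny synthetic example: a 4 x 4 "image" whose colour values encode
# the pixel position, and four scan points.
image = [[(r, c) for c in range(4)] for r in range(4)]
points = [(0, 0, 1), (1, 0, 2), (0, 0, -1), (10, 0, 1)]
print(colorize(points, image, 2.0))
```

The reverse direction works on the same correspondence: each pixel
that receives a scan point also receives that point's distance as
depth information.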
There are various designs of laser scanner that already have a
digital camera integrated. On the other hand, there are also software
applications that can map photographs onto a point cloud afterwards.
Nevertheless, hybrid laser scanning is no panacea. The high
post-processing effort inherited from laser scanning, which can
usually only be partly automated, remains a disadvantage. Manual
post-processing is therefore indispensable. This poses a major
obstacle for non-experts in particular. [2]
Figure 9. 3D façade model produced with hybrid laser scanning [2]
The intensity of a reflected laser beam also allows conclusions about
the colour of the illuminated point. In this way a laser scan can
also produce so-called intensity images of the object. In combination
with the photogrammetric principle, these images could be interpreted
more easily.
Figure 10. (left to right) Intensity image in grey, coloured
intensity image, and intensity image overlaid with a photograph [5]
What advantages would result for the digitization of buildings? [5]
• The scanner's intensity images can be matched with the
high-resolution images of the digital camera.
• The point cloud can be interpreted more easily by means of the
colour information of the image.
• The image information can be transferred as texture onto the meshed
object surface.
• Acquiring the image information has become easier in recent years,
because the achievable resolution has grown steadily while purchase
prices have fallen.
The combination of camera and scanner is very tempting at first
glance, but there are still problems when it is used without
daylight. Some devices already have an integrated light source,
which, however, entails a loss of accuracy at greater distances. A
recording in daylight takes between 3 and 10 minutes. [5]
An interesting aspect for the digitization of buildings lies in the
panorama function of some devices. This function could be used to
digitize interiors and to tie them into the surveyed object by means
of reference points. The recording time for an interior using a
panoramic measurement is stated as 60 minutes (as of 2004).
LIGHT PROJECTION
Scanning by light projection follows an entirely different approach.
The object is illuminated and photographed several times. These
images are then post-processed with photogrammetry software (e.g.
ImetricS), which bundles the images and matches them automatically.
The result is a three-dimensional representation of the object. For
entire buildings, however, this method is not advantageous and is
therefore not discussed further. [1]
Figure 11. High-resolution 3D surface model obtained by light
projection [1]
SUMMARY
This paper has presented the various methods for the
three-dimensional capture of buildings and monuments. It follows that
it is not really possible to distinguish between a wrong and a right
technology. The choice always depends on the object to be captured,
the intended further use of the digital model, and the desired
accuracy of the model. Tacheometry plays a central role, because it
is essential for the exact determination of control points with every
method. Ultimately, one can state preferences as to which method
appears most advantageous for which field of application.
For the digitization of buildings, especially in the scientific
field, hybrid laser scanning is likely to meet the requirements best.
Although the point cloud requires considerable post-processing, the
combination of photogrammetry and laser scanning delivers the most
appealing result. If buildings are to be made available to the
general public, e.g. in an online database accessible via the
browser, photogrammetry in combination with tacheometry is the most
suitable, most cost-effective and above all fastest method for
actually filling such a database within a foreseeable time. Given the
rapid development of digital photography in recent years,
photogrammetry in my opinion also has the greatest potential for the
future.
The digitization of monuments has to be viewed differently. Unless
they can be captured by light projection under laboratory conditions,
they have to be measured point by point with a laser. The research
team of the Michelangelo project has already demonstrated this. To
achieve a result that truly meets all requirements, however, the
necessary technology still has to become handier, more affordable and
faster in post-processing.
1. Boochs, F., Hoffmann, A., Huxhagen, U., & Welter, D. (2006). Digital reconstruction of archaeological objects using hybrid sensing systems - the example Porta Nigra at Trier. Mainz.
2. Boochs, M., Heinz, G., Huxhagen, U., & Müller, H. (2006). Digital Documentation of Cultural Heritage Objects using hybrid recording techniques. Mainz: University of Applied Sciences Mainz, i3mainz, Institute for Spatial Information and Surveying Technology, Roman Germanic Central Museum Mainz.
3. Eisenbeiss, H., Sauerbier, M., Zhang, L., & Grün, A. (2005). Mit dem Modellhelikopter über Pinchango Alto. Zürich: Institut für Geodäsie und Photogrammetrie.
4. Juretzko, M. (2004). Reflektorlose Video-Tachymetrie - ein integrales Verfahren zur Erfassung geometrischer und visueller Informationen. Bochum: Fakultät für Bauingenieurwesen der Ruhr-Universität Bochum.
5. Kersten, T., Przybilla, H., & Lindstaedt, M. (2006). Integration, Fusion und Kombination von terrestrischen Laserscannerdaten und digitalen Bildern. Hamburg, Bochum.
6. Kraus, K. (2000). Topographische Informationssysteme. München.
7. Levoy, M., Rusinkiewicz, S., Ginzton, M., Pulli, K., Koller, D., Anderson, A., et al. (2000). The Digital Michelangelo Project: 3D Scanning of large Statues. Stanford/Washington: University of Stanford, University of Washington, Cyberware Inc.
8. Mönicke, H., & Link, C. (2003). Verformungsmessungen an Brücken mittels reflektorloser Tachymetrie. Stuttgart: Hochschule für Technik in Stuttgart.
9. Sauerbier, M., Kunz, M., Fluehler, M., & Remondino, F. (2004). Photogrammetric Reconstruction of Adobe Architecture at Túcume, Peru. Bern: Swiss Federal Institute of Technology, Institute of Geodesy and Photogrammetry.
Digital Scanning Techniques and Their Utility for Mobile
Augmented Reality Applications
Christian Blackert
Student of the University of Klagenfurt
Universitätsstrasse 65-67, A-9020 Klagenfurt, Austria
cblacker@edu.uni-klu.ac.at
ABSTRACT
The creation of augmented reality applications requires, on the one hand, methods to realize the presentation and, on the other hand, methods and techniques to provide the objects which are to be displayed. This paper describes available object scanning methods in the context of digitizing cultural artifacts and tries to answer the question of which approach can fulfill the requirements of augmented reality applications. The focus is on augmented reality applications which allow the user to move and to change the point of view on the augmented reality object. After analyzing the current scanning technologies, the paper presents some ideas on how current scanning approaches could be modified or combined to fulfill the augmented reality needs, followed by the question of how motion capturing techniques could provide an additional cognitive benefit if included in augmented reality applications.
ACM Classification: H5.1 [Multimedia Information Systems]: Artificial, augmented, and virtual realities.
Keywords
Augmented Reality (AR), Laser Scanning, Light Reflection Fields, Motion Capturing.
INTRODUCTION
When talking about mixed reality, whether augmented or virtual reality, the basic idea is the combination of real-world and virtual components.
Creating augmented reality means adding virtual content to real-world scenery. This should happen in a way that gives the user a better understanding and impression of the provided scenery. In general, the provided scenery is enhanced by virtual content to provide another perspective on something which is currently not available.
In augmented reality applications the real-world content is not replaced by a virtual copy. That kind of approach is called virtual reality, where the user enters a completely virtual environment, which could of course be a copy of real-world scenery or a fictional environment.
Another use of augmented reality would be to share information about cultural artifacts, or any other relevant object, without the need to come into contact with the original object itself. Digitizing the object and using this data in an augmented reality application could be a very efficient way to distribute this information to everyone who is not able to visit or work with the original object. The scope of this paper is to analyze which digitizing techniques and methods are available and how these technologies perform. It shall answer the question of whether those scanning techniques can provide a cultural object in sufficient quality to be used as added digital content within mobile augmented reality solutions.
The user should be mobile and able to view the object from different directions and/or under different lighting conditions. These requirements are used to check the available 3D scanning methods and to compare whether they can provide such digitized cultural objects. In addition, some ideas are given on how the current solutions could be modified, or which additional techniques are required, to generate an object which fulfills the presentation requirements of mobile augmented reality solutions.
Another aspect which increases the complexity of this challenging approach is the question of whether it is possible to create augmented reality applications with objects which can perform some kind of motion. The last topic to be discussed is therefore motion capturing. An augmented reality solution in which detailed objects can be viewed is a nice feature, but additionally showing how a cultural object works or was used in the past would provide further benefit.
AUGMENTED REALITY AND MOVING USERS
When talking about augmented reality, the user is supposed to see objects which are not real but which should appear as real, touchable objects.
Figure 1. An example of faked augmented reality (the object in the hand is added digital content) [1]
The airplane (Figure 1) has been added to a digitally stored picture of a real environment, so in this case it is an ordinary photo composition. Nevertheless, it is also a combination of virtual and real-world content merged into a single scene. The difference between real augmented reality applications and the example above is that augmented reality is a real-time presentation. This means that the real-world content is not stored, modified afterwards by additional virtual content and played back later as a video stream. Instead, a real-time application merges the virtual content into the real-world scenery. The additional content needs to be presented and merged with the real-world information and must function as a single set of information.
The ability of the user to move and act within the presented augmented reality is another aspect which increases the complexity of such applications.
The example in Figure 2 shows such a mobile augmented reality application. The user is wearing a head-up display, which provides the virtual content accordingly. Accordingly means that the system must be able to recognize in which direction the user is looking and which virtual content is available, in order to combine both sources. A tracking system must be available to determine the user’s position and which virtual content needs to be transmitted.
The following list shows the main points which need to be considered for mobile augmented reality applications:
• Wearable Computing System
• Wearable Display (head-up, hand-held)
• Tracking System (user position, line of sight)
• Registration System (What content needs to be displayed where?)
• Wireless Network System
• Data Storage and Access Technology
Many components need to work together to realize mobile augmented reality. The computing system needs to provide the right virtual content on demand and transmit it to the display device. The tracking system needs to determine the position of the roaming user within the augmented reality and what he or she is looking at or acting on. The registration system is the part where the augmented reality is filled with virtual content. This means that the place the user is looking at must be recognized by the system, or marked with special markers, to inform the system that the user is looking at a place which has additional virtual content. The required data then needs to be transmitted via wireless communication directly to the user. [8]
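As a minimal sketch of what the registration step has to decide, the following Python snippet selects the virtual content which lies within the user's field of view, given a tracked position and viewing direction. The 2D coordinates, the 60 degree field of view, the 20 m range limit and all names (visible_content, anchors) are illustrative assumptions and are not taken from any system described in [8].

```python
import math


def visible_content(user_pos, heading_deg, anchors, fov_deg=60.0, max_range=20.0):
    """Return the names of content anchors inside the user's field of view.

    user_pos    -- (x, y) position reported by the tracking system (assumed 2D)
    heading_deg -- viewing direction in degrees, 0 = positive x axis
    anchors     -- dict mapping a content name to its (x, y) position
    """
    visible = []
    for name, (ax, ay) in anchors.items():
        dx, dy = ax - user_pos[0], ay - user_pos[1]
        distance = math.hypot(dx, dy)
        # Skip content at the user's own position or beyond the range limit.
        if distance == 0 or distance > max_range:
            continue
        bearing = math.degrees(math.atan2(dy, dx))
        # Smallest signed angle between heading and bearing, in [-180, 180).
        offset = (bearing - heading_deg + 180.0) % 360.0 - 180.0
        if abs(offset) <= fov_deg / 2:
            visible.append(name)
    return visible


anchors = {"statue": (10, 0), "vase": (0, 10), "far_temple": (100, 0)}
print(visible_content((0, 0), 0.0, anchors))   # user looks along +x
print(visible_content((0, 0), 90.0, anchors))  # user looks along +y
```

Only the content returned by such a query would have to be transmitted over the wireless link, which is why tracking and registration are listed as separate components.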
The virtual object can be viewed from any possible direction and distance, which requires dynamically changing its appearance and displaying the object parts according to the user’s line of sight. Not surprisingly, a smooth transition (at the rate at which users typically change position) is strongly recommended to prevent delays and dropped images while the user moves. Another important point is the required level of detail. Not every augmented reality solution requires high-resolution objects to transmit its message (Figure 2). Nevertheless, it needs to be considered whether high-resolution objects with very detailed appearance information are required. These last points are quite relevant for the object scanning procedure, because the scan provides the information which can be presented afterwards.
Figure 2. Maintenance work by using augmented reality to
receive additional work advices [2]
Providing digital objects which can be viewed, in the best case, from every possible viewing angle and under different lighting conditions could be a highly attractive alternative way to convey information and impressions.
OBJECT SCANNING, WHERE EVERYTHING STARTS
Besides the current representation techniques and hardware, the paper focuses on which scanning techniques are available to provide such accurately digitized objects. The following pages give a detailed overview of the available scanning approaches and methods and discuss whether any of them generates adequate results for later augmented reality purposes. The two scanning techniques discussed are, on the one hand, laser beam based scanners and their derivatives and, on the other hand, a photorealistic approach using the light reflection field of an object.
Figure 3. Basic concept of a triangulation scanner system [3]
The first scanning technique entering the competition is the 3D laser scanner, which generally consists of a laser beam emitting device and a receiver unit. Several methods build on this basic technique but use different kinds of hardware to increase the accuracy of the measured points, which makes it a good starting point for becoming familiar with the problems of this technique.
This laser scanner approach consists of a transmitting unit, which emits a laser beam at a defined, incrementally changed angle from one end of a mechanical base onto the object, and a CCD camera (see footnote 1) at the other end of this base, which detects the laser spot (or line) on the object. The 3D position of the reflecting surface element can be derived from the resulting triangle (Figure 3) [3].
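The triangle formed by laser, camera and surface point can be solved with the law of sines. The sketch below assumes a 2D setup with the laser at the origin, the camera at the other end of the baseline on the x axis, and both angles measured against the baseline; the function name and the coordinate convention are illustrative assumptions rather than the formulation used in [3].

```python
import math


def triangulate(baseline, laser_angle_deg, camera_angle_deg):
    """Locate the laser spot from the baseline length and the two base angles.

    The laser sits at (0, 0), the camera at (baseline, 0). Both angles are the
    interior angles of the triangle at the laser and at the camera.
    Returns (x, depth) of the illuminated surface point.
    """
    a = math.radians(laser_angle_deg)
    c = math.radians(camera_angle_deg)
    apex = math.pi - a - c  # angle at the surface point
    if a <= 0 or c <= 0 or apex <= 0:
        raise ValueError("angles do not form a valid triangle")
    # Law of sines: the side from the laser to the point is opposite the camera angle.
    laser_to_point = baseline * math.sin(c) / math.sin(apex)
    return (laser_to_point * math.cos(a), laser_to_point * math.sin(a))


x, depth = triangulate(baseline=1.0, laser_angle_deg=45.0, camera_angle_deg=45.0)
print(round(x, 3), round(depth, 3))
```

Because the baseline and the laser angle are known by construction, only the camera angle has to be measured for each spot, which is what the CCD camera provides.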
The basic idea behind already available solutions using this technology is to send a laser beam to the object and to measure how long the beam needs to travel from the transmitter to the object and back to the receiving unit (time-of-flight method). Alternatively, with detailed knowledge of the positions of the receiver and transmitter units, it is possible to compute the position where the laser beam hit the surface of the object (triangulation method) [1].
As explained above, the laser samples the surface and the receiver waits for the signal reflected from the object. Objects with simple properties and conditions are not difficult to scan accurately. Complex objects, however, for example sphere-shaped elements, attached fur, jewelry inlays or any other material which does not reflect the laser beam adequately, generate noisy point clouds or, in the worst case, points which cannot be sampled at all [3].
The time-of-flight scanner in principle performs a step-by-step scanning operation at around 100 scanning spots per second and delivers the travel time for each scanned spot. A spot represents the position which the laser scanner captured. The difference between the two methods is that the time-of-flight method derives the spot position from the measured travel time, whereas the triangulation method calculates it geometrically, because two positions (transmitter and receiver) are already known. The scanned data needs to be post-processed and, if necessary, corrected; it then provides the data for constructing the 3D wire mesh model of the object.
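The time-of-flight principle can be illustrated with a short calculation: the range is half the round-trip time multiplied by the speed of light, and a range together with two beam angles gives a 3D point. The angle convention and the function names below are assumptions made for this sketch; the rate of 100 spots per second is the figure quoted above.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def tof_distance(round_trip_seconds):
    """Range in metres; the beam travels to the object and back."""
    return SPEED_OF_LIGHT * round_trip_seconds / 2.0


def spot_to_xyz(distance, azimuth_deg, elevation_deg):
    """Convert a measured range and the two beam angles into a 3D point."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (
        distance * math.cos(el) * math.cos(az),
        distance * math.cos(el) * math.sin(az),
        distance * math.sin(el),
    )


def scan_duration_seconds(num_spots, spots_per_second=100.0):
    """Time needed to sample num_spots one after another."""
    return num_spots / spots_per_second


round_trip = 2 * 10.0 / SPEED_OF_LIGHT           # echo from a surface 10 m away
print(tof_distance(round_trip))                   # about 10 metres
print(scan_duration_seconds(1_000_000) / 3600.0)  # hours for one million spots
```

The round-trip time for 10 m is only about 67 nanoseconds, which hints at why timing accuracy limits the achievable range resolution.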
Other points to consider are the resolution used, the spot size of the scanner and, in some cases, the distance between object and scanner. The distance matters when it is not possible to get close enough to the object. Scanning systems working at greater distances are not as accurate as close-range ones, so the distance influences the spot size and the accuracy of the whole laser scanner. On the other hand, if the object’s surface is quite rough and the scanner beam is focused on a defined distance, the scan results may also not be accurate enough. Refocusing procedures are required to compensate for these deviations adequately [4].
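How distance affects the spot size can be estimated with a simple linear beam divergence model, in which the spot diameter grows with range. This is a simplified sketch and not taken from [4]; the exit diameter and divergence values are hypothetical and do not describe any particular scanner.

```python
import math


def spot_diameter(distance_m, exit_diameter_m=0.003, divergence_rad=0.00025):
    """Approximate laser spot diameter at a given range.

    exit_diameter_m -- beam diameter at the scanner (assumed value)
    divergence_rad  -- full divergence angle of the beam (assumed value)
    """
    return exit_diameter_m + 2.0 * distance_m * math.tan(divergence_rad / 2.0)


for distance in (2.0, 10.0, 50.0):
    print(distance, "m ->", round(spot_diameter(distance) * 1000.0, 2), "mm")
```

With these assumed values the spot grows from a few millimetres at close range to more than a centimetre at 50 m, so surface details smaller than the spot can no longer be resolved.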
1 CCD (charge-coupled device): a light-sensitive device which provides a signal proportional to the received light intensity [5].
SURFACE IS NOTHING WITHOUT TEXTURE
Another disadvantage of this kind of approach is the lack of information about the object’s appearance or, in other words, the object’s texture. The basic laser scanning method only provides information about the surface shape and no details about the texture. Modifications can be applied to the laser scanner to obtain some additional information about the texture of an object. Besides such modifications, there are solutions which follow the main surface-scanning laser to collect the color information of the object [4, 13].
Let us step back briefly and bring in the augmented reality purposes which are to be covered. The object in augmented reality should be as close as possible to the original in its appearance. In addition, it is important to be able to view the object from different viewing angles, which normally changes the object’s appearance for the viewer as well.
The 3D laser scanner approach described above only captures the surface satisfactorily; it provides no information about the lighting conditions around the object, nor about texture details, unless a color detection sensor or camera has been installed in addition. The common procedure for giving 3D models an adequate look is to take images of the object and paste the image parts onto the corresponding places (the post-processed coordinates after scanning) of the 3D model. This approach can produce quite realistic results, but for mobile augmented reality purposes with different viewing angles and lighting conditions it is not accurate enough to provide a digital copy which behaves like the original. [13]
Where is the problem? The problem with the method described above is that the texture image contains only one kind of illumination information, which is static (the particular lighting at the location of the cultural object), captured in the image and finally applied to the 3D model. It would be necessary to store the appearance of the object in a more dynamic way, or statically but then for all possible environment illumination scenarios, in order to adapt it to the position of the light and of the viewer. Imagine a statue digitized as described above which rotates without changing its appearance: the shadows on the object would not change, because of the fixed lighting information in the image which was applied to the 3D model. Another idea would be to take the 3D model of the object and render the object anew for every lighting condition.
Nevertheless, this technology could provide very good results if the missing information about texture and lighting conditions could be captured using different techniques. For augmented reality applications which do not require different illumination aspects, viewing angles and highly detailed textures, this approach already provides accurate results. On the other hand, the technique has some blemishes. Detecting the laser beam can be complicated, because the sensor is not able to detect the signal properly if the ambient light intensity is too high. Other show stoppers are the materials of the object, as mentioned before. Most of these cannot be scanned adequately because of their different reflection and absorption properties. Fur, for example, is quite difficult to laser scan because the hairs cannot be detected properly. Other materials may have a high absorption rate at the laser’s operating wavelength. Jewelry, for example, may reflect nothing back, because the laser beam is redirected in such a way that the receiver unit does not receive any signal [3].
Figure 4. 3D model of a statue, rendered with original
textures [3].
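The effect of illumination that is fixed in the texture, as described in this section, can be made concrete with a simple diffuse (Lambertian) shading model. This model is an assumption chosen here for illustration and is not named in this paper. The sketch compares a brightness value baked into the texture at capture time with one recomputed after the object has rotated.

```python
import math


def lambert(normal, light_dir):
    """Diffuse brightness of a surface patch; both vectors are unit length."""
    dot = sum(n * l for n, l in zip(normal, light_dir))
    return max(0.0, dot)


def rotate_y(vector, degrees):
    """Rotate a 3D vector about the vertical (y) axis."""
    a = math.radians(degrees)
    x, y, z = vector
    return (x * math.cos(a) + z * math.sin(a), y, -x * math.sin(a) + z * math.cos(a))


light = (0.0, 0.0, 1.0)    # light shining from the viewer's side
normal = (0.0, 0.0, 1.0)   # patch facing the light while the photo is taken
baked = lambert(normal, light)  # brightness stored in the texture image

for angle in (0, 60, 90):
    dynamic = lambert(rotate_y(normal, angle), light)
    print(angle, "deg: baked", round(baked, 2), "recomputed", round(dynamic, 2))
```

The baked value stays at full brightness for every rotation angle, while the recomputed value drops towards zero as the patch turns away from the light; this is the rotating statue whose shadows never change.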
PHOTOREALISTIC SCANNING WITHOUT MODELING
The next scanning approach discussed in this paper can produce a photorealistic virtual object without the need to generate a 3D wire mesh model from previously scanned surface points. This technique uses a high-speed camera and illumination setups to reproduce the object’s appearance by means of the light reflection field of the object [7, 11].
The method is quite effective at producing realistic-looking digital objects under different illumination scenarios. With the reflection field, it is possible to reconstruct the shape of the object by calculating it from the stored appearance images. [12]
How this method helps to fulfill the needs for realistic augmented reality objects is discussed in the following.
The scanning setup consists of two parts. The first part is a semicircular arm equipped with spot lights. This arm rotates vertically around the object with a defined step width and at a defined distance from the cultural object. The second part is a high-speed camera, which is synchronized with the spot lights and captures the object’s appearance [7].
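Since each captured image shows the object under exactly one spot light, and light adds up linearly, a new lighting condition can be approximated as a weighted sum of the captured images. The following sketch illustrates this idea on tiny grayscale images stored as nested lists; it is a simplified illustration under that linearity assumption and not the processing pipeline of [7]. The function name and the weights are hypothetical.

```python
def relight(basis_images, weights):
    """Blend one image per light position into a newly lit image.

    basis_images -- list of images (rows of floats), one per spot light
    weights      -- intensity of each spot light in the target lighting
    """
    if not basis_images or len(basis_images) != len(weights):
        raise ValueError("need exactly one weight per basis image")
    height = len(basis_images[0])
    width = len(basis_images[0][0])
    result = [[0.0] * width for _ in range(height)]
    for image, weight in zip(basis_images, weights):
        for y in range(height):
            for x in range(width):
                result[y][x] += weight * image[y][x]
    return result


lit_from_left = [[1.0, 0.0]]   # 1x2 image: only the left pixel is lit
lit_from_right = [[0.0, 2.0]]  # 1x2 image: only the right pixel is lit
print(relight([lit_from_left, lit_from_right], [0.5, 0.25]))
```

The finer the light positions are sampled during the scan, the more lighting directions can be reproduced this way without visible jumps.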
One of the main advantages of this method is that it is not necessary to create a 3D model of the object. The stored images provide enough information to recalculate the shape of the object from the light reflection field. The post-processing algorithm for recovering the shape of the object is, however, a complex operation and is not explained in detail here. Another important point is the ability to scan objects with materials or content which cannot be scanned by the laser beam technique mentioned above. [9]
Bringing in the augmented reality needs: this procedure would be able to provide sufficiently accurate object details for later augmented reality applications, and in addition the appearance of the object under different lighting conditions is already included.
In addition, it would be possible to change the illumination direction accordingly, though not very smoothly. The step width of the spot light arm is not small enough to allow a very smooth change of the lighting conditions. This gap could be closed by rendering the object with different light settings for those positions which were not illuminated during the scan.
Another possible solution for this restriction would be to reduce the step width of the spot light arm in order to obtain more images of the object with information about its appearance at a specific angle. Another show stopper is the fixed camera position in this approach. It would be necessary to mount additional cameras close to each other to obtain more images from different points of view. A second idea in this context would be to move the camera up and down along a second circular arm which rotates around the object. This might provide additional data about positions which cannot be reconstructed well, for example the side opposite the current fixed camera position.
With these adjustments it would be possible to obtain quite good results for mobile augmented reality applications, but the effort required to realize such a detailed object, viewable from any angle, would be quite high. In addition, the generated amount of data needs to be processed and transferred to the device which has to display it. Another critical point of this method is the high memory consumption of the scanning procedure. The following small calculation example should give a brief idea of how much data is involved when trying to obtain a very detailed virtual object. The example does not exactly match the setup used in [7] but has been slightly modified to show how much data a 360° scan would generate.
Figure 6. Photometric scanning approach in action [7].
The semicircular spot light arm, equipped with, for example, 17 spot lights (Figure 6), should be able to move to at least 360 different positions to cover a whole object. This generates 6,120 illumination positions in total, which need to be synchronized with the camera system. In addition, the camera would have to be moved from the bottom to the top of the object, which gives 180 positions if a step width of one degree is chosen, and to 360 positions around the object. Disregarding the positions where the camera and the spot light arm are close to each other, this results in 64,800 camera positions overall. For each camera position all relevant illumination aspects have to be captured, so the spot light arm needs to travel once around the object. 64,800 camera positions multiplied by 6,120 illumination positions gives 396,576,000 images. At a camera resolution of at least 640x480 pixels with 16-bit color depth, each image occupies 614,400 bytes.
The overall storage requirement would be 243,656,294,400,000 bytes, or about 243.7 TB, for one scanned object. Assuming, for example, 0.01 seconds per image, the whole scanning process would take around 46 days! In my opinion this approach is not feasible: it generates too much data and requires a great deal of effort to scan a single object. The given example is the extreme procedure of taking an image at every vertical and horizontal degree, which is of course not needed to obtain a photorealistic-looking digital object. In addition to this scanning effort, the collected data has to be post-processed to obtain the final digitized object. How much storage capacity the finished object would require cannot be estimated so far, nor how much data would need to be transferred to the augmented reality device to update the view accordingly, but it would not be feasible with current hardware.
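The calculation example can be reproduced with a few lines of Python. The constants are the figures of the example (17 spot lights, 360 arm positions, 180 x 360 camera positions, 640x480 pixels at 16 bits, 0.01 seconds per image). Note that 16 bits per pixel correspond to 2 bytes, so one uncompressed image occupies 614,400 bytes. The function and key names are illustrative.

```python
def scan_budget(lights=17, arm_positions=360,
                camera_elevations=180, camera_azimuths=360,
                width=640, height=480, bits_per_pixel=16,
                seconds_per_image=0.01):
    """Estimate image count, raw storage and capture time for a full scan."""
    illumination_positions = lights * arm_positions
    camera_positions = camera_elevations * camera_azimuths
    images = illumination_positions * camera_positions
    bytes_per_image = width * height * bits_per_pixel // 8
    total_bytes = images * bytes_per_image
    days = images * seconds_per_image / 86_400  # 86,400 seconds per day
    return {
        "illumination_positions": illumination_positions,
        "camera_positions": camera_positions,
        "images": images,
        "bytes_per_image": bytes_per_image,
        "total_bytes": total_bytes,
        "days": days,
    }


budget = scan_budget()
for key, value in budget.items():
    print(key, value)
print("terabytes:", budget["total_bytes"] / 1e12)
```

Changing the step widths in the call, for example to 10 degree steps for the camera, shows quickly how strongly the total depends on the sampling density.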
So which method is the right one, or which solution would
provide the best effort / performance ratio?
This question cannot be answered by saying that one of the techniques provides the best result in every single case. Both methods have their strengths and weaknesses in different respects. The choice strongly depends on the accuracy the augmented reality application requires. If it is not required to view an object from every possible direction and under different illumination conditions, both the laser scanner and the light reflection field technique provide a satisfactory result, although I personally would tend to prefer the laser scanner method because it yields a 3D model. Such models can be modified afterwards and turned into any required position. On the other hand, if the object contains parts which a laser can hardly detect because they reflect little or nothing (for example fur), the information about these parts would be lost, and laser scanning would ultimately not be a good approach for creating an exact digital copy of the original.
The best solution would be a hybrid solution which combines the advantages of both techniques. To get a better overview of which key points are covered, the following list should be used as a reference:
• 3D model required or not
• Level of texture quality (photorealistic, simple)
• Appearance (illumination, shadows)
• Point of view
• Field of application
The laser scanner technique should be used to sample the surface of an object and create the 3D wire mesh model. The light reflection field approach should be used to scan the object’s texture and appearance. The result of the light reflection field procedure should be photorealistic images which can be applied dynamically to the 3D wire mesh model. Through later computation and rendering it would be possible to add further lighting conditions to the basic photorealistic model. In this way it would be possible to obtain a photorealistic 3D model of sufficient quality to fulfill the purposes of mobile augmented reality.
MOTION CAPTURING AS ADDITIONAL (AR) BENEFIT
The previous pages were about how to scan cultural objects in such a way that the results are suitable for mobile augmented reality applications. The next logical step (without considering the feasibility of realizing such a system) would be to give a digital object the ability to show its functionality.
So far, only objects have been considered which should be viewable from different points of view in a very realistic style, without any consideration of the object’s mechanism or functionality. It could be quite interesting to gain insight into how some cultural objects work or which purpose they served in the past.
This is the point where motion capturing offers a possible way to digitize and store the mechanism of an object. With the collected motion information it would be possible to bring a 3D model of the object to life. There are some steps in between before the final 3D model with motion sequences is obtained, but let us go step by step. [6,9]
Originally, this method has been used to capture the motion of humans and to use the stored information to let a virtual model perform the same movements. The method could also be used to capture the motion of cultural objects.
The creation of an ordinary video clip could also be understood as motion capturing, but it normally does not include any indicators from which the actor’s motion can be computed. Of course, just recording the actor does not directly yield motion sequence details which could be applied to a digital object. The information recorded in the video can be used to understand the basic idea behind the motion elements by viewing it several times. In addition, another actor could imitate the actor’s motion. This would need some practice for a complex performance, but it is not the main idea behind motion capturing. The motion capturing discussed here stores the motion sequences and applies them to a 3D model.
Unfortunately, capturing the motion of an object requires special markers or indicators which need to be placed on the object’s surface. In some cases this is possible, but for cultural artifacts in alarming condition this approach is not applicable. The markers need to be fixed to the surface, which may damage the cultural artifact.
All common methods require markers which need to be attached to the object’s surface, as noted above. The first one uses radio transmitters with different radio IDs (necessary to determine the position of every single marker at any time) and sensors which are able to recognize the position changes of the transmitters. Basically, the transmitters change the surrounding magnetic field during motion, and the sensors detect these field changes. With the radio ID of each marker it is possible to determine which marker caused which magnetic field change. Object parts which influence the surrounding magnetic field complicate the detection of the markers. [9]
The second one uses a set of special cameras and specially colored markers which these cameras can detect. Specially colored may sound a little strange, but if, for example, the markers were green and most of the object’s surface were also green, the cameras would have trouble detecting the markers properly. To prevent such trouble, the cameras are limited to detecting a single color, which avoids any influence of the object’s appearance that might cause wrong marker detection. [6]
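A minimal sketch of color-based marker detection, assuming an image given as rows of RGB tuples and a hypothetical per-channel tolerance: all pixels close to the marker color are collected and their centroid is reported as the marker position. Real motion capturing systems work differently in detail; the function name and the threshold are assumptions made for illustration.

```python
def find_marker(image, marker_rgb, tolerance=30):
    """Return the (x, y) centroid of all pixels close to marker_rgb, or None."""
    xs, ys = [], []
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            # A pixel matches if every color channel is within the tolerance.
            if all(abs(p - m) <= tolerance for p, m in zip(pixel, marker_rgb)):
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (sum(xs) / len(xs), sum(ys) / len(ys))


GREEN = (0, 200, 0)   # object surface
RED = (255, 0, 0)     # marker color, chosen to differ from the surface
image = [
    [GREEN, GREEN, GREEN],
    [GREEN, RED, RED],
    [GREEN, GREEN, GREEN],
]
print(find_marker(image, RED))          # centroid of the two red pixels
print(find_marker(image, (0, 0, 255)))  # no blue marker in this image
```

If the marker were green like the surface, every pixel would match and the centroid would no longer say anything about the marker, which is the problem described above.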
Now let us return to the scanning techniques discussed above.
The laser beam scanning technique is designed to capture surface details of stationary objects and to create a 3D model afterwards. Any motion of the object during the laser scanning procedure would prevent a successful reconstruction of the object. This technique therefore provides no benefit for motion capturing, except for the availability of the 3D model. The availability of the 3D model after the laser scanning procedure is one of its main advantages, because the motion capturing information can be added to the wire mesh model.
The second digitizing approach uses the light reflection field technique, which could perhaps be modified to provide additional motion information by replacing the existing cameras with cameras optimized for marker detection after the scanning procedure has finished. The main disadvantage is that there is no 3D model of the object to which the motion sequences could be applied. The former advantage of not needing any 3D model now becomes the main show stopper for this approach.
In my honest opinion, it would be necessary to create a solution, or even a set of different approaches, which brings all the best-practice results together. But how much effort is appropriate?
Let us take a small example to test the discussed techniques and how they could be combined to obtain digital objects with a photorealistic appearance and additional motion information.
Basically, three components have to be covered to create such an object. First, the information about the object's shape and appearance has to be recovered. The light reflection field approach could provide this data by scanning the object, so the 3D object and the photorealistic appearance would already be available after this scanning procedure. The next part would be the motion capturing, which could be done with the light reflection field setup. Additional motion capture cameras and markers placed around the object could then generate the motion profile. Perhaps this approach would also work without special markers on the object surface: it might be possible to use the light reflection field of the object to reconstruct the motion capture data. I assume this would work, but it would mean that the whole object has to be scanned several times. This sounds possible, but recovering all the information and applying the motion details to the scanned object would require a lot of effort and time.
Returning to the idea of mobile augmented reality applications: a photorealistic digital copy of the original, enhanced with some basic movements, would improve the understanding of the object. One possible approach to generating such an object is to take the 3D model from the laser scanner method, add the object appearance from the light reflection field, capture the object motions in an additional step, and merge all results into the final object.
This result might provide the best solution for mobile augmented reality applications, but generating such an object would require very high effort. If technological development continues at its current speed, this could perhaps become manageable, so that such an augmented reality application could become reality.
The whole scanning scenario above completely lacks a 3D model. Commonly, the motion capture information is stored as a compatible model and added to the 3D model of the object. There is thus a missing link between object and motion data, because they have no common base (the 3D model). An additional laser scanning step would be needed to generate the 3D model of the object. This would be a redundant step that only provides the missing 3D model. I would instead recommend using the light reflection field data, which is primarily used to reconstruct the object, for 3D model creation as well. This should be more efficient than setting up an additional laser scanner and scanning again. The surface of the digital object is already available, so an inverse computation should be possible too.
CONCLUSION
After reading many papers about scanning techniques, augmented reality and motion capturing, I find the approaches realized so far quite impressive. Even to a layman without deeper technical understanding of all the methods mentioned, the results achieved in some projects are brilliant. This paper has focused on scanning techniques that provide digital objects for augmented reality applications, which require high object quality and the ability to view objects from different points of view.
Overall, the discussed techniques are able to provide the required data in adequate quality to realize photorealistic objects, but no single technique covers all the points mentioned. The photorealistic approach, enhanced by motion capture technology and a 3D model creation procedure, might be able to cover them, but it is difficult to estimate whether the required effort is justifiable. If the presentation technology cannot handle the result, creating the best possible 3D object for a mobile augmented reality application would be a waste of funds.
Based on the collected information and the approaches explained, it should not be surprising that none of the mentioned scanning techniques is able to generate all relevant information. Every technique has to deal with different tradeoffs among the aspects and attributes it addresses. The photorealistic approach, for example, requires that the hardware can be placed around the object. If this is not possible, the whole procedure is impracticable. On the other hand, there are additional ideas for light reflection field systems that use a different approach to obtain the digital object. [11]
Nevertheless, the available techniques are able to provide objects that can be used for augmented reality; which technique is suitable depends strongly on the purpose of the application. Adding some simple objects with the current techniques would not be a big deal; adding something quite realistic is.
REFERENCES
1. Augmented Reality Example, http://www.sharedreality.de/img/portfoliobump.jpg, (30.11.2008)
2. Augmented Reality User, http://www.istmatris.org/images/overview2.jpg, (30.11.2008)
3. Boehler, W., Heinz, G. and Marbs, A. The Potential of
Non-Contact Close Range Laser Scanners for Cultural
Heritage, CIPA International Symposium, Proc.
Potsdam, Germany (2001), 2.
4. Boehler, W. and Marbs, A. 3D Scanning Instruments,
CIPA, Heritage Documentation - International
Workshop on Scanning for Cultural Heritage Recording
Proc. Corfu, Greece (2002), 2.
5. Boyle, W.S. and Smith, G.E. Charge Coupled
Semiconductor Devices, Bell Laboratories, Murray Hill,
NJ, (1982).
6. Gleicher, M. Animation From Observation: Motion
Capture and Motion Editing, Computer Graphics 33(4),
p51-54. Special Issue on Applications of Computer
Vision to Computer Graphics, University of Wisconsin
(1999).
7. Hawkins, T., Cohen, J. and Debevec, P. A Photometric
Approach to Digitizing Cultural Artifacts University of
Southern California (2001).
8. Höller, T., Feiner, S., Mobile Augmented Reality,
(2004), 4-6.
9. Horber, E., Motion Capturing, University Ulm, Germany (2002)
10. Levoy, M. The digital Michelangelo project: 3D
scanning of large statues. In Proc. Siggraph (2000).
11. Levoy, M. and Hanrahan, P. Light Field Rendering,
Computer Science Department, Stanford University,
Siggraph (1996).
12. Robson, S., Bucklow, S., Woodhouse, N. and Papadaki,
H., Periodic Photogrammetric monitoring and surface
reconstruction of historical wood panel painting for
restoration purposes, ISPRS (2004)
13. Wulf, O., Wagner, B. Fast 3D Scanning Methods for
Laser Measurement Systems, Conference on Control
Systems and Computer Science, Institute of Hannover,
Germany (2003).
Possibilities of a Digital Environment
Daniel Finke
Alpen-Adria Universität
Egarterweg 6
Klagenfurt, Austria
dfinke@edu.uni-klu.ac.at
ABSTRACT
Advances in miniaturization, energy efficiency, sensor technology and intelligent materials make it possible to integrate hardware into everyday devices. This integration of microchips and sensors into our surroundings creates a digital environment. The article presents possible scenarios from the field of "Ubiquitous Computing". The scenarios are meant to give a better understanding of the various possibilities and uses of technical systems embedded in the environment. The article describes the requirements placed on such ubiquitous data processing systems and how such systems can be built, and it describes a generic prototype of a distributed sensor network (ProSpeckZ).
Keywords
digital environment, Ambient Intelligence, Calm Computing, Ubiquitous Computing, Speckled Computing.
INTRODUCTION AND DEFINITION OF TERMS
The rapid evolution of information and communication technology makes it possible to integrate sensors and technical devices seamlessly into our surroundings and to network them with each other. This creates a "digital" environment that can react to users automatically, respond to their needs and predict their behavior. However, this development places new requirements on the devices used, and other factors, such as context, demand more attention as well. Designing such systems is not trivial. The article describes a possible architecture for such systems and presents specklednet as an example of their realization and their possible uses.
In connection with an environment enriched with digital devices, the terms "Ambient Intelligence", "Calm Computing" and "Ubiquitous Computing" (abbreviated UC below) often come up. These terms are explained in this section.
The term UC goes back to Mark Weiser. In his article "The Computer for the 21st Century" he describes his idea in which the computer disappears as a stand-alone device and is replaced by so-called "intelligent devices" that unobtrusively support people in their activities [10].
Ubiquitous Computing denotes the omnipresence of sensors, actuators and processors that communicate with each other, trigger various actions and control processes. Everyday objects thus gain the additional property of behaving according to the environment they perceive. Besides UC, terms such as "Mobile Computing", "Pervasive Computing" and "Ambient Intelligence" are also often used; I will briefly describe their relationship to each other. Starting from "traditional" data processing with servers, PCs, terminals and traditional input and output devices as the interface, increased mobility leads to "Mobile Computing", whereas increased embedding of miniaturized computers in other objects leads to "Pervasive Computing". Taking both aspects together yields ubiquitous data processing, UC (see Figure 1).
Figure 1: Forms of UC, taken from [3]
Ambient Intelligence is a technological paradigm with similarities to UC and Pervasive Computing. The aim of research in this field is to integrate sensors, radio modules and computer processors into everyday objects. Further use cases and scenarios can be found in the Taucis study [3]. The positive vision is that extending almost all objects with communication capability and intelligence will noticeably ease our everyday life, and that the intelligent environment will support us in a natural way when needed. The negative side is total surveillance (the "transparent person") and restricted self-determination, since the environment can act manipulatively.
With Calm Computing [4], interaction with the environment does not require the user's focus. The interaction moves into the background and is automated. The principle behind "Calm Computing" is to present information in a way that demands little attention. That is, using the computing power requires no more attention than other everyday activities such as walking or reading. If necessary, however, the interaction can easily be moved to the center of attention.
To achieve "Calm Computing" [7], many tasks have to be delegated to the surrounding technology. Software agents play an indispensable role here; they can take over everyday tasks such as negotiating parameters in adaptive environments. What is decisive is to model the user's context (e.g. location, behavior, condition and preferences) correctly so that the technology reacts "intelligently". For this purpose, techniques from other research areas such as the Semantic Web, artificial intelligence and soft computing are drawn upon to improve the adaptability of the processes. A central question arising from the idea of Calm Computing is whether and to what extent it is possible to protect the informational self-determination of individuals while still handing over as much control as possible.
Understanding Context
An important aspect in connection with UC is context. In [3], three points are described that need to be considered:
Context Perception
An important aspect of UC is the perception of context by devices (context awareness) [5]. The main goal is for systems to adapt flexibly to the respective requirements without having to be configured explicitly for each individual action. A simple example is room lighting or room temperature that always adapts to a person's preferences when that person enters a room. Three important aspects can be distinguished:
• Perception of identity,
• Activity and state of a user,
• Perception of the physical environment and the self-perception of devices.
In a broader sense this also includes computers recognizing the context of data, especially of documents. Among other things, this leads to the development of a "semantic web" that can be interpreted by associated software agents. [6]
Person Recognition and Behavior Analysis
For UC systems to adapt to the user, they must recognize the user and the user's current state and derive the desired environment settings from this. This is accomplished by software that evaluates the data supplied by sensors.
The identification of persons by biometric features has been on the rise for several years, but these are systems that require the user's active cooperation. Scanning a fingerprint or the iris, for example, requires the finger or eye to be positioned precisely. This is not necessary for recognition systems that use the human voice as the feature to be recognized, but the future belongs to systems that can recognize persons without demanding any specific action at all. This is convenient for users on the one hand, but problematic on the other, since persons can then be identified without their consent. Perceiving the activity and behavior of users is likewise realized in software. This involves software that first collects data about the user and builds profiles in order to reach its full capability. Programs that perform analyses independently of profile data are also conceivable, for example to estimate during a telephone call, by analyzing the acoustic data, whether the other party is lying.
SCENARIOS
In this section I describe possible uses of technology integrated into the environment for the Burgbau zu Friesach project (the construction of a castle in Friesach). The scenarios are meant to give a better understanding of the various possibilities and uses of technical systems embedded in the environment. Over the coming 30 years, a hilltop castle is to be built in Friesach. To achieve the greatest possible authenticity, the castle is being built with medieval technology and medieval methods. Further information can be found at http://www.friesach.at/.
Scenario 1
It is 6:30. Bob Baumeister sets off for work. Bob is the site manager of the long-term project to build the castle at Friesach. After three weeks of well-deserved vacation, Bob is curious whether the construction could continue without him. Although the castle is being built with medieval methods, self-networking sensors are used. This allows Bob to display and evaluate the construction progress of the last three weeks in a model running on a computer in his office. This is done simply by reading information from the sensor network, for example the number of networked sensors. A further benefit is that the time at which individual nodes first went online makes it possible to reconstruct the exact sequence of construction. With a construction period planned for 30 years, this seems to be an interesting added value. As Bob checks the work progress on the model, he is alerted to a problem that occurred in his absence and was recorded by the workers directly on the construction site. Recording on site has the advantage that, in addition to the data entered, the exact location at which the data was entered is captured. This data can later be helpful for documenting the project. An archway could not be built with the aids intended for it. Even before he enters the construction site, Bob can inform himself about alternatives and is quickly up to date again.
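The two read-outs mentioned in Scenario 1, the number of networked sensors and the construction sequence derived from the time each node first went online, can be sketched as follows. This is only an illustration: the node names, the timestamps and the function name are hypothetical and not part of any described system.

```python
from datetime import datetime
from typing import Dict, List


def construction_sequence(first_online: Dict[str, datetime]) -> List[str]:
    """Order building parts by the time their sensor node first went online."""
    return sorted(first_online, key=lambda node: first_online[node])


# Hypothetical node names and timestamps, for illustration only.
first_online = {
    "gate_arch": datetime(2009, 6, 12, 9, 30),
    "foundation_north": datetime(2009, 3, 2, 8, 0),
    "wall_east": datetime(2009, 4, 20, 14, 15),
}
print(len(first_online))                    # number of networked sensors
print(construction_sequence(first_online))  # oldest node first
```

Sorting by first-online time assumes that each node is switched on when the building part carrying it is put in place.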
Scenario 2
Alice attends the second year of the HTL Villach. Today her class is visiting the castle construction site at Friesach. Since the castle is still under construction, most of the program will take place outdoors. Alice is therefore informed about the weather in Friesach as soon as she gets up, so that she can choose suitable clothing. The data comes directly from the castle's sensor network. Class 2a specializes in construction engineering. The excursion is a welcome complement to the material already covered. Here the class can witness how something is created with the help of old, partly forgotten construction knowledge. The day before, the students installed an application on their smartphones. With this application they can state their personal preferences and the time they would like to spend at the castle. On arrival, the smartphones connect to the sensor network, which is available across the entire castle grounds, and an individual tour is computed. Thanks to the sensor network, it is known at all times how many students are where. This makes it possible to avoid waiting times at particular locations and at practical exercises. Alice sets off and explores the castle using the interactive tour guide. The tour guide leads her to various stations where videos, presentations and practical exercises inform her about the castle construction and about life in the Middle Ages. After each station, a short quiz is taken on the smartphone. The student who collects the most points may act as lord or lady of the castle at the closing meal. At registration, food intolerances were also checked and, where necessary, an alternative menu plan was drawn up.
TECHNICAL FACTORS OF UC SYSTEMS
To realize the scenarios described in the previous chapter, the following basic technical factors are decisive according to the Taucis study [3].
Miniaturization
The ongoing miniaturization of computer hardware makes it possible to integrate more and more chips into objects in our surroundings. RFID technology can be seen as a first precursor of this development, but it is only a minimal approach. Increasing performance and ever smaller, cheaper chips make it easier to integrate embedded computer systems into everyday objects. Current chips from Intel and AMD are already manufactured in 45-nanometer processes, and processes allowing feature sizes down to 15 nanometers are already under development. The limits of chip manufacturing are far from being reached [8]. Applications in other subfields of nanotechnology also open up new possibilities for embedding devices in everyday objects.
Energy
Another important point is the energy supply. In recent years, a trend towards power-saving components has become apparent. Energy storage technology is also developing continuously. In addition, there is intensive research on wireless energy transfer and on mobile energy harvesting. Fuel cells based on hydrogen or similar fuels appear suitable for mobile devices, but because of the refilling problem they are not suitable for embedded components. For embedded computer systems, wireless energy transfer is a sensible alternative. Here, transponders use the energy of the transmitter's electromagnetic field for their own operation and for transmitting a response signal.
Intelligent Materials
According to [9], intelligent materials are "non-living material systems that achieve adaptive behaviour". "These include composite materials with integrated piezoelectric fibers, electrically and magnetically active polymers, and so-called 'Shape Memory Alloys' (SMA), i.e. metal alloys that return to their original shape after deformation simply by being heated. Micro-electromechanical systems (MEMS), i.e. combinations of mechanical elements, sensors, actuators and electronic circuits on a substrate or chip, can also be counted among intelligent materials." [3]
With such composite materials, tasks such as deformation and movement themselves, but also localizing these deformations, become possible.
New Form Factors
The factors described so far make new form factors possible. A good example is wearable computing, which denotes the integration of computer systems into clothing. The integration ranges from a simple RFID tag to more complex systems with input and output capabilities and microcontrollers. In addition, applications such as Smart Dust [9] or SpeckledNet [1,2] are results of these new form factors.
Sensor Technology
Sensors are needed to perceive the environment as completely as possible. Sensors capture data about the environment and pass it on to systems that process it and can initiate a reaction. Sensors can be used to perceive temperature, humidity, speed, position, and so on.
Universality of Mobile Devices
The last point described in [3] concerns the capability of mobile devices such as smartphones and PDAs. This point seems to be fulfilled already by new smartphones. Phones such as Apple's iPhone or the HTC G1 have a full operating system and a variety of communication interfaces (UMTS, WLAN, Bluetooth) as well as GPS and enough computing power to run arbitrary applications (location-based services, mobile commerce).
FUNCTIONS OF UC SYSTEMS
If the factors described above are fulfilled, the following basic technical functions can be provided:
• Computer support that is available continuously and everywhere.
• Greatly simplified interfaces between human and computer that demand only minimal attention and interaction from users; this is also referred to as Calm Computing [4].
• Automatic control and adaptation of the environment to user preferences or situation-dependent contexts.
• Automatic execution and handling of recurring standardized processes without the need for user interaction.
To implement these functions technically, the following components are needed, in software, hardware or as virtualization:
• A distributed infrastructure for sensors and interfaces.
• A distributed infrastructure for the transport of data.
• Computing power from one or more distributed computers that process data and make decisions. The decision algorithms should be adaptive, i.e. able to adjust to different conditions and contexts.
• Access to one or more, possibly distributed, data stores.
• Connection to external data sources and services.
• Components for implementing decisions or for executing a service or other actions, also in a distributed infrastructure.
This description leads to the following diagram, which shows the interaction of a user with a generic UC system. The arrows each represent the data transport between the different components of the system.
Figure 2: Adaptive UC system (graphic from [3])
The person, or an object, is detected and identified by the UC system by means of sensors. This information flows, together with data from internal and external data sources, into an adaptive decision-making process that controls the service to be provided and triggers actions. Such systems can also be built modularly, meaning that a wide variety of components join to form a system, depending on freely available capacity, to carry out particular tasks. As a consequence, different components may be involved when the same process is repeated. Although the dynamics of the system offer some advantages, the fundamental openness of the system means that security measures should be applied to all of its components, i.e. the individual components and also individual subsystems must have their own security mechanisms. Only then can it be ensured that, if one component is compromised, the rest of the system is not automatically compromised as well. RFID systems are prototypical of today's UC systems. Objects are identified with the help of RFID tags because no suitable sensors for recognizing them exist yet. Different objects are thus read with a uniform sensor, the RFID reader. Further examples of central UC technologies are ad hoc networks and location-based services (GPS, GSM, WLAN). At different locations the context is often completely different as well, e.g. using a mobile phone for private and for business purposes. Another important insight is that UC systems inherit the properties and requirements of their underlying base technologies. This has significant implications above all for data protection and data security, since the sum of the individual components and their data can open up an entirely new dimension of data processing.
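The flow from sensing through adaptive decision-making to action can be made concrete with a small sketch. It is an illustration under assumptions, not the architecture from [3]: the tag registry, the stored preferences, the daylight flag and all function names are hypothetical. An RFID-style tag ID identifies the user, the decision step combines the stored profile with one external context value, and the action step formats a command for the room.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Settings:
    light_percent: int
    temperature_c: float


# Internal data source: stored user preferences (hypothetical values).
PREFERENCES: Dict[str, Settings] = {
    "alice": Settings(light_percent=70, temperature_c=21.5),
}
DEFAULT = Settings(light_percent=50, temperature_c=20.0)


def identify(tag_id: str, tag_registry: Dict[str, str]) -> Optional[str]:
    """Sensing step: map a sensed tag ID to a known user, if any."""
    return tag_registry.get(tag_id)


def decide(user: Optional[str], outside_is_dark: bool) -> Settings:
    """Decision step: combine the user profile with external context."""
    base = PREFERENCES.get(user, DEFAULT) if user else DEFAULT
    # External data source: no artificial light is needed in daylight.
    light = base.light_percent if outside_is_dark else 0
    return Settings(light_percent=light, temperature_c=base.temperature_c)


def act(settings: Settings) -> str:
    """Action step: only formats the command an actuator would receive."""
    return f"light={settings.light_percent}% temp={settings.temperature_c}C"


registry = {"tag-0042": "alice"}
user = identify("tag-0042", registry)
print(act(decide(user, outside_is_dark=True)))
```

Keeping the three steps as separate functions mirrors the modular structure described above, in which different components may take over each step from one run to the next.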
A concrete system design and system being developed at the University of Edinburgh is described in the following part.
SPECKLED COMPUTING
A speck [1,2] combines sensing, processing and wireless communication in a single chip. Specks are autonomous, have a renewable energy source and can also be used in mobile settings. Specks can cooperate and form programmable computational networks. These so-called specknets are intended as a generic technology for UC. It should be possible to sense data, process it and generate information from it.
Figure 3 shows a system-level overview of a speck and of the specknet. It is planned that such specks will be programmable, that specknets will act as fine-grained distributed computing networks, and that they will use a lightweight, power-saving communication protocol. The wireless communication can be handled either via infrared or via radio waves, depending on the application.
Figure 3: System-level overview of speck and specknet, taken from [2]
Specknets pose unique networking problems, and novel solutions are needed to deal with them. Key properties of specknets will include decentralized control and adaptivity, for example dynamic routing to re-establish broken connections, or a MAC layer that knows the energy levels of the specks and adjusts computing and communication performance accordingly.
A single speck is very limited in its computing and storage capacity. Combined into a specknet, however, specks are powerful. Specks process their data themselves and pass only the result on to the outside. Because of the limited computing and storage capacity of individual specks, a mechanism has to be found for accomplishing tasks jointly. This calls for a new model of distributed computing that addresses the particularities of specknets, which include, for example, a high failure rate of specks and less reliable communication.
The following section describes a specknet prototype realized for test purposes.
The prototype is called ProSpeckZ. The ProSpeckZ platform consists of the following three core components:
• An 802.15.4-compliant chipset that supports wireless communication at up to 250 kbps on 16 channels.
• An antenna matched to 2.4 GHz and filter circuits that allow the range to be adjusted in software between 30 cm and 20 meters.
• A programmable system-on-chip (PSoC) that gives the ProSpeckZ reconfigurable analog circuits for connecting external interfaces and components. The PSoC is the computational core of the ProSpeckZ. It consists of an 8-bit microcontroller, 16 KB of flash and 256 bytes of RAM.
Figure 4: Speck hardware size comparison, taken from [1]
An overview of the ProSpeckZ system is shown in Figure 5. On the hardware layers, the 802.15.4 protocol is used for the physical wireless communication, while the PSoC allows sensors and actuators to be integrated into ProSpeckZ. The firmware layer consists of an energy-aware MAC layer. In addition, channel reservation is realized with a low-contention scheme (duty cycling [5]).
For routing, a novel lightweight network protocol is used. Either a unicast or a multicast approach can be employed. The unicast approach saves energy but is less robust against interference than the multicast approach. The routing layer enables the layers above it to transport data wirelessly through the specknet.
The next layer is a real-time operating system, which is responsible for scheduling tasks and executing events and commands. Communication with devices connected to the ProSpeckZ is also handled by the operating system.
Figure 5: System overview, taken from [1]
ALREADY REALIZED APPLICATIONS
The following part describes four applications that were realized with ProSpeckZ and are meant to show the variety of uses of this technology.
The first application is a typical sensor network application. By connecting a temperature sensor to a ProSpeckZ, it can easily be turned into a fire detector. The ProSpeckZ can then be distributed in a building and form a distributed fire alarm network via wireless links. The added value of such a system lies in its ability to guide people away from the fire. When a fire is detected, a simple distributed algorithm is executed that uses the coordinates stored in the ProSpeckZ to compute a path past the fire to safety. Thanks to the wireless links, the system remains operational even if individual nodes are destroyed.
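The idea of guiding people past the fire can be sketched as a path search that skips nodes reporting fire. The sketch below is a centralized breadth-first search used as a stand-in; it is not the distributed, coordinate-based algorithm of the ProSpeckZ application, whose details are not given here. The building layout, node names and function name are assumptions.

```python
from collections import deque
from typing import Dict, List, Optional, Set


def escape_path(
    links: Dict[str, List[str]],
    start: str,
    exits: Set[str],
    burning: Set[str],
) -> Optional[List[str]]:
    """Shortest hop path from start to any exit that avoids burning nodes."""
    if start in burning:
        return None
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node in exits:
            return path
        for neighbour in links.get(node, []):
            if neighbour not in visited and neighbour not in burning:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


# Hypothetical building layout: each key is a node, each value its neighbours.
links = {
    "office": ["hall", "stairs_a"],
    "hall": ["office", "exit_main"],
    "stairs_a": ["office", "exit_back"],
    "exit_main": ["hall"],
    "exit_back": ["stairs_a"],
}
exits = {"exit_main", "exit_back"}
print(escape_path(links, "office", exits, burning={"hall"}))
```

Because burning nodes are simply skipped, the search also copes with destroyed nodes as long as some route to an exit remains.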
The main purpose of ProSpeckZ is to support the development of algorithms for Speckled Computing. One example is a distributed algorithm for logical location estimation. The algorithm estimates the logical position of every node in the network based on two-hop information. A logical position is defined as a coordinate in a two- or three-dimensional space relative to the logical positions of the other nodes. In many sensor network applications, the logical position information is almost as important as the measured information. The algorithm was developed and tested on a Java software platform. With ProSpeckZ it can now also be tested in the real world. For this purpose, the ProSpeckZ are equipped with LCDs, which display the essential data such as the logical position and the number of connected nodes. By moving the ProSpeckZ, the algorithm can now be tested in a real environment and adjusted to obtain a more stable and more accurate estimate.
The next application shows how easily the ProSpeckZ can be
extended with sensors and actuators. The ProSpeckZ is
connected to a loudspeaker and an infrared range finder. It
can then be used as a keyless music keyboard: depending on
the distance to the infrared sensor, the loudspeaker plays a
particular tone.
Specks can also be used as a technology for mobile toys and
robots. They can be coupled with motors or similar actuators
and thereby made mobile. ProSpeckZ are suitable for testing
algorithms and applications in this area as well. As the
bottom right picture in Figure 6 shows, simply connecting a
ProSpeckZ to a miniature car creates a mobile platform for
developing cooperative mobile applications.
Figure 6: The versatile uses of the ProSpeckZ, taken from [1].
SUMMARY
The integration of hardware into our environment and into
everyday objects will certainly increase in the future.
Although the field of UC has been researched for almost
20 years, practical applications are still scarce. Advances
in miniaturization, energy efficiency, sensor technology and
intelligent materials make it increasingly possible to
integrate hardware into everyday devices. The requirements
for UC systems have also been described in numerous research
articles; the next step is to build test applications. A
good example of a test platform is ProSpeckZ. With
ProSpeckZ, algorithms for UC systems can be tested and
checked for their suitability. Such generic prototypes make
it easier to develop concrete applications. UC nevertheless
remains an interesting field of research. Projects such as
the castle construction at Friesach (Burgbau zu Friesach)
can benefit in the future from integrating such
technologies. Wireless networks and RFID tags already have
great potential, but they are only the beginning. As soon as
generic technologies such as Specks become available at low
cost, numerous further possibilities will arise.
REFERENCES
1. Leach, M. and Benyon, D. (2006): "Interacting with a
Speckled World". ADPUC'06, November 27-December 1, 2006,
Melbourne, Australia.
2. Arvind, D. K. and Wong, K. J.: "Speckled Computing:
Disruptive Technology for Networked Information
Appliances", in Proceedings of the IEEE International
Symposium on Consumer Electronics (ISCE'04) (UK),
pp. 219-223, September 2004.
3. Technikfolgenabschätzung, Ubiquitäres Computing und
Informationelle Selbstbestimmung, study commissioned by the
German Federal Ministry of Education and Research,
http://www.taucis.hu-berlin.de/content/de/publikationen/taucis_studie.php
(04.12.2008).
4. Weiser / Brown, The Coming Age of Calm Technology, 1996,
http://www.ubiq.com/hypertext/weiser/acmfuture2endnote.htm
(06.12.2008).
5. Schmidt, Ubiquitous Computing – Computing in Context,
2002; project TEA on "Context Awareness":
http://www.teco.edu/tea/tea_vis.html (30.11.2008).
6. Berners-Lee, Weaving the Web, 2000.
7. Weiser, Mark / Brown, John Seely: The Coming Age of Calm
Technology, 1996,
http://www.ubiq.com/hypertext/weiser/acmfuture2endnote.htm
(30.11.2008).
8. Lawrence M. Krauss and Glenn D. Starkman, Universal
Limits on Computation, arXiv, 2004.
9. McCloskey, Paul: From RFID to Smart Dust, 2004.
10. Weiser, The Computer for the 21st Century, Scientific
American, 265, 3, 1991, pp. 66-75.
Psychological Aspects of User Experience
in Virtual Environments
René Scheidenberger, Bakk. techn.
rscheide@edu.uni-klu.ac.at
(Matr.Nr: 0360243)
623.400 Seminar aus Interaktive Systeme
Universität Klagenfurt
ABSTRACT
Several psychological aspects influence user experience in
virtual environment (VE) applications. When designing a VE, it is
important to be aware of these aspects and the parameters that influence them. After introducing some issues that
are important for user interaction in VEs, two key factors,
namely “spatial cognition” and “immersion & presence”, are
identified as being important for every issue. Spatial cognition
is defined and its development in children is explained.
The geometries of spatial representations are mentioned, and
several parameters for navigation and orientation are introduced.
The second key factor is immersion. Some criteria
for measuring immersion are introduced, and sound immersion is also discussed. Finally, a conclusion is drawn from the
results.
Author Keywords
virtual environment, cognitive aspects, spatial cognition, immersion, presence
INTRODUCTION
This paper deals with a very specific topic of cultural artifact exploration, namely the psychological aspects of user
experience in virtual environments in which artifacts could be
presented.
Although VEs have developed strongly during the last
decades, existing environments are still not able to evoke a
natural experience of being in the VE. Users frequently get
lost while navigating, and simulated objects appear
compressed and are underestimated compared to the real
world [8]. To make a VE as realistic as possible, it is very
important to look at the human psyche.
This paper does not offer a solution to the problem of experienced
realism in VEs, but it points out the variables that influence
it.
Klagenfurt, Austria.
Copyright 2008 Klagenfurt University.
DEFINITIONS & ABBREVIATIONS
This section lists several definitions and abbreviations that
are used throughout the paper.
• Cognitive Mappings: These are map-like cognitive representations of geographic or other large-scale environments. [2]
• Immersion or Presence: A state in which the awareness of one's physical self
is diminished or lost by being surrounded by an engrossing total environment. The user loses the critical
distance to the experience and gets emotionally involved [11]. He is totally present in the new environment and
has left his real surroundings behind [9].
• Proposition: A proposition can be seen as the smallest
abstract unit of knowledge that describes an issue.
This is mostly done with predicate-argument relations.
[11]
• Spatial Orientation: This must be distinguished from
spatial cognition. It is also known as geographic orientation and refers to the way an individual determines his
location in the environment. A topographical cognitive representation is used, which is related to a reference system
in the environment. [2]
• Virtual Environments (VE): VEs, also known as Virtual
Realities (VR), are computer-simulated environments that
allow users to interact with them. They can offer audiovisual experiences, but also haptic feedback. Users can
interact with VEs through standard input devices (e.g.
mouse or keyboard) but also through more complex devices (e.g. data gloves or special glasses). [11]
• Vection: Vection means illusory self-motion. If we are
sitting in a stationary train, for example, and watch the
neighboring train pull out, we sometimes feel that our own
train is leaving. Vection is known to appear for all motion
directions and along all motion axes. [8]
COGNITIVE ISSUES IN VIRTUAL ENVIRONMENTS
This section describes several cognitive issues that are important for a user entering a VE. Perception, attention,
learning and memory, problem solving and decision making,
and motor cognition all play a role. The following paragraphs
describe them according to Munro et al. [4].
• Perception: Perception is an active process. It is not
only a bottom-up conveyance of data (e.g. processing visual images) to higher cognitive centers; there is also a top-down contribution to perception. Expectation and experience result from the active interpretation of sensations
in the context of earlier expectation and experience.
To explain this kind of cognitive cycle, I have to mention
several characteristics of VEs that raise issues for perception. Poor display resolution, problems with alignment
and convergence, and the texture maps presented on displays are just a few characteristics that place constraints
on bottom-up data processing in perception. Objects often cannot be recognized. This is why top-down contributions of experience and expectation are involved in the process. [4]
In VEs we deal especially with spatial cognition, which
is not quite the same as spatial perception [2]. Cognition
by definition covers all modes of knowing
(perceiving, thinking, imagining, reasoning, etc.). Perception can be seen as a subsystem or function of cognition.
As this paper aims to describe the psychological aspects
in detail, and spatial cognition is very important, it is treated in the section “Spatial Cognition”.
• Attention: Because the field of view
and the resolution provided by most VE systems are limited, the user
has to carry out navigation and orientation actions to
bring the matter of interest into view [4]. See subsection
“Navigation & Orientation” for details.
• Learning and Memory: Many VE applications are used
as learning environments. Whether it is a flight simulator that trains future pilots or a tutoring system that
helps students learn a topic, learning and memory depend on perception and attention. How
learning and memory work cognitively is not covered in this
paper, because it is less important for the reader and
would go well beyond the intended length.
• Problem Solving and Decision Making: From a system designer’s point of view, the concepts of problem solving
and decision making, which go along with learning, are
very important. The designer must be aware of when a user could
make a decision and how the system should then react.
• Motor Cognition: In the real world, people move around and
change their physical position in order to make observations and carry out actions. These motor skills are strongly
involved in the mental process of navigation and orientation. In VEs such skills often cannot be used, because they
are not technically supported.
All these issues have something in common, namely the
need for “spatial cognition” and for attention, which goes hand in
hand with “immersion & presence”. The following sections
define these two aspects in detail.
SPATIAL COGNITION
There are several definitions of spatial cognition (see [2][p.
252 - 274]), but in the following I will only mention two.
These are sufficient for what we need in VEs.
Hart et al. give the following definition:
“... spatial cognition is the knowledge and internal or cognitive representation of structure, entities, and relations of
space; in other words, the internalized reflection and reconstruction of space in thought” [2].
Landau defines it in a similar way:
“Spatial cognition is the capacity to discover, mentally
transform, and use spatial information about the world to
achieve a variety of goals, including navigating through the
world, identifying and acting on objects, talking about objects and events, and using explicit symbolic representations
such as maps and diagrams to communicate about space”
[7].
Development of Spatial Cognition
In order to use spatial cognition correctly as a designer of VEs, we must understand how humans acquire it.
The process of developing knowledge of space can be described
as follows [2]:
The development starts with the creation of a child’s first
spatial concepts and continues up to the adult’s cognitive representation of large-scale environments. There are seven progressions and polarities in this development:
1. Initially the child sees space as undifferentiated from its
own body. Then the process of self-object differentiation
starts, and the near-space around the child is differentiated
from its own body. Later, around the middle of the first
year, far-space can also be differentiated.
2. The child starts to interact with the world around it, and a
shift from passive acceptance to active construction of
space takes place.
3. The life-space of young children is bound together by personal significance and is centered on the child rather
than determined by any abstract system.
4. There is a change from egocentrism to perspectivism. The
child’s own body is no longer taken as the reference system. Instead, a coordinated abstract system is used, which allows space to be considered from many perspectives.
5. At first, the child’s representation of space consists of parts that are not interrelated.
Later, as an adult, the relations between them are known.
6. The orientation of a child must be seen as an irreversible temporal succession of movements. If the child gets lost,
for example, reorientation is very difficult, because it has to start from a new point of view every time,
and the whole cognitive mapping of the surrounding environment is rebuilt. This is why an abstract system is built up, which provides reference points that
we use for reorientation.
7. Due to the rigidity and instability of the dynamics of objects, the
person becomes flexible towards arrangements and their changes.
With these progressions alone, we are not able to define any
stages of the development process. To identify stages, Piaget
[6] introduced the following criteria:
• There is a fixed order of succession between the stages
(hierarchy).
• The acquisitions or structures of the previous stage are
integrated into the new stage instead of being replaced
by newer ones (integration).
• The achievements of previous stages are consolidated
(consolidation).
• Each stage is a coordinated whole by virtue of ties of implication, reciprocity, and reversibility. Anything done in
one direction must also be possible in the other direction (coordination).
With the help of the progressions and these criteria, Piaget [6] identified four major periods in the development of
intelligence. Each of these periods is composed of an organized totality of mutually dependent and reversible behavior
sequences (schemas).
• sensorimotor period (birth to 2): The child’s intelligence
is tied to actions and the coordination of actions. Out of
these actions, higher thoughts (e.g. ordering of elements)
are carried out in a premature way.
• preoperational period (2 to 7): The child is able to represent the
external world in terms of symbols, which are already
used mentally for intuitive and partially coordinated operations.
• concrete operational period (7 to 12): This period marks a drastic turning
point in intelligence. Forms of mental organization develop, and the child is now capable of logical
thinking. From now on it is able to differentiate and coordinate different points of view, independent of its own.
• formal operational period (12 to 15+): Concrete mental manipulation of “real” objects now takes place. A high order of reflective
abstraction (propositions) and classification is created
for objects.
An example of such an abstraction could be the following: A person sees a car driving down a road. In his mind
he builds the abstraction (“car”, “driving”, “road”). If the
car then stops, the abstraction is manipulated,
e.g. into (“car”, “stands”).
Geometries of Spatial Representation
Now that it is clear how spatial cognition develops, we
have to talk about the spatial representations in our mind and
their geometries. If we know about these geometries, we can
make use of them when designing a VE.
B. Landau [7] says that the physical world can be formally
described in terms of different geometric properties. Different objects can be characterized by transformations of an
already known object. To illustrate this, consider a simple
shape, a 2-in square (see Figure 1). Its properties include, for example, four equal sides and four 90° angles. If we translate, rotate, or reflect the
square, the lengths of its sides and their angles of intersection
remain the same. The properties of distance
and angle are thus invariant under Euclidean transformations.
We still identify the result as the same square, just turned by
some angle.
Figure 1. Two different kinds of geometrical transformation [7][p. 397]
For other geometrical transformations this does not
hold. Topological transformations (continuous deformations) do not preserve properties like distances or angles. Figure 1 shows that the results of these transformations of our
square are extremely distorted. Although we would argue
that these figures are not the same, they are topologically
equivalent.
Because the preserved properties differ from
transformation to transformation, B. Landau [7] describes a hierarchy that relates the geometries (see Table 1). Higher geometries
have all properties of the lower ones. For instance, projective geometries have all properties of topological geometries
(openness/closedness of a curve, concurrence of curves at a
point) plus new properties of their own (straightness, collinearity, etc.).
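The invariance of distance and angle under Euclidean transformations, as described for the 2-in square, can be checked numerically. The following is a minimal sketch: the rotation stands for a Euclidean transformation, while the shear is merely one simple continuous deformation chosen here for contrast and is not taken from the source. All function names are hypothetical.

```python
from math import acos, cos, degrees, dist, isclose, radians, sin

SQUARE = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]  # the 2-in square


def rotate(points, angle_deg):
    """Euclidean transformation: rotate all points around the origin."""
    a = radians(angle_deg)
    return [(x * cos(a) - y * sin(a), x * sin(a) + y * cos(a)) for x, y in points]


def shear(points, factor):
    """Non-Euclidean deformation used for contrast (an illustrative choice)."""
    return [(x + factor * y, y) for x, y in points]


def side_lengths(points):
    n = len(points)
    return [dist(points[i], points[(i + 1) % n]) for i in range(n)]


def corner_angles(points):
    """Interior angle in degrees at every corner of the polygon."""
    n = len(points)
    angles = []
    for i in range(n):
        prev_pt, corner, next_pt = points[i - 1], points[i], points[(i + 1) % n]
        ux, uy = prev_pt[0] - corner[0], prev_pt[1] - corner[1]
        vx, vy = next_pt[0] - corner[0], next_pt[1] - corner[1]
        cosine = (ux * vx + uy * vy) / (dist(prev_pt, corner) * dist(next_pt, corner))
        angles.append(degrees(acos(max(-1.0, min(1.0, cosine)))))
    return angles


def same_shape(a, b):
    """True if side lengths and corner angles agree within rounding error."""
    pairs = list(zip(side_lengths(a), side_lengths(b)))
    pairs += list(zip(corner_angles(a), corner_angles(b)))
    return all(isclose(p, q, abs_tol=1e-9) for p, q in pairs)
```

Comparing `SQUARE` with `rotate(SQUARE, 30)` keeps all side lengths and angles, whereas `shear(SQUARE, 1.0)` changes both, although the sheared figure is still a closed four-sided curve.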
Table 1. Examples of geometries and their properties [7][p. 398]
This hierarchy provides a formal, testable theory that can
be used to understand the nature of spatial representations.
These representations are used in the next subsection for navigation and orientation.
Navigation & Orientation
Now that we know how a person cognitively represents geometries, we have to look at how a user navigates and orients himself in a VE. Navigation tasks are
essential to any environment that demands movement over
large spaces. A designer of a VE is especially interested in making the navigation task as transparent and trivial
as possible for the user. [1]
To achieve this, we have to focus on multi-modal stimulation
of the senses, where vision, auditory information, and touch feedback let users perceive that they are moving in space.
Spatial presence and immersion (see section “Immersion &
Presence”) play an important role here. They are necessary for quick, robust, and effortless spatial orientation and
self-motion. [8]
A very important factor in navigation is vection. To
use vection in a positive way when creating VEs, we have to
consider the parameters that influence it. Riecke
et al. [8] describe the following parameters:
• Size of visual field of view: For virtual reality applications, larger fields of view are better suited to inducing a
compelling illusion of self-motion than smaller ones.
• Stationary foreground and moving background: A moving
stimulus in the background induces vection. If, for
example, we see a large part of a virtual scene moving, especially
if it is some distance away from us, we assume that this
is caused by our own movement in the environment.
• Spatial frequency of the moving visual pattern: In the central
field of view, the graphical scene must be rendered at high
resolution and fidelity. Stimuli in the periphery do not need
to have such high quality. This results from the fact that, on
the one hand, high spatial frequencies in the central field
of view produce the most compelling vection.
On the other hand, low frequencies in the periphery produce
a strong impression of motion there.
• Eye movements: They influence the vection illusion. If the
eyes fixate a stationary target, vection develops faster than
if the eyes follow the stimulus.
• Auditory vection: Vection can be induced not only visually;
acoustic stimuli can also induce it. The realism of the acoustic simulation and the number of sound
sources both enhance auditory vection. “Acoustic landmarks”,
i.e. sound sources that are bound to a stationary object, are better suited to creating auditory vection than artificial sounds. Combining visual stimuli with
consistent spatialized auditory cues can enhance both
vection and immersion in the simulated environment.
If the parameters mentioned are set ideally, we come close to
a good user experience. What is still missing in this discussion of navigation is the difference between landmark, route,
and survey knowledge of an environment. This knowledge
goes hand in hand with the geometries of
spatial representation mentioned before, because it is cognitively
represented in that way. Werner et al. [10] define the three
kinds of knowledge as follows.
• Landmarks are unique objects at fixed locations (e.g. visual objects, sounds, stable tactile percepts). A view from a
particular location corresponds to a specific configuration
of landmarks. Several landmarks can be used, for example, to
determine the relative position of a target by triangulation.
• Routes correspond to fixed sequences of locations, as experienced in traversing a route. Routes are acquired when
one becomes familiar with the context of the surroundings
and remembers the locations seen.
• Survey knowledge abstracts from specific sequences and
integrates knowledge from different experiences into a
single model. An example would be combining several
routes for some reason; the combination is then the
integrated knowledge.
Which of these knowledge types is used depends on the
task the user wants to perform [1].
IMMERSION & PRESENCE
The dream of nearly every VE designer is to let people step into the virtual world and forget that they are in an illusion [9]. As we have already seen, an increase in spatial presence and immersion also increases the overall convincingness and perceived realism of the simulation. Riecke
et al. [8] found a direct relation between
spatial presence and the strength of the self-motion illusion
in VEs. This is why we now focus on this
aspect.
Variables of Immersion
There are several ways to achieve immersion. One way,
for example, is to remove real-world sensations and substitute them with virtual ones.
A simple example of this is head-mounted displays,
which replace the real world with a virtual world, either
totally or to some extent. [9]
In general, we have to be able to measure immersion
before we can identify what “good immersion” is. For this
reason, Sadowski et al. [9] introduce several variables that influence presence.
• Ease of interaction: If users have problems navigating in
the VE or performing a task there, they usually feel that the
environment is not natural. The easier the interaction, the
more likely the user is to be immersed in the VE.
• User-initiated control: The greater the level of control a
user has, the higher the user's level of immersion.
• Pictorial realism: The presented perceptual stimuli must
be connected, continuous, consistent, and meaningful
(e.g. field of view, sound or head tracking).
• Length of exposure: Presence should be gained within the first 15 minutes of exposure. Motion sickness, immersion and vection are all correlated. If sickness is high, it
reduces immersion.
• Social factors: Users are more likely to believe that the VE
exists if other users are also in the VE.
• System factors: The question here is how
well the system represents the real world.
• Internal factors: According to Stanney, these are the last
factors that influence presence. Individual differences in
the cognitive process of experiencing a VE are difficult to
estimate. Visually dominant people usually report greater
levels of presence than people whose auditory system is
more dominant. It is very important to consider the type
of individuals who will use the VE and their preferred representational system.
However, the next subsection deals with a solution to the auditory problem and describes several levels for measuring sound
immersion.
Sound Immersion
The correlation between visual and auditory perception is
also mentioned by Faria et al. [3]. They introduce six levels
to measure sound immersion (see Table 2):
• Level 0: A monaural “dry” signal that
comes from only one speaker and does not represent or
reconstruct the real position of the audio source.
• Level 1: The experience of echoes and reverberation is added.
With these two characteristics the user is able to guess the
size and type of the environment he is virtually in.
• Level 2: Inherits the capabilities of the previous level. The perception of movements and the direction of sound sources can
be reconstructed.
• Level 3: (VBAP = Vector Based Amplitude Panning)
Correct positioning is possible, and users get a sense of
distances.
• Level 4: (HRTF = Head Related Transfer Function, WFS
= Wave Field Synthesis) A stable and more realistic 2D
sound field is created by a minimum of 4 speakers. Phase synchronization between channels/speakers is one of the
difficult tasks here. Failures at this level lead to unstable
images and distortion in the sound field.
• Level 5: The synthesis of stable 3D images around the user
lets him be totally enveloped in the virtual scene. Rendering of distance and localization is supposed to be as
accurate as in the real world, ideally.
Table 2. Sound immersion level scale [3][p. 3]
With this level scale, designers of VEs are able to estimate,
to some extent, the auditory immersion a user will experience.
Flow
The last point I want to focus on is the flow experience of
users in a VE. According to Pace, it can be described as follows:
“Flow is an enjoyable state of intense mental focus that is
sometimes experienced by individuals who are engaged in a
challenging activity” [5].
Clear goals and timely feedback characterize this experience. Flow is similar to immersion in the sense that it requires focused attention and leads the user to ignore irrelevant factors. A
good example of a flow experience is a computer game player who becomes so involved in a game that he loses
track of time and temporarily forgets his physical surroundings.
Interestingly, flow depends on culture, stage of
modernisation, social class, age or gender. People of different ages, for example, tend to have different flow experiences
than people of the same age. What they do to experience flow
varies enormously, but they describe how it feels in almost
identical terms. [5]
CONCLUSION
Before working on the topic of psychological aspects in
VEs, I thought there must be a universal way to improve the
realism experienced by users. Now, after reading many papers, I must say that there is still no master solution for
this. On the one hand, there are several variables that influence user experience, which must be set dynamically, depending on the system to be created, the user who will use it and the
available technology. On the other hand, it is not explicitly stated how to actually make use of them to improve user
experience.
To sum up, this is an interesting topic where much work remains to
be done and some creative ideas are needed.
REFERENCES
1. R. Darken and B. Peterson. Spatial Orientation,
Wayfinding and Representation. In Handbook of
Virtual Environment, chapter 24, pages 493–518.
Lawrence Erlbaum Associates, 2002.
2. R. Downs and D. Stea. Image & Environment:
Cognitive Mapping and Spatial Behavior, volume 2.
Aldine-Publ., 1976.
3. R. Faria, M. Zuffo, and J. Zuffo. Improving spatial
perception through sound field simulation in VR. In
VECIMS 2005 IEEE International Conference on
Virtual Environments, Human-Computer Interfaces,
and Measurement Systems, 2005.
4. A. Munro, R. Breaux, J. Patrey, and B. Sheldon.
Cognitive Aspects of Virtual Environments Design. In
Handbook of Virtual Environment, chapter 20, pages
415–434. Lawrence Erlbaum Associates, 2002.
5. S. Pace. Immersion, Flow And The Experiences Of
Game Players. Technical report, Central Queensland
University.
6. J. Piaget. The Origins of Intelligence in Children.
International Universities Press, 1952.
7. V. Ramachandran. Encyclopedia of the Human Brain,
volume 4. Elsevier Science, 2002.
8. B. Riecke and J. Schulte-Pelkum. Using the
perceptually oriented approach to optimize spatial
presence & ego-motion simulation. Technical report,
Max-Planck-Institut für biologische Kybernetik, 2006.
9. W. Sadowski and K. Stanney. Presence in Virtual
Environments. In Handbook of Virtual Environment,
chapter 40, pages 791–806. Lawrence Erlbaum
Associates, 2002.
10. S. Werner, B. Krieg-Brückner, H. Mallot, K. Schweizer,
and C. Freksa. Spatial Cognition: The Role of
Landmark, Route, and Survey Knowledge in Human
and Robot Navigation. Informatik aktuell, 1997.
11. Wikipedia. Available at
http://en.wikipedia.org/, 01.12.2008.
Indoor Tracking Techniques
Simon Urabl
surabl@edu.uni-klu.ac.at
are invisible. Five, Bluetooth positioning system is a
network of access-points which uses this technology.
ABSTRACT
This paper describes five indoor tracking techniques which
are state of the art, in order to give the reader an overview
of the different techniques which can be used. First the
different tracking systems – fiducial tracking, inertial
tracking, optical tracking, invisible marker tracking and
Bluetooth tracking system – are described. Then the
combination of inertial and vision tracking is explained,
also an extension of fiducial tracking called cluster tagging
is described. The combination of tracking techniques is
explained with projects of the real world.
Fiducial Tracking Techniques
Fiducial Tracking works with so called fiducial markers,
which are placed around a room, like written in [1]. The
fiducial markers are recognized on the pose of the camera.
Using an underlying data repository the position of the
fiducial marker is determined.
Fiducial Detection
A good and cheap fiducial marker can be a colored circle
sticker (it can easily be produced by a color injection or
laser printer). There are many variables which affects the
calibration of the detection: the printing of the markers,
camera it self and digitizer color response and lighting.
Therefore the first approach is the use of a calibrated color
region detection and segmentation. The second approach is
based on fuzzy membership functions. This approach uses a
multiscale relationship between the neighbor pixel of the
marker and its background that is how the marker can be
distinguished from its background.
Author Keywords
Tracking techniques, marker, recognition, position,
orientation, IMU, coordinate system, landmarks, data
exchange.
INTRODUCTION
This paper gives an overview on a segment of indoor
tracking techniques which are state of the art and describe
how they are used in combination with each other. First
there will be a detailed overview of five different indoor
tracking techniques. In the second part of this paper there
will be explained how the described tracking techniques
work together in the praxis.
STATE OF THE ART OF INDOOR TRACKING TECHNIQUES
In the following, five tracking techniques are described.
First, fiducial tracking is based on the recognition of
markers using a camera. Second, inertial tracking uses a
velocity and a rotation measurement device in order to
estimate the position of the object in an unprepared
environment. Third, optical tracking uses natural landmarks
on the one hand and specially created landmarks on the other
in order to recognize the position. Fourth, invisible marker
tracking systems work like a fiducial marker tracking
system; the difference is that the markers are invisible in
the visible range. Fifth, the Bluetooth indoor positioning
system localizes devices through a network of Bluetooth
access points.
To recognize the marker, expectation intervals are
determined for the range of camera-to-fiducial distances,
the size of the fiducial and the camera parameters. The
model of a fiducial is described by two transitions: from
the background to a colored fiducial and from this fiducial
to the background again. Further conditions are that there
must be a minimum distance between the fiducials and the
camera and that the background must be uniform.
Seminar aus Interaktive Systeme WS08/09, 3 December, 2008,
Klagenfurt, Austria
Figure 1: Fiducial transition model [1]
Scalable Fiducials
If the camera is positioned too close to or too far from the
marker, the fiducial is projected too large or too small for
detection, and the system will not recognize the marker
correctly. That is why single-size fiducials, as described
before, have a limited tracking range. The solution to this
problem is a multi-ring color fiducial system. These systems
use concentric rings of different sizes as fiducial markers.
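The range limitation can be illustrated with a simple pinhole projection. The following is a minimal sketch, not the method of [1]: the focal length, the detector's size window and the ring diameters are assumed values chosen only for illustration.

```python
def projected_diameter_px(diameter_m, distance_m, focal_px):
    """Apparent diameter in pixels under a pinhole camera model."""
    return focal_px * diameter_m / distance_m


def detectable_levels(ring_diameters_m, distance_m,
                      focal_px=800.0, min_px=20.0, max_px=400.0):
    """Indices of ring levels whose projection fits the detector's size window.

    focal_px, min_px and max_px are hypothetical detector parameters.
    """
    return [
        level
        for level, diameter in enumerate(ring_diameters_m)
        if min_px <= projected_diameter_px(diameter, distance_m, focal_px) <= max_px
    ]


# Assumed outer diameters (in meters) of three concentric ring levels.
RINGS = [0.05, 0.15, 0.45]

close_up = detectable_levels(RINGS, 0.25)   # camera very close to the marker
mid_range = detectable_levels(RINGS, 1.0)
far_away = detectable_levels(RINGS, 5.0)    # camera far from the marker
print(close_up, mid_range, far_away)
```

With these assumed numbers, only the innermost level is usable at 0.25 m, all three levels at 1 m, and only the two outer levels at 5 m.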
Figure 2: Concentric ring fiducials allow multiple levels and
unique size relationships for each ring [1]
As shown in Figure 2, the fiducials contain many rings, each
of a different size. The first-level fiducial is composed of
a red core and the first green ring around this core. The
rings of each next level surround the ring of the preceding
level. For better recognition of the levels it is important
to use different colors for each ring.
Extendible Tracking
As written before, a drawback of fiducial tracking is the
limited range of camera viewpoints from which the fiducials
are recognized. A simple pan or zoom of the camera can cause
the tracking to lose the marker. To prevent this problem,
extendible tracking allows the user to interactively place a
new fiducial marker in the scene. The location of the new
fiducial is calibrated from the initial fiducial. In order
to obtain an accurate and confident calibration of the new
fiducial, the system uses recursive filters that estimate
the position of the fiducial. After this, the system can use
the new fiducial in the pose calculations in the same way as
the other fiducial markers.
Inertial Tracking Techniques
In [2], inertial tracking systems are described as
completely self-contained: they sense physical phenomena
created by linear acceleration and angular motion and can be
used in an unprepared environment. Newton’s laws are used to
determine the orientation and position of the inertial
sensor. The inertial sensor contains two devices for
determining position and orientation, the accelerometer and
the gyroscope. The accelerometer measures the acceleration
vectors with respect to the inertial reference (the starting
point). The gyroscope measures the changes in orientation of
the inertial sensor. By combining these two devices, the
location of the inertial sensor can be derived from the
orientation of the acceleration vector.
In [3], the device which supports this tracking system is
called tracker or IMU (Inertial Measurement Unit). An IMU
incorporates three orthogonal gyroscopes to capture the
rotation of the device. The device can also have a sensor
for the gravity vector and a compass. The compass is meant
to compensate for gyro drift.
The IMU can be combined with a camera and calibrated to it.
The lens distortion and focal length should also be
calibrated, so that a zoom action by the user can be
included in the tracking calculation.
Coordinate Systems
In order to work with a sensor unit containing a camera and
an IMU, several coordinate systems have to be introduced.
Three coordinate systems help to calculate the position of
the camera.
First, the earth coordinate system determines the pose of the
camera with respect to the floor (earth). The features of the
scene are modeled in this coordinate system. Second, the
camera coordinate system is attached to the moving camera.
The origin of this coordinate system is the center of the
camera. Third, the body coordinate system is the coordinate
system of the IMU. Although the camera and the IMU are
attached to each other and contained within a single
package, their coordinate systems do not coincide. The two
coordinate systems are related by a constant translation and
rotation.
With these coordinate systems a variety of geometric
quantities can be denoted.
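How the three coordinate systems relate can be sketched with rigid transformations. The example below is illustrative only: the poses are assumed values, rotations are restricted to the vertical axis for brevity, and the helper names are hypothetical.

```python
import math


def rot_z(angle_rad):
    """3x3 rotation matrix for a rotation about the z axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def to_parent(rotation, translation, point):
    """Express a point given in a child frame in its parent frame: R * p + t."""
    return [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]


# Assumed pose of the camera in the earth frame (changes while the user moves).
R_EC, T_EC = rot_z(math.pi / 2), [2.0, 0.0, 1.5]
# Assumed pose of the IMU (body) in the camera frame (constant, from calibration).
R_CB, T_CB = rot_z(0.0), [0.0, 0.05, 0.0]

imu_origin_body = [0.0, 0.0, 0.0]
imu_origin_camera = to_parent(R_CB, T_CB, imu_origin_body)
imu_origin_earth = to_parent(R_EC, T_EC, imu_origin_camera)
print([round(v, 3) for v in imu_origin_earth])
```

Only the camera-to-earth pose has to be re-estimated while tracking; the body-to-camera pose stays fixed because the two devices are rigidly connected.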
Inertial sensors
In [4] the IMU is composed of three perpendicularly mounted
angular velocity sensors (gyroscopes) and two
accelerometers. All these sensors are sampled synchronously,
and a temperature sensor is included to compensate for the
temperature dependency of each sensor component of the IMU.
Every component of the IMU has to be calibrated with respect
to the physical alignment, the gains and offsets, and the
temperature dependence of the gains and offsets. The 3D
angular velocity vector and the 3D acceleration vector are
computed by an on-board processor in the body coordinate
system.
The calibrated gyroscope signals contain measurements of the
rotation velocity from the body to the earth. These
measurements are expressed in the body coordinate system.
Although the temperature sensor corrects many possible
drifts, some low-frequency offset fluctuations remain. The
measurements are not accurate enough to pick up the rotation
of the earth. The solution to this error can be provided by
the earth coordinate system.
The calibrated accelerometer signals contain measurements of
the acceleration vector and the gravity vector. Like the
gyroscope, the accelerometer also suffers from low-frequency
offset errors. This problem can also be addressed by using
coordinates from the earth coordinate system. Gravity is a
constant vector in the earth coordinate system; the problem
is that, in the body coordinate system, the measured gravity
depends on the orientation of the sensor unit. In other
words, once the orientation is known, the accelerometer
signal can be used to estimate the acceleration. Conversely,
when the acceleration is known, the direction of movement
can be estimated.
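The dependency on orientation can be made concrete with a short sketch. It assumes an earth frame with the z axis pointing up, an accelerometer that measures specific force (acceleration minus gravity), and a known body-to-earth rotation; these conventions and the function names are assumptions for illustration, not taken from [4].

```python
import math

GRAVITY_EARTH = [0.0, 0.0, -9.81]  # assumed: z axis up, units m/s^2


def rot_x(angle_rad):
    """3x3 rotation matrix for a rotation about the x axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def mat_vec(matrix, vector):
    """Multiply a 3x3 matrix with a 3-vector."""
    return [sum(matrix[i][j] * vector[j] for j in range(3)) for i in range(3)]


def acceleration_in_earth(r_earth_body, specific_force_body):
    """Rotate the accelerometer reading into the earth frame and add gravity back."""
    force_earth = mat_vec(r_earth_body, specific_force_body)
    return [force_earth[i] + GRAVITY_EARTH[i] for i in range(3)]


# A sensor at rest, rolled by 90 degrees: gravity shows up on the body y axis.
reading_at_rest = [0.0, 9.81, 0.0]
correct = acceleration_in_earth(rot_x(math.pi / 2), reading_at_rest)
wrong = acceleration_in_earth(rot_x(0.0), reading_at_rest)  # orientation ignored
print([round(v, 3) for v in correct], [round(v, 3) for v in wrong])
```

With the correct orientation the estimated acceleration of the resting sensor is zero; with a wrong orientation the gravity component is misread as motion.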
Optical Tracking Techniques
Optical tracking can be divided into two main categories.
The first category is vision-based: it uses 2D imaging
technologies and is image-forming (especially if the images
used for tracking are also used to provide the user with a
real-world image). The images are used as landmarks and can
be recognized with sensor technologies. The second category
employs different types of landmarks for tracking. The logic
is the same: the landmarks are recognized by the sensor, but
they are not images; instead they are colored shapes,
reflective markers or even natural features like edges and
corners.
There are two kinds of landmarks: passive and active. The
tracking system discussed in [5] uses physical landmarks
which are placed in the scene. The thesis explains how
reflected light can be used as a passive and an active
landmark. The physical landmarks are passive because they
merely reflect the light. The light itself can be seen as an
active landmark, because it moves and flashes under the
control of the system host.
Optical triangulation
A frequently used algorithm in optical tracking is optical
triangulation. The basic idea of this algorithm is that the
user has two different views of the same scene and can
consequently establish correspondences between points that
are visible in both images. In this way a variety of
parameters can be determined to locate the position of the
user.
Figure 3: The optical triangulation algorithm [5]
L1, L2 and L3 are landmarks spread over a known area. That
means that their locations and the distances between them
are known. The tracking system also contains a camera
coordinate system in which the view vectors (V1, V2 and V3)
are known from calibration. The transformation (the movement
of the camera from one point to another) is also known to
the system. The next step is to analyze and compute the data
from the world coordinate system and from the camera
coordinate system and to transfer this data to the tracker
coordinate system. In this step the position of the user can
be determined.
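As a simplified illustration of the triangulation idea, the following sketch intersects two lines of sight in a 2D plane. It is not the algorithm of [5]; the camera positions, viewing directions and the function name are assumed for the example.

```python
def triangulate_2d(origin_a, direction_a, origin_b, direction_b):
    """Intersect two 2D lines of sight; returns (x, y) or None if they are parallel."""
    bx = origin_b[0] - origin_a[0]
    by = origin_b[1] - origin_a[1]
    # Solve origin_a + s * direction_a == origin_b + t * direction_b for s.
    det = direction_b[0] * direction_a[1] - direction_a[0] * direction_b[1]
    if abs(det) < 1e-12:
        return None
    s = (direction_b[0] * by - bx * direction_b[1]) / det
    return (origin_a[0] + s * direction_a[0], origin_a[1] + s * direction_a[1])


# Two assumed viewpoints 4 m apart that both observe the same point.
point = triangulate_2d((0.0, 0.0), (1.0, 1.0), (4.0, 0.0), (-1.0, 1.0))
parallel = triangulate_2d((0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 0.0))
print(point, parallel)
```

The same principle extends to 3D, where measurement noise usually means the lines of sight do not meet exactly and a closest-point estimate is used instead.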
Outside-looking-in vs. Inside-looking-out
In [6], two different configurations of optical tracking
techniques are described. In the inside-looking-out
configuration, the optical sensors are attached to the
moving user and the landmarks are spread around the room.
Figure 4: Inside-looking-out configuration [6]
In the outside-looking-in configuration, the landmarks are
placed on the moving user and the optical sensors are placed
around the room.
Figure 5: Outside-looking-in configuration [6]
Using the inside-looking-out configuration is not
recommended for small or medium working volumes. Mounting
the sensors on the user is more challenging than mounting
them in the environment. The communication from the sensor
packaging to the rest of the system is also more complex. In
contrast, the outside-looking-in configuration involves
fewer mechanical considerations because the sensors are
mounted in the environment. In addition, the landmarks are
simple, cheap and small and can be located in many places on
the user. The communication from the user to the rest of the
system is relatively simple or even unnecessary.
Invisible Marker Tracking System
Many tracking technologies have been used in AR
applications; the most accurate results come from
vision-based methods, especially those based on fiducial
markers. Using markers increases robustness and reduces
computational requirements. The problem with marker-based
tracking is that the markers require maintenance. Other
tracking techniques do not use markers but natural features
or geometry for tracking. These techniques are unreliable
compared to the marker-based methods, and calibration is
needed in order to use them. Invisible marker tracking aims
to combine the advantages of both marker-based and
marker-less tracking techniques. It uses fiducial markers,
which allow accurate tracking, and these markers are
invisible, so they are not intrusive in the visible range.
In [7], an invisible marker tracking system is presented
which is based on fiducial marker tracking. It uses a
special marker which is invisible in the visible range. The
markers are drawn with an IR fluorescent pen and can be
tracked by the system in the infrared range. The tracking
system also includes two cameras and one half mirror. One of
the cameras is a scene camera, which captures the real
scene, and the other is an IR camera, which recognizes the
invisible markers. The two cameras are positioned on either
side of the half mirror so that their optical centers
coincide.
Bluetooth Indoor Positioning System (BIPS)
Using Bluetooth technology for tracking purposes is one of
the most promising and cost-effective choices, as described
in [8]. The key features of this technology are its
robustness, low complexity, low power consumption and short
range. The core of this localization system is a network of
access points using Bluetooth technology. The operations of
these access points are coordinated by a central server
machine, which can calculate the position of a device inside
a room.
Bluetooth Basics
A small cluster of devices that share a common physical
channel is called a piconet, which constitutes the building
block of a Bluetooth network. One of these devices is set as
the master; the other ones assume the role of slaves. The
slaves derive the channel-hopping sequence as a function of
the master’s clock and address. In order to establish a
connection between the devices, the role of each Bluetooth
device must be set. A connection takes place between the
master device and a slave device, never between slave
devices. According to the Bluetooth specification, two
phases have to be passed through to establish a connection.
The initial phase is called inquiry; here the inquirer
discovers the identity of possible slaves. The second and
last phase is the page phase.
When a user enters a new room, the handheld device is
discovered by other access points. Users can therefore be
localized by tracking the links that disappear and the new
links that are established. To ensure the localization of
the user, two requirements must be met. First, the range
limit of the devices must be identified precisely. Second,
the link establishment process requires that each device
becomes aware of other unknown devices which enter its
coverage area. This functionality is provided by the
connection establishment procedure described in the page and
connection phase.
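The idea of localizing users by watching links appear and disappear can be sketched as follows. The class, its method names and the mapping from access points to rooms are hypothetical and only mirror the description above; they are not the BIPS implementation from [8].

```python
class LinkTracker:
    """Toy server-side bookkeeping: which access points currently see a device."""

    def __init__(self, room_of_access_point):
        self.room_of_access_point = dict(room_of_access_point)
        self.links = {}  # device id -> set of access point ids

    def link_established(self, access_point, device):
        self.links.setdefault(device, set()).add(access_point)

    def link_lost(self, access_point, device):
        self.links.get(device, set()).discard(access_point)

    def rooms_of(self, device):
        """Rooms whose access points currently hold a link to the device."""
        access_points = self.links.get(device, set())
        return sorted({self.room_of_access_point[ap] for ap in access_points})


tracker = LinkTracker({"ap1": "room A", "ap2": "room B"})
tracker.link_established("ap1", "pda-7")
before = tracker.rooms_of("pda-7")
tracker.link_established("ap2", "pda-7")   # user walks into room B
tracker.link_lost("ap1", "pda-7")          # and leaves the coverage of room A
after = tracker.rooms_of("pda-7")
print(before, after)
```

A device that is briefly linked to two access points would be reported in both rooms; a real system would need a rule for resolving such overlaps.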
In the page phase, the pager informs the other devices (the
paged units) of its identity and imposes its clock as the
piconet clock. This phase corresponds to an initial
connection setup.
Inquiry phase
The first step in the inquiry phase is that the master enters
an inquiry state. The master device sends broadcast messages
using a pool of 32 frequencies (the inquiry hopping
sequence). These broadcast messages consist of two ID
packets which are sent on two different frequencies and are
repeated at least 256 times before a new frequency is used.
The slaves that want to be discovered listen for the
messages on the same frequencies as the master and switch to
an inquiry scan state while listening. The slaves change
their listening frequency every 1.28 s and keep listening on
the same frequency for as long as is necessary to receive
the complete ID packets.
INDOOR TRACKING TECHNIQUES WORKING TOGETHER IN REAL WORLD
In this part of the paper the interaction of some of the
described tracking techniques is explained. First, the use
of vision tracking together with inertial tracking is
explained. Vision tracking is used to correct the results of
inertial tracking. Using these two tracking techniques
together corrects the drift of inertial tracking and thus
gives a better result. Second, cluster tagging is described.
This kind of tracking system is an extension of fiducial
tracking and is used to increase resilience to obscuration.
Page and connection phase
The inquiry messages of the master (inquirer) do not carry
any information about the sender. Therefore another phase is
needed to establish a connection, the page phase. In this
phase the master tries to capture the page message from the
slave, which contains synchronization data. To do so, the
master again uses a pool of 32 frequencies, as in the
inquiry phase, but this time belonging to the page hopping
sequence. At the end of the page phase, the master enters
the connection phase and sends a connection request to the
slave. If the slave agrees and acknowledges the request, the
connection is established. At this point, the devices can
begin to exchange packets.
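The sequence of phases described above can be summarized as a small state machine. This is a deliberately simplified sketch of the master's side: the state and event names are illustrative assumptions and do not reproduce the states defined in the Bluetooth specification.

```python
from enum import Enum, auto


class MasterState(Enum):
    STANDBY = auto()
    INQUIRY = auto()
    PAGE = auto()
    CONNECTION_REQUEST = auto()
    CONNECTED = auto()


# Hypothetical events; each entry maps (current state, event) to the next state.
TRANSITIONS = {
    (MasterState.STANDBY, "start_inquiry"): MasterState.INQUIRY,
    (MasterState.INQUIRY, "slave_discovered"): MasterState.PAGE,
    (MasterState.PAGE, "page_message_received"): MasterState.CONNECTION_REQUEST,
    (MasterState.CONNECTION_REQUEST, "request_acknowledged"): MasterState.CONNECTED,
    (MasterState.CONNECTION_REQUEST, "request_rejected"): MasterState.STANDBY,
}


def run(events, state=MasterState.STANDBY):
    """Apply events in order; events that do not fit the current state are ignored."""
    for event in events:
        state = TRANSITIONS.get((state, event), state)
    return state


accepted = run(["start_inquiry", "slave_discovered",
                "page_message_received", "request_acknowledged"])
rejected = run(["start_inquiry", "slave_discovered",
                "page_message_received", "request_rejected"])
print(accepted.name, rejected.name)
```

The table makes explicit that a connection can only be reached after both the inquiry and the page phase have completed.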
Inertial and vision tracking
In [2] a prototype is presented that uses inertial
orientation data and vision features to stabilize
performance and correct inertial drift. The fusion of both
tracking techniques is treated as an image stabilization
problem. Basically, a 2D image is built from the inertial
tracking data. This 2D image is an approximation of the real
world, which vision tracking features then correct and
refine. The inertial data is also used to reduce the search
space for the vision data.
BIPS System design principles
BIPS is an indoor tracking system that focuses on localizing
mobile users inside a corporate building. As mentioned
before, the system contains many Bluetooth devices (access
points). These devices interact with a handheld device
carried by the user. They assume the master role in order to
discover and enroll the users in their coverage area. The
BIPS server is a central machine which contains the
intelligence of the tracking system. It coordinates the
masters to localize the users and can thus track their
movements.
Camera model and coordinates
The prototype uses a CCD video camera and a rigidly mounted
3DOF inertial sensor. The configuration of this system uses
four principal coordinate systems. In [2] the coordinate
systems are named differently than above: world coordinate
system (earth coordinate system), camera-centred coordinate
system and inertial-centred coordinate system (body
coordinate system). The fourth coordinate system, the 2D
image coordinate system, is added in order to provide the
corrections of vision tracking.
The access points mainly execute two tasks. The first is to
discover a user who enters the coverage area of their
Bluetooth interface. The second is the data transfer from
the users that are associated with their piconet. Updating
the user’s position should not affect the data throughput.
This is a goal which should be guaranteed by BIPS.
Static and dynamic registration
The static calibration is also called static registration.
This calibration focuses on the transformation between the
inertial frame and the camera frame. In other words, the
inertial data is related to the camera motion, so that the
motion of image features can be used.
Device discovery
The localization and the tracking are implemented in the
BIPS server using the coordinates which are sent from the
master devices. When a user leaves a room and is no longer
in the coverage area of a master, some links to the devices
disappear. New links appear automatically when the user
enters a new room.
The inertial tracker accumulates drift and errors over time.
Correcting these errors analytically would be difficult, so
the better solution is a dynamic registration. The strategy
in dynamic registration is to minimize the tracking error in
the image plane, relative to the visually perceived image.
The goal is to automatically track projections in the image
while the camera is moving. For this purpose, a tracking
prediction from the inertial data is followed by a tracking
correction with vision. The positions of the points
projected in the image can be estimated during the rotation
of the camera, so the inertial data predicts the motion of
image features. In order to correct the drift, these
predicted image positions are refined by a local search.
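The predict-then-correct strategy can be sketched in one dimension. The example assumes a pure pan of the camera, a pinhole model, and a sum-of-squared-differences match in a small window around the predicted position; all numbers and function names are illustrative assumptions and not the implementation of [2].

```python
import math


def predict_column(column, pan_rad, focal_px, center_px):
    """Predict a feature's image column after the camera pans by pan_rad.

    Assumed sign convention: a positive pan moves features toward lower columns.
    """
    bearing = math.atan2(column - center_px, focal_px)
    return round(center_px + focal_px * math.tan(bearing - pan_rad))


def local_search(row, template, predicted, radius):
    """Refine the prediction: best sum-of-squared-differences match nearby."""
    first = max(0, predicted - radius)
    last = min(len(row) - len(template), predicted + radius)
    best_column, best_cost = None, float("inf")
    for column in range(first, last + 1):
        cost = sum((row[column + i] - value) ** 2 for i, value in enumerate(template))
        if cost < best_cost:
            best_column, best_cost = column, cost
    return best_column


template = [10, 50, 10]
row = [0] * 240
row[212:215] = template            # where the feature really is in the new frame

# The feature was at column 220; the (slightly drifting) gyro reports a 0.02 rad pan.
predicted = predict_column(220, pan_rad=0.02, focal_px=500.0, center_px=120)
corrected = local_search(row, template, predicted, radius=5)
print(predicted, corrected)
```

The inertial prediction lands within a few pixels of the true position, so the vision step only has to search a small window instead of the whole image row.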
The local search uses three motion analysis functions:
feature selection, tracking and verification. First, the 0D
and 2D features are selected and checked for reliability and
suitability. This selection and evaluation process also uses
data from the last tracking estimations. The features are
then ranked according to their evaluation and handed over to
the tracking. Second, the tracking method is a
differential-based local optical-flow calculation. It uses
normal-motion information in local neighbourhoods to perform
a least-squares minimization to find the best fit to the
motion vectors. Verification and evaluation metrics are used
to check the confidence of every estimated result. If the
estimation confidence is poor, the result is refined
iteratively until the estimation error converges. Third, the
verification is composed of a motion verification strategy
and a feedback strategy. Both strategies basically depend on
the estimated motion field to generate an evaluation frame
that measures the estimation residual. The estimation error
is determined by the difference between the evaluation frame
and the true target frame. The error information is sent
back to the tracking module, which corrects the motion and
passes it on again for re-evaluation.
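The least-squares step can be illustrated with the classic normal equations of the brightness-constancy constraint, in the spirit of the Lucas-Kanade method. The text does not state which formulation [2] uses, so this is an assumed stand-in; the gradient values are synthetic.

```python
def least_squares_flow(gradients):
    """Estimate one motion vector (u, v) for a local neighbourhood.

    gradients: iterable of (Ix, Iy, It) tuples, one per pixel.
    Minimizes the sum of (Ix*u + Iy*v + It)**2; returns None if ill-conditioned.
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for ix, iy, it in gradients:
        sxx += ix * ix
        sxy += ix * iy
        syy += iy * iy
        sxt += ix * it
        syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        return None
    u = (sxy * syt - syy * sxt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


def residual(gradients, flow):
    """Sum of squared constraint errors, usable as a simple confidence check."""
    u, v = flow
    return sum((ix * u + iy * v + it) ** 2 for ix, iy, it in gradients)


# Synthetic neighbourhood that is consistent with a true motion of (1.5, -0.5).
samples = [(1.0, 0.0, -1.5), (0.0, 1.0, 0.5), (1.0, 1.0, -1.0), (2.0, -1.0, -3.5)]
flow = least_squares_flow(samples)
print(flow, residual(samples, flow))
```

A large residual would correspond to the poor estimation confidence mentioned above and could trigger another refinement iteration.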
Cluster tagging
There are many different location (tracking) systems, which
normally comprise three main stages: determining the
identity of an object, measuring a quantity related to the
distance to one or more sensors, and computing a location.
These systems associate the object with a tag. Tags can be
divided into two types depending on the use of a local power
source: active tags (with local power, e.g. Bluetooth) and
passive tags (without local power, e.g. fiducial markers).
The main problem of passive tags is that tracking objects in
moving images is notoriously complex. The use of fiducials
simplifies this problem and gives a reliable solution for
tracking. The advantages of fiducial tracking are manifold:
it does not need sensors spread around the tracking
environment, the markers are easily made with commodity
items available in every office, and if the cameras are
calibrated correctly it provides accurate orientation
estimates.
In [9] an extension of fiducial tracking is presented which
is not contained in existing fiducial tracking systems. It
is called cluster tagging and is based on the use of
multiple tags: tags are used for both communication and
location information, multiple tags are used to increase
resilience to obscuration, tags are arranged in a known
spatial configuration, data is encoded redundantly across
the tags, and tags are not uniquely indexed. This extension
differs from normal fiducial systems in three points. First,
fiducial tags are normally used only for location. Second,
data is not encoded redundantly in a tag. Third, data is
normally indexed.
Motivation of cluster tagging
In current fiducial tracking systems the required distance
from tag to camera is relatively small (one to two meters).
If the distance is larger than these two meters, the
tracking does not work accurately. In order to address this
problem, the markers (tags) have to be bigger, which in turn
increases the likelihood of obscuration. Cluster tagging
addresses this problem and others:
1. Smaller tag size: If the system uses purely index-based
tags, the payload increases, because the number of unique
tags determines the payload size. When tags are not
explicitly indexed, a smaller payload can be chosen, which
permits a greater feature size. Cluster tagging permits a
smaller payload size per tag by distributing the data
across multiple tags.
2. Redundancy: The obscuration problem can be solved by
distributing data redundantly across the tags. If the
system has more information, the obscured parts in the
image can be calculated and determined.
Figure 6: Clustering advantages. (a) A singly-tagged object
has visible bits, but the lack of a full shape border (a
square) prevents them being read. (b) Clustering small tags
allows some data to be read and potentially 20 corner
correspondences to be identified for more accurate pose and
position determination. [9]
3. Geometric arrangements: The markers can be built such
that some extra information about their neighbours is
included. Geometric arrangements and patterns can thus
contain information about where other tags should be. This
also assists the image processing algorithm.
4. Better fitting: An irregular object can be covered better
by multiple tags than by only one.
Figure 7: Cluster tagging allows for irregular shapes. (a) Tag
limited to 25 bits. (b) Clustering allows at least 45 bits. [9]
5. Robust pose estimation: Every marker provides its own
estimate of the location and orientation of an object. If
there is more than one marker, the information provided by
the multiple tags can be used to improve the estimate for
the object.
6. More data: In order to remove dependencies on a local
database, data about the object the tag is attached to
(composition, owner, etc.) could be encoded. Multiple tags
can be used to convey more information overall.
CONCLUSION
There are many different tracking techniques which can be
used for indoor environments. This paper addresses only some
of them. In my opinion, tracking systems which use markers
to locate objects are the cheapest and best option for
indoor tracking. They permit an easy computation of the
tracking and also a fast production of the markers.
Every tracking system has its advantages and disadvantages.
In order to obtain exact tracking results, different
tracking systems have to be combined or made to interact, so
that the disadvantages of one tracking system are covered by
the advantages of the other.
REFERENCES
1. Neumann, U., You, S., Cho, Y., Lee, J., Park, J.
Augmented Reality Tracking in Natural Environments.
Computer Science Department, Integrated Media System
Center, University of Southern California.
2. Neumann, U., You, S., Azuma, R. Hybrid Inertial and
Vision Tracking for Augmented Reality Registration.
Integrated Media System Center, University of Southern
California.
3. Chandaria, J., Thomas, G., Bartczak, B., Koeser, K.,
Koch, R., Becker, M., Bleser, G., Stricker, D.,
Wohlleber, C., Felsberg, M., Gustafsson, F., Hol, J.,
Schön, T.B., Skoglund, J., Slycke, P.J., Smeitz, S.
Real-Time camera tracking in the MATRIS project. BBC
Research, UK, Christian-Albrechts-University Kiel,
Germany, Fraunhofer IGD, Germany, Linköping University,
Sweden, Xsens, Netherlands.
4. Hol, J., Schön, T., Luinge, H., Slycke, P., Gustafsson, F.
Robust real-time tracking by fusing measurements from
inertial and vision sensors. Journal of Real-Time Image
Processing manuscript.
5. Livingston, M.A. Vision-based Tracking with Dynamic
Structured Light for Video See-through Augmented
Reality. Dissertation, The University of North Carolina,
Chapel Hill.
6. Welch, G., Bishop, G., Vicci, L., Brumback, S., Keller,
K., Colucci, D. High-Performance Wide-Area Optical
Tracking. Alternate Realities Corporation.
7. Park, H., Park, J. Invisible Marker Tracking for AR.
Department of ECE, Hanyang University, Seoul, Korea
(2004).
8. Bruno, R., Delmastro, F. Design and Analysis of a
Bluetooth-based Indoor Localization System. IIT
institute CNR, Pisa, Italy.
9. Harle, R., Hopper, A. Cluster Tagging: Robust Fiducial
Tracking for Smart Environments. Computer
Laboratory, University of Cambridge, Cambridge, UK.