Volume P-201(2012) - Mathematical Journals

GI-Edition
Gesellschaft für Informatik e.V. (GI)
publishes this series in order to make available to a broad public
recent findings in informatics (i.e. computer science and information systems), to document conferences that are organized in cooperation with GI and to publish the annual GI Award dissertation.
The volumes are published in German or English.
Information: http://www.gi.de/service/publikationen/lni/
Elmar J. Sinz, Andy Schürr (Eds.): Modellierung 2012
Broken down into
• seminars
• proceedings
• dissertations
• thematics
current topics are dealt with from the vantage point of research and
development, teaching and further training in theory and practice.
The Editorial Committee uses an intensive review process in order
to ensure high quality contributions.
Lecture Notes
in Informatics
Modellierung 2012
March 14–16, 2012
Bamberg
ISSN 1617-5468
ISBN 978-3-88579-295-6
The “Modellierung” conference series has reported since 1998 on the broad range of
modeling from a variety of perspectives. This year, up-to-date research results cover topics ranging from Foundations of Modeling, Visualization of Models, Model-Based
Development, Search, Reuse and Knowledge Acquisition to Domain-Specific Applications.
Gesellschaft für Informatik e.V. (GI)
Lecture Notes in Informatics (LNI) - Proceedings
Series of the Gesellschaft für Informatik (GI)
Volume P-201
ISBN 978-3-88579-295-6
ISSN 1617-5468
Volume Editors
Prof. Dr. Elmar J. Sinz
Lehrstuhl für Wirtschaftsinformatik, Universität Bamberg
96052 Bamberg, Germany
Email: elmar.sinz@uni-bamberg.de
Prof. Dr. Andy Schürr
Institut für Datentechnik, Technische Universität Darmstadt
64283 Darmstadt, Germany
Email: andy.schuerr@es.tu-darmstadt.de
Series Editorial Board
Heinrich C. Mayr, Alpen-Adria-Universität Klagenfurt, Austria
(Chairman, mayr@ifit.uni-klu.ac.at)
Dieter Fellner, Technische Universität Darmstadt, Germany
Ulrich Flegel, Hochschule für Technik, Stuttgart
Ulrich Frank, Universität Duisburg-Essen, Germany
Johann-Christoph Freytag, Humboldt-Universität zu Berlin, Germany
Michael Goedicke, Universität Duisburg-Essen, Germany
Ralf Hofestädt, Universität Bielefeld, Germany
Michael Koch, Universität der Bundeswehr München, Germany
Axel Lehmann, Universität der Bundeswehr München, Germany
Ernst W. Mayr, Technische Universität München, Germany
Sigrid Schubert, Universität Siegen, Germany
Ingo Timm, Universität Trier
Karin Vosseberg, Hochschule Bremerhaven, Germany
Maria Wimmer, Universität Koblenz-Landau, Germany
Dissertations
Steffen Hölldobler, Technische Universität Dresden, Germany
Seminars
Reinhard Wilhelm, Universität des Saarlandes, Germany
Thematics
Andreas Oberweis, Karlsruher Institut für Technologie (KIT), Germany
© Gesellschaft für Informatik, Bonn 2012
printed by Köllen Druck+Verlag GmbH, Bonn
Preface
Models are among the most important tools for mastering complex systems. The topics surrounding the development, use, communication and processing of models are as diverse as informatics itself with all its specializations.
The “Modellierung” conference has been held since 1998 by the Querschnittsfachausschuss Modellierung (the cross-sectional committee on modeling) of the Gesellschaft für Informatik e.V. and has established itself as the relevant forum for modeling. It brings together participants from all areas of informatics as well as from research and practice. The conference is traditionally characterized by lively, cross-disciplinary discussions and committed feedback, which makes it particularly attractive for young researchers.
This volume contains the 17 contributions to the scientific main program of Modellierung 2012, which were selected from a total of 45 submissions on the basis of three reviews each. This corresponds to an acceptance rate of about 37.8%.
The topics of the scientific contributions span a wide field, ranging from foundations of modeling and visualization of models to model-based development, search and reuse of models, knowledge acquisition from models, and domain-specific modeling. The program is complemented by a doctoral symposium as well as workshops, tutorials and a practice forum. All in all, it is a well-rounded program covering all aspects of modeling.
We thank all speakers for entrusting us with their contributions, and the program committee and those responsible for the doctoral symposium, workshops, tutorials and practice forum for the quality assurance and the selection of the final program. Special thanks go to Dipl.-Wirtsch.Inf. Domenik Bork and Dipl.-Wirtsch.Inf. Matthias Wolf for their tireless commitment to the success of the conference. Last but not least, our sincere thanks go to Ms. Regina Henninges for her role as the organizational hub of the conference.
Bamberg and Darmstadt, February 2012
Elmar J. Sinz, Andy Schürr
Sponsors
We thank the following companies for supporting Modellierung 2012:
BOC Information Technologies Consulting AG
MID GmbH
Senacor Technologies AG
Querschnittsfachausschuss Modellierung (Cross-Sectional Committee on Modeling)
“Modellierung” is a working conference of the Querschnittsfachausschuss Modellierung, in which the following GI divisions are currently represented:
• EMISA, Entwicklungsmethoden für Informationssysteme und deren Anwendung
• FoMSESS, Formale Methoden und Modellierung für Sichere Systeme
• ILLS, Intelligente Lehr- und Lernsysteme
• MMB, Messung, Modellierung und Bewertung von Rechensystemen
• OOSE, Objektorientierte Software-Entwicklung
• PN, Petrinetze
• RE, Requirements Engineering
• ST, Softwaretechnik
• SWA, Softwarearchitektur
• WI-MobIS, Informationssystem-Architektur: Modellierung betrieblicher Informationssysteme
• WI-VM, Vorgehensmodelle für die Betriebliche Anwendungsentwicklung
• WM/KI, Wissensmanagement
Organizers
Program committee chairs:
Workshops:
Practice forum:
Doctoral symposium:
Tutorials:
Elmar J. Sinz, Universität Bamberg
Andy Schürr, Technische Universität Darmstadt
Matthias Riebisch, Technische Universität Ilmenau
Peter Tabeling, Intervista AG
Ulrich Frank, Universität Duisburg-Essen
Gabriele Taentzer, Universität Marburg
Friederike Nickl, Swiss Life Deutschland
Jan Jürjens, TU Dortmund und Fraunhofer ISST
Program Committee
Colin Atkinson, Universität Mannheim
Thomas Baar, akquinet tech@spree GmbH, Berlin
Brigitte Bartsch-Spörl, BSR GmbH, München
Ruth Breu, Universität Innsbruck
Jörg Desel, FernUniversität Hagen
Jürgen Ebert, Universität Koblenz-Landau
Gregor Engels, Universität Paderborn
Ulrich Frank, Universität Duisburg-Essen
Holger Giese, Hasso-Plattner-Institut
Martin Glinz, Universität Zürich, CH
Martin Gogolla, Universität Bremen
Ursula Goltz, TU Braunschweig
Holger Herrmanns, Universität des Saarlandes
Wolfgang Hesse, Universität Marburg
Martin Hofmann, LMU München
Frank Houdek, Daimler AG
Heinrich Hussmann, LMU München
Matthias Jarke, RWTH Aachen
Jan Jürjens, TU Dortmund and Fraunhofer ISST
Gerti Kappel, TU Wien
Dimitris Karagiannis, Universität Wien
Roland Kaschek, Düsseldorf
Ralf Kneuper, Darmstadt
Christian Kop, Alpen-Adria-Universität Klagenfurt
Thomas Kühne, Victoria University of Wellington
Jochen Küster, IBM Research, Zürich, CH
Horst Lichter, RWTH Aachen
Peter Liggesmeyer, TU Kaiserslautern
Florian Matthes, TU München
Heinrich C. Mayr, Alpen-Adria-Universität Klagenfurt
Mark Minas, Universität der Bundeswehr München
Günther Müller-Luschnat, Pharmatechnik GmbH
Program Committee (continued)
Friederike Nickl, Swiss Life Deutschland
Markus Nüttgens, Universität Hamburg
Andreas Oberweis, Universität Karlsruhe
Erich Ortner, Technische Universität Darmstadt
Barbara Paech, Universität Heidelberg
Thorsten Pawletta, Hochschule Wismar
Jan Philipps, Validas AG, München
Klaus Pohl, Universität Duisburg-Essen
Alexander Pretschner, Universität Karlsruhe
Ulrich Reimer, FH St. Gallen
Wolfgang Reisig, Humboldt-Universität zu Berlin
Ralf Reussner, KIT Karlsruhe
Matthias Riebisch, Technische Universität Ilmenau
Bernhard Rumpe, RWTH Aachen
Bernhard Schaetz, Technische Universität München
Peter Schmitt, Universität Karlsruhe
Andy Schürr, Technische Universität Darmstadt
Elmar J. Sinz, Universität Bamberg
Friedrich Steimann, Fernuniversität Hagen
Susanne Strahringer, TU Dresden
Peter Tabeling, Intervista AG
Gabriele Taentzer, Universität Marburg
Klaus Turowski, Universität Magdeburg
Michael von der Beeck, BMW AG
Gerd Wagner, BTU Cottbus
Mathias Weske, HPI an der Universität Potsdam
Andreas Winter, Carl von Ossietzky Universität Oldenburg
Mario Winter, Fachhochschule Köln
Heinz Züllighoven, Universität Hamburg
Albert Zündorf, Universität Kassel
Organizing Team
Domenik Bork, Universität Bamberg
Matthias Wolf, Universität Bamberg
Contents

Foundations of Modeling
Cédric Jeanneret, Martin Glinz, Thomas Baar
Modeling the Purposes of Models
Florian Johannsen, Susanne Leist
Reflecting modeling languages regarding Wand and Weber’s Decomposition Model
Janina Fengel, Kerstin Reinking
Sprachbezogener Abgleich der Fachsemantik in heterogenen Geschäftsprozessmodellen

Visualization of Models
Thomas Goldschmidt, Steffen Becker, Erik Burger
Towards a Tool-Oriented Taxonomy of View-Based Modelling
Michael Schaub, Florian Matthes, Sascha Roth
Towards a Conceptual Framework for Interactive Enterprise Architecture Management Visualizations
Christian Schalles, John Creagh, Michael Rebstock
Exploring usability-driven Differences of graphical Modeling Languages: An empirical Research Report

Model-Based Development
Thomas Buchmann, Bernhard Westfechtel, Sabine Winetzhammer
ModGraph: Graphtransformationen für EMF
Michael Schlereth, Tina Krausser
Platform-Independent Specification of Model Transformations @ Runtime Using Higher-Order Transformations
Timo Kehrer, Stefan Berlik, Udo Kelter, Michael Ritter
Modellbasierte Entwicklung GPU-unterstützter Applikationen

Search, Reuse and Knowledge Acquisition
Lars Hamann, Fabian Büttner, Mirco Kuhlmann, Martin Gogolla
Optimierte Suche von Modellinstanzen für UML/OCL-Beschreibungen in USE
Benjamin Horst, Andrej Bachmann, Wolfgang Hesse
Ontologien als ein Mittel zur Wiederverwendung von Domänen-spezifischem Wissen in der Software-Entwicklung
Jochen Reutelshoefer, Joachim Baumeister, Frank Puppe
A Meta-Engineering Approach for Customized Document-centered Knowledge Acquisition

Domain-Specific Applications
Stefan Gudenkauf, Steffen Kruse, Wilhelm Hasselbring
Domain-Specific Modelling for Coordination Engineering with SCOPE
Alexander Rachmann
Referenzmodelle für Telemonitoring-Dienstleistungen in der Altenhilfe
Beate Hartmann, Matthias Wolf
Erweiterung einer Geschäftsprozessmodellierungssprache zur Stärkung der strategischen Ausrichtung von Geschäftsprozessen
Johanna Barzen, Frank Leymann, David Schumm, Matthias Wieland
Ein Ansatz zur Unterstützung des Kostümmanagements im Film auf Basis einer Mustersprache
Martin Burwitz, Hannes Schlieter, Werner Esswein
Agility in medical treatment processes – A model-based approach

Modeling the Purposes of Models
Cédric Jeanneret, Martin Glinz
University of Zurich
Binzmühlestrasse 14
CH-8050 Zurich, Switzerland
{jeanneret, glinz}@ifi.uzh.ch
Thomas Baar
Hochschule für Technik und Wirtschaft Berlin
Wilhelminenhofstraße 75A
D-12459 Berlin, Germany
thomas.baar@htw-berlin.de
Abstract: Today, the purpose of a model is often kept implicit. The lack of explicit
statements about a model’s purpose hinders both its creation and its (re)use. In this
paper, we adapt two goal modeling techniques, the Goal-Question-Metric paradigm
and KAOS, an intentional modeling language, so that the purpose of a model can be
explicitly stated and operationalized. Using some examples, we present how these
approaches can document a model’s purpose so that this model can be validated, improved and used correctly.
1 Introduction
With the advent of Model Driven Engineering (MDE), models play a more and more
important role in software engineering. Conceptually, a model is an abstract representation
of an original (like a system or a problem domain) for a given purpose. One cannot build
or use a model without knowing its purpose. Yet, today, the purpose of a model is often
kept implicit. Thus, anybody can be misled by a model if it is used for a task it was
not intended for. Furthermore, a modeler must rely on experience and intuition to
decide how much and which detail is worth modeling. This may result in a model at
the wrong level of abstraction for its (unstated) purpose. Stating the purpose of a model
explicitly is only a first step to address these issues.
Ultimately, the purpose of a model can be characterized by a set of operations. There
are two kinds of operations: (i) operations performed by humans to interpret (understand,
analyze or use) the model and (ii) operations executed by computers to transform the model
into another model (model transformations). Being able to express the purpose of a model
with a set of model operations makes it possible to measure how well a model fits its purpose. In
previous work [JGB11], we have made a contribution towards measuring the confinement
of a model (the extent to which it contains relevant information) given the set of formal
operations to be executed on it.
Having a set of operations is not enough, though: we must ensure that these operations
can be performed on the model — no matter whether these operations are performed by
humans or executed by computers. For this, we have to make explicit which information
the operations need from the model and we have to determine which structures a model
has to conform to. In other words, we need to state which elements the metamodel must
contain for enabling the operations.
Our previous work assumes that these operations and these metamodels exist. This assumption may hold in an MDE context, but not in a wider context: Often, the purpose
of a model is not even stated explicitly. Thus, there is a need for (a) methods to elicit
and document modeling purposes in the first place and (b) methods to operationalize these
modeling purposes systematically. In goal modeling, there are many approaches for these
two tasks. However, these approaches were designed for other contexts than modeling.
In this paper, we adapt two of these goal modeling approaches for systematically deriving
a set of model operations and associated metamodel elements from a qualitatively stated
model purpose. First, we present Goal-Operation-Metamodel (GOM), a generalization of
the Goal-Question-Metric (GQM) paradigm [Bas92]. Second, we propose to use KAOS
[vL09] (a goal modeling language) as a metalanguage to create intentional metamodels.
The remainder of this paper is organized as follows. In the next section, we present
the problem context of our work in more detail. In Section 3, we describe the Goal-Operation-Metamodel method, and in Section 4 we present intentional metamodeling with KAOS.
We discuss our findings in Section 5, while Section 6 discusses related work.
Finally, we conclude the paper in Section 7.
2 Problem Context
Many modeling theories distinguish two roles in the model building process: the modeler
and the expert. Modeling is a collaborative activity involving a dialog between these two
roles: The modeler elicits information about the original from the expert before formalizing it, while the expert validates the content of the model as explained by the modeler.
These roles and their relationships are represented in Figure 1. Hoppenbrouwers et al. even
consider the model as the minutes of this dialog [HPdW05].
While building a model may be valuable on its own, the value of modeling consists of
using the model as a substitute for the original to infer some new information about it.
These inferences are made by the interpreter – a third role related to modeling. To achieve
this, the interpreter performs various model operations on the model, like executing queries
on it, extracting views from it or transforming it to other models or artefacts.
When describing the nature of modeling, Rothenberg listed the following purposes of
models [Rot89]:
The purpose of a model may include comprehension or manipulation of its
referent [the original], communication, planning, prediction, gaining experience, appreciation, etc. In some cases this purpose can be characterized by
the kinds of questions that may be asked of the model. For example, prediction corresponds to asking questions of the form “What-if...?” (where the user
asks what would happen if the referent began in some initial state and behaved
as described by the model).
Figure 1: The roles involved in a modeling activity. The diagram relates the roles Modeler, Expert and Interpreter to the entities Model, Original and Purpose through the relationships builds, represents, elicits information, provides / validates information, uses and deliberates.
A clearly stated modeling purpose can be used as a contract between the modeler and
the interpreter. Establishing contracts is costly, as they must be negotiated and edited.
Nevertheless, such a contract can be useful in two ways: First, as a specification for a
model’s purpose, it provides a strong basis on which the model can be validated. It can
also provide the modeler with some guidance for improving the model so that it reaches
the right level of abstraction. Second, as a description of a model’s purpose, it tells an
interpreter whether the model at hand is fit for the intended use or, if several models are
available, it helps him to choose which model will best fit his purpose.
In the vein of [LSS94], we consider a model as a set of statements M. For each modeling
purpose, there is a set of relevant statements D. In the ideal case, the set of statements in the model M coincides with D. When the sets M and D differ, we can
quantify the deviation of M from D by using measures from the Information Retrieval
field: precision and recall. Precision measures the confinement of a model, the extent
to which it contains relevant statements. Recall, on the other hand, measures the completeness of a model, that is, the proportion of relevant statements that have actually been
modeled. By measuring the confinement and completeness of a model, a modeler can assess how adequate its level of abstraction is for its purpose. Indeed, a model at the right
level of abstraction for its purpose is both confined and complete (M = D).
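The two measures can be illustrated with a minimal Python sketch that treats M and D as plain sets. The function name, the example statements and the convention of returning 1.0 for an empty set are assumptions made for illustration only; they are not taken from the paper.

```python
def precision_recall(model, relevant):
    """Return (precision, recall) of a model M against the relevant statements D.

    Precision (confinement) = |M ∩ D| / |M|
    Recall (completeness)   = |M ∩ D| / |D|
    Assumption: an empty set yields 1.0 for the corresponding measure.
    """
    model, relevant = set(model), set(relevant)
    shared = model & relevant
    precision = len(shared) / len(model) if model else 1.0
    recall = len(shared) / len(relevant) if relevant else 1.0
    return precision, recall


# Hypothetical statements about an underground network.
M = {"station A exists", "A connects to B", "A lies on Oxford Street"}
D = {"station A exists", "A connects to B", "A is step-free"}

precision, recall = precision_recall(M, D)
```

In this example the model contains one irrelevant statement and misses one relevant statement, so both measures fall below 1; they reach 1 together only when M = D.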
However, defining the set D is challenging. In our previous work [JGB11], we made a
contribution towards measuring the confinement of a model given a set of operations that
characterizes its purpose. When these operations are executed on a model, they navigate
through its content and gather some information by reading some of its elements. The
set of elements touched by a model operation during its execution forms the footprint of
that operation. Thus, the footprint contains all elements that affect the outcome of the
operation. For a set of operations, the global footprint of the set of operations is the union
of the footprints of each operation. This global footprint is the intersection M ∩ D: it is
the set of statements that are both present in the model and used to fulfill its purpose.
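The footprint idea can be sketched in the same spirit. The snippet below assumes that each operation's footprint is already available as a set of element names; how footprints are actually recorded in [JGB11] is not described in this section, and all identifiers and example elements are hypothetical.

```python
def global_footprint(footprints):
    """Union of the footprints of all operations (the paper's M ∩ D)."""
    result = set()
    for footprint in footprints:
        result |= set(footprint)
    return result


def confinement(model_elements, footprints):
    """Share of the model's elements that at least one operation touches."""
    model_elements = set(model_elements)
    if not model_elements:
        raise ValueError("the model has no elements")
    used = global_footprint(footprints) & model_elements
    return len(used) / len(model_elements)


# Hypothetical model elements and per-operation footprints.
elements = {"segments", "fare zones", "accessibility", "street names"}
footprints = [
    {"segments"},                # e.g. a path query
    {"segments", "fare zones"},  # e.g. a cost computation
]

touched = global_footprint(footprints)
score = confinement(elements, footprints)
```

Because the footprint is measured on the model itself, confinement can be computed without enumerating D explicitly, which is what makes the approach practical when D is hard to define.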
In this paper, we propose two approaches to operationalize a qualitatively stated modeling purpose into a set of model operations and their supporting metamodels. Instead of
inventing new methods from scratch, we adapt two existing goal modeling techniques,
GQM [Bas92] and KAOS [vL09], so that they can be used in a modeling context in addition to measurement and requirements engineering, respectively. To illustrate the use of
these methods in modeling, we first present two examples.
2.1 Motivating Examples
In this section, we introduce two examples to motivate and illustrate our approaches to
capture the purpose of a model. The first example is the London Underground map, used
by Jeff Kramer in [Kra07] to highlight that the value of an abstraction depends on its purpose. The second example is related to Software Engineering, where an architect models
her (or his) system according to the “4+1” viewpoints of Kruchten [Kru95] in order to conduct
a performance analysis as described in [CM00]. We have used this example in our
previous paper to explain the various usage scenarios of model footprinting [JGB11].
The London Underground Map
Like most major cities, London has an underground railway system. To help its users navigate London with it, its operator, Transport for London (TfL), provides a
map of this transit system. Figure 2 shows the evolution of the map over the years. In
1919 (Figure 2a), the map was a geographical map of London with the underground lines
overlaid in color. In 1928 (Figure 2b), ground features like streets were removed from the
map and the outlying lines were distorted to free some space for the congested center, making it more readable. The first schematic representation of the network appeared in 1933
(Figure 2c): the precise geographic location of stations was discarded; only the topology of
the network was represented. The current map (Figure 2d) contains additional information
such as the accessibility of stations, the connections to other transportation systems and
fare zones.
In this example, the modeler is the employee of TfL designing the map. The expert is an
employee of TfL who knows the underground network well. The interpreter is a user of
the map. The map is used to plan trips in London, that is, the map must help travelers
to answer the following questions: How do I get from A to B? How much does it cost?
How long does it take? Is that route accessible for disabled people? Interestingly, in this
example, the people who use the model to plan their trip also use the modeled system to
actually travel in London.
Figure 2: Maps of the London Underground: (a) map in 1919, (b) map in 1928, (c) map in 1933, (d) map in 2009. (a), (b) and (c): © TfL from the London Transport Museum collection; (d): © Transport for London.
Performance Analysis on Software Architecture
For the second example, we consider an architect analyzing the performance of a piece of
software. To this end, she describes its architecture using the “4+1” view model proposed
by Kruchten in [Kru95]. This view model includes (1) use case and sequence diagrams for
the scenario view, (2) class diagrams for the logical view, (3) component diagrams for the
development view, (4) activity diagrams for the process view and (5) deployment diagrams
for the physical view. Her model is first transformed into an extended queueing network
model (EQNM) as explained by Cortellessa et al. in [CM00]. Performance indicators are
then measured on the EQNM. The architect wants the following questions to be answered:
What is the response time and throughput of her system? Where is its bottleneck?
In this example, EQNMs can be seen as the semantic domain for architecture models written in UML. There are therefore two chained interpretations: the first interpretation translates a UML model into an EQNM, while the second interpretation analyses the EQNM.
In this example, we focus on the translation from UML to EQNM.
The architect plays all three roles in this example. As the architect of her software, she
is the expert. As she creates the model, she is the modeler. As she uses the model for
evaluating the performance of her software, she is the interpreter. However, there are two
additional stakeholders involved in this example: Cortellessa and his team developed the
analysis used by the architect, while Kruchten, by defining the “4+1” viewpoint, proposed
a “contract” between the modeler and the interpreter.
Contrary to the previous example, the performance analysis is mostly automated. As the
architect is only interested in its results, she may know little about the internals of the
technique. Thus, the documentation of the analysis must state explicitly which information
the analysis requires from its input models.
3 Goal-Operation-Metamodel
GQM is a mechanism for defining and evaluating a set of operational goals using measurement [Bas92]. In the GQM paradigm, a measurement is defined on three levels:
• At the conceptual level, the goal of the measurement is specified in a structured manner: It specifies the purpose of the measurement, the object under study, the focus
of the measurement and the viewpoint from which the measurements are taken.
• At the operational level, the goal is refined to a set of questions.
• At the quantitative level, a set of metrics is associated with each question so that it can
be answered in a quantitative manner.
Our approach consists of using GQM for models other than metrics. According to Ludewig
[Lud03], metrics are a kind of model. However, GQM has to be extended on all three
levels to describe modeling purposes other than quantitative analysis. At the conceptual
level, the goal template must support purposes like code generation¹ or documentation. At
the operational level, the set of questions is replaced by a set of (general) operations:
besides queries, one may need simulations and transformations to refine the goal stated
at the conceptual level. Finally, the quantitative level becomes the definable level: metamodels replace metrics to support the model operations from the operational level. These
operations are run on conforming models in much the same way that questions are answered with the value of a metric. Thus, we call this approach Goal-Operation-Metamodel
(GOM).
¹ Code generation, as an operation, is not supported when a model is at a conceptual level. Here, we consider
code generation as the model’s purpose to be described with GOM.
3.1 GOM and the London Underground Map
Based on the GQM template described in [Bas92], we define the goal of the map as
follows:
Analyze the London Underground
For the purpose of characterization
With respect to reachability and connectivity of its stations
From the view of a traveler
In the following context: the traveler may be a disabled person, the tube is
part of a larger transportation system, the map is displayed on a screen or on
paper in stations
From this goal, we derive the following questions to be answered from the model:
(a) What is the shortest path between two stations?
(b) How much does it cost to travel along a given path?
(c) How long does it take to travel along a given path?
(d) Is a given path accessible to a disabled person?
(e) When traveling along a path, at which station should the traveler leave the train?
(f) When traveling along a path, which train (line and direction) should the traveler board?
Table 1 lists which questions are supported by the four versions of the map displayed in
Figure 2. All maps can be used to find the shortest path between two stations and where to
board and leave trains. However, only the 2009 version fully supports disabled people
and allows for computing the cost of a trip. Since it preserves the geographic location of
stations, the map of 1919 can be used to estimate the time needed for a trip (without taking
transfers into account).
Table 1: Operations supported by the different versions of the map.
Map    Path (a)   Cost (b)   Time (c)   Accessibility (d)   Step Off (e)   Step In (f)
1919   √          −          √          −                   √              √
1928   √          −          −          −                   √              √
1933   √          −          −          −                   √              √
2009   √          √          −          √                   √              √
A map conforming to the metamodel depicted in Figure 3 could be used to answer all
the questions characterizing the purpose of the map. Segments, lines and stations form
the topology of the network, allowing a traveler to plan (question (a)) and carry out
(questions (e) and (f)) a trip with the Underground. Fare zones are involved in the computation of the cost of a trip (question (b)). The accessibility of a station serves question
(d), while the distance covered by a segment is needed to answer question (c).
In this example, questions are answered “mentally” by the travelers. Still, all these questions could be formalized with queries in OCL or operations in Kermeta [MFJ05]. For
example, Listing 1 presents the operation computing the cost of a trip (encoded as a sequence of segments) in Kermeta. This operation is defined for the metamodel presented in
Figure 3.
Figure 3: A metamodel for a map of the London Underground. Classes: Map; Line (name : String); Station (name : String, isAccesible : Boolean); Segment (distance : Integer); FareZone (zoneID : Integer, cost : Real). Association ends: fareZones, zone (1..2), stations, lines, segments (1..*, ordered), sourceStation, destinationStation, leavingSegments, incomingSegments.
    // Compute the cost of a trip
    operation cost(path : Sequence<Segment>) : Real is do
        result := "0".toReal
        // First, collect all traversed zones
        var traversedZones : Set<FareZone> init Set<FareZone>.new
        path.each{ seg |
            var src : Set<FareZone> init seg.sourceStation.zone
            var dst : Set<FareZone> init seg.destinationStation.zone
            var inter : Set<FareZone> init src.intersection(dst)
            if not inter.isEmpty
            then
                // Both stations are in the same zone
                traversedZones.addAll(inter)
            else
                // The segment traverses a boundary
                traversedZones.addAll(src)
                traversedZones.addAll(dst)
            end }
        // Second, sum the cost of each traversed zone
        traversedZones.each{ z | result := result + z.cost }
    end
Listing 1: Operation computing the cost of a trip.
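For readers unfamiliar with Kermeta, the logic of Listing 1 can be mirrored in Python. This is only a sketch: it covers just the part of the metamodel that the operation touches, the class and field names are adapted to Python conventions, and the fare values are invented for illustration and do not reflect actual fares.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FareZone:
    zone_id: int
    cost: float


@dataclass(frozen=True)
class Station:
    name: str
    zones: frozenset  # one or two FareZone objects, as in the metamodel


@dataclass(frozen=True)
class Segment:
    source: Station
    destination: Station


def trip_cost(path):
    """Sum the cost of every fare zone traversed by a sequence of segments."""
    traversed = set()
    for seg in path:
        shared = seg.source.zones & seg.destination.zones
        if shared:
            # Both stations are in the same zone.
            traversed |= shared
        else:
            # The segment crosses a zone boundary.
            traversed |= seg.source.zones | seg.destination.zones
    return sum(zone.cost for zone in traversed)


# Hypothetical network: station B sits on the boundary of zones 1 and 2.
zone1 = FareZone(1, 2.0)
zone2 = FareZone(2, 1.5)
a = Station("A", frozenset({zone1}))
b = Station("B", frozenset({zone1, zone2}))
c = Station("C", frozenset({zone2}))

total = trip_cost([Segment(a, b), Segment(b, c)])
```

As in the Kermeta version, each zone is charged once no matter how many segments of the trip lie within it, because the zones are collected in a set before their costs are summed.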
3.2 GOM and the Performance Analysis on Software Architecture
In this example, we only consider the immediate goal of the model, which is the generation
of an EQNM, and we leave the final goal (the performance analysis) out. Still, both goals
could have been captured by GOM. Slightly adapting the GQM template [Bas92], the
immediate goal of the model can be stated as follows:
Analyze the architecture of a software system
For the purpose of generating an EQNM
With respect to the scenario view and the physical view as defined in [Kru95]
From the view of the software architect
In the following context: the generation of an EQNM is explained in [CM00],
this generation is automated, the generated EQNM will be used to analyze the
performance of the architecture
[CM00] describes, formally, the various steps in the generation of the EQNM from UML
models:
(1) deduce a user profile from the use case diagram,
(2) combine the sequence diagrams into a meta execution graph (meta-EG),
(3) obtain the EQNM of the hardware platform from the deployment diagram and tailor
the meta-EG into an EG-instance for that platform,
(4) assign numerical parameters to the EG-instance, and
(5) assign environment based parameters to the EQNM, process the EG-instance to obtain
software parameters before assigning them to the EQNM.
This chain of transformations requires information from the following UML diagrams: use
case diagrams, sequence diagrams and deployment diagrams. The other diagrams of the
“4+1” model — the class, the component and the activity diagrams — are not needed for
this purpose.
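The observation that only three kinds of diagrams are needed can be reproduced with a small set computation. The sketch below is illustrative only: the dictionary encodes our reading of the five steps listed above (steps 4 and 5 take numerical parameters rather than further UML diagrams), and the diagram names are plain strings chosen for this example.

```python
# UML diagrams read by each step of [CM00], as summarized in the text.
# Steps 4 and 5 take numerical parameters, not additional UML diagrams.
STEP_INPUTS = {
    1: {"use case"},
    2: {"sequence"},
    3: {"deployment"},
    4: set(),
    5: set(),
}

# Diagram kinds mentioned in the text for the "4+1" model
ALL_DIAGRAMS = {
    "use case", "sequence", "deployment",
    "class", "component", "activity",
}

# Diagrams required by at least one step
needed = set().union(*STEP_INPUTS.values())

# Diagrams that serve no step and can be left out for this purpose
not_needed = ALL_DIAGRAMS - needed
```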
While GOM allows the purpose of a model to be stated explicitly and operationalized, goals
expressed in GOM are not formal enough to be analyzed automatically, for example, to
find conflicts among them. In the next section, we present how a model’s purpose can be
expressed in a goal-oriented modeling language.
4 Intentional Metamodeling
In the previous section, we presented a structured but informal way to specify a model’s
purpose. In this section, we introduce intentional metamodeling with KAOS, a goal modeling language designed for use in early phases of requirements engineering. A KAOS
model consists of four interrelated views:
Goal modeling establishes the list of goals involved in the system. Refined goals and
alternatives are represented in an AND/OR tree. Conflicts among goals are also
represented in this diagram.
Responsibility modeling captures the agents to whom responsibility for (leaf) goal satisfaction is assigned.
Object modeling is used to represent the domain’s concepts and the relationships among
them.
Operation modeling prescribes the behaviors the agents must perform to satisfy the goals
they are responsible for.
A goal can be refined into conjoined sub-goals (the goal is satisfied when all its sub-goals
are satisfied) or into alternatives (the goal is satisfied when at least one of its alternatives
is satisfied). Therefore, goals are represented as AND/OR trees in KAOS. In such a tree,
the goals below a given goal explain how (and how else) the goal can be realized. Conversely,
goals higher in the hierarchy provide the rationale for a given goal, explaining
why the goal is present in the model.
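The satisfaction rule for AND/OR trees described above can be written as a short recursive function. This is a minimal sketch, not part of KAOS itself: the Goal class, its field names and the example goals G, G1, G2a and G2b are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    name: str
    refinement: str = "AND"   # "AND": conjoined sub-goals, "OR": alternatives
    children: list = field(default_factory=list)
    satisfied: bool = False   # only consulted for leaf goals


def is_satisfied(goal):
    """Evaluate a goal bottom-up according to its refinement type."""
    if not goal.children:
        return goal.satisfied
    results = [is_satisfied(child) for child in goal.children]
    return all(results) if goal.refinement == "AND" else any(results)


# Hypothetical tree: G = G1 AND (G2a OR G2b)
g1 = Goal("G1", satisfied=True)
g2a = Goal("G2a", satisfied=False)
g2b = Goal("G2b", satisfied=True)
g2 = Goal("G2", refinement="OR", children=[g2a, g2b])
root = Goal("G", refinement="AND", children=[g1, g2])
```

Here the root is satisfied because one alternative of G2 holds; marking G1 as unsatisfied would make the root fail regardless of the alternatives.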
[vL09] provides a taxonomy of goals based on their types and their categories. There are
two main types of goals: behavioral goals (such as Achieve, Cease, Maintain and Avoid
goals) prescribe the behavior of a system, while soft-goals (such as Improve, Increase,
Reduce, Maximize and Minimize goals) prescribe preferences among alternative systems.
Similarly, there are two main categories of goals: functional goals (like Satisfaction [of
user requests] or Information [about a system state] goals) state the intent behind a system
service and non-functional goals (like Usability or Accuracy) state a quality or constraint
on its provision or its development. This taxonomy can be helpful for eliciting and specifying goals.
Goals are refined until they are assignable to a single agent. Leaf goals are then made
operational by mapping them to operations that ensure them. Operations are binary relationships over system states. They can be derived from the formal specification of goals or
built from elicited scenarios. Finally, a conceptual model gathers all concepts (including
their attributes and the relationships among them) involved in the definition of goals and
operations.
We use KAOS as a metametamodel rather than as a metamodel, the role for which it was initially
designed: in our approach, KAOS models are metamodels. Goals depict the modeling purposes.
Operations prescribe the operations that can be executed on models and the conceptual
model defines the abstract syntax of the language. Thus, a metamodel written in KAOS
specifies many aspects of a modeling task: it states the purpose and intended usage of
models as well as their structure. In the remainder of this section, we present KAOS
metamodels for our examples.
4.1 Intentional Metamodeling and the London Underground Map
A KAOS metamodel of the London Underground map is presented in Figure 4. The main
goal of the map is to provide travelers with a means to understand how to travel from
a station A to another station B successfully. To achieve this, the map must satisfy the
following sub-goals: to help travelers plan their trip, to help them buy the right ticket
for it, and to help them navigate, that is, to prevent them from getting lost during
their travel.
These goals are operationalized through the following operations performed by the traveler: find the shortest path between stations A and B, compute its cost (by summing the
fares of traversed fare zones) and carry out the plan by riding on the right line and changing at the right station. As we did in Section 3.1, we can define these operations
formally and derive a metamodel to support them. For space reasons, this metamodel is
not included in Figure 4, but it is presented in Figure 3.
[Figure 4 content: the root goal "Provide a means for understanding how to travel from A to B successfully" is refined into the sub-goals Achieve [Plan Trip], Achieve [Buy Right Ticket] and Avoid [Getting Lost]. The operations shown are Compute Shortest Path A → B, Compute Cost of Planned Trip and Carry Out Planned Trip; the agent is Traveler.]
Figure 4: A KAOS metamodel for the London Underground map.
4.2 Intentional Metamodeling and the Performance Analysis
We present an intentional metamodel of the performance analysis in Figure 5. The final
goal of the architect is to analyze the performance of her architecture. This goal has been
refined to three sub-goals: First, performance models are generated automatically from
some UML diagrams. Then, these performance models are parametrized and solved. For
space reasons, we did not further elaborate these two latter goals. We also considered
UML diagrams as atoms, ignoring their internal elements such as actors, messages and
nodes.
A computer is responsible for the generation of performance models. This goal is operationalized with four automated operations: generate the user profile from the use case
diagram, generate the meta-EG from sequence diagrams, instantiate the meta-EG into an
EG-instance with the help of the deployment diagram and generate an EQNM from the
deployment diagram. These operations correspond to the first three steps described in
[CM00]. The last two steps are captured in the two remaining goals, parametrize and
solve the performance models.
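The four automated operations form a small data-flow graph, which can be sketched as follows. The operation and artifact names are taken from the description above and from Figure 5; the encoding as a dictionary and the use of graphlib from the Python standard library (Python 3.9 or later) are our own illustrative choices, not part of [CM00].

```python
from graphlib import TopologicalSorter

# operation -> (inputs, output), following the description in the text
OPERATIONS = {
    "Generate User Profile": ({"Use Case Diagram"}, "User Profile"),
    "Create Meta EG": ({"Sequence Diagram"}, "Meta EG"),
    "Instantiate EG": ({"Meta EG", "Deployment Diagram"}, "EG Instance"),
    "Create EQNM": ({"Deployment Diagram"}, "EQNM"),
}

# Which operation produces which artifact
producer = {output: op for op, (_, output) in OPERATIONS.items()}

# An operation depends on whichever operation produces one of its inputs
dependencies = {
    op: {producer[i] for i in inputs if i in producer}
    for op, (inputs, _) in OPERATIONS.items()
}

# One admissible execution order of the operations
order = list(TopologicalSorter(dependencies).static_order())

# Artifacts that no operation produces must be supplied from outside
all_inputs = set().union(*(inputs for inputs, _ in OPERATIONS.values()))
external_inputs = all_inputs - set(producer)
```

The only ordering constraint that emerges is that the meta-EG must exist before it is instantiated; the external inputs are exactly the three UML diagrams.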
5 Discussion
This paper is an initial contribution towards the modeling of models’ purposes. For this,
we have adapted two existing goal modeling approaches and applied them to two modeling
tasks, demonstrating the feasibility of such metamodeling.
[Figure 5 content: the goal Achieve [Performance Analysis] is refined into Achieve [Generate Performance Models], Achieve [Parametrize Performance Model] and Achieve [Solve Performance Models]. The agents are Computer and Architect. The operations shown are Generate User Profile, Create Meta EG, Instantiate EG and Create EQNM; the objects are Use Case Diagram, User Profile, Sequence Diagram, Meta EG, EG Instance, Deployment Diagram and EQNM.]
Figure 5: A KAOS metamodel for a performance analysis.
In the remainder of this section, we compare the two approaches presented in this paper,
GOM and intentional metamodeling. We also discuss the benefits and difficulties related
to these approaches.
5.1 Comparison of GOM and Intentional Metamodeling
Unlike GOM, intentional metamodeling with KAOS can capture the complete rationale behind the creation and the use of a model. As explained in Section 4, goal models are
organized in AND/OR trees. By navigating the tree from an element upwards, a modeler
can find the rationale explaining a given operation, meta-class or modeling purpose. Likewise, but using downward navigation, the modeler can figure out how a model purpose is
realized by looking at its sub-goals or its alternatives.
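Upward and downward navigation can be illustrated on the goals of Figure 4. In this sketch, the tree is stored as child-to-parent links; the pairing of each operation with a sub-goal is our reading of Section 4.1, and the function names why and how are hypothetical.

```python
ROOT = "Provide a means for understanding how to travel from A to B successfully"

# child -> parent links; pairing operations with sub-goals is an assumed reading
PARENT = {
    "Achieve [Plan Trip]": ROOT,
    "Achieve [Buy Right Ticket]": ROOT,
    "Avoid [Getting Lost]": ROOT,
    "Compute Shortest Path A -> B": "Achieve [Plan Trip]",
    "Compute Cost of Planned Trip": "Achieve [Buy Right Ticket]",
    "Carry Out Planned Trip": "Avoid [Getting Lost]",
}


def why(element):
    """Upward navigation: the chain of goals that justify an element."""
    rationale = []
    while element in PARENT:
        element = PARENT[element]
        rationale.append(element)
    return rationale


def how(goal):
    """Downward navigation: the direct sub-goals or operations realizing a goal."""
    return [child for child, parent in PARENT.items() if parent == goal]
```

For instance, why("Compute Cost of Planned Trip") yields the ticket-buying goal followed by the root goal, while how(ROOT) lists the three sub-goals.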
In this paper, we only presented semi-formal KAOS models. However, these models can
be completely formalized and are thus amenable to automated analysis, including the verification of goal refinements or the derivation of goal operationalizations [vL09]. The
weakness of KAOS lies in the cost and difficulty of formalizing goals and operations. In
comparison, GOM is a semi-formal approach. It only provides templates for stating modeling purposes and guidelines for deriving questions from this purpose. Future research
should explore under which conditions a low or a high level of formality is preferred or
required.
We have presented GOM and intentional metamodeling as two different approaches, because they come from different fields: software measurement and early requirements engineering, respectively. Future work may integrate these two approaches, combining the
ease of use of the templates and guidelines of GOM and the formality of KAOS.
5.2 Benefits and Limitations
Stating a model purpose and making it operational allows for measuring the fitness of
a model for this purpose. A model is complete if it contains all the elements necessary
to fulfill its goals. Conversely, a model element is pertinent if it contributes to the satisfaction of at least one goal. Confined models only contain pertinent elements. With a
formal KAOS model, it is possible to measure these qualities objectively by establishing
satisfaction arguments.
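As a rough approximation of these qualities, completeness and confinement can be expressed as set comparisons once the elements required by each goal are known. The sketch below is a deliberate simplification that does not construct satisfaction arguments; the function names, the goal requirements and the sample model (including the non-pertinent element "street names") are hypothetical.

```python
def needed_elements(required_by_goal):
    """Union of the model elements required by all goals."""
    return set().union(*required_by_goal.values())


def is_complete(model, required_by_goal):
    """Complete: every element needed by some goal is in the model."""
    return needed_elements(required_by_goal) <= set(model)


def pertinent_elements(model, required_by_goal):
    """Pertinent: elements contributing to at least one goal."""
    return set(model) & needed_elements(required_by_goal)


def is_confined(model, required_by_goal):
    """Confined: the model contains pertinent elements only."""
    return set(model) <= needed_elements(required_by_goal)


# Hypothetical requirements for the goals of Section 4.1
required = {
    "Plan Trip": {"stations", "segments"},
    "Buy Right Ticket": {"stations", "segments", "fare zones"},
    "Avoid Getting Lost": {"stations", "lines"},
}
model = {"stations", "segments", "fare zones", "lines", "street names"}
```

Under these assumptions, the sample model is complete but not confined, since "street names" serves none of the stated goals.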
However, eliciting a model’s purpose and elaborating it has a cost. The benefits must be
higher than the costs if the practice is to be adopted by practitioners. Models are like systems. Making explicit requirements about models (such as stating their purpose) aims at
reducing the risk of creating the wrong models. Models at the wrong level of abstraction
have consequences ranging from small annoyances for their interpreters to the impossibility of fulfilling the purposes they were made for.
Furthermore, goal modeling is difficult. First, many modelers are not experienced in intentional modeling. Courses on Software Engineering or Modeling typically cover data,
behavior and process modeling languages but leave out goal modeling. Thus, (intentional) metamodelers will be rare in the near future. Second, goal models grow rapidly as
goals are refined and alternatives are identified.
6 Related Work
In this section, we present the state of the art in metamodeling and model quality and we
discuss its limitations. For van Gigch [vG91], a metamodel should cover many aspects
of modeling, not only “data” metamodeling (the syntax of the language). In this vein,
Kermeta proposes to metamodel the behavior of models, so that the operational semantics
of models can be specified [MFJ05]. In this paper, we go one step further by metamodeling
modeling agents and their goals.
In their model of modeling [MFBC10], Muller et al. place the intention of a model at the
heart of their notation. They define intention as follows:
The intention of a thing thus represents the reason why someone would be
using that thing, in which context, and what are the expectations vs. that
thing. It should be seen as a mixture of requirements, behavior, properties,
and constraints, either satisfied or maintained by the thing.
In their notation, intentions are considered as sets and thus represented as Venn diagrams.
While this notation can represent the intersection and the overlap among intentions,
it cannot represent the internal content of the intention behind a model. The
focus of our paper is to represent this intention, so that its modelers and its interpreters can
agree and reason on it.
For Nuseibeh et al. [NKF93], a viewpoint consists of (1) a style (the modeling language
and its notation), (2) a work plan describing the development process of the viewpoint
including possible consistency check or construction operations, (3) a domain defining
the area of concern with respect to the overall system, (4) a specification describing the
viewpoint’s domain using the viewpoint’s style (in other words, the view of the system
from the viewpoint) and (5) a work record keeping track of development history within
the viewpoint. According to the IEEE 1471 standard, a viewpoint captures the conventions
for constructing, interpreting and analyzing a particular kind of view. Thus, a viewpoint
defines — among others — modeling languages, model operations that can be applied
to views and stakeholders whose concerns are addressed in the views. Viewpoints define
the various views (and their relationships) in a software specification or in an architecture
description, thus, they provide the modeler with guidelines on what they are expected to
model. However, we are not aware of guidelines to define these viewpoints, nor techniques
to validate that a view actually satisfies the needs of the stakeholders using it.
In [MDN09], Mohagheghi et al. surveyed frameworks, techniques and studies of model quality in model-based software development. They identified six quality goals: correctness,
completeness, consistency, comprehensibility, confinement and changeability. Manual reviews [LSS94] and metrics [BB06] can be used to assess and improve the confinement and
completeness of models. However, these techniques are either bound to a given modeling
language and a given process [BB06] or must be tailored for the modeling task at hand
[LSS94]. In comparison, intentional metamodeling and GOM are not bound to any specific language or process. Because they document the purpose of models, goals expressed
and operationalized in GOM or KAOS may serve as basis to derive checklists, guidelines
and metrics for validating models.
In previous work [JGB11], we proposed and compared two methods to compute the footprint
of an operation, that is, the set of all information used by the operation during its execution. Dynamic footprinting reveals the actual footprint of an operation by tracing its execution on
the model. In contrast, static footprinting estimates footprints by first analyzing, statically,
the definition of the operation to obtain its static metamodel footprint, the set of all modeling constructs (i.e., types, attributes and references) involved in this definition. The model
footprint can then be estimated by selecting only those model elements that are instances of
elements in the metamodel footprint. In this previous work, we assumed that the purpose
of a model can be characterized by the set of operations being carried out on it and that these
operations were formally defined. These assumptions are reasonable in a MDE setting.
Still, in this paper, we are interested in methods to specify an arbitrary model purpose and,
if possible, to refine this purpose into a set of operations whose footprints can be looked at.
In other words, the focus of this paper is the elicitation, documentation and operationalization of modeling purposes. The operationalization produces metamodels and operations
that can also be used as input for model footprinting.
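The selection step of static footprinting can be illustrated with a simple filter. This sketch considers types only and ignores attributes and references, so it is coarser than the method of [JGB11]; the type names come from Listing 1, while the "Line" type, the sample model and the function name are assumptions.

```python
def estimate_model_footprint(model, metamodel_footprint):
    """Keep the model elements whose type occurs in the metamodel footprint."""
    return [(name, type_) for name, type_ in model if type_ in metamodel_footprint]


# Types referenced by the cost operation of Listing 1
COST_METAMODEL_FOOTPRINT = {"Segment", "Station", "FareZone"}

# Hypothetical model as (element name, type) pairs; "Line" is an assumed type
model = [
    ("A-B", "Segment"),
    ("A", "Station"),
    ("B", "Station"),
    ("Zone 1", "FareZone"),
    ("Central", "Line"),
]

footprint = estimate_model_footprint(model, COST_METAMODEL_FOOTPRINT)
```

The element of type "Line" is excluded from the estimate because the cost operation never refers to lines.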
In addition to GQM and KAOS, there are other goal oriented techniques and methods for
requirements engineering, such as i* [Yu97] and Tropos [CKM02]. While we could have
selected these approaches to capture and analyze the purposes of models, we chose KAOS
and GQM instead for their strong focus on the operationalization of goals.
7 Conclusion
One cannot build a model without knowing its purpose, and one must not use a model
for purposes it is not fit for. Despite its importance, the purpose of a model is often kept
implicit.
In this paper, we adapted two existing goal modeling approaches — GQM [Bas92] and
KAOS [vL09] — to capture the purpose of a model and operationalize it into a set of
operations and a metamodel. With these elements in hand, it is possible to measure
how fit a model is for its purpose. We demonstrated the feasibility of the approaches by
applying them to two examples.
These early results are promising, but the benefits of such intentional metamodels remain to be established empirically (e.g., with industrial case studies). With the experience
gained in modeling the purpose of models, we can elaborate the templates and adapt the
guidelines offered by KAOS and GQM in more detail. In this vein, further research could
define a profile for KAOS and develop specific analyses for intentional metamodels.
For the first time, goal modeling techniques were applied to modeling itself, raising many
open issues: What is the source of modeling goals, that is, who are the metaexperts? Do
intentional metamodels help in model management and model reuse?
Acknowledgement
Our work is partially funded by the Swiss National Science Foundation under the project
200021_134543 / 1.
References
[Bas92] Victor R. Basili. Software modeling and measurement: the Goal/Question/Metric paradigm. Technical Report UMIACS TR-92-96, 1992.
[BB06] Brian Berenbach and Gail Borotto. Metrics for model driven requirements development. In 28th International Conference on Software Engineering (ICSE ’06), pages 445–451, Shanghai, China, 2006. ACM.
[CKM02] Jaelson Castro, Manuel Kolp, and John Mylopoulos. Towards requirements-driven information systems engineering: the Tropos project. Information Systems, 27(6):365–389, 2002.
[CM00] Vittorio Cortellessa and Raffaela Mirandola. Deriving a queueing network based performance model from UML diagrams. In International Workshop on Software and Performance (WOSP 00), pages 58–70, 2000.
[HPdW05] Stijn Hoppenbrouwers, H. A. Proper, and Th. P. van der Weide. A Fundamental View on the Process of Conceptual Modeling. In Conceptual Modeling (ER 2005), volume 3716 of LNCS, pages 128–143. Springer, 2005.
[JGB11] Cédric Jeanneret, Martin Glinz, and Benoit Baudry. Estimating Footprints of Model Operations. In 33rd International Conference on Software Engineering (ICSE 2011), pages 601–610, Waikiki, Honolulu, HI, USA, 2011. ACM.
[Kra07] Jeff Kramer. Is abstraction the key to computing? Communications of the ACM, 50(4):36–42, 2007.
[Kru95] Philippe Kruchten. The 4+1 View Model of Architecture. IEEE Software, 12(6):42–50, 1995.
[LSS94] Odd Ivar Lindland, Guttorm Sindre, and Arne Sølvberg. Understanding Quality in Conceptual Modeling. IEEE Software, 11(2):42–49, 1994.
[Lud03] Jochen Ludewig. Models in software engineering - an introduction. Software and Systems Modeling, 2(1):5–14, March 2003.
[MDN09] Parastoo Mohagheghi, Vegard Dehlen, and Tor Neple. Definitions and approaches to model quality in model-based software development: A review of literature. Information and Software Technology, 51(12):1646–1669, 2009.
[MFBC10] Pierre-Alain Muller, Frédéric Fondement, Benoit Baudry, and Benoît Combemale. Modeling modeling modeling. Software and Systems Modeling, 2010.
[MFJ05] Pierre-Alain Muller, Franck Fleurey, and Jean-Marc Jézéquel. Weaving Executability into Object-Oriented Meta-Languages. In 8th International Conference on Model Driven Engineering Languages and Systems (MoDELS 2005), volume 3713 of LNCS, pages 264–278, 2005.
[NKF93] Bashar Nuseibeh, Jeff Kramer, and Anthony Finkelstein. Expressing the relationships between multiple views in requirements specification. In 15th International Conference on Software Engineering (ICSE ’93), pages 187–196, Baltimore, MD, USA, 1993.
[Rot89] Jeff Rothenberg. The nature of modeling. In Artificial intelligence, simulation & modeling, pages 75–92. John Wiley & Sons, Inc., New York, NY, USA, 1989.
[vG91] John P. van Gigch. System Design Modeling and Metamodeling. Plenum Press, New York, NY, USA, 1991.
[vL09] Axel van Lamsweerde. Requirements Engineering: From System Goals to UML Models to Software Specifications. Wiley, 2009.
[Yu97] Eric Yu. Towards modelling and reasoning support for early-phase requirements engineering. In 3rd International Symposium on Requirements Engineering (RE ’97), pages 226–235, 1997.
Reflecting modeling languages regarding Wand and
Weber’s Decomposition Model
Florian Johannsen, Susanne Leist
Department of Management Information Systems
University of Regensburg
Universitaetsstraße 31
93053 Regensburg
Florian.Johannsen@wiwi.uni-regensburg.de
Susanne.Leist@wiwi.uni-regensburg.de
Abstract: The benefits of decomposing process models are widely recognized in
literature. Nevertheless, the question of what actually constitutes a “good”
decomposition of a business process model has not yet been dealt with in detail.
Our starting point for obtaining a “good” decomposition is Wand and Weber’s
decomposition model for information systems which is specified for business
process modeling. In the investigation at hand, we aim to explore to what extent
modeling languages support the user in fulfilling the decomposition conditions
according to Wand and Weber. An important result of the investigation is that all
investigated business process modeling languages (BPMN, eEPC, UML AD) can
meet most of the requirements.
1 Introduction
Business process modeling is widely recognized as an important activity in a company
[BWW09]. For instance, business process models can serve as a basis for decisions on
IT investments or the design and implementation of information systems [BWW09]. The
size of a business process model plays a central role in its understandability
[MRC07]. Depending on both the purpose of modeling and the target group considered,
requirements on process models may differ. While a software engineer may be interested
in details of a business process (e.g. complex control-flow mechanisms), another
employee may only consider the more abstract model levels, giving him/her a basic
understanding of the business process [BRB07]. For creating process models that are
manageable and understandable in size, but also contain all the information needed (e.g.
for software development, process improvement efforts etc.), they are decomposed “into
simpler modules” [GL07]. In doing so, a process model is decomposed into several
model levels that differ in detail [KKS04]. Nevertheless the characteristics that actually
constitute a “good” decomposition [BM06, BM08] remain unclear. In practice, the
decomposition of process models is usually done in an “ad hoc” fashion [RMD10].
Guidelines on how to decompose a model into subprocesses are missing [RMD10]. Our
starting point is Wand and Weber’s model for a good decomposition which was
developed for information systems (see [WW89, We97]).
We specify this model for business process modeling giving business analysts a means
to evaluate their decomposed models. As already mentioned in literature (see [Re09]),
the potential of the Wand and Weber model seems promising for deriving criteria to
judge whether the decomposition of a process model is “good” or “bad”. As a first step
in our investigation we evaluate the capabilities of common process modeling languages
to enable Wand and Weber’s decomposition model. It is our aim to explore how far
these modeling languages support the user in fulfilling the defined conditions. Although,
in fact, common modeling languages enable the decomposition e.g. by means of
hierarchical functions in Event-driven Process Chains (EPCs) or subprocesses in the
Business Process Modeling Notation (BPMN), the information given on a certain model
level is not only dependent on the control-flow. Sometimes additional information is
needed which becomes obvious by taking, for instance, a data-oriented view (e.g. focus
on data elements). Since not all modeling languages support views that are not solely
directed at the control-flow, the capabilities of a modeling language influence the quality
of the decomposition.
This paper is structured as follows. In section two, we give a definition of the term
decomposition, highlight the relevance of the Wand and Weber model, and describe the
procedure for the research under study. Section three introduces the investigated
business process modeling languages, and section four presents the decomposition
model. Section five discusses whether the process modeling languages are capable of supporting the
decomposition model; to this end, requirements on
process modeling languages are derived. Section six presents conclusions, a set of
limitations, and potential directions for future research.
2 Conceptual Basics and Related Work
2.1 Decomposition and process model quality
Manifold metrics for judging the quality of a process model were recently developed
(see [GL07, Va07, MRA10]). Moreover frameworks for evaluating conceptual models
exist [SR98, KLS95]. In that context, decomposition is seen as a means to improve the
understandability of a process model while reducing the likelihood of errors at the same
time [MRA10]. The term decomposition is used in several publications, and many
further publications (see e.g. [FS06, He09]) use terms with similar meanings (e.g.
deconstruction, disaggregation, specialization). We define the decomposition of a
process according to Weber [We97] as a set of subprocesses in such a way that the
composition of the process equals the union of the compositions of the subprocesses in
the set. Everything in the composition of the process is included in at least one
subprocess in the set of subprocesses we chose. The decomposition of a process is
represented in a level structure of subprocesses, and, on each level, the process or the
subprocesses are displayed in a process model (see [We97]). Disaggregation and
specialization are seen as special types of decomposition representing a part-of relation
and an is-a relation, respectively. Most of the related work distinguishes heterogeneous types
of decomposition for a given objective.
For example, vom Brocke [Br06] introduced design principles for reference modeling
which aim to provide a greater flexibility in reference modeling. Malone et al. [Ma99]
developed the “process compass” which differentiates between horizontal specialization
by means of objects and vertical disaggregation into subprocesses. Heinrich et al. [He09]
used disaggregation and specialization for decomposing a process landscape, aiming at
identifying primarily functional similarities of the detailed subprocesses. Ferstl and Sinz
[FS06] defined principles (the so-called decomposition rules) to recursively refine
processes over several levels of detail which support disaggregation and specialisation.
The principles were especially designed to be used within the framework of their SOM
(semantic object model) methodology. Österle [Ös95] described a pragmatic procedure
to decompose processes. The objective of the procedure is to detail macro processes into
micro processes (see [Ös95]). To this end, he suggested four sources (services, business
objects, process or activities of the customer process, existing activities) which help to
derive activities from the macro process [Ös95]. While, based on their objectives,
different principles of decomposition are defined in these publications, characteristics of
a good decomposition are not investigated.
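The definition of decomposition adopted above (after Weber [We97]) can be stated as a set equation: the union of the compositions of the subprocesses must equal the composition of the process. The following sketch checks this condition; representing a composition as a set of activity names, as well as the sample order-handling process, are assumptions made for illustration.

```python
def is_decomposition(process, subprocesses):
    """True if the union of the subprocess compositions equals the process composition."""
    return set().union(*subprocesses) == set(process)


# Hypothetical process, given as the set of things in its composition
order_handling = {"receive order", "check credit", "pick goods", "ship goods", "send invoice"}

# Candidate set of subprocesses; overlaps are permitted by the definition
candidate = [
    {"receive order", "check credit"},
    {"pick goods", "ship goods"},
    {"ship goods", "send invoice"},
]

# A candidate that omits "send invoice" and therefore does not qualify
incomplete = [
    {"receive order", "check credit"},
    {"pick goods", "ship goods"},
]
```

The check only covers the definition itself; it says nothing about whether a decomposition is good in the sense of the five conditions discussed in the next section.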
2.2 Relevance of Wand and Weber’s decomposition model
As described above (section 2.1) various principles and suggestions to help practitioners
achieve a good decomposition exist. But, to our knowledge, only one general theory of
decomposition has so far been proposed in information systems (see [BM06]): Wand and
Weber’s good decomposition model (see [WW89, WW90, We97]). The decomposition
model is part of the Bunge-Wand-Weber model (BWW model) [We97]. The BWW
model is deeply rooted in the information system discipline [Re09] and considers the
representational model, the state-tracking model, and the decomposition model as named
above [We97]. The representational model has gained popularity as a means of the
ontological analysis of modeling languages (see e.g. [RI07, Ro09, RRK07]). To this end,
modeling languages are evaluated regarding ontological completeness and ontological
clarity [Re09].
Both the state-tracking and the decomposition model are based on the concepts of the
representational model [We97]. Details on the BWW model can be found in Weber
[We97], for example. The decomposition model as it was originally developed by Wand
and Weber comprises five conditions to judge the quality of a decomposition [We97]:
minimality, determinism, losslessness, minimum coupling, and strong cohesion. These
conditions help a user to decide whether an information system has been appropriately
decomposed or not. Investigating these principles of good decomposition [We97] to
support the creation of manageable business process models in large-scale initiatives has
already been promoted by Recker et al. [Re09] as a promising field for research. If the
decomposition model proves to be appropriate for that purpose, guidelines on how to
decompose business process models may be derived in a subsequent step. This opinion is
shared by Reijers and Mendling [RM08] as well. The positive effect of the
decomposition conditions on the comprehensibility of UML diagrams has already been
shown empirically (see [BM02, BM06, BM08]).
2.3 Procedure for deriving requirements on modeling languages
Since the decomposition conditions are based on the BWW representational model
[We97] and modeling languages differ regarding their ontological completeness [Re09],
the question arises to what extent heterogeneous modeling languages are able to support
Wand and Weber’s decomposition conditions. To answer this question we adhere to the
following procedure.
Step 1: Specification of the decomposition conditions for business process modeling
Step 2: Derivation of metrics for evaluating process models regarding the decomposition conditions
Step 3: Formulation of requirements on modeling languages
Step 4: Evaluation of the modeling languages
Figure 1: Procedure for deriving requirements and evaluating modeling languages
In a first step, the decomposition conditions, which have their origin in information
systems, are specified for business process modeling. Based on this specification,
metrics are derived (step 2) to judge whether a process model adheres to the
decomposition conditions as defined in step 1. These metrics serve as an objective basis
for the evaluation of process models. The metrics are obtained from our specification of
the decomposition conditions (step 1). Thus they address those modeling constructs that
are relevant for evaluating whether process models fulfill the
decomposition conditions. By looking at the metrics and the modeling constructs they
address, requirements on modeling languages can be defined straightaway. The third
step of our procedure (figure 1) contains the formulation of the requirements on
modeling languages. Using these requirements, common modeling languages are
evaluated regarding their support of the decomposition conditions (step 4). In doing so, it
becomes clear to what degree a decomposed process model that was designed
using a specific modeling language can be judged regarding its conformance with the
decomposition conditions. This investigation is part of a research project which aims to
define conditions for a good decomposition. The research project is based on the design
science research method (see [He04]). Thus decomposition conditions will be built and
evaluated afterwards. The investigation at hand serves to examine the capabilities of
existing knowledge (modeling languages and Wand and Weber’s decomposition model)
and builds upon design science principles as well. We evaluate existing artifacts
(modeling languages) using requirements derived from a proposed solution (Wand and
Weber’s decomposition model) for a given problem (decomposition).
3 Business Process Modeling Languages
Manifold notations exist for modeling business processes. Especially the Business
Process Modeling Notation (BPMN), the enhanced Event-driven Process Chains
(eEPCs), and UML activity diagrams (UML ADs) have gained considerable attention in
the field of business process modeling [MR08, Me09]. BPMN and UML have been
developed and promoted by the OMG as standards in the modeling domain [MR08].
However, not only the ratification by the OMG, but also the growing tool support has
contributed largely to their popularity in today’s business process modeling projects
[MR08].
eEPCs are characterized by a high user acceptance [Me09, STA05], especially in the
German-speaking community. A lot of reference models for different areas of
application (e.g. computer integrated manufacturing, logistics or retail) are designed
using eEPCs, while the notation is supported by manifold modeling tools as well
[Me09]. Whereas other modeling languages exist (such as Petri nets) (see [Mi10]), most
of them were developed for analysis purposes and not for communicating models to
business people and employees [Mi10], which hampers their popularity. Thus, in the
following, we focus on eEPCs, UML ADs and BPMN. In addition, all of these languages
support modeling constructs such as “collapsed subprocesses” (BPMN), “sub-activities”
(UML AD), or “hierarchical functions” (eEPC) enabling the process design on different
model levels.
Enhanced Event-driven Process Chains (eEPCs): Event-driven Process Chains were
developed in the early 1990s for visualizing an integrated information system from a
business perspective (see [STA05]). The EPC is part of the ARIS framework (see
[STA05]). The ARIS framework comprises several views (e.g. data view, function view
or organization view) that can be used to specify an EPC-model through additional
information, for example data elements or organizational units [STA05]. In that context,
we speak of enhanced Event-driven Process Chains (eEPCs).
Business Process Modeling Notation (BPMN): BPMN was officially introduced in
2004. The idea was to create a graphical standard to complement executable business
process languages such as BPEL or BPML [MR08]. In the meantime,
Version 2.0 of the standard was released by the OMG. BPMN offers a variety of
graphical modeling elements which are separated into basic and extended elements
[OMG10].
UML Activity Diagrams (UML ADs): UML can be seen as a standard in the field of
object-oriented modeling [Ru06]. It plays a dominant role in software engineering, since
the functionality as well as the static structure of software can be described by several
diagram types [Ru06]. In that context, activity diagrams (UML ADs) are important for
modeling the business processes that software is supposed to support. In the meantime,
Version 2.4.1 of UML was released by the OMG [OMG11]. An “action” is the central element of
UML activity diagrams for describing the behavior within a business process [Ru06].
The terminology in the field of business process modeling techniques is not
standardized. We therefore stick to the terminology of Vanderfeesten et al. [Va07] which
can be used for nearly all common business process modeling languages. Therefore we
consider activities, events, data elements, directed arcs, connectors, and resources as
constructs of a process model. Contrary to Vanderfeesten et al. [Va07] we also list
events as separate elements, since events are an important concept during process
execution [Mi10] which is emphasized by modeling languages such as the EPC. We
adhere to this terminology in the following. This allows us to specify the decomposition
conditions regardless of the business process modeling languages used (e.g. eEPC,
BPMN etc.).
4 The Decomposition Model
Wand and Weber’s decomposition model [We97] focuses on the decomposition of
information systems and specifies five conditions: (1) minimality, (2) determinism, (3)
losslessness, (4) minimum coupling and (5) strong cohesion.
Some of the conditions can also be found in neighboring disciplines such as data
modeling or business process modeling (see e.g. [BCN92, Be95, Va07]). These findings
are referred to in order to specify the conditions for the purpose under study
appropriately. In addition, Green and Rosemann [GR00] as well as Recker et al. [Re09]
reflect modeling languages regarding the BWW model. Their results, too, help to specify
the conditions.
Minimality condition: Following Weber [We97] a decomposition “is good only if for
every subsystem at every level in the level structure of the system there are no redundant
state variables describing the subsystem”. In the information systems domain this means
that every subsystem of an information system should be characterized by the minimum
number of attributes necessary for describing the subsystem [We97]. Minimality is an
aspect that has also been addressed both in data modeling (see e.g. [BCN92]) and
business process modeling (see e.g. [Be95]). According to Batini et al. [BCN92] a model
is minimal if no object can be removed without causing a loss of information. Whether
there is a loss of information is to be judged by the end-user and is therefore highly
subjective. In addition, it is also the end-user who decides whether a specific modeling
element is necessary or not. As already stated, a software engineer may expect more
details in a process model than, for instance, a normal employee (see [BRB07]).
Therefore a modeling construct can be seen as needless if it is not required by the
end-user. Another important aspect of minimality is seen in avoiding redundancies. But,
while designing redundancy-free models is a realistic goal in data modeling, this does not
apply to business process models [Be95]. Therefore redundancies in business process
models are quite common and may be necessary to design semantically correct models.
Becker [Be95] gives some hints as to when activities in a business process model can be
merged to avoid redundancies. Nevertheless the user’s perception plays a central role in
deciding whether a construct within a model should be modeled more than once (see
[Be95]). Sometimes redundancy-free process models may be difficult to understand
because complex structures of different connectors (e.g. OR, XOR, AND) are needed.
Therefore we distinguish between wanted and unwanted redundancies; only unwanted
redundant elements should be avoided. The final decision whether an object in
a process model is to be considered unwanted redundant should be up to the end-user.
Therefore, to evaluate different designs of a process model as regards minimality we
propose the following (see table 1):
| No. | Verification of minimality | Metric |
|-----|----------------------------|--------|
| 1 | Number (#) of activities, events, data elements, resources that are not required by the end-user or unwanted redundant in relation to all activities, events, data elements, resources. | # not required or unwanted redundant activities, events, data elements, resources / # all activities, events, data elements, resources |

Table 1: Verification of minimality
The metric relates the number of such elements to the size of the business process
model. As mentioned, the end-user’s perspective is crucial at that point.
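As a minimal sketch, metric 1 could be computed as shown below. The `Element` class, its field names and the sample model are hypothetical and serve illustration only; the flags `required` and `unwanted_redundant` stand for the end-user's judgment, which cannot be derived from the model itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """One construct of a process model (hypothetical representation)."""
    name: str
    kind: str                         # "activity", "event", "data element" or "resource"
    required: bool = True             # end-user judgment, not derivable from the model
    unwanted_redundant: bool = False  # end-user judgment, not derivable from the model


def minimality_metric(elements):
    """Metric 1: flagged elements in relation to all elements (0.0 is best)."""
    if not elements:
        raise ValueError("the model contains no elements")
    flagged = sum(1 for e in elements if not e.required or e.unwanted_redundant)
    return flagged / len(elements)


# Illustrative model: one of four elements is judged unwanted redundant.
model = [
    Element("check purchase order", "activity"),
    Element("order complete", "event"),
    Element("purchase order", "data element"),
    Element("check purchase order (copy)", "activity", unwanted_redundant=True),
]
print(minimality_metric(model))
```

A value of 0 would indicate that, from the end-user's point of view, the model contains neither needless nor unwanted redundant elements.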
Determinism condition: According to Weber [We97] determinism can be defined the
following way: “For a given set of external (input) events at the system level, a
decomposition is good only if for every subsystem at every level in the level structure of
the system an event is either (a) an external event, or (b) a well-defined internal event”.
The decomposition model mentions internal and external events [We97, Re09, GR00].
According to Burton-Jones and Meso [BM02], internal events are those events that occur
during the execution of a process. Whether a specific internal event occurs depends on
decisions made or activities performed. For example, the completeness check of a
“purchase order” indicates that an order is either “complete” or “incomplete”, depending
on prior steps in the process. The decomposition model requires internal events to be
“well-defined” [We97, Re09, GR00]. This means that knowledge concerning the prior state
enables a user to predict the subsequent event that will occur [We97]. In literature, there
has been discussion about the relation between OR-splits and their effect on the
instantiation of a process [ADK02]. It becomes obvious that the use of OR-splits often
leads to designs in which events or subsequent states are hard to predict and may lead to
complications during the actual execution of the process [ADK02]. Therefore the
determinism of a process model suffers from the use of OR-splits. In addition, Cardoso
showed the negative effect of OR-splits on the understandability of process models (see
[Ca05]). Therefore he introduced the Control-Flow-Complexity-Metric [Ca05] that
relates the complexity of a process model to the use of specific connectors. Negative
impacts on the understandability of business process models are also caused by
XOR-splits that are not based on conditional expressions. BPMN models using event-based
XOR-splits, for example, are hard to interpret, since the branch to be chosen after the
XOR-split depends on an event occurring, mainly the receipt of a message [OMG10]. In
that case, internal events are only modeled in an implicit way, while the process flow
actually comes to an abrupt stop at that point. External events, on the other hand, are
triggered by factors that are beyond a company’s influence, for instance, a server crash at
a supplier which prevents the regular stockpiling of the company’s warehouse [We97].
While the existence of such events should be recognized, it is hard to predict their effects
on the actual process execution. When external events are known, activities to react to
these external influences can be specified within a process model. Nevertheless it is
often hard to identify all external events that may have an impact on a process. Therefore
a modeler can only be expected to model external events, insofar as she/he is able to
identify them. If a process model has few external events this can either be an indicator
that the process is only little affected by external influences or that the modeler has not
identified all external events properly. Despite these problems the relation between the
number of external events and all events of the model can be used to judge to which
degree a process model is shaped by external events. To evaluate different designs of
a process model we therefore propose the following:
| No. | Verification of determinism | Metric |
|-----|-----------------------------|--------|
| 2 | Number (#) of OR-splits in relation to all Split-operations of the model. | # OR-splits / # all Split-operations |
| 3 | Control-Flow-Complexity-Metric according to [Ca05]. OR-splits have the most negative impact on the complexity of the model. | see [Ca05] |
| 4 | Number (#) of XOR-splits that are not based on conditional expressions in relation to all Split-operations of the model. | # XOR-splits not based on conditional expressions / # all Split-operations |
| 5 | Number (#) of external events in relation to all events of the model. | # external events / # events of the model |

Table 2: Verification of determinism
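The determinism metrics can be illustrated with a short sketch. The `Split` class and all function names are hypothetical. For metric 3 the sketch assumes the commonly cited form of the Control-Flow-Complexity-Metric, in which an XOR-split with n outgoing branches contributes n, an OR-split contributes 2^n - 1 and an AND-split contributes 1; [Ca05] remains the authoritative definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Split:
    """A split connector of a process model (hypothetical representation)."""
    kind: str                 # "AND", "XOR" or "OR"
    fan_out: int              # number of outgoing branches
    conditional: bool = True  # XOR only: branches carry explicit conditions


def or_split_ratio(splits):
    """Metric 2: OR-splits in relation to all split operations."""
    return sum(s.kind == "OR" for s in splits) / len(splits)


def control_flow_complexity(splits):
    """Metric 3 (assumed form): XOR adds n, OR adds 2**n - 1, AND adds 1."""
    weights = {
        "XOR": lambda n: n,
        "OR": lambda n: 2 ** n - 1,
        "AND": lambda n: 1,
    }
    return sum(weights[s.kind](s.fan_out) for s in splits)


def unconditional_xor_ratio(splits):
    """Metric 4: XOR-splits without conditional expressions over all splits."""
    flagged = sum(s.kind == "XOR" and not s.conditional for s in splits)
    return flagged / len(splits)


def external_event_ratio(n_external, n_events):
    """Metric 5: external events in relation to all events of the model."""
    return n_external / n_events


# Illustrative model with four splits and eight events, two of them external.
splits = [
    Split("XOR", 2),
    Split("XOR", 3, conditional=False),  # e.g. an event-based XOR-split
    Split("OR", 2),
    Split("AND", 2),
]
print(or_split_ratio(splits))           # metric 2
print(control_flow_complexity(splits))  # metric 3
print(unconditional_xor_ratio(splits))  # metric 4
print(external_event_ratio(2, 8))       # metric 5
```

The exponential weight of the OR-split reflects the number of branch combinations that may be activated, which is why OR-splits have the strongest effect on this metric.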
Losslessness condition: Weber [We97] states that “a decomposition is good only if
every hereditary state variable and every emergent state variable in a system is
preserved in the decomposition”. Simply speaking, the decomposition model demands
“not to lose properties” of a thing that is being decomposed [WW89, We97]. No
information may be lost during the decomposition. The ideas of Moody [Mo98]
concerning the completeness of data models can be used to specify this aspect for
business process models. A model therefore suffers from “losses”, if certain constructs
(e.g. activities, events) are required by the target group but cannot be found in the
process model itself. The perspective of the target group once again becomes decisive in
that context. In addition, Weber [We97] exemplifies that decomposition can lead to a
false reproduction of the real world. This means that the semantics of a business process
model may be distorted during decomposition and losses of the required semantics will
occur. Considering resources can be of great help during decomposition. While two
activities may look equal at first sight (e.g. “checking account”), they can be different
regarding both the person performing the activity and the resources needed (see [Be95]).
The underlying semantics can be completely different for these activities (e.g. “checking
private customers’ account” vs. “checking business customers’ account”). What is more,
syntactical errors occurring during decomposition will lead to misinterpretations and
losses of the required semantics, too. As a consequence, the syntactical correctness of a
model must be guaranteed for all model levels. Therefore “losslessness” of a model can
be checked by means of the following metrics; the relation once again considers the size
of the model:
| No. | Verification of losslessness | Metric |
|-----|------------------------------|--------|
| 6 | Number (#) of missing activities, events, data elements, resources on all model levels considering an original model (or the requirements of a user). | # missing activities, events, data elements, resources / # all activities, events, data elements, resources of an original model (or the requirements of a user) |
| 7 | Number (#) of wrongly designed constructs (syntactically and semantically) in relation to all required constructs. | # wrongly designed constructs / # all required constructs |

Table 3: Verification of losslessness
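A minimal sketch of metrics 6 and 7 is given below, assuming that the constructs of the original model (or of the user requirements) and of the decomposed model are available as sets of names collected over all model levels. The function names and the sample data are hypothetical; which constructs count as required or wrongly designed remains a judgment of the target group.

```python
def missing_ratio(required, modeled):
    """Metric 6: required constructs missing from the decomposed model."""
    required = set(required)
    if not required:
        raise ValueError("no required constructs given")
    return len(required - set(modeled)) / len(required)


def wrongly_designed_ratio(wrongly_designed, required):
    """Metric 7: wrongly designed constructs in relation to all required ones."""
    required = set(required)
    if not required:
        raise ValueError("no required constructs given")
    return len(set(wrongly_designed) & required) / len(required)


# Illustrative data: constructs required by the target group ...
required = {"check account", "account checked", "customer record", "clerk"}
# ... constructs found on any level of the decomposed model ...
modeled = {"check account", "account checked", "clerk"}
# ... and constructs judged syntactically or semantically wrong.
wrong = {"account checked"}

print(missing_ratio(required, modeled))         # metric 6
print(wrongly_designed_ratio(wrong, required))  # metric 7
```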
Minimum coupling condition: Weber [We97] states that “a decomposition has
minimum coupling iff the cardinality of the totality of input for each subsystem of the
decomposition is less than or equal to the cardinality of the totality of input for each
equivalent subsystem in the equivalent decomposition”. Another aspect of the
decomposition model addresses the coupling of the subsystems [We97]. The condition
demands a minimum coupling which requires a minimum cardinality of the totality of
the input [We97]. In process models, inputs are seen as data elements and the minimum
cardinality refers to the number of relations between incoming data elements and
activities. In the context of business process modeling, this idea is also supported by
Vanderfeesten et al. [Va07]. According to Vanderfeesten et al. [Va07, VCR07]
“coupling” measures the number of interconnections between the activities of a business
process model. It thus shows to what extent the various activities depend on
each other [VRA08]. “Two activities are coupled, if they contain one or more common
data element(s)” [Va07]. Accordingly, the degree of coupling of a business process
model can be calculated by counting the number of coupled pairs (see [Va07, VRA08]).
The activities have to be selected pairwise beforehand. The mean is then determined on
the basis of the total number of activities [Va07]. This approach has a strong focus on
the data elements. With “minimal coupling” the activities in a business process model
are neither too small nor too big (see [Va08]). Nevertheless, Wand and Weber admit that
the meaning of “minimum coupling” is unclear and different interpretations can be found
in literature (see [We97]). A further interpretation of Wand and Weber’s definition of
“minimum coupling” in business process models could be seen in the possibility to
measure the strength of the connection between the activities (see [Va08]). In that case,
the focus would be mainly on the control flow. The degree of coupling depends on the
complexity and the type of connections (e.g. XOR, AND, OR) between the activities
[VCR07]. In Vanderfeesten et al. [Va08] the so-called “Cross-Connectivity-Metric
(CC)” is introduced for that purpose. The coupling of a business process model is thus
determined by the complexity of the connections between its activities. To compare
different designs as regards their degree of “coupling”, the following metrics can be used
(the relation once again considers the size of the model):
| No. | Verification of minimum coupling | Metric |
|-----|----------------------------------|--------|
| 8 | Number (#) of “coupled pairs” (activities sharing the same data element) in relation to all activities (see [Va07]). | # coupled pairs / (# all activities * (# all activities - 1)) |
| 9 | Cross-Connectivity-Metric according to [Va08]. The strength of the connections between activities is considered by assigning weightings to the paths of the model. | see [Va08] |

Table 4: Verification of coupling
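Metric 8 can be sketched as follows. The mapping from activity names to data elements is a hypothetical representation, and the sketch assumes that coupled pairs are counted as ordered pairs, so that the denominator n(n-1) yields a value between 0 and 1; [Va07] remains the authoritative definition.

```python
from itertools import permutations


def coupling(activities):
    """Metric 8: coupled (ordered) pairs of activities over n * (n - 1).

    `activities` maps an activity name to the set of data elements it uses.
    Two activities are coupled if they share at least one data element.
    """
    n = len(activities)
    if n < 2:
        return 0.0  # a single activity cannot be coupled to another one
    coupled = sum(
        1 for a, b in permutations(activities, 2)
        if activities[a] & activities[b]
    )
    return coupled / (n * (n - 1))


# Illustrative model: A-B share "order", B-C share "invoice", D is isolated.
model = {
    "A": {"order"},
    "B": {"order", "invoice"},
    "C": {"invoice"},
    "D": {"report"},
}
print(coupling(model))  # 4 ordered coupled pairs out of 12
```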
Strong cohesion condition: According to Weber [We97] “a set of outputs is maximally
cohesive if all output variables affected by input variables are contained in the same set,
and the addition of any other output to the set does not extend the set of inputs on which
the existing outputs depend and there is no other output which depends on any of the
input set defined by the existing output set”. Whereas coupling tends to enlarge the size
of an activity, cohesion downsizes activities [We97]. The “strong cohesion condition”
requires for each activity of the process model that all output of an activity depends upon
its input (see [We97, VRA08]). In literature, only few publications can be found on the
“cohesion” of a business process model. Exceptions are Vanderfeesten et al. [VRA08]
and Reijers and Vanderfeesten [RV04] who introduce metrics for measuring cohesion. A
strong focus is placed on the “data elements” within an activity. These data elements are
processed by operations. Operations can be understood as small parts of work within an
activity [Re03]. Strong cohesion is given, if operations within an activity overlap by
sharing “data elements”, either as input or as output (activity relation cohesion according
to [VRA08]). In addition, strong cohesion is also dominant when several of the data
elements within an activity are used more than once (activity information cohesion
according to [VRA08]). This definition comes very close to the definition in Wand and
Weber’s decomposition model, because both define cohesion as mainly data-driven
and focus on the processing of data elements within the activities.
In summary, the cohesion of an activity is determined by the extent to which the
operations of an activity “belong” to each other [VRA08, RV04]. Vanderfeesten et al.
[VRA08] propose three metrics to determine the cohesion of an activity. The final
process cohesion is then calculated on the basis of the cohesion values of the activities.
| No. | Verification of strong cohesion | Metric |
|-----|---------------------------------|--------|
| 10 | The activity relation cohesion determines in how far the operations within one activity are related with one another [VRA08]. | see [VRA08] |
| 11 | The activity information cohesion determines how many data elements are used more than once in relation to all the data elements [VRA08]. | see [VRA08] |
| 12 | The activity cohesion is the product of the activity relation cohesion and the activity information cohesion [VRA08]. | see [VRA08] |

Table 5: Verification of cohesion
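The three cohesion metrics can be sketched as follows, based on the verbal description above. The representation is hypothetical: each operation is modeled as the set of data elements it touches (inputs and output), relation cohesion is taken as the share of ordered operation pairs that overlap, and the process cohesion is assumed to be the mean of the activity cohesion values. The exact formulas are given in [VRA08].

```python
from itertools import permutations


def relation_cohesion(ops):
    """Metric 10: share of ordered operation pairs sharing a data element."""
    n = len(ops)
    if n < 2:
        return 0.0
    overlapping = sum(1 for a, b in permutations(ops, 2) if a & b)
    return overlapping / (n * (n - 1))


def information_cohesion(ops):
    """Metric 11: data elements used by more than one operation over all."""
    all_elements = set().union(*ops)
    if not all_elements:
        return 0.0
    shared = {d for d in all_elements if sum(d in op for op in ops) > 1}
    return len(shared) / len(all_elements)


def activity_cohesion(ops):
    """Metric 12: product of relation cohesion and information cohesion."""
    return relation_cohesion(ops) * information_cohesion(ops)


def process_cohesion(activities):
    """Assumed aggregation: mean activity cohesion over all activities."""
    return sum(activity_cohesion(ops) for ops in activities) / len(activities)


# Illustrative activity with three operations; only "c" is shared.
activity = [
    frozenset({"a", "b", "c"}),
    frozenset({"c", "d"}),
    frozenset({"e", "f"}),
]
print(relation_cohesion(activity))     # 2 of 6 ordered pairs overlap
print(information_cohesion(activity))  # 1 of 6 data elements is shared
print(activity_cohesion(activity))
```

Note that the operations and intermediate data elements inside an activity are assumed to be known here; they cannot be read from the process model alone.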
Although the conditions introduced are named decomposition conditions, they do not
facilitate the procedure of decomposition. They are applied on the basis of the results of
the decomposition and enable the evaluation of a decomposed process model by means
of metrics (introduced above). The metrics’ values help to compare different alternative
models, although the interpretation of differences between the metrics’ values remains
an open issue. Is it worthwhile to reduce the value of coupled pairs in relation to all
activities from 0.3 to 0.1, for example? Furthermore it has to be considered that the use
of these metrics means an additional effort, since all metrics introduced can be calculated
for all model levels of the designed alternatives. Since the decomposition of a process model
into several, more detailed model levels always means adding semantics, user
specifications have to be regarded as well. Therefore some metrics cannot be directly
derived from the process models but have to draw on users’ knowledge or specification
documents. These metrics are part of the conditions “losslessness” and “minimality”.
5 Evaluation of the Business Process Modeling Languages
5.1 Requirements based on the decomposition conditions
In the following, we derive requirements on modeling languages by looking at our
specification of the decomposition conditions (see section 4) and the corresponding
metrics that reflect our interpretation. In doing so, each requirement (see table 6) can be
directly associated with the related decomposition condition as well as certain facets of our
interpretation.
To fulfill the minimality condition according to our interpretation from section 4, a
process model should not include unwanted redundant and not required elements (see
also metric 1). Not only the decision whether an element is unwanted redundant, but also
whether it is required or not, is up to the user. In this regard, the context of the process is
decisive. Whereas the modeling language offers modeling constructs to represent the
process, the user specifies them taking into account the context of the process. The
modeling language cannot prevent the user from misinterpreting the requirements
resulting from the context of the process (see [Mi10]). Therefore requirements for this
condition cannot be defined.
The determinism condition, as it has been specified in section 4, requires a predictable
control-flow of the process which implies that all internal events are well-defined. The
aforementioned OR-splits often lead to designs in which events or subsequent states are
hard to predict (see [ADK02]). This also becomes evident by metrics 2 and 3 we have
introduced. In addition, if the conditions of the outgoing branches of an XOR-connector
are not explicitly defined, the subsequent state is not determinable either (see [OMG10]).
This aspect is addressed by metric 4: the number of XOR-splits that are not based
on conditional expressions should be minimal. A process modeling language should
therefore enable the definition of conditions to specify the outgoing branches of an
XOR-connector (requirement 1 – see table 6) and should not support an OR-connector
(requirement 3 – see table 6) (see also metrics 2, 3 and 4). In contrast to poorly defined
internal events, poorly defined external events are permitted in a good decomposition
(see [We97, GR00]). This is due to the fact that it is often not possible to predict a
priori the subsequent state that occurs as a result of an external event. The “determinism
condition” only demands to represent external events in a model. External events in a
process model are counted by metric 5 resulting from our specification of the condition.
Accordingly, the process modeling language should be able to display external events
(requirement 2 – see table 6).
To fulfill the losslessness condition according to our interpretation (see section 4),
hereditary and emergent elements of a process are to be preserved in the decomposition.
Since a missing (see metric 6) or wrongly designed element (see metric 7) can only be
identified on the basis of the knowledge of users or specification documents,
the process modeling language itself is not able to support this condition. To detect
syntactically wrongly designed elements, however, the process modeling language has to be
specified by means of its metamodel (requirement 4 – see table 6).
In order to be able to define the minimum cardinality of the minimum coupling condition
(according to section 4) and evaluate a process regarding metric 8, the process modeling
language has to display inputs and the flow between data elements and activities
(requirement 5 – see table 6). Earlier on (section 4), we made a suggestion to fulfill the
minimum coupling condition which is not based on inputs, namely to investigate the
strength of the connections between the activities by applying the Cross-Connectivity-Metric
(see [Va08] and metric 9). The strength of the connection between the activities
is measured considering all nodes (activities and connectors) and arcs. Therefore the
process modeling language has to display activities, connectors as well as the arcs
between them (requirement 6 – see table 6).
| Decomposition condition | Requirements | Corresponding metrics |
|-------------------------|--------------|-----------------------|
| Minimality | No requirements can be defined | metric 1 |
| Determinism | The process modeling language has to provide modeling constructs for conditions to specify outgoing arcs of an XOR-connector (requirement 1) | metric 4 |
| | The process modeling language has to provide modeling constructs for external events (requirement 2) | metric 5 |
| | The process modeling language should not support an OR-connector (requirement 3) | metrics 2, 3 |
| Losslessness | The process modeling language is defined by its metamodel (requirement 4) | metrics 6, 7 |
| Minimal coupling | The process modeling language has to provide modeling constructs for input data elements and the flow between data elements and activities (requirement 5) | metric 8 |
| | The process modeling language has to provide modeling constructs for activities, connectors and arcs between them (requirement 6) | metric 9 |
| Strong cohesion | The process modeling language has to provide modeling constructs for input data elements (requirement 7), output data elements (requirement 8), intermediate data elements (requirement 9) and the flow between the data elements (requirement 10) | metrics 10, 11, 12 |

Table 6: Requirements on business process modeling languages
The strong cohesion condition (introduced in section 4) is related to the
functionality a subsystem performs [WW89, We97] and requires for each activity of the
process model that all output of an activity depends upon its input [We97]. To be able to
measure cohesion with the suggested metrics (see metrics 10, 11 and 12) the process
modeling language has to display all inputs and outputs for every activity (requirements
7 and 8 – see table 6). The flow between input and output as well as any intermediate
data elements that arise between them are to be regarded as well
(requirements 9 and 10 – see table 6). A short overview of the identified requirements is
given in table 6. It becomes obvious that ten requirements can be derived from our
interpretation of the decomposition conditions. These requirements cover the range of
modeling constructs needed to evaluate a process model regarding the decomposition
conditions. The process of modeling a real-world situation is, however, subjective and
thus not considered at this point.
5.2 Capabilities of business process modeling languages
Support of determinism: The determinism condition focuses on modeling constructs
representing external events, OR-connectors, and conditional expressions related to
XOR-operations (see sections 4 and 5.1). In eEPCs, the outgoing branches of an XOR-split
are specified by the events to follow. While it is possible in modeling tools such as
ARIS to add attributes to the arcs which specify conditional expressions, these are
usually not modeled on a graphical level. In recent years, the eEPC notation was
enhanced by modeling constructs for visualizing inter-organizational business processes
(see [KKS04]). As a consequence, external events of cooperation partners, too, become
evident. It is also possible to use start events for expressing external events (see [GR00]).
The eEPC-notation provides an OR-connector. BPMN supports the exclusive gateway.
The decision which one of the outgoing arcs of the exclusive gateway is chosen depends
on a condition that is visualized by labeling the outgoing arcs [OMG10]. BPMN offers a
variety of event-types and different triggers [OMG10] that can be used to visualize the
occurrence of an external event in the process model [Re09]. BPMN supports OR-connectors
as well. In UML ADs, the decision node (and a corresponding conditional
expression) is used for XOR-operations [Ru06]. UML 2.0 introduces the “accept event”
which can be used to express external events [Ru06]. Contrary to BPMN and eEPCs,
OR-operations are not considered by UML ADs.
Support of losslessness: For all notations considered, official metamodels are available
(see [Sch98, OMG10, OMG05]). But these metamodels are either too focused on
specific aspects of the modeling language (e.g. for BPMN: metamodel for choreography
activity, artifacts metamodel, external relationship metamodel) or mainly address
technical aspects. Nevertheless, literature provides metamodels which were derived from
the available specifications, providing a more manageable means for a practitioner to
design syntactically correct business process models. In that context, Rosemann [Ro96]
presents a comprehensive metamodel for eEPCs which also takes into account connectors
and views, while Korherr and List [KL07] design a metamodel for BPMN. Bordbar and
Staikopoulos [BS04] develop a metamodel for UML ADs in particular.
In summary, metamodels exist in literature which are less complex than those presented
in the official specifications, helping a practitioner to design syntactically correct
models.
Support of minimal coupling: On the one hand, minimum coupling can be determined
by the interconnections between functions/activities/actions based on common data
elements. On the other hand, the structure of the process model provides information to
calculate the coupling degree [Va08, VCR07]. The first option takes a data-oriented
view while the second option focuses on the control flow. All modeling languages
considered offer modeling constructs to calculate the coupling degree according to our
specification and design models with “minimum coupling”. eEPCs support the data view
(see [STA05]), while in BPMN data objects are used for presenting both information and
data (see [OMG10]). UML ADs have object nodes representing data elements that are
transferred from one action to another one [OMG05]. These can either be attached to an
action symbol as a “pin” or to an object flow. In all modeling languages the connection
between the functions/activities/actions is the control-flow.
Support of strong cohesion: The strong cohesion condition is based on a data-oriented
view (see [VRA08, Re03]). As already stated, eEPCs support data elements, while the
distinction between input and output data elements is possible. However, the flow
between the data elements themselves is not visualized within an eEPC (see [STA05]).
Additional diagrams would be necessary in that context (see [STA05]). In addition,
possible intermediate data elements that are produced within a “basic” function while
transforming an input data element to an output data element are not modeled. If the
function was a “hierarchical function” with further model levels subjacent, additional
data elements would be given. In BPMN, data objects can be differentiated as data input
and data output on a graphical level, while the flow between the data objects is not
explicitly modeled (see [OMG10]). Regarding basic activities no intermediate data
elements are modeled that may be produced within the activity to create the final output
data (see [OMG10]). In UML ADs, object nodes represent data elements, while the
object flow or the “pin symbol” characterizes them as input or output data (see
[Ru06]). While all modeling languages considered support input and output data,
additional diagrams are necessary to highlight the flow between the data elements.
Intermediate data elements within a function/activity/action in the sense of
Vanderfeesten et al. [VRA08] are not explicitly modeled or supported. Therefore the
degree of cohesion [VRA08] cannot be calculated by just looking at the process models.
| Decomposition condition | Requirement | eEPC | BPMN | UML AD |
|-------------------------|-------------|------|------|--------|
| Determinism | Requirement 1 (conditions for arcs of a XOR-connector) | x | ✓ | ✓ |
| | Requirement 2 (constructs for external events) | ✓ | ✓ | ✓ |
| | Requirement 3 (no support of OR-connector) | x | x | ✓ |
| Losslessness | Requirement 4 (definition of a metamodel) | 0 | 0 | 0 |
| Minimal coupling | Requirement 5 (constructs for input data elements and flow between data elements and activities) | ✓ | ✓ | ✓ |
| | Requirement 6 (activities, connectors and arcs between them) | ✓ | ✓ | ✓ |
| Strong cohesion | Requirement 7 (constructs for input data elements) | ✓ | ✓ | ✓ |
| | Requirement 8 (constructs for output data elements) | ✓ | ✓ | ✓ |
| | Requirement 9 (constructs for intermediate data elements) | x | x | x |
| | Requirement 10 (constructs for flow between data elements) | x | x | x |

Key: ✓: fulfilled; 0: partly fulfilled; x: not fulfilled

Table 7: Results of the evaluation
Table 7 summarizes the findings. None of the modeling languages entirely fulfills the
requirements derived in section 5.1. Major differences between the languages can be
seen in the requirements regarding the determinism condition.
Some restrictions become obvious when evaluating the languages against the
requirements derived from the losslessness and strong cohesion conditions.
6 Summary and Outlook
The use of Wand and Weber’s decomposition model for business process modeling is
meant to facilitate the decomposition of process models and thereby make the models
easier for their users to comprehend. This assumption underlies our overall research
project and still has to be validated empirically; the present paper is only the first
step of that investigation. The objective was to find out which of the three
selected business process modeling languages (BPMN, eEPC, UML AD) is best able to
support the decomposition conditions. A first result is that requirements on business
process modeling languages could not be defined for all decomposition conditions. For
most of the requirements the capabilities of the modeling languages do not vary, which
underlines how similar the languages are. The main differences were found in the
fulfillment of the requirements of the determinism condition. An important result to be
incorporated into the research project is that the business process modeling languages
can meet most of the requirements and that all deficiencies can be compensated for by
supplementary models or an extension of the process modeling language. In that context,
it is of special interest that none of the business process modeling languages is
capable of modeling the data elements as required for the strong cohesion condition.
Intermediate data elements as well as the flow between the data elements have to be
documented in supplementary models, which will be examined in the empirical
validation of the decomposition conditions. The results of the investigation underline the
need for a better integration of data elements into business process modeling. As a
limitation, it has to be stated that the requirements on the modeling
languages were derived from the authors’ interpretation of the decomposition conditions
by Wand and Weber [WW89, WW90, We97]. The conditions were specified by metrics
allowing an objective evaluation of different design alternatives. Nevertheless, there may
be other interpretations of these conditions in the context of business process modeling.
Just as process modeling itself is a subjective task, evaluation procedures in the field of
process modeling may also be subject to subjectivity. This applies to section 5.2 in
particular. In addition, the investigation is limited because only three process modeling
languages were investigated. In the next steps of the research project we aim to validate
the decomposition model and to derive a decomposition method which comprises principles
and practical guidelines for business analysts.
References
[ADK02] van der Aalst, W.M.P.; Desel, J.; Kindler, E.: On the semantics of EPCs: A vicious circle. In: EPK 2002: Business Process Management using EPCs, 2002; p. 71-80.
[BCN92] Batini, C.; Ceri, S.; Navathe, S.B.: Conceptual Database Design - An entity relationship approach. Benjamin/Cummings Publishing, Redwood City et al., 1992.
[Be95] Becker, J.: Strukturanalogien in Informationsmodellen: Ihre Definition, ihr Nutzen und ihr Einfluß auf die Bildung von Grundsätzen ordnungsmäßiger Modellierung (GoM). Wirtschaftsinformatik 95. Physica, Heidelberg, 1995, p. 133-150.
[BM02] Burton-Jones, A.; Meso, P.: How Good Are These UML Diagrams? An Empirical Test of the Wand and Weber Good Decomposition Model. In: International Conference on Information Systems (ICIS), 2002; p. 101-114.
[BM06] Burton-Jones, A.; Meso, P.: Conceptualizing Systems for Understanding: An Empirical Test of Decomposition Principles in Object-Oriented Analysis. Information Systems Research 2006; 17:38-60.
[BM08] Burton-Jones, A.; Meso, P.N.: The Effects of Decomposition Quality and Multiple Forms of Information on Novices’ Understanding of a Domain from a Conceptual Model. Journal of the Association for Information Systems 2008; 9:748-802.
[BRB07] Bobrik, R.; Reichert, M.; Bauer, T.: View-Based Process Visualization. Lecture Notes in Computer Science 2007; 4714/2007:88-95.
[Br06] vom Brocke, J.: Design Principles for Reference Modelling - Reusing Information Models by Means of Aggregation, Specialisation, Instantiation, and Analogy. In (Fettke, P., Loos, P. eds.): Reference Modelling for Business Systems Analysis. Idea Group Publishing, Hershey, USA, 2006.
[BS04] Bordbar, B.; Staikopoulos, A.: On Behavioural Model Transformation in Web Services. Lecture Notes in Computer Science 2004; 3289/2004:667-678.
[BWW09] Becker, J.; Weiß, B.; Winkelmann, A.: A Business Process Modeling Language for the Banking Sector - A Design Science Approach. In: Fifteenth Americas Conference on Information Systems (AMCIS), 2009; p. 1-11.
[Ca05] Cardoso, J.: How to Measure the Control-flow Complexity of Web Processes and Workflows. In (Fischer, L. ed.): Workflow Handbook. Lighthouse Point 2005.
[FS06] Ferstl, O.K.; Sinz, E.J.: Modeling of Business Systems Using SOM. In (Bernus, P., Mertins, K., Schmidt, G. eds.): Handbook on Architectures of Information Systems. Springer, Berlin etc., 2006, p. 347-367.
[GL07] Gruhn, V.; Laue, R.: Approaches for Business Process Model Complexity Metrics. In (Abramowicz, W., Mayr, H.C. eds.): Technologies for Business Information Systems. Springer, Berlin, 2007; p. 13-24.
[GR00] Green, P.; Rosemann, M.: Integrated process modeling: An ontological evaluation. Information Systems 2000; 25:73-87.
[He09] Heinrich, B. et al.: The process map as an instrument to standardize processes: design and application at a financial service provider. ISeB 2009; 7:81-102.
[He04] Hevner et al.: Design Science in Information Systems Research. MISQ 2004; 28:75-105.
[KKS04] Klein, R.; Kupsch, F.; Scheer, A.-W.: Modellierung inter-organisationaler Prozesse mit Ereignisgesteuerten Prozessketten, 2004.
[KL07] Korherr, B.; List, B.: Extending the EPC and the BPMN with Business Process Goals and Performance Measures. In: 9th ICEIS, 2007.
[KLS95] Krogstie, J.; Lindland, O.I.; Sindre, G.: Towards a Deeper Understanding of Quality in Requirements Engineering. In: Proceedings of the 7th CAISE, 1995; p. 82-95.
[Ma99] Malone, T.W. et al.: Tools for Inventing Organizations: Toward a Handbook of Organizational Processes. Management Science 1999; 45:425-443.
[Me09] Mendling, J.: Metrics for Process Models - Empirical Foundations of Verification, Error Prediction, and Guidelines for Correctness. Springer, Berlin et al., 2009.
[Mi10] Mili, H. et al.: Business process modeling languages: Sorting through the alphabet soup. ACM Computing Surveys 2010; 43:1-54.
[Mo98] Moody, D.L.: Metrics for Evaluating the Quality of Entity Relationship Models. Lecture Notes in Computer Science 1998; 507/1998:211-225.
[MR08] zur Muehlen, M.; Recker, J.: How Much Language Is Enough? Theoretical and Practical Use of the Business Process Modeling Notation. Lecture Notes in Computer Science 2008; 5074/2008:465-479.
[MRA10] Mendling, J.; Reijers, H.; van der Aalst, W.: Seven process modeling guidelines. Information and Software Technology 2010; 52:127-136.
[MRC07] Mendling, J.; Reijers, H.A.; Cardoso, J.: What Makes Process Models Understandable? Lecture Notes in Computer Science 2007; 4714/2007:48-63.
[OMG05] OMG Unified Modeling Language (OMG UML) – Superstructure, 2005.
[OMG10] OMG: Business Process Model and Notation (BPMN) – Version 2.0, 2010.
[OMG11] OMG Unified Modeling Language, Infrastructure – Version 2.4.1, 2011.
[Ös95] Österle, H.: Business in the information age. Springer, Berlin et al., 1995.
[Re03] Reijers, H.A.: A Cohesion Metric for the Definition of Activities in a Workflow Process. In: Eighth CAiSE/IFIP8.1 International Workshop on Evaluation of Modeling Methods in Systems Analysis and Design, 2003, p. 116-125.
[Re09] Recker, J. et al.: Business process modeling: a comparative analysis. Journal of the Association for Information Systems 2009; 10:333-363.
[RI07] Recker, J.; Indulska, M.: An Ontology-Based Evaluation of Process Modeling with Petri Nets. Interoperability in Business Information Systems 2007; 2:45-64.
[RM08] Reijers, H.; Mendling, J.: Modularity in Process Models: Review and Effects. Lecture Notes in Computer Science 2008; 5240:20-35.
[RMD10] Reijers, H.A.; Mendling, J.; Dijkman, R.: On the Usefulness of Subprocesses in Business Process Models. BPM Report, 2010.
[Ro96] Rosemann, M.: Komplexitätsmanagement in Prozeßmodellen. Gabler-Verlag, Wiesbaden, 1996.
[Ro09] Rosemann, M. et al.: Using ontology for the representational analysis of process modelling techniques. International Journal of Business Process Integration and Management Decision 2009; 4:251-265.
[RRK07] Recker, J.; Rosemann, M.; Krogstie, J.: Ontology- Versus Pattern-Based Evaluation of Process Modeling Languages: A Comparison. Communications of the Association for Information Systems 2007; 20:774-799.
[Ru06] Russell, N. et al.: On the suitability of UML 2.0 activity diagrams for business process modelling. In: 3rd Asia-Pacific conference on Conceptual modelling, 2006.
[RV04] Reijers, H.A.; Vanderfeesten, I.T.P.: Cohesion and Coupling Metrics for Workflow Process Design. Lecture Notes in Computer Science 2004; 3080:290-305.
[Sch98] Scheer, A.-W.: ARIS - Modellierungsmethoden - Metamodelle - Anwendungen. Springer, Berlin et al., 1998.
[SR98] Schütte, R.; Rotthowe, T.: The Guidelines of Modeling – An Approach to Enhance the Quality in Information Models. LNCS 1998; 1507/1998:240-254.
[STA05] Scheer, A.-W.; Thomas, O.; Adam, O.: Process Modeling Using Event-Driven Process Chains. In (Dumas, M., van der Aalst, W., Hofstede, A.T. eds.): Process-aware information systems. John Wiley and Sons 2005, p. 119-146.
[Va07] Vanderfeesten, I.T.P. et al.: Quality Metrics for Business Process Models. In (Fischer, L. ed.): BPM and Workflow Handbook 2007. Future Strategies, p. 179-190.
[Va08] Vanderfeesten, I. et al.: On a Quest for Good Process Models: The Cross-Connectivity Metric. Lecture Notes in Computer Science 2008; 5074:480-494.
[VCR07] Vanderfeesten, I.; Cardoso, J.; Reijers, H.A.: A weighted coupling metric for business process models. In: Proceedings of the CAiSE 2007, p. 41-44.
[VRA08] Vanderfeesten, I.; Reijers, H.A.; van der Aalst, W.M.P.: Evaluating workflow process designs using cohesion and coupling metrics. Computers in Industry 2008; 59:420-437.
[We97] Weber, R.: Ontological Foundations of Information Systems, Queensland, 1997.
[WW89] Wand, Y.; Weber, R.: A Model of Systems Decomposition. In: Tenth International Conference on Information Systems, 1989; p. 42-51.
[WW90] Wand, Y.; Weber, R.: Toward a theory of the deep structure of information systems. In: International Conference on Information Systems, 1990; p. 61-71.
Language-Based Matching of Domain Semantics in
Heterogeneous Business Process Models
Janina Fengel, Kerstin Reinking
Fachbereich Wirtschaft
Hochschule Darmstadt
Haardtring 100
64295 Darmstadt
janina.fengel@h-da.de
kerstin.reinking@h-da.de
Abstract: Over time, business process modeling in enterprises produces
collections of differing models. When these have to be consolidated,
semantic differences hamper content-based matching, even though such matching
is a precondition for integrating them, for example in analyses,
corporate restructuring, mergers or the introduction of standards. Besides
the semantic heterogeneity caused by the use of different modeling languages,
a main obstacle to automated model matching lies in how natural language,
and differently used domain languages, are employed to label models and
their elements. This paper presents a method in which a combination of
ontology matching techniques provides heuristic support.
1 Background and Motivation
Business process modeling for describing and designing business operations has gained
considerably in importance over the past decades. In business practice, a need therefore
frequently arises to match existing models, for example in projects on architecture, data
and process integration, semantic consolidation projects, company mergers and B2B
integrations, as well as when introducing standards or standard software. To consolidate
business process models, the existing models have to be compared with regard to the
meaning of their elements in order to identify correspondences, starting points,
interfaces, or even overlaps and redundancies. Comparing and linking heterogeneous
models is, however, a non-trivial task, because even models of the same type often differ
semantically [BP08]. Semantic heterogeneity does not occur only at the level of the
modeling languages, but typically also in the choice of the natural-language domain terms
used to name the model elements [TF07].
In particular, freely chosen domain terminology hinders the integration of models, and
thus of the underlying data and processes, all the more so when the models come from
different origins, be it decentralized teams, different corporate divisions or separate
independent companies. Labels formulated in natural language reflect not only the
terminology customary in the industry but also the business language that has evolved
within the respective company. If there is no generally valid, bindingly defined
vocabulary, or no rules for applying it, models can differ considerably in this respect.
Matching is complicated not only by the differing meanings of the labels used and how
they are understood, but also by the different terms or combinations of terms chosen to
label model elements. If naming conflicts caused by synonymy or homonymy are present,
models cannot be compared directly, and hence not integrated, either manually or
automatically [BRS96; TF06]. In large enterprises in particular, a multitude of business
process models already exists that were created over time by different people or in a
decentralized manner by committees, often even following different guidelines, in
different modeling languages or using different domain terminologies. Even when the same
subject matter is modeled, conceptual models created under a division of labor can differ
considerably in their labels, so that the comparability needed for their use cannot be
taken for granted [BD10]. This applies all the more when models from previously
independent companies or business units come together. Before further work begins, the
current semantic state therefore has to be analyzed. Semantic ambiguity has to be
resolved so that the statements made by models can be related and matched in terms of
content, because only matching the domain language allows corresponding models and model
elements to be identified and, building on that, further structural comparisons to be
made where needed [SM07]. So far, such analysis tasks can for the most part only be
performed manually. The necessary matching and integration of conceptual models, such as
the business process models considered here, is currently purely intellectual work. When
many large models are involved, these tasks can be accomplished without automated support
only with a large expenditure of resources.
To close this gap and exploit computing power for automated processing, a corresponding
IT-supported heuristic method is presented below. The approach focuses on the usage phase
after models have been created, and in particular on questions of joint usability. To
reduce the workload of meaning-related matching on the user side, Section 2 describes the
application of Semantic Web technologies, in particular ontology processing, together
with a combination of natural language processing techniques, to the question of
determining the semantic similarity of business process models. That section also
presents the procedure for extracting and formalizing the semantic information contained
in business process models and the ontologies required for this; Section 3 presents the
prototype implemented accordingly. Building on this, Section 4 shows the method in use.
The paper concludes in Section 5 with a presentation of, and links to, related work, and
in Section 6 with brief concluding remarks and an outlook on future work.
2 Semantic Analysis
Models usually represent agreed domain knowledge. This comprises, first, knowledge about
describing facts in representation or modeling languages and, second, the domain
knowledge about the modeled facts, described by the organizational or business semantics.
This knowledge can be extracted and represented by means of semantic analysis [Li00]. In
this way, the relationships between the objects of both domains can be captured and
represented. In principle, the representation and automated processing of knowledge can
help to extend information processing further. In everyday business, the ubiquity of the
Internet as a global infrastructure has contributed to the high acceptance of web-based
support for electronic business transactions. The development of the Semantic Web idea
and its specific technologies now additionally offers the possibility of using web-based
ontologies, in their capacity as explicit specifications, as a means of structuring
knowledge and establishing semantic interoperability based on open standards. The
principle of annotating information with metadata allows knowledge to be represented in a
structured, machine-accessible form built on Internet technologies, readable by both
machines and humans [SBH06]. The use of such semantic technologies is particularly
appropriate where intellectual work is too costly and matching has to be performed
repeatedly, especially for large and heterogeneous sets of data and information [Fr10].
The aim of matching the domain language of business process models is to support the
preparatory work for structural comparisons of models, which in turn are influenced by
the modeling language used.
2.1 Ontology Creation and Ontology Matching
Ontologies are the core element of the Semantic Web. In the computer science sense they
are artifacts and can be understood as conceptual schemas [AF05]. In principle,
ontologies are collections of definitions of elements and their relationships and contain
an agreed vocabulary [DOS03]. They formalize the meaning of terms. Although developing
ontologies raises the same problem as creating business process models, namely semantic
heterogeneity arising from the choice of modeling language and of the domain language for
the labels of classes or concepts and relations, ontologies can in turn be used further
for automated matching. Research in the field of ontology matching addresses questions of
matchability and the resolution of semantic ambiguities [ES07].
Ontology matching techniques help to clarify the meaning of the terms used and thus serve
to determine the meaning of statements about facts or of their descriptions. The goal is
to find semantic relations that can be expressed as ontology mappings. Applied to the
question of determining how similar the meanings of models and their elements are, they
can serve as semantic correspondences. This permits statements of the kind “A from
ontology X corresponds to B from ontology Y”, which can be described as functions
\[ \mathit{SemCorr}(e_1) = \left( \{\, e_2 \in O_2 \mid e_2 \,\},\; e_1 \in O_1 \right) \in [0,1] \]
These semantic correspondences express equivalence or similarity. Element-based ontology
matching techniques lend themselves to matching the business semantics contained in
business process models. A comprehensive overview is given in [ES07]. The correspondences
can be persisted for further use. The linked ontologies can thus remain in place without
having to be merged. This is particularly useful given that the underlying models cannot
readily be changed but are in active use. Preserved correspondences instead offer the
possibility of a virtual semantic integration.
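One possible way to picture such persisted correspondences is as records that link an element of one ontology to an element of another and carry a confidence value in [0, 1]. The following Python sketch is an illustration only: the names Correspondence, sem_corr and exact_match, as well as the threshold parameter, are assumptions made here and are not taken from the prototype described in this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correspondence:
    """A link between two ontology elements; names are hypothetical."""
    source: str        # element e1 from ontology O1
    target: str        # element e2 from ontology O2
    confidence: float  # similarity value in [0, 1]


def sem_corr(e1, candidates, similarity, threshold=0.5):
    """Return correspondences for e1 whose similarity reaches the threshold.

    `candidates` are elements of the second ontology, `similarity` is any
    function returning a value in [0, 1]. Results are sorted best first.
    """
    found = []
    for e2 in candidates:
        score = similarity(e1, e2)
        if score >= threshold:
            found.append(Correspondence(e1, e2, score))
    return sorted(found, key=lambda c: c.confidence, reverse=True)


def exact_match(a, b):
    """Simplest conceivable similarity: case-insensitive equality."""
    return 1.0 if a.lower() == b.lower() else 0.0
```

Because the correspondences are stored separately from the two ontologies, neither ontology has to be modified or merged, which is the point of the virtual integration described above.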
2.2 Extracting and Formalizing the Semantics of Models
Existing business process models are non-ontological resources from which the meaning of
the model statement can be extracted by reengineering and formalized semantically.
Reusing models in this way and converting them into ontologies allows them to be
exploited further while they remain available, unchanged, for active use. Automated
decomposition and conversion into ontologies gives machines access to the knowledge they
contain. The starting point for extracting this knowledge is the consideration that
models contain facts from two areas of knowledge. Concepts from the language space of the
domain language have been used to name models and their elements, while the concepts of
the modeling language have been used to describe them, in the sense of typing and
arranging these concepts. Reversing this process, models can be decomposed in order to
extract the concepts used from each language space and to capture them in the form of
semantic models, as described in [FR10]. The existing model information is thereby made
accessible without manual or additional intellectual effort at this point. The ontologies
describing the metamodel are already available in OWL and can be used for the model
decomposition procedure. Consequently, no preparatory work is needed for the actual
matching. During decomposition, models are converted by means of XSLT into two ontologies
in OWL DL. These are the model ontology, containing the labels of the model name and of
the model elements, and the model type ontology, containing the model type and the model
element types.
Together they describe the model with its name and model type as well as the model
elements with their names and their model element types. During conversion, all model
names and model element labels are transferred “as is” without further processing. In
this way complete expressions can be carried over for further processing, because the
domain knowledge applied in modeling often shows only in the combination of words into
frequently used formulations. It also remains visible that conventions may have guided
the assignment of element labels, as do the natural language used, differing language use
and peculiarities of the domain. In business process models, events and activities are
mostly labeled with expressions or phrases consisting of several terms, which rarely form
a complete sentence. In a semantic match, each term therefore has to be considered both
individually and as part of the given combination, because the phrases carry their
intended meaning only as a whole. The most obvious difference found when analyzing the
model collection to be matched with the method presented here was the distinction between
models in German and in English. It turned out, however, that colloquial language was
mostly not used and that expressions of emotion such as irony or euphemism did not occur.
Likewise, descriptive adjectives, adverbs or modifying expressions were found only to a
small extent. It also became apparent that different labels for the same concept arise
not only from the differing language use of the modelers but also from the requirements
and restrictions of the respective modeling language [BD10].
2.3 Semantic Matching of the Natural Language of the Labels
Ontology matching techniques can be applied to relate the resulting model ontologies,
which contain the domain semantics of the converted models, to one another automatically.
Automated support can thus be offered for model matching that would otherwise have to be
carried out manually, and the model elements that reflect the domain semantics can be
compared independently of the modeling language originally used. It turned out that, for
the naming of elements with several terms in one phrase that is customary in process
models, as described above, name matching techniques such as string comparisons or string
matching metrics alone lead to poor results. This applies in particular when synonyms are
present and when identical or similar terms occupy different positions within the phrases
to be compared. Instead, several requirements had to be met.
As described, differing language use by modelers leads to the use of different labels. It
must therefore be assumed that the models to be compared contain synonyms, which string
metrics alone might fail to recognize as matching. Synonyms therefore have to be
resolved. It can likewise be assumed that labels in semantically similar models may occur
in different languages.
It is therefore necessary that multilingual models can be processed and that
information-linguistic techniques are applied depending on the respective language. Since
the labels in models are phrases that do not constitute grammatically complete sentences,
let alone texts, some existing information-linguistic techniques are not directly
applicable, however. Such phrases can, for example, hardly be subjected to a meaningful
part-of-speech analysis. To allow a treatment appropriate to the nature of the labels,
several techniques were combined, each of which is briefly presented below.
2.4 Information-Linguistic Techniques
Over the past decades, various natural language processing or information-linguistic
techniques have emerged that deal with the processing of natural language in or for
information systems [HL09]. They are therefore suitable for ontology matching at the
element level [ES07].
2.4.1 Compound Splitting
Terms in natural languages can vary in complexity, consisting either of a single term or
of a combination of terms. A single term usually consists of one word, a combination of
terms of several conceptual components. In English these are frequently multi-word terms,
in German by contrast compounds, i.e. the joining of at least two independently occurring
words into one composite word [Be05]. For languages that permit compounding, such as
German, it makes sense to split compounds and to use the individual components of the
compound for matching [St07]. When decomposing, it is important to produce meaningful
conceptual components so that all occurrences of a search word are found. Suitable
dictionaries can help to avoid meaningless splitting of multi-word terms or unwanted
splitting of proper names [Be05].
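A dictionary-supported split can be sketched as follows. This is a deliberately small illustration under stated assumptions, not the procedure used in the prototype: the function name split_compound, the tiny lexicon, the minimum part length and the handling of a linking "s" are all choices made for this example. A word is only split if every resulting part is a known lexicon entry, so unknown words such as proper names stay intact.

```python
def split_compound(word, lexicon, min_len=3):
    """Split `word` into parts known to `lexicon`, or return [word] unchanged.

    Hypothetical helper: tries the longest known head first and allows an
    optional linking "s" between two parts.
    """
    word = word.lower()
    if word in lexicon:
        return [word]  # known as a whole: do not split further
    for i in range(len(word) - min_len, min_len - 1, -1):
        head, tail = word[:i], word[i:]
        if head not in lexicon:
            continue
        # Try the tail as it is, then without a leading linking "s".
        options = [tail]
        if tail.startswith("s"):
            options.append(tail[1:])
        for rest in options:
            if len(rest) < min_len:
                continue
            parts = split_compound(rest, lexicon, min_len)
            if all(part in lexicon for part in parts):
                return [head] + parts
    return [word]  # no meaningful split found


# Illustrative lexicon, assumed for this example only.
LEXICON = {"vertrag", "abschluss", "prozess", "modell"}
```

With this lexicon, split_compound("Vertragsabschluss", LEXICON) yields the parts "vertrag" and "abschluss", while a word without known parts is returned as a single lower-cased item.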
2.4.2 Disambiguation by Resolving Synonymy
Synonyms are different labels for the same concept. They appear as different inflected
forms, different spelling variants of a word, variants in different writing systems,
abbreviations or full forms, and alternatively usable terms [We01]. Resolving synonyms
ensures that semantic matches between concepts are found even if they have been named
differently, so that the matching results improve [Be05]. The resolution, or word sense
disambiguation, can be carried out with the aid of a thesaurus serving as a synonym
dictionary [St07]. A thesaurus links terms into conceptual units, with or without
preferred labels, and relates them to other concepts.
Such concept systems mostly capture relations such as synonymy and ambiguity, hyponymy
and hypernymy, antonymy, and association [SS08]. For creating web-based thesauri, the W3C
offers SKOS, the Simple Knowledge Organization System [MB09]. Using SKOS allows freely
available resources to be reused, such as WordNet [Fe98] or the Standard-Thesaurus
Wirtschaft (STW) [Zb10].
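The effect of synonym resolution on matching can be illustrated with a minimal sketch in which every term is mapped to a preferred label before phrases are compared, roughly in the spirit of preferred and alternative labels in a thesaurus. The thesaurus content and the function names build_index and normalize are assumptions made for this example; a real setup would load the entries from a resource such as those named above.

```python
# Hypothetical mini thesaurus: preferred label -> all labels of the concept.
THESAURUS = {
    "invoice": {"invoice", "bill"},
    "customer": {"customer", "client"},
}


def build_index(thesaurus):
    """Map every known term to the preferred label of its concept."""
    return {
        term: preferred
        for preferred, terms in thesaurus.items()
        for term in terms
    }


def normalize(phrase, index):
    """Replace each term of a label by its preferred label, if one is known."""
    return [index.get(token, token) for token in phrase.lower().split()]


INDEX = build_index(THESAURUS)
```

After normalization, the labels "Check client bill" and "Check customer invoice" consist of the same terms, although a plain string comparison would treat them as different.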
2.4.3 Handling of Stop Words
In information retrieval, words that are ignored during indexing are called stop words. They mostly serve syntactic functions and therefore carry no relevance for conclusions about the content of a document. In German as in English, these are articles, conjunctions, prepositions, or pronouns, as well as the negation [Be05]. Nevertheless, they are indispensable for understanding [Be05]. The set of stop words can vary by domain, because it may also contain words that, although they carry meaning, should not be used since they occur in most documents and therefore do not help to differentiate content. Accordingly, for the question of business semantics in process models, stop words should not be eliminated across the board, as proposed in [Ko07], but domain-specifically. Depending on the kind of search, forgoing elimination yields better results for searches with word combinations [Be08]. Furthermore, in business processes the presence of a negation in decisions is often significant when searching for semantically similar elements. Particularly for short phrases in which a word that is a common stop word in the respective language makes a considerable difference in meaning, stop word elimination can lead to wrong results, for example with negations [St07].
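The effect of eliminating a negation can be illustrated with a minimal sketch. The stop word list and the two labels below are hypothetical examples chosen for illustration; they are not taken from the system described in this paper.

```python
# Hypothetical stop word list; a real system would load a language- and
# domain-specific list instead.
STOP_WORDS = {"the", "a", "an", "is", "of", "not", "no"}


def terms(label, drop_stop_words):
    """Split a label into lower-case terms, optionally dropping stop words."""
    tokens = label.lower().split()
    if drop_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return tokens


approved = "invoice is approved"
rejected = "invoice is not approved"

# With elimination, both labels collapse to the same term list,
# although they describe opposite decision outcomes.
same_after_removal = terms(approved, True) == terms(rejected, True)

# Without elimination, the negation keeps the labels apart.
same_without_removal = terms(approved, False) == terms(rejected, False)

print(same_after_removal, same_without_removal)
```

Under these assumptions the two labels become indistinguishable once stop words are removed, which is the failure case the paragraph above describes for short phrases.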
2.4.4 Stemming
For morphological analysis, information retrieval offers methods for base form reduction (lemmatisation) and for word stem reduction (stemming) [St07]. In lemmatisation, the grammatical base form is determined by tracing the concrete word form back to a dictionary entry. In stemming, morphological variants of a word are reduced to their common stem by removing inflectional endings and derivational suffixes; this stem need not be a lexical word. When matching process models, this allows similarities in meaning between activities, whether named by a nominalised verb or by a combination of verb and noun, and objects to be determined more precisely, because only the stems are compared. In addition, unwanted matches between suffixes can be ruled out, because the suffixes are removed before matching.
2.4.5 Comparison of Character Strings
A sequence of characters from a defined character set is called a character string, or string. Strings are character sequences of arbitrary length over a defined alphabet [ES07]. String matching algorithms search for correspondences between strings. This task arises in a wide variety of domains and has led to different approaches over time [CRF03]. String metrics allow the similarity between strings to be measured [SSK05]. The Levenshtein distance between two strings is the minimum number of insertions, deletions, or substitutions required to transform the first string into the second [Le66]. The Jaccard metric compares the similarity of words within an expression [Ja12]. The Jaro metric compares characters and their positions within the string, even if they are a few positions apart [Ja89]. N-grams can be used to fragment words or strings [St07]. The q-grams algorithm based on them counts the trigrams shared by the strings being compared and is therefore suited to so-called approximate string matching [ST95]. Because the results of the different methods can differ widely, a suitable metric has to be selected depending on the language and function of the terms [SSK05]. Although string metrics alone do not meet all needs in finding semantic similarities between labels, they have nevertheless proven useful in this field [SSK05]. Where terms are not synonymous, string metrics can be used to determine semantic similarity on the basis of matching character sequences. Stemming carried out beforehand can increase the precision of the results, because reducing words to their stems means that, for example, matches between suffixes are not scored.
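Three of the metrics named above can be sketched in a few lines each. This is an illustrative sketch only: the function names are hypothetical, the padding character and the Dice-style normalisation of the trigram overlap are assumptions, and none of this reproduces the SimMetrics implementations used by the prototype.

```python
from collections import Counter


def levenshtein(a, b):
    """Minimum number of insertions, deletions, or substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def jaccard(phrase_a, phrase_b):
    """Share of distinct words that two phrases have in common."""
    a = set(phrase_a.lower().split())
    b = set(phrase_b.lower().split())
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def trigrams(s):
    """Character trigrams of s, padded with '#' (an assumed padding symbol)."""
    padded = "##" + s.lower() + "##"
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def trigram_similarity(a, b):
    """Shared trigrams, normalised to [0, 1] with a Dice-style ratio."""
    ta, tb = trigrams(a), trigrams(b)
    shared = sum((Counter(ta) & Counter(tb)).values())
    return 2 * shared / (len(ta) + len(tb))


print(levenshtein("kitten", "sitting"))
print(jaccard("check invoice", "invoice check"))
print(trigram_similarity("invoice", "invoices"))
```

The word-based Jaccard value ignores term order, whereas the character-based measures react to every differing character, which illustrates why the choice of metric depends on the language and the function of the terms.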
3 Implementation
To apply the methods described, a prototype system named LaSMat, short for Language-aware Semantic Matching, was implemented.
3.1 Technical Realisation
The components were implemented in Java. The system can be integrated as a Java API or accessed through a prototype user interface. Figure 1 shows the procedure for matching the model ontologies as a sequence diagram. For each request, the first step compares the two phrases. This comparison is unidirectional. If they match completely, the value 1 is returned as the confidence value and thus as the assumed strength of the semantic correspondence found. Otherwise, the phrases are decomposed into individual terms, which are then compared with each other. All of the methods presented above are applied, although compound splitting is currently performed for German only.
For all methods, the user can set parameters by assigning weights to the results of the individual methods. The weight for matches between terms identified as stop words is configurable. To resolve synonyms, thesauri in SKOS format can be imported at runtime. Included by default are WordNet [Fe98], as a general lexical resource for English in SKOS format [W310], and, as a business-specific resource, the STW, which contains concepts in German and English [Zb10]. In addition, a SKOS version of OpenThesaurus that we created is used for general German [Na05].
Figure 1. Sequence diagram of the Language-aware Semantic Matcher
The parameter s ∈ [0,1], the synonym measure, configures the weight of synonym matches in the result aggregation. For stemming, the German and English libraries from the Snowball project are used [PB11]. For string matching, a selection of string metrics is available, provided by the Java API SimMetrics [Ch06]. A corresponding value can be specified for the weight of this result in the overall score. To determine the overall confidence of the correspondences found, the best results from all methods are aggregated. The results are matching information for each phrase.
These results can be stored in the INRIA format [Eu06] and in an alignment ontology in a format we developed for this purpose. The prototype user interface presents the results in tabular form, and a threshold on the strength of the correspondences found can be set for filtering.
3.2 Computation of Semantic Similarity
Correspondences found are described as tuples of the form
\[ \langle (e_1, m_1), (e_2, m_2), c \rangle \]
where
- \((e_k, m_k)\) is the label of an element of a model ontology,
- \(c\), the confidence, represents the assumed strength of the relation, expressed as a numerical value between 0 and 1.
The algorithm developed determines a fuzzy value for the similarity between two labels, where 1 expresses equivalence and 0 means no match at all. We define the similarity between two labels as the arithmetic mean of all matches in relation to the number of terms in both labels:
\[ \mathit{Sim}(e_1, e_2) = \frac{\dfrac{\mathit{OverallTermSim}(e_1, e_2)}{\mathit{length}(e_1)} + \dfrac{\mathit{OverallTermSim}(e_1, e_2)}{\mathit{length}(e_2)}}{2} \]
where
- \(\mathit{length}(e_k)\) is the number of terms in the label \(e_k\), expressed as \(\mathit{length}(e_k) = \mathit{Num}(t^{e_k})\),
- \(\mathit{OverallTermSim}(e_1, e_2)\) is the overall match between all terms of the two labels.
To compute the overall match, the highest similarity measure between the term currently being compared and all terms of the second label is used:
\[ \mathit{OverallTermSim}(e_1, e_2) = \sum_{k=1}^{\mathit{length}(e_1)} \max\bigl(\mathit{Sim}(t_k^{e_1}, t_{1 \ldots n}^{e_2})\bigr) \]
where
- \(\mathit{Sim}(t_k, t_n)\) is the similarity measure between two terms.
This similarity measure is determined from several values. For an exact match, the similarity measure is
\[ \mathit{Sim}(t_k, t_n) = 1 \]
If the matching terms are stop words, however, the configured stop word measure is used instead of the value 1. In the case (k ≠ n), the result of the distance measurement would be that there is no match, or that special treatment is required because of the distance between the individual characters [Ja89]. It should be noted, however, that the distance between two terms, unlike in pure character sequences such as gene codes, does not always lead to a change in meaning; semantic similarity may still be present. This can be shown with the two labels “check invoice” and “invoice check”, for which semantic similarity can be assumed. However, the different positions of the terms within the label suggest different parts of speech. The distance between the terms thus indicates a difference, but one that is smaller than for distances between identical characters in a string [PW97]. Our approach for (k ≠ n) is therefore continued as
\[ \frac{\mathit{Sim}(t_k, t_n)}{td} \]
where
- td is introduced as the “term disorder weight” with a value ≥ 1.
This follows McLaughlin's approach to handling “disagreeing characters” in string comparisons as applied in [PW97], although the actual distance between the two terms is disregarded for the reason given above. This value is configurable. A high value therefore reduces the similarity measure between two terms that occur at different positions in a phrase.
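The definitions in this section can be traced with a small sketch. It is one possible reading of the formulas, not the LaSMat implementation: all function and parameter names are hypothetical, terms are obtained by splitting on whitespace, and the term-level measure only distinguishes exact matches from non-matches, whereas the prototype additionally uses synonym, stemming and string-metric results.

```python
def term_sim(t_k, t_n, stop_words, stop_weight):
    """Term similarity: 1 for an exact match, the stop word measure for
    matching stop words, 0 otherwise (string metrics are omitted here)."""
    if t_k == t_n:
        return stop_weight if t_k in stop_words else 1.0
    return 0.0


def overall_term_sim(e1, e2, td, stop_words, stop_weight):
    """Sum over the terms of e1 of the best similarity to any term of e2."""
    total = 0.0
    for k, t_k in enumerate(e1):
        best = 0.0
        for n, t_n in enumerate(e2):
            s = term_sim(t_k, t_n, stop_words, stop_weight)
            if k != n:
                s /= td  # term disorder weight for differing positions
            best = max(best, s)
        total += best
    return total


def sim(label1, label2, td=3.0, stop_words=frozenset(), stop_weight=0.0):
    """Mean of the overall match relative to the length of each label."""
    e1, e2 = label1.lower().split(), label2.lower().split()
    if not e1 or not e2:
        return 0.0
    overall = overall_term_sim(e1, e2, td, stop_words, stop_weight)
    return (overall / len(e1) + overall / len(e2)) / 2


print(sim("check invoice", "check invoice"))
print(sim("check invoice", "invoice check", td=3.0))
print(sim("check invoice", "invoice check", td=1.0))
```

With td = 1 the swapped labels are treated as equivalent, while larger values of td lower the score for terms that match only at different positions.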
3.3 Interpretation of the Results
The matching results express the strength of a detected correspondence as a confidence value between 0 and 1. When domain experts analysed the results, however, it became apparent that results in this form are not intuitively understandable. The values are therefore fuzzified: users are shown "exactMatch" for c = 1, "closeMatch" for 1 > c > 0.745, and "relatedMatch" for 0.745 > c > 0.495. This supports them in deciding on further matching or analysis work.
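The mapping from confidence values to labels can be sketched as follows. The function name is hypothetical, and two points are assumptions because the text leaves them open: a value of exactly 0.745 is assigned to "relatedMatch", and values of 0.495 or below receive no label.

```python
def match_label(c):
    """Map a confidence value c in [0, 1] to a label shown to the user."""
    if not 0.0 <= c <= 1.0:
        raise ValueError("confidence must lie between 0 and 1")
    if c == 1.0:
        return "exactMatch"
    if c > 0.745:
        return "closeMatch"
    if c > 0.495:
        # Assumption: the boundary value 0.745 itself falls into this band.
        return "relatedMatch"
    return None  # Assumption: weaker correspondences are not labelled.


for value in (1.0, 0.8, 0.6, 0.3):
    print(value, match_label(value))
```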
4 Application
The prototype was used to demonstrate feasibility and usefulness on a collection of 1,380 business process models in total, half with German and half with English element labels. The collection comprises models from the SAP reference model, various models from the literature, and reference models taken from e-business standards.
4.1 Empirical Evaluation
Eight model pairs between which similarity was suspected were randomly selected from this collection. Models of different types were chosen arbitrarily from EPC, BPMN, and UML activity models. The configurable values were set as shown in the screenshot in Figure 2.
Figure 2. Screenshot of LaSMat
To assess the results of matching the model ontologies, which represent the business semantics, the correspondences found with a strength greater than 0.5 were compared with correspondences created manually by domain experts as a reference. The time required was striking: while the manual work for the selected model pairs ranged from one to several hours, matching in the LaSMat system took between 290 ms and at most 3,100 ms per pair. Result quality was assessed with measures from information retrieval [St07], namely precision (P), recall (R), and F-measure (F), each expressed as a value between 0 and 1. P describes correctness as the ratio of correctly found correspondences to all correspondences found. R describes completeness as the ratio of correctly found correspondences to all expected correspondences. For an overall assessment, F gives the weighted harmonic mean of these two values. Applying the method yielded a mean of 0.89 for P, 0.9 for R, and 0.89 for F. As an indication of the feasibility of the method, the sample means suggest that, at a 5% probability of error, precision in the population lies between 0.8 and 0.98 and recall between 0.83 and 0.97, with a maximum value of 1 in each case.
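The three quality measures can be computed from the set of correspondences found and the reference set, as in the following sketch. The function name and the sample correspondences are hypothetical, and the F-measure is computed with equal weights for precision and recall, which is an assumption since the weighting is not stated in the text.

```python
def precision_recall_f(found, expected):
    """Return (P, R, F) for found correspondences against a reference set."""
    found, expected = set(found), set(expected)
    correct = found & expected
    p = len(correct) / len(found) if found else 0.0
    r = len(correct) / len(expected) if expected else 0.0
    # Balanced harmonic mean (equal weights assumed).
    f = 2 * p * r / (p + r) if p + r > 0 else 0.0
    return p, r, f


# Hypothetical correspondences between element labels of two models.
found = {("check invoice", "invoice check"),
         ("ship goods", "goods shipment"),
         ("create order", "order creation"),
         ("pay invoice", "archive invoice")}
expected = {("check invoice", "invoice check"),
            ("ship goods", "goods shipment"),
            ("create order", "order creation"),
            ("receive payment", "payment receipt"),
            ("close order", "order closing")}

p, r, f = precision_recall_f(found, expected)
print(round(p, 2), round(r, 2), round(f, 2))
```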
4.2 Detailed Examination of the Combination of Methods
To examine the effect of parametrising the individual methods, individual examples were studied in detail. As expected, compound splitting improved the results; for example, the similarity between „Rechnungsprüfung“ and „Rechnung prüfen“ was returned as 0.54 without splitting and 0.74 with splitting. Synonym matches can be weighted differently. This appears sensible in cases where quasi-synonyms lead to shifts in meaning. Whereas matching without synonym resolution finds no correspondence between labels of identical meaning, synonym resolution finds these correspondences. A value of 0 results in matching without synonym resolution, while all values greater than 0 weight the result. An exact match found between stop words has a major influence on the overall result in phrase matching, because phrases contain few terms compared with full texts. Our approach of weighting stop word matches with 0.0, so that they do not count towards the overall similarity score, yields results similar to stop word elimination but still takes into account the cases in which a stop word makes a difference in meaning. Stemming supported the matching, although for German, a highly inflected language, the results improved less than for English. For string comparison, q-grams were used in the evaluation with a term disorder weight of 3, following McLaughlin's approach as described above. Taking the position of a term within the phrase into account, this yielded higher hit rates.
5 Related Work
Because modelling is so important for describing and designing business operations, model matching and model integration are becoming ever more decisive for process and IT optimisation and thus ultimately for the competitiveness of companies. Despite this importance, however, no methods and tools suitable for use in companies are available. Some works in the literature on model integration concentrate on modelling languages and on the possibilities of migration or integration based on translating models from one modelling language into another [Ge07; MK07]. They do not consider heterogeneous use of domain language; instead, the model element labels are reused unchanged. Although ontologies are seen in the long term as a way of establishing a uniform, shared, always current, and collaboratively developed digital model of the entire company [Fr10], there have so far been no proposals for applying them to match models after their creation or for integrations or consolidations. Existing proposals for integrating process models mostly concentrate on the phase in which models are first created.
These proposals presuppose a separately created domain model for labelling model elements or for matching them [BEK06; We07]. In contrast, our method requires no additional preparatory work of this kind. Other approaches require manual annotation of process model elements to enable semantic processing [HLD07; TF09; BD10]. At present there are no approaches that offer semantic matching of existing models while taking into account both the modelling terminology and the domain terminology used, as well as different natural languages. Here our approach can serve as a complement.
6 Conclusion
This paper presented a method for semantically matching existing business process models using Semantic Web technologies, in particular ontology matching methods. The method makes the business semantics in models machine-accessible and, through a language-specific selection, combination, and parametrisable result aggregation of several language-processing methods, allows them to be matched automatically. The results obtained can serve as starting points for further structural comparisons and for processing steps based on them, such as consolidations or model changes. The system presented here was implemented as a prototype and used as a proof of feasibility for the method developed. It could be shown that the chosen combination of individual methods can offer users automated support. Because the system provides for the parametrisation of weights, further evaluation of their efficiency is planned in order to identify combinations suited to specific domains. Ontology matching likewise does not (yet) deliver perfect results. Further research is needed in particular for cases in which phrases contain numerical, cryptic, or mixed-language terms. In the long term, further research on the emerging need for block matching to detect taxonomic and mereological relationships could be beneficial. Overall, we hope that our proposal has shown the usefulness of applying Semantic Web technologies to support the matching of business process models.
References
[AF05] Antoniou, G.; Franconi, E.; van Harmelen, F.: Introduction to Semantic Web Ontology Languages. In: Reasoning Web. 1st Int. Summer School 2005, Malta, Springer, Berlin Heidelberg, 2005; pp. 1–21.
[BD10] Becker, J. et al.: Ein automatisiertes Verfahren zur Sicherstellung der konventionsgerechten Bezeichnung von Modellelementen im Rahmen der konzeptionellen Modellierung. In: Modellierung 2010, LNI 161, 2010; pp. 49–65.
[Be05] Bertram, J.: Einführung in die inhaltliche Erschliessung. Ergon, Würzburg, 2005.
[Be08] Beus, J.: Google changes the treatment of stopwords. http://www.sistrix.com/news/713-google-veraendert-behandlung-von-stopworten.html, 30.10.2011.
[BEK06] Brockmans, S. et al.: Semantic Alignment of Business Processes. In: Proc. of the 8th Intern. Conf. on Enterprise Information Systems (ICEIS 2006). INSTICC, Setúbal, 2006; pp. 197–203.
[BRS96] Becker, J.; Rosemann, M.; Schütte, R.: Prozeßintegration zwischen Industrie- und Handelsunternehmen - eine inhaltlich-funktionale und methodische Analyse. In Wirtschaftsinformatik 39, 1996; pp. 309–316.
[BP08] Becker, J.; Pfeiffer, D.: Solving the Conflicts of Distributed Process Modelling – Towards an Integrated Approach. In: 16th Europ. Conf. on Information Systems (ECIS 2008), 2008; pp. 1555–1568.
[Ch06] Chapman, S.: SimMetrics: Open source library of Similarity Metrics. http://sourceforge.net/projects/simmetrics/, 17.10.2011.
[CRF03] Cohen, W.; Ravikumar, P.; Fienberg, S.: A Comparison of String Distance Metrics for Name-Matching Tasks. In: Proc. of IJCAI-03 Workshop on Information Integration on the Web (IIWeb-03), 2003; pp. 73–78.
[DOS03] Daconta, M. C.; Obrst, L. J.; Smith, K. T.: The Semantic Web. Wiley, 2003.
[ES07] Euzenat, J.; Shvaiko, P.: Ontology Matching. Springer, Berlin, 2007.
[Eu06] Euzenat, J.: An API for ontology alignment. https://gforge.inria.fr/docman/view.php/117/251/align.pdf, 17.10.2011.
[Fe98] Fellbaum, C. (ed.): WordNet: An Electronic Lexical Database. MIT Press, Cambridge, 1998.
[FR10] Fengel, J.; Rebstock, M.: Domänensemantik-orientierte Integration heterogener konzeptueller Modelle. In: Modellierung betrieblicher Informationssysteme. Modellgestütztes Management; (MobIS 2010); LNI P-171, 2010; pp. 63–78.
[Fr10] Frank, U.: Interview mit Rudi Studer zum Thema „Semantische Technologien“. In Wirtschaftsinformatik 52, 2010; pp. 49–52.
[Ge07] Gehlert, A.: Migration fachkonzeptueller Modelle. Logos-Verl., Berlin, 2007.
[HL09] Harms, I.; Luckhardt, H.-D.: Virtuelles Handbuch Informationswissenschaft. http://is.uni-sb.de/studium/handbuch/, 30.10.2011.
[HLD05] Hepp, M. et al.: Semantic Business Process Management: A Vision Towards Using Semantic Web Services for Business Process Management. In: Proc. of the IEEE Intern. Conf. on e-Business Engineering. ICEBE 2005, IEEE, 2005; pp. 535–540.
[HP11] HP Hewlett Packard: Jena - A Semantic Web Framework for Java. http://jena.sourceforge.net/, 30.10.2011.
[Ja12] Jaccard, P.: The Distribution of the Flora in the Alpine Zone. In The New Phytologist, 1912, 11; pp. 37–50.
[Ja89] Jaro, M. A.: Advances in Record-Linkage Methodology as Applied to Matching the 1985 Census of Tampa. Journal of the American Statistical Association, 1989; pp. 414–420.
[Ko07] Koschmider, A.: Ähnlichkeitsbasierte Modellierungsunterstützung für Geschäftsprozesse. Universitätsverl., Karlsruhe, 2007.
[Le66] Levenshtein, V.: Binary Codes Capable of Correcting Deletions, Insertions, and Reversals. In Cybernetics and Control Theory, 1966, 10; pp. 707–710.
[Li00] Liu, K.: Semiotics in information systems development. Cambridge Univ. Press, Cambridge, New York, 2000.
[MB09] Miles, A.; Bechhofer, S.: SKOS Simple Knowledge Organization System Reference. http://www.w3.org/TR/2009/REC-skos-reference-20090818/, 20.09.2011.
[MK07] Murzek, M.; Kramler, G.: The Model Morphing Approach – Horizontal Transformations between Business Process Models. In: Proc. of the 6th Intern. Conf. on Perspectives in Business Information Research - BIR'2007, Tampere, Finland, 2007; pp. 88–103.
[Na05] Naber, D.: OpenThesaurus: ein offenes deutsches Wortnetz. http://www.danielnaber.de/publications/gldv-openthesaurus.pdf, 12.10.2010.
[PB11] Porter, M.; Boulton, R.: Snowball. http://snowball.tartarus.org/index.php, 31.10.2011.
[PW97] Porter, E. H.; Winkler, W. E.: Approximate String Comparison and its Effect on an Advanced Record Linkage System. http://www.census.gov/srd/papers/pdf/rr97-2.pdf, 10.08.2011.
[SBH06] Shadbolt, N.; Berners-Lee, T.; Hall, W.: Semantic Web Revisited. In IEEE Intelligent Systems, 2006, 21; pp. 96–101.
[SM07] Simon, C.; Mendling, J.: Integration of Conceptual Process Models by the Example of Event-driven Process Chains. In: 8. Intern. Wirtschaftsinformatik (WI 2007) Univ.-Verl. Karlsruhe, Karlsruhe, 2007; pp. 677–694.
[SS08] Stock, W. G.; Stock, M.: Wissensrepräsentation. Oldenbourg, München, 2008.
[St07] Stock, W. G.: Information Retrieval. Oldenbourg, München, 2007.
[ST95] Sutinen, E.; Tarhio, J.: On Using q-Gram Locations in Approximate String Matching. In: Proc. of the 3rd Ann. Europ. Symposium on Algorithms ESA '95. Springer, Berlin, 1995; pp. 327–340.
[SSK05] Stoilos, G.; Stamou, G.; Kollias, S.: A String Metric for Ontology Alignment. In: ISWC 2005. Springer-Verlag, Berlin Heidelberg, 2005; pp. 624–637.
[TF06] Thomas, O.; Fellmann, M.: Semantische Integration von Ontologien und Ereignisgesteuerten Prozessketten. In: Proc. EPK 2006 Geschäftsprozessmanagement mit Ereignisgesteuerten Prozessketten. CEUR-WS.org, Vol. 224, 2006; pp. 7–23.
[TF07] Thomas, O.; Fellmann, M.: Semantic Business Process Management: Ontology-Based Process Modeling Using Event-Driven Process Chains. In IBIS 2, 2007; pp. 29–44.
[TF09] Thomas, O.; Fellmann, M.: Semantische Prozessmodellierung – Konzeption und informationstechnische Unterstützung einer ontologiebasierten Repräsentation von Geschäftsprozessen. In Wirtschaftsinformatik 51, 2009; pp. 506–518.
[W310] Links to SKOS Data. http://www.w3.org/wiki/SkosDev/DataZone, 31.10.2011.
[We01] Weiss, M.: Automatische Indexierung mit besonderer Berücksichtigung deutschsprachiger Texte. http://www.ai.wu.ac.at/~koch/courses/wuw/archive/inf-semws-00/weiss/index.html, 30.10.2011.
[We07] Weske, M.: Business Process Management. Concepts, Languages, Architectures. Springer, Berlin Heidelberg, 2007.
[Zb10] ZBW Leibniz-Informationszentrum Wirtschaft: STW Standard-Thesaurus Wirtschaft. http://zbw.eu/stw/versions/latest/download/about.de.html, 30.10.2011.
Towards a Tool-Oriented Taxonomy of View-Based Modelling
Thomas Goldschmidt¹, Steffen Becker², Erik Burger³
¹ ABB Corporate Research Germany, Industrial Software Systems Program, thomas.goldschmidt@de.abb.com
² University of Paderborn, steffen.becker@uni-paderborn.de
³ Karlsruhe Institute of Technology (KIT), burger@kit.edu
Abstract: The separation of view and model is one of the key concepts of Model-Driven Engineering (MDE). Having different views on a central model helps modellers to focus on specific aspects. Approaches for the creation of Domain-Specific Modelling Languages (DSML) allow language engineers to define languages tailored to specific problems. To be able to build DSMLs that also benefit from view-based modelling, a common understanding of the properties of both paradigms is required. However, research has not yet considered the combination of both paradigms, namely view-based domain-specific modelling, to a larger extent. In particular, a comprehensive analysis of a view’s properties (e.g., partial, overlapping, editable, persistent, etc.) has not been conducted. Thus, it is also still unclear to what extent view-based modelling is understood by current DSML approaches and what a common understanding of this paradigm is. In this paper, we explore view-based modelling in a tool-oriented way. Furthermore, we analyse the properties of the view-based domain-specific modelling concept and provide a feature-based classification of these properties.
1 Introduction
Building views on models is one of the key concepts of conceptual modelling [RW05].
Different views present abstract concepts behind a model in a way that they can be understood and manipulated by different stakeholders. For example, in a component-based
modelling environment, stakeholders, such as the system architect or the deployer will
work on the same model but the system architect will work on the connections and interactions between components whereas the deployer will focus on a view showing the
deployment of the components to different nodes [Szy02].
This is true not only for different types of models, e.g., the different abstraction levels defined by the Model Driven Architecture (MDA) [MCF03], but also for different views on the same models. Specialised views on a common underlying model foster the understanding and productivity [FKN+92] of model engineers. Recent work [ASB09] has even promoted view-based aspects of modelling as a core paradigm.
Frameworks for the creation of Domain Specific Modelling Languages (DSMLs) allow tailored modelling languages to be created efficiently. Different types of DSML creation approaches have emerged in recent years [KT08, CJKW07, MPS, Ecl11b, Ecl11a, KV10].
Many of these approaches implicitly allow for, or explicitly claim to, support the definition
of views on models.
However, a comprehensive analysis of view-based aspects in DSML approaches has not yet been performed. Furthermore, the literature gives no clear definition of these concepts. Work on architectural, view-based modelling is mostly concerned with its conceptual aspects (e.g., [RW05, Cle03, Szy02, ISO11]). These definitions, however, are at the architecture level and do not deal with specific properties of views within modelling tools, such as their scope definition, representation, persistency or editability.
In order to agree on requirements for view-based modelling and to be able to decide which
view-based DSML approach to use, language engineers and modellers require a clear and
common understanding of such properties. Researchers have partially used these properties explicitly or implicitly in existing work (e.g., [GHZL06, KV10]). However, as these
concepts and properties are scattered across publications and are often implicitly considered, this paper aims at organising them.
In order to get an overview on the view-based capabilities in existing DSML frameworks,
we analysed a selection of DSML frameworks. We included graphical DSML frameworks
([KT08, CJKW07, Ecl11a]) as well as textual DSML frameworks ([MPS, Ecl11b, KV10])
to identify a common understanding of view-based domain-specific modelling.
The contribution of this paper is two-fold: First, we provide a common definition of view-based modelling from a tool-oriented point of view. Second, we identify the different properties and features of view-based modelling in DSML approaches. The work presented in
this paper is beneficial for different types of audience. Modellers can use the properties
to make their requirements on view-based modelling more explicit. Tool builders can use
the presented properties to classify and validate their approaches or as guidelines for the
development of new view features. Researchers can benefit from the common definition
of views on which further research can be based.
The remainder of this paper is structured as follows. Section 2 presents an overview as well
as a differentiation of different notions of the term “view” that are used as categories for
the classification scheme. Properties of view-types, views and specific editor capabilities
are given in Section 3. Related work is analysed in Section 4. Section 5 concludes and
outlines future work.
2 Determination of View-Points, View-Types, and Views
In this paper we try to clarify the understanding of the common terminology in view-based modelling. As our understanding of views, view points, view types and modelling originates from a tooling perspective, it varies slightly from existing definitions such as the ISO 42010:2011 standard [ISO11]. Therefore, in this section we present an example of a view-based modelling approach from our own industrial experience and use it to illustrate our understanding of the terms. We find the same understanding realised in many of the tools we have classified and surveyed in order to come up with this taxonomy. Finally, we discuss our terminology in the context of existing definitions like the ISO standard.
2.1 Tool basis
In order to come up with a generic taxonomy for tools in the view-based modelling area, we analysed several DSML tools. The selection process for the tools that we analysed was based on the following criteria:
1. The search was based on electronic databases, e.g., ACM Digital Library, IEEE Xplore, and SpringerLink, as well as references given by DSML experts. The search strategy was based on the keywords “domain-specific”, “modelling”, “language”, “view-based”, “view-oriented”, “views” and “framework” in various combinations. The search was conducted in several steps in December 2010 and January 2011.
2. Domain-specific modelling includes approaches stemming from several different
research areas. Therefore, we included DSML approaches coming from different
areas, e.g., meta-case tools, compiler-based language engineering as well as general
model-driven engineering.
3. To be recognised as a view-based DSML framework, an approach must allow new metamodels to be defined or existing ones to be used, and multiple concrete syntaxes to be created for them.
4. Approaches for which a tool or framework was available were included. This ensured that approaches only having a theoretical or very prototypical character were
excluded.
5. Approaches whose tool is no longer maintained or whose project
was considered dead were excluded.
6. Finally, we excluded tools for which we found no indications for industrial relevance
such as experience reports or real world evaluations. Thus, only tools proven to be
mature enough for industrial application were included.
The selection process was very strict as our goal was to evaluate only those tools which
had a chance of being employed in industrial projects. Of course, this may threaten the
general applicability of our taxonomy, but on the other hand it ensures that the taxonomy is
applicable in industry. The tools we finally analysed were the following: Eclipse GMF
[Ecl11a], MetaEdit+ [KT08], Microsoft DSL tools [CJKW07], Jetbrains MPS [MPS] and
Eclipse Xtext [Ecl11b]. Due to space restrictions, we cannot include the whole survey
here; however, preliminary results are available online.1
2.2 Terminology
Figure 1 presents our example language and its views from the business information domain. The example language serves as a DSL to model business entities, their relations,
their interactions and their persistence behaviour.
1 http://sdqweb.ipd.kit.edu/burger/mod2012/
Thomas Goldschmidt, Steffen Becker, Erik Burger
[Figure 1, diagram not reproduced. The Static viewpoint defines the view types BusinessObject Structure, ValueTypes Overview and Persistency; the Dynamic viewpoint defines the view types Interaction and BlockImplementation. Each view type is instantiated by an example view: a class-diagram-like view showing Customer and Address connected by the association end addresses (0..*), a value types overview, textual persistency views (“store Customer with addresses > 0 to LocatedCust” and “store Company with valueType = false to CompanyTable”), an interaction view, and a textual block implementation of the method persist(). All views represent a single model (abstract syntax), shown serialised as XMI.]
Figure 1: Example language and its viewpoints and viewtypes used by some example views
[Figure 2, class diagram not reproduced. Classes: System, Stakeholder, Concern, ViewPoint, ViewType, View, Model and Metamodel. Association and role names appearing in the diagram include hasStakesIn, interestedIn, modelledBy, models, analyses, instanciates, definedBy, defines, represents, representedIn, showsElements, views and viewTypes.]
Figure 2: Terminology for view-based modelling used in this paper.
Our language consists of two viewpoints: a static viewpoint to model the static
structure of the business entities and a dynamic viewpoint to model their dynamics. The
static viewpoint consists of three view types: a structure view type that defines how to
present the business objects and their relations in a class-diagram-like notation, a value
view type that defines how to show the attributes of the business entities, and a persistency
view type that defines how to link the business entities with the database and defines
default values. For each of the view types, Figure 1 also contains an illustrative example
view showing a simple model. The figure shows how view types relate to classes
(BusinessObject) and views to instances of these classes (Customer, Address). The language’s
dynamic viewpoint has an interaction view type that defines how to represent business object
interrelations and a block implementation view type that defines how the behaviour of single
business entities is specified. Again, Figure 1 illustrates each view type with an example.
Using this example, we introduce our terminology illustrated in Figure 2. The class diagram
shows the terms used in view-based modelling and represents our tool-centric understanding
of views, view types, and viewpoints.
Our conceptualisation starts with the System (or the “real-world object”) which is being
studied by its Stakeholders w.r.t. their specific system Concerns. In our example, we may
think of database designers of the system under study who want to analyse their database
table structure. Therefore, they are interested in the static Viewpoint. A viewpoint
represents a conceptual perspective that is used to address a certain concern. A viewpoint
includes the concern as well as a defined methodology on how the concern is treated, e.g.,
instructions on how to create a model from the particular viewpoint. In order to analyse the
system, the stakeholders create a single, consistent Model of the system under study. The
model has to be an instance of its Metamodel.
In order to show parts of this model, we need a set of concrete syntaxes. These concrete
syntaxes are defined by Viewtypes. A view type defines the set of metaclasses whose
instances a view can display. Its description uses the metamodel elements and provides
rules that comprise a definition of a concrete syntax and its mapping to the abstract
syntax, i.e., it defines how elements of the concrete syntax are mapped to the metamodel
and vice versa. For example, the business object structure view type shows classes and
their relations but not the instructions on how to persist an entity to the database.
A View is the actual set of objects and their relations displayed using a certain
representation and layout. A view is the result of applying a view type to the system’s
models. A view can therefore be considered an instance of a view type. For example, the
structure view in Figure 1 shows the business entities “Customer” and “Address”. They may
be a selection of all possible classes, e.g., “CreditCardData” is not shown in this
particular view but may be shown in a different view of the same view type. The elements
“Customer” and “Address” can also appear in other views.
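The distinction between the definition level (view type) and the instance level (view) can be illustrated with a minimal Python sketch. The class names Element, ViewType and View, the method names, and the element "persistImpl" are assumptions made for this illustration; they are not taken from any of the surveyed tools.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Element:
    """A model element: an instance of a metaclass, identified by its name."""
    metaclass: str
    name: str


@dataclass
class ViewType:
    """Definition level: the set of metaclasses whose instances a view may display."""
    name: str
    metaclasses: frozenset

    def can_show(self, element):
        return element.metaclass in self.metaclasses


@dataclass
class View:
    """Instance level: a concrete selection of elements shown according to a view type."""
    view_type: ViewType
    shown: list = field(default_factory=list)

    def show(self, element):
        if not self.view_type.can_show(element):
            raise ValueError(f"{self.view_type.name} cannot display {element.metaclass}")
        self.shown.append(element)


# A single model of the system under study (element names partly hypothetical).
model = [
    Element("BusinessObject", "Customer"),
    Element("BusinessObject", "Address"),
    Element("BusinessObject", "CreditCardData"),
    Element("Block", "persistImpl"),
]
structure = ViewType("BusinessObject Structure", frozenset({"BusinessObject", "Association"}))

# Two views of the same view type; "Customer" appears in both of them.
view_a = View(structure)
view_a.show(model[0])
view_a.show(model[1])
view_b = View(structure)
view_b.show(model[2])
view_b.show(model[0])
```

Both views are instances of the same view type and select different subsets of the same model, while an attempt to show the Block element in either view is rejected because the view type does not cover that metaclass.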
The separation between the definition of the view type and its instances is also a topic of
the recently started OMG initiative called Diagram Definition [Obj10]. Within
this new standard, which is currently under development, the OMG distinguishes between
Diagram Interchange (DI) and Diagram Graphics (DG). The former is related to the
information a modeller has control over, such as the position of nodes, whereas the latter
defines what the shapes of the graphical language look like. The mapping between a DG and
a metamodel can then be defined by a mapping language such as QVT. If we consider a
DI instance a view and a DG definition including the mapping a view type, the OMG’s
definition is perfectly in line with our own experience and with what we contribute in this paper.
The ISO 42010:2011 standard [ISO11] gives the following definitions: A view “addresses
one or more of the concerns of the system’s stakeholders” and “expresses the architecture
of the system-of-interest in accordance with an architecture viewpoint” where a viewpoint
“establishes the conventions for constructing, interpreting and analyzing the view to
address concerns framed by that viewpoint.” In contrast to the ISO 42010:2011 standard
[ISO11], we follow the idea of having a single model of the system under study, with views
that just visualise and update this central model. For different kinds of data, we favour
the use of different view types, in analogy to the model type in the ISO standard.
Furthermore, we do not focus on architecture descriptions. As a consequence, our concept
explicitly contains the model and its metamodel, which also allows us to associate the view
types with the metamodel. As a minor difference, we favour a viewpoint that addresses just a
single concern in order to have a clear relation.
[Figure 3, class diagram not reproduced. The package BusinessObjects contains the classes NamedElement (name : String), BusinessObject (valueType : Boolean), MethodSignature, Block, Statement, ExpressionStmt, TypeDefinition, TypedElement, Multiplicity (Lower : Integer, Upper : Integer, Ordered : Boolean, Unique : Boolean), Association and AssociationEnd. A BusinessObject owns 0..* signatures of type MethodSignature, a MethodSignature has an implementation Block, a Block contains ordered statements, and an Association has 2 ends. BusinessObject carries the invariant inv: self.signatures->forAll(s | self.elementsOfType.typedElement.name <> s.name).]
Figure 3: Example metamodel.
3 Classification of View Type, View and Editor Properties
We explicitly distinguish between the definition level (view type) and the instance level
(view) to account for the different roles involved in domain-specific modelling. Modellers
use views on the instance level, whereas language engineers work on the view type level.
To separate the properties of these levels, we first present view type properties
in Section 3.1 and then view properties in Section 3.2. Finally, as DSML frameworks
mostly come with their own editor framework, which also has a large impact on the way
modellers can work with their views, we present a classification scheme for editor
capabilities that have an impact on view building in Section 3.3.
We use the properties presented here for two different purposes. Firstly, they support
communication and reasoning about specific view types and views. Having these explicit
properties eases communication and helps to avoid errors in the definition as well as the
application of a view-based modelling approach. Secondly, applied to a given view-based
modelling approach, the fulfilment of a property indicates that the approach
is capable of defining view types or instantiating views that feature this specific property.
3.1 View Type
A view type defines rules according to which views of the respective type are created.
These rules can be considered a combination of projectional and selectional predicates as
well as additional formatting rules that determine the representation of the objects within
the view. Projectional predicates define which parts of a view type’s referenced metamodel
and/or elements of that metamodel a view actually shows. For example, the “BusinessObject
Structure” view type defined in our example is a projection that
shows elements of type BusinessObject, Association, etc. but not Block or
Statement elements. Additionally, projectional predicates may also refer to specific
attributes defined on the metamodel level. In other words, projectional predicates define which
types of elements (classes, associations, attributes, etc.) a view type includes.
[Figure 4, feature diagram not reproduced. Features of a view type, applicable per view type or per predicate: Complete (C) or Partial (P) scope for the Projectional part (depth-first, with downwards and upwards containment; breadth-first/local) and for the Selectional part; Extending; Overlapping (Inter View Type, Intra View Type); Representation (Textual, Graphical, Tabular, Other). The legend distinguishes exclusive OR, inclusive OR, mandatory features, optional features and multiple usage per parent feature (*).]
Figure 4: Properties of view types.
Selectional predicates define filter criteria based on attributes and relations of a view’s elements. For example, the “ValueTypes Overview” view type of our example only includes
elements of type BusinessObject that have set their valueType attribute to true,
thus showing only abstract classes. In other words, selectional predicates define, on the instance level, which conditions elements have to fulfil in order to be relevant for a specific
view type.
Finally, a view type contains rules defining how a view represents the projected and selected elements. Given the “BusinessObject Structure” view type of our example, one
of these rules describes that the view type displays BusinessObjects as rectangular
boxes with the value of their name property as label.
The distinction between the three rule types is made on a conceptual level; the
implementation of a view type may also combine these rules into a single joint rule. These
three types of rules define the range of properties a view type may fulfil. Examples are
projectional or selectional completeness or whether a view type defines a textual or
graphical representation of the underlying model.
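As a rough illustration of how the three rule types interact, the following Python sketch implements a read-only variant of the “ValueTypes Overview” view type as one projectional predicate, one selectional predicate and one representation rule. The dictionary encoding of model elements, the function names, the bracket notation and the concrete valueType values are assumptions made for this example.

```python
# A tiny model; which business objects are value types is assumed here.
model = [
    {"type": "BusinessObject", "name": "Customer", "valueType": False},
    {"type": "BusinessObject", "name": "Address", "valueType": True},
    {"type": "Block", "name": "persistImpl"},
]


def projectional(element):
    """Projection: which types of elements the view type includes."""
    return element["type"] == "BusinessObject"


def selectional(element):
    """Selection: which instances fulfil the filter criterion."""
    return element.get("valueType") is True


def represent(element):
    """Representation: how a projected and selected element is displayed."""
    return f"[ {element['name']} ]"


def render_view(elements):
    """Create a view by applying the three rule types in sequence."""
    return [represent(e) for e in elements if projectional(e) and selectional(e)]


rendered = render_view(model)
print(rendered)  # ['[ Address ]']
```

An implementation may fuse the three functions into a single rule, as noted above; keeping them separate makes it easier to reason about completeness of the projection and of the selection independently.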
Note that, depending on whether a view should be editable or read-only, these rules have
to be considered in a bidirectional way: (I) one direction specifies how a view is
created for an underlying model, and (II) the other direction defines how a model is created
and/or updated based on changes that are performed in a view.
The feature diagram depicted in Figure 4 gives an overview of the properties identified by
us.
Complete View Type Scopes: A language engineer needs to ensure that the created language is
capable of expressing programs of the targeted domain. Especially if a view-based
approach is employed, achieving the desired expressiveness and coverage of the domain’s
metamodel can be an error-prone task. To ease the communication on this coverage as
well as to provide a basis for tool builders to cope with this challenge, we define different
types of completeness for the definition of a view type.
A view type may be complete, which means that it considers all classes, properties, and
relations of a metamodel that are reachable from the part of the metamodel for which
the view type is defined. A complete view type can be used as a starting point for the
interaction with a model: it displays all model elements, and from there a modeller can dive
deeper into the model using other view types.
[Figure 5, diagrams not reproduced. (a) Example: containment complete. The diagram shows the containment associations between BusinessObject, MethodSignature (signatures, 0..*), Block (implementation) and Statement (statements, 0..*), annotated with the downwards and upwards directions. (b) Examples: view type overlaps. Intra view type overlap: the property “addresses” of Customer occurs more than once in the same view but using different representations (as an association end to Address with multiplicity 0..* and as the compartment entry “addresses : Address [0..*]”). Inter view type overlap: classes are shown in multiple view types.]
Figure 5: Examples for containment completeness and view type overlaps.
The scope of a view type rule is given by its projectional as well as selectional parts. Thus,
we can also distinguish between projectional completeness (which includes the containment and local completeness) and selectional completeness (see the definition of instance
completeness below).
Projectional Complete View Type Scope: Projectional completeness of views based on the
MOF [Obj06] meta-metamodel can be defined on several levels. Starting from a certain
model element, there are two dimensions along which a scope can span to other model elements:
first, traversing depth-first to other model elements via the associations defined in the
metamodel and, second, including all elements that are reachable breadth-first through
all directly attached attributes or associations.
Depth-first completeness is hard to specify as it would need to be defined for each specific
path through a metamodel. However, a special subset of depth-first completeness that is useful
to define is completeness w.r.t. the containment associations. Containment associations are
the primary way of creating an organisational structure for a metamodel. Therefore, we define
a special case of depth-first completeness called containment completeness. A view type is
containment complete concerning a specific element o if all
elements that are related to it via containment associations are shown in the view. Three
different notions of containment completeness can be defined (Figure 5(a) shows examples
based on our example metamodel):
• Downwards containment-complete means that all elements that are transitively connected to o are part of the view if o is their transitive parent.
• Upwards containment-complete means that all elements that are transitively connected to o are part of the view if they are a transitive parent of o.
• The third notion specifies that all transitive parents and children of o are part of the
view.
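The three notions of containment completeness can be sketched in Python as checks of a view's element set against the transitive children and parents of o. The containment hierarchy below follows the containment chain of the example metamodel (business object, method signature, block, statements), but the instance names and the dictionary encoding are assumptions made for this illustration.

```python
# Containment hierarchy: parent -> directly contained children (hypothetical instances).
CONTAINS = {
    "Customer": ["persist"],             # BusinessObject owns its signatures
    "persist": ["persistBlock"],         # MethodSignature owns its implementation
    "persistBlock": ["stmt1", "stmt2"],  # Block owns its statements
}


def descendants(o):
    """All elements for which o is a transitive parent."""
    found, stack = set(), [o]
    while stack:
        for child in CONTAINS.get(stack.pop(), []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def ancestors(o):
    """All transitive parents of o."""
    parent_of = {child: parent for parent, children in CONTAINS.items() for child in children}
    found = set()
    while o in parent_of:
        o = parent_of[o]
        found.add(o)
    return found


def downwards_complete(view, o):
    return descendants(o) <= view


def upwards_complete(view, o):
    return ancestors(o) <= view


def fully_complete(view, o):
    """Third notion: all transitive parents and children of o are shown."""
    return downwards_complete(view, o) and upwards_complete(view, o)


view = {"persistBlock", "stmt1", "stmt2"}
print(downwards_complete(view, "persistBlock"))  # True
print(upwards_complete(view, "persistBlock"))    # False
```

The sketch checks a concrete view against a concrete model; a tool would typically establish the property once for the view type, so that it holds for every view of that type.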
The second dimension of view type completeness is the breadth-first completeness, which
we call local completeness. Local completeness is fulfilled if a view type can display all
directly referenced elements of a given element. A view type is locally complete concerning
a class c if every direct property of c can be displayed by an instance of the view
type. Making this type of completeness explicit is useful if a certain view type should be a
detail editor for a specific class where the modeller should be able to view and/or edit all
properties of the given class.
[Figure 6, diagram not reproduced. The metamodel Persistency contains the class Storage with the attributes boName : String and tableName : String; the metamodel BusinessObjects contains the class BusinessObject with the attributes name : String <<inherited>> and valueType : Boolean. The “BO + Persistency” view type is defined by the following template:
    template Storage :
      "store" boName "with" "valueType" "="
      external {query =
        BusinessObject.allInstances()->
          select(name = boName),
        property = valueType}
      "to" tableName
    ;
]
Figure 6: Using an external metamodel Persistency that is connected via a query in the view type to an existing metamodel BusinessObjects.
[Figure 7, diagram not reproduced. Model: a Storage instance with boName = "Company" and tableName = "CompanyTable", and the BusinessObject Company with name = "Company" and valueType = false. View: the text “store Company with valueType = false to CompanyTable”.]
Figure 7: Example view instance of the view type specified in Figure 6.
Selectional or Instance Completeness: Selectional completeness, or instance completeness,
means that the selection of the view type includes all model instances that appear in
the underlying model as long as the projection of the view type also includes them. However,
projectional completeness is not required in order to fulfil the instance completeness
property. For example, a view type can have a projection of a class A which does not
include a property propA. As long as the view type includes all possible instances of A, it
is still instance complete. In contrast, if a view type defines a selection criterion for
A such that only As having a propA value of “selected” are included, the template for A
is not instance complete anymore.
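The example with class A and property propA can be checked mechanically. The following Python sketch tests instance completeness against one concrete model, which is a simplification: the property as defined above concerns all possible models. The function name and the dictionary encoding are assumptions.

```python
def is_instance_complete(model, projected_type, selection=None):
    """True if no instance of the projected type is filtered out by the selection."""
    instances = [e for e in model if e["type"] == projected_type]
    if selection is None:  # no selectional predicate: every instance is included
        return True
    return all(selection(e) for e in instances)


model = [
    {"type": "A", "propA": "selected"},
    {"type": "A", "propA": "other"},
    {"type": "B"},
]

# The projection may omit propA; without a selection criterion the view type
# still includes every instance of A.
without_criterion = is_instance_complete(model, "A")

# A selection criterion on propA excludes the second instance of A.
with_criterion = is_instance_complete(model, "A", lambda e: e["propA"] == "selected")

print(without_criterion, with_criterion)  # True False
```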
Partial View Type Scope: The scope of a view type is considered partial concerning a
metamodel if it only covers a certain part of the element types that are defined within the
metamodel. For example, the “BusinessObject Structure” view type of our example is partial
w.r.t. the metamodel as it omits classes such as Block. Just as a view
type has different types of completeness, i.e., projectional and selectional completeness, a
view type that is not complete w.r.t. one of these properties is automatically partial w.r.t. that property.
Extending View Type Scope: In addition to the properties partial and complete, there
are also view types that combine elements from the underlying model with additional
information from an external model M_ex. The extended information is characterised by the
fact that it is not directly reachable from the extended view type by model navigation but
only via some kind of external model, e.g., a decorator model. Often, the information that
should be added in such a view type is additionally defined using a different metamodel
MM_ex. In our terminology a view type always refers to a single metamodel. Therefore,
the metamodel for such an extending view type refers to an artificial composite metamodel
including both related metamodels.
[Figure 8, diagram not reproduced. The class Storage (tableName : String) of the metamodel Persistency references exactly one BusinessObject (name : String <<inherited>>, valueType : Boolean) of the metamodel BusinessObjects via the association end entity.]
Figure 8: Using an external metamodel Persistency to non-intrusively add persistency annotations to an existing metamodel BusinessObjects.
A concrete example that shows how such an extension view type could be defined is depicted
in Figure 6, based on the example business object metamodel. The example shows
that there is an external persistency annotation metamodel that does not have any connection
with the business object metamodel. The storage annotation only contains a hint to
the name of the business object that should be persisted, in its boName attribute. However,
it might be a requirement that a language engineer needs to define a view type not only
showing elements of the persistency metamodel but also presenting information from the
business object metamodel, i.e., whether the mentioned business object is a value type or not.
Therefore, a query is given in the view type that retrieves the corresponding business
object with the specified name and from which the valueType property is then shown in the
view, as illustrated in Figure 7.
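The query-based connection between the two metamodels can be approximated in a few lines of Python. The sketch mirrors the template of Figure 6 for the read direction only; the class and function names are assumptions, and the error handling for unresolved names is a design choice of this example, not something prescribed by the view type definition.

```python
from dataclasses import dataclass


@dataclass
class BusinessObject:
    """Element of the BusinessObjects metamodel."""
    name: str
    value_type: bool


@dataclass
class Storage:
    """Element of the external Persistency metamodel; refers to a BO by name only."""
    bo_name: str
    table_name: str


def render_storage(storage, business_objects):
    """Render a Storage annotation, pulling valueType from the queried business object."""
    # Counterpart of: BusinessObject.allInstances()->select(name = boName)
    matches = [bo for bo in business_objects if bo.name == storage.bo_name]
    if len(matches) != 1:
        raise LookupError(f"expected exactly one business object named {storage.bo_name!r}")
    value_type = str(matches[0].value_type).lower()
    return f"store {storage.bo_name} with valueType = {value_type} to {storage.table_name}"


business_objects = [BusinessObject("Company", False), BusinessObject("Address", True)]
text = render_storage(Storage("Company", "CompanyTable"), business_objects)
print(text)  # store Company with valueType = false to CompanyTable
```

Because the link between the two models exists only through the name-based query, renaming a business object silently breaks the annotation, which the sketch surfaces as a LookupError.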
Overlapping View Type Scope: This property is not a direct property of a view type
but defines a relationship between two or more view types. View types may also cover
scenarios where more than one view type is able to represent the same type of
element. On the other hand, it is also possible that the same view type can handle a distinct
type of element in different ways. We call these types of overlaps inter and intra view
type overlap. Figure 5(b) shows examples of both types of overlap.
An inter view type overlap occurs whenever two or more view types are able to represent
the same element. A prerequisite for this property is that the involved view types are based
on the same metamodel. Figure 5(b) shows that the “BusinessObject structure” view type
has an inter view type overlap with the “ValueTypes Overview” view type as both show
BusinessObjects.
If the same view type can represent the same element in different ways, an intra view
type overlap is present. This means that there is more than one predicate in the view type
that includes the same element. Figure 5(b) shows that the “BusinessObject structure”
view type has an intra view type overlap as association ends are represented as a
compartment within the BusinessObject’s shape as well as within a label decoration of the
Association’s shape.
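Both kinds of overlap can be detected from the predicates of the view types alone. In the Python sketch below, each view type is encoded as a list of (metaclass, representation) pairs; this encoding and the representation labels are assumptions made for the illustration.

```python
from collections import Counter

# Each view type lists its predicates as (metaclass, representation) pairs.
VIEW_TYPES = {
    "BusinessObject Structure": [
        ("BusinessObject", "box"),
        ("AssociationEnd", "compartment entry"),
        ("AssociationEnd", "label decoration"),
    ],
    "ValueTypes Overview": [
        ("BusinessObject", "list entry"),
    ],
}


def intra_overlaps(view_type):
    """Metaclasses included by more than one predicate of the same view type."""
    counts = Counter(metaclass for metaclass, _ in VIEW_TYPES[view_type])
    return {metaclass for metaclass, n in counts.items() if n > 1}


def inter_overlaps():
    """Metaclasses that more than one view type is able to represent."""
    shown_by = {}
    for view_type, predicates in VIEW_TYPES.items():
        for metaclass, _ in predicates:
            shown_by.setdefault(metaclass, set()).add(view_type)
    return {m: vts for m, vts in shown_by.items() if len(vts) > 1}


print(intra_overlaps("BusinessObject Structure"))  # {'AssociationEnd'}
print(sorted(inter_overlaps()))                    # ['BusinessObject']
```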
Representation: The third type of rules which a view type defines are responsible for
defining the representation of the elements of a view. A view may comprise different types
of representation rules. Possible types are textual, graphical, tabular and arbitrary other
types. Rules may also combine different types. For example, a graphical representation
may include some textual or tabular parts as well.
3.2 Views
Views, as instances of view types, can also have different properties, which are depicted in
Figure 9 using a feature diagram. Using these properties, we classify views concerning
the extent of information they show of the underlying model(s). We distinguish between
selective and holistic views. Additionally, we handle the persistence and editability of
layout and selection as well as inter and intra view overlaps of views. The following
properties affect the behaviour of a view instance; however, the view type may also
define generically whether a view has a specific property or not.
[Figure 9, feature diagram not reproduced. Features of a view, applicable per view or per predicate: View Scope (Selective or Holistic, each with respect to Addition and Deletion), Persistency (Selection, Layout), Editable Entities (Layout, Model) and Overlapping (Intra View, Inter View).]
Figure 9: Properties of view instances.
View Scope: A view shows a specific selection of elements from its underlying model.
If changes occur in the model, i.e. elements are added or deleted, the view needs to be
updated according to these changes. The view scope property defines whether this is done
automatically or only if a user explicitly requests the update.
Selective View Scope: A view is considered selective if it is possible to show a subset of
the elements that could be shown according to its view type. A selective view only shows
these specifically selected elements. The selection may either be done automatically or
manually by a user of the view. For example, the view example for the “BusinessObject
structure” view type depicted in Figure 1 is selective, as the modeller can manually select
whether or not a specific BO occurs in the view. In this case the view only shows the
BOs “Customer” and “Address” and omits, for example, “Company”.
A view can be selective concerning different types of changes:
Addition Selective: Elements added to the model that fall into the scope of a view’s
view type are only added to the view’s selection if the modeller adds them manually. For
example, views of the “BusinessObject Structure” view type may not show all BOs at once. A
modeller can select whether a newly added BO should appear in a certain view or not.
Deletion Selective: Deletions of elements from the model that fall into the scope of a view’s
view type are only propagated to the view’s selection if the modeller removes them manually.
Thus, the deletion of elements from the underlying model does not necessarily result in the
deletion of their view representations. For example, in many graphical modelling
tools, representations of elements in a view whose underlying model element
is no longer available are not automatically removed from the diagrams but are
rather annotated, indicating that the underlying model element is missing.
Holistic View Scope: In contrast to addition selective views, a view may be addition
holistic. This means that it always presents the whole set of possible elements that can be
displayed by the view. If elements are added and/or removed, this is immediately and
automatically reflected in the view. Modelling tools mostly use this type of view to present
the user an overview of the underlying model. Analogously, deletion holistic views directly
reflect any deletion of an element by removing its view representation, i.e., view and model
are always in sync.
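The difference between selective and holistic view scopes comes down to how a view reacts to change notifications from the model. The following Python sketch makes the two flags explicit; the class View, its method names and the "dangling" set used to annotate missing elements are assumptions made for this illustration.

```python
class View:
    """A view whose scope is configured separately for additions and deletions."""

    def __init__(self, addition_holistic=False, deletion_holistic=False):
        self.addition_holistic = addition_holistic
        self.deletion_holistic = deletion_holistic
        self.shown = set()     # elements currently represented in the view
        self.dangling = set()  # shown elements whose model element is missing

    def select(self, element):
        """Manual selection by the modeller."""
        self.shown.add(element)

    def on_model_added(self, element):
        # Addition holistic: every new element appears automatically.
        if self.addition_holistic:
            self.shown.add(element)

    def on_model_deleted(self, element):
        if element not in self.shown:
            return
        if self.deletion_holistic:
            self.shown.remove(element)  # view and model stay in sync
        else:
            self.dangling.add(element)  # keep the representation, but annotate it


selective = View()
holistic = View(addition_holistic=True, deletion_holistic=True)
for view in (selective, holistic):
    for bo in ("Customer", "Address", "Company"):
        view.on_model_added(bo)
selective.select("Customer")
for view in (selective, holistic):
    view.on_model_deleted("Customer")

print(sorted(selective.shown), sorted(selective.dangling))  # ['Customer'] ['Customer']
print(sorted(holistic.shown), sorted(holistic.dangling))    # ['Address', 'Company'] []
```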
Overlapping: This property is not a direct property of a view but defines a relationship
between two or more views. A view may be overlapping with another view. In this case
elements may occur in more than one view at once. The other view may be of the same view
type but also of a different one. If an element occurs in multiple views we speak of inter
view overlap, whereas we call multiple occurrences within the same view intra view overlap.
Editability: In addition to displaying model elements according to the view type’s rules,
an editable view needs to provide means to interact with and thus modify the underlying
model. Actions such as create, update and delete need to be performable to make a view
editable. Editability of views can also be subdivided into two different degrees of editability. First, if only the layout information can be changed but not the actual model content,
the view is only layout editable. Second, if the model content is editable through the view
it is considered content editable.
Another interesting aspect is that editability is closely related to the view type scope (cf.
Section 3.1). The scoping of a view type might determine the editability of its views.
For example, a view type might omit a mandatory attribute of a metamodel class in its
specification. In this case it is not possible to create new instances of this class using this
view type but it is still possible to view and modify instances of the class.
Persistency: A view may be persistent regarding its selection as well as its layout. Stored
view layouts enable faster access, as the layout does not need to be created anew every time a
modeller opens the view. Additionally, a persistent view that is at the same time editable
enables customisation of the view’s selection of elements and/or layout.
For non-holistic views, the modeller decides which elements a view should include and
which not. If such a selection should be saved, a view needs to be selection persistent. In
this case the view’s selection of elements is stored.
Additionally, a modeller may customise the layout of the view by manually changing
certain parts, such as the explicit positioning of the elements occurring in the view, or, for
textual views, white space or indentation. Furthermore, a modeller may add additional,
mostly informal content, such as comments or annotations. If a view allows this
kind of information to be stored, it is layout persistent.
3.3 Editor Capabilities
Features that have an impact on how view-based DSML frameworks deal with the interaction of
users and views as well as the synchronisation between view and model also
influence the requirements on an employed view-based modelling approach. Figure 10
depicts a feature diagram that gives an overview of these editor capabilities. Note that we
included only those properties that we consider special requirements for a view-based
modelling approach. A broader view on DSML editor capabilities, at least for textual DSML
approaches, can be found in [GBU08].
[Figure 10, feature diagram not reproduced. Framework capabilities: Inconsistency Handling (Constraint Inconsistency, Model Inconsistency), Update Strategy (Immediate, Deferred) and Bidirectionality.]
Figure 10: Editor capabilities.
Bidirectionality: To keep models and their views in sync, the rules that perform this
synchronisation need to be bidirectional (or there need to be two rules where one is the
inverse of the other). In order to be correct, a bidirectional rule (or a pair of corresponding
rules) needs to comply with the effect conformity property. To comply with effect conformity,
changes made directly to the model should leave it in the same state as an equal change
on the view level that is then automatically propagated back to the model would. This
automated back propagation is defined by the view type rules. Vice versa,
changes made through the view to the model should leave the view in the same state as an
equal change on the model level which is propagated by the corresponding view type rule.
Additionally, Matsuda et al. [MHN+07] define three bidirectional properties that need to
be fulfilled in order to create consistent view definitions.
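Effect conformity can be illustrated with a pair of inverse rules for a simplified textual persistency view. In the Python sketch below, the model is a mapping from business object names to table names and the view is a list of lines of the form "store <bo> to <table>"; this reduced syntax and the function names are assumptions, and the check covers a single change only rather than proving the property in general.

```python
def to_view(model):
    """Model -> view: one textual line per storage mapping, in a canonical order."""
    return [f"store {bo} to {table}" for bo, table in sorted(model.items())]


def to_model(view):
    """View -> model: parse every line back into a mapping entry."""
    model = {}
    for line in view:
        _store, bo, _to, table = line.split()
        model[bo] = table
    return model


model = {"Company": "CompanyTable"}

# The same logical change, once made directly on the model and once on the view.
changed_model = {**model, "Customer": "LocatedCust"}
changed_view = to_view(model) + ["store Customer to LocatedCust"]

# Changing the view and propagating back yields the directly changed model ...
model_side_conforms = to_model(changed_view) == changed_model
# ... and changing the model and propagating forward yields the changed view.
view_side_conforms = to_view(changed_model) == changed_view

print(model_side_conforms, view_side_conforms)  # True True
```

The canonical ordering in to_view matters here: if a view kept user-defined line order or layout, the second comparison would have to ignore that layout information, which is one reason why layout is treated separately from model content.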
Update Strategy: We classify the update strategy which triggers the propagation of
changes between view and model into two different types. (I) An update can be performed
at the very moment a change is made to one of the sides, either model or view. This kind
of update is denoted an immediate update strategy. (II) An update may occur at a point
in time decoupled from the actual change event. This kind of update is denoted deferred
update strategy.
The point in time when updates are performed determines the number of changes that may
accumulate between two subsequent synchronisation runs. In the immediate update strategy the
transformations are executed as soon as an atomic change has been performed on either the view
or its model. This strategy allows a tighter coupling between view and model and avoids
conflicts that may occur if an arbitrary number of changes is performed before the next
synchronisation. The deferred update strategy, on the other hand, allows an arbitrary number
of changes in this time span. This makes it possible to work with views in a more
flexible way, as they can be changed offline, i.e., while the underlying model is currently not
available. However, having an arbitrarily large number of changes that need to be
synchronised dramatically increases the probability of conflicts.
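The two update strategies differ only in when pending changes are propagated. The following Python sketch shows the view-to-model direction; the class SyncedView, its methods and the key/value encoding of changes are assumptions made for this example, and conflict detection is deliberately left out.

```python
class SyncedView:
    """Propagates view changes to the model immediately or on explicit request."""

    def __init__(self, model, immediate):
        self.model = model
        self.immediate = immediate
        self.pending = []  # changes made in the view but not yet propagated

    def change(self, key, value):
        self.pending.append((key, value))
        if self.immediate:
            self.synchronise()  # one atomic change per synchronisation run

    def synchronise(self):
        """Propagate all pending changes; returns how many were applied."""
        applied = len(self.pending)
        for key, value in self.pending:
            self.model[key] = value
        self.pending.clear()
        return applied


immediate_model, deferred_model = {}, {}
immediate = SyncedView(immediate_model, immediate=True)
deferred = SyncedView(deferred_model, immediate=False)

for view in (immediate, deferred):
    view.change("Company", "CompanyTable")
    view.change("Customer", "LocatedCust")

print(len(immediate_model), len(deferred_model))  # 2 0
applied = deferred.synchronise()
print(applied, len(deferred_model))               # 2 2
```

With the deferred strategy the model may have changed in the meantime, so a realistic synchronise step would have to compare the pending changes against the current model state before applying them.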
Consistency Conservation: As modelling is a creative process, models are mostly created
step-by-step. Thus, allowing for intermediate, possibly inconsistent states may foster the
usability and productivity of a view-based DSML [Fow05, FGH+94]. If this is the case, an
editable view might contain valuable information that was created during modelling but
that is not yet transformable into a valid model.
We define two different classes of inconsistency: (I) violation of metamodel constraints
that lead to what we call constraint inconsistency, which means that the view has statically
detectable semantic errors and (II) model inconsistency if a view is syntactically incorrect
and cannot be transformed into a model at all.
Metamodel constraints restrict the validity of models that would theoretically be constructible obeying only the rules defined in the metamodel without constraints. This also
includes multiplicity definitions for associations and attributes. For example, considering
our example metamodel, an invariant defined for the metamodel class BusinessObject
expresses that a MethodSignature may not have the same name as an AssociationEnd connected to the same BO (expressed as OCL invariant: inv: self.signatures->forAll(s | self.elementsOfType.typedElement.name <> s.name)). If there exists an instance of MethodSignature which has a name that is already
given to such an AssociationEnd, this constraint is violated. However, during the
process of modelling there may be intermediate states where both elements have the same
name, e.g., during a renaming process. Still, the element should be representable in a
view, i.e., with additional information stating that the constraint is currently violated. If
constraint inconsistency was not supported, the modeller would have to first change the
AssociationEnd’s name before renaming the MethodSignature.
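A view that tolerates constraint inconsistency could evaluate such an invariant without rejecting the edit. The following sketch is only an illustration under simplifying assumptions: BusinessObject is reduced to two lists of names, and the functions violated_names and view_annotations are hypothetical helpers, not part of any tool discussed here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BusinessObject:
    # Simplified stand-in for the metamodel class: only names are kept.
    signature_names: List[str] = field(default_factory=list)
    association_end_names: List[str] = field(default_factory=list)


def violated_names(bo: BusinessObject) -> List[str]:
    """Names used by both a MethodSignature and an AssociationEnd of this BO."""
    ends = set(bo.association_end_names)
    return [name for name in bo.signature_names if name in ends]


def view_annotations(bo: BusinessObject) -> List[str]:
    """Keep the element in the view and attach warnings instead of rejecting it."""
    return [f"constraint violated: name '{n}' is used twice" for n in violated_names(bo)]
```

During a renaming process the annotation list is temporarily non-empty; once the names differ again, it becomes empty and the view can be transformed into a valid model.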
In case (II), a greater degree of freedom in modelling can be reached if a view may even
hold content that cannot be translated into a model at all. This allows a developer to
work with the view like a “scratch pad”. We denote this type of inconsistency model
inconsistency. As graphical modelling tools mostly allow the modeller to perform only
atomic modifications that preserve the syntactical correctness of the view, this type of
inconsistency occurs mostly within textual modelling, where modellers are
often free to type syntactically incorrect information within a view.
4 Related Work
Oliveira et al. [OPHdC09] presented a theoretical survey on DSLs which also included
the distinction between the language usage and development perspectives. However, the
presented survey remains on a more conceptual level, mostly dealing with properties such
as internal vs. external, compilation vs. interpretation as well as general advantages and
disadvantages of DSL approaches. The authors neither mention graphical DSMLs nor
include view-based modelling aspects in their survey.
Pfeiffer and Pichler give a tool oriented overview on textual DSMLs in [Pfe08]. Their survey is based on three main categories, which are language, transformation, and tool. The
evaluated features include the representation and composability of language definitions,
transformation properties such as the update strategy as well as the kind of consistency
checking that is supported. However, view-based modelling aspects and graphical or hybrid DSML approaches are omitted.
Buckl et al. [BKS10] have refined the ISO 42010 standard and created a framework for
architectural descriptions. A formal definition of the terms view, viewpoint and concern is
provided, which is in compliance with ISO 42010. The definition is however restricted to
architecture modeling.
In our own previous work [GBU08] we conducted a classification based survey on textual
concrete syntax approaches which have a common intersection with the approaches for
view-based DSMLs. However, the focus in this previous work was on evaluating the
textual modelling capabilities such as grammar classes, generator and editor capabilities
(such as code completion or syntax highlighting) and did not include features that are
required for view-based modelling.
Another feature-based survey on textual modelling tools is presented by Merkle in [Mer10].
Although this survey includes some features which we also discuss, such as the representation of the concrete syntax as well as some tool-related aspects, it does not present
view-related features, nor does it give hints on the existence of view-based aspects in the
classified tools.
5 Conclusions & Future Work
In this paper, we identified properties for the main concepts of view-based DSMLs. The
analysis was based on our experiences with several different graphical as well as textual
DSML approaches. We distinguish between viewpoints, view types and views, and
focus on properties that relate to tool-oriented capabilities such as partial or
overlapping view definitions or holistic and selective views.
This classification scheme allows DSML developers and users to explicitly specify properties of view types and views. This enhances the communication between language engineers and modellers during requirements elicitation, specification and implementation of
view-based DSMLs.
Based on the classification scheme we will carry out a systematic review of existing DSML
approaches. The results of this analysis will benefit language engineers by helping
them in the selection of a view-based modelling approach. Furthermore, we will be
able to identify gaps in tool support w.r.t. view-based modelling. Preliminary results of
the tool evaluation are available online.²
References
[ASB09]
Colin Atkinson, Dietmar Stoll, and Philipp Bostan. Supporting View-Based Development through Orthographic Software Modeling. In Stefan Jablonski and Leszek A.
Maciaszek, editors, ENASE, pages 71–86. INSTICC Press, 2009.
[BKS10]
Sabine Buckl, Sascha Krell, and Christian M. Schweda. A Formal Approach to Architectural Descriptions – Refining the ISO Standard 42010. In Advances in Enterprise
Engineering IV, volume 49 of Lecture Notes in Business Information Processing, pages
77–91. Springer Berlin Heidelberg, 2010.
[CJKW07]
Steve Cook, Gareth Jones, Stuart Kent, and Alan Wills. Domain-Specific Development
with Visual Studio DSL Tools. Addison-Wesley Professional, first edition, 2007.
[Cle03]
Paul Clements. Documenting software architectures: Views and beyond. SEI series in
software engineering. Addison-Wesley, Boston, Mass., 2003.
[Ecl11a]
Eclipse Foundation. Graphical Modeling Framework Homepage. http://www.eclipse.org/gmf/, 2011. Last retrieved 2011-10-06.
[Ecl11b]
Eclipse Foundation. Xtext Homepage. http://www.eclipse.org/Xtext/,
2011. Last retrieved 2011-10-06.
[FGH+ 94]
A. Finkelstein, D. Gabbay, A. Hunter, J. Kramer, and B. Nuseibeh. Inconsistency Handling In Multi-Perspective Specifications. IEEE Transactions on Software Engineering, 20:569–578, 1994.
2 http://sdqweb.ipd.kit.edu/burger/mod2012/
[FKN+ 92]
A. Finkelstein, J. Kramer, B. Nuseibeh, L. Finkelstein, and M. Goedicke. Viewpoints:
A Framework for Integrating Multiple Perspectives in System Development. International Journal of Software Engineering and Knowledge Engineering, 2, 1992.
[Fow05]
Martin Fowler. Language Workbenches: The Killer-App for Domain Specific Languages? 2005.
[GBU08]
Thomas Goldschmidt, Steffen Becker, and Axel Uhl. Classification of Concrete Textual Syntax Mapping Approaches. In Proceedings of the 4th European Conference on
Model Driven Architecture - Foundations and Applications, pages 169–184, 2008.
[GHZL06]
John C. Grundy, John G. Hosking, Nianping Zhu, and Na Liu. Generating Domain-Specific Visual Language Editors from High-level Tool Specifications. In ASE, pages
25–36. IEEE Computer Society, 2006.
[ISO11]
ISO/IEC/IEEE Std 42010:2011 – Systems and software engineering – Architecture description. Los Alamitos, CA: IEEE, 2011.
[KT08]
S. Kelly and J-P. Tolvanen. Domain-Specific Modeling: Enabling Full Code Generation. Wiley-IEEE Society Press, 2008.
[KV10]
Lennart Kats and Eelco Visser. The Spoofax Language Workbench. Rules for Declarative Specification of Languages and IDEs. In Proceedings of OOPSLA, pages 444–463,
2010.
[MCF03]
S.J. Mellor, A.N. Clark, and T. Futagami. Model-driven development - Guest editor’s
introduction. IEEE Software, 20:14–18, 2003.
[Mer10]
Bernhard Merkle. Textual modeling tools: overview and comparison of language
workbenches. In Proceedings of SPLASH, pages 139–148, New York, NY, USA, 2010.
ACM.
[MHN+ 07] Kazutaka Matsuda, Zhenjiang Hu, Keisuke Nakano, Makoto Hamana, and Masato
Takeichi. Bidirectional Transformation based on Automatic Derivation of View Complement Functions. In Proc. of the ICFP 2007, pages 47–58. ACM Press, 2007.
[MPS]
JetBrains MPS. http://www.jetbrains.net/confluence/display/MPS/Welcome+to+JetBrains+MPS+Early+Access+Program.
[Obj06]
Object Management Group (OMG). MOF 2.0 Core Specification, 2006.
[Obj10]
Object Management Group (OMG). Diagram Definition, 2010.
[OPHdC09] Nuno Oliveira, Maria Joao Varanda Pereira, Pedro Rangel Henriques, and Daniela
da Cruz. Domain Specific Languages: A Theoretical Survey. In Proceedings of
the 3rd Compilers, Programming Languages, Related Technologies and Applications
(CoRTA’2009), 2009.
[Pfe08]
Michael Pfeiffer and Josef Pichler. A Comparison of Tool Support for Textual Domain-Specific Languages. In 8th OOPSLA Workshop on Domain Specific Modeling, 2008.
[RW05]
Nick Rozanski and Eoin Woods. Software Systems Architecture. Addison-Wesley,
2005.
[Szy02]
C. Szyperski. Component software: beyond object-oriented programming. ACM
Press/Addison-Wesley Publishing Co., 2002.
Towards a Conceptual Framework for Interactive
Enterprise Architecture Management Visualizations
Michael Schaub, Florian Matthes, Sascha Roth
{michael.schaub | matthes | sascha.roth}@in.tum.de
Abstract: Visualizations have become a de-facto standard means for decision-making in the management discipline of enterprise architecture (EA). These
visualizations are often created manually, so that they are soon outdated since the underlying data change frequently. As a consequence, EA management tools require
mechanisms to generate visualizations. In this vein, a major challenge is to adapt
common EA visualizations to an organization-specific metamodel. At the same time,
end-users want to interact with the visualization, i.e. change data directly
within the visualization for the strategic planning of an EA. As of today, there is no
standard, framework, or reference model for the generation of such interactive EA
visualizations.
This paper 1) introduces a framework, i.e. an interplay of different models to realize interactive visualizations, 2) outlines requirements for interactive EA management
visualizations referring to concepts of the framework, 3) applies the framework to a
prototypical implementation detailing the therein used models as an example, and 4)
compares the prototype to related work employing the framework.
1 Introduction
Today’s enterprises cope with the complexity of changes to highly interconnected business applications, where local changes often have global consequences, i.e. impact
the application landscape as a whole. At the same time, change requests to business applications or processes are required to be fast and cost-effective in response to competitive
global markets with frequently changing conditions [WR09, Ros03]. Enterprise Architecture (EA) management promises to balance between short-term business benefit and long-term maintenance of both business and IT in an enterprise [MWF08, MBF+ 11]. Thereby,
having a holistic perspective of the EA is indispensable. In this vein, visualizations have
become a de-facto standard means for strategic decision making in the management
discipline of EA. Concepts formally describing EA management visualizations are summarized as system cartography¹, whereby the generation of visualizations out of existing
data is not yet described in depth [Wit07], i.e. currently there exists no standard, framework, reference architecture, or best practice for generating EA visualizations.
¹ Formerly known as software cartography [Mat08].
Slightly later than the discipline itself, tool support for EA management emerged
[MBLS08, BBDF+ 12]. With respect to their visualization capabilities, the range of tools
for EA management reaches from mere drawing tools to a model-driven generation of
visualizations. The former approach has clear drawbacks since visualizations are created
manually in a handcrafted, error-prone, and inefficient process. The latter approach is
often limited to a single information model, also known as a metamodel. Such an information
model has to try to capture the entirety of all relevant entities across all business domains
and industry sectors. Since this endeavour is doomed to fail, EA vendors chose to offer
mechanisms for extending a ‘core’ information model. Since no standard information
model for EAs exists, enterprises tend to use an organization-specific information model
that reflects their information demands and adopts the enterprise’s terminology, i.e.
the aforementioned extension mechanisms are frequently used [BMR+ 10a]. At the same time,
the respective visualization algorithms do not adapt to those changes automatically, i.e. the
visualizations have to be adapted to the extensions, leading to extensive configuration or
additional implementation/customization efforts. Since there is no standard, framework,
or reference model for generating such an interactive EA visualization, we arrive at
the following research question:
‘What does a common framework or reference model for generating interactive EA visualizations look like?’
The remainder of this paper is structured as follows: Section 2 introduces a conceptual
framework for generating interactive visualizations in general and in particular for EA
management. An outline of requirements for interactive EA management visualizations
is given in Section 3. The framework is then applied to a prototypical implementation in
Section 4. Subsequently, Section 5 revisits related approaches and compares them to the
prototype employing the introduced framework. Finally, Section 6 concludes this paper
and gives a brief outlook on open research questions.
2 Generating interactive visualizations
Figure 1 gives an overview of a conceptual framework to generate interactive visualizations, which is detailed in the following. The framework consists of:
A data model, which is considered as the actual data d within a data source that can be
retrieved by a query q. Depending on the nature of the data source, different fields may
have different access permissions [BMR+ 10b, BMM+ 11]. Therefore, a data interaction
model di captures the different access permissions for each concrete x ∈ d, i.e. access
rights and permissions on data level but not on schema level. As an example, users of a certain department might only get information about business applications in their particular
business unit. An information model that describes the schema im that the data model
d is based on. “An information model is a representation of concepts, relationships, constraints, rules, and operations to specify data semantics for a chosen domain of discourse”
[Lee99]. An interaction model i that subsumes the interactions that are allowed on
the information model level, i.e. which entity can be created, read, updated, or deleted.
For instance, a certain role can only create business applications but is not allowed to
create business units. An abstract information model, which can be a template for a
certain information model or a type/entity therein. Based on the observations of Buckl et
al. in [BEL+ 07], organizations use recurring patterns to describe their managed information. In particular, Schweda shows in [Sch11] that recurring patterns of information models
have been observed, which he synthesized to so-called information model building blocks
(IBBs). Such an information model template, fragment, or building block comes with a
certain abstract interaction model that describes e.g. predefined access rights synthesized as best practices.
[Figure 1: A conceptual framework to generate interactive visualizations. Elements shown: interactive view (visualization); data model, information model and abstract information model with data interaction model, interaction model and abstract interaction model; view data model, view model and abstract view model with their interaction models; symbolic model, visualization model and abstract visualization model with symbolic, visual and abstract visual interaction model; query, transformation, viewpoint and VBB.]
A view data model v = q(d) such that v ⊆ d ∪ q1, whereby q1 are results of q that are
calculated out of d, e.g. aggregations or average values. A view data interaction model
⊆ di, which is derived from q. In some cases, q reduces di not only to the selected values
of d but also by additional interactions, e.g. aggregated values cannot be edited regardless
of the access rights for a specific x ∈ d. A view model is the schema vm of v, such that
vm ⊆ im ∪ q2 is derived from q, whereas q2 describes the part of the schema which has
been created entirely by q, i.e. in general q2 ⊄ im. A view interaction model vi ⊆ i,
which is determined by q, i.e. depending on a particular q, interactions of i are enabled or
not by vi. For instance, updates on aggregated values are prohibited, whereas relationships
and transitive relationships² could be updated³ based on i. An abstract view model va, which
defines the information demands for a particular visualization blueprint. The abstract view
model va can be used as a basis to perform a pattern matching, i.e. matching for the pattern
given by va on im (see e.g. [BURV11, BHR+ 10a]). An abstract view interaction model,
which defines permitted interactions based on the information demands va.
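The relation between the data model, a query and the resulting view data model can be sketched as follows. The sketch is merely illustrative: the class Value, the example query and the set-based representation of the data interaction model are assumptions made for this example and are not part of the framework definition.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Value:
    key: str
    content: float
    derived: bool = False  # True for results calculated by the query (q1)


def query(data: List[Value]) -> List[Value]:
    """Hypothetical q: select all cost values of d and add their average."""
    selected = [x for x in data if x.key.startswith("cost.")]
    if not selected:
        return []
    average = sum(x.content for x in selected) / len(selected)
    return selected + [Value("cost.average", average, derived=True)]


def view_data_interactions(view: List[Value], editable: Set[str]) -> Set[str]:
    """Restrict the editable keys to the view; derived values are never editable."""
    return {x.key for x in view if not x.derived and x.key in editable}
```

The returned view contains selected elements of d plus one derived value, and the derived value is excluded from the editable keys regardless of the access rights that were passed in.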
A symbolic model sm summarizes the rendered symbols, i.e. instances of shapes like
rectangles, lines, etc., such that ultimately sm is the visualization as such. A symbolic
interaction model offers interactions on the actual visualization. These interactions are of
general concern for all sm, e.g. navigation or adaptive zooming [CG02], and do not relate
to d or im. A visualization model vism is the definition of visual primitives, i.e. shapes
like rectangles, lines, etc. and simple compositions thereof. Thereby, sm is an instantiation of vism which has been fully configured, e.g. a red dotted line. A visual interaction
model visi comprises the interactions that come with selections s ⊆ vism. Thereby, s is e.g. a
rectangle which is draggable and may change its size on user manipulation. The instantiated and configured mapping of vm to vism can be summarized as a viewpoint in line
with ISO/IEC 42010:2007 (cf. [Int07]). An abstract visualization model visa describes more complex compositions of elements of vism. Thereby, visa is not an instance
of vism but a predefined composition, i.e. blueprint or building block, with additionally specified variability points that may modify the actual appearance of
sm. An abstract visual interaction model describes possible interactions from the
purely visual point of view with respect to the predefined configurations. For instance, a
text not fitting inside a rectangle is cut after reaching a maximum length and a ‘...’ string
is appended. In addition to cutting off the over-sized text so that the text object visually
fits, a tool tip is added to give the end-user feedback on the actual text contained. Such a
behaviour is independent of a concrete visualization and can thus be defined in an abstract manner, therefore constituting a separate model. A described mapping of va to visa
including variability points can be summarized as a viewpoint building block (VBB) in
line with Buckl et al. [BDMS10].
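The text-cutting behaviour mentioned above is independent of any concrete visualization, which a few lines suffice to show. The names TextSymbol and fit_text are hypothetical, and measuring the available space in characters rather than pixels is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TextSymbol:
    text: str                      # what is drawn inside the rectangle
    tooltip: Optional[str] = None  # full text, only set when the text was cut


def fit_text(label: str, max_length: int) -> TextSymbol:
    """Cut over-sized text, append '...' and keep the full text as a tool tip."""
    if len(label) <= max_length:
        return TextSymbol(label)
    return TextSymbol(label[:max_length] + "...", tooltip=label)
```

For example, fit_text("Accounting System", 10) yields the visible text "Accounting..." with the complete name as tool tip, whereas a label that fits is returned unchanged and without tool tip.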
² As used e.g. in the visualizations introduced in [BMR+ 10b].
³ Such an update may require creating stub objects or additional user interventions, depending on the concrete information model.
3 Requirements for interactive EA visualizations
With a focus on EA management, we identified the following requirements for interactive
EA visualizations which will be explained with regard to the aforementioned conceptual
framework.
As outlined above, in the context of EA management enterprises tend to use organization-specific information models since there is no common standard EA model to describe
an entire organization’s information demands. Consequently, it has to be ensured that
an arbitrary information model can be visualized dynamically, i.e. an EA visualization
tool should be able to generate visualizations out of data without the need to manually
adapt to information models (Re1). More technically speaking, this also implies that the
mapping of an information model to a visualization model must be performed dynamically
at runtime and be configurable by end-users (Re1.1).
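To illustrate Re1.1, the following minimal sketch treats the mapping from an information model to the fields of a visualization as plain configuration data that is evaluated at runtime. The JSON layout and the function to_view_entity are hypothetical and not taken from the prototype; the field names follow the example used in Section 4.

```python
import json
from typing import Any, Dict

# Hypothetical configuration: information model field -> target field.
CONFIG: Dict[str, Any] = json.loads("""
{
  "outer": "Business Unit (BU)",
  "inner": "Business Application (BA)",
  "fields": {
    "name": "name",
    "developmentFrom": "rect1Start",
    "developmentTo": "rect1End"
  }
}
""")


def to_view_entity(entity: Dict[str, Any], config: Dict[str, Any]) -> Dict[str, Any]:
    """Rename the configured fields of an entity; all other fields are dropped."""
    return {
        target: entity[source]
        for source, target in config["fields"].items()
        if source in entity
    }
```

Because the mapping is data rather than code, an end-user could change it without any adaptation of the visualization algorithm itself.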
EA management visualizations are not only used to view data but are also consulted
when making strategic decisions. These often require impact analyses, which can be
performed best in a graphical manner, i.e. by manipulating the symbolic model directly
(Re2) to perform ‘what-if’ analyses. EA management can be viewed from many perspectives, depending on different stakeholders with different concerns, which results in
stakeholder-specific visualizations highlighting the data relevant for a particular issue [AKRS08,
IL08, Mat08, BGS10]. Interactive EA visualizations must be able to visualize a subset
of the data model or the information model (Re2.1) while offering valid interactions and
keeping consistency [DQvS08]. Thereby, these manipulations should not only influence
the visualization but also the underlying data, so that changes to the symbolic model are propagated to the respective data model and information model (Re2.2), being permitted and
constrained by an underlying data interaction model and interaction model.
Interactions with the visualization, i.e. the symbolic model, should preferably be smooth.
Following [Nie94], “the limit for having the user feel that the system is reacting instantaneously” is about 0.1 second. To provide EA visualizations in a decentralized manner
(cf. [BMNS09]), a solution is intended to use a client/server architecture allowing a centralized data model and information model, while the generated visualizations can be viewed and manipulated decentrally (Re3). In this vein, a major challenge is the reduction of
the round-trips needed for propagating changes in the symbolic model, which resides on the client,
to the data model, possibly located at the server. During such a round-trip, all kinds of
interactions have to be locked in order to guarantee that the semantic integrity of the data
model is not violated through any further incompatible interactions, following the ACID
(atomicity, consistency, isolation, durability) properties. Round-trips may take up
to a couple of seconds, leading to decreased user adoption. As a consequence, as many as
possible of the restrictions related to the permitted user interactions, defined by the interaction
model, should be available immediately within the client, such that impermissible manipulations of the
symbolic model are kept to a minimum, which increases usability (Re3.1).
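One possible reading of Re3.1 is that a copy of the permitted interactions is shipped to the client together with the visualization, so that forbidden interactions are rejected without a round-trip. The sketch below follows this reading; ClientInteractionModel, handle_interaction and the callback send_to_server are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class ClientInteractionModel:
    """Client-side copy of the permitted interactions per entity type."""
    permitted: Dict[str, Set[str]] = field(default_factory=dict)

    def allows(self, entity_type: str, operation: str) -> bool:
        return operation in self.permitted.get(entity_type, set())


def handle_interaction(
    model: ClientInteractionModel,
    entity_type: str,
    operation: str,
    send_to_server: Callable[[str, str], None],
) -> bool:
    """Reject forbidden interactions locally; only permitted ones cost a round-trip."""
    if not model.allows(entity_type, operation):
        return False
    send_to_server(entity_type, operation)
    return True
```

The server would still have to validate every request it receives; the client-side copy only saves round-trips for interactions that are known to be forbidden.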
In [BELM08], Buckl et al. have shown that visualizations, so-called V-Patterns, recur in
the discipline of EA management. Buckl et al. also synthesized these V-Patterns into so-called viewpoint building blocks (VBBs). Considering the framework explained above,
these V-Patterns are viewpoints, whereas the paradigm of VBBs is adapted. Buckl et
al. showed in [BELM08] that recurring patterns and combinations thereof are reused.
Therefore, EA visualizations must be defined as pre-configured, parameterized⁴ building
blocks (Re4) in order to increase the re-usability of existing software artifacts and shorten development periods. Moreover, EA visualizations must be generated employing such
building blocks and allowing combinations thereof (Re4.1) in order to enable end-users to build more complex
visualizations out of building blocks.
4 Prototypical implementation
Based on the framework introduced in Section 2, a prototypical implementation was developed, which is described in this section with reference to the requirements of the previous
section.
In the following, the process of generating an EA visualization, i.e. generating a symbolic
model, is explained in detail using an exemplary information model and data model. The EA
visualization generated is taken from Buckl et al. (V-Pattern 26 in [BELM08]) since they
used a pattern-based approach, i.e. they observed this kind of visualization with an underlying
information model at least three times⁵ in practice⁶.
Figure 2: Pattern Matching of abstract view model and information model
(a) Information model with view model (in dashed lines). Classes and attributes:
Business Unit (BU): +name : String
Business Application (BA): +name : string, +developmentFrom : Date, +developmentTo : Date, +plannedFrom : Date, +plannedTo : Date, +productionFrom : Date, +productionTo : Date, -retirementFrom : Date, -retirementTo : Date
Location: +name : String
Employee: +firstName : String, +lastName : String, +email : String, +phone : String
(b) Abstract view model. Classes and attributes:
Outer: +name : String
Inner: +name : String, +rect1Start : Date, +rect1End : Date, +rect2Start : Date, +rect2End : Date, +rect3Start : Date, +rect3End : Date, +rect4Start : Date, +rect4End : Date
Outer and Inner are linked via a 1:n relationship.
Figure 2(a) shows an excerpt from an information model consisting of business applications that are related to each other, business units that use them and are based at a certain
location, and employees that work at business units. An exemplary instantiation of this
information model is illustrated in Figure 3, which is used in the following example to
generate a time-interval map (cf. [BELM08] or Figure 5).
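As a rough illustration of how an abstract view model such as the one in Figure 2(b) could be matched against an information model such as the one in Figure 2(a), consider the following sketch. It does not use IncQuery or any other pattern-matching library; the dictionary-based representation, the listed 1:n relationships and the convention of pairing Date attributes by the suffixes From and To are assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Information model excerpt: class name -> {attribute name: type}.
CLASSES: Dict[str, Dict[str, str]] = {
    "Business Unit": {"name": "String"},
    "Business Application": {
        "name": "String",
        "developmentFrom": "Date", "developmentTo": "Date",
        "plannedFrom": "Date", "plannedTo": "Date",
        "productionFrom": "Date", "productionTo": "Date",
        "retirementFrom": "Date", "retirementTo": "Date",
    },
    "Location": {"name": "String"},
    "Employee": {"firstName": "String", "lastName": "String"},
}
# Assumed 1:n relationships, given as (one side, many side).
ONE_TO_MANY: List[Tuple[str, str]] = [
    ("Business Unit", "Business Application"),
    ("Location", "Business Unit"),
    ("Business Unit", "Employee"),
]


def date_pairs(attributes: Dict[str, str]) -> List[Tuple[str, str]]:
    """Pair Date attributes named <x>From and <x>To (assumed naming convention)."""
    pairs = []
    for name, type_ in attributes.items():
        if type_ == "Date" and name.endswith("From"):
            end = name[: -len("From")] + "To"
            if attributes.get(end) == "Date":
                pairs.append((name, end))
    return pairs


def match_outer_inner(
    classes: Dict[str, Dict[str, str]],
    one_to_many: List[Tuple[str, str]],
    required_pairs: int = 4,
) -> List[Tuple[str, str]]:
    """Return (outer, inner) candidates that satisfy the abstract view model."""
    matches = []
    for outer, inner in one_to_many:
        outer_ok = "name" in classes[outer]
        inner_ok = (
            "name" in classes[inner]
            and len(date_pairs(classes[inner])) >= required_pairs
        )
        if outer_ok and inner_ok:
            matches.append((outer, inner))
    return matches
```

Under these assumptions, only the pair of business unit and business application qualifies, since no other class on the many side of a relationship offers a name attribute together with four pairs of start and end dates.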
⁴ In this context, parameterized means explicitly defined variability points.
⁵ For an explanation of the ‘rule of three’ see [AIS+ 77].
⁶ This proves practical relevance as desired for a design science approach (cf. [HMPR04]).
Figure 3: Data model with view data model (in dashed lines)
CRM System : Business Application (BA): name = CRM System, developmentFrom = 01.01.2012, developmentTo = 01.04.2012, plannedFrom = 01.11.2011, plannedTo = 01.01.2012, productionFrom = 01.04.2012, productionTo = 01.01.2013, retirementFrom = 01.01.2013, retirementTo = 01.03.2013
Accounting System : Business Application (BA): name = Accounting System, developmentFrom = 01.01.2012, developmentTo = 01.06.2012, plannedFrom = 15.11.2011, plannedTo = 01.12.2011, productionFrom = 01.06.2012, productionTo = 01.01.2013, retirementFrom = 01.01.2013, retirementTo = 01.02.2013
IT Shared Services : Business Unit (BU): name = IT Shared Services
Munich : Location: name = Munich
Martina Musterfrau : Employee: firstName = martina, lastName = musterfrau, email = musterfrau@company.tld, phone = +49 123 798 456
Max Mustermann : Employee: firstName = max, lastName = mustermann, email = mustermann@company.tld, phone = +49 123 456 789
The first step towards generating a visualization is to define a VBB as an abstract template (Re4). Thereby, an abstract view model (see Figure 2(b)) is created stating that
‘outer’ and ‘inner’ entities linked via a 1:n relationship are the information demands for
this VBB. This formal specification of the information demands is used to match the pattern of the abstract view model against a given information model using the pattern-matching
library IncQuery [BHR+ 10b]. The pattern matcher searches the information model
for potential ‘outer’ entities that are required to have a name attribute and to be connected via a
1:n relationship to ‘inner’ entities, which in turn are required to offer four pairs of start and
end fields and a name attribute. This technique is used to offer an end-user the possibility
of visualizing data with an arbitrary information model (Re1) by choosing the information of interest from a list of potential ‘outer’ and ‘inner’ entities (Re1.1).
Besides the information demands, the VBB defines symbols that are used to visualize the chosen information, i.e. it describes variability points of elements of the visualization model in an
abstract visualization model (Figure 4(a)). For instance, symbols like rectangles or circles commonly offer different color configurations or could even be used interchangeably
(e.g. use circles instead of rectangles). Commonly, elements of the visualization model become
visible in a symbolic model, while the composite symbol, shown as dotted line in Figure 4(a), is used as a logical container for a set of symbols and is not directly visible. As
shown in Figure 4(a), the abstract visualization model defines that each inner entity is represented through three kinds of symbols, namely a composite symbol, a text symbol and
four rectangle symbols. Furthermore, the composite symbol is conceived to enable the
setting of constraints/rules for all symbols contained therein. Figure 4(a) illustrates text
and rectangle symbols embodying different attributes that are used for the transformation
process later on.
Finally, the VBB defines a mapping between abstract view model and
abstract visualization model. Therefore, the VBB states whether and how exactly an object/attribute of the
abstract view model is directly bound to a corresponding object/attribute of the abstract
visualization model. Moreover, the VBB also defines objects/attributes
of the abstract view model that are employed to calculate⁷ one or more objects/attributes
of the abstract visualization model. In the given example, the name attribute of an ‘inner’
entity is directly bound to the name attribute of a text symbol, whereas pairs of start- and
end-date attributes are used to calculate the width of rectangle symbols.
⁷ We call these derived attributes, which can be the result of any calculation, e.g. a sum of values, transitive
relationships in the information model, etc.
Figure 4: The connection between abstract and non-abstract visualization model
(a) VBB: Abstract visualization model. Text symbols with the attributes name : string, color : color, x position : integer, y position : integer; four rectangle symbols, each with background : color, width : integer, x position : integer, y position : integer; composite symbols as container.
(b) VBB: Visualization model. Text symbol with name : string = CRM System and color : color = black; rectangle symbols with background : color = red, green, blue and blue; composite symbols as container.
After matching the pattern of information demand of an abstract view model to determine which part of an
information model is needed, the view data model, highlighted with dashed lines in Figure
3, is extracted from the data model (Re2.1). A so-called viewpoint configuration is used
to set relevant parameters so that the viewpoint can process the view data model that is
passed to it. Thereby, the viewpoint configuration states which fields of the information model are mapped to which fields of the abstract view model; the end-user chooses them
from a list of all possible entities and combinations thereof (Re1.1) after
the pattern matching. In the given example, each pair of from and to values is mapped to
one pair of rectXStart and rectXEnd, i.e. developmentFrom is mapped to rect1Start and developmentTo is mapped to rect1End. All other rectX fields of the abstract view model are
mapped in a similar way. Furthermore, the field name of a business application entity is
mapped to the inner name attribute of the abstract view model. Besides the concrete mapping of an information model to an abstract view model, the abstract visualization model is
parameterized. In our example, the colors of the rectangles and the font size of the text are
set. The viewpoint configuration itself is passed to a RESTful web service as a JSON string,
where it is processed and passed on to the VBB. The resulting runtime models that are
created through parameterizing the abstract view model and abstract visualization model
are the view model (Figure 2(a)) and elements from the visualization model (Figure 4(b))
of the viewpoint.
Being fully configured, the viewpoint is finally used to process all entities of the view data model. In this step, for each entity of the view data model, one row
of the time-interval map, whose structure is defined in the visualization model, is added
to the symbolic model that can be seen in Figure 5. At this point, the layout is also done,
i.e. the x/y position and width parameters are calculated on the basis of the attributes
of the view data model. In our prototypical implementation, the result is JavaScript code
utilizing the Raphaël framework to generate the visualization in the web browser of the
client (Re3). In addition, the VBB not only specifies symbols or groups thereof in terms
of composite symbols, but also equips them with predefined interactions so that the user
can manipulate the visualization (Re2). In order to be able to set only permitted interactions, different information sources are used. As explained in Section 2 the interaction
Towards a Conceptual Framework for Interactive Enterprise Architecture Management Visualizations
wo vutt
kmj vutt
yoq vutt
†pq vutt
yo~ vutt
w‚ vutt
w‚ˆ vutt
†‚ vutt
‡mp vutt
iƒn vutt
hrg vutt
83
fmƒ vutt
srqprqonm lrqnoˆ
‡†l …„
†ƒƒr‚n€ ‡~}nm|
{zl ‡~}nm|
‡sy ‡~}nm|
Figure 5: Symbolic model
model determines which actions are allowed upon the data model without affecting the
integrity of the information model, whereas the data interaction model checks access rights
on a particular element in the data model. The abstract view interaction model, in turn,
states which kinds of interactions are allowed upon the abstract view model within this VBB.
In the given example (cf. Figure 6), the fields rectXStart and rectXEnd of the inner
entities can be changed (in italic), whereas the name field must remain the same (in bold).
Propagating these changes to the data model is only allowed because they are based upon
bijective functions. In contrast, when using derived attributes that are calculated from
aggregated values, interactions are not permitted, since an interaction that is not based
upon a bijective function would inevitably cause problems when propagating changes to the
data model (Re2.2).
[Figure 6: Abstract view interaction model]
Additionally, the abstract visual interaction model determines the permitted interactions
upon the different symbols that are used for rendering the symbolic model
later on, e.g. changes to the width of an element can be interpreted as changes of a date.
Interactions are represented by constraints that are attached to the symbols. For each
interaction that can be triggered, such as moving or dragging and dropping symbols, a
constraint is implemented with individual parameters that can be customized to restrict
this interaction. As an example, the rectangles of the time-interval map can be moved
horizontally only, whereas the composite symbols are limited to vertical movement.
Figure 7(a) shows each composite and rectangle symbol equipped with a Movement constraint
to achieve this functionality. The Movement constraint itself has three parameters,
direction, minimum and maximum, that have to be set in order to enable the equipped symbol
to be moved in the given direction between the minimum and maximum value. Accordingly, if a
symbol should be movable both horizontally and vertically, two Movement constraints have to
be attached to the symbol. Besides Movement, further constraints, like Resizing or
Containment, have been implemented, but are not needed
for this particular kind of visualization. Each of these constraints has its own parameters,
which are set up individually in a VBB. In addition to interaction constraints, the symbolic
interactions we used focus on direct user feedback, e.g. tool tip texts or highlighting on
selection. In our example, tool tip texts are shown when hovering over or dragging an end of
a rectangle (rectXStart or rectXEnd). As shown in Figure 5, some of the rectangles are not
filled, reflecting read-only access as determined by the data interaction model. The
abstract view interaction model describes which interactions are allowed upon the abstract
view model, and the abstract visual interaction model states which user interactions are
allowed upon the different symbols of the abstract visualization model. The mapping between
these two models, which is aligned with the mapping between the abstract view model and the
abstract visualization model, specifies how the permitted user interactions of the abstract
visual interaction model affect the attributes of the entities of the abstract view model,
so that round-trips are avoided in the first place (Re3.1). In the given example, the
abstract visual interaction model prescribes that the rectangles can only be moved
horizontally. Furthermore, the abstract view interaction model states that only the values
of rectXStart and rectXEnd of inner entities can be changed. In addition, the mapping
between these models describes that a horizontal movement of a rectangle symbol causes the
rectXStart and rectXEnd fields of the corresponding inner entity to be updated to the
current positions.
On instantiation, the VBB is configured to a viewpoint with the visualization configuration
[Figure 7: Abstract visual interaction model and visual interaction model. Panels: (a) VBB: Mapping of abstract visualization model and abstract visual interaction model; (b) Viewpoint: Mapping of visualization model and visual interaction model.]
that may contain parameters for the abstract visual interaction model, as can be seen
in Figure 7(b). In our prototype, different VBBs or combinations thereof (Re4.1) can be
used. After all entities of the view data model have been processed, the information gained
from the view interaction model and the visual interaction model of the fully configured
viewpoint is used to enrich the created symbolic model with the permitted user interactions
(Re2.2 & Re3.1), i.e. the symbols are equipped with the corresponding interaction
constraints.
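The steps described in this section (mapping via a viewpoint configuration, layout of a rectangle from a pair of dates, a Movement constraint, and propagation of a change back to the data) can be sketched in a few lines. This is only an illustrative sketch, not the prototype's code: the JSON layout, the constants ORIGIN, PX_PER_DAY and ROW_HEIGHT, the sample entity, and all function and class names are assumptions made for this example. The date-to-position mapping is chosen to be invertible on whole days, which is what makes the write-back unambiguous.

```python
import json
from dataclasses import dataclass, field
from datetime import date, timedelta

# Hypothetical viewpoint configuration as it might arrive as a JSON string:
# it maps view-model fields to information-model fields and sets parameters.
CONFIG = json.loads("""
{
  "mapping": {"name": "name",
              "rect1Start": "developmentFrom",
              "rect1End": "developmentTo"},
  "parameters": {"rect1Color": "red", "fontSize": 12}
}
""")

ORIGIN = date(2011, 1, 1)  # left edge of the time axis (assumed)
PX_PER_DAY = 2             # horizontal scale in pixels per day (assumed)
ROW_HEIGHT = 20            # vertical distance between rows (assumed)


def date_to_x(d: date) -> int:
    """Map a date to an x position; invertible on whole days."""
    return (d - ORIGIN).days * PX_PER_DAY


def x_to_date(x: int) -> date:
    """Inverse of date_to_x, snapping to a whole day."""
    return ORIGIN + timedelta(days=round(x / PX_PER_DAY))


@dataclass
class MovementConstraint:
    direction: str  # "horizontal" or "vertical"
    minimum: int
    maximum: int

    def clamp(self, value: int) -> int:
        return max(self.minimum, min(self.maximum, value))


@dataclass
class Rectangle:
    x: int
    y: int
    width: int
    color: str
    constraints: list = field(default_factory=list)

    def move(self, dx: int = 0, dy: int = 0) -> None:
        """Apply a move; directions without a constraint are ignored."""
        for c in self.constraints:
            if c.direction == "horizontal":
                self.x = c.clamp(self.x + dx)
            elif c.direction == "vertical":
                self.y = c.clamp(self.y + dy)


def layout(entity: dict, row: int, config: dict = CONFIG) -> Rectangle:
    """Create one rectangle of the time-interval map from a data entity."""
    m = config["mapping"]
    x = date_to_x(date.fromisoformat(entity[m["rect1Start"]]))
    end_x = date_to_x(date.fromisoformat(entity[m["rect1End"]]))
    return Rectangle(
        x=x, y=row * ROW_HEIGHT, width=end_x - x,
        color=config["parameters"]["rect1Color"],
        constraints=[MovementConstraint("horizontal", 0, 600)],
    )


def propagate(rect: Rectangle, entity: dict, config: dict = CONFIG) -> None:
    """Write the rectangle's current position back to the entity's dates."""
    m = config["mapping"]
    entity[m["rect1Start"]] = x_to_date(rect.x).isoformat()
    entity[m["rect1End"]] = x_to_date(rect.x + rect.width).isoformat()


# Hypothetical sample entity of the view data model.
crm = {"name": "CRM System",
       "developmentFrom": "2011-01-11", "developmentTo": "2011-03-02"}
rect = layout(crm, row=0)
rect.move(dx=10, dy=5)  # the vertical part is ignored: no vertical constraint
propagate(rect, crm)
print(rect.x, rect.y, crm["developmentFrom"], crm["developmentTo"])
# prints: 30 0 2011-01-16 2011-03-07
```

Attaching a second MovementConstraint with direction "vertical" would make the same rectangle movable in both directions, mirroring the rule that one constraint is needed per direction.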
5 Related Work
This section gives a brief overview of interactive visualizations. Some of these interactive visualizations constitute a visual domain-specific language, whereas others are mere
drawing tools and are not bound directly to a data and/or information model.
JS Library D3⁸. With the JavaScript-based library D3 it is possible to create manifold
visualizations from structured data. Examining the structure of D3 shows that some kind of
view model can be found even within this framework; it specifies the structure of the data
that can be loaded in order to generate a visualization. Because this view model is static,
a mapping between an arbitrary information model and the view model has to be implemented
separately. Thus only one kind of information model can be processed at a time without
creating a new mapping between a second information model and the view model. Besides the
view model, D3 contains a visualization model constituting the symbols to be used for
generating visualizations. Without an extension, D3 contains circles, squares, crosses,
diamonds and triangles as possible symbols. Looking at D3’s possibilities for user
interaction, only rudimentary functions are available. For example, it is possible to select
a subset of the view data model and update the visualization with this extract, but the
visualized data cannot be changed, nor can changes be propagated to the view data model, let
alone to the data model while preserving integrity with respect to the information model.
With regard to the requirements identified above, the mapping between an arbitrary
information model and the view model has to be implemented separately, which is why Re1 is
only partly fulfilled and end-user configuration (Re1.1) is not offered by D3. Re2 cannot be
fulfilled entirely either, since D3 focuses on interactions that center around giving user
feedback. D3 does provide functions for selecting subsets of the view data model (Re2.1). In
general, changes upon the symbolic model cannot be propagated back to a data model (Re2.2).
Re3 is partially fulfilled: because D3 is a web framework written in JavaScript, a
client/server architecture can be implemented, but communication with the server, and thus
with the information model and data model, would have to be implemented separately. Hence,
Re3 is not completely fulfilled. Additionally, as all client/server communication would have
to be implemented separately, Re3.1 is considered not fulfilled. Furthermore, D3 offers
different possibilities for defining predefined, parametrized visualization types that can
be reused or even combined with manageable effort in order to create new kinds of
visualizations, so that Re4 and Re4.1 are fulfilled.
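To illustrate why a static view model forces a separate mapping per information model, the following minimal sketch may help. It does not use or reproduce D3's actual data format or API; the record layout {"label", "value"}, the field names, and the helper make_mapping are hypothetical and chosen only for this example.

```python
from typing import Callable

# Hypothetical static view model: every record must look like
# {"label": str, "value": float}. This is NOT D3's actual data format.


def make_mapping(label_field: str, value_field: str) -> Callable[[dict], dict]:
    """Build a mapping from one information model to the static view model."""
    def to_view(entity: dict) -> dict:
        return {"label": str(entity[label_field]),
                "value": float(entity[value_field])}
    return to_view


# Two different (hypothetical) information models need two separate mappings.
map_application = make_mapping("name", "operatingCost")
map_server = make_mapping("hostname", "cpuCount")

view_data = [
    map_application({"name": "CRM System", "operatingCost": 1200}),
    map_server({"hostname": "srv-01", "cpuCount": 8}),
]
```

Each additional information model requires another hand-written mapping of this kind, which is the effort that an end-user configuration in the sense of Re1.1 is meant to remove.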
yFiles. yFiles [WEK02] is a Java class library for rendering and analyzing visualizations,
especially graphs. To this end, it provides separate packages to analyze, lay out, or draw
visualizations on a Java Swing form. An exemplary application showing all of the main
features of yFiles is yEd, a tool for creating visualizations of graphs, networks, and
diagrams⁹. Within yFiles there exists a single static view model for all kinds of
visualizations that can be rendered with this framework, mainly consisting of ‘nodes’ and
‘edges’. Additionally, a separate visualization model can be found for each type of
visualization, determining the symbols to be used for visualizing the entities of the view
model and for combining them in order to generate the symbolic model. Furthermore, there
exists a visual interaction model for each visualization model stating which interactions
are permitted for each symbol of the generated symbolic model. Using yFiles requires a
mapping to be prepared in order to process an organization-specific information model and
the corresponding data model; thus Re1 is only partially fulfilled, since the view model is
static. Visualizations generated using yFiles can be parameterized in a user-friendly manner
(Re1.1). As yFiles contains manifold possibilities for user interaction, Re2 is fulfilled,
whereas selections of subsets of the data model are not included (Re2.1). Re2.2 cannot be
fulfilled completely, as changes to the symbolic model are propagated to the view model but
not to the data model. Since yFiles is implemented in Java, a solution can be implemented as
an applet or with Java Web Start technology to transfer interaction constraints to a client
(Re3). As yFiles offers no possibility of propagating changes to the data model, Re3.1 is
not fulfilled. Only a few types of visualizations can be generated without substantially
extending yFiles, and other visualizations cannot be predefined (Re4). Also, yFiles does not
include any possibility of combining different visualization types (Re4.1).
8 See http://mbostock.github.com/d3/, last accessed: Oct. 26, 2011.
9 See http://www.yworks.com/de/products_yed_about.html, last accessed: Oct. 26, 2011.
Visio. Microsoft Visio is a desktop application for creating any kind of symbolic model,
ranging from business processes in BPMN notation to construction blueprints. Besides the
possibility of creating all these visualizations by hand, Visio offers the possibility of
creating them from data files or databases in a predefined format. However, this option is
severely restricted, as it uses a static view model and provides only a small number of
parameters to set when querying an information model. Generating an organizational chart
from a spreadsheet or database can serve as an example: just a few parameters have to be
mapped to the fields that can be shown in the visualization, one of them indicating the
relationship between the entities. In this context, Visio contains a static view model that
is bound to a very limited information model. Besides this view model, there exist
visualization models and visual interaction models within Visio for each kind of
visualization that can be rendered. Visio’s potential for generating EAM visualizations can
be assessed by considering the above-mentioned requirements. Visio is able to use different
information models, thus Re1 is partially fulfilled. The same holds for Re1.1, because Visio
offers a small set of possibilities for parameterizing visualizations. As Visio offers
manifold possibilities for user interaction, Re2 is completely fulfilled, but Visio lacks
the functionality of selecting subsets of the data model, hence Re2.1 is not fulfilled. The
missing functionality of propagating changes within the symbolic model to the view model or
the data model leads to Re2.2 not being fulfilled. As Visio is a desktop application, Re3
and Re3.1 are not covered: no client/server architecture can be implemented, and therefore
no statement about round-trips can be made. Similar to yFiles, Visio includes a limited set
of visualization types that can be parametrized in some cases but cannot be predefined
(Re4), and it does not support combinations thereof (Re4.1).
Generic Modeling Environment (GME). The main purpose of the GME is to create a (visual)
domain-specific language (DSL) with a separated information model and its representation in
the sense of a symbolic model. To this end, GME uses a metamodel, represented through
MetaGME, that describes the main aspects of the employed information model. Furthermore,
model-related constraints can be integrated using the Object Constraint Language (OCL).
These constraints are equipped with priorities and corresponding actions that have to be
performed if a constraint is violated. Evaluating GME against our requirements shows that it
is possible to generate visualizations out of any kind of information model (Re1), but only
one concrete information model can be processed at a time and no configuration at runtime is
offered, in particular no support for end-user configuration (Re1.1). Moreover, GME offers
far-reaching possibilities for user interaction (Re2), but a selection of subsets of the
data model is not included (Re2.1). All changes to the symbolic model are propagated to the
data model, leading to Re2.2 being fulfilled entirely. Just like Visio, GME is a desktop
application, which is why Re3 and Re3.1 cannot be fulfilled for the aforementioned reasons.
As it is one of GME’s main purposes, it includes functionality for predefined visualization
types (Re4), but it does not provide any possibility of combining two or more of these types
in order to create new kinds of visualizations (Re4.1).
[Table 1: Visualization capabilities of the presented approaches. Columns: D3, yFiles, Visio, GME, GMF. Rows: Re1, Re1.1, Re2, Re2.1, Re2.2, Re3, Re3.1, Re4, Re4.1.]
Graphical Modeling Framework (GMF). Like GME, GMF aims at providing a framework to construct
a DSL with a separated information model and graphical representation of its entities. Given
that GMF is based on the Graphical Editing Framework (GEF) [RWC11], it has the same
capabilities for constructing and manipulating models as the GEF. GMF offers a wide range of
possibilities for implementing constraints, e.g. constraints can be formulated in OCL, as
regular expressions, or implemented as Java code. Due to their similarity, GME and GMF
fulfill the requirements to a similar degree. GMF also has the ability to process any kind
of information model (Re1), but it has to be adapted to each information model that shall be
used (Re1.1). GMF also provides manifold possibilities for user interaction (Re2), but does
not include any functionality for selecting subsets of the data model (Re2.1). User changes
to the symbolic model are propagated to the respective data model (Re2.2) while the
integrity of the information model is guaranteed. Like GME, GMF is a desktop application,
which is why Re3 and Re3.1 cannot be fulfilled. As GMF allows the definition of predefined
visualization types but does not allow the combination of these types, Re4 is fulfilled,
while Re4.1 cannot be fulfilled.
6 Conclusion and Outlook
After motivating interactive visualizations for EA management, this paper introduced a
conceptual framework to realize interactive visualizations. We demonstrated how the
framework can be used 1) to compare different visualization frameworks and 2) as a reference
architecture to describe and implement visualization tools. For the latter, we showed how we
implemented the different models and how they interact with each other for a specific
visualization. The chosen visualization originates from practice and was found as a
recurring pattern in the EA management domain.
In line with the design science approach of Hevner et al. [HMPR04], further research will
detail and refine the models of the introduced framework by incorporating feedback from
practical applications. We expect that for the EA management discipline only a limited
number of relevant viewpoint building blocks and respective interactions have to be
developed, while the remaining ones are combinations thereof. In particular, parameters,
variability points of visualizations, and further interactions relevant for industry have to
be found by empirical evaluation of the created design artefact. Currently, the prototypical
implementation has been applied to a pattern-based case. Further research can broaden the
scope these visualizations are applied to and also observe the interactions actually
employed by end-users.
A formal language for describing interactivity is currently missing. As of today, images
with ‘arrows and boxes’, summarized as mock-ups, are used in combination with a full-text
description of possible interactions so that behavior and semantics become clear to
end-users. Further research could address this issue by incorporating the different ways
currently used to describe interactive behavior of visualizations and evaluate the usability
in end-user studies. Such a language could facilitate the way end-users evaluate mock-ups of
interactive visualizations in general and in particular for the domain of EA management.
References
[AIS+ 77] Christopher Alexander, Sara Ishikawa, Murray Silverstein, Max Jacobson, Ingrid
Fiksdahl-King, and Shlomo Angel. A Pattern Language. Oxford University Press,
New York, NY, USA, 1977.
[AKRS08] Stephan Aier, Stephan Kurpjuweit, Christian Riege, and Jan Saat. Stakeholderorientierte
Dokumentation und Analyse der Unternehmensarchitektur. In Heinz-Gerd Hegering, Axel Lehmann,
Hans Jürgen Ohlbach, and Christian Scheideler, editors, GI Jahrestagung (2), volume 134 of
LNI, pages 559–565, Bonn, Germany, 2008. Gesellschaft für Informatik.
[BBDF+ 12] Marcel Berneaud, Sabine Buckl, Arelly Diaz-Fuentes, Florian Matthes, Ivan Monahov,
Aneta Nowobliska, Sascha Roth, Christian M. Schweda, Uwe Weber, and Monika Zeiner. Trends for
Enterprise Architecture Management Tools Survey. Technical report, Technische Universität
München, 2012.
[BDMS10] Sabine Buckl, Thomas Dierl, Florian Matthes, and Christian M. Schweda. Building
Blocks for Enterprise Architecture Management Solutions. In Frank Harmsen et al., editors,
Practice-Driven Research on Enterprise Transformation, second working conference, PRET 2010,
Delft, pages 17–46, Berlin, Heidelberg, Germany, 2010. Springer.
[BEL+ 07] Sabine Buckl, Alexander M. Ernst, Josef Lankes, Kathrin Schneider, and Christian M.
Schweda. A pattern based Approach for constructing Enterprise Architecture Management
Information Models. In A. Oberweis, C. Weinhardt, H. Gimpel, A. Koschmider, V. Pankratius,
and Schnizler, editors, Wirtschaftsinformatik 2007, pages 145–162, Karlsruhe, Germany, 2007.
Universitätsverlag Karlsruhe.
[BELM08] Sabine Buckl, Alexander M. Ernst, Josef Lankes, and Florian Matthes. Enterprise
Architecture Management Pattern Catalog (Version 1.0, February 2008). Technical report,
Chair for Informatics 19 (sebis), Technische Universität München, Munich, Germany, 2008.
[BGS10] Sabine Buckl, Jens Gulden, and Christian M. Schweda. Supporting ad hoc Analyses
on Enterprise Models. In 4th International Workshop on Enterprise Modelling and
Information Systems Architectures, 2010.
[BHR+ 10a] G. Bergmann, A. Horváth, I. Ráth, D. Varró, A. Balogh, Z. Balogh, and A. Okrös.
Incremental Model Queries over EMF Models. In ACM/IEEE 13th International Conference on
Model Driven Engineering Languages and Systems, 2010.
[BHR+ 10b] Gábor Bergmann, Ákos Horváth, István Ráth, Dániel Varró, András Balogh, Zoltán
Balogh, and András Ökrös. Incremental Evaluation of Model Queries over EMF Models. In Model
Driven Engineering Languages and Systems, 13th International Conference, MODELS 2010.
Springer, 2010.
[BMM+ 11] Sabine Buckl, Florian Matthes, Ivan Monahov, Sascha Roth, Christopher Schulz, and
Christian M. Schweda. Enterprise Architecture Management Patterns for Enterprise-wide Access
Views on Business Objects. In European Conference on Pattern Languages of Programs
(EuroPLoP) 2011, Irsee Monastery, Bavaria, Germany, 2011.
[BMNS09] Sabine Buckl, Florian Matthes, Christian Neubert, and Christian M. Schweda. A
Wiki-based Approach to Enterprise Architecture Documentation and Analysis. In The
17th European Conference on Information Systems (ECIS) – Information Systems in a
Globalizing World: Challenges, Ethics and Practices, 8.–10. June 2009, Verona, Italy,
pages 2192–2204, Verona, Italy, 2009.
[BMR+ 10a] Sabine Buckl, Florian Matthes, Sascha Roth, Christopher Schulz, and Christian M.
Schweda. A Conceptual Framework for Enterprise Architecture Design. In Will Aalst, John
Mylopoulos, Norman M. Sadeh, Michael J. Shaw, Clemens Szyperski, Erik Proper, Marc M.
Lankhorst, Marten Schönherr, Joseph Barjis, and Sietse Overbeek, editors, Trends in
Enterprise Architecture Research, volume 70 of Lecture Notes in Business Information
Processing, pages 44–56. Springer Berlin Heidelberg, 2010.
[BMR+ 10b] Sabine Buckl, Florian Matthes, Sascha Roth, Christopher Schulz, and Christian M.
Schweda. A Method for Constructing Enterprise-wide Access Views on Business Objects. In
Klaus-Peter Fähnrich and Bogdan Franczyk, editors, GI Jahrestagung (2), volume 176 of LNI,
pages 279–284. GI, 2010.
[BURV11] Gábor Bergmann, Zoltán Ujhelyi, István Ráth, and Dániel Varró. A Graph Query
Language for EMF models. In Jordi Cabot and Eelco Visser, editors, Theory and Practice of
Model Transformations, Fourth International Conference, ICMT 2011, Zurich, Switzerland,
June 27-28, 2011. Proceedings, volume 6707 of Lecture Notes in Computer Science,
pages 167–182. Springer, 2011.
[CG02] Alessandro Cecconi and Martin Galanda. Adaptive zooming in web cartography. In
Computer Graphics Forum, pages 787–799. Wiley Online Library, 2002.
[DQvS08] Remco M. Dijkman, Dick A.C. Quartel, and Marten J. van Sinderen. Consistency in
multi-viewpoint design of enterprise information systems. Information and Software
Technology, 50(7–8):737–752, 2008.
[HMPR04] Alan R. Hevner, Salvatore T. March, Jinsoo Park, and Sudha Ram. Design Science in
Information Systems Research. MIS Quarterly, 28(1):75–105, 2004.
[IL08] Hannakaisa Isomäki and Katja Liimatainen. Challenges of Government Enterprise
Architecture Work – Stakeholders’ Views. In Maria Wimmer, Hans Jochen Scholl, and
Enrico Ferro, editors, Electronic Government, 7th International Conference, pages
364–374, Turin, Italy, 2008. Springer.
[Int07] International Organization for Standardization. ISO/IEC 42010:2007 Systems
and software engineering – Recommended practice for architectural description of
software-intensive systems, 2007.
[Lee99] Y.T. Lee. Information modeling: From design to implementation. In Proceedings of
the Second World Manufacturing Congress, pages 315–321. Citeseer, 1999.
[Mat08] Florian Matthes. Softwarekartographie. Informatik Spektrum, 31(6), 2008.
[MBF+ 11] Stephan Murer, Bruno Bonati, and Frank J. Furrer. Managed Evolution. Springer
Berlin Heidelberg, 2011.
[MBLS08] Florian Matthes, Sabine Buckl, Jana Leitel, and Christian M. Schweda. Enterprise
Architecture Management Tool Survey 2008. Chair for Informatics 19 (sebis), Technische
Universität München, Munich, Germany, 2008.
[MWF08] Stephan Murer, Carl Worms, and Frank J. Furrer. Managed Evolution. Informatik
Spektrum, 31(6):537–547, 2008.
[Nie94] Jakob Nielsen. Usability Engineering. Elsevier LTD, Oxford, 1994.
[Ros03] Jeanne W. Ross. Creating a Strategic IT Architecture Competency: Learning in Stages.
MIS Quarterly Executive, 2(1), 2003.
[RWC11] Dan Rubel, Jaime Wren, and Eric Clayberg. The Eclipse Graphical Editing Framework
(GEF). Addison-Wesley, 2011.
[Sch11] Christian M. Schweda. Development of Organization-Specific Enterprise Architecture
Modeling Languages Using Building Blocks. PhD thesis, TU München, 2011.
[WEK02] R. Wiese, M. Eiglsperger, and M. Kaufmann. yfiles: Visualization and automatic
layout of graphs. In Graph drawing: 9th international symposium, GD 2001, Vienna,
Austria, September 23-26, 2001: revised papers, volume 129, page 453. Springer
Verlag, 2002.
[Wit07] André Wittenburg. Softwarekartographie: Modelle und Methoden zur systematischen
Visualisierung von Anwendungslandschaften. PhD thesis, Fakultät für Informatik,
Technische Universität München, Germany, 2007.
[WR09] P. Weill and J.W. Ross. IT Savvy: What Top Executives Must Know to Go from Pain
to Gain. Harvard Business Press, 2009.
Exploring usability-driven Differences of graphical
Modeling Languages: An empirical Research Report
Christian Schalles, John Creagh, Michael Rebstock∗
Department of Computing
Cork Institute of Technology
Rossa Ave
Cork, Ireland
christian.schalles@mycit.ie, john.creagh@cit.ie
∗ Faculty of Economics and Business Administration
Hochschule Darmstadt University of Applied Sciences
Haardtring 100
64295 Darmstadt, Germany
michael.rebstock@h-da.de
Abstract: Documenting, specifying and analyzing complex domains such as information
systems or business processes has become unimaginable without the support of graphical
models. Generally, models are developed using graph-oriented languages such as Event-driven
Process Chains (EPCs) or diagrams of the Unified Modeling Language (UML). In industrial use,
modeling languages aim to describe either information systems or business processes.
Heterogeneous modeling languages offer their users different degrees of usability. In this
paper we focus on an evaluation of four heterogeneous modeling languages and their different
impact on user performance and user satisfaction. We deduce implications for both
educational and industrial use using the Framework for Usability Evaluation of Modeling
Languages (FUEML).
1 Introduction
In industry, models specifying information system requirements or representing business
process documentations are developed by the application of various graphical modeling
languages such as the UML and EPCs. In general, graphical modeling languages aim
to support the expression of relevant aspects of real world domains such as information
system structures or business processes [Lud03]. Almost all notations for software and
business process specifications use diagrams as the primary basis for documenting and
communicating them. The large number of available languages confronts companies with
the problem of selecting the language most suitable to their needs. Besides functional
and technical evaluation criteria, user-oriented characteristics of modeling languages are
increasingly becoming a focal point of interest in research and industry [SW07]. In
this research paper we report on a comparative study of the usability of selected modeling
languages using the FUEML framework. The remainder of this paper is structured as
follows: First, we analyze the theoretical background and state the hypotheses of this
study. Secondly, we define usability in the domain of graphical modeling languages and
additionally define metrics for measuring usability. Subsequently, we present our research
methodology and our resulting findings. Lastly, we deduce implications based on our
results and give an outlook on future research.
2 Theoretical Background
The variety of definitions and measurement models of usability complicates the extraction
of suitable attributes for assessing the usability of modeling languages. A usability study
would be of limited value if it were not based on a standard definition and
operationalization of usability [CK06]. The International Organization for Standardization
(ISO) defines usability as the capacity of the software product to be understood, learned
and attractive to the user, when it is used under specified conditions [ISO06].
Additionally, the ISO defined another standard which describes usability as the extent to
which a product can be used by specified users to achieve specified goals with
effectiveness, efficiency and satisfaction in a specified context of use [ISO98]. The
Institute of Electrical and Electronics Engineers (IEEE) established a standard which
describes usability as the ease with which a user can learn how to operate, prepare inputs
for, understand and interpret the outputs of a system or component [IEE90]. Dumas and Redish
(1999) define usability as quickness and simplicity regarding a user’s task accomplishment.
This definition is based on four assumptions [DR99]: 1) Usability means focusing on users,
2) Usability includes productivity, 3) Usability means ease of use and 4) Usability means
efficient task accomplishment.
Shackel (1991) associates five attributes with usability: speed, time to learn, retention,
errors and the user-specific attitude [Sha91]. Preece et al. (1994) combined effectiveness
and efficiency into throughput [PRS+ 94]. Constantine and Lockwood (1999) and Nielsen (2006)
collected the attributes defining usability and developed an overall definition of usability
attributes consisting of learnability, memorability, effectiveness, efficiency and user
satisfaction [CL99, AKSS03]. The variety of definitions concerning usability attributes has
led to the use of different terms and labels for the same usability characteristics, or
different terms for similar characteristics, without full consistency across these
standards; in general, the situation in the literature is similar. For example, learnability
is defined in ISO 9241-11 as a simple attribute, ‘time of learning’, whereas ISO 9126
defines it as including several attributes such as ‘comprehensible input and output,
instructions readiness, messages readiness’ [AKSS03].
As a basis for our follow-up survey, we adopt a usability definition for modeling
languages in model development and model interpretation scenarios that includes the
following attributes:
The usability of modeling languages is specified by learnability, memorability, effectiveness, efficiency, user satisfaction and perceptibility. The learnability of a modeling language describes its capability to enable the user to learn to apply
models based on that language. The modeling language and its semantics,
syntax and elements should be easy to remember, so that a user is able to return to the
language after some period of non-use without having to relearn the language and, especially, the application of models developed with it. Effective model
application should be supported by the language so that tasks can be accomplished
successfully. Modeling languages should be efficient to use, so that a high level of
working productivity is possible. Users have to be satisfied when using the language. For
model interpretation scenarios, the language should offer convenient perceptibility regarding structure, overview, elements and shapes, so that a user is able to search, extract
and process available model information easily.
3 Model of Hypotheses
In the following we show our hypotheses supported by theory. The motivation for those
hypotheses lies