Volume P-201(2012) - Mathematical Journals
GI-Edition: Lecture Notes in Informatics (LNI)

Gesellschaft für Informatik e.V. (GI) publishes this series in order to make available to a broad public recent findings in informatics (i.e. computer science and information systems), to document conferences that are organized in cooperation with GI and to publish the annual GI Award dissertation. The volumes are published in German or English. Information: http://www.gi.de/service/publikationen/lni/

The series is broken down into
• seminars
• proceedings
• dissertations
• thematics
Current topics are dealt with from the vantage point of research and development, teaching and further training in theory and practice. The Editorial Committee uses an intensive review process in order to ensure high quality contributions.

The “Modellierung” conference series has reported since 1998 on the broad range of modeling from a variety of perspectives. This year, up-to-date research results comprise topics from Foundations of Modeling, Visualization of Models, Model Based Development, Search, Reuse and Knowledge Acquisition to Domain-specific Applications.

Elmar J. Sinz, Andy Schürr (eds.)
Modellierung 2012
March 14-16, 2012, Bamberg
Gesellschaft für Informatik e.V. (GI)

Lecture Notes in Informatics (LNI) - Proceedings
Series of the Gesellschaft für Informatik (GI)
Volume P-201
ISBN 978-3-88579-295-6
ISSN 1617-5468

Volume Editors
Prof. Dr. Elmar J. Sinz, Lehrstuhl für Wirtschaftsinformatik, Universität Bamberg, 96052 Bamberg, Germany, Email: elmar.sinz@uni-bamberg.de
Prof. Dr. Andy Schürr, Institut für Datentechnik, Technische Universität Darmstadt, 64283 Darmstadt, Germany, Email: andy.schuerr@es.tu-darmstadt.de

Series Editorial Board
Heinrich C.
Mayr, Alpen-Adria-Universität Klagenfurt, Austria (Chairman, mayr@ifit.uni-klu.ac.at)
Dieter Fellner, Technische Universität Darmstadt, Germany
Ulrich Flegel, Hochschule für Technik, Stuttgart
Ulrich Frank, Universität Duisburg-Essen, Germany
Johann-Christoph Freytag, Humboldt-Universität zu Berlin, Germany
Michael Goedicke, Universität Duisburg-Essen, Germany
Ralf Hofestädt, Universität Bielefeld, Germany
Michael Koch, Universität der Bundeswehr München, Germany
Axel Lehmann, Universität der Bundeswehr München, Germany
Ernst W. Mayr, Technische Universität München, Germany
Sigrid Schubert, Universität Siegen, Germany
Ingo Timm, Universität Trier
Karin Vosseberg, Hochschule Bremerhaven, Germany
Maria Wimmer, Universität Koblenz-Landau, Germany

Dissertations: Steffen Hölldobler, Technische Universität Dresden, Germany
Seminars: Reinhard Wilhelm, Universität des Saarlandes, Germany
Thematics: Andreas Oberweis, Karlsruher Institut für Technologie (KIT), Germany

Gesellschaft für Informatik, Bonn 2012
printed by Köllen Druck+Verlag GmbH, Bonn

Foreword

Models are one of the most important means of mastering complex systems. The subject areas of developing, using, communicating and processing models are as diverse as informatics itself with all its specializations. The “Modellierung” conference has been held since 1998 by the Querschnittsfachausschuss Modellierung (the cross-sectional committee on modeling) of Gesellschaft für Informatik e.V. and has established itself as a relevant forum for modeling. It brings together participants from all areas of informatics and from both academia and industry. The conference is traditionally characterized by lively, cross-disciplinary discussions and committed feedback, which makes it particularly interesting for young researchers.
This volume contains the 17 contributions of the scientific main program of Modellierung 2012, which were selected from a total of 45 submissions on the basis of 3 reviews each. This corresponds to an acceptance rate of 37.7%. The topics of the scientific contributions span a wide field, ranging from foundations of modeling and visualization of models to model-based development, search and reuse of models, knowledge acquisition from models and domain-specific modeling. The program is complemented by a doctoral symposium as well as workshops, tutorials and a practice forum. All in all, it is a well-rounded program on all aspects of modeling.

We thank all speakers for entrusting us with their contributions, and we thank the program committee and those responsible for the doctoral symposium, workshops, tutorials and practice forum for quality assurance and the selection of the final program. Special thanks go to Dipl.-Wirtsch.Inf. Domenik Bork and Dipl.-Wirtsch.Inf. Matthias Wolf for their tireless commitment to the success of the conference. Last but not least, our sincere thanks go to Ms. Regina Henninges for her role as the organizational center of the conference.

Bamberg and Darmstadt, February 2012
Elmar J.
Sinz, Andy Schürr

Sponsors

We thank the following companies for supporting Modellierung 2012:
BOC Information Technologies Consulting AG
MID GmbH
Senacor Technologies AG

Querschnittsfachausschuss Modellierung

“Modellierung” is a working conference of the Querschnittsfachausschuss Modellierung, in which the following GI sections are currently represented:
EMISA, Entwicklungsmethoden für Informationssysteme und deren Anwendung
FoMSESS, Formale Methoden und Modellierung für Sichere Systeme
ILLS, Intelligente Lehr- und Lernsysteme
MMB, Messung, Modellierung und Bewertung von Rechensystemen
OOSE, Objektorientierte Software-Entwicklung
PN, Petrinetze
RE, Requirements Engineering
ST, Softwaretechnik
SWA, Softwarearchitektur
WI-MobIS, Informationssystem-Architektur: Modellierung betrieblicher Informationssysteme
WI-VM, Vorgehensmodelle für die Betriebliche Anwendungsentwicklung
WM/KI, Wissensmanagement

Persons responsible (program committee chairs, workshops, practice forum, doctoral symposium, tutorials):
Elmar J. Sinz, Universität Bamberg
Andy Schürr, Technische Universität Darmstadt
Matthias Riebisch, Technische Universität Ilmenau
Peter Tabeling, Intervista AG
Ulrich Frank, Universität Duisburg-Essen
Gabriele Taentzer, Universität Marburg
Friederike Nickl, Swiss Life Deutschland
Jan Jürjens, TU Dortmund und Fraunhofer ISST

Program Committee

Colin Atkinson, Universität Mannheim
Thomas Baar, akquinet tech@spree GmbH, Berlin
Brigitte Bartsch-Spörl, BSR GmbH, München
Ruth Breu, Universität Innsbruck
Jörg Desel, FernUniversität Hagen
Jürgen Ebert, Universität Koblenz-Landau
Gregor Engels, Universität Paderborn
Ulrich Frank, Universität Duisburg-Essen
Holger Giese, Hasso-Plattner-Institut
Martin Glinz, Universität Zürich, CH
Martin Gogolla, Universität Bremen
Ursula Goltz, TU Braunschweig
Holger Herrmanns, Universität des Saarlandes
Wolfgang Hesse, Universität Marburg
Martin Hofmann, LMU München
Frank Houdek, Daimler AG
Heinrich Hussmann, LMU München
Matthias Jarke, RWTH Aachen
Jan Jürjens, TU Dortmund und Fraunhofer ISST
Gerti Kappel, TU Wien
Dimitris Karagiannis, Universität Wien
Roland Kaschek, Düsseldorf
Ralf Kneuper, Darmstadt
Christian Kop, Alpen-Adria-Universität Klagenfurt
Thomas Kühne, Victoria University of Wellington
Jochen Küster, IBM Research, Zürich, CH
Horst Lichter, RWTH Aachen
Peter Liggesmeyer, TU Kaiserslautern
Florian Matthes, TU München
Heinrich C. Mayr, Alpen-Adria-Universität Klagenfurt
Mark Minas, Universität der Bundeswehr München
Günther Müller-Luschnat, Pharmatechnik GmbH
Friederike Nickl, Swiss Life Deutschland
Markus Nüttgens, Universität Hamburg
Andreas Oberweis, Universität Karlsruhe
Erich Ortner, Technische Universität Darmstadt
Barbara Paech, Universität Heidelberg
Thorsten Pawletta, Hochschule Wismar
Jan Philipps, Validas AG, München
Klaus Pohl, Universität Duisburg-Essen
Alexander Pretschner, Universität Karlsruhe
Ulrich Reimer, FH St. Gallen
Wolfgang Reisig, Humboldt-Universität zu Berlin
Ralf Reussner, KIT Karlsruhe
Matthias Riebisch, Technische Universität Ilmenau
Bernhard Rumpe, RWTH Aachen
Bernhard Schaetz, Technische Universität München
Peter Schmitt, Universität Karlsruhe
Andy Schürr, Technische Universität Darmstadt
Elmar J. Sinz, Universität Bamberg
Friedrich Steimann, Fernuniversität Hagen
Susanne Strahringer, TU Dresden
Peter Tabeling, Intervista AG
Gabriele Taentzer, Universität Marburg
Klaus Turowski, Universität Magdeburg
Michael von der Beeck, BMW AG
Gerd Wagner, BTU Cottbus
Mathias Weske, HPI an der Universität Potsdam
Andreas Winter, Carl von Ossietzky Universität Oldenburg
Mario Winter, Fachhochschule Köln
Heinz Züllighoven, Universität Hamburg
Albert Zündorf, Universität Kassel

Organizing Team

Domenik Bork, Universität Bamberg
Matthias Wolf, Universität Bamberg

Contents

Foundations of Modeling
Cédric Jeanneret, Martin Glinz, Thomas Baar: Modeling the Purposes of Models
Florian Johannsen, Susanne Leist: Reflecting modeling languages regarding Wand and Weber’s Decomposition Model
Janina Fengel, Kerstin Reinking: Sprachbezogener Abgleich der Fachsemantik in heterogenen Geschäftsprozessmodellen

Visualization of Models
Thomas Goldschmidt, Steffen Becker, Erik Burger: Towards a Tool-Oriented Taxonomy of View-Based Modelling
Michael Schaub, Florian Matthes, Sascha Roth: Towards a Conceptual Framework for Interactive Enterprise Architecture Management Visualizations
Christian Schalles, John Creagh, Michael Rebstock: Exploring usability-driven Differences of graphical Modeling Languages: An empirical Research Report

Model Based Development
Thomas Buchmann, Bernhard Westfechtel, Sabine Winetzhammer: ModGraph: Graphtransformationen für EMF
Michael Schlereth, Tina Krausser: Platform-Independent Specification of Model Transformations @ Runtime Using Higher-Order Transformations
Timo Kehrer, Stefan Berlik, Udo Kelter, Michael Ritter: Modellbasierte Entwicklung GPU-unterstützter Applikationen

Search, Reuse and Knowledge Acquisition
Lars Hamann, Fabian Büttner, Mirco Kuhlmann, Martin Gogolla: Optimierte Suche von Modellinstanzen für UML/OCL-Beschreibungen in USE
Benjamin Horst, Andrej Bachmann, Wolfgang Hesse: Ontologien als ein Mittel zur Wiederverwendung von Domänen-spezifischem Wissen in der Software-Entwicklung
Jochen Reutelshoefer, Joachim Baumeister, Frank Puppe: A Meta-Engineering Approach for Customized Document-centered Knowledge Acquisition

Domain-specific Applications
Stefan Gudenkauf, Steffen Kruse, Wilhelm Hasselbring: Domain-Specific Modelling for Coordination Engineering with SCOPE
Alexander Rachmann: Referenzmodelle für Telemonitoring-Dienstleistungen in der Altenhilfe
Beate Hartmann, Matthias Wolf: Erweiterung einer Geschäftsprozessmodellierungssprache zur Stärkung der strategischen Ausrichtung von Geschäftsprozessen
Johanna Barzen, Frank Leymann, David Schumm, Matthias Wieland: Ein Ansatz zur Unterstützung des Kostümmanagements im Film auf Basis einer Mustersprache
Martin Burwitz, Hannes Schlieter, Werner Esswein: Agility in medical treatment processes – A model-based approach
Modeling the Purposes of Models

Cédric Jeanneret, Martin Glinz
University of Zurich, Binzmühlestrasse 14, CH-8050 Zurich, Switzerland
{jeanneret, glinz}@ifi.uzh.ch

Thomas Baar
Hochschule für Technik und Wirtschaft Berlin, Wilhelminenhofstraße 75A, D-12459 Berlin, Germany
thomas.baar@htw-berlin.de

Abstract: Today, the purpose of a model is often kept implicit. The lack of explicit statements about a model’s purpose hinders both its creation and its (re)use. In this paper, we adapt two goal modeling techniques, the Goal-Question-Metric paradigm and KAOS, an intentional modeling language, so that the purpose of a model can be explicitly stated and operationalized. Using some examples, we present how these approaches can document a model’s purpose so that this model can be validated, improved and used correctly.

1 Introduction

With the advent of Model Driven Engineering (MDE), models play an increasingly important role in software engineering. Conceptually, a model is an abstract representation of an original (like a system or a problem domain) for a given purpose. One cannot build or use a model without knowing its purpose. Yet, today, the purpose of a model is often kept implicit. Thus, anybody can be misled by a model if it is used for a task it was not intended for. Furthermore, a modeler must rely on experience and intuition to decide how much and which detail is worth modeling. This may result in models at the wrong level of abstraction for their (unstated) purpose.

Stating the purpose of a model explicitly is only a first step to address these issues. Eventually, the purpose of a model can be characterized by a set of operations. There are two kinds of operations: (i) operations performed by humans to interpret (understand, analyze or use) the model and (ii) operations executed by computers to transform the model into another model (model transformations).
Being able to express the purpose of a model with a set of model operations makes it possible to measure how well a model fits its purpose. In previous work [JGB11], we have made a contribution towards measuring the confinement of a model (the extent to which it contains relevant information) given the set of formal operations to be executed on it.

Having a set of operations is not enough, though: we must ensure that these operations can be performed on the model, no matter whether these operations are performed by humans or executed by computers. For this, we have to make explicit which information the operations need from the model and we have to determine which structures a model has to conform to. In other words, we need to state which elements the metamodel must contain for enabling the operations.

Our previous work assumes that these operations and these metamodels exist. This assumption may hold in an MDE context, but not in a wider context: Often, the purpose of a model is not even stated explicitly. Thus, there is a need for (a) methods to elicit and document modeling purposes in the first place and (b) methods to operationalize these modeling purposes systematically. In goal modeling, there are many approaches for these two tasks. However, these approaches were designed for other contexts than modeling.

In this paper, we adapt two of these goal modeling approaches for systematically deriving a set of model operations and associated metamodel elements from a qualitatively stated model purpose. First, we present Goal-Operation-Metamodel (GOM), a generalization of the Goal-Question-Metric (GQM) paradigm [Bas92]. Second, we propose to use KAOS [vL09] (a goal modeling language) as a metalanguage to create intentional metamodels.

The remainder of this paper is organized as follows. In the next section, we present the problem context of our work in more detail.
In Section 3, we describe the Goal-Operation-Metamodel method and we present intentional metamodeling with KAOS in Section 4. We discuss our findings in Section 5 while Section 6 discusses related work. Finally, we conclude the paper in Section 7.

2 Problem Context

Many modeling theories distinguish two roles in the model building process: the modeler and the expert. Modeling is a collaborative activity involving a dialog between these two roles: The modeler elicits information about the original from the expert before formalizing it, while the expert validates the content of the model as explained by the modeler. These roles and their relationships are represented in Figure 1. Hoppenbrouwers et al. even consider the model as the minutes of this dialog [HPdW05].

While building a model may be valuable on its own, the value of modeling consists of using the model as a substitute for the original to infer some new information about it. These inferences are made by the interpreter, a third role related to modeling. To achieve this, the interpreter performs various model operations on the model, like executing queries on it, extracting views from it or transforming it to other models or artefacts.

When describing the nature of modeling, Rothenberg listed the following purposes of models [Rot89]:

    The purpose of a model may include comprehension or manipulation of its referent [the original], communication, planning, prediction, gaining experience, appreciation, etc. In some cases this purpose can be characterized by the kinds of questions that may be asked of the model. For example, prediction corresponds to asking questions of the form “What-if...?” (where the user asks what would happen if the referent began in some initial state and behaved as described by the model).
[Figure 1: The roles involved in a modeling activity. Roles: Expert, Modeler, Interpreter. Entities: Model, Original, Purpose. Relationship labels: builds, represents, provides / validates information, elicits information, uses, deliberates.]

A clearly stated modeling purpose can be used as a contract between the modeler and the interpreter. Establishing contracts is costly, as they must be negotiated and edited. Nevertheless, such a contract can be useful in two ways: First, as a specification for a model’s purpose, it provides a strong basis on which the model can be validated. It can also provide the modeler with some guidance for improving the model so that it reaches the right level of abstraction. Second, as a description of a model’s purpose, it tells an interpreter whether the model at hand is fit for the intended use or, if several models are available, it helps the interpreter to choose the model that best fits the purpose.

In the vein of [LSS94], we consider a model as a set of statements M. For each modeling purpose, there is a set of relevant statements D. In an ideal case, the set D should correspond to the set of statements in the model M. When the sets M and D differ, we can quantify the deviation of M from D by using measures from the Information Retrieval field: precision and recall. Precision measures the confinement of a model, the extent to which it contains relevant statements. Recall, on the other hand, measures the completeness of a model, that is, the proportion of relevant statements that has actually been modeled. By measuring the confinement and completeness of a model, a modeler can assess how adequate its level of abstraction is for its purpose. Indeed, a model at the right level of abstraction for its purpose is both confined and complete (M = D). However, defining the set D is challenging.
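The two measures can be illustrated with a minimal sketch, assuming statements are represented as plain strings and that precision and recall take their usual Information Retrieval definitions, |M ∩ D| / |M| and |M ∩ D| / |D|. The function names and the example statements are hypothetical and serve only as an illustration.

```python
def precision(model: set, relevant: set) -> float:
    """Confinement: share of modeled statements that are relevant, |M & D| / |M|."""
    if not model:
        raise ValueError("precision is undefined for an empty model")
    return len(model & relevant) / len(model)


def recall(model: set, relevant: set) -> float:
    """Completeness: share of relevant statements that are modeled, |M & D| / |D|."""
    if not relevant:
        raise ValueError("recall is undefined for an empty set of relevant statements")
    return len(model & relevant) / len(relevant)


# Hypothetical statements about an underground network.
M = {"station A exists", "station B exists", "A connects to B", "street X crosses A"}
D = {"station A exists", "station B exists", "A connects to B", "A is step-free"}

print(precision(M, D))  # 0.75: one modeled statement is irrelevant
print(recall(M, D))     # 0.75: one relevant statement is missing
```

Both values reach 1.0 exactly when M = D, which corresponds to a model that is both confined and complete.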
In our previous work [JGB11], we made a contribution towards measuring the confinement of a model given a set of operations that characterizes its purpose. When these operations are executed on a model, they navigate through its content and gather some information by reading some of its elements. The set of elements touched by a model operation during its execution forms the footprint of that operation. Thus, the footprint contains all elements that affect the outcome of the operation. For a set of operations, the global footprint of the set of operations is the union of the footprints of each operation. This global footprint is the intersection M ∩ D: it is the set of statements that are both present in the model and used to fulfill its purpose.

In this paper, we propose two approaches to operationalize a qualitatively stated modeling purpose into a set of model operations and their supporting metamodels. Instead of inventing new methods from scratch, we adapt two existing goal modeling techniques, GQM [Bas92] and KAOS [vL09], so that they can be used in a modeling context in addition to measurement and requirements engineering, respectively. To illustrate the use of these methods in modeling, we first present two examples.

2.1 Motivating Examples

In this section, we introduce two examples to motivate and illustrate our approaches to capture the purpose of a model. The first example is the London Underground map, used by Jeff Kramer in [Kra07] to highlight that the value of an abstraction depends on its purpose. The second example is related to Software Engineering, where an architect models her (or his) system according to the “4+1” viewpoints of Kruchten [Kru95] for making some performance analysis as described in [CM00]. We have used this example in our previous paper to explain the various usage scenarios of model footprinting [JGB11].
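The footprint notion described earlier in this section can be sketched as follows. This is an illustrative approximation, not the technique of [JGB11]: it assumes that a model is a flat dictionary of named elements and that "touched" simply means "read at least once". The class `TracingModel`, the helper functions and the example elements are all hypothetical.

```python
class TracingModel:
    """Wraps a dict of model elements and records every element that is read."""

    def __init__(self, elements):
        self._elements = dict(elements)
        self.touched = set()

    def read(self, name):
        self.touched.add(name)
        return self._elements[name]


def footprint(operation, elements):
    """Elements read by one operation during its execution."""
    model = TracingModel(elements)
    operation(model)
    return model.touched


def global_footprint(operations, elements):
    """Union of the footprints of all operations."""
    touched = set()
    for operation in operations:
        touched |= footprint(operation, elements)
    return touched


def confinement(operations, elements):
    """Share of model elements that are used by at least one operation."""
    return len(global_footprint(operations, elements)) / len(elements)


# Hypothetical model elements and operations.
elements = {"A.zone": 1, "B.zone": 2, "A.accessible": True, "A.latitude": 51.5}


def zones_of_trip(model):
    return {model.read("A.zone"), model.read("B.zone")}


def can_board_at_a(model):
    return model.read("A.accessible")


operations = [zones_of_trip, can_board_at_a]
print(sorted(global_footprint(operations, elements)))
print(confinement(operations, elements))  # 0.75: "A.latitude" is never read
```

In this sketch the element "A.latitude" lies outside the global footprint, so it would be a candidate for removal if the two operations fully characterized the purpose of the model.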
The London Underground Map

Like most major cities, London has an underground railway system. To help users navigate London with it, its operator, Transport for London (TfL), provides a map of this transit system. Figure 2 shows the evolution of the map over the years. In 1919 (Figure 2a), the map was a geographical map of London with the underground lines overlaid in color. In 1928 (Figure 2b), ground features like streets were removed from the map and the outlying lines were distorted to free some space for the congested center, making it more readable. The first schematic representation of the network appeared in 1933 (Figure 2c): the precise geographic location of stations is discarded; only the topology of the network is represented. The current map (Figure 2d) contains additional information such as the accessibility of stations, the connections to other transportation systems and fare zones.

[Figure 2: Maps of the London Underground. (a) Map in 1919, (b) Map in 1928, (c) Map in 1933, (d) Map in 2009. (a), (b) and (c): © TfL from the London Transport Museum collection; (d): © Transport for London]

In this example, the modeler is the employee of TfL designing the map. The expert is an employee of TfL who knows the underground network well. The interpreter is a user of the map. The map is used to plan trips in London, that is, the map must help travelers to answer the following questions: How to get from A to B? How much does it cost? How long does it take? Is that route accessible for disabled people? Interestingly, in this example, the people who use the model to plan their trip also use the modeled system to actually travel in London.

Performance Analysis on Software Architecture

For the second example, we consider an architect analyzing the performance of a piece of software. To this end, she describes its architecture using the “4+1” view model proposed by Kruchten in [Kru95]: This view model includes (1) use case and sequence diagrams for the scenario view, (2) class diagrams for the logical view, (3) component diagrams for the development view, (4) activity diagrams for the process view and (5) deployment diagrams for the physical view. Her model is first transformed into an extended queueing network model (EQNM) as explained by Cortellessa et al. in [CM00]. Performance indicators are then measured on the EQNM. The architect wants the following questions to be answered: What is the response time and throughput of her system? Where is its bottleneck?

In this example, EQNMs can be seen as the semantic domain for architecture models written in UML. There are therefore two chained interpretations: the first interpretation translates a UML model into an EQNM, while the second interpretation analyses the EQNM. In this example, we focus on the translation from UML to EQNM.

The architect plays all three roles in this example. As the architect of her software, she is the expert. As she creates the model, she is the modeler. As she uses the model for evaluating the performance of her software, she is the interpreter. However, there are two additional stakeholders involved in this example: Cortellessa and his team developed the analysis used by the architect, while Kruchten, by defining the “4+1” viewpoint, proposed a “contract” between the modeler and the interpreter.

Contrary to the previous example, the performance analysis is mostly automated. As the architect is only interested in its results, she may know little about the internals of the technique. Thus, the documentation of the analysis must state explicitly which information the analysis requires in input models.
3 Goal-Operation-Metamodel

GQM is a mechanism for defining and evaluating a set of operational goals using measurement [Bas92]. In the GQM paradigm, a measurement is defined on three levels:
• At the conceptual level, the goal of the measurement is specified in a structured manner: It specifies the purpose of the measurement, the object under study, the focus of the measurement and the viewpoint from which the measurements are taken.
• At the operational level, the goal is refined to a set of questions.
• At the quantitative level, a set of metrics is associated with each question so that it can be answered in a quantitative manner.

Our approach consists of using GQM for models other than metrics. According to Ludewig [Lud03], metrics are some kind of models. However, GQM has to be extended on its three levels to describe modeling purposes other than quantitative analysis. At the conceptual level, the goal template must support purposes like code generation¹ or documentation. At the operational level, the set of questions will be replaced by a set of (general) operations: Besides queries, one may need simulations and transformations to refine the goal stated at the conceptual level. Finally, the quantitative level becomes the definable level: metamodels replace metrics to support the model operations from the operational level. These operations will be run on conforming models in a similar way that questions can be answered with the value of a metric. Thus, we call this approach Goal-Operation-Metamodel (GOM).

¹ Code generation, as an operation, is not supported when a model is at a conceptual level. Here, we consider code generation as the model’s purpose to be described with GOM.

3.1 GOM and the London Underground Map

Based on the GQM template described in [Bas92], we define the goal of the map as follows:
    Analyze the London Underground
    For the purpose of characterization
    With respect to reachability and connectivity of its stations
    From the view of a traveler
    In the following context: the traveler may be a disabled person, the tube is part of a larger transportation system, the map is displayed on a screen or on paper in stations

From this goal, we derive the following questions to be answered from the model:
(a) What is the shortest path between two stations?
(b) How much does it cost to travel along a given path?
(c) How long does it take to travel along a given path?
(d) Is a given path accessible to a disabled person?
(e) When traveling along a path, at which station should one leave a train?
(f) When traveling along a path, which train (line and direction) should one board?

Table 1 lists which questions are supported by the 4 versions of the map displayed in Figure 2. All maps can be used to find the shortest path between two stations and where to step in and step off trains. However, only the 2009 version fully supports disabled people and allows for computing the cost of a trip. Since it preserves the geographic location of stations, the map of 1919 can be used to estimate the time needed for a trip (without taking transfers into account).

Table 1: Operations supported by the different versions of the map.

Map  | Path (a) | Cost (b) | Time (c) | Accessibility (d) | Step Off (e) | Step In (f)
1919 | √        | −        | √        | −                 | √            | √
1928 | √        | −        | −        | −                 | √            | √
1933 | √        | −        | −        | −                 | √            | √
2009 | √        | √        | −        | √                 | √            | √

A map conforming to the metamodel depicted in Figure 3 could be used to answer all the questions characterizing the purpose of the map. Segments, lines and stations form the topology of the network, allowing a traveler to plan (question (a)) and execute (questions (e) and (f)) a trip with the Underground. Fare zones are involved in the computation of the cost of a trip (question (b)).
The accessibility of a station is needed for question (d), while the distance covered by a segment is needed to answer question (c).

In this example, questions are answered “mentally” by the travelers. Still, all these questions could be formalized with queries in OCL or operations in Kermeta [MFJ05]. For example, Listing 1 presents the operation computing the cost of a trip (encoded as a sequence of segments) in Kermeta. This operation is defined for the metamodel presented in Figure 3.

[Figure 3: A metamodel for a map of the London Underground. Classes: Map; Line (name : String); Station (name : String, isAccesible : Boolean); Segment (distance : Integer); FareZone (zoneID : Integer, cost : Real). A Map contains fareZones (*), stations (*) and lines (*). A Station lies in 1..2 zones and has 0..* incomingSegments and 0..* leavingSegments. A Segment has exactly one sourceStation and one destinationStation. A Line has 1..* ordered segments, and a Segment belongs to 1..* lines.]

    // Compute the cost of a trip
    operation cost(path : Sequence<Segment>) : Real is do
        result := "0".toReal
        // First, collect all traversed zones
        var traversedZones : Set<FareZone> init Set<FareZone>.new
        path.each { seg |
            var src : Set<FareZone> init seg.sourceStation.zone
            var dst : Set<FareZone> init seg.destinationStation.zone
            var inter : Set<FareZone> init src.intersection(dst)
            if not inter.isEmpty then
                // Both stations are in the same zone
                traversedZones.addAll(inter)
            else
                // The segment traverses a boundary
                traversedZones.addAll(src)
                traversedZones.addAll(dst)
            end
        }
        // Second, sum the cost of each traversed zone
        traversedZones.each { z | result := result + z.cost }
    end

Listing 1: Operation computing the cost of a trip.
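For readers unfamiliar with Kermeta, the logic of Listing 1 can be restated in Python. This is a hedged sketch and not part of the original paper: the dataclasses mirror only the parts of the Figure 3 metamodel that the operation reads, the attribute names are adapted to Python conventions, and the fares and stations in the usage example are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FareZone:
    zone_id: int
    cost: float


@dataclass(frozen=True)
class Station:
    name: str
    zones: frozenset  # one or two FareZone objects, following the 1..2 multiplicity


@dataclass(frozen=True)
class Segment:
    source: Station
    destination: Station
    distance: int


def cost(path):
    """Sum the cost of every fare zone traversed by a sequence of segments."""
    traversed = set()
    for seg in path:
        shared = seg.source.zones & seg.destination.zones
        if shared:
            # Both stations are in the same zone.
            traversed |= shared
        else:
            # The segment crosses a zone boundary.
            traversed |= seg.source.zones | seg.destination.zones
    return sum(zone.cost for zone in traversed)


# Hypothetical fares and stations, for illustration only.
zone1 = FareZone(1, 2.0)
zone2 = FareZone(2, 1.5)
a = Station("A", frozenset({zone1}))
b = Station("B", frozenset({zone1, zone2}))  # a boundary station
c = Station("C", frozenset({zone2}))

trip = [Segment(a, b, 3), Segment(b, c, 4)]
print(cost(trip))  # 3.5: zones 1 and 2 are each counted once
```

Using a set for the traversed zones ensures that a zone is charged once, however many segments of the trip lie in it, which matches the two-step structure of Listing 1.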
3.2 GOM and the Performance Analysis on Software Architecture

In this example, we only consider the immediate goal of the model, which is the generation of an EQNM, and we leave the final goal (the performance analysis) out. Still, both goals could have been captured by GOM. Slightly adapting the GQM template [Bas92], the immediate goal of the model can be stated as follows:

    Analyze the architecture of a software system
    For the purpose of generating an EQNM
    With respect to the scenario view and the physical view as defined in [Kru95]
    From the view of the software architect
    In the following context: the generation of an EQNM is explained in [CM00], this generation is automated, the generated EQNM will be used to analyze the performance of the architecture

[CM00] describes, formally, the various steps in the generation of the EQNM from UML models:
(1) deduce a user profile from the use case diagram,
(2) combine the sequence diagrams into a meta execution graph (meta-EG),
(3) obtain the EQNM of the hardware platform from the deployment diagram and tailor the meta-EG into an EG-instance for that platform,
(4) assign numerical parameters to the EG-instance, and
(5) assign environment based parameters to the EQNM, process the EG-instance to obtain software parameters before assigning them to the EQNM.

This chain of transformations requires information from the following UML diagrams: use case diagrams, sequence diagrams and deployment diagrams. The other diagrams of the “4+1” model (the class, the component and the activity diagrams) are not needed for this purpose.

While GOM makes it possible to state the purpose of a model explicitly and to operationalize it, goals expressed in GOM are not formal enough to be analyzed automatically, for example, to find conflicts among them. In the next section, we present how a model’s purpose can be expressed in a goal-oriented modeling language.
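The three GOM levels used in this section can be recorded in a simple data structure. The following sketch is an assumption-laden illustration: the classes `Goal` and `Operation`, their field names and the short step names are hypothetical, and only steps (1) to (3) are listed because the text names an input diagram for them, whereas steps (4) and (5) assign parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Goal:
    """Conceptual level: the structured goal template."""
    analyze: str
    purpose: str
    with_respect_to: str
    viewpoint: str


@dataclass(frozen=True)
class Operation:
    """Operational level: an operation and the model inputs it needs."""
    name: str
    needs: frozenset


def required_inputs(operations):
    """Definable level: everything a conforming model must provide."""
    needed = set()
    for operation in operations:
        needed |= operation.needs
    return needed


goal = Goal(
    analyze="the architecture of a software system",
    purpose="generating an EQNM",
    with_respect_to="the scenario view and the physical view",
    viewpoint="the software architect",
)

operations = [
    Operation("deduce user profile", frozenset({"use case"})),
    Operation("build meta-EG", frozenset({"sequence"})),
    Operation("derive platform EQNM and EG-instance", frozenset({"deployment"})),
]

four_plus_one = {"use case", "sequence", "class", "component", "activity", "deployment"}
needed = required_inputs(operations)
print(sorted(needed))
print(sorted(four_plus_one - needed))  # diagrams not needed for this purpose
```

The difference between the available diagrams and the required ones reproduces the observation that class, component and activity diagrams are not needed for generating the EQNM.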
4 Intentional Metamodeling

In the previous section, we presented a structured but informal way to specify a model’s purpose. In this section, we introduce intentional metamodeling with KAOS, a goal modeling language designed for use in early phases of requirements engineering. A KAOS model consists of four interrelated views:

  - Goal modeling establishes the list of goals involved in the system. Refined goals and alternatives are represented in an AND/OR tree. Conflicts among goals are also represented in this diagram.
  - Responsibility modeling captures the agents to whom responsibility for (leaf) goal satisfaction is assigned.
  - Object modeling is used to represent the domain’s concepts and the relationships among them.
  - Operation modeling prescribes the behaviors the agents must perform to satisfy the goals they are responsible for.

A goal can be refined into conjoined sub-goals (the goal is satisfied when all its sub-goals are satisfied) or into alternatives (the goal is satisfied when at least one of its alternatives is satisfied). Therefore, goals are represented as AND/OR trees in KAOS. In such a tree, the goals below a given goal explain how and how else the goal can be realized. Conversely, goals higher in the hierarchy provide the rationale for a given goal, explaining why the goal is present in the model.

[vL09] provides a taxonomy of goals based on their types and their categories. There are two main types of goals: behavioral goals (such as Achieve, Cease, Maintain and Avoid goals) prescribe the behavior of a system, while soft-goals (such as Improve, Increase, Reduce, Maximize and Minimize goals) prescribe preferences among alternative systems.
Similarly, there are two main categories of goals: functional goals (like Satisfaction [of user requests] or Information [about a system state] goals) state the intent behind a system service, and non-functional goals (like Usability or Accuracy) state a quality or constraint on its provision or its development. This taxonomy can be helpful for eliciting and specifying goals.

Goals are refined until they are assignable to a single agent. Leaf goals are then made operational by mapping them to operations ensuring them. Operations are binary relationships over system states. They can be derived from the formal specification of goals or built from elicited scenarios. Finally, a conceptual model gathers all concepts (including their attributes and the relationships among them) involved in the definition of goals and operations.

We use KAOS as a metametamodel rather than as a metamodel, which is the role it was initially designed for: in our approach, KAOS models are metamodels. Goals depict the modeling purposes. Operations prescribe the operations that can be executed on models, and the conceptual model defines the abstract syntax of the language. Thus, a metamodel written in KAOS specifies many aspects of a modeling task: it states the purpose and intended usage of models as well as their structure. In the remainder of this section, we present KAOS metamodels for our examples.

4.1 Intentional Metamodeling and the London Underground Map

A KAOS metamodel of the London Underground map is presented in Figure 4. The main goal of the map is to provide travelers with a means to understand how to travel from a station A to another station B successfully. To achieve this, the map must satisfy the following sub-goals: to help travelers plan their trip, to help them buy the right ticket for it and to help them navigate, that is, to prevent them from getting lost during their travel.
These goals are operationalized through the following operations performed by the traveler: find the shortest path between stations A and B, compute its cost (by summing the fares of traversed fare zones) and carry out the plan by riding on the right line and connecting at the right station. As we did in Section 3.1, we can define these operations formally and derive a metamodel to support them. For space reasons, this metamodel is not included in Figure 4, but it is presented in Figure 3.

[Figure 4 (goal diagram): the root goal “Provide a means for understanding how to travel from A to B successfully” is refined into Achieve [Plan Trip], Achieve [Buy Right Ticket] and Avoid [Getting Lost]. These goals are operationalized by Compute Shortest Path A → B, Compute Cost of Planned Trip and Carry Out Planned Trip, all performed by the agent Traveler.]

Figure 4: A KAOS metamodel for the London Underground map.

4.2 Intentional Metamodeling and the Performance Analysis

We present an intentional metamodel of the performance analysis in Figure 5. The final goal of the architect is to analyze the performance of her architecture. This goal has been refined into three sub-goals: First, performance models are generated automatically from some UML diagrams. Then, these performance models are parametrized and solved. For space reasons, we did not further elaborate these two latter goals. We also considered UML diagrams as atoms, ignoring their internal elements such as actors, messages and nodes.

A computer is responsible for the generation of performance models. This goal is operationalized with four automated operations: generate the user profile from the use case diagram, generate the meta-EG from sequence diagrams, instantiate the meta-EG into an EG-instance with the help of the deployment diagram and generate an EQNM from the deployment diagram. These operations correspond to the first three steps described in [CM00]. The last two steps are captured in the two remaining goals, parametrize and solve the performance models.
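The AND/OR refinement semantics described at the beginning of Section 4 can be illustrated with a small Python sketch. It is an assumption-laden illustration, not KAOS tooling: the `Goal` class and the functions `is_satisfied` and `rationale` are hypothetical names, and the tree below only encodes the three sub-goals of the London Underground example.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    name: str
    refinement: str = "AND"  # "AND": all sub-goals needed; "OR": alternatives
    subgoals: list = field(default_factory=list)


def is_satisfied(goal, satisfied_leaves):
    """Evaluate a goal bottom-up from the set of satisfied leaf goal names."""
    if not goal.subgoals:
        return goal.name in satisfied_leaves
    results = [is_satisfied(sub, satisfied_leaves) for sub in goal.subgoals]
    return all(results) if goal.refinement == "AND" else any(results)


def rationale(root, name, path=()):
    """Return the ancestors of a goal (root first), i.e. why it is present."""
    if root.name == name:
        return list(path)
    for sub in root.subgoals:
        found = rationale(sub, name, path + (root.name,))
        if found is not None:
            return found
    return None


# Goal tree of the London Underground example (leaf goals only, no operations)
underground_map = Goal(
    "Understand how to travel from A to B",
    "AND",
    [Goal("Plan Trip"), Goal("Buy Right Ticket"), Goal("Avoid Getting Lost")],
)
```

Navigating upwards with `rationale(underground_map, "Buy Right Ticket")` returns the root goal as the reason for that sub-goal, while `is_satisfied` corresponds to downward navigation over the refinements.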
5 Discussion

This paper is an initial contribution towards the modeling of models’ purposes. For this, we have adapted two existing goal modeling approaches and applied them to two modeling tasks, demonstrating the feasibility of such metamodeling.

[Figure 5 (goal diagram): Achieve [Performance Analysis] is refined into Achieve [Generate Performance Models], Achieve [Parametrize Performance Model] and Achieve [Solve Performance Models]. Agents: Computer and Architect. Operations with their inputs and outputs: Generate User Profile (Use Case Diagram → User Profile), Create Meta EG (Sequence Diagram → Meta EG), Instantiate EG (→ EG Instance), Create EQNM (Deployment Diagram → EQNM).]

Figure 5: A KAOS metamodel for a performance analysis.

In the remainder of this section, we compare the two approaches presented in this paper, GOM and intentional metamodeling. We also discuss the benefits and difficulties related to these approaches.

5.1 Comparison of GOM and Intentional Metamodeling

Contrary to GOM, intentional metamodeling with KAOS can capture the complete rationale behind the creation and the use of a model. As explained in Section 4, goal models are organized in AND/OR trees. By navigating the tree from an element upwards, a modeler can find the rationale explaining a given operation, meta-class or modeling purpose. Likewise, but using downward navigation, the modeler can figure out how a model purpose is realized by looking at its sub-goals or its alternatives.

In this paper, we only presented semi-formal KAOS models. However, these models can be completely formalized and thus are amenable to automated analysis, including the verification of goal refinements or the derivation of goal operationalizations [vL09]. The weaknesses of KAOS lie in the cost and difficulties of formalizing goals and operations. In comparison, GOM is a semi-formal approach. It only provides templates for stating modeling purposes and guidelines for deriving questions from this purpose.
Future research should explore under which conditions a low or a high level of formality is preferred or required.

We have presented GOM and intentional metamodeling as two different approaches, because they come from different fields: software measurement and early requirements engineering, respectively. Future work may integrate these two approaches, combining the ease of use of the templates and guidelines of GOM and the formality of KAOS.

5.2 Benefits and Limitations

Stating a model purpose and making it operational allows for measuring the fitness of a model for this purpose. A model is complete if it contains all the elements necessary to fulfill its goals. Conversely, a model element is pertinent if it contributes to the satisfaction of at least one goal. Confined models only contain pertinent elements. With a formal KAOS model, it is possible to measure these qualities objectively by establishing satisfaction arguments.

However, eliciting a model’s purpose and elaborating it has a cost. The benefits must be higher than the costs if the practice is to be adopted by practitioners. Models are like systems. Making explicit requirements about models (such as stating their purpose) aims at reducing the risk of creating the wrong models. Models at the wrong level of abstraction have consequences ranging from small annoyances for their interpreters to the impossibility of fulfilling the purposes they were made for.

Furthermore, goal modeling is difficult. First, many modelers are not experienced in intentional modeling. Courses on Software Engineering or Modeling typically cover data, behavior and process modeling languages but leave out goal modeling. Thus, (intentional) metamodelers will be rare in the near future. Second, goal models grow rapidly as goals are refined and alternatives are identified.

6 Related Work

In this section, we present the state of the art in metamodeling and model quality and we discuss its limitations.
For van Gigch [vG91], a metamodel should cover many aspects of modeling, not only “data” metamodeling (the syntax of the language). In this vein, Kermeta proposes to metamodel the behavior of models, so that the operational semantics of models can be specified [MFJ05]. In this paper, we go one step further by metamodeling modeling agents and their goals.

In their model of modeling [MFBC10], Muller et al. place the intention of a model at the heart of their notation. They define intention as follows:

    “The intention of a thing thus represents the reason why someone would be using that thing, in which context, and what are the expectations vs. that thing. It should be seen as a mixture of requirements, behavior, properties, and constraints, either satisfied or maintained by the thing.”

In their notation, intentions are considered as sets and thus represented as Venn diagrams. While this notation can represent the intersection and the overlap among intentions, it cannot represent the internal content of the intention behind a model. The focus of our paper is to represent this intention, so that its modelers and its interpreters can agree and reason on it.

For Nuseibeh et al. [NKF93], a viewpoint consists of (1) a style (the modeling language and its notation), (2) a work plan describing the development process of the viewpoint including possible consistency checks or construction operations, (3) a domain defining the area of concern with respect to the overall system, (4) a specification describing the viewpoint’s domain using the viewpoint’s style (in other words, the view of the system from the viewpoint) and (5) a work record keeping track of development history within the viewpoint. According to the IEEE 1471 standard, a viewpoint captures the conventions for constructing, interpreting and analyzing a particular kind of view.
Thus, a viewpoint defines, among others, modeling languages, model operations that can be applied to views and stakeholders whose concerns are addressed in the views. Viewpoints define the various views (and their relationships) in a software specification or in an architecture description; thus, they provide modelers with guidelines on what they are expected to model. However, we are not aware of guidelines to define these viewpoints, nor of techniques to validate that a view actually satisfies the needs of the stakeholders using it.

In [MDN09], Mohagheghi et al. surveyed frameworks, techniques and studies of model quality in model-based software development. They identified six quality goals: correctness, completeness, consistency, comprehensibility, confinement and changeability. Manual reviews [LSS94] and metrics [BB06] can be used to assess and improve the confinement and completeness of models. However, these techniques are either bound to a given modeling language and a given process [BB06] or must be tailored for the modeling task at hand [LSS94]. In comparison, intentional metamodeling and GOM are not bound to any specific language or process. Because they document the purpose of models, goals expressed and operationalized in GOM or KAOS may serve as a basis to derive checklists, guidelines and metrics for validating models.

In previous work [JGB11], we proposed and compared two methods to compute the footprint of an operation, that is, the set of all information used by the operation during its execution. Dynamic footprinting reveals the actual footprint of an operation by tracing its execution on the model. In contrast, static footprinting estimates footprints by first analyzing, statically, the definition of the operation to obtain its static metamodel footprint, the set of all modeling constructs (i.e., types, attributes and references) involved in this definition.
The model footprint can then be estimated by selecting only those model elements that are instances of elements in the metamodel footprint. In this previous work, we assumed that the purpose of a model can be characterized by the set of operations carried out on it and that these operations were formally defined. These assumptions are reasonable in an MDE setting. Still, in this paper, we are interested in methods to specify an arbitrary model purpose and, if possible, to refine this purpose into a set of operations whose footprints can be looked at. In other words, the focus of this paper is the elicitation, documentation and operationalization of modeling purposes. The operationalization produces metamodels and operations that can additionally be used as input for model footprinting.

In addition to GQM and KAOS, there are other goal-oriented techniques and methods for requirements engineering, such as i* [Yu97] and Tropos [CKM02]. While we could have selected these approaches to capture and analyze the purposes of models, we chose KAOS and GQM instead for their strong focus on the operationalization of goals.

7 Conclusion

One cannot build a model without knowing its purpose, and one must not use a model for purposes it is not fit for. Despite its importance, the purpose of a model is often kept implicit. In this paper, we adapted two existing goal modeling approaches, GQM [Bas92] and KAOS [vL09], to capture the purpose of a model and operationalize it into a set of operations and a metamodel. With these elements in hand, it is possible to measure how fit a model is for the purpose. We demonstrated the feasibility of the approaches by applying them to two examples. These early results are promising, but the benefits of such intentional metamodels remain to be established empirically (e.g., with industrial case studies).
With the experience gained in modeling the purpose of models, we can elaborate the templates and adapt the guidelines offered by KAOS and GQM in more detail. In this vein, further research could define a profile for KAOS and develop specific analyses for intentional metamodels. For the first time, goal modeling techniques were applied to modeling itself, raising many open issues: What is the source of modeling goals, that is, who are the metaexperts? Do intentional metamodels help in model management and model reuse?

Acknowledgement

Our work is partially funded by the Swiss National Science Foundation under the project 200021 134543 / 1.

References

[Bas92] Victor R. Basili. Software modeling and measurement: the Goal/Question/Metric paradigm. Technical Report UMIACS TR-92-96, 1992.
[BB06] Brian Berenbach and Gail Borotto. Metrics for model driven requirements development. In 28th International Conference on Software Engineering (ICSE ’06), pages 445–451, Shanghai, China, 2006. ACM.
[CKM02] Jaelson Castro, Manuel Kolp, and John Mylopoulos. Towards requirements-driven information systems engineering: the Tropos project. Information Systems, 27(6):365–389, 2002.
[CM00] Vittorio Cortellessa and Raffaela Mirandola. Deriving a queueing network based performance model from UML diagrams. In International Workshop on Software and Performance (WOSP 00), pages 58–70, 2000.
[HPdW05] Stijn Hoppenbrouwers, H. A. Proper, and Th. P. van der Weide. A Fundamental View on the Process of Conceptual Modeling. In Conceptual Modeling (ER 2005), volume 3716 of LNCS, pages 128–143. Springer, 2005.
[JGB11] Cédric Jeanneret, Martin Glinz, and Benoit Baudry. Estimating Footprints of Model Operations. In 33rd International Conference on Software Engineering (ICSE 2011), pages 601–610, Waikiki, Honolulu, HI, USA, 2011. ACM.
[Kra07] Jeff Kramer. Is abstraction the key to computing? Communications of the ACM, 50(4):36–42, 2007.
[Kru95] Philippe Kruchten. The 4+1 View Model of Architecture. IEEE Software, 12(6):42–50, 1995.
[LSS94] Odd Ivar Lindland, Guttorm Sindre, and Arne Sølvberg. Understanding Quality in Conceptual Modeling. IEEE Software, 11(2):42–49, 1994.
[Lud03] Jochen Ludewig. Models in software engineering - an introduction. Software and Systems Modeling, 2(1):5–14, March 2003.
[MDN09] Parastoo Mohagheghi, Vegard Dehlen, and Tor Neple. Definitions and approaches to model quality in model-based software development: A review of literature. Information and Software Technology, 51(12):1646–1669, 2009.
[MFBC10] Pierre-Alain Muller, Frédéric Fondement, Benoit Baudry, and Benoît Combemale. Modeling modeling modeling. Software and Systems Modeling, 2010.
[MFJ05] Pierre-Alain Muller, Franck Fleurey, and Jean-Marc Jézéquel. Weaving Executability into Object-Oriented Meta-Languages. In 8th International Conference on Model Driven Engineering Languages and Systems (MoDELS 2005), volume 3713 of LNCS, pages 264–278, 2005.
[NKF93] Bashar Nuseibeh, Jeff Kramer, and Anthony Finkelstein. Expressing the relationships between multiple views in requirements specification. In 15th International Conference on Software Engineering (ICSE ’93), pages 187–196, Baltimore, MD, USA, 1993.
[Rot89] Jeff Rothenberg. The nature of modeling. In Artificial intelligence, simulation & modeling, pages 75–92. John Wiley & Sons, Inc., New York, NY, USA, 1989.
[vG91] John P. van Gigch. System Design Modeling and Metamodeling. Plenum Press, New York, NY, USA, 1991.
[vL09] Axel van Lamsweerde. Requirements Engineering: From System Goals to UML Models to Software Specifications. Wiley, 2009.
[Yu97] Eric Yu. Towards modelling and reasoning support for early-phase requirements engineering. In 3rd International Symposium on Requirements Engineering (RE ’97), pages 226–235, 1997.
Reflecting modeling languages regarding Wand and Weber’s Decomposition Model

Florian Johannsen, Susanne Leist
Department of Management Information Systems
University of Regensburg
Universitaetsstraße 31
93053 Regensburg
Florian.Johannsen@wiwi.uni-regensburg.de
Susanne.Leist@wiwi.uni-regensburg.de

Abstract: The benefits of decomposing process models are widely recognized in literature. Nevertheless, the question of what actually constitutes a “good” decomposition of a business process model has not yet been dealt with in detail. Our starting point for obtaining a “good” decomposition is Wand and Weber’s decomposition model for information systems, which we specify for business process modeling. In the investigation at hand, we aim to explore to what extent modeling languages support the user in fulfilling the decomposition conditions according to Wand and Weber. An important result of the investigation is that all investigated business process modeling languages (BPMN, eEPC, UML AD) can meet most of the requirements.

1 Introduction

Business process modeling is widely recognized as an important activity in a company [BWW09]. For instance, business process models can serve as a basis for decisions on IT investments or the design and implementation of information systems [BWW09]. With regard to understandability, the size of a business process model plays a central role [MRC07]. Depending on both the purpose of modeling and the target group considered, requirements on process models may differ. While a software engineer may be interested in details of a business process (e.g. complex control-flow mechanisms), another employee may only consider the more abstract model levels, giving him/her a basic understanding of the business process [BRB07]. For creating process models that are manageable and understandable in size, but also contain all the information needed (e.g.
for software development, process improvement efforts etc.), they are decomposed “into simpler modules” [GL07]. In doing so, a process model is decomposed into several model levels that differ in detail [KKS04]. Nevertheless, the characteristics that actually constitute a “good” decomposition [BM06, BM08] remain unclear. In practice, the decomposition of process models is usually done in an “ad hoc” fashion [RMD10]. Guidelines on how to decompose a model into subprocesses are missing [RMD10].

Our starting point is Wand and Weber’s model for a good decomposition, which was developed for information systems (see [WW89, We97]). We specify this model for business process modeling, giving business analysts a means to evaluate their decomposed models. As already mentioned in literature (see [Re09]), the Wand and Weber model seems promising for deriving criteria to judge whether the decomposition of a process model is “good” or “bad”.

As a first step in our investigation, we evaluate the capabilities of common process modeling languages to support Wand and Weber’s decomposition model. It is our aim to explore how far these modeling languages support the user in fulfilling the defined conditions. Although common modeling languages do enable decomposition, e.g. by means of hierarchical functions in Event-driven Process Chains (EPCs) or subprocesses in the Business Process Modeling Notation (BPMN), the information given on a certain model level is not only dependent on the control-flow. Sometimes additional information is needed, which becomes obvious by taking, for instance, a data-oriented view (e.g. focus on data elements). Since not all modeling languages support views that are not solely directed at the control-flow, the capabilities of a modeling language influence the quality of the decomposition.

This paper is structured as follows.
In section two, we give a definition of the term decomposition, highlight the relevance of the Wand and Weber model, and describe the procedure for the research under study. Section three introduces the investigated business process modeling languages, and section four presents the decomposition model. Whether the process modeling languages are capable of supporting the decomposition model is discussed in section five. To this end, requirements on process modeling languages are derived. Section six presents conclusions, a set of limitations, and potential directions for future research.

2 Conceptual Basics and Related Work

2.1 Decomposition and process model quality

Manifold metrics for judging the quality of a process model were recently developed (see [GL07, Va07, MRA10]). Moreover, frameworks for evaluating conceptual models exist [SR98, KLS95]. In that context, decomposition is seen as a means to improve the understandability of a process model while reducing the likelihood of errors at the same time [MRA10].

The term decomposition is used in several publications, and many further publications (see e.g. [FS06, He09]) use terms with similar meanings (e.g. deconstruction, disaggregation, specialization). We define the decomposition of a process according to Weber [We97] as a set of subprocesses such that the composition of the process equals the union of the compositions of the subprocesses in the set. Everything in the composition of the process is included in at least one subprocess in the chosen set of subprocesses. The decomposition of a process is represented in a level structure of subprocesses, and, on each level, the process or the subprocesses are displayed in a process model (see [We97]). Disaggregation and specialization are seen as special types of decomposition, representing a part-of relation and an is-a relation, respectively.

Most of the related work distinguishes heterogeneous types of decomposition for a given objective.
For example, vom Brocke [Br06] introduced design principles for reference modeling which aim to provide greater flexibility in reference modeling. Malone et al. [Ma99] developed the “process compass”, which differentiates between horizontal specialization by means of objects and vertical disaggregation into subprocesses. Heinrich et al. [He09] used disaggregation and specialization for decomposing a process landscape, aiming at identifying primarily functional similarities of the detailed subprocesses. Ferstl and Sinz [FS06] defined principles (the so-called decomposition rules) to recursively refine processes over several levels of detail, which support disaggregation and specialization. The principles were especially designed to be used within the framework of their SOM (semantic object model) methodology. Österle [Ös95] described a pragmatic procedure to decompose processes. The objective of the procedure is to detail macro processes into micro processes (see [Ös95]). To this end, he suggested four sources (services, business objects, process or activities of the customer process, existing activities) which help to derive activities from the macro process [Ös95]. While different principles of decomposition are defined in these publications based on their objectives, characteristics of a good decomposition are not investigated.

2.2 Relevance of Wand and Weber’s decomposition model

As described above (section 2.1), various principles and suggestions to help practitioners achieve a good decomposition exist. But, to our knowledge, only one general theory of decomposition has so far been proposed in information systems (see [BM06]): Wand and Weber’s good decomposition model (see [WW89, WW90, We97]). The decomposition model is part of the Bunge-Wand-Weber model (BWW model) [We97].
The BWW model is deeply rooted in the information systems discipline [Re09] and comprises the representational model, the state-tracking model, and the decomposition model named above [We97]. The representational model has gained popularity as a means for the ontological analysis of modeling languages (see e.g. [RI07, Ro09, RRK07]). In such analyses, modeling languages are evaluated regarding ontological completeness and ontological clarity [Re09]. Both the state-tracking and the decomposition model are based on the concepts of the representational model [We97]. Details on the BWW model can be found in Weber [We97], for example.

The decomposition model as it was originally developed by Wand and Weber comprises five conditions to judge the quality of a decomposition [We97]: minimality, determinism, losslessness, minimum coupling, and strong cohesion. These conditions help a user to decide whether an information system has been appropriately decomposed or not. Investigating these principles of good decomposition [We97] to support the creation of manageable business process models in large-scale initiatives has already been promoted by Recker et al. [Re09] as a promising field for research. If the decomposition model proves to be appropriate for that purpose, guidelines on how to decompose business process models may be derived in a subsequent step. This opinion is shared by Reijers and Mendling [RM08] as well. The positive effect of the decomposition conditions on the comprehensibility of UML diagrams has already been shown empirically (see [BM02, BM06, BM08]).

2.3 Procedure for deriving requirements on modeling languages

Since the decomposition conditions are based on the BWW representational model [We97] and modeling languages differ regarding their ontological completeness [Re09], the question arises to what extent heterogeneous modeling languages are able to support Wand and Weber’s decomposition conditions.
To answer this question we adhere to the following procedure.

Step 1: Specification of the decomposition conditions for business process modeling
Step 2: Derivation of metrics for evaluating process models regarding the decomposition conditions
Step 3: Formulation of requirements on modeling languages
Step 4: Evaluation of the modeling languages

Figure 1: Procedure for deriving requirements and evaluating modeling languages

In a first step, the decomposition conditions, which have their origin in information systems, are specified for business process modeling. Based on this specification, metrics are derived (step 2) to judge whether a process model adheres to the decomposition conditions as defined in step 1. These metrics serve as an objective basis for the evaluation of process models. Because the metrics are obtained from our specification of the decomposition conditions (step 1), they address those modeling constructs that matter when evaluating whether a process model fulfills the decomposition conditions. By looking at the metrics and the modeling constructs they address, requirements on modeling languages can be defined directly. The third step of our procedure (figure 1) contains the formulation of the requirements on modeling languages. Using these requirements, common modeling languages are evaluated regarding their support of the decomposition conditions (step 4). In doing so, it becomes clear to which degree a decomposed process model that was designed by using a specific modeling language can be judged regarding its conformance with the decomposition conditions.

This investigation is part of a research project which aims to define conditions for a good decomposition. The research project is based on the design science research method (see [He04]). Thus, decomposition conditions will be built and evaluated afterwards.
The investigation at hand serves to assess the capabilities of existing knowledge (modeling languages and Wand and Weber’s decomposition model) and builds upon design science principles as well. We evaluate existing artifacts (modeling languages) using requirements derived from a proposed solution (Wand and Weber’s decomposition model) for a given problem (decomposition).

3 Business Process Modeling Languages

Manifold notations exist for modeling business processes. In particular, the Business Process Modeling Notation (BPMN), the enhanced Event-driven Process Chains (eEPCs), and UML activity diagrams (UML ADs) have gained considerable attention in the field of business process modeling [MR08, Me09]. BPMN and UML have been developed and promoted by the OMG as standards in the modeling domain [MR08]. However, not only the ratification by the OMG but also the growing tool support has contributed largely to their popularity in today’s business process modeling projects [MR08]. eEPCs are characterized by a high user acceptance [Me09, STA05], especially in the German-speaking community. Many reference models for different areas of application (e.g. computer integrated manufacturing, logistics or retail) are designed using eEPCs, and the notation is supported by manifold modeling tools as well [Me09]. Although other modeling languages exist (such as Petri nets) (see [Mi10]), most of them were developed for analysis purposes and not for communicating models to business people and employees [Mi10], which hampers their popularity. Thus, in the following, we focus on eEPCs, UML ADs and BPMN. In addition, all of these languages support modeling constructs such as “collapsed subprocesses” (BPMN), “sub-activities” (UML AD), or “hierarchical functions” (eEPC), enabling the process design on different model levels.

Enhanced Event-driven Process Chains (eEPCs): Event-driven Process Chains were developed in the early 1990s for visualizing an integrated information system from a business perspective (see [STA05]). The EPC is part of the ARIS framework (see [STA05]). The ARIS framework comprises several views (e.g. data view, function view or organization view) that can be used to specify an EPC model through additional information, for example data elements or organizational units [STA05]. In that context, we speak of enhanced Event-driven Process Chains (eEPCs).

Business Process Modeling Notation (BPMN): BPMN was officially introduced in 2004. The idea was to create a graphical standard to complement executable business process languages such as BPEL or BPML [MR08]. In the meantime, Version 2.0 of the standard was released by the OMG. BPMN offers a variety of graphical modeling elements which are separated into basic and extended elements [OMG10].

UML Activity Diagrams (UML ADs): UML can be seen as a standard in the field of object-oriented modeling [Ru06]. It plays a dominant role in software engineering, since the functionality as well as the static structure of software can be described by several diagram types [Ru06]. In that context, activity diagrams (UML ADs) are important for modeling the business processes that software is supposed to support. In the meantime, Version 2.4.1 of UML was released by the OMG [OMG11]. An “action” is the central element of UML activity diagrams for describing the behavior within a business process [Ru06].

The terminology in the field of business process modeling techniques is not standardized. We therefore stick to the terminology of Vanderfeesten et al. [Va07], which can be used for nearly all common business process modeling languages. Accordingly, we consider activities, events, data elements, directed arcs, connectors, and resources as constructs of a process model. Contrary to Vanderfeesten et al.
[Va07] we also list events as separate elements, since events are an important concept during process execution [Mi10], which is emphasized by modeling languages such as the EPC. We adhere to this terminology in the following. This allows us to specify the decomposition conditions regardless of the business process modeling language used (e.g. eEPC, BPMN etc.).

4 The Decomposition Model

Wand and Weber's decomposition model [We97] focuses on the decomposition of information systems and specifies five conditions: (1) minimality, (2) determinism, (3) losslessness, (4) minimum coupling and (5) strong cohesion. Some of the conditions can also be found in neighboring disciplines such as data modeling or business process modeling (see e.g. [BCN92, Be95, Va07]). These findings are referred to in order to specify the conditions appropriately for the purpose under study. In addition, Green and Rosemann [GR00] as well as Recker et al. [Re09] reflect modeling languages regarding the BWW model. Their results, too, help to specify the conditions.

Minimality condition: Following Weber [We97], a decomposition “is good only if for every subsystem at every level in the level structure of the system there are no redundant state variables describing the subsystem”. In the information systems domain this means that every subsystem of an information system should be characterized by the minimum number of attributes necessary for describing the subsystem [We97]. Minimality is an aspect that has also been addressed both in data modeling (see e.g. [BCN92]) and business process modeling (see e.g. [Be95]). According to Batini et al. [BCN92] a model is minimal if no object can be removed without causing a loss of information. Whether there is a loss of information is to be judged by the end-user, and is therefore highly subjective. In addition, it is also the end-user who decides whether a specific modeling element is necessary or not.
As already stated, a software engineer may expect more details in a process model than, for instance, a normal employee (see [BRB07]). Therefore a modeling construct can be seen as needless if it is not required by the end-user. Another important aspect of minimality is seen in avoiding redundancies. But while designing redundancy-free models is a realistic goal in data modeling, this does not apply to business process models [Be95]. Redundancies in business process models are quite common and may be necessary to design semantically correct models. Becker [Be95] gives some hints as to when activities in a business process model can be merged to avoid redundancies. Nevertheless the user’s perception plays a central role in deciding whether a construct within a model should be modeled more than once (see [Be95]). Sometimes redundancy-free process models may be difficult to understand because complex structures of different connectors (e.g. OR, XOR, AND) are needed. Therefore we distinguish between wanted and unwanted redundancies. Only unwanted redundant elements should be avoided. The final decision whether an object in a process model is to be considered as unwanted redundant should be up to the end-user. Therefore, to evaluate different designs of a process model as regards minimality we propose the following (see table 1):

Table 1: Verification of minimality
No. 1
  Verification of minimality: Number (#) of activities, events, data elements, resources that are not required by the end-user or unwanted redundant in relation to all activities, events, data elements, resources.
  Metric: # not required or unwanted redundant activities, events, data elements, resources / # all activities, events, data elements, resources

Because the metric is a ratio, the size of the business process model is taken into account when evaluating minimality. As mentioned, the end-user's perspective is crucial at that point.
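
Metric 1 can be illustrated with a minimal Python sketch. The `Element` class, its field names, and the sample model are hypothetical and not part of the original paper; the flags `required` and `unwanted_redundant` stand for the end-user's judgment, which has to be collected outside the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """One construct of a process model (hypothetical structure for illustration)."""
    name: str
    kind: str                         # "activity", "event", "data element" or "resource"
    required: bool = True             # end-user judgment, supplied from outside the model
    unwanted_redundant: bool = False  # end-user judgment, supplied from outside the model


def minimality_ratio(elements):
    """Metric 1: flagged elements in relation to all elements of the model."""
    if not elements:
        raise ValueError("the model contains no elements")
    flagged = sum(1 for e in elements if not e.required or e.unwanted_redundant)
    return flagged / len(elements)


# Invented sample model: two of five elements are flagged by the end-user.
model = [
    Element("check order", "activity"),
    Element("check order (second copy)", "activity", unwanted_redundant=True),
    Element("order received", "event"),
    Element("order form", "data element"),
    Element("fax machine", "resource", required=False),
]

print(minimality_ratio(model))
```

A lower value indicates a design that is closer to minimality. Two designs of the same process can be compared only if the same end-user judgments are applied to both.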

Determinism condition: According to Weber [We97] determinism can be defined the following way: “For a given set of external (input) events at the system level, a decomposition is good only if for every subsystem at every level in the level structure of the system an event is either (a) an external event, or (b) a well-defined internal event”. The decomposition model mentions internal and external events [We97, Re09, GR00]. According to Burton-Jones and Meso [BM02], internal events are those events that occur during the execution of a process. Whether a specific internal event occurs depends on decisions made or activities performed. For example, the completeness check of a “purchase order” indicates that an order is either “complete” or “incomplete”, depending on prior steps in the process. The decomposition model requires internal events to be “well-defined” [We97, Re09, GR00]. This means that knowledge concerning the prior state enables a user to predict the subsequent event that will occur [We97]. In the literature, there has been discussion about the relation between OR-splits and their effect on the instantiation of a process [ADK02]. It becomes obvious that the use of OR-splits often leads to designs in which events or subsequent states are hard to predict, which may lead to complications during the actual execution of the process [ADK02]. Therefore the determinism of a process model suffers from the use of OR-splits. In addition, Cardoso showed the negative effect of OR-splits on the understandability of process models (see [Ca05]). He introduced the Control-Flow-Complexity-Metric [Ca05] that relates the complexity of a process model to the use of specific connectors. Negative impacts on the understandability of business process models are also caused by XOR-splits that are not based on conditional expressions.
BPMN models using event-based XOR-splits, for example, are hard to interpret, since the branch to be chosen after the XOR-split depends on the occurrence of an event, mainly the receipt of a message [OMG10]. In that case, internal events are only modeled in an implicit way, while the process flow actually comes to an abrupt stop at that point. External events, on the other hand, are triggered by factors that are beyond a company's influence, for instance a server crash at a supplier which prevents the regular restocking of the company's warehouse [We97]. While the existence of such events should be recognized, it is hard to predict their effects on the actual process execution. When external events are known, activities to react to these external influences can be specified within a process model. Nevertheless it is often hard to identify all external events that may have an impact on a process. Therefore a modeler can only be expected to model external events insofar as she/he is able to identify them. If a process model has few external events, this can either be an indicator that the process is only little affected by external influences or that the modeler has not identified all external events properly. Despite these problems the relation between the number of external events and all events of the model can be used to judge to what degree a process model is characterized by external events. To evaluate different designs of a process model we therefore propose the following:

Table 2: Verification of determinism
No. 2
  Verification of determinism: Number (#) of OR-splits in relation to all split operations of the model.
  Metric: # OR-splits / # all split operations
No. 3
  Verification of determinism: Control-Flow-Complexity-Metric according to [Ca05]. OR-splits have the most negative impact on the complexity of the model.
  Metric: see [Ca05]
No. 4
  Verification of determinism: Number (#) of XOR-splits that are not based on conditional expressions in relation to all split operations of the model.
  Metric: # XOR-splits not based on conditional expressions / # all split operations
No. 5
  Verification of determinism: Number (#) of external events in relation to all events of the model.
  Metric: # external events / # events of the model

Losslessness condition: Weber [We97] believes that “a decomposition is good only if every hereditary state variable and every emergent state variable in a system is preserved in the decomposition”. Simply speaking, the decomposition model demands “not to lose properties” of a thing that is being decomposed [WW89, We97]. No information must get lost during the decomposition. The ideas of Moody [Mo98] concerning the completeness of data models can be used to specify this aspect for business process models. A model therefore suffers from “losses” if certain constructs (e.g. activities, events) are required by the target group but cannot be found in the process model itself. The perspective of the target group once again becomes decisive in that context. In addition, Weber [We97] exemplifies that decomposition can lead to a false reproduction of the real world. This means that the semantics of a business process model may be distorted during decomposition and losses of the required semantics will occur. Considering resources can be of great help during decomposition. While two activities may look equal at first sight (e.g. “checking account”), they can be different regarding both the person performing the activity and the resources needed (see [Be95]). The underlying semantics can be completely different for these activities (e.g. “checking private customers' account” vs. “checking business customers' account”). What is more, syntactical errors occurring during decomposition will lead to misinterpretations and losses of the required semantics, too. As a consequence, the syntactical correctness of a model must be guaranteed for all model levels.
Therefore “losslessness” of a model can be checked by means of the following metrics; the relation once again considers the size of the model:

Table 3: Verification of losslessness
No. 6
  Verification of losslessness: Number (#) of missing activities, events, data elements, resources on all model levels considering an original model (or the requirements of a user).
  Metric: # missing activities, events, data elements, resources / # all activities, events, data elements, resources of an original model (or the requirements of a user)
No. 7
  Verification of losslessness: Number (#) of wrongly designed constructs (syntactically and semantically) in relation to all required constructs.
  Metric: # wrongly designed constructs / # all required constructs

Minimum coupling condition: Weber [We97] states that “a decomposition has minimum coupling iff the cardinality of the totality of input for each subsystem of the decomposition is less than or equal to the cardinality of the totality of input for each equivalent subsystem in the equivalent decomposition”. This aspect of the decomposition model addresses the coupling of the subsystems [We97]. The condition demands a minimum coupling, which requires a minimum cardinality of the totality of the input [We97]. In process models, inputs are seen as data elements and the minimum cardinality refers to the number of relations between incoming data elements and activities. In the context of business process modeling, this idea is also supported by Vanderfeesten et al. [Va07]. According to Vanderfeesten et al. [Va07, VCR07] “coupling” measures the number of interconnections between the activities of a business process model. Thus it becomes obvious to what extent various activities are dependent on each other [VRA08]. “Two activities are coupled, if they contain one or more common data element(s)” [Va07]. Accordingly, the degree of coupling of a business process model can be calculated by counting the number of coupled pairs (see [Va07, VRA08]).
The activities have to be selected pairwise beforehand. The mean is then determined on the basis of the total number of activities [Va07]. This approach has a strong focus on the data elements. With “minimal coupling” the activities in a business process model are neither too small nor too big (see [Va08]). Nevertheless, Wand and Weber admit that the meaning of “minimum coupling” is unclear and different interpretations can be found in the literature (see [We97]). A further interpretation of Wand and Weber’s definition of “minimum coupling” in business process models could be seen in the possibility to measure the strength of the connection between the activities (see [Va08]). In that case, the focus would mainly be on the control-flow. The degree of coupling depends on the complexity and the type of connections (e.g. XOR, AND, OR) between the activities [VCR07]. In Vanderfeesten et al. [Va08] the so-called “Cross-Connectivity-Metric (CC)” is introduced for that purpose. The coupling of a business process model is thus determined by the complexity of the connections between its activities. To compare different designs as regards their degree of “coupling”, the following metrics can be used (the relation once again considers the size of the model):

Table 4: Verification of coupling
No. 8
  Verification of minimum coupling: Number (#) of “coupled pairs” (activities sharing the same data element) in relation to all activities (see [Va07]).
  Metric: # coupled pairs / (# all activities * (# all activities - 1))
No. 9
  Verification of minimum coupling: Cross-Connectivity-Metric according to [Va08]. The strength of the connections between activities is considered by assigning weightings to the paths of the model.
  Metric: see [Va08]

Strong cohesion condition: According to Weber [We97] “a set of outputs is maximally cohesive if all output variables affected by input variables are contained in the same set, and the addition of any other output to the set does not extend the set of inputs on which the existing outputs depend and there is no other output which depends on any of the input set defined by the existing output set”. Whereas coupling tends to enlarge the size of an activity, cohesion downsizes activities [We97]. The “strong cohesion condition” requires for each activity of the process model that all output of an activity depends upon its input (see [We97, VRA08]). In the literature, only few publications can be found on the “cohesion” of a business process model. Exceptions are Vanderfeesten et al. [VRA08] and Reijers and Vanderfeesten [RV04], who introduce metrics for measuring cohesion. A strong focus is placed on the “data elements” within an activity. These data elements are processed by operations. Operations can be understood as small parts of work within an activity [Re03]. Strong cohesion is given if operations within an activity overlap by sharing “data elements”, either as input or as output (activity relation cohesion according to [VRA08]). In addition, strong cohesion is also present when several of the data elements within an activity are used more than once (activity information cohesion according to [VRA08]). This definition comes very close to the definition in Wand and Weber’s decomposition model, because both define cohesion as mainly data-driven and focus on the processing of data elements within the activities. In summary, the cohesion of an activity is determined by the extent to which the operations of an activity “belong” to each other [VRA08, RV04]. Vanderfeesten et al. [VRA08] propose three metrics to determine the cohesion of an activity.
The final process cohesion is then calculated on the basis of the cohesion values of the activities.

Table 5: Verification of cohesion
No. 10
  Verification of strong cohesion: The activity relation cohesion determines to what extent the operations within one activity are related with one another [VRA08].
  Metric: see [VRA08]
No. 11
  Verification of strong cohesion: The activity information cohesion determines how many data elements are used more than once in relation to all the data elements [VRA08].
  Metric: see [VRA08]
No. 12
  Verification of strong cohesion: The activity cohesion is the product of the activity relation cohesion and the activity information cohesion [VRA08].
  Metric: see [VRA08]

Although the conditions introduced are named decomposition conditions, they do not facilitate the procedure of decomposition. They are applied to the results of the decomposition and enable the evaluation of a decomposed process model by means of the metrics introduced above. The metrics' values help to compare alternative models, although the interpretation of differences between the values remains an open issue. Is it worth reducing the value of coupled pairs in relation to all activities from 0.3 to 0.1, for example? Furthermore it has to be considered that the use of these metrics means an additional effort, since all metrics introduced can be calculated for all model levels of the designed alternatives. Since the decomposition of a process model into several, more detailed model levels always means adding semantics, user specifications have to be regarded as well. Therefore some metrics cannot be directly derived from the process models but have to draw on users' knowledge or specification documents. These metrics belong to the conditions “losslessness” and “minimality”.
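
The ratio metrics of this section that can be read directly from a process model (metrics 2, 4, 5 and 8) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the dictionary layout of splits, events and activities is hypothetical, and for metric 8 we assume that each coupled pair is counted in both directions, so that a fully coupled model yields the value 1.

```python
from itertools import permutations


def ratio(part, whole):
    """Return part/whole, or 0.0 for an empty denominator (assumed convention)."""
    return part / whole if whole else 0.0


def determinism_metrics(splits, events):
    """Metrics 2, 4 and 5.

    splits: list of dicts with 'type' ('AND', 'XOR' or 'OR') and 'conditional' (bool)
    events: list of dicts with 'external' (bool)
    """
    or_splits = sum(1 for s in splits if s["type"] == "OR")
    xor_unconditional = sum(
        1 for s in splits if s["type"] == "XOR" and not s["conditional"]
    )
    external = sum(1 for e in events if e["external"])
    return {
        "metric_2": ratio(or_splits, len(splits)),
        "metric_4": ratio(xor_unconditional, len(splits)),
        "metric_5": ratio(external, len(events)),
    }


def coupling(activity_data):
    """Metric 8: coupled pairs / (n * (n - 1)).

    activity_data maps an activity name to the set of data elements it uses.
    Assumption: ordered pairs are counted, i.e. (a, b) and (b, a) both count.
    """
    n = len(activity_data)
    coupled = sum(
        1
        for a, b in permutations(activity_data, 2)
        if activity_data[a] & activity_data[b]
    )
    return ratio(coupled, n * (n - 1))


# Invented sample data for illustration only.
splits = [
    {"type": "XOR", "conditional": True},
    {"type": "XOR", "conditional": False},
    {"type": "OR", "conditional": False},
    {"type": "AND", "conditional": False},
]
events = [{"external": True}, {"external": False},
          {"external": False}, {"external": False}]
activities = {
    "receive order": {"order"},
    "create invoice": {"order", "invoice"},
    "send invoice": {"invoice"},
    "archive receipt": {"receipt"},
}

print(determinism_metrics(splits, events))
print(coupling(activities))
```

Metrics 1, 6 and 7 are left out here on purpose, because they depend on end-user judgments or specification documents rather than on the model alone.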

5 Evaluation of the Business Process Modeling Languages

5.1 Requirements based on the decomposition conditions

In the following, we derive requirements on modeling languages by looking at our specification of the decomposition conditions (see section 4) and the corresponding metrics that reflect our interpretation. In doing so, each requirement (see table 6) can be directly associated with the related decomposition condition as well as certain facets of our interpretation.

To fulfill the minimality condition according to our interpretation from section 4, a process model should not include unwanted redundant and not required elements (see also metric 1). Not only the decision whether an element is unwanted redundant, but also whether it is required or not, is up to the user. In this regard, the context of the process is decisive. Whereas the modeling language offers modeling constructs to represent the process, the user specifies them taking into account the context of the process. The modeling language cannot prevent the user from misinterpreting the requirements resulting from the context of the process (see [Mi10]). Therefore requirements for this condition cannot be defined.

The determinism condition, as it has been specified in section 4, requires a predictable control-flow of the process, which implies that all internal events are well-defined. The aforementioned OR-splits often lead to designs in which events or subsequent states are hard to predict (see [ADK02]). This also becomes evident from metrics 2 and 3 we have introduced. In addition, if the conditions of the outgoing branches of an XOR-connector are not explicitly defined, the subsequent state is not determinable either (see [OMG10]). This aspect is dealt with in metric 4, according to which the number of XOR-splits that are not based on conditional expressions should be minimal.
A process modeling language should therefore enable the definition of conditions to specify the outgoing branches of an XOR-connector (requirement 1 – see table 6) and should not support an OR-connector (requirement 3 – see table 6) (see also metrics 2, 3 and 4). In contrast to poorly defined internal events, poorly defined external events are permitted in a good decomposition (see [We97, GR00]). This is due to the fact that it is often not possible to predict a priori a subsequent state that occurs as a result of an external event. The “determinism condition” only demands that external events be represented in a model. External events in a process model are counted by metric 5 resulting from our specification of the condition. Accordingly, the process modeling language should be able to display external events (requirement 2 – see table 6).

To fulfill the losslessness condition according to our interpretation (see section 4), hereditary and emergent elements of a process are to be preserved in the decomposition. Since a missing element (see metric 6) or a wrongly designed element (see metric 7) can only be identified on the basis of the knowledge of users or specification documents, the process modeling language is not able to support this condition in general. To detect syntactically wrongly designed elements, however, the process modeling language has to be specified by means of its metamodel (requirement 4 – see table 6).

In order to be able to define the minimum cardinality of the minimum coupling condition (according to section 4) and evaluate a process regarding metric 8, the process modeling language has to display inputs and the flow between data elements and activities (requirement 5 – see table 6). Earlier on (section 4), we made a suggestion to fulfill the minimum coupling condition which is not based on inputs, namely to investigate the strength of the connections between the activities by applying the Cross-Connectivity-Metric (see [Va08] and metric 9).
The strength of the connection between the activities is measured considering all nodes (activities and connectors) and arcs. Therefore the process modeling language has to display activities, connectors as well as the arcs between them (requirement 6 – see table 6).

Table 6: Requirements on business process modeling languages
Minimality
  No requirements can be defined (corresponding metric: metric 1)
Determinism
  The process modeling language has to provide modeling constructs for:
    conditions to specify outgoing arcs of an XOR-connector (requirement 1; corresponding metric: metric 4)
    external events (requirement 2; corresponding metric: metric 5)
  The process modeling language should not support an OR-connector (requirement 3; corresponding metrics: metrics 2, 3)
Losslessness
  The process modeling language is defined by its metamodel (requirement 4; corresponding metrics: metrics 6, 7)
Minimal coupling
  The process modeling language has to provide modeling constructs for:
    input data elements and the flow between data elements and activities (requirement 5; corresponding metric: metric 8)
    activities, connectors and arcs between them (requirement 6; corresponding metric: metric 9)
Strong cohesion
  The process modeling language has to provide modeling constructs for:
    input data elements (requirement 7)
    output data elements (requirement 8)
    intermediate data elements (requirement 9)
    the flow between the data elements (requirement 10)
  (corresponding metrics: metrics 10, 11, 12)

The strong cohesion condition (introduced in section 4) is related to the functionality a subsystem performs [WW89, We97] and requires for each activity of the process model that all output of an activity depends upon its input [We97]. To be able to measure cohesion with the suggested metrics (see metrics 10, 11 and 12) the process modeling language has to display all inputs and outputs for every activity (requirements 7 and 8 – see table 6).
The flow between input and output as well as any data elements that exist between them as intermediate results are to be regarded as well (requirements 9 and 10 – see table 6). A short overview of the identified requirements is given in table 6. It becomes obvious that ten requirements can be derived from our interpretation of the decomposition conditions. These requirements cover the range of modeling constructs needed to evaluate a process model regarding the decomposition conditions. The process of modeling a real-world situation is, however, subjective and thus not considered at this point.

5.2 Capabilities of business process modeling languages

Support of determinism: The determinism condition focuses on modeling constructs representing external events, OR-connectors, and conditional expressions related to XOR-operations (see sections 4 and 5.1). In eEPCs, the outgoing branches of an XOR-split are specified by the events to follow. While it is possible in modeling tools such as ARIS to add attributes to the arcs which specify conditional expressions, these are usually not modeled on a graphical level. In recent years, the eEPC notation was enhanced by modeling constructs for visualizing inter-organizational business processes (see [KKS04]). As a consequence, external events of cooperation partners, too, become evident. It is also possible to use start events for expressing external events (see [GR00]). The eEPC notation provides an OR-connector. BPMN supports the exclusive gateway. The decision which one of the outgoing arcs of the exclusive gateway is chosen depends on a condition that is visualized by labeling the outgoing arcs [OMG10]. BPMN offers a variety of event types and different triggers [OMG10] that can be used to visualize the occurrence of an external event in the process model [Re09]. BPMN supports OR-connectors as well. In UML ADs, the decision node (and a corresponding conditional expression) is used for XOR-operations [Ru06].
UML 2.0 introduces the “accept event” which can be used to express external events [Ru06]. Contrary to BPMN and eEPCs, OR-operations are not considered by UML ADs.

Support of losslessness: For all notations considered, official metamodels are available (see [Sch98, OMG10, OMG05]). But these metamodels are either too focused on specific aspects of the modeling language (e.g. for BPMN: metamodel for choreography activity, artifacts metamodel, external relationship metamodel) or mainly address technical aspects. Nevertheless, the literature provides metamodels which were derived from the available specifications and provide a more manageable means for a practitioner to design syntactically correct business process models. In that context, Rosemann [Ro96] presents a comprehensive metamodel for eEPCs which also takes into account connectors and views, while Korherr and List [KL07] design a metamodel for BPMN. Bordbar and Staikopoulos [BS04] develop a metamodel for UML ADs in particular. In summary, metamodels exist in the literature which are less complex than those presented in the official specifications, helping a practitioner to design syntactically correct models.

Support of minimal coupling: On the one hand, minimum coupling can be determined by the interconnections between functions/activities/actions based on common data elements. On the other hand, the structure of the process model provides information to calculate the coupling degree [Va08, VCR07]. The first option takes a data-oriented view while the second option focuses on the control-flow. All modeling languages considered offer modeling constructs to calculate the coupling degree according to our specification and to design models with “minimum coupling”. eEPCs support the data view (see [STA05]), while in BPMN data objects are used for presenting both information and data (see [OMG10]).
UML ADs have object nodes representing data elements that are transferred from one action to another [OMG05]. These can either be attached to an action symbol as a “pin” or to an object flow. In all modeling languages the connection between the functions/activities/actions is the control-flow.

Support of strong cohesion: The strong cohesion condition is based on a data-oriented view (see [VRA08, Re03]). As already stated, eEPCs support data elements, and a distinction between input and output data elements is possible. However, the flow between the data elements themselves is not visualized within an eEPC (see [STA05]). Additional diagrams would be necessary in that context (see [STA05]). In addition, possible intermediate data elements that are produced within a “basic” function while transforming an input data element to an output data element are not modeled. If the function were a “hierarchical function” with further model levels beneath it, additional data elements would be given. In BPMN, data objects can be differentiated as data input and data output on a graphical level, while the flow between the data objects is not explicitly modeled (see [OMG10]). For basic activities, no intermediate data elements are modeled that may be produced within the activity to create the final output data (see [OMG10]). In UML ADs, object nodes represent data elements, while the object flow or the “pin symbol” characterizes them as input or output data (see [Ru06]). While all modeling languages considered support input and output data, additional diagrams are necessary to highlight the flow between the data elements. Intermediate data elements within a function/activity/action in the sense of Vanderfeesten et al. [VRA08] are not explicitly modeled or supported. Therefore the degree of cohesion [VRA08] cannot be calculated by just looking at the process models.
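
To make concrete which supplementary information would be needed, the following Python sketch computes a cohesion value for a single activity from a hypothetical list of its operations and the data elements each operation reads or writes. The input layout is an assumption, and the formula used for the relation cohesion (share of ordered operation pairs that share at least one data element) is only one possible reading of the verbal description in section 4; the exact definitions are given in [VRA08].

```python
from itertools import permutations


def relation_cohesion(operations):
    """Assumed reading: share of ordered operation pairs sharing a data element."""
    n = len(operations)
    if n < 2:
        return 0.0
    overlapping = sum(
        1 for a, b in permutations(operations, 2) if operations[a] & operations[b]
    )
    return overlapping / (n * (n - 1))


def information_cohesion(operations):
    """Data elements used by more than one operation in relation to all data elements."""
    all_elements = set().union(*operations.values())
    if not all_elements:
        return 0.0
    reused = {
        d for d in all_elements
        if sum(d in used for used in operations.values()) > 1
    }
    return len(reused) / len(all_elements)


def activity_cohesion(operations):
    """Product of relation cohesion and information cohesion (see section 4)."""
    return relation_cohesion(operations) * information_cohesion(operations)


# Invented operations of one activity, including the intermediate data element "c",
# which none of the three notations would show inside a basic activity.
operations = {
    "op1": {"a", "b", "c"},
    "op2": {"c", "d"},
    "op3": {"e", "f"},
}

print(relation_cohesion(operations))
print(information_cohesion(operations))
print(activity_cohesion(operations))
```

The operation-level data in `operations` is exactly what is missing from eEPC, BPMN and UML AD process models, which is why it would have to come from supplementary diagrams or documents.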

Table 7: Results of the evaluation
Columns: Decomposition condition, Requirements, eEPC, BPMN, UML AD
Determinism
  Requirement 1 (conditions for arcs of an XOR-connector)
  Requirement 2 (constructs for external events)
  Requirement 3 (no support of OR-connector)
Losslessness
  Requirement 4 (definition of a metamodel)
Minimal coupling
  Requirement 5 (constructs for input data elements and flow between data elements and activities)
  Requirement 6 (activities, connectors and arcs between them)
Strong cohesion
  Requirement 7 (constructs for input data elements)
  Requirement 8 (constructs for output data elements)
  Requirement 9 (constructs for intermediate data elements)
  Requirement 10 (constructs for flow between data elements)
Key: fulfilled (symbol missing in this transcription); 0: partly fulfilled; x: not fulfilled
[Cell entries per language are not legible in this transcription.]

Table 7 summarizes the findings. None of the modeling languages entirely fulfills the requirements derived in section 5.1. Major differences between the languages can be seen in the requirements regarding the determinism condition. Some restrictions become obvious when evaluating the languages against the requirements derived from the losslessness and strong cohesion conditions.

6 Summary and Outlook

The use of Wand and Weber’s decomposition model for business process modeling is meant to facilitate the decomposition of the process model. This enables a better comprehensibility of the model for its users. Whereas this statement is the basis for our complete research project and will have to be empirically validated, we have only just started our investigation with this paper. The objective was to find out which of the three selected business process modeling languages (BPMN, eEPC, UML AD) is best able to support the decomposition conditions. A first result is that requirements on business process modeling languages could not be defined for all decomposition conditions.
The capabilities of the modeling languages do not vary for most of the requirements, which stresses the similarities of the languages. The main differences were found in the fulfillment of the requirements of the determinism condition. An important result to be incorporated into the research project is that the business process modeling languages can meet most of the requirements and that, for all deficiencies, supplementary models or an extension of the process modeling language can be provided. In that context, it is of special interest that none of the business process modeling languages is capable of modeling the data elements as required for the strong cohesion condition. Intermediate data elements as well as the flow between the data elements have to be documented in supplementary models, which will be verified in the empirical validation of the decomposition conditions. The results of the investigation underline the need for a better integration of data elements into business process modeling.

As a restriction to the above, it has to be stated that the requirements on the modeling languages were derived from the authors' interpretation of the decomposition conditions by Wand and Weber [WW89, WW90, We97]. The conditions were specified by metrics allowing an objective evaluation of different design alternatives. Nevertheless, there may be other interpretations of these conditions in the context of business process modeling. Just as process modeling itself is a subjective task, evaluation procedures in the field of process modeling may also be subject to subjectivity. This refers to section 5.2 in particular. In addition, the investigation is limited because only three process modeling languages were investigated. With the next steps of the research project, we aim to validate the decomposition model and derive a decomposition method that comprises principles and practical guidelines for business analysts.
References

[ADK02] van der Aalst, W.M.P.; Desel, J.; Kindler, E.: On the semantics of EPCs: A vicious circle. In: EPK 2002: Business Process Management using EPCs, 2002; p. 71-80.
[BCN92] Batini, C.; Ceri, S.; Navathe, S.B.: Conceptual Database Design - An entity relationship approach. Benjamin/Cummings Publishing, Redwood City et al., 1992.
[Be95] Becker, J.: Strukturanalogien in Informationsmodellen: Ihre Definition, ihr Nutzen und ihr Einfluß auf die Bildung von Grundsätzen ordnungsmäßiger Modellierung (GoM). Wirtschaftsinformatik 95. Physica, Heidelberg, 1995, p. 133-150.
[BM02] Burton-Jones, A.; Meso, P.: How Good Are These UML Diagrams? An Empirical Test of the Wand and Weber Good Decomposition Model. In: International Conference on Information Systems (ICIS), 2002; p. 101-114.
[BM06] Burton-Jones, A.; Meso, P.: Conceptualizing Systems for Understanding: An Empirical Test of Decomposition Principles in Object-Oriented Analysis. Information Systems Research 2006; 17:38-60.
[BM08] Burton-Jones, A.; Meso, P.N.: The Effects of Decomposition Quality and Multiple Forms of Information on Novices’ Understanding of a Domain from a Conceptual Model. Journal of the Association for Information Systems 2008; 9:748-802.
[BRB07] Bobrik, R.; Reichert, M.; Bauer, T.: View-Based Process Visualization. Lecture Notes in Computer Science 2007; 4714/2007:88-95.
[Br06] vom Brocke, J.: Design Principles for Reference Modelling - Reusing Information Models by Means of Aggregation, Specialisation, Instantiation, and Analogy. In (Fettke, P., Loos, P. eds.): Reference Modelling for Business Systems Analysis. Idea Group Publishing, Hershey, USA, 2006.
[BS04] Bordbar, B.; Staikopoulos, A.: On Behavioural Model Transformation in Web Services. Lecture Notes in Computer Science 2004; 3289/2004:667-678.
[BWW09] Becker, J.; Weiß, B.; Winkelmann, A.: A Business Process Modeling Language for the Banking Sector - A Design Science Approach. In: Fifteenth Americas Conference on Information Systems (AMCIS), 2009; p. 1-11.
[Ca05] Cardoso, J.: How to Measure the Control-flow Complexity of Web Processes and Workflows. In (Fischer, L. ed.): Workflow Handbook. Lighthouse Point 2005.
[FS06] Ferstl, O.K.; Sinz, E.J.: Modeling of Business Systems Using SOM. In (Bernus, P., Mertins, K., Schmidt, G. eds.): Handbook on Architectures of Information Systems. Springer, Berlin etc., 2006, p. 347-367.
[GL07] Gruhn, V.; Laue, R.: Approaches for Business Process Model Complexity Metrics. In (Abramowicz, W., Mayr, H.C. eds.): Technologies for Business Information Systems. Springer, Berlin, 2007; p. 13-24.
[GR00] Green, P.; Rosemann, M.: Integrated process modeling: An ontological evaluation. Information Systems 2000; 25:73-87.
[He09] Heinrich, B. et al.: The process map as an instrument to standardize processes: design and application at a financial service provider. ISeB 2009; 7:81-102.
[He04] Hevner et al.: Design Science in Information Systems Research. MISQ 2004; 28:75-105.
[KKS04] Klein, R.; Kupsch, F.; Scheer, A.-W.: Modellierung inter-organisationaler Prozesse mit Ereignisgesteuerten Prozessketten, 2004.
[KL07] Korherr, B.; List, B.: Extending the EPC and the BPMN with Business Process Goals and Performance Measures. In: 9th ICEIS, 2007.
[KLS95] Krogstie, J.; Lindland, O.I.; Sindre, G.: Towards a Deeper Understanding of Quality in Requirements Engineering. In: Proceedings of the 7th CAISE, 1995; p. 82-95.
[Ma99] Malone, T.W. et al.: Tools for Inventing Organizations: Toward a Handbook of Organizational Processes. Management Science 1999; 45:425-443.
[Me09] Mendling, J.: Metrics for Process Models - Empirical Foundations of Verification, Error Prediction, and Guidelines for Correctness. Springer, Berlin et al., 2009.
[Mi10] Mili, H. et al.: Business process modeling languages: Sorting through the alphabet soup. ACM Computing Surveys 2010; 43:1-54.
[Mo98] Moody, D.L.: Metrics for Evaluating the Quality of Entity Relationship Models. Lecture Notes in Computer Science 1998; 1507/1998:211-225.
[MR08] zur Muehlen, M.; Recker, J.: How Much Language Is Enough? Theoretical and Practical Use of the Business Process Modeling Notation. Lecture Notes in Computer Science 2008; 5074/2008:465-479.
[MRA10] Mendling, J.; Reijers, H.; van der Aalst, W.: Seven process modeling guidelines. Information and Software Technology 2010; 52:127-136.
[MRC07] Mendling, J.; Reijers, H.A.; Cardoso, J.: What Makes Process Models Understandable? Lecture Notes in Computer Science 2007; 4714/2007:48-63.
[OMG05] OMG Unified Modeling Language (OMG UML) - Superstructure, 2005.
[OMG10] OMG: Business Process Model and Notation (BPMN) - Version 2.0, 2010.
[OMG11] OMG Unified Modeling Language, Infrastructure - Version 2.4.1, 2011.
[Ös95] Österle, H.: Business in the information age. Springer, Berlin et al., 1995.
[Re03] Reijers, H.A.: A Cohesion Metric for the Definition of Activities in a Workflow Process. In: Eighth CAiSE/IFIP8.1 International Workshop on Evaluation of Modeling Methods in Systems Analysis and Design, 2003, p. 116-125.
[Re09] Recker, J. et al.: Business process modeling: a comparative analysis. Journal of the Association for Information Systems 2009; 10:333-363.
[RI07] Recker, J.; Indulska, M.: An Ontology-Based Evaluation of Process Modeling with Petri Nets. Interoperability in Business Information Systems 2007; 2:45-64.
[RM08] Reijers, H.; Mendling, J.: Modularity in Process Models: Review and Effects. Lecture Notes in Computer Science 2008; 5240:20-35.
[RMD10] Reijers, H.A.; Mendling, J.; Dijkman, R.: On the Usefulness of Subprocesses in Business Process Models. BPM Report, 2010.
[Ro96] Rosemann, M.: Komplexitätsmanagement in Prozeßmodellen. Gabler-Verlag, Wiesbaden, 1996.
[Ro09] Rosemann, M. et al.: Using ontology for the representational analysis of process modelling techniques. International Journal of Business Process Integration and Management Decision 2009; 4:251-265.
[RRK07] Recker, J.; Rosemann, M.; Krogstie, J.: Ontology- Versus Pattern-Based Evaluation of Process Modeling Languages: A Comparison. Communications of the Association for Information Systems 2007; 20:774-799.
[Ru06] Russell, N. et al.: On the suitability of UML 2.0 activity diagrams for business process modelling. In: 3rd Asia-Pacific conference on Conceptual modelling, 2006.
[RV04] Reijers, H.A.; Vanderfeesten, I.T.P.: Cohesion and Coupling Metrics for Workflow Process Design. Lecture Notes in Computer Science 2004; 3080:290-305.
[Sch98] Scheer, A.-W.: ARIS - Modellierungsmethoden - Metamodelle - Anwendungen. Springer, Berlin et al., 1998.
[SR98] Schütte, R.; Rotthowe, T.: The Guidelines of Modeling - An Approach to Enhance the Quality in Information Models. LNCS 1998; 1507/1998:240-254.
[STA05] Scheer, A.-W.; Thomas, O.; Adam, O.: Process Modeling Using Event-Driven Process Chains. In (Dumas, M., van der Aalst, W., Hofstede, A.T. eds.): Process-aware information systems. John Wiley and Sons 2005, p. 119-146.
[Va07] Vanderfeesten, I.T.P. et al.: Quality Metrics for Business Process Models. In (Fischer, L. ed.): BPM and Workflow Handbook 2007. Future Strategies, p. 179-190.
[Va08] Vanderfeesten, I. et al.: On a Quest for Good Process Models: The Cross-Connectivity Metric. Lecture Notes in Computer Science 2008; 5074:480-494.
[VCR07] Vanderfeesten, I.; Cardoso, J.; Reijers, H.A.: A weighted coupling metric for business process models. In: Proceedings of the CAiSE 2007, p. 41-44.
[VRA08] Vanderfeesten, I.; Reijers, H.A.; van der Aalst, W.M.P.: Evaluating workflow process designs using cohesion and coupling metrics. Computers in Industry 2008; 59:420-437.
[We97] Weber, R.: Ontological Foundations of Information Systems, Queensland, 1997.
[WW89] Wand, Y.; Weber, R.: A Model of Systems Decomposition. In: Tenth International Conference on Information Systems, 1989; p. 42-51.
[WW90] Wand, Y.; Weber, R.: Toward a theory of the deep structure of information systems. In: International Conference on Information Systems, 1990; p. 61-71.

Language-Based Matching of Domain Semantics in Heterogeneous Business Process Models

Janina Fengel, Kerstin Reinking
Fachbereich Wirtschaft, Hochschule Darmstadt, Haardtring 100, 64295 Darmstadt
janina.fengel@h-da.de, kerstin.reinking@h-da.de

Abstract: In enterprises, business process modeling produces collections of differing models over time. When these have to be brought together, semantic differences impede content-related matching, although such matching is a precondition for their integration, for example in analyses, corporate restructuring, mergers, or the introduction of standards. Besides semantic heterogeneity caused by the use of different modeling languages, a main obstacle to the automated matching of models lies in how natural language is used to label models and their elements, and in differing use of domain-specific terminology. This paper presents a method by which a combination of ontology matching techniques can provide heuristic support.

1 Background and Motivation

Business process modeling for describing and designing business operations has gained considerable importance in recent decades. In corporate practice, there is therefore frequently a need to match existing models, for instance in projects on architecture, data, and process integration, semantic consolidation projects, corporate mergers, and B2B integrations, as well as when introducing standards or standard software.
To merge business process models, the existing models have to be compared with regard to the meaning of their elements in order to identify correspondences, starting points, interfaces, or even overlaps and redundancies. Comparing and linking heterogeneous models is, however, a non-trivial task, since even models of the same type often differ semantically [BP08]. Semantic heterogeneity arises not only at the level of modeling languages, but typically also in the choice of the natural-language domain terms used to name model elements [TF07].

In particular, freely chosen domain terminology hinders the integration of models and thus of the underlying data and processes, all the more so when the models come from different sources, be it decentralized teams, different corporate divisions, or separate independent companies. Labels formulated in natural language reflect not only the terminology customary in an industry but also the established business language specific to each company. If there is no generally valid, bindingly defined vocabulary, or no rules for applying it, models can differ considerably in this respect. Matching is complicated not only by differing meanings of the labels used and differing understandings of them, but also by differently chosen terms or term combinations for labeling model elements. If there are naming conflicts due to synonymy or homonymy, models cannot be compared directly, and hence cannot be integrated, either manually or automatically [BRS96; TF06].
Large enterprises in particular already hold a multitude of business process models that were created over time by different people or, in decentralized fashion, by committees of several people, often even according to different guidelines, in different modeling languages, or using different domain terminologies. Even when the same subject matter is modeled, conceptual models created through division of labor can differ considerably in their labels, so that the comparability required for their use cannot be taken for granted [BD10]. This holds all the more when models from previously independent companies or business units meet. Before further work begins, the current semantic state therefore needs to be analyzed. Semantic ambiguity has to be resolved so that the statements made by models can be related and matched in terms of content: only matching the domain language allows models and model elements that correspond in content to be identified and, building on that, further structural comparisons to be made where needed [SM07].

So far, such analysis tasks can mostly be done only manually. The necessary matching and integration of conceptual models, such as the business process models considered here, are today purely intellectual work. When many large models are involved, these tasks can be accomplished without automated support only with great expenditure of resources. To close this gap and exploit computing power for automated processing, a corresponding IT-supported heuristic method is presented below. The approach focuses on the usage phase after models have been created, in particular on questions of joint usability.
To reduce the users' workload in meaning-related matching, Section 2 describes the application of Semantic Web technologies, in particular ontology processing, and of a combination of natural language processing techniques to the question of determining the semantic similarity of business process models. This is followed by the procedure for extracting and formalizing the semantic information contained in business process models and the ontologies required for it, and, in Section 3, by the prototype implemented accordingly. Building on this, Section 4 demonstrates the application of the method. The paper concludes in Section 5 with a presentation of and links to related work, and in Section 6 with brief closing remarks and an outlook on future work.

2 Semantic Analysis

Models generally represent agreed-upon domain knowledge. This is, on the one hand, knowledge about describing facts in representation or modeling languages and, on the other hand, the domain knowledge about the modeled facts, described by the organizational or business semantics. This knowledge can be extracted and represented by means of semantic analysis [Li00]. In this way, the relationships between the objects of both domains can be captured and represented. In principle, the representation and automated processing of knowledge can contribute to the further development of information processing. In everyday business, the ubiquity of the Internet as a global infrastructure has contributed to the high acceptance of web-based support for electronic business transactions.
The development of the Semantic Web idea and its specific technologies now additionally makes it possible to use web-based ontologies, in their capacity as explicit specifications, as a means of structuring knowledge and establishing semantic interoperability based on open standards. The principle of annotating information with metadata allows knowledge to be represented in a structured, machine-accessible form building on Internet technologies, readable by both machines and humans [SBH06]. Such semantic technologies are particularly suitable where intellectual labor is too costly and matching has to be performed repeatedly, especially for large and heterogeneous volumes of data and information [Fr10]. The aim of matching the domain language of business process models is to support the preparatory work for structural comparisons of models, which in turn are influenced by the modeling language used.

2.1 Ontology Creation and Ontology Matching

Ontologies are the core element of the Semantic Web. In the computer science sense they are artifacts and can be understood as conceptual schemas [AF05]. In principle, ontologies are collections of definitions of elements and their relationships and contain an agreed-upon vocabulary [DOS03]. They formalize the meaning of terms. Developing ontologies raises the same problem as creating business process models, namely semantic heterogeneity arising from the choice of modeling language and of the domain language used to label classes or concepts and relations. Unlike the models, however, ontologies can in turn be processed further automatically for matching. Research in ontology matching addresses questions of matchability and the resolution of semantic ambiguities [ES07].
Ontology matching techniques help clarify the meaning of the terms used and thus serve to determine the meaning of statements about facts or of their descriptions. The goal is to find semantic relations that can be expressed as ontology mappings. Applied to determining the similarity of the meanings of models and their elements, these can serve as semantic correspondences. This enables statements of the form “A from ontology X corresponds to B from ontology Y”, which can be described as functions

$$\mathit{SemCorr}(e_1) = \left(\{e_2 \in O_2 \mid e_2\},\; e_1 \in O_1\right) \in [0,1]$$

These semantic correspondences express equivalence or similarity. Element-based ontology matching techniques lend themselves to matching the business semantics contained in business process models. A comprehensive overview can be found in [ES07]. The correspondences can be persisted for further use. The linked ontologies can thus remain in place without having to be merged. This is particularly useful given that the underlying models cannot readily be changed because they are in active use. Persisted correspondences instead offer the possibility of virtual semantic integration.

2.2 Extracting and Formalizing the Semantics of Models

Existing business process models are non-ontological resources from which the meaning of the model's statements can be extracted and semantically formalized through reengineering. Reusing models in this way and converting them into ontologies allows them to be exploited further while they remain available, unchanged, for active use. Automated decomposition and transfer into ontologies give machines access to the knowledge the models contain.
The starting point for extracting the contained knowledge is the consideration that models contain facts from two areas of knowledge. Concepts from the domain language were drawn on to name models and their elements, while the concepts of the modeling language were used to describe them, in the sense of typing and arranging these concepts. Reversing this process, models can be decomposed in order to extract the concepts used from each language space and capture them as semantic models, as described in [FR10]. At this point, the available model information is extracted without manual or additional intellectual effort. The ontologies describing the metamodel are already available in OWL and can be used for the model decomposition procedure. Hence no preparatory work is needed for the actual matching. During decomposition, models are converted by means of XSLT into two ontologies in OWL DL: the model ontology, containing the labels of the model name and the model elements, and the model type ontology, containing the model type and the model element types. Together they describe the model with its name and model type as well as the model elements with their names and model element types. During conversion, all model names and model element labels are transferred “as is” without further processing. In this way, complete expressions can be carried over for further processing, since domain knowledge in modeling often only shows in the combination of words into frequently used formulations. Likewise preserved are any conventions that guided the assignment of element labels, the natural language used, and differing language use, as well as peculiarities of the domain.
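The split into a model ontology and a model type ontology can be illustrated with a minimal sketch. It does not reproduce the XSLT conversion to OWL DL described above; instead it assumes a hypothetical, simplified XML export (the element and attribute names `model`, `element`, `label`, `type` are invented for illustration) and represents each ontology as a plain list of triples.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified model export; real EPC/BPMN exchange formats differ.
MODEL_XML = """
<model name="Order processing" type="EPC">
  <element id="e1" type="Event" label="Order received"/>
  <element id="e2" type="Function" label="Check order"/>
</model>
"""

def decompose(xml_text):
    """Split a model into label triples and type triples (assumed structure)."""
    root = ET.fromstring(xml_text.strip())
    model_onto = [("model", "hasName", root.get("name"))]
    type_onto = [("model", "hasType", root.get("type"))]
    for el in root.iter("element"):
        # Labels are copied as-is, without any linguistic preprocessing.
        model_onto.append((el.get("id"), "hasName", el.get("label")))
        type_onto.append((el.get("id"), "hasType", el.get("type")))
    return model_onto, type_onto

model_onto, type_onto = decompose(MODEL_XML)
```

Keeping the two triple lists separate mirrors the idea that domain labels can later be matched independently of the modeling language that typed them.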
In business process models, events and activities are mostly labeled with expressions or phrases consisting of several terms, which rarely form a complete sentence. In semantic matching, each term therefore has to be considered both individually and as part of the given combination, because the phrases carry their intended meaning only as a whole. The most obvious difference found when analyzing the model collection to be matched with the method presented here was the distinction between models in German and in English. It turned out, however, that colloquial language was mostly not used and that expressions of emotion such as irony or euphemisms did not occur. Likewise, descriptive adjectives, adverbs, or modifying expressions were found only to a small extent. It also became apparent that different labels for the same concept arise not only from differing language use by modelers but also from the requirements and restrictions of the respective modeling language [BD10].

2.3 Semantic Matching of the Natural Language of Labels

To relate the resulting model ontologies, which contain the domain semantics of the converted models, to one another automatically, ontology matching techniques can be applied. Model matching that would otherwise have to be done manually can thus be supported automatically, and the model elements reflecting the domain semantics can be compared independently of the modeling language originally used. It turned out that, for the multi-term phrases commonly used to name elements in process models, as described above, name matching techniques such as string comparisons or string matching metrics alone yield poor results.
This applies in particular when synonyms are present and when identical or similar terms occupy different positions within the phrases to be compared. Instead, several requirements had to be met. As described, differing language use by modelers leads to different labels. It must therefore be assumed that the models to be compared contain synonyms, which string metrics alone could classify as non-matching. Synonyms therefore have to be resolved. It must likewise be assumed that labels in semantically similar models may occur in different languages. It is therefore necessary that multilingual models can be processed and that information-linguistic techniques are applied depending on the respective language. Since the labels in models are phrases that do not constitute grammatically complete sentences, let alone texts, some existing information-linguistic techniques are not directly applicable. For example, such phrases can hardly be subjected to a meaningful part-of-speech analysis. To handle labels in a manner appropriate to their nature, several techniques were combined; they are briefly presented individually below.

2.4 Information-Linguistic Techniques

In recent decades, various natural language processing or information-linguistic techniques have emerged that deal with processing natural language in or for information systems [HL09]. They are therefore suitable for ontology matching at the element level [ES07].

2.4.1 Compound Splitting

Terms in natural languages can vary in complexity, consisting either of a single term or of a combination of terms.
A single term usually consists of one word, a term combination of several conceptual components. In English these are frequently multi-word terms, in German, by contrast, compounds, i.e., the combination of at least two independently occurring words into one composite word [Be05]. For languages that allow compounding, such as German, it is sensible to split compounds and use their individual components for matching [St07]. In decomposition it is important to produce meaningful conceptual components so that all occurrences of a search word are found. Suitable dictionaries can help avoid meaningless splitting of multi-word terms or unwanted splitting of proper names [Be05].

2.4.2 Disambiguation by Resolving Synonymy

Synonyms are different designations for the same concept. They appear as different inflected forms, different spelling variants of a word, variants in different writing systems, abbreviations or full forms, and alternatively usable terms [We01]. Resolving synonyms ensures that semantic matches between terms are found even when they have been named differently, which improves the matching results [Be05]. The resolution, or word sense disambiguation, can be carried out with the help of a thesaurus serving as a synonym dictionary [St07]. A thesaurus links terms into conceptual units, with or without preferred labels, and relates them to other concepts. Such concept systems mostly capture relationships such as synonymy and ambiguity, hyponymy and hyperonymy, antonymy, and association [SS08].
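Dictionary-supported compound splitting and thesaurus-based synonym resolution can be sketched as follows. This is an illustrative toy under stated assumptions, not the procedure of any particular tool: the lexicon, the list of linking elements, and the thesaurus entries are invented, and the splitter only handles two-part compounds.

```python
# Toy resources (assumptions); a real system would load a dictionary and a thesaurus.
LEXICON = {"rechnung", "prüfung", "kunde", "auftrag", "eingang"}
LINKING = ("s", "n", "")  # common German linking elements, tried in this order
THESAURUS = {"invoice": "bill", "bill": "bill",
             "check": "verify", "verify": "verify"}  # term -> preferred label

def split_compound(word, lexicon=LEXICON):
    """Split a compound into two known parts; return the word itself otherwise."""
    w = word.lower()
    for i in range(3, len(w) - 2):
        head, tail = w[:i], w[i:]
        if tail not in lexicon:
            continue
        for link in LINKING:
            stem = head[: len(head) - len(link)]
            # Accept the split only if the head, minus a linking element, is a known word.
            if head.endswith(link) and stem in lexicon:
                return [stem, tail]
    return [w]

def same_concept(a, b, thesaurus=THESAURUS):
    """True if both terms map to the same preferred label (or are identical)."""
    a, b = a.lower(), b.lower()
    return thesaurus.get(a, a) == thesaurus.get(b, b)
```

Restricting splits to parts found in the lexicon is what keeps proper names and unknown words intact: a word with no valid decomposition is returned unchanged.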
For creating web-based thesauri, the W3C offers SKOS, the Simple Knowledge Organization System [MB09]. Using SKOS allows freely available resources to be reused, such as WordNet [Fe98] or the Standard-Thesaurus Wirtschaft (STW) [Zb10].

2.4.3 Handling of Stop Words

In information retrieval, words that are disregarded during indexing are called stop words. They mostly perform syntactic functions and thus have no relevance for inferring the content of a document. In German as in English, these are articles, conjunctions, prepositions, or pronouns, as well as negation [Be05]. Nevertheless, they are indispensable for understanding [Be05]. The set of stop words can vary by domain, since it may also include words that, although they carry meaning, should not be used because they occur in most documents and therefore do not help to differentiate content. Accordingly, for the question of business semantics in process models, it is advisable not to eliminate stop words across the board, as proposed in [Ko07], but domain-specifically. Depending on the kind of search, refraining from elimination allows better results for searches with word combinations [Be08]. Furthermore, in business processes the presence of a negation at decisions is frequently significant when searching for semantically similar elements. Especially with short phrases, in which a word that is a usual stop word in the respective language makes a considerable difference in meaning, stop word elimination can lead to wrong results, for example with negations [St07].

2.4.4 Stemming

For morphological analysis, information retrieval offers methods for reduction to the base form, or lemmatization, and for reduction to the word stem, or stemming [St07].
In lemmatization, the grammatical base form is determined by tracing the concrete word form back to a dictionary entry. In stemming, morphological variants of a word are reduced to their common stem by removing inflectional endings and derivational suffixes; this stem need not be a lexical term. When matching process models, similarities in meaning between activities, whether named by a nominalized verb or by a combination of verb and noun, and objects can thus be determined more precisely, since only the stems are compared. In addition, unwanted matches between suffixes can be ruled out, because suffixes are removed before matching.

2.4.5 Comparison of Strings

A sequence of characters from a defined character set is called a string. Strings are character sequences of arbitrary length over a defined alphabet [ES07]. String matching algorithms search for matches between strings. This task arises in a wide variety of domains and has led to different approaches over time [CRF03]. String metrics allow similarities between strings to be measured [SSK05]. The Levenshtein distance between two strings is the minimum number of insertions, deletions, or substitutions required to transform the first string into the second [Le66]. The Jaccard metric compares the similarity of words within an expression [Ja12]. The Jaro metric compares characters and their positions within the string, even if they are a few positions apart [Ja89]. N-grams can be used to fragment words or strings [St07].
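To make the differences between these metrics concrete, the following sketch implements three of them from their textbook definitions. It is written for illustration only and is not taken from a library; the function names are chosen here.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute, or match at no cost
        prev = cur
    return prev[-1]

def jaccard(p, q):
    """Overlap of the word sets of two phrases, in [0, 1]."""
    s, t = set(p.lower().split()), set(q.lower().split())
    return len(s & t) / len(s | t) if s | t else 1.0

def ngrams(s, n=3):
    """All contiguous character n-grams of s (trigrams by default)."""
    return [s[i:i + n] for i in range(len(s) - n + 1)]
```

For the phrases "check order" and "order check", the word-based Jaccard value is 1.0 although the Levenshtein distance is greater than zero, which illustrates why term position matters for character-based metrics but not for token-set metrics.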
The Q-grams algorithm, which builds on n-grams, counts the trigrams shared by the strings under comparison and is thus suited to approximate string matching [ST95]. Because the results of the different methods can differ considerably, a suitable metric has to be chosen depending on the language and the function of the terms [SSK05]. Although string metrics alone do not meet all needs in finding semantic similarities between identifiers, they have proven useful in this field [SSK05]. Where terms are not synonymous, string metrics can be used to determine semantic similarity from matching character strings. Stemming performed beforehand can increase the precision of the results, because reducing words to their stems means that, for example, matches between suffixes are not scored.

3 Implementation

To apply the methods described, a prototype system named LaSMat, short for Language-aware Semantic Matching, was implemented.

3.1 Technical Realisation

The components were implemented in Java. The system can be integrated as a Java API or accessed through a prototypical user interface. Figure 1 shows the procedure for matching the model ontologies as a sequence diagram. For each request, the first step compares the two phrases. This comparison is unidirectional. If the phrases match completely, the value 1 is returned as the confidence value and thus as the assumed strength of the semantic correspondence found. Otherwise, the phrases are split into individual terms, which are compared with one another. All methods presented above are applied, although compound splitting is currently performed for German only.
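As a rough illustration of the trigram-based comparison from Section 2.4.5, the sketch below counts the trigrams two strings share. The padding with `#`, the Dice-style normalisation and the class name `TrigramSimilarity` are assumptions made for this example; the sketch does not reproduce the SimMetrics implementation used by the prototype.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Illustrative sketch only; not the SimMetrics Q-grams implementation. */
final class TrigramSimilarity {
    private TrigramSimilarity() {}

    /** Counts all trigrams of s after padding both ends (padding is an assumption). */
    static Map<String, Integer> trigrams(String s) {
        String padded = "##" + s.toLowerCase(Locale.ROOT) + "##";
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + 3 <= padded.length(); i++) {
            counts.merge(padded.substring(i, i + 3), 1, Integer::sum);
        }
        return counts;
    }

    /** Share of common trigrams in [0,1], normalised Dice-style (an assumption). */
    static double similarity(String a, String b) {
        Map<String, Integer> ga = trigrams(a);
        Map<String, Integer> gb = trigrams(b);
        int shared = 0;
        int total = 0;
        for (Map.Entry<String, Integer> e : ga.entrySet()) {
            shared += Math.min(e.getValue(), gb.getOrDefault(e.getKey(), 0));
            total += e.getValue();
        }
        for (int count : gb.values()) {
            total += count;
        }
        // Padding guarantees at least two trigrams per string, so total > 0.
        return 2.0 * shared / total;
    }
}
```

Because whole trigrams are compared, small spelling variations such as a plural ending lower the score only slightly, which is what makes the approach usable for approximate matching.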
For all methods, the user can set parameters by assigning weights to the results of the individual methods. The weight for matches between terms identified as stop words is configurable. To resolve synonyms, thesauri in SKOS format can be imported at runtime. Included by default are WordNet [Fe98] in SKOS format [W310] as a general lexical resource for English and, as a business-specific resource, the STW, which contains terms in German and English [Zb10]. In addition, a SKOS version of OpenThesaurus created by us is used for general German [Na05].

Figure 1. Sequence diagram of the Language-aware Semantic Matcher

The parameter s ∈ [0,1] serves as the synonym measure and configures the weight of synonym matches in the result aggregation. Stemming uses the German and English libraries from the Snowball project [PB11]. For string matching, a selection of string metrics is available through the Java API SimMetrics [Ch06]. A corresponding value can be specified to weight this result in the overall score. To determine the overall confidence of the correspondences found, the best results from all methods are aggregated. The results are matching information for each phrase. They can be stored in the INRIA format [Eu06] as well as in an alignment ontology in a format we developed for this purpose. The prototypical user interface presents the results as a table, and a threshold for the strength of the correspondences found can be set for filtering.
3.2 Computing Semantic Similarity

Correspondences found are described as tuples of the form

\[ \langle (e_1, m_1), (e_2, m_2), c \rangle \]

where
- $(e_k, m_k)$ is the identifier of an element of a model ontology,
- $c$ is the confidence, i.e. the assumed strength of the relation, expressed as a numerical value between 0 and 1.

The algorithm developed determines a fuzzy value for the similarity between two identifiers, where 1 expresses equivalence and 0 means no match at all. We define the similarity between two identifiers as the arithmetic mean of all matches in relation to the number of terms in both identifiers:

\[ \mathit{Sim}(e_1, e_2) = \frac{\dfrac{\mathit{OverallTermSim}(e_1, e_2)}{\mathit{length}(e_1)} + \dfrac{\mathit{OverallTermSim}(e_1, e_2)}{\mathit{length}(e_2)}}{2} \]

where
- $\mathit{length}(e_k)$ is the number of terms in identifier $e_k$, expressed as $\mathit{length}(e_k) = \mathit{Num}(t_{e_k})$,
- $\mathit{OverallTermSim}(e_1, e_2)$ is the overall match between all terms of the two identifiers.

To compute the overall match, the highest similarity measure between the term currently compared and all terms of the second identifier is used:

\[ \mathit{OverallTermSim}(e_1, e_2) = \sum_{k=1}^{\mathit{length}(e_1)} \max\bigl(\mathit{Sim}(t_k^{e_1}, t_{1 \ldots n}^{e_2})\bigr) \]

where
- $\mathit{Sim}(t_k, t_n)$ is the similarity measure between two terms.

This similarity measure takes several values into account. In the case of an exact match, it yields

\[ \mathit{Sim}(t_k, t_n) = 1 \]

If the matching terms are stop words, however, the configured stop word measure is used instead of the value 1. In the case $k \neq n$, a distance measurement would conclude that there is no match, or that special treatment is needed because of the distance between the individual characters [Ja89].
It should be noted, however, that unlike in pure character sequences such as gene codes, the distance between two terms does not always change the meaning; semantic similarity may still be present. This can be seen in the two identifiers “check invoice” and “invoice check”, for which semantic similarity can be assumed. The different positions of the terms within the identifier do, however, suggest different parts of speech. The distance between the terms thus indicates a difference, but one that is smaller than for distances between identical characters in a string [PW97]. For $k \neq n$, our approach therefore continues as

\[ \frac{\mathit{Sim}(t_k, t_n)}{td} \]

where
- $td$ is introduced as the “term disorder weight” with a value ≥ 1.

This follows McLaughlin's approach to handling “disagreeing characters” in string comparisons as applied in [PW97], except that the actual distance between the two terms is disregarded for the reason given above. The value is configurable. A high value therefore reduces the similarity measure between two terms that occupy different positions in a phrase.

3.3 Interpretation of the Results

The matching results express the strength of a determined correspondence as a confidence value between 0 and 1. When domain experts analysed the results, however, it turned out that results in this form are not intuitively understandable. A fuzzification is therefore applied, and users are shown “exactMatch” for c = 1, “closeMatch” for 1 > c > 0.745, and “relatedMatch” for 0.745 > c > 0.495. This supports them in deciding on further matching or analysis work.
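The similarity computation of Section 3.2 and the labelling of Section 3.3 can be sketched as follows. This is a simplified illustration under stated assumptions, not the LaSMat code: the class name `PhraseSimilarity` is hypothetical, terms are compared by exact equality only (synonym resolution, stemming and string metrics are left out), the label `noMatch` and the treatment of the exact boundary values 0.745 and 0.495 are assumptions, since the text does not specify them.

```java
import java.util.Locale;
import java.util.Set;

/** Illustrative sketch of Sim, OverallTermSim and the fuzzy labels; names are hypothetical. */
final class PhraseSimilarity {
    private final Set<String> stopWords;
    private final double stopWordWeight;     // configured stop word measure
    private final double termDisorderWeight; // td, must be >= 1

    PhraseSimilarity(Set<String> stopWords, double stopWordWeight, double termDisorderWeight) {
        if (termDisorderWeight < 1.0) {
            throw new IllegalArgumentException("term disorder weight must be >= 1");
        }
        this.stopWords = stopWords;
        this.stopWordWeight = stopWordWeight;
        this.termDisorderWeight = termDisorderWeight;
    }

    /** Sim(t_k, t_n): exact matches only in this sketch; divided by td if positions differ. */
    double termSim(String tk, int k, String tn, int n) {
        if (!tk.equals(tn)) {
            return 0.0;
        }
        double base = stopWords.contains(tk) ? stopWordWeight : 1.0;
        return k == n ? base : base / termDisorderWeight;
    }

    /** OverallTermSim: for each term of e1, the best similarity to any term of e2. */
    double overallTermSim(String[] e1, String[] e2) {
        double sum = 0.0;
        for (int k = 0; k < e1.length; k++) {
            double best = 0.0;
            for (int n = 0; n < e2.length; n++) {
                best = Math.max(best, termSim(e1[k], k, e2[n], n));
            }
            sum += best;
        }
        return sum;
    }

    /** Sim(e1, e2): mean of the overall match relative to the length of each identifier. */
    double sim(String e1, String e2) {
        String p1 = e1.trim().toLowerCase(Locale.ROOT);
        String p2 = e2.trim().toLowerCase(Locale.ROOT);
        if (p1.equals(p2)) {
            return 1.0; // complete phrase match, as described in Section 3.1
        }
        String[] t1 = p1.split("\\s+");
        String[] t2 = p2.split("\\s+");
        double overall = overallTermSim(t1, t2);
        return (overall / t1.length + overall / t2.length) / 2.0;
    }

    /** Fuzzy label for a confidence value c; "noMatch" is an assumed placeholder. */
    static String label(double c) {
        if (c >= 1.0) {
            return "exactMatch";
        }
        if (c > 0.745) {
            return "closeMatch";
        }
        if (c > 0.495) {
            return "relatedMatch";
        }
        return "noMatch";
    }
}
```

With a term disorder weight of 2 and no stop words, `sim("check invoice", "invoice check")` evaluates to 0.5, which the thresholds map to `relatedMatch`; with a weight of 1 the term order is ignored and the same pair scores 1.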
4 Application

The prototype was used to demonstrate feasibility and usefulness on a collection of 1,380 business process models whose element identifiers are in German and English in equal parts. The collection comprises models from the SAP reference model, various models from the literature, and reference models taken from e-business standards.

4.1 Empirical Evaluation

Eight model pairs between which similarity was suspected were selected at random from this collection. Models of different types were chosen arbitrarily from EPC, BPMN and UML activity models. The configurable values were set as shown in the screenshot in Figure 2.

Figure 2. Screenshot of LaSMat

To assess the results of matching the model ontologies, which represent the business semantics, the correspondences found with a strength greater than 0.5 were compared with correspondences created manually by domain experts as a reference. The difference in time required was striking. While the manual work for the selected model pairs ranged from one to several hours, matching in the LaSMat system took between 290 ms and at most 3,100 ms per pair. Result quality was assessed using measures from information retrieval [St07]: precision (P), recall (R) and F-measure (F), each expressed as a value between 0 and 1. P describes correctness as the ratio of correctly found correspondences to all correspondences found. R describes completeness as the ratio of correctly found correspondences to all expected correspondences. For an overall assessment, F gives the weighted harmonic mean of these two values. Applying the method yielded a mean of 0.89 for P, 0.9 for R and 0.89 for F.
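The three quality measures can be sketched as below. The class name `MatchQuality` is hypothetical, correspondences are represented by arbitrary set elements, and the F-measure is shown in its balanced form with equal weights for P and R, which is an assumption because the weighting used in the evaluation is not stated.

```java
import java.util.HashSet;
import java.util.Set;

/** Illustrative sketch of precision, recall and balanced F-measure; names are hypothetical. */
final class MatchQuality {
    private MatchQuality() {}

    /** Number of found correspondences that also appear in the expert reference. */
    static <T> int correct(Set<T> found, Set<T> expected) {
        Set<T> hits = new HashSet<>(found);
        hits.retainAll(expected);
        return hits.size();
    }

    /** P: correctly found correspondences relative to all found correspondences. */
    static <T> double precision(Set<T> found, Set<T> expected) {
        return found.isEmpty() ? 0.0 : (double) correct(found, expected) / found.size();
    }

    /** R: correctly found correspondences relative to all expected correspondences. */
    static <T> double recall(Set<T> found, Set<T> expected) {
        return expected.isEmpty() ? 0.0 : (double) correct(found, expected) / expected.size();
    }

    /** F: harmonic mean of P and R with equal weights (assumed weighting). */
    static double fMeasure(double p, double r) {
        return p + r == 0.0 ? 0.0 : 2.0 * p * r / (p + r);
    }
}
```

For instance, if four correspondences are found, three of which are among five expected ones, precision is 0.75 and recall is 0.6.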
From the sample means it can be conjectured for the population, as an indication of the feasibility of the method, that at a 5% error probability precision lies between 0.8 and 0.98 and recall between 0.83 and 0.97, the maximum value being 1 in each case.

4.2 Detailed Examination of the Combination of Methods

To examine the effect of parametrising the different methods used, individual examples were analysed in detail. As expected, compound splitting improved the results; for example, the similarity between “Rechnungsprüfung” and “Rechnung prüfen” (roughly “invoice verification” and “verify invoice”) was returned as 0.54 without splitting and as 0.74 with splitting. Synonym matches can be weighted differently. This appears sensible in cases where quasi-synonyms cause shifts in meaning. While matching without synonym resolution finds no match between designations with the same meaning, synonym resolution finds these matches. A value of 0 results in matching without synonym resolution, whereas all values greater than 0 weight the result. An exact match found between stop words significantly influences the overall result in phrase matching, because phrases contain few terms compared to full texts. Our approach of weighting stop word matches with 0.0, so that they do not enter into the overall similarity score, yields results similar to stop word elimination but still takes account of the cases in which a stop word makes a difference in meaning. Stemming supported matching, although for German, which is highly inflected, the results improved less than for English.
For string comparison, Q-grams was used in the evaluation together with a term disorder weight of 3, following McLaughlin's approach as described above. Taking the position of a term within the phrase into account, this yielded higher hit rates.

5 Related Work

Because modelling is so important for describing and shaping business operations, model matching and model integration are becoming increasingly decisive for process and IT optimisation and thus ultimately for the competitiveness of enterprises. Despite this importance, however, no methods and tools suitable for enterprise use are available. Some work in the literature on model integration concentrates on modelling languages and on the possibilities of migration or integration based on transferring models from one modelling language into another [Ge07; MK07]. The aspect of heterogeneously used technical language is not considered there; the model element labels continue to be used unchanged. Although the use of ontologies is seen in the long term as a way to establish a uniform, shared, always up-to-date and collaboratively developed digital model of the entire enterprise [Fr10], there are so far no proposals for applying them to model matching after model creation or to integrations or consolidations. Existing proposals for integrating process models mostly concentrate on the phase in which models are first created. They presuppose a separately created domain model for labelling model elements or for matching them [BEK06; We07]. In contrast, our method requires no additional preparatory work of this kind.
Other approaches require manual annotation work to mark up process model elements so that semantic processing becomes possible [HLD07; TF09; BD10]. Currently there are no approaches that offer semantic matching of existing models while taking into account the modelling terminology as well as the technical terminology used and different natural languages. Here our approach can serve as a complement.

6 Conclusion

This contribution presented a method for the semantic matching of existing business process models with the help of Semantic Web technologies, in particular ontology matching methods. It makes the business semantics in models machine-accessible and, through a language-related selection, combination and parametrisable result aggregation of several language processing methods, automatically matchable. The results obtained can provide starting points for further structural comparisons and for processing steps based on them, such as consolidations or model changes. For this purpose, the system presented here was implemented as a prototype and used for the proof of feasibility of the method developed. It could be shown that the chosen combination of individual methods can offer users automated support. Because the system provides for the parametrisation of weights, further evaluation of their efficiency is planned in order to determine combinations suited to specific domains. Moreover, ontology matching does not (yet) deliver perfect results. Further research is needed in particular for cases in which phrases contain numerical, cryptic or mixed-language terms. In the long term, further research on the emerging need for block matching to detect taxonomic and mereological relationships could be beneficial.
Overall, we hope that our proposal has shown the usefulness of applying Semantic Web technologies to support the matching of business process models.

References

[AF05] Antoniou, G.; Franconi, E.; van Harmelen, F.: Introduction to Semantic Web Ontology Languages. In: Reasoning Web. 1st Int. Summer School 2005, Malta, Springer, Berlin Heidelberg, 2005; S. 1–21.
[BD10] Becker, J. et al.: Ein automatisiertes Verfahren zur Sicherstellung der konventionsgerechten Bezeichnung von Modellelementen im Rahmen der konzeptionellen Modellierung. In: Modellierung 2010, LNI 161, 2010; S. 49–65.
[Be05] Bertram, J.: Einführung in die inhaltliche Erschliessung. Ergon, Würzburg, 2005.
[Be08] Beus, J.: Google changes the treatment of stopwords. http://www.sistrix.com/news/713-google-veraendert-behandlung-von-stopworten.html, 30.10.2011.
[BEK06] Brockmans, S. et al.: Semantic Alignment of Business Processes. In: Proc. of the 8th Intern. Conf. on Enterprise Information Systems (ICEIS 2006). INSTICC, Setúbal, 2006; S. 197–203.
[BRS96] Becker, J.; Rosemann, M.; Schütte, R.: Prozeßintegration zwischen Industrie- und Handelsunternehmen - eine inhaltlich-funktionale und methodische Analyse. In Wirtschaftsinformatik 39, 1996; S. 309–316.
[BP08] Becker, J.; Pfeiffer, D.: Solving the Conflicts of Distributed Process Modelling – Towards an Integrated Approach. In: 16th Europ. Conf. on Information Systems (ECIS 2008), 2008; S. 1555–1568.
[Ch06] Chapman, S.: SimMetrics: Open source library of Similarity Metrics. http://sourceforge.net/projects/simmetrics/, 17.10.2011.
[CRF03] Cohen, W.; Ravikumar, P.; Fienberg, S.: A Comparison of String Distance Metrics for Name-Matching Tasks. In: Proc. of IJCAI-03 Workshop on Information Integration on the Web (IIWeb-03), 2003; S. 73–78.
[DOS03] Daconta, M. C.; Obrst, L. J.; Smith, K. T.: The Semantic Web. Wiley, 2003.
[ES07] Euzenat, J.; Shvaiko, P.: Ontology Matching. Springer, Berlin, 2007.
[Eu06] Euzenat, J.: An API for ontology alignment. https://gforge.inria.fr/docman/view.php/117/251/align.pdf, 17.10.2011.
[Fe98] Fellbaum, C. Hrsg.: WordNet: An Electronic Lexical Database. MIT Press, Cambridge, 1998.
[FR10] Fengel, J.; Rebstock, M.: Domänensemantik-orientierte Integration heterogener konzeptueller Modelle. In: Modellierung betrieblicher Informationssysteme. Modellgestütztes Management (MobIS 2010); LNI P-171, 2010; S. 63–78.
[Fr10] Frank, U.: Interview mit Rudi Studer zum Thema „Semantische Technologien“. In Wirtschaftsinformatik 52, 2010; S. 49–52.
[Ge07] Gehlert, A.: Migration fachkonzeptueller Modelle. Logos-Verl., Berlin, 2007.
[HL09] Harms, I.; Luckhardt, H.-D.: Virtuelles Handbuch Informationswissenschaft. http://is.uni-sb.de/studium/handbuch/, 30.10.2011.
[HLD05] Hepp, M. et al.: Semantic Business Process Management: A Vision Towards Using Semantic Web Services for Business Process Management. In: Proc. of the IEEE Intern. Conf. on e-Business Engineering. ICEBE 2005, IEEE, 2005; S. 535–540.
[HP11] HP Hewlett Packard: Jena - A Semantic Web Framework for Java. http://jena.sourceforge.net/, 30.10.2011.
[Ja12] Jaccard, P.: The Distribution of the Flora in the Alpine Zone. In The New Phytologist, 1912, 11; S. 37–50.
[Ja89] Jaro, M. A.: Advances in Record-Linkage Methodology as Applied to Matching the 1985 Census of Tampa. Journal of the American Statistical Association, 1989; S. 414–420.
[Ko07] Koschmider, A.: Ähnlichkeitsbasierte Modellierungsunterstützung für Geschäftsprozesse. Universitätsverl., Karlsruhe, 2007.
[Le66] Levenshtein, V.: Binary Codes Capable of Correcting Deletions, Insertions, and Reversals. In Cybernetics and Control Theory, 1966, 10; S. 707–710.
[Li00] Liu, K.: Semiotics in information systems development. Cambridge Univ. Press, Cambridge, New York, 2000.
[MB09] Miles, A.; Bechhofer, S.: SKOS Simple Knowledge Organization System Reference. http://www.w3.org/TR/2009/REC-skos-reference-20090818/, 20.09.2011.
[MK07] Murzek, M.; Kramler, G.: The Model Morphing Approach – Horizontal Transformations between Business Process Models. In: Proc. of the 6th Intern. Conf. on Perspectives in Business Information Research - BIR'2007, Tampere, Finland, 2007; S. 88–103.
[Na05] Naber, D.: OpenThesaurus: ein offenes deutsches Wortnetz. http://www.danielnaber.de/publications/gldv-openthesaurus.pdf, 12.10.2010.
[PB11] Porter, M.; Boulton, R.: Snowball. http://snowball.tartarus.org/index.php, 31.10.2011.
[PW97] Porter, E. H.; Winkler, W. E.: Approximate String Comparison and its Effect on an Advanced Record Linkage System. http://www.census.gov/srd/papers/pdf/rr97-2.pdf, 10.08.2011.
[SBH06] Shadbolt, N.; Berners-Lee, T.; Hall, W.: Semantic Web Revisited. In IEEE Intelligent Systems, 2006, 21; S. 96–101.
[SM07] Simon, C.; Mendling, J.: Integration of Conceptual Process Models by the Example of Event-driven Process Chains. In: 8. Intern. Wirtschaftsinformatik (WI 2007) Univ.-Verl. Karlsruhe, Karlsruhe, 2007; S. 677–694.
[SS08] Stock, W. G.; Stock, M.: Wissensrepräsentation. Oldenbourg, München, 2008.
[St07] Stock, W. G.: Information Retrieval. Oldenbourg, München, 2007.
[ST95] Sutinen, E.; Tarhio, J.: On Using q-Gram Locations in Approximate String Matching. In: Proc. of the 3rd Ann. Europ. Symposium on Algorithms ESA '95. Springer, Berlin, 1995; S. 327–340.
[SSK05] Stoilos, G.; Stamou, G.; Kollias, S.: A String Metric for Ontology Alignment. In: ISWC 2005. Springer-Verlag, Berlin Heidelberg, 2005; S. 624–637.
[TF06] Thomas, O.; Fellmann, M.: Semantische Integration von Ontologien und Ereignisgesteuerten Prozessketten. In: Proc. EPK 2006 Geschäftsprozessmanagement mit Ereignisgesteuerten Prozessketten. CEUR-WS.org, Vol. 224, 2006; S. 7–23.
[TF07] Thomas, O.; Fellmann, M.: Semantic Business Process Management: Ontology-Based Process Modeling Using Event-Driven Process Chains. In IBIS 2, 2007; S. 29–44.
[TF09] Thomas, O.; Fellmann, M.: Semantische Prozessmodellierung – Konzeption und informationstechnische Unterstützung einer ontologiebasierten Repräsentation von Geschäftsprozessen. In Wirtschaftsinformatik 51, 2009; S. 506–518.
[W310] Links to SKOS Data. http://www.w3.org/wiki/SkosDev/DataZone, 31.10.2011.
[We01] Weiss, M.: Automatische Indexierung mit besonderer Berücksichtigung deutschsprachiger Texte. http://www.ai.wu.ac.at/~koch/courses/wuw/archive/inf-semws-00/weiss/index.html, 30.10.2011.
[We07] Weske, M.: Business Process Management. Concepts, Languages, Architectures. Springer, Berlin Heidelberg, 2007.
[Zb10] ZBW Leibniz-Informationszentrum Wirtschaft: STW Standard-Thesaurus Wirtschaft. http://zbw.eu/stw/versions/latest/download/about.de.html, 30.10.2011.

Towards a Tool-Oriented Taxonomy of View-Based Modelling

Thomas Goldschmidt1, Steffen Becker2, Erik Burger3

1 ABB Corporate Research Germany, Industrial Software Systems Program, thomas.goldschmidt@de.abb.com
2 University of Paderborn, steffen.becker@uni-paderborn.de
3 Karlsruhe Institute of Technology (KIT), burger@kit.edu

Abstract: The separation of view and model is one of the key concepts of Model-Driven Engineering (MDE). Having different views on a central model helps modellers to focus on specific aspects. Approaches for the creation of Domain-Specific Modelling Languages (DSML) allow language engineers to define languages tailored for specific problems. To be able to build DSMLs that also benefit from view-based modelling, a common understanding of the properties of both paradigms is required. However, research has so far given little consideration to the combination of both paradigms, namely view-based domain-specific modelling.
In particular, a comprehensive analysis of a view's properties (e.g., partial, overlapping, editable, persistent) has not been conducted. Thus, it is also still unclear to what extent view-based modelling is understood by current DSML approaches and what a common understanding of this paradigm is. In this paper, we explore view-based modelling in a tool-oriented way. Furthermore, we analyse the properties of the view-based domain-specific modelling concept and provide a feature-based classification of these properties.

1 Introduction

Building views on models is one of the key concepts of conceptual modelling [RW05]. Different views present abstract concepts behind a model in a way that they can be understood and manipulated by different stakeholders. For example, in a component-based modelling environment, stakeholders such as the system architect or the deployer work on the same model, but the system architect works on the connections and interactions between components, whereas the deployer focuses on a view showing the deployment of the components to different nodes [Szy02]. This holds not only for different types of models, e.g., the different abstraction levels defined by the Model Driven Architecture (MDA) [MCF03], but also for different views on the same models. Specialised views on a common underlying model foster the understanding and productivity [FKN+92] of model engineers. Recent work [ASB09] has even promoted view-based aspects of modelling as a core paradigm.

Frameworks for the creation of Domain Specific Modelling Languages (DSMLs) allow tailored modelling languages to be created efficiently. Different types of DSML creation approaches have emerged in recent years [KT08, CJKW07, MPS, Ecl11b, Ecl11a, KV10]. Many of these approaches implicitly allow for, or explicitly claim to support, the definition of views on models.
However, a comprehensive analysis of view-based aspects in DSML approaches has not yet been performed. Furthermore, the literature gives no clear definition of these concepts. Work on architectural, view-based modelling is mostly concerned with its conceptual aspects (e.g., [RW05, Cle03, Szy02, ISO11]). These definitions, however, are at the architecture level and do not deal with specific properties of views within modelling tools, such as their scope definition, representation, persistency or editability. In order to agree on requirements for view-based modelling and to be able to decide which view-based DSML approach to use, language engineers and modellers require a clear and common understanding of such properties.

Researchers have partially used these properties explicitly or implicitly in existing work (e.g., [GHZL06, KV10]). However, as these concepts and properties are scattered across publications and are often considered only implicitly, this paper aims at organising them. In order to get an overview of the view-based capabilities in existing DSML frameworks, we analysed a selection of them. We included graphical DSML frameworks ([KT08, CJKW07, Ecl11a]) as well as textual DSML frameworks ([MPS, Ecl11b, KV10]) to identify a common understanding of view-based domain-specific modelling.

The contribution of this paper is twofold: First, we provide a common definition of view-based modelling from a tool-oriented point of view. Second, we identify the different properties and features of view-based modelling in DSML approaches.

The work presented in this paper is beneficial for different types of audience. Modellers can use the properties to make their requirements on view-based modelling more explicit. Tool builders can use the presented properties to classify and validate their approaches or as guidelines for the development of new view features. Researchers can benefit from the common definition of views on which further research can be based.
The remainder of this paper is structured as follows. Section 2 presents an overview as well as a differentiation of different notions of the term “view” that are used as categories for the classification scheme. Properties of view-types, views and specific editor capabilities are given in Section 3. Related work is analysed in Section 4. Section 5 concludes and outlines future work.

2 Determination of View-Points, View-Types, and Views

In this paper we try to clarify the understanding of the common terminology in view-based modelling. As our understanding of views, view points, view types and modelling originates from a tooling perspective, our understanding of the terms varies slightly from existing definitions like the ISO 42010:2011 standard [ISO11]. Therefore, in this section we present an example of a view-based modelling approach from our own industrial experience and use it to illustrate our understanding of the terms. We find the same understanding realised in many of the tools we have classified and surveyed in order to come up with this taxonomy. Finally, we discuss our terminology in the context of existing definitions like the ISO standard.

2.1 Tool basis

In order to come up with a generic taxonomy for tools in the view-based modelling area, we analysed several DSML tools. The selection process for the tools that we analysed was based on the following criteria:

1. The search was based on electronic databases, e.g., ACM Digital Library, IEEE Xplore, SpringerLink, as well as references given by DSML experts. The search strategy was based on the keywords “domain-specific”, “modelling”, “language”, “view-based”, “view-oriented”, “views” and “framework” in various combinations. The search was conducted in several steps within December 2010 and January 2011.
2. Domain-specific modelling includes approaches stemming from several different research areas.
Therefore, we included DSML approaches coming from different areas, e.g., meta-case tools, compiler-based language engineering as well as general model-driven engineering.
3. To be recognised as a view-based DSML framework, an approach has to make it possible to define new or use existing metamodels and to create multiple concrete syntaxes for them.
4. Approaches for which a tool or framework was available were included. This ensured that approaches having only a theoretical or very prototypical character were excluded.
5. Approaches whose tool is no longer maintained or whose project was considered dead were excluded.
6. Finally, we excluded tools for which we found no indications of industrial relevance such as experience reports or real-world evaluations. Thus, only tools proven to be mature enough for industrial application were included.

The selection process was very strict, as our goal was to evaluate only those tools which had a chance of being employed in industrial projects. Of course, this may threaten the general applicability of our taxonomy, but on the other hand it ensures that the taxonomy is applicable in industry. The tools we finally analysed were the following: Eclipse GMF [Ecl11a], MetaEdit+ [KT08], Microsoft DSL tools [CJKW07], Jetbrains MPS [MPS] and Eclipse Xtext [Ecl11b]. Due to space restrictions, we cannot include the whole survey here; however, preliminary results are available online (see footnote 1).

2.2 Terminology

Figure 1 presents our example language and its views from the business information domain. The example language serves as a DSL to model business entities, their relations, their interactions and their persistence behaviour.
1 http://sdqweb.ipd.kit.edu/burger/mod2012/

[Figure 1: Example language and its viewpoints and viewtypes used by some example views. The figure shows the viewpoints Static and Dynamic, the viewtypes BusinessObject Structure, ValueTypes Overview, Persistency, Interaction and BlockImplementation, one example view per viewtype, and the underlying model (abstract syntax) that the views represent.]

[Figure 2: Terminology for view-based modelling used in this paper. The class diagram relates System, Stakeholder, Concern, ViewPoint, ViewType, View, Model and Metamodel.]

Our language therefore consists of two viewpoints: a static viewpoint to model the static structure of the business entities and a dynamic viewpoint to model their dynamics.
The static viewpoint consists of three view types: a structure view type that defines how to present the business objects and their relations in a notation similar to class diagrams, a value view type that defines how to show the attributes of the business entities, and a persistency view type that defines how to link the business entities with the database and defines default values. For each of the view types, Figure 1 also contains an illustrative example view showing a simple model. The figure shows how view types relate to classes (BusinessObject) and views to instances of these classes (Customer, Address). The language’s dynamic viewpoint has an interaction view type that defines how to represent business object interrelations and a block implementation view type that defines how the behaviour of single business entities is specified. Again, Figure 1 illustrates each view type with an example.

Using this example, we introduce our terminology, illustrated in Figure 2. The class diagram shows the terms used in view-based modelling and represents our tool-centric understanding of views, view types, and viewpoints.

Our conceptualisation starts with the System (or the “real-world object”), which is being studied by its Stakeholders w.r.t. their specific system Concerns. In our example, we may think of database designers of the system under study who want to analyse their database table structure. Therefore, they are interested in the static Viewpoint. A viewpoint represents a conceptual perspective that is used to address a certain concern. A viewpoint includes the concern as well as a defined methodology on how the concern is treated, e.g., instructions on how to create a model from the particular viewpoint. In order to analyse the system, the stakeholders create a single, consistent Model of the system under study. The model has to be an instance of its Metamodel. In order to show parts of this model, we need a set of concrete syntaxes.
These concrete syntaxes are defined by ViewTypes. A view type defines the set of metaclasses whose instances a view can display. This description uses the metamodel elements and provides rules that comprise a definition of a concrete syntax and its mapping to the abstract syntax. It defines how elements from the concrete syntax are mapped to the metamodel and vice versa. For example, the business object structure view type shows classes and their relations but not the instructions on how to persist the entity to the database.

A View is the actual set of objects and their relations displayed using a certain representation and layout. A view represents the application of a view type to the system’s models. A view can therefore be considered an instance of a view type. For example, the structure view in Figure 1 shows the business entities “Customer” and “Address”. They may be a selection of all possible classes, e.g., “CreditCardData” is not shown in this particular view but may be shown in a different view of the same view type. The elements “Customer” and “Address” can also appear in other views.

The separation between the definition of the view type and its instances is also a topic of the recently started OMG initiative called Diagram Definition [Obj10]. Within this new standard, which is currently under development, the OMG distinguishes between Diagram Interchange (DI) and Diagram Graphics (DG). The former relates to the information a modeller has control over, such as the position of nodes, whereas the latter defines what the shapes of the graphical language look like. The mapping between a DG and a metamodel can then be defined by a mapping language such as QVT. If we consider a DI instance a view, and a DG definition including the mapping a view type, the OMG’s definition is in line with our own experience and with what we contribute in this paper.
The ISO 42010:2011 standard [ISO11] gives the following definitions: A view “addresses one or more of the concerns of the system’s stakeholders” and “expresses the architecture of the system-of-interest in accordance with an architecture viewpoint”, where a viewpoint “establishes the conventions for constructing, interpreting and analyzing the view to address concerns framed by that viewpoint.” In contrast to the ISO standard, we follow the idea of having a single model of the system under study; views just visualise and update this central model. For different kinds of data, we favour the use of different view types, in analogy to the model type in the ISO standard. Furthermore, we do not focus on architecture descriptions. As a consequence, our concept explicitly contains the model and its metamodel, which also allows us to associate the view types with the metamodel. As a minor difference, we favour a viewpoint that addresses just a single concern, in order to have a clear relation.

Figure 3: Example metamodel. (Class diagram of the BusinessObjects metamodel with the classes NamedElement, BusinessObject, MethodSignature, Block, Statement, ExpressionStmt, TypeDefinition, TypedElement, Association, AssociationEnd and Multiplicity. BusinessObject carries the invariant inv: self.signatures->forAll(s | self.elementsOfType.typedElement.name <> s.name).)

3 Classification of View Type, View and Editor Properties

We explicitly distinguish between the definition level (view type) and the instance level (view) to account for the different roles involved in domain-specific modelling. Modellers use views on the instance level, whereas language engineers work on the view type level.
To separate the properties of these levels, we first present view type properties in Section 3.1 and then view properties in Section 3.2. Finally, as DSML frameworks mostly come with their own editor framework, which also has a large impact on how modellers can work with their views, we present a classification scheme for editor capabilities that have an impact on view building in Section 3.3.

We use the properties presented here for two different purposes. First, they support communication and reasoning about specific view types and views. Having these explicit properties eases communication and helps to avoid errors in the definition as well as the application of a view-based modelling approach. Second, applied to a given view-based modelling approach, the fulfilment of a property indicates that the approach is capable of defining view types or instantiating views that feature this specific property.

3.1 View Type

A view type defines rules according to which views of the respective type are created. These rules can be considered a combination of projectional and selectional predicates as well as additional formatting rules that determine the representation of the objects within the view.

Projectional predicates define which parts of a view type’s referenced metamodel and/or elements of that metamodel a view actually shows. For example, the “BusinessObject Structure” view type defined in our example is a projection that shows elements of type BusinessObject, Association, etc. but not Block or Statement elements. Additionally, projectional predicates may also refer to specific attributes defined on the metamodel level. In other words, projectional predicates define which types of elements (classes, associations, attributes, etc.) a view type includes.
Figure 4: Properties of view types. (Feature diagram listing the features Complete (C) and Partial (P) scope, stated per view type or per predicate, Projectional, Selectional, Depth-first, Breadth-first/Local, Downwards Containment, Upwards Containment, Extending, Overlapping (Inter View Type, Intra View Type) and Representation (Textual, Graphical, Tabular, Other).)

Selectional predicates define filter criteria based on attributes and relations of a view’s elements. For example, the “ValueTypes Overview” view type of our example only includes elements of type BusinessObject that have set their valueType attribute to true, thus showing only value types. In other words, selectional predicates define, on the instance level, which conditions elements have to fulfil in order to be relevant for a specific view type.

Finally, a view type contains rules defining how a view represents the projected and selected elements. Given the “BusinessObject Structure” view type of our example, one of these rules describes that the view type displays BusinessObjects as rectangular boxes with the value of their name property as label.

The distinction between these three rule types is made on a conceptual level; the implementation of a view type may also combine them into a single joint rule. The three types of rules define the range of properties a view type may fulfil. Examples are projectional or selectional completeness, or whether a view type defines a textual or graphical representation of the underlying model. Note that, depending on whether a view should be editable or read-only, these rules have to be considered in a bidirectional way: (I) one direction specifies how a view is created for an underlying model, and (II) the other defines how a model is created and/or updated based on changes that are performed in a view.
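To make the interplay of the three rule types more tangible, the following minimal Python sketch models a view type as a projection (a set of metaclass names), a selection (an instance-level predicate) and a representation (a formatting function). It only covers direction (I), creating a view from a model. The dictionary encoding of model elements, the names ViewType and create_view, and the valueType values assigned to Customer and Address are assumptions made for illustration; they are not taken from any of the surveyed tools.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List

# Hypothetical element encoding: {"type": <metaclass name>, <attribute>: <value>, ...}
Element = Dict[str, object]


@dataclass
class ViewType:
    projection: FrozenSet[str]                 # metaclasses the view type shows
    selection: Callable[[Element], bool]       # instance-level filter criterion
    representation: Callable[[Element], str]   # formatting rule for one element

    def create_view(self, model: List[Element]) -> List[str]:
        """Direction (I): derive a view from the underlying model."""
        return [
            self.representation(e)
            for e in model
            if e["type"] in self.projection and self.selection(e)
        ]


# Assumed example model; the valueType values are illustrative only.
model = [
    {"type": "BusinessObject", "name": "Customer", "valueType": False},
    {"type": "BusinessObject", "name": "Address", "valueType": True},
    {"type": "Block", "name": "persistBody"},
]

# Shows BusinessObjects (and Associations) but no Block or Statement elements.
structure = ViewType(
    projection=frozenset({"BusinessObject", "Association"}),
    selection=lambda e: True,
    representation=lambda e: f"[{e['name']}]",
)

# Additionally filters on the valueType attribute.
value_types_overview = ViewType(
    projection=frozenset({"BusinessObject"}),
    selection=lambda e: e.get("valueType") is True,
    representation=lambda e: f"[{e['name']}]",
)

print(structure.create_view(model))             # ['[Customer]', '[Address]']
print(value_types_overview.create_view(model))  # ['[Address]']
```

Keeping the three rule types as separate fields mirrors the conceptual distinction; an implementation could equally fuse them into one rule, as noted above.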
The feature diagram depicted in Figure 4 gives an overview of the properties we identified.

Complete View Type Scopes: A language engineer needs to ensure that the created language is capable of expressing programs of the targeted domain. Especially if a view-based approach is employed, achieving the desired expressiveness and coverage of the domain’s metamodel can be an error-prone task. To ease communication about this coverage, as well as to provide a basis for tool builders to cope with this challenge, we define different types of completeness for the definition of a view type. A view type may be complete, which means that it considers all classes, properties, and relations of a metamodel that are reachable from the part of the metamodel for which the view type is defined. A complete view type can be used as a starting point for interacting with a model: it displays all model elements, and from there a modeller can dive deeper into the model using other view types.

Figure 5: Examples for containment completeness and view type overlaps. (a) Example: containment complete, downwards and upwards, illustrated on the classes Block, Statement, BusinessObject and MethodSignature. (b) Examples: view type overlaps. Intra view type overlap: the property “addresses” occurs more than once in the same view but using different representations. Inter view type overlap: classes are shown in multiple view types.

The scope of a view type rule is given by its projectional as well as selectional parts. Thus, we can also distinguish between projectional completeness (which includes the containment and local completeness) and selectional completeness (see the definition of instance completeness below).
Projectional Complete View Type Scope: Projectional completeness of views based on the MOF [Obj06] meta-metamodel can be defined on several levels. Starting from a certain model element, there are two dimensions along which a scope can be spanned to other model elements: first, traversing depth-first to other model elements via the associations defined in the metamodel, and second, including all elements that are reachable breadth-first through all directly attached attributes or associations.

Depth-first completeness is hard to specify, as it would need to be defined for each specific path through a metamodel. However, a special subset of depth-first completeness that is useful to define is completeness w.r.t. the containment associations. Containment associations are the primary way of creating an organisational structure for a metamodel. Therefore, we define a special case of depth-first completeness called containment completeness. A view type is containment complete concerning a specific element o if all elements that are related to it via containment associations are shown in the view. Three different notions of containment completeness can be defined (Figure 5(a) shows examples based on our example metamodel):
• Downwards containment-complete means that all elements that are transitively connected to o are part of the view if o is their transitive parent.
• Upwards containment-complete means that all elements that are transitively connected to o are part of the view if they are a transitive parent of o.
• The third notion specifies that all transitive parents and children of o are part of the view.

The second dimension of view type completeness is breadth-first completeness, which we call local completeness. Local completeness is fulfilled if a view type can display all directly referenced elements of a given element.
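The three notions of containment completeness listed above can be sketched in Python as reachability checks over the containment relation. This is only an illustration under simplifying assumptions: containment is encoded as a dictionary from a parent to its directly contained children, a view is a plain set of element names, and the example elements (Customer, persist, block1, stmt1, stmt2) are hypothetical instances loosely following the example metamodel.

```python
def reachable(start, edges):
    """All elements transitively reachable from `start` via `edges`."""
    result, stack = set(), [start]
    while stack:
        current = stack.pop()
        for nxt in edges.get(current, ()):
            if nxt not in result:
                result.add(nxt)
                stack.append(nxt)
    return result


def invert(contains):
    """Turn a parent -> children mapping into a child -> parents mapping."""
    parents = {}
    for parent, children in contains.items():
        for child in children:
            parents.setdefault(child, set()).add(parent)
    return parents


def downwards_complete(view, o, contains):
    # every transitive child of o is part of the view
    return reachable(o, contains) <= view


def upwards_complete(view, o, contains):
    # every transitive parent of o is part of the view
    return reachable(o, invert(contains)) <= view


def fully_containment_complete(view, o, contains):
    # third notion: all transitive parents and children of o are shown
    return downwards_complete(view, o, contains) and upwards_complete(view, o, contains)


# Hypothetical containment hierarchy (parent -> directly contained children).
contains = {
    "Customer": ["persist"],       # a BusinessObject owning a MethodSignature
    "persist": ["block1"],         # the signature's implementation Block
    "block1": ["stmt1", "stmt2"],  # the Block's Statements
}
view = {"persist", "block1", "stmt1", "stmt2"}

print(downwards_complete(view, "persist", contains))  # True
print(upwards_complete(view, "block1", contains))     # False, "Customer" is missing
```

Adding "Customer" to the view would make it both upwards and downwards containment complete for "persist" in this example.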
A view type is locally complete concerning a class c if every direct property of c can be displayed by an instance of the view type. Making this type of completeness explicit is useful if a certain view type should be a detail editor for a specific class, where the modeller should be able to view and/or edit all properties of the given class.

Figure 6: Using an external metamodel Persistency that is connected via a query in the view type to an existing metamodel BusinessObjects. (The Persistency metamodel defines the class Storage with the attributes boName : String and tableName : String; the BusinessObjects metamodel defines the class BusinessObject with name : String and valueType : Boolean. The view type definition reads: template Storage : "store" boName "with" "valueType" "=" external {query = BusinessObject.allInstances()->select(name = boName), property = valueType} "to" tableName ;)

Figure 7: Example view instance of the view type specified in Figure 6. (The model contains a Storage element with boName = "Company" and tableName = "CompanyTable" and a BusinessObject Company with name = "Company" and valueType = false; the view shows the text "store Company with valueType = false to CompanyTable".)

Selectional or Instance Completeness: Selectional completeness, or instance completeness, means that the selection of the view type includes all model instances that appear in the underlying model, as long as the projection of the view type also includes them. However, projectional completeness is not required in order to fulfil the instance completeness property. For example, a view type can have a projection of a class A that does not include a property propA. As long as the view type includes all possible instances of A, it is still instance complete. In contrast, if a view type defines a selection criterion for A such that only instances of A having a propA value of “selected” are included, the template for A is no longer instance complete.
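The example with class A and property propA can be replayed in a few lines of Python. The sketch below tests a selection predicate against one concrete model, so it can only refute instance completeness for that model, not prove it for all models. The dictionary encoding of instances and the function name is_instance_complete are assumptions introduced for this illustration.

```python
def is_instance_complete(selection, projected_class, model):
    """True if `selection` admits every instance of `projected_class` in `model`."""
    return all(selection(e) for e in model if e["class"] == projected_class)


# Assumed example model with two instances of A and one instance of B.
model = [
    {"class": "A", "propA": "selected"},
    {"class": "A", "propA": "other"},
    {"class": "B"},
]


def select_all(e):
    # no selection criterion: every projected instance is included
    return True


def select_by_prop(e):
    # selection criterion: only instances whose propA equals "selected"
    return e.get("propA") == "selected"


print(is_instance_complete(select_all, "A", model))      # True
print(is_instance_complete(select_by_prop, "A", model))  # False
```

Note that the check never looks at which properties of A the view type projects, which reflects that projectional completeness is not a prerequisite for instance completeness.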
Partial View Type Scope: The scope of a view type is considered partial concerning a metamodel if it only covers a certain part of the element types that are defined within the metamodel. For example, the “BusinessObject Structure” view type of our example is partial w.r.t. the metamodel, as it omits classes such as Block. Since a view type has different types of completeness, i.e., projectional and selectional completeness, a view type that is not complete w.r.t. one of these properties is automatically partial.

Extending View Type Scope: In addition to the properties partial and complete, there are also view types that combine elements from the underlying model with additional information from an external model M_ex. The extended information is characterised by the fact that it is not directly reachable from the extended view type by model navigation, but only via some kind of external model, e.g., a decorator model. Often, the information that should be added in such a view type is additionally defined using a different metamodel MM_ex. In our terminology, a view type always refers to a single metamodel. Therefore, the metamodel for such an extending view type is an artificial composite metamodel including both related metamodels. A concrete example that shows how such an extending view type could be defined is depicted in Figure 6, based on the example business object metamodel. The example shows an external persistency annotation metamodel that does not have any connection with the business object metamodel. The storage annotation only contains a hint to the name of the business object that should be persisted, in its boName attribute.
However, it might be a requirement that a language engineer needs to define a view type that not only shows elements of the persistency metamodel but also presents information from the business object metamodel, e.g., whether the mentioned business object is a value type or not. Therefore, a query is given in the view type that retrieves the corresponding business object with the specified name, whose valueType property is then shown in the view, as illustrated in Figure 7.

Figure 8: Using an external metamodel Persistency to non-intrusively add persistency annotations to an existing metamodel BusinessObjects. (The class Storage with the attribute tableName : String references one BusinessObject via the association end entity.)

Overlapping View Type Scope: This property is not a direct property of a view type but defines a relationship between two or more view types. There may be more than one view type that is able to represent the same type of element. On the other hand, it is also possible that the same view type can handle a distinct type of element in different ways. We call these types of overlap inter and intra view type overlap, respectively. Figure 5(b) shows examples for both types of overlap. An inter view type overlap occurs whenever two or more view types are able to represent the same element. A prerequisite for this property is that the involved view types are based on the same metamodel. Figure 5(b) shows that the “BusinessObject Structure” view type has an inter view type overlap with the “ValueTypes Overview” view type, as both show BusinessObjects. If the same view type can represent the same element in different ways, an intra view type overlap is present. This means that there is more than one predicate in the view type that includes the same element.
Figure 5(b) shows that the “BusinessObject Structure” view type has an intra view type overlap, as association ends are represented as a compartment within the BusinessObject’s shape as well as within a label decoration of the Association’s shape.

Representation: The third type of rule that a view type defines is responsible for the representation of the elements of a view. A view may comprise different types of representation rules. Possible types are textual, graphical, tabular and arbitrary other types. Rules may also combine different types. For example, a graphical representation may include some textual or tabular parts as well.

3.2 Views

Views, as instances of view types, can also have different properties, which are depicted in Figure 9 using a feature diagram. Using these properties, we classify views concerning the extent of information they show of the underlying model(s). We distinguish between selective and holistic views. Additionally, we handle the persistence and editability of layout and selection as well as inter and intra view overlaps of views. The following properties affect the behaviour of a view instance; however, the view type may also define generically whether a view has a specific property or not.

Figure 9: Properties of view instances. (Feature diagram listing, per view or per predicate, the features View Scope, Selective, Holistic, Addition, Deletion, Persistency, Selection, Layout, Editable, Entities, Model, Overlapping, Intra View and Inter View.)

View Scope: A view shows a specific selection of elements from its underlying model. If changes occur in the model, i.e., elements are added or deleted, the view needs to be updated according to these changes. The view scope property defines whether this is done automatically or only if a user explicitly requests the update.
Selective View Scope: A view is considered selective if it is possible to show a subset of the elements that could be shown according to its view type. A selective view only shows these specifically selected elements. The selection may be done either automatically or manually by a user of the view. For example, the example view for the “BusinessObject Structure” view type depicted in Figure 1 is selective, as the modeller can manually select whether or not a specific BO occurs in the view. In this case, the view only shows the BOs “Customer” and “Address” and omits, for example, “Company”. A view can be selective concerning different types of changes:

Addition Selective: Elements that are added to the model and fall into the scope of a view’s view type are only added to the view’s selection if they are added manually. For example, views of the “BusinessObject Structure” view type may not show all BOs at once. A modeller can select whether a newly added BO should appear in a certain view or not.

Deletion Selective: Deletions of elements from the model that fall into the scope of a view’s view type are only propagated to the view’s selection if the elements are deleted there manually. Thus, the deletion of elements from the underlying model does not necessarily result in the deletion of their view representations. For example, in many graphical modelling tools, representations of elements whose underlying model element is no longer available are not automatically removed from the diagrams but are rather annotated, indicating that the underlying model element is missing.

Holistic View Scope: In contrast to addition selective views, a view may be addition holistic. This means that it always presents the whole set of possible elements that can be displayed by the view. If elements are added and/or removed, this is immediately and automatically reflected in the view. Modelling tools mostly use this type of view to present the user with an overview of the underlying model.
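The difference between selective and holistic view scopes, for additions as well as deletions, can be illustrated with a small Python sketch. The class View, its two flags and its callback methods are hypothetical names chosen for this illustration and do not correspond to the API of any surveyed tool; model elements are represented by their names only.

```python
class View:
    """A view that reacts to model changes according to its view scope."""

    def __init__(self, in_scope, addition_holistic, deletion_holistic):
        self.in_scope = in_scope                  # predicate derived from the view type
        self.addition_holistic = addition_holistic
        self.deletion_holistic = deletion_holistic
        self.shown = set()                        # elements currently represented
        self.dangling = set()                     # shown elements missing in the model

    def on_model_added(self, element):
        # addition holistic: new in-scope elements appear automatically
        if self.addition_holistic and self.in_scope(element):
            self.shown.add(element)

    def on_model_deleted(self, element):
        if element not in self.shown:
            return
        if self.deletion_holistic:
            self.shown.discard(element)           # view and model stay in sync
        else:
            self.dangling.add(element)            # keep the shape, annotate it as missing

    def select(self, element):
        # manual addition by the modeller (used by addition selective views)
        if self.in_scope(element):
            self.shown.add(element)

    def deselect(self, element):
        # manual removal by the modeller (used by deletion selective views)
        self.shown.discard(element)
        self.dangling.discard(element)


overview = View(lambda e: True, addition_holistic=True, deletion_holistic=True)
diagram = View(lambda e: True, addition_holistic=False, deletion_holistic=False)

for bo in ("Customer", "Address", "Company"):
    overview.on_model_added(bo)
    diagram.on_model_added(bo)

diagram.select("Customer")   # the modeller picks what the diagram shows
diagram.select("Address")

for v in (overview, diagram):
    v.on_model_deleted("Address")

print(sorted(overview.shown))    # ['Company', 'Customer']
print(sorted(diagram.shown))     # ['Address', 'Customer']
print(sorted(diagram.dangling))  # ['Address']
```

The dangling set corresponds to the annotated representations that graphical tools keep when the underlying model element has been deleted.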
Analogously, deletion holistic views directly reflect any deletion of an element by removing its view representation, i.e., view and model are always in sync.

Overlapping: This property is not a direct property of a view but defines a relationship between two or more views. A view may overlap with another view. In this case, elements may occur in more than one view at once. The other view may be of the same view type or of a different one. If the element occurs in multiple views, we speak of inter view overlap, whereas we call multiple occurrences within the same view intra view overlap.

Editability: In addition to displaying model elements according to the view type’s rules, an editable view needs to provide means to interact with and thus modify the underlying model. Actions such as create, update and delete need to be available to make a view editable. Editability of views can be subdivided into two degrees. First, if only the layout information can be changed but not the actual model content, the view is only layout editable. Second, if the model content is editable through the view, it is considered content editable. Another interesting aspect is that editability is closely related to the view type scope (cf. Section 3.1). The scoping of a view type might determine the editability of its views. For example, a view type might omit a mandatory attribute of a metamodel class in its specification. In this case, it is not possible to create new instances of this class using this view type, but it is still possible to view and modify instances of the class.

Persistency: A view may be persistent regarding its selection as well as its layout. Stored view layouts enable faster access, as the layout does not need to be created anew every time a modeller opens the view.
Additionally, a persistent view that is at the same time editable enables customisation of the view’s selection of elements and/or layout. For non-holistic views, the modeller decides which elements a view should include and which not. If such a selection should be saved, a view needs to be selection persistent. In this case, the view’s selection of elements is stored. Additionally, a modeller may customise the layout of the view by manually changing certain parts, such as the explicit positioning of the elements occurring in the view or, for textual views, white space or indentation. Furthermore, a modeller may add additional, mostly informal content, such as comments or annotations. If a view allows this kind of information to be stored, it is layout persistent.

3.3 Editor Capabilities

Features that have an impact on how view-based DSML frameworks deal with the interaction of users and views, as well as the synchronisation between view and model, also influence the requirements on an employed view-based modelling approach. Figure 10 depicts a feature diagram that gives an overview of these editor capabilities. Note that we included only those properties that we consider special requirements for a view-based modelling approach. A broader view on DSML editor capabilities, at least for textual DSML approaches, can be found in [GBU08].

Bidirectionality: To keep models and their views in sync, the rules that perform this synchronisation need to be bidirectional (or there need to be two rules where one is the inverse of the other). In order to be correct, a bidirectional rule (or a pair of corresponding rules) needs to comply with the effect conformity property. To comply with effect conformity, a change made directly to the model should leave it in the same state as an equivalent change on the view level that is then automatically propagated back to the model.
This automated back propagation is defined by the view type rules. Vice versa, changes made through the view to the model should leave the view in the same state as an equivalent change on the model level that is propagated by the corresponding view type rule. Additionally, Matsuda et al. [MHN+07] define three bidirectional properties that need to be fulfilled in order to create consistent view definitions.

Figure 10: Editor capabilities. (Feature diagram listing the framework capabilities Inconsistency Handling (Constraint Inconsistency, Model Inconsistency), Update Strategy (Immediate, Deferred) and Bidirectionality.)

Update Strategy: We classify the update strategy, which triggers the propagation of changes between view and model, into two different types. (I) An update can be performed at the very moment a change is made to one of the sides, either model or view. This kind of update is denoted immediate update strategy. (II) An update may occur at a point in time decoupled from the actual change event. This kind of update is denoted deferred update strategy. The point in time when updates are performed determines the number of changes allowed between two subsequent synchronisation runs. In the immediate update strategy, the transformations are executed as soon as an atomic change has been performed on either the view or its model. This strategy allows a tighter coupling between view and model and avoids conflicts that may occur if an arbitrary number of changes is performed before the next synchronisation. The deferred update strategy, on the other hand, allows an arbitrary number of changes in this time span. This makes it possible to work with views in a more flexible way, as they can be changed offline, i.e., while the underlying model is currently not available. However, an arbitrarily large number of changes that need to be synchronised dramatically increases the probability of conflicts.
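The two update strategies can be contrasted in a short Python sketch that propagates changes from a view to its model. It is deliberately simplified: the class Synchroniser and its methods are hypothetical, only the direction from view to model is shown, a change is an attribute assignment, and conflict detection is left out entirely.

```python
class Synchroniser:
    """Propagates view changes to a model, either immediately or deferred."""

    def __init__(self, model, immediate):
        self.model = model          # element name -> dict of attribute values
        self.immediate = immediate
        self.pending = []           # changes waiting for the next synchronisation run

    def change_in_view(self, element, attribute, value):
        change = (element, attribute, value)
        if self.immediate:
            self._apply(change)             # (I) propagate each atomic change at once
        else:
            self.pending.append(change)     # (II) collect changes, e.g. while offline

    def synchronise(self):
        """Apply all collected changes and return how many were propagated."""
        count = len(self.pending)
        for change in self.pending:
            self._apply(change)
        self.pending.clear()
        return count

    def _apply(self, change):
        element, attribute, value = change
        self.model.setdefault(element, {})[attribute] = value


# Immediate strategy: the model reflects the change right away.
model_a = {}
immediate = Synchroniser(model_a, immediate=True)
immediate.change_in_view("Customer", "valueType", False)
print(model_a)  # {'Customer': {'valueType': False}}

# Deferred strategy: the model stays untouched until synchronise() is called.
model_b = {}
deferred = Synchroniser(model_b, immediate=False)
deferred.change_in_view("Customer", "valueType", False)
deferred.change_in_view("Address", "valueType", True)
print(model_b)                 # {}
print(deferred.synchronise())  # 2
```

The length of the pending list between two synchronisation runs is exactly the quantity that grows without bound under the deferred strategy, which is where the increased probability of conflicts comes from.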
Consistency Conservation: As modelling is a creative process, models are mostly created step by step. Thus, allowing for intermediate, possibly inconsistent states may foster the usability and productivity of a view-based DSML [Fow05, FGH+94]. If this is the case, an editable view might contain valuable information that was created during modelling but that is not yet transformable into a valid model. We define two different classes of inconsistency: (I) constraint inconsistency, i.e., the violation of metamodel constraints, which means that the view has statically detectable semantic errors, and (II) model inconsistency, which occurs if a view is syntactically incorrect and cannot be transformed into a model at all.

Metamodel constraints restrict the validity of models that would theoretically be constructible obeying only the rules defined in the metamodel without constraints. This also includes multiplicity definitions for associations and attributes. For example, considering our example metamodel, an invariant defined for the metamodel class BusinessObject expresses that a MethodSignature may not have the same name as an AssociationEnd connected to the same BO (expressed as OCL invariant: inv: self.signatures->forAll(s | self.elementsOfType.typedElement.name <> s.name)). If there exists an instance of MethodSignature that has a name already given to such an AssociationEnd, this constraint is violated. However, during the process of modelling there may be intermediate states where both elements have the same name, e.g., during a renaming process. Still, the element should be representable in a view, e.g., with additional information stating that the constraint is currently violated. If constraint inconsistency were not supported, the modeller would have to first change the AssociationEnd’s name before renaming the MethodSignature.
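How a view might tolerate constraint inconsistency can be sketched in Python for the naming invariant discussed above. The sketch flattens the example metamodel considerably: a BusinessObject simply carries the names of its method signatures and of the association ends connected to it. This flattening, the function names and the textual marker are assumptions for illustration; the sketch is not a translation of the OCL expression.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BusinessObject:
    name: str
    signature_names: List[str] = field(default_factory=list)
    association_end_names: List[str] = field(default_factory=list)


def violated_names(bo):
    """Names used by both a MethodSignature and an AssociationEnd of `bo`."""
    return sorted(set(bo.signature_names) & set(bo.association_end_names))


def render(bo):
    """Show the element even if the invariant is violated, but flag the violation."""
    clashes = violated_names(bo)
    marker = f" (!) name clash: {', '.join(clashes)}" if clashes else ""
    return f"[{bo.name}]{marker}"


# Intermediate state during a renaming: signature and association end share a name.
customer = BusinessObject(
    "Customer",
    signature_names=["persist", "addresses"],
    association_end_names=["addresses"],
)
print(render(customer))  # [Customer] (!) name clash: addresses

# Completing the renaming resolves the inconsistency.
customer.association_end_names = ["locations"]
print(render(customer))  # [Customer]
```

The point of the sketch is that render never rejects the element; the violation is reported as additional information, so the modeller is free to rename the two elements in either order.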
In case (II), a greater degree of freedom in modelling can be reached if a view even supports holding content that cannot be translated into a model at all. This allows a developer to work with the view like a “scratch pad”. We denote this type of inconsistency model inconsistency. As graphical modelling tools mostly only allow the modeller to perform atomic modifications that preserve the syntactical correctness of the view, this type of inconsistency mostly occurs in textual modelling, where modellers are often free to type syntactically incorrect information within a view.

4 Related Work

Oliveira et al. [OPHdC09] presented a theoretical survey on DSLs that also included the distinction between the language usage and development perspectives. However, the survey remains on a more conceptual level, mostly dealing with properties such as internal vs. external and compilation vs. interpretation, as well as general advantages and disadvantages of DSL approaches. The authors neither mention graphical DSMLs nor include view-based modelling aspects in their survey.

Pfeiffer and Pichler give a tool-oriented overview of textual DSMLs in [Pfe08]. Their survey is based on three main categories: language, transformation, and tool. The evaluated features include the representation and composability of language definitions, transformation properties such as the update strategy, as well as the kind of consistency checking that is supported. However, view-based modelling aspects and graphical or hybrid DSML approaches are omitted.

Buckl et al. [BKS10] have refined the ISO 42010 standard and created a framework for architectural descriptions. They provide a formal definition of the terms view, viewpoint and concern that complies with ISO 42010. The definition is, however, restricted to architecture modelling.
In our own previous work [GBU08] we conducted a classification-based survey on textual concrete syntax approaches which have a common intersection with the approaches for view-based DSMLs. However, the focus in this previous work was on evaluating the textual modelling capabilities such as grammar classes, generator and editor capabilities (such as code completion or syntax highlighting) and did not include features that are required for view-based modelling. Another feature-based survey on textual modelling tools is presented by Merkle in [Mer10]. Although this survey includes some features which we also discuss, such as the representation of the concrete syntax as well as some tool-related aspects, it neither presents view-related features nor gives hints on the existence of view-based aspects in the classified tools.

5 Conclusions & Future Work

In this paper, we identified properties for the main concepts of view-based DSMLs. The analysis was based on our experiences with several different graphical as well as textual DSML approaches. In this we distinguish between viewpoints, view types and views. We furthermore focus on properties that relate to tool-oriented capabilities such as partial or overlapping view definitions or holistic and selective views. This classification scheme allows DSML developers and users to explicitly specify properties of view types and views. This enhances the communication between language engineers and modellers during requirements elicitation, specification and implementation of view-based DSMLs.

Based on the classification scheme we will carry out a systematic review of existing DSML approaches. The results of this analysis will benefit language engineers, as they help in the selection of a view-based modelling approach. Furthermore, we will be able to identify gaps in tool support w.r.t. view-based modelling.
Preliminary results of the tool evaluation are available online.²

² http://sdqweb.ipd.kit.edu/burger/mod2012/

References

[ASB09] Colin Atkinson, Dietmar Stoll, and Philipp Bostan. Supporting View-Based Development through Orthographic Software Modeling. In Stefan Jablonski and Leszek A. Maciaszek, editors, ENASE, pages 71–86. INSTICC Press, 2009.
[BKS10] Sabine Buckl, Sascha Krell, and Christian M. Schweda. A Formal Approach to Architectural Descriptions – Refining the ISO Standard 42010. In Advances in Enterprise Engineering IV, volume 49 of Lecture Notes in Business Information Processing, pages 77–91. Springer Berlin Heidelberg, 2010.
[CJKW07] Steve Cook, Gareth Jones, Stuart Kent, and Alan Wills. Domain-Specific Development with Visual Studio DSL Tools. Addison-Wesley Professional, first edition, 2007.
[Cle03] Paul Clements. Documenting Software Architectures: Views and Beyond. SEI series in software engineering. Addison-Wesley, Boston, Mass., 2003.
[Ecl11a] Eclipse Foundation. Graphical Modeling Framework Homepage. http://www.eclipse.org/gmf/, 2011. Last retrieved 2011-10-06.
[Ecl11b] Eclipse Foundation. Xtext Homepage. http://www.eclipse.org/Xtext/, 2011. Last retrieved 2011-10-06.
[FGH+94] A. Finkelstein, D. Gabbay, A. Hunter, J. Kramer, and B. Nuseibeh. Inconsistency Handling In Multi-Perspective Specifications. IEEE Transactions on Software Engineering, 20:569–578, 1994.
[FKN+92] A. Finkelstein, J. Kramer, B. Nuseibeh, L. Finkelstein, and M. Goedicke. Viewpoints: A Framework for Integrating Multiple Perspectives in System Development. International Journal of Software Engineering and Knowledge Engineering, 2, 1992.
[Fow05] Martin Fowler. Language Workbenches: The Killer-App for Domain Specific Languages? 2005.
[GBU08] Thomas Goldschmidt, Steffen Becker, and Axel Uhl. Classification of Concrete Textual Syntax Mapping Approaches.
In Proceedings of the 4th European Conference on Model Driven Architecture - Foundations and Applications, pages 169–184, 2008.
[GHZL06] John C. Grundy, John G. Hosking, Nianping Zhu, and Na Liu. Generating Domain-Specific Visual Language Editors from High-level Tool Specifications. In ASE, pages 25–36. IEEE Computer Society, 2006.
[ISO11] ISO/IEC/IEEE Std 42010:2011 – Systems and software engineering – Architecture description. Los Alamitos, CA: IEEE, 2011.
[KT08] S. Kelly and J-P. Tolvanen. Domain-Specific Modeling: Enabling Full Code Generation. Wiley-IEEE Society Press, 2008.
[KV10] Lennart Kats and Eelco Visser. The Spoofax Language Workbench. Rules for Declarative Specification of Languages and IDEs. In Proceedings of OOPSLA, pages 444–463, 2010.
[MCF03] S.J. Mellor, A.N. Clark, and T. Futagami. Model-driven development - Guest editor’s introduction. IEEE Software, 20:14–18, 2003.
[Mer10] Bernhard Merkle. Textual modeling tools: overview and comparison of language workbenches. In Proceedings of SPLASH, pages 139–148, New York, NY, USA, 2010. ACM.
[MHN+07] Kazutaka Matsuda, Zhenjiang Hu, Keisuke Nakano, Makoto Hamana, and Masato Takeichi. Bidirectional Transformation based on Automatic Derivation of View Complement Functions. In Proc. of the ICFP 2007, pages 47–58. ACM Press, 2007.
[MPS] JetBrains MPS. http://www.jetbrains.net/confluence/display/MPS/Welcome+to+JetBrains+MPS+Early+Access+Program.
[Obj06] Object Management Group (OMG). MOF 2.0 Core Specification, 2006.
[Obj10] Object Management Group (OMG). Diagram Definition, 2010.
[OPHdC09] Nuno Oliveira, Maria Joao Varanda Pereira, Pedro Rangel Henriques, and Daniela da Cruz. Domain Specific Languages: A Theoretical Survey. In Proceedings of the 3rd Compilers, Programming Languages, Related Technologies and Applications (CoRTA’2009), 2009.
[Pfe08] Pfeiffer and Pichler. A Comparison of Tool Support for Textual Domain-Specific Languages. In 8th OOPSLA Workshop on Domain Specific Modeling, 2008.
[RW05] Nick Rozanski and Eoin Woods.
Software Systems Architecture. Addison-Wesley, 2005.
[Szy02] C. Szyperski. Component Software: Beyond Object-Oriented Programming. ACM Press/Addison-Wesley Publishing Co., 2002.

Towards a Conceptual Framework for Interactive Enterprise Architecture Management Visualizations

Michael Schaub, Florian Matthes, Sascha Roth
{michael.schaub | matthes | sascha.roth}@in.tum.de

Abstract: Visualizations have grown to a de-facto standard as means for decision-making in the management discipline of enterprise architecture (EA). These visualizations are often created manually, so they are soon outdated because the underlying data change frequently. As a consequence, EA management tools require mechanisms to generate visualizations. In this vein, a major challenge is to adapt common EA visualizations to an organization-specific metamodel. At the same time, end-users want to interact with the visualization in terms of changing data immediately within the visualization for the strategic planning of an EA. As of today, there is no standard, framework, or reference model for the generation of such an interactive EA visualization. This paper 1) introduces a framework, i.e. an interplay of different models to realize interactive visualizations, 2) outlines requirements for interactive EA management visualizations referring to concepts of the framework, 3) applies the framework to a prototypical implementation detailing the therein used models as an example, and 4) compares the prototype to related work employing the framework.

1 Introduction

Today’s enterprises cope with the complexity of changes to highly interconnected business applications, where local changes often result in global consequences, i.e. impact the application landscape as a whole. At the same time, change requests to business applications or processes are required to be fast and cost-effective in response to competitive global markets with frequently changing conditions [WR09, Ros03].
Enterprise Architecture (EA) management promises to balance short-term business benefit and long-term maintenance of both business and IT in an enterprise [MWF08, MBF+11]. A holistic perspective of the EA is therefore indispensable. In this vein, visualizations have grown to a de-facto standard as means for strategic decision making in the management discipline of EA. Concepts formally describing EA management visualizations are summarized as system cartography¹, whereby the generation of visualizations out of existing data is not yet described in depth [Wit07], i.e. currently there exists no standard, framework, reference architecture, or best practice for generating EA visualizations.

¹ Formerly known as software cartography [Mat08].

Slightly later than the discipline itself, tool support for EA management also emerged [MBLS08, BBDF+12]. With respect to their visualization capabilities, the range of tools for EA management reaches from mere drawing tools to a model-driven generation of visualizations. The former approach has clear drawbacks since visualizations are created manually in a handcrafted, error-prone, and inefficient process. The latter approach is often limited to a single information model, also known as a metamodel. Such an information model has to try to capture the entirety of all relevant entities across all business domains and industry sectors. Since this is an endeavour doomed to fail, EA vendors chose to offer mechanisms for extending a ‘core’ information model. Since no standard information model for EAs exists, enterprises tend to use an organization-specific information model reflecting their information demands and tend to adopt the enterprise’s terminology, i.e. aforementioned extension mechanisms are frequently used [BMR+10a]. At the same time, respective visualization algorithms do not adapt to those changes automatically, i.e.
the visualizations have to be adapted to the extensions, leading to extensive configuration or additional implementation/customization efforts. Since there is no standard, framework, or reference model for generating such an interactive EA visualization, we arrive at the following research question: ‘What does a common framework or reference model for generating interactive EA visualizations look like?’

The remainder of this paper is structured as follows: Section 2 introduces a conceptual framework for generating interactive visualizations in general and in particular for EA management. An outline of requirements for interactive EA management visualizations is given in Section 3. The framework is then applied to a prototypical implementation in Section 4. Subsequently, Section 5 revisits related approaches and compares them to the prototype employing the introduced framework. Finally, Section 6 concludes this paper and gives a brief outlook on open research questions.

2 Generating interactive visualizations

Figure 1 illustrates an overview of a conceptual framework to generate interactive visualizations that is detailed in the following. The framework consists of:

A data model which is considered as the actual data d within a data source that can be retrieved by a query q. Depending on the nature of the data source, different fields may have different access permissions [BMR+10b, BMM+11]. Therefore, a data interaction model di captures the different access permissions for each concrete x ∈ d, i.e. access rights and permissions on data level but not schema level. As an example, users of a certain department might only get information about business applications in their particular business unit.

An information model that describes the schema im that the data model d is based on. “An information model is a representation of concepts, relationships, constraints, rules, and operations to specify data semantics for a chosen domain of discourse” [Lee99].
An interaction model i that subsumes the interactions that are allowed on the information model level, i.e. which entity can be created, read, updated, or deleted. For instance, a certain role can only create business applications but is not allowed to create business units.

An abstract information model which can be a template for a certain information model or type/entity therein. Based on the observations of Buckl et al. in [BEL+07], organizations use recurring patterns to describe their managed information. In particular, Schweda shows in [Sch11] that recurring patterns of information models have been observed, which he synthesized to so-called information model building blocks (IBBs). Such an information model template, fragment, or building block comes with a certain abstract interaction model that describes e.g. predefined access rights synthesized as best practices.

Figure 1: A conceptual framework to generate interactive visualizations

A view data model v = q(d) such that v ⊆ d ∪ q1, whereby q1 are results of q that are calculated out of d, e.g. aggregations or average values.

A view data interaction model, which is a subset of di and is derived from q. In some cases, q reduces di not only by the selected values of d, but also by additional interactions, e.g. aggregated values cannot be edited regardless of access rights for a specific x ∈ d.
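The definitions of v = q(d) and of the view data interaction model can be illustrated with a small sketch. The record layout, the `cost` field, and the permission strings are assumptions made for this example only.

```python
# Data model d: concrete records; data interaction model di: permissions per record id.
d = [
    {"id": "crm", "unit": "IT Shared Services", "cost": 120},
    {"id": "acc", "unit": "IT Shared Services", "cost": 80},
    {"id": "hr", "unit": "Human Resources", "cost": 50},
]
di = {"crm": {"read", "update"}, "acc": {"read"}, "hr": {"read", "update"}}


def q(data, unit):
    """Query: select the records of one unit and derive an aggregated value (q1)."""
    selected = [x for x in data if x["unit"] == unit]
    q1 = {"id": "total_cost", "unit": unit, "cost": sum(x["cost"] for x in selected)}
    return selected, q1


selected, q1 = q(d, "IT Shared Services")
v = selected + [q1]  # view data model: a subset of d plus the derived value q1

# View data interaction model: permissions of the selected records only.
# The aggregate is never editable, whatever rights exist on the underlying records.
view_di = {x["id"]: di[x["id"]] for x in selected}
view_di[q1["id"]] = {"read"}
```

Here the query both narrows the permissions to the selected records and removes the update interaction from the aggregated value.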
A view model is the schema vm of v, such that vm ⊆ im ∪ q2 is derived from q, whereas q2 describes the part of the schema which has been created entirely by q, i.e. in general q2 ⊄ im.

A view interaction model vi ⊆ i which is determined by q, i.e. depending on a particular q, interactions of i are enabled or not by vi. For instance, on aggregated values, updates are prohibited, whereas relationships and transitive relationships² could be updated³ based on i.

An abstract view model which defines the information demands for a particular visualization blueprint. The abstract view model va can be used as a basis to perform a pattern matching, i.e. matching for the pattern given by va on im (see e.g. [BURV11, BHR+10a]).

An abstract view interaction model which defines permitted interactions based on the information demands va.

A symbolic model sm summarizes the rendered symbols, i.e. instances of shapes like rectangles, lines, etc., such that ultimately sm is the visualization as such.

A symbolic interaction model offers interactions on the actual visualization. These interactions are of general concern for all sm, e.g. navigation or adaptive zooming [CG02], and do not relate to d or im.

A visualization model vism is the definition of visual primitives, i.e. shapes like rectangles, lines, etc. and simple compositions thereof. Thereby, sm is an instantiation of vism which has been fully configured, e.g. a red dotted line.

A visual interaction model visi comprises the interactions that come with selections s ⊆ vism. Thereby, s is e.g. a rectangle which is draggable and may change its size on user manipulation. The instantiated and configured mapping of vm to vism can be summarized as a viewpoint in line with ISO/IEC 42010:2007 (cf. [Int07]).

An abstract visualization model visa that describes more complex compositions of elements of vism. Thereby, visa is not an instance of a vism but a predefined composition, i.e.
blueprint or building block, with additionally specified variability points defined in visa that may modify the actual appearance of sm.

An abstract visual interaction model that describes possible interactions from the pure visual point of view with respect to the predefined configurations. For instance, a text not fitting inside a rectangle is cut after reaching a maximum length and a ‘...’ string is appended. In addition to cutting off the over-sized text so that the text object visually fits, a tool tip is added to give the end-user feedback of the actual text contained. Such a behaviour is independent of a concrete visualization and thus can be defined in an abstract manner, therefore constituting a separate model. A described mapping of va to visa including variability points can be summarized as a viewpoint building block (VBB) in line with Buckl et al. [BDMS10].

² As used e.g. in the visualizations introduced in [BMR+10b].
³ Such an update may require to create stub objects or require additional user interventions depending on the concrete information model.

3 Requirements for interactive EA visualizations

With a focus on EA management, we identified the following requirements for interactive EA visualizations, which will be explained with regard to the aforementioned conceptual framework.

As outlined above, in the context of EA management enterprises tend to use organization-specific information models since there is no common standard EA model to describe an entire organization’s information demands. Consequently, it has to be ensured that an arbitrary information model can be visualized dynamically, i.e. an EA visualization tool should be able to generate visualizations out of data without the need to manually adapt to information models (Re1).
More technically speaking, this also implies that the mapping of an information model to a visualization model must be performed dynamically at runtime and be configurable by end-users (Re1.1).

EA management visualizations are not only used to view data but are also consulted when making strategic decisions. These often require impact analyses which can be performed best in a graphical manner, i.e. by manipulating the symbolic model directly (Re2) to perform ‘what-if’ analyses. EA management can be viewed from many perspectives and angles, depending on different stakeholders with different concerns, which results in stakeholder-specific visualizations highlighting relevant data for a specific issue [AKRS08, IL08, Mat08, BGS10]. Interactive EA visualizations must be able to visualize a subset of the data model or the information model (Re2.1) while offering valid interactions and keeping consistency [DQvS08]. These manipulations should not only influence the visualization but also the underlying data, so that changes to the symbolic model are propagated to the respective data model and information model (Re2.2), being permitted and constrained by an underlying data interaction model and interaction model. Interactions with the visualization, i.e. the symbolic model, should preferably be smooth. Following [Nie94], “the limit for having the user feel that the system is reacting instantaneously” is about 0.1 second.

To provide EA visualizations in a decentralized manner (cf. [BMNS09]), a solution is intended to use a client/server architecture allowing a centralized data model and information model while the generated visualizations can be viewed and manipulated decentrally (Re3). In this vein, a major challenge is the reduction of needed round-trips for propagating changes in the symbolic model, which resides on the client, to the data model, possibly located at the server.
During such a round-trip all kinds of interactions have to be locked in order to guarantee that the semantic integrity of the data model is not violated through any further incompatible interactions, following the ACID (atomicity, consistency, isolation, durability) properties. Round-trips may take up to a couple of seconds, leading to decreased user adoption. As a consequence, as many as possible of the restrictions related to the permitted user interactions, defined by the interaction model, should be available immediately within the client such that manipulations to the symbolic model are limited to a minimum, which increases usability (Re3.1).

In [BELM08], Buckl et al. have shown that visualizations, so-called V-Patterns, recur in the discipline of EA management. Buckl et al. also synthesized these V-Patterns in so-called viewpoint building blocks (VBB). Considering the framework explained above, these V-Patterns are viewpoints whereas the paradigm of VBBs is adapted. Buckl et al. showed in [BELM08] that recurring patterns, and combinations thereof, are reused. Therefore, EA visualizations must be defined as pre-configured, parameterized⁴ building blocks (Re4) in order to increase re-usability of existing software artifacts and accelerate development periods. Moreover, EA visualizations must be generated employing such building blocks allowing combinations thereof (Re4.1) in order to enable more complex combinations of visualizations out of building blocks by end-users.

4 Prototypical implementation

Based on the framework introduced in Section 2, a prototypical implementation was developed, which is described in this section with reference to the requirements of the previous section. In the following, the process of generating an EA visualization, i.e. generating a symbolic model, is explained in detail by an exemplary information model and data model. The EA visualization generated is taken from Buckl et al.
(V-Pattern 26 in [BELM08]) since they used a pattern-based approach, i.e. they observed this kind of visualization with an underlying information model at least three times⁵ in practice⁶.

⁴ In this context, parameterized means explicitly defined variability points.
⁵ For an explanation of the ‘rule of three’ see [AIS+77].
⁶ This proves practical relevance as desired for a design science approach (cf. [HMPR04]).

Figure 2: Pattern Matching of abstract view model and information model
(a) Information model with view model (in dashed lines)
Business Unit (BU): +name : String
Business Application (BA): +name : string, +developmentFrom : Date, +developmentTo : Date, +plannedFrom : Date, +plannedTo : Date, +productionFrom : Date, +productionTo : Date, -retirementFrom : Date, -retirementTo : Date
Location: +name : String
Employee: +firstName : String, +lastName : String, +email : String, +phone : String
(b) Abstract view model
Outer: +name : String
Inner: +name : String, +rect1Start : Date, +rect1End : Date, +rect2Start : Date, +rect2End : Date, +rect3Start : Date, +rect3End : Date, +rect4Start : Date, +rect4End : Date
Outer (1) to Inner (*)

Figure 2(a) shows an excerpt from an information model consisting of business applications related to each other, business units that use them and are based at a certain location, and employees that work at business units. An exemplary instantiation of this information model is illustrated in Figure 3, which is used in the following example to generate a time-interval map (cf. [BELM08] or Figure 5).

Figure 3: Data model with view data model (in dashed lines)
CRM System : Business Application (BA): name = CRM System, developmentFrom = 01.01.2012, developmentTo = 01.04.2012, plannedFrom = 01.11.2011, plannedTo = 01.01.2012, productionFrom = 01.04.2012, productionTo = 01.01.2013, retirementFrom = 01.01.2013, retirementTo = 01.03.2013
Accounting System : Business Application (BA): name = Accounting System, developmentFrom = 01.01.2012, developmentTo = 01.06.2012, plannedFrom = 15.11.2011, plannedTo = 01.12.2011, productionFrom = 01.06.2012, productionTo = 01.01.2013, retirementFrom = 01.01.2013, retirementTo = 01.02.2013
IT Shared Services : Business Unit (BU): name = IT Shared Services
Munich : Location: name = Munich
Martina Musterfrau : Employee: firstName = martina, lastName = musterfrau, email = musterfrau@company.tld, phone = +49 123 798 456
Max Mustermann : Employee: firstName = max, lastName = mustermann, email = mustermann@company.tld, phone = +49 123 456 789

The first step towards generating a visualization is to define a VBB as an abstract template (Re4). Thereby, an abstract view model (see Figure 2(b)) is created stating that ‘outer’ and ‘inner’ entities linked via a 1:n relationship are the information demands for this VBB. This formal specification of the information demands is used to match the pattern of the abstract view model against a given information model, using the pattern matching library IncQuery [BHR+10b]. The pattern matcher searches the information model for potential ‘outer’ entities, which are required to have a name attribute, connected via a 1:n relationship to ‘inner’ entities, which are required to offer four pairs of start-fields and end-fields and a name attribute. This technique is used to offer an end-user the possibility of visualizing data with an arbitrary information model (Re1) by choosing the information of interest from a list of potential ‘outer’ and ‘inner’ entities (Re1.1).
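The matching step described above can be approximated in a few lines. This is a simplified sketch and not the IncQuery API: the information model is encoded as plain dictionaries, the helper names (`is_outer`, `is_inner`, `match_vbb`) are hypothetical, the listed 1:n relationships are an assumption, and start/end pairs are detected through the From/To naming of the example rather than through a general type-based match.

```python
# Information model of the example, reduced to attribute names and types.
information_model = {
    "BusinessUnit": {"name": "String"},
    "BusinessApplication": {
        "name": "String",
        "developmentFrom": "Date", "developmentTo": "Date",
        "plannedFrom": "Date", "plannedTo": "Date",
        "productionFrom": "Date", "productionTo": "Date",
        "retirementFrom": "Date", "retirementTo": "Date",
    },
    "Location": {"name": "String"},
    "Employee": {"firstName": "String", "lastName": "String",
                 "email": "String", "phone": "String"},
}
# Assumed 1:n relationships, written as (one side, many side).
one_to_many = [("BusinessUnit", "BusinessApplication"), ("BusinessUnit", "Employee")]


def is_outer(attrs):
    """An 'outer' entity only needs a name attribute."""
    return "name" in attrs


def is_inner(attrs):
    """An 'inner' entity needs a name and at least four Date start/end pairs."""
    dates = [a for a, t in attrs.items() if t == "Date"]
    starts = {a[:-4] for a in dates if a.endswith("From")}
    ends = {a[:-2] for a in dates if a.endswith("To")}
    return "name" in attrs and len(starts & ends) >= 4


def match_vbb(model, relations):
    """Return all (outer, inner) candidates that could be offered to the end-user."""
    return [(o, i) for o, i in relations if is_outer(model[o]) and is_inner(model[i])]


candidates = match_vbb(information_model, one_to_many)
```

In this encoding only the pair of business unit and business application satisfies the information demands; Employee is rejected because it offers neither a name attribute nor date pairs.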
Besides the information demands, the VBB defines symbols that are used to visualize the chosen information, i.e. it describes variability points of elements of the visualization model in an abstract visualization model (Figure 4(a)). For instance, symbols like rectangles or circles commonly offer different color configurations or could even be used interchangeably (e.g. use circles instead of rectangles). Commonly, elements of the visualization model become visible in a symbolic model, while the composite symbol, shown as dotted line in Figure 4(a), is used as logical container for a set of symbols and is not directly visible. As shown in Figure 4(a), the abstract visualization model defines that each inner entity is represented through three kinds of symbols, namely a composite symbol, a text symbol and four rectangle symbols. Furthermore, the composite symbol is conceived to enable the setting of constraints/rules for all symbols contained therein. Figure 4(a) illustrates text and rectangle symbols embodying different attributes that are used for the transformation process later on.

Finally, the VBB defines a mapping between abstract view model and abstract visualization model. The VBB states whether, and how exactly, an object/attribute of the abstract view model is directly bound to a corresponding object/attribute of the abstract visualization model. Moreover, the VBB also defines objects/attributes of the abstract view model that are employed to calculate⁷ one or more objects/attributes of the abstract visualization model. In the given example the name attribute of an ‘inner’ entity is directly bound to the name attribute of a text symbol, whereas pairs of start- and end-date attributes are used to calculate the width of rectangle symbols.

⁷ We call these derived attributes, which can be the result of any calculation, e.g. a sum of values, transitive relationships in the information model, etc.

Figure 4: The connection between abstract and non-abstract visualization model. (a) VBB: Abstract visualization model. (b) VBB: Visualization model.

After matching the pattern of information demand of an abstract view model to determine which part of an information model is needed, the view data model highlighted with dashed lines in Figure 3 is extracted from the data model (Re2.1). A so-called viewpoint configuration is used to set relevant parameters so that the viewpoint can process the view data model that is passed to it. The viewpoint configuration states which fields of the information model are mapped to which fields of the abstract view model; the end-user chooses these from a list of all possible entities and combinations thereof (Re1.1) after the pattern matching. In the given example each pair of from and to values is mapped to one pair of rectXStart and rectXEnd, i.e. developmentFrom is mapped to rect1Start and developmentTo is mapped to rect1End. All other rectX fields of the abstract view model are mapped in a similar way. Furthermore, the field name of a business application entity is mapped to the inner name attribute of the abstract view model.
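A field mapping of this kind could be written down as follows. The key names (`outer`, `inner`, `fieldMapping`) and the JSON encoding are assumptions for illustration, since the concrete format of the prototype is not given here. Only the assignment of the development fields to rect1 is taken from the text; the order of the remaining pairs is assumed.

```python
import json

config = {
    "outer": "BusinessUnit",
    "inner": "BusinessApplication",
    "fieldMapping": {  # abstract view model field -> information model field
        "name": "name",
        "rect1Start": "developmentFrom", "rect1End": "developmentTo",
        "rect2Start": "plannedFrom", "rect2End": "plannedTo",        # assumed order
        "rect3Start": "productionFrom", "rect3End": "productionTo",  # assumed order
        "rect4Start": "retirementFrom", "rect4End": "retirementTo",  # assumed order
    },
}
payload = json.dumps(config)  # string form of the configuration


def to_view_entity(entity, mapping):
    """Rename the fields of one data-model entity to abstract view model fields."""
    return {view_field: entity[model_field] for view_field, model_field in mapping.items()}


crm = {"name": "CRM System",
       "developmentFrom": "01.01.2012", "developmentTo": "01.04.2012",
       "plannedFrom": "01.11.2011", "plannedTo": "01.01.2012",
       "productionFrom": "01.04.2012", "productionTo": "01.01.2013",
       "retirementFrom": "01.01.2013", "retirementTo": "01.03.2013"}
inner = to_view_entity(crm, json.loads(payload)["fieldMapping"])
```

Applying the mapping to the CRM System instance yields an ‘inner’ entity whose rect1Start and rect1End carry the development dates.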
Besides the concrete mapping of an information model to an abstract view model, the abstract visualization model is parameterized. In our example, the colors of the rectangles and the font size of the text are set. The viewpoint configuration itself is passed to a RESTful web service as a JSON string, where it is processed and passed on to the VBB. The resulting runtime models that are created through parameterizing the abstract view model and abstract visualization model are the view model (Figure 2(a)) and elements from the visualization model (Figure 4(b)) of the viewpoint.

Being fully configured, the viewpoint is finally used to process all entities of the view data model. In this step, for each entity of the view data model, one row of the time-interval map, whose structure is defined in the visualization model, is added to the symbolic model that can be seen in Figure 5. At this point, the layout is also done, i.e. x/y position and width parameters are calculated on the basis of the attributes of the view data model. In our prototypical implementation, the result is JavaScript code utilizing the Raphaël framework to generate the visualization in the web browser of the client (Re3).

In addition, the VBB not only specifies symbols or groups thereof in terms of composite symbols, but also equips them with predefined interactions so that the user can manipulate the visualization (Re2). In order to be able to set only permitted interactions, different information sources are used. As explained in Section 2 the interaction
As explained in Section 2 the interaction Towards a Conceptual Framework for Interactive Enterprise Architecture Management Visualizations wo vutt kmj vutt yoq vutt pq vutt yo~ vutt w vutt w vutt vutt mp vutt in vutt hrg vutt 83 fm vutt srqprqonm lrqno l rn ~}nm| {zl ~}nm| sy ~}nm| Figure 5: Symbolic model model determines which actions are allowed upon the data model without affecting the integrity of the information model, whereas the data interaction model checks access rights on a particular element in the data model. On the other hand, the abstract view interaction model states which kind of interactions are allowed upon the abstract view model within this VBB. In the given example (cf. Figure 6) the fields rectXStart and rectXEnd of the inner entities can be changed (in italic) , whereas the name field must remain the same (in bold). Propagating these changes to the data model is only allowed because they are based upon bijective functions. In contrast, when using derived attributes that are calculated using aggregated values, interactions are not permitted since an interaction not based upon a bijective function would inevitably cause trouble while propagating changes to the data model (Re2.2). Additionally the abstract visual interaction model determines the permit- !))/. + $)!', .-/%$& " +,0(&3(2+(' )(+1-* " +,0(&!-.' )(+1-* " +,0(%3(2+(' )(+1-* " +,0(%!-.' )(+1-* " +,0($3(2+(' )(+1-* " +,0($!-.' )(+1-* " +,0(#3(2+(' )(+1-* " +,0(#!-.' )(+1-* 3*,/. # " + $)!', .-/%$& *"#(& ./21$0)+' 4(2/10& ./21 % (.-,/ Figure 6: Abstract view interaction model ted interactions upon the different symbols that are used for rendering the symbolic model later on, i.e. changes to the width of an element can be interpreted as changes of a date. 
Interactions are represented by constraints that are attached to symbols. For each interaction that can be triggered, such as moving or dragging and dropping a symbol, a constraint is implemented whose parameters can be customized to restrict this interaction. For example, the rectangles of the time-interval map can only be moved horizontally, whereas the composite symbols are limited to vertical movement. Figure 7(a) shows each composite and rectangle symbol equipped with a Movement constraint to achieve this behavior. The Movement constraint has three parameters, direction, minimum and maximum, which have to be set so that the equipped symbol can be moved in the given direction between the minimum and maximum value. Accordingly, if a symbol should be movable both horizontally and vertically, two Movement constraints have to be attached to it. Besides Movement, further constraints such as Resizing or Containment have been implemented, but are not needed for this particular kind of visualization. Each of these constraints has its own parameters, which are set up individually in a VBB. In addition to interaction constraints, the symbolic interactions we used focus on direct user feedback, e.g. tool tip texts or highlighting on selection. In our example, tool tip texts are shown when hovering over or dragging an end of a rectangle (rectXStart or rectXEnd). As shown in Figure 5, some of the rectangles are not filled, reflecting read-only access as determined by the data interaction model.
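The Movement constraint described above can be sketched as follows. Only the constraint name and its three parameters (direction, minimum, maximum) come from the text; the Symbol class, the clamp and move_to methods and all numeric values are assumptions made for illustration. The prototype itself generates JavaScript, so this Python sketch mirrors the idea rather than the implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Movement:
    """Constraint with the three parameters named in the text."""
    direction: str   # "horizontal" or "vertical"
    minimum: float
    maximum: float

    def clamp(self, value: float) -> float:
        # Keep the requested coordinate inside [minimum, maximum].
        return max(self.minimum, min(self.maximum, value))


@dataclass
class Symbol:
    """Hypothetical symbol that carries its interaction constraints."""
    x: float
    y: float
    constraints: List[Movement] = field(default_factory=list)

    def move_to(self, x: float, y: float) -> None:
        # An axis without a Movement constraint is not movable at all.
        for c in self.constraints:
            if c.direction == "horizontal":
                self.x = c.clamp(x)
            elif c.direction == "vertical":
                self.y = c.clamp(y)


# Rectangle of the time-interval map: horizontal movement only.
rectangle = Symbol(x=10, y=40, constraints=[Movement("horizontal", 0, 500)])
rectangle.move_to(620, 90)   # x is clamped to 500, y stays at 40

# Composite symbol: vertical movement only.
composite = Symbol(x=0, y=40, constraints=[Movement("vertical", 0, 300)])
composite.move_to(50, 120)   # y becomes 120, x stays at 0
```

A symbol that should move freely would carry one horizontal and one vertical Movement constraint, matching the remark that two constraints are needed in that case.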
The abstract view interaction model describes which interactions are allowed upon the abstract view model, and the abstract visual interaction model states which user interactions are allowed upon the different symbols of the abstract visualization model. The mapping between these two models, which is aligned with the mapping between the abstract view model and the abstract visualization model, specifies how the permitted user interactions of the abstract visual interaction model affect the attributes of the entities of the abstract view model, so that round-trips are avoided in the first place (Re3.1). In the given example, the abstract visual interaction model prescribes that the rectangles can only be moved horizontally. Furthermore, the abstract view interaction model states that only the values of rectXStart and rectXEnd of inner entities can be changed. In addition, the mapping between these models describes that a horizontal movement of a rectangle symbol causes the rectXStart and rectXEnd fields of the corresponding inner entity to be updated to the current positions.
Figure 7: Abstract visual interaction model and visual interaction model. (a) VBB: Mapping of abstract visualization model and abstract visual interaction model. (b) Viewpoint: Mapping of visualization model and visual interaction model.
On instantiation, the VBB is configured to a viewpoint with the visualization configuration, which may contain parameters for the abstract visual interaction model, as can be seen in Figure 7(b). In our prototype, different VBBs or combinations thereof (Re4.1) can be used.
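How such a mapping might connect a permitted symbol interaction to the attributes of an entity is sketched below. The field names name, rectXStart and rectXEnd are taken from the example; the class InnerEntity, the set CHANGEABLE_FIELDS and the functions set_field and on_horizontal_move are hypothetical names introduced for this illustration.

```python
from dataclasses import dataclass


@dataclass
class InnerEntity:
    """Inner entity of the abstract view model (fields as in the example)."""
    name: str
    rectXStart: int
    rectXEnd: int


# Assumed encoding of the abstract view interaction model (cf. Figure 6):
# only these fields may be changed; name is read-only.
CHANGEABLE_FIELDS = {"rectXStart", "rectXEnd"}


def set_field(entity: InnerEntity, field_name: str, value) -> None:
    """Change a field only if the view interaction model permits it."""
    if field_name not in CHANGEABLE_FIELDS:
        raise PermissionError(f"{field_name} must not be changed")
    setattr(entity, field_name, value)


def on_horizontal_move(entity: InnerEntity, new_x: int, width: int) -> None:
    """Mapping: a horizontal move of the rectangle updates both fields."""
    set_field(entity, "rectXStart", new_x)
    set_field(entity, "rectXEnd", new_x + width)


entity = InnerEntity(name="Application A", rectXStart=20, rectXEnd=80)
on_horizontal_move(entity, new_x=50, width=60)
```

Since the permitted changes are known on the client, a disallowed change such as renaming the entity can be rejected locally, which is one way to read the statement that round-trips are avoided.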
After processing all entities of the view data model, the information gained from the view interaction model and visual interaction model of the fully configured viewpoint is used to enrich the created symbolic model with the permitted user interactions (Re2.2 & Re3.1), i.e. symbols are equipped with the corresponding interaction constraints.

5 Related Work

This section gives a brief overview of interactive visualizations. Some of these interactive visualizations constitute a visual domain-specific language, whereas others are mere drawing tools and are not bound directly to a data and/or information model.

JS Library D3⁸. With the JavaScript-based library D3, manifold visualizations can be created from structured data. An examination of D3's structure shows that a kind of view model exists even within this framework, specifying the structure of the data that can be loaded in order to generate a visualization. Because this view model is static, a mapping between an arbitrary information model and the view model has to be implemented separately. Thus only one kind of information model can be processed at a time; a second information model requires a new mapping to the view model. Besides the view model, D3 contains a visualization model constituting the symbols to be used for generating visualizations. Without an extension, D3 offers circles, squares, crosses, diamonds and triangles as symbols. D3's possibilities for user interaction are limited to rudimentary functions.
For example, it is possible to select a subset of the view data model and update the visualization with this extract, but the visualized data cannot be changed, and changes can be propagated neither to the view data model nor to the data model, let alone in a way that preserves data integrity with respect to the information model. With regard to the requirements identified above, a mapping between an arbitrary information model and the view model has to be implemented separately, which is why Re1 is only partly fulfilled, and end-user configuration (Re1.1) is not offered by D3. Re2 is not fulfilled entirely either, since D3 focuses on interactions that center around giving user feedback. D3 does provide functions for selecting subsets of the view data model (Re2.1). In general, changes to the symbolic model cannot be propagated back to a data model (Re2.2). Re3 is partially fulfilled: because D3 is a web framework written in JavaScript, a client/server architecture can be implemented, but communication with the server, and thus with the information model and data model, would have to be implemented separately. Hence, Re3 is not completely fulfilled. Additionally, as all client/server communication would have to be implemented separately, Re3.1 is considered not fulfilled. Furthermore, D3 offers different possibilities for defining predefined, parameterized visualization types (Re4) that can be reused or even combined with manageable effort in order to create new kinds of visualizations, so that Re4 and Re4.1 are fulfilled.

yFiles. yFiles [WEK02] is a Java class library for rendering and analyzing visualizations, especially graphs. To this end, it provides separate packages to analyze, lay out, or draw visualizations on a Java Swing form. An exemplary application showing all of the main features of yFiles is yEd, a tool for creating visualizations of graphs, networks, and diagrams⁹.
Within yFiles there exists a single static view model for all kinds of visualizations that can be rendered with this framework, mainly consisting of ‘nodes’ and ‘edges’. Additionally, a separate visualization model can be found for each type of visualization, determining the symbols to be used for visualizing the entities of the view model and for combining them in order to generate the symbolic model. Furthermore, there exists a visual interaction model for each visualization model, stating which interactions are permitted for each symbol of the generated symbolic model. To process an organization-specific information model and the corresponding data model with yFiles, a mapping has to be prepared; since the view model is static, Re1 is only partially fulfilled. Visualizations generated using yFiles can be parameterized in a user-friendly manner (Re1.1). As yFiles contains manifold possibilities for user interaction, Re2 is fulfilled, whereas selections of subsets of the data model are not included (Re2.1). Re2.2 cannot be fulfilled completely, as changes to the symbolic model are propagated to the view model but not to the data model. Since yFiles is implemented in Java, a solution can be implemented as an applet or with Java Web Start technology to transfer interaction constraints to a client (Re3). As yFiles offers no possibility for propagating changes to the data model, Re3.1 is not fulfilled. Only a few types of visualizations can be generated without substantially extending yFiles, and other visualizations cannot be predefined (Re4). Also, yFiles does not include any possibility of combining different visualization types (Re4.1).

⁸ See http://mbostock.github.com/d3/, last accessed: Oct. 26, 2011.
⁹ See http://www.yworks.com/de/products_yed_about.html, last accessed: Oct. 26, 2011.

Visio. Microsoft Visio is a desktop application for creating any kind of symbolic model, ranging from business processes in BPMN notation to construction blueprints. Besides creating these visualizations by hand, Visio offers the possibility of generating them from data files or databases in a predefined format. However, this option is severely restricted, as it uses a static view model and provides only a small number of parameters to set when querying an information model. Generating an organizational chart from a spreadsheet or database serves as an example: only a few parameters have to be mapped to fields that can be shown in the visualization, one of which indicates the relationship between the entities. In this context, Visio contains a static view model that is bound to a very limited information model. Besides this view model, there exist visualization models and visual interaction models within Visio for each kind of visualization that can be rendered. Visio’s potential for generating EAM visualizations can be assessed against the above-mentioned requirements. Visio is able to use different information models, thus Re1 is partially fulfilled. The same holds for Re1.1, because Visio offers a small set of possibilities for parameterizing visualizations. As Visio offers manifold possibilities for user interaction, Re2 is completely fulfilled, but Visio lacks the functionality of selecting subsets of the data model, hence Re2.1 is not fulfilled. The missing functionality of propagating changes within the symbolic model to the view model or the data model leads to Re2.2 not being fulfilled. As Visio is a desktop application, no client/server architecture can be implemented, so Re3 and Re3.1 are not covered and no statement about round-trips can be made.
Similar to yFiles, Visio includes a limited set of visualization types that can be parameterized in some cases but cannot be predefined (Re4), and it does not support combinations thereof (Re4.1).

Generic Modeling Environment (GME). The main purpose of GME is to create a (visual) domain-specific language (DSL) with a separated information model and its representation in the sense of a symbolic model. To this end, GME uses a metamodel, represented through MetaGME, that describes the main aspects of the employed information model. Furthermore, model-related constraints can be integrated using the Object Constraint Language (OCL). These constraints are equipped with priorities and corresponding actions that have to be performed when a constraint is violated. Evaluated against our requirements, GME can generate visualizations from any kind of information model (Re1), but only one concrete information model can be processed at a time and no configuration at runtime is offered, in particular no support for end-user configuration (Re1.1). Moreover, GME offers far-reaching possibilities for user interaction (Re2), but a selection of subsets of the data model is not included (Re2.1). All changes to the symbolic model are propagated to the data model, so that Re2.2 is fulfilled entirely. Just like Visio, GME is a desktop application, which is why Re3 and Re3.1 cannot be fulfilled for the aforementioned reasons. As this is one of GME’s main purposes, it includes functionality for predefined visualization types (Re4), but it does not provide any possibility for combining two or more of these types in order to create new kinds of visualizations (Re4.1).

Table 1: Visualization capabilities of the presented approaches (columns: D3, yFiles, Visio, GME, GMF; rows: Re1, Re1.1, Re2, Re2.1, Re2.2, Re3, Re3.1, Re4, Re4.1)

Graphical Modeling Framework (GMF). Like GME, GMF aims at providing a framework to construct a DSL with a separated information model and graphical representation of its entities. Given that GMF is based on the Graphical Editing Framework (GEF) [RWC11], it has the same capabilities for constructing and manipulating models as GEF. GMF offers a wide range of possibilities for implementing constraints, i.e. constraints can be formulated in OCL, as regular expressions, or implemented as Java code. Due to their similarity, GME and GMF fulfill the requirements to a similar degree. GMF also has the ability to process any kind of information model (Re1), but it has to be adapted to each information model that shall be used (Re1.1). GMF also provides manifold possibilities for user interaction (Re2), but does not include any functionality for selecting subsets of the data model (Re2.1). User changes to the symbolic model are propagated to the respective data model (Re2.2) while the integrity of the information model is guaranteed. Like GME, GMF is a desktop application, which is why Re3 and Re3.1 cannot be fulfilled. As GMF allows the definition of predefined visualization types but does not allow the combination of these types, Re4 is fulfilled, while Re4.1 cannot be fulfilled.

6 Conclusion and Outlook

After motivating interactive visualizations for EA management, this paper introduced a conceptual framework to realize interactive visualizations. We demonstrated how the framework can be used 1) to compare different visualization frameworks and 2) as a reference architecture to describe and implement visualization tools. For the latter, we showed how we implemented the different models and how they interact with each other for a specific visualization. The chosen visualization originates from practice and was found as a recurring pattern in the EA management domain.
In line with the design science approach of Hevner et al. [HMPR04], further research will detail and refine the models of the introduced framework by incorporating feedback from practical applications. We expect that for the EA management discipline only a limited number of relevant viewpoint building blocks and respective interactions have to be developed, while the remaining ones are combinations thereof. In particular, parameters, variability points of visualizations and further interactions relevant for industry have to be identified by empirical evaluation of the created design artefact. Currently, the prototypical implementation has been applied to a pattern-based case. Further research can broaden the scope these visualizations are applied to and also observe the interactions actually employed by end-users. A formal language for describing interactivity is currently missing. As of today, images with ‘arrows and boxes’, summarized as mock-ups, are used in combination with a full-text description of possible interactions so that behavior and semantics become clear to end-users. Further research could address this issue by incorporating the different ways currently used to describe interactive behavior of visualizations and by evaluating usability in end-user studies. Such a language could facilitate the way end-users evaluate mock-ups of interactive visualizations in general and in particular for the domain of EA management.

References

[AIS+77] Christopher Alexander, Sara Ishikawa, Murray Silverstein, Max Jacobson, Ingrid Fiksdahl-King, and Shlomo Angel. A Pattern Language. Oxford University Press, New York, NY, USA, 1977.
[AKRS08] Stephan Aier, Stephan Kurpjuweit, Christian Riege, and Jan Saat. Stakeholderorientierte Dokumentation und Analyse der Unternehmensarchitektur. In Heinz-Gerd Hegering, Axel Lehmann, Hans Jürgen Ohlbach, and Christian Scheideler, editors, GI Jahrestagung (2), volume 134 of LNI, pages 559–565, Bonn, Germany, 2008.
Gesellschaft für Informatik.
[BBDF+12] Marcel Berneaud, Sabine Buckl, Arelly Diaz-Fuentes, Florian Matthes, Ivan Monahov, Aneta Nowobliska, Sascha Roth, Christian M. Schweda, Uwe Weber, and Monika Zeiner. Trends for Enterprise Architecture Management Tools Survey. Technical report, Technische Universität München, 2012.
[BDMS10] Sabine Buckl, Thomas Dierl, Florian Matthes, and Christian M. Schweda. Building Blocks for Enterprise Architecture Management Solutions. In Frank Harmsen et al., editors, Practice-Driven Research on Enterprise Transformation, second working conference, PRET 2010, Delft, pages 17–46, Berlin, Heidelberg, Germany, 2010. Springer.
[BEL+07] Sabine Buckl, Alexander M. Ernst, Josef Lankes, Kathrin Schneider, and Christian M. Schweda. A pattern based Approach for constructing Enterprise Architecture Management Information Models. In A. Oberweis, C. Weinhardt, H. Gimpel, A. Koschmider, V. Pankratius, and Schnizler, editors, Wirtschaftsinformatik 2007, pages 145–162, Karlsruhe, Germany, 2007. Universitätsverlag Karlsruhe.
[BELM08] Sabine Buckl, Alexander M. Ernst, Josef Lankes, and Florian Matthes. Enterprise Architecture Management Pattern Catalog (Version 1.0, February 2008). Technical report, Chair for Informatics 19 (sebis), Technische Universität München, Munich, Germany, 2008.
[BGS10] Sabine Buckl, Jens Gulden, and Christian M. Schweda. Supporting ad hoc Analyses on Enterprise Models. In 4th International Workshop on Enterprise Modelling and Information Systems Architectures, 2010.
[BHR+10a] G. Bergmann, A. Horváth, I. Ráth, D. Varró, A. Balogh, Z. Balogh, and A. Ökrös. Incremental Model Queries over EMF Models. In ACM/IEEE 13th International Conference on Model Driven Engineering Languages and Systems, 2010.
[BHR+10b] Gábor Bergmann, Ákos Horváth, István Ráth, Dániel Varró, András Balogh, Zoltán Balogh, and András Ökrös.
Incremental Evaluation of Model Queries over EMF Models. In Model Driven Engineering Languages and Systems, 13th International Conference, MODELS 2010. Springer, 2010.
[BMM+11] Sabine Buckl, Florian Matthes, Ivan Monahov, Sascha Roth, Christopher Schulz, and Christian M. Schweda. Enterprise Architecture Management Patterns for Enterprisewide Access Views on Business Objects. In European Conference on Pattern Languages of Programs (EuroPLoP) 2011, Irsee Monastery, Bavaria, Germany, 2011.
[BMNS09] Sabine Buckl, Florian Matthes, Christian Neubert, and Christian M. Schweda. A Wiki-based Approach to Enterprise Architecture Documentation and Analysis. In The 17th European Conference on Information Systems (ECIS) – Information Systems in a Globalizing World: Challenges, Ethics and Practices, 8.–10. June 2009, Verona, Italy, pages 2192–2204, Verona, Italy, 2009.
[BMR+10a] Sabine Buckl, Florian Matthes, Sascha Roth, Christopher Schulz, and Christian M. Schweda. A Conceptual Framework for Enterprise Architecture Design. In Will Aalst, John Mylopoulos, Norman M. Sadeh, Michael J. Shaw, Clemens Szyperski, Erik Proper, Marc M. Lankhorst, Marten Schönherr, Joseph Barjis, and Sietse Overbeek, editors, Trends in Enterprise Architecture Research, volume 70 of Lecture Notes in Business Information Processing, pages 44–56. Springer Berlin Heidelberg, 2010.
[BMR+10b] Sabine Buckl, Florian Matthes, Sascha Roth, Christopher Schulz, and Christian M. Schweda. A Method for Constructing Enterprise-wide Access Views on Business Objects. In Klaus-Peter Fähnrich and Bogdan Franczyk, editors, GI Jahrestagung (2), volume 176 of LNI, pages 279–284. GI, 2010.
[BURV11] Gábor Bergmann, Zoltán Ujhelyi, István Ráth, and Dániel Varró. A Graph Query Language for EMF models. In Jordi Cabot and Eelco Visser, editors, Theory and Practice of Model Transformations, Fourth International Conference, ICMT 2011, Zurich, Switzerland, June 27-28, 2011.
Proceedings, volume 6707 of Lecture Notes in Computer Science, pages 167–182. Springer, 2011.
[CG02] Alessandro Cecconi and Martin Galanda. Adaptive zooming in web cartography. In Computer Graphics Forum, pages 787–799. Wiley Online Library, 2002.
[DQvS08] Remco M. Dijkman, Dick A.C. Quartel, and Marten J. van Sinderen. Consistency in multi-viewpoint design of enterprise information systems. Information and Software Technology, 50(7–8):737–752, 2008.
[HMPR04] Alan R. Hevner, Salvatore T. March, Jinsoo Park, and Sudha Ram. Design Science in Information Systems Research. MIS Quarterly, 28(1):75–105, 2004.
[IL08] Hannakaisa Isomäki and Katja Liimatainen. Challenges of Government Enterprise Architecture Work – Stakeholders’ Views. In Maria Wimmer, Hans Jochen Scholl, and Enrico Ferro, editors, Electronic Government, 7th International Conference, pages 364–374, Turin, Italy, 2008. Springer.
[Int07] International Organization for Standardization. ISO/IEC 42010:2007 Systems and software engineering – Recommended practice for architectural description of software-intensive systems, 2007.
[Lee99] Y.T. Lee. Information modeling: From design to implementation. In Proceedings of the Second World Manufacturing Congress, pages 315–321. Citeseer, 1999.
[Mat08] Florian Matthes. Softwarekartographie. Informatik Spektrum, 31(6), 2008.
[MBF+11] Stephan Murer, Bruno Bonati, and Frank J. Furrer. Managed Evolution. Springer Berlin Heidelberg, 2011.
[MBLS08] Florian Matthes, Sabine Buckl, Jana Leitel, and Christian M. Schweda. Enterprise Architecture Management Tool Survey 2008. Chair for Informatics 19 (sebis), Technische Universität München, Munich, Germany, 2008.
[MWF08] Stephan Murer, Carl Worms, and Frank J. Furrer. Managed Evolution. Informatik Spektrum, 31(6):537–547, 2008.
[Nie94] Jakob Nielsen. Usability Engineering. Elsevier LTD, Oxford, 1994.
[Ros03] Jeanne W. Ross.
Creating a Strategic IT Architecture Competency: Learning in Stages. MIS Quarterly Executive, 2(1), 2003.
[RWC11] Dan Rubel, Jaime Wren, and Eric Clayberg. The Eclipse Graphical Editing Framework (GEF). Addison-Wesley, 2011.
[Sch11] Christian M. Schweda. Development of Organization-Specific Enterprise Architecture Modeling Languages Using Building Blocks. PhD thesis, TU München, 2011.
[WEK02] R. Wiese, M. Eiglsperger, and M. Kaufmann. yfiles: Visualization and automatic layout of graphs. In Graph drawing: 9th international symposium, GD 2001, Vienna, Austria, September 23-26, 2001: revised papers, volume 129, page 453. Springer Verlag, 2002.
[Wit07] André Wittenburg. Softwarekartographie: Modelle und Methoden zur systematischen Visualisierung von Anwendungslandschaften. PhD thesis, Fakultät für Informatik, Technische Universität München, Germany, 2007.
[WR09] P. Weill and J.W. Ross. IT Savvy: What Top Executives Must Know to Go from Pain to Gain. Harvard Business Press, 2009.

Exploring usability-driven Differences of graphical Modeling Languages: An empirical Research Report

Christian Schalles, John Creagh, Michael Rebstock∗

Department of Computing, Cork Institute of Technology, Rossa Ave, Cork, Ireland
christian.schalles@mycit.ie, john.creagh@cit.ie

∗ Faculty of Economics and Business Administration, Hochschule Darmstadt University of Applied Sciences, Haardtring 100, 64295 Darmstadt, Germany
michael.rebstock@h-da.de

Abstract: Documenting, specifying and analyzing complex domains such as information systems or business processes has become unimaginable without the support of graphical models. Generally, models are developed using graph-oriented languages such as Event Driven Process Chains (EPCs) or diagrams of the Unified Modeling Language (UML). For industrial use, modeling languages aim to describe either information systems or business processes. Heterogeneous modeling languages offer their users different degrees of usability.
In our paper we focus on an evaluation of four heterogeneous modeling languages and their different impact on user performance and user satisfaction. We deduce implications for both educational and industrial use using the Framework for Usability Evaluation of Modeling Languages (FUEML).

1 Introduction

In industry, models specifying information system requirements or documenting business processes are developed using various graphical modeling languages such as the UML and EPCs. In general, graphical modeling languages aim to support the expression of relevant aspects of real-world domains such as information system structures or business processes [Lud03]. Almost all notations for software and business process specifications use diagrams as the primary basis for documenting and communicating them. The large number of available languages confronts companies with the problem of selecting the language most suitable to their needs. Besides functional and technical evaluation criteria, user-oriented characteristics of modeling languages are increasingly becoming a focal point of interest in research and industry [SW07]. In this research paper we report on a comparative study of the usability of selected modeling languages using the FUEML framework. The remainder of this paper is structured as follows: First, we analyze the theoretical background and state the hypotheses of this study. Second, we define usability in the domain of graphical modeling languages and define metrics for measuring usability. Subsequently, we present our research methodology and our findings. Lastly, we deduce implications based on our results and give an outlook on future research.

2 Theoretical Background

The variety of definitions and measurement models of usability complicates the extraction of suitable attributes for assessing the usability of modeling languages.
A usability study would be of limited value if it were not based on a standard definition and operationalization of usability [CK06]. The International Organization for Standardization (ISO) defines usability as the capacity of the software product to be understood, learned and attractive to the user when it is used under specified conditions [ISO06]. Additionally, the ISO defined another standard, which describes usability as the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use [ISO98]. The Institute of Electrical and Electronics Engineers (IEEE) established a standard which describes usability as the ease with which a user can learn to operate, prepare inputs for, and understand and interpret the outputs of a system or component [IEE90]. Dumas and Redish (1999) define usability as quickness and simplicity regarding a user’s task accomplishment. This definition is based on four assumptions [DR99]: 1) Usability means focusing on users, 2) Usability includes productivity, 3) Usability means ease of use and 4) Usability means efficient task accomplishment. Shackel (1991) associates five attributes with usability: speed, time to learn, retention, errors and the user-specific attitude [Sha91]. Preece et al. (1994) combined effectiveness and efficiency into throughput [PRS+94]. Constantine and Lockwood (1999) and Nielsen (2006) collected the attributes defining usability and developed an overall definition of usability attributes consisting of learnability, memorability, effectiveness, efficiency and user satisfaction [CL99, AKSS03]. The variety of definitions concerning usability attributes led to the use of different terms and labels for the same usability characteristics, or different terms for similar characteristics, without full consistency across these standards; in general, the situation in the literature is similar.
For example, learnability is defined in ISO 9241-11 as a simple attribute, ‘time of learning’, whereas ISO 9126 defines it as including several attributes such as ‘comprehensible input and output, instructions readiness, messages readiness’ [AKSS03]. As a basis for our subsequent survey, we adopt a usability definition for modeling languages in model development and model interpretation scenarios that includes the following attributes: The usability of modeling languages is specified by learnability, memorability, effectiveness, efficiency, user satisfaction and perceptibility. The learnability of a modeling language describes its capability to enable the user to learn applying models based on the particular language. The modeling language and its semantics, syntax and elements should be easy to remember, so that a user is able to return to the language after some period of non-use without having to learn the language, and especially the application of models developed with it, again. Effective model application should be supported by the particular language so that tasks can be accomplished successfully. Modeling languages should be efficient to use, so that a high level of working productivity is possible. Users have to be satisfied when using the language. For model interpretation scenarios, the language should offer convenient perceptibility regarding structure, overview, elements and shapes, so that a user is able to search, extract and process available model information in an easy way.

3 Model of Hypotheses

In the following we show our hypotheses supported by theory. The motivation for those hypotheses lies