QUAESTOR: EXPERT GOVERNED PARAMETRIC MODEL
QUAESTOR: EXPERT GOVERNED PARAMETRIC MODEL ASSEMBLING
Martin Th. van Hees

Invitation
to the public defence of my thesis on Monday 10 February 1997 in the Senaatszaal of the Aula of the Technische Universiteit Delft, Mekelweg 5, Delft.
The schedule is as follows:
13.00-13.20 Introduction to the research
13.30-14.30 Defence of the thesis
15.00-16.00 Reception
Martin van Hees, Bram Streeflandweg 108, 6871 HZ Renkum, 0317-310151
Paranymphs: Clemens van der Nat, Rob Zuiddam

Printed by: Grafisch Bedrijf Ponsen & Looijen BV, Wageningen, Netherlands

THESIS
for obtaining the degree of doctor at the Technische Universiteit Delft, on the authority of the Rector Magnificus Prof. dr. ir. J. Blaauwendraad, to be defended in public before a committee appointed by the Board of Deans, on Monday 10 February 1997 at 13.30 hours by Martin Theodoor VAN HEES, naval architect (scheepsbouwkundig ingenieur), born in Den Haag.

Propositions accompanying the thesis QUAESTOR: EXPERT GOVERNED PARAMETRIC MODEL ASSEMBLING
Delft, 10 February 1997, Martin van Hees

1. The development of computational models for conceptual design studies can be reduced to collecting and maintaining model fragments and their properties. Assembling these fragments into models, traditionally a programming activity, can be generalised effectively, so that one can concentrate on the quality and validity of the model fragments (this thesis).
2. We possess more usable knowledge than we know.
3. The great challenge for knowledge technology is to make optimal use of the strengths of man and machine. By regarding man not only as a user but also as an extremely useful component of a knowledge system, powerful and flexible solutions can be realised.
4. The first great confusion of tongues arose during the building of the Tower of Babylon; the second arose after the arrival of the computer.
5. Developing knowledge systems resembles restoring classic automobiles: it goes slowly, it demands precision, and really good parts are often hard to find.
6. If you are the only one who has brought a particular problem to a solution, there are three possibilities: a) the problem is very complicated, b) the cost-benefit ratio of a solution is too unfavourable, or c) there is no problem.
7. It is advisable to reintroduce a film censorship board that will protect us against the horror journalism of the television news.
8. The Dutch government overlooks an important source of tax revenue: on new computers, in addition to VAT, an excise duty could be levied similar to that on alcoholic beverages, the valid argument being the comparable addictive influence they exert.
9. The term parental leave (ouderschapsverlof) is wrong because it suggests that in that case you are at the office while others take care of your children.
10. A considerable reduction of the vehicle fleet and of the traffic congestion problem can be achieved by switching from private to collective car ownership.

This thesis has been approved by the promotors:
Prof. ir. J. Klein Woud, Faculty of Mechanical Engineering and Marine Technology, Technische Universiteit Delft
Prof. dr. H. Koppelaar, Faculty of Technical Mathematics and Informatics, Technische Universiteit Delft

Composition of the doctoral committee:
Rector Magnificus, chairman
Prof. ir. J. Klein Woud, Technische Universiteit Delft, promotor
Prof. dr. H. Koppelaar, Technische Universiteit Delft, promotor
Prof. ir. A. Aalbers, Technische Universiteit Delft
Prof. dr. J.M. Akkermans, Universiteit Twente
Prof. dr. ir. F.W. Jansen, Technische Universiteit Delft
Prof. dr. P. Sen, University of Newcastle (UK)
Dr. ir. R.A. Vingerhoeds, Technische Universiteit Delft

ISBN: 90-75757-04-2
Copyright © M. Th. van Hees, 1997. All rights reserved

To my parents, Sylvia, Maarten and Annemiek

QUAESTOR (Roman History) Magistrate acting as State Treasurer, paymaster etc.
The Concise Oxford Dictionary, © Oxford University Press 1964, 1976

QUAESTOR, title of a magistrate of ancient Rome. The earliest quaestors had juridical powers, but as the finances of Rome increased in complexity, two quaestors were appointed by the consuls (highest chief magistrates) to control the public treasury. After 447 BC the quaestors were elected annually by the legislative body known as comitia tributa. In 421 BC the office was opened to the plebs (common people) and the number of quaestors was raised to four. As the Roman Republic gained control of Italy and more provinces were acquired, additional quaestors were elected as financial assistants to the military commanders and provincial governors. Under Julius Caesar in the first century BC, there were 40 quaestors. The Emperor Augustus later reduced the number to 20, the usual number for the duration of the Roman Empire.
‘Quaestor’, Microsoft Encarta, © 1993 Microsoft Corp., © 1993 Funk & Wagnall’s Corp.

Cover: Jan J. Blok

CONTENTS

PREFACE
INTRODUCTION
1. NOTATION CONVENTIONS
2. AN ADVENTUROUS VOYAGE
2.1. Itinerary
2.2. Destination and Departure
2.3. Navigation
2.4. Cross-roads and Choices
2.5. Arrival?
3. NUMERICAL DESIGN MODELLING
3.1. Modelling: Why and What?
3.2. Focus of Representation
3.3. Representation Schemes
3.4. Modelling: Applied Languages and Tools
3.5. What is Available?
3.6. The Missing Link: Model Assembling
4. DESIGN PROBLEM SOLVING IN NAVAL ARCHITECTURE
4.1. Design Practice: Reasoning and Modelling
4.2. Numerical Conceptual Design: The Concept Exploration Model
4.3. Beyond Concept Exploration Models
5. KNOWLEDGE & KNOWLEDGE ACQUISITION
5.1. Knowledge Used in Parametric Modelling
5.2. (Deriving) RELATIONs and CONSTRAINTs
5.3. TELITAB: A Generic Parametric Data Format
6. QUAESTOR: AN INTRODUCTION
6.1. The System
6.2. User Interface
6.3. Tasks and Competence
6.4. Expert Questions
6.5. A Simple Application Example
6.6. The Concept Variation Model
6.7. CVM Development Aspects
7. PARAMETRIC MODELS: THE PARTS
7.1. Domain Description and Data Model
7.2. Control Knowledge
7.3. Syntactic Elements and Aspects
8. PARAMETRIC MODELS: THE ASSEMBLING
8.1. Development strategy
8.2. Inferences
8.3. Solver Strategies
9. IMPLEMENTATION ASPECTS
9.1. Data Management
9.2. Interpreter
9.3. Recursion
9.4. Performance Issues
10. QUAESTOR APPLICATIONS
10.1. SUBCEM
10.2. The TROIKA Mine Sweeper
11. DISCUSSION AND CONCLUSIONS
11.1. Focus and Perspective
11.2. Conclusions
APPENDIX A: Glossary of Terms
APPENDIX B: On Merging Numerical and Functional Design Knowledge
Bibliography
List of Figures
List of Frames
List of Tables
Summary
Samenvatting
Acknowledgement
Curriculum Vitae

PREFACE

The work presented in this thesis started from the observation that in the conceptual design of ships many alternative solutions have to be considered. Vital and time-consuming sub-tasks in this process are knowledge acquisition, i.e. gathering and structuring the relevant knowledge, and the numerical modelling of conceptual design aspects. My primary target was to gain understanding of the nature of numerical design knowledge and of the design modelling task, and to develop a general-purpose approach and tool, named QUAESTOR.
The basic idea was simple enough, but making it work and fulfil a need has been the primary challenge and drive for this work. This thesis attempts to summarise and formalise what has been done and to present the main aspects and ideas that evolved during this development. QUAESTOR is a knowledge-based system that assists the designer with knowledge and experience management, numerical (design) modelling and computations, and is applied in the domain of conceptual ship design.

INTRODUCTION

The primary aim of this work is to support ship designers in the early phase of design by improving access to and control over design related knowledge. Access to and control over numerical and non-numerical design knowledge are considered prerequisites for successful and efficient design modelling activities. The results of this work apply to other engineering fields as well. Real world cases are selected from conceptual naval ship design to illustrate the problem domain.

The key problem addressed is model assembling, i.e. collecting and arranging numerical model fragments into decision support and analysis models. The conceptual basis and technical details of the knowledge-based system QUAESTOR are presented. QUAESTOR performs the task of numerical model assembling not autonomously but in co-operation with the designer. The guiding principles are that design (modelling) is a learning process, that the role of a designer is to make decisions [Mistree, 1990], and that the modern approach towards design is based on systems thinking and on using computers as partners in the design process [MacCallum, 1985 and Smith, 1992]. The impact of automated modelling on design in general is addressed, focusing on the assembling problem in depth. It is the author’s opinion that computer assisted assembling of numerical problem representations is one of the key issues in any attempt to advance this type of design as a professional activity.
The research for this thesis was performed bottom-up. It is observed that the application of the knowledge-based approach of this thesis has a large impact on the way designers apply and manage their knowledge during the design process.

In chapter 2 the problem of ship design is described in some detail. As an introduction to the subject, a brief historical overview is provided, guiding the reader along some important steps taken and choices made during the development of QUAESTOR. The development strategy is briefly discussed. Finally, the question is addressed in which way the ship design process is affected by the application of this tool.

Chapter 3 places my tool within a wider scope of tools and approaches in naval architectural and engineering design: which representation schemes are and can be used in numerical design modelling, which tools and languages are available for that purpose, and which are actually used in practice? Various literature sources on design modelling, originating from engineering sciences and computer science, are used to position the modelling strategy that has been developed. We arrive at the viewpoint which is the central issue of this thesis: solving a generic numerical design modelling problem.

In chapter 4 the process of reasoning in conceptual design is addressed. Traditional tools for conceptual design are discussed. A knowledge-based numerical modelling strategy is introduced and compared with the traditional design modelling practice as described in this chapter, both in terms of application and model development.

Chapter 5 presents the primary forms of knowledge, their coherence, structure and acquisition. A generic numerical data format is introduced.

Chapter 6 provides an introduction to QUAESTOR. The global system architecture and the primary system components are presented. An example is given in the form of a highly simplified conceptual ship design model. Finally, the knowledge-based design model or Concept Variation Model is introduced.
Chapter 7 introduces the numerical parametric domain. The structure and properties of the model fragments are discussed and the extracted control knowledge is described. Finally, some syntactic and semantic aspects of the instruction set are discussed. Simple examples are used to support the explanation.

Chapter 8 presents an in-depth description of the model assembling process and the (numerical) strategies involved in executing the assembled models. Some attention is paid to the development strategy of this process.

Chapter 9 addresses general implementation aspects and some particular ones of the primary system functions.

In chapter 10, my contribution to ship design and analysis is illustrated by two real world applications in naval ship design.

In chapter 11 the model assembling technique is reviewed in terms of limitations and focus for further research and development. An extension of the application domain is proposed by introducing functional design aspects, which are discussed in more depth in Appendix B. Other aspects discussed are user friendliness, user competence and the notion of input driven modelling. Finally, I present the principal conclusions and insights that were obtained in this work.

In Appendix A a glossary is provided of the most frequently used terms and concepts.

1. NOTATION CONVENTIONS

Once a word has been allowed to escape, it cannot be recalled.
Horace, Epistles

If a generally known concept is used in this thesis, its common name is applied in the text; for example, ‘parameter’ is a concept with a well defined and generally accepted meaning. In such cases normal print without capitals is used. Throughout this thesis the terms ‘relation’ and ‘constraint’ are used to indicate different concepts in the literature review, in design theory and in the developed knowledge-based technique. They are written in normal print if their meaning is related to the subject of the paragraph.
The notation RELATION and CONSTRAINT (Times Roman small capital letters) is used when referring to the named concept in the developed software. Another example of a named concept is REFERENCE, indicating the name of the slot containing information in text format in a knowledge base. Terms beginning with a capital letter refer to named components of the developed technique, e.g. ‘Modeller’ to indicate the reasoning mechanism of the program. The applied font is Times New Roman.

In all cases where values, expressions, parameters, data types, syntactic elements and control attributes are indicated, the Courier New font is applied, often in a slightly adapted character size. Values can be numerical, e.g. 0.354, or the Boolean TRUE or FALSE, and for expressions DETERMINED or PENDING.

Although this work primarily deals with numerical expressions, only very few of them are found in the text. The only purpose of these few and often extremely simple examples is to elucidate the mechanisms and inferences involved in numerical modelling, i.e. to show the steps in the modelling process. Therefore, a list of symbols is not considered appropriate; the parameters used in the examples are described locally in context.

2. AN ADVENTUROUS VOYAGE

What is a greater challenge: to accomplish a journey around the world or to build a ship for that purpose?
André Citroën

What does a technical person do if frequently confronted with similar problems? This chapter tells the story of a class of problems, its identification, the desire for additional tools, the steps towards a solution and finally the tool which was obtained. The problem is parametric modelling in naval architecture and the steps are the development of a tool for this purpose: a knowledge-based system. Finally, in which way is the design process affected by the application of this tool?

2.1. Itinerary

In 1983, after finishing a degree in Naval Architecture at the Delft University of Technology, I started my career at a shipyard design office. My first assignment was the conceptual design of a 1200 ton naval submarine. After this project I was involved in a variety of ships and conceptual designs. An important lesson learned in this period is that experience is important, but that having good access to the accumulated knowledge of the yard and the supporting knowledge infrastructure is even more important. The availability and accessibility of this knowledge at the shipyard was rather limited, not least due to the fact that the most experienced senior designer retired two months after my arrival. Another lesson was that the variety of problems to be solved in such an environment is tremendous. A bright spot, however, was the fact that similarities seemed to exist between these problems and the approaches to solve them.

Drawing on practical experience with software development for solving (in particular) numerical problems, I started to use the similarities between the various problems. I came up with the idea that a method or a tool could give me the use of an intelligent ‘book shelf’, in other words: put knowledge in the form of procedures and data on the shelf and make the shelf answer any question which is embedded in this collection of knowledge. Inquiries at that time about the existence of such tools did not reveal the name of any software system supporting this task in a way meaningful to naval architecture.

In 1986 I became project manager in the Ship Powering Department of the Maritime Research Institute Netherlands (MARIN). Important aspects of this work were industrial consultancy and the interpretation of results of model experiments. It appeared to me that the accessibility and availability of knowledge in this environment and the flexibility of its application to every-day problems are of similar importance.
The approach towards problem solving which had taken shape in my mind at the yard seemed to be fully applicable to an important proportion of the knowledge domain dealt with at MARIN.

An important part of my work as a naval architect consists of the formulation, gathering and application of design knowledge, either in the form of numerical methods or as non-formal heuristic rules. Since computers are well capable of performing calculations, I concentrated on the numerical aspects of design. The design problems referred to in the sequel are therefore mainly computational problems. In the process of numerical design problem solving a number of steps can be distinguished:
• defining the problem
• collecting data related to the problem
• choice of the rules (e.g. numerical expressions) and facts which describe the problem area
• checking the applicability and validity of these rules
• assembling a numerical model
• computing/iterating towards results
• checking the realism of the results obtained

Although the computer considerably facilitates the application of numerical expressions and methods in the design process, considerable effort is still involved in incorporating them into computer programs and in maintaining these programs. In comparison with other technical disciplines, empirical calculation methods are often used in naval architecture. In spite of the significant progress in several areas, the large number of sub-areas of high physical complexity remains a characteristic of naval architecture. This forces the use of simplified parametric representations, particularly in the early design phase. Another characteristic is the availability of such knowledge in literature and company records. The conventional approach to make these methods available for practical application is either to use paper and a calculator or to write a dedicated computer program for this purpose, e.g. in FORTRAN or in a spreadsheet program.
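As a minimal sketch of what such a dedicated program looks like, the Python fragment below hard-wires one empirical relation, the Admiralty coefficient formula, in a single direction. The function names, the units and the idea of using this particular formula are illustrative assumptions and are not taken from the thesis. Answering the inverse question (which speed can be attained with a given power?) requires a second, hand-written routine, although it relies on exactly the same knowledge.

```python
# Illustrative sketch only: names, units and the choice of formula are assumptions.
# Relation used: P = D^(2/3) * V^3 / C  (Admiralty coefficient form)


def required_power_kw(displacement_t: float, speed_kn: float,
                      admiralty_coeff: float) -> float:
    """Fixed input (displacement, speed, coefficient), fixed output (power)."""
    return displacement_t ** (2.0 / 3.0) * speed_kn ** 3 / admiralty_coeff


def attainable_speed_kn(displacement_t: float, power_kw: float,
                        admiralty_coeff: float) -> float:
    """The same relation, rearranged by hand to answer the inverse question."""
    return (power_kw * admiralty_coeff
            / displacement_t ** (2.0 / 3.0)) ** (1.0 / 3.0)
```

Every further question asked of the same relation (for instance, which coefficient matches a known ship) would need yet another rearranged routine.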
In this conventional approach, some fundamental problems continue to exist:
• Most computer programs are ‘black boxes’, not supporting or stimulating the understanding of the user (‘input’ is provided, ‘output’ is obtained). The experience of the designer does not affect the results obtained.
• The ‘knowledge’ of the problem domain is integrated in the program. Programming experience is required to modify any part of the knowledge in the program and the way it is applied in the program.
• In most cases a limited number of calculation methods is available in the form of a computer program. These programs are developed to perform certain tasks in a fixed sequence with fixed input and output. If input is not available in the proper format, hardly any means are available either to convert the input into the correct format or to derive the missing input in an automated way. Additional calculations, either manual or by computer, are often necessary.

These aspects are considered disadvantages of the conventional approach, which basically amounts to computer programs solving one particular problem for one particular set of input. These programs contain knowledge which may be used to solve other problems in a similar context, or which even duplicates (parts of) other computer programs. My aim is to facilitate the application of numerical and related knowledge in a design and engineering environment. In such environments many different design and analysis problems are attacked by using the same set of numerical models and model fragments.

To illustrate the disadvantage of the conventional approach, imagine a computer program predicting the required propulsive power at a given ship speed. If the effect of hull form coefficients or dimensions on the installed power is to be explored in the conceptual design phase, the program has to be used several times with varying input.
Parameter variations requiring some input related to other design aspects, which implies further (manual) calculations, are common practice in design. This is hardly attractive if these various aspects cannot be evaluated concurrently. Another example is the design of the propeller. The prediction program determines the optimum pitch and diameter of the propeller, whereas one might wish to know the propulsive performance achieved with a given propeller (i.e. with fixed pitch and diameter). The latter cannot be obtained since the input and output, as well as the algorithm of the prediction program, are fixed. The program implicitly contains all knowledge required to solve that problem, but the fixed input and output do not allow it.

Concluding: a drastic improvement of the flexibility of the numerical modelling of design problems was desired. The question was whether or not the following could be achieved:
• To capture design rules separate from a problem based algorithmic context, together with background knowledge and pragmatic rules about their application. Can these rules and knowledge become explicitly available for solving computational problems?

2.2. Destination and Departure

The problem definition is to develop a computer program enabling a designer to efficiently use several different calculation methods in varying sequences and with varying starting points. This program should support a highly flexible use of these methods. After an initial study of literature and/or company records, users (e.g. naval architects) should be able to store the calculation method (i.e. the knowledge) in a kind of database for immediate use. The system should make it possible to solve any problem that fits within these knowledge components.

The following section provides a summary of the problem as formulated in 1987 at the start of this thesis work.
My experience in the field of ship design and consultancy and the use of computers for this purpose made clear that up to 1987 not much progress had been made in ‘computerising’ conceptual ship design. Conceptual design was an activity based on experience and intuition in which a variety of tools were sequentially applied. I learned that in the conceptual phase of design a designer employs:
• Literature data
• Handbooks
• Company records (such as previous designs)
• Experience
• Numerical methods

In 1987, the application of computers in the early design phase was restricted to analysing properties of the design and to drawing. Nowadays, integrated systems are available in the domain of ship design in which a large number of analysis and prediction tools are integrated and use a common data base [Andrews, 1992]. However, CAD systems are still hardly used in the conceptual phase of the design; these systems are applied in the more advanced phases of the design, simply because they require a design to make analyses and predictions. This implies that major decisions about e.g. main particulars must be taken beforehand. Nowadays, CAD tools are emerging which can be used in these earlier phases of design. An example is the L/GRAND system, developed on the basis of an associative geometric modelling system [Laansma, 1995]. Although these systems potentially improve the productivity of a designer, they provide limited support in finding and validating main particulars using knowledge of the ship’s life cycle. The main particulars and/or clear and unambiguous design constraints are assumed to be known at the start.

The engineering or detailed design phase is a substantial cost factor in the realisation of a maritime structure. This is the primary reason for the existence of large and advanced engineering packages that deal with modelling, materials management and production.
In view of the time gained, these systems are considered to be essential means of production, similar to welding and flame cutting equipment.

Until 1987 only a few computerised methods had been proposed for the conceptual phase of the design, and this situation has not changed much since. A pragmatic reason is that conceptual design forms only a negligible proportion of the total costs and effort involved in the realisation of the maritime object. For a typical naval ship building program, e.g. a series of two frigates, the approximate expenditure is 1% in the conceptual design phase, 5% in engineering design and 94% in construction. The percentages only apply to platform design and construction; weapon systems are not included in these figures. So, from a merely economical point of view, major investments in instruments that facilitate this design phase are hard to defend. However, the major reason for this lack of conceptual design tools is their high complexity, due to the fact that procedures and the sequence of decisions in conceptual design are not fixed at all.

An important motivation for developing such tools is their contribution to design quality. In the early phase major design decisions are taken that finally affect the building costs and the operational and economical value of the object [Meek, 1992]. These decisions often have to be taken on the basis of limited knowledge of the solution space. It is generally recognised that in the early design phase 80 percent of the realisation cost is fixed, whereas in the detailed design only about 20 percent is affected (the 80/20 rule [Gutierrez-Fraile, 1993]). The importance of design in early development phases of large projects is illustrated in Figure 2.1 [Wolff, 1994].

In Figure 2.1 two cost lines are drawn for typical high cost shipbuilding projects.
The line representing the actual expenditure shows that in a project most of the money is spent in later phases where actual building activities occur. The other cost line represents the cost fixed by design decisions. The third line represents the level of design knowledge; its purpose is to show that the designer is forced to take design decisions (Influence on ship cost line) before he has the required knowledge. An effective design model should be able to improve on this situation.

Figure 2.1: Cost and knowledge as a function of time

Summarising, the effects of decisions taken in an early design phase often become clear only in a later design phase (see [Meek, 1992] for a number of examples). In most cases, it is then difficult or even impossible to introduce major changes in the concept.

Using the computer as a real ‘partner’ in the conceptual design of arbitrary ship types is still hardly possible, although various ‘partnership’ approaches have been pursued by researchers since the late seventies [MacCallum, 1985-1990, Mistree, 1988, Brown, 1989, Bremdal, 1985]. Existing computerised solutions for concept design were developed for particular vessel types such as bulk carriers or dry cargo ships [Georgescu, 1989].

In the traditional approach towards design one tends to select the first alternative fulfilling the requirements. Design optimisation is mostly restricted to specific details and is not performed for the concept as a whole. The efficiency and quality of the output of the design process can be improved in different ways. In the first place we can improve the tools for the job. Above, we have stated that the accessibility of available design knowledge in different forms is of great importance to designers. In the second place, we can analyse the taxonomy of the design process and the applied forms of knowledge and extract the logic and mechanisms behind the reasoning process.
These mechanisms, implemented in computer software, can be applied for design decision support.

Creativity plays a major role in conceptual design and is hardly supported by existing software. Developments should focus on tools which support and stimulate this typical human capability. It is cautiously concluded that, in view of the apparently limited number of standard methods and techniques (implemented in computer software), design in this phase is an ‘art’, being according to the Oxford Dictionary: Skill, especially human skill as opposed to nature.

Apart from this somewhat early conclusion, the observations supported the idea that at least one important design sub-task, the use of numerical formulations, may be facilitated if it is approached from a more ‘systems thinking’ point of view. An artefact is considered to be a system that can be described through a number of attribute/value combinations, viz. quantities (dimensions, masses) and qualities (colour, material). In many cases, however, qualities can be expressed as numbers which can be stored and used as quantities. In this simplified world all systems can be described through parameters. Between these describing parameters relationships exist which may be expressed in a numerical form. These expressions can be logical, compelling (‘hard’ or based on physics) or empirical (‘soft’) and can be written as an equality or an inequality.

In the simplest case, an unknown parameter in an expression that is an equality can be solved if the other parameters in the expression are known and if the expression is assumed to be valid. Whether expressions (equalities and inequalities) are TRUE or FALSE can be determined if all parameters in the expression have known (DETERMINED) values.

In the design of complex systems many such relationships are involved: empirical, physical and spatial ones, as well as (legal) regulations, constraints and requirements.
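The two mechanisms described above, solving an equality for whichever parameter is unknown and evaluating an expression only once all its parameters are known, can be sketched in a few lines of Python. This is a hedged illustration and not the method used in QUAESTOR: the helper names `solve_for` and `evaluate`, the bisection search, and the example relation (displacement = rho * cb * L * B * T) with its numbers are all assumptions chosen for the example.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

Values = Dict[str, float]


def displacement_residual(p: Values) -> float:
    """Equality written as a residual: disp - rho * cb * L * B * T = 0."""
    return p["disp"] - p["rho"] * p["cb"] * p["L"] * p["B"] * p["T"]


def solve_for(residual: Callable[[Values], float], known: Values,
              unknown: str, lo: float, hi: float, tol: float = 1e-10) -> float:
    """Solve the equality for `unknown` by bisection on the bracket [lo, hi]."""
    def f(x: float) -> float:
        return residual({**known, unknown: x})

    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0:
        raise ValueError("residual does not change sign on the bracket")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if abs(f_mid) < tol or hi - lo < tol:
            return mid
        if f_lo * f_mid <= 0:
            hi = mid                  # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid     # root lies in the upper half
    return 0.5 * (lo + hi)


def evaluate(condition: Callable[[Values], bool], needed: Sequence[str],
             values: Values) -> Tuple[str, Optional[bool]]:
    """Return (status, value): PENDING until every needed parameter is known."""
    if any(name not in values for name in needed):
        return ("PENDING", None)
    return ("DETERMINED", bool(condition(values)))
```

With the same residual function, `solve_for(displacement_residual, known, "T", 0.1, 20.0)` yields the draught for a given displacement, while passing `"disp"` as the unknown yields the displacement for a given draught; no routine has to be rewritten to change the direction of the calculation.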
Requirements and constraints can often be represented in a numerical form, either by equalities or inequalities. To improve the overall picture of the design process it is attractive to obtain an inventory of design parameters and the relationships between them. A storehouse of such information can be retrieved from, for instance, technical literature and previous designs. In this thesis a numerical expression is referred to as a ‘RELATION’. Each RELATION should be stored together with a reference in which the origin (author, which regulatory body, when issued, etc.) is provided, its formulation expressed in parameters from the inventory and some attributes (hard/soft), values and dimensions of the applied parameters, etc. It is essential to know whether a RELATION, i.e. a numerical model (fragment), is applicable to the current case. Therefore, it should be possible to attach to any RELATION (equality) one or more relationships, being either equalities or inequalities, which express the ‘applicability’ or ‘validity’ of the RELATION(s) to which they refer. In the sequel these relationships expressing the validity of RELATIONs are referred to as CONSTRAINTs. The term CONSTRAINT is preferred although it conflicts with the meaning of the word in some literature sources [Leler, 1988, Guesgen, 1992]. A CONSTRAINT is viewed as a limitation on the validity of the producer of the value of parameters (i.e. the RELATION). A CONSTRAINT is either TRUE or FALSE and either DETERMINED or PENDING. A constraint with the meaning ‘validity’ is sometimes referred to as a second order constraint [Leler, 1988].

Without additional means it is impossible for a designer to have a clear overview of all available numerical relationships. Theoretically speaking it is possible to incorporate all these RELATIONs into a single computer program. However, this computer program will be large, complicated and expensive to develop, manage and maintain.
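To make the bookkeeping described above concrete, the following is a minimal sketch in Python of a RELATION stored with its reference, a hard/soft attribute and attached CONSTRAINTs. It is an illustration only and not the QUAESTOR data structure: the class names, the field names and the example formula with its Froude number range are hypothetical, and the two CONSTRAINT attributes (TRUE/FALSE and DETERMINED/PENDING) are collapsed into a single three-valued status for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Values = Dict[str, float]  # parameter name -> known (DETERMINED) value


@dataclass
class Constraint:
    """Validity condition attached to a RELATION (hypothetical layout)."""
    parameters: List[str]            # parameters the condition refers to
    test: Callable[[Values], bool]   # the equality or inequality itself

    def status(self, known: Values) -> str:
        # PENDING until every parameter in the condition has a value.
        if any(p not in known for p in self.parameters):
            return "PENDING"
        return "TRUE" if self.test(known) else "FALSE"


@dataclass
class Relation:
    """A numerical expression plus its origin and validity conditions."""
    name: str
    reference: str                   # origin: author, regulatory body, year
    hard: bool                       # True: physics based, False: empirical
    constraints: List[Constraint] = field(default_factory=list)

    def validity(self, known: Values) -> str:
        states = [c.status(known) for c in self.constraints]
        if "FALSE" in states:
            return "FALSE"
        if "PENDING" in states:
            return "PENDING"
        return "TRUE"                # also when no CONSTRAINT is attached


# Hypothetical example: an empirical formula assumed valid for 0.1 <= Fn <= 0.3.
example = Relation(
    name="resistance_estimate",
    reference="(illustrative, no real source)",
    hard=False,
    constraints=[Constraint(["Fn"], lambda v: 0.1 <= v["Fn"] <= 0.3)],
)

print(example.validity({}))            # Fn not yet known
print(example.validity({"Fn": 0.2}))   # inside the assumed range
print(example.validity({"Fn": 0.5}))   # outside the assumed range
```

A RELATION whose validity is PENDING can neither be accepted nor rejected yet; in this sketch it simply waits until the parameters named in its CONSTRAINTs obtain values.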
It is not difficult to imagine the advantages of having these RELATIONs immediately available for use. If values or estimates of all computable, yet unknown design parameters can be obtained at any moment during the design, decision support is considerably improved. To enable this availability of (design) relations and parameters an information system is desired. The following sections provide a summary of some important choices, decisions and findings in the course of building a system called QUAESTOR which enables storage, maintenance and application of design knowledge in the indicated ways.

2.3. Navigation

QUAESTOR was designed in two main components. The first component is the knowledge Editor, containing all facilities dealing with storage, maintenance and retrieval of relevant design knowledge. The second component is the numerical Modeller, containing the numerical and ‘intelligent’ components of the program, which is also able to work with the knowledge retrieval facilities. The Modeller requires functionality from the Editor, meaning that in terms of development a large overlap exists between the two main functions of the system, in particular by virtue of a common interface. These two desired system functions were clear from the beginning and remained unaltered.

The challenge was to build a system for which no examples were available. By developing components which were definitely required and by obtaining detailed insight into their properties and interactions, the outline of the final system became clear in the course of time. The limits of four successive generations of PCs have been an important stimulus to remain on track, since they limited the attraction of taking cross-roads to goals not leading to the final one: a system which can support the numerical design modelling task.
A typical engineering approach is to start with a description of a system on a functional and conceptual level (what should the system be able to do and what will it look like?), subsequently to decompose the system into components, develop the components and assemble them into a working system. Finally the system’s performance should be tested. The approach is basically top-down: a general solution is found and components are specified that help meet the overall expectations (a bottom-up approach to problem solving is to decompose the problem and find solutions at a detailed level [Hagen, 1993]). Prototyping as a means to gain insight into the problem domain, solutions and the behaviour of users is a widely recognised developmental strategy, particularly for early development stages. It is also suitable for more advanced development stages, although it is then difficult to manage. Case studies are performed to extract knowledge about the behaviour and competence of users and of software. User feedback obtained from numerous prototypes (in fact intermediate versions) has been the primary driver for the development since the first working prototype [Brouwer, 1990]. Concluding, the development approach of QUAESTOR is defined as Top-down Prototyping.

2.4. Cross-roads and Choices

QUAESTOR has been developed between early 1987 and 1996. The first three years were spent exploring the knowledge domain and developing a suitable data structure and data management system. In this period initial versions of important system functions such as screen managers, formula parser, interpreter and solver were also built. Since departure in 1987 a record was kept of most development aspects and decisions made. By consulting this project dossier a selection has been made of the key decisions and aspects which had a large impact on the final result.

I. Knowledge domain

The basic assumption is that numerical relationships (referred to as RELATIONs) can be used to fix the connection between quantities (referred to as parameters) which describe the system or design at hand. RELATIONs can be used to compute values of parameters on the basis of values of other parameters. A RELATION can be active in a solution if its validity has been checked and found true. The (numerical) validity is expressed by zero or more CONSTRAINT(s). These RELATIONs, parameters and CONSTRAINTs are considered to form a network. RELATIONs are assumed to be continuous functions if they are used TwoWay, i.e. as an equation. Discontinuous RELATIONs can mainly be used OneWay, i.e. as a function, or TwoWay within a particular interval fixed by CONSTRAINTs.

II. Multi Case

In the domain of engineering and design it is important to compare alternative solutions. This means that the system should be able to generate output of multiple cases on the basis of one or more varying input parameter(s). These Cases are not necessarily computed from the same system of equations, since that system can change when RELATIONs become invalid. This decision greatly affected the design of the workbase (storage of problem related data), of the Modeller (which performs the actual reasoning) and of the numerical data model. It has complicated the development considerably but has been found to be one of the key features of the program. It was rapidly decided to apply only double precision numbers and no reals or integers. Although initially various types were considered, this distinction appeared to be a complicating factor and of limited practical value. A single generic data structure (TELITAB) is applied throughout the program (see section 5.3).

III. Language and hardware

In 1987 my experience in computer languages was confined to ALGOL60, FORTRAN77 and BASIC.
ALGOL60 was outdated and FORTRAN77 is a language mainly for computational purposes which offers limited flexibility in string manipulation. Being familiar with GW-BASIC from the early days of personal computing, the decision to apply one of the structured BASICs that were new at the time was easily made (Borland Turbo BASIC). The powerful and above all simple string manipulation of BASIC made it very suitable for my purpose. One of its important features is the ability to use strings as dynamic arrays which can be extended and reduced without reallocation (see section 9.1). The drawback, however, was that in the early nineties the support of BASIC by the software industry seemed to evaporate. An attempt to perform an automatic BASIC to C translation failed because of the excessive use of string operations and recursion, which was clearly beyond the limits of the equivalent operations programmed in C as provided in the object library of the translator. BASIC has made a come-back in Microsoft’s Visual Basic (MVB), which is increasingly applied by the software industry. A conversion from Turbo Basic to MVB has been successfully performed. This removed important limits with regard to memory use and code size. Obviously, the applied hardware is the PC under MS-DOS. This choice, mainly imposed by availability, has forced developments towards optimisation of speed and code size.

IV. Interface design

At the time of departure, character-oriented user interfaces were state-of-the-art. A non-windows solution has been developed, consisting of a screen for database management, a number of lists for parameters and expressions and a cell-oriented workbase manager which presents the problem related input and output. The basic concept of the interface has not changed since 1989. Imposed by code and memory limits, the same interface components are used for database maintenance, browsing and the dialogue.
The key issues in the development of the interface have been:
• to present or make accessible all knowledge which is relevant at a particular moment (white-box approach)
• to provide extensive browser functions, also during a dialogue
• to enable maximum ability to interfere during a dialogue

V. Database design

An important starting point for the development was the desire to integrate knowledge management with its application. Initially it was attempted to store the knowledge in sequential, ASCII-type tables according to a simple relational database concept. Any attempt to perform reasoning activities with such database concepts failed, simply because they were too slow to be practical. It finally became clear that the network-type knowledge could only be used efficiently if it is stored in a database organised as a network. The database finally became a set of free format interconnected binary records enabling high performance queries. This database system was the key element required for building an inference engine (Modeller) that performs reasoning tasks within acceptable response times. Apart from performance aspects, hardware limits in terms of available free memory have also been driving factors during the development of the database system. In view of these limits and the desire to manage and use large networks, both the network database and the computed results were moved to the disk. The disk is used as virtual memory, thus trading speed for memory.

VI. Parser and knowledge quality control

To avoid errors and inconsistencies in the knowledge base a parser and syntax checker have been developed. The parser separates the expression in string format into a sequence of numbers, parameters, operators and functions, maintains the pointers between the new expression and the parameters in the knowledge base and vice versa, and introduces new parameters in the knowledge base if necessary.
The syntax checker is able to detect the most obvious errors in numerical expressions, such as the number of parentheses and their nesting. Also, the combination and sequence of numerical and relational operators and the availability of numerical data referred to by special functions (section 7.3) are checked. Since QUAESTOR deals with numerical expressions there is a serious risk of redundant information, i.e. dependency between RELATIONs, which may lead to singularities in the matrix representing (a part of) the assembled model. Upon entering a new RELATION, the user interface presents all RELATIONs which can be used to produce the same value(s) as the newly provided (or modified) one, allowing the knowledge engineer to check for possible redundant data. In addition to this, the Solver is capable of reporting dependent RELATIONs in a template to the user, who can then take appropriate measures to remove the redundant information from the knowledge base.

VII. Interpreter

For the purpose of executing numerical models two possibilities are available. The first one is to translate the RELATIONs and CONSTRAINTs selected from the knowledge base into conventional computer code (e.g. FORTRAN) and to compile and subsequently link this code with a multipurpose solver. This approach has been applied by [Bras, 1992, Smith, 1992 and Vingerhoeds, 1990]. The drawback of this approach is that it is extremely difficult to properly code the applicable constraints and control knowledge: the symbolic level (i.e. the modelling process) is in this case completely separated from the numerical level (i.e. the numerical solver). This means that during execution of a model no access is provided to the Modeller and a user interface is not available for the purpose of reporting problems or for adaptation of models on the basis of intermediate results.
The latter requires that control and execution of the actual calculations be performed by a solver, controlled by the Modeller through interpretation instead of batch compilation and execution. The interpreter should be able to evaluate clauses of numerical expressions (RELATIONs and CONSTRAINTs) and is invoked by the Solver, which controls the overall solution process. Another advantage of the interpreter approach is that special functions (e.g. table interpolation or selection) can be implemented (and used) relatively easily, which is not the case if code generation is applied. The interpreter, which forms part of the QUAESTOR shell code, has full access to data stored in the knowledge base or produced by satellite applications. In section 9.2 the interpreter is discussed in more detail.

VIII. Modeller

The reasoning mechanism or Modeller is the ‘intelligent’ part of the program and is able to check validity, propose RELATIONs, ask for input (values of parameters), compute (solve a set of equations by invoking the Solver) and present results. Basic decisions on the functionality of the system are embedded in the architecture of the Modeller. The system should operate as an assistant in the assembling of models (white-box approach). This is preferred above searching for solutions to problems without user interference (black-box approach). However, it should be possible to operate in both modes. This approach ensures maximum flexibility, and a relatively simple domain description will suffice. Specific, more ‘human’ knowledge which is difficult to capture in a knowledge base is supplied explicitly by the user or implicitly through his decisions. This necessarily means that the user may have an important influence on the assembling process. This simplification in terms of knowledge domain is more or less outweighed by the increased complexity of responding properly to input or stimuli provided by ‘unpredictable’ humans.
It appeared to be difficult to avoid failures and to pinpoint weaknesses in the code of the Modeller. The user may perform actions or provide input under a variety of inevitably untested conditions. Many such weaknesses became clear in applications by users other than the developer, since the latter bypasses many obstacles without even realising it. An important milestone was the development of functions which can determine dependencies between parameters in an assembled numerical model or template. The application of these functions simplified recalculation for other input and multi-case calculations and improved their efficiency. In section 8.2 the inferences performed by the Modeller are presented.

IX. Solver

The Solver is a routine which computes a solution of the non-linear model or template generated by the Modeller. Although a separate routine, the Solver was developed in parallel with the Modeller. Important aspects of the Solver are speed and robustness. Since use is made of an interpreter, which is slow compared to compiled code, the efficiency of the iteration process has received much attention. The nucleus of the Solver is a relatively simple (quasi) Newton-Raphson (NR) method [Hinton]. The Solver computes a Jacobian around intermediate solutions, and the resulting linear system is solved by Gaussian elimination until a particular accuracy is obtained (see section 8.3 for a discussion of the criteria for convergence). The Modeller provides systems of equations (RELATIONs) extracted from the assembled template to the Solver. An important step was the introduction of symbolic substitution or term rewriting [Leler, 1988], reducing the number of degrees of freedom of the problem by substitution. The Solver is also able to recognise strong components [Serrano, 1992]. A strong component is a sub-system of equations or a cycle that exists in the template and which can be solved independently of the rest of the template.
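The iteration at the nucleus of the Solver, as described above, can be illustrated with a short sketch in Python. It is not the QUAESTOR Solver: the function names are hypothetical, the Jacobian is approximated by forward differences (an assumption made here), the convergence test is a plain bound on the residuals, and the two-equation test system is invented for the example.

```python
def gauss_solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            # A singular matrix hints at dependent (redundant) equations.
            raise ValueError("singular Jacobian")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def newton_raphson(residuals, x, tol=1e-10, max_iter=50, h=1e-7):
    """Drive all residuals to zero, starting from the estimate x."""
    n = len(x)
    for _ in range(max_iter):
        f = residuals(x)
        if max(abs(v) for v in f) < tol:
            return x
        # Forward-difference Jacobian around the intermediate solution:
        # jac[i][j] approximates d f_i / d x_j.
        jac = [[0.0] * n for _ in range(n)]
        for j in range(n):
            step = h * max(1.0, abs(x[j]))
            shifted = x[:]
            shifted[j] += step
            fs = residuals(shifted)
            for i in range(n):
                jac[i][j] = (fs[i] - f[i]) / step
        dx = gauss_solve(jac, [-v for v in f])
        x = [xi + di for xi, di in zip(x, dx)]
    raise RuntimeError("no convergence within max_iter iterations")


# Invented test system: x^2 + y^2 = 25 and x - y = 1.
def system(v):
    x, y = v
    return [x * x + y * y - 25.0, x - y - 1.0]


solution = newton_raphson(system, [5.0, 2.0])
print(solution)   # close to [4.0, 3.0] for this starting estimate
```

Each equation is written as a residual that must vanish, so a RELATION of the form left = right would enter such a scheme as left - right.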
These cycles are split off from the template and solved separately as subsystems of equations. This makes the size of the template virtually unlimited. The strategies applied by the Solver are discussed in more detail in section 8.3.

X. Interface to satellite application programs

In the course of the development of QUAESTOR it became clear that by storing analysis methods only in the form of RELATIONs and CONSTRAINTs, knowledge bases become large and difficult to use and maintain. Another obvious disadvantage of that approach is the effort involved in the transformation of existing applications into sets of parameters, RELATIONs and CONSTRAINTs and subsequently in testing their integrity and problem solving qualities. In addition, methods may become ‘white boxes’ in undesirable cases. To confront a designer during a dialogue with highly complex regression polynomials or neural networks may not improve his understanding of the problem domain.

Instead of transforming the method into RELATIONs, analysis programs are used directly. The input of the analysis program is defined in the knowledge base as a FORTRAN type function call. On the basis of this call the interpreter prepares the input file, runs the application and retrieves the requested parameter value from the program’s output. The function call is a common QUAESTOR function which produces a single number as result. In this way satellite applications can be used as common RELATIONs, which also means that potentially input and output can be reversed, i.e. by providing (multiple) result(s) QUAESTOR can compute value(s) of missing (PENDING) input variable(s). By a consistent application of the generic TELITAB data structure (see section 5.3) in the input and output of these satellites, the interfacing can be realised without much effort. This principle appeared to be the key towards design models which unite a high degree of complexity with good maintainability.
Another advantage is that by interfacing with a well defined and simple file structure the development effort and responsibility can be distributed and no particular computer language is prescribed. It also limits the development effort since existing analysis programs can be used without significant adaptations. The generic data exchange files of the applications can also be used as a standard in conventional environments and have been adopted by MARIN as the standard exchange format between user interfaces and calculation programs.

XI. Accuracy management

Although the concepts of expected and acceptable accuracy are important to any designer [MacCallum, 1990], they are in practice often difficult to quantify, simply because this kind of knowledge is generally not (made) available. For this reason I have deliberately neglected this subject in QUAESTOR. In my view the results of any modelling activity should be judged by the designer on the basis of his professionalism and experience. It is the professionalism of the designer and not the degree of sophistication of the software that ensures the quality of the design and of the design model. Insight into the robustness of the solutions can also be obtained by performing sensitivity analyses with the uncertain parameters in the model.

2.5. Arrival?

In 1992 a prototype of QUAESTOR became available with acceptable performance and sufficient robustness to be used for pilot applications. The first and thus far most ambitious application was developed within the scope of the SUBCEM project [vdNat, 1993-1996]. Early in 1993 QUAESTOR was introduced at the Royal Netherlands Navy (RNlN). The concept design of the TROIKA mine sweeping system was the first RNlN application [Wolff, 1994] and is discussed in section 10.2.
Nowadays many QUAESTOR applications in naval ship design exist and the embedded approach is adopted as the spine of the Future Reduced Cost Combatant (FRCC) study performed by the RNlN within the scope of the NATO Maritime Operations 2015 project [Keizer, 1994-1996 and vHees, 1994]. The feedback obtained from these applications has been extensive. Literally hundreds of smaller and larger problems have been reported and solved and a large number of system functions have been developed on the basis of suggestions of designers using the system. For a developer it is quite exceptional that users are prepared to accept flaws in a software system they apply in their daily practice. A more common reaction in such cases is irritation and refusal to use it any further. Although numerous improvements were incorporated on the basis of these pilot applications, the basic concepts of the system and of the underlying approach have not changed since the introduction of the first prototype in 1990.

Within a relatively short period of time the tool has been adopted by the RNlN for a variety of design applications. A logical question is why: what does it add to already available tools? A senior designer¹ at the RNlN gives the following explanation:

For the RNlN, a primary application of this system is its ‘interfacing’ ability. Analysis programs and methods dealing with particular aspects and properties of the design require input which is retrieved from the current description of the design. These analysis models originate from various specialised sources. Prediction tools for e.g. ship hydrodynamics are provided by MARIN, for signatures and operational aspects by the Physics Electronics Laboratory TNO/FEL and for vulnerability aspects by the Prince Maurits Laboratory TNO/PML. We are able to retrieve and organise this input since we are managing all information about the current status of the design and of the relations that exist between design parameters.
These relations can vary from case to case. QUAESTOR provides storage and maintenance facilities for the concept description and the relations existing between design parameters. The shell assists by gathering and converting the user input into input required by the applicable analysis models by interpreting the relations or by retrieval of input from a database. While using analysis programs in design we often desire (slightly) different output and/or are capable of providing (slightly) different input. Due to its capabilities QUAESTOR acts as interface between our input and these programs and vice versa, and thus allows us to use these analysis models and link them together in the way we desire. The application, input and output vary from case to case and depend on the context of the problem. Parameter variations and problem (input/output) reversal, i.e. What-If questions, are very important in design.

1) Ir. R. Zuiddam, Ministry of Defence, Royal Netherlands Navy, Directorate of Materiel, Department of Naval Construction, The Hague

These applications are beyond my initial aims of 1987 (section 2.2), which were mainly method management and applications on a detailed level and hardly on the level of overall conceptual design [vHees, 1992]. The rule-based properties and the subsequent improvements and extensions have made it suitable for integrating large numbers of analysis tasks into systems for conceptual design. In chapters 6 and 10 these conceptual design models and their development aspects are discussed.

Thus far the development of QUAESTOR has demonstrated that:
• the problem of automated parametric model assembling is more complex than is expected on the basis of the simple domain description
• a dialogue system is difficult to test, and often fuzzy error reports are received
• a time consuming test/refinement cycle can hardly be avoided
• close co-operation with a small group of motivated users is a prerequisite
• improvement remains possible

3. NUMERICAL DESIGN MODELLING

Prove all things; hold fast that which is good.
KJV 1 Thessalonians, 5:21

In technical disciplines that deal with large and complex systems, a design by simulation approach is often followed with an emphasis on quantitative forms of knowledge. Simulation is to imitate conditions of (situation, etc.) with a model for convenience or training. Where systems subjected to financial or physical limitations need to be optimised for particular tasks, a parametric approach is often followed. The obvious advantage of parametric representations is that the merits of multiple solutions can be studied without a need for building physical prototypes. These models can provide decision support to a designer in an efficient and cost effective manner and help him to gain understanding of the domain. If the artefact is highly complex, the initial numerical model may be a simple one which is made increasingly accurate in the course of the design process. The modelling then becomes a learning process.

The following sections provide a frame of reference in terms of applicable knowledge representation schemes, tools and techniques in numerical design modelling and their relation with QUAESTOR. Finally we arrive at the viewpoint which is the central issue of this thesis: solving a generic numerical model assembling problem.

3.1. Modelling: Why and What?

Simulation is applied in those cases in which it is difficult to study the reality. Examples are the simulation of the dynamic positioning of a tanker while loading crude oil from a storage tanker or the manoeuvrability of ships in a future harbour. Knowledge of the behaviour and operational limits of these systems, e.g. the maximum wind speed and wave height at which an operation can be performed, limits accident risks.
Simulation also saves the cost involved in possible modifications to be introduced in the object after completion. The basis for simulation is a model of the reality. This model can be a physical model but it can also be a numerical model, being a simplified description of a system to assist calculations and predictions. A model in its simplest form is an equation describing e.g. the relation between the power transmitted through a shaft, the rate of revolutions and the torque. This relation is given in the form:

Power = Torque * Rotation_Rate * 2 * π

This ‘numerical model’ makes it possible to calculate the torque in a propeller shaft, required for computing e.g. the shaft diameter on the basis of an allowed maximum torsion stress. To enable this calculation it is required to provide input to the model: the Rotation_Rate and the current Power setting. The units of the values need to correspond, e.g. the Torque should be defined in kNm, the Power in kW and the Rotation_Rate in 1/s.

In engineering disciplines the numerical modelling of physical systems and simulations with such models have become kernel activities. In order to establish capacities, weight, stiffness, dimensions and cost, assumptions are made with regard to boundary conditions, properties of the components, materials and environments. In many cases some form of numerical modelling is performed to justify particular choices and decisions. Physical systems and processes are modelled in different ways, depending on the required level of detail and on the aspects on which is focused. Systems interact with the environments in which they operate, and systems may be the carrier of processes. What is being modelled: the system in terms of pipes, beams and pumps, or the process or ‘program’ which is running on the system? These processes can be static or dynamic and probabilistic aspects may be involved.
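As a worked illustration of the shaft power relation given earlier in this section, the sketch below in Python solves it for the torque and then estimates a shaft diameter. The power relation and the unit convention (kW, kNm, 1/s) are taken from the text; the diameter step uses the textbook torsion formula for a solid circular shaft, tau = 16 T / (π d³), which is an assumption added here, and the function names and input values are purely illustrative.

```python
import math


def torque_kNm(power_kW, rotation_rate_per_s):
    """Solve Power = Torque * Rotation_Rate * 2 * pi for the Torque."""
    return power_kW / (2.0 * math.pi * rotation_rate_per_s)


def solid_shaft_diameter_m(torque_in_kNm, allowed_stress_Pa):
    """Diameter of a solid circular shaft at the allowed torsion stress.

    Assumes tau = 16 * T / (pi * d^3), with T in Nm and tau in Pa.
    """
    torque_Nm = torque_in_kNm * 1000.0
    return (16.0 * torque_Nm / (math.pi * allowed_stress_Pa)) ** (1.0 / 3.0)


# Illustrative input: 10 000 kW delivered at 2 revolutions per second,
# with an assumed allowable torsion stress of 50 MPa.
torque = torque_kNm(10_000.0, 2.0)
diameter = solid_shaft_diameter_m(torque, 50.0e6)

print(f"Torque   = {torque:.1f} kNm")
print(f"Diameter = {diameter:.3f} m")
```

Mixing units, for instance a rotation rate in revolutions per minute with the power in kW, silently gives a torque that is wrong by a factor of 60, which is why the text insists that the units of the values correspond.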
Modelling may be performed for various reasons, ranging from determining the feasibility of a particular artefact in terms of cost, mass, dimensions, etc. to checking the performance in a varying environment, as for the above storage tanker. Such simulations provide indirect input to the design of the system. By nature, physical systems are dynamic or static, which is reflected by the applied numerical model. In addition, numerical models are either deterministic or probabilistic. A deterministic description requires absolute values of positions, power and motions, either static or in time, whereas a probabilistic description is provided in terms of statistical parameters, e.g. the probability that certain limits are exceeded within a given time frame. In this work only deterministic views on systems and processes are addressed. From a pure numerical modelling perspective, the position is taken that time can be treated as any other variable, so that static and dynamic numerical models can be dealt with in a similar manner. In section 7.3 the basic principle is elucidated with a simple mathematical pendulum.

In the phase in which a concept of a future physical system is established, it is both common and useful to apply an abstract representation of the concept and of the knowledge involved. According to [Newell, 1982] representation is the sum of knowledge and access to that knowledge. Representations vary per knowledge domain. In general the more abstract a system or process description is, the easier it should be to manipulate it, viz. to modify, extend or reduce a description. For example, a block diagram of a hydraulic system is easier to adapt than its detailed engineering layout drawing. Depending on the phase of a design, a level of abstraction is usually selected intuitively, in harmony with the available knowledge and with the desired ability to manipulate the model in that phase of the design.

3.2. Focus of Representation

Three basic approaches to reasoning about physical systems originate from research in Artificial Intelligence [Top, 1993]. The three approaches emerged from research to formalise human common-sense reasoning processes and are considered as levels at which physical systems are abstracted and manipulated in engineering sciences. In this thesis, the position is taken that these three levels are passed in the course of a design process.

I. Device-centred

According to this view the behaviour of a physical device can be inferred from its structure. Here, three types of structural elements are distinguished, viz. materials, components and conduits (that transport material, energy or information between components). Device components are connected via relations. Given a configuration of device elements, the behaviour of the system as a whole can be determined. This approach requires a device-level description which, in terms of design, is the output of the process. An example is the product model of a Diesel engine in a CAD system.

II. Process-centred

The process-centred approach takes as primitives physical processes that induce state changes in the system. Important notions in this theory are views and processes. Views provide a (static) description of physical objects and their states, by specifying to which objects the view applies, under what conditions it is active, and by giving relations between the parameters that are valid in that situation. Processes are described in a similar manner, but in addition they contain so-called influences which indicate what causes parameter values to change, thus specifying the dynamic aspects of a system. This approach requires a process-level description. In terms of the design, a process is an intermediate level which requires assumptions about the physical layout of the device on which the process is running.
The process-level behaviour is used to infer (parts of) the device-level description. In the case of the Diesel engine we will use a description, or rather a simulation, of the combustion process to infer e.g. bore and stroke, cooling requirements and turbo charger arrangement.

III. Constraint-centred
This approach takes a mathematical rather than a physical stance, since it directly starts from the system (differential) equations, and then yields the corresponding possible system states by employing a numerical solver. In this solver, the quantity space for search of a solution could be bounded. Dynamic behaviour is captured by applying differential schemes. This approach requires a constraint-level description. In design, we consider the constraint-level description as the most suitable form in the conceptual phase. In that phase we have the least device-level knowledge of the artefact and the constraints are used to find entries in the process and device levels. Case-based reasoning is a common strategy at this level. Based on generalised (device-level) knowledge of Diesel engines, the main particulars are estimated of an engine which should operate within given limits of e.g. power, revs, weight and fuel consumption.

In design we are interested in quantities, which means that in the above three approaches the relations and constraints are numeric by nature. In particular the process and constraint-level descriptions can be merged into a numeric-level description which is covered by my knowledge-based approach. The device, process and constraint level descriptions are applied by engineers to manipulate concepts. The manipulation of the artefact description (later I will apply the term Concept Description) is the key to performing What-If inquiries: what are the consequences of this solution for the overall system performance? The creative opportunities enclosed in such abstractions are invaluable.
Summarising, in the realisation of an artefact, the representation is selected which expresses the aspects which are the focus of manipulation. In practice this also means that different tools are selected for different design phases.

3.3. Representation Schemes

In the sequel a summary is provided of knowledge representations which are either commonly applied or suitable for conceptual purposes in engineering sciences and which have or can be given numerical properties.

I. Formulae, constraints and equations
A formula (‘mathematic rule or statement in algebraic symbols’) is a numerical representation of one parameter in terms of others:

Thrust = Resistance/(1 - Thrust_Deduction)

Thrust can be computed (i.e. the formula can be used) in case Resistance and Thrust_Deduction are known. The given formula can be considered as a constraint (‘limitation imposed on motion or action’), i.e. a limitation on the value of the applied parameters. In other words, the above (equality) constraint is fulfilled in case Thrust equals Resistance/(1 - Thrust_Deduction). In that case the Boolean value of the ‘=’ operator is TRUE. The expression can also be viewed as an equation (‘(Math) Formula affirming equivalence of two expressions connected by the sign =’) which can be solved. In case Thrust_Deduction and Thrust are known, Resistance can be solved.

Numerical representation in formulae is attractive in view of its expressiveness. Many attributes of physical systems can be assigned numerical values; even qualitative attributes can be expressed in numbers, e.g. a colour or component number referring to particular lists of colours and components. Apart from the numerical knowledge they represent, formulae provide the means to propagate cause and effect through a model and even to reason about the structure of the system of which the parameters in the formulae are attributes.
An example is to evaluate the effect of re-sizing a particular system component on the overall system performance (such as speed, cost, motions, power, etc.). A DETERMINED value of a particular attribute may serve as a trigger for decisions; the same applies to not having a value, i.e. the value being PENDING. Variables in design constraints (with either PENDING or DETERMINED values) can connect concepts with each other. Simple formulae may already contain or express much more knowledge about design concepts and their coherence than the simple calculate value:... task.

This numerical form of knowledge plays an important role in the understanding and manipulation of design concepts. People involved in engineering clearly perform some kind of reasoning with numerical expressions, simply because more knowledge is captured in, or related to, expressions than the purely procedural number processing aspect. Apart from a simple value and the ability to calculate values, parameters and the expressions in which they are applied can capture knowledge about the behaviour and structure of physical systems. Through expressions, parameters connect concepts and are used in engineering to jump from one concept to another.

In design it is common to apply the term ‘constraint’ for RELATION. A constraint is viewed as a limitation of the values of a set of design parameters, imposed in the form of an expression. The constraint is satisfied in case a set of parameter values is found fulfilling the relational and/or logical operators in the expression. Within the context of QUAESTOR I prefer the term RELATION since it relates parameters to each other. Whether or not the RELATION is applied or active depends on the problem at hand. A RELATION should be considered as a stand-alone procedure linking one parameter with a value or with one or more other parameters. Activating the RELATION, i.e. introducing it into the template, requires decisions and/or other forms of knowledge. An active RELATION (viz. one selected for solving a particular problem) can be considered as a constraint in the above, usual meaning.

II. Production rules
Apart from being a constraint or part of a system of equations, an active RELATION is also a Production Rule: the value of one particular parameter is produced by ‘firing’ this RELATION, either explicitly by assigning the value of the right clause to the parameter in the left clause, or implicitly by solving the system of equations in which it is included.

The production rule paradigm is a model for human reasoning and captures an expert’s experience and causal reasoning strategy. A large number of domains exist in which knowledge can be captured in this way and it forms the basis of the first generation of expert systems [vdRee, 1994]. Production rules consist of an antecedent part, which includes the conditions to be fulfilled prior to execution, and a consequent part stating the actions to be performed:

IF condition(s) THEN action(s)

The advantage of production rules is their simplicity, their resemblance to human reasoning and the fact that rules can be added or removed without much effort. The result of the action(s) is called the Conclusion. If the condition(s) are viewed as a statement of validity, formulae, equations and constraints can be considered as Numerical Production Rules of which the value of one of the parameters is the conclusion. This notion forms the basis of QUAESTOR.

In QUAESTOR, a RELATION is a simple mathematical expression in the form a=f(...). A RELATION can be applied as an equation in a system of equations. A CONSTRAINT is an expression which may include the relational operators <, >, = and the logical operators AND, OR, XOR, EQV and IMP. CONSTRAINTs can either be TRUE or FALSE, and DETERMINED or PENDING.
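The three readings of the Thrust expression given earlier in this section (formula, constraint and equation) can be illustrated with a minimal Python sketch. It is not QUAESTOR code: the function names, the use of `None` as a stand-in for a PENDING value and the tolerance on the equality test are all assumptions made for illustration only.

```python
PENDING = None  # hypothetical marker for a parameter without a DETERMINED value


def thrust_formula(resistance, thrust_deduction):
    """Formula reading: compute Thrust from the other two parameters."""
    return resistance / (1.0 - thrust_deduction)


def thrust_constraint(thrust, resistance, thrust_deduction, tol=1e-9):
    """Constraint reading: Boolean value of the '=' operator (within a tolerance)."""
    return abs(thrust - thrust_formula(resistance, thrust_deduction)) <= tol


def solve_thrust_relation(thrust=PENDING, resistance=PENDING, thrust_deduction=PENDING):
    """Equation reading: solve for the single parameter that is still PENDING."""
    values = {
        "Thrust": thrust,
        "Resistance": resistance,
        "Thrust_Deduction": thrust_deduction,
    }
    pending = [name for name, value in values.items() if value is PENDING]
    if len(pending) != 1:
        raise ValueError("exactly one parameter must be PENDING")
    if pending[0] == "Thrust":
        values["Thrust"] = resistance / (1.0 - thrust_deduction)
    elif pending[0] == "Resistance":
        values["Resistance"] = thrust * (1.0 - thrust_deduction)
    else:
        values["Thrust_Deduction"] = 1.0 - resistance / thrust
    return values
```

Calling `solve_thrust_relation(thrust=100.0, thrust_deduction=0.1)` returns a Resistance of about 90, which is the "used both ways" behaviour that distinguishes a RELATION from a plain assignment.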
CONSTRAINTs are the condition(s) in the numerical production rules represented in the knowledge base by the following implicit IF..THEN..ELSEIF..ENDIF clause:

IF (Condition I = TRUE) THEN
    RELATION A can be used
ELSE IF (Condition II = TRUE) THEN
    RELATION B can be used
END IF

III. Semantic nets
Semantic nets describe a domain by means of a graphic structure consisting of nodes or objects which are interconnected by arrows or relationships. Semantic nets are developed to capture knowledge as a hierarchical structure. A relationship can be for instance an Is_a relationship. Two forms of knowledge are captured in this way:
• To fix that a class of objects is a sub-class of another class of objects, e.g.: A ship is a floating structure
• To fix that an object belongs to a particular class of objects, e.g.: The QE II is a ship

Properties of objects that are higher in the hierarchy are transferred to objects lower in the hierarchy, i.e. objects inherit properties of their super classes (the QE II is a floating structure). Figure 3.1 shows an example of the hierarchy of components in a building.

Figure 3.1: Example of a component hierarchy

The advantage of semantic nets is their compactness and appealing graphic presentation. Larger networks, however, may become difficult to manage, which may result in the inheritance of incorrect properties by an object. The semantic network approach is strongly related to the Object Oriented Programming paradigm (see section 4.4).

In engineering design, semantic nets are applied to represent hierarchies, either of abstract concepts or of the components in a system. The relationships define the structure of the knowledge. Semantic nets are a highly general knowledge representation which is also suitable for representing and manipulating numerical models and their data structure. The hierarchical TELITAB data structure has aspects in common with semantic nets (section 5.3).
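The inheritance of properties along Is_a links can be sketched in a few lines of Python. This is only an illustration of the principle, using the ship and QE II example from the text; the property names (`floats`, `has_hull`) and the dictionary layout are hypothetical and do not reflect any QUAESTOR or TELITAB data structure.

```python
# Is_a links: each node points to its direct super class.
IS_A = {
    "QE II": "ship",
    "ship": "floating structure",
}

# Hypothetical local properties, stored only at the level where they are defined.
PROPERTIES = {
    "floating structure": {"floats": True},
    "ship": {"has_hull": True},
    "QE II": {},
}


def superclasses(node):
    """Follow the Is_a links upwards and return the chain of super classes."""
    chain = []
    seen = {node}
    while node in IS_A:
        node = IS_A[node]
        if node in seen:
            raise ValueError("cycle in Is_a links")
        seen.add(node)
        chain.append(node)
    return chain


def inherited_properties(node):
    """Merge properties from the top of the hierarchy down; nearer levels override."""
    merged = {}
    for ancestor in reversed(superclasses(node)):
        merged.update(PROPERTIES.get(ancestor, {}))
    merged.update(PROPERTIES.get(node, {}))
    return merged
```

Here `inherited_properties("QE II")` collects `floats` from the floating structure and `has_hull` from the ship, although neither is stored on the QE II node itself. A wrong Is_a link would silently propagate wrong properties, which is the management risk mentioned above for larger networks.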
In Figure 3.2, a numerical model, consisting of a number of formulae and their variables, is represented as a semantic net.

Figure 3.2: Numerical model in a semantic net

A numerical model contains the relations:
• Value_Requested_By (VRB): the parameter is introduced into the model by this RELATION (or CONSTRAINT). The inverse relation is Value_Requested_Of (VRO). A VRB parameter is called a ‘sub goal’ of the parameter computed by (CB) the RELATION referred to (see section 5.1).
• Calculated_By (CB): the RELATION produces the value of the parameter, even if the parameter is present in more than one RELATION and its value therefore requires an iterative solution. The inverse of this relation is Output_Of (OO).
• Input_For (INF): the value is input to the RELATION (or CONSTRAINT), either provided by the user of the model or computed by (CB) another RELATION. The inverse of this relation is Required_As_Input (RAI).
• If_CONSTRAINT_TRUE (ICT): a RELATION can only be used in a model in case the connected CONSTRAINTs are evaluated TRUE. The inverse relation is CONSTRAINT_Of (CO).

A parameter requested by the user is a top goal of the assembled numerical model (see section 5.1). Such parameters only have one CB relation with an equation (RELATION) and INF relations with other RELATIONs or CONSTRAINTs, if any. Since the parameter is requested by the user and not introduced in the model by any RELATION or CONSTRAINT, these parameter(s) have no VRB relation with any expression.

The model assembling process within QUAESTOR is based on the notion that any numerical model is the result of a reasoning process, using a set of model fragments, i.e. a knowledge base. The structure of numerical models consisting of parameters, RELATIONs and CONSTRAINTs is represented and manipulated by QUAESTOR as a semantic network.
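To make the CB, INF, VRB and ICT links more concrete, the following Python sketch assembles a small directed network for one top goal by simple backward chaining. It is a deliberately naive illustration and not the QUAESTOR assembling algorithm: the `Relation` class, the `assemble` function, the second RELATION (Resistance = Coefficient * Speed^2) and its CONSTRAINT are all hypothetical, each RELATION is used one way only, and a PENDING CONSTRAINT is simply treated as not fulfilled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Relation:
    name: str
    output: str                    # parameter produced when the RELATION fires
    inputs: Tuple[str, ...]        # parameters required as input
    func: Callable[..., float]
    constraint: Optional[Callable[[Dict[str, float]], bool]] = None  # ICT guard


KNOWLEDGE_BASE = [
    Relation("R1", "Thrust", ("Resistance", "Thrust_Deduction"),
             lambda r, t: r / (1.0 - t)),
    # Hypothetical model fragment, only valid for positive speeds.
    Relation("R2", "Resistance", ("Coefficient", "Speed"),
             lambda c, v: c * v ** 2,
             constraint=lambda known: known.get("Speed", 0.0) > 0.0),
]


def assemble(goal, given, knowledge_base):
    """Return the value of the top goal and the links of the assembled network."""
    values = dict(given)
    edges = []          # (parameter, link type, RELATION name)
    introduced = set()  # a parameter is introduced (VRB) only once

    def resolve(param, requested_by):
        if requested_by is not None and param not in introduced:
            introduced.add(param)
            edges.append((param, "VRB", requested_by))
        if param in values:  # provided by the user or already computed
            return values[param]
        for rel in knowledge_base:
            if rel.output != param:
                continue
            if rel.constraint is not None and not rel.constraint(values):
                continue  # CONSTRAINT not TRUE: RELATION may not be used
            args = []
            for p in rel.inputs:
                args.append(resolve(p, rel.name))
                edges.append((p, "INF", rel.name))
            values[param] = rel.func(*args)
            edges.append((param, "CB", rel.name))
            return values[param]
        raise LookupError(f"no applicable RELATION or user value for {param}")

    return resolve(goal, None), edges
```

With `assemble("Thrust", {"Thrust_Deduction": 0.1, "Coefficient": 2.0, "Speed": 3.0}, KNOWLEDGE_BASE)` the top goal Thrust receives a CB link to R1 but no VRB link, whereas Resistance is introduced (VRB) by R1 and computed (CB) by R2. The sketch does not handle cyclic dependencies, which would require solving a system of equations.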
Some of the fundamental concepts applied are embedded in the relations described above:
• A parameter is computed by (CB) one particular RELATION and may be input for (INF) CONSTRAINTs and other RELATIONs.
• A parameter is requested by (VRB) one particular RELATION or CONSTRAINT (sub goal) or is requested, i.e. introduced in the model, by the user (top goal parameters).
• The reasoning or model assembling process is viewed as a building and maintenance process of the semantic network representing the assembled model.

The knowledge base of QUAESTOR is represented as a semantic network too (Figure 3.3), although the relations between the frames are different from those in a numerical model (Figure 3.2).

Figure 3.3: Semantic network of Parameters, RELATIONs and CONSTRAINTs (nodes: Parameter, RELATION, CONSTRAINT; links: part_of, has_CONSTRAINT)

These relations are:
• parameters in RELATIONs and CONSTRAINTs: Part_Of, inverse: Has_Part
• CONSTRAINTs of RELATION: Has_CONSTRAINT, inverse: CONSTRAINT_Of

In a model, the flow of control and information is directed by the specified top goal(s) and the expressions available in the knowledge base. In the QUAESTOR knowledge base only the relations between the parameters are described, through the RELATIONs and CONSTRAINTs in which they are applied. Therefore, a model cannot be viewed simply as a part of a knowledge base, since the semantic net representing a model has other relations between its nodes than the net representing the knowledge base. The knowledge base represents an undirected network (Figure 3.3) whereas an assembled model represents a directed network as shown in Figures 3.2 and 6.2. The domain of QUAESTOR and the model assembling process are discussed in further detail in chapters 7 and 8 respectively.

IV. Frames
The frame representation is strongly related to semantic nets. A frame is a representation unit of an object. The frame contains knowledge related to the object, divided into a number of headings or slots in which the important properties of the object are stored. Each frame contains a slot of the is-a type which makes it possible to determine the relationship with other frames and the possibilities to inherit properties. Frames are similar to the nodes in a semantic net; the lines connecting the objects in the net are replaced by the is-a slot and possible other, user-defined slots. Besides declarative knowledge, procedural knowledge can be stored in a slot.

QUAESTOR applies a frame representation in its knowledge base. These frames contain all data which is local to the frame type, i.e. parameters, RELATIONs and CONSTRAINTs, and in addition all addresses of frames which are referred to. Frame 3.1 shows an example of a RELATION frame in a QUAESTOR knowledge base.

No. 342
Is a: RELATION
Expression: J = Va/(n*D)
Reference: Velocity Ratio of Propeller
Data: None
Control: Two Way
Is referring to: J, Va, n, D
Is referred by: J, Va, n, D

Frame 3.1: Summary of a typical frame in a QUAESTOR knowledge base

V. Cases
From the past, design cases or design knowledge may be available. These cases can be used to rapidly focus on aspects in the requirements for the current case which were experienced as important in previous design cases of similar type. The relevant cases are retrieved from the ‘case-base’ and modified towards the new requirements. This approach is also referred to as adaptive design [Bras, 1992]. In AI, the formalisms involved in case-based reasoning [Riesbeck, 1989] are being developed. Case-based reasoning is highly relevant to naval architecture. Available (parent) designs are often used as starting point for new designs.
By selecting a basic concept which has been designed on the basis of similar requirements, a designer expects to implicitly inherit solutions to a large number of sub-problems. By subsequent modifications of this concept towards the current requirements, the new design is completed. In section 4.1 this process is discussed further. QUAESTOR supports case-based reasoning in the form of Concept Variation Models (see section 6.6).

3.4. Modelling: Applied Languages and Tools

Any computational modelling activity requires a language or tool in which the model can be expressed and executed. Languages and tools are developed for a variety of purposes. In view of their number, it is beyond the scope of this thesis to provide a complete overview of what is available. Only those tools are discussed that are applied in the practice of (ship) design, more or less in the sequence of their proliferation.

I. Imperative computer languages
Imperative or procedural languages are the most widespread computer languages and are used for a variety of purposes, including numerical design modelling. Programs consist of a set of instructions that are executed sequentially. Important examples, especially for numerical applications, are FORTRAN and PASCAL. Advantages of applying procedural languages for numerical modelling are their expressive power, the availability of special purpose libraries which provide numerical functions on a higher level (e.g. matrix operations) and their speed. A disadvantage is the relatively large effort involved in developing, maintaining and adapting applications. The developer is responsible for each step in the process, often on a very detailed level, depending on functions available in his libraries. Apart from developing the numerical algorithm, in many cases dedicated facilities for data management and pre/post-processing have to be developed for each new application.
In spite of these apparent disadvantages, imperative languages and programming environments based on these languages are still the most frequently applied means for (numerical) design modelling. Imperative languages cover the largest possible application domain with the least flexible and adaptable applications. Since most people involved in numerical modelling have been trained to use them, they are still the most commonly applied class of languages.

II. Object Oriented Programming (OOP)
OOP has brought important improvements in the productivity of software developers and, considering the popularity of C++, receives much support, also among people involved in numerical and design modelling [Rumbaugh, 1991]. OOP is a means to capture (procedural) knowledge in class-subclass-instance-property relationships. Each subclass or instance holds some properties and methods from its ancestors, and new properties and methods may be defined at the subclass level. One of the advantages of representing design knowledge in such a way is the inheritance of properties from a general category to a less general category of concepts. OOP also provides the possibility of Encapsulation, leaving some knowledge of an object open for public scrutiny while other knowledge is private. Encapsulation can provide security in a model, in the sense that only information which is allowed to change may change [Hagen, 1993].

The general purpose simulation language SIMULA, initially developed in the early sixties, is viewed as the breakthrough towards the object oriented paradigm [Nygaard, 1966]. The Smalltalk language developed by Xerox PARC is considered as the first real object-oriented programming system [Ingalls, 1981]. Programming activities in OOP are in a way restricted to the definition of objects and of instructions telling objects what to do.
In a way, OOP directs the developer more towards thinking and developing in terms of tasks than in terms of producing lines of code. Modern programming environments based on OOP techniques have increased the distance between programmers and code. This is almost a necessity due to the massive amounts of code required for modern, graphics oriented applications. Object Oriented languages cover a very large application domain with applications that can be reasonably well adapted and maintained. The learning time for developing classes is relatively high, in particular for people raised with conventional imperative languages. However, using classes developed elsewhere is less demanding for a programmer. Although OOP has attractive properties for numerical modelling purposes, it still requires imperative programming activities: the programmer is obliged to express both what task must be performed and how.

III. Spreadsheets
Although originating from the world of finance and accountancy, spreadsheets are becoming increasingly popular in technical environments. A spreadsheet comprises a mostly non-symbolic language based on relations between ‘cells’. Modern advanced spreadsheets offer a wide scope of word processing and graphical features. Spreadsheets are extremely useful for analysis and manipulation of data; one can state that ‘data’ is the central issue in spreadsheets and that there is hardly any separation between spreadsheet models and their data. As an example, in the copy operation in a spreadsheet, the model is subordinate to the data in the cell. For technical applications this ‘data centric’ approach is less attractive, in particular in the case of parametric models which basically deal with a limited set of Parameter-Value Combinations (PVC’s). For more complex applications of this type the sheet rapidly becomes ill-organised and is after some time hardly accessible for extension and adaptation.
Spreadsheets, too, basically comprise a procedural language: the operations and their sequence have to be defined by the user. Spreadsheets are commonly applied for smaller modelling applications and for prototyping. Spreadsheets cover a large application domain with reasonably well adaptable applications.

IV. Computer Algebra
These packages provide a specialised language with computational capabilities and facilities to symbolically manipulate numerical formulations and to evaluate (sequences of) expressions. Examples are Mathematica, Maple V, Scientific Work Place and MatLab. With these tools simulations are performed of complex physical systems such as power plants or chemical production plants. Scientific Work Place provides desktop publishing capabilities which make it most suitable for preparing scientific documents. In view of the fact that models are presented and edited in their scientific notation, the ‘code’ within the limits of the syntax is easier to adapt and to extend than in the case of a conventional third generation computer language such as FORTRAN. Although some of them claim to have rule-based properties, these packages remain procedural languages: the operations and their sequence still have to be defined by the user.

The applicability and the penetration of these packages in the engineering design domain are limited in view of their orientation towards advanced mathematics and simulation of complex systems. This makes them more suitable for detailed engineering purposes and for studying the behaviour of specific very complex model components than for the overall design modelling task. Computer algebra packages cover a specialised application domain with well adaptable applications. The learning time is almost negligible for users with sufficient background in mathematics.

V. AI languages
These languages allow a separation between algorithms and control and are often referred to as declarative languages.
Applications are often sets of frames, clauses or rules which are interpreted by an inference engine. A well known example is PROLOG [Clocksin, 1984], developed in the early seventies. PROLOG is in particular suitable for diagnosis applications and offers limited numerical reasoning capabilities. Another language which is often applied in AI is LISP, developed in the late fifties. However, LISP is a functional language rather than a declarative one. Because of its interpreter it is very well suited to the unstructured interaction that characterises the design process. Less known are hybrid languages which make it possible to make a mixed use of LISP and PROLOG and to combine them with conventional languages. AI languages cover specialised application domains with applications that are easy to adapt and to extend in view of their non-procedural operation. In view of their declarative properties, the learning time should be low. Due to the underlying non-procedural way of thinking, people are reluctant to use these languages.

VI. Expert systems and shells
More dedicated tools are the so-called shells. A shell is an ‘empty’ expert system which remains from a working expert system after removing the domain dependent knowledge. The shell offers inference mechanisms and supporting facilities, for instance for visualisation or knowledge base management. Shells are more dedicated than languages since more properties are fixed, such as the strategy of reasoning, explanation facilities, etc. In [vdRee, 1994] the successive generations and the current status of expert systems are discussed. Penetration of expert system shells into technical environments is limited, which is probably due to the fact that the design community is unacquainted with declarative programming techniques and not due to their inherent technical restrictions.
Shells are often dedicated to particular domains, which can make them useless for applications in (slightly) different domains and/or for which different inferences are required. From the perspective of numerical modelling a problem is that most shells are dedicated to the domains of pattern recognition, monitoring, diagnosis, spatial configuration, prediction, planning and instruction. Shells which support numerical reasoning are very rare. Expert systems and shells cover small specialised application domains with applications that are easy to adapt and extend, in particular if the knowledge is already available in the applicable format.

VII. Constraint Programming Languages (CPL)
These languages originate from the research domain of Constraint Satisfaction and Constraint-Based Reasoning. They allow programmers to describe their problem in the form of a number of constraints. The difference between a constraint in a CPL and in a procedural language is the fact that in the latter a constraint is an assignment, i.e. the right clause is executed and its value is assigned to the left clause. In a CPL, however, values are searched such that the constraints are satisfied: the Boolean value of the relational operator in the constraints should become TRUE. In a CPL, constraints can be used both ways, which means that they can be applied to produce variables in either the left or the right clause of the expressions. By nature, problems which can be written as a series of constraints are non-sequential, which makes the order in which the constraints are written in the source irrelevant. The solver is able to find a set of values such that all constraints in the program are fulfilled. CPL are declarative languages since they require a programmer to describe what has to be done and not how it should be done. Depending on the techniques applied, the solvers are domain dependent.
A comprehensive overview of constraint programming techniques is provided in [Leler, 1988], and [Guesgen, 1992] provides a more in-depth discussion of spatial reasoning techniques. Various constraint satisfaction techniques are proposed and implemented, of which the following are the most important:
• Local propagation: Constraints are viewed as a network of values and operators. Input is propagated through the network from operator to operator until output is produced. Local propagation cannot handle networks which contain cycles.
• Relaxation: Relaxation makes an initial estimate of the unknown objects, and then computes the error that would be caused by assigning these values to the objects. New estimates are then made, and new errors calculated, and this process repeats until the error is minimised. Constraints containing cycles require either relaxation or equation solving.
• Equation solving: Techniques as used in symbolic algebra systems can be used to solve complex systems of equations. An equation solver which is fast and general enough can fully replace both local propagation and relaxation. Equation solving can deal with programs containing cycles.
• Term rewriting: Involves the application of a set of rewrite rules to (parts of) an expression, transforming it into more general or simpler expressions. Although not a solving technique by itself, the application of term rewriting can ease the solution by means of the above mentioned techniques.

Some languages are able to deal with ‘second order constraints’, which are constraints imposed on constraints in the program to describe their validity. In that case the language has rule-based capabilities since the second order constraints form implicit IF-THEN clauses in the program. The implementation of CPL is stated to be extremely difficult, which is probably the reason why more languages have been proposed than are actually implemented.
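The difference between local propagation and relaxation listed above can be shown with a small Python sketch. It is an illustration under stated assumptions rather than the method of any particular CPL: constraints are simplified to one-way rules of the form (output, inputs, function), relaxation is approximated by a plain fixed-point iteration that only converges for suitably contractive systems, and the two-variable cyclic system is invented for the example.

```python
def local_propagation(rules, known):
    """Fire every rule whose inputs are all known until nothing changes."""
    values = dict(known)
    progress = True
    while progress:
        progress = False
        for output, inputs, func in rules:
            if output not in values and all(p in values for p in inputs):
                values[output] = func(*(values[p] for p in inputs))
                progress = True
    return values


def relaxation(rules, estimate, tol=1e-10, max_iter=1000):
    """Start from an estimate and re-evaluate all rules until the error is small."""
    values = dict(estimate)
    for _ in range(max_iter):
        error = 0.0
        for output, inputs, func in rules:
            new_value = func(*(values[p] for p in inputs))
            error = max(error, abs(new_value - values[output]))
            values[output] = new_value
        if error < tol:
            return values
    raise RuntimeError("relaxation did not converge")


# Acyclic network: the Thrust expression from section 3.3.
ACYCLIC = [
    ("Thrust", ("Resistance", "Thrust_Deduction"), lambda r, t: r / (1.0 - t)),
]

# Hypothetical cyclic network: x depends on y and y depends on x.
CYCLIC = [
    ("x", ("y",), lambda y: 0.5 * y + 1.0),
    ("y", ("x",), lambda x: 0.5 * x + 2.0),
]
```

`local_propagation(ACYCLIC, {"Resistance": 90.0, "Thrust_Deduction": 0.1})` produces Thrust directly, whereas `local_propagation(CYCLIC, {})` returns an empty result because no rule ever has all its inputs. `relaxation(CYCLIC, {"x": 0.0, "y": 0.0})` converges to x = 8/3 and y = 10/3, the values an equation solver would obtain analytically.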
The declarative and non-sequential nature of CPL makes them attractive for the domain of numerical parametric modelling. Some of the techniques applied in CPL, such as equation solving and term rewriting, are very similar to those applied in the QUAESTOR solver. Examples of CPL are Sketchpad [Sutherland, 1963], ThingLab [Borning, 1979] and TK!Solver [Konopasek, 1984]. Constraint programming languages cover a small and specialised application domain with applications that are easy to adapt and extend.

VIII. Functional Programming Languages (FPL)
Functional programming languages are based on the notion of function application. In a functional program, function definitions are the central issue. The syntax of these languages is closely related to the common mathematical notations for functions. The main program itself is written as a function which receives the program’s input as its arguments and delivers the program’s output as its result. Functional programs contain no (destructive) assignments, so variables, once given a value, never change. Therefore, a function call has no other effect than its computed result. This eliminates a major source of errors and makes the order of execution irrelevant; the flow of control need not be prescribed. Since expressions can be evaluated at any time, one can freely replace variables by their values and vice versa. This makes programs referentially transparent and mathematically more tractable than their conventional counterparts [Hughes, 1989].

The idea of FP has existed since the dawn of modern computing. FPL were kept out of the mainstream by their desperately slow performance and memory greed when compared to FORTRAN or C. Only recently, after a decade of research breakthroughs, FPL are becoming available that can compete with C in both time and space efficiency. Examples of these languages are Miranda, Haskell, Erlang and Concurrent Clean.
Some applications are described in [Hoon, 1995] and [Hudak, 1994]. FP has some aspects in common with CPL, for example the application of term rewriting and the absence of destructive assignments. The declarative and non-sequential nature of FPL and their sound mathematical basis make them very attractive for parametric numerical modelling. An important property of FP is the application of so-called Lazy Evaluation, which means that a function is only executed when its output is requested, which is a prerequisite for performance. Another feature is the ability to use higher order functions, i.e. functions that can take functions as arguments and can return functions as results. This enables for example the definition of a quicksort function which can sort not only a set of numbers but also a row of arbitrary items, e.g. data from a database. By defining the relevant (higher order) constraints as functions and by defining additional functions for equation solving, numerical models can be developed in FPL. Current FPL promise a declarative form of numerical modelling which allows fully modular and highly adaptable applications.

3.5. What is Available?

Most of the tools and languages presented in the previous section are software engineering tools or at least languages, either declarative, functional or imperative. Even today, the average person involved in numerical modelling is mostly using imperative techniques since these provide the clearest view on the relation between the program expressing the numerical model and its operation. The programmer is requested to think about operations and their order and coherence and must exactly define what should be done and how it should be done. Of the discussed languages and tools, the conventional procedural languages, spreadsheets and computer algebra packages belong to the group of imperative languages.
The second group, that of the declarative languages, only requires the definition of what should be done. The expressiveness of these languages is generally less than that of the imperative languages. Some of these languages apply a dedicated interpreter and/or inference engine to select and execute program statements, depending on the state of the application. Others include a compiler which produces machine-code executables. Although life is made easier for the programmer within the domain of the language, he is still obliged to write a program for his purpose. Apart from the ‘what’, the programmer needs to specify the relevant knowledge in the form of clauses, frames or constraints. In a sense, the programming effort is reduced to selection and assembling. Like imperative languages, declarative languages still imply ‘one program per problem’. The AI languages, Constraint Programming languages and Functional Programming languages can be considered declarative languages.

The third group is that of expert systems and expert system shells. The most important difference between an expert system and an AI language such as PROLOG is the presence of knowledge storage and retrieval facilities in combination with an inference engine. Ideally, this entirely separates programming from program execution, i.e. problem solving. Programming is reduced to maintaining a ‘knowledge base’, whereas the process of defining ‘what’ is only required when a problem to be solved is at hand. The selection task, viz. retrieving e.g. rules, frames and clauses from a knowledge base and including them in the active set (assembling), is controlled by the facilities of the shell. Again, some expressiveness is lost compared to AI languages. What is gained is additional power over the domain: with less effort, useful things can be done with the contents of a knowledge base.
Although shells are available for a variety of purposes, numerical (design) modelling is hardly addressed.

Table 3.1 Tentative language classification from a numerical modelling perspective

Language/tool          Expressiveness   Effort   Adaptability/re-use   Threshold
FORTRAN, PASCAL etc.   +++              ----+    [column boundaries of the last three entries lost in transcription]
OOP                    +++              -+       +                     --
Spreadsheet            +                +        -+                    +++
Computer Algebra       ++               ++       +                     ++
AI languages           --               +        ++                    -
Shells                 -                ++       +++                   -+
CPL                    ++               +        ++                    +++
FPL                    +++              +        ++                    -+

A designer requires knowledge and information to make the proper choices in the design process. He should be freed from the procedural issues involved in modelling this knowledge in computer languages in order to perform a specific task. Nowadays, the use of computers has become inevitable and users tend to select tools which provide acceptable results for the least possible effort, both in terms of learning and of implementing models. Due to the lack of more efficient solutions, and not least because of education, tradition and availability, imperative languages or tools are still most commonly applied for numerical modelling purposes.

In Table 3.1 an attempt is made to classify the various groups of tools and languages that are presented. The tools and languages are tentatively listed in sequence of proliferation, at least in the domain of naval architecture. For the discussed languages and tools, the adaptability of models is to some extent inversely proportional to the expressiveness. Apparently, with more expressive languages it becomes more difficult or at least more time consuming to develop, extend or adapt applications. A computer language can be considered as a means to express knowledge and to control it, either explicitly (imperative programming) or implicitly (which inferences are available?).
In a language in which knowledge and control can be expressed within a small number of dedicated language elements, it becomes much easier to extend or adapt applications, as long as adaptations and extensions fit within the syntax. Obviously, it is either impossible or extremely difficult to express knowledge and control that go beyond syntactical limits.

3.6. The Missing Link: Model Assembling

In the languages and tools discussed in section 3.5 a number of desirable properties are present:

• Imperative languages: large domain, expressiveness, speed
• Declarative and Functional languages: compact programming, adaptable applications
• Shells: problem solving capabilities using a knowledge base, highly adaptable applications

Adaptable or flexible applications are important in numerical design modelling since the problems within the same ‘knowledge space’ may vary from case to case. As explained in this and the previous chapter, the knowledge representation involved in numerical design modelling is of limited complexity and consists mainly of a set of parameters, RELATIONs and CONSTRAINTs. The control of this knowledge, on the other hand, is more complicated. Control here means how and when particular RELATIONs are selected, i.e. admitted into a numerical model. This complexity is clearly demonstrated by the fact that the languages and tools discussed in the previous section merely address the execution of models and not the building or assembling of models on the basis of model fragments or RELATIONs. Model assembling is considered a programming activity in most of these tools and languages. This assembling is typically performed by a domain expert.

Of the languages and tools discussed, the Constraint Programming Languages (CPL) seem to fit in nicely with the numerical modelling domain.
In CPL, the RELATIONs applied in QUAESTOR are named constraints and the CONSTRAINTs of QUAESTOR are called second order constraints. Although this seems to invite terminological confusion, the RELATION expresses a possible connection between parameters within a knowledge base. CONSTRAINTs are conditions for admitting RELATIONs into a model. After admission, a RELATION can be considered to be a constraint in the sense of Constraint Programming Languages. For equation solving or constraint satisfaction, a number of the techniques described in section 3.4 are applied within QUAESTOR.

The major contribution of this work is a solution to a particular assembling problem: the composition of numerical models out of model fragments or RELATIONs in the domain of engineering design and analysis. This solution has been successfully implemented in the form of the dedicated expert system shell QUAESTOR. This shell necessarily unifies a number of properties of the representation techniques, languages and tools discussed in this chapter:

• The basic form of knowledge is represented in numerical expressions; a constraint-centred approach is adopted
• A rule-based representation is applied: the CONSTRAINT in QUAESTOR forms the condition, and interpreting the RELATION (Rule) to compute one of the parameters forms the action. The computed parameter value forms the conclusion. The conclusion (computed parameter value) of a QUAESTOR RELATION depends on the context in which it is used.
• A semantic network representation is used during the model assembling process and in the knowledge base
• A frame representation is used for parameters, RELATIONs and CONSTRAINTs
• Case-based reasoning by accessing and using existing (design) data
• Imperative programming through a capability to interface with such applications
• Computer algebra through its equation solver and mathematical function library
• Constraint-based reasoning/constraint programming languages: constraint satisfaction, term rewriting, (local) propagation of degrees of freedom
• AI languages through the implicit rule-based properties of the RELATION selection mechanism, the applied strategies of reasoning (backward and forward reasoning) and the distinction between Rules (RELATIONs/CONSTRAINTs) and Facts (parameter values)
• Functional Programming: no destructive assignments, irrelevant order of execution
• Some resemblance with spreadsheet programs is present in the interface design

4. DESIGN PROBLEM SOLVING IN NAVAL ARCHITECTURE

At the top of the stairs, there’s hundreds of people running around to all the doors. They try to find themselves an audience their deductions need applause.
Genesis, The Chamber of 32 Doors

Even in a time of plentiful and affordable computer processing power, the development of software for analysis or concept exploration in the early design phase remains costly and time consuming. The frequency of a design analysis subtask is often too low for a tailor-made computer program to be economically feasible, particularly given the fact that a large variety of such problems must be solved in the course of each design. Good access to the available and relevant knowledge is of vital importance. In this chapter conceptual design is described as a reasoning process and the traditional numerical design models are discussed. The observations made imply a need for tools above the level of application programs to improve the efficiency of design related knowledge and method management.
In the last section the merits of a knowledge-based approach towards design modelling are discussed. The ideal which is being pursued is to bridge the gap between developing and using numerical design models by simply combining these two activities.

4.1. Design Practice: Reasoning and Modelling

Ship design is a knowledge-intensive activity. The ever greater accumulation of design-related knowledge is not directly proportional to the number of designers and specialists who are using and managing this knowledge. Simply stated, a designer must know more than ever before and must be capable of using the available knowledge to be effective and successful. Thus far, tool development has focused on the support of detailed design and hardly on the conceptual phase. However, in view of the complexity, size and autonomy of ships, conceptual ship design has been theoretically approached from different disciplines. Much work has been done in the field of applied mathematics [Mistree, 1990]. The emphasis in that work was put on projecting (ship) design problems on available and emerging numerical techniques. Recently, more holistic modelling of the ship design process has been attempted in [Bras, 1992 and Hagen, 1993]. In particular, MacCallum has carried out research on knowledge management, retrieval and control of design knowledge within the scope of developing DESIGNER [MacCallum 1987-1990]. He summarises the main characteristics of conceptual design as follows:

• Creative: The process to generate a conceptual design requires imagination and inventiveness. As a result of creative activities, the models may change or develop as the design or model development proceeds.
• Multiple solution: The design process is not deterministic, therefore the results highly depend on the choices made during the process. There can be many answers to a given design problem, all of which may achieve the technical and economical objectives.
• Empirical: The process of creating and evaluating a model does not always follow well formalised rules with good theoretical bases. Relationships are often of an empirical nature.
• Approximate: Because design is a modelling process which uses empirical relationships, the results obtained are generally approximate. Accuracy increases as the design proceeds and greater levels of detail are included.
• Expertise: The designer uses expertise with respect to the application of relationships and the acceptability and applicability of the (intermediate) results. Computational models may develop in time; this especially concerns empirical models which are based on experience.

Apart from these characteristics, conceptual design and design modelling can be viewed as a reasoning process. The designer processes and maintains information during the design process. In the sequel the knowledge and inferences are described in these terms and aspects of automated modelling support are introduced.

In the conceptual design phase the designer mainly uses his experience (based, amongst others, on previously performed designs and feedback from these designs) and a set of rules, procedures and heuristic knowledge. A designer basically reasons from requirements, available concepts and boundary conditions to the most suitable solution within reach, given the means and time available for the design. The most suitable solution is the concept with the most desirable properties, viz. the concept which is capable of performing its task at acceptable cost and risk, taking into account all design constraints. A concept is represented by a concept description (dimensions, geometry, components). In the early phases of design, the concept description is in general not fully available in numerical form: the concept is also described by a number of sketches and drawings.
To support the design process a number of calculation methods or aspect models are used which predict or compute aspects (cost, performance, stability, motions, strength, etc.) of the concept on the basis of (elements of) the concept description (main dimensions, hull form, structure, general arrangement, etc.). The aspects considered are in general properties of the concept and not elements of the concept description. The designer acts as the interface between the concept description and the aspect models (Figure 4.1, left). In other words: on the basis of his experience, the designer translates the results obtained with the models into adaptations of the concept description [vHees, 1995 and Wolff, 1994].

Figure 4.1: Design process: Sequential state-of-practice versus computerised optimisation

From results obtained with aspect models the search direction for solutions or concept improvements is inferred. Therefore, aspect models are mostly applied inversely by the designer in an indirect manner. A direct inverse use of these models is possible by means of an optimisation technique (Figure 4.1, right). Such techniques are capable of searching for values of the concept variables (generally the input of the aspect models) at which pre-defined goal constraints are fulfilled or at which the minimum or maximum values of the pre-defined objective functions are obtained. Goal constraints and objective functions are normally related to properties of the concept. In order to apply an optimisation technique, a model has to be built or assembled. The problem of design model assembling prior to using optimisation techniques has not received the attention it deserves. Numerical design research is generally devoted to the exploration of the capabilities and merits of an established or innovative numerical solver. The configuration of design templates is regarded as a programming activity.
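The direct inverse use of an aspect model described above can be illustrated with a minimal sketch. It assumes a single concept variable, a single goal constraint and a continuous, monotonic aspect model; the function names, the cubic power law and its coefficient are hypothetical and serve only as illustration, and bisection stands in for whatever optimisation technique is actually applied.

```python
def bisect_goal(aspect_model, goal, lo, hi, tol=1e-9, max_iter=200):
    """Search the concept variable x in [lo, hi] for which aspect_model(x) == goal.

    Assumes the aspect model is continuous and monotonic on the interval.
    """
    f_lo = aspect_model(lo) - goal
    f_hi = aspect_model(hi) - goal
    if f_lo * f_hi > 0:
        raise ValueError("goal is not bracketed by [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = aspect_model(mid) - goal
        if abs(f_mid) < tol or (hi - lo) < tol:
            return mid
        if f_lo * f_mid <= 0:
            hi = mid                  # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid     # root lies in the upper half
    return 0.5 * (lo + hi)


def required_power(speed_kn):
    # Hypothetical aspect model: power grows with the cube of speed.
    # The coefficient 2.0 is illustrative only.
    return 2.0 * speed_kn ** 3


# Forward use: concept variable (speed) -> property (power).
# Inverse use: goal constraint on the property -> concept variable.
speed = bisect_goal(required_power, goal=6750.0, lo=0.0, hi=30.0)
print(speed)
```

The aspect model itself is untouched; only the way it is called changes, which is why building the surrounding template, rather than the solver, is the laborious part.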
From the type of models and their application in the design process it is concluded that designers reason from concepts towards properties and from desired properties back to concept modification. This implies a retrograde reasoning from desired properties towards concepts. An initial concept (e.g. a parent ship) is used as a starting value. The design spiral is often used to represent this form of sequential optimisation.

In ship design, computerised optimisation techniques are mainly applied on component level (e.g. for hull and propulsion system design) and hardly ever on system level [Kupras, 1983 and Pal, 1992]. Optimisation on system level is frustrated by the large effort that is involved in building a working optimisation scheme (or template), viz. a numerical model which describes the design and its properties to a sufficient level of detail. Also, design problems are in general multi-objective and it is not easy to classify the importance of the objectives. Another problem is that such techniques, applied in a black-box fashion, hardly provide any insight into the relationships and possible conflicts that exist in the design space. Multi-Objective Optimisation (MOO) becomes a more powerful aid to designers if its implementation enables them to trace and interrupt the process, to supply additional data, to modify starting points and objectives, or even to interactively change the optimisation scheme on the basis of information about convergence (and robustness) fed back by the solver. In [Smith, 1994] the problem of robustness is addressed and techniques are explored that provide insight into the robustness of the solutions obtained by MOO. Therefore, designers still prefer to obtain such insight by interactively exploring the design space. The aspect models and the properties they predict are used by the designer to support decisions with regard to concept adaptations.
The computer is used as a means to gain insight (Figure 4.1, left) and not as a supplier of well-balanced designs on the basis of a set of requirements (Figure 4.1, right). To some extent, psychological phenomena are important in the design process: a design has to evolve, and confidence and consensus must be acquired with regard to its quality and feasibility. These aspects are neglected in the black-box numerical optimisation process which produces an ‘optimum’ design given a set of goals and constraints.

A design is hardly ever the result of a problem that, once defined, is solved in a single process. An obvious reason: the requirements are actually part of the design problem. In more complex design cases, requirements are frequently adapted on the basis of technical, operational and financial insights that are obtained in the course of the process. The initial requirements may be based on limited knowledge of what is feasible and may be of limited final value. Negotiations between the parties involved are an essential part of the design process. These aspects effectively exclude a monolithic approach towards design with the current tools and knowledge. The proposition ‘design is finding a compromise’ is a stereotype but nevertheless correct. This proposition applies both to the requirements and to the result of the design process.

In my view, information technology should contribute to the (conceptual) design process by making relevant data and knowledge more accessible and by allowing its concurrent use, i.e. by accelerating the passage through the design spiral. A successful conceptual design (modelling) tool facilitates and supports the current practice or strategy of reasoning in design and design modelling rather than replacing it with a completely new design paradigm. Its application should provide insight into the relationships and possible conflicts in the design space.
The automatic generation of an ‘optimum’ (concept) design on the basis of desired properties can hardly be considered a realistic research objective. The mix of formal and informal knowledge involved in design means that synthesis will remain the task of the designer. Information technology can be an important aid and no more.

4.2. Numerical Conceptual Design: The Concept Exploration Model

In practice, the large variety of design problems, boundary conditions and requirements means that hard-coded computerised ‘single problem’ solutions provide insufficient return on investment. In ‘average’ merchant ship projects, the time and money to develop a dedicated conceptual design model are usually lacking. However, in the case of high-value projects (e.g. future generation frigates or submarines) with a long project lead time in which many aspects need to be considered, an integrated conceptual design model can pay off. Such models contain a number of methods (e.g. for weight, cost, energy balance, resistance and propulsion) tuned to the particular ship type. For example, the Ship Design Department of the Royal Netherlands Navy frequently employs such models for a variety of ship types. These numerical models are referred to in literature as Concept Exploration Models (CEMs) [Georgescu, 1989].

A CEM is a parametric design model which is used to systematically search the design space for the ‘best’ starting point for the more detailed design, or to investigate the dependencies between design parameters by exploring a specific area in the design space. A CEM generates a large number of concept descriptions with their properties, fitting within a selected number of range values of ship main particulars. The concept descriptions are judged by a post processor or filter. The global architecture of CEMs is presented in Figure 4.2.
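The generate-and-filter structure of a CEM described above can be sketched as follows. This is a deliberately small illustration under stated assumptions, not the architecture of any existing CEM: the function `explore`, the parameter names, the block coefficient of 0.6 and the acceptance limits are all hypothetical.

```python
from itertools import product


def explore(ranges, aspect_models, accept):
    """Generate one concept per combination of range values, attach the
    predicted properties and keep the concepts that pass the filter."""
    names = list(ranges)
    concepts = []
    for values in product(*(ranges[n] for n in names)):
        concept = dict(zip(names, values))
        # Aspect models are evaluated in declaration order, so a later
        # model may use a property predicted by an earlier one.
        for prop, model in aspect_models.items():
            concept[prop] = model(concept)
        if accept(concept):
            concepts.append(concept)
    return concepts


# Range values of the main particulars (metres), illustrative only.
ranges = {"L": [80.0, 90.0, 100.0], "B": [12.0, 14.0], "T": [4.0, 5.0]}

# Hypothetical aspect models; 0.6 is an assumed block coefficient.
aspect_models = {
    "displacement": lambda c: 1.025 * 0.6 * c["L"] * c["B"] * c["T"],
    "L_over_B": lambda c: c["L"] / c["B"],
}


def accept(concept):
    # Post processor / filter with assumed limits.
    return concept["displacement"] >= 3400.0 and concept["L_over_B"] <= 7.5


feasible = explore(ranges, aspect_models, accept)
for concept in feasible:
    print(concept)
```

In a hard-coded CEM the aspect models and the filter are fixed in source code, which is the maintainability problem discussed in the next section.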
Figure 4.2: Concept Exploration Model

CEMs are used in the conceptual design phase as a CAD tool to support the dialogue between the client and the design team. Another application is to derive non-published data of competing designs. However, the final result of a CEM exercise should be a concept description which has the most desirable properties and can be used as the starting point for the more detailed design. The accuracy of the concept description depends on the combined accuracy of the aspect models included in the CEM.

4.3. Beyond Concept Exploration Models

In the development of a CEM, the gathering and analysis of ship data and available prediction tools requires much effort. In addition, the data need to be generalised into methods and rules for, amongst others, the calculation of building and life cycle cost. Effort is also involved in their validation and, last but not least, in the implementation of algorithms and their integration into the CEM.

A practical drawback of the conventional Concept Exploration Model is the restricted maintainability and adaptability of the software. Adapting a hard-coded CEM requires access to its source code, programming capabilities and knowledge of its internal structure. The latter in particular can be a problem if the software is transferred from one developer or user to another. Although the ship design community is clearly interested in computerised models for conceptual design, the problems with regard to knowledge acquisition, the often restricted applicability of the methods included, the high initial investment and the ongoing effort in maintenance and configuration management impose a restriction on their practical application. Due to this, the (conceptual) design process remains to some extent a ‘manual’ or even ‘intuitive’ search for good (starting) solutions. Moreover, the hard-coded conventional CEM is rigid and costly.
By using dedicated AI techniques, flexible and adaptable Concept Exploration Models can be built and maintained. These knowledge-based design models are not monolithic solutions in e.g. FORTRAN, PASCAL or C. The applied models and other knowledge elements are separated from their control. In other words, a suitable knowledge-based system (KBS) has facilities to infer when and how to use models and related data, depending on the problem and input provided. In principle, KBS are able to assist designers if tasks are generic, i.e. use a formalised domain and a set of standard inferences. For parametric ship design the domain consists of (numerical) aspect models, their parameters and the parametric concept description. Standard inferences deal with the selection of aspect models. The overall task of the KBS can be the assembling of a numerical model, the management of the concept description, the preparation of aspect model input (for instance by performing other calculations), their subsequent execution and the presentation of the results.

The separation of models and control has advantages for developing numerical design models:

• System design: Functional and technical design of the numerical design model are easier (input and output need not be fixed).
• Implementation: Easier and more cost effective (only declaration of model fragments; control structures are provided by the KBS).
• Problem-driven modelling: The numerical model is assembled during a reasoning process, starting with the definition of the top goal parameter(s).
• Prototyping: Fast and efficient; after declaring a model fragment, i.e. storing it in the knowledge base, it can immediately be used and linked with other methods.
• Maintenance: Easier and more cost effective; model fragments are maintained separately from their control.
• Flexibility: The assembled design models are adaptable at run time.
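The separation of model fragments from their control, and the problem-driven assembling listed above, can be made concrete with a small sketch. It is an assumption-laden illustration and not the QUAESTOR Modeller: fragments are used in one direction only, the first matching fragment is taken without backtracking, validity conditions are ignored, and all names and formulas (`assemble`, `run`, the three fragments) are hypothetical.

```python
def assemble(goal, known, knowledge_base, chain=None, visiting=()):
    """Backward reasoning from a top goal parameter: return an ordered list
    of fragments that computes `goal` from the parameters in `known`."""
    chain = [] if chain is None else chain
    if goal in known or any(frag["out"] == goal for frag in chain):
        return chain                      # already available or already planned
    if goal in visiting:
        raise ValueError(f"circular dependency on {goal!r}")
    for frag in knowledge_base:
        if frag["out"] == goal:
            for needed in frag["in"]:     # inputs become sub goals
                assemble(needed, known, knowledge_base, chain, visiting + (goal,))
            chain.append(frag)
            return chain
    raise LookupError(f"no fragment and no input value for {goal!r}")


def run(chain, known):
    """Execute the assembled model; the control lives here, not in the fragments."""
    values = dict(known)
    for frag in chain:
        values[frag["out"]] = frag["f"](*(values[p] for p in frag["in"]))
    return values


# Declaring a fragment is all the 'programming' there is (hypothetical content).
knowledge_base = [
    {"out": "volume", "in": ("L", "B", "T", "Cb"),
     "f": lambda L, B, T, Cb: L * B * T * Cb},
    {"out": "displacement", "in": ("volume",), "f": lambda v: 1.025 * v},
    {"out": "L_over_B", "in": ("L", "B"), "f": lambda L, B: L / B},
]

known = {"L": 100.0, "B": 16.0, "T": 5.0, "Cb": 0.5}
chain = assemble("displacement", known, knowledge_base)
result = run(chain, known)
print([frag["out"] for frag in chain], result["displacement"])
```

Only the fragments needed for the top goal end up in the model; the unused `L_over_B` fragment stays in the knowledge base. Appending a new fragment to `knowledge_base` makes it available to the next assembling run without touching `assemble` or `run`.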
Through the KBS the user can apply and access any aspect model or combination of aspect models. KBS have serious potential to increase the productivity in design modelling. An important problem to be solved is the ever existing gap between the worlds of software/knowledge engineers and designers. The ideal which should be pursued in applying KBS in design is that the knowledge engineer and the designer become one and the same person. This avoids time consuming and costly knowledge acquisition processes in which relevant information or aspects of knowledge are inevitably lost or overlooked. The aim is to allow the designer the immediate use of the KBS to manage and apply his knowledge in an easier and more effective way than with traditional means. In the sequel we will identify forms of knowledge involved in numerical design modelling activities.

5. KNOWLEDGE & KNOWLEDGE ACQUISITION

I was gratified to be able to answer promptly and I did. I said I didn’t know.
Mark Twain, Life on the Mississippi

Numerical relationships contain procedural and non-procedural aspects. The first section is an introduction to some of the non-procedural aspects of numerical knowledge. The notion that these aspects are used in modelling activities is introduced by examples of formal inferences that can be automated. Subsequently, some techniques are discussed which either are or can be used for deriving parametric representations from empirical observations, i.e. data sets. The nature of these data sets is discussed and an immediate use of data as a model is proposed. The chapter is concluded with a discussion of the application independent TELITAB data structure, used within QUAESTOR as the format for data storage and transfer between numerical model fragments which co-operate in a larger model.

5.1. Knowledge Used in Parametric Modelling

The parametric modelling of physical systems is the process of gathering and combining a set of numerical knowledge elements into a program or model which can be used for simulation and/or prediction. In the modelling process a form of backward reasoning is performed, viz. from parameters to models which can be used to determine their values. The starting point of any modelling effort is a desire to obtain a tool which supports what-if types of exercises. The model must be general enough to provide answers for a variety of inputs. The first step is to identify the parameters to be computed: the top goal parameters.

A model fragment is a relationship (in QUAESTOR: RELATION) between a number of parameters and is a procedural concept: if the values of the parameters in the right clause are known (DETERMINED), the value of the left clause of the RELATION can be computed. In this case the RELATION is only used OneWay (OW): it is used as a function. If the left clause and all except one of the parameters in the right clause are DETERMINED, the missing (PENDING) value can be computed provided the RELATION is used TwoWay (TW), i.e. as an equation.

Within the context of this thesis, a RELATION is a rule in mathematical form:

    y = f(x1,x2,...,xn)

and always contains the equality operator ‘=’. In the foregoing the concept CONSTRAINT was introduced. A CONSTRAINT is a mathematical expression consisting of:

    f(u1,u2,...,un) {=,<,etc.} g(v1,v2,...,vn)

or of sets in this form separated by logical operators (AND, OR, etc.) and nested by parentheses, if any. CONSTRAINT(s) express the validity of the RELATION to which they refer and must be either TRUE or PENDING before admission of the RELATION into the template. If more than one CONSTRAINT is connected to a RELATION, an implicit AND is assumed to exist between them.
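The OneWay and TwoWay use of a RELATION and the admission test on its CONSTRAINTs can be illustrated with a short sketch. Everything in it is an assumption made for illustration: the helper names, the dictionary layout, the choice of the Froude number relation Fn = V / sqrt(g L) as example, and the secant iteration used to invert the RELATION. It is not a description of the QUAESTOR solver.

```python
import math


def secant(g, x0=1.0, x1=2.0, tol=1e-10, max_iter=50):
    """Find x with g(x) = 0 by secant iteration (illustrative solver only)."""
    g0, g1 = g(x0), g(x1)
    for _ in range(max_iter):
        if abs(g1) < tol:
            return x1
        if g1 == g0:
            raise ArithmeticError("secant step undefined (flat residual)")
        x0, x1, g0 = x1, x1 - g1 * (x1 - x0) / (g1 - g0), g1
        g1 = g(x1)
    raise ArithmeticError("no convergence")


def use_relation(rel, values):
    """Return (name, value) of the parameter the RELATION can compute, else None."""
    lhs, rhs, f = rel["lhs"], rel["rhs"], rel["f"]
    missing = [p for p in rhs if p not in values]
    if not missing and lhs not in values:
        # OneWay use: the RELATION acts as a function of its right clause.
        return lhs, f(**{p: values[p] for p in rhs})
    if rel["two_way"] and lhs in values and len(missing) == 1:
        # TwoWay use: the RELATION acts as an equation for the PENDING parameter.
        target = missing[0]
        fixed = {p: values[p] for p in rhs if p != target}
        return target, secant(lambda x: f(**fixed, **{target: x}) - values[lhs])
    return None


def constraint_state(constraints, values):
    """Implicit AND over all CONSTRAINTs: 'TRUE', 'FALSE' or 'PENDING'."""
    state = "TRUE"
    for names, predicate in constraints:
        if any(n not in values for n in names):
            state = "PENDING"          # cannot be evaluated yet
        elif not predicate(*(values[n] for n in names)):
            return "FALSE"             # one violated CONSTRAINT is decisive
    return state


froude = {"lhs": "Fn", "rhs": ("V", "L"), "two_way": True,
          "f": lambda V, L: V / math.sqrt(9.81 * L)}
constraints = [(("L",), lambda L: L > 0.0)]   # hypothetical validity condition

values = {"Fn": 0.25, "L": 100.0}
if constraint_state(constraints, values) in ("TRUE", "PENDING"):
    name, value = use_relation(froude, values)   # solves for the speed V
    print(name, value)
```

With `two_way` set to False the same call returns None, because a OneWay RELATION may only produce its left clause.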
A combination of a CONSTRAINT and a RELATION forms a Rule:

    IF (f(u1,u2,...,un) {=,<,etc.} g(v1,v2,...,vn)) THEN
        y = h(x1,x2,...,xn)
    END IF

The above rule can be considered a production rule. The conclusion of (or parameter computed by) this rule, however, need not be fixed. Depending on its property

    OneWay: y := h(x1,x2,...,xn)
    or
    TwoWay: y = h(x1,x2,...,xn)

the conclusion can be, respectively, only y or one of the parameters y,x1,x2,...,xn. Apart from its procedural properties, the concept RELATION contains knowledge of the coherence between a second class of concepts, the parameters. Parameters are measurable or quantifiable characteristics. A parameter is indicated with a symbol. This symbol or identifier has a value, a description, a dimension and properties linked to it. A parameter may have a value that is either DETERMINED or PENDING. PENDING values are managed by the Modeller for later determination, if necessary.

A RELATION connects parameters. From the expression y=x1/x2 we infer that the values of x1 and x2 are needed to compute y, or the values of y and x1 to compute x2. In case the RELATION has the property OneWay, we infer that the RELATION cannot be used to compute x2 in case y and x1 are DETERMINED. In case the value of x1 is required and the value of y is DETERMINED, we infer that the value of x2 is required before we can compute x1. The parameter x2 becomes an additional sub goal for which a suitable RELATION must be selected by backward reasoning, i.e. a search for RELATIONs (Rules) which can be used to produce values (conclusions) of the PENDING parameters. Although in AI the term (backward or forward) chaining is used in a somewhat different context, this term is applied in the sequel to indicate the action of connecting a parameter to the RELATION in the knowledge base that will produce its value. If a CONSTRAINT a<b refers to the RELATION, the values of a and b are needed in addition to y and x1 to compute x2.
We can preliminarily select y=x1/x2 before a and b are inferred, but x2 cannot be computed (i.e. made DETERMINED) until a<b is DETERMINED and satisfied!

Apart from tagging characteristics, parameters appear to have properties which are local to the parameter and do not change with the expression in which it is used. Some examples:

• Should the value be provided by the user or should it be computed?
• The desired numerical format, e.g. fixed format with three decimal places.
• Is a restriction imposed on the value, e.g. should the value be Non Negative (NN)?

In case y is a Non Negative value and a negative value is either provided or computed, we infer that this value is illegal and should be rejected. This example indicates that not only CONSTRAINTs but also the properties of RELATIONs, parameters and CONSTRAINTs can be used in the selection of the model fragments for a model. These properties are further referred to as control attributes or simply as CONTROL.

The above simple examples show the essence of parametric modelling, which applies both to automated and to human model assembling activities. The assembling task consists of a reasoning process in which the connections (i.e. RELATIONs and CONSTRAINTs) between parameters are used to navigate from parameters to model fragments and vice versa. The search space for this navigation is limited in size by interpreting the various control attributes of the parameters, RELATIONs and CONSTRAINTs. These attributes and the derived inferences form the basis of the QUAESTOR shell and are further discussed in chapters 7 and 8.

5.2. Deriving RELATIONs and CONSTRAINTs

A knowledge-based system requires relevant knowledge to use and manage. For this purpose, QUAESTOR requires RELATIONs and their validity (CONSTRAINTs). Important sources for this knowledge are technical and scientific literature and accumulated company experience.
Examples are weight and cost figures or man hours per ton construction weight, depending on the construction type, the applied materials, etc. For a model basin, test data obtained with a variety of ship and propeller models are major sources from which RELATIONs can be derived.

Various techniques are available to derive trends or relations from data samples. Well known are the multiple linear regression (MLR) and non-linear regression (NLR) methods [Sen, 1990]. A typical application of regression analysis in the domain of ship propulsion is described in [Holtrop, 1984]. Basically, these techniques provide an approximation of a data sample which is valid within the bounds of that sample. For similar purposes an Artificial Neural Network (ANN) can be used [Weiss, 1990]. The advantages of ANNs over regression analysis are their better capability to approximate non-linear phenomena and the fact that the ANN training procedure requires less effort than creating a regression model. Within the NSMB Co-operative Research Ships, so-called back propagation ANNs [Hertz, 1991] are successfully applied for ship performance prediction. In essence, however, both MLR/NLR and ANN imply a modelling phase which requires effort and experience with the applied modelling technique. Moreover, the resulting model must still be implemented in a computer program before it can be used.

Instead of approximating by means of a numerical model, an option is to interpolate directly in the data sample. Traditional techniques allow line (2D), surface (3D) or matrix (nD) interpolation if the data is distributed in that way. Basically, the traditional interpolation techniques require that the number of dimensions (Nd) in a data set coincides with the number of independent parameters (Ni). The number of dimensions equals the number of independent parameters over which variations have been performed with sets of fixed values of the other independent parameters.
Each independent parameter should be related to one or more of the dependent parameters (Ni=Nd). In reality, the number of independent parameters hardly ever coincides with the number of dimensions in a data set (Ni>Nd). For example, a database of ship model and resistance data may have the following structure:

• 150 ship descriptions
• per ship description 30 describing parameters
• per ship description between 0 and 20 speed/resistance combinations
• per ship description between 0 and 20 speed/thrust-torque-revs combinations

In this data set the ship ‘varies’ and, per ship, only the speed and with it a number of speed-dependent parameters. For this database Ni=31 (30 description parameters and speed) whereas the number of dimensions in the database Nd=2 (ships and speed). This structure allows us to interpolate the dependent parameters resistance, thrust, torque and revs for other values than the given speeds, but only for ships given in the database. For design purposes we need to predict performance values for ships that are, by necessity, not in the database. The obvious approach is then to develop a regression model or neural network. Another possibility is to apply data sets of the form given in the above example as a model in themselves by means of a suitable interpolation technique.

Data samples which contain e.g. experimental information are scattered by nature and contain more independent parameters than data dimensions (Ni>Nd). An interpolation technique which can deal with such data sets is the so-called Gaussian Radial Basis Interpolation (GRBI) [Specht, 1991]. No assumptions are made on the distribution of the data points, nor on the shape of the interpolated surface (although it should be smooth). The non-linear estimation is based on a weighted average over all the data points. These weights are based on the probability that the estimated point is equal to a data point.
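The weighted-average idea can be sketched in a few lines of Python. The text does not state the weight function, so the Gaussian weight exp(-d²/(2σ²)) with a single smoothing parameter `sigma` is an assumption here, as are the function name `grbi_estimate` and the invented speed/power numbers; inputs are not scaled.

```python
import math
from typing import List, Sequence


def grbi_estimate(x: Sequence[float],
                  samples: List[Sequence[float]],
                  values: List[float],
                  sigma: float) -> float:
    """Gaussian-weighted average of `values` observed at `samples`.

    Assumed weight per data point: exp(-d^2 / (2 sigma^2)), where d is the
    Euclidean distance between the query x and that data point.
    """
    if not samples or len(samples) != len(values):
        raise ValueError("samples and values must be non-empty and of equal length")
    weights = []
    for point in samples:
        dist2 = sum((a - b) ** 2 for a, b in zip(x, point))
        weights.append(math.exp(-dist2 / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("query is too far from all samples for this sigma")
    return sum(w * v for w, v in zip(weights, values)) / total


# Invented one-dimensional sample: speed [knots] against power [kW].
speeds = [(10.0,), (12.0,), (14.0,)]
power = [5000.0, 9000.0, 15000.0]

at_sample = grbi_estimate((12.0,), speeds, power, sigma=0.2)  # close to 9000
outside = grbi_estimate((20.0,), speeds, power, sigma=2.0)    # stays within the sample range
```

Because the result is a convex combination of the observed values, a query far outside the sample cannot exceed the largest observation. A larger `sigma` smooths more strongly and pulls estimates towards the sample mean.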
The GRBI method is a special form of the radial basis functions often found in neural networks. GRBI is also known as Parzen Estimation. Advantages of GRBI are:

• It is a one-pass method which can generalise from examples as soon as they are stored.
• It provides smooth transitions from one observed value to another, even with sparse data in an N-dimensional space.
• By using a smoothing parameter, noisy data will be filtered.
• The method is capable of giving predictions with incomplete input, i.e. a prediction can be obtained in case some values of the independent parameters are not available.

A disadvantage is that the method is not able to extrapolate outside the bounds of the data sample, since the estimate is bounded by the maximum and minimum values of the observations. On the other hand, this method will not generate unrealistically large or small results in case extrapolation is performed outside the bounds of the data set. This phenomenon may e.g. occur with back propagation ANN in which the inputs are not restricted between the hard limits of the training sample. Another disadvantage is that the method shows a tendency to under-predict the relatively large values in the sample and to over-predict the relatively small values in the sample. In [vdBerg, 1996] a comprehensive overview of the merits and limitations of GRBI is provided.

Multi-dimensional interpolation techniques for both Ni=Nd and Ni>Nd enable deriving RELATIONs from arbitrary data samples without the necessity of a numerical modelling process. In addition, such techniques can be applied to replace complex and demanding calculation methods (aspect models) by systematic sets of output for specified ranges of input. Replacing an aspect model by a systematic set of output may reduce the required input by replacing input values with fixed values, thus restricting the range of application. In addition, output may be limited to the figures of interest.
In this way the design models become less complex whereas their execution speed may increase.

5.3. TELITAB: A Generic Parametric Data Format

In the modelling world the description of physical systems consists of a set of parameters and their values. Parameters provide meaning to values through their symbol, description, dimension and properties (e.g. is the value to be provided or calculated, numerical format, etc.). On the other hand, parameters provide access to numerical relationships and procedures by which their values are either calculated or used as input. In order to connect the I/O of numerical relationships and procedures, a generic data representation format is required. During the development of QUAESTOR the so-called TELITAB format evolved as a logical offspring of the data structure in the workbase. TELITAB stands for TExtLIst-TABle. The TELITAB format is used within QUAESTOR for all data storage and retrieval functions and for exchange of data with satellite applications.

A TEXT is unstructured information without parameters. In a ship model database, the type of ship, the yard, owner and some relevant details are nominal by nature (Table 5.1).

Table 5.1: A TEXT

“Panamax crude oil tanker for ....”
“Yard:....”
“Heavy propeller induced flow separation observed during tuft test”

A TEXT may consist of n variable length strings between quotation marks (n>=0), delimited by a carriage return and a line feed.

In general, physical systems are described parametrically by means of a number of singular Parameter Value Combinations (PVC’s), as e.g. in Table 5.2:

Table 5.2: Parameter Value Combinations

4              {Number of singular PVC’s}
“LPP”  120     {First PVC}
“B”    22      {Second PVC}
“T”    8       {Third PVC}
“VOL”  12600   {Fourth PVC}

This structure forms a LIST. The third part of the structure is the TABLE.
The TABLE generally contains a set of properties of the system; Table 5.3 gives the powering performance as an example:

Table 5.3: Speed/power TABLE

3                              {3 is number of columns}
      “Speed”  “Power”  “Revs”
“1”   11.0     7520     81.1
“2”   11.5     8770     84.6
“3”   12.0     11260    88.4
“4”   12.5     12107    92.4
“5”   13.0     14408    96.6
“6”   13.5     17290    100.9

The TABLE contains a number of columns, each being a multiple PVC, above which the parameter is placed between quotation marks. The TABLE contains a number of Cases for Speed, Power and Revs, and the last Case forms the end of the TABLE. Table 5.4 presents a propeller open water diagram in TELITAB format.

Table 5.4: Open water diagram in TELITAB format

“Open Water diagram Propeller model xxxx”
4
“AeAo”  0.75
“D”     7.0
“PDRA”  0.80
“Z”     4.0
4
      “EthaO”    “J”        “KQ”       “KT”
“1”   0.000E+00  0.000E+00  4.388E-02  3.590E-01
“2”   1.282E-01  1.000E-01  4.071E-02  3.281E-01
“3”   2.519E-01  2.000E-01  3.702E-02  2.930E-01
“4”   3.695E-01  3.000E-01  3.287E-02  2.544E-01
“5”   4.777E-01  4.000E-01  2.833E-02  2.126E-01
“6”   5.703E-01  5.000E-01  2.347E-02  1.682E-01
“7”   6.336E-01  6.000E-01  1.836E-02  1.218E-01
“8”   6.301E-01  7.000E-01  1.306E-02  7.387E-02

An inherent limitation of this basic TELITAB format is that it can only transfer a single 2D data set (e.g. a LIST with propeller particulars and a TABLE with open water data). It does not allow in the same set an additional table with e.g. the cavitation influence on KT and KQ as a function of the cavitation number, other than by ‘flattening’ all except one table. Flattening means that each value in the additional table is assigned to a separate parameter, i.e. is named, whereas in a TABLE a value is determined by the column parameter and row (Case) number (see Tables 5.5 and 5.6). A more generic solution than converting table elements into separate parameters is desirable to improve clarity and structure of the data and the ability to extend the number of Cases in a TABLE without defining additional parameters.
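The TEXT, LIST and TABLE parts of the basic format can be read with a short tokeniser, sketched below in Python. This is an illustration, not the QUAESTOR reader: the function name `parse_basic_telitab` is hypothetical, and the sketch assumes plain ASCII double quotes, numerical values only, and a file in which every TABLE cell is filled.

```python
import re
from typing import Dict, List, Tuple

# A token is either a quoted string or a run of non-blank characters.
_TOKEN = re.compile(r'"[^"]*"|\S+')

Table = Dict[str, Dict[str, float]]


def parse_basic_telitab(text: str) -> Tuple[List[str], Dict[str, float], Table]:
    """Split a basic TELITAB set into its TEXT, LIST and TABLE parts."""
    tokens = _TOKEN.findall(text)
    pos = 0

    # TEXT: all quoted strings before the first bare number.
    texts: List[str] = []
    while pos < len(tokens) and tokens[pos].startswith('"'):
        texts.append(tokens[pos][1:-1])
        pos += 1

    # LIST: a count followed by that many name/value pairs.
    pvcs: Dict[str, float] = {}
    if pos < len(tokens):
        count = int(tokens[pos])
        pos += 1
        for _ in range(count):
            pvcs[tokens[pos][1:-1]] = float(tokens[pos + 1])
            pos += 2

    # TABLE: column count, column names, then labelled rows (Cases).
    table: Table = {}
    if pos < len(tokens):
        ncols = int(tokens[pos])
        pos += 1
        columns = [t[1:-1] for t in tokens[pos:pos + ncols]]
        pos += ncols
        while pos < len(tokens):
            label = tokens[pos][1:-1]
            row = [float(t) for t in tokens[pos + 1:pos + 1 + ncols]]
            table[label] = dict(zip(columns, row))
            pos += 1 + ncols
    return texts, pvcs, table


sample = '''
"Open Water diagram Propeller model xxxx"
4
"AeAo" 0.75
"D" 7.0
"PDRA" 0.80
"Z" 4.0
4
      "EthaO"   "J"       "KQ"      "KT"
"1" 0.000E+00 0.000E+00 4.388E-02 3.590E-01
"2" 1.282E-01 1.000E-01 4.071E-02 3.281E-01
'''

texts, pvcs, table = parse_basic_telitab(sample)
```

The sample is the first part of Table 5.4. After parsing, a TABLE value is addressed by its Case label and column parameter, e.g. `table["2"]["KT"]`.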
With the basic format it rapidly becomes impractical to convert tables with many numbers into singular PVC’s, which invites solutions consisting of patchworks of files.

Since the basic TELITAB format had already been in use for several years in a variety of applications, it was desirable to find a solution which did not conflict with that format. By applying a recursive alternative of the TELITAB format both objectives are met: the existing format remains valid and a flexible and hierarchical data structure is obtained. The aspect of recursion is introduced by the data type OBJECT, whose value is another TELITAB set, making the format ‘self referring’. The solution is illustrated with the following example. Table 5.5 contains a LIST with the parametric ship description and the description of three rudders, and a TABLE with speed/resistance, in basic TELITAB format.

Table 5.5: Ship description in basic TELITAB format

“Calculation title”
17
“LPP”              50
“B”                12
“TA”               3.3
“TF”               3.1
“VOL”              900
“LCB”              -4.5
“Rudder_1_chord”   0.5
“Rudder_1_span”    1.5
“Rudder_1_thickn”  0.15
“Rudder_2_chord”   0.6
“Rudder_2_span”    1.3
“Rudder_2_thickn”  0.15
“Rudder_3_chord”   0.5
“Rudder_3_span”    1.5
“Rudder_3_thickn”  0.15
“CWP”              0.80
“CM”               0.78
2
      “Speed”  “Total_resistance”
“1”   5.0      100
“2”   10.0     400
“3”   15.0     1100
“4”   20.0     1900
“5”   25.0     2800

The basic TELITAB data format reflects the essence of analysis: a system description represented by a LIST of data and a representation of properties of this system by a TABLE. The TABLE has a number of records for which all values in the LIST apply, i.e. are inherited. Computational tasks performed during a design process use the system description at the input side. Of course, a single LIST may not always be the most convenient way to describe a system. The system may also contain elements to be represented in tabular form.
These elements may be hardware components (such as the above rudders), or may be conditions under which the system operates (e.g. a matrix of wave headings, frequencies and ship speeds). On the output side the basic TELITAB format assumes that calculations are either performed along one varying parameter (e.g. speed or time) or that the results of multiple parameter variations are combined into one TABLE (e.g. responses on the basis of speed, wave heading, frequency, etc.).

In Table 5.5, the rudder data “Rudder_1_chord” to “Rudder_3_thickn” are presented as a flattened table in LIST format, whereas Table 5.6 shows the same data as a TABLE:

Table 5.6: Rudder data in TABLE

3
      “chord”  “span”  “thickn”
“1”   0.5      1.5     0.15
“2”   0.6      1.3     0.15
“3”   0.5      1.5     0.15

which can be converted into the following TELITAB data set (Table 5.7):

Table 5.7: Rudder data in TELITAB format

1
“thickn”  0.15
2
      “chord”  “span”
“1”   0.5      1.5
“2”   0.6      1.3
“3”   0.5      1.5

This set can be considered as a named OBJECT, e.g. “Rudder”. By considering this name as a parameter we can view this TELITAB set as its Value. The complete set can then be rewritten into the form presented in Table 5.8. The number of singular PVC’s is reduced from 17 to 9. The opening brace under “Rudder” indicates the beginning of a new TELITAB set, which is delimited with a closing brace. Through the OBJECT name, the label number and the parameter, each element in the recursive TELITAB set can be addressed. This recursive data model enables the description of various sets of multiple similar objects in one data set, which is not possible in the basic format.
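The recursive structure and its symbolic addressing can be sketched with a small in-memory model. The class `TelitabSet` and the function `lookup` below are hypothetical names, and the address order OBJECT.parameter.label follows the addressing examples given later in this section. The fallback from a TABLE record to the LIST of the same set reflects the statement that LIST values apply to all records; the further fallback to the enclosing sets mirrors the inheritance described in the same later discussion.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union


@dataclass
class TelitabSet:
    """TEXT, LIST and TABLE of one TELITAB set; a LIST value may be an OBJECT."""
    text: List[str] = field(default_factory=list)
    pvcs: Dict[str, Union[float, "TelitabSet"]] = field(default_factory=dict)
    table: Dict[str, Dict[str, float]] = field(default_factory=dict)


def lookup(root: TelitabSet, address: str) -> Optional[Union[float, TelitabSet]]:
    """Resolve 'OBJECT.parameter.label', 'OBJECT.parameter' or 'parameter'."""
    chain = [root]
    parts = address.split(".")
    # Descend while the leading part names an OBJECT in the current set.
    while len(parts) > 1 and isinstance(chain[-1].pvcs.get(parts[0]), TelitabSet):
        chain.append(chain[-1].pvcs[parts[0]])
        parts = parts[1:]
    param = parts[0]
    label = parts[1] if len(parts) > 1 else None
    node = chain[-1]
    # 1. TABLE value in the requested record, if present.
    if label is not None and param in node.table.get(label, {}):
        return node.table[label][param]
    # 2. LIST of this set, then of the enclosing sets (inheritance).
    for candidate in reversed(chain):
        if param in candidate.pvcs:
            return candidate.pvcs[param]
    return None  # 'not available'


rudder = TelitabSet(
    pvcs={"thickn": 0.15},
    table={
        "1": {"chord": 0.5, "span": 1.5},
        "2": {"chord": 0.6, "span": 1.3},
        "3": {"chord": 0.5, "span": 1.5},
    },
)
ship = TelitabSet(
    text=["Calculation title"],
    pvcs={"LPP": 50.0, "B": 12.0, "Rudder": rudder},
    table={
        "1": {"Speed": 5.0, "Total_resistance": 100.0},
        "2": {"Speed": 10.0, "Total_resistance": 400.0},
        "3": {"Speed": 15.0, "Total_resistance": 1100.0},
    },
)

chord_2 = lookup(ship, "Rudder.chord.2")    # TABLE value of the second rudder
thick_2 = lookup(ship, "Rudder.thickn.2")   # taken from the LIST of "Rudder"
lpp = lookup(ship, "Rudder.LPP.2")          # taken from the top-level LIST
```

The data are those of Tables 5.5 to 5.7, abbreviated. Storing “thickn” once in the LIST of the OBJECT instead of once per rudder is what reduces the redundancy of the flattened form.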
Table 5.8: TELITAB set with OBJECT “Rudder”

“Calculation title”
9
“LPP”   50
“B”     12
“TA”    3.3
“TF”    3.1
“VOL”   900
“LCB”   -4.5
“Rudder”
{
1
“thickn”  0.15
2
      “chord”  “span”
“1”   0.5      1.5
“2”   0.6      1.3
“3”   0.5      1.5
}
“CWP”   0.80
“CM”    0.78
2
      “Speed”  “Total_resistance”
“1”   5.0      100
“2”   10.0     400
“3”   15.0     1100
“4”   20.0     1900
“5”   25.0     2800

On the output side of analysis models the system’s properties can be represented along more than one axis. The basic TELITAB format forces one to decompose computational tasks into modules generating one TABLE at a time or to merge multiple TABLEs into one. The latter solution is possible but not preferred: it is very difficult to do something useful with such TABLEs (e.g. interpolation). The decomposition of analysis programs into such modules is time consuming and, although the modules by themselves are simpler, it leads to more complex overall applications and loss of performance.

Table 5.9 presents an example output of DESP, a power prediction program for displacement ships [Holtrop, 1984], in the form of two output tables. The first table is the actual performance prediction whereas the second one contains the pulling performance at constant power.

Table 5.9: Part of DESP output

Block Coefficient      CB:    0.605
Blade area ratio       AEAO:  0.532
Pitch Diameter ratio   PDRA:  0.997
Correlation allowance  CA:    0.000267

Propulsion deep water (calm water)
VS       THRUST  ETA-D  CAVP   CAVN   N      PE     PS
[KNOTS]  [kN]    [-]    [-]    [-]    [RPM]  [kW]   [kW]
11.00    210.3   0.664  1.000  1.000  72.3   1003   1526
..       ..      ..     ..     ..     ..     ..     ..
22.00    1225.5  0.633  1.000  1.000  157.8  11692  18668
23.00    1431.9  0.626  1.000  1.000  167.8  14283  23053

Calculated pulling performance for constant power (calm water)
<---------------PS= 17500.0KW--------------->
VS       R-TOT  N      THRUST  PULL    1-T    CAVP   CAVN
[KNOTS]  [kN]   [RPM]  [kN]    [kN]    [-]    [-]    [-]
7.00     75.8   127.8  1621.5  1462.4  0.949  1.007  1.014
..       ..     ..     ..      ..      ..     ..     ..
20.00    717.8  150.7  1223.8  313.7   0.843  1.000  1.000
21.00    865.1  153.1  1191.1  138.9   0.843  1.000  1.000

In basic TELITAB format these two tables can be combined into the following single TABLE (not all parameters are included in Table 5.10):

Table 5.10: DESP output in basic TELITAB format

“Propulsion deep water (calm water) and”
“Calculated pulling performance for constant power 17500 kW”
4
“CB”    0.605
“AEAO”  0.532
“PDRA”  0.997
“CA”    0.000267
0
10
       “VS”  “RTOT”  “THR”  “PULL”  “ETAD”  “TDED”  “CAVP”  “CAVN”  “N”    “PS”
“1”    11            210            0.664           1.000   1.000   72.3   1526
..     ..    ..      ..     ..      ..      ..      ..      ..      ..     ..
“12”   22            1226           0.633           1.000   1.000   157.8  18668
“13”   23            1432           0.626           1.000   1.000   167.8  23053
“14”   7     76      1622   1462            0.949   1.007   1.014   127.8  17500
..     ..    ..      ..     ..      ..      ..      ..      ..      ..     ..
“27”   20    718     1224   314             0.843   1.000   1.000   150.7  17500
“28”   21    865     1191   139             0.843   1.000   1.000   153.1  17500

In recursive TELITAB format the output in Table 5.9 can be presented as two OBJECTs “Powering” and “Pulling” (see Table 5.11). The non-varying parameters CAVN and CAVP in the OBJECT “Powering” are now moved from the TABLE in the OBJECT to the LIST of the OBJECT. The LIST in Table 5.11 starts with the parameters CB, AEAO, PDRA and CA, which apply to all records in the OBJECTs “Powering” and “Pulling”. The recursive TELITAB set now consists of a LIST with four PVC’s and two OBJECTs, i.e. TELITAB sets.

Table 5.11: Part of DESP output in recursive TELITAB format

6
“CB”    0.605
“AEAO”  0.532
“PDRA”  0.997
“CA”    0.000267
“Powering”
{
“Propulsion deep water (calm water)”
2
“CAVP”  1.000
“CAVN”  1.000
5
       “VS”  “THR”  “ETAD”  “N”    “PS”
“1”    11    210    0.664   72.3   1526
..     ..    ..     ..      ..     ..
“12”   22    1226   0.633   157.8  18668
“13”   23    1432   0.626   167.8  23053
}
“Pulling”
{
“Calculated pulling performance for constant power 17500 kW”
1
“PS”  17500
8
       “VS”  “RTOT”  “THR”  “PULL”  “TDED”  “CAVP”  “CAVN”  “N”
“1”    7     76      1622   1462    0.949   1.007   1.014   127.8
..     ..    ..      ..     ..      ..      ..      ..      ..
“14”   20    718     1224   314     0.843   1.000   1.000   150.7
“15”   21    865     1191   139     0.843   1.000   1.000   153.1
}

Data stored in recursive TELITAB format can be symbolically addressed and retrieved, except for the TEXT, which can only be obtained by a special query ‘get TEXT of OBJECT ...’. The addressing mechanism requires some special features to exploit the hierarchic nature of the format. In case we need the cavitation influence CAVN in the “Powering” results at a speed VS of 22 knots, we ask for the value of the following parameter:

“Powering.CAVN.12”

which means ‘get TABLE value in record “12” of parameter CAVN in OBJECT “Powering”’. It is clear that no TABLE value of CAVN is available in “Powering”. However, the hierarchic nature allows inheritance, which means that, if “Powering.CAVN.12” cannot be found, “Powering.CAVN” is searched too, for which a value of 1.000 is returned. In case the pitch/diameter ratio “PDRA” which applies to the same record is needed, we ask for:

“Powering.PDRA.12”

which is not available in the TABLE of OBJECT “Powering”. The second query will be “Powering.PDRA”, which is also not successful. The third query will be in the top level LIST: “PDRA”, which returns a value of 0.997. In case a value is searched for in a non-existing OBJECT, e.g. “Power.PDRA.12”, a ‘not available’ is returned. In case “Powering” is asked for, the complete OBJECT between { } is returned as result, and the query “Powering.PS” returns a list of all values in the column PS.

The recursive TELITAB format complements the simplicity of the basic format with a more flexible format imposed by demanding applications such as SUBCEM (chapter 10). The format unifies the simple ‘Table/List’ data format as observed in many existing applications with object-oriented principles. Its merits are summarised as follows:

• Simple and generic link between applications at a symbolic level
• Each element in a database, viz. TEXT, (String)Value, Column or OBJECT, can be addressed by name and subsequently extracted, modified and replaced
• One program version per application in all environments
• Sequence of database is insignificant; elements can be inserted at any location
• Applications can be linked immediately to any database; universal parameter names are desirable but not obligatory
• Format is hierarchic and object-oriented; by using inheritance, data redundancy can be avoided or at least reduced

A disadvantage is that the process of reading data from a TELITAB file is slower than reading from a plain, fixed structure sequential input file.

The basic TELITAB format for input and output is used in about 50 MARIN applications. The recursive format is used in new applications.

6. QUAESTOR: AN INTRODUCTION

When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind: it may be the beginning of knowledge, but you have scarcely, in your thoughts, advanced to the stage of science.
William Thomson, Lord Kelvin, Popular Lectures and Addresses, 1891-1894

The aim of the knowledge-based system QUAESTOR is to facilitate the design knowledge acquisition and management task by combining storage, maintenance and application of design knowledge and data in a single environment. QUAESTOR supports the building and use of structures (‘knowledge bases’) from numerical rules to computational model fragments (RELATIONs), their validity (CONSTRAINTs), the applied variables (parameters) and relevant CONTROL knowledge about when and how to use the models. On the basis of a problem definition (compute parameters ...) a domain dependent numerical model can be assembled and executed. The defined, or existing, knowledge base (network of formulas) can be used to calculate or optimise any parameter used in any of the formulas or applications called.
This chapter describes the global architecture and some features and properties of the system. An example is given in the form of a highly simplified conceptual ship design model. Finally, the knowledge-based design model or Concept Variation Model is introduced.

6.1. The System

QUAESTOR is an expert system environment for a broad scope of numerical parametric applications. The system comprises an integrated network database management system for building and maintaining sets of numerical models and unifies aspects of, amongst others, rule-based KBS, constraint programming languages and computer algebra, as discussed in section 3.4. QUAESTOR provides an explicit knowledge representation and manipulation:

• Facilities for capturing and storing knowledge in a knowledge base that is transparent to the designer, who can examine and modify the contents of the knowledge base.
• Knowledge and inference are separated; calculation schemes are not fixed.
• The system is able to explain how and what knowledge has been used to solve the problem.

The global system architecture is presented in Figure 6.1. In the sequel its components are described briefly and references to other chapters of this work are provided.

[Figure 6.1: Global architecture of QUAESTOR. Labels in the figure: DESIGNER, USER INTERFACE, MODELLER (template), KNOWLEDGE BASE (Facts, Rules), DATA BASE, WORKBASE (Problem description, Problem status), Satellite Applications, Explanation, Facts and Rules from DESIGNER.]

I. Knowledge base

The knowledge base contains numerical model fragments which are formalised abstractions of design concepts. The following main elements are used to construct Rules (Figure 6.1):

• RELATIONs, which are numerical relationships between design parameters
• CONSTRAINTs, which determine whether RELATIONs can be used or not
• Parameters, used to describe the characteristics of a design

The model fragments and the structure of the knowledge base were introduced in section 3.3.
In section 7.1 the various frames and their slots are discussed in detail. The control knowledge or CONTROL slot captures the expectations of the knowledge engineer with regard to how and for what purpose RELATIONs and parameters can be used. The CONTROL is discussed in detail in section 7.2. The Facts in Figure 6.1 are fixed numerical values given in the knowledge base, e.g. the E-modulus of steel. In section 7.3 the syntactic and semantic aspects of RELATIONs and CONSTRAINTs are described.

II. Browser

The purpose of the Browser is to provide both the user and the knowledge engineer with insight into the contents of the knowledge base and into the structure of the assembled models. The Browser is able to generate lists of RELATIONs, CONSTRAINTs and parameters fulfilling user defined search criteria, e.g. which RELATIONs are now available for computing parameter x, or which parameter(s) can be computed by a particular RELATION, etc.

III. Database

Databases of PVC’s can either be distributed over the parameters in the knowledge base or can exist in the form of TELITAB files (section 5.3). In the syntax of QUAESTOR (section 7.3) a number of special functions are defined that can access and use such data for different purposes.

IV. Satellite Applications

Satellite applications are invoked to perform specific computational tasks. The interfacing between QUAESTOR and these applications is by means of TELITAB data files (sections 2.4 and 5.3).

V. Modeller

The purpose of the Modeller is to support the user in assembling a domain dependent numerical model or template. This template represents a valid path through the knowledge base from input to output parameters. For each user defined or additional PENDING parameter, viz. goal and sub goal parameters, the Modeller selects all unused and non-rejected RELATIONs. Heuristic rules embedded in the Modeller are used to obtain a probability ranking of the selected, suitable relations (section 8.2).
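The selection step can be pictured as a loop over a goal list, sketched below in Python. This is a strongly simplified illustration and not the Modeller itself: the names `Relation` and `assemble_template` and the relation names are hypothetical, the probability ranking is replaced by simply taking the first candidate, RELATIONs are treated as OneWay, and CONSTRAINTs and the dialogue with the user are left out.

```python
from typing import FrozenSet, Iterable, List, NamedTuple, Set, Tuple


class Relation(NamedTuple):
    name: str                # hypothetical identifier of the RELATION
    output: str              # parameter it computes
    inputs: Tuple[str, ...]  # parameters it needs


def assemble_template(goals: Iterable[str],
                      determined: Iterable[str],
                      relations: List[Relation],
                      rejected: FrozenSet[str] = frozenset()) -> List[str]:
    """Chain every PENDING parameter to a RELATION that can produce its value."""
    template: List[str] = []
    resolved: Set[str] = set(determined)
    used: Set[str] = set()
    pending: List[str] = list(goals)
    while pending:
        goal = pending.pop(0)
        if goal in resolved:
            continue
        # All unused and non-rejected RELATIONs that can compute the goal.
        candidates = [r for r in relations
                      if r.output == goal
                      and r.name not in used
                      and r.name not in rejected]
        if not candidates:
            raise LookupError(f"no RELATION available for {goal}")
        chosen = candidates[0]  # placeholder for the probability ranking
        used.add(chosen.name)
        template.append(chosen.name)
        resolved.add(goal)
        # PENDING inputs of the accepted RELATION become sub goals.
        new_goals = [p for p in chosen.inputs
                     if p not in resolved and p not in pending]
        pending.extend(new_goals)
    return template


relations = [
    Relation("eta_open_water", "EthaO", ("J", "KT", "KQ")),
    Relation("kt_b_series", "KT", ("J", "PDRA", "AeAo", "Z", "dKT")),
    Relation("kq_b_series", "KQ", ("J", "PDRA", "AeAo", "Z", "dKQ")),
]
determined = ["J", "PDRA", "AeAo", "Z", "dKT", "dKQ"]

template = assemble_template(["EthaO"], determined, relations)
```

The parameter names are those of the propeller example used in the interface frames of section 6.2. Requesting EthaO pulls in the relations for KT and KQ as sub goals; rejecting the only KT relation leaves that sub goal without a candidate.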
The ‘most probable’ RELATION is either selected or proposed to the user. The user may decide to accept or reject the proposed relations. Upon accepting a RELATION the user can change DETERMINED values or may provide values of the PENDING parameters in the proposed RELATION (or CONSTRAINT). If no values are provided, the PENDING parameters in the accepted RELATION are included in the goal list. An example of a modelling session is presented in section 6.5. The modelling process is described in some detail in section 8.2. The non-linear incremental solver subsequently solves the problem by separating the template into subsystems of equations or cycles (section 8.3).

VI. Workbase

In the workbase, a (design) problem is defined by requesting the value(s) of PENDING parameter(s) in the knowledge base. The workbase contains the temporary data, i.e. the Facts related to the current problem: input values, (intermediate) results and template structure. If a solution is found, it is possible to recalculate with other input values, to change the problem while maintaining the current data or even to reverse the original problem.

VII. Explanation facility

At any moment during a dialogue, the assembled model and the dependencies in the template can be studied. In Table 6.1 an example template is represented in a standard text format. Upon request of the user, the interface shows lists of e.g. rejected RELATIONs or detailed information about the modelling steps taken and about the dependencies in the assembled template.

6.2. User Interface

The frames in Figure 6.2 represent the general layout of the system. The arrows show how the various functions of the system are connected.
[Figure 6.2: Map of interface. Frames in the figure: Menu, Workbase, Network Management, Parameter/Expression List, Data Editor, Context Sensitive Help.]

The user interface has two modes of operation: as interface between the knowledge engineer and the knowledge base, and as interface between the designer and the Modeller, workbase and explanation facilities. The knowledge engineer acquires valid numerical relationships and stores them in the knowledge base. In order to provide good access to specific parts of the knowledge base, it should be transparent and modular. The system allows the designer to assemble, in dialogue, models for arbitrary problems, using the contents of the knowledge base. Therefore, the user can access RELATIONs, CONSTRAINTs and CONTROL in the browser at any moment during a problem solving dialogue. The interface is based on Lotus 1-2-3 ‘classic’™ conventions and contains the following facilities:

I. Network Management

In this dialogue screen the knowledge base can be consulted and maintained. The problem solving dialogue also mainly takes place here. The screen contains three viewports, respectively for RELATIONs, CONSTRAINTs and parameters. Each viewport presents all slots of a particular type of frame as described in section 7.1. The screen layout is presented in Frame 6.1.
Frame 6.1: Network Management

[C] File(s): ANATRIAL/ ¦ Time: 12:30:52 ¦ Date:1996-02-26
Relation  < KT = POL(1,J,PDRA,AeAo,Z) + dKT ¦
Control   < HARD, TW, AND, ON, EMPIRICAL, CLASS: Propeller Polynomials >
Reference < Thrust coefficient Wageningen B-series ¦
Data      < Polynomial coefficients for KT: >More
--1 YES--X----------------------- - P R P&R C P&C----------------------+----
Constraint< Type_Prop = 1 ¦
Control   < EPS 0.01, HARD, TR, ON, NO MESSAGE ¦
Reference < Fixed pitch open type propeller, Wageningen B-series ¦
Data      < ¦
----------X----(C) MARIN 1996----- - P R P&R C P&C----------------------+----
Parameter < KT > QPL
Control   < INI 0.01000, COL 9, FF 5, NN, SYS, OUT, VALUE, NO PARAMETERS ¦
Reference < Thrust coefficient of propeller ¦
Data      < ¦
Value  QWB< >INPUT Dimension ¦ [-] CLASS: Propeller Parameters >64946
1-Help,2-Edit,3-ScrHlp,4-VwPict,5-Combin.,6-String,7-LiNo,8-Keydf,9-Par.,10-Rel.
KT = POL(1,J,PDRA,AeAo,Z) + dKT
QUAESTOR Network Manager [/]=Menu READY FREE EDIT LINE INS NUM PROC STAN

II. Parameter List

In this list all parameters, or a subset of the parameters, in the knowledge base are presented in alphabetical order. All slots of the parameter frames can be accessed. Input for calculations can also be supplied in this list. The screen layout is presented in Frame 6.2.
Frame 6.2: Parameter List

M Parameter  Value    Unit     Reference                                    35 frames in list
  AeAo                [-]      Expanded blade area ratio of propeller >More
  C0.75               [m]      Chord length of the propeller blade on .75R
  CT                  [-]      Thrust coefficient of propeller
  D                   [m]      Propeller diameter (or inner duct diameter)
  dKQ                 [-]      Viscous scale effect on KQ
  dKT                 [-]      Viscous scale effect on KT
  EthaO               [-]      Propeller open water efficiency
  Has                 [m]      Distance between the propeller centreline
  HRAD                [-]      Hub/Diameter ratio of propeller >More
  J                   [-]      Advance coefficient of propeller
  kp         0.00003  [m]      Propeller blade surface roughness, >More
  KQ                  [-]      Torque coefficient of propeller
  KT                  [-]      Thrust coefficient of propeller
  KTN                 [-]      Thrust coefficient of nozzle
  N                   [1/min]  Propeller rotation rate
  PDRA                [-]      Pitch-diameter ratio of propeller
  Z                   [-]      Number of propeller blades
F1-Help,2-Edit,3-ScrHlp,4-VwPict,5-Comb,6-String,7-Full,8-Coupl,9-RefDat,10-Name
KT (current parameter in Network Manager)
dKQ Class: Propeller Parameters
QUAESTOR Parameter List [/]=Menu READY FREE EDIT LINE INS PROC STAN

III. Expression List

In this list a user specified subset of the expressions (RELATIONs and CONSTRAINTs) in the knowledge base can be presented, e.g. the RELATIONs that may be used to calculate a specific parameter. The Expression List is similar to the Parameter List. The screen layout is presented in Frame 6.3. The user can specify selection criteria for the subset to be presented.

Frame 6.3: Expression List

Expression               Constraints?  Reference                                 20 frames in list
KT=f(J,PDRA,AeAo,Z,dKT)  1             Thrust coefficient, Wageningen B-series >More
KQ=f(J,PDRA,AeAo,Z,dKQ)  1             Torque coefficient, Wageningen B-series >More
KT=f(PDRA,J,dKT)         2             KT ducted propeller, nozzle 19A, Ka 3-65 >More
KT=f(PDRA,J,dKT)         2             KT ducted propeller, nozzle 19A, Ka 4-70 >More
KT=f(PDRA,J,dKT)         2             KT ducted propeller, nozzle 19A, Ka 5-75 >More
KT=f(PDRA,J,dKT)         2             KT ducted propeller, nozzle 19A, Ka 4-55 >More
KT=f(PDRA,J,dKT)         2             KT ducted propeller, nozzle 22, Ka 4-70 >More
KT=f(PDRA,J,dKT)         2             KT ducted propeller, nozzle 24, Ka 4-70 >More
KT=f(PDRA,J,dKT)         2             KT ducted propeller, nozzle 37, Ka 4-70 >More
KT=f(PDRA,J,dKT)         2             KT ducted propeller, nozzle 33, Kd 5-100 >More
KTN=f(PDRA,J)            2             KT nozzle 19A, Ka 3-65 >More
KTN=f(PDRA,J)            2             KT nozzle 19A, Ka 4-70 >More
KTN=f(PDRA,J)            2             KT nozzle 19A, Ka 5-75 >More
KTN=f(PDRA,J)            2             KT nozzle 19A, Ka 4-55 >More
KTN=f(PDRA,J)            2             KT nozzle 22, Ka 4-70 >More
KTN=f(PDRA,J)            2             KT nozzle 24, Ka 4-70 >More
KTN=f(PDRA,J)            2             KT nozzle 37, Ka 4-70 >More_
F1-Help,2-Edit,3-ScrHlp,4-VwPict,5-Comb,6-String,7-Full,8-Coupl,9-RefDat,10-Name
KT (current parameter in Network Manager)
KT = POL(J,PDRA,AeAo,Z) + dKT Class: Propeller Polynomials
QUAESTOR Expression List [/]=Menu READY FREE EDIT LINE INS PROC STAN

IV. Workbase

The Workbase presents the input and (intermediate) results of calculations and can be accessed at any moment during the dialogue when the Modeller consults the user. The Workbase presents a LIST of singular Parameter Value Combinations (PVC’s) and a TABLE containing the varying input and output. Similar to spreadsheets, the workbase is cell oriented. The layout of the workbase is presented in Frame 6.4.

V. Data Editor

A full-screen editor in which large expressions and the multi-line reference and data texts can be viewed and/or modified. The user can install a preferred standard ASCII editor.
Frame 6.4: Workbase

LIST
AeAo       0.750
dKQ        0.000E+00
dKT        0.000E+00
POD        0.850
Type_Prop  1
Z          4

TABLE
      EthaO   J       KQ        KT
 1    0.000   0.000   0.04996   0.38562
 2    0.061   0.050   0.04834   0.37067
 3    0.121   0.100   0.04658   0.35462
 4    0.180   0.150   0.04468   0.33752
 5    0.238   0.200   0.04265   0.31945
 6    0.295   0.250   0.04050   0.30047
 7    0.350   0.300   0.03824   0.28064
 8    0.404   0.350   0.03588   0.26004
 9    0.455   0.400   0.03342   0.23871
10    0.503   0.450   0.03087   0.21674
11    0.547   0.500   0.02824   0.19418
12    0.586   0.550   0.02554   0.17110
13    0.619   0.600   0.02278   0.14757
14    0.641   0.650   0.01996   0.12364
15    0.648   0.700   0.01710   0.09939
16    0.629   0.750   0.01420   0.07488
17    0.567   0.800   0.01127   0.05018
F1-Help,2-Edit,3-ScrnHelp,4-ViewPic,5-SavIni,, 7-Full,8-Coup,9-Report,10-Control
Requested Value(s) - INI 0.600, COL 9, FF 3, NN, USR, OUT, VALUE, NO PA
EthaO [-] Open water efficiency of propeller
0 EthaO by: EthaO = J*KT/(2*3.141592654*KQ)
Solution found, leave workbase to proceed [ESCAPE], [LEFT] or [RIGHT]
QUAESTOR Workbase [/]=Menu                   READY FREE SOLV LINE INS PROC STAN

VI. Menu

Commands related to database management, Modeller, graphics, output, etc. are given through a tree-structured menu. A context-sensitive help system can be accessed from all quarters of the system, i.e. in the menu, on the various screen locations, in error conditions, to obtain background information on suggestions during a dialogue, etc.

6.3. Tasks and Competence

Solving problems within QUAESTOR is the opposite of performing calculations with conventional computer programs. The most important difference concerns the ‘box colour’. In contrast to computer programs dedicated to design calculations, the process is not a ‘black box’. Whereas the algorithm of a conventional application is enclosed within the software, the algorithm of a knowledge-based model is partly ‘system’ and partly ‘user’. This means that control over the inclusion and execution of model fragments is exercised partly by the system and partly by the user.
The system’s interface is obviously high-end and non-directive [Steels, 1989] since it does not instruct the user to take particular steps and allows full freedom of action. Through such interfaces the two ‘parties’ involved, i.e. system and user, support and supplement each other during the dialogue in order to achieve the defined goal. In fact, the user is located in the process and forms part of the algorithm. This implies that the part of the algorithm dealt with by the Modeller must be totally transparent to the user. On the other hand, the Modeller should be able to ‘invoke’ the user in case data or decisions are required.

Using the RELATIONs in the knowledge base, the Modeller proposes, asks and calculates. The ‘program flow’ is controlled by providing additional knowledge, i.e. values of the presented parameters, and by taking decisions (the primary role of designers). More demanding than providing values of the known parameters are the decisions and choices that must be made in the course of the dialogue. These decisions are mainly the rejection or acceptance of proposed RELATIONs, e.g. which of the available (empirical) RELATIONs are to be used, and they require deep knowledge of the domain of the knowledge base. This know-how is most probably available with self-made knowledge bases but is less self-evident if third-party knowledge bases are used. In no way does the system relieve the user of the need for professional know-how. Deep knowledge of the defined problem and its context is required for taking the proper decisions during the process.

A designer using QUAESTOR is regarded as a modelling system consisting of two co-operating subsystems, each with its specific tasks, capabilities and competence. These two subsystems communicate by transferring data and stimuli to each other.
The term ‘stimulus’ (thing that rouses to activity or energy) is preferred over ‘instruction’ (making known what is to be done) as neither of the two subsystems is able to predict exactly the steps of the other subsystem that result from its stimulus. In Table 6.1 an overview of the exchanged data and stimuli is provided.

Table 6.1: Data and stimuli

User → Modeller             Modeller → User
select goal parameter(s)    present RELATION and parameter
start dialogue              ask (starting) values
proceed assembling          warn and ask to reconsider
provide (starting) value    issue warnings and error messages
halt                        present (intermediate) results
accept                      advise
reject forever              show progress
reject now

QUAESTOR needs typical ‘human’ know-how to perform its model assembling task. It can efficiently store and use numerical knowledge but it has no knowledge about its purpose other than that captured in the CONTROL slots. This reflective knowledge only expresses some expectations and typical properties of the frame contents and is only a low-level link between the knowledge and the user in the outside world. The purpose itself can only be provided by the user.

In order to act in the dialogue, a user needs to know about the play. Even if it is an ‘experimental’ performance, some basic rules must be followed and understood. The user needs to understand the most important stage management rules, which give him the means to follow the performance and to act accordingly. It is observed that experts (designers) without prior experience of knowledge-based parametric model assembling can use their first application after an introduction of about 4 hours.

6.4. Expert Questions

The following describes some important properties of QUAESTOR by answering some frequently asked expert questions.

1) How is knowledge base maintenance done?

The shell includes facilities for knowledge base management.
The knowledge base is a network of frames, each containing a RELATION, a CONSTRAINT or a parameter (see also Figure 3.3). This network structure is invisible to the user; network maintenance is fully automated. The introduction of a new model fragment is purely declarative: if existing parameters are present in the new expression, its frame is automatically linked with these existing parameter frames. If a parameter is removed from an expression by editing, the link between the expression frame and the parameter frame is removed. If the parameter is no longer used in other expressions, its frame is removed and all pointers in the knowledge base are recalculated. If, for example, a parameter is given the name of an existing one, the former is removed from the knowledge base and all links between expressions and the removed parameter are connected to the frame of the latter parameter. The network database combines acceptable performance for maintenance purposes with the high-performance queries required by the knowledge-based Modeller.

2) From what sources do knowledge inputs to the knowledge base originate?

Basically QUAESTOR is a system for individual use, with a high-end interface like a spreadsheet. When large and complex applications are developed, however, model fragments originating from various external sources are included in the knowledge base under the responsibility of a knowledge engineer. The knowledge engineer and the end user can be one and the same person, although this is not necessary. Existing computer programs can be used as satellite applications without significant adaptation.

3) Is it necessary to transform existing models implemented in computer programs into RELATIONs, CONSTRAINTs and parameters in order to apply them in QUAESTOR?

Certainly not. The shell can be operated as a ‘controller’ of other, existing software (e.g. a program predicting ship propulsive power).
If analysis models are already available in the form of a computer program, it is not necessary to develop custom models in a QUAESTOR knowledge base. Experience shows that knowledge bases should ideally contain a relatively small number of model fragments that call upon a number of satellite applications. QUAESTOR has functions for invoking these applications and for reading or interpolating in their (tabular) output, which is similar to using external databases, again all in TELITAB format. The larger knowledge bases which have been developed, e.g. for the design of submarines and naval surface vessels, are all mainly composed of standard or dedicated applications. The smallest possible number of parameters, RELATIONs and CONSTRAINTs in a QUAESTOR knowledge base ‘glue’ these applications together. Obviously, all parameters that can be either input or output in a design study should at least be present in the knowledge base. In combination with a proper use of the facilities for knowledge base structuring, knowledge bases can be maintained in the longer term.

4) What is the search strategy of the inference engine (Modeller)?

The Modeller starts with one or more top goal parameter(s) indicated by the user with ‘?’. These top goals are chained to RELATIONs through backward reasoning, i.e. RELATIONs are selected which can be used to compute the values of these goals. The RELATION can be a simple formula (goal parameter in left clause), an equation (goal parameter in right clause) or a satellite application. When a value becomes DETERMINED, either by calculation or by input, the Modeller performs a forward reasoning search in order to find RELATIONs that can be ‘fired’, i.e. what can be calculated given the current set of values? The selected list of suitable RELATIONs for a particular goal parameter is prioritised through heuristic rules implemented in the Modeller.
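The backward reasoning step described above can be illustrated with a minimal sketch. It is not the Modeller's implementation: each RELATION is reduced to a hypothetical pair (parameter it determines, parameters it needs), with the dependency signatures borrowed from Frames 6.3 and 6.4, and the heuristic prioritisation is replaced by the assumption that the first matching RELATION is used. Equations solved for a right-clause parameter, forward reasoning and backtracking are left out. The names `RELATIONS` and `assemble` are illustrative only.

```python
# Hypothetical, simplified representation: each RELATION is reduced to
# (parameter it determines, parameters it needs).
RELATIONS = [
    ("EthaO", ("J", "KT", "KQ")),
    ("KT", ("J", "PDRA", "AeAo", "Z", "dKT")),
    ("KQ", ("J", "PDRA", "AeAo", "Z", "dKQ")),
]


def assemble(goal, relations):
    """Chain the goal backwards; return (chained, needed).

    chained maps each parameter to the inputs of the RELATION chosen for it;
    needed lists the parameters for which no RELATION exists, i.e. the
    values the user would be asked to provide.
    """
    chained = {}
    needed = []
    goals = [goal]
    while goals:
        parameter = goals.pop(0)
        if parameter in chained or parameter in needed:
            continue  # already dealt with
        candidates = [r for r in relations if r[0] == parameter]
        if not candidates:
            needed.append(parameter)  # cannot be inferred: ask the user
            continue
        # Assumption: the first candidate stands in for heuristic ranking.
        _, inputs = candidates[0]
        chained[parameter] = inputs
        goals.extend(inputs)  # PENDING inputs become sub goals
    return chained, needed


chained, needed = assemble("EthaO", RELATIONS)
```

With `EthaO` as top goal, the sketch chains `EthaO`, `KT` and `KQ` to their RELATIONs and reports `J`, `PDRA`, `AeAo`, `Z`, `dKT` and `dKQ` as values to be supplied.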
Details of the heuristic rules and of the modelling process are provided in section 8.1. The most probable RELATION is either proposed or immediately included in the template, depending on the settings of control attributes (see section 7.2). Proposed RELATIONs can be accepted or rejected by the user and PENDING values can be provided. By doing so, the designer explicitly or implicitly controls the model assembling process and thus the way in which the problem is solved. PENDING parameters in the selected RELATIONs are included in the goal list. The basic principle of the process is that the output of one RELATION is used as input for others. In the case of ‘blind alleys’ the Modeller applies backtracking techniques. At any moment during a dialogue, values can be changed, made DETERMINED or made PENDING. The Modeller is able to respond properly to any user action or stimulus. Templates use input and produce output in TELITAB data format, presented in the workbase. The guiding principle is that all processes involved, either within the shell or in satellite applications, use and generate data in this format.

5) How does the system distinguish between dependent and independent parameters in a model?

By default, the Modeller makes no distinction between dependent and independent parameters. By providing control knowledge for parameters and RELATIONs during the development of a knowledge base, it is possible to define whether a parameter can be determined at all. In practice, control attributes are used to limit user interference and decisions during a dialogue and to reduce the search space of the Modeller, thus increasing its performance. Whether a parameter is dependent or independent is not fixed in advance; it depends entirely on the problem defined and the input provided. The same holds for parameter variations: the user is entirely free in selecting input parameters to be varied.
The Modeller automatically concludes which parameters depend on, and vary with, the input and stores their values in the workbase. Through its ability to compute solutions for problems with one or more varying independent parameter(s), the system is suitable for sensitivity analyses or Concept Exploration.

6) Can problems only be solved by having a dialogue session with the system?

No, any problem and solution in the form of user input and RELATIONs, selected either by the system or by the user, can be stored in a MACRO. A MACRO can be considered and used as a ‘traditional’ imperative computer program. However, these ‘programs’ can be dynamically adapted at runtime since full access to the Modeller remains available. This makes it possible to infer unavailable input values or to provide additional input, thus fixing particular intermediate results of the MACRO. The latter results in automatic pruning, i.e. removal from the MACRO template of the RELATION(s) made obsolete by the additional input.

6.5. A Simple Application Example

The following example is a simplified version of the design model developed for the TROIKA drone (section 10.2). The purpose of the example model is to evaluate the effect of the design speed on the displacement and design draft. The influence of the weight of the propulsion unit and of the fuel quantity is often neglected when the propulsive power is computed as a function of design speed. Increasing the design speed leads to an increase of the required propulsive power, thus increasing the propulsion unit weight and the amount of fuel required for a particular endurance. This is a clear example of the iterative character of ship design. The design problem is compressed into the following seven equations or RELATIONs.

The displacement DisW is the sum of three weight groups: the payload PL, the fuel weight W_fuel and the light ship weight W_light.

DisW = PL + W_fuel + W_light (eq. 1)

The light ship weight W_light is the sum of the steel and outfitting weight, which depends on the hull main dimensions, and the machinery weight W_mach. It is expressed in the following simplified equation:

W_light = 0.25*Length*Beam*Depth + W_mach (eq. 2)

The weight of the fuel W_fuel is obtained by multiplying the specific fuel consumption, assumed to be 0.25 kg/kWh, by the installed Power and the endurance Endu:

W_fuel = 0.250*Power*Endu/1000 (eq. 3)

The installed propulsive Power is approximated with the admiralty coefficient Cad and the design speed V_des. This expression simulates the satellite power prediction program used in the real TROIKA design model:

Cad = {DisW}^0.66*V_des^3/Power (eq. 4)

RELATIONs in QUAESTOR consist of one parameter as left clause and an expression as right clause. W_light (eq. 2) clearly is a OneWay RELATION. The designer knows that neither one of the main dimensions nor the machinery weight can ever be determined on the basis of the light ship weight W_light. By including this knowledge in the frame of (eq. 2), the designer has ruled out the possibility for QUAESTOR to calculate one of the main dimensions if W_light is known, implicitly reducing the search space of the Modeller. The RELATION for the admiralty coefficient Cad (eq. 4) is only partly OneWay. The designer knows that the displacement DisW will never be derived from the admiralty coefficient Cad. Therefore, this possibility is ruled out by placing DisW in (eq. 4) between so-called chain-prevention braces {DisW}. Now the admiralty coefficient Cad, design speed V_des and Power can all be calculated with (eq. 4). The displacement DisW, however, cannot be calculated.

The weight of the propulsive machinery W_mach is obtained by assuming it to be directly proportional to the installed Power:

W_mach = 0.04*Power (eq. 5)

The next RELATION expresses the Depth of the hull as the sum of the draft and the minimum freeboard FRBD required by stability aspects:

Depth = Draft + FRBD (eq. 6)

The last RELATION expresses the basic hydrostatics in the form of Draft as a function of the main dimensions Length and Beam and the displacement DisW, valid for the class of hull forms considered:

Draft = DisW/(0.45*Length*Beam) (eq. 7)

The problem to be solved is the displacement DisW as a function of the selected design speed V_des, assuming a given class of parent hull forms. Since the main dimensions Length and Beam are fixed by the dimensions of the magnetic coil (see section 10.2), the displacement DisW and Draft are PENDING. The steel weight depends on the Depth of the hull (eq. 2), which is determined by the sum of Draft and freeboard FRBD (eq. 6).

A key principle applied by QUAESTOR is that a parameter can be chained to, i.e. is determined by, only one RELATION. This means that each parameter is derived from one RELATION in the template. Which RELATION is used for which parameter depends on the defined problem and on the input provided during the dialogue. However, a parameter can be present in more than one equation of the template, even if cycles are introduced as with the parameters DisW, Power and Draft (Table 6.2). A cycle is present when a goal parameter is used as input in its own (sub-)template. In such cases, the (non-linear) equations are solved as a system through successive Newton-Raphson iterations (section 8.3). However, for the purpose of controlling the reasoning process the parameter is stated to be determined by one RELATION only, for Power: (eq. 4) and for DisW: (eq. 1). For each parameter, the RELATION by which it is introduced in the template is recorded. The template of this example is shown as an inference tree in Table 6.2 and as a directed network in Figure 6.3.
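To make the cycle concrete, the sketch below solves (eq. 1) to (eq. 7) simultaneously with a plain Newton-Raphson iteration. It is an illustration under stated assumptions, not QUAESTOR's solver (section 8.3): the Jacobian is approximated by finite differences, (eq. 4) is multiplied through by Power to keep the residual well behaved, and all input values in `INPUT` (payload, main dimensions, freeboard, endurance, admiralty coefficient, design speed) as well as the starting values are invented for the example and are not TROIKA data. All function and variable names are hypothetical.

```python
# Illustrative sketch: every input value below is an assumption, not TROIKA data.
INPUT = {"PL": 20.0, "Length": 25.0, "Beam": 4.5, "FRBD": 1.0,
         "Endu": 24.0, "Cad": 100.0, "V_des": 10.0}
NAMES = ["DisW", "W_light", "W_fuel", "Power", "W_mach", "Depth", "Draft"]


def residuals(x, inp=INPUT):
    """Left side minus right side of (eq. 1) to (eq. 7)."""
    dis_w, w_light, w_fuel, power, w_mach, depth, draft = x
    lb = inp["Length"] * inp["Beam"]
    return [
        dis_w - (inp["PL"] + w_fuel + w_light),                  # eq. 1
        w_light - (0.25 * lb * depth + w_mach),                  # eq. 2
        w_fuel - 0.250 * power * inp["Endu"] / 1000,             # eq. 3
        inp["Cad"] * power - dis_w ** 0.66 * inp["V_des"] ** 3,  # eq. 4 * Power
        w_mach - 0.04 * power,                                   # eq. 5
        depth - (draft + inp["FRBD"]),                           # eq. 6
        draft - dis_w / (0.45 * lb),                             # eq. 7
    ]


def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def newton_raphson(f, x0, tol=1e-8, max_iter=50):
    """Newton-Raphson with a forward-difference Jacobian."""
    x = list(x0)
    n = len(x)
    for _ in range(max_iter):
        fx = f(x)
        if max(abs(v) for v in fx) < tol:
            return x
        jac = [[0.0] * n for _ in range(n)]
        for j in range(n):
            h = 1e-6 * max(1.0, abs(x[j]))
            xp = x[:]
            xp[j] += h
            fp = f(xp)
            for i in range(n):
                jac[i][j] = (fp[i] - fx[i]) / h
        dx = solve_linear(jac, [-v for v in fx])
        x = [xi + di for xi, di in zip(x, dx)]
    raise RuntimeError("Newton-Raphson did not converge")


# Assumed starting values in the order of NAMES.
start = [100.0, 60.0, 2.0, 200.0, 8.0, 3.0, 2.0]
solution = dict(zip(NAMES, newton_raphson(residuals, start)))
```

Since six of the seven equations are linear, the iteration settles within a few steps for these inputs. The dictionary `solution` then holds mutually consistent values of DisW, Power and Draft, which no single RELATION could have produced on its own.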
In Table 6.2, the parameters in bold are input and the parameters in italic are determined by the RELATION.

Table 6.2: Inference tree of the simplified TROIKA design model

START OF INFERENCE:
DisW is GOAL and is chained to:
  DisW=f(PL,W_fuel,W_light) ................................ (eq. 1)
  W_light is SUBGOAL of DisW and is chained to:
    W_light=f(Length,Beam,Depth,W_mach) .................... (eq. 2)
    W_mach is SUBGOAL of W_light, DisW and is chained to:
      W_mach=f(Power) ...................................... (eq. 5)
      Power is SUBGOAL of W_mach, W_light, DisW and is chained to:
        cycle-> Cad=f(DisW,V_des,Power) .................... (eq. 4)
      Power inferred
    W_mach inferred
    Depth is SUBGOAL of W_light, DisW and is chained to:
      Depth=f(FRBD,Draft) .................................. (eq. 6)
      Draft is SUBGOAL of Depth, W_light, DisW and is chained to:
        cycle-> Draft=f(DisW,Length,Beam) .................. (eq. 7)
      Draft inferred
    Depth inferred
  W_light inferred
  W_fuel is SUBGOAL of DisW and is chained to:
    W_fuel=f(Power,Endu) ................................... (eq. 3)
  W_fuel inferred
DisW inferred
END OF INFERENCE

Figure 6.3: Semantic network representing the simplified TROIKA model

The top goal parameter(s), i.e. the parameter(s) asked for by the user, are selected by entering a question mark ‘?’ in their value slot, in this example only DisW. The system starts searching for a RELATION by which DisW can be determined. DisW is the goal parameter and is chained to, i.e. determined by, a RELATION (eq. 1). This RELATION contains three additional parameters, of which two can be chained to other RELATIONs. The payload PL cannot be inferred and QUAESTOR prompts the user to give a value for PL. When this is done the other parameters become sub goals of DisW. The first one, W_light, is chained to (eq. 2). QUAESTOR demands input values for Length and Beam since these parameters cannot both be chained to other RELATIONs. Only (eq. 7) can in principle be used to determine either Length or Beam, but it cannot determine both. Therefore, QUAESTOR demands input for both Length and Beam and asks for input for Depth, for which a RELATION is also available.

W_mach is a sub goal of W_light and indirectly of DisW and is chained to (eq. 5). In (eq. 5) Power is PENDING and not chained, and is added to the goal list. Power is a sub goal of W_mach and indirectly of W_light and DisW. Power is chained to (eq. 4). The values of the admiralty coefficient Cad and the design speed V_des are provided by the user as input, so that Power is inferred. Inferred does not necessarily mean computed (DETERMINED), as is clear for Power, since DisW is still PENDING. Inferred means that it must be possible to compute the relevant parameter, provided the values of all the other required parameters are either inferred or input. Subsequently, the inference tree (Table 6.2) states that W_mach is inferred.

The second sub goal of W_light is Depth, which is chained to (eq. 6). QUAESTOR demands a value for FRBD from the user since FRBD could not be chained, i.e. there is no RELATION available for FRBD. Draft is an additional sub goal of Depth through (eq. 6) and is chained to (eq. 7). In (eq. 7) DisW is present as PENDING. However, since DisW was already chained to (eq. 1), it is concluded not to be an additional sub goal. It is now stated that Depth and subsequently W_light are inferred. The second sub goal of DisW, W_fuel, is chained to (eq. 3). In this case, only the endurance Endu is required as input. DisW is now inferred, which means that all sub goals have been chained to RELATIONs and that the system of equations 1-7 can be solved.

Figure 6.4: Result of example calculation

As stated above, the system is able to compute solutions for one or more varying independent parameter(s).
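Such a parameter variation can be sketched in a few lines. The example below varies V_des and re-evaluates (eq. 1) to (eq. 7) in the order of the inference tree. To keep it short, the cycle through DisW is closed by simple successive substitution, which is swapped in here for the Newton-Raphson iterations that QUAESTOR actually uses; it converges for the chosen inputs but is not guaranteed to do so in general. All default input values are assumptions made for illustration, not TROIKA data, and the function name `displacement` is hypothetical.

```python
def displacement(v_des, pl=20.0, length=25.0, beam=4.5, frbd=1.0,
                 endu=24.0, cad=100.0, tol=1e-9, max_iter=500):
    """Return (DisW, Draft) for one design speed.

    All default inputs are assumed values for illustration only.
    The cycle through DisW is closed by successive substitution.
    """
    dis_w = pl  # starting value for the cyclic parameter
    for _ in range(max_iter):
        power = dis_w ** 0.66 * v_des ** 3 / cad         # eq. 4 solved for Power
        w_mach = 0.04 * power                            # eq. 5
        draft = dis_w / (0.45 * length * beam)           # eq. 7
        depth = draft + frbd                             # eq. 6
        w_light = 0.25 * length * beam * depth + w_mach  # eq. 2
        w_fuel = 0.250 * power * endu / 1000             # eq. 3
        new_dis_w = pl + w_fuel + w_light                # eq. 1
        if abs(new_dis_w - dis_w) < tol:
            return new_dis_w, new_dis_w / (0.45 * length * beam)
        dis_w = new_dis_w
    raise RuntimeError("successive substitution did not converge")


# Vary the independent parameter V_des; DisW and Draft follow as dependents.
speeds = [6.0, 8.0, 10.0, 12.0, 14.0]
table = [(v, *displacement(v)) for v in speeds]
for v, dis_w, draft in table:
    print(f"V_des={v:5.1f}  DisW={dis_w:8.2f}  Draft={draft:5.2f}")
```

With these assumed inputs the displacement and draft grow with the design speed, because the heavier machinery and the extra fuel feed back into the displacement.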
In Figure 6.4, the results of a parameter variation are presented, in this case for the design speed V_des against the displacement DisW at a fixed endurance Endu.

With the same knowledge base, other problems can be solved as well, e.g. the calculation of the value of the admiralty coefficient, which served as input for the example described above. Input data for this calculation is derived from an existing design. The values of DisW, Power and V_des of a parent hull are provided and Cad is requested. In the following design session this calculated value of Cad is fixed and used as input, assuming that the parent hull characterises the propulsive performance of the new hull. In this way, a design exercise can be performed in a step-by-step manner.

In the above example, only one RELATION is available for each parameter to be calculated. In real-world cases, often more than one RELATION is available, of which the most suitable one needs to be selected. QUAESTOR will choose which RELATION to use, depending on validity and/or on the heuristic rules embedded in the Modeller (see section 8.2), whether or not in dialogue with the designer.

The example discussed in this section is small and simple and could easily be solved in a more conventional manner. But larger problems based on exactly the same principles can consist of well over 200 RELATIONs. For the TROIKA application presented in section 10.2, about 80 RELATIONs were used in one template, which also included a number of satellite applications for e.g. powering and stability.

6.6. The Concept Variation Model

In chapter 4, conceptual ship design was described as a reasoning process. The current practice of parametric ship design in the form of the Concept Exploration Model or CEM was discussed. Subsequently, a knowledge-based approach towards numerical model assembling was introduced.
In the previous sections QUAESTOR was introduced and a simplified design model was assembled as an example. It is now time to formally present the knowledge-based alternative to the CEM together with some important development aspects.

A QUAESTOR knowledge base is not a numerical model in itself; it is the raw material from which dedicated numerical models are assembled. The knowledge base contains a set of compatible numerical model fragments (RELATIONs and CONSTRAINTs). These model fragments may contain bounded knowledge (the CONTROL) of how they can be fitted together. Once a knowledge base is loaded in QUAESTOR, models can be assembled on the basis of the problem at hand, using the contents of the knowledge base. Within naval architecture, this technique is suitable for calculations ranging from models containing only a few RELATIONs up to complex models unifying a large number of design aspects. The domain in which most application experience has been obtained with this technique is the conceptual design of naval ships, which has the following characteristics:

• High complexity of the vessel; design constraints and requirements may evolve during the design process; life cycle aspects are considered.
• In early phases of design, a high degree of uncertainty in concept parameters: calculate if possible, approximate if necessary.
• Dependency may exist between concept parameters, i.e. it is dangerous to fix values of concept parameters without knowledge of the effects of such choices on other concept parameters.
• Design problems are open ended; the aspects driving the design vary from case to case.

The above aspects emphasise the advantage of assembling multiple models using the same set of model fragments over the traditional, hard-coded Concept Exploration Model (section 4.2). The approach of using a QUAESTOR knowledge base in conceptual design is denominated Concept Variation Model or CVM [Keizer, 1994].
The term Variation in the CVM refers to the varying composition of the assembled design models, i.e. to the variety of design problems dealt with and to the variety of ship concepts which are addressed in the assembled models. Although the knowledge base is a set of (independent) model fragments without a particular architecture, its composition and control facilities allow the assembling of models of which the architecture and application can be described in a generic manner. The CVM is both a (generic) design synthesis model in the sense of the CEM and an analysis model, configured around parametric concept descriptions, representing a (limited) number of advanced point designs or ‘case-base’. The CVM predicts effects of adaptations of the selected initial concept, i.e. point design, on its properties (e.g. powering and motion performance). The designer is free in selecting the relations affecting the concept and the premises and properties to be studied, which provides full freedom in the approach towards the design problem. As in manufacturing processes, tools in design are applied in varying settings, depending on the context in which the process has to be performed. The process is outlined in Figure 6.5 using the Structured Analysis and Design Technique or SADT [Schoman, 1977].

The CVM knowledge base contains two categories of model fragments. The first category comprises the actual design knowledge, e.g. in the form of weight, volume, area and energy aspect models of specific ship types and design data in some systematic form. These RELATIONs determine the values of concept parameters during the concept variation process. The second category is used to validate the concept, e.g. with respect to stability, motions, endurance and cost. However, RELATIONs are free to change role, e.g. a stability model can be used to determine hull form coefficients on the basis of e.g. a minimum metacentric height and the required range of positive stability.

[Figure 6.5: Concept Variation Process. SADT diagram: the Concept Variation Control receives Decisions, Payload, Operational values and Secondary input, and delivers the Concept description and a Measure of performance; it exchanges CVM Parameter values with the Aspect model(s) of the 1st category (weight, volume, area, energy) and of the 2nd category (stability, motions, signatures, cost).]

In Figure 6.5 the design process is described as an analysis ⇔ synthesis cycle (see also section 4.1) using the desired payload and operational values as primary input. The synthesis (dimensioning) is performed by using the aspect models of the first category, the analysis (performance) by means of the models in the second category. The output of the applied aspect models is fed back into the Concept Variation Control, in which it is presented to the user together with the overall process status. The model feedback, user decisions on the basis of the presented results and user response to inquiries for secondary input determine the role (analysis or synthesis) of aspect models in the CVM knowledge base and control their ‘firing’. The primary output of the concept variation process consists of a concept description, i.e. a preliminary design, and the performance and cost of the concept. In addition, the aspect models of both categories in the CVM may produce extensive secondary output for other purposes.

The CVM is a Concept Variation Model, which means that it predicts the effects of concept adaptations on e.g. performance and cost, and vice versa. The process output is basically a (validated) concept and a measure of performance, if any. Therefore, the concept descriptions of the worked-out point designs are available in the CVM as design data. The concept variation process starts with the selection of one of these point designs as the initial concept (dimensions, geometry and components) to be varied.
To facilitate the process, concepts are controlled with the least possible number of input variables. These input variables represent the independent parameters in the concept description. The concept description of the point designs in the CVM consists of a limited number of ‘fixed’ or independent Parameter-Value Combinations (PVC’s) and a larger number of ‘free’ or dependent variables which are determined by RELATIONs. QUAESTOR implicitly allows users to deviate from this by providing values for parameters instead of using the RELATIONs in the knowledge base, since no distinction is made between dependent and independent variables. This role is determined by the problem definition, user input and relevant control knowledge, either in the knowledge base or provided by the user.

The properties of the point designs are preferably not described by means of their actual values but through RELATIONs in the first category, describing the design knowledge. These RELATIONs may contain correction coefficients on general aspect models applied for e.g. weight, power, volume, etc. The primary concept variables (such as manning requirements or payload) are given as values. Any concept variable which can be derived by a RELATION from this input need not be provided, however. This means that the design is controlled and varied by changing a limited number of concept variables. The other concept variables are simply calculated by using the appropriate RELATIONs and correction coefficients.

The knowledge-based model assembling process with the CVM is conceptually very close to the current design practice as outlined in section 4.1. The design process remains iterative and it can be approached in a sequential step-by-step manner. The difference, however, is that one can study effects of changes to a concept in a much more efficient way since many interacting aspects can be concurrently taken into account in the assembled model.
Moreover, it allows modelling without a fixed composition of model input and output. The SUBCEM and TROIKA models discussed in chapter 10 are regarded as CVMs.

Using the CVM, an array of variations is imaginable, such as variations in payload configuration, design parameters and applied technology. Those variations will result in adjustments of at least some of the concept parameters. Take for example the replacement of a conventional Diesel propulsion plant by a configuration occupying less volume, e.g. a gas turbine plant, even if this system is more expensive. Depending on the influence of the volume of the propulsion system on the overall concept, this may lead to a smaller ship and to a reduction of life cycle cost. Combined effects can be e.g. lower resistance and signatures. If the propulsion system is less complex it may require less maintenance, reducing the number of personnel. A reduction of personnel implies a smaller demand for accommodation and support functions, so again a reduction of volume and weight. If these effects are properly modelled in the RELATIONs of the CVM it becomes relatively straightforward to study the effects on the overall concept of such variations in requirements, applied technology and even of new or envisaged future technology. The latter can be of help in the process of establishing the focus of research and development efforts [Keizer, 1996].

6.7. CVM Development Aspects

Developing a CVM is a knowledge acquisition process rather than a software development activity. Imagine that a Concept Variation Model is needed which can generate performance and cost data on the basis of a parametric concept description. As a first step, aspect models are gathered to produce (parts of) this desired output. Some input to these aspect models may not be available. If so, other aspect models need to be selected or developed for this pending input.
This proceeds recursively until a set of models is gathered which can be assembled into numerical models producing values for the cost and performance top goal parameters. In a way, this approach reflects both the knowledge acquisition process involved in building the CVM and the model assembling process of QUAESTOR using the CVM knowledge base. In selecting aspect models, accuracy is important since more accurate models require more detailed concept descriptions. If the introduction of aspect models is not carefully considered and monitored, this may lead to a combinatorial explosion of the CVM knowledge base and of the number of concept variables involved.

In our case of a cost and performance centred CVM, the first step in building the CVM knowledge base is to select or develop a performance evaluation model and a life cycle cost prediction model. These models most probably require a concept description and life cycle scenario(s) as input. Consequently, describing the concept and its life cycle scenarios is the subsequent step of the knowledge acquisition process. The scenarios and concept description will contain and require user defined aspects and aspects which are the output of additional models dealing with e.g. stability, motions, etc. The knowledge acquisition process is basically backward reasoning from goals or model output to suitable model fragments or aspect models.

A CVM should be well-balanced, i.e. the relative accuracy of the applied aspect models should not differ too much. The weakest models finally determine the overall accuracy of the CVM knowledge base. If available or possible, the limits of model fragments should be established and included in the knowledge base. In practice, the relation between model accuracy and model input is often uncertain due to sparse validation data.
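The backward reasoning from goals to aspect models described above can be illustrated with a minimal Python sketch. It is not QUAESTOR's implementation: the function name `gather_models`, the dictionary layout (one aspect model per output parameter) and the example parameter names are assumptions made purely for illustration.

```python
def gather_models(goals, models, user_inputs):
    """Collect aspect models needed to compute the goals (illustrative only).

    models      -- hypothetical dict: output parameter -> list of input parameters
    user_inputs -- parameters supplied as values in the concept description
    Returns (selected, missing): outputs whose models are needed, in an order
    in which they can be evaluated, and parameters for which no aspect model
    is available yet (the 'pending input' that still needs a model).
    """
    selected, missing, visited = [], set(), set()

    def visit(param):
        if param in visited or param in user_inputs:
            return
        visited.add(param)
        if param not in models:
            missing.add(param)      # an aspect model must still be acquired
            return
        for needed in models[param]:
            visit(needed)           # recurse into the inputs of this model
        selected.append(param)

    for goal in goals:
        visit(goal)
    return selected, missing


# Hypothetical cost-centred example
models = {
    "LifeCycleCost": ["Displacement", "Crew", "Power"],
    "Power": ["Speed", "Displacement"],
    "Displacement": ["Payload", "Crew"],
}
selected, missing = gather_models(["LifeCycleCost"], models,
                                  {"Payload", "Crew", "Speed"})
```

An empty `missing` set corresponds to the situation in which the gathered models can be assembled into a complete numerical model; a non-empty set lists the inputs for which further aspect models have to be selected or developed.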
Aspect models should only be introduced into a CVM knowledge base in case:
• it models an aspect still missing in the current CVM
• it reduces the number of non-computable parameters in the CVM, if possible
• it simplifies the completion of the CVM and introduces the least possible additional concept parameters
• it is of similar accuracy as the other aspect models in the CVM, or
• it improves the global accuracy of the CVM

7. PARAMETRIC MODELS: THE PARTS

Much wisdom is comprised in small words.
Sophocles

In this chapter, the building blocks of numerical models are presented. Numerical models are composed of a number of smaller objects or fragments. In this paradigm the modelling process is an assembling activity requiring that the model fragments connect together. For that purpose, the hierarchical TELITAB data model was developed. In this chapter some characteristics of the application domain are discussed, followed by a discussion of the structure and properties of the model fragments. Subsequently, the structure of the knowledge base and the keys and contents of its slots are discussed in detail. Then, the control knowledge which captures expectations with regard to the practical application of the model fragments is elucidated. Throughout this chapter, a number of simple examples are added to the explanations. Finally, syntactic and semantic aspects of the instruction set are discussed.

7.1. Domain Description and Data Model

The domain of QUAESTOR is the design and analysis of any system that can be described in numerical quantities and relations between these quantities, i.e. both the system and its properties should be expressed in that form. By generalising these aspects and experience, the application of knowledge-based parametric model assembling is confined to domains where a network knowledge representation and control improves efficiency and quality of the design and analysis process.
This network representation and control makes it possible to freely combine a varying collection of knowledge aspects in different contexts. This is only useful if a variety of problems need to be solved within the same set of knowledge, i.e. by using the same knowledge base in varying compositions and entries.

It is stated in the foregoing that numerical relationships or RELATIONs can be used to define the connection between the quantities or parameters describing systems and their properties. Valid RELATIONs can be used to compute values of parameters on the basis of values of other parameters. The validity is expressed by zero or more CONSTRAINTs connected to the RELATIONs. Together, these RELATIONs, parameters and CONSTRAINTs form a network of frames (Figure 3.3).

The task of the Modeller is to check validity, propose RELATIONs, maintain the template, ask for input (values of parameters), compute (solve systems of equations) and present results. In this process, the contents of the knowledge base and the values in the workbase are used. The workbase contains the user defined values and those which have been inferred.

A RELATION is assumed to be a continuous function. Non-continuous functions can be expressed in RELATION/CONSTRAINT combinations or can be defined by means of special functions which should not be used inversely. The three types of frames (RELATIONs, CONSTRAINTs and parameters) are considered to be independent except for the network relations between parameters and expressions, e.g. Part_Of <> Has_Part. In addition to the actual name of the parameter or the expression, the frames contain local control knowledge. In the control knowledge, expectations are captured regarding the application of the frames. In each type of frame, the knowledge is stored in dedicated slots.

I. Relation
• RELATION: the expression in algebraic notation, e.g.
    RTOT = SPLINT(1, Speed, “Vs”, “Power”)
  which is a spline curve interpolation on Speed (as X value) in a TELITAB set containing the columns labelled “Vs”, representing X, and “Power”, representing the interpolated Y value. The first argument in the SPLINT() function, ‘1’, refers to the label of the interpolation data set (see description of DATA slot).
• REFERENCE: free format text containing a description of expression, source, backgrounds, warnings, etc., e.g.
    Power curve of S/L Motivator, full load condition
• CONTROL: the control knowledge, see next section
• DATA: numerical data which is used in the expression, e.g. a speed/power table in TELITAB format:
    \SPLINT1\
    0 2
           “VS”     “Power”
    “1”    14.00     7956
    “2”    16.00    11338
    “3”    18.00    15553
    “4”    20.00    20873
    “5”    22.00    27753
    “6”    24.00    36714\
  The label \SPLINT1\ refers to the SPLINT-function (SPLine INTerpolation) and to the first argument in the function. In this way the interpreter can find the data at runtime. The data set is delimited with backslashes ‘\’.

The inclusion of text (REFERENCE) and numerical data (DATA) in the frames is a simple and powerful concept of knowledge management. In the REFERENCE slot, background information can be included which may be important for a user to know or to have access to during a dialogue. At decision points during the dialogue this text is either presented or immediately available. In a sense, this sequence of texts presented to the user during a dialogue is an implicit user manual or at least a description of the model assembling process and of the resulting model. The DATA slot may contain tables or coefficients which are required by one of the available special functions (section 7.3). The contents of the DATA slot are in simple ASCII format and are assumed to be of limited size. In case large databases need to be used, a reference to an external file is preferred.

II. Constraint
• CONSTRAINT: the expression in algebraic notation, e.g.
    Speed=>14 AND Speed<=24
  which is the allowed range in the above spline curve interpolation.
• REFERENCE: free format text, e.g.
    Performance curve range of S/L Motivator, full load
• CONTROL: the control knowledge, see next section
• DATA: numerical data used in the expression, in this case none.

III. Parameter
• Parameter: the name of the parameter, e.g. Speed
• REFERENCE: free format text, e.g. Ship speed
• CONTROL: the control knowledge, see next section
• DATA: numerical data, e.g. a column in an internal database. In addition to numerical data stored in the expression frames, TELITAB databases can be stored in the knowledge base in a distributed manner. Each database has a unique number. The databases are stored in columns distributed over the relevant parameters. These databases can be accessed by expressions in the knowledge base and can be imported and exported. Their main purpose is to prevent conflicts by limiting redundancy in numerical data if more than one expression needs to access the same data.
• VALUE: input value or fixed knowledge base value. During a dialogue a value or range of values can be provided as input which is transferred to the workbase and to the Modeller.
• DIMENSION: the dimension of the parameter, e.g. [Knots] for Speed. The major purpose is for reporting.

7.2. Control Knowledge

In addition to knowledge of numerical and functional nature, expectations with regard to its application are stored in the knowledge base. This means that upon including a new expression or parameter in the knowledge base, the knowledge engineer is asked to anticipate its future use. It appears that experts have strategic knowledge about the practical application of expressions and parameters. In QUAESTOR, this control knowledge is included as attributes in the CONTROL slot. Each frame type has specific control attributes, covering e.g. the following aspects:
• Initial value, i.e. order of magnitude
• Input of value expected or not
• Class to which a RELATION or parameter belongs
• Format of output to screen and to other devices
• RELATION to be used as function or as equation

The control attributes are a connotation to the RELATIONs, CONSTRAINTs and parameters which affects the semantics of the numerical knowledge in the knowledge base. During the development of the Modeller it became apparent that it was necessary and possible to capture local strategic knowledge about the frames. These attributes and the required inferences were extracted by reflecting on my own reasoning strategy in numerical model assembling. By using this local, application-independent knowledge the Modeller shows to some extent ‘human behaviour’. In a way, the Modeller has become an incomplete image of my own modelling capabilities.

For users of the system it is important to understand the purpose of the control attributes since they are an important part of the system’s syntax for coding knowledge. The knowledge engineer needs to consider how the acquired numerical knowledge can be used; by selecting the proper attribute settings, the system can be prevented from making inappropriate inferences. The control attributes affect the way knowledge is used by the Modeller and indirectly determine the questions it puts forward to the user during a dialogue. If only default settings are used (for which preferences can be defined by the user), the resulting knowledge base is likely to be cumbersome, although it may produce the required answers. In the sequel, the control attributes are explained in detail.

I. Control Attributes of RELATION

The first attribute concerns the status of the RELATION, which can either be:
    HARD or SOFT
This attribute determines whether or not the user will be requested to accept its selection.
If the RELATION is HARD the Modeller assumes that it represents a relationship which is always applicable and can be used without user approval. Such RELATIONs will be introduced into the template without further notice if all their parameters (except the parameter which is chained to this RELATION) are either DETERMINED or not user input (SYS in control). If not, or if the RELATION is SOFT, a user decision is requested. In general any RELATION representing empirical knowledge should be made SOFT, in particular if more than one method is expected to be available for the same purpose. If empirical RELATIONs are made HARD, the selection should be controlled by connecting CONSTRAINT(s), e.g. Method=“FirstMethod”.

Further:
    OW or TW
which means respectively OneWay or TwoWay, as addressed in section 5.1. In most cases (but not all), empirical relationships should be given the OW attribute. This prevents its use for determining any other parameter than the one forming its left clause. A sensible choice between OW (used as default) and TW directly affects the number and realism of the RELATIONs proposed and the number of values asked during a dialogue. TW should only be used if situations are envisaged in which it is appropriate to determine parameters in the right clause by means of this RELATION.

The third attribute is either AND (used as default) or OR. With AND, the system only allows the use of the RELATION if all connected CONSTRAINTs, if any, are satisfied. In case the OR attribute is chosen, only one of the CONSTRAINTs needs to be satisfied to authorise the use of the RELATION. The others, if any, can then either be FALSE or PENDING. The OR attribute of the RELATION is therefore equivalent to:
    IF (CONSTRAINT-1 OR ... OR CONSTRAINT-N are TRUE) THEN
        Authorise RELATION_i for template introduction
    END IF
It is important to notice that a RELATION being authorised for template introduction does not imply that the RELATION is actually introduced.

The next attribute has either the value EMPIRICAL (which is default) or PHYSICAL. This attribute is of some importance to the Modeller and speaks for itself. Basically, if a template contains an EMPIRICAL RELATION with the left clause LPAR, the Modeller will not admit any other EMPIRICAL RELATION into the template which also has LPAR as left clause. The following example illustrates the purpose of this attribute. The non-dimensional thrust coefficient KT expressed in propeller thrust THR, water density Rho, propeller revs n and diameter D:
    KT = THR/({Rho}*n^2*D^4)       HARD, TW, PHYSICAL     (eq. 1)
It is considered that all parameters except Rho can be determined by this RELATION. The parameter KT can be calculated by e.g. the Wageningen B-series propeller polynomial:
    KT=POL(1,J,PDRA,AeAo,Z)        SOFT, TW, EMPIRICAL    (eq. 2)
or KA-series propeller polynomial:
    KT=POL(1,J,PDRA)               SOFT, TW, EMPIRICAL    (eq. 3)
in which J is the non-dimensional advance ratio, PDRA the pitch diameter ratio, AeAo the blade area ratio and Z the number of blades. It is envisaged that through (eq. 2) and (eq. 3) the parameters KT, J or PDRA may be calculated. (Eq. 1) may be used to compute either KT, THR, n or D; it cannot be used to determine {Rho}. We need to prevent, however, that (eq. 2) and (eq. 3) are both used in one template, e.g. (eq. 2) for KT and (eq. 3) for PDRA. Although the user can prevent this by rejecting one of these two, it is a possible source of problems. It is true that due to the SOFT attribute, confirmation is requested from the user before the RELATION is admitted into the template. This requires, however, that the user is highly acquainted with the domain and knows exactly which RELATION to admit and which not.
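The AND/OR authorisation and the EMPIRICAL rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Relation` class, the function names and the use of `None` for a PENDING constraint are hypothetical, and the treatment of PENDING CONSTRAINTs under AND is simplified to "not authorised".

```python
from dataclasses import dataclass


@dataclass
class Relation:
    name: str
    left: str                  # parameter forming the left clause
    kind: str = "EMPIRICAL"    # or "PHYSICAL"
    mode: str = "AND"          # or "OR"


def authorised(relation, constraint_values):
    """constraint_values: list of True, False or None (None stands for PENDING)."""
    if not constraint_values:
        return True                                   # no CONSTRAINTs connected
    if relation.mode == "AND":
        return all(v is True for v in constraint_values)
    return any(v is True for v in constraint_values)  # OR: one TRUE suffices


def admissible(relation, template):
    """An EMPIRICAL RELATION is refused if the template already holds an
    EMPIRICAL RELATION with the same left clause."""
    if relation.kind != "EMPIRICAL":
        return True
    return not any(r.kind == "EMPIRICAL" and r.left == relation.left
                   for r in template)


eq1 = Relation("eq. 1", "KT", kind="PHYSICAL")
eq2 = Relation("eq. 2", "KT")
eq3 = Relation("eq. 3", "KT")
```

With `eq1` and `eq2` in the template, `admissible(eq3, [eq1, eq2])` is false, which mirrors the intent that only one of the two propeller polynomials ends up in a template. Being authorised or admissible does not mean a RELATION is actually introduced; that decision remains with the Modeller and, for SOFT RELATIONs, the user.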
The aim should be that the model assembling can be performed with the least possible interference of the user. The contents of the knowledge base should be such that the task of the user is limited to providing values for the presented parameters, restarting the Modeller by ACCEPT and waiting for the next request for values and/or decision. By making (eq. 2) and (eq. 3) EMPIRICAL the Modeller will only allow one of these two into the template, which means that the only remaining decision is which of the two RELATIONs to use. If (eq. 2) is accepted the Modeller will not come up with (eq. 3) for any purpose whatsoever.

In the development phase of a knowledge base it can be useful to make particular RELATIONs inaccessible to the Modeller without the need to actually remove them from the knowledge base. For this purpose the attribute ON or OFF has been introduced.

The last attribute is CLASS, whose purpose is to partition RELATIONs (and parameters) into different groups. Upon request, these groups can be accessed by the browser or used by the Modeller to assemble models within a particular subdomain. Only one CLASS attribute can be assigned to the RELATION frame.

II. Control Attributes of CONSTRAINT

The evaluation whether a CONSTRAINT is TRUE or FALSE is based on the comparison of expression clauses. Because of both the limited numerical accuracy of computers and the wish to express when a CONSTRAINT is ‘about’ TRUE, an accuracy EPS is required if one or more equal signs ‘=’ are present in the CONSTRAINT. EPS 0.01 indicates the absolute difference that is allowed between the expression clauses. For the CONSTRAINT a=2*b it means:
    -0.01 <= 2*b-a <= 0.01
If EPS 0 (default value) is maintained, an equality may appear never to be TRUE due to round off inaccuracy, e.g. caused by conversion from string to number and vice versa.
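The effect of EPS can be illustrated with a small Python sketch. The function name `constraint_equal` is hypothetical, and the round-off in the example stems from binary floating point arithmetic rather than from string conversion, but the consequence for an equality CONSTRAINT is the same.

```python
def constraint_equal(lhs, rhs, eps=0.0):
    """Equality CONSTRAINT lhs = rhs, accepted when |lhs - rhs| <= eps."""
    return abs(lhs - rhs) <= eps


a = 0.1 + 0.2        # carries a round-off error in binary floating point
two_b = 0.3          # the value '2*b' is meant to equal

exact = constraint_equal(a, two_b)              # EPS 0 (default)
tolerant = constraint_equal(a, two_b, eps=0.01) # EPS 0.01
```

Here `exact` is false although the two clauses are mathematically equal, whereas `tolerant` is true; a small non-zero EPS therefore makes the equality usable as a condition.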
The second attribute concerns the status of the CONSTRAINT, which can also be:
    HARD or SOFT
If the CONSTRAINT is HARD the system will compute its Boolean value prior to admitting any of the RELATIONs to which it is connected into the template. In general, the RELATION(s) will be authorised if the CONSTRAINT(s) are TRUE. However, if the CONSTRAINT is still PENDING (due to the PENDING parameter(s) in it) these RELATIONs are only admitted if the said unknown parameter(s) are also output of this template. Occasionally, PENDING CONSTRAINT parameters are included in the goal list. If a solution of such a template is found, the CONSTRAINTs are checked again. If TRUE, the solution is validated and transferred to the workbase. If FALSE, the RELATIONs to which these CONSTRAINT(s) are connected are pruned, i.e. removed from the template, forcing the Modeller to search for alternatives. If a CONSTRAINT is SOFT and evaluated FALSE the Modeller will ask the user whether the connected RELATIONs should be removed from the template, be allowed to remain or can be admitted into the template. In practice most CONSTRAINTs will be HARD, which is the default attribute value.

The third control attribute is either TR (TRUE), which is default, or FA (FALSE). If CONSTRAINT(s) are connected to a RELATION, the Modeller will only authorise the RELATION if all CONSTRAINTs are TRUE (in combination with the AND attribute). The attribute TR means that the TRUE produced by the interpreter remains TRUE. FA results in a FALSE being replaced by TRUE (RELATION authorised), whereas a TRUE is replaced by FALSE (not authorised). Since the NOT() operator is not available in the instruction set of QUAESTOR, this attribute can be used in case the formulation of the CONSTRAINT is very complex whereas its negation NOT() can be written as a simpler expression. In practice, the FA attribute is hardly ever used.

The OFF attribute can be used to make the CONSTRAINT not available to the Modeller. The default value is ON.

The last attribute has either the value MESSAGE or NO MESSAGE. MESSAGE means that the system will forward a warning in case the CONSTRAINT becomes FALSE. The main purpose of the MESSAGE attribute is to check the problem solving capabilities of a knowledge base during the development phase.

III. Control Attributes of Parameter

For any computable parameter in the knowledge base an initial value is required:
    INI 1 (default value)
The initial value is used as starting value by the Newton Raphson solver and is used to compute a criterion of convergence in the event of iterative calculations.

The next attribute represents the expectation whether in a dialogue:
• the value can either be provided as input by the user or may be determined by the system: USR/USL
• the parameter is without a direct meaning to the user (intermediate result or coefficient) so its value will never be available as input for real world problems. Its value is always determined by the system: SYS/SYL
• the parameter value is always a user input value and can never be determined by the system, so no INItial value is required in the CONTROL: VR (Value Requested)

The distinction between SYS and SYL, and likewise between USR and USL, indicates whether the parameter can (SYS, USR) or cannot (SYL, USL) be determined by a TwoWay RELATION. In fact, USL and SYL make RELATIONs OneWay for these parameters, overruling the possible TwoWay attribute of the RELATION. The opposite, however, is not valid: OneWay RELATIONs including SYS and USR parameters remain OneWay. Physical properties such as the kinematic viscosity of sea water NU are typical USL parameters. NU is either provided as input or e.g. determined by a RELATION in the following form:
    NU=f(temp)
The purpose of this attribute is to move the responsibility for avoiding undesired chaining of parameters to TwoWay RELATIONs from the RELATION to the parameter.
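How the OW/TW attribute of a RELATION, the braces around a parameter and the USL/SYL/VR attributes of parameters together limit what a RELATION may determine can be sketched as follows. This is a hedged illustration only: the function `determinable`, its argument layout and the default of USR for unlisted parameters are assumptions, not part of QUAESTOR.

```python
def determinable(left, right, direction, param_attr, braced=()):
    """Parameters a RELATION may be used to determine (illustrative sketch).

    left       -- parameter forming the left clause
    right      -- parameters in the right clause
    direction  -- "OW" (OneWay) or "TW" (TwoWay)
    param_attr -- hypothetical dict: parameter -> "USR", "USL", "SYS", "SYL" or "VR"
    braced     -- right-clause parameters written between braces, e.g. {Rho}
    """
    result = []
    if param_attr.get(left, "USR") != "VR":      # VR is always user input
        result.append(left)
    if direction == "TW":
        for p in right:
            if p in braced:
                continue                         # excluded inside the RELATION
            if param_attr.get(p, "USR") in ("USR", "SYS"):
                result.append(p)                 # USL, SYL and VR are skipped
    return result


# (eq. 1): KT = THR/({Rho}*n^2*D^4), TwoWay, Rho between braces
eq1_targets = determinable("KT", ["THR", "Rho", "n", "D"], "TW", {}, braced={"Rho"})

# The same RELATION without braces, but with Rho declared USL
usl_targets = determinable("KT", ["THR", "Rho", "n", "D"], "TW", {"Rho": "USL"})
```

Both calls yield KT, THR, n and D, which shows the design choice discussed above: declaring Rho as USL once on the parameter has the same effect as protecting it in every RELATION in which it occurs.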
In the RELATION, undesired proposing and chaining can be avoided by putting the parameter between so-called anti-chaining braces (see e.g. DisW in eq. 4 of the example presented in section 6.5). Proposing, e.g. (eq. 3) of this section, for determining Rho makes very little sense, and that applies in fact to all RELATIONs with Rho in their right clause. By making Rho a USL parameter, no such RELATIONs will be proposed for determining Rho. In a way, the USL/SYL attributes are a OneWay (OW) attribute for parameters.

The allowed range of parameter values needs to be defined as well. The solver may use this information to adjust the numerical iteration conditions in order to find a solution within the prescribed range. Warnings are issued to the user if no convergence is obtained within this range.
• NN: Non Negative, value => 0, default
• N : Negative, value < 0
• NP: Non Positive, value <= 0
• P : Positive, value > 0
• NZ: Non Zero, value <> 0
• U : Unrestricted

Although stored in the DATA and not in the CONTROL slot, the @MINVAL and @MAXVAL attributes form part of the control knowledge and can be used to further limit the value range of parameters. In case these limits, if set, are exceeded, a warning is issued by the Modeller. No further actions are taken, however.

Parameters and their values are used in calculations. Therefore, the user will require output of values in some format to screen, file and/or print, or to the screen only. If the values are only intermediate results and not considered relevant in a report, SCR is selected, else OUT is used, which is the default attribute value. The format of the numbers and the column width in the workbase (and report) are directly related to the parameter and are included in the CONTROL slot as well:
• COL: Column width in workbase and printed output (default COL 9)
• FF : Fixed Format (default FF 2, two decimal places), e.g. 4.21 for a value of 4.21365 and FF 2
• SF : Scientific Format, e.g. 2.36E-02 for a value of 0.02364 and SF 2
• CF : Currency Format, e.g. $3421.23 for a value of 3421.234 and CF 2

The last attribute of importance is the type of value assigned to the parameter:
• VALUE: the parameter has a numerical value, either a single (LIST) value or a multiple (TABLE) value.
• STRING: the parameter has a single string value between quotation marks, e.g. “FirstMethod”. String values can only be single and user defined. Therefore, STRING parameters are always VR.
• OBJECT: the parameter has a TELITAB OBJECT as ‘value’ (see section 5.3). In practice this parameter provides access to a named OBJECT, e.g. the parameter “Rudder” in Table 5.8 provides access to the rudder descriptions. OBJECT parameters cannot be given a value, only the parameters to which they refer, e.g. for parameter “thickn” in “Rudder.thickn”.

The last attribute is CLASS, whose purpose is to partition the parameters (and RELATIONs) in a knowledge base into different groups. These groups can be accessed by the browser in separate lists. Another purpose of CLASS is to tag parameters for use in special functions, to compute e.g. the sum of the values of all parameters in that CLASS. Some of these functions are described in the following section. Only one CLASS attribute can be assigned to the parameter frame.

7.3. Syntactic Elements and Aspects

The syntactic elements available for the definition of expressions comprise, amongst others, the standard arithmetical operators for addition, subtraction, multiplication, division, exponentiation and parentheses for expression prioritising. The comma is used as all purpose delimiter. The following relational and logical operators are available for the definition of expressions.
RELATIONs:
    =     Equal sign
CONSTRAINTs:
    =     Equal sign
    <>    Inequality
    <     Less than
    >     Greater than
    <=    Less than or equal to
    =>    Greater than or equal to
    AND:  Performs a logical AND on the results returned by two relational operators. Returns TRUE if both are TRUE, else FALSE.
    OR:   The inclusive OR returns TRUE if one or both of its arguments are TRUE, and returns FALSE only if both arguments are FALSE.
    XOR:  The eXclusive OR returns TRUE if the values tested are different, and returns FALSE only if they are the same.
    EQV:  The Equivalence operator is the opposite of XOR. It returns TRUE if the values tested are the same, and returns FALSE only if they are not.
    IMP:  The Implication operator returns FALSE only if the first operand is TRUE and the second is FALSE.

The conventional application of the above CONSTRAINT operators in e.g. FORTRAN code requires, or rather assumes, that a value is available for all variables in the expression. The result of the evaluation is only a Boolean value, either TRUE (T) or FALSE (F). QUAESTOR, however, requires and computes an additional result which is either DETERMINED (D) or PENDING (P) and is used by the Modeller to decide on future reasoning steps. The latter depends on the availability of the values of the parameters applied in the expression. Table 7.1 presents the result of the evaluation of:
    A {AND, OR, XOR, EQV, IMP} B     (eq. 1)
The expressions A and B include one relational operator each. The parameters in the expressions have values that are either DETERMINED or PENDING. DETERMINED values are either computed or input; PENDING parameters have the INItial value from the CONTROL or a result from a previous case because the current case value still has to be computed. In Table 7.1, A=T&P and B=F&D mean respectively that expression A contains at least one PENDING (P) parameter and that expression B contains only DETERMINED (D) parameters. The expressions A and B are then evaluated PENDING TRUE and DETERMINED FALSE respectively.
The evaluation of either a D or P of the logical expression (eq. 1) depends on whether this expression can still obtain the opposite Boolean value if the sub-expressions A and/or B obtain the opposite Boolean value upon becoming DETERMINED.

Table 7.1: Results of QUAESTOR CONSTRAINT evaluation

    A        B        AND   OR    XOR   EQV   IMP
    A=F&D,   B=F&D    F,D   F,D   F,D   T,D   T,D
    A=T&D,   B=F&D    F,D   T,D   T,D   F,D   F,D
    A=F&D,   B=T&D    F,D   T,D   T,D   F,D   T,D
    A=T&D,   B=T&D    T,D   T,D   F,D   T,D   T,D
    A=F&P,   B=F&D    F,D   F,P   F,P   T,P   T,P
    A=T&P,   B=F&D    F,D   T,P   T,P   F,P   F,P
    A=F&P,   B=T&D    F,P   T,D   T,P   F,P   T,D
    A=T&P,   B=T&D    T,P   T,D   F,P   T,P   T,D
    A=F&D,   B=F&P    F,D   F,P   F,P   T,P   T,D
    A=T&D,   B=F&P    F,P   T,D   T,P   F,P   F,P
    A=F&D,   B=T&P    F,P   T,P   T,P   F,P   T,D
    A=T&D,   B=T&P    T,P   T,D   F,P   T,P   T,P
    A=F&P,   B=F&P    F,P   F,P   F,P   T,P   T,P
    A=T&P,   B=F&P    F,P   T,P   T,P   F,P   F,P
    A=F&P,   B=T&P    F,P   T,P   T,P   F,P   T,P
    A=T&P,   B=T&P    T,P   T,P   F,P   T,P   T,P

For defining expressions a number of standard and non-standard numerical functions are defined. The standard functions comprise some standard transcendental functions, e.g. EXP(), LN(), SIN(), ASIN(), SINH().
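Returning to the D/P evaluation of Table 7.1: the stated rule (the result is PENDING only if it could still flip once the PENDING sub-expressions become DETERMINED) can be sketched in Python by enumerating the values a PENDING operand might still take. The function name and data layout are assumptions for illustration, not QUAESTOR's interpreter. Applied literally, the rule gives F,D for A=F&D, B=T&P under AND, whereas Table 7.1 as transcribed lists F,P for that entry; the sketch follows the stated rule.

```python
from itertools import product

# Boolean meaning of the five logical operators
OPS = {
    "AND": lambda a, b: a and b,
    "OR":  lambda a, b: a or b,
    "XOR": lambda a, b: a != b,
    "EQV": lambda a, b: a == b,
    "IMP": lambda a, b: (not a) or b,
}


def evaluate(op, a, b):
    """a and b are (value, status) pairs, status being 'D' or 'P'.

    Returns (value, status) of 'a op b'. The status is 'D' when no
    combination of future values of the PENDING operands can change
    the Boolean result, else 'P'.
    """
    func = OPS[op]
    (a_val, a_status), (b_val, b_status) = a, b
    current = func(a_val, b_val)
    a_options = [a_val] if a_status == "D" else [True, False]
    b_options = [b_val] if b_status == "D" else [True, False]
    outcomes = {func(x, y) for x, y in product(a_options, b_options)}
    return current, ("D" if len(outcomes) == 1 else "P")


# A = F&P, B = F&D under AND: B is DETERMINED FALSE, so the result is fixed
result = evaluate("AND", (False, "P"), (False, "D"))
```

The value of `result` is `(False, "D")`, in agreement with the corresponding row of Table 7.1.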
Some functions have been borrowed from procedural computer languages:
• ABS()   : returns the absolute value of the argument
• SGN()   : returns -1, 0 or 1 for respectively negative, zero or positive argument values
• TRUNC() : returns a truncated value of the argument
• ROUND() : returns a rounded value of the argument

A limited set of string operations is supported:
• MID$()   : returns a sub string of given length starting at a given position
• LEFT$()  : returns a sub string from the left of a given length
• RIGHT$() : returns a sub string from the right of a given length
• UCASE$() : returns argument string in uppercase characters
• LCASE$() : returns argument string in lowercase characters
• INSTR()  : returns position of search string in pattern string
• LEN()    : returns length of argument string

String functions are allowed in CONSTRAINTs since string comparison is supported in accordance with BASIC conventions. String concatenation is not supported. In some special functions, use of string arguments implies string comparison or reference to parameters in TELITAB databases. The Modeller cannot assign a string value to a parameter.

QUAESTOR’s syntax differs from third generation imperative languages as it is restricted to operators, functions and values. Both operators and functions only return single values, some Boolean. A function always has a name and its arguments between brackets. RELATIONs and CONSTRAINTs consist of closed sequences of values, functions and operators (tokens), ordered for evaluation in fixed priority. Therefore, the type of result obtained by expression evaluation coincides with the type of result produced by the operator on the highest level. The last operator evaluated is either a logical or a relational operator. The evaluation of RELATIONs and CONSTRAINTs will yield the (DETERMINED or PENDING) Boolean value TRUE or FALSE.
Since QUAESTOR’s main purpose is constraint satisfaction, RELATIONs are used to find parameter values that satisfy the relational operator ‘=’ whereas the CONSTRAINTs are used as condition. Constraint satisfaction does not perform assignments after user assignments. The Modeller determines which constraints, i.e. RELATIONs, need to be re-satisfied. The syntax does not include any statement for e.g. memory management or file access. The interpreter only expects and evaluates functions and operators. Implementation aspects of the interpreter are discussed in section 9.2.

The above functional approach implies that syntactic elements of imperative languages can only be used if they are transformed into functions returning a single value. An example of such a transformation is the following RELATION:
    A=IF(B,LE,3)*B+IF(B,GT,3)*(B+C)     (eq. 1)
For B<=3, A is found to be equal to B, whereas for B>3, A will be equal to B+C. The function IF(B,LE,3) returns 1 if B<=3 and 0 if B>3. An alternative notation for (eq. 1) can be:
    RELATION     CONSTRAINT
    A=B          B<=3          (eq. 2)
    A=B+C        B>3           (eq. 3)
(Eq. 1) requires one frame and the notation in (eq. 2/3) requires four frames in the knowledge base. This example is a functional interpretation of the IF..THEN..ELSE construct in which the IF() function has two numerical arguments and one relational operator written in the FORTRAN way, separated by commas as delimiter instead of points. In this case, the (semi) FORTRAN notation is adopted to simplify expression parsing.

In view of the knowledge representation of numerical design, special purpose functions have been developed. Some of these functions have been defined in order to move basic inferences from the Modeller to the interpreter, i.e. to reduce the appeal to the Modeller, which is the most demanding part of the program from a performance point of view. The evaluation of a special function for e.g. finding the largest value of a number of discrete values is computationally less demanding for the interpreter than for the Modeller when employing an alternative representation of the function. Such an alternative representation will often consist of a large number of mostly elementary RELATIONs and CONSTRAINTs, which all have to be approached by the Modeller for a solution. Benchmarks show that in terms of processing capacity, the actual computations by the interpreter consume only a small proportion of the available hardware performance whereas the actual modelling process requires a relatively large proportion. Functions with such purposes are:
• MAX()       : returns the maximum value of arguments in a list
• MIN()       : returns the minimum value of arguments in a list
• SELECT()    : returns the value of argument number .. or label “..” in a TELITAB set
• MATCHCASE() : returns the number of the case in a TELITAB set which either fully coincides with the set of argument values or which is the nearest one
• NORMANG()   : returns the angular value of argument normalised between -π and +π
• NEAREST()   : returns the value in a list closest to the argument
• POLYGON()   : returns 1 if x, y co-ordinate is within a polygon, else returns 0 [Pandurang, 1992]
• ORCA()      : returns either the total number of cases or the current case number

As a result of e.g. a regression or Fourier analysis, numerical knowledge is often represented in a typical format. The structures of three common formats are implemented in the following functions:
• POL()     : returns value of n-dimensional (with n arguments) polynomial
• TERM()    : returns value of a linear combination of the argument terms
• FOURIER() : returns value of a Fourier series

Data in tabular format is frequently used in naval architecture and much of the numerical knowledge is made available in that format and not as an expression.
This means that the system requires facilities to store and use tabular data as if they were RELATIONs between the column parameters (section 5.2). Interpolation techniques are used to transform TELITAB data into continuous functions, valid within the range of the data. The data storage and retrieval facility is provided by the DATA slot. The tabular representation of RELATIONs is used in a similar manner as 'normal' RELATIONs, taking into account the limitations of the data. The EXECUTE argument is used in these functions to invoke a satellite application generating a TELITAB data set, on which the interpolation is subsequently performed. Four interpolation techniques are implemented:
• SPLINT() : returns the cubic spline interpolated value in two or more dimensions
• DQUAD() : returns the double quadratic interpolated value in two or more dimensions
• LININT() : returns the linear interpolated value in two or more dimensions
• GAUSSINT() : returns the Gaussian interpolated value in two or more dimensions (GRBI, see section 5.2)

Time domain simulation is performed when insight is desired in the behaviour of a system exposed to e.g. time varying forces. By nature, QUAESTOR is able to compute case-wise solutions of models in which arbitrary parameters are varied. In a time domain simulation the varying parameter is 'time', since events develop on that basis. The major difference between time domain simulation and a normal case-wise calculation in design is that in the former the cases, i.e. time steps, are not independent. The conditions and positions computed in the previous time step are required as input for the next case. In essence, time domain simulation is mainly based on numerical integration, for e.g. dynamic systems from force to acceleration, from acceleration to speed and from speed to position. The PREVAL() function is defined to return previous case values, enabling a low level of differential and integral calculus.
The expression:

    A=PREVAL(A,2)+K

results in a value for A that is the sum of K and the case value of A of two cases back. In Frame 7.1, the application of the PREVAL() function is illustrated by presenting a simple mathematical pendulum as 'time domain simulation model'. By constructing differential schemes by means of this function, the non-linear incremental solver of QUAESTOR is able to deal with complex and non-linear systems of differential equations. Although more powerful tools are available for solving such problems (section 3.4), QUAESTOR may be attractive for prototyping or when simulation models are subject to frequent adaptations. Limits in this sense are mainly imposed by the computational performance of the system: complex models simply take too much computer time, spoiling the added value of automated model assembling.

Frame 7.1: Pendulum template

    RELATION                                      CONSTRAINT   REFERENCE
    PHI=PHISTART                                  t=0          initial angle
    Speed=StartSpeed                              t=0          initial speed
    ACC=9.81*SIN((PHI+PREVAL(PHI,1))/2)           t>0          acceleration as a function of angle
    PHI=PREVAL(PHI,1)-(Speed+PREVAL(Speed,1))*
        (t-PREVAL(t,1))/Length/2                  t>0          angle as a function of time
    Speed=PREVAL(Speed,1)+ACC*(t-PREVAL(t,1))     t>0          speed as a function of time

    Parameter    DIMENSION   REFERENCE
    PHI          radians     angle of pendulum in time
    PHISTART     radians     angle at start (t=0)
    t            s           simulation time
    Speed        m/s         speed in time
    StartSpeed   m/s         pendulum speed at t=0
    ACC          m/s^2       acceleration in time
    Length       m           wire length of pendulum

The simulation is started by selecting Speed as goal parameter and by providing a range for time t, e.g. 0(0.1)2. At t=0, the Modeller will ask for a value of StartSpeed. Once a value is provided, Speed is solved. In the next time step, t=0.1, the Modeller will return to time step t=0 to infer the starting condition for PHI, which is done by including PHISTART into the goal list.
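The time stepping of Frame 7.1 can be illustrated outside QUAESTOR with a short Python sketch. It is not the QUAESTOR solver: the function name simulate_pendulum, the fixed-point iteration used to resolve the mutual dependence of PHI, Speed and ACC within a time step, and the iteration count are assumptions made for illustration, and the operator between PREVAL(PHI,1) and the bracketed speed term is taken to be a minus sign. Previous case values, which PREVAL(x,1) would return, are simply the last entries of the result lists.

```python
import math


def simulate_pendulum(phi_start, start_speed, length, times, g=9.81, iters=50):
    """Case-wise pendulum simulation following the RELATIONs of Frame 7.1.

    phi[-1] and speed[-1] play the role of PREVAL(PHI,1) and PREVAL(Speed,1).
    """
    phi = [phi_start]        # PHI=PHISTART      (CONSTRAINT t=0)
    speed = [start_speed]    # Speed=StartSpeed  (CONSTRAINT t=0)
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]          # t-PREVAL(t,1)
        phi_prev, speed_prev = phi[-1], speed[-1]
        phi_new, speed_new = phi_prev, speed_prev
        # The three RELATIONs for t>0 depend on each other, so they are
        # iterated until they agree (assumed stand-in for the solver).
        for _ in range(iters):
            acc = g * math.sin((phi_new + phi_prev) / 2)
            speed_new = speed_prev + acc * dt
            phi_new = phi_prev - (speed_new + speed_prev) * dt / length / 2
        phi.append(phi_new)
        speed.append(speed_new)
    return phi, speed


# Time range 0(0.1)2, as in the text
times = [0.1 * i for i in range(21)]
phi, speed = simulate_pendulum(phi_start=0.1, start_speed=0.0, length=1.0, times=times)
```

With these inputs the angle swings between roughly +0.1 and -0.1 radians, as expected for a pendulum with a period of about two seconds.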
In general it is sufficient in a simulation to select one of the time dependent parameters as top goal parameter. The Modeller will infer which parameters need to be added to the goal list by concluding that previous values returned by PREVAL() functions are PENDING. This inference mechanism gathers all required starting conditions of the simulation.

The CLASS attribute was introduced to partition large knowledge bases into manageable portions. In addition to this purpose, the concept of tagging parameters (e.g. as 'Resistance Component') is used to earmark a group of parameters, i.e. values for a combined numerical operation. The aim of the special functions developed for that purpose is to introduce simple 'bookkeeping' capabilities, generally the domain of spreadsheets. If the total of a series of weight groups is needed in a design model, this value can be obtained by introducing a RELATION in which all group weights are added. In the case of a large number of groups (as e.g. defined in the USN Ship Work Breakdown Structure (SWBS)), and if weight groups are added in the course of knowledge base development, this becomes unwieldy and error prone. By developing special functions that require a parameter CLASS as argument instead of values, it becomes sufficient to assign the correct CLASS to each newly added parameter (e.g. weight group). This ensures that its value is included in the computed total weight. This also implies that any PENDING parameter in that CLASS is added to the goal list if the function is invoked. Functions dealing with CLASSes of parameters are:
• SUMCLASS() : returns the sum of all values in the indicated CLASS
• MEANCLASS() : returns the mean of all values in the indicated CLASS
• MOMCLASS() : returns the first order moment of two CLASSes, e.g.
  total cost is the sum of unit cost times number of units
• STDEVCLASS() : returns the standard deviation between the values of two CLASSes
• LISTCLASS() : returns a list of values in the indicated CLASS which can e.g. be used as input for a satellite program

Thus far, three frame types of the QUAESTOR knowledge base have been discussed, viz. parameters, RELATIONs and CONSTRAINTs. In addition to these frames, some other types were defined for capturing knowledge of a purely procedural nature. The MACRO() expression is introduced as storage unit of a problem definition (goal parameters), the input that is provided by the user and the RELATIONs in the assembled template. An interesting aspect of the MACRO() is that the CONSTRAINT frame is used as storage unit. A CONSTRAINT can be connected to RELATIONs and parameters. In the MACRO() these connections are used to point to the input parameters and to the RELATIONs of the template. The input provided in the training session is encrypted in the MACRO expression. Upon request, a MACRO is automatically generated during a dialogue and included in the knowledge base. When a MACRO() is selected for execution, it is converted into a template by the Modeller. Subsequently, the user gets the opportunity to modify the default input values, which are in fact the values provided in the training session. The user is free to provide more information and is also allowed to make unavailable MACRO input parameters PENDING. The Modeller considers these new unknowns as additional goal parameters and will try to infer their values, as in a normal dialogue session. The MACRO frames are imperative applications which are still highly adaptable since full access to the Modeller is maintained. It is not possible to use a MACRO as a subset or 'subroutine' in a newly assembled model.

Special functions have been developed for calling satellite applications.
Such applications are separate executable codes which communicate their input and output by means of TELITAB files. In the following, the most important ones are presented.

The FUNCTION call instructs QUAESTOR to store the values of the listed arguments in a temporary TELITAB file and to invoke the indicated satellite. The satellite expects input in the temporary file and writes its results in a TELITAB output file. The output element indicated in the FUNCTION argument list, either by number or by name, is read and assigned to the FUNCTION as result. If the satellite cannot produce a proper result, the value -999999 should be provided as output, which is regarded as PENDING by QUAESTOR. This results in a warning and in termination of the dialogue or current case. FUNCTION is used like any other special function, which means that input parameter(s) of the satellite can be computed if output values are provided. Whether or not convergence is obtained depends on the nature of the process in the satellite and on the initial values for the iteration provided in the CONTROL slots. If the satellite comprises a non-continuous process, it is advised not to chain any of the input arguments listed in the FUNCTION to the RELATION in which it is used. This is prevented by placing the parameters between braces or by using the OW attribute in the CONTROL. If more than one output parameter of the satellite is required in the knowledge base, as many FUNCTIONs need to be defined as there are required output parameters. QUAESTOR compares current input with previous input and does not invoke the satellite if the input has not changed, thus avoiding unnecessary use of the satellite. Similar to any other function, the FUNCTION call produces one single value at a time.

The EXECUTE call makes it possible for a satellite process to pass a TELITAB data set to the special functions using such data, viz.
SPLINT(), LININT(), DQUAD(), SELECT() and POLYGON(). The use of EXECUTE is illustrated with the SPLINT function:

    Y = SPLINT(0, X, EXECUTE Process(a, b, c))

The first argument (0) means interpolation on a calculated TELITAB set, X is the value on which is interpolated and the sub-expression EXECUTE Process(a, b, c) produces the TELITAB data set by running Process with inputs a, b and c. EXECUTE can only be used in combination with the above special functions and is no alternative for FUNCTION since it is an argument and no value is assigned to it.

8. PARAMETRIC MODELS: THE ASSEMBLING

The whole is more than the sum of the parts.
Aristotle, Metaphysica

In this chapter, the numerical modelling process is presented as an assembling activity. Every assembling process requires that the components of the envisaged product fit together. Modular thinking is common property in many branches of industry. A clear tendency is that large product ranges of highly complex goods like cars and computers are engineered in such a way that they can be assembled from components, such as drive units and processor chips, which are often complex themselves. The connections between the components are standardised and often made relatively simple compared to the complexity of the components. This makes it possible to spread the engineering, production and product responsibility over a large number of specialised manufacturers. An obvious example is the ISA slot in PCs, introduced in the early eighties. This simple standardised connection between computer components has initiated a technological revolution. Any manufacturer of electronic devices became able to develop high tech components which could be assembled into a computer on the kitchen table. In this chapter, similar principles are presented for the assembling of numerical models. First, the development strategy is addressed and an attempt to use the KADS framework is described.
Secondly, the Modeller is described in terms of its primary inferences and finally, the solver strategies are discussed in some detail.

8.1. Development Strategy

Apart from formulating the basic assumptions and intentions in 1987 as presented in section 2.2, I have basically used computer code and the project dossier to solve the theoretical and implementation problems in a step-by-step manner. In section 2.3 the term top-down prototyping was adopted. By doing so, I followed an engineering approach to software development. For developing numerical computer applications, e.g. for resistance, propulsion or stability, this approach proves generally effective. However, for knowledge-based systems (KBS), which are generally of a more complex nature, it appears to have serious drawbacks. One of these is related to the fact that the knowledge and behaviour of the experts to be exhibited by the KBS is mainly implicitly available in the code and sometimes in a verbal form in the project dossier. Only by executing and testing the code for a number of selected cases does it become clear how well the embedded knowledge reflects the behaviour of the expert.

In [Top, 1992 and 1993] the modelling process embedded in QUAESTOR was described on a conceptual level using KADS conventions [Schreiber, 1994]. This work was carried out within the scope of the ESPRIT KADS II project. For the reasons indicated in the previous paragraph, KADS has emerged as a de-facto standard technique for conceptual modelling of KBS. KADS is the name of the first prototype computer tool developed for this purpose and was later transferred to the approach. The conceptual model of QUAESTOR was acquired by means of interviews and by reverse engineering of the prototype system. In KADS a framework for the conceptual modelling of expertise has been developed distinguishing four categories of knowledge or layers, i.e. the domain, inference, task and strategic layer.
The term layer is adopted for the knowledge categories since each successive layer interprets the description at the lower layer. This four layer framework has been successfully applied for the structured acquisition of knowledge at a level between the verbal data provided by experts and the knowledge as represented by the implementation of the system. The central issue in KADS is to entirely separate the implementation aspects of KBS from its design, i.e. to perform the conceptual design of a KBS at the knowledge level [Newell, 1982] and not at the implementation level. According to KADS experience, building a conceptual model without having to worry about system requirements makes life easier for the knowledge engineer. For the design of knowledge-based systems dealing with modelling, a similar approach should be followed as for the design of (numerical) models: Specification, Construction and Assessment.

Not being satisfied with the slow progress achieved with the top-down prototyping approach, in which the actual knowledge is embedded in the code, I have attempted to apply these views. It is attractive to make the involved knowledge explicit and by doing so to become able to indicate gaps in the modelling strategy or to explore new strategies without long lasting exploratory implementation work. However, I have not been able to do so successfully. Apart from my own limitations due to my non-IT background, the model assembling task has appeared to be much less trivial than assumed in 1992. Although the domain model is simple, the selection of RELATIONs and the template maintenance are surprisingly complex. I have been able to gain understanding of the assembling problem only by using the mechanisms involved and particularly by having them used by others, who had no advance knowledge of the embedded strategies and inferences.
The very existence of high level and formal representation techniques for KBS and the effort in their development and dissemination indicate that it is feasible for systems such as QUAESTOR to be designed and specified prior to performing any implementation. For systems operating within a known domain and performing tasks which are similar to those of existing (parent) systems, this approach is likely to make development more cost effective. Comparing the knowledge-level description of the envisaged system with comparable parent models in a library may reveal weaknesses or gaps in the new design (or in the parents), accelerating the development and implying a form of case-based reasoning.

Since knowledge-based parametric model assembling was an unexplored domain with many uncertainties with respect to user behaviour and desire, I finally stayed with top-down prototyping as development approach, i.e. remaining on the implementation level. Prototypes were used to extract strategic knowledge about possible inferences, user behaviour, etc. The experts, who are also potential users of the model assembling tool, already need some desire and capability to abstract. However, evaluating the merits and weaknesses of such a tool when it is represented in a highly abstract format such as KADS is beyond both the capability and the interest of most of these experts. The gap in language between IT specialists and technical experts is not easy to bridge, as I experienced myself. A thorough confrontation of the experts with the KBS design in some form is inevitable since the above mentioned forms of knowledge must be extracted from them anyway. In the case of QUAESTOR, the prototype proved to be an effective way to rouse interest in the design community and to extract valuable knowledge in practical applications. If motivated experts are involved, the cost of the development can stay well within limits.
Although suitable for design research purposes, this approach is, in view of its unpredictable time schedule, hardly suitable for commercial developments in which hard constraints are imposed on the system's release date. The ten year period used to create QUAESTOR is normally not regarded as a realistic time frame for developing an information system.

8.2. Inferences

The Modeller is a subsystem which prepares the template, whether or not in cooperation with the user. The template, a set of RELATIONs selected from the knowledge base, is not simply a (non-linear) system of equations, as it was viewed in earlier developmental phases [vHees, 1992], but a set that is managed in its coherence in the form of a semantic network (Figures 3.2 and 6.3).

In the following, a flow diagram is used to elucidate the control over the primary steps in the model assembling and solution process. These primary process steps are subsequently presented in the form of pseudo code. The implementation/application cycle is considered as my primary source of knowledge during the development, which explains my preference for this presentation, which is relatively close to the implementation. Although flow charts are more appealing, a practical drawback of using them for this purpose is the amount of page space they require. Since the primary process steps can be expressed in the form of simple structures of IF-THEN and WHILE-WEND statements, pseudo code is preferred. Although the pseudo code in the frames is often based on reverse engineering of the code, it does not always resemble the actual implementation.

The outer loop in the process is outlined in Figure 8.1. The loop contains the knowledge maintenance process, indicated as EDIT mode, and the model assembling process, indicated as SOLV mode.

[Figure 8.1: Main loop of QUAESTOR, alternating between KNOWLEDGE MAINTENANCE (EDIT) and the MODELLER (SOLV)]

Knowledge maintenance comprises all functions and facilities to create and maintain knowledge bases.
Some technical details of the knowledge base are provided in sections 9.1 and 9.2. In software terms, the EDIT and SOLV modes are two operational conditions of roughly the same resources. Switching between the modes is initiated by the user selecting goal parameters (EDIT to SOLV) and by leaving (SOLV to EDIT). In Figure 8.2 a global flow diagram of the model assembling and solution process is presented.

[Figure 8.2: Control strategy of parametric model assembling and solution process. Flow diagram with the following steps and decisions: KNOWLEDGE MAINTENANCE; INITIALISE PROBLEM; UNCHAINED PARAMETER? (Y/N); I - SELECT GOAL PARAMETER; II - VALIDATE GOAL PARAMETER; III - SELECT CANDIDATE RELATIONS; IV - SELECT CANDIDATE & INCLUDE IN TEMPLATE; Frame 8.2: PERFORM FORWARD CHAINING; V - FIND & SOLVE SUBSYSTEM; RESTART WITH OTHER INPUT? (Y/N); OTHER COMPUTABLE PARAMETERS? (Y/N); REPORT SOLUTION; MODEL COMPLETED? (Y/N); WRITE SOLUTION TO WORKBASE & NEXT CASE; ALL CASES COMPLETED? (Y/N)]

In the following, each item in Figure 8.2, representing an important step in the model assembling process, is elucidated. The main loop in the modelling process deals with the selection of RELATIONs for each PENDING parameter in the goal list, representing a backward strategy of reasoning. Initially the development focused on the selection of n equations and n variables without caring about which variable is calculated by which RELATION. This was in fact the trivial model assembling process referred to in [Top, 1992]. An important step ahead was to recognise a RELATION as a Production Rule with a single Valued Conclusion, i.e. one RELATION produces the value of one parameter, no matter if the parameter is implicit and present in more than one RELATION. Which parameter is produced by which RELATION is context dependent.
This observation initiated the development of a reasoning process able to properly deal with arbitrary user interference. The backward reasoning strategy searches for each unchained goal parameter a suitable RELATION to compute its value, i.e. the strategy chains a suitable RELATION to each goal and sub goal parameter and these connections (see semantic relations in Figure 3.2) are maintained and used in the reasoning process. In pseudo code this strategy is presented in Frame 8.1.

Frame 8.1: Backward reasoning

    WHILE (unchained parameters in goal list)
      Select unchained parameter from goal list
      WHILE (RELATION for the requested parameter is not yet selected)
        Select RELATION
        IF Approved & CONSTRAINTs of RELATION are fulfilled THEN
          include RELATION in template
          include unchained and PENDING parameters in RELATION in goal list
        END IF
      WEND
    WEND

Backward reasoning is the core strategy because the system must propose parameters to the user (ask questions), which are the PENDING parameters in the selected or proposed RELATIONs. The RELATIONs in which the current goal parameter is used are collected from the knowledge base and the one most suitable for the current goal parameter is selected, i.e. chained to this goal parameter. However, if only backward reasoning is used, there is a clear risk that the system will put forward superfluous questions, tempting the user to wrong decisions which may yield data inconsistency. In this context a superfluous question is a request for a value which can already be derived from the available data in the workbase and the RELATIONs in the knowledge base. The demand for user competence reduces if such questions are not asked. This is realised by introducing forward reasoning as a screening clause [Clancey, 1982]. This means that the system checks whether it is possible to derive additional values on any new supply of data, either input or computed values.
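The backward reasoning loop of Frame 8.1 can be sketched in Python as follows. This is a simplified illustration, not the QUAESTOR implementation: the names Relation and backward_chain are hypothetical, the first available candidate is taken instead of a ranked and user-approved one, CONSTRAINTs are ignored, and a parameter for which no RELATION is left is simply collected in a list of values to ask the user.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    name: str
    params: frozenset  # all parameters occurring in the expression


def backward_chain(top_goal, relations, determined):
    """Chain one RELATION to each unchained goal parameter (cf. Frame 8.1).

    Returns (chained, to_ask): which RELATION computes which parameter, and
    the PENDING parameters that no remaining RELATION can compute.
    """
    goal_list = [top_goal]
    chained = {}   # parameter -> name of the RELATION chained to it
    to_ask = []    # assumed stand-in for questions put to the user
    used = set()   # RELATIONs already included in the template
    while goal_list:
        goal = goal_list.pop(0)
        if goal in chained or goal in determined or goal in to_ask:
            continue
        candidates = [r for r in relations
                      if goal in r.params and r.name not in used]
        if not candidates:
            to_ask.append(goal)
            continue
        rel = candidates[0]  # assumption: no ranking, approval or CONSTRAINT check
        used.add(rel.name)
        chained[goal] = rel.name
        # Unchained and PENDING parameters of the RELATION become sub goals
        for p in sorted(rel.params):
            if p != goal and p not in determined and p not in chained:
                goal_list.append(p)
    return chained, to_ask


# Hypothetical knowledge base: R1 is A=B+C, R2 is B=2*D, and D is known
kb = [Relation("R1", frozenset("ABC")), Relation("R2", frozenset("BD"))]
chained, to_ask = backward_chain("A", kb, determined={"D"})
# chained == {"A": "R1", "B": "R2"}, to_ask == ["C"]
```

In this example C would be asked from the user, although a forward reasoning step could reveal whether C is already derivable from the data at hand.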
The implemented forward reasoning strategy is limited since it only recursively selects RELATIONs that can be 'fired', i.e. RELATIONs in which all parameters except one are DETERMINED and/or determinable. Given the current set of computed and provided (DETERMINED) values and the HARD RELATIONs available in the knowledge base, the determinable parameters can be computed. SOFT RELATIONs in the knowledge base are not considered by this strategy since that would result in an undesirable increase of system-user interaction. HARD RELATIONs that can be used to compute the determinable parameters are called forward selected RELATIONs. Performance limitations do not allow the Modeller to actually calculate the determinable parameters, as was attempted in earlier versions. This also implies that CONSTRAINTs including determinable parameter(s) are not evaluated either. As soon as hardware and software performance allow, it is desirable to re-implement the immediate calculation of determinable parameters, since the introduction of data inconsistency is still possible within the current strategy.

Another possible source of data inconsistency is the way in which QUAESTOR asks for input values. Parameters are presented in clusters. These clusters contain the parameters in a proposed RELATION or CONSTRAINT which are PENDING, user input (USR/USL/VR in CONTROL) and not presented before. While the user provides a value for each of the parameters within a cluster, there is a possibility that one of the other, still PENDING parameters becomes computable through other RELATIONs. A solution may be to perform a screening of the other still PENDING parameters of the cluster by forward reasoning each time a new value is provided. This is not done, since this inconsistency risk is considered acceptable compared to the disadvantageous performance implications of the additional screening. Forward reasoning is only performed after accepting the input for the complete cluster, if any.
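The forward selection described above can be illustrated with a small Python sketch. It is a simplified reading of the strategy, not the Modeller's algorithm: the function name forward_select and the dictionary layout are hypothetical, the sketch repeats a sweep over all RELATIONs until nothing changes rather than working from a list of recently acquired values, and CONSTRAINTs are left out. As in the text, nothing is actually calculated; parameters are only marked as determinable.

```python
def forward_select(relations, determined):
    """Mark parameters as determinable by 'firing' HARD RELATIONs.

    relations:  dict mapping RELATION name -> (set of parameters, is_hard)
    determined: set of parameters with a provided or computed value
    Returns a dict: determinable parameter -> forward selected RELATION.
    """
    determinable = {}
    changed = True
    while changed:
        changed = False
        for name, (params, is_hard) in relations.items():
            # SOFT RELATIONs and already selected RELATIONs are skipped
            if not is_hard or name in determinable.values():
                continue
            known = set(determined) | set(determinable)
            unknown = set(params) - known
            if len(unknown) == 1:
                # All parameters but one are DETERMINED or determinable
                determinable[unknown.pop()] = name
                changed = True
    return determinable


# Hypothetical knowledge base: R1 and R2 are HARD, R3 is SOFT
kb = {
    "R1": ({"A", "B", "C"}, True),
    "R2": ({"B", "D"}, True),
    "R3": ({"C", "E"}, False),
}
result = forward_select(kb, determined={"C", "D"})
# result == {"B": "R2", "A": "R1"}
```

Here B becomes determinable through R2, after which R1 can be fired for A. The SOFT RELATION R3 is never considered, so E stays PENDING.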
Frame 8.2 presents the forward reasoning process in pseudo code. The Modeller maintains a list of parameters (CheckList) that are either computed or input during the dialogue. The forward reasoning process checks for each CheckList parameter whether it can contribute to the calculation of other parameters and removes it from this list after checking. The process departs from these most recently acquired values and has access to all previously acquired values in the workbase.

Frame 8.2: Forward reasoning

    WHILE (CheckList not empty)
      Select next parameter from CheckList
      Find expressions (RELATIONs & CONSTRAINTs) in which this parameter is
        present, not used in template, not rejected, not FALSE
      WHILE (not all expressions checked AND parameter not determinable)
        Select next expression
        IF expression = CONSTRAINT THEN
          Evaluate CONSTRAINTs and if FALSE and DETERMINED, remove connected
            RELATIONs from template, if any
        ELSE IF RELATION AND HARD (no user involvement) THEN
          Determine its PENDING and not determinable parameters
          IF only one PENDING parameter AND computable by RELATION THEN
            Evaluate its connected CONSTRAINTs, if any
            IF none, or CONSTRAINTs are TRUE AND DETERMINED THEN
              Single PENDING Parameter is determinable
              RELATION is forward selected
            END IF
          END IF
        END IF
        Expression checked
      WEND
      Parameter checked, remove from CheckList
    WEND

I. Select goal parameter

In Frame 8.3 the procedure for finding the next unchained parameter is outlined. This parameter will subsequently be chained to a RELATION. The purpose of this loop is twofold. The first is to find one unchained parameter which is preferably a determinable parameter, since these are already computable. The second purpose is to check whether there are chained but PENDING parameters which are determinable by another RELATION than the one to which they are chained by backward reasoning. If so, the latter RELATION is removed from the template.
By replacing it with the forward selected RELATION, it becomes certain that the goal parameter becomes computable. The RELATION which is removed becomes available for other purposes.

Frame 8.3: Select goal parameter

    Select first unchained parameter of goal list as initial goal parameter
    Give determinable goal parameters priority over other goal parameters:
    WHILE no forward chained and unchained goal parameter (PAR1) found
      Select next parameter from goal list
      IF parameter is determinable THEN
        IF parameter is unchained THEN
          PAR1 = parameter (PAR1 found)
        ELSE IF parameter is PENDING THEN
          IF the forward selected RELATION for PAR2 already chained to PAR2 THEN
            Remove this parameter and RELATION from the list of forward
              selected RELATIONs and determinable parameters.
          ELSE
            The current parameter is chained to another RELATION than the one
              that can now compute its value, given the data in the work base.
              If no unchained determinable goal parameter is found, the
              conclusion is drawn that this RELATION is not the right one.
            PAR2 = parameter (PAR2 found)
          END IF
        END IF
      END IF
    WEND
    IF PAR1 is found THEN
      goal parameter is PAR1
    ELSE IF PAR2 is found THEN
      No determinable goal parameter found
      At least one of the chained and PENDING parameters in the template is
        determinable. Remove this RELATION from the template so that it can be
        immediately replaced by the forward selected RELATION (Backtracking).
      So, goal parameter is PAR2
    ELSE
      initial goal parameter is selected
    END IF

II. Validate goal parameter

In Frame 8.4 it is checked whether the goal parameter was introduced by a CONSTRAINT which has become computable in the meantime. If so, the parameter is no longer a sub goal and can be removed from the goal list. Sub goal parameters are normally introduced in the template by RELATIONs.
However, depending on the availability of suitable RELATIONs in the knowledge base, PENDING parameters of CONSTRAINTs of a suitable RELATION may be added to the goal list prior to actually admitting that RELATION into the template. The computed values of these CONSTRAINT parameters will finally determine whether this RELATION can indeed be introduced. The purpose of a later re-evaluation of these CONSTRAINTs is that CONSTRAINTs may become DETERMINED due to one or more of their parameters becoming DETERMINED. As shown in Table 7.1, not all PENDING parameters in a CONSTRAINT need to be DETERMINED in order to produce a DETERMINED Boolean value.

Frame 8.4: Validate goal parameter

    Check whether the selected goal parameter is only present in a CONSTRAINT
      which can now be evaluated
    IF goal parameter is no top goal parameter THEN
      IF goal parameter is not present in RELATION(s) of the template THEN
        Goal parameter is present in CONSTRAINTs of template RELATIONs
        Evaluate these CONSTRAINTs
        IF CONSTRAINTs DETERMINED THEN
          Goal parameter is obsolete
          Remove goal parameter from goal list
          select next unchained goal parameter, if any (Frame 8.3)
        END IF
      END IF
    END IF

III. Select candidate RELATIONs

In the above two frames, the goal parameter was selected and validated. In Frame 8.5, a pre-selection is performed of all RELATIONs which are available in the knowledge base for computing the goal parameter. From this list of candidates, the most 'probable' one will later be selected or proposed to the user. The algorithm in this frame also applies the heuristic rules presented in Frame 8.6 to prioritise the selected RELATIONs. This algorithm is intricate in its ability to deal with events in which no suitable candidates are found.
In those events, the problem is extended with e.g.:
• finding PENDING parameters of particular CONSTRAINTs
• removing RELATIONs from the template to make them available for other goal parameters
• removing ('pruning') branches from a template that have been arranged for a goal parameter that did not prove to be computable or that introduced non-computable sub goals, etc.

The pseudo code in Frame 8.5 only presents the spine of the algorithm. To prevent the selection process from being trapped in a deadlock, a backtracking technique is applied. By preserving data on recent reasoning steps, dead alleys once found are avoided later, resulting in pruning and attempts to find other network paths. Two aspects of the backtracking strategy are indicated in Frames 8.3 and 8.5.

Frame 8.5: Select candidate RELATIONs

    Determine list of candidate RELATIONs for the current goal parameter which
      are not yet in the template, not rejected, not FALSE, not recently
      backtracked, within defined CLASS, if any.
    IF no candidates found THEN
      IF goal parameter is top goal THEN
        Top goal not computable, remove from goal list
      END IF
      IF more unchained parameters in goal list THEN
        Select next goal parameter (Frame 8.3)
      ELSE
        Start calculation
      END IF
    END IF
    DO
      Select next candidate
      IF in candidate RELATION parameters are present that:
          - are already in the goal list AND
          - not being the current goal parameter AND
          - are not yet chained to a relation AND
          - when the current candidate is the only available one for another,
            still unchained goal parameter
      THEN
        Candidate is invalid, select next candidate
      ELSE IF RELATION is EMPIRICAL and another EMPIRICAL RELATION is present
              in the template with the same left clause THEN
        Candidate is invalid, select next candidate
      ELSE
        Check CONSTRAINTs of candidate, if any
        IF no CONSTRAINTs or CONSTRAINTs TRUE THEN
          Candidate is valid
          Give candidate priority ranking (Frame 8.6)
        END IF
      END IF
    LOOP
    Possible candidates = candidates - invalid candidates - valid candidates
    If no valid candidate(s) are found, add PENDING parameters of CONSTRAINTs
      of possible candidates to the goal list
    IF no valid candidate(s) are found THEN
      IF goal parameter is no top goal OR suitable RELATION in template for
         other goal parameter THEN
        prune template branch with non-computable goal parameter (backtracking)
        select next goal parameter (Frame 8.3)
      END IF
    END IF

In Frame 8.6, the heuristic rules for priority ranking are provided in an explicit form. These heuristic rules are mainly experimentally derived and are partly based on my pragmatic preferences in the selection of model fragments.
Frame 8.6: Heuristic priority ranking of candidate RELATIONs

Candidate priority ranking (higher number is higher priority):
1 - Priority mark P = (-) number of unchained and PENDING parameters in the candidate
2 - Context limiting: if CLASS of goal parameter is equal to CLASS of RELATION, set priority to P=1
3 - If the left clause is the current goal parameter, then give high priority mark P=1
4 - If CONSTRAINT(s) DETERMINED, increase priority to P=0 if P<0 and P=P+1 if P>=0
5 - If candidate is OneWay, increase priority P=P+1
6 - Low priority if parameter is recently backtracked: P=-10
7 - Select simplest expression in the event of equal ranking
8 - HARD RELATIONs are always preferred over SOFT RELATIONs, independent of the ranking result of rules 1-7

In ranking the priority of the candidate RELATIONs a relatively simple approach is adopted. By the nature of heuristic rules, it is not really possible to ‘prove’ the rules 1-8 in Frame 8.6. The rules are experimentally derived and tuned on the domain, and the Modeller in general shows desirable ‘behaviour’ by using them. The applied strategy is to assign a ranking number in a hierarchical way; some aspects are assumed to overrule others rather than both aspects being weighted in the ranking. Another practical argument against weighting is that in general only a few of the rules 1-8 apply.

The first step (rule 1) is to count the number of unchained and PENDING parameters in the RELATION. This value (times -1) is a measure of the effort involved in the completion of the template (see also section 6.7: CVM Development Aspects) and is assigned as initial ranking to the RELATION.

The second rule applies context limiting. It is assumed that if a RELATION has the same CLASS attribute as the current goal parameter, it should be given a higher ranking.
In view of the purpose and use of the CLASS attribute, it is assumed that such a RELATION has a larger probability of being applicable, irrespective of its number of unchained and PENDING parameters. In this event the ranking is set to 1 and the same is done when rule 3 applies. It is observed that RELATIONs having the same left clause as the goal parameter are preferred over simpler expressions in which the goal parameter is present in the right clause.

Rule 4 implies that RELATIONs with one or more DETERMINED CONSTRAINTs are preferred over ones without. The background of this rule is that these CONSTRAINTs require values that are apparently provided or computed in an earlier stage. These values may indicate an implicit selection of RELATION(s). If e.g. a CONSTRAINT Method=1 is connected and Method has been given the value 1, RELATIONs connected to this CONSTRAINT are obviously preferred over RELATIONs with other or even without CONSTRAINTs.

Somewhat unexpectedly, OneWay RELATIONs have a slight preference over TwoWay RELATIONs. Necessarily, these RELATIONs will also fulfil rule 3, since their left clause is equal to the current goal parameter. The fact that OneWay RELATIONs are preferred is purely heuristic; the Modeller performance is slightly improved by the application of this rule. This preference is expressed by a ranking increment of 1.

Frame 8.5 refers to backtracked RELATIONs. Backtracked RELATIONs were once considered suitable but appeared to be unsuitable later on, e.g. due to the impossibility of chaining (one of) their sub goals to a RELATION. Later in the assembling process, such RELATIONs may become suitable again, implying that they should not be rejected. A RELATION that is rejected either by the Modeller or by the user is not available anymore unless this is explicitly stated (i.e. it is ‘unrejected’) by the user.
Summarising, a backtracked or previously unsuitable RELATION should not (immediately) be selected again, since this would cause the Modeller to become trapped in a deadlock. This is avoided by rule 6, which gives backtracked RELATIONs an arbitrary low ranking of -10, no matter the outcome of applying rules 1-5. The list of backtracked RELATIONs is erased as soon as new parameter values become available in the workbase, either as input or by calculation.

Rule 7 deals with events in which rules 1-6 yield the same ranking for two or more RELATIONs. This is done by preferring the simplest, i.e. shortest RELATION, which is consistent with rule 1. Rule 7 is implemented in the PROPOSE algorithm. The same holds for rule 8, which states that HARD RELATIONs are always preferred over SOFT RELATIONs; this is realised by first proposing candidates from the acquired HARD group, if any (Frame 8.7).

IV. Select candidate and include in template

In Frame 8.7 the previously selected candidate RELATIONs are separated into a HARD and a SOFT set and are subsequently proposed and/or selected on the basis of priority ranking. In the event of two RELATIONs having the same priority, the heuristic rule 7 instructs to select the simplest, i.e. shortest expression. The HARD candidates are always proposed first (rule 8). Depending on the CONTROL of both the candidate and its parameters, it is either proposed to the user or immediately introduced into the template. Proposing a RELATION and its parameters to the user enables him to accept or reject it, to access the knowledge base, to provide values to PENDING parameters or even to arbitrary parameters, or to make arbitrary parameters PENDING. All these actions may result in template and workbase maintenance steps. The pseudo code in Frame 8.7 only presents the candidate selection mechanism. User interference and inclusion of RELATIONs in the template is governed by the PROPOSE algorithm and is not further discussed.
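The ranking rules of Frame 8.6 and the HARD-before-SOFT ordering of step IV can be illustrated with a minimal Python sketch. It is not the QUAESTOR implementation: the Candidate record, its field names and the example expressions are hypothetical, and the sketch assumes that the rules are applied strictly in the order 1 to 6, with rules 7 and 8 handled when the candidates are ordered for proposal.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical stand-in for a candidate RELATION (field names are assumptions)."""
    expression: str                       # textual form, used for the rule 7 tie-break
    left_clause: str                      # parameter on the left-hand side
    n_open: int                           # number of unchained and PENDING parameters
    relation_class: str = ""              # CLASS attribute, empty if none
    constraints_determined: bool = False  # at least one DETERMINED CONSTRAINT
    one_way: bool = False
    backtracked: bool = False
    hard: bool = True                     # HARD (True) or SOFT (False)


def priority(c: Candidate, goal: str, goal_class: str = "") -> int:
    """Apply rules 1-6 of Frame 8.6 hierarchically (later rules overrule earlier ones)."""
    p = -c.n_open                                    # rule 1
    if goal_class and c.relation_class == goal_class:
        p = 1                                        # rule 2: context limiting
    if c.left_clause == goal:
        p = 1                                        # rule 3
    if c.constraints_determined:
        p = 0 if p < 0 else p + 1                    # rule 4
    if c.one_way:
        p += 1                                       # rule 5
    if c.backtracked:
        p = -10                                      # rule 6
    return p


def proposal_order(cands, goal, goal_class=""):
    """Rule 8 first (HARD before SOFT), then ranking, then shortest expression (rule 7)."""
    return sorted(
        cands,
        key=lambda c: (not c.hard, -priority(c, goal, goal_class), len(c.expression)),
    )


# Illustrative candidates for a goal parameter V (expressions are examples only).
candidates = [
    Candidate("V = Disp / Rho", "V", n_open=2, one_way=True, hard=False),
    Candidate("Cb = V / (L * B * T)", "Cb", n_open=0),
    Candidate("V = L * B * T * Cb", "V", n_open=1),
    Candidate("V = Cb * L * B * T * 1.0", "V", n_open=1),
    Candidate("V = Lwl * B * T * Cb", "V", n_open=1, backtracked=True),
]
ordered = [c.expression for c in proposal_order(candidates, goal="V")]
```

In this example the SOFT candidate obtains the highest ranking number but is still proposed last, which reflects the statement that rule 8 is independent of the outcome of rules 1-7.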
Frame 8.7: Select candidate and include in template

WHILE (not selected and candidates available)
  Select next candidate from SOFT and HARD pre-selections in sequence of computed priority and expression length
  IF HARD candidate THEN
    PROPOSE Candidate and its PENDING USR/L and VR parameters if any (it is then also possible to reject the candidate)
    IF accepted THEN
      include RELATION in template
      include unchained and PENDING parameters in goal list
      Start Solver
    ELSE IF values provided THEN
      start Solver
    END IF
  END IF
  IF SOFT candidate THEN
    PROPOSE candidate and its PENDING USR/L and VR parameters, if any
    IF accepted THEN
      include RELATION in template
      include unchained and PENDING parameters in goal list
      Start Solver
    ELSE IF values provided THEN
      start Solver
    END IF
  END IF
WEND

V. Find subsystem and solve

After each extension of the template, it is attempted to find a solvable subsystem of equations. This subsystem may be a 1x1 system as a minimum but also a cycle or strong component [Serrano, 1992]. This algorithm makes it possible to deal with very large templates. The main loop in the algorithm presented in Frame 8.8 performs a search per parameter, i.e. for each chained and PENDING parameter a recursive search is performed in the template for a complete nxn system of equations or strong component, if any. If a strong component is found, it is solved. PENDING parameters of CONSTRAINTs are allowed in a strong component if they also belong to its output. If the solution leads to RELATIONs becoming invalid, they are pruned from the template, together with their sub goal parameters and chained-to RELATIONs, if any. Each time a subsystem of equations is solved, the Modeller performs a forward reasoning search on the computed parameters (Frame 8.2).
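The notion of a strong component can be made concrete with a small Python sketch. The text does not state which search algorithm the Modeller applies, so Tarjan's algorithm is swapped in here purely for illustration. The dictionary deps and the parameter names are hypothetical: each parameter maps to the parameters required by the RELATION to which it is chained.

```python
def strong_components(deps):
    """Return the strongly connected components of a dependency graph (Tarjan).

    deps maps each parameter to the parameters it depends on. Components are
    returned with dependencies first, so they can be solved in that order.
    """
    index, low = {}, {}
    stack, on_stack = [], set()
    result = []
    counter = [0]

    def visit(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in deps.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            # v is the root of a component: pop the stack down to v
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            result.append(component)

    for v in deps:
        if v not in index:
            visit(v)
    return result


# Illustrative template: A and B depend on each other (a cycle, i.e. a 2x2 system),
# C can be evaluated once A is known, D is an independent 1x1 system.
template = {"A": ["B"], "B": ["A"], "C": ["A"], "D": []}
components = strong_components(template)
```

A component with more than one parameter corresponds to a cycle that must be solved simultaneously, whereas single-parameter components are 1x1 systems.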
Frame 8.8: Find subsystem and solve

Start:
  Perform forward reasoning on newly acquired values
  WHILE not all chained and PENDING parameters in template are checked
    Select next unchecked, PENDING and chained parameter in template
    Find Strong Component for parameter
    IF Strong component found (at least 1 x 1 system)
      Solve system of equations
      IF successful THEN
        Go back to Start
      END IF
    END IF
  WEND
THEN Select next goal parameter, if any

VI. Miscellaneous

In Figure 8.2 two additional steps are indicated, respectively write solution to workbase & next case and report solution. In the former, the computed solution is transferred to the workbase, which also comprises its management. In a multi case calculation parameters switch from single value into multiple value, either because their value varies with the input or because a parameter is newly introduced or removed from the template. The latter comprises the presentation of the results to the user and the production of printed reports in a generic format.

In the assembling and solution process as described in the previous pages, use is made of a variety of additional algorithms which are either briefly mentioned, such as the PROPOSE algorithm, or not addressed at all. All these algorithms are important in the sense that the Modeller cannot work without any one of them. However, the above descriptions represent a condensed and complete view on the overall process flow and philosophy.

8.3. Solver Strategies

The template contains a number of PENDING parameters which are either present in explicit or implicit expressions. In the event of explicit expressions, the value of the parameters can be computed by immediate evaluation of these expressions, even if the right clause of that RELATION contains PENDING parameters which are explicit in the RELATIONs to which they are chained in the template. In the sequel this is referred to as symbolic substitution.
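Evaluation of explicit template parameters by substitution can be sketched as a recursive evaluation. The Python fragment below is only an illustration under stated assumptions: the relations dictionary, which maps a left clause parameter to a function and its right clause parameters, and all parameter names are hypothetical and do not reflect the internal representation used in QUAESTOR.

```python
def evaluate(param, relations, values, _active=None):
    """Compute param by recursively evaluating the RELATIONs it is chained to.

    relations: {left clause parameter: (function, [right clause parameters])}
    values:    known values (the role of the workbase); results are added to it
    """
    if param in values:
        return values[param]
    active = _active if _active is not None else set()
    if param in active:
        # The parameter is only implicitly defined: iteration is required.
        raise ValueError(f"{param} is part of a cycle; an iterative Solver is needed")
    if param not in relations:
        raise KeyError(f"no RELATION available for {param}")
    active.add(param)
    func, args = relations[param]
    result = func(*(evaluate(a, relations, values, active) for a in args))
    active.discard(param)
    values[param] = result
    return result


# Illustrative explicit template: Volume needs Area, which is itself explicit.
relations = {
    "Area": (lambda length, breadth: length * breadth, ["Length", "Breadth"]),
    "Volume": (lambda area, draught: area * draught, ["Area", "Draught"]),
}
workbase = {"Length": 100.0, "Breadth": 20.0, "Draught": 5.0}
volume = evaluate("Volume", relations, workbase)
```

If the chain of RELATIONs leads back to a parameter that is still being evaluated, the sketch raises an error instead of recursing endlessly; this is the situation in which the template contains a cycle.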
If the PENDING parameters are only expressed implicitly in the template, i.e. are only present in the right clause(s) of RELATIONs, the template contains one or more cycles. The latter category of template parameters can only be solved by means of an iterative process. From the perspective of a designer, the ability to solve implicit template parameters is attractive since it allows inverse use of expressions, methods and satellite programs (see also the example in section 6.5). This desire requires an iterative Solver which is sufficiently general and robust to deal with the large variety of numerical problems occurring in design. This Solver should perform its task with the least possible user interference. Since use is made of an interpreter for expression evaluation, which is slow compared to compiled code, the efficiency of the iterative Solver has received much attention.

The Modeller controls the Solver by delivering the system of equations, i.e. the set of RELATIONs and PENDING parameters taken from the assembled template. Already in an early development stage it was decided to use a gradient method, i.e. a multi-dimensional Newton-Raphson (NR) method, and not a Regula Falsi (RF) based method [Press, 1992]. Although the RF method is theoretically more robust than the NR method, the latter is preferred due to its higher convergence speed. The convergence speed of the RF method is in the order of h, whereas for NR this value is in the order of h^2. The convergence speed is a measure for the number of expression evaluations to be performed in an iteration. Since expression evaluation is performed by an interpreter and is costly in terms of computer time, it is desirable to use a method with the highest convergence speed, i.e. which requires the least possible number of expression evaluations.
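The difference in convergence speed can be illustrated with a one-dimensional Python sketch that counts function evaluations for both methods. The test function x^2 - 2, the tolerance and the starting values are arbitrary choices made for this illustration, and the NR variant assumes that an analytical derivative is available, which counts as one additional evaluation per step.

```python
import math


def newton_raphson(f, df, x0, tol=1e-12, max_iter=100):
    """One-dimensional NR; returns (root, number of evaluations of f and df)."""
    x, evaluations = x0, 0
    for _ in range(max_iter):
        fx = f(x)
        evaluations += 1
        if abs(fx) < tol:
            return x, evaluations
        x -= fx / df(x)
        evaluations += 1
    raise RuntimeError("NR did not converge")


def regula_falsi(f, a, b, tol=1e-12, max_iter=100):
    """Regula Falsi; needs two starting values that bracket the root."""
    fa, fb = f(a), f(b)
    evaluations = 2
    if fa * fb > 0:
        raise ValueError("starting values must lie on both sides of the solution")
    for _ in range(max_iter):
        x = (a * fb - b * fa) / (fb - fa)   # intersection of the secant with zero
        fx = f(x)
        evaluations += 1
        if abs(fx) < tol:
            return x, evaluations
        if fa * fx < 0:
            b, fb = x, fx
        else:
            a, fa = x, fx
    raise RuntimeError("RF did not converge")


def f(x):
    return x * x - 2.0


def df(x):
    return 2.0 * x


nr_root, nr_evals = newton_raphson(f, df, x0=1.0)
rf_root, rf_evals = regula_falsi(f, a=1.0, b=2.0)
print(f"NR: {nr_root:.10f} after {nr_evals} evaluations")
print(f"RF: {rf_root:.10f} after {rf_evals} evaluations")
```

For this well-behaved function both methods find the same root, but NR does so with fewer evaluations even though each NR step costs two.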
Another practical drawback of the RF method is the fact that two, instead of only one, initial values per parameter are required, which need to be on both sides of the final solution. If they are not, additional evaluations are necessary to find these starting values. The NR method requires only one starting value and in practice it has appeared that convergence is generally obtained without a necessity of providing other starting values. For the application domain of QUAESTOR, the NR method appeared to be a good compromise between speed and robustness. Generally, the method yields good convergence performance in the application domain since most functions are well behaved. In addition to this, starting values of one order of magnitude different from the final solution are in general sufficiently accurate to reach convergence.

These tentative values of the parameters are considered as a control attribute of the parameter. For each parameter in the knowledge base this value is defined only once by the knowledge engineer and is stored as INI value in the parameter CONTROL slot. Apart from being used as iteration starting value, the INI values are used to calculate the initial global and local criteria of convergence. During the NR iteration, these criteria are dynamically adapted on the basis of the intermediate solutions.

A drawback of gradient methods in general is that they require continuously differentiable functions, which imposes limits on the form and the application of numerical expressions (RELATIONs) in a knowledge base. Also, the partial derivatives should always be non-zero to allow a solution. Another problem is that NR based iterations yield only one root whereas there can be more than one.
In the case of non-negative (NN) parameters (most parameters representing physical systems are NN), the problem of convergence towards negative roots can be overcome by adjusting the relaxation factor by either a manual or an automatic selection of other starting values.

The most frequently used NR method in the QUAESTOR Solver is the full multi-dimensional method. This implies that for each iteration a full Jacobian is determined. This method yields fast convergence but requires that in each iteration all partial derivatives are computed. In the event of an iterative use of satellite applications this method may require many program runs and hence an impractical amount of computer time. In order to reduce the number of satellite runs in such cases, the option of using a Quasi Newton-Raphson (QNR) method is introduced. The QNR method implemented in the QUAESTOR Solver only computes the full Jacobian in the first iteration and subsequently iterates with it. In general, the QNR method requires more iteration steps with fewer expression evaluations (and satellite calls) per step. The method is less robust, however, since the partial derivatives based on initial values may not represent the partial derivatives in the vicinity of the solution.

Another extension to the NR method which is implemented is the Gauss-Seidel method. Instead of computing the full Jacobian each iteration, a step-by-step one-dimensional NR iteration is performed. Each new single value solution is immediately substituted and used in the following evaluations. The method is often faster than the multi-dimensional NR method but is less robust.

In particular when satellite applications are involved, the QNR or Gauss-Seidel NR can be advantageous. In case templates consist of only QUAESTOR RELATIONs or a non-iterative use of satellites, the full multi-dimensional NR method is the most reliable one and is chosen as the default method.
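The difference between the full multi-dimensional NR method and the QNR variant can be sketched in a few lines of Python. This is an illustration only, not the QUAESTOR Solver: the Jacobian is approximated by forward differences, the step size, tolerance and example system are arbitrary assumptions, and the freeze_jacobian switch is a hypothetical name for selecting the QNR behaviour.

```python
def jacobian(funcs, x, r, h=1e-7):
    """Forward-difference Jacobian; r holds the residuals already computed at x."""
    n = len(x)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        shifted = list(x)
        shifted[j] += h
        for i, f in enumerate(funcs):
            jac[i][j] = (f(shifted) - r[i]) / h
    return jac


def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for the small system a * x = b."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda k: abs(m[k][c]))
        m[c], m[p] = m[p], m[c]
        for k in range(c + 1, n):
            factor = m[k][c] / m[c][c]
            for j in range(c, n + 1):
                m[k][j] -= factor * m[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def newton(funcs, x0, freeze_jacobian=False, tol=1e-10, max_iter=200):
    """Full NR (default) or QNR (Jacobian computed in the first iteration only)."""
    x, jac = list(x0), None
    for iteration in range(max_iter):
        r = [f(x) for f in funcs]
        if max(abs(v) for v in r) < tol:
            return x, iteration
        if jac is None or not freeze_jacobian:
            jac = jacobian(funcs, x, r)
        step = solve_linear(jac, [-v for v in r])
        x = [xi + si for xi, si in zip(x, step)]
    raise RuntimeError("no convergence")


# Illustrative 2x2 system: x*y = 6 and x + y = 5, started at (1, 4).
system = [lambda v: v[0] * v[1] - 6.0, lambda v: v[0] + v[1] - 5.0]
full_solution, full_steps = newton(system, [1.0, 4.0])
qnr_solution, qnr_steps = newton(system, [1.0, 4.0], freeze_jacobian=True)
```

In this example both variants arrive at the same solution, but the variant with the frozen Jacobian needs considerably more steps, each of which requires only the residual evaluations.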
As indicated above, the performance of the Solver is greatly affected by the number of evaluations performed in the course of an iteration. It is obvious that in the event of a non-linear system of equations the number of evaluations is proportional to n^2, n being the number of unknowns. However, in QUAESTOR’s application domain, the systems of equations are generally represented in the form of very narrow band matrices. This implies a number of evaluations that is proportional to a value between n and n^2. This is due to the fact that most PENDING template parameters, i.e. (sub) goal parameters, are present in only two RELATIONs or in a number of RELATIONs close to two: the first RELATION is the one which introduces the sub goal and the second one is the RELATION to which this sub goal is chained, i.e. the RELATION which computes the parameter. A top goal parameter can even be present in only one RELATION.

It is advantageous to reduce n both from the perspective of convergence speed and stability and the number of evaluations to be performed. The reduction of degrees of freedom is achieved in the following two manners.

The first step in reducing degrees of freedom is performed by the Modeller which, prior to invoking the Solver, separates a template into a number of single RELATIONs which can be evaluated independently and into a number of strong components, if any. Each time the Modeller receives input, it applies a search strategy for strong components before invoking the Solver (see Figure 8.2 and Frame 8.7). By transforming a larger template into a limited set of such subsystems instead of solving it as a single system of equations, the overall stability and convergence behaviour of the Solver is greatly improved. Even more important is that this strategy allows much larger templates than when the template is solved as one system of equations, which is limited by memory constraints.
The second step in reducing degrees of freedom is symbolic substitution by the Solver. Symbolic substitution is performed by replacing right clause parameters in RELATIONs by pointers to right clauses of RELATIONs in which the parameter is left clause. Whenever the interpreter reads such a pointer, it is called recursively with the pointed expression clause as argument (see section 9.2 for a further discussion of the interpreter). The expression clauses which are referred to may also include other implicit and/or explicit template parameters. To avoid redundant expression evaluation, the Solver keeps track of which expression clauses need to be re-evaluated and which do not. This depends on whether or not the clause contains or refers to parameters having values different from those applied in the previous evaluation of that expression (clause).

A typical aspect of the Solver is the integration of numerical and symbolic error handling for the user. On the symbolic level, the Solver must use the symbolic knowledge in the knowledge base to propose or apply strategies to overcome the error. This symbolic level deals with the CONTROL attributes of RELATIONs and parameters and the validity, i.e. CONSTRAINTs, of the RELATIONs in the template. This implies that in addition to finding a solution of a numerical problem, the Solver must validate the solution against the symbolic knowledge and must take appropriate actions if the validation is not successful. The aspects in the validation are the following:
• Are both the global and local convergence criteria fulfilled?
• Are the CONSTRAINTs of the template still TRUE?
• Did any numerical error occur during expression evaluation?
• Are the obtained values in conflict with the CONTROL of the parameters?

The iteration is controlled by comparing the successively obtained solutions.
On the basis of the initial values in the CONTROL, a global criterion is derived which is roughly the sum of the initial values multiplied by the adjusted solver accuracy, whose default value is e.g. 0.001. For each solution the sum of all absolute values is computed and is compared to the previous solution. Global convergence is achieved in case the difference between two successive sums is less than the global criterion. At the second level, convergence is checked for each parameter in the system of equations by comparing the previous solution with the new one. At the third level, it is checked whether the current solution is not in conflict with a CONTROL attribute, e.g. a result of -5.63 for a non-negative (NN) parameter.

In the expression evaluations performed during the iteration process it can occur that some function cannot be successfully evaluated, e.g. a division by zero, a square root of a negative number or a non-integer negative exponent. In such cases the interpreter returns a default value and saves the address of this RELATION or CONSTRAINT in an error buffer, and the iteration proceeds as if nothing is wrong. As soon as the intermediate results change again into values that allow a successful evaluation of these functions, the error address is removed from the error buffer. This approach requires the interpreter to check the arguments of such critical functions. In theory, these checks decrease the performance of the interpreter but they are required anyway to prevent the program from crashing.

This strategy implies that as soon as convergence is achieved, the buffer must be checked. In case the buffer is not empty, actions should be taken to eliminate the error. Depending on the condition, the error(s) are either reported to the user, or the CONSTRAINTs of the template are checked, or the RELATIONs in which the errors occurred are considered unsuitable and are rejected.
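The three levels of checking described above can be summarised in a short Python sketch. It is a simplified illustration under stated assumptions: the text only specifies the global criterion, so the local criterion used here (the INI value of each parameter times the accuracy) is an assumption, the dynamic adaptation of the criteria during the iteration is left out, and the function names, parameter names and values are hypothetical.

```python
def global_criterion(ini_values, accuracy=0.001):
    """Sum of the INI values multiplied by the solver accuracy."""
    return sum(abs(v) for v in ini_values.values()) * accuracy


def check_solution(previous, current, ini_values, non_negative=(), accuracy=0.001):
    """Return (global_ok, local_ok, control_conflicts) for two successive solutions."""
    # Level 1: compare the sums of absolute values of successive solutions.
    difference = abs(
        sum(abs(v) for v in current.values()) - sum(abs(v) for v in previous.values())
    )
    global_ok = difference < global_criterion(ini_values, accuracy)
    # Level 2: per-parameter comparison (criterion is an assumption, see lead-in).
    local_ok = all(
        abs(current[p] - previous[p]) < abs(ini_values[p]) * accuracy for p in current
    )
    # Level 3: conflicts with CONTROL attributes, here only non-negative (NN).
    control_conflicts = [p for p in non_negative if current[p] < 0]
    return global_ok, local_ok, control_conflicts


# Illustrative values only.
ini = {"Speed": 10.0, "Power": 1000.0}
previous = {"Speed": 12.001, "Power": 5230.0}
current = {"Speed": 12.0005, "Power": 5230.2}
status = check_solution(previous, current, ini, non_negative=["Speed", "Power"])
```

A solution that passes the first two levels may still be rejected at the third, for example when an NN parameter has converged to a negative value.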
When RELATIONs are rejected for this reason, the template is pruned, i.e. the Solver removes the error RELATIONs from the template and subsequently reports to the Modeller that no solution is found. Following this, the Modeller will attempt to complete the template with other RELATIONs, if any.

If the solution has survived the convergence and CONTROL checks, the Solver checks whether the CONSTRAINTs connected to the template RELATIONs are still valid. If they are, the solution is saved in the workbase; if they are not, the connected RELATIONs are removed from the template and the Modeller subsequently attempts to solve the problem using other RELATIONs.

If convergence is achieved without any errors or problems, the checks as discussed above are performed in a fixed sequence. If problems occur with convergence or expression evaluation, or in the event of direct or indirect dependencies between RELATIONs in the template, the focus of the Solver is shifted from the mathematical towards the symbolic level. In these cases, the Solver may need to ‘invoke’ the user (see section 6.3) and may ask assistance in the form of new iteration starting values, may ask the user to select RELATIONs to be removed from the template, etc. In view of the many possible error situations, the generally most effective Solver behaviour and steps to be taken in error situations have been derived and optimised more or less experimentally. Examples of these steps are to check first for numerical errors, then the CONSTRAINTs and then to ask for new iteration starting values.

9. IMPLEMENTATION ASPECTS

To be both a speaker of words and a doer of deeds.
Homer, Iliad

In this chapter some implementation aspects of QUAESTOR are discussed. First, the data management aspects and code constructs are described which had much impact on the implementation of the database system and Modeller. Next the principle of the QUAESTOR interpreter is outlined.
The implementation of the interpreter and the internal representation of expressions in the knowledge base are briefly discussed, also introducing the aspect of code recursion. Next, the merits and limits of code recursion and its application in the QUAESTOR code are elaborated. The chapter concludes with the performance aspects of the code, also in relation to the compiler and operating system versions.

9.1. Data Management

A major difference between QUAESTOR and conventional technical/numerical computer applications for design and simulation lies in the use of structured variables. In the latter class of applications, it is in most cases possible to allocate arrays of given dimensions since the memory demand of the process is either fixed or can be determined prior to execution. In these applications, increasing array dimensions while preserving the current data contents can usually be avoided. However, in a knowledge-based system, memory demand is often uncertain. The large number of structured variables involved makes it unattractive or even impossible to allocate ‘sufficient’ memory to all these variables, in particular given the memory limitations under MS-DOS.

In most imperative languages it is either difficult or impossible to increase the array size while maintaining the contained data. Object Oriented languages such as C++ offer container classes which allow such operations. BASIC offers an implicit reallocation mechanism in its capability to use variable length strings. Consider the concatenation of two variable length strings A$ and B$:

A$="An"          ' length 2 bytes
B$=" example"    ' length 8 bytes
A$=A$+B$         ' length of A$ now 10 bytes
PRINT A$

Results in:

An example

In addition, e.g.:

A$=MKI$(A%)

returns a two byte string representing the binary value of the integer variable A%, whereas

A%=CVI(A$)

converts the 2 byte string A$ into the integer value A%.
It is obvious that these statements can be used to create dynamic arrays of e.g. frame addresses which occupy no more memory space than required for the number of addresses in the set. Another advantage is that the developer is not obliged to manage the number of elements stored in the string variables, simply because the length of a dynamic string is proportional to the number of elements, i.e. addresses, contained. The conversion of integer numbers to strings and vice versa, together with the convenient string concatenation, makes it extremely simple to combine, reduce or extend arrays while maintaining the current data contents. The number of elements can also be zero, implying that memory is used only if necessary. About 120 of such dynamic arrays are used in the database system, Modeller and Solver. Apart from the memory efficiency point of view, these mechanisms facilitate exploratory programming in BASIC. Due to its simplicity many developers choose to build prototypes in BASIC and to translate them later into C for speed or in order to include additional features.

A severe limit of using dynamic strings is imposed by the total amount of available string space, which is usually limited to 64 KB, coinciding with one MS-DOS memory segment. In most 16 bit BASIC compiler implementations the theoretical maximum length of a single string is 32 KB. However, it is observed that a realistic limit is about 10 KB. The above mentioned address sets are generally small, containing in the order of 10-100 addresses and requiring 20-200 bytes of string memory respectively. However, depending on the contents of the knowledge base and on the operations performed, the sets can easily be much larger. This makes it impossible (or highly memory inefficient) to allocate ‘sufficient’ memory to all arrays. In practice, it appeared that all string operations in the QUAESTOR code can be performed within the available 64 KB memory segment.
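For readers unfamiliar with BASIC, the dynamic array mechanism can be mimicked in Python with the standard struct module. The sketch assumes that MKI$ and CVI use a signed 16 bit little-endian layout, as is usual on the PC platform; the helper names mki, cvi, append_address and addresses are introduced here for illustration only.

```python
import struct


def mki(number: int) -> bytes:
    """Counterpart of MKI$: pack a signed 16 bit integer into a 2 byte string."""
    return struct.pack("<h", number)


def cvi(two_bytes: bytes) -> int:
    """Counterpart of CVI: convert a 2 byte string back into an integer."""
    return struct.unpack("<h", two_bytes)[0]


def append_address(packed: bytes, address: int) -> bytes:
    """Extend a dynamic array by plain concatenation, as with A$=A$+B$."""
    return packed + mki(address)


def addresses(packed: bytes) -> list:
    """Unpack all addresses; their number follows from the string length."""
    return [cvi(packed[i:i + 2]) for i in range(0, len(packed), 2)]


# An empty set occupies no memory; every added address costs exactly 2 bytes.
frame_set = b""
for address in (17, 1234, 32767):
    frame_set = append_address(frame_set, address)
print(len(frame_set), addresses(frame_set))
```

The number of elements is never stored separately: it is simply half the length of the packed string, which corresponds to the advantage mentioned above.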
That string manipulation is heavily used is illustrated by the fact that in its 27,000 lines of code well over 12,000 tokens are present which either represent strings or string functions. Although the available 64 KB appeared to be sufficient to accommodate all strings used at runtime, severe string space problems were encountered with the initially used Borland Turbo BASIC compiler (BTB). These problems were related to ‘pollution’ of the string memory, resulting in a program crash sooner or later. Each time memory was required for a (new) string, the first open space sufficient to accommodate that string was allocated. For larger strings this location is often found at the end of the string space already in use, with the effect that eventually string space overflow may occur. This space allocation mechanism results in a multitude of ‘holes’ in the string memory, often referred to as garbage. Since BTB did not offer any mechanisms for garbage collection, it was attempted to develop a satellite in C performing a reallocation process. This attempt proved unsuccessful and the final solution was to create a single string array in which all shared (not the local stack oriented) strings were stored. By periodically writing this array to disk, erasing the array in RAM and restoring it from disk, it was possible to ‘program around the problem’. The performance effects were negligible but the accessibility of the code was reduced. On the other hand, it became easier and less error prone to pass data to subroutines and functions. The more recent Microsoft Visual BASIC (MVB) compiler has an efficient garbage collection mechanism, which made the above strategy superfluous.

The knowledge-based model assembling process requires storage facilities for the numerical models which allow high performance queries. The performance during knowledge base maintenance is also of importance but less than for the model assembling process.
During the development it became apparent that the RELATION-CONSTRAINT-Parameter network (Figure 3.3) could only be used successfully if stored in a database organised as a network. The frames containing the RELATIONs, CONSTRAINTs and parameters form the nodes in the network. The database implementation manages a set of free format binary records containing the model fragments and pointers to each other. The limitation of the 640 KB memory model has forced the use of the disk as virtual memory.

The knowledge bases are stored in single binary files. As soon as a knowledge base is opened it is divided into three data sets, containing respectively the network, the DATA and REFERENCE slots and the pointer list defining the binary addresses in the files containing respectively the network and data slots. This division provides a scratch version of the selected knowledge base on disk which can be modified without affecting the original. The original knowledge base can be replaced only through a File Save command. The binary network file consists of a more or less random sequence of RELATION, parameter and CONSTRAINT frames whose locations are determined by the pointer list. The division in sets of RELATIONs and alphabetically ordered parameters is achieved by pointers between the frames; the sequence in which the frames are presented in e.g. the Parameter list (see Frame 6.2) does not at all reflect the order in which the frames are located in the network file.

The developed binary network database provides acceptable performance for modelling purposes. The above indicates that the complexity of this subsystem is mainly related to guarding the consistency of the network pointers during inserting and deleting of frames in the knowledge base. The development/debugging cycle of the database subsystem has been a long-lasting and demanding activity, not in the least due to the often ambiguous error reports.
It was not always easy or even possible to pinpoint the problem due to the impossibility of simulating the reported error.

Again, string operations play a key role in the database system. The binary database files are considered as large strings from which sub-strings are either inserted or removed. The RAM string space limitations as discussed above have resulted in block-wise shift operations in the files if a frame is either removed or changed in length.

The database has not been the most fascinating aspect of the development but it has paved the way to a Modeller with satisfactory performance and has greatly improved the efficiency of its development. The intensity of data transport between database and Modeller might be huge: about 30,000 database calls were counted during assembling a template of 30 relations. This value included the retrieval of pointers, at that time still from a binary file on disk. Based on this result, it was immediately decided to replace the pointer table file by an array in RAM, since necessarily about half of the calls were queries for data pointers in the knowledge base.

All pointers in the database are 16 bit, allowing a theoretical maximum number of 2^15-1 frames, i.e. 32767 (integer number). The network and data pointers are 32 bits, allowing a theoretical maximum knowledge base file size of 2.0 GB. The theoretical maximum size of one frame data set is 32 KB; however, in view of the string space limitations as discussed above, in practice this maximum size will be about 10 KB. Thus far, this has been amply sufficient. In view of the development of the CLASS functions (section 7.3), the maximum number of different parameters in one expression has been increased from the initial 255 to 32767 (obviously less in practice).

The network database enables the Modeller to immediately read strings containing sets of e.g.
CONSTRAINT addresses connected to a RELATION or vice versa, or to read sets of parameter addresses in particular RELATIONs or CONSTRAINTs or vice versa, all by single function calls. Subsequently, the modelling process performs operations on these address sets such as Union, Complement or Intersection. For example, consider the determination of the candidate RELATIONs for the current sub goal parameter (section 8.2). The steps are the following:

1) determine the RELATIONs in which the current goal parameter is present, so read from the goal parameter frame the set of expression addresses in which the current goal parameter is used
2) remove expressions not being RELATIONs from the set found in 1)
3) remove from the set resulting from 2) the expressions which cannot produce the value of the goal parameter, i.e. evaluate the CONTROL of both the expressions and the goal parameter (see section 7.2)
4) determine the union of the rejected (S1$), forward selected (S2$), FALSE (S3$) and template relations (S4$)
5) determine the complement of the set resulting from 4) with respect to the set resulting from 3)

The set resulting from 5) contains the addresses of the candidate RELATIONs, if any. In QUAESTOR, the steps 1 to 5 are combined into the following statement:

    L$=FUUNKNOWN$(S1$+S2$+S3$+S4$,FUVERWP$(Goal_Par))

in which:
• FUVERWP$(Goal_Par) performs steps 1-3,
• S1$+S2$+S3$+S4$ represents step 4,
• FUUNKNOWN$(A$,B$) determines the complement of A$ with respect to B$, i.e.: B$-A$={x:x∈B$,x∉A$}

The above example is characteristic of the implementation of the model assembling process, which basically consists of operations on sets of frame addresses stored in dynamic strings. Amongst others, these sets represent the template, determinable parameters, forward selected RELATIONs, rejected and authorised RELATIONs and CONSTRAINTs. Use is made of a variety of functions performing operations such as those referred to in the above example.
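The five steps above can be sketched with ordinary set operations. The Python fragment below is only an illustration under stated assumptions, not the QUAESTOR code: the frame addresses, the `frames_using` table and the `is_relation` and `can_produce` predicates are hypothetical stand-ins for the goal parameter frame, the RELATION test and the CONTROL evaluation, and Python sets replace the dynamic strings of addresses.

```python
# Hypothetical sketch of candidate RELATION selection with address sets.
# All addresses, tables and predicates are invented for illustration.

def candidate_relations(goal, frames_using, is_relation, can_produce,
                        rejected, forward_selected, false_rels, template):
    # Step 1: addresses of all expressions in which the goal parameter occurs
    found = set(frames_using[goal])
    # Step 2: keep only the RELATIONs (CONSTRAINTs are dropped)
    found = {a for a in found if is_relation(a)}
    # Step 3: keep only RELATIONs able to produce the value of the goal
    found = {a for a in found if can_produce(a, goal)}
    # Step 4: union of the sets to exclude (the role of S1$+S2$+S3$+S4$)
    excluded = rejected | forward_selected | false_rels | template
    # Step 5: complement of the excluded set with respect to the found set
    return found - excluded


frames_using = {"RF": [101, 102, 103, 104, 205]}
relations = {101, 102, 103, 104}   # address 205 stands for a CONSTRAINT
producers = {101, 102, 104}        # address 103 cannot yield RF

result = candidate_relations(
    "RF", frames_using,
    is_relation=lambda a: a in relations,
    can_produce=lambda a, goal: a in producers,
    rejected={101}, forward_selected=set(), false_rels=set(), template={104},
)
print(sorted(result))  # [102]
```

In this toy case only address 102 survives: 205 is not a RELATION, 103 cannot produce the goal, 101 was rejected earlier and 104 is already part of the template.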
It only became possible to perform reasoning tasks within acceptable response times after the network database subsystem was more or less operational. Slots in the various frames were added and adapted until 1992. Later adaptations were made while maintaining compatibility with older knowledge bases, because professional application of the system started at that time.

9.2. Interpreter

In section 2.4 a motivation is provided for using interpretation instead of compilation for expression evaluation. A drawback of expression evaluation by an interpreter is its slowness compared to compiled code. The challenge was to develop an interpreter with acceptable performance.

For expression evaluation two basic strategies can be followed. The first is the Reverse Polish Notation (RPN). To evaluate 4*5, the steps are: 4 push to stack, 5 push to stack, multiply. In order to use RPN, the expression needs either to be written in that format or the algebraic notation must be converted into RPN prior to evaluation. Using RPN for RELATIONs and CONSTRAINTs in the knowledge base was not considered. It is unwieldy for designers and engineers, who are mainly used to the more common algebraic notation. Therefore, in QUAESTOR the second strategy is applied, which is evaluation from left to right, so evaluating the expressions in algebraic order.

Other strategies for expression evaluation originate from constraint programming languages and are based on the notion that a set of constraints (QUAESTOR: RELATIONs) can be viewed as a network of variables and operators. The results of the operations are propagated through the network. In section 3.4 two of these strategies (local propagation and relaxation) are briefly discussed as constraint satisfaction techniques. By considering an expression clause as such a network, these techniques are in fact also suitable for expression evaluation.
However, due to their slowness, these techniques were not further considered. The developed recursive expression evaluation is more efficient since no search is performed for operators that can be ‘fired’.

For this purpose, the CALCUL interpreter was developed, whose task is to evaluate clauses of numerical expressions (RELATIONs and CONSTRAINTs). Which expression clause is evaluated is controlled by the Solver, which performs the constraint satisfaction process. The Solver invokes the interpreter to compute the elements of the Jacobian required in the Newton Raphson iteration (see section 8.3), either directly, or indirectly through the CONSTRAINT evaluator.

Any expression clause consists of a sequence of tokens, representing either values, operators or functions. Values are either (sets of) numerical values in the expression or parameters. Operators are either arithmetical (+, -, / etc.), relational (=, >, <= etc.) or logical (AND, OR, etc.). Since the interpreter only evaluates expression clauses, both the relational and logical operators are clause delimiters. As soon as one of these operators is encountered, the interpreter returns the result of the evaluation to the Solver or to the CONSTRAINT evaluator. The latter subsequently evaluates the relational and logical operators, if any. In RELATIONs, only the equal sign (=) is allowed as relational operator. The task of the Solver is to find values of the PENDING parameter(s) in the RELATION for which the equality is fulfilled, thus satisfying the constraints.

Functions are either the standard mathematical functions or special functions dedicated to the application domain. Groups of these functions are given a generic structure in order to allow a standard evaluation of their argument lists, so that they only require a dedicated numerical process to be performed on these arguments. This saves code space and facilitates knowledge coding.
Examples are the functions performing operations with tabular data and with CLASSes of parameters (see section 7.3).

Although RPN is not used, it is obvious that in expression evaluation the priority of the operations must be taken into account. One solution is to convert the algebraic expression into RPN before evaluation. The drawback of this approach is that a relatively complex process is required to determine the sequence of the evaluation. This process is a combination of sorting and expression parsing and has to be repeated each time the expression clause is evaluated. The solution adopted in QUAESTOR is recursive evaluation from left to right.

In the knowledge base, expressions are not represented in the algebraic form in which they are presented by the user interface. Any expression in the knowledge base consists of a binary data sequence which is evaluated per byte. Each byte read by the interpreter represents a token, i.e. values, operators or functions, e.g.:

• Addition ‘+’ is ASCII character 34
• SIN is ASCII character 57
• ASCII character 255 indicates that a two byte database address of a parameter follows hereafter
• Numerical data, for instance polynomial coefficients in the DATA slot, is transformed into a binary data set in the expression, preceded by an ASCII character 0 and then a 16 bit length pointer.

The purpose of the above internal expression representation is to allow evaluation without (data) parsing and without additional database access. Expression and data parsing is time consuming and is performed only when the expressions (and/or data) are introduced in the knowledge base. This binary representation is evaluated per byte by the CALCUL interpreter.

An important additional reason for this format is that it allows the use of the computed-GOTO instead of the SELECT CASE construct. In most compilers the latter construct is implemented as IF..THEN..ELSEIF..ELSEIF...
constructs, which is very slow, in particular when a large number of CASEs is involved. The QUAESTOR interpreter has a switch statement which includes about 120 different options or tokens to be dealt with, and each new function simply adds another option to this switch. The computed-GOTO provides a switch statement by which the program counter jumps to a computed address without any IF..THEN type of testing, which makes the number of options irrelevant.

However, the computed-GOTO statement is considered ‘bad practice’ by software engineers. Characteristic is the fact that the computed (or assigned)-GOTO statement is made obsolescent in the FORTRAN90 standard, which implies that it will not be supported by future FORTRAN standards. The reason for this aversion is the poor accessibility and maintainability of complex computed-GOTO constructs, which is also confirmed by the code of CALCUL. In this subroutine the computed-GOTO construct is applied since performance prevailed over code clarity: it yielded an overwhelming 50 to 200 times gain in performance in comparison to the implementation with a SELECT CASE switch.

Since CALCUL basically returns the result of a single operation, it can be used recursively to compute the argument values of functions or the values on which to perform an arithmetical operation. In this process, the operator priority plays an important role. Parentheses can also be used for expression prioritising.

The recursive mechanism is illustrated with the evaluation of the following simple expression clause:

    3+4*SIN(X-0.785)    with 1.57 as workbase value of X

The Solver performs the call

    Level 1: CALCUL(3+4*SIN(X-0.785), Result, nil)

The arguments are respectively the expression clause, the Result value to be returned and the operation providing the priority on which to leave the evaluation (nil at top level). First ‘3’ is encountered by CALCUL, then ‘+’. The following step is to compute the value to be added to 3.
CALCUL is called recursively, with as argument the part of the expression not yet evaluated (actually the evaluation position counter), so:

    Level 2: CALCUL(4*SIN(X-0.785), Arg_Result, ‘+’)

Next, CALCUL encounters ‘4’. The next operator is ‘*’, which has a higher priority than ‘+’, which means that it must be evaluated first. The next step is then:

    Level 3: CALCUL(SIN(X-0.785), Arg_Result, ‘*’)

CALCUL now encounters the SIN function, which has priority over any arithmetical operator, so CALCUL is called recursively again:

    Level 4: CALCUL((X-0.785), Arg_Result, ‘SIN’)

Next, a parenthesis is encountered which again triggers a recursive call:

    Level 5: CALCUL(X-0.785), Arg_Result, ‘(’)

Next, ‘X’ is encountered, to which the workbase value 1.57 is given. Then ‘-’ is encountered, which has priority due to the previously encountered opening parenthesis, resulting in another recursive call to compute the value to subtract:

    Level 6: CALCUL(0.785), Arg_Result, ‘-’)

The expression value ‘0.785’ is encountered, followed by the closing parenthesis. Apart from being the last character of the expression, the closing parenthesis is a trigger to leave CALCUL back to the level of the opening parenthesis.

    Level 6 returns Arg_Result = 0.785
    Level 5 finishes X-0.785 → 1.57-0.785, so returns Arg_Result = 0.785
    Level 4 returns the value of the term between parentheses, i.e. Level 5 yields Arg_Result = 0.785
    Level 3 finishes SIN(0.785) and returns Arg_Result = 0.707
    Level 2 finishes 4*0.707 and returns Arg_Result = 2.828
    Level 1 finishes 3+2.828, returning 5.828 as result to the Solver.

In this simple example already 5 recursive calls of the interpreter are performed. In real world cases this can easily be 20 or 30 levels deep. Thus far this has never given any problems with regard to runtime stack overflow. In the next section an extreme form of recursion is described which has been applied in an experimental function.
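The descent through Levels 1 to 6 can be reproduced with a small sketch of priority-driven recursive evaluation from left to right. This is a hedged illustration in Python, not the CALCUL routine: it works on text tokens instead of the binary byte format, handles only a few operators and functions (no unary minus), and all names (`tokenize`, `operand`, `calcul`, `evaluate`, `PRIORITY`) are assumptions made for this example. The `leave_priority` argument plays the role of the third CALCUL argument, the operation on whose priority the evaluation is left.

```python
import math
import operator

# Hypothetical priority table: higher numbers bind more strongly.
PRIORITY = {"+": 1, "-": 1, "*": 2, "/": 2}
APPLY = {"+": operator.add, "-": operator.sub,
         "*": operator.mul, "/": operator.truediv}
FUNCTIONS = {"SIN": math.sin, "COS": math.cos}


def tokenize(text):
    """Split an algebraic expression into numbers, names and symbols."""
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch.isdigit() or ch == ".":
            j = i
            while j < len(text) and (text[j].isdigit() or text[j] == "."):
                j += 1
            tokens.append(float(text[i:j]))
            i = j
        elif ch.isalpha() or ch == "_":
            j = i
            while j < len(text) and (text[j].isalnum() or text[j] == "_"):
                j += 1
            tokens.append(text[i:j])
            i = j
        else:
            tokens.append(ch)
            i += 1
    return tokens


def operand(tokens, pos, workbase):
    """Read one value: a number, a parameter, a function call or (...)."""
    tok = tokens[pos]
    if isinstance(tok, float):
        return tok, pos + 1
    if tok == "(":
        value, pos = calcul(tokens, pos + 1, workbase, 0)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("missing closing parenthesis")
        return value, pos + 1
    if tok in FUNCTIONS:
        # The argument is the parenthesised term that follows the name.
        arg, pos = operand(tokens, pos + 1, workbase)
        return FUNCTIONS[tok](arg), pos
    return workbase[tok], pos + 1  # parameter value from the workbase


def calcul(tokens, pos, workbase, leave_priority=0):
    """Evaluate from pos; leave on an operator of priority <= leave_priority."""
    value, pos = operand(tokens, pos, workbase)
    while pos < len(tokens) and tokens[pos] in PRIORITY:
        op = tokens[pos]
        if PRIORITY[op] <= leave_priority:
            break  # the caller's operation must be finished first
        right, pos = calcul(tokens, pos + 1, workbase, PRIORITY[op])
        value = APPLY[op](value, right)
    return value, pos


def evaluate(text, workbase):
    tokens = tokenize(text)
    value, pos = calcul(tokens, 0, workbase)
    if pos != len(tokens):
        raise SyntaxError("unexpected token at position %d" % pos)
    return value


print(round(evaluate("3+4*SIN(X-0.785)", {"X": 1.57}), 3))  # 5.827
```

Without intermediate rounding the result is about 5.827; the value 5.828 in the walkthrough follows from rounding SIN(0.785) to 0.707 before multiplying.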
Although expressions are evaluated from left to right, the above recursive interpreter applies an implicit form of Reverse Polish Notation.

The relational and logical operators in CONSTRAINTs are evaluated by the recursive EVALUATE routine. This routine evaluates all expression clauses of CONSTRAINTs by means of the interpreter CALCUL and reduces CONSTRAINTs e.g. into the following primitive form:

    (A=B AND C=>D) OR E<F

A to F represent the PENDING or DETERMINED numerical results of the expression clauses evaluated by CALCUL. Subsequently, the relational operators in the above expression are evaluated, reducing the expression to:

    (B1 AND B2) OR B3

in which B1 to B3 are the PENDING or DETERMINED Boolean values of the expression parts. As a final step the logical operators are evaluated, yielding a PENDING or DETERMINED Boolean value (see Table 7.1).

9.3. Recursion

In the course of the development a number of sub-processes were implemented in a recursive form. The interpreter routine CALCUL, discussed in the previous section, is an excellent example of the advantage of recursion. In the program, both direct and indirect recursion are applied. Direct recursion means that a routine invokes itself and pushes all local data on the stack, e.g. in CALCUL. As soon as the recursive operation is finished the data is popped again. Indirect recursion is involved when subroutine X launches subroutine Y, which launches X again.

In QUAESTOR, an extreme form of indirect recursion has been applied in the experimental COMPUTE() function which is used in SUBCEM [vdNat, 1996]. COMPUTE() is a special function which allows a form of subroutine definition in the knowledge base:

    Delta_RF = COMPUTE(RF,INPUT(C:10)) - COMPUTE(RF,INPUT(C:15))    (Eq. 1)

(Eq. 1) calculates the difference of the hull frictional resistance at sea water temperatures of respectively 10 and 15 degrees centigrade.
The fundamental limitation of the Modeller is that a RELATION can only be used once in a template; it is not possible to use instances of the same RELATION in the same template. In the above example this means that the parameter RF at 10 and at 15 degrees centigrade should be given separate parameter names, which also requires (in fact redundant) RELATIONs to compute these parameters. The COMPUTE function, however, allows the interpreter to invoke the complete Modeller with RF as top goal, first for 10 degrees and subsequently for 15 degrees. The Modeller builds and solves a template and returns the computed value of RF as result to the COMPUTE function, implicitly creating instances of various parameters in the knowledge base. This involves extreme indirect recursion: the Modeller calls the interpreter, the interpreter calls the Modeller and after the modelling process the Modeller calls the interpreter again, etc., since the COMPUTE function can also be used recursively.

The COMPUTE function did work properly but was abandoned, firstly since no intermediate results were maintained or open to user scrutiny. Secondly, the function was extremely slow, since a complete template must be assembled each time the COMPUTE function is invoked. A better solution has been realised by the introduction of the recursive TELITAB format in QUAESTOR (section 5.3). Anyway, the COMPUTE() function clearly demonstrated that the limits of recursion in the applied compiler are far beyond the level at which it is actually used in the code.

The advantage of recursion is the compactness and elegance with which algorithms can be coded. There are many occasions in which recursion either is involved or can be used. A drawback is that, although the elementary function of a recursive subroutine can be simple, the overall task performed may be hard to comprehend.
In addition, debugging can be extremely difficult and time consuming, due to the often complex data management, deciding on which data to share and which to make local, etc. In QUAESTOR, the following processes are implemented recursively:

• expression interpreter
• constraint evaluator
• generation of inference tree (see e.g. Table 6.2)
• determination of strong components in a template
• linear, spline and double quadratic matrix interpolation
• TELITAB database access
• text screen manager

9.4. Performance Issues

Most of the components of the system have passed several revisions with the primary aim to increase performance and to reduce code size. The latter was in particular imposed by the initially used compiler (Turbo Basic), which could not use overlays. Given the fact that every 18 months computer performance increases by a factor 2, developers nowadays are careless about speed. QUAESTOR does not take full advantage of higher processor speeds, since the data transport between the Modeller and respectively the knowledge base and the workbase is becoming the major bottleneck. Benchmarks show that most of the computer time is consumed between the (iterative) calculations. Both the workbase and the knowledge base are disk oriented. To further improve the performance of the system, some form of data caching will be necessary.

Computational speed was important during the development of the system because of dialogue response times. The aim is a system which is fast enough to follow an experienced user, i.e. a user should not need to wait between giving an answer and receiving the next question in the dialogue. In the course of the development, hardware, operating system and compiler have been upgraded several times. In view of the above mentioned massive amount of data transport, faster hardware yields less performance improvement than expected on the basis of the processor index only.
Migrating from Turbo Basic to MVB yielded a speed loss of about 40 per cent: a job which was previously done in 50 seconds now took 70. The cause of the loss of performance with MVB relative to BTB is the use of overlays, necessitated by the size of the code. Another observation is that operating system updates tend to reduce performance rather than improve it. A DOS 5 runtime of 50 seconds became 100 seconds under DOS 6 for the same job. Running the program in a DOS box under Windows 3.1 yielded another performance reduction of about 30 per cent. Obviously, in terms of computer performance, the outdated DOS memory model is causing increasing overhead. The computer is performing more and more additional and in fact irrelevant tasks to support an outdated memory model, leaving less performance for the actual process.

Apart from the response time issue, the performance of the system determines the practical limits of model size and complexity. Currently, the market is abandoning the 15 year old segmented (640 KB) memory model and is moving towards a linear 32 bits memory model (Windows 95/NT, OS/2). Migrating QUAESTOR to this platform will take considerable effort, mainly since the current character oriented interface needs to be replaced by a graphical equivalent. However, most of the kernel code of the database, Solver and Modeller can be ported without much effort. After porting, the code of the database system should be revisited in order to take full advantage of the new platform. A considerable performance improvement is expected as a result, since much of the present database code dealing with the limits of the segmented memory model becomes obsolete.

10. QUAESTOR APPLICATIONS

Factual evidence can never ‘prove’ a hypothesis; it can only fail to disprove it, which is what we generally mean when we say, somewhat inexactly, that the hypothesis is ‘confirmed’ by experience.
Milton Friedman, Essays in Positive Economics

At MARIN, QUAESTOR is applied as a consultancy and rapid prototyping tool for numerical models, e.g. [vHees, 1992] and [vManen, 1996]. Throughout this thesis, a number of highly simplified examples from ship design and hydrodynamics are used to elucidate the principles of numerical model assembling. This chapter presents two applications of the program. The first one is the SUBCEM design model for underwater vehicles developed by NEVESBU/RDM and the Delft University of Technology (DUT). Because the SUBCEM project was started in a relatively early development stage of QUAESTOR, this application had a considerable impact on its functionality. The Royal Netherlands Navy successfully developed and used a number of design models in QUAESTOR for e.g. SWATHs, frigates, submarines and for the TROIKA mine sweeper system, which is the second example to be discussed in this chapter. The impact of QUAESTOR on the design (modelling) practice is illustrated with the TROIKA design case.

10.1. SUBCEM

The aim of the SUBCEM project (see note 2) was to create a flexible and adaptable Concept Exploration Model (CEM) for underwater vehicles. SUBCEM should produce balanced vehicle solutions in the conceptual phase of a design [vdNat, 1993]. The model comprises a numerical description of the vehicle, a large number of design rules and calculation methods and a number of satellite applications, controlled by QUAESTOR. In the course of this project, important advantages and limits of knowledge-based concept exploration with QUAESTOR were observed and special attention was paid to the way submarine designers can use such systems.

2) This research is carried out within the scope of the SUBCEM project, a joint effort of NEVESBU, RDM, MARIN and the Delft University of Technology. This project is sponsored by the Netherlands Foundation for the Co-ordination of Maritime Research (CMO).
Important aspects in the design of underwater vehicles are the balance between weight and displacement and the balance between volume within the vehicle and space required to accommodate all the necessary components. Strict space and shape requirements are typical for the design of underwater vehicles. Therefore, a major feature of this CEM is its ability to support the modelling of the spatial arrangement by visualising user-defined layouts. An algorithm has been developed that controls the assignment of space to objects without a need for detailed arrangement drawings. This algorithm calculates the space efficiency of an achieved arrangement, used as an indicator for the feasibility of the created concept. An in-depth description of SUBCEM and of the experience with QUAESTOR as modelling tool is provided in [vdNat, 1996].

One of the apparent limitations of QUAESTOR for the SUBCEM project is its inability to handle spatial relationships. Important in the concept design phase is the combined manipulation of numerical and spatial relationships. Within the SUBCEM project this limitation has been overcome by developing satellite applications for spatial purposes. This ensures that design modifications are reflected throughout the complete evaluation without detailed control by the user.

Another important restriction which was exposed in this project is related to dealing with multiple objects governed by similar relations, e.g. various groups of similar components in energy systems. To overcome this restriction, it was attempted to develop the Modeller towards using object-oriented techniques in combination with the existing network structure of the knowledge base [vdNat, 1994]. The COMPUTE function discussed in section 9.3 has also been an attempt to resolve this problem. In an object-oriented knowledge base the parameters are structured in meaningful hierarchies of objects.
Heuristic rules implemented in the Modeller should give higher priority to RELATIONs that contain only parameters belonging to the same object than to RELATIONs containing parameters from other objects. Parts of the knowledge base may even be made inaccessible. In theory, the hierarchy of parameters and RELATIONs is an additional source of knowledge to the Modeller and to the user.

Serious attempts were made to implement such techniques in QUAESTOR. Although the hierarchic knowledge base was successfully implemented in the database management system, it proved to be impossible to adapt the Modeller in such a way that it was able to use it properly. Using the hierarchic qualities of the knowledge base resulted in insoluble conflicts between required and existing inferences. On the other hand, it became clear that a hierarchic knowledge base is much more difficult to develop and maintain. In such a knowledge base, the model fragments can no longer be treated as fully independent objects with unique properties, which they are in the non-hierarchic knowledge base. The latter characteristic of QUAESTOR is an important premise for its flexibility and the low threshold for its use.

In conclusion, the concept of a hierarchic knowledge base was abandoned. This implied that the problem of dealing with multiple sets of similar objects remained unresolved until the development and implementation of the recursive TELITAB data format. By this format, the hierarchic aspect was in fact introduced in the underlying data structure of QUAESTOR (section 5.3). The ability of QUAESTOR to deal with this data representation format proved to be a solution to this problem. The recursive TELITAB format is a logical extension of the existing basic format; the database management system only requires a limited number of adaptations and the same holds for the Modeller.
This solution combines the required expressive power to deal with multiple sets of similar but different objects with the simplicity of the existing structures, not least due to its coherence with the basic concept of the system.

10.2. The TROIKA Mine Sweeper

QUAESTOR was introduced at the Royal Netherlands Navy (RNlN) in early 1993. The development and design of a new class of unmanned mine sweepers proved to be an excellent test case. The design of this craft is relatively simple and straightforward in comparison with the design of a surface warship or submarine. Since at that moment no conceptual design tools were available for this type of vessel, it was decided to develop a model ‘from scratch’ and to implement it as a QUAESTOR knowledge base. This section deals with some design aspects of unmanned mine sweepers and presents aspects of using a KBS as modelling tool.

The traditional method of mine sweeping is performed by towing a sweeping gear. The RNlN has chosen a solution in which the actual sweeping is performed by remote controlled drones, called the TROIKA mine sweeper concept (see note 3). These drones are guided and controlled from a ‘mother’ platform situated away from the high risk area. This solution places personnel outside the danger area and reduces the number of personnel involved when compared to more traditional methods of mine sweeping.

The TROIKA concept was new to the Netherlands Navy and no relevant design experience was available. A ‘design by analysis’ method was adopted in order to overcome this lack of experience. The developed analysis model is used to minimise the cost of the design.

The basic structure of the TROIKA drone consists of a thick-walled steel cylinder around which copper wire coils are wound, all enclosed within a streamlined body. Aided by the steel core, these coils can generate a magnetic signature comparable to that of a ship.
The cylinder houses the main components (see Figure 10.1). A deckhouse is placed on top of the cylinder which is only used during manoeuvring and crossings.

3) This section is based on [Wolff, 1994], courtesy P.A. Wolff, Royal Netherlands Navy

Figure 10.1: General arrangement of the TROIKA drone (numbered components: aft-body, coils, air intake, fore-body, exhaust, rudder propeller, hydraulic pump, diesel engine, electric generator, electric cabinet, steel cylinder)

The initial focus of the design problem is the performance of the magnetic gear, which is greatly affected by the dimensions of the steel cylinder. The available analytical methods for calculating the magnetic performance are limited to simple dipole moments without the influence of steel. Initially, the influence of steel was expressed by a gain factor, useful in the early design phase where only relative performances between available options are important.

The TROIKA design model is structured in such a way that the main operational requirements are direct input for the model:

• Magnetic signature (magnetic moment)
• Payload (weight, volume and power consumption)
• Speed (sweeping mode)
• Endurance (hours between refuelling)

Cost is considered as the top goal of the model. There are two categories of RELATIONs included in the model. The operational requirements are present in RELATIONs of the first category, in which the specific design knowledge of the TROIKA is embedded, i.e. L/B ratios for fore and aft body, optimum cylinder diameter and cylinder wall thickness. These RELATIONs are used to generate the main dimensions, scantlings, weight and centre of gravity, powering characteristics, etc. This parametric concept description is needed by the RELATIONs or aspect models of the second category which calculate e.g. cost, stability and trim.
These values are used in the concept validation phase as described in section 6.6 (The Concept Variation Model).

Steel located inside the coils can increase the magnetic output by as much as 15 times. By using more steel inside the costly coils their size can be reduced. This will of course lead to additional displacement, thus increasing the required propulsive power. The model shows a clear trade-off between the cost of the coils and the cost of the platform. The cost is expressed in a non-dimensional Cost Index (CI) which includes the platform and payload costs and not the almost platform independent cost of the Command & Control systems.

Figure 10.2: Wall thickness t versus Cost Index CI for different cylinder radii

The dotted lines in Figure 10.2 represent the results of the Finite Element calculations, the solid lines represent the dipole calculations. For a given cylinder radius R the Cost Index CI decreases with increasing wall thickness t. Although the effects of increasing the wall thickness on the construction cost are negligible, the effects of the increased displacement on the size and cost of the propulsion system are significant, as illustrated with the example in section 6.5. Secondly, costs are lower for smaller cylinder radii. The results of the dipole calculations indicate that the optimum cylinder is probably a long massive cylinder, which is of course not feasible. Therefore, the diameter is determined by installation requirements of the equipment to be located in the cylinder.

The model indicated that the dimensions of the coil are a major platform cost driver. These dimensions are based on the magnetic performance predicted with the simple dipole method. In order to reduce the degree of uncertainty in the performance and thus in the main dimensions, it was decided to explore the magnetic performance of various coil/cylinder configurations by means of a Finite Element Method (FEM) package.
The results of these calculations were included as a database in the knowledge base. This database is used as an improved prediction model of the magnetic performance and was applied to refine the drone and coil dimensions. General areas of interest were: influence of wall thickness, diameter/length ratio of the cylinder and the influence of the fore and aft body shape. This approach is a typical example of replacing a complex and demanding calculation method (the FEM analysis package) by a database of systematic output results (section 5.2).

The results of the FEM calculations indicated that an increased gain factor and a smaller coil configuration are obtained only up to a particular value of the cylinder wall thickness. Any further increase of the wall thickness has adverse effects: the magnetic performance does not improve any further and the only result is additional weight. On the basis of these findings, both the wall thickness and cylinder radius were fixed and included in the concept description.

Initially, constraints were imposed on the main dimensions. These were a maximum value of ship length (L_MAX) and draft (T_MAX). In QUAESTOR, the same model can be used to solve problems with varying sets of dependent and independent parameters. The radius (R) was calculated as a function of wall thickness (t) with the length fixed at L_MAX, and the same was done with the draft fixed at T_MAX. The grey area in Figure 10.2 is the envelope of possible design solutions within the given constraints. On the basis of this small envelope it was concluded that these constraints were imposing too narrow limits on the design space. This was considered a risk for the overall feasibility of the design, so it was decided to drop them. The model was in this case used as a means to validate the staff requirements (section 4.1). It is interesting to note that model validity is indeed a major form of design knowledge.
The constraints imposed by the installation of equipment finally determined the cylinder radius, whereas the wall thickness was more or less determined by the draft and length limits. The simple dipole method was very accurate in the range in which the final dimensions were selected. The FEM calculations were performed mainly because the accuracy of the simple method was unknown and because in this case the coil dimensions were important cost drivers.

This study clearly demonstrates that it is impossible to predict which questions have to be answered in later design phases and how important it is to use simplified or less simplified prediction models. This means that any design model used during the early design phases has to be highly flexible if one desires to continue using, adapting and extending it in later phases.

The above design case is relatively simple and is manageable by a single person. Larger or more complicated designs need a more structured approach, in particular if more parties are involved in their development. After this first application a number of QUAESTOR-based design studies and design model developments were undertaken by the RNlN, e.g. [Zuiddam, 1994] and [vCoevorden, 1995]. As a spin-off from QUAESTOR’s use, the generic Concept Variation Model or CVM has been developed, which is discussed in sections 6.6 and 6.7.

11. DISCUSSION AND CONCLUSIONS

Nothing new is quite perfect.
Marcus Tullius Cicero, Brutus

In this chapter the developed technique is evaluated and ongoing and future work is discussed. First, the functional domain is introduced as a logical extension of the numerical parametric domain. For conceptual design applications, it is proposed to bridge the gap between the functional and numerical aspects of design by inferring the numerical design requirements, i.e. top goal parameters, from the envisaged tasks and functions of the artefact.
Subsequently, the reverse of problem-driven model assembling is introduced as input-driven model assembling. The latter strategy has important merits from the perspective of the user. The efficiency of the Solver may be further improved by adapting the sequence in which computation is performed. Possible improvement of the user interface and some implementation aspects are briefly discussed. This chapter is concluded by reflecting on the work and what has been achieved thus far.

11.1. Focus and Perspective

The numerical model assembling technique as described in this thesis is used in different real world applications, in both analysis and design. Once over the threshold, the technique is appealing because of its ability to support and facilitate the reasoning process in design and analysis. Being able to freely associate numerical model fragments, practically any problem statement in the domain of the knowledge base can be dealt with. Users are often guided towards favourable solutions that were not obvious at the beginning. Numerous pilot applications have made the tool sufficiently robust to be applied professionally. However, having gained practical experience with the new technique it is possible to establish the focus of further research and development. Based on this experience, aspects are identified which can be further improved or may enlarge its scope of application.

The first aspect is related to the various forms of design knowledge. Thus far, the assembling process of design models applies properties (CONTROL) of numerical knowledge, viz. RELATIONs, CONSTRAINTs and parameters. The process departs from a set of user defined top goal parameters, stating the design problem. As indicated before, design problems are never simply stated and solved without a halt. The problem statement has to be ‘designed’ as well and may change in due course of time on the basis of new insights gained (section 4.1).
It is concluded that the technique presented in this thesis fully lives up to its title: ‘Expert Governed Model Assembling’. The system requires the user to be familiar with both the contents of his knowledge base and the applied model assembling principles. In general, some relevant domain knowledge is needed to successfully use computer applications in science and engineering, but (too) high demands discourage users. Apart from the merely commercial aspect, the reduction of the competence required for using software systems remains a major AI research focus. The aim is to maintain user control over the model assembling process with the fewest possible directives.

The problem statement of either design or analysis problems is one of the most demanding aspects of using QUAESTOR. In the sequel two techniques for inferring top goal parameters are introduced, one suitable for complex design applications like naval ships and another more suitable for analysis tasks in which fewer parameters and model fragments are involved.

The application of QUAESTOR in conceptual naval ship design is thus far confined to the management of design data, numerical model fragments and aspect models and the assembling of models referred to as CVM (see section 6.6). The CVM is used in the conceptual design for numerical problem solving. Functional knowledge of the concept in the form of capabilities and operational requirements is considered as input for the CVM. These capabilities and operational requirements are the result of a functional analysis of scenarios, tasks and sub-tasks, an activity entirely separate from the conceptual design as supported by the CVM. Due to a changing world or insights evolving during the design process, the operational capabilities resulting from the concept design may be quite different from the ones initially intended.
This implies that the ship may be suited to additional tasks so that it can finally be deployed in additional scenarios.

The foregoing arguments support the idea that conceptual design will benefit from a concurrent approach towards functional analysis and conceptual design. By merging the two levels and the related forms of knowledge, an improved exchange of cause and effect between the conceptual design and its intended purpose can be achieved. In Appendix B an interface is proposed between the functional and the existing numerical reasoning capabilities. Only a limited development effort is envisaged in order to deal with the added functional domain. Extensions are mainly needed in the parser, and two new parameter CONTROL attributes need to be introduced and supported. In addition to the existing VALUE, STRING and OBJECT attributes (see section 7.2), the new TASK and ELEM attributes need to be introduced, indicating Tasks and Elements respectively. The Modeller requires only a few additional inferences to deal with these new ‘parameters’ and the functional RELATIONs in which they are applied.

A second user competence aspect touches the basic concept of QUAESTOR, which is problem-driven modelling. This implies that the user needs to state his problem, i.e. needs to select the top goal parameter(s) to be chained to RELATIONs. In most analysis applications the top goal selection is straightforward. In other cases this selection is less trivial and requires deep knowledge of the relationships between the parameters in the domain of interest. It is noted that users with domain experience possess this knowledge and are able to forecast the computable parameters given what they currently know. In a way, they perform in thought a forward reasoning operation: knowing x, will it be possible to derive y? For those less familiar with the domain this may be very difficult or even impossible.
They may select top goal parameters which cannot be derived from what they know, as becomes apparent in the course of the dialogue. Since the system cannot tell which inputs were missing, the (inexperienced) user may need several attempts to make any progress towards a solution and, by doing so, to become acquainted with the domain. Missing input implies that the sum of the facts or values provided by the user and the facts and rules (RELATIONs and CONSTRAINTs) in the knowledge base does not allow the assembling of a template in which all PENDING parameters can be chained to a RELATION. Since the role of the parameters in terms of being either dependent or independent is not fixed in the assembling process, in fact any of the parameters in the knowledge base may be missing in such cases. Although forward reasoning is applied in QUAESTOR, its primary purpose is to avoid superfluous questions (section 8.2). It is not used to infer computable goal parameters on the basis of a set of input values.

The reverse of problem-driven model assembling is input-driven model assembling. This implies the ability to provide a set of parameter values (input) and to find all computable goal parameters, whether or not expert governed. The latter mainly depends on the settings of the control attributes in the knowledge base. From the perspective of the user, such a strategy may make life easier since he does not need to bother about top goal parameters but only needs to ‘tell’ the system what he knows. The system will then compute any value which can be derived from this input, using the model fragments in the knowledge base.

If input-driven modelling is feasible as an efficient strategy, it also becomes possible to develop applications in which QUAESTOR is used as an ‘embedded engine’ behind a conventional, low-end interface. In this low-end interface, the user can select predetermined sets of independent parameters.
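The idea of input-driven model assembling, i.e. deriving the computable goal parameters from a set of inputs, can be sketched as a simple forward closure. This is only an illustration under strong assumptions: each RELATION is reduced to the set of parameters it connects, CONSTRAINTs and CONTROL attributes are ignored, and a RELATION is taken to be usable as soon as exactly one of its parameters is still unknown. The relation names `R1` to `R4` and the parameter names are hypothetical and not taken from any QUAESTOR knowledge base.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

# Hypothetical knowledge base: each RELATION is reduced to its parameter set.
RELATIONS: Dict[str, FrozenSet[str]] = {
    "R1": frozenset({"DISP", "L", "B", "T", "CB"}),
    "R2": frozenset({"RT", "DISP", "V"}),
    "R3": frozenset({"PE", "RT", "V"}),
    "R4": frozenset({"PD", "PE", "ETA_D"}),
}


def computable_parameters(
    known: Iterable[str],
    relations: Dict[str, FrozenSet[str]],
) -> Tuple[Set[str], List[str]]:
    """Return the parameters derivable from `known` and the RELATIONs used.

    A RELATION is selected when exactly one of its parameters is unknown,
    so the dependent/independent roles are not fixed in advance.
    Systems of simultaneous equations are NOT detected by this sketch.
    """
    given = set(known)
    determined = set(given)
    fired: List[str] = []
    progress = True
    while progress:
        progress = False
        for name, params in relations.items():
            if name in fired:
                continue
            missing = params - determined
            if len(missing) == 1:
                determined |= missing
                fired.append(name)
                progress = True
    return determined - given, fired


goals, used = computable_parameters({"L", "B", "T", "CB", "V"}, RELATIONS)
reverse_goals, reverse_used = computable_parameters({"PD", "ETA_D", "V"}, RELATIONS)
print(sorted(goals), used)
print(sorted(reverse_goals), reverse_used)
```

The second call starts from the opposite end of the same relations, which shows why any parameter in the knowledge base may act as input. A set of relations that can only be solved simultaneously would be missed here, which corresponds to the difficulty with cycles noted in the text.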
The embedded engine receives these values over the data link with the low-end interface, infers all feasible goal parameters, assembles and executes a model and returns all computed values to the low-end interface. Experiments with a knowledge base on ship propellers indicate the feasibility of this approach.

To realise a robust and efficient form of input-driven model assembling, the Modeller requires further attention. Although cycles or systems of (>1) equations are recognised in assembled templates, it is a prerequisite, and a major problem in this respect, to recognise them in the knowledge base without actually performing any calculation. Within a set of hundreds of RELATIONs the number of possible combinations becomes too large. A promising strategy is to identify parameters that are probably computable and to check that by employing the conventional backward reasoning model assembling strategy.

An option related to input-driven model assembling is to validate sets of parameter values (e.g. concept descriptions of competing designs) within a knowledge base. By using a variant of the input-driven strategy it becomes possible to establish how well these values fit in the available model fragments. Suitable ways need to be found to give the user insight into how well the provided data sets agree with the knowledge base.

The recursive TELITAB format is intended to overcome the limits of the single List/Table structure of the basic format. From the very beginning, the basic format has been the foundation of QUAESTOR. In order to exploit the hierarchic nature of the recursive format, it is embedded into the system step by step. The first step is the implementation of a recursive workbase, comprising both the internal data management and the user interface (Frame 6.4). The second step is adapting the Modeller towards using it properly.
This implies the introduction of value inheritance between objects in the hierarchy (see section 5.1), which is mainly implemented on the level of the network database.

In a multi-case calculation the Solver executes the template case by case. As stated in section 9.4, most computer time is consumed in workbase management and in the search for strong components between the successive cases. The actual expression evaluations consume less than half of the resources. By executing the template per equation for all cases, instead of per case for all equations, the overall performance is expected to improve considerably. This improvement will be mainly due to the fact that the search for strong components is reduced to one case only. In terms of the TABLE in the TELITAB data set representing the workbase, the calculations are performed column-wise instead of row-wise. A complicating factor is dealing with templates which need to be adapted due to CONSTRAINTs becoming FALSE. In such events this strategy may become less efficient than the current one.

Although experienced users consider the interface sufficiently user friendly, it does not make that impression on novice users. One of the reasons is that most interaction takes place in the Network Manager (Frame 6.1), which also serves as browser and is used for knowledge base maintenance. As imposed by the initial compiler, the interface was optimised for minimum code size rather than for user friendliness. Code size constraints implied a combined use of features for several different purposes, such as the Network Manager, to which user friendliness was more or less subordinate. Another reason why novice users do not feel at home immediately lies in choices made at an early stage of the development. First, the program is not windows oriented but ‘room’ oriented, and the various rooms or screens are entered and left by using the cursor keys.
Although this is a very effective interface concept for knowledge base browsing, it is rarely applied in commercial software. Another interface aspect is related to the command menu. In 1987 I decided to apply a Lotus 1-2-3™ form of interfacing to facilitate cursor and menu control but made the mistake of locating the menu at the bottom of the screen. In most contemporary software, the main menu resides at the top of the screen since pull-down menus have become a standard feature. Adapting the interface in this respect requires a considerable effort and is necessitated anyway by the intended migration to a 32-bit graphical environment (section 9.4).

Furthermore, the presentation during the dialogue of parameters in a proposed RELATION or CONSTRAINT may be improved. It seems better to present them in a list similar to Frame 6.2 instead of one by one in the parameter viewport of the Network Manager (Frame 6.1), which causes users to miss parameters during a dialogue. Further study is required into a reduction of the information presented to the user during a dialogue.

11.2. Conclusions

The principal insights obtained during the development and application of the described model assembling technique are listed below:

• The development of numerical models supporting analysis and conceptual design can be reduced to the process of acquisition and maintenance of model fragments and their properties. The model assembling, traditionally a programming activity, can be automated, leaving sufficient ability to the user to govern the assembling process. Not being bothered by algorithmic issues, one can entirely focus on the quality and validity of the model fragments, which is the actual kernel of any modelling activity.
• Although initially intended for assembling and executing smaller numerical analysis models, QUAESTOR proved to be well suited to conceptual design.
The assembling and execution of large models (>>100 RELATIONs including several satellite applications) became possible by a successive removal of technical limitations as described in this thesis.
• This tool requires no paradigm shift in conceptual design and design modelling. The designer is relieved of tasks he was already performing by automating several of his own familiar inferences. The efficiency of this automated reasoning and model assembling process puts the designer literally in control of his numerical models.
• The absence of a paradigm shift makes a seamless transition possible from a ‘conventional’ to a knowledge-based approach towards conceptual design. In spite of this, the introduction of QUAESTOR in naval ship design confirms the general statement that knowledge-based systems (KBS) have more impact on organisations than conventional information systems. Users and suppliers of design knowledge are in particular affected by the modular way of working and thinking which comes with its use.
• KBS are rarely applied in technical environments. The experience obtained with QUAESTOR reveals that the introduction of KBS in such environments is thwarted by the procedural education of scientists and engineers and by the fact that highly advanced tools are available to support imperative design modelling and programming (section 3.4). It appears, however, that KBS can compete with these imperative tools in conceptual design applications.

APPENDIX A
Glossary of Terms

In this glossary the terms most frequently used in this work are summarised in alphabetical order. In case the reader is unfamiliar with knowledge-based systems or naval architecture in general, this glossary can be used as a memory aid. In the body of the thesis, the reader is assumed to be familiar with these terms after their introduction.
Aspect model
A numerical model (in general a computer program) which can predict aspects (cost, performance, stability, motions, strength, etc.) of a ship concept on the basis (of elements) of the concept description (main dimensions, hull form, structure, general arrangement, etc.). The aspects generally concern properties of the concept, not elements of the concept description.

Backward reasoning
The backward reasoning strategy searches for each unchained parameter a suitable RELATION which can compute its value. Backward reasoning is the core strategy of the Modeller because the system must propose parameters to the user (ask questions). The RELATIONs in which these parameters are used are collected from the knowledge base and the ‘most suitable’ candidate is selected and is chained to the parameter, i.e. used to produce the value of the parameter.

Browser
A facility in the database management system that allows searching for data in a knowledge base in many different ways and presenting these data to the user.

Calculation
In this work the terms ‘dialogue’ and ‘calculation’ frequently appear. Both terms refer to a session of the KBS in which a computational problem is solved.

Case
In QUAESTOR it is possible to vary any arbitrary (input) parameter present in the assembled model (template). A Case is one solution vector of the template. The term Case is also used in this work within the context of case-based reasoning. This is the form of reasoning in which similar solutions (i.e. design solutions) are used to rapidly focus on aspects in the requirements for the current case which were experienced as important in previous, similar designs. The relevant cases are retrieved from the ‘case-base’ and modified towards the new requirements.

Chain/Chaining/Chained
This term is applied to indicate the action of connecting a (sub) goal parameter to the RELATION in the knowledge base that will produce its value.

Concept description
A design concept is represented by a concept description which consists of dimensions, geometry and components. In the early phases of design, the concept description is in general not fully available in a numerical form: the concept is also described by a number of sketches and drawings. QUAESTOR can be used to create and adapt a parametric concept description consisting of a set of Parameter-Value Combinations (PVC’s).

Concept Exploration Model (CEM)
A CEM is a parametric design model which is used to systematically search the design space for the ‘best’ starting point for the more detailed design or to investigate the dependencies between design parameters by exploring a specific area in the design space. A CEM generates a large number of concept descriptions with their properties, fitting within selected ranges of ship main particulars. The concept descriptions are judged by a post processor or ‘filter’.

Concept Variation Model (CVM)
The CVM is partly a design synthesis model and partly an analysis model, configured around parametric concept descriptions representing a (limited) number of advanced point designs, and supports the case-based reasoning strategy common to naval architects. The CVM is a generic model since the models are assembled from model fragments in a knowledge base in co-operation with the designer on the basis of the problem definition at hand. The assembled models can e.g. predict effects of modifications of the selected basic concept, i.e. point design, on its properties (e.g. powering and motion performance). The designer is free in the selection of the relations affecting the concept and of the premises and properties to be studied, which provides full freedom in the approach towards the design problem.

CONSTRAINT
In QUAESTOR, a CONSTRAINT is a mathematical expression consisting of:
f(u1, u2, ..., un) {=, <, etc.} g(v1, v2, ..., vn)
or of sets in this form separated by logical operators (AND, OR, etc.)
and nested by parentheses, if any. The CONSTRAINT expresses the validity of the RELATION to which it is referring and must be either TRUE or PENDING before its admission into the template. The term constraint in normal print refers to a concept local to that paragraph.

CONTROL
A slot in a QUAESTOR knowledge base. QUAESTOR needs typical ‘human’ know-how to perform its model assembling task. It can efficiently store and use numerical knowledge but it has no knowledge about its purpose other than that captured in the CONTROL slots. This reflective knowledge expresses expectations with regard to the practical application of the model fragments and is a low level link between the procedural knowledge in the model fragments and the user in the outside world.

DATA
A slot in a QUAESTOR knowledge base. The DATA slot may contain tables or coefficients which are required by one of the available special functions. The content of the DATA slot is in simple ASCII format and is assumed to be of limited size. In case large databases need to be accessed, a reference to an external file is preferred.

DETERMINED
For parameter values DETERMINED means known and fixed. For CONSTRAINTs it means that sufficient data are available to evaluate a TRUE or FALSE, leading either to rejecting or accepting a particular RELATION in the template. The Modeller takes advantage of the DETERMINED (or PENDING) attribute of parameters and CONSTRAINTs in the reasoning process.

Determinable parameter
A parameter in the knowledge base, not necessarily a goal parameter at present, which can be computed by a Forward Selected RELATION, given the current status of the model and of the data provided.

Dialogue
The communication between system and user, controlled by the Modeller, is referred to as dialogue. The system makes suggestions or requires user decisions. In this way the user can affect the direction into which the template or solution evolves. See also Calculation.

Editor
One of the two main components of QUAESTOR is the knowledge Editor, containing all facilities dealing with storage, maintenance and retrieval of design knowledge.

Expression
Any combination of parameters, numbers, functions, arithmetical, relational and logical operators is called an expression. Expression is the collective noun for RELATIONs and CONSTRAINTs.

Forward reasoning
The Modeller reasons backward from (sub) goal parameters to RELATIONs, i.e. RELATIONs are selected which can be used to compute the values of these goals. If a value becomes available, either by calculation or by input, the Modeller performs a forward reasoning search in order to find RELATIONs that can be ‘fired’, i.e. what can now be calculated given the current set of values? The forward reasoning strategy of QUAESTOR is rather a forward selection strategy because the actual chaining action is always performed through backward reasoning (see section 8.2).

Forward selected RELATION
A RELATION in the knowledge base which is not yet included in the template being assembled but which is earmarked to be chained to (i.e. to determine the value of) a particular parameter, not necessarily a goal parameter at present.

Frame
A package of data in a knowledge base. QUAESTOR distinguishes three main types: RELATIONs, CONSTRAINTs and parameters. The frame contains the data associated with the type of frame in dedicated slots, i.e. CONTROL, REFERENCE, DATA, VALUE and DIMENSION.

Goal parameter
The PENDING template parameter selected by the Modeller to be chained to a RELATION, i.e. to find a suitable RELATION in order to compute its value.

Inference tree
The network of relations between parameters and RELATIONs in a template, either represented as a semantic network or in nested text format. The inference tree exactly defines the input parameters, which parameter is computed by which RELATION and by which RELATION a parameter is introduced into the template, i.e.
is made sub goal parameter.

Interpreter
The part of the program that actually performs the calculations. The interpreter is fed with RELATIONs and parameter values and returns a result. Although a relatively slow process in itself, the interpreter makes it possible to perform computation without any intermediate code generation or compilation.

Knowledge base
A set of frames in a network format. A QUAESTOR knowledge base consists of a number of model fragments (RELATIONs), their validity (CONSTRAINTs) and the parameters used in these expressions. These model fragments can be assembled into numerical models or templates which can be executed to solve design problems (derive values).

Modeller
The part of QUAESTOR that controls the communication between the user and the system and performs the reasoning steps and knowledge base queries involved in the model assembling process.

Network database
A type of database management system, capable of dealing with data packages or frames of various types, which maintains and uses a set of pointers between them. This pointer structure allows high performance queries, e.g. by the Modeller for determining the CONSTRAINTs in which a particular parameter is used.

Parser
A facility which separates expressions into a sequence of numbers, parameters, functions and operators and then performs a syntax check prior to allowing the expression into the knowledge base. The parser reports any syntax and type errors and protects the knowledge base against being corrupted.

PENDING
For parameter values PENDING means unknown and not (yet) fixed. For CONSTRAINTs it means that insufficient data are available to evaluate either a DETERMINED TRUE or FALSE, implying that the decision to either reject or accept a particular RELATION in the template may need to be postponed until sufficient data are gathered.
The Modeller takes advantage of the PENDING (or DETERMINED) attribute of parameters and CONSTRAINTs in the reasoning process.

Production rule
Production rules consist of an antecedent part, which includes the conditions to be fulfilled prior to execution, and a consequent part stating the actions to be performed:
IF condition(s) THEN action(s)
The result of the action(s) is called the Conclusion. If condition(s) are viewed as validity, then formulae, equations and constraints can be considered to be Numerical Production Rules of which the value of one of the parameters is the conclusion. This concept is applied in QUAESTOR.

REFERENCE
A slot in a QUAESTOR knowledge base. In the REFERENCE slot background information can be included which may be important for a user to know or have access to during a dialogue. At decision points during the dialogue this text is either presented or immediately available as an implicit user manual or at least a description of the assembled model.

RELATION
In QUAESTOR, a RELATION is an expression in mathematical form:
y = f(x1, x2, ..., xn)
A RELATION always contains the equality operator ‘=’ and is used as a production rule by QUAESTOR. The IF clause of the rule is made up by the evaluation of CONSTRAINTs, if any, and the conclusion is the value of any one of the parameters applied in the RELATION. The term relation in normal print refers to a concept local to that paragraph, to indicate e.g. the relation between the nodes of a semantic network.

Satellite
A computer program containing an aspect model interfaced with a QUAESTOR knowledge base. Satellites make it possible to use existing software without the necessity to transform the embedded numerical model into a set of RELATIONs, CONSTRAINTs and parameters. This conversion can be impractical or even impossible. By an immediate use of the existing program, the model can be applied in a similarly flexible manner as RELATIONs in the knowledge base.
The only modification of these existing codes is related to the I/O, which needs to be in TELITAB format. For reasons of efficiency and maintainability, large and complex applications (e.g. a CVM knowledge base) should consist of the smallest possible number of frames connecting the maximum number of satellites.

Semantic network
Semantic networks describe a domain by means of a graphic structure consisting of ‘nodes’ or objects which are interconnected by lines or ‘relationships’. Semantic nets were developed to capture knowledge with a hierarchical structure. A relationship can be for instance an Is_a relationship. Two forms of knowledge are captured in this way:
• To fix that a class of objects is a sub-class of another class of objects
• To fix that a particular object belongs to a particular class of objects
The graphic representation is appealing but rapidly becomes unwieldy for larger structures. In QUAESTOR the representation is used in a symbolic form in the knowledge base and in the assembled models. The template assembling process is viewed as the process of building and maintaining a semantic network.

Slot
A dedicated subset of a frame, containing information which belongs to that frame, i.e. a string containing the expression or parameter and the CONTROL, REFERENCE, DATA, VALUE or DIMENSION (see also Frame).

Solver
The subsystem of QUAESTOR that controls the computational process. The Solver transforms the template into a suitable format for the interpreter, invokes the interpreter and applies common numerical schemes to the solution of implicit parameters or strong components. The Solver also deals with error handling and controls user interference.

Strong component
A strong component is a subsystem of equations or a cycle that exists in the template and is solved independently of the rest of the template.

TELITAB
A generic parametric data format used as standard format for exchange and storage of data throughout QUAESTOR.
The format is also used as I/O format between QUAESTOR and satellite applications (see section 5.3).

Template
The template is the numerical model assembled from RELATIONs and parameters retrieved from a knowledge base. Its structure is fixed by the semantic relations between the RELATIONs and parameters as visualised in Figure 3.2.

Workbase
The workbase contains the parameters and their (initial) values, which are (or were) active in the most recent dialogue. In addition, the workbase contains the addresses of the parameters and expressions of the template and their relations, i.e. the inference tree, the rejected relations, the TRUE and FALSE constraints, etc.

APPENDIX B
On Merging Numerical and Functional Design Knowledge

In section 11.2 the possibility to reason about system functions was indicated as a way to further penetrate into the conceptual level of naval ship design, using basically the same reasoning principles as applied in the numerical model assembling process. In the sequel these ideas are further explored and illustrated with simple cases.

The technical design requirements of naval vessels are based on a thorough analysis of the envisaged scenarios in which these ships will be deployed. These scenarios (e.g. a blockade) require that the vessels are able to perform particular tasks. This functional analysis of the envisaged ‘scenario space’ is performed to infer which tasks need to be performed within these scenarios. Tasks are subsequently decomposed into ‘sub-tasks’. These tasks and sub-tasks form the basis of the requirements of the complete system, consisting of various objects, e.g. vessel(s), manning and logistic support. The design requirements are the result of such analyses.

In section 6.6 the CVM was introduced as a set of parametric model fragments and aspect models configured around a limited number of point designs. The CVM is used to study effects of concept modifications on the performance of a concept, and vice versa.
In what follows we will confine ourselves to ship (and manning) tasks. Starting from the CVM approach, it is an attractive idea to bridge the gap between tasks and requirements, i.e. to provide tasks as input to the CVM instead of (quantitative) operational requirements. In order to do so, it is necessary to expose the reasoning process from task to sub-tasks and from sub-tasks to the properties of the considered objects. Some of these properties are called operational values.

In line with systems theory [Klir, 1969] we consider a Task as the program of the system and a Sub-Task as a sub-program. A program has output and requires some form of input (information, energy, money). A program imposes requirements on the structure and organisation of its system, i.e. as with computers, a general system is only capable of running a particular program when it fulfils these requirements. In design it is common practice to reason from the desired program output towards the structure and organisation of the envisaged system.

The systems that are being considered consist of simpler systems or Elements. Each of these Elements has abilities required for performing one or more Sub-Tasks and may have relations with other Elements. Elements and Sub-Tasks can be described by means of quantities. Quantities can be assigned a value and refer to specific properties of the Elements or Sub-Tasks. Values are either numerical, nominal or Boolean. Some examples:
• Task: Sailing
The Task Sailing requires a propulsor. The propulsor requires energy in the form of a torque and a number of revs (attributes of the Element propulsor). This torque and number of revs require propulsion machinery. The propulsion machinery requires a control system, has mass, various operational conditions, etc.
• Task: Performing a blockade
The blockade-Task comprises the Sub-Tasks Seizing and Boarding. Seizing requires a suitable weapon system whereas Boarding requires a means of transport, e.g.
a dinghy or helicopter. The availability of a dinghy or helicopter is a staff decision and is most probably related to other Tasks to be performed by the platform (e.g. Anti-Submarine Warfare).

The functional analysis can be brought closer to the structure of the system by reasoning only about ability. The ability of a system states whether certain Tasks can be performed by or with the system. The performance is considered to be the extent to which -or how well- the given Task can be carried out by the system and should be expressed in measurable quantities. In the above example, the dinghy or the helicopter needs to fulfil requirements regarding its capabilities (properties). By reasoning in this way, a relation is assumed between Elements and Sub-Tasks; an Element is basically hardware and/or software and has properties. Element properties can only be used as design and performance requirements if they are expressed in quantities.

The above examples are derived from an existing hierarchy of (Sub-)Tasks and Elements. This hierarchy can be represented in a semantic network. An Element has relations with Tasks, i.e. an Element points to a number of Tasks with a needed_for relation. In turn, a Task points to one or more Elements with a requires relation. This network makes it possible to reason from Tasks towards required Elements and vice versa.

In my view, ‘reasoning about ability’ has aspects in common with the reasoning process performed by the Modeller. In the following, some technical aspects of this domain and of the basic inferences are discussed. The possibility of linking the functional world with the parametric world of the CVM is demonstrated with a simplified example. Elements and Tasks can be approached in a parametric way.
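The requires and needed_for relations described above can be pictured with a small sketch. It is only an illustration under stated assumptions: the dictionary layout, the function name invert and the identifiers (e.g. Anti_Submarine_Warfare, Weapon_system) are hypothetical and are not taken from QUAESTOR; the dinghy alternative for Boarding is left out for brevity.

```python
from collections import defaultdict

# Hypothetical 'requires' edges, loosely based on the examples in the text.
# Each Task points to the Elements it requires (the dinghy alternative for
# Boarding is omitted here to keep the sketch small).
REQUIRES = {
    "Sailing": ["Propulsor"],
    "Seizing": ["Weapon_system"],
    "Boarding": ["Helicopter"],
    "Anti_Submarine_Warfare": ["Helicopter"],
}


def invert(requires):
    """Derive the needed_for relation as the inverse of the requires relation."""
    needed_for = defaultdict(list)
    for task, elements in requires.items():
        for element in elements:
            needed_for[element].append(task)
    return dict(needed_for)


NEEDED_FOR = invert(REQUIRES)

# Reasoning from a Task towards its required Elements ...
print(REQUIRES["Boarding"])
# ... and from an Element towards the Tasks it is needed for.
print(sorted(NEEDED_FOR["Helicopter"]))
```

Storing only one direction and deriving the other keeps the two relations consistent, which is one plausible way to support reasoning in both directions.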
The relations in the semantic network can be modelled by expressions in which the left clause represents a Task and the parameters in the right clause represent either Elements or (other) Tasks. The connection between a Task and an Element consists of the equality sign. If Tasks are also allowed in the right clause, it is possible to refer, e.g., from Task_a to Task_b by means of a requires-relation. A DETERMINED value of the parameter representing an Element implies ‘present’, whereas PENDING declares the Element to be ‘not present’ in the system. For a Task these states mean that it ‘can’ or ‘cannot’ be performed, respectively. The Modeller acquires the Elements needed for a given set of Tasks, and for a given set of Elements it can determine the capability (the set of practicable Tasks). Providing values for Elements or deriving values for Tasks determines the result of the process.

Figure B.1: Semantic network of Tasks and Elements

In Figure B.1 the Tasks and Elements are represented as frames in a semantic network connected by a Requires relation. Transforming Figure B.1 into a ‘parametric’ form yields the equivalent structure shown in Figure B.2. This representation potentially bridges the gap between the functional knowledge in the form of Tasks and Elements and design knowledge in the form of parameters, RELATIONs and CONSTRAINTs. By combining numerical and functional knowledge into a single knowledge base, the functional level and the technical/numerical level of the CVM are linked. Tasks and Elements are modelled as parameters and the relations between Tasks and Elements are given in the form of RELATIONs. With this notation, the ‘numerical’ representation in Figure B.2 is fully equivalent to that in Figure B.1.

Figure B.2: Semantic network of Tasks and Elements via expressions

The parametric network of Figure B.2 can be used to reason from Tasks to Elements and to other Tasks.
Assigning a value to a Task or an Element implies a selection of that particular Task or Element, either by the user or by the Modeller. If a value is not provided by the user, this therefore does not automatically imply a rejection. A selection may also be indirect, e.g. through the selection of other Tasks or Elements. Selections that are neither made nor rejected by the user may thus still be made by the system.

By organising the knowledge in this way, it becomes feasible to gather, by backward reasoning, the set of required Elements on the basis of a given set of Tasks. Forward reasoning makes it possible to establish the ability to perform other Tasks and to weigh additional Elements against the Tasks that can be performed. This knowledge representation, in combination with mainly existing inferences, is used in this way to match Tasks and Elements, and vice versa, in a logical and comprehensive way.

The relations between Tasks and Elements are not yet complete. We can reason from:
• Tasks to Tasks
• Tasks to Elements
• Elements to Tasks
The reasoning from Elements to Elements is still missing. It is not difficult to imagine a requires relation between Elements, however. To remain with the example of the helicopter: the Element Helicopter requires the Elements Hangar, Heli_Deck and Pilot. The Pilot requires accommodation, etc. Apparently there is no objection to an Element as the left clause of the expression; the equality operator (‘=’) then means Requires. Organised in a semantic network, Tasks and Elements are frames in which further relevant data can be stored.

It is important to notice that the reasoning strategies required for dealing with the above semantic network are very similar to those required for the numerical model assembling process (in the CVM). It is easy to imagine this for Elements. Within QUAESTOR Elements can be represented by Parameter Value Combinations.
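The backward and forward reasoning described above can be illustrated with a minimal sketch. It is not the Modeller's algorithm: the expression table, the function names and the rule that a Task is DETERMINED once every parameter in its right clause is DETERMINED are assumptions made for this illustration only.

```python
# Hypothetical expressions: the left clause maps to the parameters of its
# right clause; '=' is read as Requires, as in the text.
EXPRESSIONS = {
    "Blockade": ["Seizing", "Boarding"],
    "Seizing": ["Weapon_system"],
    "Boarding": ["Helicopter", "Personnel"],
    "Helicopter": ["Pilot", "Hangar", "Heli_Deck"],
    "Pilot": ["Accommodation"],
}
TASKS = {"Blockade", "Seizing", "Boarding"}  # every other name is an Element


def required_elements(tasks):
    """Backward reasoning: collect all Elements reachable from the given Tasks."""
    found, seen, stack = set(), set(), list(tasks)
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        if name not in TASKS:
            found.add(name)
        # Follow Task -> Task, Task -> Element and Element -> Element links.
        stack.extend(EXPRESSIONS.get(name, []))
    return found


def practicable_tasks(present_elements):
    """Forward reasoning: Tasks whose right clause is entirely DETERMINED."""
    determined = set(present_elements)  # Elements provided as 'present'
    changed = True
    while changed:
        changed = False
        for task in TASKS:
            if task not in determined and all(
                parameter in determined for parameter in EXPRESSIONS[task]
            ):
                determined.add(task)
                changed = True
    return determined & TASKS


print(sorted(required_elements({"Boarding"})))
print(sorted(practicable_tasks({"Weapon_system", "Helicopter", "Personnel"})))
```

In this sketch an Element counts as ‘present’ only when it is provided; whether the presence of an Element should also be derived from its own right clause is a modelling choice that the sketch deliberately leaves open.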
As indicated in the previous paragraph, Elements may refer to other Elements and should therefore be organised in a network. Within the knowledge base as implemented in QUAESTOR an Element (represented as a parameter) can only refer to other Elements (viz. parameters) through expressions. Expressions are either RELATIONs or CONSTRAINTs.

By reasoning from Tasks to Elements, knowledge is acquired about the Tasks that can -or need to be- performed and about the Elements (or components) that need to be present in the system. In principle it is not sufficient merely to accept the Element, which is done by providing a value for the parameter representing the Task (or Element). For the purpose of design, a description of this Element is needed. As a thought experiment, the following (simplified) integrated approach towards functional analysis and the CVM is presented.

This thesis has shown the importance to the modelling process of the state (PENDING/DETERMINED) of parameters and CONSTRAINTs. By assuming that the value of the parameter representing the Task or Element has no meaning, we can use this status to indicate whether or not the Task or Element is present or selected. RELATIONs are used by QUAESTOR to determine values of parameters. However, a RELATION can also be used to ‘determine’ an Element. By defining a OneWay RELATION in which an Element parameter forms the left clause and the numerical Element Variables form the right clause, we can reason from Element ‘available’ towards Element ‘description’, i.e. from Element to Element Variables, and similarly from Tasks to Task Variables.

Frame B.1: Template example

Expression                                                        Control
Blockade = Tracing + Seizing + Boarding + Time span + Cost        HARD,OW
Boarding = Dinghy + Personnel + Speed                             SOFT,OW
Boarding = Helicopter + Personnel + Speed                         SOFT,OW
Helicopter = Pilot + Hangar + Jet_fuel + Aux._Eq. + Heli_deck
             + Maintenance + Type + Mass + Range + LCG + VCG + $  SOFT,OW
Hangar = Length + Breadth + Height + Mass + LCG + VCG + $         HARD,OW
Jet fuel = Volume + LCG + VCG + $                                 HARD,OW

Task, Element, Task Variable and Element Variable

In this approach, sets of Element Variables and Task Variables can be introduced as top goal parameters in a model assembling session in a simple and elegant manner. In other words, the CVM knowledge base may comprise functional knowledge in the form of Element descriptions and of what is needed to perform a Task. Frame B.1 shows an example of reasoning from Tasks towards Element and Task descriptions. Within this small network we can reason in a few steps from Blockade to Jet fuel-Volume. The SOFT RELATIONs for Boarding represent a choice between these two options.

The above example shows that the functional analysis can be interfaced with the CVM via the Element Variables and Task Variables. By definition, the Modeller chains a parameter to one RELATION. This fact should be considered while building the network (or knowledge base). All combinations of Tasks, Elements, Task Variables and Element Variables should therefore be established in an unambiguous way:
• A Task can be linked through expressions to a combination of Elements, Tasks and Task Variables (interfacing Tasks with the CVM)
• An Element can be linked through expressions to a combination of Elements, Tasks and Element Variables (interfacing Elements with the CVM)
The Task Variables in a RELATION refer to the Task forming its left clause. The Element Variables in a RELATION refer to the Element forming its left clause. Since the functional RELATIONs connect the functional and the numerical world, it is not allowed to have Task Variables and Element Variables in one and the same clause. Nor is it allowed to have a Task Variable or an Element Variable as the left clause in a functional RELATION.
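The chaining through Frame B.1 can be traced with a short sketch. It is a hypothetical illustration rather than the Modeller's inference: the tuple layout, the function names and the prefer argument (which stands in for the choice between the SOFT alternatives) are assumptions, and Jet_fuel and Time_span are spelled with underscores for convenience.

```python
# The expressions of Frame B.1 as (left clause, right clause, control) tuples.
FRAME_B1 = [
    ("Blockade", ["Tracing", "Seizing", "Boarding", "Time_span", "Cost"], "HARD,OW"),
    ("Boarding", ["Dinghy", "Personnel", "Speed"], "SOFT,OW"),
    ("Boarding", ["Helicopter", "Personnel", "Speed"], "SOFT,OW"),
    ("Helicopter", ["Pilot", "Hangar", "Jet_fuel", "Aux._Eq.", "Heli_deck",
                    "Maintenance", "Type", "Mass", "Range", "LCG", "VCG", "$"],
     "SOFT,OW"),
    ("Hangar", ["Length", "Breadth", "Height", "Mass", "LCG", "VCG", "$"], "HARD,OW"),
    ("Jet_fuel", ["Volume", "LCG", "VCG", "$"], "HARD,OW"),
]


def candidates(parameter):
    """All RELATIONs that have the given parameter as their left clause."""
    return [relation for relation in FRAME_B1 if relation[0] == parameter]


def chain(goal, target, prefer):
    """Return the list of parameters leading from goal to target, or None.

    A parameter is chained to exactly one RELATION: when alternatives exist,
    the one whose right clause contains `prefer` is taken, else the first.
    """
    if goal == target:
        return [goal]
    options = candidates(goal)
    if not options:
        return None  # a leaf: no RELATION has this parameter as left clause
    relation = next((r for r in options if prefer in r[1]), options[0])
    for parameter in relation[1]:
        tail = chain(parameter, target, prefer)
        if tail:
            return [goal] + tail
    return None


print(chain("Blockade", "Volume", prefer="Helicopter"))
# → ['Blockade', 'Boarding', 'Helicopter', 'Jet_fuel', 'Volume']
print(chain("Blockade", "Volume", prefer="Dinghy"))
# → None
```

With the helicopter alternative selected, the jet fuel volume is reached in four steps; with the dinghy alternative it is not part of the network at all, which reflects the choice expressed by the two SOFT RELATIONs for Boarding.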
The numerical variables (Task Variables and Element Variables) can only be sub goals of functional RELATIONs and will be chained to numerical RELATIONs, thus bridging the gap between the functional and numerical sub-domains of the CVM.

Bibliography

Andrews, 1992
Andrews, D.J., Hyde, K.M.: “CONDES, A Preliminary Warship Design Tool to Aid Customer Decision Making”, Practical Design of Ships and Mobile Units, Elsevier Science Press, 1992, ISBN 1-85166-863-2, pp. 2.1298-2.1310

vdBerg, 1996
Van den Berg, E.: “Probability based Prediction from Measurements using Gaussian Interpolation”, Report No. 12403-2-SE, MARIN, Wageningen, The Netherlands, 1996

Borning, 1979
Borning, A.: “ThingLab - A Constraint-Oriented Simulation Laboratory”, Xerox PARC Technical Report SSL-79-3, Palo Alto, California, July 1979

Bras, 1992
Bras, B.A.: “Foundations for Designing Decision-Based Design Processes”, Thesis University of Houston, Systems Design Laboratory, 1992

Bremdal, 1985
Bremdal, B.A.: “Marine Design Theory and the Application of Expert Systems in Marine Design”, Computer Applications in the Automation of Shipyard Operation and Ship Design V, Elsevier Science Publishers, 1985

Brouwer, 1990
Brouwer, R.: “Description and use of VIBREX, a QUAESTOR knowledge base to evaluate the vibration risk in the preliminary design phase of a ship”, graduation work T.U.
Delft, October 1990

Brown, 1989
Brown, D.C., Chandrasekaran, B.: “Design Problem Solving, Knowledge Structures and Control Strategies”, Pitman, London, Morgan Kaufmann Publishers, Inc., San Mateo, California, 1989

MacCallum, 1985
MacCallum, K.J., Duffy, A.H.B.: “Approximate Calculations in Preliminary Design”, Computer Applications in the Automation of Shipyard Operation and Ship Design V, Elsevier Science Publishers, 1985

MacCallum, 1987
MacCallum, K.J., Duffy, A.H.B.: “An expert system for preliminary numerical design modelling”, Computational Mechanics Publications, Butterworth & Co (Publishers), 1987

MacCallum, 1990
MacCallum, K.J., Duffy, A.H.B.: “Representing and Using Numerical Empiricism in Design”, AI in Engineering Conference, Vol. 1, July 1990, CAD Centre, Faculty of Engineering, University of Strathclyde, Glasgow, Springer Verlag, Boston USA

Clancey, 1982
Clancey, W.J.: “The Epistemology of a Rule Based Expert System - A Framework for Explanation”, Artificial Intelligence 20 (1983), pp. 215-251

Clocksin, 1984
Clocksin, W.F., Mellish, C.S.: “Programming in PROLOG”, Springer Verlag, Berlin, Heidelberg, 2nd Edition 1984

vCoevorden, 1995
van Coevorden, P. (Royal Netherlands Navy): “A QUAESTOR Based Concept Exploration Model of a Corvette Surface Effect Ship”, graduation thesis, Hogeschool Rotterdam & Omstreken, June 1995

Georgescu, 1989
Georgescu, C., Verbaas, F., Boonstra, H.: “Concept Exploration Models for Merchant Ships”, Conference on CFD and CAD in Ship Design 1989, Elsevier Science Publishers BV

Guesgen, 1992
Guesgen, H.W., Hertzberg, J.: “A Perspective of Constraint-Based Reasoning, An Introductory Tutorial”, Lecture Notes in AI 597, Springer Verlag 1992

Gutierrez-Fraile, 1993
Gutierrez-Fraile, R.: “Design for Production, Production Methods, Ship Production & Procurement”, WEMT93, Madrid

Hagen, 1993
Hagen, A.: “The Framework of a Design Process Language”, Dr. Ing. Dissertation, Div. of Marine Systems Design, Univ.
of Trondheim, August 1993

vHees, 1992
van Hees, M.Th.: “QUAESTOR: A Knowledge-Based System for Computations in Preliminary Ship Design”, Practical Design of Ships and Mobile Units, Elsevier Science Press, 1992, ISBN 1-85166-863-2, pp. 2.1284-2.1297

vHees, 1994
Van Hees, M.Th.: “MO 2015: Principles of Model Development”, MARIN Report No. 3/123451-SE, October 1994

vHees, 1995
Van Hees, M.Th.: “Towards Practical Knowledge-based Design Modelling”, PRADS’95 Symposium, Seoul, Korea, September 1995

Hertz, 1991
Hertz, J.A.: “Introduction to the theory of neural computation”, Redwood City, Addison-Wesley 1991, pp. 248-250

Hinton
Hinton, E. (Editor): “NAFEMS Introduction to Non-linear Finite Element Analysis”

Holtrop, 1984
Holtrop, J.: “A Statistical Re-analysis of Resistance and Propulsion Data”, International Shipbuilding Progress, Vol. 31, No. 363, November 1984

Hoon, 1994
De Hoon, W., Rutten, L., and Van Eekelen, M.C.J.D.: “Funsheet: A Functional Spreadsheet”, Proceedings of the 6th International Workshop on the Implementation of Functional Languages, pp. 11.1-11, Norwich, UK, September 1994

Hudak, 1994
Hudak, P., Jones, M.P.: “Haskell vs. Ada vs. C++ vs. Awk vs. ..., An Experiment in Software Prototyping Productivity”, Arpa Order 8888, Contract N00014-92-C-0153

Hughes, 1989
Hughes, J.: “Why Functional Programming Matters”, The Computer Journal, Vol. 32, No.
2, 1989

Ingalls, 1981
Ingalls, D.: “Design Principles behind Smalltalk”, BYTE Magazine, August 1981

Keizer, 1994
Keizer, E.W.H.: “Future Reduced Cost Combatant Study, Study Prospectus 3rd Draft”, MO 2015 Phase 2, MARIN, October 1994

Keizer, 1996
Keizer, E.W.H.: “Future Reduced Cost Combatant Study, Status Report”, MO 2015 Phase 2, MARIN, April 1996

Klir, 1969
Klir, G.J.: “An Approach to General Systems Theory”, Litton Educational Publishing, Inc., 1969

Konopasek, 1984
Konopasek, M., Jayaraman, S.: “The TK!Solver Book”, Osborne, McGraw-Hill, Berkeley, California, 1984

Kupras, 1983
Kupras, L.K.: “Computer Methods in Preliminary Ship Design”, Delft University Press, 1983, ISBN 9062751067

Leler, 1988
Leler, W.: “Constraint Programming Languages, Their Specification and Generation”, Addison-Wesley Publishing Company, Amsterdam, 1988

Laansma, 1993
Laansma, K.S.: “L/GRAND: A New Generation Ship Design System”, Schip & Werf, pp. 554-555, Nr. 12, 1993

vManen, 1996
Van Manen, J.D. and Van Terwisga, T.: “A New Way of Simulating Whale Tail Propulsion”, Office of Naval Research Symposium, Trondheim, Norway, June 24-28, 1996

Meek, 1992
Meek, M.: “Marine Design: Advancing with Realism”, Plenary Lecture 2, PRADS’92 Symposium, Newcastle upon Tyne, 1992

Mistree, 1988
Mistree, F., Muster, D.: “Designing for Concept: A Method that Works”, International Workshop on Engineering Design and Manufacturing Management, Melbourne, Australia, November 21-23, 1988

Mistree, 1990
Mistree, F.: “Decision-Based Design: A Contemporary Paradigm for Ship Design”, Transactions, Society of Naval Architects and Marine Engineers, Jersey City, New Jersey, pp. 565-597, 1990

vdNat, 1993
Van der Nat, C.G.J.M. (NEVESBU): “SUBCEM with QUAESTOR - Phase 1, Feasibility Study on a Concept Exploration Model for Under Water Vehicles implemented in a Knowledge-based System”, Report No.: TU-Delft, OEMO 93/17 (in Dutch)

vdNat, 1994
Van der Nat, C.G.J.M. (NEVESBU), Van Hees, M.Th.
(MARIN): “A Knowledge-based Concept Exploration Model for Underwater Vehicles”, International Marine Design Conference (IMDC), May 25-27, 1994, Delft, the Netherlands

vdNat, 1996
Van der Nat, C.G.J.M.: “A Knowledge-based Concept Exploration Model for Submarines”, Thesis TU-Delft, in preparation

Newell, 1982
Newell, A.: “The Knowledge Level”, Artificial Intelligence 18, pp. 87-127, 1982

Nygaard, 1966
Nygaard, K., Dahl, O.J.: “SIMULA - an ALGOL-based Simulation Language”, Communications of the ACM, Vol. 9, No. 9, Donald E. Knuth, Editor, September 1966

Pal, 1992
Pal, P.K.: “Computer Aided Design of Tugs”, Practical Design of Ships and Mobile Units, Elsevier Science Press, 1992, ISBN 1-85166-863-2, pp. 2.1475-2.1488

Pandurang, 1992
Pandurang Nayak, P.: “Automated Modelling of Physical Systems”, Thesis, Department of Computer Science, Stanford University, Stanford, California 94305

Press, 1992
Press, W.H. et al.: “Numerical Recipes in FORTRAN, The Art of Scientific Computing”, Cambridge University Press, 1986-1992

Riesbeck, 1989
Riesbeck, C.K., Schank, R.C.: “Inside Case-Based Reasoning”, Lawrence Erlbaum Associates, New Jersey, 1998

vdRee, 1994
Van de Ree, R.: “SCWERE Supervisory Control Systems”, Thesis Delft 1994

Rumbaugh, 1991
Rumbaugh, J.: “Object Oriented Modelling and Design”, Prentice Hall, 1991

Schoman, 1977
Schoman, K., Ross, D.T.: “Structured Analysis for Requirement Definition”, IEEE Transactions on Software Engineering, Vol. SE-3, No 1, January 1977

Schreiber, 1994
Schreiber, A.Th., Wielinga, B.J., De Hoog, R., Akkermans, J.M., Van de Velde, R.: “CommonKADS: A Comprehensive Methodology for KBS Development”, IEEE EXPERT, 0885-9000/94 1994 IEEE.

Sen, 1990
Sen, A., Srivastava, M.: “Regression Analysis, Theory, Methods, and Applications”, Springer Verlag, 1990

Serrano, 1992
Serrano, D., Gossard, D.: “Tools and Techniques for Conceptual Design”, Artificial Intelligence in Engineering Design, Vol.
1, Design Representation and Models of Routine Design, Chapter 3, Academic Press, San Diego, 1992

Smith, 1992
Smith, W.F.: “The Modelling and Exploration of Ship Systems in the Early Stages of Decision-Based Design”, Thesis, Systems Design Laboratory, University of Houston, 1992

Smith, 1994
Smith, W.F., Mistree, F.: “The Development of Top Level Ship Specifications: A Decision-Based Approach”, International Marine Design Conference (IMDC), May 25-27, 1994, Delft, the Netherlands

Specht, 1991
Specht, Donald F.: “A General Regression Neural Network”, IEEE Transactions on Neural Networks, Vol. 2, Nr. 6, pp. 568-576, November 1991

Steels, 1989
Steels, L.: “Kennissystemen”, Addison Wesley, 1989, Amsterdam

Sutherland, 1963
Sutherland, I.: “SKETCHPAD: A Man-Machine Graphical Communication System”, IFIPS Proceedings of the Spring Joint Computer Conference, January 1963

Top, 1992
Top, J.L., Westen, S.: “Reverse Engineering of QUAESTOR with KADS”, Second KADS User Meeting, Munich, February 17-18, 1992

Top, 1993
Top, J.L.: “Conceptual Modelling of Physical Systems”, Thesis Enschede, 1993

Vingerhoeds, 1990
Vingerhoeds, R.A., Netten, B.D., Boullard, L.: “On the Use of Expert Systems in Numerical Optimisation”, IMACS Annals on Computing and Applied Mathematics Proceedings MIM-S2, Brussels, September 3-7, 1990

Weiss, 1990
Weiss, S.M., Kulikowski, C.A.: “Computer Systems That Learn: Classification and Prediction Methods from Statistics, Neural Nets, Machine Learning, and Expert Systems”, Morgan Kaufmann Publishers, Inc., San Mateo, California, 1990, pp. 62-68

Wolff, 1994
Wolff, P.A. (Royal Netherlands Navy): “Development of a Remote Controlled Mine Sweeper”, Paper 24, INEC 94 Cost Effective Maritime Defence, 31 August - 2 September, 1994

Zuiddam, 1994
Zuiddam, R. (Royal Netherlands Navy): “A Comparative Study of the SWATH and Monohull Concept”, graduation thesis T.U.
Delft, May 1994 (in Dutch)

List of Figures
2.1 Cost and knowledge as a function of time
3.1 Example of a component hierarchy
3.2 Numerical model in a semantic net
3.3 Semantic network of Parameters, RELATIONs and CONSTRAINTs
4.1 Design process: Sequential state-of-practice versus computerised optimisation
4.2 Concept Exploration Model
6.1 Global architecture of QUAESTOR
6.2 Map of interface
6.3 Semantic network representing the simplified TROIKA model
6.4 Result of example calculation
6.5 Concept Variation Process
8.1 Main loop of QUAESTOR
8.2 Control strategy of parametric model assembling and solution process
10.1 General arrangement of the TROIKA drone
10.2 Wall thickness t versus Cost Index CI for different cylinder radii
B.1 Semantic network of Tasks and Elements
B.2 Semantic network of Tasks and Elements via expressions

List of Frames
3.1 Summary of a typical frame in a QUAESTOR knowledge base
6.1 Network Management
6.2 Parameter List
6.3 Expression List
6.4 Workbase
7.1 Pendulum template
8.1 Backward reasoning
8.2 Forward reasoning
8.3 Select goal parameter
8.4 Validate goal parameter
8.5 Select candidate RELATIONs
8.6 Heuristic priority ranking of candidate RELATIONs
8.7 Select candidate and include in template
8.8 Find subsystem and solve
B.1 Template example

List of Tables
3.1 Tentative language classification from a numerical modelling perspective
5.1 A TEXT
5.2 Parameter Value Combinations
5.3 Speed/power TABLE
5.4 Open water diagram in TELITAB format
5.5 Ship description in basic TELITAB format
5.6 Rudder data in TABLE
5.7 Rudder data in TELITAB format
5.8 TELITAB set with OBJECT “Rudder”
5.9 Part of DESP output
5.10 DESP output in basic TELITAB format
5.11 Part of DESP output in recursive TELITAB format
6.1 Data and stimuli
6.2 Inference tree of the simplified TROIKA design model
7.1
Results of QUAESTOR CONSTRAINT evaluation

QUAESTOR: EXPERT GOVERNED PARAMETRIC MODEL ASSEMBLING
Summary

The aim of this work is to create a semi-automated method for the assembling, execution and maintenance of parametric design models. In spite of the array of available computer languages and tools for solving numerical problems, the actual assembling of the numerical models has thus far received little research attention. Within the available tools, the actual model assembling is viewed as a programming activity. These tools generally provide a library of numerical methods and an instruction set for a ‘manual’ translation of the problem into a form manageable by the computer.

This research is founded on the observation that in numerical design modelling many tasks are performed that contain elements of repetition which can be made generic. The first modelling sub-task comprises the selection of suitable model fragments; the actual assembling and implementation of the model in some computer code is the second sub-task. The time and effort involved in these tasks cannot be directed towards the actual core activity of numerical design modelling: the development, gathering and advancing of the model fragments involved.

On the basis of these observations, a dedicated network knowledge model has been developed and implemented. On the basis of this knowledge model, a strategy of reasoning has been developed that controls the numerical model assembling process in dialogue with the user. By applying this new modelling strategy in design one can fully concentrate on the very essence of the design problems, being the quality and validity of the knowledge required to solve them.

By means of several prototype applications, the heuristic rules, reasoning steps and properties of model fragments were derived. Furthermore, a syntax for coding the relevant forms of knowledge and a general purpose numerical solver were developed and implemented.
A hierarchic parametric data model has been developed that is sufficiently expressive for all exchange and storage of numerical data in the process of modelling and computation.

In order to function, the modelling strategy needs access to a number of methods and techniques which are partly known and applied in different existing tools. The core strategy, which is referred to as the Modeller, is considered to be the primary contribution of this work. No comparable implementations were found, although some known concepts of Artificial Intelligence are applied in the Modeller. In addition to this, the application of this strategy in the conceptual design of ships has improved our understanding of the role and function of numerical relationships in the design process. The program QUAESTOR, being the result of this work, is a union of a rule-based expert system, computer algebra and constraint programming.

The conceptual design of naval ships has received much attention in this work since it was the primary touchstone during the development and, thus far, its most demanding application. The design process is described on a conceptual level and is linked to the knowledge-based parametric model assembling process with QUAESTOR. Some practical applications and a formal description of the generic design model or Concept Variation Model are presented.

An important observation made in the practical application of the developed technique in ship design is that the system supports and facilitates the existing approach and has not forced designers to switch over to other ways of working and thinking. The absence of a design paradigm shift is the key to the acceptance of the system. This fact alone makes a seamless transition possible from the ‘conventional’ to a knowledge-based approach towards conceptual design.
In spite of this, the introduction of QUAESTOR in naval ship design confirms the general statement that knowledge-based systems have more impact on organisations than conventional information systems. Users and suppliers of design knowledge, inside and outside the organisation, are in particular affected by the modular way of working and thinking which comes with QUAESTOR.

Martin Th. van Hees

QUAESTOR: EXPERT GOVERNED PARAMETRIC MODEL ASSEMBLING
Samenvatting (Summary)

The objective of this work is to realise a semi-automated method for assembling and computing numerical (design) models. Despite the availability of a multitude of languages and tools for solving numerical problems, the actual assembling of numerical models has so far received little attention. Within the available tools, model assembling is almost without exception regarded as a programming activity. In most cases these tools offer a number of numerical solution methods together with an instruction set for the ‘manual’ formulation of the problem.

The research started from the observation that numerical (design) modelling involves many actions that are highly repetitive and can be solved generically. The first action is the selection of suitable model fragments and their assembly into a model; the second is the coding of the model. The time and energy this requires cannot be devoted to the actual core of the problem: developing, collecting and improving the model components.

On the basis of these observations, a network model of the most important knowledge elements has been developed and implemented, together with a reasoning strategy that controls the model assembling in dialogue. By applying this strategy one can concentrate fully on the substance of (design) problems.
By means of several prototype applications, the reasoning steps and heuristic rules used in assembling models were derived, as well as the properties of model components and the required functions. In addition, a hierarchical parametric data model has been developed that is sufficiently expressive for all data storage and exchange taking place in the system.

The modelling strategy turned out to require, as a basis, a number of methods and techniques that are partly known and already applied in a variety of tools. The modelling strategy itself, which is regarded as the principal contribution of this work, still stands on its own, although some known concepts from knowledge engineering are applied in it as well. Important insights have been gained into the role and possibilities of numerical relations in design processes. The program QUAESTOR, which is the result of this work, is a cross between a rule-based knowledge system, computer algebra and constraint programming.

The conceptual design of naval ships receives much attention in this work because it was an important touchstone during the development and is, to date, the most demanding application of the program. The course of the conceptual design process is described and subsequently mapped onto knowledge-based model assembling with QUAESTOR. A number of realised applications are presented, together with a formal description of the generic design model developed from them, the Concept Variation Model.

An important observation from the practical applications realised so far is that the knowledge system supports the existing ways of working and thinking, and that there was no need to switch to an essentially different design method.
The absence of such a paradigm shift in ship design is, for QUAESTOR, the key to rapid acceptance in a practical environment. This makes an almost seamless transition possible from the ‘conventional’ design method to one with a knowledge system. Nevertheless, the introduction of QUAESTOR confirms the general proposition that knowledge systems have more influence on organisations and their working methods than conventional information systems. An important element of this is, in particular, the modular way of working and thinking and the demands it places on the suppliers of knowledge inside and outside the organisation.

Martin Th. van Hees

Acknowledgement

This work could not have been achieved without the contributions of numerous people. These people, colleagues and friends from MARIN, the Royal Netherlands Navy and universities, have contributed to, stimulated or inspired my work. Countless times I have entered into discussions and presented my ideas, listening to opinions and views. Although convinced of my views on parametric modelling and model assembling in design, I attempted to check them with people and practice throughout the years.

I should like to thank prof. Henk Koppelaar and prof. Hans Klein Woud, who continuously offered their support and advice during the writing of this thesis. My special thanks are in order for prof. Klein Woud for his enthusiasm about this subject and for taking the initiative for applications which have greatly contributed to the final result.

In chronological order I am indebted to the following people. Prof. Schelte Hylarides stimulated me to proceed in the early hours of this study with his clear insight into the potential of the approach, given the fact that it was still in an embryonic stage. Ir. Robin Brouwer was victimised into developing the first professional application using a prototype version which did not quite earn that name, and greatly contributed in this way to the first workable version of the program.
My special thanks are in order for my fellow graduate student ir. Clemens van der Nat. For years he has been my ‘sparring partner’ within the SUBCEM project, and his feedback and sharp insight into what is, and is not, possible have been invaluable. I also wish to compliment him on his perseverance in overcoming all setbacks and misbehaviour (of the program).
Since early 1993 the Royal Netherlands Navy has generously supported the development and application of QUAESTOR in a financial, moral and technical sense. A personal word of appreciation goes to ir. Philipp Wolff and ir. Rob Zuiddam for their support and their important contributions to the joint development and dissemination of a useful conceptual design method using QUAESTOR. I am furthermore indebted to them for their permission to use their findings and experience in this thesis. I want to thank ir. Ed Keizer for his confidence, for his ongoing effort in disseminating my approach to model assembling and for sharing with me his broad view on conceptual naval ship design. His concurrent approach towards design, uniting different people and disciplines in this task, is still an important incentive in this work.
I am indebted to the MARIN management team for their permission to undertake this work and to publish the results in this thesis. I thank Jan van Zeggelaar for guiding me safely through the labyrinth of MSWord, Gerard Trouerbach for transforming my rough sketches into neat figures and Mariëtte Drinoczy for the “tough” reading.
Lastly, and most importantly, I wish to thank my wife Sylvia for putting up with my absent-mindedness and for frequently picking me out of the clouds.
Curriculum Vitae
Martin van Hees was born on November 24, 1955 in The Hague, the Netherlands. He attended secondary school in Voorburg and Zoetermeer until 1976 and performed military service in 1977.
In 1983 he finished his studies in Naval Architecture and Marine Engineering at the Department of Naval Architecture at Delft University of Technology. The subject of his graduation thesis was a conceptual design model for the energy systems on board suction hopper dredgers. After this, he was employed by the New Building Department of Wilton Fijenoord Shipyard at Schiedam, where he worked on ship conversions and repair and on the conceptual design of merchant ships, surface warships and a naval submarine. In 1986 he moved to the Maritime Research Institute Netherlands (MARIN) in Wageningen, where he worked in the Ship Powering and Offshore Departments for about four years on applied experimental research related to ship propulsion and ship motions. In 1990 he moved to the Software Engineering Department. As senior project manager he is now involved in the development of ship performance prediction methods and knowledge-based systems, in their application in conceptual (naval) ship design, and in software integration.