Prefacio - GAIA – Group of Artificial Intelligence Applications
Preface

In its second edition, CoSECiVi consolidates itself as the meeting point in Spain for the research groups working on the different aspects of video game development. In these proceedings we find papers focused on the application of artificial intelligence techniques, in particular machine learning and genetic algorithms, to the development of the behaviour of non-player characters in a video game in some cases, and to the analysis of matches in others. We also find a number of papers related to the development of serious applications of video games, beyond pure entertainment, including both concrete applications and generic methodologies for the development of this kind of games. A third group of papers is more related to the engineering of entertainment systems, ranging from procedural content generation to techniques that help in debugging and testing video games. CoSECiVi thus continues its vocation of serving as a multidisciplinary meeting point, playing a role complementary to that of conferences more specialized in more traditional areas such as Artificial Intelligence, computer-assisted instruction or Software Engineering. In this edition CoSECiVi also continues its collaboration with the professional fair Gamelab Conference, seeking to build bridges between the professional and academic worlds. This collaboration has become closer on this occasion, as CoSECiVi becomes part of the Gamelab programme itself, occupying one of the two parallel sessions during the first day of the fair, which culminates in a joint Gamelab-CoSECiVi networking event.

June 2015

Pedro Antonio González Calero
David Camacho
Marco Antonio Gómez Martín

Table of Contents

Raúl Lara-Cabrera, Mariela Nogueira-Collazo, Carlos Cotta and Antonio J. Fernández Leiva. Game Artificial Intelligence: Challenges for the Scientific Community.
Maximiliano Miranda and Federico Peinado. Improving the Performance of a Computer-Controlled Player in a Maze Chase Game using Evolutionary Programming on a Finite-State Machine.
Antonio A. Sánchez-Ruiz. Predicting the Winner in Two Player StarCraft Games.
Diana Lora. Clustering de Jugadores de Tetris.
José Jiménez, Antonio Mora and Antonio J. Fernández Leiva. Evolutionary Interactive Bot for the FPS Unreal Tournament 2004.
Vicente Nacher, Fernando García-Sanjuan and Javier Jaen. Game Technologies for Kindergarten Instruction: Experiences and Future Challenges.
Lilia García, Marcela Genero and Mario Piattini. Refinamiento de un Modelo de Calidad para Juegos Serios.
Marta Caro-Martínez, David Hernando-Hernández and Guillermo Jiménez-Díaz. RACMA o cómo dar vida a un mapa mudo en el Museo de América.
Rafael Prieto, Nuria Medina Medina, Patricia Paderewski and Francisco Gutiérrez Vela. Design methodology for educational games based on interactive screenplays.
Pablo Delatorre, Anke Berns, Manuel Palomo-Duarte, Pablo Gervás and Francisco Madueño. Diseño de un juego serio basado en el suspense.
Nahum Álvarez and Federico Peinado. Modelling Suspicion as a Game Mechanism for Designing a Computer-Played Investigation Character.
Daniel Palacios-Alonso, Victoria Rodellar-Biarge, Pedro Gómez-Vilda and Victor Nieto-Lluis. Spontaneous emotional speech recordings through a cooperative online video game.
Francisco M. Urea and Alberto Sanchez. Towards Real-time Procedural Scene Generation from a Truncated Icosidodecahedron.
Ismael Sagredo-Olivenza, Gonzalo Flórez-Puga, Marco Antonio Gómez-Martín and Pedro A. González-Calero. Implementación de nodos consulta en árboles de comportamiento.
José Rafael López-Arcos, Francisco Luis Gutiérrez Vela, Natalia Padilla-Zea, Patricia Paderewski and Noemí Marta Fuentes García. Evaluación de una historia interactiva: una aproximación basada en emociones.
Jennifer Hernández Bécares, Luis Costero Valero and Pedro Pablo Gómez-Martín. Automatic Gameplay Testing for Message Passing Architectures.
Víctor Rodríguez-Fernández, Cristian Ramírez-Atencia and David Fernández. A Summary of Player Assessment in a Multi-UAV Mission Planning Serious Game.
Antonio Fernández-Ares, Pablo García Sánchez, Antonio Mora, Pedro Castillo, Maribel García-Arenas and Gustavo Romero López. An overview on the termination conditions in the evolution of game bots.

Game Artificial Intelligence: Challenges for the Scientific Community

Raúl Lara-Cabrera, Mariela Nogueira-Collazo, Carlos Cotta and Antonio J. Fernández-Leiva
Departamento de Lenguajes y Ciencias de la Computación, ETSI Informática, University of Málaga, Campus de Teatinos, 29071 Málaga, Spain
{raul,mnogueira,ccottap,afdez}@lcc.uma.es

Abstract. This paper discusses some of the most interesting challenges that members of the games research community may face in the area of the application of artificial or computational intelligence techniques to the design and creation of video games. The paper focuses on three lines that will certainly have a significant influence on the game development industry in the near future, specifically the automatic generation of content, affective computing applied to video games, and the generation of behaviors that manage the decisions of entities not controlled by the human player.

Keywords: Game Artificial Intelligence, Non-player characters, Procedural Content Generation, Affective Computing, Human-like behavior.

1 Introduction

According to the Entertainment Software Association [1] there are 155 million Americans who play videogames, with an average of two gamers in each game-playing U.S. household.
The total consumer spend was 15.4 billion dollars in 2014 in the United States alone. These figures show the good health of the video game industry, which has taken the lead in the entertainment industry. This situation has motivated research applied to video games, which has gained notoriety over the past few years, covering several areas such as psychology and player satisfaction, marketing and gamification, artificial intelligence, computer graphics, and even education and health (serious games). In the same way, the industry is beginning to adopt the techniques and recommendations that academia offers. The reader interested in the current state of artificial intelligence techniques within the industry should refer to resources such as the website AiGameDev (http://aigamedev.com), the AI Summit from the Game Developers Conference, the AI Game Programming Wisdom and Game Programming Gems book collections, or the book by Ian Millington and John Funge [22].

Research in artificial intelligence may take advantage of the wide variety of problems that videogames offer, such as adversarial planning, real-time reactive behaviors and planning, and decision making under uncertainty. For instance, real-time strategy games, which are a subset of all videogames, are being used as testbeds and frameworks for brand new artificial intelligence techniques, as stated in our previous study on real-time strategy games and artificial intelligence [18].

This paper points at some interesting trends that seem to guide the future of videogames, and at the challenges that they offer to academia, focusing on the application of artificial intelligence and, more precisely, computational intelligence, i.e. bio-inspired optimization techniques and meta-heuristics [20]. We want to clarify that the universe of uses of optimization techniques in game development and design is extremely broad and we do not pretend to make an exhaustive tour of it in this paper; in fact we recommend to the interested reader other papers that have been published in the literature and serve as a basis to learn about the state of the art [37,43]. We focus, instead, on certain research areas that will significantly influence the creation of commercial games over the next decade: procedural content generation, affective computing, which has an impact on player satisfaction, and the creation of behaviors or decision-making strategies for non-playable characters (NPCs).

2 Procedural Content Generation

Procedural content generation refers to the creation of videogame content, such as levels, textures, characters, rules and missions, through algorithmic means (the content may be essential to the game mechanics or optional). Generating the content procedurally may reduce the expense of hiring many designers to create the content manually, and it may even be a source of inspiration for them, suggesting novel designs. Moreover, it is possible to establish criteria that the generated content must meet, such as adjusting the created level to the player's game style in order to offer her a continuous challenge. If the generation process is done in real time and the content is diverse enough, then it may be possible to create truly infinite games, which offer the player a brand new gaming experience every time she starts a new one. These benefits are well known by the industry, as shown by the use of this kind of techniques in successful commercial games such as Terraria, the
Borderlands saga, Skyrim and Minecraft.

Many distinctions may be drawn when dealing with procedural content generation and its procedures. Regarding when the content is generated, it might be during the execution of the game (online generation) or during development (offline generation). Regarding the main objective of the generated content, it could be necessary for the game progression, in which case it is mandatory to ensure that the content is valid, or it could be optional, such as the decoration of levels. Another question is the nature of the generation algorithm, that is, whether we have a purely stochastic algorithm, in which content is created from a random seed, or, conversely, a deterministic algorithm, where the content is generated from a parameter vector. A third possibility is the hybridization of both perspectives, designing an algorithm with a stochastic and a deterministic component working together. Looking at the objectives to be met, the creation process can be done in a constructive manner, ensuring the validity of the content throughout the process. The other option is to follow a generate-and-test scheme, where a large amount of content is generated and then goes through a validation phase, with subsequent disposal of any content that does not comply with the restrictions. The latter scheme is the one most employed by the research community at present, and it is based on searching for the content in the space of possible solutions. The validation is done by assigning values to the content so that its level of quality is quantified according to the objectives.

Apart from maps and levels, there are other examples of content that may be generated procedurally, such as music [7], stories for role-playing games [27], game rules [11] and quests [27]. These techniques are commonly used to generate maps and levels, as evidenced by the large number of papers devoted to this issue [14]. For instance, the authors of [17] approach the problem of matchmaking in multiplayer videogames by evolving maps for a first-person shooter in order to improve the game balancing for certain combinations of player skills and strategies. With a similar content type, the authors of [10] presented a genetic algorithm for the generation of levels for the Angry Birds game whose objective is to minimize the elements' movement during a period of time, obtaining stable structures in the process.

3 Affective computing and player satisfaction

It was Rosalind Picard who in 1995 introduced the term Affective Computing, defining it as computing that relates to, arises from or influences emotions [30]. In the context of video games, there is still research on how to extrapolate the vast field of emotions onto a game's stage; the good reasons for doing it are clear [12], but the results obtained so far are usually modest compared to everything academia expects to achieve. One of the earliest ways of incorporating emotions in games was through the narrative, by generating situations that caught the player either through the characters, the incorporation of all kinds of conflicts and fantastic stories, or through real life situations (the Final Fantasy and Resident Evil sagas use this kind of emotional narrative). In a similar way, there are other games that stand out due to the high realism of their simulations and the incorporation of emotions into the main character (see Figure 1).
This kind of affectivity focused on the main character requires a lot of artistic work in order to simulate the emotions in a realistic manner (it is common to use motion capture techniques and hire professional actors). For instance, the videogame Beyond: Two Souls relies heavily on the narrative and guides the player through a predefined story, hence limiting the possible actions the player may take, so it is easier for the software to have control over the emotional flow of the main character. Generally, this approach to the implementation of emotion has as its main objective to increase the user's immersion in the game and, to do so, it sets up emotional dynamics between the human player and the main character, so that the player empathizes with the actions her character conveys.

Fig. 1. Characters from Heavy Rain and The Walking Dead

There is another approach which aims to make the non-playable characters (NPCs) involved in the game behave as emotional individuals and, therefore, their emotions should influence the game whenever they take decisions. In order to achieve this, NPCs must have a high level of perception because they must react to every event that happens around them which could affect their emotional state, such as shrill sounds or fire. In this sense there exist interesting proposals that offer techniques and models devoted to implementing emotional behaviors for virtual agents [15,28,29].

The aforementioned approaches manage affectivity through the virtual agents, focusing on what happens inside the videogame without taking into account the emotional responses that players express while they are playing the game. This situation leads us to a third approach, namely self-adaptive affective videogames, which is related to the field of modeling, evaluating and increasing player satisfaction [25,39]. Self-adaptation refers to the ability of a game to take into account the preferences and gaming style of the player and react to these features, increasing the player's satisfaction and making a unique gaming experience for each player. In the case of affectivity, the game should self-adapt depending on the emotions the player expresses while she is playing, establishing emotional links between the human and the NPCs. The most interesting papers in this field are focused mostly on creating a formal model that represents the behavior of the player so it is possible to evaluate her level of satisfaction, all based on psychological research on satisfaction [8,21,35]. Then one has to consider the use of this model to determine what level of satisfaction the player experiences, and then proceed to readjust the game to maintain or raise that level. Some successful proposals in this sense are [13,40,41,42].

Each approach mentioned here is an open research field with several lines that also demand new solutions, and optimization is one of them. The most successful proposals out there, some of which have been cited here, consider a search process, based on meta-heuristics, to explore the broad search space that is generated from two inherently complex contexts: videogames and emotions.

4 Behaviors

Traditionally the Artificial Intelligence (AI) of a game has been coded manually using predefined sets of rules, leading to behaviors often encompassed within the so-called artificial stupidity [19], which results in a set of known problems such as the feeling of unreality, the occurrence of abnormal behaviors in unexpected situations, or the existence of predictable behaviors, just to name a few.
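To make the contrast with the search-based approaches discussed next concrete, the following is a minimal, purely illustrative sketch of the kind of hand-coded rule set referred to above. It is not taken from any cited game; every threshold, state and action name is an assumption.

```python
# Hypothetical hand-coded guard AI: a fixed set of if/then rules.
# All constants and action names below are illustrative, not from a real game.

def guard_behaviour(distance_to_player: float, own_health: float) -> str:
    """Return the action a guard NPC takes, using fixed, hand-tuned rules."""
    if own_health < 20:            # always flees at exactly the same health value,
        return "flee_to_spawn"     # so players quickly learn to exploit the rule
    if distance_to_player < 5:
        return "melee_attack"
    if distance_to_player < 15:
        return "shoot"
    return "patrol_waypoints"      # ignores any situation it was not programmed for

if __name__ == "__main__":
    # The same inputs always produce the same answer: predictable behavior.
    print(guard_behaviour(distance_to_player=12.0, own_health=80.0))  # shoot
```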
Advanced techniques are currently used to solve these problems and to achieve NPCs with rational behavior that take logical decisions in the same way as a human player. The main advantage is that these techniques perform the search and optimization process automatically in order to find these smart strategies. Bio-inspired algorithms are the basis of many of these advanced methods, as they are a suitable approach in this regard: they are able to produce solutions of great complexity as an emergent result of the optimization process, and their adaptive capacity allows them to incorporate information provided by the user. Because of this, there are several successful proposals that follow this approach. For instance, co-evolution [36] is one of the heuristic techniques inspired by the principles of natural evolution that has been widely used in videogame AI programming. In [34] the author presents research that was capable of evolving, through competitive co-evolution, the morphology and behaviour of virtual creatures that interact in a predator/prey environment. Other interesting papers used co-evolution to obtain game strategies for artificial players of a war game called Tempo [2,3,16,26]. Machine learning is used as well when modeling the behavior of artificial players. The authors of [31] have used self-organizing maps in order to improve the maneuvering of platoons in a real-time strategy game. By analyzing data obtained from the sensors, the authors of [32] have developed an algorithm for an artificial pilot so that it is able to learn the race track and drive through it autonomously. In [9], the authors obtained several features from the maps of a real-time strategy game and used them to determine an NPC's behavior.

4.1 Notable challenges

We are immersed in an era of resurgence of artificial intelligence that directly influences the development of game AI and, as a consequence, the generation of decision-making mechanisms for NPCs provides an exciting challenge to the scientific community, as this goal can be approached from many different points of view. In the following we enumerate some of the most exciting directions in which the development of game NPC behaviors can be pursued.

Human-like behaviors. Current players demand the highest quality opponents, which basically means obtaining enemies that exhibit intelligent behavior. In addition, especially in online games, it is well known that players enjoy playing against other human players; however, it is also known that many of the players involved in massively multiplayer online (MMO) games are bots created by the game developers, and this can reduce the immersion of the player in the game. Therefore, developers make a significant effort (in terms of funds) to generate bots that simulate playing in a human style, with the aim of providing human players the sensation of facing other 'human' players (that in fact might be non-human). In a wider context, this can be translated into an adaptation of the Turing Test [38] to the field of videogame development. The basic idea is that an NPC that plays like a human might be considered a human player, as the NPC could not be distinguished from a human player (assuming the judge who assesses the humanity of the bot is not allowed to see, physically speaking, the players). However, one of the main problems that developers find when pursuing the objective of generating human-like bots is that it is not easy to evaluate what 'human-like intelligence' means for a bot in videogames.
This is precisely one of the main problems, a hard problem, and it moves the context to a psychological scenario that adds complexity to its solution. In this context, the 2K BotPrize is a competition that proposes an interesting adaptation of the Turing test in the context of the well-known FPS game Unreal Tournament 2004, a multi-player online FPS game in which enemy bots are controlled by some kind of game AI. More information on this is provided below.

General Game Playing (GGP). Can a bot play different games without being previously and specifically trained for them? This is basically the question that underlies the research on generating automated general game players. In some sense, this issue is related to the creation of human-like behaviors, as a general player mimics a human who learns the rules of a game and subsequently is able to play it without being previously trained on it. The skill to play would be acquired with game experience, and this is another of the fundamentals underlying the GGP concept. As said in [4], "A general game player (GGP) is a computer program that plays a range of games well, rather than specializing in any one particular game". Recently, GGP obtained public recognition via DeepMind, an artificial intelligence, developed by a private company associated with the giant Google, that was able to master a diverse range of Atari 2600 games; this general player consists of a combination of Deep Neural Networks with Reinforcement Learning [23]. GGP opens grand challenges not only for the game development community, but for society in general.

Other issues. The recent boom of casual games played on mobile devices means that both the design and gameplay of games demand resources that appear and evaporate continuously during the execution of a game. This is precisely what happens in the so-called pervasive games (i.e., "games that have one or more salient features that expand the contractual magic circle of play spatially, temporally, or socially" [24]), where the gaming experience is extended out into the real world. Playing games in the physical world requires computations that should be executed on-the-fly in the user's mobile device, taking into account that players can decide to join or drop out of the game at any instant. The application of AI techniques can help to improve the immersion of the player by automatically generating new objectives or imposing constraints on the game. This is an open issue that demands more research in the near future.

4.2 Competitions

Over the past few years, different competitions have appeared where researchers have the opportunity to compare their strategies and algorithms in specific scenarios and games. Some of the most important are listed along with a brief description:

2K BotPrize (http://botprize.org/): The objective is to develop an Unreal NPC capable of tricking human players into believing it is human as well.
StarCraft AI Competition (http://webdocs.cs.ualberta.ca/~cdavid/starcraftaicomp/): Annual competition of StarCraft NPCs that fight each other to be the winner.
Simulated Car Racing Competition (http://cig.dei.polimi.it/): The objective is to develop an artificial car driver that competes in a virtual racing championship.
GVG-AI (http://www.gvgai.net/): The General Video Game AI Competition is a competition where the artificial players should be capable of playing several game genres, trying to be as generic as possible.

There is a problem with these competitions: the challenges are very specific and closely linked to the game on which they are played.
Thus, strategies for winning over-specialize in exploiting the features of the game itself, but deliver a poor performance when they are used in another game. Therefore, another possible challenge is to design generic competitions whose entries are able to give good results not only in a single game or environment, but in several of them, something that is already being tackled in the previously outlined GVG-AI competition.

5 Tools and frameworks

This section is devoted to presenting the tools and frameworks that the scientific community has at its disposal for testing and validating the results obtained during research. Nowadays there are many tools freely available, so below there is a collection of the most used ones with their main features, to serve as a reference list for researchers in artificial intelligence and videogames.

Fig. 2. A screenshot of StarCraft

ORTS (Open Real-Time Strategy) [5] is a real-time strategy game designed specifically as a research tool and published under the GPL (GNU Public License). It features an open message protocol, and the client application lets researchers analyze the performance of their algorithms by playing games in a controlled environment where the simulation takes place on the server side. Another strategy game that is widely used as a research tool is StarCraft (http://us.blizzard.com/en-us/games/sc/), which features a software library, BWAPI (http://bwapi.github.io), that helps to connect the game engine with AI strategies (see Figure 2). Furthermore, RoboCode (http://robocode.sourceforge.net/) is a platform whose objective is to develop a combat robot using Java or .NET to fight against other robots in real time.

Planet Wars (http://planetwars.aichallenge.org/) and ANTS (http://ants.aichallenge.org/) are two games developed for the AI competitions hosted by Google in 2010 and 2011, respectively. The former is a space conquest game for many players whose objective is to conquer all the planets on a map; the latter is also a multiplayer game where every player controls a set of ants whose objective is to gather food and conquer their opponents' anthills. In Vindinium (http://vindinium.org/) the player has to take control of a legendary hero, using the programming language of her choice, to fight against other AIs for a predetermined number of turns. The hero with the greatest amount of gold wins. Eryna (http://eryna.lcc.uma.es/) [6] is another tool created to support research on AI applied to videogames. It is a real-time multiplayer game that lets the user launch games between several NPCs and evaluate the results. Its fundamental components are: a game engine that follows an authoritative server architecture and handles several connections and processes concurrently, a fully customizable AI module that lets the researcher develop her own NPCs, and a module for procedural content generation capable of generating new maps. SpelunkBots (http://t2thompson.com/projects/spelunkbots/) is a tool-set developed from the source code of the platform game Spelunky. It provides researchers an easy way to code an artificial player for this game. The tool has been developed by Daniel Scales [33].

6 Conclusions

The challenges in the research lines that we have mentioned throughout this paper are huge and certainly affect other areas beyond the field of video games. For instance, the generation of quasi-human behavior is something that is already being investigated and traditionally has its seed in the famous Turing test.
The possibilities opened up by applying science to videogames are vast: the integration of feelings in artificial players and the option to build a direct channel between them and the sentimental perception of the player through the so-called Affective Computing. Regarding procedural content generation, it has been shown that it is a hot field in academia, with a large number of papers related to it. Moreover, the videogame industry is successfully using many of the advances obtained by academia, although there are many challenges yet to be tackled in this sense. We end this paper mentioning that there are many areas related to the use of Computational/Artificial Intelligence that have not been specifically described here, where researchers might find additional challenges, such as player modeling, computational narrative and AI-assisted game design, among others. We are dealing with stimulating challenges not only for the near future, but for the present.

Acknowledgements

This work has been partially supported by Junta de Andalucía within project P10-TIC-6083 (DNEMESIS, http://dnemesis.lcc.uma.es/wordpress/), by Ministerio español de Economía y Competitividad under project TIN2014-56494-C4-1-P (UMA-EPHEMECH, preselected as granted), and by Universidad de Málaga, Campus de Excelencia Internacional Andalucía Tech.

References

1. Association, E.S.: Essential facts about the computer and video game industry (2015)
2. Avery, P., et al.: Coevolving a computer player for resource allocation games: using the game of Tempo as a test space. Ph.D. thesis, School of Computer Science, University of Adelaide (2008)
3. Avery, P.M., Michalewicz, Z.: Static experts and dynamic enemies in coevolutionary games. In: IEEE Congress on Evolutionary Computation. pp. 4035–4042 (2007)
4. Browne, C., Togelius, J., Sturtevant, N.R.: Guest editorial: General games. IEEE Trans. Comput. Intellig. and AI in Games 6(4), 317–319 (2014)
5. Buro, M.: ORTS: A hack-free RTS game environment. In: Schaeffer, J., et al. (eds.) Computers and Games. Lecture Notes in Computer Science, vol. 2883, pp. 280–291. Springer (2002)
6. Collazo, M.N., Leiva, A.J.F., Porras, C.C.: Eryna: una herramienta de apoyo a la revolución de los videojuegos. In: Camacho, D., Gómez-Martín, M.A., González-Calero, P.A. (eds.) Proceedings 1st Congreso de la Sociedad Española para las Ciencias del Videojuego, CoSECiVi 2014, Barcelona, Spain, June 24, 2014. CEUR Workshop Proceedings, vol. 1196, pp. 173–184. CEUR-WS.org (2014)
7. Collins, K.: An introduction to procedural music in video games. Contemporary Music Review 28(1), 5–15 (2009)
8. Csikszentmihalyi, M., Csikzentmihaly, M.: Flow: The psychology of optimal experience, vol. 41. HarperPerennial, New York (1991)
9. Fernández-Ares, A., García-Sánchez, P., Mora, A.M., Merelo, J.: Adaptive bots for real-time strategy games via map characterization. In: Computational Intelligence and Games (CIG), 2012 IEEE Conference on. pp. 417–721. IEEE (2012)
10. Ferreira, L., Toledo, C.: A search-based approach for generating angry birds levels. In: Computational Intelligence and Games (CIG), 2014 IEEE Conference on. pp. 1–8. IEEE (2014)
11. Font, J., Mahlmann, T., Manrique, D., Togelius, J.: A card game description language. In: Esparcia-Alcázar, A. (ed.) Applications of Evolutionary Computation, Lecture Notes in Computer Science, vol. 7835, pp. 254–263. Springer Berlin Heidelberg (2013)
12. Freeman, D.: Creating emotion in games: The craft and art of emotioneering™. Comput. Entertain. 2(3), 15 (Jul 2004)
13. Halim, Z., Baig, A.R., Mujtaba, H.: Measuring entertainment and automatic generation of entertaining games. International Journal of Information Technology, Communications and Convergence 1(1), 92–107 (2010)
14. Hendrikx, M., Meijer, S., Van Der Velden, J., Iosup, A.: Procedural content generation for games: A survey. ACM Trans. Multimedia Comput. Commun. Appl. 9(1), 1:1–1:22 (Feb 2013)
15. Johansson, A., Dell'Acqua, P.: Emotional behavior trees. In: Computational Intelligence and Games (CIG), 2012 IEEE Conference on. pp. 355–362. IEEE (2012)
16. Johnson, R., Melich, M., Michalewicz, Z., Schmidt, M.: Coevolutionary Tempo game. In: Evolutionary Computation, CEC'04, Congress on. vol. 2, pp. 1610–1617 (2004)
17. Lanzi, P.L., Loiacono, D., Stucchi, R.: Evolving maps for match balancing in first person shooters. In: Computational Intelligence and Games (CIG), 2014 IEEE Conference on. pp. 1–8. IEEE (2014)
18. Lara-Cabrera, R., Cotta, C., Fernández-Leiva, A.J.: A Review of Computational Intelligence in RTS Games. In: Ojeda, M., Cotta, C., Franco, L. (eds.) 2013 IEEE Symposium on Foundations of Computational Intelligence. pp. 114–121 (2013)
19. Lidén, L.: Artificial stupidity: The art of intentional mistakes. AI Game Programming Wisdom 2, 41–48 (2003)
20. Lucas, S.M., Mateas, M., Preuss, M., Spronck, P., Togelius, J. (eds.): Artificial and Computational Intelligence in Games, Dagstuhl Follow-Ups, vol. 6. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik (2013)
21. Malone, T.W.: What makes things fun to learn? Heuristics for designing instructional computer games. In: Proceedings of the 3rd ACM SIGSMALL symposium and the first SIGPC symposium on Small systems. pp. 162–169. ACM (1980)
22. Millington, I., Funge, J.: Artificial intelligence for games. CRC Press (2009)
23. Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., Graves, A., Riedmiller, M., Fidjeland, A.K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S., Hassabis, D.: Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015)
24. Montola, M., Stenros, J., Waern, A.: Pervasive Games. Morgan Kaufmann, Boston (2009)
25. Nogueira, M., Cotta, C., Leiva, A.J.F.: On modeling, evaluating and increasing players' satisfaction quantitatively: Steps towards a taxonomy. In: Di Chio, C., et al. (eds.) Applications of Evolutionary Computation - EvoGAMES. Lecture Notes in Computer Science, vol. 7248, pp. 245–254. Springer (2012)
26. Nogueira, M., Cotta, C., Leiva, A.J.F.: An analysis of hall-of-fame strategies in competitive coevolutionary algorithms for self-learning in RTS games. In: Nicosia, G., Pardalos, P.M. (eds.) Learning and Intelligent Optimization - 7th International Conference, LION 7, Catania, Italy, January 7-11, 2013, Revised Selected Papers. Lecture Notes in Computer Science, vol. 7997, pp. 174–188. Springer (2013)
27. Onuczko, C., Szafron, D., Schaeffer, J., Cutumisu, M., Siegel, J., Waugh, K., Schumacher, A.: Automatic story generation for computer role-playing games. In: AIIDE. pp. 147–148 (2006)
28. Peña, L., Ossowski, S., Peña, J.M., Sánchez, J.Á.: EEP - A lightweight emotional model: Application to RPG video game characters. In: Cho, S., Lucas, S.M., Hingston, P. (eds.)
2011 IEEE Conference on Computational Intelligence and Games, CIG 2011, Seoul, South Korea, August 31 - September 3, 2011. pp. 142–149. IEEE (2011)
29. Peña, L., Peña, J.M., Ossowski, S.: Representing emotion and mood states for virtual agents. In: Klügl, F., Ossowski, S. (eds.) Multiagent System Technologies - 9th German Conference, MATES 2011, Berlin, Germany, October 6-7, 2011, Proceedings. Lecture Notes in Computer Science, vol. 6973, pp. 181–188. Springer (2011)
30. Picard, R.W.: Affective computing. MIT Media Laboratory (1995)
31. Preuss, M., Beume, N., Danielsiek, H., Hein, T., Naujoks, B., Piatkowski, N., Stür, R., Thom, A., Wessing, S.: Towards intelligent team composition and maneuvering in real-time strategy games. IEEE Transactions on Computational Intelligence and AI in Games 2(2), 82–98 (2010)
32. Quadflieg, J., Preuss, M., Kramer, O., Rudolph, G.: Learning the track and planning ahead in a car racing controller. In: Computational Intelligence and Games (CIG), 2010 IEEE Symposium on. pp. 395–402. IEEE (2010)
33. Scales, D., Thompson, T.: SpelunkBots API - an AI toolset for Spelunky. In: Computational Intelligence and Games (CIG), 2014 IEEE Conference on. pp. 1–8. IEEE (2014)
34. Sims, K.: Evolving 3D Morphology and Behavior by Competition. Artificial Life 1(4), 353–372 (1994)
35. Sweetser, P., Wyeth, P.: GameFlow: a model for evaluating player enjoyment in games. Computers in Entertainment (CIE) 3(3), 3 (2005)
36. Thompson, J.N.: The geographic mosaic of coevolution. University of Chicago Press (2005)
37. Togelius, J., Yannakakis, G.N., Stanley, K.O., Browne, C.: Search-based procedural content generation: A taxonomy and survey. IEEE Trans. Comput. Intellig. and AI in Games 3(3), 172–186 (2011)
38. Turing, A.: Computing machinery and intelligence. Mind 59(236), 433–460 (1950)
39. Yannakakis, G.N.: How to model and augment player satisfaction: a review. In: WOCCI. p. 21 (2008)
40. Yannakakis, G.N., Hallam, J.: Towards capturing and enhancing entertainment in computer games. In: Advances in artificial intelligence, pp. 432–442. Springer (2006)
41. Yannakakis, G.N., Hallam, J.: Towards optimizing entertainment in computer games. Applied Artificial Intelligence 21(10), 933–971 (2007)
42. Yannakakis, G.N., Hallam, J.: Real-time game adaptation for optimizing player satisfaction. Computational Intelligence and AI in Games, IEEE Transactions on 1(2), 121–133 (2009)
43. Yannakakis, G., Togelius, J.: A panorama of artificial and computational intelligence in games. Computational Intelligence and AI in Games, IEEE Transactions on PP(99), 1–1 (2014)

Improving the Performance of a Computer-Controlled Player in a Maze Chase Game using Evolutionary Programming on a Finite-State Machine

Maximiliano Miranda and Federico Peinado
Departamento de Ingeniería del Software e Inteligencia Artificial, Facultad de Informática, Universidad Complutense de Madrid, 28040 Madrid, Spain
m.miranda@ucm.es, email@federicopeinado.com

Abstract. The continuous sophistication of video games represents a stimulating challenge for Artificial Intelligence researchers. As part of our work on improving the behaviour of military units in Real-Time Strategy games we are testing different techniques and methodologies for computer-controlled players. In this paper Evolutionary Programming is explored, establishing a first approach to develop an automatic controller for the classic maze chase game Ms. Pac-Man.
Several combinations of different operators for selection, crossover, mutation and replacement are studied, creating an algorithm that changes the variables of a simple finite-state machine representing the behaviour of the player's avatar. After the initial training, we evaluate the results obtained by all these combinations, identifying the best choices and discussing the performance improvement that can be obtained with similar techniques in complex games.

Keywords. Video Game Development, Player Simulation, Artificial Intelligence, Machine Learning, Evolutionary Computing, Genetic Algorithms

1 Introduction

Video games are constantly experiencing improvements in graphics, interfaces and programming techniques. They are one of the most challenging and interesting fields of application for Artificial Intelligence (AI), considering them as a large set of "toy worlds" to explore and play with. Recently, we have started working on how to improve the behaviour of military units in Real-Time Strategy (RTS) games. In this context, different techniques and methodologies for computer-controlled players are being tested as part of that research project. In this genre it is very common to implement the behaviour of a military unit as a Finite-State Machine (FSM). Some of these machines are very complex, having multiple parameters that should be adjusted by experimentation (playtesting) and by experienced game designers. We have decided to choose a simple game to start trying different techniques and methodologies (as is the case of Evolutionary Programming). If significant results are found, we could study their applicability for improving the more complex FSMs of RTS games.

For this paper we have developed an automatic controller for a version of the game Ms. Pac-Man implemented in a software platform called "Ms. Pac-Man vs. Ghosts". The behaviour of the protagonist (Ms. Pac-Man) in the game is implemented by an FSM with some variable values that act as thresholds to control the state transitions. A genetic algorithm is used to find the best values for the performance of this FSM, aiming to prove the utility of this approach in improving the game score.

The structure of this paper is as follows: Section 2 focuses on the definitions of the concepts and the background needed to understand the rest of the paper. Section 3 presents our computer-controlled player for the Ms. Pac-Man game. Section 4 describes how we apply evolutionary computation to improve the performance (score) of the player, and Section 5 shows and discusses the results of the experiments. Finally, in Section 6 we present our conclusions, foreseeing the next steps of this project.

2 Related Work

Recently, we have started working on how to improve the behaviour of military units in RTS games with two main objectives: participating in international challenges of AI applied to video games, such as the Student StarCraft AI Tournament [1] and the AIIDE StarCraft AI Competition [2], and improving the multi-level FSMs used in Mutant Meat City, an RTS-style video game created in 2014 as an academic project. Before applying evolutionary programming to complex games, we are performing experiments using simpler games, such as Pac-Man.

Pac-Man is a popular arcade game created by Toru Iwatani and Shigeo Funaki in 1980. Produced by Namco, it has been considered an icon since its launch, not only for the videogames industry, but for twentieth century popular culture [3].
The game mechanics consist of controlling Pac-Man (a small yellow character) who eats pills (white dots) in a maze, while avoiding being chased by four ghosts that can kill him (decreasing his "number of lives"). There are also randomly-located fruits in the maze that add extra points to the score when eaten. The game is over when Pac-Man loses three lives. There are also special pills, bigger than the normal ones, which make the ghosts "edible" during a short period of time. The score grows exponentially with each ghost eaten during this period.

Ms. Pac-Man is the next version of the game, produced in 1982 by Midway Manufacturing Corporation, distributors of the original version of Pac-Man in the USA. This version is slightly faster than the original game and, in contrast with the first one, the ghosts do not have a deterministic behaviour: their path through the maze is not predefined [4]. This makes the game more difficult, and the creation of strategies to avoid being killed much more challenging. Over the years there have been several proposals in academia related to using AI for maze chase games such as Pac-Man, both for controlling the protagonist and the antagonists of the game. The Ms. Pac-Man vs. Ghosts League was a competition for developing completely automated controllers for Ms. Pac-Man with the usual goal of optimizing the score. To support this competition, a Java framework, "Ms. Pac-Man vs. Ghosts", was created, implementing the game and making it easy to extend the classes that control the characters.

With respect to Evolutionary Computing, genetic algorithms were originally developed by Cramer [5] and popularized by Koza [6], among others. This paradigm has become a large field of study, being widely used in AI challenges and for optimizing the behaviour of intelligent automata. These algorithms could be useful for optimizing controllers in video games, even in sophisticated titles using autonomous agents [7]; but for our purposes, a controlled and limited scenario such as the simple mazes of Ms. Pac-Man is perfect for testing the methodology for evaluating the performance of a computer-controlled player.

3 A Computer-Controlled Player for the Maze Chase Game

For this work, we have designed a simple controller for Ms. Pac-Man based on the StarterPacman class of the framework "Ms. Pac-Man vs. Ghosts". The controller implemented in this class is one of the simplest in the framework, and it has been changed to transform it into a simple FSM with just three states:

Pilling: Ms. Pac-Man goes to the closest pill in order to eat it. In case there are several pills at the same distance, it will follow a preference order according to the direction toward them, clockwise starting from the top.
Ghosting: Ms. Pac-Man goes to the closest ghost in "edible" state.
Runaway: Ms. Pac-Man runs away from the closest ghost.

In the implementation of this FSM we use four numerical variables that will later compose the chromosome of the individuals of the population that the genetic algorithm will be using:

Edible_Ghost_Min_Dist: The minimum distance at which an "edible" ghost should be in order to start chasing it.
Non_Edible_Ghost_Min_Dist: The minimum distance at which a ghost should be to start running away from it.
Min_Edible_Time: The minimum time that makes it worth chasing an "edible" ghost.
Min_Pills_Amount: The minimum number of pills that should remain in the level to start going toward them proactively instead of hiding from or eating ghosts.

Using these variables, the FSM has these transition rules:

If the closest ghost is non-edible, its distance is lower than Non_Edible_Ghost_Min_Dist and the number of pills near Ms. Pac-Man is lower than Min_Pills_Amount, the state changes to Runaway.
If the closest ghost is edible, its distance is lower than Edible_Ghost_Min_Dist and the number of pills near Ms. Pac-Man is lower than Min_Pills_Amount, the state changes to Ghosting.
In any other case, the state changes to Pilling.
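The following is a minimal sketch of how these thresholds could drive the transitions just described. It is written in Python for brevity rather than in the Java of the "Ms. Pac-Man vs. Ghosts" framework, the game-state inputs (closest ghost distance, edibility, remaining edible time, nearby pill count) are assumed rather than taken from the framework's API, and folding Min_Edible_Time into the Ghosting condition is an assumption, since the rules above do not say where that threshold is checked.

```python
from dataclasses import dataclass

@dataclass
class Thresholds:
    # The four evolvable parameters (genes scaled to the 0..100 range by the GA).
    edible_ghost_min_dist: float
    non_edible_ghost_min_dist: float
    min_edible_time: float
    min_pills_amount: float

def next_state(t: Thresholds, ghost_dist: float, ghost_edible: bool,
               edible_time_left: float, pills_nearby: int) -> str:
    """Apply the transition rules of the three-state controller."""
    few_pills = pills_nearby < t.min_pills_amount
    if not ghost_edible and ghost_dist < t.non_edible_ghost_min_dist and few_pills:
        return "RUNAWAY"
    if (ghost_edible and ghost_dist < t.edible_ghost_min_dist and few_pills
            and edible_time_left >= t.min_edible_time):
        return "GHOSTING"
    return "PILLING"

if __name__ == "__main__":
    t = Thresholds(40, 20, 10, 30)
    print(next_state(t, ghost_dist=15, ghost_edible=False,
                     edible_time_left=0, pills_nearby=12))  # RUNAWAY
```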
4 Evolutionary Optimization of the Computer-Controlled Player

Now that the automatic Ms. Pac-Man player controller has been explained, we perform the study to test whether genetic algorithms can improve the FSM in terms of performance in the game.

4.1 Fitness Function

Genetic algorithms require a fitness (or evaluation) function to assign a score to each "chromosome" of a population [8]. In this case the score of the game itself is used: the game is executed with the parameters generated by the algorithm, and average values are calculated over a constant number of game sessions played by each phenotype (set to 10). The average score from this set of played games acts as the fitness function for our algorithm.

4.2 Genetic Algorithm

When creating the genetic algorithm, a codification based on floating point genes has been implemented for the chromosome of the individuals. These genes are real numbers that take values between 0 and 1. The values of the genes are multiplied by 100 in order to evaluate the results, since the FSM of the Ms. Pac-Man controller needs values between 0 and 100. Thus, every individual of the population represents a Ms. Pac-Man controller. We have implemented two selection operators, six crossovers, two mutations and four substitutions (also called regrouping), in order to perform different tests and determine which combination of operators gets the best results.

Selection. These are our two types of selection:

Selection by Ranking: Individuals are ordered in a list according to their fitness. The probability of an individual being chosen for the crossover is higher the higher its average score is.
Selection by Tournament: N individuals of the population are selected randomly. Among these individuals the one with the best fitness value is selected.

Crossover. These are our six types of crossover:

One-Point-based Crossover: The parental chromosome genes are interchanged from a given gene position.
Multi-Point-based Crossover: The parental chromosome genes are interchanged between two given positions.
Uniform Crossover: Two progenitors take part and two new descendants are created. A binary mask determines the division of the genes which are going to be crossed.
Plain Crossover: N descendants are generated. The value of the gene of the descendant at position i is chosen randomly in a range defined by the genes of the progenitors located at the same position.
Combined Crossover: This is a generalization of the plain one, called BLX-alpha. N descendants are generated. The value of the gene at position i in the descendant is chosen randomly from an interval.
Arithmetic Crossover: Two new descendants are generated according to an arithmetic operation. The value of gene i in descendant X is a weighted combination of the genes of the progenitors (with A and B the progenitors and Ai the value of gene i of chromosome A), and the value of gene i in descendant Y is the complementary combination; r represents a variable real number, and for this experiment it is set to 0.4.
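The two formulas of the arithmetic crossover did not survive the transcription. A common formulation that matches the description above, but which is an assumption here, is Xi = r·Ai + (1−r)·Bi and Yi = r·Bi + (1−r)·Ai. A minimal sketch under that assumption:

```python
def arithmetic_crossover(a, b, r=0.4):
    """Assumed form of the arithmetic crossover: each child gene is a weighted
    average of the parents' genes at the same position (weights r and 1-r)."""
    x = [r * ai + (1 - r) * bi for ai, bi in zip(a, b)]
    y = [r * bi + (1 - r) * ai for ai, bi in zip(a, b)]
    return x, y

if __name__ == "__main__":
    parent_a = [0.10, 0.80, 0.30, 0.50]
    parent_b = [0.90, 0.20, 0.70, 0.40]
    child_x, child_y = arithmetic_crossover(parent_a, parent_b)
    print([round(v, 2) for v in child_x])  # [0.58, 0.44, 0.54, 0.44]
    print([round(v, 2) for v in child_y])  # [0.42, 0.56, 0.46, 0.46]
```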
Mutation. These are our two types of mutation:

Uniform Mutation: A gene is randomly selected and it mutates. The value of the gene is replaced by another randomly generated one.
Mutation by Interchange: Two genes are selected and they interchange positions.

Substitution. These are our four types of substitution:

Substitution of the Worst: The descendants replace the individuals with the worst fitness in the whole population.
Random Substitution: The individuals that are going to be substituted are chosen randomly.
Substitution by Tournament: Groups of N individuals are selected and the worst of each group is replaced by a descendant.
Generational Substitution: The descendants replace their own parents.

At the beginning, the population is initialized with a certain number of individuals (100 by default), all of them created with random genes. Each one is evaluated before it is added to the population structure (a tree structure is used to keep the population ordered by the fitness value of each individual). Then, the minimum, maximum and average fitness of this population is calculated and the next generation is produced. This process is repeated several times (500 generations of individuals are created by default).

5 Results and Discussion

As mentioned, using these operators of selection, crossover, mutation and substitution of individuals, we have been able to test different combinations of operators, obtaining the results shown below. Instead of testing all the possible combinations (96 different experiments) and studying the interactions between operators one by one, as a first exploration of the problem we have taken a different approach. We have created a pipeline of "filters" for the operators, using heuristics based on principles of Game Design, so that only the operators offering the best performing results in their category are selected and the rest are discarded for the remaining experiments. The graphics displayed below represent the results of the experiments. The X axis represents the number of the generation produced and the Y axis the fitness values obtained in that generation.

5.1 Selection of the Substitution Operator

These operators are set: selection by ranking, uniform crossover and uniform mutation; and the different substitution operators are tested. See Fig. 1 and Fig. 2.

Fig. 1. The Worst and Random substitution

Fig. 2. Tournament and Generational substitution

As can be seen in the results, the Substitution of the Worst operator was the one with the best results in this algorithm. The random one is a little inconsistent, with big variations that worsen the average fitness (it is even worse: it discarded the best individual several times). The substitution by Tournament seems to work quite well, but in other experiments, not shown in these graphs, it never improves the best individual. Finally, the generational replacement just does not improve anything. Therefore, the Substitution of the Worst operator is chosen as the best operator in its category.

5.2 Selection of the Mutation Operator

The next operators are set: selection by ranking, uniform crossover, and substitution of the Worst (established in the previous point). Then, the different mutation operators are tested. The combination with the uniform mutation has already been tested in the previous step, so we only have to test mutation by interchange. See Fig. 3.

Fig. 3. Mutation by interchange
This method of mutation slightly improves on the Uniform mutation operator, both in the average and the maximum fitness, where it achieves some important jumps. Therefore we chose this method.

5.3 Selection of the Crossover Operator

After setting the substitution operator (the Worst) and the mutation operator (by interchange), it is time to test the crossover operators (except the uniform one, because it was tested in the first step of this pipeline). See Fig. 4, Fig. 5 and Fig. 6.

Fig. 4. One-point-based and Multi-point-based Crossover

Fig. 5. Plain and Combined Crossover

Fig. 6. Arithmetic Crossover

The One-Point-based operator sometimes gets better results than the Uniform one, but other times it does not improve on it. The Multi-Point-based one improves the results much more. The plain crossover produces few hops (improvements) in the best individual, but these hops are very large, more than with the Multi-Point one. The combined crossover also produces good results but, again, it does not produce improvements as good as the plain one. Finally, the arithmetic crossover produced a great improvement in the worst individuals (and thus in the average fitness), however it does not improve the best one. Therefore the Plain crossover is selected for the remaining experiments.

5.4 Choice of the Selection Operator

Once the operators of substitution (the Worst), mutation (by Interchange) and crossover (plain) are set, it is the turn to test the selection operators, in this case, the selection by tournament. See Fig. 7.

Fig. 7. Selection by Tournament
Now we plan to repeat similar experiments with other combinations of operators, using a more rigorous approach, at the same time we add variations in some of the operators (for instance, changing the points in the crossing methods or modifying the function in the arithmetic crossing). Of course, our roadmap includes increasing the complexity of the FSM and starting to explore a simple strategy game. References 1. Student StarCraft AI Tournament (SSCAIT) http://www.sscaitournament.com/ 2. AIIDE StarCraft AI Competition http://webdocs.cs.ualberta.ca/~cdavid/starcraftaicomp/index.shtml 3. Goldberg, H.: All your base are belong to us: How fifty years of videogames conquered pop culture. Three Rivers Press (2011) 4. Kent, S.L.: The ultimate history of video games: From Pong to Pokemon and beyond… The story behind the craze that touched our lives and changed the world. pp. 172-173. Prima Pub (2001) 5. Cramer, N.: A representation for the adaptive generation of simple sequential programs. International Conference on Genetic Algorithms and their Applications. Carnegie-Mellon University, July 24- 26 (1985) 6. Koza, J.: Genetic programming: on the programming of computers by means of natural selection. MA: The MIT press, Cambridge (1992) 22 11 7. Mads, H.: Autonomous agents: Introduction. (2010). Retrieved January 19, 2013 from http://www.cs.tcd.ie/Mads.Haahr/CS7056/notes/001.pdf 8. Melanie, M.: An Introduction to genetic algorithms (Complex adaptive systems). pp. 7-8. A Bradford Book (1998) 23 Predicting the Winner in Two Player StarCraft Games ∗ Antonio A. Sánchez-Ruiz Dep. Ingenierı́a del Software e Inteligencia Artificial Universidad Complutense de Madrid (Spain) antsanch@fdi.ucm.es Abstract. In this paper we compare different machine learning algorithms to predict the outcome of 2 player games in StarCraft, a wellknown Real-Time Strategy (RTS) game. In particular we discuss the game state representation, the accuracy of the prediction as the game progresses, the size of the training set and the stability of the predictions. Keywords: Prediction, StarCraft, Linear and Quadratic Discriminant Analysis, Support Vector Machines, k-Nearest Neighbors 1 Introduction Real-Time Strategy (RTS) games are very popular testbeds for AI researchers because they provide complex and controlled environments on which to test different AI techniques. Such games require the players to make decisions on many levels. At the macro level, the players have to decide how to invest their resources and how to use their units: they could promote resource gathering, map exploration and the creation of new bases in the map; or they could focus on building defensive structures to protect the bases and training offensive units to attack the opponents; or they could invest in technology development in order to create more powerful units in the future. At the micro level, players must decide how to divide the troops in small groups, where to place them in the map, what skills to use and when, among others. And all these decision have to be reevaluated every few minutes because RTS games are very dynamic environments due to the decisions made by the other players. Most of the literature related to AI and StarCraft focuses on the creation of bots that use different strategies to solve these problems. There are even international competitions in which several bots play against each other testing different AI techniques [5, 4, 3]. 
In this paper we use a different approach: our bot does not play, but acts as an external observer of the game. Our goal is to be able to predict the winner of the game with a certain level of confidence based on the events occurring during the game. To do so, we have collected data from 100 different 2-player games, and we have used them to train and compare different learning algorithms: Linear and Quadratic Discriminant Analysis, Support Vector Machines and k-Nearest Neighbors.

The rest of this paper is organized as follows. The next section describes StarCraft, the RTS game that we use in our experiments. Section 3 explains the process to extract the data for the analysis and the features chosen to represent the game state. Section 4 describes the different data mining classifiers that we use to predict the winner. Next, Section 5 analyzes the predictions produced by the different classifiers and the accuracy that we are able to reach. The paper concludes with a discussion of the related work, conclusions and some directions for future work.

2 StarCraft

StarCraft (http://us.blizzard.com/en-us/games/sc/) is a popular Real-Time Strategy game in which players have to harvest resources, develop technology, build armies combining different types of units and defeat the opponents. Players can choose among 3 different races, each with its own types of units, strengths and weaknesses. The combination of different types of units and the dynamic nature of the game force players to adapt their strategies constantly, creating a really addictive and complex environment. Because of this, StarCraft has become a popular testbed for AI researchers, who can create their own bots using the BWAPI framework (http://bwapi.github.io/). In this paper we will focus on just one of the three available races: the Terrans, who represent the human race in this particular universe.

At the beginning of the game (see Figure 1), each player controls only one building, the command center, and a few collecting units. As the game progresses, each player has to collect resources, build new buildings to develop technology and train stronger troops in order to build an army and defeat the opponents. Figure 2 shows the same game after one hour of play, and now both players control several different units. In fact, the mini-map in the bottom left corner of the screen reveals the location of both armies (blue and red dots), and the game seems balanced because each player controls about half of the map (in this example we have removed the fog-of-war that usually hides the parts of the map that are not visible to the current player).

3 Data Collection and Feature Selection

In order to collect data to train the different classifiers we need to play several games. Although StarCraft forces the existence of at least one human player in the game (note that "human" players are actually the ones controlled by bots through BWAPI, while "computer" players are controlled by the game AI), we have found a way to make the internal AI that comes implemented in StarCraft play against itself.

Fig. 1: StarCraft: first seconds of the game.
Fig. 2: StarCraft: state of the game after 1 hour of play.
Fig. 3: Duration of the games in minutes.
This way we are able to play as many games as we need automatically, and we are sure the game is well balanced since both players are controlled by the same AI. It is possible to modify the predefined maps included in StarCraft to make the internal game AI play against itself, using the map editor tool provided with the game. In our experiments we have modified the 2-player map Baby Steps, so that StarCraft controls the first 2 players and there is an extra third human player. There are different AI scripts available depending on the desired level of aggressiveness; we have used Expansion Terran Campaign Insane. The human player has no units, will be controlled by our BWAPI bot and has full vision of the map. Finally, we disable the normal triggers that control the end of the game so we can restart the game from our bot when one of the first 2 players wins. This last step is important because the normal triggers would end the game as soon as it started, since the third player has no units. Therefore, our bot cannot interfere in the development of the game, but it can extract any information we require.

We have created a dataset containing traces of 100 games in which each player won 50% of the time. Figure 3 shows the duration of the games in minutes. There are a few fast games in which one of the players was able to build a small army and defeat the other player quickly, but most games last between 45 and 100 minutes. The average duration of the games is 60.83 minutes.

Figure 4 shows the evolution of resources and units of one player, computed as the average values over the 100 games. The x-axis represents time as a percentage of the game duration, so we can uniformly represent games with different durations, and the y-axis shows the amount of resources (left image) and the number of buildings and troops (right image). Regarding resources, we see that during the first quarter of the game the player focuses on gathering resources that will be spent during the second quarter, probably building an army and developing technology. During the second half of the game resources do not change so much, probably because there are not so many resources left in the map and the player has to invest them more carefully. Regarding troops and buildings, the initial strategy is to build an army as fast as possible, while the construction of buildings seems more linear. During the second half of the game there are more buildings than troops on the map, but we need to take into account that some of those buildings are defensive structures, like anti-air turrets or bunkers, that also play a role in combat. The final fall in the number of troops and buildings corresponds to the last attacks, in which half of the time the player is defeated.

Fig. 4: Available resources, buildings and troops as the game progresses.

During the games we collect traces representing the state of the game at a given time.
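A trace can be thought of as a flat record sampled every 5 seconds and labeled, once the game ends, with the final winner. The snippet below is only a schematic illustration of that sampling loop; the field names and the get_player_state helper are hypothetical, and the real data extraction through BWAPI is not shown here.

```python
from dataclasses import dataclass, asdict

@dataclass
class Trace:
    game: int
    frame: int            # 18 game frames correspond to roughly 1 second
    features_p1: dict     # gas, minerals and unit counts of player 1
    features_p2: dict     # the same 28 features for player 2
    winner: int           # label added once the game is over

def collect_traces(game_id, game_length_frames, get_player_state):
    # get_player_state(frame, player) is a placeholder for the real
    # BWAPI-based extraction of resources and unit counts.
    traces = []
    for frame in range(0, game_length_frames, 5 * 18):   # one trace every 5 s
        traces.append(Trace(game_id, frame,
                            get_player_state(frame, 1),
                            get_player_state(frame, 2),
                            winner=-1))                  # unknown until the end
    return traces

def label_traces(traces, winner):
    # Every trace of the game receives the same label: the final winner.
    for t in traces:
        t.winner = winner
    return [asdict(t) for t in traces]
```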
Each trace is represented using a vector of features labeled with the winner of the game (see Table 1).

game  frame  gas1  minerals1  scv1  marine1  [...]  gas2  minerals2  scv2  marine2  [...]  winner
1     9360   2936  2491       18    23       ...    2984  2259       20    26       ...    1
1     9450   2952  2531       18    20       ...    3000  2315       20    20       ...    1
1     9540   2968  2571       18    14       ...    3024  2371       20    14       ...    1
1     9630   2892  2435       18    12       ...    2940  2219       20    7        ...    1

Table 1: Features selected to represent each game state (traces). We store the game and current time, the strength of each player (resources, troops and buildings) and the winner.

We try to capture the strength of each player using the available resources and the number of units of each particular type controlled at the current time. The table also shows the game and the current frame (1 second corresponds to 18 game frames) for clarity, but we do not use these values to predict the winner. We extract one trace every 5 seconds, collecting an average of 730 traces per game. There are 2 different types of resources (minerals and gas), 15 different types of troops and 11 different types of buildings in the Terran race alone, so we need a vector of 28 features to represent each player in the current state. We could have decided to represent the strength of each player using an aggregation function instead of this high-dimensional representation but, since this is a strategy game, we hope to be able to automatically learn which combinations of units are more effective.

4 Classification algorithms

We will use the following classification algorithms in the experiments:

– Linear Discriminant Analysis (LDA) [10] is a classical classification algorithm that uses a linear combination of features to separate the classes. It assumes that the observations within each class are drawn from a Gaussian distribution with a class-specific mean vector and a covariance matrix common to all the classes.
– Quadratic Discriminant Analysis (QDA) [11] is quite similar to LDA, but it does not assume that the covariance matrices of the classes are identical, resulting in a more flexible classifier.
– Support Vector Machines (SVM) [9] have grown in popularity since they were developed in the 1990s, and they are often considered one of the best out-of-the-box classifiers. SVMs can efficiently perform non-linear classification using different kernels that implicitly map their inputs into high-dimensional feature spaces. In our experiments we tested 3 different kernels (linear, polynomial and radial basis), obtaining the best results with the polynomial one.
– k-Nearest Neighbors (KNN) [2] is a type of instance-based learning, or lazy learning, where the function to learn is only approximated locally and all computation is deferred until classification. The KNN algorithm is among the simplest of all machine learning algorithms, and yet it has shown good results in several different problems. The classification of a sample is performed by looking for the k nearest training samples (in Euclidean distance) and deciding by majority vote.
– Weighted k-Nearest Neighbors (KKNN) [12] is a generalization of KNN that retrieves the nearest training samples according to the Minkowski distance and then classifies the new sample based on the maximum of summed kernel densities. Different kernels can be used to weight the neighbors according to their distances (for example, the rectangular kernel corresponds to standard un-weighted KNN). We obtained the best results using the optimal kernel [17], which uses the asymptotically optimal non-negative weights under some assumptions about the underlying distributions of each class.

All the experiments in this paper have been run using the R statistical software system [13] and the algorithms implemented in the packages caret, MASS, e1071, class and kknn.
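The experiments in the paper were run in R; as a rough, hedged equivalent, the same five classifiers plus the majority-class baseline could be instantiated in Python with scikit-learn as follows. The weighted KNN is only approximated here with distance weighting, since kknn's optimal kernel has no direct scikit-learn counterpart; function names and the cross-validation setup are illustrative.

```python
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import cross_val_score

def make_classifiers():
    return {
        "base": DummyClassifier(strategy="most_frequent"),  # baseline
        "lda":  LinearDiscriminantAnalysis(),
        "qda":  QuadraticDiscriminantAnalysis(),
        "svm":  SVC(kernel="poly", degree=3, C=1.0),
        "knn":  KNeighborsClassifier(n_neighbors=5),
        # rough stand-in for kknn's weighted neighbours ("optimal" kernel)
        "kknn": KNeighborsClassifier(n_neighbors=9, weights="distance", p=2),
    }

def evaluate(X, y, folds=10):
    # The paper tunes parameters with repeated 10-fold cross validation;
    # a single 10-fold run is shown here for brevity.
    return {name: cross_val_score(clf, X, y, cv=folds).mean()
            for name, clf in make_classifiers().items()}
```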
5 Experimental results

Table 2 shows the configuration parameters used in each classifier. The values in the table were selected using repeated 10-fold cross validation over a wide set of different configurations.

Classifier  Accuracy  Parameters
Base        0.5228
LDA         0.6957
QDA         0.7164
SVM         0.6950    kernel = polynomial, degree = 3, scale = 0.1, C = 1
KNN         0.6906    k = 5
KKNN        0.6908    kernel = optimal, kmax = 9, distance = 2

Table 2: Classification algorithms, configuration parameters and overall accuracy.

The overall accuracy value represents the ratio of traces correctly classified, and it has been computed as the average accuracy of 16 executions using 80% of the traces as the training set and the remaining 20% as the test set. One open problem in classification is to be able to characterize the domain in order to decide in advance which learning algorithm will perform better; we usually do not know which algorithm to choose until we have run the experiments. In our experiments all of them seem to perform very similarly. The base classifier predicts the winner according to the number of traces in the dataset won by each player (i.e. it ignores the current state to make the prediction) and is included in the table only as a baseline against which to compare the other classifiers. The best results are obtained by QDA, which reaches an accuracy of 71%. This might not seem very high, but we have to take into account that the games are very balanced, because the same AI controls both players and the distribution of resources on the map is symmetrical for both players. Besides, in this experiment we are using all the traces in the dataset, so we are trying to predict the winner even during the first minutes of each game.

Figure 5 shows some more interesting results: the average accuracy of the different classifiers as the game progresses. RTS games are very dynamic environments and just one bad strategic decision can tip the balance towards one of the players. How long do we have to wait to make a prediction with some level of confidence? For example, using LDA or QDA we only have to wait until a little over half of the game to make a prediction with an accuracy over 80%. It is also interesting that during the first half of the game the classifiers based on lazy algorithms, like KNN and KKNN, perform better, while other algorithms, like LDA and QDA, obtain better results during the second half. All the classifiers experience a great improvement in terms of accuracy when we get close to the middle of the game. We think that at this point of the game both players have already invested most of their resources according to their strategy (promoting some types of units over others, locating the defensive buildings in the bases...), so it is easier to predict the outcome of the game. When the games reach 90% of their duration, all classifiers obtain an accuracy close to 100%, but that is not surprising because at this point of the game one of the players has already lost an important part of his army.

Fig. 5: Accuracy of classifiers as the games progress.

Another important aspect when choosing a classifier is the number of samples needed during the training phase in order to reach a good level of accuracy. Figure 6 shows the accuracy of each classifier as we increase the number of games used for training. Lazy approaches like KNN and KKNN seem to work better when we use fewer than 25 games for training, and LDA is able to model the domain better when we use more than 30 games.
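Curves like those in Figures 5 and 6 can be reproduced by grouping the test traces by the percentage of the game already played and measuring accuracy per bucket. The sketch below assumes NumPy arrays X (features), y (winner labels) and t (elapsed-time percentage of each trace); these names, and the 20-bucket granularity, are illustrative rather than taken from the paper's code.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def accuracy_by_game_time(clf, X, y, t, buckets=20):
    # t[i] is the elapsed time of trace i as a percentage of its game.
    X_tr, X_te, y_tr, y_te, _, t_te = train_test_split(
        X, y, t, test_size=0.2, random_state=0)
    clf.fit(X_tr, y_tr)
    pred = clf.predict(X_te)
    edges = np.linspace(0, 100, buckets + 1)
    curve = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (t_te >= lo) & (t_te < hi)
        if mask.any():
            # Accuracy restricted to traces taken in this time window.
            curve.append((hi, accuracy_score(y_te[mask], pred[mask])))
    return curve   # e.g. [(5.0, 0.55), (10.0, 0.58), ...]
```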
Finally, we will analyze the stability of the predictions produced by each classification algorithm. It is important to obtain predictions that do not change constantly as the game progresses. Figure 7 shows the number of games, at a given time, for which the prediction did not change for the rest of the game (in this experiment we make 20 predictions during each game, at intervals of 5% of its duration). So, for example, when we reach the middle of the game, LDA no longer changes its prediction for 10 out of the 20 games we are testing.

In conclusion, in this domain and using our game state representation, LDA seems to be the best classifier. It obtains an accuracy over 80% when only 55% of the game has been played, it learns faster than the other algorithms once there are more than 30 games in the training set, and it is the most stable classifier during most of the game.

Fig. 6: Accuracy of classifiers depending on the number of games used to train them.

6 Related work

RTS games have captured the attention of AI researchers as testbeds because they represent complex adversarial systems that can be divided into many interesting subproblems [6]. Proof of this are the different international competitions that have taken place in recent years at the AIIDE and CIG conferences [5, 4, 3]. We recommend [15] and [14] for a complete overview of the existing work in this domain, the specific AI challenges and the solutions that have been explored so far.

There are several papers regarding the combat aspect of RTS games. [8] describes a fast Alpha-Beta search method that can defeat commonly used AI scripts in small RTS combat scenarios. It also presents evidence that commonly used combat scripts are highly exploitable. A later paper [7] proposes new strategies to deal with large StarCraft combat scenarios. Several different approaches have been used to model opponents in RTS games in order to predict the strategy of the opponents and then be able to respond accordingly: decision trees, KNN, logistic regression [20], case-based reasoning [1], Bayesian models [19] and evolutionary learning [16], among others.

Fig. 7: Number of games for which each classifier becomes stable at a given time.

In [18] the authors present a Bayesian model that can be used to predict the outcomes of isolated battles, as well as to predict what units are needed to defeat a given army. Their goal is to learn which combination of units (among 4 unit types) is more effective against others, minimizing the dependency on player skill. Our approach is different in the sense that we try to predict the outcome of whole games and not just the outcome of battles.

7 Conclusions and Future work

In this paper we have compared different machine learning algorithms in order to predict the outcome of 2-player Terran StarCraft games. In particular we have compared Linear and Quadratic Discriminant Analysis, Support Vector Machines and 2 versions of k-Nearest Neighbors. We have discussed the accuracy of the prediction as the game progresses, the number of games required to train the classifiers and the stability of their predictions over time. Although all the classification algorithms perform similarly, we have obtained the best results using Linear Discriminant Analysis. There are several possible ways to extend our work.
First, all our experiments take place on the same map and use the same StarCraft internal AI to control both players. In order to avoid bias and generalize our results, we will have to run more experiments using different maps and different bots. Note that it is not clear whether the accuracy results will improve or deteriorate. On the one hand, including new maps and bots will increase the diversity of the samples, making the problem potentially more complex; but, on the other hand, in this paper we have been dealing with an added difficulty that is not present in normal games: our games were extremely balanced because the same AI was controlling both players. Each bot is biased towards some way of playing, like humans, and we are not sure about the effect this may have on our predictions.

Another way to extend our work is to deal with games with more than 2 players. These scenarios are much more challenging, not only because the prediction of the winner can take values from a wider range of possibilities, but also because in these games players can work in groups as allies (forces in StarCraft terminology). On the other hand, we have addressed only one of the three available races in our experiments and, of course, in the game some units of one race are more effective against particular units of the other races.

Finally, in this paper we have chosen a high-dimensional representation of the game state that does not take into account the distribution of the units and buildings on the map, only the number of units. Nor do we consider the evolution of the game to make a prediction: we forecast the outcome of the game based on a snapshot of the current game state. It is reasonable to think that we could improve the accuracy if we considered the progression of the game, i.e., how the game got to the current state. We think there is still a lot of work to do in selecting features to train the classifiers.

References

1. Aha, D.W., Molineaux, M., Ponsen, M.: Learning to win: Case-based plan selection in a real-time strategy game. In: Proceedings of the Sixth International Conference on Case-Based Reasoning. pp. 5–20. Springer (2005)
2. Altman, N.S.: An Introduction to Kernel and Nearest-Neighbor Nonparametric Regression. American Statistician 46, 175–185 (1992)
3. Buro, M., Churchill, D.: AIIDE 2012 StarCraft Competition. In: Proceedings of the Eighth AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, AIIDE-12, Stanford, California, October 8-12, 2012 (2012)
4. Buro, M., Churchill, D.: AIIDE 2013 StarCraft Competition. In: Proceedings of the Ninth AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, AIIDE-13, Boston, Massachusetts, USA, October 14-18, 2013 (2013)
5. Buro, M., Churchill, D.: AIIDE 2014 StarCraft Competition. In: Proceedings of the Tenth AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, AIIDE 2014, October 3-7, 2014, North Carolina State University, Raleigh, NC, USA (2014)
6. Buro, M., Furtak, T.M.: RTS games and real-time AI research. In: Proceedings of the Behavior Representation in Modeling and Simulation Conference (BRIMS). pp. 51–58 (2004)
7. Churchill, D., Buro, M.: Portfolio greedy search and simulation for large-scale combat in StarCraft. In: 2013 IEEE Conference on Computational Intelligence in Games (CIG), Niagara Falls, ON, Canada, August 11-13, 2013. pp. 1–8 (2013)
8. Churchill, D., Saffidine, A., Buro, M.: Fast Heuristic Search for RTS Game Combat Scenarios.
In: Proceedings of the Eighth AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, AIIDE-12, Stanford, California, October 8-12, 2012 (2012)
9. Cortes, C., Vapnik, V.: Support-Vector Networks. Mach. Learn. 20(3), 273–297 (Sep 1995)
10. Fisher, R.A.: The Use of Multiple Measurements in Taxonomic Problems. Annals of Eugenics 7(7), 179–188 (1936)
11. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning. Springer Series in Statistics, Springer New York Inc., New York, NY, USA (2001)
12. Hechenbichler, K., Schliep, K.: Weighted k-Nearest-Neighbor Techniques and Ordinal Classification (2004), http://nbn-resolving.de/urn/resolver.pl?urn=nbn:de:bvb:19-epub-1769-9
13. James, G., Witten, D., Hastie, T., Tibshirani, R.: An Introduction to Statistical Learning: with Applications in R. Springer Texts in Statistics, Springer (2013)
14. Lara-Cabrera, R., Cotta, C., Leiva, A.J.F.: A review of computational intelligence in RTS games. In: IEEE Symposium on Foundations of Computational Intelligence, FOCI 2013, Singapore, Singapore, April 16-19, 2013. pp. 114–121 (2013)
15. Ontañón, S., Synnaeve, G., Uriarte, A., Richoux, F., Churchill, D., Preuss, M.: A Survey of Real-Time Strategy Game AI Research and Competition in StarCraft. IEEE Trans. Comput. Intellig. and AI in Games 5(4), 293–311 (2013)
16. Ponsen, M.J.V., Muñoz-Avila, H., Spronck, P., Aha, D.W.: Automatically Acquiring Domain Knowledge For Adaptive Game AI Using Evolutionary Learning. In: Proceedings, The Twentieth National Conference on Artificial Intelligence and the Seventeenth Innovative Applications of Artificial Intelligence Conference, July 9-13, 2005, Pittsburgh, Pennsylvania, USA. pp. 1535–1540 (2005)
17. Samworth, R.J.: Optimal weighted nearest neighbour classifiers. Ann. Statist. 40(5), 2733–2763 (2012)
18. Stanescu, M., Hernandez, S.P., Erickson, G., Greiner, R., Buro, M.: Predicting Army Combat Outcomes in StarCraft. In: Proceedings of the Ninth AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, AIIDE-13, Boston, Massachusetts, USA, October 14-18, 2013 (2013)
19. Synnaeve, G., Bessière, P.: A Bayesian Model for Plan Recognition in RTS Games Applied to StarCraft. In: Proceedings of the Seventh AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, AIIDE 2011, October 10-14, 2011, Stanford, California, USA (2011)
20. Weber, B.G., Mateas, M.: A Data Mining Approach to Strategy Prediction. In: Proceedings of the 5th International Conference on Computational Intelligence and Games. pp. 140–147. CIG'09, IEEE Press, Piscataway, NJ, USA (2009)

El Clustering de Jugadores de Tetris

Diana Sofía Lora Ariza
Dep. Ingeniería del Software e Inteligencia Artificial
Universidad Complutense de Madrid, Spain
dlora@ucm.es

Abstract. Behaviour analysis in the context of video games has become a very popular practice because it provides vital information about the players. In business models such as Free-to-Play (F2P) games, this information is important to increase the players'/customers' interest in the game, the number of players and the revenue obtained. Likewise, it also makes it possible to improve the design. In this work we analyse the behaviour recorded in traces of the game TetrisAnalytics. These traces contain the tactical moves made by the player during a match, and the clustering methods SimpleKMeans and EM (Expectation-Maximization) are applied to them via Weka.
1 Introduction

Video games have become sophisticated information systems. Through them, a large amount of data is obtained from the interaction of the user with the game [7]. Telemetry and game metrics have proved to be indispensable for measuring different aspects of system performance, the successful completion of projects and the analysis of their players [11]. A large amount of information is hidden in behavioural datasets; however, to obtain that information it is necessary to apply techniques that allow patterns to be extracted from the data [7]. By means of Unsupervised Learning techniques, such as clustering, it is possible to extract behaviour patterns that support design decisions and refine the monetization strategy used. The application of artificial intelligence techniques to player behaviour datasets is recent compared with other application fields [5].

In this paper, the data mining tool Weka is used to run tests on behaviour datasets in order to classify the different players in the sample. The K-means and EM algorithms are applied. Clustering methods, and particularly the k-means algorithm, are among the most popular techniques used for the evaluation of behavioural databases in the context of video games. This analysis is carried out in order to verify how effective the classification of players is in games with a small range of available moves. Moreover, the initial phase of dynamic difficulty adjustment in video games is the development of a behaviour model that will later be used to relate the players' skills to the difficulty estimated for them.

Section 2 of this paper describes the importance of analysing player behaviour in video games, one of the most widely used Unsupervised Learning techniques, and the data mining tool employed to carry out the experiments. Section 3 contains the design of the game, the variables and the algorithms used. Section 4 presents the results obtained. Finally, Section 5 discusses the conclusions and future work.

2 Player Behaviour Analysis

Behaviour analysis in video games has become a widely used practice because it provides vital information about the player population. In business models such as Free-to-Play (F2P) games, this information allows the customers/players to be better understood in order to increase revenue [5, 7]. Analysing user behaviour reveals the most attractive features of the game, the most used items and even the way in which users interact with each other. With this information, it is possible to improve the design of the most used elements, keeping the players interested and even attracting more. Likewise, classifying players according to their interaction with the game reveals the skill level of each one.

The purpose of a game is for its players to have fun. Fun derives from mastering skills and from the satisfaction of meeting the proposed challenges. After the player masters a skill, the game rewards the player for the good work and creates new goals to reach.
This creates a cycle in which the player is constantly learning, mastering skills and obtaining rewards, which keeps him interested [1]. The learning-mastery-reward cycle must be pitched at the right level, so that the game is neither easy enough to be boring nor so difficult as to be frustrating; both cases lead to the same outcome, with the user abandoning the game. By means of dynamic difficulty adjustment it is possible to modulate how the game responds to the particular skills of its players during the play session. Difficulty is considered a subjective factor that derives from the interaction between the player and the proposed challenge. Furthermore, it is not static, since the greater the player's mastery of a specific skill, the harder the tasks to be performed become [9, 12].

Classifying players based on their behaviour is part of a line of research focused on the development of customizable games [7, 17]. It allows a complete analysis of behavioural databases and produces results in the form of different profiles. With these results it is possible to improve the design of the game and the monetization strategies used [3, 10]. With Unsupervised Learning techniques, such as clustering, it is possible to reduce the dimensionality of a dataset and obtain the most relevant characteristics that distinguish the players. When the goal is to classify the behaviour of the players, clustering is used if it is unknown how their behaviour varies or if no predefined classes exist [7].

Clustering is one of the most widely used Unsupervised Learning methods across different application fields. This technique has been used in the analysis of video games as a way to find patterns in player behaviour, to benchmark games against each other, to evaluate player performance, and to design and train non-player characters. The existing algorithms for assigning objects to clusters can be classified as partial ("soft") or total ("hard"). An algorithm performs total assignment when each object is identified completely with one cluster; analogously, an algorithm performs partial assignment when an object may belong to different clusters with a certain degree of similarity [2]. Among the main categories of clustering methods are the partitioning methods, also known as centroid-based clustering [6, 8]. These methods build groups that satisfy two requirements: each group must contain at least one object, and each object can belong to only one group. Partitioning methods start by making an initial division of the data and then use an iterative object-relocation technique, moving objects from one group to another in order to increase the similarity of the elements belonging to the same cluster. The criterion used in the k-means algorithm is one of the most popular: each cluster is represented by the mean value of the objects in that cluster [8].

Weka is an easy-to-use tool that provides Data Mining and Machine Learning facilities [15, 16]. It offers different algorithms for transforming and pre-processing databases, executing data mining techniques and analysing the results.
Among the clustering algorithms implemented in Weka are k-Means and EM. The EM algorithm is an adaptation of k-means, except that it computes the probability of each instance belonging to each cluster. In clustering, EM generates a probabilistic description of the clusters in terms of the mean and standard deviation of the numeric attributes, and counts for the nominal values [14].

Clustering methods have been used to categorize player behaviour in several case studies. Missura and Gärtner [12] use k-means and support vector machines to predict dynamic difficulty adjustment in a simple game about shooting alien ships. Their case study focuses on building a difficulty model that groups the different types of players into beginner, average and expert, finding the difficulty setting associated with each group, and using these models to make predictions from just a few game traces. On the other hand, Drachen et al. [4] apply the k-means and SIVM (Simplex Volume Maximization) algorithms to a dataset of 260,000 players of the MMORPG Tera Online and the multiplayer FPS Battlefield 2: Bad Company 2. Using the technique of Drachen et al. [3], they build player profiles from the aforementioned algorithms: with k-means they obtain the most common profiles among the players, and with Archetypal Analysis via SIVM they obtain the extreme behaviours. Another similar case study was carried out by Drachen et al. [5] with the popular MMORPG World of Warcraft.

Most current academic work relies on small testbeds with behavioural data obtained through telemetry. In academic research it is difficult to carry out this kind of test because of the lack of availability of large databases. Although there are some studies of commercial games, they are still scarce; however, the collaboration between the two sectors (companies and research institutions) has improved [5].

3 Experimental Setup

To classify the behaviour of the players we use TetrisAnalytics. The implementation of this Tetris game is simple: there are 7 different pieces and, as time passes, the piece falls faster down the board. The player can perform the traditional moves (left, right, rotation and dropping the piece). The game only ends when the player loses, i.e. when the pieces reach the top of the board. The variables obtained from each match are the following:

PosX: X position where the player places the piece. This variable is an enumeration with the values x0, x1, x2, x3, x4, x5, x6, x7, x8, x9.
PosY: Y position where the player places the piece. This variable is an enumeration with the values y0, y1, y2, y3, y4, y5, y6, y7, y8, y9, y10, y11, y12, y13, y14, y15, y16, y17, y18, y19.
Rot: rotation of the piece. This variable is an enumeration with the values r0, r1, r2, r3.
NextPiece: the piece that will appear next (an integer value between 0 and 6).
Celda_X_Y (X and Y integers from 0 up to width/height - 1): state of each specific cell of the board. Its value is 0 or 1 depending on whether the cell is empty or occupied, respectively.
TipoFicha: type of piece. An enumeration p0, p1, p2, p3, p4, p5, p6.

At the end of the match, a binary file with the complete information of the game is saved.
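As a rough illustration, a single tactical move could be flattened into one feature row as sketched below. The attribute names mirror the variables just listed, but the function itself is hypothetical and does not correspond to the actual TetrisAnalytics code.

```python
def encode_move(pos_x, pos_y, rot, piece_type, next_piece, board):
    # board is assumed to be a WIDTH x HEIGHT matrix of 0/1 cells
    # (empty/occupied), matching the Celda_X_Y attributes above.
    row = {
        "PosX": f"x{pos_x}",            # x0..x9
        "PosY": f"y{pos_y}",            # y0..y19
        "Rot": f"r{rot}",               # r0..r3
        "TipoFicha": f"p{piece_type}",  # p0..p6
        "NextPiece": next_piece,        # integer 0..6
    }
    for x, column in enumerate(board):
        for y, cell in enumerate(column):
            row[f"Celda_{x}_{y}"] = cell
    return row
```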
From this binary file, ARFF files can be generated and later loaded into Weka for analysis. The ARFF generation policies used are:

PosYTipoFichaARFF: stores all the attributes of the previous policy plus the current piece type.
LastXLinesARFF: stores the information of the pieces (position, rotation, piece type and next piece), as well as the top of the board, i.e. the last non-empty rows. The "X" in the name may vary between 1 and 4, and indicates the number of rows that are stored. The position of the piece is stored relative to the Y coordinate of the top row.
EstadoCompletoARFF: stores the whole state of the game, including the position, rotation and type of the piece, as well as the next piece and the complete state of the board (free cell = 0, occupied cell = 1).

Each instance contains one tactical move of the player; that is, each instance contains the x and y position where the player places the piece, together with the other attributes included by the file generation policy.

Fig. 1. Screenshot of the game.

A classes-to-clusters evaluation is carried out with Weka's SimpleKMeans and EM clustering algorithms, where k = number of players. In this mode, Weka initially ignores the class and generates the clusters. Afterwards, in the test phase, it assigns a class to each cluster based on the class with the largest number of instances within that cluster. It then computes the error rate of the classification, based on this class-to-cluster assignment, and shows the confusion matrix [14]. The EM algorithm, in classes-to-clusters evaluation mode, computes the likelihood function, assigns classes to the clusters, and shows the confusion matrix and the number of incorrectly classified instances. Likewise, the SimpleKMeans algorithm in classes-to-clusters evaluation mode performs the class-to-cluster assignment and shows the confusion matrix, the error rate and the within-cluster sum of squared errors.

The traces evaluated correspond to 6 different players, three of which are AIs. Two of the AIs are simple: their tactic is to place the piece on the left or the right side of the board. The third has a high playing skill and its behaviour is similar to the moves of the real players. This AI, called HardCodedTacticalAI, evaluates the board according to its height (the highest occupied position) and the number of holes in it (the number of unoccupied positions in a column below the first occupied one). It performs this evaluation for each of the 4 possible rotations of the piece and selects the best placement.

4 Results

For the analysis of the TetrisAnalytics traces, two different tests are carried out. The first evaluates the classification of the players with respect to their individual tactical moves; in the second, the trace of the current move is joined with that of the next one, in order to analyse the impact that knowing the next piece has on where the pieces are placed. The analysis of player behaviour with respect to individual moves was carried out with 4312 instances from 6 players. Three of the players are AIs; one of them has good Tetris-playing skills (player 3). The other two are "dumb", and their moves consist of always placing the pieces on the left or the right side of the board (players 5 and 4, respectively), while the remaining players distribute the pieces more evenly across the board. This can be seen in Figure 2, where each colour represents one player.

Fig. 2. PosX feature of the piece placement. It varies from x0 to x9, where x0 is the leftmost position.
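The classes-to-clusters evaluation described above can be mimicked outside Weka. The sketch below uses scikit-learn's KMeans as a stand-in for SimpleKMeans (GaussianMixture would play the role of EM); it labels each cluster with its majority class, which is a simplification of Weka's one-to-one class/cluster assignment, and assumes integer-coded class labels 0..n_players-1.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import confusion_matrix

def classes_to_clusters(X, y, n_players=6, seed=0):
    # Cluster while ignoring the class, as Weka does in this mode.
    clusters = KMeans(n_clusters=n_players, n_init=10,
                      random_state=seed).fit_predict(X)
    # Simplification: each cluster takes its majority class as label
    # (Weka instead assigns classes to clusters one-to-one).
    mapping = {c: int(np.bincount(y[clusters == c]).argmax())
               for c in range(n_players)}
    predicted = np.array([mapping[c] for c in clusters])
    error_rate = float(np.mean(predicted != y))
    return error_rate, confusion_matrix(y, predicted)
```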
Through TetrisAnalytics it is possible to produce ARFF files with different sets of variables. With the PosYTipoFichaARFF generation policy, the current piece type, the next piece type, the x and y positions where the piece is placed, and the rotation of the piece in its final location are exported. With Last4LinesARFF and EstadoCompletoARFF, the information of the 4 top non-empty rows of the board and the complete state of the board, respectively, are added.

The error rates for the tests with individual tactical moves are the following:

PosYTipoFichaARFF: SimpleKMeans 70.61% (3045 instances), EM 71.12% (3067 instances).
Last4LinesARFF: SimpleKMeans 44.15% (1904 instances), EM 48.46% (2090 instances).
EstadoCompletoARFF: SimpleKMeans 45.06% (1943 instances), EM 53.43% (2304 instances).

In all cases SimpleKMeans classifies better than EM. Moreover, although it was expected that having the information of the whole board would give better results than partial information, the best classification was obtained with the information of the 4 top non-empty rows of the board. Since the evaluation is performed on the players' individual tactical moves, the complete state of the board may, up to a point, not be so relevant for the classification, because only the placement of each piece is considered, not that of a set of consecutive pieces.

In the confusion matrix (Table 1), the columns represent the predictions of the algorithm and the rows the instances of the real class. Most of the instances of players 4 and 5 fall into the clusters assigned to them (clusters 2 and 5, respectively). These two players, because of their extreme behaviour, are classified quite accurately. However, the rest of the players, since they make varied moves, may end up showing similar behaviour and thus be classified incorrectly. This happens because the space of moves available to the players is quite small, so evaluating the matches with individual moves per player can lead to a poor classification.

Table 1. Class-to-cluster assignment with SimpleKMeans, Last4LinesARFF configuration and individual moves.

        Assigned clusters
        0(JUG6)  1(JUG2)  2(JUG4)  3(JUG3)  4(JUG1)  5(JUG5)
JUG1    131      104      28       147      203      56
JUG2    190      204      113      104      99       82
JUG3    29       76       1        371      194      30
JUG4    0        0        725      0        0        34
JUG5    0        0        0        0        4        760
JUG6    145      118      14       168      149      33

Since the evaluation of the players' moves taken individually shows a considerably high error rate, the traces are also evaluated by joining, for each player and match, the instance of the current piece with that of the next one. This pairing is done because during the match the player can see the next piece to be placed on the board. Therefore, one can assume that the player places the current piece differently if he knows which piece will come next.
Thus, an analysis of two consecutive moves is performed, in order to verify whether there is a pattern between the placement of the current piece and the next piece that will be available. The tests are therefore run again with the three ARFF generation policies of TetrisAnalytics. The error rates when joining the trace of the current piece with the next one are the following:

PosYTipoFichaARFF: SimpleKMeans 60.03% (2588 instances), EM 58.33% (2515 instances).
Last4LinesARFF: SimpleKMeans 40.78% (1756 instances), EM 47.08% (2030 instances).
EstadoCompletoARFF: SimpleKMeans 52.32% (2253 instances), EM 53.032% (2296 instances).

In the case of PosYTipoFichaARFF, an important decrease in the percentage of incorrectly classified instances can be observed for both algorithms. With the EstadoCompletoARFF configuration, both algorithms reach a similar error rate, lower than in the previous test. Last4LinesARFF remains the best classified. For players JUG4 and JUG5, who show extreme behaviours, a correct classification of all their instances can be observed in the confusion matrix (Table 2) resulting from the evaluation with SimpleKMeans.

Table 2. Class-to-cluster assignment with SimpleKMeans, EstadoCompletoARFF configuration and joined consecutive moves.

        Assigned clusters
        0(JUG1)  1(JUG3)  2(JUG6)  3(JUG5)  4(JUG4)  5(JUG2)
JUG1    176      107      170      40       36       139
JUG2    124      52       139      71       122      283
JUG3    84       398      156      21       0        41
JUG4    0        0        0        0        758      0
JUG5    2        0        0        761      0        0
JUG6    116      134      174      28       12       162

Due to space restrictions, the confusion matrix resulting from the class-to-cluster assignment with the EM algorithm is not shown. However, the classification performed by this algorithm, despite its improvement, is not as good as expected: players JUG4 and JUG5 are still split across different clusters, even though their behaviour differs from that of the rest of the players evaluated. Even so, an improvement can be seen with respect to the tests with individual moves.

Statistical Test

In order to determine which of the algorithms used is the best, a corrected paired t-test is run via the Weka Experimenter. This method compares the results of N classification algorithms on the same dataset [13]. By default, the significance level is 0.05, and percent_correct is selected as the comparison field. The dataset used is the one generated with the EstadoCompletoARFF policy and two consecutive moves per instance. The result shows that the SimpleKMeans configuration (t = 50.05) outperforms EM (t = 46.48).

5 Conclusions and Future Work

For a simple game such as TetrisAnalytics, a behaviour analysis is performed on a dataset containing tactical instances from 6 players, applying the SimpleKMeans and EM algorithms via Weka. Two evaluations are carried out: first using the tactical moves individually, and then joining the current trace with the next one of the same match. Because the chosen game has few possible moves, the algorithms tend to confuse the tactical moves of different players.
In extreme cases, such as the players that always place the pieces on the left or on the right, the error rate is low; when the information about the next piece is also considered, the error rate even becomes zero for SimpleKMeans. In contrast, for players that make varied moves the classification is poor, although it improves when the traces of the current and next pieces are joined. The SimpleKMeans and EM algorithms compute the confusion matrices with which we can identify how the players' instances are assigned to each cluster. For 4 of the players, their instances are spread across all the clusters because they perform different tactical moves throughout the match. This results in similar behaviour among players, due to the small number of possible moves in the game. It can therefore be concluded that behaviour classification in games with a restricted range of available moves is not very accurate. Evaluating individual tactical moves per player does not fully describe the player's behaviour; however, players tend to place the current piece depending on the next one, so the classification improves when more than one move is taken into account.

As future work, experiments will be carried out in which more than two tactical moves of a player per match are joined. With a better classification of players, the assignment of difficulty depending on the skills shown by each one becomes more accurate, improving the play experience and the player's interest in the game.

References

1. Cook, D.: The chemistry of game design. Gamasutra (2007)
2. Drachen, A.: Introducing clustering I: Behavioral profiling for game analytics. Game Analytics Blog (May 2014), http://blog.gameanalytics.com/blog/introducing-clustering-behavioral-profiling-game-analytics.html
3. Drachen, A., Canossa, A., Yannakakis, G.: Player modeling using self-organization in Tomb Raider: Underworld. In: Proceedings of IEEE Computational Intelligence in Games (CIG) 2009, Milan, Italy (2009)
4. Drachen, A., Sifa, R., Bauckhage, C., Thurau, C.: Guns, swords and data: Clustering of player behavior in computer games in the wild. In: Proceedings of IEEE Computational Intelligence in Games 2012, Granada, Spain (2012)
5. Drachen, A., Thurau, C., Sifa, R., Bauckhage, C.: A comparison of methods for player clustering via behavioral telemetry. https://andersdrachen.files.wordpress.com (2013)
6. Drachen, A.: Introducing clustering II: Clustering algorithms. Game Analytics Blog (2014), http://blog.gameanalytics.com/blog/introducing-clustering-ii-clustering-algorithms.html
7. El-Nasr, M.S., Drachen, A., Canossa, A.: Game analytics: Maximizing the value of player data. Springer Science & Business Media (2013)
8. Han, J., Kamber, M.: Data Mining: Concepts and Techniques. Morgan Kaufmann (2006)
9. Hunicke, R.: The case for dynamic difficulty adjustment in games. In: Proceedings of the 2005 ACM SIGCHI International Conference on Advances in Computer Entertainment Technology. pp. 429-433. ACM (2005)
10. Mahlman, T., Drachen, A., Canossa, A., Togelius, J., Yannakakis, G.N.: Predicting player behavior in Tomb Raider: Underworld. In: Proceedings of the 2010 IEEE Conference on Computational Intelligence in Games (2010)
11. Mellon, L.: Applying metrics driven development to MMO costs and risks. Versant Corporation, Tech. Rep. (2009)
12. Missura, O., Gärtner, T.: Player modeling for intelligent difficulty adjustment.
In: Proceedings of the 12th International Conference on Discovery Science, Berlin (2009)
13. Rabbany, R.: Comparison of different classification methods. http://webdocs.cs.ualberta.ca/rabbanyk/research/603/short-paper-rabbany.pdf
14. Witten, I.: Using Weka 3 for clustering. http://www.cs.ccsu.edu/markov/ccsu_courses/DataMining Ex3.html
15. Witten, I.: Waikato courses. https://weka.waikato.ac.nz/dataminingwithweka/preview
16. Witten, I., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques, Second Edition (Morgan Kaufmann Series in Data Management Systems). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (2005)
17. Yannakakis, G.N., Hallam, J.: Real-time game adaptation for optimizing player satisfaction. Berlin (2009)

Evolutionary Interactive Bot for the FPS Unreal Tournament 2004

José L. Jiménez1, Antonio M. Mora2, Antonio J. Fernández-Leiva1
1 Departamento de Lenguajes y Ciencias de la Computación, Universidad de Málaga (Spain). josejl1987@gmail.com, afdez@lcc.uma.es
2 Departamento de Teoría de la Señal, Telemática y Comunicaciones. ETSIIT-CITIC, Universidad de Granada (Spain). amorag@geneura.ugr.es

Abstract. This paper presents an interactive genetic algorithm for generating a human-like autonomous player (bot) for the game Unreal Tournament 2004. It is based on a bot modelled from the knowledge of an expert human player. The algorithm provides two types of interaction: by an expert in the game and by an expert in the algorithm. Each one affects different aspects of the evolution, directing it towards better results regarding the agent's humanness (the objective of this work). An analysis of the experts' influence on the performance has been conducted, showing much better results after these interactions than with the non-interactive version. The best bot was submitted to the BotPrize 2014 competition (which seeks the most human-like bot), obtaining second position.

1 Introduction

Unreal Tournament 2004 [1], also known as UT2004 or UT2K4, is a first-person shooter (FPS) game mainly focused on the multiplayer experience, which includes several game modes to promote this style of play. It also includes a one-player mode in which the rest of the team partners (if any) and the rivals are controlled by the machine, i.e. they are autonomous; they are called bots. One of the main contributions of the Unreal series is the set of utilities that the game offers so players can create and share their own creations or modifications of the game content (mods). These can be new weapons, scenarios, or even new game modes, and many of them are, in turn, more popular among players than the original content. They can be created with the editor that the game provides (UnrealED), included with every copy.

Game Artificial Intelligence (AI) has completely changed in recent years, from the classical sets of static behavioural rules (including some random component) and the use of Finite State Machines (FSMs), to the more recent use of scripts, navigation meshes and behaviour trees. The most widespread technique for Non-Player Characters (NPCs) is probably the FSM [2]. However, nowadays the AI focuses not only on defining very competitive rivals or partners, but also on their humanness. The aim is to design 'believable' NPCs with which the player can empathise, in order to improve his/her experience and the lasting appeal of the game.
This paper presents work in this line, proposing an Interactive Genetic Algorithm (IGA) [3] which evolves a previously defined bot, modelled from the knowledge and tricks of an expert human player, named Expert UnrealBot [4]. The objective is for the bot to exhibit human-like behaviour, rather than just being a very competitive bot with very good in-game performance. This bot has been designed to play the Deathmatch mode, in which the aim of a match is to reach a given number of frags (enemies killed), or the maximum number of frags within a limited amount of time. Matches can be 1 vs 1 or involve several players fighting against each other. The presented approach includes two types of interaction: the first one is conducted by an expert in the game (an expert human UT2K4 player), who can assess the behaviour of the bot considering its humanness; the second one is performed by an expert in the algorithmic part, i.e. he/she can appraise the performance of an individual in the algorithm. We have also defined a parallel approach which can exploit a cluster of computers in order to reduce the computational time (which was very high) by a factor of ten.

2 Background and related works

2.1 Turing Test for bots

This is a variation of the classical Turing Test in which a human judge, who looks at and interacts with a virtual world (a game), must distinguish whether the other actors (players) in the game are humans or bots. This test was proposed in order to advance the fields of Artificial and Computational Intelligence in videogames, since a bot which can pass it could be considered excellent and could increase the quality of the game from the players' point of view. Moreover, the test tries to prove that the problem of AI for videogames is far from solved. Looking at other areas, the methods used in this test could be useful in virtual learning environments and in improving human-robot (or human-machine) interaction.

The Turing Test for bots is focused on a multiplayer game in which the bot has to cooperate with or fight against other players (humans or bots), making the same decisions that a human would take. This was transformed into an international competition with the following features:

– There will be (ideally) three players: a human player, a bot and a judge.
– The bot must 'simulate' being more human than the human player; however, the marks are not complementary, i.e. both players can obtain a mark between 1 and 5 (1 = Non-human, 5 = Human).
– The three players cannot be distinguished from 'outside' (they even have random names and appearances). Thus, the judge cannot be influenced by these features.
– There is no chat during the match.
– Bots cannot have omniscient powers as in other games. They can only react to the same stimuli (perceived by means of sensors) as a human player.

The human player must be an average one, in order to avoid very competitive behaviours that could also influence the judge's decision. In 2008 the first 2K BotPrize competition (BotPrize from now on) was held, in which UT2K4 was used as the 'world' for this test. The participants had to create their human-like bots using an external library, named GameBots [5], to interact with the game by means of TCP connections. The first BotPrize competition considered the Deathmatch mode and divided the rounds into 10-minute matches.
The judges were the last to join the matches and observed each player/bot exactly once, and no bot was rated as more human than a human. In the following editions (2009, 2010 and 2011) the bots' marks improved; however, they were still not able to surpass any of the human players. In any case, the maximum humanness score for the human players was just 41.4%. This demonstrates the limitations of the test (or the competition), since even appraising human behaviour is quite a complex task.

2.2 Human-like bots

The winners of BotPrize 2012 were the UT^2 bot [6] and MirrorBot [7]. The first one was created by a team from the University of Texas. This bot is based on two main ideas: the application of multiobjective neuroevolution in order to use only those combat actions which seem human-driven, and proper navigation of the map. They used stored logs of human play in order to optimize both objectives. MirrorBot was created by Mihai Polceanu (a PhD student at LAB-STICC, ENIB). It basically imitates the reactions of the opponent, using interactive techniques. The idea is that if a judge is close to the bot, the bot imitates his/her behaviour, which could clearly mislead the judge and make him/her think that the player is showing human-like behaviour. Obviously this technique is not useful if the opponent is another bot.

The bot we present here is named NizorBot, and it is an improved version of our previous ExpertBot [4]. The aim of the latter was to model the behaviour of an expert Spanish human player (one of the authors), who participated in international UT2K4 contests. It included a two-level FSM, in which the main states define the high-level behaviour of the bot (such as attack or retreat), while the secondary states (or substates) can modify this behaviour in order to meet immediate or necessary objectives (such as taking health packs or a powerful weapon that is close to the bot's position). Decisions about the states are made by an expert system, based on a large set of rules considering almost all the conditions (and tricks) that a human expert would. It also uses a memory (a database) to retain important information about the fighting arena, such as the position of items and weapons, or advantageous spots. This information is stored once the bot has discovered it, as a human player would do.

ExpertBot is formed by two layers. The first one is the cognitive layer, in charge of controlling the FSM considering the environmental stimuli (perceived by sensors); it decides the transitions between states and substates using the expert system and the memory (database). The second one is the reactive layer, which does not perform any kind of reasoning and just reacts immediately to events during the match.

3 Methodology

Human intervention for guiding the optimisation process has proved to be very effective in improving not only the quality of the solution, but also the performance of the algorithm itself in finding the solution to a problem [8]. This improvement is especially needed for problems where subjectivity is part of the evaluation process. This is typical of problems which involve human creativity (such as generative art), or those which aim to increase human satisfaction or challenge (such as content generation in games). However, the interaction in an optimisation algorithm can be conducted in many different ways and can affect many different aspects of the approach.
On the one hand, the participation of an expert in the problem scope is key in order to measure the quality of a candidate solution. On the other hand, an expert in the algorithmic scope can improve the overall performance of the approach or direct the search/optimisation process towards promising areas of the solution space [9]. Even though these methods perform very well, interactive algorithms are not very common, since they have some drawbacks, including human tiredness or boredom if the expert has to intervene many times (which would, in turn, be the ideal situation for the algorithm). The usual solution [10] consists in making the algorithm more 'proactive', so that it can somehow predict the decisions the human would take about a candidate solution. Another approach is based on 'extrapolating' the decisions taken by the human, so that similar solutions (closer in Euclidean distance, for instance) automatically receive a similar value. However, this approach depends on the problem definition and on the structure of the solutions, since solutions that are close in the solution space should also be really similar in the problem scope.

In this work, we propose an Interactive Genetic Algorithm (IGA) [3] in which the experts guide the optimisation of the ExpertBot parameters in order to obtain a human-like bot. We think that this intervention is very relevant for valuing candidate solutions (i.e. bots) with regard to something as subjective as the humanness of an autonomous agent in a game. The algorithm and the points of interaction are described in the following sections.

3.1 Chromosome structure

Every individual in the GA is a chromosome with 26 genes, divided into 6 blocks of information (a short code sketch of this encoding and of the fitness function is given at the end of Section 3.3). The blocks are:

– Distance selection block. Parameters which control the distance ranges the agent considers in order to attack with one specific type of weapon, and also the distances to the enemy both for attacking it and for defending from it.
• Gene 0: Short distance. Value between 0 and 1200.
• Gene 1: Medium distance. Value between the current value of Gene 0 and 2000.
• Gene 2: Far distance. Value between the current value of Gene 1 and 2800.
– Weapon selection block. Parameters which control the priority assigned to every weapon considering factors such as the distance to the enemy, the altitude at which the enemy is, and the physical location of the agent with respect to the enemy.
• Genes 3-11: Control the priority associated with every weapon. Values between 0 and 100.
– Health control block. Parameters which monitor the bot's health level, depending on which it will be offensive, defensive or neutral.
• Gene 12: If the health level is lower than this value, the bot will be defensive. Value between 0 and 100.
• Gene 13: If the health level is lower than this value, the bot will be neutral. Value between the current value of Gene 12 and 160.
– Risk block. Parameters which control the amount of health points the bot is willing to risk.
• Gene 14: Value between 5 and 30.
• Gene 15: Value between 15 and 80.
• Gene 16: Value between 15 and 60.
• Gene 17: Value between 10 and 120.
• Gene 18: Value between 20 and 100.
– Time block. Gene which defines the amount of time after which the agent decides that the enemy is out of vision/perception.
• Gene 19: Value between 3 and 9 seconds.
– Item control block.
Parameters which control the priority assigned to the most important items and weapons. The higher the value, the higher the priority of 'timing' the item/weapon (i.e. controlling its respawn time).
• Gene 20: Priority of the Super Shield Pack. Value between 0 and 100.
• Gene 21: Priority of the Shield Pack. Value between 0 and 100.
• Gene 22: Priority of the Lightning Gun. Value between 0 and 100.
• Gene 23: Priority of the Shock Rifle. Value between 0 and 100.
• Gene 24: Priority of the Flak Cannon and the Rocket Launcher. Value between 0 and 100.
• Gene 25: Priority of the Minigun. Value between 0 and 100.

3.2 Fitness and evaluation

The fitness function is defined as:

f(fr, d, dmgG, dmgT, s1, s2) =
  (fr − d) + s2 + s1/2 + log((dmgG − dmgT) + 1),   if fr > d
  fr/d + s2 + s1/2 + log((dmgG − dmgT) + 1),       if fr < d

where fr is the number of enemy kills (frags) the bot has obtained, d is the number of its own deaths, dmgG is the total damage dealt by the bot, and dmgT is the total damage it has received. s1 and s2 refer respectively to the number of Shields and Super Shields the bot has picked up (very important items). This function rewards individuals with a positive balance (more frags than deaths) and a high number of frags. In addition, individuals which deal a large amount of damage to the enemies are also rewarded, even if they do not achieve a good balance. The logarithm is used because of the magnitude the damage values reach, which is around a thousand; we add 1 to avoid negative values. Normally fitness functions are less complex, in this case just considering the overall number of frags and deaths; however, we decided to consider several factors when valuing the performance of a bot, above all the gathering and use of items, since they are very important in UT2K4 matches.

The evaluation of an individual consists in setting the values of the chromosome in the NizorBot AI engine and then launching a 1 vs 1 combat between it and a standard UT2K4 bot at its maximum difficulty level (such bots are more robust and reliable). Once the time defined for the match is over, the summary of the individual's (bot's) performance with respect to these values is used to compute the fitness. There is a strong pseudo-stochastic component in these battles to take into account, since the results do not depend entirely on our bot, but also on the enemy's actions, which we cannot control. Thus, the fitness function is considered noisy [11], since an individual could be valued as good in one combat but yield very bad results in another match. This problem will be addressed in future work.

3.3 Selection and Genetic Operators

A roulette-wheel selection mechanism has been used, with selection probability proportional to the fitness value. Elitism has been implemented by replacing the five worst individuals of the new population with the five best of the current one. The uniform crossover operator is used, so every gene of a descendant has the same probability of coming from each of the parents.
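To make the encoding and fitness just described concrete, here is a minimal illustrative sketch. The helper names, the match-summary arguments and the assumption of a natural logarithm are ours; the authors' actual implementation (built on Pogamut) is not shown in the paper.

```python
import math
import random

# Gene ranges of the 26-gene chromosome (lower, upper), following Section 3.1.
# Genes 1, 2 and 13 additionally depend on the value of a previous gene.
GENE_BOUNDS = (
    [(0, 1200), (0, 2000), (0, 2800)]                       # distances (0-2)
    + [(0, 100)] * 9                                        # weapon priorities (3-11)
    + [(0, 100), (0, 160)]                                  # health thresholds (12-13)
    + [(5, 30), (15, 80), (15, 60), (10, 120), (20, 100)]   # risk block (14-18)
    + [(3, 9)]                                              # out-of-sight time (19)
    + [(0, 100)] * 6                                        # item/weapon timing (20-25)
)

def random_chromosome():
    genes = [random.uniform(lo, hi) for lo, hi in GENE_BOUNDS]
    genes[1] = random.uniform(genes[0], 2000)    # medium distance >= short
    genes[2] = random.uniform(genes[1], 2800)    # far distance >= medium
    genes[13] = random.uniform(genes[12], 160)   # neutral threshold >= defensive
    return genes

def fitness(fr, d, dmgG, dmgT, s1, s2):
    """Fitness of Section 3.2: fr = frags, d = deaths, dmgG/dmgT = damage
    given/taken, s1/s2 = Shields/Super Shields picked up. A natural log
    is assumed; the paper does not state the base."""
    common = s2 + s1 / 2 + math.log((dmgG - dmgT) + 1)
    if fr > d:
        return (fr - d) + common
    return fr / d + common       # assumes d > 0 whenever fr <= d

# Example: 12 frags, 7 deaths, 2300 damage given, 1600 taken, 3 shields, 1 super.
print(fitness(12, 7, 2300, 1600, 3, 1))
```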
3.4 Interactive Evaluations

The interaction of the game expert is carried out by stopping the evolution at specific points (at certain generations). A form with several questions is then presented to him/her; it can be seen in Figure 1.

Fig. 1. Interactive evaluation form

This form shows the data of the best individual of the current generation (and thus, the best overall, due to elitism). The expert can watch a video of the corresponding bot playing a match (identified by the generation number and the bot's ID). After viewing the recording, the evaluator must fill in the form, ticking the boxes which best describe the behaviour he/she has seen in the video. Every checkbox is associated with a set of genes of the chromosome, so if the expert considers that the behaviour is good in some sense, this is translated into the algorithm by 'freezing' (or blocking) the corresponding genes in the chromosome of the best individual. This affects the rest of the population when this individual combines and spreads its genetic material. This interaction also reduces the search space, so the algorithm's performance is improved. The algorithm then continues the run until a new stopping point is reached. The interaction of the algorithm expert consists in studying the fitness evolution plotted in the software's graphical user interface and changing the parameter values of the GA. Thus, he/she can influence the evolutionary process by increasing the crossover or mutation probability, for instance.

3.5 Parallel Architecture

The evaluation of every bot/individual is very time-consuming (around 150 minutes per generation), since a match must be played almost in real time: UT2K4 does not allow accelerating the game without side effects, such as faulty collision detection. For this reason, we have implemented a parallel version of the algorithm in which several individuals are evaluated at the same time (running the matches on different machines), following a client/server architecture. We have used the extension of Pogamut [12] which lets the programmer launch matches from the command line with the desired options, including the bots with the corresponding chromosome values. Once a match has ended, the bot (client) sends the summary/statistics to the server, which computes a fitness value for that bot. If there is an error during a match, the server becomes aware of it and can restart the match.
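The interactive loop of Sections 3.3-3.5 could be organised roughly as follows. This is only an outline under our own assumptions: the function names, the frozen-gene dictionary and the five-elite copy are illustrative simplifications, the real system evaluates each individual through a full UT2K4 match dispatched to client machines rather than a local function call, and mutation is omitted for brevity.

```python
import random

def roulette_select(population, fitnesses):
    """Fitness-proportionate (roulette-wheel) selection of one parent."""
    pick, acc = random.uniform(0, sum(fitnesses)), 0.0
    for individual, fit in zip(population, fitnesses):
        acc += fit
        if acc >= pick:
            return individual
    return population[-1]

def run_iga(pop, evaluate, generations=50, interaction_points=(10, 20, 30, 40),
            ask_expert=None):
    """Sketch of the generational loop with expert interaction points.
    `evaluate` stands in for playing one UT2K4 match (distributed over several
    machines in practice); `ask_expert` inspects the best individual (e.g.
    after watching a video of it) and returns the indices of the genes to
    freeze. Frozen genes keep the best individual's values in all offspring."""
    frozen = {}                                        # gene index -> fixed value
    for gen in range(generations):
        fits = [evaluate(ind) for ind in pop]
        ranked = sorted(zip(fits, pop), key=lambda p: p[0], reverse=True)
        best = ranked[0][1]
        if gen in interaction_points and ask_expert is not None:
            frozen.update({i: best[i] for i in ask_expert(best, gen)})
        elite = [ind for _, ind in ranked[:5]]         # elitist replacement
        offspring = []
        for _ in range(len(pop) - len(elite)):
            a, b = roulette_select(pop, fits), roulette_select(pop, fits)
            child = [frozen.get(i, ga if random.random() < 0.5 else gb)
                     for i, (ga, gb) in enumerate(zip(a, b))]   # uniform crossover
            offspring.append(child)
        pop = elite + offspring
    return pop
```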
4 Analysis of results

This section is devoted to analysing the performance of our proposals. To do so, we considered the same map, DM-ironic, and a generational interactive genetic algorithm with population size 30, 50 generations, mutation variation 10%, mutation probability 0.33%, uniform crossover with crossover probability 1.0, and an elitist replacement policy keeping the 5 best candidates from the previous generation. Interaction with the human expert is forced at specific generations; three versions were considered depending on the number of interactions: 1 interaction (forced at generation 25), 2 interactions (forced at generations 16 and 33), and 4 interactions (forced at generations 10, 20, 30 and 40). We also considered a version with no human interaction. Five runs per algorithm were executed, with 5 minutes per evaluation. Figure 2 shows the results obtained by these four variations of the algorithm. As interactivity increases, the results get more consistent. However, this may be a consequence of fixing some parts of the chromosome encoding, which would make the candidates more similar and of equivalent fitness.

4.1 Humanity evaluation

To evaluate the adequacy of our proposal, our bot competed in the 2014 edition of the 2K BotPrize competition (http://botprize.org/), which runs the Turing test in UT2K4. This competition has been sponsored by 2K Australia. In the original competition, the evaluation of the humanity of the bots was in the hands of a number of judges who participated directly in the matches; this is a First Person Assessment (FPA). In the 2014 edition, a Third Person Assessment (TPA) was added through the participation of (external) judges via a crowdsourcing platform. The general schema of the new evaluation is shown in Figure 3, which also shows the function used to calculate the Humanness value (H) of a bot from the FPA and the TPA. In addition, since 2014 the competition includes a second challenge that evaluates the bots' own ability to judge the humanity of the other bots; however, this is not an objective for our bot. The results of the reliability evaluation (both by humans and by bots) are shown in Figure 4.

Fig. 2. Fitness comparison. In all box plots, the red line indicates the median of the distribution, whereas the bottom and top of the box correspond respectively to the first and third quartile of the distribution.
Fig. 3. Evaluation system considered in BotPrize 2014
Fig. 4. Results of detection of bots (reliability) in BotPrize 2014

As can be seen in Figures 5 and 6, the winner of this edition was MirrorBot (as in previous editions), which was one of the two bots that passed the Turing test adapted to games in previous editions of the competition (so it can be considered the state of the art). However, as a result of the new (and harder) evaluation system, it does not reach the value required to be considered human (i.e., 0.5), although it is relatively close to it. Our bot (NizorBot) shows very good performance, finishing as the runner-up with a humanity factor relatively close to being considered human. It was programmed taking into account the behaviour patterns of an expert human player in Unreal Tournament 2004, and this might explain its success.

Fig. 5. Results of BotPrize 2014 (I)
Fig. 6. Results of BotPrize 2014 (II)

5 Conclusions and future work

This paper describes a genetic-algorithm-based proposal to create a Non-Player Character (bot) that plays the game Unreal Tournament 2004 in a human-like style. Several user-centric proposals (i.e., with human interaction) have been presented and compared with a non-interactive version. All the proposals perform similarly if they are compared only according to the score (i.e., fitness) obtained during a number of matches; however, in terms of the humanity factor, the incorporation of interactivity provides very good results, as proved by the results obtained by one of our human-guided algorithms in the 2K BotPrize competition. There is room for many improvements: for instance, the navigation scheme of our bots could be improved. In addition, the fitness function that currently guides the search introduces noise into the optimisation process, and we believe that incorporating several human experts in the guidance of our algorithms would produce more consistent results.
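One standard way to mitigate such a noisy fitness, not described in the paper and shown here only as an illustrative possibility, is to evaluate each individual over several matches and aggregate the results, trading extra evaluation time for a more stable signal:

```python
from statistics import median

def robust_fitness(individual, play_match, samples=3):
    """Illustrative noise-reduction wrapper: run several matches per
    individual and take the median fitness instead of a single noisy value.
    `play_match` is assumed to run one UT2K4 match and return its fitness."""
    return median(play_match(individual) for _ in range(samples))
```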
Acknowledgements

This work has been partially supported by project V17-2015 of the Microprojects program 2015 from CEI BioTIC Granada, by Junta de Andalucía within project P10-TIC-6083 (DNEMESIS, http://dnemesis.lcc.uma.es/wordpress/), by the Ministerio español de Economía y Competitividad under (preselected as granted) project TIN2014-56494-C4-1P (UMA-EPHEMECH), and by Universidad de Málaga, Campus de Excelencia Internacional Andalucía Tech.

References

1. http://www.unrealtournament.com/: Unreal Tournament (2014)
2. Arbib, M.A.: Theories of abstract automata. Prentice-Hall (1969)
3. Sims, K.: Artificial evolution for computer graphics. ACM SIGGRAPH Computer Graphics 25(4) (1991) 319–328
4. Mora, A., Aisa, F., Caballero, R., García-Sánchez, P., Merelo, J., Castillo, P., Lara-Cabrera, R.: Designing and evolving an Unreal Tournament™ 2004 expert bot. Springer (2013) 312–323
5. WEB: GameBots project (2008) http://gamebots.sourceforge.net/
6. Schrum, J., Karpov, I.V., Miikkulainen, R.: UTˆ2: Human-like behavior via neuroevolution of combat behavior and replay of human traces. (2011) 329–336
7. Polceanu, M.: MirrorBot: Using human-inspired mirroring behavior to pass a Turing test. (2013) 1–8
8. Klau, G., Lesh, N., Marks, J., Mitzenmacher, M.: Human-guided search. Journal of Heuristics 16 (2010) 289–310
9. Cotta, C., Leiva, A.J.F.: Bio-inspired combinatorial optimization: Notes on reactive and proactive interaction. In Cabestany, J., Rojas, I., Caparrós, G.J., eds.: Advances in Computational Intelligence - 11th International Work-Conference on Artificial Neural Networks, IWANN 2011, Part II. Volume 6692 of Lecture Notes in Computer Science, Springer (2011) 348–355
10. Badillo, A.R., Ruiz, J.J., Cotta, C., Leiva, A.J.F.: On user-centric memetic algorithms. Soft Comput. 17(2) (2013) 285–300
11. Mora, A.M., Fernández-Ares, A., Merelo, J.J., García-Sánchez, P., Fernandes, C.M.: Effect of noisy fitness in real-time strategy games player behaviour optimisation using evolutionary algorithms. J. Comput. Sci. Technol. 27(5) (2012) 1007–1023
12. http://pogamut.cuni.cz/main/: Pogamut - virtual characters made easy — about (2014)
13. Dyer, D.: The Watchmaker framework for evolutionary computation (evolutionary/genetic algorithms for Java) (2014)
14. http://human machine.unizar.es/: Human-like bots competition 2014 (2014)
15. http://xstream.codehaus.org/: XStream - about XStream (2014)

Game Technologies for Kindergarten Instruction: Experiences and Future Challenges
Vicente Nacher, Fernando Garcia-Sanjuan, Javier Jaen
ISSI/DSIC, Universitat Politècnica de València, Camí de Vera S/N, 46022 Valencia (Spain)
{vnacher, fegarcia, fjaen}@dsic.upv.es

Abstract. Games are an ideal mechanism to design educational activities with preschool children. Moreover, an analysis of current kindergarten curricula points out that playing and games are an important basis for children's development. This paper presents a review of works that use games for kindergarten instruction and analyses their underlying technologies. In addition, we present future challenges to be faced for each technology under consideration, focusing on the specific needs and abilities of these very demanding users. The end goal is to outline a collection of future research directions for educators, game designers and HCI experts in the area of game-based kindergarten instruction supported by new technologies.
Keywords: Games, Kindergarten, Pre-Kindergarten, Education, Serious games, Review, Multi-touch, Robots, Tangible User Interfaces (TUI) 1 Introduction According to Huizinga play is innate to human culture [7] and children play in many ways and with different types of artifacts [5]. The importance of game play in early childhood education is also recognized by multiple national and international organizations. For instance, according to the Spanish Education Law (LOE) passed in 2007, the working methods in childhood education “will be based on experiences, activities and games” with the purpose of “contributing to the physical, affective, social and intellectual development of children” [8]. Hence, playing is a basic pillar in children education and development. However, despite the huge number of works addressing children play [17, 2] and the presence of games in children educational curricula, there is a lack of works that address the relations between play and learning in environments based on new emerging technologies such as interactive surfaces and robots. Therefore, in this paper we provide a review of works that use technologies to develop games that help children to improve the three dimensions of their development already mentioned: physical, socio-affective and intellectual. The analyzed works demonstrate that there are technologies with suitable mechanisms to support very young children instruction based on play but the analysis also reveals that there are 58 still missing aspects that need to be addressed. Therefore, in this paper we provide a set of future areas of work that can be developed in the near future. The end goal is to define a research path to give educators appropriate guidelines for each technology and to design games and activities that foster pre-school children development. 2 Developing Technology-Based Games for Pre-School Children Many previous works have used technology-aided learning activities to support preschool (aged 2-6 years) children development. In this section, these works are presented by technology. 2.1 Traditional Computers A few years ago, traditional computers were used to develop mainly intellectual and cognitive aptitudes among very young children. Jones and Liu [10], for instance, studied how kids aged 2-3 interact with a computer. They designed a videogame which used visual stimuli, animations, and audio to capture the kid’s attention. For example, the computer told the child to press a certain keyboard button, and informed the user whether the interaction had been successful. For simplification purposes, only a few buttons of the keyboard were used, disabling the rest. The game contained educative contents in order to enhance vocabulary through learning colors, toy names, food, computer parts, etc., and also to learn mathematical concepts such as big/small, or logical relations like cause/effect (e.g., if a key is pressed, something will happen on the screen). In their study, the researchers observed that meaningful interactions with this kind of technology do not appear before two and a half years of age. Because computers were at first fixed to a single location, it was difficult for children to engage in games that encouraged mobility and physical exercise. However, other types of physical development, such as the improvement of fine motor skills, could be trained using this kind of technologies. As an example, Ahlström and Hitz [1] evaluated precise pointing interactions using mouse on children aged 48-58 months. 
To do so, they proposed a game that consisted of selecting and dragging colored elements on the screen. Results showed that an assistive technique can improve children's pointing accuracy. Similarly, Strommen et al. [23] devised a videogame to evaluate which input device improved precision tasks for three-year-olds, namely mouse, joystick, or trackball. The associated videogame consisted of directing a Cookie Monster along a path to a given target cookie for him to eat, and the results showed the trackball to be the most accurate, but slowest, way to interact. These two works were not aimed at training any specific capacity; however, in our opinion, videogames that require this type of precision could be used to improve children's fine motor skills.

2.2 Interactive Surfaces

The natural and intuitive way of interaction provided by multi-touch technology [20] makes it ideal for preschool children. As pointed out by Shneiderman et al. [19], the three basic ideas behind the direct manipulation style that enable natural interaction are: 1) the visibility of objects and actions of interest; 2) the replacement of typed commands by pointing actions on the objects of interest; and 3) rapid, reversible and incremental actions that help children stay engaged, giving them control over the technology and avoiding complex instructions. Supporting this idea, the Horizon report [9] placed tablets and smartphones as one of the two emerging technologies suitable for children aged under 2 years. The suitability of multi-touch technology has motivated several works focused on kindergarten children and the use of tablets and smartphones. The works by Nacher et al. [15, 13] reveal the huge growth in the number of educational applications targeted at pre-kindergarten children and evaluate a set of basic multi-touch gestures (tap, double tap, long press, drag, scale up, scale down, one-finger rotation and two-finger rotation) on a tablet with children aged between 2 and 3. Their results show that pre-kindergarten children are able to successfully perform the tap, drag, scale up, scale down and one-finger rotation gestures without assistance, and the long press and double tap gestures with some assistive techniques that fit the gesture to the actual abilities of the children. Another interesting study was conducted by Vatavu et al. [27], who evaluated the tap, double tap, single-hand drag and double-hand drag gestures (see Fig. 1) with children between 3 and 6 years using tablets and smartphones. Overall, their results show good performance except for the double-hand drag gestures, which are affected by some usability issues. Moreover, the results show a correlation between higher visuospatial skills (i.e. better skills for understanding relationships between objects, such as location and directionality) and both better performance in the drag-and-drop tasks and higher accuracy when performing tap gestures. Although these applications were developed for experimental purposes, these or similar applications could be used as games to help children develop their fine motor and visuospatial skills through interactive surfaces. The work by Nacher and Jaen [16] goes a step further, presenting a usability study of touch gestures that involve movement of the fingers on the tablet (drag, scale up, scale down and one-finger rotation) and require high levels of accuracy.
Their results show that very young children are able to perform these gestures but with significant differences between them in terms of precision depending on their age since they are in the process of developing their fine motor skills. Finally, the authors propose as a future work an adaptive mechanism that fits the required accuracy to the actual level of development of each child, this mechanism could be used to help children to exercise and develop their fine motor skills. 60 Fig. 1. Child performing simple and double drag gestures (extracted from [27]). Another interesting work with pre-kindergarten children and tablets is the study by Nacher et al [14] which makes a preliminary analysis of communicability of touch gestures comparing two visual semiotic languages. The results show that the animated approach overcomes the iconic. Hence, basic reasoning related to the interpretation of moving elements on a surface can be effectively performed during early childhood. These languages could help children in identifying direct mappings between visual stimulus and their associated touch gestures. Therefore, the use of these languages could be particularly interesting in the development of games in which pre-school children could play autonomously. Several studies have evaluated the suitability of multi-touch surfaces to support educational activities with children. Zaranis et al [29] conducted an experiment to evaluate the effectiveness of digital activities on smart mobile devices (tablets) when teaching mathematical concepts such as general knowledge of numbers, efficient counting, sorting and matching with kindergarten children. Their results confirm that the tablet-aided learning provides better learning outcomes for children than the traditional teaching method. Another study provided by Chiong and Shuler [4] conducts an experiment involving audiovisual material on touch devices adapted to children aged three to seven years and their results show that children obtain remarkable gains in vocabulary and phonological awareness. Another work using tablets is the study by Berggren and Hedler [3] in which the authors present CamQuest. CamQuest is a tablet application that enables children to move around and recognize geometric shapes in the real objects that they see. The tablet shows the images from the camera and the application integrates the geometric shape to look for (see Fig. 2). This application combines the learning of shapes (such as circle, square, rectangle, and triangle) with active play since children are investigating their surroundings. Moreover, the application can be used in pairs fostering collaboration between children and defining roles between them, so that children develop their social skills. 61 Fig. 2. Child interacting with CamQuest (extracted from [3]). On the other hand, other studies have focused on the use of tabletops with educational purposes. For example, Yu et al [28] present a set of applications for children aged between 5 and 6 years. The applications contribute to the development of intelligence, linguistic, logical, mathematical, musical and visual-spatial aspects with activities such as listening a word and picking out the picture that represents it; shooting balloons with the right numbers, etc. Following the same research path, Khandelwal & Mazalek [11] have shown that this technology can be used by pre-kindergarten children to solve mathematical problems. 
The work of Mansor et al [12] conducts a comparison of a physical setting versus a tabletop collaborative setting with children aged between 3 and 4 years and suggests that children should remain standing during these operations because, otherwise, they find it difficult to drag objects on the surface due to bad postures. 2.3 Robots and Technologically-Enhanced Toys Unlike computers or surfaces, tridimensional toys and robots have the capacity of being grasped, hence serving as a sort of tangible user interface (TUI), which present an added value in childhood education “as they resonate with traditional learning manipulatives” [22]. Research concerning robots for pre-kindergarten and kindergarten children has focused on building technology to develop intellectual capacities such as linguistic aptitudes. In this respect, Ghosh and Tanaka [6] design a CareReceiving Robot (CRR) to help the kids learn English. This robot adopts the role of the pupil and the children play with it acting as teachers. This way, they can learn as they teach the robot. The researchers propose two games with this platform: a game to learn colors and another to learn vocabulary about animals. In the first one, called “color project”, the kids show a colored ball to the robot and tell it which color it is. Then, the robot touches the ball and guesses its color. In the second game, “vocabulary project”, a series of flashcards are shown to the robot, and it has to guess which animals they represent. In both cases, the purpose of the kid is to correct the robot when it is wrong, or to congratulate it when it answers correctly. Experiments performed with the children through observation reveal that they are very motivated at 62 first, but tend to feel bored and frustrated quickly if the robot is too often right or wrong, respectively, since the game becomes monotonous. Tanaka and Matsuzoe [25] posteriorly revealed that kids aged 3 to 6 are capable of learning verbs by playing with the CRR, and they even suggested that learning through playing with the robot might be more effective than not involving such a tangible artifact. Shen et al. present Beelight [18], a bee-shaped robot and a tabletop serving as its honeycomb (see Fig. 3) aimed at teaching colors to children aged 4 to 6 years, which is reported to cause excitement and astonishment on the kids. The authors present two games implemented with this approach. On the one hand, “color sharing”, in which the kids would grab the robot and show a color to it. Then, the bee would glow in said color and, if placed on the honeycomb, it would be colored as well. The second game, “color searching”, would consist of the bee being illuminated with a given color and the children having to search for some object of said color and place it on the honeycomb. In case of success, the honeycomb would play a song. Fig. 3. Beelight (extracted from [18]) Also aimed at improving language and literacy skills, Soute and Nijmeijer [21] design an owl-shaped robot to perform story-telling games with children aged 4 to 6. This robot (see Fig. 4) narrates a partial story which the students must complete showing some flashcards to it. A small study is also conducted during a game session and the results show the system is engaging for the kids. Fig. 4. A girl playing with an owl-shaped robot to foster language skills (extracted from [21]) 63 Besides training linguistic abilities, other robots could also be used to develop spatial capabilities. 
For example, Tanaka and Takahashi [26] design a tangible interface for kids aged 3 to 6 in the form of a tricycle (see Fig. 5) to remotely control a robot. The movements performed on the tricycle (i.e., forward, backward, left, right) are mapped to movements of the tele-operated robot. Although not specifically built for this purpose, in our opinion this kind of interface could be used to stimulate spatial mappings in kindergarten children.

Fig. 5. Tricycle interface (extracted from [26])

Another advantage of using robots is that they can move. Therefore, they can be used to enhance physical development. QRIO [24] is a humanoid robot introduced in a toddlers' classroom to make the kids move and dance, hence encouraging physical exercise. The robot would dance autonomously to the music (see Fig. 6) and react to the movements of a dancing partner (i.e., to his/her hand movements or clapping).

Fig. 6. Children dancing with QRIO (extracted from [24])

3 Discussion

In Table 1, the works listed above are classified in terms of several factors: the age of the users involved; the capacities, inferred from [8], that the works can improve, i.e., physical development (P), socio-affective development (S) and cognitive and intellectual development (I). For each capacity there are several areas: related to physical development, the analyzed works address the physical exercise (P-p) and fine motor skills (P-f) areas; in social development we can identify the collaboration area (S-c); and in cognitive and intellectual development we can find the spatial (I-s), linguistic (I-l), logic and mathematic (I-m), and exploration and discovery (I-e) areas. The works are also categorized by the technology used: computers (C), tablets (T), mobiles/smartphones (M), tabletops (TT) or robots (R). Finally, the last dimension covers the type of interaction: tangible (T), keyboard (K), mouse (Mo), joystick (J), multi-touch (M), body gestural (G) or vocal (V).

Table 1. Comparison of works

Work                     | Age (years) | Capacities | Areas           | Technology | Interaction
Khandelwal et al [11]    | 3-5         | I          | I-m             | TT         | T
Tanaka et al [24]        | 0-2         | P          | P-p             | R          | T, G
Tanaka et al [6, 25]     | 3-6         | I          | I-l             | R          | V, G
Jones & Liu [10]         | 2-3         | I          | I-l, I-m        | C          | K
Tanaka & Takahashi [26]  | 3-6         | I          | I-s             | R          | T
Soute & Nijmeijer [21]   | 4-6         | I          | I-l             | R          | G
Ahlström et al [1]       | 4-5         | P          | P-f             | C          | Mo
Shen et al [18]          | 4-6         | I          | I-l             | R          | T
Strommen et al [23]      | 3           | P          | P-f             | C          | Mo, J, B
Nacher et al [15]        | 2-3         | P          | P-f             | T          | M
Nacher et al [13]        | 2-3         | P          | P-f             | T          | M
Nacher et al [14]        | 2-3         | I          | I-l             | T          | M
Nacher & Jaen [16]       | 2-3         | P          | P-f             | T          | M
Vatavu et al [27]        | 3-6         | P          | P-f             | T-M        | M
Chiong & Shuler [4]      | 3-7         | I          | I-l             | T          | M
Zaranis et al [29]       | 4-6         | I          | I-m             | T          | M
Yu et al [28]            | 5-6         | I          | I-l, I-s, I-m   | TT         | M
Mansor et al [12]        | 3-4         | I          | I-e, S-c        | TT         | M
Berggren & Hedler [3]    | 4-5         | I, S       | I-m, S-c        | T          | M

The review of all the works that use new technologies to support pre-school children's development shows that a great number of works focus on the development of the physical and intellectual capacities of children. Regarding the physical capacities, most works present activities and games that address the development of fine motor skills. However, few works with preschool children develop games that support their gross motor skills or promote health and wellbeing through physical activity and active play. In our opinion, the most appropriate technologies for these types of applications are tablets, smartphones and robots, due to their ability to be moved from one place to another.
Regarding the cognitive and intellectual dimension, most works focus on games that foster logic, mathematical and linguistic skills. Nonetheless, there are no works fostering the development of spatial abilities or supporting exploration and discovery. On the other hand, despite the suitability of new technologies such as tabletops, tablets, smartphones and robots for collaborative play, the development of social and affective skills is not fully exploited with preschool children. Hence, a future line of work is the use of these technologies to develop games that support and foster relationships with others. In addition, there are unexplored areas in the social-affective dimension. An interesting future challenge is the use of new technology games to improve the self-awareness, self-regulation and emotional intelligence of pre-kindergarten children. Finally, it is also worth mentioning that, looking at the year of publication of the works listed, there is a trend to leave the traditional computer and tabletop technologies behind and to select tablets, smartphones and robots as the preferred technologies for developing games for the youngest. This makes multi-touch and tangible interactions the most promising techniques, which will need further research efforts to analyze their adequacy and limitations when applied to preschool children. To sum up, the contributions of this paper are twofold. The first is a review of the state of the art of technology-aided activities that support the three dimensions of kindergarten children's development. The reviewed studies show the suitability of game technologies for the improvement and development of very young children's capacities. The second contribution is a set of future challenges listing the unexplored areas of preschool children's development in which game technologies may have a real and measurable impact. These areas will have to be the focus of intense research in the near future to create games that support all the dimensions of preschool children's development.

Acknowledgements

This work received financial support from the Spanish MINECO (projects TIN2010-20488 and TIN2014-60077-R), from Universitat Politècnica de València (UPV-FE-2014-24), and from GVA (ACIF/2014/214).

References

1. Ahlström, D. and Hitz, M. Revisiting PointAssist and Studying Effects of Control-Display Gain on Pointing Performance by Four-Year-Olds. Proc. of IDC'13, 257–260.
2. Barnett, L. Developmental benefits of play for children. Journal of Leisure Research 22, 2 (1990), 138–153.
3. Berggren, J. and Hedler, C. CamQuest: Design and Evaluation of a Tablet Application for Educational Use in Preschools. Proc. of IDC'14, 185–188.
4. Chiong, C. and Shuler, C. Learning: Is there an app for that? Investigations of young children's usage and learning with mobile devices and apps. New York, 2010.
5. Fein, G.G. Pretend Play in Childhood: An Integrative Review. Child Development 52, 4 (1981), 1095–1118.
6. Ghosh, M. and Tanaka, F. The impact of different competence levels of care-receiving robot on children. Proc. of IROS'11, 2409–2415.
7. Huizinga, J. Homo Ludens. Wolters-Noordhoff, Groningen, 1985.
8. Jefatura del Estado. Ley Orgánica 2/2006, de 3 de mayo, de Educación. 2006.
9. Johnson, L., Adams, S., and Cummins, M. The NMC Horizon Report: 2012 K-12. The New Media Consortium, Austin, Texas, 2012.
10. Jones, M. and Liu, M.
Introducing Interactive Multimedia to Young Children: A Case Study of How Two-Year-Olds Interact with the Technology. Journal of Computing in Childhood Education 8, 4 (1997), 313–343. 11. Khandelwal, M. and Mazalek, A. Teaching table: a tangible mentor for pre-k math education. Proc. of TEI'07, 191–194. 12. Mansor, E.I., De Angeli, A., and de Bruijn, O. The fantasy table. Proc. of IDC’09, 70–79. 13. Nacher, V., Jaen, J., Catala, A., Navarro, E., and Gonzalez, P. Improving Pre-Kindergarten Touch Performance. Proc. of ITS'14, 163–166. 14. Nacher, V., Jaen, J., and Catala, A. Exploring Visual Cues for Intuitive Communicability of Touch Gestures to Pre-kindergarten Children. Proc. of ITS'14, 159–162. 15. Nacher, V., Jaen, J., Navarro, E., Catala, A., and González, P. Multi-touch gestures for prekindergarten children. International Journal of Human-Computer Studies 73, 2015, 37–51. 16. Nacher, V. and Jaen, J. Evaluating the Accuracy of Pre-Kindergarten Children Multi-touch Interaction. Proc. of Interact'15. 17. Samuelsson, I.P. and Carlsson, M.A. The Playing Learning Child: Towards a pedagogy of early childhood. Scandinavian Journal of Educational Research 52, 2008, 623–641. 18. Shen, Y., Qiu, Y., Li, K., and Liu, Y. Beelight: helping children discover colors. Proc. of IDC'13, 301–304. 19. Shneiderman, B., Plaisant, C., Cohen, M., and Jacobs, S. Designing the User Interface: Strategies for Effective Human-Computer Interaction. Prentice Hall, 2009. 20. Smith, S.P., Burd, E., and Rick, J. Developing, evaluating and deploying multi-touch systems. International Journal of Human-Computer Studies 70, 10 (2012), 653–656. 21. Soute, I. and Nijmeijer, H. An Owl in the Classroom: Development of an Interactive Storytelling Application for Preschoolers. Proc. of IDC'14, 261–264. 22. Strawhacker, A. and Bers, M.U. “I want my robot to look for food”: Comparing Kindergartner’s programming comprehension using tangible, graphic, and hybrid user interfaces. International Journal of Technology and Design Education, (2014). 23. Strommen, E.F., Revelle, G.L., Medoff, L.M., and Razavi, S. Slow and steady wins the race? Three-year-old children and pointing device use. Behaviour and Information Technology 15, 1 (1996), 57–64. 24. Tanaka, F., Fortenberry, B., Aisaka, K., and Movellan, J.R. Plans for Developing Realtime Dance Interaction between QRIO and Toddlers in a Classroom Environment. Proc. of ICDL'05, 142–147. 25. Tanaka, F. and Matsuzoe, S. Learning Verbs by Teaching a Care-Receiving Robot by Children: An Experimental Report. Proc. of HRI'12, 253–254. 26. Tanaka, F. and Takahashi, T. A tricycle-style teleoperational interface that remotely controls a robot for classroom children. Proc of HRI'12, 255–256. 27. Vatavu, R., Cramariuc, G., and Schipor, D.M. Touch interaction for children aged 3 to 6 years : Experimental fi ndings and relationship to motor skills. International Journal of Human-Computer Studies 74, (2015), 54–76. 28. Yu, X., Zhang, M., Ren, J., Zhao, H., and Zhu, Z. Experimental Development of Competitive Digital Educational Games on Multi-touch Screen for Young Children. Proc. of Edutainment'10, 367–375. 29. Zaranis, N., Kalogiannakis, M., and Papadakis, S. Using Mobile Devices for Teaching Realistic Mathematics in Kindergarten Education. Creative Education 04, 07 (2013), 1–10. 
67 Refinamiento de un Modelo de Calidad para Juegos Serios Lilia García-Mundo, Marcela Genero, Mario Piattini Instituto de Tecnologías y Sistemas de Información, Universidad de Castilla-La Mancha, Paseo de la Universidad, 4, 13071 Ciudad Real, Spain {liliacarmen.garcia@alu.uclm.es}; {marcela.genero, mario.piattini} @uclm.es Resumen. En este trabajo se persiguen los siguientes objetivos: 1) Presentar brevemente el modelo de calidad para Juegos serios (QSGame-Model) propuesto en un trabajo previo de los autores, 2) Describir las actividades llevadas a cabo para diseñar y construir una encuesta que será distribuida a expertos en el desarrollo y enseñanza de Juegos serios (o videojuegos) y cuyas respuestas nos servirán para refinar el modelo propuesto, y 3) Presentar los primeros resultados obtenidos tras la realización de la encuesta por tres expertos. La distribución de la encuesta a un grupo mayor de expertos y la validación de la utilidad del QSGame-Model, quedan pendientes como trabajo futuro. Palabras claves. Encuesta, Modelo de Calidad, Juegos Serios, Experimentos, Estudios Empíricos 1 Introducción Zyda [1] define un Juego Serio (JS) como “una competición mental jugada en un ordenador de acuerdo a reglas específicas que utiliza el entretenimiento para alcanzar objetivos en la formación empresarial, en la educación, en la salud, en la política pública y en la comunicación estratégica”. De forma simplificada un JS se considera un juego cuyo objetivo principal va más allá del mero entretenimiento [2]. El uso de los JS proporciona muchos beneficios: existe evidencia de que son más eficaces que los métodos de enseñanza tradicionales en cuanto a la formación de las habilidades cognitivas [3], son prometedores en el desarrollo de habilidades motoras [3] permiten mejorar el potencial de los empleados y sus capacidades técnicas [4], permiten a los estudiantes experimentar situaciones que sería imposible experimentar en la vida real [2], etc. Además de ser un mercado de rápido crecimiento [5], los JS constituyen un área de oportunidades que está en constante crecimiento. En 2012, los ingresos de todo el mundo para el aprendizaje basado en el juego ascendieron a 1,5 mil millones de dólares. Con una tasa de crecimiento global del 8% al año, se prevé que en 2017 los ingresos en todo el mundo de este tipo de aplicaciones alcanzarán los 2,3 mil millones de dólares [6]. 68 Los JS pueden ser un medio para alcanzar metas relevantes tanto desde el punto de vista personal como desde el punto de vista institucional. La cantidad de usuarios de estas aplicaciones crece día a día, lo que significa que su impacto social es muy alto. Es por esta razón que su calidad es muy importante, y por lo tanto como investigadores y profesionales de la informática consideramos que es nuestro deber garantizar la calidad de los JS. Por todo lo dicho, decidimos centrar nuestra investigación en la calidad de los JS. Comenzamos, como es normal en cualquier investigación, realizando una revisión de la literatura siguiendo una metodología conocida con el anglicismo de “mapeo sistemático de la literatura” (Systematic Mapping Study (SMS)) [7,8,9]. 
El SMS es una metodología ampliamente utilizada en ingeniería del software para realizar revisiones de la literatura [10] que se lleva a cabo con el fin de obtener una visión general de un determinado tema de investigación de manera sistemática, fiable, rigurosa y auditable [8,9], que intenta encontrar la máxima información posible del tema investigado, evitando sesgos en los resultados obtenidos. Concretamente el propósito de este SMS fue conocer el estado del arte de la investigación sobre la calidad de los JS [11]. Los resultados del SMS revelaron que en los 112 artículos encontrados, los investigadores estaban principalmente preocupados en demostrar o confirmar si el JS había logrado el propósito para el que fue creado y si proporcionaba placer y entretenimiento. Los artículos evaluaban diferentes sub-características de calidad como por ejemplo, la operabilidad, la estética de la interfaz de usuario, la completitud funcional, entre otras. Si bien existe un modelo estándar de calidad de producto software como el ISO/IEC 25010 [12], no hemos podido encontrar un modelo de calidad consensuado que se pueda aplicar a cualquier JS. Esto nos motivó a proponer en un trabajo previo una versión preliminar de un modelo de calidad de producto específico para el dominio de los JS, llamado QSGame-Model [13]. Ahora queremos ir un paso más allá, y conocer la opinión de expertos, en el desarrollo y enseñanza de JS (o videojuegos), sobre el modelo de calidad propuesto, con el objetivo de obtener un modelo de calidad consensuado por expertos. Concretamente queremos, a través de una encuesta, preguntarles a los expertos si los atributos de calidad propuestos les parecen adecuados y si su definición es comprensible. Los objetivos de este trabajo son: 1) Presentar brevemente el modelo de calidad para JS propuesto (QSGame-Model) en un trabajo previo de los autores [13], 2) Describir las actividades llevadas a cabo para diseñar y construir una encuesta con el objetivo de refinar el modelo propuesto a través de las opiniones de expertos y 3) Presentar los primeros resultados obtenidos tras la realización de la encuesta por tres expertos. El resto de este documento está organizado de la siguiente forma. La Sección 2 presenta un resumen del QSGame-Model. La Sección 3 describe el proceso seguido en el diseño y construcción de la encuesta y la Sección 4 presenta los primeros resultados obtenidos de la encuesta realizada por tres expertos. Por último, las conclusiones y las principales ideas sobre nuestro trabajo futuro se presentan en la Sección 5. 69 2 Presentación del QSGame-Model Antes de introducir el QSGame-Model haremos una breve presentación del estándar de calidad de software ISO/IEC 25010 [12], que fue usado como base para construir el modelo de calidad para JS. El principal objetivo del estándar ISO/IEC 25010 es especificar y evaluar la calidad de los productos software por medio de un modelo de calidad que se utiliza como marco para la evaluación del software [12]. El modelo de calidad ISO/IEC 25010 está compuesto por dos modelos que son útiles en lo que respecta a la evaluación de la calidad de un producto software: ─ Modelo de calidad de producto: mediante la medición de propiedades internas (tales como especificación de software, el diseño arquitectónico, entre otros), o mediante la medición de propiedades externas (típicamente midiendo el comportamiento del código cuando se ejecuta); y ─ Modelo de calidad en uso: mediante la medición de la calidad en propiedades de uso (i.e. 
cuando el producto está en uso de forma real o simulada). El modelo de calidad de producto clasifica las propiedades de calidad de producto en ocho características y treinta sub-características de calidad; mientras que el modelo de Calidad en uso describe cinco características y nueve sub-características de calidad [12]. Un modelo de calidad está definido por características generales del software, que son refinadas en sub-características, las que a su vez se descomponen en atributos, produciendo así una jerarquía de múltiples niveles. La parte inferior de la jerarquía contiene atributos medibles de software cuyos valores se calculan mediante el uso de una determinada medida. Estas medidas deben definirse de forma completa y precisa dentro del modelo de calidad. Por tanto, la salida de la evaluación de la calidad de un producto de software es un conjunto de valores de medición que tienen que ser interpretados con el fin de proporcionar realimentación a los desarrolladores y diseñadores acerca de la calidad de los productos de software. El estándar ISO/IEC 25010 es genérico y las características que define son relevantes para todos los productos de software y no están relacionados exclusivamente con el código o software ejecutable, sino también con el análisis y diseño de los artefactos. Debido a su naturaleza genérica, el estándar fija algunos conceptos de calidad de alto nivel, que se pueden adaptar a dominios específicos [14]. Existen varias propuestas de modelos de calidad que toman como base un estándar y lo adaptan a dominios específicos, como por ejemplo, Radulovic et al. [15] presentan un modelo de calidad basado en el estándar ISO/IEC 25010 [12], de los productos para las tecnologías semánticas llamado SemQuaRE, Herrera et al. [16] proponen un modelo de calidad en uso para los portales Web (QiUWeP) basado en el estándar ISO/IEC 25010 [12], y Carvallo et al. [17] construyeron un modelo de calidad del producto basado en el estándar ISO/IEC 9126-1 [18] para los servidores de correo. Como nuestro objetivo es la calidad de los JS, realizamos en primer lugar un SMS para recopilar todo lo publicado en la literatura sobre este tema [11]. En este SMS encontramos que aunque las investigaciones abordan varios aspectos de calidad de los JS, no existe un modelo de calidad de producto consensuado que se pueda aplicar a 70 cualquier JS específico. Basándonos en estos resultados, definimos una versión preliminar de un modelo de calidad, que se extiende del modelo de calidad de producto del ISO/IEC 25010 [12], para los JS llamado “QSGame-Model” [13] que se presenta en la Fig. 1. Este modelo además de basarse en el estándar mencionado, considera tanto las características consideradas en las investigaciones incluidas en el SMS como los elementos que caracterizan la jugabilidad [19] como atributos de producto. El objetivo principal que perseguimos es que este modelo de calidad pueda servir a los desarrolladores de JS a construir JS de calidad. El modelo se definió siguiendo una metodología propuesta por Franch y Carvallo [14] para la construcción de modelos de calidad para dominios específicos. El detalle de los pasos seguidos para la construcción del QSGame-Model se puede encontrar en [13]. Fig. 1. QSGame-Model En la Fig. 1, los recuadros con fondo blanco representan las sub-características donde no se realizaron modificaciones, es decir estas sub-características permanecen en el QSGame-Model igual que en el estándar. 
Los recuadros con fondo obscuro representan las sub-características en las que se realizaron modificaciones. Las modificaciones consistieron en añadir atributos y medidas en las tres sub-características de la Adecuación Funcional: Completitud Funcional, Exactitud Funcional y Pertinencia Funcional; y en cinco sub-características de la Usabilidad: Reconociblidad Adecuada, Facilidad de Aprendizaje, Operabilidad, Estética de la Interfaz de Usuario y Accesibilidad. Fig. 2 muestra, a modo de ejemplo, los atributos añadidos a las tres subcaracterísticas de la Adecuación Funcional. En el resto de las características del modelo de calidad de producto (Eficiencia en el desempeño, Compatibilidad, Fiabilidad, Seguridad, Mantenibilidad, y Portabilidad) no añadimos ni modificamos atributos o medidas. El modelo completo puede encontrar en: http://alarcos.esi.uclm.es/SeriousGamesProductQualityModel/ 71 Fig. 2. Característica Adecuación Funcional: sub-características y atributos 3 Encuesta sobre el QSGame-Model Como se mencionó anteriormente, los modelos de calidad estándar, como el propuesto en [12], son genéricos y en teoría aplicables a cualquier producto software. Aunque al ser tan genéricos, suelen no adaptarse a productos software de dominios específicos y por ello es necesario adaptarlos. Al adaptar un modelo de calidad existente a un dominio específico como el QSGame-Model, es necesario consultar a expertos en el desarrollo y enseñanza de JS (o videojuegos) para saber si los atributos de calidad propuestos son adecuados y comprensibles y así su opinión permitirá refinar el modelo y obtener un modelo de calidad consensuado. Por ello, como primer paso, nos planteamos refinar el QSGame-Model mediante una encuesta realizada a expertos en el desarrollo y la enseñanza de los JS (o videojuegos). Para diseñar la encuesta seguimos las directrices establecidas en [20]. En el resto de esta sección describiremos las actividades que llevamos a cabo para el diseño y construcción de la encuesta. 3.1 Objetivo de la encuesta El objetivo de la encuesta es “Obtener la opinión de expertos en el desarrollo y enseñanza de JS con respecto a la relevancia que tienen para ellos cada uno de los atributos de calidad propuestos en el QSGame-Model y también saber si la definición de dichos atributos les resulta comprensible”. 3.2 Diseño de la encuesta Existen diversos tipos de diseño de encuestas, algunos de ellos son [21]: ─ Encuestas transversales: son estudios en donde a los encuestados se les solicita información en un punto determinado en el tiempo [20]. 72 ─ Encuestas longitudinales: son estudios donde el objetivo es conocer la evolución de una determinada población a través del tiempo [20]. El objetivo que nos planteamos en esta encuesta nos condujo a elegir el diseño de encuesta transversal, ya que como se menciona en [21] la mayoría de las encuestas que se realizan en la ingeniería del software son de este tipo. Los cuestionarios que diseñamos son auto-administrados y los aplicaremos a través de Internet [20]. Debido a que los cuestionarios auto-administrados son sin supervisión, las instrucciones sobre cómo rellenarlo las incorporamos antes que las preguntas de la encuesta. Esto es muy importante porque cuando los encuestados no son guiados por una persona, es fundamental que comprendan perfectamente cómo deben proceder con la encuesta. 3.3 Población objetivo La población objetivo de la encuesta son profesionales que se dedican al desarrollo y enseñanza de JS (o videojuegos), tanto en universidades como en empresas. 
3.4 Estructura de la encuesta La encuesta está estructurada en tres bloques principales: ─ Glosario: contiene una lista de las definiciones de los términos que se utilizan en el contexto de los JS y de los videojuegos. El propósito del glosario es que todos los encuestados utilicen un mismo término para referirse a un mismo concepto al momento de rellenar la encuesta. ─ Antecedentes y experiencia: es un bloque de preguntas relacionadas con aspectos demográficos de los encuestados como su sexo, su nivel de educación, país en el que trabajan, su experiencia en las TIC´s y en el desarrollo de videojuegos o JS, así como su formación específica en el desarrollo de JS. La Fig. 3 muestra un ejemplo de este tipo de preguntas. Este bloque de preguntas nos ayudarán a contextualizar las respuestas de la encuesta. ─ Valoración de los atributos de calidad de los JS: contiene un bloque de preguntas relacionadas con los atributos de calidad propuestos en el QSGame-Model. Un ejemplo de este tipo de preguntas se muestra en la Fig. 4. Estas preguntas tienen como propósito conocer, en base a los conocimientos y experiencia de los encuestados, la comprensión y la importancia que tienen para ellos cada uno de los atributos de calidad del QSGame-Model. Además, en este bloque de preguntas se pretende obtener realimentación de los encuestados sobre alguna observación adicional acerca de cada uno de los atributos. 73 Fig. 3. Ejemplo de preguntas de la encuesta sobre antecedentes y experiencia. Fig. 4. Ejemplo de pregunta sobre la valoración de los atributos de calidad de los JS. 3.5 Construcción y ejecución de la encuesta Para diseñar las preguntas de la encuesta se tuvieron las siguientes consideracio- nes: ─ Las preguntas se elaboraron teniendo en cuenta el objetivo de la encuesta. ─ La redacción de las preguntas se realizó de una forma que resultara fácil de comprender y precisa de responder por los encuestados. ─ Se incluyeron solamente las preguntas necesarias. ─ Se estandarizaron las respuestas con una escala ordinal: 1- Es muy importante; 2Es algo importante; 3 – No es importante. ─ Se usó un lenguaje convencional en la redacción de las preguntas, i.e. terminología que resultara familiar a los encuestados. ─ Se evitó la inclusión de preguntas negativas. Finalmente, el cuestionario quedó integrado por 10 preguntas relacionadas con aspectos demográficos, 35 preguntas relacionadas con los nuevos atributos de calidad agregados en el modelo propuesto y una pregunta abierta final. Esta pregunta abierta solicita a los encuestados indicar cualesquier otro aspecto de calidad relevante de los JS que no fue incluido, proporcionándonos así una realimentación del modelo. El proceso seguido para la construcción y ejecución de la encuesta fue el siguiente: 1. El conjunto inicial de preguntas del cuestionario se creó tomando como base los atributos de calidad propuestos en el modelo de calidad QSGame-Model. 74 2. Antes de poner la encuesta en línea se realizó un estudio piloto con 3 expertos en el desarrollo y enseñanza de JS, profesores que están impartiendo un Curso de Experto en Desarrollo de Videojuegos en la Escuela Superior de Informática de la Universidad de Castilla La-Mancha, con el fin de refinar la encuesta y reducir ambigüedades. Los resultados de este estudio y cómo éstos resultados sirvieron para refinar las encuestas inicial, se muestran en la Sección 4 de este trabajo. 3. 
3. Actualmente estamos implementando la versión refinada de la encuesta con la herramienta Survey Monkey, tanto en inglés como en castellano [22].
4. Paralelamente estamos buscando contactos para que participen en la encuesta. Intentando reclutar la mayor cantidad posible de personas, los autores de este trabajo estamos buscando contactos en grupos de investigación de universidades que tengan un máster o una especialidad en desarrollo de JS (o videojuegos) y en empresas especializadas en el desarrollo y venta de JS (o videojuegos). Además, pensamos asistir a congresos de JS como SGames (http://sgamesconf.org/2015/show/home), el VSGames (http://www.his.se/en/Research/our-research/Conferences/VS-games2015/) y el CoSECiVi (http://gaia.fdi.ucm.es/sites/cosecivi15/).
5. Una vez establecidos los contactos, les pediremos que rellenen la encuesta y posteriormente procederemos a realizar un análisis estadístico cuantitativo de las respuestas recopiladas, a través de estadísticos descriptivos, porcentajes de ocurrencia, etc., mostrados en formato gráfico y tabular (véase el esbozo incluido tras la Tabla 3).

4 Estudio Piloto: Primera Ejecución de la Encuesta

A modo de estudio piloto, les pedimos que rellenaran la encuesta a tres profesores que actualmente están impartiendo un Curso de Experto en Desarrollo de Videojuegos en la Escuela Superior de Informática de la Universidad de Castilla-La Mancha (http://www.cursodesarrollovideojuegos.com). El principal objetivo de este estudio piloto es obtener una realimentación inicial de expertos en el desarrollo y enseñanza de JS sobre el diseño de la encuesta, y que sus respuestas nos sirvan para modificar el modelo. Los tres encuestados son de sexo masculino, tienen un perfil de investigadores con un alto nivel de estudios (grado de doctor) y cuentan con una experiencia de 5 o más años en el área de las TIC, en el desarrollo de software, en el desarrollo de videojuegos y en el desarrollo de JS.
Con la realimentación proporcionada por los encuestados en este estudio piloto, además de algunas modificaciones menores realizadas sobre la redacción de algunas preguntas, se realizaron principalmente los siguientes cambios y observaciones:
─ Se modificaron las descripciones de 2 atributos de calidad: Reglas Claras y Control Real. La Tabla 1 muestra las descripciones originales y las descripciones modificadas de estos atributos. Con respecto al atributo Reglas Claras, los encuestados argumentaron que las reglas del juego no necesariamente deben establecerse todas al inicio del juego, sino que algunas se pueden ir conociendo durante el juego. La descripción del atributo Control Real resultó incomprensible para los 3 encuestados: les resultaba confuso el significado del término "control real" en este contexto.
─ Se añadieron 4 atributos de calidad: Retos Compartidos, Recompensas Compartidas, Andamiaje Correcto e Idoneidad de Equipo. La Tabla 2 muestra los nombres y descripciones de los atributos añadidos. Los atributos Retos Compartidos, Recompensas Compartidas e Idoneidad de Equipo están relacionados con aspectos de socialización. La socialización se refiere al fomento del factor social, la experiencia en grupo o la interacción con otros jugadores, que provoca que el juego sea más exitoso para el conjunto de personas que lo juegan y que puede contribuir a incrementar el grado de satisfacción de quien lo juegue [19]. Los dos primeros atributos relacionados con la socialización se refieren a la posibilidad de que los jugadores puedan mostrar a otros jugadores tanto los retos logrados como las recompensas obtenidas en el juego.
El atributo Idoneidad de Equipo está relacionado con la posibilidad de que el jugador pueda realizar las funciones del juego de forma conjunta con otros jugadores. El atributo Andamiaje Correcto está relacionado con el desarrollo incremental del proceso de aprendizaje; este proceso se debe basar en el incremento de la dificultad de los retos de un juego en la misma medida en la que el jugador alcanza niveles de mayor dificultad en el juego [19].
─ Una observación que hicieron los expertos fue que el modelo no considera aspectos específicos de la jugabilidad como la diversión, el placer y la satisfacción. La jugabilidad se refiere al conjunto de propiedades que describen la experiencia del jugador en un juego y está relacionada con la diversión, el placer y la satisfacción [19] como atributos del modelo de calidad en uso. Nuestro modelo considera atributos de producto que creemos que pueden ejercer una influencia en la calidad en uso, para lograr una mejor experiencia del jugador en un juego o una mejor jugabilidad experimentada por el jugador.
─ Al final de la encuesta se añadió otra pregunta abierta para solicitar a los encuestados cualquier otra observación adicional que desearan hacer a la encuesta.

Tabla 1. Atributos del QSGame-Model modificados.
Reglas Claras. Descripción original: Las reglas del juego deben establecerse claramente al inicio del juego. Descripción modificada: Las reglas del Juego serio deben establecerse claramente durante el juego.
Control Real. Descripción original: Los controles utilizados en el Juego serio deben permitir al jugador controlar el juego de la forma más real posible. Descripción modificada: Los controles utilizados en el Juego serio deben asemejarse lo más posible a lo que representan en la realidad (por ejemplo, si un control de un juego es un volante de coche, que sea lo más parecido posible a un volante de este tipo en la realidad).

Tabla 2. Atributos del QSGame-Model añadidos.
Retos Compartidos: Las funciones del Juego serio deben permitir al jugador mostrar a otros jugadores los retos que ha alcanzado.
Recompensas Compartidas: Las funciones del Juego serio deben permitir al jugador mostrar a otros jugadores las recompensas que ha obtenido.
Andamiaje Correcto: Las funciones del Juego serio deben proporcionar al jugador retos que se incrementen en dificultad a medida que el jugador avanza en el juego.
Idoneidad de Equipo: Las funciones del Juego serio pueden ofrecer la opción de ser jugadas en equipos de jugadores.

Al finalizar el estudio piloto, la encuesta quedó integrada por 10 preguntas relacionadas con antecedentes y experiencia, 39 preguntas relacionadas con los atributos de calidad del modelo propuesto y dos preguntas abiertas. La Tabla 3 muestra un extracto de las preguntas de la encuesta.

Tabla 3. Extracto de las preguntas de la encuesta.
1. El Juego serio debe tener todas las funciones necesarias para alcanzar los objetivos establecidos en la especificación de requisitos.
2. En las funciones del Juego serio, por cada objetivo establecido se debe ofrecer un reto y por cada reto alcanzado se debe ofrecer una recompensa.
3. Las funciones del Juego serio deben permitir al jugador mostrar a otros jugadores los retos que ha alcanzado.
4. Las funciones del Juego serio deben permitir al jugador mostrar a otros jugadores las recompensas que ha obtenido.
5. Las funciones del Juego serio deben proporcionar un resultado correcto y preciso que indique al jugador cuál es su progreso en el juego.
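A modo de ilustración del análisis cuantitativo previsto en el paso 5 del proceso de construcción y ejecución de la encuesta, el siguiente esbozo en Python muestra cómo podrían calcularse estadísticos descriptivos y porcentajes de ocurrencia sobre las respuestas de la escala ordinal. Es solo un boceto: los datos y los nombres de atributos del ejemplo son hipotéticos y no proceden de la encuesta real.

from collections import Counter
from statistics import median_low

# Escala ordinal usada en la encuesta (1 = Es muy importante ... 3 = No es importante)
ESCALA = {1: "Es muy importante", 2: "Es algo importante", 3: "No es importante"}

# Respuestas hipotéticas de varios encuestados a dos atributos de calidad
respuestas = {
    "Retos Compartidos": [1, 1, 2, 1, 3, 2, 1],
    "Andamiaje Correcto": [1, 2, 1, 1, 1, 2, 2],
}

def resumen(valores):
    """Frecuencias, porcentajes de ocurrencia y mediana (estadístico adecuado
    para datos ordinales) de las respuestas de un atributo."""
    n = len(valores)
    conteo = Counter(valores)
    porcentajes = {ESCALA[k]: round(100 * conteo[k] / n, 1) for k in sorted(conteo)}
    return {"n": n, "porcentajes": porcentajes, "mediana": ESCALA[median_low(valores)]}

for atributo, valores in respuestas.items():
    print(atributo, "->", resumen(valores))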
5 Conclusiones y Trabajo Futuro

Los resultados de una revisión de la literatura sobre la calidad de los JS que realizamos con anterioridad [11] revelaron que no existe un modelo de calidad de producto consensuado que se pueda aplicar a cualquier JS. Esto nos motivó a plantearnos como objetivo definir y validar un modelo de calidad de producto específico para los JS, que denominamos "QSGame-Model" [13]. Este modelo se basa principalmente en el estándar actual sobre la calidad del producto ISO/IEC 25010 [12], considera también los elementos que caracterizan la jugabilidad [19] como atributos de producto y creemos que podría aplicarse igualmente a los videojuegos en general. El objetivo principal que perseguimos es que este modelo de calidad pueda ayudar a los desarrolladores de JS a construir JS de calidad. Para que el QSGame-Model sea un modelo de calidad consensuado por expertos en el desarrollo y enseñanza de JS (o videojuegos), hemos diseñado una encuesta para preguntarles a estos expertos si los atributos de calidad propuestos les parecen adecuados y si su definición es comprensible. En este trabajo hemos descrito el proceso que llevamos a cabo para el diseño y construcción de esta encuesta y, además, presentamos los resultados de un estudio piloto en el que tres profesores de un Curso de Experto en Desarrollo de Videojuegos (http://www.cursodesarrollovideojuegos.com), que se imparte en la Escuela Superior de Informática de la Universidad de Castilla-La Mancha, rellenaron la encuesta. La realimentación recibida en este estudio piloto nos ayudó a identificar 4 nuevos atributos de calidad, 3 de ellos relacionados con aspectos de socialización y el otro con el desarrollo incremental del proceso de aprendizaje, además de algunas modificaciones menores realizadas sobre la redacción de algunas de las preguntas.
Nuestro trabajo futuro se centrará principalmente en el refinamiento y validación del QSGame-Model. Para ello, hemos planificado en primer lugar distribuir la encuesta a la mayor cantidad posible de expertos en el desarrollo y enseñanza de JS (o videojuegos). Una vez refinado el modelo, llevaremos a cabo experimentos para obtener evidencia empírica sobre la utilidad del QSGame-Model, es decir, evidencia que nos permita asegurar si el uso del modelo hace posible construir JS de mejor calidad. De esta manera habremos obtenido un modelo de calidad para JS consensuado por expertos y, además, útil para los desarrolladores de JS.

Reconocimientos

Este trabajo ha sido financiado por los siguientes proyectos: GEODAS-BC (Ministerio de Economía y Competitividad y Fondo Europeo de Desarrollo Regional FEDER, TIN2012-37493-C03-01) e IMPACTUM (Consejería de Educación, Ciencia y Cultura de la Junta de Comunidades de Castilla-La Mancha, y Fondo Europeo de Desarrollo Regional FEDER, PEII11-0330-4414). También nos gustaría agradecer al Instituto Tecnológico de Ciudad Victoria y a PRODEP por habernos concedido la beca que hizo posible realizar el trabajo de investigación presentado en este artículo.

Referencias

1. Zyda, M.: From Visual Simulation to Virtual Reality to Games. Computer 38 (9), 25-32 (2005) 2. Susi, T., Johannesson, M., Backlund, P.: Serious Games – An Overview. Technical Report HS-IKI-TR-07-001. School of Humanities and Informatics, University of Skövde, Sweden (2007) 3.
Wouters, P., Van der Spek, E., Van Oostendorp, H.: Current Practices in Serious Game Research: A Review from a Learning Outcomes Perspective, Games-Based Learning Advancements for Multi-Sensory Human Computer Interfaces: Techniques and Effective Practices. IGI Global, Hershey PA USA, 232-250 (2009) 4. LUDUS. How can one Benefit from Serious Games, http://www.ludusproject.eu/sgbenefits.html 78 5. Michael, D. R., Chen S. L.: Serious Games: Games That Educate, Train, and Inform. Thomson Course Technology PTR, Boston Ma (2006) 6. Ambient Insight Research. The 2012-2017 Worldwide Game-based Learning and Simulation-Based Markets, http://www.ambientinsight.com/Resources/Documents/AmbientInsight_SeriousPlay2013_ WW_GameBasedLearning_Market.pdf 7. Kitchenham B. A., Budgen D., Brereton O. P.: Using mapping studies as the basis for further research – A participant observer case study. Information and Software Technology, 53 (6), 638-651 (2011) 8. Kitchenham B. A., Charters S.: Guidelines for Performing Systematic Literature Reviews in Software Engineering. Technical Report EBSE-2007-01. Software Engineering Group of Keele University Durham UK (2007) 9. Petersen K., Vakkalanka S., Kuzniarz, L.: Guidelines for conducting systematic mapping studies in software engineering: An update. Information and Software Technology, 64, 118 (2015) 10. Zhang H., Ali Babar, M. Systematic reviews in software engineering: An empirical investigation. Information and Software Technology, 55(7), 1341-1354 (2013) 11. Vargas, J. A., García-Mundo, L., Genero, M., Piattini, M.: A Systematic Mapping Study on Serious Game Quality. In 18th International Conference on Evaluation and Assessment in Software Engineering (EASE´14), p. 15. ACM (2014) 12. ISO/IEC: ISO/IEC IS 25010: Systems and Software Engineering - Systems and Software Quality Requirements and Evaluation (SQuaRE) - System and Software Quality Models, ISO (International Organization for Standarization) (2011) 13. Garcia-Mundo, L., Genero, M., Piatini, M.: Towards a Construction and Validation of a Serious Game Product Quality Model. Enviado al Seventh International Conference on Virtual Worlds and Games for Serious Applications (2015) 14. Franch, X., and Carvallo, J. P.: Using quality models in software package selection. IEEE Software 20(1), 34-41 (2003) 15. Radulovic, F., García-Castro, R., Gómez-Pérez, A.: SemQuaRE—An extension of the SQuaRE quality model for the evaluation of semantic technologies. Computer Standards & Interfaces, 38, 101-112 (2015) 16. Herrera, M., Moraga, M. A., Caballero, I., Calero, C.: Quality in use model for web portals (QiUWeP). In 10th International Conference on Web Engineering, ICWE 2010, pp. 91101. Springer-Verlag (2010) 17. Carvallo, J., Franch, X., Quer, C.: Defining a quality model for mail servers. In Proceedings of the 2nd International Conference on COTS-based Software Systems, (ICCBSS 2003), pp. 51-61. Springer-Verlag Berlin Heidelberg (2003) 18. ISO/IEC 9126-4: ISO/IEC-9126-4 Software Engineering – Product Quality – Quality in use metrics (2004) 19. González, J. L.: Jugabilidad. Caracterización de la experiencia del jugador en videojuego. Tesis Doctoral, , Universidad de Granada (2010) 20. Kitchenham, B. A., Pfleeger, S. L.: Personal opinion surveys. In Guide to Advanced Empirical Software Engineering: Shull, F., Singer, J., Sjøberg, D.I.K. (eds.), pp. 63-92, Springer London (2008) 21. Genero, M., Cruz-Lemus, J. A., Piattini, M.: Métodos de Investigación en Ingeniería del Software. RaMa (2014) 22. Survey Monkey. 
https://es.surveymonkey.com/ 79 RACMA o cómo dar vida a un mapa mudo en el Museo de América Marta Caro-Martínez, David Hernando-Hernández, Guillermo Jiménez-Díaz Dept. Ingeniería del Software e Inteligencia Artificial Universidad Complutense de Madrid {martcaro,davihern,gjimenez}@ucm.es Resumen La Realidad Aumentada es una tecnología que permite aumentar el mundo real que percibimos con elementos virtuales interactivos. En este artículo describimos el uso de esta tecnología en el Museo de América de Madrid, sobre un mapa mudo del continente americano en el que, gracias a la Realidad Aumentada creamos personajes que dan vida al mapa y proporcionan información sobre las culturas presentes en el museo. Keywords: Realidad Aumentada, Museos, Unity3D, Vuforia 1. Introducción La Realidad Aumentada es una tecnología que combina la visualización del mundo real con elementos virtuales interactivos en tiempo real. Aunque hace unos años esta tecnología era costosa y necesitaba de una gran inversión en dispositivos que diesen soporte a la misma, a día de hoy está al alcance de la mano de cualquier persona que tenga un dispositivo móvil de última generación (smartphone o tablet). La Realidad Aumentada está siendo introducida en los museos como un medio innovador de dinamización y que facilita la inclusión de nuevos contenidos sin necesidad de tener que introducir nuevos elementos físicos en él. La Realidad Aumentada proporciona una componente interactiva muy novedosa, una nueva forma de involucrar a los turistas y visitantes de un museo con los contenidos del mismo, lo cual añade nuevo valor a nuestro patrimonio cultural turístico. En este artículo detallamos el desarrollo de la aplicación RACMA, destinada a añadir contenidos a un mapa mudo del continente americano que se encuentra en el Museo de América de Madrid. La aplicación incluye también una experiencia aumentada en casa, de modo que los contenidos del museo también pueden ser visitados fuera de él. En la siguiente sección realizamos una introducción a la Realidad Aumentada y una breve revisión de su uso en museos. Posteriormente describimos cuál es la motivación del Museo en el uso de la Realidad Aumentada (Sección 3) para más adelante describir la solución que proponemos, la aplicación RACMA (Sección 4). El artículo finaliza con detalles del estado actual de la aplicación y el trabajo futuro (Sección 5). 80 2 Marta Caro-Martínez, David Hernando-Hernández, Guillermo Jiménez-Díaz 2. Realidad Aumentada La Realidad Aumentada es una tecnología basada en el uso de dispositivos tecnológicos para crear una visualización del mundo real en la que se superponen elementos virtuales. Los dispositivos añaden estos elementos en tiempo real, creando de esta forma una visión mixta a través del dispositivo [1]. Para poder saber dónde superponer el contenido virtual se utiliza el reconocimiento de puntos de interés (o tracking) para, posteriormente, reconstruir (reconstruct/recognize) un sistema de coordenadas en el mundo real, necesario para posicionar los objetos virtuales. Los puntos de interés se pueden identificar mediante marcadores, como imágenes y códigos BIDI o QR, texto, objetos 3D simples como cilindros o cubos, hasta objectos 3D complejos con geometría conocida. Otra alternativa es obviar el uso de marcadores, identificando los puntos de interés por GPS u otros medios de ubicación (como los beacons, que usan tecnología Bluetooth y se emplean principalmente dentro de edificios). 
Una vez identificada la posición del punto de interés se puede hacer uso de sistemas inerciales de movimiento (brújula, acelerómetros, giroscopios...) para actualizar el sistema de coordenadas creado de acuerdo al punto de interés.
Aunque hace unos años los medios necesarios para disfrutar de una experiencia de Realidad Aumentada eran costosos, la realidad actual es completamente diferente gracias a la potencia y características de los dispositivos móviles actuales. Para poder disfrutar de aplicaciones de Realidad Aumentada son necesarios los siguientes elementos:
Un dispositivo que soporte el software de Realidad Aumentada, que ha de tener los siguientes componentes: (1) un monitor o pantalla donde se va a proyectar la imagen virtual superpuesta sobre la imagen real; (2) una cámara digital que toma la información del mundo real; (3) un procesador potente para procesar las imágenes captadas por la cámara; y (4) otras características como acelerómetros, GPS, giroscopios, brújula, sensores ópticos, Bluetooth, identificación por radiofrecuencia (Radio Frequency Identification o RFID), etc.
Opcionalmente, marcadores que el software de Realidad Aumentada va a interpretar para ubicar una referencia en el mundo real.
El software de Realidad Aumentada en sí mismo, responsable de interpretar los datos de ubicación en el mundo real y los movimientos del dispositivo para proyectar un conjunto de elementos virtuales en la pantalla del dispositivo.
Aunque los sistemas de reconocimiento y seguimiento de los elementos del mundo real pueden parecer complicados, en la actualidad existen múltiples librerías y kits de desarrollo que ayudan a la implementación de este tipo de aplicaciones [2]. ARToolkit (http://www.hitl.washington.edu/artoolkit/) fue probablemente una de las pioneras en dar soporte al desarrollo de aplicaciones de Realidad Aumentada y es de código abierto. Wikitude (https://www.wikitude.com/) es otra de las más conocidas y, entre otras características, dispone de una aplicación (Wikitude Studio) que facilita la creación de sistemas de Realidad Aumentada sencillos sin necesidad de tener muchos conocimientos de programación. Layar (https://www.layar.com/) y Junaio (http://www.junaio.es/) también están pensadas para poder desarrollar sencillas aplicaciones de Realidad Aumentada sin necesidad de tener que programar. Esta última es un servicio proporcionado por Metaio SDK (http://www.es.metaio.com/), un framework que permite crear aplicaciones de Realidad Aumentada en múltiples dispositivos (Android, iOS, Windows Phone) y que, además, da soporte para el desarrollo de aplicaciones con Unity3D (http://unity3d.com/), uno de los motores de juegos más utilizados en la actualidad. Para este motor de juegos también está disponible la librería Vuforia (https://developer.vuforia.com/), la cual hemos usado para el desarrollo de la aplicación descrita en este artículo.

2.1. Realidad Aumentada y Museos

Son muchos los diferentes usos de la Realidad Aumentada –marketing y publicidad, educación, aplicaciones médicas, entretenimiento, turismo... [1]. Nuestro interés se ha centrado principalmente en el ámbito del patrimonio cultural y su uso en museos, ya que es ahí donde nos ha surgido la necesidad de aplicarla.
La Realidad Aumentada aplicada sobre los contenidos de los museos es un medio innovador de dinamización y con un gran potencial, ya que permite atraer a nuevas audiencias más familiarizadas con estas tecnologías, aumentar la información que el museo proporciona a los visitantes sin necesidad de modificar el museo en sí mismo y mejorar la experiencia de usuario, tanto dentro como fuera del museo. La Realidad Aumentada proporciona, además, una componente interactiva muy novedosa, una nueva forma de involucrar a los turistas y visitantes de un museo con los contenidos del mismo, lo cual añade nuevo valor al patrimonio cultural turístico [3]. En España, museos como el Thyssen-Bornemisza o monumentos como la Alhambra de Granada ya disponen de aplicaciones lúdicas de Realidad Aumentada para involucrar al público más joven en la visita turística [4].
Aunque existen más usos de la Realidad Aumentada en los museos [5], destacamos los siguientes:
Guías del museo: algunos museos como el Louvre han creado aplicaciones para guiar a los visitantes por distintas rutas dentro del museo [6]. Algunas de estas guías no solo presentan información adicional al visitante sino que también cuentan con actividades lúdicas [7]. Algunos proyectos como ARtSENSE van un paso más allá, adaptando los contenidos de la aplicación a los intereses del visitante [8].
Reconstrucción de patrimonio cultural: la Realidad Aumentada permite visualizar aquello que está oculto o que ya no existe. Por ejemplo, The Augmented Painting es una aplicación que muestra las imágenes espectrales (rayos X, infrarrojos, etc.) del cuadro La habitación de Van Gogh sobre el propio cuadro [9]. Por otro lado, Archeoguide es otra aplicación que permite ver in situ reconstrucciones de algunos monumentos griegos [10].
Tal y como veremos más adelante, RACMA se puede considerar un híbrido entre estos dos tipos de usos, ya que servirá como guía del museo a la vez que muestra información que actualmente es invisible a los ojos de los visitantes.

3. Motivación y Descripción del problema

El Museo de América de Madrid reúne una gran colección de arqueología y etnología americana. Dentro del museo hay un gran mapa mudo del continente americano que no transmite nada a la mayoría de visitantes, ya que éstos pasan de largo sin pararse ni siquiera a mirarlo (Figura 1). El mapa se encuentra al principio del museo, por lo que se desea darle una utilidad real, haciendo que en él se pueda ver y consultar información sobre las principales culturas que están representadas en el museo.

Figura 1. Mapa mudo del continente americano en el Museo de América

El mapa, de aproximadamente 16 metros de largo por 6 metros de ancho, se encuentra en el suelo de una sala del museo. Los visitantes pasan sobre una pasarela que está a 1 metro por encima del nivel del suelo. En la Figura 2 se muestra un esquema de ubicación del mapa. Este esquema da una idea de las distancias con las que se tiene que trabajar, lo que ha supuesto uno de los principales problemas con los que nos hemos encontrado durante el desarrollo de la aplicación, tal y como describiremos más adelante.
Figura 2. Esquema de la ubicación del mapa en la sala del Museo de América

El museo deseaba mostrar información sobre las áreas culturales de América y las culturas expuestas en el museo, pero no estaba dispuesto a añadir elementos físicos que modificasen el mapa. Además, la información a incluir era bastante extensa. Así mismo, el museo quería que esa información no se quedase solo dentro del propio museo, sino que fuese accesible desde fuera de él. Por este motivo, la idea de usar la Realidad Aumentada para superponer sobre el mapa del continente americano la información sobre los contenidos del museo se convirtió en una propuesta prometedora para el museo.

4. RACMA: Realidad Aumentada de las Culturas del Museo de América

Para resolver el problema presentado se ha diseñado RACMA (Realidad Aumentada de las Culturas del Museo de América), una aplicación de Realidad Aumentada para dispositivos móviles que se puede utilizar tanto dentro como fuera del museo para dar vida a un mapa mudo del continente americano, proporcionando información sobre las distintas áreas culturales y culturas expuestas en el museo. Para ello se ha optado por poblar el mapa con personajes que representan cada una de las culturas. Al interactuar con estos personajes accederemos a la información relacionada con las áreas culturales que representan. RACMA es un híbrido entre los dos usos destacados de la Realidad Aumentada (vistos en la Sección 2.1), ya que hace visible lo invisible del mapa, a la vez que hace de guía del museo para sus visitantes. A continuación se detalla con más profundidad el funcionamiento de la aplicación y la tecnología empleada.

4.1. Descripción general

La aplicación desarrollada se puede utilizar tanto dentro como fuera del museo (se pueden ver algunos prototipos en funcionamiento de la aplicación en esta lista de vídeos en YouTube: https://goo.gl/LO1CY1). Dentro de la aplicación se han identificado ambos modos de funcionamiento como "Realidad Aumentada en el Museo de América" y "Realidad Aumentada en casa", respectivamente. En ambos casos su uso es similar y se desarrolla en tres fases:
1. Localizar el marcador de la aplicación.
2. Colocar a los personajes dentro del mapa.
3. Interactuar con los personajes para acceder a información.
Durante la primera fase se pide al usuario que localice un marcador, es decir, que enfoque con la cámara del dispositivo en el que se está ejecutando la aplicación a dicho marcador. Internamente, este marcador servirá como origen de coordenadas para la siguiente fase (más abajo se incluye un esbozo de esta idea). Los marcadores dentro y fuera del museo son distintos. Dentro del museo se barajaron diferentes opciones, como colocar imágenes o códigos QR sobre el mapa, cubos o dejar un dispositivo móvil fijo en un soporte en la sala del mapa. Finalmente se decidió colocar una imagen de tamaño DIN A2 con el logotipo de la aplicación en la pared que hay a la espalda del mapa (como aparece en la Figura 3). El tamaño y la ubicación del marcador fueron uno de los mayores problemas encontrados durante el desarrollo de la aplicación, debido a la gran distancia (aproximadamente 6 metros) hasta el marcador y la baja iluminación de la sala del museo en la que se encuentra el mapa.

Figura 3. Aspecto de RACMA cuando se usa como "Realidad Aumentada en el Museo de América"
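Para ilustrar la idea de usar el marcador detectado como origen de coordenadas, el siguiente boceto en Python muestra cómo podrían calcularse las posiciones de los personajes a partir de desplazamientos fijos respecto al marcador. Es un esbozo puramente ilustrativo: en RACMA esto lo resuelven los componentes de Vuforia sobre Unity3D, y tanto los desplazamientos como las coordenadas del ejemplo son hipotéticos.

import math

# Desplazamientos locales (x, z) de cada personaje respecto al marcador, en metros.
# Valores hipotéticos, no los usados en la aplicación real.
OFFSETS = {
    "Tlingit": (-5.0, 2.0),   # Área cultural Costa Noroeste
    "Maya":    (-1.5, 1.0),   # Área cultural Mesoamérica
    "Inca":    (1.0, -2.5),   # Área cultural Andina
    "Shuar":   (3.0, -1.0),   # Área cultural Amazónica
}

def colocar_personajes(marcador_pos, marcador_rot_grados):
    """Convierte los desplazamientos locales en posiciones del mundo usando el
    marcador detectado como origen del sistema de coordenadas."""
    mx, mz = marcador_pos
    ang = math.radians(marcador_rot_grados)
    posiciones = {}
    for nombre, (dx, dz) in OFFSETS.items():
        # Rotación 2D alrededor del eje vertical + traslación a la posición del marcador
        wx = mx + dx * math.cos(ang) - dz * math.sin(ang)
        wz = mz + dx * math.sin(ang) + dz * math.cos(ang)
        posiciones[nombre] = (round(wx, 2), round(wz, 2))
    return posiciones

# Ejemplo: marcador detectado en (10, 4) con una rotación de 90 grados
print(colocar_personajes((10.0, 4.0), 90.0))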
Cuando la aplicación se utiliza fuera del museo, el marcador empleado es un mapa esquemático del continente americano que el usuario puede imprimir para usar con la aplicación (ver Figura 4). Una vez que se ha localizado el marcador, la aplicación coloca a los personajes en su ubicación inicial dentro del mapa. El marcador sirve como origen de coordenadas para colocar a los personajes, de modo que cada uno de ellos se ubica en el área que le corresponde dentro del mapa. Se han desarrollado cuatro áreas culturales con sendas culturas:
Área cultural Costa Noroeste, representada por la cultura Tlingit.
Área cultural Mesoamérica, representada por la cultura Maya.
Área cultural Andina, representada por la cultura Inca.
Área cultural Amazónica, representada por la cultura Shuar.

Figura 4. Aspecto de RACMA cuando se usa como "Realidad Aumentada en casa"

Cada uno de los personajes tiene un aspecto característico, de modo que sean fácilmente identificables. Estos personajes se mueven dentro de su área cultural para darles más vida y dinamismo. Además, se han colocado otros elementos interaccionables para proporcionar información sobre los contenidos expuestos y la aplicación en sí misma. Una vez posicionados, el usuario puede interactuar con los personajes pulsando sobre ellos en la pantalla del dispositivo móvil. Esta simple interacción da acceso, primeramente, a la información detallada del área cultural, incluyendo una descripción de la misma y una galería de imágenes de las piezas de esta área que se exponen en el museo. Desde aquí también se puede acceder a la información concreta de la cultura representada. Igual que antes, se incluye una descripción y una galería de imágenes de las piezas expuestas de esta cultura. Ambas galerías incluyen información sobre la ubicación de las piezas dentro del museo. El aspecto de estas interfaces se puede ver en la Figura 5.

Figura 5. Información accesible al interactuar con un personaje: descripción de la cultura o del área cultural y galería de imágenes, con mapa de ubicación de las piezas dentro del museo.

4.2. Tecnología empleada

RACMA ha sido desarrollado íntegramente en Unity3D (v4.6) para dispositivos Android. Para la parte de Realidad Aumentada se ha utilizado la librería Qualcomm Vuforia (v3.0.9), ya que su integración con Unity3D es muy sencilla y rápida. De entre los posibles tipos de marcadores soportados por Vuforia se han empleado los ImageTarget, que usan una imagen como marcador. Para poder utilizar este tipo de marcador, las imágenes se han tratado con el servicio Target Manager de Vuforia, que genera un mapa de características de la imagen para poder reconocerlas fácilmente. A pesar de que Vuforia no acepta cualquier imagen como marcador, las usadas en RACMA no han supuesto ningún problema y el reconocimiento del marcador se realiza de manera rápida. El desarrollo de la funcionalidad de "Realidad Aumentada en casa" fue bastante rápido. Sin embargo, la funcionalidad de "Realidad Aumentada en el Museo de América" fue más problemática debido a los problemas de distancia e iluminación anteriormente descritos. Además, la ubicación de uno de los personajes quedaba fuera del entorno del marcador, lo que hacía difícil posicionarlo en la aplicación.
La gran distancia entre el visitante con el dispositivo móvil en el que se ejecuta la aplicación y la ubicación del marcador hacía que los personajes flotasen sobre el mapa, se moviesen a saltos y perdiesen su ubicación original. Ello nos obligó a utilizar la característica Extended tracking de Vuforia: una vez que se localiza el marcador, Vuforia es capaz de inferir su posición gracias a la información del entorno aunque el marcador quede fuera del campo de visión de la cámara del dispositivo, evitando así los cambios de posición aleatorios de los personajes y los movimientos a saltos.
De acuerdo a la filosofía de desarrollo en Unity3D, la aplicación se compone de las siguientes escenas:
Menú principal. Esta es la escena inicial y consiste en un simple menú que da acceso a las distintas funcionalidades de la aplicación.
Realidad aumentada en el museo y Realidad aumentada en casa. Ambas escenas tienen una estructura similar y son las que hacen uso de los gameObjects proporcionados por la librería Vuforia. En ellas se encuentra la cámara de realidad aumentada (ARCamera) y una representación del marcador ImageTarget. Este objeto es el padre de la subescena en la que están colocados los personajes. Inicialmente esta subescena está desactivada, para que, cuando se detecte el marcador, se active y aparezcan los personajes sobre el mapa. Cada personaje dispone de un componente genérico responsable de cargar la escena de información asociada al área cultural que el personaje representa, así como un componente que le permite deambular por su área en el mapa.
Área cultural. Se ha creado una escena por cada área cultural. Cada una de estas escenas contiene al personaje representativo de la misma, usado en alguna de las ventanas de información, así como algún otro elemento estético. Además, la cámara tiene un componente genérico que carga y presenta toda la información disponible asociada a un área cultural. Este componente está parametrizado con el nombre del área y de la cultura más destacada en esa área, con el fin de cargar toda la información contenida en la carpeta de recursos asociada y generar las interfaces dinámicamente (más abajo se incluye un esbozo de esta idea). Este componente ha sido de gran utilidad, ya que ha servido para independizar los contenidos de la aplicación de su presentación y hace que la aplicación sea extensible para la inclusión de nuevas áreas de manera rápida y sencilla.
Información. Esta escena se carga para presentar información superpuesta en las pantallas donde está haciéndose uso de la cámara de realidad aumentada.
La aplicación se ha desarrollado utilizando como entorno de pruebas un móvil Doogee Valencia DG800 con Android 4.4 KitKat. Posteriormente se ha probado con un total de 12 dispositivos Android distintos, como los Samsung Galaxy S4 y S5 y el Nexus 5. No se han detectado problemas en la mayoría de los dispositivos con Android KitKat. Sin embargo, la aplicación no ha funcionado en los dispositivos con Android Lollipop debido a una incompatibilidad con Vuforia. Además, destaca la imposibilidad de utilizar la aplicación en el Nexus 5 dentro del museo. El problema se debe a una incompatibilidad entre Vuforia y la cámara del dispositivo en entornos con baja luminosidad, que hace que la imagen se vea completamente negra y que, por tanto, no sea posible localizar el marcador. Sin embargo, sí es posible utilizar la funcionalidad "Realidad Aumentada en casa" en el Nexus 5 siempre y cuando tengamos la iluminación adecuada.
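Como boceto de la idea del componente parametrizado que carga los contenidos de cada área desde la carpeta de recursos, el siguiente fragmento en Python ilustra cómo el nombre del área y el de su cultura bastan para construir dinámicamente la información mostrada. No es el código real de RACMA (que está escrito sobre Unity3D); la convención de carpetas y los nombres de archivo son hipotéticos y el ejemplo crea un árbol de recursos mínimo solo para poder ejecutarse.

from pathlib import Path

RAIZ = Path("Recursos")  # convención hipotética de carpetas de recursos

def crear_ejemplo():
    """Crea una carpeta de recursos mínima, únicamente para que el ejemplo se pueda ejecutar."""
    galeria = RAIZ / "Mesoamerica" / "Maya" / "galeria"
    galeria.mkdir(parents=True, exist_ok=True)
    (RAIZ / "Mesoamerica" / "descripcion.txt").write_text("Área cultural Mesoamérica...", encoding="utf-8")
    (RAIZ / "Mesoamerica" / "Maya" / "descripcion.txt").write_text("Cultura Maya...", encoding="utf-8")
    (galeria / "pieza_01.png").touch()

def cargar_area(area, cultura):
    """Construye la información de un área cultural y de su cultura destacada a partir
    de la convención de carpetas, sin necesidad de código específico por área."""
    def bloque(carpeta):
        galeria = carpeta / "galeria"
        return {
            "descripcion": (carpeta / "descripcion.txt").read_text(encoding="utf-8"),
            "galeria": sorted(p.name for p in galeria.glob("*.png")) if galeria.exists() else [],
        }
    return {"area": bloque(RAIZ / area), "cultura": bloque(RAIZ / area / cultura)}

# Añadir un área nueva solo requeriría crear su carpeta de recursos:
crear_ejemplo()
print(cargar_area("Mesoamerica", "Maya"))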
5. Estado actual y trabajo futuro

Tras completar el desarrollo de la aplicación se han comenzado a realizar evaluaciones formales con usuarios. Además de la realización de pruebas de usabilidad de la misma, tanto en su versión "Realidad Aumentada en casa" como en "Realidad Aumentada en el Museo de América", se está estudiando la aceptación de esta tecnología (la Realidad Aumentada) entre los visitantes de los museos, teniendo en cuenta sus conocimientos en este tipo de aplicaciones y el rango de edad. Hasta la fecha se han realizado un total de 34 evaluaciones divididas por rangos de edad. Los resultados preliminares son prometedores ya que, en general, a los usuarios la aplicación les resulta muy fácil de utilizar y les ha gustado la originalidad de la tecnología, el diseño de la aplicación y ver a los personajes en tres dimensiones delante de ellos, pudiéndolos tocar casi con la mano. A pesar de esto, algunos usuarios han sugerido añadir información adicional, como billboards sobre el personaje para identificar mejor las culturas que representan, o la inclusión de más personajes. La aplicación ha tenido una amplia aceptación: la descargarían y la recomendarían, y a una gran mayoría de los usuarios les gustaría utilizar aplicaciones similares a esta en otros museos. Actualmente también se ha subido una versión Beta a Google Play para poder realizar una prueba más exhaustiva con otros dispositivos, ya que hemos visto que estamos encontrando problemas dependientes del modelo concreto de dispositivo móvil. Algunos usuarios comentaron también la posibilidad de llevar la aplicación a tablets o a dispositivos iOS, por lo que estas serán algunas de las líneas de desarrollo futuro a estudiar.
Para finalizar, estaríamos interesados en estudiar el impacto de la inclusión de actividades más lúdicas dentro de la aplicación –minijuegos, inclusión de otros personajes con los que jugar dentro del mapa... La aplicación actual es meramente informativa, pero pensamos que la inclusión de mecánicas de juego podría atraer a usuarios más jóvenes a los museos. Ahora bien, sería necesario estudiar si precisamente esta jugabilidad genera rechazo entre otros tipos de usuarios mayores.

Agradecimientos

Quisiéramos agradecer a Andrés Gutiérrez y Beatriz Robledo, del Museo de América, su ayuda en el desarrollo de los contenidos de la aplicación. También agradecemos a Samuel C. Palafox y a Juan Francisco Román su trabajo en el arte 2D y 3D de la aplicación.

Referencias

1. Carmigniani, J., Furht, B., Anisetti, M., Ceravolo, P., Damiani, E., Ivkovic, M.: Augmented reality technologies, systems and applications. Multimedia Tools and Applications 51(1) (2011) 341–377 2. Amin, D., Govilkar, S.: Comparative Study of Augmented reality SDKs. International Journal on Computational Sciences and Applications 5(1) (2015) 11–26 3. Angelopoulou, A., Economou, D., Bouki, V., Psarrou, A., Jin, L., Pritchard, C., Kolyda, F.: Mobile augmented reality for cultural heritage. In: Mobile Wireless Middleware, Operating Systems, and Applications. Springer (2012) 15–22 4. Española, A.C.: Anuario AC/E 2015 de Cultura Digital. Modelos de negocio culturales en Internet. Museos y nuevas tecnologías. (2015) 5. Huang, Y., Jiang, Z., Liu, Y., Wang, Y.: Augmented reality in exhibition and entertainment for the public. In Furht, B., ed.: Handbook of Augmented Reality. Springer New York (2011) 707–720 6.
Miyashita, T., Meier, P., Tachikawa, T., Orlic, S., Eble, T., Scholz, V., Gapel, A., Gerl, O., Arnaudov, S., Lieberknecht, S.: An augmented reality museum guide. In: Proceedings of the 7th IEEE/ACM International Symposium on Mixed and Augmented Reality, Washington, USA, IEEE Computer Society (2008) 103–106 7. Tillon, A., Marchand, E., Laneurit, J., Servant, F., Marchal, I., Houlier, P.: A day at the museum: An augmented fine-art exhibit. In: IEEE International Symposium on Mixed and Augmented Reality. (Oct 2010) 69–70 8. Damala, A., Stojanovic, N., Schuchert, T., Moragues, J., Cabrera, A., Gilleade, K.: Adaptive augmented reality for cultural heritage: Artsense project. In Ioannides, M., Fritsch, D., Leissner, J., Davies, R., Remondino, F., Caffo, R., eds.: Progress in Cultural Heritage Preservation. Volume 7616 of Lecture Notes in Computer Science. Springer Berlin Heidelberg (2012) 746–755 9. van Eck, W., Kolstee, Y.: The augmented painting: Playful interaction with multispectral images. In: IEEE International Symposium on Mixed and Augmented Reality, IEEE (2012) 65–69 10. Vlahakis, V., Ioannidis, N., Karigiannis, J., Tsotros, M., Gounaris, M., Stricker, D., Gleue, T., Daehne, P., Almeida, L.: Archeoguide: an augmented reality guide for archaeological sites. IEEE Computer Graphics and Applications 22(5) (2002) 52–60 89 Design methodology for educational games based on interactive screenplays Rafael Prieto de Lope ∙ Nuria Medina-Medina ∙ Patricia Paderewski ∙ F.L. GutiérrezVela Centro de Investigación en Tecnologías de la Información y la Comunicación, Universidad de Granada (CITIC-UGR). C/ Periodista Rafael Gómez Montero 2, 18014, Granada, Spain. rapride@correo.ugr.es nmedina@ugr.es patricia@ugr.es fgutierr@ugr.es Abstract. A number of studies have been published on the benefits offered by educational video games for student development and there has been a constant increase in the use of serious games for this purpose. Very few methodological proposals for educational video game development, however, have been published in scientific literature and the proposals analyzed in this paper display certain drawbacks that limit their application. This article therefore presents a new methodology for developing educational games based on interactive screenplays. This methodology seeks a balance between the overall and the detailed view required to create the game. In order to achieve this, the methodology moves between different levels of abstraction and deconstructs the process into phases and steps that structure this complex task and which can be understood by non-technical members of the multidisciplinary team. Keywords: Serious games, educational games, development methodology 1 Introduction All games whether commercial or non-commercial have a number of common features such as high interactivity, fun, rules that the player must follow and in many cases a competitive element. Serious games [24][3], however, are not only aimed at providing entertainment or competiveness but also at exploiting these in order to improve training in areas such as education, public policy, health or communication strategies. In recent years, there has been a boom in the number of serious games, and since 2007 there has been a considerable increase in the scientific production in this field. A thorough search of scientific literature on serious games from 1990 to 2012 revealed that 54% of papers on this subject were published in the period 2007-2012 [23]. 
Another relevant fact is shown by Vargas [20] who states that in a systematic search, 60.71% of serious games belong to the educational sphere. These results might be explained by problems such as dropping out of school due to lack of motivation. It is possible that educational video games (also called educational games in this paper) can provide that missing motivation, thereby making them an excellent teaching tool 90 for teachers. Correspondingly, a number of studies have identified certain advantages of using video games in education [5] [16] [18] in that they reduce reaction time improve hand-eye coordination increase self-esteem improve spatial conception (manipulating objects in 2D and 3D, rotation plans, etc.) encourage interactive learning motivate learning through challenges stimulate exploratory behavior and the desire to learn permit simulators so that users can practice without any real consequences improve social skills and basic math articulate abstract thinking improve cognitive skills (e.g. strategic planning, multiple learning styles, etc.) Our aim in this article is to highlight the shortage that still exists of specific methodologies for designing educational games that must be conceived by non-technical personnel (including educators, writers and artists) to be used by software developers. The article is organized as follows. Section 2 outlines the current state of methodologies for designing the video games and educational games discussed in Section 3. Section 4 describes our approach in an attempt to reduce the previously identified disadvantages and briefly describes a video game currently being developed. Finally, Section 5 presents our conclusions and the framework for the application of the proposed methodology (that of the development of an educational game to teach comprehensive reading to upper primary school children [11]). 2 State of scientific methodology literature on game development Development methodology refers to a series of techniques and/or processes by which a video game is developed. While it is possible to develop a video game by following various general software methodologies (e.g. the waterfall model, the incremental or the agile method, etc.), game development generally consists of three phases: pre-production, production and post-production based on the film’s life cycle. In addition, certain authors have even defined a preliminary phase [19]. Our interest, however, lies in the development of game-specific methodologies and with this in mind these methodologies and processes are outlined below. 5M methodology for games The 5M classification is often used in the engineering industry and can be applied to video game development as follows [12]: Method: general organization of the different production steps, including the inflow of material production and the intervention of human actors 91 Milieu: all the elements involved in serious game production, for example domain experts (teachers, doctors, engineers, etc.), independent subcontractors (sound technicians, graphic designers, etc.) and students and tutors (testing and feedback) Manpower: the team of human actors involved in the production chain. For reasons of comprehension, these actors are described by their roles (pedagogical expert, programmer, etc.) although these roles can be assigned to a single person. 
Machine: set of tools that help the human actors produce the serious game Materials: documents, prototype models, executable files, databases and other devices used to produce the final serious game Design process based on Padilla-Zea models The game is defined by a series of models generated during the design process [14]: educational content models, entertainment content models, models for the interrelation between the educational and the entertainment content and user models for adaption. This approach emphasizes the relationship between educational objectives and play challenges that the game activities share with the educational tasks being implicitly undertaken. Methodology based on Westera levels This approach combines three different levels [22] for the system integration, framework and structure of the video game: on a conceptual level, a game is considered to be a system (i.e. a set of interrelated elements). A game is designed by specifying certain relevant factors, taking into account the two fundamental dimensions of space and time: the space dimension covers the static configuration of gaming locations (virtual) and includes associated objects, attributes and relationships, and its evolution over time covers the game dynamics. on a technical level, the framework describes the basic architecture of the game development system which describes the system and its tools for developing the places, objects, actor roles and scenarios of the video game. on a practical level, i.e. the structure of the game, the options offered to the players and the multimedia representation of the game environment SUM methodology SUM is an agile methodology for game development that adapts the Scrum structure and roles [1]. SUM suits small multidisciplinary teams (three to seven components) and short-term projects (less than a year). The methodological definition is based on SPEM 2.0 (Software and Systems Process Engineering Metamodel Specification). The main advantage of SPEM is its flexibility and adaptability since it is not necessary to mention specific practices. Roles: The methodology defines four roles: development team, internal producer, customer and beta tester. 92 Life-cycle: This is divided into iterative and incremental phases that are executed sequentially, with the exception of risk management, which is performed throughout the project. Ontological methodology In his work, Llansó [8] outlines the problems common to game development and focuses on the uniqueness of the multidisciplinary team that is usually involved (e.g. the artists, designers, programmers and in the case of serious games, all manner of professionals) and this can sometimes result in the breakdown of project communication. By way of solution, the methodology proposes the ontology as a basis for communication whereby the designers are solely responsible for describing the characters, objects, functions and status of the run of play and the programmers refine the technical details and objectives. In this way, they are working on different views with the same information. 3 Discussion about existing proposals Although game development in general and the design of educational video games in particular are complex processes that are far removed from conventional software development, very little has so far been published on the design or development of serious or educational games from a specific perspective. Among the work that stands out in this field is the ontological methodology. 
This emphasizes the particular characteristics of working with a multidisciplinary team, which is essential for game development, and offers a complete guide to solving this problem. There are, however, certain drawbacks that are not restricted to serious games (and possibly this type of video game should be disregarded) and the main focus is on facilitating communication between the different team members while ignoring other difficulties which are inherent to the design itself. In addition, the ontological syntax may not be intuitive to non-technical staff. The collaborative learning methodology presented in [14] considers collaboration to be an enriching part of the learning process. By employing very formal models, however, it lacks graphical notations that are easy for the multidisciplinary team members to understand. Since the SUM methodology is directed towards video games in general and is defined for small projects, it is not suitable for the purpose of this study (although it might be considered supplementary). Similarly, the 5M methodology proposes an interesting production process for educational games, but is unsuitable for software engineering. The following common shortcomings have also been identified: There is no sufficiently detailed process to explain the series of steps to be followed when constructing the interactive story around which the game will be executed. There is a lack of mechanisms to enable collaboration, except in the methodology [14] which defines the rules of collaboration or cooperation between two or more players in order to achieve goals, challenges and achievements. 93 There is no clear or definite correspondence between education and fun, except in the proposal in [14] where the formalization of this interrelation is fundamental to the game balance. In this article, however, we only explore the conceptual level of educational and recreational purposes and do not define how educational challenges are included within the game narrative. No graphical notations are used to specify the game: graphical notations are only used in the work by Llansó [8] and then as a supplement. Graphical notations are useful for non-technical staff, designers and developers alike. 4 A new methodology based on interactive screenplays The methodology proposed in this paper focuses on educational games with narrative and begins with the narrative screenplay of the game organized into chapters and scenes. The various other game elements are then progressively added to this script (e.g. scenarios, characters, fun and educational challenges, etc.). The use of narrative as the core helps writers, educators and artists construct the adventure and dynamics of the game, and is supported and complemented throughout the process by the designers. A series of graphical notations can then be generated from the interactive script such as diagrams showing the challenges, objects and scenarios. Not only do these diagrams provide an abstract view of the game but they also facilitate video game implementation and can be directly interpreted by the developers who were not involved in the design. Fig.1. Methodology based on interactive screenplay 94 More specifically, the methodology comprises a series of ordered, iterative steps (Figure 1) that begins with three preliminary phases. Pre1. 
Design of the educational challenges: basic competences and educational objectives In this first phase, the team of teachers and educators (which could also include parents and guardians) determines the competences and specific educational objectives that the game will address. In the first step, the team defines the competences. A competence is considered to be more than knowledge and skills and involves the ability to meet complex demands, supporting and mobilizing psychosocial resources (including skills and attitudes) in a particular context [4]. For example, the following eight basic skills [9] [10] are defined for the Spanish education system: 1. linguistic communicative competence 2. mathematical competence 3. knowledge of and interaction with the physical world 4. data processing and digital competence 5. learning to learn 6. social and civic competence 7. autonomy and personal initiative 8. cultural and artistic competence Depending on the game’s pedagogical framework, certain skills will of course be included. In the second step, educational goals are established whereby the objectives to be achieved during the development of an educational cycle or specific subject are defined, and these will be integrated either directly or indirectly into subsequent achievement assessment as the game is used. In this pre-phase, the teaching team could use curricular models with which they are familiar. Pre2. Design of the type of game Before designing the story and challenges of the game it is necessary to determine a series of game characteristics that may affect subsequent design decisions. These features include gender, avatar control, platform, future users, narrative level, area of application and interactivity. For example, the classification of [7] could be used to determine the video game genre (e.g. action, adventure, fight, logic, simulation, sport, strategy, etc.). The platform used could be a PC, console or smartphone/tablet and in order to identify future users the age recommended in [15] could be used or more specifically, the group at which the game is aimed (e.g. primary pupils). Depending on the narrative level to be included in the video game and based on [2] but with a reduction in the number of categorization criteria from ten to six, the following types could be established: no narrative, elementary narrative, basic narrative, full narrative, complex narrative and “narrative is everything” (from the lowest to highest weight of the narrative in the game). From the perspective of how players control their avatars, it is necessary to establish whether there is third or first person avatar control (for any avatar appearing on the scene) or if any avatar may represent the player (for example, sever- 95 al characters are controlled as in Sim or none as in Tetris). Finally, interaction establishes how the player or players interact with the game, whether active (by interacting with their own body) or standard (by interacting using special or common peripherals). It is also possible at this stage to decide whether interaction is point & click or touch (although this will obviously depend on the platform chosen). Esthetic aspects and choice of 2D or 3D could also be specified in this pre-phase to be considered during the character design phase. Pre3. Initial design of the story and main characters Generally speaking, in order to fully define the game’s story, various iterations are required and the number of these is likely to be proportional to the narrative level. 
When a complex, full or everything narrative is chosen, it is easy to lose the overall view of the adventure and the associated dynamics. In order to reduce this risk, an initial, abstract story design should be drawn up. Some or all of the main characters that will appear in the future game are also chosen. This design could be enhanced with graphical sketches. With these three pre-phases, this initial conception enables the design team to tackle each of the phases listed below. 1. Chapter design In the methodology proposed, a chapter is defined as the item of the highest level that is used to organize the story and facilitate content integration. Each game should comprise at least one chapter, although there are usually several. The transition and order of the chapters can be established using a chapter flowchart. In order to define each chapter, it is necessary to provide an identifier, an abbreviated name and the plot of the chapter’s overall adventure. The Hero’s Journey [21] is one example of how the story line can be organized into chapters and these include Ordinary World, Call to Adventure, Refusal of the Call, Meeting with the Mentor, Crossing the First Threshold, Tests, Allies and Enemies, Approach to the Inmost Cave, The Ordeal, Reward, The Road Back, Resurrection and Return with the Elixir. While not compulsory, it is possible to specify the different educational objectives of each chapter in order to ensure early on that the educational component is balanced between chapters. 2. Scene design Each chapter is split into scenes which comprise the chapter story line in the same way as the scenes of a play or film. The number and order of scenes in a chapter can be specified using a scene flowchart. Since the flow is not normally unique, the design team can define transitions depending on a player’s future decisions and actions, which would mean that certain scenes are optional (i.e. the player does not have to live them all). Once each scene in the chapter has been described with its name and brief summary, the items listed below are specified. 96 Design of the scenario In this phase, the scenario of the scene is described and identified with an ID. The scenario is the place where the actions and dialogues occur in a scene. The scenario definition includes both a static and a dynamic part. While the static part defines the environment (e.g. room, lake, etc.) and the objects to be found there (e.g. table, wall chart, weapon, etc.), the dynamic part defines object interactivity (e.g. inventoried or not, mobile or not, associated powers, etc.) in the scene. Some objects can therefore have certain associated interactions (e.g. take the object, change some of its attributes, move it, etc.) in one scene but not another. For this reason, the scenario is included in the scene description. When one scenario is statically and dynamically identical to the scenario of another scene, it is therefore not redefined but the ID is used directly. Design of the characters In the scenario and during the scene, one or more characters will appear, some of which will have been briefly described in the third pre-phase. The first time characters appear, it is necessary to describe in detail their characters (and these can subsequently be expanded upon in successive iterations), appearance and personality. Initially, words can be used to describe their physical appearance but in successive iterations, graphical sketches can be used. 
Whenever a character is involved in future action, there is no need to describe them again, except if some of their attributes have changed (for example, they are wearing different clothes or have become bad-tempered because of something that happened).

Design of the dialogues and play challenges

During the scene, the characters perform various actions in order to overcome the game's play challenges and they can also talk to each other and hold dialogues. Again, a flowchart can be used to describe the actions and/or dialogues that comprise the scene. There is the added difficulty that the order is not usually fixed and there will be some flexibility (or free will) in the scene so that the players can choose their own game paths. Each action or dialogue must then be defined. In a first iteration of the proposed methodology, a dialogue or challenge can be described in a couple of words but, at a later stage, it is necessary to outline the sequence of steps needed to complete each action and the exchange of phrases in the dialogue using a series of diagrams. For dialogues, it is possible to adapt traditional film scripts, indicating the character who speaks and their mood (or other applicable attribute) in each intervention. Once again, when the avatar participates in a dialogue it is important to note that, in order to increase the fun and complexity, the dialogue will not be closed but will depend on the answers chosen by the player. These decisions should be specified in the dialogue flowchart. Because of the possible dialogue complexity, it is advisable to first define the successful dialogue (the key to overcoming the challenges of the scene) and gradually add new alternatives. Play challenges (actions) can be part of a larger play challenge and can also be recorded as necessary for the future score in the game. At the end of this phase, therefore, the game mechanics and score should be clear.

3. Identification / labeling of educational challenges and assessment

Associated with the play challenges are the educational objectives being pursued, and these are hidden in certain parts of the dialogues. In this case, it is possible to specify that a particular point of the dialogue poses an educational challenge or offers some information needed to solve it. It is also necessary to indicate when a response in a dialogue, a step or a complete play challenge (in an action) is the solution to an educational challenge. Whenever an educational goal is achieved, the corresponding evaluation rule should be defined. The evaluation rule may have associated conditions for its application and will use the values collected in different parts of the scene (or even in other scenes) where the player has been working on the educational task. Finally, it should be noted that the educational component can be divided not only among the play actions and dialogues but also among the scenario objects (for example, a letter that provides specific knowledge to solve an educational challenge, or an interaction with an object that means that the educational goal has been reached). This can also be recorded in this phase.

4. Identification/labeling of emotions

One aspect that should not be overlooked when designing a scene is the identification of the emotions that we wish the player to display. For this, the emotions established in [17] are used to design the player's experience, marking the parts of the dialogue or the steps of an action that aim to evoke a particular emotional reaction.
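As a hedged sketch only (our own naming, not the notation proposed by the methodology), a branching dialogue node could carry both the educational challenge it poses (phase 3) and the emotion it aims to evoke (phase 4), with each player answer selecting the next node of the dialogue flowchart:

```python
# Hypothetical sketch: a dialogue node tagged with an educational challenge and a target
# emotion; the player's chosen answer determines the next node in the dialogue flowchart.
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class DialogueNode:
    speaker: str
    text: str
    mood: str = "neutral"
    educational_challenge: Optional[str] = None   # id of the challenge posed or solved here
    target_emotion: Optional[str] = None          # emotion this step aims to evoke [17]
    answers: Dict[str, "DialogueNode"] = field(default_factory=dict)  # answer -> next node

question = DialogueNode(
    speaker="Cleopatra",
    text="Which of these treasures belongs to my time?",
    mood="solemn",
    educational_challenge="EC-history-03",
    target_emotion="excitement",
)
question.answers["The amulet"] = DialogueNode("Cleopatra", "Well done, traveller.", mood="pleased")
question.answers["The pocket watch"] = DialogueNode("Cleopatra", "That object is not from my era.")

def advance(node: DialogueNode, chosen_answer: str) -> DialogueNode:
    """Follow the branch selected by the player (stay on the node for unknown answers)."""
    return node.answers.get(chosen_answer, node)
```

Evaluation rules (phase 3) could be attached in a similar way, as condition/score pairs that are checked when a tagged node is reached.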
The study in [6] shows that this is a complex process: viewers' emotional responses are analyzed in depth as they watch a video, and the two axes of valence and arousal are defined to represent these emotions.

5. Adaptation design

In this phase, it is necessary to determine whether the game is capable of adapting to the player's capabilities and characteristics, the game device or the environment. We therefore need to define which attributes of the game can be customized (e.g. educational challenges, interaction mode, narrative, evaluation rules, etc.), based on which properties (the player's knowledge, tastes and preferences, device resolution, physical context, etc.), how adjustments should be made (adaptation techniques to modify difficulty, change a character's appearance, etc.) and when (the time at which the adjustment is made and how the player controls this). Although some methodologies do exist that create product lines which can be adapted to different groups [13], we have created a single product that can be adapted by adjusting certain features to suit the requirements of each child in the same group or school year.

6. Collaboration design

Following the collaborative proposal in [14], it is necessary to mark the actions (play challenges) or steps within the actions that must or can (as determined) be performed in groups.

Use Case: Designing a videogame for reading comprehension

This methodology has been conceived from our experience of designing an educational game [11] to practice reading comprehension, which is still being developed. The game is an adventure with 2D graphics and point & click interaction. The narrative tells the story of a boy/girl (adjustable avatar) on whom the future of planet Earth depends. For this, the avatar must travel back in time and find certain characters (e.g. Cleopatra) who will give him/her historically important items so that the player can meet the required challenges. The avatar must give these treasures to a series of evil aliens who are aiming to clone or destroy Earth. We use graphical notations to ease communication between the members of our multidisciplinary team. The following figure shows a simplified version of the scene diagram for a chapter where the avatar must accomplish a goal.

Fig. 2. Scene diagram example

As the example illustrates, the Rome chapter comprises four scenes, one of which is optional (the video game can proceed if this scene is omitted). It is also possible to observe three types of transitions: the standard transition, the go-back transition and, in the final optional scene, a transition that makes it possible to reach the visit prison scene without having completed a previous scene.

5 Conclusion and future work

Despite the great impact of video games on contemporary society and their proven value for supporting and enriching the learning process of schoolchildren of all ages, there are currently few specific methodologies for developing educational video games, and the ones that do exist display certain shortcomings, as discussed previously. This paper, therefore, presents a new proposal which is based on an interactive screenplay that integrates all transverse game aspects. Our methodology proposes a top-down strategy, since the overall game design (educational objectives, type of game, story and main characters) is created in the first three pre-phases, to be further refined at a later stage in the chapters and scenes.
Each detail is defined in an interactive screenplay that engages transverse aspects such as characters, scenarios, dialogues, challenges, emotions, adaptation rules, collaboration possibilities, play score and the evaluation of educational goals. The interesting thing is that, working bottom-up, from this low-level script it is possible to create a series of more abstract diagrams depicting overall challenges, competence assessment, transitions between scenarios, object interactions, character evolution, emotional experience progress, etc. These results will be used by developers and can also be used beforehand by designers to analyze the balance and correctness of the design. It should be mentioned that designing emotions, adaptation and collaboration is optional, as not all games include these features. There are three main lines to our future work: firstly, to complete the graphical notations in order to produce the diagrams needed for each phase; secondly, to apply the methodology (including the graphical notations) to finish designing our game so that it may be implemented by a company from the resources created; and thirdly, to incorporate the possibility of using a tool to assist with the creation of the diagrams.

Acknowledgments

This research is supported by the Ministry of Science and Innovation (Spain) as part of the VIDECO project (TIN2011-26928) and by the Andalusia Research Program under the project P11-TIC-7486, co-financed by FEDER (European Regional Development Fund - ERDF).

References

1. Acerenza, N., Coppes, A., Mesa, G., Viera, A., Fernández, E., Laurenzo, T. and Vallespir, D. (2009). Una Metodología para Desarrollo de Videojuegos. En Anales 38º JAIIO - Simposio Argentino de Ing. de Software (ASSE 2009), pp. 171-176.
2. Belinkie, M. The Video Game Plot Scale. August 30th, 2011. http://www.overthinkingit.com/2011/08/30/video-game-plot-scale/.
3. Connolly, T. M., Boyle, E. A., MacArthur, E., Hainey, T. and Boyle, J. M. (2012). A systematic literature review of empirical evidence on computer games and serious games. Computers & Education, 59 (2), pp. 661–686.
4. DeSeCo. (2003). La Definición y Selección de Competencias Clave. Organización para la Cooperación y el Desarrollo Económico (OCDE), y traducido con fondos de la Agencia de los Estados Unidos para el Desarrollo Internacional (USAID).
5. Griffiths, M. (2002). The educational benefits of videogames. Education and Health, 20(3), 47-51.
6. Hanjalic, A. and Xu, L-Q. (2005). Affective video content representation and modeling. IEEE Transactions on Multimedia, 7 (1), pp. 143–154.
7. Herz, J. C. (1997). Joystick nation: How videogames ate our quarters, won our hearts, and rewired our minds. Boston: Little, Brown and Company.
8. Llansó, D., Gómez-Martín, P. P., Gómez-Martín, M. A. and González-Calero, P. A. (2013). Domain Modeling as a Contract between Game Designers and Programmers. SEED, 13-24. Madrid, Spain.
9. LOE. (2006). Ley Orgánica 2/2006, de 3 de mayo, de Educación. Publicada en «BOE» núm. 106, de 04/05/2006.
10. LOMCE. (2013). Ley Orgánica 8/2013, de 9 de diciembre, para la mejora de la calidad educativa. Publicada en «BOE» núm. 295, de 10 de diciembre de 2013, pp. 97858-97921.
11. López-Arcos, J. R., Gutiérrez-Vela, F. L., Padilla-Zea, N., Medina-Medina, N. and Paderewski, P. (2014). Continuous Assessment in Educational Video Games: A Role Playing Approach. Proceedings of the XV International Conference on Human Computer Interaction. ACM, 2014.
12.
Marfisi-Schottman I., Sghaier A., George S., Tarpin-Bernard F. and Prévôt P. (2009). Towards industrialized conception and production of serious games. Proceeding of The International Conference on Technology and Education, pp. 1016– 1020. Paris, France. 13. Matinlassi, M., Niemelä, E. and Dobrica, L. (2002). Quality-driven architecture design and quality analysis method, A revolutionary initiation approach to a product line architecture. VTT Technical Research Centre of Finland, Espoo, 2002. 14. Padilla-Zea N., Medina-Medina, N., Gutiérrez-Vela, F. L. and Paredewski, P. (2011). A Model-Based Approach to Designing Educational Multiplayer Video Games. Technology-Enhanced Systems and Tools for Collaborative Learning Scaffolding. Springer Berlin Heidelberg, 2011. 167-191. 15. PEGI. Pan European Game Information. http://www.pegi.info/es/. 16. Rosas, R., Nussbaum, M., Cumsille, P., Marianov, V., Correa, M., Flores, P., and Salinas, M. (2003).Beyond Nintendo: design and assessment of educational video games for first and second grade students. Computers& Education, 40(1), 71-94. 17. Russell, J. A., and Steiger, J. H. (1982). The structure in persons' implicit taxonomy of emotions. Journal of Research in Personality, 16, 447- 469. 18. Squire, K. (2003).Video games in education. Int. J. Intell. Games and Simulation, 2(1), 49-62. 19. Sykes, J. and Federoff, M. (2006). Player-Centred Game Design, in CHI Extended Abstracts 2006, pp. 1731-1734. 20. Vargas, J. A., García-Mundo, L., Genero, M. and Piattini, M. (2014). A Systematic Mapping Study on Serious Game Quality. In Proceedings of the 18th International Conference on Evaluation and Assessment in Software Engineering, ACM. 21. Vogler, C. (2002). El viaje del escritor: las estructuras míticas para escritores, guionistas, dramaturgos y novelistas. Barcelona: Ma non Troppo. 22. Westera, W., Nadolskl, R.J., Hummel, H.G.K. and Woperels, I.G.J.H. (2008). Serious games for higher education: a framework for reducing design complexity. Journal of Computer Assisted Learning, 24, pp. 420–432. 23. Wouters, P., Van Nimwegen, C., Van Oostendorp, H. and Van der Spek, E.D. (2013). A Meta-Analysis of the Cognitive and Motivational Effects of Serious Games. J. Educational Psychology, vol. 105, no. 2, pp. 248-265. 24. Zyda, M. (2005).From visual simulation to virtual reality to games. Computer, 38(9), 25-32. 101 Diseño de un juego serio basado en el suspense Pablo Delatorre1, Anke Berns2, Manuel Palomo-Duarte1, Pablo Gervas3, y Francisco Madueño1 1 3 Departamento de Ingenierı́a Informática Universidad de Cádiz, Spain pablo.delatorre@uca.es, manuel.palomo@uca.es, paco.maduechuli@alum.uca.es, WWW home page: http://departamentos.uca.es/C137 2 Departamento de Filologı́a Francesa e Inglesa Universidad de Cádiz, Spain anke.berns@uca.es, WWW home page: http://departamentos.uca.es/C115 Departamento de Ingenierı́a del Software e Inteligencia Artificial Universidad Complutense de Madrid, Spain pgervas@sip.ucm.es, WWW home page: http://nil.fdi.ucm.es Resumen En las últimas dos décadas, los juegos educativos han sido ampliamente utilizados para mejorar el aprendizaje de lenguas extranjeras. Los estudiantes generalmente se divierten aprendiendo mediante desafı́os intelectuales. No obstante, no todos los juegos logran mejorar la educación de cada alumno de la misma forma y al mismo ritmo. Diferentes caracterı́sticas del juego repercuten en la experiencia del aprendizaje. 
En el presente artı́culo presentamos un estudio preliminar para el diseño de un videojuego en el que enfocar aspectos psicológicos como el suspense con objeto de mejorar la atención del estudiante mientras que juega. De esta forma pretendemos no sólo enganchar a los estudiantes a jugar varias veces al mismo juego, sino también conseguir un mayor aprendizaje en cada partida. En nuestro artı́culo, describimos cómo diferentes componentes emocionales (valencia, intensidad y control) pueden afectar al jugador. Esto se implementa en un prototipo de juego de detectives basado en el suspense que dinámicamente crea la historia de la partida, teniendo en cuenta las puntuaciones emocionales de los diferentes conceptos empleados en el juego. 1. Introducción La tendencia de emplear juegos para propósitos educativos no es nueva: desde hace décadas ha sido utilizada para enganchar a los estudiantes en el proceso educativo, haciendo que dicho proceso sea divertido y atractivo. Aparte de tener un enorme potencial motivador, los juegos ofrecen la oportunidad de atraer 102 estudiantes con caracterı́sticas especı́ficas, que les permita aprender mediante situaciones en las cuales tienen que experimentar, explorar y negociar para conseguir realizar con éxito la tarea requerida en un entorno descontextualizado [1,2]. Desde el nacimiento y crecimiento constante de las Tecnologı́as de la Información y las Comunicaciones (TIC), también los videojuegos se han convertido en herramientas populares para incrementar todo tipo de procesos de aprendizaje (matemáticas, ingenierı́a del software, ciencias, historia, etcétera), ası́ como el aprendizaje de idiomas extranjeros [3,4,5,6]. Kirriemuir y McFarlane [7] distinguen dos categorı́as de videojuegos que son empleadas en el ámbito educativo: los juegos clásicos cuyo objetivo es proveer simplemente entretenimiento, y los juegos de aprendizaje, también llamados “juegos serios”, que son diseñados expresamente con propósitos educativos [8]. Incluso aunque ambos tipos de juegos son utilizados en la educación, los juegos serios gozan de una mayor popularidad, debido a que permiten a diseñadores y educadores no sólo tener en consideración las necesidades especı́ficas de los estudiantes sino además, si el diseño es apropiado, recoger las interacciones del proceso de aprendizaje para ser monitorizadas y analizadas [9,10,4]. Combinando ambos aspectos (aprendizaje y evaluación), dichos videojuegos son capaces de ofrecer entornos de aprendizaje altamente interesantes, facilitándolos tanto a estudiantes con nuevas oportunidades para aprender como a profesores con nuevas posibilidades de valorar el rendimiento y el proceso de sus estudiantes [11]. Más allá de la enorme popularidad de los juegos basados en entornos abiertos de aprendizaje, escasamente o nada guiados por orientaciones explı́citas, varios investigadores han recalcado la importancia del diseño de entornos de aprendizaje basados en instrucciones para guiar a los estudiantes durante sus procesos de aprendizaje, asegurando un mejor resultado. De acuerdo con Kirschner [12], la orientación se vuelve especialmente relevante cuando se trata de alumnos principiantes, mientras que los avanzados ya tienen suficiente conocimiento como para desarrollar sus propias guı́as internas. Existen diversos factores que afectan al proceso educativo en un juego [13]. En el presente artı́culo, proponemos enfocar aspectos psicológicos como el suspense con objeto de mejorar la atención del estudiante mientras que juega. 
De esta forma pretendemos no sólo enganchar a los estudiantes a jugar varias partidas al mismo juego, sino también conseguir un mayor aprendizaje en cada partida. Nuestro enfoque es el diseño de juegos serios que pretendan apoyar el aprendizaje de lenguas extranjeras para nuestros estudiantes fuera del aula. Esto se logra facilitándoles entornos virtuales de aprendizaje (Virtual Learning Environments en inglés) diseñados como una aproximación al descubrimiento y al cuestionamiento, aspectos que fomentan los procesos de aprendizaje de los estudiantes aportándoles entornos de aprendizaje que les alienten a explorar contenidos y a resolver problemas [8]. Debido a que los alumnos objeto del caso de estudio parten de un nivel básico (A1, CEFR), es especialmente beneficioso facilitarles tanto vocabulario como sea posible [14], lo cual se consigue dándoles un entorno interactivo y atractivo en el cual los jugadores necesiten participar en conversaciones con personajes no-jugadores (diálogos basados en el sistema point-and-click, tan- 103 to para leer como para escuchar) y seguir las pistas que reciben para superar paso a paso el juego (en nuestro caso resolviendo un misterio inicial). Para obtener el máximo rendimiento del sentido de la curiosidad del estudiante, nuestro objetivo es engancharlos en la exploración, satisfacer su curiosidad y fomentar su aprendizaje a través de un entorno de aprendizaje basado en la investigación. El resto de este documento se organiza como sigue: en la sección 2 facilitamos las bases de nuestro estudio, seguido del diseño del juego en la sección 3. Finalmente, en la sección 4 nos centramos en las conclusiones de nuestro estudio y el trabajo futuro. 2. Bases del estudio A pesar de que recientemente ha habido una tendencia a aumentar la diversidad de videojuegos en el ámbito educativo, la revisión bibliográfica muestra que el tipo más utilizado para la enseñanza y el aprendizaje de idiomas son las aventuras y los juegos de rol [14]. Mientras que los juegos de aventura son frecuentemente usados para principiantes (con el enfoque principal de conseguir facilitar a los alumnos la mayor exposición posible al idioma), los juegos de rol generalmente se utilizan en niveles más avanzados para fomentar la fluidez en el idioma. En todo caso, no hemos sido capaces de hallar evidencia alguna en el uso de juegos de suspense orientados a la educación, lo cual convierte el proyecto actual en una propuesta innovadora. 2.1. Emociones y memoria Hasta hace relativamente poco, las emociones no han sido consideradas en el estudio de la comprensión de los comportamientos cognitivos como la memoria. No obstante, poco a poco ha ido haciéndose evidente que no es posible negar la relevancia de las emociones en todos los aspectos de nuestro dı́a a dı́a [15]. Los estudios sobre el comportamiento de la memoria explı́cita consciente durante experiencias emocionales han revelado tres amplios campos de influencia de la emoción en la memoria: en el número (cantidad) de sucesos recordados, la vivencia subjetiva (calidad) de dichos sucesos, y la cantidad de detalles precisos recordados sobre experiencias previas [16]. Burton et al. [17] mostraron en 2004 que, para las tareas que requieren memoria explı́cita, el rendimiento global es mejor para los conceptos relativos a pasajes afectivos en comparación con conceptos neutros. 
Otros estudios demuestran que la memoria de los jugadores para los términos neutros ha mejorado cuando dichos términos fueron presentados en una frase con contenido emocional [18,19]. Como explican Kesinger et al. [20] dentro del laboratorio, los eventos negativos son recordados frecuentemente con un gran sentido de vivencia, comparados con los eventos positivos (Ochsner, 2000; Dewhurst et al., 2000). En contraste, los estı́mulos positivos son generalmente rememorados sólo cuando hay implicado un sentimiento de familiaridad o con información general no especı́fica (Ochsner, 104 2000; Bless & Schwarz, 1999). Este efecto de valencia4 en la memoria para los detalles puede tener un impacto en el tratamiento del individuo (por ejemplo, en los ancianos). Incluso aquéllos que tienden a focalizar más la información positiva que la negativa, en una situación de inmersión en el suspense (como emoción especı́fica relacionada), la capacidad de recordar es similar tanto para la valencia negativa como para la positiva [20]. 2.2. Suspense Citando a Ortony et al., 1990 [22], el suspense implica emoción. El suspense es considerado como la intervención de una emoción de esperanza y una emoción de miedo, unida a un estado cognitivo de incertidumbre. Esta definición puede ser ampliada “(...) como una emoción anticipativa, iniciada por un evento el cual genera anticipaciones sobre un futuro (y perjudicial) evento resolutivo para uno de los personajes principales” [23]. Asimismo, el suspense ha sido descrito como “estado emocional y respuesta que la gente tiene en situaciones en las cuales un desenlace que les concierne es incierto” [24]. El suspense es un instrumento narrativo importante en términos de gratificaciones emocionales. Las reacciones en respuesta a este tipo de entretenimiento están relacionadas positivamente con la diversión [25], teniendo un alto impacto en la inmersión de la audiencia y la suspensión de incredulidad [26]. El patrón general indica que los lectores generalmente encuentran los textos literarios más interesantes cuando el contexto incluye el suspense, existe coherencia y la temática es compleja. Estas tres causas son valoradas como influyentes en aproximadamente el 54 % de las situaciones consideradas de interés, siendo el suspense el valor con la mayor contribución individual al explicar aproximadamente el 34 % del resultado [27]. De acuerdo con esto, los experimentos en la industria del videojuego concluyen que los jugadores encuentran los juegos de suspense más divertidos que en sus versiones que carecen de él [28]. Por otro lado, la influencia del suspense no se circunscribe únicamente al campo del entretenimiento [29]. Como hemos explicado previamente, en el área de la educación es una forma directa de crear emociones que estimulen la impresión afectiva del contenido, lo cual afecta positivamente al rendimiento de la memoria explı́cita, pero también implı́cita [17]. Además, el suspense es interesante en términos de tratamiento psicológico, ya que asiste a cuestiones como la solución creativa de problemas para el control de efectos negativos y estresantes [30]. 3. Diseño del juego El juego que hemos diseñado para nuestro actual estudio está diseñado para estudiantes de un curso inicial de alemán (A1, MCERL) y tiene como objetivo 4 La valencia emocional describe el grado en el cual algo causa una emoción positiva o negativa [21]. 105 fortalecer, sobre todo, la comprensión lectora de los alumnos. 
Consiste en un sistema basado en la interacción en el cual los jugadores deben encontrar al asesino de un personaje que propone la máquina. Para desarrollar las tareas del juego con éxito, los jugadores deben resolver una serie de puzles, de modo que al resolver cada uno de ellos se les provee de una pista necesaria para resolver otro. Nuestra estrategia de puesta en marcha se basa en la idea de usar el mismo generador de historias para los dos grupos de estudiantes (grupo de suspense y grupo de control) que participarán en el experimento. Ninguna de las experiencias para ambos grupos difiere salvo en un aspecto: los diferentes objetos del juego ubicados en el entorno, cuya percepción esperamos tenga impacto tanto en la actitud de los estudiantes como en su aprendizaje. Un trabajo de investigación denominado Affective Norms for English Words (ANEW) [31] muestra el impacto de la afectividad de cada palabra de un conjunto de aproximadamente mil términos, puntuadas según su valencia (positividad o negatividad del término), excitación y sensación de control (de uso menos frecuente en otros experimentos). Para nuestro propósito, hemos empleado dos grupos de objetos: uno con valencia medio-alta y bajo valor de excitación (términos neutros para el grupo de control), y el otro con bajas valencias y medio-alto rango de excitación (grupo de suspense). Consideramos los objetos de este último grupo como capaces de generar suspense en el contexto narrativo adecuado. La selección ha sido realizada manualmente en esta ocasión, si bien nuestro objetivo es automatizar el proceso en futuras versiones. Durante el desarrollo del estudio hemos encontrado varios aspectos interesantes. En primer lugar, para el grupo de bajo suspense hemos decidido excluir objetos considerados originalmente por su valencia e intensidad, como el martillo y las tijeras, por ser muy sensibles al contexto. Por ejemplo, facilitar información al jugador sobre unas tijeras y un papel de regalo sobre una mesa sugerirı́a una respuesta positiva; no obstante, si en lugar del papel de regalo la mesa estuviera manchada de sangre, la percepción de las tijeras cambiarı́a, convirtiéndose en un concepto con muy baja valencia y alta excitación en relación a la respuesta emocional. Esto no ocurre, no obstante, con objetos como la radio, mantequilla, la moneda y el reloj de pulsera, donde el contexto particular es menos influyente. Asimismo, hemos revisado las traducciones entre la lengua materna de los alumnos y el idioma objetivo del aprendizaje para evitar la aparición de términos polisémicos que puedan invalidar el estudio. En ambas versiones (neutra y con suspense), los objetos del juego han sido seleccionados de acuerdo al estudio realizado por Moors et al., 2013, [32] que presenta la valencia, intensidad y control de un total de 4300 palabras alemanas. Este estudio es complementario al original ya mencionado [31], y a su equivalente en castellano [33]. En la Tabla 1 y mostramos los conceptos escogidos y su puntuación. Con objeto de invitar a los estudiantes a explorar el entorno del juego, resolviendo el crimen paso a paso, hemos diseñado una partida basada en puzles en la cual los jugadores requieren conocer diferentes testigos que les aportarán pistas. 
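A modo de ilustración únicamente, el criterio de selección descrito más arriba (términos neutros con valencia media-alta y baja excitación; términos de suspense con baja valencia y excitación media-alta) podría expresarse como un filtro sobre un diccionario afectivo tipo ANEW. El siguiente esbozo en Python es nuestro y no forma parte del prototipo: los umbrales y los nombres de las funciones son hipotéticos, y las puntuaciones de ejemplo proceden de la Tabla 1.

```python
# Esbozo hipotético: filtrado de un diccionario afectivo (valencia, excitación)
# para obtener términos "neutros" y términos "de suspense". Los umbrales son
# ilustrativos y no corresponden a los usados realmente en el estudio.
from typing import Dict, List, Tuple

# (valencia, excitación) de algunos términos de la Tabla 1
diccionario: Dict[str, Tuple[float, float]] = {
    "pistola": (2.28, 5.41),
    "veneno": (2.06, 4.44),
    "caramelos": (5.27, 3.65),
    "libro": (4.89, 3.11),
}

def grupo_suspense(dic: Dict[str, Tuple[float, float]],
                   max_valencia: float = 3.5, min_excitacion: float = 4.0) -> List[str]:
    """Términos con baja valencia y excitación media-alta."""
    return [p for p, (v, e) in dic.items() if v <= max_valencia and e >= min_excitacion]

def grupo_neutro(dic: Dict[str, Tuple[float, float]],
                 min_valencia: float = 4.0, max_excitacion: float = 4.0) -> List[str]:
    """Términos con valencia media-alta y baja excitación."""
    return [p for p, (v, e) in dic.items() if v >= min_valencia and e <= max_excitacion]

print(grupo_suspense(diccionario))   # ['pistola', 'veneno']
print(grupo_neutro(diccionario))     # ['caramelos', 'libro']
```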
Términos de suspense:
Palabra      Valencia  Intensidad  Control
bala         2.55      5.38        5.48
daga         2.81      4.53        5.21
hueso        2.98      3.31        4.53
máscara      3.41      3.39        3.80
pistola      2.28      5.41        5.41
llave        4.44      3.44        4.13
muñeca       4.22      3.09        3.28
veneno       2.06      4.44        5.19

Términos neutros:
Palabra              Valencia  Intensidad  Control
botella (de whisky)  4.30      3.44        3.95
botella (vacía)      2.94      2.38        3.13
caramelos            5.27      3.65        3.91
cristal (pequeño)    5.05      3.30        4.84
gafas                3.95      3.02        3.61
(gafas) rotas        2.42      3.91        3.16
lámpara              4.36      3.33        3.70
libro                4.89      3.11        4.02

Tabla 1. Términos para la versión con suspense y neutra

Para estar seguros de que los participantes obtienen la información necesaria a fin de mejorar sus destrezas en el idioma extranjero, tanto en aspectos gramaticales como en el vocabulario en particular, los diálogos han sido diseñados cuidadosamente de acuerdo con el nivel de los alumnos y los objetivos de aprendizaje de la asignatura. Adicionalmente, y siguiendo los criterios emocionales referidos más arriba, se ha seleccionado con atención cada término utilizado en el juego. En algunos casos eso nos ha obligado a priorizar aquellos términos sobre los cuales, siguiendo la puntuación ANEW, esperamos obtener un gran impacto en los estudiantes por encima de otros conceptos significativos pero de menor puntuación emocional. La razón por la cual estamos especialmente interesados en aumentar la atención de los estudiantes es que creemos que una mayor atención conlleva una mejora en cuanto a la retención de conceptos y, por lo tanto, en cuanto al aprendizaje por parte de nuestros alumnos. Las Figuras 1 y 2 presentan algunos de los escenarios del juego y objetos incluidos. Por ejemplo, elementos considerados con baja valencia y alta intensidad (ver Tabla 1) como la pistola, la máscara, la daga y el veneno se encuentran en las primeras, mientras que en el segundo par de imágenes se hallan conceptos usados en la versión neutra, como los caramelos, las gafas, la botella de whisky y la llave.

Figura 1. Ejemplo de habitaciones y objetos de suspense
Figura 2. Ejemplo de habitaciones y objetos neutros

4. Discusión y conclusiones

Los juegos educativos han sido ampliamente usados para mejorar el aprendizaje de lenguas extranjeras durante las últimas dos décadas. Desafortunadamente, los objetos de los juegos son generalmente escogidos sin un análisis psicológico previo respecto a su potencial atractivo para la atención de los estudiantes. Nosotros proponemos un enfoque hacia esos objetos que crean suspense con la idea de incrementar dicha atención y, en consecuencia, el aprendizaje. Es importante destacar que la elección del diccionario de términos afectivos debe ser realizada observando el contexto y el idioma nativo del estudiante. Esto es necesario debido a que no existen respuestas emocionales cuantitativamente universales a las palabras. Así, la existencia de diferencias estadísticas entre las puntuaciones de españoles y americanos es considerable en las tres dimensiones emocionales5. De igual manera deben ser tenidas en cuenta las condiciones específicas del experimento6. No obstante, y aunque no se ha encontrado un criterio general que determine cuál es el mejor diccionario, el balance entre valencia e intensidad puede ser considerado similar en todos los revisados. En relación a los términos a incluir, también deben observarse ejemplos como "el pájaro en la jaula". A pesar de que "pájaro" sugiere una emoción positiva, dicha emoción no es la misma en el contexto "jaula".
Ası́, hay una diferencia afectiva en el efecto de usar “una jaula vacı́a” respecto a “una jaula llena de pájaros”. Para crear suspense el contenido de la jaula podrı́a no ser especificado, ası́ como añadir una serie de caracterı́sticas adicionales como que esté sucia u oxidada. 5 6 Por ejemplo, en relación a la excitación, en la versión original de ANEW los participantes españoles mostraron una mayor activación emocional que los americanos; por otro lado, éstos puntuaron más la dimensión del control [33]. Por ejemplo, Moors et al., 2013, [32] refieren que los participantes de su estudio fueron dirigidos menos a observar diferencias entre control e intensidad que los participantes de otros estudios, lo que podrı́a explicar por qué obtuvieron una correlación positiva más fuerte entre valencia e intensidad que en otros experimentos. 108 Otro aspecto interesante en este ámbito es la selección de la estrategia para crear el suspense. Con objeto de mantener el experimento bajo control, necesitamos simplificarlo haciendo que únicamente varı́en los objetos asociados al efecto emocional. De otro modo, el número de dimensiones harı́an el estudio inmanejable. La influencia de otras estrategias narrativas para el aprendizaje7 serán estudiadas más adelante. Con objeto de medir el impacto de los objetos seleccionados, en a) la creación del suspense y b) el efecto en el aprendizaje del lenguaje, además de la evaluación de sus conocimientos, invitaremos a los estudiantes a rellenar un cuestionario al final del experimento dirigido especialmente a recoger la intensidad con la cual han percibido el suspense. Posteriormente, se realizará el correspondiente análisis de los datos [40]. El prototipo del videojuego diseñado se encuentra disponible con licencia libre en su forja8 . 5. Agradecimientos Este trabajo ha sido financiado por la Unión Europea bajo los proyectos UBICAMP: Una solución integrada a las barreras virtuales de movilidad (526843 LLP1-2012 Es-Erasmus-ESMO) y OpenDiscoverySpace (CIP-ICT-PSP-2011-5). Asimismo, este trabajo ha sido parcialmente financiado por el proyecto WHIM 611560 financiado por la Comisión Europea, dentro del 7o Programa Marco, en el área temática de Tecnologı́as de la Información y las Comunicaciones (ICT) y el programa de Tecnologı́as Futuras y Emergentes (Future Emerging Technologies, FET). Referencias 1. Reinders, H., Wattana, S.: Learn english or die: The effects of digital games on interaction and willingness to communicate in a foreign language. Digital Culture & Education 3 (2011) 3–29 2. Cruz-Benito, J., Therón, R., Garcı́a-Peñalvo, F.J., Lucas, E.P.: Discovering usage behaviors and engagement in an educational virtual world. Computers in Human Behavior 47 (2015) 18 – 25 Learning Analytics, Educational Data Mining and data-driven Educational Decision Making. 3. De Freitas, S., Neumann, T.: The use of ’exploratory learning’ for supporting immersive learning in virtual environments. Computers & Education 52 (2009) 343–352 4. Berns, A., Palomo-Duarte, M.: Supporting foreign-language learning through a gamified APP. In: Rosario Hernández & Paul Rankin. Higher Education and Second Language Learning. Supporting Self-directed Learning in New Technological and Educational Contexts. Peter Lang (2015) 181–204 7 8 Como son la empatı́a por los personajes [34,35,36], el nivel de amenaza [37], la proximidad e inevitabilidad del desenlace [38,39], ası́ como su transcendencia [37,23]. https://github.com/Gandio/ProjectRiddle 109 5. 
Lorenzo, C.M., Lezcano, L., Sánchez-Alonso, S.: Language learning in educational virtual worlds - a TAM based assessment. Journal of Universal Computer Science 19 (2013) 1615–1637 6. González-Pardo, A., Rodrı́guez Ortı́z, F.d.B., Pulido, E., Fernández, D.C.: Using virtual worlds for behaviour clustering-based analysis. In: Proceedings of the 2010 ACM Workshop on Surreal Media and Virtual Cloning. SMVC ’10, New York, NY, USA, ACM (2010) 9–14 7. Kirriemuir, J., McFarlane, A.: Literature review in games and learning (2004) ISBN: 0-9544695-6-9 [on-line, http://archive.futurelab.org.uk/resources/documents/lit reviews/Games Review.pdf]. 8. Bellotti, F., Ott, M., Arnab, S., Berta, R., de Freitas, S., Kiili, K., De Gloria, A.: Designing serious games for education: from pedagogical principles to game mechanisms. In: Proceedings of the 5th European Conference on Games Based Learning. University of Athens, Greece. (2011) 26–34 9. Melero, J., Hernández-Leo, D., Blat, J.: A review of scaffolding approaches in gamebased learning environments. In: Proceedings of the 5th European Conference on Games Based Learning. (2011) 20–21 10. Palomo-Duarte, M., Berns, A., Sánchez-Cejas, A., Caballero, A.: Assessing foreign language learning through mobile game-based learning environments. International Journal of Human Capital and Information Technology Professionals (2015 (en revisión)) 11. Chaudy, Y., Connolly, T., Hainey, T.: Learning analytics in serious games: A review of the literature. European Conference in the Applications of Enabling Technologies (ECAET), Glasgow (2014) 12. Kirschner, P.A., Sweller, J., Clark, R.E.: Why minimal guidance during instruction does not work: An analysis of the failure of constructivist, discovery, problem-based, experiential, and inquiry-based teaching. Educational psychologist 41 (2006) 75–86 13. Schmidt, R.: Awareness and second language acquisition. Annual Review of Applied Linguistics 13 (1992) 206–226 14. Cornillie, F., Jacques, I., De Wannemacker, S., Paulussen, H., Desmet, P.: Vocabulary treatment in adventure and role-playing games: A playground for adaptation and adaptivity. In: Interdisciplinary Approaches to Adaptive Learning. A Look at the Neighbours. Springer (2011) 131–146 15. Phelps, E.A.: Human emotion and memory: interactions of the amygdala and hippocampal complex. Current opinion in neurobiology 14 (2004) 198–202 16. Kesinger, E.A., Schacter, D.L.: Memory and emotion. In: Handbook of emotions. Guilford Press (2008) 601–170 17. Burton, L.A., Rabin, L., Vardy, S.B., Frohlich, J., Wyatt, G., Dimitri, D., Constante, S., Guterman, E.: Gender differences in implicit and explicit memory for affective passages. Brain and Cognition 54 (2004) 218–224 18. Brierley, B., Medford, N., Shaw, P., David, A.S.: Emotional memory for words: Separating content and context. Cognition and Emotion 21 (2007) 495–521 19. Liu, H., Hu, Z., Peng, D.: Evaluating word in phrase: The modulation effect of emotional context on word comprehension. Journal of psycholinguistic research 42 (2013) 379–391 20. Kensinger, E.A., Garoff-Eaton, R.J., Schacter, D.L.: Memory for specific visual details can be enhanced by negative arousing content. Journal of Memory and Language 54 (2006) 99–112 110 21. Citron, F.M., Gray, M.A., Critchley, H.D., Weekes, B.S., Ferstl, E.C.: Emotional valence and arousal affect reading in an interactive way: neuroimaging evidence for an approach-withdrawal framework. Neuropsychologia 56 (2014) 79–89 22. 
Ortony, A., Clore, G.L., Collins, A.: The Cognitive Structure of Emotions. Cambridge University Press (1990) 23. de Wied, M., Tan, E.S., Frijda, N.H.: Duration experience under conditions of suspense in films. Springer (1992) 24. Carroll, N.: The paradox of suspense. Suspense: Conceptualizations, theoretical analyses, and empirical explorations (1996) 71–91 25. Oliver, M.B.: Exploring the paradox of the enjoyment of sad films. Human Communication Research 19 (1993) 315–342 26. Hsu, C.T., Conrad, M., Jacobs, A.M.: Fiction feelings in Harry Potter: haemodynamic response in the mid-cingulate cortex correlates with immersive reading experience. NeuroReport 25 (2014) 1356–1361 27. Schraw, G., Flowerday, T., Lehman, S.: Increasing situational interest in the classroom. Educational Psychology Review 13 (2001) 211–224 28. Klimmt, C., Rizzo, A., Vorderer, P., Koch, J., Fischer, T.: Experimental evidence for suspense as determinant of video game enjoyment. CyberPsychology & Behavior 12 (2009) 29–31 29. Delatorre, P., Arfè, B.: Modulare la suspense del lettore attraverso un modelo computazionale. In: XXVIII Congresso Nazionale Sezione di Psicologia dello sviluppo e dell’educazione. (2015) (accepted). 30. Zachos, K., Maiden, N.: A computational model of analogical reasoning in dementia care. In: Proceedings of the Fourth International Conference on Computational Creativity. (2013) 48 31. Bradley, M.M., Lang, P.J.: Affective norms for english words (ANEW): Instruction manual and affective ratings. Technical report, Technical Report C-1, The Center for Research in Psychophysiology, University of Florida (1999) 32. Moors, A., De Houwer, J., Hermans, D., Wanmaker, S., van Schie, K., Van Harmelen, A.L., De Schryver, M., De Winne, J., Brysbaert, M.: Norms of valence, arousal, dominance, and age of acquisition for 4,300 dutch words. Behavior research methods 45 (2013) 169–177 33. Redondo, J., Fraga, I., Padrón, I., Comesaña, M.: The spanish adaptation of ANEW (affective norms for english words). Behavior research methods 39 (2007) 600–605 34. Gallagher, S.: Empathy, simulation, and narrative. Science in context 25 (2012) 355–381 35. Gerdes, K.E., Segal, E.A., Lietz, C.A.: Conceptualising and measuring empathy. British Journal of Social Work 40 (2010) 2326–2343 36. Keen, S.: A theory of narrative empathy. Narrative 14 (2006) 207–236 37. Delatorre, P., Gervas, P.: Un modelo para la evaluación de la narrativa basada en partidas de ajedrez. In: Proceedings of the 1st Congreso de la Sociedad Española para las Ciencias del Videojuego (CoSECiVi 2014), CEUR Workshop Proceedings (2014) 38. Zillmann, D., Tannenbaum, P.H.: Anatomy of suspense. The entertainment functions of television (1980) 133–163 39. Gerrig, R.J., Bernardo, A.B.: Readers as problem-solvers in the experience of suspense. Poetics 22 (1994) 459–472 40. Bellotti, F., Kapralos, B., Lee, K., Moreno-Ger, P., Berta, R.: Assessment in and of serious games: An overview. Advances in Human-Computer Interaction (2013) 111 Modelling Suspicion as a Game Mechanism for Designing a Computer-Played Investigation Character Nahum Alvarez1 and Federico Peinado2 1 Production Department, Gameloft Tokyo 151-0061 Tokyo, Japan nahum.alvarezayerza@gameloft.com 2 Departamento de Ingeniería del Software e Inteligencia Artificial, Facultad de Informática, Universidad Complutense de Madrid 28040 Madrid, Spain email@federicopeinado.com Abstract. Nowadays creating believable characters is a top trend in the video game industry. 
Recent projects present new design concepts and improvements in artificial intelligence that are oriented to this goal. The expected behaviour of computer-played characters becomes increasingly demanding: proactivity, autonomous decision-making, social interaction, natural communication, reasoning and knowledge management, etc. Our research project explores the possibility for one of these characters to investigate, solving an enigma while dealing with incomplete information and the lies of other characters. In this paper we propose how to manage trust issues when modelling suspicion in the artificial mind of a Columbo-like detective, considering that our desired gameplay is a murder mystery game in which the player plays the role of the culprit. Keywords. Interactive Digital Storytelling, Video Game Design, Artificial Intelligence, Believable Characters, Epistemic Modal Logic, Trust Systems 1 Introduction Computer simulations allow a high grade of realism; however, computer-controlled characters in virtual worlds still feel like mere automatons: the user gives an input and receives a response. Advances in intelligent behaviour are changing the user experience, improving these characters’ performance and expanding the scope of their potential uses. One aspect that refrain us from believing such characters are “alive” is that they seem completely “naive”: they will believe without doubts all the information received from other sources. Of course, in interactive storytelling this issue is mainly confronted with a script that the character will follow: if it has to suspect, it will suspect, if it has to lie, it will lie, etc. but there is no real intelligence behind those decisions. A system capable of managing suspicion would improve greatly the degree of realism of the characters, but such feature has not been sufficiently explored. 112 In this work we focus in the question of trust, intrinsic to human nature, and introduce a model for a computer-played character that suspect about others and try to detect when they are covering their actions or telling lies. In this model, the autonomous “investigator” works with incomplete information, managing uncertain knowledge and potentially false pieces of evidence. We also propose a videogame application designed to test our model. The application consists of an interactive version of the popular TV series from the NBC: Columbo, where the player controls the murderer instead of the main character of the story, Lieutenant Columbo, who is the homicide detective. The culprit is obvious to the player since the very beginning of the game, and the goal of this character is to deceive the detective for not getting caught (as in the original episodes, the format of the typical “whodunit” mystery is reversed to the “howcatchthem” paradigm or “inverted detective story”). In order to frame our proposal, in Section 2 we review other models of deception and trust. Section 3 presents our model of suspicion for computer-controlled characters and Section 4 describes an example scenario of how our model would work using a murder mystery game as an application. Finally, in Section 5 we present and discuss our conclusions, foreseen the next steps of this project. 2 Related Work Modelling deception has been object of research from long ago. Jameson presented IMP in 1983 [11], a system that simulates a real estate agent who tries to deceive the user. 
Nissan and Rousseau described how to model the mental state of agents in a more playable scenario: a crime investigation interactive fiction [19]. In later works, Nissan showed a model for representing complex stories involving deceptions and suspicion [18]. This proposal is a theoretical model, but it established a basis to construct working systems using trust and partial knowledge. Such formal models have been presented since then, showing an integrated frame for uncertainty and deception in game theory [16]. Sometimes the character may work not only with incomplete information, but also with information that is considered as “uncertain” (depending on the sincerity of the characters that reveal it). In order to manage these features, intelligent characters have to decide which information to trust. Research on trust and deception itself is a broad concept with multiple aspects that have been explored in the literature, such as selfdeception and how it affects decision-making [10], deceptive strategies in auction style interactions [5] or deceptive information in decision-making processes for games [14]. However, these theoretic approaches were mostly oriented to a very limited type of interactions. Certainly, trust is a crucial aspect to deal when requesting services or information from third parties [15]. It has been extensively treated in literature but still it does not have a standard formal definition [22]. In a general classification proposed in [17], four general types of trust are defined: basic trust, a priori generalized trust, inner dialogicality and context-specific trust. The last type is defined for applications that require trust management, which is the case in a murder mystery game; for instance, 113 we trust a doctor about health issues when he is treating us, but not about the death of his wife when he is the main suspect in her murder. This type of trust contains contextspecific rules, so we can decide if trusting or not using different methods. For example, Griffin and Moore [8] present a model that assigns different truth values to the concepts managed by autonomous agents, allowing them to choose if they trust their sources or not. These agents simulate how the situation would evolve if they take certain decisions, trying to maximize their outcome in a minimax fashion. The system of Wang et al. [24] uses fuzzy values, implying the information of a source is not definitely true or false, but it has some grade of truth, and we can also find systems using a vector of parameters for different aspects of trust [25]. Another model of trust and commitment [12] proposes that positive and negatives experiences impact in the trustee, treating all of them equally (although probably in real life negative ones have a greater impact). Trusting a source of information does not only rely in judging over the data received from that source, but we also have to take in account all the related sources. Moreover, we also should take in account third parties’ information: for example, in [3] trust is built without direct interaction between the parts using a set of relationships and trust parameters in their multi-agent model, allowing to build trust on the information of a third party agent. This transitivity is well defined in the theory of Human Plausible Reasoning [7], a frame designed to simulate the way a human reason about truth values. This frame establishes an ontology-based model that uses a fuzzy parameter for measuring the degree of trust of a statement or a source. 
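To make the idea of experience-based, transitive trust more concrete, the following fragment is a minimal sketch of our own (it does not reproduce any of the systems cited in this section): each agent counts positive and negative experiences per source and, for a source it has never dealt with directly, weights the opinions of known referrers by how much it trusts them.

```python
# Hypothetical sketch (not the model of any cited system): each agent keeps a trust
# score per source, updates it with positive/negative experiences, and can derive a
# transitive estimate for a source it has never interacted with directly.
from collections import defaultdict
from typing import Dict

class TrustBook:
    def __init__(self) -> None:
        # source -> [positive experiences, negative experiences]
        self.experiences: Dict[str, list] = defaultdict(lambda: [0, 0])

    def record(self, source: str, positive: bool) -> None:
        self.experiences[source][0 if positive else 1] += 1

    def direct_trust(self, source: str) -> float:
        pos, neg = self.experiences[source]
        total = pos + neg
        return 0.5 if total == 0 else pos / total   # 0.5 = no information yet

    def transitive_trust(self, source: str, referrers: Dict[str, "TrustBook"]) -> float:
        """Estimate trust in an unknown source from the opinions of known referrers,
        weighting each opinion by how much we trust the referrer."""
        weighted, weights = 0.0, 0.0
        for name, book in referrers.items():
            w = self.direct_trust(name)
            weighted += w * book.direct_trust(source)
            weights += w
        return weighted / weights if weights else 0.5

detective = TrustBook()
joanna = TrustBook()
detective.record("Joanna", positive=True)        # Joanna has not lied so far
joanna.record("Ken", positive=False)             # Joanna caught Ken in a lie
print(detective.transitive_trust("Ken", {"Joanna": joanna}))  # < 0.5: inherited distrust
```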
This has been used as a basis for developing complex trust systems, showing good results. ScubAA [1] is another example consisting of a generic framework for managing trust in multi-agent systems. A central trust manager is in charge of sending the user’s request to the most trusted agent for the needed task. The trust degree of each agent is calculated using not only the information it has but also the information of the agents related to them, so an agent can evaluate agents directly unknown for it. This transitivity is useful for building a trust network where information of third parties is considered, affecting the trust of the agents, even if they are not present or currently known. We also use this approach in our proposal, but in our system, instead of designing a centralized system, each agent can manage its own trust network. Also, it is advisable to take into account the trust history of the other parties [13], in order to trust in sources that were “trustful” in the past for us or for our partners [26]. Anyway, agents have to take this information carefully, because they only receive partial information from others. This also presents an interesting feature and a powerful possibility: if a malicious agent manages to tarnish your trust level for other parties, they will not trust you in future interactions. In our model, allowing the character to build its own trust values, makes the information they hold to be incomplete and different for each character. This feature is beneficial from our point of view, because it allows introducing deceptive information in the system without being “caught”, testing the investigator’s ability to find the truth. Finally, in order to generate trust in other parties or deceive them with our lies, we have to relay in argumentation. In literature, different analysis about epistemology and argumentation has been examined and described thoroughly, establishing directions that allow modelling critical reasoning [2]. In short, we can trust a source only 114 after building argumentation backing it [23]. Especially interesting for our goal is how to build critical questions in order to attack other’s argumentation, a process that will help us to discover the “holes” in an agent’s argumentation, finding not only lies but also important information previously unknown [9]. 3 A Model of Suspicion Our knowledge model is designed to allow computer-controlled characters to decide if they should believe (trust in) the information received from a third party. The decision-making or judgement about a concrete piece of information will be based on the previous data they have about it, and about the source from they received that information. This paper is focused on introducing the model, so we have deliberately postponed the analysis of the more technical aspects (i.e. the reasoning engine) for further research. In order to test our model, the characters will operate in a detective scenario, a great domain example where trust and deception are key features to have into account. In this scenario, the user plays the role of a murderer who has to mislead an important non-player character: the detective. The player’s goal will be to prevent the detective to discover the murderer’s identity (his/her own identity). The scenario contains also other suspicious characters, each one with its own information about the incident. 
The detective will obtain new information (from now on, Facts) about the scenario, talking with other characters and finding evidence, and will store all this information in a knowledge base. In order to decide what information he should trust and how, the first challenge the detective confronts is determining the truth value of Facts. As we saw in the previous section, other models are based on lists of trust parameters, fuzzy truth values, or statistical probabilities. We decided to use a discrete system with a ternary value ("true", "false" and "unknown") for each Fact. The first time the detective is informed about a Fact, he will create an entry in his knowledge base for it, and he will update its value depending on the additional information he finds about it. Not knowing certain information is not the same as knowing it is true or false (i.e. the open world assumption), so we use the value "unknown" in order to mark those doubtful Facts and subsequently try to discover their real value with abductive reasoning. The next question is how to "quantify" the truth value of a Fact. We have only found simple methods in the literature for this decision-making process: usually it is enough to use a fixed probability number or to compare the number of agents supporting a fact with the number of agents denying it. However, if the system is that simple we may lose certain desirable characteristics, such as taking into account evidence that supports the truth or falsehood of a fact, or remembering whether a character previously lied to us before evaluating her statements, especially if those statements are related to the one she lied about. Considering these requirements, our detective will keep a list of truth values for each Fact, where he will store each piece of evidence he obtains about it. Having a larger amount of supporting evidence for a Fact ("true" values) will make him believe the statement and, on the other hand, having more contradictory evidence ("false" values) will lead the agent to consider the statement a lie. This is similar to the system used in [6], where positive evidence is cancelled by negative evidence. Whenever we confirm or discard a fact from a source, we will record whether he lied to us or not. If we receive information from a source, the truth value of that information is set initially to "true", assuming the source did not lie before (so we trust him). We would set the information as "false" if the source lied about a related topic, or "unknown" if we have no previous information about the source, or if the source lied but about another topic. Once we have a model describing how the information will be stored and judged, we need a reasoning method in order to figure out the truth value of unknown Facts. Although the main focus of this paper is not to present a reasoning implementation, it is necessary to describe such a method in order to show how to deal with its characteristic features: explicitly unknown values and non-trustable sources. In our model, the automatic reasoner has the goal of clearing "unknown" values, assigning them "true" or "false" by identifying whether each assignment generates a contradiction or a possible outcome. We propose using two joint techniques that resemble detective work to achieve our goal. The first one would be using an "established knowledge" database.
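The bookkeeping just described can be sketched in a few lines; this is only an illustration under assumptions of ours (for instance, it simplifies the initial-value rule by not distinguishing related from unrelated topics, and every class and method name is hypothetical), not the actual implementation of the model.

```python
# Hypothetical sketch of the bookkeeping described above: each Fact holds a list of
# "true"/"false" evidence values; a majority decides the truth value, and a tie (or no
# evidence) leaves it "unknown". Statements from sources caught lying start as "false".
from collections import defaultdict
from typing import Dict, List

TRUE, FALSE, UNKNOWN = "true", "false", "unknown"

class DetectiveKnowledge:
    def __init__(self) -> None:
        self.evidence: Dict[str, List[str]] = defaultdict(list)   # fact -> evidence values
        self.liars: set = set()                                    # sources caught lying

    def initial_value(self, source: str) -> str:
        # Trust by default; distrust sources that have lied before (simplified rule).
        return FALSE if source in self.liars else TRUE

    def hear(self, fact: str, source: str) -> None:
        self.evidence[fact].append(self.initial_value(source))

    def observe(self, fact: str, value: str) -> None:
        # First-person evidence found at the scene.
        self.evidence[fact].append(value)

    def truth_value(self, fact: str) -> str:
        values = self.evidence[fact]
        support, against = values.count(TRUE), values.count(FALSE)
        if support > against:
            return TRUE
        if against > support:
            return FALSE
        return UNKNOWN

    def mark_liar(self, source: str) -> None:
        self.liars.add(source)

kb = DetectiveKnowledge()
kb.hear("Jim was writing a book about the Mafia", source="Ken")
print(kb.truth_value("Jim was writing a book about the Mafia"))   # true (Ken trusted so far)
kb.mark_liar("Ken")
kb.observe("Jim was writing a book about the Mafia", FALSE)       # the folded document
print(kb.truth_value("Jim was writing a book about the Mafia"))   # unknown (1 vs 1) -> investigate
```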
Facts we receive are checked against a common-sense rules database: if some Fact contradicts a rule in that database we will know that the source is lying; for example, stating that a character was outside a building while it was raining, but seeing that his clothes are dry. Also, we can do this Fact checking with our first-person evidence: if we know a Fact for sure, any information that contradicts it would be false. The second technique consists in trying to clear the "unknown" values actively. If we want to know who the murderer is, but we have limited information including false evidence from the culprit, it is very likely that once we obtain enough information from other characters we will have some "relative contradictions", represented by having both "true" and "false" values in the list of truth values obtained from different sources about the same fact. Upon finding a contradiction, the character will run two simulations, respectively binding the truth value of that fact to "true" and "false", and propagating further consequences. If the resulting knowledge base of a simulation contradicts another Fact we know for sure is true (values in our "established knowledge" database), we can discard that value. We are currently analysing reasoning tools in order to apply the most suitable one to work with our suspicion model. In order to model sophisticated scenarios, the reasoner should have planning capabilities and be able to work with hypothetical or potential facts. For example, since our knowledge structure works with undetermined truth values, a model based on Intuitionistic Logic would fit well. Using this technique, a suitable framework would be the Hypothetical Logic of Proof [21]. This work is derived from Natural Deduction [20], which is designed to be similar to intuitive, informal reasoning, like the one we can see in detective stories. The next steps in this research will explore the different techniques that work well with our model, presenting a working reasoning prototype. This prototype will consist of a detective game like the one we have mentioned, which is detailed in the next section.

4 Application in a Murder Mystery Game

In order to test our model we propose a game around the concept of trust and deception. Previous researchers have also used games for simulating trust relations, like [6], where an investment game (a modification of Berg's game [4]) is used for analysing how users react when trust and deception are introduced into the game (showing that users trust a system with an advisor more, even if he can deceive them). Our scenario will be an inverted detective story based on the Columbo TV episodes, where the player has the role of the culprit in a murder case, and his goal is to deceive a computer-controlled detective (a sort of virtual Columbo) and lead him to arrest another suspect instead of the player. The simulation will play out as a series of rounds in which the detective will question the player and the other suspects. Obviously, the player knows the details of the murder and he will have the opportunity to hear what the other characters say in order to gain additional information from them to be used for deceiving the detective. The player can, for example, create an alibi for himself ("I was with that person at the moment of the crime"), or blame a third party ("I saw that person entering the victim's room just before the crime").
If the player manages to plant enough false evidence, the detective will arrest the wrong character and the player will win the game. However, if the detective find contradictions in the player’s alibi or manages to extract the true testimony from the other suspects, he will soon discover the player as the real culprit, making him lose the game. In order to illustrate these ideas, the next figures show how the knowledge of the detective evolves during a game session. The example is inspired in the first episode of the original Columbo series, “Murder by the Book” in which Jim Ferris, a famous writer, is killed by his colleague, Ken Franklin. As part of his alibi, Ken tries to incriminate the Mafia, dropping a false document in the crime scene and lying about the next book Jim was working on. The player plays the role of Ken, explaining his alibi to the detective, who will try to find something that contradicts Ken’s suggestions. The next paragraphs explain just a small part of the complete plot of the episode, enough to illustrate the basic ideas behind our model. Firstly, the crime is introduced (Fig. 1) with a clear fact (represented by a white box), and some basic deductions (black arrows) performed by the automatic reasoner. This puts the detective in his way to find the murderer, that could be someone close to the victim, or not (red arrows with a white head represents disjoint Facts). Fig. 1. The crime is introduced and basic deductions are performed 117 Then, the interrogation starts (Fig. 2) and the detective asks Joanna, Jim’s wife, about the enemies of his husband. If Joanna tells the truth, which initially is “believable” (discontinuous line) because there are no contradictions, Jim has no enemies. That means the murderer could be Joanna herself or Ken, his closest friend. Fig. 2. Joanna is interrogated, adding facts and alternatives about the murderer Considering the complete case, there are still many unknown Facts, so the interrogation continues (Fig. 3), this time asking Ken about his friend. As the real culprit, the player may try to deceive the detective lying about the existence of a secret project: Jim was writing a book about the Mafia. He could probably reinforce the idea with more Facts, explicitly lying about Jim and his “enemies”. This contradiction makes “Jim has enemies” an unknown Fact that the detective will investigate. Fig. 3. Ken is interrogated, adding lies and contradictory facts Later on the game session, a false document is found in the crime scene (Fig. 4). As the person who put that document there, Ken has been waiting for this moment and its “dramatic effect”. A list of names of dangerous Mafia members is found in the desk of the victim. The reasoner of the computer-controlled detective finds plausible that the writer created that list, so it seems evident that Jim was really working on that book. But at the end of the game session, looking for more clues to solve other parts of the mystery, more evidence appears: the detective notices that the document has been 118 folded as if someone stored it in his pocket (Fig. 5). It does not make sense that Jim created that list and folded it before putting on his desk, so other person should have done it, possibly trying to incriminate the Mafia instead of the real murderer. So Ken’ story about the Mafia was a lie, and probably previous Facts coming from him should be considered “false”. Now the detective believes Ken is a liar and one of the main suspects: a good approach to the solution of the whole case. 
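The update that closes this example — refuting Ken’s statement, marking him as a liar and re-valuing everything else he said — together with the two-branch simulation introduced in Section 3, can be sketched as follows. Again, this is our own illustration building on the hypothetical KnowledgeBase and Truth names used earlier; statements_by_source, established and rules are stand-ins supplied by the caller, not structures defined in the paper.

```python
def register_contradiction(kb, fact, source, statements_by_source):
    """First-person evidence contradicts a statement made by `source`: mark the source
    as a liar, refute the fact, and re-value his earlier statements as 'false'."""
    kb.source_lied[source] = True
    kb.add_evidence(fact, Truth.FALSE)
    for earlier_fact in statements_by_source.get(source, []):
        kb.add_evidence(earlier_fact, Truth.FALSE)

def resolve_unknown(kb, fact, established, rules):
    """Two-branch simulation: bind the unknown Fact to each value, derive consequences
    with `rules` (a naive stand-in for a real inference engine, mapping a (fact, value)
    binding to the consequences it entails), and discard any binding that contradicts
    the 'established knowledge' database `established` (fact -> Truth known for sure)."""
    surviving = []
    for candidate in (Truth.TRUE, Truth.FALSE):
        derived = {fact: candidate}
        derived.update(rules.get((fact, candidate), {}))
        if all(established.get(f, v) == v for f, v in derived.items()):
            surviving.append(candidate)
    return surviving[0] if len(surviving) == 1 else Truth.UNKNOWN
```

In the terms of this sketch, the folded document is first-person evidence that triggers register_contradiction for Ken, which is what downgrades his earlier statements about the Mafia book.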
Fig. 4. Evidence is found in the crime scene. Ken’s information seems to be true Fig. 5. Another piece of evidence. The detective will not trust Ken anymore 5 Conclusions In this paper we surveyed previous works on computational trust and deception and proposed a model for computer-controlled characters that suspects from others, not trusting any information received from third parties. Previous work on trust models relies on sources that generally do not lie, but in our case that assumption cannot be accepted. Our model is designed for situations with incomplete information and a 119 constant need of judge what is likely to be true or false. Deciding who to trust is a key process in real life that can support and enrich a very wide range of games. We used a ternary depiction for the truth value of facts and an automatic reasoner for inferring potential outcomes starting from the known facts. The model manages a list of potential truth values for each Fact, even coming from information given by a third party, selecting the value that appears the most from trustful sources. It also marks the other parties as trustful or not whenever it discards contradictory facts or accepts valid ones coming from them. If the model cannot decide the truth value of something (hence, it has the same number of supporting and refuting evidence), it will mark it as “unknown”. We described superficially how a reasoning engine could work using this model, trying to clear “unknown” values by reasoning over the potential outcomes that any of the truth values could generate. Such engine needs to be explored more in depth after a comparison of potential techniques, and will be the focus of our next steps. We also designed a game mechanism that is illustrated with an example scenario based in the first episode of the Columbo TV series for developing our model. In this gameplay, the player takes the role of the culprit in a murder case and a computer-controlled detective tries to discover who did it by looking for evidence and questioning suspects, including the player. As the next steps of our research, we are currently working on a computational model that implements this example scenario as a simple text-based dialogue, allowing the user to play the culprit role with the computer acting as the detective. Also, we plan to test the scenario, containing the complete murder mystery, with real users taking the roles of the culprit and the detective in a “Wizard of Oz experiment” where we can analyse how they act and what kind of questions are asked in order to evade suspicion or discovering lies. With the results of this experiment, we will establish a behavioural baseline for comparing our model with the one someone in the role of a detective would have. Then we will re-enact the experiment, but this time using our application where the users only play the culprit role, and we will compare the behaviours of the human detective and the computational one. Further research opens interesting paths: we want to analyse the nature of deception as well, so we want to expand our model allowing the rest of the non-player characters to decide when they should lie in order to cover their alibi, or hide private information. We think that creating a working model for trust and deception would be a useful contribution to interactive storytelling and video games, and to the Artificial Intelligence community as well, because it is a feature not fully explored yet. References 1. 
Abedinzadeh, S., Sadahoui S.: A trust-based service suggestion system using human plausible reasoning. Applied Intelligence 41, 1 (2014): 55-75. 2. Amgoud, L., Cayrol, C.: A reasoning model based on the production of acceptable arguments. Annals of Mathematics and Artificial Intelligence 34, 1-3 (2002): 197-215. 3. Barber, K.S., Fullan, K., Kim, J.: Challenges for trust, fraud and deception research in multiagent systems. Trust, Reputation, and Security: Theories and Practice. Springer Berlin Heidelberg (2003): 8-14. 120 4. Berg, J., Dickhaut, J., Mccabe, K.: Trust, reciprocity, and social history. Games and Economic Behavior 10, 1 (1995): 122-142. 5. Broin, P.Ó., O’Riordan C.: An evolutionary approach to deception in multi-agent systems. Artificial Intelligence Review 27, 4 (2007): 257-271. 6. Buntain, C., Azaria A., Kraus S.: Leveraging fee-based, imperfect advisors in human-agent games of trust. AAAI Conference on Artificial Intelligence (2014). 7. Collins, A., Michalski, R.: The logic of plausible reasoning: A core theory. Cognitive Science 13, 1 (1989): 1-49. 8. Griffin, C., Moore, K.: A framework for modeling decision making and deception with semantic information. Security and Privacy Workshops, IEEE Symposium (2012). 9. Godden, D.J., Walton, D.: Advances in the theory of argumentation schemes and critical questions. Informal Logic 27, 3 (2007): 267-292. 10. Ito, J. Y., Pynadath, D. V., Marsella, S. C. Modeling self-deception within a decisiontheoretic framework. Autonomous Agents and Multi-Agent Systems 20, 1 (2010): 3-13. 11. Jameson, A.: Impression monitoring in evaluation-oriented dialog: The role of the listener's assumed expectations and values in the generation of informative statements. International Joint Conference on Artificial intelligence 2 (1983): 616-620. 12. Kalia, A.K.: The semantic interpretation of trust in multiagent interactions. AAAI Conference on Artificial Intelligence (2014). 13. Lewicki, R.J. and Bunker, B.B.: Trust in relationships: A model of development and decline. Jossey-Bass (1995). 14. Li, D. , Cruz, J.B.: Information, decision-making and deception in games. Decision Support Systems 47, 4 (2009): 518-527. 15. Li, L., Wang, Y.: The roadmap of trust and trust evaluation in web applications and web services. Advanced Web Services. Springer New York (2014): 75-99. 16. Ma, Z.S.: Towards an extended evolutionary game theory with survival analysis and agreement algorithms for modeling uncertainty, vulnerability, and deception. Artificial Intelligence and Computational Intelligence. Springer Berlin Heidelberg (2009): 608-618. 17. Marková, I., Gillespie, A.: Trust and distrust: Sociocultural perspectives. Information Age Publishing, Inc. (2008). 18. Nissan, E.: Epistemic formulae, argument structures, and a narrative on identity and deception: a formal representation from the AJIT subproject within AURANGZEB. Annals of Mathematics and Artificial Intelligence 54, 4 (2008): 293-362. 19. Nissan, E., Rousseau, D.: Towards AI formalisms for legal evidence. Foundations of Intelligent Systems. Springer Berlin Heidelberg (1997): 328-337. 20. Prawitz, D. Natural Deduction. A Proof-Theoretical Study. Almqvist & Wiksell, Stockholm (1965). 21. Steren, G., Bonelli, E. Intuitionistic hypothetical logic of proofs. Electronic Notes in Theoretical Computer Science, 300 (2014): 89-103. 22. Walterbusch, M., Graüler. M., Teuteberg, F.: How trust is defined: A qualitative and quantitative analysis of scientific literature (2014). 23. 
Walton, D.: The three bases for the enthymeme: A dialogical theory. Journal of Applied Logic 6, 3 (2008): 361-379. 24. Wang, Y., Lin, K.J., Wong, D.S., Varadharajan, V.: Trust management towards serviceoriented applications. Service Oriented Computing and Applications 3, 2 (2009): 129-146. 25. Xiong, L., Liu, L.: PeerTrust: Supporting reputation-based trust for peer-to-peer electronic communities. Knowledge and Data Engineering, IEEE Transactions 16, 7 (2004): 843-857. 26. Zhou, R. Hwang, K.: PowerTrust: A robust and scalable reputation system for trusted peerto-peer computing. Parallel and Distributed Systems, IEEE Transactions 18, 4 (2007): 460473. 121 Spontaneous emotional speech recordings through a cooperative online video game Daniel Palacios-Alonso, Victoria Rodellar-Biarge, Victor Nieto-Lluis, and Pedro Gómez-Vilda Centro de Tecnologı́a Biomédica and Escuela Técnica Superior de Ingenieros Informáticos Universidad Politécnica de Madrid Campus de Montegancedo - Pozuelo de Alarcón - 28223 Madrid - SPAIN email:daniel@junipera.datsi.fi.upm.es Abstract. Most of emotional speech databases are recorded by actors and some of spontaneous databases are not free of charge. To progress in emotional recognition, it is necessary to carry out a big data acquisition task. The current work gives a methodology to capture spontaneous emotions through a cooperative video game. Our methodology is based on three new concepts: novelty, reproducibility and ubiquity. Moreover, we have developed an experiment to capture spontaneous speech and video recordings in a controlled environment in order to obtain high quality samples. Keywords: Spontaneous emotions; Affective Computing; Cooperative Platform; Databases; MOBA Games 1 Introduction Capturing emotions is an arduous task, above all when we speak about capturing and identifying spontaneous emotions in voice. Major progress has been made in the capturing and identifying gestural or body emotions [1]. However, this progress is not similar in the speech emotion field. Emotion identification is a very complex task because it is dependent on, among others factors, culture, language, gender and the age of the subject. The consulted literature mentions a few databases and data collections of emotional speech in different languages but in many cases this information is not open to the community and not available for research. There is not an emotional voice data set recognized for the research community as a basic test bench, which makes a real progress in the field very complicated, due to the difficulty in evaluating the quality of new proposals in parameters for characterization and in the classification algorithms obtained using the same input data. To achieve this aim, we propose the design of a new protocol or methodology which should include some features such as novelty, reproducibility and ubiquity. 122 2 Spontaneous emotional speech recordings Typically, emotional databases have been recorded by actors simulating emotional speech. These actors read the same sentence with different tones. In our research, we have requested the collaboration of different volunteers with different ages and gender. Most of volunteers were students who donated their voices. First of all, they had to give their consent in order to participate in our experiment. Therefore, we show a novel way of obtaining new speech recordings. 
The next key feature for this task is reproducibility, where each experiment should provide spontaneity, although the exercise was repeated a lot of times. This characteristic is the most important drawback we have found in the literature. Most of the time, when it carries out an experiment, this user is discarded immediately, because he/she knows perfectly the guideline of the exercise, for this reason the spontaneity is deleted. The third feature is, ubiquity. When we speak about this concept, we refer to carrying out the exercise in every part of the world, but that does not mean that we cannot use the same location or devices. Nowadays, new technologies such as smart-phones, tablets and the like are necessary allies in this aspect. In view of all the above, multiplayer videogames are the perfect way to achieve the last three premises. Each game session or scenario can be different. Moreover, we can play at home, in a laboratory or anywhere. Thanks to the Internet, we can find different players or rivals around the world who speak other languages and have other cultures, etcetera. Each videogame has its own rules, thus each player knows the game system and they follow these rules if they want to participate. For this reason we find the standardization feature intrinsic in the videogames. Therefore, we conclude videogames are the perfect tool in order to elicit spontaneous emotions. This research has two main stages; they consist of capturing emotions through a videogame, more specifically League of Legends (aka LoL), and identify the captured emotions through the new cooperative framework developed by our team. To assess the viability of our protocol, we have developed a controlled experiment in our laboratory. In subsequent sections, it will be explained in detail. The contribution of this work is to establish a community to cooperate in collecting, developing and analyzing emotional speech data and define a standard corpus in different languages where the main source of samples will be emotional speeches captured through videogame rounds. In this sense, this paper is a first step in proposing the design and development of an online cooperative framework for multilingual data acquisition of emotional speech. This paper is organized as follows. In the next section, we introduce some emotional databases of speech and foundations for modeling emotions in games. In section III, we introduce the proposed experiment. And finally, we conclude with the summary and future works. 123 Spontaneous emotional speech recordings 2 3 Previous Works Below, we present previous works carried out by different researchers who have focused their attention in emotional areas. Some of them have elaborated emotional databases, others have developed affective models to improve the realism of NPCs (Non Player Character) or have attempted to verbalize certain situations that happen for game rounds, etc. We are going to attempt to find a common ground between using videogames and the design of a protocol to capture emotions through voice. 2.1 Emotions in Videogames According to [2] there exists a lack a common, shared vocabulary that allows us to verbalize the intricacies of game experience. For any field of science to progress, there needs to be a basic agreement on the definition of terms. This concept is similar to the lack of agreement for the relevant features in order to characterize and classify speech emotions. They define two concepts, flow and immersion. 
Flow can be explained as an optimal state of enjoyment where people are completely absorbed in the activity. This experience was similar for everyone, independent of culture, social class, age or gender. This last assertion is a key point for us, because we are searching for the most suitable method or protocol for anyone. Immersion is mostly used to refer to the degree of involvement or engagement one experiences with a game. Regarding arousal and valence, Lottridge has developed a novel, continuous, quantitative self-report tool, based on the model of valence and arousal which measures emotional responses to user interfaces and interactions [3]. Moreover, Sykes and Brown show the hypothesis that the player’s state of arousal will correspond with the pressure used to press buttons on a gamepad [4]. Concerning elicit emotion and emotional responses to videogames, in [5] presents a comprehensive model of emotional response to the single-player game based on two roles players occupy during gameplay and four different types of emotion. The emotional types are based on different ways players can interact with a videogame: as a simulation, as a narrative, as a game, and as a crafted piece of art. On the other hand, some researchers focus their attention on psychophysiological methods in game research. Kivikangas et al. carry out a complete review of some works in relation with these kind of methods. They present the most useful measurements and their effects in different research areas such as game effects, game events, game design and elicited emotions. Electromyography (EMG), Electrodermal activity (EDA), Heart Rate (HR), among others, are some of these measurements [6]. Another initiative was developed by [7], designing requirements engineering techniques to emotions in videogame design, where they introduced emotional 124 4 Spontaneous emotional speech recordings terrain maps, emotional intensity maps, and emotional timelines as in-context visual mechanisms for capturing and expressing emotional requirements. Regarding emotion modeling in game characters, Hudlicka and Broekens present theoretical background and some practical guidelines for developing models of emotional effects on cognition in NPCs [8]. In [9] have developed a toolkit called the Intelligent Gaming System (IGS) that is based on Command (Atari, 1980). The aim was to keep engagement as measured by changing heartbeat rate, within an optimum range. They use a small test group, 8 people, whose experience was documented and thanks to their conclusions, they could design a theory of modes of affective gaming. 2.2 Emotional Databases Most databases have been recorded by actors simulating emotional discourses and there are a very few of them of spontaneous speech [10], [11]. The emotions are validated and labeled by a panel of experts or by a voting system. Most of the databases include few speakers and sometimes they are gender [12] unbalanced, and most of recorded data do not consider age. Then they restrict to carrying out research related with subject age range [13]. It can be noticed in several publications that the data are produced just for specific research, and the data are not available for the community. Some databases related to our research are briefly mentioned next. Two of the well-known emotional databases for speech are the Danish Emotional Speech Database (DES) [14] and the Berlin Emotional Speech Database (BES) [15] in German. 
BES database, also known as Emo-Database, is spoken by 10 professional native German actors, 5 female and 5 male. It includes the emotions of neutral, anger, joy, sadness, fear, disgust and boredom. The basic information is 10 utterances, 5 short and 5 longer sentences, which could be used in daily communication and are interpretable in all applied emotions. The recorded speech material was around 800 sentences. All sentences have been evaluated by 20-30 judges. Those utterances for which the emotion was recognized by at least 80% of the listeners will be used for further analysis. DES database is spoken by four actors, 2 male and 2 female. It contains the emotions of neutral, surprise, happiness, sadness and anger. Records are divided into simple words, sentences and passages of fluent speech. Concerning stress in speech, the SUSAS English database (Speech Under Simulated and Actual Stress) is public and widely used [16]. It contains a set of 35 aircraft communication words, which are spoken spontaneously by aircraft pilots during a flight, and also contains other samples of non-spontaneous speech. Finally, the work closest to our approach that we have found in literature, has been in [17]. They have developed an annotated database of spontaneous, multimodal, emotional expressions. Recordings were made of facial and vocal 125 Spontaneous emotional speech recordings 5 expressions of emotions while participants were playing a multiplayer first-person shooter (fps) computer game. During a replay session, participants scored their own emotions by assigning values to them on an arousal and a valence scale, and by selecting emotional category labels. 3 Affective Data Acquisition As mentioned before, we used a videogame-like source of elicited emotions. The chosen game was League of Legends (aka LoL). To carry out this task, it was necessary to organize a little tournament, where the team with the best score at the end of the tournament, obtained a check for the amount of 20 e per person as well as a diploma. With the obtained samples, we attempt to find correlates between acoustics, glotals or biomechanics parameters and the elicited emotions. To extract these parameters, we have used [18]. 3.1 The Subjects The subjects are students of Computer Science at the Universidad Politécnica of Madrid. Apparently, students had not got any disease in their voices and they gave their explicit consent in order to participate in the experiment. 3.2 The Game LoL is a multiplayer online battle arena (MOBA) video game developed and published by Riot Games. It is a free-to-play game supported by micro-transactions and inspired by the mod Defense of the Ancients for the video game Warcraft III: The Frozen Throne. League of Legends was generally well received at release, and it has grown in popularity in the years since. By July 2012, League of Legends was the most played PC game in North America and Europe in terms of the number of hours played [19]. As of January 2014, over 67 million people play League of Legends per month, 27 million per day, and over 7.5 million concurrently during peak hours [20]. 3.3 The Environment Each of the game sessions are carried out in our Laboratory, Neuromorphic Speech Processing Laboratory, which belongs to R+D group Centro de Tecnologı́a Biomédica. In this laboratory, we possess a quasi-anechoic chamber that is designed in order to entirely absorb the acoustic waves without echoing off any surface of the chamber such the floor, roof, walls and the like. 
The chamber has a personal computer inside, where a player remains throughout the game round. Outside of the chamber, there will be another four personal computers at the disposal of four members of the rest of the team. Concerning the choice of the anechoic chamber, it is easy to understand that we are looking for ideal conditions in order to develop of following stages such as characterization and extraction of parameters, selection of parameters, classification and finally, detection of emotions in speech [21]. 126 6 3.4 Spontaneous emotional speech recordings Hardware Features Each player has used a SENNHEISER PC 131 headset with wire connector, 30 - 18000 Hz of headphone frequency, 80 - 15000 Hz of microphone frequency, 2000 Ohms of output impedance and 38 Db of sensitivity. On the other hand, computers used in the experiment had the following features. Intel Core 2 Quad - CPU Q6600, 4 cores up to 2.40 Ghz, 4 GB of RAM and Nvidia GForce 8600GT with 512 MB Graphics Card. Moreover, it has been connected to a webcam in order to record faces and gestures for each game session. Video recordings could be crucial in order to recognize in the following stages the saved emotion. 3.5 The Experiment During two weeks, we will convene ten subjects in two shift sessions. The will be one shift in the morning and another one in the afternoon. Each session will have five players, where each player will log in to LoL’s website with his/her official account. Four of five players remain together in the same room, whereas the remaining player will be isolated inside of anechoic chamber. Approximately, each session will be limited to three hours and a half, because the average length of a game is 40 minutes. Once a game is over, the player who stays inside the chamber, will come out the chamber and the following mate takes his place. This continues until five rounds have been completed. Therefore, each day of tournament, we will have recorded 10 different players for 40 minutes. These records will be saved in our raw speech recordings database. Each team will play a maximum of 3 games over the two weeks of tournament. The process of the experiment is depicted in the Fig. 1. This experiment and others, which although they are not the objective of this paper to analyze them deeply, have been developed in order to elicit spontaneous emotions by our team. These experiments are incorporated inside the framework Emotions Portal. 3.6 The Framework The aim of this experiment is the capturing of speech recordings through a cooperative online videogame. The idea is that our server collects spontaneous emotional voice in different languages, with different accents and origins, etc. This framework can be defined as cooperative, scalable or modular and not subjective. According to Fig. 2, we divide our online framework into four stages: User identification, Start of the recording, Play the game and End of the recording and save the speech and video Recording. At the top of the picture is depicted the player who is inside of the anechoic chamber. He/she is connected to our platform which is deployed in our server. The user identification step consists of a sign up process through a web form. In this web form, users provide their personal data, for instance, their name, native language, country, gender, age and email. The last requirement, email address, is convenient in order to have the chance to keep in touch with the user and to give him/her information about the progress of the project. 
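As a rough illustration of the registration data this step needs to capture, the record created by the sign-up form could look like the sketch below. The class and field names are our own assumptions, not the actual data model of the Emotions Portal.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Participant:
    """Fields taken from the sign-up web form described above (names are illustrative)."""
    name: str
    native_language: str
    country: str
    gender: str
    age: int
    email: str                    # used to keep in touch and report project progress
    consent_given: bool = False   # explicit consent is required before any recording

@dataclass
class RecordingSession:
    """One game round recorded inside the quasi-anechoic chamber."""
    participant: Participant
    game: str = "League of Legends"
    started_at: Optional[datetime] = None
    audio_file: Optional[str] = None   # raw speech recording saved in the repository
    video_file: Optional[str] = None   # webcam recording of face and gestures
```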
Once logged in to our platform, the user can choose 127 Spontaneous emotional speech recordings Fig. 1. Scenario of Experiment. 128 7 8 Spontaneous emotional speech recordings the different kind of experiments which are available. The actions mentioned before are explained as follows. First of all, users answer if they suffer from any type of disease in their voice at the moment of performing the test. It is crucial to know if the user suffers from any organic or functional dysphonia, diseases that can affect voice and prevent the use of biometric techniques. Then, the player chooses the language of procedure. When the user is ready to start playing, he/she presses the Start button. In this moment, the platform begins recording and throws a call to LoL’s application. LoL’s platform opens and the user carries out the process of sign in with his/her LoL’s account. These steps are depicted through numbers 1 and 2 in Figure 2. Approximately 40 minutes later, the game round is over. The player signs out of the LoL’s platform and he/she presses the Stop button in our platform. Finally, the speech and video recording are saved in our repository inside of our server. The last steps are depicted through numbers 3 and 4 in Figure 2. 4 Summary The spontaneous data acquisition is a very complex topic to resolve. However, it is the key point to improve human computer interfaces, robots, video games and the like. As mentioned before, some researchers have developed new models and methodologies to improve NPCs, and others have researched new designs to evoke certain emotions during the game sessions. We have developed a methodology in order to capture spontaneous emotions through a cooperative videogame with high quality audio in a controlled environment. Afterwards, we will be able to design a well-labeled database and continue with our previous work on emotion recognition. Acknowledgments. This work is being funded by grants TEC2012-38630-C0401 and TEC2012-38630-C04-04 from Plan Nacional de I+D+i, Ministry of Economic Affairs and Competitiveness of Spain. References 1. P. Ekman, Handbook of cognition and emotion. Wiley Online Library, 1999, ch. Basic emotions, pp. 45–60. 2. W. IJsselsteijn, Y. De Kort, K. Poels, A. Jurgelionis, and F. Bellotti, “Characterising and measuring user experiences in digital games,” in International conference on advances in computer entertainment technology, vol. 2, 2007, p. 27. 3. D. Lottridge, “Emotional response as a measure of human performance,” in CHI’08 Extended Abstracts on Human Factors in Computing Systems. ACM, 2008, pp. 2617–2620. 4. J. Sykes and S. Brown, “Affective gaming: measuring emotion through the gamepad,” in CHI’03 extended abstracts on Human factors in computing systems. ACM, 2003, pp. 732–733. 129 Spontaneous emotional speech recordings 9 5. J. Frome, “Eight ways videogames generate emotion,” Obtenido de http://www. digra. org/dl/db/07311.25139. pdf, 2007. 6. J. M. Kivikangas, G. Chanel, B. Cowley, I. Ekman, M. Salminen, S. Järvelä, and N. Ravaja, “A review of the use of psychophysiological methods in game research,” Journal of Gaming & Virtual Worlds, vol. 3, no. 3, 2011, pp. 181–199. 7. D. Callele, E. Neufeld, and K. Schneider, “Emotional requirements in video games,” in Requirements Engineering, 14th IEEE International Conference. IEEE, 2006, pp. 299–302. 8. E. Hudlicka and J. 
Broekens, “Foundations for modelling emotions in game characters: Modelling emotion effects on cognition,” in Affective Computing and Intelligent Interaction and Workshops, 2009. ACII 2009. 3rd International Conference on. IEEE, 2009, pp. 1–6. 9. K. Gilleade, A. Dix, and J. Allanson, “Affective videogames and modes of affective gaming: assist me, challenge me, emote me,” DiGRA - Digital Games Research Association, 2005. 10. S. Ramakrishnan, “Recognition of emotion from speech: a review,” Speech Enhancement, Modeling and recognition–algorithms and Applications, 2012, p. 121. 11. D. Ververidis and C. Kotropoulos, “A review of emotional speech databases,” in Proc. Panhellenic Conference on Informatics (PCI), 2003, pp. 560–574. 12. V. Rodellar, D. Palacios, P. Gomez, and E. Bartolome, “A methodology for monitoring emotional stress in phonation,” in Cognitive Infocommunications (CogInfoCom), 2014 5th IEEE Conference on. IEEE, 2014, pp. 231–236. 13. C. Muñoz-Mulas, R. Martı́nez-Olalla, P. Gómez-Vilda, E. W. Lang, A. ÁlvarezMarquina, L. M. Mazaira-Fernández, and V. Nieto-Lluis, “Kpca vs. pca study for an age classification of speakers,” in Advances in Nonlinear Speech Processing. Springer, 2011, pp. 190–198. 14. I. S. Engberg and A. V. Hansen, “Documentation of the danish emotional speech database DES,” Internal AAU report, Center for Person Kommunikation, Denmark, 1996. 15. F. Burkhardt, A. Paeschke, M. Rolfes, W. F. Sendlmeier, and B. Weiss, “A database of german emotional speech.” in Interspeech, vol. 5, 2005, pp. 1517–1520. 16. J. H. Hansen, S. E. Bou-Ghazale, R. Sarikaya, and B. Pellom, “Getting started with SUSAS: a speech under simulated and actual stress database.” in Eurospeech, vol. 97, no. 4, 1997, pp. 1743–46. 17. P. Merkx, K. P. Truong, and M. A. Neerincx, “Inducing and measuring emotion through a multiplayer first-person shooter computer game,” in Proceedings of the Computer Games Workshop, 2007, pp. 06–07. 18. “BioMetroPhon - Official Webpage,” 2008, URL: http://www.glottex.com/ [accessed: 2015-05-04]. 19. J. Gaudiosi. Riot games’ league of legends officially becomes most played pc game in the world. [Online]. Available: ”http://www.forbes.com/sites/johngaudiosi/2012/07/11/ riot-games-league-of-legends-officially-becomes-most-played-pc-game-in-the-world/ ” (2012) 20. I. Sheer. Player tally for league of legends surges. [Online]. Available: ”http://blogs. wsj.com/digits/2014/01/27/player-tally-for-league-of-legends-surges/” (2014) 21. V. Rodellar-Biarge, D. Palacios-Alonso, V. Nieto-Lluis, and P. Gómez-Vilda, “Towards the search of detection in speech-relevant features for stress. expert systems,” Expert Systems, 2015. 130 10 Spontaneous emotional speech recordings Fig. 2. Scenario of player inside anechoic chamber. 131 Towards Real-time Procedural Scene Generation from a Truncated Icosidodecahedron Francisco M. Urea1 and Alberto Sanchez2 1 2 FranciscoMurea@gmail.com, Universidad Rey Juan Carlos alberto.sanchez@urjc.es, Abstract. The procedural method is cutting edge in gaming and virtual cities generation. This paper presents a novel technique for procedural real-time scene generation using a truncated icosidodecahedron as basic element and a custom physics system. This can generate a virtual museum in an interactive-way. For doing this, we have created a simple interface that enables creating a procedural scenario regarding the user-specified patterns, like number of pieces, connection mode, seed and constraints into the path. 
The scene is generated around three dimensional edges meaning a new point of view of hypermuseums. It has its own physics system to adapt the movement through the three dimensions of the scene. As a result, it allows the user to create a procedural scene in almost real-time where a user character can go over the 3D scene using a simple interactive interface. Keywords: Procedural generation, 3D scene generation, real-time, hypermuseum, truncated icosidodecahedron 1 Introduction Nowadays, the generation of virtual procedural scenes is growing importance in video games and virtual reality. The procedural generation is used in many areas, such as generating volumetric fire [FKM+ 07], creating textures [RTB+ 92] or piles of rock for building [PGGM09]. In fact, the building information modeling is a popular research area, e.g. with some works from Müller et al. [MWH+ 06] and Patow [Pat12]. On the other hand, physical museums are trying to bring the art and history to the general public. The concept of hypermuseum [Sch04] was born to virtually recreate the interior of a museum. Thus, a hypermuseum is usually understood as an identical recreation of a physical museum. They are subject to the physics laws and have a typical path structure. This work presents a new point of view for hypermuseums unleashing the imagination. The idea is creating a hypermuseum regardless of the physical laws, following the Escher’s picture “Relativity” or “Inception” movie. We use a truncated icosidodecahedron (TI) to create the virtual path rotating and connecting its different faces. 132 2 Francisco M. Urea et al. The procedural process generation is not used to be in real time because the idea is creating and saving something to be used later, but in this case, the hypermuseum is procedurally generated at the same time that it is gone over. Thus, this paper presents an almost real-time approach for procedural generation using a space voxelization. The rest of the manuscript is organized as follows. In Section 2, several works related to our proposal are described. Section 3 presents the TI and explains the mathematics that have been used in the polyhedron and Sec. 4 shows the general idea of creating an hypermuseum with these pieces. Section 5 shows the process to generate a TI into the scene. Section 6 explains the path generation and the custom physics system. Section 7 explains the space division used for data storage and accelerating the generation process. Section 8 shows the evaluation of our proposal. Finally, Section 9 analyzes the conclusions and states the open issues of this work. 2 Related work Some researches have previously created hypermuseums. For instance, [Sch98] generates an accurate representation of a museum using a WWW database. [WWWC04] generates virtual content in which visitors can interact on a display or via web. William Swartout et al. propose to create virtual guides using a natural language and human aspects [STA+ 10]. Nevertheless, as far as we know, this proposal means the first time that a procedural process is being used to generate a hypermuseum. In any case, there are some works with similar procedural approaches. [STBB14] presents a complete survey with different procedural methods useful to generate features of virtual worlds, including buildings and entire cities that can be viewed as similar to hypermuseums. Different methods employ a combination of computational geometry algorithms similar to our proposal. Stefan Greuter et al. 
[GPSL03] describe a “pseudo infinite” city where several kinds of buildings are procedurally generated with different parameters. Julian Togelius et al. [TPY10] generate a map using a multi-objective evolutionary algorithm. [LSWW11] creates paths in a similar way to this work, where a user can walk through them. Finally, [CwCZ11] creates procedural elements using voxels as basic elements. In contrast to these alternatives, this paper presents a novel technique towards real-time procedural scene generation. 3 Model description & breakdown This section presents the figure used for generating the scene and explains the mathematical basis required to create it. An icosidodecahedron is an Archimedean polyhedron with twenty triangular and twelve pentagonal faces (see Fig. 1, first element). Every Platonic solid has a dual. The dual solid is the polyhedral element whose vertices are positioned 133 in the center of each face of the first solid. A Platonic solid can be vertex-truncated by its dual when the ratio d_dodecahedron / d_icosahedron (where d is the distance from the center of the polyhedron to its faces) takes appropriate values. Their intersection gives Archimedean solids. In this case, an icosahedron and a dodecahedron are combined to obtain the icosidodecahedron. Our figure, called TI, presents a more complex geometry than an icosidodecahedron. We use the vertex truncation between an icosidodecahedron and a rhomb-triacontahedron. Figure 1 shows the obtained truncation. A complex polyhedron is used for connecting the faces by means of a connecting bridge, allowing a user character to walk through the faces. The face-transitive duals of the series of vertex-transitive polyhedra go from the rhomb-triacontahedron to the deltoid-hexecontahedron, passing through a continuous series of intermediate hexakis-icosahedra (also called disdyakis-triacontahedra). The following parameters are used for truncation: d_icosahedron = τ / √3 (1) d_dodecahedron = 1 / √(1 + 1/τ²) (2) 1 / (1 + 1/τ²) ≤ d_rt ≤ 1 (3) where τ is the golden ratio and d_rt is the radius of the rhomb-triacontahedron. Fig. 1. Truncation process of an icosidodecahedron. A different polyhedron is obtained by varying the ratio of the rhomb-triacontahedron. Each triangular face of the TI is enumerated as follows. First, a face is randomly selected and numbered with number one. For each face, three connected faces can be accessed, numbering them in clockwise order. Then, we take the second face and repeat this process. Finally, all local positions and rotations of the faces with respect to the center of the icosidodecahedron are saved. For each face there is another face with the same rotation and opposite position, so only four data values are required to position a face in space. 4 Truncated Icosidodecahedron Museum The Truncated Icosidodecahedron Museum (TIM) consists of successively joining custom platforms using the geometry of the triangular faces of the TI. Only 134 two custom platforms have been defined: i) one looking outside of the TI and ii) one looking inside it. The platforms are built based on the icosidodecahedron’s geometry, i.e. choosing twenty platforms generates a complete TI. The user can select a series of parameters, such as the number of faces, the seed, the connection mode between them, and whether the pattern can use fewer faces than the given number. Introducing the same pattern and the same seed creates the same scene. A scene that the user character can walk through is interactively generated. 
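Returning briefly to the truncation parameters of Section 3, the following short snippet evaluates the expressions in Eqs. (1)–(3). It is our own numerical check, not code from the paper; for a unit circumradius, the first two values match the distances from the centre of an icosidodecahedron to its triangular and pentagonal faces respectively.

```python
import math

# Numerical check of the truncation parameters in Eqs. (1)-(3).
# tau is the golden ratio; this is our own verification, not code from the paper.
tau = (1 + math.sqrt(5)) / 2

d_icosahedron = tau / math.sqrt(3)               # Eq. (1): ~0.9342
d_dodecahedron = 1 / math.sqrt(1 + 1 / tau**2)   # Eq. (2): ~0.8507

# Eq. (3): admissible range for the rhomb-triacontahedron radius d_rt.
d_rt_min = 1 / (1 + 1 / tau**2)                  # ~0.7236 (equals d_dodecahedron squared)
d_rt_max = 1.0

print(f"d_icosahedron={d_icosahedron:.4f}  d_dodecahedron={d_dodecahedron:.4f}  "
      f"d_rt in [{d_rt_min:.4f}, {d_rt_max:.4f}]")
```

Varying d_rt across this interval produces the series of intermediate truncations shown in Fig. 1.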
Each platform of the TIM has three connection bridges as joints between neighbor faces of a TI. The faces neighbor to a face are called border faces. The connection between faces are used as paths. Some faces have not connected all their connecting bridges. These faces are called limit faces whereas all the used faces are called useful faces. We use the limit faces to generate new TIs from there to enlarge the TIM scenario. Figure 2 shows the differences between kinds of face. Fig. 2. The faces drawn in a) are useful faces. The faces with horizontal lines in b) are border faces of the face with vertical lines. The face with vertical lines in c) is a limit face. 5 Building a TI This section explains the building of a TI and the problem found during its creation. Subsection 5.1 presents the method to get the useful faces. Subsection 5.2 explains the different kinds of face selection. Subsection 5.3 defines the generation of the TI into the virtual scene. 5.1 Getting of useful faces Figure 3 shows the process to get the useful faces. The following buffers are required: 1. All faces. This list contains useful faces. 2. Available faces. This list contains the faces to get more faces, i.e. it stores the limit faces until the date. It requires a first face, named inception face, to generate the TI given in the constructor. 3. Inaccessible faces. This list contains the faces which we cannot use. 135 Towards RT Scene Generation 5 Fig. 3. Getting face algorithm loop. Firstly, the inception face is added in “available faces” . Immediately a face is got from “available faces” as limit face. Then get the border faces to the limit face. If any of these border faces has already been used or it is contained on “inaccessible faces”, it is refused. Then, get one of the accessible border faces and add it to “all faces” and “available faces”. The process checks then the buffer of “available faces” removing all completed faces. This process is repeated until the size of “all faces” buffer is the same as the number of provided faces. If any face collides, the face is added into “inaccessible faces” buffer before the rebuilt process. 5.2 Connection mode There are three different modes to connect a face with its border faces. All modes repeat the same process but the condition to select the next face to connect differs between them. – Random mode. This mode takes a random face from the buffer of “available faces“. This face is used as limit face. Then get a random face from the border faces of this limit face. At the end, this face is checked and added to the buffers “all faces” and “available faces”. – Full mode. This mode takes the first face of “available faces” as limit face. Then get the first face of its border faces. Once this face is checked, it is added to the buffers of “all faces” and “available faces”. After the clean process, if all border face of this limit face was added, this limit face was erased from “available faces” and take the first face of “available faces”as limit face. If the new limit face has not got border faces, the algorithm takes a random face of “ available faces” as limit face. – Linear mode. At first, this mode randomly takes a face from “available faces”. Then get one border face. It is checked and added to “all faces” and “available faces”. Finally, instead of taking a face from “available faces”, take the last face added as limit face, and repeat the process. If this border face has not got border faces, the algorithm take one as limit face from “available faces”. 
In every mode, the buffer of “available faces” is cleaned at the end of each iteration. This process consist on removing the completed faces. 136 6 Francisco M. Urea et al. 5.3 Generation of a TI Fig. 4. Model of faces from TI and completed TIs This section shows the generation of a TI in the virtual space. First of all, the algorithm requires the following parameters: – Inception face. First platform to generate a TI. A platform is the representation of a face. In the first TI the algorithm gets a random number for selecting an inception face and zero to border face. – Blocked face. Platform that cannot be used. – Buffer of “all faces”. – Face mode. Our proposal presents two different alternatives (see Figure 4). The first one (outside face) is used for walking outside the TI. In the other hand (inside face) is used for traveling inside the TI. The process to generate a TI is represented in Algorithm 1. A platform is understood as the representation of a face. Algorithm 1 Generation of TI faces in virtual space. 1: procedure PutFaces(pathList) 2: for all f aces do 3: if f aceM ode = 0 then 4: Create face with platform ”FaceOutside” 5: else 6: Create face with platform ”FaceInside” 7: end if 8: Add plattform to TI 9: ChangeLocalP osition(idF ace, platf orm) 10: ChangeLocalRotation(idF ace, platf orm) 11: end for 12: end procedure The algorithm instantiates the custom platform given the face mode and this process is repeated for each face from “all faces” buffer. Each platform requires the local position and rotation to place itself in the right position. These parameters are pre-calculated to make the process faster. 137 Towards RT Scene Generation 6 7 Path generation To generate a path in the museum, firstly a TI is created regarding the user patterns as shown above. This is added into the “available TIs” buffer. Then, the algorithm follows the next steps: i) generating all possible TIs, ii) checking the new TI generated, iii) rebuilding the collided objects, and iv) repairing inaccessible paths. The algorithm stops when the “available TIs” buffer is empty. The sequence of creation a path from a TI is explained in next subsections. Fig. 5. Loop to generate a path by means of the union of TIs. Green line means the normal process. Orange line means the alternative case. Blue line is used when a TI is destroyed. 6.1 Generation of new TIs The number of platforms varies depending on the user patterns. Some of these platforms have not connections with others. We use the limit platforms to generate new TIs and connect them. The parameters required to generate a new TI child are the number of faces of the new object, the connection mode and the condition if the path generated can be lower than the number of faces. 6.2 Union of icosidodecahedrons First of all, the faces in available faces from TI are used as limit face to generate new unions. For each face from this buffer, its border faces are calculated. The border faces are needed for two reasons. First, the border face of the TI father is the dual face of the border face of the inception face from the new TI. It means that the border face of the new TI uses the connecting bridge to join both TIs. For this reason, this border face is removed from the border faces to avoid a collision. Second, the border faces are needed to know the where the connecting bridge is to position the new TI. Next, the algorithm calculates the center position for the new object. The midpoint method is used for calculating the new center. 
As we explained in 138 Section 3, every face has an opposite symmetrical face. Thus, the algorithm uses the connecting bridge position between the two objects as the middle point. The algorithm applies the following equation to get the new center: Center_child = Position_bridge ∗ 2 − Center_father (4) The newly generated object is added to the TI list so that the buffer can be checked. Then, if all unions of the TI have been generated, the object is saved into the “full TI” buffer. If some of the unions cannot generate a new object (Section 7 explains the reason), the object is added to the border list. This process is repeated for each TI in the “available TIs” buffer. 6.3 Collision Detection We use a convex mesh around the platform (see Fig. 6) to avoid platform intersection. A ghost layer component is created for each platform. When a collision is detected, this component evaluates the latest created platform and selects which one has to be destroyed. At the end, if the platform is not destroyed, its ghost layer component is deleted. The algorithm has the following steps: 1. The collider detects a collision. If the collided platform does not have a ghost layer component and it is not part of the TI father, it is marked to be destroyed. 2. If the collided platform has a ghost layer component, the component checks whether its identifier is higher. If its identifier is lower, it is added to the collision list. 3. Before destroying the platform, the collision list is checked. If there is any platform in the list, it is destroyed. If not, the platform is left intact. Fig. 6. Mesh collider The collision detection process starts automatically when a platform is instantiated. If the platform has to be destroyed, a message is sent to the kernel process. This message contains the collided TI and the destroyed face. This face is saved into the “inaccessible faces” buffer and a new TI is rebuilt with this constraint. 139 6.4 Rebuilding TI This process is run when a platform of a TI is destroyed. Then, the algorithm creates another TI avoiding the face corresponding to that platform, taking into account the number of useful faces enabled by the user patterns. Nevertheless, if the inception face has collided, this TI is destroyed. 6.5 Repairing TI The repair of a TI is required when some unions cannot be generated due to a TI destruction. This process consists of replacing the connecting bridge with a wall. When the platform is repaired, the collider is replaced with another one which best matches the mesh, taking into account the destroyed connecting bridges. 6.6 Internal physics A TI can be technically understood as a simplification of a sphere. This idea is used to generate an adaptable physics system. While the user walks through the TI, the force is calculated using Newton’s universal gravitational constant (see Fig. 7). When the user walks outside the object, the force pulls the user towards the center of the object, allowing the user to walk over the sphere surface. If the user walks inside, the force pushes the user away from the center. The player system detects the mode and changes the direction of the force when necessary. Fig. 7. Internal physics. The green arrow represents when the user walks outside the surface of the platform. The gray arrow represents when he travels inside the surface. 7 Space division The representation space has been voxelized. Each voxel has the size of a TI and stores the data that represents it. 
When the user character is moved, the cubes 140 10 Francisco M. Urea et al. ot TIs, which have not to be represented, are saved with all its content to provide persistence. A map, where the key is the center of the voxel and the content is a list with all TI data, is used to accelerate the process. In addition, the voxelization has other uses, like limiting the generation of TIs. The cube, where a TI is represented, is divided in three parts in all axis (see Fig. 8). Each one of the subcubes is a voxel to get faster memory accesses. When the user character walks and passes to other voxel, the system accesses memory only using the following nine cubes (instead of accessing the whole memory). This division also places the user in the center of the voxel seeing the existing objects in every direction. Fig. 8. Picture of division of representation space. Fig. 9. Red cube is border margin to generate TI. Green cube is space to representation. We create a margin border (see Fig. 9) to avoid collisions among the TIs generated in different time steps. The size of the margin border is half TI, making the generation of platforms outside the cube impossible. 8 Results We test our proposal using a Intel i7-3630QM CPU laptop with a Nvida 660GTXM and 16Gb DDR3 RAM. The application has been developed by using the game engine Unity3D v4.6.2f1[3] to enable running it in any platform, like Windows, Mac, Linux, mobile or web. The user can walk over the scene specified regarding its own patterns by using the custom physics and sound steps, like if the user is inside of a museum. Figures 10 and 11 show examples of the scene created at the same time that is gone over. This work can be downloaded and tested in https://dl.dropboxusercontent.com/u/697729/TIM/Index.html. To evaluate its performance, we have created a custom pattern and a specific tour to allow us to replicate the scene. We run the same testbed 7 times obtaining the Fig. 12. Figure 12 is divided in six columns, which correspond to different steps of path generation. As we can see, the standard deviation is low, ensuring a similar behavior in different runnings. Regarding performance, the collision detection 141 Towards RT Scene Generation Fig. 10. Render in real time outside of scene. 11 Fig. 11. Render inside the other scene. Fig. 12. Average time in ms. and standard deviation for each step of the algorithm traveling around the scene with the following patterns: 1 face in random mode, 12 faces in full mode, 1 face in random mode. phase requires most of the time. This is due to the Unity’s collision system requires three internal loops to evaluate collision and destruction. Finally, we test the GPU vs CPU performance. We test different scenes, but the time using GPU is very similar to use CPU. 9 Conclusions and future work This paper presents a novel technique to procedurally generate a hypermuseum from a TI. It could be adapted to generate any kind of scene. The user walks into an infinite procedural 3D scene created at the same time that it is gone over. This kind of generation presents a new point of view to understand hypermuseums. Additionally, we have created a panel to make user access easy to specific rooms. Thus, the user can select the room which would like to visit to teleport there. As future work, we are planning to improve the collision detection system or create a custom one. 
Furthermore, we plan to adapt other kinds of figures to work, making more complex scenes and better adapted to the user needs, e.g to add the possibility of using for game scenarios. Finally, we plan to integrate procedural artworks. Acknowledgments Thanks to Hector Zapata, CEO of Tetravol S.L., to propose this as work as a Master’s Thesis and TI platforms. References [3D] Unity 3D. Unity 3d engine. http://unity3d.com/. 142 12 Francisco M. Urea et al. [CwCZ11] Juncheng Cui, Yang wai Chow, and Minjie Zhang. A voxel-based octree construction approach for procedural cave generation. International Journal of Computer Science and Network Security, pages 160–168, 2011. [FKM+ 07] Alfred R. Fuller, Hari Krishnan, Karim Mahrous, Bernd Hamann, and Kenneth I. Joy. Real-time procedural volumetric fire. In Proceedings of the 2007 Symposium on Interactive 3D Graphics and Games, I3D ’07, pages 175–180, New York, NY, USA, 2007. ACM. [GPSL03] Stefan Greuter, Jeremy Parker, Nigel Stewart, and Geoff Leach. Realtime procedural generation of ‘pseudo infinite’ cities. In Proceedings of the 1st International Conference on Computer Graphics and Interactive Techniques in Australasia and South East Asia, GRAPHITE ’03, pages 87–ff, New York, NY, USA, 2003. ACM. [LSWW11] M. Lipp, D. Scherzer, P. Wonka, and M. Wimmer. Interactive modeling of city layouts using layers of procedural content. Computer Graphics Forum, 30(2):345–354, 2011. [MWH+ 06] Pascal Müller, Peter Wonka, Simon Haegler, Andreas Ulmer, and Luc Van Gool. Procedural modeling of buildings. ACM Trans. Graph., 25(3):614–623, July 2006. [Pat12] G. Patow. User-friendly graph editing for procedural modeling of buildings. Computer Graphics and Applications, IEEE, 32(2):66–75, March 2012. [PGGM09] A. Peytavie, E. Galin, J. Grosjean, and S. Merillou. Procedural generation of rock piles using aperiodic tiling. Computer Graphics Forum, 28(7):1801– 1809, 2009. [RTB+ 92] John Rhoades, Greg Turk, Andrew Bell, Andrei State, Ulrich Neumann, and Amitabh Varshney. Real-time procedural textures. In Proceedings of the 1992 Symposium on Interactive 3D Graphics, I3D ’92, pages 95–100, New York, NY, USA, 1992. ACM. [Sch98] Werner Schweibenz. The” virtual museum”: New perspectives for museums to present objects and information using the internet as a knowledge base and communication system. In ISI, pages 185–200, 1998. [Sch04] Werner Schweibenz. Virtual museums. The Development of Virtual Museums,ICOM News Magazine, 3(3), 2004. [STA+ 10] William Swartout, David Traum, Ron Artstein, Dan Noren, Paul Debevec, Kerry Bronnenkant, Josh Williams, Anton Leuski, Shrikanth Narayanan, Diane Piepol, H. Chad Lane, Jacquelyn Morie, Priti Aggarwal, Matt Liewer, Jen-Yuan Chiang, Jillian Gerten, Selina Chu, and Kyle White. Ada and Grace: Toward Realistic and Engaging Virtual Museum Guides. In Proceedings of the 10th International Conference on Intelligent Virtual Agents (IVA 2010), Philadelphia, PA, September 2010. [STBB14] Ruben M. Smelik, Tim Tutenel, Rafael Bidarra, and Bedrich Benes. A survey on procedural modelling for virtual worlds. Computer Graphics Forum, 33(6):31–50, 2014. [TPY10] Julian Togelius, Mike Preuss, and Georgios N. Yannakakis. Towards multiobjective procedural map generation. In Proceedings of the 2010 Workshop on Procedural Content Generation in Games, PCGames ’10, pages 3:1–3:8, New York, NY, USA, 2010. ACM. [WWWC04] Rafal Wojciechowski, Krzysztof Walczak, Martin White, and Wojciech Cellary. Building virtual and augmented reality museum exhibitions. 
In Proceedings of the Ninth International Conference on 3D Web Technology, Web3D '04, pages 135–144, New York, NY, USA, 2004. ACM. Implementación de nodos consulta en árboles de comportamiento* Ismael Sagredo-Olivenza, Gonzalo Flórez-Puga, Marco Antonio Gómez-Martín and Pedro A. González-Calero Departamento de Ingeniería del Software e Inteligencia Artificial Universidad Complutense de Madrid, España * Financiado por el Ministerio de Educación y Ciencia (TIN2014-55006-R) Abstract. Los árboles de comportamiento son la tecnología más utilizada en videojuegos comerciales para programar el comportamiento de los personajes no controlados por el jugador, aunando una gran expresividad con cierta facilidad de uso que permite la colaboración entre programadores y diseñadores de videojuegos. En este artículo describimos un nuevo tipo de nodos en los BTs que representan consultas que devuelven sub-árboles en tiempo de ejecución y explicamos cómo se pueden integrar en Behavior Bricks, un sistema que hemos desarrollado para la creación y ejecución de BTs. 1 Introducción La creación de comportamientos para personajes no controlados por el jugador (en inglés NPC, Non-Player Characters) involucra, entre otros, a los diseñadores y los programadores de inteligencia artificial (o, usando las siglas inglesas, AI). Los diseñadores son los encargados de idear y describir las reglas del juego (¿Cómo se juega? y ¿A qué se juega?) con el objetivo de que éste sea divertido, mientras que los programadores son los encargados de llevar esas ideas a cabo. De entre todas las tareas de diseño, la especificación del comportamiento de los NPCs es una parte vital, ya que afecta directamente a cuestiones como la dificultad o las mecánicas de juego [1]. En este proceso de intercambio de ideas entre ambos se producen problemas de entendimiento y comprensión por falta de una formalización de lo que debe hacer el NPC o cómo se debe comportar. Dicho problema, que es común a la mayoría de los problemas de la fase de análisis de la ingeniería del software, se agrava en el desarrollo de videojuegos, ya que la especificación ni siquiera está completa cuando se inicia el desarrollo del juego y va evolucionando con el tiempo para hacer que el juego sea divertido o tenga una correcta aceptación entre el público. En este contexto, con requisitos tan volátiles, es crucial minimizar en lo posible los problemas derivados de la comunicación entre programador y diseñador, intentando darle al diseñador las herramientas necesarias para que sea él mismo el que implemente los comportamientos, haciendo que el programador intervenga lo menos posible. Un editor visual que le ayude a programar comportamientos de forma gráfica, sobre un modelo de ejecución fácil de comprender, puede ayudar a esta tarea. Tradicionalmente se han utilizado muchas técnicas que intentan presentar la lógica de estos comportamientos de una forma más simple para el diseñador o para programadores menos cualificados: desde utilizar lenguajes de script, más sencillos de entender, para extender los comportamientos, hasta utilizar modelos de ejecución como las máquinas de estado o los árboles de comportamiento, que pueden ser representados gráficamente y además manejan conceptos más cercanos a la forma de pensar de un diseñador. En los últimos años, uno de los modelos más utilizados es el de los llamados árboles de comportamiento (en inglés Behavior Trees o BTs), por su versatilidad frente a otras técnicas, como describiremos en la sección 2.
Tanto esta técnica como las tradicionales máquinas de estado utilizan un conjunto de tareas primitivas de bajo nivel, que son las que realmente interactúan con el entorno del NPC, ya que el ejecutor simplemente decide cuáles de todas ellas debe ejecutar en un instante concreto. El problema surge cuando el juego va creciendo y se van creando nuevas tareas primitivas que afectan a más NPCs, por lo que añadir una nueva forma de realizar una tarea puede implicar modificar el comportamiento de varios tipos de NPCs. O, en un caso aún más extremo, que ni siquiera el diseñador sepa expresar, usando las herramientas, qué es lo que debe hacer el NPC o cómo debe hacerlo. En estos casos se necesita un proceso iterativo de prueba y error que implica una modificación de la especificación tanto en el lado del diseñador, que debe cambiar su diseño teórico del comportamiento, como en el del programador, que debe cambiar la implementación de dichos comportamientos, añadir nuevas tareas primitivas, etc. En este artículo se describe una extensión de los BTs que facilita el proceso iterativo de diseño del comportamiento. Mediante técnicas de razonamiento basado en casos (Case-based reasoning, CBR), permite a los diseñadores crear comportamientos aportándoles ejemplos. Estos ejemplos pueden generarse bien interactivamente, entrenando al NPC, o bien a partir de diferentes implementaciones de la tarea a aprender, usándolas en diferentes contextos. La decisión de cuál de las posibles alternativas se ejecutará se relega al momento de ejecución. Además de asistir al diseñador, esta técnica permite crear comportamientos menos previsibles desde el punto de vista del jugador, lo que incrementa la sensación de que los NPCs tienen un comportamiento más humano. Adicionalmente, este nuevo modelo de ejecución diferido admite otras características, como permitir que los comportamientos incorporen nuevas formas de resolver una tarea sin modificar su estructura o crear comportamientos que vayan aprendiendo con el tiempo la mejor forma de resolver una tarea. El resto del artículo se estructura de la siguiente manera: en la sección 2 se describe qué son los árboles de comportamiento, en la sección 3 se revisa el trabajo relacionado, en la sección 4 se explica qué es Behavior Bricks, nuestra herramienta de creación de comportamientos, en la sección 5 mostramos nuestra propuesta de extensión de los BTs y de Behavior Bricks y, finalmente, en la sección 6 describimos las conclusiones y el trabajo futuro. 2 Behavior Trees Los BTs [2] modelizan la inteligencia artificial de los NPCs de una forma similar a las máquinas de estados (Finite State Machine o FSM) [3] que tradicionalmente se han utilizado con este fin. La diferencia principal entre ambos radica en que los BTs son mucho más escalables, ya que en las máquinas de estados el número de transiciones entre los estados se dispara cuando el comportamiento a implementar crece. Además, los BTs se adaptan mejor a las necesidades de los diseñadores, ya que éstos suelen preferir modelos de ejecución que permitan describir los comportamientos guiados por objetivos o metas a conseguir [4]. Los BTs reemplazan las transiciones de las máquinas de estado con información de control adicional mediante unos nodos intermedios, que son los que deciden el orden en el que deben ejecutarse las tareas primitivas.
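A modo de ilustración, y sin corresponder a la API real de Behavior Bricks (los nombres de tipos son supuestos), el contrato básico de un nodo de BT y un nodo intermedio de tipo secuencia pueden esbozarse en C# de la siguiente forma:

using System.Collections.Generic;

// Resultado que un nodo comunica a su padre al terminar (o mientras sigue en ejecución).
public enum NodeStatus { Success, Failure, Running }

// Contrato mínimo de un nodo de árbol de comportamiento: se "tickea" en cada
// iteración del bucle de juego y devuelve el estado de su ejecución.
public abstract class BTNode
{
    public abstract NodeStatus Tick();
}

// Nodo intermedio de tipo secuencia: ejecuta sus hijos en orden; falla si uno falla
// y tiene éxito sólo si todos terminan con éxito (véase la descripción de la sección 2).
public class Sequence : BTNode
{
    private readonly List<BTNode> children;
    private int current;   // hijo que se está ejecutando actualmente

    public Sequence(params BTNode[] nodes) { children = new List<BTNode>(nodes); }

    public override NodeStatus Tick()
    {
        while (current < children.Count)
        {
            NodeStatus status = children[current].Tick();
            if (status == NodeStatus.Running) return NodeStatus.Running;
            if (status == NodeStatus.Failure) { current = 0; return NodeStatus.Failure; }
            current++;   // el hijo actual terminó con éxito: pasar al siguiente
        }
        current = 0;
        return NodeStatus.Success;
    }
}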
Estas tareas primitivas son aquellas partes del comportamiento que tienen acceso al entorno virtual al que pertenece el NPC y pueden ser de dos tipos:
– Acciones: son tareas que modifican de alguna forma el entorno del NPC, por ejemplo moverlo, atacar a un enemigo, saltar, etc. Las acciones se sitúan en los nodos terminales (hojas) de los BTs.
– Condiciones: simplemente comprueban el estado del entorno sin modificarlo. Estas tareas pueden usarse en los nodos hoja de los BTs o bien en ciertos nodos intermedios especiales que toman decisiones de control.
Cuando la ejecución de un nodo termina, éste notifica a su nodo padre si la ejecución de la tarea tuvo éxito (Success) o no (Failure). Los nodos intermedios utilizarán esta información proveniente de sus nodos hijo para tomar la decisión de cuáles serían los siguientes nodos a ejecutar (dependiendo del tipo de nodo intermedio). Algunos de los nodos intermedios más utilizados son los siguientes:
– Secuencia: ejecuta una lista de comportamientos hijo del nodo en el orden en que se han definido. Cuando el comportamiento que está ejecutándose actualmente termina con éxito, se ejecuta el siguiente. Si termina con fallo, él también lo hace. Devolverá Success si todos se han ejecutado con éxito.
– Selector: tiene una lista de comportamientos hijo y los ejecuta en orden hasta encontrar uno que tiene éxito. Si no encuentra ninguno, termina con fallo. El orden de los hijos del selector proporciona el orden en el que se evalúan los comportamientos.
– Parallel: ejecuta a la vez todos sus comportamientos hijo. Esta ejecución no debe necesariamente ser paralela (en varios núcleos), sino que puede realizarse entrelazada; lo importante es que ocurre simultáneamente dentro de la misma iteración del bucle de juego. Los nodos parallel pueden tener como política de finalización la del nodo secuencia (acabar si todos los hijos lo hacen) o la de los selectores (acabar si uno de sus nodos hijos lo hace).
– Selector con prioridad: los nodos selector llevan incorporada una cierta prioridad en el orden de ejecución, pero no reevalúan nodos que ya han terminado. El selector con prioridad (priority selector) se diferencia del selector normal en que las acciones con más prioridad siempre intentan ejecutarse primero en cada iteración. Si una acción con mayor prioridad que otra que ya se estaba ejecutando puede ejecutarse, se interrumpe la acción que se estaba ejecutando anteriormente.
– Decoradores: son nodos especiales que sólo tienen un hijo y que permiten modificar el comportamiento o el resultado de ese hijo. Algunos ejemplos son decoradores que repiten la ejecución del hijo un número de veces determinado, los que invierten el resultado del hijo (de éxito a fracaso y viceversa) o aquellos que poseen una condición adicional, de forma que se ejecutará o no el hijo en base a si se cumple esa condición.
3 Trabajo relacionado El Razonamiento basado en casos (en inglés Case-based reasoning, CBR) [5] es el proceso de solucionar nuevos problemas basándose en las soluciones de problemas anteriores y es un conjunto de técnicas ampliamente utilizadas en multitud de dominios de la inteligencia artificial. En CBR necesitamos mantener una base de casos en la que debemos almacenar qué tarea o acción se ejecutó en el pasado y en qué contexto se hizo.
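Como ilustración puramente orientativa (estructura y nombres supuestos por nosotros, no los del sistema que se describe más adelante), un caso puede representarse como la tarea ejecutada, el comportamiento concreto elegido y los atributos del contexto en que se ejecutó:

using System.Collections.Generic;

// Un caso de la base de casos: qué implementación de una tarea se ejecutó y en qué contexto.
public class Case
{
    public string Task;                        // tarea genérica, p. ej. "Attack"
    public string Behavior;                    // implementación elegida, p. ej. "MeleeAttack"
    public Dictionary<string, float> Context;  // atributos del contexto (etiqueta -> valor; los booleanos pueden codificarse como 0/1)
}

// Base de casos mínima: almacena casos y recupera los asociados a una tarea concreta.
public class CaseBase
{
    private readonly List<Case> cases = new List<Case>();

    public void Add(Case c) => cases.Add(c);

    public IEnumerable<Case> ForTask(string task)
    {
        foreach (Case c in cases)
            if (c.Task == task)
                yield return c;
    }
}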
Con esos datos, el CBR trata de inferir cuál de estos resultados es el más apropiado de usar en el problema actual, asumiendo que si aplicamos soluciones buenas en el pasado a problemas similares al actual, esas soluciones deberı́an de ser buenas también en el problema actual. La información almacenada en la base de casos puede provenir de conocimiento experto o ser generada almacenando las acciones que el experto está realizando mientras resuelve el problema, generando lo que se suele denominar como trazas. Estas trazas no son más que ejemplos generados por un experto que sirven de información al CBR para tomar sus decisiones. La idea de extender los BTs con nodos terminales que no ejecutan una tarea concreta sino un conjunto de posibles tareas, aplazando la decisión de cuál ejecutar al momento de ser ejecutada, ya se ha explorado en trabajos previos con la inclusión de nodos consulta [6] para poder recuperar implementaciones de comportamientos similares en base a una descripción ontológica realizada por el propio diseñador, ası́ como para recuperar planes en juegos de estrategia como Starcraft [7] donde se seleccionaban BTs con diferentes implementaciones de una acción del juego en base a información semántica proporcionada por expertos. Estos trabajos previos se basaban en la idea de que la información semántica que permitı́a seleccionar el BT a ejecutar, proviene de una descripción semántica de dicho problema mediante una ontologı́a. Esta información semántica describe el contexto idóneo en el que se debe utilizar dicho comportamiento. Esta información ha de ser introducida por el diseñador, lo cual es poco flexible y requiere de un trabajo extra por su parte que es, además, propenso a errores. Esta nueva aproximación pretende proporcionar mecanismos para que la generación de esa información semántica y los valores correctos de la misma, se aprendan automáticamente. De esta forma se simplifican las tareas que el diseñador o los programadores deben realizar para mantener una ontologı́a actualizada. 147 También otros autores han utilizado otras técnicas de aprendizaje automático con otros enfoques para intentar seleccionar el BT que mejor resolvı́a un problema usando algoritmos genéticos [8] y otras técnicas de aprendizaje automático como el Q-Learning [9]. 4 Behavior Bricks Behavior Bricks1 es una herramienta cuya finalidad es el desarrollo de comportamientos inteligentes para videojuegos. Permite la incorporación de tareas primitivas desarrolladas por los programadores para ser usadas en árboles de comportamiento, construidos por los diseñadores a través de un editor visual integrado, lo que permite la colaboración entre ambos, minimizando problemas de comunicación y permitiendo que ambos trabajen en paralelo. En Behavior Bricks se define como tarea primitiva al mı́nimo fragmento de código ejecutable por un comportamiento, que son las acciones y condiciones de los BTs. Estas primitivas se definen mediante un nombre (único en toda la colección) que determina la semántica de la acción o condición y opcionalmente, una lista de parámetros de entrada (Y también de salida en el caso de las acciones). Estas primitivas deben ser escritas por los programadores en C#. Las acciones primitivas constituyen los “ladrillos” básicos para la creación de comportamientos por parte de los diseñadores. 
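A título ilustrativo, una acción primitiva podría declararse en C# de forma parecida al siguiente esbozo; los atributos y la superclase que aparecen son hipotéticos y no corresponden a la API real de Behavior Bricks, aunque el método OnUpdate() sí se describe en la sección 4.2:

using System;

// Atributos hipotéticos con los que declarar el nombre único y los parámetros de entrada.
[AttributeUsage(AttributeTargets.Class)]
public class ActionAttribute : Attribute
{
    public string Name { get; }
    public ActionAttribute(string name) { Name = name; }
}

[AttributeUsage(AttributeTargets.Field)]
public class InParamAttribute : Attribute { }

// Misma enumeración de estados que en el esbozo anterior.
public enum NodeStatus { Success, Failure, Running }

// Superclase hipotética de las acciones primitivas.
public abstract class PrimitiveAction
{
    // Invocado en cada iteración del bucle de juego; indica si la acción ha terminado
    // (con éxito o fracaso) o debe seguir ejecutándose.
    public abstract NodeStatus OnUpdate();
}

// Ejemplo de "ladrillo" básico: un ataque cuerpo a cuerpo parametrizado.
[Action("MeleeAttack")]
public class MeleeAttack : PrimitiveAction
{
    [InParam] public string Target;
    [InParam] public float Damage;

    public override NodeStatus OnUpdate()
    {
        // Aquí iría la lógica del ataque; se devolvería Running mientras no termine.
        return NodeStatus.Success;
    }
}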
La posibilidad de parametrizar estas primitivas las dota de generalidad, lo que permite que un estudio de videojuegos pueda, con el tiempo, construir una biblioteca de acciones y condiciones reutilizables entre proyectos. Para permitir la comunicación entre las diferentes tareas, los parámetros de entrada y salida de las mismas leen y escriben en un espacio de memoria compartido al que denominamos la pizarra del comportamiento. Estos parámetros se deben asociar a las variables de la pizarra en el diseño del comportamiento, permitiendo interrelacionarlas. La información almacenada en estas pizarras es lo que denominamos contexto del comportamiento. Al igual que las primitivas, los comportamientos construidos por los diseñadores también pueden tener parámetros de entrada y de salida. Esto permite que dichos comportamientos puedan ser reutilizados en otros BTs como tareas primitivas. La extensión que proponemos en este trabajo surge de forma natural gracias a esta última característica: si los BTs ya construidos pueden colocarse en otros BTs más generales como nodos hoja, podemos ver esos BTs como definiciones de tareas primitivas implementadas con árboles en lugar de directamente en código. La tarea implementada con ese BT puede ser un nuevo tipo de tarea primitiva que pasa a estar disponible en la colección o una implementación distinta de una tarea primitiva ya existente. Por lo tanto, dada una tarea, podemos tener un conjunto de posibles implementaciones de dicha tarea que podemos utilizar. Para poder hacer esta asociación, los parámetros de entrada y de salida de la tarea a implementar y del comportamiento que la implementa deben coincidir en número y tipo. 1 http://www.padaonegames.com/bb/ 4.1 El editor de comportamientos de Behavior Bricks Behavior Bricks se compone de dos partes: el motor de ejecución de comportamientos y el editor de esos comportamientos. Aunque el motor de ejecución es aséptico y puede ser utilizado en cualquier juego que sea capaz de ejecutar código en C#, el editor está hecho como plug-in para Unity3D (http://unity3d.com/). La figura 1 muestra el aspecto del editor, que está implementado usando UHotDraw [10], un framework desarrollado por los autores que simplifica la creación de editores en Unity. Fig. 1: El editor de Behavior Bricks. 4.2 Creación de primitivas en Behavior Bricks Las acciones primitivas en Behavior Bricks son implementadas por los programadores usando C#. Para proporcionar tanto el nombre como los parámetros de entrada y salida se usan atributos de C#. Las acciones se implementan heredando de una superclase con un conjunto de métodos que deben ser sobrescritos y que son invocados por el intérprete durante el ciclo de vida de la acción. De ellos, el más importante es OnUpdate(), llamado en cada iteración del ciclo del juego, y que debe devolver si la acción ha terminado ya (con éxito o fracaso) o debe seguir siendo ejecutada en la próxima iteración. Las condiciones evalúan el estado del mundo, permitiendo determinar si se cumplen ciertas reglas. Al igual que las acciones, las condiciones primitivas son también implementadas por los programadores utilizando los mismos mecanismos (atributos de C# y herencia). En este caso, el método sobrescrito será Check(), que devolverá un valor booleano. 4.3 Metodología de uso de Behavior Bricks En la industria hay cierto recelo a la hora de permitir a los diseñadores crear comportamientos, incluso si disponen de una herramienta que les facilite la
tarea, soliendo encargarse principalmente de supervisar el comportamiento [4]. Basándonos en algunos experimentos, hemos detectado que, con una metodología adecuada y una formación mínima, los diseñadores son capaces de realizar comportamientos generales sin necesidad de tener demasiados conocimientos de programación. Para ello hemos definido una metodología de uso de Behavior Bricks en la que definimos varios roles a la hora de crear estos comportamientos según el tipo de usuario de la herramienta. Entre los profesionales dedicados al diseño de videojuegos existen algunos con un perfil más técnico y otros que prácticamente no tienen conocimientos informáticos. Los diseñadores no técnicos podrían tener problemas para crear un comportamiento completo usando el editor, por lo que nuestra metodología propone dividir los comportamientos generales en comportamientos a bajo nivel y comportamientos a alto nivel. Cuando pensamos en un comportamiento a alto nivel lo hacemos viendo dicho comportamiento como una descripción simple del comportamiento general de un NPC, similar a lo que los diseñadores suelen hacer en los primeros prototipos del juego [11]. Estos comportamientos a alto nivel pueden contener otros comportamientos más específicos y más cercanos a los problemas típicos de implementación, a los que denominamos comportamientos de bajo nivel y que deberían ser creados por diseñadores técnicos o por programadores. De esta forma, podemos dividir tareas y asignar responsabilidades dependiendo de las características del equipo de desarrollo, permitiendo la cooperación entre ambos. Para clarificar estos conceptos, la figura 2 muestra un esquema del reparto de responsabilidades que define nuestra metodología de uso de Behavior Bricks. Fig. 2: Comportamientos de alto y bajo nivel y su asignación de responsabilidades. 5 Extensión de Behavior Bricks con Query Nodes En el proceso de creación del comportamiento asumimos que el diseñador sabe cómo debe comportarse el NPC, pero vamos a suponer que estamos en un estado temprano de desarrollo del juego en el que los comportamientos de los NPCs aún no están claros. Incluso con la separación de tareas descrita en nuestra metodología, la comunicación entre programador y diseñador debe ser fluida y constante en estos primeros compases. Imaginemos que nuestra biblioteca de tareas primitivas, tras sucesivos proyectos desarrollados, empieza a ser de gran tamaño y está llena de comportamientos similares pero con distintos matices de implementación. ¿Cuál de ellos debe usar el diseñador? Primeramente, puede que ni siquiera lo tenga claro; puede incluso que el programador esté ocupado haciendo otras tareas y el diseñador no disponga de conocimientos suficientes como para crear comportamientos a bajo nivel que se adapten a sus necesidades. ¿Qué alternativas tiene el diseñador en este contexto? Puede esperar a que esas tareas a bajo nivel sean implementadas, puede intentar ir probando todas las tareas similares que encuentre en la biblioteca o aventurarse a implementar una nueva tarea por sí solo.
Nosotros proponemos una extensión de Behavior Bricks y de los BTs en general, basada en la idea de incorporar un nuevo nodo hoja denominado Query Node, descrito en [6], en la que aprovechando las posibilidades que ofrece Behavior Bricks para crear jerarquı́as de tareas y comportamientos, el diseñador introduzca como nodo hoja, un nodo especial al que denominamos nodo Query en el que se define el tipo de tarea que queremos ejecutar, pero no cuál de sus posibles implementaciones se ejecutará. Dicha decisión se llevará a cabo en tiempo de ejecución, seleccionando aquella que mejor se adapte al entorno donde finalmente se utilice. A modo de ejemplo, imaginemos que el diseñador dispone de multitud de implementaciones diferentes de la acción Attack. Por ejemplo: puede atacar cuerpo a cuerpo, desde lejos, puede intentar rodear al enemigo antes de atacarle, o puede autodestruirse si ve que sus posibilidades de victoria son escasas. ¿Cuál es la más adecuada? En [6] o [7] para tomar esta decisión se usaba conocimiento experto descrito mediante una ontologı́a o descripción semántica, tanto de los comportamientos almacenados como del árbol donde querı́amos integrar dicho nodo Query. Nuestra nueva aproximación es adquirir este conocimiento mediante un proceso de aprendizaje, intentando que el diseñador tenga que introducir el mı́nimo conocimiento semántico posible para que sea fácil de mantener. Para ello nos enfrentamos a varios problemas a resolver. Por un lado, no todos los comportamientos son utilizables en todos los NPCs. Tampoco el diseñador quiere que sus NPCs puedan atacar de todas las formas implementadas posibles. Ası́ que debe haber algún tipo de prerrequisito para poder usar una tarea o un comportamiento. En la definición de las tareas primitivas añadimos un nuevo método: bool CheckPrerequisites() que informa al comportamiento que lo ejecuta de las restricciones de dicha tarea. Imaginemos por ejemplo en el caso de la acción Attack, que para poder usar el ataque a distancia, el NPC debe disponer de un arma a distancia, o solo ciertos enemigos de cierto nivel son capaces de rodear antes de atacar o que para poder autodestruirse, el NPC debe tener una bomba. Este mecanismo permite de forma sencilla para el programador de la acción describir qué restricciones de uso son las que precisa dicha acción. Estas restricciones pueden ser supervisadas por el diseñador e incluso añadir nuevas restricciones 151 adicionales de diseño en el nodo Query, (por ejemplo que la acción a ejecutar sea del nivel adecuado para el NPC o dependiendo de la dificultad del juego) usando una condición previamente programada que el nodo chequea para cada posible comportamiento a elegir. No es necesario definir los prerrequisitos de los comportamientos complejos creados por los diseñadores ya que estos se pueden inferir de las tareas primitivas y otros subcomportamientos que utilicen los comportamientos seleccionados por el nodo Query, simplemente chequeando todos lo prerrequisitos de todas las tareas que contenga el comportamiento. Ası́ pues, cuando el nodo Query se ejecute, deberá buscar en la base de casos aquellos que implementen la tarea a ejecutar que ha sido establecida en tiempo de diseño y de entre ellos, preseleccionar aquellos que son ejecutables cumpliendo por un lado sus prerrequisitos y por otro lado la condición definida en el nodo Query. De todos estos candidatos se debe seleccionar cuál es el más adecuado. Si no hay ningún comportamiento recuperado, entonces se ejecutará la tarea primitiva establecida. 
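Un posible esbozo de esta preselección de candidatos (todos los nombres son supuestos salvo CheckPrerequisites(), que sí se menciona en el texto) sería el siguiente:

using System;
using System.Collections.Generic;

// Interfaz hipotética de una implementación candidata para la tarea de un nodo Query.
public interface ITaskImplementation
{
    string Task { get; }          // tarea que implementa, p. ej. "Attack"
    bool CheckPrerequisites();    // restricciones de uso definidas por el programador
    void Run();                   // ejecución de la implementación elegida
}

public class QueryNode
{
    private readonly string task;                                     // tipo de tarea fijado en tiempo de diseño
    private readonly Func<ITaskImplementation, bool> designCondition; // condición adicional añadida por el diseñador
    private readonly ITaskImplementation fallback;                    // tarea primitiva establecida por defecto

    public QueryNode(string task, Func<ITaskImplementation, bool> designCondition, ITaskImplementation fallback)
    {
        this.task = task;
        this.designCondition = designCondition;
        this.fallback = fallback;
    }

    // Preselecciona las implementaciones ejecutables de la tarea; si no hay ninguna,
    // se devuelve únicamente la tarea primitiva por defecto.
    public List<ITaskImplementation> SelectCandidates(IEnumerable<ITaskImplementation> library)
    {
        var candidates = new List<ITaskImplementation>();
        foreach (var impl in library)
            if (impl.Task == task && impl.CheckPrerequisites() && designCondition(impl))
                candidates.Add(impl);

        if (candidates.Count == 0)
            candidates.Add(fallback);
        return candidates;
    }
}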
5.1 Métodos de selección del comportamiento más adecuado Cuando el nodo Query ha seleccionado el conjunto de posibles comportamientos a ejecutar, debe decidir cuál de ellos ejecuta. Para ello, el algoritmo recurre a una base de casos donde recupera cuál de los candidatos a ejecutar se utilizó en unas circunstancias similares a las actuales. Para ello debemos guardar en esta base de casos la acción ejecutada y el contexto del árbol de comportamiento que la ejecutó (su pizarra o parte de ella). La información almacenada en la base de casos tendrá entonces una lista de atributos con su valor en el momento de ejecutar el comportamiento. Utilizando una medida de similitud entre los parámetros almacenados y los de la pizarra del comportamiento, seleccionaremos el comportamiento que se ejecutó en un contexto más parecido al nuestro. El problema aquí es hacer la correspondencia entre los parámetros almacenados en las trazas de la base de casos y el estado del contexto del nodo Query, ya que se necesita hacer una correspondencia semántica entre el nombre del atributo almacenado en la base de casos y el nombre del atributo de la pizarra del comportamiento que ejecuta el nodo Query. Así que, para poder hacer esta correspondencia, necesitamos etiquetar los parámetros con una etiqueta semántica común para todos los comportamientos. Estas etiquetas semánticas son creadas por los diseñadores en forma de una simple lista o relacionándolas por sinónimos mediante un tesauro si queremos una semántica más rica. De esta forma, en vez de almacenar los nombres de la pizarra del comportamiento que genera la traza para la base de casos, se almacena su valor semántico. En la tabla 1 podemos ver la pizarra del comportamiento que ejecuta el nodo Query Attack. En la tabla 2 podemos ver un ejemplo de una base de casos para la acción Attack con diferentes valores semánticos de los atributos y el resultado de similitud con el contexto actual mediante una distancia euclídea simple, como ejemplo ilustrativo de cómo realizar dicha recuperación. Los atributos marcados con guión no están almacenados para el caso concreto, ya que no son relevantes.
Table 1: Pizarra del comportamiento que ejecuta el nodo Query Attack
Attribute | SemanticTag | Value
Distance | EnemyDistance | 13
Bullet | Bullets | 25
Life | PlayerLive | 12
EnemyInCover | BehindCover | False
Table 2: Base de casos de ejemplo con las diferentes implementaciones de Attack
EnemyDistance | PlayerLive | Bullets | BehindCover | Behavior | Task | Similarity
12 | 5 | 50 | – | AttackToDistance | Attack | 2.88
1 | 0 | – | – | MeleeAttack | Attack | 23.28
2 | 1 | 12 | – | AutoDestroy | Attack | 12.16
10 | 12 | 0 | True | RoundAndMelee | Attack | 11.27
En este ejemplo nos quedaríamos con la distancia mínima, que determina la mayor similitud con el contexto actual y que en este caso corresponde al comportamiento AttackToDistance. Para disminuir posibles errores al seleccionar el comportamiento a ejecutar se pueden seleccionar los K comportamientos más similares (KNN, del inglés k-Nearest Neighbors) [12] y quedarnos con el que más se repita (más abajo se incluye un esbozo ilustrativo de esta recuperación). Al seleccionar la acción en base a la similitud, el diseñador no puede garantizar cómo se va a comportar el NPC, por lo que pueden surgir comportamientos emergentes que van a ofrecer una cierta impredecibilidad en el comportamiento del NPC; esto puede ser interesante desde el punto de vista del jugador y de su percepción de la experiencia jugable, pero no siempre es del agrado de los diseñadores, que normalmente prefieren mantener el control del juego.
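Como esbozo ilustrativo de la recuperación por similitud de las tablas 1 y 2 (cálculo simplificado y nombres supuestos; los atributos booleanos se codifican aquí como 0/1), el nodo Query podría ordenar los casos por distancia euclídea sobre los atributos compartidos y quedarse con el comportamiento más frecuente entre los K más cercanos:

using System;
using System.Collections.Generic;
using System.Linq;

// Caso almacenado: valores de los atributos (por etiqueta semántica) y comportamiento ejecutado.
public class StoredCase
{
    public Dictionary<string, float> Attributes;   // p. ej. {"EnemyDistance": 12, "PlayerLive": 5, "Bullets": 50}
    public string Behavior;                        // p. ej. "AttackToDistance"
}

public static class CaseRetrieval
{
    // Distancia euclídea calculada sólo sobre los atributos presentes en el caso almacenado.
    public static double Distance(Dictionary<string, float> blackboard, Dictionary<string, float> stored)
    {
        double sum = 0;
        foreach (var attribute in stored)
            if (blackboard.TryGetValue(attribute.Key, out float value))
                sum += (value - attribute.Value) * (value - attribute.Value);
        return Math.Sqrt(sum);
    }

    // Selección estilo KNN: de los k casos más cercanos al contexto actual,
    // se devuelve el comportamiento que más se repite.
    public static string SelectBehavior(Dictionary<string, float> blackboard, IEnumerable<StoredCase> cases, int k = 3)
    {
        return cases.OrderBy(c => Distance(blackboard, c.Attributes))
                    .Take(k)
                    .GroupBy(c => c.Behavior)
                    .OrderByDescending(g => g.Count())
                    .First().Key;
    }
}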
Por ello, hay que intentar mantener dichos comportamientos emergentes bajo control para que el NPC no realice comportamientos sin sentido o alejados de la idea que el diseñador tiene del juego. 5.2 Generación de la base de conocimiento Para generar la base de casos definimos un nuevo decorador, llamado Record, que se asigna al nodo del árbol que queremos sustituir posteriormente por el nodo Query y que utilizaremos para almacenar las trazas. Este árbol generado por el diseñador generará trazas guardando los atributos del comportamiento que el diseñador seleccione. Las trazas se guardarán en una base de casos que puede ser alimentada con diferentes implementaciones del sub-árbol a aprender. Otra alternativa para generar la base de casos es establecer un nuevo tipo de nodo terminal que denominamos InteractiveRecord. Este nodo permite ejecutar los comportamientos a petición de un usuario, asignando cada comportamiento a ejecutar a un control. De esta forma, cuando el NPC decida ejecutar la acción a aprender, avisará al diseñador para que éste pulse el control asociado al comportamiento que considere mejor en ese momento, de forma que enseña al NPC cómo debe comportarse. Dado que el diseñador puede equivocarse al ejecutar el comportamiento, se puede asignar a cada uno de los comportamientos a ejecutar una función de valoración que determine cómo de buenas han sido estas trazas, pudiendo hacer una selección de la base de casos que descarte las decisiones erróneas tomadas por el diseñador al ejecutar el juego. 5.3 NPCs con aprendizaje adaptativo Modificando ligeramente el comportamiento del nodo Query podemos permitir el aprendizaje durante el juego, dotando al NPC de la capacidad de adaptarse al comportamiento del jugador y aprender nuevas estrategias. El NPC puede partir de una base de casos configurada por el diseñador, pero que se irá modificando en base a lo que el NPC vaya aprendiendo durante el tiempo de juego con el jugador. Para ello, el nodo Query dispone de un parámetro que determina la probabilidad de seleccionar un comportamiento de la base de casos o uno al azar de entre los disponibles. La probabilidad de elegir uno u otro puede ir variando en función de múltiples aspectos, como la diversidad de comportamientos existente en la base de casos o los resultados obtenidos por el NPC. Si el NPC detecta que sus estrategias no consiguen vencer al jugador, puede incrementar la probabilidad de elegir nuevas formas de implementar un comportamiento, dándole más probabilidad de elegir uno de forma aleatoria. Las trazas en este aprendizaje online deben estar puntuadas para saber si el comportamiento que se ha seleccionado ha cumplido su objetivo (por ejemplo, para la tarea de atacar, si ha dañado al jugador y cuánto daño le ha hecho). Cuando el nodo Query seleccione un comportamiento de la base de casos, seleccionará, de entre los K más parecidos, aquel que tenga una valoración más alta (véase el esbozo al final de este apartado). Configurando este parámetro de "aleatoriedad" podemos generar comportamientos más emergentes y más sorprendentes para el jugador, lo que ayuda a reducir la sensación de IA scriptada (repetitiva), ya que el NPC puede cambiar de estrategia si detecta que la elegida no es eficaz.
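El siguiente esbozo (estructura y nombres supuestos) ilustra esta selección adaptativa: un parámetro regula la probabilidad de explorar un comportamiento al azar frente a reutilizar el mejor valorado de entre los K casos más parecidos.

using System;
using System.Collections.Generic;
using System.Linq;

// Caso puntuado: comportamiento, similitud con el contexto actual y valoración obtenida
// (p. ej. el daño causado al jugador por esa forma de atacar).
public class ScoredCase
{
    public string Behavior;
    public double Similarity;
    public double Score;
}

public class AdaptiveSelector
{
    private readonly Random random = new Random();

    // Probabilidad de elegir un comportamiento al azar; puede ir variando según
    // la diversidad de la base de casos o los resultados obtenidos por el NPC.
    public double ExplorationRate = 0.1;

    public string Select(IList<string> available, IList<ScoredCase> scoredCases, int k = 3)
    {
        // Exploración: probar una alternativa al azar de entre las disponibles.
        if (scoredCases.Count == 0 || random.NextDouble() < ExplorationRate)
            return available[random.Next(available.Count)];

        // Explotación: de los K casos más similares, el comportamiento mejor valorado.
        return scoredCases.OrderByDescending(c => c.Similarity)
                          .Take(k)
                          .OrderByDescending(c => c.Score)
                          .First().Behavior;
    }
}

Si las estrategias actuales dejan de ser eficaces, bastaría con aumentar ExplorationRate para que el NPC pruebe nuevas formas de resolver la tarea, tal y como se describe en el texto.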
6 Conclusiones y trabajo futuro Con esta extensión de los BTs permitimos a los diseñadores un mayor grado de autonomı́a, pudiendo prescindir en muchas ocasiones del programador para implementar comportamientos a bajo nivel ası́ como ayudar al diseñador a que pueda entrenar al NPC de forma interactiva, lo que le facilita la tarea de crear comportamientos, incluso si no dispone de conocimientos de programación. Esto le permite desarrollar comportamientos a bajo nivel que según nuestra metodologı́a, deberı́an ser realizados por diseñadores técnicos o programadores. Hay que tener en cuenta que el aprendizaje automático requiere de una implicación activa del diseñador en el mismo por lo que puede resultar tedioso llevarla a cabo. 154 Como trabajo futuro queremos poner en práctica estas técnicas y validarlas experimentalmente para ver cómo funcionan y qué aceptación tienen entre los diseñadores y si realmente simplifican la creación de comportamientos complejos, ası́ como testear la capacidad de aprendizaje del NPC con la técnica de aprendizaje diseñada y poder aprender comportamientos de arriba a abajo, es decir ir aprendiendo comportamientos de bajo nivel y posteriormente, usando los nodos Query ya aprendidos, aprender comportamientos de más alto nivel hasta permitir aprender el comportamiento completo del NPC. Además queremos estudiar qué funciones de similitud obtienen mejores resultados en la recuperación ası́ como qué algoritmos de selección de atributos podemos utilizar para simplificar la base de casos, ası́ como estudiar la posibilidad de simular la ejecución de los comportamientos en la fase de selección de comportamientos candidatos a ser ejecutados, de forma especulativa para que sólo se tenga en cuenta los preresquisitos de las acciones que realmente se ejecutarán en el contexto del Query Node que las ejecuta. References 1. Rouse III, R.: Game design: Theory and practice. Jones & Bartlett Learning (2010) 2. Champandard, A.J.: 3.4. In: Getting Started with Decision Making and Control Systems. Volume 4 of AI Game Programming Wisdom. Course Technology (2008) 257–264 3. Rabin, S.: 3.4. In: Implementing a State Machine Language. Volume 1 of AI Game Programming Wisdom. Cengage Learning (2002) 314–320 4. Champandard, A.J.: Behavior trees for next-gen ai. In: Game Developers Conference. (2005) 5. Aamodt, A., Plaza, E.: Case-based reasoning: Foundational issues, methodological variations, and system approaches. AI communications 7(1) (1994) 39–59 6. Flórez-Puga, Gómez-Martı́n, Dı́az-Agudo, González-Calero: Query enabled behaviour trees. IEEE Transactions on Computational Intelligence And AI In Games 1(4) (2009) 298–308 7. Palma, R., Sánchez-Ruiz, A.A., Gómez-Martı́n, M.A., Gómez-Martı́n, P.P., González-Calero, P.A.: Combining expert knowledge and learning from demonstration in real-time strategy games. In: Case-Based Reasoning Research and Development. Springer (2011) 181–195 8. Colledanchise, M., Parasuraman, R., Ögren, P.: Learning of behavior trees for autonomous agents. arXiv preprint arXiv:1504.05811 (2015) 9. Dey, R., Child, C.: Ql-bt: Enhancing behaviour tree design and implementation with q-learning. In: Computational Intelligence in Games (CIG), 2013 IEEE Conference on, IEEE (2013) 1–8 10. Sagredo-Olivenza, I., Flórez-Puga, G., Gómez-Martı́n, M.A., González-Calero, P.A.: UHotDraw: a GUI framework to simplify draw application development in Unity 3D. In: Primer Simposio Español de Entretenimiento Digital. (2013) 11. 
Hudson’s, K.: The ai of bioshock 2: Methods for iteration and innovation. In: Game Developers Conference. (2010) 12. Dasarathy, B.V.: Nearest neighbor (NN) norms: NN pattern classification techniques. (1991) 155 Evaluación de una historia interactiva: una apro-‐ ximación basada en emociones López-Arcos, J.R., Gutiérrez Vela, F.L., Padilla-Zea, N., Paderewski, P., Fuentes García, N.M. University of Granada, Spain {jrlarco, fgutierr, npadilla, patricia, nmfuentes}@ugr.es Abstract Las emociones juegan un papel fundamental en la transmisión de historias, ya que permiten crear un vínculo entre el receptor de la historia y los personajes involucrados. A su vez, ésta favorece la inmersión en un mundo virtual, como son los videojuegos y es un componente motivacional que favorece la enseñanza por medio de éstos. El diseño de las historias contenidas en videojuegos presenta una complejidad alta debido a su carácter interactivo. En el presente trabajo se propone una serie de técnicas para diseñar y evaluar estas historias interactivas e incluirlas en un videojuego, prestando especial atención a las emociones contenidas en ellas como elemento evaluable. Keywords. Interactive Storytelling, Educational Video Games, Evaluation, Emotions, Interaction 1 Introducción En anteriores trabajos hemos presentado [1][2] una aproximación para estructurar y analizar la historia que se diseña e incluye en los videojuegos educativos (VJE) como elemento que da sentido narrativo al proceso de juego. En ella, mediante un modelo conceptual y una serie de herramientas gráficas (basadas en el concepto de storyboard), se proporcionan mecanismos para diseñar la historia interactiva que va a formar parte del VJE. Esa historia, gracias a su flexibilidad y adaptabilidad, permite dar una “justificación narrativa” a las diferentes tareas que el jugador realiza durante el proceso de juego. El presente trabajo pretende profundizar en el proceso de diseño y de análisis de la efectividad de la narrativa y centrar su atención en la evaluación del efecto emocional de las historias interactivas en el usuario. Podemos referirnos a este usuario como jugador, en cuanto a que supera los retos proporcionados por el videojuego y “vive” la historia que le propone, o como invitado, como participante en una experiencia de juego diseñada por otra persona. El resto del presente trabajo se organiza del siguiente modo: el apartado 2 expone un breve estudio sobre la relación entre la narrativa, las emociones y la experiencia de usuario; a continuación, nuestra propuesta comienza en el apartado 3, que describe la 156 evaluación de la narrativa dentro del proceso de creación del videojuego definiéndola como dos grandes pasos: en primer lugar, las tareas referentes a su diseño y, en segundo lugar, las tareas orientadas a la evaluación de la narrativa; el apartado 4 profundiza en la evaluación de la narrativa desde el punto de vista del nivel de intensidad emocional; finalmente, en el apartado 5 se exponen las conclusiones y el trabajo futuro. 2 Las emociones, la narrativa y la experiencia de usuario El presente trabajo parte de la idea de que la relación entre la historia y el jugador esta basada en una experiencia de entretenimiento y que gran parte de esta experiencia, en el caso de los videojuegos, se genera por medio de emociones. Esta relación está basada, sobre todo, en el interés que la experiencia despierta en el invitado a lo largo de la duración de ésta. 
Schell [3] presenta los siguientes tres componentes del interés:
• Interés inherente: el que despierta por sí solo un hecho concreto en una persona o colectivo.
• Belleza en la presentación: los recursos estéticos, sonoros, etc. ofrecidos como empaque de la experiencia y que condicionan el grado de interés del invitado.
• Proximidad psicológica: este concepto está relacionado con la inmersión del jugador. Se basa en una idea muy simple: los hechos que nos ocurren a nosotros mismos nos interesan más. Cuanto más cercana sea la persona afectada por un evento, más interesante nos parece dicho evento.
Las habilidades de empatía e imaginación del invitado, unidas al conocimiento que éste tenga de los personajes y el mundo representados en la experiencia, juegan un papel esencial al permitir la aproximación de los personajes a su entorno psicológico. Por otro lado, el entretenimiento interactivo ofrece una ventaja notable en relación a la proximidad psicológica: el invitado puede ser el personaje principal (o alcanzar un nivel de empatía muy alto con él). La proximidad psicológica hace que las emociones que genere la historia sean tomadas por el invitado como algo propio. La Figura 1 [4] muestra la evolución que han experimentado las experiencias de entretenimiento de acuerdo al papel que juega el invitado en éstas. Desde la literatura hasta los videojuegos, pasando por el cine, se ha producido un acercamiento que hace que el invitado forme, cada vez más, parte de la historia. En la literatura, el autor describe el personaje al lector; en el cine, el espectador observa directamente al personaje y, en el videojuego, el jugador es el personaje y controla sus acciones. Esto se puede interpretar como un aumento de la proximidad psicológica de las experiencias. La interactividad, sin embargo, viene acompañada de un enorme reto: el narrador pierde el control sobre la secuencia de eventos, ya que el control se le pasa al jugador (en el caso de los juegos). Schell [3] propone una posible solución al problema de la pérdida del control del narrador sobre la secuencia de eventos. Para ello, emplea una serie de técnicas que él llama "técnicas de control indirecto". Estas técnicas se usan para orientar al jugador a realizar ciertas acciones sin que su sensación de libertad se vea amenazada gravemente. Figura 1. Comparación entre la literatura, el cine y el videojuego [4]. El autor propone las siguientes técnicas a tener en cuenta en el diseño de historias interactivas:
• Entorno: influir mediante estímulos visuales. Situar un elemento llamativo en la escena invita al jugador a acercarse a él; sin embargo, también es posible disponer los elementos del escenario de forma que el invitado se vea llamado a investigar.
• Interfaz: influir en el jugador usando un mecanismo de interacción concreto. Por ejemplo, un controlador con forma de volante hace que el invitado comprenda que la experiencia se basa en conducir, y que no intente hacer otra cosa.
• Avatar: influir en el jugador por medio de la empatía con el personaje. Si el avatar siente miedo u otras emociones, éstas se transmiten al invitado que lo está controlando y éste actuará en consecuencia. De igual modo, si el avatar presenta heridas o cansancio, el jugador intentará poner remedio a esa situación.
• Historia: influir mediante recursos narrativos. Disponer la historia mediante objetivos que el jugador tenga que completar, provocando su interés por avanzar en la narrativa.
Las emociones son una componente fundamental en el ser humano, ayudándole a dar significado, valor y riqueza a las experiencias que vive [5]. En la actualidad, es reconocido que las emociones juegan un rol crítico e imprescindible en todas las relaciones con las tecnologías, desde el uso en videojuegos hasta la navegación por sitios web, pasando por el uso de dispositivos móviles, entre muchos otros [6] [7]. En los últimos años las emociones han sido reconocidas como un aspecto fundamental durante el análisis y la evaluación de la experiencia de los usuarios con productos, sistemas y servicios interactivos así como se ha considerado un elemento importante a la hora de medir su calidad. A nivel de investigación, existen numerosos trabajos [8] donde se analiza qué aspectos de las emociones son los importantes a la hora de realizar una evaluación de la usabilidad y de la experiencia del usuario y qué técnicas son las más adecuadas para realizar dicha evaluación. Los métodos de evaluación de las emociones van desde complejos cuestionarios hasta métodos más o menos invasivos. En los primeros, de forma subjetiva, se pregunta a los usuarios cómo se sienten antes y después del uso del sistema, complemen- 158 tándolo con grabaciones de las reacciones de los mismos. En los segundos, usando dispositivos específicos, se miden las reacciones fisiológicas y, en base a los resultados medidos, se infieren las emociones que experimentan los usuarios. Las técnicas de evaluación emocional son una forma importante de recopilar y medir información valiosa sobre aspectos cualitativos y cuantitativos de la experiencia de un usuario. En el caso especifico de los videojuegos, las emociones también son un aspecto diferenciador de los juegos, no hay que olvidar que, desde su concepción, están diseñados para generar emociones y de esta forma atrapar al jugador en el mundo que nos presenta el juego. Uno de los factores que otorga importancia a las emociones es que éstas pueden actuar como motivadores esenciales. Es decir, emociones como el afecto positivo o el placer pueden desarrollar una conducta motivacional dirigida hacia unos objetivos específicos. Estos objetivos pueden ser la propia diversión en el caso de un videojuego o el aprendizaje en el caso de un videojuego educativo. A nivel educativo, teorías como el condicionamiento operante [9] afirman que la experiencia emocional placentera puede ser considerada como un forma importante de refuerzo positivo en los procesos de aprendizaje, incrementando la probabilidad de que se repita la conducta que dio lugar a esa experiencia emocional. La narrativa por su parte, y como profundizamos en el siguiente apartado, juega un papel muy importante en el diseño de la experiencia del jugador en un videojuego. Además, en el caso concreto de los videojuegos educativos, proporciona una justificación motivacional a las actividades pedagógicas realizadas por el usuario. Para diseñar una experiencia narrativa, es imprescindible definir las emociones contenidas en ésta. Las emociones en una historia son imprescindibles para crear un vínculo entre el receptor de la historia y los personajes y situaciones descritos en ésta. Así lo demuestran estudios como el presentado en [10], donde los autores definen métodos de interacción con la narrativa basados en las emociones. 
3 Análisis de la narrativa dentro del proceso de diseño 3.1 Diseño de la narrativa del videojuego Analizando los procesos de desarrollo de videojuegos actuales, se llega a la conclusión de que los autores de la historia del videojuego (guionistas) y los diseñadores de las mecánicas de juego deben definir los elementos de la historia interactiva de tal forma que la experiencia de juego sea satisfactoria a nivel narrativo, ya que esto va a influir de forma determinante en el éxito del juego. De este modo, el jugador recibirá una historia coherente y atractiva aunque la esté transformando activamente por medio de sus acciones de forma interactiva. Para llevar a cabo este complejo diseño y en base a nuestra experiencia en el desarrollo de videojuegos educativos, proponemos las siguientes tareas:
1. Definir las características de la experiencia que se quiere generar.
2. Elegir el tipo de historia adecuado.
3. Escribir un guión tradicional (Evolución Narrativa) que será usado como base.
4. Estructurar la historia.
5. Realizar un guión esquemático de la historia.
6. Realizar un guión gráfico interactivo (o storyboard interactivo).
7. Evaluar la estructura y la intensidad de la historia diseñada, sus elementos y su efecto en la experiencia de juego. Para ello, es necesario tener en cuenta sus posibles instancias. Esta tarea, por su complejidad e importancia, se define de forma más detallada en el apartado 3.2.
1) Definir las características de la experiencia Esta tarea está basada en realizar una descripción formal del efecto que tendrá el juego sobre el jugador. Definir de forma clara el efecto y las emociones que el diseñador tiene intención de producir en el jugador permite basar el diseño del videojuego en esta experiencia y analizar si esta idea inicial se está perdiendo en el proceso o si el producto se mantiene fiel a ella. No hay que olvidar que todo juego se diseña con el claro objetivo de divertir y que muchos de los juegos fracasan por no alcanzar los niveles adecuados de diversión, sobre todo cuando hablamos de videojuegos educativos. La diversión puede medirse y diseñarse mediante las propiedades que definen la jugabilidad [11] del producto. 2) Elegir el tipo de historia correcta En [12] se ha detallado una categorización de los diferentes tipos de videojuegos según los niveles de intensidad con los que se puede integrar una historia con el juego. Éstos van desde videojuegos en los que la historia no es relevante más que para encuadrar la acción, como Tetris o Space Invaders, hasta videojuegos donde las mecánicas están al servicio de la historia, como Heavy Rain. Estas historias pueden tener un alto contenido emocional, como es el caso de The Last of Us. El uso de esta clasificación sirve de ayuda para tomar decisiones durante el diseño del juego. En el caso de los videojuegos educativos, en los que se centra gran parte del trabajo de los autores del presente documento, se debe responder a la siguiente pregunta: ¿Qué tipo de juego es más apropiado para el contenido educativo que se desea transmitir? Otras decisiones a tomar deben responder a cuestiones como el género del videojuego, que condicionará la forma de construir la historia más apropiada [12]. Dichos géneros se definen por la mecánica de juego dominante (plataformas, rol, shooter, aventura gráfica…). Por su parte, la historia también tendrá un género literario propio (terror, aventura, comedia, drama…) que es también un elemento a tener en cuenta en el diseño del videojuego.
3) Escribir un guión tradicional El guión tradicional describe una historia que, independientemente de su nivel de complejidad, debe ajustarse a los cánones que la tradición narrativa nos ofrece. Los trabajos de Vogler [14] sobre la metáfora de el viaje del héroe y los arquetipos que pueden adoptar los personajes son un buen marco de trabajo para todo escritor que actué como guionista de videojuegos. En el caso de los videojuegos, la particularidad más notable que presentan las historias es su interactividad, y es por ello que se deben tener en cuenta trabajos más específicos como los presentados en [15]. 4) Estructurar la historia 160 Para poder gestionar la complejidad de la historia de forma más cómoda y, por tanto, poder diseñarla y analizarla, de forma adecuada, es necesario dotarla de una estructura formal. Una importante ventaja de esta estructuración es la posibilidad de alcanzar niveles más altos de interactividad pudiendo controlar los problemas de inconsistencia narrativa que puedan surgir. Para realizar dicha estructuración, en este método proponemos los siguientes elementos de historia descritos en [16] y en [2] mediante un modelo conceptual: 1) eventos, escenas, secuencias, capítulos y líneas argumentales, ordenados jerárquicamente; 2) personajes; 3) escenarios y 4) objetos. Teniendo en cuenta los elementos descritos anteriormente, el diseñador construye un grafo que describe de forma jerárquica y visual cómo las diferentes piezas de la historia influyen unas en otras y el orden en el que deben suceder. 5) Realizar un guión esquemático de la historia El guión esquemático que se propone en el presente trabajo es una herramienta de ayuda al diseñador del VJE. Este guión, al que llamaremos “storyboard esquemático o técnico”, se representa mediante una tabla que permite relacionar de forma sencilla el contenido narrativo con el lúdico y el educativo. La tabla diseñada es una buena herramienta para incluir en el Documento de Diseño del Juego (GDD) y contiene los siguientes elementos: 1) Identificador de la escena; 2) Contenido Educativo; 3) Contenido Lúdico; 4) Evolución Narrativa; 5) emociones generadas 6) Ilustración de la escena; 7) Elementos interactivos o destacables; 8) Escenas anteriores y siguientes; 9) Precondiciones y postcondiciones que debe cumplir la escena. El guion esquemático es una herramienta muy importante en actividades tan importantes como el balanceo entre la parte educativa y lúdica del juego o en la asignación de emociones a los elementos narrativos del juego. 6) Realizar un guión gráfico interactivo El uso de prototipos es una práctica aconsejada en el proceso de desarrollo de cualquier producto software. Para realizar el storyboard interactivo, se puede usar la técnica que más encaje con el producto que se está desarrollando y con las necesidades de los diseñadores. El modo más sencillo pueden ser representaciones en papel de cada escena (Técnica del mago de OZ). Otra técnica aplicable a historias más complejas es realizar una simulación informática del videojuego en la que se narre la historia mediante herramientas de edición de historias interactivas como Twine [17]. No hay que olvidar que el número de instancias de la historia puede ser muy alto y el uso de simuladores o prototipos facilita en gran medida los análisis de la narrativa. 3.2 Evaluación de la historia Una historia interactiva puede adoptar diferentes formas dependiendo de cómo el jugador se comporte en ella. 
La estructura diseñada en la tarea 4 del apartado 3.1 debería definir la manera en que se generan todas esas posibles formas de la historia, a las que llamamos instancias de historia [2]. Es necesario, por tanto, comprobar que cada una de esas instancias sea una historia coherente e interesante, y que proporcione una experiencia de juego óptima. Para poder analizar y evaluar esta forma de narrativa, proponemos las siguientes tareas, que podrán ser realizadas independientemente y de forma iterativa a lo largo del proceso de diseño:
1. Generación e identificación de las instancias interesantes para el estudio.
2. Análisis de la estructura narrativa de cada instancia de historia.
3. Análisis de la intensidad de cada instancia de historia.
4. Evaluación de los elementos de la historia y de la experiencia del jugador.
Generación e identificación de las instancias interesantes para el estudio Al estructurar la historia de acuerdo al modelo conceptual propuesto en [16], se genera un grafo que recoge todos los eventos que pueden ocurrir en la historia interactiva. Obtener las posibles instancias de esa historia de forma automática en base al grafo construido permite el análisis individual de cada una de ellas. Por tanto, es posible detectar aquellas instancias que destaquen por su alta o baja calidad. Análisis de la estructura narrativa de cada instancia de historia Con la estructura narrativa nos referimos a estructuras formales tratadas en teorías clásicas sobre narrativa. En [18] los autores proponen una serie de métricas y estadísticas para abstraer y organizar esta información dada una historia. Algunas de ellas son: 1) contabilizar la ocurrencia de escenas pertenecientes a cada etapa del Viaje del Héroe [14]; 2) calcular qué porcentaje del tiempo que ocupa la historia pertenece a cada uno de los actos de la estructura clásica narrativa de tres actos; 3) identificar eventos pertenecientes al núcleo de la historia y los eventos satélite y contabilizarlos; 4) contabilizar el número de personajes que correspondan a cada uno de los arquetipos definidos por Vogler [14] y 5) calcular en qué porcentaje de cada acto de la historia están presentes cada uno de esos personajes arquetipo. En nuestra propuesta, los diseñadores del videojuego pueden obtener estas métricas y estadísticas automáticamente y comparar las instancias de la historia entre sí y con otras historias modelo. De este modo, se pueden detectar fácilmente posibles deficiencias en la historia interactiva. Análisis de la intensidad de cada instancia de historia Para poder medir la calidad de una experiencia de entretenimiento, es necesario encontrar un elemento medible, que varíe a lo largo de la duración de la experiencia y que, por supuesto, pueda relacionarse con la calidad de la experiencia percibida por el jugador. Un parámetro que podemos usar es el interés despertado en el jugador en cada momento de la experiencia. Gracias a que podemos medirlo, es posible generar "curvas de interés" que nos den una idea de la intensidad de la experiencia que está percibiendo el jugador. La Figura 2 muestra una típica curva de interés en una experiencia de entretenimiento clásica [3]. Esta curva representa una buena experiencia de entretenimiento, muy usada como objetivo en la producción cinematográfica. En ella se observa que el invitado comienza con cierto grado de interés propiciado por sus propias expectativas del producto (A).
En primer lugar, encuentra un gancho (B), un evento que le proporciona un alto grado de interés. A partir de ahí se suceden una serie de eventos que proporcionan altibajos en el grado de interés (C-F) para culminar en el clímax (G). La experiencia termina con un desenlace de un grado de interés idealmente algo mayor a las expectativas originales (H). Estas curvas de interés son muy útiles para crear una experiencia de entretenimiento. Pueden compararse con otras curvas para comprobar qué experiencia es más efec- 162 tiva. Además, puede compararse la curva de una experiencia real con la curva prevista por el diseñador de la experiencia, el cual puede asignar un grado de interés o intensidad a cada escena durante el diseño. Figura 2. Curva de interés típica. Evaluación de los elementos de la historia y de la experiencia del jugador Por último, proponemos que todos los elementos de la historia pueden ser evaluados por separado y dependiendo de la necesidad del proceso de diseño. A modo de ejemplo, en [19] presentamos una serie de experiencias para evaluar y así mejorar el diseño de los personajes de un videojuego educativo. Proponemos que estas experiencias en forma de pre-test, test y post-test pueden extenderse a la evaluación de cualquiera de los elementos de la historia: partes concretas de la historia, intensidad emocional, personajes, escenarios, objetos, diálogos o cualquier otro. 4 Medida de la intensidad basada en emociones En el apartado 3.2 hablamos de la importancia de realizar un análisis de la intensidad de la experiencia de usuario en cada instancia de la historia. Para ello, proponemos obtener curvas de intensidad o interés a partir de experiencias y mediciones de los diferentes atributos que presenta el interés de un espectador hacia una historia. En la presente sección, proponemos enriquecer ese análisis mediante la medición de las emociones concretas que se generan en la experiencia. Aunque existen innumerables aspectos y variantes a las reacciones emocionales que puede mostrar una persona, proponemos, en una primera fase y para reducir la complejidad del análisis, regirnos por la lista de las seis emociones universales humanas elaborada por Ekman [20]: alegría, aversión, ira, miedo, sorpresa, tristeza. Dependiendo del tipo de historia y de los intereses de los autores, será interesante centrarse en medir algunas de esas seis emociones. Proponemos el análisis de curvas de intensidad emocional específicas de las emociones seleccionadas. Sin embargo, el problema reside en cómo es posible medir emociones. Para realizar una estimación del impacto emocional de la historia, proponemos dos enfoques: a) realizar una estimación por parte de los autores, o b) aplicar técnicas de medición durante experiencias reales. Respecto al primer enfoque, los autores pueden estimar en qué grado aparece cada emoción en cada parte de la historia que están construyendo. Al realizar la estructura- 163 ción de la historia a la que se hace referencia en el apartado 3.1 y que se explica en [1], los diseñadores pueden asignar esos grados de emoción a cada escena o evento de la estructura de la historia. De este modo, de cada instancia de historia potencialmente generable se podrán obtener estos datos, analizarlos y compararlos para decidir si la historia ofrece instancias buenas o con el carisma emocional deseado (Figuras 3 y 4). Figura 3. Valor de la intensidad de las seis diferentes emociones en siete eventos de la historia. 
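A modo de ilustración (nombres y estructura supuestos, no una implementación de los autores), estas estimaciones pueden representarse como un vector de intensidades por evento y agregarse en curvas que permitan comparar instancias de la historia:

using System;
using System.Collections.Generic;
using System.Linq;

// Las seis emociones universales de Ekman [20] usadas como base del análisis.
public enum Emotion { Alegria, Aversion, Ira, Miedo, Sorpresa, Tristeza }

// Intensidades (por ejemplo de 0 a 10) estimadas por los autores para un evento de la historia.
public class EventEmotions
{
    public string EventId;
    public Dictionary<Emotion, double> Intensity = new Dictionary<Emotion, double>();
}

public static class EmotionCurves
{
    // Curva de intensidad de una emoción concreta a lo largo de una instancia de la historia.
    public static double[] Curve(IList<EventEmotions> instance, Emotion emotion)
    {
        return instance.Select(e => e.Intensity.TryGetValue(emotion, out var v) ? v : 0.0).ToArray();
    }

    // Comparación simple de dos instancias: diferencia media de sus curvas para una emoción.
    public static double MeanDifference(IList<EventEmotions> a, IList<EventEmotions> b, Emotion emotion)
    {
        double[] ca = Curve(a, emotion);
        double[] cb = Curve(b, emotion);
        int n = Math.Min(ca.Length, cb.Length);
        double sum = 0;
        for (int i = 0; i < n; i++) sum += Math.Abs(ca[i] - cb[i]);
        return n == 0 ? 0 : sum / n;
    }
}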
Sin embargo, cada jugador reacciona completamente diferente a una experiencia de juego. Además, en una historia interactiva, los eventos cambiarán de orden o el jugador puede estirar o acortar deliberadamente el tiempo de la experiencia, perdiéndose en gran medida el efecto emocional esperado por los autores. Para poder estimar el impacto emocional real de una historia interactiva, es necesario realizar experiencias con jugadores reales y observar los resultados. Figura 4. Ejemplo de distribución de la intensidad emocional en una escena concreta de una historia. No obstante, medir las emociones de un jugador puede no ser una tarea sencilla. Para este enfoque, proponemos dos técnicas diferentes pero no excluyentes: a) solicitar a los jugadores de la experiencia que rellenen un cuestionario sobre el impacto emocional, y b) monitorizar las reacciones de los jugadores para extraer datos sobre dicho impacto emocional. 164 Respecto al uso de cuestionarios, proponemos hacerlo al finalizar la experiencia de juego. El cuestionario a utilizar estará centrado en los eventos más destacables de la experiencia o en los que sean objeto de estudio. Además, debe ser un cuestionario adaptado a la experiencia particular y a los jugadores involucrados. A modo de ejemplo, en experiencias descritas en nuestro trabajo previo [21], al ser los sujetos del estudio niños de edad muy temprana, el cuestionario se realizó de forma oral, y se usaron cartulinas con iconos que representaban emociones fácilmente identificables por ellos (Figura 5). A diferencia de los niños, los participantes adultos pueden identificar distintos niveles para una misma emoción. Para ellos, proponemos usar un cuestionario impreso en el que para cada escena o evento de la historia interactiva que sea objeto de estudio, tengan que asignar un valor numérico a la intensidad de cada una de las emociones. Figura 5. Resultados de una experiencia de análisis de emociones con niños. De forma adicional, existe la posibilidad de monitorizar al jugador e interpretar sus emociones de forma poco intrusiva. En [22], los autores proponen un sistema con interacción emocional. En él, recogen las reacciones emocionales mediante el uso de una cámara y un software de detección facial que capta información sobre los signos de las seis emociones propuestas más la neutral. Para ello, identifica puntos clave en el rostro del usuario y las variaciones en la distancia entre ellos [23]. Este ejemplo es útil para ayudar a detectar las emociones que el jugador experimenta y comparar estos datos con los expresados por el propio jugador en el cuestionario; facilitando así la tarea de evaluación de las mismas. Estos datos sobre el impacto emocional de la experiencia de juego permiten estudiar este aspecto de la historia interactiva. Al integrar este estudio dentro del proceso de desarrollo del videojuego se pueden usar esos datos para mejorar el producto y diseñar así una experiencia interactiva basada en una historia compleja y satisfactoria para el jugador. 165 5 Conclusiones En el presente trabajo se propone un método para integrar el diseño del componente narrativo del videojuego dentro de su proceso global de diseño. Dicho método se compone de una serie de tareas que incluyen la evaluación de la estructura de la historia y de su intensidad emocional. En el aspecto emocional, se proponen una serie de técnicas para poder contabilizar estas emociones y así obtener valores en base a unas métricas. 
Actualmente, estamos trabajando en experiencias para poner en práctica el uso del método. El objetivo es integrarlo en una metodología y, mediante el uso de ésta, diseñar una herramienta automática para la construcción y la evaluación de la narrativa de un videojuego educativo. 6 Agradecimientos Este trabajo está financiado por el Ministerio de Ciencia e Innovación, España, como parte del Proyecto VIDECO (TIN2011-26928), Red de Trabajo Iberoamericana de apoyo a la enseñanza y aprendizaje de competencias profesionales mediante entornos colaborativos y ubicuos (CYTED - 513RT0481), el Proyecto de Excelencia P11TIC-7486 financiado por la Junta de Andalucía y el proyecto V17-2015 del programa Micropoyectos 2015 del CEI BioTIC Granada. 7 References [1] Padilla-Zea, N., Gutiérrez, F. L., López-Arcos, J. R., Abad-Arranz, A., & Paderewski, P. (2014). Modeling storytelling to be used in educational video games. Computers in Human Behavior, 31, 461-474. [2] López-Arcos, José Rafael; Gutiérrez Vela, Francisco Luis; Padilla-Zea, Natalia; Paderewski, Patricia (2014) A Method to Analyze Efficiency of the Story as a Motivational Element in Video Games. Proceedings of the European Conference on Games Based Learning, Vol. 2, p705 [3] Schell, J. (2005). Understanding entertainment: story and gameplay are one. Computers in Entertainment (CIE), 3(1), 6-6. [4] Lee, Terence, October 24, 2013. Disponible en la web: http://hitboxteam.com/designing-game-narrative [5] A. Jacko, Julie; Sears, The Human-Computer Interaction Handbook: Fundamentals, Evolving Technologies and Emerging Application, Second Edi., vol. 20126252. CRC Press, 2008. [6] S. Brave and C. Nass, “Emotion in Human-Computer Interaction.,” in The Human-Computer Interaction Handbook. Fundamentals, Evolving Technologies and Emerging Applications., Second., vol. 29, no. 2, A. Sears and J. A. Jacko, Eds. Standford University: Lawrence Erlbaum Associates, 2008, p. 1386. 166 [7] D. A. Norman, Emotional Design: Why We Love (or Hate) Everyday Things, vol. 2006, no. 2. Basic Books, 2004, p. 272. [8] M. Hassenzahl and N. Tractinsky, “User experience - a research agenda,” Behav. Inf. Technol., vol. 25, no. 2, pp. 91–97, Mar. 2006. [9] Tarpy, Roger M. (2003). Aprendizaje: teoría e investigación contemporáneas. McGraw Hill. [10] Fred Charles, David Pizzi, Marc Cavazza, Thurid Vogt, Elisabeth André. 2009. EmoEmma: Emotional Speech Input for Interactive Storytelling (Demo Paper) The Eighth International Conference on Autonomous Agents and Multiagent Systems (AAMAS-Demos), Budapest, Hungary, May 2009 [11] González Sánchez, J. L. (2010). Jugabilidad. Caracterización de la experiencia del jugador en videojuegos. [12] Wendy Despain (Ed.). (2009). Writing for video game genres: From FPS to RPG. AK Peters, Ltd.. [13] Belinkie, M. “The Video Game Plot Scale”, [online]. August 30th, 2011. Available on the Web: http://www.overthinkingit.com/2011/08/30/videogame-plot-scale/ [14] Vogler, C. (1998). The Writer's journey. Michael Wiese Productions. [15] Lebowitz, J., & Klug, C. (2011). Interactive storytelling for video games: A player-centered approach to creating memorable characters and stories. Taylor & Francis. [16] Padilla-Zea, N., Gutiérrez, F. L., López-Arcos, J. R., Abad-Arranz, A., & Paderewski, P. (2014). Modeling storytelling to be used in educational video games. Computers in Human Behavior, 31, 461-474. [17] http://twinery.org [18] Ip, B. (2011). Narrative structures in computer and video games: part 2: emotions, structures, and archetypes. 
Games and Culture, 6(3), 203-244 [19] López-Arcos, J. R., Padilla-Zea, N., Paderewski, P., Gutiérrez, F. L., & Abad-Arranz, A. (2014, June). Designing stories for educational video games: A Player-Centered approach. In Proceedings of the 2014 Workshop on Interaction Design in Educational Environments (p. 33). ACM. [20] Ekman, P. & Friesen, W. V. (1969). The repertoire of nonverbal behavior: Categories, origins, usage, and encoding. Semiotica, 1, 49–98. [21] Padilla-Zea, N., López-Arcos, J. R., Sánchez, J. L. G., Vela, F. L. G., & Abad-Arranz, A. (2013). A Method to Evaluate Emotions in Educational Video Games for Children. Journal of Universal Computer Science, 19(8), 1066-1085. [22] Cerezo, E., Baldassarri, S., & Seron, F. (2007). Interactive agents for multimodal emotional user interaction. In IADIS Multi Conferences on Computer Science and Information Systems (pp. 35-42). [23] Cerezo, E., & Hupont, I. (2006). Emotional facial expression classification for multimodal user interfaces. In Articulated Motion and Deformable Objects (pp. 405-413). Springer Berlin Heidelberg. 167 Automatic Gameplay Testing for Message Passing Architectures Jennifer Hernández Bécares, Luis Costero Valero and Pedro Pablo Gómez Martı́n Facultad de Informática, Universidad Complutense de Madrid. 28040 Madrid, Spain {jennhern,lcostero}@ucm.es pedrop@fdi.ucm.es Abstract. Videogames are highly technical software artifacts composed of a big amount of modules with complex relationships. Being interactive software, videogames are hard to test and QA becomes a nightmare. Even worst, correctness not only depends on software because levels must also fulfill the main goal: provide entertainment. This paper presents a way for automatic gameplay testing, and provides some insights into source code changes requirements and benefits obtained. Keywords: gameplay testing, testing, automatisation 1 Introduction Since their first appearance in the 1970s, videogames complexity has been continuously increasing. They are bigger and bigger, with more and more levels, and they tend to be non-deterministic, like Massively Multiplayer Online games where emergent situations arise due to players’ interactions. As any other software, videogames must be tested before their release date, in order to detect and prevent errors. Unfortunately, videogames suffer specific peculiarities that make classic testing tools hardly useful. For example, the final result depends on variable (nearly erratic) factors such as graphics hardware performance, timing or core/CPU availability. Even worst, correctness measure is complex because it should take into account graphic and sound quality, or AI reactivity and accuracy, features that cannot be easily compared. On top of that, videogames are not just software. For example, it is not enough to test that physics is still working after a code change, but also that players can end the game even if a level designer has moved a power up. Videogames quality assurance becomes nearly an art, which must be manually carried out by skilled staff. Unfortunately, this manual testing does not scale up when videogames complexity grows and some kind of automatisation is needed. This paper proposes a way for creating automatic gameplay tests in order to check that changes in both the source code and levels do not affect the global gameplay. Next section reviews the existing test techniques when developing 168 2 J. Hernández Bécares, L. Costero Valero, P. P. Gómez Martı́n software. 
Section 3 describes the component-based architecture that has become the standard for videogames in the last decade and is used in section 4 for creating logs of game sessions that are replayed afterwards for testing. Section 5 puts into practise all these ideas in a small videogame, and checks that these tests are useful when levels are changed. The paper ends with some related work and conclusions. 2 Testing and Continuous Integration Testing is defined by the IEEE Computer Society [1] as the process of analysing a software item to detect the differences between existing and required conditions and to evaluate the features of the software item. In other words, testing is a formal technique used to check and prove whether a certain developed software meets its quality, functional and reliability requirements and specifications. There are many testing approaches, each one designed for checking different aspects of the software. For example, a test can be done with the purpose of checking whether the software can run in machines with different hardware (compatibility tests), or whether it is still behaving properly after a big change in the implementation (regression tests). Alternatively, expert users can test the software in an early phase of the development (alpha or beta tests) to report further errors. Unit testing is a particularly popular test type designed to test the functionality of specific sections of code, to ensure that they are working as expected. When certain software item or software feature fulfills the imposed requirements specified in the test plan, the associated unit test is passed. Pass or fail criteria are decision rules used to determine whether the software item or feature passes or fails a test. Passing a test not only leads to the correctness of the affected module, but it also provides remarkable benefits such as early detection of problems, easy refactoring of the code and simplicity of integration. Detecting problems and bugs early in the software development lifecycle translates in decreasing costs, while unit tests make possible to check individual parts of a program before blending all the modules into a bigger program. Unit tests should have certain attributes in order to be good and maintainable. Here we list some of them, which are further explained in [2, Chapter 3]: – Tests should help to improve quality. – Tests should be easy to run: they must be fully automated, self-checking and repeatable, and also independent from other tests. Tests should be run with almost no additional effort and designed in a way that they can be repeated multiple times with the exact same results. – Tests should be easy to write and maintain: test overlap must be reduced to a minimum. That way, if one test changes, the rest of them should not be affected. – Tests should require minimal maintenance as the system evolves around them: automated tests should make change easier, not more difficult to achieve. 169 Automatic Gameplay Testing for Message Passing Architectures 3 3 Game Architecture The implementation of a game engine has some technological requirements that are normally associated to a particular game genre. This means that we need software elements specifically designed for dealing with different characteristics of a game like the physics system, the audio system, the rendering engine or the user input system. 
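Before moving on, a minimal, hypothetical example of a unit test with the attributes listed in section 2 (automated, self-checking, repeatable and independent); the tested gameplay rule and its name are invented for the sketch:

```python
# Hypothetical example of an automated, self-checking, repeatable unit test.
# The damage formula below is invented for illustration purposes only.
import unittest

def compute_damage(base_damage: int, armor: int) -> int:
    """Toy gameplay rule: armor absorbs damage, which never goes below zero."""
    return max(base_damage - armor, 0)

class TestDamage(unittest.TestCase):
    def test_armor_reduces_damage(self):
        self.assertEqual(compute_damage(10, 3), 7)

    def test_damage_never_negative(self):
        self.assertEqual(compute_damage(2, 5), 0)

if __name__ == "__main__":
    unittest.main()
```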
These software pieces are used by interactive elements of the game like the avatar, non-player characters (NPCs) or any other element of the game logic with a certain behaviour. The interactive elements of the game are called “entities”, and they are organised inside game levels so that the user experience is entertaining and challenging but still doable. Entities are specified in external files that are processed during the runtime. This way, level designers can modify the playability of the game without involving programmers. One of the most important tasks of a game engine is the management of the entities that are part of the game experience. An entity is characterised by a set of features represented by several methods and attributes. Following an objectoriented paradigm, each feature is represented by one or more classes containing all the necessary methods and attributes. Connecting the different classes that define the features of an entity leads to different game engine architectures, which can be implemented in very different ways. The classical way of relating the different entity features is using inheritance. This way, an entity would be represented by a base class that will inherit from multiple classes, granting the entity different features. This method, in addition to the amount of time it requires to design a class structure and hierarchy, present certain additional problems described in [5]: – Difficult to understand: the wider a class hierarchy is, the harder it is to understand how it behaves. The reason is that it is also necessary to understand the behaviour of all its parent classes as well. – Difficult to maintain: a small change in a method’s behaviour from any class can ruin the behaviour of its derived classes. That is because that change can modify the behaviour in such a way that it violates the assumptions made by any of the base classes, which leads to the appearance of difficult to find bugs. – Difficult to modify: to avoid errors and bugs, modifying a class to add a method or change some other cannot be done without understanding all the class hierarchy. – Multiple inheritance: it can lead to problems because of the diamond inheritance, that is, an object that contains multiple copies of its base class’s members. – Bubble-up effect: when trying to add new functionalities to the entities, it can be inevitable to move a method from one class to some of its predecessors. The purpose is to share code with some of the unrelated classes, which makes the common class big and overloaded. The most usual way to solve these problems is to replace class inheritance by composition or aggregation associations. Thereby, an entity would be composed 170 4 J. Hernández Bécares, L. Costero Valero, P. P. Gómez Martı́n by a set of classes connected between them through a main class that contains the rest of them. These classes are called components, and they form entities and define their features. Creating entities from a set of components is called Component-Based Architecture. It solves all the problems mentioned before, but because it does not have a well-defined hierarchy, it is necessary to design a new mechanism of communication between components. The proposed mechanism is based on the use of a message hierarchy that contains useful information for the different components. Whenever a component wants to communicate with any other component, the first one generates a message and sends it to the entity that owns the receiver component. 
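A minimal sketch of this communication scheme might look as follows; the class and message names are hypothetical, and the engines discussed in the paper are not necessarily written this way:

```python
# Minimal sketch of a component-based entity with message passing: an entity
# owns components, and messages sent to the entity are broadcast to every
# component, which accepts or rejects them.
class Message:
    def __init__(self, msg_type: str, **payload):
        self.type = msg_type
        self.payload = payload

class Component:
    def __init__(self):
        self.entity = None  # set when the component is attached to an entity

    def accepts(self, message: Message) -> bool:
        return False  # by default a component ignores every message

    def process(self, message: Message) -> None:
        pass

class Entity:
    def __init__(self, name: str):
        self.name = name
        self.components = []

    def attach(self, component: Component) -> None:
        component.entity = self
        self.components.append(component)

    def emit(self, message: Message) -> None:
        # Broadcast the message; each component decides whether to act on it.
        for component in self.components:
            if component.accepts(message):
                component.process(message)

class PhysicsComponent(Component):
    def accepts(self, message: Message) -> bool:
        return message.type == "MOVE"

    def process(self, message: Message) -> None:
        print(f"{self.entity.name} moves to {message.payload['target']}")

player = Entity("Player")
player.attach(PhysicsComponent())
player.emit(Message("MOVE", target=(10, 0, 5)))
```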
The entity will emit a message to all of its components, and each of them will accept or reject the message and act according to the supplied information. This technique used for communicating is called Message Passing. Message passing is not only important for communicating between components, but also for sending messages from one entity to another. These messages between entities are essential when designing the playability of the game. The reason for using messages between entities is that they need to be aware of the events and changes in the game in order to respond accordingly. 4 Recording Games Sessions for Testing Traditional software tests are not enough to check all the features that a game has. Things such as playability and user experience need to be checked by beta testers, who are human users that play the game levels over and over again, doing something slightly different each time. Their purpose is to find bugs, glitches in the images or incorrect and unexpected behaviours. They are part of the Software Quality Assurance Testing Phase, which is an important part of the entire software development process. Using beta testers requires a lot of time and effort, and increases development costs. In fact, testing is so important and expensive that has become a business by itself, with companies earning million of dollars each year and successfully trading on the stock market1 . We propose an alternate form of testing, specifically designed for messagepassing architectures with a component-based engine design. The objective is to have “high-level unit tests”, based on the idea of reproducing actions to pass the test even when the level changes. To achieve that, we record game sessions and then execute the same actions again, adjusting them slightly if necessary so that the level can still be completed after being modified. Then, we check if the result of this new execution is the expected. Next sections give a detailed explanation on how to do this. 4.1 Using a New Component to Record Game Sessions In order to record game sessions easily, having a component-based design is a great advantage. Our solution is based on the creation of a new component called 1 http://www.lionbridge.com/lionbridge-reports-first-quarter-2015-results/, last visited June, 2015. 171 Automatic Gameplay Testing for Message Passing Architectures 5 CRecorder added to any of the existing entities in the game. After that, when an entity sends a message to all of its listeners, the CRecorder component will also receive the message and act accordingly. For example, a component aimed at recording actions in a game needs to be registered as a listener of the entity Player. Thus, whenever an action takes place in the game the component will be notified. Once the component has been created and the messages handled, saving all the actions to an external file is an easy task. Two different kind of files can be generated with this CRecorder component. One of them contains which keys have been pressed and the mouse movements, and the other one contains interesting events that happened during the gameplay, such as a switch being pressed by the player. 4.2 Raw Game Replay Today, it is not uncommon that keyboard and mouse input logs are gathered by beta testers executables so programmers can reproduce bugs when, for example, the game crashes during a game session. Our use of those logs is quite different: compatibility and regression tests. 
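As a rough sketch of the recording component of section 4.1 (the names and file layout are assumptions, not the actual implementation, and message objects like those in the previous sketch are assumed), such a listener could look as follows:

```python
# Hypothetical sketch of a CRecorder-style component: registered as a listener
# of an entity, it receives every message and appends it to one of two logs,
# a raw input log (keys/mouse) and a high-level gameplay event log.
import json
import time

class CRecorder:
    INPUT_TYPES = {"KEY_PRESSED", "KEY_RELEASED", "MOUSE_MOVED"}

    def __init__(self, input_log: str, event_log: str):
        self.input_log = input_log
        self.event_log = event_log
        self.start = time.time()

    def accepts(self, message) -> bool:
        return True  # the recorder listens to every message

    def process(self, message) -> None:
        entry = {
            "timestamp": int((time.time() - self.start) * 1000),
            "type": message.type,
            "info": message.payload,
        }
        # Keyboard/mouse input goes to one file, gameplay events to the other.
        target = self.input_log if message.type in self.INPUT_TYPES else self.event_log
        with open(target, "a", encoding="utf-8") as log:
            log.write(json.dumps(entry) + "\n")
```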
Using logs of successful plays, the game can be automatically run under different hardware configurations, or after some software changes, in order to repeat the beta testers' executions and check whether everything is still working properly. Loading recorded game sessions and replaying them contributes towards having repeatable and automated tests, which were some of the advisable attributes of unit tests mentioned in section 2. Our approach can go further by using the high-level logs to provide feedback when the execution fails. While replaying the log, the system not only knows what input event should be injected next, but also what should happen under the hood, thanks to the high-level events gathered during the recording phase. If, for example, after two minutes of an automatic game session an expected event about a collision is missing, the test can be stopped by reporting a problem in the physics engine.

4.3 Loading Recorded Game Sessions and Replicating the State

The previous raw game replay is not suitable when, for example, the map level has changed, because the blind input event injection will make the player wander incorrectly. For that reason, we introduce a new approach for replaying the game that starts with knowing which actions can happen in the game and trying to replicate the state to make the replay of these actions accurate. Some of the attributes of a game are the actions that can happen, the physical map or the state of the game. When the objective of recording a game is replaying it afterwards, it is necessary to think about what attributes need to be stored in the log. As an example, imagine an action in which a player picks up a weapon from the ground. To replay that, it is required to know which player (or entity) performs the action, what action is taking place and the associated entity (in this case, the weapon). Another important thing to take into account is the time frame for completing the action and the state of the game when it happens. A player cannot take a weapon if the weapon is not there or if he is not close enough to take it. Therefore, the state needs to be as close as possible to the original one when the time frame approaches, so that replicating the action is feasible. On the other hand, storing the position of the weapon is not required, as using that position could lead to replaying a wrong action if the map changes. That information is stored in a different file (a map file), which is loaded into the game and can be accessed at run time. With the purpose of modeling all these attributes, we use a powerful representation called timed Petri nets, which can be very helpful to replay recorded games.

4.4 Modeling the Game With Timed Petri Nets

Petri nets [3, 4] are modeling languages that can be described both graphically and mathematically. The graphical description is represented as a directed graph composed of nodes, bars or squares, arcs and tokens. The elements of a Petri net model are the following:

– Places: they are symbolized by nodes. Places are passive elements in the Petri net and they represent conditions.
– Transitions: bars or squares represent transitions, which are the actions or events that can cause a Petri net place to change. Thus, they are active elements. Transitions are enabled if there are enough tokens to consume when the transition fires.
– Tokens: each of the elements that can fire a transition is called a token.
A discrete number of tokens (represented by marks) can be contained in each place. Together with places, they model system states. Whenever a transition is fired, a token moves from one place to another.
– Arcs: places and transitions are connected by arcs. An arc can connect a place to a transition or the other way round, but arcs can never go between two places or between two transitions.

Figure 1 shows an example of how to model the actions of opening and closing a door with a Petri net. Figure 1a shows the initial state, in which a door is closed and a token is present. Figure 1b shows the transition that takes place when a player opens the door. Then, the new state becomes 1c, making the token move from S1 to S2. If the player then performs another action and closes the door (shown in transition 1d), the token returns to the initial state again, 1a. Notice that in figures 1b and 1d the tokens are in the middle of one of the arcs. Petri net models do not allow tokens to be out of places, but in this example they have been put there to highlight the movement of the token.

Fig. 1. Modeling the actions of opening and closing a door with a Petri net: (a) State 1: Closed door; (b) Transition 1: Player opens the door; (c) State 2: Opened door; (d) Transition 2: Player closes the door.

Classic Petri nets can be extended in order to introduce an associated time for each transition. When transitions last more than one time unit, they are called timed Petri nets. Introducing time in the model is essential for replaying games because actions are normally not immediate. For instance, if we want to replay an action such as "opening a door", first the player needs to be next to the door, and then perform the action of opening it. That means that the transition could be much longer than just a time unit, and other actions could be in progress at the same time. For that reason, modeling the game as a timed Petri net makes it easier than modeling it as a state machine.

After loading a recorded file into the game with the purpose of replaying it, actions need to be performed in a state as close as possible to the original state. Moreover, actions are normally ordered: a player cannot walk through a door if the door has not been opened before. In practice, some actions cannot be performed until some other actions have finished. If the game has several entities capable of firing transitions, they can be represented as different tokens in the Petri net model. A token firing a transition may cause some other tokens to change their place, which is exactly what happens in games. Actions may affect several entities, not just one, so using Petri nets to model that behaviour seems to be reasonable. When we detect that a player made an action in the first execution of the game and the corresponding Petri net transition is enabled (it is possible to complete the action), the appropriate messages have to be generated and injected into the application. There is no difference between the messages generated when a real person is playing and the simulated messages injected. Components accept the messages, process them and respond accordingly. For that reason, the resulting file should be exactly the same, and it is possible to see that the actions that are happening in the game have not changed.
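The following is a minimal, self-contained sketch of a timed Petri net of this kind, using the door example above; names are hypothetical and, as a simplification, the duration of a transition is only reported rather than enforced:

```python
# Minimal sketch of a timed Petri net: places hold tokens, transitions have a
# duration and fire only when every input place has a token available.
class TimedPetriNet:
    def __init__(self):
        self.marking = {}      # place -> number of tokens
        self.transitions = {}  # name -> (inputs, outputs, duration)

    def add_place(self, place, tokens=0):
        self.marking[place] = tokens

    def add_transition(self, name, inputs, outputs, duration=1):
        self.transitions[name] = (inputs, outputs, duration)

    def is_enabled(self, name):
        inputs, _, _ = self.transitions[name]
        return all(self.marking[p] >= 1 for p in inputs)

    def fire(self, name, now):
        """Fire a transition and return the time at which it completes."""
        if not self.is_enabled(name):
            raise ValueError(f"transition {name!r} is not enabled")
        inputs, outputs, duration = self.transitions[name]
        for p in inputs:
            self.marking[p] -= 1
        for p in outputs:
            self.marking[p] += 1
        return now + duration

# The door example of Figure 1: a token in "closed" enables "open_door".
net = TimedPetriNet()
net.add_place("closed", tokens=1)
net.add_place("opened", tokens=0)
net.add_transition("open_door", inputs=["closed"], outputs=["opened"], duration=5)
net.add_transition("close_door", inputs=["opened"], outputs=["closed"], duration=5)

t_done = net.fire("open_door", now=0)   # the action completes at t = 5
assert net.is_enabled("close_door") and not net.is_enabled("open_door")
```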
4.5 Replaying Game Sessions and Running Tests

There are two possible ways of replaying a recorded game session: replicating the exact movements of the user, or trying to reproduce specific actions using an artificial intelligence (AI). Replicating game sessions can be useful when we want to run compatibility tests (running tests using different hardware) or regression tests (introducing software improvements but not design changes). However, replicating game sessions consists of simulating the exact keys pressed by the user. These tests are very limited, since the slightest change to the map makes them unusable. Reproducing specific actions can solve this limitation. Saving detailed traces of the game sessions that we are recording gives us the chance to use that information to make intelligent tests using an AI. That way, we can still use the recorded game sessions to run tests even if the map changes.

Once replaying games is possible, it can be used to design and run tests. An input file with information for the tests can be written. In that file, the tester can define several parameters:

– Objectives: the tester can specify which messages should be generated again and whether they should be generated in order or not. If those messages are generated, it means that the particular objective is fulfilled.
– Maximum time: sometimes it will not be possible to complete one of the tasks or objectives, so the tester can set a maximum time to indicate when the test will be interrupted if the objectives are not completed by then.
– User input file: the name of the file containing all the keys pressed while the user was playing and the associated times.
– Actions input file: the name of the file with all the high-level actions that the user performed, when he did them and the attributes of those actions.

Using those two ways of replaying the game can lead to the generation of very different tests. It is also possible to combine both ways and run a test that reproduces the actions in the input file and, only if the AI does not know how to handle a situation, replicates the movements in the user file. Several different tests can be run to check whether it is possible to complete a task. If any of them succeeds, then the objective is fulfilled. It is also possible to launch the game more than once in the same execution, so that various tests are run and the objectives checked.

5 Example of Use: Time and Space

Time and Space is a game developed by a group of students of the Máster en Desarrollo de Videojuegos of the Universidad Complutense de Madrid. This game consists of several levels with switches, triggers, doors and platforms. The player has to reach the end of each level with the help of copies of itself. These copies will step on the triggers to keep doors open or platforms moving while the player goes through them. Sometimes clones will even have to act as barriers against enemies to keep them from shooting the player. Also, there are platforms in the game that cannot be traversed by the player, but only by his copies. These copies are not controlled by the person that is playing the game. Their movements are restricted so that they just reproduce the actions they made in the previous execution of the level, before the player copied itself.2

2 Full gameplay shown at https://www.youtube.com/watch?v=GmxV_GNY72w

When the tester activates the recording of traces, a file in JSON format is generated. This format was chosen because of its simplicity to read and write to a file using different styles (named objects or arrays). Figure 2 shows a trace recorded when a player clone touches a button.

    {
      "timestamp": 73400,
      "type": "TOUCHED",
      "info": {
        "associatedEntity": { "name": "PlayerClone1", "type": "PlayerClone" },
        "entity": { "name": "DoorTrigger1", "type": "DoorTrigger" },
        "player": {
          "name": "Player",
          "position": "Vector3(37.0423, -2.24262e-006, -6.53315)",
          "type": "Player"
        }
      }
    }

Fig. 2. Example of a trace generated when a copy pushes a button.

This file contains the information of the execution: what actions were performed, when they took place and the entities that were associated to each action (player, switch, enemy, etc.). Note that the entity position is not recorded because it can be read from the map file. The player position is also necessary to imitate the movements when this trace is replayed. To reproduce the previously recorded traces, adapting them to the new level, two different types of events can be distinguished:

– Generated by the player: these are the actions generated by the player itself. The recorded trace available in the file consists of the timestamp, the type of the action performed and the entity related to the action. Some other information may be stored depending on the type of the action. The actions can either be reproduced immediately or reproduced using the AI of the videogame.
– Generated by some other entity: in this case, we try to make the state of the player as close as possible to the player state when the trace was recorded. With that purpose, we store the information needed to know the player state along with the recorded event. In Time and Space, the state of the player only consists of its position in the map, so that is the only information we need to save in the trace log. Figure 2 shows an example of this type of trace. For replicating the game state, we use again the AI of the videogame, which is responsible for the moves of the player, making sure that they are valid.

In order to detect and reproduce traces, the actions have been modeled by a Petri net, as introduced in section 4.4. Thanks to these models, it is possible to replicate the actions in the same order as they were recorded in the first place. This is not trivial: some of the actions could be reproduced before some previous ones if the player does not know how to carry out an action and gets stuck without doing anything. Because of the nature of Time and Space and the possibility of creating clones, all the Petri nets generated have a fed-back structure, with multiple tokens moving from state to state inside the net. Timed Petri nets are used instead of simple Petri nets because most of the actions cannot be reproduced until previous actions are done. For example, when a clone of the player presses a button to open a door, it is necessary to wait until the door is open before starting to go through it, even if the clone has already pushed the button. For that reason, performing all the actions from a trace in order makes it easy to obtain an almost exact reproduction of the gameplay. Even if this solution is almost exact, this method has some limitations.
Using Petri nets to reproduce traces means that the videogame needs to have an AI implemented, capable of controlling the player inside the game. Fortunately, there are a lot of games (like Time and Space) that use an AI for directing all non-player characters movements that can be reused for that. However, despite the fact that the results we have from Time and Space are very promising, there are some use cases in which the reproduction of traces is not going to work properly. One of these examples is when the player needs to push a button that is not on the same level as the ground in which the player is standing. In this case, the AI of the videogame cannot find out where the button is, so the reaction of the player will just be waiting there without moving. This situation shows that the testing method proposed is valid, but remarks that an AI designed for control the player is needed. In order to automatise the testing phase we created a configuration file where the tester can choose if he wants to record the game or load previous executions that were recorded before. If he chooses to reproduce something recorded, then he can specify the name of the file that contains the actions that are going to be loaded and reproduced. We have recorded game traces from a level that the user played and completed successfully, that is, reaching the end of it. To check that the programmed replay system works with Time and Space, we reproduced those exact traces without 177 Automatic Gameplay Testing for Message Passing Architectures 11 any modification in the same level. However, the map was slightly changed in those new executions. Some of the tests that we carried out were the following: – Firstly we tried to change the map and move the switches to reachable places from the position in which they were initially placed. We then repeated the same test but we also changed the end of level mark. Both of the tests were still feasible after all the changes, and by replaying the same traces it is still possible to complete the objectives and reach the end of the level. Another test we made consisted in placing one of the switches behind a closed door. In this case, we could see that the player detected that the new position was not reachable and therefore he did not move from the position he had before detecting he had to press the switch. – After that, we recorded traces from a level in which the player needs three different copies to win the level. To reach the end of the level, the player has to go through a moving platform and some rays have to be deactivated. That is why the player needs his clones. If the player tries to go through the rays before deactivating them, he dies. To make the tests, several changes were added to the map. For example, we tried to pass the test after interchanging the switches between them. By doing that, they were closer or further away from the player. Running the test allows us to see that despite of all the changes, it is still possible to complete the level without difficulties. We recorded a video that shows the reproduction of two of these tests3 . Because the game has been implemented following a standard componentbased architecture, it was not necessary to make major changes in it. To record the game session we only added the CRecorder component as described in 4.1, which receives all the messages generated during the gameplay. The code for the CRecorder component and the required changes made in the original implementation are about 8 KB. 
Moreover, two new modules of about 127 KB were created for recording and replaying the messages. These modules were designed with a general purpose and only slightly modified to work with this game. 6 Related Work With systems growth in size and complexity, tests are more difficult to design and develop. Testing all the functions of a program becomes a challenging task. One of the clearest examples of this is the development of online multiplayer games [6]. The massive number of players make it impossible to predict and detect all the bugs. Online games are also difficult to debug because of the non-determinism and multi-process. Errors are hard to reproduce, so automated testing is a strong tool which increases the chance of finding errors and also improves developers efficiency. Monkey testing is a black-box testing aimed at applications with graphical user interfaces that has become popular due to its inclusion in the Android Development Kit4 . It is based on the theoretical idea that a monkey randomly 3 4 https://www.youtube.com/watch?v=1OBlBKly1pk http://developer.android.com/tools/help/monkey.html 178 12 J. Hernández Bécares, L. Costero Valero, P. P. Gómez Martı́n using a typewriter would eventually type out all of the Shakespeare’s writings. When applied to testing, it consists on a random stream of input events that are injected into the application in order to make it crash. Even though this testing technique blindly executes the game without any particular goals, it is useful for detecting hidden bugs. Due to the enormous market segmentation, again specially in the Android market but more and more also in the iOS ecosystem, automated tests are essential in order to check the application in many different physical devices. In the cloud era, this has become a service provided by companies devoted to offer cloud-based development environments. Unfortunately, all those testing approaches are aimed at software, ignoring the fact that games are also maps, levels and challenges. We are not aware of any approach for automatic gameplay testing as described in this paper. 7 Conclusions and Future Work Although some methods for automatising gameplay tests exist, they are aimed at checking software aspects, not taking into account the necessity of checking that both the maps and levels are still correct. Because these levels and maps also evolve alongside software while developing games, finding a way to run automatic tests to check that all the modifications introduced into levels are consistent is a must. In this paper we have introduced a proposal on how to carry out these tests. Taking advantage of the component-based architecture, we have analised the cost of introducing the recording and replaying of traces in games, which allow us to automatically repeat gameplays after modifying the levels. This proposal has been tested with a simple game, proving the viability of the idea. Despite the promising results, the work we carried out is still on preliminary stages. It is still necessary to test this technique in more complex games, as well as proving its stability to more dramatic changes in them. References 1. Software Engineering Technical Committee of the IEEE Computer Society: IEEE Std 829-1998. IEEE-SA Standard Board (1998) 2. Meszaros, G: XUnit test patterns: refactoring test code. Addison-Wesley, (2007) 3. Popova-Zeugmann, L: Time and Petri Nets. Springer-Verlag Berlin Heidelberg (2013) 4. Estevão Araújo, M., Roque, L.: Modeling Games with Petri Nets. 
DIGRA2009 Breaking New Ground: Innovation in Games, Play, Practice and Theory (2009) 5. Gregory, J.: Game Engine Architecture. A K Peters, Ltd. (2009) 6. Mellon, L.: Automatic Testing for Online Games. Game Developers Conference, 2006. 179 A Summary of Player Assessment in a Multi-UAV Mission Planning Serious Game Vı́ctor Rodrı́guez-Fernández, Cristian Ramirez-Atencia, and David Camacho Universidad Autónoma de Madrid (UAM) 28049, Madrid, Spain, {victor.rodriguez,cristian.ramirez}@inv.uam.es, david.camacho@uam.es AIDA Group: http://aida.ii.uam.es Abstract. Mission Planning for a large number of Unmanned Aerial Vehicles (UAVs) involves a set of locations to visit in different time intervals, and the actions that a vehicle must perform depending on its features and sensors. Analyzing how humans solve this problem is sometimes hard due to the complexity of the problem and the lack of data available. This paper presents a summary of a serious videogame-based framework created to assess the quality of the mission plans designed by players, comparing them against the optimal solutions obtained by a Multi-Objective Optimization algorithm. Keywords: Mission Planning, Multi-UAV, Serious Game, Player Assessment, Multi-Objective. 1 Introduction The study of Unmanned Aerial Vehicles (UAVs) is constantly increasing nowadays. These technologies offer many potential applications in numerous fields as monitoring coastal frontiers, road traffic, disaster management, etc [2]. Nowadays, these vehicles are controlled remotely from ground control stations by human operators who use legacy Mission Planning systems. The problem of Mission Planning for UAVs can be defined as the process of planning the waypoints to visit and the actions that the vehicle can perform (loading/dropping a load, taking videos/pictures, etc), typically over a time period. The fast evolution of UAV systems is leading to a shortage of qualified operators. Thus, it is necessary to re-design the current training process to meet that demand, making UAV operations more accessible and available for a less limited pool of individuals, which may include high-skilled videogame players [4]. This work presents a summary of a previous work [6], focused on creating a videogame-based Multi UAV Mission Planning framework, that studies and compares human plans with those generated by a Mission Planning algorithm. Modern approaches formulate the Mission Planning problem as a Constraint Satisfaction Problem (CSP) [1], where the mission is modelled and solved using constraint satisfaction techniques. CSPs are defined as a tuple <V,D,C> of 180 variables V = v1 , . . . , vn ; for each variable, a finite set of possible values Di (its domain), and a set of constraints Ci restricting the values that variables can simultaneously take. In order to find optimal solutions for these problems, in this work an optimization function has been designed to search for good solutions minimizing the fuel consumption and the makespan of the mission. To solve this optimization problem, a Multi-Objective Branch & Bound (MOBB) algorithm [8] has been designed in order to find the optimal solutions in the Pareto Optimal Frontier (POF). This algorithm will be integrated in the developed framework to compare and rank the plans created by human players. The rest of the paper is structured as follows: section 2 describes how a mission is defined in the UAV domain. 
Section 3 describes the game developed to simplify the Multi-UAV Cooperative Mission Planning Problem (MCMPP) and collect the players' Mission Plans. Section 4 explains the experiments performed and the experimental results obtained. Finally, the last section presents the final analysis and conclusions of this work.

2 The Mission Planning Problem

The MCMPP is defined as a number n of tasks to accomplish for a team of m UAVs. There are different types of tasks, such as exploring a specific area or searching for an object in a zone. These tasks can be carried out thanks to the sensors available on the UAVs performing the mission. Each task is performed in a specific geographic zone and a specific time interval. In addition, the vehicles performing the mission have some features that must be considered to check whether a mission plan is correct. These features include the initial position, the initial fuel, the available sensors and one or more flight profiles. A flight profile specifies, for a vehicle at a given moment, its speed, its fuel consumption rate and its altitude. Figure 1 shows an assignment of a UAV u to two tasks i and j. In this assignment it is necessary to assure that u has enough fuel and the sensors needed to perform both tasks and then return to its initial position. To ensure this, it is necessary to compute the distance d_{u→i} from the initial position of u to the entry point of task i and then take the fuel consumption rate from the flight profile in order to compute the fuel consumed traversing this path. In addition, the speed v_u from the flight profile is used to compute the path duration. Then, having the duration τ_i of task i and the speed v_i given by the flight profile of the sensor used to perform the task, we can deduce the distance traversed by the UAV during the task performance and therefore, using the fuel consumption rate of the sensor's flight profile, deduce the fuel consumed too. Next, the previous steps are repeated with task j. Finally, it is necessary to compute the fuel consumption and flight time for the return of the UAV from the last task performed to its initial position.

Fig. 1: Example of assignment of a UAV u to tasks i and j.

When considering the MCMPP as an optimization problem, the variables to minimize are the total fuel consumption and the makespan of the mission, i.e. the time elapsed since the mission start time until the mission is finished. In previous works [5], we have modelled this problem as a CSP and automatically obtained a set of optimal solutions using a MOBB algorithm.

3 Developing a Mission Planning Videogame

The game created to accomplish the MCMPP has been designed focusing on the accessibility that professional mission planners lack. It is based on the multi-UAV simulation environment Drone Watch And Rescue, which we designed in order to extract and analyze data from the user interactions [7]. Figure 2 shows a screenshot of the Mission Planning Scenario in the game. This screen can be divided into six distinct parts:

1. Main Screen: Displays graphically the Mission Scenario.
2. Waypoints panel: Shows the flying path of the selected UAV.
3. Plan submission button: Submits and saves the player's Mission Plan.
4. UAV's panel: Displays basic information and sensors of the selected UAVs.
5. Task Panel: Displays basic information and sensors of the selected task.
6. Console Panel: Logs the result of the player's interactions during a gameplay.

To achieve an intuitive and quick understanding of the different controls available in the game, almost all of them are activated by mouse clicks on the game's Main screen. The whole set of game controls is detailed below:

– Select UAV: Allows the player to see the current path and information of the UAV.
– Select Task: Allows the player to see the task information.
– Assign/Unassign UAV to Task.
– Submit Plan: Submits and saves the current Mission Plan.

The game has been developed using web development technologies from the field of videogames. Their main advantages include the portability of the game between desktop and mobile systems, and high availability: using any web browser with HTML5 capabilities, a user can access the URL where the game is hosted and play it without installing any additional software. However, it is important to note the limitations of this type of technology. The system requirements of a videogame are much higher than those of a common web application, and current Javascript engines, despite being more and more powerful, still have noticeable performance problems when running compute-intensive jobs. Because of this, the game has been designed with a 2-level architecture (server-client), based on the design patterns used in the development of multi-user real-time applications and videogames [3]. Client-server communication is achieved by the use of the Websockets communication protocol, which offers lower latency than HTTP and is especially suitable for real-time data streams. For more information about the architecture, see [7].

Fig. 2: Game screenshot showing the Mission Scenario used in this work. Numbers represent the different parts of the Graphical User Interface (GUI).

4 Experimentation

In this work, the main goal of the experimentation is to rank the quality of the Mission Plans designed by players in the video game described in section 3 against those obtained automatically and optimally by a MOBB algorithm, detailed in the complete work [6]. The Mission Scenario used in this experiment features 8 tasks to be assigned to 5 UAVs scattered throughout the map. A graphical representation of this Mission Scenario can be seen, as a game screenshot, in Figure 2. In this scenario, we must compute the optimal mission plans in terms of the variables Makespan and Fuel Consumption. To this aim, we used the MOBB algorithm developed in [6] to find the Makespan-Fuel consumption POF of the bi-objective problem. We obtain that, for the proposed scenario, the POF is composed of six optimal solutions. To evaluate the quality of a player's Mission Plan, we take its Makespan and Fuel consumption values, normalize them into [0, 1], and then compute the Euclidean distance of such values to the also normalized Makespan-Fuel consumption POF calculated before. The player's plan quality will represent his score in the game and will allow us to compare gameplays (a minimal sketch of this computation is given below).

Fig. 3: Comparison between player mission plans (red points) and the computed Pareto Optimal Frontier for the variables Makespan and Fuel Consumption.

To carry out this experiment, a set of 15 players submitted a Mission Plan playing the video game developed. None of them had knowledge in the field of the MCMPP, and they only received a brief tutorial about the game objective and the game controls. Figure 3 shows the performance of each player's gameplay as a point in the Makespan-Fuel space.
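The sketch below illustrates this scoring step; the normalisation bounds and the frontier values are placeholders, not the ones used in the experiment:

```python
# Hypothetical sketch: score a player's plan as the Euclidean distance, in the
# normalised makespan-fuel space, to the closest point of the Pareto frontier.
from math import hypot

def normalise(value, lo, hi):
    return (value - lo) / (hi - lo) if hi > lo else 0.0

def plan_score(makespan, fuel, pof, bounds):
    """`pof` is a list of (makespan, fuel) optimal points; `bounds` gives
    (min, max) for each variable. Lower scores mean better plans."""
    (ms_lo, ms_hi), (f_lo, f_hi) = bounds
    p = (normalise(makespan, ms_lo, ms_hi), normalise(fuel, f_lo, f_hi))
    return min(hypot(p[0] - normalise(m, ms_lo, ms_hi),
                     p[1] - normalise(f, f_lo, f_hi)) for m, f in pof)

# Placeholder frontier and bounds, for illustration only.
pof = [(2.5, 170.0), (4.0, 150.0), (6.0, 140.0)]
bounds = ((2.0, 8.0), (130.0, 200.0))
print(plan_score(makespan=5.91, fuel=143.59, pof=pof, bounds=bounds))
```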
The closer a point is to the POF, the better the player's rank will be. Table 1 shows the first ranking positions numerically; the complete ranking is shown in [6].

Table 1: Top 5 player ranking. The lower the score, the better the ranking position.

Ranking   Makespan (h)   Fuel consumption (L)   Score
1         5.91           143.59                 0.00000
2         3.85           149.45                 0.00052
3         2.51           169.63                 0.00213
4         3.66           160.44                 0.01014
5         5.93           146.66                 0.01405

The results prove that there is not a dominant planning style in terms of the optimization variables focused on by the players. Most of the points are located at the center of the space, which means that the general trend a novice player follows in this type of problem is to balance the values to optimize. It is also remarkable that the Mission Plans are generally quite close to the POF.

5 Conclusions and Future Work

This paper has presented a summary of the contributions made by some published works in the field of the Multi-UAV Cooperative Mission Planning Problem, especially focused on assessing user performance when designing plans. A video-game-based framework is created to make this problem understandable for non-expert users, and to rank and compare player plans against the optimal ones computed by a Multi-Objective Optimization algorithm. As future work, we intend to extend the video game to allow the creation of more complex plans, to introduce some gamification elements (such as tutorials and levels) that make it even more accessible, and to include elements that improve the analysis of the players, such as identifications to track the evolution of their gameplays, or time-spent measurements to rank the player's speed.

Acknowledgments

This work is supported by the Spanish Ministry of Science and Education under Project Code TIN2014-56494-C4-4-P, Comunidad Autonoma de Madrid under project CIBERDINE S2013/ICE-3095, and Savier, an Airbus Defence & Space project (FUAM-076914 and FUAM-076915). The authors would like to acknowledge the support obtained from Airbus Defence & Space, especially from Savier Open Innovation project members: José Insenser, Gemma Blasco, César Castro and Juan Antonio Henríquez.

References

1. Guettier, C., Allo, B., Legendre, V., Poncet, J.C., Strady-Lecubin, N.: Constraint model-based planning and scheduling with multiple resources and complex collaboration schema. In: Proceedings of the Sixth International Conference on Artificial Intelligence Planning Systems (AIPS), pp. 284–292 (2002)
2. Kendoul, F.: Survey of advances in guidance, navigation, and control of unmanned rotorcraft systems. Journal of Field Robotics 29(2), 315–378 (2012)
3. Lewis, M., Jacobson, J.: Game engines. Communications of the ACM 45(1), 27 (2002)
4. McKinley, R.A., McIntire, L.K., Funke, M.A.: Operator selection for unmanned aerial systems: comparing video game players and pilots. Aviation, Space, and Environmental Medicine 82(6), 635–642 (2011)
5. Ramirez-Atencia, C., Bello-Orgaz, G., R-Moreno, M.D., Camacho, D.: Branching to find feasible solutions in Unmanned Air Vehicle Mission Planning. In: International Conference on Intelligent Data Engineering and Automated Learning, pp. 286–294 (2014)
6. Rodríguez-Fernández, V., Atencia, C.R., Camacho, D.: A Multi-UAV Mission Planning Videogame-based Framework for Player Analysis. In: Evolutionary Computation (CEC), 2015 IEEE Congress on. IEEE, in press (2015)
7. Rodríguez-Fernández, V., Menéndez, H.D., Camacho, D.: Design and Development of a Lightweight Multi-UAV Simulator. In: Cybernetics (CYBCONF), 2015 IEEE International Conference on.
IEEE, In press (2015) 8. Rollon, E., Larrosa, J.: Constraint optimization techniques for multiobjective branch and bound search. In: International Conference on Logic Programming, ICLP (2008) 185 An overview on the termination conditions in the evolution of game bots A. Fernández-Ares, P. Garcı́a-Sánchez, A. M. Mora, P. A. Castillo, J. J. Merelo, M. G. Arenas, and G. Romero Dept. of Computer Architecture and Technology, University of Granada, Spain {antares,pablogarcia,amorag,pacv,jmerelo,mgarenas,gustavo}@ugr.es Abstract. Evolutionary Algorithms (EAs) are frequently used as a mechanism for the optimization of autonomous agents in games (bots), but knowing when to stop the evolution, when the bots are good enough, is not as easy as it would a priori seem. The first issue is that optimal bots are either unknown (and thus unusable as termination condition) or unreachable. In most EAs trying to find optimal bots fitness is evaluated through game playing. Many times it is found to be noisy, making its use as a termination condition also complicated. This paper summarizes our previous published work where we tested several termination conditions in order to find the one that yields optimal solutions within a restricted amount of time, to allow researchers to compare different EAs as fairly as possible. To achieve this, we examined several ways of finishing an EA who is finding an optimal bot design process for a particular game, Planet Wars in this case, with the characteristics described above, determining the capabilities of every one of them and, eventually, selecting one for future designs. Keywords: Videogames, RTS, evolutionary algorithms, termination criteria, noisy fitness 1 Introduction Evolutionary Algorithms (EAS) are one of the methods usually applied to find the best autonomous agent for playing a game, i.e. the best bot [1, 2] through a process mimicking the natural evolution of the species. As in any other algorithm, the termination condition is a key factor as the rest ot the experimental setup since it affects the algorithmic performance, with respect to the quality of the yielded solution, and also to the amount of resources devoted to the run. The usual stopping criterion in EAs [3] is reaching a constant number of generations (or evaluations), which is normally related to a fixed computing power budget for carrying out the run. Another usual approach is based in a number of generations in which the best solution is not improved or the distance to the optimum is not reduced [4]. However, neither of them might be useful in certain kind of problems such as games, mainly due to the noisy nature of the fitness function [2]. Noise and optimal fitness reachability, are normally not taken into account when choosing how to stop the evolution process. Usual approach is to use a fixed 186 2 A. Fernández-Ares et al. number of evaluations or a fixed amount of time, usually given by the game or challenge constraints. In this paper we present a summary of our previous work [5], where we tried to solve these issues by the introduction of novel stopping criteria for the EAs. They are compared against classical ones, and among themselves when trying to generate competitive bots for video games using Genetic Programming (GP) [6], as this method has proved to be quite flexible and has obtained good results in previous works [7]. 
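As a generic, illustrative sketch of the two classic stopping rules mentioned above, a fixed generation budget and a "patience" rule on the best fitness; this is not the implementation evaluated in the paper:

```python
# Generic sketch of two classic EA termination conditions: a fixed number of
# generations, and stopping when the best fitness has not improved during a
# given number of generations ("patience").
def should_stop(generation, best_history, max_generations=200, patience=15):
    if generation >= max_generations:
        return True
    if len(best_history) > patience:
        recent_best = max(best_history[-patience:])
        earlier_best = max(best_history[:-patience])
        return recent_best <= earlier_best  # no improvement in `patience` generations
    return False

# Example: fitness stalls at 12.0, so the run stops once patience is exhausted.
history = [3.0, 7.5, 9.0, 11.0, 12.0] + [12.0] * 20
print(should_stop(generation=25, best_history=history, patience=15))  # True
```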
The Planet Wars game (http://planetwars.aichallenge.org/) was chosen for our experiments, as it is a simple Real-Time Strategy (RTS) combat-based game (only one type of resource, one type of attack and one type of unit), and it has also been widely used in the literature, with different generation methods and fitness functions [8–11]. This game satisfies two conditions: the initial position of the bots is random and their decisions are stochastic, although the result of a combat is deterministic.

Summarizing, the objective of the study presented in [5] (and outlined in this work) was to find a stopping criterion that converges to optimal solutions and that is independent of the method chosen. To measure the quality of every approach, we considered the time, or number of generations, needed to obtain the solution, and the quality of that solution.

2 Methodology and experimental setup

As previously stated, in the described work [5] we proposed different termination criteria based on different EA features, such as the parameters of the algorithm (maximum number of generations) or the population (improvement, replacement or age).

A Score Function was proposed in order to measure the quality of a generated bot (a solution or individual in the algorithm). This scoring method tries to reduce the effects of the noisy evaluation (following the guidelines of other works [12]) by computing fitness from the result of 30 different matches against an expert rival. Thus, the function considers the number of victories, the turns needed to win, and the turns resisted before being defeated by the opponent (in the case of a loss). The rival is ExpGenebot [13], which is based on an improvement of the heuristics proposed by a human player. The fitness of each individual i of the population is obtained using the following formulas, where N is the number of simulations (the '1' in all denominators is used to avoid dividing by 0 and for the ratio calculation):

Score_i = α + β + γ    (1)

α = v,    α ∈ [0, NB]    (2)

β = NB × 1/(t_win/(v+1) + 1) + t_win/(N × t_MAX + 1),    β ∈ [0, NB],    t_win ∈ [0, NB × t_MAX]    (3)

γ = t_defeated/(NB × t_MAX + 1),    γ ∈ [0, 1],    t_defeated ∈ [0, NB × t_MAX]    (4)

The terms used are: the number of battles to test (NB), the number of victories of the individual against ExpGenebot (v), the sum of turns used to beat ExpGenebot in winning simulations (t_win), the sum of turns when the individual has been defeated by ExpGenebot in losing simulations (t_defeated), and the maximum number of turns a battle lasts (t_MAX).

The GP algorithm evolves a binary tree formed by decisions (logical expressions that evaluate the current state of the game) and actions (the leaves of the tree: the number of ships to send to a specific planet). This tree is evaluated on each of the player's planets, analysing the current state of the map/planet (decision) and determining how many ships to send from that planet to a specific target planet (action). The target planet can be, for example, the wealthiest or the closest planet owned by the player or the enemy, or a neutral one. The possible actions and decisions are listed in [7]. The complete set of parameters used is described in the work we are summarizing here [5].
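As an illustration of this representation, the sketch below models such a tree as nested decision nodes with action leaves and evaluates it for one planet's state. It is a hedged example only: the state fields, target labels and thresholds are invented here, and the actual decision and action sets are the ones listed in [7].

from dataclasses import dataclass
from typing import Callable, Union

@dataclass
class Action:
    ships_ratio: float   # fraction of the planet's ships to send
    target: str          # hypothetical target selector label, e.g. "closest_enemy"

@dataclass
class Decision:
    test: Callable[[dict], bool]          # logical expression on the game state
    if_true: Union["Decision", Action]
    if_false: Union["Decision", Action]

def evaluate(node: Union[Decision, Action], state: dict) -> Action:
    """Walk the tree for one planet and return the action at the reached leaf."""
    while isinstance(node, Decision):
        node = node.if_true if node.test(state) else node.if_false
    return node

# Hypothetical tree: attack the closest enemy planet when clearly stronger,
# otherwise reinforce the weakest owned planet.
tree = Decision(
    test=lambda s: s["my_ships"] > 2 * s["enemy_ships_nearby"],
    if_true=Action(ships_ratio=0.6, target="closest_enemy"),
    if_false=Action(ships_ratio=0.3, target="weakest_owned"),
)

order = evaluate(tree, {"my_ships": 120, "enemy_ships_nearby": 40})
print(order.ships_ratio, order.target)   # 0.6 closest_enemy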
We designed a set of five different stopping criteria, which were tested in the paper, namely (see the sketch after this list):

– [NG] Number of Generations: the classical termination criterion in evolutionary algorithms: 30, 50, 100 and 200 generations.
– [AO] Age of Outliers: if the age of an individual is an outlier with respect to the rest of the population, then it is potentially an optimal solution and the algorithm can be stopped: 1, 1.5, 2 and 2.5 times the interquartile range (IQR).
– [RT] Replacement Rate: when using an elitist strategy in which individuals are replaced only if the offspring is better, the fact that the population stops generating better individuals might be a sign of stagnation: n/2, n/4, n/8 and n/16.
– [FT] Fitness Threshold: a maximum value to reach during the evolution could be set considering the top limit of the score function: 20 (half the maximum, MAX_SC/2), 30 (half the maximum score plus half this value, MAX_SC/2 + MAX_SC/4), and the division of the interval between these values into four parts: 22, 24, 26 and 28.
– [FI] Fitness Improvement: if the best fitness is not improved during a number of generations, the algorithm must stop. Four possible values were tested: 3, 7, 10 and 15 generations.
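A minimal sketch of how these five criteria could be written as stopping predicates for a loop like the one sketched in the Introduction is given below. The thresholds, window sizes and state variables are hypothetical defaults, not the exact implementation used in [5]; each predicate only reads readily available EA state (generation counter, best-fitness history, replacement count or individual ages), which is what makes it cheap to check all of them within a single run.

import statistics

def ng_stop(generation: int, max_generations: int = 100) -> bool:
    # [NG] classical criterion: a fixed number of generations.
    return generation >= max_generations

def fi_stop(best_history: list, window: int = 10) -> bool:
    # [FI] stop if the best fitness has not improved during `window` generations.
    return (len(best_history) > window
            and max(best_history[-window:]) <= best_history[-window - 1])

def ft_stop(best_fitness: float, threshold: float = 22.0) -> bool:
    # [FT] stop when a target score derived from MAX_SC is reached.
    return best_fitness >= threshold

def rt_stop(replacements_last_gen: int, pop_size: int, divisor: int = 8) -> bool:
    # [RT] with elitist replacement, stop when fewer than n/divisor individuals
    # were replaced in the last generation (a sign of stagnation).
    return replacements_last_gen < pop_size / divisor

def ao_stop(ages: list, k: float = 1.5) -> bool:
    # [AO] stop if the oldest individual's age is an outlier, i.e. more than
    # k times the IQR above the third quartile of the population's ages.
    q1, _, q3 = statistics.quantiles(ages, n=4)
    return max(ages) > q3 + k * (q3 - q1)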
3 Experiments and Results

The experiments conducted involved 36 runs, each of them configured to include all the defined stopping criteria, plus an extra termination condition, reaching 500 generations, in order to avoid non-ending runs.

The results showed the absence of some of the mentioned stopping values for the Fitness Threshold criterion, namely 28 and 30, because they were not reached in any of the runs. The results also showed that all the criteria are well defined, since the scores grow within every criterion block: the more restrictive the criterion (i.e. the lower the probability of meeting it), the higher the obtained score. The score function worked as expected, even in the presence of noise. This is true in all cases except for the Age of Outliers, which is by far the criterion with the worst results, as is also shown by the statistical test, which does not find significant differences between the scores obtained by each age-based criterion and the previous and next ones. The Replacement Rate criterion yielded the best distribution of results, with a clear fitness-improvement tendency and a very good maximum score, close to that obtained by the Fitness Thresholds.

In addition to these two studies, an additional measure was computed by means of a benchmark based on battles against a different competitive bot available in the literature [13] on 100 maps (some of them used during the evolution). This test was conducted for the best individual obtained when each criterion was met in each run; thus, 20 × 36 bots were tested. The best results were yielded by the most restrictive value of each criterion, with the FT results standing out. This also reinforces the correctness of the score function.

Score and generations measures are compared in Figure 1. As can be seen, an improvement in the fitness/score implies that a higher number of generations is required. This happens in almost all cases, with some exceptions, such as some of the RT criteria, which obtain a higher score in fewer generations than other criteria; however, this is due to the aforementioned noise problem.

Fig. 1: Average score of the best individual and average reached generations per termination criterion. (Scatter plot; x-axis: average generation to stop, y-axis: average score of the best individual; each point is labelled with the criterion and its completion rate, e.g. FT_26.0 <22%>, RT_n/16 <67%>.)

Finally, Table 1 presents all the results as a summary. It also shows the completion rate of every stopping criterion. Moreover, a comparative set of values was computed, taking a number of generations equal to 30 (usual in previous papers) as the standard value against which the rest are relativized.

Table 1: Average results of every criterion for the three measures, Number of generations (G), Score (S) and Victories in benchmark (V), plus the Completion rate in experiments (R). Relative values are computed with respect to NG 30.0.

Stopping    Absolute                 Relative               R
Criteria    G        S      V        G       S      V
NG 030.0    30.00    16.31  45.92    1.00    1.00   1.00   1.00
NG 050.0    50.00    17.80  52.72    1.70    1.09   1.15   1.00
NG 100.0    100.00   19.21  57.25    3.37    1.18   1.25   1.00
NG 200.0    200.00   20.25  58.39    6.70    1.24   1.27   1.00
AO 1.0      8.83     13.46  35.89    0.29    0.83   0.78   1.00
AO 1.5      10.33    14.07  34.83    0.34    0.86   0.76   1.00
AO 2.0      14.61    14.93  38.08    0.49    0.92   0.83   1.00
AO 2.5      17.17    15.30  39.92    0.57    0.94   0.87   1.00
RT n/02     2.00     10.20  20.58    0.07    0.63   0.45   1.00
RT n/04     5.47     12.21  28.17    0.18    0.75   0.61   1.00
RT n/08     78.64    18.16  50.94    2.62    1.11   1.11   1.00
RT n/16     248.21   21.34  62.92    8.27    1.31   1.37   0.66
FT 20.0     55.08    20.62  54.22    1.84    1.26   1.18   1.00
FT 22.0     127.56   22.65  58.25    4.25    1.39   1.27   1.00
FT 24.0     276.71   24.39  63.39    9.22    1.50   1.38   0.77
FT 26.0     378.88   26.45  74.75    12.63   1.62   1.63   0.22
FI 03.0     10.31    13.34  30.00    0.34    0.82   0.65   1.00
FI 07.0     24.56    15.54  41.39    0.82    0.95   0.90   1.00
FI 10.0     35.47    16.50  47.94    1.18    1.01   1.04   1.00
FI 15.0     52.22    17.56  53.00    1.74    1.08   1.15   1.00

The FI criterion is useful to 'detect' local optima. Increasing the restriction value of this method gives the EA more generations to escape from a local optimum, obtaining significantly better results. As the EA can quickly converge to a local optimum, using this method could be equivalent to setting a fixed (but unknown) number of generations, enough to detect stagnation in the population (which can be useful in some evolutionary approaches). However, the results show that this criterion has stopped in local optima that other methods have surpassed. Finally, RT provides the best results considering all metrics: generations, score and completion rate. It is based on the replacement rate, so it indirectly measures how the whole population increases its abilities, without explicitly measuring the average fitness. This is useful in this kind of problem, i.e. where there is a noisy fitness function and the optimal solution is unknown.

4 Conclusions

Using Evolutionary Algorithms (EAs) to generate bots for playing games has two main issues: the fitness is noisy, and optimal bots are either not known or unreachable. This makes it difficult to find a good stopping criterion for the EA. In [5], four different stopping criteria, based on fitness and on the population, were tested and compared with the classical approach of a fixed number of generations. This paper summarizes the contents of that previous work.

Several experiments were conducted, using different metrics based on a score function, the number of generations reached for each criterion, and the number of victories that the best bots yielded per criterion obtained against an external rival (not the same one used in the fitness computation).
According to the results, a stopping criterion based on a Fitness Threshold would initially be the most desirable option, as it attains the best score. However, in this kind of problem it is quite difficult to find an optimal fitness value to use (normally it is unknown). Therefore, the best option would be to use the Replacement Rate as the stopping criterion, since it is a compromise solution which relies on the improvement of the population without explicitly using the fitness.

As future work, new problems (and algorithms) will be addressed to validate the proposed stopping criteria, using different environments and new score functions. In addition, mechanisms to improve the EA will be used in conjunction with the proposed methods, for example, increasing the search space when stagnation of the population is detected.

Acknowledgments

This work has been supported in part by SIPESCA (Programa Operativo FEDER de Andalucía 2007-2013), TIN2011-28627-C04-02 (Spanish Ministry of Economy and Competitiveness), SPIP201401437 (Dirección General de Tráfico), PRY142/14 (Fundación Pública Andaluza Centro de Estudios Andaluces en la IX Convocatoria de Proyectos de Investigación), the PYR-2014-17 GENIL project and V17-2015 of the Microprojects program 2015 (CEI-BIOTIC Granada).

References

1. Small, R., Bates-Congdon, C.: Agent Smith: Towards an evolutionary rule-based agent for interactive dynamic games. In: Evolutionary Computation, 2009. CEC '09. IEEE Congress on. (2009) 660–666
2. Mora, A.M., Montoya, R., Merelo, J.J., Sánchez, P.G., Castillo, P.A., Laredo, J.L.J., Martínez, A.I., Espacia, A.: Evolving Bots AI in Unreal. In di Chio et al., C., ed.: Applications of Evolutionary Computing, Part I. Volume 6024 of Lecture Notes in Computer Science., Istanbul, Turkey, Springer-Verlag (2010) 170–179
3. Bäck, T.: Evolutionary algorithms in theory and practice. Oxford University Press (1996)
4. Roche, D., Gil, D., Giraldo, J.: Detecting loss of diversity for an efficient termination of EAs. In: 15th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing, SYNASC 2013, Timisoara, Romania, September 23-26, 2013, IEEE (2013) 561–566
5. Fernández-Ares, A., García-Sánchez, P., Mora, A.M., Valdivieso, P.A.C., Guervós, J.J.M., Arenas, M.I.G., Romero, G.: It's time to stop: A comparison of termination conditions in the evolution of game bots. In: Applications of Evolutionary Computation. Volume 9028 of Lecture Notes in Computer Science., Springer (2015) 355–368
6. Koza, J.R.: Genetic Programming: On the programming of computers by means of natural selection. MIT Press, Cambridge, MA (1992)
7. García-Sánchez, P., Fernández-Ares, A.J., Mora, A.M., Castillo, P.A., Merelo, J.J., González, J.: Tree depth influence in genetic programming for generation of competitive agents for RTS games. In: EvoApplications, EvoStar. (2014) 411–421
8. Mora, A.M., Fernández-Ares, A., Guervós, J.J.M., García-Sánchez, P., Fernandes, C.M.: Effect of noisy fitness in real-time strategy games player behaviour optimisation using evolutionary algorithms. J. Comput. Sci. Technol. 27(5) (2012) 1007–1023
9. Fernández-Ares, A., Mora, A.M., Guervós, J.J.M., García-Sánchez, P., Fernandes, C.: Optimizing player behavior in a real-time strategy game using evolutionary algorithms. In: IEEE C. on Evolutionary Computation, IEEE (2011) 2017–2024
10. Lara-Cabrera, R., Cotta, C., Fernández-Leiva, A.: On balance and dynamism in procedural content generation with self-adaptive evolutionary algorithms.
Natural Computing 13(2) (2014) 157–168
11. Nogueira-Collazo, M., Cotta, C., Fernández-Leiva, A.: Virtual player design using self-learning via competitive coevolutionary algorithms. Natural Computing 13(2) (2014) 131–144
12. Mora, A.M., Fernández-Ares, A., Guervós, J.J.M., García-Sánchez, P., Fernandes, C.M.: Effect of noisy fitness in real-time strategy games player behaviour optimisation using evolutionary algorithms. J. Comput. Sci. Technol. 27(5) (2012) 1007–1023
13. Fernández-Ares, A., García-Sánchez, P., Mora, A.M., Guervós, J.J.M.: Adaptive bots for real-time strategy games via map characterization. In: CIG, IEEE (2012) 417–721