Aalborg University Copenhagen
Department of Medialogy
Semester: 5th semester
Semester Coordinator: Henrik Schønau Fog
Secretary: Judi Stærk Poulsen
Phone: 9940 2468
Title: SmileReactor
Project Period: 2nd September 2008 – 18th December 2008
Semester Theme: Audiovisual Experiments
Supervisor(s): Daniel Grest (Main supervisor), Henrik Schønau Fog
Project group no.: 08ml582
Members: Sune Bagge, Poul Martin Winther Christensen, Kim Etzerodt, Mikkel Berentsen Jensen, Mikkel Lykkegaard Jensen, Heino Jørgensen
Copies: 3
Pages: 238 including Appendix
Finished: 18th December 2008

Aalborg University Copenhagen, Lautrupvang 15, 2750 Ballerup, Denmark
judi@media.aau.dk
https://internal.media.aau.dk/

Abstract: This project concerns movie playback that reacts to when and how much the user is smiling, in order to choose a type of humor which the user finds amusing. By analyzing various techniques of character modeling, animation and types of humor, three different movie clips have been created. Techniques of facial expression detection, and in particular smile detection, have been analyzed and implemented into a program which plays the movie clips according to ratings gained from the detection of the user's facial expressions. The final solution has been tested with regard to smile detection rate, user preferences, and the timing of user reactions in relation to the intended humorous parts of the movie clips. After the conclusion of the project, a discussion of the relevance and execution of the project is presented. The project is put into perspective in the chapter Future Perspectives, where proposals for further development and alternative approaches within the field of passive interaction are described and evaluated.

Copyright © 2008. This report and/or appended material may not be partly or completely published or copied without prior written approval from the authors. Neither may the contents be used for commercial purposes without this written approval.

Preface

This project has been developed during the 5th semester of Medialogy at Aalborg University Copenhagen, in the period from the 1st of September 2008 to the 18th of December 2008. The project is based on the semester theme defined as: Audiovisual Experiments – Computer Graphics and Animation. The product is based on facial reading – more specifically smile detection – in order to control the playback of pre-rendered animated movie clips. This report will document and discuss the steps of the production process that the group went through during the project period. The report covers issues such as modeling, animation, rendering and programming in C++. It should be noted that all sounds used in the implementation are copyrighted to their respective owners.

We want to thank stud.scient. in Computer Vision Esben Plenge for valuable feedback on the smile detection method.

Reader's Guide

The report is organized in the following chapters: Pre-Analysis, Analysis, Design, Implementation, Testing, Discussion, Conclusion and Future Perspectives. Each chapter is briefly described at its start and features a sub-conclusion at its end. Concerning the Analysis, Design and Implementation chapters, the sub-chapters are ordered according to the product flow, meaning that first the movie clips are shown, and then the program registers a smile and controls further movie clip playback.
Therefore, these chapters are structured in a similar manner, with the topics concerning the creation of the movies discussed first, followed by the topics regarding the creation of the program. The header contains the current chapter number, chapter name and sub-chapter name on the right side, to always provide a point of reference. The footer contains the current page number in the outermost corner. Furthermore, when a notation such as "Expo 67¹" appears, a short explanation of the notation is given in the footer.

The APA quotation standard is used when referring to sources and quoting: "To be, or not to be, that is the question" (Shakespeare, 1603). The full list of sources can be found in chapter 10 Bibliography. When referring to other chapters or illustrations, the text is in italics, as can be seen in the line above. Illustrations are numbered X.Y, where X is the chapter number and Y is the number of the illustration within that chapter. Illustration text is easily distinguishable because the font is blue and the font size is smaller than the body text, which is set in the TrueType font "Cambria" in size 12. Code examples and variables are formatted with Courier New, and code examples are replaced with pseudo-code when the implementation is too long to cite in the report. A full list of illustrations with sources can be found in chapter 11 Illustration List. Chapter 12 Appendix contains all storyboards, implementation code and additional graphs and data which are not shown in the report itself. Additional data which is too large to be represented in the appendix can be found on the enclosed CD-ROM.

Content

1. Introduction ..... 1
1.1 Motivation ..... 1
1.2 State of the art ..... 3
1.3 Initial Problem Formulation ..... 6
2. Pre-analysis ..... 7
2.1 Narrowing down emotions ..... 7
2.2 Storytelling in a movie ..... 11
2.3 Rendering methods ..... 14
2.4 Target group ..... 18
2.5 Testing the product ..... 19
2.6 Delimitation ..... 21
2.7 Final Problem Formulation ..... 22
3. Analysis ..... 23
3.1 Delimitation of the character ..... 23
3.2 Narrative structures ..... 24
3.3 Lighting, sound and acting ..... 28
3.4 Camera movements and angles ..... 32
3.5 Humor ..... 39
3.6 A reacting narrative ..... 45
3.7 Programming languages ..... 47
3.8 Smile detection ..... 49
3.9 Analysis Conclusion ..... 56
4. Design ..... 58
4.1 Character Design ..... 58
4.2 Humor types ..... 71
4.3 Drafty Storyboards ..... 73
4.4 Animation ..... 82
4.5 Smile Detection Program ..... 106
4.6 Design conclusion ..... 111
5. Implementation ..... 113
5.1 Modeling ..... 113
5.2 Rigging ..... 117
5.3 Animation tools ..... 129
5.4 Smile Picture Capture ..... 138
5.5 Training program ..... 140
5.6 Smile detection ..... 144
5.7 Implementation Conclusion ..... 149
6. Testing ..... 150
6.1 Cross Validation Test ..... 150
6.2 DECIDE framework ..... 154
6.3 Analyzing the results ..... 158
6.4 Conclusion of the test ..... 169
7. Discussion ..... 171
8. Conclusion ..... 172
9. Future perspective ..... 175
10. Bibliography ..... 176
11. Illustration List ..... 182
12. Appendix ..... 185
12.1 Storyboards ..... 185
12.2 Implementation Code ..... 197
12.3 Forms ..... 225
12.4 Test Results ..... 228

1. Introduction

The introduction of this report will serve as a guide through the first chapter. After this short introduction, the motivation of the problem will be presented.
The motivation explains why the group has found the problem interesting to work with and why it has been chosen for the project. The next step will be to examine other work in this area, not only to achieve inspiration for the product of this report, but furthermore to avoid creating a product which is merely a duplicate of an already existing product. Lastly, the information uncovered in this chapter will assist the group in formulating the initial problem formulation for the project, which will become the basis on which all further research is conducted. 1.1 Motivation The motivation for the problem of the project began with a simple thought: Imagine yourself or any person you know watch a movie - be it in the cinema, the theater or the TV. What is the most prominent action you imagine that person performing while experiencing this movie? There is a good chance the answer will be: “Not really anything”. Because that is often what happens when regular people sits down and watches a movie play out in front of them. Normally we take in the movie through our eyes and ears – and maybe feeling it somewhat, when we cry or get goose bumps – but it all takes place with us sitting perfectly still and doing nothing to contribute to the cause of the story. Usually, if the story of a movie is good enough, it is easy for us to be swept away and immerse our selves fully in it. Many people can imagine a person watching a scary movie and being so enticed by it, that they begin shouting advice like: “NO – don’t go in there!”, to the actor on the screen, completely oblivious to the fact that it will not influence the movie in any way whatsoever. Or maybe we can recognize the sensation of watching a movie that actually provokes such reactions from us, such as tears, joy or goose bumps. But all these influences are merely one-way. Everything that happens between the movie and us as the audience is that the movie can affect us in some way – it is not the other way around. 1 Group 08ml582 Reactive movie playback 1. Introduction Motivation What this project aims to establish, is the possibility for the audience to influence the movie, rather than only having the movie influence the audience. When considering what type of influence to have over the movie, the goal is to avoid rather simple interactions such as pressing either a red button or a blue button at a given time in the movie. Such a way of influencing a story exists already and we have to look no further than to popular mainstream videogames, such as Mass Effect (Bioware, 2007) or Metal Gear Solid (Konami, 1998) to find a story that is influenced by simple button presses. Illustration 1.1: The game Mass Effect uses user interaction – by pressing buttons – to alter the narrative of the game. This project aims at incorporating a more subtle form of influence over a story. Referring back to the example of the person shouting warnings at the character in the scary movie: What if the fear and dread experienced by this person actually determined what would happen in the story from that point on? Or what if a character in a movie came to a crossroad – facing a choice of whether or not to help another character out of a sticky situation – and would make his choice based on how the audience felt about this other character at this time in the movie? This all fall under the attempt to have the users’ emotion influence the story. 
By focusing on emotions and not simple button presses affecting how the story plays out, it is possible to let the user experience the story based on how he feels and not by what he does or simply how the director wanted the story to play out to begin with. It presents the opportunity of having a user alter the course of the movie, maybe without them even noticing or maybe not noticing until he watches the movie again and unconsciously moves the story in a different direction. This leads to many interesting questions to investigate, such as how personal the movie feels to the user, how the user would react to experiencing the movie a second time, having it 2 Group 08ml582 Reactive movie playback 1. Introduction State of the art change based on his emotions, but without him knowing about it and generally creates a great many possibilities to have the movie and the user influence and interact with each other in ways that the user might not expect or even notice. And this is why making the users’ emotions having an influence over the story forms a strong motivation and starting point for this project. The task is now to convert this motivation into a suitable project problem. 1.2 State of the art In order to know what has been done already in the area of interactive movies and measurements of users’ emotions, this chapter will look at previous attempts at doing interactive movies and examine what has been done in the area of facial expression detection, in order to interpret the emotions of a user. It is important to determine which ideas have already been realized in the area of interactive movies, since the making of a product mimicking the functions or ideas of a previous product is futile, offering nothing new to examine, test and analyze. Rather, it would be prudent to – at most – take inspiration from previous products but still ensure that the final product of this project will be unique and new, so that it provides a reason to test and evaluate it. 1.2.1 Interactive movies True interactive movies have existed in various forms for more than 40 years, but it is still a technology, which is not widely used when producing movies for the general cinema audience. In 1967, the Czechoslovakian director Radúz Činčera directed the movie Kinoautomat: One Man and His House, which was a black comedy movie, presented at the Expo 67 1; the movie was created to run on a system developed by Činčera, called Kinoautomat and made it the worlds’ first interactive movie (Kinoautomat, 2007). The movie was produced such that at certain points during the movie, a moderator would appear on stage in front of the audience and ask how the movie should develop from this point on. The audience had before them a red and a green button and could choose one of the options presented to them by pressing the corresponding button. The movie would develop according to how the majority of the audience voted. The movie was well received at the Expo 67 and the New Yorker wrote: The World Fair of Montreal 1967. A cultural exhibition held from the 27th of April to the 29th of October 1967 and received more than 50 million visitors in that period (Expo 67, 2007) 1 3 Group 08ml582 Reactive movie playback 1. 
Introduction State of the art “The Kinoautomat in the Czechoslovak Pavilion is a guaranteed hit of the World Exposition, and the Czechs should build a monument to the man who conceived the idea, Raduz Cincera.” (Kinoautomat, 2007) Furthermore, the movie was revived in 2007, by Činčera’s daughter Alena Cincerova (Cincerova, 2007). Despite the success back in 1967, the system was never widely used and interactive movies did not experience a great development in the following years. In 1995, the company Interfilm Technologies claimed to be the first to produce an interactive movie for viewing in cinemas, when they presented the movie Mr. Payback (Redifer, 2008). However, this movie was based on exactly the same principles as Činčera did in 1967. Mr. Payback was also made such that the cinemas showing it were required to have special equipment installed in their seats, in order for the audience to interact with the movie. Compared to Činčera’s movie, with a running time of 63 minutes, Mr. Payback would only reach a maximum playtime of approximately 20 minutes (Redifer, 2008). The movie never turned out to be a great success, and even though Interfilm Technologies did produce a couple of more of these interactive movies, the company was quickly shut down. With the development of personal computers, it has become possible to show interactive movies for people in their own homes, meaning that the interactive movie becomes more like a game, where the user can personally decide the branching of the story, instead of being “overruled” by other members of an audience. This has lead to experiments within interactive movies and it can be difficult to differ between an interactive movie and a game. One of the earliest approaches to an interactive movie as a computer game was the game called Rollercoaster from 1982 (NationMaster, 2005). The game was based on the movie of the same name, released in 1977. It was a purely text based game, a game which use text instead of graphics, describing everything in the game with words, but with the ability of triggering playback from a laserdisc, showing parts of the actual Rollercoaster movie. In 1983, the game Dragon’s Lair was released, which was the first commercial interactive movie game (Biordi, 2008). Opposite of a regular interactive movie, Dragon’s Lair had a winning and a losing condition, making choices in the game more like challenges than plain choices. Dragon’s Lair 4 Group 08ml582 Reactive movie playback 1. Introduction State of the art was quite successful, partially because of the great quality of the cartoon - produced by previous Disney animator Don Bluth - on which the game was built (NationMaster, 2005). A later experiment is the interactive drama Façade, which is an attempt to produce an interactive movie that involves the user as an actual actor in the drama that Façade is based on. Façade is developed by Andrew Stern and Michael Mateas, who together call themselves Procedural Arts. Andrew Stern is a designer, researcher, writer and engineer and has worked on other award winning software such as Virtual Babyz, Dogz and Catz (Stern & Mateas, 2006). Michael Mateas is an assistant professor at Georgia Tech and has worked on expressive AI in projects such as Office Plant #1 and Terminal Time (Stern & Mateas, 2006). Stern and Mateas claims to focus on people who are not used to playing computer games, but are more used to watching movies in cinemas and going to the theatre (Procedural Arts, 2006). 
1.2.2 Facial Expression Detection Having seen how previous attempts at interactive movies have been conducted, it is necessary to look at how it is possible to use other forms of interaction methods to let the user interact with a movie. As mentioned in chapter 1.1 Motivation one way could be to somehow measure the emotions of the audience. In this chapter, recent approaches to facial expression detection will be discussed, to give an idea of how to detect a user’s reaction to certain situations. In 2007 Sony introduced the groundbreaking “Smile Shutter” function for low budget pocket cameras. The cameras used a proprietary algorithm to make the camera shoot only when a smile was detected. The cameras also recognize faces to guide the autofocus function, making the cameras even more suitable and nearly infallible for point-and-shoot situations. The technology has also been integrated into video cameras, so that the camera only records when people are smiling (Sony Corp., 2007). Facial expressions often convey massive amounts of information about the mood and emotional status of a person. Since every human face is different, the detection of facial features such as eyes and mouth has been one of the major issues of image processing. Due to the complexity of the face, it is complicated to map all muscles and skin to an equation. Furthermore, the difference between human faces regarding colors, sizes, shapes, muscle contraction paths and hair growth makes the identification even more complex. Advanced face identification uses so-called hybrid approaches combining advanced neural networks, 5 Group 08ml582 Reactive movie playback 1. Introduction Initial Problem Formulation which can “learn” to detect expressions by training (which will be elaborated in chapter 3.8.1 Training). While neural networks will not be described in this report due to the semester theme, it is unavoidable when describing state of the art technology in this field. More simple approaches involve large, tagged photo libraries which are being feature-matched with the pictures to be checked for facial expressions. 1.3 Initial Problem Formulation Based on what has been researched in the area of interactive movies and face recognition, an initial problem formulation can be described. What was discovered through this chapter was that most interactive movies previously created, were based on users interacting through pressing buttons, at which points the movies were paused, to ask the users of their opinions. With the modern technologies within the field of face detection, there might be an opportunity to create a new form of interactive movies, which would not require the movie to pause at important dramatic situations. Thus, the following initial problem formulation for this project can be given: How can we create an interactive, uninterrupted movie, which is controlled by the users’ emotions? 6 Group 08ml582 Reactive movie playback 2. Pre-analysis Narrowing down emotions 2. Pre-analysis Having found an initial problem formulation which is kept general by default, the task is now to narrow it down and make it as specific as possible in order to have a focused goal on which to base the further work of the project. First off, emotions will be further specified and narrowed down to one specific emotion. Second, it will be determined how this emotion will be measured (facial expression, heart-beat measurement etc.). Overall elements of telling a story will be examined, such as defining the narrative, introducing Mise-en-scène. 
The method of implementation will be discussed, by comparing methods such as pre-rendering with run time rendering. The target group for the project will be discussed and determined in order to exclude inappropriate test participants. Finally the testing method for the project will be introduced and discussed, in order to explain what tests the product will undergo. 2.1 Narrowing down emotions Having decided upon using emotions as the method of user interaction with a story and all the elements within this story, the task is now to narrow the choices down and determine exactly which emotion to focus on. Because simply using “emotions” is far too wide a term, being that human beings are capable of displaying joy, fear, boredom, curiosity, depression, jealousy, bewilderment and many, many others, with new emotions continuously being discovered (The Onion, 2007). Therefore, aiming to include a wide variety of different emotions in this project presents the risk of completely eliminating any possibility of finishing on time. Before deciding which emotion should influence the narrative, it would be prudent to briefly discuss how to discover or measure which emotion the user is portraying – how to read the user. Should it be by measuring the pulse or heartbeat? The amount of perspiration of the user? Reading the facial expression or maybe even a combination of these methods? What would be preferred is not to require the user to undergo a long preparation process of attaching various sensors to his body, in order to conduct various body-examinations either during or after watching the movie. Requiring this of the user, produces the risk that he will become distracted from watching the actual movie, or maybe be too aware of sensors and the measuring equipment to be able to fully concentrate on the movie itself. Or he might simply lose interest in the movie, if too much preparation is required. Instead, a far better alternative 7 Group 08ml582 Reactive movie playback 2. Pre-analysis Narrowing down emotions would be to simply include non-intrusive measuring and have the user watch and interact with the movie in a way that ensures he can focus on the movie and nothing else. And so, the project will focus on reading the face of the user and read his emotions that way. But while many emotions are deeply connected to e.g. the heartbeat, such as stress or excitement (Bio-Medicine.org, 2007), there is still many different emotions that can be read from the face alone. These include among others happiness, sadness, disgust and surprise (Hager, 2003). When looking to mediate an effect to a user, it is prudent to ask: Do we want to make people happy, sad, bored, disgusted etc? A large part of this project will be to evoke an emotional response in the user and as it can be seen in chapter 1.1 Motivation, TV and movies are very adept at doing exactly this. When looking at the emotions found at the website: A Human Face – Emotion and Facial Expression (Hager, 2003), such as happiness, fear or disgust, and comparing them to movies and TV, there are generally two of the emotions that stand out: Happiness and fear. Two entire movie-genres focus almost entirely on these emotions: The comedy and the horror movie genre. A few well-known examples of comedies, are Ace Venture: Pet Detective (Shadyac, 1994) and Monty Python and the Holy Grail (Gilliam & Jones, 1975), while famous examples of horror films are such films as The Shining (Kubrick, 1980) and Psycho (Hitchcock, 1960). 
Happiness is also heavily featured on TV, with many sitcoms showing, focusing solely on making people laugh. Shows like Friends (Crane & Kaufman, 1994), Seinfeld (Larry & Seinfeld, 1990) and Everybody Loves Raymond (Rosenthal, 1996) are very popular examples of sitcoms. And even if happiness is not the main purpose of the production, a multitude of movies and TV-shows include jokes and humor. Horror is also featured in TV-series, although somewhat less prominently. Shows like The Scariest Places on Earth (Conrad & Kroopnick, 2001) Are You Afraid of the Dark (Peters & Pryce, 1991) focus primarily of horror and suspense and are examples of horror as the main goal on TV-shows. 8 Group 08ml582 Reactive movie playback 2. Pre-analysis Narrowing down emotions When looking at the other emotions described by Hager, such as anger or disgust, few movies or TV-series exist that focus mainly on making people angry or disgusted, with a movie like Hostel (Roth, 2005) being one of few movies focusing on disgusting the audience to achieve an effect. And while the element of surprise play a big role in movies like The Usual Suspects (Singer, 1994) and The Game (Fincher, The Game, 1997), focusing on the particular emotions of anger or disgust occurs with far less frequency than fear or happiness. Based on the spread of these emotions in screen media, it is reasonable to suggest that fear and happiness are the two most widely used emotions to focus on. And so, the final choice comes down to either one of these. Looking at a user of the product of this project, it should not be a requirement to invest a long time to be able to naturally interact with the product. What is meant by this can be explained with how either happiness or fear is achieved: When considering how to achieve fear in a movie, suspense is often involved, e.g. when watching the movie and knowing that danger lurks nearby. But it is not often that fear is achieved in a few seconds without any prior action or knowledge. With happiness, it can vary from involving a long story that ends in a humorous climax or it can be as short as a one-liner, such as a play-on-words like: “Why is it that when a door is open, it's ajar, but when a jar is open, it's not a door? “ People who e.g. suffer from arachnophobia will feel fear immediately when seeing a spider, but phobias are very individual and as such, a certain phobia cannot form the basis of making the user feel fear. This means that there are more possibilities for producing happiness than fear, regardless of how much time the user invests. Next, it is prudent to think about how to recognize the features of a facial expression showing happiness and fear. When thinking of a person smiling, it is extremely hard not to think of that person as happy in some way, be it joyful, satisfied, cheery etc. The degree of the smile will vary based on the emotion portrayed, but no matter what, a smile will generally be associated with the feeling of happiness and it is therefore arguable that detecting happiness can be simplified to detecting a smile. 9 Group 08ml582 Reactive movie playback 2. Pre-analysis Narrowing down emotions But is it just as easy to determine the expression of fear? Looking at the fearful expression such as the one on Illustration 2.1, we can see open eyes and a wide mouth to indicate the emotion, but neither the eyes nor the mouth suggests the emotion alone. Illustration 2.1: The open eyes and wide mouth a both necessary to express fear. 
Some people suggest that the most important part of the face when showing fear is the eyes since they widen when we are afraid (American Association for the Advancement of Science, 2004), but wide eyes alone do not exclusively show fear. Amazement also includes wide-open eyes as shown on Illustration 2.2. Illustration 2.2: Here it can be seen, how wide eyes also help in express amazement. Finally, different mouth-shapes span a far greater area of the face than the eye-shapes, resulting in a higher degree of variations between expressions, making it easier to focus on 10 Group 08ml582 Reactive movie playback 2. Pre-analysis Storytelling in a movie the mouth than the eyes when recognizing emotions. And because happiness can be more easily defined by the mouth alone than fear, which needs more than just the mouth, it becomes the easier expression to recognize. Based on these traits of the emotion of happiness, along with it being the easier emotion to recognize, happiness will be chosen as the emotion to focus on in this project. As discussed in chapter 1.1 Motivation, there can be many ways to invoke certain emotions of a person watching a movie, a theatre play etc. In order to make a user smile, it would be prudent to use humor, given the following definition from The American Heritage® Dictionary of the English Language: 1. The quality that makes something laughable or amusing; […] 2. That which is intended to induce laughter or amusement. (American Heritage® Dictionary, 2008) Based on this definition, humor is very useful for the purpose of this project, given that the purpose is to find a smile on the user. As humor is the quality that makes something laughable, it is possible to make the user smile, given that he understands the humor. Therefore, it might also be prudent to investigate different types of humor, to make it fit the user. This will be discussed in chapter 3.5 Humor. This indicates that several types of humor can be necessary in order to increase the possibility of a user smiling. 2.2 Storytelling in a movie It is important to know how to tell a story in order to captivate the viewer, convey a message through a story or even trigger emotions such as fear, sadness, happiness. Different genres use different techniques and methods in order to invoke certain feelings in the viewer, and as such, knowing how to incorporate and use these is an important part of telling a story. Telling a story is not as simple as it might seem. It involves several factors which each plays a vital part in the overall concept of telling a story. Firstly, the person to which the story is told, who will be referred to as the user, is bound to have certain expectations. The user will expect the story to contain one or more characters, and he will expect that a series of events, which are connected in some way, will happen to this character. The user will furthermore assume that some problem or conflict will arise during the story, but also that this will be resolved or at least handled in some way (Bordwell & Thompson, 2008, s. 74). 11 Group 08ml582 Reactive movie playback 2. Pre-analysis Storytelling in a movie When we are watching a film, we are being told a story. During this story, we pick up clues, we store information of what happened when, where, how, why, and our profound expectations are being manipulated in order to create surprises, suspense and curiosity. When we get to the end, we usually find that our expectations have been satisfied. 
Although it can also occur that they have been cheated which makes us see the past events in the story from a new perspective (Bordwell & Thompson, 2008). The group members are to take on the role as filmmakers, it is important to have an understanding of how to evoke suspension, surprises etc. in the audience. Specific methods to create e.g. suspense and expectations will be explained in chapter 3.3 Lighting, sound and acting and 3.4 Camera movements and angles. 2.2.1 The Narrative David Bordwell holds a master’s degree and a doctorate in film from the University of Iowa. Furthermore he has won a University Distinguished Teaching Award and has been awarded an honorary degree by the University of Copenhagen. Of his several books are Narration in the Fiction Film, On the history of film and Figures traced in light: On cinematic staging (Bordwell & Thompson, 2008). Kristin Thompson holds a master’s degree in film from the University of Iowa and a doctorate in film from the University of Wisconsin-Madison. Her publications include Storytelling in the new Hollywood: Understanding classical narrative technique and Storytelling in film and television (Bordwell & Thompson, 2008). Bordwell and Thompson define the narrative as: “[…] a chain of events in cause-effect relationships occurring in time and space.” (Bordwell & and define the story as: Thompson, 2008, s. 75) “The set of all the events in a narrative […].” (Bordwell & Thompson, 2008, s. 76) The narrative begins with a situation. Then a series of changes or alterations happen, which are all related to a cause-effect pattern. And in the end, a new situation appears, which may be that our expectations have been fulfilled or that we see the pattern of cause and effect from 12 Group 08ml582 Reactive movie playback 2. Pre-analysis Storytelling in a movie another perspective (Bordwell & Thompson, 2008), such as in The Usual Suspects, where the ending reveals that the entire story has been made up by Verbal Kint. This knowledge will be valuable when making the storyboards. The basic building blocks of a story needs to be in place, which means that the chain of events in a cause-effect relationship has to be maintained. 2.2.2 Mise-en-Scène The meaning of this term is “putting into the scene”. It is used to describe what appears within the film frame, which could e.g. be the location, what kind of props, characters etc. to include. It is a term that describes the director’s power and control with regards to the event that the camera is to record. The following are aspects of Mise-en-Scène, and they will each be shortly introduced (Bordwell & Thompson, 2008, s. 112): • • • • Setting Costumes and makeup Lighting Staging There are several ways to control the setting. The director can e.g. choose to film at a location with already existing materials, or he can construct the setting. This means building a studio which in many aspects such as sound and lighting, provides the director with more control. Furthermore the setting often plays a vital role in films in the way that it can contain clues and/or other information about the story or characters. Similarly costumes and makeup also have important purposes. Usually they assist the viewer by providing visual information, e.g. to emphasize a poor family, they would all wear worn-out dirty clothes and have sad looks. But should this family find financial fortune, the costumes and makeup could help illustrate this transformation, e.g. by outfitting them with better clothes, new jewelry etc. 
Manipulation of lighting has an enormous influence on how we are affected when watching a film. Lighting can guide our attention to certain objects or characters merely by using darker and lighter areas, e.g. to illuminate important objects in a scene. Suspense can be built up by concealing details in shadows, and key gestures or clues can be pointed out by using brighter light on them than in the rest of the scene. The various techniques, theories and methods of lighting will be elaborated in chapter 3.3 Lighting, sound and acting. Staging refers to movement and performance, which involves several elements like gestures, facial expressions, sound etc., and generally setting up the entire scene. It is the control of what the figures or characters should do while being in the scene and how they should act (Bordwell & Thompson, 2008). The tools of Mise-en-Scène represent control of what is in the scene. Thus, these tools will become very useful when designing storyboards. They can be used as a guideline or checklist for what the director needs to be aware of, specifically in the design of the storyboards, which is where the original stories will be created.

2.3 Rendering methods

Referring to the preface, it is a direct requirement of the semester theme that 3D animation is incorporated in the project. This excludes other mediums of animation, such as hand-drawn animation. In order to achieve the best end result regarding the animation, various rendering methods must be examined. There are two main methods of rendering that need to be taken into consideration before doing any kind of computer-animated video production: real-time rendering and pre-rendering. As the names indicate, one method renders everything as it is used, while the other renders all the material before it is to be used. There are pros and cons to both methods, and this chapter will explain them and thereby find the best possible rendering method for this project. First of all, it is necessary to explain what rendering actually is. The reason for doing rendering – at least in the form that is suitable for this project – is to get 3D models transformed into a 2D image, projected on a screen. A screen is a plane and as such it can only display two dimensions (height and width), so when doing 3D modeling on a computer, rendering is constantly taking place, in order to show the modeler what he is doing to his (perceived) 3D model.

Illustration 2.3: The 3D model is projected onto the image plane, point by point. By projecting all points and connecting them in the right order, a 2D projection of the model is obtained.

Illustration 2.3 shows that the image plane is what will be the screen in a rendering of a computer image. In the illustration, not all points of the cube are shown being projected, although all of them should be; the projection of all the points onto the plane would give a 2D representation of the 3D object (Owen, 1999). For any object that is within the scene spanned by the image plane, such a projection would have to be made for each point (or vertex, as it is more commonly called in computer graphics) on each object. If the scene were a game or similar, with multiple objects and many things happening simultaneously, there would be millions of vertices to project.
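To make the projection step concrete, the following small C++ fragment sketches how a single vertex could be projected onto the image plane using a simple pinhole model. It only illustrates the principle behind Illustration 2.3 and is not code from the project's implementation; the camera position at the origin and the focal length f are assumptions made for the example.

#include <iostream>

struct Vec3 { double x, y, z; };   // a vertex of the 3D model
struct Vec2 { double x, y; };      // its projection on the image plane

// Perspective projection with the camera at the origin, looking along the z-axis.
// 'f' is the assumed distance from the camera to the image plane.
Vec2 project(const Vec3& v, double f)
{
    // Similar triangles: x' = f * x / z and y' = f * y / z
    return Vec2{ f * v.x / v.z, f * v.y / v.z };
}

int main()
{
    Vec3 corner{ 1.0, 2.0, 5.0 };             // one corner of a cube, as in Illustration 2.3
    Vec2 p = project(corner, 1.0);
    std::cout << p.x << ", " << p.y << "\n";  // prints 0.2, 0.4
    return 0;
}

Repeating this computation for every vertex of every object, frame after frame, is what makes the amount of work grow so quickly, as discussed next.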
As it is impossible to predict how a user would move around such a scene (the user should be allowed to move around inside the game world), the projection has to be done frame by frame. This means that for every frame the computer will have to compute the projection of each of the points in the scene onto the image plane. There are many computations and the reason why computer games tend to use rendering methods, which are as cheap as possible with regards to the amount of computations. 15 Group 08ml582 Reactive movie playback 2. Pre-analysis Rendering methods In animated movies however, there is only one way the scene can evolve, which means that the scene can be rendered before the user sees the movie. In this situation, there is theoretically unlimited time to do the rendering, which means that it is possible to choose very computationally complex rendering methods. Methods like ray-tracing (which will be explained in chapter 2.3.2 Pre-rendering) and radiosity 2 are examples of methods used primarily in animation video rendering. The following chapters will look briefly into real-time rendering and pre-rendering, to justify the final choice of rendering method for this project, in terms of production time and image quality. 2.3.1 Real-time rendering Real-time rendering is as the name suggests rendering which happen in real-time. The realtime rendering method makes the content highly customizable since the variables of the movie can be changed real-time. This makes it possible to create a multitude of different movies, just by changing one parameter like for example camera angle. The dynamic of the real-time rendering is the big advantage. One disadvantage of real-time rendering is that the rendering has to be very fast in order to maintain a proper frame rate. The minimum frame rate sets an upper limit for the complexity of the rendering. This means that heavy computational effects, like the before mentioned raytracing or radiosity, might not be useable. High polygon models are also less appropriate to use when utilizing a real-time rendering engine. This indicates that real-time rendering will be less suited for this project, since image quality is at risk of being compromised in order to maintain a decent frame rate. Furthermore, the use of high polygon models might be required in the animation, which also indicates real-time rendering as being a poor choice. 2.3.2 Pre-rendering As opposed to real-time rendering, pre-rendering will do all the work before the final rendering is to be shown to a user. As already mentioned in this chapter, this presents an opportunity for doing rendering, which is much more computationally complex, since the rendering time is not an issue. Typical examples of software for doing pre-renderings are programs such as MAYA, 3D Studio Max or Blender. Some are open source programs (for 2 Radiosity is the computation of the color of one diffuse object, based on reflected color from all other surfaces in the scene (Spencer, 1993) 16 Group 08ml582 Reactive movie playback 2. Pre-analysis Rendering methods instance Blender) and some are expensive software (MAYA or 3D Studio Max), but the programs are similar in functionality, so choosing one over the other is mainly based on personal preferences. However, MAYA is considered to be the industry standard in professional character animation studios (Animation Mentor, 2008), so this indicates that Maya can be the best choice for this project. 
One of the possible techniques that can be used in pre-rendering is ray-tracing. Ray-tracing is based on tracing rays from the eye/camera to a light source. Ray-tracing will not be discussed deeply in this chapter, but the theory will shortly be discussed, to give an impression of the opportunities of pre-rendering compared to real-time rendering. Illustration 2.4 shows the effects of doing ray-tracing. Illustration 2.4: By using ray-tracing it is possible to do realistic reflections and simulation of lighting, as the theory is based on tracing rays of lights in the image. Since every ray should ideally be traced all the way from the eye to a light source – or a diffuse object, meaning an object which reflects light equally in all directions – there will be many rays to do computations on. Some rays may be refracted and some reflected and for each new point which a ray hits, a new computation is to be done. In programming, this would mean a recursive function, which should follow each ray through each of its reflections and refractions. This gives more computations to perform than the renderings done real-time, which will lead to reduced frame rate. 17 Group 08ml582 Reactive movie playback 2. Pre-analysis Target group Since this project is focused on doing an animated movie, the quality of the image has a high priority. Consequently, the rendering method will be pre-rendering, since this offers the opportunity to produce images of higher quality, by e.g. including methods, such as ray tracing, in order to obtain realistic reflections in water or mirrors, if needed in the product. And since there is no need for moving around in the animations, there is no need to compromise image quality. With regards to what tool to use in creating the animated movie clips, there are - as mentioned – several possibly programs to choose from, but since MAYA is considered industry standard and that MAYA is available in a free learning edition, which offers almost all the functionality of the full version, MAYA will be the choice of software for 3D modeling, animation and rending in this project. 2.4 Target group In this chapter, a target group will be defined for the project. To shape the initial problem statement into the final problem statement, it is necessary to identify what audience the solution aims at. This will be done mostly by choices made by group consensus and will not include complex, social group classifications and lifestyle attitude segments. This chapter will analyze the choices made in order to establish who the target group will be. For the purpose of this project, a target group could be defined as being the audience of interactive exhibitions and museums. Since the goal of the project is to test whether or not a certain technology works in collaboration with a reacting movie, the audience for such an application is wide. What is needed is users who are interested and wants to actively participate in an interactive communication. Many museums feature non-interactive, non-animated artworks which do not captivate the audience in the same way as for example a science center exhibition. The need for this kind of interactive exhibition manifests itself in for example the amount of visits at the science center Experimentarium in Hellerup. The visits per year have been steady since 1991 with around 350.000 per year (Experimentarium, 2008) which is almost as many as the biggest “ordinary” museum: Statens Museum for Kunst (Ohrt & Kjeldsen, 2008). 
The target group of this project could be the audience of the museum or exhibition in which the solution of this project would be located. However, the product is not restricted in its application area to museums and/or exhibitions; these are simply suitable areas for testing the product. There is no need for an upper age limit in the target group, since the users only have to be able to watch a movie. However, there is a lower age limit for the interactivity element. As a lower limit, an age of 18 years is chosen. This is primarily due to legal considerations, as people over 18 do not need permission from their parents or guardian to participate in the test. The placement in an exhibition provides one major advantage: it is easier to attract an audience, because people at an exhibition or museum are most likely already interested and curious about what is happening. An eye-catching feature of the product is therefore less important, and production time in the project can be spent elsewhere. The interpretation and understanding of art and art installations is extremely subjective, hence it is complex to measure the experience and its effect. The understanding of the art is not the main issue in the test of the product, since it is possible to produce quantifiable results from measuring how long and how often the users are smiling. Furthermore, quantifiable results can be obtained by asking users to fill out questionnaires afterwards.

2.5 Testing the product

For this project there are two parts to be tested: the interaction method itself and the consequences the interaction has on the users. For the interaction method – the smiling of the user – a so-called cross validation will serve as a test of its successfulness. Cross validation works by taking a predefined amount, usually 10%, of the training data (for example a set of images) and using it as test data against the remaining 90% of the training data, in order to determine the validity of the training data and data processing (Moore, 2005). In the case of this project, the program can read the user as either "smiling" or "not smiling", so the cross validation will only test these two options. So, taking e.g. 10 pictures from the training data for the smiles and 10 pictures from the training data for the non-smiles, the result could be a table like the one seen in Table 2.1.

              Detected as smile   Detected as non-smile
Smiles        85%                 15%
Non-smiles    13%                 87%

Table 2.1: An imaginary example of the results of a cross validation. The higher the numbers in the diagonal from top left to bottom right, the better.

The more of the smiles from the training data that are actually identified as smiles, the better. The same applies for the non-smiles. Ideally, there should be a 100% detection rate, but that is not realistic to achieve. However, values around 50% would amount to random detection, so the detection rate should be closer to 100% than to 50%, meaning at least 75% is the goal for the detection rate of this cross validation. Furthermore, a higher detection rate is desired, so a detection rate of 80% is going to be the success criterion for the smile detection application. For the second test, users will have to be involved.
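Before turning to the user test, the tallying of such a cross validation can be illustrated with a short C++ sketch. It is a hypothetical example rather than the project's actual test code; the vectors actual and detected are assumed inputs holding, for each held-out image, the true label and the detector's output.

#include <cstddef>
#include <cstdio>
#include <vector>

// Tallies the 2x2 table from Table 2.1. Rows are the true labels of the
// held-out images (smile / non-smile); columns are what the detector reported.
void printDetectionRates(const std::vector<bool>& actual,    // true = the image really is a smile
                         const std::vector<bool>& detected)  // true = the detector reported a smile
{
    int counts[2][2] = { { 0, 0 }, { 0, 0 } };                // [actual][detected]
    for (std::size_t i = 0; i < actual.size(); ++i)
        ++counts[actual[i] ? 0 : 1][detected[i] ? 0 : 1];

    int smiles    = counts[0][0] + counts[0][1];
    int nonSmiles = counts[1][0] + counts[1][1];
    // The diagonal of Table 2.1; both rates should reach at least 80% to meet the success criterion.
    if (smiles > 0)
        std::printf("Smiles detected as smiles:         %.0f%%\n", 100.0 * counts[0][0] / smiles);
    if (nonSmiles > 0)
        std::printf("Non-smiles detected as non-smiles: %.0f%%\n", 100.0 * counts[1][1] / nonSmiles);
}

In the actual test, such a routine would be fed the results of classifying the held-out portion (e.g. 10%) of the labelled training images.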
The goal of the final test is to determine whether the program’s interpretation of when the user smiles, is consistent with when the user himself thought the movie was funny. In order to ensure a good possibility of having the user smile and thus enable to program to register this smile, it would be sensible to include short movie clips containing several different types of humor, rather than merely showing him one long movie containing only a single type of humor. This would result in the program adapting to the user’s preferred humor type and focusing on this particular type of humor. It is important that the users are not informed of the specific way of interacting with the movie. The goal is to let the program register natural smiles, rather than having the users deliberately smile in order to change the way the movie is played, and it becomes difficult to determine if the user thought the movie was funny, if he is merely smiling with the objective of making the program react. This is a very subjective perception, which could vary from test person to test person and thus, it can be difficult to produce measurable data. One method of obtaining quantifiable results would be to make the users fill out questionnaires about their experience with the product. However, qualitative results, such as the opinions expressed through an interview could also be appropriate, in order to get an understanding of the users’ experience. In chapter 0. 20 Group 08ml582 Reactive movie playback 2. Pre-analysis Delimitation Testing, a more detailed description of the test will be presented. The testing framework called DECIDE, will be used to plan and evaluate the tests. The DECIDE framework is widely acknowledged within professional companies such as Nokia for the purpose of usability testing (Sharp, Rogers, & Preece, 2006) and will be sufficient for the usability tests that is to be conducted in this project. 2.6 Delimitation During this pre-analysis, it has been decided that rather than measuring users’ emotions as a whole, the project will focus only on registering their smile and make the movie react according to that, so the product will essentially focus on creating a reactive movie – where the user subconsciously controls the movie (i.e. the user will not be constantly be informed of the need to interact with the movie) – rather than regular interactive movie, such as the examples described in chapter 1.2.1 Interactive movies, which requires the user to part himself for the immersion of the movie to make a conscious and deliberate choice of how it should unfold. Using pre-rendering methods ensures a high quality of the movie. Since the movie is not supposed to allow the user to move around freely, there is no need to compromise image quality, such as ray tracing, by choosing a game engine. The target group will be people above the age of 18, who are set into a pre-decided test- environment, being unaware that they will influence our story. The testing will be performed in two steps. The first step will focus on the reaction method itself, which involves performing a cross validation test on this method. The second step will involve users, by making the users try the final product, which will involve a functional smile detection program and fully animated movie clips. Questionnaires will be used in order to produce hard, measurable data in this case. 
With the goal of creating a program, which can adapt to a user’s reaction to certain movie clips, it has been established that the product needs to detect when the user is smiling and use this information to change the type of humor of the movie that the user is watching. Thus a final problem formulation can be expressed. 21 Group 08ml582 Reactive movie playback 2. Pre-analysis Final Problem Formulation 2.7 Final Problem Formulation How can we, through the use of smile detection, create an adaptive program, which reacts to the smile of a user by changing the type of humor in a pre-rendered movie? 22 Group 08ml582 Reactive movie playback 3. Analysis Delimitation of the character 3. Analysis This chapter will analyze the various aspects of the final problem statement and go into depth with them. First of all, in order to be able to create the necessary movie clips, theory of narration has to be explored, since a structure must be applied to the movie clips to ensure that they become structured and connected and appear as part of the same whole. Afterwards, theories of light, sound and acting will be discussed and it will be explained how these elements can be used in correlation with this project and how to use cameras properly in order to obtain what is needed of the movie. Since this project will use humor to obtain the results needed, this chapter will also go into discussing humor types in connection to animated movies. Finally, various programming languages and environments for creating the smile detection will be discussed and theories of detecting both a face and a smile will be examined. 3.1 Delimitation of the character The reason for this subchapter is to avoid unnecessary analysis with regards to the humor in the movie clips. There will not be any dialog in the animations, because the project character is not going to speak. The reason for this involves lip synchronization, also known as lip-sync which is the term for matching the lip movements with a voice. Among animators it is considered that lipsync is a very time consuming and animation-wise often very challenging process to make it look right. According to Richard Williams The Animator’s Survival Kit (Williams, 2001), the animator should focus on the body and facial attitudes first, and then decide if it is necessary to include working with the mouth. (Williams, 2001, s. 314) 23 Group 08ml582 Reactive movie playback 3. Analysis Narrative structures Furthermore Williams suggests that one should not have too much going on with regards to e.g. how many poses a character should have for a certain sentence. “Keep it simple” is the advice, and aspects of dialogue such as accents, the sharpness of the voice and making the words appears as words, not just a collection of letters, lip-synching requires far too much effort to make it look right, for it to be worth including in the animation, given the time available for this project. This is the argumentation for excluding the mouth and thereby dialogue in the animations. The character design will aim to comply with the “keep it simple” mantra, and the goal will therefore be to make the viewer understand what is going on in the scenes, using the movements, gestures and facial expressions of the character. 3.2 Narrative structures When performing any kind of storytelling, whether it is on a screen, in a book or through any other media, it is important to be aware of how to tell the story. The structure of the narrative has a big influence on how the story is perceived by the viewer. 
Without a narrative structure, a movie can be very chaotic in terms of e.g. the level of excitement throughout the story or the number of acts of the story. This chapter will focus on different narrative structures, with the goal of finding the most suitable structure for the purpose of this project. 3.2.1 The dramatic Structure One of the structures, which have been researched the most throughout history, is the dramatic structure. The Greek philosopher Aristotle was the first to critically describe the theories of the drama (Clark, 1918, s. 4-5). Aristotle is the author of the book named Poetics (350 B.C.), in which he describes the origin and development of poetry and the tragedy. Aristotle also mentions the structure of the comedy in the Poetics, but this description of the comedy is believed to have been lost, so only the tragedy can be described. However, as this has been the basis of so many dramatic and poetic theories since then, such as using Aristotle’s 3 act structure in the movie Gladiator (Scott, 2000), this chapter will take into consideration the works of Aristotle in explaining the dramatic structure. Aristotle believed that humans are born with the instinct of imitation and that we from early childhood learn by imitating the actions of other living creatures (Aristotle, 350 B.C.). This is the reason why we as humans are so intrigued by movies and other acts, which essentially 24 Group 08ml582 Reactive movie playback 3. Analysis Narrative structures imitates human behavior. Even though many visual and auditory effects are used in obtaining this imitation of real life, the most important part is still the actions of the story (the Mythos), according to Aristotle, which closely relates to the definition of Bordwell & Thompson, as described in chapter 2.2.1 The Narrative. In Aristotle’s terms, there could be no tragedy (and thus no other form of narrative) without any Mythos. Since the presentation of the tragedy by Aristotle, many have discussed, praised and criticized the structure which he proposed. In 1863, the German philosopher Gustav Freytag published his book Die Technik Des Dramas (Freytag, 1876), in which he structured Aristotle’s theories in a 5 acts system and this structure is very much similar to what we know today as the Hollywood model. Illustration 3.1: Gustav Freytag’s dramatic structure evolves such that the excitement of the story rises until a climax is reached, where after the action is falling, until the catastrophe is reached, which in dramatic terms means the final resolution of the problem. Illustration 3.2: The Hollywood model in many ways resembles the model of Freytag, as it also reaches its highest excitement level at a climax and then fades out, leaving the audience at the same state as when the movie started. As it can be seen in Illustration 3.1 and Illustration 3.2, there are many similarities between Freytag’s model and the “modern” Hollywood model. The idea behind both models is that the 25 Group 08ml582 Reactive movie playback 3. Analysis Narrative structures characters and the conflict are introduced in the beginning of the story. The characters, setting and environment is described to set the scene of the story. After the introduction a conflict or a change occurs, which somehow influence the characters. The introduction of the characters and environment are crucial, because the rest of the story builds upon the first impressions of the character. 
Freytag also states that the introduction is perhaps the most essential part of the five acts and the introduction should be weighted out carefully against the rest of the narrative. It should also have a clear connection to the catastrophe, in which the problem of the story is resolved (Price, 1892, s. 76-94). This is also why some refer to it as Freytag’s triangle, as a line could be drawn directly from the end (the catastrophe) back to the beginning (the introduction), where a new story would start. The conflict of the narrative is further complicated in the next act of Freytag’s structure and in the Hollywood model, this happens through an elaboration and an escalation. At the highest point of excitement, the climax is reached, illustrated by a solution to the problem, followed by a fast decline in excitement, until the narrative ends (Price, 1892, s. 94-111). 3.2.2 Other narrative structures Having discussed the most common narrative structure, the dramatic structure, other narrative structures can be discussed. As mentioned in the introduction of this chapter, the movie clips that will be produced in this project will be very short clips. It is difficult to go through the entire development of the dramatic structure in such a short time and thus other alternatives should be explored. The episodic structure is one alternative, in which the action is based on small episodes. This structure is used in TV series where there might or might not be a connection between one episode and the next, but where each episode should have some sort of dramatic excitement, to keep the viewers interested. Most often though, the dramatic structure is applied to each episode, which would again require each of the clips produced in this project to have a structure similar to that of the dramatic structure. Another approach could be the star shape structure. The idea of this structure is to have an origin, to which the story returns periodically. 26 Group 08ml582 Reactive movie playback 3. Analysis Narrative structures Illustration 3.3: The star shape structure has its point of origin in the center and the story evolves according to – for instance – the main character’s immediate state of mind. Using this structure fits well with having small clips, as the development of the narrative would essentially develop according to whether the viewer is smiling or not. The viewer’s smile (or lack of so) will be the cause of whatever action will happen, where the story goes from the center and out to one of the events (1-8 in Illustration 3.3) - and the effect will be that the main character acts according to the cause. For further development of the idea, the star shape structure could be implemented in an overall dramatic structure, such that the narrative develops in a more classic way, but still having the character return to a relatively similar point of entry between all events of the narrative. The structure of each of the small clips could also have a short, simplified dramatic structure. The introduction to the character need only happen once, since it is the same character that experiences all the events in the narrative. An example of a scene could be that the character is faced with the problem of a locked chest. First he examines the chest, noticing that it is locked. As he tries to unlock the chest the excitement rises until suddenly, the chest pops open, essentially representing the climax. 
Whatever is inside the chest could amuse the character, and the fade-out could be the character playing with his newfound toy. The content of the scenes will be described in more detail when analyzing the different kinds of humor and how they should be presented to the viewer. However, the overall narrative structure of the clips should fit inside the star shape structure, such that all effects are caused by the user. The small individual clips should have some dramatic development applied, in order to keep the user interested, even if it is not as detailed as in the full dramatic structure.

3.3 Lighting, sound and acting

Having explored some aspects of narration and how to use them to tell a story suited for this project, it is necessary to look at the elements that will influence the perception of the movie clips: lighting, sound and acting. When utilized correctly, each of these elements can be very effective in e.g. creating suspense, supporting a mood, or helping and guiding the viewer. This chapter will introduce the terms and theories relevant for utilizing lighting, sound and acting in the movie clips and describe how they can be used in the project.

3.3.1 Lighting

The elements that help make sense of a scene's space (the entire area of the scene itself) are highlights and shadows, where shadows have two further components: shading and cast shadows. There are four features of film lighting (Bordwell & Thompson, 2008):

• Quality – the relative intensity of illumination.
  o Hard light
  o Soft light
• Direction – the path of light from its source(s) to the object lit.
  o Frontal lighting – eliminates shadows.
  o Side-lighting
  o Back-lighting – creates silhouettes.
  o Under-lighting – distorts features; used to create dramatic or horror effects.
  o Top lighting
• Source
  o Key light – the primary light source.
  o Fill light – secondary, softens the shadows cast by the key light.
  o Back light – secondary.
  o High-key lighting – used for comedies, adventure films and dramas; low contrast and soft-light quality.
  o Low-key lighting – used for mysterious effects; creates high contrast and sharper, darker shadows; the quality is hard light, and the amount of fill light is small or none at all.
• Color
  o The standard way is to work with as white a light as possible, which can then be manipulated either with filters on the set or in software later in the process.

As can be seen in the list above, there are numerous ways to use different kinds of light. However, it is already possible to narrow down the options at this point, since the storyboards will only include humor and comedy. This means that elements like under-lighting and low-key lighting are unlikely to be used, since their primary use, according to the list, lies within the horror genre. The lighting needs of this project are rather simple: the only specific requirement is that movements, gestures and facial expressions are clearly visible. This is crucial since there will be no dialog present.

3.3.2 Acting

The performance of a character consists of visual elements and sound elements. Appearance, gestures and facial expressions belong to the visual elements, while e.g. music and off-screen sound belong to the sound elements (Bordwell & Thompson, 2008).
The project character will not be able to speak, and the visual elements and sound effects must therefore fill this void, meaning that his movements and facial expressions will largely be in charge of telling the story. To fulfill this task, the movements and facial expressions will draw on the exaggeration aspect of Disney's 12 Principles of Animation (elaborated in chapter 4.4.2 Disney's 12 Principles of Animation). A situation where the character is surprised could e.g. be shown by making him jump a large distance upwards or by having his eyes grow and pop out of his head. Furthermore, his gestures must be easily recognizable and understandable: it is important that a person watching is able to follow and understand what he sees without the help of dialog.

3.3.3 Sound

Sound is a powerful element in films, and using it the right way provides several interesting possibilities. By using sound, the director can alter the way a viewer sees and understands the images. In an example from Bordwell & Thompson, the viewer is shown a series of images accompanied by three different soundtracks; each one leaves the viewer with a very different understanding than he would have gotten had he watched either of the two others. The only changing element in this example is the sound. It should be noted that a narrator is also heard along with the soundtracks in this example, which has a major effect on the different ways they are understood (Bordwell & Thompson, 2008, s. 266).

Another important feature of sound is its ability to direct attention to a specific area, object etc. A very simple example is the narrator mentioning an object while that object appears on the screen; the viewer's gaze will almost certainly turn towards the mentioned object. Sound can also create anticipation along with direction of attention. The sound of a squeaking door will make the viewer think of a door, and if a door appears in the following shot, that will be the focus of the viewer's attention, along with wondering who might enter. If the door remains closed, however, the interpretation process begins: one might think that it was not the door but something else. This is one example of how sound can be used to make people feel more immersed in films. Thus sound can clarify events, contradict them, and also help the viewer form expectations (Bordwell & Thompson, 2008, s. 265).

These ways of utilizing sound can definitely be incorporated in the project. Some of the actions of the character could be clarified by the use of sound; e.g. a squeaking sound from the floor boards would help clarify the action and set the mood in a situation where the character attempts to be quiet and therefore walks in a sneaky manner. Sound effects are very likely to be used for other clarification purposes as well. Popular examples from the cartoon arena are the long whistle when something heavy is falling towards the ground, or the windy "woof" when something disappears rapidly (Jones & Monroe, 1980).

Another technique is the use of off-screen sound, which can make the viewer more interested and immersed, in the sense that he cannot see what is going on, only hear it.
Off- screen sound can create the illusion of a larger space than the viewer will ever see, and it can shape the expectations about how the scene will develop (Bordwell & Thompson, 2008, s. 279). With regards to the creating of the storyboards, off-screen sound could prove very useful. Firstly because of its’ ability to make the viewer more immersed, and secondly - and more practically - because it can reduce animation time considerably, since the event causing the sound does not have to be animated, but can be imagined by the viewer alone. A different tool to use revolves around whether to use diegetic or nondiegetic sound: Diegetic sound is sound that has a source in the story world. Spoken words by characters, sounds coming from objects or music which comes from an instrument inside the story world are all diegetic sounds. Nondiegetic sound is defined as coming from a source outside the story world. The most common examples of this is music that is added in order to enhance the films action, or the omniscient narrator whose voice in most cases does not belong to any of the characters (Bordwell & Thompson, 2008, s. 279). There are also nondiegetic sound effects. Bordwell and Thompson mention an example where a group of people are chasing each other, but instead of hearing the sounds they would naturally produce, the viewer hears various sounds from a football game; crowd cheering, referee’s whistle etc. The result of this is an enhanced comical effect (Clair, 1931). Diegetic sounds are much likely to be used in the animations with regards to footsteps, interaction with objects etc; however nondiegetic sounds could also be included either to enhance important sequences or to achieve another effect. If one thinks back to those Sunday mornings watching cartoons, one might remember the vast amount of nondiegetic sounds being used in these, such as the many Warner Brothers Looney Tunes cartoons. The primary purpose of nondiegetic sound in these cartoons is to support and enhance the comical effect, but according to Bordwell and Thompson, it is also possible to blur the boundaries between diegetic and nondiegetic, which can then have a surprising, puzzling or humoristic effect on the audience. 31 Group 08ml582 Reactive movie playback 3. Analysis Camera movements and angles 3.4 Camera movements and angles With the narrative structure in place, the lighting set up correctly, the correct sounds recorded and the acting determined, all these elements has to combined in a actual movie and for this purpose, a camera is needed. However, the movements of this camera also have to be planned, in order to emphasize the structure and mood that is aimed at. When you want to make a film, the camera should not merely be randomly positioned before filming: Planning is the key. Everything visible in the shot is there for a reason, be it for setting the mood, the environment etc. Random objects are not put in the scene for no reason. The camera is the eyes of the audience: What it does not show, the audience cannot see and therefore one must be aware of the techniques and methods to use and what results they provide. 3.4.1 The image Knowledge about how to setup a scene and the choices of what to show of this scene is important, since the location of objects and characters can drastically change how the scene is perceived. 
For instance in a scene where an object is highly important, it would most likely be placed in the center of the image and probably also be in the foreground compared to actors or other objects in the scene. This information is useful when creating storyboards to guide the users’ attention to specific parts of a scene, ensuring that they notice the plot points in a story. 32 Group 08ml582 Reactive movie playback 3. Analysis Camera movements and angles It is important to maintain a balance in the scene. If for example there is a single character present in the shot, he should be more or less at the center of the image and if there are more characters, they could be placed so a balance is kept and this can be seen in Illustration 3.4. The reason for this is an attempt to distribute elements of interest evenly around the image, thus guiding the viewer’s attention (Bordwell & Thompson, 2008, s. 142). The viewers tend to concentrate their gazes more at the upper half of the image, since it is where the faces of the characters are usually to be found. Illustration 3.4: Balancing the image, either with on ore several characters in the frame. However creating an unbalance can produce interesting effects as well. As it can be seen in Illustration 3.5, there is a significant overweight in the right side of the image. Illustration 3.5: Overweight in one side of the image This creates the effect of making the father seem superior, since there besides him are more people and weight on his side. On the other side is the son, which is perceived as smaller and more vulnerable because he is such an ineffective counterweight. 33 Group 08ml582 Reactive movie playback 3. Analysis Camera movements and angles These examples shows how much Information it is possible to present to the user, by the placement of the camera. In this case the viewer is presented with character characteristics from a single shot (Bordwell & Thompson, 2008, s. 143). Illustration 3.6: The character in a slightly unbalanced image. Illustration 3.7: As she lowers her arm, the door comes more into focus and the viewer begins to form expectations. Another aspect of balance in the shot is when the director wants to play with the expectations of the viewers. The actions taking place in Illustration 3.6 and Illustration 3.7 show how such expectations are formed. In Illustration 3.6 the focus is on the actress in a slightly unbalanced image but in Illustration 3.7 she has moved further to one side of the screen, and the door comes into focus. Now with the enhanced unbalance in the scene, the viewer has begun to form expectations regarding the door and who might enter. Working with this unbalance is also known as preparing the viewer for new narrative developments (Bordwell & Thompson, 2008, s. 143). 3.4.2 Angle and Distance The purpose of this chapter is to provide an overview of the features and purposes of the various shots in the movie clips. For the project this will become very useful when creating 34 Group 08ml582 Reactive movie playback 3. Analysis Camera movements and angles the storyboards. These techniques are mere guidelines, as there is no universal measure of camera angle and distance. It will be of great help to know how to create the various effects to be used in the movie clips, e.g. triggering the expectations of the viewers or emphasizing an important gesture of the character. What the camera can see, the viewer can see. 
The viewer can see a frame, or a “window” so to speak, in a space and where this frame is placed has great importance as to how the film is experienced and perceived. The number of possible camera angles is infinite, since the camera can be placed anywhere. However there are three general categories between which filmmakers usually distinguish. • The straight-on angle. This is the most common and it portrays the scene laterally, as if the viewer was standing in the room. This angle can be seen in Illustration 3.8. The effect of using this angle is a very neutral shot with little or no psychological effect on the viewer. Illustration 3.8: A shot from the straight-on angle • The high angle. As the name implies, the viewer is above the scene and looking down at it. This is also known as birds-eye perspective, and can be seen in Illustration 3.9. A possible effect of this angle is that the viewer perceives the subject as being small, weak or inferior. 35 Group 08ml582 Reactive movie playback 3. Analysis Camera movements and angles Illustration 3.9: A shot from a high angle • The low angle. The viewer is looking up at the scene. This is also known as frog perspective. See Illustration 3.10. A psychological effect of using this angle is the viewer perceiving the subject as powerful, threatening or superior. Illustration 3.10: A shot from a low angle The framing of the image does not just position the viewer at a certain angle from which to see the scene, but also controls the distance between the camera and the scene. This camera distance is what provides the feeling of being either near or far from the scene. The following examples are to be interpreted not as an absolute rule, which applies for all films, but should rather be thought of as guidelines which can be helpful in the pre-production process, e.g. for storyboards. Camera distances are presented as follows, (with the human body as measure). • Extreme long shot The human figure is barely visible, since this framing is normally intended for landscapes and birds-eye views of cities. 36 Group 08ml582 Reactive movie playback 3. Analysis Camera movements and angles Illustration 3.11: Extreme long shot • Long shot Human figures are visible, but still it is the background that dominates. Illustration 3.12: Long shot. • Medium long shot In these shots, the human figure is usually framed from the knees and up. This is a frequently used shot, because it provides a pleasant balance between illustration and surroundings. Illustration 3.13: Medium long shot. • Medium shot This frames the human frame from the waist and up. Gestures and expressions are now becoming more visible. 37 Group 08ml582 Reactive movie playback 3. Analysis Camera movements and angles Illustration 3.14: Medium shot. • Medium close-up This frames the body from the chest and up. Again more focus on gestures and expressions. Illustration 3.15: Medium close-up. • Close-up Traditionally this shot shows only the head, hands, feet or a small object. Facial expressions, the details of a gesture or of an object are emphasized. Illustration 3.16: Close-up • Extreme close-up This shot singles out a portion of the face, which would often be the eyes or the lips. Another purpose is to magnify or isolate an object (Bordwell & Thompson, 2008, s. 191). 38 Group 08ml582 Reactive movie playback 3. Analysis Humor Illustration 3.17: Extreme close-up As seen in this chapter, the camera can have a major influence over aspects of a movie such as balancing an image. 
If an image turns out being unbalanced, it is important to do this deliberately to obtain a certain effect in the image, such as suggesting status of a character vs. another character in a frame. Camera angles can also be used to great effect in playing with the expectations of the viewer, e.g. by making a character in the frame directly interact with something or someone outside the frame, making the user form expectations about this outside influence on the character. The camera can assume different angles of the viewing the scene, such as a birds-eye view and thereby completely change how the scene is viewed. Finally the camera can - by using various degrees of zooming - focus on various parts of a scene, such using a close-up to focus on facial expressions or using an extreme close-up to magnify or isolate an object. 3.5 Humor In chapter 2.1 Narrowing down emotions it was decided to utilize smiles as the emotion that will influence the movie. Based on this decision, humor will be a necessary part of the movie clips. Using humor should provide many opportunities for having the user smile and thereby control the movie clip, providing measurable data to analyze. Also, without using humor in the movie clips, it becomes very difficult to provide a controllable opportunity for the user to smile, making the inclusion of humor in these movie clips futile when testing their reaction to a smile. Therefore, this chapter will focus on different ways of communicating humor, with the goal of finding the best types of humor to implement in this project. In chapter 3.1 Delimitation of the character it has been chosen not to have a mouth on the character, which exclude any form for humor relying on dialogue. In Paul Wells Understanding Animation (Wells, 1998), there is a comprehensive list of ways to start laughing in animation. In this list, the only humor type without dialogue is slapstick. For that reason this chapter will focus on that particular humor type, based on two well-known cartoons. 39 Group 08ml582 Reactive movie playback 3. Analysis Humor When excluding dialog a natural approach would be applying humor to facial gestures and body language and support it with sounds in the different scenes. By using anthropomorphic characteristics for a main character, as done with a character like Jiminy Cricket in Disney’s Pinocchio who can be seen in Illustration 3.18, the audience is able to understand the character on human terms, even though he is indeed not human. Illustration 3.18: The anthropomorphic characteristics of Jiminy Cricket make him appear human like, even though he is an insect. When creating a character it is important to give it some kind of personality, in order to relate to the comic aspect of the character. As mentioned earlier in chapter 3.1 Delimitation of the character, the character does not speak so the humor must be centered on the behavior of the character and the movie clips. Two key aspects that influence the personality are (Wells, 1998, s. 129): • • Facial gestures which clearly identify particular thought processes, emotions and reactions experienced by the character. Physical traits and behavioral mannerisms common to, and recognized by, the audience, which remain largely consistent to the character. Even though the main character has no mouth, it is still possible to make facial gestures to express thoughts, emotion and different types of reaction using only the eyes of the character. 
To quote Richard Williams, director of animation on the film Who Framed Roger Rabbit (IMDB.com, 2008): "Our eyes are supremely expressive and we can easily communicate with the eyes alone. We can often tell the story just with the eyes." (Williams, 2001, s. 326).

Looking at famous personalities from the Disney universe like Mickey Mouse, Donald Duck and Pluto, it is not the jokes we remember, but the behavior and reactions of each character, e.g. Donald as the hot-tempered duck who swears and curses. Frank Thomas and Ollie Johnson, both veteran animators at the Disney Studios, suggest that Walt Disney understood the fundamental principle of comedy to be that "The personality of the victim of a gag determines just how funny the whole incident will be." (Wells, 1998, s. 130). That is why it is funnier when Donald slips on a banana peel than when one of his nephews slips on it: Donald's hot-tempered personality will burst out in swearing and cursing. Thomas and Johnson also say about Walt Disney that, rather than thinking of cartoon material as entertaining, he thought of it as captivating. It is all about impressing the audience, making them forget the real world and lose themselves in the cartoon universe. Walt Disney had to find funny actions in everyday life, still connected to something well known and based on everyone's experience (Wells, 1998, s. 131). Examples of such a connection can be seen in Illustration 3.19.

Illustration 3.19: Walt Disney's Donald Duck in two typical unlucky situations.

3.5.1 Tex Avery's approach

A man who went in a different direction than Disney was Tex Avery. Avery was a famous animator and cartoonist who did much work for Warner Bros, such as A Wild Hare (Avery, 1940) and Porky's Duck Hunt (Avery, 1937). Avery rejected the cuteness often used in Disney animation and instead went for the madness of the cartoon universe. Avery is behind characters such as Daffy Duck (IMDB.com, 2008) and Bugs Bunny. Tex Avery realized that physical slapstick comedy would satisfy children, while adults would require more mature themes. These include (Wells, 1998, s. 140):

• Status and power, and specifically the role of the underdog.
• Irrational fears, principally expressed through paranoia, obsession, and the re-emergence of previously repressed feelings.
• The instinct to survive at any cost.
• A direct engagement with sexual feelings and sexual identity.

The term slapstick originally referred to a prop used to produce a sound when one actor e.g. hit another actor on stage. A slapstick consists of a pair of sticks that make a loud sound when struck; the actor on the receiving end reacts comically to the impact in order to make it more amusing. Alan S. Dale refers to M. Wilson Disher, who claimed that there are only six kinds of jokes: falls, blows, surprise, knavery, mimicry and stupidity (Dale, 2000, s. 3). But for a comedy to register as slapstick, the fall and the blow are the only types needed. The fall occurs e.g. when a guy slips on a banana peel. The blow is used in scenes where a guy gets hit by e.g. a pie in the face, accompanied by a loud sound. It is arguable that the soul of a slapstick joke is the physical assault on, or failure of, the character. Avery took cartoons to the extreme with Screwball Squirrel (Avery, IMDB, 1944).
Screwball Squirrel is an entirely unlikeable smart guy with a very aggressive and amoral personality. His only mission in life is to make his opponents suffer, typically by exposing them to extreme pain. The opening scene of Screwball Squirrel seems a bit "Disney-like": we see a cute squirrel happily walking and dancing through the forest until he stops by Screwball Squirrel. Screwball asks the cute squirrel what kind of cartoon this is going to be, and the cute squirrel starts blabbering about how it is going to be about him and how cute he is. Screwball hates it, and a second later we see him beating up the cute squirrel. The narrative context in Avery's films is often disrupted by various small gags and unexpected black humor, such as the cartoon character talking to the audience. The entire cartoon consists of many shorter clips, some more extreme than others, where Screwball continuously makes his opponent – in this case a dog – suffer in the craziest ways. In Avery's films, the disruptions became the narrative itself.

Illustration 3.20: The opening scene of Screwball Squirrel, where he beats up the cute Disney-like character.

3.5.2 Chuck Jones' approach

Chuck Jones is the father of the cartoon Road Runner. Chuck Jones' cartoons are similar to Avery's, but what characterizes Jones' work is his interest in limiting the logic of a situation instead of over-extending it. Jones had some rules for his famous Coyote and Road Runner, and they became a sort of comic model. The rules are as follows (Wells, 1998, s. 150-151):

1. The Road Runner cannot harm the Coyote except by going "Beep-Beep".
2. No outside force can harm the Coyote – only his own ineptitude or the failure of the ACME products (ACME being a company supplying unusual inventions).
3. The Coyote could stop at any time – if he was not a fanatic ("A fanatic is one who redoubles his effort when he has forgotten his aim" – George Santayana).
4. No dialogue ever, except "Beep-Beep".
5. The Road Runner must stay on the road – otherwise, logically, he would not be called a Road Runner.
6. All action must be confined to the natural environment of the two characters – the south-west American desert.
7. All materials, tools, weapons or mechanical conveniences must be obtained from the ACME Corporation.
8. Whenever possible, make gravity the Coyote's greatest enemy.
9. The Coyote is always more humiliated than harmed by his failures.

The audience of Road Runner always knows that the Coyote will fail in his attempt to catch the Road Runner, but they love to see him try again and again. The gags appear when something happens differently from how it was expected to happen: in the Coyote's reaction to the bizarre failure of his ACME products, or in how the environment always conspires against him. Jones believed this is as important as the Coyote getting physically injured. Jones also had a skill for creating comic suspense, where the audience recognizes the gag or the seed of a building joke. In one episode the Coyote builds an ACME trapdoor in the middle of the road, which will swing up at the push of a remote. The Coyote pushes the remote when the Road Runner speeds around the corner, hoping to bring the Road Runner to an end. Nothing happens, of course, so the Coyote walks out to the trapdoor in the middle of the road to check the mechanism and pushes the remote one more time.
Here the audience expects the door to swing up and harm the Coyote in some way, but nothing happens and the scene ends. A new scene begins with another chase between the Coyote and the Road Runner, and when the audience has almost forgotten the earlier failure, the trapdoor swings up in the face of the Coyote. It is the delayed outcome and the surprise that make it funny.

Illustration 3.21: The Coyote is always trying to catch the Road Runner, sometimes with the craziest inventions, like this rocket-bike with handlebars and a cross-hair.

3.6 A reacting narrative

Having explored how to create the movie clips, the next step is to determine how to make them react to user input in a proper manner, such that the movie playback becomes structured and meaningful. The narrative can be controlled through a tree structure, where the user chooses between different paths through the story. One could imagine the timeline as a tree where every split is a choice, as seen on the left side of Illustration 3.22.

Illustration 3.22: To the left: tree structure of choices in an interactive movie. To the right: star structure, with the neutral state of the character at the center and the individual movie clips as branches.

When looking at the different narrative structures described in chapter 3.2 Narrative structures, the sequence of the story is fixed and should not be changed if the structures are to be utilized effectively. The choices should therefore not change the structure, but alter the different elements within the structure. For example, when using the Hollywood model, the presentation could change from introducing a boy to introducing a girl. The star structure is looser because it does not have a timeline. A movie clip is played, starting with the character in the neutral state and going to one of the possible branches. After this movie clip is finished, the user makes a choice, and based on this another movie clip is played, with the character once more starting at the neutral state and going to a new branch, which reflects the choice made by the user. When choosing the appropriate structure for reactive control of the narrative, the form of the movie should be evaluated. If the movie contains one storyline with several different parts, the tree structure would be appropriate. If the movie instead consists of multiple small stories, it fits better into a star structure.

This sub-chapter indicates that the star structure is suitable for solving the project problem, since each of the branches could represent a different type of humor, allowing the program to change between these types based on the choices made by the user. This concludes the first part of the analysis, which is concerned with the creation of the movie clips with regard to narrative structures and elements of Mise-en-Scène. It was discovered how these elements can aid the design of the movie clips, and different interpretations of slapstick humor were described. The next step will be to analyze the program that is needed to connect the users' facial expressions to the playback of the movie clips.
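As a rough illustration of how the star structure could drive the playback, the following C++ sketch keeps one rating per humor branch and always plays the next clip from the best-rated branch. The HumorBranch type, the branch names, the rating scheme and the playClip() stub are assumptions made for this sketch only, not the implemented design.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// One branch of the star structure: a type of humor and its remaining clips.
struct HumorBranch
{
    std::string name;               // e.g. "fall" or "blow" – hypothetical branch names
    std::vector<std::string> clips; // clips belonging to this branch, in playback order
    int rating;                     // how well the user has responded to this branch so far
};

// Stub for this sketch; the real program would hand the file to a media player.
static void playClip(const std::string& file)
{
    std::printf("Playing %s\n", file.c_str());
}

// Updates the rating of the branch that was just shown, then plays the next clip
// from the currently best-rated branch and returns that branch's index. Every clip
// starts and ends with the character in its neutral state, so any branch can follow.
int playNext(std::vector<HumorBranch>& branches, int lastBranch, bool userSmiled)
{
    if (lastBranch >= 0)
        branches[lastBranch].rating += userSmiled ? 1 : -1;

    int best = 0;
    for (size_t i = 1; i < branches.size(); ++i)
        if (branches[i].rating > branches[best].rating)
            best = (int)i;

    if (!branches[best].clips.empty())
    {
        playClip(branches[best].clips.front());
        branches[best].clips.erase(branches[best].clips.begin());
    }
    return best;
}
```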
3.7 Programming languages

Having been through the topics relating to the creation of the movie clips, the rest of this analysis will be concerned with the theory behind the smile detection, which is needed for the product to be successful. This subchapter looks at what programming languages are available for creating the smile detection and playing back video. Many of these offer the same features and are combinations of the same few programs and programming languages. Keeping in mind the limited time for this project, learning a completely new language for developing this tool would be too time-consuming, and as such the choices of possible programming languages could be narrowed down to those known by the group. These languages were C++ and Max/MSP with the Jitter extension.

3.7.1 C++

One method of creating this program would be to use a mixture of C++ with OpenCV (Bradski & Kaehler, 2008) and Adobe Flash. C++ is a textual programming language, and Flash is a graphical animation tool with support for minor programming through a scripting language called ActionScript. OpenCV is a collection of libraries for C++ containing functions for computer vision work. OpenCV includes feature matching using Haar cascades, an algorithm for finding objects in a digital image as described in chapter 3.8.1.1 Haar Cascades, and would therefore be a close to premade tool ready for use. This method would work by having OpenCV load the data from a webcam stream into C++ – converting the feed into a stream of images – manipulating the images there with the full freedom provided by C++, and afterwards passing the detection data to Flash (through file input/output), which would then serve as the graphical interface used for playing back the video and sound of the program. One drawback of this method is that videos in Flash have a tendency to lose synchronization between audio and video, which defeats the purpose of playing a video this way. The biggest problem, however, is the way data has to be passed from C++ to Flash: the data and instructions from C++ are saved to a text file every frame, and Flash then loads these data at the same speed. This approach caused slowdowns, and in the worst cases Flash attempted to read the instruction file while C++ was writing to it, effectively crashing the program.

It could also be done using C++ with only OpenCV. The main drawback of C++ is that it is time-consuming, as the entire program has to be created from scratch. This drawback especially shows itself in OpenCV's inability to play back a video stream alongside an audio stream. There are two ways to get past this obstacle: either programming a media player in C++, or performing a system call, which uses the command prompt to execute an already existing media player for the playback. The system call is made in C++ by using the system command as follows:

system("start calc");

This code would start the Windows Calculator, and calling other Windows applications would be done in a similar manner. An example of such a media player could be Windows Media Player, which exists on all personal computers using the Windows operating system.
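Following the same idea, a short sketch of how a pre-rendered clip could be handed to an external player via a system call is shown below. The clip path is a hypothetical example; the Windows "start" command simply opens the file with whatever player is associated with its file type.

```cpp
#include <cstdlib>  // std::system
#include <string>

// Launches a pre-rendered clip in an external Windows media player.
// "start" with an empty window title hands the file to the associated player.
void playClipExternal(const std::string& clipPath)
{
    std::string command = "start \"\" \"" + clipPath + "\"";
    std::system(command.c_str());  // returns immediately; the player runs as a separate process
}

// Example usage (hypothetical file name):
// playClipExternal("clips/slapstick_01.avi");
```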
3.7.2 Max/MSP with Jitter Max/MSP is a graphical programming interface that allows programming to be done at a faster speed than a textual programming language such as C++. Its main force is that it is easy to work with and it is possible to create simple solutions and prototypes in short amount of time. However its’ drawback is that it offers less freedom compared to languages such as C++. Max/MSP focuses mainly on sound manipulation, but has extensions increasing the possibilities within the program regarding visual applications.. One such extension is the Jitter, which provides Max/MSP with the ability to display and manipulate graphics such as videos, webcam feeds and even 3D objects. Jitter includes several basic tracking forms such as blob detection and facial recognition. The facial recognition, seen in Illustration 3.23 could then be customized into only tracking the smiles instead of the entire face. 48 Group 08ml582 Reactive movie playback 3. Analysis Smile detection . Illustration 3.23: Facial tracking being done in Max/MSP Jitter, used to manipulate the view of a 3D scene. From looking at these possibilities, the two methods that would serve this project the best would be either Max/MSP due to the fact that it is less time consuming to program in, and would allow a prototype to be up and running within a minimal amount of time, or C++ using only OpenCV, based on the degree of freedom it offers, as well the possibility of using already existing media players for the playback of video and audio. Both of these choices also contain a feature detection method, and in both cases this can be modified for use in detecting smiles. As such the required work to be done is close to being the same regardless of which of the two programs are being worked with. The group has more experience working with C++ and OpenCV, giving the group knowledge of the fact that OpenCV provided many opportunities for creating a suitable program, along with OpenCV being far more flexible than Max/MSP, as opposed to needing to adjust to the working environment in Max/MSP and examining how this can be used to solve the problem. Based on this, the chosen programming method was decided to be C++ with the OpenCV libraries. 3.8 Smile detection Smile detection is a branch of face detection, which determines the presence of a smiling mouth in a digital image. The reason to explore the technology of smile detection in relation to this project is to be able to create a smile detection program, which can track the users of the 49 Group 08ml582 Reactive movie playback 3. Analysis Smile detection final product and determine if they are smiling. As described in chapter 2.1 Narrowing down emotions, smile will be an appropriate way to determine if the user is finding a movie clip funny and therefore, smile detection is vital for the success of the product connected to this project. Smile detection is, as mentioned in chapter 1.2.2 Facial Expression Detection, used in compact cameras and is often called “Smile Shutter” (Sony New Zealand Limited, 2007) and, as the term implies, the shutter is triggered when the camera detects a smile. The algorithm for the cameras “Smile Shutter” is often proprietary code and is therefore not accessible to the public. The function is often combined with face detection so that the camera sets focus on faces, which is often the most important part of amateur pictures. 
Furthermore, the Sony DSC-T300 supports recognition of both children's and adults' smiles (Sony Corp., 2007), hence it can differentiate between adults and children using face detection.

To analyze smile detection, it is necessary to divide the topic into subtopics and investigate each part thoroughly. The first step of smile detection is to find a face using face detection. After finding a face, the next step is to find the mouth of the person. The smile detection on the mouth picture is the final part. These steps could be the workflow of the final smile detection. The main challenge is to get a sufficient amount of pictures to train the smile detection, as described in the next section, chapter 3.8.1 Training, and to make the smile detection use the training material.

3.8.1 Training

Smile detection training is done by going through many pictures of smiling faces and finding common characteristics. An often used method is to reuse a pre-made database. Table 3.1 gives an overview of some of the large database collections of face pictures.

Table 3.1: Overview of existing databases of faces (Skelley, 2005). The table compares the thesis database, Cohn-Kanade, CMU-PIE, FERET and JAFFE on number of images, sequences, subjects and expression types (played or natural), along with remarks such as whether the database is FACS coded, covers large variations in pose and illumination, is designed for identification, or contains only Japanese women.

The databases can e.g. be sorted by facial expression and include different illuminations in order to make the face detection brightness-independent. The CMU-PIE database contains more than 40,000 tagged pictures (Skelley, 2005). The tags describe e.g. the facial expressions, marking them as happy, bored etc., which is particularly useful when training for smile detection. A major drawback of the existing databases is that they are mostly not accessible to the public and require licenses. The database can therefore also be built from the bottom up if the pre-made databases are not sufficient (e.g. not including smiling pictures, or only pictures of too low quality), if they are too large, or if they are inaccessible. The training can be very time-consuming, and the amount of pictures plays an important role time-wise. However, it is important to have as much training data as possible to make the detection less sensitive to different face shapes, beards, skin color and the like.

3.8.1.1 Haar Cascades

There are different training algorithms that can be used. OpenCV uses so-called Haar cascades (OpenCVWiki). Haar cascade training requires a massive amount of calculations and is therefore very time-expensive. According to Ph.D. student Naotoshi Seo at the University of Maryland, training using the CMU-PIE database mentioned in Table 3.1 can take up to three weeks (Seo, 2008). When using Haar cascades to detect objects, the classifiers made by the Haar training are loaded and compared with the live feed. The Haar cascade detection uses different shapes to compare the classifier with the live feed from the webcam. The rectangular shapes used for matching are called Haar features.

Illustration 3.24: Haar-Cascade patterns.

The orange/white shapes in Illustration 3.24 are each mapped to a different facial feature of either a smile or a non-smile. The shapes make the Haar cascade comparison more reliable than simpler methods (described in chapter 3.8.1.2 Simple method), because it compares on subdivisions of the image. A Haar cascade is a series of object comparison operations encoded in an XML file featuring a simple tree structure. The different features are compared, and the algorithm only detects a match if every feature in the check is a positive match.
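As a reference for how the premade face detection could be invoked, the following is a minimal sketch using OpenCV's C interface as it existed around this version of the library. The cascade file is the frontal-face cascade shipped with OpenCV; the exact path and the use of the lower part of the face rectangle as the mouth region are assumptions for this sketch.

```cpp
#include <cv.h>
#include <highgui.h>

int main()
{
    // Load the premade frontal-face Haar cascade (path is an assumption).
    CvHaarClassifierCascade* cascade =
        (CvHaarClassifierCascade*)cvLoad("haarcascade_frontalface_alt.xml", 0, 0, 0);
    CvMemStorage* storage = cvCreateMemStorage(0);

    CvCapture* capture = cvCaptureFromCAM(0);   // webcam live feed
    IplImage*  frame   = cvQueryFrame(capture); // one frame from the stream

    // Scan the frame at several scales and return bounding boxes of detected faces.
    CvSeq* faces = cvHaarDetectObjects(frame, cascade, storage,
                                       1.1,  // scale factor between search scales
                                       3,    // minimum neighbouring detections required
                                       CV_HAAR_DO_CANNY_PRUNING,
                                       cvSize(40, 40));

    if (faces && faces->total > 0)
    {
        CvRect* face = (CvRect*)cvGetSeqElem(faces, 0);
        // One option: restrict further processing to the face rectangle, and later
        // crop its lower part as the mouth region for the smile comparison below.
        cvSetImageROI(frame, *face);
    }

    cvReleaseMemStorage(&storage);
    cvReleaseCapture(&capture);
    return 0;
}
```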
3.8.1.2 Simple method

Another approach to image classification is to assume that the smile and non-smile images build clusters in a multidimensional space, as shown in 2D in Illustration 3.25. The images are converted into vectors by storing each pixel of the image in an indexed vector, i.e. a vector having a number for each position.

Illustration 3.25: Clusters of pictures in multidimensional space. Pn is the non-smile cluster of the training data, Ps the smile cluster of the training data. The purple line is the picture currently being tested. The black line is the threshold.

The mean of the images is calculated using the formula below, and the converted pictures form clusters in the multidimensional space. For each index of the image vector:

$\bar{p} = \frac{1}{n}\sum_{i=1}^{n} p_i$, where $n$ is the number of pictures and $p_i$ is the pixel value at that index in picture $i$.

Equation 3.1: Mean calculation

The mean approach is simpler and much faster, but it does not compare the images in the same way as the Haar training does. The smile detection training can be done in the same way, but should be focused on the mouth and the facial features that reproduce a smile. Haar cascade training is much more time-consuming than the simple approach, and even though OpenCV contains compiled training programs, these programs are not well documented. OpenCV uses Haar cascades to detect faces, and Haar cascades therefore provide an easy way to realize face detection, being almost fully implemented in the OpenCV library. However, since the analysis found no Haar cascades for smile detection, this will have to be implemented manually by the group. Since the algorithm for training Haar cascades is too complicated for this project, the simple method will most likely be used to create the smile detection.

3.8.2 Detection

This subchapter suggests two methods for comparing the mean training image with another image.

3.8.2.1 Method 1 – Absolute value difference

For every pixel, the difference in pixel value is found and the differences are summed:

$\sum_{i=1}^{k} |p_i - q_i|$, where $k$ is the number of pixels, $p_i$ is the pixel value of the training (mean) image and $q_i$ is the pixel value of the live image (the image to test for occurrences).

Equation 3.2: Method 1 – Absolute difference

To illustrate how the equation is used, a mean image and a live image, both with the dimensions 1x3 pixels, can be compared. In the program there would be two different mean images to check against, but to make the idea clear, Illustration 3.26 only shows the calculation for one mean image.

Illustration 3.26: Enlarged mean training image to the left and live image to the right

The first pixel is 100 in the mean training image and 185 in the live image, hence the difference is 85.
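Before completing the worked example, the following is a minimal C++ sketch of the mean image (Equation 3.1) and the absolute-difference comparison (Equation 3.2). It assumes that every image has already been cropped to the mouth region and converted to a vector of grey-level pixel values; the function and type names are illustrative only.

```cpp
#include <cstdlib>  // std::abs
#include <vector>

typedef std::vector<int> PixelVector;  // one grey-level value per pixel

// Equation 3.1: per-index mean of a set of training images.
PixelVector meanImage(const std::vector<PixelVector>& training)
{
    PixelVector mean(training[0].size(), 0);
    for (size_t i = 0; i < training.size(); ++i)
        for (size_t j = 0; j < mean.size(); ++j)
            mean[j] += training[i][j];
    for (size_t j = 0; j < mean.size(); ++j)
        mean[j] /= (int)training.size();
    return mean;
}

// Equation 3.2: sum of absolute pixel differences between a live image and a mean image.
int absoluteDifference(const PixelVector& mean, const PixelVector& live)
{
    int sum = 0;
    for (size_t j = 0; j < mean.size(); ++j)
        sum += std::abs(mean[j] - live[j]);
    return sum;
}

// The live image is classified as a smile when it lies closer to the smile mean
// than to the non-smile mean; the threshold t can bias the decision boundary.
bool isSmile(const PixelVector& smileMean, const PixelVector& nonSmileMean,
             const PixelVector& live, int t = 0)
{
    return absoluteDifference(smileMean, live) + t < absoluteDifference(nonSmileMean, live);
}
```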
For the example in Illustration 3.26, the full calculation is as follows:

$|100 - 185| + |30 - 40| + |230 - 255| + t = 120$, where $t$ is the threshold.

Equation 3.3: Calculation of the sum of differences with threshold $t = 0$

The computation in Equation 3.3 shows that the total distance from the live image to the mean image is 120. Say this is the total distance from the live image to the mean smile image. The program would then do the same comparison for the mean image of the non-smiles, and if that comparison produces a value above 120, the live image would be detected as smiling. The threshold can bias the results towards smiling or neutral, in order to compensate for faces that are not similar to the training data. Illustration 3.25 shows how the threshold can move the decision boundary towards either cluster. The threshold will be set after evaluating the equation without a threshold.

Using the mean comparison, it is possible to set a region of interest in the picture. The differences for the pixels within the region of interest can be multiplied by a scalar to make those pixels more important (scalar > 1) or less important (scalar < 1). This could be used in the smile detection, where the middle of the mouth does not change as much as each side of the lips. To make the middle pixel of the test picture a region of interest, its term is multiplied by 3, and to make the right pixel less important, its term is multiplied by 0.5:

$|100 - 185| + |30 - 40| \cdot 3 + |230 - 255| \cdot 0.5 + t = 127.5$, where $t$ is the threshold.

Equation 3.4: Method 1 – Absolute difference with weighted pixels

The threshold should be adjusted when the equation is changed in this way.

3.8.2.2 Method 2 – Distance of pixels

Another approach to comparing images is to calculate the distance from the live image vector to the mean image vector:

$\sqrt{\sum_{i=1}^{k} (p_i - q_i)^2}$, where $k$ is the number of pixels, $p_i$ is the pixel value of the training (mean) image and $q_i$ is the pixel value of the live image.

Equation 3.5: Method 2 – Distance of pixels

To show how to utilize the formula, the example from Illustration 3.26 is used again:

$\sqrt{(100 - 185)^2 + (30 - 40)^2 + (230 - 255)^2} + t \approx 89.2$, where $t$ is the threshold.

Equation 3.6: Method 2 – Distance of pixels with threshold

Compared to the previous method, the result is smaller, and the threshold should therefore also be smaller. Both the pixel difference sum and the distance method are tested in chapter 6.1 Cross Validation Test.

3.9 Analysis Conclusion

This chapter went through the aspects of creating a movie, establishing the star-shaped narrative structure as the most prudent for this project. The important factors of sound, lighting, acting and cameras were explored, and it was noted how sound is a vital part of setting the correct mood of a movie, how light can be used to enhance details of a scene, how acting can be used to emphasize the emotions of the character, and how important it is to balance the scenes by placing the camera correctly. The overall humor type of the movie clips was chosen to be slapstick, since this type of humor is easy to incorporate in short movie clips, which is necessary to make the users smile within the short duration of the clips.
3.9 Analysis Conclusion

This chapter went through the aspects of creating a movie, establishing the star-shaped narrative structure as the most prudent for this project. The important factors of sound, lighting, acting and cameras were explored, and it was noted how sound is a vital part of setting the correct mood of a movie, how light can be used to enhance details of the scene, how acting can be used to emphasize the emotions of the character and how important it is to balance the scenes by placing the camera correctly. The overall humor type of the movie clips was chosen to be slapstick humor, since it is easy to incorporate this type of humor in short movie clips, which is necessary to make the users smile in the short duration of the clips.

Finally, C++ was chosen as the programming language, with the support of the OpenCV library, as this library has a built-in, well-functioning face detection system – based on the Haar cascade training method – and C++ gives the necessary flexibility to develop the programs needed.

3.9.1 Solution requirements

Using the different parts of the analysis, it is possible to formulate the following demands that the design of the solution should comply with. The requirements will be elaborated further in the different sub-chapters of chapter 4. Design, in order to make the implementation as consistent as possible. The first part of the analysis concerns storytelling, cinematic elements and humor. The requirements for this part of the analysis are formulated as:

• The character is without a mouth
• The movie clips should make the viewer smile
• Clear narrative structure
• Consistent use of slapstick humor
• Use cinematic elements to enhance storytelling

The rest of the analysis covers interaction and programming. The solution requirements for the programming part are as follows:

• C++ as programming language
• OpenCV library for face detection
• Smile detection is done by mean image comparison
• Detection rate for smile detection must be at least 75%; 80% is desired

These solution requirements conclude the analysis. It has provided the project with a very clear foundation and a precise direction to begin the design phase of a final product that will fulfill the final problem formulation.
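Since the requirements settle on OpenCV's built-in Haar cascade face detector, a minimal sketch of how this detection step might look is shown here. It uses OpenCV's C++ interface and one of the frontal-face cascade files shipped with the library; the function name detectFace and the parameter values (scale factor 1.1, minimum face size 60x60) are illustrative assumptions rather than the project's actual code.

#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/objdetect/objdetect.hpp>
#include <cstddef>
#include <vector>

// Detect the largest face in a camera frame with a pre-trained Haar cascade.
// The returned rectangle can afterwards be used to crop the mouth region
// that the mean-image smile comparison operates on.
bool detectFace(const cv::Mat& frame, cv::CascadeClassifier& cascade, cv::Rect& face)
{
    cv::Mat gray;
    cv::cvtColor(frame, gray, CV_BGR2GRAY); // the cascade works on grayscale
    cv::equalizeHist(gray, gray);           // reduce the influence of lighting

    std::vector<cv::Rect> faces;
    cascade.detectMultiScale(gray, faces, 1.1, 3, 0, cv::Size(60, 60));
    if (faces.empty())
        return false;

    face = faces[0];
    for (std::size_t i = 1; i < faces.size(); ++i) // keep the largest detection
        if (faces[i].area() > face.area())
            face = faces[i];
    return true;
}

int main()
{
    cv::CascadeClassifier cascade;
    cascade.load("haarcascade_frontalface_alt.xml"); // cascade file shipped with OpenCV

    cv::VideoCapture camera(0);
    cv::Mat frame;
    cv::Rect face;
    while (camera.read(frame))
    {
        if (detectFace(frame, cascade, face))
            cv::rectangle(frame, face, cv::Scalar(0, 255, 0)); // mark the detected face
        cv::imshow("Face detection", frame);
        if (cv::waitKey(10) == 27) // Esc stops the test loop
            break;
    }
    return 0;
}

In the actual program the face rectangle would not be drawn but passed on to the smile detection, which compares the cropped mouth region against the mean smile and non-smile images as described in chapter 3.8.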
4. Design

This chapter will look into the design of the product to be produced in the project. The first step will be to design the storyboards for the movie clips, in order to show how the type of humor is to be realized in a short movie clip. Furthermore, the various techniques regarding cinematic elements such as lighting and sound will be utilized to aid the movie clips. The character will be examined, with all the aspects that need to be taken into consideration when designing a character. The character should be suitable for the purpose of the project and be doable within the given time limit regarding the implementation of the character in 3D. When the character is in place, it has to be animated and put into a movie. The techniques of animating will be explained in this chapter, and the choice of these will be directly based on the design of the character. The overall concept of the order of the movies will also be discussed. The design of the program will be decisive as to how the movies will be played, and this has to follow the structure of the narrative that was chosen earlier.

4.1 Character Design

It must now be decided how the project character is going to look. The character should fit into the theme of the project and be able to mediate the desired effect to the user (trying to make them smile). With this in mind, this chapter will go through the various steps: deciding whether to make a complex or a simple character; making a realistic or a cartoonish character; drawing inspiration for the shape of the character in order to make it likable, and where this inspiration was found; and how the character will be further detailed down to its final form and how this helps achieve the desired effect for the project, along with thoughts on what was not chosen and why.

4.1.1 Ideas behind the character

When deciding the overall look of the main character, it was designed with the subjects of "fun" and "being likable" in mind. Based on the pre-analysis, where comedy and the smile were chosen, and the analysis, with its requirement of humor in the animations, it was important that the character had the potential to make people smile and laugh when they observe it.

What immediately comes to mind when thinking of a character that mediates "fun" and "being likable" to an audience is a cartoon-like character such as the various Looney Tunes characters (see them all at the website www.Nonstick.com (Nonstick.com, 2003)). There are many other sources of character inspiration, such as Pixar or DreamWorks, but the reason to look at Looney Tunes is both that these characters have been around for decades and are widely recognized, and that there is no inherent bias toward any of these characters in the minds of the audience – they can all be the hero of the story and create jokes in their own way, making each or all of them a good source of inspiration. These characters are very simple, in that they are defined largely by their outer shape. A character like Bugs Bunny has close to no muscle definition, wrinkles in his fur or similar details within the shape of the character, as shown in Illustration 4.1. Another character, such as Daffy Duck, shares similar characteristics. For example, any indication that he is clad in feathers is given by a few feathers sticking out of his head and shoulders. Otherwise, he is also kept very simple, as shown in Illustration 4.1. And yet, despite their obvious lack of detail, they are capable of displaying a great range of personality and emotion, and they generally encompass a great deal of character.

Illustration 4.1: Bugs Bunny and Daffy Duck illustrate that simple characters can portray both emotion and personality.

With this in mind, the main character of this project will also be kept simple, since there are many possibilities to let the animation breathe life into the character, be it simple or not.

This simplistic nature of the character allows more freedom with the character design. If the object had been to create a realistic human, various elements, like the eyes, ears or nose, would have to have very specific sizes, placements etc. Straying from realism and going more towards a cartoon-like approach allows the design to truly emphasize what is important about the character and downplay or completely remove unimportant features. An example could be to make the eyes fill much of the head while minimizing the mouth and nose, thereby emphasizing eye movements and the shape of the eyes. This ensures that the character design will deliver the necessary messages and cut away unwanted elements. Even if the character is a traditional biped – walking on two feet, having two hands, the head on top etc. – it still leaves a great deal of freedom when aiming for a cartoon-like design. The hands could have only two fingers instead of five; the feet could be directly attached to the body without legs; the head could be shaped like a sphere or a cube etc. Thus, the character will also assume a cartoon-like look. This choice will also help us emphasize key movements and events that take place around the character.
First off, it gives great freedom concerning animation-techniques such as the weight in the animation, which will be explained in chapter 4.4.4 Weight. A realistic human walk would most likely contain – possibly with slight variations – the basic elements of a classic walk cycle, such as the one explained in chapter 4.4.3 The walk cycle, with the two contact positions, the straight passing position and the up and down. But having a cartoon character, it will not be expected of it to act and move exactly like a real human would, especially if it is designed not to resemble a human being. The passing position could maybe have the character crouched down, rather than straight up, the back foot could be delayed way beyond the normal passing position and zoom past the other foot in a frame or two etc. If the character is thrown at a wall, a normal human would react to the impact and then fall to the ground quickly thereafter. But a cartoon character can emphasize this impact in a much greater way. When hitting the wall, the character could deform and flatten itself against the wall. Maybe it would stick to the wall for several seconds and then fall down to the ground, maybe sliding along the wall down to the ground like a drop of water, or maybe even not fall at all, requiring someone else to pull it off the wall again, like a sticker. 60 Group 08ml582 Reactive movie playback 4. Design Character Design Having a cartoon-like character allows for the use of the 12 principles of animation (these principles will be examined in greater detail in chapter 4.4.2 Disney’s 12 Principles of Animation) with the aim of further emphasizing what the character is thinking or doing. E.g. the 10th principle of Exaggeration can be used to push the shape of the eyes far beyond what would look appropriate for a realistic human character or motion, such as a fall or a punch, could be deliberately over-done to emphasize the motion. In relation to this, the 1st principle of Squash and Stretch can help the exaggeration by e.g. overly squashing a character hitting the ground or stretching a character reaching for an object. In short, aiming for a simple and cartoon-like character allows for great emphasis, both with the character itself, but also with regards to what happens to the character. Furthermore it more easily enables the additional use of other animations tools or guidelines such as the 12 Principles to aid in letting the character getting its’ message across. 4.1.2 Constructing the character The next step is to examine how to construct the character and look at how to do this. When the user interacts with our character, it will typically be within the time-frame of a few minutes. He will not have the duration of a feature film or even 10 minutes to become acquainted with the character. He should not take a look at the character and then spend time figuring out how it is going to move, if it can move at all etc. It would be more advantageous if the user could gain an immediate assumption of how the character moves, so he can focus on what happens with the character, rather than trying to figure out the character first. As such, the aim is for a biped-character: A character, walking upright, on two feet, having two arms and a head on top. This familiar character gives the user an immediate understanding of the character and provides a wide range of possibilities regarding movements and interaction. 
Walking, running, jumping, picking up objects, looking at objects, punching and many, many other types of motions are available to this type of character. But the beauty of giving the character a cartoon design is that it is possible to tweak and put a new spin on every single motion, thereby making it unique and different from other characters that already exist. 4.1.3 The basic shape The next step is to define the shape of the character. When composing a simple cartoon character, it is prudent to construct it from simple shapes. However, constructing a character 61 Group 08ml582 Reactive movie playback 4. Design Character Design from simple shapes is not limited to simple characters. It is also used on e.g. feature Disney films, such as Aladdin, as shown on Illustration 4.2. Illustration 4.2: Basic shape diagram for Aladdin characters When the basic shape of the character has been determined, it can be further refined by adding in necessary detail such as arms, the head, clothing etc. And even though the character of this project will not be anywhere near as detailed as the characters from Aladdin, this guideline for basic character creation has proven to be an efficient starting point, so it will also be used for the character of this project. 4.1.4 A likable character shape Determining which shape to use, the main goal is to make the character friendly and likable. Disney’s strong line-up of characters from various feature films will be used for inspiration for the basic shape of the character of this project. The reason for not using e.g. Looney Tunes characters again is that, when looking for shapes not to use, there is little clarity to be found regarding shapes defining a non-likable character. Practically every Looney Tunes character is fun and likeable to watch and there are no real villains amongst them. However, in Disney features, there are always identifiable good and likable characters along with the obvious bad and non-likable villains, so there is a much more visible pattern regarding shape and the nature of the character. 62 Group 08ml582 Reactive movie playback 4. Design Character Design There are too many Disney features to examine them all in depth, so this will be limited to two features; Sleeping Beauty (Geronimi, 1959) and Aladdin (Clements & Musker, 1992), since these features - one released in 1959, the other in 1992 – demonstrates that the method for defining a character by its’ basic shape has been in used for decades, which makes it a viable source for inspiration. When looking at shapes within the character, it becomes clear that Disney films make use of different shapes to create contrast between different characters. Regarding Sleeping Beauty, let us look at the three good fairies – Flora, Fauna and Merryweather – in contrast to the evil witch – Malefacent. These characters represent the good and evil side of magic – the fantastic element of the story – and this stark contrast is illustrated in various ways, including the basic shape of the characters. Let us compare the characters side-by-side to get an overview on Illustration 4.3. Illustration 4.3: The good and evil side of magic in Sleeping Beauty There are many differences in these characters, but the most basic difference is round edges and soft shapes vs. sharp edges and pointy shapes. The three good fairies all share the rounded heads and cheeks, the rounded bosoms, rounded lower bodies, the small and somewhat chubby hands – they are largely build up from round, circular and full shapes. 
In stark contrast we see Malefacent, who consists mainly of points, spikes, sharp edges and thin shapes. Her face contains a pointy nose, chin, thin and pointy fingers, she has horns and the collar of her coats spikes out in many directions. While the details are many, the distinction is clear: Round and full for good – sharp and thin for evil. 63 Group 08ml582 Reactive movie playback 4. Design Character Design We see the same type of shape-contrast when we examine characters from Aladdin. On one side we have the good Sultan and on the other side we have the evil Jafar – these characters representing a good and evil version of what the protagonist Aladdin must overcome in order to win his princess – the laws that the Sultan upholds and the main antagonist that Jafar is. Let us compare the characters side-by-side on Illustration 4.4. Illustration 4.4: The good and evil obstacles in Aladdin The good Sultan is shape-wise round and full. The huge, almost Santa Claus-like belly, his chubby face and the puffy clothing all make him a very rounded and soft character. Contrary to the evil Jafar, who – like Malefacent - is very thin, has thin, pointy fingers, a sharp face and even though he wears the same style of clothes as the Sultan, his outfit is far sharper. And the overall idea is the same as in Sleeping Beauty: Round and full for good – thin and sharp for evil. This was just examples from two films, but there are numerous others (Shrek (Adamson & Jenson, 2001), Sid from Ice Age (Wedge, 2002) or even Santa Claus) and they all encompass this idea of what constitutes a good and positive shape. Thus, using rounded a full shapes can go a long way towards making the character good and likeable, which is desirable for the character of this project. 4.1.5 Initial character suggestions Thus it has been shown, that using round and full shapes to build your character can go a long way towards portraying it as friendly and likeable – one that the audience will enjoy watching. 64 Group 08ml582 Reactive movie playback 4. Design Character Design Inspired by this observation, the character of this project will therefore be built from round and full shapes. And which shape is more round and full than the sphere? Having no corners, being the prime example of the Principle of Squash and Stretch (look up this principle and you are very likely to see the bouncing ball) and being used to create comical characters time and again, this shape fits our character very well. With these design-ideas and inspirations, we can start to sketch out some initial suggestions for characters. Following what has been discussed in this chapter so far, they will be simple, cartoonish and consist mainly of full, round shapes. Some of the initial character suggestions are shown in Illustration 4.5 but every initial suggestion can be found in chapter 12. Appendix. Illustration 4.5: Some initial character suggestions While these all follow the ideas discussed so far, they still vary somewhat. Some have arms and legs – some do not. The feet and hands of some of the characters are more detailed than others, as are their eyes. These characters illustrate some of the range to work with within the design ideas they follow. However, while our character can become based on one of these suggestions, it could just as easily become a mixture of various features of our initial characters, or maybe only borrow a few features and then creating the rest from scratch. 
4.1.6 Detailing the character Knowing the basic shape of the character and having examined the initial ideas for variety and range, the final step is to decide exactly what the character should contain. In the words of the famous cartoon and CG-film (Computer Graphics) director Brad Bird, currently employed by Pixar Animation Studios (Pixar Animation Studios, 2008): 65 Group 08ml582 Reactive movie playback 4. Design Character Design “The reason to do animation is caricature and good caricature picks out the elements that are the essence of the statement and removes everything else.” (Milton, 2004) With this in mind, the character can now become more focused. It is a biped, so in order to move around, it should have two feet. Furthermore, it should have two hands, both due to the fact that this is a part of the biped figure, but also for the character to be able to pick up and interact with objects. The character should have a head, in order for him to portray emotions and thoughts. While expressions can be supplied by body movement, based on chapter 2.1 Narrowing down emotions, it has been shown that emotions such as joy, fear, disgust etc. can be clearly shown in the face, so the character needs to have this. But apart from this, the necessities cease. A normal biped has arms, legs and a neck to connect the hands, feet and head to the body, but when designing a simple cartoon character, are arms, legs and a neck really necessary? Apart from connecting hands, feet and the head to the body, they are not vital to the character. Function-wise, much of what arms, legs and the neck can accomplish – bouncing a ball on the knee, using the elbows as a pillow for the head - can just as easily be accomplished by the hands and feet and head. Furthermore, characters without arms, legs a neck have been used with great success many times. A character like Rayman from Ubisoft (Ubisoft, 1995) has starred in many computer games over the years and he works very nicely without legs, arms and a neck. He can be seen in Illustration 4.6. Illustration 4.6: Even without legs, arms and a neck, Rayman appears like a connected character. 66 Group 08ml582 Reactive movie playback 4. Design Character Design The game-review-series known as Zero punctuation (Croshaw, 2008) also employs a character without arms, legs and a neck and that character also works just fine and appear complete, just as he would with arms, legs and a neck and he can be seen in Illustration 4.7. Illustration 4.7: Zero Punctuation character - like Rayman - has no need for arms and legs to be a complete character. So, both as a still picture and as an animated character, it is perfectly fine to exclude these limbs and still have a working character. Especially when working with an animated character, Rayman has shown that when the character moves, it can be done in such a way, it comes very close to being as natural as if the character had arms, legs and a neck. However, when excluding arms, legs and the neck, certain considerations become necessary regarding hands, feet and the head. If the hands had the shape of a plain sphere, the arms would still ensure that the outer shape of the character resembled a biped. But when working without arms, a bit of detail is needed to enforce the impression of a hand, which is why the character of this project is equipped with a big, full thumb on each hand. Illustration 4.8 The hand of the character, being made simple and from full shapes. The palm will also be deformed to resemble an actual palm slightly. 
But apart from this, the hand will still be simplified, and the four other fingers will be combined into one big, full finger. This will enable the character to pick up and manipulate many objects, and therefore having four separate fingers becomes unnecessary. As a last point, the hands will keep to the idea of being round and full.

The feet will be shaped like a big slipper, which fits the character nicely, a slipper normally being thought of as a soft shoe.

Illustration 4.9: The exaggerated feet also fit well within the design ideas for the character.

Toes are not needed to portray the illusion of either a foot or a shoe, so they are not included. Going back to the characters from Disney features, such as those shown in Illustration 4.3, many of them do have small feet. But in addition, they often have big or full legs to go along with the feet. The character of this project has no legs, so the size of the legs is transferred somewhat into the feet instead. While they will not be as big as clown feet, they will still keep to the idea of round and full shapes by slightly exaggerating their size.

The head will be entirely like a sphere.

Illustration 4.10: The head of the character conveys emotions through its exaggerated eyes and eyebrow alone.

Details like a jaw, cheekbones, ears etc. are not important to give the impression of a head. In fact, as long as features such as the eyes are present, that is sufficient. The character is not going to speak, since the humor used in our project does not include telling jokes or similar vocal comedy. And since speaking is not important, the inclusion of a mouth must be discussed. A mouth can be very useful in conveying various emotions in the face of a character. But a character can portray emotions without it. The Zero Punctuation character is a good example of this. The online animation school known as Animation Mentor is another good example: during the education, the students are provided with a character that is only equipped with eyes in the head – otherwise, the head is just a regular sphere. But it can still convey emotion by the eyes alone (AnimationMentor.com, 2008). This serves as an important source of inspiration for the head of our character, both shape-wise, but also regarding the eyes and mouth. However, the main deciding factor in excluding a mouth is production time, and this decision has been elaborated in chapter 3.1 Delimitation of the character. The Animation Mentor character also shows that when using a simplistic and cartoon-like representation of the eyes, elements such as pupils lose importance. This boils down to the head of this project's character being simplified to a sphere with two eyes, not containing pupils. However, the eyes will be scaled much larger than normal eyes. This is due to the fact that the eyes are the main source of emotion from the character, and in order to emphasize these emotions, one way to do it is to make the eyes big. Lastly, an eyebrow will be included, in order to help emphasize emotions in cases where the eyes of the character are closed. Without eyebrows it can become troublesome to determine if he is concentrating, in pain, listening, sleeping etc., and eyebrows – even if they are simply just a line above the eyes – can help greatly to clarify the emotion.
Lastly, the body of the character will also be a sphere, but deformed somewhat to resemble the shape of a regular body. 69 Group 08ml582 Reactive movie playback 4. Design Character Design Illustration 4.11: While starting from a basic sphere, the body still resembles a regular torso somewhat to give the characters body a sense of direction. It will be wider near the head than near the feet. If the body had been a completely round sphere there could be a chance that the user would lose track of what is up and down in the projects character during fast and somewhat erratic motion, such as air acrobatics, tumbling down a flight of stairs etc. So this deformation gives a sense of up and down in the character. Lastly, since the character has no arms, the body will be deformed to almost have an edge near the head. This edge gives a sense of the body having shoulders, which helps illustrate how the hands move in connection with the rest of the body. And that concludes the construction of the character. It should now be able to come across as a connected character due to the body shape and the inclusion of a head, hands and feet, move around due to the feet, interact with objects due to the hands and display emotions due to the eyes. But in addition, it keeps to the idea of being both simple and cartoonish, with much of the character – hands, feet etc. – being simplified in relation to a real human, but then also being scaled up, making them round and full. And as such the character has been designed to fit the requirements of the analysis and inspired by proved techniques and idea from other characters used in the same way as this projects characters is going to be: To be likable. The final version of the character can be found in Illustration 4.12: 70 Group 08ml582 Reactive movie playback 4. Design Humor types Illustration 4.12: A front and side-view of the final project character 4.2 Humor types A combination of Avery’s Screwball Squirrel and Chuck Jones’ Road Runner humor will be used in the project. Both are of the slapstick humor type and can be executed in short timeframes. Combining the extreme craziness and the type of black humor from Avery with the funny “accidents-are-bound-to-happen” and surprising style from Chuck Jones would be the keyword in the humor of the movies clips of the project. 4.2.1 Ball play All 3 gags are inspired by the slapstick humor and are similar by having at least one ball in the scene. In the scene the character will wonder about finding a ball. The ball itself is harmless; however it becomes destructive for the character after interacting with it. The character will be drawn to kick the ball and the ball will always respond by hurting the character in extreme and unpredictable ways. This should happen either rapidly to exaggerate the craziness, or in slow-motion in order to make the scenes funnier in facial and body appearance. 71 Group 08ml582 Reactive movie playback 4. Design Humor types 4.2.2 Black humor While still being a variant of the slapstick humor genre, the black humor type is greatly inspired by the group of comedians known as Monty Python, who is well known for their irrelevant and often surreal sketch comedy (Fogg, 1996). A recurring element in the sketches of Monty Python is that an event completely unrelated to the story suddenly happen, leaving the viewer quite puzzled and surprised but still amused by the absurdness of the whole situation. 
Furthermore the expression “And now for something completely different” (IMDB, 1971) is commonly associated with Monty Python, while also describing their type of humor very fittingly. This is what the black humor type will imitate; to have the viewer smiling because of a completely unrelated event, which however still adheres to the slapstick humor, meaning that this event will inflict some damage to the character. The thought is to build up a story with the character interacting with a prop, and then after establishing a series of logical chronological actions, an event will occur which can in no way be expected by the viewer to happen. And as it is the case with Monty Python, it is the absurdness of this unexpected event that will cause the viewer to find it amusing. 4.2.3 Falling apart This type of humor is based on exaggeration in relation to the character, taking advantage of how he is perceived by the viewer and how he is actually constructed. Due to the form of the body, resembling a torso and the elements that make up the character, he will be perceived as a whole and connected character. But in actuality, the character is not connected: his body, hands and head are all floating in the air and only his feet stand on the ground. This type of humor will revolve around the contrast between the perception of the character and the actuality of the character, since impacts between objects and the character can be uniquely interpreted in the movie clips contrary to a character that actually had arms and legs: When the character is hit by an object, he falls apart. 72 Group 08ml582 Reactive movie playback 4. Design Drafty Storyboards This action of actually taking the construction of the character very literally is surprising to the viewer, since he perceives the character as being a connected character and as thus does not expect him to fall apart, even though it is logical for him to fall apart due to gravitational pulls. The viewer might expect him to stretch unnaturally, but will most likely expect him to return to the normal form again. Furthermore, aftermath of the character falling can also be used to comical effect, since the viewer will have no real-world reference to how a character like the one in this project reattaches separated limbs to his body. The character might push limbs back onto the body so hard, the limbs on the other side of the body will fall off due to the force, the limb might not want to re-attach at all etc. 4.3 Drafty Storyboards This chapter will apply theories and methods from chapter 3.3 Lighting, sound and acting and chapter 3.4 Camera movements and angles to the drafty storyboards (from here on referred to merely as storyboards), such as camera angles, movements, acting of the character and sound, as well as explain the reasons and effects of these applications. The storyboards are useful in planning the movie clips, before going into the process of implementing them. A storyboard is a simplified way of visualizing a scene, without having to animate and model the parts of the scene. The storyboard will show the animators and modelers what camera angles to be used, which objects and props to implement and how these should be animated. Therefore, using storyboards for each of the movie clips in this project will optimize the implementation process and minimize production time. 4.3.1 Use of the camera As mentioned in chapter 3.4.2 Angle and Distance about the various shots, the human figures are visible, but the background dominates in the long shot. 
This can be used to introduce the viewer to the scene, as this will give an overview of what is happening. The long shot has been used in several of the group’s storyboards for the first picture. This can be seen on e.g. Storyboard “Trampoline” on page 191 in Appendix and on Illustration 4.13. 73 Group 08ml582 Reactive movie playback 4. Design Drafty Storyboards Illustration 4.13: The long shot used in humor type 1, 2 and 3. Illustration 4.28 shows the first picture of a storyboard from each type of humor. This approach, using a long shot, has been used to introduce the viewer to the character and the scene. Whenever the character displays important facial expressions, other shots like the medium shot and the medium close-up can be used in order to emphasize these expressions. Medium shots especially have been used as can be seen in Illustration 4.14. Illustration 4.14: Medium shots showing the character from the waist and up. The closest the viewer gets to the character in the storyboards is a medium shot. The effect of the medium shot is that the important parts of the character are still visible using this closer framing, and furthermore that it becomes easier to see the facial expressions and gestures. In chapter 3.3.3 Sound, off-screen sound can be used to make the viewer form expectations of what caused the sound. However, for certain scenes, such as the Music Box scene which can be seen on page 185 in Appendix, the framing of the camera will be used in correlation with off- screen sound to alter the effect of off-screen sound slightly. This use can be seen in Illustration 4.15. 74 Group 08ml582 Reactive movie playback 4. Design Drafty Storyboards Illustration 4.15: Using cameras in correlation with off-screen sound alters the effect of the sound slightly. What is achieved by this use of framing is that the audience will know perfectly well what makes the off-screen sound. However, the result of the event creating the sound is still unknown. The viewer will know that the character has been hit by the boxing glove, but will not know what this will result in, mainly because it is a cartoon-like character and it does not have to behave like a real human being when getting injured. This use of framing will create a comical effect, since the viewer is trying to imagine how the character could get hurt by the boxing glove, before actually seeing the result. Moments later, the result becomes visible and proves to be a much exaggerated result – following the humor type used in the storyboard of having the character fall apart on impact with objects – which is unexpected and takes the viewer by surprise. 4.3.2 Acting The eyes and hands of the character are very important with regards to portraying his mood and actions. Therefore they have become accentuated in the storyboards. Since the character cannot speak, his intentions and moods must be presented visually through his body language, i.e. through his hands and eyes. 75 Group 08ml582 Reactive movie playback 4. Design Drafty Storyboards Illustration 4.16: Hand and eye movement visualize the mood of the character. Keeping his eyes and eyebrows clearly in focus helps to visualize these moods and what actions he intends to perform. The left part clearly shows that the character is confused about something, while he, in the right part, uses his hands to emphasize that he is laughing. 4.3.3 Props It would be difficult to continuously have the viewer interested in what he sees if the character was the only thing present in the movies. 
Therefore various props have been included in the movie clips, so the character has something to interact with, and thereby the interest of the viewer can be maintained. The props included are the following:

• A music box
• A boxing glove
• A ball
• Several smaller balls
• A trampoline
• A piano

The reason for the music box is that the scene needed a prop from which music could be played, but also that this prop should not look like a typical music-playing device, e.g. a ghetto blaster, since the character should not immediately be familiar with the prop and should therefore be somewhat cautious when interacting with it. Furthermore, the prop should be tangible, since the character still needs to interact with it. So although these requirements can be fulfilled by many shapes, the final choice came to be a box with part of a sphere protruding from one of the sides, since this fulfills the requirements and furthermore because the production time is fairly short. While still being a somewhat unfamiliar prop, its outer shape does not immediately reveal what the function of this prop is, so the prop could do anything.

The boxing glove has been included because a prop was needed for the purpose of an impact with the character. Again, many props can fulfill this requirement of impacting with the character, but since a boxing glove clearly indicates an impact, due to its real-world reference, it was deemed suitable.

The ball and the smaller balls are related in the sense that the single ball moves out into off-screen space and returns as several smaller balls. The requirement for the prop in this case was that it was to be kicked, and to this end, a ball is a very suitable choice.

An important factor in the story using the black type of humor is the use of off-screen space, i.e. that something happens to the character outside the viewing range. As such, a prop is needed to help move the character into off-screen space, rather than just having him walk out of the scene. A trampoline has therefore been chosen for this interaction, since it is capable of launching the character up in the air, thereby having him disappear upwards. Lastly, something should hit the character while he is in off-screen space, and after he has fallen to the ground, this prop should fall right on top of him. The point of the black humor story is that suddenly something completely unrelated to the story happens, and as such this prop could be almost anything. A piano is a good choice because it firstly is not the typical object to encounter high up in the air, and secondly because it can continue the very unusual event (falling on top of the character) by beginning to play music, hereby proceeding down the black humor path. Furthermore, the gag of a piano falling on a character has been used before in an episode of the show Family Guy called "Tales of a Third Grade Nothing" (Langford, 2008). The fact that it has been used in such a popular show indicates the quality of the gag.

4.3.4 Sound

Sound has been visualized a few times in some of the storyboards. In these cases the function of off-screen sound has been used, creating the effect of the viewer becoming more immersed, as he cannot see what happens and therefore has to imagine it.

Illustration 4.17: The use of off-screen sound in the storyboards.
Using off-screen sound also reduces animation time considerably and the fact that you cannot see the cause of the sound makes it possible that the sound could represent almost anything. In the Illustration 4.17, the “bang”-noise might come from the ball hitting a door, or it could just as well come from the ball hitting a spaceship. The unknown of the off-screen space intensifies the immersion of the viewer. Sound used for directing the viewers attention will be used in movie clips, to provoke an expectation in the user and attempt to make the guess what will happen next. An example of this use of sound can be seen in Illustration 4.18. Illustration 4.18: Showing an example of sound used to direct the viewers’ attention. What happens is that the box being held by the character has abruptly stopped playing music and the character has picked it up to examine the reason why. After shaking it for a moment, the music abruptly starts again and finishes playing. However, there are differences in the way it finished playing and how it started playing. At first, the music was played at a normal volume, with the viewer hearing it from the center of the scene. But when it abruptly starts up again, the volume has been significantly increased and the music is now heard to the right of the scene. This change of both volume and position of the music source, along with a reaction 78 Group 08ml582 Reactive movie playback 4. Design Drafty Storyboards from the character, will aid in making the user anticipate a forthcoming event from the right of the screen. Another use for sound in efforts of achieving the effect of involving the viewer more in the movie clips, is when diegetic and nondiegetic sounds are mixed, in the sense that the viewer does not know whether a sound is diegetic or nondiegetic. This happens as shown in Illustration 4.19. Illustration 4.19: The music playing can easily be considered to be nondiegetic, but is later revealed to be diegetic. The music starts playing before the first frame is shown. The movie clip then fades in while the music is playing, so the music should be perceived as regular background-music, since music with no discernible source in the scene should be perceived nondiegetic music. However, when it abruptly stops and the character even reacts to this, the viewer becomes aware, that maybe the music was played from within the scene and the source of the music thereby becomes an element of the scene, which the viewer will be interested in finding out more about. This way, the scene “plays” with both the viewer’s expectations about the music used in the scene, but it also draws the viewers’ attention to the box, before it is even shown. Apart from these instances of using sound to achieve an effect in either information displayed in the scene or to make the viewer form expectations about what will happen, the remaining sound used in the movie clips are functional, in that they support the visuals; the sound of footsteps will be heard when the character walks around, sounds of crashes will be heard when objects, such as the boxing glove or the piano, collides with the character etc. To gain an overview of the sounds used, a movie clip could be complemented by a sound map, such as the one proposed by David Sonnenschein, an accomplished sound designer on feature films such as Dreams Awake and I’d Rather be Dancing (IMDB.com, 2008), which can be seen in Table 4.1: 79 Group 08ml582 Reactive movie playback 4. 
Table 4.1: Showing how various sounds in a scene of a movie can be illustrated in a sound map. Source: https://internal.media.aau.dk/semester/wp-content/uploads/2006/10/sonnenschein.pdf page 19

However, duplicating the categories of this sound map would be inappropriate for these movie clips, since e.g. voice is not a factor at all. Therefore, the following categories will be included in the sound maps for the movie clips of this project:

- Functional sounds: These are the sounds that support and enhance the visuals, such as the sounds of footsteps, on-screen collisions of objects etc.
- Effecting sounds: These are the sounds that have a purpose beyond just supporting a visual object. These sounds work to direct the attention of the viewer, to make the viewer form expectations of what will happen etc.
- Character sounds: These are the sounds that the character makes.

An example of a sound map for a movie clip of this project can be found in Table 4.2.

Table 4.2: Organizing the various sounds in such a sound map makes it easy to get an overview of which sounds are part of the scene and how.

4.3.5 Light

None of the storyboards contain drawn information regarding light. The primary goal has been to develop the stories, and even though light can help doing this, the lighting requirements for the stories are limited. Basically, the viewer needs to have a clear image of the character throughout the animation, since the movements, gestures and facial expressions are important and need to be seen in order to better understand the film. And since everything in the scenes will be clearly visible in the storyboards, there is no need to draw light information on them.

In chapter 3.3 Lighting, sound and acting, four features of film lighting were listed: quality, direction, source and color. The following is a general description of how light will be used in the animations. The quality will be neither specifically hard nor soft light, but rather something in between, although it will lean more towards hard light because it creates sharp edges and clearly defined shadows. The direction of the light in the scene will be kept downwards. It is important to ensure that cast shadows from props do not conceal important details. E.g. when the piano falls on top of the character, as seen in storyboard "Trampoline" on page 191 in Appendix, the cast shadow of the piano cannot be allowed to conceal the hand of the character. Keeping the direction of the light downwards makes it easier to control the length of the shadows and thereby determine what they might conceal. The key light is the only type of light used, providing the dominant illumination. This is what will maintain a neutral illumination of the scene.

With all the decisions made about how to craft a storyboard, an example of a complete storyboard can be seen in Illustration 4.20.

Illustration 4.20: The drafty storyboards developed in this project consist only of drawn thumbnails, representing key moments in the scenes.

4.4 Animation

The next step of the design phase is to examine various techniques within animation that can become useful for animating the project character.
Since the project character is going to react to the user through various animated scenes, it is very important to be able to craft this animation with a sense of purpose and a direction of making the user smile, rather than just moving the character around and hoping for the best. This chapter will explore various elements of animation technique, such as Disney's 12 Principles of Animation, the walk cycle, weight in animation etc., along with exploring how these can be utilized in this project. In the end the chapter will provide a thorough overview of how the animated scenes can be realized and how the project character can come alive from the drawing board and start to make people smile.

4.4.1 Full vs. Limited Animation

Before starting to delve into which animation techniques to use, a brief look is taken at what type of animation – either full or limited – to make use of. This will be done with the remaining production time in mind, since the project period is no longer than 3 months, along with the fact that animation can quickly become a time-consuming element of the project. Replicating movement so it appears realistic or natural is not an easy feat, and going through a specific animation sequence, tweaking each frame until a satisfactory result is finally accomplished, can take a significant amount of time. As such, techniques or methods to shorten the process of actually animating are very appropriate for a project such as this, and that is the main reason for looking at full vs. limited animation. The differences between full and limited animation are listed here (Furniss, 1998):

Full animation:
- Many in-betweens
- Every drawing is used only once, filming on "ones" or "twos"
- Many animated frames per second
- Movement in depth

Limited animation:
- Few in-betweens
- Reuse of animations
- Few animated frames per second
- Cycles
- "Dead backgrounds"
- More dialogue driven
- Many camera moves

In order to fully understand these differences, a brief explanation of the terms "in-between" and "ones/twos" is needed.

- Detailed information on the in-between can be found at the website The Inbetweener (Bluth, 2004), but a short description is that in-betweens are one or several drawings/positions between two extreme drawings/poses in an animation. For example, an animation of a person opening a door would typically have the person grabbing the handle as one extreme and the door fully open as another, with the poses in between these extremes being the in-betweens.
- Detailed information on ones/twos can be found in the book The Animator's Survival Kit (Williams, 2001, pp. 47-68), but a short description is that the terms cover whether to use the same frame of animation once or twice in a row. One second of animation running at 24 frames per second (from this point referred to as FPS) can either consist of 24 different frames if shot on ones, or only 12 different frames if shot on twos.

Having fully understood the differences between full and limited animation, we can start to see that the type of animation for this project will end up being a mix between the two. First the elements from limited animation will be chosen, along with why they are chosen:

4.4.1.1 Elements from Limited Animation

"Reuse of animations": The most prominent element to be used from limited animation is the reuse of animations.
Movements that are similar in all of the scenes to be animated – such as the character walking - will be reused due to the fact, that the scenes feature the same character and unless something drastic will happen to the character, he will be walking the same way, thus eliminating the need to re-animating the walking for each scene. “Cycles”: Making use of cycles within the animation can be time efficient. Referring to the character walking again, if each and every step was to be manually animated, it would slow down the animation process to a great degree. Instead, it is possible to create a so-called walk cycle, which will be explained in more detail in chapter 4.4.3 The walk cycle: Rather than animating every step, it is only necessary to animate two steps and then these can be repeated to create a walk for as long as needed. To avoid these cycles becoming to mechanic, minor tweaks can be applied to them, such as the wobble of the head, the shaking of the arm, but the cycles are largely identical and time-saving. 4.4.1.2 Elements from Full Animation “Many in-betweens/Many animated frames per second”: While it might seem odd to deliberately include many in-betweens when aiming at saving time, it becomes clear why doing so is beneficial when working with 3D animation. While classical 2D animation only produced motion based on every drawing produced, 3D animation takes any number of poses 84 Group 08ml582 Reactive movie playback 4. Design Animation of the character and interpolates motion between these poses based on how long the animation is. Interpolating means that the computer tries to calculate the smoothest transition from one pose to the next over the given period of time (e.g. 10 frames between each pose). While this interpolation often requires the animator to tweak them manually before they look natural, it always ensures that there are many in-betweens present in the animation. If it runs at 24 FPS, there are always enough in-betweens to have 24 frames each second – they do not have to be manually created. Thereby, to achieve the effect of only have maybe 8 in-betweens will require extra work effort of duplicating frames to deliberately make the animation less smooth. This actually means that fewer in-betweens can easily include more work than many in-betweens. The type of animation used in the project has been decided upon – the next step is to explore various techniques and methods of animation to be used to create the scenes with which the users will interact. 4.4.2 Disney’s 12 Principles of Animation Created back around 1930 by the Walt Disney Studios, the twelve principles of animation were used in many of the early Disney features, such as Snow White (Hand, 1937), Pinocchio (Luske & Sharpsteen, 1940) and Bambi (Hand, 1942). Furthermore, pretty much every animator is familiar with these principles, since they are an important part of the skill-set of a competent animator. A detailed description of each of these principles has been written by Ollie Johnston and Frank Thomas in the book The Illusion of Life (Johnston & Thomas, 1997). They were directing animators on many Disney feature films (Thomas & Johnston, 2002) and are both part of Disney’s Nine Old Men, a collection of early Disney employees who are all considered some of the most influential animators of the 20th century (von Riedemann, 2008). 
- Squash and stretch
- Anticipation
- Staging
- Straight ahead and pose-to-pose
- Follow-through and overlapping action
- Slow in and slow out
- Arcs
- Secondary action
- Timing
- Exaggeration
- Solid drawing
- Appeal

Each of these principles has its use and purpose (otherwise they would not exist), but it is important to determine which principles fit this project, how they can possibly be used and which can be skipped (if any).

Squash and stretch – seen on Illustration 4.21 – is often used to exaggerate a physical impact on a shape or object, but it can also reveal information about the physical nature of an object, e.g. a flexible rubber ball vs. a rigid bowling ball.

Illustration 4.21: The squash and stretch principle on a bouncy ball

It can be used to achieve a comic or cartoony effect, making it a fitting principle for this project. This principle is usually accompanied by the example of the bouncing ball that squashes on impact with the ground and stretches when bouncing back up in the air. And since e.g. the head of the project character is basically a ball, this principle definitely has potential in this project. Furthermore, the character is cartoonish and not realistic, so squashing and stretching him to some extent to emphasize motion or events in the story becomes a viable option, such as when the character falls to the ground from great heights. It can be used to great effect to emphasize the impact of one object against another if – in the very last frame before the impact frame – the moving object is greatly stretched out to the point where it touches the non-moving object, as seen in Storyboard "Trampoline" on page 191 in Appendix. When this happens in only one frame, it is not very obvious, but it has a great effect on the overall motion of the impact between a moving object and a non-moving object.

Anticipation – seen on Illustration 4.22 – has to do with cueing the audience on what is about to happen, and using anticipation can often greatly enhance the impact and result of a motion.

Illustration 4.22: To the left: The anticipation of a punch, before it occurs to the right

E.g. if a character wanted to push an object, the motion of the push itself would be much more emphasized if he pulled his arms back before pushing forward, rather than just pushing forward. One of the main thoughts behind designing the project character cartoonish was to be able to exaggerate what happens to it. Anticipation is another method of achieving emphasis on the character and what happens to it, so this principle also has its uses in this project, e.g. when the character prepares to launch itself into the air from a trampoline, as seen on Storyboard "Trampoline" on page 191 in Appendix. When charging up the jump, he will curl far down, the arms will go far back behind the head and the eyes will pinch down hard to build up energy, which will then be released into the jump.

Staging – seen on Illustration 4.23 – refers to making it clear to the audience what is happening in the scene.

Illustration 4.23: To the left: Very bad staging, since most of the details of the scene are concealed. Much better staging occurs to the right, where the details of the scene are clearly visible.

Choosing the correct poses and lines of motion of a character, for example, can assist greatly in helping the audience understand the story better.
It is an important principle in relation to 87 Group 08ml582 Reactive movie playback 4. Design Animation this project, since information about the scene and the story can become entirely lost if staging is not considered. If the character is pulling something heavy towards the camera with his back to the camera, it will be extremely difficult to see the physical strain portrayed in the face of the character, motion of the hands can become concealed by the body and it becomes difficult to see how far the object has been pulled. So, without staging, the scenes can quickly become a visual mess and must be paid heed to when the character picks up objects or stands next to a trampoline. Straight ahead and pose-to-pose is regarding how to animate. It can either be using key poses and then filling in frame between these key poses to create the animation, thereby making it very predictable, but also limited when it comes to freedom with the animation. Or it can simply be beginning from one point and then just animating and creating the motion as you go along, causing the animation to be very free and spontaneous, but it can very easily get out of hand, messing up the timing of the entire scene or straying from what was originally intended with the animation. While it is hard to recognize which method has been used when watching the finished animation, it is worth considering for this project, mainly due to the limited time available, which makes pose-to-pose the obvious choice. However, it will be with a slight mix of straight ahead animation as well, since subtle motion, such as the wiggle of a hand during a walk, kicking in the air during a jump etc. can be incorporated without it necessarily being a part of the key-poses. Follow through and overlapping action – seen on Illustration 4.24 - is when a part of the character or object continues to move, despite the fact that the character itself has stopped moving (follow through) or changed direction (overlapping action). 88 Group 08ml582 Reactive movie playback 4. Design Animation Illustration 4.24: To the left: Follow through in the fingers and toes - To the right: Overlapping action in the hand to the right, since everything has turned another direction except for this hand. Being somewhat related to dragging action, when part of the character takes a bit longer to move than the rest, this can be hair or clothing moving after a character has stopped moving, but could also be a head turning slower than the body and stopping later than the body. In relation to this project it can be very useful in using little effort to create a more fluent or natural motion and loosen up the motion, keeping it from being stiff and rigid. E.g. a walk cycle can become much more natural if the toes are delayed a small amount when the foot goes down. Arm movement can go from being very stiff and robotic with a straight arm swinging back and forth to being more loose and natural, if e.g. the palm and fingers trail behind the wrist when the arms swings. It can even be used to more extremes with the projects character, since his hands and feet are not directly attached to the body. If he becomes hit by something or starts running fast, it is possible to make the hands, feet or even the head wait before following the body itself to really exaggerate the sudden shift in motion of the character. So whether it is concerning subtle or exaggerated motion, this principle is a great help in achieving more life-like movement for the project character. 
Slow in and slow out – seen on Illustration 4.25 - is yet another principle that is useful for this project. It is used to create snappy action, the changes in tempo, bringing more variation into the animation, rather than having it move at the same pace the whole way through. 89 Group 08ml582 Reactive movie playback 4. Design Animation Illustration 4.25: The little spikes on the arc to the right represent frames in the motion. The further apart they are, the faster the action goes, thereby being slowest and the start and finish - slowing when coming into the action and when coming out of the action Using this give the character the possibility to emphasize greatly the motion of slapping something, pulling something, when he re-attaches his head to his body, etc. and generally help in making an action particularly powerful, if the anticipation of this action is slowed down, then speeding up the motion in the action itself as seen on Storyboard “Music Box” on page 185 in Appendix. Or, the exact opposite can easily be used to create a bullet-time effect when details must be shown within a normally very fast motion, such as dodging projectiles etc. as seen on Storyboard “Bullet Time” on page 194 in Appendix. Arcs – seen on Illustration 4.26 - are one of the bases for obtaining more natural and life-like action of a character. When walking, the arms do not move in a straight line, but rather swing back and forth in arcs, the up-and-down-motion of the head in a walk occur in arcs and generally a lot of motion occurs along arcs. Illustration 4.26: The red lines represent arcs of the motion from the left picture to the right, when the ball is thrown. Pretty much every motion here takes place along arcs. As can be seen from Illustration 4.26, arcs can also help in creating exaggerated motion and extreme poses for a character, making for much more lively and interesting animation, especially for a cartoonish character like the one present in this project. Say the character 90 Group 08ml582 Reactive movie playback 4. Design Animation wants to open a door, which is stuck. Well, rather than just pulling the body back to try and open the door, the body can really stretch out in a long arc to greatly exaggerate the motion of trying to open a stuck door or when stretching out in mid-air during a trampoline-jump, the exaggeration can be seen on Storyboard “Open Door” on page 188 in Appendix. Or when falling from a great height, the character would not fall straight down, but rather arc as the body would fall first with the dragging of the legs and head following behind. Secondary action – seen on Illustration 4.27 - can bring more character into the animation. It can be referred to as action within the action and it concerns getting as much out of the animation as possible. Illustration 4.27: The top jump is relatively closed and fairly little movement apart from the jump itself. The bottom jump uses secondary action to loosen up the jump a lot more, with the legs and arms moving independently to create more action within the jump itself. Say a character is jumping. Instead of simply pushing off with both feet as once and landing with both feet again, the character could lift off and land with one foot delayed. Furthermore, the legs have many opportunities for secondary action whilst in the air. They could air-run, cycle around, waggle uncontrollably etc. as seen on Storyboard “Trampoline” on page 191 in Appendix. 
So, while a character performs major actions, adding nuances in the form of secondary action can add personality to this motion. This is very useful for this project. When the project character discovers a strange new object to interact with, rather than just looking at it before maybe picking it up, there is great opportunity to twist the head to create more wondering facial expressions or having the character scratch his head while looking at the object. 91 Group 08ml582 Reactive movie playback 4. Design Animation Timing – seen on Illustration 4.28 - is useful for this project solely due to the fact that it uses animation, since timing can cover how long a specific motion takes, such has how long it takes for a certain object – or in the case of this project, the character – to fall a certain distance based on the physical nature of the character as seen on Storyboard “Trampoline” on page 191 in Appendix; a heavy character will fall faster than a character which is light as a feather. But timing can also cover comical timing, known from e.g. Looney Tunes cartoons, where a character steps over the edge of a cliff, but only falls down, when he discovers that he has stepped into thin air and it is more fun to fall down at that point. Illustration 4.28: Here, timing is used to achieve a comic effect, since the character only falls down once he realizes that he stands in mid-air Using timing this way allows comic effect, if e.g. the character looses the his body and the head only falling to the ground when the character realizes its body is missing and it thus becomes more fun for the head to fall down at this specific point. A certain amount of delaying between sounds happening off-screen until events take place on-screen gives us the opportunity to let the user form his own expectations about what will happen next, before actually showing it. Exaggeration - seen on Illustration 4.29 - can be used to emphasize actions or events happening in the scene. 92 Group 08ml582 Reactive movie playback 4. Design Animation Illustration 4.29: Here we see two events. The top are without exaggeration and the effect is noticeable. However, when they are exaggerated at the bottom, the effect of the impacts is greatly enhanced and emphasized. It has been mentioned in the description of other principles is this chapter, such as Arcs and Follow through and overlapping action, since these principles can be used to realize this principle. Working particularly well for cartoons, since they do not have to abide by realistic motion, exaggeration can be used to great effect in the project. The character anticipating an action can be greatly exaggerated, e.g. making it arc far back before punching something. Or something hitting the character can be greatly exaggerated to emphasize this impact by simply making it cause the character to fall apart, flatten completely out etc. as seen on Storyboard “Music Box” on page 185 in Appendix. Solid drawing and Appeal mainly concerns the design of the character such as making it more life-like and likeable, which has been described in detail in chapter 4.1 Character Design. The original idea of Solid drawing was to use the appropriate depth, weight and balance to give the drawings a fitting style. Kerlow suggests that the principle is renamed to “Solid Modeling and Rigging” to meet up to the 3D modeling and rigging techniques of today (Kerlow, 2004). 
The principle can be shortly described as the art of modeling and rigging a character, so that the style and expression is as desired. It should also convey a sense of weight and balance in the character along with simplifying rigging the model. Appeal of the character is aimed at providing a character with a specific appeal such as being goofy or evil. The appeal could for example be expressed using different walk cycles, but could also be realized in the form on scarring a characters face, making it hump-back etc.. Thus it appears, that each of these 12 principles is indeed useful in some way or another to this project, either alone – such as Staging - or in combination with other principles – such as 93 Group 08ml582 Reactive movie playback 4. Design Animation exaggeration. They really help in breathing life into the character when it is being animated, especially for a cartoon character that can use certain principles such as Squash and Stretch or Secondary Action to a greater extent than a realistic character. 4.4.3 The walk cycle Let us look at the project character briefly: It is a biped and as such, it has got two feet. So, how does it move around in the scene? Well, since it is a cartoon character in a fictional universe, no distinct rule applies – he could fly around, he could teleport, he could collapse and roll around. But – as stated in chapter 4.1 Character Design, it is important for the audience to be able to quickly identify how the character should move and form expectations about it, since the sequences are rather short. As such, this character will use his feet and walk around. But, even though walking is completely natural to us in the real world, creating a walk in animation is a difficult process. To quote Ken Harris, an animator working on Warner Brothers cartoons, the Pink Panther and various Hanna Barbera cartoons (IMDB.com, 2008): “A walk is the first thing to learn. Learn all kinds of walks, ‘cause walks are about the toughest thing to get right.” (Williams, 2001, s. 102) Therefore it becomes very important to examine and break down the process of walking to see how to create a walk that appears fluent and natural. Only when knowing about how to create a walk can it be twisted and adapted to the needs of the animation in which it much partake. The following section will examine the basic steps towards creating successful walk, how to give personality to a walk and how to adapt the walk to the project character, along with how to adapt the walk into an actual walk cycle. While a walk consists of several different positions, two very important positions are known as the step or the contact positions and can be seen in Illustration 4.30. 94 Group 08ml582 Reactive movie playback 4. Design Animation Illustration 4.30: The two contact positions of taking a single step in a walk Each contact position is simultaneously what ends one step and begins the next. The timing of the walk can be roughed in with these positions, such as considering whether or not one step should happen in the course of 12 frames, 25 frames etc. in order to characterize the nature of the walk. While a walk can easily be adapted to fit many different characters, there is a general scheme of beats for a single step, to which a walk can be timed and it can be seen on Illustration 4.31. Illustration 4.31: Different people walk with different timing, as this scheme suggests When the position and timing of a step has been decided upon, the rest of step could be inbetweened to finish the walk. 
But there are many problems that can arise from simply in- betweening the walk without considering subjects such as weight in the walk, how the feet move in relation to the legs, the hands in relation to the arms etc. Therefore, it is important to create a few more positions within the step to ensure the walk has whatever weight, feet- movement etc. it needs to function for the character. In order to determine the nature of the walk, the so-called passing position can be added. This is the middle position of the step when one foot passes the other, so for a step taking up 13 frames, the passing position would occur on frame 7. Adding a passing position to the two contact positions in Illustration 4.32, the walk starts to become much more defined. 95 Group 08ml582 Reactive movie playback 4. Design Animation Illustration 4.32: The inclusion of the passing position, the walk begins to take shape The step has - with only 3 pictures – begun to illustrate what kind of mood the character is in, the weight of the character, the pace of the walk etc. The example in Illustration 4.32 shows a normal and neutral paced walk, with little hurry, neither relaxed nor tense, neither proud nor ashamed etc. But playing a bit with this passing position will quickly start to change the walk quite radically as seen in Illustration 4.33. Illustration 4.33: Just by altering the passing position, we can easily create the basis for four very different walks The walk is now starting to obtain a great deal of character and the mood starts to come out more. But a few more positions should still be defined, which will help the walk convey additional useful information – namely the up position and the down position and we will start by looking at the down-position. The down position is where the front leg bends down and all the weight of the body is shifted onto this leg – this position can be seen in Illustration 4.34. 96 Group 08ml582 Reactive movie playback 4. Design Animation Illustration 4.34: The passing position - drawn in blue - defines the walk even further This is where we can really start to play around with the weight of the character. This position determines most of how the character shifts its body weight when walking. Depending on the shape and mood of the character, the weight can be shifted in a multitude of ways when a character walks. When carrying a heavy load on its’ back, the down position could be made very extreme, threatening to make the legs give way to the weight; a plump character might need to lean to the side to swing the leg around; a joyful character might only touch lightly down on the toe before skipping on to the next step; an angry character would maybe stomp into the ground on the down position. Lastly, we look at the up position, which can be seen in Illustration 4.35. Illustration 4.35: The up position - drawn in green - is the last key position in the basic step before only the inbetweens remain At this point in the step, the character pushes off and maintains momentum in the walk. The pace of the walk is often determined at this point, essentially making it the position that “keeps the walk going”. As seen in Illustration 4.35 this is often the highest point in the step and it is another great place to convey a sense of weight in the character. A sad character might only lift itself very slightly off the ground, opposite of a cheerful character that might almost lift off from the ground entirely at this point in the step. 97 Group 08ml582 Reactive movie playback 4. 
Design Animation Now that we have seen the five key positions of a basic step, we begin to see how they are all connected and how to play with the walk per say. First of all, we see a clear use of the 7th Disney principle of Arcs when following the path of the head, the hips or the heel of the moving foot. Furthermore, we get a sense of which parts of the body are actually leading the movement in the walk and how to exploit this to tweak the walk, even in this most basic form. Let us look at the hands and feet. According to Richard Williams, regarding foot-movement in a walk: “The heel is the lead part. The foot is secondary and follows along.” (Williams, 2001, s. 136) He illustrates this point as seen in Illustration 4.36. Illustration 4.36: Demonstrating how the heel leads the movement, while toes, the ball of the foot etc. follow this motion. Knowing this gives many possibilities to spice up the motion of the foot, when taking a step. Rather than having the foot touch down flat on the ground, doing as in Illustration 4.36 – having the heel touch the ground and then having the ball of the foot and the toes follow behind - gives the motion greater flexibility (although stepping on a flat foot might be more appropriate for a tired walk), just as maybe curling or wiggling the toes can do. The same is true for hand/arm motion during a step. In relation to hand and arm movement, Williams notes: “The wrist leads the arc.” (Williams, 2001, s. 148) And he illustrates this point, as seen in Illustration 4.37. 98 Group 08ml582 Reactive movie playback 4. Design Animation Illustration 4.37: The motion of the wrist creates an arc of motion, with the hand following as secondary motion. In Illustration 4.37 we can begin to see, how the motion of the hands can be used to add more nuance and life to the walk. The 7th and 8th Disney Principles (Arcs and Secondary Action) are in play here and playing around with the arcs or finger motion makes it possible to convey e.g. very feminine qualities in the hand motion alone, by making them very flowing and dragging far behind the wrist, or maybe making the walk very lazy by making the arcs of the wrist very small and possibly eliminating wrist and arm motion entirely during the walk. With all the basic elements of the walk completed, we can now create the walk for the project character and the five positions – the contacts, the passing and the up and down positions – can be seen in Illustration 4.38, going from the right to the left. Illustration 4.38: The five basic positions of a step for the project character, with the red lines showing some of the arcs of motion 99 Group 08ml582 Reactive movie playback 4. Design Animation Many things can be read from this walk. Referring back to the down position we can see, that it is a slight down-motion, illustrating that the character is somewhat light and is able to walk without stepping down too heavily. Also, this walk is of moderate speed. The distance between the back foot and the body is fairly minimal in the up-position, so that the character does not push off from the ground with very much force. Worth noting is also the subtle handmotion: As the left hand is in front of the body, the fingers are bent slightly, but as the wrist swings back and the hand follows, the fingers drag behind, causing them to stretch out, before curling slightly again. A similar drag happens in the right hand, where the fingers curl more, as the wrist and hand swing forward, with the fingers dragging behind. 
So even though this is a fairly neutral and moderate walk, there are still opportunities to use some of Disney’s principles to loosen up the walk and make it less rigid. And to illustrate exactly how much can be done with a walk to loosen it up and make it livelier, Williams has crafted somewhat of a recipe for getting vitality into a walk as seen in Illustration 4.39. Illustration 4.39: There are many, many different ways to make a walk come alive as seen from this recipe. What is great about having these basic elements of the walk and having made them specific to the character, now it is possible to re-use them in order to make the character walk for as long 100 Group 08ml582 Reactive movie playback 4. Design Animation as needed. Even though there might be very slight variations in these positions for each step, an entire walk can be based on a single step. Since each contact position is both the end of one step and the beginning of the next step, it is possible to use the 2nd contact position from the previous step as the 1st contact position for the next step, ending this step with the 1st contact position from the previous step. Since the timing has already been determined from the previous step, placing the passing position is similar to the previous step, as is the up and down positions. And when making the next step, the positions from the very first step can be duplicated and maybe tweaked slightly and this process can now be repeated for as long as the character needs to walk. This process of recycling previous poses of individual steps to create an entire walk is known as a walk cycle and making use of this ensures a cheap and easy alternative to posing the character anew for each step it must take. 4.4.4 Weight As the final part of this chapter, weight in animation will be discussed, since weight is very important in order to achieve a connection between the character and the world. Referring back to chapter 4.4.3 The walk cycle, if the up and down position were not included in the walk, making the character walk in a completely straight line with no up or down motion, the character would have been floating around in the world, rather than walking and connecting with the world. But apart from connecting the character to the world, weight is also a big part of conveying the physical properties of characters or objects to the viewer. A heavy object, like a piano, will fall quicker and straighter than a light object, such as a feather. But it is also important to think about subjects such as reactions to a fall, namely the bounce off the ground. While conveying information about the weight of the object, a very bouncy ball will also be more rubbery or elastic than something like a bowling ball, which will have little bounce. A clear difference weight-wise can be seen in the scenes, when the character falls from trampoline jump and hits the ground, which causes him to bounce off the ground, before coming to a rest. Moments later, when the much heavier piano drops onto the character, this piano does not bounce at all, clearly illustrating the vast difference in weight on the impact. Often times it becomes necessary to convey a sense of weight in an object by how a character is interacting with it, rather than just how quickly the object falls. What if there is a big 101 Group 08ml582 Reactive movie playback 4. Design Animation boulder that needs to be pushed, a door that must be opened or merely an object that needs to be picked up? 
The sense of weight of the object must also be portrayed in these actions. What can be done is to use the character to portray the sense of weight, even before he interacts with the object itself. In the words of Williams:

“One way we can show how heavy an object is, is by the way we prepare to pick it up.” (Williams, 2001, s. 256)

While this refers to picking up an object, it also holds true for pushing or pulling objects. But since it holds true for many different interactions, let us only discuss it in relation to picking up objects, since the character will be doing this several times in the scenes of the project, as seen on Storyboard “Music Box” on page 185 in Appendix. Making heavy use of the 2nd Disney principle of Anticipation, how the character anticipates picking up the object is a big part of conveying the sense of weight of the object and whether or not the character is familiar with the object. If the character is about to pick up an unknown object, he will not immediately know how to do this and will most likely consider the action before carrying it out: observing the object for a while before beginning to interact with it; maybe breathing in, in a way, to gather momentum for bending down and beginning the lift; and, when the character has bent down and gotten a grip on the object, bending down slightly more to gain extra momentum for the actual lift, if the object is a heavy one. There can be many ways to make the character perform extra preparations for the lift of an unknown object. In the case of this project, such preparations are present when the character is about to pick up and examine the musical box. This looks as seen in Illustration 4.40.

Illustration 4.40: The lift of the music box and 6 important steps in using it to convey the weight of the box.

- At 1 the character thinks about the action at hand. It is important to note that at this point in the scene, the box has abruptly stopped playing, and so the character wonders what to do with it, making this consideration about more than just how to lift the box itself. Nonetheless, the approach to lifting the box is still being considered.
- At 2 the character anticipates the downwards motion to gain extra momentum for the bend, really preparing himself to lift the box, no matter how heavy it might be. The character also takes a small step closer to the box, in order to get it more under control, should he need to use body weight to support the lift.
- At 3 the character bends down to the box, grips it and starts preparing to lift it.
- At 4 the character bends down even more to allow himself to push up again, using body weight to a greater effect than he could achieve without this slight anticipation for the lift. The eyes also squint to prepare for the upwards motion again.
- At 5 the character lifts the box. The box is just a bit too heavy to be lifted by the hands alone, so backwards motion of the body is also put to use to aid the lift, moving backwards and even taking a small step back as well.
- At 6 the lift is complete, with the final action of the character widening the space between his feet, since the added weight of the box demands a bit more steady balance than when the character did not carry anything.

On the other hand, if the character is familiar with the object, little to no preparation will be required, since the character knows what to expect when trying to lift the object.
The entire action of lifting such a familiar object will therefore be much more straightforward, and far less anticipation will be present, such as when the character picks up his head and attaches it backwards, after having had it knocked off by the boxing glove, as seen on Storyboard “Music Box” on page 185 in Appendix. In the actual scene, this will look as seen in Illustration 4.41.

Illustration 4.41: Since the character picks up its own head, it is now much more familiar with this lift, and it becomes much simpler and easier.

- At 1 the character has positioned himself in order to be able to pick up the head. In the scenes, this happens almost as soon as the character knows where the head lies on the ground, and there is no consideration regarding the lift before attempting it.
- At 2 the character has bent straight down and grabbed the head. There is no anticipation or gain in momentum before bending, since the character knows perfectly well what to expect when lifting the head from the ground.
- At 3 the character effortlessly lifts the head off the ground and prepares to attach it to the body again. The character handles the head somewhat more gently than the box, holding it more by the fingertips than the palm of the hand. It should be noted that even though the character has taken a step backwards, this is not done to use body weight to aid the lift itself, but rather to assume a stable and secure stance for when the head becomes re-attached, which happens in a swift motion that will cause the character to sway back and forth, requiring balance.

The first example, seen in Illustration 4.40, made use of far more anticipation poses and stages of preparation for the lift, while the latter, seen in Illustration 4.41, consisted mainly of straightforward motion, since this lift is well known to the character. But both examples used both the character alone, in getting ready to lift the object, and a direct interaction between the character and the object to convey a sense of weight in the object. This shows that there are many ways to think about and illustrate weight in the scene in order to connect characters and objects to the world and to each other, thereby giving the viewer the impression that they are in fact part of the world, rather than floating arbitrarily around in space.

4.4.5 Sum-up

Now several techniques to help enhance the animation of the various scenes have been examined. A combination of full and limited animation has been established to be used for the scenes, along with determining which elements from both types are appropriate and why. Disney’s 12 principles have been examined, along with how every single principle can in fact be used to great effect in the project in various ways. A look has been taken at the steps required to produce a walk that can convey a sense of weight in the character when walking, making the walk come across as connected to the world rather than floating around, and also at how this walk can be recycled to create the walk cycle, making it easy to make the character walk as far as needed without reproducing each step manually.
And lastly, the chapter examined how to achieve a sense of weight in various objects, both when the objects move by themselves, such as in a fall, and when the character interacts with the objects, and how to use the character alone to convey the weight of the object with which the character is going to interact. This does not cover every part of the animation, however. Motions such as the jump on the trampoline or the bullet-time movements are not described in detail in this chapter, but describing each and every motion in every scene in detail would be far beyond what is necessary. Many movements are very individual to a story. Having a character such as the one in this project does not immediately suggest that he must dodge projectiles in slow motion, or that he must jump on a trampoline. However, it is very likely that the character will walk around the scene, that weight becomes an important factor in achieving believable interaction with objects, or that Disney’s 12 principles will be put into use. So, what has been examined in this chapter is animation that is widely used, making it prudent to examine techniques for achieving a believable version of it. The rest is up to the creative freedom of the animator.

The design of the project so far covers the character itself and how it has been created to fit the aim of the project. The storyboards have been created with 3 different types of humor, as described in chapter 4.2 Humor types, in order to have variations in the test of the program. When the program then reacts to the smile of the user, it can provide him with a different type of humor based on what he likes the most. It has also been decided how to use the various cinematic elements, such as light and sound, in order to further enhance the message of the storyboards and bring it out from the screen to the user. The relevant animation techniques and theories have been examined and chosen in order to help the character come to life in a natural and believable way, so that it can better connect with the viewer. With this design in place, the next phase of the design is to create the program that ensures that the smiles of the users are registered and recorded, such that the test of the product is actually able to establish when the user is smiling and when he is not, in order to determine what type of humor the user liked best and react to this.

4.5 Smile Detection Program

In order to be able to make our animated movie clips react to the user’s smile, it is necessary to produce a program which can track the user’s mouth and decide whether this mouth is smiling or not. It is important that the user does not need to have any tracking devices placed on his face or to interact in any other way, apart from smiling. This chapter will cover the design of this program and its requirements, such as determining the success criteria for the smile detection and creating a flowchart for the overall program functionality.

4.5.1 Requirements

In order to know what to aim for and what was important for the program, a list of requirements was formed. First of all, the program should be able to take in a webcam feed as input and be able to detect a human face in that feed. Having detected the face, the program should determine which area of the face is the mouth and then be able to decide whether the mouth is smiling or not.
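As a rough illustration of these first requirements – grabbing a webcam feed, finding a face and isolating the mouth area – a detection loop could be structured along the following lines. The sketch uses OpenCV’s C++ interface with its stock frontal-face Haar cascade and simply assumes that the mouth lies in the lower third of the detected face rectangle; it is only meant to illustrate the requirements and is not the detection method developed for this project, which is covered in chapter 5 Implementation.

// Minimal sketch (not the project's final code) of the first requirements:
// take a webcam feed as input, detect a face in it and isolate the mouth region.
#include <opencv2/opencv.hpp>
#include <vector>

int main()
{
    cv::VideoCapture webcam(0);                       // requirement: webcam feed as input
    cv::CascadeClassifier faceCascade;
    if (!webcam.isOpened() ||
        !faceCascade.load("haarcascade_frontalface_default.xml"))
        return 1;

    cv::Mat frame, gray;
    while (webcam.read(frame))
    {
        cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
        cv::equalizeHist(gray, gray);                 // reduce sensitivity to lighting

        std::vector<cv::Rect> faces;                  // requirement: detect a human face
        faceCascade.detectMultiScale(gray, faces, 1.1, 3, 0, cv::Size(80, 80));

        if (!faces.empty())
        {
            cv::Rect face = faces[0];
            // Assumption for this sketch: the mouth occupies roughly the lower
            // third of the detected face rectangle.
            cv::Rect mouth(face.x, face.y + 2 * face.height / 3,
                           face.width, face.height / 3);
            cv::Mat mouthRegion = gray(mouth);
            // ... the smile/no-smile decision would be made on mouthRegion here.
        }
        if (cv::waitKey(30) == 27) break;             // ESC stops the loop
    }
    return 0;
}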
With this information, the program should be able to select and play an *.avi movie clip, based on the smile detection. Given that the program should ideally be able to work in all conditions – light intensity, number of persons present in the webcam feed etc. – there are several things to take into consideration when designing the program. However, with the limited time available for this project, and considering the complexity of a program which would truly work in all conditions, the design of this application will be limited. First of all, the program should ideally be able to work in all lighting conditions, for instance in bright daylight shining through a window or in the dim light of a dark living room, when watching a movie in privacy. Given that light will be a very decisive factor in making the smile detection work, the design will be limited to work in normal daylight, and the requirement for the application will therefore be that it works in this specific lighting condition, or in a similar lighting condition that we – as test conductors – can set up artificially. Even in the most optimal lighting conditions, there should be some requirement for the success rate of the smile tracking. In chapter 2.5 Testing the product, it was mentioned that a success rate of at least 75% should be achieved. The higher the success rate of the program, the higher the chances of a successful test.

4.5.2 Movie playback

The playback of the animation clips will be ordered by type of humor. Two suggestions for the playback will be proposed, and this chapter will explain and discuss the two ways of playing the movies. For the first proposal, the users will always be shown the same initial movie, and based on their reaction to this movie clip, the program will choose either a movie of a new type of humor or of the same type. The mapping of the movie clips according to the users’ reactions is illustrated in Illustration 4.42. The structure is a tree structure, which is described in chapter 3.6 A reacting narrative.

Illustration 4.42: The playlist for playing the movie clips in this method is based on the users’ response to the previous clips.

Based on the smile detection performed by the program, the program will choose according to whether or not the user smiled enough to pass the threshold of “liking” the movie. With this method, the user can see between 3 and 5 movie clips. If the user likes the first type of humor presented, he/she will continue to watch this type, and if he/she continues to like this type, the playback will end after having shown the three videos of type 1. If the user enjoys the first movie, but then does not enjoy the next movie in type 1, the program will choose the second movie in type 2. Again, the program will detect smiles throughout the movie clip and determine if the scene was to the user’s “liking”. The program can follow any path through the tree, indicated by arrows. The benefit of this method is that the program checks the user’s opinion towards every movie clip, but the drawback is that a certain value has to be found for deciding when a user is actually enjoying the current type of humor, which can be difficult to determine exactly. Another drawback is that if the user does not like the current movie clip, it is uncertain which type of humor to choose instead.
Smile or not smile is a binary choice and having three options to a binary choice is an inherent problem. The second method shows one of each type of humor at the beginning and then measures how much the user is smiling during each movie clip. The one type of humor the user is smiling most at is then the type which is chosen. Then for the next two movie clips, the user’s smiles will be ignored and no change of type will happen. The method can be seen in Illustration 4.43. Illustration 4.43: The playlist for playing the movie clips in this method is based only on the users’ response to the first three clips. The advantage about this method is that there is no need to find a threshold to decide whether or not the user liked the previous movie clip. It is simply a matter of comparing the three initial movie clips and see which of them the user smiled of the most. The disadvantage about it is that once a type of humor is chosen, the program will no longer react to the user smiling or not. For the purpose of this project, the second method will be chosen, in order to avoid having to find a threshold, defining when a movie clip is likeable. Even if such a threshold was found through a number of tests, there will still be a chance that the movie was not to the users taste, even if the program detected so. Furthermore, there is the risk that the user will constantly be shown movie clips he does not like, since, if the user does not like the previous 108 Group 08ml582 Reactive movie playback 4. Design Smile Detection Program clip, he will merely be shown a new option – this option will be based on what the user did not like and not what the user actually did like. With the second method, the program will always choose the type of humor that the user smiled the most at, and this should in theory also be the movie clip that the user likes the most. The advantage of this method is also, that at minimum one point, the user will be shown a new movie clip based on what he liked, rather than a guess based on what he did not like. Both methods can use the star shape structure, which were introduced in chapter 3.6 A reacting narrative, since they are both formed by small individual clips, which all starts at a neutral state for the character. The second method does not risk taking the user jumping from one type of humor to another all the time, but stick to one type, after an initialization period. The decisive process where the user’s smile is determining the chosen type could be repeated at dramatic height points of an entire narrative, as it was seen in the Kinoautomat system described in chapter 1.2 State of the art. However, for the purpose of testing the product in this project, the movie clips will be created such that the character always returns to the same starting point at each movie clip, no matter what has happened to him in the last movie clip. 4.5.3 Flowchart In order to understand the progress of the program, a flowchart will be presented. This will explain the step-by-step process of the program and how it reacts in different situations. The flowchart in Illustration 4.44 only covers the flow of the smile detection program. 109 Group 08ml582 Reactive movie playback 4. Design Smile Detection Program Illustration 4.44 The flowchart covering the smile detection program and playback of the relevant movie clips. 110 Group 08ml582 Reactive movie playback 4. 
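Before the flowchart is walked through step by step, the chosen playback logic from chapter 4.5.2 can be sketched in code. The sketch below is a simplified illustration rather than the project’s actual implementation: the clip file names and the two helper functions are placeholders, and the per-frame smile counting they stand in for corresponds to the movieMood counting described in the walkthrough of the flowchart below.

// Sketch of the chosen playback method (the second method from chapter 4.5.2):
// count the frames in which the user smiles during each of the three initial
// clips, pick the type of humor with the highest count and play the remaining
// clips of that type without reacting to further smiles.
#include <algorithm>
#include <string>
#include <vector>

// Stand-in: in the real program this plays the clip and, for every webcam
// frame, adds 1 to a movieMood-style counter when a smile is detected and 0
// when it is not, returning the total for the clip.
int playClipWhileCounting(const std::string& clip) { (void)clip; return 0; }

// Stand-in: plays a clip without running the smile detection.
void playClip(const std::string& clip) { (void)clip; }

int main()
{
    // One playlist per type of humor; the first clip of each type is shown first.
    std::vector<std::vector<std::string>> playlist = {
        { "type1_clip1.avi", "type1_clip2.avi", "type1_clip3.avi" },
        { "type2_clip1.avi", "type2_clip2.avi", "type2_clip3.avi" },
        { "type3_clip1.avi", "type3_clip2.avi", "type3_clip3.avi" } };

    // Initialization period: show one clip of each type and store its score.
    std::vector<int> score;
    for (const std::vector<std::string>& type : playlist)
        score.push_back(playClipWhileCounting(type[0]));

    // The type of humor the user smiled the most at wins.
    std::size_t chosenType =
        std::max_element(score.begin(), score.end()) - score.begin();

    // The remaining two clips of the chosen type are played without reacting
    // to the user's smiles.
    playClip(playlist[chosenType][1]);
    playClip(playlist[chosenType][2]);
    return 0;
}

Note that this decision is made only once; as discussed above, this avoids having to define a “liking” threshold, at the cost of ignoring the user’s reactions after the initialization period.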
This flowchart is based on the decision made about the movie playback, which is to show the users three initial clips and then compare the user’s reaction to each of these three clips. The program will start by establishing a connection to the webcam and start the first movie clip in the playlist. As the movie clip is playing, the program will try to find a face in the given webcam feed. For the application of the product in relation to this project, there will always be a person present in the webcam feed, so no error handling is done in case the program fails to find a face. Should the program be unable to track the user for a period of time – if the user e.g. scratches his nose or looks away for a moment – the program will keep trying to find a face until the user reappears. If a face is detected, the program will determine the position of the mouth and then detect if the mouth is smiling. If the user is smiling, a value of 1 will be added to a vector called movieMood, and if the user is not smiling, a zero will be added. At the end of the movie clip, the total value of movieMood for that single movie clip is stored. When the last of the three initial movie clips has been played, the program will compare the “score” for each of the three movie clips, and the clip that scored the highest value will be decisive for which movie clips will be played next. If, for example, the first movie clip scored the highest, the program will show clips from the playlist of this type of humor.

4.6 Design conclusion

The design of the product is now complete. The character has been designed to fit the project and to convey the message of “fun”, taking inspiration from sources such as Disney and Warner Brothers cartoons. The storyboards on which to base the movie clips have been designed such that there are several clips containing each of the chosen types of humor, so that there are variations in the test at which the user can smile and make the program react. These storyboards have been greatly enhanced using theories of various cinematic elements, such as how to properly apply sound and different camera angles to either enhance comical effect, involve the user in speculating about what happens at certain points during the movie clips, or simply cut down production time while still conveying the same message to the user. It has been decided which animation techniques to use when animating the character, such as Disney’s 12 principles or applying weight to an animation, and why these techniques would aid the character in coming more to life.

On a technical level, the movie playback method was decided. To ease the testing of the product, it was decided to play the movies in such a way that the program only has to compare the user’s reaction to three initial clips and choose the following clips according to this comparison. Requirements for the success rate and working conditions of the program were introduced. The success rate of the smile detection was determined to reach at least 75% in an environment with a light intensity similar to that of daylight.

5. Implementation

This chapter will explain the implementation process of the project and the work that was done in this process. First the implementation of the storyboards will be discussed.
This process was done using Maya, and the steps of going from the 2D storyboards to a 3D animation will be covered in this chapter. The overall techniques used for modeling the character, and examples of how to apply them, will be described. Afterwards, the techniques used for rigging the character in order to prepare it for animation will likewise be shown, along with examples of where to use them. And, as the final part of the section concerning Maya, the techniques used for animating the character and making it come alive on the screen will be examined, as well as where they were used. For the programming part of the implementation, it will be described how the code was created and what was used in order to make the program work the way it does. A total of three different programs were made to get the final result: a picture capture program – for capturing the training images and cropping them to the right dimensions – a training program – for teaching the program what is a smile and what is not – and the actual smile detection. All programs are produced in C++, using the OpenCV library developed by Intel (OpenCV, 2008).

5.1 Modeling

The first step of implementing the character and the objects that it will interact with, such as the trampoline or the boxing glove, is to model them in 3D. Concept art of the character is shown in Illustration 4.6 in chapter 4.1.6 Detailing the character. One approach could have been to import the concept art into Maya and then model precisely after it, but the project character was simple enough to model to a satisfactory level without following the concept art meticulously. When modeling the character, there were various approaches to follow. One could be using NURBS curves – Non-Uniform Rational B-Splines – a technique which involves drawing a few curves along the outer shape of the character and then lofting them, which means stretching a surface between them. This produces a very smooth result, but it can be difficult to add small details if needed, and using this for more detailed areas such as the hands would pose problems when creating the thumb. Also, the curves which are to be lofted into an object would have to consist of an equal number of control vertices (the points that the NURBS curves are formed according to, shown as purple dots in Illustration 5.1), which can be difficult to ensure if the curves are used to create asymmetrical objects, and in general it can be difficult to be sufficiently precise with the structure and placement of the curves to ensure a proper end result. To illustrate how this works, Illustration 5.1 shows how the body of the character could have been made using NURBS curves.

Illustration 5.1: To the left: 3 NURBS curves to form the 1st half of the body. In the middle: Lofting the 3 curves. To the right: Mirroring the 1st half to form the full body

Another available method could have been edge-extrusion, which involves starting from a single polygon surface and extruding new surfaces out from the edges of this one polygon. This process can be compared somewhat to creating papier-mâché, where you have an object and thin pieces of paper. You then fold and stretch the paper around the object until you have a closed paper shape, based on the object.
Edge extrusions work in a similar fashion of folding thin surfaces around to form a closed shape, except that there is no form to shape directly around – the end result relies largely on how good the modeler is at imagining the correct volume of the model. Replicating the entire body of the character just for the sake of illustrating the process is too time-consuming, so a general example of edge extrusion will instead be shown Illustration 5.2. 114 Group 08ml582 Reactive movie playback 5. Implementation Modeling Illustration 5.2: Starting from the orange surface, the following model has been produced by extruding various edges a random number of times A third method of modeling, which was also the method that was used for the project character, is called box-modeling. As the name implies, this method involves starting with a polygon primitive – often a box, but it can also be a sphere, a cone etc. - and then modifying this primitive until the final model is obtained. This is done largely by extruding the faces of the box, which will pull the selected face out from the box and create new faces between the chosen face and the neighboring faces of the box. However, there are many possibilities of adding the necessary details and number of faces to extrude. Faces can be divided in many ways, either by splitting them, merging more faces into fewer, beveling edges – a process that takes an edge, splits it in two and creates a face between these two new edges – and many more. Using these options for extruding and creating new faces enables a modeler to mold a very primitive object into something much more detailed. To demonstrate how the body was modeled, Illustration 5.3 shows the starting polygon primitive and how the final result would look. 115 Group 08ml582 Reactive movie playback 5. Implementation Modeling Illustration 5.3: To the left: Starting with a simple cone primitive. To the right: Having extruded the primitive, the shape of the body has been made. The hands, feet, eyes and the eyebrow were made using the same modeling technique of extruding faces from a polygon primitive. The only different part of the character is the head. As per the character description of having the head be a sphere, modeling the head involved nothing more than creating a polygon sphere and scaling it to fit the size. Illustration 5.4 shows how the character looks when completely modeled from a front, side and perspective view. Illustration 5.4: The character from a front, perspective and side view, along with a wireframe on top to illustrate the number of polygon faces that makes up the character 116 Group 08ml582 Reactive movie playback 5. Implementation Rigging 5.2 Rigging Now the project character has been built and can be used in the scene. However, before animation can begin, the character must be rigged - a setup-process that involves the construction of a skeleton, constraints for the skeleton etc. Without rigging a character, moving it around would become a very tiresome process of moving each vertex manually, effectively lengthening the animation process too much for it ever being possible to complete in time. The character of this project has roughly 4.500 vertices. Just the thought of having to move this many vertices around for maybe about 240 frames of animation (roughly 10 seconds) renders the process of animating too heavy and cumbersome to even consider including. This is where rigging has its uses, since a rigged character becomes very easy to move around. 
Outfitting the character with a skeleton of joints allows the character to be controlled according to the design of the skeleton. E.g. a biped character like the one in this project can be rigged to fit a basic humanoid skeleton and thus achieve easy hand and feet movement as well as general humanlike motion. This chapter will describe the various elements of constructing a working rig for the character used in this project. It will cover when to use each technique and an example of how it was actually used in the character. And while each and every step of creating the finished rig will not be covered, the chapter will – at the end – have explained the various techniques such that it is possible to understand how the rig was created. 5.2.1 Joints and skeleton The first thing to construct when rigging a character is the joints of the skeleton. A joint is exactly what it sounds like – a connection between two bones and as such their functionality and purpose need only little explanation: Any point where the mesh of the character is intended to bend in any way should be connected to a joint. On the project character, this includes places like the toes or fingers or in the torso. On Illustration 5.5 it can be seen how the joints for the leg of the project character has been constructed along with how the final skeleton looks, in relation to how they fit inside the mesh of the character. Only these two parts are shown, since going from a leg all the way to the final skeleton only involves creating the two legs and arms and parenting these to the spine. 117 Group 08ml582 Reactive movie playback 5. Implementation Rigging Illustration 5.5: To the left, the details of one part of the skeleton is shown, while an overview of the entire skeleton is shown to the right. Using joints, the entire skeleton for the character can be created. It is important to keep in mind, that when creating joints for e.g. the leg, all these joints will be connected in a joint chain and this chain has an internal parent-child relationship. This means, that the joint highest up in the chain – the first joint that was placed, the parent-joint of the leg is the Hip- joint – will influence the other bones in the chain. E.g. when the top joint is rotated, all the other joints below are also rotated, but when the lowest joint in the chain is rotated, nothing happens to the joints above. Think of this as moving your knee; your ankle and foot follows this motion, but if you move your toes, the knee will not be moved along with it. Understanding this parent-child relationships shows in what order the various joints of the skeleton should be created in order to create a logically functioning skeleton, e.g. the shoulder should be made before the wrist and the knuckle should be made before the finger tip. It does pose a slight problem in that it is not possible to create a skeleton consisting of two legs, a spine, two arms and a neck in a single chain. However, this problem is solved by simply parenting the various joint-chains to each other afterwards, such as parenting the top joint of each leg to a joint near the bottom of the torso to act as the hip. 5.2.2 IK/FK IKs and FKs are terms used to describe methods of translating joints around. FK means Forward Kinematics and IK means Inverse Kinematics. 118 Group 08ml582 Reactive movie playback 5. Implementation Rigging One use of an IK chain in the project character can be seen in Illustration 5.6. 
Also note, that a chain of joints is essentially an FK chain until an IK is added, and an FK chain can also be seen on Illustration 5.6. Illustration 5.6: On the left is an IK-chain in the leg between the brown-colored joints while the rest of the joints in the foot are FK chains. On the right is an overview of all IK handles in the character enclosed in white circles. The main difference between IKs and FKs is that IKs uses simple movement of a single point of a joint chain to translate the joints in the scene, while FKs make use of rotations of joints in a chain to place the joints where they are needed. Deciding whether to use IKs or FKs is really depending upon the action or motion that must be animated. Say for example, that an arm is bent and must be stretched out. - With IKs this can be accomplished by creating an IK-chain between the top joint in the arm joint chain (the shoulder joint) and the lowest joint (the wrist), grabbing the end of the IK chain (the wrist) and dragging this single joint out from the body, until the arm joint chain is stretched out. This is possible since the IK system will ensure, that the remaining joints in the chain will automatically adjust themselves according to the - position of the wrist joint. With FKs, it is more cumbersome. Using FKs requires, that each joint in the arm be rotated individually to obtain the final stretched-out pose of the arm, which in most cases would involve rotating the shoulder joint first and then rotating the elbow joint second. 119 Group 08ml582 Reactive movie playback 5. Implementation Rigging For such movements, IKs are superior to FKs, which is also the case with e.g. posing of the feet in a walk cycle, since creating an IK chain between the hip and the ankle joints and then only moving the ankle is easier than rotating the hips joint and then the knee joint. However, FKs have their own advantages over IKs. In such actions as a walk cycle, when the arms often swings back and forth, the swinging motion is easy to obtain, by simply rotating the shoulder joint back and forth between each step, while obtaining the swing by moving the wrist up down, left and right with IKs, making sure that the elbow also looks right is much more time-consuming. Since many types of joint chains can benefit from both an IK and an FK chain, it would be prudent to be able to switch between these two methods and luckily Maya provides an IK- handle with the “IK Blend” option that allows for this IK-FK switch - a control than can enable or disable an IK chain. 5.2.3 Control objects and constraints A control object is used to gain easier access to certain joints that can be difficult to access directly. In fact, it is generally not a good idea to rig a character such that any big movements are performed by joints that must be directly selected: In order to keep the joints from clustering up too much, their size can be scaled down to any size that fits the animator. But if direct selection of joints is required to animate the character, a small joint size causes problems, if it requires the animator to constantly zoom in on the character to select a joint and then back out to animate. A close-up of the control object for the wrist and an overview of every control object in the character can be seen in Illustration 5.7. 120 Group 08ml582 Reactive movie playback 5. Implementation Rigging Illustration 5.7: To the left, the wrist control object is shown. To the right, every control object in the character is displayed. 
A control object in itself is any primitive that can be created, be it a polygon, a NURBS surface, a curve etc., but is normally a curve, since curves are not rendered and do not conflict with the scene this way. The control objects used with the project character are just curves shaped to fit various positions around the character, such as a box around the hips and top of the torso or a circle around the shoulders and elbows. It can now be made as big as needed by the animator for easy selection and be made to control any number of joints or IK handles. In order to make the control object actually control anything in the rig, the various constraints inside Maya can be put to use. There are many different types of constraints, such as the point constraint, which handles normal movement much like an IK chain, or the orient constraint, which handles rotation much like an FK system, so when creating e.g. a control-object for the wrist, the object is created and positioned at the position of the wrist joint. The joint is then both point and orient constrained to the control object (to make it work for both IKs and FKs) and now the rig includes a working control object. 5.2.4 Driven Keys Apart from making the various joints and IK handles more easy to control, control objects also have many possibilities to make it easy to organize controls for certain join motions, which are cumbersome to animate by either IKs or FKs, such as forming a fist with the hand or the foot movement required to simulate pushing off from the ground in a step. Control objects 121 Group 08ml582 Reactive movie playback 5. Implementation Rigging offer natural places to create these extra utilities from the rig. However, the extra control itself comes from using Driven Keys. Using Driven Keys is somewhat like setting a normal animation key (this will be covered in chapter 5.3.1 Key frames). However, rather than keying transformation, rotate or scale of an object, a joint etc. to a certain time, it can be keyed to a control or a slider instead. Even though it is possible to set a Driven Key to already existing controls, such as Scale X or Rotate Y, in order to make a proper control for a certain transformation, a new control should be created. The project character has such a custom control for e.g. making each hand form a fist and to understand Driven Keys fully, we will look at this control. All the controls for the right wrist can be seen in Illustration 5.8. Illustration 5.8: Many custom controls have been made for each wrist of the character, such as "Fist", Thumb Bend" etc. When having created the custom Fist control and set its minimum and maximum value (in this case from 0 to 10), it is possible to start setting Driven Keys. The joints that form the fist are every joint in the four fingers. When Fist control is at 0, no fist should be formed, so a Driven Key is set between the joints and the Fist control for each of the finger joints in their neutral position now. A Fist value of 10 is then set. Now the fist should be fully formed, so each joint in all the fingers are rotated to form a fist and now a Driven Key between the joints and the Fist control is set. When this has been done, the Fist control will - when varying the value between 0 and 10 – cause the joints in the fingers to form a fist or go back to neutral position. 
Using Driven Keys in this way thereby ensures that when wanting a fist in the animation, it is no longer necessary to manually move or rotate each joint in the finger – it can be done by use of a single custom-made control. 122 Group 08ml582 Reactive movie playback 5. Implementation Rigging The Fist control in action can be seen in Illustration 5.9. Illustration 5.9: To the left: The hand in the neutral position. To the right: A fist has been formed by setting "Fist" to 10. 5.2.5 Clusters They are not used very much in the rig, so only a brief look will be taken at them. A cluster is a deformer, which controls any number of joints, vertices, control points etc., with varying influence over what it controls. Illustration 5.10: To the left, the spine is bent by use of clusters. To the right, no clusters are used In the character, clusters are used in the spine. In order to achieve a more fluent bending of the spine, it is important to ensure, that not only a single part of the spine bends at any time, since this is not how a real spine works and would make the character look choppy and weird. 123 Group 08ml582 Reactive movie playback 5. Implementation Rigging Instead, by using a cluster for several points of the spine, when one of these clusters are moved, it will also have influence over other parts of the spine, causing them to move as well, although to a lesser extend. This creates a smooth curve in the spine, rather than a sharp bend. The result of using clusters in the spine can be seen in Illustration 5.10. 5.2.6 Painting the weights With the skeleton rigged and all the controls set up, it is time to look at attaching the skeleton properly to the mesh and that is done via the process of skinning, which involves binding each vertex of the mesh to the various joints in the skeleton, based on which bones are closest to the vertices (the number of bones allowed to influence each vertex can be adjusted). But that is the basics of skinning and this specific process will not be covered further, since skinning a character just involves pressing one button. Due to the highly automated nature of skinning, problems will often arise with bones affecting too many vertices and making the mesh of the character deform incorrectly when moving the joints. Illustration 5.11 shows incorrectly influenced vertices around the head of our character (the shoulder vertices are influenced incorrectly when tilting the head) as well as how it looks when these influences have been fixed. Illustration 5.11: To the left: Incorrect influence. To the right: Correct influence The way to fix problems such as this is by manually painting weights of each joint. This is a method of assigning influence from a joint to vertices of a mesh, thereby controlling how the mesh deforms when certain joints are moved. When painting influences of vertices to a joint, all currently influenced vertices will show as white and the less influence the joint has on any vertex, the more black the vertices be, allowing for easy view for how to e.g. fade out influence 124 Group 08ml582 Reactive movie playback 5. Implementation Rigging between two joints. It also makes it easy to correct, if vertices are incorrectly influenced by a joint; paint those vertices black and it is done. Illustration 5.11 shows that the neck joint (enclosed in the red circle) has been rotated. To the left, the vertices (enclosed in the blue circle) are incorrectly influenced by this joint, shown by them being not black, but grayish. 
This problem has been corrected to the right, where all the influence in the vertices in the blue circle has been removed, causing them to appear as black instead and ensuring that the neck joint does not have any influence anymore. 5.2.7 Influence objects Using influence objects is another method of ensuring that the mesh deforms like intended, when moving or rotating joints. An influence object by itself is normally just a primitive object, such as a sphere that is inserted into the mesh. Illustration 5.12: A top, side and perspective view of the influence objects in the right hand of the character Where influence objects are useful, is when bulges in the mesh are intended when rotating a joint or when ensuring that parts of the mesh, such as the outside of the elbow, the muscle in the arm or the palm of the hand, do not collapse into the mesh. When inserting an influence object, this ensures, that the mesh cannot deform through it, thereby allowing for the mesh to retain a desired amount of volume when moving joints and bending the mesh. For the project character, influence objects are used to keep the volume of the hand fairly similar, no matter how the joints in the hand are rotated. After testing the various hand controls, such as “Fist”, three places were found where the use of Influence objects would help the mesh deform correctly and they are shown in Illustration 5.12. 125 Group 08ml582 Reactive movie playback 5. Implementation Rigging And, as Illustration 5.13 shows, when making the hand into a fist, the influence objects really does go a long way towards retaining the volume of the hand, such as the area of the palm near the thumb. Illustration 5.13: To the left is the fist with influence objects maintaining volume. To the right there are no influence objects and the fist is not nearly as closed. 5.2.8 The reverse foot control One specific part of the rig that warrants a more detailed look is the reverse foot control (referred to as RFC from this point forward), which can be seen in Illustration 5.14. Illustration 5.14: This reverse foot control, will be extremely helpful in e.g. the walk cycle Illustration 5.14 shows the mesh of the foot as purple, the joints of the foot in green and the RFC in blue and the control object for the entire foot in the blue circle going around the foot. 126 Group 08ml582 Reactive movie playback 5. Implementation Rigging The RFC is made up of four joints, marked by the red arrows. The reason for adding such a control is to ease the process of posing the foot during a walk. So, this control can simulate the motion of the foot pushing off from the ground, having the green joints move independently of each other to make the motion – all with a single custom created control, similar to the fistcontrol seen chapter 5.2.4 Driven Keys The first step is to parent the green joints to joint 4, 3 and 2 (the red numbers) in the RFC, so that when using the foot control to move the foot around, the foot will move around with it. But the nature of a joint chain means, that if one of the green joints were parented to a joint in the RFC, the green joint chain would be broken, which would essentially break the foot entirely. This problem can be solved with inserting an IK chain between each of the green joints, which creates an IK handle at each joint. Now these IK handles can be parented to the blue joints without breaking the green joint chain – IK handle 1 to RFC joint 4, IK handle 2 to RFC joint 3 and IK handle 2 to RFC joint 3. 
The custom foot control – named Foot Roll - is now created within the control object for the foot and given a minimum-value of -10 and a maximum-value of 10 and set to 0 as default. This will control the motion of the foot pushing off, going from 0 to 10, but will also control to entire foot bending backwards and up, as when to prepare to set down the foot on the ground again when taking a step, going from 0 to -10. Creating the functionality of the foot motion requires the use of Driven Keys like shown in chapter 5.2.4 Driven Keys. A driven key is set for all the joints as they are shown in Illustration 5.14, when Foot Roll is at 0, since this is the resting position for the foot. Using this value is an easy way to ensure that the foot will not bend in any direction, but rather be flat against the ground. Creating the motion of the foot lifting off from the ground is done in two steps. - First, Foot Roll is set to 5. Then, RFC joint 3 is rotated a certain distance. This is done to simulate the position of the foot pushing off, when the heel has lifted from the ground, but the toes remain on the ground and a Driven Key is set for this position and value of Foot Roll and it can be seen in Illustration 5.15. 127 Group 08ml582 Reactive movie playback 5. Implementation Rigging Illustration 5.15: Half-way through the step, at a Foot Roll value of 5 - Second, Foot Roll is set to the maximum of 10. RFC joint 3 is kept at the same rotation, but now, RFC joint 2 is rotated a certain distance to simulate the remaining motion of the toes also lifting from the ground, when the weight of the body has shifted to the other foot and this foot is in the air. A Driven Key is set for this position and value of Foot Roll and can be seen in Illustration 5.16. Illustration 5.16: All the way through the step, at a Foot Roll value of 10. Creating the backwards foot motion involves setting Foot Roll to -10 and rotating RFC joint 1 a certain value and then setting a Driven Key for this position and value of Foot Roll and this can be seen in Illustration 5.17. 128 Group 08ml582 Reactive movie playback 5. Implementation Animation tools Illustration 5.17: Preparing the foot to land on the ground again. This reverse foot control now reveals itself to greatly simplify the process of animating the foot through taking a step. When the foot pushes off from the ground, Foot Roll can merely be set to 10; when the foot is in the air, Foot Roll goes back to 0; when the foot goes down to the ground again, Foot Roll is set to -10; and at the down position of the step (see chapter 4.4.3 The walk cycle) Foot Roll goes back to 0 again. It should also be noted, that through all the various steps of using Foot Roll to pose the foot through taking a step, the control object for the foot (the blue circle going around the foot in Illustration 5.17) is plane, making it easy to align the foot correctly to the ground, ensuring that it is flat on the ground and generally position the foot, regardless of the value of Foot Roll. 5.3 Animation tools This section describes how the process of animating the character has been carried out. It will examine the various tools used for this process, such as what Key Frames are, how to interpret and take advantage of the Graph Editor and how real-life video footage has been used to aid in achieving natural timing in the animation. 
When animating in 3D, the animator will not manually pose every single frame, but rather create the key poses (with Key Frames) and let the program automatically create the remaining frames. However, the program often creates the frames in unintended ways and it therefore becomes important for the animator to be able to control this automation using tools suited for this task (such as the Graph Editor). 129 Group 08ml582 Reactive movie playback 5. Implementation Animation tools 5.3.1 Key frames Key Frames help the user to control the animation. When a frame is to become a Key Frame, a key is set for all transform attributes of an object. This means that the translation, scaling or rotating performed with the object in that specific frame is saved. In the time slider, which will be in chapter 5.3.2 Time slider and Range slider, a red marker appears at the frame that becomes keyed, and this can be seen in Illustration 5.18. If another Key Frame is made in a later frame, and the object is rotated, MAYA interpolates the transformation of the object between the Key Frames. Illustration 5.18: The red markers appear several times along the time slider, thereby making it easy to see where you have placed your Key Frames. Key frames have been heavily used by the group with regards to the animation of the character and the objects he interacts with. Key frames can be directly related to key positions or extremes of classical 2D animation. The key positions tell the important parts of the story, while the remaining frames are the in-betweens, which make the animation appear smooth. Key frames in 3D are similarly the key positions in the animation and created by the animator, while the computer creates the in-betweens to make the animation smooth by interpolating between the transformations of the animated objects between each Key Frame. The following illustrations show the various positions of the character at some of the Key Frames in Illustration 5.19. Key frame 1 130 Key frame 10 Group 08ml582 Reactive movie playback 5. Implementation Animation tools Key frame 30 Key frame 35 Key frame 40 Illustration 5.19: Five different Key Frames. For classical 3D animation, these would be five key poses and the other frames of the animation would be in-betweens. 5.3.2 Time slider and Range slider The time slider and range slider can be seen in Illustration 5.20. Illustration 5.20: The time slider and range slider The time and range sliders allows control of either play back of or scrolling through a given animation, which is very useful if e.g. it is desired to play the animation inside the program before rendering it. The time slider specifically displays the playback range and keys, if such are made, and the range slider controls the range of frames that will be played if you click the play button. The reason for the range slider to have two values on each side (1.00 and 1.00 to the left and 24.00 and 24.00 to the right in Illustration 5.20) is that the range slider controls both how many frames are currently visible in the time slider, but also how many frames are totally available, while maybe not being visible. The inner-most values on each side of the range slider – the values closest to the range slider itself – controls the currently visible frames in the time slider, while the outer-most values control the total number of frames 131 Group 08ml582 Reactive movie playback 5. Implementation Animation tools available. 
These tools are important to know in order to maintain an overview of the animation, which can easily reach a size of e.g. 720 frames, which is only 30 seconds. In Illustration 5.20, the time slider currently shows 24 frames, from frame 1 to frame 24, and the currently active frame is frame 1, with no keys being set, since there are no red markers to indicate any Key Frames. The range slider allows for a maximum of 24 frames, while also currently displaying 24 frames. These sliders have been used in a variety of ways on the implementation of animation. One use has been to obtain an overview of every Key Frame in the entire animation, regardless of there being 100 or 1000 frames currently keyed. Another use has been to make detailed adjustments to the placements of the key-frames. The time slider allows for manipulation of the placement of key-frames and thereby the timing of the animation, by moving the keys from one frame to maybe a few frames further down the time slider (e.g. a key from frame 4 to frame 7). Viewing 700 frames in the time slider makes it very difficult to move frames like this, so adjusting the range slider to see fewer frames allows for more detailed manipulation of frame positions. 5.3.3 Graph editor The motions and transformations of every animated object in a scene can be graphically represented as curves in the graph editor. These curves are called animation curves and there is one for each keyed attribute of the object. I.e. there is one for the translation of the object in the y-axis, one for the scaling in the z-axis etc. Each curve shows how an attribute is changing during the animation. In Illustration 5.21 the animation curve for translation in the y-axis is shown. 132 Group 08ml582 Reactive movie playback 5. Implementation Animation tools Illustration 5.21: The animation curve for translation in the y-axis. The steep rise and fall of the curve in Illustration 5.21 indicates that the character moves rapidly up and down in the scene, i.e. when he jumps. The following wave-like shape of the curve (from frame 225 and onwards) is when the character has landed on the trampoline and the elastic fabric gives in and eventually settles. The transformation displayed as a graph in Illustration 5.21 is translation in the y-axis (the green color indicates change - be it rotation, transformation or scaling - in the y-axis). The way to interpret these curves is that the x-axis of curves in the graph editor represents time in the animation or frames if you will (Illustration 5.21 shows animation from frame 150 to 294). The y-axis represents changes in value for the current transformation that is being animated. This is what makes transformations in any y-axis the easiest to understand in the graph editor. When values of transformation in the y-axis are increased, the object moves along the y-axis in the scene (normally up on the screen). Therefore, the motion of the object in the scene would be very similar to the curves displayed in the graph editor. However, transformation in e.g. the x-axis would be more difficult to interpret, since when the curve goes up, the object might move to the right in the scene, and move to the left, when the curve moves down. In this case, the object would be aligned in the scene such that an increase 133 Group 08ml582 Reactive movie playback 5. Implementation Animation tools of motion along the x-axis would cause the object to move to the right in the scene and left following a decrease in motion along the x-axis. 
And since the y-axis in the graph editor represents a change in value, an increase of motion in the x-axis in the scene would be displayed as the curve going up in the graph editor, while the curve would go down, if there were a decrease of motion along the x-axis in the scene. The curves work similarly for rotating and scaling. If an object was rotated or scaled in the positive direction in the scene, the animation curve in the graph editor would go up, while a negative scale or rotation of the object would cause the animation curve to go down. A clear example of this correlation between motion and the graph editor can be seen in Illustration 5.22. Illustration 5.22: Showing how translation in the x-axis corresponds to the curve in the graph editor In Illustration 5.22 the box in the lower right corner has been moved 20 units along the x-axis and the lower left box represents the starting position for the box to the right. In the scene – the lower part of Illustration 5.22 - motion along the x-axis is motion to the right on the screen. But when looking at the animation curve in the graph editor (the upper part of Illustration 5.22), the graph moves up. This moving of 20 units has happened over 24 frames, which can be seen along the x-axis in the graph editor, where the numbers 0, 2, 4, 6…, 18, 20, 22 and 24 represent the number of frames in the animation. And looking at the y-axis reveals that the 134 Group 08ml582 Reactive movie playback 5. Implementation Animation tools graph starts at frame 0 and a value of transformation along the x-axis of 0. But, as more frames goes by, this transformation values also increases (along the y-axis) until it reaches a value of 20, when the numbers of frames gone by reaches 24. A final clarification can be found in Illustration 5.23, which shows how the same amount of transformation (20 units along the x-axis) would look if it lasted 18 frames, 12 frames and 6 frames. The amount of motion remains the same – only the amount of frames gone by changes. Illustration 5.23: When the amount of transformation remains the same, the height of an animation curve also remains the same, since these two elements are directly connected This translation of every possible transformation of an object in the scene onto a twodimensional graph can take some time getting used to, but once the animator understands this system, many animation problems can be solved simply by tweaking the animation curves in the graph editor. In Illustration 5.21 the graph editor has been used to ensure that the jump of the character had a realistic feel to it, bearing in mind that elements from cartoons are also present, like an unnaturally long time for the character to take-off from the trampoline. 5.3.4 Blend shapes The function of blend shape deformers is to change the shape of one object into the shapes of other objects and is typically used when creating facial animation of character. 135 Group 08ml582 Reactive movie playback 5. Implementation Animation tools In this project, blend shapes has been used to create the various facial expressions of the character, which can be seen in Illustration 5.24. Illustration 5.24: Blend shapes were used to make the various facial expressions of the character. Through the use of the blend shapes, it is possible to make the character seem happy, sad, angry etc. This is a very important aspect of the overall impression that the user gets from watching the character. 
When these facial blend shapes were created, the first thing to do was to make the eyes to be used on the character, which would also be the ones that change shape between the blend shapes. Next step is to copy these eyes one time for each blend shape that must be made. So in this project, the original eyes were copied 12 times in order to make all the different blend shapes. The next step is to modify the shape of each of the eye-copies into the various facial expressions, such as the sad eyes, the angry eyes etc. until the result shown in Illustration 5.24 was achieved. The last step is to make blend shapes out of these new facial expressions and tie them to the original eyes. By use of Maya’s Blend Shape Editor, the animator can now switch between the facial expressions and key them to the necessary frames of the animation. It should be noted, that the reason that the blend shapes can be placed anywhere in the scene, while the original eyes will change shape, while still being stuck to the head of the character is, that Maya can be setup to ignore any differences in position, rotation and scale of the blend 136 Group 08ml582 Reactive movie playback 5. Implementation Animation tools shapes and only take into account changes in position of the vertices relative to the center of the eyes. 5.3.5 Reference video In order to obtain more realistic movements from the character, a recording of movements of the group members were made so it could work as a reference of movements in the animating process. This can be of great help in such aspects as timing of e.g. a jump or a walk, or just as a study of exactly how the various limbs rotate, bend and move. Also it becomes easier to exaggerate movements when you have a clear reference to how they normally look. Illustration 5.25 shows some key positions of the character as well as the real world reference. Illustration 5.25: Key positions in the animation and in real world reference video. When thinking in terms of one second lasting 24 individual frames, it can be somewhat abstract to visualize how many frames e.g. a jump would last. The timing of the jump and the key poses, such as the anticipation, the pushing off from the ground, the character in mid-air and the landing can require an immense amount of trial-and-error to get to look right. But when filming a jump of a real person and then being able to play back this jump at a speed of 24 frames per second to match the animation, while also being able to slow down the video 137 Group 08ml582 Reactive movie playback 5. Implementation Smile Picture Capture and view each individual frame, the process of timing the key poses of the jump and make it look natural becomes much easier. This way, the animator can see exactly how many frames it takes to go from anticipation to lift-off or from lift-off to landing. Therefore, the use of reference video is an extremely helpful aid and can speed up the production time tremendously. It has now been seen which techniques have been used to model the project character, such as box-modeling and examples of where they were applied. The process of rigging the character and the functionalities of techniques used have been described such as joints or control objects, along with where to utilize them has likewise been covered. And then, the implementation of animation was described, which tools were used, such as Key Frames or the graph editor, while also describing where they were used. 
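As a simple illustration of the in-betweening principle behind Key Frames described in chapter 5.3.1, linear interpolation between two keyed values can be sketched in a few lines of C++. This is a generic sketch only, not how Maya evaluates its animation curves, which use spline interpolation with adjustable tangents as seen in the graph editor:

// Generic linear in-betweening between two keyed values of a single attribute.
// Purely illustrative; Maya's animation curves use splines with editable tangents.
double inBetween(double keyValueA, int keyFrameA,
                 double keyValueB, int keyFrameB, int frame)
{
    if (frame <= keyFrameA) return keyValueA;       // before the first key
    if (frame >= keyFrameB) return keyValueB;       // after the second key
    double t = double(frame - keyFrameA) / double(keyFrameB - keyFrameA);
    return keyValueA + t * (keyValueB - keyValueA); // interpolated in-between value
}

Evaluating, for instance, inBetween(0.0, 0, 20.0, 24, f) for f = 0 to 24 would produce a straight-line version of the 20-unit translation discussed in connection with Illustration 5.22.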
The next part of this chapter will detail how the smile detection was implemented, which techniques were used and examples of how to use them.

5.4 Smile Picture Capture

The learning data for the program is created by a stand-alone program called "Smile Picture Capture". The program uses the same Haar cascade as the Smile Detection program to locate the face, and the mouth is also located using the same algorithm as in the detection program. "Smile Picture Capture" makes it easy to capture new learning material and import it into the learning program. By default, the program is set to 25 pictures of smiles and 25 pictures of neutral faces for each test participant. The code is summarized in pseudo code:

1. Establish connection to webcam.
2. Create directory C:\Pictures\RunXXX\
3. Create index files C:\Pictures\RunXXX\SmilePics.txt and …\NeutrPics.txt
4. While not more than 25 smile and 25 neutral images have been captured:
   1. If a face is found, do:
   2. Grayscale the image.
   3. Crop the image so only the mouth is visible (See XXXXXX for more information).
   4. Scale the image to 30x15 pixels.
   5. Return image.
5. If keypress "1", do:
   1. If fewer than 25 images of smiles have been saved, do:
   2. Save image to RUNTIMEDIRECTORY\SmilePictureX.jpg (X in SmilePicture is replaced with A-Z).
   3. Save path to SmilePics.txt for later collection import.
6. If keypress "2", do:
   1. If fewer than 25 images of neutral faces have been saved, do:
   2. Save image to RUNTIMEDIRECTORY\NeutrPictureX.jpg (X in NeutrPicture is replaced with A-Z).
   3. Save path to NeutrPics.txt for later collection import.
7. Copy all files from RUNTIMEDIRECTORY with Smile* in the filename to: C:\Pictures\RunXXX\Smiles\
8. Copy all files from RUNTIMEDIRECTORY with Neutr* in the filename to: C:\Pictures\RunXXX\Neutrals\
9. Close windows, files and terminate program.

If button 1 or 2 is pressed on the keyboard, the program will capture either a smile or a neutral picture. The program gathers all pictures in the C:\Pictures\RunXXX\ directory, which can easily be renamed to for example "Batch nr. 3". The Smile Detection Training program can, with slight changes, import the produced index files into a vector of pictures, which is called a collection.

5.4.1 Photo Session

The capture of the images to be used in the smile detection was done in a lab with the same camera that will be used in the testing. Light conditions were partly controlled by fluorescent tubes, but also included sunlight shining on the left side of the faces.

Illustration 5.26: A series of 5 smile pictures from the same participant.

Seven participants were provoked into laughing and making neutral faces. The group of participants included four Caucasians (males – two with a beard), one East European (female), one African (male) and one Asian (male). The broad composition of the participant group was chosen in order to make the smile detection less sensitive to skin color, facial hair, gender and heritage. The Smile Picture Capture program was used to easily capture the smiles and save them into collection indexes. 25 smile and 25 neutral faces were captured per participant, making a total of 350 pictures. 26 pictures were sorted out due to wrong facial expressions, inappropriate lighting and general mouth detection failures. A total of 165 smile pictures and 173 neutral pictures were sent to the learning program.
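To give an impression of how the capture loop in the pseudo code above could look with the OpenCV C API, a minimal sketch is shown below. It is not the project's actual Smile Picture Capture code; file names are illustrative, and the Haar-based face and mouth detection step as well as the index files are left out for brevity.

#include <opencv/cv.h>
#include <opencv/highgui.h>
#include <cstdio>

// Minimal capture-on-keypress loop (illustrative, not the project code).
// Grabs webcam frames, converts them to grayscale and saves a frame when
// key "1" (smile) or "2" (neutral) is pressed, up to 25 of each.
int main()
{
    CvCapture* capture = cvCaptureFromCAM(0);      // 1. connect to webcam
    int smiles = 0, neutrals = 0;

    while (smiles < 25 || neutrals < 25)           // 4. until 25 of each is captured
    {
        IplImage* frame = cvQueryFrame(capture);
        if (!frame) break;

        IplImage* gray = cvCreateImage(cvGetSize(frame), IPL_DEPTH_8U, 1);
        cvCvtColor(frame, gray, CV_BGR2GRAY);      // grayscale the image
        // Here the face would be located with the Haar cascade and the mouth
        // region cropped and scaled to 30x15 pixels, as described above.

        int key = cvWaitKey(10);
        char filename[64];
        if (key == '1' && smiles < 25)             // 5. save a smile picture
        {
            std::sprintf(filename, "SmilePicture%c.jpg", 'A' + smiles);
            cvSaveImage(filename, gray);
            smiles++;
        }
        else if (key == '2' && neutrals < 25)      // 6. save a neutral picture
        {
            std::sprintf(filename, "NeutrPicture%c.jpg", 'A' + neutrals);
            cvSaveImage(filename, gray);
            neutrals++;
        }
        cvReleaseImage(&gray);
    }
    cvReleaseCapture(&capture);                    // 9. close and terminate
    return 0;
}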
5.5 Training program For the training program, first a collection of photos on which to train is needed. This is done by the function loadCollection, which takes as input a string and an integer value. The string is the path to a txt-file, which contains the path for all images that is to be trained on. The integer value is used to specify the maximum size of the collection, if it is needed to limit the collection size. The loadCollection function implementation is explained, with original code and comments: 140 Group 08ml582 Reactive movie playback 5. Implementation Training program vector <IplImage*> loadCollection(string file, int maxCollectionSize) /* Function takes in two variables: The path to the index file of pictures, and number of how big the collection */ { vector <IplImage*> collection; // Creates a vector of image pointers const char *filename = file.c_str();// Converts the string file to a const char ifstream myfile (filename); // Opens the file for reading string line; // Creates a string named line if (myfile.is_open()) // Run through the file { while (! myfile.eof() && collection.size() < maxCollectionSize ) { getline(myfile,line); const char *charline = line.c_str(); IplImage *image = cvCreateImage(cvSize(30,15),IPL_DEPTH_8U,1); // This is image pointer for the image IplImage *imageFlip = cvCreateImage(cvSize(30,15),IPL_DEPTH_8U,1); // This is image pointer for the flipped image image= cvLoadImage(charline,0); cvEqualizeHist(image,image); collection.push_back(image); cvFlip(image,imageFlip,1); collection.push_back(imageFlip); // // // // // Load the image Equalize the histogram Save image into vector Flip image and save into imageFlip Save imageFlip into vector } } return collection; } The flipping of the image was done since some problems about light direction occurred. Since the training images were taken with the light coming from approximately the same direction in all the images, the smile detection program had difficulties tracking the smiles correctly, if the light was coming opposite of the light in the training images. Instead of wasting time on taking many new training images, it was decided to simply flip each image, to simulate light coming from the opposite direction. With the collection loaded, the mean of all the images has to be calculated. The function getMean takes care of calculating the mean values for each pixel, given a number of images of equal dimensions. The function takes only one input, which is the vector of images defined by the loadCollection function. 141 Group 08ml582 Reactive movie playback 5. Implementation Training program Illustration 5.27: Histogram equalization process. T - Transformation All pictures are histogram equalized since without, the neutral template turns out brighter than the smile template, because a smile creates more attached shadows on the face (shadows casted by the face itself). By equalizing, the histogram is “stretched”, such that the lowest valued pixel present is set to 0 and the highest valued pixel is set to 255. All pixels in between are scaled to fit into the new scale of the histogram. In this way, both templates are guaranteed to span from entirely black to entirely white. The function in OpenCV, which equalizes the histogram is called cvEqualizeHist(input, output) to normalize brightness and increase contrast. The function first calculates the histogram, normalizes the histogram, finds the integral of the histogram and applies the altered histogram to the image. 
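For reference, the mapping applied by histogram equalization can be written down explicitly. For a gray level $v$ in an image with $N$ pixels and gray-level histogram $h$, the equalized level is approximately

$v' = \operatorname{round}\!\left(\frac{255}{N}\sum_{u \le v} h(u)\right)$

i.e. each gray level is replaced by a scaled version of the cumulative histogram up to that level, which both stretches and flattens the gray-level distribution. This is the standard textbook formulation, included here for clarity; it is not quoted from the OpenCV documentation.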
The implementation of the function can be found in cvhistogram.cpp in the OpenCV source folder. The getMean function goes through the following steps:

/* Loads a collection of images into the function */
IplImage* getMean(vector <IplImage*> collection)
{
    /* Creates two scalars, which each contain a 1D array with RGB and alpha
       values (an 8-bit picture) */
    CvScalar s, t;

    /* Creates an image with the same width and height as the training images */
    IplImage* meanImg = cvCreateImage(cvSize(collection[0]->width, collection[0]->height), IPL_DEPTH_8U, 1);

    int temp = 0; // (unused in the final implementation)

    /* Creates a vector to temporarily save pixel values */
    vector <int> coordinate((collection[0]->width)*(collection[0]->height));

    /* Goes through every picture in collection */
    for (int i = 0; i < (int)collection.size(); i++)
    {
        int coordinateCounter = 0;
        for (int y = 0; y < collection[i]->height; y++)      // For Y values
        {
            for (int x = 0; x < collection[i]->width; x++)   // For X values
            {
                s = cvGet2D(collection[i], y, x); // Get pixel value for image in X,Y
                /* Add the pixel value for the current image into the coordinate vector */
                coordinate[coordinateCounter] += s.val[0];
                coordinateCounter++;
            }
        }
    }

    /* Go through the added pixel values and divide with the amount of pictures */
    for (int j = 0; j < (int)coordinate.size(); j++)
    {
        coordinate[j] = coordinate[j] / (int)collection.size();
    }

    int pixelCounter = 0;
    /* For loop that converts the coordinate vector into an image (meanImg) */
    for (int h = 0; h < meanImg->height; h++)
    {
        for (int w = 0; w < meanImg->width; w++)
        {
            for (int scalar = 0; scalar < 4; scalar++)
            {
                t.val[scalar] = (double)coordinate[pixelCounter];
            }
            cvSet2D(meanImg, h, w, t);
            pixelCounter++;
        }
    }
    return meanImg;
}

The OpenCV variable CvScalar (MyDNS.jp, 2008) is actually an array with a size of four, holding the value of the red, green, blue and alpha channel at a specific pixel. However, as the program uses only grayscale images, the value of red, green and blue is the same for each single pixel, so the first value (position 0 in the array) is used for every channel. As the alpha channel is not used at all, it gets the same value as the rest of the channels, for simplicity only.

The main function of the smile training program calls the loadCollection and getMean functions and then saves the mean image for both the neutral face expressions and the smiles. The saved pictures are then used as templates in the smile detection program. This completes the training program, which is capable of loading a series of pictures, calculating the mean and outputting the result to an image file that can be used in the Smile Detection Program.

5.6 Smile detection

For the smile detection program, which was described in chapter 4.5 Smile Detection Program, the Haar cascades are used. However, since the code used for finding a face in the webcam feed (which is where the Haar cascades are used) is developed by and published with the OpenCV library, it will not be described in this report.
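Although that sample code is not reproduced here, the central OpenCV call it builds on is cvHaarDetectObjects. For orientation, a minimal, illustrative use of it could look as follows; the cascade file name and parameter values are assumptions for the sketch, not the project's settings:

#include <opencv/cv.h>
#include <opencv/highgui.h>

// Minimal sketch: load a face cascade once and run it on one grayscale frame.
// Cascade file name and detection parameters are illustrative only.
CvRect detectFirstFace(IplImage* gray)
{
    static CvHaarClassifierCascade* cascade = (CvHaarClassifierCascade*)
        cvLoad("haarcascade_frontalface_alt.xml", 0, 0, 0);
    static CvMemStorage* storage = cvCreateMemStorage(0);

    cvClearMemStorage(storage);
    CvSeq* faces = cvHaarDetectObjects(gray, cascade, storage,
                                       1.1,  // scale factor between pyramid levels
                                       3,    // minimum neighbouring detections
                                       CV_HAAR_DO_CANNY_PRUNING,
                                       cvSize(40, 40)); // smallest face considered

    if (faces && faces->total > 0)
        return *(CvRect*)cvGetSeqElem(faces, 0); // first detected face
    return cvRect(0, 0, 0, 0);                   // no face found
}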
The function used to load a Haar cascade is:

(CvHaarClassifierCascade*)cvLoad("haarcascadefile.xml", 0, 0, 0 );

For the part of the smile detection program that was developed in this project, two different methods were described in chapter 4.5.2 Movie playback, but in the conclusion of the design it was decided only to implement the second method of movie playback, which compares three initial clips and then decides on a humor type based on the user's reaction to these movies. Therefore, the implementation of the main program developed in this project looks as follows:

1. Load the movie playlist from playlist.txt.
2. Establish connection to webcam.
3. Start first video clip.
4. Try to detect a mouth in the webcam feed, using the mouthDetection function.
5. If a mouth is found, do:
   1. Check if the mouth is smiling, using the isSmiling function.
   2. If the movie clip has ended, do:
      1. In case of the first movie, do:
         1. Set score for the first movie clip to amount of smiling frames/time of clip.
         2. Start next movie clip.
      2. In case of the second movie, do:
         1. Set score for the second movie clip to amount of smiling frames/time of clip.
         2. Start next movie clip.
      3. In case of the third movie, do:
         1. Set score for the movie clip to amount of smiling frames/time of clip.
         2. Compare the three movie clips.
         3. Start playing the next movie clip in the playlist of the style with the highest score.
6. If the program is terminated, save a txt-file containing the times of when the user started and stopped smiling.
7. Else, go back to step 4.

5.6.1 Mouth Detection

The first necessary function is mouthDetection. A big part of this function is also developed by the creators of OpenCV and will not be explained here. Using the Haar cascades, the function can find the face. What has been implemented in addition to this Haar cascade detection is the detection of the mouth. However, this is actually just a predefined area in the face. The area is defined as a rectangle with the following coordinates:

x_1 = c_x − (1/2)·r ,   y_1 = c_y + (1/3)·r
Equation 5.1: Left, top coordinate calculations

x_2 = x_1 + r ,   y_2 = y_1 + (1/2)·r
Equation 5.2: Right, bottom coordinate calculations

where x_1, x_2, y_1, y_2 are the rectangle coordinates, (c_x, c_y) is the face center and r is the radius.

OpenCV draws rectangles from the top left corner to the bottom right corner. This is specified in the code and calculated by using the information about the face center and the radius of the circle surrounding the face. Again, it is important to note for these calculations that the origin of an OpenCV image is specified as the top left corner. Illustration 5.28 shows the parameters in Equation 5.1 and Equation 5.2 in a graphical manner.

Illustration 5.28: The mouth detection function specifies an area around the mouth by starting at the center of the face (center of the black circle) and calculating the top left and bottom right corner of the green rectangle, using the radius of the black circle.

When the mouth is defined, the pixel data for the area is saved into an IplImage* and returned. If no face is found, the function returns a grey image. The next step is to resize the image to a 30x15 pixel image. This step is done purely to optimize the process and make the program run faster.
With only 450 pixels to compare, instead of maybe 3000-5000 pixels, the program will make the comparison much faster. Furthermore, the image should have exactly the same size to use the mean image detection method. 5.6.2 Comparison With the mean images loaded and an image of the mouth in the current frame, the actual smile detection can be made. This is done with the isSmiling function, which performs the following steps: Compare current image with smile template: 1. For every pixel in current image, do { 1. Get CvScalar value of current pixel in current image. 2. Get CvScalar value of current pixel in smileTemplate. 3. Increase diff by the absolute difference between the two images. 4. Store diff. } { 5. Get CvScalar value of current pixel in current image. 6. Get CvScalar value of current pixel in neutralTemplate. 7. Increase diff by square of difference between the two pixels. 8. Store diff. } 2. If squareroot from current image to smileTemplate < distance to neutralTemplate + threshold, return true, else, return false. The threshold value is used to bias the results towards smiling, if the program does not track smiles accurately enough. What was discovered during the implementation phase was that the distance from the image to the smileTemplate was often longer than the distance to the 147 Group 08ml582 Reactive movie playback 5. Implementation Smile detection neutralTemplate, resulting in the image being detected as not smiling, even if it really was. By adding the threshold value, as described in subchapter 3.8.1.2 Simple method, to the distance from the current image to the neutralTemplate, the distance from the current image to the smileTemplate will more often be smaller than the distance to the neutralTemplate, resulting in the image being detected as a smile. 5.6.3 Movie Playback The movie playback is performed by sending a command through the system function, which is part of the C++ library called stdlib.h. The system function executes commands through the command prompt and as such, any process can be started with this command. This functionality is used to start an instance of Windows Media Player, with the movie name included in the command call. An example of a command could be: “start wmplayer.exe C:\\Movies\\movie0.wmv /fullscreen” This command starts Windows Media Player and plays the file movie0.wmv, which can be found in the folder C:\Movies. /fullscreen at the end of the command will ensure that Windows Media Player is running in full screen. A playlist is composed and written as a txt-file, containing the commands for start of all needed video files and the smile detector program is then loading in this text-file into an array, from which the commands can be called. The function called playMovie takes as input an integer and an array of const char*. Calling the function, it will play the movie from the array, with the integer value specified. playMovie(2,file) will play the 3rd file (due to array index, where 0 is the first element of the array) in the array called “file”. In order to get statistics from the test, the program detects every time a user starts smiling and stops smiling. When a user starts smiling, the amount of milliseconds since program start is stored in a vector called smileTimer and no other value is stored until the user stops smiling again. This continues throughout the entire program and as the program terminates (when last movie is played), the content of the vector is saved in a txt-file for use afterwards. 
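To make the comparison step from chapter 5.6.2 concrete, a possible shape of the Euclidean-distance comparison is sketched below. This is an illustrative sketch rather than the project's exact isSmiling implementation; the function and variable names are chosen for the example, and the images are assumed to be the 30x15 grayscale mouth crops described above.

#include <cmath>
#include <opencv/cv.h>

// Compares the current mouth image against the two mean templates and returns
// true if it is closer (in Euclidean distance) to the smile template than to
// the neutral template plus a bias threshold, as described in chapter 5.6.2.
bool isSmilingSketch(IplImage* mouth, IplImage* smileTemplate,
                     IplImage* neutralTemplate, double threshold)
{
    double smileDist = 0.0, neutralDist = 0.0;
    for (int y = 0; y < mouth->height; y++)
    {
        for (int x = 0; x < mouth->width; x++)
        {
            double p = cvGet2D(mouth, y, x).val[0];
            double s = cvGet2D(smileTemplate, y, x).val[0];
            double n = cvGet2D(neutralTemplate, y, x).val[0];
            smileDist   += (p - s) * (p - s);   // squared difference per pixel
            neutralDist += (p - n) * (p - n);
        }
    }
    // The threshold biases the decision towards "smiling", since the distance
    // to the smile template tends to come out larger in practice.
    return std::sqrt(smileDist) < std::sqrt(neutralDist) + threshold;
}

In the actual program, the threshold is the value investigated in the cross validation test in chapter 6.1.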
This concludes the implementation of Smile Detection Program, which is capable of loading the image files from the Training program, utilizing these as templates for smile detection. The Smile Detection Program will detect the face of a user and determine whether the user is 148 Group 08ml582 Reactive movie playback 5. Implementation Implementation Conclusion smiling or not. Depending on the results of the smile detection, the program will choose to play a movie clip according to these results. 5.7 Implementation Conclusion In this chapter it has been shown how the entire product was created, based on the designs made in chapter 4. Design. First of all, the actual implementation of the character within Maya, including techniques concerning the process of modeling the character was covered, followed by the techniques and methods utilized when rigging the character and setting it up for animation. And then, the animation process and the tools used for this were covered. Three programs where created for the smile detection, the first of these was the Smile Picture Capture which was used to create a series of images of people smiling and non-smiling people, and categorize them accordingly. The second program was the Training Program which then took all of the pictures that were taken in the Smile Picture Capture program, processed them and calculated the mean values of each pixel coordinate from the pictures from each category. The calculated means where then saved into two files “smileTemplate” and “neutralTemplate”. The third program was the Smile Detection program, which is the program actually used to detect the smiles. This was done by using a webcam feed, locating the mouth and then comparing each pixel coordinate in that region to that of the mean from the training data provided by the second program. The data itself is also used by the program in order to determine which of the animated movies should be played. With this, a working prototype has been created consisting of animated movie clips, and a program that detects smiles and then chooses between the movies. 149 Group 08ml582 Reactive movie playback 6. Testing Cross Validation Test 6. Testing This chapter will cover the setup and conduction of the test that has been performed in the project. First, the cross validation will be covered as it was described in chapter 2.5 Testing the product and it will be checked if the detection rate fulfills the criteria set (75% is required 80% is desired). Next, the tests involving users will be introduced, using the DECIDE framework and finally the results of the initial and final test will be presented, analyzed and evaluated. The conclusion of this chapter will determine if the solution proposed in this project, is a valid solution for the problem formulation. 6.1 Cross Validation Test As mentioned in chapter 2.5 Testing the product, there are some requirements to the successfulness of the smile detection program, in order for the final test to be useful. It was established that a success rate of at least 75% was needed, and a success rate of at least 80% was desired. The method of doing cross validation test was discussed in chapter 2.5 Testing the product, so this chapter will be concerned about the implementation of the test and the results produced. For conducting the test, another small application was developed. 
However, the functions of this new application were already implemented in the actual smile detection program, so for doing the cross validation test, only some minor changes were needed. Instead of testing on all of the training data, it was necessary to subtract 10% of the training data and use it as test data. This was done by running through a loop in the code, each time subtracting a new 10% from the training data and then specifying the same 10% as the test data, until all images have been through the test. In the case of the cross validation for this program, it means that 16 images were taken out of each class of the training data to be used as test data for each iteration of the test. This leaves 298 images as smile training data and 314 as neutral training data, because all images have been flipped and copied. The program will then produce a text file with the results for the neutral images and the smiling images, stating the success rate. Furthermore, a threshold was also implemented, so the cross validation should take different threshold values into account.

Two methods were tested in the cross validation, both of them described in chapter 3.8 Smile detection, and what will be described here are the results of both tests, run at different threshold values, as described in chapter 5.6 Smile detection. The first method uses no length calculation, but calculates the absolute pixel difference between two pictures. The second method calculates the distance between the images, represented as vectors in Euclidean space, and uses this for comparison.

6.1.1.1 Method 1

Referring back to chapter 2.5 Testing the product, the way to represent the cross validation is to make a table showing the detection rate for both smiles and non-smiles. For example, at a threshold of 0, the results of the cross validation for method 1 could be illustrated as in Table 6.1.

t = threshold        Detected as smile    Detected as non-smile
Smiles (t=0)         76.67%               23.33%
Non-smiles (t=0)     7.5%                 92.5%

Table 6.1: The results of the cross validation of the first method, at threshold 0, show a good detection rate for non-smiles, but a poor detection rate for smiles.

This shows that 76.67% of the smiles were actually detected as smiles, whereas 92.5% of the non-smiles were detected as such. In Table 6.2, a few selected thresholds have been picked out, to show the results at different threshold values.

t = threshold        Detected as smile    Detected as non-smile
Smiles (t=2500)      83.33%               16.67%
Non-smiles (t=2500)  11.67%               88.33%
Smiles (t=5000)      91.67%               8.33%
Non-smiles (t=5000)  17.5%                82.5%
Smiles (t=7500)      97.5%                2.5%
Non-smiles (t=7500)  24.17%               75.83%
Smiles (t=10000)     100%                 0%
Non-smiles (t=10000) 35%                  65%

Table 6.2: The results of the cross validation test run on the first method, at a few selected threshold values.

As can be seen from Table 6.2, a threshold higher than 7500 will be too high, as the non-smiles success rate is getting too low at that point. In order to determine the optimal threshold value, more comparisons have to be made.
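Before the results of the second method are presented, the leave-out loop described at the start of this chapter can be sketched as follows. This is an illustrative sketch of the partitioning idea only, not the project's actual test application, and all names in it are hypothetical:

#include <vector>
#include <opencv/cv.h>

// Illustrative sketch of the leave-10%-out loop: in each iteration one block of
// images is held out as test data and the rest is used to train the templates.
void crossValidateSketch(const std::vector<IplImage*>& images, int imagesPerFold)
{
    int folds = (int)images.size() / imagesPerFold;
    for (int fold = 0; fold < folds; fold++)
    {
        std::vector<IplImage*> training, test;
        for (int i = 0; i < (int)images.size(); i++)
        {
            bool heldOut = i >= fold * imagesPerFold && i < (fold + 1) * imagesPerFold;
            if (heldOut) test.push_back(images[i]);
            else         training.push_back(images[i]);
        }
        // Here the mean templates would be recomputed from "training" and the
        // detection rate measured on "test", for each threshold value of interest.
    }
}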
6.1.1.2 Method 2

t = threshold        Detected as smile    Detected as non-smile
Smiles (t=0)         74.79%               25.21%
Non-smiles (t=0)     7.56%                92.44%
Smiles (t=125)       84.03%               15.97%
Non-smiles (t=125)   9.24%                90.76%
Smiles (t=250)       93.28%               6.72%
Non-smiles (t=250)   14.29%               85.71%
Smiles (t=375)       97.48%               2.52%
Non-smiles (t=375)   17.65%               82.35%
Smiles (t=500)       100%                 0%
Non-smiles (t=500)   24.42%               70.58%

Table 6.3: The results of the cross validation test run on the second method, at a few selected threshold values.

As can be seen in Table 6.3, the threshold values are different for this test, which is due to the way the difference between the images is computed. In the second method a square root computation is involved in calculating the difference between the images, which was not the case in the first method, in which the total difference was computed. Therefore, the values in the second method are smaller than in the first method and the threshold has to be scaled accordingly. In order to compare the two methods, the average of the detection rates for smiles and non-smiles is computed for both methods. Illustration 6.1 and Illustration 6.2 show the average detection rate for smiles and non-smiles in both methods.

[Graph: average detection rate in % plotted against threshold values from 0 to 9000 for the first method.]
Illustration 6.1: The first method peaks at just above 89% average detection rate, at a threshold of 6600-6700.

[Graph: average detection rate in % plotted against threshold values from 0 to 450 for the second method.]
Illustration 6.2: The second method peaks at almost 90% average detection rate, at a threshold of 215-220.

In the first method, the highest average detection rate is 89.08%, which is reached at a threshold of 6600-6700. In the second method the average detection rate peaks at a threshold of 315-320, where it reaches 89.92%. With an average detection rate almost one percentage point higher for the second method, this proved to be the most efficient way of doing the smile detection. Furthermore, the detection rates of both smiles and non-smiles in the second method, at the optimal average detection rate, are above the desired detection rate of 80%, as the detection rate for smiles is 96.64% at a threshold of 315-320 and the detection rate for non-smiles is 83.19%. For the first method, both values also pass the 80% detection rate, but the detection rate for non-smiles is only 80.67%, while the detection rate for smiles is higher, at 89.08%. Referring back to chapter 2.5 Testing the product, the results of the cross validation of both methods exceeded our expectations, but the second method is chosen because of its better result.

6.2 DECIDE framework

As mentioned in chapter 2.5 Testing the product, the DECIDE framework will be used to set up and evaluate the test to be performed in this project. The DECIDE framework offers a six-step guide to performing an evaluation, which is (Sharp, Rogers, & Preece, 2006):

1. Determine the goals.
2. Explore the questions.
3. Choose the evaluation approach and methods.
4. Identify the practical issues.
5. Decide how to deal with the ethical issues.
6. Evaluate, analyze, interpret, and present the data.

Following this framework ensures that many aspects in the evaluation context are covered.
The DECIDE framework is driven by goals, which assist in clarifying the scope of the test. Once the goals have been established, the next step is to begin investigating which questions to ask, and later which specific methods to use. Thus, whether one chooses to employ e.g. a usability test or a focus group interview depends on what the goals and questions are. Although the DECIDE framework is a list, it does not necessarily mean that the work should be done strictly step by step. The items in the list may be worked with iteratively, going backwards as well as forwards in the list.

6.2.1 Determine the goals

Based on the problem statement of the project, the overall goal of this test will be to determine whether or not the program created is successful in choosing the movie clip that the users actually thought was the funniest. If this is the case, the users should get the impression that the reactive playback of movies is funnier than just playing back random movie clips. The goal will be to obtain quantitative data about which movie clip the users thought was the funniest in their own opinion and which movie clip the users thought was the funniest according to the program.

6.2.2 Explore the questions

The questions asked should lead to achieving the goal by getting the relevant data from the users. In this test, the program itself will give many of the answers to the questions that need to be answered, e.g. when and for how long the users smiled. This data can be used to determine if the users are smiling more during the reactive version compared to the non-reactive version. However, some questions still need to be answered by the users. First of all, the users should be presented with a still shot from each of the three initial movies and be asked to choose the one they thought was the funniest. As the program should choose a type of humor corresponding to the one the user thought was the funniest, a question should also be asked about whether or not the user could see this correlation. However, the program might not choose the type of humor that the user identified as the funniest clip in the previous question, so this question should explore which of the three initial types the last two movie clips were most similar to. This sums up to the following questions:

• Did the program succeed in detecting the right humor type for the user?
• Which movie clip did the user like the most, according to himself?
• Did the user see the connection in humor type from the last two movies to the corresponding initial movie?
• Did the users smile more at the reactive version than at the non-reactive version?

6.2.3 Choose the evaluation approach and methods

The users will fill out a questionnaire, producing quantitative results for the questions asked. By comparing the data given by the users with the data produced by the program, it is possible to get results about the success of the product. The questions in the questionnaire produce quantitative data and so does the program, so the data is easily comparable. The test will be conducted by placing the user in front of a screen with an integrated webcam. The users will watch the movie clips on the screen, while the webcam is tracking their face.
155 Group 08ml582 Reactive movie playback 6.
Testing DECIDE framework The program will continuously save information about the user’s facial expression (smiling or not smiling), but during the test, the user is not supposed to do anything but to watch the movies. Afterwards, the user will be asked to answer the questions given on the questionnaire. Opposite of the user, one test observer will be watching the progress on another screen. This test observer will be able to see whether or not the program is tracking the user correctly, as he can see a symbol representing whether or not the program is tracking a smile or not, as seen in Illustration 6.3. Illustration 6.3: As the test is running, the test observer watching the process of the test is seeing these two symbols to identify whether or not the user is smiling. If the program is detecting that the user is not smiling, the program shows the left picture and vice versa. This allows the observer to change the smile threshold, if the program is tracking the current user’s smile poorly. Another observer is standing by to answer questions from the user before, under or after the test. This observer will also be in charge of handing out the consent form to be signed before the test starts and the questionnaire that is to be filled out after the test. The test persons will be divided into two groups: Half the test persons will test a reactive version, with the program reacting to their smiles, and the other half will test a non-reactive version, not considering the test persons’ smiles in the movie playback. 6.2.4 Identify the practical issues The practical issues of this test are mostly concerned with the test environment. The test persons will be fulfilling the target group description discussed in chapter 2.4 Target group, so there are no legal issues about age to be taken into concern. 156 Group 08ml582 Reactive movie playback 6. Testing DECIDE framework The test will be conducted at Aalborg University Copenhagen and as students at this institution, we can use the rooms freely, which rules out the need of taking into concern economical issues. At a later test of the product, it might be prudent to test it in the environment and conditions that it is supposed to be used in, but for this prototype test, the conditions within the university will be sufficient. The equipment needed is limited to a computer, with a webcam running the program which has been developed. It is not necessary to have any external cameras running, to record the movement of the users as the webcam will be sufficient to track the things needed. However the users will need to sign an agreement that they are going to be recorded. The test of each user will be only around 10-15 minutes, including answering the questionnaire. This means that it should be possible to get a sufficient number of participants tested during a day of testing. Since this test will only produce indications to the successfulness of the product and indications about the whether or not the goal is achieved, a test group of roughly 30 persons will be sufficient, 15 for each of the two test methods. 6.2.5 Decide how to deal with the ethical issues The ethical aspects of the test cover the rights of the users we are going to test on. The users have some basic rights that are to be thought of when conducting a test. First of all, the subjects will have to give some personal information, in order to assure that they fit the age requirements for the target group, so the only necessary information is about age. 
Before the test starts, the participants will have to be informed about what they are going to be exposed to during the test and what the goals of this test are. The subjects have the right to know what the goals of this project are and what their role is in relation to the project. This means that the users will have to sign a consent form, where they give their consent for us to use any data that the test will give. When the test is running, the subjects have the right to leave at any time, which they should also be informed about. The subjects are also allowed to leave even before the test starts, should they not accept the terms of the test. Since the participants of this test will be recorded using a camera, they will need to accept this recording and its use in the project. The test could be performed without recording the participants, but in order to verify the results afterwards and have some material to analyze, each participant will be recorded, unless he explicitly decides not to be recorded. All these ethical rights have to be written down, such that each participant has given their consent to the terms and has signed the consent form. This consent form can be seen in Appendix X.

6.3 Analyzing the results
In this sub-chapter the results acquired from both the initial and the final test will be analyzed. Selected results will be displayed graphically and the results will be discussed. Furthermore, the test method will be evaluated and suggestions for an extensive test will be presented.

6.3.1 Initial test
The first test that was conducted was a test to see whether or not the users agreed with the choices the program made. The program is, as explained in chapter 5. Implementation, designed to measure the amount of time the user is smiling during one specific clip and then compare the three initial clips to determine which of the movie clips was liked the most. This test took advantage of this feature and measured the user during a test of approximately 2-4 minutes. The users were placed in front of a laptop screen with a built-in webcam, which was used for tracking. One test observer was sitting behind another screen, watching the progress of the program and making sure the program was tracking as supposed to. Two other test observers were present to assist the test subject and answer questions. Before the test started, the users signed a consent form about their participation in the test and their agreement to the terms of the test. Then they were instructed to just sit back and watch the movie, and afterwards they were asked to fill out a questionnaire about the movie they saw. Since this was an initial test, a total of only 10 persons participated. The small number of participants is justified because the test only served as an indication of the validity of the test method. The participants were asked to answer the following questions:
• On a scale from 1 to 5, where 1 is no change and 5 is a higher number of changes, how often did you find the type of humor to change during the test?
• On a scale from 1 to 5, where 1 is no control and 5 is full control, how much did you feel you had control over the choice of movie clips?
• If you felt that you were in control, how did you feel you were controlling the movie clips?
• Which of the following clips did you think was the funniest? (The user was presented with three screenshots, one from each of the three initial movies)
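The comparison the program makes between the three initial clips can be summarised as picking the clip with the longest accumulated smile time. The sketch below shows one way this rule could look; the handling of draws and of users who never smile is an assumption added for the example, since the implemented program initially failed on exactly those cases, as described below.

// Sketch: choose the preferred clip as the one with the longest accumulated
// smile time. Returns -1 for results that cannot be decided (no smiles at all,
// or a draw between the highest-scoring clips).
#include <vector>

int choosePreferredClip(const std::vector<double>& smileMs)
{
    int best = -1;
    double bestMs = 0.0;
    bool draw = false;

    for (int i = 0; i < (int)smileMs.size(); ++i)
    {
        if (smileMs[i] > bestMs)      { best = i; bestMs = smileMs[i]; draw = false; }
        else if (smileMs[i] == bestMs && bestMs > 0.0) { draw = true; }
    }
    return (best == -1 || draw) ? -1 : best;   // -1 marks an invalid test
}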
Illustration 6.4: From the right: 1 is the first type of humor (ball play), 2 is the second type of humor (black humor) and 3 is the third type of humor (falling apart)

The program itself returned data about what movie clip the user smiled at the most, for comparison after the test. In Appendix 12.4 Test Results, the full results for the test can be seen, but in this chapter, only the findings about the choice of movie clips will be discussed in depth. Looking at the results of what the users chose as the funniest clip, Illustration 6.5 shows the distribution of answers.

[Bar chart: Preferred clip (User), number of users choosing ball play, black humor and falling apart]
Illustration 6.5: The distribution of answers to the question of which movie clip the user found the funniest. 1 is the first type of humor (ball play), 2 is the second type of humor (black humor), 3 is the third type of humor (falling apart)

As can be seen in Illustration 6.5, there is an uneven distribution of preference towards the movies. 5 people thought that the black type of humor was the funniest, while 4 users thought the falling apart type of humor, in which the character is losing limbs, was the funniest. Only one person preferred the ball play humor. However, the results produced by the program, measuring how long the users were smiling at each clip, produced somewhat different results, as seen in Illustration 6.6.

[Bar chart: Preferred clip (Program), number of users detected to smile the most at each clip]
Illustration 6.6: The program detected a somewhat different distribution of what the users liked. Here, the 1st and 3rd types of humor each received 3 votes, while the second type received only two. Two tests turned out invalid.

First, it has to be noted that two of the tests turned out invalid. In one test, the user smiled so much that it ended in a "draw" between two of the movies. In another case, the user was not smiling at all (or at least not enough to make the program react). This meant that none of the movies received any score and thus, the program could not decide which clip the user liked the most. These problems in the program were something to be solved before the final test, preventing a premature termination of the program. As can be seen from Illustration 6.5 and Illustration 6.6, there were differences between the users' answers and what the program detected. However, to get a clear view of the result, it is necessary to look at each test subject and their answers, compared to the results of the program. This comparison can be seen in Table 6.4:

Test subject   User choice   Detected funniest   Correct choice?
Person 1       2             N/A                 False
Person 2       3             2                   False
Person 3       2             N/A                 False
Person 4       2             3                   False
Person 5       3             1                   False
Person 6       1             1                   True
Person 7       3             3                   True
Person 8       2             2                   True
Person 9       2             3                   False
Person 10      3             1                   False

Table 6.4: Distribution of answers for each test person and the corresponding result of the program reading.

As can be seen, only 3 of 10 test persons actually chose the funniest clip to be the clip that the program detected them to like the most (that is, smiled the most at).
The reasons for this result can be many, one of them being the program developed during this project. There were instances where the program lost track of the user's face because the user moved in his chair, ending up outside the camera's view. In another instance, the user started scratching his nose during the test, which also prevented the camera from tracking him correctly. Another problem is that the user might not actually think that the funniest clip is the one that he/she smiles at the longest. What the program lacks is the ability to measure how much the user is smiling, rather than only measuring the amount of time he smiled. If the user finds that one movie clip has many small funny situations, he might actually smile for a long period during that clip, but find that another clip is funnier for a short period of time and prefer that clip.

6.3.2 Final test
The final test was in many ways conducted in the same way as the initial test. The general test setup was the same, but instead of only testing on the reactive program, every second test person was tested on a program that did not react according to the user's smile. The non-reactive program simply chose two movie clips without taking the amount of smiling during each clip into concern. Furthermore, the questionnaire was changed to reflect what was described in chapter 6.2 DECIDE framework, since the only important information needed from the users was what movie they preferred and whether they could recognize the humor type from the initial to the ending movie clips. Therefore, the users only had to choose which of the first three movie clips was the funniest and which of the three first movie clips was most similar to the two last animatics. This was done to determine if the user could recognize a correlation between a type of humor from the animated clips and the type of humor in the storyboards. For this test it was not important whether or not the users felt they were in control or if they understood the interaction method, as the comparison could be made between the reactive and the non-reactive version. The final test was conducted on a total of 30 test persons, 15 testing the reactive version and 15 testing the non-reactive version. The full data set can be seen in Appendix 12.4 Test Results, but in this chapter, only the key elements of the test will be discussed. First of all, the program's ability to detect what movie clip the user liked the most was tested. This test was performed on all 30 test persons, even though the information was only used in half of the cases. With the information saved by the program and the answers that the users gave, the results shown in Illustration 6.7 were obtained.

[Bar charts: Preferred clip (User) and Preferred clip (Program), number of users per clip: falling apart, black humor, ball play]
Illustration 6.7: The left illustration shows the distribution of answers given by the users as to what movie clip they liked the most. The right illustration shows what the program detected that they liked the most. Despite the difference between what the users chose and what the program chose, the program did choose the type of humor that the users preferred in half of the cases.
As seen, the users had a strong preference toward the third movie clip (ball play humor), but the program only detected the users to be smiling the most at this clip in 11 cases. The first movie clip (falling apart humor), however, was only preferred by 3 test persons, but the program detected that the users were smiling the most at this clip in 11 cases too, just as for the third movie clip. Looking back at chapter 6.3.1 Initial test (note that the first and third types of humor have been interchanged, such that the ball play humor is now the third type of humor, compared to being the first in the initial test), the results in the initial test differ from the results of the final test in what movie clip the users preferred. The animatics produced apparently did not reflect the story in the scenes as well as the animated counterparts. When it comes to the users connecting the last two movie clips to one of the three initial clips, the users had difficulties in recognizing them. In total, 11 of the users related the type of the last two movie clips to the correct initial one, while 19 did not. However, 8 out of the 11 correct choices were found in the reactive version. This shows that the users were in general not able to relate one of the first three animated movie clips to the last animatics with the same humor type.

As explained in the DECIDE model, the reactive version is expected to entertain the user more than the non-reactive version. This was tested by measuring for how long the users smiled during the test. Illustration 6.8 shows for how long the users smiled on average during the two different test methods.

[Bar chart: Average Smile Time in milliseconds, reactive vs. non-reactive]
Illustration 6.8: As seen in the above illustration, the users were on average smiling more at the reactive version of the video playback than at the non-reactive version.

This test cannot prove that the users liked the reactive method more than the non-reactive one, just because the users smiled more at one kind of playback than the other. However, it is an indication that the users actually did smile more when they were shown the type that they initially preferred (at least by the program's definition). With the information gained from this test, there is an indication that the reactive version of the product is actually more entertaining than just watching the movie clips alone. However, this is based on the assumption that the users are always smiling for the longest period at the movie they thought was the funniest. This is, as already described in this chapter, not always the case, since the users in half the cases chose another movie clip as the funniest than the one the program had detected. One movie clip might seem very funny to the user for a short period of time, while another movie clip might seem less funny, but for a longer period of time. In this case, the program would detect that the user liked the second movie clip better, simply because the user smiled for a longer period in this movie clip, but in reality the user might think the movie with a short, but very funny situation was the funniest. A more accurate test result would require that the program could also detect the degree of smiling. There were also indications that the users had problems connecting the humor type from the initial movie clips to the following movie clips.
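One way to take the degree of smiling into account would be to weight each frame's contribution by an intensity value instead of counting time alone. The sketch below is purely illustrative; the intensity measure (assumed here to be a normalized value between 0 and 1) does not exist in the implemented detector, which only reports a binary smiling/not smiling state.

// Sketch: intensity-weighted smile score as an alternative to plain smile duration.
// 'intensity' is an assumed per-frame measure in [0,1]; the implemented program
// effectively only has the binary case (0 or 1).
#include <vector>

struct FrameSample {
    int clip;          // which of the three clips (0-2) was playing
    double elapsedMs;  // time covered by this frame
    double intensity;  // assumed smile intensity, 0.0-1.0
};

std::vector<double> scoreClips(const std::vector<FrameSample>& samples)
{
    std::vector<double> score(3, 0.0);
    for (size_t i = 0; i < samples.size(); ++i)
    {
        const FrameSample& s = samples[i];
        score[s.clip] += s.elapsedMs * s.intensity;   // short but intense smiles now count more
    }
    return score;
}

With such a score, a short but strong laugh could outweigh a long, faint smile, which is exactly the behaviour the duration-only measurement cannot capture.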
Regarding the users' difficulties in connecting the humor types, only animated video clips were available for the three initial movies, so the significant change in look of the movie clips (full animation vs. animatics consisting of drafty storyboards) might have confused the users so much that they were not able to recognize the change in humor type. Furthermore, the different humor types might have been too close to each other to clearly identify one type from another.

The data produced by the program during the test also made it possible to establish at approximately what times the users smiled at each movie clip. There was some delay between the program and the video player showing the videos to the user, but an exact duration of this delay could not be established, and since the program returned times of smiling in milliseconds after program start, the values produced by the program might not correspond exactly to the correct times during the movie clips. It will however give an indication as to whether or not there were humoristic peaks in the movie clips that were coherent with what was expected to be funny. What should ideally be seen in the graphs is that few or no people are smiling during the sequences of the movie clips that were not designed to be funny, and that a high number of people are smiling when the scenes go into a funny sequence. What can be seen in Illustration 6.9 is how the users reacted to the first movie clip, with the falling apart type of humor. The graph is made by rounding all data about smiles to whole seconds and then plotting how many users were smiling at each individual second during the movie clip. Note that the data is gained only from people testing the reactive version.

[Line graph: Smile Development, falling apart; number of users smiling per second over roughly 33 seconds]
Illustration 6.9: At certain times during the movie clip with the falling apart type of humor, there were peaks where more than half of the users were smiling at the same time.

Comparing the data in Illustration 6.9 with the actual video and what was supposed to be funny, there are similarities. This was the first video to be played in the test, so this was the very first time any of the users saw the main character. The video started with a short period of only music playing, so the first peak (the green dot) is actually where the users first see the character. Even though this was not supposed to be a funny element of the movie clip itself, the character was designed to be perceived as likeable, so the reaction of the users smiling is an indication that they perceived the character as intended. The next peak (the yellow dot) is around the time where the character finds the mysterious music box, lifts it and shakes it, but the funniest element of this movie was designed to be the part where the boxing glove knocks the head of the character. In this period, however, there were actually fewer users smiling, which might have been due to the implementation of the movie. The boxing glove appears for less than one second in the movie, with no previous introduction, and the users might have had a hard time perceiving what actually happened to the character in this short period. In the end, the curve peaks again (the red dots), which is approximately around the time where the character turns to walk out of the scene, his head on backwards, and crashes into something off-screen.
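The per-second counts behind these smile development graphs can be produced by a simple binning of the logged smile times. A minimal sketch follows; the assumed input is, for each user, the set of whole seconds at which that user was registered as smiling, which is a simplification of the actual log format.

// Sketch: count how many users were smiling at each whole second of a clip.
// Each user's data is assumed to be the set of seconds during which a smile
// was registered for that user.
#include <set>
#include <vector>

std::vector<int> smilesPerSecond(const std::vector<std::set<int> >& userSmileSeconds,
                                 int clipLengthSeconds)
{
    std::vector<int> counts(clipLengthSeconds, 0);
    for (size_t u = 0; u < userSmileSeconds.size(); ++u)
        for (int sec = 0; sec < clipLengthSeconds; ++sec)
            if (userSmileSeconds[u].count(sec) > 0)   // user u was smiling during this second
                ++counts[sec];
    return counts;
}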
This crash was a final twist to the movie clip, and the users seemed to have found the surprising end funny, or at least to have liked it.

The other movie clip that, according to the program, was the funniest to watch was the clip with the ball play type of humor. In this movie clip, the humoristic peak was designed to be around the point where the character goes into slow motion while dodging the small balls coming towards him. This actually did seem to make the users smile, and again taking into consideration the delay of the time values, the highest peak in Illustration 6.10 (the yellow dots) matches the period of time in the video where the character starts to go into slow motion and the camera starts moving around him. However, it seems the users got bored of the effect quite quickly, or that they did not understand what was happening when all the balls came flying back, since there was a strong decrease in the amount of smiles in the last part of the slow-motion sequence.

[Line graph: Smile Development, ball play; number of users smiling per second over roughly 43 seconds]
Illustration 6.10: The clip with the ball play type of humor had a high peak around the time of the slow-motion sequence.

There was a small peak towards the end of the timeline (the red dots), when the character thinks he has dodged all the balls, but then a single, delayed ball comes flying towards the character and knocks him over. This was the "joke" that this movie clip was based around, and even though there was a rise in the number of people smiling, more people enjoyed the slow-motion sequence than this last humoristic development. For reference, the development of smiles during the third movie clip is shown in Illustration 6.11, but there were no real indications of what was preferred and what was not during this movie clip. The actual fun part of the movie clip (the character dropping to the ground and having a piano drop on top of him) made fewer people smile than the introduction and the jumping on the trampoline. However, this clip was also the clip that made the fewest people smile in total, and therefore there was not as much data to work with when making this graph.

[Line graph: Smile Development, black humor; number of users smiling per second over roughly 35 seconds]
Illustration 6.11: The clips of the black humor showed no real indications as to what was preferred or not by the users and what made them smile the most. There are no real peaks in the graph.

Even though the data about when the users smiled gave indications, it is difficult to prove that the users did smile at the actual sequences they were planned to smile at. The peaks on the graphs were not clearly separated from the theoretically non-humoristic parts, where the graph should have been at a lower level. Furthermore, at no point did more than eight of 15 users smile at the same moment in time, and even if there were delays in the data produced, the peaks should still be clearly separated, even if they were shifted a little in either direction on the timeline.

6.3.3 Extensive testing
For future development and testing, a more extensive test could be conducted, compared to the final test conducted in this project.
Even if the method used for the final test in this project was performed without major flaws, there are more things to be tested and taken into consideration for a future test. Even though the final test was conducted on 30 persons, there was a chance that the users tested on the reactive version were simply more expressive when seeing something they found amusing than the users testing the non-reactive version. For further development of the product, it would be prudent to introduce a focus group in the development stage. By testing the product at different stages of the development, it would be possible to map the users of the focus group to the different types of humor and to how expressive each of the users is when watching the movie clips. With this information, it would be possible to do a more balanced test to compare a non-reactive and a reactive movie playback. The final test in this project placed the users in an unfamiliar situation, because they were placed together with two test observers that they did not know beforehand. Had the users been placed in an environment they were familiar with and around people they knew, it might have been easier to influence them and make them smile. If a focus group were used, the users would have a chance to become familiar with this situation and feel more relaxed in the test situation.

6.4 Conclusion of the test
The cross validation proved that both the required and the desired requirements were fulfilled. With the method chosen for the smile detection, an average success rate of 89.92% was found at the optimal threshold setting, which proved that the technical aspect of the smile detection was useful for the final test involving users. The initial test involving users proved that the test setup and method were useful for further testing. Minor adjustments had to be made to the detection program before the final test, in order to prevent invalid test results. The final test showed that the smile detection program detected the type of humor that the users actually thought was the funniest in half the cases. This result might be due to the structure of the movie clips and the users' personal preferences, since the success rate of the smile detection was already proven to be sufficient. The individual movie clips would have to be funny for an equally long period of time in order for the program to make a fair comparison between the different clips. The final thing that was investigated in the analysis of the final test was the coherence between the times at which the users smiled at the animated movie clips and what was expected of the movie clips when they were implemented. For the falling apart type and the ball play type of humor, there were indications that, in general, more people smiled during the periods of the movie clips that were designed to be the funniest. However, there was no clear proof in the results, and for the black humor it was impossible to even get usable indications, partly due to lack of data, since the users were not smiling as much during this clip compared to the two others. What the testing of this product indicated is that it is indeed possible to make an adaptive program, the smile detector, which controls the type of humor in a series of movie clips according to a user's reaction to previously played clips.
However, the movie clips would have to be highly adjusted for this purpose, or the program would have to be able to detect smiles in a different way than just by the amount of time the user was smiling. The program did react to user input, but the general interpretation of the data obtained from the users would have to be altered to make the program choose the correct type of humor in every case.

7. Discussion
There are a number of things that could have been improved about the product of this project, in order to prove the final problem formulation even more extensively. First of all, it would be prudent to have more animated clips and make them seem connected through one long story. The test showed that users had problems connecting the type of humor in the animatics with the type of humor in the animated movie clips. It would first of all be necessary to have everything fully animated and to make sure that every clip is connected through a narrative structure to both the previous and the next movie clip. Furthermore, it also proved hard to distinguish between the various types of humor. Without the use of dialogue, the types of humor available were limited to slapstick humor only. And even though the movie clips are variations of slapstick humor, it can be difficult to immediately distinguish between the three humor types. All three animations involve the character getting injured in various ways, and even though these ways are what actually distinguish the humor types from each other, the viewer might simply view them as three instances of the character getting hurt. Therefore, adding different types of dialogue-based humor, making a clear difference between speaking and non-speaking movie clips, could greatly help to distinguish between the humor types. In order to enable the program to read more subtle smiles, it would be sensible to include the ability to detect more than just a smile. It was established during the test that users did not necessarily like the movie clip they smiled at the most. This was due to the fact that a user might like one short joke more than a long joke or a lot of small jokes. As the program could only detect how long the users were smiling, it would determine the clip that the user smiled at the longest as being the one that the user thought was the funniest. In order to take care of this problem, the program would have to be adjusted to be able to read the degree of a smile, to determine what the user actually thought was the funniest. The last thing to improve is the amount of training data for the program. With more training data, the detection would be even more accurate and would also to a higher degree be able to read people with e.g. a beard. After this discussion of what can be improved about the product of this project, a conclusion which gives an overview of the entire project is presented.

8. Conclusion
By reading the preface along with this conclusion, the reader should be able to get an overview of the project; a short summary of the contents and purpose of each chapter in the report is given in this conclusion. The motivation of the entire project was to challenge the nature of communication between a movie and a viewer. Traditionally, this communication is entirely one-way, in that only the movie can influence the viewer.
This has made watching a movie a rather passive experience, and the effects a movie can have on a viewer never go beyond reaching the viewer. This project aimed at seizing these effects experienced by the viewer and utilizing them with the goal of changing the movie itself. This should make watching a movie a much more personal and specific experience, based on the mood of the viewer when watching it. In order to formulate a specific problem from this motivation, the problem inherent in the motivation was researched, investigating the other work done in this area of having a viewer's mood influence the movie being watched. Different attempts at creating interactive movies were examined, along with ways of detecting a person's facial expression, since doing so was decided upon as the method of registering a viewer's mood. From this it was concluded that no previous work had been done to achieve a solution which could satisfy the motivation. Thus an initial problem statement for the project could be formed. This was the first step in specifying the motivation into a problem to work with in the rest of the project. In order to narrow down this statement into a concrete final problem statement, it was analyzed in its various areas. Firstly, the specific mood to focus on in the project was determined to be "happy". The next step was researching storytelling in a movie in order to understand how to evoke a specific mood in a viewer. On a more technical side, it was examined whether to use real-time or pre-rendering to obtain the best way of presenting the movie to the viewer, and pre-rendering was found to be the optimal choice. A target group for the product was determined as primarily being people above the age of 18. And finally it was chosen to test the product through cross validation and the DECIDE framework. This preliminary analysis enabled the group to decide upon a specific area of interest, from which the final problem formulation was derived. At this point, the motivation had been specified into a precise problem, upon which the remainder of the project work was based.

As with the initial problem statement, this final statement was broken down into the areas that needed to be researched in order for the final statement to be analyzed and fulfilled. This research was divided into two main areas. Firstly, relevant research was done to gain sufficient knowledge about how to use the medium of a movie, such as the deliberate use of sound or the camera, to mediate an effect to the viewer. Secondly, research concerning the creation of a program to capture the facial expression of the viewer, such as which programming language to use and which method of detecting a smile to use, was done. The analysis of the final problem formulation became the foundation of the design of the product, a product that aims at fulfilling the motivation and solving the final problem formulation. The project character was designed taking inspiration from e.g. Warner Brothers and Walt Disney. Next, the different types of humor to be used in the movie clips were chosen as variants of slapstick humor. Nine drafty storyboards were created, involving the character being exposed to each humor type, with three storyboards per type, using knowledge gained from analyzing the medium of movies, such as the use of acting.
In order to bring the character to life, several animation techniques, such as Disney's 12 Principles, were examined and chosen to be used in correlation with the drafty storyboards. In the last part of the design phase, it was chosen how to implement the smile detection, using Haar cascades for face detection and a simpler method for detecting smiles. The decisions made in this design chapter served as a template for the realization of the actual product of the project. In the implementation of the movies, the techniques used for modeling the character, such as box modeling, were explained. Next, the process of rigging the character, thereby preparing him for animation, was explained. Lastly, the techniques used for animating the character, e.g. using Key Frames, were explained. During the implementation of the program, an application to obtain training data for smile detection, a program for training the smile detection and the actual smile detection program were developed. Each of the programs was developed in C++, using the OpenCV library for the image processing. This implementation resulted in the creation of a product to be tested in order to fulfill the motivation and prove the final problem statement.

With both program and movie clips implemented, the combined product was tested. A cross validation of the smile detection proved that the success rate of the program fulfilled the requirements set, with an average detection rate of more than 89%. The user tests proved that the program was able to detect the users smiling, but only in half the cases was the detection of the program coherent with the users' own opinion. The test indicated that the users were smiling more when they watched the reactive version of the program, which chose clips of their preferred type of humor, than if they watched a program that did not take the smiling into consideration. This test indicated that the final problem formulation was indeed solved by this project and that the product is in fact one possible solution to creating an adaptive program reacting to the smile of a user.

9. Future perspective
When looking at the final product of the project, the initial idea of making the user passively control the narrative has been delimited into selecting small narratives with the same starting point. A natural successor of the product could be one long-lasting, continuous narrative, which reacts to the user's facial expressions. By having only one long story and not a lot of different short stories, the user experience would be less sporadic due to the continuous story, but also because of the non-interrupting nature of the passive interaction form. During the development of the product, it has become clear to the group that the area of animation and interactive controls has many facets. The passive interaction form suggested by this report provides possibilities for further exploration. The passive control of a medium, through facial expressions and moods, offers a wide variety of usages, not only within the area of the implementation used in this project, but also within many daily routines. The adjustment of a product's functionality according to its user's mood can be applied in many home appliances, such as lamps, windows, beds etc. Imagine the color of a lamp changing according to the user's mood.
In connection with a PC, the smile detection specifically can be used in connection with many online applications, communities etc. One example could be the video portal YouTube, which already has the functionality of recommending video clips to the users, according to what the individual user has previously watched. This could be further developed by implementing smile detection on the site and make YouTube detect whether or not the user actually liked the clips he was shown. The technology could also be applied in connection to music, where an online radio station could create a personalized playlist, based on the users’ reaction to different types of music. Reading all kind of reactions would also require the technology to be able to detect other facial characteristics than just a smile, e.g. eye movement, blushing or other facial expression, such as frown. This shows that, even outside of the main concept of altering the narrative through smile detection, there are a large number of possibilities to enhance every day entertainment and appliances using passive interaction. 175 Group 08ml582 Reactive movie playback 10. Bibliography 10. Bibliography Adamson, A., & Jenson, V. (Directors). (2001). Shrek [Motion Picture]. American Association for the Advancement of Science. (2004, 12 16). Finding fear in the whites of the eyes. Retrieved 10 02, 2008, from American Association for the Advancement of Science - EurekAlert: http://www.eurekalert.org/features/kids/2004-12/aaft-ffi020805.php American Heritage® Dictionary. (2008). The American Heritage® Dictionary of the English Language: Fourth Edition. Houghton Mifflin Company. Animation Mentor. (2008). Requirements to Attend Animation Mentor. Retrieved October 6, 2008, from Animationmenter.com: http://www.animationmentor.com/school/requirements.html AnimationMentor.com (Director). (2008). Student Showcase Summer 2008 [Motion Picture]. Aristotle. (350 B.C.). Poetics. Avery, T. (Director). (1940). A Wild Hare [Motion Picture]. Avery, T. (1944). IMDB. Retrieved from The Internet Movie Database: http://www.imdb.com/title/tt0037251/ Avery, T. (Director). (1937). Porky's Duck Hunt [Motion Picture]. Bio-Medicine.org. (2007, 11 11). Nervous system. Retrieved 10 02, 2008, from Bio-medicine.org: http://www.bio- medicine.org/biology-definition/Nervous_system/) Biordi, B. (2008). Retrieved December 6, 2008, from Dragon's Lair Fans: http://www.dragonslairfans.com/ Bioware. (2007). Mass Effect Community. Retrieved 10 15, 2008, from Bioware: http://masseffect.bioware.com/ Bluth, D. (2004). The Inbetweener. Retrieved 11 23, 2008, from Don Bluth's Animaiton Academy: http://www.donbluth.com/inbetweener.html Bordwell, D., & Thompson, K. (2008). Film Art. New York: McGraw-Hill. Bradski, G., & Kaelhler, A. (2008). Learning OpenCV: Computer Vision with the OpenCV Library. O'Reilly Media, Inc.: 1st edition. Bros., W. (1930-1969). Looney Tunes. Retrieved 12 12, 2008, from Looney Tunes: http://looneytunes.kidswb.com/ Cincerova, A. (2007, June 14). Groundbreaking Czechoslovak interactive film system revived 40 years later. (I. Willoughby, Interviewer) 176 Group 08ml582 Reactive movie playback 10. Bibliography Clair, R. (1931). Le Million. Retrieved 12 12, 2008, from IMDB.com: http://www.imdb.com/title/tt0022150/ Clark, B. H. (1918). European Theories of the Drama: An Anthology of Dramatic Theory and Criticism from Aristotle to the Present Day. Stewart & Kidd. Clements, R., & Musker, J. (Directors). (1992). Aladdin [Motion Picture]. Conrad, T., & Kroopnick, S. (2001). 
The scariest place on Earth. Retrieved 12 08, 2008, from http://www.imdb.com/title/tt0280312/ Crane, D., & Kaufman, M. (1994). Friends. Retrieved 12 08, 2008, from http://www.imdb.com/title/tt0108778/ Croshaw, B. ". (2008, 11 06). Zero Punctuation. Retrieved 11 06, 2008, from the escapist: http://www.escapistmagazine.com/videos/view/zero-punctuation Dale, A. S. (2000). Comedy is a Man in Trouble: Slapstick in American Movies. U of Minnesota Press. Drawingcoach. (2008). Retrieved 12 10, 2008, from www.drawingcoach.com: http://images.google.dk/imgres?imgurl=http://www.drawingcoach.com/image- files/cartoon_eyes_female.gif&imgrefurl=http://www.drawingcoach.com/cartoon- eyes.html&h=600&w=135&sz=11&hl=da&start=67&sig2=YBdhnc6G5qDk9ofV9aOiHA&um=1&usg=__w65giv6Q dud78QIdvtzZT5-lq Experimentarium. (2008). Statistik. Retrieved October 3, 2008, from Experimentarium: http://www.experimentarium.dk/presse_corporate/tal_fakta/statistik/ Expo 67. (2007). Montreal Universal and international exhibition 1967. Retrieved December 6, 2008, from expo67: http://expo67.morenciel.com/an_expo67/ Fincher, D. (1995). Se7en. Retrieved 12 12, 2008, from IMDB.com: http://www.imdb.com/title/tt0114369/ Fincher, D. (1997). The Game. Retrieved 12 12, 2008, from http://www.imdb.com/title/tt0119174/: http://www.imdb.com/title/tt0119174/ Fogg, A. (1996, 02 18). Monty Pythons's completely useless website. Retrieved December 16, 2008, from Monty Pythons's completely useless website: http://www.intriguing.com/mp/ Freytag, G. (1876). Die Technik des Dramas. Leipzig, S. Hirzel. Furniss, M. (1998). Art in motion: Animation aesthetics. Sydney: John Libbey. Geronimi, C. (Director). (1959). Sleeping Beauty [Motion Picture]. Gilliam, T., & Jones, T. (1975). Monty Python and the Holy Grail. Retrieved 12 12, 2008, from IMDB.com: http://www.imdb.com/title/tt0071853/ 177 Group 08ml582 Reactive movie playback 10. Bibliography Hager, J. C. (2003, 01 17). Emotion and Facial Expression. Retrieved 10 02, 2008, from A Human Face: http://www.face-and-emotion.com/dataface/emotion/expression.jsp Hand, D. (Director). (1942). Bambi [Motion Picture]. Hand, D. (Director). (1937). Snow White and the Seven Dwarves [Motion Picture]. Hitchcock. (1960). Psycho. Retrieved 12 12, 2008, from IMDB.com: http://www.imdb.com/title/tt0054215/ IMDB. (1971). And Now for Something Completely Different. Retrieved December 16, 20008, from IMDB.com: http://www.imdb.com/title/tt0066765/ IMDB.com. (2008). David Sonnenschein. Retrieved 12 11, 2008, from IMDB.com: http://www.imdb.com/name/nm0814408/ IMDB.com. (2008). Ken Harris. Retrieved 12 06, 2008, from IMDB.com: http://www.us.imdb.com/name/nm0364938/ IMDB.com. (2008). Ollie Johnston. Retrieved 12 12, 2008, from IMDB.com: http://www.imdb.com/name/nm0426508/ IMDB.com. (2008). Porky's Duck Hunt. Retrieved 12 12, 2008, from IMDB.com: http://www.imdb.com/title/tt0029426/ IMDB.com. (2008). Richard Williams. Retrieved 12 06, 2008, from IMDB.com: http://www.imdb.com/name/nm0931530/ Johnston, O., & Thomas, F. (1997). The Illusion of Life. Hyperion; 1st Hyperion Ed edition. Jones, C., & Monroe, P. (1980). Soup of Sonic. Retrieved 12 12, 2008, from IMDB.com: http://www.imdb.com/title/tt0081540/ Kerlow, I. V. (2004). Applying the Twelve Principles to 3D Computer Animation. In I. V. Kerlow, The Art of 3D Computer Animation and Effects (pp. 278-283). Hoboken, New Jersey: John Wiley & Sons, Inc. Kinoautomat. (2007). About Kinoautomat. Retrieved December 6, 2008, from Kinoautomat: http://www.kinoautomat.cz/index.php?intro=ok Konami. 
(1998). Metal Gear Series. Retrieved 10 15, 2008, from Kojima Productions: http://www.konami.jp/kojima_pro/english/index.html Kubrick, S. (1980). The Shining. Retrieved 12 12, 2008, from IMDB.com: http://www.imdb.com/title/tt0081505/ Langford, J. (Director). (2008). Tales of a Third Grade Nothing [Motion Picture]. 178 Group 08ml582 Reactive movie playback 10. Bibliography Larry, D., & Seinfeld, J. (1990). Seinfeld. Retrieved 12 08, 2008, from http://www.imdb.com/title/tt0098904/ Luske, H., & Sharpsteen, B. (Directors). (1940). Pinocchio [Motion Picture]. Microsoft. (2008, August). Download details: DirectX End-User Runtime. Retrieved October 2, 2008, from Microsoft Download Center: http://www.microsoft.com/downloads/details.aspx?familyid=2da43d38-db71- 4c1b-bc6a-9b6652cd92a3&displaylang=en#Requirements Milton, E. (Director). (2004). The Incredibles - Behind the Scenes [Motion Picture]. Moore, A. W. (2005, October 11). Cross-validation for detecting and preventing overfitting. Pittsburgh, USA. MyDNS.jp. (2008). Class: OpenCV::CvScalar. Retrieved November 24, 2008, from mydns.jp: http://doc.blueruby.mydns.jp/opencv/classes/OpenCV/CvScalar.html NationMaster. (2005). Encyclopedia > Interactive movie. Retrieved December 6, 2008, from NationMaster: http://www.nationmaster.com/encyclopedia/Interactive-movie Nonstick.com. (2003). www.nonstick.com. Retrieved 11 05, 2008, from Looney Tunes Character List: http://www.nonstick.com/wdocs/charac.html Ohrt, K., & Kjeldsen, K. P. (2008). Årsrapport 2007 for Statens Museum for Kunst. Copenhagen: Statens Museum for Kunst. OpenCV. (2008, November 19). Welcome. Retrieved November 23, 2008, from OpenCV Wiki: http://opencv.willowgarage.com/wiki/ OpenCVWiki. (n.d.). Face Detection using OpenCV. Retrieved November 19, 2008, from willowgarage.com: http://opencv.willowgarage.com/wiki/FaceDetection OpenGL.org. (2008). OpenGL Platform & OS Implementations. Retrieved October 2, 2008, from OpenGL.org: http://www.opengl.org/documentation/implementations/ OpenGL.org. (2008). Using and Licensing OpenGL. Retrieved October 2, 2008, from OpenGL.org: http://www.opengl.org/about/licensing/ Owen, S. (1999, June 2). Perspective Viewing Projection. Retrieved October 11, 2008, from ACM SIGGRAPH: http://www.siggraph.org/education/materials/HyperGraph/viewing/view3d/perspect.htm Peters, S., & Pryce, C. (1991). Are you affraid of the dark? Retrieved 12 08, 2008, from http://www.imdb.com/title/tt0103352/ Pisarevsky, V. (2004, June 10). intel.com. Retrieved November 3, 2008, from OpenCV Object Detection: Theory and Practice: http://fsa.ia.ac.cn/files/OpenCV_FaceDetection_June10.pdf 179 Group 08ml582 Reactive movie playback 10. Bibliography Pixar Animation Studios. (2008). Pixar Animation Studios. Retrieved 12 12, 2008, from Pixar.com: http://www.pixar.com/companyinfo/history/1984.html Price, W. T. (1892). The Technique of the Drama. New York: Brentano's. Procedural Arts. (2006). Façade Vision and Motivation. Retrieved September 14, 2008, from InteractiveStory.net: http://www.interactivestory.net/vision/ Redifer, J. (2008). Interfilm. Retrieved September 14, 2008, from Joe Redifer.com: http://www.joeredifer.com/site/interfilm/interfilm.html Rosenthal, P. (1996). Everybody loves Raymond. Retrieved 12 08, 2008, from http://www.imdb.com/title/tt0115167/ Roth, E. (2005). Hostel. Retrieved 12 12, 2008, from IMDB.com: http://www.imdb.com/title/tt0450278/ Scott, R. (2000). Gladiator. Retrieved 12 12, 2008, from IMDB.com: http://www.imdb.com/title/tt0172495/ Scratchapixel. (2008, July 10). 
Lesson 1: How does it work? Retrieved September 28, 2008, from www.scratchapixel.com: http://www.scratchapixel.com/joomla/lang-en/basic-lessons/lesson1.html Seldess, Z. (n.d.). Retrieved November 3, 2008, from Zachary Seldess: http://www.zacharyseldess.com/sampleVids/Seldess_faceTrackGLnav.mov Seo, N. (2008, October 16). Tutorial: OpenCV haartraining. Retrieved December 15, 2008, from Naotoshi Seo: http://note.sonots.com/SciSoftware/haartraining.html#j1d5e509 Shadyac, T. (1994). Ace Ventura: Pet Detective. Retrieved 12 12, 2008, from IMDB.com: http://www.imdb.com/title/tt0109040/ Sharp, H., Rogers, Y., & Preece, J. (2006). Interaction Design. West Sussex: John Wiley & Sons Ltd. Singer, B. (1994). The Usual Suspects. Retrieved 12 12, 2008, from IMDB.com: http://www.imdb.com/title/tt0114814/ Skelley, J. P. (2005). Experiments in Expression Recogntion. Massachusetts: M.I.T - Department of Electrical Engineering and Computer Science. Sony Corp. (2007). Cyber-Shot handbook - Sony DSC-T300. Sony New Zealand Limited. (2007, August 24). Scoop Independent News - Sci Tech. Retrieved November 12, 2008, from Cyber-shot introduces smile detection: http://www.scoop.co.nz/stories/SC0708/S00064.htm Spencer, S. (1993). Radiosity overview, part 1. Retrieved 12 12, 2008, from SIGGRAPH.org: http://www.siggraph.org/education/materials/HyperGraph/radiosity/overview_1.htm 180 Group 08ml582 Reactive movie playback 10. Bibliography Stern, A., & Mateas, M. (2006, July 14). Behind Façade: An Interview with Andrew Stern and Michael Mateas. (B. B. Harger, Interviewer) The Onion. (2007, May). Hallmark Scientists Identify 3 New Human Emotions. Retrieved 10 02, 2008, from The Onion: http://www.theonion.com/content/news/hallmark_scientists_identify_3_new Thomas, F., & Johnston, O. (2002). Our Work. Retrieved 12 12, 2008, from Frank & Ollie's official site: http://www.frankandollie.com/Film_Features.html Ubisoft. (1995). Rayman Zone. Retrieved 12 12, 2008, from www.ubisoft.com: http://raymanzone.uk.ubi.com/ von Riedemann, D. (2008, June 02). Walt Disney's Nine Old Men. Retrieved December 12, 2008, from suite101.com: http://vintage-animated-films.suite101.com/article.cfm/walt_disneys_nine_old_men Wedge, C. (Director). (2002). Ice Age [Motion Picture]. Wells, P. (1998). Understanding animation. Routledge. Williams, R. (2001). The Animator's Survival Kit. New York: Faber and Faber Inc. 181 Group 08ml582 Reactive movie playback 11. Illustration List 11. Illustration List Illustration 1.1: Retrieved: http://news.filefront.com/wp-content/uploads/2008/04/mass-effect11.jpg Illustration 2.1: Retrieved: http://img2.timeinc.net/ew/dynamic/imgs/071029/horrormovies/psycho_l.jpg Illustration 2.2: Retrieved: http://www.myaffiliatetips.com/Images/amazed-affiliate-woman.jpg Illustration 2.3: Own creation: Heino Jørgensen Illustration 2.4: Retrieved: http://pages.cpsc.ucalgary.ca/~apu/raytrace1.jpg Illustration 3.1 - Illustration 3.3: Own creation: Heino Jørgensen Illustration 3.4: Retrieved: (Bordwell & Thompson, 2008, s. 142) Illustration 3.5: Retrieved: (Bordwell & Thompson, 2008, s. 143) Illustration 3.6: Retrieved: (Bordwell & Thompson, 2008, s. 143) Illustration 3.7: Retrieved: (Bordwell & Thompson, 2008, s. 143) Illustration 3.8 - Illustration 3.10: Retrieved: (Bordwell & Thompson, 2008, s. 190) Illustration 3.11 - Illustration 3.17: Retrieved: (Bordwell & Thompson, 2008, s. 
191) Illustration 3.18: Retrieved: http://i3.photobucket.com/albums/y90/pinkfloyd1973/myspace/jiminycricket.png Illustration 3.19: Retrieved: http://www.coverbrowser.com/image/donald-duck/31-1.jpg and http://www.coverbrowser.com/image/donald-duck/47-1.jpg Illustration 3.20: Retrieved: http://classiccartoons.blogspot.com/2007/05/screwball-squirrel.html Illustration 3.21: Retrieved: http://justyouraveragejoggler.files.wordpress.com/2006/11/111806-roadrunner.jpg Illustration 3.22: Own Creation: Mikkel Berentsen Jensen Illustration 3.23: Retrieved: http://www.zacharyseldess.com/sampleVids/Seldess_faceTrackGLnav.mov Illustration 3.24: Retrieved: http://fsa.ia.ac.cn/files/OpenCV_FaceDetection_June10.pdf Illustration 3.25: Own creation: Heino Jørgensen Illustration 3.26: Own Creation: Mikkel Berentsen Jensen Illustration 4.1: Retrieved: http://realitymarbles.files.wordpress.com/2007/08/bugsbunny.png and http://www.funbumperstickers.com/images/Daffy_Duck_1.gif Illustration 4.2: Retrieved: http://animationarchive.net/Feature%20Films/Aladdin/Model%20Sheets/AladdinModelSheet1.jpg Illustration 4.3: Retrieved: http://www.acmeanimation.com/FAIRIES.JPG and http://www.quizilla.com/user_images/N/Nightshadow/1034485810_EWorkPicsMalifecant.JPG 182 Group 08ml582 Reactive movie playback 11. Illustration List Illustration 4.4: Retrieved: http://disneyheaven.com/images/DisneyStorybook/Aladdin/Sultan.gif and http://arkansastonight.com/uploaded_images/jafar-719207.jpg Illustration 4.5: Own Creation: Kim Etzerodt Illustration 4.6: Retrieved: http://pressthebuttons.typepad.com/photos/uncategorized/rayman.jpg Illustration 4.7: http://img512.imageshack.us/img512/373/yathzeewallbx1.jpg Illustration 4.8: Own Creation: Kim Etzerodt Illustration 4.9 - Illustration 4.12: Own creation: Kim Etzerodt Illustration 4.13: Own creation: Kim Etzerodt, Mikkel Lykkegaard Jensen and Sune Bagge Illustration 4.14: Own creation: Mikkel Lykkegaard Jensen. Illustration 4.15: Own creation: Kim Etzerodt. Illustration 4.16: Own creation: Mikkel Lykkegaard Jensen and Sune Bagge. Illustration 4.17: Own creation:: Kim Etzerodt and Sune Bagge. Illustration 4.18 - Illustration 4.20: Own creation: Kim Etzerodt. Illustration 4.21: Retrieved: http://www.coe.tamu.edu/~lcifuent/edtc656/Unit_08/reading_files/image001.gif Illustration 4.22 - Illustration 4.29: Own creation: Kim Etzerodt. Illustration 4.30: (Williams, 2001, s. 107) Illustration 4.31: (Williams, 2001, s. 107) Illustration 4.32: (Williams, 2001, s. 107) Illustration 4.33: (Williams, 2001, s. 112-113) Illustration 4.34: (Williams, 2001, s. 107) Illustration 4.35: (Williams, 2001, s. 108) Illustration 4.36: (Williams, 2001, s. 136) Illustration 4.37: (Williams, 2001, s. 148) Illustration 4.38: Own creation: Kim Etzerodt. Illustration 4.39: (Williams, 2001, s. 163) Illustration 4.40: Own creation: Kim Etzerodt. Illustration 4.41: Own creation: Kim Etzerodt. Illustration 4.42: Own creation: Heino Jørgensen. Illustration 4.43: Own creation: Heino Jørgensen. 183 Group 08ml582 Reactive movie playback Illustration 4.44: Own creation: Heino Jørgensen. Illustration 5.1 - Illustration 5.17: Own creation: Kim Etzerodt. Illustration 5.18 - Illustration 5.21: Own creation: Mikkel Lykkegaard Jensen. 
Illustration 5.22: Own creation: Kim Etzerodt Illustration 5.23: Own creation: Kim Etzerodt Illustration 5.24: Own creation: Mikkel Lykkegaard Jensen Illustration 5.25: Own creation: Mikkel Lykkegaard Jensen Illustration 5.26: Own creation: Mikkel Berentsen Jensen Illustration 5.27: Retrieved: http://upload.wikimedia.org/wikipedia/commons/7/71/Histogrammspreizung.png Illustration 5.28 - Illustration 6.8: Own creation: Heino Jørgensen 184 11. Illustration List Group 08ml582 Reactive movie playback 12. Appendix 12. Appendix 12.1 Storyboards Music Box (Humor Type ”Falling apart”– Story board for 3D Animation) 185 Group 08ml582 Reactive movie playback 12. Appendix Cheated 1 of 2 (Humor Type "Falling apart") 186 Group 08ml582 Reactive movie playback 12. Appendix Cheated 2 of 2 (Humor Type "Falling apart") 187 Group 08ml582 Reactive movie playback 12. Appendix Open Door 1 of 3 (Humor Type "Falling apart") 188 Group 08ml582 Reactive movie playback 12. Appendix Open Door 2 of 3 (Humor Type "Falling apart") 189 Group 08ml582 Reactive movie playback 12. Appendix Open Door 3 of 3 (Humor Type "Falling apart") 190 Group 08ml582 Reactive movie playback 12. Appendix Trampoline (Humor Type "Black humor"– Story board for 3D Animation) 191 Group 08ml582 Reactive movie playback 12. Appendix Stickman (Humor Type "Black humor") 192 Group 08ml582 Reactive movie playback 12. Appendix Kick Box (Humor Type "Black humor") 193 Group 08ml582 Reactive movie playback 12. Appendix Bullet Time (Humor Type "Ball play" – Story board for 3D Animation) 194 Group 08ml582 Reactive movie playback 12. Appendix Lethal Ball (Humor Type "Ball play") 195 Group 08ml582 Reactive movie playback 12. Appendix Hard Ball (Humor Type "Ball play") 196 Group 08ml582 Reactive movie playback 12. Appendix 12.2 Implementation Code Smile Picture Capture Code #define VERSION "Smile Picture Capture v. 0.004 Alpha" #include "cv.h" #include "highgui.h" #include <stdio.h> #include <stdlib.h> #include <string.h> #include <assert.h> #include <math.h> #include <float.h> #include <limits.h> #include <time.h> #include <ctype.h> #include <iostream> #include <fstream> using namespace std; static CvMemStorage* storage = 0; static CvHaarClassifierCascade* cascade = 0; IplImage* mouthDetection( IplImage* image ); IplImage* resizeMouth(IplImage* src); const char* cascade_name = "haarcascade_frontalface_alt2.xml"; // Choice of haarcascade (here frontal face) int main( int argc, char** argv ) { char smileFile[] = "SmilePictureX.jpg"; // Filename template for smile pictures char neutralFile[] = "NeutrPictureX.jpg";// Filename template for neutral pictures int smilePicCount = 65; // Changing Ascii value which are replacing the "X" in the filenames above int neutralPicCount = 65; // Changing Ascii value which are replacing the "X" in the filenames above // Ascii 65 = A CvCapture* capture = 0; IplImage *frame, *frame_copy = 0; const char* input_name; input_name = argc > 1 ? argv[1] : 0; cascade = (CvHaarClassifierCascade*)cvLoad( cascade_name, 0, 0, 0 ); if( !cascade ) // If haar cascade is not found { fprintf( stderr, "ERROR: Could not load classifier cascade\n" ); system("pause"); return -1; } storage = cvCreateMemStorage(0); if( !input_name || (isdigit(input_name[0]) && input_name[1] == '\0') ) // If camera is found capture feed capture = cvCaptureFromCAM( !input_name ? 0 : input_name[0] - '0' ); else // Else capture from movie (film.avi) capture = cvCaptureFromAVI( "film.avi" ); 197 Group 08ml582 Reactive movie playback 12. 
Appendix
cvNamedWindow( "Mouth", 1 ); // Open window
cvNamedWindow( "Original Picture", 1 ); // Open window
#ifdef WIN32 // This is only for Windows 32 bit
system("mkdir C:\\Pictures\\RunXXX\\"); // Tells the system to create the C:\Pictures\RunXXX directory
#endif
fstream file_smile("C:\\Pictures\\RunXXX\\SmilePics.txt",ios::out); // Opens a txt file for write access
fstream file_neutr("C:\\Pictures\\RunXXX\\NeutrPics.txt",ios::out); // Opens a txt file for write access
if( capture )
{
// While smiles or neutrals are still needed, keep running
for(;smilePicCount < 90 || neutralPicCount < 90;)
{
if( !cvGrabFrame( capture ))
break;
frame = cvRetrieveFrame( capture );
if( !frame )
break;
if( !frame_copy )
frame_copy = cvCreateImage( cvSize(frame->width,frame->height), IPL_DEPTH_8U, frame->nChannels );
if( frame->origin == IPL_ORIGIN_TL )
cvCopy( frame, frame_copy, 0 );
else
cvFlip( frame, frame_copy, 0 );
IplImage* mouthImage; // Creates a new picture for the image of the mouth
mouthImage = mouthDetection( frame_copy ); // Finds the mouth in the picture frame_copy
if (mouthImage->width != 0) // If the mouth image exists
{
mouthImage = resizeMouth(mouthImage); // Resize
cvShowImage( "Mouth", mouthImage ); // Show mouth
cvShowImage( "Original Picture", frame_copy ); // Show webcam feed
// Added this to save image:
int key=cvWaitKey(10);
if(key=='1') // If "1" is pressed
{
if(smilePicCount != 90) // If fewer than 25 smile pictures have been saved
{
smileFile[12] = smilePicCount; // Change the 12th character in the filename to the ASCII code from smilePicCount
std::cout << "Smile saved as: " << smileFile << std::endl;
cvSaveImage(smileFile,mouthImage); // Save mouth image
file_smile << "C:\\\\Pictures\\\\RunXXX\\\\Smiles\\\\" << smileFile << endl; // Add path to the SmilePics text file
smilePicCount++; // Increment of character
}
else
{ // Enough smiles
std::cout << "No more smiles needed! " << std::endl;
}
};
if(key=='2') // If "2" is pressed
{
if(neutralPicCount != 90) // If fewer than 25 neutral pictures have been saved
{
neutralFile[12] = neutralPicCount; // Change the 12th character in the filename to the ASCII code from neutralPicCount
std::cout << "Neutral saved as: " << neutralFile << std::endl;
cvSaveImage(neutralFile,mouthImage); // Save mouth image
file_neutr << "C:\\\\Pictures\\\\RunXXX\\\\Neutrals\\\\" << neutralFile << endl;
neutralPicCount++;
}
else
{
std::cout << "No more neutral expressions needed! " << std::endl;
}
};
}
else // If no mouth has been found
{
cvShowImage( "Mouth", frame_copy ); // Show webcam feed instead
cvShowImage( "Original Picture", frame_copy ); // Show webcam feed instead
}
cvReleaseImage( &mouthImage );
} // Finished gathering smiles and neutrals
#ifdef WIN32 // This is only for Windows 32 bit
system("xcopy *Smile*.jpg C:\\Pictures\\RunXXX\\Smiles\\"); // Copy all files with filename *Smile* to the specific path
system("xcopy *Neut*.jpg C:\\Pictures\\RunXXX\\Neutrals\\"); // Copy all files with filename *Neut* to the specific path
#endif
cvReleaseImage( &frame_copy );
cvReleaseCapture( &capture );
}
cvDestroyWindow(VERSION);
return 0;
}
IplImage* mouthDetection( IplImage* img ) // Mouth detection is identical with the mouthDetection function in the Smile Detection Program
{
IplImage* mouthPixels;
bool detectedFace = 0;
static CvScalar colors[] = { {{0,0,255}}, {{0,128,255}}, {{0,255,255}}, {{0,255,0}}, {{255,128,0}}, {{255,255,0}}, {{255,0,0}}, {{255,0,255}} };
double scale = 1.3;
IplImage* gray = cvCreateImage( cvSize(img->width,img->height), 8, 1 );
IplImage* small_img = cvCreateImage( cvSize( cvRound(img->width/scale), cvRound(img->height/scale)), 8, 1 );
int i;
cvCvtColor( img, gray, CV_BGR2GRAY );
cvResize( gray, small_img, CV_INTER_LINEAR );
cvEqualizeHist( small_img, small_img );
cvClearMemStorage( storage );
if( cascade )
{
double t = (double)cvGetTickCount();
CvSeq* faces = cvHaarDetectObjects( small_img, cascade, storage, 1.1, 2, 0/*CV_HAAR_DO_CANNY_PRUNING*/, cvSize(30, 30) );
t = (double)cvGetTickCount() - t;
for( i = 0; i < (faces ? faces->total : 0); i++ )
{
CvRect* r = (CvRect*)cvGetSeqElem( faces, i );
CvPoint facecenter;
int radius;
facecenter.x = cvRound((r->x + r->width*0.5)*scale);
facecenter.y = cvRound((r->y + r->height*0.5)*scale);
radius = cvRound((r->width + r->height)*0.25*scale);
// Mouth detection
CvPoint mouthUpLeft;
CvPoint mouthDownRight;
mouthUpLeft.x = facecenter.x - 0.5*radius;
mouthUpLeft.y = facecenter.y + 0.3*radius;
mouthDownRight.x = cvRound(mouthUpLeft.x + radius);
mouthDownRight.y = cvRound(mouthUpLeft.y + radius * 0.5);
detectedFace = true;
cvRectangle( img, mouthUpLeft, mouthDownRight, colors[3%8], 3, 8, 0); // Pixels we need for smile :D
int step = gray->widthStep/sizeof(uchar);
//std::cout << "Step: " << step << " Width: " << gray->width << std::endl;
uchar* data = (uchar *)gray->imageData;
mouthPixels = cvCreateImage( cvSize(mouthDownRight.y - mouthUpLeft.y, mouthDownRight.x - mouthUpLeft.x), IPL_DEPTH_8U, 1 );
mouthPixels->height = mouthDownRight.y - mouthUpLeft.y;
mouthPixels->width = mouthDownRight.x - mouthUpLeft.x;
mouthPixels->widthStep = mouthPixels->width/sizeof(uchar);
mouthPixels->nChannels = 1;
uchar* data2 = (uchar *)mouthPixels->imageData;
int data2Location = 0;
for(int a = mouthUpLeft.y; a < mouthDownRight.y; a++)
{
for(int b = mouthUpLeft.x; b < mouthDownRight.x; b++)
{
data2[data2Location] = data[b+a*step]; // Copy the mouth rectangle from the grayscale frame pixel by pixel
data2Location++;
}
}
}
}
if(detectedFace == true)
{
// cvShowImage( VERSION, mouthPixels );
cvReleaseImage( &gray );
cvReleaseImage( &small_img );
return mouthPixels; // If a mouth was found, return the image of the mouth region
}
else
{
cvShowImage( VERSION, gray);
gray->width = 0;
cvReleaseImage( &small_img );
return gray; // If no mouth was found, return the gray image with width 0 as a marker
}
};
IplImage* resizeMouth(IplImage* src)
{
IplImage* resizedMouth = cvCreateImage(cvSize(30,15), IPL_DEPTH_8U, 1);
cvResize(src, resizedMouth, 1);
return resizedMouth;
};
201 Group 08ml582 Reactive movie playback 12.
Appendix Smile Training Program Code #define VERSION "Smile Training v. 0.001 Alpha" #include <cv.h> #include <highgui.h> #include #include #include #include #include #include #include #include #include #include #include #include #include #include <stdio.h> <stdlib.h> <string.h> <assert.h> <math.h> <float.h> <limits.h> <time.h> <ctype.h> <iostream> <string> <cstring> <fstream> <vector> using namespace std; #ifdef _EiC #define WIN32 #endif vector <IplImage*> loadCollection(string file, int maxCollectionSize) /* Function takes in two variables: The path to the index file of pictures, and number of how big the collection */ { vector <IplImage*> collection; // Creates a vector of image pointers const char *filename = file.c_str();// Converts the string file to a const char ifstream myfile (filename); // Opens the file for reading string line; // Creates a string named line if (myfile.is_open()) // Run through the file { while (! myfile.eof() && collection.size() < maxCollectionSize ) { getline(myfile,line); const char *charline = line.c_str(); IplImage *image = cvCreateImage(cvSize(30,15),IPL_DEPTH_8U,1); // This is image pointer for the image IplImage *imageFlip = cvCreateImage(cvSize(30,15),IPL_DEPTH_8U,1); // This is image pointer for the flipped image image= cvLoadImage(charline,0); cvEqualizeHist(image,image); collection.push_back(image); cvFlip(image,imageFlip,1); collection.push_back(imageFlip); // // // // // Load the image Equalize the histogram Save image into vector Flip image and save into imageFlip Save imageFlip into vector } } return collection; } /* Loads a collection of images into the function */ IplImage* getMean(vector <IplImage*> collection) { /* Creates two scalars, which contains an 1D array with RGB and Alpha values (a 8 bit picture) */ CvScalar s, t; /* Creates an image with the same width and height as the training images */ IplImage* meanImg = cvCreateImage(cvSize(collection[0]->width,collection[0]>height),IPL_DEPTH_8U,1); int temp = 0; 202 Group 08ml582 Reactive movie playback 12. 
Appendix /* Creates a vector to temporarily save pixel values vector <int> coordinate((collection[0]->width)*(collection[0]->height)); /* Goes through every picture in collection */ for( int i = 0; i < collection.size() ; i++ ) { int coordinateCounter = 0; for (int y=0; y<collection[i]->height; y++) // For Y values { for (int x=0; x<collection[i]->width; x++) // For X values { s = cvGet2D(collection[i],y,x); // Get pixel value for image in X,Y /* Add the pixel value for the current image into the coordinate vector */ coordinate[coordinateCounter] += s.val[0]; coordinateCounter++; } } } /* Go through the added pixel values and divide with the amount of pictures */ for (int j = 0; j<coordinate.size(); j++) { coordinate[j] = coordinate[j]/collection.size(); } int pixelCounter = 0; /* For loop that converts the coordinate vector into an image (meanImg) */ for (int h = 0; h < meanImg->height; h++) { for (int w = 0; w < meanImg->width; w++) { for (int scalar = 0; scalar < 4; scalar++) { t.val[scalar] = (double)coordinate[pixelCounter]; } cvSet2D(meanImg, h, w, t); pixelCounter++; } } return meanImg; } int main( int argc, char** argv ) { vector <IplImage*> smileCollection; vector <IplImage*> neutralCollection; IplImage* imageS; IplImage* imageN; smileCollection = loadCollection("SmilePics.txt", 330); neutralCollection = loadCollection("NeutrPics.txt", 346); imageS = getMean(smileCollection); imageN = getMean(neutralCollection); CvScalar s; s = cvGet2D(imageS,0,0); cvNamedWindow("Picture", 1); // Create a window and name it: Picture cvShowImage("Picture", imageS); // display it cout << "Current pixel value at pixel (0,0): " << s.val[0] << endl; cvWaitKey(); // Wait for a KeyPress cvSaveImage("smile.jpg",imageS); cvSaveImage("neutral.jpg",imageN); cvDestroyWindow("Picture"); cvReleaseImage(&imageS); cvReleaseImage(&imageN); return 0; } 203 Group 08ml582 Reactive movie playback 12. Appendix Cross Validation Program Code #define VERSION "Smile Detector v. 
0.003 Alpha" #include <cv.h> #include <highgui.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <assert.h> #include <math.h> #include <float.h> #include <limits.h> #include <time.h> #include <ctype.h> #include <iostream> #include <vector> #include <fstream> #include <string> #include <cstring> using namespace std; #ifdef _EiC #define WIN32 #endif bool isSmiling(IplImage* mouthPicture, IplImage* smileTemplate, IplImage* neutralTemplate, int st); double pixelDifference(IplImage* pic1, IplImage* pic2); double pixelDifferenceSqr(IplImage* pic1, IplImage* pic2); vector <IplImage*> loadLearningCollection(string file, int maxCollectionSize, int skipStart, int trainingDataSize) vector <IplImage*> loadTestingCollection(string file, int maxCollectionSize, int offset); IplImage* getMean(vector <IplImage*> collection); int main( int argc, char** argv ) { vector <IplImage*> smileCollection;// Creates vector of images to hold all smile pictures vector <IplImage*> neutralCollection;// Creates vector of images to hold all neutral pictures vector <IplImage*> smileTestData;// Creates vector of images to hold the smile pictures to test on vector <IplImage*> neutralTestData; // Creates vector of images to hold all neutral pictures to test on IplImage* smileTemplate = cvLoadImage("smile.jpg"); //cvCreateImage(cvSize(30,15),IPL_DEPTH_8U,1); IplImage* neutralTemplate = cvLoadImage("neutral.jpg"); //cvCreateImage(cvSize(30,15),IPL_DEPTH_8U,1); int smile = 0, neutral = 0; int n = 1, trainingDataSize = 16; ofstream sResults ("cvSResult.csv");// Outputs the results in a comma seperated value file ofstream nResults ("cvNResult.csv");Outputs the results in a comma seperated value file //cvNamedWindow( "Current picture", 1); for (int st = 0; st<10000; st+=100) // Checks different thresholds using steps of 1000 { smile = 0; // Counts the smiles neutral = 0;// Counts the neutrals n = 0; for (int i=0; i<10; i++) { 204 Group 08ml582 Reactive movie playback 12. Appendix smileCollection = loadLearningCollection("SmilePics.txt", 297, i*trainingDataSize, trainingDataSize); // Load the collection of smiles, excluding the pictures that is to be used as test data neutralCollection = loadLearningCollection("NeutrPics.txt", 313, i*trainingDataSize, trainingDataSize); // Load the collection of non-smiles, excluding the pictures that is to be used as test data smileTemplate = getMean(smileCollection); // Get the mean image of the smile collection neutralTemplate = getMean(neutralCollection); // Get the mean image of the smile collection smileTestData = loadTestingCollection("SmilePics.txt", trainingDataSize, i*trainingDataSize); // Load the smile training collection neutralTestData = loadTestingCollection("NeutrPics.txt", trainingDataSize, i*trainingDataSize); // Load the non-smile training collection for (int j=0; j<12; j++) { /*cvShowImage("Current picture", neutralTestData[j]); cvWaitKey();*/ if (isSmiling(smileTestData[j],smileTemplate,neutralTemplate,st)) // Simply tests if the image contains smiles (See SmileDetection for more information) { //sResults << "Picture " << n << " is smiling" << endl; smile++; } else { //sResults << "Picture " << n << " is not smiling" << endl; } if (isSmiling(neutralTestData[j],smileTemplate,neutralTemplate,st)) Simply tests if the image contains smiles (See SmileDetection for more information) { //nResults << "Picture " << n << " is smiling" << endl; } else { //nResults << "Picture " << n << " is not smiling" << endl; neutral++; } n++; } } cout << "Picture " << st << " done." 
<< endl; sResults << ((float)smile/(n-1))*100 << ";"; // Calculates the percent of smiles detected as smiles nResults << ((float)neutral/(n-1))*100 << ";"; // Calculates the percent of neutrals detected as neutrals } cout << "Cross validation complete." << endl; cvReleaseImage(&smileTemplate); cvReleaseImage(&neutralTemplate); //cvDestroyWindow("Current picture"); sResults.close(); // Close file nResults.close(); // Close file // return 0; } bool isSmiling(IplImage* mouthPicture, IplImage* smileTemplate, IplImage* neutralTemplate, int st) // Function that checks if user is smiling { int smileDist = 0; int neutralDist = 0; 205 Group 08ml582 Reactive movie playback 12. Appendix int width = mouthPicture->width; if (pixelDifferenceSqr(mouthPicture,smileTemplate) < pixelDifference(mouthPicture, neutralTemplate)+st) // Comparision between difference between (live image and smile image) and (live image and neutral image) { return true; } else { return false; } }; double pixelDifference(IplImage* pic1, IplImage* pic2) // Smile comparision using method 1 { // Please see "SmileDetection" program for more information CvScalar s, t; // "SmileDetection" contains a more streamlined version of this function int width = pic1->width; double diff = 0; for (int y = 0; y < pic1->height; y++) { for (int x = 0; x < pic1->width; x++) { if (s.val[0]-t.val[0] < 0) diff += t.val[0]-s.val[0]; else diff += s.val[0]-t.val[0]; } } return diff; }; double pixelDifferenceSqr(IplImage* pic1, IplImage* pic2) // Smile comparision using method 2 { // Please see "SmileDetection" program for more information CvScalar s, t; // "SmileDetection" contains a more streamlined version of this function int width = pic1->width; double diff = 0; for (int y = 0; y < pic1->height; y++) { for (int x = 0; x < pic1->width; x++) { s = cvGet2D(pic1,y,x); t = cvGet2D(pic2,y,x); diff += (s.val[0]-t.val[0])*(s.val[0]-t.val[0]); } } return sqrt(diff); }; vector <IplImage*> loadLearningCollection(string file, int skipStart, int trainingDataSize) // Loads collection of "SmileLearner" { vector <IplImage*> collection; const char *filename = file.c_str(); ifstream myfile (filename); string line; int lineNum = 0; if (myfile.is_open()) 206 maxCollectionSize, int images, descriped in Group 08ml582 Reactive movie playback 12. Appendix { while (! myfile.eof() && collection.size() < maxCollectionSize ) { if (lineNum >= skipStart && lineNum < skipStart+trainingDataSize) // Slightly changed to skip every 12'th picture { getline(myfile,line); lineNum++; } else { getline(myfile,line); const char *charline = line.c_str(); IplImage *image = cvCreateImage(cvSize(30,15),IPL_DEPTH_8U,1); // This is image pointer IplImage *imageFlip = cvCreateImage(cvSize(30,15),IPL_DEPTH_8U,1); // This is image pointer image= cvLoadImage(charline,0); // load the image cvEqualizeHist(image,image); collection.push_back(image); cvFlip(image,imageFlip,1); collection.push_back(imageFlip); lineNum++; } } } return collection; }; vector <IplImage*> loadTestingCollection(string file, int maxCollectionSize, int offset) { vector <IplImage*> collection; const char *filename = file.c_str(); ifstream myfile (filename); string line; int lineNum = 0; if (myfile.is_open()) { while (! 
myfile.eof() && collection.size() < maxCollectionSize ) { if (lineNum < offset) // Slightly changed to use only every 12'th picture { getline(myfile,line); lineNum++; } else { getline(myfile,line); const char *charline = line.c_str(); IplImage *image = cvCreateImage(cvSize(30,15),IPL_DEPTH_8U,1); // This is image pointer image= cvLoadImage(charline,0); // load the image cvEqualizeHist(image,image); collection.push_back(image); lineNum++; } } } return collection; }; IplImage* getMean(vector <IplImage*> collection) // Described in "SmileLearner" { CvScalar s, t; 207 Group 08ml582 Reactive movie playback 12. Appendix IplImage* meanImg = cvCreateImage(cvSize(collection[0]->width,collection[0]>height),IPL_DEPTH_8U,1); int temp = 0; vector <int> coordinate((collection[0]->width)*(collection[0]->height)); for( int i = 0; i < collection.size() ; i++ ) { int coordinateCounter = 0; for (int y=0; y<collection[i]->height; y++) { for (int x=0; x<collection[i]->width; x++) { s = cvGet2D(collection[i],y,x); coordinate[coordinateCounter] += s.val[0]; coordinateCounter++; } } } //cout << "coordinate[150] before averaging= " << coordinate[150] << endl; for (int j = 0; j<coordinate.size(); j++) { coordinate[j] = coordinate[j]/collection.size(); } //cout << "coordinate[150] after averaging= " << coordinate[150] << endl; int pixelCounter = 0; for (int h = 0; h < meanImg->height; h++) { for (int w = 0; w < meanImg->width; w++) { for (int scalar = 0; scalar < 4; scalar++) { t.val[scalar] = (double)coordinate[pixelCounter]; } cvSet2D(meanImg, h, w, t); pixelCounter++; } } return meanImg; }; 208 Group 08ml582 Reactive movie playback 12. Appendix Smile Detection Program Code #define VERSION "Smile Detector v. 0.7 Release Candidate" #include <cv.h> #include <highgui.h> #include <stdio.h> #include <stdlib.h> #include <string> #include <assert.h> #include <math.h> #include <float.h> #include <limits.h> #include <time.h> #include "timer.h" #include <ctype.h> #include <iostream> #include <vector> #include <fstream> #include <cstring> using namespace std; #ifdef _EiC #define WIN32 #endif // OpenCV variable decleration static CvMemStorage* storage = 0; static CvHaarClassifierCascade* cascade = 0; // Function prototypes for the functions created by Group 08ml582 IplImage* mouthDetection( IplImage* image, unsigned char colorChange ); IplImage* resizeMouth(IplImage* src); bool isSmiling(IplImage* mouthPicture, IplImage* smileTemplate, IplImage* neutralTemplate, int st); double pixelDifference(IplImage* pic1, IplImage* pic2); void sendToFlash(int smileCount, int neutralCount); void playMovie(int newMovie, const char *file[]); // Load the haar cascade used for face detection const char* cascade_name = "haarcascade_frontalface_alt2.xml"; int main( int argc, char** argv ) { // Initialization of variables IplImage* smileTemplate = cvLoadImage("smile.jpg"); IplImage* neutralTemplate = cvLoadImage("neutral.jpg"); IplImage* smiley = cvLoadImage("smiley.jpg"); IplImage* saddy = cvLoadImage("saddy.jpg"); CvCapture* capture = 0; IplImage *frame, *frame_copy = 0; const char* input_name; vector <bool> frames,movieMood; vector <int> smileTimer,movieOrder; Timer timer; bool smiling = false; int smileCount = 0, neutralCount = 0, st = 0, lineNum = 0, movieNr = 0, init = 0, fa = 0, cd = 0, clipEnd = 0; float duration = 0.f, clipDuration[9], hStyle[3]; unsigned char colorChange = 0; // Load playlist string fileArray[9] = {"0","0","0","0","0","0","0","0","0"}; ifstream playList ("playlist.txt"); string line; if (playList.is_open()) { 209 Group 
08ml582 Reactive movie playback 12. Appendix while (! playList.eof() && lineNum < 18) { if(lineNum%2==0) //For every linenumber dividable by two (0,2,4,8...) { getline(playList,line); //Get the current line as a string (command for starting a movie clip) fileArray[fa] = line; //Save current line in fileArray at position fa fa++; } else //For every odd linenumber { playList >> duration; //Find and store a float (duration of movie found in previous line) getline(playList,line); //This line is needed in order to make the getline work properly in above if statement! clipDuration[cd] = duration; //Save current duration in clipDuration at position cd cd++; } lineNum++; } } // Create a const char* array and save all values of fileArray as c strings in this array (needed for reading later) const char *file[9] = {fileArray[0].c_str(),fileArray[1].c_str(),fileArray[2].c_str(),fileArray[3].c_str(), fileArray[4].c_str(),fileArray[5].c_str(),fileArray[6].c_str(),fileArray[7].c_str(), fileArray[8].c_str()}; // Initialization of haar cascade input_name = argc > 1 ? argv[1] : 0; cascade = (CvHaarClassifierCascade*)cvLoad( cascade_name, 0, 0, 0 ); if( !cascade ) { fprintf( stderr, "ERROR: Could not load classifier cascade\n" ); system("pause"); return -1; } storage = cvCreateMemStorage(0); // Test if a webcam is connected. If no webcam is connected, use a movie as input if( !input_name || (isdigit(input_name[0]) && input_name[1] == '\0') ) capture = cvCaptureFromCAM( !input_name ? 0 : input_name[0] - '0' ); else capture = cvCaptureFromAVI( "film.avi" ); // Set up windows for viewing //cvNamedWindow( VERSION, 1 ); //Window for viewing the webcam feed (only for debugging) cvNamedWindow( "Happy?", 1); //Window for viewing the image showing if the user is smiling cvCreateTrackbar("Threshold","Happy?",&st,500,NULL); //Create a trackbar in the "Happy?" window, for adjusting the threshold playMovie(0,file); //Start playing movie 0 from array file clipEnd = clipDuration[0]; //Set clipEnd to the duration of movie 0 int done = 0; //done is set to 0, to make the smile detection start if( capture ) { while (!done) //As long as done is 0... { // Create image from webcam feed to be detected on if( !cvGrabFrame( capture )) break; frame = cvRetrieveFrame( capture ); if( !frame ) break; 210 Group 08ml582 Reactive movie playback 12. Appendix if( !frame_copy ) frame_copy = cvCreateImage( cvSize(frame->width,frame->height), IPL_DEPTH_8U, frame->nChannels ); if( frame->origin == IPL_ORIGIN_TL ) cvCopy( frame, frame_copy, 0 ); else cvFlip( frame, frame_copy, 0 ); // Create IplImage* mouthImage and save the result of mouthDetection in this variable IplImage* mouthImage; mouthImage = mouthDetection( frame_copy, colorChange ); if (mouthImage->width != 0) //If mouthImage has a width bigger than 0 meaning that a mouth is found { mouthImage = resizeMouth(mouthImage); //Resize the mouthImage to the desired size //cvShowImage( VERSION, mouthImage ); //Show the mouthImage in the VERSION window //cvShowImage( VERSION, frame_copy ); //Show the current frame in the VERSION window /*************************************************************** ** Method 2: ** ** The code below is used for the method checking for smiles ** ** only on the three intial movie clips. Depending on what ** ** movie clip the user smiled most at, an according new set ** ** of clips is chosen. 
** ***************************************************************/ if (isSmiling(mouthImage, smileTemplate, neutralTemplate, st)) // If user is smiling { frames.push_back(1);// Save to vector that the user is smiling if (frames.size() > 5) // Keeps the vector size to max 5 (frames) { frames.erase(frames.begin(),frames.begin()+1); } } else { frames.push_back(0); if (frames.size() > 5) { frames.erase(frames.begin(),frames.begin()+1); } } smileCount = 0; neutralCount = 0; for (int i = 0; i<frames.size(); i++) { if (frames[i] == 1) {smileCount++;} else {neutralCount++;} } if (smileCount > neutralCount) { cvShowImage( "Happy?", smiley); colorChange = 3; movieMood.push_back(1); if (smiling == false) { smileTimer.push_back(clock()); } smiling = true; } 211 Group 08ml582 Reactive movie playback 12. Appendix if (smileCount < neutralCount) { cvShowImage( "Happy?", saddy); colorChange = 0; movieMood.push_back(0); if (smiling == true) { smileTimer.push_back(clock()); } smiling = false; } if (timer.elapsed(clipEnd+2000)) {//If current movie clip has ended int mood = 0; switch(init) { //switch statement makes sure that the program plays movie 0, followed by movie 3, followed by movie 6, followed //by two movies of the same style as the movie which scored the highest case 0: //If the ended movie was movie 0 movieOrder.push_back(init); //Store 0 in movieOrder's last place. MovieOrder keeps track //of the what movies the user saw and in which order. Used for testing. mood = 0; for (int i = 0; i<movieMood.size(); i++) //Calculate the sum of elements in movieMood. //As 1 is smiling and 0 is not, this will give //the total amount of smiling frames during the last video { mood += movieMood[i]; } movieMood.clear(); //Store mood/duration of ended clip as the score for this clip hStyle[0] = (float)mood/clipDuration[0]; init = 3; playMovie(init, file); //play init from file (init = 3) clipEnd = clipDuration[init]; //Set clipEnd to duration of clip init break; case 1: movieOrder.push_back(init); //Remember that this video is seen init = 2; playMovie(init, file); clipEnd = clipDuration[init]; break; case 2: movieOrder.push_back(init); done = 1; //Terminate program break; case 3: //Same procedure as in case 0 movieOrder.push_back(init); mood = 0; for (int i = 0; i<movieMood.size(); i++) { mood += movieMood[i]; } movieMood.clear(); hStyle[1] = (float)mood/clipDuration[3]; init = 6; playMovie(init, file); clipEnd = clipDuration[6]; break; case 4: movieOrder.push_back(init); init = 5; 212 Group 08ml582 Reactive movie playback 12. Appendix playMovie(5, file); clipEnd = clipDuration[5]; break; case 5: movieOrder.push_back(init); done = 1; break; case 6: movieOrder.push_back(init); mood = 0; for (int i = 0; i<movieMood.size(); i++) { mood += movieMood[i]; } movieMood.clear(); hStyle[2] = (float)mood/clipDuration[6]; cout << "Style 1 scores: " << hStyle[0] << endl << "Style 2 scores: " << hStyle[1] << endl << "Style 3 scores: " << hStyle[2] << endl; //Outputs the calculated scores for each style if (hStyle[0] >= hStyle[1] && hStyle[0] >= hStyle[2]) //If the score for style 1 (hStyle[0]) beats the score for the two other styles... { init = 1; playMovie(init, file); //Play the next movie in style 1 clipEnd = clipDuration[1]; } if (hStyle[1] > hStyle[0] && hStyle[1] >= hStyle[2]) //If the score for style 2 (hStyle[1]) beats the score for the two other styles... 
{ init = 4; playMovie(init, file); //Play the next movie in style 2 clipEnd = clipDuration[4]; } if (hStyle[2] > hStyle[1] && hStyle[2] > hStyle[0]) //If the score for style 3 (hStyle[2]) beats the score for the two other styles... { init = 7; playMovie(init, file); //Play the next movie in style 3 clipEnd = clipDuration[7]; } break; case 7: movieOrder.push_back(init); init = 8; playMovie(8, file); clipEnd = clipDuration[8]; break; case 8: movieOrder.push_back(init); done = 1; break; } } /*************************************************************** ** End of method 2 ** ***************************************************************/ } else { cout << "No face detected!" << clock() << endl; //Output a text followed by a time in ms, if the user's face is not detected cvShowImage( VERSION, frame_copy ); } cvReleaseImage( &mouthImage );//Release image (delete it from memory) 213 Group 08ml582 Reactive movie playback 12. Appendix if( cvWaitKey( 10 ) >= 0 ) //Wait for a keypress. If a key is pressed, the program is terminated done = 1; } cvReleaseImage( &frame_copy ); cvReleaseCapture( &capture ); } //cvDestroyWindow(VERSION); cvDestroyWindow("Happy?"); //The following code writes to the log file. In stead of writing cout we write log, but the rest is pretty self-explanatory ofstream log ("log.txt"); for (int j=0; j<smileTimer.size(); j++) { if (j%2==0) { log << "User started smiling at " << smileTimer[j] << "ms" << endl; } else { log << "User stopped smiling at " << smileTimer[j] << "ms" << endl; } } log << endl << "The user had the following \"Smile Score\" for each of the three humor types:" << endl << "Style 1: " << hStyle[0] << "\nStyle 2: " << hStyle[1] << "\nStyle 3: " << hStyle[2] << endl; log << endl << "User watched the movies in the following order:"; for(int k=0; k<movieOrder.size();k++) { log << " " << movieOrder[k]; } log << "." << endl; log << endl << "User smiled a total of " << smileTimer.size()/2 << " times." << endl; return 0; //End of main } IplImage* mouthDetection( IplImage* img, unsigned char colorChange ) { //Most of this is not developed by Group 08ml582, but is standard OpenCV code IplImage* mouthPixels; bool detectedFace = 0; static CvScalar colors[] = { {{0,0,255}}, {{0,128,255}}, {{0,255,255}}, {{0,255,0}}, {{255,128,0}}, {{255,255,0}}, {{255,0,0}}, {{255,0,255}}}; double scale = 2.2; IplImage* gray = cvCreateImage( cvSize(img->width,img->height), 8, 1 ); IplImage* small_img = cvCreateImage( cvSize( cvRound (img->width/scale),cvRound (img>height/scale)),8, 1 ); int i; cvCvtColor( img, gray, CV_BGR2GRAY ); cvResize( gray, small_img, CV_INTER_LINEAR ); cvEqualizeHist( small_img, small_img ); 214 Group 08ml582 Reactive movie playback 12. Appendix cvClearMemStorage( storage ); if( cascade ) { double t = (double)cvGetTickCount(); CvSeq* faces = cvHaarDetectObjects( small_img, cascade, storage, 1.1, 2, 0/*CV_HAAR_DO_CANNY_PRUNING*/, cvSize(30, 30) ); t = (double)cvGetTickCount() - t; for( i = 0; i < (faces ? 
faces->total : 0); i++ ) { CvRect* r = (CvRect*)cvGetSeqElem( faces, i ); CvPoint facecenter; int radius; facecenter.x = cvRound((r->x + r->width*0.5)*scale); facecenter.y = cvRound((r->y + r->height*0.5)*scale); radius = cvRound((r->width + r->height)*0.25*scale); // Mouth detection CvPoint mouthUpLeft; CvPoint mouthDownRight; //Define the top left corner of the rectangle spanning the mouth //Check report for further explanation mouthUpLeft.x = facecenter.x - 0.5*radius; mouthUpLeft.y = facecenter.y + 0.3*radius; //Define the bottom right corner of the rectangle spanning the mouth mouthDownRight.x = mouthUpLeft.x + radius; mouthDownRight.y = mouthUpLeft.y + radius * 0.5; detectedFace = true; cvRectangle( img, mouthUpLeft, mouthDownRight, colors[colorChange], 3, 8, 0); //Create an rectangle as specified above. This is the mouth! int step = gray->widthStep/sizeof(uchar); uchar* data = (uchar *)gray->imageData; //Set mouthPixels' different attributes mouthPixels = cvCreateImage( cvSize(mouthDownRight.y mouthUpLeft.y,mouthDownRight.x - mouthUpLeft.x),IPL_DEPTH_8U, 1 ); mouthPixels->height = mouthDownRight.y - mouthUpLeft.y; mouthPixels->width = mouthDownRight.x - mouthUpLeft.x; mouthPixels->widthStep = mouthPixels->width/sizeof(uchar); mouthPixels->nChannels = 1; uchar* data2 = (uchar *)mouthPixels->imageData; //Set data2 to the imageData of mouthPixels int data2Location = 0; for(int a = mouthUpLeft.y; a < mouthDownRight.y; a++) { for(int b = mouthUpLeft.x; b < mouthDownRight.x; b++) { data2[data2Location] = data[b+a*step]; //For the rectangle that makes up the mouth, save the image data from the webcam feed data2Location++; }}}} //std::cout << detectedFace << std::endl; - specified if(detectedFace == true) { // cvShowImage( VERSION, mouthPixels ); } else { cvShowImage( VERSION, gray);}; cvReleaseImage( &gray ); cvReleaseImage( &small_img ); cvEqualizeHist(mouthPixels,mouthPixels); return mouthPixels; //If a mouth was found, return the image that specifies the mouth } 215 Group 08ml582 Reactive movie playback 12. Appendix else { gray->width = 0; cvReleaseImage( &small_img ); return gray; //If no mouth was found, return a gray image } }; IplImage* resizeMouth(IplImage* src) { IplImage* resizedMouth = cvCreateImage(cvSize(30,15),IPL_DEPTH_8U, 1); //Create an IplImage* and set its size to 30*15 cvResize(src, resizedMouth, 1); //resize input image to fit inside resizedMouth and store the image in resizedMouth return resizedMouth; //return resizedMouth }; bool isSmiling(IplImage* mouthPicture, IplImage* smileTemplate, IplImage* neutralTemplate, int st) { int smileDist = 0; int neutralDist = 0; int width = mouthPicture->width; //Call pixelDifference to calculate difference between mouthPicture and smileTemplate and compare the value to the difference between mouthPicture and neutralTemplate + threshold. 
if (pixelDifference(mouthPicture,smileTemplate) < pixelDifference(mouthPicture, neutralTemplate)+st)
{
return true; // If the difference was smaller between the mouthPicture and the smileTemplate, the frame is smiling, so we return true
}
else
{
return false; // If the difference was smaller between the mouthPicture and the neutralTemplate, the frame is not smiling, so we return false
}
};
double pixelDifference(IplImage* pic1, IplImage* pic2)
{
CvScalar s, t;
int width = pic1->width;
double diff = 0;
for (int y = 0; y < pic1->height; y++)
{
for (int x = 0; x < pic1->width; x++)
{
s = cvGet2D(pic1,y,x); // Save the current pixel value of the first input image in CvScalar s
t = cvGet2D(pic2,y,x); // Save the current pixel value of the second input image in CvScalar t
diff += (s.val[0]-t.val[0])*(s.val[0]-t.val[0]); // Increase diff by the squared difference between s and t
}
}
return sqrt(diff); // Return the square root of the summed squared differences, i.e. the total difference between the two input images
};
void playMovie(int newMovie, const char *file[])
{
system(file[newMovie]); // Calls the command stored in place newMovie of array file
};
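Note on playlist.txt: the Smile Detection program reads this plain-text file in main() as pairs of lines, where an even-numbered line holds the command string that playMovie() later passes to system(), and the following odd-numbered line holds the clip's duration used for clipEnd (presumably in milliseconds, given the +2000 offset against the timer). The actual file shipped with the product is not reproduced in this appendix, so the lines below are only a hypothetical sketch of the expected format; the clip file names and durations are placeholders:

start clip0_musicbox.avi
64000
start clip3_trampoline.avi
58000
start clip6_bullettime.avi
61000
(... one command line and one duration line for each of the nine clips, 18 lines in total, matching the lineNum < 18 guard in main())

Because the program keeps detecting smiles while a clip is playing, the command in the playlist presumably returns immediately (for example via the Windows start command, as sketched above) instead of blocking inside system() until the player exits.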
12.3 Forms
Consent form
Test of animated movie
Age: 18-24 25-31 32+
If you would like us to contact you when further tests of this product are to be conducted, please fill out the following form too:
Name: _______________________________________________________________
Address: _______________________________________________________________
Zip-code & city: _______________________________________________________________
Phone: _______________________________________________________________
E-mail: _______________________________________________________________
This test is performed in order for the study group 08ml582 of Aalborg University Copenhagen to collect data about the product you are about to test. The test will last approximately 10 minutes and you are free to leave the test at any time, if you feel you do not want to continue. You will be sitting in a room together with three observers. All three persons are from the previously mentioned study group. One person will be in charge of technical issues, one person will be an observer and one person will be there to help you in case of any doubts about the test. The goal of the project is to determine whether it is possible to establish a new way for viewers to watch interactive movies. In this test, your reactions to certain animated clips will be tested and afterwards, you will be asked to fill out a questionnaire about what you have been through. Your personal information will only be used by the group internally and none of your personal information will be given to third parties. If you agree to these terms, please state so below.
I understand the terms above and agree to participate in the test on these terms: Yes No
Date _____________ Signature _____________________________________________________
Questionnaire for test of animatics
Age: 18-24 25-31 32+
On a scale from 1 to 5, where 1 is no change and 5 is a lot of change, how often did you find the humor style to change during the test? 1 2 3 4 5
On a scale from 1 to 5, where 1 is no control and 5 is full control, how much did you feel you had control over the choice of movie clips? 1 2 3 4 5
If you felt that you were in control, how did you feel you were controlling the movie clips?
____________________________________________________________________________________________________________
____________________________________________________________________________________________________________
____________________________________________________________________________________________________________
____________________________________________________________________________________________________________
Which of the following clips did you think were the funniest?
Questionnaire for final test
Which of the following clips did you think were the funniest?
Which of the following clips did you think the last two clips (the black-and-white ones) were most similar to?
12.4 Test Results
Initial test
For the initial test, the users were asked whether they sensed the style of humor change during the test. They were to answer this on a scale from 1 to 5. Users were also asked, on a scale from 1 to 5, whether they thought they were in control of the change in humor style. Choosing between three screenshots, they had to pick the movie clip they found the funniest, while the program detected which clip they smiled the most at.

Person | Age | Style change | Control | User choice | Detected funniest clip | Correct choice
Person 1 | 25-31 | 3 | 2 | 2 | N/A | False
Person 2 | 18-24 | 3 | 3 | 3 | 2 | False
Person 3 | 18-24 | 4 | 1 | 2 | N/A | False
Person 4 | 18-24 | 3 | 2 | 2 | 3 | False
Person 5 | 18-24 | 4 | 1 | 3 | 1 | False
Person 6 | 18-24 | 1 | 1 | 1 | 1 | True
Person 7 | 25-31 | 2 | 1 | 3 | 3 | True
Person 8 | 25-31 | 1 | 1 | 2 | 2 | True
Person 9 | 25-31 | 4 | 2 | 2 | 3 | False
Person 10 | 25-31 | 3 | 1 | 3 | 1 | False

[Charts: distributions of the "Style change" and "Control" ratings, on the scale 0 = no change to 5 = a lot of change]

Final Test
The final test was in many ways conducted in the same way as the initial test. The general test setup was the same, but instead of only testing on the reactive program, every second test person was tested on a program that did not react according to the user's smile.
The reactive test:
Person | Age | User funniest clip | Detected funniest clip | Fitting style | Total smile time (ms)
Person 1 | 18-24 | 3 | 3 | 1 | 95499
Person 3 | 18-24 | 3 | 1 | 2 | 26284
Person 5 | 18-24 | 3 | 3 | 3 | 26662
Person 7 | 18-24 | 2 | 1 | 1 | 445
Person 9 | 18-24 | 1 | 1 | 3 | 71847
Person 11 | 18-24 | 2 | 1 | 2 | 6733
Person 13 | 18-24 | 2 | 1 | 1 | 70333
Person 15 | 18-24 | 3 | 3 | 1 | 21374
Person 17 | 18-24 | 2 | 2 | 1 | 3865
Person 19 | 18-24 | 3 | 1 | 1 | 18699
Person 21 | 18-24 | 3 | 1 | 1 | 104697
Person 23 | 18-24 | 3 | 2 | 3 | 128658
Person 25 | 25-31 | 3 | 3 | 3 | 1787
Person 27 | 18-24 | 3 | 3 | 3 | 47395
Person 29 | 18-24 | 3 | 3 | 3 | 52642
All users' total smile time: 676920 ms
Average user smile time: 22564 ms

[Pie charts "Preferred clip": user choice (Style 1 6%, Style 2 27%, Style 3 67%) vs. program detection (Style 1 47%, Style 2 13%, Style 3 40%)]

The non-reactive test:
Person | Age | User funniest clip | Detected funniest clip | Fitting style | Total smile time (ms)
Person 2 | 18-24 | 1 | 1 | 1 | 24499
Person 4 | 18-24 | 3 | 0 | 1 | 4441
Person 6 | 18-24 | 3 | 2 | 3 | 82507
Person 8 | 18-24 | 3 | 3 | 1 | 25047
Person 10 | 18-24 | 3 | 2 | 1 | 16073
Person 12 | 25-31 | 3 | 3 | 1 | 67742
Person 14 | 18-24 | 3 | 1 | 1 | 24220
Person 16 | 18-24 | 3 | 2 | 1 | 55104
Person 18 | 18-24 | 3 | 3 | 1 | 4814
Person 20 | 18-24 | 3 | 1 | 3 | 31198
Person 22 | 18-24 | 1 | 2 | 3 | 1473
Person 24 | 18-24 | 2 | 2 | 1 | 61533
Person 26 | 18-24 | 3 | 3 | 1 | 6466
Person 28 | 18-24 | 3 | 1 | 1 | 2120
Person 30 | 18-24 | 3 | 3 | 1 | 9030
All users' total smile time: 416267 ms
Average user smile time: 13875.57 ms
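The percentages discussed in the Testing chapter are derived from the raw rows above. As a quick sanity check of the reactive-test table, the following is a minimal standalone C++ sketch (not part of the delivered programs; the three arrays simply repeat the table values) that counts how often the detected funniest clip agrees with the user's own choice and sums the logged smile times:

#include <iostream>
int main()
{
    // Values copied row by row from "The reactive test" table above
    int userChoice[15]     = {3,3,3,2,1,2,2,3,2,3,3,3,3,3,3};
    int detectedChoice[15] = {3,1,3,1,1,1,1,3,2,1,1,2,3,3,3};
    long smileTime[15]     = {95499,26284,26662,445,71847,6733,70333,21374,3865,18699,104697,128658,1787,47395,52642};
    int matches = 0;
    long totalSmileTime = 0;
    for (int i = 0; i < 15; i++)
    {
        if (userChoice[i] == detectedChoice[i]) matches++; // Agreement between user and program
        totalSmileTime += smileTime[i];
    }
    std::cout << "Detected funniest clip matched the user's choice in " << matches << " of 15 tests" << std::endl;
    std::cout << "Total smile time: " << totalSmileTime << " ms" << std::endl;
    return 0;
}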