MATCHING SELF-PRESENTATION IN INTERNET DATING SITES TO CONSUMER PREFERENCES – AN INNOVATIVE MATCHING ALGORITHM

Moti Zwilling
Academic Center of Law and Business, Israel

Natek Srečko
International School for Social and Business Studies, Slovenia

Abstract: This study presents an innovative "Matching Algorithm" that matches self-presentation to consumer preferences in internet dating sites using data mining and machine learning techniques. The study consists of two parts. The first part uses several hypotheses to examine the correlation between the presentation characteristics of men and women in social networks and the response rate. Results show a strong correlation between the way men and women present themselves in social networks (such as Facebook) and the response rate, especially in the age range 18-55 (average age 25.91). In addition, there is a strong positive correlation between the desire of men and women to develop a romantic relationship through social networks and their level of usage of these networks: the more a user wishes to achieve a "real" relationship that may lead to a serious long-term relationship, the more he or she uses the social network as a means to that end. In the second part the authors used data mining and machine learning techniques (decision trees and genetic algorithms) to predict which personal attributes may influence the response rate of the other side (only the results of the J48 decision tree algorithm are shown in this paper). Results show that some attributes (characteristics) related to personal presentation and educational background are critical to achieving a positive response from the other side.

Keywords: social networks, dating sites, self-presentation, consumer preferences, data mining

1. INTRODUCTION

Online dating sites have been defined as "The place where individuals created profiles, and initiate contact with others through an online service" (Hancock et al., 2007).
The Internet, as a computer-mediated communication interface, has increasingly become a place where people can meet each other and express affection and emotion in order to have a parallel interaction (Walther, 1996). This view was also supported by Wysocki (1998), who claimed that internet dating sites and forums are creating a new kind of relationship, in which intimate interaction is considered quicker and easier to form than in a face-to-face relationship.

In recent years a growing number of papers have reported research on matching and consumer behavior in internet dating sites. Such research was conducted as early as 2005 by Madden and Lenhart (2006), who examined how people use online dating sites. Their study found that, among Internet users who are currently single and looking for romantic partners, 74% have used the Internet in one way or another to further their romantic interests. The study also found that a significant number of Americans personally know others who have tried and succeeded in online dating: almost 15% of the respondents said they know someone who has been in a long-term relationship with, or married, someone they met online.

A telephone survey of Australian adults by Hardie and Buzwell (2006) revealed that 13% of the respondents have had online social relationships. Most of them are students, young, single and comfortable with new technology. Single and partnered individuals equally admitted that they had experienced online romance, indicating that many of them may be cheated by online daters.

Using the Gale-Shapley algorithm, Hitsch et al. (2010) investigated mate preferences and matching in the dating or marriage market formed by users of online dating sites. The results of their study clearly indicate that women place a stronger emphasis than men on the partner's income.
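For readers unfamiliar with the Gale-Shapley algorithm mentioned above, the following is a minimal sketch of its deferred-acceptance procedure. It is an illustration only and does not reproduce the analysis of Hitsch et al. (2010): the function name `gale_shapley`, the user labels and the preference lists are hypothetical, and the sketch assumes two equally sized groups in which everyone ranks every member of the other group.

```python
def gale_shapley(proposer_prefs, receiver_prefs):
    """Return a stable matching as {receiver: proposer}.

    Assumes equally sized groups and complete preference lists.
    """
    # rank[r][p] = position of proposer p in receiver r's list (lower is better)
    rank = {
        r: {p: i for i, p in enumerate(prefs)}
        for r, prefs in receiver_prefs.items()
    }
    next_choice = {p: 0 for p in proposer_prefs}  # index of next receiver to try
    engaged = {}                                   # receiver -> current proposer
    free = list(proposer_prefs)                    # proposers without a partner

    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged.get(r)
        if current is None:
            engaged[r] = p                 # receiver was free: accept for now
        elif rank[r][p] < rank[r][current]:
            engaged[r] = p                 # receiver prefers the new proposer
            free.append(current)           # the previous partner is free again
        else:
            free.append(p)                 # rejected: p will try the next choice
    return engaged


# Hypothetical preference lists, most preferred first (not data from any study).
men = {
    "m1": ["w1", "w2", "w3"],
    "m2": ["w1", "w3", "w2"],
    "m3": ["w2", "w1", "w3"],
}
women = {
    "w1": ["m2", "m1", "m3"],
    "w2": ["m1", "m3", "m2"],
    "w3": ["m1", "m2", "m3"],
}
matching = gale_shapley(men, women)
print(matching)
```

The resulting matching is stable in the sense that no man and woman both prefer each other to their assigned partners.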
A survey of 300 university students by Donn and Sherman (2002) explored young adults' attitudes and practices regarding the use of the Internet to facilitate romantic relationships. The results show that most of the students have had the experience of getting to know, and developing a relationship with, someone they met online.

Much of the research focuses on the preferences of consumers in online dating sites; very little is devoted to matching preferences among consumers or to the use of data mining algorithms to implement such matching. As stated, social interaction on the Internet seems to be one of the reasons that this arena has become more popular among consumers and that intimate relationships have become increasingly widespread in cyberspace (Yang and Chiou, 2010).

The way people present themselves in internet forums and internet dating sites has been examined in many studies. These studies reveal that women prefer taller men and that they receive more responses from men who seek "good shape" and "good looking" (Hitsch et al., 2010). Another interesting finding is that women and men try not to reveal the "true picture" of themselves, especially in the way they describe themselves and in the picture they present (which is most often not up to date) (Toma et al., 2008).

In this study three main questions were raised:
1) Is there a correlation between the way people present themselves in dating sites or social networks, that is, the different levels of self attributes they use, and the response rate from the other side?
2) Is there any difference between men and women in the desire to develop an intimate relationship?
3) Can data mining prediction tools be used to forecast which user attributes may influence their usage and the response rate of the other side?

2. METHODOLOGY

A sample of 150 respondents was examined using questionnaires. 53% were men and 41% were women.
The average age of respondents was 25.91 (SE = 5.99). Questionnaires were delivered to respondents by email, through Facebook and in face-to-face meetings. Four questionnaires were found unsuitable for analysis.

The first part of the questionnaire examined how people describe themselves in the internet arena and how they perceive their own image. The second part consisted of 11 questions answered on a Likert scale ranging from 1 (not at all) to 5 (very much). These questions were aimed mainly at finding how respondents use the Internet to find a partner and how strongly they wish to develop an intimate relationship through the Internet.

In the last part of the questionnaire respondents ranked different presentations of imaginary profiles of a woman and a man. The authors manipulated hair color using the Photoshop software, so that each profile differed from the other pictures in only one attribute (hair color). In addition, location and mood were varied: pictures were taken either in a "closed" place such as a bar or in an open environment (near the sea), and in either a good mood or a sad mood.

The research hypotheses were examined using Spearman correlation and two-tailed t-test analysis. The matching forecast was based on the J48 decision tree algorithm provided by Weka, a free tool widely used in data mining analysis.

The research hypotheses were as follows:

H1: There is a correlation between the way a seeker in an internet site presents himself/herself and the response rate of the other side to this presentation.
H2: There is a positive correlation between the desire to develop a romantic relationship through social networks and the level of usage of these networks: the more the seeker would like to develop a romantic relationship through the internet dating site, the higher his or her usage of the site will be.

H3: There is a difference between men and women in the usage of internet dating sites. Women will be found to be more active than men in dating sites.

H4: There is a difference between men and women in the desire to develop a romantic relationship. Women's desire to develop a romantic relationship will be found to be higher than men's.

3. RESULTS

Table 1 exhibits a sample of 100 respondents, of whom 58 are men and 42 are women. The average age is 24 and ages range between 18 and 35.

Table 1: Sample characteristics

         N (%)         Age, Mean (SE)
Men      58 (58%)      24.813 (3.07)
Women    42 (42%)      24.66 (2.99)
Total    100 (100%)    24.75 (3.02)

Picture 1: Correlation between the desire of users to develop a romantic relationship through social internet sites and the level of usage (axes: level of usage in social networks; desire to develop a romantic relationship)

Picture 1 exhibits a positive correlation between the desire of women and men to develop a romantic relationship and the level of usage of social networks.

Table 2: Correlation between the level of usage of the Internet and the desire to develop an intimate relationship

                                    Level of usage    Desire to develop a relationship
Level of usage                      1                 0.353**
Desire to develop a relationship    0.353**           1
**p < 0.01

The Spearman test yielded a positive correlation between the level of usage of internet sites and the desire of users to develop a relationship.
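A coefficient such as the 0.353 reported above can be computed as a Pearson correlation on ranks, which is the usual way to handle the many ties that Likert answers produce. The following is a minimal sketch of that calculation; the two answer lists are hypothetical and are not the study's data, and the helper names `average_ranks`, `pearson` and `spearman_rho` are introduced here for illustration only.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical Likert answers (1-5) of eight respondents, for illustration only.
desire = [1, 2, 2, 3, 4, 4, 5, 5]   # desire to develop a relationship
usage = [2, 1, 3, 3, 3, 5, 4, 5]    # level of usage
rho = spearman_rho(desire, usage)
print(f"Spearman rho = {rho:.3f}")
```

Testing whether such a coefficient differs significantly from zero, as in the tables, requires an additional step that this sketch leaves out.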
Table 3: Correlation between the desire of users to develop a romantic relationship through social internet sites and the level of usage, among men

                                    Level of usage    Desire to develop a relationship
Level of usage                      1                 0.311**
Desire to develop a relationship    0.311**           1
**p < 0.01

The Spearman test yielded a significant positive correlation between the level of usage of internet sites and the desire of users to develop a relationship (in the sub-category of men).

Table 4: Correlation between the desire of users to develop a romantic relationship through social internet sites and the level of usage, among women

                                    Level of usage    Desire to develop a relationship
Level of usage                      1                 0.411**
Desire to develop a relationship    0.411**           1
**p < 0.01

The Spearman test yielded a significant positive correlation between the level of usage of internet sites and the desire of users to develop a relationship (in the sub-category of women).

Table 5: Differences between women and men regarding usage and the desire to develop a romantic relationship on the Internet (t-test)

                                             Male M (SD)      Female M (SD)    t         95% CI
Level of usage                               3.752 (0.876)    4.058 (0.401)    -2.107    [-0.594, 0.017]
Desire to develop a romantic relationship    3.907 (0.685)    4.095 (0.292)    -1.66     [-0.411, 3.598]
**p < 0.05

The t-test analysis yielded significant differences between men and women regarding their desire to develop a romantic relationship in internet sites and their level of usage.

Content and Spearman analysis of the pictures and the "self description" of users reveals that women are more attracted to a positive and rich man who is seen in a good mood and a positive position, while men are attracted to a "good looking" woman who is seen in a positive position and looks attractive.

Picture 2: Forecast of the match (high/low) between male and female seekers in internet dating sites according to their profile (romantic objective and level of usage).
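The high/low match forecast relies on J48, which is Weka's implementation of the C4.5 decision tree learner. C4.5 chooses the attribute to split on by its gain ratio, and the sketch below illustrates only that selection step. It is not Weka's code and it omits pruning and the handling of numeric attributes; the profile attributes (`mood`, `location`, `education`) and the eight example rows are hypothetical and were not taken from the questionnaires.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Information gain of splitting on `attribute`, divided by split information."""
    labels = [row[target] for row in rows]
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    n = len(rows)
    # Weighted entropy of the target inside each branch of the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalises attributes with many small branches.
    split_info = entropy([row[attribute] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical profiles with a high/low match label, for illustration only.
profiles = [
    {"mood": "good", "location": "open", "education": "academic", "match": "high"},
    {"mood": "good", "location": "closed", "education": "academic", "match": "high"},
    {"mood": "good", "location": "open", "education": "high_school", "match": "high"},
    {"mood": "good", "location": "closed", "education": "academic", "match": "high"},
    {"mood": "sad", "location": "open", "education": "high_school", "match": "low"},
    {"mood": "sad", "location": "closed", "education": "high_school", "match": "low"},
    {"mood": "sad", "location": "open", "education": "academic", "match": "low"},
    {"mood": "sad", "location": "closed", "education": "high_school", "match": "low"},
]
attributes = ["mood", "location", "education"]
scores = {a: gain_ratio(profiles, a, "match") for a in attributes}
best = max(attributes, key=scores.get)
print(best, scores)
```

A full tree is obtained by splitting on the best attribute and repeating the selection within each branch until the branches are pure or no attributes remain.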
The J48 algorithm provided by Weka correctly classified a good share of the instances in a simulated analysis of profiles based on the data gathered from the open and closed questionnaires. It shows clearly that data mining can be utilized to forecast a "high" or "low" match between internet seekers who hold different profiles and hence different attributes.

4. CONCLUSIONS

The purpose of the study is to examine whether picture content, profile attributes and the preferences of users in internet dating sites may form a good match between male and female seekers. Results support the research hypothesis that profile attributes such as good looks, positive position, age and economic situation clearly have a substantial influence on seekers, especially where a romantic purpose is the reason for their presence in the site. In addition, the use of data mining for forecasting showed that decision trees may support the hypothesis that certain attributes of user profiles, such as looks and economic situation, may play an important role in the success of the matching. Yet more remains to be done to find which attributes are of most importance, and which other algorithms may help to forecast them, in order to improve the matching between male and female seekers.

REFERENCE LIST

1. Toma, C. L., Hancock, J. T., and Ellison, N. B. (2008). Separating fact from fiction: An examination of deceptive self-presentation in online dating profiles. Personality and Social Psychology Bulletin, 34, 1023.
2. Donn, J. E., and Sherman, R. C. (2002). Attitudes and practices regarding the formation of romantic relationships on the Internet. CyberPsychology & Behavior, 5(2), 107-123.
3. Hitsch, G. J., Hortaçsu, A., and Ariely, D. (2010). Matching and sorting in online dating. American Economic Review, 100(1), 130-163.
4. Hancock, J. T., Toma, C., and Ellison, N. (2007). The truth about lying in online dating profiles. CHI 2007 Proceedings, Online Representation of Self, April 28-May 3, 2007, San Jose, CA, USA.
5. Hardie, E., and Buzwell, S. (2006). Finding love online: The nature and frequency of Australian adults' Internet relationships. Australian Journal of Emerging Technologies & Society, 4(1), 1-14.
6. Madden, M., and Lenhart, A. (2006). Online dating. Pew Internet and American Life Project, March 5, 2006. Washington, DC.
7. Walther, J. B. (1996). Computer-mediated communication: Impersonal, interpersonal, and hyperpersonal interaction. Communication Research, 23(1), 3-44.
8. Wysocki, D. K. (1998). Let your fingers do the talking: Sex on an adult chat-line. Sexualities, 1(4), 425-452.

APPENDIX

Questionnaire sample questions

Q14 (pictures A, B, C, D): Please indicate which of the above pictures you would most likely choose for a romantic purpose.

Q15 (pictures A, B, C, D): Please indicate which of the above pictures you would most likely choose for a romantic purpose.