ICT 2012 Study Guide revision 1v2
STUDY GUIDE
Information Competence ICT512S
Namibia University of Science and Technology
Centre for Open and Lifelong Learning

Course Writers
William S Torbitt, Dean, IT, IUM
Peter Gallert, HOD, Computer Systems & Networks, Namibia University of Science and Technology

Content Editor
Ravi Nath, School of Information Technology, Namibia University of Science and Technology

Instructional Designer
Sonja Joseph

Language Editor
Carol Kotze (?)

Quality Controller
Agathe Lewin

Copyright
Published by the Centre for Open and Lifelong Learning, Namibia University of Science and Technology, Windhoek, 2012.
© Centre for Open and Lifelong Learning, Namibia University of Science and Technology.
Graphics and content under different copyrights inherit copyright regulations of the copyright holders.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without prior permission of the publishers.

Namibia University of Science and Technology
Centre for Open and Lifelong Learning
13 Storch St
Private Bag 13388
Windhoek
Namibia
Fax: +264 61 2072206/2081
E-mail: coll@nust.na
Website: www.nust.na

Acknowledgements
The Namibia University of Science and Technology Centre for Open and Lifelong Learning wishes to thank those below for their contribution to this study guide:
Mr. Bernd Schulz: Contribution of in-text questions

Figure licences
Unit | Figure | License | Author
1 | 1 | CC-BY 2.5 Generic | Barrett Lyon
1 | 2 | CC-BY-NC-ND | worldmapper.org. Commercial companies providing resources for education are exempted from NC, see http://www.worldmapper.org/copyright.html
1 | 3 | CC-BY-SA 2.0 Generic | Al Jazeera English
1 | 4 | Public Domain (PD) |
1 | 5 | Fair use |
1 | 6 | CC0 1.0 |
2 | 1 | PD |
2 | 2 | PD |
2 | 3 | CC0 1.0 |
2 | 4 | CC-SA 2.0 Generic | Paul Clarke
2 | 6 | CC-BY-SA 3.0 Unported | Nikolas Becker
3 | 1 | Fair use |
4 | 1 | PD |
4 | 2 | CC-SA 2.0 Generic | Joi Ito
5 | 1 | CC-BY-SA 3.0 Unported | Stern
6 | 1 | Fair use | Robert Kenneth Wilson, Daily Mail, 1934
7 | All figures | Fair use |
8 | 1 | CC-BY-SA 3.0 Creative Commons Unported | Daniel Joseph, Barnhart Clark
9 | 1 | CC-BY-SA 2.5 Generic | Wikimedia Commons. Original by Markus Angermeier. Vectorised and linked version by Luca Cremonini
9 | 2 | CC-BY-NC-SA 2.5 | xkcd.com. NC requirement waived for "reprinting occasional comics (with clear attribution) in publications like books", see http://xkcd.com/license.html
9 | 3 | Possibly new creation | Bill Torbitt
10 | 1 | CC-BY-SA 3.0 Unported | Mikael Häggström
10 | 2 | CC-BY-SA 3.0 Unported | Bridgespan Partners
11 | 1 | PD |
11 | 2 | CC-BY-SA 3.0 Unported |
11 | 3 | Fair use |
11 | 4 | CC-BY-SA 3.0 Unported | Adam Jones
11 | 5 | CC-BY-SA 3.0 Unported | Regis Lachaume
12 | 1 | PD + own creation |
12 | 2 | PD |
12 | 3 | CC-BY-SA 3.0 Unported | Rico Shen Tom-b

Contents

About this study guide
  How this study guide is structured
Course overview
  Welcome to Information Competence ICT512S
  Information Competence ICT512S: is this course for you?
    Your own Internet connection?
      The main factors are:
  Course objectives
  Timeframe
  Study skills
  Need help?
  Assignments
  Assessment
Getting around this study guide
  Margin icons
Unit 1: The Impact of the Internet
  Introduction
  1. The World in One Room
  2. The Role of Conventional Media
    2.1 What was life like before the Internet?
      Pre-Internet communication technology
    2.2 One of the longest lived media formats – the newspaper
    2.3 The future of the conventional media
    2.4 Finally, a note on the future of some professions
  3. Official ‘Secrets’, Personal Privacy and the Restriction of Information
  Unit summary
Unit 2: How the Internet works
  Introduction
  1. A Short History of the Internet
    1.1 The Internet in Namibia
  2. The Web and Web Browsers
  3. The Technology
    3.1 The protocol suite TCP/IP
    3.2 Internet Protocol (IP) addresses:
    3.3 Domain Name System (DNS)
    3.4 Dynamic Host Configuration Protocol (DHCP)
  Unit summary
Unit 3: Data, Information and Knowledge
  Introduction
  1. Information for Decision-Making
  2. Data, Information and Knowledge
    2.1 Data and Information
      Primary, secondary and tertiary information:
    2.2 Knowledge
  Unit summary
Unit 4: Search Engines
  Introduction
  1. How a Search Engine Works
  2. Choice of a search engine
  Unit summary
Unit 5: Creating your Own Web Page (or Web Site)
  Introduction
  1. A Page by Coding with HTML
    1.1 Step-by-Step Example
  2. The Difference between a ‘Page’ and a ‘Site’
  3. A Page or Site Produced with a Web Design Tool
  Unit Summary
Unit 6: The Reliability of Information
  Introduction
  1. Impartiality and Bias
    1.1 Types of bias:
  2. How to Detect Bias and ‘Disinformation’
    2.1 “Unspeak”
  Unit Summary
Unit 7: Elementary logic, assumptions and reasoning, fallacies and misleading statistics
  Introduction
  1. The nature of logic
  2. Statistics
  3. Presentation of Statistical Information
  Unit Summary
Unit 8: Intellectual Property, Plagiarism and Copyright
  Introduction
  1. The History of Intellectual Property Protection
  2. Copyright, patent and trademark
    2.1 Copyright
    2.2 Patent
      Creative Commons
    2.3 Trademarks and logos
  3. Plagiarism
    3.1 Referencing and citation:
      Reference list
  Unit summary
Unit 9: Web 2 and 3, E-business and the 'Long Tail'
  Introduction
  1. The Evolution of the Web
    1.1 Web 1
    1.2 Web 2
    1.3 Web 3
  2. The business of the Internet
  3. E-commerce
      Cautions and advice for buying online
  4. The Long Tail
  Unit summary
Unit 10: The Wikipedia phenomenon
  Introduction
  1. What is an encyclopedia?
  2. The nature of Wikipedia
  3. Why is Wikipedia special?
    3.1 Name spaces
    3.2 Hierarchy and subject experts on Wikipedia
    3.3 Verifiability
    3.4 Reliability of Sources, Relative to Wikipedia
    3.5 Neutrality - Fringe and Conspiracy Theories - Due Weight
  4. Getting started with Wikipedia
      Learning your Way around Wikipedia and Getting Help
  5. Finding an article to create or improve
  Unit summary
Unit 11: The Mobile Revolution
  Introduction
  1. The background and range of mobile services
  2. Mobile computing in Africa
  3. What Can Mobile Devices be Used for (in Africa and Elsewhere)?
    3.1 Advertising
    3.2 Location aware services:
    3.3 (A story of) Crime Investigation
    3.4 General information:
    3.5 Then some more innovative services:
      Your mobile phone as a ‘remote’:
      Your mobile phone as ID:
      Financial services and ‘electronic wallet’:
      “Mobile activism”
  Unit summary
Unit 12: The Down Side of the Internet
  Introduction
  1. The dangers of the Internet - Cybercrime
  2. ID Theft and Phishing
    2.1 ID Theft
    2.2 Phishing
  3. Cyber-stalking, cyber-harassing, cyber-bullying
  4. Cyber-terrorism and cyber-warfare
  5. Botnets
  6. Is the Internet making us ‘dumber’?
  Unit summary

About this study guide

Information Competence ICT512S has been produced by the Centre for Open and Lifelong Learning, Namibia University of Science and Technology. All study guides produced by the Centre for Open and Lifelong Learning (COLL) are structured in the same way, as outlined below.
How this study guide is structured

The course overview
The course overview gives you a general introduction to the course. Information contained in the course overview will help you determine:
- If the course is suitable for you
- What you will already need to know
- What you can expect from the course
- How this course fits into the learning programme as a whole
- How much time you will need to invest to complete the course

The overview also provides guidance on:
- Study skills
- Where to get help
- Course assignments and assessments
- Study guide icons
- Units

We strongly recommend that you read the overview carefully before starting your study.

The course content
The course is broken down into units. Each unit comprises:
- An introduction to the unit content
- Unit objectives
- The prescribed readings for the unit
- Additional readings for the unit
- Core content of the unit with a variety of learning activities
- References
- Self-assessment activities (if any)
- The unit’s key words or concepts
- A unit summary

Resources
For those interested in learning more on this subject, we provide you with a list of additional resources within each unit of this study guide; these may be books, articles or web sites. Please note that these resources are optional rather than prescribed readings. The prescribed readings are listed at the beginning of each unit.

Your comments
After completing Information Competence we would appreciate it if you would take a few moments to give us your feedback on any aspect of this course. Your feedback might include comments on:
- Course content and structure
- Course reading materials and resources
- Course assignments
- Course assessments
- Course duration
- Course support (assigned tutors, technical help, etc)

Your constructive feedback will help us to improve and enhance this course.

Course overview

Welcome to Information Competence ICT512S
This course is about coping with information in the age of the Internet.
The Internet dominates the world we live in. There are no absolute rules on how to cope with the information it brings, but this course gives you something like a ‘road map’, starting with how to get a cost-effective connection to the Internet if you do not have one already.

You will be introduced to some theoretical concepts of data, information, and knowledge, as well as to some fundamental principles of how the Internet actually works. You will learn where to look for the information you require and how to find it, either from the Internet or from conventional sources, without ‘drowning’ in a sea of excess information.

You will learn how to evaluate information. This entails judging its reliability (how likely it is that a particular piece of information is free from errors and unintended flaws) and its independence (how likely it is that the information you receive is unbiased, neutral, and free from authors' or publishers' agendas).

The idea of scientific referencing is introduced to you: why academics (including students, in other words, you) are expected to reference the work of others, what they reference, and how they do it.

You will learn the principles of elementary logic and how to avoid being misled by incorrect reasoning and poor statistics. You will learn a little of the history of the Internet, and how we got to where we are, including the latest technological phenomena: mobile networking and social interaction over the Internet. You will learn to create and communicate your own information.

Equally, just as in human nature, there is a bad side to the Internet: fraud, faking, theft, deception and worse. You will learn to recognise and guard against these.

Information Competence ICT512S: is this course for you?
This course is intended for all people who do not live in a cave on the planet Zurgon, although there may be Internet access even there!
Seriously, anyone preparing to enter almost any profession, from banking to biology, from music to mining, needs to be able to handle the informational aspects of the profession, and of the information era in which we live. So this course is for you.

There are no prerequisites for this course, other than those to gain entrance to the NUST. All you need is a desire to understand the unprecedented world brought to us largely, but not entirely, by technology.

Your own Internet connection?

Figure 1: 3G Network access. By Peter Gallert

Firstly, you need access to the Internet! For the purpose of this course, all the assignments and exercises may be performed by visiting your local COLL centre and using the equipment there, where you will also receive tuition on how to use it, if necessary. You can also access the Internet, at a cost, from an Internet café. However, I am sure that during and after this course you will become ‘addicted’ to the Internet and will want your own connection to it. Everyone likes independence. So here are a few tips.

Network access in Namibia is vastly easier these days than when the Internet first extended into the country in the early 1990s, but it is sometimes still a problem in rural areas.

The main factors are:

Cost: the cost of equipment (a desktop computer or laptop, if you do not already have one). An alternative is a lower-cost ‘netbook’ (a small laptop, usually without a DVD drive). You do not need a powerful computer just for browsing the Internet. When considering cost there are two items to evaluate: the cost of obtaining the necessary infrastructure, such as computers, modems, and software (this is called Capital Expenditure, or CapEx), and the cost of running the equipment, such as power consumption, Internet access fees, repair and maintenance (this is called Operating Expenditure, or OpEx).

Quality: this is mainly the speed of the connection (measured in megabits per second) but also includes reliability, of course.
Volume: how much data (in megabytes or gigabytes) do you need to upload or download each month? Do not underestimate this! Once, 100 megabytes per month was thought ample; now, if you are downloading or streaming movies, 2 gigabytes per month may not be enough. Remember that your family will develop extensive on-line habits! An ‘unlimited’ or ‘uncapped’ deal may be best if you can get one at an affordable price.

You need a ‘service provider’. In small countries, such as Namibia, the service providers are usually the national Telecom company or mobile phone networks, plus a few secondary providers.

Internet access technology: There are several ways in which to access the Internet; they differ in availability, speed, and price:

- A so-called ADSL connection ‘splits’ your existing house phone line into a data line and the normal voice line. This service is normally provided by Telecom Namibia. It may also include a home ‘wireless’ system (see below). But the service only works in urban areas.
- If you do not have a fixed phone line but want a network connection at home, both Telecom Namibia and MTC have a ‘home phone’, which looks like a domestic phone but actually works on the mobile network. Data access is possible with this as well. In fact, there is less and less distinction these days between a voice phone and a data connection.
- Wireless or ‘wi-fi’ is a short range system based on very high frequency radio communication, ‘broadcasting’ from an access point with an antenna. It is used in restaurants, airports, hotels and private homes. All modern laptops have an internal wireless receiver to communicate by wi-fi. Access is restricted by a password or key, which you sometimes have to pay for.
- WIMAX involves a small square antenna installation on your roof with a radio connection to a line of sight base station. It is mainly provided by Mweb.
- If you live in a rural area with few fixed phone lines and no ADSL, Telecom can install a satellite system called VSAT, which works like a satellite phone and provides network access. Unfortunately, OpEx for a satellite connection is very high.
- The most popular connection mode, and the most convenient for anyone who moves around and wants to take their network access with them, is 3G mobile access. Here, a USB stick-sized modem is inserted into a laptop and the connection is established via the mobile phone network. All the mobile operators offer packages on both a prepaid and a postpaid basis, for certain ‘capped’ amounts of data or, for a higher charge, unlimited access. These packages are constantly advertised; for details of the latest offers it is best to inquire at an MTC ‘mobile home’, Leo shop or Telecom ‘teleshop’.

3G can also be accessed by a mobile phone, with no computer required (actually, smart phones are computers). Mobile technology is amazing, and it is quite awesome to think you can browse the world from the palm of your hand. The trouble is that, although fine in theory, mobile browsers can be frustrating unless you have a good (and expensive) smart phone with at least a 50x50 mm screen and a full keyboard which even large fingers are comfortable with. Browsers on cheaper phones tend to be slow and creaky and often do not display the page correctly, even in miniature. The currently extremely popular but expensive iPhone, now copied by other brands, consists only of a screen interface, and all commands, including a virtual keyboard, are operated by ‘touch screen’ actions. Some people like this, others find it irritating.

Once set up with an Internet service provider (ISP) you get certain perks, such as at least one unique email address.
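To make the cost, speed and volume factors discussed above more concrete, here is a minimal sketch in Python. All figures in it (amounts in N$, speeds, file sizes) are made-up illustrations and not real offers from any provider, and the function names are our own. The sketch assumes that connection speed is quoted in megabits per second, that file sizes are given in megabytes (1 byte = 8 bits), and that 1 gigabyte is counted as 1000 megabytes.

```python
def total_cost(capex, monthly_opex, months):
    """Total cost of a connection: one-off CapEx plus OpEx for every month."""
    return capex + monthly_opex * months


def download_time_seconds(file_size_megabytes, speed_megabits_per_second):
    """Ideal download time, ignoring protocol overhead and network congestion."""
    file_size_megabits = file_size_megabytes * 8  # 1 byte = 8 bits
    return file_size_megabits / speed_megabits_per_second


def monthly_usage_megabytes(daily_megabytes, days=30):
    """Estimated monthly data volume from an average daily volume."""
    return daily_megabytes * days


def fits_in_cap(daily_megabytes, cap_gigabytes, days=30):
    """True if the estimated monthly usage stays within a capped package."""
    cap_megabytes = cap_gigabytes * 1000  # assumption: 1 GB = 1000 MB
    return monthly_usage_megabytes(daily_megabytes, days) <= cap_megabytes


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(total_cost(capex=3000, monthly_opex=400, months=24))  # 12600
    print(download_time_seconds(700, 7))                        # 800.0
    print(fits_in_cap(daily_megabytes=100, cap_gigabytes=2))    # False
```

Under these assumed numbers, a household that uses about 100 megabytes a day would exceed a 2 gigabyte monthly cap, and over two years the running costs would be much larger than the purchase price of the equipment.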
Many young people, however, prefer a webmail address such as Gmail or Yahoo to the one supplied by their ISP; it does not identify your location and you can use it from any device connected to the Internet. Your ISP will generally help you set up your web site and will host it, although you may have to pay for this. Of course there are many commercial companies who will build a professional web site for you, if you require this, but it comes at a price.

Course objectives

Upon completion of this course you will be able to:
- discuss the impact that the Internet has made on modern life;
- describe the history of the Internet and the technology involved;
- distinguish between data, information and knowledge: generate information from data and extract knowledge from information;
- search effectively for required information, using traditional and electronic sources;
- discuss advantages and disadvantages of traditional libraries and online information sources;
- compare different types of information sources, and select the one best suited for solving a given problem;
- analyse acquired information critically to assess its usefulness, truth or bias;
- explain the implications of intellectual property rights, and how to acknowledge and reference sources, and avoid plagiarism and copyright violation;
- create and publish new information;
- report on and communicate information in a professional manner.

Prescribed reading
- Bennett, D.J. (2004). Logic Made Easy: How to Know When Language Deceives You. W.W. Norton & Co., ISBN 978-0393057485
- Bothma, T., Cosijn, E., Fourie, I., Penzhorn, E. (3rd ed. 2011). Navigating Information Literacy. Pearson, ISBN 978-1770259676
- Burrows, T. (2008). Blogs, Wikis, MySpace, and More: Everything You Want to Know About Using Web 2.0 but Are Afraid to Ask. Chicago Review Press, ISBN 978-1556527562
- Croft, B., Metzler, D., Strohman, T. (2009). Search Engines: Information Retrieval in Practice. Addison Wesley, ISBN 978-0136072249
- Frauenfelder, M. (2007). Rule the Web: How to Do Anything and Everything on the Internet - Better, Faster, Easier. St. Martin’s Griffin, ISBN 9780312363338
- Gralla, P. (8th ed. 2006). How the Internet Works. Que, ISBN 978-0789736260
- Nolt, J., Rohatyn, D., Varzi, A. (2nd ed. 1998). Schaum’s Outline of Logic. McGraw-Hill, ISBN 978-0070466494
- Poteet, J. (2004). Canning Spam: You’ve Got Mail (That You Don’t Want). Sams, ISBN 978-0672326394
- Rowley, J., Hartley, R. (4th ed. 2008). Organizing Knowledge. Ashgate, ISBN 978-0754644316
- Whyte, J. (2004). Crimes Against Logic: Exposing the Bogus Arguments of Politicians, Priests, Journalists, and Other Serial Offenders. McGraw-Hill, ISBN 978-0071446433

Recommended website
The original ‘information competence’ course from California State University (last visited March 2011): http://www.calstate.edu/LS/Tutorials.shtml

Timeframe
The duration of this course is one semester. Two hours per week of formal study time is suggested. As for self-study time, if we count all the time you spend browsing the Internet, this is likely to be very long! Enjoy!

Study skills
As an adult learner your approach to learning will be different from that of your school days: you will choose what you want to study, you will have professional and/or personal motivation for doing so, and you will most likely be fitting your study activities around other professional or domestic responsibilities.

Essentially you will be taking control of your learning environment. As a consequence, you will need to consider performance issues related to time management, goal setting, stress management, etc. Perhaps you will also need to reacquaint yourself with areas such as essay planning, coping with exams and using the web as a learning resource.

Your most significant considerations will be time and space, i.e. the time you dedicate to your learning and the environment in which you engage in that learning.
We recommend that you take time now, before starting your self-study, to familiarise yourself with these issues. There are a number of excellent resources on the web. A few suggested links are:

- http://www.how-to-study.com/
  The “How to study” web site is dedicated to study skills resources. You will find links to study preparation (a list of nine essentials for a good study place), taking notes, strategies for reading text books, using reference sources, and test anxiety.
- http://www.ucc.vt.edu/stdysk/stdyhlp.html
  This is the web site of the Virginia Tech Division of Student Affairs. You will find links to time scheduling (including a “where does time go?” link), a study skill checklist, basic concentration techniques, control of the study environment, note taking, how to read essays for analysis, and memory skills (“remembering”).
- http://www.howtostudy.org/resources.php
  Another “How to study” web site with useful links to time management, efficient reading, questioning/listening/observing skills, getting the most out of doing (“hands-on” learning), memory building, tips for staying motivated, and developing a learning plan.

The above links are our suggestions to start you on your way. At the time of writing these web links were active. If you want to look for more, go to www.google.com and type “self-study basics”, “self-study tips”, “self-study skills” or similar.

Need help?
If you need any academic support, please contact the Tutor-Marker for this course. Contact details for this person can be found either in the first tutorial letter for this course or in the student Distance Education Manual, which you receive at registration.

For administrative matters, please contact the Student Support Officer (SSO) for this course. Details of the SSO can be found in your student Distance Education Manual.

A web site for this course is under construction. You will be advised when it is available and where to find it.

The present course coordinator is not a full time staff member of the NUST.
Please refer to the SSO for details on how to reach him. Please refer to the NUST Computer Services department for help with computer problems, Internet access etc.

Assignments
Assignments are to be either:
- Submitted by hand at the NUST Campus, Windhoek, Centre for Open and Lifelong Learning. An assignment box is situated at the end of the COLL building, outside the Stores room.
- Submitted by hand at one of the NUST Regional Centres. Assignment boxes are provided at each of the centres.
- Sent via email to the following address: collassignments@NUST.edu.na

Note: If you are emailing your assignment, please include your student number, course code and assignment number in the subject line of the email. If you attach more than a single Word document, collect the files in a folder and compress it with WinZip or WinRAR.

Note: Please see your student Distance Education Manual for more details about submitting assignments.

Assessment
There will be four homework assignments. There will also be two theory tests; arrangements will be advised. There is no examination.

Getting around this study guide

Margin icons
While working through this study guide you will notice the frequent use of margin icons. These icons serve to “signpost” a particular piece of text, a new task or change in activity; they have been included to help you to find your way around this study guide.

A complete icon set is shown below. We suggest that you familiarise yourself with the icons and their meaning before starting your study.
- Objectives
- Activity
- Time
- Prescribed reading
- Additional reading
- In-text question
- Group activity
- Discussion
- Case study
- Reflection
- Tip
- Feedback
- Study skills
- Note it!
- Key words/concepts
- Help
- Audio
- Recommended website
- References
- Summary
- Refer to the assessment
- Refer to the assignment

Unit 1
The Impact of the Internet

Introduction
This unit introduces you to the phenomenon of the Internet and its dominance in our information culture, and contrasts it with the still-important role of the traditional media, both print and electronic.

Objectives
Upon completion of this unit you will be able to:
- describe the important facts concerning the background and development of the Internet;
- explain the term “Digital Divide”;
- compare on-line facilities with their conventional equivalents;
- recognise the importance of information literacy;
- describe the most likely source of some required information;
- recognise that there are two sides to nearly every story;
- consider the likely effect of the Internet on some professions.

Prescribed reading
Bothma, T., Cosijn, E., Fourie, I., Penzhorn, E. (3rd ed. 2011). Navigating Information Literacy. Pearson, ISBN 978-1770259676

Additional reading
Standage, T. (2004). The Victorian Internet: The Remarkable Story of the Telegraph and the Nineteenth Century's On-line Pioneers. Walker & Co, ISBN 978-0425171691

1. The World in One Room
The Internet is probably the most important invention in the history of the world, and we are supremely lucky to be living in the ‘age of the Internet’.
Nothing like it has happened before. It is as though all the people of the world, and all their knowledge, creativity, wit and conversational skills, were present in one room, so that you could talk instantly with any of them about anything you wanted, or anything they might be interested in, and make friends with anyone you wanted. All the world’s libraries and media (video, music and newspapers) would be there as well, instantly searchable for any imaginable information you might need. Every shop, selling even the most specialised goods, would be there also, so you could obtain even the most obscure item that you would never find anywhere else. All services would be there, including the banks, post offices and utilities, so you would never need to visit them in person or queue up in them again.

Most importantly, you would no longer be a passive observer or consumer at this feast of activity: you could add your own contribution in whatever way you wished, voice your opinions on any subject, ask any question or for any advice, and update or edit the information in the ‘library’! And the room is not at all overcrowded; there is plenty of space for you to ‘set up your stall’ and make your contribution.

Figure 1: Visualisation from the Opte project of various routes through a portion of the Internet, from www.seopher.com

Some imaginative writers compare the Internet with the vastness of the astronomical universe (note the picture above).

Before we proceed we need to distinguish two notions that are often confused: the Internet, and the World Wide Web (WWW).

The Internet is a global network of interconnected computers. The word is not an acronym but a shortened form of ‘interconnected network’. The term covers all servers, hosts, networking machinery, and cables. The Internet is the infrastructure on which the World Wide Web runs.
The World Wide Web (WWW, or simply: the Web) is a system of interlinked documents that can be accessed with a web browser. The way of formatting documents so that they connect to each other via hyperlinks is called hypertext (more on hypertext and how it is produced in Unit 5). The web is the collection of content accessible via the Internet.

Thus, when talking about technically gaining access to the infrastructure, the correct term is “Internet connection”. When exploring the world of content it is better to speak of “web browsing” than “Internet browsing”, although the terms are often used interchangeably. The brand name “Internet Explorer” for Microsoft's web browser is a misnomer; it explores the web, not the Internet.

Of course this fantastic room, the web, is virtual; it is ‘cyberspace’. But it is not difficult to get to. It used to need some technical IT expertise, but now almost any computer or mobile phone will do. We still visit the web by means of a keyboard and screen, but even that is likely to change. Whereas the web used to be viewed as something very technical, it is now simply a mirror of all human activity, and many things which used to be done in person or face to face are now done on-line: dating, gambling, chatting and gossiping, sharing your opinions, shopping, banking, looking for work, showing your holiday pictures, diagnosing your illness, and running your business.

Unfortunately, this amazing room is not yet available to all people on the planet. People in developing countries often do not have the opportunity to participate because they lack the infrastructure to access the Internet, or the money to pay for this access. Many people are illiterate and for this reason cannot use the services and information that the Internet and the WWW provide. This situation is called the Digital Divide because one part of the world (broadly the wealthier countries, most of them in the northern hemisphere) has access, and the other part does not.
Activity 1
Time required: 30 minutes

A fashionable topic these days, about which many books have been written and conferences held, is the so-called ‘Digital Divide’. What do you understand by this expression? Will Internet access (and profitable use) steadily expand to nearly everyone, or will a large section of humanity be left ‘disconnected’?

Feedback

The digital divide refers to the ‘division’ between those (individuals, communities or countries) who have little or no access to communications technology, especially the Internet, and those who have. As might be expected, this ‘dividing’ line falls very much along the division between rich and poor: it runs from often smaller countries like Iceland and South Korea, where nearly everyone has high-quality network facilities, to countries like Ethiopia, where fewer than 1% have. It is often claimed that the lack of communication and information reinforces poverty, because it means less access to customers, markets etc. The former president of South Africa, Thabo Mbeki, once said that fewer than half the people of Africa had ever made a phone call. This might have been true for landline phones but is no longer true of mobiles (see Unit 11).

Although the Bible says the poor will always be with us, the position with communications and network access is improving, thanks to the vast impact of mobile technology in Africa and developments such as the undersea fibre optic cables recently laid around Africa.

Of course, Internet access (as expressed in the number of people per 1000 who are networked) is not spread evenly around the world. It is concentrated, as you would expect, in North America, Western Europe and the “Asian Tiger” countries, although the situation is improving. One web site, http://www.worldmapper.org, very graphically illustrates global inequalities in various fields by means of bizarrely distorted world maps.
The following map illustrates global Internet usage:

Figure 2: What information is this map trying to convey? (from worldmapper.org)

As in any aspect of human activity, especially where there are large crowds, the Internet has countless good, helpful and knowledgeable people, but it has the bad guys too: liars, deceivers, thieves, saboteurs and stalkers. There are crimes whose names were unknown before the Internet: spamming, ‘phishing’, ‘spoofing’, ID theft. These terms will be explained in Unit 12 at the end of this Study Guide. These new crimes have to be recognised, and guarded against where possible, just as in real life. And even short of crime, there is a great deal of rubbish on the Internet: misleading information, bad advice, bias, bizarre web sites, uninformed opinion, just as there is in real life.

This course is about coping with information in the age of the Internet. It is difficult to find a good name for the course. It was originally going to be called ‘extracting information from the Internet’, but the Internet is about far more than this, and young people particularly, when asked what they are doing online, do not generally respond that they are ‘extracting information’. So the course was called information competence, which suggests achieving some ability and becoming comfortable in the age of information. It should be remembered that we are not just talking about information from the Internet, but from conventional media as well: libraries, newspapers, TV and human word of mouth. This is the age of information, just as surely as there was a Stone Age, a Bronze Age and an Iron Age.

Sometimes courses like this are called Information Literacy, because just as a person needs to be at ease with the written word (literacy) and with numbers and mathematics (numeracy), it is equally important to be competent with information.
A recent phenomenon on the Internet has been the staggering growth, over only the last four or five years, of the so-called social networking sites. Before this, the Internet was mainly a one-way ‘push’ system: the content was there, provided by technical experts for you to consume. It was really no different from going to a library. Of course you could build your own web site and have your own presence on the web, but that was technically difficult and expensive. Then came the sites where you could set up your own profile and upload it with a click, without any technical knowledge.

Figure 3: Egyptian protesters, February 2011. Note the cameras and cell phones!

The Internet is steadily assuming a strong political significance, one that undemocratic regimes find ominous. The social networking sites, notably these days Facebook and Twitter, have extended far beyond their original purpose in the West of posting often trivial personal details and rather banal comments (‘tweets’). The technology played a vital role in the recent revolutions in Tunisia and Egypt, enabling activists to communicate with each other, circulate news of tactics and rallying points, get messages to thousands of people simultaneously, and bring people together for mass demonstrations. In the Egyptian protests in Cairo’s Tahrir Square, one of the first things the organisers provided were plug racks for the protestors to charge their phones. Of course the authorities tried to close the web sites down, but the nature of the Internet ensured that there were other ways of getting through. When Internet access via the official service providers was cut altogether, activists published the coordinates of satellites that would allow people to connect to the Internet with standard TV reception dishes. The government simply could not destroy every privately owned dish, or switch off satellites orbiting the planet!
Eventually, the government gave up and switched conventional access back on.

Nevertheless, in countries such as China the government spends huge amounts on Internet restriction and surveillance. Very sophisticated software is in place to block ‘sensitive’ topics on the web, and the Internet habits of people, especially dissidents, are monitored. Such governments clearly recognise the political significance of the Internet.

The Internet is an absolute boon to anyone who has to research a particular topic or just find out a fact quickly. Need to know when Martin Luther King was born, how many litres there are in an American ‘gallon’, what an ‘electrolyte’ is, or how to translate a label into Spanish? Need to write an essay on the history of Ethiopia? Instead of searching for hard-to-find reference books, it is a matter of typing a query into a search engine or looking it up in an online encyclopedia.

Equally, being cut off from the Internet can have devastating effects, especially for businesses, even non-‘hi-tech’ ones. A worldwide survey of corporations carried out in 2010 by the Avanti company showed that more than a quarter (27%) could not function at all if they had no Internet, and over 20% said that being cut off for a week would mean the death of their organisation.

Reflection

Much of our Internet connection depends on cables and wires laid through populated places, and these can be very vulnerable. In April 2011 an elderly woman in Georgia, scavenging for copper, dug through the cable carrying most of the Internet traffic of the small neighbouring country of Armenia, plunging nearly the whole of Armenia into Internet darkness for about a day. She was not popular, but she said she had never heard of the Internet! These days, international connections no longer run over copper cable. Fibre optics has replaced most of the Internet backbone, so her raid did not even yield the copper she was looking for.
(Source: Woman scavenging for copper wipes out internet service to neighbouring Armenia for 28 HOURS, DemocraticUnderground.com, http://www.democraticunderground.com/discuss/duboard.php?az=view_all&address=102x4803253)

Of course, material dating from before the invention of the Internet is less well represented online. Nevertheless there are many endeavours to digitise and put online most of the world’s classic literary works, historic photographs, and public data. Many of the world’s famous museums are placing their exhibits on the Internet for anyone to see, and every article from the Times of London from 1785 onwards is now available! In Namibia, the National Archives are currently working on the digitisation of all their historic documents, and eventually these will be made accessible to the general public. There is also Project Gutenberg (named after Johannes Gutenberg, the inventor of printing), which scans and republishes important books whose copyright has expired.

When you have completed this course you should have an idea of the nature of information, how we arrived here in the age of information, what the role of the new media and technology is, and what future the old media has, if any.

Activity 2
Time required: 30 minutes

Approach one of your elders who remembers the 1970s, a period before the advent of the WWW. In Namibia, at this difficult time, even other forms of communication were limited. How did he/she communicate over a distance or obtain information on a specialised topic? If he/she had a friend or relative who had moved to another country, to study or to go into exile, how did they communicate? By letter, which might take weeks to reach the recipient, and months to get a reply? How did they find information, especially in an area without a reference library? Or did they just do without the information?
Feedback

Answers might vary, but most probably you will find that before the Internet, it simply did not occur to people that instant communication and access to information were possible. But see if you can find a copy of the very interesting book ‘The Victorian Internet’ (listed under Additional reading).

2. The Role of Conventional Media

2.1 What was life like before the Internet?

A Story

The writer’s mother, in South Africa, had a sister who went to Australia and married there. For forty years, until her death, she lost contact completely with her sister, because the post between South Africa and Australia was unreliable and she gave up writing. She did not know how to book an international phone call (you had to book them in those days!), feared it would be too expensive, and in any case did not know her sister’s number or how to find it. This is hard to comprehend today. Now of course you can type your email, enter the address and click the Send button. The mail usually reaches its destination within seconds, anywhere in the world. Or you could find the person’s profile on Facebook.

People in those days just took it for granted that if a friend or relative went overseas, you would lose contact with them for a considerable time or even forever.

To find information on a topic of interest, if the topic could be summarised in a word, you could look it up in a printed encyclopedia, although the encyclopedia would probably be several years out of date. You might go to a library and find a reference book, but the process would be time-consuming. The library might have to borrow a book from another library, which could take months. With luck you might find a knowledgeable person with your answer. Often you would just have to do without the information you were looking for.
Pre-Internet communication technology

By the 1860s the telegraph service spanned the world, and you could send an instant ‘wire’ from England to the US, although the cost was a week’s wages for an ordinary person! (See the ‘Victorian Internet’ book in the “Additional reading” section at the top of this unit.) By 1896, the German administration in Namibia had cable links between Windhoek, Swakopmund and Berlin, and during the Herero and Namaqua War of 1904-07 the German colonial Schutztruppe used heliographs to communicate in the open field.

In-text question: Have you ever seen a heliograph? Do you know how it works? Several museums in Namibia exhibit this technology, for instance the museum in Windhoek's Alte Feste and the one near Swakopmund's lighthouse.

The telephone was invented in 1876, and was first known as the ‘acoustic telegraph’. Primitive instruments consisted of a stand, a separate transmitter, and a receiver unit hanging on a hook. To terminate a conversation you replaced the receiver on its hook; that is why we still say ‘to hang up’ and ‘leave your phone off the hook’. The first London phone book of 1878 had only 30 entries (!) but by 1904 the US had over 3 million telephone subscribers.

Telex (Teleprinter Exchange) technology was the ‘email’ of most of the 20th century. It consisted of bulky ‘teleprinter’ machines (telephones combined with a printing mechanism) communicating over lines parallel to the telephone network, working at a speed of 60 words per minute!

Facsimile (fax) machines dominated instant written communication in the 1980s and 90s. They work by scanning a printed page and transmitting the ‘image’ over a normal phone line. They are still used, with the advantage of course that handwriting and diagrams can be transmitted as well as text of any type.
2.2 One of the longest lived media formats: the newspaper

The newspaper originated in occasional information and propaganda sheets which appeared soon after the invention of printing. The earliest daily, Einkommende Zeitungen, appeared in Leipzig, Germany, in 1650. The famous Times (of London) also has a long history; it printed its first issue on 1st January 1785, and has appeared on nearly every weekday since. Below you see the front page of the issue of 24th August 1858, not very interesting by today’s standards! Until 1966 the Times' front page contained only advertisements; news appeared inside.

Figure 4: The Times on July 6, 1863. Picture: Public Domain (PD)

Figure 5: Contrast with: The Times, April 11 2011 (from Wikipedia)

Reflection

The 150-year-old newspaper (if you had the original; sorry, the photo is not very good!) is perfectly readable. Would a 150-year-old hard drive be?

As the (very useful) website howstuffworks.com says: “Newspapers are the original form of broadband communication, a distinction not always recognized in the age of the Internet. Long before we had computers, television, radio, telephones and telegraph, newspapers were the cheapest and most efficient way to reach mass audiences with news, commentary and advertising.”

The newspaper with the largest circulation in the world is the Japanese Yomiuri Shimbun, with 16 million daily copies. The Namibian has a print run of 65,000!

How do newspapers get their information, and who decides what is the most important ‘story’ to put on the front page?

The newspaper creation process starts with its reporters, its eyes and ears. At a large paper, there are specialist reporters covering crime, government matters, the courts, business and finance, technology, sport, social events etc. At a smaller publication, such as those in Namibia, there are ‘general assignment’ reporters who have to cover a wide range of topics.
In the movies, and sometimes in real life, there are investigative journalists who try to ‘sniff out’ stories that other people, especially in officialdom, would like to keep quiet about. This is a common source of friction between government and the media. In general, the conflict between newspapers, who would like to publish information, and governments, who would like to control and censor it, has been going on for a couple of hundred years. Newspapers have been banned, and their staff arrested, especially when journalists follow the sacred tradition of never revealing their sources of sensitive information. However, most journalistic work is quite mundane!

An increasing amount of news is provided not by individual reporters but by national or international news or ‘wire’ agencies, which are news-gathering ‘wholesalers’ selling stories to many individual papers. Well-known international press agencies are Reuters and Associated Press (AP). Namibia also has a press agency, NAMPA. In the first line of a newspaper article you can sometimes see whether a particular story comes from a local journalist or has been bought from a news agency. Next time you open a newspaper, quickly check which contributions are from a news agency, and which ones are from a local journalist.

In large newspapers, again, there is a formidable hierarchy of editors, ranging from subeditors, city editors and managing editors up to the editor in chief. They verify and proof-read the reporters’ ‘copy’, write headlines, and decide on the material’s newsworthiness, the prominence it should receive, and the page it should be printed on. There will also be a research department, now computerized but formerly a vast library of former issues, reference books and press cuttings, where facts can (or should) be checked. In a small paper there may be only two or three levels of editors. A senior editor will decide what story deserves the front page.
In addition to news items there are ‘columns’ of opinion written by journalists or specialized columnists, the leading one of which is written by the chief editor him- or herself. This is the ‘leader’, and it often reflects the political opinion with which the newspaper is associated (pro- or anti-government etc). There is a strict division between news and comment, with the journalists from each reporting to different management.

In a multi-ethnic country there are usually newspapers printed for the different language groups as well as for shades of political opinion. Thus in Namibia we have an Afrikaans daily, an ‘independent’ English daily, an English ‘pro-government’ daily etc.

The price you pay for today’s paper would hardly cover the cost of the paper it is printed on. A newspaper survives on advertising, which sometimes occupies more than half its space. Ethically, there should also be a clear distinction between real news stories and paid-for advertising, so that when a news-type article appears which is really promoting a product, it should carry the heading ‘Advertorial’.

Another important component of a paper is the ‘letters to the editor’, from which the paper can get feedback from its readers. An innovative feature in The Namibian is the page of SMSs which readers can send on any subject, recognizing that cellphones are much more accessible to the local public than email, and much quicker than letter writing.

In earlier days pages were laid out by hand (the highly skilled work of a ‘compositor’) and implemented with engraved metal plates for pictures and movable lead type for text; now, thankfully, this is done by software. Once the pages have been arranged and filled with news, comment, advertisements and reader feedback, the paper has to be printed. This used to be done by metal and ink pressed on to the paper (which is why the industry is still called the Press). Nowadays, the processes are photographic and electronic.
Even in the age of the Internet, modern printing presses are still a marvel of hardware technology, which can print and collate thousands of pages and copies of newspapers per hour, and shoot them along conveyor belts for distribution.

The history of journalism has been a noble one, and newspaper coverage has brought many social issues to the notice of the public, with consequent reforms. For instance, the Times pictures and reports of the Crimean War of the 1850s and the terrible conditions for soldiers led to Florence Nightingale’s creation of the modern nursing profession. One of the most famous episodes of investigative journalism was the exposure of the “Watergate” plot in the 1970s, which implicated the then US president Richard Nixon in criminal activities, and led to his resignation. Read more about the Watergate Scandal on http://en.wikipedia.org/wiki/Watergate_scandal.

However, especially in countries such as the United Kingdom (UK) and the United States (US), where many papers are locked in desperate battles for circulation, standards have been dropping, especially among the so-called ‘tabloid’ press. On the assumption that the mass market wants to read about celebrities, showbiz or sporting stars, and that nothing sells like money and sex, some tabloids have opted for ‘news-free’, vapid front page stories about celebrities, their money and their private lives. Who writes these stories? Some theorise that there is a new kind of journalist who works not in news but in privacy invasion: getting stories by whatever means, spying on their ‘victims’, hacking their communications, even literally combing through their rubbish or, if that fails, simply making stories up.

Then there is the phenomenon of ‘trial by media’. Remember the horrific murder of schoolgirl Magdalena Stoffels in 2010? A man washing bloodied clothes not far away was arrested, and all Namibian media portrayed him as the killer.
He had to be protected by the Police, otherwise community members would have administered “justice” very quickly. However, when the forensic results came back from South Africa they did not connect him to the crime, and he had to be released. The man's reputation is probably stained forever, and he is now (2012) suing the Namibian Police. A Wikipedia article about the incident, written by a student of this Information Competence course at NUST, was not updated until February 2012 and until then still gave the full name and age of the alleged killer!

This is not an isolated incident. In the UK, when a young woman was murdered over Christmas 2010, the media seized upon her landlord, who was a rather eccentric-looking old man with long white hair. He must be guilty! Every odd detail about his past was dug up, and he was even arrested and held for a couple of days before being released. He was totally innocent, but his reputation was shattered.

The basic problem is that newspapers are increasingly bought up by business conglomerates whose owners know little about the press but insist that the papers are run to make as much profit as possible. That is to say, their reporting staff is reduced to a minimum, they write stories from their office (the expense of travelling out to cover stories is discouraged), most of their pages consist of bland syndicated agency material, and their editorial policy, if any, is tailored to their owners’ commercial interests.

In-text question: Assuming that you have reliable Internet access (or even if you do not), do you read printed newspapers very often, either for news or entertainment? Do you think they have a future?

Activity 3
Time required: 45 minutes

Obtain copies of the Namibian, New Era and the Sun newspapers. If you can, get a copy of one of the UK newspapers, such as the Daily Mail (they are available at the CNA). Compare their style. How serious or ‘sensationalist’ are they? How do their front pages compare?
What proportion of hard news to soft news (gossip, showbiz news etc) do they contain? Do they contain interesting comment or editorial material? How much advertising? What did you actually learn from each paper? Was it worth the purchase price?! Of the local newspapers, how much material overlaps on a single day? What can you conclude from that?

Feedback

The standard and target market of newspapers, even in a small country, vary tremendously. Obviously a paper would not exist if it was not popular with a significant number of people. Broadly speaking, newspapers are divided into the ‘serious’ and the sensationalist, the latter often referred to as the ‘tabloid’ press because they are printed in a smaller paper format, although for economy most papers now appear in this way. Serious papers concentrate on political and economic news and editorial comment, while the sensational press concentrates on scandals (often relating to official corruption in the Namibian context), lurid crime, ‘celebrity’ news and entertainment. As you might expect, tabloid papers usually have a much higher circulation and make much better profits than the serious press!

Sometimes the tabloid papers overstep the mark in their attempt to obtain ‘stories’. In July 2011 the News of the World paper in Britain, with a circulation of over two million, was shut down by its owners, News International, because of allegations of phone hacking, even of the families of murder victims, and illegal bribes made to Police for inside information.

Stories that appear in more or less the same wording in several newspapers at once usually come not from the newspapers' own journalists but from press agencies!

2.3 The future of the conventional media

Despite the marvels of the electronic age, one might say that people, especially in ‘3rd world’ countries, still derive much of their information from traditional sources: word of mouth, and the conventional media of newspapers, radio and television.
This applies to most ordinary people in Namibia. The role of these media therefore should not be ignored, but the key word is ‘still’, because the balance does seem to be changing. For one thing, the quality of the traditional media is slipping, as mentioned above, due no doubt to financial constraints. Newspapers can no longer afford their own foreign correspondents, or even local reporters, and rely on syndicated news services, otherwise filling their pages with gossip and ‘celebrity’ stories. Radio used to carry discussions, plays, poetry etc but now mostly carries recorded music; a station is largely defined by the personality of its DJ and is mostly relegated to the role of background sound. Television used to be the immediate medium of live news, current affairs, good quality entertainment and culture, but now seems to consist of old B-quality movies and ‘reality shows’. The audience of these media is declining all over the world. The year 2030 is the year in which we hope to attain Namibia’s ‘vision’, but in America some commentators define it as the year in which the last hard copy newspaper will be sold!

2.4 Finally, a note on the future of some professions

The Internet has created countless jobs, from technicians to web designers, but what about the jobs and even whole professions it is threatening to extinguish? For instance, if we can book all our travel arrangements on-line (flights, hotels etc), what do we need travel agents for, especially since on the Internet you can get much better information about your travel destination than you could get from the agent? If we want a specialised book, we will go to an on-line book store like Amazon, which has a catalogue of millions of titles, far more than any local bricks-and-mortar bookshop could possibly offer. So why do we need physical book shops?
In general, the effect of the Internet in business is to put the buyer directly in contact with the seller, thus eliminating the agent or middleman. The future of these ‘middleman’ professions may thus be under threat.

Activity 3
Time Required: 20 minutes
How do you think the Internet will affect the future of professions such as the law, medicine, banking and estate agency? Write one paragraph on each.

Feedback
The Internet will probably not change the medical profession much. People are becoming aware of possible misinformation, and we hope they do not entrust their health to someone they do not know at all. The same is probably true for lawyers, although the improved information flow – all Namibian High Court decisions are available online – might change the uniformity of the decisions made. Banking will increasingly be done online because it is expensive to maintain and staff branches in city centres. Much of the work done by tellers is repetitive and can be done better and faster by a computer. Estate agents will, and actually do, establish an online presence, for the same reasons as banks. Of course, predictions about the future are always risky, and everything might turn out entirely different!

Activity 4
Time Required: 15 minutes
Say where you would most likely look for information on:
i. The melting point of gold (metal)
ii. Today’s price of gold on the financial markets
iii. The history of gold mining in southern Africa
iv. Pictures of some modern African gold jewellery
v. Contact details of some Namibian gold jewellery manufacturers.

Feedback
i – an encyclopedia or chemistry text
ii – a daily newspaper
iii – an encyclopedia or relevant book
iv – a trade magazine
v – the Yellow Pages.
But it all could be found on the Internet!

Nearly every topic is ‘controversial’, literally meaning it can be turned, and viewed, from either side.
It does not necessarily mean something that people fight about. There are two sides to nearly every issue.

Assignment

Recommended websites
It’s also interesting to compare websites of opposing political opinions. On American political views, contrast www.newsmax.com (for the right wing) with www.moveon.org (for the left) and www.slate.com (the online magazine owned by the Washington Post Company) for the approximate centre.

3. Official ‘Secrets’, Personal Privacy and the Restriction of Information
We close this discussion by providing some brief notes (because it is a whole subject in itself) on the topic of personal privacy and government secrets. Before we start, let us look at two simple questions:
Why do people, organisations and governments want to keep certain information secret, and
Why do people, organisations and governments want to obtain information that other entities want to keep secret?

Privacy is a cultural value. The organisational side of it is probably easiest to understand: companies have advantages if they can figure out how their competitors operate, where they buy, what they want to do in the future, and how their products work. If I know that Telecom pays a low salary to their best engineers, I can try to phone them to make them join my company. If I know that Shoprite buys mangos for half the price my company pays, I can likewise change suppliers and make more profit. The most obvious value of obtaining secrets is during armed conflicts: if I know what the enemy is going to do next, I'll probably keep the upper hand in any conflict. But what about private individuals? If I am not a criminal, what value can any of my private information have to others? This, by the way, is one of the main arguments governments use to introduce laws that erode privacy! Why does privacy matter even if you have nothing to hide? The answer is that you would lose your freedom, and eventually your individuality. Imagine someone taking a picture while you shower, and putting it on Facebook.
You did nothing wrong, but you would not want that. Imagine that I knew that you wanted to buy my car, no other one but just this one. I could easily raise the price by a few thousand; you would buy it anyway. The problem is not that you hide a wrong, the problem is that others could do wrong with the information they obtain.

More recently, some people, especially ‘media celebrities’, have become concerned about the erosion of their privacy and intrusion into their private lives by the so-called tabloid press, although of course their celebrity status to a great extent depends on their continued coverage in this press. However, a low point was reached when ‘paparazzi’ photographers, instead of rendering assistance to Princess Diana, allegedly continued filming her as she lay dying after her car crash. Calls grew for legislation to protect personal privacy, but before this, concerns had grown about covert government action (Read more about this story on http://en.wikipedia.org/wiki/Lady_Di).

Figure 6: Bradley Manning, the soldier who released secret government communication to WikiLeaks. He is facing life in prison.

People were equally worried about personal data held on official computer files, and there were calls for ‘Freedom of Information Acts’ or ‘Data Protection Acts’ by which any citizen can lawfully demand to see the information which the government is keeping about him or her (or information held on any other issue, provided disclosure can be shown to be in the public interest). So the struggle, both social and legal, between the opposing ideals of privacy/confidentiality and the right of access to information has gone on, and will continue to do so. Has the Internet made this issue more acute? Of course. We know that a whisper on a web site will be around the world in literally a minute, and nothing can be done to take it back.
Wild rumours and false news, as well as valid information embarrassing to officialdom, circulate on social websites and even serious sites, and are very difficult to dispel. Remember that once an indiscretion is uploaded, there is no easy way to delete it. Like germs, you can kill some but there will be many more ‘copies’ still around! Whereas governments can censor their own media and block or censor Internet connections for their resident citizens, they cannot shut down a website hosted elsewhere in the world. An example is the recent furore over WikiLeaks, the website which acquired (possibly illegally) thousands of secret government communications, published them online, and faced the almost unanimous fury of all the governments around the world whose confidences had been disclosed. It is a controversial issue, because it can be argued that government, business and life are impossible without the expectation of some confidentiality: “We risk the continued erosion of trust in society if we abandon the importance of a duty of confidence, whether to the family, our employer or the State”, says Sir David Omand, a former security and intelligence co-ordinator at the (British) Cabinet Office. (Alex Hudson, Is the web waging war on super-injunctions? BBC News, 24 April 2011, http://www.bbc.co.uk/news/mobile/technology-13159193)

Legislation around the world on this subject varies greatly. The US has a powerful Freedom of Information Act, but there is little to protect personal privacy, online or otherwise. Vendors from whom you buy goods are perfectly entitled to pass your information on to advertisers to whom it may be of ‘interest’. The European Union (EU) has enshrined, in rather contradictory fashion, two principles, one for personal privacy and the other for freedom of information. The UK, which one might imagine as a bastion of democracy in information, actually has some of the most restrictive legislation in the world.
First, there is an Official Secrets Act, and ‘gagging’ orders can be issued to the media to prevent them from reporting on any issue at any time. Bizarrely, there are also the so-called ‘super-injunctions’ which not only prohibit the discussion of any issue, but also prohibit any mention of the fact that the injunction has been made! Possibly this may be of temporary value in some life-or-death police investigations, but when (as has been alleged) such injunctions have been awarded so that celebrity footballers are spared the embarrassment of public revelation of their unsavoury private lives, it is difficult to see that the cause of public interest (or the law) is being served. In Namibia we have the controversial Electronic Communications Act. The intention of the Act was to regularise all the electronic media in the country (not just the Internet) and to put all its players on a clear and fair legal footing – objectives which are long overdue and can hardly be argued with. Its drafting took several years. However, the Act became quickly bogged down in controversy, because its drafters seemed more interested in giving the authorities rights to eavesdrop on any electronic communications, especially cell calls and emails. For this reason, detractors promptly labelled the Act the Spy Bill. (It has to be said though, that even icons of democracy such as Sweden, and probably all governments these days, have facilities for listening in on communications, whether they admit to it or not). Despite being passed in parliament, the Namibian Act has still not been fully ratified into law (there are perhaps some vested interests at stake, rather than privacy concerns) so that we are unfortunately still in legal limbo in many aspects of electronic communications. Another current legal informational puzzle is posed by the Statistics Act, which allegedly prohibits the collection of any systematic national information, unless permission from the Ministry has been received! 
Luckily this allegation is not true; local media have grossly misunderstood the meaning of the legislative text. All it regulates is the re-use of data that has been acquired using government funds; anyone planning this needs the Ministry's approval. This of course makes sense: why should a company or a private individual gain advantages from data that has been collected using taxpayers' money?

Discussion
If you have any connections with a discussion or debating group, why not try to organise a session on the question of the right to privacy versus freedom of information? What legislation should there be on this subject in Namibia, if any? See the prescribed reading at the head of this unit.

References
For the text of the Namibian communications act see http://www.parliament.gov.na/acts_documents/120_4378gov_n226act_82009.pdf (or google “communications act Namibia”)
For comment in the Windhoek Observer on the proposed Statistics Act, see http://www.observer.com.na/archives/693-statistics-billimportant-milestone-katjavivi
For a timeline on the controversy on tactics by the tabloid press in the UK and US, from 2000 to 2011 see http://www.bbc.co.uk/news/uk14124020 and many other sources.
A useful website to explain basic technical terms: http://howstuffworks.com
These websites accessed 15th July 2011.

Internet: A worldwide network of data networks, stemming from American military research in the 1960’s.
Cyberspace: The virtual world of facilities which exists only on-line.
E-commerce: The practice of doing business, buying and selling, electronically, usually over the Internet.
Social networking: The present day enormously popular sites such as Myspace, Facebook, LinkedIn and Twitter, on which millions of people post messages, pictures, comments and news about themselves.
World Wide Web (WWW): The collection of documents accessible via the Internet and connected with HTML hyperlinks.
Keywords/concepts

Unit summary
In this unit you learned about the Internet, its growth and its huge impact today, the astounding amount of information and facilities it offers, and the current phenomenon of social networks. We also considered the role of the conventional media and information sources (newspapers in some detail, radio and TV) and considered how they will stand up to the ‘onslaught’ of their new electronic equivalents. Lastly we touched on the controversial subject of privacy, and the twin opposing viewpoints of the right to protect personal information on the one hand, and the public right to freedom of information on the other.

Unit 2 How the Internet works

Introduction
It’s true that you can drive a modern car without ever opening the bonnet, and even if you did open the bonnet of a modern car with all its electronic systems, there is not much you could do. However, it is useful to know something about how the car works, and more especially why sometimes it doesn’t start! On-line, you can browse away happily, but when you get the dreaded ‘404, page not found' message, what is wrong? Or when your computer stubbornly refuses to connect to the network, what do you do? If download speeds are extremely slow, is there anything you can do about it? It also helps to understand what IP addresses are, what web servers are and how packet switching works. It allows you to hold your head up when your technical colleagues are discussing the network. This is what this unit is about.

Objectives
Upon completion of this unit you will be able to:
discuss the history of the Internet;
recognise the structure of the Internet and its ‘protocols’;
recognise what TCP/IP is, and what IP addresses are;
obtain basic technical knowledge of the Internet, and apply this knowledge to connect to the Internet.

Prescribed reading
Gralla, P. (2006). How the Internet Works (8th Edition), Que, ISBN 9780789736260

Standage, T. (2004).
The Victorian Internet: The Remarkable Story of the Telegraph and the Nineteenth Century's On-line Pioneers. Walker & Co, ISBN 978-0425171691 (Additional reading)

1. A Short History of the Internet

Figure 1: The Berlin telegraph exchange office, 1867

The idea of instant data communication by electric current or through some electromagnetic medium long precedes the invention of the computer or even of the telephone. With the invention of the Morse code in the 1840’s and of the technology necessary to manufacture and lay long cables, even under the sea, the telegraph network came to cover the world, so that by the 1860’s it was possible to send a message from London to New York and get a reply in 10 minutes. Not all the telegraph lines were connected of course, but by manual means it was generally possible to get a message between any two people in major centres.

The first attempts to link computers over communication lines resulted in star networks, with so-called dumb terminals (computers without storage devices that only display what another computer has processed) radiating out from a large central mainframe computer (a computer that does all calculations for the terminals), much as voice telephones are connected by a central telephone exchange (‘sentrale’ in many languages). Corporate networks and airline systems still work in this way. When the military (in the US) realised the strategic importance of data communications, the dangers of a centralised system became obvious. If the enemy bombed the central facility, all communications would be knocked out. This anxiety was much spurred on by the Soviet launch of the first Sputnik satellite in 1957, making the US paranoid about being left behind in the space race. For civil uses, too, a single point of failure is undesirable.
Much research was done in the 1960’s on a packet-switched network, that is, a system of linked communication lines without a definite centre, so that messages could be sent from one point to another over many possible routes. The notion that the Internet was designed to withstand a major military attack is however not true, although many people believe it. We call such narratives an “urban myth”, a story that sounds logical and reliable but lacks proper evidence and truth. Much of the Internet's infrastructure is centralised, and a directed attack against DNS root servers (more on DNS below) or international backbones would in fact render it largely unusable.

Figure 2: Design of a modern company network.

The first system of interconnected computers became known as ARPANET (the Advanced Research Projects Agency network). It was used exclusively for research – not accessible to the general public at all. Still less was any commercial use of the network tolerated or even imagined. However, several ‘amateur’ or public networks grew up with odd names such as Usenet, Bitnet, Fidonet etc. In the 1980’s it was possible to connect your new PC to one of these networks, provided you had a considerable amount of technical knowledge. Being able to network your computer very definitely marked you as a nerd in those days. Even so, ‘bridging’ between the networks, for instance to send a message to someone on another network, was very awkward, because they had different user commands and were incompatible, much like a roomful of groups of people eager to communicate but not speaking each other's languages.

Figure 3: A packet switched network.

Eventually, protocols (sets of rules for how computers communicate) were developed to bring these separate networks together, creating an interconnected network, or just internet for short.
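The packet-switching idea described above can be illustrated with a small Python sketch. It is only a toy model under stated assumptions: the function names `to_packets` and `reassemble`, the packet size of 8 characters and the use of a random shuffle to imitate packets arriving out of order over different routes are all invented for this illustration and are not part of any real network protocol.

```python
import random

def to_packets(message, size):
    """Split a message into numbered packets of at most `size` characters."""
    return [
        (seq, message[start:start + size])
        for seq, start in enumerate(range(0, len(message), size))
    ]

def reassemble(packets):
    """Sort packets by sequence number and rebuild the original message."""
    return "".join(chunk for _, chunk in sorted(packets))

message = "Messages can travel over many possible routes."
packets = to_packets(message, 8)

# Packets taking different routes may arrive in any order (simulated here).
random.shuffle(packets)

rebuilt = reassemble(packets)
print(rebuilt == message)
```

The sequence number carried by each packet is what lets the receiver restore the original order, whatever route each packet took.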
Of course for ordinary people being able to connect your computer to a network was not of much use unless there was something to do with it – much as personal computers did not attract attention until there were useful things such as word processing, spreadsheets and games. The first of these ‘killer apps’ was electronic mail, which, as a system for transmitting a printed text message over a data network, dates back very many years, probably before the 1970’s. The ‘@’ sign became associated with email addresses long before the Internet – the symbol would probably otherwise have died out from keyboards! Structured systems such as Compuserve and AOL were launched to provide easy-to-use information services to subscribers. What changed everything, and what is almost synonymous with the Internet for many people, was the World Wide Web (the distinction between Internet and WWW was defined in Unit 1).

Figure 4: Sir Tim Berners-Lee. From www.ted.com et ubique

The British computer scientist Sir Tim Berners-Lee, who should rank as one of the select few individuals to change the course of civilization, needed an easy way to share pages of research findings with his colleagues at the European Organization for Nuclear Research (CERN, from the French Conseil européen pour la recherche nucléaire); not just to send messages. The genius was to combine this insight with the concept of hypertext – the idea that you did not have to read a book or a set of books, boringly in page order, but that you could skip around from place to place or page to page by using hyperlinks, following the ‘interesting bits’, like a fly buzzing over a table of food, alighting on whatever morsel smelt most alluring. The technology was there, but as always, software had to catch up with the hardware to make it usable for ordinary people. In other words, the world was waiting for what we all know as a browser. Before that, as one wit said, lunch was free but nobody could read the menu!
The browser is the web window for users, which handles the request for a web page and displays it, and any hypertext links it may contain, in a user-friendly way. The first widely popular browser was called Mosaic (i.e. a picture made up from many small pieces); its developers went on to create Netscape Navigator, which was for several years the most popular program to access the World Wide Web. The earliest browsers displayed only text – now of course many web pages are masterpieces of artistic design with graphics, images, sounds, animations and often videos.

The recent history of the WWW has been dominated by the so-called Web 2.0. Early web sites were ‘in your face’: they presented often nicely designed pages to you, but there was nothing you could do to respond to or change them. It was like reading the pages of a magazine. Even with a traditional newspaper you could write to the Editor! Now, especially in the social networking sites and Wikipedia, the web site is changeable and even totally created by the users, who can upload almost any content they want to the site.

1.1 The Internet in Namibia
In Namibia, Internet connectivity was started by a small pioneering group called the Namibian Internet Development Foundation or Namidef, around 1994, connecting on a small line via South Africa. As commercialisation of service providers in the region took place, this organisation was absorbed into the South African based groups extending their service into Namibia (iAfrica, Mweb) and later the department of Telecom Namibia set up to offer Internet services (Iway). There is now the full range of services from a variety of service providers which you would expect in any developed country. These services, known by their acronyms ADSL, WIMAX, 3G, and so on, have been explained in the Course Introduction.

In-text question
What was your first experience of the Internet? What did you use it for? Did you find it an exciting or technically baffling encounter at first?

2.
The Web and Web Browsers

It seems magical when we type the address of a web page into our browser, a page most likely hosted in the US or elsewhere in the world, and it appears on our screen almost instantly. But in terms of what we have learned above, it is quite logical. It is simply a matter of file transfer and the right protocols.

Figure 5: HTTP Operation (Peter Gallert)

Firstly, a web page, no matter how fancy, colourful or dynamic, is just a collection of files. Much of the text on web pages is in fact written in a very simple language called HTML (hypertext markup language). A browser is a piece of software which can recognise an HTML file and interpret its ‘commands’ as a display of text and graphics on your screen. That is your web page. We can have web pages which are local on our computer and have nothing to do with the Internet at all.

When we type a web address (more properly called a URL or uniform resource locator) into the browser window, the browser packages it as a request in hypertext transfer protocol (HTTP) which gets passed into the routing network and is directed to the web server which is ‘hosting’ that site. Suppose the request was for http://www.amazon.com. The host name in this URL will be converted to an IP (‘Internet Protocol’) address (see subsection below), and the web server with that address will be located. On the hard drive of the web server there will typically be a file called index.html, which is the HTML for the home page of the web site. The server makes a copy of that file, maybe with some other files such as images, and sends them back to the address which was requesting them – you. When the files arrive, your browser will receive them and know how to display them as an attractive web page, as the designer of the page intended. If there are hyperlinks on the page, and you click on them, this just implies a further request for files from the web server or other servers, and they are dealt with in the same way.
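To make the request step described above more concrete, here is a minimal Python sketch of what a browser might do with the example URL before anything is sent over the network. It is an illustration under assumptions, not how any particular browser is implemented: the helper name `build_get_request` is invented, only two header lines are shown, and the sketch stops before the network transfer itself. The idea that a request for the path "/" is answered with a file such as index.html depends on how the web server is configured.

```python
from urllib.parse import urlsplit

def build_get_request(url):
    """Return the host, the path and the text of a simple HTTP GET request."""
    parts = urlsplit(url)
    host = parts.hostname          # the name that must be resolved to an IP address
    path = parts.path or "/"       # no path given means the home page
    request = (
        f"GET {path} HTTP/1.1\r\n"  # what we want: the file at this path
        f"Host: {host}\r\n"         # which web site on the server we mean
        "Connection: close\r\n"
        "\r\n"                      # an empty line ends the request
    )
    return host, path, request

host, path, request = build_get_request("http://www.amazon.com")
print(request)
```

The request is plain text: the first line names the file wanted, and the `Host` line tells the server which web site is meant, since one server can host several sites.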
The web site can be interactive: for instance, you can register or post comments, and this information from you is again sent to the web server and recorded in a database associated with the web site, which the server also hosts.

3. The Technology

3.1 The protocol suite TCP/IP
A protocol is a set of rules specifying exactly how a certain action is executed. You have probably heard of diplomatic protocols, the rules by which diplomats of different countries act and communicate with each other. These are very detailed and strict so that actions or words are not misunderstood as insults and lead to a war! These rules have been formalised and standardised for several hundred years so that they apply to diplomats from any country in the world. For instance, when a foreign head of state arrives in the country, a red carpet is rolled out when they get out of the plane – if the carpet were ever yellow, the dignitary would for sure take this as utter disrespect, and bilateral relations would quickly deteriorate. There is no good reason why the carpet must be red – other than that it was always done that way, and both sides expect it to happen. This is an important thing to remember about protocols: they do not always make sense but must be adhered to strictly, at all cost, also among computers.

Computers, which do not possess intelligence, have to have a very clear set of rules when they ‘talk’ to each other. How do they say ‘hello’? How does computer A ask computer B if it is busy or whether it is OK to send a message? How does computer A indicate that its message is finished? How does computer B say, “I’m sorry, I did not get that, can you send that part again?” How do they say goodbye? Not only this, but communication should be considered on different ‘levels’, even when it is not computerised; for instance, messengers should just be concerned with the delivery of the message, not about the content of the message or how the message is typed.
The sender should not care what messenger is used or by exactly what route the message is sent. The messenger will get the message sent, not caring who the sender is.

The millions of computing devices on the Internet definitely need a protocol to communicate, in fact a ‘suite’ of protocols, best known as TCP/IP from the two best known protocols in the suite, the Transmission Control Protocol (TCP) and the Internet Protocol (IP). A detailed treatment of TCP/IP will be given if you are going to take any kind of networking course. Roughly, the Transmission Control Protocol is responsible for contacting remote computers, indicating that they want to communicate, what they want to send or receive, and how they intend to do this. The Internet Protocol in turn is responsible for providing an addressing scheme that is unique world-wide (every computer gets its own IP address), and for routing messages from the sender to the receiver.

TCP/IP is a protocol suite or stack; it contains many more protocols than just these two. Every protocol in the stack is responsible for a distinct task, and together they achieve what we call connectivity: that computers can contact other computers for the purpose of data exchange. We will cover a few very important ones below, but the understanding of all of them is a topic for an academic degree of its own, and would be far beyond the scope of this course.

3.2 Internet Protocol (IP) addresses
Everyone who wants to receive post (snail mail!), especially in countries like Namibia where there are no house deliveries, has to have a P.O. box number, with a place name, so that the post office knows where to put mail items for him or her (note that a box number does not specify where you physically are).
In electronic communications, everyone with a cell phone has a unique cell phone number which enables the cellular network to identify and find his or her phone anywhere in the country (and if you add the country code or roaming facilities, anywhere in the world). Similarly every device on the Internet has to have a unique address, at least at any particular time. These are the IP addresses, which are like Internet ‘P.O. boxes’. In the current situation (version 4 of the Internet Protocol) these addresses are binary digits like everything else on a computer: a sequence of 32 bits, divided into 4 parts of 8 bits each. In dotted decimal notation these can be written as four numbers ranging from 0 to 255, e.g. 168.16.124.255. (You may think of these as the country, town, street and house number of a physical address, but the comparison is not exact.) Anyway, there are 2^32 possible combinations of IP numbers – about 4.3 billion addresses. Although this may seem a large amount, and there are still fewer than that number of devices (computers, terminals, servers, printers, mobile devices etc.) on the Internet, we are running out of numbers, because for technical reasons not all these numbers can be used as addresses.

The problem has been worked around by employing ‘reusable’ addresses: when you connect to the network, your Internet Service Provider (ISP) probably allocates you a dynamic address – one of the addresses at their disposal which is not currently being used. When you disconnect it is released again – much like a hire car, which when you return it is of course deregistered from you and made available to someone else. Even with this expedient, the supply of IPv4 addresses is becoming exhausted and a move to the new IP version 6 protocol is becoming urgent. IPv6 addresses have 128 bits, giving an unimaginably huge range of addresses – enough for every computer on every intelligently inhabited planet in the galaxy (assuming they are on the Internet as well!)
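The arithmetic in this subsection can be checked with a few lines of Python. This is a small sketch using Python's standard `ipaddress` module and the example address from the text; the variable names are chosen for this illustration only.

```python
import ipaddress

# The example address from the text, in dotted decimal notation.
addr = ipaddress.IPv4Address("168.16.124.255")

# Internally an IPv4 address is a single 32-bit number.
as_int = int(addr)
as_bits = format(as_int, "032b")

# Split the 32 bits into the four 8-bit parts and convert each back to decimal.
octets = [as_bits[i:i + 8] for i in range(0, 32, 8)]
decimals = [int(part, 2) for part in octets]

# Number of possible addresses in IP version 4 and IP version 6.
ipv4_total = 2 ** 32
ipv6_total = 2 ** 128

print("address:       ", addr)
print("as 32 bits:    ", ".".join(octets))
print("decimal parts: ", decimals)
print("IPv4 addresses:", ipv4_total)
print("IPv6 addresses:", ipv6_total)
```

Each of the four parts is 8 bits long, which is why no part can be larger than 255, and 2^32 works out to 4,294,967,296, the "about 4.3 billion" mentioned above.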
3.3 Domain Name System (DNS)
As far as the web surfer is concerned, you enter the Uniform Resource Locator (URL) of the page you want into your browser – the (relatively) user-friendly address, for instance of the Namibian newspaper, ‘www.namibian.com.na’. This is meaningful to a human but useless to the Internet routing system: every web server (a computer hosting one or more web sites, including the one hosting the Namibian web site) has an IP address, which is the only way the routing system can find it. It is possible to type a numerical IP address straight into a browser should you know it; however, most people, quite rightly, only know the URL. (The computer hosting www.namibian.com.na has an IP address of 196.31.243.42. Open a browser and type this IP address into the address bar to see that this gives you the same result as accessing the URL.)

Figure 6: Top-level country domains in East Africa

The hierarchy of addressing pages in the WWW consists of domains. The top-level domain is the country code (in this example, na), and then there are second levels (com), third levels (namibian), and so on. For computers to retrieve web pages, first the name in the URL must be ‘resolved’ or translated into an IP address: looked up in a kind of large dictionary called a ‘domain name system’, whose address itself must be known. This is done automatically and the user is generally unaware of the process. Failure to do this results in messages such as ‘can't find the server at...' or 'failure to resolve…’ and is a common cause of being unable to access a desired web site. The first thing to ensure is that you know the IP address of the DNS server of your ISP, and that this is correctly entered in the network settings on your computer.

3.4 Dynamic Host Configuration Protocol (DHCP)
Look at the last paragraph again: did you really specify the IP address of the DNS server last time you browsed the Internet? Most probably not; someone (or something) did that for you.
In fact, most service providers these days spare you this trouble: IP addresses, gateways, DNS servers and all these pieces of technical information are automatically allocated. The protocol that makes this possible is the Dynamic Host Configuration Protocol (DHCP). As the name suggests, DHCP allows for a dynamic configuration of your computer. This is particularly useful for mobile devices; all of the text of this study guide was edited in different restaurants in Windhoek, and at no time did the editor have to specify local connection parameters. All that was necessary to obtain Internet access was to “allow” the laptop to be connected to the respective wireless network, and to be suitably configured by it.

In-text question
How do you think one domain name server can resolve all the billions of URLs on the Internet?

The answer is of course that there is not just one name server, but many. Just as there are experts with ‘local’ knowledge, servers are divided into ‘zones’ and can resolve addresses within their zone. There are ‘master’ servers and ‘slave’ servers, the latter keeping copies of the information in the master, and there are ‘recursive’ servers which, if they cannot resolve an address, pass it on to another server which is more likely to know (just as, when you do not know an answer to a question, you look for someone who does!)

Activity 1
Time Required: 30 minutes plus! Or less!
Find a ‘non-technical’ friend who has just acquired a new laptop, or who is having difficulty with his/her Internet connection. Whether the laptop is to be connected by wireless (wi-fi) or via a 3G modem, follow the instructions supplied with the modem or the laptop, and set up the Internet connection! It is supposed to be straightforward and easy these days! Don't forget to switch DHCP on, otherwise it might not be that easy.
Feedback

If using a wi-fi connection, check that a wireless network is available where the laptop is situated, and whether it needs an access key, which of course you must know. Make sure wireless on the laptop is enabled. If using a 3G connection, follow the instructions supplied with the unit and set up the correct user profile. Contact the mobile provider for advice if necessary. You will also learn something!

Once connected, you can check the technical details of your connection. If you use Windows, choose “Run” from the Start menu and type “cmd”. In the command line window that appears, type “ipconfig /all” to see your current IP address, the IP address of the DNS server, the result of the DHCP request, and a number of other technical details. It is a bit more difficult to retrieve that information if you use Linux, but in this case you probably know what to do anyway.

References

Banks, M.A. (2008). On the way to the web: the secret history of the Internet and its founders. Apress.
Gralla, P. (2006). How the Internet Works (8th Edition). Que. ISBN: 0-7897-3626-8
Kaplan, P.J. (2007). F’d companies: Spectacular Dot Com Flameouts. Simon & Schuster. (A very funny book!)

Keywords/concepts

ARPANET: Advanced Research Projects Agency Network, the forerunner of the Internet.
HTML: Hypertext Markup Language, the code used to produce web pages.
Packet: A small amount of data into which longer messages are broken up, for transmission over a ‘packet switched’ network.
TCP/IP: Transmission Control Protocol/Internet Protocol, the protocol suite governing communications over the Internet.
IP address: The address (like an online ‘P.O. box number’) which identifies every device, such as a web server, on the Internet.
URL: Uniform Resource Locator, a ‘human friendly’ address to identify a web site, such as www.nust.na.
DNS: Domain Name System, a ‘dictionary’, held on name servers, which translates the name in a URL to an IP address.
ADSL, WiMAX, 3G: Methods of connection to the Internet.

Unit summary

In this unit you learned about the origins of the Internet: where it came from and where it is today. You learned about the technology of the Internet and how it works in simple terms. The difference between the Internet and the Web, and the function of web browsers, was explained. The most important business developments and applications on the web were recounted, together with short accounts of the entrepreneurs whose innovations and software have been extremely successful commercially.

Unit 3 Data, Information and Knowledge

Introduction

This unit is about the nature of information, a rather mysterious concept if you come to think about it. Is it a ‘tool’, a commodity, a quantity from physics like energy, or something completely abstract? It is not much use going to standard dictionaries for definitions of data, information and knowledge. They will paraphrase the words in general terms, attempting to define them in terms of each other, and do not capture the subtle distinctions and technical meanings which we require. We shall try to clarify these in the course of the unit.

For us, information in one sense is the resource for decision-making. It is what reduces uncertainty in a situation, although it does not necessarily eliminate it. Now that we take instant high quality information on almost any topic for granted, it is hoped that our leaders, business persons and politicians will take better decisions, though this remains to be proved! Acts carried out without information can certainly be disastrous. Some battles of history had (to us) puzzling outcomes simply because commanders did not know what was happening on the battlefield and, even if they did, had no way of communicating orders to their soldiers.
The war of 1812 between Britain and the newly independent USA broke out because, although differences between the sides had been settled at a conference, this fact could not be communicated in time to the faraway forces in North America!

In this unit we will discuss what it means to be information competent, not just ‘informed’. What is the difference between data, information and knowledge? What is hard and soft information, and primary, secondary and tertiary information? We will then investigate a current controversial topic, presenting both sides of the argument.

Objectives

Upon completion of this unit you will be able to:
discuss the presence of contradictory information about real-life events;
define the terms data, information, and knowledge;
explain the difference between data and information;
discuss the nature of knowledge;
describe the difference between primary, secondary and tertiary information;
evaluate whether any piece of information is primary, secondary, or tertiary in nature.

Prescribed reading

Information (2012). Wikipedia. Retrieved 24 April 2012 from http://en.wikipedia.org/wiki/Information

1. Information for Decision-Making

In-text question

Do you pride yourself on being able to distinguish between facts, hearsay, beliefs, opinions and rumours? If yes, what are your criteria?
If you are looking for a new cell phone, do you rush out to buy the first one you see advertised or do you check prices at other shops, and also look for information on the different types of phones?
If you do not understand how a new appliance works, are you prepared to read the instruction manual?
If you are in the happy position of having received two job offers, do you stress over which one to take or do you look for information about the two companies, and the career prospects of each, in order to make an informed decision?
If you have a large catalogue of hundreds of different courses offered by a university, such as the yearbooks of the Namibia University of Science and Technology, do you get lost in it or can you quickly find the particular course you are interested in?
Can you read a map of an unfamiliar large city, and use it to find your destination?
Can you read a company’s balance sheet, and from it derive information as to whether the company is thriving?
Among the billions of sites on the World Wide Web, can you quickly find a few sites which give you the information you require on a particular topic, and assess which is the most reliable of these?
Can you read two opinions on an issue by opposing politicians, and judge for yourself where the truth probably lies?
Do you do some background research on a topic, before ‘going public’ with an opinion on it?
Can you deduce the likely implications or conclusion from some given facts?
Do you think that laying a fibre optic cable around Africa, costing nearly 1 billion US dollars, is money well spent?

If you (honestly!) replied yes to some of these questions, you are information competent (congratulations), and it will stand you in very good stead in the information dominated world of the future. You certainly do not have to be a mathematician or scientist to be information competent, although some understanding of logic and statistics will help. Being information competent will increase your chances of success in life, by allowing you to make better career, financial and personal decisions. We are not giving medical advice, but it must be true that the ability to rationalise situations and choose between the alternatives which face us every day, on the basis of information and logic, is much less stressful than ‘agonising’ over choices and being unable to decide between them.

Activity 1

Time required: 30 minutes

Check the following two web references:
http://nbsrocks.com/obamas-ratings-rise
http://www.nydailynews.com/news/politics/2010/02/08/2010-0208_obamas_rating_plunges_underwater_for_first_time_in_new_poll_as_just_44_give_him_.html

What do you make of the apparent contradiction in these two reports from these ‘respectable’ online newspapers? Can they be reconciled?

Feedback

These two references give apparently contradictory facts: one that President Obama’s popularity ratings are rising, the other that they are ‘plummeting’. Is one the truth, and the other a lie? Not necessarily: each source may interpret the data in line with its own political bias, and the two could be referring to different things. For instance, the ratings could concern different matters (leadership, handling of the economy, etc.), could have been taken at different times (popularity can change very quickly), or could be based on data from different areas (parts of America).

2. Data, Information and Knowledge

Let us now try to distinguish usefully between these three words.

2.1 Data and Information

Data is any and every signal coming into our senses, literally any light or sound wave; in our digital age it could be represented by some stream of 0’s and 1’s. Commercially, it might refer to any item, the detail of any transaction: for instance, the price charged for an item at a supermarket, as recorded on a till slip. Thousands of these till slips will be produced every day, a mass of data. Suppose these till slips could be collected and sorted by product and date: we might get an indication of the most popular products at the supermarket, and of the price variation or rate of inflation for various foodstuffs. This is information.

“1N$” is data; it does not mean anything because it is removed from its context. It could be the price of something, or the denomination of a particular coin. It could be a spelling error, a sum, a wage per minute.
Taken alone, it can be neither true nor false because it is not yet a statement. On the other hand, a phrase like “This is a 1N$ coin.” is information. It comes in the form of a statement and is thus either true or false, depending on the object it is said about. Information is interpreted data. This is a very important distinction. Information is relevant and comprehensible, the tool that supports you in making a decision. Often it is created by summarising, sorting, collating or condensing data. However, not all data leads to information. A long stream of data may not only carry no information as yet, but may be of a type that cannot be interpreted at all (take a long string of random data, for example). Thus it is possible to have data that cannot be interpreted and is not the source of any information.

Edgar Codd and the Relational database

Figure 1: Edgar F. Codd. From research.ibm.com

This is another of the gentlemen of whom you may never have heard, but without whom nobody would ever be able to find any information on a computer, let alone on the Internet. In other words the world as we know it would not exist. He was a British World War II flying hero, who moved to the US and invented the relational database. Before this, there had been various attempts to devise software for retrieving data stored on a computer and turning it into information, to answer questions such as: “What are the names of customers based in Walvis Bay who owe more than $1000 but have not paid anything in the last 3 months?” But these early attempts at creating ‘databases’ were clumsy, inefficient and ad hoc. Whole teams of programmers would have been needed to answer a query like the one above.
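For comparison, a present-day relational database can answer the customer question above with one short query. The following is only a minimal sketch, written in Python with its built-in sqlite3 module. The table names (customers, payments), the column names, the sample rows and the fixed "today" date are all invented for illustration and are not taken from any real system.

```python
import sqlite3

# Build a small throw-away database in memory (all names and rows are hypothetical).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, town TEXT, balance REAL);
CREATE TABLE payments (customer_id INTEGER REFERENCES customers(id),
                       paid_on TEXT, amount REAL);
INSERT INTO customers VALUES
  (1, 'Customer A', 'Walvis Bay', 2500.0),
  (2, 'Customer B', 'Walvis Bay', 1800.0),
  (3, 'Customer C', 'Windhoek',   4000.0),
  (4, 'Customer D', 'Walvis Bay',  300.0);
INSERT INTO payments VALUES
  (1, '2011-11-20', 200.0),
  (2, '2012-03-05', 500.0);
""")

TODAY = "2012-04-01"  # fixed date so that the example always gives the same answer

# Customers in Walvis Bay who owe more than 1000 and have made
# no payment during the three months before TODAY.
QUERY = """
SELECT c.name
FROM customers AS c
WHERE c.town = 'Walvis Bay'
  AND c.balance > 1000
  AND NOT EXISTS (
      SELECT 1 FROM payments AS p
      WHERE p.customer_id = c.id
        AND p.paid_on >= date(?, '-3 months')
  )
ORDER BY c.name
"""

late_payers = [row[0] for row in conn.execute(QUERY, (TODAY,))]
print(late_payers)
```

The two tables are linked only by the customer number they have in common, which is the kind of link by shared data on which the relational approach rests.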
Codd, who was a mathematician, devised a system in 1970 for representing information which, like all inventions of genius, looked ridiculously simple afterwards: basically tables of rows and columns, with the tables (relations) linked together by virtue of their common data. Although this sounds almost naive, it was both mathematically sound and easy to implement. There have been extensions and improvements to this arrangement over the years, but basically the relational model still stands; it is the model on which your course in databases, if you have done one, is founded. A great deal of what you search for and find on a computer today comes from a relational database.

Primary, secondary and tertiary information:

You witnessed a car accident in the street and filed a police report on it.
You read a newspaper report about a car accident in the street.
The accident was included in the statistics published by the Traffic Department each month.

OR

The original diaries and letters of Nelson Mandela.
A biography of Nelson Mandela, reviewing his thoughts as expressed in diaries and letters, written maybe by someone who met or knows him.
An encyclopedia article about Nelson Mandela, written by someone who is knowledgeable but who has never met him.

In these two examples, the three points represent respectively primary, secondary and tertiary information. The three terms differ in their level of abstraction:

Primary information is a direct account of events, from the viewpoint of its actors or witnesses. Its propositions are on a low abstraction level, a direct interpretation of data related to the event. Examples of primary information are court files, witness reports, interviews and scientific data analysis.

Secondary information is an interpretation and synthesis of primary information.
That means that the data that forms the basis of secondary information is not the same as the data for primary information: the subject of the interpretation is not the data itself, but its primary interpretation! For instance, if three eyewitnesses report different colours of a car involved in a hit-and-run accident, the journalist covering the event might write that the car was of a dark colour, not giving any specific colour but summarising the three different accounts. Note that the journalist in this example has no perception of her own of the colour of the car; she wasn't around when the accident happened. Her data consists of the three pieces of primary information available from the witnesses. Examples of secondary information are newspaper reports, academic research output, TV documentaries and the like. Most information available to us via the mass media is secondary. The abstraction level of secondary information is medium.

Tertiary information is on a very high abstraction level. It consists of summaries of many events and very general descriptions, as found in text books, encyclopedias, annual reports or executive summaries. As with secondary information, the data on which it is based is itself information: assertions like “Namibia has a very high accident rate” are, if properly derived, reflections on the number of reported accidents compared with the numbers from other countries. Most details are left out (abstracted from) to achieve a general, high-level analysis.

The three levels of information can blur into each other, and sometimes the distinction (especially between secondary and tertiary) is a rather pedantic matter. Interestingly, all three types of information are, in principle, admissible as evidence in a court of law.
Primary information is direct witness evidence, secondary is indirect or hearsay evidence, and tertiary is ‘expert witness’ evidence, as when a scientist testifies that a DNA sample matches that of the suspect with 99% probability.

Incidentally, how much information is currently on the Internet? About (in 2010) 5 million terabytes, that is, 5 million million million bytes. A DVD can hold about 4.7 gigabytes, and it is estimated that the human brain can hold up to 5 terabytes of data. Thus, to store the information content of the Internet would require about one billion DVDs, or one million human brains!

2.2 Knowledge

Knowledge is a slippery concept, difficult to define. It is more than just a collection of information. “Knowledge management” is a current business buzz word. Consult 20 books on ‘knowledge management’ and you will find 20 definitions of ‘knowledge’. We could try: “understanding information about a subject to the extent that it can be applied to the benefit of the organisation.”

The philosophically “most correct” definition, and indeed the one widely accepted by scientists, is this: Knowledge is justified, true belief. This may sound surprising but is derived from the following exclusion criteria:

First, you cannot know something that is not true. People will laugh at you if you say “I know that the moon is made of green cheese”, no matter what. If something that was widely believed and thought to be known turns out to be false (for instance, that the world is a disk), it can henceforth no longer be “known”, simply because it is false.

Second, you cannot know what you do not believe. This is a very personal determination, but people would find it weird if you said “I know I will pass Information Competence” while at the same time admitting that you do not believe you will.

Third, you cannot know anything for which you have no good justification.
If you say “I know I will live at least one hundred years”, you will be asked “How can you know that?”, and then you are expected to explain yourself: that you do not smoke, that all your relatives died old, that you invented anti-aging medicine, whatever. If you cannot corroborate your knowledge with good reasons, people will not agree that you really know.

The observation that knowledge is justified, true belief leads to a number of important conclusions, most prominently that knowledge transfer (teaching and learning) cannot consist only of the conveying of facts. Memorising is therefore not an acceptable learning method; to learn something properly you must establish your own body of justifications, and not just parrot the justifications of your lecturer. As soon as you have your own factual justification, people will say that you understand the matter, and this is equivalent to having obtained knowledge. During your studies lecturers will attempt to test your knowledge. They cannot check what your beliefs really are, and we assume that what you learn is true, so what they will test is your justification.

Knowledge involves subjective aspects such as synthesis, experience, judgement and ‘wisdom’: things which you cannot put in a database but which are important none the less. Some knowledge is inherent to individuals and cannot easily be transferred to anyone else; this is called ‘sticky’ knowledge. An organisation possesses far more ‘knowledge’ than is contained in its information systems: everything from how its products are perceived by customers to what jokes go down well with its best client. A database will tell you the address, phone number and outstanding balance of a customer, but is unlikely to tell you what his favourite cuisine is, which could be important in closing a deal!
Much knowledge is unstructured, so that it cannot be reduced to the rows and columns of a table, and much could be inferred from data which the organisation possesses but does not use. Knowledge management is an attempt to extract some of this ‘intangible’ value for the benefit of the organisation.

Keywords/concepts

Data: Any item, detail or ‘bit’ comprising some signal of some event; the sum total of the signals of such an event.
Information: The collected, summarised, extracted or formatted data that have been interpreted in a particular way; a tool for decision-making.
Knowledge: The system of justified, true beliefs of an individual or an organisation.

Unit summary

In this unit you learned about the nature of data, information and knowledge. You learned about the importance of being ‘information competent’: the ability to acquire, process and utilise information in order to deal with the problems of everyday life. You learned about primary, secondary and tertiary information and the essential definition of information as a tool for decision-making. Finally you read the basics of the scientific nature of information.

Unit 4 Search Engines

Introduction

In the beginning, with Tim Berners-Lee at CERN, there were only 50 pages on the ‘web’, and users kept a list of them on a couple of sheets of paper. A notice of new pages was sent around each week! In 2009 there were over 25 billion web pages. Finding your way through this vast jungle would be impossible without some automated searching help.

Formerly, many users wrote down the URL of a web site which they had found by chance. All browsers now allow you to bookmark a site, but many people do not even bother with this; they type their topic into their favourite search engine in the confidence of finding the page again. The power of modern search engines is awesome.
You can type in the most obscure search keywords, not necessarily the title of a page, and Google or another advanced search engine will search the billions of web pages and return your search results within a fraction of a second.

Objectives

Upon completion of this unit you will be able to:
describe generally how a search engine works;
use a search engine for simple and advanced information retrieval;
discuss the principles of legitimate search engine optimisation;
discuss the power and privacy implications of search engines.

Prescribed reading

Search Engines, and Google, from Wikipedia, accessed July 2011:
http://en.wikipedia.org/wiki/Search_engines
http://en.wikipedia.org/wiki/Google

Additional reading

Croft, B. et al. (2009). Search Engines: Information Retrieval in Practice. Addison Wesley. ISBN: 0-1360-7224-0
Belew, R.K. (2007). Finding Out About: A Cognitive Perspective on Search Engine Technology and the WWW. Cambridge University Press. ISBN: 0-5216-3028-2

1. How a Search Engine Works

It may be that the Internet and the World Wide Web are the most important inventions ever. But without modern search engines they would be useless, because most of the information in them could never be found. They would be like a huge pile of books on the floor, useless unless shelved and indexed.

Figure 1: Comparison of market shares of the three largest search engines (PD)

There were several searching systems in the early days of the Internet. Most of these disappeared after the arrival of Google in the late 1990s. Google accounts for over 90% of searches worldwide, with the nearest rivals, Yahoo and Microsoft’s Bing, trailing far behind. This applies to most countries; however, mention should be made of the Baidu search engine in China, which enjoys 60% of the Chinese market, that is, several hundred million users.

The first thing to understand is that a web search engine operates entirely automatically or algorithmically, without human intervention.
It must do so if 25 billion pages are to be searched and a list of results displayed in a fraction of a second! Occasionally, human intervention is applied, as was the case when a racist cartoon appeared first in the results of image searches for Michelle Obama.

Of course, not every one of the 25 billion pages is checked when a search is made. Instead, an extremely sophisticated index is created in advance. The precise mechanism depends on the particular search engine, but first of all automated browsers called crawlers or, more appropriately since this is the Web, ‘spiders’ spend their lives visiting web pages. The spider will note the keywords of the page and collect its title and location, its textual content, and its metatags: information about the content of the page such as author, creation date, and the categories the web page falls into. Information about images on the page can also be recorded, not yet directly from the picture itself, but from a text caption or title if the image has one. When you upload a new web page or site, sooner or later a spider will visit it and note its contents for the engine’s index.

How would the algorithm become aware of the new page? There are two possibilities. Either someone links to your page from a web site that is already known to the spider; when that site is due to be visited again, the algorithm will recognise the new link and add it to the pages to be visited. Or, if nobody knows about your page yet, you can manually register it with the search engine so that it starts including it in its spider visits.

The spiders must continue their journey around the web endlessly because web sites are continually being updated, and content is changed or even removed. That is why occasionally you click hopefully on a search result but obtain only a white screen with a “Page not found” message. All the data gathered by the spiders is compiled into an enormous
When a user enters one or more keywords into the search box on his/her browser, the search engine examines its index and finds matches for the keywords required. Of course, unless the search is for an extremely obscure topic, there would be many matches (there is a game to find a search topic on which Google returns only one result!). But even a search on the “Namibian Economy” today (8/3/2011) yielded 23 000 results, or ‘hits’, and “Lady Gaga” returned 222 000 hits! All these are web addresses, which you could click on, in theory, to see a page relating to Lady Gaga! Figure 2: From left: Eric Schmidt (Google CEO), Sergey Brin, and Larry Page (Google founders) Nobody would have the time, or be interested, to look at thousands of search results. Google lists say ten pages of results, at about 20 per page, but it has been shown that most searchers look only at the first page; actually at the first five results. Only a few, determined to get exactly the information or result they want, may proceed to subsequent result pages. The question is how to ‘rank’ the results so that the most relevant, highest quality, most likely to satisfy the searcher, are displayed at the top of the results or at least on the first page. This is where page ranking algorithms come in. Users of early search engines were greatly frustrated by their ‘stupidity’, so it is important that the most relevant results are shown first, so that the user will not have to look further. Most search engines keep their exact ranking algorithms confidential. This is because all web site publishers, especially commercial ones, would like their pages to appear high or at the top of the search rankings, and if they knew exactly how the ranking algorithms worked, they could manipulate their sites to achieve this. Nevertheless, the general principles are known. These include: Traffic to the site – is it a site which gets a lot of visits, and also recent visits? 
(Dormant sites will be downgraded in the rankings) Incoming links – do many other web sites link to the site, indicating that it is recognised as being important in the online community? Social popularity – has the site received a large number of Facebook ‘likes’? A good match between the index and search keywords, and repeated matches, to some extent, will reinforce this. In the ‘metatag’ part of the page code, the page writer can include keywords to ‘help’ the indexing spider, and this is good if used with discretion. If abused, for instance by a restaurant owner repeating “best restaurant in Windhoek” 20 times in the metatags, this will be noted and counted as ‘search engine spoofing’ so the site will be downgraded instead of being promoted in the rankings For Google, the most popular search engine by far, the core algorithm is known; it is called the Page Rank. Incidentially, this algorithm to rank pages was invented by a man with the surname “Page” - Larry Page, the co-founder of Google! The Page Rank algorithm runs iteratively (repeating cycles) over the amount of incoming links: In the first run, pages that have a high number of links to them rank highest. But with only one run this algorithm could easily be tricked: Just create a worthless page on the WWW, and then create thousands of pages (called link farms) containing nothing but links to your page. This trick would make your page immediately rank high in the first run of the Page Rank algorithm, and such tricks (called link bombing) have been used extensively in the early days of the World Wide Web. Here comes the improvement of the Page Rank algorithm: In second and further runs, the Page Rank of the pages linking to a page is considered. This makes it important to not only have links to your site, but to have important (i.e., high-ranking) pages link to it. 
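The repeated runs can be illustrated with a short program. The code below is only a simplified sketch of the published PageRank idea, not Google's actual implementation. The damping value of 0.85, the number of runs, the function name page_rank and the tiny example web are all assumptions made for illustration.

```python
def page_rank(links, damping=0.85, rounds=50):
    """Return a rank for every page; `links` maps a page to the pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal ranks
    for _ in range(rounds):
        # Every page receives a small base amount in each run.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page shares its current rank among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical mini-web: four ordinary pages link to "news", which links to "good";
# two link-farm pages link to "spam".
web = {
    "a": ["news"], "b": ["news"], "c": ["news"], "d": ["news"],
    "news": ["good"],
    "farm1": ["spam"], "farm2": ["spam"],
}

ranks_first_run = page_rank(web, rounds=1)
ranks_final = page_rank(web, rounds=50)

for name in ("good", "spam"):
    print(name, round(ranks_first_run[name], 3), round(ranks_final[name], 3))
```

After the first run, "spam" is ahead of "good" because it has two incoming links against one. After many runs "good" overtakes it, because its single link comes from the well-linked "news" page while the farm pages have little rank of their own to pass on.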
A bogus page containing only links would in itself have a low PageRank, and the original page would rank lower and lower in successive runs of the algorithm.

Exploiting the (probable) ways the search engine ranks and displays results, making sure your site is ‘search engine friendly’, is called Search Engine Optimisation (SEO). It is perfectly legal and a big business in itself.

Google does not only use the PageRank algorithm, although it forms the core of the search. Further factors influencing the results of a Google search are geolocation (if you search for “pizza” you will get results from your local area, not a possibly better-known and higher-ranking pizzeria in London or New York), payments (some of the search results are advertisements), current trends (web links that were clicked by other users rank higher), and a number of other factors that Google does not disclose.

In-text question

Do you use the standard (‘I’m Feeling Lucky’) search mode of Google or the Advanced search? Do you generally find what you are looking for in the first five returns?

Activity 1

Time required: 15 minutes or longer

Google yourself! (Use your own name or that of someone or some organisation close to you.) How many hits are there, and on which sites do they appear? Discuss the implications of Google's amazing memory about you. Are you comfortable with Google's snapshot of your personality? How will other people perceive you when they google your name?

Feedback

Of course, if you have a common name you will have to individualise it somehow. Some of the hits may be a surprise or even a shock (they can be quite different from a Facebook search). Some may be irrelevant or uncomplimentary.

Some search results might paint a negative picture of you, for instance if you are frequently listed in online game high scores but never for scholarly or work achievements. Outsiders might think that you play while at work.
Remember that potential employers, when you apply for a job, especially a senior one, will google you to get some background information!

2. Choice of a search engine

Figure 3: Google is noted for its clean uncluttered display; however, on special days of the year it substitutes whimsical logos to celebrate the event: this is Earth Day 2011. From www.google.com

Although Google is overwhelmingly the most popular search engine, perhaps in part because of its unusual and easy to remember name (it derives from ‘googol’, a word apparently coined by the nine-year-old nephew of a mathematician for the massive number 10 to the power 100), there is a proliferation of other search engines, often with whimsical names like dogpile, duckduckgo etc. Microsoft rebranded its unsuccessful engine with the hopefully even easier to remember name of Bing.

Search engines certainly differ in their appearance (home page). Google is famous for its clear, largely white home page; others are cluttered with advertisements, links, current events, pop-ups etc. Whether it really makes any difference which search engine you use is a matter of opinion. Generally they give similar results for a given search. There does not seem to be such a thing as one engine more suited to technical searches and another to social or literary matters, except that there is a site called Wolfram Alpha, produced by the inventor of the iconic mathematical software Mathematica, which is supposed to specialise in retrieving statistical data, solving equations etc. But it does not seem to work very well.

There is always the suspicion of commercial bias. Suppose you ‘googled’ “operating systems”. Would Google give some preference to its own Android operating system over Microsoft Windows? Of course, Google and other companies strenuously deny this.

There is a difference, though, with so-called ‘consolidated search engines’ (also known as metasearch engines), which act like a search ‘insurance broker’.
A broker takes your insurance inquiry and farms it out to different insurance companies to get the best policy for you. Similarly, a consolidated search engine takes your search keywords and forwards them to a group of primary search engines. You then get their consolidated reply, which may give you a broader set of results. Examples are search.com and dogpile.com.

Earlier versions of search engines allowed complex Boolean searches (AND, OR, NOT etc.). Although an ‘advanced’ search option is still usually available (pushed on to the second page), searches have generally been ‘dumbed down’ so that the searcher is simply confronted with one large box in which to type keywords. We rely on the intelligence of the search engine to interpret, parse, paraphrase or correct the spelling of our request, if necessary. Some users do not even bother to type a URL into the browser address field, even if they know it; they simply enter the name of the site (or something like it) into Google, which is set as their home page, and let Google find it.

Nevertheless, mastering some aspects of advanced search can dramatically improve search results. We mention just a few of them to give you an impression:

Exact phrase search: by putting double quotes around a string of words you force the search engine to show only results where this exact sequence of words occurs. A search for "Namibia University of Science and Technology" will only list pages where these words occur in exactly this order. Without double quotes, pages that merely contain the same words somewhere in the text (small words such as “of” are ignored in such searches) would also be listed, for instance the homework of a student at a Hong Kong university of science and technology about the history of Namibia.
Usage of logical operators OR and NOT (the default operation on Google is AND) and brackets:
("Seeis" OR history) AND -wiki
gives different results than
"Seeis" OR (history AND -wiki)

Usage of keywords (identified by containing a colon):
define:computer (provides a definition)
translate:Hansard into German (provides a translation)
"Seeis" history inurl:.na (searches in the url instead of the web page text)
Gallert lecture notes filetype:pdf (searches for a type of file)
site:en.wikipedia.org "Seeis" (restricts the search to a particular web site)
link:http://en.wikipedia.org/wiki/Seeis (searches for pages containing a hyperlink to the specified site)

What search companies are now talking about are smart engines which ‘know’ what you are looking for without you having to tell them! This seems to mean that as you begin to type your search, the engine will suggest topics it has indexed that start with those letters. As soon as you type “Cat..”, results for catalyst, catamaran etc. would appear in a drop-down list. If the word or phrase you want appears in the list, you can click on it straight away, which saves a little time, but many people will be irritated at being ‘second guessed’ in this way. The engine may also remember your search history so that it knows you are more likely to be interested in catalysts than catamarans (because you previously searched for information on various chemicals).

The question of whether Google remembers your previous searches is a hugely controversial topic. Of course it is able to. The history of everything you have searched for (and every result you have clicked on) obviously creates a comprehensive profile of your interests and basically of who you are, invaluable to advertisers and maybe also to politicians. Does Google hand this information over to the authorities? This is of special concern with regard to authoritarian regimes such as China.
Google’s mission is to bring all the world’s information to all the people of the world. This includes the digitisation and putting online of all the world’s books. This sounds a noble ideal but of course falls foul of copyright interests. Obviously, who would buy a book if you could read it for free on Google? So this initiative has become the subject of acrimonious legal action, with Google being accused of massive copyright violation. Reflection If Google gets to know everything about everybody and everything, does that make Google God? Not to get into that discussion, but the organisation certainly gets involved in times of natural crises. “When the world is in crisis, we turn to Google” said Slate magazine on March 11 2011. This was the day on which a massive earthquake hit Japan. Within the hour, Google had a page up whereby people could ask after those who were missing, and those who were trapped in the affected area could post messages that they were safe. Google offers many other facilities, some well known, others which come and go without making much of an impact. All, however, are or were interesting and imaginative. On the search side, there is Google Scholar, which searches for academic papers, translation tools, image search, Google earth and maps, and of course gmail. Its attempts to promote ‘facebook’ like networking and chat sites though, have largely failed (e.g. Google ‘wave’). See: http://google.about.com/od/blogs/ss/Google-Graveyard for a list of Google ideas and projects that were unsuccessful or were never implemented. In-text question Are you cautious about searching for and visiting controversial sites, e.g. sites of possible terrorist organisations, even if you have a legitimate reason for doing so, like research, in case this might be recorded and count against you in some way, like obtaining a visa for a country? Activity Activity 2 Time Required: About 15 minutes Select a topic among your current interests. 
Or otherwise take, say, the ‘best player of the 2010 World Cup’. Submit it to at least four search engines or meta search engines, e.g. Google, Bing, Search.com, Ask.com.

Feedback
Perhaps the results will be very similar. If there are differences, would this be anything systematic due to the nature of the search engines, or just a matter of chance?

Activity 3
Time Required: 30 minutes
How good are you at ‘googling’? Use Google to find the following:
The name of the capital of the new state of South Sudan.
Vilna is the town which is now the capital of Lithuania. But to which other countries did it belong in the past?
Who composed your cell phone ring tone?
Find a picture of the very first computer mouse!
What are the GPS coordinates of the Ariamsvlei border post?
What does the acronym TLA stand for?
What did New Era write about the Caprivi treason trial?

Hint: there is nothing to stop you keying your question straight into Google. Alternatively, use the Advanced search facility and enter the ‘exact phrase’!

Feedback
In particular you could try the following search terms:
Capital of South Sudan
History of Vilna
The ring tone is a difficult task. Look up the name of the ring tone in your phone settings and google it. From the result you should see from which work of music it is derived. Now google the music piece and find the composer.
Picture of first computer mouse
location of Ariamsvlei
define:TLA
inurl:newera "Caprivi treason trial"

References
Gray, Michael (Greywolf SEO blog, December 2005) Google Search Tricks, Tips and Hints, last accessed 30 April 2012. http://www.wolfhowl.com/google/google-search-tricks-tips-and-hints/.
See also items in required reading at head of the unit.

Keywords/concepts
Search engine: Sophisticated software which indexes billions of web sites with keywords and retrieves them according to search requests.
Page rank: The core of the Google search algorithm. An iterative evaluation of the number and Page Rank of incoming links to a web site.
Spider: A program which continuously visits web sites, looking for keywords and possibly other data, and indexes them for a search engine.
Search engine optimisation (SEO): Techniques for making web sites visible and ‘user friendly’ to spiders, in an attempt to improve their retrieval ranking.
Search engine spamming or spoofing: Attempts to improve search engine rankings by underhand means, e.g. by multiple repeated (invisible) keywords (like blue on blue) on the page.

Unit summary
In this unit you learned about the power of Internet search engines, and how in principle they work. You learned about the different types of search engine, and what meta- or consolidated search engines are. You learned how web site designers should optimise their sites to attract (legally) the attention of search engines and improve their rankings in search results. You learned about how search engines are allegedly manipulated or censored for commercial or political ends, and how some people speculate on the philosophical implications of the enormous power of search engines.

Unit 5 Creating your Own Web Page (or Web Site)

Introduction
These days anyone can be on the web and, if we believe Facebook’s recent figures, over 500 million of us are. On Facebook you can upload your pictures and stories, request to be a ‘friend’ of almost anyone on the planet, post comments about anything etc. Even in these days of Facebook, it is a good idea to be able to produce your own independent web page, either for yourself or on behalf of a small organisation you may be working for.
There are two aspects to this – creating the page itself, either by coding or with a web tool, and publishing it – getting it hosted on the Internet so that everyone can see your efforts. Web pages, however, need not be on the Internet – they can reside locally on your computer hard drive or be part of a local internal ‘intranet’. There are two different ways to produce web pages, you can learn how to write HTML code and write your page from scratch, or you can use a web development tool which will produce the code for you. Both methods have disadvantages. This unit describes both, and if you have the time we encourage you to try both ways. However, the exercise for this unit you can complete whichever way you want, and only one method of creating web pages is needed for you to accomplish it. Objectives Prescribed reading Additional reading Upon completion of this unit you will be able to: write simple HTML code to create a basic web page, OR use a web development tool to create a more sophisticated web page or site; test and troubleshoot your site; upload your web site on to the Internet. The website: www.w3schools.com for a vast amount of information and support for learning web languages and building your web site. Any of a multitude of books with a title such as: “Build your own web site”. Go to the vast www.about.com site (which has topics on nearly everything), select the ‘web design’ newsletter written by Jennifer Kyrnin and sign up for it. It’s completely free, and you will get an email once or twice a week filled with invaluable advice on web design, both basic and advanced. 1. A Page by Coding with HTML As we said previously, a web page is generated by a text file; this text being a ‘program’ in a simple language called HTML, although the HTML can ‘call’ other programs in less simple languages like Javascript to do jobs more complicated than what the HTML itself can do alone (like animations etc.). 
You can see the HTML source code behind any web page: all browsers have a ‘view source’ option.

Just as you can prepare your own great meal (if you have the cooking skills) or otherwise send out for a take-away (easier, but with less choice and maybe not exactly what you want), you have two choices in web development. You can construct the web site yourself, or you can use ‘tools’ of varying sophistication to make a page for you. With such tools you create web pages by selecting from various templates, clicking, and ‘dragging and dropping’, needing no technical or programming knowledge. We will create pages using both of these methods.

There are several good web design courses offered at Namibia University of Science and Technology. We do not attempt to duplicate these here but simply provide you with enough knowledge to create a simple page.

Note it! / Warning
One ‘word’ of warning. There is a facility in MS Word to create a document as usual and then ‘save as’ an HTML file. In theory this would be a web page. But it is horrendously inefficient: go to ‘view source’ and you will see dozens of pages of code for a ‘page’ which is just a line of text! It is difficult to see what purpose this facility serves. So although the process looks easy, do not do it!

Tip
It is a very good idea to have your file extensions turned on and visible. Depending on your Windows defaults these may be suppressed to avoid confusing non-technical users. For instance letter.docx (a Word 2007 file) shows just as ‘letter’. But we need to know the file type, to be able to tell whether a file is a Word document (.docx), a plain text file (.txt) or, most importantly for us, an HTML file (.htm or .html), which is a text file but has a special purpose (recognised by a browser). Similarly an image would be a .jpg or .gif file etc. To turn file extensions on, start Windows Explorer, open the folder options (Windows 7: “Organize”, then “Folder and search options”), click on the “View” tab, and untick “Hide extensions for known file types”.
Click “Apply” and “OK”.

This is a simple HTML ‘program’ which produces an equally simple web page:

<html>
<head>
</head>
<body text="yellow" bgcolor="blue">
Hello there
</body>
</html>

It will produce the message “Hello there” in yellow letters on a blue background. Notice the words in angle brackets < >. These are the ‘tags’ or instructions in HTML. They define the basic structure of the program and instruct the browser on how to display the content, which is simply included as text between the instructions. Note that the quotation marks around the values must be plain straight quotes ("), not the curly quotes a word processor produces. This mixture of content and structure is a bad thing (to purists) and has led to better ways of defining a web page, but these need not concern us here.

To prove this works, type the above into a plain text editor. MS Word is not a text editor and cannot be used! A text editor available on all MS Windows computers is Notepad. If you are using Linux you probably know what to do, but common text editors there are emacs, gedit, joe, and vi. When saving, choose ‘All files’ as the file type and give the file name the extension .htm (that is why we needed the file extensions to be showing). When you double click on the icon of the saved .htm file (which should look like the icon of your default browser), the browser should open the file and display a simple web page as described. You can check it by looking at the ‘view source’ or ‘view code’ option.

1.1 Step-by-Step Example

We will now produce a small HTML page step by step. Make sure you actually do this yourself on a computer; simply reading through this example will not teach you anything! The screen shots below are taken from a computer running the Linux operating system. For legal reasons we may not provide screen shots from proprietary software like Microsoft Windows or Microsoft Internet Explorer.

1. Start a text editor (NOT a word processor like WordPad, MS Word, or OpenOffice) and create a new document. On MS Windows, there is one pre-installed editor called “Notepad”.
2. Save the new document with file extension .html
3. Type the general structure of an HTML document. Every HTML document should contain the tags <HTML>, <HEAD>, and <BODY>. They must all be closed (</HTML>, </HEAD>, and </BODY>), and they must not overlap: the </HEAD> closing tag must occur before the opening tag of <BODY>, and so on.
4. Write some text into the body of the document. The body contains all text that will appear in the browser window.
5. Save your document, open a web browser, and within the web browser open the document you just created.
6. Go back to the text editor. The header of an HTML document contains meta-information. Whatever you type here will not be displayed in the browser window. However, you can specify the author of this page, keywords for search engines, and a page title. The page title will display in the title bar of the browser when the page is opened.
7. Give an appropriate page title and save.
8. Go back to the browser and click “Refresh” (on most browsers: F5). Check that the title bar displays the title of the document.
9. In the text editor, add some more text to your HTML document. Separate the paragraphs by line feeds.
10. Save and check in the browser (refresh with F5) how the text is rendered. Note that the white space in the source file does not result in separate paragraphs in the browser!
11. To achieve an organisation into paragraphs, add <P> (paragraph) tags to your text.
12. Save and check the result in a browser.
13. Mark up the headlines with heading tags (<H1> to <H6>, depending on the size you want them to be, <H1> being very big and <H6> being very small). Separate the headline from the text with a horizontal rule (tag: <HR>). Make the page creator information smaller with <SMALL> tags. Special characters are created by keywords, for instance &copy; for the copyright symbol ©.
14. Save and check the result in a browser.
15. Insert a link to another web page with the anchor tag <A>. The anchor tag needs a parameter (in HTML usually called an attribute) to work properly. The parameter's keyword is HREF, and its value is the url of the web page you want to link to. Parameter and value are connected by the assignment symbol =. The text between the opening and the closing anchor tag will be displayed as a link (by default in blue and underlined) and can be clicked on.
16. Save and check the result in a browser. The link will only work if you have an Internet connection.
17. Insert a picture into the web page, using the <IMG> tag. You must specify the source (the file name), and you should specify the dimensions.
18. Save and check the result in a browser. If you give only the file name as the source, the image will only show if it is stored in the same folder as the source HTML file. Don't forget the file extension for the image; it is part of the file name!
19. You can combine the functionality of several tags by nesting them. For instance, wrapping the <IMG> into anchor tags makes the image clickable. Some tags can even be iterated (repeated) by placing them more than once.
20. Save and check the result in a browser.
21. Change the colours of the page's background and font.
22. Save and check the result in a browser.
23. Sophisticated text placement can be achieved by using tables. Tables use the <TABLE> tag. Inside the table there are rows (<TR>), and inside the rows there are cells (<TD>). Make sure every row has sufficient cells to fill it. If you want to have a border, use the BORDER keyword inside the <TABLE> tag.
24. Save and check the result in a browser.

Congratulations, you have just produced your first web page! But how do you get it online so that it becomes part of the World Wide Web, and people around the world can see it?
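Before moving on to publishing, it may help to see the result of the steps above collected in one file. The following is only a sketch of what such a page could look like, not the page from the original screen shots: the name John Doe, the image file me.jpg, its dimensions, the colours and all the text are made-up placeholders, and the alt text on the image is an addition that the steps do not mention. Tags are written in lower case here; HTML does not distinguish between upper and lower case in tag names.

```html
<html>
<head>
  <!-- Header: meta-information, not shown in the browser window (steps 6-7) -->
  <title>John Doe's Home Page</title>
  <meta name="author" content="John Doe">
  <meta name="keywords" content="John Doe, Windhoek, student">
</head>
<!-- Page colours (step 21); the colour names are arbitrary examples -->
<body text="black" bgcolor="lightyellow">
  <!-- Headline and horizontal rule (step 13) -->
  <h1>Welcome to my page</h1>
  <hr>

  <!-- Paragraphs need <p> tags; blank lines alone are ignored (steps 9-11) -->
  <p>This is the first paragraph of my page.</p>
  <p>This is the second paragraph.</p>

  <!-- A link: the href attribute holds the url (step 15) -->
  <p>Visit the <a href="http://www.nust.na">university web site</a>.</p>

  <!-- A clickable image: <img> nested inside <a> (steps 17-19).
       me.jpg is a hypothetical file in the same folder as this page. -->
  <a href="http://www.nust.na"><img src="me.jpg" width="200" height="150" alt="Photo of John Doe"></a>

  <!-- A table with a border, two rows, two cells per row (step 23) -->
  <table border="1">
    <tr><td>Name</td><td>John Doe</td></tr>
    <tr><td>Town</td><td>Windhoek</td></tr>
  </table>

  <!-- Small print with a special character (step 13) -->
  <p><small>&copy; 2012 John Doe</small></p>
</body>
</html>
```

If you save this as a file with the extension .html and open it in a browser, each commented part should correspond to one of the checks in the steps above. The image will appear as a broken picture unless you place a file with the assumed name in the same folder.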
For your page to become visible it must be transferred to a web host, a computer that responds to user requests and that is always online. That means you must upload your document to another computer. It also must get a URL so that it becomes reachable from anywhere.

Many web hosting sites charge money for hosting your web site, but there are free options. An example of a free web host is http://freehostingnoads.net/. It contains step-by-step instructions on how to transfer your files there. There are literally thousands of places where you can upload your own web site for free. Just google “free web hosting” and see how many millions of results you get. There are of course disadvantages when you use a free service provider: your page may not load very fast, the site may go offline, the provider may place advertisements on your site, or your site might even disappear without good reason. Always keep a copy of your web site, in case the free host goes bankrupt and your page is not available anymore.

Another disadvantage of free web hosting is that you do not get your own domain. That means you cannot have your web site accessed by a url like www.john-doe.com. This service always costs some money (around US$20 per year) because such domains must be registered with the global Domain Name System (DNS, see chapter 2) in order to become available. You may, however, get a url like john-doe.freehost.com, where the free host registered freehost.com, and your site piggybacks on it.

Recommended website
There are literally thousands of books and resources on the Internet for learning HTML. We will point you to one of the most popular of these, the tutorial site www.w3schools.com (which, despite its name, is not run by the World Wide Web Consortium). This site contains virtually everything you need for teaching yourself any web related language, at any level. (An alternative is www.selfhtml.com).
Activity 1
Time Required: 2 hours
Work through the teach yourself HTML tutorial at http://www.w3schools.com/html/default.asp. This will teach you enough ‘tags’ to construct a simple page. A complete reference dictionary of all tags, their meaning and syntax is available there. By a simple page we mean a page with a heading, defined colours of text and background, maybe an inserted picture (image), and the necessary headings, body text and formatting such as a bulleted list etc.

Feedback
Your task will now be to produce a personal web page, with a picture of yourself and a bit of biographical detail.

Activity 2
Time Required: Half a day.
Produce a personal web page ‘manually’ using HTML code, with the features described above.

Feedback
Remember that although HTML is a ‘simple’ language, like all programming languages it is unforgiving of mistakes like spelling errors. Remember that it uses American spelling! E.g. the tag <center> will work but <centre> will not. Similarly the attribute bgcolor is right, bgcolour is wrong. Unlike in many other computer languages, you do not get a ‘syntax error’ warning from a mistake: the browser will simply ignore what it does not understand, sometimes with frustrating and baffling results.

2. The Difference between a ‘Page’ and a ‘Site’

A web page is usually a single screen display on your computer resulting from a browser processing an html file. As we mentioned, the genius of the web results from the idea of ‘hypertext’. Hypertext shows up as places on the page called links (generally in blue underlined text) which change the shape of the mouse pointer to a pointing finger when you hover over them. Each link is the address of another page, which is summoned and loaded when you click. Links are implemented in html by the ‘anchor’ or <a> tag. This other page can be another html file on your computer, or a web page on the other side of the world; it does not matter.
When we have a collection of related html files within the same computer or folder, it is called a web site or just a site. For instance, your personal home page can be expanded to a full web site by having links to a page of your sporting interests, a page of your current studies, a page of family pictures etc. It would be much too cluttered to try to put all of this material on one page. There could be external links to some other subjects of interest to you, like Namibian environmental issues, or anything else. All of these linked pages together will form your web site or just your ‘web’, and your web then will consist of a folder of html files plus image files, in the simplest case. There should be a structure to these files, or pages; the simplest would be a hierarchy:

Figure 1: A hierarchical structure as used for web pages with the home page at the top.

The ‘home page’, by the way, should have a file name such as ‘index.html’ (the exact default name depends on how the web server is set up, but ‘index.html’ is the most common). This is the file the web server delivers when someone requests your web site without naming a particular page. Although one can always use the ‘back’ button on the browser, it is good practice to include a link back to the home page from every other page, and other ‘navigational’ links as well. If the web site is large, the web site plan itself, similar to the diagram above, can be included to assist the user!

Sometimes what looks like a single page is actually a composite of several pages (html files, again) fitted together, like cuttings from a newspaper. In fact, online newspaper sites are often constructed in this way. You can tell if parts of the page scroll up and down, leaving the rest of the screen unmoved. These sub-pages are called ‘frames’. So what looks like a single page can be an entire web of frames.
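To make the hierarchical structure described above more concrete, here is a minimal sketch of a home page for such a site. It assumes a folder containing index.html together with three sub-pages; the file names sports.html, studies.html and family.html and the external address are hypothetical examples chosen for illustration, not files provided with this course.

```html
<html>
<head>
  <title>John Doe's Web Site</title>
</head>
<body>
  <h1>John Doe's Web Site</h1>
  <hr>

  <!-- Internal links: the sub-pages are html files in the same folder,
       so the file name alone is enough as the link address -->
  <p><a href="sports.html">My sporting interests</a></p>
  <p><a href="studies.html">My current studies</a></p>
  <p><a href="family.html">Family pictures</a></p>

  <!-- External link: a page on another computer needs the full url
       (www.example.org is a placeholder address) -->
  <p><a href="http://www.example.org">Namibian environmental issues</a></p>
</body>
</html>
```

Each sub-page would then carry a navigational link back to the top of the hierarchy, for example <a href="index.html">Home</a>, so that visitors do not have to rely on the browser's ‘back’ button.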
The matters of deciding on the structure of a web site, as well as designing each page from the point of view of visual attractiveness, utility and ease of use fall under the subject of web design and ‘useability’, a vast and fascinating area which brings together skills of graphics, art and aesthetics, technical knowledge and competence, and even human psychology. 3. A Page or Site Produced with a Web Design Tool There are hundreds of web design software tools on the market, from the simple and free to the professional and expensive, such as Dreamweaver and Microsoft Expression Web. There is another class of product, called content management systems (CMS) which are especially suitable if you want a dynamic or interactive web site on which information is frequently updated, with a searchable database etc. Examples of these are Drupal and Joomla. And then there are online design tools that allow you to create and upload a web site in one go: You create the pages online, and the moment you click “save” your page is on the Web for everyone to see. Compared to the “manual method” described above, you save the step of finding a free web host service, and you create your pages “on the fly”. This method also has a disadvantage: If the provider of the service goes out of business, or starts charging money, you often have no way to retrieve your files in order to upload them somewhere else. Check social networking sites like facebook or twitter to see what feedback, positive or negative, the service got from other users. If you have used a web developer before and have a favourite, you may use this, otherwise we suggest moonfruit (www.moonfruit.com). Recommended website Go to this site and you should feel free to experiment with it. You may choose a type of site, personal, business etc., a template for a page structure, colours, background and of course a means of placing your content including hyperlinks to other sites or pages. 
There is a beginner’s guide on how to use the system.

Activity 3
Time Required: 3 hours
Produce a practice web page or site using Moonfruit or another web tool of your choice, and develop it into a web site (or just a ‘web’).

Feedback
The site should be along the same lines as your first, but a bit more ambitious, making use of the features of the development tool. Again, if you have any query or have produced an effort you are particularly proud of, please send it to the course coordinator. When the web site for the course is fully constructed, there will be a section for meritorious student web pages, so yours may be published!

References
Niederst, J. (2006). Web design in a Nutshell, New York: O’Reilly Media.

Keywords/concepts
HTML: Hypertext markup language, the language on which all web pages are based.
Web tool: A software program in which the user generally lays out a web page visually, for which the program then generates the HTML code.
Content management system: A special web design tool which makes it easy to create ‘dynamic’ pages with user postings, database look-up, mailing lists etc.
JavaScript: A language (not to be confused with Java) used to create many web animations and effects.

Unit Summary
In this unit you learned about the basic structure of a web page, and how every page at heart is a piece of code in a language called HTML. You learned enough HTML to create a simple personal page. You learned that a web site is a collection of related and interlinked pages. You then learned about the ‘tools’ available to create more sophisticated pages and sites without having to code them ‘from scratch’, and used one to create a page for your assignment. If required, you also learned about how to upload these pages on to the Internet.
Unit 6 The Reliability of Information

Introduction
There are some who theorise that early humanity developed language not in order to communicate but in order to lie! Lies, misleading information and deliberate ‘disinformation’ have always been with us, but in the age of the Internet, where anyone, regardless of whether they are knowledgeable, qualified or well intentioned, can publish any comment on almost any subject for all the world to read, we have to be even more careful about the ‘information’ we are given. This unit offers advice on how to assess the reliability of information from any source, and how to detect bias, the effect of vested interests, conflicts of interest and conspiracy theories.

There is a widespread but naïve belief in the correctness of all published information. However, much of what you read in books or newspapers, see on TV, or listen to on the radio is subjective, biased, deceptive, or just plainly wrong. And yes, this includes this study guide, so be alert! Everything is published for a reason, and only in a select few cases is this reason that the writer wants to share their knowledge and educate the general populace. More often than not the reason to publish is to push an agenda, to earn money or to gain reputation, and if misinformation is the best way to achieve this, many authors will misinform. It is therefore very important to assess both the reliability and the independence of any publication.

Objectives
Upon completion of this unit you will be able to:
assess the reliability of information and independence of the author;
discern the presence of vested interests when they appear in sources of information;
judge the reliability of web sites and the independence of the publisher;
check the substance of urban legends and conspiracy theories;
recognise ‘unspeak’ and verbal manipulation.

Prescribed reading
None specific.

Additional reading
Wikipedia guidelines on reliable sources, authored by the co-author: Gallert, P. (Wikipedia, 2010) On Reliable Sources. Last accessed 30 April 2012. http://en.wikipedia.org/wiki/User:pgallert/On_reliable_sources

1. Impartiality and Bias

In the pre-Internet days, when information came mostly verbally or in printed form, you were better able to judge its accuracy. You placed more reliance on the statements of a respected authority, and less reliance on the opinions of a drunk in a pub. When you were in visual contact with an opinion giver, it was easier to assess them. Books were reviewed by proof-readers and publishers, and anything outrageous or libellous would not make it into print. Similarly, newspapers would not allow obvious falsehoods or extreme views in readers’ letters onto the pages. Any paper appearing in an academic journal would have been reviewed by other researchers, with the assurance that its contents, if not guaranteed correct, would at least be well founded and arguable.

But on the Internet, anyone can set up a blog and voice opinions online, passed off as authoritative information, without any checks or ‘health warnings’. Talking of health warnings, there are many sites which offer health and medical advice, even illness diagnoses, of very variable reliability. There is, for example, possibly well-meaning but misleading advice on what to do and where to shelter in an earthquake.

The problem is the anonymity and unaccountability of the Internet. The author of a blog or website is not always mentioned, and even if they are, one is seldom sure whether the statements, advice or opinions expressed have been reviewed or verified. There are no controls or entrance criteria for publishing on the Internet. You do not even have to host your own web site; you can set up a blog or post on innumerable other sites. Thus almost any school of thought, bias or conspiracy theory can be found online, even hate-filled or racist sites.
Pro-Nazi sites are banned in Germany and other countries, but they can be hosted somewhere else. 1.1 Types of bias: Bias can be very subtle and perhaps impossible to avoid by human beings. If you see some information which you think is unbiased, it is probably because it is biased the same way you are! Reflection Bias is generally political or religious or even scientific. Since most people have some religious or political beliefs (even if they are anarchism and atheism!) it is difficult to be impartial – indeed, impartiality may be impossible to achieve. Vested interests are usually related to commercial matters. As they say, a conference of butchers is unlikely to resolve in favour of vegetarianism! Dieting advice sites are unlikely to be sponsored by sugar refiners. Companies producing competitive products like computer software, cars and beverages are unlikely to be objective about each other’s products. Vested interests can span across industries and into politics, for example the nuclear power lobby. In many countries, such as the US, powerful lobby groups exist, representing the vested interests of nearly every industry. When any legislation is proposed, which might threaten a vested interest the relevant group ‘lobbies’ legislators relentlessly until the proposed law is dropped or watered down. That is why groups representing tobacco companies held up information for years relating to the dangers of smoking. Conspiracy theories thrive on the ‘net. Perhaps whereas before it was difficult to get enough people together believing in a wacky ‘theory’, the Internet can gather fringe theorists from all over the world to espouse their ideas. These theories can range from the idea the Titanic was never sunk, that Elvis is still alive, that HIV was produced by the CIA, that it can be cured by beetroot, that people have not landed on the moon, that President Obama was born in Kenya or that Michael Jackson was killed by his doctor. 
Of course, some conspiracy theories may turn out to be true – in the last case the doctor concerned has been convicted of causing the singer’s death!

In-text question: What is your favourite conspiracy theory? Be honest!

When you are on-line, rumours, ‘urban legends’ and conspiracy theories abound. Some of these are fun, like gossip. Read them but do not take them too seriously.

Recommended websites: For amusing commentaries and analysis of current and older urban legends (most are very old) check out www.urbanlegends.about.com. For debunking urban legends and conspiracy theories, see www.snopes.com.

Activity 1 Time Required: 1 hour
Research your ‘favourite’ urban legend or conspiracy theory, maybe one which you half-believe in! What about aliens? Or the idea that on 9/11, the ‘twin towers’ were demolished by controlled explosions rather than the impact of the planes? Search for it on snopes.com or on other sites that examine conspiracy theories and see what the verdict on it is.

Feedback: Were the answers on the ‘debunking’ sites convincing? Or is there still room for doubt? How reliable do you judge www.snopes.com itself to be?

2. How to Detect Bias and ‘Disinformation’

As already noted, nothing is published without a reason, and often the reason is not benign. So in order to judge the reliability of information we need to find out the reason for its publication. This is difficult; after all, we cannot torture the author in order to obtain this information. We can, however, usually arrive at an “educated guess”, drawing on experience and on the different business models of the entities that publish information. This is the reason why an introduction to how publishing houses make money is part of Information Competence.

Figure 1: The Loch Ness monster as published in the Daily Mail on 21 April 1934. Believed to be a hoax.
To evaluate whether a particular publication is reasonably objective and free of bias, one has to consider two main points:

1. Is the publication reliable? That means, is it generally concerned about factual accuracy, and does it attempt to detect and correct possible errors in advance?
2. Is the publication independent of the subject? That means, can it reasonably be said that it is unlikely that the publication furthers a certain agenda, sells a product relevant to the topic of the publication, or is affiliated with subjects covered in the publication?

The first question restricts “good” sources of information or opinion to certain classes of publications – the ones that usually attempt to be free of factual errors. Those are for instance books published by reputable publishing companies, politically independent newspapers, academic papers and monographs. Why are these publications better than, for instance, press releases, readers' letters, advertisements, and self-published essays? The reason is that large publishing companies have a reputation to lose; if they constantly publish faulty facts they will go out of business. Academics might ruin their career if they are wrong too often. On the other hand, people who publish only occasionally and make their living from some economic activity other than writing do not necessarily run the risk of reputation loss: a farmer writing about the history of Namibia can still sell his cattle, even if he is perceived to put forward a fringe view on history.

What about the World Wide Web? Not everything written there is automatically wrong, although there really is a lot of dodgy content on the web. Sometimes the web site itself looks strange, has a lot of blinking text, garish colours, and just looks utterly unprofessional. But a professional web site design is not in itself an indicator of factual accuracy.
It is not difficult these days to produce high-quality web content; your exercise in Unit 5 was intended to show you how easy it is. On the Web, reputation is as important as offline: A web site read by thousands of followers has a high value for advertising because it attracts a lot of potential customers, and many reputable publishers do not even consider anymore to put their content on paper. It can be accessed so much easier over the Internet, by so many more people, at next to no cost! Which publications in the WWW are reputable? A few generally good locations are: Web offerings of publications that also have a print run (Online newspapers, academic journals) Book scans (Check Google books) Academic collections like Google Scholar Blogs by well-known people, writing in their area of expertise Official web sites of governments and companies (but beware of their possible bias!) The second question, whether information might be factually correct but nevertheless biased, involves some detective work. There are thousands of examples of how hidden affiliations influence what information is published, and what is omitted. Remember the times when Informante published negative stories about Namibia University of Science and Technology, almost on a weekly basis? From an inside perspective, let me assure you that most of the facts were distorted – not entirely wrong but taken out of context, exaggerated, or one-sided. Now, Informante was owned by TrustCo at that time, and TrustCo owns IOL, a competitor of NUST Every prospective student that can be convinced not to study at NUST but to go to IOL, is a potential profit for the company. This is what we mean by a dependent source: In their own interest, Informante could not have published a positive article on NUST, because that would have had a negative influence on their own business! 
The issue of dependent sources is bigger than most ordinary people imagine, and it is not restricted to any particular company, political system, or developmental stage. In Germany, there is a long-standing agreement between journalists and government not to expose private scandals of public figures unless they involve illegal activities. If any journalist publishes such a story, for instance an extra-marital affair of a minister, the publishing newspaper will be excluded from all future press conferences, and its ability to compete on the newspaper market will be severely weakened.

This is the institutional dimension of bias. On a personal level, inspecting the objectivity of an author rather than of the publishing company, the two questions above can be rephrased. We may ask (derived from http://en.wikipedia.org/wiki/User:Pgallert/On reliable sources, an essay by the co-author):

1) Has the author knowledge in this subject area?
2) Why has this been published?

Again, the first question boils down to reputation, and the second one to affiliation. A court reporter writing about computer viruses for the first time very likely has not produced something reliable on this occasion, no matter how fantastic the newspaper is with court reports, or otherwise. On the other hand, a computer security expert writing on computer security is a reliable source. It does not matter at all whether they write in a book, an academic journal, a blog, or a government brochure.

The second question again can hardly have a definite answer. Under normal circumstances we just do not know what the intention of the author was. But some detective work can go a long way: Often it is possible to determine if an author was paid to contribute, or if they voluntarily proffered their opinion.
In the days of Google, it is often easy to find out who the employer of a particular person is, to what religious or political movement they belong, how their other texts have been perceived, or if they own certain businesses in the sector.

Reflection: Professor Kiremire from the University of Namibia often writes on the methodology of university rankings, and particularly on the rankings of NUST and UNAM. Is there something wrong with that? Make sure you consider affiliation and reputation.

It is obvious that Prof Kiremire could have an interest in promoting UNAM, since he works there. However, being one of the top scientists of Namibia, he must also consider his own reputation as an independent researcher. It is therefore unlikely that he blatantly distorts facts or interpretations; his contribution is reasonably reliable despite his affiliation.

2.1 “Unspeak”

This is a particular type of verbal bias or subtle manipulation described by Steven Poole, who popularised the word and wrote a book of the same name. It depends on a cunning use of words or phrases which automatically puts the reader or listener on your side, and denigrates the opposing view at the same time. The most notorious example of this in recent years is the “War on Terror”, coined after the “9/11” attacks on America. The suggestion is that Terrorism is a terrible violent thing (of course it is), therefore we must make ‘war’ against it, therefore you are signed up and enlisted as a soldier in the ‘war’ (“if you are not with us you are against us”), and because it is a ‘war’ we are entitled to invade other countries. So you are with us on this, because who could possibly be in favour of terror? The obvious flaw in this implied reasoning is that ‘war’ can only be waged against countries or peoples – you cannot make ‘war’ on a mentality or a tactic, which is what terrorism is (and anyway are Terror and Terrorism the same thing?).
“Terrorist suspect” seems to be almost a crime in itself, and the repeated ‘s’ sounds, like the hissing of a snake, make it sound very bad indeed. Another well-known piece of unspeak was the “Axis of Evil”, a phrase lumping together unfavoured countries such as Iran, Iraq and North Korea. “Axis” was the English term for Nazi Germany and their allied countries in World War 2, the term therefore has a very negative connotation. Even though the countries of today's Axis of Evil have little in common, and nothing to do with Nazis or Hitler, the choice of the word “axis” suggests that they employ similar tactics. A less aggressive but no less persuasive example is the environmental organization which calls itself the “Friends of the Earth”. Who could not be in favour of them? (Because if you were not, you would presumably be an Enemy of the Earth, and nobody would want to be that!) Activity Activity 2 Time Required: 30 minutes Look on the Internet, or in a newspaper for the full (verbatim) text of a politician’s speech. Find some instance of ‘unspeak’ and try to discern what the real meaning or intention was behind the words. How long? Feedback Unspeak often uses words of ambiguous meaning, euphemisms etc. ‘Feelgood’, positive words are used in the context of unpleasant things. There are very many examples. NATO ‘takes care to avoid civilian casualties in Libya’. That does not mean that there are no civilian casualties (i.e. deaths) but at least they take care – and ‘caring’ is a feel-good word! In Namibia there is apparently going to be a department with a generous budget for promoting ‘national pride and identity’. What does this mean? Do we not have pride and identity already? Or is this code for something else? For more rich topical examples, see Poole’s ‘unspeak’ blog at www.unspeak.net McConnachie, J. (2008). Rough Guide to Conspiracy Theories. London: Rough guides. References Keywords/concepts Poole, S. (2007). Unspeak . 
London: Abacus.

Independence: The quality of being unaffiliated with the subject of a publication.
Reliability: The quality of being dependable or trustworthy.
Bias: Predisposition, partiality, prejudice.
Unspeak: A subtle form of verbal bias or manipulation, such as the ‘war on terror’.

Unit Summary

In this unit you learned how to assess the reliability of information and how to detect lack of impartiality or bias. You learned how to suspect the presence of vested interests and to recognise conspiracy theories. Most of all, however, you should learn how to make up your own mind as to how reliable and truthful information is, based on the source of the information, the medium and the circumstances.

Unit 7 Elementary logic, assumptions and reasoning, fallacies and misleading statistics

Introduction

Logic is the system by which we can infer valid conclusions from true premises. Wiktionary defines it as “A method of human thought that involves thinking in a linear, step-by-step manner about how a problem can be solved.” It is really a language of its own, and was invented by the ancient Greeks over two thousand years ago. It is amazing then how so many people in public life are so bad at it and how many decisions are taken on an illogical basis. Logic can be highly mathematical (symbolic) and philosophical, but it will be enough for us to stick to common sense aspects.

Statistics can mean two things: the simple gathering of numeric data, as in Government statistics, or the mathematical study of how predictions can be made from collections of related numeric data under various circumstances. Nearly everything in statistics is probabilistic, which means that only a likelihood or probability can be attached to any prediction. If you toss a coin a hundred times it is almost but not absolutely certain that you will get at least one head.
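The coin claim can be checked with exact arithmetic. The following is only a sketch, assuming a fair coin and independent tosses (assumptions the text implies but does not state). It also works out the chance of exactly 50 heads, which the text turns to next. The variable names are our own.

```python
from fractions import Fraction
from math import comb

TOSSES = 100  # number of tosses, as in the text; a fair coin is assumed

# P(at least one head) = 1 - P(all 100 tosses are tails)
p_at_least_one_head = 1 - Fraction(1, 2) ** TOSSES

# P(exactly 50 heads) = C(100, 50) / 2**100
p_exactly_half = Fraction(comb(TOSSES, TOSSES // 2), 2 ** TOSSES)

print(f"P(at least one head) = 1 - 1/2^{TOSSES} (just short of certainty)")
print(f"P(exactly 50 heads)  = {float(p_exactly_half):.4f}")
```

The first probability falls short of 1 by only 1/2^100, which is why "almost but not absolutely certain" is the right description. The second comes out at roughly 8%, so an exact 50/50 split is indeed unlikely.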
It is not certain that you will get 50 heads and 50 tails – there is only a probability attached to that! Actually, it’s unlikely!

Statistics is difficult and sometimes leads to unexpected results. Furthermore, members of the media, politicians, lawyers and opinionators are often entirely ignorant of mathematical statistics, so their assumptions and conclusions from numerical data and other statistics are often wildly out of line. Statistics can be and very often are manipulated to give false or misleading conclusions, by means of incorrect mathematics, misleading graphic representation etc. An information-literate person needs to be able to detect this, and this unit will try to help you with this.

Objectives

Upon completion of this unit you will be able to:
describe the principles of basic true/false logic;
apply the idea of premises and conclusions;
describe the principles of statistics;
describe the requirements for making valid statistical inferences;
recognise the ways in which statistical data can be misleadingly represented, and incorrect conclusions drawn.

Prescribed reading: Huff, D. (1991) How to Lie with Statistics. London: Penguin

Additional reading: Goldacre, B. (2009) Bad Science. London: Harper Perennial

1. The nature of logic

The simplest kind of logic deals with propositions, statements which can have one of two values, true or false. These are quite similar to the 0’s and 1’s of computer science, which is essentially a system of binary logic. There is incidentally no moral value attached to a ‘truth’ over a ‘falsity’; they are just values, or symbols, T and F. We have to start somewhere, so we take some propositions (not too many) which we assume to be true. These are called premises or axioms, and every mathematical or logic system (every language if you like) has to have them.
Say
A: “Grass is green.”
With B: “The ground in front of my house is covered in grass.”
Therefore C: “The ground in front of my house is green.”

This is a valid inference. Assuming that the premises are true (we could argue that they are not; consider the colour of grass in the Namibian context), it is impossible that the conclusion is false. Actually it is an example of deductive reasoning, reasoning from the general (grass is green) to the particular (the front of my house is green). We can write p → q or ‘p implies q’: the truth of p means the truth of q. This is similar to the ‘if ..then..’ construct in a programming language. Of course the first fallacy is to think then that q → p. Just because the front of my house is green, it doesn’t mean there is grass there – it could be green painted concrete!

This may sound like a strange and illegitimate condition, but logic requires that if we accept the premises, and if the reasoning is valid, then there is no way that the conclusion could ever be wrong. To write “p” and “q” instead of complete propositions is called formalising: The schematic way of deducing “q” if we accept “p” and “p → q” is valid no matter what actual sentences occupy the places of p and q. The propositions p and q can be ‘flipped’ since they only have two values. ¬p means ‘not p’, so if p is true then ¬p is false. So p → q does not mean that q → p, but it does mean ¬q → ¬p! Think about it!

Just as numbers and variables can be combined in normal algebra (don’t panic – we just mean that from x and y we can have x+y, x*y, etc) we can combine logic propositions with different operators such as AND, OR, exclusive OR (XOR) with meanings similar to English. Combining these propositions using these operators gives us other propositions, i.e. statements that are true or false.
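If you would like to "think about it" with the help of a computer, the claim about implication can be tested by trying every combination of true and false. The sketch below is not part of the formal theory. It assumes that "p implies q" is read as false only when p is true and q is false, and the helper name `implies` is our own.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Assumed reading of 'p implies q': false only if p is true and q is false."""
    return (not p) or q

# All four combinations of truth values for p and q.
rows = list(product([True, False], repeat=2))

# Contrapositive: (p -> q) and (not q -> not p) agree in every row.
contrapositive_ok = all(implies(p, q) == implies(not q, not p) for p, q in rows)

# Converse: (q -> p) does NOT agree with (p -> q) in every row.
converse_ok = all(implies(p, q) == implies(q, p) for p, q in rows)

print("p      q      p->q   q->p   notq->notp")
for p, q in rows:
    print(f"{p!s:<6} {q!s:<6} {implies(p, q)!s:<6} "
          f"{implies(q, p)!s:<6} {implies(not q, not p)!s:<6}")

print("Contrapositive always agrees:", contrapositive_ok)
print("Converse always agrees:      ", converse_ok)
```

The table has only four rows, so checking all of them is a complete proof for this small case: the column for "not q implies not p" matches the column for "p implies q" everywhere, while the column for "q implies p" does not.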
(p AND q) is true only if both p and q are true
(p OR q) is true if at least one of p and q is true
(p XOR q) is true if either p or q is true but not both
(p → q) is a special case: it is true in two cases, namely if p is false, or if q is true.

This might seem very strange because it does not correspond with the everyday use of if...then. But indeed, the proposition “If it rains the streets get wet” says nothing about the case that it does not rain. If it does not rain, the implication still counts as true, because it makes no claim about that case (you could get a garden hose out and wet the street yourself). There is only one way an implication like (p → q) could ever be false: if we have to accept that p is true but q is false. So (¬p OR q) is equivalent to it.

The “if..then” is often meant colloquially as “if and only if”: “If you do your homework properly I will take you to the Windhoek Show.” means two things: “You do your homework, then I take you to the show” (if p then q) AND “You do not do your homework, then the Show is off” (if ¬p then ¬q). These are two statements, not one.

Some colloquial uses of “and” and “or” likewise do not conform with the rules of propositional logic. For instance the phrase “I don’t have a TV or a DVD player” really means “I don’t have a TV AND I don’t have a DVD player”. Or take a notice saying “No smoking and drinking in the building”, which read literally might mean you can smoke or drink but not both at the same time!

Expression    Meaning
p             A proposition
¬p            (It is) not (the case that) p
p → q         If p is true then q is true. Alternatively: p is false OR q is true
p AND q       True if both p and q are true
p OR q        True if at least one of p and q is true
p XOR q       Exclusive OR: true if either p or q is true but not both

Activity 1 Time Required: 3 minutes
Which of these situations is a logical AND, which an OR and which an exclusive OR?
To succeed in a job search these days you need at least qualifications or experience.
Education plus job creation is necessary for national development.
Choice of main dish on a special restaurant menu.

Feedback: The structure of these statements is an OR, an AND and an XOR respectively (the restaurant would look puzzled if you asked for both or all the main dishes on the menu!).

Some of these operations can be expressed in terms of each other. Some of these equivalences are not obvious. For instance (the second of these is one of the De Morgan laws of propositional logic):
p → q is the same as (¬p OR q)
¬(p OR q) is the same as (¬p AND ¬q)
Also, if (p → q) AND (q → r) then p → r. This is called transitivity.

Activity 2 Time Required: 5 minutes
Suppose it is true that eating fresh fruit makes you healthy. What would you say about the following ‘deductions’?
I eat fresh fruit, therefore I am healthy
I am healthy, therefore I eat fresh fruit
I am not healthy, therefore I do not eat fresh fruit
I do not eat fresh fruit, therefore I am not healthy

Feedback: The first deduction is valid: it simply applies the given premise, which is assumed true. The second is not valid – you could be healthy for other reasons. The third is valid – if you did eat fresh fruit you would be healthy, from the premise. The fourth is not valid – the premise says nothing about the consequence of not eating fresh fruit. Remember p → q does not imply ¬p → ¬q!

The converse of deduction is induction, which tries to proceed from the particular to the general – can you infer a general principle from particular observations, say? This is the core of the scientific method. An experiment which produces a result should produce the same result when conducted again in the same circumstances, and may lead to a general law.
According to legend, the apple which landed on the head of the English physicist Newton started him wondering: the apple falling on my head must be the same phenomenon as all apples as well as all other objects falling to the ground – why? Was the earth attracting the apples? Eventually this led him to formulate the general law of gravitation, according to which all objects in the universe with mass, not just on earth, attract each other.

In mathematics a technique of the same name is very useful – if you have a formula involving the positive whole number N and you want to prove it true for all N, all you have to do is show it is true for N=1. Then, assuming it is true for some arbitrary number N, show it is then true for N+1. The formula will then be true for all values of N!

Be careful when concepts like Nothing and Nobody enter a logical argument, as these can lead to nonsensical results. Think of the photocopied sign on nearly every secretary’s wall, about the office where Somebody, Everybody and Nobody worked. “Everybody thought that Somebody would do it, but Nobody did it!!” Try:
Nothing is better than eternal life
A hotdog (from the NUST kiosk?) is better than nothing
Therefore, a hot dog is better than eternal life!

Reflection: Where is the trick in this deduction?

The formalisation has been executed in a faulty way. The “nothing” in the first premise is equivalent to “there exists no entity that...”. The “nothing” in the second premise means “no food”. So the first and the second “nothing” do not refer to the same thing; the conclusion is invalid.

Activity 3 Time Required: 10 minutes
What’s the difference between the following two inductive conclusions:
This N$1 coin is made of copper-nickel alloy. Another N$1 coin is made of copper-nickel alloy. Therefore, all N$1 coins are made of a copper-nickel alloy.
This lecturer wears ‘veldskoene’. Another lecturer wears ‘veldskoene’.
Therefore, all lecturers wear veldskoene.

Feedback: Valid inductive conclusions support counter-factual statements. But while “if this thing were a N$1 coin, it would be made of alloy” is true, the statement “if this person were a lecturer, they would wear veldskoene” is false. Therefore the first induction is valid, and the second one is not.

2. Statistics

It was a 19th century British prime minister who said that there were three kinds of lies: Lies, damned lies, and statistics. He meant that people (politicians especially) can present figures in a way to support almost any theory or conclusion. In the simplest case, politicians or spokespersons for corporations can simply ‘cherry pick’, ignoring statistics which do not suit them, and focussing on the statistics which do.

This is certainly true, but not only that: theoretical (mathematical) statistics deals with the probability of outcomes resulting from circumstances defined by collections of numerical data. It arose from a French aristocrat trying to improve his winnings at gambling! It is a very reliable science, but many people do not clearly understand the nature of probability, or how to apply the principles of statistics. For instance, think about the following:

(i) (From an alleged NBC weather forecast) The chance of rain on Saturday is 50%. The chance of rain on Sunday is also 50%. Therefore rain over the weekend is 100% certain! [What actually was the probability of rain?]
(ii) In 2009, there were two reported cases of HIV in a certain village. In 2010, there were four reported cases. So the incidence of HIV in that village is doubling every year!
(iii) (A gambler) I’ve lost 10 times in a row. Therefore, the next time I play, I must win!
(iv) Over a period of three months, 10% of all the patients in a hospital die. Over the same period, over the whole country, only 1% of the population dies. Therefore, it is 10 times more dangerous to go to hospital than to stay at home!
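Two of the examples above, the weekend forecast and the gambler, can be worked through with a few lines of arithmetic. This is only a sketch under stated assumptions: the two days are treated as independent of each other (real weather is not), and the gambler's 10% chance of winning each game is a hypothetical figure chosen for illustration. All variable names are our own.

```python
from fractions import Fraction

# Forecast from the example: 50% chance of rain on each day.
p_sat = Fraction(1, 2)
p_sun = Fraction(1, 2)

# Assumption: rain on Saturday and rain on Sunday are independent events.
p_rain_both_days = p_sat * p_sun              # rain on Saturday AND on Sunday
p_dry_weekend = (1 - p_sat) * (1 - p_sun)     # no rain on either day
p_rain_at_some_point = 1 - p_dry_weekend      # rain on at least one day

print("Rain on both days:   ", p_rain_both_days)
print("Rain at some point:  ", p_rain_at_some_point)

# Gambler: hypothetical 10% chance of winning each game, games independent.
p_win = Fraction(1, 10)
p_lose_ten = (1 - p_win) ** 10                # ten losses in a row
p_lose_ten_then_win = p_lose_ten * p_win      # ten losses followed by a win

# Conditional probability of winning game 11, given the ten losses.
p_win_given_ten_losses = p_lose_ten_then_win / p_lose_ten

print("Win after ten losses:", p_win_given_ten_losses)
```

Under these assumptions the chance of some rain over the weekend is 3/4, not 100%, and the gambler's chance of winning the eleventh game is still 1/10: the losing streak cancels out of the calculation entirely.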
Invalid arguments like these are called fallacies. The problems with the above fallacies (and sometimes they are much more subtle) are the following:

In (i), probabilities do not add up (50+50=100) in the way assumed. If the two days are independent of each other, the probability that it will rain at some point during the weekend is 75% (that is, 1 − 0.5 × 0.5). The probability that it will rain on both days (Saturday and Sunday) is only 25%. In reality it is even more complicated, because if it does rain on Saturday, that fact will influence (due to the nature of weather) whether it will rain on Sunday or not – the probabilities are conditionally related (dependent).

In (ii), the sample size of the data is far too small. With only two and four cases, the difference could easily be due to chance. If it was a large town, and there were 2000 cases of a disease one year and 4000 the next, it would be a case for alarm – the rate is evidently doubling!

In (iii), we have independent probabilities, where what has happened before is no indication of what will happen the next time. Suppose the chance of a gambler winning any game is 10%. Whatever has happened before, the chance of winning the next game is still 10%. Not a certainty!

In (iv), there is obviously skewed data, or a sample bias. The people who are admitted to hospital are already sick, so not representative of the general population. They are more likely to die than the mostly healthy population outside. Not the fault of the hospital!

Mostly when we are doing a survey or trying to establish a trend, we have to choose a sample of the population of data, because we cannot examine all of it. It is important then that the sample be fair, large enough, and representative of the whole population.

Activity 4 Time Required: 30 minutes
Suppose you were asked to undertake an investigation of what students thought of the sports facilities at their college. You prepare a questionnaire, and consider the following ways of distributing it:
1) You print 200 copies and hand them out at the gate to the first 200 students arriving at the university one morning.
2) You get an alphabetical list of students and send a questionnaire to every 20th student.
3) You post the questionnaire on the university web site and invite anyone interested to fill it in and submit it (on-line).
4) You send out ten questionnaires to the secretary of each of the registered sports clubs of the university, asking them to distribute the copies to members of their club to fill in.
5) You call a meeting to discuss the “problem of sports facilities” and hand out a questionnaire to every student attending the meeting.

Which of the above ways do you think would be the fairest, i.e. the one which would be the most reliable, and give you the best feedback in answer to your question?

Feedback: This example is about the problems of working with a fair, representative sample. Mostly, you will want to concentrate on the active sports playing students – students who do not play sport will not care what the sports facilities are!

Option 1 may be unrepresentative, because only students with early morning lectures will be involved!
Option 2 is the best of the five suggestions but has practical challenges. You will get an ‘even spread’ of all students, but how will you get hold of each 20th student?
Option 3 will miss many students: not just those who are not computer literate, but also those who never look at the website and so do not know about the questionnaire.
Option 4 is unrepresentative because only students who are actively involved in sports at the college would respond. Those who do their sports elsewhere because they are dissatisfied with the condition of the facilities would have no chance of answering!
Option 5 may give a skewed sample, because if the meeting is called about a ‘problem’ only the dissatisfied students who think there is a problem will attend! The ‘silent majority’ may be ignored.

Now, what would be a proper method to hand out those questionnaires?
One possibility would be to question all students. This, however, would not be a sample but a census. The Namibian Housing and Population Census is conducted in this way; everybody is counted. If a proper sample is required, it needs to be selected by a variable that is independent of what is being studied. For students at a college one could take a certain mathematical property of the student number, perhaps pick student numbers that are divisible by 37. The assumption here is that the divisibility of a student number has no impact on the properties of the person who carries this number.

3. Presentation of Statistical Information

Here is a graph of the recent ‘catastrophic drop’ of the value of the Australian dollar against the Rand (or Namibian Dollar). Is this alarming (for Australians!)?

Graph 1

Not really. Let us look at the graph again, but this time with the scales:

Graph 2

(Graphs produced from information from www.oanda.com)

Not only do we see that this graph covers only 30 days, and therefore may show just a short term variation, but the vertical scale does not start from zero! The value of the currency is not dropping to zero but simply varying from 7.26 to 6.87, which is much less dramatic. Many graphs of financial data are of this type.

Now for another graph. This purports to show the number of drivers in fatal car crashes one year in America. It seems that teenage drivers (16-19) are safer than 20-24 year olds (fewer accidents) and drivers in their eighties are very safe! Misleading? Yes: 16-19 year olds have fewer crashes because they do not drive so much – many teenagers (still) do not have cars! Likewise, octogenarians! A better graph would show, say, fatal crashes per vehicle mile driven by each age group.
We then have:

(Above graphs sourced from the American Journal of Public Health as cited and reviewed on www.econoclass.com)

Here it is clearly seen that the extremes of age, teenagers and very elderly people, are by far the most at risk.

3D graphs and those with novelty graphics should be avoided, because the shapes can distract and mislead the reader. For instance:

Figure 1

This graphic is supposed to show the variation in the price of oil during the 1970s, in an original way. But the picture is misleading in many respects: the size of the drums, the perspective and the labelling hide the facts, which are that the price of oil jumped sharply in 1973 and then remained mostly constant for the other years. Notice for instance that the 1979 drum is roughly double the volume of the 1978 drum, even though the price increase was exactly 5%.

Figure 2 (From stats10fmsample.ppt from website www.suffolkmaths.co.uk and elsewhere)

In-text question: The kind of table above is well known, showing data similar to that first collected in World War 1, when it seemed that the issue of helmets actually increased deaths and injuries! Surely wearing a helmet, for soldiers or motorcyclists, cannot result in more injuries or fatalities? What can be wrong here?

Two phenomena may be correlated with each other, but correlation does not imply causation, and if it did, which was the cause and which the effect? For instance, let’s say there is a high correlation between teenagers taking drugs and those suffering from depression. Does the depression encourage drug taking or does the drug taking promote depression? Or are these just two parallel phenomena?

Another point to beware of is the ‘reliability’ of tests, for instance in medical diagnosis. We may be told that a test is ‘95% reliable’ but we need to know the rate of false positives and/or false negatives.
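As a rough illustration of why a headline ‘reliability’ figure is not enough, the sketch below works out how likely it is that a person who tests positive really has a disease. It uses the figures of the epidemic example in this section (a disease affecting 1 in 1000 people and a test with a 5% false positive rate). The function name and the `sensitivity` parameter are illustrative choices, and the sketch assumes that the test never misses a genuine case, which the text does not state.

```python
def probability_ill_given_positive(prevalence, false_positive_rate, sensitivity=1.0):
    """Chance that a person with a positive test result really has the disease.

    prevalence: fraction of the population that has the disease
    false_positive_rate: fraction of healthy people who wrongly test positive
    sensitivity: fraction of ill people who correctly test positive
                 (assumed to be 1.0 here; the text gives no figure)
    """
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Figures from the example: 1 person in 1000 is ill, 5% false positives.
p = probability_ill_given_positive(prevalence=1 / 1000, false_positive_rate=0.05)
print(f"{p:.1%}")  # prints 2.0%
```

Changing `prevalence` to a more common disease, for example 1 in 10, shows how strongly the meaning of a positive result depends on how rare the disease is.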
Note it! / Warning: A false negative result from a test, in the case of a medical diagnosis, is the result that you do not have the disease, when you actually do. A false positive result is the result that you have the disease, when you actually do not.

Let’s say there is an epidemic of a new disease, which affects 1 in 1000 people. The test which has been developed for the disease has a false positive rate of just 5%. So if you are diagnosed positive there is a 95% probability that you have the disease, correct? Surprisingly, no: the chances are only about 2% that you have it! On its own the test is nearly useless, because the false positive rate is far greater than the incidence of this rather rare disease.

[Explanation: 1 in 1000 have the disease. Because of the false positives, about 50 in every 1000 people tested will be diagnosed with it although they are healthy. Therefore, being diagnosed with the disease carries only about a 1 in 50 chance of actually having it, which is just 2%!]

In the famous trial of O.J. Simpson in the 1990s, the defence attorney, although he admitted that O.J. was abusive towards his girlfriend, produced statistics to show that only 5% of abusive men go on to kill their girlfriends, therefore O.J. was 95% likely to be innocent. O.J. was duly acquitted. What nobody told the jury was that, given that an abused woman was killed, the killer was 90% likely to be her boyfriend. This is a case of conditional probabilities: the probability of some event in isolation can be very different from its probability given that another, related event has actually occurred.

Great misconceptions attach to the currently fashionable criminal identification technique of DNA matching, and formerly to fingerprint matching. We are often told that the match is reliable to ‘one in a million’ etc. John Paulos, the author of the excellent books “Innumeracy” and “A Mathematician Reads the Newspaper”, tells the following story: A fingerprint or DNA sample from the murder scene matches that of Mr. Smith, so he ‘must be guilty’. But wait.
The relevant conditional probability is the chance that a person is innocent given that his prints or DNA match those from the crime scene. Assume the crime took place in a city of 2 million people. Assume all the residents of the city have their prints or DNA on file. Assume that 3 people out of the 2 million have prints or DNA that would match the prints at the crime scene. Two of these people would be innocent, the third the guilty one. Thus, the conditional probability of a print match, given that the person is innocent, is two out of 2 million (your 1 in a million statistic!). By contrast, the conditional probability that a person is innocent, given that his prints match those at the crime scene, is 2 in 3. A 67% chance of innocence, not 1 in a million! An example of the surprising conclusions from statistics!

Conclusion: So beware of statistics!

Note it! / Warning: For statistics to be reliable:
- they must apply to the correct situation;
- bias must not be present;
- the sample size of data must be large enough;
- the test must be continued for long enough;
- you must beware of ‘conditional’ probabilities;
- we must be sure we are observing some real effect among the random ‘background’ (engineers would say the signal must be stronger than the ‘noise’).

Activity 5
Time Required: 30 minutes

Discuss the logical errors in the following:
(i) About 35% of all road accidents are due to drunk driving. Therefore, 65% of accidents, almost twice as many, are caused by sober drivers. Therefore, it is better to have some drinks before driving; it will cut your chances of an accident by half.
(ii) It is very unlikely that there will be a bomb on board an aeroplane. Therefore, it is almost impossible that there will be two bombs by chance simultaneously on the same plane. Therefore, to ensure your safety from possible terrorist attack while flying, carry your own bomb on board the plane (you do not detonate it, of course).
(iii) About 35% of all car accidents in Windhoek involve taxi drivers, far more than any other profession. Therefore, taxi drivers are the worst drivers in the city.
(iv) In recent years there has been a great improvement in educational standards, so that statistics now show that more than 60% of students have marks above average! (Statement by an education minister.)
(v) Before signing the agreement, we will wait to see whether the cessation of violence is permanent! (Former British prime minister.)

Feedback: Check for sampling errors, conditional probability errors, and the commonsense meaning of words in the above! For instance:
In (i), the statistical sample does not include all the drivers who do not have accidents, most of whom are sober!
(ii) is a conditional probability fallacy. Your carrying a bomb on board a plane does not affect the probability of one already being there.
In (iii), taxi drivers do far more driving than normal private motorists so they will be involved in more accidents. You should rather take the number of accidents per thousand kilometres driven, as in the example earlier in the unit.
(iv) is not a fallacy: you can indeed have 60% of the students being above the average. Three students having 80%, 80%, and 20% respectively have an average mark of 60%, and two out of three (67%) are above average. However, this does not mean that the average itself has improved. The implicit “unspeak” of the minister is not supported by facts!
In (v), if you wait to see whether anything is permanent, you would wait for ever! (Maybe this is what the prime minister intended!)

References
Huff, D. (1991). How to lie with statistics. London: Penguin.

Keywords/concepts
Logic: A system for drawing conclusions from true premises; a system for testing the truth and falsity of propositions; a symbolic algebra for the above.
Statistics: Either: collated or summarized numeric information relating to some situation or process, or the study of predictions made by the mathematical theory of probability.
Fallacy: Reasoning that looks as if it were logically or statistically correct, but actually isn't.

Unit Summary
In this unit you learned about the theory and principles of elementary logic, how to draw valid logical conclusions and how to recognize invalid conclusions (logical fallacies). The principles of basic statistics were introduced, and examples given of how statistical theory can lead to surprising and counterintuitive results.

Unit 8 Intellectual Property, Plagiarism and Copyright

Introduction
In February 2011, the German defence minister, a popular politician with an aristocratic background and a Ph.D., who was widely expected to one day lead his party and perhaps his country, suddenly had a problem. Some investigative journalists had noticed that his university doctoral thesis contained passages which had simply been copied from other sources, without acknowledgment. The minister initially denied this, saying that at most there was a footnote and referencing problem, and his boss, the Chancellor, stood by him, saying she had employed a minister, not an academic! But the furore in the media steadily grew, and more evidence was produced of wholesale copying in his thesis. Within a few more days, the minister resigned, his career ruined.

What was going on here? The minister was evidently guilty of plagiarism, the act of taking other people's work or ideas and passing them off as your own. It is a kind of deception to boast that you are the author of some intellectual work, while most of it is actually not yours. It is not a legal crime but it is the curse of the academic world, because if universities award qualifications to students or researchers in respect of work which is not their own, the qualifications become worthless.
All reputable academic institutions therefore have zero tolerance for plagiarism.

Plagiarism is sometimes confused with copyright. The two have nothing directly to do with each other, although plagiarism may involve copyright violation. Copyright is the legal framework in the wider world where creators of intellectual work (musicians, authors, photographers, film makers) do, in theory, have their work legally protected, so that people who want to use this work have to get permission or pay to use it; and infringers (people who violate the laws of copyright) can be stopped from doing so, fined or imprisoned. Copyright violation is theft, not of physical but of intellectual property, and it is a legal crime.

This unit explains the principles of intellectual property and their relevance in the age of the Internet.

Objectives
Upon completion of this unit you will be able to:
- explain what plagiarism is and the problems which it causes;
- discuss how to avoid plagiarism, and how to reference your work properly;
- explain what copyright is, and its general legal implications;
- explain the meaning of other terms such as trademarks, patents;
- discuss the importance of intellectual property and its protection in the modern world;
- explain how to properly reference, cite and acknowledge sources in your work.

Prescribed reading
Bothma, T., Cosijn, E., Fourie, I., Penzhorn, E. (2011). Navigating Information Literacy, 3rd ed. Pearson, ISBN 978-1770259676. Chapters 8 and 9.

Additional reading
Intellectual Property, Wikipedia, http://en.wikipedia.org/wiki/Intellectual_property Accessed July 2011

1. The History of Intellectual Property Protection
Some people bemoan the effect the Internet has had on intellectual property, especially the impact on the music industry, but the need to protect intellectual property has always been driven by technology.
In the middle ages in Europe, when all books were painstakingly handwritten and hand-painted by monks, and only available in tiny numbers, the copying of a book was obviously as much effort as producing the original, so that the monks (presumably) were not worried about infringement of their copyright! The invention of practical printing, around the year 1450, changed all that. Any idea or work could be printed off in multiple copies a large number of times relatively cheaply, and the copies sold. Immediately the writers of the day became concerned about the ‘piracy’ of their works, because no laws existed to prevent this copying. Shakespeare himself was so concerned about this that he forbade the publishing of most of his plays during his lifetime, to avoid ‘pirate’ performances of them. These plays were only issued in edited form after his death, and we do not know what the original versions really were. Long afterwards came the photocopier, which anyone in theory could operate to make unlimited copies of any document or book. Music performances before ‘technology’, up to 100 years ago, were all live, and to hear them you physically had to go to concerts and pay. Much later sound recording technology came along, and musicians and music promoters were worried. People could listen to recorded music without paying to attend performances! In the early days of cinema, you could only see movies in movie houses, for which you of course had to pay. There was control over the situation. When TV and video recorders were introduced, movie makers panicked: people could watch movies at home, and over and over again if desired! All these problems were resolved. A part of the sale proceeds of records (so-called royalties) went back to the music artistes and producers. Similarly for movies recorded on video tape, although there was concern over ‘pirate’ copies. 
The music industry reckoned that they benefited from a recorded music audience which was enormously larger than the audience for live performances.

Then along came the Internet, offering a massive amount of media of every kind online, accessible and downloadable anywhere in the world, all part of the early culture of the Internet, which implied that all content was open and free. Sharing sites such as Napster allowed anyone on the Internet to download any music from your hard drive, and vice versa! This time the music industry really did panic. Massive copyright violation was alleged and massive lawsuits followed. Napster and similar sites were closed down.

Again, the position has been regularised in a way. The most popular music download site currently is probably Apple’s huge iTunes store, where almost any musical item can be downloaded for an average of US$ 1. More than a billion items have been downloaded thus far.

In-text question: Have you downloaded music and movies from the Internet? Did you pay for them, or did you make sure not to violate anyone's copyright?

Activity 1
Time Required: 1 hour
Make a note of your two favourite movies, books or music (single numbers or albums) at the moment. Search around online and see if you can download copies of these, without paying! What are your experiences? Do you think everything was legal?

Feedback: I am always impressed by the way my teenage daughter seems able to download whatever movies or music she wants from the Internet, presumably without paying. Be aware that simply because you are able to do so, it does not mean you are not violating copyright!

2.
Copyright, patent and trademark 2.1 Copyright Copyright is a legal system which recognises the creator of intellectual products such as literature, music, photographs or film, as the owner of copyright, with the right of either exclusive use or to regulate the right of others to use the products for a certain period of time (not indefinitely). There is copyright law in all (organised) countries of the world. Thus the publisher of a book is generally the copyright holder: others are not allowed to copy the book (except in limited cases for research or quoted in short passages for literary criticism) or reproduce it in any way. The copyright holder might sell the rights (for a lot of money!) to turn the book into a movie. If the book is a novel, and another novel comes out with a suspiciously similar story line or characters to the first, the copyright holder may be able to sue the second author for copyright violation. There is a continuous tug of war between the two opposing outlooks – those who want the maximum and longest possible protection for copyright owners, and those who want the maximum freedom of access to information. Copyright legislation in countries such as the US varies between these extremes, the idea being to offer copyright owners fair rights but not too many or for too long. Lastly, copyright subsists only in actual products. There is no copyright in an idea for a book or film – the book must be written or the film must be made. This study guide, by the way, is under copyright. You may not reproduce it without permission. The exact conditions of the copyright you find on the second cover page. In-text question Both sides of the question: as a consumer you probably resent having to pay for music etc seeing as the culture of the Internet is at root free access to all. But how would you feel if you wrote a book, some music or a computer game, and pirate copies of it appeared online for all to download for free? 
2.2 Patent Patent is the equivalent to copyright in the domain of material inventions. The inventor of a new machine, device, design, algorithm or medicine can apply for a patent for the invention to be protected. Unlike copyright, the idea itself can be patented – the patented machine does not have to be constructed. But anyone else wishing to manufacture the machine must pay fees to the patent holder. Computer software lies in a kind of in-between situation. As a ‘media’ product it should come under copyright law, but it is fashionable these days to regard it as a ‘product’ analogous to hardware, so that it should then be patented. For many years in the early days of computers, the concept itself was not legally defined so that protection of software writers against copying was difficult. Technical measures were resorted to, like the refusal to release source code, rather strange methods to prevent the copying of disks, and hardware devices called dongles looking like USB sticks which had to be plugged into the computer before the software would run! (Based on the idea that hardware is more difficult to copy or simulate than software). Nowadays most countries have sophisticated legislation covering software piracy, data confidentiality and illegal use of computer systems such as unauthorised access and hacking. In earlier days hackers could only be charged with the theft of electricity! Now, whenever you download software, even free software, there is usually a box containing complex legal terms and conditions of use, with an ‘Agree’ box which you have to tick before proceeding. Few people bother to read these conditions, but maybe they should, because they are legally bound by them. When we say most countries, there is currently a good deal of tension between the West and countries such as China, which although in theory have copyright and patent legislation, are the sources of vast amounts of counterfeit (imitation) goods and pirated media and software. 
This results in huge losses of revenue for legitimate manufacturers and publishers, and in the case of counterfeit medicines is of course very dangerous.

Certain media are in the public domain: media on which the copyright has expired, or which the author has given permission for anyone to use for any purpose.

The question of copyright has been one of “all or nothing” for most of the history of intellectual property: either a work is under copyright, in which case no significant reproduction is allowed, or the work is in the Public Domain, in which case anything may be done with it. Originating from the question of how to share computer code, a third way of licensing has been developed in the last decades. This third way is sometimes called “Copyleft”. Copyright reserves all rights. Public Domain reserves no rights. Copyleft reserves some rights.

Creative Commons
A particularly important implementation of Copyleft is the work of a US non-governmental organisation, Creative Commons. Many recent works are now licensed under Creative Commons licenses, including Wikipedia and material on Flickr, YouTube, government web sites, Al Jazeera productions, and a lot of computer software. How does Creative Commons (CC) work?

Figure 1: An icon labelling a piece of work to be under the Creative Commons License, in this case CC-BY-NC-SA

CC licenses are readable by laymen. The name of a license indicates what one can do with the works that fall under it. There are five keywords, the presence or absence of which indicates whether a certain right is reserved or waived:
- All CC licenses begin with the letters CC (Creative Commons) to indicate that a standardised license is applicable.
- All CC licenses have the BY keyword. This condition specifies that the licensor requires attribution of their work (BY as in “written by”).
- Some CC licenses carry the NC (non-commercial) keyword. This indicates that the work may not be used for profit.
- Some CC licenses carry the ND (no derivatives) keyword. This indicates that changing the licensed work is not allowed.
- Some CC licenses carry the SA (share alike) keyword. This indicates that all derivative work must be licensed under comparable conditions. The SA condition forbids taking a text or image under copyleft, changing it, and then putting it under more restrictive copyright conditions.

A typical Creative Commons license thus reads CC-BY-SA, indicating that all legal uses require attribution (BY), and that any changed version must be put under a similar license (SA). The absence of the NC and ND keywords indicates that selling and changing the work are both permitted. Wikipedia content is licensed under CC-BY-SA.

Activity 1
Time Required: 10 minutes
You want to release an essay of yours under a Creative Commons license. You want it to be attributed to you, you want to allow others to use it commercially, but you do not permit your essay to be altered. Which keywords would you have to use?

Feedback: CC-BY-ND. The absence of the NC keyword allows commercial use. The SA keyword does not make sense because you do not allow any changes to your work.

The actual work that the Creative Commons NGO does is to draft legal documents that ensure that the underlying “fine print”, the license agreement, exactly reflects the human intention that is expressed in the keywords.

2.3 Trademarks and logos
Trademarks would seem to be a simple aspect of intellectual property but are important because they represent and symbolise a product, company or service. Trademarks can include logos, the name of the product if not a standard word, graphics, presentation, colours or packaging shapes. They are protected by their own legislation. Many are iconic, decades old, and known worldwide, such as Coca-Cola, ‘Colonel Sanders’ of Kentucky Fried Chicken, the golden arches of McDonald’s, Disney characters, and Apple's apple with a bite taken out.
They are fiercely protected, world wide. Nobody is allowed to produce a competitive product using a registered trademark or a ‘confusingly similar’ one. It is said that some years ago, before ‘McDonalds’ had any branches in South Africa, a person whose surname really was McDonald attempted to open a hamburger joint there, called McDonalds. He was contacted by lawyers from McDonald’s in America who put a stop to the venture! Recently, in Namibia, there was a brand of cola on the market, in a red can, with the name of the cola in wavy white writing. It is surprising that this product did not attract the attention of lawyers representing the Coca Cola Corporation. 3. Plagiarism Plagiarism, according to the Oxford English Dictionary is “the copying of another person's ideas, text, or other creative work, and presenting it as one's own”. It is not illegal in itself, which distinguishes it from all the concepts discussed above. In academia however, it is despised, and certainly regarded as an ‘academic crime’. If proved, it will speedily ruin a writer’s, academic’s or journalist’s reputation. (See story at beginning of unit). Journalists are paid by the number of lines of text they produce. They will therefore write the same things again and again, reusing whatever they think might be suitable. Academics, on the other hand, are expected to publish only new results; regurgitating old wisdom again and again is frowned upon in the scientific community. They will therefore rather reference whatever is already known, and after a short introductory paragraph they will immediately get down to business. That is why academic papers are sometimes hard to understand – they assume that you will read all referenced work, if you have not already done so. As a student of a tertiary institution, you are a member of Academia. You are therefore expected never to plagiarise – not in any homework, essay, project, or thesis. 
If you plagiarise anywhere in your course work (including ICT), you will immediately fail the course. If you do it more than once, you will be expelled from Namibia University of Science and Technology, and your place of study will be made available to someone else. Do not say you have not been warned!

Activity 2
Time Required: 30 minutes
(i) Should every lecturer, at the end of each lecture, give a list of sources, to avoid charges of plagiarism?
(ii) If you give a talk on your general experiences of the coming of independence in Namibia, is it plagiarism unless you quote a particular book on the history of Namibia?
(iii) Is it plagiarism to quote something you remember reading, but you cannot recollect where?
(iv) If you take a picture of a well known person, and use the picture on your web site, do you need that person’s permission?

Feedback:
i. If the lecturer has actively used these sources to compile the material for his/her lecture, yes.
ii. No, you are not utilising any source. In fact you have the rights over your own experiences!
iii. Actually yes, as in the age of Google, it is not difficult to find the source of something you read and are consciously quoting.
iv. No, you own the copyright of any picture that you take. The subject of the picture does not own it. You are, however, bound by common law, for instance on privacy: spying through a fence to take a picture of your neighbour in a swimming costume is not permitted!

Firstly it must be stated that in the age of information overload, where everything which can be said about any topic seemingly has been said, it is not plagiarism to discuss a topic in general terms, even if you are not saying anything particularly original, or if you are repeating facts which are common knowledge, such as that Namibia achieved independence in 1990.
But if you are researching the circumstances leading up to independence, and you have referred to a book or article on the subject, and especially if you want to quote a passage from the book, you must reference it, otherwise you are guilty of plagiarism (utilising someone else’s work but not acknowledging the fact). Of course the coming of the Internet, word processing software and ‘soft copies’ (the ‘age of cut and paste’) has made plagiarism enormously easier. Instead of copying a passage from a book laboriously by hand, you simply find the online version, and copy and paste. Students now, when given an essay topic, often simply Google it, take some passages from search results, paste them together and hand in, without even bothering to coordinate the fonts of the copied text! This obviously infuriates academics, and is becoming a major headache. Wholesale plagiarism of course means that a student has contributed little or no effort, shown no originality, and gained no insight into the relevant topic, so the essay, assignment or project is worthless, apart from the dishonesty and deception involved. Activity How long? Activity 3 Time Required: 1 hour Take any ‘striking’ passage from an article, magazine or book you are reading, and google it (using the ‘exact form of words’). The results could be interesting. Do the hits you get come just from the publication you are reading, or also from other earlier sources? Did the author of the publication you are reading acknowledge them? Now, rephrase the article passage. Make sure that you use different words and a different sentence structure. It is extremely challenging to capture the same meaning but use different words! You might need to practice this several times. Note – earlier hits do not necessarily imply plagiarism – they could be from an earlier edition of the same work. Feedback Students are far from being the only perpetrators of plagiarism. 
In one amusing incident, a Canadian party leader who later became prime minister gave a speech on the Iraq war in parliament in 2003 that was almost word for word the one given by the Australian prime minister a few days earlier. Presumably the speech writer was ‘in a hurry’, a frequent excuse of plagiarists. Mr. Tony Blair's government, in one of its justifications for the Iraq war, lifted large parts of a postgraduate student's work for its material! Shakespeare lifted most of the story lines for his plays from earlier works, so far as is known without acknowledgment, but it does not mean we should do this today! (On a much more modest scale, the writer was surprised recently to find parts of an article he had written for the Air Namibia magazine reproduced word for word in a tourist guide for Windhoek!)

All this is definitely not to say you should not use, refer to or even quote the work of others in your assignments or projects. Your work would be poorer without consulting others who have previously written and researched in the same area. But these references must be properly listed and cited, to give a fair chance to people who actually want to read the background information that you are referring to. It is for this reason that you must not just give a vague reference like “The Namibian” or “Google” or “Internet”. How on earth could anyone find the piece of information that you used? Instead, give as much information as you possibly can, e.g. author, title, publisher, pagination, date, and if the work is online, also the URL. This way you give the potential reader (and your tutor-marker!) the chance to locate the article in question, either on the Internet, or in a public library.

That said, obviously your work should not mostly consist of quotations from other sources or references to other sources, unless your project was specifically a literature review.
Otherwise, which part of the assignment should be marked in your favour, if you actually did not write anything?

Remember that when including photographs etc. it may not be enough simply to quote their source: they may be subject to copyright, in which case you need to obtain the copyright holder’s permission to use them. Pictures of people can in general be used without permission; you do not own the rights to your own likeness.

It may be asked, in the age of the Internet, among such a huge mass of material, how plagiarism would ever be detected. The answer is easy: also by virtue of the Internet and its massively powerful search engines, any suspect passage can be searched for. You just have to google it!

Reflection: As they say, if you can find it on the Internet, so can your lecturer. The Internet is very democratic!

Even without this, a passage in a student essay whose style is markedly different from his/her usual style of writing (or in a different font!) is immediately suspect. There are specific software packages such as ‘Turnitin’ to which you can submit a thesis or text extract; the software searches its massive database for matches, i.e. possible plagiarism, and gives an estimate of the degree of plagiarism. Every piece submitted to Turnitin becomes part of its ever expanding database, against which new texts can be checked. Unfortunately the software is rather expensive for smaller academic institutions.

All reputable academic institutions now take a very tough line on plagiarism. The NUST has a code on plagiarism, and students whose work is suspected to involve plagiarism may face disciplinary proceedings, be given zero marks for their work, and even be suspended from the Namibia University of Science and Technology. If your excuse is that you do not know how to reference, please learn how to do so in the next section.
3.1 Referencing and citation

It is extremely important that you master this area; otherwise, your professional writing, whether in business or academia, will never be respected, and your work will be regarded as sloppy at best, dishonest at worst. Perhaps you may find this boring, just as some software developers find writing documentation boring, when you are keen to get on with your ‘cutting edge’ activities, but proper acknowledgment, citation and referencing of works you have consulted in your writing is essential to professionalism and integrity.

What should be referenced? Basically, any identifiable item, whether text or image, which is not your own work. Sometimes you may not even be sure, but if you find yourself using some felicitous phrase that is not in your typical style of writing, or words not in your usual vocabulary, it may be that you are quoting something ‘unconsciously’ – Google it, and see if you can find the source. Rather be safe than sorry. It is better to cite a source ‘unnecessarily’ than to be accused of plagiarism.

It is important that the references be listed according to a consistent, clear and understandable method. (Though any method is better than nothing!) There should be a reference whenever a source is quoted or referred to in the text, possibly with a footnote, and the corresponding reference listed in full at the end. The method favoured at the NUST is the APA (American Psychological Association) system. It is very simple, consisting mainly of the following:

In-text referencing consists of the author surname and date (in brackets).

In the reference list at the end of the work, authors are listed alphabetically, with the date of publication appearing in brackets after the name and initials of the author, followed by the title. Second and further lines are indented. For example:

Pressman, R. (2009). Software engineering: A practitioner’s approach. New York: McGraw-Hill.

This is a reference to one book by a single author.
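The two rules above (surname and date in the text, an alphabetical list with hanging indents at the end) can be sketched in a few lines of Python. This is only an illustration, not an official APA tool: the class name `Book` and the functions `in_text_citation`, `reference_entry`, `hanging_indent` and `reference_list` are hypothetical names invented for this example, and the punctuation simply follows the single-author book pattern "Surname, Initials (Year). Title. City: Publisher." Other source types (journals, web pages, multiple authors) are not covered.

```python
import textwrap
from dataclasses import dataclass


@dataclass
class Book:
    """One book by a single author (hypothetical helper type)."""
    surname: str
    initials: str
    year: int
    title: str
    city: str
    publisher: str


def in_text_citation(book):
    # In-text reference: author surname and date, in brackets.
    return f"({book.surname}, {book.year})"


def reference_entry(book):
    # Full entry for the reference list at the end of the work.
    return (f"{book.surname}, {book.initials} ({book.year}). "
            f"{book.title}. {book.city}: {book.publisher}.")


def hanging_indent(entry, width=60):
    # First line flush left, second and further lines indented.
    return textwrap.fill(entry, width=width, subsequent_indent="    ")


def reference_list(books):
    # Authors are listed alphabetically by surname.
    ordered = sorted(books, key=lambda b: (b.surname.lower(), b.year))
    return [reference_entry(b) for b in ordered]


pressman = Book("Pressman", "R.", 2009,
                "Software engineering: A practitioner's approach",
                "New York", "McGraw-Hill")
anderson = Book("Anderson", "C.", 2009, "The Long Tail",
                "New York", "Random House")

print(in_text_citation(pressman))
for entry in reference_list([pressman, anderson]):
    print(hanging_indent(entry))
```

Word processors and reference managers do this formatting for you; the sketch is only meant to show that a citation is structured data (author, year, title, publisher) arranged by fixed rules, which is why leaving out any of the parts makes a source hard to trace.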
A printed book or journal is permanent, but what about referencing a web site, which may disappear off the Internet tomorrow? The full URL should be given, and the date retrieved. Since web sites can change at any time, or even disappear, as we said, it may be a good idea to print a copy of the web page referenced.

For fuller information on how to reference books by multiple authors, works with no stated author, periodicals etc., please consult the APA Reference guideline booklet available from the NUST Library. This booklet forms part of the official documentation for the ICT course; if you do not yet own it, purchase a copy.

Reference list

The APA style guide prescribes that the Reference section, bibliographies and other lists of names should be ordered by surname first, and mandates inclusion of surname prefixes. For example, "Martin de Rijke" should be sorted as "Rijke, de M." and "Saif Al-Falasi" should be sorted as "Al-Falasi, S." For names in non-English languages, follow the capitalization standards of that language. For each of the source types below a hanging indent should be used, where the first line is flush to the left margin and all other lines are indented.

References

Bothma, T. et al. (2008). Navigating Information Literacy. Cape Town: Pearson Education South Africa.

Keywords/concepts

Intellectual property: The idea that ‘intangible’ products or creations of the human mind should be protected from theft and misuse in the same way as physical property. Upheld by international law and treaty.

Copyright: The framework of laws that protect intellectual property in the area of writing, music, images and films (also computer software), and the rights of the author.

Patents: The framework of laws that protect intellectual property in the area of design, inventions, devices, machines and processes, and the rights of the inventor.
Trademarks and logos: Images, phrases, design or packaging which define, identify and promote particular commercial products, protected by trademark law.

Plagiarism: The act of appropriating substantive parts of another author’s creative work and passing it off as one’s own.

Unit summary

Summary

In this unit you learned the meaning of the term intellectual property, and how this includes the topics of copyright and patents. Nearly all countries have enacted laws to protect intellectual property and punish infringers. You learned about the problem of plagiarism, which amounts to passing the work of others off as your own and which, although not illegal, is a great problem in academic institutions, because it renders academic work worthless. You learned how to avoid plagiarism by properly referencing your work, and citing and acknowledging the work of others in a standard agreed format.

Unit 9 Web 2 and 3, E-business and the 'Long Tail'

Introduction

There has been a lot of hype and some cynicism over the terms Web 2 and Web 3 – they refer to the changes the WWW has experienced since its creation around 1990. The terminology is a play on the version numbers of computer software, where version 2.0 is a decisive improvement over version 1, but for the WWW it is more of a progression than a jump from 1 to 2 to 3.

Berners-Lee, the inventor of the WWW, had wanted it to be interactive from the very beginning, so that users could react to and change the pages they were reading. Technical difficulties prevented this, so that for several years the Web was ‘read only’ – you could select the page you wanted, but it was like an unalterable page of a newspaper or magazine. You simply consumed its content. You could already have your presence on the web by creating your own web site, but it was a technically demanding and expensive business.
With cheaper Internet access and data storage came sites, mainly on-line news sites, on which you could post comments, much like ‘letters to the editor’ of a newspaper, and sites on which you could ‘register’. Since around 2005 a flood of sites has emerged, such as Flickr and Youtube, on which anyone can post their pictures or videos, along with the social networking sites whose content is defined and produced by their users – millions of them – and an online encyclopedia which is written by ‘the people’.

In more practical terms, the amount of business conducted over the Internet grows apace. It is difficult to estimate but amounts to billions of dollars a year world wide, and is possibly about to overtake conventional business to business dealing (B2B) and retailing, i.e. business to customer (B2C). Advantages to consumers are numerous – more competition between vendors, lower prices, and availability of unusual items you would not find in bricks and mortar shops – this is the ‘long tail’.

Objectives

Upon completion of this unit you will be able to:
explain the evolution of the web, and the meaning of the terms Web 1, 2 and 3;
discuss and utilise the best and most useful features of Web 2 and social networking sites;
recognise and avoid the downside of social networking sites;
find the goods you want, and purchase them, if necessary, safely on the Internet;
explain what is meant by the ‘long tail’.

Prescribed reading

Burrows, T., 2008, Blogs, Wikis, MySpace, and More: Everything You Want to Know About Using Web 2.0 but Are Afraid to Ask, Chicago Review Press, ISBN: 1-5565-2756-X

1. The Evolution of the Web

1.1 Web 1

In the beginning web sites used to be static. Some privileged, highly computer literate people put up content on private or official sites, and the majority of the web surfers simply read what was put up for them. There were two major reasons why not everybody would contribute content.
First, it was a rather difficult procedure: users would have to know how to use different protocols, download software, install it and use it for uploading content. Practically, only computer professionals, university students and a few committed private people would have the time and will to acquire the necessary knowledge and skills.

Second, Internet connection speed and data storage capabilities were far behind today's standards, and web site owners had to pay both for the storage of their content and for any traffic caused by site visitors. To make videos or other bulky files available for everyone was very costly, and if a large number of web surfers did indeed download them, the host of the files would receive a hefty bill. As a result, content used to be mainly in text form because text does not occupy much space on a computer's hard disk.

1.2 Web 2

Gradually prices for data storage and Internet connectivity came down, allowing more and more people to actively participate. At the same time, business ideas emerged that were reliant on masses of people browsing sites. Business models based on advertising require a lot of visitors, and soon the service providers realised that they could make money from hosting pages whose content attracts people. The logical consequence was to reward, rather than punish, people for publishing content on the Web. Now, someone could make money from writing on the WWW!

For every visitor on a page who clicks on a banner to go to a commercial page, the owner of the commercial page pays a tiny amount (typically well below N$1) to the owner of the page where the banner is placed. This principle has been valid ever since, and people who write content that is read by thousands of web surfers have a chance to get a few dozen, maybe hundreds, of clicks on the adverts on their page. They make money by simply writing interesting content; popular bloggers and site operators actually make a living from it.
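The pay-per-click principle is simple arithmetic, and a small Python sketch may make it concrete. All figures below are assumptions chosen for illustration only (5 000 visitors, 2% of whom click an advert, N$0.50 paid per click); they are not real rates from any advertising network, and the function name `advert_income` is made up for this example.

```python
def advert_income(visitors, click_rate, payment_per_click):
    """Estimate income for the owner of the page carrying the banner.

    visitors          -- number of people who read the page
    click_rate        -- fraction of visitors who click an advert (0.02 = 2%)
    payment_per_click -- amount the advertiser pays per click, in N$
    """
    clicks = visitors * click_rate
    return clicks * payment_per_click


# Hypothetical example: 5 000 visitors, 2% click, N$0.50 per click.
income = advert_income(visitors=5000, click_rate=0.02, payment_per_click=0.50)
print(f"Estimated income: N$ {income:.2f}")
```

With these assumed figures, 100 clicks earn the page owner N$50. The sketch shows why the business model depends on masses of visitors: the payment per click is tiny, so income only becomes meaningful when the number of readers is very large.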
Figure 1: Web 2.0

In-text question

What is the metaphor behind the coining of the term Web 2.0?

It probably originated as a journalist’s slogan, but just as software is released in successive versions, with version 2.0 usually being a complete upgrade on any version 1, the idea is that Web 2.0 is something fundamentally different from the Web as it was before (i.e. non-interactive).

Blogs, short for web logs, are the ‘painless’ alternative to producing your own full web site. They emerged just before the millennium, and soon software became available to make your own, with almost no skill required. You do not need an Internet host. You will not be alone – Wikipedia estimates that as of February 2011 there were 154 million blogs. The sum total of this massive amount of continuously generated information, opinion and of course nonsense is called the blogosphere. Many bloggers write an opinion piece on the news of the day, which overlaps with the function of newspaper columnists and gives them the name of ‘citizen journalists’. Actually, the dividing line is even more blurred, since regular journalists themselves write blogs. Whole books have been written as blogs (these are of course ‘blooks’!) and even movies made out of them – the cooking movie Julie and Julia is a case in point.

In autocratic parts of the world, bloggers can get into trouble, in the same way as journalists. In Egypt and Burma, among other countries, bloggers have been jailed for ‘insulting the head of state’. In most legal systems, ‘publications’ are subject to the law of libel, but it all hinges on whether ‘blogs’ constitute legal publications. At the time of writing, the Communications Act has not yet been signed into law, and even if it were, it says nothing about the status of bloggers, so the legal position of local bloggers, and the extent to which they are subject to the laws of libel or other interference, is not clear.
Please use common sense – observe normal principles of etiquette – discuss issues strongly but do not make personal attacks in your blog. At the same time, do not fear to use your constitutional right to freedom of expression.

Activity 1
Time Required: Initially about 30 minutes, but indefinitely thereafter!

Create your own proper blog. It can be simply your online diary, containing as much personal information as you wish (but don’t post your physical contact information), or if you have some special interest or knowledge, you can share this with the Internet community, i.e. the whole world! It is more fun than a paper diary, and more fulfilling and professional than just posting ‘tweets’. But create a theme, and try to keep it up on a daily basis! [The writer’s blogs are www.namiblogger.net and www.namibnews.net – unfortunately he does not practise what he preaches and these are not updated very often!]

Feedback

Go to www.blogspot.com – the Blogger website – and follow the instructions. Since Blogger is owned by Google, any Google or Gmail account you have will be enough to register, or you can create a new profile. Send a link to your blog to the coordinator of the course, with the opportunity for him/her to give feedback on it!

Figure 2: "Map" of the Web 2.0 landscape, xkcd.com.

Many of the above sites have disappeared, and many others will have arisen since the time of writing. They cover almost every conceivable way in which humans might want to communicate, voice their opinions, ‘share’ their experiences or showcase their talents. The best known of these will surely already be familiar to you:

Youtube, the massive video archive. Missed a bit of last night’s TV programme? Want a video of a talking cat? It will be there. Did some politician make a gaffe, or arrive at a function having seemingly had a bit too much to drink? It will be on Youtube, preserved for all eternity, for anyone to see, as many times as they like. Popularity?
The theme song of the 2010 World Cup has been accessed and viewed 300 million times!

Want to keep a properly sorted collection of all the interesting web pages you have visited, share them with friends, and also see what other people’s favourite sites are at the moment? Del.icio.us and Digg are for you.

Wikipedia, an online encyclopedia with more articles in more languages than all other conventional encyclopedias put together, and written entirely by ‘the people’, is the subject of the next unit.

Google, either in itself or through the companies it has acquired, is the centre of a whole family of on-line facilities. (Did you know you can create documents without the need for Word, store them online in Google’s ‘cloud’ and let colleagues collaborate on them from anywhere?)

Finally of course comes the stunning success of the social networks: Myspace, then Facebook, LinkedIn, and Twitter. Facebook now has 900 million users around the world, with over 40% of the American population having an active Facebook account.

A few years ago, it was a great achievement to track down an old school friend from years ago, or even a long-lost brother – taking weeks or months of patient investigation, following up leads and old addresses etc. Now, provided you remember the name, all you have to do is look up his/her Facebook profile! Facebook must tap into a deep human desire to communicate, to talk about oneself, to bare one’s soul, and to be part of a group, even if that group is the whole of humanity.

Most readers of these notes will already have a Facebook account. It can be great fun, but please utilise the security features, usually to restrict information on your personal details to your ‘friends’, and even here, avoid disclosing your physical address. Actually, Facebook has progressed far beyond a site where young people post silly pictures of themselves.
Even large ‘serious’ organisations have their own Facebook sites, because they realise that ‘that’s where everybody is’. Most users will look first for the Facebook site rather than the site of the organisation itself. The organisation’s main web site will carry an icon to ‘follow us on Facebook’, and users everywhere have the opportunity to say they ‘like’ the site on Facebook. A recent article on the Business Network (Bnet) suggested that corporations should shut their main sites and just use Facebook! Just as the Internet became almost synonymous with the web, the web is becoming almost synonymous with Facebook. Some keyboards now have a Facebook button, to take you straight there without even the chore of opening a browser.

Twitter is a ‘microblogging’ site, where users can post messages of up to 140 characters (a limit chosen to be compatible with most cellular texting systems) about anything and everything. In fact, the sites have converged – Facebook, which originally was a fairly static profile site, has become more ‘Twitter-like’, with the ability to send short messages, ‘poke’ other users etc. This is also fun, except that to the outsider most ‘tweets’ are rather banal or even incomprehensible.

It has been claimed by enthusiasts of the social networking culture (see Unit 1) that Twitter and Facebook have played an important democratic role in recent, sometimes successful uprisings against repressive governments (e.g. in Egypt), allowing information about meeting points and tactics to be rapidly circulated among protesters etc. This is possible but not certain – for one thing, these uprisings take place largely in ‘third world’ countries, where a much smaller proportion of the population has access to smart phones and the Internet, and secondly, the allegedly repressive government can also use the social networks, to spread disinformation, propaganda etc.
Table 1: Some interesting statistics (social network sites in 2012)

Site        Some statistics
Youtube     800 million users
            4 billion views per day
            80 hours of video uploaded every minute
            More video material created in 90 days than all the US TV networks in 60 years
            500 years of video watched every day
Facebook    900 million users, more than half of them go online every day
            Average user has 130 friends
            300 million photos uploaded per day
Twitter     500 million accounts
            Average of 4000 tweets per second, with peaks of up to 25,000 tweets per second in a single hashtag
            300 000 new accounts every day

Technology is more complicated for Web 2 than for Web 1, where the server simply responded to requests to send pages. Web 2 has banks of servers to fulfil requests, not just for a web page, but for particular transactions from the client, including uploads from the client, and the requirement to modify the page or the web itself.

Apart from the really well-known sites described above there are many more Web 2 services. You can register with sites, especially media sites, to receive regular bulletins or ‘feeds’, which go under such names as ‘podcasts’ or RSS (Really Simple Syndication); there are newsletters by email, and automatic notifications if preferred web content changes. Podcasts and RSS 'feeds' are the online equivalent of a newspaper or magazine subscription, by which users receive regular instalments of digital media in their ‘inboxes’, either automatically or on request. The difference is that you can actually assemble your own magazine with this technology, specifying what type of content you want to read, and which publisher you prefer for each piece.

1.3 Web 3

The term “Web 3” summarises the next evolutionary step of user-machine interaction.
Proposed by no less than Berners-Lee again, who defined it as “a web of data that can be processed directly or indirectly by machines”, Web 3 means that web pages, by including machine readable ‘meta data’, can themselves ‘understand’ what they are about, and offer themselves more intelligently to humans who are interested in their content.

With this progress it is for instance possible to place adverts onto pages that are semantically (in terms of meaning) linked to the product and to previous user behaviour. If you browse pages about Berlin and have in the recent past bought air tickets to London, chances are that you will see an advert for an onward flight from London to Berlin. However, if your recent activity was to buy a street map of London online then you might rather see an offer for a street map of Berlin! The semantic web will also know that you are a Namibian resident, and therefore not offer a map of Windhoek! And if you are the proud owner of a smart phone, the Web will know this as well and might offer the latest Android applications for navigating Europe – provided, of course, that your phone's operating system is in fact Android.

Wizardry? No, the web stores information about what you did, which devices you used to log in, what web browser you use, and where you are located. We discussed the ethical implications of this in Unit 1.

The semantic web also improves searching the Internet. You can now use Google to search for pages of a specific age (the web “knows” how old the information on a certain page is; try the search term Namibia 2005..2007), or for pages released under a specific license (try the search term Pohamba picture CC-BY; the Creative Commons licenses described in Unit 8 are machine readable). It does not yet work very well, but the algorithms to facilitate such searches are being improved all the time.

Web 3 is currently more of a future scenario than an existing technology.
It requires some “intelligence” in computers and software. But the decisive breakthrough might just be around the corner.

2. The business of the Internet

The forerunner of the Internet, ARPANET, was a government research network, and even when it was turned over to more general use no commercial use was envisaged. Development was driven by networking enthusiasts who were not motivated by making money. The culture of the Internet as an ‘open space’ with free access to all still persists in principle. In the early 1990s a firm of American lawyers tentatively started to advertise their services online, to much derision and insult. But the power of the Internet as a medium to advertise and promote products to a wide audience was obvious and irresistible.

In the meanwhile the first fortunes were being made from successful Internet software. Marc Andreessen co-created the first widely used web browser and co-founded Netscape, which was sold for over 4 billion US dollars in 1999. Mark Shuttleworth is a South African who pioneered systems for Internet security, essential for the new industry of trading over the Internet (e-commerce), selling his interests to a US company for 575 million US dollars in 1999. He became the ‘first African in space’ as a tourist on the International Space Station.

The most successful Internet search engine, and one of the most successful enterprises in history, is of course Google, started by two Ph.D. students, Larry Page and Sergey Brin, in 1998. The most obvious method for a search engine to make money would be to charge for searches. It was the genius of the Google business model that they rejected this approach. Instead they generated their enormous revenue by advertising in at least two clever ways. You will have noticed that on a Google search the main results are listed down the centre of the page, while on the right are ‘sponsored links’.
These are from advertisers who have given keywords to Google to make their results show up when searchers are looking for these keywords (‘AdWords’). Google then charges the advertisers for every click which viewers make on the ads. It does not matter whether anything is bought from the advertisers or not.

The second, called AdSense, is a system by which ‘relevant’ advertisements appear on companies’ or individuals’ own web sites. For instance, if I had a site about babies’ names and their meanings, it would be safe to assume that the people visiting the site would be parents expecting a new baby. Thus a variety of advertisements for baby products would display on my site. (Google would select these advertisers, and modify the HTML of my site, so they would display automatically.) Again, if viewers of my site were interested in the products advertised and clicked on them, the advertisers would be charged and I would also get a commission. So this is a money making opportunity for anyone with a web site.

Near the end of the millennium, as the Internet gripped the popular imagination, it attracted a frenzy of interest from investors desperate to ‘get a piece of the action’, who poured money into any idea relating to the Internet, no matter whether there was any clear business model or any clear way of making a profit, or how crazy the idea was. Teenage entrepreneurs offered shares in ‘instant’ Internet companies to the public, and became millionaires, on paper, overnight. These businesses were called “dot coms” because their web presence invariably was located in the top-level domain com, meaning that their web sites' URL ended in .com. Of course, the companies set up to market these ‘ideas’ soon burnt through their capital and collapsed, leading to the bursting of the dot com bubble in 2000. The stock exchange index in America largely devoted to these high tech companies (NASDAQ) fell to less than a third of its peak value. This led to disillusion with high-tech investment for several years.
Of course, companies with sound business models such as Google thrived and made immense amounts of money.

Currently (2012), with the phenomenal rise in popularity of sites such as Facebook, we are seeing signs of a dot com bubble again (Bubble 2.0?). This time, investment banks, the same ones involved in the financial crash of 2008, are rushing to invest in the most popular social networking sites, because of their huge user base. For instance, Facebook was recently (February 2011) valued at 50 billion US dollars and Twitter at about 4 billion.

The recent rise of other, less well known sites is startling. Groupon, a site which offers daily special offers from shops in various US cities to customers who sign up and pre-pay for them, only started up in 2008, founded by a 29-year-old who wanted to be a rock star, and is now valued at 7 billion dollars. Even more bizarrely, a site called zynga.com, which offers on-line games where participants can take on the role of characters and buy accessories, is valued by investors at 4 billion dollars. This is, literally, a company which (legally) sells products which do not exist! One may well wonder if we are heading for Internet crash 2.0, but time will tell.

Activity 2
Time Required: 1 - 2 hours

Research another successful web entrepreneur or lucrative web site. Write a short report on this entrepreneur or site.

Feedback

The choice of topic is yours, but as a hint some other wildly successful Internet companies must come to mind. For instance, ‘ebay’ and ‘secondlife’ have not been mentioned above. Look these sites up and find out what they offer.

If you cannot find specific examples of “internet hypes”, look at www.millionpixelpage.com. Their business idea was to sell every pixel of their web site for 1 US$. Crazy idea? Not quite. The student who came up with this idea got a lot of media coverage.
As a result, millions of people worldwide visited his web page to check if anyone would be stupid enough to buy those pixels, and as a result of that, it made sense for companies to buy the pixels for advertisement. The student became a millionaire within a few weeks.

3. E-commerce

Electronic commerce (e-commerce) is a division of electronic business and refers to the buying and selling of goods and services partly or wholly facilitated by electronic communications technology. It is conventionally divided between business to business (B2B – ‘wholesale’ dealing) and business to customer (B2C – retail). (There are other variations, e.g. G2C – ‘government to customer’, for paying taxes.) Some businesses like Amazon were founded as and remain entirely online (you can’t buy anything physically from them or even over the phone), while some traditional ‘bricks and mortar’ businesses, like physical book stores, quickly created websites for on-line sales – so-called ‘clicks and mortar’ businesses.

From early pre-Internet beginnings in the 1980s, on home terminals connected to the telephone network, with simple systems to book theatre or train tickets online, developed the huge industry of online shopping, only slightly dented by the dot-com bust in 2000. This was enabled by easy and secure means of payment, initially by credit card, but later also by methods devised entirely for online payment, e.g. Paypal, and of course by the rapidly expanding Internet access among the general population, at least in ‘developed’ countries.

Online shopping sites are now, apart from being very easy to use, extremely sophisticated. If you are shopping for a pair of jeans, you can turn them around and go through the range of colours! If you are thinking of buying a book, you can ‘open’ it and browse the first chapter! Sometimes there is even a robotic speaking ‘sales assistant’ who will answer your questions and advise you on your purchase!
In-text question

Assuming you have a card (it does not have to be a credit card – a debit or ATM card is sometimes accepted) or other means of electronic payment, would you or do you shop online? What are the advantages and disadvantages of online shopping? Does it depend on the goods you are shopping for?

The nation most ‘into’ buying online today is not the US but the UK (Britain). According to a survey in 2010 by the Boston Consulting Group, commissioned by Google (http://www.bcg.com/documents/file62983.pdf, accessed April 2011), the Internet as a whole was worth £100 billion (N$ 1,2 trillion!) to the British economy in 2009, directly or indirectly employs 250 000 people and makes up 7,2% of the UK economy. Much of this is down to e-commerce activity. 30% of total retail sales are now done online. In 2003, again in the UK, online retail (or ‘e-tail’) sales were £14 billion, which increased to £44 billion in 2010. On average, each British online shopper in 2010 spent £1284 (N$ 15 000!) – in fact a lot is ordered from UK e-tailing sites by customers overseas, so that online shopping is also a profitable export business! This trend is set to continue, simply because online sales are currently growing by 14% per year, whereas conventional (bricks and mortar) sales are increasing only a tenth as fast, at 1,4%.

Activity 3
Time Required: 15 minutes

Go to amazon.com (or amazon.co.uk) and search for an article which you are thinking of buying locally in Namibia (almost anything). Whereas most people think Amazon just sells books, the categories of goods for sale are amazing – garden tools and car parts, jewellery, electronics, gourmet food and of course books. Compare the price on Amazon (convert from US$ or £) with the price in a local shop, and survey the choice. (Of course, you would have to pay some postal costs from Amazon, and many goods cannot be delivered to Namibia.) You don’t need to buy anything! What conclusions can you draw?

Feedback
There is vastly more choice online – whereas a local shop may have only two or three brands of camera, Amazon will display dozens of competitive products. Prices will often be lower than in local shops, but of course you would have to get the goods here, and in some cases this would not be possible. But it is a good way of comparing prices, and getting an idea whether prices in local shops are fair!

Cautions and advice for buying online

The first thing people worry about is whether they will be defrauded – the goods they order will not arrive, or worse, their credit card details could be stolen. The answer is to stick to well known e-commerce sites which have been around for many years – amazon.com internationally or kalahari.net in South Africa. Unfortunately, Namibian sites are few in number, are small operations, and may be unreliable. An exception is the booking facility for Namibia Wildlife Resorts, which works well.

Real e-tailing sites are usually very complex and difficult to impersonate or ‘spoof’. Again, use your common sense as advised in a previous unit, and distrust sites which trumpet too many ‘special offers’, or have too many exclamation marks! Avoid sites which do not seem to have ‘help’ or ‘contact us’ links.

If you report that goods have not arrived, Amazon, particularly with books, tends to be very understanding and replaces them without question. They will follow up, however, on the ‘loss’ of high value goods.

Reputable sites will send you email confirmations of every purchase, stating the exact amount charged to your card. Keep or print these. After any online purchases, check the transactions on your card. If you see anything suspicious – overcharging or any other dubious debits – report it to your bank/card provider with your evidence. Again, they will generally be sympathetic and may reverse any contested debits, and issue you with a new card if necessary.
The main frustration, however, is that sites may refuse to deliver to Namibia: the drop-down list of countries offered when you enter your address may contain only the USA and Canada, or it may include South Africa but not Namibia. (You may be able to enter your address with Namibia as a province of South Africa!) Otherwise the overseas address is accepted but the postage asked is astronomical. In many cases branded goods cannot be delivered from overseas due to distributor agreements. Do not try to order a cell phone or camera from abroad – generally this is illegal and the item is quite likely to get ‘lost’ in the post – not the supplier’s fault. In general, please bear in mind which goods are suitable for buying online and which are not. Books and DVDs are fine – it is either the DVD of the movie you want or it is not. But to the writer it has always seemed ill-advised to buy clothing online – when it arrives the indicated size and fit may not be right, and the material and colour may not be as expected. Sending it back overseas to be exchanged is a hassle.

4. The Long Tail

Figure 3: Composite picture from www.searchengineguide.com and www.thefuturebuzz.com

This strange name refers to what in mathematics is called the exponential decay graph, where the graph line starts from a high value on the vertical axis, then gradually ‘tails’ down to the right, tending towards zero but never quite attaining it. The graph is typical of many physical phenomena, such as the temperature of a hot object gradually cooling down to that of its environment. In business and management, it alludes to the 80:20 rule that 80% of your profits come from 20% of your customers, or that 80% of faults occur in 20% of the equipment, etc. The 80% is the ‘head’ part of the graph, the 20% is the ‘tail’. The relevance of this to e-commerce is the following: A physical shop has limited storage or shelf space, and space costs money.
The shop cannot afford to stock ‘slow moving’ items – it must stock items that are popular. These are the 80%. If you fancy a pair of solid gold earrings in the shape of rabbits or an Armenian dictionary, and you go into a jewellery shop or bookshop in Windhoek and ask for these, the assistant will tell you they don’t have them – “there is no demand for them, sir/madam!”. But in cyberspace there are no physical limitations on shelf space – the number of products and suppliers is huge, so that even the most obscure item will be found, on some database, available from someone, somewhere in the world – somewhere in the 20% ‘tail’, in fact. On Amazon, finding rabbit-shaped jewellery or the dictionary of an obscure language will be no problem. If you fancy obtaining wolf urine, there is a supplier for it on the Internet. It was US$ 31.95 per litre at the time of writing, which is not exactly cheap. You can also buy UFO detectors, singing garden gnomes, coffee mugs with a hole in them, or bacon-flavoured toothpaste. Given that billions of people can access the Internet, finding customers has never been so easy. The point is that shopkeepers who tell customers ‘there is no demand for it here’ may be losing a lot of business. Demand for each unusual item may be infrequent, but there are huge numbers of such items. As an Amazon employee put it: “We sold more books today that didn’t sell at all yesterday than we sold today of all the books that did sell yesterday!”

References

Anderson, C. (2009). The Long Tail. New York: Random House.

Keywords/concepts

Web 2: The interactive web, where the user’s input to the web site is as important as the original content itself.
Web 3: The semantic web – the idea that pages ‘know’ what they contain and can present themselves to the viewers who are searching for them.
Blog: A web log: an online ‘column’ of advice, comment or opinion written by an individual or sometimes by an organisation.
Wiki: An online collaboration tool often hosted by a department or organisation, whose members can exchange information or update a database.
RSS feed and Podcasts: Online equivalent of newspaper or magazine subscription by which users receive regular installments of digital media to their ‘inboxes’, either automatically or on request.
Social networking: Web sites which provide a forum for users to post information about themselves, set up networks of ‘friends’ or groups, and exchange messages or news with other users.
E-commerce: Almost any commercial or trading activity undertaken partly or wholly by electronic communications and funds transfer.
The long tail: The idea that there are vast numbers of obscure, specialist, infrequently requested goods or services which can be catered for online but not from space-constrained physical suppliers.

Unit summary

In this unit you learned about present and possible future web developments – how the Web has become more interactive and ‘intelligent’, giving rise to the terms Web 2 and Web 3. This has led to new terms such as blogs, RSS feeds etc. signifying means by which you can post material online and the way in which you can receive material from the Internet. The phenomenon of the social networks was explained, as well as the explosion of e-commerce – the way in which we can now buy a vast array of goods and services online. The idea of the ‘long tail’ was explained, meaning that it is generally possible to find goods and articles online that you would be unlikely to find in a shop.

Unit 10 The Wikipedia phenomenon

Introduction

Traditional encyclopedias have always been expensive. A typical classical encyclopedia like the German Meyers Universallexikon would come in twelve massive volumes, and at a price of several years of a worker's wage. There was a time when salesmen sold them door to door!
The leather-bound volumes looked good on your shelf but, since new editions came out only every ten years or so, they were likely to be out of date on current affairs or technology. Later came encyclopedias on CDs, which took up less space and enabled you to take your knowledge around with you, but also went out of date. A scant ten years ago Wikipedia came along (named from wikiwiki, the Hawaiian word for ‘quickly’, and ‘-pedia’, taken from ‘encyclopedia’). How was it possible? – an encyclopedia which was free, was accessible anywhere online, was written by non-experts, and was perpetually kept up to date by the same army of writers, continuously correcting and checking each other’s work. When news of Michael Jackson’s death broke, his article on Wikipedia was updated within 90 seconds!

Objectives

Upon completion of this unit you will be able to:
explain what Wikipedia is and how it generally operates;
define the term ‘encyclopaedic’;
create a Wikipedia account and make edits;
use Wiki format to edit texts;
repeat and apply previously introduced material about:
  referencing in scientific texts
  avoiding copyright violations and plagiarism
  researching information on the Internet and in libraries
  distinguishing between primary, secondary, and tertiary sources.

Prescribed reading

The ‘five pillars’ of Wikipedia: http://en.wikipedia.org/wiki/Wikipedia:Five_pillars

Additional reading

Dalby, A. (2008). The World and Wikipedia – how we are editing reality. New York: Siduri Books.

Recommended website

Wikipedia’s site is either www.wikipedia.com or www.wikipedia.org. There are many other sites in the wiki family such as the dictionary www.wiktionary.com, a ‘how to’ guide www.wikihow.com and, for tourists, the currently rather sketchy but rapidly expanding www.wikitravel.com. Wikipedia has a very good mobile site (for mobile phones and other devices with small screens): http://mobile.wikipedia.com

1. What is an encyclopedia?
An encyclopedia is a collection of established scientific mainstream knowledge, separated into entries, or "articles", that are contributed by experts in their respective field, and sorted alphabetically. It is not an indiscriminate pool of facts, and it is not a place of speculations - future events can be covered if they can be determined with some certainty. All entries are supposed to have at least minor historic significance, as far as this can be determined at the time of creation. An encyclopedia is a tertiary source of information. Remember: Primary sources of information are eyewitness reports, raw data sets, photos, recordings, and the like; in other words, unprocessed pieces of first-hand information Secondary sources are synthesized academic or journalistic creations such as newspaper articles, papers and books Tertiary information sources are collections of academic or journalistic results like year books, text books and encyclopedias As a tertiary information source, there are certain types of content one would not expect to see there. These are contributions such as essays, previously unpublished research, manuals, trivia, autobiographies, and many more (sound- and video clips and pictures are allowed but not as stand-alone entries). 2. The nature of Wikipedia Wikipedia is an online encyclopedia, accessible for everyone with Internet access and changeable by everyone. Its contributors are called “Wikipedians”; they form a community that together has built the largest amalgamation of knowledge ever collected by mankind. In fact, counted in work hours, Wikipedia is the largest human project ever, surpassing the erection of Rome, the building of the Great Wall of China, and the creation of the pyramids of Egypt. Considering the definitions of information and knowledge as covered in Unit 3, what can you say about the claim that Wikipedia contains knowledge? Is it perhaps just information? 
In-text question An encyclopedia does not just offer facts but also justifications. Although you cannot extract knowledge directly from reading Wikipedia, the way the information is presented allows for the constitution of knowledge in the brain of the reader. It is thus not wrong to describe the overall collection of explanations as a body of knowledge, as Wikipedia and all other encyclopedias do. Wikipedia was launched in 2001. As of March 2012, Wikipedia is available in 278 languages. The English Wikipedia is by far the largest of them, containing almost 4 million articles. Due to the collaborative effort by its editors, its accuracy matches that of other respected encyclopedias like the Encyclopædia Britannica despite being roughly 30 times as large by article count. On the English Wikipedia alone there are 90,000 active editors who contribute at least ten times per month. Figure 1: Wikipedia content by subject, 2008 Wikipedia has been criticised for its approach to let anyone edit who wishes to do so. There is frequent vandalism (page blanking, inserting nonsense text, promoting nationalism, and the like), some of its articles are plagiarised, some information is wrong, misleading, libelous, or plainly non-verifiable. While most of these concerns are dealt with rather quickly (within seconds or a few minutes), a small number of errors, misinformation and hoaxes have stayed in Wikipedia for years. Wikipedia is by no means complete. It still lacks millions of articles about every settlement, species, rock star, mountain range, and many other entities. Moreover, of the existing articles the vast majority are very short or not properly referenced. Less than 1% of all articles on the English Wikipedia are rated “good” or better; the rest are in need of improvement. 
Activity 1
Time Required: 20 minutes

Read the articles on Twyfelfontein (http://en.wikipedia.org/wiki/Twyfelfontein), a “good article”, and on child labour in Namibia (http://en.wikipedia.org/wiki/Child_labour_in_Namibia), a stub article at the lower end of the quality scale. Write down five fundamental differences in quality.

Feedback

i. Length – the article on Twyfelfontein is long and contains a rich level of detail and information; the one on child labour is brief, sketchy and evidently hastily written.
ii. Professional standard of writing – Twyfelfontein reads like a proper encyclopedia article; ‘child labour’ is a bare set of notes.
iii. References – Twyfelfontein has an extensive set of references; the child labour article is largely lacking them.
iv. Layout – the Twyfelfontein article is well illustrated; the child labour article is not, although admittedly there is less scope for illustration in the latter.
v. Cross-references – the Twyfelfontein article has a multitude of links to related topics and other Wikipedia articles; the child labour article has mainly a few general links to well-known UN agencies.

At least, though, there is an article on child labour in Namibia – as can be seen at the end of the article, many countries do not have an article on the subject at all!

3. Why is Wikipedia special?

Wikipedia is different in two ways from classical encyclopedias: it is not published on paper, and it is not written by experts. The first difference is manifested by the fact that it has articles not only for major scientific keywords but on virtually every conceivable topic – on country churches, tribal leaders, and minute scientific viewpoint differences – and in a wide choice of languages, some of them spoken by only a handful of people. Also within articles, there is no space concern: hard disk space is much cheaper than paper.
While classic encyclopedias restrict the size of entries due to space limitations, Wikipedia is only bound by how much there is to say about the topic. In fact, in its current (2011) state, article size and topic coverage is dependent on the amount of editors that care for it, not on perceived importance. Wikipedia has been criticised for this property, where there is much more coverage of the Star Wars science fiction movies than on the history of China! Wikipedia does not attempt to draw the line between science and popular culture. One will find entries on cartoon series, news events, blockbusters, soccer clubs, and so on. Figure 2: Demographic Composition of Wikipedia Contributors The second difference with respect to other encyclopedias is the composition of the author base: there are quite a few subject experts editing on Wikipedia, but this is neither enforced nor guaranteed. This is opposite to how traditional encyclopedias are written, where individual people are assigned to contribute a subject, a topic, or an article. Apart from an obvious larger pool of contributors for the encyclopedia, this creates certain synergies: an editor interested in history contributes an episode on World War II. Another editor with knowledge of vehicle mechanics adds technical information. Someone else contributes the business aspects, a philosopher adds ethical considerations. Another editor conversant with Wikipedia syntax formats texts and tables, an article designer searches the picture database (called Wikimedia Commons, http://commons.wikimedia.org) and illustrates the prose. An expert on English language copy-edits the work and improves the flow of writing. This is how high quality articles are produced by means of collaboration. On the negative side, there is of course the danger of the vehicle mechanics person providing the history, the philosopher adding technical information, and the historian formatting it, creating a mess. 
Still, once the article has caught the attention of knowledgeable editors it is likely to be improved. The contribution by the community produces another, not so well documented effect: as laymen are generally unable to take sides in scientific debates, articles on contested topics usually outline the dialogue about the differences rather than presenting only one theory. This is an important improvement over classically assembled encyclopedias.

3.1 Name spaces

A topic that can be very confusing for new Wikipedia editors is the so-called name spaces of Wikipedia. Not every page in the encyclopedia is an encyclopedic article; there are discussion pages, project pages, user pages, templates, and many more. All encyclopedic articles are in the main space of Wikipedia. For instance, Namibia University of Science and Technology has an article where the institution is described; it is at http://en.wikipedia.org/wiki/ Namibia University of Science and Technology. But there is also a page http://en.wikipedia.org/wiki/Talk: Namibia University of Science and Technology, where improvements to that article can be discussed. And then there is http://en.wikipedia.org/wiki/Wikipedia:School and university projects/ Namibia University of Science and Technology, the page where NUST students enrolled for ICT deliver their homework. Those pages are in different name spaces. Name spaces can be recognised by their keywords: the page title begins with a word followed by a colon, such as Talk:, Wikipedia:, User:, and so forth. All pages that do not have any keyword are articles, and the rest are maintenance pages used to administer the Wikipedia endeavour. Some important name spaces are:
[no keyword] – Wikipedia articles
Talk: – Discussion pages
User: – Pages of Wikipedia editors
User talk: – Talk pages of Wikipedia editors. If somebody wants to leave you a message, you will find it on User talk:YourUserName
Wikipedia: – Project pages like policies, statistics, explanations on how Wikipedia works
Help: – Help pages and tutorials
Template: – Pre-manufactured pages to be used on other pages.

3.2 Hierarchy and subject experts on Wikipedia

Wikipedia has its own hierarchy. At the lowest level are editors without an account (anonymous users), of whom only the IP address is visible. Higher up are editors (users that created an account and are logged in while editing), autoconfirmed users (account is 4 days and 10 edits “old”), autopatrollers (>75 articles created), rollbackers (experienced in vandal fighting), and filemovers (experienced with licensing and copyright). At the top of the pyramid are administrators, who have the right to delete pages and block users, and bureaucrats, who can promote editors to administrators. What makes an editor move up those ranks? It is experience on Wikipedia: making many edits (thousands!) without messing up too much. The editors' experience “in real life” does not matter at all; a professor has no greater chance of becoming an administrator than a kid from Junior Secondary School. In fact, the professor might have less of a chance to get into a high position on Wikipedia because she might not have as much time to spend as a school girl. Also for individual edits, a subject specialist's edit is not “worth” more than that of an ordinary citizen. This has created unhappiness among the specialists, who sometimes face the situation that their edits are reverted (undone) by someone far less knowledgeable in the subject. Until now, attempts to give special status to expert editors have all been voted down by the Wikipedia community. After all, anyone could pose as “professor” on Wikipedia, and it is impossible to verify whether an editor is really who they claim they are.
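As a rough illustration of the name-space keywords listed in Section 3.1, the following Python sketch sorts page titles by their prefix. It is a simplification under stated assumptions: the function name `name_space` and the short keyword list are made up for this example, and the real Wikipedia software knows more name spaces than the handful covered here.

```python
# Hypothetical helper for illustration only. The keyword list is copied from
# Section 3.1 of this unit and is NOT the complete list used by Wikipedia.
KNOWN_NAME_SPACES = ("Talk", "User", "User talk", "Wikipedia", "Help", "Template")


def name_space(title: str) -> str:
    """Return the name-space keyword of a page title, or "main" for articles."""
    # Split at the first colon: "User talk:YourUserName" -> "User talk", ":", "YourUserName"
    prefix, colon, _rest = title.partition(":")
    if colon and prefix.strip() in KNOWN_NAME_SPACES:
        return prefix.strip()
    # No colon, or a colon that is simply part of an article title
    return "main"


if __name__ == "__main__":
    for page in ("Twyfelfontein", "Talk:Twyfelfontein",
                 "User talk:YourUserName", "Wikipedia:Five pillars"):
        print(page, "->", name_space(page))
```

Note the design choice in the last branch: a title that contains a colon but does not start with one of the known keywords is still treated as an ordinary article, because a colon can also be part of an article's name.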
To outline and satirise some of the problems that expert editors (those researching in, and teaching, the topic they write about on Wikipedia) face, Lore Sjoberg wrote in the April 2006 Wired Magazine: “The Wikipedia philosophy can be summed up thusly: "Experts are scum." For some reason people who spend 40 years learning everything they can about, say, the Peloponnesian War – and indeed, advancing the body of human knowledge – get all pissy when their contributions are edited away by Randy in Boise who heard somewhere that sword-wielding skeletons were involved. And they get downright irate when asked politely to engage in discourse with Randy until the sword-skeleton theory can be incorporated into the article without passing judgment.” On Wikipedia, this story has become known as Randy in Boise (Boise is a city in Idaho, United States); see the accompanying essay at http://en.wikipedia.org/wiki/Wikipedia:Randy_in_Boise.

3.3 Verifiability

All claims on Wikipedia must be verifiable for everyone willing to make the effort. This is one reason why references are needed for all non-trivial claims on Wikipedia. This clause seeks to prevent hoaxes and hearsay from encroaching on the encyclopedia.

3.4 Reliability of Sources, Relative to Wikipedia

All content on Wikipedia should be referenced, with at least one footnote per paragraph. This is different from classical encyclopedias, where there are few, if any, references in articles. The reason for this requirement is again that Wikipedia is not written by experts; it does not automatically trust your judgement to sum up a certain topic but wants to see proof that someone else wrote about the topic, and came to the same conclusion as the editor who puts an article together.
Wikipedia puts a certain threshold on the types of sources you may use to support your text, and the sheer number of rules for these “reliable sources” makes it difficult to decide, case by case, whether a certain source is a suitable reference for a statement in an article. The following rules of thumb should be applied:
No self-published sources. People’s CVs, company descriptions on their own web site, advertisements, advertorials, and press releases of any kind, YouTube, even government information on their own country’s political system, are not well received.
No opinion pieces from people who themselves are not notable. A good indication of “notability” in this context is if the person has an article about themselves on Wikipedia. This rule excludes readers’ letters and editorials in newspapers and most blogs and personal web sites.
No sources that don't have at least a minimal reputation for fact-checking and accuracy. Per this rule, students’ homework and minor theses (below Master's) are disallowed, as are tabloid newspapers.

3.5 Neutrality - Fringe and Conspiracy Theories - Due Weight

The writing style should be as neutral and objective as possible, even if sources would support an entirely positive or negative coverage. So for instance, instead of writing “Namibia University of Science and Technology is the best tertiary institution in the country”, one should rather stick to verifiable facts like “Namibia University of Science and Technology has won the PWR Diamond Award three times in the last five years.” If there is considerable argument about a fact or a judgment, and if all different viewpoints can be properly attributed to reliable sources, then all those views should be covered on Wikipedia, with the approximate prominence they receive in the corroborating sources. In other words, within an article each of the conflicting opinions should be given its due weight.
Under this guideline it is perfectly possible to cover even extraordinarily weird positions, as long as they receive proper commentary from the opposing factions. However, there is a rule of thumb for fringe theories (those that are not supported by the majority of scientists, e.g. “CERN will produce a black hole that is going to consume planet Earth”), conspiracies (“The Americans faked the Moon landing”) and other weird statements (“There is a tea kettle orbiting the planet”): Extraordinary statements require extraordinary backup by sources.

Activity 2
Time Required: 30 minutes

Read http://en.wikipedia.org/wiki/Caprivi_conflict. For a number of years this article has included a statement that reads “On October 7, 2002, the Itengese nation severed all ties with Namibia and declared the independent, sovereign Free State of Caprivi Strip/Itenge their national homeland”. Answer the following questions:
Is this a fringe theory, a conspiracy theory, a true statement, or just a widely unsupported claim?
Is due weight given to this claim within the article?
Is this statement properly referenced? Why was it not deleted?
Is the criticism of this claim clearly spelled out?
Is the criticism of this claim properly referenced?
What would be requirements for a yet-to-be-written article on the Free State of Caprivi Strip/Itenge? Would the claim that it is an unrecognised state have to be included? How would you reference such a claim, considering that due to the non-existence of this state no reliable sources would actually write anything about it?

Feedback

The claim implies that a sovereign state now exists in the Caprivi Strip; this is not factual. The statement is in fact so far from the truth that it could be removed right away; the source backing it up is a paramilitary forum, not exactly an example of reliability and neutrality.
To reference the counter-claim is not easy because typically no one writes “X does not exist” on a subject that indeed does not exist. There are, however, lists of recognised states by the UN and other bodies. An article on the Free State of Caprivi Strip/Itenge could be legitimately written on Wikipedia, provided that the fact that it does not actually exist is prominently described. The article could outline the history of ideas, motions, and conflicts connected to the secession of the Caprivi.

Activity 3
Time Required: 15 minutes

Read Wikipedia:Notability (people) (http://en.wikipedia.org/wiki/Wikipedia:Notability_(people)). Considering this content guideline, give one example of a person whose biography should be in Wikipedia, and one that should not.

Feedback

Compare lifetime achievements, general coverage in news and scientific discussion. Try to assess the probable historic significance of people we know today – how likely is it that they will be remembered 50 years from now? Wikipedia itself is sometimes not true to its own principles: there is an article about a cricketer of nearly 100 years ago whose top score was 13(!) and who seems to have no other achievement!

4. Getting started with Wikipedia

This practical topic consists of two activities followed by an important course assignment. You need a functional email account and of course Internet access. You do not need an account to make edits on Wikipedia. However, for the purpose of the ICT course, account creation is required to enable your assessor to verify which edits you have made.

Tip

The search feature on Wikipedia is not very sophisticated. As all Wikipedia sites have a high Page Rank (compare Unit 4) you can always use Google to find Wikipedia pages. This will be a lot more effective than using the built-in search function of Wikipedia itself.
For instance, to find the Namibia University of Science and Technology ICT project page on Wikipedia (activity below), google for “Namibia University of Science and Technology wikipedia project”. Hit #1 is the page you are looking for!

Activity 4
Time Required: 15 minutes

1. Go to http://en.wikipedia.org and click on "login/create account" (top right corner).
2. In the login window that appears, as you don't have an account yet, click on "create one", and follow the instructions.

Once your account is created, make sure you can log in. A successful login is indicated by your user name appearing top right on screen.

Feedback

For security reasons Wikipedia only allows the creation of six accounts per day and site. If your account creation request is rejected by the server, you'll have to contact the Account Creation Team at http://toolserver.org/~acc/. Fill in the required information again and make sure to mention in the "Comments" field that you are part of the Namibia University of Science and Technology ICT course. There will not be an immediate response; the team at toolserver.org will first discuss your request.

Activity 5
Time Required: 10 minutes

Once you have created your account and can log in, subscribe to the Namibia University of Science and Technology project page at http://en.wikipedia.org/wiki/Wikipedia:School and university projects/ Namibia University of Science and Technology.

Feedback

Information on how to do that is provided on that page.

Learning your Way around Wikipedia and Getting Help

Recommended website

Wikipedia sets a high threshold for editing due to its somewhat complex syntax and its rules.
This module cannot make you a Wikipedia nerd but you are strongly encouraged to either do a tutorial (start at: http://en.wikipedia.org/wiki/Wikipedia:Tutorial) or read the manual at http://en.wikipedia.org/wiki/Help:Wikipedia: The missing manual. If the Account Creation Team created your account, you will most likely find a welcome message on your user talk page (situated at http://en.wikipedia.org/wiki/User talk:YourUserName) which provides links to the most useful help pages on Wikipedia. 5. Finding an article to create or improve Wikipedia has sophisticated rules and interpretations on what articles should be included, and what articles should not. For a newcomer this can be difficult to evaluate. Instead of reading through all those rules, you are advised to adhere to the following rule of thumb: Notable (that is, worthy of inclusion) is any geographical feature (such as villages, lakes, rivers, mountain ranges, nature parks, suburbs), any high-profile individual (professional politicians, heads of large companies, well-known entertainers), any school above primary level, and –roughly - anything else that has received wide coverage. There are three caveats connected to the creation of new articles: The notability mentioned above. Apply your common sense as to what will have historic significance; for instance dam levels in Namibia have wide news coverage without justifying an encyclopedic entry. Naming and spelling. Wikipedia article names are case sensitive, thus “Oshigambo High School” and “Oshigambo high school” are different articles. Still, of course only one of the two titles should have content (the first one, because it is a proper name). The alternative spelling could become a redirect to the main article. 1) Sometimes it is more difficult than just to apply proper spelling to find existing articles. At the time of writing this guide, “Independence of Namibia” is a non-existent article. 
The content is spread over the articles “South African Border War” and “United Nations Transition Assistance Group”, but should probably rather be merged into the currently very short article on “Namibian War of Independence”. It is possible to rename articles (by moving them from one place to another), but actions of this sort should be discussed in advance.
2) Often there are many articles with the same name on Wikipedia, for instance “Olympia”, which is an ancient site of an athletic competition, a stadium in Berlin, a ferry ship on the Portsmouth-Bilbao route, an Opel family car, and many other things. So if you want to write an article on the Windhoek suburb Olympia, you must give it the name “Olympia (Windhoek suburb)”, and add a link and a short explanation to the page “Olympia”, which is a so-called disambiguation page.
Verifiability. Even though every village could and should have its own article, the least that is required is a reliable source confirming its existence. If nobody has yet written about it, it should not have an article yet.

In-text question

In mid 2010, an article on “Magdalena Stoffels” was created on Wikipedia (by a student of ICT). A discussion soon started, suggesting a deletion of this article. Why do you think this happened? At the end of the discussion, it was decided to rename the article “Murder of Magdalena Stoffels”. What would have been the rationale for this?

(Magdalena Stoffels was the girl who was tragically murdered in Windhoek on her way home from school in July 2010. We suppose that the rationale for the above was that her life was normal and unexceptional – sadly the only thing notable and newsworthy about her was the manner of her ending.)

One feature that can help with the decision whether a new article for a topic is a good idea is the search for red links. While links internal to Wikipedia appear blue if the destination article exists, they appear red if that article still needs to be written.
Placing a link in a page (done by putting text between double square brackets in the page's source code [[like this]]) is an editorial decision, particularly in the case of a red link. Think of it as requesting an article with that name. If another editor thinks it is not a good idea to have such an article included, they will remove the red link. For this reason you can be somewhat certain that an article requires writing (and does not yet exist under another name) if you see a red link to it. If you browse through Wikipedia you will find a lot of red links, and your lecturers have put further suggestions on the talk page of the PoN Wikipedia project (http://en.wikipedia.org/wiki/Wikipedia talk: School and university projects/ Namibia University of Science and Technology). If you find any red link where you can write an article and can provide reliable sources, go ahead.

Instead of creating a new article, you may also decide to improve an existing one. Every article on Wikipedia can be improved; there is no “perfect” article as yet.

References

Broughton, J. (2008). Wikipedia – the missing manual. New York: Pogue Press.

Keywords/concepts

Wikipedia: The on-line encyclopedia free to all, and to which all can contribute.
An edit: A change to Wikipedia.
Red link: A virtual link to a topic whose article has not been written yet.
Name space: An area of Wikipedia, containing pages of similar purpose. All articles are in the main space, all maintenance pages are in other name spaces. Identified by a keyword, followed by a colon.
Encyclopedia: (idealized) the publication of the sum of all accepted human knowledge.

Unit summary

In this unit you learned what an encyclopedia is. You learned about the phenomenon of Wikipedia – the vast online encyclopedia written and edited by non-experts (or at least, by people not guaranteed to be experts).
You learned about the strengths and weaknesses of this approach and how, although it seems as though anyone has complete freedom to contribute to the encyclopedia, there are a surprisingly large number of rules and guidelines to maintain and improve its quality. You learned about some of these. You learned how to use the encyclopedia, and most importantly how to create an account, make an ‘edit’ or even create an entire article of your own, on a topic which Wikipedia did not yet have. Unit 11 The Mobile Revolution Introduction On April 3 1973 Martin Cooper, an engineer at the Motorola Corporation in New York, called his competitor at Bell Labs. What was unusual about that? Just that Cooper was walking in the street using a portable mobile phone. It was the world’s first cell call. The phone weighed more than a kilogram, with a long antenna and no display. Cooper joked that the call was quite short, because he could not hold the phone up for long! From this beginning, less than 40 years ago (and with the first commercial cell service starting less than 30 years ago) when nobody assumed it was possible to make a phone call or send any kind of message except from home, their place of work or the post office, has come the time when we take it for granted that we can communicate with anyone else in the world, anywhere, with a hand-held device, with any form of information as required, in voice, text or pictures. Using a modern ‘cell phone’ just as a voice phone is only using a fraction of its capabilities, because it is a go-anywhere location aware computer – with far more computing power than the computer which went on Apollo 11 to the moon in 1969! This revolution has been especially significant in Africa, where people have taken to mobile communications and information services to an extent vastly greater than any of the ‘experts’ predicted. Note that this unit concentrates on mobile phone communications. 
The wider aspect of communications with other mobile devices (WLANs for laptops, satellite phones etc.) is not dealt with.

A Story

In the mid-1990s, the co-author was running a small delivery business. As it was very important for him to be reachable at all times, even when driving from one town to another, he used a cell phone, which was very innovative for those days. This consisted of a handset and a docking station, together weighing several kilograms and occupying the entire passenger seat. The docking station required a more or less permanent connection to the car’s electric circuits, and drained the battery in no time if the car was not running. A call would cost approximately N$13 per minute in today’s terms. Customers were amazed to learn that the person they had called was actually talking to them while driving! Becoming involved in a discussion about this marvelous technology, they would forget about the exorbitant cost of the call and be horrified when they got their phone bill. Needless to say, the cell phone was more expensive than the car it was driven around in!

Figure 1: Early cell phones were not so mobile after all: they weighed several kilograms and needed a permanent power supply.

Objectives

Upon completion of this unit you will be able to:
describe the basics of mobile technology and development;
discuss the range of data and information services potentially available from mobile technology;
propose further applications and uses for mobile technology and services, to promote social development in Africa.

Prescribed reading

The article on cell phones from the www.howstuffworks.com website.

Additional reading

The Wikipedia article on mobile telephony: http://en.wikipedia.org/wiki/Mobile_telephony

1. The background and range of mobile services

Figure 2: Dr. Martin Cooper of Motorola made the first US analog mobile phone call on a larger prototype model in 1973.
Mobile technology really dates back to the discovery of radio waves in the 19th century, because the mobile phone is just a sophisticated radio. Portable radios (‘walkie-talkies’) have been around for 70 years. Their problem is that, because only a limited number of radio channels is available, only a very few people can use them in any one area.

The genius of modern mobile technology is the ‘cellular’ system: the whole mobile coverage area of a city, and even a country, is divided into much smaller areas called cells, named after the hexagonal cells of a honeycomb. Each cell has a ‘base station’ transmitter and receiver. (In country areas, where the cells are much larger, you can see these as large poles on hill tops.) In Namibia, the poles are often disguised as trees in order to fit into the landscape. All base stations are linked in turn (maybe by normal landlines) to the main communications centre of the mobile provider. The radio power of a handset is deliberately set low, so that it can only communicate with its base station and maybe with the directly adjacent ones. The available radio channels, whose number is set by the technology, can therefore be reused in every cell, hugely increasing the possible number of phones on the network.

When a phone is switched on, it calls in to the base station of the cell where it is situated to identify itself. The mobile provider maintains an enormous real-time database of base stations and the numbers contained in them at that moment. So if there is a call for 081 222 3456 (say, your number), the database will be searched to find where that number currently is, and its base station, and communication will be established through that base station. If you are not in the coverage area or your phone is switched off, the system will not be able to find you and the caller will get a ‘not available’ message.
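The lookup described above can be pictured with a small sketch. This is only a toy model under stated assumptions: the class name `LocationRegister`, its methods and the base station label `WHK-017` are invented for illustration and do not describe any real provider's system.

```python
class LocationRegister:
    """Toy stand-in for the provider's real-time database of phones and cells."""

    def __init__(self):
        # Maps a phone number to the base station it last called in to.
        self._station_of = {}

    def switch_on(self, number, base_station):
        # A phone identifies itself to the base station of its current cell.
        self._station_of[number] = base_station

    def switch_off(self, number):
        # A phone that is off (or out of coverage) is no longer findable.
        self._station_of.pop(number, None)

    def route_call(self, number):
        # Search the database for the number and connect via its base station.
        station = self._station_of.get(number)
        if station is None:
            return "not available"
        return f"connect via {station}"


register = LocationRegister()
register.switch_on("081 222 3456", "WHK-017")  # hypothetical station label
print(register.route_call("081 222 3456"))
register.switch_off("081 222 3456")
print(register.route_call("081 222 3456"))
```

A real network keeps far more information per phone, but the principle of one searchable record per switched-on phone is the same.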
If you want to make an outgoing call, the message will of course go by radio link to your current home base station. But if you are moving, for instance in a car, and you cross the boundary of the cell you were in, the system senses this, and ‘hands you over’ to the next cell into which you are moving and its base station, which takes over the call. You will not normally notice any interruption to the call. Since the 1990’s, the technology has moved from analogue to digital, with clever electronic protocols developed to squeeze ever more channels into the available radio ‘bandwidth’, and improve connection quality and security. But the principle remains the same. How do you think ‘roaming’ works – sending and receiving calls when you are in another country? Why are roaming charges so high? Reflection If you are using your Namibian cell phone in another country, the mobile provider you are linked to, if it has a roaming agreement with your Namibian provider, will pass the call request back to your provider who will connect the call as usual, wherever its destination. This would mean that if you are in London, calling a number in London, you would be involved with two overseas calls – London to Namibia and back to London! That is why it is so expensive! As we said, the cellular phone is really a computer, with capabilities far beyond voice communication. This was soon realised. In December 1992 the first text message was sent. First thought of as a system for technicians to communicate while testing the system, it caught on throughout the world, especially as a medium for young people, and billions of ‘SMS’s’ are sent every day; the 160-character text maybe now defining the modern unit of communication. The first ‘smart phone’, incorporating a full computer-like QWERTY keyboard and Internet connectivity was the Nokia Communicator in 1996. 
Smart phones were soon merged with existing Personal Digital Assistant (PDA) technology, devices which had calculators, personal organizers etc. but no phone. The currently most popular advanced handset is of course the iPhone from Apple, along with related products like the Samsung Galaxy or the BlackBerry: a large-screen phone and texter with full Internet connectivity and thousands of available software applications which can be downloaded and run on it.

Activity 1

Time required: 30 minutes

Configure your cell phone to show the name of the cell tower currently associated with it. How many different towers does your cell phone associate with when you move from one location to another?

Feedback

Find out from your cellular provider how to do this. For the second question, think of the hexagonal shapes of the ‘cells’. When walking or driving, you will leave the coverage area of one cell phone tower and enter the coverage of the next one.

2. Mobile computing in Africa

Figure 3: Google screenshot

Interestingly, in late 2008, while giving a lecture on this topic, the writer Googled the phrase above, “Mobile computing in Africa”, and got only 7 hits! In April 2011, with the same Google search, he got 840 hits. So something must be happening.

When cell phones first became commercially available, it was assumed their market would be as high-tech business tools for North America and Western Europe, and then possibly for the emerging economies of Asia. This certainly happened, despite the devices being initially extremely expensive. Africa was hardly thought of as a cellular market. “Obviously”, the product was too high-tech for Africans and too complicated to understand and use; and just look at the very low penetration rate of PCs and even fixed-line telephones on the continent. Also, only credit-worthy persons with bank accounts were eligible for cellular accounts.
Indeed, the initial reason why a cellular service was set up in Namibia was that the American organisers of the Miss Universe pageant held in Windhoek in 1995 insisted on one, and it was installed for their benefit! Coverage was initially limited to the pageant venue and the hotel (specially built!) where the organizers were staying!

What was not foreseen was the enormous uptake of the product by ordinary people in all countries. People wanted to be able to communicate all the time, even if they did not ‘need’ to. In Africa, this was all the more remarkable, despite (or because of) the fact that many of the people buying and using mobile phones had no access to fixed phones and had maybe never used one! The commercial inspiration behind this breakthrough was certainly the ‘pre-paid’ concept, whereby you bought air time in advance from the local shop and had no need for credit references or to open an account with the mobile provider. This concept of pre-paid is relatively rare in the developed world, where the majority of the people have sufficient money and are credit-worthy.

For instance, MTC, the first and predominant cellular service in Namibia, was launched, as we said, in 1995 with a sales expectation of 500 handsets! In 2008 MTC passed the million customer mark, and in 2011, after 15 years, achieved 1,5 million claimed active customers. In a country with a population of only some 2,2 million, including babies and the old, this is remarkable. In fact, in 2009, over 350 million people on the African continent owned cell phones, nearly a 50% penetration rate, including presumably people with limited conventional education and numeracy.

Activity 2

Time required: 1 hour

Write a short essay (400 words): Why do you think many Africans prefer mobile to fixed line phone communications? In many countries, including Namibia, installations of fixed lines are falling, while mobile growth continues apace. Why would that be?
Feedback

There are several reasons for this. Firstly, traditional telephone services, nearly all state-owned in Africa, are associated with corruption, are expensive and unreliable, and have provided poor service. In addition, wars, disruption and theft of cabling etc. have destroyed much of what telephone infrastructure there was. Even without this, distances are very large, and providing fixed services to remote areas is very expensive. You are dependent on a ‘government service’ which can let you down at any time. In contrast, a mobile phone, especially with a prepaid service, feels much more under your control, is independent, and does not depend on expensive fixed infrastructure.

Figure 4: A Cambodian market woman, possibly checking market prices on her mobile

3. What Can Mobile Devices be Used for (in Africa and Elsewhere)?

The above picture probably gives a clue, to which we will return. But first we should reiterate the features of a modern mobile device:
It is a digital computer in the full sense, with processor, memory and operating system.
It has the normal computer input and output modes, keyboard and screen, obviously limited by size in a small hand-held device.
It supports electromagnetic communications in various protocols of voice and data (Wi-Fi, 3G, Bluetooth etc.), increasingly including Internet access. Mostly these communications are for individual purposes, but the mobile provider can ‘broadcast’ to selected numbers or all numbers on the network if necessary.
It has both software and hardware based identification (e.g. its IMEI number) which marks it out as unique in the world. It could also include RFID (radio frequency identification or ‘tags’).
It can be equipped with a GPS device, so that it ‘knows where it is’ (this is not strictly necessary, see below).

It has been said above that we have hardly scratched the surface of possible mobile systems and services.
If these have not emerged yet, the reason could be technical, legal or commercial: services are unlikely to be offered unless the mobile operator can see its way to making a profit from them. All that said, it is impressive to note that some cellular services in Africa, especially for banking and payments, such as M-Pesa and MobiPay, are far in advance of systems in Europe and America, where tentative schemes for transferring money by cell phone (e.g. to share restaurant bills) are only just being discussed!

So, potential applications:

3.1 Advertising

It is surprising that advertisers in Namibia, especially those that promote the cellular companies as well as retail products, have not done more with this. For instance, why not produce an advertisement which pops up briefly on your screen when you switch your phone on each day? Of course, this has to be done so that mobile users are not offended, but people were offended by advertisements on radio when they first appeared!

3.2 Location aware services

Many upper-end phones include GPS sensors, so they know where they are on the planet, typically to within a few metres. On the other hand, because a mobile phone works by connecting to its nearest base station, and often can see another couple of base stations within range as well, its position can be well defined by triangulation techniques even without GPS.

Figure 5: The observer at A measures the angle α between the shore and the ship, and the observer at B does the same for β. With the length l, the law of sines can be applied to find the coordinates of the ship at C and the distance d. (CC-BY-SA 3.0 Unported, Regis Lachaume)

Possible uses of the device’s ‘location aware’ status include:
Emergency – if you feel unwell, call a ‘911’ type number on your cell, and then collapse, emergency services will be able to find you even if you cannot talk.
Navigational – a map of the area you are in could be displayed, if you are lost.
Advertising (again) – suggestions for restaurants in your area, or shops with sales in your area, can pop up, prompted or unprompted. 3.3 (A story of) Crime Investigation A Story When a woman was killed on a motorway by a piece of concrete thrown off a bridge, the killer was caught because his cell phone was still on when he threw the missile – cellular company records showed that there was only one phone being moved at right angles to the motorway at that point, and at that time, i.e. there was someone moving across the bridge. The killer had no idea that his location and direction of movement could be tracked! 3.4 General information: Registering for particular services, and dialing particular numbers could provide market prices for farmers, weather forecasts, mark results for students or school leavers (this has been available at the NUST and Namcol for some time). 3.5 Then some more innovative services: All of the examples mentioned below are already existent, this is no science fiction! Your mobile phone as a ‘remote’: The motion and acceleration sensors allow your phone to act as a mouse or remote control. There are Android applications for this. It would also be simple in principle to incorporate RFID so that your phone could act as a car key, especially with modern cars which are in any case keyless. You walk up to your car with your phone in your pocket (don’t bother to take it out) – the car door unlocks. You touch your phone to a spot on the dashboard, the car starts! You get out of the car and walk away – the car door ‘senses’ you and your phone leaving, and locks itself! Your mobile phone as ID: Since your phone has an unalterable hardware ID which cannot be tampered with, could your phone not be used as ID instead of an ID document? Could not, of course subject to international agreement, your phone be used as your passport? 
On request, a two-dimensional ‘bar code’ could come up on your screen, which a human could neither read nor tamper with, but which an immigration official could read with a special reader. The bar code could contain all your details: photo, visas, and date of expiry of the ‘passport’ (this is already done in some countries with train tickets).

Financial services and ‘electronic wallet’: In Namibia the banks, especially the FNB, have introduced an impressive cell phone banking service, where typically you can do anything you can do at a PC or an ATM (except draw cash out!), including buying air time which is debited to your bank account. Especially useful is the system (called ‘in-touch’) which sends you an SMS whenever there has been a transaction of any kind on your account.

But what about actually buying goods, at least small purchases, with your phone? You select your purchase, maybe from a small rural shop where the nearest cash point could be a hundred kilometers away. You hold your phone up to a receptor attached to the shop’s cash register, which reads your phone’s number. Some account of yours is then debited for the purchase, maybe airtime, or another account set up specially for cellular purchases. This technology is called Near Field Communication or NFC, basically designed for mobile phones to interact with devices only a couple of centimeters away. Many experiments have been done with it in the last couple of years, especially in Japan. There you can buy canned drinks out of a machine or pay your freeway tolls with NFC. The problem is not the technology but the financial aspects and possibly security. If I buy a packet of sugar for N$5.95, what does the shopkeeper get, what does the mobile network get, and what does the financial organization backing the system get?

One very commendable innovation in Namibia, probably inspired by M-Pesa in Kenya, commended by President Obama, is the MobiPay system.
It is a mobile payment system. You register with MobiPay and create an account, which you have to top up with credit. Your account number is just your mobile number. You do not have to have a commercial bank account. You can call a number, access MobiPay’s menu, transfer money to any other mobile number with a MobiPay account, buy airtime or prepaid electricity and, at a limited number of shops, buy goods. The shopkeeper has to have a dedicated MobiPay terminal at the checkout. The procedure is explained on the MobiPay website (http://www.mobipay.com.na/index.php?option=com_content&view=article&id=74&Itemid=192) as follows:

Step 1: The cashier will enter the amount for your goods into the MobiPay Point of Sale (POS) unit.
Step 2: You then have to enter your mobile phone number into the POS. Then press Enter.
Step 3: You then enter your MobiPay secret PIN into the POS unit (keep it secret and block the view for other people to see your PIN). Then press Enter.
Step 4: Your mobile phone will start ringing after more or less 5 seconds. Answer the call and place your mobile phone close to the POS unit. You will hear a funny sound for about 3-5 seconds. When the sound has stopped the payment has been approved.

You will always receive an SMS as proof of payment from MobiPay. Also always insist to get the till slip (invoice) from the retailer as proof that you paid for your bread and butter (or any other goods or services).

Activity 3

Time required: 15 minutes

Assuming that MobiPay is still around and thriving at the time you are reading this, and particularly if you do not have payment cards, register for the MobiPay service and see if it is useful for you. [Warning: there is a charge for financial transfers but not for shop purchases; the shop presumably pays for the transaction.] You may cancel your account at any time. Give MobiPay some feedback on your experiences with their system, or post it on your blog!
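The account model described above (the account number is the mobile number, the account must be topped up, and a PIN approves each payment) can be sketched as a toy ledger. Everything here is a hypothetical illustration: the class `PrepaidWallet`, its method names and the phone numbers are invented, and the sketch says nothing about how the real service is built. Amounts are held in cents to avoid rounding problems.

```python
class PrepaidWallet:
    """Hypothetical prepaid ledger keyed by mobile number (illustration only)."""

    def __init__(self):
        self.balance = {}  # mobile number -> credit in cents
        self.pin = {}      # mobile number -> secret PIN

    def register(self, number, pin):
        # Creating an account: the mobile number is the account number.
        self.balance[number] = 0
        self.pin[number] = pin

    def top_up(self, number, cents):
        self.balance[number] += cents

    def pay(self, payer, pin, payee, cents):
        # Mirrors the checkout steps: amount, payer's number, then PIN.
        if self.pin.get(payer) != pin:
            return "declined: wrong PIN"
        if payee not in self.balance:
            return "declined: unknown payee"
        if self.balance[payer] < cents:
            return "declined: insufficient credit"
        self.balance[payer] -= cents
        self.balance[payee] += cents
        return "approved"


wallet = PrepaidWallet()
wallet.register("0812223456", pin="4321")   # customer (made-up number)
wallet.register("0819990000", pin="8765")   # shop (made-up number)
wallet.top_up("0812223456", 2000)           # N$20.00 of credit
print(wallet.pay("0812223456", "4321", "0819990000", 595))  # N$5.95 of sugar
print(wallet.balance["0812223456"])
```

The sketch leaves out the question raised earlier of how a purchase is shared between shopkeeper, network and financial backer; a fee could be modelled as a third account credited inside `pay`.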
“Mobile activism”

It has been mentioned before that claims have been made that social networking sites have helped dissidents in authoritarian countries ‘mobilise’ against their regime. Even without Internet access, however, and even short of any political disturbance, cheap and universal mobile communication is a great catalyst for, well, social mobility and development. A site called www.mobileactive.org provides stories on how rapidly circulated information, for instance about market conditions, income generating opportunities and medical services, helps disadvantaged and sometimes isolated people around the world.

Recommended website

Check the current content of www.mobileactive.org

Note that not all the above proposed mobile information services necessarily require an Internet compatible phone. With an Internet connected phone, you obviously have all the facilities of the Internet on a mobile platform with a hand-held computer.

Conclusion: Africa, in a mutually beneficial partnership between its people and its mobile service providers, has shown great imagination in the take-up of mobile communication and information services in the last 15 years, and has even led the way in the development of mobile financial transactions.

“I find the recent growth of mobile phone use by low-income people in Africa to be fascinating. They are quickly showing us that mobile phones, not PCs, are the computers of the future for the developing world.” (A blogger quote, from www.africafocus.org)

References

Southwood, R. (2009). Less walk, more talk – how Celtel and the mobile phone changed Africa. New York: John Wiley and Sons.
Banks, K. (2010). The SMS uprising – mobile activism in Africa. Nairobi: Pambazuka Press.

Keywords/concepts

Mobile technology: Technology which enables electronic voice, video and data communications to devices not fixed at a location, usually by radio technology.
Cellular technology: Mobile communications technology which divides the area to be covered into small reception locations called cells.
Mobile phone: Wireless telephone handset operating by means of cellular technology.
Roaming: Technology through which mobile communication can be achieved elsewhere than in the ‘home area’.
Mobile services: Services delivered to software running on mobile devices.
Location aware services: Mobile services exploiting the cellular system’s knowledge of where the mobile device is currently located.

Unit summary

In this unit you learned about the ever growing importance of mobile devices, communication and information systems and mobile services, particularly in the developing world and in Africa. You learned how a mobile phone is far more than an instrument for making ‘cordless’ telephone calls and sending text messages, and how the device is in fact becoming, for probably the majority of the people in Africa, their primary informational tool.

Unit 12 The Down Side of the Internet

Introduction

Is everything about the Internet uncontroversially beneficial, an unqualified boon to humanity? You know, of course, that it is not; at least, not uncontroversially. Since the Internet is a subset of human affairs, maybe even a superset of human affairs, it contains the bad guys as well as the good, the uninformed and incompetent as well as the experts, the jokers as well as the serious. Just as you have to have your wits about you to avoid being deceived or defrauded in real life, so too on the Internet. That is the first part of our discussion.

Figure 1: A flood of information! Adaptation of Hokusai's "The Wave off Kanagawa" by Bill Torbitt

The second part addresses the question of whether the Internet is actually good for us and our brains.
Does the huge flood of information and entertainment available on line put us into a continuously distracted state, with sapped powers of concentration and zero attention span? Do we exchange a real life for a life on-line? There has been furious discussion on this issue, which should certainly help to revive our brain power. There was a tragic (reported) case of a Korean couple, so absorbed in the care of their virtual ‘second life’ baby that they neglected their real child, who died of malnutrition. There are those who, when a new comet is sighted, rush to their computers to find images of it on the Internet, instead of stepping outside to look in the sky themselves! Some might joke that this problem started in the Stone Age, when the elders of the group condemned the youngsters for sitting all day in the cave painting pictures of animals on the walls, instead of going out to hunt the real thing!

Objectives

Upon completion of this unit you will be able to:
discuss the risks of activities online;
recognise the dangers lurking on the Internet;
avoid the above dangers;
discuss the proposition that the Internet is making us ‘dumber’;
formulate your own opinions on how the Internet has changed our lives, and whether for the better.

Prescribed reading

Wikipedia articles on computer crime, phishing, cyber stalking, ID theft etc.

Additional reading

Keen, A. (2008). The cult of the amateur. Nicholas Brealey Publishing.
Carr, N. (2010). The Shallows. Atlantic Books.

1. The dangers of the Internet - Cybercrime

Cyberspace is a huge area for undesirables and criminals both to operate in and to hide in, so that in many ways cybercrime is more difficult to control and police than conventional crime. As mentioned, it is much easier to be anonymous online! We should distinguish between crime involving attacks on the network itself, and crime merely using the facilities of the network.
The former category includes viruses and other malware, denial of service attacks and the creation of botnets, which will be discussed only briefly because of the technical aspects; the latter includes ID theft, fraud, cyber stalking or cyber bullying, phishing and cyber terrorism. Perpetrators can be anyone from student jokers, to organised criminals operating for commercial gain, to dedicated cyberterrorists working for political ends, right up to state level. Unfortunately, some countries themselves have acquired a reputation for hosting and even sheltering both commercial and political cybercriminals.

It should be noted that, as always, the Internet can work both ways. With enormously powerful means of searching and tracking, cyber-investigators and detectives also have much more powerful tools at their disposal.

In-text question

What measures do you take to protect yourself when online? You should take some, and not just anti-virus software!

2. ID Theft and Phishing

Reflection

This should more properly be called ID fraud, since it is impossible to actually steal someone’s identity (or is it? This may be a philosophical question: are you more than just your ID number? What if you were cloned?!)

2.1 ID Theft

ID theft comprises appropriating somebody else’s personal information for various illicit purposes, such as illegally buying goods on their bank account, acquiring credit, impersonating someone else if you are accused of a crime (so the other person gets the blame!), getting medical records for ‘free’ treatment, or obtaining personal information for blackmail. It is important to keep your most important details as confidential as possible. In America, where so much depends on your Social Security number, the disclosure of your number into the wrong hands can cause serious problems. Acquiring someone else’s details is not necessarily a high-tech matter.
Fraudsters can advertise bogus jobs, to which innocent applicants often reply with their CVs, which contain their full names, date of birth, passport number and even a photocopy of their ID card! (This process is made easier by lazy employers advising that ‘only successful applicants will be contacted’. What happens to the unsuccessful CVs?) Otherwise fraudsters can employ people to look through dustbins, often left outside for a couple of days, to retrieve discarded but not shredded bank statements or credit card slips!

High-tech methods of identity theft include phishing, and key logging software, installed without your knowledge by malware, which records your every key stroke (for instance when you are typing in your credit card number) and sends it back to its ‘master’. It hardly needs to be said that you should avoid putting sensitive personal data online: your ID number or physical address and, if you live in a small country where you can be easily identified, maybe not your real name either.

Activity 1

Time required: 1 hour

WE ARE NOT ENCOURAGING OR CONDONING ANY ILLEGAL ACTIVITY! However, supposing you were tasked with finding out a given person’s date of birth, postal address and ID number, how would you go about it?

Feedback

You could try anything from the phone directory, electoral register, municipal records and census data, all of which are public documents. Google them, or check their Facebook entry!

2.2 Phishing

Figure 2: Phishing

An ironic misspelling of ‘fishing’, this technique attempts to gain confidential information (for ID theft purposes) not by hidden malware but by duping online users into willingly supplying it! This is usually done by sending emails purportedly from a bank saying that your account or banking facilities will be suspended unless you ‘confirm’ your current account details and password for a routine security check.
There are variations on this, but all prey on the user’s anxiety and wish to do the right thing by their bank. There are now more ‘innovative’ non-bank scenarios asking you to join a worthwhile charitable organization, again by supplying your details, and then of course there are the iconic ‘Nigerian’ scams, not necessarily from Nigeria, which present you with even more imaginative situations.

It again goes without saying that you should not be taken in by this. Firstly, phishers often do not know which bank you use, so you will get emails from irrelevant ‘banks’ which are obviously bogus. Secondly, since phishers often operate from countries where English is not the native language, the grammar and spelling are frequently rather strange. Thirdly, although a quoted URL in the mail may look legitimate, if you hover over it (don’t click!) quite a different link appears in the status bar at the bottom of the screen, with a strange and unfamiliar domain or country code. It is unlikely, for instance, that the Standard Bank has its headquarters in Kazakhstan or Ukraine.

Tip
The best tip to avoid phishing or any attempt at ID theft is not to respond to any unexpected message or request online. Do not accept Facebook friend requests from anyone you do not know. Do not answer puzzling emails (and it is a good idea not to have automated ‘out of the office’ replies to emails, since these confirm your email address). Do not give your phone number in an email even if it is requested by someone you know: call and give it directly. If you have to buy something by giving a credit card number by email, either split it (send one half of the number in one email, the rest in another some time later) or send it by old-fashioned fax, which is more secure. And as your bank and ISP are weary of pointing out, they will never ask for your details or passwords by email.
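The ‘hover, don’t click’ check described above can also be written down as a small program. The following is only a minimal sketch in Python, not a complete anti-phishing tool. The domain standardbank.example and the function names real_host, is_trusted and country_code are made up for illustration; a real list of trusted domains would contain the sites you actually bank with.

```python
from urllib.parse import urlparse

# Hypothetical allow-list for illustration only; ".example" is a reserved
# domain, so this is not the address of any real bank.
TRUSTED_DOMAINS = {"standardbank.example"}


def real_host(href):
    """Return the lower-cased host name that a link really points to."""
    host = urlparse(href).hostname  # None if the link has no host part
    return (host or "").lower()


def is_trusted(href, trusted=TRUSTED_DOMAINS):
    """True only if the link uses https and its host is a trusted domain
    or a subdomain of one."""
    host = real_host(href)
    on_trusted_domain = any(host == d or host.endswith("." + d) for d in trusted)
    return urlparse(href).scheme == "https" and on_trusted_domain


def country_code(href):
    """Return the last label of the host name, for example 'kz' or 'na'."""
    return real_host(href).rsplit(".", 1)[-1]


# A link that contains the bank's name but really belongs to another domain.
suspicious = "http://login.standardbank.example.secure-check.kz/confirm"
print(real_host(suspicious), is_trusted(suspicious), country_code(suspicious))
```

The point of the sketch is that the name of the bank appearing somewhere in a link proves nothing: what matters is the end of the host name, which is exactly what you read in the status bar when you hover over a link.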
When about to enter any confidential details online, even on a trusted site, check that the ‘encrypted’ (padlock) symbol appears in the status bar. Get online banking facilities and check your accounts regularly for ‘unknown’ payments. Consider using a bank service (it is usually free) which sends you an SMS whenever any transaction takes place on any of your accounts.

3. Cyber-stalking, cyber-harassing, cyber-bullying

This is one of the least savoury aspects of the Internet. Stalkers and bullies have always been with us. Stalkers are usually reclusive non-achievers who become obsessed with a person, often a celebrity, and bombard them with unwanted messages and ‘gifts’, and sometimes follow them around physically, occasionally resulting in violence. Bullies, as is well known, are cowards who seek to boost their self-esteem by victimizing anyone around them whom they perceive to be weaker or who will not fight back. Usually this is associated with school life, but it can continue into adulthood.

The trouble with the Internet is that, as usual, it makes all this easier. The victims of stalkers and bullies, who are often first the targets of ID theft, can be bombarded with insulting or threatening emails, sent anonymously, or can have unpleasant or worse comments posted on their Facebook pages, even though the site in theory forbids these.

Some tragic cases have resulted in suicide. Megan Meier, a 13-year-old girl in the US, was befriended by a ‘boy’ online who later turned against her, telling her she was not a nice person etc. Soon after, Megan committed suicide. The sender of the messages, the ‘boy’, turned out to be the mother of one of her school friends! Incidentally, the identity of the attacker was revealed by bloggers who uncovered her IP address and real email. [This raises yet another question: is ‘trial by Internet’ justified?] The person involved was never legally convicted.

Few countries have any specific legislation against cyber-harassment.
Namibia certainly has none. Anyone thinking they are being harassed online should close their Facebook account and get another email address. Parents who notice their children looking upset after going online should urgently investigate the cause.

4. Cyber-terrorism and cyber-warfare

It may be wondered today why terrorists bother with old-fashioned bombs when far more damage to their targets, governments or individuals, can be done online. Instead of destroying tanks, destroy the military communications network. Block the control systems which keep the country’s transport links going. Disrupt official websites by denial of service attacks, so that people cannot get news of what is happening. Ironically, the more ‘advanced’ and ‘networked’ a state is, the more vulnerable it will be. Although Al Qaeda and other organizations have professional websites, it is doubtful whether they yet operate on this level.

When it comes to state level, however, it is a different matter. In 2007 the small East European state of Estonia, incidentally highly networked, had the effrontery to move a statue of a Russian soldier from the centre of its capital (Estonia was part of the Soviet Union from 1940 to 1991). Russia was not amused. Shortly afterwards, Estonian networks started collapsing, due to mysterious denial of service attacks. The country was paralysed for several days. Russia denied responsibility.

In 2008 a mini-war broke out between Russia and Georgia, a former Soviet state in the Caucasus Mountains. Two Russian-influenced regions of Georgia broke away. During this time again, Georgian computer networks started breaking down, greatly impacting communications in the country.

Another recent event that caused quite a stir in the computer community was Stuxnet, a piece of malware apparently intended to damage uranium enrichment plants of the kind used in the Iranian nuclear programme.
Stuxnet was crafted with extraordinary effort and was apparently successful in delaying the development of an atomic bomb by Iran.

Without pointing a finger at any particular country, it is clear that with state cyber resources, costing a fraction of a physical military programme, there would be a huge potential for cyber ‘warfare’, waged for strategic, political or commercial reasons.

A Story
A cleaning lady once paralysed an army of 370 000 soldiers: one day in 1999 this cleaning lady, working in the HQ of the German Armed Forces, used a waxy wood polish instead of a cleaning agent to clean the office doors of the chief of staff and his subordinates. All offices had electronic locks, accessible only by a magnetic card. The polish clogged the reading slots, and for half a day no General in the German army could get into his office, access his high security communication system, phone anyone or in fact do anything. A good day to attack the country!

5. Botnets

Although we will not go into technical matters, which should rather be covered in a network security course, the nature of a botnet (= robot net) should be explained, since your computer may well be in one! A botnet is a collection of computers, sometimes hundreds or thousands of them, which have been infected by a downloaded program that turns each computer into a ‘slave’. The slave is used to forward email spam, malware or fake antivirus software, or to take part in a denial of service attack, all at the bidding of the ‘master’ or botnet ‘herder’. Botnet ‘owners’ can rent out the ‘services’ of the botnet to other criminals. Bizarrely, various botnets can combine together in a kind of botnet Internet. They are difficult to find and disable.

Figure 3: How a botnet works

1. A botnet operator sends out viruses or worms whose payload is a malicious application, the bot, infecting ordinary users' computers.
2. The bot on the infected PC logs into a particular command and control (C&C) server (often an IRC server but, in some cases, a web server).
3. A spammer purchases access to the botnet from the operator.
4. The spammer sends instructions via the IRC server to the infected PCs, causing them to send out spam messages to mail servers.

If, in your inbox, you sometimes see a message bounced from a recipient you do not know and did not send a mail to, saying that it contained an illegal attachment, that attachment may well have been a botnet ‘worm’, and your computer may be part of the botnet!

6. Is the Internet making us ‘dumber’?

The Internet has the potential to make us better informed, and some say is already making us better informed, than any generation in history. Anything we want to know, read, see or listen to is literally at our fingertips. And Web 2.0 makes it all interactive.

But some people are not happy. Take this quotation from “The cult of the amateur” by Andrew Keen (the title of the book itself giving some indication of his concerns):

“Before the Internet it seemed like a joke: if you provide an infinite number of monkeys with typewriters one of them will eventually come up with a masterpiece. But with the web now firmly established in its second evolutionary phase – in which users create the content on blogs, podcasts and streamed video – the infinite monkey theory doesn’t seem so funny anymore.”

and again:

“many of the ideas promoted by champions of Web 2.0 are gravely flawed. Instead of creating masterpieces, the millions of exuberant monkeys are creating an endless digital forest of mediocrity: uninformed political commentary, unseemly home videos, embarrassingly amateurish music, unreadable poems, essays and novels.”

In other words, everyone can now talk and is talking, but is anyone listening? Nobody has to be an expert to be on the Internet any more.
Does this mean the death of the qualified ‘expert’ or is it just the final democratisation of information?

In-text question
Are you irritated by the amount of rubbish on the Internet, and the distractions it causes? Or do you quickly get through the junk to find what you are looking for?

The other concern is the more subtle question of what the Internet is doing to our brains, or what influence the structure of the Internet has or will have on the structure of the brain. Another recent book, “The Shallows” by Nicholas Carr, addresses this question. The rest of this unit summarises some of the points he makes. Needless to say, his views are controversial.

Surfing the Internet is quite different from the corresponding ‘traditional’ activity of reading a book. You spend hours or days on a book – the average time spent on a web site is 20 seconds! Does this mental short-termism have any effect on us?

The 1960s philosopher Marshall McLuhan coined the iconic phrase “the medium is the message”. He meant that eventually the content of the message matters less than the form of the medium itself in influencing the way we think and act in response. His quotation was:

“electronic technology is at the gate, and we are numb, deaf and blind about its encounter with the Gutenberg technology (i.e. books)”

McLuhan’s electronic technology was television: we should now substitute the Internet. So, as the 1960s were dazzled by TV, are we now too dazzled by the Internet to notice what it is doing to our heads?

“The Internet is a feast – one course of food after the other, each juicier than the last, with hardly a moment to breathe between bites. With smart phones, the feast has become a moveable one, available almost any time, anywhere. The medium works its magic, or mischief, on the nervous system itself.”
From Carr: The Shallows

In dealing with this, the brain is an extremely ‘plastic’ organ, and continuously rewires itself to deal with the sensory input to which it is subjected every day, how to learn from it and what data to put into ‘memory’. Carr says:

“I’ve had an uncomfortable feeling that someone or something has been tinkering with my brain, remapping neural circuitry, reprogramming memory… I used to find it easy to immerse myself in a book or lengthy article… but now… my concentration starts to drift after a page or two… I think I know what’s going on… for well over 10 years now, I’ve been spending a lot of time online, searching and surfing… Whether I’m online or not, my mind now expects to take in information the way the net distributes it – a swift stream of particles. I used to be a diver, now I am a jet skier!”

A study was done in 2008 of the effects of Internet use on the young – the ‘Net generation’. It was found that ‘digital immersion’ has affected the way they absorb information. They don’t read a page from left to right, or top to bottom. They skip around, scanning for interesting information or links.

What we are seeing is the loss of linearity. It seems there are two ways of thinking: the linear mind, calm, focussed and attentive; or the mind that needs to take in and dole out information in short, disjointed, hyperlinked, overlapping bursts. Which one is characteristic of surfing the Internet?

What we are seeing is the age of distraction. We are drowning in a sea of distraction: every few seconds there is a text to read, another incoming email to respond to, dozens of interesting links to follow on every web page we call up, Facebook walls to update, ‘tweets’ to read and reply to, cell calls to answer.

What of the supreme feature of the web, the hyperlink? There is evidence that ‘hyperlink’ technology affects the quality of our comprehension.
In one study, a researcher made groups of people read the same piece of online writing, but varied the number of links included in the passage. She then tested the readers’ comprehension by asking them to write a summary of what they had read and to complete a multiple-choice test. She found that comprehension declined as the number of links increased. Readers were forced to devote more and more of their attention and brain power to evaluating the links and deciding whether to click on them. That left less attention and fewer cognitive resources to devote to understanding what they were reading.

So are we becoming ‘pancake brains’? Technically aware, but spread out, wide and thin, knowing a slight amount about everything, but with no time to take more than a quick bite out of anything?

Actually, research suggests that this is not too likely. The same argument was brought forward against television when it became ubiquitous in the 1960s. It was alleged that people would only consume and no longer take part in discussions. But instead of making the human race stupid, TV has actually helped with educating citizens, and an ordinary grade 12 pupil now has more knowledge than a scientist of 400 years ago!

Activity 2
Time required: 2 hours
Either buy (very commendable!) or find online extracts of books such as The Shallows itself, or other possible titles: Future Minds – How the Digital Age is changing our minds by Richard Watson, or The Net Delusion – How not to liberate the world by Evgeny Morozov. Read these, and find online reviews and criticisms of these books. Google “reviews of The Shallows” or the other books to find plenty of countervailing arguments. Write about a page of notes on whether the Internet is a Good or Bad thing, in the context of the extracts you have read.

Feedback
Perhaps it all depends on the mind and mental discipline of the Internet user.
The Internet is a huge benefit in finding the information or goods you want, but if you allow yourself to be distracted and spend all your time browsing the net, that is up to you. It is possible that the huge amount of things to find and do on the net does change your way of thinking, and lessens your attention span, but that is not necessarily bad. The last book (The Net Delusion) is somewhat different, analysing the popular assumption that the net and mobile technology confound dictators and encourage democracy. Not necessarily so: dictators are equally able to use the net for their own ends.

But one thing is sure – the world has changed, and whenever change or the threat of change looms, there are always those who decry it. Hundreds of years ago, some thought that the invention of printing would cause the corrosion of memory, stop people coming to church because they could buy and read the Bible themselves, and put information into the hands of lower classes of people who did not need it!

References
Carr, N. (2010). The Shallows. New York: Atlantic Books

Keywords/concepts
Cybercrime: Crimes committed against the integrity of the network itself, or ‘conventional’ crimes committed with the aid of the Internet.
Identity theft: Misappropriation of personal details for fraudulent purposes.
Phishing: Attempt to obtain personal financial details by deception.
Cyberharassing: Harassing by sending threatening or malicious messages online.
Cyber terrorism: Tactics used by terrorists in attacking networks or information systems of the target country.
Botnet: Network of computers taken over or ‘enslaved’ by malicious software, to transmit further malware at the behest of their ‘masters’.

Unit summary
In this unit you learned about the negative side of online facilities and our online existence. You learned about the down side of the Internet, cybercrime, fraud and identity theft.
You learned about the more sinister aspects of cyber stalking and even of cyber terrorism. You learned about the care that needs to be taken when you are posting any personal and financial information online. Finally, we touched on the philosophical debate on whether the Internet in fact has an effect on our mental processes: the way we think and concentrate, and whether we live in an age of distraction, or whether the Internet is just the latest phenomenon in the way technology brings information to us, and the way we deal with it.