Econometric Principles and Data Analysis
product: 4339 | course code: c230 | c330

© Centre for Financial and Management Studies SOAS, University of London 1999, revised 2003, 2007, revised 2009, 2010, revised 2013

All rights reserved. No part of this course material may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other means, including photocopying and recording, or in information storage or retrieval systems, without written permission from the Centre for Financial & Management Studies, SOAS, University of London.

Econometric Principles & Data Analysis
Course Introduction and Overview

Contents
1 Course Objectives
2 The Course Authors
3 The Course Structure
4 Learning Outcomes
5 Study Materials
6 Assessment

1 Course Objectives
This course provides an introduction to econometric methods. In brief, the course examines how we can start from relationships suggested by financial and economic theory, formulate those relationships in mathematical and statistical models, estimate those models using sample data, and make statements based on the parameters of the estimated models. The course examines the assumptions that are necessary for the estimators to have desirable properties, and the assumptions necessary for us to make statistical inference based on the estimated models. In addition, the course explores what happens when these assumptions are not satisfied, and what we can do in these circumstances. The course concludes with an examination of model selection.

2 The Course Authors
The course, and its more advanced sequel, Econometric Analysis and Applications, were designed and written by Dr Graham Smith, who is Senior Lecturer in the Department of Economics, SOAS, where he teaches econometrics to MSc students and carries out research on empirical finance.
His main research interests focus on emerging stock markets and he has published extensively in international refereed journals. His recent research demonstrates that stock market efficiency is determined by market size, liquidity and the quality of markets.

The course has been revised by Dr Jonathan Simms, who is a tutor for CeFiMS, and has taught at the University of Manchester, the University of Durham and the University of London. He has contributed to the development of various CeFiMS courses including Econometric Analysis and Applications; Financial Econometrics; Risk Management: Principles & Applications; Public Financial Management: Reporting and Audit; and Introduction to Law and to Finance.

3 The Course Structure
The paragraphs following the list of topics presented in the units provide brief descriptions of the units’ content. They are intended as an introduction and overview of the course. More complete, detailed explanation, analysis and discussion are provided in the units themselves, and in the course textbook. So don’t worry if you do not understand everything in this short introduction.

Unit 1 Introduction to Econometrics and Regression Analysis
1.1 What is Econometrics?
1.2 How to Use the Course Texts
1.3 Ideas – The Concept of Regression
1.4 Study Guide
1.5 An Example – The Consumption Function
1.6 Summary
1.7 Eviews
1.8 Exercises
1.9 Answers to Exercises

Unit 2 The Classical Linear Regression Model
2.1 Ideas and Issues
2.2 Study Guide
2.3 Example – the Single Index Model (SIM)
2.4 Summary
2.5 Exercises
2.6 Answers to Exercises

Unit 3 Hypothesis Testing
3.1 Ideas and Issues
3.2 Study Guide
3.3 Example – The Capital Asset Pricing Model
3.4 Summary
3.5 Exercises
3.6 Answers to Exercises

Unit 4 The Multiple Regression Model
4.1 Ideas and Issues
4.2 Study Guide
4.3 Example – A Multi-Index Model
4.4 Summary
4.5 Exercises
4.6 Answers to Exercises

Unit 5 Heteroscedasticity
5.1 Ideas and Issues
5.2 Study Guide
5.3 Example – Price-Earnings Ratio
5.4 Summary
5.5 Exercises
5.6 Answers to Exercises

Unit 6 Autocorrelation
6.1 Ideas and Issues
6.2 Study Guide
6.3 Example – The Single-Index Model
6.4 Summary
6.5 Exercises
6.6 Answers to Exercises

Unit 7 Nonnormal Disturbances
7.1 Ideas and Issues
7.2 Study Guide
7.3 Examples
7.4 Summary
7.5 Exercises
7.6 Answers to Exercises
Appendix 1: Small-Sample Critical Values for the Jarque-Bera Test
Appendix 2: Stock Market Indices

Unit 8 Model Selection and Course Summary
8.1 Ideas and Issues
8.2 Study Guide
8.3 Example: the Demand for Money Function
8.4 Summary
8.5 Exercises
8.6 Answers to Exercises
8.7 Course Summary: ‘What you do and do not know’

Unit 1 provides an introduction to econometrics and regression analysis. By regression we mean an equation that captures the mathematical relationship between the variables, and also the imperfect nature of that relationship.
The unit introduces the stages of an econometric investigation:
• statement of the theory
• collection of data
• mathematical model of the theory (an exact relationship between variables)
• econometric model of the theory (a stochastic model of the relationship between variables)
• parameter estimation
• checking for model adequacy
• tests of hypotheses
• prediction.

Unit 1 also provides guidance on how to use the study materials. In addition, it provides a brief revision of how to calculate financial rates of return.

Each unit includes a worked example. (In Unit 1, the example concerns the relation between spot and forward exchange rates.) All of the units also contain exercises, drawn from a wide range of econometric studies, for you to do in order to develop your own understanding and confidence. Data for the exercises are provided. The data used in the examples are also provided so that you can replicate the results presented in the unit (replicating the results in the example is presented as an exercise).

The course uses the software package Eviews. Results from Eviews are presented in the units. You are provided with a copy of Eviews to do the unit exercises. Answers for the exercises are provided at the end of each unit, but you should look at the answers only after you have done the exercises yourself!

Data on the stock price of Delta Airlines Inc. and the New York Stock Exchange Composite Index are introduced in the exercises in Unit 1. This data set is used in a number of units throughout the course, in the worked examples or the exercises. By applying different econometric tools with the same data set, it is hoped you will develop a rounded view of how the methods you will learn relate to each other. A variety of other models and data sets are also used.

Unit 2 presents the classical linear regression model.
It explains the method of ‘ordinary least squares’ (OLS) and how that can be used to estimate the unknown parameters of a regression equation using sample data. In this unit we are concerned with models containing two variables; we are trying to discover how one variable – the explanatory variable – explains another variable – the dependent variable – and estimate the parameters in that relationship. We then need to ask whether we can make statements about the true, unknown, parameters of the model, based on our estimated values. To do this we need to make a number of assumptions. These assumptions, if satisfied, ensure that the estimators we use have desirable properties (in brief and oversimplified terms: the estimators are accurate and efficient). If the assumptions are satisfied, we can also make predictions about the unknown model parameters, and we can specify, precisely, how confident we are about those predictions. Unit 2 also explains goodness of fit: how closely our estimated model fits our sample data. These ideas are demonstrated using the single-index market model applied to Delta Airlines Inc., and the British retailer Marks & Spencer. Unit 3 explores how to test hypotheses. Based on our estimated model coefficients, can we answer questions of the form: • Is the true, unknown coefficient negative, zero, or positive? • Does it take a particular value? • Is there actually a relationship between the two variables? Unit 3 uses the capital asset pricing model (CAPM) for GlaxoSmithKline to demonstrate hypothesis testing. Hypothesis testing is demonstrated further in the exercises with the single-index model. So, for example, we might be concerned with how we can test whether the stock we are interested in is defensive or aggressive; is the company beta less than one or greater than one? The efficiency of foreign exchange markets is also examined. 
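The ideas in Units 2 and 3 can be sketched in a few lines of code. The following is a minimal illustration only, not the course's Eviews workflow: it estimates a two-variable regression by ordinary least squares and forms the t-statistic for the hypothesis that the slope (the company beta in a single-index model) equals one. The function names `ols_bivariate` and `t_stat`, and the return figures, are hypothetical and chosen purely for illustration.

```python
import math


def ols_bivariate(x, y):
    """Estimate y = a + b*x + u by ordinary least squares (two-variable model)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squares and cross-products in deviation form
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                 # slope estimate
    a = y_bar - b * x_bar         # intercept estimate
    residuals = [yi - a - b * xi for xi, yi in zip(x, y)]
    rss = sum(e ** 2 for e in residuals)          # residual sum of squares
    tss = sum((yi - y_bar) ** 2 for yi in y)      # total sum of squares
    sigma2 = rss / (n - 2)        # estimated disturbance variance
    se_b = math.sqrt(sigma2 / sxx)                # standard error of the slope
    return {"a": a, "b": b, "se_b": se_b, "r2": 1 - rss / tss, "df": n - 2}


def t_stat(estimate, std_error, hypothesised=0.0):
    """t-statistic for H0: coefficient equals the hypothesised value."""
    return (estimate - hypothesised) / std_error


# Hypothetical weekly returns (made-up numbers, not course data)
market = [0.010, -0.020, 0.015, 0.030, -0.010, 0.005]
stock = [0.012, -0.015, 0.010, 0.028, -0.012, 0.002]

fit = ols_bivariate(market, stock)
t_beta_one = t_stat(fit["b"], fit["se_b"], hypothesised=1.0)
print(fit["b"], fit["se_b"], fit["r2"], t_beta_one)
```

The t-statistic would then be compared with a critical value from the t distribution with `fit["df"]` degrees of freedom, for example to judge whether a stock appears defensive (beta below one) or aggressive (beta above one).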
Unit 4 extends the analysis to the multiple regression model; these are regression models in which one variable is explained by two or more variables. The unit examines the assumptions necessary to estimate and make predictions with such models. The unit asks what happens if, in a multiple regression model, there is a relationship between any of the explanatory variables, in addition to the relationships we hope to discover between the explanatory variables and the dependent variable (this is called multicollinearity). The techniques of multiple regression are demonstrated with an example of a multi-index model.

Units 5, 6 and 7 are concerned with what happens if a number of the assumptions of the classical linear regression model are not satisfied. What are the consequences for the properties of the ordinary least squares estimators, and can we still make predictions about the unknown model parameters based on our estimated model?

Unit 5 is concerned with heteroscedasticity. What is that? Here is a very brief and simplified explanation; a more detailed and precise explanation is provided in Unit 5. Unit 1 explains how we can specify a mathematical relationship between variables. The actual relationship between variables is not exact, and we attempt to capture this by including an error or disturbance term in the regression equation. One of the assumptions we make is that the variance of the disturbance term – how much it varies about its mean value – is constant for all observations. This is the assumption of homoscedasticity, and is explained in Unit 2. In some econometric studies this assumption may not be satisfied. Consider a cross-section study of commission rates for different brokerage companies. The disturbance term also attempts to capture those influences on commission rates that we have not included in our model.
Is it likely that the variance of this disturbance term will be constant for all brokerage companies? If the variance of the disturbance term is not constant, we say there is heteroscedasticity. Unit 5 examines the consequences of heteroscedasticity: • What are the effects on the properties of OLS estimators, and can we still make predictions based on our estimated model? The unit examines how heteroscedasticity can be identified, and how we can deal with it, either by transforming the model or by using a different estimation method. If we know what form the heteroscedasticity takes, we can use the method of weighted least squares. Heteroscedasticity is demonstrated with a study of price-earnings ratios estimated for a cross-section of companies. Unit 6 is concerned with autocorrelation. Again, here is a very simple and brief explanation; a more precise and formal explanation is provided in Unit 6. Consider again the disturbance term that we include in our regression equation. The disturbance term reflects the stochastic nature of the relationship between variables, and also attempts to capture the elements that we have not included in the model. Another assumption we make about the disturbance term is that the disturbance terms for different observations (e.g. if using annual data, last year and this year, or if using daily data, yesterday and today) are not related. This is the assumption of noncorrelated disturbances, and is explained in Unit 2. If the disturbances for different observations are related, we say that the disturbance term is serially correlated or ‘autocorrelated’. For example, an economic or financial shock in one month may have persistent effects in following months, and if the model does not explicitly include such persistence effects, the disturbance terms in different months will be correlated. 
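The two assumptions about the disturbance term described above can be written compactly. The notation below ($Y_t$, $X_t$, $u_t$, $\alpha$, $\beta$, $\sigma^2$, $\rho$) is an assumption made for illustration and may differ from the notation used in the units and the textbook; the first-order form in the last line is only one common way in which persistence might arise.

```latex
\begin{align*}
Y_t &= \alpha + \beta X_t + u_t
  && \text{(regression with disturbance } u_t\text{)} \\
\operatorname{Var}(u_t) &= \sigma^2 \quad \text{for all } t
  && \text{(homoscedasticity)} \\
\operatorname{Var}(u_t) &= \sigma_t^2
  && \text{(heteroscedasticity: the variance differs across observations)} \\
\operatorname{Cov}(u_t, u_s) &= 0 \quad \text{for all } t \neq s
  && \text{(no autocorrelation)} \\
u_t &= \rho\, u_{t-1} + \varepsilon_t, \quad 0 < |\rho| < 1
  && \text{(first-order autocorrelation, one possible form)}
\end{align*}
```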
Unit 6 examines the implications of autocorrelation for the properties of OLS estimators, and also the consequences for prediction based on OLS estimators. It also shows how to identify autocorrelation using plots and more formal tests, and what can be done to take account of autocorrelation, including changing the method of estimation. The effects of autocorrelated disturbances are demonstrated with the single-index market model for Delta Airlines, and a model of spot and forward exchange rates.

Unit 7 is concerned with the assumption of normality. In order to make predictions about the true, unknown model parameters, based on our estimated values, we need to assume that the disturbance terms are distributed normally – that is, they follow a normal distribution. You are probably already familiar with the normal distribution from your other studies. It is a probability distribution with known properties, which allows us to make statements concerning the unknown model parameters with a certain degree of confidence – for example, we can reject a hypothesis about a parameter with a 5% chance of being wrong, or we can be 95% confident that an unknown parameter takes a value within a certain range of values. If the disturbance terms are not normally distributed, we are unable to make such predictions, and it also has consequences for the properties of the OLS estimators. Unit 7 explains the effects of having disturbances that are not distributed normally, the tests to detect non-normal disturbances, and what can be done about non-normal disturbances. This includes the use of dummy variables to take account of outliers (data points which are very different from the rest of the sample). These methods are demonstrated with two examples: stock market returns and the single-index model for Marks & Spencer. The exercises include consideration of the SIM for Delta Airlines and for Bank of America.
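One widely used test for non-normal disturbances is the Jarque-Bera test, which the course outline lists in the appendix to Unit 7. The sketch below shows the standard large-sample form of the statistic, computed from the skewness and kurtosis of a set of residuals; it is an illustration under stated assumptions, not a reproduction of how Eviews computes it. The function names and the residual values are hypothetical.

```python
import math


def jarque_bera(values):
    """Return (JB statistic, skewness, kurtosis) for a list of residuals."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mean = sum(values) / n
    # Central moments, each divided by n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("values have zero variance")
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2       # equals 3 for a normal distribution
    jb = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    return jb, skewness, kurtosis


def chi2_2df_p_value(statistic):
    """Upper-tail probability of a chi-square(2) variable: exp(-x/2)."""
    return math.exp(-statistic / 2)


# Hypothetical residuals (made-up numbers, not course data)
residuals = [-0.021, 0.004, 0.013, -0.006, 0.030, -0.011, 0.002, -0.009]
jb, skew, kurt = jarque_bera(residuals)
print(jb, chi2_2df_p_value(jb))
```

Under the null hypothesis of normality the statistic is approximately chi-square with two degrees of freedom, but this is a large-sample result, so small samples call for more caution. Applied to the statistic 6.410157 quoted in the specimen examination later in this document, `chi2_2df_p_value` returns approximately 0.0406, in line with the probability reported there.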
Unit 8 is concerned with model selection. One of the assumptions we make is that the model we estimate is correctly specified: the regression equation includes all relevant variables, and the functional form of the relationship is specified correctly – variables are included correctly as levels, or their logged values are included, or perhaps squared values of the variables are included. If the model is not correctly specified, this has consequences for the properties of the OLS estimators and for prediction based on those estimators. In particular, Unit 8 examines the consequences of omitting a relevant explanatory variable, including an irrelevant explanatory variable, and using the wrong functional form. The unit explains methods to identify misspecified equations. These include tests specifically designed to identify misspecified models. In addition, evidence of heteroscedasticity, autocorrelated errors, or non-normal errors may be a further sign that a model is not correctly specified. Unit 8 also shows how we can decide between different specifications of a particular economic relationship. It demonstrates model selection using the Delta Airlines data set, and also the SIM for IBM stock. Finally, Unit 8 includes a summary of the course, to help with your revision for the final examination.

More advanced topics in econometrics are studied in the CeFiMS course Econometric Analysis & Applications. These include more use of dummy variables; dynamic models: lags and expectations; simultaneous equation models; and time series analysis: stationarity and nonstationarity, and forecasting.
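One common statistical aid when deciding between specifications is to compare information criteria, which trade goodness of fit against the number of estimated parameters. The sketch below is an illustration only: the function name is hypothetical, and the per-observation scaling is an assumption, chosen because it reproduces the Akaike, Schwarz and Hannan-Quinn figures shown in the specimen examination output later in this document (log likelihood 385.7823, 156 observations, two estimated coefficients). Other texts report unscaled versions.

```python
import math


def information_criteria(log_likelihood, n_obs, n_params):
    """Return (Akaike, Schwarz, Hannan-Quinn) criteria, each divided by n_obs."""
    if n_obs < 3 or n_params < 1:
        raise ValueError("need n_obs >= 3 and n_params >= 1")
    minus_two_ll = -2.0 * log_likelihood
    akaike = (minus_two_ll + 2 * n_params) / n_obs
    schwarz = (minus_two_ll + n_params * math.log(n_obs)) / n_obs
    hannan_quinn = (minus_two_ll + 2 * n_params * math.log(math.log(n_obs))) / n_obs
    return akaike, schwarz, hannan_quinn


# Figures taken from the specimen examination output in this document
aic, sic, hq = information_criteria(385.7823, 156, 2)
print(round(aic, 6), round(sic, 6), round(hq, 6))
```

When two models are estimated for the same dependent variable over the same sample, the specification with the lower value of a given criterion is preferred on that criterion. Such comparisons complement, rather than replace, the specification tests described above.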
4 Learning Outcomes
After studying this course you will be able to:
• explain the principles of regression analysis
• outline the assumptions of the classical normal linear regression model, and discuss the significance of these assumptions
• explain the method of ordinary least squares
• produce and interpret plots of data
• use the program Eviews to estimate a regression equation, and interpret the results, for bivariate (two-variable) regression models and multiple regression models
• test hypotheses concerning model parameters
• test joint hypotheses concerning more than one variable
• discuss the consequences of multicollinearity, the methods for identifying multicollinearity, and the techniques for dealing with it
• explain what is meant by heteroscedasticity, and the consequences for OLS estimators and prediction based on those estimators
• assess the methods used to identify heteroscedasticity, including data plots and more formal tests, and the various techniques to deal with heteroscedasticity, including model transformations and estimation by weighted least squares
• explain autocorrelation, and discuss the consequences of autocorrelated disturbances for the properties of OLS estimators and prediction based on those estimators
• outline and discuss the methods used to identify autocorrelated disturbances, and what can be done about them, including estimation by generalised least squares
• discuss the consequences of disturbance terms not being normally distributed, tests for non-normal disturbances, and methods to deal with non-normal disturbances, including the use of dummy variables
• discuss the consequences of specifying equations incorrectly
• discuss the tests used to identify correct model specification, and statistical criteria for choosing between models
• use Eviews to conduct tests for heteroscedasticity, correlated disturbances, non-normal disturbances, functional form, and model selection
• use Eviews to estimate models in which the disturbance term is assumed to be heteroscedastic or autocorrelated.

5 Study Materials
These course units are your central learning resource; they structure your learning unit by unit. Each unit should be studied within a week. The course units are designed in the expectation that studying the unit and the associated readings in the textbook, and completing the exercises, will require 15 to 20 hours during the week.

Textbook
In addition to the course units you must read the assigned sections from the textbook, which is provided with your course materials:
Damodar N Gujarati and Dawn C Porter (2010) Essentials of Econometrics, New York: McGraw-Hill.
We have specifically used this textbook because it provides an excellent user-friendly introduction to econometric theory and techniques. You will notice that Gujarati and Porter present examples from finance, economics and business, because their book is an introduction to econometrics in general. The examples and exercises in the course units are drawn entirely from finance.

In each course unit there is a section, called Study Guide, which leads you through the relevant parts of the textbook, and helps you to read and understand the analysis presented there.

If, while studying this course, you find you need some revision in basic probability and statistics, you may find it useful to look at parts of Appendices A to D in the textbook, which cover probability, probability distributions, and statistical inference.

Eviews
You have been provided with a copy of Eviews, Student Edition. This is the econometrics software that you will use to do the exercises in the units, and also the data analysis part of your assignments. The results presented in the units are also from Eviews. Instructions to install Eviews, and to register your copy of the software, are included in the booklet that comes with the Eviews CD.
(Your student edition of Eviews will run for two years after installation, and you will be reminded of this every time you open the program.) You must register your copy of Eviews within 14 days of installing it on your computer. If you do not register your copy within 14 days, the software will stop working.

Eviews is very easy to use. As with any Windows program, you can operate it in a number of ways:
• there are drop-down menus
• selecting an object and then right-clicking provides a menu of available operations
• double-clicking an object opens it
• keyboard shortcuts work.

There is also the option to work with Commands; these are short statements that inform the program what you wish to do, and once you have built up your own vocabulary of useful Commands, this can be a very effective way of working. You can also combine all of these ways of working with Eviews.

In each unit there are instructions to help you use Eviews to do the exercises. In addition, Eviews includes help files, which you can read as pdf files, or navigate via the Eviews help and search facility. Unit 1 includes a section introducing Eviews.

Although easy to use, Eviews is a very powerful program. There are advanced features that you will not use on this course, and you should not be worried if you see these, either in the menus or the help files. The best advice is to stay focused on the subject that is being studied in each unit, and to do the exercises for the unit; this will reinforce your understanding and also develop your confidence in using data and Eviews.

Exercises
As already noted, there are exercises in every unit. These require you to work with Eviews and data files, available from the VLE in the course area for this study session, to do your own econometric analysis. It is very important that you attempt these exercises, and do not just look at the Answers at the end of the units.
Your understanding of the material you have studied in the unit will be greatly improved if you do the exercises yourself. You will also develop better understanding and confidence in using Eviews. The Instructions that accompany the exercises in the first few units are quite detailed, because they are intended to help you to start working with Eviews. As the units progress, it is assumed that you will gradually develop your understanding of the basic Eviews operations, and the Instructions then focus more and more on what new operations are required to do the Exercises in the units. If you find that you have forgotten how to do something, look back at the Instructions in the early units, because the basic operations will be the same. Podcast There is a podcast to accompany Econometric Principles & Data Analysis, in which Dr Simms discusses the course with Pasquale Scaramozzino, Professor of Economics at the Centre for Financial and Management Studies. The podcast is 18:26 minutes in length. Timings in the podcast are indicated below in brackets. The podcast begins by explaining what the course does (from 0:28), and provides advice on how to study econometrics (1:16). Dr Simms then discusses how to get the most out of the materials (2:58), including the examples and exercises, and Eviews, and explains the choice of the textbook, Essentials of Econometrics. The podcast then addresses the question of how a course in econometrics helps the understanding of financial markets (7:30). The discussion here emphasises the importance of being able to interpret regression results, and assessing the quality of those results; obtaining estimated equations is not enough in itself. 
Following this, the podcast considers how econometrics bridges the gap between theoretical financial models and financial data (11:56), explaining how econometrics allows us to test whether a particular theoretical model is appropriate or not, and how qualities displayed by the data can be used to improve models.

In addition to analysing the examples, completing the exercises, and writing your assignments, you are also encouraged to apply the methods you are learning to data sets with which you are familiar from your own working environment (14:37), and to consider how the methods relate to your work or areas of interest – this may enable you to develop a more intuitive understanding of the econometric techniques. Finally (16:57) there is a summary of the podcast discussion, and a consideration of the general approach to take to your study of econometrics, especially if you are unfamiliar with statistics and maths, or are returning to these subjects after a period of time.

We suggest that you listen to the podcast before you start studying Unit 1, and perhaps again half-way through the course when you have finished Unit 4. It may also provide a helpful revision at the end of the course, reinforcing your understanding of what you have learnt and providing an overall context.

We hope that you enjoy this course.

6 Assessment
Your performance on each course is assessed through two written assignments and one examination. The assignments are written after weeks four and eight of the course session and the examination is written at a local examination centre in October.

The assignment questions contain fairly detailed guidance about what is required. All assignment answers are limited to 2,500 words and are marked using marking guidelines.
When you receive your grade it is accompanied by comments on your paper, including advice about how you might improve, and any clarifications about matters you may not have understood. These comments are designed to help you master the subject and to improve your skills as you progress through your programme.

The written examinations are ‘unseen’ (you will only see the paper in the exam centre) and written by hand, over a three-hour period. We advise that you practise writing exams in these conditions as part of your examination preparation, as it is not something you would normally do.

You are not allowed to take in books or notes to the exam room. This means that you need to revise thoroughly in preparation for each exam. This is especially important if you have completed the course in the early part of the year, or in a previous year.

Preparing for Assignments and Exams
There is good advice on preparing for assignments and exams and writing them in Sections 8.2 and 8.3 of Studying at a Distance by Talbot. We recommend that you follow this advice.

The examinations you will sit are designed to evaluate your knowledge and skills in the subjects you have studied: they are not designed to trick you. If you have studied the course thoroughly, you will pass the exam.

Understanding assessment questions
Examination and assignment questions are set to test different knowledge and skills. Sometimes a question will contain more than one part, each part testing a different aspect of your skills and knowledge. You need to spot the key words to know what is being asked of you. Here we categorise the types of things that are asked for in assignments and exams, and the words used. All the examples are from CeFiMS exam papers and assignment questions.

Definitions
Some questions mainly require you to show that you have learned some concepts, by setting out their precise meaning.
Such questions are likely to be preliminary and be supplemented by more analytical questions. Generally ‘Pass marks’ are awarded if the answer only contains definitions. They will contain words such as:
• Describe
• Define
• Examine
• Distinguish between
• Compare
• Contrast
• Write notes on
• Outline
• What is meant by
• List

Reasoning
Other questions are designed to test your reasoning, by explaining cause and effect. Convincing explanations generally carry additional marks to basic definitions. They will include words such as:
• Interpret
• Explain
• What conditions influence
• What are the consequences of
• What are the implications of

Judgment
Others ask you to make a judgment, perhaps of a policy or a course of action. They will include words like:
• Evaluate
• Critically examine
• Assess
• Do you agree that
• To what extent does

Calculation
Sometimes, you are asked to make a calculation, using a specified technique, where the question begins:
• Use the single index model analysis to
• Using any financial model you know
• Calculate the standard deviation
• Test whether

It is most likely that questions that ask you to make a calculation will also ask for an application of the result, or an interpretation.

Advice
Other questions ask you to provide advice in a particular situation. This applies to policy papers where advice is asked in relation to a policy problem. Your advice should be based on relevant principles and evidence of what actions are likely to be effective.
• Advise
• Provide advice on
• Explain how you would advise

Critique
In many cases the question will include the word ‘critically’. This means that you are expected to look at the question from at least two points of view, offering a critique of each view and your judgment. You are expected to be critical of what you have read.
The questions may begin:
• Critically analyse
• Critically consider
• Critically assess
• Critically discuss the argument that

Examine by argument
Questions that begin with ‘discuss’ are similar – they ask you to examine by argument, to debate and give reasons for and against a variety of options, for example:
• Discuss the advantages and disadvantages of
• Discuss this statement
• Discuss the view that
• Discuss the arguments and debates concerning

The grading scheme
Details of the general definitions of what is expected in order to obtain a particular grade are shown below. Remember: examiners will take account of the fact that examination conditions are less conducive to polished work than the conditions in which you write your assignments. These criteria are used in grading all assignments and examinations. Note that as the criteria for each grade rise, they accumulate the elements of the grade below. Assignments awarded better marks will therefore have become comprehensive in both their depth of core skills and advanced skills.

70% and above: Distinction
As for the (60-69%) below plus:
• shows clear evidence of wide and relevant reading and an engagement with the conceptual issues
• develops a sophisticated and intelligent argument
• shows a rigorous use and a sophisticated understanding of relevant source materials, balancing appropriately between factual detail and key theoretical issues. Materials are evaluated directly and their assumptions and arguments challenged and/or appraised
• shows original thinking and a willingness to take risks

60-69%: Merit
As for the (50-59%) below plus:
• shows strong evidence of critical insight and critical thinking
• shows a detailed understanding of the major factual and/or theoretical issues and directly engages with the relevant literature on the topic
• develops a focussed and clear argument and articulates clearly and convincingly a sustained train of logical thought
• shows clear evidence of planning and appropriate choice of sources and methodology

50-59%: Pass below Merit (50% = pass mark)
• shows a reasonable understanding of the major factual and/or theoretical issues involved
• shows evidence of planning and selection from appropriate sources
• demonstrates some knowledge of the literature
• the text shows, in places, examples of a clear train of thought or argument
• the text is introduced and concludes appropriately

45-49%: Marginal Failure
• shows some awareness and understanding of the factual or theoretical issues, but with little development
• misunderstandings are evident
• shows some evidence of planning, although irrelevant/unrelated material or arguments are included

0-44%: Clear Failure
• fails to answer the question or to develop an argument that relates to the question set
• does not engage with the relevant literature or demonstrate a knowledge of the key issues
• contains clear conceptual or factual errors or misunderstandings

Specimen exam papers
Your final examination will be very similar to the Specimen Exam Paper that you received in your course materials. It will have the same structure and style and the range of questions will be comparable. We do not provide past papers or model answers to papers.
Our courses are continuously updated and past papers will not be a reliable guide to current and future examinations. The specimen exam paper is designed to reflect the exam that will be set on the current edition of the course.

Further information
The OSC will have documentation and information on each year’s examination registration and administration process. If you still have questions, both academics and administrators are available to answer queries. The Regulations are also available at , setting out the rules by which exams are governed.

UNIVERSITY OF LONDON
Centre for Financial and Management Studies
MSc Examination for External Students
15DFMC230|15DFMC330
Financial Economics
Finance
Econometric Principles and Data Analysis
Specimen Examination

This is a specimen examination paper designed to show you the type of examination you will have at the end of this course. The number of questions and the structure of the examination will be the same, but the wording and requirements of each question will be different.

The examination must be completed in three hours. Answer FOUR questions – Question One and then THREE other questions. The examiners give equal weight to each question; therefore, you are advised to distribute your time approximately equally over four questions.

Candidates may use their own electronic calculators in this examination provided they cannot store text. The make and type of calculator MUST BE STATED CLEARLY on the front of the answer book.

Do not remove this Paper from the Examination Room. It must be attached to your answer book at the end of the examination.

© University of London, 2012

You must answer Question One and then any other THREE questions. All candidates must attempt Question 1.

1 The Eviews output from estimating a single-index model for Microsoft Corporation using weekly data for the period from 1 September 2009 to 27 August 2012 is provided below.
MSFT is the price of Microsoft Corporation stock, C is an intercept and SP is the Standard & Poor’s 500 index.

Dependent Variable: DLOG(MSFT)
Method: Least Squares
Sample (adjusted): 8/09/2009 27/08/2012
Included observations: 156 after adjustments

Variable      Coefficient   Std. Error   t-Statistic   Prob.
C             8.97E-05      0.001650     0.054359      0.9567
DLOG(SP)      0.867121      0.067407     12.86391      0.0000

R-squared             0.517967    Mean dependent var      0.001896
Adjusted R-squared    0.514837    S.D. dependent var      0.029487
S.E. of regression    0.020539    Akaike info criterion   -4.920286
Sum squared resid     0.064962    Schwarz criterion       -4.881185
Log likelihood        385.7823    Hannan-Quinn criter.    -4.904405
F-statistic           165.4801    Durbin-Watson stat      2.177215
Prob(F-statistic)     0.000000

Breusch-Godfrey Serial Correlation LM Test:
F-statistic      0.821681    Prob. F(2,152)         0.4416
Obs*R-squared    1.668569    Prob. Chi-Square(2)    0.4342

Ramsey RESET Test
Equation: DLOGMSFT_C_DLOGSP
Specification: DLOG(MSFT) C DLOG(SP)
Omitted Variables: Squares of fitted values
                    Value       df          Probability
t-statistic         2.219073    153         0.0280
F-statistic         4.924283    (1, 153)    0.0280
Likelihood ratio    4.941733    1           0.0262

Heteroskedasticity Test: White
F-statistic            0.119699    Prob. F(2,153)         0.8873
Obs*R-squared          0.243711    Prob. Chi-Square(2)    0.8853
Scaled explained SS    0.268446    Prob. Chi-Square(2)    0.8744

The calculated Jarque-Bera statistic for the least squares estimation of the single-index model is 6.410157 (Prob. = 0.040556).

a Explain the economic rationale underlying the regression equation.
b Interpret the estimated coefficients.
c Discuss the adequacy of the model with respect to
  i $R^2$
  ii Serial correlation
  iii Functional form
  iv Normality
  v Heteroscedasticity.
d Predict the value of the return on Microsoft stock if the market return is 2 per cent (or 0.02). Is this forecast likely to be accurate?

2
Explain four of the following:
a Linear in parameters, and linear in variables
b The method of ordinary least squares (OLS)
c The confidence interval for a slope coefficient
d
e A consistent estimator
f Under the assumptions of the CLRM, OLS estimators are BLUE.

3 Answer all parts of this question.
Using daily data for the period 1 March 2010 to 5 April 2012 (532 observations after adjustments), the following multi-index model was estimated by ordinary least squares

$$\hat{R}_t = \underset{(0.012)}{0.002} + \underset{(0.031)}{0.902}\,R_{M,t} + \underset{(0.024)}{0.103}\,R_{O,t} + \underset{(0.002)}{0.001}\,TS_t + \underset{(0.003)}{0.002}\,RP_t \qquad (3.1)$$

( … and standard errors are in parentheses) where $R_t$ is the daily log return on the stock of the American energy multinational ConocoPhillips, $R_{M,t}$ is the daily log return on the NYSE Composite Index, $R_{O,t}$ is the daily log return of the Brent crude oil price, $TS_t$ is a term structure variable, and $RP_t$ is a risk premium variable.

Test the following null hypotheses, explaining carefully in each case the null and alternative hypotheses, the test statistic, degrees of freedom and the critical value of the test statistic.
a the intercept is zero
b … is independent of …
c the coefficient on … is less than one
d Test the hypothesis that the coefficients on $TS_t$ and $RP_t$ are both zero. For your information, the following equation was also estimated using the same data and OLS

$$\hat{R}_t = \underset{(0.0004)}{0.0006} + \underset{(0.030)}{0.902}\,R_{M,t} + \underset{(0.024)}{0.104}\,R_{O,t} \qquad R^2 = 0.703 \qquad (3.2)$$

(Standard errors are in parentheses.)

4 Answer both parts of this question.
a What is ‘imperfect multicollinearity’ and how might it be detected?
b ‘The theoretical consequences of imperfect multicollinearity are relatively unimportant but the practical consequences are potentially serious’. Explain and discuss.

5 Answer all parts of this question.
a How might heteroscedasticity arise?
b Explain why heteroscedastic disturbances have consequences for the validity of t and F tests.
c Explain the Park test of heteroscedasticity.
d Given $Y_i = \beta_1 + \beta_2 X_i + u_i$ where $\mathrm{var}(u_i) = \sigma^2 X_i^2$, show how this model can be transformed so that the disturbances have constant variance.

6 Answer all parts of this question.
a What is autocorrelation?
b Why does it matter?
c Explain how the Durbin-Watson test can be used for detecting autocorrelation.
d For the model

$$Y_t = \alpha + \beta X_t + u_t, \qquad u_t = \rho u_{t-1} + v_t, \qquad |\rho| < 1, \qquad v_t \sim \mathrm{IID}(0, \sigma^2)$$

explain the steps involved in obtaining Cochrane-Orcutt estimates of the unknown parameters.

7 Answer all parts of this question.
a What is nonnormality?
b What are the consequences for the properties of the OLS estimators if the disturbance terms are not distributed normally?
c How would you examine whether the disturbance terms are distributed normally?
d If there is evidence that the disturbance terms are not distributed normally, what would you do?

8 Answer all parts of this question.
a Explain the characteristics of a ‘good’ econometric model.
b What are the consequences of
  i including an irrelevant variable, and
  ii using an incorrect functional form?
c How might
  i the presence of unnecessary variables, and
  ii an incorrect functional form
  be detected?
END OF EXAMINATION

Econometric Principles & Data Analysis
Unit 1 An Introduction to Econometrics and Regression Analysis

Contents
1.1 What is Econometrics?
1.2 How to Use the Course Texts
1.3 Ideas – The Concept of Regression
1.4 Study Guide
1.5 An Example – The Consumption Function
1.6 Summary
1.7 Eviews
1.8 Exercises
1.9 Answers to Exercises
References

Unit Content
This unit provides an introduction to econometrics and regression analysis. It outlines the differences between financial and economic theory and econometrics. The unit explains how stochastic relations between variables are different to mathematical relations between variables. It explains how uncertainty may be modelled using a disturbance term. The unit introduces the steps involved in an econometric investigation. Unit 1 also introduces you to Eviews, the econometrics software you will be using for this course.
Learning Outcomes
After studying this unit, the readings, and the exercises, you will be able to discuss and apply the following
• the population regression function
• the sample regression function
• the disturbance (or error) term
• the residual term
• how to use Eviews to open pre-existing text files containing data
• how to create and interpret a scatter plot
• how to obtain summary statistics
• how to create transformations of variables.

Readings for Unit 1
Chapter 1 ‘The Nature and Scope of Econometrics’, from Damodar Gujarati and Dawn Porter (2010) Essentials of Econometrics, New York: McGraw-Hill/Irwin. You will also be asked to read Chapter 1, ‘A Quick Walk Through’, in Richard Startz (2009) Eviews Illustrated – An Eviews Primer, Irvine California: Quantitative Micro Software. This file will be installed on your computer when you install your copy of Eviews, and it is accessed via Help in Eviews.

Installation and registration of Eviews
Instructions for installing and registering your copy of the Eviews Student Edition are in the booklet that comes with the Eviews CD. Instructions to help you use Eviews to do the Exercises are included in section 1.8 of the unit. You must register your copy of Eviews. If you do not, it will stop working 14 days after installation.

Data Files for exercises
You will also be asked to work through exercises, and the data files you need for these are available from the Online Study Centre, in the course area for this study session.

1.1 What is Econometrics?
Welcome to this course. The aim of the course is to give you an introduction to econometric methods or, more specifically, to linear regression, which is the major statistical foundation of econometric work. This course requires that you work with data; we hope you will find this interesting and useful, and that you enjoy the course.
A principal concern of financial and economic theory is relations between variables. In finance, you may have already studied many of these including the capital asset pricing model; arbitrage pricing theory; efficient markets hypothesis; optimal hedging ratios; bid-ask spreads. If you have studied economics, you may be familiar with consumption, investment, and demand for money functions; labour supply and labour demand functions; the expectations-augmented Phillips curve; and many others. You could, in fact, view the whole of economic and finance theory as a set of relations among variables. What is econometrics? Econometrics is concerned with quantifying financial and economic relations. Econometrics is of use in providing numerical estimates of the parameters involved and for testing hypotheses embodied in the theoretical relationships. Broadly defined, econometrics is … the application of statistical and mathematical methods to the analysis of economic data, with a purpose of giving empirical content to economic theories and verifying them or refuting them.1 This definition is not the only possible one; in fact, in your textbook you will come across a number of definitions, which each puts the emphasis slightly differently. Common to all definitions, however, is the stress on the empirical nature of econometric work: the subject matter of econometrics concerns the interaction of, and confrontation between, theory and data in quantifying economic and financial relationships. Hence, econometrics is not purely a branch of mathematical economics or mathematical finance. Indeed, mathematical finance or economics need not have any empirical content at all. Econometrics makes use of mathematical methods, but its emphasis is on empirical analysis. However, econometrics is not just a ‘box of tools’ to work with data. 
It requires, undoubtedly, a good training in statistical techniques, but these techniques need to be situated in an interactive process between theory and the data.

1 Maddala, GS (1992: 1)

To give empirical content to financial and economic theories and to verify them or refute them, the econometrician is confronted with three types of problems, which are of lesser or no concern to the theorist. First, in economic or financial theory we develop models out of a priori reasoning based on relatively simple assumptions. To do this, we abstract from secondary complications by assuming that ‘other things remain equal’ while we investigate the relations between a few key economic or financial variables. In effect, this method reduces to ‘intellectual experimentation’ with causal relations postulated by theory. For example, in demand theory we say that the quantity demanded of a commodity (which is not an inferior good) will fall if its price rises, other things being equal. This method is fruitful in theory but, unfortunately, in empirical economics and finance the scope for experimentation is severely limited. A researcher cannot alter a commodity’s price (or an asset’s price), holding other things constant, in order to see what happens to demand. In general, financial and economic data are not the outcome of experiments, but rather the product of observational programmes of data gathering and collection in a world where other things are never equal. In econometrics, therefore, we can only resort to careful observation; the basic art of econometric work is more like unravelling a complex puzzle than setting up an experiment in a laboratory. Second, we need to address the difference between deterministic and stochastic relationships. This issue arises in a different way in economics and in finance.
To make the point, we will explain the distinction between deterministic and stochastic relationships with an example from economics, and then address it from a financial perspective. In most economic theory we work with deterministic relationships between economic variables. Take a simple example: the Keynesian consumption function. In economic theory we assume that if we know the level of aggregate real income, consumption will be uniquely determined. That is, for each value of aggregate real income there corresponds a given level of aggregate consumption. This is a convenient device to enable us to work out exact solutions for the interplay between variables within the confines of the assumptions of an economic model. In reality, however, we do not expect this relationship to be exact: it may be stable perhaps, but it is surely imperfect. Hence, in econometric work we deal with imperfect relationships between variables. It follows that our models cannot be deterministic in nature. We investigate functions between variables that we believe to be reasonably stable, on average, but there will always be a degree of uncertainty about outcomes and conclusions derived from such a model. Econometric modelling requires that we make explicit assumptions about the character of these imperfections, or disturbances as they are more commonly labelled. That is, we work with stochastic variables and we need to model their stochastic nature. This is what makes us enter the areas of probability theory and statistical inference and estimation. How does the distinction between deterministic and stochastic relationships arise in finance? Uncertainty is a fundamental element of risk, and the measurement and management of risk are central aspects of finance. To demonstrate this, consider the single-index model (which you will examine and estimate in Unit 2). In the single-index market model the return on a company stock is considered to be a function of three elements. 
There is a fixed element which is specific to the company. There is also a deterministic relationship between the return on the company stock and the return on a relevant market index: for each value of the return on the market index there corresponds a given value for the return on the company stock. (This part of the model captures the concept of market-determined risk.) In addition, the return on the company stock is explained by a company-specific disturbance or error. (The company-specific error captures the concept of company-specific risk.) The single-index model includes the company-specific disturbance not just to make the model more realistic; it is included because we specifically want to understand the stochastic nature of the return on the company stock, and thus get a better understanding of the risk associated with the stock. Third, in financial and economic models we work with theoretical variables. Econometrics, in contrast, deals with observed data. Obviously, there is a certain correspondence between them; data collection is inspired by theoretical frameworks. For example, national income account data were constructed after the ascendancy of Keynesian economics, which concerns the analysis of theoretical aggregates such as output, demand, employment and the price level. However, observed variables do not fully correspond to their theoretical equivalents because of errors in measurement, conceptualisation and coverage. This is usually less of a problem for econometrics applied in a financial context than it is for economics. Financial data on asset prices, for example, are more closely related to the actual transactions taking place, so measurement error is less likely.
However, we should be aware that movements in financial data may be the result of the particular operating or reporting features of a market, say, in addition to the desired trading activities of the participants that our theories suggest. In econometrics we need to be aware of the nature of the observed data and its implications for investigating theoretical propositions. These three elements:
• the fact that we cannot hold other things constant in empirical analysis
• the imperfect nature of relationships between variables and
• the discrepancies between theoretical variables and observed data
give econometrics its distinctive flavour. We cannot move straight from a financial or an economic model (as formulated by theory) to the data before we come to terms with these issues. Econometric methods, therefore, aim to address these issues so as to enable us to engage in meaningful investigation of economic and financial theories. Note that we talk about methods and, hence, emphasise the need for methodological groundwork to approach these types of problems. There are no hard and fast rules to deal with them. There is not a box of magic tricks, which always work and give us straight answers. Rather, we are left with the task of studying methodological approaches to issues, which are complex, varied, but challenging. This course, Econometric Principles and Data Analysis, deals with regression analysis. Why this focus? We have seen that, in empirical analysis, our data never behave exactly as our theoretical models would lead us to believe. Theoretical models are useful abstractions, which provide the applied researcher with analytical handles to make sense of an often bewildering economic and financial reality. Good theory allows us to search for patterns within the data and to give meaning to such patterns.
But we need to disentangle these patterns in the middle of a great deal of chance variations and uncertainties of outcomes, which our theories could not possibly aim to explain. Regression analysis provides us with an analytical framework to handle relations between variables, especially between variables whose relation is imperfect. Indeed, regression analysis seeks to establish statistical regularities among observed variables. To do this, we need to come to terms with the uncertainty inherent in the behaviour of our data. For this, we need to equip ourselves with statistical theory which allows us to model uncertainty as part of relations between variables. This is the purpose of this course, Econometric Principles and Data Analysis, of which this is the first unit. The following are the main points to remember. • In econometrics we pose the question how to confront theory with data so as to quantify our financial and economic relationships, to verify them or to refute them. • In practice, we deal with imperfect relationships between variables which we can only observe (with errors and, often, through proxies) in a context which we do not control (we cannot experiment). • It follows that we can only resort to careful observation of complex phenomena in order to check our theories against the empirical evidence. This raises questions about econometric methods: methodological issues about gathering and evaluating such empirical evidence. Whatever conclusions we draw in such a context will always involve a considerable degree of uncertainty, even if our models are correctly specified. For this reason, we resort to probability theory and statistical inference to deal with uncertainty in assessing outcomes and conclusions of empirical analysis. • Since our concern is primarily with investigating relations between variables, regression analysis constitutes the major tool of statistical analysis in econometrics. 
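The difference between a deterministic and a stochastic relationship, which underlies the points above, can be illustrated with a short simulation. What follows is only a sketch in Python, which is not the software used in this course (that is Eviews). It assumes a single-index form in which a stock return equals a fixed element plus a multiple of the market return plus a normally distributed disturbance; the values of ALPHA, BETA and SIGMA are hypothetical and are not estimates for any real stock.

```python
import random

# Hypothetical parameter values, chosen only for illustration.
ALPHA = 0.001   # fixed, company-specific element
BETA = 0.9      # sensitivity of the stock return to the market return
SIGMA = 0.02    # standard deviation of the company-specific disturbance


def deterministic_return(market_return):
    """Exact relation: one stock return for each market return."""
    return ALPHA + BETA * market_return


def stochastic_return(market_return, rng):
    """Same systematic part plus a random company-specific disturbance."""
    return deterministic_return(market_return) + rng.gauss(0.0, SIGMA)


rng = random.Random(42)  # seeded so that a run can be repeated
market_return = 0.02

exact = deterministic_return(market_return)
draws = [stochastic_return(market_return, rng) for _ in range(5000)]
average = sum(draws) / len(draws)

print(f"deterministic value:          {exact:.4f}")
print(f"first three stochastic draws: {[round(d, 4) for d in draws[:3]]}")
print(f"average of 5000 draws:        {average:.4f}")
```

With the market return held at the same value, the individual draws differ from one another, but their average lies close to the deterministic value. This is the sense in which a stochastic relation can be stable on average while remaining imperfect for any single observation.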
1.2 How to Use the Course Texts
It is quite possible that you are worried about studying econometrics. After all, it involves working with mathematics and statistics, and you may feel that this is not one of your greatest strengths. Alternatively, you may be one of those who welcome this greater emphasis on mathematics and statistics. Whichever view you hold, it is useful to be aware of a particular problem that invariably arises when studying econometrics. Teaching and learning econometrics almost inevitably involves a preoccupation with technical details: definitions of technical terms, mathematical derivations, step by step descriptions of statistical procedures etc., all phrased in technical notation. This is normal and, indeed, necessary. But this preoccupation with technical detail often implies that students lose a perspective on ‘What is it all about?’ or ‘Why are we doing this?’ That is, there is a need to keep a focus on the kind of basic questions, uncluttered by notation and technical detail, which give substance to the subsequent technical exercises. We need to get an overview of a problem before we explore it aided by our technical skills. We need to know the simple questions and intuitive insights which often prompted elaborate technical enquiries. For this reason, the course texts will always start with a section on ideas or issues. The purpose of this is to explain in simple words, with the minimum of technical notation, the basic substance of the unit. The aim is to give you an intuitive feel for the subject matter before going into technical detail. If you feel that mathematics and statistics are not your strongest subjects, this regular section will give you a few ‘analytical handles’ to hold on to when studying relevant techniques. But, alternatively, if you are confident with mathematics and statistics, it is important not to skip this section.
Technical expertise is not just a question of one’s ability to work out the steps in a technical procedure or to understand a mathematical derivation. It also involves understanding the type of questions a technique tries to address as well as the assumptions on which it is based. Good technical expertise is more than understanding a set of technical skills (narrowly defined); it also involves analytical insights and judgement of the appropriateness of particular technical procedures in specific conditions. The section on ideas or issues will be self-contained; no references will be made to reading parts of the assigned textbook. Take your time to read it carefully, and to reflect whether you understand the type of questions which will be addressed subsequently in technical detail: ‘get familiar with the forest before you start looking at the trees’. In other words, use this section to provide you with the ‘analytical handles’ to facilitate the study of the relevant techniques. Next, the course texts will have a reading section, or Study Guide, which guides your study of the textbook, Gujarati and Porter’s Essentials of Econometrics. The purpose of these sections is to structure your reading of the textbook as well as to provide brief comments, elaborations and cross-references to exercises and examples, and to suggest short cuts in coping with the material. The section after that will normally contain one example. This section has two purposes. Firstly, the example highlights a specific aspect of the topic under study in a particular unit of the course. Secondly, the example also tries to give you a bit of the flavour of econometrics in action. Generally, you will be asked to participate in the analysis of the example. The examples aim to highlight the links between economic theory and empirical investigation, and try to illustrate the problems that can arise when we work with real data.
The next section will provide a brief summary of the main issues raised in the unit. This will be followed by a section of exercises. It is most important that you work through all of these exercises. The exercises have three purposes:
• to check your understanding of basic concepts and ideas
• to verify your ability to use technical procedures in practice and
• to develop your skills in interpreting the results of empirical analysis.
The final section of the units will include brief answers to these exercises, which you should not look at until after you’ve worked out the answers for yourself! You will be using Eviews to do the econometric exercises, and this unit has an additional section describing this program, which is a widely used econometrics software package. Instructions to use Eviews will accompany the exercises, where necessary. This basic structure of the course texts will be maintained throughout your study of this course. The section on ideas or issues gives you an overview of the topic of the unit, using non-technical language. The core of the course text is the study guide. This guides you through your reading of the textbook and refers you to the exercises whenever appropriate. The example in each unit demonstrates a problem dealt with in the course material using real data. By using examples drawn from areas of finance, using real data, this section also aims to provide cross-references to the theory courses. The summary draws your attention to the main points made in the unit. The exercises are important and you should always work through them. The exercises will help you to understand the course material. In addition, the knowledge and experience you gain from doing the exercises will help you to write assignments and answer examination questions.
1.3 Ideas – The Concept of Regression
The remainder of this unit will deal with the introduction to regression analysis. As you will see, it is structured along the pattern outlined above.

1.3.1 What is regression?
Regression is the main statistical tool of econometrics. What is regression? Broadly speaking, … regression methods bring out relations between variables, especially between variables whose relations are imperfect in that we do not have one Y for each X.2

2 Mosteller and Tukey (1977: 262).

But what do we mean by imperfect relations? An example may help. Consider the relation between corporate bond spreads (this is the Y-variable) and the earnings before interest of companies (this is the X-variable). The spread for a corporate bond is the difference between the interest rate on the corporate bond and the interest rate on government bonds of equivalent maturity. Interest rates on corporate bonds are higher than those on government bonds to reflect expected default loss, different tax treatments and the riskier return associated with corporate bonds. We would expect that a company with higher earnings before interest would be less likely to default, and hence the bond spread for that company would be lower. Hence, we expect that, on average, the corporate bond spread is inversely related to earnings before interest. But we do not expect this relation to be perfect. That is, if we were to sample 10 companies with identical earnings before interest (i.e. equal X-values), we would not expect to get 10 identical corporate spreads (the Y-values). Differences between the markets in which the firms operate, in management and in other financial variables (e.g. coupon rates, coverage ratios) will account for differences in bond spreads. But, importantly, it is still valid to say that, on average, the bond spread declines as the level of earnings before interest increases.
That is what Mosteller and Tukey (quoted above), mean when they say that a relation exists between two variables but that it is imperfect in that we do not have one Y for each X. This leads us to the discussion of the concept of regression. Regression methods aim to bring out this average relation between a dependent variable on the one hand and one or more independent variables on the other. In our example the average inverse relation between the bond spread and the level of earnings before interest is the regression of the former variable on the latter. But, obviously, there will be variation in how markets view the bonds of individual companies that have broadly the same earnings. In fact, anyone familiar with data analysis knows very well that we can always take an average of one or another aspect of a number of individuals, but we rarely meet the ‘average individual’. So it is also with regression as an average relation: individual observations will rarely conform to the average relationship between Y and X. Hence, in regression analysis we seek to establish statistical regularities in the middle of a great deal of chance variation and uncertainty in outcomes. For this reason, regression methods involve statistical modelling of the chance variation in the data as well as of the average relationship. In summary, we hope that our model captures the basic structure of interaction between economic and financial variables, and we expect that the behavioural relations are reasonably stable, but imperfect. At most, we expect these relations to hold ‘on average’. In other words, we seek to discover structure and regularity within data in the middle of a great deal of uncertainty in outcomes. It is similar to separating sound from noise when trying to listen to a badly tuned radio. Therefore, a regression model embraces two components: • a regression line (which defines the basic structure) and • disturbances. 
Firstly, the regression line models the average relation between the dependent variable and its explanatory variable(s). To do this we make an explicit assumption about the shape of the regression curve: linear, quadratic, exponential, etc. Secondly, we recognise the existence of chance fluctuations due to a multitude of factors beyond our control. We model this element of uncertainty (the noise) in the form of a disturbance term, which constitutes an integral part of our model. This disturbance term is a ‘catch all for all the variables considered as irrelevant for the purpose of the model as well as all unforeseen events’.3 It is a random variable which we cannot observe or measure in practice. Sometimes we are not interested in the disturbance term as a variable in its own right, but we are interested in understanding how the disturbance term affects our attempts to investigate the behavioural relations in the model. In other circumstances we might be particularly interested in the properties of the disturbance term, if it reflects an element of uncertainty and risk that we are trying to understand. In both cases, we need to model the probabilistic nature of the disturbance term. In other words, we try to model the character of the uncertainty inherent in the data. This is no easy task, and we always need to think carefully whether the assumptions we make about the nature of these chance variations are indeed appropriate for the type of issue under study. Not surprisingly, a great deal of econometric theory and practice is concerned with these assumptions. It is useful to express these important ideas a little more formally. We start with the population regression function. This is a theoretical construct, which contains a hypothesis about how the data are generated.
For the simple, two-variable linear regression model we have

$$Y_i = \beta_1 + \beta_2 X_i + u_i \tag{1.1}$$

in which Y is the dependent variable, X is the explanatory variable (sometimes called the regressor), u is the disturbance term, and the subscript i indicates the ith observation. $\beta_1$ and $\beta_2$ are the regression parameters; $\beta_1$ is the intercept, or constant, and $\beta_2$ is the slope coefficient. Typically, the variables Y and X are observable, the disturbance is not observable, and the parameters $\beta_1$ and $\beta_2$ are unknown. The presence of the random disturbance means that Y is stochastic; for each value of the explanatory variable, X, there is a distribution of Y-values.

In this explanation of regression we will continue to use the i subscript to indicate the ith observation. In many financial applications we will examine series that vary over time, and it will be more meaningful to use a t subscript to indicate that the observation refers to period t. This will allow us to use $t-1$ to refer to the previous period, etc.

[3] Maddala G S (1992: 3).

The population regression function may be viewed as comprising two components: a systematic element represented by a straight line which shows the statistical dependence of Y on X; and a random, or stochastic, element represented by the disturbance (error) term u. The systematic element can be expressed as

$$E(Y \mid X_i) = \beta_1 + \beta_2 X_i \tag{1.2}$$

that is, the average (or expected) value of Y conditional on a given value of X is a linear function of X. Put more concisely, (1.2) gives the average value of Y for each value of X, so the population regression function joins the conditional means of Y.

The disturbance term, u, is the focus of much attention. It accounts for the variation in Y around the population regression line. In Unit 2 you will learn about the important assumptions made about u. A prime objective of econometrics is to quantify the unknown parameters $\beta_1$ and $\beta_2$.
Using a sample of data on Y and X, we obtain estimates, $\hat{\beta}_1$ and $\hat{\beta}_2$, of the unknown population parameters.[4] We have the sample regression function

$$Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + e_i \tag{1.3}$$

in which $\hat{\beta}_1$ and $\hat{\beta}_2$ are random variables (the particular estimates obtained depend on the particular sample of data on Y and X used) that differ from the population parameters $\beta_1$ and $\beta_2$. Consequently, the sample residuals, $e_i$, differ from the unknown population disturbances, $u_i$. Whereas the disturbance term accounts for the variation in Y around the population regression line, the residuals give us the vertical deviations of the observed Y-values from the estimated regression line derived from sample data. The residuals, therefore, are not identical with the disturbances, but clearly they do tell a story, which may enable us to assess whether or not our assumptions about the behaviour of the disturbances seem reasonable. How to analyse the story or stories told by residuals is a matter we address in the second half of the course.

The predicted value of the dependent variable is given by the sample regression line

$$\hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_i \tag{1.4}$$

in which $\hat{Y}_i$ is the fitted value of the dependent variable, the estimator of $E(Y \mid X_i)$, that is, the estimator of the population conditional mean. The sample regression line is an estimator of the population regression line.

[4] ^ is read as ‘hat’, hence $\hat{\beta}_1$ is ‘beta one hat’.

Notice that we focus on the linear regression model. That is, we are concerned with a model that is linear in the parameters to be estimated. The model

$$Y_i = \beta_1 + \beta_2 X_i + u_i \tag{1.5}$$

is linear in $\beta_1$ and $\beta_2$. With the sample regression line

$$\hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_i \tag{1.6}$$

$\hat{\beta}_1$ is the predicted value of Y (in units of Y) if X = 0. Also, $d\hat{Y}/dX = \hat{\beta}_2$; this implies that a 1 unit increase in X (measured in units of X) results in a $\hat{\beta}_2$ unit increase in $\hat{Y}$ (measured in units of Y).
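The distinction between disturbances and residuals can be made concrete with a small simulation. The sketch below is only an illustration under stated assumptions: the population parameters (2.0 and 0.5), the sample size, and the normally distributed disturbances are hypothetical choices, and the estimates are computed with the usual least squares formulas, which the unit has not yet introduced (estimation is covered in Unit 2). The function name `ols_simple` is likewise made up for this example.

```python
import random


def ols_simple(x, y):
    """Least squares estimates (b1_hat, b2_hat) for y = b1 + b2*x + u."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b2_hat = sxy / sxx
    b1_hat = y_bar - b2_hat * x_bar
    return b1_hat, b2_hat


# Hypothetical population parameters, chosen only for illustration.
BETA1, BETA2 = 2.0, 0.5

rng = random.Random(12345)                 # fixed seed so the run is repeatable
x = [float(i) for i in range(1, 51)]       # explanatory variable
u = [rng.gauss(0.0, 1.0) for _ in x]       # disturbances (unobservable in practice)
y = [BETA1 + BETA2 * xi + ui for xi, ui in zip(x, u)]   # population model (1.1)

b1_hat, b2_hat = ols_simple(x, y)
y_fitted = [b1_hat + b2_hat * xi for xi in x]           # sample regression line (1.4)
e = [yi - fi for yi, fi in zip(y, y_fitted)]            # residuals from (1.3)

print("estimates:", round(b1_hat, 3), round(b2_hat, 3))
print("first disturbance vs first residual:", round(u[0], 3), round(e[0], 3))
```

Because the estimates differ from the population parameters, each residual differs from the corresponding disturbance, even though here (unlike in real data) the disturbances are known because we generated them.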
Now consider the model (in which e stands for exponential, not the residual)

$$Y_i = \alpha X_i^{\beta_2} e^{u_i} \tag{1.7}$$

which, after taking natural logarithms of both sides of the relation, can be written as

$$\ln Y_i = \ln \alpha + \beta_2 \ln X_i + u_i$$

or

$$\ln Y_i = \beta_1 + \beta_2 \ln X_i + u_i \tag{1.8}$$

where $\beta_1 = \ln \alpha$. This model is also linear in the parameters to be estimated, $\beta_1$ and $\beta_2$. We may view the model as

$$Y_i^* = \beta_1 + \beta_2 X_i^* + u_i \tag{1.9}$$

where $Y_i^* = \ln Y_i$ and $X_i^* = \ln X_i$. This model is known by a number of different names (logarithmic, double log, log-log, log linear, and constant elasticity) and is frequently used in applied work when it characterises the form of the functional relationship between the variables. It has the useful property that the slope coefficient measures the elasticity of Y with respect to X because

$$\beta_2 = \frac{d \ln Y}{d \ln X} = \frac{dY/Y}{dX/X}. \tag{1.10}$$

With this logarithmic model, a 1 per cent increase in X results in a $\beta_2$ per cent increase in Y. Note that here we mean a 1 per cent proportionate increase in X, not that X increases by 100 basis points (1 basis point equals 0.01 per cent).

Although regression analysis is related to correlation analysis, conceptually these two types of analysis are very different. The main aim of correlation analysis is to measure the degree of linear association between two variables, and this is summarised by a sample statistic, the correlation coefficient. The two variables are treated symmetrically. Both are considered random; there is no distinction between dependent and explanatory variables, and no implication of causality in a particular direction from one variable to the other. Regression analysis, however, can incorporate relationships between two or more variables and the variables are not treated symmetrically. The dependent and explanatory variables are carefully distinguished. The former is random and the latter is often assumed to take the same values in different samples, often referred to as ‘fixed in repeated samples’.
The underlying economic or financial theory implies that X, an explanatory variable, causes Y, the dependent variable. Moreover, with more than one explanatory variable, regression analysis quantifies the influence of each explanatory variable on the dependent variable.

1.3.2 Data and Regression

Regression methods allow us to investigate associations between variables, but the inspiration as to which relations to investigate obviously comes from theory. We are not interested in detecting spurious (false or bogus) associations between variables. Indeed, relations have to be meaningful, and whether they are, or not, depends on theoretical argument.

This does not mean, however, that data play only a passive role in economic and financial analysis. The role of data is not just to provide numerical support to theoretical arguments. Empirical investigation is an active part of theoretical analysis in as much as it is concerned with testing theoretical hypotheses against the data as well as, in many instances, providing clues and hints towards new avenues of theoretical enquiry. This requires that we translate our theoretical insights into empirically testable hypotheses, which we can investigate with observed data. Hence, the process between theory and the data is interactive: we must continuously investigate the empirical content of our theoretical propositions in order to test our theories, and pick up signals from the data that enable us to improve our theoretical insights.

Most of the data we use in applied economic analysis are not obtained through experimentation but are the result of observational programmes. National income accounts, agricultural and industrial surveys, financial accounts, employment surveys, population census data, household budget surveys and price and income data, among others, are collected by various statistical offices. They are partial records of what happens; they are not the outcome of experiments.
As we have noted, finance data more closely relate to actual transactions, but, like economic data, they are not the outcome of experiment.

The character of these economic and financial data makes the work of an econometrician quite different from that of a psychologist or an agricultural scientist. In the latter cases, experiments play a prominent role in analysis, and much of the emphasis in research work is put on the careful design of experiments in order to be able to single out effect and response between two variables while controlling for the influence of other variables (that is, by holding them constant).

In economics and finance, the scope for experimentation is very limited. We cannot change the price of a stock, holding all other prices constant, merely to see what would happen to its demand. In theory, we do just that by assuming that ‘other things are equal’ and postulating cause and effect between the remaining variables. In empirical analysis, however, other things are never equal, and we can only carefully observe the behaviour of economic agents from survey data. As you will see in subsequent units, multiple regression techniques allow us to ‘account’ for the influence of other variables while investigating the interaction between two key variables, but this is not the same as ‘holding other variables constant’.

The econometrician, therefore, needs to be, above all, a careful observer. Empirical analysis in economics and in finance allows us to search for patterns in our data through careful observation backed by theoretical understanding; but experimentation is not really an option we have available, because we do not have control over the overall context that determines the movement of our variables.

In analysing data, we should follow the advice ascribed to Darwin.
It is obviously pleasing if the empirical evidence seems to support our theoretical hypotheses, but, more importantly, we should take special note of any signs given by the data that go against our arguments. That is, we should not approach our data merely to confirm answers to well-defined questions derived from theoretical argument, but we should also look out for hints from the data about what we do not know, that is, about questions that we have not confronted yet. A careful observer uses data not just to confirm his or her theories, but also to get clues from empirical analysis to advance his or her theoretical grasp of a problem. It is primarily this aspect that enables data to play an active part in the process of analysis.

1.3.3 Rates of return

Much analysis in financial econometrics is concerned with rates of return, including returns on shares, stock indices, commodities and exchange rates. Therefore, at this point in Unit 1 it might be useful, briefly, to refresh your understanding of returns. In your study of finance or risk management, or in your work, you may already be familiar with arithmetic and logarithmic rates of return. For example, logarithmic returns are used especially in the Black-Scholes-Merton model of options pricing.

First consider arithmetic returns. Suppose we have a stock that is worth $1000 at the start of the year and $1050 at the end of the year. Ignoring any dividends, we say that the arithmetic or simple or proportionate rate of return is

$$r = \frac{1050 - 1000}{1000} = 0.05$$

or 5 per cent. It is the increase (or decrease) in value, divided by the original value.
Put another way, if the stock, initially valued at $1000, benefits from a 5% return over the year, then the value at the end of the year will be

$$1000(1 + 0.05) = 1050$$

In general terms, if the price at the start of the year is $P_0$, and the stock experiences a return of r, the price at the end of the year will be

$$P_1 = P_0(1 + r) \tag{1.11}$$

and the rate of return is

$$r = \frac{P_1 - P_0}{P_0}. \tag{1.12}$$

To understand logarithmic returns and continuous compounding, it may help to conduct a short thought exercise. In the previous example, we can think of the return, r, being applied to the asset once a year (if it makes more sense to you, think of r as the interest paid on a sum of money in a bank account, paid annually). Now suppose that this growth rate is applied more often through the year, but the rate of return at each point of the year is adjusted to take account of the increased number of times the return is experienced. Continuing the 5% example, if the return is applied twice in a year, the stock will benefit from a return of 2.5% in the first six months, and another 2.5% in the second six months. After six months the asset price will be

$$1000\left(1 + \frac{0.05}{2}\right) = 1025$$

And after one year the asset price will be

$$P_1 = 1000\left(1 + \frac{0.05}{2}\right)^2 = 1050.625$$

The growth of 0.025 or 2.5% in the first six months also benefits from growth of 0.025 or 2.5% in the second six months. This is known as compounding, and it explains why the value of the stock at the end of the year is more than 1050. In general, if the return is applied m times in a year, the asset price at the end of the year will be

$$P_1 = P_0\left(1 + \frac{r}{m}\right)^m \tag{1.13}$$

We could increase m to 12 or 365, to see what the price of the stock would be if the return were applied (or compounded) every month or every day. We could also ask what continuous compounding would look like.
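The effect of increasing m can be checked numerically. The short sketch below evaluates the end-of-year price for the $1000, 5 per cent example at several compounding frequencies; the function name `end_of_year_price` is a label invented for this illustration. The final line uses the standard mathematical limit of compounding ever more frequently, $P_0 e^r$, for comparison.

```python
import math


def end_of_year_price(p0, r, m):
    """Price after one year when annual return r is compounded m times."""
    return p0 * (1.0 + r / m) ** m


p0, r = 1000.0, 0.05   # the example used in the text

# Annual, semi-annual, monthly and daily compounding.
for m in (1, 2, 12, 365):
    print(m, round(end_of_year_price(p0, r, m), 4))

# Limiting case as m grows without bound.
print("continuous", round(p0 * math.exp(r), 4))
```

With m = 2 the function reproduces the figure of 1050.625 given above, and the price rises with m while staying below the continuous-compounding value.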
Continuous compounding or continuous growth is when the return is experienced an infinite number of times in the year, but the return at each point of the year is infinitesimally small. That is, what happens if m approaches infinity? You can see that $r/m$ will approach zero, but the expression in brackets will be raised to the power infinity. The limit of this expression when m approaches infinity is $e^r$, where e is equal to 2.718 (to three decimal places). The value e is known as the base of natural logarithms. Going back to our example, if the stock is initially valued at $1000, and experiences continuous growth at an annual rate of 5 per cent (or 0.05), it will be valued at the end of the year at

$$1000e^{0.05} = 1051.27$$

and in general terms

$$P_1 = P_0 e^r. \tag{1.14}$$

We can calculate the logarithmic rate of return (also known as the continuously compounded return) as

$$r = \ln\left(\frac{P_1}{P_0}\right) = \ln P_1 - \ln P_0 \tag{1.15}$$

where ln represents the natural logarithm, or the logarithm to base e. To see this, take natural logarithms of the end-of-year continuously compounded stock price

$$\ln P_1 = \ln\left(P_0 e^r\right) = \ln P_0 + \ln e^r = \ln P_0 + r \ln e = \ln P_0 + r$$

since the natural log of e is 1.

In one of the exercises at the end of the unit you will show that arithmetic returns are not symmetric: if a stock valued at $1000 experiences first a return of minus 10% and then a return of 10%, its value will not be equal to $1000 at the end. On the other hand, you will find out that logarithmic returns are symmetric. You will also use Eviews to calculate arithmetic and log returns.

Note that in this course, returns will always be calculated as decimals, so a return of 5%, for example, will be shown in Eviews and in any other calculations as 0.05. It will not be shown as 5.00. A consistent approach is necessary, and the decimal representation makes calculations a little bit simpler.

1.4 Study Guide

First, let us consider notation.
In econometrics, population parameters and their estimators are normally denoted by Greek letters, and the course units follow this standard practice. The textbook, however, differs. Table 1.1 summarises the principal differences and similarities in notation.

Table 1.1 Notation

                           Course units                          Textbook
Population parameters      $\beta_1$, $\beta_2$                  $B_1$, $B_2$
Their estimators           $\hat{\beta}_1$, $\hat{\beta}_2$      $b_1$, $b_2$
Disturbances               $u_i$                                 $u_i$
Residuals                  $e_i$                                 $e_i$
Number of observations     N                                     n

For this unit you are requested to study Chapter 1 of the course textbook, Gujarati and Porter’s Essentials of Econometrics. This chapter has three main sections; the first two address two questions: What is econometrics? and Why study econometrics? These sections are straightforward, and you can read them relatively quickly.

Reading
Please now read sections 1.1 and 1.2, pages 1–3, of Gujarati and Porter’s textbook. Make notes of the important points.
Damodar Gujarati and Dawn Porter (2010) Essentials of Econometrics, Chapter 1 ‘The Nature and Scope of Econometrics’, sections 1.1 and 1.2.

The next section of the textbook is particularly important. It sets out a methodology of econometrics; that is, it explains how you might proceed in a typical econometric study. Gujarati and Porter identify eight steps associated with the typical econometric investigation. All of these eight steps are discussed in the context of a model of labour force participation. Although this particular example is drawn from economics, you will see that the steps described are relevant to econometric investigation in any discipline, including finance.

You will see that in this example the data are plotted in a scatter diagram (often called a scatter plot). This can be helpful in giving a simple illustration of the relationship between two variables in the data. Notice also the central role of estimating the parameters of the model and so obtaining the estimated regression line.
The notation in the textbook differs slightly from the notation in these units. In the context of the model of labour force participation, Gujarati and Porter define CLFPR as the civilian labour force participation rate and CUNR as the civilian unemployment rate and write the population regression function as

$$\mathit{CLFPR} = B_1 + B_2\,\mathit{CUNR} + u \tag{1.16}$$

which is comparable to our population regression function

$$Y_i = \beta_1 + \beta_2 X_i + u_i. \tag{1.1}$$

Reading
Please read carefully section 1.3, pages 3–12, of the textbook.
Damodar Gujarati and Dawn Porter (2010) Essentials of Econometrics, Chapter 1, Section 1.3 ‘The Methodology of Econometrics’.

1.5 An Example – Efficiency in the Foreign Exchange Market

The eight steps explained in the textbook are typical of any econometric investigation and you are now going to follow them in another example, examining the hypothesis of efficiency in the foreign exchange market.

Statement of the Theory

Efficiency in markets is a central assumption of many theories in finance and economics. The efficient markets hypothesis states that current prices will reflect all available information. Applied to the foreign exchange market, the hypothesis suggests that the forward exchange rate is the market’s expectation of the spot rate that will exist in the future. Any difference between the forward rate formed in the previous period and the spot rate in the current period should be entirely random and unpredictable. In addition, there should be a close relation between the forward rate from the previous period and the spot rate in the current period.

Collection of Data

The data to be used are monthly time series data for the spot exchange rate between UK sterling and the US dollar, measured in dollars per pound, and the one-month-ahead forward exchange rate, also measured in dollars per pound. The data cover the period January 1982 to January 2012. The source of the data is www.bankofengland.co.uk.
Figure 1.1 shows a scatter plot of the current spot rate, S, against the forward rate available in the previous month, F(–1). The figure suggests that the relationship is upward sloping and it seems to be reasonably linear.

Figure 1.1 Scatter plot of S (current spot rate) on F(–1) (previous forward rate), 1982–2012
[Figure not reproduced: scatter plot with the current spot rate on the vertical axis and the previous forward rate on the horizontal axis.]

Mathematical Model of the Theory

The relation between the current spot rate and the forward rate in the previous month in its simplest form can be presented as a linear relationship

$$S_t = \beta_1 + \beta_2 F_{t-1} \tag{1.17}$$

where $S_t$ is the spot rate in period t; $F_{t-1}$ is the one-month-ahead forward rate available in the previous period, $t-1$; $\beta_1$ is a constant (or intercept) and $\beta_2$ is the slope of the function. For the efficient markets hypothesis to hold we would expect $\beta_1 = 0$ and $\beta_2 = 1$.

Econometric Model of the Theory

The econometric model is stochastic. It includes a random error, $u_t$, which captures the influence of all the other variables that may influence the spot exchange rate.

$$S_t = \beta_1 + \beta_2 F_{t-1} + u_t \tag{1.18}$$

The disturbance term is crucial to the distinction between a mathematical model and an econometric model. In the mathematical model we have a function: there is a unique value of the spot rate for each value of the previous forward rate. With the econometric model, we have a relation in which there is no longer a unique value of the spot rate for each value of the previous forward rate. In the context of the efficient markets hypothesis the disturbance term has an additional interpretation: according to the hypothesis, any difference between the previous forward rate and the current spot rate should be random and unpredictable.

Parameter Estimation

Using these data and Eviews, it is possible to obtain estimates of the parameters $\beta_1$ and $\beta_2$ to obtain the average relationship between $S_t$ and $F_{t-1}$.
The problem of estimating the coefficients of the population regression function will be discussed in Unit 2. The function estimated with our data is

$$\hat{S}_t = 0.064 + 0.962\,F_{t-1} \tag{1.19}$$

and this represents the average relationship between the spot exchange rate and the previous forward exchange rate. The estimated value of $\beta_1$, $\hat{\beta}_1$, is 0.064 and the estimated value of $\beta_2$, $\hat{\beta}_2$, is 0.962. Consequently, if the forward exchange rate increases by 0.01, the spot rate in the next period increases on average by 0.00962.

The interpretation of the intercept is not as straightforward. Mechanical interpretation of the estimate tells us that the spot exchange rate is $0.064 per pound if the forward exchange rate in the previous period is zero. On its own, this statement is without meaning. However, in the context of the efficient markets hypothesis, we may ask if the estimated constant indicates there is a systematic and predictable difference between the average spot rate in a period, and the spot rate expected by the markets in the previous period (as measured by the forward rate), and whether this difference could be exploited by traders.

Checking for Model Adequacy

How appropriate is the model? Should some other variable(s) be included, and is the functional form correct? For example, research on the efficient markets hypothesis in exchange markets has used the natural logarithms of the spot and forward exchange rates. Alternatively, researchers have focussed on the rates of return on the spot and forward exchange rates, and not the levels. Researchers have also examined whether the difference between the spot rate and the previous forward rate, $(S_t - F_{t-1})$, can be explained by the difference that was observed in earlier periods. With the relevant data, we could estimate various specifications of the relation between spot and forward exchange rates. How do we choose the best model? This is discussed in Unit 8.
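The interpretation of the estimated function (1.19) is simple arithmetic, and it can be sketched in a few lines. In the illustration below the two coefficients are the estimates reported in the text; the function names and the forward-rate values 1.40, 1.60, 1.61 and 1.80 are hypothetical, chosen only to show the calculation. The second function computes the gap between the fitted spot rate and the previous forward rate, which is one way of looking at the ‘systematic difference’ question raised in connection with the intercept; whether any such gap is statistically meaningful cannot be judged from this arithmetic.

```python
B1_HAT = 0.064   # estimated intercept reported in (1.19)
B2_HAT = 0.962   # estimated slope reported in (1.19)


def predicted_spot(forward_prev):
    """Fitted spot rate given the previous period's forward rate."""
    return B1_HAT + B2_HAT * forward_prev


def implied_gap(forward_prev):
    """Fitted spot rate minus the previous forward rate."""
    return predicted_spot(forward_prev) - forward_prev


# Slope: effect on the fitted spot rate of a 0.01 rise in the forward rate.
slope_effect = predicted_spot(1.61) - predicted_spot(1.60)
print("effect of a 0.01 rise:", round(slope_effect, 5))

# Intercept: the fitted value when the forward rate is zero.
print("fitted value at zero:", predicted_spot(0.0))

# Gap between fitted spot and forward rate at two hypothetical levels.
for f in (1.40, 1.80):
    print(f, round(implied_gap(f), 4))
```

The gap equals $\hat{\beta}_1 + (\hat{\beta}_2 - 1)F_{t-1}$, so with these estimates its sign depends on the level of the forward rate.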
Tests of the Hypothesis

Do the results conform to the theory of the efficient markets hypothesis? With our theory we expect $\beta_1 = 0$ and $\beta_2 = 1$. Is each of these hypotheses supported by the results? Our estimates would appear to be consistent with what we expected to obtain, but we should conduct formal tests to check that this is actually the case. Formal tests of hypotheses will be discussed in Unit 3.

Prediction

How might the estimated model be used for prediction? We could use it to predict what the spot exchange rate would be if the forward rate in the previous period was a particular amount. Suppose the forward exchange rate in the previous month was $1.50 per £1.00. The predicted level of the spot rate is

$$\hat{S}_t = 0.064 + 0.962 \times 1.50. \tag{1.20}$$

Therefore $\hat{S}_t = 1.507$. That is, the spot rate is predicted to be $1.507 per £1.00 if the forward rate in the previous period is $1.50 per £1.00.

1.6 Summary

In this unit we introduced some basic ideas on econometrics and regression analysis. The most important points to remember are the following:
• Econometrics is the application of statistical and mathematical methods to the analysis of data, with the purpose of giving empirical content to economic and financial theories and verifying them or refuting them. Three elements account for the difference in the work of an econometrician in relation to an economic or finance theorist:
1 the fact that we cannot ‘hold other things constant’ in empirical analysis
2 the imperfect nature of relations between variables, which makes the conclusions and outcomes of empirical analysis always contain a considerable element of uncertainty, and
3 the discrepancy between theoretical variables and observed data in terms of coverage and precision of measurement.
• Regression analysis constitutes the statistical foundation of econometric theory and practice.
Its aim is to bring out relations between variables, especially between variables whose relation is subject to chance variation and to the influence of unforeseen events. Regression involves finding an average line, which summarises the relation of Y on X among considerable chance variation and uncertainty of outcome.
• The uncertainty inherent in conclusions and outcomes based on regression analysis is formally modelled through the introduction of a disturbance term in our behavioural equations. This is a stochastic variable, which we cannot observe in practice. However, the residuals of a sample regression function may provide us with an indication as to the behaviour of these unknown disturbances.
• Regression allows us to investigate the association between variables, but this does not imply any causality between them. To establish causality we need to use economic and finance theory.
• In empirical work in economics and finance we cannot use experimentation. Econometric analysis, therefore, is based on careful observation of data drawn from a context that we do not control.

In terms of practical skills, this unit requires that:
• you are familiar with the scatter plot as a practical tool of empirical analysis
• you know how to enter data in Eviews by opening a pre-existing text file
• you know the Eviews commands or operations to obtain a summary of descriptive statistics of a variable, make a scatter plot, create logarithms of variables, and create rates of return.

1.7 Eviews

If you have not done so already, now would be a good time to install and register your copy of the Eviews Student Edition. Instructions for installation and registration are in the booklet that comes with the Eviews CD. It is important to remember that you must register your copy of Eviews. If you do not, it will stop working 14 days after installation!
Reading
Please now quickly read Chapter 1, ‘A Quick Walk Through’, in Eviews Illustrated – An Eviews Primer. You can access this in Eviews via the Help button on the top toolbar. This chapter provides a quick overview of using Eviews; it also follows the steps described in this unit and in the reading from Gujarati and Porter. Do not worry about the detail of this reading at this stage. It is intended to give you a quick idea of some of the things you will be learning in these units.
Richard Startz (2009) Eviews Illustrated, Chapter 1 ‘A Quick Walk Through’.

Eviews is a very easy package to use. Many of the mouse and keyboard operations that you would use in other Windows packages also work in Eviews. With the Exercises in the units there are instructions to help you work with Eviews. To begin with, the instructions are quite detailed. However, as you move on to later units, you should become familiar with the basic operations in Eviews, and the instructions will concentrate on new information required for each set of exercises. Therefore, if you forget how to do something, refer back to the instructions in the earlier units (or use Help in Eviews).

These instructions are specifically related to the exercises, and they do not provide an overall guide to Eviews. This is because there is excellent, comprehensive Help provided by Eviews. You can access the Eviews Help information in a number of ways. Perhaps the easiest is to go to Help on the top toolbar, then Eviews Help Topics... In the Eviews Help Topics... you can look through the Contents, use an A-Z Index, or use the Search facility. Eviews Help Topics... links to the Users Guide I, Users Guide II, and the Command Reference (more on Commands later). If you prefer, you can access these pdf files directly, again via the Help button in Eviews.
The pdf file Users Guide I includes the contents pages for Users Guide I and Users Guide II, and the entries in the contents pages link to the relevant pages in the files. You can also search within the pdf files.

Although easy to use, Eviews is a very powerful econometrics package. It has many features that you will not use in this course, so don’t worry if you see methods or notation in the Help files that are not covered in this course. Everything you need to understand is described in the course units, readings, and exercises.

Lastly, answers to the exercises are provided at the end of the unit, for you to check you have understood and done the exercises correctly. If you do the exercises yourself, you will develop a good understanding of the course materials, and the models and methods described in the units; you will also become more confident using these methods and using Eviews. Do not go straight to the answers!

1.8 Exercises

1 What is the critical distinction between econometrics and (i) economic or finance theory and (ii) mathematical finance and economics?

2 The file C230C330_U1_Q2.txt contains the data used in the example in the unit. It is monthly time series data on the exchange rate between the US dollar and UK sterling, measured in dollars per pound. The current spot exchange rate is denoted S, and the one-month-ahead forward rate is denoted F. The data relate to the period January 1982 to January 2012, and the source of the data is www.bankofengland.co.uk.
a. Produce a plot of the spot rate, S, over time. Comment on the plot. Are there any noteworthy episodes?
b. Produce a scatter plot of the current spot rate, $S_t$, on the vertical axis and the forward rate available in the previous period, $F_{t-1}$, on the horizontal axis. Comment on the scatter plot; would a linear regression seem appropriate?
c. Produce a plot over time of the difference between the current spot rate and the forward rate available in the previous period, $S_t - F_{t-1}$.
Comment on the plot; are there periods when the current spot rate differs noticeably from what is predicted by the previous forward rate?
d. Produce a scatter plot with the difference between the current spot rate and the forward rate available in the previous period, $S_t - F_{t-1}$, on the vertical axis, and this difference one month ago, $S_{t-1} - F_{t-2}$, on the horizontal axis. Comment on the scatter plot; does there appear to be a relationship between the two transformed series?

Data files

The file C230C330_U1_Q2.txt is a tab-separated text file. Eviews can open data stored in a wide variety of different sorts of file, including text files and Excel files. Text files are very basic: they are readable by many applications (you could open them in Excel, in Eviews, and even in Word), and they are robust to upgrades in software. For these reasons, the data files for the course are all provided in the simplest (and most accessible) format, text files.

The first line of C230C330_U1_Q2.txt contains the labels for the three columns: Date, S and F. (Please note that in Eviews certain names are reserved and cannot be used as names for data series. For example, C is reserved for the constant term. If you attempt to import a variable named C, Eviews will rename it C01.) The next row contains the data for the first observation: 31-Jan-82, 1.8835 and 1.8837, separated by tabs. Row 3 is 28-Feb-82, 1.8225, 1.8237, and so on. The final row contains 31-Jan-12, 1.578 and 1.5777.

A useful tip when working with data is to note the first and last observations for your variables, so that you can check files have been opened successfully (and completely).

Open foreign data as a workfile

To open the file in Eviews, go to File/Open/Foreign Data as Workfile ...
This dialogue box allows you to browse folders to find the file C230C330_U1_Q2.txt. After you have found and opened the file, you will get the dialogue box ‘Text Read – Step 1 of 4’.

Step 1 of 4 shows the preview window: how Eviews will interpret the data in the file. You can check that the first values are as noted above. Click Next.

Step 2 of 4 asks about the delimiter between entries; this is a single tab, as indicated, so click Next.

Step 3 of 4 identifies that the column headers (the names of the variables) are in line 1. Click Next.

Step 4 of 4 concerns the Import Method: Eviews will create a new workfile containing the data series. Step 4 also concerns the Structure of the Data to be Imported: in this case the data are dated, with the dates specified by a date series, and in the text file that series is called ‘date’. Just click on Finish.

You should now see the Workfile window (C230C330_U1_Q2) with a list of variables. To see the values of a series, double-click on the name of the series. A new window will open, displaying the values for the series in spreadsheet view. (When opening the text files of data, do not use File/Open/Text File... This really will open the file as if it is a file of text, and not as data.)

Eviews will recognise that the data are monthly, and it will arrange the values into observations: 1982M01, 1982M02, etc. However, Eviews also takes the Date values from the first column in the text file and assigns them to a series in its own right, in this case with values 1982-01-31, 1982-02-28, etc. In many contexts this date series will not be used, especially if you use annual data, in which case it would contain values like 2,009; 2,010 etc. However, in other contexts (daily data with irregular breaks, like the data series in Q4), Eviews uses the date series to index the observations. Therefore it is best to retain this series but to ignore it.
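The check suggested earlier (note the first and last observations, then confirm them after import) can also be sketched outside Eviews. The following is a minimal Python illustration and is not part of the course software. The function names read_tab_file and check_ends are hypothetical, and the sample text holds only the three rows quoted in the description of the file, not the full data set.

```python
import csv
import io

# Excerpt only: the header plus the three rows quoted in the text
# (first, second and final observations). The real file has many more rows.
SAMPLE = (
    "Date\tS\tF\n"
    "31-Jan-82\t1.8835\t1.8837\n"
    "28-Feb-82\t1.8225\t1.8237\n"
    "31-Jan-12\t1.578\t1.5777\n"
)


def read_tab_file(handle):
    """Return (column_names, rows) read from a tab-separated text stream."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)  # line 1 holds the column labels
    rows = [(date, float(s), float(f)) for date, s, f in reader]
    return header, rows


def check_ends(rows, first, last):
    """True if the first and last rows match the values noted beforehand."""
    return rows[0] == first and rows[-1] == last


header, rows = read_tab_file(io.StringIO(SAMPLE))
ok = check_ends(
    rows,
    first=("31-Jan-82", 1.8835, 1.8837),
    last=("31-Jan-12", 1.578, 1.5777),
)
print(header, len(rows), ok)
```

To run the same check on the real file, io.StringIO(SAMPLE) could be replaced by an open file handle such as open("C230C330_U1_Q2.txt", newline=""), assuming the file is in the working directory.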
Note that in general the undo feature (Control and z) does not work in Eviews (although it does work when editing in the Command line). If you have made a mistake when creating a new series, for example, you will have to delete the series and create it again. To delete an object in the Workfile window, right-click the object: a list of possibilities appears, including Delete. And save your Workfiles frequently.

Saving a Workfile

To save this Workfile, make sure the Workfile window is selected (highlighted), go to File on the top toolbar in Eviews, Save As..., and provide a filename and folder where you wish to save the Workfile. Eviews will assume you wish to save the file as a Workfile, so the filename will be c230c330_u1_q2.wf1 in this case, unless you have renamed the file. Note that in the Save window there is a button on the bottom left that allows you to Browse folders (that is, to display the folders for browsing) or to Hide folders. After clicking Save, Eviews will ask you what level of precision you wish to use to save the Workfile. Leave the default choice as it is, and click OK. (If the Workfile window is highlighted, Eviews will save the Workfile. If the Command line is highlighted, Eviews will ask if you wish to save the Workfile or the command log. If you save the command log, it will be saved as a simple text file.)

Producing a Graph

Analysing graphs of your data is a very useful method for identifying general patterns, relations between series, or noteworthy changes in the data. To demonstrate this, Q2a examines a plot of a series over time; Q2b examines a scatter plot where one series is plotted against another; Q2c requires a plot of transformed series over time; and Q2d considers a scatter plot of two transformed series.

To produce a plot of the current spot rate, S, select the object s in the Workfile window. Then go to Quick on the top bar of Eviews, and select Graph... (Eviews then shows the series selected, which is s.
If you had not already selected s, you can type the name of the series directly into the Series List box). Click on OK. This brings up the Graph options window. The selected type of graph is Line & Symbol, which is what you want in this case, so click on OK. This should produce a plot of s over time. The graph already has a title label, s. If you move the mouse pointer over the graph, the observations and values are displayed in the bottom left of the screen; resting on a point on the line will show you which observation you are pointing to and what the value is.

To save the graph in your Workfile you will have to Name it. With the Graph window open, click on Name, and give a Name to identify the object. Note that Names for objects cannot include spaces; one suggestion is to use underscore (_) instead of a space.

To use the graph in other applications, you can save the graph in a variety of formats, or you can simply copy and paste it into a Word document (both very useful when writing assignments). Note that all the instructions that follow will refer to Microsoft Word; operations for other word processing software may not be the same.

Click on the graph area so that the plot area is highlighted (this selects everything, even though only the plot area is highlighted). To save the graph, right click and select Save graph to disk... The Graphics File Save dialogue box then gives you the opportunity to provide a filename for the saved graph, and to browse to the folder where you want the file to be saved. You can also choose the format for the file (e.g. Windows Metafile (*.wmf), Enhanced Metafile (*.emf), *.jpg, *.bmp, etc). Note that Browsing to change the File name/path, and clicking Save, does not save the graph. You also need to click on OK in the Eviews Graphics File Save dialogue box.
If you prefer to copy the graph, select the graph, right click and choose Copy to clipboard ... This then gives you a few options for the copied graph (e.g. use colour, *.wmf or *.emf). Click on OK. Go to your Word document, then press Control and v, or click on Paste, and the graph will be pasted into your document.

Using Commands (an alternative)

So far, the instructions above have used the drop down menus in Eviews. An alternative is to use Commands. (For an introduction to using Commands, see Command and Programming Ref – available in Eviews Help – Chapter 1 ‘Object and Command Basics’.) The command line is the space below the top toolbar in Eviews. As an example, typing (without the inverted commas): ‘graph myplot.plot s’, then pressing the Enter (or return) key, will produce the graph for question 2a. To see the graph, type the command: ‘show myplot’ followed by Enter, and the graph will be displayed. You will now notice there is a new object in the Workfile window. Double clicking on the object myplot also opens the graph.

You may prefer to use Commands, or you may prefer to use the dropdown menus, or you may prefer to switch between them. If you like using the Commands, you might find it useful to develop a list of useful Commands as you work through the exercises in the units.

Notice that once you have pressed the Enter key to execute the Command, the Command stays in the Command area. If you want to do a similar operation again, you can edit the Command line and then run it again; just move the cursor into the line containing the Command (make the edit if required) and press Enter.

Producing a Scatter Plot

Next, produce a scatter plot with \(S_t\) on the vertical axis and \(F_{t-1}\) on the horizontal axis. Go to Quick on the top bar of Eviews, and select Graph... Type the names of the series in the Series List box.
Note that in scatter plots in Eviews, the first series name you type in the series list will be measured on the horizontal axis and the second series name will be measured on the vertical axis. Also note that the Series List can include names of series that you have created or imported, and also expressions. To see how this works, type the series list for this graph: f(-1) s. Click on OK. Then in the Graph Options window, under Graph type, select Scatter, and click OK. This should produce a scatter plot of \(S_t\) on \(F_{t-1}\). Note that if you had typed s and then f(-1) in the series list, you would get a scatter plot of \(F_{t-1}\) on \(S_t\). The expression f(-1) indicates that you want to use the value of F from the previous period (known as the first lagged value of F). The Command to produce this scatter plot is ‘graph myscatter.scat f(-1) s’, which will produce a graph object named myscatter.

You can add labels to the graph using the AddText button. Type in the text for the label. You can specify the position (e.g. Top and centred for a title for the graph), but you can drag the textbox to wherever you want on the graph after you have clicked OK. If you made a mistake when you added the text label, just double click on the text label and you can edit the text in it.

Question 2c requires a plot of \(S_t - F_{t-1}\) over time. In the Graph Series List box (or Command), the expression for this will be s-f(-1). To help interpret the graph, you might wish to add a zero line. This is a horizontal line that goes through zero (measured on the vertical axis). To add this line, have the Graph open, and go to Options. On the left of the window you will see the Option pages arranged in a tree system. Click on Axes & Scaling, and then Data axis labels. Under ‘Axis ticks & lines’ you will see a button titled No zero line. Click on this and select Zero line, background, and click on OK.

Question 2d requires a scatter plot with \(S_t - F_{t-1}\) on the vertical axis, and \(S_{t-1} - F_{t-2}\) on the horizontal axis.
Can you think what two expressions are required for this graph (to go in the Series List box or to put in your Command)? And what order should they be in to produce this scatter plot? The series list will be s(-1)-f(-2) followed by s-f(-1). And remember that it is a Scatter graph.

Once you have produced the graph, you can add a horizontal zero line as before (like the graph in Q2c). You can also add a vertical zero line (passing through zero on the horizontal axis). To do this, select Options, Axes & Scaling, and then Data axis labels. Select the button titled ‘Left axis’, towards the top of the window, and select ‘Bottom axis’. Click on the ‘No zero line’ button and select Zero line, background. This should add a vertical zero line to the graph.

Generating New Series

As you can see, it is possible to type expressions for series (transformations of series) directly into the Graph Series List or in Commands. Sometimes it will be more convenient, or you may prefer it, to create new series that incorporate the transformations, and then work with the new series. So in Q2c you could create a new series, call it Z, equal to the difference between the current spot rate and the forward rate from the previous period, and then plot Z.

You can create a new series in a number of ways. From the top toolbar click on Quick/Generate Series... This brings up the ‘Generate Series by Equation’ dialogue box. In the box titled Enter equation, type (without the inverted commas): ‘z=s-f(-1)’ then click on OK. You will see that there is a new series in the Workfile window. Alternatively, in the Workfile window you could click on the button Genr, to bring up the same ‘Generate Series by Equation’ dialogue box. Or, to generate the new series using a Command, type: ‘genr z=s-f(-1)’ in the Command space, and then Enter.
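To see what an expression such as s-f(-1) computes, here is a minimal Python sketch of the lag and difference operations. It is an illustration only and is not Eviews code: the helper names lag and minus are hypothetical, missing values are represented by None, and the two observations used are the January and February 1982 values quoted in the description of the data file.

```python
def lag(series, k=1):
    """Shift a series back by k periods, padding the start with None,
    so that lag(f, 1)[t] holds the value of f one period before t."""
    n = len(series)
    return [None] * min(k, n) + series[: max(n - k, 0)]


def minus(a, b):
    """Element-wise a - b; the result is None wherever either input is None."""
    return [None if x is None or y is None else x - y for x, y in zip(a, b)]


# First two observations of S and F, as quoted in the text.
s = [1.8835, 1.8225]
f = [1.8837, 1.8237]

# Counterpart of the Eviews expression z = s - f(-1).
z = minus(s, lag(f, 1))
print(z)
```

The first element of z is missing because there is no forward rate before the first observation; the same reasoning explains why s(-1)-f(-2) loses the first two observations.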
Note that if you are using Commands, you can use the editing functions to create and edit your commands: Copy (the control key and c), Cut (control and x), and Paste (control and v). Right-clicking in the command space also gives a drop down menu with these editing functions.

As you can see, there are many ways to work with objects (series, graphs, etc) in Eviews. Often right-clicking on any object in the Workfile window will enable you to open, copy or delete that object.

Now save your Workfile (this saves the original series, the Named graphs, and new series if you have generated them). Remember that if the Workfile window is highlighted, clicking on File/Save As ... will allow you to save the Workfile. If the cursor is in the Command space and the Command Space is highlighted, clicking on File/Save As... prompts Eviews to ask if you wish to save the Workfile or the log of Commands in the Command line.

3 A share is valued at $1000 at the start of year 1. In year 1 it experiences a return of -20%, and in year 2 it experiences a return of +20%. Calculate the value of the share at the end of year 1 and the end of year 2 using a) arithmetic returns, and b) logarithmic returns. Comment on the values you have obtained for the share price at the end of year 2.

4 The tab delimited text file C230C330_U1_Q4.txt contains the share price of Delta Airlines Inc. (DAL) and the New York Stock Exchange Composite Index (NYA). The data are daily, for the period 1 March 2010 to 1 March 2012, and both series are measured in US dollars (source: http://finance.yahoo.com). The text file also includes a column of dates.
a) Plot the series DAL and NYA over time. Comment on the plots.
b) Plot the daily logged return of Delta Airlines shares over time, and comment on the plot.
c) Produce a scatter plot with the daily logged return of Delta shares on the vertical axis, and the daily logged return on the NYSE composite index on the horizontal axis. Comment on the scatter plot.
d) Calculate the means, standard deviations, and minimum and maximum values for the Delta Airlines daily logged return and daily arithmetic return. Comment on the values you have obtained.

The file contains three columns. The first column contains the date; the second column contains the share prices for Delta Airlines Inc. (DAL); and the third column contains the value for the NYSE Composite Index (NYA). For reference, on 1/3/2010 DAL is 13.17 and NYA is 7100.75; on 2/3/2010 DAL is 12.78 and NYA is 7135.97; and on 1/3/2012 DAL is 9.64 and NYA is 8175.11.

When you produce the plots of the Delta share price and NYSE Composite Index for Q4a, you may notice gaps in the graph due to the irregular dates. To close the gaps in the graph, go to Options, select the Graph type page, and in the Sample breaks section, put a tick in the ‘Connect adjacent’ box.

For Q4b you need to plot the daily logged return for the shares of Delta Airlines. Recall from the unit that the daily logged return is equal to

\[ r_t = \ln P_t - \ln P_{t-1} \qquad (1.21) \]

that is, the natural logarithm of the share price, \(P_t\), minus the natural logarithm of the share price from the day before, \(P_{t-1}\). You can obtain a plot of this variable in a number of ways. You can create a new series, call it r, and the expression for r will be r=log(dal)-log(dal(-1)). What does this equation do? The first term is the natural logarithm of the current value of dal (in Eviews, log stands for the logarithm to base e, and not base 10). The second term takes the value of dal from the observation before, and then takes the natural logarithm of it.

Logged returns are used widely in econometrics. Therefore, Eviews has a built-in function or short-cut for this calculation: you can create the daily log return for Delta shares with the expression r = dlog(dal).
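As a check on what the logged-return expression calculates, the following minimal Python sketch applies the same formula to the first two DAL prices quoted above. It is an illustration under stated assumptions and not Eviews code: the Python function here is only a hypothetical stand-in named after the Eviews dlog function, and None marks the first observation, for which no previous price exists.

```python
import math


def dlog(prices):
    """Log return ln(P_t) - ln(P_{t-1}) for each observation.

    The first element is None because there is no earlier price.
    """
    if not prices:
        return []
    returns = [math.log(p) - math.log(q) for q, p in zip(prices, prices[1:])]
    return [None] + returns


# DAL on 1/3/2010 and 2/3/2010, as quoted in the text.
dal = [13.17, 12.78]
r = dlog(dal)
print(r)
```

The second element is negative because the share price fell between the two days. Note that math.log in Python, like log in Eviews, is the logarithm to base e.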
Now you can plot the series r. Alternatively, you can type the expression dlog(dal) directly in the Quick/Graph…Series list box. Or the Command to produce the required graph (called plot_dlogdal) would be ‘graph plot_dlogdal.plot dlog(dal)’.

For the scatter plot in Q4c, you could create a new series for the daily log return on the NYSE Composite Index, or you can work directly with the expressions in the Graph … Series list, or use a Command. Working with the expressions, the Series list would be dlog(nya) dlog(dal). Remember that in a Scatter plot, the first series name in the list will be measured on the horizontal axis, and the second will be on the vertical axis.

Sample statistics

You can produce sample statistics for a series using Quick (on the Eviews top toolbar)/Series Statistics/Histogram and Stats, then typing in the name of the series and clicking on OK. Copying and Pasting this output into Word will copy the complete graph (the histogram and the statistics). You could do this for the logged return for Delta shares, and then repeat this operation for the arithmetic return. The arithmetic return is

\[ R_t = \frac{P_t - P_{t-1}}{P_{t-1}} \qquad (1.22) \]

that is, the current share price minus the share price in the previous period, all divided by the share price in the previous period. In Eviews the equation to create this series, call it ra, would be ra=(dal-dal(-1))/dal(-1). Alternatively, there is a built-in function @pch which produces the one-period percentage change (expressed as a decimal). So the equation to produce the daily arithmetic return for dal would be ra=@pch(dal).

Alternatively, you can produce descriptive statistics for a number of series together.
If you have created two new series for the logged return and arithmetic return, then in the Workfile window select the two objects (click on one, then press the Control key and click on the other), then either double-click the selected series or right click and choose Open Group. In the Group Window, switch to the Stats Table: click on View/Descriptive Stats/Common Sample. This produces a table showing mean, median, maximum, minimum, etc. You can select all of the table (including the row labels) by clicking in the empty top left-hand cell, then right click and copy (or the Control key and C), then paste into Word. (In the Copy Precision dialogue box, just leave the default selection, formatted, and click on OK.) At this stage of the course you can ignore most of this output.

Alternatively, on the Eviews toolbar you can go to Quick/Group Statistics/Descriptive Statistics/Common Sample, and then type in the names of the required series in the Series list box, including any necessary transformations, e.g. dlog(dal) @pch(dal).

Or, if you have generated new series for the logged returns and arithmetic returns, you can produce the descriptive statistics for the series using the Command ‘r.stats’ for the series of log returns, r; and ‘ra.stats’ for the series of arithmetic returns, ra.

1.9 Answers to Exercises

1 Economic and finance theory can be viewed as a set of qualitative relations among variables. Such theory can frequently be written in the form of a mathematical model. An econometric model may be obtained from an appropriate mathematical model with the addition of a random error term. By using data to estimate the econometric model we can in effect quantify financial and economic relations.

2 a) The plot of S over time is shown in Figure 1.2. The plot of the current spot rate, measured as US dollars per pound, reveals a number of notable episodes.
For example, there is a sharp depreciation of sterling in 1992, when sterling left the European Exchange Rate Mechanism. (A lower value for S means that one pound will buy fewer dollars, or equivalently, it takes fewer dollars to buy one pound). There is another sharp depreciation of sterling against the dollar (and other currencies) after the 2008 financial crisis.

Figure 1.2 Plot of S 1982–2012

b) The scatter plot of \(S_t\) against \(F_{t-1}\) is shown in Figure 1.1 in the unit. The scatter plot shows that \(S_t\) and \(F_{t-1}\) have the expected positive relationship. The relationship seems to be approximately linear and seems to be relatively strong, in that the observations appear close to a regression line drawn in the scatter plot.

c) Figure 1.3 shows the plot of \(S_t - F_{t-1}\) over time. This is the difference between the current spot rate and the one-month ahead forward rate that was available one month previously. According to the efficient markets hypothesis, the forward rate should be a good predictor of the spot rate, so that any differences between \(S_t\) and \(F_{t-1}\) should be random. Any differences also reflect information that has become available between the time the forward exchange rate was formed and the time the current spot rate was formed. In the first few years of the sample there are months when \(S_t - F_{t-1}\) is consistently negative. If \(F_{t-1}\) is consistently greater than \(S_t\) it suggests that the forward market is consistently over-predicting the value of the spot rate. Looking back at Figure 1.2, sterling was steadily depreciating in this period. This means the forward market is consistently under-predicting the extent of the depreciation in the spot rate. Again in 2008, there are relatively large negative values for \(S_t - F_{t-1}\) for a few months, and the same interpretation might be applied: the forward market is not adequately predicting the depreciation in sterling.
Figure 1.3 Plot of S – F(-1) 1982–2012

d) Figure 1.4 shows the scatter plot of \(S_t - F_{t-1}\) against \(S_{t-1} - F_{t-2}\). That is, the difference between the current spot rate and the forward rate one month ago, plotted against the difference in the previous period.

Figure 1.4 Scatterplot of S – F(-1) against S(-1) – F(-2) 1982–2012 (vertical axis: S-F(-1); horizontal axis: S(-1)-F(-2))

The scatter plot allows us to examine whether the forecasting error between \(S_t\) and \(F_{t-1}\) can be explained by the forecasting error in the earlier period, \(S_{t-1} - F_{t-2}\). Figure 1.4 suggests there is no obvious relationship, positive or negative, between the forecasting error in one period and the forecasting error in the period that follows.

3 The value of the share at the start of year 1 is $1000, and in year 1 it experiences a return of –20% or –0.20. In year 2 the return is +20% or +0.20.
a) Using arithmetic returns, the share price at the end of year 1 is
\[ 1000 \times (1 - 0.20) = 800 \]
The share price at the end of year 2 is
\[ 800 \times (1 + 0.20) = 960 \]
b) Using logarithmic returns, the share price at the end of year 1 is
\[ 1000 \times e^{-0.20} = 818.73 \]
The share price at the end of year 2 is
\[ 818.73 \times e^{0.20} = 1000 \times e^{-0.20 + 0.20} = 1000 \]
or $1000.
Arithmetic returns are not symmetric: a negative return, followed by a positive return of the same magnitude, does not restore the share to the original price. However, logarithmic returns are symmetric: a negative return followed by a positive return of the same magnitude does restore the share to the original value.

4 a) The plot of the Delta Airlines Inc. share price and the NYSE Composite Index is shown in Figure 1.5.

Figure 1.5 Plot of DAL and NYA March 2010 to March 2012

The Delta Airlines share price is measured on the left-hand axis and the NYSE Composite Index is measured on the right-hand axis.
Presented in this way the plots are not directly comparable, but you can see there are periods when both series generally move together, and there are other times when one series exhibits sharp movements that are not shown in the other series.

b) Figure 1.6 shows the plot of the daily logged return of the Delta Airlines share price. The daily logged return crosses the zero line frequently. Occasionally there are large positive and negative daily returns, of around +0.12 (12%) and –0.12 (minus 12%).

Figure 1.6 Plot of DAL daily logged return March 2010 to March 2012

c) Figure 1.7 shows the scatter plot of the daily logged return on Delta shares (on the vertical axis) against the daily logged return on the NYSE Composite Index (on the horizontal axis). There would seem to be a positive, linear relationship between the two series.

Figure 1.7 Scatter plot of DAL daily logged return and NYA daily logged return (vertical axis: DLOG(DAL); horizontal axis: DLOG(NYA))

d) The histogram and statistics for the Delta daily logged return are shown in Figure 1.8. The descriptive statistics for the daily logged return and daily arithmetic return for the Delta share price are shown in Table 1.2.

Figure 1.8 Histogram and descriptive statistics dlog(dal)

Series: DLOG(DAL)
Sample 1/03/2010 1/03/2012
Observations 506
Mean           -0.000617
Median         -0.000895
Maximum         0.104360
Minimum        -0.120286
Std. Dev.       0.029401
Skewness        0.088426
Kurtosis        4.036065
Jarque-Bera     23.29091
Probability     0.000009

Table 1.2 Descriptive statistics for dlog(dal) and @pch(dal)

                DLOG(DAL)    @PCH(DAL)
Mean            -0.000617    -0.000185
Median          -0.000895    -0.000895
Maximum          0.104360     0.110000
Minimum         -0.120286    -0.113333
Std. Dev.        0.029401     0.029448
Skewness         0.088426     0.222162
Kurtosis         4.036065     4.104456
Jarque-Bera      23.29091     29.88029
Probability      0.000009     0.000000
Sum             -0.312020    -0.093542
Sum Sq. Dev.     0.436530     0.437919
Observations     506          506

For small changes, the logged return and arithmetic return are approximately equal. However, for larger changes this approximation is not so close. You can see this in the maximum and minimum values for the two series in Table 1.2: the maximum is 0.104360 for the logged return and 0.110000 for the arithmetic return, and the minimum is -0.120286 for the logged return and -0.113333 for the arithmetic return.

References

Gujarati D and D Porter (2010) Essentials of Econometrics, Fourth edition, New York: McGraw-Hill Book Company.
Maddala GS (1992) Introduction to Econometrics, New York: Macmillan.
Mosteller F and JW Tukey (1977) Data Analysis and Regression: a second course in statistics, Massachusetts: Addison-Wesley.
Startz R (2009) Eviews Illustrated – An Eviews Primer, Irvine California: Quantitative Micro Software.