Econometric Principles and Data Analysis

product: 4339 | course code: c230 | c330
© Centre for Financial and Management Studies
SOAS, University of London
1999, revised 2003, 2007, revised 2009, 2010, revised 2013
All rights reserved. No part of this course material may be reprinted or reproduced or utilised in any form or by any electronic,
mechanical, or other means, including photocopying and recording, or in information storage or retrieval systems, without
written permission from the Centre for Financial & Management Studies, SOAS, University of London.
Course Introduction and Overview
Contents
1 Course Objectives
2 The Course Authors
3 The Course Structure
4 Learning Outcomes
5 Study Materials
6 Assessment
1 Course Objectives
This course provides an introduction to econometric methods. In brief, the
course examines how we can start from relationships suggested by financial
and economic theory, formulate those relationships in mathematical and
statistical models, estimate those models using sample data, and make
statements based on the parameters of the estimated models. The course
examines the assumptions that are necessary for the estimators to have
desirable properties, and the assumptions necessary for us to make statistical
inference based on the estimated models. In addition, the course explores what
happens when these assumptions are not satisfied, and what we can do in these
circumstances. The course concludes with an examination of model selection.
2 The Course Authors
The course, and its more advanced sequel, Econometric Analysis and
Applications, were designed and written by Dr Graham Smith, who is Senior
Lecturer in the Department of Economics, SOAS, where he teaches
econometrics to MSc students and carries out research on empirical finance.
His main research interests focus on emerging stock markets and he has
published extensively in international refereed journals. His recent research
demonstrates that stock market efficiency is determined by market size,
liquidity and the quality of markets.
The course has been revised by Dr Jonathan Simms, who is a tutor for
CeFiMS, and has taught at the University of Manchester, the University of
Durham and the University of London. He has contributed to the development of
various CeFiMS courses including Econometric Analysis and Applications;
Financial Econometrics; Risk Management: Principles & Applications; Public
Financial Management: Reporting and Audit; and Introduction to Law and to Finance.
3 The Course Structure
The paragraphs following the list of topics presented in the units provide brief
descriptions of the units’ content. They are intended as an introduction and
overview of the course. More complete, detailed explanation, analysis and
discussion are provided in the units themselves, and in the course textbook. So
don’t worry if you do not understand everything in this short introduction.
Unit 1 Introduction to Econometrics and Regression Analysis
1.1 What is Econometrics?
1.2 How to Use the Course Texts
1.3 Ideas – The Concept of Regression
1.4 Study Guide
1.5 An Example – The Consumption Function
1.6 Summary
1.7 Eviews
1.8 Exercises
1.9 Answers to Exercises
Unit 2 The Classical Linear Regression Model
2.1 Ideas and Issues
2.2 Study Guide
2.3 Example – the Single Index Model (SIM)
2.4 Summary
2.5 Exercises
2.6 Answers to Exercises
Unit 3 Hypothesis Testing
3.1 Ideas and Issues
3.2 Study Guide
3.3 Example – The Capital Asset Pricing Model
3.4 Summary
3.5 Exercises
3.6
Unit 4 The Multiple Regression Model
4.1 Ideas and Issues
4.2 Study Guide
4.3 Example – A Multi-Index Model
4.4 Summary
4.5 Exercises
4.6
Unit 5 Heteroscedasticity
5.1 Ideas and Issues
5.2 Study Guide
5.3 Example – Price-Earnings Ratio
5.4 Summary
5.5 Exercises
5.6
Unit 6 Autocorrelation
6.1 Ideas and Issues
6.2 Study Guide
6.3 Example – The Single-Index Model
6.4 Summary
6.5 Exercises
6.6
Unit 7 Nonnormal Disturbances
7.1 Ideas and Issues
7.2 Study Guide
7.3 Examples
7.4 Summary
7.5 Exercises
Appendix 1: Small-Sample Critical Values for the Jarque-Bera Test
Appendix 2: Stock Market Indices
Unit 8 Model Selection and Course Summary
8.1 Ideas and Issues
8.2 Study Guide
8.3 Example: the Demand for Money Function
8.4 Summary
8.5 Exercises
8.6
8.7
Course Summary: ‘What you do and do not know’
Unit 1 provides an introduction to econometrics and regression analysis. By
regression we mean an equation that captures the mathematical relationship
between the variables, and also the imperfect nature of that relationship. The
unit introduces the stages of an econometric investigation:
• statement of the theory
• collection of data
• mathematical model of the theory (an exact relationship between
variables)
• econometric model of the theory (a stochastic model of the relationship
between variables)
• parameter estimation
• checking for model adequacy
• tests of hypotheses
• prediction.
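To make the distinction between the mathematical model and the econometric model concrete, the two-variable case can be written as follows. The notation (Y_i, X_i, \beta_1, \beta_2 and u_i) is an assumption made here for illustration, and may differ from the notation used in the units and the textbook.

```latex
% Mathematical model: an exact relationship between the variables
Y_i = \beta_1 + \beta_2 X_i

% Econometric model: the same relationship with a stochastic disturbance term u_i
Y_i = \beta_1 + \beta_2 X_i + u_i
```

The disturbance term u_i is what captures the imperfect nature of the relationship.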
Unit 1 also provides guidance on how to use the study materials. In addition,
it provides a brief revision of how to calculate financial rates of return.
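As a reminder of what calculating a rate of return involves, here is a minimal Python sketch of two common definitions, the simple return and the log return. The course itself uses Eviews, so the Python code, the function names and the prices are assumptions made purely for illustration.

```python
import math


def simple_return(p_prev: float, p_now: float) -> float:
    """Simple (arithmetic) rate of return between two consecutive prices."""
    return (p_now - p_prev) / p_prev


def log_return(p_prev: float, p_now: float) -> float:
    """Log return: the first difference of the natural log of the price."""
    return math.log(p_now) - math.log(p_prev)


# Hypothetical closing prices for three consecutive periods.
prices = [100.0, 102.0, 99.96]

simple = [simple_return(a, b) for a, b in zip(prices, prices[1:])]
logs = [log_return(a, b) for a, b in zip(prices, prices[1:])]
```

One useful property of log returns is that they add up over time: the sum of the period-by-period log returns equals the log return over the whole period.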
Each unit includes a worked example. (In Unit 1, the example concerns the
relation between spot and forward exchange rates.) All of the units also
contain exercises, drawn from a wide range of econometric studies, for you
to do in order to develop your own understanding and confidence. Data for the
exercises are provided. The data used in the examples are also provided so
that you can replicate the results presented in the unit (replicating the results
in the example is presented as an exercise).
The course uses the software package Eviews. Results from Eviews are
presented in the units. You are provided with a copy of Eviews to do the unit
exercises. Answers for the exercises are provided at the end of each unit, but
you should look at the answers only after you have done the exercises yourself!
Data on the stock price of Delta Airlines Inc. and the New York Stock
Exchange Composite Index are introduced in the exercises in Unit 1. This
data set is used in a number of units throughout the course, in the worked
examples or the exercises. By applying different econometric tools with the
same data set, it is hoped you will develop a rounded view of how the
methods you will learn relate to each other. A variety of other models and
data sets are also used.
Unit 2 presents the classical linear regression model. It explains the method of
‘ordinary least squares’ (OLS) and how that can be used to estimate the
unknown parameters of a regression equation using sample data. In this unit we
are concerned with models containing two variables; we are trying to discover
how one variable – the explanatory variable – explains another variable – the
dependent variable – and estimate the parameters in that relationship.
We then need to ask whether we can make statements about the true,
unknown, parameters of the model, based on our estimated values. To do this
we need to make a number of assumptions. These assumptions, if satisfied,
ensure that the estimators we use have desirable properties (in brief and
oversimplified terms: the estimators are accurate and efficient). If the
assumptions are satisfied, we can also make predictions about the unknown
model parameters, and we can specify, precisely, how confident we are about
those predictions. Unit 2 also explains goodness of fit: how closely our
estimated model fits our sample data. These ideas are demonstrated using the
single-index market model applied to Delta Airlines Inc., and the British
retailer Marks & Spencer.
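The OLS calculation for a two-variable model, together with the usual goodness-of-fit measure (R-squared), can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the course uses Eviews for estimation, and the function name and the return series below are hypothetical.

```python
from statistics import mean


def ols_two_variable(x, y):
    """Return (intercept, slope, r_squared) for y = b1 + b2*x + u, estimated by OLS."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual and total sums of squares give the goodness of fit.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - y_bar) ** 2 for yi in y)
    return intercept, slope, 1.0 - rss / tss


# Hypothetical weekly returns on a market index (x) and on a single stock (y).
market = [0.010, -0.020, 0.015, 0.005, -0.010, 0.020]
stock = [0.012, -0.025, 0.020, 0.004, -0.008, 0.022]

alpha, beta, r_squared = ols_two_variable(market, stock)
```

In a single-index model of this kind, the slope is the estimated beta of the stock and R-squared is the proportion of the variation in the stock's return that is explained by the market return.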
Unit 3 explores how to test hypotheses. Based on our estimated model
coefficients, can we answer questions of the form:
• Is the true, unknown coefficient negative, zero, or positive?
• Does it take a particular value?
• Is there actually a relationship between the two variables?
Unit 3 uses the capital asset pricing model (CAPM) for GlaxoSmithKline to
demonstrate hypothesis testing. Hypothesis testing is demonstrated further in
the exercises with the single-index model. So, for example, we might be
concerned with how we can test whether the stock we are interested in is
defensive or aggressive; is the company beta less than one or greater than
one? The efficiency of foreign exchange markets is also examined.
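A test of whether a company beta differs from one rests on a t-statistic: the distance between the estimated slope and the hypothesised value, measured in standard errors. The sketch below shows that calculation in Python for a two-variable model. The function name and the data are hypothetical, and in the course itself these figures are read from Eviews output.

```python
from math import sqrt
from statistics import mean


def slope_t_statistic(x, y, hypothesised_slope):
    """t-statistic for H0: slope = hypothesised_slope in y = b1 + b2*x + u."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = rss / (n - 2)         # estimated variance of the disturbance term
    se_slope = sqrt(sigma2_hat / sxx)  # standard error of the slope estimator
    return (slope - hypothesised_slope) / se_slope


# Hypothetical weekly returns on a market index and on a single stock.
market = [0.010, -0.020, 0.015, 0.005, -0.010, 0.020]
stock = [0.012, -0.025, 0.020, 0.004, -0.008, 0.022]

t_beta_equals_one = slope_t_statistic(market, stock, 1.0)
```

The statistic is then compared with a critical value from the t distribution with n - 2 degrees of freedom. A large positive value is evidence of an aggressive stock (beta greater than one) and a large negative value is evidence of a defensive stock (beta less than one).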
Unit 4 extends the analysis to the multiple regression model; these are
regression models in which one variable is explained by two or more
variables. The unit examines the assumptions necessary to estimate and make
predictions with such models. The unit asks what happens if, in a multiple
regression model, the explanatory variables are related to one another (this
is called multicollinearity), in addition to the relationships we hope to
discover between the explanatory variables and the dependent variable. The
techniques of multiple regression are demonstrated with an example of a
multi-index model.
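One simple way to gauge how closely two explanatory variables are related is their sample correlation and the associated variance inflation factor (VIF). The Python sketch below covers only the case of two explanatory variables. It is an illustration under assumed names and data, and is not necessarily the diagnostic used in Unit 4.

```python
from math import sqrt
from statistics import mean


def correlation(a, b):
    """Sample correlation coefficient between two series of equal length."""
    a_bar, b_bar = mean(a), mean(b)
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    return sab / sqrt(saa * sbb)


def vif_two_regressors(x2, x3):
    """Variance inflation factor when the model has exactly two explanatory variables."""
    r = correlation(x2, x3)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical explanatory variables that move closely together.
x2 = [1.0, 2.0, 3.0, 4.0, 5.0]
x3 = [2.1, 3.9, 6.2, 7.8, 10.1]

vif = vif_two_regressors(x2, x3)
```

A VIF of one means the two explanatory variables are uncorrelated in the sample. The larger the VIF, the more the variances of the OLS estimators are inflated by the relationship between the explanatory variables.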
Units 5, 6 and 7 are concerned with what happens if a number of the
assumptions of the classical linear regression model are not satisfied. What
are the consequences for the properties of the ordinary least squares
estimators, and can we still make predictions about the unknown model
parameters based on our estimated model?
Unit 5 is concerned with heteroscedasticity. What is that? Here is a very brief
and simplified explanation; a more detailed and precise explanation is
provided in Unit 5. Unit 1 explains how we can specify a mathematical
relationship between variables. The actual relationship between variables is
not exact, and we attempt to capture this by including an error or disturbance
term in the regression equation. One of the assumptions we make is that the
variance of the disturbance term – how much it varies about its mean value –
is constant for all observations. This is the assumption of homoscedasticity,
and is explained in Unit 2.
In some econometric studies this assumption may not be satisfied. Consider a
cross-section study of commission rates for different brokerage companies.
The disturbance term also attempts to capture those influences on commission rates that we have not included in our model. Is it likely that the variance
of this disturbance term will be constant for all brokerage companies? If the
variance of the disturbance term is not constant, we say there is heteroscedasticity. Unit 5 examines the consequences of heteroscedasticity:
• What are the effects on the properties of OLS estimators, and can we
still make predictions based on our estimated model?
The unit examines how heteroscedasticity can be identified, and how we can
deal with it, either by transforming the model or by using a different estimation
method. If we know what form the heteroscedasticity takes, we can use the
method of weighted least squares. Heteroscedasticity is demonstrated with a
study of price-earnings ratios estimated for a cross-section of companies.
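The idea behind weighted least squares can be sketched as follows: each observation is weighted by the inverse of its disturbance variance, so that observations with noisier disturbances count for less. The Python code below is a minimal illustration for a two-variable model. It assumes, purely for the sake of the example, that the standard deviation of the disturbance is proportional to the explanatory variable, and the function name and data are hypothetical.

```python
def weighted_least_squares(x, y, sigma):
    """Estimate y = b1 + b2*x + u when the disturbance for observation i
    has standard deviation proportional to sigma[i]. Returns (intercept, slope)."""
    w = [1.0 / s ** 2 for s in sigma]  # weights: inverse of the variances
    w_sum = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / w_sum  # weighted means
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / w_sum
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical cross-section in which the spread of y widens as x grows.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.8, 6.5, 7.4, 11.0, 10.9]

# Assumption for illustration: the disturbance standard deviation is proportional to x.
intercept, slope = weighted_least_squares(x, y, sigma=x)
```

If all the values in sigma are equal, the weights are equal and the calculation reduces to ordinary least squares.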
Unit 6 is concerned with autocorrelation. Again, here is a very simple and
brief explanation; a more precise and formal explanation is provided in Unit
6. Consider again the disturbance term that we include in our regression
equation. The disturbance term reflects the stochastic nature of the relationship between variables, and also attempts to capture the elements that we
have not included in the model. Another assumption we make about the
disturbance term is that the disturbance terms for different observations (e.g.
if using annual data, last year and this year, or if using daily data, yesterday
and today) are not related.
This is the assumption of noncorrelated disturbances, and is explained in
Unit 2. If the disturbances for different observations are related, we say that
the disturbance term is serially correlated or ‘autocorrelated’. For example,
an economic or financial shock in one month may have persistent effects in
following months, and if the model does not explicitly include such
persistence effects, the disturbance terms in different months will be
correlated. Unit 6 examines the implications of autocorrelation for the
properties of OLS estimators, and also the consequences for prediction based
on OLS estimators. It also shows how to identify autocorrelation using plots
and more formal tests, and what can be done to take account of autocorrelation, including changing the method of estimation. The effects of
autocorrelated disturbances are demonstrated with the single-index market
model for Delta Airlines, and a model of spot and forward exchange rates.
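One widely used formal test for first-order autocorrelation is based on the Durbin-Watson statistic, which compares successive residuals. The Python sketch below shows the calculation. It is offered only as an example of such a test and is not necessarily the one emphasised in Unit 6. The residuals are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson d statistic computed from a list of regression residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals that drift slowly from positive to negative values,
# as they might when a shock has persistent effects.
residuals = [0.50, 0.40, 0.45, 0.10, -0.20, -0.35, -0.30, -0.10]

d = durbin_watson(residuals)
```

The statistic lies between 0 and 4. Values near 2 suggest no first-order autocorrelation, values well below 2 suggest positive autocorrelation, and values well above 2 suggest negative autocorrelation. The formal decision requires tabulated critical values.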
Unit 7 is concerned with the assumption of normality. In order to make
predictions about the true, unknown model parameters, based on our
estimated values, we need to assume that the disturbance terms are distributed normally – that is, they follow a normal distribution. You are probably
already familiar with the normal distribution from your other studies. It is a
probability distribution with known properties, which allows us to make
statements concerning the unknown model parameters with a certain degree
of confidence – for example, we can reject a hypothesis about a parameter
with a 5% chance of being wrong, or we can be 95% confident that an
unknown parameter takes a value within a certain range of values.
If the disturbance terms are not normally distributed, we are unable to make
such predictions, and there are also consequences for the properties of the OLS
estimators. Unit 7 explains the effects of having disturbances that are not
distributed normally, the tests to detect non-normal disturbances, and what
can be done about non-normal disturbances. This includes the use of dummy
variables to take account of outliers (data points which are very different
from the rest of the sample). These methods are demonstrated with two
examples: stock market returns and the single-index model for Marks &
Spencer. The exercises include consideration of the SIM for Delta Airlines
and for Bank of America.
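The Jarque-Bera test, for which Appendix 1 of Unit 7 gives small-sample critical values, is built from the skewness and kurtosis of the residuals. A minimal Python sketch of the statistic follows. The function name and the residuals are hypothetical, and the critical value computed here is the large-sample one.

```python
from math import log


def jarque_bera(residuals):
    """Jarque-Bera statistic: n/6 * (S^2 + (K - 3)^2 / 4), where S is the
    skewness and K the kurtosis of the residuals."""
    n = len(residuals)
    m = sum(residuals) / n
    m2 = sum((e - m) ** 2 for e in residuals) / n
    m3 = sum((e - m) ** 3 for e in residuals) / n
    m4 = sum((e - m) ** 4 for e in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


# Large-sample 5% critical value. Under normality the statistic is asymptotically
# chi-square with 2 degrees of freedom, whose upper-tail probability is
# exp(-x / 2), so the cut-off is -2 * ln(0.05).
critical_5pct = -2.0 * log(0.05)

# Hypothetical residuals containing one large outlier.
residuals = [0.1, -0.2, 0.05, 0.15, -0.1, 0.0, -0.05, 0.2, -0.15, 3.0]

jb = jarque_bera(residuals)
reject_normality = jb > critical_5pct
```

For a normal distribution the skewness is 0 and the kurtosis is 3, so the statistic is close to zero. In small samples the large-sample critical value can be misleading, which is why small-sample critical values are tabulated.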
Unit 8 is concerned with model selection. One of the assumptions we make is
that the model we estimate is correctly specified: the regression equation
includes all relevant variables, and the functional form of the relationship is
specified correctly – variables are included correctly as levels, or their logged
values are included, or perhaps squared values of the variables are included.
If the model is not correctly specified, this has consequences for the
properties of the OLS estimators and for prediction based on those estimators. In particular, Unit 8 examines the consequences of omitting a relevant
explanatory variable, including an irrelevant explanatory variable, and using
the wrong functional form.
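The consequence of omitting a relevant explanatory variable can be summarised with a standard textbook result, sketched here under assumed notation (the symbols are illustrative and are not taken from the units). Suppose the true model contains two explanatory variables, but only the first is included in the estimated equation:

```latex
% True model
Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i

% Estimated (misspecified) model, which omits X_3
Y_i = \alpha_1 + \alpha_2 X_{2i} + v_i

% Expected value of the OLS slope estimator in the misspecified model,
% where b_{32} is the slope from a regression of X_3 on X_2
E(\hat{\alpha}_2) = \beta_2 + \beta_3 b_{32}
```

The OLS estimator of the slope is therefore biased unless the omitted variable is irrelevant (\beta_3 = 0) or is uncorrelated with the included variable in the sample (b_{32} = 0).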
The unit explains methods to identify misspecified equations. These include
tests specifically designed to identify misspecified models. In addition,
evidence of heteroscedasticity, autocorrelated errors, or non-normal errors,
may be a further sign that a model is not correctly specified. Unit 8 also
shows how we can decide between different specifications of a particular
economic relationship. It demonstrates model selection using the Delta
Airlines data set, and also the SIM for IBM stock. Finally, Unit 8 includes a
summary of the course, to help with your revision for the final examination.
More advanced topics in econometrics are studied in the CeFiMS course
Econometric Analysis & Applications. These include more use of dummy
variables; dynamic models (lags and expectations); simultaneous equation
models; and time series analysis (stationarity and nonstationarity, and forecasting).
4 Learning Outcomes
After studying this course you will be able to:
• explain the principles of regression analysis
• outline the assumptions of the classical normal linear regression model,
and discuss the significance of these assumptions
• explain the method of ordinary least squares
• produce and interpret plots of data
• use the program Eviews to estimate a regression equation, and interpret
the results, for bivariate (two-variable) regression models and multiple
regression models
• test hypotheses concerning model parameters
• test joint hypotheses concerning more than one variable
• discuss the consequences of multicollinearity, the methods for
identifying multicollinearity, and the techniques for dealing with it
• explain what is meant by heteroscedasticity, and the consequences for
OLS estimators and prediction based on those estimators
• assess the methods used to identify heteroscedasticity, including data
plots and more formal tests, and the various techniques to deal with
heteroscedasticity, including model transformations and estimation by
weighted least squares
• explain autocorrelation, and discuss the consequences of autocorrelated
disturbances for the properties of OLS estimators and prediction based
on those estimators
• outline and discuss the methods used to identify autocorrelated
disturbances, and what can be done about them, including estimation by
generalised least squares
• discuss the consequences of disturbance terms not being normally
distributed, tests for non-normal disturbances, and methods to deal with
them, including the use of dummy variables
• discuss the consequences of specifying equations incorrectly
• discuss the tests used to identify correct model specification, and
statistical criteria for choosing between models
• use Eviews to conduct tests for heteroscedasticity, correlated
disturbances, nonnormal disturbances, functional form, and model
selection
• use Eviews to estimate models in which the disturbance term is
assumed to be heteroscedastic or autocorrelated.
5 Study Materials
These course units are your central learning resource; they structure your
learning unit by unit. Each unit should be studied within a week. The course
units are designed in the expectation that studying the unit and the associated
readings in the textbook, and completing the exercises, will require 15 to 20
hours during the week.
Textbook
In addition to the course units you must read the assigned sections from the
textbook, which is provided with your course materials:
Damodar N Gujarati and Dawn C Porter (2010) Essentials of
Econometrics, New York: McGraw-Hill.
We have specifically used this textbook because it provides an excellent user-friendly introduction to econometric theory and techniques. You will notice
that Gujarati and Porter present examples from finance, economics and
business, because it is an introduction to econometrics in general. The
examples and exercises in the course units are drawn entirely from finance.
In each course unit there is a section, called Study Guide, which leads you
through the relevant parts of the textbook, and helps you to read and
understand the analysis presented there. If, while studying this course, you
find you need some revision in basic probability and statistics, you may find
it useful to look at parts of Appendices A to D in the textbook, which cover
probability, probability distributions, and statistical inference.
Eviews
You have been provided with a copy of Eviews, Student Edition. This is the
econometrics software that you will use to do the exercises in the units, and
also the data analysis part of your assignments. The results presented in the
units are also from Eviews.
Instructions to install Eviews, and to register your copy of the software, are
included in the booklet that comes with the Eviews CD. (Your student edition
of Eviews will run for two years after installation, and you will be reminded
of this every time you open the program.)
You must register your copy of Eviews within 14 days of installing it on
your computer. If you do not register your copy within 14 days, the
software will stop working.
Eviews is very easy to use. As with any Windows program, you can operate it in
a number of ways:
• there are drop-down menus
• selecting an object and then right-clicking provides a menu of available
operations
• double-clicking an object opens it
• keyboard shortcuts work.
There is also the option to work with Commands; these are short statements
that inform the program what you wish to do, and once you have built up
your own vocabulary of useful Commands, this can be a very effective way
of working. You can also combine all of these ways of working with Eviews.
In each unit there are instructions to help you use Eviews to do the exercises.
In addition, Eviews includes help files, which you can read as pdf files, or
navigate via the Eviews help and search facility. Unit 1 includes a section
introducing Eviews.
Although easy to use, Eviews is a very powerful program. There are
advanced features that you will not use on this course, and you should not be
worried if you see these, either in the menus or the help files. The best advice
is to stay focused on the subject that is being studied in each unit, and to do
the exercises for the unit; this will reinforce your understanding and also
develop your confidence in using data and Eviews.
Exercises
As already noted, there are exercises in every unit. These require you to work
with Eviews and data files, available from the VLE in the course area for this
study session, to do your own econometric analysis. It is very important that
you attempt these exercises, and do not just look at the Answers at the end of
the units. Your understanding of the material you have studied in the unit will
be greatly improved if you do the exercises yourself. You will also develop
better understanding and confidence in using Eviews.
The Instructions that accompany the exercises in the first few units are quite
detailed, because they are intended to help you to start working with Eviews.
As the units progress, it is assumed that you will gradually develop your
understanding of the basic Eviews operations, and the Instructions then focus
more and more on what new operations are required to do the Exercises in
the units. If you find that you have forgotten how to do something, look back
at the Instructions in the early units, because the basic operations will be the
same.
Podcast
There is a podcast to accompany Econometric Principles & Data Analysis, in
which Dr Simms discusses the course with Pasquale Scaramozzino, Professor
of Economics at the Centre for Financial and Management Studies. The
podcast is 18 minutes 26 seconds long. Timings in the podcast are indicated
below in brackets.
The podcast begins by explaining what the course does (from 0:28), and
provides advice on how to study econometrics (1:16). Dr Simms then
discusses how to get the most out of the materials (2:58), including the
examples and exercises, and Eviews, and explains the choice of the textbook,
Essentials of Econometrics. The podcast then addresses the question of how a
course in econometrics helps the understanding of financial markets (7:30).
The discussion here emphasises the importance of being able to interpret
regression results, and assessing the quality of those results; obtaining
estimated equations is not enough in itself. Following this, the podcast
considers how econometrics bridges the gap between theoretical financial
models and financial data (11:56), explaining how econometrics allows us to
test whether a particular theoretical model is appropriate or not, and how
qualities displayed by the data can be used to improve models.
In addition to analysing the examples, completing the exercises, and writing
your assignments, you are also encouraged to apply the methods you are
learning to data sets with which you are familiar from your own working
environment (14:37), and to consider how the methods relate to your work or
areas of interest – this may enable you to develop a more intuitive understanding of the econometric techniques. Finally (16:57) there is a summary of
the podcast discussion, and a consideration of the general approach to take to
your study of econometrics, especially if you are unfamiliar with statistics
and maths, or are returning to these subjects after a period of time.
We suggest that you listen to the podcast before you start studying Unit 1,
and perhaps again half-way through the course when you have finished Unit
4. It may also provide a helpful revision at the end of the course, reinforcing
your understanding of what you have learnt and providing an overall context.
We hope that you enjoy this course.
6 Assessment
Your performance on each course is assessed through two written assignments and one examination. The assignments are written after weeks four
and eight of the course session and the examination is written at a local
examination centre in October.
The assignment questions contain fairly detailed guidance about what is
required. All assignment answers are limited to 2,500 words and are marked
using marking guidelines. When you receive your grade it is accompanied by
comments on your paper, including advice about how you might improve,
and any clarifications about matters you may not have understood. These
comments are designed to help you master the subject and to improve your
skills as you progress through your programme.
The written examinations are ‘unseen’ (you will only see the paper in the
exam centre) and written by hand, over a three-hour period. We advise that
you practise writing exams in these conditions as part of your examination
preparation, as it is not something you would normally do.
You are not allowed to take in books or notes to the exam room. This means
that you need to revise thoroughly in preparation for each exam. This is
especially important if you have completed the course in the early part of the
year, or in a previous year.
Preparing for Assignments and Exams
There is good advice on preparing for assignments and exams and writing them
in Sections 8.2 and 8.3 of Studying at a Distance by Talbot. We recommend
that you follow this advice.
The examinations you will sit are designed to evaluate your knowledge and
skills in the subjects you have studied: they are not designed to trick you. If
you have studied the course thoroughly, you will pass the exam.
Understanding assessment questions
Examination and assignment questions are set to test different knowledge and
skills. Sometimes a question will contain more than one part, each part
testing a different aspect of your skills and knowledge. You need to spot the
key words to know what is being asked of you. Here we categorise the types
of things that are asked for in assignments and exams, and the words used.
All the examples are from CeFiMS exam papers and assignment questions.
Definitions
Some questions mainly require you to show that you have learned some concepts, by
setting out their precise meaning. Such questions are likely to be preliminary and be
supplemented by more analytical questions. Generally ‘Pass marks’ are awarded if the
answer only contains definitions. They will contain words such as:
Describe
Define
Examine
Distinguish between
Compare
Contrast
Write notes on
Outline
What is meant by
List
Reasoning
Other questions are designed to test your reasoning, by explaining cause and effect.
Convincing explanations generally carry additional marks to basic definitions. They will
include words such as:
Interpret
Explain
What conditions influence
What are the consequences of
What are the implications of
Judgment
Others ask you to make a judgment, perhaps of a policy or a course of action. They will
include words like:
Evaluate
Critically examine
Assess
Do you agree that
To what extent does
Calculation
Sometimes, you are asked to make a calculation, using a specified technique, where the
question begins:
Use the single index model analysis to
Using any financial model you know
Calculate the standard deviation
Test whether
It is most likely that questions that ask you to make a calculation will also ask for an
application of the result, or an interpretation.
Advice
Other questions ask you to provide advice in a particular situation. This applies to policy
papers where advice is asked in relation to a policy problem. Your advice should be based
on relevant principles and evidence of what actions are likely to be effective.
Advise
Provide advice on
Explain how you would advise
Critique
In many cases the question will include the word ‘critically’. This means that you are
expected to look at the question from at least two points of view, offering a critique of
each view and your judgment. You are expected to be critical of what you have read.
The questions may begin
Critically analyse
Critically consider
Critically assess
Critically discuss the argument that
Examine by argument
Questions that begin with ‘discuss’ are similar – they ask you to examine by argument, to
debate and give reasons for and against a variety of options, for example
Discuss the advantages and disadvantages of
Discuss this statement
Discuss the view that
Discuss the arguments and debates concerning
The grading scheme
Details of the general definitions of what is expected in order to obtain a
particular grade are shown below. Remember: examiners will take account of
the fact that examination conditions are less conducive to polished work than
the conditions in which you write your assignments. These criteria are used
in grading all assignments and examinations. Note that as the criteria for each
grade rise, they accumulate the elements of the grade below. Assignments
awarded better marks will therefore be comprehensive in both
their depth of core skills and advanced skills.
70% and above: Distinction As for the (60-69%) below plus:
• shows clear evidence of wide and relevant reading and an engagement
with the conceptual issues
• develops a sophisticated and intelligent argument
• shows a rigorous use and a sophisticated understanding of relevant
source materials, balancing appropriately between factual detail and key
theoretical issues. Materials are evaluated directly and their assumptions
and arguments challenged and/or appraised
• shows original thinking and a willingness to take risks
60-69%: Merit As for the (50-59%) below plus:
• shows strong evidence of critical insight and critical thinking
• shows a detailed understanding of the major factual and/or theoretical
issues and directly engages with the relevant literature on the topic
• develops a focussed and clear argument and articulates clearly and
convincingly a sustained train of logical thought
• shows clear evidence of planning and appropriate choice of sources and
methodology
50-59%: Pass below Merit (50% = pass mark)
• shows a reasonable understanding of the major factual and/or theoretical
issues involved
• shows evidence of planning and selection from appropriate sources
• demonstrates some knowledge of the literature
• the text shows, in places, examples of a clear train of thought or argument
• the text is introduced and concludes appropriately
45-49%: Marginal Failure
• shows some awareness and understanding of the factual or theoretical
issues, but with little development
• misunderstandings are evident
• shows some evidence of planning, although irrelevant/unrelated
material or arguments are included
0-44%: Clear Failure
• fails to answer the question or to develop an argument that relates to the
question set
• does not engage with the relevant literature or demonstrate a knowledge
of the key issues
• contains clear conceptual or factual errors or misunderstandings
Specimen exam papers
Your final examination will be very similar to the Specimen Exam Paper that
you received in your course materials. It will have the same structure and
style and the range of questions will be comparable. We do not provide past
papers or model answers to papers. Our courses are continuously updated and
past papers will not be a reliable guide to current and future examinations.
The specimen exam paper is designed to reflect the exam that
will be set on the current edition of the course.
Further information
The OSC will have documentation and information on each year’s examination registration and administration process. If you still have questions, both
academics and administrators are available to answer queries. The Regulations are also available at , setting out
the rules by which exams are governed.
UNIVERSITY OF LONDON
MSc Examination
for External Students
15DFMC230|15DFMC330
Financial Economics
Finance
Specimen Examination
This is a specimen examination paper designed to show you the type of examination
you will have at the end of this course. The number of questions and the structure of
the examination will be the same, but the wording and requirements of each question
will be different.
The examination must be completed in three hours. Answer FOUR questions –
Question One and then THREE other questions.
The examiners give equal weight to each question; therefore, you are advised to
distribute your time approximately equally over four questions.
Candidates may use their own electronic calculators in this examination
provided they cannot store text. The make and type of calculator MUST BE
STATED CLEARLY on the front of the answer book.
Do not remove this Paper from the Examination Room.
It must be attached to your answer book at the end of the examination.
© University of London, 2012
PLEASE TURN OVER
You must answer Question One and then any other THREE questions.
All candidates must attempt Question 1.
1
The Eviews output from estimating a single-index model for Microsoft
Corporation using weekly data for the period from 1 September 2009 to 27
August 2012 is provided below. MSFT is the price of Microsoft Corporation
stock, C is an intercept and SP is the Standard & Poor’s 500 index.
Dependent Variable: DLOG(MSFT)
Method: Least Squares
Sample (adjusted): 8/09/2009 27/08/2012
Included observations: 156 after adjustments

Variable              Coefficient   Std. Error    t-Statistic   Prob.
C                     8.97E-05      0.001650      0.054359      0.9567
DLOG(SP)              0.867121      0.067407      12.86391      0.0000

R-squared             0.517967      Mean dependent var          0.001896
Adjusted R-squared    0.514837      S.D. dependent var          0.029487
S.E. of regression    0.020539      Akaike info criterion       -4.920286
Sum squared resid     0.064962      Schwarz criterion           -4.881185
Log likelihood        385.7823      Hannan-Quinn criter.        -4.904405
F-statistic           165.4801      Durbin-Watson stat          2.177215
Prob(F-statistic)     0.000000
Breusch-Godfrey Serial Correlation LM Test:
F-statistic           0.821681      Prob. F(2,152)              0.4416
Obs*R-squared         1.668569      Prob. Chi-Square(2)         0.4342
Ramsey RESET Test
Equation: DLOGMSFT_C_DLOGSP
Specification: DLOG(MSFT) C DLOG(SP)
Omitted Variables: Squares of fitted values

                      Value         df            Probability
t-statistic           2.219073      153           0.0280
F-statistic           4.924283      (1, 153)      0.0280
Likelihood ratio      4.941733      1             0.0262
Heteroskedasticity Test: White
F-statistic           0.119699      Prob. F(2,153)              0.8873
Obs*R-squared         0.243711      Prob. Chi-Square(2)         0.8853
Scaled explained SS   0.268446      Prob. Chi-Square(2)         0.8744
The calculated Jarque-Bera statistic for the least squares estimation of the single-index
model is 6.410157 (Prob. = 0.040556).
a  Explain the economic rationale underlying the regression equation.
b  Interpret the estimated coefficients.
c  Discuss the adequacy of the model with respect to
i  $R^2$
ii  Serial correlation
iii  Functional form
iv  Normality
v  Heteroscedasticity.
d  Predict the value of the return on Microsoft stock if the
market return is 2 per cent (or 0.02). Is this forecast likely
to be accurate?
2
Explain four of the following:
a  Linear in parameters, and linear in variables
b  The method of ordinary least squares (OLS)
c  The confidence interval for a slope coefficient
d
e  A consistent estimator
f  Under the assumptions of the CLRM, OLS estimators are BLUE.
3
Answer all parts of this question. Using daily data for the period 1 March 2010 to
5 April 2012 (532 observations after adjustments), the following multi-index
model was estimated by ordinary least squares
$$\hat{R}_t = \underset{(0.012)}{0.002} + \underset{(0.031)}{0.902}\,R_{M,t} + \underset{(0.024)}{0.103}\,R_{O,t} + \underset{(0.002)}{0.001}\,TS_t + \underset{(0.003)}{0.002}\,RP_t \qquad (3.1)$$
(… and standard errors are in parentheses)
where $R_t$ is the daily log return on the stock of the American energy
multinational ConocoPhillips, $R_{M,t}$ is the daily log return on the NYSE
Composite Index, $R_{O,t}$ is the daily log return of the Brent crude oil price,
$TS_t$ is a term structure variable, and $RP_t$ is a risk premium variable.
Test the following null hypotheses, explaining carefully in each case the null and
alternative hypotheses, the test statistic, degrees of freedom and the critical value
of the test statistic.
a  the intercept is zero
b  … is independent of …
c  the coefficient on … is less than one
d  Test the hypothesis that the coefficients on $TS_t$ and $RP_t$ are
both zero. For your information, the following equation
was also estimated using the same data and OLS
$$\hat{R}_t = \underset{(0.0004)}{0.0006} + \underset{(0.030)}{0.902}\,R_{M,t} + \underset{(0.024)}{0.104}\,R_{O,t}, \qquad R^2 = 0.703 \qquad (3.2)$$
(Standard errors are in parentheses.)
4
Answer both parts of this question.
a  What is ‘imperfect multicollinearity’ and how might it be
detected?
b  ‘The theoretical consequences of imperfect
multicollinearity are relatively unimportant but the practical
consequences are potentially serious’. Explain and discuss.
5
Answer all parts of this question.
a  How might heteroscedasticity arise?
b  Explain why heteroscedastic disturbances have
consequences for the validity of t and F tests.
c  Explain the Park test of heteroscedasticity.
d  Given
$$Y_i = \beta_1 + \beta_2 X_i + u_i$$
where $\mathrm{var}(u_i) = \sigma^2 X_i^2$,
show how this model can be transformed so that the
disturbances have constant variance.
6
a  What is autocorrelation?
b  Why does it matter?
c  Explain how the Durbin-Watson test can be used for
detecting autocorrelation.
d  For the model
$$Y_t = \alpha + \beta X_t + u_t$$
$$u_t = \rho u_{t-1} + v_t, \qquad |\rho| < 1, \qquad v_t \sim \mathrm{IID}(0, \sigma^2)$$
explain the steps involved in obtaining Cochrane-Orcutt estimates of the unknown parameters.
7
a  What is nonnormality?
b  What are the consequences for the properties of the OLS
estimators if the disturbance terms are not distributed
normally?
c  How would you examine whether the disturbance terms are
distributed normally?
d  If there is evidence that the disturbance terms are not
distributed normally, what would you do?
8
a  Explain the characteristics of a ‘good’ econometric model.
b  What are the consequences of
i  including an irrelevant variable, and
ii  using an incorrect functional form?
c  How might
i  the presence of unnecessary variables, and
ii  an incorrect functional form
be detected?
END OF EXAMINATION
Econometric Principles & Data Analysis
Unit 1 An Introduction to Econometrics and Regression Analysis
Contents
1.1 What is Econometrics?
1.2 How to Use the Course Texts
1.3 Ideas – The Concept of Regression
1.4 Study Guide
1.5 An Example – The Consumption Function
1.6 Summary
1.7 Eviews
1.8 Exercises
References
Unit Content
This unit provides an introduction to econometrics and regression analysis. It
outlines the differences between financial and economic theory and econometrics. The unit explains how stochastic relations between variables are
different to mathematical relations between variables. It explains how uncertainty may be modelled using a disturbance term. The unit introduces the
steps involved in an econometric investigation. Unit 1 also introduces you to
Eviews, the econometrics software you will be using for this course.
Learning Outcomes
After studying this unit, the readings, and the exercises, you will be able to
discuss and apply the following:
• the population regression function
• the sample regression function
• the disturbance (or error) term
• the residual term
• how to use Eviews to open pre-existing text files containing data
• how to create and interpret a scatter plot
• how to obtain summary statistics
• how to create transformations of variables.
Readings for Unit 1
Chapter 1 ‘The Nature and Scope of Econometrics’, from Damodar Gujarati
and Dawn Porter (2010) Essentials of Econometrics, New York: McGraw-Hill/Irwin.
You will also be asked to read Chapter 1, ‘A Quick Walk Through’, in
Richard Startz (2009) Eviews Illustrated – An Eviews Primer, Irvine, California: Quantitative Micro Software. This file will be installed on your computer
when you install your copy of Eviews, and it is accessed via Help in Eviews.
Installation and registration of Eviews
Instructions for installing and registering your copy of the Eviews Student
Edition are in the booklet that comes with the Eviews CD. Instructions to
help you use Eviews to do the Exercises are included in section 1.8 of the
unit.
You must register your copy of Eviews. If you do not, it will stop working
14 days after installation.
Data Files for exercises
You will also be asked to work through exercises, and the data files you need
for these are available from the Online Study Centre, in the course area for
this study session.
1.1 What is Econometrics?
Welcome to this course. The aim of the course is to give you an introduction
to econometric methods or, more specifically, to linear regression, which is
the major statistical foundation of econometric work. This course requires
that you work with data; we hope you will find this interesting and useful,
and that you enjoy the course.
A principal concern of financial and economic theory is relations between
variables. In finance, you may have already studied many of these including
the capital asset pricing model; arbitrage pricing theory; efficient markets
hypothesis; optimal hedging ratios; bid-ask spreads. If you have studied
economics, you may be familiar with consumption, investment, and demand
for money functions; labour supply and labour demand functions; the expectations-augmented Phillips curve; and many others. You could, in fact, view the
whole of economic and finance theory as a set of relations among variables.
What is econometrics? Econometrics is concerned with quantifying financial
and economic relations. Econometrics is of use in providing numerical
estimates of the parameters involved and for testing hypotheses embodied in
the theoretical relationships. Broadly defined, econometrics is
… the application of statistical and mathematical methods to the analysis
of economic data, with a purpose of giving empirical content to economic
theories and verifying them or refuting them.1
This definition is not the only possible one; in fact, in your textbook you will
come across a number of definitions, each of which places the emphasis slightly
differently. Common to all definitions, however, is the stress on the empirical
nature of econometric work: the subject matter of econometrics concerns the
interaction of, and confrontation between, theory and data in quantifying
economic and financial relationships.
Hence, econometrics is not purely a branch of mathematical economics or
mathematical finance. Indeed, mathematical finance or economics need not
have any empirical content at all. Econometrics makes use of mathematical
methods, but its emphasis is on empirical analysis. However, econometrics is
not just a ‘box of tools’ to work with data. It requires, undoubtedly, a good
training in statistical techniques, but these techniques need to be situated in
an interactive process between theory and the data.
To give empirical content to financial and economic theories and to verify
them or refute them, the econometrician is confronted with three types of
problems, which are of lesser or no concern to the theorist.
First, in economic or financial theory we develop models out of a priori
reasoning based on relatively simple assumptions. To do this, we abstract
from secondary complications by assuming that ‘other things remain equal’
while we investigate the relations between a few key economic or financial
variables. In effect, this method reduces to ‘intellectual experimentation’ with
causal relations postulated by theory. For example, in demand theory we say
that the quantity demanded of a commodity (which is not an inferior good)
will fall if its price rises, other things being equal.
1 Maddala, G S (1992: 1).
This method is fruitful in theory but, unfortunately, in empirical economics
and finance the scope for experimentation is severely limited. A researcher
cannot alter a commodity’s price (or an asset’s price), holding other things
constant, in order to see what happens to demand. In general, financial and
economic data are not the outcome of experiments, but rather the product of
observational programmes of data gathering and collection in a world where
other things are never equal. In econometrics, therefore, we can only resort to
careful observation; the basic art of econometric work is more like unravelling a complex puzzle than setting up an experiment in a laboratory.
Second, we need to address the difference between deterministic and stochastic relationships. This issue arises in a different way in economics and in
finance. To make the point, we will explain the distinction between deterministic and stochastic relationships with an example from economics, and then
address it from a financial perspective.
In most economic theory we work with deterministic relationships between
economic variables. Take a simple example: the Keynesian consumption
function. In economic theory we assume that if we know the level of aggregate real income, consumption will be uniquely determined. That is, for each
value of aggregate real income there corresponds a given level of aggregate
consumption. This is a convenient device to enable us to work out exact
solutions for the interplay between variables within the confines of the
assumptions of an economic model.
In reality, however, we do not expect this relationship to be exact: it may be
stable perhaps, but it is surely imperfect. Hence, in econometric work we deal
with imperfect relationships between variables. It follows that our models
cannot be deterministic in nature. We investigate functions between variables
that we believe to be reasonably stable, on average, but there will always be a
degree of uncertainty about outcomes and conclusions derived from such a
model. Econometric modelling requires that we make explicit assumptions
about the character of these imperfections, or disturbances as they are more
commonly labelled. That is, we work with stochastic variables and we need
to model their stochastic nature. This is what makes us enter the areas of
probability theory and statistical inference and estimation.
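The contrast between the two kinds of relationship can be made concrete with a short simulation. The sketch below is written in Python rather than Eviews, and every number in it (an intercept of 20, a marginal propensity to consume of 0.8, a disturbance standard deviation of 5, and the assumption that the disturbance is normally distributed) is a hypothetical value chosen only for illustration, not an estimate from data.

```python
import random

def deterministic_consumption(income, intercept=20.0, mpc=0.8):
    """Consumption exactly determined by income (hypothetical parameters)."""
    return intercept + mpc * income

def stochastic_consumption(income, rng, intercept=20.0, mpc=0.8, sd=5.0):
    """The same function plus a random disturbance, assumed here to be N(0, sd^2)."""
    return deterministic_consumption(income, intercept, mpc) + rng.gauss(0.0, sd)

rng = random.Random(42)          # fixed seed so the run is repeatable
income = 100.0

exact = deterministic_consumption(income)
draws = [stochastic_consumption(income, rng) for _ in range(5)]

print("deterministic consumption:", exact)
print("stochastic consumption   :", [round(c, 2) for c in draws])
```

The deterministic function returns a single value for a given income, whereas repeated draws from the stochastic version scatter around that value. It is the character of this scatter that the assumptions about the disturbance are meant to describe.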
How does the distinction between deterministic and stochastic relationships
arise in finance? Uncertainty is a fundamental element of risk, and the
measurement and management of risk are central aspects of finance. To
demonstrate this, consider the single-index model (which you will examine
and estimate in Unit 2). In the single-index market model the return on a
company stock is considered to be a function of three elements. There is a
fixed element which is specific to the company. There is also a deterministic
relationship between the return on the company stock and the return on a
relevant market index: for each value of the return on the market index there
corresponds a given value for the return on the company stock. (This part of
the model captures the concept of market-determined risk.) In addition, the
return on the company stock is explained by a company-specific disturbance
or error. (The company-specific error captures the concept of company-specific risk.) The single-index model includes the company-specific disturbance not just to make the model more realistic; it is included because we
specifically want to understand the stochastic nature of the return on the
company stock, and thus get a better understanding of the risk associated
with the stock.
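The three elements of the single-index model can be illustrated with simulated returns. In the Python sketch below the parameter values (a fixed element of 0.001, a market sensitivity of 0.9, and the two volatilities) are hypothetical, and the market return and the company-specific error are assumed to be independent normal draws. The sketch splits the variance of the stock return into a market-determined part and a company-specific part.

```python
import random
import statistics

rng = random.Random(0)

alpha, beta = 0.001, 0.9               # hypothetical fixed element and market sensitivity
market_sd, specific_sd = 0.02, 0.015   # hypothetical volatilities
n = 5000

market = [rng.gauss(0.0, market_sd) for _ in range(n)]    # return on the market index
errors = [rng.gauss(0.0, specific_sd) for _ in range(n)]  # company-specific disturbance
stock = [alpha + beta * m + e for m, e in zip(market, errors)]

total_var = statistics.pvariance(stock)
market_part = beta ** 2 * statistics.pvariance(market)    # market-determined risk
specific_part = statistics.pvariance(errors)              # company-specific risk

print("total variance        :", round(total_var, 6))
print("market-determined part:", round(market_part, 6))
print("company-specific part :", round(specific_part, 6))
print("share due to market   :", round(market_part / total_var, 3))
```

Because the two sources of variation are drawn independently, the two parts add up to the total variance almost exactly; the small remainder is the sample covariance between the market return and the disturbance.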
Third, in financial and economic models we work with theoretical variables.
Econometrics, in contrast, deals with observed data. Obviously, there is a
certain correspondence between them; data collection is inspired by theoretical frameworks. For example, national income account data were constructed
after the ascendancy of Keynesian economics, which concerns the analysis of
theoretical aggregates such as output, demand, employment and the price
level. However, observed variables do not fully correspond to their theoretical equivalents because of errors in measurement, conceptualisation and
coverage. This is usually less of a problem for econometrics applied in a
financial context than it is for economics. Financial data on asset prices, for
example, are more closely related to the actual transactions taking place, so
measurement error is less likely. However, we should be aware that movements in financial data may be the result of the particular operating or
reporting features of a market, say, in addition to the desired trading activities
of the participants that our theories suggest. In econometrics we need to be
aware of the nature of the observed data and its implications for investigating
theoretical propositions.
These three elements:
• the fact that we cannot hold other things constant in empirical analysis
• the imperfect nature of relationships between variables and
• the discrepancies between theoretical variables and observed data
give econometrics its distinctive flavour. We cannot move straight from a
financial or an economic model (as formulated by theory) to the data before
we come to terms with these issues. Econometric methods, therefore, aim to
address these issues so as to enable us to engage in meaningful investigation
of economic and financial theories.
Note that we talk about methods and, hence, emphasise the need for methodological groundwork to approach these types of problems. There are no hard
and fast rules to deal with them. There is not a box of magic tricks, which
always work and give us straight answers. Rather, we are left with the task of
studying methodological approaches to issues, which are complex, varied,
but challenging.
This course, Econometric Principles and Data Analysis, deals with regression analysis. Why this focus? We have seen that, in empirical analysis, our
data never behave exactly as our theoretical models would lead us to believe.
Theoretical models are useful abstractions, which provide the applied researcher with analytical handles to make sense of an often bewildering
economic and financial reality. Good theory allows us to search for patterns
within the data and to give meaning to such patterns. But we need to disentangle these patterns in the middle of a great deal of chance variations and
uncertainties of outcomes, which our theories could not possibly aim to
explain. Regression analysis provides us with an analytical framework to
handle relations between variables, especially between variables whose
relation is imperfect.
Indeed, regression analysis seeks to establish statistical regularities among
observed variables. To do this, we need to come to terms with the uncertainty
inherent in the behaviour of our data. For this, we need to equip ourselves
with statistical theory which allows us to model uncertainty as part of
relations between variables. This is the purpose of this course, Econometric
Principles and Data Analysis, of which this is the first unit.
The following are the main points to remember.
• In econometrics we ask how to confront theory with data
so as to quantify our financial and economic relationships, to verify
them or to refute them.
• In practice, we deal with imperfect relationships between variables
which we can only observe (with errors and, often, through proxies) in
a context which we do not control (we cannot experiment).
• It follows that we can only resort to careful observation of complex
phenomena in order to check our theories against the empirical
evidence. This raises questions about econometric methods:
methodological issues about gathering and evaluating such empirical
evidence. Whatever conclusions we draw in such a context will always
involve a considerable degree of uncertainty, even if our models are
correctly specified. For this reason, we resort to probability theory and
statistical inference to deal with uncertainty in assessing outcomes and
conclusions of empirical analysis.
• Since our concern is primarily with investigating relations between
variables, regression analysis constitutes the major tool of statistical
analysis in econometrics.
1.2 How to Use the Course Texts
It is quite possible that you are worried about studying econometrics. After
all, it involves working with mathematics and statistics, and you may feel
that this is not one of your greatest strengths. Alternatively, you may be one
of those who welcome this greater emphasis on mathematics and statistics.
Whichever view you hold, it is useful to be aware of a particular problem that
invariably arises when studying econometrics.
Teaching and learning econometrics almost inevitably involves a preoccupation with technical details: definitions of technical terms, mathematical
derivations, step by step descriptions of statistical procedures etc., all phrased
in technical notation. This is normal and, indeed, necessary. But this preoccupation with technical detail often implies that students lose a perspective
on ‘What is it all about?’ or ‘Why are we doing this?’ That is, there is a need
to keep a focus on the kind of basic questions, uncluttered by notation and
technical detail, which give substance to the subsequent technical exercises.
We need to get an overview of a problem before we explore it aided by our
technical skills. We need to know the simple questions and intuitive insights
which often prompted elaborate technical enquiries.
For this reason, the course texts will always start with a section on ideas or
issues.
The purpose of this is to explain in simple words, with the minimum of
technical notation, the basic substance of the unit. The aim is to give you an
intuitive feel for the subject matter before going into technical detail. If you
feel that mathematics and statistics are not your strongest subjects, this
regular section will give you a few ‘analytical handles’ to hold on to when
studying relevant techniques.
But, alternatively, if you are confident with mathematics and statistics, it is
important not to skip this section. Technical expertise is not just a question of
one’s ability to work out the steps in a technical procedure or to understand a
mathematical derivation. It also involves understanding the type of questions
a technique tries to address as well as the assumptions on which it is based.
Good technical expertise is more than understanding a set of technical skills
(narrowly defined); it also involves analytical insights and judgement of the
appropriateness of particular technical procedures in specific conditions.
The section on ideas or issues will be self-contained; no references will be
made to reading parts of the assigned textbook. Take your time to read it
carefully, and to reflect whether you understand the type of questions which
will be addressed subsequently in technical detail: ‘get familiar with the
forest before you start looking at the trees’. In other words, use this section to
provide you with the ‘analytical handles’ to facilitate the study of the relevant
techniques.
Next, the course texts will have a reading section, or Study Guide, which
guides your study of the textbook, Gujarati and Porter’s Essentials of
Econometrics. The purpose of these sections is to structure your reading of
the textbook as well as to provide brief comments, elaborations and cross-references to exercises and examples, and to suggest short cuts in coping
with the material.
The section after that will normally contain one example. This section has
two purposes. Firstly, the example highlights a specific aspect of the topic
under study in a particular unit of the course. Secondly, the example also tries
to give you a bit of the flavour of econometrics in action. Generally, you will
be asked to participate in the analysis of the example. The examples aim to
highlight the links between economic theory and empirical investigation, and
try to illustrate the problems that can arise when we work with real data.
The next section will provide a brief summary of the main issues raised in the
unit. This will be followed by a section of exercises. It is most important that
you work through all of these exercises. The exercises have three purposes:
• to check your understanding of basic concepts and ideas
• to verify your ability to use technical procedures in practice and
• to develop your skills in interpreting the results of empirical analysis.
The final section of each unit will include brief answers to these exercises,
which you should not look at until after you’ve worked out the answers for
yourself!
You will be using Eviews to do the econometric exercises, and this unit
has an additional section describing this program, which is a widely used
econometrics software package. Instructions to use Eviews will accompany
the exercises, where necessary.
This basic structure of the course texts will be maintained throughout your
study of this course. The section on ideas or issues gives you an overview of
the topic of the unit, using non-technical language. The core of the course
text is the study guide. This guides you through your reading of the textbook
and refers you to the exercises whenever appropriate. The example in each
unit demonstrates a problem dealt with in the course material using real data.
By using examples drawn from areas of finance, using real data, this section
also aims to provide cross-references to the theory courses.
The summary draws your attention to the main points made in the unit. The
exercises are important and you should always work through them. The
exercises will help you to understand the course material. In addition, the
knowledge and experience you gain from doing the exercises will help you to
write assignments and answer examination questions.
1.3 Ideas – The Concept of Regression
The remainder of this unit will deal with the introduction to regression
analysis. As you will see, it is structured along the pattern outlined above.
1.3.1 What is regression?
Regression is the main statistical tool of econometrics. What is regression?
Broadly speaking,
… regression methods bring out relations between variables, especially
between variables whose relations are imperfect in that we do not have
one Y for each X.2
But what do we mean by imperfect relations?
An example may help. Consider the relation between corporate bond spreads
(this is the Y-variable) and the earnings before interest of companies (this is
the X-variable). The spread for a corporate bond is the difference between the
interest rate on the corporate bond and the interest rate on government bonds
of equivalent maturity. Interest rates on corporate bonds are higher than those
on government bonds to reflect expected default loss, different tax treatments
and the riskier return associated with corporate bonds. We would expect that
a company with higher earnings before interest would be less likely to
default, and hence the bond spread for that company would be lower.
2 Mosteller and Tukey (1977: 262).
Hence, we expect that, on average, the corporate bond spread is inversely
related to earnings before interest. But we do not expect this relation to be
perfect. That is, if we were to sample 10 companies with identical earnings
before interest (i.e. equal X-values), we would not expect to get 10 identical
corporate spreads (the Y-values). Differences between the markets in which
the firms operate, in management and in other financial variables (e.g.
coupon rates, coverage ratios) will account for differences in bond spreads.
But, importantly, it is still valid to say that, on average, the bond spread
declines as the level of earnings before interest increases. That is what
Mosteller and Tukey (quoted above) mean when they say that a relation
exists between two variables but that it is imperfect in that we do not have
one Y for each X.
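The idea of sampling ten companies with identical earnings can be sketched in a few lines of Python. The relation used below (a spread of 5 percentage points that falls by 0.02 for each unit of earnings, plus normally distributed firm-specific variation) is entirely hypothetical and serves only to show what 'imperfect' means in practice.

```python
import random
import statistics

rng = random.Random(7)

def bond_spread(earnings, rng):
    """Hypothetical spread in percentage points: falls with earnings, plus firm-specific variation."""
    return 5.0 - 0.02 * earnings + rng.gauss(0.0, 0.4)

# Ten firms with identical earnings of 50, and ten with identical earnings of 150
spreads_low = [bond_spread(50.0, rng) for _ in range(10)]
spreads_high = [bond_spread(150.0, rng) for _ in range(10)]

print("earnings =  50:", [round(s, 2) for s in spreads_low])
print("earnings = 150:", [round(s, 2) for s in spreads_high])
print("average spreads:",
      round(statistics.fmean(spreads_low), 2),
      round(statistics.fmean(spreads_high), 2))
```

Within each group the ten spreads differ even though earnings are the same, so there is not one Y for each X. The group averages nevertheless show the inverse relation: the average spread is lower for the firms with higher earnings.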
This leads us to the discussion of the concept of regression. Regression
methods aim to bring out this average relation between a dependent variable
on the one hand and one or more independent variables on the other. In our
example the average inverse relation between the bond spread and the level
of earnings before interest is the regression of the former variable on the
latter. But, obviously, there will be variation in how markets view the bonds
of individual companies that have broadly the same earnings.
In fact, anyone familiar with data analysis knows very well that we can
always take an average of one or another aspect of a number of individuals,
but we rarely meet the ‘average individual’. So it is also with regression as an
average relation: individual observations will rarely conform to the average
relationship between Y and X. Hence, in regression analysis we seek to
establish statistical regularities in the middle of a great deal of chance variation and uncertainty in outcomes. For this reason, regression methods
involve statistical modelling of the chance variation in the data as well as of
the average relationship.
In summary, we hope that our model captures the basic structure of interaction between economic and financial variables, and we expect that the
behavioural relations are reasonably stable, but imperfect. At most, we expect
these relations to hold ‘on average’. In other words, we seek to discover
structure and regularity within data in the middle of a great deal of uncertainty in outcomes. It is similar to separating sound from noise when trying to
listen to a badly tuned radio.
Therefore, a regression model embraces two components:
• a regression line (which defines the basic structure) and
• disturbances.
Firstly, the regression line models the average relation between the dependent variable and its explanatory variable(s). To do this we make an explicit
assumption about the shape of the regression curve: linear, quadratic, exponential, etc.
Secondly, we recognise the existence of chance fluctuations due to a multitude of factors beyond our control. We model this element of uncertainty (the
noise) in the form of a disturbance term, which constitutes an integral part of
our model. This disturbance term is a ‘catch all for all the variables considered as irrelevant for the purpose of the model as well as all unforeseen
events’.3 It is a random variable which we cannot observe or measure in
practice.
Sometimes we are not interested in the disturbance term as a variable in its
own right, but we are interested in understanding how the disturbance term
affects our attempts to investigate the behavioural relations in the model. In
other circumstances we might be particularly interested in the properties of
the disturbance term, if it reflects an element of uncertainty and risk that we
are trying to understand.
In both cases, we need to model the probabilistic nature of the disturbance
term. In other words, we try to model the character of the uncertainty inherent
in the data. This is no easy task, and we always need to think carefully whether
the assumptions we make about the nature of these chance variations are
indeed appropriate for the type of issue under study. Not surprisingly, a great
deal of econometric theory and practice is concerned with these assumptions.
It is useful to express these important ideas a little more formally. We start
with the population regression function. This is a theoretical construct, which
contains a hypothesis about how the data are generated. For the simple, two-variable linear regression model we have
$$Y_i = \beta_1 + \beta_2 X_i + u_i \qquad (1.1)$$
in which Y is the dependent variable, X is the explanatory variable – sometimes called the regressor, u is the disturbance term, and the subscript i
indicates the ith observation. β1 and β2 are the regression parameters; β1 is the
intercept, or constant, and β2 is the slope coefficient. Typically, the variables
Y and X are observable, the disturbance is not observable, and the parameters
β1 and β2 are unknown. The presence of the random disturbance means that Y
is stochastic; for each value of the explanatory variable, X, there is a distribution of Y-values.
In this explanation of regression we will continue to use the i subscript to
indicate the ith observation. In many financial applications we will examine
series that vary over time, and it will be more meaningful to use a t subscript
to indicate that the observation refers to period t. This will allow us to use
t – 1 to refer to the previous period, etc.
3 Maddala G S (1992: 3).
The population regression function may be viewed as comprising two components: a systematic element represented by a straight line which shows the
statistical dependence of Y on X; and a random, or stochastic, element represented by the disturbance (error) term u. The systematic element can be
expressed as
$$E(Y \mid X_i) = \beta_1 + \beta_2 X_i \qquad (1.2)$$
that is, the average (or expected) value of Y conditional on a given value of X
is a linear function of X – or, more concisely, the average value of Y for each
value of X. That is, the population regression function joins the conditional
means of Y. The disturbance term, u, is the focus of much attention. It accounts for the variation in Y around the population regression line. In Unit 2
you will learn about the important assumptions made about u.
A prime objective of econometrics is to quantify the unknown parameters
β1 and β2. Using a sample of data on Y and X, we obtain estimates, β̂1 and β̂2,
of the unknown population parameters.4
We have the sample regression function
$$Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + e_i \qquad (1.3)$$
in which β̂1 and β̂2 are random variables (the particular estimates obtained
depend on the particular sample of data on Y and X used) that differ from the
population parameters β1 and β2. Consequently, the sample residuals, ei,
differ from the unknown population disturbances, ui. Whereas the disturbance term accounts for the variation in Y around the population regression
line, the residuals give us the vertical deviations of the observed Y-values
from the estimated regression line derived from sample data. The residuals,
therefore, are not identical with the disturbances, but clearly they do tell a
story, which may enable us to assess whether or not our assumptions about
the behaviour of the disturbances seem reasonable. How to analyse the story
or stories told by residuals is a matter we address in the second half of the
course.
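The distinction between disturbances and residuals can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the parameter values, sample size and normal disturbances are hypothetical, and the estimates are computed with the ordinary least squares formulas for the two-variable model, which this unit does not derive (estimation is the subject of Unit 2). Because the data are simulated, the disturbances are known here, which is never the case in practice.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population parameters (illustrative values, not from the course)
beta1, beta2 = 2.0, 0.5

# Population regression function: Y_i = beta1 + beta2 * X_i + u_i
n = 200
X = [random.uniform(0, 10) for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]  # disturbances (unobservable in practice)
Y = [beta1 + beta2 * x + ui for x, ui in zip(X, u)]

# Ordinary least squares estimates for the two-variable model (assumed method)
x_bar = sum(X) / n
y_bar = sum(Y) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
s_xx = sum((x - x_bar) ** 2 for x in X)
beta2_hat = s_xy / s_xx
beta1_hat = y_bar - beta2_hat * x_bar

# Residuals: vertical deviations of observed Y from the estimated line
e = [y - (beta1_hat + beta2_hat * x) for x, y in zip(X, Y)]

print(f"beta1_hat = {beta1_hat:.3f}, beta2_hat = {beta2_hat:.3f}")
print(f"largest |e_i - u_i| = {max(abs(ei - ui) for ei, ui in zip(e, u)):.3f}")
```

Changing the seed gives a different sample and therefore different estimates and residuals, while the population parameters and the form of the model stay the same.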
The predicted value of the dependent variable is given by the sample
regression line
$$\hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_i \qquad (1.4)$$
in which Ŷi is the fitted value of the dependent variable, the estimator of
E(Y | Xi), that is, the estimator of the population conditional mean. The
sample regression line is an estimator of the population regression line.
Notice that we focus on the linear regression model. That is, we are concerned with a model that is linear in the parameters to be estimated. The
model
$$Y_i = \beta_1 + \beta_2 X_i + u_i \qquad (1.5)$$
is linear in β1 and β2. With the sample regression line
$$\hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_i \qquad (1.6)$$
β̂1 is the predicted value of Y (in units of Y) if X = 0. Also, dŶ/dX = β̂2; this
implies that a 1 unit increase in X (measured in units of X) results in a
β̂2 unit increase in Ŷ (measured in units of Y).
4 ^ is read as ‘hat’, hence β̂1 is ‘beta one hat’.
Now consider the model (in which e stands for exponential, not the residual)
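$$Y_i = \alpha X_i^{\beta_2} e^{u_i} \qquad (1.7)$$
which, after taking natural logarithms of both sides of the relation, can be
written as
$$\ln Y_i = \ln\alpha + \beta_2 \ln X_i + u_i$$
or
$$\ln Y_i = \beta_1 + \beta_2 \ln X_i + u_i \qquad (1.8)$$
where β1 = lnα.
This model is also linear in the parameters to be estimated, β1 and β2. We
may view the model as
$$Y_i^* = \beta_1 + \beta_2 X_i^* + u_i \qquad (1.9)$$
where Y*i = ln Yi and X*i = ln Xi. This model is known by a number of different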
names – logarithmic, double log, log-log, log linear, and constant elasticity –
and is frequently used in applied work when it characterises the form of the
functional relationship between the variables. It has the useful property that the
slope coefficient measures the elasticity of Y with respect to X because
$$\beta_2 = \frac{d \ln Y}{d \ln X} = \frac{dY/Y}{dX/X} \qquad (1.10)$$
With this logarithmic model, a 1 per cent increase in X results in a β2 per cent
increase in Y. Note that here we mean a 1 per cent proportionate increase in X,
not that X increases by 100 basis points (1 basis point equals 0.01 per cent).
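The constant elasticity property can be checked numerically. The following is a minimal sketch, assuming hypothetical values for α and β2 and normally distributed disturbances; it generates data from the multiplicative model, takes natural logarithms, and estimates the slope with the ordinary least squares formula for a two-variable model (an assumed method, discussed in Unit 2). The estimated slope should be close to the elasticity β2.

```python
import math
import random

random.seed(1)  # reproducible illustration

# Hypothetical parameters of Y = alpha * X**beta2 * exp(u)
alpha, beta2 = 3.0, 0.8

n = 500
X = [random.uniform(1, 100) for _ in range(n)]
u = [random.gauss(0, 0.1) for _ in range(n)]
Y = [alpha * x ** beta2 * math.exp(ui) for x, ui in zip(X, u)]

# Transform to the log-log form: ln Y = beta1 + beta2 * ln X + u
ln_x = [math.log(x) for x in X]
ln_y = [math.log(y) for y in Y]

# Least squares slope and intercept in the transformed variables
lx_bar = sum(ln_x) / n
ly_bar = sum(ln_y) / n
s_xy = sum((a - lx_bar) * (b - ly_bar) for a, b in zip(ln_x, ln_y))
s_xx = sum((a - lx_bar) ** 2 for a in ln_x)
beta2_hat = s_xy / s_xx               # estimated elasticity of Y with respect to X
beta1_hat = ly_bar - beta2_hat * lx_bar  # estimates ln(alpha)

print(f"estimated elasticity = {beta2_hat:.3f} (true value {beta2})")
print(f"estimated intercept  = {beta1_hat:.3f} (ln alpha = {math.log(alpha):.3f})")
```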
Although regression analysis is related to correlation analysis, conceptually
these two types of analysis are very different. The main aim of correlation
analysis is to measure the degree of linear association between two variables,
and this is summarised by a sample statistic, the correlation coefficient. The
two variables are treated symmetrically. Both are considered random; there is
no distinction between dependent and explanatory variables, and no implication of causality in a particular direction from one variable to the other.
Regression analysis, however, can incorporate relationships between two or
more variables and the variables are not treated symmetrically. The dependent and explanatory variables are carefully distinguished. The former is
random and the latter is often assumed to take the same values in different
samples – often referred to as ‘fixed in repeated samples’. The underlying
economic or financial theory implies that X, an explanatory variable, causes
Y, the dependent variable. Moreover, with more than one explanatory variable, regression analysis quantifies the influence of each explanatory variable
on the dependent variable.
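The symmetry of correlation and the asymmetry of regression can be seen in a short calculation. This sketch uses simulated data with hypothetical parameter values, and least squares slopes as an assumed estimation method. The correlation between X and Y is the same whichever variable is listed first, whereas the slope from regressing Y on X differs from the slope from regressing X on Y.

```python
import math
import random

random.seed(2)  # reproducible illustration

# Illustrative data: Y depends on X plus noise (hypothetical values)
n = 300
X = [random.gauss(0, 2) for _ in range(n)]
Y = [1.0 + 0.5 * x + random.gauss(0, 1) for x in X]

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sample covariance of two equal-length series."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / (len(a) - 1)

def corr(a, b):
    """Sample correlation coefficient; symmetric in its arguments."""
    return cov(a, b) / math.sqrt(cov(a, a) * cov(b, b))

def slope(dep, expl):
    """Least squares slope from regressing dep on expl; not symmetric."""
    return cov(dep, expl) / cov(expl, expl)

r_xy, r_yx = corr(X, Y), corr(Y, X)
b_yx = slope(Y, X)  # regression of Y on X
b_xy = slope(X, Y)  # regression of X on Y

print(f"corr(X, Y) = {r_xy:.3f}, corr(Y, X) = {r_yx:.3f}")
print(f"slope of Y on X = {b_yx:.3f}, slope of X on Y = {b_xy:.3f}")
```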
1.3.2 Data and Regression
Regression methods allow us to investigate associations between variables,
but the inspiration as to which relations to investigate obviously comes
from theory. We are not interested in detecting spurious (false or bogus)
associations between variables. Indeed, relations have to be meaningful –
and whether they are, or not, depends on theoretical argument.
This does not mean, however, that data play only a passive role in economic
and financial analysis. The role of data is not just to provide numerical
support to theoretical arguments. Empirical investigation is an active part of
theoretical analysis in as much as it is concerned with testing theoretical
hypotheses against the data as well as, in many instances, providing clues and
hints towards new avenues of theoretical enquiry. This requires that we
translate our theoretical insights into empirically testable hypotheses, which
we can investigate with observed data. Hence, the process between theory
and the data is interactive: we must continuously investigate the empirical
content of our theoretical propositions in order to test our theories, and pick
up signals from the data that enable us to improve our theoretical insights.
Most of the data we use in applied economic analysis are not obtained
through experimentation but are the result of observational programmes.
National income accounts, agricultural and industrial surveys, financial
accounts, employment surveys, population census data, household budget
surveys and price and income data, among others, are collected by various
statistical offices. They are partial records of what happens; they are not the
outcome of experiments. As we have noted, finance data more closely relate
to actual transactions, but, like economic data, they are not the outcome of
experiment.
The character of this economic and financial data makes the work of an
econometrician quite different from that of a psychologist or an agricultural
scientist. In the latter cases, experiments play a prominent role in analysis,
and much of the emphasis in research work is put on the careful design of
experiments in order to be able to single out effect and response between two
variables while controlling for the influence of other variables (that is, by
holding them constant). In economics and finance, the scope for experimentation is very limited.
We cannot change the price of a stock, holding all other prices constant,
merely to see what would happen in its demand. In theory, we do just that by
assuming that ‘other things are equal’ and postulating cause and effect
between the remaining variables. In empirical analysis, however, other things
are never equal, and we can only carefully observe the behaviour of economic agents from survey data. As you will see in subsequent units, multiple
regression techniques allow us to ‘account’ for the influence of other variables while investigating the interaction between two key variables, but this
is not the same as ‘holding other variables constant’.
The econometrician, therefore, needs to be, above all, a careful observer.
Empirical analysis in economics and in finance allows us to search for
patterns in our data through careful observation backed by theoretical understanding; but experimentation is not really an option we have available,
because we do not have control over the overall context that determines the
movement of our variables.
In analysing data, we should follow the advice ascribed to Darwin. It is
obviously pleasing if the empirical evidence seems to support our theoretical
hypotheses, but – more importantly – we should take special note of any
signs given by the data that go against our arguments. That is, we should not
approach our data merely to confirm answers to well-defined questions
derived from theoretical argument, but we should also look out for hints from
the data about what we do not know – that is, about questions that we have
not confronted yet. A careful observer uses data not just to confirm his or her
theories, but also to get clues from empirical analysis to advance his or her
theoretical grasp of a problem. It is primarily this aspect that enables data to
be used to play an active part in the process of analysis.
1.3.3 Rates of return
Much analysis in financial econometrics is concerned with rates of return,
including returns on shares, stock indices, commodities and exchange rates.
Therefore, at this point in Unit 1 it might be useful, briefly, to refresh your
understanding of returns. In your study of finance or risk management, or in
your work, you may already be familiar with arithmetic and logarithmic rates
of return. For example, logarithmic returns are used especially in the Black-Scholes-Merton model of options pricing.
First consider arithmetic returns. Suppose we have a stock that is worth $1000
at the start of the year and $1050 at the end of the year. Ignoring any dividends, we say that the arithmetic or simple or proportionate rate of return is
$$r = \frac{1050 - 1000}{1000} = 0.05$$
or 5 per cent.
It is the increase (or decrease) in value, divided by the original value. Put
another way, if the stock, initially valued at $1000, benefits from a 5% return
over the year, then the value at the end of the year will be
$$1000(1 + 0.05) = 1050$$
In general terms, if the price at the start of the year is P0, and the stock
experiences a return of r, the price at the end of the year will be
$$P_1 = P_0(1 + r) \qquad (1.11)$$
and the rate of return is
$$r = \frac{P_1 - P_0}{P_0} \qquad (1.12)$$
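Equations (1.11) and (1.12) translate directly into a short calculation. The sketch below uses the $1000 and $1050 figures from the example; the function names are illustrative and are not part of any course software.

```python
def arithmetic_return(p0: float, p1: float) -> float:
    """Arithmetic (simple) rate of return, equation (1.12): r = (P1 - P0) / P0."""
    return (p1 - p0) / p0

def end_price(p0: float, r: float) -> float:
    """End-of-period price, equation (1.11): P1 = P0 * (1 + r)."""
    return p0 * (1 + r)

# Stock worth $1000 at the start of the year and $1050 at the end
r = arithmetic_return(1000.0, 1050.0)
print(f"arithmetic return = {r:.4f}")            # return expressed as a decimal
print(f"end price = {end_price(1000.0, r):.2f}")  # recovers the end-of-year value
```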
To understand logarithmic returns and continuous compounding, it may help
to conduct a short thought exercise. In the previous example, we can think of
the return, r, being applied to the asset once a year (if it makes more sense to
you, think of r as the interest paid on a sum of money in a bank account, paid
annually). Now suppose that this growth rate is applied at more times through
the year, but the rate of return at each point of the year is adjusted to take
account of the increased number of times the return is experienced. Continuing the 5% example, if the return is applied twice in a year, the stock will
benefit from a return of 2.5% in the first six months, and another 2.5% in the
second six months. After six months the asset price will be
$$1000\left(1 + \frac{0.05}{2}\right) = 1025$$
And after one year the asset price will be
$$P_1 = 1000\left(1 + \frac{0.05}{2}\right)^2 = 1050.625$$
The growth of 0.025 or 2.5% in the first six months also benefits from
growth of 0.025 or 2.5% in the second six months. This is known as compounding, and it explains why the value of the stock at the end of the year is
more than 1050. In general, if the return is applied m times in a year, the asset
price at the end of the year will be
$$P_1 = P_0\left(1 + \frac{r}{m}\right)^m \qquad (1.13)$$
We could increase m to 12 or 365, to see what the price of the stock would be
if the return were applied (or compounded) every month or every day. We
could also ask what continuous compounding would look like. Continuous
compounding or continuous growth is when the return is experienced an
infinite number of times in the year, but the return at each point of the year is
infinitesimally small. That is, what happens if m approaches infinity? You
can see that r/m will approach zero, but the expression in brackets will be
raised to the power infinity. The limit of this expression when m approaches
infinity is e^r, where e is equal to 2.718 (to three decimal places). The value e
is known as the base of natural logarithms.
Going back to our example, if the stock is initially valued at $1000, and
experiences continuous growth at an annual rate of 5 per cent (or 0.05), it
will be valued at the end of the year at
$$1000e^{0.05} = 1051.27$$
and in general terms
$$P_1 = P_0 e^{r} \qquad (1.14)$$
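The effect of compounding more and more frequently can be seen by evaluating the end-of-year price for increasing values of m and comparing the results with the continuously compounded value. This is a small sketch using the $1000 and 5 per cent figures from the example; the function name is illustrative.

```python
import math

def compound(p0: float, r: float, m: int) -> float:
    """End-of-year price when the annual return r is applied m times a year."""
    return p0 * (1 + r / m) ** m

p0, r = 1000.0, 0.05

# Annual, semi-annual, monthly and daily compounding
for m in (1, 2, 12, 365):
    print(f"m = {m:>3}: {compound(p0, r, m):.3f}")

# Continuous compounding: the limit as m approaches infinity
continuous = p0 * math.exp(r)
print(f"continuous: {continuous:.3f}")
```

As m increases, the end-of-year price rises towards the continuously compounded value but does not exceed it.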
We can calculate the logarithmic rate of return (also known as the continuously compounded return) as
$$r = \ln\left(\frac{P_1}{P_0}\right) = \ln P_1 - \ln P_0 \qquad (1.15)$$
where ln represents the natural logarithm, or the logarithm to base e. To see
this, take natural logarithms of the end-of-year continuously compounded
stock price
$$\ln P_1 = \ln\left(P_0 e^{r}\right) = \ln P_0 + \ln e^{r} = \ln P_0 + r\ln e = \ln P_0 + r$$
since the natural log of e is 1.
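The logarithmic return can be computed as the natural logarithm of the price ratio. The sketch below, with an illustrative function name, first checks that a price which has grown continuously at 5 per cent gives back a logarithmic return of 0.05, and then compares the logarithmic and arithmetic returns for the earlier $1000 to $1050 example.

```python
import math

def log_return(p0: float, p1: float) -> float:
    """Logarithmic (continuously compounded) return: r = ln(P1 / P0)."""
    return math.log(p1 / p0)

p0 = 1000.0

# Price after one year of continuous growth at 5 per cent
p1_continuous = p0 * math.exp(0.05)
print(f"log return recovered: {log_return(p0, p1_continuous):.4f}")

# Same stock moving from 1000 to 1050: compare the two measures of return
simple = (1050.0 - p0) / p0
logarithmic = log_return(p0, 1050.0)
print(f"arithmetic return = {simple:.4f}, log return = {logarithmic:.4f}")
```

For small returns the two measures are close, with the logarithmic return slightly smaller than the arithmetic return when the price rises.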
In one of the exercises at the end of the unit you will show that arithmetic
returns are not symmetric: if a stock valued at $1000 experiences first a
return of minus 10% and then a return of 10%, it will not be equal to $1000
at the end. On the other hand, you will find out that logarithmic returns are
symmetric. You will also use Eviews to calculate arithmetic and log returns.
Note that in this course, returns will always be calculated as decimals, so a
return of 5%, for example, will be shown in Eviews and in any other calculations as 0.05. It will not be shown as 5.00. A consistent approach is
necessary, and the decimal representation makes calculations a little bit
simpler.
1.4 Study Guide
First, let us consider notation. In econometrics, population parameters and
their estimators are normally denoted by Greek letters, and the course units
follow this standard practice. The textbook, however, differs. Table 1.1
summarises the principal differences and similarities in notation.
Table 1.1 Notation
                          Course units    Textbook
Population parameters     β1, β2          B1, B2
Their estimators          β̂1, β̂2          b1, b2
Disturbances              ui              ui
Residuals                 ei              ei
Number of observations    N               n
For this unit you are requested to study Chapter 1 of the course textbook,
Gujarati and Porter’s Essentials of Econometrics. This chapter has three main
sections. The first two of these address two questions: What is econometrics?
and Why study econometrics? These sections are straightforward, and you
can read them relatively quickly.
Reading
Please now read sections 1.1 and 1.2, pages 1–3, of Gujarati and Porter’s textbook.
Make notes of the important points.
Damodar Gujarati and Dawn Porter (2010) Essentials of Econometrics, Chapter 1 ‘The Nature and Scope of Econometrics’, sections 1.1 and 1.2.
The next section of the textbook is particularly important. It sets out a methodology of econometrics; that is, it explains how you might proceed in a
typical econometric study. Gujarati and Porter identify eight steps associated
with the typical econometric investigation. All of these eight steps are
discussed in the context of a model of labour force participation. Although
this particular example is drawn from economics, you will see that the steps
described are relevant to econometric investigation in any discipline, including finance. You will see that in this example the data are plotted in a scatter
diagram (often called a scatter plot). This can be helpful in giving a simple
illustration of the relationship between two variables in the data. Notice also
the central role of estimating the parameters of the model and so obtaining
the estimated regression line.
The notation in the textbook differs slightly from the notation in these units.
In the context of the model of labour force participation, Gujarati and Porter
define CLFPR as the civilian labour force participation rate and CUNR as the
civilian unemployment rate and write the population regression function as
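$$\mathit{CLFPR} = B_1 + B_2\,\mathit{CUNR} + u \qquad (1.16)$$
which is comparable to our population regression function
$$Y_i = \beta_1 + \beta_2 X_i + u_i. \qquad (1.1)$$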
Reading
Please read carefully section 1.3, pages 3–12, of the textbook.
Damodar Gujarati and Dawn Porter (2010) Essentials of Econometrics, Chapter 1, Section 1.3 ‘The Methodology of Econometrics’.
1.5 An Example – Efficiency in the Foreign Exchange Market
The eight steps explained in the textbook are typical of any econometric
investigation and you are now going to follow them in another example,
examining the hypothesis of efficiency in the foreign exchange market.
Statement of the Theory
Efficiency in markets is a central assumption of many theories in finance and
economics. The efficient markets hypothesis states that current prices will
reflect all available information. Applied in the exchange rate market, the
hypothesis suggests that the forward exchange rate is the market’s expectation of the spot rate that will exist in the future. Any difference between the
forward rate formed in the previous period and the spot rate in the current
period should be entirely random and unpredictable. In addition, there should
be a close relation between the forward rate from the previous period and the
spot rate in the current period.
Collection of Data
The data to be used are monthly time series data for the spot exchange rate
between UK sterling and the US dollar, measured in dollars per pound, and
the one-month-ahead forward exchange rate, also measured in dollars per
pound. The data cover the period January 1982 to January 2012. The source
of the data is www.bankofengland.co.uk.
Figure 1.1 shows a scatter plot of the current spot rate, S, against the forward
rate available in the previous month, F(–1). The figure suggests that the
relationship is upward sloping and it seems to be reasonably linear.
Figure 1.1 Scatter plot of S (current spot rate) on F(–1) (previous forward rate), 1982–2012
[Scatter plot: S on the vertical axis, F(–1) on the horizontal axis.]
Mathematical Model of the Theory
The relation between the current spot rate and the forward rate in the previous month in its simplest form can be presented as a linear relationship
$$S_t = \beta_1 + \beta_2 F_{t-1} \qquad (1.17)$$
where St is the spot rate in period t; Ft−1 is the one-month ahead forward rate
available in the previous period, t − 1; β1 is a constant (or intercept) and β2 is
the slope of the function. For the efficient markets hypothesis to hold we
would expect β1 = 0 and β2 = 1.
Econometric Model of the Theory
The econometric model is stochastic. It includes a random error, ut, which
captures the influence of all the other variables that may influence the spot
exchange rate.
$$S_t = \beta_1 + \beta_2 F_{t-1} + u_t \qquad (1.18)$$
The disturbance term ut is crucial to the distinction between a mathematical
model and an econometric model. In the mathematical model we have a
function – there is a unique value of the spot rate for each value of the
previous forward rate. With the econometric model, we have a relation in
which there is no longer a unique value of the spot rate for each value of the
previous forward rate. In the context of the efficient markets hypothesis the
disturbance term has an additional interpretation: according to the hypothesis,
any difference between the previous forward rate and the current spot rate
should be random and unpredictable.
Parameter Estimation
Using these data and Eviews, it is possible to obtain estimates of the parameters β1 and β2, and so the average relationship between St and Ft−1. The
problem of estimating the coefficients of the population regression function
will be discussed in Unit 2. The function estimated with our data is
$$\hat{S}_t = 0.064 + 0.962\,F_{t-1} \qquad (1.19)$$
and this represents the average relationship between the spot exchange rate
and the previous forward exchange rate. The estimated value of β1, β̂1, is
0.064 and the estimated value of β2, β̂2, is 0.962. Consequently, if the forward exchange rate increases by 0.01, the spot rate in the next period
increases on average by 0.00962. The interpretation of the intercept is not as
straightforward. Mechanical interpretation of the estimate tells us that the
spot exchange rate is $0.064 per pound if the forward exchange rate in the
previous period is zero. On its own, this statement is without meaning.
However, in the context of the efficient markets hypothesis, we may ask if
the estimated constant indicates there is a systematic and predictable difference between the average spot rate in a period, and the spot rate expected by
the markets in the previous period (as measured by the forward rate), and
whether this difference could be exploited by traders.
Checking for Model Adequacy
How appropriate is the model? Should some other variable(s) be included,
and is the functional form correct? For example, research on the efficient
markets hypothesis in exchange markets has used the natural logarithms of
the spot and forward exchange rates. Alternatively, researchers have focussed on the rates of return on the spot and forward exchange rates, and not
the levels. Researchers have also examined if the difference between the
spot rate and the previous forward rate (St − Ft−1) can be explained by the
difference that was observed in earlier periods. With the relevant data, we
could estimate various specifications of the relation between spot and
forward exchange rates. How do we choose the best model? This is discussed in Unit 8.
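The difference between the current spot rate and the previous forward rate requires lagging the forward series by one period. As a rough sketch, with short hypothetical series (the numbers below are illustrative, not taken from the course data) and illustrative function names, the differences and the pairs needed to compare each difference with the one observed a period earlier could be built as follows.

```python
def forecast_differences(spot, forward):
    """Return d_t = S_t - F_{t-1}; the first observation is lost to the lag."""
    return [s - f_prev for s, f_prev in zip(spot[1:], forward[:-1])]

def lagged_pairs(series):
    """Pair each value with the value one period earlier: (d_t, d_{t-1})."""
    return list(zip(series[1:], series[:-1]))

# Hypothetical monthly rates in dollars per pound (illustrative values only)
spot = [1.60, 1.58, 1.61, 1.59]
forward = [1.59, 1.60, 1.60, 1.58]

d = forecast_differences(spot, forward)
pairs = lagged_pairs(d)

print([round(x, 4) for x in d])
print([(round(a, 4), round(b, 4)) for a, b in pairs])
```

Each lag shortens the usable sample by one observation, so four monthly observations give three differences and two pairs.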
Tests of the Hypothesis
Do the results conform to the theory of the efficient markets hypothesis?
With our theory we expect β1 = 0 and β2 = 1. Is each of these hypotheses
supported by the results? Our estimates would appear to be consistent with
what we expected to obtain, but we should conduct formal tests to check that
this is actually the case. Formal tests of hypotheses will be discussed in
Unit 3.
Prediction
How might the estimated model be used for prediction? We could use it to
predict what the spot exchange rate would be if the forward rate in the
previous period was a particular amount. Suppose the forward exchange rate
in the previous month was $1.50 per £1.00. The predicted level of the spot
rate is
$$\hat{S}_t = 0.064 + 0.962 \times 1.50. \qquad (1.20)$$
Therefore Ŝt = 1.507.
That is, the spot rate is predicted to be $1.507 per £1.00 if the forward rate in
the previous period is $1.50 per £1.00.
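The prediction is simple arithmetic with the estimated coefficients. The sketch below uses the estimates reported in the unit (0.064 and 0.962); the function and constant names are illustrative.

```python
# Estimated coefficients reported in the unit for the spot/forward regression
BETA1_HAT = 0.064  # intercept
BETA2_HAT = 0.962  # slope

def predict_spot(forward_prev: float) -> float:
    """Predicted spot rate given the forward rate in the previous period."""
    return BETA1_HAT + BETA2_HAT * forward_prev

# Forward rate of $1.50 per pound in the previous month
prediction = predict_spot(1.50)
print(f"predicted spot rate = {prediction:.3f}")

# Effect of a 0.01 increase in the previous forward rate on the prediction
change = predict_spot(1.51) - predict_spot(1.50)
print(f"change in prediction = {change:.5f}")
```

The second calculation reproduces the interpretation of the slope: a 0.01 increase in the previous forward rate raises the predicted spot rate by 0.00962.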
1.6 Summary
In this unit we introduced some basic ideas on econometrics and regression
analysis. The most important points to remember are the following:
• Econometrics is the application of statistical and mathematical methods
to the analysis of data, with a purpose of giving empirical content to
economic and financial theories and verifying them or refuting them.
Three elements account for the difference in the work of an econometrician
in relation to an economic or finance theorist:
1 the fact that we cannot ‘hold other things constant’ in empirical analysis
2 the imperfect nature of relations between variables which makes the
conclusions and outcomes of empirical analysis always contain a
considerable element of uncertainty, and
3 the discrepancy between theoretical variables and observed data in terms
of coverage and precision of measurement.
Regression analysis constitutes the statistical foundation of econometric
theory and practice. Its aim is to bring out relations between variables,
especially between variables whose relation is subject to chance variation and
to the influence of unforeseen events.
Regression involves finding an average line, which summarises the relation
of Y on X amid considerable chance variation and uncertainty of outcome.
The uncertainty inherent in conclusions and outcomes based on regression
analysis is formally modelled through the introduction of a disturbance term
in our behavioural equations. This is a stochastic variable, which we cannot
observe in practice. However, the residuals of a sample regression function
may provide us with an indication as to the behaviour of these unknown
disturbances.
Regression allows us to investigate the association between variables, but this
does not imply any causality between them. To establish causality we need to
use economic and finance theory.
In empirical work in economics and finance we cannot use experimentation.
Econometric analysis, therefore, is based on careful observation of data
drawn from a context that we do not control.
In terms of practical skills, this unit requires that:
• you are familiar with the scatter plot as a practical tool of empirical
analysis
• you know how to enter data in Eviews by opening a pre-existing text
file
• you know the Eviews commands or operations to obtain a summary of
descriptive statistics of a variable, make a scatter plot, create logarithms
of variables, and create rates of return.
1.7 Eviews
If you have not done so already, now would be a good time to install and
register your copy of the Eviews Student Edition. Instructions for installation
and registration are in the booklet that comes with the Eviews CD.
It is important to remember that you must register your copy of Eviews.
If you do not, it will stop working 14 days after installation!
Reading
Please now quickly read Chapter 1, ‘A Quick Walk Through’, in Eviews Illustrated – An
Eviews Primer. You can access this in Eviews via the Help button on the top toolbar. This
chapter provides a quick overview of using Eviews; it also follows the steps described in
this unit and in the reading from Gujarati and Porter. Do not worry about the detail of
this reading at this stage – it is intended to give you a quick idea of some of the things
you will be learning in these units.
Richard Startz (2009) Eviews Illustrated, Chapter 1 ‘A Quick Walk Through’.
Eviews is a very easy package to use. Many of the mouse and keyboard
operations that you would use in other Windows packages also work in
Eviews.
With the Exercises in the units there are instructions to help you work with
Eviews. To begin with, the instructions are quite detailed. However, as you
move on to later units, you should become familiar with the basic operations
in Eviews, and the instructions will concentrate on new information required
for each set of exercises. Therefore, if you forget how to do something, refer
back to the instructions in the earlier units (or use Help in Eviews).
These instructions are specifically related to the exercises, and they do not
provide an overall guide to Eviews. This is because there is excellent, comprehensive Help provided by Eviews. You can access the Eviews Help
information in a number of ways. Perhaps the easiest is to go to Help on the
top toolbar, then Eviews Help Topics...
In the Eviews Help Topics … you can look through the Contents, use an A-Z
Index, or use the Search facility. Eviews Help Topics... links to the Users
Guide I, Users Guide II, and the Command Reference (more on Commands
later). If you prefer, you can access these pdf files directly, again via the Help
button in Eviews. The pdf file Users Guide I includes the contents pages for
Users Guide I and Users Guide II, and the entries in the contents pages link
to the relevant pages in the files. You can also search within the pdf files.
Although easy to use, Eviews is a very powerful econometrics package. It has
many features that you will not use in this course, so don’t worry if you see
methods or notation in the Help files that are not covered in this course.
Everything you need to understand is described in the course units, readings,
and exercises.
Lastly, answers to the exercises are provided at the end of the unit, for you to
check you have understood and done the exercises correctly. If you do the
exercises yourself, you will develop a good understanding of the course
materials, and the models and methods described in the units; you will also
become more confident using these methods and using Eviews.
Do not go straight to the answers!
1.8 Exercises
1 What is the critical distinction between econometrics and (i) economic or
finance theory and (ii) mathematical finance and economics?
2 The file C230C330_U1_Q2.txt contains the data used in the example in
the unit. It is monthly time series data on the exchange rate between the
US dollar and UK sterling, measured in dollars per pound. The current
spot exchange rate is denoted S, and the one-month ahead forward rate is
denoted F. The data relate to the period January 1982 to January 2012, and
the source of the data is www.bankofengland.co.uk.
a. Produce a plot of the spot rate, S, over time. Comment on the plot. Are
there any noteworthy episodes?
b. Produce a scatter plot of the current spot rate, St, on the vertical axis
and the forward rate available in the previous period, Ft−1, on the
horizontal axis. Comment on the scatter plot; would a linear regression
seem appropriate?
c. Produce a plot over time of the difference between the current spot rate
and the previous forward rate, St − Ft−1.
Comment on the plot; are there periods when the current spot rate
differs noticeably from what is predicted by the previous forward rate?
d. Produce a scatter plot with the difference between the current spot rate
and the previous forward rate, St − Ft−1, on the
vertical axis, and this difference one month ago, St−1 − Ft−2, on the
horizontal axis. Comment on the scatter plot; does there appear to be a
relationship between the two transformed series?
Data files
The file C230C330_U1_Q2.txt is a tab separated text file. Eviews can open
data stored in a wide variety of different sorts of file, including text files and
Excel files. Text files are very basic: they are readable by many applications
(you could open them in Excel, in Eviews, and even in Word), and they are
robust to upgrades in software. For these reasons, the data files for the course
are all provided in the simplest (and most accessible) format, text files.
The first line of C230C330_U1_Q2.txt contains the labels for the three columns: Date, S and F. (Please note that in Eviews certain names are reserved
and cannot be used as names for data series. For example, C is reserved for the
constant term. If you attempt to import a variable named C, Eviews will
rename it C01.) The next row contains the data for the first observation:
31-Jan-82, 1.8835 and 1.8837, separated by tabs. Row 3 is 28-Feb-82, 1.8225,
1.8237, and so on. The final row contains 31-Jan-12, 1.578 and 1.5777. A
useful tip when working with data is to note the first and last observations for
your variables, so that you can check files have been opened successfully (and
completely).
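The same check on the first and last observations can be made outside Eviews. The following is a sketch in Python, which is not part of the course software; it assumes the layout described above (a header row followed by tab-separated rows of date, S and F) and uses only the three rows quoted in the text as an in-memory sample. The function name is illustrative.

```python
import csv
import io
from datetime import datetime

# In-memory sample holding only the three rows quoted in the text;
# the real file contains one row per month between these dates.
sample = (
    "Date\tS\tF\n"
    "31-Jan-82\t1.8835\t1.8837\n"
    "28-Feb-82\t1.8225\t1.8237\n"
    "31-Jan-12\t1.578\t1.5777\n"
)

def read_rates(handle):
    """Read (date, spot, forward) rows from a tab-separated text source."""
    reader = csv.reader(handle, delimiter="\t")
    header = [name.strip().lower() for name in next(reader)]  # tolerate Date/date
    i_date, i_s, i_f = header.index("date"), header.index("s"), header.index("f")
    rows = []
    for rec in reader:
        day = datetime.strptime(rec[i_date], "%d-%b-%y").date()
        rows.append((day, float(rec[i_s]), float(rec[i_f])))
    return rows

rows = read_rates(io.StringIO(sample))

# Check the first and last observations against the values noted in the text
print("first:", rows[0])
print("last: ", rows[-1])
```

To read the actual file instead of the sample, the same function could be called on an open file, for example `with open("C230C330_U1_Q2.txt", newline="") as f: rows = read_rates(f)`, assuming the file is in the working directory.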
Open foreign data as a workfile
To open the file in Eviews, go to File/Open/Foreign Data as Workfile ...
This dialogue box allows you to browse folders to find the file
C230C330_U1_Q2.txt
After you have found and opened the file, you will get the dialogue box ‘Text
Read – Step 1 of 4’. This shows the preview window – how Eviews will
interpret the data in the file. You can check that the first values are as noted
above. Click Next.
Step 2 of 4 asks about the delimiter between entries; this is a single tab, as
indicated, so click Next.
Step 3 of 4 identifies that the column headers (the names of the variables) are
in line 1. Click Next.
Step 4 of 4 concerns the Import Method: Eviews will create a new workfile
containing the data series. Step 4 also concerns the Structure of the Data to be
Imported: In this case the data are dated, with the dates specified by a date
series, and in the text file that series is called ‘date’. Just click on Finish.
You should now see the Workfile window (C230C330_U1_Q2) with a list of
variables. To see the values of a series, double-click on the name of the
series. A new window will open, displaying the values for the series in
spreadsheet view.
(When opening the text files of data, do not use File/Open/Text File... This
really will open the file as if it is a file of text, and not as data.)
Eviews will recognise that the data are monthly, and it will arrange the values
into observations: 1982M01, 1982M02, etc. However, Eviews also takes the
Date values from the first column in the text file and assigns them to a series
in its own right, in this case with values 1982-01-31, 1982-02-28, etc. In
many contexts this date series will not be used, especially if you use annual
data, in which case it would contain values like 2,009; 2,010 etc. However, in
other contexts (daily data with irregular breaks, like the data series in Q4),
Eviews uses the date series to index the observations. Therefore it is best to
retain this series but to ignore it.
Note that in general the undo feature (Control and z) does not work in Eviews
(although it does work when editing in the Command line). If you have made
a mistake when creating a new series, for example, you will have to delete
the series and create it again. To delete an object in the Workfile window,
right-click the object: a list of possibilities appears, including Delete. And
save your Workfiles frequently.
Saving a Workfile
To save this Workfile, make sure the Workfile window is selected (highlighted), go to File on the top toolbar in Eviews, Save As..., and provide a
filename and folder where you wish to save the Workfile. Eviews will
assume you wish to save the file as a Workfile; so the filename will be
c230c330_u1_q2.wf1 in this case, unless you have renamed the file. Note that
in the Save window there is a button on the bottom left that allows you to
Browse folders (that is, to display the folders for browsing) or to Hide folders.
After clicking Save, Eviews will ask you what level of precision you wish to
use to save the Workfile. Leave the default choice as it is, and click OK.
(If the Workfile window is highlighted, Eviews will save the Workfile. If the
Command line is highlighted, Eviews will ask if you wish to save the Workfile or the command log. If you save the command log, it will be saved as a
simple text file.)
Producing a Graph
Analysing graphs of your data is a very useful method for identifying general
patterns, relations between series, or noteworthy changes in the data. To
demonstrate this, Q2a examines a plot of a series over time; Q2b examines a
scatter plot where one series is plotted against another; Q2c requires a plot of
transformed series over time; and Q2d considers a scatter plot of two transformed series.
To produce a plot of the current spot rate, S, select the object s in the Workfile window. Then go to Quick on the top bar of Eviews, and select Graph...
(Eviews then shows the series selected, which is s. If you had not already
selected s, you can type the name of the series directly into the Series List
box). Click on OK. This brings up the Graph options window. The selected
type of graph is Line & Symbol, which is what you want in this case, so click
on OK. This should produce a plot of s over time. The graph already has a
title label, s. If you move the mouse pointer over the graph, the observations
and values are displayed in the bottom left of the screen; resting on a point on
the line will show you which observation you are pointing to and what the
value is.
To save the graph in your Workfile you will have to Name it. With the Graph
window open, click on Name, and give a Name to identify the object. Note
that Names for objects cannot include spaces; one suggestion is to use
underscore (_) instead of a space.
To use the graph in other applications, you can save the graph in a variety of
formats, or you can simply copy and paste it into a Word document (both
very useful when writing assignments). Note that all the instructions that
follow will refer to Microsoft Word; operations for other word processing
software may not be the same. Click on the graph area so that the plot area is
highlighted (this selects everything, even though only the plot area is highlighted).
To save the graph, right click and select Save graph to disk... The Graphics
File Save dialogue box then gives you the opportunity to provide a filename
for the saved graph, and to browse to the folder where you want the file to be
saved. You can also choose the format for the file (e.g. Windows Metafile
(*.wmf), Enhanced Metafile (*.emf), *.jpg, *.bmp, etc). Note that Browsing
to change the File name/path, and clicking Save, does not save the graph.
You also need to click on OK in the Eviews Graphics File Save dialogue box.
If you prefer to copy the graph, select the graph, right click and choose Copy
to clipboard ... This then gives you a few options for the copied graph (e.g.
use colour, *.wmf or *.emf). Click on OK. Go to your Word document, then
press Control and v, or click on Paste, and the graph will be pasted into your
document.
Using Commands (an alternative)
So far, the instructions above have used the drop down menus in Eviews. An
alternative is to use Commands. (For an introduction to using Commands, see
Command and Programming Ref – available in Eviews Help – Chapter 1
‘Object and Command Basics’.) The command line is the space below the
top toolbar in Eviews. As an example, typing (without the inverted commas):
‘graph myplot.plot s’, then pressing the Enter (or return) key, will produce
the graph for question 2a. To see the graph, type the command: ‘show
myplot’ followed by Enter, and the graph will be displayed.
You will now notice there is a new object in the Workfile window. Double
clicking on the object myplot also opens the graph.
You may prefer to use Commands, or you may prefer to use the dropdown
menus, or you may prefer to switch between them. If you like using the
Commands, you might find it useful to develop a list of useful Commands as
you work through the exercises in the units.
Notice that once you have pressed the Enter key to execute the Command,
the Command stays in the Command area. If you want to do a similar operation again, you can edit the Command line then run it again; just move the
cursor into the line containing the Command (make the edit if required) and
press Enter.
Producing a Scatter Plot
Next, produce a scatter plot with \(S_t\) on the vertical axis and \(F_{t-1}\) on the
horizontal axis. Go to Quick on the top bar of Eviews, and select Graph...
Type the names of the series in the Series List box. Note that in scatter plots
in Eviews, the first series name you type in the series list will be measured on
the horizontal axis and the second series name will be measured on the
vertical axis. Also note that the Series List can include names of series that
you have created or imported, and also expressions. To see how this works,
type the series list for this graph: f(-1) s. Click on OK. Then in the Graph
Options window, under Graph type, select Scatter, and click OK. This should
produce a scatter plot of \(S_t\) on \(F_{t-1}\). Note that if you had typed s and then
f(-1) in the series list, you would get a scatter plot of \(F_{t-1}\) on \(S_t\). The expression
f(-1) indicates that you want to use the value of F from the previous period
(known as the first lagged value of F).
The Command to produce this scatter plot is ‘graph myscatter.scat f(-1) s’,
which will produce a graph object named myscatter.
You can add labels to the graph using the AddText button. Type in the text
for the label. You can specify the position (e.g. Top and centred for a title for
the graph), but you can drag the textbox to wherever you want on the graph
after you have clicked OK. If you made a mistake when you added the text
label, just double click on the text label and you can edit the text in it.
Question 2c requires a plot of \(S_t - F_{t-1}\) over time. In the Graph Series List
box (or Command), the expression for this will be s-f(-1). To help interpret
the graph, you might wish to add a zero line. This is a horizontal line that
goes through zero (measured on the vertical axis). To add this line, have the
Graph open, and go to Options. On the left of the window you will see the
Option pages arranged in a tree system. Click on Axes & Scaling, and then
Data axis labels. Under ‘Axis ticks & lines’ you will see a button titled No
zero line. Click on this and select Zero line, background, and click on OK.
Question 2d requires a scatter plot with \(S_t - F_{t-1}\) on the vertical axis, and
\(S_{t-1} - F_{t-2}\) on the horizontal axis. Can you think what two expressions are
required for this graph (to go in the Series List box or to put in your Command)? And what order should they be in to produce this scatter plot?
The series list will be s(-1)-f(-2) followed by s-f(-1). And remember that it is
a Scatter graph. Once you have produced the graph, you can add a horizontal
zero line as before (like the graph in Q2c). You can also add a vertical zero
line (passing through zero on the horizontal axis). To do this, select Options,
Axes & Scaling, and then Data axis labels. Select the button titled ‘Left axis’,
towards the top of the window, and select ‘Bottom axis’. Click on the ‘No
zero line’ button and select Zero line, background. This should add a vertical
zero line to the graph.
Generating New Series
As you can see, it is possible to type expressions for series (transformations
of series) directly into the Graph Series List or in Commands. Sometimes it
will be more convenient, or you may prefer it, to create new series that
incorporate the transformations, and then work with the new series. So in
Q2c you could create a new series, call it Z, equal to the difference between
the current spot rate and the forward rate from the previous period, and then
plot Z.
You can create a new series in a number of ways. From the top toolbar click
on Quick/Generate Series... This brings up the ‘Generate Series by Equation’
dialogue box. In the box titled Enter equation, type (without the inverted
commas): ‘z=s-f(-1)’ then click on OK. You will see that there is a new series
in the Workfile window. Alternatively, in the Workfile window you could
click on the button Genr, to bring up the same ‘Generate Series by Equation’
dialogue box.
Or, to generate the new series using a Command, type: ‘genr z=s-f(-1)’ in the
Command space, and then Enter.
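To see what the expression s-f(-1) does observation by observation, here is a minimal Python sketch. It is an illustration only, not Eviews code: the helper lag is hypothetical, and the data are just the first two observations of S and F quoted earlier for Q2.

```python
def lag(series, k=1):
    """Shift a list back by k periods (k >= 1), padding the start with None.

    This mimics the idea behind f(-k): the value from k observations earlier.
    """
    return [None] * k + series[:-k]


# First two monthly observations quoted in the text (Jan-82 and Feb-82).
s = [1.8835, 1.8225]
f = [1.8837, 1.8237]

f_lag1 = lag(f)  # [None, 1.8837]

# z = s - f(-1); the first observation is missing because there is no earlier F.
z = [None if fl is None else st - fl for st, fl in zip(s, f_lag1)]
print(z)
```

The first value of z is missing because no forward rate is available before the start of the sample; a new series built from a lag always loses its first observation in this way.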
Note that if you are using Commands, you can use the editing functions to
create and edit your commands: Copy (the control key and c), Cut (control
and x), and Paste (control and v). Right-clicking in the command space also
gives a drop down menu with these editing functions.
As you can see, there are many ways to work with objects (series, graphs,
etc.) in Eviews. Often right-clicking on any object in the Workfile window will
enable you to open, copy or delete that object.
Now save your Workfile (this saves the original series, the Named graphs,
and new series if you have generated them). Remember that if the Workfile
window is highlighted, clicking on File/Save As ... will allow you to save the
Workfile. If the cursor is in the Command space and the Command Space is
highlighted, clicking on File/Save As... prompts Eviews to ask if you wish to
save the Workfile or the log of Commands in the Command line.
3 A share is valued at $1000 at the start of year 1. In year 1 it experiences a
return of -20%, and in year 2 it experiences a return of +20%. Calculate
the value of the share at the end of year 1 and at the end of year 2 using
a) arithmetic returns, and
b) logarithmic returns.
Comment on the values you have obtained for the share price at the end of
year 2.
4 The tab delimited text file C230C330_U1_Q4.txt contains the share price
of Delta Airlines Inc. (DAL) and the New York Stock Exchange
Composite Index (NYA). The data are daily, for the period 1 March 2010
to 1 March 2012, and both series are measured in US dollars (source:
http://finance.yahoo.com). The text file also includes a column of dates.
a) Plot the series DAL and NYA over time. Comment on the plots.
b) Plot the daily logged return of Delta Airlines shares over time, and
comment on the plot.
c) Produce a scatter plot with the daily logged return of Delta shares on
the vertical axis, and the daily logged return on the NYSE composite
index on the horizontal axis. Comment on the scatter plot.
d) Calculate the means, standard deviations, and minimum and
maximum values for the Delta Airlines daily logged return and daily
arithmetic return. Comment on the values you have obtained.
The file contains three columns. The first column contains the date; the
second column contains the share prices for Delta Airlines Inc. (DAL); and
the third column contains the value for the NYSE Composite Index (NYA).
For reference, on 1/3/2010 DAL is 13.17 and NYA is 7100.75; on 2/3/2010
DAL is 12.78 and NYA is 7135.97; and on 1/3/2012 DAL is 9.64 and NYA
is 8175.11.
When you produce the plots of the Delta share price and NYSE Composite
Index for Q4a, you may notice gaps in the graph due to the irregular dates.
To close the gaps in the graph, go to Options, select the Graph type page, and
in the Sample breaks section, put a tick in the ‘Connect adjacent’ box.
For Q4b you need to plot the daily logged return for the shares of Delta
Airlines. Recall from the unit that the daily logged return is equal to
\(r_t = \ln(DAL_t) - \ln(DAL_{t-1})\)    (1.21)
that is, the natural logarithm of the share price minus the natural logarithm of
the share price from the day before. You can obtain a plot of this variable in a
number of ways. You can create a new series, call it r, and the expression for
r will be r=log(dal)-log(dal(-1)). What does this equation do? The first term is
the natural logarithm of the current value of dal (in Eviews, log stands for the
logarithm to base e, and not base 10). The second term takes the value of dal
from the observation before, and then takes the natural logarithm of it.
Logged returns are used widely in econometrics. Therefore, Eviews has a
built-in function or short-cut for this calculation: You can create the daily log
return for Delta shares with the expression r = dlog(dal). Now you can plot
the series r. Alternatively, you can type the expression dlog(dal) directly in
the Quick/Graph…Series list box. Or the Command to produce the required
graph (called plot_dlogdal) would be ‘graph plot_dlogdal.plot dlog(dal)’.
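The calculation behind dlog can be checked by hand. The sketch below is a Python illustration, not Eviews code; the function name dlog is borrowed for readability and is a hypothetical helper here, and the data are the first two DAL prices quoted above.

```python
import math


def dlog(series):
    """First difference of natural logs; the first element has no predecessor."""
    return [None] + [
        math.log(cur) - math.log(prev) for prev, cur in zip(series, series[1:])
    ]


# DAL on 1/3/2010 and 2/3/2010, as quoted in the text.
dal = [13.17, 12.78]

r = dlog(dal)
# r[1] is the logged return for 2/3/2010; it is negative because the price fell.
print(r[1])
```

Because log(a) - log(b) equals log(a/b), the same value is obtained from the logarithm of the price ratio 12.78/13.17.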
For the scatter plot in Q4c, you could create a new series for the daily log
return on the NYSE Composite Index, or you can work directly with the
expressions in the Graph … Series list, or use a Command. Working with the
expressions, the Series list would be dlog(nya) dlog(dal). Remember that in a
Scatter plot, the first series name in the list will be measured on the horizontal axis, and the second will be on the vertical axis.
Sample statistics
You can produce sample statistics for a series using Quick (on the Eviews top
toolbar)/Series Statistics/Histogram and Stats, then typing in the name of the
series and clicking on OK. Copying and Pasting this output into Word will
copy the complete graph (the histogram and the statistics). You could do this
for the logged return for Delta shares, and then repeat this operation for the
arithmetic return.
The arithmetic return is
\(ra_t = (DAL_t - DAL_{t-1}) / DAL_{t-1}\)    (1.22)
that is, the current share price minus the share price in the previous period, all
divided by the share price in the previous period. In Eviews the equation to
create this series, call it ra, would be ra=(dal-dal(-1))/dal(-1). Alternatively,
there is a built-in function @pch which produces the one-period percentage
change (expressed as a decimal). So the equation to produce the daily arithmetic return for dal would be ra=@pch(dal).
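As a check on the arithmetic, the following Python sketch (an illustration, not Eviews code) computes the one-period arithmetic return for the first two DAL prices quoted above, and shows how it relates to the logged return. The helper name pch is hypothetical and simply echoes the Eviews function.

```python
import math


def pch(series):
    """One-period change as a proportion of the previous value."""
    return [None] + [(cur - prev) / prev for prev, cur in zip(series, series[1:])]


# DAL on 1/3/2010 and 2/3/2010, as quoted in the text.
dal = [13.17, 12.78]

ra = pch(dal)

# The logged return equals the natural log of one plus the arithmetic return.
log_return = math.log(1 + ra[1])
print(ra[1], log_return)
```

For a daily change of this size the two measures are close, with the logged return slightly more negative than the arithmetic return.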
Alternatively, you can produce descriptive statistics for a number of series
together. If you have created two new series for the logged return and arithmetic return, then in the Workfile window select the two objects (click on
one, then press the Control key and click on the other), then just double-click
the two series or right click, then Open Group. In the Group Window, switch
to the Stats Table: click on View/Descriptive Stats/Common Sample. This
produces a table showing mean, median, maximum, minimum, etc. You can
select all of the table (including the row labels) by clicking in the empty top
left-hand cell, then right click and copy (or the Control key and C), then paste
into Word. (In the Copy Precision dialogue box, just leave the default selection – formatted – and click on OK.) At this stage of the course you can
ignore most of this output.
Alternatively, on the Eviews toolbar you can go to Quick/Group Statistics/Descriptive Statistics/Common Sample, and then type in the names of the
required series in the Series list box, including any necessary transformations
e.g. dlog(dal) @pch(dal).
Or, if you have generated new series for the logged returns and arithmetic
returns, you can produce the descriptive statistics for the series using the
Command ‘r.stats’ for the series of log returns, r; and ‘ra.stats’ for the series
of arithmetic returns, ra.
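To make clear what the main entries of the statistics table measure, here is a minimal Python sketch using only the standard library. The returns are hypothetical values chosen for illustration, not the Delta data, and the helper describe is not an Eviews function. The standard deviation uses the n-1 divisor, which is consistent with the Sum Sq. Dev. and Std. Dev. figures reported in Table 1.2.

```python
import statistics


def describe(values):
    """Mean, sample standard deviation (n-1 divisor), minimum, maximum and count."""
    return {
        "mean": statistics.mean(values),
        "std_dev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
        "n": len(values),
    }


# Hypothetical daily returns, for illustration only.
returns = [0.01, -0.02, 0.03, 0.00, -0.01]

summary = describe(returns)
for name, value in summary.items():
    print(name, value)
```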
1 Economic and finance theory can be viewed as a set of qualitative
relations among variables. Such theory can frequently be written in the
form of a mathematical model. An econometric model may be obtained
from an appropriate mathematical model with the addition of a random
error term. By using data to estimate the econometric model we can in
effect quantify financial and economic relations.
2 a) The plot of S over time is shown in Figure 1.2. The plot of the current spot
rate, measured as US dollars per pound, reveals a number of notable
episodes. For example, there is a sharp depreciation of sterling in 1992,
when sterling left the European Exchange Rate Mechanism. (A lower value
for S means that one pound will buy fewer dollars, or equivalently, it takes
fewer dollars to buy one pound). There is another sharp depreciation of sterling against the dollar (and other currencies) after the 2008 financial crisis.
Figure 1.2 Plot of S 1982–2012
[Figure: line plot of S over time; vertical axis from 1.0 to 2.2, horizontal axis from 1985 to 2010]
b) The scatter plot of \(S_t\) against \(F_{t-1}\) is shown in Figure 1.1 in the unit.
The scatter plot shows that \(S_t\) and \(F_{t-1}\) have the expected positive
relationship. The relationship seems to be approximately linear and
seems to be relatively strong, in that the observations appear close to a
regression line drawn in the scatter plot.
c) Figure 1.3 shows the plot of \(S_t - F_{t-1}\) over time. This is the difference
between the current spot rate and the one-month ahead forward rate
that was available one month previously. According to the efficient
markets hypothesis, the forward rate should be a good predictor of the
spot rate, so that any differences between \(S_t\) and \(F_{t-1}\) should be
random. Any differences also reflect information that has become
available between the time the forward exchange rate was formed, and
the current spot rate was formed. In the first few years of the sample
there are months when \(S_t - F_{t-1}\) is consistently negative. If \(F_{t-1}\) is
consistently greater than \(S_t\) it suggests that the forward market is
consistently over-predicting the value of the spot rate. Looking back
at Figure 1.2, sterling was steadily depreciating in this period. This
means the forward market is consistently under-predicting the extent of
the depreciation in the spot rate. Again in 2008, there are relatively
large negative values for \(S_t - F_{t-1}\) for a few months, and the same
interpretation might be applied: the forward market is not adequately
predicting the depreciation in sterling.
Figure 1.3 Plot of S – F(-1) 1982–2012
[Figure: line plot of S – F(-1) over time; vertical axis from -.24 to .16, horizontal axis from 1985 to 2010]
d) Figure 1.4 shows the scatter plot of \(S_t - F_{t-1}\) against \(S_{t-1} - F_{t-2}\). That
is, the difference between the current spot rate and the forward rate
one month ago, plotted against the difference in the previous period.
Figure 1.4 Scatterplot of S – F(-1) against S(-1) – F(-2) 1982–2012
[Figure: scatter plot with S-F(-1) on the vertical axis, from -.24 to .16, and S(-1)-F(-2) on the horizontal axis, from -.3 to .2]
The scatter plot allows us to examine whether the forecasting error between
\(S_t\) and \(F_{t-1}\) can be explained by the forecasting error in the earlier period,
\(S_{t-1} - F_{t-2}\). Figure 1.4 suggests there is no obvious relationship, positive or
negative, between the forecasting error in one period and the forecasting
error in the period that follows.
3 The value of the share at the start of year 1 is $1000, and in year 1 it experiences a return of –20% or –0.20. In year 2 the return is +20% or +0.20.
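a) Using arithmetic returns, the share price at the end of year 1 is
\(1000 \times (1 - 0.20) = 800\), or $800.
The share price at the end of year 2 is
\(800 \times (1 + 0.20) = 960\), or $960.
b) Using logarithmic returns, the share price at the end of year 1 is
\(1000 \times e^{-0.20} = 818.73\), or $818.73.
The share price at the end of year 2 is
\(818.73 \times e^{0.20} = 1000\), or $1000.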
Arithmetic returns are not symmetric: a negative return, followed by a
positive return of the same magnitude, does not restore the share to
the original price. However, logarithmic returns are symmetric: a
negative return followed by a positive return of the same magnitude
does restore the share to the original value.
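The asymmetry can be verified numerically. The Python sketch below is an illustration under the figures given in question 3 ($1000 starting value, returns of -0.20 and +0.20); the helper names compound_arithmetic and compound_log are hypothetical.

```python
import math

start_value = 1000.0
returns = [-0.20, 0.20]  # year 1 and year 2 returns from question 3


def compound_arithmetic(value, rets):
    """Apply arithmetic returns: each period multiplies the value by (1 + r)."""
    path = []
    for r in rets:
        value = value * (1 + r)
        path.append(value)
    return path


def compound_log(value, rets):
    """Apply logarithmic returns: each period multiplies the value by exp(r)."""
    path = []
    for r in rets:
        value = value * math.exp(r)
        path.append(value)
    return path


arithmetic_path = compound_arithmetic(start_value, returns)
log_path = compound_log(start_value, returns)

print([round(v, 2) for v in arithmetic_path])
print([round(v, 2) for v in log_path])
```

Logarithmic returns add up over time, so returns of -0.20 and +0.20 sum to zero and the final value equals the starting value; arithmetic returns compound multiplicatively, and 0.8 times 1.2 is less than one.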
4 a) The plot of the Delta Airlines Inc. share price and the NYSE
Composite Index is shown in Figure 1.5.
Figure 1.5 Plot of DAL and NYA March 2010 to March 2012
[Figure: line plots of DAL (left-hand axis, from 6 to 16) and NYA (right-hand axis, from 6,400 to 8,800) over time, 2010M07 to 2012M01 marked on the horizontal axis]
The Delta Airlines share price is measured on the left-hand axis and
the NYSE Composite Index is measured on the right-hand axis.
Presented in this way the plots are not directly comparable, but you
can see there are periods when both series generally move together,
and there are other times when one series exhibits sharp movements
that are not shown in the other series.
b) Figure 1.6 shows the plot of the daily logged return of the Delta
Airlines share price. The daily logged return crosses the zero line
frequently. Occasionally there are large positive and negative daily
returns, of around +0.12 (12%) and –0.12 (minus 12%).
Figure 1.6 Plot of DAL daily logged return March 2010 to March 2012
[Figure: line plot of the daily logged return over time; vertical axis from -.16 to .12, 2010M07 to 2012M01 marked on the horizontal axis]
c) Figure 1.7 shows the scatter plot of the daily logged return on Delta
shares (on the vertical axis) against the daily logged return on the
NYSE Composite Index (on the horizontal axis). There would seem to
be a positive, linear relationship between the two series.
Figure 1.7 Scatter plot of DAL daily logged return and NYA daily logged return
[Figure: scatter plot with DLOG(DAL) on the vertical axis, from -.16 to .12, and DLOG(NYA) on the horizontal axis, from -.08 to .08]
d) The histogram and statistics for the Delta daily logged return are
shown in Figure 1.8. The descriptive statistics for the daily logged
return and daily arithmetic return for the Delta share price are
shown in Table 1.2.
Figure 1.8 Histogram and descriptive statistics dlog(dal)
[Figure: histogram of DLOG(DAL); horizontal axis from -0.10 to 0.10, frequency on the vertical axis from 0 to 80. Series: DLOG(DAL); Sample 1/03/2010 1/03/2012; Observations 506; Mean -0.000617; Median -0.000895; Maximum 0.104360; Minimum -0.120286; Std. Dev. 0.029401; Skewness 0.088426; Kurtosis 4.036065; Jarque-Bera 23.29091; Probability 0.000009]
Table 1.2 Descriptive statistics for dlog(dal) and @pch(dal)
                DLOG(DAL)    @PCH(DAL)
Mean            -0.000617    -0.000185
Median          -0.000895    -0.000895
Maximum          0.104360     0.110000
Minimum         -0.120286    -0.113333
Std. Dev.        0.029401     0.029448
Skewness         0.088426     0.222162
Kurtosis         4.036065     4.104456
Jarque-Bera      23.29091     29.88029
Probability      0.000009     0.000000
Sum             -0.312020    -0.093542
Sum Sq. Dev.     0.436530     0.437919
Observations          506          506
For small changes, the logged return and arithmetic return are
approximately equal. However, for larger changes this approximation
is not so close. You can see this in the maximum and minimum values
for the two series in Table 1.2.
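The gap between the two measures at the extremes can be reproduced from Table 1.2, using the fact that the logged return equals the natural log of one plus the arithmetic return. The Python sketch below is an illustration only; the variable names are hypothetical, and the inputs are the maximum and minimum of @PCH(DAL) as reported (and rounded) in the table.

```python
import math

# Maximum and minimum arithmetic returns reported in Table 1.2.
arithmetic_extremes = {"maximum": 0.110000, "minimum": -0.113333}

log_extremes = {}
for label, ra in arithmetic_extremes.items():
    # Convert the arithmetic return to the equivalent logged return.
    log_extremes[label] = math.log(1 + ra)
    gap = ra - log_extremes[label]
    print(label, round(log_extremes[label], 6), round(gap, 6))
```

The converted values agree with the DLOG(DAL) maximum and minimum in the table to about six decimal places, and the gap is positive in both cases: the logged return is always below the arithmetic return, by an amount that grows with the size of the change.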
References
Gujarati D and D Porter (2010) Essentials of Econometrics, Fourth edition,
New York: McGraw-Hill Book Company.
Maddala GS (1992) Introduction to Econometrics, New York: Macmillan.
Mosteller F and JW Tukey (1977) Data Analysis and Regression: a second
course in statistics, Massachusetts: Addison-Wesley.
Startz R (2009) Eviews Illustrated – An Eviews Primer, Irvine California: Quantitative Micro Software.