A Theoretical Foundation of Ambiguity Measurement
Yehuda Izhakian∗†

April 17, 2015

Abstract

Ordering alternatives by their degree of ambiguity is a crucial element in decision-making processes. The current paper introduces an empirically applicable, stake-independent ambiguity measure that allows for such ordering. This measure relies upon the idea that, in the presence of ambiguity, probabilities are themselves uncertain, and related preferences are applied to these probabilities such that aversion to ambiguity is defined as aversion to mean-preserving spreads in probabilities. Thereby, the degree of ambiguity can be measured by the volatility of probabilities. The applicability of this measure is demonstrated by incorporating ambiguity into an asset pricing model.

Keywords: Ambiguity Measure, Ambiguity Aversion, Knightian Uncertainty, Uncertain Probabilities, Ambiguity Premium.

JEL Classification Numbers: D81, D83, G11, G12.

∗ Department of Economics and Finance, Zicklin School of Business, Baruch College; yud@stern.nyu.edu

† I thank Menachem Abudy, Yakov Amihud, Doron Abramov, David Backus, Adam Brandenburger, Menachem Brenner, Xavier Gabaix, William Greene, Eitan Goldman, Yaniv Grinstein, Sergiu Hart, Edi Karni, Ruth Kaufman, Peter Klibanoff, Ilan Kremer, Evgeny Lyandres, Fabio Maccheroni, Massimo Marinacci, Sujoy Mukerji, Yacov Oded, Efe Ok, Jacob Sagi, David Schmeidler, Uzi Segal, Marciano Siniscalchi, Laura Veldkamp, Paul Wachtel, Jan Werner, Jaime Zender, Stanley Zin and especially Itzhak Gilboa, Mark Machina and Thomas Sargent for valuable discussions and suggestions.
I would also like to thank the seminar and conference audiences at Bar Ilan University, Baruch College, Indiana University, Johns Hopkins University, Michigan State University, New York University, Norwegian School of Business, Tel Aviv University, The Interdisciplinary Center (IDC) Herzliya, The Hebrew University of Jerusalem, University of Colorado, University of Houston, University of Michigan, Arne Ryde Workshop in Financial Economics 2013, Decision: Theory, Experiments and Applications (D-TEA) 2013, Netspar International Pension Workshop 2013, North American Meetings of the Econometric Society Northwestern University 2012, Risk Uncertainty and Decision (RUD) 2013, University of Chicago Workshop on Ambiguity and Robustness in Macroeconomics and Finance 2013, and Foundations of Utility and Risk (FUR) 2014.

1 Introduction

How should uncertain alternatives be ranked by the criterion of ambiguity? Consider the following example: a large urn contains 30 balls, each either black or yellow, in an unknown proportion, and a second, smaller urn contains only 10 balls, also either black or yellow, in an unknown proportion. Which of the following two bets is more ambiguous: “A ball drawn from the large urn is yellow” or “A ball drawn from the small urn is yellow”? Say you were offered $10 if a ball drawn from the large urn is yellow or $10 if a ball drawn from the small urn is yellow. Which of these two bets would you choose? Answering questions of this type is part of almost any real-life decision, which implies that decision-making involves ordering alternatives by their degree of ambiguity. This paper introduces an empirically applicable ambiguity measure, underpinned by a new theoretical concept, that allows for such ordering of alternatives.
This new concept proposes that, in the presence of ambiguity, probabilities are themselves uncertain, and preferences concerning ambiguity are applied directly to these probabilities such that aversion to ambiguity is defined as aversion to mean-preserving spreads in probabilities, analogous to the Rothschild-Stiglitz (1970) aversion to mean-preserving spreads in outcomes. Thereby, the degree of ambiguity can be measured by the volatility of probabilities, just as the degree of risk can be measured by the volatility of outcomes. The resulting measure is objective, stake independent, simple and intuitive. It measures the degree of ambiguity independently of individuals’ preferences and can be computed from the data in empirical studies. These are key qualities for the introduction of ambiguity into economic and financial models.

The decision making framework underpinning the current paper is expected utility with uncertain probabilities (henceforth EUUP), introduced by Izhakian (2014). This framework assumes two tiers of uncertainty, one with respect to consequences (outcomes) and the other with respect to the probabilities of these consequences. A decision maker (DM) in this environment applies two differentiated phases of the decision process, each referring to one of these tiers. In the first phase (the probability formation phase), she forms a representation of her perceived probabilities for all the events that are relevant to her decision. Then, in the second phase (the valuation phase), she assesses the value of each alternative using her perceived probabilities and chooses accordingly. Ambiguity, the uncertainty about probabilities, plays a role in the probability formation phase, while risk, the uncertainty about consequences, plays a role in the valuation phase.¹ This structure introduces a complete distinction of risk from ambiguity with regard to both beliefs and tastes.
The degree of ambiguity and attitudes toward it are then measured with respect to one tier, while risk and risk attitudes are measured with respect to the other tier.

¹ Risk is defined as a condition in which the event to be realized is a priori unknown, but the odds of all possible events are perfectly known. Ambiguity (Knightian uncertainty) refers to conditions in which not only is the event to be realized a priori unknown, but the odds of events are also either not uniquely assigned or unknown.

The main idea of EUUP is that, in the probability formation phase, perceived probabilities are formed in a Bayesian approach by the “certainty equivalent probabilities” of uncertain probabilities.² That is to say, an uncertain probability is modeled explicitly in a state space that is subject to a prior probability, and the perceived probability is the unique certain probability value that the DM is willing to accept in exchange for the uncertain probability of a given event. Perceived probabilities are subjectively formed based upon the DM’s preferences concerning ambiguity. These preferences are applied to probabilities such that aversion to ambiguity is defined as aversion to mean-preserving spreads in probabilities. Thereby, the Rothschild and Stiglitz (1970) approach can be used over probabilities to define an ordering by ambiguity. Based upon probability ordering, this paper shows that the degree of ambiguity can be measured by four times the expected volatility of probabilities, across the relevant events.
Formally, the measure of ambiguity is given by

$$\mho^2[f] = 4 \int_{X} \mathrm{E}[\varphi_f(x)] \, \mathrm{Var}[\varphi_f(x)] \, dx,$$

where $f$ is an act (a bounded measurable function from states into consequences); $X$ is a convex subset of the real numbers (consequences); $\varphi_f(\cdot)$ is an uncertain probability density function; and the expectation $\mathrm{E}[\cdot]$ and the variance $\mathrm{Var}[\cdot]$ are taken with respect to second-order probabilities (probabilities over a set of probability distributions).³ The measure ℧² (mho²) can be viewed as an objective measure of ambiguity, as it measures ambiguous beliefs (information) in isolation from individuals’ tastes for ambiguity. The main advantage of this measure is that it can be computed from the data and can be employed in empirical tests.⁴ Stake independence is another major advantage of ℧²; unlike risk measures, it does not depend upon the magnitude of consequences. This is an important property of ℧², for example, for assessing the ambiguity associated with a particular stock market, regardless of the investment amount and the associated risk.

² In this paper the terms perceived probabilities and subjective probabilities are used interchangeably.

³ The measure ℧² is also applicable in a finite state space. In this case, $\mho^2[f] = 4 \sum_i \mathrm{E}[\varphi_f(x_i)] \, \mathrm{Var}[\varphi_f(x_i)]$, where $\varphi_f(\cdot)$ is a probability mass function.

⁴ See, for example, Brenner and Izhakian (2011, 2012).

Several approaches to estimating ambiguity have been proposed in the literature. Dow and Werlang (1992) measure uncertainty as the sum of the probability of an event and the probability of its complementary event. Ui (2011) measures ambiguity by the difference between the minimal possible mean and the true mean. Bewley (2011) and Boyle et al. (2011) measure ambiguity by a critical confidence interval. Maccheroni et al. (2013) measure ambiguity by the variance of an unknown mean. These studies assume that the variance of outcomes is known and suggest a stake-dependent measure, based only on the variation of the mean.
Nevertheless, ambiguous variance has been found to be an important element in decision making processes, as stressed, for example, in Epstein and Ji (2013).⁵ The ambiguity measure ℧², proposed in this paper, is stake-independent and encompasses both ambiguous variance and ambiguous mean, as well as the ambiguity of all higher moments of the probability distribution (i.e., skewness, kurtosis, etc.), through the uncertainty of probabilities. Relative entropy, measured by the deviation of a probability distribution from a reference probability distribution (reference model), can also be interpreted as a measure of ambiguity; see, for example, Hansen et al. (1999), Hansen and Sargent (2001) and Maccheroni et al. (2006). However, while the use of relative entropy is restricted to cases of a single prior relative to a known true probability distribution, ℧² can be employed in cases of multiple priors, when a single true probability distribution either does not exist or is unknown.

Measuring the degree of ambiguity allows alternatives to be ranked by the criterion of ambiguity. The ambiguity measure is a critical instrument for introducing ambiguity into models that attempt to explain observable phenomena such as financial anomalies. It provides a way to address important questions that arise regarding the nature of ambiguity, in general, and the nature of the aggregate ambiguity of portfolios, in particular. Accounting for ambiguity might shed light on some phenomena that previously could not be fully explained. Notable examples include the fact that individuals tend to hold very small portfolios of three or four stocks (Goetzmann and Kumar, 2008), the equity premium puzzle (Mehra and Prescott, 1985), the risk-free rate puzzle (Weil, 1989), the observed equity volatility being too high to be justified by changes in the fundamentals (Shiller, 1981), and the home bias puzzle (Coval and Moskowitz, 1999).
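As a concrete illustration of the finite-state form of the measure stated above (four times the sum, over outcomes, of the expected probability times the variance of the probability), the two urns of the opening example can be compared numerically. The sketch below is only illustrative: the function names `mho2` and `uniform_urn_priors` are hypothetical, and the second-order belief is assumed to be uniform over all possible urn compositions, an assumption the paper does not make for this example.

```python
from fractions import Fraction


def mho2(distributions, weights):
    """Finite-state ambiguity measure: 4 * sum_i E[phi(x_i)] * Var[phi(x_i)].

    distributions: list of candidate probability mass functions (one tuple per prior).
    weights: second-order probabilities attached to the priors (must sum to 1).
    """
    n_outcomes = len(distributions[0])
    total = Fraction(0)
    for i in range(n_outcomes):
        probs = [dist[i] for dist in distributions]
        mean = sum(w * p for w, p in zip(weights, probs))
        var = sum(w * (p - mean) ** 2 for w, p in zip(weights, probs))
        total += mean * var
    return 4 * total


def uniform_urn_priors(n_balls):
    """ASSUMPTION: every number of yellow balls from 0 to n_balls is equally likely."""
    dists = [
        (Fraction(n_balls - yellow, n_balls), Fraction(yellow, n_balls))
        for yellow in range(n_balls + 1)
    ]
    weights = [Fraction(1, n_balls + 1)] * (n_balls + 1)
    return dists, weights


small_urn = mho2(*uniform_urn_priors(10))
large_urn = mho2(*uniform_urn_priors(30))
print(small_urn, large_urn)
```

Under this assumed uniform second-order belief, the sketch yields 2/5 for the small urn and 16/45 for the large urn, so the small urn would be ranked as more ambiguous. A different second-order belief could change the ranking, and the payoff of $10 never enters the computation, which is what stake independence means here.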
To demonstrate the applicability of the proposed measure of ambiguity, this paper generalizes asset pricing theory (Arrow-Pratt) to incorporate ambiguity. Relaxing the assumption that probabilities are known, it shows that the price of an asset is determined not only by its degree of risk and the DM’s attitude toward risk, but also by its degree of ambiguity and the DM’s attitude toward ambiguity. The paper constructs an uncertainty premium and proves that it can be separated into a risk premium and an ambiguity premium. It provides a well-defined ambiguity premium, attributed to ambiguity and preferences concerning ambiguity and completely distinguished from the risk premium. Previous models have mainly focused on the theoretical aspects of the implication of ambiguity for the equity premium (e.g., Chen and Epstein (2002), Izhakian and Benninga (2011), Ui (2011), and Maccheroni et al. (2013)). Unlike these models, the ambiguity premium in the current paper can be computed from the data and tested empirically. For example, Brenner and Izhakian (2011) show that ambiguity, measured by ℧², has a significant impact on the market portfolio return.⁶

⁵ In empirical asset pricing and macroeconomic contexts, stochastic time varying volatility also plays an important role; see, for example, Bollerslev et al. (1988), Fernandez-Villaverde et al. (2010) and Bollerslev et al. (2011).

The rest of the paper is organized as follows. Section 2 presents the decision-making framework. Section 3 simplifies the framework as a preparation for the extraction of an ambiguity measure. Using this simplified representation, Section 4 defines ordering of events by ambiguity and Section 5 uses this ordering to suggest a measure of ambiguity. Section 6 analyzes the special properties of the proposed measure, and Section 7 discusses it relative to alternative measures of ambiguity. To demonstrate an application of this measure for asset pricing, Section 8 models the ambiguity premium.
Section 9 concludes. All proofs are provided in the Appendix.

2 The decision making framework

The decision making framework employed in this paper is expected utility with uncertain probabilities (EUUP), proposed by Izhakian (2014). EUUP assumes two different tiers of uncertainty, one with respect to consequences (outcomes) and the other with respect to the probabilities of these consequences. Each tier is modeled by a separate state space. A decision maker (DM) in this framework applies two differentiated phases of the decision process, each referring to one of these tiers. Preferences for ambiguity, which are applied to uncertain probabilities in the first phase of a decision process, rely upon the Savage (1954) axiomatic foundation. They underpin perceived probabilities, which are structured from uncertain probabilities in a Bayesian approach. Given these perceived (nonadditive) probabilities, preferences for risk are applied to consequences in the second phase of a decision process.⁷ These preferences, which rely upon the foundations of Schmeidler’s (1989) Choquet expected utility and Tversky and Kahneman’s (1992) cumulative prospect theory, are formulated by Wakker’s (2010) axiomatization.

Formally, let $S$ be a (finite or infinite) nonempty state space, called the primary space, endowed with a σ-algebra, $\mathcal{E}$, of subsets of $S$. Generic elements of this σ-algebra are called events and are denoted by $E$. Define $X \subseteq \mathbb{R}$ to be a convex set of consequences that contains the interval $[0, 1]$. Let a primary act $f : S \to X$ be a bounded $\mathcal{E}$-measurable function from states into consequences, and denote the set of all these (Savage) acts by $\mathcal{F}$ and the set of all simple measurable acts by $\mathcal{F}_0$. A simple primary act can be represented as a sequence of pairs, $f = (E_1 : x_1, \ldots, E_n : x_n)$, where $(E_1, \ldots, E_n)$ is a generic partition of the state space $S$; $x_j$ is the consequence if event $E_j$ occurs; and the consequences $x_1, \ldots, x_n$ are listed in a non-decreasing order.
A primary indicator act $\delta_E = (E^C : 0, E : 1)$ assigns the outcome 1 to event $E \in \mathcal{E}$ and the outcome 0 to its complementary event $E^C \in \mathcal{E}$. The domain of the first-order preference relation, $\succsim^1$, is the set of primary acts $\mathcal{F}_0$, and the relations $\precsim^1$, $\prec^1$, $\succ^1$ and $\sim^1$ are defined as usual. A consequence $x \in X$ is considered to be unfavorable if $x \le k$ and favorable if $k < x$, where $k$ is a reference point. An event $E \in \mathcal{E}$ is considered to be unfavorable under act $f$ if $f(E) \le k$ and favorable if $k < f(E)$.

⁶ As far as I’m aware, prior studies do not conduct direct empirical tests of models of decision making under ambiguity other than through parametric fitting and calibrations. Uppal and Wang (2003), Epstein and Schneider (2008), and Ju and Miao (2012), for example, calibrate their model to the data. Several papers attribute different explanatory variables to ambiguity. For example, Anderson et al. (2009) attribute the disagreement of professional forecasters to ambiguity.

⁷ Schmeidler (1989), in his pioneering study, introduces the idea that, in the presence of ambiguity, the probabilities that reflect the DM’s willingness to bet may not be additive, i.e., the sum of the probabilities can be either smaller or greater than 1.

Probabilities of events $E$ occurring in the primary space are determined in a (finite or infinite) nonempty secondary space, defined by a set $\mathcal{P}$ of all possible additive probability measures over the primary space $S$. A first-order probability measure $P \in \mathcal{P}$ is then viewed as a state of nature in this secondary space, and the state space $\mathcal{P}$ is assumed to be endowed with the maximal σ-algebra, $\Pi = 2^{\mathcal{P}}$, of subsets of $\mathcal{P}$. A secondary act, $\tilde{f} : \mathcal{P} \to X$, is a bounded function from the secondary space $\mathcal{P}$ into the set of consequences $X$. The set of all secondary acts is denoted $\tilde{\mathcal{F}}$. A secondary act $\tilde{f}$ that describes the resulting expected outcome of a primary act $f$ contingent upon a prior $P \in \mathcal{P}$ (on $S$) is denoted $\hat{f}$; that is, $\hat{f} : \mathcal{P} \to X$ satisfies $\hat{f}(P) = \int_S f \, dP$.
The set of all secondary acts $\hat{f} \in \tilde{\mathcal{F}}$ is denoted $\hat{\mathcal{F}}$, and the subset of all secondary acts in $\hat{\mathcal{F}}$ that are associated with primary indicator acts is denoted $\hat{\Delta}$. A secondary act $\hat{\delta}_E : \mathcal{P} \to [0, 1]$ in $\hat{\Delta}$, associated with a primary indicator act $\delta_E = (E^C : 0, E : 1)$, is given by $\hat{\delta}_E(P) = P(E)$ for every $P \in \mathcal{P}$. A secondary act $\hat{\delta}_E$ can, therefore, be viewed as a function that assigns each event $E \in \mathcal{E}$ its possible probabilities. In this view, $\hat{\delta}_E$ can be interpreted as an uncertain variable describing the probability $P(E)$ of event $E$.

A second-order non-atomic finitely-additive probability measure $\chi$ on $\Pi$ assigns each subset $A \in \Pi$ of first-order probability measures in $\mathcal{P}$ a probability $\chi(A)$.⁸ This second-order belief is implicated in the DM’s second-order preference relation $\succsim^2$ over the set of all secondary acts $\tilde{\mathcal{F}}$. In the view of $\hat{\delta}_E : \mathcal{P} \to [0, 1]$ as describing the (uncertain) probability $P(E)$ of event $E$, the preference relation $\succsim^2$ over $\hat{\Delta}$ defines a preference over probabilities.⁹

Suppose that $X$ and $S$ satisfy the required richness of EUUP, the preference relation $\succsim^1$ on the set of primary acts $\mathcal{F}_0$ satisfies Wakker’s (2010, Theorem 12.3.5) axioms, the preference relation $\succsim^2$ on the set of secondary acts $\tilde{\mathcal{F}}$ satisfies Savage’s (1954) axioms, and that they jointly satisfy Izhakian’s (2014) axiom. Then, by Izhakian (2014, Theorem 1), there exists a function $V : \mathcal{F}_0 \to \mathbb{R}$ such that

$$f \succsim^1 g \iff V(f) \ge V(g),$$

for every $f, g \in \mathcal{F}_0$, where

$$
\begin{aligned}
V(f) ={} & \int_{-\infty}^{k} \left[ \Gamma^{-1}\!\left( \int_{\mathcal{P}} \Gamma\!\left( \hat{\delta}_{\{s \in S \,\mid\, U(f(s)) \ge z\}}(P) \right) d\chi \right) - 1 \right] dz \\
& + \int_{k}^{\infty} \Gamma^{-1}\!\left( \int_{\mathcal{P}} \Gamma\!\left( \hat{\delta}_{\{s \in S \,\mid\, U(f(s)) \ge z\}}(P) \right) d\chi \right) dz;
\end{aligned}
\tag{1}
$$

$U : X \to \mathbb{R}$ is a strictly increasing, continuous and bounded function, normalized such that $U(k) = 0$; and $\Gamma : [0, 1] \to \mathbb{R}$ is a non-constant bounded function.¹⁰ Furthermore, $\chi$ is uniquely determined, $U$ is unique up to a unit, and $\Gamma$ is unique up to a positive linear transformation.

⁸ To maintain non-atomicity when $\mathcal{P}$ is finite, one can define the probability measure $\chi$ on the product σ-algebra $\Pi \otimes 2^{\mathcal{J}}$, where $\mathcal{J}$ is a non-singleton convex set of probability distributions over some auxiliary state space $S_1$.

⁹ To understand this interpretation, consider the two secondary acts $\hat{\delta}_E, \hat{\delta}_F \in \hat{\Delta}$, associated with the primary indicator acts $\delta_E, \delta_F \in \mathcal{F}$ (whose outcomes are the same), and assume a DM who prefers $\hat{\delta}_E$ to $\hat{\delta}_F$. This means that she prefers to get the good outcome with the (uncertain) probability $P(E)$ rather than with the (uncertain) probability $P(F)$.

The Bayesian approach asserts that everything that is not known should be modeled explicitly in a state space and be subject to a prior probability. The model of Equation (1) applies this approach to uncertain probabilities. As a result, the function $V$ takes the form of a two-sided Choquet integration, over unfavorable outcomes and over favorable outcomes (relative to the reference point). This functional representation of the DM’s aggregate preferences makes a complete distinction between beliefs and tastes and between risk and ambiguity. First-order beliefs are formed by the uncertain probability measure $P$; second-order beliefs are formed by the probability measure $\chi$; risk preferences are formed by the utility function $U$;¹¹ and ambiguity preferences are formed by the function $\Gamma$. The function $\Gamma$, referred to as an outlook function, forms the DM’s attitude toward ambiguity. As with risk attitudes, there are three types of attitudes toward ambiguity: aversion to ambiguity (formed by a concave $\Gamma$), loving of ambiguity (formed by a convex $\Gamma$) and indifference to ambiguity (formed by a linear $\Gamma$).

To simplify the functional representation $V$ in Equation (1), secondary acts ($\hat{\delta}_E$) can be replaced by their resulting probabilities to obtain

$$
\begin{aligned}
V(f) ={} & \int_{-\infty}^{k} \left[ \Gamma^{-1}\!\left( \int_{\mathcal{P}} \Gamma\big( P(\{s \in S \mid U(f(s)) \ge z\}) \big) \, d\chi \right) - 1 \right] dz \\
& + \int_{k}^{\infty} \Gamma^{-1}\!\left( \int_{\mathcal{P}} \Gamma\big( P(\{s \in S \mid U(f(s)) \ge z\}) \big) \, d\chi \right) dz.
\end{aligned}
\tag{2}
$$

This functional representation considers acts taking infinitely many values in an infinite state space.
It is important to note that all the results in this paper can be applied to a discrete representation in a finite state space with acts taking finitely many values.

¹⁰ EUUP stems from the multiple priors paradigm (Gilboa and Schmeidler, 1989) and results in a two-sided variation of CEU (Gilboa, 1987 and Schmeidler, 1989). It combines the concept of nonadditive probabilities with the idea of reference-dependent beliefs, which is applied to differentiate between the probability of unfavorable and favorable events. Since the focus of this paper is ambiguity measurement, as opposed to preferences for ambiguity, it is assumed for simplicity that the DM has the same preference for ambiguity, formed by $\Gamma$, concerning unfavorable and favorable events.

¹¹ As usual, a concave $U$ implies risk aversion, and a convex $U$ implies risk loving.

3 Preliminaries

To elicit a measure of the degree of ambiguity, the functional representation $V$ has to be further simplified. The key for the additional simplification is the DM’s perceived probabilities. In EUUP, these are derived from the nature of the uncertainty about probabilities (ambiguity) and the DM’s preferences concerning this uncertainty. In particular, the concept of EUUP is that perceived probabilities are formed by the certainty-equivalent probabilities of uncertain probabilities in a Bayesian approach. Formally, the perceived probability $Q(E)$ of event $E \in \mathcal{E}$ is defined by¹²

$$Q(E) = \Gamma^{-1}\!\left( \int_{\mathcal{P}} \Gamma\big(P(E)\big) \, d\chi \right). \tag{3}$$

This probability is a function of first-order (uncertain) probabilities, formed by a set $\mathcal{P}$ of possible probability measures over $\mathcal{E}$; a second-order probability measure $\chi$ (second-order belief) over $\mathcal{P}$; and the DM’s preferences concerning ambiguity, applied to probabilities and captured by $\Gamma$.
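For a finite set of priors, the integral in Equation (3) reduces to a weighted sum, so the perceived probability can be computed directly. The following is a minimal sketch under stated assumptions: the function name `perceived_probability`, the three candidate probabilities, their equal second-order weights, and the concave outlook function Γ(p) = √p are all hypothetical choices made for illustration and do not come from the paper.

```python
import math


def perceived_probability(probs, weights, gamma, gamma_inv):
    """Q(E) = Gamma^{-1}( sum_j chi_j * Gamma(P_j(E)) ) for finitely many priors."""
    return gamma_inv(sum(w * gamma(p) for p, w in zip(probs, weights)))


# ASSUMPTION: three equally likely candidate values for P(E).
probs_E = [0.2, 0.5, 0.8]
probs_Ec = [1 - p for p in probs_E]  # probabilities of the complementary event
weights = [1 / 3, 1 / 3, 1 / 3]

# ASSUMPTION: a concave (ambiguity-averse) outlook function, Gamma(p) = sqrt(p).
q_averse_E = perceived_probability(probs_E, weights, math.sqrt, lambda y: y * y)
q_averse_Ec = perceived_probability(probs_Ec, weights, math.sqrt, lambda y: y * y)

# A linear Gamma (ambiguity neutrality) recovers the expected probability.
q_neutral_E = perceived_probability(probs_E, weights, lambda p: p, lambda y: y)

print(q_averse_E, q_averse_Ec, q_averse_E + q_averse_Ec, q_neutral_E)
```

With a concave Γ, the perceived probability of each event falls below its expected probability of one half, and the perceived probabilities of the event and its complement sum to less than one, which illustrates the nonadditivity of perceived probabilities under ambiguity aversion.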
Equation (3) proposes that while making decisions, the DM, who views uncertain probabilities as a set of priors, aggregates these probabilities in a nonlinear way to form her perceived probabilities.¹³ To simplify the exposition of perceived probabilities in Equation (3), the perceived probability $Q(E)$ of an event $E \in \mathcal{E}$ can be approximated by taking a second-order Taylor approximation with respect to its first-order probabilities $P(E)$ around its expected probability $\mathrm{E}[P(E)]$.¹⁴ To this end, $\mathrm{E}[P(E)] = \int_{\mathcal{P}} P(E) \, d\chi$ is defined to be the expected probability of event $E$, and $\mathrm{Var}[P(E)] = \int_{\mathcal{P}} \big( P(E) - \mathrm{E}[P(E)] \big)^2 d\chi$ to be the variance of the probability of event $E$.

Theorem 1. Assume a strictly-increasing, continuous and twice-differentiable $\Gamma$ satisfying

$$\frac{1}{2} \left( \frac{\Gamma''(\mathrm{E}[P(F)])}{\Gamma'(\mathrm{E}[P(F)])} \mathrm{Var}[P(F)] - \frac{\Gamma''(\mathrm{E}[P(E \cup F)])}{\Gamma'(\mathrm{E}[P(E \cup F)])} \mathrm{Var}[P(E \cup F)] \right) \le \mathrm{E}[P(E)]$$

for any events $E, F \in \mathcal{E}$. Then, for a relatively small $P(E)$, the perceived probability of event $E$ is

$$Q(E) \approx \mathrm{E}[P(E)] + \frac{1}{2} \frac{\Gamma''(\mathrm{E}[P(E)])}{\Gamma'(\mathrm{E}[P(E)])} \mathrm{Var}[P(E)].$$

Notice that the approximated perceived probabilities satisfy $Q(\emptyset) = 0$, $Q(S) = 1$, and set monotonicity with respect to set-inclusion, i.e., $Q(E) \le Q(F)$ if $E \subset F$ (by Lemma 2).¹⁵ The condition on $\Gamma$ bounds the level of ambiguity aversion (the concavity of $\Gamma$) and the level of ambiguity loving (the convexity of $\Gamma$) to assure that the approximated perceived probabilities are nonnegative and that the probability of an event is not smaller than the probability of any of its sub-events. Henceforth it is assumed that $\Gamma$ satisfies this condition.

¹² This functional representation is obtained by the value function $V$ of an indicator act, formed in Equation (1), and by replacing the secondary act with its resulting probabilities; see Izhakian (2014).

¹³ As a consequence of probabilistic sensitivity, i.e., the nonlinear ways in which individuals may interpret probabilities, perceived probabilities are nonadditive. That is, the sum of the probabilities can be either smaller or greater than 1. Ambiguity aversion results in a subadditive probability measure, while ambiguity loving results in a superadditive measure.

¹⁴ The same method was applied by Arrow (1965) and Pratt (1964) to consequences within the expected utility framework, whereas in this case it is applied to probabilities.

¹⁵ In fact, $Q$ is a capacity, that is, a subjective nonadditive probability; see, for example, Schmeidler (1989).

The perceived probabilities, proposed in Theorem 1, provide a natural way to simplify the functional representation of preferences over acts to a more applicable form.

Proposition 1. Suppose that the axioms of Izhakian (2014, Theorem 1) and the conditions of Theorem 1 are satisfied. The value of an act $f \in \mathcal{F}_0$, formed in Equation (2), can then be written

$$
\begin{aligned}
V(f) \approx{} & - \int_{-\infty}^{k} \mathrm{E}\big[P(\{s \in S \mid U(f(s)) \le z\})\big] \, dz + \int_{k}^{\infty} \mathrm{E}\big[P(\{s \in S \mid U(f(s)) \ge z\})\big] \, dz \\
& + \int_{-\infty}^{\infty} \frac{1}{2} \frac{\Gamma''\big(\mathrm{E}[P(\{s \in S \mid U(f(s)) \ge z\})]\big)}{\Gamma'\big(\mathrm{E}[P(\{s \in S \mid U(f(s)) \ge z\})]\big)} \mathrm{Var}\big[P(\{s \in S \mid U(f(s)) \ge z\})\big] \, dz.
\end{aligned}
$$

To simplify the notation, the following conventions are used. $P_f(x)$ stands for the cumulative probability $P(\{s \in S \mid f(s) \le x\})$, and $\varphi_f(x)$ stands for the probability density $\varphi(\{s \in S \mid f(s) = x\})$.¹⁶ It is assumed that a density function $\varphi_f(x)$ exists and is well defined for every $x \in X$. When it is clear from the context, the subscript $f$, indicating an act, is omitted. With this notation in place, the next theorem presents a dual representation of the value function $V$.

Theorem 2. Suppose that the axioms of Izhakian (2014, Theorem 1) and the conditions of Theorem 1 are satisfied. The dual representation $W$ of the value function $V$ can then be approximated by

$$
\begin{aligned}
W(f) \approx{} & \int_{-\infty}^{k} U(x) \left( \mathrm{E}[\varphi_f(x)] - \frac{\Gamma''(1 - \mathrm{E}[P_f(x)])}{\Gamma'(1 - \mathrm{E}[P_f(x)])} \mathrm{E}[\varphi_f(x)] \, \mathrm{Var}[\varphi_f(x)] \right) dx \\
& + \int_{k}^{\infty} U(x) \left( \mathrm{E}[\varphi_f(x)] + \frac{\Gamma''(1 - \mathrm{E}[P_f(x)])}{\Gamma'(1 - \mathrm{E}[P_f(x)])} \mathrm{E}[\varphi_f(x)] \, \mathrm{Var}[\varphi_f(x)] \right) dx.
\end{aligned}
\tag{4}
$$

That is, $f \succsim^1 g \iff W(f) \ge W(g)$.

The functional representation provided by this theorem eases the use of EUUP theory in general and the extraction of an ambiguity measure in particular. The following corollary demonstrates it.

Corollary 1. Assume a DM typified by a constant relative ambiguity aversion (CAAA), i.e., $\Gamma(P(E)) = \frac{(P(E))^{1-\eta}}{1-\eta}$.¹⁷ The value function then takes the form

$$
\begin{aligned}
W(f) \approx{} & \int_{-\infty}^{k} U(x) \Big( \mathrm{E}[\varphi_f(x)] - \eta \, \mathrm{E}[\varphi_f(x)] \, \mathrm{Var}[\varphi_f(x)] \Big) dx \\
& + \int_{k}^{\infty} U(x) \Big( \mathrm{E}[\varphi_f(x)] + \eta \, \mathrm{E}[\varphi_f(x)] \, \mathrm{Var}[\varphi_f(x)] \Big) dx.
\end{aligned}
$$

¹⁶ In a discrete representation, when the state space $S$ is finite, $\varphi_f(x)$ stands for the probability mass function.

¹⁷ CAAA means that, given a return value, while shifting linearly the range of its possible probabilities, the attitude toward ambiguity remains unchanged. See Izhakian (2014) for a detailed discussion about the nature of different attitudes toward ambiguity.

Notice that if the DM is ambiguity neutral or if there is no ambiguity, i.e., the variance of probabilities is zero, the functional representation of the value of an act collapses to the classical expected utility representation

$$W(f) = \int_{-\infty}^{\infty} U(x) \, \mathrm{E}[\varphi_f(x)] \, dx.$$

That is, no disutility occurs.

4 Ordering ambiguous events

A preliminary step in ordering acts by their degree of ambiguity is to define such an order over events. This ordering is determined by the second-order preference relation $\succsim^2$. In the view of a secondary act $\hat{\delta}_E : \mathcal{P} \to [0, 1]$ as describing the (uncertain) probability $P(E)$ of event $E$, the preference relation $\succsim^2$ over the set $\hat{\Delta}$ of $\hat{\delta}$ may well be referred to as a preference relation over probabilities. To see this, consider the two secondary acts $\hat{\delta}_E, \hat{\delta}_F \in \hat{\Delta}$, associated with the primary indicator acts $\delta_E, \delta_F \in \mathcal{F}$ (whose outcomes are the same), and assume a DM who prefers $\hat{\delta}_E$ to $\hat{\delta}_F$. This means that she prefers to get the good outcome with the (uncertain) probability $P(E)$ rather than with the (uncertain) probability $P(F)$.
That is, she prefers $P(E)$ to $P(F)$. With this notion, an ordering of events by their degree of ambiguity, induced by the DM’s preferences, can be defined as follows.

Definition 1. Let the uncertain probabilities of events $E, F \in \mathcal{E}$ have the same expectation, i.e., $\mathrm{E}[P(E)] = \mathrm{E}[P(F)]$. Event $F$ is more ambiguous than event $E$ if and only if $\hat{\delta}_E \succsim^2 \hat{\delta}_F$ for every ambiguity-averse DM.

This definition provides a subjective ordering of events that arises from the DM’s preferences concerning ambiguity. Notice that preferences concerning ambiguity, $\succsim^2$, apply only to the probabilities of events and not to their consequences. Notice also that, by Izhakian (2014, Theorem 1), in EUUP¹⁸

$$\hat{\delta}_E \succsim^2 \hat{\delta}_F \iff \Gamma^{-1}\!\left( \int_{\mathcal{P}} \Gamma\big(P(E)\big) \, d\chi \right) \ge \Gamma^{-1}\!\left( \int_{\mathcal{P}} \Gamma\big(P(F)\big) \, d\chi \right),$$

implying that an objective ordering by the degree of ambiguity can be defined by mean-preserving spreads in probabilities. Rothschild and Stiglitz (1970) apply the idea of mean-preserving spreads to outcomes in order to define a ranking by risk, whereas here, this idea is applied to probabilities in order to define a ranking by ambiguity.

¹⁸ This is obtained immediately by applying the value function in Equation (1) to indicator acts.

Definition 2. Event $F \in \mathcal{E}$ is more ambiguous than event $E \in \mathcal{E}$ if there exists a random variable $\epsilon$ such that

$$P(F) - \mathrm{E}[P(F)] =_d P(E) - \mathrm{E}[P(E)] + \epsilon,$$

where $=_d$ means equal in distribution and $\mathrm{E}[\epsilon \mid P(E)] = \mathrm{E}[\epsilon] = 0$.¹⁹ That is, $P(F)$ is a mean-preserving spread of $P(E)$. If $\epsilon$ is not identically zero, then $F$ is strictly more ambiguous than $E$.

This definition does not assume that events share an identical expected probability or similar probability distributions. In this definition, one event being more ambiguous than another is a condition on the deviations of its possible probabilities from the respective expectations of probabilities.
It implies that, given a random variable ϵ with E [ϵ] = 0, an event with the uncertain probability P (E) + ϵ is more ambiguous than event E with the uncertain probability P (E). In turn, this implies that any event with a non-constant probability is strictly more ambiguous than an event with its expected probability. The next proposition ties the subjective ordering of Definition 1 by the DM’s preferences to the objective ordering of Definition 2, guided by the notion that every ambiguity-averse DM prefers a less ambiguous event to a more ambiguous one, assuming both have the same consequence and the same expected probability. Proposition 2. Suppose E and F are events in E with identical expected probabilities. Then, P (F ) − E [P (F )] =d P (E) − E [P (E)] + ϵ ⇐⇒ δˆE %2 δˆF by every ambiguity-averse DM, where E [ϵ | P (E)] = E [ϵ] = 0. That is, definitions 1 and 2 of the more ambiguous event coincide. The next step is to define the conditions under which spreads in probabilities can be measured by the variance of probabilities such that the higher the variance of probabilities, Var [P (E)], the higher the degree of ambiguity. These conditions apply to the nature of both, the DM’s preference for ambiguity and to her beliefs. The former refers to cases where the DM’s attitude toward ambiguity is quadratic or of the CAAA type. The latter refers to cases where the probabilities of events are uniformly or elliptically distributed, i.e., the second-order probability distribution (probabilities of probability distributions) is uniform or elliptical.20 The probability P (E) of event E is said to be The condition E [ϵ | P (E)] = E [ϵ] means that ϵ is mean-independent of the uncertain probability P (E) of event E. Note that, equality in distribution is a much weaker condition than equality and that mean-independence is less strong than independence; independence implies mean-independence, but the converse is not true. 
elliptically distributed if its characteristic function is of the form

\[
\phi_{\mathrm{P}(E)}(t) = e^{i t \mathrm{E}[\mathrm{P}(E)]} \, \Psi\left( \tfrac{1}{2} t^2 \mathrm{Var}[\mathrm{P}(E)] \right),
\]

where $i = \sqrt{-1}$ and $\Psi$ is a characteristic generator. [21]

[20] For applications of elliptically distributed returns to asset pricing theory see, for example, Owen and Rabinovitch (1983).

Theorem 3. Suppose $E$ and $F$ are events in $\mathcal{E}$ with identical expected probabilities. Then,

\[
F \text{ is more ambiguous than } E \iff \mathrm{Var}[\mathrm{P}(F)] \ge \mathrm{Var}[\mathrm{P}(E)],
\]

when one or more of the following conditions hold:
(i) The probabilities of events $E$ and $F$ are uniformly distributed;
(ii) The probabilities of events $E$ and $F$ are truncated elliptically distributed with an identical characteristic generator; [22]
(iii) The DM’s attitude toward ambiguity is of the CAAA type;
(iv) The DM’s attitude toward ambiguity is quadratic.

This theorem proposes that, given an event, the greater the spread of its possible probabilities, the greater its ambiguity. It suggests that, when probabilities are uniformly or elliptically distributed, the ordering of events by the variance of their probabilities coincides with the ordering of Definitions 1 and 2. Henceforth, it is assumed that the probabilities of all events are either uniformly or elliptically distributed. If needed, this assumption can be replaced by assuming a CAAA or a quadratic outlook function, $\Gamma$.

At this point, the order of events by their degree of ambiguity, measured by $\mathrm{Var}[\mathrm{P}(\cdot)]$, is well-defined. Using this order, one can define stochastic dominance with respect to ambiguity by applying probabilities to outcomes. [23]

Definition 3. Let $f, g \in \mathcal{F}_0$ be two primary acts under which the expected probabilities of each consequence $x \in \mathcal{X}$ are identical. That is, $\mathrm{E}[\varphi_f(x)] = \mathrm{E}[\varphi_g(x)]$ for any given $x \in \mathcal{X}$. Act $f$ first-order stochastically (“cumulatively”) dominates act $g$ with respect to ambiguity if and only if

\[
\int_{-\infty}^{x} \mathrm{E}[\varphi_f(z)] \, \mathrm{Var}[\varphi_f(z)] \, dz \le \int_{-\infty}^{x} \mathrm{E}[\varphi_g(z)] \, \mathrm{Var}[\varphi_g(z)] \, dz
\]

for any $x \in \mathcal{X}$.
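The condition in Definition 3 can be illustrated with a small numerical sketch. It is only a discrete analogue and not part of the original text: it assumes finitely many outcomes sorted in increasing order, replaces the integrals by cumulative sums, and represents second-order beliefs as hypothetical (weight, probabilities) pairs. The function names and the two example acts are illustrative assumptions.

```python
from itertools import accumulate


def weighted_variances(priors):
    """Return E[phi(x_i)] * Var[phi(x_i)] for each outcome x_i.

    `priors` is a list of (weight, probabilities) pairs. The weights are the
    second-order probabilities (summing to 1) and every probabilities list is
    indexed by the same outcomes, sorted in increasing order.
    """
    n_outcomes = len(priors[0][1])
    terms = []
    for i in range(n_outcomes):
        mean = sum(w * p[i] for w, p in priors)
        var = sum(w * (p[i] - mean) ** 2 for w, p in priors)
        terms.append(mean * var)
    return terms


def dominates_first_order(priors_f, priors_g, tol=1e-12):
    """Discrete analogue of Definition 3 (assumption: sums replace integrals)."""
    cum_f = accumulate(weighted_variances(priors_f))
    cum_g = accumulate(weighted_variances(priors_g))
    return all(a <= b + tol for a, b in zip(cum_f, cum_g))


# Hypothetical two-outcome acts with identical expected probabilities (0.5, 0.5).
act_f = [(0.5, [0.45, 0.55]), (0.5, [0.55, 0.45])]  # narrow spread of probabilities
act_g = [(0.5, [0.30, 0.70]), (0.5, [0.70, 0.30])]  # wide spread of probabilities

print(dominates_first_order(act_f, act_g))
print(dominates_first_order(act_g, act_f))
```

Under these assumptions the act with the narrower spread of probabilities dominates the other, and not conversely, so the script prints True and then False.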
The notion of first-order stochastic dominance with respect to ambiguity allows for the definition of the relation between the objective ordering of acts by stochastic dominance and the subjective ordering by the DM’s preferences concerning ambiguity. The following theorem settles this relation.

[21] Particular forms of the elliptical distribution include: normal distribution, Student-t distribution, logistic distribution, exponential power distribution and Laplace distribution.

[22] Notice that, since probability values are bounded between 0 and 1, truncated elliptical distributions are considered.

[23] Sarin and Wakker (1992) also extend the notion of stochastic dominance to uncertainty with respect to probabilities. They define “that an act f stochastically (“cumulatively”) dominates an act g if the DM regards each cumulative consequence set at least as likely under f as under g”.

Theorem 4. Suppose $f, g \in \mathcal{F}_0$ are two primary acts under which the expected probabilities of each consequence $x \in \mathcal{X}$ are identical. Act $f$ first-order stochastically dominates act $g$ if and only if any ambiguity-averse DM weakly prefers $f$ to $g$, i.e., $W(f) \ge W(g)$, under every increasing utility function $\mathrm{U}$ and every twice-differentiable increasing concave outlook function $\Gamma$.

The idea of stochastic dominance with respect to ambiguity can be developed further to define second-order stochastic dominance.

Definition 4. Let $f, g \in \mathcal{F}_0$ be two primary acts under which the expected probabilities of each consequence $x \in \mathcal{X}$ are identical. Act $f$ second-order stochastically (“cumulatively”) dominates act $g$ with respect to ambiguity if and only if

\[
\int_{-\infty}^{x} \int_{-\infty}^{z} \mathrm{E}[\varphi_f(t)] \, \mathrm{Var}[\varphi_f(t)] \, dt \, dz \le \int_{-\infty}^{x} \int_{-\infty}^{z} \mathrm{E}[\varphi_g(t)] \, \mathrm{Var}[\varphi_g(t)] \, dt \, dz
\]

for any $x \in \mathcal{X}$.

Notice that, similarly to stochastic dominance with respect to risk, first-order stochastic dominance with respect to ambiguity implies second-order stochastic dominance.
As with first-order stochastic dominance, it can be shown that there is a tight relationship between second-order stochastic dominance with respect to ambiguity and preferences concerning ambiguity.

5 Ambiguity measurement

The well-defined ordering of events by their degree of ambiguity paves the way for defining an ordering of acts by their degree of ambiguity, a necessary step toward extracting a measure of ambiguity (over acts). To inspect the impact of ambiguity, this ordering is performed over acts with identical properties except for their degree of ambiguity. That is, they have the same set of possible consequences with the same expected probability (implying the same risk) such that the only difference between them is the dispersion of probabilities around their expectation. The ordering of acts by their degree of ambiguity, as revealed from the DM’s subjective choices, can then be defined as follows.

Definition 5. Let $f, g \in \mathcal{F}_0$ be two primary acts whose expected probabilities of any given consequence $x \in \mathcal{X}$ are identical. Act $g$ is more ambiguous than act $f$ if and only if $f \succsim^1 g$ for any ambiguity-averse DM.

To validate a measure of ambiguity, it has to be shown that ordering acts by the proposed measure coincides with the ordering provided by a DM. The next theorem proposes a new measure of ambiguity. It asserts that the degree of ambiguity associated with an act can be measured by the expected volatility of its related probabilities.

Theorem 5. Assume an ambiguity-averse DM whose preferences satisfy the conditions of Theorem 2 and whose reference point is $k = -\infty$. Then

\[
f \succsim^1 g \iff \mho^2[f] \le \mho^2[g],
\]

where

\[
\mho^2[f] = 4 \int_{\mathcal{X}} \mathrm{E}[\varphi_f(x)] \, \mathrm{Var}[\varphi_f(x)] \, dx,
\]

$f, g \in \mathcal{F}_0$ are primary acts under which the expected probabilities of each consequence $x \in \mathcal{X}$ are identical, and where the probabilities of each consequence $x \in \mathcal{X}$ are uniformly or elliptically distributed (with the same characteristic generator).
This theorem ties the measure of ambiguity, denoted $\mho^2$ (mho$^2$), to preferences concerning ambiguity. The idea that ambiguity (the uncertainty about probabilities) takes the form of probability perturbations and that aversion to ambiguity takes the form of aversion to mean-preserving spreads in probabilities underpins the construction of $\mho^2$. Thereby, just as the degree of risk can be measured by the volatility of outcomes, so too can the degree of ambiguity be measured by the volatility of probabilities.

Theorem 5 proves that if two acts are identical except in their degree of ambiguity, then any ambiguity-averse DM prefers the act with the lower $\mho^2$ over the act with the higher $\mho^2$. That is, she prefers the act whose associated probabilities are on average less volatile (less spread) over the act whose associated probabilities are on average more volatile (more spread). [24] Therefore, the measure $\mho^2$ aggregates the variances of probabilities, which measure the dispersion of the probabilities of each outcome, while assigning the variance of the probability of each outcome a weight relative to its expected probability.

Theorem 5 assumes that the DM’s reference point is $k = -\infty$, which means that all outcomes are considered favorable such that all outcomes are assigned a positive utility. This can be viewed as if the utility function is normalized such that the minimal utility is 0. Note that when the utility is always positive, the Choquet expected utility of Schmeidler (1989) is obtained. This assumption can be replaced by the assumption that the outcomes of acts are symmetrically distributed. Such a symmetry in a framework with uncertain probabilities is defined as follows.

[24] Jewitt and Mukerji (2011), for example, study the ranking of ambiguous acts as revealed by preferences, based upon the smooth model of Klibanoff et al. (2005).

Definition 6.
The outcomes of an act $f \in \mathcal{F}_0$ are said to be symmetrically distributed around a point of symmetry $k$ if

\[
\mathrm{E}[\varphi_f(k - x)] = \mathrm{E}[\varphi_f(k + x)] \quad \text{and} \quad \mathrm{Var}[\varphi_f(k - x)] = \mathrm{Var}[\varphi_f(k + x)]
\]

for any $x \in \mathcal{X}$.

With this definition of symmetry in place, Theorem 5 can be restated. It is important to note that for measuring ambiguity by the following theorem, the more restrictive assumption of normally distributed outcomes, which allows measuring risk by variance, can be relaxed to symmetry only.

Theorem 6. Assume an ambiguity-averse DM whose preferences satisfy the conditions of Theorem 2. Then

\[
f \succsim^1 g \iff \mho^2[f] \le \mho^2[g],
\]

where $f, g \in \mathcal{F}_0$ are primary acts whose outcomes are symmetrically distributed around a reference point $k$, with identical expected probabilities of each consequence $x \in \mathcal{X}$, and where the probabilities of each consequence $x \in \mathcal{X}$ are uniformly or elliptically distributed (with the same characteristic generator).

The measure of ambiguity $\mho^2$ carries the units of squared probabilities. A normalized (to the units of probability) measure can then be simply defined by

\[
\mho[f] = 2 \sqrt{ \int_{\mathcal{X}} \mathrm{E}[\varphi_f(x)] \, \mathrm{Var}[\varphi_f(x)] \, dx }.
\]

The measure $\mho^2$ in Theorems 5 and 6 considers acts taking infinitely many values in an infinite state space. This measure, however, can also be applied to acts taking finitely many values in a finite state space. In this case it takes the form

\[
\mho^2[f] = 4 \sum_{i} \mathrm{E}[\varphi_f(x_i)] \, \mathrm{Var}[\varphi_f(x_i)].
\]

This measure can also be applied to any nonempty subset $\mathcal{Y} \subset \mathcal{X} \subseteq \mathbb{R}$ of consequences

\[
\mho^2[f, \mathcal{Y}] = 4 \int_{\mathcal{Y}} \mathrm{E}[\varphi_f(y)] \, \mathrm{Var}[\varphi_f(y)] \, dy,
\]

or to any given event $E \in \mathcal{E}$

\[
\mho^2[f, E] = 4 \int_{f^{-1}(x) \in E} \mathrm{E}[\varphi_f(x)] \, \mathrm{Var}[\varphi_f(x)] \, dx.
\]

Similarly to risk, stochastic dominance (with respect to ambiguity) is closely related to ambiguity measurement. The next proposition ties the notion of stochastic dominance with respect to ambiguity and the measure of ambiguity $\mho^2$.

Proposition 3. Suppose that the conditions of Theorem 5 or of Theorem 6 are satisfied.
Let $f, g \in \mathcal{F}_0$ be two primary acts under which the expected probabilities of each consequence $x \in \mathcal{X}$ are identical and are uniformly or elliptically distributed (with the same characteristic generator). Then, $g$ is more ambiguous than $f$, i.e., $\mho^2[g] \ge \mho^2[f]$, if and only if $g$ is first-order stochastically dominated by $f$ with respect to ambiguity.

This proposition implies that if for any consequence the cumulative probability uncertainty associated with act $f$ is preferred (by an ambiguity-averse DM) to the cumulative probability uncertainty associated with act $g$, then $g$ is more ambiguous than $f$. It shows that the ordering of acts by first-order stochastic dominance with respect to ambiguity coincides with their ordering by the degree of ambiguity, measured by $\mho^2$. The next section further investigates the special properties of $\mho^2$.

6 Properties of $\mho^2$

It is worth opening this section with an example that demonstrates some properties of the measure $\mho^2$. Consider, first, a large urn with 30 colored balls which are either black or yellow, in an unknown proportion. Assume that drawing a black ball ($B$) entitles the DM to a sum of \$0, and a yellow ball ($Y$) entitles her to a sum of \$1. The probability of $B$ (an unfavorable event) can be one of the values $\frac{0}{30}, \frac{1}{30}, \ldots, \frac{30}{30}$, where the DM is assumed to act as if each is equally likely. Thus, the degree of ambiguity (in units of probability) is $\mho = 0.596$. Now, consider a smaller urn with only 10 colored balls which are either black or yellow, in an unknown proportion. The ambiguity associated with a bet on the color of a ball drawn from this urn is higher than in the large urn; it is $\mho = 0.632$. If there is only one ball in the urn, of unknown color, then $\mho = 1$, and in the other extreme case, if there is an infinite number of balls in the urn, then $\mho = \frac{1}{\sqrt{3}}$. Table 1 is a stylized description of these variations.
The perceived probabilities and the value of each alternative are computed, respectively, by Equations (3) and (2), assuming a DM whose preferences concerning ambiguity are represented by $\Gamma(\mathrm{P}(E)) = \sqrt{\mathrm{P}(E)}$, her preferences concerning risk are represented by $\mathrm{U}(c) = 1 - e^{-c}$, where $c$ stands for consumption, and her reference point is $k = 0$. [25]

[25] The utility function $\mathrm{U}$ is normalized such that $\mathrm{U}(k) = 0$ when $k = 0$.

    #Balls: Total   Y           B           P                        Q       V       ℧
    30              15          15          15/30                    0.500   0.316   0.000
    ∞               0...∞       0...∞       0...1                    0.445   0.281   0.577
    30              0,...,30    0,...,30    0/30, 1/30, ..., 30/30   0.435   0.275   0.596
    10              0,...,10    0,...,10    0/10, 1/10, ..., 10/10   0.417   0.263   0.632
    1               0, 1        0, 1        0, 1                     0.250   0.158   1.000

Table 1: Degrees of ambiguity

It can be observed that a larger number of possible probability values, which in this case are uniformly spread over the interval $[0, 1]$, implies a lower degree of ambiguity. To see the intuition for this, notice that, since the consequence of the favorable (unfavorable) event is identical for all bets, when the DM makes her choice over urns, she actually bets on the composition of the urn rather than on the consequence. Suppose she chooses to bet on the 10-ball urn and that her probability (proportion of balls) assessment is wrong. The minimal size of her error (in terms of probability) is $\frac{1}{10}$. If, however, she chooses to bet on the 30-ball urn and her probability assessment is wrong, the minimal size of her error is only $\frac{1}{30}$. Accordingly, the degree of ambiguity of the 30-ball urn is lower than the degree of ambiguity of the 10-ball urn. The next observation formally defines the property that arises from this intuition.

Observation 1. Suppose a finite set of probability measures $\mathcal{P}$ such that the probability density $\varphi(x)$ of each $x \in \mathcal{X}$ is uniformly distributed over $[a_x, b_x]$. Then, the higher the cardinality $n$ of $\mathcal{P}$, the lower the degree of ambiguity $\mho^2[f]$ of any $f \in \mathcal{F}_0$.
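The urn values quoted above can be checked with a short sketch. It applies the finite-state form of the measure, four times the sum over outcomes of expected probability times variance of probability, to a two-color urn with n balls whose possible compositions are treated as equally likely, as the text assumes. The function name `urn_mho` is a hypothetical helper and not part of the paper. For this special case the variance of the probability of each color is (n + 2)/(12n), so the squared measure reduces to (n + 2)/(3n), which decreases in n toward 1/3.

```python
import math


def urn_mho(n_balls):
    """Normalized ambiguity (in units of probability) of a bet on an n-ball urn."""
    # Possible probabilities of drawing yellow: 0/n, 1/n, ..., n/n, equally weighted.
    probs = [i / n_balls for i in range(n_balls + 1)]
    weight = 1 / len(probs)
    mean_y = sum(weight * p for p in probs)
    var_y = sum(weight * (p - mean_y) ** 2 for p in probs)
    # Black is the complement of yellow, so its probability has the same variance.
    mean_b = 1 - mean_y
    var_b = var_y
    mho2 = 4 * (mean_y * var_y + mean_b * var_b)
    return math.sqrt(mho2)


for n in (1, 10, 30, 1000):
    # Brute-force value next to the closed form sqrt((n + 2) / (3n)).
    print(n, round(urn_mho(n), 3), round(math.sqrt((n + 2) / (3 * n)), 3))
```

For 1, 10 and 30 balls this yields values that round to 1.000, 0.632 and 0.596, and the value keeps falling toward 1/√3 ≈ 0.577 as the number of balls grows.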
The next observation identifies the range of possible values that the ambiguity measure $\mho^2$ can attain.

Observation 2. The values of the ambiguity measure $\mho^2$ are always between 0 and 1.

The minimal possible degree of ambiguity, $\mho^2 = 0$, is attained when all probabilities are perfectly known. The maximal possible degree of ambiguity, $\mho^2 = 1$, is attained when there are two possible outcomes and the probability of each is either 0 or 1 with equal odds. In this most extreme case, the weighted cumulative variance of probabilities attains its maximal possible value, $\frac{1}{4}$; see Observation 2. Variances of probabilities are therefore multiplied by 4 to provide an ambiguity measure ranging between 0 and 1.

It is important to note that $\mho^2$ is an objective measure of ambiguity. It does not depend upon the reference point, $k$, which determines the sets of unfavorable and favorable events. It also does not depend upon the DM’s subjective preferences. However, the most important property of $\mho^2$ is stake independence. Given an event, its degree of ambiguity, measured by $\mho^2$, is invariant to the consequence of this event. That is, given an event, changing its associated (by an act) consequence does not affect the degree of ambiguity of this event. Consider, for example, an event with an unknown probability of winning \$100. Changing the magnitude of gain to \$1000 does not affect either its perceived probability or its degree of ambiguity. This property of stake independence is of primary importance, as it allows for the measurement of ambiguity independently of risk.

An interesting property of the measure $\mho^2$ is that ambiguity may be “canceled out”. [26] This can happen when composing a “portfolio” of acts. To see this, consider the two binary acts $f = (E : 1, E^C : 2)$ and $g = (E^C : 1, E : 2)$, where $E^C$ stands for the complementary event of $E$.
Even if separately each act has a strictly positive degree of ambiguity, a portfolio consisting of only these two acts has a zero degree of ambiguity (and in this extreme case, also a zero degree of risk). The reason is that $E \cup E^C = \mathcal{S}$ and the probability $\mathrm{P}(\mathcal{S})$ of $\mathcal{S}$ is always exactly one, which in turn implies that the degree of ambiguity of the entire state space (measured by $4\mathrm{Var}[\mathrm{P}(\mathcal{S})]$), as well as the degree of ambiguity of an empty subset of the state space, is zero.

In a less extreme scenario, consider two acts $f = (E^C : 0, E : x)$ and $g = (F^C : 0, F : x)$, where $E$ and $F$ are mutually exclusive events. In this case, the ambiguity associated with the outcome $x$ may be lower under the portfolio $\{f, g\}$ than under each act $f$ or act $g$ separately. This may happen when the possible probabilities of events $E$ and $F$ are negatively correlated. To see this, the ambiguity associated with $x$ under $\{f, g\}$ can be written

\[
\begin{aligned}
\mho^2[\{f, g\}, x] = \mho^2[\{f, g\}, E \cup F] &= 4\mathrm{Var}[\mathrm{P}_f(E)] + 4\mathrm{Var}[\mathrm{P}_g(F)] + 8\mathrm{Cov}[\mathrm{P}_f(E), \mathrm{P}_g(F)] \\
&= \mho^2[f, E] + \mho^2[g, F] + 8\mathrm{Cov}[\mathrm{P}_f(E), \mathrm{P}_g(F)].
\end{aligned}
\]

An important conclusion that arises from this example is that unpacking an event $E \cup F$ into disjoint components $E$ and $F$ with different outcomes increases its cumulative ambiguity when $\mathrm{Cov}[\mathrm{P}(E), \mathrm{P}(F)] < 0$. [27] Note that the ambiguity associated with a union of an event and its complementary event is always zero, since the probability of an event is perfectly negatively correlated with the probability of its complementary event; see Lemma 1.

Ellsberg’s three-color experiment can also be viewed as demonstrating the effect that unpacking events has on the degree of ambiguity. In this experiment the DM is presented with an urn. She is told that the urn contains 90 colored balls, 30 of them red and the others either black or yellow in an unknown proportion. A ball will be drawn from the urn at random and the prize for a correct bet is \$100. The experiment consists of two parts.
In the first part, the DM has to choose between two bets: the next drawn ball is red ($R$), or the next drawn ball is black ($B$), formed respectively by acts $f$ and $g$. Then, in the second part, the DM has to choose between betting that the next drawn ball is red or yellow ($RY$) or, alternatively, that the next drawn ball is black or yellow ($BY$), formed respectively by acts $f^*$ and $g^*$. The DM in this example does not have any information indicating which of the possible urn compositions (probabilities) is more likely, and thus she acts as if she assigns an equal weight to each possibility. The following table formalizes this experiment in terms of acts and summarizes the degree of ambiguity associated with each event and each act. The ambiguity of the event with the high payoff that is relevant to each act is underlined (shown here between underscores).

[26] This notion coincides with Epstein and Zhang’s (2001) and Siniscalchi’s (2009) notion of complementarity.

[27] Support theory, of Tversky and Koehler (1994) and Rottenstreich and Tversky (1997), documents that the judged probability of an event generally increases when its description is unpacked into disjoint components and decreases by unpacking its alternative description.

           Prize ($)          Event ℧                                      Act ℧
    Act    R     Y     B      R        Y       B        RY       BY
    f      100   0     0      _0.000_  0.233   0.233    0.329    0.000     0.000
    g      0     0     100    0.000    0.233   _0.233_  0.329    0.000     0.584
    f*     100   100   0      0.000    0.233   0.233    _0.329_  0.000     0.584
    g*     0     100   100    0.000    0.233   0.233    0.329    _0.000_   0.000

Table 2: Ellsberg’s three-color experiment

Behavioral experiments have demonstrated that individuals usually prefer $R$ over $B$ and $BY$ over $RY$; formally, $f \succsim^1 g$ and $g^* \succsim^1 f^*$. [28] It can be observed from Table 2 that, in line with Theorem 6, $\mho[f] < \mho[g]$ and $\mho[g^*] < \mho[f^*]$. This means that DMs usually prefer the less ambiguous bet, implying ambiguity-averse behavior.
Table 2 demonstrates that under act $g$ event $BY$ is unpacked into events $B$ and $Y$ such that the ambiguity associated with act $g$ is higher than that associated with act $f$. Under act $f^*$ event $BY$ can also be viewed as unpacked such that the ambiguity associated with $f^*$ is higher than that associated with act $g^*$.

The ambiguity measure $\mho^2$, extracted in Theorems 5 and 6, measures the degree of ambiguity at the highest possible accuracy. It measures the volatility of probabilities at the resolution of each possible outcome separately. Sometimes such a resolution is not required and a simpler measure can be defined at the expense of accuracy. For example, at the expense of a loss of some information, the measure of ambiguity can be applied over only the two fundamental events: unfavorable and favorable events. That is,

\[
\mho^2[f] = 4\mathrm{Var}[\mathrm{P}_f(UF)] = 4\mathrm{Var}[\mathrm{P}_f(FV)],
\]

where $\mathrm{P}_f(UF)$ is the probability of the unfavorable event $UF = \{s \in \mathcal{S} \mid f(s) \le k\}$ under act $f$, and $\mathrm{P}_f(FV)$ is the probability of the favorable event $FV = \{s \in \mathcal{S} \mid f(s) > k\}$ under act $f$.

7 Alternative measures of ambiguity

Since the seminal works of Knight (1921) and Ellsberg (1961), several attempts have been made to define a measure of ambiguity. Hansen et al. (1999), Hansen and Sargent (2001) and Maccheroni et al. (2006), for example, interpret relative entropy as a measure of ambiguity (or of model uncertainty).

[28] In expected utility theory, the DM’s assessments of the likelihoods of $R$, $B$ and $Y$ can be described by some probability measure $\mathrm{P}$. The DM is assumed to prefer a greater chance of winning \$100 to a smaller chance of winning \$100, such that the choices above imply that $\mathrm{P}(R) > \mathrm{P}(B)$ and $\mathrm{P}(B \cup Y) > \mathrm{P}(R \cup Y)$. However, since $R$, $B$ and $Y$ are mutually exclusive events, no such conventional probability measure exists; hence, it is considered a paradox.
Relative entropy, also called the Kullback-Leibler distance, is measured by the deviation of a probability distribution from a reference distribution (reference model). Formally, the relative entropy of probability distribution $P$ with respect to distribution $Q$ is defined by

\[
D_{KL}(P \mid Q) = \int_{-\infty}^{\infty} p(x) \ln \frac{p(x)}{q(x)} \, dx,
\]

where $p$ and $q$ are respectively the probability densities of $P$ and $Q$. While the use of relative entropy is restricted to cases of a single prior relative to a known true probability distribution, $\mho^2$ can be employed in cases of multiple priors where either a single true probability distribution does not exist or it is not known.

Sometimes the literature takes the variance of variance or the variance of mean as measures of ambiguity (see, for example, Maccheroni et al. (2013)). The measure $\mho^2$ is broader than either of these measures in that it accounts for both, as well as for the variance of all higher moments of the probability distribution (i.e., skewness, kurtosis, etc.) through the variance of probabilities. Furthermore, $\mho^2$ solves some major issues that arise from the exclusive use of either the variance of variance or the variance of mean as measures of ambiguity.

To illustrate a major drawback of using the variance of mean as a measure of ambiguity, consider the following two bets: a bet $A$ with the outcomes $x = (-1, 0, 1)$ and, respectively, probabilities $P = (0.5, 0, 0.5)$, and a bet $B$ with the same outcomes, but with two equally likely possible probability distributions $P_1 = (0.4, 0.2, 0.4)$ and $P_2 = (0.3, 0.4, 0.3)$. The expected outcome of bet $A$ is $\mathrm{E}_A[x] = 0$. The expected outcome of bet $B$ is either $\mathrm{E}_B[x \mid P_1] = 0$ or $\mathrm{E}_B[x \mid P_2] = 0$, respectively contingent upon the probability distributions $P_1$ and $P_2$. Measuring ambiguity by the variance of mean indicates that both $A$ and $B$ have a zero degree of ambiguity, i.e., both are unambiguous.
However, by definition (and as $\mho^2$ indicates), $B$, which has a positive degree of ambiguity, is more ambiguous than $A$, which clearly is unambiguous.

The use of the variance of variance as a measure of ambiguity also bears a major drawback. To illustrate this, one can take the following example: a bet $A$ with the outcomes $x = (-1, 0, 1)$ and, respectively, probabilities $P = (0.48, 0.04, 0.48)$, and a bet $B$ with the same outcomes but with two equally likely possible probability distributions $P_1 = (0.6, 0, 0.4)$ and $P_2 = (0.4, 0, 0.6)$. The variance of the outcomes of bet $A$ is $\mathrm{Var}_A[x] = 0.96$. The variance of the outcomes of bet $B$ is either $\mathrm{Var}_B[x \mid P_1] = 0.96$ or $\mathrm{Var}_B[x \mid P_2] = 0.96$, respectively contingent upon the probability distributions $P_1$ and $P_2$. Measuring ambiguity by the variance of variance indicates that both $A$ and $B$ have a zero degree of ambiguity. However, by definition (and as $\mho^2$ indicates), $B$ is more ambiguous than $A$.

Both variance of variance and variance of mean are functions of outcomes, which makes them stake dependent. As such, neither of these two measures allows for the measurement of the degree of ambiguity in isolation from the degree of risk. Consider, for example, an event with an unknown probability of winning \$100. One would expect that changing the magnitude of gain to \$1000 affects neither the perceived probability of that event nor its degree of ambiguity. This requirement, however, is not satisfied by the variance of variance or by the variance of mean. Both will indicate that the bet with the \$1000 prize is more ambiguous than the bet with the \$100 prize, even though both are bets on the same event and the change of its associated outcome from \$100 to \$1000 does not provide any new information about its likelihood. On the other hand, as a stake-independent measure of ambiguity, $\mho^2$ will indicate that both bets have the same degree of ambiguity.
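The two counterexamples and the stake-dependence argument can be reproduced with a small sketch. It assumes finitely many outcomes and equally weighted priors as in the examples, and uses the finite-state form of the measure, four times the sum of expected probability times variance of probability. The helper names (`moments`, `spread`, `ambiguity_measures`) are hypothetical and introduced only for illustration.

```python
def moments(outcomes, probs):
    """Mean and variance of outcomes under a single probability distribution."""
    mean = sum(p * x for x, p in zip(outcomes, probs))
    var = sum(p * (x - mean) ** 2 for x, p in zip(outcomes, probs))
    return mean, var


def spread(values, weights):
    """Variance of `values` across priors, using the second-order weights."""
    m = sum(w * v for v, w in zip(values, weights))
    return sum(w * (v - m) ** 2 for v, w in zip(values, weights))


def ambiguity_measures(outcomes, priors, weights):
    """Variance of mean, variance of variance, and the squared mho measure."""
    means, variances = zip(*(moments(outcomes, p) for p in priors))
    mho2 = 0.0
    for i in range(len(outcomes)):
        column = [p[i] for p in priors]  # possible probabilities of outcome i
        expected_p = sum(w * c for c, w in zip(column, weights))
        mho2 += 4 * expected_p * spread(column, weights)
    return {
        "var_of_mean": spread(means, weights),
        "var_of_var": spread(variances, weights),
        "mho2": mho2,
    }


x = (-1, 0, 1)
half = (0.5, 0.5)
bet_b_first = ambiguity_measures(x, [(0.4, 0.2, 0.4), (0.3, 0.4, 0.3)], half)
bet_b_second = ambiguity_measures(x, [(0.6, 0.0, 0.4), (0.4, 0.0, 0.6)], half)
# Same beliefs as the second bet B, with every stake multiplied by 10.
bet_b_scaled = ambiguity_measures((-10, 0, 10), [(0.6, 0.0, 0.4), (0.4, 0.0, 0.6)], half)

for name, result in [("first", bet_b_first), ("second", bet_b_second), ("scaled", bet_b_scaled)]:
    print(name, {k: round(v, 4) for k, v in result.items()})
```

In this sketch the first bet B has a zero variance of mean and the second has a zero variance of variance, while the squared measure is strictly positive for both. Multiplying the stakes by 10 inflates the variance of mean by a factor of 100 and leaves the squared measure unchanged.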
The reason is that, while variance of variance and variance of mean are functions of outcomes, and therefore subject to risk, $\mho^2$ is solely a function of probabilities. This means that $\mho^2$ is not affected by the magnitude or the sign of consequences. Increasing or decreasing the consequences of an act does not change its degree of ambiguity, but it does change its degree of risk. Stake independence is a major advantage of $\mho^2$. This property is of primary importance because it allows for the measurement of the degree of ambiguity independently of risk, as well as for the detection of the implications of ambiguity in isolation from risk, in empirical and behavioral studies.

The point to emphasize is that a decision-making process considers not only the degree of ambiguity but also the degree of risk. Hence, when making choices, these two factors jointly play a role. For example, a consolidated uncertainty measure that aggregates risk and ambiguity can be defined by

\[
\Upsilon(x) = \sqrt{ \frac{\mathrm{Var}[x]}{1 - \mho^2[x]} },
\]

where the variance of outcomes $\mathrm{Var}[x]$ is taken using expected probabilities (see Izhakian (2012)). Namely, the variance of outcomes is defined by

\[
\mathrm{Var}[x] = \int \mathrm{E}[\varphi(x)] \left( x - \mathrm{E}[x] \right)^2 dx,
\]

and the expected outcome is defined by the double expectation (with respect to probabilities and to outcomes)

\[
\mathrm{E}[x] = \int \mathrm{E}[\varphi(x)] \, x \, dx.
\]

8 Application for asset pricing

To demonstrate the qualities of $\mho^2$, this section presents an application of the theory to asset pricing. The prices that financial decision makers (investors) are willing to pay for assets could be affected by the fact that they do not know the precise probabilities of future returns. They might require a premium for bearing ambiguity in addition to the premium they require for bearing risk. The risk premium can be viewed as the premium that a DM is willing to pay for exchanging a risky bet for its expected outcome.
The ambiguity premium can be viewed as the premium she is willing to pay for exchanging an ambiguous bet for a risky but unambiguous bet that has an identical expected outcome. [29] The uncertainty premium can be viewed as the total premium that a DM is willing to pay for exchanging an ambiguous bet for its expected outcome, i.e., the accumulation of the risk premium and the ambiguity premium. In this view, the uncertainty premium, denoted $\mathcal{K}$, can be defined by

\[
\begin{aligned}
\mathrm{U}(\mathrm{E}[x] - \mathcal{K}) \approx{} & \int_{-\infty}^{k} \mathrm{U}(x) \left( \mathrm{E}[\varphi(x)] - \frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}(x)])} \mathrm{E}[\varphi(x)] \, \mathrm{Var}[\varphi(x)] \right) dx \\
& + \int_{k}^{\infty} \mathrm{U}(x) \left( \mathrm{E}[\varphi(x)] + \frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}(x)])} \mathrm{E}[\varphi(x)] \, \mathrm{Var}[\varphi(x)] \right) dx,
\end{aligned}
\tag{5}
\]

where $x$ is the outcome of some act $f$ and $\varphi(x)$ is its probability (under act $f$); and $C = \mathrm{E}[x] - \mathcal{K}$ is the certainty equivalent satisfying $C \sim^1 f$. That is, $C$ is the constant sure outcome for which the DM is willing to exchange a risky and ambiguous (uncertain) outcome of act $f$. The next theorem approximates the uncertainty premium and separates it into a risk premium and an ambiguity premium. [30]

Theorem 7. Assume a DM whose preferences are characterized by a twice-differentiable utility function $\mathrm{U}$ and a twice-differentiable outlook function $\Gamma$. For relatively small outcomes with relatively small probabilities the uncertainty premium is

\[
\mathcal{K} \approx -\frac{1}{2} \frac{\mathrm{U}''(\mathrm{E}[x])}{\mathrm{U}'(\mathrm{E}[x])} \mathrm{Var}[x] - \mathrm{E}\left[ \frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}(x)])} \right] \mathrm{E}\big[ |x - \mathrm{E}[x]| \big] \, \mho^2[x],
\tag{6}
\]

where the former is the risk premium and the latter is the ambiguity premium. [31]

Concerning financial decisions, consequences can be described by rates of return, denoted $r$. Assume a DM who decides to save one unit of wealth and invest it in an uncertain (risky and ambiguous) portfolio. The uncertainty premium in this case takes the following form. [32]

Corollary 2. Suppose that the conditions of Theorem 7 hold.
The uncertainty premium, in terms of rate of return, takes the form

\[
\mathcal{K} \approx -\frac{1}{2} \frac{\mathrm{U}''(1 + \mathrm{E}[r])}{\mathrm{U}'(1 + \mathrm{E}[r])} \mathrm{Var}[r] - \mathrm{E}\left[ \frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}(r)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}(r)])} \right] \mathrm{E}\big[ |r - \mathrm{E}[r]| \big] \, \mho^2[r].
\tag{7}
\]

[29] The ambiguity premium can also be viewed as the price that a DM is willing to pay for the information about the true probabilities of events.

[30] The proof of this theorem applies the same methodology to probabilities as used by Arrow (1965) and Pratt (1964) for consequences.

[31] Formally, $\mathrm{E}\left[ \frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}(x)])} \right] = \int_{\mathcal{X}} \mathrm{E}[\varphi(x)] \frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}(x)])} dx$ and $\mathrm{E}\big[ |x - \mathrm{E}[x]| \big] = \int_{\mathcal{X}} \mathrm{E}[\varphi(x)] \, |x - \mathrm{E}[x]| \, dx$.

[32] This representation is obtained by applying the same development of Theorem 7, where $x = 1 + r$.

This model (and Theorem 7) provides two distinctions. First, it distinguishes between risk and ambiguity premiums such that these two premiums are orthogonal. Second, within each premium it distinguishes between the sources of premiums: preferences and beliefs. The risk premium,

\[
\mathcal{R} \approx -\frac{1}{2} \frac{\mathrm{U}''(1 + \mathrm{E}[r])}{\mathrm{U}'(1 + \mathrm{E}[r])} \mathrm{Var}[r],
\]

is the Arrow-Pratt risk premium. Independently, a higher risk, measured by $\mathrm{Var}[r]$, or a higher aversion to risk, measured by the coefficient of absolute risk aversion $-\frac{\mathrm{U}''(\cdot)}{\mathrm{U}'(\cdot)}$, results in a greater risk premium. The ambiguity premium,

\[
\mathcal{A} \approx -\mathrm{E}\left[ \frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}(r)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}(r)])} \right] \mathrm{E}\big[ |r - \mathrm{E}[r]| \big] \, \mho^2[r],
\]

possesses attributes resembling those of the risk premium, but with respect to probabilities rather than to consequences. A complete separation between ambiguity, measured by $\mho^2$, and tastes for ambiguity, measured by the coefficient of absolute ambiguity aversion $-\frac{\Gamma''(\cdot)}{\Gamma'(\cdot)}$, is achieved. Ambiguity aversion ($-\frac{\Gamma''(\cdot)}{\Gamma'(\cdot)} > 0$) implies a positive ambiguity premium. Ambiguity loving ($-\frac{\Gamma''(\cdot)}{\Gamma'(\cdot)} < 0$) implies a negative premium. Ambiguity neutrality ($-\frac{\Gamma''(\cdot)}{\Gamma'(\cdot)} = 0$) implies a zero premium, obtained also when probabilities are perfectly known (i.e., when $\mho^2 = 0$).
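The separation of the uncertainty premium into a risk part and an ambiguity part can be made concrete with a numerical sketch. It is an illustration under simplifying assumptions and not the paper’s estimation procedure: returns take finitely many values, the priors and their weights are hypothetical, and the two aversion coefficients are passed in as constants (`risk_aversion` standing for the coefficient of absolute risk aversion evaluated at one plus the expected return, `ambiguity_aversion` for the expected coefficient of absolute ambiguity aversion).

```python
def uncertainty_premium(returns, priors, weights, risk_aversion, ambiguity_aversion):
    """Return (risk premium, ambiguity premium) in the spirit of Equation (7).

    Assumptions of this sketch: a finite set of returns, second-order weights
    summing to 1, and constant aversion coefficients supplied by the caller.
    """
    n = len(returns)
    # Expected probability and variance of probability of each return.
    exp_p = [sum(w * p[i] for p, w in zip(priors, weights)) for i in range(n)]
    var_p = [
        sum(w * (p[i] - exp_p[i]) ** 2 for p, w in zip(priors, weights))
        for i in range(n)
    ]
    # Moments of returns taken under the expected probabilities.
    exp_r = sum(q * r for q, r in zip(exp_p, returns))
    var_r = sum(q * (r - exp_r) ** 2 for q, r in zip(exp_p, returns))
    abs_dev = sum(q * abs(r - exp_r) for q, r in zip(exp_p, returns))
    # Squared mho measure in its finite-state form.
    mho2 = 4 * sum(q * v for q, v in zip(exp_p, var_p))
    risk_premium = 0.5 * risk_aversion * var_r
    ambiguity_premium = ambiguity_aversion * abs_dev * mho2
    return risk_premium, ambiguity_premium


# Hypothetical example: a -10% or +10% return under two equally weighted priors.
risk, ambiguity = uncertainty_premium(
    returns=[-0.10, 0.10],
    priors=[(0.6, 0.4), (0.4, 0.6)],
    weights=(0.5, 0.5),
    risk_aversion=2.0,
    ambiguity_aversion=1.0,
)
print(round(risk, 6), round(ambiguity, 6))
```

With a single prior the variance of every probability is zero, so the ambiguity premium in this sketch vanishes and only the risk premium remains.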
A higher degree of ambiguity or a higher aversion to ambiguity results in a greater ambiguity premium. The ambiguity premium is also a function of the expected absolute deviation of outcomes from expectation. This component scales the ambiguity premium to the units of outcomes. For example, one may consider the case of measuring the ambiguity premium in terms of dollars versus in terms of percentage rate of return.

The next corollary shows the different premiums in the case of a DM typified by constant relative risk aversion (CRRA) and CAAA.

Corollary 3. Suppose that the conditions of Theorem 7 hold, and assume a DM who is characterized by CRRA,

\[
\mathrm{U}(c) = \begin{cases} \dfrac{c^{1-\gamma} - k^{1-\gamma}}{1-\gamma}, & \gamma \neq 1 \\[1ex] \ln(c) - \ln(k), & \gamma = 1 \end{cases}
\]

and CAAA, $\Gamma(\mathrm{P}(E)) = -\dfrac{e^{-\eta \mathrm{P}(E)}}{\eta}$. [33] The uncertainty premium is then

\[
\mathcal{K} \approx \frac{1}{2} \gamma \, \mathrm{Var}[r] + \eta \, \mathrm{E}\big[ |r - \mathrm{E}[r]| \big] \, \mho^2[r].
\]

[33] A more standard formulation of CRRA, $\mathrm{U}(c) = \frac{c^{1-\gamma}}{1-\gamma}$ for $\gamma \neq 1$ and $\mathrm{U}(c) = \ln(c)$ for $\gamma = 1$, is not always normalized to $\mathrm{U}(k) = 0$.

Several studies have documented ambiguity-averse behavior concerning gains (favorable events) and ambiguity-loving behavior concerning losses (unfavorable events); see, for example, Maffioletti and Michele (2005), Abdellaoui et al. (2005), and Du and Budescu (2005). The ambiguity premium, constructed in Corollary 2, can be refined to support different ambiguity preferences concerning unfavorable and favorable events. Allowing this flexibility, the ambiguity premium takes the form

\[
\mathcal{A} \approx -\left[ \int_{-\infty}^{k} \mathrm{E}[\varphi(r)] \frac{\Gamma''_{UF}(1 - \mathrm{E}[\mathrm{P}(r)])}{\Gamma'_{UF}(1 - \mathrm{E}[\mathrm{P}(r)])} dr + \int_{k}^{\infty} \mathrm{E}[\varphi(r)] \frac{\Gamma''_{FV}(1 - \mathrm{E}[\mathrm{P}(r)])}{\Gamma'_{FV}(1 - \mathrm{E}[\mathrm{P}(r)])} dr \right] \mathrm{E}\big[ |r - \mathrm{E}[r]| \big] \, \mho^2[r],
\]

where $\Gamma_{UF}(\cdot)$ captures ambiguity preferences concerning unfavorable events and $\Gamma_{FV}(\cdot)$ captures ambiguity preferences concerning favorable events.

The implications of ambiguity for the equity premium have been studied mainly by focusing on theoretical aspects. Chen and Epstein (2002), Izhakian and Benninga (2011), Ui (2011), and Maccheroni et al.
(2013) add an ambiguity premium to the conventional risk premium.$^{34}$ In these models the ambiguity premium is also a function of risk attitude, whereas in the model of Equation (7) the ambiguity premium is independent of risk attitude.

The pricing model of Equation (7) has been tested empirically by Brenner and Izhakian (2011). This study of the risk–ambiguity–return relationship employs the measure of ambiguity, $f^2$, as an explanatory factor of the aggregate return on the stock market. To do so, it assumes that each subset of stock returns is generated by the choices of a single representative DM conditional upon a different prior $\mathrm{P}$ within her subjective set of priors $\mathcal{P}$.$^{35}$ The probability distribution of returns in each subset is then estimated to reveal the set of priors. Assuming some structure on second-order beliefs, Brenner and Izhakian (2011) compute $f^2$ from the data and investigate its effect on stock market returns. They find that ambiguity has a significant impact on expected returns. Their study provides a possible explanation for the equity premium puzzle, demonstrating that $f^2$ can be useful in empirical studies of the implications of ambiguity.

9 Conclusion

Almost any real-life decision entails ambiguity. Naturally, one of the first steps of a decision-making process is to rank alternative choices by their degree of ambiguity. The key to addressing this need is a simple well-defined measure of ambiguity. The search for such a measure that can quantify the degree of ambiguity associated with different alternatives can be viewed as having started with the seminal study of Knight (1921). The measure of ambiguity introduced in this paper aims to address this need. Ambiguity in this paper takes the form of probability perturbation (uncertain probabilities) and aversion to ambiguity the form of aversion to mean-preserving spreads in these probabilities.
In this view, just as the degree of risk can be measured by the volatility of outcomes, so too can the degree of ambiguity be measured by the volatility of probabilities. This concept provides a natural objective stake-independent ambiguity measure, denoted $f^2$, which is simply four times the expected volatility of probabilities across the relevant events.

34 Segal and Spivak (1990) also analyze the ambiguity premium, which they call a premium of order 2.
35 A representative DM can be defined as an artificial DM whose tastes and beliefs are such that if all investors in the economy had tastes and beliefs identical to hers the equilibrium in the economy remains unchanged; see, for example, Constantinides (1982).

The measure of ambiguity $f^2$ has two main qualities. First, it is simple, applicable and can be used for the empirical measurement of the degree of ambiguity. Second, it is an objective stake-independent measure. That is, it is independent of risk and independent of individuals' preferences. These qualities are of primary importance for introducing ambiguity into theoretical, behavioral and, especially, empirical studies.

The importance of ambiguity (the uncertainty about probabilities) for understanding economic and financial decision processes has been recognized in the literature for the past half century. Relevant studies have acknowledged that attempts to portray a realistic picture of observable phenomena and anomalies should also consider the dimension of uncertainty with respect to probabilities. Accounting for ambiguity might shed light on many economic and financial phenomena that previously could not be fully explained. The measure of ambiguity introduced in this paper can be employed for this mission. For example, it can be employed for investigating the nature of the risk-ambiguity relationship and its implication for optimal decision making.
Hopefully, this measure will pave the way not only for the introduction of ambiguity into empirical studies, but also for the expansion of theoretical and behavioral studies regarding the nature of ambiguity and related preferences.

References

Abdellaoui, M., F. Vossmann, and M. Weber (2005) “Choice-Based Elicitation and Decomposition of Decision Weights for Gains and Losses Under Uncertainty,” Management Science, Vol. 51, No. 9, pp. 1384–1399.
Anderson, E. W., E. Ghysels, and J. L. Juergens (2009) “The Impact of Risk and Uncertainty on Expected Returns,” Journal of Financial Economics, Vol. 94, No. 2, pp. 233–263.
Arrow, K. J. (1965) Aspects of the Theory of Risk Bearing, Helsinki: Yrjo Jahnssonin Saatio.
Bewley, T. F. (2011) “Knightian Decision Theory and Econometric Inferences,” Journal of Economic Theory, Vol. 146, No. 3, pp. 1134–1147.
Bollerslev, T., R. F. Engle, and J. M. Wooldridge (1988) “A Capital Asset Pricing Model with Time-Varying Covariances,” Journal of Political Economy, Vol. 96, No. 1, pp. 116–131.
Bollerslev, T., N. Sizova, and G. Tauchen (2011) “Volatility in Equilibrium: Asymmetries and Dynamic Dependencies,” Review of Finance, Vol. 16, No. 1, pp. 31–80.
Boyle, P. P., L. Garlappi, R. Uppal, and T. Wang (2011) “Keynes Meets Markowitz: The Tradeoff Between Familiarity and Diversification,” Management Science, Vol. 58, pp. 1–20.
Brenner, M. and Y. Izhakian (2011) “Asset Prices and Ambiguity: Empirical Evidence,” Stern School of Business, Finance Working Paper Series, FIN-11-010.
Brenner, M. and Y. Izhakian (2012) “Pricing Systematic Ambiguity in Capital Markets,” Stern School of Business, Finance Working Paper Series, FIN-12-008.
Chen, Z. and L. Epstein (2002) “Ambiguity, Risk, and Asset Returns in Continuous Time,” Econometrica, Vol. 70, No. 4, pp. 1403–1443.
Constantinides, G. M. (1982) “Intertemporal Asset Pricing with Heterogeneous Consumers and without Demand Aggregation,” The Journal of Business, Vol. 55, No. 2, pp. 253–67.
Coval, J. D. and T. J.
Moskowitz (1999) “Home Bias at Home: Local Equity Preference in Domestic Portfolios,” The Journal of Finance, Vol. 54, No. 6, pp. 2045–2073.
Dow, J. and S. R. d. C. Werlang (1992) “Uncertainty Aversion, Risk Aversion, and the Optimal Choice of Portfolio,” Econometrica, Vol. 60, No. 1, pp. 197–204.
Du, N. and D. V. Budescu (2005) “The Effects of Imprecise Probabilities and Outcomes in Evaluating Investment Options,” Management Science, Vol. 51, No. 12, pp. 1791–1803.
Ellsberg, D. (1961) “Risk, Ambiguity, and the Savage Axioms,” Quarterly Journal of Economics, Vol. 75, No. 4, pp. 643–669.
Epstein, L. G. and S. Ji (2013) “Ambiguous Volatility and Asset Pricing in Continuous Time,” Review of Financial Studies, Vol. 26, No. 7, pp. 1740–1786.
Epstein, L. G. and M. Schneider (2008) “Ambiguity, Information Quality, and Asset Pricing,” The Journal of Finance, Vol. 63, No. 1, pp. 197–228.
Epstein, L. G. and J. Zhang (2001) “Subjective Probabilities on Subjectively Unambiguous Events,” Econometrica, Vol. 69, No. 2, pp. 265–306.
Fernández-Villaverde, J., P. Guerrón-Quintana, J. F. Rubio-Ramírez, and M. Uribe (2010) “Risk Matters: The Real Effects of Volatility Shocks,” American Economic Review, Vol. 101, pp. 2530–2561.
Gilboa, I. (1987) “Expected Utility with Purely Subjective Non-Additive Probabilities,” Journal of Mathematical Economics, Vol. 16, No. 1, pp. 65–88.
Gilboa, I. and D. Schmeidler (1989) “Maxmin Expected Utility with Non-Unique Prior,” Journal of Mathematical Economics, Vol. 18, No. 2, pp. 141–153.
Goetzmann, W. N. and A. Kumar (2008) “Equity Portfolio Diversification,” Review of Finance, Vol. 12, No. 3, pp. 433–463.
Goldberger, A. (1991) A Course in Econometrics: Harvard University Press, 1st edition.
Hansen, L. P. and T. J. Sargent (2001) “Robust Control and Model Uncertainty,” American Economic Review, Vol. 91, No. 2, pp. 60–66.
Hansen, L. P., T. J. Sargent, and T. D.
Tallarini (1999) “Robust Permanent Income and Pricing,” The Review of Economic Studies, Vol. 66, No. 4, pp. 873–907.
Izhakian, Y. (2012) “Capital Asset Pricing under Ambiguity,” Stern School of Business, Economics Working Paper Series, ECN-12-02.
Izhakian, Y. (2014) “Expected Utility with Uncertain Probabilities Theory,” SSRN eLibrary, 2017944.
Izhakian, Y. and S. Benninga (2011) “The Uncertainty Premium in an Ambiguous Economy,” The Quarterly Journal of Finance, Vol. 1, pp. 323–354.
Jewitt, I. and S. Mukerji (2011) “Ordering Ambiguous Acts,” University of Oxford, Department of Economics, Economics Series Working Papers.
Ju, N. and J. Miao (2012) “Ambiguity, Learning, and Asset Returns,” Econometrica, Vol. 80, pp. 559–591.
Klibanoff, P., M. Marinacci, and S. Mukerji (2005) “A Smooth Model of Decision Making under Ambiguity,” Econometrica, Vol. 73, No. 6, pp. 1849–1892.
Knight, F. H. (1921) Risk, Uncertainty and Profit, Boston: Houghton Mifflin.
Maccheroni, F., M. Marinacci, and D. Ruffino (2013) “Alpha as Ambiguity: Robust Mean-Variance Portfolio Analysis,” Econometrica, Vol. 81, pp. 1075–1113.
Maccheroni, F., M. Marinacci, and A. Rustichini (2006) “Ambiguity Aversion, Robustness, and the Variational Representation of Preferences,” Econometrica, Vol. 74, No. 6, pp. 1447–1498.
Maffioletti, A. and M. Santoni (2005) “Do Trade Union Leaders Violate Subjective Expected Utility? Some Insights From Experimental Data,” Theory and Decision, Vol. 59, No. 3, pp. 207–253.
Mehra, R. and E. C. Prescott (1985) “The Equity Premium: A Puzzle,” Journal of Monetary Economics, Vol. 15, No. 2, pp. 145–161.
Owen, J. and R. Rabinovitch (1983) “On the Class of Elliptical Distributions and Their Applications to the Theory of Portfolio Choice,” The Journal of Finance, Vol. 38, No. 3, pp. 745–52.
Pratt, J. W. (1964) “Risk Aversion in the Small and in the Large,” Econometrica, Vol. 32, No. 1/2, pp. 122–136.
Rothschild, M. and J. E. Stiglitz (1970) “Increasing Risk: I.
A Definition,” Journal of Economic Theory, Vol. 2, No. 3, pp. 225–243.
Rottenstreich, Y. and A. Tversky (1997) “Unpacking, Repacking, and Anchoring: Advances in Support Theory,” Psychological Review, Vol. 104, No. 2, pp. 406–415.
Sarin, R. K. and P. P. Wakker (1992) “A Simple Axiomatization of Nonadditive Expected Utility,” Econometrica, Vol. 60, No. 6, pp. 1255–1272.
Savage, L. J. (1954) The Foundations of Statistics, New York, USA: Wiley.
Schmeidler, D. (1989) “Subjective Probability and Expected Utility without Additivity,” Econometrica, Vol. 57, No. 3, pp. 571–587.
Segal, U. and A. Spivak (1990) “First Order Versus Second Order Risk Aversion,” Journal of Economic Theory, Vol. 51, No. 1, pp. 111–125.
Shiller, R. J. (1981) “Do Stock Prices Move Too Much to be Justified by Subsequent Changes in Dividends?” American Economic Review, Vol. 71, No. 3, pp. 421–436.
Siniscalchi, M. (2009) “Vector Expected Utility and Attitudes Toward Variation,” Econometrica, Vol. 77, No. 3, pp. 801–855.
Tversky, A. and D. Kahneman (1992) “Advances in Prospect Theory: Cumulative Representation of Uncertainty,” Journal of Risk and Uncertainty, Vol. 5, No. 4, pp. 297–323.
Tversky, A. and D. J. Koehler (1994) “Support Theory: A Nonextensional Representation of Subjective Probability,” Psychological Review, Vol. 101, pp. 547–567.
Ui, T. (2011) “The Ambiguity Premium vs. the Risk Premium under Limited Market Participation,” Review of Finance, Vol. 15, No. 2, pp. 245–275.
Uppal, R. and T. Wang (2003) “Model Misspecification and Under Diversification,” The Journal of Finance, Vol. 58, No. 1, pp. 2465–2486.
Wakker, P. and A. Tversky (1993) “An Axiomatization of Cumulative Prospect Theory,” Journal of Risk and Uncertainty, Vol. 7, No. 2, pp. 147–175.
Wakker, P. (2010) Prospect Theory: For Risk and Ambiguity: Cambridge University Press.
Weil, P. (1989) “The Equity Premium Puzzle and The Risk-Free Rate Puzzle,” Journal of Monetary Economics, Vol. 24, No. 3, pp. 401–421.

Appendix

Lemma 1.
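The covariance between the probability of event $E$ and the probability of its complementary event $E^C$ satisfies
\[
\mathrm{Cov}\big[\mathrm{P}(E), \mathrm{P}(E^C)\big] = -\mathrm{Var}[\mathrm{P}(E)] = -\mathrm{Var}\big[\mathrm{P}(E^C)\big],
\]
where
\[
\mathrm{Cov}\big[\mathrm{P}(E), \mathrm{P}(E^C)\big] = \int_{\mathcal{P}} \big(\mathrm{P}(E) - \mathrm{E}[\mathrm{P}(E)]\big)\big(\mathrm{P}(E^C) - \mathrm{E}[\mathrm{P}(E^C)]\big)\,d\chi
\]
is the covariance of the probabilities of events $E$ and $E^C$.

Lemma 2. Assume a twice-differentiable outlook function $\Gamma$, satisfying
\[
\frac{1}{2}\left(\frac{\Gamma''(\mathrm{E}[\mathrm{P}(F)])}{\Gamma'(\mathrm{E}[\mathrm{P}(F)])}\,\mathrm{Var}[\mathrm{P}(F)] - \frac{\Gamma''(\mathrm{E}[\mathrm{P}(E \cup F)])}{\Gamma'(\mathrm{E}[\mathrm{P}(E \cup F)])}\,\mathrm{Var}[\mathrm{P}(E \cup F)]\right) \le \mathrm{E}[\mathrm{P}(E)]
\]
for any events $E, F \subseteq S$. Then $\mathrm{Q}(F) \le \mathrm{Q}(E \cup F)$.

Lemma 3.
\[
\frac{d}{d\varphi(x)} \int_{-\infty}^{x} \varphi(z)\,dz = \frac{\varphi(x)}{\varphi'(x)}.
\]

Lemma 4. Assume two secondary acts $\hat{\delta}_E, \hat{\delta}_F \in \hat{\Delta}$, whose resulting probabilities are uniformly distributed or elliptically distributed with an identical characteristic generator, and have an identical expectation, i.e., $\mathrm{E}[\mathrm{P}(E)] = \mathrm{E}[\mathrm{P}(F)]$. Let $\mathrm{Std}[\mathrm{P}(E)]$ and $\mathrm{Std}[\mathrm{P}(F)]$ be, respectively, the standard deviations of their resulting probabilities. Then
\[
\mathrm{P}(F) - \mathrm{E}[\mathrm{P}(F)] \overset{d}{=} \lambda\,\big(\mathrm{P}(E) - \mathrm{E}[\mathrm{P}(E)]\big),
\]
where $\lambda = \frac{\mathrm{Std}[\mathrm{P}(F)]}{\mathrm{Std}[\mathrm{P}(E)]}$.

Lemma 5. Let $Z$ and $\epsilon$ be two random variables. If $\epsilon$ is mean-independent of $Z$, then
\[
\mathrm{E}[Z\epsilon] = \mathrm{E}[Z]\,\mathrm{E}[\epsilon].
\]

Lemma 6. Let $Y$ and $X$ be two random variables. If $Y$ is mean-independent of $X$, then $Y$ is also mean-independent of $Z = h(X)$, where $h : \mathbb{R} \to \mathbb{R}$.

Lemma 7. The following mean-independencies hold:
(i) $(\varphi(x) - \mathrm{E}[\varphi(x)])^2$ is mean-independent of $\varphi(x)$, implying that $\mathrm{E}\big[\varphi(x)\,(\varphi(x) - \mathrm{E}[\varphi(x)])^2\big] = \mathrm{E}[\varphi(x)]\,\mathrm{E}\big[(\varphi(x) - \mathrm{E}[\varphi(x)])^2\big]$;
(ii) $\mathrm{Var}[\varphi(x)]$ is mean-independent of $x$, implying that $\mathrm{E}[x\,\mathrm{Var}[\varphi(x)]] = \mathrm{E}[x]\,\mathrm{E}[\mathrm{Var}[\varphi(x)]]$;
(iii) $\mathrm{Var}[\varphi(x)]$ is mean-independent of $\mathrm{P}(x)$, implying that $\mathrm{E}[\mathrm{P}(x)\,\mathrm{Var}[\varphi(x)]] = \mathrm{E}[\mathrm{P}(x)]\,\mathrm{E}[\mathrm{Var}[\varphi(x)]]$;
(iv) $|x - \mathrm{E}[x]|$ is mean-independent of $\mathrm{E}[\mathrm{P}(x)]$, implying that $\mathrm{E}\big[\mathrm{E}[\mathrm{P}(x)]\,|x - \mathrm{E}[x]|\big] = \mathrm{E}\big[\mathrm{E}[\mathrm{P}(x)]\big]\,\mathrm{E}\big[|x - \mathrm{E}[x]|\big]$.

Lemma 8.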
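If $Y$ is mean-independent of $X$, and $\varphi_Y$, $\varphi_X$, $\varphi_{Y,X}$ exist, then
\[
\int_{-\infty}^{k}\!\int_{-\infty}^{k} \varphi_{Y,X}(y,x)\,yx\,dy\,dx = \int_{-\infty}^{k} \varphi_Y(y)\,y\,dy \int_{-\infty}^{k} \varphi_X(x)\,x\,dx
\]
and
\[
\int_{k}^{\infty}\!\int_{k}^{\infty} \varphi_{Y,X}(y,x)\,yx\,dy\,dx = \int_{k}^{\infty} \varphi_Y(y)\,y\,dy \int_{k}^{\infty} \varphi_X(x)\,x\,dx,
\]
for any $k \in \mathbb{R}$.

Proof of Lemma 1. Since $\mathrm{P}(E)$ is additive, $\mathrm{P}(E^C) = 1 - \mathrm{P}(E)$. Thus, the covariance between $\mathrm{P}(E)$ and $\mathrm{P}(E^C)$ can be written
\[
\begin{aligned}
\mathrm{Cov}\big[\mathrm{P}(E), \mathrm{P}(E^C)\big] &= \int_{\mathcal{P}} \big(\mathrm{P}(E) - \mathrm{E}[\mathrm{P}(E)]\big)\big(\mathrm{P}(E^C) - \mathrm{E}[\mathrm{P}(E^C)]\big)\,d\chi \\
&= \int_{\mathcal{P}} \big(\mathrm{P}(E) - \mathrm{E}[\mathrm{P}(E)]\big)\big(\mathrm{E}[\mathrm{P}(E)] - \mathrm{P}(E)\big)\,d\chi,
\end{aligned}
\]
and therefore
\[
\mathrm{Cov}\big[\mathrm{P}(E), \mathrm{P}(E^C)\big] = -\mathrm{Var}[\mathrm{P}(E)].
\]
The second equality is obtained by
\[
\mathrm{Var}[\mathrm{P}(E)] = \int_{\mathcal{P}} \big(\mathrm{P}(E) - \mathrm{E}[\mathrm{P}(E)]\big)^2 d\chi = \int_{\mathcal{P}} \big(\mathrm{P}(E^C) - \mathrm{E}[\mathrm{P}(E^C)]\big)^2 d\chi = \mathrm{Var}\big[\mathrm{P}(E^C)\big].
\]

Proof of Lemma 2. By Theorem 1,
\[
\begin{aligned}
\mathrm{Q}(E \cup F) - \mathrm{Q}(F) &\approx \mathrm{E}[\mathrm{P}(E)] + \mathrm{E}[\mathrm{P}(F)] + \frac{1}{2}\,\frac{\Gamma''(\mathrm{E}[\mathrm{P}(E \cup F)])}{\Gamma'(\mathrm{E}[\mathrm{P}(E \cup F)])}\,\mathrm{Var}[\mathrm{P}(E \cup F)] \\
&\quad - \mathrm{E}[\mathrm{P}(F)] - \frac{1}{2}\,\frac{\Gamma''(\mathrm{E}[\mathrm{P}(F)])}{\Gamma'(\mathrm{E}[\mathrm{P}(F)])}\,\mathrm{Var}[\mathrm{P}(F)] \\
&= \mathrm{E}[\mathrm{P}(E)] + \frac{1}{2}\,\frac{\Gamma''(\mathrm{E}[\mathrm{P}(E \cup F)])}{\Gamma'(\mathrm{E}[\mathrm{P}(E \cup F)])}\,\mathrm{Var}[\mathrm{P}(E \cup F)] - \frac{1}{2}\,\frac{\Gamma''(\mathrm{E}[\mathrm{P}(F)])}{\Gamma'(\mathrm{E}[\mathrm{P}(F)])}\,\mathrm{Var}[\mathrm{P}(F)],
\end{aligned}
\]
which is nonnegative by the Lemma's hypothesis.

Proof of Lemma 3. Let $u = \varphi(z)$, then changing the integration variable provides
\[
\int_{-\infty}^{x} \varphi(z)\,dz = \int_{\varphi(-\infty)}^{\varphi(x)} \frac{u}{\varphi'(z)}\,du = \int_{\varphi(-\infty)}^{\varphi(x)} \frac{u}{\varphi'(\varphi^{-1}(u))}\,du.
\]
Differentiating with respect to $\varphi(x)$ gives
\[
\frac{d}{d\varphi(x)} \int_{\varphi(-\infty)}^{\varphi(x)} \frac{u}{\varphi'(\varphi^{-1}(u))}\,du = \left.\frac{u}{\varphi'(\varphi^{-1}(u))}\right|_{u=\varphi(x)} = \frac{\varphi(x)}{\varphi'(x)}.
\]

Proof of Lemma 4. Let $y = \mathrm{P}(E) - \mathrm{E}[\mathrm{P}(E)]$ and $z = \mathrm{P}(F) - \mathrm{E}[\mathrm{P}(F)]$, and assume that event $F$ is more ambiguous than event $E$. To show that $z \overset{d}{=} \lambda y$ it has to be proved that $\lambda y$ and $z$ have an identical characteristic function. Consider, first, the case of uniformly distributed $y$ and $z$. Since $\mathrm{E}[y] = 0$ and $\mathrm{E}[z] = 0$, then $y \in [-a_y, a_y]$ and $z \in [-a_z, a_z]$, where $a_y$ and $a_z$ are nonnegative. The characteristic functions of $z$ and $\lambda y$ are, respectively,
\[
\phi_z(t) = \frac{e^{ita_z} - e^{-ita_z}}{2ita_z} \qquad \text{and} \qquad \phi_{\lambda y}(t) = \frac{e^{it\lambda a_y} - e^{-it\lambda a_y}}{2it\lambda a_y}.
\]
Since $\mathrm{E}[z] = \mathrm{E}[y] = 0$ and $y$ and $z$ are uniformly distributed, one can write the ratio of their standard deviations as
\[
\lambda = \frac{\mathrm{Std}[z]}{\mathrm{Std}[y]} = \sqrt{\frac{(2a_z)^2/12}{(2a_y)^2/12}}
\]
to show that $a_z = \lambda a_y$. This implies that $\phi_z(t) = \phi_{\lambda y}(t)$, and therefore $z \overset{d}{=} \lambda y$. Consider now the case of elliptically distributed $z$ and $\lambda y$. That is,
\[
z \sim el\big(\mathrm{E}[z], \mathrm{Var}[z], \Psi\big) \qquad \text{and} \qquad \lambda y \sim el\big(\lambda\mathrm{E}[y], \lambda^2\mathrm{Var}[y], \Psi\big).
\]
Therefore, the characteristic functions of $z$ and $\lambda y$ are, respectively,
\[
\phi_z(t) = e^{it\mathrm{E}[z]}\,\Psi\!\left(\tfrac{1}{2}t^2\,\mathrm{Var}[z]\right) \qquad \text{and} \qquad \phi_{\lambda y}(t) = e^{it\lambda\mathrm{E}[y]}\,\Psi\!\left(\tfrac{1}{2}t^2\lambda^2\,\mathrm{Var}[y]\right).
\]
Since $\phi_z$ and $\phi_{\lambda y}$ have an identical characteristic generator $\Psi$, $\mathrm{E}[z] = \lambda\mathrm{E}[y] = 0$ and $\mathrm{Std}[z] = \lambda\,\mathrm{Std}[y]$, then $\phi_z = \phi_{\lambda y}$, which implies $z \overset{d}{=} \lambda y$.

Proof of Lemma 5. The expectation $\mathrm{E}[Z\epsilon]$ of $Z\epsilon$ over the joint distribution of $Z$ and $\epsilon$ can be taken first over the distribution of $\epsilon$ conditional upon $Z$, and then over the marginal distribution of $Z$. That is,
\[
\mathrm{E}[Z\epsilon] = \mathrm{E}\big[\mathrm{E}[Z\epsilon \mid Z]\big].
\]
Then, $Z$ can be passed out of the inner expectation, implying that
\[
\mathrm{E}[Z\epsilon] = \mathrm{E}\big[Z\,\mathrm{E}[\epsilon \mid Z]\big].
\]
By mean-independence, $\mathrm{E}[\epsilon \mid Z] = \mathrm{E}[\epsilon]$. Therefore,
\[
\mathrm{E}[Z\epsilon] = \mathrm{E}[Z]\,\mathrm{E}[\epsilon].
\]

Proof of Lemma 6. See Goldberger (1991), page 61, M1.

Proof of Lemma 7. (i) One can write
\[
\varphi(x) - \mathrm{E}[\varphi(x)] \overset{d}{=} \mathrm{E}[\varphi(x)] - \mathrm{E}[\varphi(x)] + \epsilon.
\]
Since $\mathrm{E}[\epsilon \mid \varphi(x)] = \mathrm{E}[\epsilon] = 0$, clearly $\epsilon$ and $\varphi(x)$ are mean-independent. Let $\mathrm{Var}[\varphi(x)] = \sigma^2$, then by construction $\mathrm{E}[\epsilon^2 \mid \varphi(x)] = \sigma^2 = \mathrm{E}[\epsilon^2]$. That is, $\epsilon^2$ is mean-independent of $\varphi(x)$, implying that
\[
\mathrm{E}\big[\varphi(x)\,(\varphi(x) - \mathrm{E}[\varphi(x)])^2\big] = \mathrm{E}[\varphi(x)]\,\mathrm{E}\big[(\varphi(x) - \mathrm{E}[\varphi(x)])^2\big].
\]
(ii) Writing the conditional expectation of $\mathrm{Var}[\varphi(x)]$ explicitly provides
\[
\mathrm{E}\big[\mathrm{Var}[\varphi(x)] \mid x\big] = \mathrm{E}\Big[\mathrm{E}\big[(\varphi(x) - \mathrm{E}[\varphi(x)])^2\big] \,\Big|\, x\Big].
\]
By the law of iterated expectation$^{36}$
\[
\mathrm{E}\Big[\mathrm{E}\big[(\varphi(x) - \mathrm{E}[\varphi(x)])^2\big] \,\Big|\, x\Big] = \mathrm{E}\Big[\mathrm{E}\big[(\varphi(x) - \mathrm{E}[\varphi(x)])^2 \mid x\big]\Big].
\]
By (i), $(\varphi(x) - \mathrm{E}[\varphi(x)])^2$ is mean-independent of $\varphi(x)$. By Lemma 6, $(\varphi(x) - \mathrm{E}[\varphi(x)])^2$ is also mean-independent of $\varphi^{-1}(\varphi(x)) = x$.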
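Therefore,
\[
\mathrm{E}\Big[\mathrm{E}\big[(\varphi(x) - \mathrm{E}[\varphi(x)])^2 \mid x\big]\Big] = \mathrm{E}\Big[\mathrm{E}\big[(\varphi(x) - \mathrm{E}[\varphi(x)])^2\big]\Big],
\]
implying that $\mathrm{E}[x\,\mathrm{Var}[\varphi(x)]] = \mathrm{E}[x]\,\mathrm{E}[\mathrm{Var}[\varphi(x)]]$.
(iii) By (ii), $\mathrm{Var}[\varphi(x)]$ is mean-independent of $x$. By Lemma 6, $\mathrm{Var}[\varphi(x)]$ is also mean-independent of the function $\mathrm{P}(x)$ of $x$. Therefore, $\mathrm{E}[\mathrm{P}(x)\,\mathrm{Var}[\varphi(x)]] = \mathrm{E}[\mathrm{P}(x)]\,\mathrm{E}[\mathrm{Var}[\varphi(x)]]$.
(iv) Write $x - \mathrm{E}[x] \overset{d}{=} \mathrm{E}[x] - \mathrm{E}[x] + \epsilon$. Since $\mathrm{E}[\epsilon \mid x] = \mathrm{E}[\epsilon] = 0$, clearly $\epsilon$ is mean-independent of $x$. By Lemma 6, $\epsilon$ is also mean-independent of $\mathrm{E}[\mathrm{P}(x)]$. Therefore, $\mathrm{E}\big[\epsilon \mid \mathrm{E}[\mathrm{P}(x)]\big] = \mathrm{E}[\epsilon]$. Mean-independence implies uncorrelatedness; see, for example, Goldberger (1991), page 63, M2. Therefore, $\mathrm{E}\big[\epsilon\,\mathrm{E}[\mathrm{P}(x)]\big] = \mathrm{E}[\epsilon]\,\mathrm{E}\big[\mathrm{E}[\mathrm{P}(x)]\big]$. Since $\mathrm{E}[\mathrm{P}(x)] \ge 0$ for every $x$, $\mathrm{E}\big[\epsilon\,\mathrm{E}[\mathrm{P}(x)]\big] = \mathrm{E}\big[|\epsilon|\,\mathrm{E}[\mathrm{P}(x)]\big] = \mathrm{E}[|\epsilon|]\,\mathrm{E}\big[\mathrm{E}[\mathrm{P}(x)]\big]$. Substituting for $\epsilon = x - \mathrm{E}[x]$ completes the proof.

36 See, for example, Goldberger (1991) page 47, T8.

Proof of Lemma 8. Define $Z = h(X)$ such that $z = x$ if $x \le k$ and otherwise $z = 0$. Since $Y$ is mean-independent of $X$, by Lemma 6, it is also mean-independent of $Z$. Therefore,
\[
\mathrm{E}[YZ] = \mathrm{E}[Y]\,\mathrm{E}[Z].
\]
Writing the expectation explicitly provides
\[
\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \varphi_{Y,Z}(y,z)\,yz\,dy\,dz = \int_{-\infty}^{k} \varphi_Y(y)\,y\,dy \int_{-\infty}^{k} \varphi_Z(z)\,z\,dz + \int_{k}^{\infty} \varphi_Y(y)\,y\,dy \int_{-\infty}^{k} \varphi_Z(z)\,z\,dz,
\]
which implies
\[
\int_{-\infty}^{k}\!\int_{-\infty}^{k} \varphi_{Y,X}(y,x)\,yx\,dy\,dx = \int_{-\infty}^{k} \varphi_Y(y)\,y\,dy \int_{-\infty}^{k} \varphi_X(x)\,x\,dx.
\]
The second part is proved similarly.

Proof of Theorem 1. The perceived probability, $\mathrm{Q}(E)$, of event $E \in \mathcal{E}$ can be written
\[
\mathrm{Q}(E) = \Gamma^{-1}\big(\Gamma(\mathrm{E}[\mathrm{P}(E)] - \Lambda)\big) = \Gamma^{-1}\!\left(\int_{\mathcal{P}} \Gamma(\mathrm{P}(E))\,d\chi\right), \tag{8}
\]
for some $\Lambda \in \mathbb{R}$. Taking the first-order Taylor approximation of $\Gamma(\mathrm{E}[\mathrm{P}(E)] - \Lambda)$ around $\mathrm{E}[\mathrm{P}(E)]$ yields
\[
\Gamma(\mathrm{E}[\mathrm{P}(E)] - \Lambda) \approx \Gamma(\mathrm{E}[\mathrm{P}(E)]) - \Lambda\,\Gamma'(\mathrm{E}[\mathrm{P}(E)]). \tag{9}
\]
The second-order Taylor approximation of $\Gamma(\mathrm{P}(E))$ in Equation (8) around $\mathrm{E}[\mathrm{P}(E)]$ is
\[
\Gamma(\mathrm{P}(E)) \approx \Gamma(\mathrm{E}[\mathrm{P}(E)]) + \Gamma'(\mathrm{E}[\mathrm{P}(E)])\,\big(\mathrm{P}(E) - \mathrm{E}[\mathrm{P}(E)]\big) + \frac{1}{2}\,\Gamma''(\mathrm{E}[\mathrm{P}(E)])\,\big(\mathrm{P}(E) - \mathrm{E}[\mathrm{P}(E)]\big)^2. \tag{10}
\]
Since $\Gamma(\mathrm{E}[\mathrm{P}(E)])$, $\Gamma'(\mathrm{E}[\mathrm{P}(E)])$ and $\Gamma''(\mathrm{E}[\mathrm{P}(E)])$ are constants, the expectation of Equation (10) is
\[
\int_{\mathcal{P}} \Gamma(\mathrm{P}(E))\,d\chi \approx \Gamma(\mathrm{E}[\mathrm{P}(E)]) + \frac{1}{2}\,\Gamma''(\mathrm{E}[\mathrm{P}(E)])\,\mathrm{Var}[\mathrm{P}(E)]. \tag{11}
\]
Equating (9) to (11) and organizing terms yields
\[
\Lambda \approx -\frac{1}{2}\,\frac{\Gamma''(\mathrm{E}[\mathrm{P}(E)])}{\Gamma'(\mathrm{E}[\mathrm{P}(E)])}\,\mathrm{Var}[\mathrm{P}(E)].
\]
Substituting $\Lambda$ into Equation (8), together with Lemma 2 (that assures nonnegativity), proves the theorem.

Proof of Theorem 2. By Wakker and Tversky (1993, Equation 6.1), the dual representation of Equation (2) can be written
\[
\mathrm{W}(f) = -\int_{-\infty}^{k} \mathrm{U}(x)\,d\!\left[\Gamma^{-1}\!\left(\int_{\mathcal{P}} \Gamma(1 - \mathrm{P}_f(x))\,d\chi\right) - 1\right] + \int_{k}^{\infty} \mathrm{U}(x)\,d\!\left[\Gamma^{-1}\!\left(\int_{\mathcal{P}} \Gamma(1 - \mathrm{P}_f(x))\,d\chi\right)\right]. \tag{12}
\]
The subscript $f$ can be omitted to write
\[
\frac{d}{dx}\,\Gamma^{-1}\big(\mathrm{E}[\Gamma(1 - \mathrm{P}(x))]\big) = \mathrm{E}\!\left[-\frac{\Gamma'(1 - \mathrm{P}(x))\,\varphi(x)}{\Gamma'\big(\Gamma^{-1}(\mathrm{E}[\Gamma(1 - \mathrm{P}(x))])\big)}\right]
\]
and to denote
\[
D(\varphi(x)) = -\frac{\Gamma'(1 - \mathrm{P}(x))\,\varphi(x)}{\Gamma'\big(\Gamma^{-1}(\mathrm{E}[\Gamma(1 - \mathrm{P}(x))])\big)}.
\]
By Lemma 3, differentiating $D$ with respect to $\varphi(x)$ provides
\[
\frac{d}{d\varphi(x)}\,D(\varphi(x)) = \frac{\Gamma''(1 - \mathrm{P}(x))\,\varphi^2(x) - \Gamma'(1 - \mathrm{P}(x))}{\Gamma'\big(\Gamma^{-1}(\mathrm{E}[\Gamma(1 - \mathrm{P}(x))])\big)} - \frac{\Gamma'(1 - \mathrm{P}(x))\,\varphi(x)\,\Gamma''\big(\Gamma^{-1}(\mathrm{E}[\Gamma(1 - \mathrm{P}(x))])\big)\,\mathrm{E}[\Gamma'(1 - \mathrm{P}(x))\,\varphi(x)]}{\Big(\Gamma'\big(\Gamma^{-1}(\mathrm{E}[\Gamma(1 - \mathrm{P}(x))])\big)\Big)^3}.
\]
Notice that, since $\varphi(z)$ is additive, $\int_{-\infty}^{x} \mathrm{E}[\varphi(z)]\,dz = \mathrm{E}[\mathrm{P}(x)]$. Taking the first-order Taylor approximation of $D$ with respect to $\varphi(x)$ around $\mathrm{E}[\varphi(x)]$ provides
\[
\begin{aligned}
\mathrm{E}[D(\varphi(x))] &\approx D(\mathrm{E}[\varphi(x)]) + \mathrm{E}\!\left[\frac{d}{d\varphi(x)}\,D(\mathrm{E}[\varphi(x)])\,\big(\varphi(x) - \mathrm{E}[\varphi(x)]\big)\right] \\
&= -\mathrm{E}[\varphi(x)] + \frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}(x)])}\,\mathrm{E}\big[\varphi(x)\,(\varphi(x) - \mathrm{E}[\varphi(x)])^2\big].
\end{aligned}
\]
By Lemma 7, $(\varphi(x) - \mathrm{E}[\varphi(x)])^2$ is mean-independent of $\varphi(x)$, which implies $\mathrm{E}\big[\varphi(x)\,(\varphi(x) - \mathrm{E}[\varphi(x)])^2\big] = \mathrm{E}[\varphi(x)]\,\mathrm{E}\big[(\varphi(x) - \mathrm{E}[\varphi(x)])^2\big]$. Therefore,
\[
\mathrm{E}[D(\varphi(x))] \approx -\mathrm{E}[\varphi(x)] + \frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}(x)])}\,\mathrm{E}[\varphi(x)]\,\mathrm{Var}[\varphi(x)].
\]
Substituting for $\mathrm{E}[D(\varphi(x))]$ in Equation (12), while accounting for the sign switch of $\mathrm{U}(x)\,\mathrm{E}[\varphi(x)]$ when moving from negative to positive utility across $k$ (see Wakker and Tversky (1993)), provides
\[
\begin{aligned}
\mathrm{W}(f) &\approx \int_{-\infty}^{k} \mathrm{U}(x)\left(\mathrm{E}[\varphi(x)] - \frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}(x)])}\,\mathrm{E}[\varphi(x)]\,\mathrm{Var}[\varphi(x)]\right)dx \\
&\quad + \int_{k}^{\infty} \mathrm{U}(x)\left(\mathrm{E}[\varphi(x)] + \frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}(x)])}\,\mathrm{E}[\varphi(x)]\,\mathrm{Var}[\varphi(x)]\right)dx.
\end{aligned}
\]

Proof of Theorem 3. This proof considers ambiguity aversion; the proof for ambiguity loving is similar.
(i+ii) Let $y = \mathrm{P}(E) - \mathrm{E}[\mathrm{P}(E)]$ and $z = \mathrm{P}(F) - \mathrm{E}[\mathrm{P}(F)]$, and assume that event $F$ is more ambiguous than event $E$. Then, by Definition 2, $z \overset{d}{=} y + \epsilon$, where $\epsilon$ is mean-independent of $y$; therefore
\[
\mathrm{Var}[\mathrm{P}(F)] = \mathrm{Var}[\mathrm{P}(E)] + \mathrm{Var}[\epsilon].
\]
For the opposite direction, assume that $\mathrm{Var}[\mathrm{P}(F)] \ge \mathrm{Var}[\mathrm{P}(E)]$ and define $\lambda = \frac{\mathrm{Std}[\mathrm{P}(F)]}{\mathrm{Std}[\mathrm{P}(E)]} \ge 1$. By the distributions of $\mathrm{P}(E)$ and $\mathrm{P}(F)$, the random variables $y$ and $z$ are either uniformly distributed or elliptically distributed with an identical characteristic generator and $\mathrm{E}[z] = \mathrm{E}[y] = 0$. The random variable $\lambda y$ has the same characteristic function as $z$ with $\mathrm{E}[\lambda y] = \mathrm{E}[z] = 0$ and $\mathrm{Var}[z] = \lambda^2\,\mathrm{Var}[y]$. Therefore, by Lemma 4, $z \overset{d}{=} \lambda y$. Next, write
\[
x + y = \alpha\,(x + \lambda y) + (1 - \alpha)\,x,
\]
where $\alpha = \frac{1}{\lambda}$ and $x$ is a random variable satisfying $\mathrm{E}[x \mid y] = \mathrm{E}[x] = 0$. Then, since $\Gamma$ is concave, by the Jensen inequality
\[
\Gamma(x + y) \ge \alpha\,\Gamma(x + \lambda y) + (1 - \alpha)\,\Gamma(x).
\]
Taking expectations of both sides yields
\[
\mathrm{E}\big[\mathrm{E}[\Gamma(x + y) \mid x]\big] \ge \alpha\,\mathrm{E}\big[\mathrm{E}[\Gamma(x + \lambda y) \mid x]\big] + (1 - \alpha)\,\mathrm{E}[\Gamma(x)]. \tag{13}
\]
Since $\mathrm{E}[\lambda y] = 0$, a concave $\Gamma$ implies
\[
\mathrm{E}\big[\Gamma(\mathrm{E}[x + \lambda y \mid x])\big] = \mathrm{E}[\Gamma(x)] \ge \mathrm{E}\big[\mathrm{E}[\Gamma(x + \lambda y) \mid x]\big],
\]
which jointly with Equation (13) implies
\[
\mathrm{E}\big[\mathrm{E}[\Gamma(x + y) \mid x]\big] \ge \mathrm{E}\big[\mathrm{E}[\Gamma(x + \lambda y) \mid x]\big].
\]
Let $x = 0$, then
\[
\mathrm{E}[\Gamma(y)] \ge \mathrm{E}[\Gamma(\lambda y)] = \mathrm{E}[\Gamma(z)],
\]
which, by Izhakian (2014, Proposition 5), implies $\hat{\delta}_E \succsim^2 \hat{\delta}_F$.
(iii) Let $\Gamma(\mathrm{P}(E)) = -\frac{e^{-\eta \mathrm{P}(E)}}{\eta}$.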
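Taking a second-order Taylor approximation around $\mathrm{E}[\mathrm{P}(E)]$ yields
\[
\Gamma(\mathrm{P}(E)) \approx -1 + \big(\mathrm{P}(E) - \mathrm{E}[\mathrm{P}(E)]\big) - \frac{1}{2}\,\eta\,\big(\mathrm{P}(E) - \mathrm{E}[\mathrm{P}(E)]\big)^2,
\]
and taking expectation yields
\[
\mathrm{E}[\Gamma(\mathrm{P}(E))] \approx -1 - \frac{1}{2}\,\eta\,\mathrm{Var}[\mathrm{P}(E)].
\]
This implies that, concerning an ambiguity-averse DM,
\[
\mathrm{E}[\Gamma(\mathrm{P}(E))] \ge \mathrm{E}[\Gamma(\mathrm{P}(F))] \iff \mathrm{Var}[\mathrm{P}(E)] \le \mathrm{Var}[\mathrm{P}(F)],
\]
and, therefore, by Izhakian (2014, Proposition 5),
\[
\hat{\delta}_E \succsim^2 \hat{\delta}_F \iff \mathrm{Var}[\mathrm{P}(E)] \le \mathrm{Var}[\mathrm{P}(F)].
\]
(iv) Let $\Gamma(\mathrm{P}(E)) = -(\mathrm{P}(E) - \alpha)^2$, where $\mathrm{P}(E) \le \alpha$ for some $\alpha \in \mathbb{R}$. Taking expectation provides
\[
\mathrm{E}[\Gamma(\mathrm{P}(E))] = -\Big(\mathrm{Var}[\mathrm{P}(E)] + \big(\mathrm{E}[\mathrm{P}(E)] - \alpha\big)^2\Big).
\]
Since $\mathrm{E}[\mathrm{P}(E)] = \mathrm{E}[\mathrm{P}(F)] = 0$, then
\[
\mathrm{E}[\Gamma(\mathrm{P}(E))] \ge \mathrm{E}[\Gamma(\mathrm{P}(F))] \iff \mathrm{Var}[\mathrm{P}(E)] \le \mathrm{Var}[\mathrm{P}(F)],
\]
and, therefore, by Izhakian (2014, Proposition 5),
\[
\hat{\delta}_E \succsim^2 \hat{\delta}_F \iff \mathrm{Var}[\mathrm{P}(E)] \le \mathrm{Var}[\mathrm{P}(F)].
\]
Proposition 2 then completes the proof.

Proof of Theorem 4. $(\Longleftarrow)$ Suppose that $\mathrm{V}(f) \ge \mathrm{V}(g)$ but $f$ does not stochastically dominate $g$ with respect to ambiguity. Then, there exists $x^* \in X$ such that
\[
\int_{-\infty}^{x^*} \mathrm{E}[\varphi_f(z)]\,\mathrm{Var}[\varphi_f(z)]\,dz > \int_{-\infty}^{x^*} \mathrm{E}[\varphi_g(z)]\,\mathrm{Var}[\varphi_g(z)]\,dz.
\]
Define $\mathrm{U}(x)$ such that $\mathrm{U}(x) = -1$ if $x < x^* \le k$ and otherwise $\mathrm{U}(x) = 0$. Assume CAAA, i.e., $\Gamma(\mathrm{P}(E)) = -\frac{e^{-\eta \mathrm{P}(E)}}{\eta}$, so that $\frac{\Gamma''(\cdot)}{\Gamma'(\cdot)} = -\eta$. Then, since $\mathrm{E}[\varphi_f(x)] = \mathrm{E}[\varphi_g(x)]$ for every $x \in X$, by Equation (4)
\[
\mathrm{V}(f) - \mathrm{V}(g) \approx -\eta \int_{-\infty}^{x^*} \mathrm{E}[\varphi_f(z)]\,\big[\mathrm{Var}[\varphi_f(z)] - \mathrm{Var}[\varphi_g(z)]\big]\,dz.
\]
Clearly $\mathrm{V}(f) - \mathrm{V}(g) < 0$, which is a contradiction.
$(\Longrightarrow)$ Since $\mathrm{E}[\varphi_f(x)] = \mathrm{E}[\varphi_g(x)]$ for every $x \in X$, by Equation (4)
\[
\begin{aligned}
\mathrm{V}(f) - \mathrm{V}(g) &\approx -\int_{-\infty}^{k} \mathrm{U}(x)\,\frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}_f(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}_f(x)])}\,\mathrm{E}[\varphi_f(x)]\,\big[\mathrm{Var}[\varphi_f(x)] - \mathrm{Var}[\varphi_g(x)]\big]\,dx \\
&\quad + \int_{k}^{\infty} \mathrm{U}(x)\,\frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}_f(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}_f(x)])}\,\mathrm{E}[\varphi_f(x)]\,\big[\mathrm{Var}[\varphi_f(x)] - \mathrm{Var}[\varphi_g(x)]\big]\,dx.
\end{aligned}
\]
By Lemma 7, $\mathrm{Var}[\varphi(x)]$ is mean-independent of $x$ as well as of $\mathrm{P}(x)$.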
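Therefore, by Lemma 8,
\[
\begin{aligned}
\mathrm{V}(f) - \mathrm{V}(g) &\approx -\int_{-\infty}^{k} \mathrm{E}[\varphi_f(x)]\,\mathrm{U}(x)\,\frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}_f(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}_f(x)])}\,dx \int_{-\infty}^{k} \mathrm{E}[\varphi_f(x)]\,\big[\mathrm{Var}[\varphi_f(x)] - \mathrm{Var}[\varphi_g(x)]\big]\,dx \\
&\quad + \int_{k}^{\infty} \mathrm{E}[\varphi_f(x)]\,\mathrm{U}(x)\,\frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}_f(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}_f(x)])}\,dx \int_{k}^{\infty} \mathrm{E}[\varphi_f(x)]\,\big[\mathrm{Var}[\varphi_f(x)] - \mathrm{Var}[\varphi_g(x)]\big]\,dx.
\end{aligned}
\]
Since $\mathrm{U}(x) \ge 0$ for $x \ge k$, $\mathrm{U}(x) \le 0$ for $x \le k$, and $\frac{\Gamma''(\cdot)}{\Gamma'(\cdot)} \le 0$, if act $f$ first-order stochastically dominates act $g$, then $\mathrm{V}(f) - \mathrm{V}(g) \ge 0$.

Proof of Theorem 5. Since $\mathrm{E}[\varphi_f(x)] = \mathrm{E}[\varphi_g(x)]$ for any $x$ and $k = -\infty$, then by Theorem 2
\[
\mathrm{W}(f) - \mathrm{W}(g) \approx \int_{-\infty}^{\infty} \mathrm{U}(x)\,\frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}_f(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}_f(x)])}\,\mathrm{E}[\varphi_f(x)]\,\big(\mathrm{Var}[\varphi_f(x)] - \mathrm{Var}[\varphi_g(x)]\big)\,dx.
\]
By Lemma 7, $\mathrm{Var}[\varphi_f(x)]$ is mean-independent of $x$, as well as of $\mathrm{P}_f(x)$. Therefore,
\[
\mathrm{W}(f) - \mathrm{W}(g) \approx \int_{-\infty}^{\infty} \mathrm{E}[\varphi_f(x)]\,\mathrm{U}(x)\,\frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}_f(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}_f(x)])}\,dx \int_{-\infty}^{\infty} \mathrm{E}[\varphi_f(x)]\,\big(\mathrm{Var}[\varphi_f(x)] - \mathrm{Var}[\varphi_g(x)]\big)\,dx.
\]
Since $\int_{-\infty}^{\infty} \mathrm{E}[\varphi_f(x)]\,\mathrm{U}(x)\,\frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}_f(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}_f(x)])}\,dx \le 0$, then
\[
\mathrm{W}(f) \ge \mathrm{W}(g) \iff \int_{-\infty}^{\infty} \mathrm{E}[\varphi_f(x)]\,\mathrm{Var}[\varphi_f(x)]\,dx \le \int_{-\infty}^{\infty} \mathrm{E}[\varphi_f(x)]\,\mathrm{Var}[\varphi_g(x)]\,dx
\]
and, by Theorem 2,
\[
f \succsim^1 g \iff f^2[f] \le f^2[g].
\]

Proof of Theorem 6. Since $\mathrm{E}[\varphi_f(x)] = \mathrm{E}[\varphi_g(x)]$ for any $x$, then by Theorem 2
\[
\begin{aligned}
\mathrm{W}(f) - \mathrm{W}(g) &\approx -\int_{-\infty}^{k} \mathrm{U}(x)\,\frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}_f(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}_f(x)])}\,\mathrm{E}[\varphi_f(x)]\,\big(\mathrm{Var}[\varphi_f(x)] - \mathrm{Var}[\varphi_g(x)]\big)\,dx \\
&\quad + \int_{k}^{\infty} \mathrm{U}(x)\,\frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}_f(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}_f(x)])}\,\mathrm{E}[\varphi_f(x)]\,\big(\mathrm{Var}[\varphi_f(x)] - \mathrm{Var}[\varphi_g(x)]\big)\,dx.
\end{aligned}
\]
By Lemma 7, $\mathrm{Var}[\varphi_f(x)]$ is mean-independent of $x$, as well as of $\mathrm{P}_f(x)$. Therefore, by Lemma 8,
\[
\begin{aligned}
\mathrm{W}(f) - \mathrm{W}(g) &\approx -\int_{-\infty}^{k} \mathrm{E}[\varphi_f(x)]\,\mathrm{U}(x)\,\frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}_f(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}_f(x)])}\,dx \int_{-\infty}^{k} \mathrm{E}[\varphi_f(x)]\,\big(\mathrm{Var}[\varphi_f(x)] - \mathrm{Var}[\varphi_g(x)]\big)\,dx \\
&\quad + \int_{k}^{\infty} \mathrm{E}[\varphi_f(x)]\,\mathrm{U}(x)\,\frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}_f(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}_f(x)])}\,dx \int_{k}^{\infty} \mathrm{E}[\varphi_f(x)]\,\big(\mathrm{Var}[\varphi_f(x)] - \mathrm{Var}[\varphi_g(x)]\big)\,dx.
\end{aligned}
\]
Since $-\int_{-\infty}^{k} \mathrm{E}[\varphi_f(x)]\,\mathrm{U}(x)\,\frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}_f(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}_f(x)])}\,dx \le 0$ and $\int_{k}^{\infty} \mathrm{E}[\varphi_f(x)]\,\mathrm{U}(x)\,\frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}_f(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}_f(x)])}\,dx \le 0$ then, by the symmetry of outcomes around $k$,
\[
\mathrm{W}(f) - \mathrm{W}(g) \ge 0 \iff \int_{-\infty}^{k} \mathrm{E}[\varphi_f(x)]\,\big(\mathrm{Var}[\varphi_f(x)] - \mathrm{Var}[\varphi_g(x)]\big)\,dx + \int_{k}^{\infty} \mathrm{E}[\varphi_f(x)]\,\big(\mathrm{Var}[\varphi_f(x)] - \mathrm{Var}[\varphi_g(x)]\big)\,dx \le 0,
\]
which implies
\[
\mathrm{W}(f) \ge \mathrm{W}(g) \iff \int_{-\infty}^{\infty} \mathrm{E}[\varphi_f(x)]\,\mathrm{Var}[\varphi_f(x)]\,dx \le \int_{-\infty}^{\infty} \mathrm{E}[\varphi_f(x)]\,\mathrm{Var}[\varphi_g(x)]\,dx
\]
and, by Theorem 2,
\[
f \succsim^1 g \iff f^2[f] \le f^2[g].
\]

Proof of Theorem 7. The first-order Taylor approximation of the LHS of Equation (5) with respect to $K$, around 0, is
\[
\mathrm{LHS} = \mathrm{U}(\mathrm{E}[x] - K) = \int_{-\infty}^{\infty} \mathrm{E}[\varphi(x)]\,\mathrm{U}(\mathrm{E}[x] - K)\,dx \approx \int_{-\infty}^{\infty} \mathrm{E}[\varphi(x)]\,\big(\mathrm{U}(\mathrm{E}[x]) - K\,\mathrm{U}'(\mathrm{E}[x])\big)\,dx.
\]
Writing the RHS of Equation (5) as
\[
\mathrm{RHS} = \underbrace{\int_{-\infty}^{\infty} \mathrm{E}[\varphi(x)]\,\mathrm{U}(x)\,dx}_{I} + \underbrace{\left(\int_{k}^{\infty} \mathrm{U}(x)\,\frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}(x)])}\,\mathrm{E}[\varphi(x)]\,\mathrm{Var}[\varphi(x)]\,dx - \int_{-\infty}^{k} \mathrm{U}(x)\,\frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}(x)])}\,\mathrm{E}[\varphi(x)]\,\mathrm{Var}[\varphi(x)]\,dx\right)}_{II},
\]
the second-order Taylor approximation of $I$ with respect to $x$, around $\mathrm{E}[x]$, is then
\[
\begin{aligned}
I &\approx \int_{-\infty}^{\infty} \mathrm{E}[\varphi(x)]\left(\mathrm{U}(\mathrm{E}[x]) + \mathrm{U}'(\mathrm{E}[x])\,(x - \mathrm{E}[x]) + \frac{1}{2}\,\mathrm{U}''(\mathrm{E}[x])\,(x - \mathrm{E}[x])^2\right)dx \\
&= \mathrm{U}(\mathrm{E}[x]) + \frac{1}{2}\,\mathrm{U}''(\mathrm{E}[x])\,\mathrm{Var}[x].
\end{aligned}
\]
Taking the first-order Taylor approximation of $II$ with respect to $x$, around $\mathrm{E}[x]$, provides$^{37}$
\[
\begin{aligned}
II &\approx -\int_{-\infty}^{k} \big(\mathrm{U}(\mathrm{E}[x]) + \mathrm{U}'(\mathrm{E}[x])\,(x - \mathrm{E}[x])\big)\,\frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}(x)])}\,\mathrm{E}[\varphi(x)]\,\mathrm{Var}[\varphi(x)]\,dx \\
&\quad + \int_{k}^{\infty} \big(\mathrm{U}(\mathrm{E}[x]) + \mathrm{U}'(\mathrm{E}[x])\,(x - \mathrm{E}[x])\big)\,\frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}(x)])}\,\mathrm{E}[\varphi(x)]\,\mathrm{Var}[\varphi(x)]\,dx.
\end{aligned}
\]
Since $\mathrm{E}[x]$ is relatively close to the reference point $k$ and $\mathrm{U}(k) = 0$, then $\mathrm{U}(\mathrm{E}[x]) \approx 0$. Therefore,
\[
II = \mathrm{U}'(\mathrm{E}[x]) \int_{-\infty}^{\infty} |x - \mathrm{E}[x]|\,\frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}(x)])}\,\mathrm{E}[\varphi(x)]\,\mathrm{Var}[\varphi(x)]\,dx.
\]
Since, by Lemma 7, $\mathrm{Var}[\varphi(x)]$ is mean-independent of $x$, as well as of $\mathrm{P}(x)$,
\[
II = \mathrm{U}'(\mathrm{E}[x]) \int_{-\infty}^{\infty} \mathrm{E}[\varphi(x)]\,\mathrm{Var}[\varphi(x)]\,dx \int_{-\infty}^{\infty} \mathrm{E}[\varphi(x)]\,|x - \mathrm{E}[x]|\,\frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}(x)])}\,dx.
\]
By Lemma 7 again, $|x - \mathrm{E}[x]|$ is also mean-independent of $\mathrm{P}(x)$.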
Therefore,
\[
II = \mathrm{U}'(\mathrm{E}[x]) \int_{-\infty}^{\infty} \mathrm{E}[\varphi(x)]\,\mathrm{Var}[\varphi(x)]\,dx \int_{-\infty}^{\infty} \mathrm{E}[\varphi(x)]\,|x - \mathrm{E}[x]|\,dx \int_{-\infty}^{\infty} \mathrm{E}[\varphi(x)]\,\frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}(x)])}\,dx.
\]
Combining the LHS, the RHS, $I$ and $II$, the uncertainty premium is
\[
K \approx -\frac{1}{2}\,\frac{\mathrm{U}''(\mathrm{E}[x])}{\mathrm{U}'(\mathrm{E}[x])}\,\mathrm{Var}[x] - \mathrm{E}\!\left[\frac{\Gamma''(1 - \mathrm{E}[\mathrm{P}(x)])}{\Gamma'(1 - \mathrm{E}[\mathrm{P}(x)])}\right]\mathrm{E}\big[\,|x - \mathrm{E}[x]|\,\big]\, f^2[x].
\]

37 Note that this component holds an order of magnitude of the variance of probabilities. Thus, it is smaller by one order of magnitude than probabilities.

Proof of Proposition 1. Immediately obtained by substituting the perceived probabilities approximated by Theorem 1 into the value function in Equation (2), while accounting for $\mathrm{U}(x) \le 0$ when $x \le k$ and substituting $\mathrm{E}[\mathrm{P}(\{s \in S \mid \mathrm{U}(f(s)) \le z\})] + \mathrm{E}[\mathrm{P}(\{s \in S \mid \mathrm{U}(f(s)) \ge z\})]$ for 1.

Proof of Proposition 2. Let $y = \mathrm{P}(E) - \mathrm{E}[\mathrm{P}(E)]$ and $z = \mathrm{P}(F) - \mathrm{E}[\mathrm{P}(F)]$, and assume that $F$ is more ambiguous than $E$. By Definition 2, $z \overset{d}{=} y + \epsilon$. By Izhakian (2014, Proposition 5), the DM's preference $\succsim^2$ is characterized by the outlook function $\Gamma : [0, 1] \to \mathbb{R}$, implying that
\[
\mathrm{E}[\Gamma(z)] = \mathrm{E}\big[\mathrm{E}[\Gamma(y + \epsilon) \mid y]\big].
\]
Ignoring the outer expectation on the RHS for the moment, ambiguity aversion, formed by a concave $\Gamma$, implies
\[
\mathrm{E}[\Gamma(y + \epsilon) \mid y] \le \Gamma\big(\mathrm{E}[y + \epsilon \mid y]\big) = \Gamma(y).
\]
Taking expectation implies $\mathrm{E}[\Gamma(z)] \le \mathrm{E}[\Gamma(y)]$. Hence, by Izhakian (2014, Proposition 5), $\hat{\delta}_F \precsim^2 \hat{\delta}_E$.
For the opposite direction, let $\hat{\delta}_F \precsim^2 \hat{\delta}_E$. Then, by Izhakian (2014, Proposition 5),
\[
\mathrm{E}[\Gamma(z)] \le \mathrm{E}[\Gamma(y)].
\]
It needs to be shown that there exists an $\epsilon$ that satisfies Definition 2. The proof considers two probability distributions $\mathrm{P} \in \mathcal{P}$; it can then be extended to any number of probability distributions. Let $y$ and $z$ take two possible values, $(y_1, y_2)$ and $(z_1, z_2)$, with probabilities $(\alpha, 1 - \alpha)$ and $(\beta, 1 - \beta)$, respectively. Without loss of generality, assume that $z_1 \ge y_1 \ge y_2 \ge z_2$.
The random variable $\epsilon$ can then be constructed as $\epsilon_1 = (z_1 - y_1,\, z_2 - y_1)$ with probabilities $\left(\frac{y_1 - z_2}{z_1 - z_2},\, \frac{z_1 - y_1}{z_1 - z_2}\right)$ and $\epsilon_2 = (z_1 - y_2,\, z_2 - y_2)$ with probabilities $\left(\frac{y_2 - z_2}{z_1 - z_2},\, \frac{z_1 - y_2}{z_1 - z_2}\right)$. It can be verified that the probabilities of $\epsilon_1$ and $\epsilon_2$ are all positive, and that $\mathrm{E}[\epsilon_1 \mid y_1] = 0$ and $\mathrm{E}[\epsilon_2 \mid y_2] = 0$. Therefore, $\epsilon$ is mean-independent of $y$ and $\mathrm{E}[z] = \mathrm{E}[y + \epsilon] = 0$. The probability that $y + \epsilon = z_1$ is
\[
\alpha\,\frac{y_1 - z_2}{z_1 - z_2} + (1 - \alpha)\,\frac{y_2 - z_2}{z_1 - z_2}.
\]
Since $\mathrm{E}[y] = \mathrm{E}[z]$, then
\[
\alpha = \frac{z_2 - y_2 + \beta\,(z_1 - z_2)}{y_1 - y_2}.
\]
Together, this implies that the probability that $y + \epsilon = z_1$ is equal to $\beta$, and that the probability that $y + \epsilon = z_2$ is equal to $1 - \beta$. That is, $z \overset{d}{=} y + \epsilon$.

Proof of Proposition 3. By Theorem 4, $\mathrm{W}(f) \ge \mathrm{W}(g) \iff f$ stochastically dominates $g$. Then, by Theorem 5, $\mathrm{W}(f) \ge \mathrm{W}(g) \iff f^2[f] \le f^2[g]$. The same holds by Theorem 6.

Proof of Corollary 1. CAAA implies $\Gamma'(\mathrm{P}(E)) = e^{-\eta \mathrm{P}(E)}$ and $\Gamma''(\mathrm{P}(E)) = -\eta e^{-\eta \mathrm{P}(E)}$. Substituting into Equation (4) proves the corollary.

Proof of Corollary 2. Obtained by substituting $1 + r$ for $x$ into Equation (6) of Theorem 7 and rearranging terms.

Proof of Corollary 3. CRRA implies $\mathrm{U}'(x) = x^{-\gamma}$ and $\mathrm{U}''(x) = -\gamma x^{-\gamma - 1}$. CAAA implies $\Gamma'(\mathrm{P}(E)) = e^{-\eta \mathrm{P}(E)}$ and $\Gamma''(\mathrm{P}(E)) = -\eta e^{-\eta \mathrm{P}(E)}$. Substituting into Equation (7) proves the corollary.

Proof of Observation 1. Consider an outcome $x \in X$. Its expected probability can be written
\[
\mathrm{E}[\varphi(x)] = a_x + \sum_{i=1}^{n} (b_x - a_x)\,\frac{i}{n} = a_x + \frac{1}{2}\,(b_x - a_x),
\]
and its variance can be written
\[
\mathrm{Var}[\varphi(x)] = \sum_{i=1}^{n} \left(a_x + \frac{i}{n}\,(b_x - a_x) - \mathrm{E}[\varphi(x)]\right)^2 = \sum_{i=1}^{n} \left(\frac{i}{n}\,(b_x - a_x)\right)^2 - \frac{1}{4}\,(b_x - a_x)^2.
\]
Differentiating $\mathrm{Var}[\varphi(x)]$ with respect to $n$ provides
\[
\frac{d}{dn}\,\mathrm{Var}[\varphi(x)] = -2\,(b_x - a_x)^2 \left(\sum_{i=1}^{n} \frac{i^2}{n^3}\right),
\]
which proves the claim.

Proof of Observation 2. Given an outcome $x \in X$, the maximal variance of its probability is attained when the possible probabilities are only either 0 or 1. In this case, the expected probability of $x$ is $\mathrm{E}[\varphi(x)] = \chi$.
Therefore, the variance of the probability of $x$ is
\[
\mathrm{Var}[\varphi(x)] = \chi\,(1 - \chi)^2 + (1 - \chi)\,(0 - \chi)^2 = \chi - \chi^2,
\]
which attains its maximal value when $\chi = \frac{1}{2}$. In this case, $\mathrm{Var}[\varphi(x)] = \frac{1}{4}$, and therefore $f^2 = 1$. Notice that the expected probability $\chi$ satisfies $\chi = \frac{1}{n}$, where $n$ is the number of different possible outcomes. Therefore, the maximal value of $f^2$ is attained when there are only two possible outcomes.
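The bound in Observation 2 can be checked numerically. The Python sketch below is a minimal illustration, not the paper's estimation procedure: it assumes a discrete analogue of the measure, $f^2 = 4\sum_x \mathrm{E}[\varphi(x)]\,\mathrm{Var}[\varphi(x)]$, taken over a finite set of priors weighted by second-order probabilities. The function name `f2_discrete` and its arguments are hypothetical.

```python
def f2_discrete(priors, weights):
    """Discrete analogue of the ambiguity measure (an illustrative assumption).

    priors  : list of probability vectors, one per prior, over the same outcomes
    weights : second-order probabilities of the priors (should sum to one)
    Returns 4 * sum over outcomes of E[phi(x)] * Var[phi(x)].
    """
    n_outcomes = len(priors[0])
    total = 0.0
    for x in range(n_outcomes):
        # Expected probability of outcome x across priors
        mean = sum(w * p[x] for p, w in zip(priors, weights))
        # Variance of the probability of outcome x across priors
        var = sum(w * (p[x] - mean) ** 2 for p, w in zip(priors, weights))
        total += mean * var
    return 4.0 * total


# Two outcomes, probabilities only 0 or 1, expected probability 1/2 each
two_outcomes = f2_discrete([[1.0, 0.0], [0.0, 1.0]], [0.5, 0.5])

# Three outcomes with degenerate priors: the measure falls below one
three_outcomes = f2_discrete(
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    [1 / 3, 1 / 3, 1 / 3],
)
```

Under these assumptions the two-outcome case reproduces $\mathrm{Var}[\varphi(x)] = \frac{1}{4}$ for each outcome and $f^2 = 1$, while a single known prior gives $f^2 = 0$.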