A Spatio-Temporal State-Space Model for River Network Data
Transcription
A Spatio-Temporal State-Space Model for River Network Data
BIOSTAT Department of Applied Mathematics, Biometrics and Process Control A spatio-temporal statespace model for river network data L. Clement and O. Thas 19th Annual Conference of The International Environmetrics Society TIES 2008, Kelowna, Canada, June 8-13, 2008 BIOSTAT, Coupure Links 653, 9000 Gent,Belgium e-mail: lieven.clement@Ugent.be Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer 1 Introduction 2 Methods 3 4 Case Study: Nitrate concentration in the river Yzer Conclusions TIES 2008, June 8 - 13, 2008 L. Clement and O. Thas, BIOSTAT 2/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer Introduction Important legislation concerning water quality (WQ): Water Framework Directive (WFD) Major goal: maintain and improve the aquatic environment In Flanders: VMM → develop basin management plans Detect impact of previous actions. Assess improvement/trend in WQ TIES 2008, June 8 - 13, 2008 L. Clement and O. Thas, BIOSTAT 3/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer The Yzer Catchment 1 Yzer 1 2 3 4 length 76 km, 44 km in Belgium area 1101 km2 mouth Nieuwpoort: complex of sluices. Eutrophication due to nutrient pollution: high nitrate Figure 1.1: The Yzer catchment. In the left panel the main river is shown an three parts are indicated. In the right panel the entire catchment is g along with the sampling locations of the VMM (indicated with b TIES 2008, June 8 - 13, 2008 L. Clement and O. Thas, BIOSTAT 4/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer The Yzer Catchment 1 Yzer 1 2 3 4 TIES 2008, June 8 - 13, 2008 length 76 km, 44 km in Belgium area 1101 km2 mouth Nieuwpoort: complex of sluices. Eutrophication due to nutrient pollution: high nitrate L. Clement and O. Thas, BIOSTAT 4/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer The Yzer Catchment 1 Yzer 1 2 3 4 2 Nitrate Data 1 TIES 2008, June 8 - 13, 2008 length 76 km, 44 km in Belgium area 1101 km2 mouth Nieuwpoort: complex of sluices. Eutrophication due to nutrient pollution: high nitrate Sampling location 1: dependence in time L. Clement and O. Thas, BIOSTAT 4/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer The Yzer Catchment 1 Yzer 30 1 MAP II 10 20 MAP I 2 0 Nitrate (mg N/l) Observed 1990 1992 1994 1996 1998 2000 2002 2004 3 Time 4 0 Effect (mg N/l) Seasonal effect 1990 1992 1994 1996 1998 2000 2002 2004 2 Nitrate Data Time 1 MAP II 0 2 3 −3 Effect (mg N/l) 3 Trend MAP I 1990 1992 1994 1996 1998 2000 2002 2004 Time TIES 2008, June 8 - 13, 2008 length 76 km, 44 km in Belgium area 1101 km2 mouth Nieuwpoort: complex of sluices. Eutrophication due to nutrient pollution: high nitrate Sampling location 1: dependence in time Seasonal variation and trend 1996: MAP I 2000: MAPIIbis L. Clement and O. Thas, BIOSTAT 4/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer The Yzer Catchment 1 Yzer 1 2 3 4 2 Nitrate Data 1 2 3 4 TIES 2008, June 8 - 13, 2008 length 76 km, 44 km in Belgium area 1101 km2 mouth Nieuwpoort: complex of sluices. Eutrophication due to nutrient pollution: high nitrate Sampling location 1: dependence in time Seasonal variation and trend 1996: MAP I 2000: MAPIIbis Five sampling locations: dependence in time and space L. Clement and O. Thas, BIOSTAT 4/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case S1 Study: Nitrate concentration in the river Yzer S2 S4 S5 The Yzer Catchment S3 1 Yzer length 76 km, 44 km in Belgium 2 2 area 1101 km Chapter 1. Introduction 3 mouth Nieuwpoort: complex of sluices. 4 Eutrophication due to nutrient Figure 1.2: Top Left: Network topology of the sampling locations. pollution:Bottom high Left: nitrate 1 S2 20 Nitrate (mg N/l) 01/90 30 25 20 15 0 5 5 0 10 20 30 40 50 60 70 S3 Nitrate (mg N/l) 01/02 Time 10 25 01/96 20 01/90 S5 15 01/02 Time S4 10 15 5 0 01/96 Map of the river reaches considered in this case study. Locations S1, 2 Nitrate Data S2, S4 and S5 are located on the Yzer river while location S3 is located on a joining creek. Right: Map of the 1partSampling of the Yzerlocation catchment1: located in Flanders, Belgium. The area considered in this studyinistime indidependence cated with the ellipse and the five sampling locations are indicated with 2 Seasonal variation and trend black dots Nitrate (mg N/l) 01/90 10 Nitrate (mg N/l) 20 15 10 0 5 Nitrate (mg N/l) 25 25 30 S1 01/96 Time 01/02 01/90 01/96 01/02 Time 1996: MAP I 2000: MAPIIbis 4 Five sampling locations: Sampling locations S1, S2, S4 and S5 are located on the Yzer while sampling locaFigure 1.3: Nitrate observations at five sampling locations of the river Yzer. Samdependence in time and space tion S3 is located on a joining creek. Their river network topology and locations in 0 3 01/90 01/96 01/02 Time pling locations S1, S2, S4, S5 are located on the Yzer river while sam- pling location S3 is located on a tributary TIES 2008, June 8 -1.2. 13, 2008 Clement and O. BIOSTAT the catchment are shown in Figure MonthlyL.observations areThas, available between 4/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer Spatial dependence structure Spatio-temporal dependence structure Observation Model Methods: Spatio-Temporal model 1 Spatial dependence structure 2 Spatio-temporal dependence structure 3 Observation model TIES 2008, June 8 - 13, 2008 L. Clement and O. Thas, BIOSTAT 5/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer Spatial dependence structure Spatio-temporal dependence structure Observation Model Spatial dependence structure (SDP) River networks: The water flows in 1 direction → Causal interpretation of correlations Unidirectional correlation structure Isolated river: Directed Acyclic Graph → conditional independence 0 6ρ12 6 6 0 4 0 0 2 A= S = AS + γ TIES 2008, June 8 - 13, 2008 0 0 0 ρ24 0 0 0 0 ρ34 0 0 0 0 0 ρ45 3 0 07 7 07 05 0 γ ∼ MVN(0, Σγ ) , Σγ diag L. Clement and O. Thas, BIOSTAT 6/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer Spatial dependence structure Spatio-temporal dependence structure Observation Model Spatio-temporal dependence structure St = ASt + BSt−1 + η t Unidirectional dependence in time Assumption: observation at t only depends on observation at t − 1 → AR(1) process B Diagonal matrix with AR coef (only monthly observations) η t ∼ MVN(0, Ση ) (Ση diag) TIES 2008, June 8 - 13, 2008 L. Clement and O. Thas, BIOSTAT 7/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer Spatial dependence structure Spatio-temporal dependence structure Observation Model Spatio-temporal dependence structure St = ASt + BSt−1 + η t St = ΦSt−1 + δ t Unidirectional dependence in time Assumption: observation at t only depends on observation at t − 1 → AR(1) process B Diagonal matrix with AR coef (only monthly observations) η t ∼ MVN(0, Ση ) (Ση diag) Φ = (I − A)−1 B δ t ∼ MVN(0, Q), Q = (I − A)−1 Ση (I − A)−T TIES 2008, June 8 - 13, 2008 L. Clement and O. Thas, BIOSTAT 7/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer Spatial dependence structure Spatio-temporal dependence structure Observation Model Observation Model Model of S only holds for isolated river model ( St = ΦSt−1 + δ t Y t = St + t Reality: disturbance → put S in observation model Error term: t ∼ MVN(0, Σ ) → spatial correlation due to disturbances Up to now only covariance structure modelled TIES 2008, June 8 - 13, 2008 L. Clement and O. Thas, BIOSTAT 8/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer Spatial dependence structure Spatio-temporal dependence structure Observation Model Observation Model Model of S only holds for isolated river model ( St = ΦSt−1 + δ t Y t = X t β + St + t Reality: disturbance → put S in observation model Error term: t ∼ MVN(0, Σ ) → spatial correlation due to disturbances Up to now only covariance structure modelled Mean model: E {Yt } = Xt β TIES 2008, June 8 - 13, 2008 L. Clement and O. Thas, BIOSTAT 8/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer Kalman Filter and Smoother Parameter estimation algorithms Parameter Estimation Kalman Filter and Kalman Smoother ECM algorithm E step: Calculate expected log likelihood of State Space model CM step1: Parameters dependence structure CM step 2: Parameters of the mean model TIES 2008, June 8 - 13, 2008 L. Clement and O. Thas, BIOSTAT 9/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer Kalman Filter and Smoother Parameter estimation algorithms Kalman Filter State Space model: likelihood can be factorized using the Kalman Filter E [St |t − 1] E [(St − at|t−1 )(St − at|t−1 )T ] E [St ] E [(St − at )(St − at )T ] E [vt vtT ] = = = = = at|t−1 Pt|t−1 at Pt Ft = = = = = Φat−1 ΦPt−1 ΦT + Q at|t−1 + Pt|t−1 F−1 t vt Pt|t−1 − Pt|t−1 F−1 t Pt|t−1 Pt|t−1 + Σ with innovation vt = (Yt − at|t−1 − Xt β) log LY = N P t=1 log Lyt |Yt−1 ∼ − 12 TIES 2008, June 8 - 13, 2008 N P t=1 log |Ft | − 1 2 N P t=1 vtT Ft −1 vt L. Clement and O. Thas, BIOSTAT 10/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer Kalman Filter and Smoother Parameter estimation algorithms Kalman Smoother Smoothed estimates at|N for St conditionally on all N For t = N − 1, . . . , 0 at|N = at + P∗t (at+1|N − at+1|t ) Pt|N = Pt + P∗t (Pt+1|N − Pt+1|t )P∗T t P∗t = Pt ΦT P−1 t+1|t Lag-one covariance smoothers (Digalakis, Rohlick and Osendorf 1993) ⇒ Forward recursion: Pt,t−1|t = (I − Pt|t−1 F−1 t )ΦPt−1 ⇒ Backward recursion: Pt,t−1|N = Pt,t−1|t + (Pt|N − Pt )P−1 t Pt,t−1|t TIES 2008, June 8 - 13, 2008 L. Clement and O. Thas, BIOSTAT 11/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer Kalman Filter and Smoother Parameter estimation algorithms Using the Kalman filter for GLS Apply same Kalman filter to Yt and each of the columns of Xt ⇒ A p × 1 vector of innovations, Yt∗ on Yt and ⇒ a p × m matrix of innovations, X∗t on (X1t , . . . , Xmt ) are produced. ⇒ Run recursions for Pt|t−1 , Pt|t and Ft only once rather than m + 1 times. GLS estimator of β becomes " N #−1 N X X −1 ∗ −1 ∗ β̂GLS = X∗T F X X∗T (1) t t t t F t yt . t=1 t=1 vt become vt = yt∗ − x∗t β̂GLS . Maximisation possible by classical numerical algorithms Here ECM algorithm TIES 2008, June 8 - 13, 2008 L. Clement and O. Thas, BIOSTAT 12/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer Kalman Filter and Smoother Parameter estimation algorithms Existing EM algorithms 1 lc (Ψ) = log LYN ,SN (Ψ) the joint log-likelihood of YN and SN . 2 Problem: unobservable state process S 3 Use an EM algorithm (e.g. Shumway and Stoffer 1982) 4 RMN topology: restrictions on parametrisation ⇒ Q and Φ have some parameters in common ⇒ Adapt existing EM algorithm for SS models 5 Presence of the exogenous variables ⇒ use ECM algorithm. ⇒ Split M-step in two CM-steps. ⇒ CM step 1: Update the parameters of the dependence structure Ψα given the current values of the mean model β. ⇒ CM step 2: estimate of β using the updated values of Ψα . TIES 2008, June 8 - 13, 2008 L. Clement and O. Thas, BIOSTAT 13/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer Kalman Filter and Smoother Parameter estimation algorithms ECM algorithm 1 Choose initial estimates: Ψ0 2 E-step: Calculate Q(Ψ, Ψkα , β k ) = E lc (Ψ)|YN , Ψk 3 CM-step 1: Find the covariance parameters Ψk+1 that α k k maximise Q(Ψ, Ψα , β ) 4 k CM-step 2: Find β k+1 that maximises Q(Ψ, Ψk+1 α ,β ) 5 Repeat steps 2-4 until convergence TIES 2008, June 8 - 13, 2008 L. Clement and O. Thas, BIOSTAT 14/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer Kalman Filter and Smoother Parameter estimation algorithms E-step o 1 n k k −1 Q(Ψ, Ψk ) ∼ − E log |ΣS0 | + ST 0 ΣS0 S0 |Y, Ψ , β 2 ( ) N X 1 T −1 − E N log |Ση | + (St − ASt − BSt−1 ) Ση (St − ASt − BSt−1 )|... 2 t=1 ( ) N X 1 − E N log |Σ | + (Yt − Xβ − St )T Σ−1 (Y t − Xβ − St )|... . 2 t=1 Exponential family ⇒ sufficient statistics |Y E [St |Y] = at|N E [St St |Y] = Pt|N + at|N aT t|N E [St St−1 |Y] = Pt,t−1|N + at|N aT t−1|N . Maximise the obtained expected likelihood in the CM steps TIES 2008, June 8 - 13, 2008 L. Clement and O. Thas, BIOSTAT 15/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer Kalman Filter and Smoother Parameter estimation algorithms CM-step 1: Update parameters of dependence structure given β k k Q(Ψ, Ψ ) ∼ − 1 2 n o T −1 k k E log |ΣS0 | + S0 ΣS S0 |Y, Ψ , β 0 9 8 N = < X T −1 (St − ASt − BSt−1 ) Ση (St − ASt − BSt−1 )|... − E N log |Ση | + ; 2 : t=1 9 8 N = X 1 < T −1 − E N log |Σ | + (Yt − Xβ − St ) Σ (Yt − Xβ − St )|... . ; 2 : 1 t=1 Σ̂S0 = P0|N Replace all sufficient statistics by their conditional expectations TIES 2008, June 8 - 13, 2008 L. Clement and O. Thas, BIOSTAT 16/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer Kalman Filter and Smoother Parameter estimation algorithms CM-step 1: Update parameters of dependence structure given β k k Q(Ψ, Ψ ) ∼ − 1 2 n o T −1 k k E log |ΣS0 | + S0 ΣS S0 |Y, Ψ , β 0 9 8 N = < X T −1 − (St − ASt − BSt−1 ) Ση (St − ASt − BSt−1 )|... E N log |Ση | + ; 2 : t=1 9 8 N = X 1 < T −1 − (Yt − Xβ − St ) Σ (Yt − Xβ − St )|... . E N log |Σ | + ; 2 : t=1 1 ( E {log LS (Ψ)|...} ∼ − 12 Pp i=1 E N log ∆σηi + 1 2 ση i TIES 2008, June 8 - 13, 2008 PN i t=1 (St [i] ) i − A[i] St − Bi St−1 )2 |... L. Clement and O. Thas, BIOSTAT 17/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer Kalman Filter and Smoother Parameter estimation algorithms CM-step 1: Update parameters of dependence structure given β k k Q(Ψ, Ψ ) ∼ − 1 2 n o T −1 k k E log |ΣS0 | + S0 ΣS S0 |Y, Ψ , β 0 9 8 N = < X T −1 − (St − ASt − BSt−1 ) Ση (St − ASt − BSt−1 )|... E N log |Ση | + ; 2 : t=1 9 8 N = X 1 < T −1 − (Yt − Xβ − St ) Σ (Yt − Xβ − St )|... . E N log |Σ | + ; 2 : t=1 1 8 > > > > > > > > > < > > > > > > > > > : „ «„ « „ « PN PN PN [i] [i] T −1 PN [i] i i i [i] T i t=1 St St−1 − t=1 St St t=1 St St t=1 St−1 St „ «„ «−1 „ « PN PN PN PN [i] T [i] [i] T [i] 2 Si − Si S S S Si S t=1 t−1 t=1 t−1 t t=1 t t t=1 t−1 t – „ « »“ ” −1 PN PN [i] T [i] [i] T i Ak+1 = Sti − Bik+1 St−1 St t=1 St St t=1 i RSSk+1 2 k+1 i ση = N i Bik+1 = Replace all sufficient statistics by their conditional expectations TIES 2008, June 8 - 13, 2008 L. Clement and O. Thas, BIOSTAT 17/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer Kalman Filter and Smoother Parameter estimation algorithms CM-step 1: Update parameters of dependence structure given β k k Q(Ψ, Ψ ) ∼ − 1 2 n o T −1 k k E log |ΣS0 | + S0 ΣS S0 |Y, Ψ , β 0 9 8 N = < X T −1 (St − ASt − BSt−1 ) Ση (St − ASt − BSt−1 )|... − E N log |Ση | + ; 2 : t=1 9 8 N = X 1 < T −1 E N log |Σ | + − (Yt − Xβ − St ) Σ (Yt − Xβ − St )|... . ; 2 : 1 t=1 PN (Y0 Y0T )− Σ k+1 = t=1 t t with Y0t = Yt − Xt β k PN PN PN 0 T 0T T t=1 (Y t St )− t=1 (St Y t )+ t=1 (St St ) N Replace all sufficient statistics by their conditional expectations TIES 2008, June 8 - 13, 2008 L. Clement and O. Thas, BIOSTAT 18/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer Kalman Filter and Smoother Parameter estimation algorithms CM-step 2: Update β k given Ψk+1 α β k+1 : FGLS using Kalman Filter with Ψk+1 α . Kalman filter is already available for next E step. TIES 2008, June 8 - 13, 2008 L. Clement and O. Thas, BIOSTAT 19/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer Case Study Has the water quality improved? MAP II 20 MAP I 10 Is mean of 2003 different from 0 Nitrate (mg N/l) 30 Mean model: ST + annual effect 1990 1994 1998 1 the general mean 2 the mean of 2001-2002 2002 Time TIES 2008, June 8 - 13, 2008 L. Clement and O. Thas, BIOSTAT 20/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer Model for average NO− 3 at Si on time t E {y } = µ + α + β + (κ) I(i) + γ sin(2πt/12) 1 i,t i model, j a parametric j Spatio-temporal river network approach + γ2 cos(2πt/12) + (λ)j,1 sin(2πt/12) + (λ)j,2 cos(2πt/12) S2 (i = 1 . . . 5, j = year1 , . . . , year14 ) 25 20 15 Nitrate (mg N/l) S4 S5 25 20 15 5 5 0 01/90 01/96 Time 01/02 01/90 01/96 01/02 Time 0 Nitrate (mg N/l) 10 20 30 40 50 60 70 S3 Nitrate (mg N/l) 01/02 Time 10 01/96 0 01/90 20 01/02 Time 15 01/96 10 01/90 Nitrate (mg N/l) 0 0 25 30 5 10 20 15 10 5 Nitrate (mg N/l) 25 30 30 S1 01/90 01/96 01/02 Time TIES June quality 8 - 13, 2008 L. Clement and O.ofThas, BIOSTAT Figure 5.9: Evolution of 2008, the water at five sampling locations the river 21/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer Statistical tests d −1 β̂(Ψα ) → MVN(β, (XT Σ−1 YN X) ) −1 Σ̂β = (XT Σ−1 YN (Ψ̂α )X) . A general linear hypothesis to compare the means: Hβ = 0, where H is the r × q hypothesis matrix. Test: T = (Hβ̂)T (HΣ̂β HT )−1 (Hβ̂) TIES 2008, June 8 - 13, 2008 L. Clement and O. Thas, BIOSTAT 22/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer Statistical tests H0 : Mean 2003 equal to Mean 2001-2002 H1 : Mean 2003 different from Mean 2001-2002 Test 2003 ↔ 2001-2002 2003 ↔ general mean 2003 ↔ 2001-2002 2003 ↔ general mean TIES 2008, June 8 - 13, 2008 contrast p-value Main river (“Regional”) -2.54 0.016 -2.88 0.0005 Joining creek (S3) -1.32 0.56 -8.40 < 0.0001 L. Clement and O. Thas, BIOSTAT 23/24 Introduction Methods: Spatio-Temporal model Parameter Estimation Case Study: Nitrate concentration in the river Yzer Statistical tests H0 : Mean 2003 equal to Mean 2001-2002 H1 : Mean 2003 different from Mean 2001-2002 Test 2003 ↔ 2001-2002 2003 ↔ general mean 2003 ↔ 2001-2002 2003 ↔ general mean TIES 2008, June 8 - 13, 2008 contrast p-value Main river (“Regional”) -2.54 0.016 -2.88 0.0005 Joining creek (S3) -1.32 0.56 -8.40 < 0.0001 L. Clement and O. Thas, BIOSTAT 23/24 References References 1 Clement, L. and Thas, O. (2007). Estimating and modelling spatio-temporal correlation structures for river monitoring networks. Journal of Agricultural, Biological, and Environmental Statistics, 12(2), 161-176. 2 Clement, L., Thas, O., Vanrolleghem, P.A. and Ottoy, J.P. (2006). Spatio-temporal statistical models for river monitoring networks. Water, Science and Technology, 53, 9-15. TIES 2008, June 8 - 13, 2008 L. Clement and O. Thas, BIOSTAT 24/24