View/Open - Minerva Access
Improving flood prediction in sparsely gauged catchments by the assimilation of satellite soil moisture into a rainfall-runoff model

Camila Alvarez

Submitted in total fulfilment of the requirements of the degree of Doctor of Philosophy
Department of Infrastructure Engineering
The University of Melbourne
April 2016
Produced on archival quality paper

Abstract

This thesis explores the assimilation of remotely sensed soil moisture (SM-DA) into a rainfall-runoff model for improving flood prediction within data scarce regions. Satellite soil moisture (SM) observations are used to correct the two main controlling factors of streamflow generation: the wetness condition of the catchment (state correction scheme) and the magnitude of rainfall events (forcing correction scheme).

The core part of the research focuses on the state correction scheme. A simple rainfall-runoff model (the probability distributed model, PDM) is used for this. The soil water state of PDM is corrected by assimilating active and passive satellite SM observations using an ensemble Kalman filter. Within this framework, the efficacy of different existing tools for setting up the state correction scheme is evaluated, and new techniques are introduced to address some of the key challenges in the assimilation of surface satellite SM observations into hydrological models. Various options for the state correction scheme were implemented and enhanced throughout the thesis. The proposed schemes consistently led to improved streamflow ensemble predictions for a case study. In the final state correction scheme, the ensemble root mean square error was reduced by 24% at the catchment outlet, the false alarm ratio was reduced by 9%, and the skill and reliability of the streamflow ensemble were improved after SM-DA.
The state correction scheme was also effective at improving the streamflow ensemble prediction within ungauged inner locations, which demonstrates the advantages of incorporating spatially distributed SM information within large and poorly instrumented catchments. I showed that, since stochastic SM-DA is formulated to reduce the random component of the SM error (and therefore does not address systematic biases in the model), the efficacy of the state correction schemes was restricted by the model quality before assimilation. This is critical within a data scarce context, where streamflow predictions suffer from large errors coming from the poor quality data used to force and calibrate the model. Additionally, because SM exerts stronger control on the catchment runoff mechanisms during minor and moderate floods, the state correction scheme had more skill when the low flows were evaluated. Consequently, SM-DA mainly improved the quality of the streamflow ensemble prediction (skill, reliability and average statistics of the ensemble) but did not significantly reduce the existing biases in the peak flow predictions. These results reveal one key limitation of the proposed approach: it seeks to improve flood prediction by reducing random (and not systematic) errors in the SM state of a rainfall-runoff model, while SM is probably not the main controlling factor in the runoff generation during major floods within the study catchment.

To address the above limitation, I set up a forcing correction scheme that aimed at reducing the errors in the rainfall data (the rainfall input and the infiltration estimates from the model are probably the main factors controlling the accuracy of flood predictions). For this I adopted the soil moisture analysis rainfall tool (SMART) proposed by Crow et al. (2009).
In SMART, active and passive satellite SM were assimilated into the Antecedent Precipitation Index model to correct a near real-time satellite rainfall product, which was subsequently used to force PDM (without state correction). The results showed that remotely sensed SM was effective at improving mean-to-high daily satellite rainfall accumulations, which in turn led to a consistent improvement of the streamflow prediction, especially during high flows.

The efficacy of the state correction and the forcing correction schemes was compared within four catchments. In most cases, reducing the model SM error by assimilating satellite SM improved the streamflow prediction more than correcting the forcing data did. This was true for both the low flows and the high flows. The outperformance of the state correction scheme during high flows is counterintuitive, given the stronger influence that rainfall probably has during floods, and differs from previous studies. I attributed these differences to various factors, including the methodological configuration (rainfall-runoff model, model error quantification, etc.), the quality of the satellite rainfall data and the quality of the satellite SM retrievals. In agreement with the literature, the combination of the forcing and the state correction schemes further improved flood predictions.

The significance of this thesis is in providing novel evidence (based on real data experiments) of the value of satellite soil moisture for improving both an operational satellite rainfall product and the streamflow prediction within data scarce regions. Additionally, I highlighted a number of challenges and limitations within the forcing and state correction schemes. I introduced new techniques to overcome some of these challenges and proposed future strategies to further address them. This contributes to advancing towards a reliable data assimilation framework for improving operational flood prediction within data scarce regions.
Declaration

This is to certify that:
1. The thesis comprises only my original work towards the PhD.
2. Due acknowledgement has been made in the text to all other material used.
3. The thesis is fewer than 100,000 words in length, exclusive of tables, figures, bibliographies and appendices.

Camila Alvarez
November 2015

Preface

This thesis is framed within the project "Development of a new-generation flood forecasting system using observations from space", funded by the Australian Research Council and the Bureau of Meteorology under Linkage Project LP110200520. The project aims to demonstrate the potential role that satellite-derived rainfall and soil moisture products may play in improving continuous daily streamflow forecasts of fluvial flooding in semi-arid catchments in Australia. The motivation is to address the deficiency of geophysical data for implementing rainfall-runoff modelling and, in particular, two key informational gaps: gauge-based precipitation measurements over inland catchments and soil wetness measurements over the Australian continent. The hypotheses are that (1) rainfall accumulations derived from the constellation of geostationary and polar orbiting active and passive satellites can be used to provide near real-time rainfall to drive hydrological modelling with full continental coverage at daily time scales; (2) active and passive microwave satellite-derived soil moisture can be used to provide daily information on the antecedent soil wetness of a catchment with direct influence on runoff generation; and (3) by using satellite-derived data, forecasts of the river discharge at a sparsely gauged or ungauged catchment outlet can be enhanced. The objective is therefore to develop and implement a flood forecasting scheme that ingests near real-time satellite-derived precipitation and soil moisture data sets to produce discharge forecasts in an operational environment.
With applications to Australian regions in mind, the implementation and evaluations are conducted in semi-arid, poorly instrumented, flood-prone regions of Australia. Within this context, this thesis investigates a data assimilation framework that uses active and passive satellite SM observations to improve flood prediction within data scarce regions.

Acknowledgements

This thesis is the result of 4 years of intensive work and large personal growth. This goal was achieved thanks to a sum of key factors. I had the pleasure of working with Dongryeol and Andrew, who enthusiastically shared their expertise and experience. They provided me with constant guidance, support and trust throughout these years. They had a fundamental role in the development of my academic and research profile. I had valuable contributions from Wade, David and Chun-Hsu, which had direct impacts on the quality of my research. These years as a PhD candidate were intense and exhausting, but also very exciting. The University of Melbourne was a great place to develop my career; I was surrounded by great colleagues and friends, and by a very friendly and challenging academic environment. And Melbourne is the best place that I could think of to spend this challenging time: lots of biking, yoga, parks, playgrounds, and the best coffee. My PhD was supported by Becas Chile from CONICYT. I am very grateful for the important effort that my country dedicates to developing advanced research. I will strive to effectively reward their investment by contributing to the growth of science, and to a sustainable development of Chile. Finally, I had the most fundamental support from my family: my parents and in-laws, with their constant support and love, and especially my mum, who came to help us with parenting and life in the hectic last weeks of our PhDs (Robert's and mine).
My healthy and happy daughters, Blanca and Eloisa, who were a key aspect of the whole PhD experience and taught me the most important lessons about parenthood, life and happiness. And my beloved Robert, with his wisdom, his constant support, companionship, affection, shelter, trust, humour, and love during this adventure, and in life. Gracias a todos (thank you all)!

Contents

List of Figures
List of Tables

Chapter 1 Introduction
  1 Structure of the thesis

Chapter 2 Background
  1 Microwave remote sensing of soil moisture
  2 Hydrologic data assimilation
  3 Satellite soil moisture data assimilation (SM-DA)
  4 State correction schemes
    4.1 Model error representation
    4.2 Model error parameter estimation
      4.2.1 Ensemble verification criteria
      4.2.2 Maximum a posteriori approach
      4.2.3 Adaptive filtering techniques
      4.2.4 Triple collocation-based estimation
      4.2.5 Summary
    4.3 Satellite SM observation operator
      4.3.1 Profile soil moisture estimation
      4.3.2 Observation rescaling
      4.3.3 Observation error estimation
  5 Forcing and dual correction schemes
  6 Summary and overall approach

Chapter 3 Impacts of observation error structure in SM-DA
  1 Introduction
  2 Study area and data
  3 Methods
    3.1 Rainfall-runoff model
    3.2 EnKF formulation
    3.3 Satellite soil moisture rescaling and observation error estimation
    3.4 Model error estimation
  4 Results and discussion
    4.1 Model calibration
    4.2 Model and observation error estimation
    4.3 Assimilation experiments
  5 Conclusions

Chapter 4 Impacts of observation rescaling in SM-DA
  1 Introduction
  2 Study area and data
  3 Methods
    3.1 Rainfall-runoff model
    3.2 EnKF formulation
    3.3 Model error representation
    3.4 Estimation of SSM and SWI
    3.5 Observation rescaling and error estimation
    3.6 Evaluation of data assimilation results
  4 Results and discussion
    4.1 Model calibration
    4.2 Error model parameter calibration
    4.3 Rescaled SSM and SWI
    4.4 Data assimilation results
      4.4.1 Effects of observation error assumptions in DA results
      4.4.2 Effects of different rescaling in DA results
      4.4.3 Effects of soil moisture product used in DA results
  5 Conclusions

Chapter 5 Lumped vs semi-distributed model configurations
  1 Introduction
  2 Study area and data
  3 Methods
    3.1 Lumped and semi-distributed model schemes
    3.2 EnKF formulation
    3.3 Error model representation
    3.4 Error model parameters calibration
    3.5 Profile soil moisture estimation
    3.6 Rescaling and observation error estimation
    3.7 Evaluation metrics
  4 Results and discussion
    4.1 Model calibration
    4.2 Error model parameters and ensemble prediction
    4.3 SWI estimation and rescaling
    4.4 Satellite soil moisture data assimilation
  5 Conclusions

Chapter 6 Dual correction scheme
  1 Introduction
  2 Study Area and Data
  3 Methods
    3.1 Rainfall-Runoff Model
    3.2 Forcing Correction Scheme
    3.3 State Correction Scheme
      3.3.1 Satellite Soil Moisture Data Processing
      3.3.2 EnKF Formulation
    3.4 Dual Correction Scheme
    3.5 Schemes Evaluation
  4 Results
    4.1 Rainfall Correction
    4.2 Satellite Data Processing
    4.3 Streamflow Prediction Evaluation
  5 Discussion
  6 Conclusions

Chapter 7 Discussion and Conclusions
  1 Challenges in satellite SM data processing for DA
  2 Challenges in model error representation
  3 Main findings
  4 Conclusions
  5 Contributions

Appendix A Publications
References

List of Figures

Chapter 2
  Figure 1 Factors affecting satellite soil moisture retrievals
  Figure 2 DA schematic diagram

Chapter 3
  Figure 1 Warrego river catchment
  Figure 2 Discharge prediction time series
  Figure 3 Rescaled observations
  Figure 4 Assimilation results

Chapter 4
  Figure 1 The Warrego river catchment
  Figure 2 The PDM scheme
  Figure 3 Hydrograph of observed and predicted discharge
  Figure 4 Observed daily runoff ratio
  Figure 5 Rescaled SSM
  Figure 6 Simulated soil moisture and rescaled SSM, SWI
  Figure 7 NRMSD
  Figure 8 Major flood prediction
  Figure 9 Moderate flood prediction

Chapter 5
  Figure 1 The Warrego river basin
  Figure 2 Periods of record of the different data sets
  Figure 3 The PDM scheme
  Figure 4 Simulated and observed daily streamflow
  Figure 5 Lag-correlation between simulated streamflow and θ
  Figure 6 Lag-correlation between simulated streamflow and daily rainfall
  Figure 7 Rank histograms
  Figure 8 Satellite SM data processing results
  Figure 9 T against soil depth found in previous studies
  Figure 10 Streamflow and SM ensemble predictions before and after SM-DA

Chapter 6
  Figure 1 Study catchments and rainfall gauges
  Figure 2 Catchments seasonality
  Figure 3 Semi-distributed schemes within study catchments
  Figure 4 The PDM scheme
  Figure 5 SM-DA schemes
  Figure 6 Daily rainfall histograms
  Figure 7 Mean daily bias in rainfall
  Figure 8 SMART evaluation
  Figure 9 Satellite data processing results
  Figure 10 SM-DA results

List of Tables

Chapter 3
  Table 1 Evaluation metrics SM-DA

Chapter 4
  Table 1 Rescaled soil moisture observations
  Table 2 Rescaled observation error

Chapter 5
  Table 1 Area and mean annual rainfall in study catchments
  Table 2 Model evaluation
  Table 3 T and correlation coefficient between model and observed SM
  Table 4 SM-DA evaluation statistics

Chapter 6
  Table 1 Study catchments characteristics
  Table 2 Statistics of the reference run
  Table 3 Statistics from the models forced with gauged-based rainfall data
  Table 4 Model error parameters calibrated with MAP

Chapter 1 Introduction

Floods have large negative impacts on society, including the destruction of infrastructure and crops, erosion and, in the worst cases, injury or loss of life (Thielen et al., 2009). Moreover, the frequency of floods is increasing worldwide (Sivapalan et al., 2003). In Australia in particular, floods are the most costly natural disaster, with an average annual cost estimated at $377 million (Middelmann-Fernandes, 2009). Flood warning systems are crucial for reducing the tangible and intangible damage to public safety and society. These systems form part of a holistic approach that has gained priority in the political agenda in recent decades (Cloke and Pappenberger, 2009; Werner et al., 2009).

Flood warning systems are generally organised into sub-systems, including operational flood forecasting, warnings to those at risk, and arrangements to communicate warning messages to people and organisations that need the information as a basis for their response (Penning-Rowsell et al., 2000). Operational flood forecasting is commonly provided by a government agency. Each agency has its own set of tools to provide this service, which may include both qualitative and quantitative predictions. Qualitative predictions rely on weather forecasts and the knowledge of historical flood events, while quantitative predictions rely on hydrologic forecasting models (see footnote 1).
Footnote 1: Model prediction refers to the output of the model at the following time step, which is obtained by running the model with input forcing data at the current time step. This concept is similar to model simulation. Model forecast is used as the future prediction (or simulation) of the model that results from running the model with forecast input data at a time greater than the current time step.

These quantitative predictions can be deterministic or probabilistic. Deterministic predictions can use rainfall radar images, rainfall gauge observations or deterministic rainfall forecasts to force hydrological models. Probabilistic predictions (also known as ensemble prediction systems) use an ensemble of forcing data for medium term forecasts (2-15 days). These ensembles aim to represent the forcing prediction uncertainties. The forcing uncertainties have commonly been generated by perturbing the initial conditions and/or the parameters of numerical weather prediction (NWP) models (Cloke and Pappenberger, 2009), although modern NWP models use stochastic physics. Within each of these two types (deterministic and probabilistic streamflow predictions), there are event-based predictions, where usually a team of experts is in charge of initialising and running a model (or transfer function) to forecast the streamflow for the subsequent hours; and continuous predictions, where a calibrated model for a particular catchment continuously predicts the streamflow with a lead time that depends on the input weather forecast used to force the model. Most quantitative flood forecasting systems worldwide started as event-based deterministic predictions. However, given the key information that probabilistic predictions provide for risk assessment and decision making (Beven, 2011; Liu and Gupta, 2007; Robertson et al., 2013), an increasing number of systems, both continuous and event-based, are moving towards ensemble prediction systems (Cloke and Pappenberger, 2009; Krzysztofowicz, 2001).
In the case of Australia, the current event-based deterministic flood forecasting system, managed by the Bureau of Meteorology, is migrating towards an operational continuous deterministic modelling approach. An appropriate probabilistic modelling approach for Australia is still under research. This thesis is framed within this context, and focuses on continuous hydrological models applied to data-scarce regions. In particular (as explained later), I aim to develop effective tools for reducing the uncertainties associated with flood prediction in these areas by exploring the use of remotely sensed hydrological observations.

Hydrological (or rainfall-runoff) models represent the processes operating within a catchment in order to predict the streamflow (including flood events) generated at the catchment outlet. These physical processes, also known as runoff mechanisms, determine the catchment's response to external forcing conditions such as rainfall (volume and intensity), temperature, solar radiation, wind speed and plant transpiration. Some of the key factors influencing these mechanisms are the climate, static land surface features (e.g., soils, topography, landscape, vegetation and water bodies) and the initial conditions of the catchment (e.g., wetness condition or soil moisture), which can be highly heterogeneous (Beven, 2011). A variety of rainfall-runoff models exist, ranging from very complex and strictly physically based models to simpler conceptual models. However, one characteristic that they all share is that the accuracy of their predictions depends on the quality of the data used to force and calibrate them. Therefore, in an operational context within data-scarce regions, where information about catchment characteristics and forcing conditions is limited and of poor quality, flood predictions can suffer from large uncertainties.
So we have, on the one hand, the need for quality flood forecasting systems and, on the other hand, imperfect flood prediction models. The inherent uncertainties of these models must be accurately quantified, since this information is needed for risk assessment and decision making (Liu and Gupta, 2007; Robertson et al., 2013). As a way of reducing model uncertainties, since the early 1990s hydro-meteorologic observations have been used not only as input variables of models but also as information to correct and update model variables, in a process known as data assimilation (DA) (Hreinsson, 2008). DA techniques are particularly useful in the context of flood forecasting because they allow real-time updating of the forecasts as the event proceeds, which constrains the errors in subsequent forecasts (Beven, 2011).

In the context of data scarce regions, it is appealing to explore remotely sensed observations of hydrological variables in DA. Given the key role that soil moisture (SM) has in the catchment's runoff mechanisms (Western et al., 2002), increasing attention has been given to satellite SM microwave retrievals (e.g., Francois et al., 2003; Brocca et al., 2010, 2012a; Alvarez-Garreton et al., 2013, 2014, 2015; Chen et al., 2014). In contrast to in situ SM measurements (which are sparse and correspond to point measurements that do not represent the heterogeneity over an area), satellite SM products provide spatially distributed information on the water contained within the top few centimetres of soil, at a global scale, and at regular and reasonably frequent time intervals. Additionally, satellite SM observations capture non-precipitation effects that are not represented in most hydrological models, such as irrigation (Kumar et al., 2015; Han et al., 2015a).
The existing satellite SM products have shown good agreement with ground data (Albergel et al., 2009; Draper et al., 2009b; Albergel et al., 2010; Gruhier et al., 2010; Brocca et al., 2011; Albergel et al., 2012), and there is an on-going development of satellite missions dedicated to SM estimation (Liu et al., 2011). Although the information provided by these satellites represents only the top few centimetres of soil (with the depth varying among sensors), when adequately processed it can provide valuable information about deeper layer SM (Brocca et al., 2011). Regarding the spatial resolution of these products (greater than 25 km), studies have shown that despite their coarse resolution, these observations can be used to represent catchment scale (>100 km) wetness conditions (Brocca et al., 2012b). Moreover, these products have a revisit time of 1 to 3 days depending on latitude, and the data can be available within 3 h after being observed. This makes them adequate for many hydrological applications, including flood forecasting (Wanders et al., 2014).

A common practice in satellite soil moisture data assimilation (SM-DA) is to combine SM observations and SM predictions (a model state variable) to correct the model SM state and to reduce the random component of its error (e.g., Francois et al., 2003; Brocca et al., 2010, 2012a; Alvarez-Garreton et al., 2013, 2014, 2015; Chen et al., 2014). The rationale is that processed satellite SM might be useful in improving the model's representation of SM, enabling more accurate prediction of the catchment response to rainfall and thus better streamflow estimates. Despite their limitations, these studies have generally shown positive results for reducing streamflow prediction uncertainty. On the other hand, other studies have indicated that the assimilation of satellite soil moisture degrades the streamflow prediction (Parajka et al., 2006; Plaza et al., 2012; Kumar et al., 2014).
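The state-correction idea described above (combining an SM observation with the model SM state, weighted by their respective errors) can be illustrated with a minimal sketch of a stochastic ensemble Kalman filter update for a single scalar state. This is only an illustration under stated assumptions, not the scheme implemented in this thesis: the function name enkf_update, the ensemble size and all numerical values are hypothetical, and the observation is assumed to have already been rescaled to the model's SM units so that the observation operator is the identity.

```python
import random
import statistics


def enkf_update(sm_ensemble, sm_obs, obs_error_var, rng):
    """Stochastic EnKF update of a scalar soil moisture state.

    Assumes the observation maps directly onto the state (H = 1),
    i.e. it has already been rescaled to the model's SM units.
    """
    # Forecast error variance is estimated from the ensemble spread.
    forecast_var = statistics.variance(sm_ensemble)
    # Scalar Kalman gain: weight given to the observation.
    gain = forecast_var / (forecast_var + obs_error_var)
    obs_std = obs_error_var ** 0.5
    # Each member is nudged towards its own perturbed copy of the observation.
    return [
        member + gain * (sm_obs + rng.gauss(0.0, obs_std) - member)
        for member in sm_ensemble
    ]


# Hypothetical numbers: a wet-biased forecast ensemble and a drier observation.
rng = random.Random(0)
forecast = [rng.gauss(0.30, 0.05) for _ in range(500)]
analysis = enkf_update(forecast, sm_obs=0.20, obs_error_var=0.02 ** 2, rng=rng)

print("forecast mean:", round(statistics.mean(forecast), 3))
print("analysis mean:", round(statistics.mean(analysis), 3))
```

In this sketch the analysis ensemble moves towards the observation and its spread shrinks, with the size of the correction controlled by the ratio between the forecast and observation error variances. That dependence is why the quantification of model and observation errors matters so much for SM-DA.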
While still in development, satellite SM-DA state correction may be considered a promising tool for reducing the uncertainty in streamflow predictions. This approach would be especially useful within data sparse regions, where there might be more value in remotely sensed data, given the greater model uncertainties coming from poorer ground data. There are, however, a number of key challenges that need to be addressed to implement such a scheme successfully (in any context, not only within data sparse regions). The challenges include accounting for the depth mismatch between model predictions and satellite observations, representing and quantifying model and observation errors, and setting up a robust data assimilation scheme. In particular, there is still no agreement on the most effective techniques to quantify the model and the observation errors in the current state of the art (Brocca et al., 2012a). Furthermore, there are challenges related to the downscaling processes required given the coarse resolution of satellite soil moisture products, the quality control of satellite soil moisture and the satellite data discontinuity (Sahoo et al., 2013; Ridler et al., 2014; Yin et al., 2014; Han et al., 2015b).

The efficacy of an SM-DA state correction scheme for improving streamflow prediction is restricted by factors such as the inherent model limitations (coming from structural and parameter uncertainties), the errors in forcing data, the experimental setup (e.g., model error quantification, observation error quantification, satellite data processing techniques, data assimilation scheme), and the specific catchment characteristics (e.g., soil type, location and land cover) (Massari et al., 2015). Since the aim of an SM-DA state correction scheme is to reduce the errors in the model soil moisture, the reduction in streamflow uncertainty will depend on the error covariance between the soil water state and the output streamflow.
This error covariance will depend on the relative importance of soil moisture compared with other factors in the runoff generation. In other words, it will depend on the dominant runoff mechanisms within the catchment (such as saturation excess or infiltration excess). Therefore, the error covariance between SM and streamflow may become weak when the errors in streamflow come mainly from errors in the rainfall input data (Crow and Ryu, 2009) or from infiltration capacity estimates, such as in the case of very intense runoff events (Wood et al., 1990).

The contribution of input forcing uncertainty to the streamflow errors becomes critical in ungauged or sparsely monitored locations, where the available rainfall data generally come from satellite products or numerical weather prediction models. Satellite rainfall products feature high temporal resolution, but usually contain important biases and errors (Yong et al., 2013; Zhou et al., 2014; Yong et al., 2015). Recent studies have shown that these errors can potentially be reduced by using satellite SM observations (Pellarin et al., 2008; Crow et al., 2009; Brocca et al., 2013). The argument is that, given the information that the surface SM contains about recent rainfall events, the rainfall can be constrained by satellite SM observations using simple water balance models. Although these studies take different approaches, they have all demonstrated the potential of improving satellite rainfall estimates by using satellite SM.

The demonstration of the potential of SM observations to correct errors in both the model states and the forcing data has motivated recent studies to test dual (state and forcing) correction schemes (e.g., Crow and Ryu, 2009; Chen et al., 2014; Massari et al., 2014).
Due to differences in the main runoff controlling factors, these studies have found that high-flow estimations can be improved by correcting the rainfall forcing data, while low-flow events and baseflow estimations are improved by the correction of initial soil moisture conditions (Crow and Ryu, 2009; Chen et al., 2014). While these results suggest a promising future for the further development of dual correction schemes (which is still an ongoing research field), it remains unclear how they perform for different types of catchments. It is worth noting that the use of soil moisture data to correct (or estimate) rainfall has limitations during intense rainfall events due to the limited information that soil moisture provides when the soil gets saturated (Chen et al., 2014). This may lower the effectiveness of correcting rainfall data for flood forecasting. Moreover, the investigation of effective state and forcing correction SM-DA schemes remains an ongoing research field. Within the context described above, in this thesis I further explore the efficacy of using satellite SM observations to improve flood prediction in data-scarce regions by applying both state and forcing SM-DA correction schemes. I aim to test whether the spatially distributed satellite soil moisture information can improve model predictions via data assimilation techniques. Various options for the components of an effective SM-DA scheme are explored by working within 4 Australian catchments featuring a history of significant flooding. These catchments have very distinct characteristics compared with most of the catchments studied in SM-DA applications. Some of the techniques used during this research are adopted (and adapted) from previous studies and some of them are applied for the first time in SM-DA applications. 
The significance of this research is in evaluating the efficacy of different existing tools for setting up a SM-DA scheme, in presenting new techniques that address some key steps in SM-DA, and in providing (with real data experiments) novel evidence of the efficacy of SM-DA for improving flood prediction in data-scarce regions.

The research is divided into two main parts, each addressing one main research question. To address these two main questions I define 9 sub-questions that target different aspects of the problem. The first (and core) stage focuses on answering the following question: Can we improve flood prediction by correcting the SM state of a rainfall-runoff model via satellite SM data assimilation? To answer this question, I explore several aspects of the assimilation schemes that can affect the efficacy of SM-DA. These aspects include the impacts of different techniques used to process the satellite data and the impacts of accounting for the spatial distribution of forcing data and channel routing. Specifically, I address 5 sub-questions (the information required to understand the concepts behind the questions is presented in Chapter 2):

1. How do the assumed observation error structures affect the efficacy of SM-DA for improving streamflow prediction?
2. What are the impacts of different rescaling techniques (applied to remove systematic biases between the model and the observation) on the efficacy of SM-DA?
3. Acknowledging that rainfall is presumably the main driver of flood generation in semi-arid catchments, can we improve streamflow prediction by correcting the soil water state of the model?
4. What is the impact of accounting for channel routing and the spatial distribution of forcing data on SM-DA performance?
5. What are the prospects for improving streamflow within ungauged catchments using satellite SM?
The second part of this research builds on the first part and focuses on the following research question: Can we further improve flood prediction by correcting both the satellite rainfall forcing data and the model SM state via satellite SM data assimilation? To answer this question, I explore the relative improvement in streamflow prediction coming from correcting the model states (via the state correction scheme used in the first stage) and from correcting the rainfall forcing data (via a forcing correction scheme adopted from the currently available techniques) in a dual state/forcing SM-DA. Within this stage, I answer 4 sub-questions:

6. Can we improve the quality of an operational satellite rainfall product by the assimilation of satellite soil moisture?
7. Does this forcing correction scheme have a positive impact on streamflow prediction?
8. Can we improve streamflow prediction by the assimilation of satellite SM in a state correction scheme?
9. What are the impacts on streamflow prediction of a combined state and forcing correction scheme?

1 Structure of the thesis

This thesis is structured in 7 chapters. In the present chapter I give an overview of how this research is framed, present the problem I am focusing on and its significance, the scope of the research, and the defined research questions. I also provide a detailed explanation of the structure of the thesis document. In Chapter 2 I summarise the background that supports the research questions and provide a description of the methods used to address them. I highlight which methods are currently being used in the literature and which correspond to novel contributions of this thesis. The following 4 chapters address the defined research questions. These chapters are published as peer-reviewed proceedings and articles.
They include their own introduction (with a description of the objectives, scope and research questions), study area, data, methods, discussion and conclusions sections. Complying with the Thesis with Publication instructions from the University of Melbourne (http://gradresearch.unimelb.edu.au/exams/publication.html), the format of these chapters was maintained from their original publication sources.

In Chapter 3 I address sub-question 1 by analysing how different assumptions about the structure of the satellite observation error affect the results of SM-DA in terms of streamflow prediction improvement. I present a real data experiment and test whether the assumed degree of autocorrelation in the satellite observation error has any effect on the streamflow prediction after SM-DA. Although the effects of incorrect observation error assumptions (structure and magnitude) in SM-DA have been previously studied (Crow and van Loon, 2006; Crow and Reichle, 2008; Crow and Van den Berg, 2010; Reichle et al., 2008), I provide new and different conclusions about those effects. I explain the apparent contradiction with previous studies through factors such as the different variable of interest used to evaluate the SM-DA results (previous studies have focused on the updated soil moisture, while the work presented here focuses on the streamflow prediction), the large errors in the streamflow prediction before SM-DA in this case, and the optimality of the rescaling technique used to process the satellite data.

In Chapter 4 I address sub-question 2 by comparing the results of SM-DA state correction schemes using different rescaling techniques to remove the systematic biases between model SM and satellite SM observations. I test several commonly used rescaling techniques and evaluate their effects on the updated streamflow. I also explore different assumptions about the satellite observation error (related to sub-question 1).
The contribution here is twofold: i) I present new evidence of the advantages of assimilating satellite SM for improving flood prediction in a semi-arid sparsely gauged catchment and ii) I provide insights into the effects of different existing techniques for processing the satellite SM data on SM-DA results.

In Chapter 5 I address sub-questions 3, 4 and 5 by setting up a SM-DA state correction scheme and comparing the results between lumped and semi-distributed model schematisations. With this comparison, I assess the effects of accounting for the spatial distribution of forcing data and routing processes within a large study catchment. I also evaluate the efficacy of SM-DA for improving flood prediction at ungauged sub-catchments. This work introduces techniques which have not been applied in previous SM-DA studies, related to model error representation and satellite SM data processing (observation error estimation and rescaling). The contributions here include the presentation of these novel techniques in a SM-DA context, the provision of new evidence of the efficacy of satellite SM-DA for improving ensemble streamflow predictions in a sparsely instrumented catchment (and for ungauged sub-catchments), and a demonstration that SM-DA skill is enhanced when the spatial distribution of forcing data and routing processes is accounted for.

In Chapter 6 I address the last 4 sub-questions by comparing the relative skills of a state correction SM-DA scheme, a forcing correction SM-DA scheme and a combined state/forcing correction SM-DA scheme. The contribution here is to provide further evidence of the value of satellite soil moisture within data sparse regions. I demonstrate that the quality of the satellite rainfall product is improved by the forcing correction scheme during mean-to-high daily rainfall events, which in turn leads to an improved streamflow prediction during high flows.
I also show that for most catchments, the state correction scheme outperforms the forcing correction scheme, especially during low flows. In agreement with previous studies, I show that overall, the combined dual correction scheme further improves the streamflow predictions. I also identify a number of challenges and limitations within the proposed schemes.

In Chapter 7 I provide an overall discussion of the different findings of the thesis, summarise the learning throughout the research and give recommendations based on the limitations I have found. I highlight the main contributions and significance of the thesis, and present ideas for future work. Finally, in Appendix A I provide a list of the publications and conference presentations produced during the research that supports this thesis.

Chapter 2 Background

In Chapter 2 I take the reader through the background that supports my two main research questions. I provide a general review of the literature that puts my research within an overall context. To minimise repetition with the background provided in the subsequent chapters, the detailed explanations of the methods used in the thesis are cross-referenced to the corresponding chapters.

To set up the soil moisture data assimilation (SM-DA) schemes, several key steps must be addressed. This requires knowledge and understanding of the different components of the SM-DA scheme, including satellite SM retrieval techniques and their associated errors, various DA implementation steps (such as model and observation error representation, the required satellite data processing, the updating technique, etc.), and the hydrological model used. In the following sections I cover this required background and highlight the limitations and on-going research within each topic.

1 Microwave remote sensing of soil moisture

Remote sensing measures the radiation emitted and reflected from the Earth’s surface and received at the satellite sensor.
Each object has a unique combination of reflected, emitted, transmitted and absorbed radiation, which forms its spectral signature. Radiometry is the measurement of this electromagnetic radiation, which is a consequence of an object’s material characteristics; in the microwave region these depend on its dielectric properties and temperature (Schmugge et al., 2002; Sharkov, 2003; Campbell and Wynne, 2011). Within the electromagnetic spectrum, the L- to X-band range (1 to 10 GHz) of the microwave portion has been used for soil moisture sensing. Soil moisture retrieval techniques rely on the dielectric behaviour of liquid water, which has a substantially greater dielectric constant than dry soil and hence a different radiative response. This difference is due to the electric dipole of water molecules, which responds to the applied electromagnetic field. These highly polarised molecules have high dielectric constants in the lower-frequency part of the microwave region of the spectrum (about 80 at L-band frequency), in contrast to dry soils, which show low dielectric constants (about 3-4 at L-band) (Schmugge, 1978; Carver et al., 1985; Schmugge, 1983; Engman and Chauhan, 1995). As a result, water has high reflectivity and low emissivity in this region; an increase in soil water content therefore produces higher backscatter measurements for radars (active sensors) and lower brightness temperatures for radiometers (passive sensors). Active sensors generally have higher spatial resolutions, but their ability to measure soil moisture is significantly more affected by surface roughness, topographic features and vegetation than that of passive sensors (Engman and Chauhan, 1995; Engman, 2000; Schmugge et al., 2002). Although the relationship between brightness temperature and soil moisture has a strong theoretical basis, most algorithms used for quantifying this relation are empirical and rely on ground data (Engman and Chauhan, 1995).
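The link between dielectric constant, reflectivity and emissivity described above can be illustrated with a rough calculation. The sketch below is not a retrieval algorithm used in this thesis; it assumes an idealised smooth, bare surface viewed at nadir, a purely real dielectric constant, and the Fresnel reflectivity for that case. The function name and the two end-member values (80 for water, 3.5 as a midpoint of the 3-4 range for dry soil) are chosen only for illustration; a real soil lies somewhere between these end-members.

```python
import math


def nadir_emissivity(dielectric_constant: float) -> float:
    """Emissivity of an idealised smooth surface at nadir.

    Assumes a real-valued dielectric constant and uses the Fresnel
    reflectivity at normal incidence: R = ((1 - sqrt(eps)) / (1 + sqrt(eps)))^2.
    Emissivity is then 1 - R.
    """
    n = math.sqrt(dielectric_constant)
    reflectivity = ((1.0 - n) / (1.0 + n)) ** 2
    return 1.0 - reflectivity


# Illustrative end-members taken from the approximate L-band values in the text.
EPS_WATER = 80.0
EPS_DRY_SOIL = 3.5

e_water = nadir_emissivity(EPS_WATER)
e_dry = nadir_emissivity(EPS_DRY_SOIL)

print(f"emissivity, water end-member:    {e_water:.2f}")
print(f"emissivity, dry soil end-member: {e_dry:.2f}")
```

Under these assumptions the water end-member has a much lower emissivity than the dry soil end-member, which is the contrast that passive sensors exploit as lower brightness temperatures over wetter soil.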
Each microwave sensor uses its own retrieval technique, depending on the sensor type, frequency observed, the associated microwave penetration depth, the sensor’s polarisation, and antenna scanning configurations (Njoku and Entekhabi, 1996). These techniques are based on theoretical and/or empirical models and are under continuous development (Ahmad et al., 2010). The accuracy of each algorithm relies on the identification and quantification of the interaction between soil moisture content, soil texture, surface roughness and vegetation cover (Jackson and Schmugge, 1989; Njoku and Entekhabi, 1996; Engman and Chauhan, 1995; Engman, 2000).

The principal factors affecting soil moisture retrievals for passive sensors are shown in a simplified scheme in Figure 1. The scheme shows that the radiation received at an antenna (which comes from different sources) is transformed into brightness temperature through radiometric techniques. The brightness temperature is then related to physical properties of the observed scene through different physically based models and empirical relationships that enable the identification of the different factors affecting the emitted signal received from the scene, in order to finally estimate specific variables. Each of the procedures shown in Figure 1 has associated uncertainties that must be accounted for when using the final product.

In this thesis I am a final user of microwave remote sensing soil moisture products and I will not go into further details of the data processing leading to these products. In particular, I use soil moisture products from one active and two passive satellite sensors with different retrieval algorithms (details are provided in Chapters 3, 4, 5 and 6). It is, however, important to recognise that the penetration depth of the microwave signals is only a few centimetres, thus the soil moisture estimates represent the top soil layer.
The spatial resolution of these products is coarse (greater than 25 km); however, it has been shown that these observations can be used to represent catchment scale (>100 km) wetness conditions (Brocca et al., 2012b). The revisit time of these satellites is 1 to 3 days, depending on latitude, and the data can be available within 3 h of being observed. This makes them adequate for many hydrological applications, including flood forecasting (Wanders et al., 2014).

Figure 1: Factors affecting satellite soil moisture retrievals. Diagram labels: Physical scenario observed; Microwave remote sensor; Radiative transfer model; Soil dielectric model; SM product; "TB is measured and instrumental error is added"; "TB is converted to the soil TB and then to the soil (plus water) dielectric constant".

2 Hydrologic data assimilation

Hydrologic (or rainfall-runoff) models conceptualise the streamflow generation processes within a catchment based on a specific structure, set of parameters and input data. Depending on the model, different fluxes and states influencing the streamflow are estimated, such as net rainfall, soil moisture, fast surficial flows, interflow, baseflow, groundwater, etc. The models are a simplified representation of a real system, therefore their predictions are prone to errors (Beven, 2011). Moreover, they also rely on the quality of the data used to force them and to calibrate their parameters (especially when parameters are not physically measurable), which becomes critical in the context of data-scarce regions.

A popular approach to reduce the random errors in hydrological models is data assimilation (DA). DA techniques use observations to inform and correct specific model components (such as model states or parameters). As an example, Figure 2 shows a simplified schematic of a Kalman-based DA procedure (see below for the Kalman filter description).
Figure 2: Kalman-based DA schematic diagram: when an observation is available (black point), the corresponding value simulated by the model (white point) is corrected and an updated value is calculated (gray point) (Aubert et al., 2003).

DA techniques are based on Bayes’ theorem, which establishes that for two events $A$ and $B$, with probability of occurrence $P(A)$ and $P(B)$, respectively, the probability that both events occur is given by

$$P(A \cap B) = P(A \mid B) \cdot P(B) = P(B \mid A) \cdot P(A) \tag{1}$$

where $P(A \mid B)$ is the conditional probability of occurrence of event $A$ given that $B$ has occurred. Eq. 1 leads directly to Bayes’ theorem, where the marginal probabilities $P(A)$ and $P(B)$ are referred to as prior probability density functions, and the conditional probability $P(B \mid A)$ is referred to as the posterior density function:

$$P(B \mid A) = \frac{P(A \mid B) \cdot P(B)}{P(A)} \tag{2}$$

Applying Bayes’ theorem to solve the DA problem, the posterior probability distribution $P_{post}$ of the quantity of interest, $x$, given a set of uncertain observations, $z$, can be defined as

$$P_{post}(x \mid z) = \frac{P(z \mid x) \cdot P_{prior}(x)}{P(z)} \tag{3}$$

Based on this Bayesian estimation of the posterior distribution, a Kalman filter (KF) (Kalman, 1960) can be formulated to reduce the errors in hydrologic models. The KF and its derivations, along with particle filters and variational assimilation techniques (Liu and Gupta, 2007), are the most commonly used DA schemes in hydrology. In the following, I provide a brief description of these 3 approaches for the particular case of a state correction framework (the framework implemented in this research, as explained in Section 3).

1. The KF updating scheme determines state corrections (updating step) based on the model and observation errors, which are assumed to be independent Gaussian errors. As a simple Bayesian formulation, the KF estimates a posteriori state values ($P_{post}$ in Eq.
3) recursively over time by using incoming observations and a linear background model to propagate the state variable. The KF compares $P_{post}(x)$ and $P_{prior}(x)$ (the state variable with and without the observation information) and minimises the expected value of the square magnitude of their difference (i.e., the KF is a minimum mean-square error estimator). The main limitation of this filter is that it requires linearity of the system for propagating the model errors from one time step to another (Evensen, 1994). Since hydrological processes are highly non-linear, variations of the Kalman filter have been developed to deal with this limitation, including the Extended Kalman Filter (EKF) and the Ensemble Kalman Filter (EnKF). The EKF performs a local linear approximation for propagating the model error covariance matrix. This algorithm has had some successful applications in hydrology, although it can create instabilities or divergences due to the linear approximation of non-linear processes (Clark et al., 2008). The EnKF is a Monte Carlo approximation of the traditional KF that non-linearly propagates a finite ensemble of model trajectories (Reichle et al., 2002). This filter was introduced by Evensen (1994) as an alternative to the EKF to deal with the limitations of the linear approximations within strongly non-linear systems. The Monte Carlo method consists of using a large cloud of possible realisations to represent a specific probability density function. In hydrological models, these ensembles are usually generated by perturbing the model state variables, parameters and/or forcing data with mean-zero Gaussian noise (Ryu et al., 2009). The results obtained by Evensen (1994) using the EnKF were better than the ones obtained in previous studies using the EKF, which was reflected in better quality of the forecast error statistics and lower calculation times.
The EnKF uses the updating equation of the traditional KF, but with the Kalman gain (Eq.14) calculated based on the relative magnitude of the error covariances of the model and observations (Burgers et al., 1998). As the error covariance information in the model is propagated by a Monte Carlo ensemble, the EnKF can represent almost any type of model error (Crow and Van den Berg, 2010). A limitation of the EnKF is that its Gaussian error assumption does not hold in non-Gaussian earth system models. In addition, the EnKF cannot conserve the water balance and it may push the state variables to physically meaningless values (Li et al., 2012; Moradkhani et al., 2012). The formulation of an EnKF-based DA scheme has been widely detailed in the literature (e.g., Burgers et al., 1998; Evensen, 1994; Reichle et al., 2002). In general, when an observation is available, an ensemble of observations ($\theta^{obs}$) is created by perturbing the observed time series with a particular observation error (the structure and magnitude of this error must be estimated for a specific application). Then, each member $i$ of the ensemble prediction of the variable of interest ($\theta$, an ensemble of a model state in this example) is updated by

$$\theta_i^{+}(t) = \theta_i^{-}(t) + K(t) \cdot \left(\theta_i^{obs}(t) - H\theta_i^{-}(t)\right). \tag{4}$$

The superscripts “−” and “+” denote the state prediction before and after the assimilation step, respectively. $H$ is an operator that transforms the model state to the measurement space. When the observations are processed to represent the model space before the DA step (as is consistently done in this thesis, see Section 4.3 and Chapters 3 to 6), $H$ reduces to the identity matrix. $K$ is the Kalman gain, calculated for each time step as

$$K(t) = \frac{P^{-}(t)}{P^{-}(t) + R(t)}, \tag{5}$$

where $R(t)$ is the $\theta^{obs}$ error variance and $P^{-}(t)$ is the error covariance of the model state. This error covariance can be estimated at each time step based on the state
ensemble mean, $\bar{\theta}^{-}(t)$, as

$$P^{-}(t) = \frac{1}{N-1} \left[\theta^{-}(t) - \bar{\theta}^{-}(t)\right] \cdot \left[\theta^{-}(t) - \bar{\theta}^{-}(t)\right]^{T}, \tag{6}$$

where $N$ is the number of ensemble members.

2. Particle filters (PF) are based on updating the probability density function (PDF) of the model states. For this purpose, the posterior probability distribution of the model states is drawn by discrete random sampling of particles with associated weights. When an observation becomes available, the weights of the particles are evaluated and updated. In this sense, while the Kalman filters deal directly with the states of the model, particle filters update the particle weights, and can therefore simultaneously update different components of the model (associated with the updated particle). For this reason, the technique has been used widely for simultaneously updating model states and parameters (Liu and Gupta, 2007; Hreinsson, 2008). PF are not limited to Gaussian PDFs and they can be applied to linear and non-linear models. However, they require a large number of members to avoid particle collapse, which makes them computationally expensive (van Leeuwen, 2010). Weerts and El Serafy (2006) undertook a comparative analysis of EnKF and PF for state updating with hydrological models and concluded that the EnKF scheme was more suitable for flood forecasting. On the other hand, Dechant and Moradkhani (2011) and DeChant and Moradkhani (2012) carried out comprehensive studies to test the effectiveness and robustness of the EnKF and PF in hydrologic forecasting and concluded that the PF is superior to the EnKF. The PF shows better results since it can relax the Gaussian assumption and keep the water balance intact (Moradkhani et al., 2005a, 2012). Also, Matgen et al. (2010) used the PF for flood forecasting/inundation.

3. 
Variational data assimilation techniques, in contrast to Kalman filters and particle filters, which follow a sequential approach, operate on a batch basis over a time window that contains the observed data. These techniques in general minimise a cost function constructed by an aggregation of errors from different sources (model structure, observations, initial conditions, inputs and parameters) over the entire assimilation window, assuming that errors are independent and additive. These methods are very well suited for smoothing problems (characterising variables at past times), but can also be implemented for filtering problems if the smoothing scheme is defined sequentially each time new observations arrive. The disadvantage of this technique for real time applications is that it can be computationally inefficient (Liu and Gupta, 2007).

Given the advantages and disadvantages of the DA techniques summarised above within the context of flood prediction, in this research I adopt an EnKF-based approach. In particular, I implement a state correction scheme in which the soil moisture state of a hydrological model is updated by using satellite observations (details in Section 3).

3 Satellite soil moisture data assimilation (SM-DA)

Within the context of data scarce regions, remotely sensed observations of hydrological variables are appealing datasets to use in DA. These observations provide temporally and spatially distributed information about hydrological variables in areas where there is little or no ground information, complementing standard measurements of rainfall, soil water content, evapotranspiration, snow cover, vegetation cover, topography, water quality, areas of groundwater recharge and discharge, etc. (Engman, 1996; Ritchie and Rango, 1996).
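Returning briefly to the EnKF update adopted above (Eqs. 4 to 6), a minimal sketch for a single scalar soil moisture state may help make the steps concrete. It is an illustration under stated assumptions rather than the implementation used in this thesis: the state is a scalar, the observation has already been mapped to model space (so H is 1), the observation error is Gaussian with a known variance, and the function name, ensemble size and numerical values are hypothetical.

```python
import numpy as np


def enkf_update(theta_prior, obs, obs_error_var, rng):
    """Update a scalar-state ensemble with one observation (H = 1).

    theta_prior   : 1-D array with the background (prior) ensemble
    obs           : observed value, already expressed in model space
    obs_error_var : observation error variance R (assumed known)
    rng           : numpy random Generator used to perturb the observation
    """
    theta_prior = np.asarray(theta_prior, dtype=float)
    n_members = theta_prior.size

    # Eq. 6 for a scalar state: sample variance of the ensemble about its mean.
    p_prior = np.var(theta_prior, ddof=1)

    # Eq. 5: Kalman gain from the relative size of model and observation errors.
    gain = p_prior / (p_prior + obs_error_var)

    # Ensemble of perturbed observations, one per member.
    obs_ensemble = obs + rng.normal(0.0, np.sqrt(obs_error_var), size=n_members)

    # Eq. 4: update each member towards its perturbed observation.
    theta_post = theta_prior + gain * (obs_ensemble - theta_prior)
    return theta_post, gain


# Hypothetical example: a prior ensemble that is wetter than the observation.
rng = np.random.default_rng(42)
prior = rng.normal(0.30, 0.05, size=500)
posterior, k = enkf_update(prior, obs=0.20, obs_error_var=0.03**2, rng=rng)

print(f"Kalman gain: {k:.2f}")
print(f"prior mean / spread:     {prior.mean():.3f} / {prior.std(ddof=1):.3f}")
print(f"posterior mean / spread: {posterior.mean():.3f} / {posterior.std(ddof=1):.3f}")
```

In this toy case the posterior ensemble mean moves towards the observation and the ensemble spread shrinks, with the size of both changes controlled by the gain. The sketch does not clip the updated values to physical bounds, which a real application would need to consider.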
The successful use of satellite data is, however, subject to the reliability of the retrieval techniques used for determining the hydrologic variables (Stewart et al., 1996), which will largely determine the assessment of the observation errors. The use of satellite soil moisture observations in hydrologic DA in particular has been increasingly explored (e.g., Francois et al., 2003; Brocca et al., 2010, 2012a; Chen et al., 2014). This is for three main reasons: 1) SM is a key controlling factor in the runoff response of a catchment, influencing different processes including evaporative fluxes, infiltration and percolation, surface runoff, interflow, and groundwater recharge (Engman and Chauhan, 1995; Schultz and Engman, 2000; Western et al., 2002; Njoku et al., 2003; Jia et al., 2009); 2) in-situ measurements of SM are scarce and they provide point information that does not represent the heterogeneity over an area; and 3) there is on-going development of satellite missions dedicated to SM estimation (Liu et al., 2011).

Sections 4 and 5 below review the two main approaches to satellite soil moisture data assimilation (SM-DA). The first and most popular approach is the use of satellite SM to correct the soil water states of hydrological models (state correction SM-DA). The second and more recent approach is the use of satellite SM to correct satellite rainfall observations, which can then provide better forcing data for hydrological models (forcing correction SM-DA).

4 State correction schemes

The studies that investigate the use of satellite SM to correct SM states of models can be broadly categorised into two main groups. The first group has mostly worked with land surface models and has focused on improving surface SM or root-zone SM estimation (e.g., Crow and van Loon, 2006; Crow and Reichle, 2008; Crow and Van den Berg, 2010; Reichle et al., 2008; Ryu et al., 2009).
The second group (where this thesis fits) has focused on the improvement of streamflow prediction from rainfall-runoff models (Francois et al., 2003; Brocca et al., 2010, 2012a; Chen et al., 2014; Wanders et al., 2014). This section describes the main challenges of SM-DA within the latter group.

Improvements in streamflow predictions investigated by studies in the second group are not exclusively influenced by better representation of SM. The rationale here is that satellite SM can be used to correct the SM model prediction, enabling more accurate prediction of the catchment’s response to rainfall and thus better streamflow estimates. The efficacy of a SM-DA state correction scheme is therefore influenced by the particular runoff mechanisms which occur within the catchment (Alvarez-Garreton et al., 2015). Since SM-DA aims to reduce the errors in the model soil moisture, the reduction in streamflow uncertainty will depend on the error covariance between soil moisture and runoff. This error covariance (which in the model space will be defined by the representation of the different sources of uncertainty) may become marginal when the errors in streamflow come mainly from errors in rainfall input data (Crow and Ryu, 2009). This physical constraint is case specific and determines the potential skill of SM-DA for improving streamflow prediction.

An important challenge within this approach is that the satellite SM estimates represent only the top few centimetres of soil (given the microwave penetration depths, see Section 1) whereas hydrological models generally represent the water content of deeper soil layers (Parajka et al., 2006). These different SM representations must be addressed since data assimilation schemes reduce random state errors by combining predictions and observations of the same physical variable.
Addressing this problem, several studies have related the top soil layer moisture with deeper layers by making various assumptions regarding the vertical distribution of water content or by using land surface models (Parajka et al., 2006; Draper et al., 2009a; Meier et al., 2011). Additionally, many hydrological models have state definitions that differ from measured soil moisture. Consequently, the studied model has to be carefully analysed and observation operators must be determined to relate satellite measurements with model states in order to incorporate these observations in a consistent way (Barrett and Renzullo, 2009).

Despite their limitations, state correction SM-DA applications have generally been shown to reduce streamflow prediction uncertainty (e.g., Francois et al., 2003; Brocca et al., 2010, 2012a). Nevertheless, there are key challenges that need to be addressed to implement such schemes and there is still no agreement on the most effective techniques to do so (Crow and Van den Berg, 2010). These challenges include the adequate representation of model errors; the estimation of those errors; the implementation of observation operators that relate the satellite observations with the model states in a consistent way; and the estimation of the satellite observation errors. Sections 4.1 to 4.3 provide a review of the techniques used to address each of these challenges.

4.1 Model error representation

The main sources of uncertainty in hydrologic models come from the errors in the forcing data, the model structure and model parameters (Liu and Gupta, 2007). In SM-DA applications, an adequate representation and estimation of these errors is critical since it determines the value of the Kalman gain (Eq.14).
Moreover, the improvement in streamflow prediction coming from a better representation of SM relies on the covariance between the errors in SM states and the modelled streamflow, which directly depends on the specific representation and estimation of the model errors.

Most SM-DA studies are based on overly simplistic error models (Crow and Van den Berg, 2010). The general practice is to capture the net effect of multiple error sources in an aggregated way by adding unbiased synthetic noise to forcing variables, model state variables and/or model parameters (Reichle et al., 2008; Ryu et al., 2009; Brocca et al., 2010; Crow and Van den Berg, 2010; Chen et al., 2011; Brocca et al., 2012a; Hain et al., 2012).

To represent the forcing uncertainty within a Monte-Carlo approach, the different input data sets used by the model can be perturbed, such as temperature, potential evapotranspiration and rainfall. In the experiments carried out throughout this research, only rainfall was perturbed. This choice was made for two reasons: 1) to minimise the number of unknown error parameters and 2) because rainfall is the most critical forcing data affecting the streamflow generation within the study catchments. It should be noted that the potential evapotranspiration (PET) also plays an important role in streamflow generation, especially within the semi-arid catchments used in this study. Although this input was not explicitly perturbed to represent the forcing uncertainty, the errors in the actual evapotranspiration (calculated based on PET and the SM state of the model) were implicitly accounted for when the SM state of the model was perturbed (see model structure error below).

Regarding the uncertainty in rainfall data, this error is generally represented by a multiplicative error (McMillan et al., 2011; Tian et al., 2013).
In particular, various SM-DA studies (e.g., Brocca et al., 2012a; Chen et al., 2011) have represented the rainfall error ($\epsilon_p$) as

$$\epsilon_p \sim \ln N(1, \sigma_p^2), \tag{7}$$

where $\sigma_p$ is the standard deviation of the lognormal distribution.

The parameter uncertainty can be represented by perturbing selected model parameters. The structure of these errors will depend on the nature of the parameter and the assumptions made in each application. The justification of the specific rainfall error and parameter error structure adopted in each experiment of the thesis is provided in the corresponding methodology section of Chapters 3 to 6.

Within the context of SM-DA applications, the model structural error is usually represented by perturbing the SM state of the model (the same variable that is updated in the DA scheme). This error is commonly assumed to be a spatially homogeneous additive random error ($\epsilon_s$) (e.g., Chen et al., 2011; Crow and Van den Berg, 2010; Hain et al., 2012; Reichle et al., 2008):

$$\epsilon_s \sim N(0, \sigma_s^2), \tag{8}$$

where $\sigma_s$ is the standard deviation of the normal distribution. This type of perturbation, however, should be carefully implemented. Since the physical limits of SM (porosity as an upper bound and residual water content as a lower bound) are represented in the model space by the corresponding storage capacity, when the model SM prediction approaches the limits of this storage, applying an unbiased perturbation to SM can lead to a truncation bias in the background prediction. This can result in mass balance errors and degrade the performance of the DA scheme. Moreover, the Kalman filter assumes unbiased state variables (Ryu et al., 2009). This issue is of particular importance in arid regions like the study area (see area description in Chapters 4 to 6), where the soil water content can be rapidly depleted by evapotranspiration and transmission losses, thus approaching the residual water content of the soil. Addressing this issue, Ryu et al.
(2009) proposed a truncation bias correction that consists of running a single unperturbed model prediction ($\theta_0^-$) in parallel with the perturbed model prediction ($\theta_i^-$). At each time step, the mean bias of an $N$-member ensemble prediction, $\delta(t)$, is calculated by subtracting $\theta_0^-(t)$ from the ensemble mean:

$$\delta(t) = \frac{1}{N}\sum_{i=1}^{N}\theta_i^-(t) - \theta_0^-(t). \tag{9}$$

Then, a bias corrected ensemble of state variables, $\tilde{\theta}_i^-(t)$, is obtained by subtracting $\delta(t)$ from each member of the perturbed ensemble, $\theta_i^-(t)$.

This truncation bias correction ensures unbiased state ensembles; however, some important but subtle effects remain that arise from the non-linear, bounded nature of hydrologic models. Representing model errors by adding unbiased perturbation to forcing, model parameters and/or model states can lead to a biased streamflow ensemble prediction (e.g., Plaza et al., 2012; Ryu et al., 2009), compared with the unperturbed model run. This biased streamflow ensemble prediction (hereinafter referred to as the “open-loop”) is degraded compared with the streamflow predicted by the unperturbed calibrated model. As a consequence, improvement of the open-loop after SM-DA will in part be due to the correction of bias introduced during the assimilation process itself. This issue of bias in the streamflow open-loop, introduced through perturbation of forcing, parameters and states, has not been explicitly treated in previous SM-DA applications. In Chapters 5 and 6, I address this by examining whether the bias correction proposed by Ryu et al. (2009) can be used to correct the bias issue in the streamflow open-loop.

4.2 Model error parameter estimation

It has been shown that inappropriate assumptions about the magnitude of model errors result in sub-optimal performance of the DA scheme and in the degradation of the updated predictions (Crow and van Loon, 2006; Crow and Reichle, 2008; Reichle et al., 2008).
Nevertheless, most SM-DA studies adopt error structures for the different perturbations (such as the ones presented in Section 4.1) and then assume an arbitrary magnitude for those errors, without examining the validity of either assumption. In the following sections, I describe the most commonly used methods to estimate model error parameters within the SM-DA context.

4.2.1 Ensemble verification criteria

When a hydrological model is perturbed to generate an open-loop streamflow ensemble ($Q_{ol}$) that accounts for different sources of uncertainties, the characteristics of that ensemble should be examined to determine if it is a reliable estimate of the uncertainty. There are some verification methods (commonly used in meteorology) to measure reliability or consistency of ensembles, based on the observed streamflow ($Q_{obs}$) (De Lannoy et al., 2006). For example, if the ensemble has enough spread, the temporal average (expressed by an overbar) of the ensemble skill ($sk_t$) should be similar to the temporal average of the ensemble spread ($sp_t$), i.e., $\overline{sk}/\overline{sp} = 1$ (Brocca et al., 2012a; De Lannoy et al., 2006), where:

$$\overline{sk} = \frac{1}{T}\sum_{t=1}^{T}\left(\overline{Q}_{ol}(t) - Q_{obs}(t)\right)^2, \tag{10}$$

$$\overline{sp} = \frac{1}{T}\sum_{t=1}^{T}\left[\frac{1}{N}\sum_{i=1}^{N}\left(Q_{ol}(i,t) - \overline{Q}_{ol}(t)\right)^2\right], \tag{11}$$

and $\overline{Q}_{ol}(t)$ is the ensemble mean at time $t$. Additionally, if the observation is indistinguishable from a member of the ensemble, the ratio between $\sqrt{\overline{sk}}$ and the square root of the ensemble mean squared error ($\overline{mse}$), normalised by $\sqrt{(N+1)/2N}$, should be equal to one (Brocca et al., 2012a; Moradkhani et al., 2005b), where:

$$\overline{mse} = \frac{1}{T}\sum_{t=1}^{T}\left[\frac{1}{N}\sum_{i=1}^{N}\left(Q_{ol}(i,t) - Q_{obs}(t)\right)^2\right]. \tag{12}$$

The above discharge ensemble verification criteria have been used in previous SM-DA applications (e.g., Brocca et al., 2012a). The limitation here is that they assume that the observed discharge has no error (or very small error compared with the model error), which could lead to overestimation of the model error parameters.
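The verification criteria of Eqs. 10 to 12 can be illustrated with a short sketch. This is only a minimal example under stated assumptions, not the implementation used in the thesis: the function name `ensemble_verification` and the data layout (one list of ensemble members per time step) are hypothetical, and the second criterion is computed as the ratio of square roots normalised by $\sqrt{(N+1)/2N}$.

```python
import math


def ensemble_verification(q_ol, q_obs):
    """Ensemble verification statistics following Eqs. 10 to 12.

    q_ol  : list of T lists, each holding the N open-loop ensemble members
            at one time step (hypothetical layout chosen for this sketch).
    q_obs : list of T observed streamflow values.
    """
    T = len(q_obs)
    N = len(q_ol[0])
    sk = sp = mse = 0.0
    for members, obs in zip(q_ol, q_obs):
        ens_mean = sum(members) / N
        sk += (ens_mean - obs) ** 2                             # Eq. 10 term
        sp += sum((q - ens_mean) ** 2 for q in members) / N     # Eq. 11 term
        mse += sum((q - obs) ** 2 for q in members) / N         # Eq. 12 term
    sk, sp, mse = sk / T, sp / T, mse / T

    # Criterion 1: skill and spread should be similar (ratio close to 1).
    spread_ratio = sk / sp
    # Criterion 2: normalised ratio should be close to 1 when the observation
    # is indistinguishable from an ensemble member.
    norm_ratio = math.sqrt(sk / mse) / math.sqrt((N + 1) / (2 * N))
    return {"sk": sk, "sp": sp, "mse": mse,
            "spread_ratio": spread_ratio, "norm_ratio": norm_ratio}


# Tiny illustrative data set: T = 2 time steps, N = 2 members.
stats = ensemble_verification([[1.0, 3.0], [2.0, 4.0]], [2.0, 4.0])
print(stats)
```

In a calibration setting, the error parameters (for example $\sigma_p$ and $\sigma_s$) would be adjusted until both ratios approach one; the sketch does not guard against a zero spread or zero mean squared error.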
4.2.2 Maximum a posteriori approach

The open-loop ensemble prediction can also be evaluated under a Bayesian inference procedure in order to maximise the probability of having the streamflow observation within the open-loop streamflow. Wang et al. (2009) detailed such a procedure, called a maximum a posteriori (MAP) scheme, which maximises the probability of observing historical events given the model and error parameters. The description and equations to implement MAP are provided in Chapter 5. It should be noted that this approach has not been applied in previous SM-DA studies.

4.2.3 Adaptive filtering techniques

Another approach found in the literature to estimate the magnitude of model errors is the use of adaptive filtering techniques (Crow and Reichle, 2008; Reichle et al., 2008). These techniques evaluate normalised filter innovations in the updating step, $\nu$, to correct the assumed model and observation errors:

$$\nu(t) = \frac{\theta_{obs} - \theta^-}{\sqrt{P^-(t) + R(t)}}, \tag{13}$$

where $P^-$ is the background error variance in the model state $\theta$ and $R$ is the error variance of the observation $\theta_{obs}$. In the KF theory, correct assumptions about the magnitudes of the model and observation errors should result in serially uncorrelated filter innovations (white noise) and the normalised innovations should have unit variance. While these techniques have a strong theoretical basis, the convergence of model and observation error estimations is slow (simulation periods of more than 5 years are required) and they are based on the assumption that observation errors are uncorrelated, which is probably not the case with satellite soil moisture observations (Crow and Van den Berg, 2010).
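The innovation diagnostics described above can be sketched as follows. This is a hedged illustration rather than any published adaptive filter: the helper names are hypothetical, the inputs are assumed to be scalar time series of equal length, and the lag-1 autocorrelation is used here as one simple (assumed) check of serial correlation.

```python
import math


def normalised_innovations(theta_obs, theta_bg, p_bg, r_obs):
    """Normalised filter innovations following Eq. 13.

    theta_obs : observations; theta_bg : background (forecast) states;
    p_bg      : background error variances; r_obs : observation error variances.
    """
    return [(o - b) / math.sqrt(p + r)
            for o, b, p, r in zip(theta_obs, theta_bg, p_bg, r_obs)]


def innovation_diagnostics(nu):
    """Return (variance, lag-1 autocorrelation) of the innovation sequence.

    With well-chosen error variances the variance should be close to 1
    and the lag-1 autocorrelation close to 0.
    """
    n = len(nu)
    mean = sum(nu) / n
    var = sum((v - mean) ** 2 for v in nu) / n
    lag1 = sum((nu[k] - mean) * (nu[k + 1] - mean) for k in range(n - 1)) / (n * var)
    return var, lag1


# Illustrative values only.
nu = normalised_innovations([1.0, 0.0, 1.0, 0.0], [0.5] * 4, [0.16] * 4, [0.09] * 4)
print(innovation_diagnostics(nu))
```

A variance well above one would suggest that the assumed $P^- + R$ is too small, and a variance well below one that it is too large; a strongly non-zero autocorrelation points to a mis-specified balance between model and observation errors or to autocorrelated observation errors.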
4.2.4 Triple collocation-based estimation

To overcome the limitations of adaptive filtering techniques, Crow and Van den Berg (2010) proposed a new approach that used an estimated observation error variance ($\hat{R}$, coming from a triple collocation analysis, see Section 4.3.3) to constrain the unit-variance restriction of $\nu$ (Eq. 13) and estimate the model error variance $P$. Within the limitations regarding the simplified representation adopted for the model error structure, Crow and Van den Berg (2010) showed an improved estimation of model error in the presence of autocorrelated observation errors, compared with adaptive filtering techniques.

4.2.5 Summary

Given the determinant role that the estimation of model errors plays in SM-DA, in this thesis I strive to avoid arbitrary assumptions about the magnitude of these errors. For this I apply different techniques. In Chapters 3 and 4, I use the ensemble verification criteria described in Section 4.2.1, which are simple to implement and not too computationally expensive. To overcome the limitations of this approach and to introduce a new approach within SM-DA applications, in Chapters 5 and 6, I implement MAP to estimate the model error parameters. There is, however, an important research gap here since the most suitable procedure to generate ensembles of streamflow predictions has not yet been assessed, and a consistent inter-comparison between the available techniques has not been carried out.

4.3 Satellite SM observation operator

As mentioned earlier, one of the key challenges in setting up a satellite SM-DA scheme is converting surface soil moisture observations from the satellites into variables physically or statistically compatible with the model soil moisture. This can be achieved by using an observation operator. Such an observation operator needs to address a few key issues.
Firstly, there is a dynamical difference between model SM prediction and satellite SM observations, related to the different depths that they represent, that needs to be resolved. This can be done by estimating a profile soil moisture based on surface observations. Secondly, after a profile SM is estimated from the satellite SM (and therefore both the observations and the model are representing the same physical variable), there are systematic differences that need to be removed before assimilation. These differences are due to the distinct modelling and observational approaches used, which typically lead to predictions with different systematic relationships to the assumed truth (Yilmaz and Crow, 2013). Lastly, the observation error of these transformed and rescaled observations must be estimated. This error is usually assumed to be an additive random error with a variance of $R$ (from Eq. 14).

The observation operator is defined here as the combination of techniques used to solve the above three steps (profile soil moisture estimation, observation rescaling and observation error estimation). Sections 4.3.1 to 4.3.3 review the methods commonly used to address these steps and present the new methods proposed in this thesis.

4.3.1 Profile soil moisture estimation

The flux of water from the surface of the soil into deeper layers is dominated by soil properties, such as porosity, wilting point, field capacity and unsaturated hydraulic conductivity, and by forcing data, such as rainfall and evapotranspiration (Richards, 1931). If there is information available about the soil properties of the study area, a physically based model can be applied to the surface observations to estimate this flux (e.g., Richards, 1931; Beven and Germann, 1982; Manfreda et al., 2014). However, this information is generally not available. Addressing this challenge, Wagner et al.
(1999) proposed an empirical relationship that linearly relates the variation in time of the root zone SM to the difference between surface SM and root zone SM. This is done by applying an exponential smoothing filter to the surface observations. In this thesis I adopt the above filter (see methodology sections in Chapters 3 to 6), which has been widely used to represent deep-layer SM based on surface observations (Wagner et al., 1999; Albergel et al., 2008; Brocca et al., 2009, 2010, 2012a; Ford et al., 2013). The filter estimates an average of profile saturation by recursively calculating a soil wetness index (SWI) whenever a surface SM observation is available:

$$\mathrm{SWI}(t) = \mathrm{SWI}(t-1) + G(t)\left[\mathrm{SSM}(t) - \mathrm{SWI}(t-1)\right], \tag{14}$$

where $\mathrm{SSM}(t)$ is the satellite SM observation and $G(t)$ is a gain term varying between 0 and 1 as:

$$G(t) = \frac{G(t-1)}{G(t-1) + e^{-\left(\frac{t-(t-1)}{T}\right)}}. \tag{15}$$

$T$ is a calibrated parameter that implicitly accounts for several physical parameters (Albergel et al., 2008). The common practice is to calibrate $T$ by maximising the correlation between SWI and the unperturbed model soil moisture ($\theta$).

4.3.2 Observation rescaling

In order to optimally merge model predictions and observations, the systematic differences between the two datasets must be removed. This is usually done as a pre-processing step by rescaling the observations to match the model predictions in some statistical sense (Reichle and Koster, 2004; Drusch et al., 2005; Yilmaz and Crow, 2013). If not done as a pre-processing step, this rescaling can be expressed via $H$ in Eq. 13 (to scale model states to the observations). There are a variety of strategies for such rescaling. The most common are those based on least squares regression (LR) (Crow et al., 2005), cumulative distribution function (CDF) matching (Reichle and Koster, 2004) and variance (VAR) matching (Yilmaz and Crow, 2013).
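As a rough illustration of the two pre-processing steps described so far, the sketch below applies the exponential filter of Eqs. 14 and 15 and then a variance-matching rescaling. Several details are assumptions made for the example and are not taken from the thesis: the function names, the initialisation (SWI set to the first observation with an initial gain of 1), the use of the actual time gap between consecutive observations in the exponent, and the specific form of variance matching (matching the mean and standard deviation of the observations to those of the model).

```python
import math
from statistics import fmean, pstdev


def swi_filter(times, ssm, T):
    """Recursive exponential filter following Eqs. 14 and 15.

    times : observation times (same units as T); ssm : surface SM observations;
    T     : characteristic time parameter, normally obtained by calibration.
    """
    swi = [ssm[0]]   # assumed initialisation: start from the first observation
    gain = 1.0       # assumed initial gain
    for k in range(1, len(ssm)):
        dt = times[k] - times[k - 1]                        # gap between observations
        gain = gain / (gain + math.exp(-dt / T))            # Eq. 15
        swi.append(swi[-1] + gain * (ssm[k] - swi[-1]))     # Eq. 14
    return swi


def variance_match(obs, model):
    """Rescale obs so that its mean and standard deviation match the model's."""
    mean_obs, mean_model = fmean(obs), fmean(model)
    scale = pstdev(model) / pstdev(obs)
    return [mean_model + scale * (x - mean_obs) for x in obs]


# Illustrative values only.
swi = swi_filter([0.0, 1.0, 2.0, 3.0], [0.20, 0.40, 0.35, 0.10], T=5.0)
print(variance_match(swi, [0.30, 0.34, 0.36, 0.31]))
```

A small $T$ makes the SWI follow the surface signal closely, while a large $T$ yields a strongly smoothed series that approaches a running mean of the observations.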
Only recently, Yilmaz and Crow (2013) applied a signal variance-based rescaling technique, used as a pre-processing step in triple collocation (TC) analysis (Section 4.3.3), to a synthetic SM-DA application. In their work, they assessed the relative performance of TC-based rescaling and the above-mentioned techniques. The conclusion, based on analytical and numerical analyses, was that when the TC requirements are met (enough samples of three independent and coincident measurements/predictions of the same physical variable, with no cross-correlated errors), TC-based rescaling gives unbiased estimates of the rescaling factors (Eq. 19), while the rest of the techniques above provide sub-optimal solutions. The solutions from LR-, CDF- and VAR-based rescaling techniques are very close to optimal estimates when the errors in the reference and the matched datasets (i.e., model and observations) are assumed negligible compared to the real signal (high signal to noise values) (Yilmaz and Crow, 2013).

The rescaling of satellite SM observations to statistically match the model SM prediction is a necessary step in SM-DA applications. Most of these applications adopt a specific technique (commonly following previous studies) and implement the DA scheme without questioning the impacts of this decision. Addressing this gap, I adopt a real data approach (further details in Section 6) and evaluate the impacts that different rescaling techniques have on the updated streamflow coming from the assimilation of satellite soil moisture into a rainfall-runoff model. The specific question defined here is: what are the impacts of different rescaling techniques on the improvement of streamflow prediction after SM-DA? I answer this question in Chapter 4. It is worth noting that the work presented in that chapter was developed before Yilmaz and Crow (2013) emphasised the use of TC to rescale observations; therefore, TC-based rescaling was not included in the analysis.
In the subsequent chapters (5 and 6), I adopt the TC-based technique to rescale the SWI derived from active and passive satellite SM products into the model space.

4.3.3 Observation error estimation

The error associated with the different rescaled observations needs to be estimated for Bayesian-based updating schemes, such as the EnKF ($R$ from Eq. 14). Quantifying this uncertainty is a major challenge, especially given the lack of ground measurements of soil moisture in most areas. Moreover, incorrect assumptions regarding these errors degrade the performance of stochastic assimilation results (Reichle et al., 2008; Crow and Reichle, 2008; Crow and Van den Berg, 2010). To estimate the errors in the observed soil moisture, one alternative would be to propagate the errors through the retrieval algorithm and physical models used in the satellite soil moisture estimation (Section 1); however, this requires expert knowledge in remote sensing and falls beyond the scope defined in this research.

In this thesis I adopt two approaches to estimate $R$. The first one consists of determining an upper boundary of $R$ and testing different values within that boundary (Chapters 3 and 4). The second one uses the triple collocation (TC) technique to estimate the value of $R$ (Chapters 5 and 6). In this section, I describe these two approaches and highlight how they are applied in order to obtain new insights about satellite error estimation in a SM-DA context.

Regarding the structure of the observation error, there is a spatial and a temporal component that should be defined. A spatial correlation of the observation error can be expected given the overlapping observations and the actual footprint of the satellite observations (Wanders et al., 2012). Regarding the temporal structure of the observation error, most SM-DA applications assume that satellite SM errors are temporally uncorrelated. This is indeed one of the assumptions of the adaptive filtering approaches presented in Section 4.2.3.
However, long-term comparisons between satellite SM retrievals and dense ground-based networks question this assumption (Crow and Van den Berg, 2010). The potential temporal correlation in the observation error poses a contradiction that may have several impacts in SM-DA. For example, it can lead to a sub-optimal characterisation of observation errors, which can degrade the performance of the SM-DA scheme (Reichle et al., 2008; Crow and Reichle, 2008; Crow and Van den Berg, 2010). Moreover, the temporal correlation of the observation error can be transferred into the model space in the updating step (Eq. 13), which would result in correlated errors between model and observations in the following time step. This violates the EnKF assumption of independence between model and observation errors (Evensen, 1994). To explore the magnitude of this problem within a real data case, in Chapter 3 I investigate how different autocorrelation structures for this error affect the SM-DA results, while assuming a known observation error variance $R$. The specific question answered here is: how do assumed observation error structures affect SM-DA efficacy for improving streamflow prediction?

Among the SM-DA studies that do not assume arbitrary values for $R$, some studies have used the variable variance multiplier (VVM) to dynamically estimate observation errors (Leisenring and Moradkhani, 2012; Moradkhani et al., 2012; Yan et al., 2015). Probably the most popular technique used to estimate this error, however, is triple collocation analysis. TC was introduced by Stoffelen (1998) and uses three collocated and coincident independent datasets (with uncorrelated errors) for the same target variable to determine the parameters of the linear model and estimate the measurement errors of the three datasets. In reality, the datasets may contain error auto-correlations, which increase sampling errors.
Additionally, the TC method needs a sufficient number of triplets for statistical analysis. Zwieback et al. (2012) showed that for samples above 500, TC estimates had 10% uncertainty; however, in some studies, fewer samples have been used (e.g., Scipal et al., 2008; Dorigo et al., 2010).

In TC, triplets are usually composed of a model ($\theta$) and two sets of observations ($\theta_{obs1}$ and $\theta_{obs2}$ in the following example) that are assumed to be linearly related to the same truth $\Theta$ as follows,

$$\begin{aligned} \theta &= \Theta + \epsilon_\theta \\ \theta_{obs1} &= \alpha_1 \Theta + \beta_1 + \epsilon_1 \\ \theta_{obs2} &= \alpha_2 \Theta + \beta_2 + \epsilon_2 \end{aligned} \tag{16}$$

where $\alpha_1$, $\alpha_2$ are the scaling factors and $\beta_1$, $\beta_2$ are the intercepts of the linear equations (the model is used as the reference dataset). $\epsilon_1$ and $\epsilon_2$ are the observation zero mean random errors (their rescaled counterparts, $\epsilon_1^*$ and $\epsilon_2^*$ defined below, have variances of $\sigma_1^{*2}$ and $\sigma_2^{*2}$, respectively). $\epsilon_\theta$ is the model zero mean random error with variance $\sigma_\theta^2$. If the observations are rescaled into the model space by $\theta_{obs_i}^* = (\theta_{obs_i} - \beta_i)/\alpha_i$ and $\epsilon_i^* = \epsilon_i/\alpha_i$ (with subscript $i$ standing for 1 and 2), then the equations in Eq. 16 can be combined and re-arranged to be

$$\begin{aligned} \theta - \theta_{obs1}^* &= \epsilon_\theta - \epsilon_1^* \\ \theta - \theta_{obs2}^* &= \epsilon_\theta - \epsilon_2^* \\ \theta_{obs1}^* - \theta_{obs2}^* &= \epsilon_1^* - \epsilon_2^*. \end{aligned} \tag{17}$$

Following the TC procedure (Stoffelen, 1998), estimates of the model and observation error variances ($\sigma_\theta^2$, $\sigma_1^{*2}$ and $\sigma_2^{*2}$, respectively) can be obtained by cross-multiplying the equations in Eq. 17, while assuming that the errors in the model and observations are independent between the three datasets and in time, and that there is a sufficiently large number of triplets:

$$\begin{aligned} \sigma_\theta^2 &= \overline{(\theta_{obs1}^* - \theta)(\theta_{obs2}^* - \theta)} \\ \sigma_1^{*2} &= \overline{(\theta_{obs1}^* - \theta_{obs2}^*)(\theta_{obs1}^* - \theta)} \\ \sigma_2^{*2} &= \overline{(\theta_{obs2}^* - \theta_{obs1}^*)(\theta_{obs2}^* - \theta)}. \end{aligned} \tag{18}$$

The overbar denotes the average in time. The scaling factors and intercepts in Eq. 16 must be resolved before the estimation of model and observation error variances in Eq. 18.
Following the preprocessing techniques used in TC (see details in Yilmaz and Crow, 2013, Appendix A), these factors can be estimated as

$$\alpha_1 = \frac{\overline{\theta_{obs1}\,\theta_{obs2}}}{\overline{\theta_{obs2}\,\theta}} \quad \text{and} \quad \alpha_2 = \frac{\overline{\theta_{obs1}\,\theta_{obs2}}}{\overline{\theta_{obs1}\,\theta}}. \tag{19}$$

The additive bias between the model and observations can be defined as $B = E(\theta) - E(\theta_{obs_i})$ (with subscript $i$ standing for 1 and 2). If the mean of the observations is modified to match $E(\theta)$, the intercepts in Eq. 16 become $\beta_i = (1 - \alpha_i)\overline{\theta}$.

TC has been widely validated as a reliable technique to estimate observation errors when the data requirements are met (e.g., Yilmaz and Crow, 2013; Su et al., 2014). The data requirements are that there is a sufficiently large number of collocated data points from three independent data series and that the linear relationships and error structures are maintained throughout the analysis period. In real applications, however, these conditions are difficult to realise given the infrequent spatiotemporal sampling of satellite sensors (Su et al., 2014). Seeking to relax some of the data requirements of TC, Su et al. (2014) recalled the concept of instrumental variable regression (the general case of TC) and proposed an alternative implementation of this regression where a lagged variable was used as the third independent dataset in TC. This scheme (LV hereafter) features the important advantage of requiring only two datasets and showed satisfactory results for active and passive SSM products over Australia, when compared with TC results. In this thesis, I apply (for the first time in a SM-DA context) the LV scheme for periods when the model SM prediction and only one satellite dataset are available or when the sampling requirements of TC are not met (Chapter 5).
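The TC procedure of Eqs. 16 to 19 can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the implementation used in the thesis: the function names are hypothetical, the time averages in Eq. 19 are computed on mean-removed series (covariances), the scaling factors follow the definition of $\alpha_i$ in Eq. 16 with the model as reference, and the observation means are matched to the model mean before rescaling.

```python
from statistics import fmean


def cov(a, b):
    """Population covariance of two equally long sequences."""
    mean_a, mean_b = fmean(a), fmean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def triple_collocation(model, obs1, obs2):
    """Estimate scaling factors and error variances (Eqs. 16 to 19).

    The model is the reference dataset. Returned variances are expressed
    in the model space, i.e. they refer to the rescaled observations.
    """
    c12 = cov(obs1, obs2)
    alpha1 = c12 / cov(obs2, model)      # scaling factor of obs1
    alpha2 = c12 / cov(obs1, model)      # scaling factor of obs2

    m, m1, m2 = fmean(model), fmean(obs1), fmean(obs2)
    # Rescale into the model space after matching the observation means to
    # the model mean, which is equivalent to beta_i = (1 - alpha_i) * mean(model).
    r1 = [m + (x - m1) / alpha1 for x in obs1]
    r2 = [m + (x - m2) / alpha2 for x in obs2]

    n = len(model)
    var_model = sum((a - t) * (b - t) for t, a, b in zip(model, r1, r2)) / n
    var_obs1 = sum((a - b) * (a - t) for t, a, b in zip(model, r1, r2)) / n
    var_obs2 = sum((b - a) * (b - t) for t, a, b in zip(model, r1, r2)) / n
    return {"alpha1": alpha1, "alpha2": alpha2, "var_model": var_model,
            "var_obs1": var_obs1, "var_obs2": var_obs2}
```

With short or noisy records the estimated variances can turn out negative, which is one practical symptom of the sampling requirements discussed above not being met. The same function could be applied to seasonal subsets of the triplets to obtain a time-varying error estimate.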
It is also common to assume that the satellite SM observation errors are time-invariant (e.g., Reichle et al., 2008; Ryu et al., 2009; Crow and Van den Berg, 2010; Brocca et al., 2010, 2012a); however, studies evaluating satellite SM products have shown an important temporal variability in measurement errors (Loew and Schlenz, 2011; Su et al., 2014). Since a data assimilation scheme explicitly updates the model prediction based on the relative weights of the model and the observation errors, assuming a constant observation error will lead to over-correction of the model state if the actual observation error is higher than assumed, and vice versa. To characterise the temporal behaviour of the observation error, the above techniques (TC or LV) can be applied to specific time windows of the observations and model predictions (for example, by grouping the triplets or doublets by month-of-the-year). There is, however, a trade-off between the sampling window (which defines the temporal characterisation of the error) and the sample size (number of triplets in each subset). To address the issue of temporally variant observation errors, I develop an approach that involves seasonal characterisation of the observation error by applying TC and LV to 4-month sampling windows (further details in Chapter 5). This seasonal approach is novel in the context of SM-DA.

5 Forcing and dual correction schemes

As introduced in Chapter 1, in addition to the popular state correction assimilation approach, recent studies have explored the use of satellite SM retrievals for filtering errors present in satellite-based rainfall accumulation products (Crow and Bolten, 2007; Pellarin et al., 2008; Crow et al., 2009; Brocca et al., 2013). Given that space-borne rainfall estimates provide the only possible source of near real time information for most global land areas, the potential of these techniques is highly significant, especially for poorly instrumented areas (Crow et al., 2009).
The premise of these studies is that soil moisture contains information about antecedent rainfall that can be used to constrain rainfall estimates by using simple water balance models. Although these studies have slightly different approaches (further details in Chapter 6), they have all shown the potential for improving satellite rainfall estimates by using satellite SM retrievals.

The above implies that the use of microwave soil moisture could enhance model predictions of runoff by improving both the antecedent moisture conditions (which theoretically determine the catchment infiltration capacity), via a state correction SM-DA scheme, and the storm-scale rainfall totals (which represent the most important meteorological input of a rainfall-runoff model), via a forcing correction SM-DA scheme (Crow and Ryu, 2009). This has motivated recent studies to test these dual forcing/state correction schemes (dual SM-DA). Massari et al. (2014) set up a simple scheme in which in-situ observations of SM were used to correct the rainfall (through the SM2RAIN algorithm introduced by Brocca et al. (2014)) and to initialise the wetness condition of a simple rainfall-runoff model. Their case study showed high potential for the SM data to improve flood modelling. Using a relatively more complex assimilation scheme and rainfall-runoff model, Crow and Ryu (2009) set up a state correction SM-DA scheme integrated with a rainfall correction scheme (using the Soil Moisture Analysis Rainfall Tool, SMART, introduced by Crow et al. (2009)) in a series of synthetic twin experiments. The formulation of this dual correction scheme avoids "over-use" of the remotely sensed soil moisture in the analysis (i.e., it avoids cross-correlation between forecasting and observing errors).
The key point is that the corrected precipitation is fed into an off-line model simulation (based on the state update analysis) and not the state-update analysis itself. The results of this dual SM-DA scheme were further supported by Chen et al. (2014) in a real data application. Both studies showed that the satellite rainfall correction led to improvement in streamflow prediction, especially during high flow periods. On the other hand, the soil water state correction led mainly to an improved base flow component (low flows simulation). The combined state/forcing correction scheme led to improvement of both the high and the low flow components of the streamflow, outperforming both the state-only and forcing-only correction schemes.

It remains unclear, however, how well this dual SM-DA scheme performs for different catchment characteristics including climate and rainfall-runoff mechanisms. Addressing this gap, in Chapter 6 I expand the evaluation of the dual SM-DA proposed by Crow and Ryu (2009) within 4 large semi-arid catchments in Australia. I devise the dual SM-DA scheme under an ungauged catchment scenario (without rain gauges, only satellite data is used to force the model) to answer three research questions:

1) How much can we improve streamflow prediction by the correction of satellite rainfall via SMART?
2) How much can we improve streamflow prediction by the assimilation of SSM in a state correction scheme?
3) What are the impacts on streamflow prediction of a combined state and forcing correction scheme?

6 Summary and overall approach

I have provided a general review of the main challenges that must be addressed to implement a SM-DA scheme. I described the most commonly used techniques to address each of those challenges and mentioned which ones were adopted in Chapters 3 to 6 of this thesis. As highlighted above, some of the adopted techniques were new in the context of SM-DA.
This forms part of the novel contributions of this thesis and includes:

• The correction of the unintended bias introduced in the generation of streamflow ensemble predictions (described in Section 4.1 and implemented in Chapters 5 and 6). I apply the bias correction scheme proposed by Ryu et al. (2009) directly to the streamflow prediction. I use the unperturbed model run to estimate the mean bias in the streamflow (following Eq. 9, but using streamflow instead of soil moisture) and then correct each ensemble member by subtracting this mean bias. This practical tool ensures that the streamflow ensemble mean maintains the performance of the unperturbed (calibrated) model run, thus avoiding artificial degradation of the unperturbed model run by bias.
• The use of a maximum a posteriori approach to estimate model error parameters (described in Section 4.2 and implemented in Chapters 5 and 6).
• The use of a lagged-variable approach to estimate satellite observation error (described in Section 4.3 and implemented in Chapter 5).
• A seasonal characterisation of the satellite SM error (described in Section 4.3 and implemented in Chapter 5).

In addition to introducing these new techniques in the context of SM-DA applications, the following chapters answer my two main research questions:

1) Can we improve flood prediction by correcting a rainfall-runoff model SM state via satellite SM data assimilation?
2) Can we further improve flood prediction by correcting both the satellite rainfall forcing data and the model SM state via satellite SM data assimilation?

The approach I have taken to answering these two main questions, which differentiates this work from a large number of previous SM-DA applications, is to set up the experiments using real data (observed satellite retrievals are used to feed the assimilation scheme and observed streamflow to evaluate SM-DA results).
Real data experiments have the inherent challenge of not knowing the true information about model and observation errors. This approach therefore leads to sub-optimal SM-DA schemes; however, it provides real evidence of the efficacy of using satellite SM to improve flood prediction.

While answering my first research question, I set up a series of real data experiments and define 5 sub-questions related to different aspects of the state correction SM-DA scheme. It should be noted that in this exploration, the rainfall data used to force the model is a gauged-interpolated dataset. This avoids the incorporation of further errors in the system coming from satellite rainfall products. A summary of the sub-questions, and a description of how they are answered in Chapters 3 to 5, is presented below:

1. How do assumed observation error structures affect SM-DA efficacy for improving streamflow prediction? This question was introduced in Section 4.3 and targets the research gap regarding the assumptions made about the structure of the satellite observation errors. While in SM-DA applications it is commonly assumed that these errors are temporally uncorrelated, studies dedicated to satellite SM error characterisation have put this assumption in doubt (Crow and Van den Berg, 2010). Temporal correlation in the observation errors could lead to a sub-optimal characterisation of the observation error and to the violation of a key EnKF assumption. In Chapter 3, I evaluate the magnitude of these potential impacts within a real data case.

2. What are the impacts of different rescaling techniques on the efficacy of SM-DA? This question was introduced in Section 4.3 and targets the research gap regarding the impacts that different rescaling techniques may have on SM-DA results. In Chapter 4, I answer this question by setting up a real data case and testing several commonly used techniques for rescaling the satellite observations into the model space.

3.
While rainfall is presumably the main driver of flood generation in semi-arid catchments, can we effectively improve streamflow prediction by correcting the soil water state of the model? This question is defined based on the study catchment selected in the experiments presented in Chapter 5. The runoff mechanisms of the study catchment are likely to be dominated by rainfall, as is the case for several large and sparsely instrumented semi-arid catchments with an extensive history of flooding within Australia. Within this context, I aim to examine and provide real evidence of the potential for improving flood prediction by correcting the antecedent wetness condition of the catchment via SM-DA.

4. What is the impact of accounting for channel routing and the spatial distribution of forcing data on SM-DA performance? This question aims to reinforce the fact that a data assimilation scheme is designed to reduce the random component of the model error and does not address systematic errors (see Section 2). I explore the importance of the model quality before assimilation for enhancing the SM-DA performance by evaluating the results of SM-DA from a lumped and a semi-distributed model configuration. These results are presented in Chapter 5.

5. What are the prospects for improving streamflow prediction within ungauged catchments using satellite SM? Given that the absence of co-located streamflow gauging stations is typical for most locations in all catchments, I set up the experiments in Chapter 5 under an ungauged (no stream gauges) scenario for the inner catchments of a semi-distributed model scheme. I then evaluate the skill of SM-DA within these inner catchments, which provides useful insights into this common situation.

To answer my second research question, I adopt one of the available forcing correction SM-DA schemes (Section 5) and combine it with the state SM-DA developed in the first stage of the thesis (Chapters 3 to 5).
The dual forcing/state SM-DA scheme is implemented in Chapter 6, where the following specific questions are answered:

1. Can we improve the quality of an operational satellite rainfall product by the assimilation of satellite soil moisture via SMART?
2. Does this forcing correction scheme have a positive impact on streamflow prediction?
3. Can we improve streamflow prediction by the assimilation of satellite SM in a state correction scheme?
4. What are the impacts on streamflow prediction of a combined state and forcing correction scheme?

The above questions aim to assess the relative benefits of correcting, independently and simultaneously, the satellite rainfall forcing data and the model soil moisture state for the purposes of improving flood prediction. I set up the experiments within 4 large semi-arid catchments and present the results in Chapter 6.

Chapter 3
Impacts of observation error structure in SM-DA

This chapter was published as the following peer-reviewed proceeding paper: C. Alvarez-Garreton, D. Ryu, A. W. Western, W. T. Crow, and D. E. Robertson. Impact of observation error structure on satellite soil moisture assimilation into a rainfall-runoff model. In J. Piantadosi, R. Anderssen, and J. Boland, editors, MODSIM2013, 20th International Congress on Modelling and Simulation. Modelling and Simulation Society of Australia and New Zealand, pages 3071-3077, December 2013.

Impact of observation error structure on satellite soil moisture assimilation into a rainfall-runoff model.

C. Alvarez-Garreton a, D. Ryu a, A. W. Western a, W. Crow b, D.
Robertson c

a Department of Infrastructure Engineering, The University of Melbourne, Parkville, Victoria, Australia
b USDA-ARS Hydrology and Remote Sensing Laboratory, Beltsville, Maryland, United States
c CSIRO Land and Water, Victoria, Australia
Email: calverez@student.unimelb.edu.au

Abstract: In Ensemble Kalman Filter (EnKF)-based data assimilation, the background prediction of a model is updated using observations and relative weights based on the model prediction and observation uncertainties. In practice, both model and observation uncertainties are difficult to quantify and have therefore often been assumed to be spatially and temporally independent Gaussian random variables. Nevertheless, it has been shown that incorrect assumptions regarding the structure of these errors can degrade the performance of stochastic data assimilation. This work investigates the autocorrelation structure of microwave satellite soil moisture retrievals and explores how the assumed observation error structure affects streamflow prediction skill when assimilating these observations into a rainfall-runoff model. An AMSR-E soil moisture product and the Probability Distributed Model (PDM) are used for this purpose. Satellite soil moisture data is transformed with an exponential filter to make it comparable to the root zone soil moisture state of the model. The exponential filter formulation explicitly incorporates an autocorrelation component in the rescaled observation; however, the error structure of this operator has been treated until now as an independent Gaussian process. In this work, the variance of the rescaled observation error is estimated based on the residuals from the rescaled satellite soil moisture and the calibrated model soil moisture state.
Next, the observation error structure is treated as a Gaussian independent process with time-variant variance, as a weakly autocorrelated random process (with autocorrelation coefficient of 0.2), and as a strongly autocorrelated random process (with autocorrelation coefficient of 0.8). These experiments are compared with a control case which corresponds to the commonly used assumption of Gaussian independent observation error with time-fixed variance. Model error is represented by perturbing rainfall forcing data and soil moisture state. These perturbations are assumed to represent all forcing and model structural/parameter errors. Error parameters are calibrated by applying two discharge ensemble verification criteria. Assimilation results are compared and the impacts of the observation error structure assumptions are assessed. The study area is the semi-arid 42,870 km² Warrego at Wyandra River catchment, located in Queensland, Australia. This catchment is chosen for its flooding history, along with having geographical and climatological conditions that enable soil moisture satellite retrievals to have higher accuracy than in other areas. These conditions include large area, semi-arid climate and low vegetation cover. Moreover, the catchment is poorly instrumented, thus satellite data provides valuable information. Results show a consistent improvement of the model forecast accuracy in the control case and in all experiments. However, given that a stochastic assimilation is designed to correct stochastic errors, the systematic errors in model prediction (probably due to the inaccurate forcing data within the catchment) are not addressed by these experiments. The assumed observation error structures tested in the different experiments do not exhibit a significant effect on the assimilation results. This case study provides useful insight into the assimilation of satellite soil moisture retrievals in poorly instrumented semi-arid catchments.
Keywords: Data assimilation, soil moisture, satellite retrievals, rainfall-runoff model, hydrology.

C. Alvarez-Garreton et al., Impacts of observation error structure in soil moisture data assimilation

1 INTRODUCTION

Accurate soil moisture predictions can lead to better modelling of hydrological processes including runoff, groundwater recharge and evapotranspiration. For example, it was shown that runoff prediction could be improved by assimilating antecedent soil moisture into rainfall-runoff modelling (Crow et al., 2005). Nonetheless, the improvement in model skill resulting from assimilating soil moisture observations (on-site or remotely sensed) into rainfall-runoff models has not been fully assessed due to three main limitations: observation uncertainties, temporal resolution and the spatial mismatch between observations and soil water content from rainfall-runoff models (Brocca et al., 2012). Ground measurements of soil moisture are scarce in most regions, which positions satellite retrievals as a potential solution for improving soil moisture representation. However, satellite soil moisture retrievals have in general higher uncertainty than ground measurements, have a coarser spatial resolution and represent only the top few centimetres of soil, all factors that need to be accounted for in their use. Exploring tools for improving runoff predictions using satellite soil moisture retrievals has become very popular. The success of stochastic assimilation relies on several factors including whether soil moisture is a dominant control on the runoff generation process in the catchment, the representativeness and accuracy of the observations, and having an adequate representation of model and observation errors.
It has been shown that incorrect assumptions regarding the model structure and observation errors can degrade the performance of stochastic data assimilation (Crow and van Loon, 2006; Crow and Reichle, 2008; Crow and van den Berg, 2010; Reichle et al., 2008; Ryu et al., 2009). To date, for hydrologic applications, the error structure of these observations has not been carefully investigated and their assimilation into rainfall-runoff models has been undertaken using the observed time series (i.e., one single realisation from a stochastic process) and assuming a time invariant error variance. In this study, we show how observation error structure assumptions affect the improvement in assimilation skill using a soil moisture product from the Advanced Microwave Scanning Radiometer (AMSR-E) and the probability distributed model (PDM). Additionally, we treat the observation as a stochastic process represented by a Monte Carlo-based ensemble. For this we set up an ensemble Kalman filter (EnKF) scheme. The depth mismatch between observed soil moisture (few centimetres of soil) and the predicted soil moisture (depth depending on the calibrated model parameters, but more comparable to the root zone layer) is addressed by applying an exponential filter to the surface observations (Wagner et al., 1999). This filter transforms the surface soil moisture into a profile soil moisture through the estimation of a soil wetness index (SWI). Subsequently, systematic differences between the SWI and the predicted soil moisture are removed by a linear regression rescaling. In different assimilation experiments, the observation error structure is treated as a sequentially independent Gaussian process or as an autocorrelated random process. For evaluating the impacts of these assumptions on the assimilation results, one control case and 3 experiments are defined.
The control case corresponds to the commonly used assumption of time invariant variance of the observation error and the assimilation of the observed time series (single realisation). Experiment 1 assumes Gaussian independent error in the observation and treats the observation as a stochastic process so the assimilation is made based on an ensemble of observations. Experiments 2 and 3 assume a “weakly” autocorrelated observation error (Exp.2) and a “strongly” autocorrelated observation error (Exp.3). The assimilation in these last two experiments uses the observed time series (single realisation). Assimilation results are compared and evaluated for the different observation error structure assumptions.

2 STUDY AREA AND DATA

The study area is the Warrego River catchment (42,870 km²), located in south west Queensland, Australia (see Fig. 1). The mean annual precipitation over the catchment is 520 mm and it has a long history of flooding, with at least 10 major events in the last 100 years that have caused extensive inundation of towns and rural lands (http://www.bom.gov.au/qld/flood/brochures/warrego). Rainfall data was obtained from the Australian Water Availability Project (AWAP), which covers the period from 1900 to date and has a spatial resolution of 0.05° (Jones et al., 2009). Hourly streamflow records for the Warrego at Wyandra gauge were collected from the Queensland Department of Natural Resources and Mines website (http://watermonitoring.derm.qld.gov.au) for the 1967-2013 period. The soil moisture dataset was obtained from the Advanced Microwave Scanning Radiometer (AMSR-E), version 5 C/X-band, 0.25° resolution level 3 product for the period 07/2002-10/2011 (Owe et al., 2008).
Figure 1. Warrego River catchment (map showing the catchment boundary, the AMSR-E grid, rainfall stations and the towns of Augathella, Charleville, Wyandra and Cunnamulla).

3 METHODS

3.1 Rainfall-runoff model

The probability distributed model (PDM) is a conceptual rainfall-runoff model that has been widely used in hydrologic research (Moore, 2007). The model treats soil water content as a probability distributed variable. Then, two cascade reservoirs are used for representing surface storage and one routing reservoir for representing sub-surface runoff generation. The main inputs of the model are rainfall and potential evaporation. A detailed description of the model structure and formulations is presented by Moore (2007). Here, a lumped model of the catchment and a daily time step are used. The model is calibrated by using a genetic algorithm (Chaturvedi, 2010) with an objective function based on the Nash-Sutcliffe statistic.

3.2 EnKF formulation

The Kalman filter is a Bayesian estimator that sequentially updates model background predictions with available observations. The updating step is based on the relative values of the uncertainties (error covariance) existing in the model and the observations. In the ensemble Kalman filter (EnKF), the error covariance is explicitly calculated from Monte Carlo-based ensembles. For a state-updating assimilation approach, the state ensemble is created by perturbing forcing data and/or the state of the model with unbiased errors.

Let $\theta^-(t) = \{\theta^-_{1,t}, \theta^-_{2,t}, \ldots, \theta^-_{N,t}\}$ be the perturbed model soil moisture state ensemble prediction (background prediction) before the updating step for time step t, where N is the number of ensemble members.
Given that there is no knowledge of the real state values, the ensemble average is used as a reference to estimate the prediction error. The error of member i ($\theta^{-\prime}_{i,t}$), the abnormality matrix of the ensemble ($\theta^-_M(t)'$), and the covariance matrix of the state model errors ($P^-_t$), for each time step t, are calculated by:

$$\theta^{-\prime}_{i,t} = \theta^-_{i,t} - \frac{1}{N}\sum_{j=1}^{N}\theta^-_{j,t}\,; \qquad \theta^-_M(t)' = \{\theta^{-\prime}_{1,t}, \theta^{-\prime}_{2,t}, \ldots, \theta^{-\prime}_{N,t}\}\,; \qquad P^-_t = \frac{1}{N-1}\,\theta^-_M(t)' \times \left[\theta^-_M(t)'\right]^{T} \tag{1}$$

When a soil moisture observation is available, the SWI is estimated and rescaled (see Section 3.3), and each member of the state ensemble is updated by the rescaled observation, $\theta_{obs(EF)}(t)$, using the following expression:

$$\theta^+_{i,t} = \theta^-_{i,t} + K\left(\theta_{obs(EF)}(t) - H(\theta^-_{i,t})\right) \quad \text{with} \quad K = \frac{P^-_t H^T}{H P^-_t H^T + R_t} \tag{2}$$

where K is the Kalman gain and H is the observation operator that relates the modelled state to the measured variable. As the observation is rescaled separately prior to the state updating, H reduces to the identity matrix in this work. $R_t$ is the error variance of the rescaled observation for time t.

3.3 Satellite soil moisture rescaling and observation error estimation

Satellite soil moisture retrievals ($\theta_{obs}$) represent the top few centimetres of the soil, while the rainfall-runoff model soil moisture state accounts for a significantly deeper layer. The depth of the modelled storage depends on the calibrated model parameters, but typically it is comparable with the root zone soil moisture. For transferring $\theta_{obs}$ information into the soil water content space of the model ($\theta$), we use the exponential filter proposed by Wagner et al. (1999). This filter assumes that the variation in time of the root zone soil moisture is linearly related to the difference between surface soil moisture and root zone soil moisture. The filter estimates a profile average saturation degree by recursively calculating a soil wetness index (SWI) every time there is a
satellite soil moisture retrieval $\theta_{obs}$ (Brocca et al., 2010):

$$SWI(t) = SWI(t-1) + G_t\left[\theta_{obs}(t) - SWI(t-1)\right] \quad \text{with} \quad G_t = \frac{G_{t-1}}{G_{t-1} + e^{-\frac{t-(t-1)}{T}}} \tag{3}$$

$G_t$ is a gain term varying between 0 and 1. T is a calibrated parameter representing the time scale of the SWI variation. SWI is then linearly rescaled in order to meet the same mean and standard deviation as the root zone soil moisture from the model ($\theta$). The rescaled observation is named $\theta_{obs(EF)}$.

The estimation of observation uncertainties is a major challenge, especially given the lack of ground measurements of soil moisture in most areas. In this study we propose to determine an upper boundary of the rescaled observation error variance. If we assume that the error is independent of the measurement (i.e., orthogonal), we can express the variance of the rescaled observation as $Var(\theta'_{obs(EF)}) = Var(\theta_{obs(EF)}) + R$, where R is the rescaled observation error variance from Eq. 2 and $Var(\theta'_{obs(EF)})$ is directly calculated from the rescaled data. Given that the variance is always positive, we can use $Var(\theta'_{obs(EF)})$ as the upper boundary of R. In a first stage, R is considered to be equal to the upper boundary. Once we have the variance of the rescaled soil moisture error (R), the following experiments are defined in order to explore how error structure assumptions affect the assimilation results:

• Control case: error is treated as a fixed, time invariant variance.
• Exp.1: error is treated as a white Gaussian process lacking autocorrelation.
• Exp.2: error is treated as a “weakly” autocorrelated process, with lag-1 day coefficient AR(1)=0.2.
• Exp.3: error is treated as a “strongly” autocorrelated process, with lag-1 day coefficient AR(1)=0.8.
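As a rough illustration of the error structures listed above and of the update in Eq. 2, the following Python sketch generates white or AR(1) Gaussian observation errors and applies a scalar EnKF update with H equal to the identity. The function names are hypothetical, and scaling the AR(1) innovations by sqrt(R(1 - phi²)) so that the stationary variance equals R is an assumption of this sketch, not a description of the authors' code.

```python
import numpy as np


def ar1_observation_error(n_steps, n_members, R, phi, rng):
    """Zero-mean Gaussian AR(1) noise whose stationary variance equals R.

    phi = 0 gives white noise; phi = 0.2 or 0.8 gives weakly or strongly
    autocorrelated noise. Hypothetical helper, not from the paper.
    """
    innovation_std = np.sqrt(R * (1.0 - phi ** 2))  # keeps Var(e_t) = R
    e = np.empty((n_steps, n_members))
    e[0] = rng.normal(0.0, np.sqrt(R), n_members)   # start in the stationary state
    for t in range(1, n_steps):
        e[t] = phi * e[t - 1] + rng.normal(0.0, innovation_std, n_members)
    return e


def enkf_update(theta_background, obs, R):
    """Scalar EnKF state update with H = 1 (Eq. 2).

    theta_background: background ensemble of the soil moisture state, shape (N,).
    obs: rescaled observation, a scalar (single realisation) or an array of
         shape (N,) (one perturbed observation per member).
    R: observation error variance at this time step.
    """
    theta_background = np.asarray(theta_background, dtype=float)
    P = np.var(theta_background, ddof=1)  # ensemble variance with 1/(N-1), as in Eq. 1
    K = P / (P + R)                       # Kalman gain for a scalar state
    return theta_background + K * (obs - theta_background)
```

For instance, `enkf_update([1.0, 2.0, 3.0], 4.0, R=1.0)` has ensemble variance 1 and gain 0.5, so every member moves halfway towards the observation.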
The random component of the autocorrelated errors in Exp.2 and Exp.3 is assumed to be zero mean autocorrelated Gaussian noise, the spread of which is calculated by constraining the temporal mean of the rescaled observation ensemble covariance to be equal to R.

3.4 Model error estimation

Model error is represented by perturbing forcing precipitation data with an independent multiplicative lognormally distributed error (mean 1 and standard deviation $\sigma_p$), and by perturbing the soil moisture with independent additive normally distributed error (mean 0 and standard deviation $\sigma_{sm}$). These perturbations consider input forcing, model parameter and model structure error sources. Model error parameters ($\sigma_p$ and $\sigma_{sm}$) are calibrated by running the open-loop (perturbing forcing and soil moisture state, but without assimilating rescaled soil moisture observations) for 100 ensemble members (N), and evaluating the following two discharge ensemble verification criteria:

i) If the ensemble spread is large enough, the temporal average of the ensemble skill ($sk_t$) should be similar to the temporal average of the ensemble spread ($sp_t$), i.e., sk/sp = 1 (Brocca et al., 2012; De Lannoy et al., 2006), where $\overline{Q}_{sim}(t)$ is the ensemble mean discharge and:

$$sk = \frac{1}{T}\sum_{t=1}^{T}\left(\overline{Q}_{sim}(t) - Q_{obs}(t)\right)^2 \quad \text{and} \quad sp = \frac{1}{T}\sum_{t=1}^{T}\left[\frac{1}{N}\sum_{i=1}^{N}\left(Q_{sim}(i,t) - \overline{Q}_{sim}(t)\right)^2\right] \tag{4}$$

ii) If the observation is indistinguishable from a member of the ensemble, the ratio between sk and the ensemble mean-square-error (mse) should be equal to $\sqrt{(N+1)/2N}$ (Moradkhani et al., 2005; Brocca et al., 2012), where:

$$mse = \frac{1}{T}\sum_{t=1}^{T}\left[\frac{1}{N}\sum_{i=1}^{N}\left(Q_{sim}(i,t) - Q_{obs}(t)\right)^2\right] \tag{5}$$

4 RESULTS AND DISCUSSION

4.1 Model calibration

Figure 2 presents the simulated and observed discharge time series for both the calibration and verification periods. These results reveal that the calibrated model underestimates the observed peak flows and overestimates low flows.
A likely factor that contributes to the performance of the model is the poor density of rainfall gauges within the catchment, which results in low quality gridded rainfall data for the area. Model performance can also be related to the objective function used for calibration (maximising the Nash-Sutcliffe efficiency) (Gupta and Kling, 2011). Given the semi-arid nature of the catchment, rainfall-runoff generation processes are likely to be dominated by the soil water content of the catchment, which would explain why many rainfall events do not result in discharge (this can be seen by comparing runoff ratios with rainfall intensity and with modelled soil moisture, not shown here). Presumably, given the poor representation of the forcing data, the model structure and conceptualisation are not able to correctly represent this saturation excess runoff process. Due to the lack of ground data for improving rainfall representation over the catchment, and the likely high dependency of runoff generation processes on the soil water content of the catchment, the assimilation of satellite soil moisture offers an important opportunity for improving the model discharge prediction.

Figure 2. Discharge prediction time series (observed and modelled Q, in mm/day); the dashed black line indicates the end of the calibration period. Calibration period: 1967-2001 (NS = 0.49). Verification period: 2002-2013 (NS = 0.57).

4.2 Model and observation error estimation

The linear rescaling described in Section 3.3 was trained using the first two years of data (2002-2004) and then updated in each time step. The parameter T of the exponential filter (Eq. 3) was calibrated for the same training window.
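The exponential filter of Eq. 3 and the linear rescaling trained on an initial window can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the filter is initialised with the first retrieval and a gain of 1, and the rescaling coefficients are fitted once on the training window rather than updated at each time step as in the study.

```python
import numpy as np


def exponential_filter(times, theta_obs, T):
    """Recursive soil wetness index (SWI) following the form of Eq. 3.

    times: retrieval times in days (increasing); theta_obs: surface soil moisture.
    T: characteristic time scale in days. Initialisation is an assumption.
    """
    times = np.asarray(times, dtype=float)
    theta_obs = np.asarray(theta_obs, dtype=float)
    swi = np.empty_like(theta_obs)
    swi[0] = theta_obs[0]  # assumed start: SWI equals the first retrieval
    gain = 1.0             # assumed initial gain
    for n in range(1, len(theta_obs)):
        dt = times[n] - times[n - 1]             # days since the previous retrieval
        gain = gain / (gain + np.exp(-dt / T))   # gain stays between 0 and 1
        swi[n] = swi[n - 1] + gain * (theta_obs[n] - swi[n - 1])
    return swi


def linear_rescale(swi, model_sm, train):
    """Rescale SWI to the mean and standard deviation of the model soil
    moisture, with both statistics computed over the training window only."""
    swi = np.asarray(swi, dtype=float)
    model_sm = np.asarray(model_sm, dtype=float)
    mu_obs, sd_obs = swi[train].mean(), swi[train].std()
    mu_mod, sd_mod = model_sm[train].mean(), model_sm[train].std()
    return mu_mod + (swi - mu_obs) * sd_mod / sd_obs
```

A typical call would pass `train=slice(0, n_train)` for the first `n_train` retrievals; a larger T gives a smoother SWI that responds more slowly to individual surface retrievals.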
The rescaled observation time series is presented in Figure 3, and has a correlation coefficient of 0.82 with the modelled soil moisture for the verification period. The standard deviation of the associated residuals ($std_{res}$) is 0.05 m³/m³ (expressed as volumetric percentage of the calibrated soil moisture storage of 630 mm). These results reveal the strong concordance between the model soil moisture state and the rescaled surface soil moisture observation. The observation error variance is estimated as 1360 mm². The adopted standard deviations of the random component of the “weak” and “strong” autoregressive processes (Exp. 2 and 3) are 0.0574 m³/m³ and 0.0351 m³/m³, respectively. These values fulfil the constraint that the temporal mean of the observation error variance equals R. Following the methodology described in Section 3.4, the calibration of the model error parameters results in $\sigma_p$ = 0.3195 and $\sigma_{sm}$ = 0.0248 m³/m³ (volumetric percentage of soil moisture storage).

Figure 3. Rescaled observations, dashed black line indicates the end of training period. (Panels: time series of model and rescaled soil moisture in mm; scatter plots of rescaled versus model soil moisture in mm for the calibration and verification periods.)

4.3 Assimilation experiments

The evaluation of assimilation is undertaken for the period 06/2004-10/2011. The first half of the analysis window, up to 03/2008, is characterised by small flow events (Period 1) while the second half (03/2008-10/2011) is characterised by larger flow events, having at least three major flood events (Period 2). The assimilation results for experiment 3 are presented in Fig. 4, for these two separate periods. The green dashed line represents the un-perturbed model (i.e. the predictions of the calibrated model, called “sim”).
From these graphs it can be seen that for small flow events (Period 1), the assimilation procedure is reducing the open-loop spread and in general reducing the model overestimation of streamflow (when analysing the mean of the open-loop and updated ensemble, compared with the observed discharge). For larger flow events (Period 2), the assimilation is mainly reducing the spread of the open-loop while the model underestimation is not being corrected. The assimilation results of the control case and experiments 1 and 2 show a similar relation between the open-loop and updated ensembles (not shown here) and their evaluation metrics are summarised in Table 1.

Figure 4. Assimilation results for experiment 3 (observation error treated as a strongly autocorrelated process, with lag-1 day coefficient AR(1)=0.8). (Panels: Q in mm/day for Period 1 and Period 2, showing obs, sim, open-loop (O-L) ensemble and mean, and updated ensemble and mean.)

Table 1. Evaluation metrics of assimilation results for control case and experiments

                RMSD sim             MRMSD open loop      MRMSD updated        NRMSD
Experiment      Period 1  Period 2   Period 1  Period 2   Period 1  Period 2   Period 1  Period 2
                                     (0.01)*   (0.02)*    (0.002)*  (0.005)*   (0.12)*   (0.06)*
Control case    0.05      0.32       0.11      0.44       0.07      0.36       0.65      0.80
Exp.1           0.05      0.32       0.11      0.45       0.06      0.34       0.56      0.76
Exp.2           0.05      0.32       0.12      0.44       0.08      0.36       0.65      0.82
Exp.3           0.05      0.32       0.12      0.47       0.08      0.36       0.65      0.77
* 95% confidence interval

Table 1 presents the mean root mean square difference (MRMSD) and the normalised MRMSD (NRMSD) for the different assimilation experiments for Periods 1 and 2. The NRMSD is calculated as the ratio of the MRMSD of the updated ensemble to the MRMSD of the open-loop ensemble, so values below 1 indicate an improvement after assimilation. Additionally, the RMSD of the unperturbed model (sim) is presented in the table.
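The two ensemble metrics can be sketched in Python as below. The sketch assumes that MRMSD is the ensemble mean of the per-member RMSD against observed discharge, and that NRMSD is the updated MRMSD divided by the open-loop MRMSD, which is the orientation that reproduces the values below 1 reported in Table 1. The function names are hypothetical.

```python
import numpy as np


def mrmsd(q_ens, q_obs):
    """Mean over ensemble members of each member's RMSD against observations.

    q_ens: simulated discharge, shape (N members, T time steps).
    q_obs: observed discharge, shape (T,).
    """
    q_ens = np.asarray(q_ens, dtype=float)
    q_obs = np.asarray(q_obs, dtype=float)
    rmsd_per_member = np.sqrt(np.mean((q_ens - q_obs) ** 2, axis=1))
    return rmsd_per_member.mean()


def nrmsd(q_updated, q_open_loop, q_obs):
    """Updated MRMSD relative to open-loop MRMSD (below 1 means improvement)."""
    return mrmsd(q_updated, q_obs) / mrmsd(q_open_loop, q_obs)
```

Under this definition an NRMSD of 0.65 corresponds to a reduction of about 35% in MRMSD relative to the open loop.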
Data assimilation for the control case and the experimental cases results in an improvement of around 40% in Period 1 and of 20% in Period 2, in terms of MRMSD. Differences between the control case and the experiments are within the confidence intervals and are therefore not considered significant. In general, the spread of the discharge ensemble is reduced by assimilating satellite soil moisture retrievals, but the poor representation of the model, evaluated as the ensemble mean compared with the observed discharge, is not consistently corrected (it only improves for specific events). Given that stochastic assimilation is designed to correct stochastic errors, the model systematic errors (presumably coming from the poor representation of precipitation over the catchment, given the lack of instrumentation within the area) are not addressed, thus the performance of the assimilation becomes marginal. Moreover, the different observation error structures tested do not affect the assimilation results. This suggests that even though observation error structure theoretically has a direct effect on an EnKF-based assimilation, when working with real data and the uncertainties inherent in a poorly instrumented area, the effect is trivial.

5 CONCLUSIONS

This work has shown that the assimilation of satellite soil moisture retrievals derived from AMSR-E into PDM results in a consistent improvement of the model predictions. This improvement is based on the reduction of the model forecast uncertainty. Nevertheless, given that a stochastic assimilation is designed to correct stochastic errors (which translates into the achieved reduction of model forecast uncertainty), the systematic poor model performance (probably due to poor representation of forcing data within the catchment) is not addressed by these experiments.
While the spread of the ensemble discharge prediction is reduced after assimilation, the ensemble mean is not always closer to the discharge observation. Moreover, the different observation error structures tested here did not result in significant differences in the assimilation performance. This suggests that when the model prediction accuracy and uncertainties are mainly controlled by high uncertainties in forcing data, the assumptions of the observation error structure made in a state-update assimilation framework have little effect. These findings enhance our understanding of the advantages and limitations of assimilating satellite soil moisture observations into a rainfall-runoff model for improving streamflow prediction. In order to address the systematic model prediction biases, while reducing the stochastic errors of the model, efforts should be focused on combining the presented state-update assimilation scheme with some tool to reduce the uncertainty in rainfall data.

ACKNOWLEDGEMENT

This research was conducted with financial support from the Australian Research Council (ARC Linkage Project No. LP110200520) and the Bureau of Meteorology, Australia. We gratefully acknowledge the advice and data provision of Chris Leahy and Soori Sooriyakumaran from the Bureau of Meteorology, Australia.

REFERENCES

Brocca, L., F. Melone, T. Moramarco, W. Wagner, V. Naeimi, Z. Bartalis, and S. Hasenauer (2010). Improving runoff prediction through the assimilation of the ASCAT soil moisture product. Hydrology and Earth System Sciences 14(10), 1881–1893.
Brocca, L., T. Moramarco, F. Melone, W. Wagner, S. Hasenauer, and S. Hahn (2012). Assimilation of surface- and rootzone ASCAT soil moisture products into rainfall–runoff modeling. Geoscience and Remote Sensing, IEEE Transactions on 50(7), 2542–2555.
Chaturvedi, D. (2010). Matlab Program of Genetic Algorithms.
Crow, W. T., R. Bindlish, and T. J. Jackson (2005).
The added value of spaceborne passive microwave soil moisture retrievals for forecasting rainfall-runoff partitioning. Geophysical Research Letters 32, L18401.
Crow, W. T. and R. H. Reichle (2008). Comparison of adaptive filtering techniques for land surface data assimilation. Water Resources Research 44, W08423.
Crow, W. T. and M. J. van den Berg (2010). An improved approach for estimating observation and model error parameters in soil moisture data assimilation (doi 10.1029/2010WR009402). Water Resources Research 46, W12519.
Crow, W. T. and E. van Loon (2006). Impact of Incorrect Model Error Assumptions on the Sequential Assimilation of Remotely Sensed Surface Soil Moisture. Journal of Hydrometeorology 7, 421–432.
De Lannoy, G. J., P. R. Houser, V. Pauwels, and N. E. Verhoest (2006). Assessment of model uncertainty for soil moisture through ensemble verification. Journal of Geophysical Research: Atmospheres (1984–2012) 111(D10).
Gupta, H. V. and H. Kling (2011). On typical range, sensitivity, and normalization of mean squared error and Nash-Sutcliffe efficiency type metrics. Water Resources Research 47(10), W10601.
Jones, D. A., W. Wang, and R. Fawcett (2009). High-quality spatial climate data-sets for Australia. Australian Meteorological and Oceanographic Journal 58(4), 233.
Moore, R. J. (2007). The PDM rainfall-runoff model. Hydrology & Earth System Sciences 11(1), 483–499.
Moradkhani, H., S. Sorooshian, H. Gupta, and P. Houser (2005). Dual state–parameter estimation of hydrological models using ensemble Kalman filter. Advances in Water Resources 28(2), 135–147.
Owe, M., R. de Jeu, and T. Holmes (2008). Multisensor historical climatology of satellite-derived global land surface moisture. Journal of Geophysical Research: Earth Surface (2003–2012) 113(F1).
Reichle, R. H., W. T. Crow, and C. L. Keppenne (2008). An adaptive ensemble Kalman filter for soil moisture data assimilation. Water Resources Research 44(3).
Ryu, D., W. T. Crow, X. Zhan, and T. J.
Jackson (2009). Correcting Unintended Perturbation Biases in Hydrologic Data Assimilation. Journal of Hydrometeorology 10, 734–750.
Wagner, W., G. Lemoine, and H. Rott (1999). A method for estimating soil moisture from ERS scatterometer and soil data. Remote Sensing of Environment 70(2), 191–207.

Chapter 4
Impacts of observation rescaling in SM-DA

This chapter was published as the following article: C. Alvarez-Garreton, D. Ryu, A. W. Western, W. T. Crow, and D. E. Robertson. The impacts of assimilating satellite soil moisture into a rainfall-runoff model in a semi-arid catchment. Journal of Hydrology, 519: 2763-2774, 2014.

The impacts of assimilating satellite soil moisture into a rainfall-runoff model in a semi-arid catchment

C. Alvarez-Garreton a, D. Ryu a, A.W. Western a, W.T. Crow b, D.E. Robertson c

a Department of Infrastructure Engineering, The University of Melbourne, Parkville, Victoria, Australia
b USDA-ARS Hydrology and Remote Sensing Laboratory, Beltsville, Maryland, United States
c CSIRO Land and Water, Highett, 3190 Victoria, Australia

Abstract

Soil moisture plays a key role in runoff generation processes, and the assimilation of soil moisture observations into rainfall-runoff models is regarded as a way to improve their prediction accuracy. Given the scarcity of in-situ measurements, satellite soil moisture observations offer a valuable dataset that can be assimilated into models; however, very few studies have used these coarse resolution products to improve rainfall-runoff model prediction. In this work we evaluate the assimilation of satellite soil moisture into the probability distributed model (PDM) for the purpose of reducing flood prediction uncertainty in an operational context. The surface soil moisture (SSM) and the soil wetness index (SWI) derived from the Advanced Microwave Scanning Radiometer (AMSR-E) are assimilated using an ensemble Kalman filter.
Two options for rescaling the observed data are considered to remove the systematic differences between SSM/SWI and the model soil moisture prediction: linear regression (LR) and anomaly-based cumulative distribution function (aCDF) matching. In addition to a complete period rescaling scheme (CP), an operationally feasible real-time rescaling scheme (RT) is tested. On average, the discharge prediction uncertainty, expressed as the ensemble mean of the root mean squared difference (MRMSD), is reduced by 25% after assimilation and little overall difference is found between the various approaches. However, when specific flood events are analysed, the level of improvement varies. Our results reveal that the efficacy of the soil moisture assimilation for flood prediction is robust with respect to different assumptions regarding the observation error variance. The assimilation performs similarly between the operational RT and the CP schemes, which suggests that short-term training is sufficient to effectively remove observation biases. Regarding the different rescaling techniques used, aCDF matching consistently led to better assimilation results than LR. Differences between the assimilation of SSM and SWI, however, are not significant. Even though there is improvement in streamflow prediction, the assimilation of soil moisture shows limited capability in error correction when there is a large bias in the peak flow prediction. Findings of this work imply that proper pre-processing of observed soil moisture is critical for the efficacy of the data assimilation and that its performance is affected by the quality of model calibration.

Keywords: Data assimilation, soil moisture, flood prediction, satellite retrievals, rescaling, rainfall-runoff models.

Preprint submitted to Journal of Hydrology, April 18, 2016

1. Introduction

Flood forecasting based on hydrologic models has shown significant improvement over the past decades as a result of growth in the understanding of hydrological processes, computational power, and availability of various observations. Quantification and reduction of hydrologic model uncertainty remain key challenges to be addressed, given that processes such as water resources management and decision-making rely on the accuracy of model predictions (Liu and Gupta, 2007).

Since the early 1990s, various hydrologic observations have been used not only to calibrate and validate models but also to update model variables in real time, through a process known as data assimilation (DA). Soil moisture observations, in particular, have been used for updating the soil moisture and/or soil temperature states of models (e.g., Pauwels et al., 2001; Francois et al., 2003; Crow and van Loon, 2006; Reichle et al., 2008; Crow and Ryu, 2009; Brocca et al., 2010; Crow and van den Berg, 2010; Lee et al., 2011; Brocca et al., 2012; Han et al., 2012). Accurate soil moisture predictions can lead to better modelling of hydrological processes including runoff, groundwater recharge and evapotranspiration. Nonetheless, there is no consensus about the improvement in streamflow forecasting skill from assimilating soil moisture observations (on-site or remotely sensed) into rainfall-runoff models, because some key aspects of the assimilation framework have not been fully assessed to date.
These include the correct quantification of model and observation uncertainties, the sparse observation time, the rainfall-runoff model structure and the mismatch of spatial scales between observations and model state variables, the most suitable DA technique, and the assimilation performance as a function of climatic, soil and land use conditions (Brocca et al., 2012).

Ground measurements of soil moisture are scarce in most regions, thus satellite retrievals offer a valuable tool to estimate large-scale soil moisture content. Passive microwave satellite soil moisture observations, in particular, generally have higher uncertainty than ground measurements and a coarse spatial resolution, and they represent water content only in the top few centimetres of soil (surface soil moisture or SSM hereafter). Nevertheless, they provide soil moisture estimates with good spatial coverage at regular and reasonably frequent time intervals, which makes them suitable for large-scale monitoring. Moreover, the current ESA Soil Moisture Ocean Salinity (SMOS) (Barre et al., 2008) and the upcoming NASA Soil Moisture Active/Passive (SMAP) (Entekhabi et al., 2010) missions are mainly dedicated to the estimation of soil moisture, thus more frequent and higher-quality observations are expected in the near future.

The majority of studies assimilating satellite soil moisture observations focus on improving soil moisture profile estimation in land surface models (e.g., Pauwels et al., 2001; Crow and van Loon, 2006; Reichle et al., 2008; Crow and van den Berg, 2010; Chen et al., 2011; Han et al., 2012). These models calculate water and energy balances for the soil surface, which results in the partitioning of rainfall into surface runoff and infiltration. They have a strong physical basis and therefore involve complex parametrisation schemes.
Conceptual rainfall-runoff models, on the other hand, vary in complexity depending on the assumptions and simplifications made regarding the hydrologic processes within the catchment (Beven, 2011). The simpler models can be suitable for areas where little data is available. This group of models generally uses rainfall and potential evapotranspiration as input data, energy fluxes are considered indirectly only to estimate evapotranspiration, and runoff generation processes are conceptualised as a series of interconnected storages through the catchment (Chiew et al., 1996), which requires a less complex parametrisation scheme than some available physically-based models (Loague, 2010; Mirus and Loague, 2013).

There have been some synthetic studies exploring the advantages of assimilating pre-storm soil moisture observations into rainfall-runoff models (e.g., Crow and Ryu, 2009; Lee et al., 2011), but relatively few studies have assimilated real observations. Exceptions include Pauwels et al. (2001, 2002), who assimilated SSM derived from the European Space Agency (ESA) European Remote Sensing (ERS) satellites using a statistical correction assimilation method and found improvement in discharge prediction for both the lumped and the distributed models tested. Francois et al. (2003) found improvement in flood event simulation using a coupled land surface-hydrological model after the assimilation of SSM derived from ESA's synthetic aperture radar (SAR) using an extended Kalman filter. Brocca et al. (2010) assimilated a soil wetness index (SWI) product derived from the Advanced Scatterometer (ASCAT) onboard the Metop satellite using a direct nudging scheme, and found improvement in discharge prediction. More recently, Brocca et al.
(2012) compared ensemble Kalman filter assimilations of the SSM and the SWI derived from ASCAT, and found more significant improvement in discharge prediction when the root zone soil moisture product (SWI) was assimilated.

The efficacy of assimilating satellite soil moisture retrievals into rainfall-runoff models relies on factors such as the dominance of soil moisture in controlling the runoff generation process, the accuracy of the observations, the effective rescaling of observations into the model space, and the correct representation of uncertainties in model prediction. For example, it has been shown that incorrect assumptions of the model structural and observational errors (Crow and van Loon, 2006; Crow and Reichle, 2008; Reichle et al., 2008; Ryu et al., 2009; Crow and van den Berg, 2010) and the use of suboptimal rescaling schemes (Yilmaz and Crow, 2013) can degrade the performance of the stochastic DA. Moreover, using model predictions to rescale biased observations can potentially transfer the model biases into the rescaled observation, which suggests that the DA methods should explicitly take into account both model and observation biases (Pauwels et al., 2013).

Most DA research working with rescaling of soil moisture observations prior to assimilation (e.g., Crow and van Loon, 2006; Crow and Reichle, 2008; Reichle et al., 2008; Crow and van den Berg, 2010; Han et al., 2012) has focused on long-term, continuous water balance modelling applications that are significantly different from the event-based application of a hydrologic model in an operational flood forecasting context. As a result, it is not evident how relevant these studies are to operational streamflow forecasting, and thus DA studies specific to this particular application type are needed.
This paper addresses this gap.

In summary, there are a number of challenges to be addressed in the assimilation of satellite soil moisture into rainfall-runoff models. This study addresses some of them and has the following main objectives: 1) to provide new evidence of the improvement in flood prediction from assimilating real satellite SSM observations and a SWI derived from them, in a semi-arid catchment, in an operational context, and 2) to explore different rescaling techniques and their impacts on discharge prediction.

In addressing our two main objectives, we compare assimilations of SSM and SWI derived from the Advanced Microwave Scanning Radiometer (AMSR-E) into the probability distributed model (PDM). To remove systematic differences between the observed soil moisture (SSM and SWI) and the soil moisture predicted by PDM, different rescaling procedures are tested. These include linear regression (LR) and anomaly-based cumulative distribution function (aCDF) matching. In the aCDF scheme, seasonality is removed by grouping soil moisture values into corresponding months and applying standard CDF matching to the separately grouped values. We also test different periods for setting up the rescaling: a real time approach (RT), which uses only past and current information and is thus operationally feasible; and a complete period approach (CP), which uses all the information available and is therefore more robust.

The study area chosen for this work is a sparsely monitored, large, semi-arid catchment. Quantitative operational flood forecasting is currently done using an event-based model whenever a moderate or larger flood is likely to occur. In this paper, we present an attempt at providing continuous streamflow modelling within the catchment, which makes effective use of satellite soil moisture observations to improve predictions.
The paper is structured in five sections: Section 2 describes the study catchment and the data used; Section 3 describes the methodology, including a description of the PDM, the ensemble Kalman filter used for the assimilation, the representation of the uncertainties in the model, the estimation of SWI, the rescaling of SSM and SWI, and the metrics used to evaluate the assimilation results; Section 4 presents the results and discussion; and Section 5 summarises the main conclusions of the study.

2. Study Area and Data

The study area is the semi-arid Warrego catchment (42,870 km²), which is a sub-catchment of the Wyandra River catchment located in Queensland, Australia. This catchment is chosen for its flooding history, along with its geographical and climatological conditions, which enable satellite soil moisture retrievals to have higher accuracy than in other areas. These conditions include the size of the catchment, its semi-arid climate and its low vegetation cover. Moreover, the catchment features very sparse ground monitoring networks, thus satellite data can make a more unique and valuable contribution than in well instrumented catchments.

The catchment has summer dominated rainfall, with mean monthly rainfall of 80 mm in January and 20 mm in August. Mean maximum temperature is above 30 °C in January and below 20 °C in July. The runoff seasonality is characterised by peaks in summer months and minimum values in winter and spring. The mean annual precipitation over the catchment is 520 mm and it has a long history of flooding, with at least 10 major events in the last 100 years that have caused extensive inundation of towns and rural lands. Floods in the catchment are often related to heavy rainfall events linked to La Niña.
Major towns within the entire Warrego river catchment are Augathella, Charleville, Wyandra (upstream of the Warrego at Wyandra gauge) and Cunnamulla (downstream of the Warrego at Wyandra gauge) (see Fig. 1). Among these towns, only Cunnamulla has flood protection (http://www.bom.gov.au/qld/flood/brochures/warrego). The current flood alert system within the catchment consists of a network of volunteer rainfall and river height observers who communicate observations by telephone when specified threshold levels have been exceeded, as well as automatic telemetry stations operated by the Bureau of Meteorology (BoM), the Murweh Shire Council and the Department of Environment and Resource Management (http://www.bom.gov.au/qld/flood/brochures/warrego). Therefore, an automated continuous operational model could provide valuable information about not only major floods, but also moderate and minor flooding.

[Figure 1: The Warrego river catchment. Map (scale 1:80,000,000) showing towns (Augathella, Charleville, Wyandra, Cunnamulla), the Warrego River catchment and rainfall stations.]

The rainfall data used in this work was obtained from the Australian Water Availability Project (AWAP), which covers the period from 01/1900 to 12/2013 and is gridded at a spatial resolution of 0.05° (Jones et al., 2009). Hourly streamflow records were collected from the State of Queensland, Department of Environment and Resource Management (http://watermonitoring.derm.qld.gov.au) for 03/1967 to 12/2013. Daily discharge was calculated based on the daily AWAP time convention (9AM-9AM local time, UTC+10). The surface soil moisture (SSM) dataset was obtained from the AMSR-E version 5 (C/X-band) Level 3 soil moisture product developed by the Vrije Universiteit Amsterdam with NASA at 0.25° resolution for the period 07/2002 to 10/2011 (Owe et al., 2008).

3. Methods

3.1. Rainfall-runoff model

The probability distributed model (PDM) is a conceptual rainfall-runoff model that has been widely used in hydrologic research and applications (Moore, 2007). The model treats soil water content (S1 in Fig. 2) as a distributed variable following a Pareto distribution function. Two cascaded reservoirs (S21 and S22, Fig. 2) represent surface routing within the catchment, and one subsurface reservoir (S3, Fig. 2) represents sub-surface runoff generation. The main inputs to the model are rainfall and potential evapotranspiration. A detailed description of the model processes and formulations is presented in Moore (2007).

[Figure 2: The PDM scheme. Diagram labels: P, E, soil water storage S1, fast runoff, fast flow storages S21 and S22, fast flow Qr, recharge, slow flow storage S3, slow flow Qs, total discharge Q.]

Despite the large scale of the study catchment, the DA experiments are set up using a lumped scheme to maintain a simple framework. This allowed us to test the hypothesis of this work (that more accurate estimation of soil water states in the model can be achieved by the assimilation of satellite soil moisture, which in turn can lead to better streamflow prediction) with a reduced number of contributing factors. Our hypothesis relies on two main factors: 1) how well the satellite soil moisture represents the soil water states of the model, which is addressed by processing the satellite data (see Sections 3.4 and 3.5), and 2) the covariance between the errors in discharge and the soil water states. Due to the inherent limitations of the conceptual model, the links (based on solid physical processes) between the errors in soil moisture and discharge would become more complex to analyse at the finer scales of a semi-distributed scheme. Furthermore, for an initial real-data assimilation experiment, the selected lumped scheme is a natural choice since it avoids the specification of the spatial cross-correlation of modelling errors that would be required if the model had a finer resolution than the assimilated observations. A spatially distributed catchment setup would require a more complex DA with more parameters, and would therefore give less transparent assimilation results.

The model was run at a daily time step. This temporal resolution was chosen considering the poor instrumentation and relatively long concentration time (approximately 6 days) within the catchment. The 10 parameters of the model were calibrated using a genetic algorithm (Chipperfield and Fleming, 1995) and an objective function based on the Nash-Sutcliffe model efficiency (NS) (Nash and Sutcliffe, 1970).

3.2. EnKF formulation

The Kalman filter is a Bayesian estimator that sequentially updates model background predictions with available observations. The updating step is based on the relative magnitudes of model and observation error variances. In the ensemble Kalman filter (EnKF), the error covariance is explicitly calculated from Monte Carlo-based ensemble realisations. For a state-updating assimilation approach, the state ensemble is created by perturbing forcing data, parameters and/or states of the model with unbiased error. As detailed in Section 3.3, in this study the state ensemble was generated by perturbing rainfall data and the soil moisture prediction of the model (water content in S1, Fig. 2). Then, the water content of S1 was updated by using satellite soil moisture observations. Model and observation uncertainties are propagated by the EnKF, thus the final discharge prediction is treated as an ensemble of equally likely realisations.
The uncertainty of the discharge prediction can be derived from the ensemble, thus providing valuable information for operational flood alert systems.

We define the model soil moisture state ensemble in the background prediction as $\theta^-(t)$:

\[ \theta^-(t) = \{\theta^-_{1,t}, \theta^-_{2,t}, \ldots, \theta^-_{N,t}\}, \tag{1} \]

where $N$ is the number of ensemble members and $\{\theta^-_{1,t}, \theta^-_{2,t}, \ldots, \theta^-_{N,t}\}$ are the ensemble realisations of soil moisture at time step $t$. The error of the member $i$ at time step $t$ is estimated as:

\[ \theta^{-\prime}_{i,t} = \theta^-_{i,t} - \frac{1}{N} \sum_{j=1}^{N} \theta^-_{j,t}. \tag{2} \]

The anomaly vector of the ensemble for time step $t$ is then given by:

\[ \boldsymbol{\theta}^{-\prime}(t) = \{\theta^{-\prime}_{1,t}, \theta^{-\prime}_{2,t}, \ldots, \theta^{-\prime}_{N,t}\}. \tag{3} \]

The covariance matrix of the state model errors ($P^-_t$) is directly estimated at each time step based on the anomaly vector:

\[ P^-_t = \frac{1}{N-1}\, \boldsymbol{\theta}^{-\prime}(t)\, \boldsymbol{\theta}^{-\prime}(t)^T. \tag{4} \]

Satellite surface soil moisture (SSM) observations and a soil wetness index (SWI) derived from them were assimilated in separate runs to update $\theta^-$. Before being assimilated, SSM and SWI were rescaled to remove systematic differences between the model and the observation (see Section 3.5), which is required for an optimal state updating scheme (Yilmaz and Crow, 2013). When a rescaled observation was available, the state ensemble at time $t$ was updated using the following expression:

\[ \theta^+(t) = \theta^-(t) + K\left(\theta^r_{obs}(t) - H(\theta^-(t))\right), \tag{5} \]

where $\theta^r_{obs}$ is the rescaled SSM or SWI and $H$ is an operator that transforms the model state to the measurement. As the observation was rescaled separately prior to the state updating, $H$ reduces to the identity matrix in this work. The Kalman gain matrix $K$ was calculated for each time step by:

\[ K = \frac{P^-_t H^T}{H P^-_t H^T + R}, \tag{6} \]

where $R$ is the observation error variance after rescaling.

3.3. Model error representation

One of the main strengths of EnKF-based DA is that it explicitly accounts for different sources of error such as model error and observation error.
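As a concrete illustration of the update step of Section 3.2 (Eqs. 1 to 6), the following is a minimal sketch for a single lumped soil moisture state with H equal to the identity. It is not the code used in this study: the function name `enkf_update`, the NumPy implementation and the optional perturbation of the observation for each member (a common choice in stochastic EnKF implementations) are assumptions made for illustration only.

```python
import numpy as np

def enkf_update(theta_bg, obs_rescaled, R, rng, perturb_obs=True):
    """Update a scalar soil moisture ensemble with one rescaled observation.

    theta_bg     : background ensemble, shape (N,)            (Eq. 1)
    obs_rescaled : rescaled SSM or SWI value at this time step
    R            : observation error variance after rescaling
    perturb_obs  : ASSUMPTION, not stated in the text; adds N(0, R) noise
                   to the observation seen by each member.
    """
    theta_bg = np.asarray(theta_bg, dtype=float)
    n = theta_bg.size
    anomalies = theta_bg - theta_bg.mean()        # Eqs. (2)-(3)
    P = anomalies @ anomalies / (n - 1)           # Eq. (4), a scalar here
    K = P / (P + R)                               # Eq. (6) with H = 1
    if perturb_obs:
        obs = obs_rescaled + rng.normal(0.0, np.sqrt(R), size=n)
    else:
        obs = np.full(n, float(obs_rescaled))
    theta_up = theta_bg + K * (obs - theta_bg)    # Eq. (5) with H = 1
    return theta_up, K

# Hypothetical numbers: storage in mm, observation drier than the model.
rng = np.random.default_rng(0)
background = rng.normal(200.0, 20.0, size=500)
updated, gain = enkf_update(background, 150.0, R=100.0, rng=rng)
```

With a background variance much larger than R, the gain approaches one and the updated ensemble moves close to the observation; with a large R, the ensemble stays close to the background.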
As summarised in Section 3.2, the background model prediction is updated by using observations and relative weights based on the model prediction and observation uncertainties. In practice, both model and observation uncertainties are difficult to quantify and they have often been assumed to be spatially and temporally independent Gaussian random variables.

In this work we followed the error model adopted by the majority of previous soil moisture DA experiments, where model error is represented by reducing the main sources of uncertainty (forcing data, structure of the model and parameter errors) into two main components: uncertainties in states or fluxes of the model and uncertainties in forcing data (Reichle et al., 2002; Crow and van Loon, 2006; Reichle et al., 2008; Crow and Ryu, 2009; Kumar et al., 2009; Crow and van den Berg, 2010; Chen et al., 2011; Hain et al., 2012). In this approach, errors from all sources in the model are represented by perturbing the forcing inputs and the states of the model, i.e., both parameter and model structural errors are schematically represented by perturbing model states (and/or fluxes).

The above simplification has the drawback of not explicitly treating the parameter uncertainties, which may play a critical role in the model error representation. For example, the parameter uncertainty is important in streamflow DA experiments where there is a strong (and more direct) link between streamflow prediction and model parameters, and there is a variety of assimilation schemes formulated to deal with it (Georgakakos et al., 2004; Moradkhani et al., 2005, 2012).
However, estimating optimal error model parameters for the parameter uncertainty adds an important challenge, and sub-optimal specification of parameter uncertainty can further complicate the interpretation of results from the soil moisture state error correction scheme.

The uncertainties coming from forcing data were reproduced by perturbing precipitation data with a serially independent multiplicative error following a log-normal distribution (mean 1 and standard deviation $\sigma_p$). It was assumed that there is no autocorrelation in the rainfall error, which could be an oversimplification of the error structure. Uncertainties coming from the structure and the model parametrisation were represented by perturbing the soil moisture state of the model with a normally distributed additive error (mean 0 and standard deviation $\sigma_{sm}$).

The error model parameters ($\sigma_p$ and $\sigma_{sm}$) were calibrated by running the open-loop (perturbing the forcing and soil moisture state, but without assimilating rescaled soil moisture observations) with 500 ensemble members ($N$), and evaluating the following two discharge ensemble verification criteria:

i) If the ensemble spread is large enough, the temporal average of the ensemble skill ($sk$) should be similar to the temporal average of the ensemble spread ($sp$), i.e., $sk/sp = 1$ (De Lannoy et al., 2006; Brocca et al., 2012), where:

\[ sk = \frac{1}{T} \sum_{t=1}^{T} \left[ \overline{Q}_{sim}(t) - Q_{obs}(t) \right]^2, \tag{7} \]

\[ sp = \frac{1}{T} \sum_{t=1}^{T} \frac{1}{N} \sum_{i=1}^{N} \left( Q_{sim}(i,t) - \overline{Q}_{sim}(t) \right)^2, \tag{8} \]

with $\overline{Q}_{sim}(t)$ denoting the ensemble mean of the simulated discharge at time step $t$.

ii) If the observation is indistinguishable from a member of the ensemble, the ratio between $sk$ and the ensemble mean squared error ($mse$), normalised by $\sqrt{(N+1)/2N}$, should be equal to one (Moradkhani et al., 2005; Brocca et al., 2012), where:

\[ mse = \frac{1}{T} \sum_{t=1}^{T} \frac{1}{N} \sum_{i=1}^{N} \left( Q_{sim}(i,t) - Q_{obs}(t) \right)^2. \tag{9} \]

By using the above discharge ensemble verification criteria, we are assuming that the observed discharge has no error (or a very small error compared to the model error), which might lead to an overestimation of $\sigma_p$ and $\sigma_{sm}$.

3.4. Estimation of SSM and SWI

The SSM derived from AMSR-E was averaged over the entire catchment. In the downsampling procedure, averaged values of ascending (1:30AM local time, UTC+10) and descending (1:30PM local time, UTC+10) satellite passes were calculated for the entire catchment on days when more than 50% of the pixels containing more than 50% of their areas within the catchment had valid data. Then, anomalies of averaged ascending (AAA) and anomalies of averaged descending (AAD) datasets for the catchment were calculated by subtracting their long-term temporal means. Daily SSM was calculated as the average of AAA and AAD (if both were available) or directly as either AAA or AAD (if only one was available). Anomalies are used instead of the actual ascending and descending averaged values because there is bias between the two datasets (Brocca et al., 2011; Draper et al., 2009), which would affect the daily SSM calculation.

The observed SSM and the soil water content of PDM represent different soil layers. Rainfall-runoff models in general do not separately account for the SSM; instead they work with soil moisture storage(s) representing significantly deeper layer(s). For PDM, the depth of the soil water storage (S1 in Fig. 2) is determined by calibration, but it is broadly comparable to the root zone soil moisture. Satellite SSM, by contrast, represents the top few centimetres of soil. In order to address this mismatch, we estimated a soil wetness index (SWI) derived from the SSM product, which has been widely used to represent deeper layer soil moisture (Wagner et al., 1999; Albergel et al., 2008; Brocca et al., 2009, 2010).
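The catchment averaging and daily compositing of SSM described earlier in this section can be sketched as follows. This is only an illustration under stated assumptions: the function names, the array layout (days by pixels, with NaN marking invalid retrievals) and the prior selection of pixels with more than 50% of their area inside the catchment are hypothetical and are not taken from the study.

```python
import numpy as np

def catchment_average(pixels, min_valid_frac=0.5):
    """Average retrievals over catchment pixels for each day.

    pixels : array (n_days, n_pixels), NaN where a retrieval is invalid.
             ASSUMPTION: only pixels with more than 50% of their area
             inside the catchment have been kept beforehand.
    Days with a valid-pixel fraction not above min_valid_frac are set to NaN.
    """
    pixels = np.asarray(pixels, dtype=float)
    n_valid = (~np.isnan(pixels)).sum(axis=1)
    mean = np.nansum(pixels, axis=1) / np.maximum(n_valid, 1)
    enough = n_valid / pixels.shape[1] > min_valid_frac
    return np.where(enough, mean, np.nan)

def daily_ssm(asc_avg, desc_avg):
    """Combine ascending and descending catchment averages into daily SSM."""
    asc_avg = np.asarray(asc_avg, dtype=float)
    desc_avg = np.asarray(desc_avg, dtype=float)
    aaa = asc_avg - np.nanmean(asc_avg)    # anomalies of averaged ascending
    aad = desc_avg - np.nanmean(desc_avg)  # anomalies of averaged descending
    both = np.vstack([aaa, aad])
    count = (~np.isnan(both)).sum(axis=0)
    total = np.nansum(both, axis=0)
    # Mean of AAA and AAD when both exist, the single value when only one
    # exists, and NaN when neither pass is available.
    return np.where(count > 0, total / np.maximum(count, 1), np.nan)

# Hypothetical four-day example (volumetric soil moisture, m3/m3).
asc = np.array([0.10, 0.20, np.nan, np.nan])
desc = np.array([0.30, np.nan, 0.50, np.nan])
ssm = daily_ssm(asc, desc)
```

Working with anomalies means that the resulting daily SSM is centred on zero; the offset relative to the model state is handled afterwards by the rescaling step.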
The SWI was obtained by using the exponential filter proposed by Wagner et al. (1999), which assumes that the variation in time of the root zone soil moisture is linearly related to the difference between SSM and root zone soil moisture. The filter estimates an average of profile saturation by recursively calculating a SWI whenever a satellite SSM retrieval is available (Brocca et al., 2010):

\[ SWI(t) = SWI(t-1) + G_t \left[ SSM(t) - SWI(t-1) \right], \tag{10} \]

where $G_t$ is a gain term varying between 0 and 1 as:

\[ G_t = \frac{G_{t-1}}{G_{t-1} + e^{-\frac{t-(t-1)}{T}}}. \tag{11} \]

$T$ is a calibrated parameter representing the time scale of SWI variation, which was obtained by maximising the correlation between SWI and the soil moisture predicted by the model.

3.5. Observation rescaling and error estimation

In hydrologic DA, in order to optimally merge model predictions and observations, the systematic differences between these two data sets have to be removed. This is usually done by rescaling the observations against model predictions (Reichle and Koster, 2004; Drusch et al., 2005; Yilmaz and Crow, 2013). The rescaling approaches adopted in this work were linear regression (LR) (Crow et al., 2005) and a variation of the cumulative distribution function (CDF) matching used by Reichle and Koster (2004), which is referred to as anomaly-CDF (aCDF) matching. The aCDF matching first removes the seasonal fluctuation of the model-predicted and the observed soil moisture by calculating anomalies from the monthly mean soil moisture. CDFs of the anomalies are then matched to rescale observations to model predictions. In this work we tested the assimilation of four rescaled observations: satellite SSM rescaled via LR (i) and aCDF matching (ii), and SWI rescaled via LR (iii) and aCDF matching (iv).
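The exponential filter of Eqs. (10) and (11) and the two rescaling operators can be sketched together as below. This is an illustrative sketch rather than the implementation used in the study, and several details are assumptions: the filter is initialised with the first SSM value and a gain of one, the time difference in Eq. (11) is taken as the gap between consecutive retrievals, LR regresses the model state on the observation, CDF matching uses empirical quantile mapping, and the aCDF result is returned to model space by adding the model monthly mean. All function names are hypothetical.

```python
import numpy as np

def exponential_filter(times, ssm, T):
    """Recursive SWI (Eqs. 10-11); times and T are in the same unit (days)."""
    times = np.asarray(times, dtype=float)
    ssm = np.asarray(ssm, dtype=float)
    swi = np.empty_like(ssm)
    swi[0] = ssm[0]   # assumed initialisation
    gain = 1.0        # assumed initial gain
    for n in range(1, len(ssm)):
        gain = gain / (gain + np.exp(-(times[n] - times[n - 1]) / T))  # Eq. (11)
        swi[n] = swi[n - 1] + gain * (ssm[n] - swi[n - 1])             # Eq. (10)
    return swi

def lr_rescale(obs, model):
    """Linear regression rescaling: map obs onto model units by least squares."""
    slope, intercept = np.polyfit(obs, model, 1)
    return intercept + slope * np.asarray(obs, dtype=float)

def cdf_match(values, reference):
    """Give each value the reference quantile that shares its empirical rank."""
    ranks = np.argsort(np.argsort(values))
    quantiles = (ranks + 0.5) / len(values)
    return np.quantile(reference, quantiles)

def acdf_rescale(obs, model, months):
    """Anomaly-CDF matching: remove monthly means, then match anomaly CDFs."""
    obs = np.asarray(obs, dtype=float)
    model = np.asarray(model, dtype=float)
    obs_anom = np.empty_like(obs)
    mod_anom = np.empty_like(model)
    mod_clim = np.empty_like(model)
    for m in np.unique(months):
        sel = months == m
        obs_anom[sel] = obs[sel] - obs[sel].mean()
        mod_clim[sel] = model[sel].mean()
        mod_anom[sel] = model[sel] - mod_clim[sel]
    # ASSUMPTION: rescaled anomalies are added to the model monthly mean.
    return mod_clim + cdf_match(obs_anom, mod_anom)

# Synthetic demonstration (all values hypothetical).
rng = np.random.default_rng(42)
days = np.arange(720.0)
months = (days // 30).astype(int) % 12 + 1
model_sm = 200 + 50 * np.sin(2 * np.pi * days / 360) + rng.normal(0, 10, days.size)
ssm_obs = 0.2 + 0.001 * (model_sm - 200) + rng.normal(0, 0.02, days.size)
swi = exponential_filter(days, ssm_obs, T=5.0)
swi_lr = lr_rescale(swi, model_sm)
swi_acdf = acdf_rescale(swi, model_sm, months)
```

A real time variant would fit the same operators using only the data available up to the prediction time, refitting as new observations arrive, whereas the complete period variant fits them once on the whole record.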
Two periods for setting up the rescaling schemes were also considered: a "real time" operational scheme where only data prior to the prediction time was used and a "complete period" scheme where all the data was used. In the real time scheme (RT), the first two years of observations were used for rescaling initially, then this rescaling time window was updated to include more recent observations. In this way, rescaling is always based on all the past available data, while no 'future' information is used. The real time rescaled SSM via LR and aCDF operators are named SSM+LR(RT) and SSM+aCDF(RT), respectively. The real time rescaled SWI via LR and aCDF are named SWI+LR(RT) and SWI+aCDF(RT), respectively.

In the complete period scheme (CP), all the available observations were used: the rescaling was estimated by applying LR and aCDF to the complete time series of model predictions and satellite soil moisture observations. While this approach can capture the overall climatology of soil moisture better, it is not operationally feasible because it uses 'future' information. The complete period rescaled SSM via LR and aCDF are named SSM+LR(CP) and SSM+aCDF(CP), respectively. The complete period rescaled SWI via LR and aCDF are named SWI+LR(CP) and SWI+aCDF(CP), respectively.

The error associated with the different rescaled observations needs to be estimated for the EnKF formulation (R from eq. 6). Quantifying this uncertainty is a major challenge, especially given the lack of ground measurements of soil moisture in most areas. Moreover, it has been found that incorrect assumptions of these errors degrade the performance of the stochastic assimilation results (Reichle et al., 2008; Crow and Reichle, 2008; Crow and van den Berg, 2010).
In this study we determined an upper bound of the rescaled observation error and tested a selection of error variances within the bound. This allowed for evaluating the sensitivity of the assimilation results to different magnitudes of observation error variances. In theory, if we express a rescaled observation ($\theta^r_{obs}$) as the sum of the "true" rescaled observation ($\theta^{r,T}_{obs}$) and the rescaled observation error ($\epsilon^r_{obs}$), and we assume error orthogonality, we can express the variance of the rescaled observation as $Var(\theta^r_{obs}) = Var(\theta^{r,T}_{obs}) + Var(\epsilon^r_{obs})$, where $Var(\epsilon^r_{obs}) = R$ from eq. 6. In this way, $Var(\theta^r_{obs})$ can be considered as an upper bound of $R$. We used this upper bound to test different values of $R$: $R_1 = 0.3\,Var(\theta^r_{obs})$; $R_2 = 0.5\,Var(\theta^r_{obs})$; $R_3 = 0.7\,Var(\theta^r_{obs})$; and a fixed value of $R_4 = 3\%$ (expressed as standard deviation in units of volumetric percentage of the calibrated soil moisture storage of 396 mm). Note that $\theta^r_{obs}$ corresponds to either the rescaled SSM or the rescaled SWI, thus $Var(\theta^r_{obs})$ is the variance of the rescaled time series.

3.6. Evaluation of data assimilation results

The evaluation of the different DA experiments was based on the normalised root mean square difference (NRMSD). The NRMSD was calculated as the ratio of the mean root mean square difference (MRMSD) between the updated discharge ensemble members ($Q^{up}_{sim}$) and the observed discharge to the MRMSD between the open loop (ensemble discharge prediction without assimilation, $Q^{ol}_{sim}$) and the observed discharge:

\[ NRMSD = \frac{\sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( Q^{up}_{sim}(i,t) - Q_{obs}(t) \right)^2}}{\sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( Q^{ol}_{sim}(i,t) - Q_{obs}(t) \right)^2}}, \tag{12} \]

where $N$ is the number of ensemble members (in these experiments, $N = 1000$).
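A minimal sketch of the NRMSD evaluation is given below. It follows the verbal definition above (a ratio of two MRMSD values) and assumes that each member's root mean square difference is computed over the evaluation period before averaging across members; that ordering, the array layout and the function names are assumptions for illustration, not a statement of how the study computed the metric.

```python
import numpy as np

def mrmsd(q_ens, q_obs):
    """Mean over ensemble members of each member's RMSD against observations.

    q_ens : array (N, T) of simulated discharge, one row per member
    q_obs : array (T,) of observed discharge
    """
    q_ens = np.asarray(q_ens, dtype=float)
    q_obs = np.asarray(q_obs, dtype=float)
    rmsd_per_member = np.sqrt(np.mean((q_ens - q_obs) ** 2, axis=1))
    return rmsd_per_member.mean()

def nrmsd(q_updated, q_open_loop, q_obs):
    """Ratio of the updated-ensemble MRMSD to the open-loop MRMSD."""
    return mrmsd(q_updated, q_obs) / mrmsd(q_open_loop, q_obs)

# Hypothetical two-member, three-day example.
q_obs = np.array([1.0, 2.0, 3.0])
q_ol = np.array([[2.0, 3.0, 4.0], [0.0, 1.0, 2.0]])
q_up = np.array([[1.5, 2.5, 3.5], [0.5, 1.5, 2.5]])
score = nrmsd(q_up, q_ol, q_obs)
```

Values below one indicate that the assimilation reduced the ensemble error relative to the open loop, and values above one indicate degradation.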
The NRMSDs from the different assimilation experiments were evaluated and compared based on four main factors: 1) the different observation error variances considered, 2) the real time and the complete period approaches used for rescaling, 3) the different rescaling techniques used (LR and aCDF), and 4) the different products assimilated (SSM or SWI).

4. Results and discussion

4.1. Model calibration

The common period of rainfall and discharge records for the catchment is 1967-2013. The period 1967-2001 was used for calibration and 2002-2013 for verification. The NS model efficiencies for the calibration and verification periods were 0.53 and 0.69, respectively. Fig.3 presents the simulated and observed discharge time series for the catchment and the daily flow duration curve. These results reveal that the calibrated model underestimates some of the observed peak flows and consistently overestimates low and zero flows. One of the key factors responsible for the large prediction error is the low density of rainfall gauges within the catchment, which results in low quality gridded rainfall data for the area. Another key factor to consider is the model structural error coming from a deficient representation of the runoff generation processes within the catchment. As we run a lumped model for the entire catchment, the spatial heterogeneity of rainfall over the catchment is ignored and the runoff mechanisms are represented by a single combination of storage depths for the entire area. The lumped model structure is therefore another main source of uncertainty and an important limitation of the selected scheme. Moreover, neglecting rainfall spatial heterogeneity probably contributes to the systematic underestimation of peak flows.
Given the semi-arid climate of the catchment and the dominance of surface runoff in the total streamflow (the baseflow component is negligible), the initial rainfall-runoff generation process is likely dominated by the antecedent soil water content of the catchment. There are a number of rainfall events that did not result in measurable discharge during the study period because the catchment did not reach the wetness threshold needed for runoff generation. One potential issue is whether or not the model structure and conceptualisation are able to correctly represent this saturation excess runoff process.

To assess the impact of catchment wetness (soil moisture content) on the runoff generation behaviour, and to establish whether there is a threshold hydrological response, satellite SSM observations can be compared to the discharge at the catchment outlet (Brocca et al., 2013). In this work, both simulated and observed daily runoff ratios (calculated for daily rainfall over 1 mm) were compared with the modelled soil moisture content of the catchment (at the previous time step) and with the satellite SSM observed in the catchment (at the previous time step). These results are presented in Fig.4 and show that both the observed and simulated runoff start, and continuously increase, when the modelled soil moisture has reached a value of approximately 60 mm and the observed satellite SSM has reached 0.1 vol/vol. The highly scattered nature of these plots is similar to the findings of Brocca et al. (2013), in which the semi-arid catchment featured a more scattered relationship than the more humid catchments.
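The pairing of daily runoff ratios with antecedent wetness described above can be sketched as follows. This is a minimal illustration, not the analysis code of the study: the function names, the `rr_min` tolerance, and the rule used to read off a threshold (the lowest antecedent wetness at which a non-negligible runoff ratio occurs) are assumptions introduced here. The same helper can be applied to modelled soil moisture (mm) or to satellite SSM (vol/vol).

```python
import numpy as np


def runoff_ratio_vs_antecedent(rain, discharge, wetness, min_rain=1.0):
    """Pair daily runoff ratios with the wetness of the previous time step.

    rain, discharge : daily series in mm
    wetness         : daily soil moisture series (model storage or satellite SSM)
    min_rain        : only days with rainfall above this depth (mm) are used
    """
    rain = np.asarray(rain, dtype=float)
    discharge = np.asarray(discharge, dtype=float)
    wetness = np.asarray(wetness, dtype=float)
    # Day t is paired with wetness at t-1, so the first day has no antecedent value.
    wet_day = rain[1:] > min_rain
    runoff_ratio = discharge[1:][wet_day] / rain[1:][wet_day]
    antecedent = wetness[:-1][wet_day]
    return antecedent, runoff_ratio


def first_runoff_threshold(antecedent, runoff_ratio, rr_min=0.01):
    """Lowest antecedent wetness with a runoff ratio above rr_min (hypothetical rule)."""
    responding = antecedent[runoff_ratio > rr_min]
    return responding.min() if responding.size else np.nan
```

Plotting `runoff_ratio` against `antecedent` gives the kind of scatter shown in Fig.4; the threshold helper is only a crude numerical stand-in for reading the threshold from such a plot.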
The non-linear relation between runoff (modelled and observed) and soil moisture (modelled and observed), as well as the identified threshold values, suggest that antecedent soil moisture exerts an important control on the runoff generation mechanisms. Despite this critical role, the effect of antecedent soil moisture content on flood generation is less apparent once it exceeds the threshold values for runoff generation. In Fig.4, the runoff ratios (for both modelled and observed cases) do not exhibit strong correlation with soil moisture content (modelled and observed) beyond the threshold values. To confirm this behaviour, we set up a series of synthetic experiments (presented in Alvarez-Garreton et al. (2013b), but not shown here) in which input rainfall and model soil moisture were perturbed with a range of different noise levels, using the error model specifications from Section 3.3. Standard deviations varied between 0.1-0.6% for the rainfall error parameter ($\sigma_{p}$) and 0.01-0.06% for the soil moisture error parameter ($\sigma_{sm}$). Discharge error showed a similar dependency on soil moisture and rainfall errors for the low ranges of $\sigma_{p}$ and $\sigma_{sm}$. However, when both rainfall and soil moisture errors became large (which resulted in large flood events given the multiplicative nature of rainfall error), discharge error was considerably more affected by rainfall. These results are consistent with our runoff generation analysis, and suggest that when the catchment is (relatively) dry, the combination of soil moisture and rainfall dominates the runoff generation processes. Once the catchment has reached the wetness required for runoff generation, however, variation in discharge (and thus the error in discharge), which is mainly dominated by surface runoff, is predominantly affected by rainfall (and thus the error in rainfall).
Factors such as the lack of ground data for improving rainfall representation over the catchment, and the expected accuracy of satellite soil moisture retrievals (given the low vegetation cover within the catchment), provide potentially favourable conditions for achieving significant forecast improvements using a state correcting DA framework. However, given the runoff generation processes within the catchment, we expect that the improvement in skill from soil moisture DA would decrease for large floods. Moreover, the benefit of state correction via DA may be marginal when the model prediction contains large bias after calibration, because rescaling observations can transfer the bias to the observations (Pauwels et al., 2013).

Figure 3: a) Hydrograph of observed and predicted discharge. The dashed black line indicates the end of calibration and beginning of verification period. b) Observed and modelled daily flow duration curve.

Figure 4: Observed daily runoff ratio (Obs RR) and simulated daily runoff ratio (Model RR) plotted against a) simulated soil moisture and b) satellite SSM. The dashed lines in panels a) and b) correspond to the identified threshold values.

4.2. Error model parameter calibration

For the calibration of the error model parameters we used the genetic algorithm (Chipperfield and Fleming, 1995) with ranges for $\sigma_{p}$ and $\sigma_{sm}$ of 0.1-0.6 and 0.01-0.06 (% of the soil moisture storage), respectively. It should be noted that since the discharge error, as represented here, comes from two sources (rainfall and state uncertainties), a compensatory behaviour between the two sources in the final discharge uncertainty is expected. In other words, there may be multiple pairs of rainfall and soil moisture errors that result in similar, if not identical, discharge error (also known as equifinality in the set of parameters). Moreover, since we ran a multi-objective calibration, we obtained a set of Pareto optimal solutions, representing the trade-off in optimising the two different verification criteria (Gupta et al., 1998). Among the set of Pareto solutions, we selected the pair of error parameters for which both objective functions had less than 20% of error. The selected error parameters were $\sigma_{p} = 0.37$ and $\sigma_{sm} = 0.032$, which resulted in $sk/sp = 1.18$ and $(sk/mse)\left(\sqrt{(N+1)/2N}\right)^{-1} = 0.78$.

4.3. Rescaled SSM and SWI

Table 1: Characteristics of rescaled soil moisture observations: correlation coefficient with the modelled soil moisture (r), standard deviation of the associated residuals (std), and mean bias (bias) for the entire period.
                 r      std (mm)   bias (mm)
SSM+LR(RT)       0.66   32          6
SSM+LR(CP)       0.64   33          0
SSM+aCDF(RT)     0.76   29          4
SSM+aCDF(CP)     0.76   26          0
SWI+LR(RT)       0.78   26          2
SWI+LR(CP)       0.78   26          0
SWI+aCDF(RT)     0.94   16         -2
SWI+aCDF(CP)     0.94   15          0

Rescaled SSM and SWI, for the real time and the complete period approaches, were estimated based on the methodology described in Section 3.5. The T parameter that maximised the correlation between SWI and the modelled soil moisture, which roughly represents a soil layer depth of 180 cm (by assuming a porosity of 0.46, taken from the A-horizon information reported in McKenzie et al. (2000), and the model soil moisture storage of 396 mm), was 9 days. This T value is within the range of the optimal values obtained by Albergel et al. (2008) for shallower soils (30 cm depth with T between 1 and 23 days), similar to the value found by Brocca et al. (2009) for a thinner layer of soil (depth of 15 cm and T of 10 days, for one of the study fields), and slightly lower than the values obtained by Wagner et al. (1999) for depths of 20 and 100 cm (T equal to 15 and 20 days, respectively). This suggests that the study catchment reacts faster than those in the latter studies, which is reasonable considering the semi-arid climatic condition and the surface runoff dominant mechanism of the study area.

Evaluation metrics of rescaled observations, including the correlation coefficient with the modelled soil moisture (r), the standard deviation of the associated residuals (std), and the mean bias (bias) for the entire period are presented in Table 1. These results reveal strong correlation between the modelled soil moisture and the rescaled observations for all cases, with correlation coefficients of 0.64 or greater. Non-zero bias values in Table 1 are the result of the real time rescaling scheme, which uses only information prior to the current observation to set up the rescaling.
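The selection of T discussed above can be illustrated with a short sketch. It assumes that SWI is obtained with the recursive form of the exponential filter given by Albergel et al. (2008); whether Section 3.5 uses exactly this form is an assumption here, and the function names `swi_exponential_filter` and `best_T` and the list of candidate T values are hypothetical.

```python
import numpy as np


def swi_exponential_filter(t_days, ssm, T):
    """Recursive exponential filter (after Albergel et al., 2008).

    t_days : observation times in days (increasing)
    ssm    : surface soil moisture observations
    T      : characteristic time length in days
    """
    t_days = np.asarray(t_days, dtype=float)
    ssm = np.asarray(ssm, dtype=float)
    swi = np.empty_like(ssm)
    swi[0] = ssm[0]  # initialise with the first observation
    gain = 1.0
    for n in range(1, len(ssm)):
        # The gain shrinks for densely sampled data and grows after long gaps.
        gain = gain / (gain + np.exp(-(t_days[n] - t_days[n - 1]) / T))
        swi[n] = swi[n - 1] + gain * (ssm[n] - swi[n - 1])
    return swi


def best_T(t_days, ssm, model_sm, candidates):
    """Candidate T giving the highest correlation between SWI and model soil moisture."""
    scores = [
        np.corrcoef(swi_exponential_filter(t_days, ssm, T), model_sm)[0, 1]
        for T in candidates
    ]
    return candidates[int(np.argmax(scores))]
```

A call such as `best_T(t_days, ssm, model_sm, list(range(1, 31)))` would scan T from 1 to 30 days. A larger T produces a smoother SWI series, which is the usual interpretation of T as a proxy for the depth and response time of the represented soil layer.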
To visualise some of the rescaled results, Fig.5 shows the time series of real time rescaled products and Fig.6 presents scatter plots against modelled soil moisture. Rescaled SWI has higher r and lower std with the modelled soil moisture, compared to rescaled SSM, which suggests a better representation of the root zone soil moisture dynamics. This can also be seen in the smoother time series of rescaled SWI compared to the rescaled satellite observations (Fig.5) and is consistent with Brocca et al. (2010). High r and low std of residuals, however, do not necessarily mean better rescaled observations and a more optimal Kalman update. Yilmaz and Crow (2013) evaluated LR and CDF-matching techniques based on their skill in merging model predictions and observations to create an analysis product with lower uncertainties (which is the primary goal of DA). They found that both LR and CDF-matching can give suboptimal solutions when model predictions and observations have correlated errors, which violates the orthogonality assumption between their errors. This means that, when the high correlation between the model soil moisture and the rescaled observation originates from correlated errors between them, the resulting Kalman update can be worse than that obtained for pairs with lower correlation but independent errors.

Figure 5: a) Rescaled SSM using the real time approach (rescaling is fit with data prior to the observation time). b) Rescaled SWI using real time approach.

Figure 6: The relationship between the simulated soil moisture and a) rescaled SSM and b) rescaled SWI. Rescaling has been done using the real time approach (rescaling is fit with data prior to the observation time).

To test the sensitivity of the DA results to different estimates of the observation error variance (R), four R values, corresponding to 0.3, 0.5 and 0.7 times the rescaled soil moisture variance and a fixed value of 3%, were tried in the DA. These R values, expressed as standard deviation in units of volumetric percentage of the calibrated soil moisture storage (396 mm), are summarised in Table 2.

Table 2: Rescaled soil moisture observation error standard deviation scenarios, expressed as percentage of the model storage capacity (396 mm).

             R1    R2    R3    R4
SSM+LR       3.8   4.8   5.7   3.0
SSM+aCDF     5.9   7.7   9.0   3.0
SWI+LR       4.6   5.9   7.0   3.0
SWI+aCDF     5.9   7.6   9.0   3.0

4.4. Data assimilation results

Considering the main objective of this work (to test the skill of assimilating satellite soil moisture for improving flood predictions), major flood events were selected for evaluation, in addition to the complete period. Moderate floods were also considered to evaluate the experiments in conditions where soil moisture is assumed to play a more important role in runoff generation compared to major flood conditions (see Section 4.1). The selection of major and moderate floods was made by using all events with a peak discharge greater than the corresponding threshold values provided by the Bureau of Meteorology (http://www.bom.gov.au/qld/flood/brochures/warrego) for the Warrego at Cunnamulla and the Warrego at Wyandra gauges (see Fig.1). The Bureau of Meteorology’s classification is based on the historical damage produced by flooding.
Flood events were included in the classification if they were moderate or major at either one or both gauges. This resulted in discharge thresholds at the Warrego at Wyandra gauge of 0.55 and 2.05 mm for moderate and major floods, respectively, which yielded three major floods and four moderate floods for the 2001-2012 period.

4.4.1. Effects of observation error assumptions in DA results

We evaluated the sensitivity of the DA results to the observation error variances by comparing the assimilation results using each of the four assumed R values (see Table 2). Fig.7 presents the NRMSD after assimilation of the different rescaled products using the real time rescaling (RT) approach, for the total period of evaluation (2001-2012), the three major and the four moderate flood events. The NRMSD values in Fig.7 are plotted with their 95% confidence bounds. These results reveal that the performance of the assimilation scheme depends mainly on the rescaled observation used and that the sensitivity to the assumed rescaled observation error variance R is not noticeable. The NRMSD calculated with the updated ensemble after the assimilation of rescaled SSM and SWI using the complete period (CP) rescaling approach shows a similar behaviour with respect to the assumed R (not shown here). These findings differ from previous studies in which incorrect observation error assumptions led to significant degradation of the assimilation results (Crow and van Loon, 2006; Crow and Reichle, 2008; Crow and van den Berg, 2010; Reichle et al., 2008); however, those studies focused their evaluation on soil moisture prediction and not on streamflow prediction, as this work does. When evaluating results in terms of streamflow prediction improvement, Alvarez-Garreton et al.
(2013a) found that different assumed observation error structures did not significantly affect the assimilation results. The reduced sensitivity to the observation error specification can be explained by various factors including: 1) the errors in the model discharge prediction limiting the sensitivity of the DA to observation errors; 2) the weak relation between the errors in soil water states and discharge; and 3) inherent sub-optimality of the rescaling procedures adopted in this work (Yilmaz and Crow, 2013) that may not meet the orthogonality assumed for estimating the upper bound of R, thus the observation errors tested may be covering too small a fraction of the R space.

4.4.2. Effects of different rescaling in DA results

An important finding to highlight from the assimilation experiments is that both the RT and the CP rescaling approaches yielded similar NRMSD (with differences not statistically significant at the 5% level). This was true for the total period and also the major and moderate floods evaluated (NRMSD for the complete period rescaled observations are not plotted because they are very similar to the graphs in Fig.7).
Similar performance of RT and CP rescaling further supports the conclusions of Reichle and Koster (2004), and suggests that the short time training window used in the RT rescaling approach (2 years in this case) is sufficient to remove systematic biases in satellite soil moisture retrievals and to provide rescaled observations which can effectively improve model predictions in a DA scheme.

Figure 7: NRMSD for a) the complete period of evaluation, b-d) the three major floods, and e-h) the four moderate floods after assimilation of the four rescaled products (SSM+LR, SSM+aCDF, SWI+LR and SWI+aCDF), using the real time rescaling approach, and the four observation error variances (R1, R2, R3 and R4).

If we evaluate the assimilation experiments based on the rescaling scheme used (LR and aCDF), we find consistently better performance of aCDF-rescaled products compared with LR-rescaled products, for the total period of evaluation and the three major floods (Fig.7, panels a, b, c and d). The results for the moderate floods do not show significant differences. The better performance of assimilating aCDF-rescaled observations can be explained by the fact that the LR-based rescaling smears out a small number of extreme values, resulting in underestimated (rescaled) peak soil moisture values (see Fig.6).
This prevents the assimilation scheme from taking advantage of these high observed (and perhaps true) antecedent wetness values, especially before floods. The aCDF matching approach, by contrast, can keep the information of the extreme observations by assuming a non-linear relationship between the datasets and by accounting for possible seasonality effects.

4.4.3. Effects of soil moisture product used in DA results

To evaluate and interpret the assimilation results based on the different products assimilated (SSM and SWI), it is worth highlighting the runoff mechanisms within the study catchment. The main component of the total runoff in the semi-arid study catchment is quick (surface) runoff, which occurs only immediately after rainfall events. Thus, while the overall pattern of SWI is closer to the dynamics of the soil moisture store (S1) of PDM (see Figs. 5 and 6), which in turn determines the dynamics of baseflow, the actual contribution of baseflow to the runoff generation is trivial. In other words, the total runoff mainly depends on the values of S1 immediately prior to rainfall events. Therefore, even when SWI shows a better overall agreement with S1, for the specific times (or events selected in this research), SSM may provide similar information to represent surface runoff.

If we evaluate the performance of the filter when a linear regression scheme is used to rescale SSM and SWI, Fig.7 indicates that assimilating SSM+LR and SWI+LR leads to similar NRMSD in all the events analysed (with differences not statistically significant at the 5% level), except for major flood 1. In the case of the first major flood (Fig.7, panel b), the assimilation of SSM leads to an improvement of the open loop, while assimilating SWI leads to degradation. The same relations are found when comparing the assimilation of SSM+aCDF and SWI+aCDF, except during moderate flood 1 (Fig.7, panel e), in which the assimilation of SWI+aCDF yielded greater improvement than SSM+aCDF.
As pointed out above, the different results found in the assimilation of SSM and SWI during major and moderate floods are event specific and, given the small number of events, it is difficult to generalise. It is expected, however, that the assimilation of SSM and SWI would lead to different performance if the objective was to forecast low flows instead of floods. In this case, the better representation of deep layer soil moisture provided by SWI may result in more valuable information to correct S1 and improve runoff prediction.

In summary, the assimilation of SSM and SWI derived from satellite observations in general reduces the uncertainty of model streamflow predictions by around 25% (Fig.7, panel a). However, for major flood events, improvement was achieved only for some combinations of the assimilated product and rescaling approach (Fig.7, panels b, c and d). During moderate events, on the other hand (Fig.7, panels e, f, g and h), the performance of the filter was less dependent on the different soil moisture products and the model discharge prediction was improved by approximately 40% (except for moderate flood 2, where there was neither improvement nor degradation of the model). To visualise the effects of the assimilation on the streamflow ensemble prediction, Figs. 8 and 9 present the major and moderate floods before and after assimilation of the SSM+aCDF product. These graphs show that the updated model discharge prediction is more accurate compared to the open loop.

5. Conclusions

Coarse resolution satellite soil moisture products are globally available at a regular temporal resolution and their reliability has been widely validated in recent years (e.g., Albergel et al., 2012; Su et al., 2013).
The assimilation of these products into rainfall-runoff modelling, however, has been implemented only in a few studies (e.g., Brocca et al., 2010; Meier et al., 2011; Brocca et al., 2012; Chen et al., 2014; Wanders et al., 2014), despite the potential positive impacts of a better soil moisture representation on streamflow prediction. Addressing this research gap, this paper presents an evaluation of satellite soil moisture DA into a conceptual rainfall-runoff model (PDM) for the purpose of reducing flood prediction uncertainty in an operational context. We compare the assimilation of surface soil moisture (SSM) and of a soil wetness index (SWI), both derived from AMSR-E. We explore different aspects of the assimilation framework, including various rescaling options, the impact of observation uncertainty and a systematic approach to model error characterisation.

Figure 8: Major flood prediction before (open loop) and after assimilating SSM+aCDF(RT) (updated). Rescaling has been done using the real time approach.

Figure 9: Moderate flood prediction before (open loop) and after assimilating SSM+aCDF(RT) (updated). Rescaling has been done using the real time approach.

The study is undertaken in the semi-arid Warrego River Basin in south central Queensland, Australia.
In general, the stochastic assimilation improves the prediction for both moderate and major floods; however, the improvement is higher and more consistent for moderate events. This is in agreement with the lower dependence of runoff generation on antecedent soil moisture in large events. We also show that the efficacy of the stochastic assimilation depends more on the rescaling technique than on the observation error specification. The low sensitivity to the observation error may be due to the large systematic errors in the model. These systematic errors can be explained by model structural errors or by the poor calibration resulting from the lack of a dense rain gauge network. Given that stochastic assimilation is designed to correct stochastic errors, the model systematic errors are not addressed; thus, the performance of the assimilation becomes marginal and less sensitive to the specified observation error. Moreover, if the error orthogonality assumption made for estimating the upper bound of the rescaled observation error R is not met, this upper bound may be too small and therefore the different error values tested may be covering a small range of possible R.

We addressed various options for processing the satellite soil moisture prior to assimilation. We found that removing systematic biases between the model and the satellite observations using real time rescaling (with a 2 year training window) and complete period rescaling approaches led to similar results. This suggests that a short period of training is sufficient to remove bias and effectively use the observations to improve model prediction. The latter has positive implications for the effective use of short-time records from current and near-future soil moisture satellite missions for improving rainfall-runoff streamflow predictions.
We tested two different techniques for rescaling, linear regression (LR) and anomaly-based cumulative distribution function matching (aCDF), which assumes a non-linear relationship between datasets and accounts for seasonal effects. We found that the assimilation of aCDF-rescaled observations performed consistently better than assimilating LR-rescaled observations. However, given the known drawback of the rescaling techniques used here (Yilmaz and Crow, 2013), the feasibility of implementing a triple collocation rescaling (that requires three mutually independent datasets with sufficient temporal length to accurately estimate the error characteristics) should be explored in future work.

We also explored the impacts of assimilating SSM and SWI derived from the satellite observations and found that, given the specific runoff mechanisms within the catchment (semi-arid catchment with rapid response to rainfall events, dominated by surface runoff), the better representation of deep layer soil moisture provided by the SWI does not guarantee better assimilation results during moderate and major flood events. In fact, the assimilation of SSM and SWI led to similar model streamflow prediction improvement for the complete period of evaluation and for most of the flood events analysed. However, the small number of events precludes definitive conclusions about which of the two is more suitable to improve flood prediction in this catchment.

Regarding the forcing and model (structure and parameter) errors characterisation, it was assumed that all these sources of error were represented by perturbing rainfall and soil moisture state. The parameters of these perturbations were calibrated by applying two discharge ensemble verification criteria.
The observed discharge error was discarded in the formulation. The assumption of having small observed discharge error compared to the model error is strong (especially for flood events) and can lead to severe overestimation of the errors in the model. Future work should explore other techniques for calibrating the error parameters that include the errors in observed discharge and explicitly treat the uncertainties in the model parameters.

We highlight that a limitation of this work, which directly affects the skill of the assimilation scheme, is the large bias in streamflow prediction for individual flood events. As mentioned above, the bias found in the model streamflow predictions comes mainly from errors in forcing data, PDM structure, the lumped schematisation and the model parameters. In order to examine the advantages of assimilating satellite-based soil moisture into a spatially distributed catchment setup, extension of the state-updating scheme to a semi-distributed catchment system is currently in progress.

In addition to improving the spatial representation of the model, there are tasks remaining to refine the presented data assimilation scheme: 1) To improve the representation and estimation of the model error by explicitly treating parameter uncertainty and accounting for the streamflow observation error; 2) To explore physically based methods, based on Richards' equation (Richards, 1931), to transfer the surface satellite information into deeper layer soil moisture; and 3) To explore TC-based procedures to rescale the satellite observations. Lastly, more real-case studies are needed to build evidence and understanding of the improvement skill of satellite soil moisture assimilation into rainfall-runoff models over a wide range of catchment characteristics (catchment size, climatic conditions, dominant runoff mechanisms, ground data network availability, etc.).

Although the results presented here are site specific, they provide novel evidence of the advantages of assimilating satellite-based soil moisture observations for improving flood prediction. Our findings imply that proper pre-processing of observed soil moisture is critical for the efficacy of the data assimilation and that its performance is affected by the quality of model calibration. With this, we are contributing to build knowledge and understanding that can lead us towards an optimal DA framework.

Acknowledgments

This research was conducted with financial support from the Australian Research Council (ARC Linkage Project No. LP110200520) and the Bureau of Meteorology, Australia. We gratefully acknowledge the advice and data provision of Chris Leahy and Soori Sooriyakumaran from the Bureau of Meteorology, Australia.

References

Albergel, C., de Rosnay, P., Gruhier, C., Muñoz-Sabater, J., Hasenauer, S., Isaksen, L., Kerr, Y., Wagner, W., 2012. Evaluation of remotely sensed and modelled soil moisture products using global ground-based in situ observations. Remote Sensing of Environment 118, 215–226.
Albergel, C., Rüdiger, C., Pellarin, T., Calvet, J.C., Fritz, N., Froissard, F., Suquia, D., Petitpa, A., Piguet, B., Martin, E., et al., 2008. From near-surface to root-zone soil moisture using an exponential filter: an assessment of the method based on in-situ observations and model simulations. Hydrology and Earth System Sciences Discussions 12, 1323–1337.
Alvarez-Garreton, C., Ryu, D., Western, A.W., Crow, W.T., Robertson, D.E., 2013a. Impact of observation error structure on satellite soil moisture assimilation into a rainfall-runoff model, in: MODSIM, 20th International Congress on Modelling and Simulation. Modelling and Simulation Society of Australia and New Zealand, pp. 3071–3077.
Alvarez-Garreton, C., Ryu, D., Western, A.W., Robertson, D.E., Crow, W.T., Leahy, C., 2013b. Effects of forcing uncertainties in the improvement skills of assimilating satellite soil moisture retrievals into flood forecasting models, in: Piantadosi, J., Anderssen, R., Boland, J. (Eds.), IGARSS, International Geoscience and Remote Sensing Symposium, p. WE2.T03.4.
Barre, H., Duesmann, B., Kerr, Y.H., 2008. SMOS: The mission and the system. Geoscience and Remote Sensing, IEEE Transactions on 46, 587–593.
Beven, K.J., 2011. Rainfall-runoff modelling: the primer. Wiley.com.
Brocca, L., Hasenauer, S., Lacava, T., Melone, F., Moramarco, T., Wagner, W., Dorigo, W., Matgen, P., Martínez-Fernández, J., Llorens, P., et al., 2011. Soil moisture estimation through ASCAT and AMSR-E sensors: an intercomparison and validation study across Europe. Remote Sensing of Environment 115, 3390–3408.
Brocca, L., Melone, F., Moramarco, T., Morbidelli, R., 2009. Antecedent wetness conditions based on ERS scatterometer data. Journal of Hydrology 364, 73–87.
Brocca, L., Melone, F., Moramarco, T., Penna, D., Borga, M., Matgen, P., Gumuzzio, A., Martinez-Fernández, J., Wagner, W., 2013. Detecting threshold hydrological response through satellite soil moisture data. Die Bodenkultur 64, 7–12.
Brocca, L., Melone, F., Moramarco, T., Wagner, W., Naeimi, V., Bartalis, Z., Hasenauer, S., 2010. Improving runoff prediction through the assimilation of the ASCAT soil moisture product. Hydrology and Earth System Sciences 14, 1881–1893.
Brocca, L., Moramarco, T., Melone, F., Wagner, W., Hasenauer, S., Hahn, S., 2012. Assimilation of surface- and root-zone ASCAT soil moisture products into rainfall–runoff modeling. Geoscience and Remote Sensing, IEEE Transactions on 50, 2542–2555.
Chen, F., Crow, W., Ryu, D., 2014. Dual forcing and state correction via soil moisture assimilation for improved rainfall-runoff modeling. Journal of Hydrometeorology.
Chen, F., Crow, W.T., Starks, P.J., Moriasi, D.N., 2011. Improving hydrologic predictions of a catchment model via assimilation of surface soil moisture. Advances in Water Resources 34, 526–536.
Chiew, F., Pitman, A., McMahon, T., 1996. Conceptual catchment scale rainfall-runoff models and AGCM land-surface parameterisation schemes. Journal of Hydrology 179, 137–157.
Chipperfield, A., Fleming, P., 1995. The MATLAB genetic algorithm toolbox, in: Applied Control Techniques Using MATLAB, IEE Colloquium on, pp. 10/1–10/4.
Crow, W., Van den Berg, M., 2010. An improved approach for estimating observation and model error parameters in soil moisture data assimilation. Water Resources Research 46.
Crow, W., Bindlish, R., Jackson, T., 2005. The added value of spaceborne passive microwave soil moisture retrievals for forecasting rainfall-runoff partitioning. Geophysical Research Letters 32, L18401.
Crow, W.T., van den Berg, M.J., 2010. An improved approach for estimating observation and model error parameters in soil moisture data assimilation. Water Resources Research 46, W12519.
Crow, W.T., van Loon, E., 2006. Impact of incorrect model error assumptions on the sequential assimilation of remotely sensed surface soil moisture. Journal of Hydrometeorology 7, 421–432.
Crow, W.T., Reichle, R.H., 2008. Comparison of adaptive filtering techniques for land surface data assimilation. Water Resources Research 44, W08423.
Crow, W.T., Ryu, D., 2009. A new data assimilation approach for improving runoff prediction using remotely-sensed soil moisture retrievals.
Hydrology & Earth System Sciences 13, 1–16.
De Lannoy, G.J., Houser, P.R., Pauwels, V., Verhoest, N.E., 2006. Assessment of model uncertainty for soil moisture through ensemble verification. Journal of Geophysical Research: Atmospheres (1984–2012) 111.
Draper, C.S., Walker, J.P., Steinle, P.J., de Jeu, R.A., Holmes, T.R., 2009. An evaluation of AMSR-E derived soil moisture over Australia. Remote Sensing of Environment 113, 703–710.
Drusch, M., Wood, E., Gao, H., 2005. Observation operators for the direct assimilation of TRMM microwave imager retrieved soil moisture. Geophysical Research Letters 32.
Entekhabi, D., Njoku, E.G., O'Neill, P.E., Kellogg, K.H., Crow, W.T., Edelstein, W.N., Entin, J.K., Goodman, S.D., Jackson, T.J., Johnson, J., et al., 2010. The soil moisture active passive (SMAP) mission. Proceedings of the IEEE 98, 704–716.
Francois, C., Quesney, A., Ottlé, C., 2003. Sequential assimilation of ERS-1 SAR data into a coupled land surface-hydrological model using an extended Kalman filter. Journal of Hydrometeorology 4, 473–487.
Georgakakos, K.P., Seo, D.J., Gupta, H., Schaake, J., Butts, M.B., 2004. Towards the characterization of streamflow simulation uncertainty through multimodel ensembles. Journal of Hydrology 298, 222–241.
Gupta, H.V., Sorooshian, S., Yapo, P.O., 1998. Toward improved calibration of hydrologic models: Multiple and noncommensurable measures of information. Water Resources Research 34, 751–763.
Hain, C.R., Crow, W.T., Anderson, M.C., Mecikalski, J.R., 2012. An ensemble Kalman filter dual assimilation of thermal infrared and microwave satellite observations of soil moisture into the Noah land surface model.
Water Resources Research 48.
Han, E., Merwade, V., Heathman, G.C., 2012. Implementation of surface soil moisture data assimilation with watershed scale distributed hydrological model. Journal of Hydrology 416, 98–117.
Jones, D.A., Wang, W., Fawcett, R., 2009. High-quality spatial climate data-sets for Australia. Australian Meteorological and Oceanographic Journal 58, 233.
Kumar, S.V., Reichle, R.H., Koster, R.D., Crow, W.T., Peters-Lidard, C.D., 2009. Role of subsurface physics in the assimilation of surface soil moisture observations. Journal of Hydrometeorology 10.
Lee, H., Seo, D.J., Koren, V., 2011. Assimilation of streamflow and in situ soil moisture data into operational distributed hydrologic models: Effects of uncertainties in the data and initial model soil moisture states. Advances in Water Resources 34, 1597–1615.
Liu, Y.Q., Gupta, H.V., 2007. Uncertainty in hydrologic modeling: Toward an integrated data assimilation framework. Water Resources Research 43.
Loague, K., 2010. Rainfall-runoff modelling, in: McDonnell, J. (Ed.), IAHS Benchmark Papers in Hydrology No 4. IAHS Press, Wallingford, U.K. volume 4, p. 506.
McKenzie, N.J., Jacquier, D., Ashton, L., Cresswell, H., 2000. Estimation of soil properties using the Atlas of Australian Soils. CSIRO Land and Water Canberra.
Meier, P., Frömelt, A., Kinzelbach, W., 2011. Hydrological realtime modelling in the Zambezi river basin using satellite-based soil moisture and rainfall data. Hydrology and Earth System Sciences 15, 999–1008.
Mirus, B.B., Loague, K., 2013. How runoff begins (and ends): Characterizing hydrologic response at the catchment scale. Water Resources Research 49, 2987–3006.
Moore, R.J., 2007. The PDM rainfall-runoff model. Hydrology & Earth System Sciences 11, 483–499.
Moradkhani, H., DeChant, C.M., Sorooshian, S., 2012. Evolution of ensemble data assimilation for uncertainty quantification using the particle filter-Markov chain Monte Carlo method. Water Resources Research 48.
Moradkhani, H., Sorooshian, S., Gupta, H., Houser, P., 2005. Dual state–parameter estimation of hydrological models using ensemble Kalman filter. Advances in Water Resources 28, 135–147.
Nash, J., Sutcliffe, J., 1970. River flow forecasting through conceptual models part I: a discussion of principles. Journal of Hydrology 10, 282–290.
Owe, M., de Jeu, R., Holmes, T., 2008. Multisensor historical climatology of satellite-derived global land surface moisture. Journal of Geophysical Research: Earth Surface (2003–2012) 113.
Pauwels, V., Hoeben, R., Verhoest, N.E., De Troch, F.P., 2001. The importance of the spatial patterns of remotely sensed soil moisture in the improvement of discharge predictions for small-scale basins through data assimilation. Journal of Hydrology 251, 88–102.
Pauwels, V., Hoeben, R., Verhoest, N.E., De Troch, F.P., Troch, P.A., 2002. Improvement of TOPLATS-based discharge predictions through assimilation of ERS-based remotely sensed soil moisture values. Hydrological Processes 16, 995–1013.
Pauwels, V.R., De Lannoy, G.J., Franssen, H.J.H., Vereecken, H., 2013. Simultaneous estimation of model state variables and observation and forecast biases using a two-stage hybrid Kalman filter. Hydrology & Earth System Sciences Discussions 10.
Reichle, R.H., Crow, W.T., Keppenne, C.L., 2008. An adaptive ensemble Kalman filter for soil moisture data assimilation. Water Resources Research 44.
Reichle, R.H., Koster, R.D., 2004. Bias reduction in short records of satellite soil moisture. Geophysical Research Letters 31.
Reichle, R.H., McLaughlin, D.B., Entekhabi, D., 2002. Hydrologic data assimilation with the ensemble Kalman filter. Monthly Weather Review 130.
Richards, L.A., 1931. Capillary conduction of liquids through porous mediums. Physics 1, 318–333.
Ryu, D., Crow, W.T., Zhan, X., Jackson, T.J., 2009.
Correcting unintended perturbation biases in hydrologic data assimilation. Journal of Hydrometeorology 10, 734–750.
Su, C.H., Ryu, D., Young, R.I., Western, A.W., Wagner, W., 2013. Inter-comparison of microwave satellite soil moisture retrievals over the Murrumbidgee Basin, southeast Australia. Remote Sensing of Environment 134, 1–11.
Wagner, W., Lemoine, G., Rott, H., 1999. A method for estimating soil moisture from ERS scatterometer and soil data. Remote Sensing of Environment 70, 191–207.
Wanders, N., Karssenberg, D., Roo, A.d., de Jong, S., Bierkens, M., 2014. The suitability of remotely sensed soil moisture for improving operational flood forecasting. Hydrology and Earth System Sciences 18, 2343–2357.
Yilmaz, M.T., Crow, W.T., 2013. The optimality of potential rescaling approaches in land data assimilation. Journal of Hydrometeorology 14, 650–660.

Chapter 5

Lumped vs semi-distributed model configurations

This chapter was published as the following article:

C. Alvarez-Garreton, D. Ryu, A. Western, C.-H. Su, W. Crow, D. Robertson, and C. Leahy. Improving operational flood ensemble prediction by the assimilation of satellite soil moisture: comparison between lumped and semi-distributed schemes. Hydrology and Earth System Sciences, 19(4):1659-1676, 2015.

Manuscript prepared for Hydrol. Earth Syst. Sci. with version 5.0 of the LaTeX class copernicus.cls. Date: 6 March 2015

Improving operational flood ensemble prediction by the assimilation of satellite soil moisture: comparison between lumped and semi-distributed schemes

C. Alvarez-Garreton¹, D. Ryu¹, A.W. Western¹, C.-H. Su¹, W.T. Crow², D.E. Robertson³, and C.
Leahy⁴

¹ Department of Infrastructure Engineering, The University of Melbourne, Parkville, Victoria, Australia
² USDA-ARS Hydrology and Remote Sensing Laboratory, Beltsville, Maryland, United States
³ CSIRO Land and Water, Australia
⁴ Bureau of Meteorology, Melbourne, Victoria, Australia

Correspondence to: Camila Alvarez-Garreton (calvarez@student.unimelb.edu.au)

Abstract. Assimilation of remotely sensed soil moisture data (SM-DA) to correct soil water stores of rainfall-runoff models has shown skill in improving streamflow prediction. In the case of large and sparsely monitored catchments, SM-DA is a particularly attractive tool. Within this context, we assimilate satellite soil moisture (SM) retrievals from the Advanced Microwave Scanning Radiometer (AMSR-E), the Advanced Scatterometer (ASCAT) and the Soil Moisture and Ocean Salinity (SMOS) instrument, using an Ensemble Kalman filter to improve operational flood prediction within a large (>40,000 km²) semi-arid catchment in Australia. We assess the importance of accounting for channel routing and the spatial distribution of forcing data by applying SM-DA to a lumped and a semi-distributed scheme of the probability distributed model (PDM). Our scheme also accounts for model error representation by explicitly correcting bias in soil moisture and streamflow in the ensemble generation process, and for seasonal biases and errors in the satellite data. Before assimilation, the semi-distributed model provided a more accurate streamflow prediction (Nash-Sutcliffe efficiency, NSE=0.77) than the lumped model (NSE=0.67) at the catchment outlet. However, this did not ensure good performance at the "ungauged" inner catchments (two of them with NSE below 0.3).
After SM-DA, the streamflow ensemble prediction at the outlet was improved in both the lumped and the semi-distributed schemes: the root mean square error of the ensemble was reduced by 22% and 24%, respectively; the false alarm ratio was reduced by 9% in both cases; the peak volume error was reduced by 58% and 1%, respectively; the ensemble skill was improved (evidenced by 12% and 13% reductions in the continuous ranked probability scores, respectively); and the ensemble reliability was increased in both cases (expressed by flatter rank histograms). SM-DA did not improve NSE. Our findings imply that even when rainfall is the main driver of flooding in semi-arid catchments, adequately processed satellite SM can be used to reduce errors in the model soil moisture, which in turn provides better streamflow ensemble prediction. We demonstrate that SM-DA efficacy is enhanced when the spatial distribution in forcing data and routing processes are accounted for. At ungauged locations, SM-DA is effective at improving some characteristics of the streamflow ensemble prediction; however, the updated prediction is still poor since SM-DA does not address the systematic errors found in the model prior to assimilation.

1 Introduction

Floods have large costs to society, causing destruction of infrastructure and crops, erosion, and in the worst cases, injury and loss of life (Thielen et al., 2009). To reduce flood impacts on public safety and the economy, early and accurate alert systems are needed. These systems rely on hydrologic models, whose accuracy in turn is highly dependent on the quality of the data used to force and calibrate them. Therefore, in the case of sparsely monitored and ungauged catchments, flood prediction suffers from large uncertainties. A plausible approach to reduce model uncertainties in the sparsely monitored catchments is to exploit remotely sensed
hydro-meteorological observations to correct the states or parameters of the model in a data assimilation framework. Within this context, satellite soil moisture (SM) products are appealing given the vital role of SM in runoff generation. SM influences the partitioning of energy and water (rainfall, infiltration and evapotranspiration) between the land surface and the atmosphere (Western et al., 2002). Satellite SM observations provide global scale information and can be obtained in near real time at regular and reasonably frequent time intervals. This makes them valuable for improving the representation of catchment wetness. The accuracy of these observations has been assessed by a number of studies (Albergel et al., 2009; Draper et al., 2009; Albergel et al., 2010; Gruhier et al., 2010; Brocca et al., 2011; Albergel et al., 2012; Su et al., 2013). In general, they have shown promising performance with moderate correlation between satellite SM and ground data, but with significant bias at some locations.

In the last decade a large number of studies have explored satellite SM data assimilation (SM-DA) to correct the soil water states of models. These studies can be categorised into two main groups; the first, and larger group, has focused on the improvement of the SM predicted by the model (generally working with land surface models, e.g., Crow and van Loon, 2006; Crow and Reichle, 2008; Crow and Van den Berg, 2010; Reichle et al., 2008; Ryu et al., 2009). The second, and smaller group (where our study fits), has focused on the improvement of streamflow prediction in rainfall-runoff models (Francois et al., 2003; Brocca et al., 2010b, 2012; Alvarez-Garreton et al., 2013, 2014; Chen et al., 2014; Wanders et al., 2014).

Studies from the first group evaluate the prediction improvement of the same variable that is updated in the assimilation scheme (SM).
Improvements in streamflow predictions investigated by studies in the second group are not exclusively influenced by better representation of SM. The potential improvement of streamflow predictions in the latter case is constrained by the particular runoff mechanisms operating within a catchment. Accordingly, even when a model structure and parametrisation are capable of representing the runoff mechanisms, improving streamflow prediction by reducing error in soil moisture depends on the error covariance between these two components. This error covariance (which in the model space will be defined by the representation of the different sources of uncertainty) may become marginal when the errors in streamflow come mainly from errors in rainfall input data (Crow and Ryu, 2009). This physical constraint is case specific and determines the potential skill of SM-DA for improving streamflow prediction. To understand and assess this skill, further studies focusing on the improvement of streamflow prediction are needed with different model characteristics, such as structure, parametrisation and performance before assimilation; and with different catchment characteristics, such as climate, scale, soils, geology, land cover and density of monitoring network. Among the latter, semi-arid catchments present distinct rainfall-runoff processes which have been rarely studied in SM-DA.

Here we address this gap by studying the Warrego River catchment in Australia, a large and sparsely monitored semi-arid basin. We set up the probability distributed model (PDM) within the catchment, and assimilate passive and active satellite SM products using an Ensemble Kalman filter (Evensen, 2003) for the purpose of improving operational flood prediction. We devise an operational SM-DA scheme to answer three main questions.
1) While rainfall is presumably the main driver of flood generation in semi-arid catchments, can we effectively improve streamflow prediction by correcting the antecedent soil water state of the model? 2) What is the impact of accounting for channel routing and the spatial distribution of forcing data on SM-DA performance? 3) What are the prospects for improving streamflow prediction within ungauged sub-catchments using satellite SM?

A series of SM-DA experiments using a lumped version of PDM have already been undertaken in this study catchment by Alvarez-Garreton et al. (2014). They found that assimilating passive microwave satellite SM improved flood prediction, while highlighting specific limitations in their scheme. This paper expands on this previous result in a number of key ways. We improve the representation of model error by explicitly treating forcing, parameter and structural errors. We devise a more robust ensemble generation process by correcting biases in soil moisture and streamflow predictions. We incorporate additional satellite products and apply instrumental variable regression techniques for seasonal rescaling and observation error estimation. Furthermore, we employ a semi-distributed scheme to evaluate the advantages of accounting for channel routing and the spatial distribution of forcing data.

In this paper, Sect. 2 presents a description of the study catchment and the data used. Section 3 presents the methodology, including a description of the rainfall-runoff model, the EnKF formulation and the specific steps for setting up the SM-DA scheme. These include the error model estimation, estimation of profile SM based on the satellite surface data, the rescaling of satellite observations and observation error estimation. Section 4 presents the results and discussion. Section 5 summarises the main conclusions of the study.
2 Study area and data

The study area is the semi-arid Warrego catchment (42,870 km²) located in Queensland, Australia (Fig. 1). The catchment has an important flooding history, with at least three major floods within the last 15 years. The study area also features geographical and climatological conditions that enable satellite SM retrievals to have higher accuracy than in other areas. These conditions include the size of the catchment, the semi-arid climate and the low vegetation cover. Moreover, the ground-monitoring network within the catchment is sparse, so satellite data is likely to be more valuable than in well-instrumented catchments. The catchment has summer-dominated rainfall with mean monthly rainfall accumulation of 80 mm in January, and 20 mm in August. Mean maximum daily temperature is above 30 °C in January and below 20 °C in July. The runoff seasonality is characterised by peaks in summer months and minimum values in winter and spring. The mean annual precipitation over the catchment is 520 mm.

Regarding the governing runoff mechanisms within the study catchment, Alvarez-Garreton et al. (2014) showed that streamflow has a negligible baseflow component and the surface runoff is generated only when a wetness threshold is exceeded. They concluded that soil moisture exerts an important control on the runoff generation mechanisms. In this work, the runoff mechanisms analysis is deepened by looking at model predictions (Sect. 3.1).

Daily rainfall data was computed from the Australian Water Availability Project (AWAP), which has a grid resolution of 0.05° (Jones et al., 2009). Hourly streamflow records were collected from the State of Queensland, Department of Natural Resources and Mines (http://watermonitoring.dnrm.qld.gov.au) (Fig. 1).
Daily discharge was calculated based on the daily AWAP time convention (9am-9am local time, UTC+10h). The flood classification for the study catchment (at the catchment outlet, N7) was provided by the Australian Bureau of Meteorology as river height threshold values, corresponding to minor, moderate and major floods. These threshold values expressed as streamflow (mm/day) are 0.06, 0.55 and 2.05, respectively, and relate to flood impact rather than recurrence interval. The associated annual exceedance probabilities for the minor, moderate and major floods at N7 are 15.7%, 3.1% and 0.95%, respectively (calculated using the complete daily streamflow record period). Potential evapotranspiration was obtained from the Australian Data Archive for Meteorology database. Daily values were estimated by assuming a uniform daily distribution within a month.

Three satellite products were used here. The first was the Advanced Microwave Scanning Radiometer - Earth Observing System (AMS hereafter) version 5 VUA-NASA Land Parameter Retrieval Model Level 3 gridded product (Owe et al., 2008). AMS uses C- (6.9 GHz) and X-band (10.65 and 18.7 GHz) radiance observations to derive near-surface soil moisture (2 to 3 cm depth) using a land-surface radiative transfer model. The product used is in units of volumetric water content (m³ m⁻³) and has a regular grid of 0.25°.

The second product was the TU-WIEN (Vienna University of Technology) ASCAT (ASC hereafter) data produced using the change-detection algorithm (Water Retrieval Package, version 5.4) (Naeimi et al., 2009). ASC transmits electromagnetic waves in C-band (5.3 GHz) and measures the backscattered microwave signal. The change-detection algorithm assumes that land surface characteristics are relatively static over long time periods.
Based on this, the differences between instantaneous backscatter coefficients and the historical highest and lowest values for a given incident angle, are related to changes in soil moisture (Wagner et al., 1999). The final SM estimate is provided in relative terms as the degree of saturation and has a nominal spatial resolution varying from 25 to 50 km.

The third satellite product was the Soil Moisture and Ocean Salinity satellite (SMO hereafter), version RE01 (Reprocessed 1-day global soil moisture product) SM provided by the Centre Aval de Traitement des Donnees. SMO uses L-band (1.4 GHz) detectors to measure microwave radiation emitted from depth of up to approximately 5 cm. Near-surface soil moisture is obtained in units of volumetric water content (m³ m⁻³) at a spatial resolution of approximately 43 km by using the forward physical model inversion described by Kerr et al. (2012).

The overpass times of the AMS, ASC and SMO satellites over the study catchment are 1:30 am/pm, 10 am/pm and 6 am/pm local time (UTC+10h), respectively. Figure 2 summarises the period of record of the different datasets. For each satellite dataset, a daily averaged SM was calculated for the complete catchment (or sub-catchment in the case of the semi-distributed scheme). The areal estimate of satellite SM over the catchment was given by averaging the values of ascending and descending satellite passes on days when more than 50% of the pixels had valid data. For the case of the passive sensors (AMS and SMO), we subtracted the long-term temporal mean of the ascending and descending datasets to remove the systematic bias between them (Brocca et al., 2011; Draper et al., 2009). Then, daily satellite SM was calculated as the average between the mean-removed ascending and descending passes (if both were available) or directly as the mean-removed available pass.
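The passive-sensor processing described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function names (`areal_mean`, `remove_mean`, `daily_passive_sm`), the use of `None` for missing pixels or days, and the list-of-lists data layout are assumptions made for the example. It also assumes that the long-term mean is removed from the catchment-averaged series of each pass, which the text does not state explicitly.

```python
import math


def areal_mean(pixels, min_valid_fraction=0.5):
    """Catchment-average SM for one overpass.

    Returns None unless MORE than `min_valid_fraction` of pixels are valid.
    """
    valid = [v for v in pixels if v is not None and not math.isnan(v)]
    if not pixels or len(valid) / len(pixels) <= min_valid_fraction:
        return None
    return sum(valid) / len(valid)


def remove_mean(series):
    """Subtract the long-term temporal mean, ignoring missing days (None)."""
    present = [v for v in series if v is not None]
    if not present:
        return list(series)
    mean = sum(present) / len(present)
    return [None if v is None else v - mean for v in series]


def daily_passive_sm(asc_pixels_by_day, desc_pixels_by_day):
    """Daily SM anomaly from ascending and descending passes.

    Both passes available: average of the two mean-removed values.
    One pass available: that mean-removed value. Neither: None.
    """
    asc = remove_mean([areal_mean(p) for p in asc_pixels_by_day])
    desc = remove_mean([areal_mean(p) for p in desc_pixels_by_day])
    daily = []
    for a, d in zip(asc, desc):
        if a is not None and d is not None:
            daily.append(0.5 * (a + d))
        else:
            daily.append(a if a is not None else d)
    return daily
```

Calling `daily_passive_sm(asc, desc)` with one list of pixel values per day returns one anomaly per day, with `None` on days when neither pass passes the 50% valid-pixel check.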
For ASC retrievals, given the unbiased ascending and descending measurements, daily satellite SM was calculated from the actual ascending and descending values averaged over the catchment.

Fig. 1. The Warrego river basin located in Queensland, Australia (left panel). A close-up of the area is presented on the right panel. The lumped PDM scheme is set up over the entire catchment, while the semi-distributed scheme divides the total catchment in 7 sub-catchments (SC1 to SC7). Map legend: rainfall gauges; nodes without streamflow gauge; Node 1: Warrego at Binnowee (N1); Node 3: Warrego at Augathella (N3); Node 7: Warrego at Wyandra (N7).

Fig. 2. Periods of record of the different datasets: satellite SM (SMOS, ASCAT, AMSR-E), streamflow (N1, N3, N7) and rainfall (AWAP). The initial date of the plot was set as the beginning of the streamflow data record.

3 Methods

3.1 Lumped and semi-distributed model schemes

The probability distributed model (PDM) is a conceptual rainfall-runoff model that has been widely used in hydrologic research and applications (Moore, 2007), mainly over temperate and humid environments. The model was selected from amongst the set of models available within the flood forecasting system managed by the Australian Bureau of Meteorology. This selection was based on both the suitability of PDM to simulate ephemeral rivers (Moore and Bell, 2002) and preliminary analysis comparing PDM against other models such as the Sacramento soil moisture accounting model, which did not perform as well as PDM.

PDM is a parsimonious model where the runoff production is controlled by the absorption capacity of the soil (including canopy and surface detention). This process is conceptualised by a store with a distribution of capacities across the catchment and the spatial distribution of these capacities is described by a probability distribution (Moore, 2007). The spatial variability of store capacities can be related to different soil depths, which was identified as the most dominant factor governing runoff variability in a semi-arid catchment (Jothityangkoon et al., 2001).

In the current formulation, the model treats the soil moisture store ($S_1$ in Fig. 3) over the entire catchment as a distributed variable with capacities ($c$) following a Pareto distribution function, $F(c)$. At a given time, the different stores receive water from rainfall and lose water by evaporation and groundwater recharge (drainage). The shallower stores with less capacity than a critical capacity, $C^*$, start to generate direct runoff while the rest accumulates the water as soil moisture. The proportion of the catchment that generates runoff can therefore be expressed in terms of the Pareto density function, $f(c)$, as

$$\mathrm{prob}(c \le C^*) = F(C^*) = \int_0^{C^*} f(c)\,dc. \tag{1}$$

In this way, for a time $t$, the soil moisture over the entire catchment, $\theta$ (water content of $S_1$), can be expressed as the summation of all the store capacities greater than $C^*(t)$:

$$\theta(t) = \int_0^{C^*(t)} \left(1 - F(c)\right)\,dc. \tag{2}$$

Note that the critical capacity $C^*$ varies in a time interval $\Delta t$ based on the net rainfall rate during that time, $P$,

$$C^*(t + \Delta t) = C^*(t) + P\,\Delta t. \tag{3}$$

Direct runoff is calculated based on Eq. 1 and routed through a cascade of two reservoirs ($S_{21}$ and $S_{22}$ in Fig. 3, with time constants $k_1$ and $k_2$, respectively).
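Equations (1) to (3) can be illustrated with a short Python sketch. It is a sketch under stated assumptions rather than the model code used in the study: it assumes the Pareto form commonly used with PDM, $F(c) = 1 - (1 - c/c_{max})^b$ with a minimum capacity of zero, for which the integral in Eq. (2) has the closed form $\theta = \frac{c_{max}}{b+1}\left[1 - (1 - C^*/c_{max})^{b+1}\right]$. The parameter names `c_max` and `b`, the function names, and the clipping of $C^*$ to the range $[0, c_{max}]$ are assumptions introduced for the example. Evaporation and drainage losses are not represented.

```python
def pareto_cdf(c, c_max, b):
    """F(c): fraction of the catchment with store capacity <= c (Eq. 1).

    Assumed Pareto form: F(c) = 1 - (1 - c / c_max) ** b, 0 <= c <= c_max.
    """
    c = min(max(c, 0.0), c_max)
    return 1.0 - (1.0 - c / c_max) ** b


def storage_from_critical_capacity(c_star, c_max, b):
    """theta: integral of (1 - F(c)) from 0 to C* (Eq. 2), in closed form."""
    c_star = min(max(c_star, 0.0), c_max)
    return c_max / (b + 1.0) * (1.0 - (1.0 - c_star / c_max) ** (b + 1.0))


def advance_critical_capacity(c_star, net_rain_rate, dt, c_max):
    """C*(t + dt) = C*(t) + P * dt (Eq. 3), clipped to [0, c_max]."""
    return min(max(c_star + net_rain_rate * dt, 0.0), c_max)
```

For a given critical capacity, `pareto_cdf` gives the runoff-generating fraction of the catchment and `storage_from_critical_capacity` gives the corresponding soil water content, so a time step amounts to advancing $C^*$ and re-evaluating both.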
Subsurface runoff is estimated based on the drainage from S1 and transformed into baseflow by using a storage reservoir (S3 in Fig. 3, with time constant kb). These are then combined as total runoff, or streamflow. A detailed description of the model conceptualisation and the formulation of the different rainfall-runoff processes is presented in Moore (2007).

PDM was set up using both a lumped scheme and a semi-distributed scheme (see Fig. 1). The semi-distributed scheme was configured with 7 sub-catchments (SC1 to SC7), each using the lumped version of PDM. The area and mean annual rainfall of each sub-catchment are summarised in Table 1. The river routing between upstream and downstream sub-catchments in the semi-distributed scheme was represented by a linear Muskingum method (Gill, 1978):

$$S = k_m \left( I x + (1 - x) O \right), \tag{4}$$

where S is the storage within the routing reach, km is the storage time constant, I and O are the streamflow at the beginning and end of the reach, respectively, and x is a weighting factor. The time constant parameters of the storages S21, S22 and S3 (k1, k2 and kb, respectively) were scaled by the area of each sub-catchment, and km from the Muskingum routing was scaled by the length of the river channel between corresponding nodes. The remaining model and routing parameters of the semi-distributed scheme were treated as homogeneous.

Fig. 3. The PDM scheme. [Diagram: rainfall (P) and evaporation (E) act on the soil moisture store S1; direct runoff is routed through the fast flow storages S21 and S22 to give surface runoff; drainage from S1 feeds the slow flow storage S3 to give sub-surface runoff (baseflow); surface runoff and baseflow combine as total runoff (Q).]

Table 1. Area and mean annual rainfall of the catchments used in the lumped and semi-distributed schemes.

Catchment   Area (km2)   Mean annual rainfall (mm)
SC1         14,670       492
SC2         4,453        532
SC3         8,070        596
SC4         5,431        524
SC5         4,067        503
SC6         2,130        467
SC7         4,049        418
Total       42,870       512

The lumped and the semi-distributed models were calibrated by using a genetic algorithm (Chipperfield and Fleming, 1995) with an objective function based on the Nash-Sutcliffe model efficiency (NSE) (Nash and Sutcliffe, 1970). The models were calibrated for the period 01 January 1967 - 31 May 2003 and evaluated for the period 01 June 2003 - 02 March 2014. To make fair comparisons between the two model setups in a scenario where the inner catchments are ungauged, the semi-distributed scheme was calibrated using only the outlet gauge (N7 in Fig. 1). The performance of the calibrated models was evaluated based on the NSE at the catchment outlet (N7, Fig. 1) and, in the case of the semi-distributed scheme, at inner nodes N1 and N3.

To analyse the runoff mechanisms simulated by the lumped and the semi-distributed schemes, we calculated the lag-correlation between rainfall and streamflow, and between antecedent SM and streamflow. This enables further understanding of the improvement in streamflow that can be expected by improving the simulated SM content through SM-DA.

3.2 EnKF formulation

The ensemble Kalman filter (EnKF) proposed by Evensen (2003) has been widely used in hydrologic applications given the nonlinear nature of runoff processes. In the EnKF, the error covariance between the model and observations is calculated from Monte Carlo-based ensemble realisations. In this way, the model and observation uncertainties are propagated and the streamflow prediction is treated as an ensemble of equally likely realisations. The uncertainty of the streamflow prediction can be derived from the ensemble, which provides valuable information for operational flood alert systems.

In a state-updating assimilation approach, the state ensemble is created by perturbing forcing data, parameters and/or states of the model with unbiased errors. As detailed in Sect. 3.3, an N-member ensemble of model soil moisture, θ = {θ1, θ2, ..., θN}, was generated by perturbing rainfall forcing data, the model parameter k1, and θ. Then, the soil water error of member i at time t was estimated as

$$\theta_i^{-\prime}(t) = \theta_i^{-}(t) - \frac{1}{N} \sum_{j=1}^{N} \theta_j^{-}(t), \tag{5}$$

where the superscript "-" denotes the state prediction prior to the assimilation step. The error vector for time step t was defined as $\theta^{-}(t)' = \{\theta_1^{-}(t)', \theta_2^{-}(t)', ..., \theta_N^{-}(t)'\}$ and the error covariance of the model state ($P^{-}$) was estimated at each time step as:

$$P^{-}(t) = \frac{1}{N-1}\, \theta^{-}(t)' \cdot \theta^{-}(t)'^{\,T}. \tag{6}$$

When a daily SM observation from AMS, ASC or SMO was available, each member of the background prediction ($\theta^{-}$) was updated. Before being assimilated, each of the three observation datasets was transformed to represent a profile SM and then rescaled to remove systematic differences between the model and the transformed observations (details in Sects. 3.5 and 3.6). We sequentially assimilated an N-member ensemble of the transformed and rescaled AMS, ASC and SMO (named $\theta^{ams}$, $\theta^{asc}$ and $\theta^{smo}$, respectively) and updated each member of $\theta^{-}$ with the following 3 steps:

1. If $\theta^{ams}$ was available at time t,

$$\theta_i^{+}(t) = \theta_i^{-}(t) + K_1(t) \left( \theta_i^{ams}(t) - H \theta_i^{-}(t) \right), \tag{7}$$

where H is an operator that transforms the model state to the measurement space. Since the additive and multiplicative biases between the model predictions and the microwave retrievals were removed via rescaling in a separate step (see Sect. 3.6), H reduced to a unit matrix. The Kalman gain $K_1(t)$ was calculated as

$$K_1(t) = \frac{P^{-}(t) H^{T}}{H P^{-}(t) H^{T} + R_1(t)}, \tag{8}$$

where $R_1(t)$ is the error variance of $\theta^{ams}$ estimated in the rescaling procedure (Sect. 3.6). If $\theta^{ams}$ was not available, $\theta^{+}(t) = \theta^{-}(t)$.

2. If $\theta^{asc}$ was available at time t, we updated the model soil moisture with

$$\theta_i^{++}(t) = \theta_i^{+}(t) + K_2(t) \left( \theta_i^{asc}(t) - H \theta_i^{+}(t) \right), \tag{9}$$

where $K_2(t)$ was calculated as

$$K_2(t) = \frac{P^{-}(t) H^{T}}{H P^{-}(t) H^{T} + R_2(t)}. \tag{10}$$

$R_2(t)$ is the error variance of $\theta^{asc}$ and $P^{-}$ is the model error covariance re-calculated by applying Eq. (6) to the updated soil moisture $\theta^{+}(t)$. If $\theta^{asc}$ was not available, $\theta^{++}(t) = \theta^{+}(t)$.

3. If $\theta^{smo}$ was available at time t, we updated the model soil moisture with

$$\theta_i^{+++}(t) = \theta_i^{++}(t) + K_3(t) \left( \theta_i^{smo}(t) - H \theta_i^{++}(t) \right), \tag{11}$$

where $K_3(t)$ was calculated as

$$K_3(t) = \frac{P^{-}(t) H^{T}}{H P^{-}(t) H^{T} + R_3(t)}. \tag{12}$$

$R_3(t)$ is the error variance of $\theta^{smo}$ and $P^{-}$ is the model error covariance re-calculated by applying Eq. (6) to the updated soil moisture $\theta^{++}(t)$. If $\theta^{smo}$ was not available, $\theta^{+++}(t) = \theta^{++}(t)$.

In the case of the semi-distributed scheme, during the updating steps described above, each sub-catchment was treated independently and no spatial cross-correlation in the satellite measurements was considered. The order of the products assimilated in steps 1 to 3 was arbitrary; however, we checked that different orders did not significantly affect the SM-DA results.

3.3 Error model representation

The main sources of uncertainty in hydrologic models are the errors in the forcing data, the model structure and the incorrect specification of model parameters (Liu and Gupta, 2007). Generally, these errors are represented by adding unbiased synthetic noise to forcing variables, model state variables and/or model parameters.

The estimation of model errors is among the most crucial challenges in data assimilation, as it determines the value of the Kalman gain. In the case of a state-updating SM-DA, the ability of the scheme to improve streamflow prediction will mainly depend on the covariance between the errors in SM states and modelled streamflow, which directly depends on the specific representation and estimation of the model errors.

To represent the forcing uncertainty, we adopted a multiplicative error model for the rainfall data (McMillan et al., 2011; Tian et al., 2013). In particular, we followed the scheme used in various SM-DA studies (e.g., Chen et al., 2011; Brocca et al., 2012; Alvarez-Garreton et al., 2014) and represented a spatially homogeneous rainfall error ($\epsilon_p$) as

$$\epsilon_p \sim \ln N(1, \sigma_p^2), \tag{13}$$

where $\sigma_p$ is the standard deviation of the lognormal distribution. The above representation assumes a spatially homogeneous fraction of the error to the rainfall intensity, which could be an oversimplification in a large area like the study catchment. However, it avoids the estimation of additional error parameters (e.g., a spatial correlation parameter) in an already highly underdetermined problem (see Sect. 3.4).

The parameter uncertainty was represented by perturbing the time constant parameter (k1) for store S21, a highly sensitive parameter of the model that directly affects the streamflow generation by influencing the water stored in both surface storages S21 and S22 (note that in the PDM formulation used, the time constant k2 is calculated as a function of k1). Given the lack of prior information about the structure of the parameter error ($\epsilon_k$), we adopted a normally distributed multiplicative error with unit mean and standard deviation of $\sigma_k$, following previous SM-DA studies working with rainfall-runoff models (Brocca et al., 2010b, 2012).

Following the scheme used in most SM-DA experiments (e.g., Reichle et al., 2008; Crow and Van den Berg, 2010; Chen et al., 2011; Hain et al., 2012), the model structural error was represented by perturbing the SM prediction (θ) with a spatially homogeneous additive random error,

$$\epsilon_s \sim N(0, \sigma_s^2), \tag{14}$$

where $\sigma_s$ is the standard deviation of the normal distribution.

The physical limits of SM (porosity as an upper bound and residual water content as a lower bound) are represented by the model through the storage capacity of S1. When θ approaches the limits of S1, applying unbiased perturbation to θ can lead to truncation bias in the background prediction.
This can then result in mass balance errors and degrade the performance of the EnKF (Ryu et al., 2009). Moreover, the Kalman filter assumes unbiased state variables. This issue is of particular importance in arid regions like the study area, where the soil water content can be rapidly depleted by evapotranspiration and transmission losses, thus approaching the residual water content of the soil. To ensure that the state ensemble remained unbiased after perturbation, we implemented the bias correction scheme proposed by Ryu et al. (2009).

The truncation bias correction consisted of running a single unperturbed model prediction ($\theta_0^{-}$) in parallel with the perturbed model prediction ($\theta_i^{-}$). At each time step, the mean bias, δ(t), of the N-member ensemble prediction was calculated by subtracting $\theta_0^{-}(t)$ from the ensemble mean, as follows (Ryu et al., 2009):

$$\delta(t) = \frac{1}{N} \sum_{i=1}^{N} \theta_i^{-}(t) - \theta_0^{-}(t). \tag{15}$$

Then, a bias-corrected ensemble of state variables, $\tilde{\theta}_i^{-}(t)$, was obtained by subtracting δ(t) from each member of the perturbed ensemble, $\theta_i^{-}(t)$.

Although the latter resulted in unbiased state ensembles, some important but subtle effects remain that arise from the highly non-linear nature of hydrologic models. These need to be guarded against in SM-DA. Representing model errors by adding unbiased perturbation to forcing, model parameters and/or model states can lead to a biased streamflow ensemble prediction (e.g., Ryu et al., 2009; Plaza et al., 2012), compared with the unperturbed model run. This biased streamflow ensemble prediction (open-loop hereafter) is degraded compared with the streamflow predicted by the unperturbed calibrated model. As a consequence, improvement of the open-loop after SM-DA will in part be due to the correction of bias introduced during the assimilation process itself.
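The truncation bias correction of Eq. 15 and a single scalar EnKF update (Eqs. 5-8 with H = 1) can be sketched as follows. This is a minimal illustration rather than the implementation used in the study: the function names, the numerical values, the lower bound of zero and the generation of the observation ensemble by adding Gaussian noise with variance R to a single observation are all assumptions.

```python
import random
import statistics


def remove_truncation_bias(ensemble, unperturbed):
    """Eq. 15: shift every member so the ensemble mean equals the unperturbed run."""
    delta = statistics.mean(ensemble) - unperturbed
    return [theta - delta for theta in ensemble]


def enkf_update(background, obs, obs_var, rng):
    """Scalar EnKF update with H = 1 and a perturbed observation per member."""
    p_minus = statistics.variance(background)  # Eq. 6 (1/(N-1) normalisation)
    gain = p_minus / (p_minus + obs_var)       # Eq. 8
    obs_sd = obs_var ** 0.5
    analysis = [
        theta + gain * (obs + rng.gauss(0.0, obs_sd) - theta)  # Eq. 7
        for theta in background
    ]
    return analysis, gain


rng = random.Random(42)
unperturbed = 5.0  # hypothetical soil water state (mm), close to the lower bound
# Additive perturbation truncated at zero: this is what introduces the bias
raw = [max(0.0, unperturbed + rng.gauss(0.0, 10.0)) for _ in range(1000)]
background = remove_truncation_bias(raw, unperturbed)
analysis, gain = enkf_update(background, obs=12.0, obs_var=25.0, rng=rng)
```

In this sketch the raw ensemble mean sits above the unperturbed state because negative perturbations are clipped; after the correction the two coincide, and the update then pulls the ensemble towards the observation while reducing its spread.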
To avoid overstating the SM-DA efficacy due to the above issue, we applied the bias correction scheme directly to the streamflow prediction (in both the open-loop and the assimilation runs). We used the unperturbed model run to estimate a mean bias in the streamflow (following Eq. 15, but using streamflow instead of soil moisture) and then corrected each ensemble member by subtracting this mean bias. This practical tool ensures that the streamflow ensemble mean maintains the performance skill of the unperturbed (calibrated) model run, thus avoiding artificial degradation of the unperturbed model run by bias. To our knowledge, this approach has not been applied in previous SM-DA studies.

3.4 Error model parameters calibration

To calibrate the error model parameters ($\sigma_p$, $\sigma_k$ and $\sigma_s$), we evaluated the open-loop ensemble prediction ($Q^{ol}$) against the observed streamflow at the catchment outlet. In this study we used a maximum a posteriori (MAP) scheme, a Bayesian inference procedure detailed by Wang et al. (2009) that maximises the probability of observing historical events given the model and error parameters. In other words, it maximises the probability of having the streamflow observation within the open-loop streamflow ensemble.

Member i from the N-member open-loop can be expressed as

$$Q_i^{ol}(t) = Q^{T}(t) + \epsilon_m(t), \tag{16}$$

where $Q^{T}$ is the (unknown) truth streamflow and $\epsilon_m$ is the error of the streamflow prediction, which consists of forcing, parameter and state errors:

$$\epsilon_m(t) = f\left(\epsilon_p(t), \epsilon_k(t), \epsilon_s(t)\right). \tag{17}$$

The observed streamflow at N7 ($Q_{obs}$) can be expressed as a function of the same (unknown) truth and the streamflow observation error ($\epsilon_{obs}$),

$$Q_{obs}(t) = Q^{T}(t) + \epsilon_{obs}(t). \tag{18}$$

Combining Eqs. 16 and 18, the model ensemble prediction of the observed streamflow ($\hat{Q}_{obs}$) is expressed as:

$$\hat{Q}_{obs}(t) = Q^{ol}(t) + \epsilon_m(t) + \epsilon_{obs}(t). \tag{19}$$

Following Li et al. (2014), $\epsilon_{obs}$ was assumed to be a serially independent multiplicative error following a normal distribution (mean 1 and standard deviation of 0.2). Then, the likelihood function (L) defining the probability of observing the historical streamflow data given the calibrated model parameters (x) and the error model parameters ($\sigma_p$, $\sigma_k$ and $\sigma_s$) was expressed as

$$L(Q_{obs} \mid x, \sigma_p, \sigma_k, \sigma_s) = \prod_{t=1}^{n} p\left(Q_{obs}(t) \mid \hat{Q}_{obs}(t)\right). \tag{20}$$

To maximise L, we applied a logarithm transformation to it and maximised the sum over time of the transformed function. The probability density function (p) at each time step was estimated by assuming that the ensemble prediction of the observed streamflow, $\hat{Q}_{obs}(t)$, follows a Gaussian distribution, with its mean and standard deviation computed using the ensemble members. The period used to calibrate the error model parameters was 01 January 1998 - 31 May 2003.

An important aspect to highlight about this error parameter calibration is that it is a highly underdetermined problem. Only one dataset (streamflow at N7) is used to calibrate the error parameters, while there might be many combinations of error parameters that can generate similar streamflow ensembles (equifinality of the error parameters).

3.5 Profile soil moisture estimation

The aim of the stochastic assimilation detailed in Sect. 3.2 is to correct θ, which is a profile average SM representing a soil layer depth determined by calibration. By assuming a porosity of 0.46 (A-horizon information reported in McKenzie et al. (2000)) and the model S1 storage capacity of 396 mm (420 mm) for the lumped (semi-distributed) scheme, this profile SM roughly represents the upper 1 m of the soil. On the other hand, the satellite SM observations represent only the few top centimetres of the soil column (see Sect. 2).

To provide the model with information about more realistic dynamics of θ, we applied the exponential filter proposed by Wagner et al. (1999) to the satellite SM to estimate the soil wetness index (SWI) of the root-zone. SWI has been widely used to represent deeper layer SM based on satellite observations (e.g., Albergel et al., 2008; Brocca et al., 2009, 2010b, 2012; Ford et al., 2014; Qiu et al., 2014). SWI was recursively calculated as:

$$\mathrm{SWI}(t) = \mathrm{SWI}(t-1) + G(t) \left[ \mathrm{SSM}(t) - \mathrm{SWI}(t-1) \right], \tag{21}$$

where SSM(t) is the satellite SM observation and G(t) is a gain term varying between 0 and 1 as:

$$G(t) = \frac{G(t-1)}{G(t-1) + e^{-\frac{t-(t-1)}{T}}}. \tag{22}$$

T is a calibrated parameter that implicitly accounts for several physical parameters (Albergel et al., 2008). T was calibrated by maximising the correlation between SWI and the unperturbed model soil moisture (θ) during the first year of satellite data. This calibration period was selected to maximise the independent evaluation period (see Sect. 3.7); however, more representative values are likely to be obtained if a longer period is used for calibration. SWI was calculated independently for each of the AMS, ASC and SMO datasets (named $\mathrm{SWI}_{AMS}$, $\mathrm{SWI}_{ASC}$ and $\mathrm{SWI}_{SMO}$, respectively) and then rescaled to remove systematic differences with the model prediction (Sect. 3.6).

3.6 Rescaling and observation error estimation

The systematic differences (e.g., biases) between θ and the SWI derived from each satellite product must be removed prior to applying a bias-blind data assimilation scheme (Dee and Da Silva, 1998). We applied instrumental variable (IV) regression to resolve the biases and estimate the measurement errors simultaneously (Su et al., 2014a).
In three-data IV regression analysis, also known as triple collocation (TC) analysis (Stoffelen, 1998; Yilmaz and Crow, 2013), the model θ, the passive SWI and the active SWI are used as the data triplet. As the sample size requirement for TC is stringent (Zwieback et al., 2012), a pragmatic threshold of 100 triplet samples was imposed (Scipal et al., 2008). During periods when only one satellite product was available (i.e., before ASC) or when the sample threshold for TC was not met, a two-dataset IV regression using lagged variables (LV) was applied as a practical substitute (Su et al., 2014a). The LV analysis was performed on the model θ and a single satellite SWI, with the lagged variable coming from the model.

In most SM-DA experiments, the error in satellite SM has been treated as time-invariant (e.g., Reichle et al., 2008; Ryu et al., 2009; Crow and Van den Berg, 2010; Brocca et al., 2010b, 2012; Alvarez-Garreton et al., 2014); however, studies evaluating satellite SM products have shown an important temporal variability in the measurement errors (Loew and Schlenz, 2011; Su et al., 2014a). Since a data assimilation scheme explicitly updates the model prediction based on the relative weights of the model and the observation errors, assuming a constant observation error may lead to overcorrection of the model state if the actual error is higher, and vice versa.

Temporal characterisation of the observation error can be achieved by applying TC (or LV) to specific time windows of the observations and model predictions (for example, by grouping the triplets or doublets by month-of-the-year). There is, however, a trade-off between the sampling window (which defines the temporal characterisation of the error) and the sample size (number of triplets in each subset). In an operational context this trade-off becomes more critical since only past observations are available.
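For illustration, the classical covariance form of triple collocation can be sketched as below. It is assumed here to be equivalent, for this purpose, to the IV regression formulation of Su et al. (2014a), which is not reproduced in the text; the function names and the synthetic data are hypothetical.

```python
import random


def cov(a, b):
    """Sample covariance with 1/(N-1) normalisation."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def triple_collocation(ref, y, z):
    """Error variances of (ref, y, z), each in its own units, and the
    multiplicative factors that rescale y and z to the reference dataset."""
    c_ry, c_rz, c_yz = cov(ref, y), cov(ref, z), cov(y, z)
    var_ref = cov(ref, ref) - c_ry * c_rz / c_yz
    var_y = cov(y, y) - c_ry * c_yz / c_rz
    var_z = cov(z, z) - c_rz * c_yz / c_ry
    beta_y = c_rz / c_yz  # scales anomalies of y to the reference
    beta_z = c_ry / c_yz  # scales anomalies of z to the reference
    return (var_ref, var_y, var_z), (beta_y, beta_z)


# Synthetic triplet: a common signal seen by three systems with independent errors
rng = random.Random(1)
truth = [rng.gauss(0.0, 10.0) for _ in range(20000)]
model = [t + rng.gauss(0.0, 3.0) for t in truth]          # reference, error variance 9
passive = [0.5 * t + rng.gauss(0.0, 2.0) for t in truth]  # error variance 4
active = [2.0 * t + rng.gauss(0.0, 4.0) for t in truth]   # error variance 16
(var_model, var_passive, var_active), (beta_passive, beta_active) = triple_collocation(
    model, passive, active
)
# Error variance of the rescaled passive product, expressed in model space
r_passive = beta_passive ** 2 * var_passive
```

The estimates depend on the three error terms being mutually uncorrelated, and their sampling uncertainty grows quickly as the number of triplets falls, which is the reason for the sample-size threshold discussed above.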
After analysing the temporal variability of the observation errors using the complete period of record (not shown here), we found that a 4-month sampling window can reproduce seasonality in errors while ensuring sufficient data samples for the TC and LV schemes. With this analysis we also assessed the suitability of using LV, which yielded similar results to TC, although some positive bias in LV error variance estimates relative to TC was noted (not shown here).

In summary, the procedure for rescaling and error estimation consists of:

1. From the start of the AMS dataset, we grouped LV triplets ($\mathrm{SWI}_{AMS}(t)$, θ(t) and θ(t − 1)) into three subsets: Dec-Mar, Apr-Jul and Aug-Nov.

2. We applied LV, and thus estimated the observation error variance and rescaling factors for a given 4-month subset, only when a minimum of 100 samples was reached (after one year of the AMS dataset). After the first year of AMS, new seasonal triplets were added into the corresponding 4-month data pool (retaining all earlier triplets) and LV was applied to the updated subset.

3. When ASC was available, subsets of LV triplets ($\mathrm{SWI}_{ASC}(t)$, θ(t) and θ(t − 1)) were formed following the criteria of step 1, and LV was applied after the 4-month data pools had more than 100 samples, following step 2.

4. In parallel with step 3, TC triplets were formed using the two available satellite datasets ($\mathrm{SWI}_{AMS}(t)$, $\mathrm{SWI}_{ASC}(t)$ and θ(t)) and grouped into the 4-month subsets defined in step 1. TC was applied only when the 4-month data pools contained more than 100 samples (after approximately 3 years of ASC data).

5. Steps 3 and 4 were repeated when SMO was available. The triplets for TC in this case were given by $\mathrm{SWI}_{ASC}(t)$, $\mathrm{SWI}_{SMO}(t)$ and θ(t).

6. Once steps 1-5 were complete, a single time series of observation error variance and rescaling factors was constructed for each satellite-derived SWI by selecting TC results when available, and LV results if not.
This criterion was adopted because LV is susceptible to bias due to auto-correlated errors in the model SM (Su et al., 2014a).

The rescaled observations from AMS, ASC and SMO were named $\theta^{ams}$, $\theta^{asc}$ and $\theta^{smo}$, respectively.

3.7 Evaluation metrics

To evaluate the SM-DA results, we used six different metrics. Firstly, the normalised root mean square error (NRMSE) was calculated as the ratio of the root mean square error (RMSE) between the updated streamflow ensemble ($Q^{up}$) and the observed streamflow to the RMSE between the open-loop (ensemble streamflow prediction without assimilation, $Q^{ol}$) and the observed streamflow:

$$\mathrm{NRMSE} = \frac{\frac{1}{N} \sum_{i=1}^{N} \sqrt{\sum_{t=1}^{T} \left( Q_i^{up}(t) - Q_{obs}(t) \right)^2}}{\frac{1}{N} \sum_{i=1}^{N} \sqrt{\sum_{t=1}^{T} \left( Q_i^{ol}(t) - Q_{obs}(t) \right)^2}}, \tag{23}$$

where N = 1000 is the number of ensemble members. The NRMSE provides information about both the spread of the ensemble and the performance of the ensemble mean, which is considered as the best estimate of the ensemble prediction. Moreover, as it is calculated in linear streamflow space, it gives more weight to high flows.

To further evaluate the performance of the ensemble mean, we calculated the Nash-Sutcliffe efficiency (NSE) for the entire evaluation period as follows (example for the open-loop case):

$$\mathrm{NSE}_{ol} = 1 - \frac{\sum_t \left( Q_{obs}(t) - \overline{Q^{ol}}(t) \right)^2}{\sum_t \left( Q_{obs}(t) - \overline{Q_{obs}} \right)^2}, \tag{24}$$

where $\overline{Q^{ol}}$ is the open-loop ensemble mean. Similarly, $\mathrm{NSE}_{up}$ was calculated by applying Eq. (24) to the updated ensemble mean ($\overline{Q^{up}}$).

We also estimated the probability of detection (POD) of daily flow rates (not flood events) exceeding minor, moderate and major floods, for the open-loop and the updated ensemble mean, as follows (example for the open-loop case):

$$\mathrm{POD}_{ol} = \frac{\#\left( \overline{Q^{ol}} \ge Q_{obs}^{15.7\%} \;\&\; Q_{obs} \ge Q_{obs}^{15.7\%} \right)}{\#\left( Q_{obs} \ge Q_{obs}^{15.7\%} \right)}, \tag{25}$$

where the symbol # represents the number of times. $Q_{obs}^{15.7\%}$ is the observed streamflow corresponding to a minor flood classification. This corresponds to a flow (not flood) frequency of 15.7% (see Sect. 2). Similarly, $\mathrm{POD}_{up}$ was calculated by applying Eq. (25) to the updated ensemble mean ($\overline{Q^{up}}$).

We estimated the false alarm ratio (FAR) for daily flows as (example for the open-loop case):

$$\mathrm{FAR}_{ol} = \frac{\#\left( \overline{Q^{ol}} \ge Q_{obs}^{15.7\%} \;\&\; Q_{obs} < Q_{obs}^{15.7\%} \right)}{\#\left( Q_{obs} < Q_{obs}^{15.7\%} \right)}. \tag{26}$$

Similarly, $\mathrm{FAR}_{up}$ was calculated by applying Eq. (26) to the updated ensemble mean.

Finally, we calculated the aggregated peak volume error (PVE, in mm) of the ensemble mean, for days when the observed streamflow was above a minor flood classification (t* days in Eq. 27). For example, for the open-loop, PVE was calculated as

$$\mathrm{PVE}_{ol} = \sum_{t^*} \left( \overline{Q^{ol}}(t^*) - Q_{obs}(t^*) \right). \tag{27}$$

To evaluate the skill of the streamflow ensemble prediction before and after SM-DA, we calculated the continuous ranked probability score (CRPS; Robertson et al., 2013). CRPS is used as a measure of the ensemble errors. In the case of the deterministic unperturbed run, CRPS reduces to the mean absolute error. The reliability of the ensembles was also evaluated by inspecting the rank histograms of the ensemble following Anderson (1996). A reliable ensemble should have a uniform histogram, while a u-shape (n-shape) histogram indicates that the ensemble spread is too small (large) (De Lannoy et al., 2006).

The evaluation period for the SM-DA was 01 June 2003 - 02 March 2014. This period is independent of all scheme component calibration periods (see Sects. 3.1, 3.4 and 3.5).

4 Results and discussion

4.1 Model calibration

The streamflow at the outlet of the study catchment (N7 in Fig. 1) features long periods of zero-flow, a negligible baseflow component and sharp flow peaks after rainfall events, when the catchment has reached a threshold level of wetness (see observed streamflow in Fig. 4).
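The deterministic metrics defined in Sect. 3.7 (Eqs. 24-27), which are applied to the ensemble mean, can be sketched as follows. The function names, the threshold and the short series are hypothetical and serve only to make the definitions concrete.

```python
def nse(obs, sim):
    """Nash-Sutcliffe efficiency of a simulated series (Eq. 24)."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def pod_far(obs, sim, threshold):
    """Probability of detection (Eq. 25) and false alarm ratio (Eq. 26)."""
    hits = sum(1 for o, s in zip(obs, sim) if s >= threshold and o >= threshold)
    false_alarms = sum(1 for o, s in zip(obs, sim) if s >= threshold and o < threshold)
    n_events = sum(1 for o in obs if o >= threshold)
    n_non_events = sum(1 for o in obs if o < threshold)
    return hits / n_events, false_alarms / n_non_events


def pve(obs, sim, threshold):
    """Aggregated peak volume error over days with observed flow at or above the threshold (Eq. 27)."""
    return sum(s - o for o, s in zip(obs, sim) if o >= threshold)


# Hypothetical daily flows (mm/day) and minor-flood threshold
q_obs = [0.0, 0.0, 1.0, 5.0, 8.0, 2.0, 0.0, 0.0]
q_sim = [0.0, 1.0, 2.0, 4.0, 9.0, 1.0, 0.0, 0.0]
q_minor = 2.0

nse_value = nse(q_obs, q_sim)
pod_value, far_value = pod_far(q_obs, q_sim, q_minor)
pve_value = pve(q_obs, q_sim, q_minor)
```

In this toy series two of the three observed exceedances are detected, and one of the five non-exceedance days raises a false alarm.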
The simulated streamflows from the lumped and the semi-distributed schemes are presented in Fig. 4. To aid visualisation of these time series, the calibration and evaluation periods were plotted separately. The evaluation period was further separated into two sub-periods: evaluation sub-period 1 (01 June 2003 - 30 April 2007), characterised by having only moderate and minor floods, and evaluation sub-period 2 (30 April 2007 - 02 March 2014), which had three major flooding events. The plots show that both the lumped and the semi-distributed models are generally able to capture the hydrologic behaviour of the catchment. As expected, the spatial distribution of forcing data and the channel routing accounted for by the semi-distributed scheme enhanced the overall performance of the model, with lower residual values through time (panels a.2, b.2 and c.2 in Fig. 4) and a consistently improved simulation of peak flows.

Table 2 presents the evaluation statistics for the streamflow prediction in the calibration and evaluation periods, for both the catchment outlet and the inner catchments (note that N1 does not have data in the calibration period). The statistics in this table show that, at the catchment outlet, the semi-distributed scheme consistently performs better than the lumped scheme in terms of RMSE, NSE, PVE and CRPS. Both schemes show better statistics in the evaluation period due to the higher flows over that period.

The good performance of the semi-distributed scheme at the catchment outlet was not reflected at the inner catchments. To explore the reasons for this poor performance, we separately calibrated the model parameters in those sub-catchments by using all the available N7, N1 and N3 observations. The results (not shown here) revealed that in this case, the model was able to adequately simulate streamflow in those sub-catchments (NSE in the evaluation period of 0.78, 0.69 and 0.84 at the N1, N3 and N7 nodes, respectively).
Based on this, we argue that the poor model performance in the "ungauged" inner catchments is most likely due to sub-optimal parameter estimation (due to the limited information about catchment heterogeneity provided by the integrated catchment streamflow response) and unlikely to be due to errors in the input data or model structure.

Fig. 4. Simulated and observed daily streamflow (Q) and model streamflow prediction residuals (simulated minus observed) at the catchment outlet (N7), for the lumped and semi-distributed models. (a.1) and (a.2) present the calibration period. (b.1) and (b.2) present evaluation sub-period 1, which has only moderate and minor flood events. (c.1) and (c.2) present evaluation sub-period 2, which has 3 major flood events. The daily rainfall plotted on the right axis corresponds to the averaged rainfall over the entire catchment.

To focus the analysis of the catchment runoff mechanisms on periods with flood events, the lag-correlation between the daily streamflow simulated at N7 and θ (Fig. 5), and between daily streamflow and the daily rainfall (Fig. 6), was calculated for daily streamflow values greater than $Q_{obs}^{15.7\%}$, or minor flood level. The lumped scheme indicates a stronger link between θ and streamflow than the semi-distributed scheme. This is represented by higher r values in panel a compared with panels b-h in Fig. 5.
Conversely, the link between rainfall and streamflow is weaker in the lumped scheme (lower r values in panel a compared with panels b-h in Fig. 6). These different representations of the catchment runoff response will have a direct impact on the skill of SM-DA to improve streamflow prediction.

Table 2. Model evaluation at the catchment outlet (N7) and at the inner catchments (N1 and N3), for calibration and evaluation periods. RMSE and PVE statistics are in units of mm.

Statistic    Lumped scheme (N7)   Semi-distributed scheme (N7)   Semi-distributed scheme (N1)   Semi-distributed scheme (N3)
RMSEcalib    0.19                 0.18                           -                              0.3
RMSEeval     0.21                 0.18                           0.53                           0.46
NSEcalib     0.52                 0.59                           -                              0.39
NSEeval      0.67                 0.77                           0.28                           -1.89
PODcalib     0.79                 0.76                           -                              0.76
PODeval      0.93                 0.91                           0.54                           0.73
FARcalib     0.09                 0.10                           -                              0.15
FAReval      0.11                 0.11                           0.07                           0.14
PVEcalib     -70.86               -39.99                         -                              168.23
PVEeval      1.30                 34.75                          -100.53                        115.52
CRPScalib    0.29                 0.28                           -                              0.58
CRPSeval     0.56                 0.33                           0.92                           0.49

Fig. 5. Lag-correlation coefficient (r) between the simulated streamflow at N7 (mm d−1), and θ (mm d−1) from the lumped (a) and the semi-distributed (b)-(h) model schemes. Panels: (a) Lumped, (b)-(h) Semidist-SC1 to Semidist-SC7; horizontal axis: lag (d).

Fig. 6. Lag-correlation coefficient (r) between the simulated streamflow at N7 (mm d−1), and the daily rainfall (mm d−1) of the entire catchment (a) and the 7 sub-catchments (b)-(h). Panels: (a) Lumped, (b)-(h) Semidist-SC1 to Semidist-SC7; horizontal axis: lag (d).
A strong relationship between θ and streamflow prediction suggests strong correlation between their errors, and therefore, greater potential improvement of streamflow resulting from an improved representation of θ. If we assume that the semi-distributed scheme provides a better representation of runoff response within the entire catchment (based on its better model performance at the outlet), Figs. 5 and 6 also suggest that daily rainfall is the main control on runoff generation and thus has a stronger impact on the streamflow prediction than soil moisture. Figure 5 shows that flood prediction strongly depends on antecedent soil moisture for up to the preceding 3 days. The strong correlation found at lag-0 suggests that the real-time SM correction given by the proposed SM-DA would be a good strategy to improve flood prediction.

4.2 Error model parameters and ensemble prediction

The calibrated error parameters for the lumped and the semi-distributed schemes are σp = 1.286 mm and 0.977 mm; σs = 0.099 and 0.03; and σk = 0.084 and 0.018, respectively. σs is expressed as a percentage of the total storage capacity (396 mm in the lumped scheme and 420 mm in the semi-distributed scheme) and σk is expressed as a percentage of the calibrated parameter k1.

The rank histograms of the generated ensemble prediction (open-loop) are presented in Fig. 7. The histograms at the catchment outlet (N7) are either n-shape or displaced to one side, for both the lumped and semi-distributed model schemes (Figs. 7a and 7b, respectively). This suggests that the open-loop ensembles are slightly biased (with respect to the observed streamflow) and feature wider spread than an ideal ensemble. The width of the spread will be critical for the evaluation of SM-DA (Sect. 4.4) since any decrease of the spread would be considered as an improvement of the ensemble prediction.
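How a rank histogram reveals an overly wide ensemble can be illustrated with the minimal sketch below. The binning by ensemble percentile, the function name and the synthetic Gaussian data are assumptions made for illustration; they do not reproduce the exact procedure behind Fig. 7.

```python
import random


def rank_histogram(ensembles, observations, n_bins=10):
    """Count, per percentile bin, where each observation ranks within its ensemble."""
    counts = [0] * n_bins
    for members, obs in zip(ensembles, observations):
        rank = sum(1 for m in members if m < obs)  # between 0 and N
        percentile = rank / len(members)           # between 0 and 1
        counts[min(int(percentile * n_bins), n_bins - 1)] += 1
    return counts


rng = random.Random(7)
n_steps, n_members = 2000, 50
observations = [rng.gauss(0.0, 1.0) for _ in range(n_steps)]

# Reliable ensemble: members drawn from the same distribution as the observations
reliable = [[rng.gauss(0.0, 1.0) for _ in range(n_members)] for _ in range(n_steps)]
# Over-dispersed ensemble: spread three times too large
too_wide = [[rng.gauss(0.0, 3.0) for _ in range(n_members)] for _ in range(n_steps)]

hist_reliable = rank_histogram(reliable, observations)
hist_too_wide = rank_histogram(too_wide, observations)
```

With these synthetic data the reliable ensemble gives a roughly flat histogram, whereas the over-dispersed ensemble concentrates the observations in the central bins, which is the n-shape described above.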
The wider spread of the open-loop ensembles at the catchment outlet could be due to factors such as an over-prediction of error parameters by the MAP calibration algorithm, or the representation of the model error with time-constant error parameters. The latter becomes critical given the distinct behaviour of the intermittent streamflow response within the catchment, which could indicate distinct behaviour in the model errors as well.

The ensemble predictions at the inner nodes N1 and N3 (Figs. 7c and 7d, respectively) feature high bias with respect to the observed streamflow (note that observations at N1 and N3 were not used to calibrate the error parameters). The large bias at these inner nodes results from the large errors in the calibrated model in SC1 and SC3 (see Sect. 4.1).

Fig. 7. Rank histograms of the open-loop and updated streamflow ensemble predictions (number of observations per ensemble percentile). (a) presents the results from the lumped scheme at node N7. (b)-(d) present the results from the semi-distributed (semidist) scheme at nodes N7, N1 and N3.

4.3 SWI estimation and rescaling

The satellite SM derived from AMS, ASC and SMO are presented in Fig. 8a, for the lumped model. The satellite datasets feature significantly higher noise than the modelled θ. This can be explained by factors such as random errors in the satellite retrievals (Su et al., 2014b), and the rapid variation of water content in the surface layer of soil due to infiltration and evapotranspiration losses. Figure 8b presents the SWI derived from the satellite products, after seasonal rescaling ($\theta^{ams}$, $\theta^{asc}$ and $\theta^{smo}$).
This plot shows better agreement between model and observations due to SWI filtering/transformation, even though the higher noise in the rescaled SWI time series is still present. Figure 8c shows the seasonal observation error variance, and reveals a clear variation in the error with time. The variation of the seasonal error values is due to the alternating use of TC or LV and to the increasing sample size of each seasonal pool (see Section 3.6), which should reduce the uncertainties coming from finite sample size. One limitation of this procedure is its assumption that the errors vary seasonally without inter-annual variability. Since there are inter-annual cycles (wet and dry years), one may also expect the errors to vary from year to year. Ideally, moving-window estimation with windows smaller than 3 months should be considered, but that would cause greater sampling uncertainties for the TC and LV estimates. The inverse relationships between θams and θasc error variances at some times could be due to the passive retrieval by AMS compared with the active ASC, among other factors.

A common error standard deviation value used in previous SM-DA studies is 3% m³ m⁻³ (e.g., Chen et al., 2011). This constant error, when transformed according to the soil moisture storage capacity of the model and the soil porosity (see Section 3.5), gives an error variance of 667 (750) mm² for the lumped (semi-distributed) scheme. As a simple comparison, these values are within the range of the error variance estimated through seasonal LV/TC; however, a comprehensive analysis of the impacts of accounting for seasonality in SM-DA is beyond the scope of this work.

Table 3 summarises the results of the SWI calibration and seasonal rescaling for the lumped model, showing the T parameter for each SWI and the correlation coefficient (r) between θ and the satellite SM before and after SWI transformation and rescaling (θobs).
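As a rough check on the constant-error comparison in this section (667 and 750 mm²), the conversion can be sketched as follows. The exact transformation is given in Section 3.5 and is not reproduced here, so the linear form used below (volumetric error divided by porosity, multiplied by storage capacity) is an assumption, and the porosity of 0.46 is a value back-calculated from the reported variances rather than one stated in this section. The function name is hypothetical.

```python
def sm_error_variance_mm2(sigma_vol, porosity, storage_capacity_mm):
    """Convert a volumetric SM error (m3 m-3) into a storage error variance (mm2).

    Assumed linear mapping: the volumetric error is scaled to relative
    saturation by the porosity, then to millimetres by the storage capacity.
    """
    sigma_mm = sigma_vol / porosity * storage_capacity_mm
    return sigma_mm ** 2

# ASSUMPTION: porosity inferred from the reported 667 and 750 mm2, not from the source.
POROSITY_ASSUMED = 0.46

lumped_var = sm_error_variance_mm2(0.03, POROSITY_ASSUMED, 396.0)     # lumped scheme
semidist_var = sm_error_variance_mm2(0.03, POROSITY_ASSUMED, 420.0)   # semi-distributed
print(round(lumped_var), round(semidist_var))  # 667 750
```

Under these assumptions both reported variances are recovered with a single porosity value, which suggests (but does not prove) that a mapping of this form underlies the numbers quoted in the text.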
The results in Table 3 confirm the visual assessment of the plots in Fig. 8 by showing an important increase in the linear correlation coefficient with θ when satellite SM is transformed into SWI. The correlation is further increased after rescaling, which illustrates that there is clear benefit from performing seasonal bias correction. Note that applying a constant rescaling factor would have no impact on the correlation between θ and θobs.

Table 3. Parameter T and correlation coefficient (r) between model SM (θ) and satellite SM, before and after SWI transformation and rescaling. Results are presented for the entire catchment.

Dataset   T (days)   r between θ and
                     Satellite SM   SWI    θobs
AMS       3          0.65           0.74   0.94
ASC       11         0.77           0.92   0.97
SMO       40         0.46           0.79   0.93

The optimal T values (Table 3) are difficult to validate since there is no ground data to compare with and, given that they have been shown to depend strongly on the physical processes of the study site (Ceballos et al., 2005), direct comparison with other studies cannot be made reliably. Indeed, previous studies have shown a wide range of optimal T values for soil depths ranging between 10 and 100 cm. As an example, in Fig. 9 we have summarised the optimal T found in 5 different studies (Albergel et al., 2008; Brocca et al., 2009, 2010a; Ford et al., 2014; Wagner et al., 1999). Previous studies have shown that the optimal T value increases with layer depth (e.g., Brocca et al., 2010a). Results presented here show an increased T value for SMO, which would be inconsistent with L-band having a deeper penetration than AMS C-band (to limit the comparison within passive retrievals). We speculate that these differences might be due to various factors, including the different retrieval methods (which have quite different assumptions pertaining to spatial heterogeneity) and the influence of radio-frequency interference noise.
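To make the role of T concrete, the sketch below implements the recursive form of the exponential filter as published by Albergel et al. (2008). Whether Eq. (22) of this paper uses exactly this recursive form is an assumption, and the function name and synthetic input series are hypothetical. The example only illustrates that a larger T yields a smoother SWI series.

```python
import numpy as np

def swi_exponential_filter(times_days, surface_sm, T):
    """Recursive exponential filter (after Albergel et al., 2008).

    times_days: observation times in days (increasing, may be irregular)
    surface_sm: surface soil moisture retrievals at those times
    T: characteristic time length in days
    """
    times = np.asarray(times_days, dtype=float)
    sm = np.asarray(surface_sm, dtype=float)
    swi = np.empty_like(sm)
    swi[0] = sm[0]   # initialise with the first retrieval
    gain = 1.0       # initial gain
    for n in range(1, sm.size):
        dt = times[n] - times[n - 1]
        gain = gain / (gain + np.exp(-dt / T))
        swi[n] = swi[n - 1] + gain * (sm[n] - swi[n - 1])
    return swi

# Synthetic noisy surface series (hypothetical data).
rng = np.random.default_rng(1)
days = np.arange(500.0)
noisy_sm = 0.2 + 0.05 * rng.normal(size=days.size)

swi_short = swi_exponential_filter(days, noisy_sm, T=3.0)    # light smoothing
swi_long = swi_exponential_filter(days, noisy_sm, T=40.0)    # heavy smoothing
print(noisy_sm.std(), swi_short.std(), swi_long.std())
```

Because each SWI value depends on the previous one, the filtered series is autocorrelated by construction, which is relevant to the EnKF assumptions discussed in this section.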
Moreover, to the best of our knowledge, the existing studies examining the dependence of T on soil depth are usually based on a single satellite product compared against in situ measurements at variable depths. Hence it is difficult to compare our results against these studies, given the added complexity introduced by different sensing and retrieval methods.

Fig. 8. (a) shows the model soil moisture (θ) on the left axis and the satellite soil moisture observations on the right axis. (b) shows the soil moisture in the model space, after the three satellite datasets were transformed into a soil wetness index (SWI) and then rescaled by using TC or LV (θams, θasc and θsmo). (c) shows the error variance of the rescaled satellite SM observations.

Fig. 9. Optimal T parameter against soil depth found in previous studies (Wagner et al., 1999; Albergel et al., 2008; Brocca et al., 2009, 2010a; Ford et al., 2014).

There are some key theoretical issues that should be considered when using SWI as a profile SM estimator. Firstly, the parameter T in Eq. (22) was estimated by maximising the correlation between SWI and θ, which could introduce cross-correlated errors between them. This would violate the IV regression assumption of no correlation between the errors among the triplets (Sect. 3.6). A way to overcome this issue, if data requirements are met, would be to estimate a profile SM independently of the rainfall-runoff model prediction, for example by using a physically-based model to transfer surface SM into deeper layers (e.g., Richards, 1931; Beven and Germann, 1982; Manfreda et al., 2014).

Secondly, the SWI formulation explicitly incorporates autocorrelation terms, which would result in autocorrelated errors in the observation. This violates an EnKF assumption: independence between observation and prediction errors. The autocorrelation in the observation error can be transferred to the updated θ+ during the SM-DA updating step. In that case, the background prediction (θ−) error covariance at time t + 1 would be correlated with the error of the rescaled SWI at time t + 1. In contrast with the first issue listed above, the violation of the EnKF assumption cannot be avoided by replacing SWI with a physically-based model, since the latter would result in a profile SM strongly correlated with previous states as well. Indeed, given the physical mechanisms of water flux in the unsaturated soil, this problem will be present whenever a profile SM estimated from satellite SM is used as an observation in an EnKF-based data assimilation framework. A way to overcome this could be to work with models that explicitly account for the water in the top few centimetres of soil and therefore can directly assimilate a (rescaled) satellite retrieval. However, the errors in satellite SM retrievals are probably already autocorrelated (Crow and Van den Berg, 2010).

Breaching some of the assumptions of the EnKF-based scheme and/or the IV-based rescaling could theoretically degrade the performance of the SM-DA scheme when the variable analysed is soil moisture (Crow and Van den Berg, 2010; Reichle et al., 2008; Ryu et al., 2009). In this context, the performance of SM-DA with respect to the improvement in streamflow has been under-investigated. Alvarez-Garreton et al. (2013, 2014) show that in terms of streamflow prediction, SM-DA seems to be less sensitive to violation of these assumptions.
Both the lower sensitivity and the apparent contradiction with previous studies analysing soil moisture prediction performance highlight the need for further studies focusing on SM-DA for the purposes of improving streamflow prediction from rainfall-runoff models.

4.4 Satellite soil moisture data assimilation

The ensemble predictions of streamflow and θ, before and after SM-DA, for both the lumped and the semi-distributed schemes at N7, are presented in Fig. 10. The truncation bias correction (Sect. 3.3) was successful in creating an unbiased θ ensemble when the unperturbed model approached the soil water storage bounds (Figs. 10a.2 and 10b.2).

The rank histograms at N7, N1 and N3 are presented in Fig. 7. For all the evaluated nodes, the ensemble predictions are more reliable after SM-DA (flatter histograms compared with the open-loop). The consistent overestimation of the observed streamflow in the open-loop ensembles (diagonal histograms displaced towards the higher ensemble percentiles) is partially addressed by the SM-DA.

The evaluation statistics for the SM-DA are summarised in Table 4. The streamflow data of the inner catchments (N1 and N3) are used only for evaluation purposes in the semi-distributed scheme; therefore they are representative of "ungauged" inner catchments.

Table 4. SM-DA evaluation statistics calculated at the catchment outlet (N7) and at the inner catchments (N1 and N3).

Statistic   Lumped scheme   Semi-distributed scheme
            (N7)            (N7)     (N1)      (N3)
NRMSE       0.78            0.76     0.81      0.83
NSEol       0.67            0.77     0.28      -1.75
NSEup       0.64            0.78     0.26      -1.39
PODol       0.96            0.92     0.56      0.69
PODup       0.94            0.93     0.55      0.69
FPol        0.11            0.11     0.07      0.12
FPup        0.10            0.10     0.06      0.11
PVEol       5.63            35.30    -96.87    56.42
PVEup       -2.37           34.93    -109.66   40.71
CRPSol      0.32            0.26     0.74      0.20
CRPSup      0.28            0.23     0.73      0.24

The NRMSE in Table 4 (all values below 1) demonstrates that the SM-DA was effective in reducing the streamflow prediction uncertainty (RMSE) across all gauged and ungauged catchments. The reductions in the RMSE ranged from 17 to 24% for the different evaluation nodes. The NRMSE combines precision improvement (i.e., reduction of ensemble spread) with prediction accuracy improvement (i.e., enhancement of ensemble mean performance) resulting from the SM-DA. Given that the open-loop ensemble spread was larger than that of an ideal ensemble (based on the n-shaped rank histograms in Fig. 7), the reduction of the ensemble spread may be in part artificial.

The performance of the ensemble mean was assessed by computing the NSEol and NSEup (Table 4). At the catchment outlet, the NSE of the ensemble mean after SM-DA only improved for the semi-distributed scheme. At the ungauged catchments, SM-DA was effective at improving the performance of the ensemble mean only at N3, compared with the open-loop. However, the performance of the model in that catchment was still poor. This can be explained by the systematic errors present in the model for those catchments before assimilation, which were not addressed by the SM-DA.

The POD values at the catchment outlet (N7) show that before and after SM-DA, the model is consistently capable of detecting minor floods. Although this does not demonstrate an advantage of the SM-DA scheme proposed here, it does reflect the adequacy of the model ensemble prediction for simulating minor (and larger) floods. Consistent with previous results, the prediction of the semi-distributed model at the inner catchments is poorer in terms of detecting minor floods. The lower FAR values after SM-DA demonstrate the efficacy of the scheme in reducing the number of times the model predicted an unobserved minor flood, at both the gauged and the ungauged catchments.

The open-loop PVE was improved (lower PVE values) after SM-DA at N7 (for both the lumped and the semi-distributed schemes) and at N3.
This was not the case, however, for inner node N1, at which the PVE was higher after SM-DA, compared with the open-loop. When compared to the unperturbed model run (Table 2), the assimilation of satellite soil moisture improved the performance of the model in terms of PVE at all the nodes and for both the lumped and semi-distributed schemes.

Fig. 10. Streamflow (Q in mm d−1) and soil moisture (θ in mm d−1) ensemble prediction at the catchment outlet, before and after SM-DA for evaluation sub-period 2 (01 May 2007 - 02 March 2014), which had three major flooding events. (a.1) and (a.2) present the results for the lumped model. (b.1) and (b.2) present the results for the semi-distributed model. Each panel shows the open-loop (OL) ensemble and its mean and the updated ensemble and its mean, together with the observations (streamflow panels) or the unperturbed model run (soil moisture panels).

The skill of the ensembles after SM-DA was improved at the catchment outlet by 12% and 13% (expressed by a reduction in CRPS) for the lumped and semi-distributed scheme respectively, and by 17% at N1. The skill of the updated ensemble was also consistently higher than the unperturbed model run (Table 2).

To summarise the efficacy of the SM-DA, we take into account the characteristics of the ensemble predictions (open-loop and updated) in terms of their mean, skill and reliability. Overall, SM-DA was effective at improving streamflow ensemble predictions in the gauged and the ungauged catchments.
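Several of the evaluation statistics discussed above can be sketched compactly. The definitions used in this study are given in an earlier section and may differ in detail, so the forms below are commonly used ones adopted here as assumptions: the Nash-Sutcliffe efficiency, POD and FAR from a threshold-based contingency count, and the standard ensemble (energy-form) estimator of the CRPS. Function names and the toy inputs are hypothetical.

```python
import numpy as np

def nse(sim, obs):
    """Nash-Sutcliffe efficiency of a single simulated series."""
    sim, obs = np.asarray(sim, dtype=float), np.asarray(obs, dtype=float)
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

def pod_far(sim, obs, threshold):
    """Probability of detection and false alarm ratio for flows >= threshold.

    Assumes at least one observed event and one simulated event.
    """
    sim_event = np.asarray(sim, dtype=float) >= threshold
    obs_event = np.asarray(obs, dtype=float) >= threshold
    hits = np.sum(sim_event & obs_event)
    misses = np.sum(~sim_event & obs_event)
    false_alarms = np.sum(sim_event & ~obs_event)
    return hits / (hits + misses), false_alarms / (hits + false_alarms)

def crps_ensemble(ensemble, obs):
    """Time-averaged CRPS of an ensemble of shape (n_times, n_members)."""
    ens = np.asarray(ensemble, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # Mean absolute error of the members against the observation.
    abs_error = np.abs(ens - obs[:, None]).mean(axis=1)
    # Mean absolute difference between all member pairs (ensemble spread).
    spread = np.abs(ens[:, :, None] - ens[:, None, :]).mean(axis=(1, 2))
    return float(np.mean(abs_error - 0.5 * spread))

# Toy usage with made-up numbers.
pod, far = pod_far(sim=[0.0, 5.0, 5.0, 0.0], obs=[0.0, 5.0, 0.0, 5.0], threshold=1.0)
toy_crps = crps_ensemble([[0.0, 2.0]], [1.0])
print(pod, far, toy_crps)
```

A lower CRPS indicates a more skilful ensemble, so a relative improvement after assimilation can be expressed as one minus the ratio of the updated to the open-loop CRPS.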
By accounting for rainfall spatial distribution and routing processes within the large study catchment, we improved the model performance at the outlet compared with a lumped homogeneous scheme. This led to greater improvements from the SM-DA for the semi-distributed model. The latter was achieved even though the relationship between θ and the streamflow prediction was weaker in the semi-distributed scheme (Fig. 5). The proposed SM-DA scheme therefore has the merit of improving streamflow ensemble predictions by correcting the SM state of the model, even when rainfall appears to be the main driver of the runoff mechanism (see Sect. 4.1).

5 Conclusions

This paper presents an evaluation of the assimilation of passive and active satellite soil moisture observations (SM-DA) into a conceptual rainfall-runoff model (PDM) for the purpose of reducing flood prediction uncertainty in a sparsely monitored catchment. We set up the experiments in the large semi-arid Warrego River Basin (>40,000 km²) in south central Queensland, Australia. Within this context, we explore the advantages of accounting for the forcing data spatial distribution and the routing processes within the catchment.

The framework proposed here rigorously addressed the two main stages of a SM-DA scheme: model error representation and satellite data processing. We applied the different methods in the context of a sparsely monitored large catchment (i.e., limited data), under operational streamflow and flood forecasting scenarios (i.e., no future information is used in any of the presented methods).

The model error representation was the most critical step in the SM-DA scheme, since it determined the error covariance between observations and model state, and thus the potential efficacy of SM-DA.
Moreover, the SM-DA evaluation was done against the open-loop ensemble prediction. We addressed key issues of the ensemble generation process by correcting truncation biases in soil moisture and streamflow predictions. This prevented an unintended degradation of the open-loop ensembles coming from perturbing a highly non-linear model. The open-loop ensembles at the catchment outlet provide key information about prediction uncertainty, which is required for assessing risks associated with water management decisions (Robertson et al., 2013). These ensembles showed a slight bias with respect to the observed streamflow and featured a wide spread. Further exploration of model error representation (sources of error and the structure of those errors) and error parameter estimation is required to improve the characteristics of the open-loop ensemble prediction.

In the satellite data processing, we highlighted that the use of an exponential filter to transfer surface information into deeper layers may potentially lead to violation of some of the TC and EnKF assumptions (Sect. 4.3). Possible solutions to overcome this would be to use more physically-based methods to transfer satellite SM into deeper layers, or to use a rainfall-runoff model that explicitly accounts for the surface soil layer and can therefore directly assimilate a (rescaled) satellite SM product. However, both solutions are constrained by the ancillary data available for satisfactory implementation of a physically-based model.

In the rescaling and error estimation procedure, we applied seasonal TC and LV to avoid error-in-variable biases. Applying these to correct biases in the SWI showed improved agreement between observed and modelled SM. This seasonal approach is novel in the context of SM-DA and tends to lead to closer agreement between model and observations. Further investigation is required to assess the impacts and importance of accounting for seasonality in rescaling and error estimation.
The evaluation of the SM-DA results led to several insights. 1) The SM-DA was successful at improving the open-loop ensemble prediction at the catchment outlet, for both the lumped and the semi-distributed case. 2) Accounting for spatial distribution in the model forcing data and for the routing processes within the large study catchment improved the skill of the SM-DA at the catchment outlet. 3) The SM-DA was effective at improving streamflow prediction at the ungauged locations, compared with the open-loop. However, the updated prediction in those catchments was still poor, because the systematic errors before assimilation are not addressed by a SM-DA scheme.

This work provides new evidence of the efficacy of SM-DA in improving streamflow ensemble predictions within sparsely instrumented catchments. We demonstrate that SM-DA skill can be enhanced if the spatial distribution of forcing data and routing processes within the catchment are accounted for in large catchments. We show that SM-DA performance is directly related to the model quality before assimilation. Therefore we recommend that efforts should be focused on ensuring adequate models, while evaluating the trade-offs between more complex models and data availability.

Acknowledgements. The authors wish to thank one anonymous reviewer, Dr. Uwe Ehret and the Chief-Executive Editor Dr. Erwin Zehe for their constructive comments and suggestions on the earlier draft of the paper. This research was conducted with financial support from the Australian Research Council (ARC Linkage Project No. LP110200520) and the Australian Bureau of Meteorology. C. Alvarez-Garreton was supported by a Becas Chile scholarship.

References

Albergel, C., Rüdiger, C., Pellarin, T., Calvet, J.-C., Fritz, N., Froissard, F., Suquia, D., Petitpa, A., Piguet, B., Martin, E., et al.: From near-surface to root-zone soil moisture using an exponential filter: an assessment of the method based on in-situ observations and model simulations, Hydrol.
Earth Syst. Sci., 12, 1323–1337, 2008.
Albergel, C., Rüdiger, C., Carrer, D., Calvet, J.-C., Fritz, N., Naeimi, V., Bartalis, Z., and Hasenauer, S.: An evaluation of ASCAT surface soil moisture products with in-situ observations in Southwestern France, Hydrol. Earth Syst. Sci., 13, 115–124, 2009.
Albergel, C., Calvet, J., De Rosnay, P., Balsamo, G., Wagner, W., Hasenauer, S., Naeimi, V., Martin, E., Bazile, E., Bouyssel, F., et al.: Cross-evaluation of modelled and remotely sensed surface soil moisture with in situ data in southwestern France, Hydrol. Earth Syst. Sci., 14, 2177–2191, 2010.
Albergel, C., de Rosnay, P., Gruhier, C., Muñoz-Sabater, J., Hasenauer, S., Isaksen, L., Kerr, Y., and Wagner, W.: Evaluation of remotely sensed and modelled soil moisture products using global ground-based in situ observations, Remote Sens. Environ., 118, 215–226, 2012.
Alvarez-Garreton, C., Ryu, D., Western, A. W., Crow, W. T., and Robertson, D. E.: Impact of observation error structure on satellite soil moisture assimilation into a rainfall-runoff model, in: MODSIM2013, 20th International Congress on Modelling and Simulation. Modelling and Simulation Society of Australia and New Zealand, edited by Piantadosi, J., Anderssen, R., and Boland, J., pp. 3071–3077, 2013.
Alvarez-Garreton, C., Ryu, D., Western, A., Crow, W., and Robertson, D.: The impacts of assimilating satellite soil moisture into a rainfall–runoff model in a semi-arid catchment, J. Hydrol., 519, 2763–2774, 2014.
Anderson, J. L.: A method for producing and evaluating probabilistic forecasts from ensemble model integrations, J. Climate, 9, 1518–1530, 1996.
Beven, K. and Germann, P.: Macropores and water flow in soils, Water Resour. Res., 18, 1311–1325, 1982.
Brocca, L., Melone, F., Moramarco, T., and Morbidelli, R.: Antecedent wetness conditions based on ERS scatterometer data, J. Hydrol., 364, 73–87, 2009.
Brocca, L., Melone, F., Moramarco, T., Wagner, W., and Hasenauer, S.: ASCAT soil wetness index validation through in situ and modeled soil moisture data in central Italy, Remote Sens. Environ., 114, 2745–2755, 2010a.
Brocca, L., Melone, F., Moramarco, T., Wagner, W., Naeimi, V., Bartalis, Z., and Hasenauer, S.: Improving runoff prediction through the assimilation of the ASCAT soil moisture product, Hydrol. Earth Syst. Sci., 14, 1881–1893, 2010b.
Brocca, L., Hasenauer, S., Lacava, T., Melone, F., Moramarco, T., Wagner, W., Dorigo, W., Matgen, P., Martínez-Fernández, J., Llorens, P., et al.: Soil moisture estimation through ASCAT and AMSR-E sensors: an intercomparison and validation study across Europe, Remote Sens. Environ., 115, 3390–3408, 2011.
Brocca, L., Moramarco, T., Melone, F., Wagner, W., Hasenauer, S., and Hahn, S.: Assimilation of Surface-and Root-Zone ASCAT Soil Moisture Products Into Rainfall–Runoff Modeling, IEEE T. Geosci. Remote, 50, 2542–2555, 2012.
Ceballos, A., Scipal, K., Wagner, W., and Martínez-Fernández, J.: Validation of ERS scatterometer-derived soil moisture data in the central part of the Duero Basin, Spain, Hydrol. Process., 19, 1549–1566, 2005.
Chen, F., Crow, W. T., Starks, P. J., and Moriasi, D. N.: Improving hydrologic predictions of a catchment model via assimilation of surface soil moisture, Adv. Water Resour., 34, 526–536, 2011.
Chen, F., Crow, W. C., and Ryu, D.: Dual forcing and state correction via soil moisture assimilation for improved rainfall runoff modelling, J. Hydrometeorol., doi:10.1175/JHM-D-14-0002.1, 2014.
Chipperfield, A. and Fleming, P.: The MATLAB genetic algorithm toolbox, in: Applied Control Techniques Using MATLAB, IEE Colloquium on, pp. 10/1–10/4, 1995.
Crow, W.
and Ryu, D.: A new data assimilation approach for improving runoff prediction using remotely-sensed soil moisture retrievals, Hydrol. Earth Syst. Sci., 13, 1–16, 2009.
Crow, W. T. and Reichle, R. H.: Comparison of adaptive filtering techniques for land surface data assimilation, Water Resour. Res., 44, W08423, doi:10.1029/2008WR006883, 2008.
Crow, W. T. and Van den Berg, M. J.: An improved approach for estimating observation and model error parameters in soil moisture data assimilation, Water Resour. Res., 46, W12519, doi:10.1029/2010WR009402, 2010.
Crow, W. T. and van Loon, E.: Impact of Incorrect Model Error Assumptions on the Sequential Assimilation of Remotely Sensed Surface Soil Moisture, J. Hydrometeorol., 7, 421–432, 2006.
De Lannoy, G. J., Houser, P. R., Pauwels, V., and Verhoest, N. E.: Assessment of model uncertainty for soil moisture through ensemble verification, J. Geophys. Res. - Atmos., 111, D10101, doi:10.1029/2005JD006367, 2006.
Dee, D. P. and Da Silva, A. M.: Data assimilation in the presence of forecast bias, Q. J. Roy. Meteor. Soc., 124, 269–295, 1998.
Draper, C. S., Walker, J. P., Steinle, P. J., de Jeu, R. A., and Holmes, T. R.: An evaluation of AMSR–E derived soil moisture over Australia, Remote Sens. Environ., 113, 703–710, 2009.
Evensen, G.: The ensemble Kalman filter: Theoretical formulation and practical implementation, Ocean Dynam., 53, 343–367, 2003.
Ford, T., Harris, E., and Quiring, S.: Estimating root zone soil moisture using near-surface observations from SMOS, Hydrol. Earth Syst. Sci., 18, 139–154, 2014.
Francois, C., Quesney, A., and Ottlé, C.: Sequential assimilation of ERS-1 SAR data into a coupled land surface-hydrological model using an extended Kalman filter, J. Hydrometeorol., 4, 473–487, 2003.
Gill, M. A.: Flood routing by the Muskingum method, J. Hydrol., 36, 353–363, 1978.
Gruhier, C., De Rosnay, P., Hasenauer, S., Holmes, T. R., De Jeu, R. A., Kerr, Y.
H., Mougin, E., Njoku, E., Timouk, F., Wagner, W., et al.: Soil moisture active and passive microwave products: intercomparison and evaluation over a Sahelian site, Hydrol. Earth Syst. Sci., 2010.
Hain, C. R., Crow, W. T., Anderson, M. C., and Mecikalski, J. R.: An ensemble Kalman filter dual assimilation of thermal infrared and microwave satellite observations of soil moisture into the Noah land surface model, Water Resour. Res., 48, W11517, doi:10.1029/2011WR011268, 2012.
Jones, D. A., Wang, W., and Fawcett, R.: High-quality spatial climate data-sets for Australia, Australian Meteorological and Oceanographic Journal, 58, 233–248, 2009.
Jothityangkoon, C., Sivapalan, M., and Farmer, D.: Process controls of water balance variability in a large semi-arid catchment: downward approach to hydrological model development, J. Hydrol., 254, 174–198, 2001.
Kerr, Y. H., Waldteufel, P., Richaume, P., Wigneron, J.-P., Ferrazzoli, P., Mahmoodi, A., Al Bitar, A., Cabot, F., Gruhier, C., Juglea, S. E., et al.: The SMOS soil moisture retrieval algorithm, Geoscience and Remote Sensing, IEEE Transactions on, 50, 1384–1403, 2012.
Li, Y., Ryu, D., Western, A. W., Wang, Q., Robertson, D. E., and Crow, W. T.: An integrated error parameter estimation and lag-aware data assimilation scheme for real-time flood forecasting, J. Hydrol., 519, 2722–2736, 2014.
Liu, Y. Q. and Gupta, H. V.: Uncertainty in hydrologic modeling: Toward an integrated data assimilation framework, Water Resour. Res., 43, W07401, doi:10.1029/2006WR005756, 2007.
Loew, A. and Schlenz, F.: A dynamic approach for evaluating coarse scale satellite soil moisture products, Hydrol. Earth Syst. Sci., 15, 75–90, 2011.
Manfreda, S., Brocca, L., Moramarco, T., Melone, F., and Sheffield, J.: A physically based approach for the estimation of root-zone soil moisture from surface measurements, Hydrol. Earth Syst. Sci., 18, 1199–1212, 2014.
McKenzie, N. J., Jacquier, D., Ashton, L., and Cresswell, H.: Estimation of soil properties using the Atlas of Australian Soils, CSIRO Land and Water Canberra, 2000.
McMillan, H., Jackson, B., Clark, M., Kavetski, D., and Woods, R.: Rainfall uncertainty in hydrological modelling: An evaluation of multiplicative error models, J. Hydrol., 400, 83–94, 2011.
Moore, R. and Bell, V.: Incorporation of groundwater losses and well level data in rainfall-runoff models illustrated using the PDM, Hydrology and Earth System Sciences Discussions, 6, 25–38, 2002.
Moore, R. J.: The PDM rainfall-runoff model, Hydrol. Earth Syst. Sci., 11, 483–499, 2007.
Naeimi, V., Scipal, K., Bartalis, Z., Hasenauer, S., and Wagner, W.: An improved soil moisture retrieval algorithm for ERS and METOP scatterometer observations, IEEE T. Geosci. Remote, 47, 1999–2013, 2009.
Nash, J. and Sutcliffe, J.: River flow forecasting through conceptual models part I: A discussion of principles, J. Hydrol., 10, 282–290, 1970.
Owe, M., de Jeu, R., and Holmes, T.: Multisensor historical climatology of satellite-derived global land surface moisture, J. Geophys. Res. - Earth, 113, F01002, doi:10.1029/2007JF000769, 2008.
Plaza, D., De Keyser, R., De Lannoy, G., Giustarini, L., Matgen, P., and Pauwels, V.: The importance of parameter resampling for soil moisture data assimilation into hydrologic models using the particle filter, Hydrol. Earth Syst. Sci., 16, 2012.
Qiu, J., Crow, W. T., Nearing, G. S., Mo, X., and Liu, S.: The impact of vertical measurement depth on the information content of soil moisture times series data, Geophys. Res. Lett., 41, 4997–5004, 2014.
Reichle, R. H., Crow, W. T., and Keppenne, C.
L.: An adaptive ensemble Kalman filter for soil moisture data assimilation, Water Resour. Res., 44, W03423, doi:10.1029/2007WR006357, 2008.
Richards, L. A.: Capillary conduction of liquids through porous mediums, Physics, 1, 318–333, 1931.
Robertson, D. E., Shrestha, D. L., and Wang, Q. J.: Post-processing rainfall forecasts from numerical weather prediction models for short-term streamflow forecasting, Hydrol. Earth Syst. Sci., 17, 3587–3603, 2013.
Ryu, D., Crow, W. T., Zhan, X., and Jackson, T. J.: Correcting Unintended Perturbation Biases in Hydrologic Data Assimilation, J. Hydrometeorol., 10, 734–750, 2009.
Scipal, K., Holmes, T., De Jeu, R., Naeimi, V., and Wagner, W.: A possible solution for the problem of estimating the error structure of global soil moisture data sets, Geophys. Res. Lett., 35, L24403, doi:10.1029/2008GL035599, 2008.
Stoffelen, A.: Toward the true near-surface wind speed: Error modeling and calibration using triple collocation, J. Geophys. Res. Oceans, 103, 7755–7766, 1998.
Su, C., Ryu, D., Crow, W. T., and Western, A. W.: Beyond triple collocation: Applications to soil moisture monitoring, J. Geophys. Res. - Atmos., 119(11), 6416–6439, 2014a.
Su, C.-H., Ryu, D., Young, R. I., Western, A. W., and Wagner, W.: Inter-comparison of microwave satellite soil moisture retrievals over the Murrumbidgee Basin, southeast Australia, Remote Sens. Environ., 134, 1–11, 2013.
Su, C.-H., Ryu, D., Crow, W. T., and Western, A. W.: Stand-alone error characterisation of microwave satellite soil moisture using a Fourier method, Remote Sens. Environ., 154, 115–126, 2014b.
Thielen, J., Bartholmes, J., Ramos, M.-H., and De Roo, A.: The European Flood Alert System-Part 1: Concept and development, Hydrol. Earth Syst. Sci., 13, 2009.
Tian, Y., Huffman, G. J., Adler, R. F., Tang, L., Sapiano, M., Maggioni, V., and Wu, H.: Modeling errors in daily precipitation measurements: Additive or multiplicative?, Geophys. Res. Lett., 40, 2060–2065, 2013.
Wagner, W., Lemoine, G., and Rott, H.: A method for estimating soil moisture from ERS scatterometer and soil data, Remote Sens. Environ., 70, 191–207, 1999.
Wanders, N., Karssenberg, D., Roo, A. d., de Jong, S., and Bierkens, M.: The suitability of remotely sensed soil moisture for improving operational flood forecasting, Hydrol. Earth Syst. Sci., 18, 2343–2357, 2014.
Wang, Q., Robertson, D., and Chiew, F.: A Bayesian joint probability modeling approach for seasonal forecasting of streamflows at multiple sites, Water Resour. Res., 45, W05407, doi:10.1029/2008WR007355, 2009.
Western, A. W., Grayson, R. B., and Blöschl, G.: Scaling of soil moisture: A hydrologic perspective, Annual Review of Earth and Planetary Sciences, 30, 149–180, 2002.
Yilmaz, M. T. and Crow, W. T.: The optimality of potential rescaling approaches in land data assimilation, J. Hydrometeorol., 14, 650–660, 2013.
Zwieback, S., Scipal, K., Dorigo, W., and Wagner, W.: Structural and statistical properties of the collocation technique for error characterization, Nonlinear Proc. Geoph., 19, 69–80, 2012.

Chapter 6

Dual correction scheme

This chapter was re-submitted to Water Resources Research after moderate revisions, as the following article: Alvarez-Garreton C., Ryu D., Western A.W., Crow W.T., Su, C.-H., and Robertson D.E. Dual assimilation of satellite soil moisture to improve streamflow prediction in data-scarce catchments. Submitted.

Abstract

This paper explores the use of active and passive microwave satellite soil moisture products for improving streamflow prediction within four large (>5,000 km²) semi-arid catchments in Australia. We use the probability distributed model (PDM) under a data-scarce scenario and aim at correcting two key controlling factors in the streamflow generation: the rainfall forcing data and the catchment wetness condition.
The soil moisture analysis rainfall tool (SMART) is used to correct a near-real time satellite rainfall product (forcing correction scheme) and an ensemble Kalman filter is used to correct the PDM soil moisture state (state correction scheme). These two schemes are combined in a dual correction scheme and we assess the relative improvements of each. Our results demonstrate that the quality of the satellite rainfall product is improved by SMART during moderate-to-high daily rainfall events, which in turn leads to improved streamflow prediction during high flows. When employed individually, the soil moisture state correction scheme generally outperforms the rainfall correction scheme, especially for low flows. Overall, the combined dual correction scheme further improves the streamflow predictions (reduction in root mean square error and false alarm ratio, and increase in correlation coefficient and Nash-Sutcliffe efficiency). Our results provide new evidence of the value of satellite soil moisture observations within data-scarce regions. We also identify a number of challenges and limitations within the schemes.

1 Introduction

Flood prediction in sparsely monitored and ungauged catchments can suffer from large uncertainties given the quality of the data used to force and calibrate the models. Addressing this challenge, a number of studies have explored data assimilation methods to integrate various existing observations from the ground and satellites into streamflow models (e.g., Moradkhani et al., 2005b; Liu and Gupta, 2007; Lievens et al., 2015; Lopez et al., 2015; Mendoza et al., 2012; Wanders et al., 2014). Within this context, and given the essential role that soil moisture (SM) plays in the runoff generation (Western et al. (2002) and references therein), significant attention has been given to satellite SM observations.
Microwave retrievals of SM provide near real-time estimates of the water content of the top few centimetres of soil, at a global scale every 1-3 days. Moreover, satellite SM estimates have shown good agreement with ground data (Albergel et al., 2009; Draper et al., 2009b; Gruhier et al., 2010; Brocca et al., 2011; Su et al., 2013). A popular approach has been to use satellite SM in a state correction scheme (e.g., Francois et al., 2003; Brocca et al., 2010, 2012a; Alvarez-Garreton et al., 2013, 2014; Chen et al., 2014; Wanders et al., 2014; Alvarez-Garreton et al., 2015; Massari et al., 2015). The rationale is that processed satellite SM can be used to update the SM state of rainfall-runoff models, enabling more accurate prediction of the catchment response to precipitation and thus better streamflow prediction. These studies have generally shown positive results for reducing streamflow prediction uncertainty, although important limitations have been identified. The limitations influencing the efficacy of the state update schemes include the limited knowledge and skill gaps in structural and parameter uncertainties, the errors in forcing data, the particular runoff mechanisms within the catchment (Alvarez-Garreton et al., 2015), the experimental setup (e.g., model error quantification, observation error quantification, satellite data processing techniques, data assimilation scheme), and the specific catchment characteristics (e.g., soil type, location and land cover) (Massari et al., 2015). Since the aim of a state correction scheme is to reduce the errors in the model SM, the reduction in streamflow uncertainty will depend on the error covariance between these two components. This error covariance may be weak when the errors in streamflow come mainly from errors in the rainfall input data (Crow and Ryu, 2009). The latter becomes critical in locations without rain gauges, where the available rainfall data generally comes from satellites.
Satellite rainfall products provide near real-time information with high temporal resolution, which can be used for flood forecasting and monitoring. This information, however, contains bias and errors that are usually corrected by using rain gauges (Yong et al., 2013; Zhou et al., 2014; Yong et al., 2015). To dispense with the need for weather stations (which are not available in large parts of the world), recent studies have shown that these products can potentially be improved by using satellite SM observations (Crow and Bolten, 2007; Pellarin et al., 2008; Crow et al., 2009, 2011; Pellarin et al., 2013; Brocca et al., 2013, 2014; Wanders et al., 2015; Zhan et al., 2015). The argument is that given the information that surface SM contains about antecedent rainfall events, the magnitude of these events can be estimated by satellite SM retrievals through water balance models. Although these studies have different approaches, they have all shown the potential improvement of rainfall estimates by using satellite SM.

The potential of SM observations to correct errors in both the model states and the forcing data has motivated recent studies to test these dual forcing/state correction schemes (dual SM-DA). For example, Massari et al. (2014) set up a scheme in which in-situ SM observations were used to correct the rainfall (through the SM2RAIN algorithm introduced by Brocca et al. (2013)) and to initialise the wetness condition of a simple rainfall-runoff model. Their results showed high potential for SM data to improve flood modelling in a case study. Using a more complex assimilation scheme and rainfall-runoff model, Crow and Ryu (2009) set up a state SM-DA scheme integrated with a rainfall correction scheme (via the antecedent version of the soil moisture analysis rainfall tool, SMART, introduced by Crow et al. (2009)) in a series of synthetic twin experiments.
To prevent the potential introduction of cross-correlation between observations and forecasting errors coming from the dual use of satellite SM, Crow and Ryu (2009) applied the rainfall correction offline (i.e., the corrected rainfall is not used within the analysis cycle used to update SM states). The results of this dual SM-DA scheme were further supported by Chen et al. (2014) in a real data application over 13 study catchments in the central United States, with areas ranging between 700 and 10,000 km². Both studies showed that the satellite rainfall correction led to improvement in streamflow prediction, especially during high flow periods. Conversely, the soil water state correction mainly led to improvement of the base flow component (low-flow periods). The combined state and forcing correction scheme led to improvement of both the high and low flow components of the streamflow, outperforming both the state and the forcing correction schemes in isolation. However, it remains unclear how this dual SM-DA scheme performs for different catchment characteristics (such as climate and rainfall-runoff mechanisms) and under different experimental conditions (such as the data assimilation setup, model structure and quality of the forcing data).

In this paper we expand the evaluation of the dual SM-DA proposed by Crow and Ryu (2009) by using very distinct catchments and experimental conditions different from those of Chen et al. (2014). In contrast to previous studies (e.g., Crow and Ryu, 2009; Crow et al., 2011; Chen et al., 2014; Massari et al., 2014), we focus on large semi-arid catchments in Australia with a history of relatively frequent flooding. Additionally, these catchments are sparsely instrumented, so streamflow prediction is a great challenge. One of the catchments was previously studied by Alvarez-Garreton et al. (2014, 2015) while exploring effective state correction schemes for improving flood prediction. In this paper we expand the state correction scheme proposed by Alvarez-Garreton et al. (2014, 2015) by incorporating three other catchments and by combining the data assimilation scheme with a rainfall correction scheme. Also, this is the first work that applies the dual data assimilation scheme to semi-distributed rainfall-runoff modelling.

Table 1: Study catchments characteristics.

Catchment   Outlet stream gauge          Record initial year   Mean annual rainfall (mm)   Area (km²)
Warrego     Warrego River at Wyandra     1967                  537                         42,870
Comet       Comet River at The Lake      1972                  723                         10,470
Thomson     Thomson River at Longreach   1969                  516                         57,734
Barcoo      Barcoo River at Blackall     1969                  570                         5,758

We devise the dual SM-DA scheme under a scenario without rain gauges (only satellite data is used to force the model) to answer four main questions: 1) Can we improve the quality of an operational satellite rainfall product by the assimilation of satellite soil moisture using SMART? 2) Does this rainfall correction scheme have a positive impact on streamflow predictions? 3) Can we improve streamflow prediction by the assimilation of satellite SM in a state correction scheme? 4) What are the impacts on streamflow prediction of a combined state and forcing correction scheme?

To set up the experiments, we use the probability distributed model (PDM) forced with the Tropical Rainfall Measuring Mission (TRMM) Multisatellite Precipitation Analysis (TMPA) rainfall products. We assimilate passive and active satellite SM products to correct the PDM soil moisture state (state correction scheme via an ensemble Kalman filter, EnKF) and to correct the satellite rainfall (forcing correction scheme via SMART).

2 Study Area and Data

The study area consists of four catchments in Queensland, Australia: the Warrego, Comet, Thomson and Barcoo (Figure 1). These catchments were selected for their flooding history, along with their low density of rainfall gauge networks.
Some of the main characteristics of the catchments, including the mean annual rainfall (calculated using the 3B42 dataset, described below), area and stream gauge at the outlet, are summarised in Table 1. The catchments are located in an arid, steppe, hot climatic region (Peel et al., 2007) and feature summer-dominated rainfall (Figure 2). Moreover, since the ground-monitoring network within the catchments is sparse (rainfall gauges are shown in Figure 1), satellite SM data is likely to be more valuable than in well-instrumented catchments.

Streamflow records were collected from the State of Queensland, Department of Natural Resources and Mines (http://watermonitoring.dnrm.qld.gov.au/) for each outlet gauge (Table 1). Potential evapotranspiration was obtained from the climatological 0.05° gridded data provided by the Australian Bureau of Meteorology (Australian Data Archive for Meteorology database) and daily values were estimated by assuming a uniform daily distribution within a month.

Figure 1: Study catchments and rainfall gauges. [Map of Australia, scale 1:80,000,000, showing rainfall gauges and the Warrego, Comet, Thomson and Barcoo study catchments.]

Figure 2: Seasonal rainfall (bars) and potential evapotranspiration (lines) of the study catchments, calculated for the period January 1998 to August 2013. [Axes: month (Jan to Dec), mean monthly rainfall (mm), mean monthly PET (mm); series for Warrego, Comet, Thomson and Barcoo.]

Satellite rainfall data were obtained from the Tropical Rainfall Measuring Mission (TRMM) Multisatellite Precipitation Analysis (TMPA) (Huffman et al., 2007). We used the 0.25° resolution corrected TMPA research product (3B42, period 01 January 1998 - 31 December 2013) and the near real-time operational product (3B42-RT, period 01 January 2000 - 28 November 2013), which is derived exclusively from satellite-based observations. A daily averaged time series was calculated for each study sub-catchment (sub-catchment delineations are presented in Figure 3). The 3B42-RT product was corrected using SMART (section 3.2). To evaluate the correction scheme, we used the gauge-interpolated rainfall dataset of the Australian Water Availability Project (AWAP) (Jones et al., 2009) as the benchmark rainfall. The near real-time satellite product was also used to force the rainfall-runoff models. These runs were used as the reference to evaluate the different data assimilation schemes (section 3.5).

Satellite SM products were obtained from one active and two passive sensors.
The active sensor product was the TU-WIEN (Vienna University of Technology) Advanced Scatterometer (ASCAT, ASC hereafter) data produced using the change-detection algorithm (Water Retrieval Package, version 5.4) (Naeimi et al., 2009), for the period 04 January 2007 - 14 July 2013. One passive sensor product was the Advanced Microwave Scanning Radiometer - Earth Observing System (AMSR-E, AMS hereafter) version 5 VUA-NASA Land Parameter Retrieval Model, Level 3 gridded product (Owe et al., 2008) for the period 29 July 2002 - 03 October 2011. The second passive sensor product was the Soil Moisture and Ocean Salinity satellite (SMOS, SMO hereafter), version RE02 (Re-processed 1-day global SM) product provided by Centre Aval de Traitement des Donnees for the period 16 January 2010 - 01 January 2014. A daily averaged SM value was calculated for each product over the study sub-catchments (Figure 3). The areal SM estimate over a catchment was calculated by averaging the values of ascending and descending satellite passes on days when more than 50% of the catchment had valid data. For AMS and SMO, we subtracted the long-term temporal mean of the ascending and descending datasets before areal SM estimation to remove the systematic bias between them (Brocca et al., 2011; Draper et al., 2009b).

3 Methods

3.1 Rainfall-Runoff Model

The probability distributed model (PDM) is a parsimonious rainfall-runoff model that has been widely used in hydrologic research and applications (Moore, 2007). PDM belongs to the set of models within the flood forecasting system managed by the Australian Bureau of Meteorology. The model estimates a profile average SM (θ) within the catchment (water content of S1 in Figure 4) by conceptualising the soil water store S1 with varying capacities across the catchment. In this study, the spatial heterogeneity of the store capacities was represented by a Pareto distribution function.
The SM component and the net rainfall (total rainfall minus evaporation and drainage) define the separation between direct runoff and sub-surface runoff. Direct runoff is transformed into surface runoff by two reservoirs (S21 and S22 in Figure 4). Sub-surface runoff is estimated based on the drainage from S1 and transformed into baseflow by using a one-storage reservoir (S3 in Figure 4). Surface runoff and groundwater flow are combined as total runoff (streamflow hereafter). A detailed explanation of the model conceptualisation is presented in Moore (2007) and the description of the formulation used in this research is provided in Alvarez-Garreton et al. (2015).

A semi-distributed scheme using PDM was set up at a daily time step for each of the study catchments (see Figure 3). The time constant parameters of reservoirs S21, S22 and S3 (k1, k2 and kb, respectively) were scaled by the area of each sub-catchment. Following Alvarez-Garreton et al. (2015), the river routing between nodes was represented by a linear Muskingum method (Gill, 1978) with a storage time constant km. This parameter was scaled by the length of the river channel between consecutive nodes. The rest of the model and routing parameters were treated as homogeneous within each study catchment.

Figure 3: Semi-distributed schemes within the study catchments. [Four map panels (Warrego, Comet, Thomson, Barcoo), each at scale 1:8,000,000, showing nodes and numbered sub-catchments (SC1, SC2, ...).]

Figure 4: The PDM scheme. [Schematic: rainfall P and evaporation E act on the soil water store S1; direct runoff passes through the fast flow storages S21 and S22 to give surface runoff; drainage from S1 gives sub-surface runoff, which passes through the slow flow storage S3 to give baseflow; surface runoff and baseflow combine into total runoff Q.]
To calibrate the model parameters, we forced the model with the 3B42 rainfall dataset and used half of its entire period of record to calculate the objective function, which was based on the Nash-Sutcliffe model efficiency (NSE) (Nash and Sutcliffe, 1970). In particular, we divided the complete period into wet and dry years (based on mean annual rainfall from Table 1) and selected half the wet and half the dry years as the calibration period. The calibration was done by using a genetic algorithm (Chipperfield and Fleming, 1995).

3.2 Forcing Correction Scheme

Following Crow et al. (2011), we implemented the soil moisture analysis rainfall tool (SMART) to correct the 3B42-RT satellite rainfall dataset. This corrected rainfall dataset was used to force the PDM in the forcing correction and the dual correction schemes (see schematic in Figure 5). In general terms, SMART uses the antecedent precipitation index (API) model to estimate a SM proxy. This proxy is corrected by using satellite SM observations via a Kalman filter. The Kalman filter innovations are then used to correct the potential errors in the satellite rainfall data used to force the API model.

The API model is used here because SMART was shown to perform better when applied to a linear water balance model lacking saturation (which is a non-linear process represented by rainfall-runoff models). This was one of the key findings in Crow et al. (2011), where the use of a more complex land surface model did not enhance the correction of rainfall accumulations. The expected improvement coming from a more realistic modeling of soil moisture (which accounted for the energy balance control on soil water loss, soil saturation and runoff generation) was uncertain given the challenges in evapotranspiration modeling approaches and the multilayered model with finite soil water capacities.
The API model at day $t$ was formulated as

$$\mathrm{API}(t) = \gamma(t)\,\mathrm{API}(t-1) + P(t), \tag{1}$$

where $P$ is the 3B42-RT rainfall data and $\gamma(t)$ is a dimensionless loss coefficient that varies according to the day of the year ($D$):

$$\gamma(t) = 0.8 + 0.05\,(2\pi D(t)/365). \tag{2}$$

The coefficients in (2) aim to capture radiation and climatological temperature effects. Their values were adopted from Chen et al. (2014). Following Crow and Ryu (2009), prior to being assimilated into the API model, satellite SM observations were rescaled into the API space by using a cumulative distribution function.

When a rescaled AMS, ASC and/or SMO observation ($\Theta^{ams}$, $\Theta^{asc}$ and $\Theta^{smo}$, respectively) was available at time $t$, API was updated using three sequential steps (the time $t$ is omitted from the following equations when all the terms correspond to the same time step):

1. If $\Theta^{ams}$ was available at time $t$,

$$\mathrm{API}^{+} = \mathrm{API}^{-} + K^{f}\,(\Theta^{ams} - \mathrm{API}^{-}). \tag{3}$$

The superscripts minus and plus denote before and after updating, respectively. $K^{f}$ is the Kalman gain (superscript $f$ refers to forcing scheme) calculated at each time step as

$$K^{f} = \frac{T^{-}}{T^{-} + \Sigma}, \tag{4}$$

where $\Sigma$ is the scalar error variance of the observation $\Theta^{ams}$, fixed at 0.042 mm³ mm⁻³. The use of a fixed observation error variance follows previous studies applying SMART (Chen et al., 2014; Crow and Ryu, 2009; Crow et al., 2011). Since $\lambda$ in (10) is calibrated for each sub-catchment, the difference in (relative) SM error estimates is compensated within the calibration process. $T^{-}$ is the scalar error variance for the API forecast at time $t$, calculated as

$$T^{-}(t) = \gamma^{2}(t)\,T^{+}(t-1) + Z + P^{2}(t). \tag{5}$$

$T^{+}$ is the updated API error variance, calculated whenever API was updated, as

$$T^{+} = (1 - K^{f})\,T^{-}. \tag{6}$$

The term $Z + P^{2}(t)$ in (5) is the model background uncertainty added in each time step, which assumes greater error in API prediction when $P$ is greater than zero. Following Crow et al. (2011) and Chen et al.
(2014), $Z$ was fixed at 3 mm² and at 5 (dimensionless). If $\Theta^{ams}$ was not available at time $t$, $\mathrm{API}^{+} = \mathrm{API}^{-}$, i.e., no correction was done.

2. If $\Theta^{asc}$ was available at time $t$,

$$\mathrm{API}^{++} = \mathrm{API}^{+} + K^{f}\,(\Theta^{asc} - \mathrm{API}^{+}). \tag{7}$$

In this step, $K^{f}$ was calculated by (4), where $\Sigma$ was the scalar error variance of $\Theta^{asc}$, fixed at 0.042 mm³ mm⁻³. Similarly to step 1, if $\mathrm{API}^{+}$ was updated by (7), $T^{-}$ was also updated by (6). If $\Theta^{asc}$ was not available at time $t$, $\mathrm{API}^{++} = \mathrm{API}^{+}$.

3. If $\Theta^{smo}$ was available at time $t$,

$$\mathrm{API}^{+++} = \mathrm{API}^{++} + K^{f}\,(\Theta^{smo} - \mathrm{API}^{++}). \tag{8}$$

Similarly to the previous steps, $K^{f}$ was calculated by (4), but now using $\Sigma$ as the scalar error variance of $\Theta^{smo}$, fixed at 0.042 mm³ mm⁻³. If $\mathrm{API}^{++}$ was updated by (8), $T^{-}$ was also updated by (6). If $\Theta^{smo}$ was not available at time $t$, $\mathrm{API}^{+++} = \mathrm{API}^{++}$.

After the above 3-step updating scheme, the analysis increments $\delta$ were defined for each time step $t$ as

$$\delta = \mathrm{API}^{+++} - \mathrm{API}^{-}. \tag{9}$$

Following Crow et al. (2011), the rainfall accumulation $[P]$ was corrected by

$$[P]^{c} = [P] + \lambda[\delta]. \tag{10}$$

The superscript $c$ denotes after correction. The square brackets represent non-overlapping accumulation windows. To ensure that SM observations corrected only past rainfall accumulations, the length of these windows was varied so that the last day of the window had a SM observation. The parameter $\lambda$ in (10) is constant in time, and was calibrated for each sub-catchment by minimising the root-mean-square error between $[P]^{c}$ and the 3B42 rainfall product. Negative values of $[P]^{c}$ were reset to zero. Following Chen et al. (2014), we applied a 2 mm threshold value for rainfall correction when $[P]$ was zero (i.e., the correction step in (10) was done only if $\lambda[\delta] > 2$). This is done because the SMART analysis tends to create spurious very low rainfall due to positive noise in the SM observations, which results in increased false alarm ratios in rainfall events.
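The forecast, sequential update and accumulation correction in (1)-(10) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used in the study: the function and argument names are hypothetical, the initial API and error variance are assumed to be zero, the loss coefficient is passed in by the caller rather than computed, and the dimensionless value of 5 is assumed to multiply the squared rainfall in the background uncertainty (argument `p_coef`). Days after the last SM observation are left uncorrected.

```python
import numpy as np


def smart_increments(precip, gamma, obs, obs_var=0.042, z=3.0, p_coef=5.0):
    """Daily analysis increments delta(t) of (9).

    precip : daily rainfall P(t) in mm
    gamma  : daily loss coefficient gamma(t), supplied by the caller
    obs    : list of arrays (e.g. AMS, ASC, SMO) of rescaled SM in API
             space, with NaN where no observation is available
    p_coef : ASSUMPTION, dimensionless factor applied to P^2 in (5)
    """
    precip = np.asarray(precip, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    api_post, var_post = 0.0, 0.0  # ASSUMPTION: zero initial conditions
    delta = np.zeros(len(precip))
    for t in range(len(precip)):
        # Forecast step, equations (1) and (5).
        api_prior = gamma[t] * api_post + precip[t]
        var = gamma[t] ** 2 * var_post + z + p_coef * precip[t] ** 2
        api = api_prior
        # Sequential update with each available sensor, (3)-(8).
        for sensor in obs:
            y = sensor[t]
            if np.isnan(y):
                continue  # no observation: state left unchanged
            gain = var / (var + obs_var)  # Kalman gain (4)
            api += gain * (y - api)
            var *= 1.0 - gain             # variance update (6)
        delta[t] = api - api_prior        # increment (9)
        api_post, var_post = api, var
    return delta


def correct_accumulations(precip, delta, has_obs, lam, threshold=2.0):
    """Apply (10) over non-overlapping windows ending on observation days.

    Returns a list of (first_day, last_day, corrected_accumulation).
    """
    precip = np.asarray(precip, dtype=float)
    windows = []
    start = 0
    for t in range(len(precip)):
        if not has_obs[t]:
            continue
        p_sum = precip[start:t + 1].sum()
        incr = lam * delta[start:t + 1].sum()
        if p_sum == 0.0 and incr <= threshold:
            p_corr = 0.0  # dry window: small increments are ignored
        else:
            p_corr = max(p_sum + incr, 0.0)  # negatives reset to zero
        windows.append((start, t, p_corr))
        start = t + 1
    return windows
```

Keeping the increments and the window correction in separate functions mirrors the text, where the Kalman filter runs daily while the rainfall is corrected only over accumulation windows.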
The consequence of this is that real rainfall values smaller than 2 mm can be discarded; however, these low-intensity rainfall values are unlikely to have a significant impact on the streamflow prediction given the high evapotranspiration of the study regions.

To get a daily corrected rainfall time series, we redistributed the corrected accumulations in proportion to the original daily rainfall. To remove positive bias due to the resetting-to-zero step, the corrected rainfall was multiplicatively rescaled to match the long-term mean of P. The corrected 3B42-RT dataset using SMART is called 3B42-RTC.

3.3 State Correction Scheme

To set up the state correction scheme (schematic in Figure 5), we followed the procedure developed by Alvarez-Garreton et al. (2015). The rainfall dataset used in this scheme was the uncorrected 3B42-RT. In summary, the scheme consisted of using satellite SM observations to correct the model SM (θ hereafter) via a stochastic data assimilation framework using an ensemble Kalman filter (EnKF) (Evensen, 2003). Prior to being assimilated, the satellite data was processed to provide consistent information about the dynamics of θ. In the following we provide a description of the satellite data processing and the EnKF implementation.

3.3.1 Satellite Soil Moisture Data Processing

Given the microwave penetration depths of the active and passive sensors, satellite SM observations represent only the top few centimetres of soil. In contrast, θ is an average profile SM representing a deeper layer. The depth that θ represents depends on soil properties and model parameters, and is unique for each sub-catchment and PDM scheme. By assuming a porosity for the study catchments ranging between 0.45 and 0.5 (A-horizon information reported in McKenzie et al. (2000)) and an S1 storage capacity ranging from 200 to 280 mm (obtained from calibrated model parameters for the different catchments), θ represents roughly a depth varying between 400 and 600 mm.
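The quoted depth range can be checked with a simple ratio of storage capacity to porosity. The relation below is a back-of-the-envelope illustration that assumes the S1 store spans the full pore space of the layer; the symbols $d$, $S_{1,\max}$ and $\phi$ are introduced here only for this purpose.

```latex
d = \frac{S_{1,\max}}{\phi}, \qquad
\frac{200\ \text{mm}}{0.5} = 400\ \text{mm}, \qquad
\frac{280\ \text{mm}}{0.45} \approx 622\ \text{mm}
```

The two extreme combinations of capacity and porosity bracket the range of roughly 400 to 600 mm stated above.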
To address the depth mismatch between satellite and model, we applied the exponential filter proposed by Wagner et al. (1999) to the satellite SM observations and obtained a soil wetness index (SWI) of the root-zone. The use of SWI to characterise the dynamics of the root zone SM based on surface observations has been successfully evaluated in a number of studies (e.g., Albergel et al., 2008; Brocca et al., 2009, 2010; Ford et al., 2013). We calculated the SWI for each satellite dataset and each sub-catchment by using the following recursive formulation:

$$\mathrm{SWI}(t) = \mathrm{SWI}(t-1) + G(t)\,\bigl(\mathrm{SSM}(t) - \mathrm{SWI}(t-1)\bigr), \tag{11}$$

where $t$ is the daily time step, SSM is the satellite observation (AMS, ASC or SMO) and $G$ is a gain term varying between 0 and 1 calculated as

$$G(t) = \frac{G(t-1)}{G(t-1) + e^{-\left(\frac{t-(t-1)}{T}\right)}}. \tag{12}$$

The parameter $T$ accounts for several physical parameters defining infiltration and percolation processes (Albergel et al., 2008). $T$ was calibrated for each satellite product and each sub-catchment. The calibration was done by maximising the correlation coefficient between SWI and the model SM (θ) over the entire period of record of the corresponding satellite product.

Once the SWI was calculated for AMS, ASC and SMO, we applied instrumental variables (IV) regression (Su et al., 2014) to remove the systematic (multiplicative and additive) biases between each SWI and θ and to estimate the observation errors. Following Alvarez-Garreton et al. (2015), we applied the triple collocation (TC) analysis (Stoffelen, 1998; Yilmaz and Crow, 2013) to rescale the SWI and estimate its observation error variance. The TC-based method acts as an optimal rescaling method and error estimator if its underlying assumptions are met (Yilmaz and Crow, 2013) and it has been increasingly applied in hydrologic data assimilation applications (Dorigo et al., 2010; Alvarez-Garreton et al., 2015; Chen et al., 2014; Crow and Yilmaz, 2014).
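The recursive exponential filter in (11)-(12) can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function name is hypothetical, the filter is assumed to start from the first observation with a gain of one, and the time difference in (12) is taken as the gap in days between consecutive observations (one day for a gap-free daily series).

```python
import numpy as np


def soil_wetness_index(times, ssm, t_char):
    """Recursive exponential filter of (11)-(12).

    times  : observation times in days (increasing)
    ssm    : surface soil moisture observations at those times
    t_char : characteristic time T in days
    """
    times = np.asarray(times, dtype=float)
    ssm = np.asarray(ssm, dtype=float)
    swi = np.empty_like(ssm)
    swi[0] = ssm[0]  # ASSUMPTION: initialise with the first observation
    gain = 1.0       # ASSUMPTION: initial gain of one
    for n in range(1, len(ssm)):
        dt = times[n] - times[n - 1]
        gain = gain / (gain + np.exp(-dt / t_char))          # (12)
        swi[n] = swi[n - 1] + gain * (ssm[n] - swi[n - 1])   # (11)
    return swi
```

A larger `t_char` lowers the gain, so the index responds more slowly to surface fluctuations, which is how the filter mimics a deeper soil layer.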
Data triplets for TC comprised the model θ and two SWI time series (derived from a passive and an active sensor, respectively). We implemented TC with an imposed sample threshold of 100 (Scipal et al., 2008). For the periods where only one satellite product was available, or when the threshold for TC triplets was not met, a two-data IV regression was used as a practical substitute. The two-data IV, also known as lagged variables (LV) (Su et al., 2014), was applied to the model θ, a single satellite SWI, and a 1-day lagged variable coming from the model θ. As a simplification of the seasonal approach proposed by Alvarez-Garreton et al. (2015), in this study we applied TC and LV to the complete period of record of AMS, ASC and SMO. This bulk approach provided a scalar observation error variance and constant rescaling factors for each of the SWI datasets. The rescaled SWI datasets for AMS, ASC and SMO were named $\theta^{ams}$, $\theta^{asc}$ and $\theta^{smo}$, respectively. Finally, ensembles of 500 members were calculated for the three rescaled datasets ($\theta^{ams}$, $\theta^{asc}$ and $\theta^{smo}$, respectively) by adding Gaussian noise with mean zero and the error variance obtained from the TC and LV analyses. As discussed in section 5, the adopted bulk estimations may have implications for the observation error characterisation and data assimilation results.

3.3.2 EnKF Formulation

In the EnKF, the errors in the model and the observations are calculated from Monte Carlo ensemble realisations. To implement the state correction scheme, every time a 500-member ensemble of observations was available ($\theta^{ams}$, $\theta^{asc}$ and/or $\theta^{smo}$, estimated via the satellite data processing described in section 3.3.1), each member of a 500-member ensemble of predictions (θ) at time t was sequentially updated as follows (t is omitted from the equations below since all the terms correspond to the same time step):

1. If $\theta^{ams}$ was available at time t,

\[ \theta_i^{+} = \theta_i^{-} + K\,\bigl(\theta_i^{ams} - H\theta_i^{-}\bigr). \tag{13} \]

The subscript i indicates a member of the ensemble of predictions and observations. The minus and plus superscripts denote $\theta_i$ values before and after updating, respectively. H is an operator that transforms the model state into the measurement space. Given the pre-processing applied to the satellite SM products (section 3.3.1), H reduced to a unit matrix (and was therefore omitted from the following equations). The Kalman gain K was calculated at each time step as

\[ K = \frac{C}{C + E}, \tag{14} \]

where E is the observation error variance of the rescaled $\theta^{ams}$ estimated from IV analyses (section 3.3.1) and C is the scalar error covariance of the background prediction $\theta^{-}$. C was calculated at each time step as

\[ C = \frac{1}{N-1}\,\bigl(\theta^{-} - \langle\theta^{-}\rangle\bigr)\cdot\bigl(\theta^{-} - \langle\theta^{-}\rangle\bigr)^{T}, \tag{15} \]

where $\langle\theta^{-}\rangle$ is the ensemble mean at time t. If $\theta^{ams}$ was not available, $\theta^{+} = \theta^{-}$, i.e., no correction was done.

2. If $\theta^{asc}$ was available at time t,

\[ \theta_i^{++} = \theta_i^{+} + K\,\bigl(\theta_i^{asc} - \theta_i^{+}\bigr). \tag{16} \]

To calculate K we applied (14), where E corresponded to the $\theta^{asc}$ observation error variance estimated from IV analyses and C was re-calculated by applying (15) to the updated soil moisture $\theta^{+}$. If $\theta^{asc}$ was not available, $\theta^{++} = \theta^{+}$.

3. If $\theta^{smo}$ was available at time t,

\[ \theta_i^{+++} = \theta_i^{++} + K\,\bigl(\theta_i^{smo} - \theta_i^{++}\bigr). \tag{17} \]

Consistently with the previous steps, K was calculated by (14), where E corresponded to the $\theta^{smo}$ observation error variance and C was calculated by applying (15) to the updated soil moisture $\theta^{++}$. If $\theta^{smo}$ was not available, $\theta^{+++} = \theta^{++}$.

During the sequence of three updating steps, each sub-catchment was treated independently and no spatial cross-correlation in the satellite measurements was considered. The order of the satellite products used in the 3-step sequential assimilation was arbitrary; however, different orders were tested with no significant variation in results.

Following Alvarez-Garreton et al.
(2015), to generate the model background ensemble prediction ($\theta^{-}$) we applied unbiased perturbations to the rainfall forcing data, the model parameter $k_1$ and the model SM prediction (θ). These perturbations aimed to represent the main sources of model error, coming from the forcing data, the model parameters and the model structure. The error models adopted for each perturbation followed a number of previous SM-DA experiments (e.g., Chen et al., 2011; Brocca et al., 2012a; Alvarez-Garreton et al., 2014) and consisted of a serially independent log-normal multiplicative error (mean 1, standard deviation $\sigma_p$) for the rainfall data, a serially independent Gaussian additive error (mean 0, standard deviation $\sigma_k$) for parameter $k_1$, and a serially independent Gaussian additive error (mean 0, standard deviation $\sigma_s$) for the model soil moisture.

To avoid truncation biases while applying the θ perturbation, we applied the bias correction scheme proposed by Ryu et al. (2009) to the SM ensemble. This bias correction ensures unbiased state ensembles; however, given the non-linear processes represented by the hydrological model, the perturbation process can still generate biased streamflow ensemble predictions. This can degrade the performance of the EnKF (Ryu et al., 2009). To remove the biases in streamflow caused by the forcing and state ensemble perturbations, we followed Alvarez-Garreton et al. (2015) and also applied the bias correction scheme to the streamflow ensembles (of the sub-catchment outlets and routing channels).

The error model parameters ($\sigma_p$, $\sigma_s$ and $\sigma_k$) were assumed to be homogeneous within each study catchment and were calibrated using a maximum a posteriori likelihood approach (MAP) (Wang et al., 2009) for the period 01 January 1998 - 31 December 2003. MAP has been used as an objective method to estimate reliable model error parameters (Alvarez-Garreton et al., 2015; Li et al., 2014).
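The three-step sequential update of equations (13) to (17) can be sketched as follows. This is an illustrative reading of the scheme for a single sub-catchment and a scalar state, not the thesis code: the function names are hypothetical, N in equation (15) is assumed to be the ensemble size, and the observation ensemble is built by adding zero-mean Gaussian noise as described in section 3.3.1.

```python
import random
from statistics import variance
from typing import List, Optional, Sequence, Tuple


def perturb_observation(value: float, err_var: float, n_members: int,
                        rng: random.Random) -> List[float]:
    """Build an observation ensemble by adding zero-mean Gaussian noise."""
    sd = err_var ** 0.5
    return [value + rng.gauss(0.0, sd) for _ in range(n_members)]


def enkf_update(theta: Sequence[float], obs: Sequence[float],
                err_var: float) -> List[float]:
    """One scalar EnKF update with H = 1, equations (13) to (15)."""
    c = variance(theta)           # ensemble variance with N - 1 denominator, eq. (15)
    if c + err_var == 0.0:
        return list(theta)        # degenerate case: no spread and no observation error
    k = c / (c + err_var)         # Kalman gain, eq. (14)
    return [th + k * (ob - th) for th, ob in zip(theta, obs)]


def sequential_update(
    theta: Sequence[float],
    products: Sequence[Tuple[Optional[Sequence[float]], float]],
) -> List[float]:
    """Apply the updates in order (e.g. AMS, ASC, SMO).

    Each entry of products is (observation ensemble or None, error variance).
    A product that is not available at this time step leaves the state unchanged.
    """
    updated = list(theta)
    for obs, err_var in products:
        if obs is not None:
            updated = enkf_update(updated, obs, err_var)  # C is recomputed at each step
    return updated
```

Because C is recomputed from the already updated ensemble at each step, every assimilated product reduces the ensemble spread seen by the next one, which is why the gain generally decreases along the sequence.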
With this approach we maximised the likelihood (aggregated over time) of having the observed streamflow within the model streamflow ensemble prediction. In the MAP scheme, the error in the observed streamflow at the outlet of each study catchment was assumed to follow a serially independent multiplicative Gaussian error (mean 1, standard deviation 0.2). Further details about model error formulations and MAP calibration can be found in Alvarez-Garreton et al. (2015).

3.4 Dual Correction Scheme

The dual correction scheme combined the forcing correction scheme (section 3.2) with the state correction scheme (section 3.3). The streamflow prediction for a given time step was obtained by running the state correction scheme and then using the average of the updated θ ensemble (and the average of the model ensemble states that were not updated) to initialise a new PDM run, this time forced with 3B42-RTC coming from the SMART forcing correction scheme. Following Crow and Ryu (2009), the state outputs of this last single run of PDM were discarded (not fed back to the state correction scheme) to avoid cross-correlation between model and observation errors. Figure 5 presents a diagram of the dual correction scheme.

Figure 5: Diagram of the reference evaluation run and the 3 correction schemes. The red single-lined boxes correspond to deterministic variables while the blue double-lined boxes correspond to stochastic variables (ensembles). The circle labelled DP indicates the data processing of the satellite SM observations detailed in section 3.3.1.
3.5 Schemes Evaluation

The reference model run used to evaluate the different correction schemes was the unperturbed model forced with the 3B42-RT dataset. The use of the near real-time satellite rainfall product to force the reference run reflects our aim to evaluate the efficacy of the correction schemes under a data scarce scenario. Given that the streamflow from the reference run, the forcing correction scheme (section 3.2) and the dual correction scheme (section 3.4) are deterministic predictions, the state correction scheme (which provides an ensemble of updated streamflow predictions) was evaluated in terms of its ensemble mean. The evaluation period was July 2002 to November 2013, which was determined by the availability of the satellite datasets (section 2).

The streamflow predictions from the reference run and the 3 correction schemes (forcing correction, state correction and dual correction schemes) were evaluated based on the Nash-Sutcliffe efficiency (NSE) (Nash and Sutcliffe, 1970), the root mean square error (RMSE) and the correlation coefficient (R). In particular, we calculated the difference (in percentage) between the statistics of the streamflow from the different correction schemes and the reference run. To evaluate the improvement from the different schemes during high and low flow periods, the three statistics were calculated in natural space (more sensitive to high flows) and log-transformed space (more sensitive to low flows). Following Massari et al. (2015), to avoid exclusion of the zero-flow periods when applying the log transformation, an arbitrary fraction of the mean daily observed streamflow ($\overline{Q}_{obs}/40$) was added to the streamflow time series (observed and simulated) before calculating their logarithm.
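The evaluation in natural and log-transformed space can be sketched as below. This is a minimal illustration assuming plain Python lists of daily flows; the function names and the `fraction` argument are hypothetical, and only the offset of one fortieth of the mean observed flow is taken from the text.

```python
import math
from typing import Dict, List, Sequence, Tuple


def _mean(x: Sequence[float]) -> float:
    return sum(x) / len(x)


def nse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 minus error variance over observed variance."""
    m = _mean(obs)
    num = sum((s - o) ** 2 for o, s in zip(obs, sim))
    den = sum((o - m) ** 2 for o in obs)
    return 1.0 - num / den


def rmse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(_mean([(s - o) ** 2 for o, s in zip(obs, sim)]))


def pearson_r(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Pearson correlation coefficient (undefined for a constant series)."""
    mo, ms = _mean(obs), _mean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)


def log_space(obs: Sequence[float], sim: Sequence[float],
              fraction: float = 1.0 / 40.0) -> Tuple[List[float], List[float]]:
    """Log-transform both series after adding a fraction of the mean observed flow."""
    offset = _mean(obs) * fraction   # keeps zero-flow days in the evaluation
    return ([math.log(o + offset) for o in obs],
            [math.log(s + offset) for s in sim])


def evaluate(obs: Sequence[float], sim: Sequence[float]) -> Dict[str, float]:
    """Return the three statistics in raw (r) and log-transformed (l) space."""
    lobs, lsim = log_space(obs, sim)
    return {
        "NSE_r": nse(obs, sim), "NSE_l": nse(lobs, lsim),
        "RMSE_r": rmse(obs, sim), "RMSE_l": rmse(lobs, lsim),
        "R_r": pearson_r(obs, sim), "R_l": pearson_r(lobs, lsim),
    }
```

Running `evaluate` once for the reference run and once for each correction scheme, and differencing the resulting dictionaries, gives the kind of relative comparison described above.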
Additionally, we estimated the false alarm ratio (FAR) as the number of times (#) the model streamflow prediction exceeded a threshold value ($Q^{*}$) while the observed streamflow was less than $Q^{*}$:

\[ FAR = \frac{\#\left(Q_{sim} \geq Q^{*} \;\&\; Q_{obs} < Q^{*}\right)}{\#\left(Q_{obs} < Q^{*}\right)}. \tag{18} \]

$Q^{*}$ was set as the daily flow rate corresponding to a minor flood classification. The flood classification for the study catchments was provided by the Australian Bureau of Meteorology as river height threshold values. These relate to flood impact rather than recurrence interval. The threshold values for minor floods, expressed as streamflow (mm day$^{-1}$), for the Warrego, Comet, Thomson and Barcoo catchments are 0.06, 0.1, 0.02 and 0.14, respectively. Similarly, we estimated the probability of detection (POD) of these flow rates as

\[ POD = \frac{\#\left(Q_{sim} \geq Q^{*} \;\&\; Q_{obs} \geq Q^{*}\right)}{\#\left(Q_{obs} \geq Q^{*}\right)}. \tag{19} \]

The rainfall correction scheme was further evaluated in terms of the corrected rainfall dataset. For this we used the gauge-interpolated AWAP rainfall as a benchmark dataset and calculated 5 statistics for the 3B42-RT dataset before and after SMART correction. The statistics used here were the mean daily bias, the coefficient of determination R$^2$, the RMSE, the FAR and the POD. In this case, FAR and POD were calculated based on daily rainfall threshold values specified in section 4.1.

4 Results

4.1 Rainfall Correction

To categorise the evaluation of SMART into meaningful ranges, we analysed the histograms of daily rainfall for the benchmark rainfall dataset (AWAP). Figure 6 shows the frequency of daily rainfall accumulations within four representative sub-catchments (each from one of the four study catchments). To calculate the histograms we retained only daily accumulations greater than 2 mm, which resulted in 532, 658, 444 and 553 daily records for the four sub-catchments, respectively (panels a to d in Figure 6). These plots reveal that for all the sub-catchments more than 50% of the daily rainfall values are within the first histogram bin, which corresponds to 2 to 7 mm (each histogram bin corresponds to a 5 mm increment). Almost 20% (with slight variation across the catchments) of the daily records range between 7 and 12 mm and the rest are distributed above 12 mm. Based on this, the evaluation statistics of the forcing correction scheme (section 3.5) were calculated within these three ranges. FAR and POD in particular were calculated with (18) and (19), using the upper and lower range bounds as threshold values, respectively.

Figure 6: Histograms of the benchmark daily AWAP rainfall accumulations larger than 2 mm over 4 representative sub-catchments. Panels: a) Warrego (SC1), b) Comet (SC3), c) Thomson (SC4), d) Barcoo (SC5); frequency against rainfall (mm day$^{-1}$).

To illustrate the tendency of the satellite rainfall estimates to over- or under-predict daily rainfall events, in Figure 7 we present the mean daily bias of the rainfall before and after SMART correction (using AWAP as the benchmark dataset) for each sub-catchment within the 4 study catchments (29 sub-catchments in total, see Figure 3). The plots in Figure 7 do not represent long-term biases, but rather the mean daily bias within specific evaluation ranges. Before SMART correction, the near real-time satellite product generally over-predicted daily accumulations of rainfall for the low-to-mean ranges (panels a and b). For the high rainfall events (panel c), the behaviour of the satellite product before correction was the opposite: there was a fairly consistent under-prediction of daily accumulations. This over-prediction of low rainfall events and under-prediction of high rainfall events is consistent with the literature (Ebert et al., 2007).
Figure 7: Mean bias in daily rainfall (mm day$^{-1}$) before (3B42-RT) and after (3B42-RTC) SMART correction within the study catchments: Warrego (sub-catchments 1 to 7), Comet (8 to 15), Thomson (16 to 23) and Barcoo (24 to 29). The mean bias was calculated for daily accumulations of the benchmark dataset varying between 2 and 7 mm (panel a), between 7 and 12 mm (panel b) and above 12 mm (panel c).

To interpret the plots in Figure 7, we should recall that SMART is formulated to reduce the random component of the error in the satellite rainfall data, and thus a reduction in long-term biases should not be expected. However, since the plots illustrate the mean biases within specific ranges, some variation in the general tendency to over- or under-predict rainfall events at some locations was identified after SMART correction. For the lowest rainfall range (panel a), the over-prediction of daily accumulations was reduced within 90% of the catchments after applying SMART (up to 15% of reduction in positive biases). For the second rainfall range (panel b), the impacts of SMART correction were less consistent. Almost half of the cases where there was over-prediction of rainfall estimates (20 in total) were improved after applying SMART. Similarly, almost half of the under-prediction cases (8 in total) were improved after SMART correction (e.g., 2 sub-catchments within Warrego, 1 sub-catchment within Comet and 1 sub-catchment within Thomson). For the high rainfall range (panel c) there was a consistent negative bias in 3B42-RT, which was not reduced after applying SMART. The latter could be due to the
limited information that SM provides when the surface soil gets saturated, and to the tendency of SMART's core formulation to underestimate peak rainfall events (further discussion in section 5).

Figure 8 presents the statistics calculated for 3B42-RT before (x-axes) and after SMART correction (y-axes), for the 3 rainfall ranges defined above and for the 29 study sub-catchments. Additionally, the R$^2$ and the RMSE were calculated for the complete range of daily rainfall accumulations.

The coefficient of determination R$^2$ between 3B42-RT and AWAP for the 3 rainfall ranges (panels a, b and c in Figure 8) and for the complete rainfall accumulation range (panel d) increased after SMART correction. The improvement was moderate, but consistent throughout most catchments (it should be noted that different scales in x- and y-axes were used for the different rainfall ranges). The low R$^2$ values in panel a are consistent with the larger errors found in the 3B42-RT product for daily accumulations below 10 mm (Pipunic et al., 2015). In terms of the RMSE, the forcing correction scheme reduced the satellite rainfall data error across all catchments and all daily rainfall ranges (panels e, f, g and h in Figure 8).

The contrasting results between reduced RMSE (panel g in Figure 8) and increased mean biases (panel c in Figure 7) for the high rainfall range are consistent with the SMART core formulation. SMART was effective at reducing the error variance of the rainfall estimates (random component of the satellite rainfall error), which is reflected by a reduction in RMSE; however, the existing biases in the original dataset were not significantly impacted by the scheme. Positive results after SMART correction were also observed in the increased POD (panels i, j and k) and decreased FAR (panels l and m) statistics, across all catchments and all daily rainfall ranges.
Figure 8: Sub-catchment-wise evaluation of SMART analyses using AWAP as the benchmark dataset. The y-axes present the corrected 3B42-RTC statistics and the x-axes the uncorrected 3B42-RT statistics. The 3 columns show the results for the indicated daily rainfall ranges. The 4 rows present the results for R$^2$, RMSE, POD and FAR statistics, respectively.

In summary, although SMART led to increased rainfall mean biases at some locations for specific ranges of rainfall events (Figure 7), overall the scheme improved the quality of the satellite rainfall data (Figure 8). In particular, the positive impacts within high rainfall events (increased R$^2$ and POD, decreased RMSE) suggest that this could be a suitable scheme to improve the prediction of high streamflow events.

4.2 Satellite Data Processing

Figure 9 summarises some of the satellite data processing results. Panel a shows the parameter T from equation 12 that maximised the correlation coefficient (presented in panel b) between the model SM and SWI for each sub-catchment and each satellite product. The parameter T varied within a range of 3 to 42 days, which is consistent with the range of values found in previous studies (e.g., Albergel et al., 2008; Brocca et al., 2009; Ford et al., 2013). These variations in T could be due to a series of factors, including the particular sub-catchment physical processes, the retrieval method of the satellite product, the quality of the SM predicted by the model, and the different periods of time used for the calibration.
Across all sub-catchments, and similarly to previous findings (Alvarez-Garreton et al., 2015), T values were larger for the SMO product, which would be inconsistent with L-band having a deeper penetration than the AMS C-band (to limit the comparison to passive retrievals). This might be due to factors including the different retrieval methods (which have quite different assumptions pertaining to spatial heterogeneity) and the influence of radio-frequency interference noise.

Figure 9: Satellite data processing results for the Warrego (sub-catchments 1 to 7), Comet (8 to 15), Thomson (16 to 23) and Barcoo (24 to 29). Panel a shows the calibrated parameter T used in the SWI estimates. Panel b presents the correlation coefficient between AMS, ASC and SMO-derived SWI and the model soil moisture. Panel c presents the observation error variance in the observation space.

The observation error variances for SWI derived from AMS, ASC and SMO, respectively, are presented in panel c of Figure 9. SMO-derived SWI generally outperformed the other two sensors, which is consistent with its higher correlation with the model (panel b). The passive AMS product showed the largest error across the study sub-catchments. It should be noted that the errors presented in Figure 9 come from TC analyses; the results from the LV procedure (applied when there was only one satellite product available or when the sample threshold for TC was not met) maintained a similar comparative relationship among sensors, although the magnitude of the error was consistently higher. This overestimation of the observation errors by LV is consistent with previous studies (Su et al., 2014; Alvarez-Garreton et al., 2015) and is likely to be explained by error autocorrelation of the lagged variables used in the triplets.
This will have the impact of giving greater weights to the model predictions in the assimilation.

4.3 Streamflow Prediction Evaluation

The statistics of the reference runs used to evaluate the different assimilation schemes are presented in Table 2. It can be seen from this table that the quality of the streamflow prediction in most of these catchments was poor, with low NSE and R values (calculated using both the raw and the log-transformed streamflow values). The only catchment that showed good quality statistics is the Comet. The poor performance of the model in the study catchments was mainly due to the low quality of the forcing rainfall data (the near real-time 3B42-RT product). This was confirmed by the better statistics obtained after calibrating the models with the (higher quality) gauge-interpolated AWAP dataset (presented as reference in Table 3). The relevance of using these reference runs is that they represent the data scarce scenario within most areas in the world.

Table 2: Streamflow prediction statistics for the reference runs calculated using the raw streamflow values (r) and for the log-transformed values (l). RMSE is in mm.

Catchment   NSE (r)   NSE (l)   R (r)   R (l)   RMSE (r)   RMSE (l)   FAR (r)   POD (r)
Warrego      0.30      0.40     0.58    0.74     0.34       1.11       0.07      0.85
Comet        0.70     -0.30     0.87    0.56     0.78       1.49       0.22      0.81
Thomson      0.28      0.10     0.55    0.75     0.21       1.51       0.23      0.95
Barcoo       0.24     -0.03     0.50    0.62     0.53       1.21       0.10      0.73

Table 3: Streamflow prediction statistics from the models forced with gauge-based rainfall data. (r) refers to the raw streamflow values and (l) to the log-transformed values. RMSE is in mm.

Catchment   NSE (r)   NSE (l)   R (r)   R (l)   RMSE (r)   RMSE (l)   FAR (r)   POD (r)
Warrego      0.85      0.69     0.92    0.87     0.15       0.80       0.05      0.92
Comet        0.74      0.21     0.89    0.71     0.73       1.16       0.15      0.81
Thomson      0.54      0.56     0.77    0.86     0.17       1.06       0.17      0.96
Barcoo       0.43      0.25     0.66    0.73     0.46       1.04       0.06      0.79

The calibrated model error parameters ($\sigma_p$, $\sigma_s$ and $\sigma_k$) for the study catchments are presented in Table 4.
Table 4: Model error parameters calibrated with MAP.

Catchment   σp     σs     σk
Warrego     0.98   0.03   0.10
Comet       0.70   0.03   0.05
Thomson     0.85   0.02   0.08
Barcoo      0.89   0.02   0.03

It can be seen from this table that the Comet catchment presents the lowest error in rainfall, which is consistent with its better performance in streamflow prediction.

Figure 10: Data assimilation results of the forcing correction scheme (fDA), the state correction scheme (sDA) and the dual correction scheme (dDA) for the Warrego (Wa), Comet (Co), Thomson (Th) and Barcoo (Ba) catchments, shown as changes in NSE (panels a, b), R (panels c, d), RMSE (panels e, f), FAR (panel g) and POD (panel h). The statistics in the left column (panels a, c, e, g and h) were calculated using the raw streamflow values. The statistics in the right column (panels b, d and f) used log-transformed streamflow values.

The results of the different assimilation schemes are presented in Figure 10. Overall, the use of (processed) satellite SM to correct the model SM and/or the forcing rainfall led to an improvement over the reference model runs. In terms of NSE and RMSE, there was a wide range of improvement for both the high flows (panels a, e) and the low flows (panels b, f). In 3 out of 4 high flow cases and all the low flow cases, the state correction scheme outperformed the forcing correction scheme. The combined dual scheme further improved the results, irrespective of whether the high flows or low flows were emphasised in the evaluation. The only case where this relation changed was for the high flows in the Comet catchment (panel a), where the forcing scheme consistently showed a greater positive impact on streamflow prediction. The counterintuitive behaviour of the dual scheme in
the Comet catchment could be due to a combination of factors, including the better initial performance of the model, the higher quality of the rainfall data, the catchment's runoff mechanisms and the quality of the satellite SM within the catchment.

Based on the improvements in R, the dual correction scheme in general outperformed the other two schemes. This is true for 2 out of 4 high flow cases (panel c) and for 3 out of 4 low flow cases (panel d).

The dual correction scheme also led to a consistent decrease of the FAR (panel g in Figure 10) within all the study catchments. By applying this dual correction scheme, the number of incorrectly predicted minor floods was reduced by 10 to 30%. If these predictions were used to feed operational flood alert systems, this improvement in FAR would have a significant impact. In terms of POD (panel h), the data assimilation schemes had a much lower (negative and positive) impact on the streamflow prediction (less than 10% of POD variation). The POD was only improved in the Comet catchment, where the dual scheme showed the highest effect. For the other catchments there was a decrease of POD after the state and dual correction schemes.

5 Discussion

The results presented here demonstrate that active and passive satellite SM retrievals have the potential to improve an operational satellite rainfall product (3B42-RT). We also showed that assimilating the satellite SM observations into the streamflow modelling generally had positive impacts on the quality of the streamflow prediction. Finally, by combining the forcing and state correction schemes we further improved the streamflow predictions for most study catchments. These outcomes are consistent with previous studies (Crow and Ryu, 2009; Chen et al., 2014; Massari et al., 2014).
Overall, the assimilation of SM retrievals via SMART improved the rainfall estimates over the study catchments, with a decrease in RMSE and FAR, and an increase in R2 and POD within most sub-catchments (Figure 8). Similarly to previous studies (Crow et al., 2011; Chen et al., 2014), SMART showed limitations during wet conditions. This resulted in the under-prediction of some high intensity rainfall events (increased negative biases in panel c, Figure 7). As mentioned in section 4.1, this could be due to the limited information about rainfall that SM provides when the surface soil is wet. This key issue affects not only SMART, but also other correction schemes aiming to estimate (or reduce the error in) rainfall based on SM information (e.g., Brocca et al., 2013; Zhan et al., 2015). Another reason for the under-prediction of rainfall peaks could be the precipitation error variance minimisation approach used by SMART (the Kalman filter). It has been shown that an error variance minimisation algorithm increases the conditional bias of the rainfall estimates, which is manifested by an underestimation of strong rainfall (Ciach et al., 2000). When the SM was near the lower limit of the volumetric water content (dry conditions), SMART consistently reduced the over-prediction of small rainfall events (decreased positive bias in panel a, Figure 7). This suggests that, in contrast to previous studies (e.g., Crow et al., 2011; Chen et al., 2014), the noise in the SM retrieval signal was not misinterpreted by SMART as rainfall. The better performance of SMART during low intensity rainfall events could be explained by the different climatology of our study region and by the different quality of the satellite products (rainfall and SM products). The impact of SMART correction in the streamflow modelling was assessed in section 4.3. The overall improvement of the forcing data via SMART was successfully transferred into the streamflow modelling. 
Similar to previous studies (Crow and Ryu, 2009; Chen et al., 2014), during high flow periods SMART led to a consistent positive impact on the streamflow modelling across all catchments (increased NSE and R, and reduced RMSE in Figure 10). Therefore, even when some peak rainfall values were under-predicted at some locations (negative mean biases in Figure 7), the corrected rainfall featured consistently lower errors than the near real-time satellite rainfall (Figure 8). This resulted in an improved overall performance of the rainfall-runoff model after SMART rainfall correction. The low flow estimations were also improved after SMART; however, the improvement was less significant than during high flows. This was expected given the higher control that rainfall exerts on the streamflow generation during intense events.

The correction of the model SM state by the assimilation of satellite SM led to a significant improvement in the prediction of low flows (panels b, d and f in Figure 10). The improvement during high flows was smaller in most cases (panels a, c and e in Figure 10), which is consistent with the higher control that the catchment wetness condition has on the streamflow generation during low flow periods. Our results agree with various studies demonstrating the potential of these observations for enhancing streamflow modelling (e.g., Brocca et al., 2010; Alvarez-Garreton et al., 2014; Wanders et al., 2014; Alvarez-Garreton et al., 2015; Massari et al., 2015). Notwithstanding this evidence, there are key choices to be made in setting up the SM data assimilation schemes that can have significant impacts on their results. As clearly described by Massari et al. (2015), these schemes are highly influenced by local conditions and methodological issues. The latter should be carefully taken into consideration before drawing general conclusions.
Regarding our methodology, the first key step to set up the state correction scheme was the satellite data processing (section 3.3.1). The use of the exponential filter to estimate the SWI of the root zone based on surface observations was a simple solution that has shown positive results in several studies (e.g., Albergel et al., 2008; Brocca et al., 2009; Ford et al., 2013). However, there are some issues related to the autocorrelation in the observation errors and the potential cross-correlation with the model SM errors that have been highlighted when SWI is used within a data assimilation scheme (Brocca et al., 2010; Alvarez-Garreton et al., 2015). There is an important research gap here since the implications of the latter issues have not yet been assessed, and the use of other profile SM estimation methods (e.g., Richards, 1931; Manfreda et al., 2014) have not been tested within this data assimilation context. The rescaling of the observations and the quantification of their errors was performed here by a triple collocation based approach (TC and LV detailed in section 3.3.1), which has been assessed as an optimal rescaling procedure if assumptions are met (Yilmaz and Crow, 2013). In particular, we applied TC and LV to the complete observation period (bulk estimation of rescaling parameters and observation errors), which does not consider the temporal variability (e.g., seasonality) in the observation errors (Draper and Reichle, 2015; Su and Ryu, 2015). This simplification could potentially lead to overcorrection of the model state if the actual error is higher, and vice versa. To address this, some studies have applied seasonal rescaling and error estimation (Alvarez-Garreton et al., 2015) or have separately treated anomalies and seasonality within the TC implementation (Chen et al., 2014). 
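The effect of a bulk error estimate on the update weight can be illustrated with the scalar gain of equation (14). The numbers below are purely hypothetical and are not taken from the results: a background variance C, a bulk error estimate $E_{bulk}$, and an actual error variance $E_{true}$ that is three times larger during a noisier period (all in the same squared units).

```latex
\[
C = 1\times10^{-3}, \qquad E_{bulk} = 1\times10^{-3}, \qquad E_{true} = 3\times10^{-3},
\]
\[
K_{bulk} = \frac{C}{C + E_{bulk}} = \frac{1}{1 + 1} = 0.5, \qquad
K_{true} = \frac{C}{C + E_{true}} = \frac{1}{1 + 3} = 0.25 .
\]
```

In this illustrative case the bulk estimate would weight the observation twice as heavily as the actual error warrants, which is the over-correction mechanism described above; the opposite holds when the actual error is lower than the bulk estimate.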
Despite these attempts to address the temporal variation in the observation errors, further investigation is required to assess the impacts of rescaling assumptions and simplifications in satellite SM data assimilation. A practical implication of the highlighted limitations within the satellite data processing is that lower errors estimated for a particular dataset (e.g. SMO-derived SWI in Figure 9) do not necessarily imply a better performance of the product in the data assimilation schemes. Therefore, a comparative assessment of the skill of the different satellite SM products to improve streamflow prediction in the proposed schemes cannot be drawn from these results (such a comparison would require running the assimilation schemes independently for the different satellite products). Acknowledging this limitation, the benefit of sequentially assimilating the three satellite products is that, since we are using a statistically optimal updater, integrating multiple observations should provide better results. Additionally, given the different periods of record of the SM products, using the three products enables a longer evaluation period for the assimilation schemes.

The second key step in the state correction scheme was the representation and quantification of the model errors, which has a direct impact on the data assimilation results (Massari et al., 2015). There are a number of methods to quantify model errors, such as the assumption of arbitrary error parameter values (Chen et al., 2014), the maximisation of ensemble verification criteria (De Lannoy et al., 2006; Brocca et al., 2010; Massari et al., 2015), the auto-tuned land data assimilation system proposed by Crow and Yilmaz (2014) and the maximisation of the likelihood of having the streamflow observations within the streamflow ensemble prediction (Alvarez-Garreton et al., 2015; Li et al., 2014) (adopted in this study).
The evaluation of these techniques and their impacts on the assimilation of SM into rainfall-runoff models has not been studied deeply. In particular, the reliability and quality of the generated open-loop ensembles (in Monte Carlo-based applications) used to evaluate the data assimilation results are usually not assessed. In our case, since we based the evaluation on deterministic predictions (as explained in section 3.5), the skill of the stochastic state correction scheme in terms of ensemble prediction characteristics was not assessed. Despite the highlighted limitations and challenges within the forcing and state correction schemes, our experiments demonstrated that the streamflow prediction for these sparsely gauged locations is improved by the assimilation of satellite soil moisture. The state correction scheme generally showed higher positive impact on the streamflow prediction than the forcing correction scheme, for both high flows and low flows. This larger improvement during high flows contrasts with previous studies (Crow and Ryu, 2009; Chen et al., 2014) where the forcing correction scheme generally outperformed the state correction for high flows. This could be due to several factors including differences in rainfall-runoff mechanisms between catchments, the quality of the forcing data before correction, the quality of the satellite SM products and the different experimental methodologies. Finally, we showed that the combination of a better representation of the catchment wetness condition (via state correction scheme) with higher quality forcing data (via forcing correction scheme) in most cases outperformed the results of separately applying either data assimilation scheme (Figure 10). 
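The reliability of an open-loop ensemble, which the discussion above notes is usually not assessed, is commonly checked with a rank histogram. The following is a minimal sketch of that diagnostic, assuming a simple tie rule (members strictly below the observation are counted); the function name and the toy numbers are illustrative and are not taken from this study.

```python
def rank_histogram(ensembles, observations):
    """Count, for each time step, how many ensemble members fall below the observation.

    ensembles: list of per-time-step lists of ensemble members (equal sizes)
    observations: list of observed values, one per time step
    Returns a list of n_members + 1 counts. A roughly flat histogram suggests a
    reliable ensemble; a U shape suggests too little spread.
    """
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        rank = sum(1 for m in members if m < obs)  # ties are ignored in this sketch
        counts[rank] += 1
    return counts

# Toy example: both observations fall outside a 3-member ensemble.
print(rank_histogram([[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]], [0.5, 3.5]))  # → [1, 0, 0, 1]
```

In catchments with long zero-flow periods, many members can tie with the observation, so a practical implementation would need an explicit tie-breaking rule (for example, random assignment among tied ranks).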
Finally, it should be noted that while the proposed schemes were able to improve the quality of an operational satellite rainfall product and the PDM SM state, which in turn led to better streamflow predictions, the streamflow predictions after the dual SM-DA scheme still did not outperform the case where the gauge-based rainfall data was used as input forcing (evaluation scores presented in Table 3). This implies that while satellite SM may be useful for improving satellite rainfall products and SM states of hydrological models within data scarce regions, it is more critical to have higher quality forcing data for accurate streamflow prediction in the study regions.

6 Conclusions

We explored the use of active and passive satellite SM products for improving the streamflow prediction of a rainfall-runoff model (PDM) within 4 large semi-arid catchments. We set up our experiments under a scenario without rain gauges to represent the data scarcity in most areas worldwide, which led to poor streamflow predictions before assimilation. Within this context, two key variables controlling the runoff generation were corrected by the assimilation of the surface SM observations: the satellite rainfall forcing data and the PDM soil moisture state. The forcing correction used SMART (Crow et al., 2011) and the results showed a consistent improvement in the operational satellite rainfall (increased R and reduced RMSE and mean bias). In general, the use of the corrected rainfall data to force the rainfall-runoff models improved the streamflow prediction (increased NSE, R2 and decreased RMSE), especially during high flow periods. The state correction scheme generally showed a higher positive impact on the streamflow prediction compared with the forcing correction scheme, especially for low flows.
The combined dual correction scheme enhanced the benefits of the individual schemes, which led to an improved prediction of both low and high flows. We have highlighted a number of limitations within the forcing and state correction schemes that should be addressed to advance towards a robust data assimilation framework. Although our results are case specific and depend on the catchment characteristics, degree of instrumentation and the experimental set up, they provide new evidence of the value of satellite SM for improving both an operational satellite rainfall product and the streamflow prediction within data scarce regions.

Acknowledgements

This research was conducted with financial support from the Australian Research Council (ARC Linkage Project No. LP110200520) and the Bureau of Meteorology, Australia. C. Alvarez-Garreton was supported by a Becas Chile scholarship. We are grateful to all who contributed to the data sets used in this study. We thank Chris Leahy and Soori Sooriyakumaran from the Australian Bureau of Meteorology for providing catchment and AWAP rainfall data, and gratefully acknowledge their advice. AMSR-E data were produced by Richard de Jeu and colleagues at Vrije University Amsterdam and NASA. ASCAT level 3 data were produced by the Vienna University of Technology within the framework of EUMETSAT’s Satellite Application Facility on Support of Operational Hydrology and Water Management from MetOp-A observations. The SMOS version RE02 data were provided by Centre Aval de Traitement des Donnees. The TMPA data were provided by NASA Goddard Earth Sciences Data and Information Services Center (GES DISC). We also thank the anonymous reviewers and associate editor for their comments, which have improved the quality of this paper.

Chapter 7 Discussion and Conclusions

This research aimed to improve flood forecasting in catchments with low on-ground data availability by using space observations.
Active and passive satellite soil moisture (SM) products were assimilated into a rainfall-runoff model to improve hydrologic model streamflow predictions. Remotely sensed SM was used to correct two key variables controlling the streamflow generation: the catchment wetness condition (via a state correction scheme) and the rainfall forcing data (via a forcing correction scheme).

The core part of the research focused on the state correction scheme (Chapters 3, 4 and 5). To set up this scheme, I used a simple rainfall runoff model (the probability distributed model, PDM) and corrected the soil water state of the model by assimilating active and passive satellite SM observations. Each of the required steps to set up an effective soil moisture data assimilation (SM-DA) scheme was rigorously addressed. These steps were categorised into two main topics: the satellite SM data processing and the model error representation. Several aspects within each topic were explored and different techniques to address them were applied. Some of the techniques were adopted from previous studies and some were introduced in this thesis (the innovative aspects of the research are summarised in Section 5).

After setting up a SM-DA state correction scheme that was effective at improving streamflow predictions, in Chapter 6 the scheme was coupled with a forcing correction scheme. In the forcing correction scheme, the near real-time satellite rainfall product used to force PDM was corrected by assimilating satellite SM via the soil moisture analysis rainfall tool (SMART) proposed by Crow et al. (2009). This dual correction scheme was evaluated within 4 large sparsely gauged catchments.

The following sections provide a description of the key challenges and limitations found throughout the research, a summary of the main findings, the conclusions of the thesis, including my recommendations for future work, and finally a list of the key contributions of this thesis.
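As a point of reference for the sections that follow, the state update at the core of an ensemble Kalman filter (EnKF) state correction can be illustrated for a single soil moisture state. This is a minimal sketch under stated assumptions: a scalar state, an observation already rescaled to model space (identity observation operator), independent Gaussian observation error, and perturbed observations. The function name and all numbers are hypothetical; it does not reproduce the PDM set-up or the error parameters used in this thesis.

```python
import random

def enkf_update(forecast, obs, obs_var, rng):
    """Scalar EnKF analysis step with perturbed observations.

    forecast: list of forecast (background) ensemble members of the SM state
    obs: rescaled satellite observation in model space
    obs_var: observation error variance
    rng: random.Random instance used to perturb the observation
    """
    n = len(forecast)
    mean_f = sum(forecast) / n
    var_f = sum((x - mean_f) ** 2 for x in forecast) / (n - 1)  # ensemble spread
    gain = var_f / (var_f + obs_var)                            # Kalman gain in [0, 1]
    obs_std = obs_var ** 0.5
    # Each member is nudged towards its own perturbed copy of the observation.
    return [x + gain * (obs + rng.gauss(0.0, obs_std) - x) for x in forecast]

# Illustrative values only: a dry-biased forecast ensemble and a wetter observation.
rng = random.Random(0)
background = [0.20 + 0.002 * i for i in range(50)]
analysis = enkf_update(background, obs=0.40, obs_var=0.001, rng=rng)
print(sum(background) / len(background), sum(analysis) / len(analysis))
```

The gain shows why the error estimates discussed below matter: an observation error variance that is too small pulls the state too strongly towards the satellite data, and one that is too large leaves the forecast almost unchanged.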
1 Challenges in satellite SM data processing for DA

The satellite soil moisture products were processed to resolve three key issues within the state correction scheme: 1) the depth mismatch between the soil moisture represented by the satellite retrievals (i.e., the top few centimetres of soil) and by the model (usually a deeper layer), 2) the presence of systematic biases between the observations and the model predictions, and 3) the need to quantify the statistical properties of random observation error (and model error, as explained in Section 2).

The depth mismatch between model and observations arises from the conceptual rainfall-runoff model selected for this research. The selected PDM uses only one tank to represent the soil water storage. One of the main challenges of using such a parsimonious model is that the satellite data must be processed in such a way that the model can ingest information compatible with the conceptual storage (deep layer of soil). This would not be necessary if a land surface model that explicitly represents the top layer of soil water storage were used instead, such as VIC/Noah (Kumar et al., 2014; DeChant and Moradkhani, 2015). However, there are some limitations in applying a distributed land surface model in data-scarce regions such as the study catchments. For example, the calibration of such a model given the poor hydro-meteorological data (quality and quantity) can lead to unrealistic parameters and over-parametrisation issues.

The depth mismatch was addressed by applying the exponential filter proposed by Wagner et al. (1999) to the surface satellite observations to estimate a soil wetness index (SWI) of the root zone soil moisture. This was a simple solution that has shown positive results in several studies (e.g., Albergel et al., 2008; Brocca et al., 2009; Ford et al., 2013). However, there are some issues that have been highlighted when SWI is used within a data assimilation scheme.
For example, SWI can introduce autocorrelation in the observation errors that can lead to cross-correlation with the background soil moisture errors (Brocca et al., 2010; Alvarez-Garreton et al., 2015) (further discussed below). The implications of this in SM-DA have not been assessed yet. Moreover, there are more physically-based profile soil moisture estimation methods which have not been tested within a data assimilation context (e.g., Richards, 1931; Manfreda et al., 2014).

There is a plethora of techniques to remove the systematic (additive and multiplicative) biases between the model soil moisture and the SWI derived from the observations. Here, I applied a number of them throughout the research, including: linear rescaling (LR, Chapters 3 and 4), anomaly-based cumulative distribution function (aCDF) matching (Chapter 4), three-data set instrumental variable regression (triple collocation, Chapters 5 and 6) and a two-data set instrumental variable regression (lagged variables, Chapters 5 and 6). In Chapter 4, I showed that the assimilation of aCDF-rescaled observations performed consistently better than assimilating LR-rescaled observations. However, this was a single real-data case study, so generalised conclusions were not drawn. A few other studies have evaluated different rescaling techniques in SM-DA with varied results. For example, Massari et al. (2015) compared LR, CDF and variance matching techniques and found little impact on SM-DA results when assumptions of model error were correct. Yilmaz and Crow (2013) demonstrated that triple collocation (TC) was the optimal rescaling technique, if the procedure requirements were met. These requirements include having sufficient linearly related independent triplets and zero error autocorrelation in the observations (non-zero error auto-correlation is allowed, but it increases the sampling error of TC estimates).
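The covariance-based form of triple collocation referred to above can be sketched on synthetic data. The example assumes the standard TC error model (each data set is linearly related to a common signal, with mutually independent, zero-mean errors); the function names, gains and noise levels are illustrative assumptions, and the code is not the procedure implemented in Chapters 5 and 6.

```python
import random

def covariance(a, b):
    """Sample covariance of two equally long series."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (len(a) - 1)

def triple_collocation(x, y, z):
    """Covariance-based TC with x as the reference (e.g. the model SM).

    Returns the error variance of each data set (in its own units) and the
    factors that rescale the anomalies of y and z to the reference x.
    """
    c_xx, c_yy, c_zz = covariance(x, x), covariance(y, y), covariance(z, z)
    c_xy, c_xz, c_yz = covariance(x, y), covariance(x, z), covariance(y, z)
    err_var = {
        "x": c_xx - c_xy * c_xz / c_yz,
        "y": c_yy - c_xy * c_yz / c_xz,
        "z": c_zz - c_xz * c_yz / c_xy,
    }
    scale = {"y": c_xz / c_yz, "z": c_xy / c_yz}
    return err_var, scale

# Synthetic triplet: a common signal seen with different gains and noise levels.
rng = random.Random(42)
truth = [rng.gauss(0.0, 1.0) for _ in range(50000)]
x = [t + rng.gauss(0.0, 0.3) for t in truth]        # reference, error variance 0.09
y = [0.5 * t + rng.gauss(0.0, 0.2) for t in truth]  # true rescaling factor 2.0
z = [2.0 * t + rng.gauss(0.0, 0.4) for t in truth]  # true rescaling factor 0.5
err_var, scale = triple_collocation(x, y, z)
print(err_var, scale)
```

With short records, or when the independence assumptions fail, the covariance ratios become noisy or biased, which is why the requirements listed above matter in practice.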
Despite the increasing use and evaluation of the above techniques, a deep investigation leading to strong conclusions about the impacts of the different rescaling techniques in the updated streamflow predictions has not been undertaken yet. Such a complex investigation should consider that SM-DA results are highly influenced by a large number of experimental considerations (i.e., model structure, quality of the model parameters, quality of the forcing data, quality of the satellite soil moisture data, satellite data processing, model error characterisation, etc.) and specific catchment characteristics (i.e., climate, geology, topography, runoff mechanisms, location, etc.). In this sense, the experiments carried out in this research contribute to address this gap by implementing different rescaling techniques. Moreover, I worked within semi-arid large catchments, which feature very distinct runoff mechanisms from most catchments studied in SM-DA. Another unsolved issue is that the systematic differences between model and observations may have a temporal component (e.g., seasonality) (Draper and Reichle, 2015; Su and Ryu, 2015), which has been rarely taken into consideration within SM-DA applications. Addressing this, in Chapter 5 I applied seasonal rescaling and found positive outcomes in terms of the updated streamflow prediction. Nevertheless, I highlighted that further investigation was required to assess the importance and impacts (if any) of this seasonal approach compared to the commonly used bulk rescaling. The final step in the satellite data processing was the quantification of the (rescaled) observation errors. Quantifying errors requires the assumption of a certain error structure, and incorrect assumptions can degrade the performance of the data assimilation (Reichle et al., 2008; Crow and Reichle, 2008; Crow and Van den Berg, 2010). 
In most SM-DA applications, an independent Gaussian error is assumed, which disregards the potential autocorrelation in the observation errors. Error autocorrelation in the observations can lead to error correlation between the observations and the background model forecast, which violates a critical EnKF assumption (Alvarez-Garreton et al., 2015). This would be especially critical when a SWI is used to represent deep-layer soil moisture, since its formulation explicitly incorporates autocorrelation terms. To explore this, different error autocorrelation structures were tested in Chapter 3; however, results showed little impact on the updated streamflow for the case study. I explained this low sensitivity through factors such as the large errors in the model, the arbitrary procedure used to estimate the observation error variance, and the error correlation between the model and the raw satellite observations that probably exists before data processing. These findings are broadly consistent with Crow and van den Berg (2010), who found that soil moisture analysis was not further improved via the introduction of the Colored Kalman Filter (which explicitly accounts for observation error auto-correlation), despite the clear presence of error auto-correlation in the assimilated observations.

Given the results from Chapter 3, for the subsequent chapters I adopted an independent Gaussian structure for the observation error and concentrated on quantifying its variance using different procedures. In Chapter 4, I assumed orthogonality between the rescaled observations and their errors and determined an upper bound of the observation error variance. Different error variances within the bound were tested; however, little impact on the updated streamflow was found (Alvarez-Garreton et al., 2015). This new procedure was adopted later by Massari et al.
(2015), who suggested that the optimal choice of the observation error variance had a strong connection with the accurate representation of the model error. It should be noted that this error estimation procedure is arbitrary and does not consider an evaluation of the satellite observation quality (Alvarez-Garreton et al. (2015) and Massari et al. (2015) evaluated the quality of the error estimates in terms of the updated streamflow).

In the following Chapters 5 and 6, the error quantification procedure was improved by applying instrumental variable regressions. In particular, I applied triple collocation, TC (Stoffelen, 1998; Yilmaz and Crow, 2013), and lagged variables (Su et al., 2014) when TC requirements were not met. These procedures simultaneously resolved the rescaling of the observations and the quantification of the observation errors. As mentioned above, given the evidence of temporal changes in the variance of the observation errors (Draper and Reichle, 2015; Su and Ryu, 2015), in Chapter 5 I explicitly represented the seasonality in the satellite errors by applying seasonal TC and LV. Although those results demonstrated a significant seasonality in the satellite observation errors, it was not clear what the impact of representing this within a SM-DA context was, hence further investigation is needed. Assessing the impacts of observation error seasonality in SM-DA fell beyond the scope of this thesis, thus in Chapter 6 the error quantification procedure was simplified by applying bulk TC and LV. This reduced the number of assumptions and simplified the interpretation of results in the (already complex) dual correction scheme.

2 Challenges in model error representation

A consistent representation of the errors in the model is an important and major challenge in SM-DA. On the one hand, there are several sources of error in a rainfall-runoff model, including the model structure, the model parameters and the quality of the forcing data.
On the other hand, there is usually very limited data to evaluate the model predictions and quantify these errors, which makes the problem highly underdetermined. Solving this problem involves several arbitrary decisions and assumptions that may significantly affect SM-DA results. The most critical are the selection of the error sources to quantify, the assumptions about the structure of those errors, and the assumed (or estimated) quality of the observed data used to evaluate the model predictions. Furthermore, after the above assumptions are made, there are different techniques to represent those errors and to estimate their parameters (described in Chapter 2, Section 4.2), with little agreement on the most suitable procedure to achieve this.

The estimation of the error parameters has a direct impact on SM-DA, since they influence the error covariance between the model soil moisture and the predicted streamflow. Moreover, most stochastic SM-DA applications evaluate their schemes using the open-loop as the reference run (open-loop refers to the ensemble of predictions resulting from perturbing the selected error sources with the estimated error parameters). Despite the significant impacts that the error parameter estimation may have on SM-DA, the quality of these open-loop simulations is usually not a major focus of investigation in SM-DA applications.

In this research I first made some decisions about which errors to represent and how. Then, an ensemble verification approach to estimate the error parameters of the rainfall forcing data and the model SM was adopted (Chapters 3 and 4). Advancing towards a more consistent error representation, in Chapters 5 and 6 I added a (sensitive) model parameter into the error sources, which directly affected the surface runoff estimation (main component of the total streamflow generated by the study catchments).
I also introduced a maximum a posteriori approach to quantify the error parameters, which resulted in a more reliable open-loop ensemble (evaluated in Chapter 5 by using rank histograms).

Another challenge within this error characterisation procedure is that perturbing components of the model (such as the forcing data, parameters and/or states) with unbiased errors may introduce bias in the open-loop streamflow prediction. This unintended bias is due to two main reasons: the truncation of SM state ensembles and the non-linear, bounded nature of hydrological models. This can result in mass balance errors and degrade the performance of the SM-DA scheme. The truncation bias was removed by applying the bias correction scheme proposed by Ryu et al. (2009) to the SM state ensembles (Chapters 4, 5 and 6). To remove the biases caused by non-linearities in the model, I applied a similar bias correction scheme to the streamflow prediction ensembles (Chapters 5 and 6).

In summary, the representation and estimation of model errors is a necessary and key step in SM-DA. Given that this is one of the several steps required in SM-DA, most applications adopt a specific technique (based on previous studies) to estimate the model errors without comparing different techniques or without deeply evaluating the estimated error parameters. This investigation contributes towards assessing the most suitable techniques to generate reliable streamflow prediction ensembles and understanding their impacts in SM-DA.

3 Main findings

The results of this research demonstrated that the assimilation of remotely sensed SM to correct the model’s SM state consistently led to improved streamflow predictions. While these improvements were significant, an important limitation was clearly evident.
Stochastic data assimilation is formulated to reduce the random component of the errors and does not address systematic biases in the model; therefore, the efficacy of the state correction scheme was restricted by the model quality before assimilation (Chapter 4). Consequently, SM-DA mainly improved the quality of the streamflow ensemble prediction (skill, reliability and averaged statistics) but did not significantly reduce the existing biases in the peak flows prediction (Chapters 3, 4 and 5). The state correction scheme was also effective at improving the streamflow ensemble prediction within ungauged internal locations, which demonstrates the advantages of incorporating spatially distributed SM information within large and poorly instrumented catchments (Chapter 5).

Given the particular runoff mechanisms within the study catchments, the state correction scheme led to varied improvements in the streamflow prediction. The streamflow at the outlet of the study catchments features long periods of zero-flow, a negligible base flow component and sharp flow peaks after rainfall events, when the catchment has reached a threshold level of wetness. Given these characteristics, SM exerts a higher control on catchment runoff generation during minor and moderate floods; therefore, the state correction scheme showed more skill when the low flows were evaluated. SM-DA improved major floods to a lesser extent (Chapters 4 and 6). These results reveal one key limitation of this approach: it aims at improving flood predictions by correcting the SM state of a rainfall-runoff model; however, SM is probably not the main controlling factor in the runoff generation during large floods (within the study catchments used in this research).

Addressing the above limitation, I set up a forcing correction scheme that aimed at reducing the errors in the rainfall data (Chapter 6).
The rainfall data, in addition to the infiltration estimates from the model, are probably the main factors controlling the accuracy of flood predictions. I demonstrated that remotely sensed SM was effective at improving a near real-time satellite rainfall product (in particular, the medium-to-high daily rainfall accumulations), which in turn led to a consistent improvement of the streamflow prediction, especially during high flows.

When comparing both schemes individually, results showed that the skill of the state correction scheme was, for most cases, greater at improving streamflow prediction than when the corrected rain was used to force the model (without state correction). This was true for both the low flows and high flows. Finally, when the forcing and the state correction schemes were combined, flood predictions were further improved.

4 Conclusions

This thesis investigated and assessed the value of coupling satellite soil moisture products into the streamflow modelling for improving flood prediction. A real-data experimental approach was used as a platform for developing and testing a variety of innovations to improve SM-DA. With this, I provided new evidence of the advantages of exploiting this spatially distributed information within data scarce regions. The main challenges for implementing an effective SM-DA scheme were highlighted and some of the limitations found in the current practices were identified. To overcome these limitations, I propose the following strategies for future research:

• I suggest further exploration and assessment of the suitability of different rescaling techniques to remove the systematic biases between the model and the observations. To achieve a robust intercomparison, this exploration should include a range of different catchment characteristics (size, climate, controlling runoff characteristics, etc.)
compared under similar SM-DA frameworks (i.e., implementing consistent techniques for the estimation of observation and model errors).

• I suggest further exploring and assessing the importance of accounting for non-stationarity in the satellite errors within SM-DA applications.

• I recommend further assessing the impacts in SM-DA of the error autocorrelation in SWI, when the latter is used as a profile soil moisture estimator. Additionally, since an improved profile soil moisture estimation should have a positive effect on the SM-DA efficacy for improving streamflow, I recommend testing other methods to estimate the root zone soil moisture based on surface observations (e.g., Richards, 1931; Manfreda et al., 2014).

• I recommend exploring the suitability of assimilating satellite soil moisture into more complex hydrological models that explicitly account for the water storage of the top soil layer. In this way, the profile soil moisture estimation based on surface satellite observations would not be a necessary step.

• There is an important research gap in the generation of streamflow ensemble prediction, therefore I suggest exploring suitable strategies to consistently represent model errors and estimate the error parameters.

• Finally, I strongly recommend that a subsequent approach should combine the proposed SM-DA framework (which exploits spatially distributed information about the catchment wetness condition and therefore has benefits for improving streamflow prediction throughout the catchment and, in particular, at ungauged inner locations), with the assimilation of the observed streamflow at available stream gauges. The assimilation of observed streamflow has been shown to be effective at improving flood prediction at the catchment outlet; however, it does not necessarily improve (and may even degrade) predictions at ungauged sub-catchments (Mendoza et al., 2012; Li et al., 2015).
The combination of these two data assimilation schemes was recently explored by Wanders et al. (2014) with positive results for a case study, which encourages further investigation.

The assimilation of satellite soil moisture observations into a hydrologic model is not a simple task. It requires addressing several challenges and there is no agreement on the most suitable strategies. Moreover, the outcomes of a SM-DA scheme are highly influenced by the particular catchment characteristics and methodological procedures, therefore general conclusions should be drawn only with great care. Acknowledging these limitations, this thesis provides new evidence of the value of remotely sensed soil moisture for improving flood prediction within data scarce regions. I discussed the current practices to implement the state and the forcing correction schemes. I highlighted their main limitations and proposed new strategies to address them.

5 Contributions

There are several scientific and practical contributions of this research. Different existing tools to set up a SM-DA state correction scheme were evaluated; new techniques to address some of the key challenges in SM-DA were introduced; an effective dual correction scheme for improving flood prediction was implemented, which combined a state and a forcing correction scheme; and various real data experiments were presented, providing novel evidence of the efficacy of SM-DA for improving flood prediction in data-scarce regions. In particular, the new techniques introduced to overcome some of the limitations found in the state correction scheme included:

• The correction of the unintended bias introduced in the generation of streamflow ensemble predictions (Chapters 5 and 6).

• The use of a maximum a posteriori approach to estimate model error parameters (Chapters 5 and 6).

• The use of a lagged-variable approach to overcome TC requirements and estimate satellite observation error (Chapters 5 and 6).
• The explicit representation of seasonality within the satellite SM errors (Chapter 5).

The framework proposed in this thesis improved the prediction of floods during the last decade within 4 Australian catchments. This framework can be implemented within an operational flood alert system, which would provide valuable information to reduce risks associated with floods within data scarce regions.

Appendix A Publications

Journal articles refereed

1. Alvarez-Garreton C., Ryu D., Western A.W., Crow W.T., Su, C.-H., and Robertson D.E. Dual assimilation of satellite soil moisture to improve streamflow prediction in data-scarce catchments. Submitted to Water Resources Research.
2. Alvarez-Garreton C., Ryu D., Western A.W., Su, C.-H., Crow W.T., Robertson D.E. and Leahy C. Improving operational flood ensemble prediction by the assimilation of satellite soil moisture: comparison between lumped and semi-distributed schemes. Hydrol. Earth Syst. Sci., 19, 1659-1676, doi:10.5194/hess-19-1659-2015, 2015.
3. Alvarez-Garreton C., Ryu D., Western A.W., Crow W.T. and Robertson D.E. The impacts of assimilating satellite soil moisture into a rainfall-runoff model in a semiarid catchment. Journal of Hydrology, doi:10.1016/j.jhydrol.2014.07.041, 2014.

Conference papers refereed

1. Alvarez-Garreton C., Ryu D., Western A.W., Crow, W.T., and Robertson D.E.: Impact of observation error structure on satellite soil moisture assimilation into a rainfall-runoff model, in: MODSIM2013, 20th International Congress on Modelling and Simulation. Modelling and Simulation Society of Australia and New Zealand, edited by Piantadosi, J., Anderssen, R., and Boland, J., pp. 3071-3077, 2013.

Conference presentations

1. Alvarez-Garreton C., Ryu D., Western A.W., Crow, W.T., Su, C.-H., and Robertson D.E., 2014. Improving Flood Prediction By the Assimilation of Satellite Soil Moisture in Poorly Monitored Catchments. 47th American Geophysical Union (AGU).
HM13M-04, 15-19 December, San Francisco, US.
2. Ryu, D., Alvarez-Garreton C., 2014. Conjunctive Use of Satellite Precipitation and Soil Moisture for Hydrologic Predictions in Ungauged Regions. Smart Water Grid International Conference, Incheon, Republic of Korea.
3. Alvarez-Garreton C., Ryu D., Western A.W., Su, C.-H., Crow, W.T., and Robertson D.E., 2014. Improving Flood Prediction By the Assimilation of Satellite Soil Moisture in Poorly Monitored Catchments. Asia-Oceania Top University League on Engineering (AOTULE) Conference. 26-28 November, Melbourne, Australia.
4. Alvarez-Garreton C., Ryu D., Western A.W., Crow, W.T., Robertson D.E. and Leahy C., 2013. Effects of forcing uncertainties in the improvement skills of assimilating satellite soil moisture retrievals into flood forecasting models. International Geoscience and Remote Sensing Symposium (IGARSS). WE2.T03.4.

References

Ahmad, S., Kalra, A., and Stephen, H. Estimating soil moisture using remote sensing data: A machine learning approach. Advances in Water Resources, 33:69–80, 2010.
Albergel, C., Rüdiger, C., Pellarin, T., Calvet, J.-C., Fritz, N., Froissard, F., Suquia, D., Petitpa, A., Piguet, B., Martin, E., et al. From near-surface to root-zone soil moisture using an exponential filter: an assessment of the method based on in-situ observations and model simulations. Hydrology and Earth System Sciences, 12:1323–1337, 2008.
Albergel, C., Rüdiger, C., Carrer, D., Calvet, J.-C., Fritz, N., Naeimi, V., Bartalis, Z., and Hasenauer, S. An evaluation of ASCAT surface soil moisture products with in-situ observations in southwestern France. Hydrology and Earth System Sciences, 13(2):115–124, 2009.
Albergel, C., Calvet, J., De Rosnay, P., Balsamo, G., Wagner, W., Hasenauer, S., Naeimi, V., Martin, E., Bazile, E., Bouyssel, F., et al. Cross-evaluation of modelled and remotely sensed surface soil moisture with in situ data in southwestern France. Hydrology and Earth System Sciences, 14(11):2177–2191, 2010.
Albergel, C., de Rosnay, P., Gruhier, C., Muñoz-Sabater, J., Hasenauer, S., Isaksen, L., Kerr, Y., and Wagner, W. Evaluation of remotely sensed and modelled soil moisture products using global ground-based in situ observations. Remote Sensing of Environment, 118:215–226, 2012.
Alvarez-Garreton, C., Ryu, D., Western, A. W., Crow, W. T., and Robertson, D. E. Impact of observation error structure on satellite soil moisture assimilation into a rainfall-runoff model. In Piantadosi, J., Anderssen, R., and Boland, J., editors, MODSIM2013, 20th International Congress on Modelling and Simulation. Modelling and Simulation Society of Australia and New Zealand, pages 3071–3077, December 2013.
Alvarez-Garreton, C., Ryu, D., Western, A. W., Crow, W. T., and Robertson, D. E. The impacts of assimilating satellite soil moisture into a rainfall–runoff model in a semi-arid catchment. Journal of Hydrology, 519:2763–2774, 2014.
Alvarez-Garreton, C., Ryu, D., Western, A. W., Su, C.-H., Crow, W. T., Robertson, D. E., and Leahy, C. Improving operational flood ensemble prediction by the assimilation of satellite soil moisture: comparison between lumped and semi-distributed schemes. Hydrology and Earth System Sciences, 19(4):1659–1676, 2015.
Aubert, D., Loumagne, C., and Oudin, L. Sequential assimilation of soil moisture and streamflow data in a conceptual rainfall–runoff model. Journal of Hydrology, 280:145–161, 2003.
Barrett, D. J. and Renzullo, L. J. On the Efficacy of Combining Thermal and Microwave Satellite Data as Observational Constraints for Root-Zone Soil Moisture Estimation. Journal of Hydrometeorology, 10:1109–1127, 2009.
Beven, K. J. Rainfall-runoff modelling: the primer. John Wiley & Sons, 2011.
Beven, K. J. and Germann, P. Macropores and water flow in soils. Water Resources Research, 18(5):1311–1325, 1982.
Brocca, L., Melone, F., Moramarco, T., and Morbidelli, R. Antecedent wetness conditions based on ERS scatterometer data. Journal of Hydrology, 364(1):73–87, 2009.
Brocca, L., Melone, F., Moramarco, T., Wagner, W., Naeimi, V., Bartalis, Z., and Hasenauer, S. Improving runoff prediction through the assimilation of the ASCAT soil moisture product. Hydrology and Earth System Sciences, 14(10):1881–1893, 2010.
Brocca, L., Hasenauer, S., Lacava, T., Melone, F., Moramarco, T., Wagner, W., Dorigo, W., Matgen, P., Martínez-Fernández, J., Llorens, P., et al. Soil moisture estimation through ASCAT and AMSR-E sensors: an intercomparison and validation study across Europe. Remote Sensing of Environment, 115(12):3390–3408, 2011.
Brocca, L., Moramarco, T., Melone, F., Wagner, W., Hasenauer, S., and Hahn, S. Assimilation of surface- and root-zone ASCAT soil moisture products into rainfall–runoff modeling. Geoscience and Remote Sensing, IEEE Transactions on, 50(7):2542–2555, 2012a.
Brocca, L., Tullo, T., Melone, F., Moramarco, T., and Morbidelli, R. Catchment scale soil moisture spatial–temporal variability. Journal of Hydrology, 422:63–75, 2012b.
Brocca, L., Moramarco, T., Melone, F., and Wagner, W. A new method for rainfall estimation through soil moisture observations. Geophysical Research Letters, 40(5):853–858, 2013.
Brocca, L., Ciabatta, L., Massari, C., Moramarco, T., Hahn, S., Hasenauer, S., Kidd, R., Dorigo, W., Wagner, W., and Levizzani, V. Soil as a natural rain gauge: Estimating global rainfall from satellite soil moisture data. Journal of Geophysical Research: Atmospheres, 119(9):5128–5141, 2014.
Burgers, G., van Leeuwen, P. J., and Evensen, G. Analysis Scheme in the Ensemble Kalman Filter. Monthly Weather Review, 126:1719, 1998.
Campbell, J. B. and Wynne, R. H. Introduction to remote sensing. New York: Guilford Press, 5th ed., 2011.
Carver, K. R., Elachi, C., and Ulaby, F. T. Microwave remote sensing from space. Proceedings of the IEEE, 73:970–996, 1985.
Chen, F., Crow, W. T., Starks, P. J., and Moriasi, D. N. Improving hydrologic predictions of a catchment model via assimilation of surface soil moisture.
Advances in Water Resources, 34:526–536, 2011.
Chen, F., Crow, W. T., and Ryu, D. Dual forcing and state correction via soil moisture assimilation for improved rainfall runoff modelling. Journal of Hydrometeorology, accepted, 2014.
Chipperfield, A. and Fleming, P. The MATLAB genetic algorithm toolbox. In Applied Control Techniques Using MATLAB, IEE Colloquium on, pages 10/1–10/4, 1995.
Ciach, G. J., Morrissey, M. L., and Krajewski, W. F. Conditional bias in radar rainfall estimation. Journal of Applied Meteorology, 39(11):1941–1946, 2000.
Clark, M., Rupp, D., Woods, R., Zheng, X., Ibbitt, R., Slater, A., Schmidt, J., and Uddstrom, M. Hydrological data assimilation with the ensemble Kalman filter: Use of streamflow observations to update states in a distributed hydrological model. Advances in Water Resources, 31:1309–1324, 2008.
Cloke, H. and Pappenberger, F. Ensemble flood forecasting: a review. Journal of Hydrology, 375(3):613–626, 2009.
Crow, W. T. and Bolten, J. D. Estimating precipitation errors using spaceborne surface soil moisture retrievals (doi 10.1029/2007GL029450). Geophysical Research Letters, 34:L08403, 2007.
Crow, W. T. and Reichle, R. H. Comparison of adaptive filtering techniques for land surface data assimilation. Water Resources Research, 44:W08423, 2008. ISSN 00431397.
Crow, W. T. and Ryu, D. A new data assimilation approach for improving runoff prediction using remotely-sensed soil moisture retrievals. Hydrology and Earth System Sciences, 13(1):1–16, 2009.
Crow, W. T. and Van den Berg, M. J. An improved approach for estimating observation and model error parameters in soil moisture data assimilation. Water Resources Research, 46(12), 2010.
Crow, W. T. and van Loon, E. Impact of Incorrect Model Error Assumptions on the Sequential Assimilation of Remotely Sensed Surface Soil Moisture. Journal of Hydrometeorology, 7:421–432, 2006.
Crow, W. T. and Yilmaz, M. T. The auto-tuned land data assimilation system (ATLAS).
Water Resources Research, 50(1):371–385, 2014.
Crow, W. T., Bindlish, R., and Jackson, T. J. The added value of spaceborne passive microwave soil moisture retrievals for forecasting rainfall-runoff partitioning. Geophysical Research Letters, 32:L18401, 2005.
Crow, W. T., Huffman, G. J., Bindlish, R., and Jackson, T. J. Improving Satellite-Based Rainfall Accumulation Estimates Using Spaceborne Surface Soil Moisture Retrievals. Journal of Hydrometeorology, 10:199–212, 2009.
Crow, W. T., Van Den Berg, M., Huffman, G., and Pellarin, T. Correcting rainfall using satellite-based surface soil moisture retrievals: The Soil Moisture Analysis Rainfall Tool (SMART). Water Resources Research, 47(8):W08521, 2011.
De Lannoy, G. J., Houser, P. R., Pauwels, V., and Verhoest, N. E. Assessment of model uncertainty for soil moisture through ensemble verification. Journal of Geophysical Research: Atmospheres (1984–2012), 111(D10), 2006.
Dechant, C. and Moradkhani, H. Radiance data assimilation for operational snow and streamflow forecasting. Advances in Water Resources, 34(3):351–364, 2011.
DeChant, C. M. and Moradkhani, H. Examining the effectiveness and robustness of sequential data assimilation methods for quantification of uncertainty in hydrologic forecasting. Water Resources Research, 48(4), 2012.
DeChant, C. M. and Moradkhani, H. Analyzing the sensitivity of drought recovery forecasts to land surface initial conditions. Journal of Hydrology, 526:89–100, 2015.
Dorigo, W., Scipal, K., Parinussa, R., Liu, Y., Wagner, W., De Jeu, R., and Naeimi, V. Error characterisation of global active and passive microwave soil moisture datasets. Hydrology and Earth System Sciences, 14(12):2605–2616, 2010.
Draper, C. and Reichle, R. The impact of near-surface soil moisture assimilation at subseasonal, seasonal, and inter-annual time scales. Hydrology & Earth System Sciences Discussions, 12(8), 2015.
Draper, C. S., Mahfouf, J. F., and Walker, J. P.
An EKF assimilation of AMSR-E soil moisture into the ISBA land surface scheme. Journal of Geophysical Research Atmospheres, 114, 2009a.
Draper, C. S., Walker, J. P., Steinle, P. J., de Jeu, R. A., and Holmes, T. R. An evaluation of AMSR–E derived soil moisture over Australia. Remote Sensing of Environment, 113(4):703–710, 2009b.
Drusch, M., Wood, E., and Gao, H. Observation operators for the direct assimilation of TRMM Microwave Imager retrieved soil moisture. Geophysical Research Letters, 32(15), 2005.
Ebert, E. E., Janowiak, J. E., and Kidd, C. Comparison of near-real-time precipitation estimates from satellite observations and numerical models. Bulletin of the American Meteorological Society, 88(1):47–64, 2007.
Engman, E. Soil Moisture. In Remote sensing in hydrology and water management. New York; London: Springer, 2000.
Engman, E. T. Remote sensing applications to hydrology: future impact. Hydrological Sciences Journal, 41:637–647, 1996.
Engman, E. T. and Chauhan, N. Status of microwave soil moisture measurements with remote sensing. Remote Sensing of Environment, 51:189–198, 1995.
Evensen, G. Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte-Carlo methods to forecast error statistics. Journal of Geophysical Research Oceans, 99:10143–10162, 1994.
Evensen, G. The ensemble Kalman filter: Theoretical formulation and practical implementation. Ocean Dynamics, 53(4):343–367, 2003.
Ford, T., Harris, E., and Quiring, S. Estimating root zone soil moisture using near-surface observations from SMOS. Hydrology and Earth System Sciences Discussions, 10(6):8325–8364, 2013.
Francois, C., Quesney, A., and Ottlé, C. Sequential assimilation of ERS-1 SAR data into a coupled land surface-hydrological model using an extended Kalman filter. Journal of Hydrometeorology, 4(2):473–487, 2003.
Gill, M. A. Flood routing by the Muskingum method. Journal of Hydrology, 36(3):353–363, 1978.
Gruhier, C., De Rosnay, P., Hasenauer, S., Holmes, T.
R., De Jeu, R. A., Kerr, Y. H., Mougin, E., Njoku, E., Timouk, F., Wagner, W., et al. Soil moisture active and passive microwave products: intercomparison and evaluation over a Sahelian site. Hydrology and Earth System Sciences, 2010.
Hain, C. R., Crow, W. T., Anderson, M. C., and Mecikalski, J. R. An ensemble Kalman filter dual assimilation of thermal infrared and microwave satellite observations of soil moisture into the Noah land surface model. Water Resources Research, 48(11), 2012.
Han, X., Franssen, H.-J., Rosolem, R., Jin, R., Li, X., and Vereecken, H. Correction of systematic model forcing bias of CLM using assimilation of cosmic-ray neutrons and land surface temperature: a study in the Heihe catchment, China. Hydrology and Earth System Sciences, 19(1):615–629, 2015a.
Han, X., Li, X., Rigon, R., Jin, R., and Endrizzi, S. Soil moisture estimation by assimilating L-band microwave brightness temperature with geostatistics and observation localization. PLoS ONE, 10(1):e0116435, 2015b.
Hreinsson, E. Assimilation of snow covered area into a hydrologic model: a thesis submitted in partial fulfilment of the requirements for the degree of Master of Science in Geography in the University of Canterbury. University of Canterbury, 2008.
Huffman, G., Bolvin, D., Nelkin, E., Wolff, D., Adler, R., Gu, G., Hong, Y., Bowman, K., and Stocker, E. The TRMM multisatellite precipitation analysis (TMPA): Quasi-global, multiyear, combined-sensor precipitation estimates at fine scales. Journal of Hydrometeorology, 8(1):38–55, 2007.
Jackson, T. J. and Schmugge, T. J. Passive microwave remote sensing system for soil moisture: some supporting research. IEEE Transactions on Geoscience and Remote Sensing, 27:225–235, 1989.
Jia, B. H., Xie, Z. H., Tian, X. J., and Shi, C. X. A soil moisture assimilation scheme based on the ensemble Kalman filter using microwave brightness temperature. Science in China Series D: Earth Sciences, 52:1835–1848, 2009.
Jones, D. A., Wang, W., and Fawcett, R.
High-quality spatial climate data-sets for Australia. Australian Meteorological and Oceanographic Journal, 58(4):233, 2009.
Kalman, R. E. A new approach to linear filtering and prediction problems. Journal of Fluids Engineering, 82(1):35–45, 1960.
Krzysztofowicz, R. The case for probabilistic forecasting in hydrology. Journal of Hydrology, 249(1):2–9, 2001.
Kumar, S., Peters-Lidard, C., Santanello, J., Reichle, R., Draper, C., Koster, R., Nearing, G., and Jasinski, M. Evaluating the utility of satellite soil moisture retrievals over irrigated areas and the ability of land data assimilation methods to correct for unmodeled processes. Hydrology and Earth System Sciences, 19(11):4463–4478, 2015.
Kumar, S. V., Peters-Lidard, C. D., Mocko, D., Reichle, R., Liu, Y., Arsenault, K. R., Xia, Y., Ek, M., Riggs, G., Livneh, B., et al. Assimilation of remotely sensed soil moisture and snow depth retrievals for drought estimation. Journal of Hydrometeorology, 15(6):2446–2469, 2014.
Leisenring, M. and Moradkhani, H. Analyzing the uncertainty of suspended sediment load prediction using sequential data assimilation. Journal of Hydrology, 468:268–282, 2012.
Li, B., Toll, D., Zhan, X., and Cosgrove, B. Improving estimated soil moisture fields through assimilation of AMSR-E soil moisture retrievals with an ensemble Kalman filter and a mass conservation constraint. Hydrology and Earth System Sciences, 16(1):105–119, 2012.
Li, Y., Ryu, D., Western, A. W., Wang, Q., Robertson, D. E., and Crow, W. T. An integrated error parameter estimation and lag-aware data assimilation scheme for real-time flood forecasting. Journal of Hydrology, http://dx.doi.org/10.1016/j.jhydrol.2014.08.009, 2014.
Li, Y., Ryu, D., Western, A. W., and Wang, Q. Assimilation of stream discharge for flood forecasting: Updating a semidistributed model with an integrated data assimilation scheme. Water Resources Research, 2015.
Lievens, H., Tomer, S.
K., Al Bitar, A., De Lannoy, G., Drusch, M., Dumedah, G., Franssen, H.-J. H., Kerr, Y., Martens, B., Pan, M., et al. SMOS soil moisture assimilation for improved hydrologic simulation in the Murray Darling Basin, Australia. Remote Sensing of Environment, 168:146–162, 2015.
Liu, Y., Parinussa, R., Dorigo, W., De Jeu, R., Wagner, W., Van Dijk, A., McCabe, M., and Evans, J. Developing an improved soil moisture dataset by blending passive and active microwave satellite-based retrievals. Hydrology and Earth System Sciences, 15(2):425–436, 2011.
Liu, Y. Q. and Gupta, H. V. Uncertainty in hydrologic modeling: Toward an integrated data assimilation framework. Water Resources Research, 43, 2007.
Loew, A. and Schlenz, F. A dynamic approach for evaluating coarse scale satellite soil moisture products. Hydrology and Earth System Sciences, 15(1):75–90, 2011.
Lopez, P. L., Wanders, N., Schellekens, J., Renzullo, L., Sutanudjaja, E., and Bierkens, M. Improved large-scale hydrological modelling through the assimilation of streamflow and downscaled satellite soil moisture observations. Hydrology and Earth System Sciences Discussions, 2015.
Manfreda, S., Brocca, L., Moramarco, T., Melone, F., and Sheffield, J. A physically based approach for the estimation of root-zone soil moisture from surface measurements. Hydrology and Earth System Sciences, 18(3):1199–1212, 2014.
Massari, C., Brocca, L., Moramarco, T., Tramblay, Y., and Didon Lescot, J.-F. Potential of soil moisture observations in flood modelling: estimating initial conditions and correcting rainfall. Advances in Water Resources, 2014.
Massari, C., Brocca, L., Tarpanelli, A., and Moramarco, T. Data assimilation of satellite soil moisture into rainfall-runoff modelling: A complex recipe? Remote Sensing, 7(9):11403–11433, 2015.
Matgen, P., Montanari, M., Hostache, R., Pfister, L., Hoffmann, L., Plaza, D., Pauwels, V., De Lannoy, G., Keyser, R. D., and Savenije, H.
Towards the sequential assimilation of SAR-derived water stages into hydraulic models using the particle filter: proof of concept. Hydrology and Earth System Sciences, 14(9):1773–1785, 2010.
McKenzie, N. J., Jacquier, D., Ashton, L., and Cresswell, H. Estimation of soil properties using the Atlas of Australian Soils. CSIRO Land and Water, Canberra, 2000.
McMillan, H., Jackson, B., Clark, M., Kavetski, D., and Woods, R. Rainfall uncertainty in hydrological modelling: An evaluation of multiplicative error models. Journal of Hydrology, 400(1):83–94, 2011.
Meier, P., Frömelt, A., and Kinzelbach, W. Hydrological real-time modelling in the Zambezi river basin using satellite-based soil moisture and rainfall data. Hydrology and Earth System Sciences, 15:999–1008, 2011. ISSN 10275606.
Mendoza, P. A., McPhee, J., and Vargas, X. Uncertainty in flood forecasting: A distributed modeling approach in a sparse data catchment. Water Resources Research, 48(9), 2012.
Middelmann-Fernandes, M. H. Review of the Australian Flood Studies Database. Geoscience Australia Record, 34, 2009.
Moore, R. J. The PDM rainfall-runoff model. Hydrology and Earth System Sciences, 11(1):483–499, 2007.
Moradkhani, H., Hsu, K.-L., Gupta, H., and Sorooshian, S. Uncertainty assessment of hydrologic model states and parameters: Sequential data assimilation using the particle filter. Water Resources Research, 41(5), 2005a.
Moradkhani, H., Sorooshian, S., Gupta, H., and Houser, P. Dual state–parameter estimation of hydrological models using ensemble Kalman filter. Advances in Water Resources, 28(2):135–147, 2005b.
Moradkhani, H., DeChant, C. M., and Sorooshian, S. Evolution of ensemble data assimilation for uncertainty quantification using the particle filter-Markov chain Monte Carlo method. Water Resources Research, 48(12), 2012.
Naeimi, V., Scipal, K., Bartalis, Z., Hasenauer, S., and Wagner, W. An improved soil moisture retrieval algorithm for ERS and METOP scatterometer observations.
Geoscience and Remote Sensing, IEEE Transactions on, 47(7):1999–2013, 2009.
Nash, J. and Sutcliffe, J. River flow forecasting through conceptual models part I: A discussion of principles. Journal of Hydrology, 10(3):282–290, 1970.
Njoku, E. G. and Entekhabi, D. Passive microwave remote sensing of soil moisture. Journal of Hydrology, 184:101–129, 1996.
Njoku, E. G., Jackson, T. J., Lakshmi, V., Chan, T. K., and Nghiem, S. V. Soil moisture retrieval from AMSR-E. IEEE Transactions on Geoscience and Remote Sensing, 41:215–229, 2003.
Owe, M., de Jeu, R., and Holmes, T. Multisensor historical climatology of satellite-derived global land surface moisture. Journal of Geophysical Research: Earth Surface (2003–2012), 113(F1), 2008.
Parajka, J., Naeimi, V., Blöschl, G., Wagner, W., Merz, R., and Scipal, K. Assimilating scatterometer soil moisture data into conceptual hydrologic models at the regional scale. Hydrology and Earth System Sciences, 10:353–368, 2006.
Peel, M. C., Finlayson, B. L., and McMahon, T. A. Updated world map of the Köppen-Geiger climate classification. Hydrology and Earth System Sciences Discussions, 4(2):439–473, 2007.
Pellarin, T., Ali, A., Chopin, F., Jobard, I., and Bergès, J.-C. Using spaceborne surface soil moisture to constrain satellite precipitation estimates over West Africa. Geophysical Research Letters, 35(2), 2008.
Pellarin, T., Louvet, S., Gruhier, C., Quantin, G., and Legout, C. A simple and effective method for correcting soil moisture and precipitation estimates using AMSR-E measurements. Remote Sensing of Environment, 136:28–36, 2013.
Penning-Rowsell, E. C., Tunstall, S. M., Tapsell, S., and Parker, D. J. The benefits of flood warnings: real but elusive, and politically significant. Water and Environment Journal, 14(1):7–14, 2000.
Pipunic, R. C., Ryu, D., Costelloe, J. F., and Su, C.-H. An evaluation and regional error modeling methodology for near-real-time satellite rainfall data over Australia.
Journal of Geophysical Research: Atmospheres, 120(20), 2015.
Plaza, D., De Keyser, R., De Lannoy, G., Giustarini, L., Matgen, P., and Pauwels, V. The importance of parameter resampling for soil moisture data assimilation into hydrologic models using the particle filter. Hydrology and Earth System Sciences, 16(2), 2012.
Reichle, R. H. and Koster, R. D. Bias reduction in short records of satellite soil moisture. Geophysical Research Letters, 31(19), 2004.
Reichle, R. H., Walker, J. P., Koster, R. D., and Houser, P. R. Extended versus Ensemble Kalman Filtering for Land Data Assimilation. Journal of Hydrometeorology, 3:728–740, 2002. ISSN 1525-755X.
Reichle, R. H., Crow, W. T., and Keppenne, C. L. An adaptive ensemble Kalman filter for soil moisture data assimilation. Water Resources Research, 44(3), 2008.
Richards, L. A. Capillary conduction of liquids through porous mediums. Physics, 1(5):318–333, 1931.
Ridler, M.-E., Madsen, H., Stisen, S., Bircher, S., and Fensholt, R. Assimilation of SMOS-derived soil moisture in a fully integrated hydrological and soil-vegetation-atmosphere transfer model in western Denmark. Water Resources Research, 50(11):8962–8981, 2014.
Ritchie, J. C. and Rango, A. Remote sensing applications to hydrology: introduction. Hydrological Sciences Journal, 41:429–431, 1996.
Robertson, D., Shrestha, D., and Wang, Q. Post-processing rainfall forecasts from numerical weather prediction models for short-term streamflow forecasting. Hydrology and Earth System Sciences, 17(9):3587–3603, 2013.
Ryu, D., Crow, W. T., Zhan, X., and Jackson, T. J. Correcting Unintended Perturbation Biases in Hydrologic Data Assimilation. Journal of Hydrometeorology, 10:734–750, 2009.
Sahoo, A. K., De Lannoy, G. J., Reichle, R. H., and Houser, P. R. Assimilation and downscaling of satellite observed soil moisture over the Little River Experimental Watershed in Georgia, USA. Advances in Water Resources, 52:19–33, 2013.
Schmugge, T. J.
Remote Sensing of Surface Soil Moisture. Journal of Applied Meteorology, 17:1549–1557, 1978.
Schmugge, T. J. Remote Sensing of Soil Moisture: Recent Advances. IEEE Transactions on Geoscience and Remote Sensing, GE-21:336–344, 1983.
Schmugge, T. J., Kustas, W. P., Ritchie, J. C., Jackson, T. J., and Rango, A. Remote sensing in hydrology. Advances in Water Resources, 25:1367–1385, 2002.
Schultz, G. A. and Engman, E. T., editors. Remote sensing in hydrology and water management. New York; London: Springer, 2000.
Scipal, K., Holmes, T., De Jeu, R., Naeimi, V., and Wagner, W. A possible solution for the problem of estimating the error structure of global soil moisture data sets. Geophysical Research Letters, 35(24), 2008.
Sharkov, E. A. Passive microwave remote sensing of the earth: physical foundations. Springer, 2003.
Sivapalan, M., Takeuchi, K., Franks, S., Gupta, V., Karambiri, H., Lakshmi, V., Liang, X., McDonnell, J., Mendiondo, E., O'Connell, P., et al. IAHS decade on predictions in ungauged basins (PUB), 2003–2012: Shaping an exciting future for the hydrological sciences. Hydrological Sciences Journal, 48(6):857–880, 2003.
Stewart, J., Engman, E., Feddes, R., and Kerr, Y., editors. Scaling up in hydrology using remote sensing. New York: John Wiley & Sons, 1996.
Stoffelen, A. Toward the true near-surface wind speed: Error modeling and calibration using triple collocation. Journal of Geophysical Research: Oceans (1978–2012), 103(C4):7755–7766, 1998.
Su, C., Ryu, D., Crow, W. T., and Western, A. W. Beyond triple collocation: Applications to soil moisture monitoring. Journal of Geophysical Research - Atmospheres, 119(11):6416–6439, 2014.
Su, C.-H. and Ryu, D. Multi-scale analysis of bias correction of soil moisture. Hydrology and Earth System Sciences, 19(1):17–31, 2015.
Su, C.-H., Ryu, D., Young, R. I., Western, A. W., and Wagner, W.
Inter-comparison of microwave satellite soil moisture retrievals over the Murrumbidgee Basin, southeast Australia. Remote Sensing of Environment, 134:1–11, 2013.
Thielen, J., Bartholmes, J., Ramos, M.-H., and De Roo, A. The European Flood Alert System - part 1: Concept and development. Hydrology and Earth System Sciences, 13(2), 2009.
Tian, Y., Huffman, G. J., Adler, R. F., Tang, L., Sapiano, M., Maggioni, V., and Wu, H. Modeling errors in daily precipitation measurements: Additive or multiplicative? Geophysical Research Letters, 40(10):2060–2065, 2013.
van Leeuwen, P. J. Nonlinear data assimilation in geosciences: an extremely efficient particle filter. Quarterly Journal of the Royal Meteorological Society, 136(653):1991–1999, 2010.
Wagner, W., Lemoine, G., and Rott, H. A method for estimating soil moisture from ERS scatterometer and soil data. Remote Sensing of Environment, 70(2):191–207, 1999.
Wanders, N., Karssenberg, D., Bierkens, M., Parinussa, R., de Jeu, R., van Dam, J., and de Jong, S. Observation uncertainty of satellite soil moisture products determined with physically-based modeling. Remote Sensing of Environment, 127:341–356, 2012.
Wanders, N., Karssenberg, D., de Roo, A., de Jong, S., and Bierkens, M. The suitability of remotely sensed soil moisture for improving operational flood forecasting. Hydrology and Earth System Sciences, 18(6):2343–2357, 2014.
Wanders, N., Pan, M., and Wood, E. Correction of real-time satellite precipitation with multi-sensor satellite observations of land surface variables. Remote Sensing of Environment, 160:206–221, 2015.
Wang, Q., Robertson, D., and Chiew, F. A Bayesian joint probability modeling approach for seasonal forecasting of streamflows at multiple sites. Water Resources Research, 45(5):W05407, 2009.
Weerts, A. H. and El Serafy, G. Y. H. Particle filtering and ensemble Kalman filtering for state updating with hydrological conceptual rainfall-runoff models. Water Resources Research, 42:W09403, 2006.
Werner, M., Cranston, M., Harrison, T., Whitfield, D., and Schellekens, J. Recent developments in operational flood forecasting in England, Wales and Scotland. Meteorological Applications, 16(1):13–22, 2009.
Western, A. W., Grayson, R. B., and Blöschl, G. Scaling of soil moisture: A hydrologic perspective. Annual Review of Earth and Planetary Sciences, 30(1):149–180, 2002.
Wood, E. F., Sivapalan, M., and Beven, K. J. Similarity and scale in catchment storm response. Reviews of Geophysics, 28(1):1–18, 1990.
Yan, H., DeChant, C. M., and Moradkhani, H. Improving soil moisture profile prediction with the particle filter-Markov chain Monte Carlo method. Geoscience and Remote Sensing, IEEE Transactions on, 53(11):6134–6147, 2015.
Yilmaz, M. T. and Crow, W. T. The optimality of potential rescaling approaches in land data assimilation. Journal of Hydrometeorology, 14(2):650–660, 2013.
Yin, J., Zhan, X., Zheng, Y., Liu, J., Hain, C. R., and Fang, L. Impact of quality control of satellite soil moisture data on their assimilation into land surface model. Geophysical Research Letters, 41(20):7159–7166, 2014.
Yong, B., Ren, L., Hong, Y., Gourley, J. J., Tian, Y., Huffman, G. J., Chen, X., Wang, W., and Wen, Y. First evaluation of the climatological calibration algorithm in the real-time TMPA precipitation estimates over two basins at high and low latitudes. Water Resources Research, 49(5):2461–2472, 2013.
Yong, B., Liu, D., Gourley, J. J., Tian, Y., Huffman, G. J., Ren, L., and Hong, Y. Global view of real-time TRMM multisatellite precipitation analysis: Implications for its successor Global Precipitation Measurement mission. Bulletin of the American Meteorological Society, 96(2):283–296, 2015.
Zhan, W., Pan, M., Wanders, N., and Wood, E. Correction of real-time satellite precipitation with satellite soil moisture observations. Hydrology & Earth System Sciences Discussions, 12(6), 2015.
Zhou, T., Nijssen, B., Huffman, G., and Lettenmaier, D.
Evaluation of the TRMM real-time multi-satellite precipitation analysis version 7 for macro scale hydrologic prediction. Journal of Hydrometeorology, 15(4):1651–1660, 2014.
Zwieback, S., Scipal, K., Dorigo, W., and Wagner, W. Structural and statistical properties of the collocation technique for error characterization. Nonlinear Processes in Geophysics, 19(1):69–80, 2012.