Validation report for the 2010 Air Quality Assessment Report

MACC-II Deliverable D_113.3
Date: 05/2013
Lead Beneficiary: INERIS (#17)
Nature: R
Dissemination level: PU
Grant agreement n°283576
File: MACCII_EVA_DEL_D_113.3_AQ2010-Valid_May2013_INERIS.docx/.pdf
Work-package: 113 (EVA, Assessment reports production and routine validation)
Deliverable: D_113.3
Title: Validation reports for 2010
Nature: R
Dissemination: PU
Lead Beneficiary: INERIS (#17)
Date: 05/2013
Status: Final version
Authors: L. Rouïl et al. (INERIS)
Approved by: V.-H. Peuch (ECMWF)
Contact: info@gmes-atmosphere.eu
This document has been produced in the context of the MACC-II project (Monitoring Atmospheric
Composition and Climate - Interim Implementation). The research leading to these results has received
funding from the European Community's Seventh Framework Programme (FP7 THEME [SPA.2011.1.5-02])
under grant agreement n° 283576. All information in this document is provided "as is" and no guarantee or
warranty is given that the information is fit for any particular purpose. The user thereof uses the information
at its sole risk and liability. For the avoidance of all doubts, the European Commission has no liability in
respect of this document, which merely represents the authors' view.
Evaluation Report of the Air quality
assessments in Europe for 2010
Edited by Laurence ROUÏL (INERIS)
With contribution from the MACC regional modelling teams:
CERFACS: Emmanuele Emili
CNRS/LISA: Matthias Beekman, Gilles Foret
FMI: Mikhail Sofiev, Julius Vira
KNMI: Henk Eskes
INERIS: Frédérik Meleux, Anthony Ung
Meteo France: Virginie Marécal
Met.no: Alvaro Aldebenito, Anna Carlin-Benedictow
RIUKK: Hendrik Elbern, Elmar Friese, Achim Strunk
SMHI: Lennart Robertson
TNO: Arjo Segers
April 2013
Table of contents
1. Rationale
2. Methodology
2.1 Observation datasets
2.2 Performance indicators
2.3 Models list
3. Ozone simulations and re-analyses
4. Nitrogen dioxide simulations and re-analyses
5. PM10 simulations and re-analyses
6. PM2.5 simulations and re-analyses
7. Conclusions
List of figures
Figure 1. Location of the NO2 AIRBASE stations selected for data assimilation (red dots) and
validation (green dots) processes
Figure 2. Location of the O3 AIRBASE stations selected for data assimilation (red dots) and
validation (green dots) processes
Figure 3. Location of the PM10 AIRBASE stations selected for data assimilation (red dots)
and validation (green dots) processes
Figure 4. Taylor diagram representing the performance of the MACC-II/EVA models in
simulating the summer daily maximum of ozone (2010); data-assimilated results correspond to
the models noted with an “a” index
Figure 5. Taylor diagram representing the performance of the MACC-II/EVA models in
simulating the summer daily mean of ozone (2010); data-assimilated results correspond to
the models noted with an “a” index
Figure 6. Statistical scores of the “assimilated ensemble” model results against the AIRBASE
validation dataset for the ozone daily maximum over summer 2010: bias (a), correlation
coefficient (b), root mean square error (c)
Figure 7. Statistical scores of the “raw ensemble” model results against the AIRBASE
validation dataset for the ozone daily maximum over summer 2010: bias (a), correlation
coefficient (b), root mean square error (c)
Figure 8. MACC-II/EVA model responses for simulated ozone daily peaks over summer 2010,
for various station typologies: rural (top), suburban (middle), urban (bottom)
Figure 9. MACC regional model scores for predicting the daily ozone peak over the year 2010
throughout European sub-regions at rural stations: bias (a), RMSE (b), correlation
coefficient (c)
Figure 10. MACC regional model scores for predicting the daily ozone peak over the year 2010
throughout European sub-regions at suburban stations: bias (a), RMSE (b), correlation
coefficient (c)
Figure 11. MACC regional model scores for predicting the daily ozone peak over the year 2010
throughout European sub-regions at urban stations: bias (a), RMSE (b), correlation
coefficient (c)
Figure 12. Number of days (observed in 2010) when the information regulatory threshold for
ozone (180 µg/m3 hourly) was exceeded. Classification by European sub-regions
Figure 13. Number of days (observed in 2010) when the alert regulatory threshold for ozone
(240 µg/m3 hourly) was exceeded. Classification by European sub-regions
Figure 14. Capacity of the “assimilated ensemble” model (ENSa) to reproduce the number of
days when the information ozone threshold was exceeded
Figure 15. Contingency graphs for the prediction of exceedances of the information
threshold in 2010 by the MACC-II/EVA models: rural sites (top), suburban sites (middle)
and urban sites (bottom)
Figure 16. Bias calculated for the NO2 daily mean in 2010 by the data assimilation systems
LOTOS-EUROSa (left), SILAMa (right) and EURADa (bottom)
Figure 17. RMSE calculated for the NO2 daily mean in 2010 by the data assimilation systems
LOTOS-EUROSa (left), SILAMa (right) and EURADa (bottom)
Figure 18. Correlation coefficient calculated for the NO2 daily mean in 2010 by the data
assimilation systems LOTOS-EUROSa (left), SILAMa (right) and EURADa (bottom)
Figure 19. Bias of the MACC-II/EVA models in predicting the daily mean of NO2
concentrations in 2010 for various station typologies: rural (top), suburban (middle),
urban (bottom)
Figure 20. RMSE of the MACC-II/EVA models in predicting the daily mean of NO2
concentrations in 2010 for various station typologies: rural (top), suburban (middle),
urban (bottom)
Figure 21. Correlation coefficient of the MACC-II/EVA models in predicting the daily mean of
NO2 concentrations in 2010 for various station typologies: rural (top), suburban (middle),
urban (bottom)
Figure 22. Taylor diagram representing the performance of the MACC-II/EVA models in
simulating the PM10 daily mean (2010); data-assimilated results correspond to the models
noted with an “a” index
Figure 23. Bias, RMSE and correlation coefficient of the data-assimilated ensemble for the
prediction of the PM10 annual average in 2010
Figure 24. Performance indicators of the MACC-II/EVA models, sub-region by sub-region, for
the prediction of the PM10 daily mean, 2010, rural stations
Figure 25. Performance indicators of the MACC-II/EVA models, sub-region by sub-region, for
the prediction of the PM10 daily mean, 2010, suburban stations
Figure 26. Performance indicators of the MACC-II/EVA models, sub-region by sub-region, for
the prediction of the PM10 daily mean, 2010, urban stations
Figure 27. Model scores for simulating the PM10 daily mean at urban sites, 2010: bias (top),
RMSE (middle) and correlation coefficient (bottom)
Figure 28. Number of days of exceedance of the PM10 daily average threshold (50 µg/m3)
observed in 2010 and sorted by European sub-regions: EUW: Western, EUC: Central,
EUN: Northern, EUS: Southern, EUE: Eastern
Figure 29. Number of days of exceedance of the PM10 daily average threshold (50 µg/m3) in
2010 predicted by the MACC-II/EVA data-assimilated ensemble model, sorted by European
sub-regions: EUW: Western, EUC: Central, EUN: Northern, EUS: Southern, EUE: Eastern
Figure 30. Number of days of exceedance of the PM10 daily average threshold (50 µg/m3) in
2008 predicted by the “data assimilated ensemble” model
Figure 31. Contingency indicators for the prediction of exceedances of the daily PM10 limit
value at urban stations by the MACC-II/EVA models: good predictions, false alerts and
non-detections for the year 2010
Figure 32. Bias between PM2.5 observed and modelled daily means for the year 2010 for the
MACC-II/EVA models and for various station typologies: rural (top), suburban (middle),
urban (bottom)
Figure 33. RMSE between PM2.5 observed and modelled daily means for the year 2010 for the
MACC-II/EVA models and for various station typologies: rural (top), suburban (middle),
urban (bottom)
Figure 34. Correlation coefficient between PM2.5 observed and modelled daily means for the
year 2010 for the MACC-II/EVA models and for various station typologies: rural (top),
suburban (middle), urban (bottom)
Glossary
AIRBASE: European Air Quality database
(http://airclimate.eionet.europa.eu/databases/airbase/)
Analyses: Maps of air pollutant concentration fields issued from numerical model results
combined with up-to-date available observation data to improve their accuracy in the
vicinity of measurement points. In MACC-II, they are produced routinely on a daily basis.
AOT 40: Accumulated Ozone over the 40 ppb Threshold
AQD: Air Quality Directive
Assessments: Quantitative evaluation of air quality fields based on validated data and
numerical model results
CERFACS: Centre Européen de Recherche et de Formation Avancée en Calcul Scientifique
(France)
Data assimilation: Mathematical process to incorporate observations in a numerical model
of physical systems
EAN: European Aeroallergen Network
EDA: Air quality data assimilation sub-project in the MACC and MACC-II projects
EEA: European Environment Agency
ENS: Air quality forecasting and analysis sub-project in the MACC and MACC-II projects
Ensemble Model: Combination of various results from various models. This can be a simple
average (median), or a weighted average resulting from analysis of models’ behaviour over
past periods. The models building up the ensemble can correspond to different systems
(multi-model approach) or to the same modelling system fed with different input datasets.
EVA: Air quality validated assessments sub-project in the MACC and MACC-II projects
FMI: Finnish Meteorological Institute
KNMI: Royal Netherlands Meteorological Institute
LISA: Laboratoire Interuniversitaire des Systèmes Atmosphériques (France)
Météo France: French Weather Services
Met.no: Norwegian Meteorological Institute (Norway)
Raw model data: Model results directly issued from the modelling chain, without any
post-treatment process
Re-analyses: Maps of air pollutant concentration fields issued from numerical model results
combined with validated observation data to improve their accuracy in the vicinity of
measurement points
RIUKK: Rhenish Institute for Environmental Research at the University of Cologne (Germany)
RMSE: Root Mean Square Error. It gives the standard deviation of the model prediction
error. A smaller value indicates better model performance.
SMHI: Swedish Meteorological and Hydrological Institute (S)
SOMO35: Ozone concentrations accumulated dose over a threshold of 35 ppb
TNO: Netherlands Organisation for Applied Scientific Research (NL)
VOC: Volatile Organic Compound
WHO: World Health Organization
WMO: World Meteorological Organization
1. Rationale
The Copernicus/MACC-II project aims at delivering a number of services dedicated to
global atmospheric composition and air quality monitoring in Europe. Air quality issues
are covered by two services (http://www.copernicus-atmosphere.eu/services/aqac/):
- The ENS services focus on routine, near-real-time products (forecasts and
near-real-time analyses). Forecasts of ozone, nitrogen dioxide and particulate matter
(PM10 and PM2.5) concentrations throughout Europe, up to four days ahead, are
available every day. Daily analyses of the same variables, in which simulated fields
are improved with observations through data assimilation techniques, are provided
as well. They are available on the Copernicus atmosphere services website
(http://macc-raq.gmes-atmosphere.eu/som_regrid_ens3D.php).
- The EVA services provide detailed analyses of past situations using validated
material from observation networks and modelling. They deliver a posteriori
validated air quality assessments for Europe, based on re-analysed air pollutant
concentration fields. Simulations of past years are performed and improved through
the assimilation of available validated in-situ and satellite observations. They are
available on the Copernicus atmosphere services website
(http://www.gmes-atmosphere.eu/services/raq/raq_reanalysis/).
The MACC-II regional air quality assessment reports describe, on a yearly basis, the
state and the evolution of background concentrations of air pollutants in European
countries. Special care is given to pollutants that are influenced by long-range
transport and are therefore well captured by European-scale modelling systems: ozone,
nitrogen dioxide and particulate matter (PM10 and PM2.5). Specific pollution episodes
that occurred during the year are also considered.
This work results from a service that uses both observations (in-situ and satellite) and
model results to elaborate assessments. Both sources of information (modelling and
measurement) are combined by the MACC-II scientists to produce high-quality maps
of air pollutant concentrations and patterns.
Because the MACC-II/EVA assessment reports aim to address the concerns of policy and
decision makers, reliability, quality assurance and accuracy must be ensured.
The model versions implemented and their capacities must be objectively evaluated and
described in a transparent and traceable way. This is the objective of the Quality
Assurance plans elaborated within MACC-II for each re-analysis chain. This kind of
information is necessary to interpret the model results and their variability correctly,
and to assess the uncertainties of the maps and air pollution fields delivered by the service.
The European air quality assessments provided by MACC-II/EVA are built on seven
state-of-the-art chemistry transport models run operationally by decentralised
modelling teams. The list and the current configuration of the modelling systems are
given in annex. Moreover, the multi-model functionalities developed in the MACC-II
“regional cluster” make it possible to derive “ensemble” model estimations. These
combine the results of the various models to obtain an estimate with improved skill
compared with the individual models. The combination can be a simple median, or a more
sophisticated average weighted by coefficients that depend on the model, the
geographical location, the simulated period, etc. The former option (simple median)
was again used for 2010.
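As an illustration of the simple median option, the sketch below computes a cell-by-cell median across model fields. It is not the MACC-II implementation: the function name, the use of NumPy, the grid shape and the concentration values are all assumptions made for the example.

```python
import numpy as np

def ensemble_median(fields):
    """Cell-by-cell median across model fields, ignoring missing (NaN) values."""
    stacked = np.stack([np.asarray(f, dtype=float) for f in fields], axis=0)
    return np.nanmedian(stacked, axis=0)

# Hypothetical 2x2 concentration fields (µg/m3) from three models
model_a = [[40.0, 55.0], [60.0, 35.0]]
model_b = [[44.0, 50.0], [66.0, 30.0]]
model_c = [[42.0, 70.0], [63.0, 31.0]]

ens = ensemble_median([model_a, model_b, model_c])
```

With the median, a single outlying model (here the 70.0 value in model_c) does not pull the ensemble value, which is one reason a median can be preferred to a plain mean when only a few members are available.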
The present document is the MACC-II/EVA evaluation report on European air quality
assessments for the year 2010.
It provides the reader with a number of commented performance indicators which allow
the quality of the model results used in the assessment report to be evaluated. Building
confidence in the MACC-II/EVA re-analysis system is not the only purpose of this
analysis: it also aims at providing keys for a better understanding of the model results.
Notice:
Not all the expected capacities of the MACC programme were used in preparing the 2010
assessment report. Some results were therefore derived exclusively from raw simulations
of past situations, and others from numerical data combined with observations according
to various data assimilation approaches (“re-analyses”). The performances of all model
configurations were evaluated by INERIS and are presented in this report. For some
pollutants, too few members with a re-analysis process (individual data-assimilated
model results) were available for the “ensemble” concept to remain relevant. In such
cases one individual model could be significantly better than the ensemble, and it was
consequently selected to illustrate the assessment report.
Content of the report
This report addresses the capacity of state-of-the-art chemistry transport models to
predict the air quality indicators considered in the air quality re-analysis process. The
evaluation phase discussed in the present document establishes objective and
quantitative criteria to assess model skill and performance. The variability of the
model responses is also an indicator of the uncertainty of the modelling and
re-analysis approaches. Geographical areas where all models perform reasonably
well with the same trends can be considered as “well described” by the modelling
systems. Conversely, more caution is needed in sub-regions where the model results
spread out significantly.
The next section describes the evaluation methodology that has been adopted. The
following sections provide, for each pollutant, a synthesis of the statistical
performance scores calculated for the various models involved in MACC-II/EVA. They are
presented as maps, time series and histograms.
2. Methodology
2.1 Observation datasets
The evaluation work is focussed on ozone, nitrogen dioxide, PM10 and PM2.5 concentration
fields. Sufficiently representative and relevant observation data are now available for
these pollutants, allowing performance indicators to be calculated. The AIRBASE database
from the European Environment Agency (EEA)
(http://www.eea.europa.eu/data-and-maps/data/airbase-the-european-air-quality-database-6)
gathers all validated observations reported from the European regulatory air quality
monitoring networks, available until the year 2010. These data were used for both the
re-analysis and the evaluation processes.
For ozone, NO2 and PM10, the same observations cannot be used for both validation and
data assimilation (it would not make sense to evaluate re-analyses against the same
observation dataset that was assimilated in the models). The set of observations
available in AIRBASE was therefore split into two subsets, one for data assimilation (DA)
and the other for validation (about one third of the total number of stations).
Randomness and homogeneous spatial coverage with respect to the station typology
(rural, suburban and urban) were the principles considered for the station classification.
The table below presents the number of stations selected in each category, and their
locations are mapped in the following figures.
Pollutant   Rural/DA   Rural/Validation   Suburban/DA   Suburban/Validation   Urban/DA   Urban/Validation   Total
O3          281        144                262           124                   360        166                1345
NO2         209        105                251           120                   449        212                1336
PM10        162        74                 211           94                    455        203                1198
PM2.5       34         20                 49            21                    68         34                 226
Table 1: Number of stations selected in the AIRBASE database for 2010 for DA and validation
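The splitting principle described above (random selection, with about one third of the stations of each typology kept for validation) can be sketched as follows. This is only an illustration under stated assumptions: the function name, the station identifiers and the fixed seed are hypothetical, and the sketch stratifies by typology only; it does not enforce the homogeneous spatial coverage mentioned in the text.

```python
import random
from collections import defaultdict

def split_stations(stations, validation_fraction=1 / 3, seed=0):
    """Randomly split stations into DA and validation subsets, per typology.

    stations: list of (station_id, typology) tuples.
    Returns (da_ids, validation_ids).
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_type = defaultdict(list)
    for station_id, typology in stations:
        by_type[typology].append(station_id)

    da_ids, validation_ids = [], []
    for typology in sorted(by_type):
        ids = sorted(by_type[typology])
        rng.shuffle(ids)
        n_val = round(len(ids) * validation_fraction)
        validation_ids.extend(ids[:n_val])
        da_ids.extend(ids[n_val:])
    return da_ids, validation_ids

# Hypothetical network: 30 stations of each typology
stations = [(f"{t}{i:03d}", t) for t in ("rural", "suburban", "urban") for i in range(30)]
da_ids, val_ids = split_stations(stations)
```

Keeping the two subsets disjoint is what makes the validation scores independent of the assimilated observations.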
Figure 1. Location of the NO2 AIRBASE stations selected for data assimilation (red dots) and
validation (green dots) processes
Figure 2. Location of the O3 AIRBASE stations selected for data assimilation (red dots) and
validation (green dots) processes
Figure 3. Location of the PM10 AIRBASE stations selected for data assimilation (red dots)
and validation (green dots) processes
2.2 Performance indicators
The model performances are evaluated on the basis of classical statistical indicators
which objectively measure the gap between the model results (raw data or re-analyses)
and the observations at the available stations: bias, root mean square error (RMSE) and
correlation coefficient are the most common. Observed and modelled averages are
generally compared as well. The behaviour of the performance indicators depends on the
station typology and the pollutant considered: the models used in the MACC-II/EVA
systems run at the European scale and their spatial resolution is about 20 km in the
best case. Consequently, for pollutants which are largely influenced by local sources
(NO2, and PM in some situations), these regional models are not able to reproduce hot
spots monitored by traffic or industrial stations, and performance indicators are not
assessed at such stations. Difficulties can even be encountered at urban stations.
Conversely, for pollutants characterised by a long residence time in the atmosphere and
large impacted areas (typically ozone, and PM in some cases), performance indicators
evaluated at all types of stations (except traffic and industrial sites) make sense.
The definitions of the performance indicators used in this report are recalled below.
They are standard in model evaluation (see footnote 1). In all three definitions, N is
the number of observations, P_i refers to the predictions and O_i to the observations.
• Bias indicates whether, on average, the simulations under- or over-predict the
measured concentrations. Here, negative values indicate under-prediction, whereas
positive values indicate over-prediction; values close to 0 are the best. It is
expressed in µg/m3:
$$ \mathrm{Bias} = \frac{1}{N} \sum_{i=1}^{N} (P_i - O_i) $$
• Root Mean Square Error (RMSE) gives information about the skill of the model in
predicting the overall magnitude of the observations. It should be as low as
possible. It is expressed in µg/m3:
$$ \mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (P_i - O_i)^2} $$
• Correlation is a measure of whether predictions and observations change together
in the same way (i.e. at the same time and/or place). The closer the correlation is
to one, the better the correspondence of extreme values of the two data sets. It is
a non-dimensional number:
$$ r = \frac{\mathrm{cov}(P_i, O_i)}{\sqrt{\mathrm{var}(P_i) \cdot \mathrm{var}(O_i)}} $$
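The three indicators can be computed directly from paired series of predictions and observations. The sketch below follows the definitions above; the function names and the sample values are illustrative assumptions, and the handling of missing data is left out.

```python
import math

def bias(pred, obs):
    """Mean of (P_i - O_i); negative means under-prediction."""
    return sum(p - o for p, o in zip(pred, obs)) / len(obs)

def rmse(pred, obs):
    """Square root of the mean squared difference between P_i and O_i."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def correlation(pred, obs):
    """cov(P, O) / sqrt(var(P) * var(O)), using population moments."""
    n = len(obs)
    mean_p = sum(pred) / n
    mean_o = sum(obs) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs)) / n
    var_p = sum((p - mean_p) ** 2 for p in pred) / n
    var_o = sum((o - mean_o) ** 2 for o in obs) / n
    return cov / math.sqrt(var_p * var_o)

# Hypothetical daily ozone maxima (µg/m3) at one station
obs = [100.0, 120.0, 140.0, 160.0]
pred = [95.0, 110.0, 135.0, 150.0]

b = bias(pred, obs)          # -7.5: the model under-predicts on average
e = rmse(pred, obs)
r = correlation(pred, obs)
```

In this example the model follows the observed variations closely (r close to 1) while remaining systematically too low, which shows why the three indicators are read together.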
Taylor diagrams summarise, in a single quadrant, several statistical indicators for
several models: the radii correspond to the correlation coefficient values, the x-axis
and the y-axis delimit arcs with bias values, and the internal semi-circles correspond
to the RMSE values. This is a very instructive way to present an overview of the
relative performances of a set of models, and it is often used in model intercomparison
exercises.
For indicators related to threshold values, for instance the number of days or hours
when a certain concentration level is exceeded, “contingency tables” giving the
percentages of correct predictions (GP), false alarms (FA) and missed events (ME) are
estimated. These concepts come from weather and air quality forecasting.
Although they are very severe and not objectively representative of the intrinsic model
performance (because of the threshold cut-off effect, a result close to the threshold
can fall arbitrarily into one category or the other), they give useful information for
comparing the behaviour of various models in different geographical regions. GP, FA and
ME are expressed as percentages (%).
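A minimal sketch of such a contingency count is given below. The report does not state how the percentages are normalised, so the choice made here (percentages of all cases where the threshold is exceeded in the observations, in the model, or in both) is an assumption, as are the function name and the sample values.

```python
def contingency_scores(pred, obs, threshold):
    """Return GP, FA and ME as percentages of all exceedance events.

    An event is a case where the observation, the prediction, or both
    exceed the threshold (assumed normalisation, see lead-in).
    """
    hits = false_alarms = missed = 0
    for p, o in zip(pred, obs):
        if o > threshold and p > threshold:
            hits += 1            # exceedance observed and predicted
        elif p > threshold:
            false_alarms += 1    # predicted but not observed
        elif o > threshold:
            missed += 1          # observed but not predicted
    events = hits + false_alarms + missed
    if events == 0:
        return None              # no exceedance at all: scores undefined
    return {
        "GP": 100.0 * hits / events,
        "FA": 100.0 * false_alarms / events,
        "ME": 100.0 * missed / events,
    }

# Hypothetical daily ozone peaks (µg/m3) against the 180 µg/m3 threshold
obs = [170.0, 185.0, 190.0, 178.0, 200.0]
pred = [175.0, 182.0, 176.0, 181.0, 210.0]
scores = contingency_scores(pred, obs, threshold=180.0)
```

The fourth pair (181 predicted, 178 observed) is counted as a false alarm although the two values differ by only 3 µg/m3, which illustrates the threshold cut-off effect mentioned above.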
1 Chang J.C. and Hanna S.R., 2004. Air quality model performance evaluation. Meteorol.
Atmos. Phys. 87, 167–196.
Several representations of the models’ skills are proposed. Maps show coloured patches
at the location of the stations selected in AIRBASE for the evaluation process; the
colour scale indicates how the model performs. Taylor diagrams provide a wider overview
of the model performances, presenting at a glance the classical statistical scores that
characterise the model performances against observations: bias, correlation coefficient
and RMSE.
Histograms with model performances sorted by station typology and by European
sub-region (Western, Northern, Southern, Central, Eastern) are proposed as well.
2.3 Models list
The models involved in this evaluation are those run operationally by the MACC regional
air quality modelling teams. A short reminder of the characteristics of their systems is
given in annex. For easier reading, the models discussed in the next sections are
listed in the table below:
Model         Origin
CHIMERE       France (CNRS/INERIS)
EMEP          Norway (met.no)
EURAD         Germany (FRIUUK)
LOTOS-EUROS   The Netherlands (KNMI/TNO)
MATCH         Sweden (SMHI)
MOCAGE        France (Météo France)
SILAM         Finland (FMI)
3. Ozone simulations and re-analyses
Figure 4 and Figure 5 show Taylor diagrams for each individual model and for the
ensemble: raw ensemble results (“ENS”) and data-assimilated ensemble results (“ENSa”).
They relate to the ozone daily maximum and the ozone daily mean over the 2010 summer
period respectively. Station typologies are distinguished for a more comprehensive
analysis.
The benefits of the data assimilation process are significant for the correlation
coefficient (+0.05 to +0.07) and the Root Mean Square Error (-5 µg/m3). The standard
deviation of the assimilated model results improved significantly too, the observed
reference values being 33 and 27 µg/m3 respectively.
In all cases, the assimilated ensemble provided the best results, which are very
satisfactory considering the state of the art: correlation coefficients were about 0.95,
except for the rural daily mean (0.90). RMSE ranged between 10 and 15 µg/m3, except for
the rural daily mean, for which it was slightly higher than 15 µg/m3.
The slightly lower performance for the rural daily mean can probably be explained by
the way the models simulate night-time ozone levels, which are generally too high.
These results show a significant improvement compared with the previously assessed years.
Figure 4. Taylor diagram representing the performance of the MACC-II/EVA models in
simulating the summer daily maximum of ozone (2010); data-assimilated results correspond
to the models noted with an “a” index.
Figure 5. Taylor diagram representing the performance of the MACC-II/EVA models in
simulating the summer daily mean of ozone (2010); data-assimilated results correspond
to the models noted with an “a” index.
The assimilated model results can be analysed in more depth by considering the spatial
distribution of the statistical indicators over Europe. Figure 6 presents maps of bias,
correlation coefficient and RMSE for the “ENSa” model results, for summer 2010. The
correlation coefficient is excellent, with values larger than 0.9 in most cases. Only a
few stations in Italy, Portugal and Central Europe show poor performances. For the
majority of stations, RMSE ranges between 5 and 15 µg/m3, which is very good.
Performances decrease for stations around the Mediterranean area and in Central Europe,
but in the latter case very few stations are available for validation.
The ENSa model is therefore well suited to predicting high values and exceedances of
the regulatory thresholds.
For comparison, Figure 7 shows the same panel of information for the ensemble model
results (without data assimilation) and the same indicator (2010 summer daily average).
It highlights where data assimilation improved the model results the most: Western
Europe clearly benefits, but so do the difficult regions (Mediterranean coast and
Central Europe).
Figure 6. Statistical scores of the “assimilated ensemble” model results against the AIRBASE
validation dataset for the ozone daily maximum over summer 2010: bias (a), correlation
coefficient (b), root mean square error (c)
Figure 7. Statistical scores of the “raw ensemble” model results against the AIRBASE
validation dataset for the ozone daily maximum over summer 2010: bias (a), correlation
coefficient (b), root mean square error (c)
The multi-model approach developed in the MACC-II system for regional air quality
modelling is of high interest for qualifying the uncertainty of the results. The
uncertainty of the ozone concentration assessments can be approached by considering the
range of variability of the model results. Figure 8 shows the ozone daily peaks
simulated by the EVA models in summer 2010, at rural, suburban and urban monitoring
sites in Europe. The consistency between the various models is good, whatever the site
typology. Differences do not exceed 15 µg/m3 and the temporal correlation with the
observed time series (in green) is high.
Figure 8. MACC-II/EVA model responses for simulated ozone daily peaks over summer 2010,
for various station typologies: rural (top), suburban (middle), urban (bottom)
More investigation has been conducted on how the models (in raw simulation and data
assimilation modes) predict the hourly concentrations and the daily peak.
Statistical scores established for the various station typologies and detailed
sub-region by sub-region (see footnote 2) over the year 2010 are shown in Figure 9,
Figure 10 and Figure 11. These figures show the rather high variability of model
performance across station typologies and sub-regions. Model performances were quite
satisfactory and consistent with the state of the art. As expected, the best results
are obtained for Northern Europe stations, while more difficulties were highlighted for
Southern stations. The complexity of the meteorological patterns and topography,
uncertainties on some sources (for instance biogenic sources) and uncertainties related
to some chemical mechanisms could explain this well-known limitation of current
modelling systems. Model performances are consistent with one another, and
satisfactory, for stations located in western and central Europe. The lack of stations
prevented the evaluation for suburban and urban locations in eastern Europe. The
benefits of data assimilation are generally clear when the “ensemble” of raw
simulations is compared with the “ensemble” of data-assimilated simulations.
Considering individual models, CHIMERE performed best, for both raw and
data-assimilated results. Inconsistencies can be noted in the EURAD results, with some
discrepancies that may have arisen in the data-assimilated results.
2 EUW = Western Europe, EUC = Central Europe, EUS = Southern Europe, EUN = Northern
Europe, EUE = Eastern Europe
Figure 9. MACC regional model scores for predicting the daily ozone peak over the year 2010
throughout European sub-regions at rural stations: (a) bias, (b) RMSE, (c) correlation
coefficient
Figure 10. MACC regional model scores for predicting the daily ozone peak over the year 2010
throughout European sub-regions at suburban stations: (a) bias, (b) RMSE, (c) correlation
coefficient
Figure 11. MACC regional model scores for predicting the daily ozone peak over the year 2010
throughout European sub-regions at urban stations: (a) bias, (b) RMSE, (c) correlation
coefficient
The capacity of the models to predict situations in which the regulatory thresholds
(information level: 180 µg/m3; alert level: 240 µg/m3) are exceeded, especially during
the “summer period” (April to October), was assessed for each model and each model
version (raw simulation or data assimilation mode).
Figure 12 and Figure 13 show the number of days on which exceedances of the information
and alert thresholds, respectively, were reported in the AIRBASE database, classified by
sub-region. Several ozone peaks occurred in summer 2010, especially in July. Western
Europe and central Europe were the regions most affected by the highest values. The
ability of the MACC-II/EVA “assimilated ensemble” model (ENSa) to reproduce these
peaks is illustrated in Figure 14. Although the episodes were detected qualitatively, the
number of exceedances observed in July, especially in Central Europe, was
underestimated. This is not surprising: almost all the models, and the Ensemble, showed
a negative bias in these areas.
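As an illustration of the indicator discussed above, the sketch below counts, month by
month, the days on which at least one hourly value exceeds a threshold. It is not the
processing actually applied to the AIRBASE data: the input layout (date string plus
hourly value), the function name and the sample values are hypothetical, and a strict
"greater than" comparison is assumed for an exceedance.

```python
from collections import defaultdict

INFO_THRESHOLD = 180.0   # µg/m3, hourly, information level quoted in the report
ALERT_THRESHOLD = 240.0  # µg/m3, hourly, alert level quoted in the report

def exceedance_days_per_month(hourly, threshold):
    """Count days with at least one hourly value strictly above the threshold.

    hourly: iterable of (date 'YYYY-MM-DD', concentration or None) pairs.
    Returns a dict mapping 'YYYY-MM' to a number of days.
    """
    days = set()
    for day, value in hourly:
        # Missing values (None) are skipped; strict inequality is an assumption.
        if value is not None and value > threshold:
            days.add(day)
    counts = defaultdict(int)
    for day in days:
        counts[day[:7]] += 1
    return dict(counts)

# Hypothetical hourly ozone values for one station
sample = [
    ("2010-07-09", 175.0), ("2010-07-09", 190.0),
    ("2010-07-10", 182.0), ("2010-07-10", 245.0),
    ("2010-07-11", 160.0), ("2010-07-11", None),
    ("2010-08-02", 181.0),
]
info_days = exceedance_days_per_month(sample, INFO_THRESHOLD)
alert_days = exceedance_days_per_month(sample, ALERT_THRESHOLD)
print(info_days, alert_days)
```

Applying the same function to observed and to modelled series gives the two quantities
compared in Figure 12 and Figure 14.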
Figure 12. Number of days (observed in 2010) when the information regulatory threshold for
ozone (180 µg/m3 hourly) was exceeded. Classification by European sub-regions
Figure 13. Number of days (observed in 2010) when the alert regulatory threshold for ozone
(240 µg/m3 hourly) was exceeded. Classification by European sub-regions
Figure 14. Capacity of the “assimilated Ensemble” model (ENSa) to reproduce the number of
days when the information ozone threshold was exceeded
The capacity of the MACC-II/EVA models to predict exceedances of the threshold values can
be assessed, although such results must be interpreted with special caution. The
threshold effect is very difficult to deal with: a difference of only one µg/m3 (which is
lower than the intrinsic model uncertainty) above or below the threshold can be
responsible for a bad mark. Contingency graphs are given below, for information, for all
the models involved and for the various station typologies (Figure 15). In general, the
data assimilation process tends to reduce the number of non detections (pink bars) and
increase the number of good predictions (blue bars). In some cases (for instance the
EURAD model) it can increase the number of false alerts (red bars). The balance between
those classes of events is generally consistent, whatever the station typology.
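The three classes shown in the contingency graphs, and the threshold effect mentioned
above, can be illustrated with a short sketch. The class names follow the report (good
predictions, false alerts, non detections); the function, the pairing of values and the
sample data are hypothetical, and a strict "greater than" test is assumed.

```python
def contingency(obs, mod, threshold):
    """Classify paired observed/modelled values against a threshold.

    good_prediction: observed and modelled values both exceed the threshold
    false_alert:     only the modelled value exceeds it
    non_detection:   only the observed value exceeds it
    Pairs where neither exceeds the threshold are not counted.
    """
    counts = {"good_prediction": 0, "false_alert": 0, "non_detection": 0}
    for o, m in zip(obs, mod):
        obs_exceeds = o > threshold
        mod_exceeds = m > threshold
        if obs_exceeds and mod_exceeds:
            counts["good_prediction"] += 1
        elif mod_exceeds:
            counts["false_alert"] += 1
        elif obs_exceeds:
            counts["non_detection"] += 1
    return counts

# Hypothetical daily ozone peaks (µg/m3). The second pair (181 observed, 180
# modelled) differs by 1 µg/m3 only, yet it is scored as a non detection.
observed = [185.0, 181.0, 170.0, 179.0, 200.0]
modelled = [190.0, 180.0, 182.0, 175.0, 160.0]
result = contingency(observed, modelled, 180.0)
print(result)
```

The second pair shows why such counts should be read with caution: the model error there
is far below the model uncertainty, but the event still falls in the wrong class.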
Figure 15. Contingency graphs for the prediction of exceedances of the information threshold
in 2010 by the MACC-II/EVA models. Rural sites (top), suburban sites (middle) and urban
sites (bottom)
4. Nitrogen Dioxide simulations and re-analyses
It is important to note that for the year 2010, only three of the six teams assimilated
NO2 observations in an operational way. RIUUK (Rhenish Institute for Environmental
Research at the University of Cologne) assimilated NO2 ground-level observations from the
AIRBASE database and also NO2 columns retrieved from satellite observations (OMI,
GOME-2, SCIAMACHY). The KNMI/TNO consortium assimilated ground-level concentrations
from AIRBASE and satellite observations from IASI. FMI assimilated NO2 in-situ data from
AIRBASE.
Figure 16, Figure 17 and Figure 18 present the statistical scores (bias, RMSE and
correlation coefficient respectively) of the EURADa, SILAMa and LOTOS-EUROSa data
assimilation systems in reproducing NO2 daily mean values over the year 2010. Scores are
clearly better for EURAD, whatever the indicator. The geographical consistency of the
EURADa scores is remarkable as well. One should note the low RMSE (lower than 5 µg/m3 at
many locations) obtained with the EURAD system. Its superiority can be explained by the
maturity of the data assimilation system that underpins the operational chain, and by the
fact that satellite information is assimilated in addition to in-situ data from the
AIRBASE stations. Earth observations should help in reproducing the geographical
distribution of air pollution patterns. Generally, the geographical areas where models
are less satisfactory are the same for all systems: Italy, the Alps and mountainous
areas, and also Eastern and central Europe for SILAM and LOTOS-EUROS. All models perform
correctly in Western Europe.
These model results show a significant improvement of the scores compared with those
obtained in previous years, indicating progress across the MACC-II regional modelling
chains. In 2008, RMSE ranged from 10 µg/m3 (in Germany and Central Europe) to 40 µg/m3
(in Italy and in Eastern Europe). In the current system, the best values are below
5 µg/m3 and the worst do not exceed 30 µg/m3.
Because NO2 in ambient air is mainly influenced by local sources, European-wide models
with a limited resolution (20 km in the best case) perform less well for NO2 than for
other pollutants.
Figure 16. Bias calculated for the NO2 daily mean in 2010 by the data assimilation systems
LOTOS-EUROSa (left), SILAMa (right) and EURADa (bottom)
Figure 17. RMSE calculated for the NO2 daily mean in 2010 by the data assimilation systems
LOTOS-EUROSa (left), SILAMa (right) and EURADa (bottom)
Figure 18. Correlation coefficient calculated for the NO2 daily mean in 2010 by the data
assimilation systems LOTOS-EUROSa (left), SILAMa (right) and EURADa (bottom)
A more detailed analysis of the daily mean was performed model by model and is shown in
Figure 19 to Figure 21 for the different statistical indicators. The analysis of model
performances is consistent for all station typologies. The EURAD data assimilation system
(EURADa) always gave better results for all indicators than the other codes and the
“ensemble”: it gains one or two points on the correlation coefficient and 3 to 5
µg/m3 on the RMSE. The EURAD system improved its performance compared with previous
years.
The results were much more disappointing for LOTOS-EUROS: the data assimilation chain
did not seem to improve on the results of the model run without data assimilation.
The raw simulation results are quite consistent from one model to another: CHIMERE, EMEP,
EURAD and LOTOS-EUROS have more or less the same basic behaviour, which is consistent
with the state of the art.
Figure 19. Bias of the MACC-II/EVA models in predicting the daily mean NO2 concentrations in
2010 for various station typologies: rural (top), suburban (middle), urban (bottom)
Figure 20. RMSE of the MACC-II/EVA models in predicting the daily mean NO2 concentrations in
2010 for various station typologies: rural (top), suburban (middle), urban (bottom)
Figure 21. Correlation coefficient of the MACC-II/EVA models in predicting the daily mean NO2
concentrations in 2010 for various station typologies: rural (top), suburban (middle),
urban (bottom)
5. PM10 simulations and re-analyses
Figure 22 presents the Taylor diagram for the results of each individual model and of the
ensemble models: raw simulation results (“ENS”) and data assimilated ensemble results
(“ENSa”). They relate to the PM10 daily mean over the year 2010. Station typologies are
distinguished for a more comprehensive analysis.
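A Taylor diagram summarises, for each model, the standard deviation of the modelled
series, its correlation with the observations and the centred RMS difference between the
two. The sketch below computes these quantities and checks the geometric relation that
links them on the diagram (law of cosines). It is an illustration only: the function
name and the sample PM10 values are hypothetical, population (1/n) statistics are
assumed, and the report does not say how its diagrams were produced.

```python
import math

def taylor_statistics(obs, mod):
    """Return (sd_obs, sd_mod, correlation, centred_rms_difference).

    Population (1/n) statistics are used so that the identity
    crmsd**2 == sd_obs**2 + sd_mod**2 - 2*sd_obs*sd_mod*corr holds exactly.
    """
    n = len(obs)
    mean_o = sum(obs) / n
    mean_m = sum(mod) / n
    dev_o = [o - mean_o for o in obs]
    dev_m = [m - mean_m for m in mod]
    sd_o = math.sqrt(sum(d * d for d in dev_o) / n)
    sd_m = math.sqrt(sum(d * d for d in dev_m) / n)
    corr = sum(a * b for a, b in zip(dev_o, dev_m)) / (n * sd_o * sd_m)
    crmsd = math.sqrt(sum((b - a) ** 2 for a, b in zip(dev_o, dev_m)) / n)
    return sd_o, sd_m, corr, crmsd

# Hypothetical PM10 daily means (µg/m3), for illustration only
observed = [20.0, 35.0, 50.0, 28.0, 42.0]
modelled = [18.0, 30.0, 41.0, 25.0, 40.0]
sd_o, sd_m, corr, crmsd = taylor_statistics(observed, modelled)
identity_gap = crmsd ** 2 - (sd_o ** 2 + sd_m ** 2 - 2 * sd_o * sd_m * corr)
print(f"sd_obs={sd_o:.2f}, sd_mod={sd_m:.2f}, r={corr:.2f}, crmsd={crmsd:.2f}")
```

Note that the centred RMS difference removes the mean bias, so it is not the same
quantity as the RMSE quoted elsewhere in this section.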
The data assimilated ensemble model gave good results, even if they seem to have been
slightly degraded by the LOTOS-EUROSa performances, which were lower than those of the
other models. Nevertheless, the correlation coefficient is higher than 0.8 and the RMSE
lower than 12 µg/m3, which is very good according to the state of the art.
Figure 23 details the geographical distribution of these scores for the data assimilated
ensemble. The scores were best in Western Europe and worst in Eastern Europe and in
mountainous areas. This can be explained by the meteorological complexity of such areas,
but also by uncertainties in the emission inventories (especially in the eastern part of
Europe).
The gain from the data assimilation process is illustrated and quantified by comparing
the CHIMERE raw simulation results with the CHIMERE data assimilated results. The bias
is reduced by 10 to 20 µg/m3 in absolute value, and the correlation coefficient is
increased by about 0.3 to 0.4. The bias becomes positive at almost all the stations
(5 to 10 µg/m3).
The greatest difficulties were found in predicting the annual concentrations in Southern
Europe. The best results were obtained for Western Europe. Correlation coefficients were
highly variable from one site to another and from one model to another, ranging from 0.85
in some excellent situations (with the data assimilated systems) to 0.1 in the worst ones.
Figure 22. Taylor diagram representing the performance of the MACC-II/EVA models in
simulating the PM10 daily mean (2010); data assimilated results correspond to the models
noted with an “a” index.
Figure 23. Bias, RMSE and correlation coefficient of the data assimilated Ensemble for the
prediction of PM10 annual average in 2010
An in-depth analysis can be conducted by considering the scores for the prediction of the
PM10 daily mean at the various station typologies and in the different geographical
regions (Figure 24 to Figure 26). Except for LOTOS-EUROS (this point should be further
investigated), data assimilation significantly improved the model results. It should be
noted that in rural areas it led the CHIMERE model to overestimate PM10 concentrations,
which is generally unexpected. As for NO2, one can note the consistent behaviour of the
CHIMERE and EURAD models (raw data), but the EURAD data assimilation system had a more
significant impact on PM10 concentrations. It remarkably improved performance at urban
sites: 2 to 4 more points on the correlation coefficient, and an RMSE decreased by 10 to
15 µg/m3 at Eastern and Southern urban locations (known to be difficult to capture).
The improvement is highly significant and these results are very encouraging, also for
developing the use of Earth observation data in data assimilation systems. The
performances of the CHIMERE and EURAD DA systems are generally very good. RMSE ranged
from below 10 µg/m3 (rural and suburban stations) to 15 to 20 µg/m3 (urban stations, with
a maximum reached in Southern Europe), and the correlation coefficient ranged from around
0.4 in the worst case to 0.6-0.8 in the best ones. This is a significant improvement
compared with the reports of previous years. The results (whatever the model set-up) were
generally best for Western and Northern locations. Southern and Eastern Europe are the
areas where the results were most uncertain.
Finally, Figure 27 gives an overview of those results for urban typologies at the European
scale. The added value of the Ensemble is clearly highlighted, together with the
consistent behaviour of the CHIMEREa and EURADa models. Scores were very good for these
models compared with the state of the art.
Figure 24. Performance indicators of the MACC-II/EVA models, sub-region by sub-region, for
the prediction of the PM10 daily mean, 2010, rural stations (panels a, b and c)
Figure 25. Performance indicators of the MACC-II/EVA models, sub-region by sub-region, for
the prediction of the PM10 daily mean, 2010, suburban stations (panels a, b and c)
Figure 26. Performance indicators of the MACC-II/EVA models, sub-region by sub-region, for
the prediction of the PM10 daily mean, 2010, urban stations (panels a, b and c)
Figure 27. Various model scores for the simulation of the PM10 daily mean at urban sites,
2010: bias (top), RMSE (middle) and correlation coefficient (bottom)
The last part of the analysis deals with the prediction of situations in which the
regulatory threshold value (50 µg/m3 for the PM10 daily mean) is exceeded.
Figure 28 shows the number of days on which such exceedances occurred in 2010. Western
and Central Europe were the regions most affected. One can note that the winter months of
2010 (January, February and December) were particularly rich in such events.
Figure 29 represents the same indicator as modelled by the data assimilated Ensemble
model (ENSa). One third of the days of exceedance were correctly modelled. The missed
events concerned the summer period, and some exceedances that occurred in winter in
Western and Central Europe.
It is important to note that the simulations did not account for the impact of the huge
forest fires that occurred in Russia during the first half of August. The impact of
forest fires on ozone and PM atmospheric concentrations is well known, and summer ozone
and PM10 concentrations in Central and Eastern Europe are likely to have been influenced
by these emissions. This contribution is clearly missing from the simulations and can
explain the model discrepancies observed in the summer period. In the next reports the
MACC-II modelling chains will account for forest fire emissions provided by the dedicated
sub-project. A significant improvement of the model performances is expected from this
new functionality.
Figure 30 shows time series of the number of days exceeding the PM10 regulatory limit
value (daily mean) as predicted by all the MACC-II/EVA data assimilated models and as
observed. An excellent correlation between the two is highlighted. This shows the ability
of the models to predict episodes, even if their magnitude is underestimated. These
graphs demonstrate the very encouraging performance of the DA systems, although some
events are still underestimated (summer period).
Figure 28. Number of days of exceedance of the PM10 daily average threshold (50 µg/m3)
observed in 2010 and sorted by European sub-regions: EUW : Western, EUC : Central,
EUN: Northern, EUS: Southern, EUE: eastern
Figure 29. Number of days of exceedance of the PM10 daily average threshold (50 µg/m3) in
2010 predicted by the MACC-II/EVA data assimilated ensemble models; sorted by
European sub-regions: EUW : Western, EUC : Central, EUN: Northern, EUS: Southern,
EUE: eastern
Figure 30. Number of days of exceedance of the PM10 daily average threshold (50 µg/m3) in
2008 predicted by the “data assimilated ensemble” model
Histograms of contingency indicators (Figure 31) also give a good representation of the
improvements brought by the data assimilation systems, with a significantly reduced
number of non detections (except for LOTOS-EUROS) and an increased number of good
predictions. It is interesting to note that the number of false alerts, which could be
expected to increase, remains quite stable.
Figure 31. Contingency indicators for the prediction of exceedances of the daily PM10 limit
value at urban stations by the MACC-II/EVA models: good predictions, false alerts, and
non detection for the year 2010
6. PM2.5 simulations and re-analyses
In the previous evaluation reports (2007 to 2009), the modelled PM2.5 concentrations were
not assessed in depth because of the lack of observation data. The number of PM2.5
stations has increased in Europe with the implementation of the air quality directive,
and formal assessment is now becoming more relevant.
A first attempt was made with the 2010 re-analyses. Figure 32, Figure 33 and Figure 34
present the usual statistical scores obtained by comparing each model that computed
PM2.5 concentrations against the available observations. Because of the limited number of
stations, those results must be interpreted with caution.
The consistency of the models' behaviour, whatever the station typology, should be noted.
EURADa (data assimilation chain) is the only one that overestimated PM2.5
concentrations over Europe. Its performances were quite encouraging, with a correlation
coefficient of about 0.6 and an RMSE of about 10-15 µg/m3. However, this last figure
reflects lower quality results than those obtained for the other pollutants. One should
note the promising results provided by the CHIMERE model, even for the raw simulation.
Figure 32. Bias between PM2.5 observed and modelled daily means for the year 2010 for the
MACC-II/EVA models and for various station typologies: rural (top), suburban (middle),
urban (bottom)
Figure 33. RMSE between PM2.5 observed and modelled daily means for the year 2010 for the
MACC-II/EVA models and for various station typologies: rural (top), suburban (middle),
urban (bottom)
Figure 34. Correlation coefficient between PM2.5 observed and modelled daily means for the
year 2010 for the MACC-II/EVA models and for various station typologies: rural (top),
suburban (middle), urban (bottom)
7. Conclusions
This report provides an extensive analysis of the performance of the MACC-II/EVA
modelling systems (simulation and re-analysis platforms) in predicting the concentrations
of the regulated air pollutants (O3, NO2, PM10, PM2.5) in 2010. Daily mean values, daily
maximum values (ozone), annual means and indicators related to situations in which
regulatory thresholds are exceeded were investigated. For the first time, it was possible
to develop the same analysis for PM2.5. A distinction between the station typologies and
the European sub-regions was made for a more comprehensive interpretation.
It is interesting to note how the data assimilation systems improved the representation
of the air pollution patterns, at least for EURAD, SILAM and CHIMERE. The LOTOS-EUROS
situation needs to be further investigated, as its data assimilated results seem less
improved than those achieved with the other models. The EURAD data assimilation chain,
which integrates satellite information (at least for NO2), obtained the best results in
many situations, demonstrating the potential added value of Earth observations for air
quality issues.
This analysis demonstrates the added value of providing operational re-analyses of air
quality fields for policy decisions and air quality management.
The performances of the MACC-II/EVA models are very promising for the simulation
platform, with capacities in line with the state of the art and even better in some
cases. However, it is important to improve the models in Southern and Eastern Europe,
which are the most difficult regions to simulate. This is a well-known situation, which
justifies current research projects: uncertainties in emissions, and limitations of the
models in reproducing the dynamical and chemical processes in this geographical area, are
well known and should be reduced in the coming year. It must also be noted that, in
general, few stations are available in these parts of Europe for both validation and
evaluation. This can make the interpretation of the performance results more difficult.
Nevertheless, the scores established in this report build confidence in the MACC-II/EVA
assessment reports for air quality in Europe. The mapped indicators can be considered
relevant, with a controlled uncertainty. For the regions where the models perform less
well (Southern and Eastern Europe), general patterns are correctly represented: the
episodes are generally predicted but their intensity is underestimated.
In the next steps more DA systems will be run and the models should improve overall
(integration of forest fires, progress in the parametrisations, etc.); the scores are
therefore expected to be even better than those presented in this report. As already
mentioned, for some indicators the overall performance of the modelling systems is better
than in the assessment exercises of previous years. Local situations in sensitive areas
(Italy, Balkans, Eastern Europe) should be investigated in more depth.
ANNEX: methodologies and assumptions
Models:
The models that provided raw simulations and data assimilated fields of air pollutant
concentrations for the year 2010 are those involved in the MACC-II regional cluster
dedicated to the provision of air quality information (in near-real time and in delayed
mode) at the European scale.
Seven models, each running operationally on its own modelling platform, have performed
re-analyses since the end of the MACC project (October 2011) to establish yearly
assessment reports. These models are described in the QA/QC dossiers available and
regularly updated on the MACC website
(http://www.gmesatmosphere.eu/documents/deliverables/r-ens/). It is important to note that
differences can occur between the modelling chain run routinely for the provision of daily
air quality forecasts and near-real-time analyses, and the modelling chain used for
re-analysis calculations. The latter requires larger computational resources
(computational time and storage space) to deal with a whole year on an hourly basis. Some
teams had not completed the development of their data assimilation system and in these
cases reported only raw simulation values.
The tables below give an overview of each data assimilation system developed by the
regional air quality modelling partners in MACC, and its status at the time when the 2010
runs were performed. This synthesis can facilitate the interpretation of some results
reported in this report and in the validation report.
Model | DA process | Pollutants concerned | Data sources | Operational production of data assimilated fields
--- | --- | --- | --- | ---
CHIMERE | Optimal interpolation: kriging observation data with CHIMERE as external drift | O3, PM10 | AIRBASE | Yes. Significant improvement
CHIMERE | Ensemble Kalman filter | O3 | AIRBASE. Under evaluation: O3 partial tropospheric columns (IASI) | Not yet; need for comparison with OI
EMEP | 3D-VAR | NO2 | OMI NO2 tropospheric column | Yes but did not operate its data assimilation chain (not yet operational)
EURAD | Intermittent 3D-VAR | O3, NO2, NO, CO, SO2, PM10 | AIRBASE in situ measurements; MOZAIC airborne in situ measurements; NO2 tropospheric column retrievals from OMI, GOME-2, SCIAMACHY; MOPITT CO profiles | Yes
LOTOS-EUROS | Ensemble Kalman filter | O3, NO2, PM10, AOD, SO2, SO4 | Airbase; OMI: observation operator developed and used for the 2009 report | Yes. Significant improvement. However further evaluation is needed
MATCH | 3D-VAR with transform into spectral space | O3, NO2 | Airbase | System not fully operational, which did not provide model outputs for the 2010 assessment report
MOCAGE | 3D-VAR | O3 | Ozone in-situ data (AIRBASE) | Yes since summer 2010, online evaluation available
SILAM | 3D-VAR, 4D-VAR | O3, NO2, SO2 | AIRBASE in situ | Yes. Operational

Table 1: Synthesis of the current data assimilation capacities developed in the regional air quality models
involved in MACC. They must be operational by the end of the MACC project
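To give a feel for what an optimal-interpolation type of analysis does, the sketch below
applies the textbook update for a single observation: the background (raw model) field is
corrected by the innovation, weighted by the background error covariance. This is a
generic illustration, not the CHIMERE kriging scheme nor any other system listed in
Table 1; the function name, the covariance matrix, the observation error variance and all
values are hypothetical.

```python
def oi_update(background, obs_index, obs_value, b_cov, obs_error_var):
    """Textbook optimal interpolation update for ONE observation.

    background:    list of background (raw model) values on the grid
    obs_index:     index of the grid point where the observation is located
    obs_value:     observed concentration at that grid point
    b_cov:         background error covariance matrix (list of lists)
    obs_error_var: observation error variance
    Returns the analysis: x_a[i] = x_b[i] + K[i] * (y - x_b[obs_index]),
    with gain K[i] = B[i][obs_index] / (B[obs_index][obs_index] + r).
    """
    innovation = obs_value - background[obs_index]
    denom = b_cov[obs_index][obs_index] + obs_error_var
    return [xb + b_cov[i][obs_index] / denom * innovation
            for i, xb in enumerate(background)]

# Hypothetical three-point grid (PM10, µg/m3) with one station at the middle point
background = [20.0, 22.0, 25.0]
b_cov = [
    [16.0, 8.0, 2.0],
    [8.0, 16.0, 8.0],
    [2.0, 8.0, 16.0],
]
analysis = oi_update(background, obs_index=1, obs_value=30.0,
                     b_cov=b_cov, obs_error_var=4.0)
print(analysis)
```

The correction is largest at the station and spreads to neighbouring points according to
the assumed covariances, which is why the density of stations matters so much for the
quality of the re-analyses.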
Model | Data provided | Quality checking
--- | --- | ---
CHIMERE | O3, PM10, with OI for the whole year and the requested episodes; raw simulation results for NO2 and PM2.5 | Significant improvement of the simulation results
EMEP | Only raw simulations provided, DA under development | Not applicable for DA chain (still under development)
EURAD | O3, NO2, NO, CO, SO2, PM10 (surface obs), MOZAIC, NO2 tropospheric column retrievals from OMI, GOME-2, SCIAMACHY and MOPITT CO profiles | Significant improvement of the simulation results
LOTOS-EUROS | ozone, PM10 and NO2, PM2.5 re-analyses on 30 km resolution | Significant improvement of the simulation results
MATCH | No re-analyses nor raw simulations provided | Not applicable
MOCAGE | O3 re-analyses provided; raw simulation results for the other compounds | Significant improvement
SILAM | O3, NO2, PM10 re-analyses provided | Significant improvement of the model results

Table 2: Brief summary of the model configurations used for the 2010 assessment report
Assumptions on input data:
Emission data, meteorological re-analyses and boundary conditions were provided by the
other MACC-II sub-projects or by the MACC-II partners. Meteorological re-analyses were
provided by ECMWF. The emission inventory used for running the models is the
high-resolution inventory provided for the year 2009 by TNO within the “Emissions” MACC
sub-project. Finally, boundary conditions for the gaseous compounds come from the global
“reactive gases” sub-project (MOZART re-analyses).