Analytica Chimica Acta 363 (1998) 261–278
Metal complexation model identification and the detection and
elimination of erroneous points using evolving least-squares
fitting of voltammetric data
Božidar S. Grabarić¹,*, Zorana Grabarić¹, José Manuel Díaz-Cruz,
Miquel Esteban, Enric Casassas
Department of Analytical Chemistry, University of Barcelona, Av. Diagonal 647, E-08028 Barcelona, Spain
Received 12 November 1997; received in revised form 27 January 1998; accepted 2 February 1998
Abstract
The experimental errors and their propagation are very often neglected when using voltammetric data for chemical model
identification, consecutive stability-constants determination and speciation studies. The influence of the experimental error has
been analyzed for two mathematical models used very frequently, viz. the Leden–DeFord–Hume and the van Leeuwen
mathematical models. It was demonstrated using simulated noise-free and noise-corrupted data that relatively small and
usually occurring experimental errors, in half-wave or peak potential, can blur the initially assumed chemical model or can
lead to a set of incorrect stability constants. In order to minimize the influence of error on model identification and parameter
estimation, an evolving least-squares fitting (ELSQF) procedure is proposed which makes use of progressively increasing (in
forward and backward directions) data window size. At the same time, a procedure for detection and elimination of erroneous
points is introduced enabling more reliable estimation of the parameters that describe the metal-ion–ligand complexation
systems.
The proposed procedure was tested on experimental data of consecutively formed Pb(II) 2-hydroxypropanoates, obtained by
differential pulse polarography (DPP) in aqueous solution of constant ionic strength, I = 2 M (NaClO4), pH 5.7 and constant
temperature, t = (23 ± 1) °C, when investigating the influences of errors in the Leden–DeFord–Hume model. Data obtained by
differential pulse anodic stripping voltammetry (DPASV) for the interaction of Zn(II) with a macromolecular ligand, the anion
of polymethacrylic acid (PMA) at constant degree of neutralization, α_d = 0.8, and at two different concentrations of supporting
electrolyte, c(KNO3) = 0.04 and 0.10 M, are used to demonstrate the influence of error for the model proposed by van
Leeuwen's group. In both cases, the ELSQF approach gave clear and unambiguous complexation model identification and
reliable parameters evaluation with, or without detection and elimination of erroneous points. © 1998 Elsevier Science B.V.
Keywords: Anodic stripping differential pulse voltammetry; Metal-ion–ligand complexation model identification; Consecutive complexes;
Differential-pulse polarography; Evolving least-squares fitting; Lead(II) 2-hydroxypropanoates; Metal-ion speciation; Stability-constant
determination; Zinc(II) polymethacrylates
*Corresponding author. Tel.: +34 3 402 1286; fax: +34 3 402 1233; e-mail: bosko@zeus.qui.ub.es
¹ On leave from the Faculty of Chemical Engineering and Technology, University of Zagreb, Zagreb, Croatia.
0003-2670/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved.
PII S0003-2670(98)00143-3
1. Introduction
Voltammetric techniques and methods are very often used for studying the metal-ion complexation
and speciation with different ligands in natural aquatic environment [1–3], life sciences [4], pharmaceutical
and all other branches of the chemical industry [5]. Using these techniques, chemical-model identification
and stability-constants determination are usually performed, very often neglecting the fact that the errors in
experimental measurement, as well as those introduced during data evaluation, can in most circumstances
strongly influence the model identification and its validation. Recently, it was demonstrated [6] using the
Leden–DeFord–Hume model [7,8], and both simulated and experimental data, that relatively small
overall errors in voltammetric measurement and evaluation of half-wave or peak potentials (<1 mV) and
in limiting or peak currents (<2%) can lead to a wrong model identification and to inaccurate evaluation
of consecutive stability constants, if several constraints based on physicochemical reasoning and error
propagation are not taken into account. Regardless of the existing computerised instrumentation which
minimises the possible human error, there are still many sources which can generate experimental errors
and which cannot be compensated by the computer. On the other hand, during the evaluation of characteristic
parameters of voltammograms (limiting or peak current, half-wave or peak potential and reversibility
parameters), an error may be introduced merely because of the use of approximate methods for background
current correction, not to mention the fact that many approximate graphical parameter estimation methods
are still in use. Such overall errors can contribute to inappropriate model identification,
generate unreliable parameters of the chemical system investigated, and in some circumstances lead to
physicochemically meaningless parameters or a complete blurring of the chemical model.
The most frequent indications of the strong influence of errors on model identification and
stability-constant determination using conventional LSQ methods are:
1. a high standard error of an evaluated parameter, which by statistical tests usually suggests that
the given parameter does not differ statistically from zero;
2. physicochemically meaningless values of evaluated parameters (e.g. negative values of a stability
constant); and
3. divergence of the iterative optimisation procedure.
In a previous paper [6], the concept of forward evolving least-squares fitting (ELSQF) of a polynomial
to the experimental data obtained using the Leden–DeFord–Hume model was introduced, which
clearly showed that the failure of conventional polynomial fitting to experimental data, in many
circumstances, is simply due to the errors in measurement and evaluation of characteristic parameters of
voltammograms (half-wave or peak potential and limiting or peak current). Therefore, without a rigorous
error-propagation analysis of the experimental data and the implication of the error level on the numerical
and statistical procedures used, ambiguous results can be obtained. This so-called hard modelling ambiguity
can be resolved by:
(i) using some additional criteria or constraints because numerical and statistical methods cannot
give the characterisation of the investigated chemical system beyond the level of the overall
experimental error; and/or
(ii) performing many replicate measurements in order to decrease the statistical error.
In the present paper, both forward and backward ELSQF are applied as a general tool using two different
mathematical models, and the procedure for the detection and elimination of erroneous points is proposed.
Using the proposed overall procedure, more reliable chemical system identification and parameters
evaluation are obtained. Simulated data and errors were treated together with experimental voltammetric
data obtained for the interaction between Pb(II) ions and 2-hydroxypropanoates, and that between Zn(II)
ions and polymethacrylates.
2. Theoretical part
The concept of parameters optimisation using least-squares fitting (LSQF) of a mathematical function to
the experimental data is well known [9]. Its goal is to find those parameters that minimise a selected
function with respect to the experimental data set. This is usually done by taking n observations,
y_obs,i (i = 1, 2, ..., n), and defining a model which correlates each observation y_obs,i with a
calculated value y_calc,i:

$$y_{\mathrm{obs},i} = y_{\mathrm{calc},i} + e_i, \qquad i = 1, 2, \ldots, n \tag{1}$$

where e_i is the experimental error. The calculated value y_calc,i will depend on the g parameters, m_g,
and on the independent variable x_i:

$$y_{\mathrm{calc},i} = f(m_g; x_i) \tag{2}$$

The assumption in this model is that y_obs,i values contain experimental error, while x_i values are
error-free. Another important assumption present in this model is that the errors are all equal and
independent of each other. Neither of these assumptions holds perfectly in most experiments; moreover,
electrochemical measurements are especially sensitive to the influence of errors due to their complexity.
By the method of least squares, a sum of squared residuals:

$$r_i^2 = (y_{\mathrm{obs},i} - y_{\mathrm{calc},i})^2 \tag{3}$$

is minimised (least squares):

$$S = \sum_{i=1}^{n} r_i^2 = \min \tag{4}$$

A more general function to be minimised is the one which accounts for different errors assuming that
errors for each pair of observations are connected through a covariance term:

$$S = \sum_{i=1}^{n} \sum_{k=1}^{n} r_i W_{i,k} r_k = \min \tag{5}$$

where W_i,k are statistical weights, i.e. the inverses of the variances and covariances M_i,k.
By the conventional LSQ approach, all the data pairs within one (or more) experiment(s) are taken and
the selected function is minimised. The correct choice of this function is very important for obtaining
reliable and physicochemically meaningful results.
In the case when there are several possible chemical models (e.g. consecutive complex formation), the
correct model is identified using some statistical criteria and physicochemical reasoning. The statistical
criteria used in this paper are [10]:
(i) Hamilton R-factor (HRF):

$$\mathrm{HRF} = \sqrt{\frac{\sum_{i=1}^{n} (y_{\mathrm{obs},i} - y_{\mathrm{calc},i})^2}{\sum_{i=1}^{n} y_{\mathrm{obs},i}^2}} \tag{6}$$

(ii) Akaike information criterion (AIC):

$$\mathrm{AIC} = n \ln\left\{\frac{\sum_{i=1}^{n} (y_{\mathrm{obs},i} - y_{\mathrm{calc},i})^2}{n - m}\right\} + 2m \tag{7}$$

(iii) Mean quadratic error of prediction (MQEP), which is a measure of the predictive ability of the
proposed model and is defined as:

$$\mathrm{MQEP} = \frac{1}{n} \sum_{i=1}^{n} (y_{\mathrm{obs},i} - y_{\mathrm{calc},i})^2 \tag{8}$$

The ELSQF approach proposed [6] and generalised in this paper, i.e. the forward and backward evolving
least squares, consists of analysing, by least squares, experimental data pairs, starting with a data
window containing the statistically required minimal number of data points, w_min. Then, the same procedure
is repeated using a progressively increasing window size of data points, up to the maximal or total number of
experimental data points, w_max. In this way, each of the 'best' (in the sense of least squares) parameters, m_h, is
obtained as a function of window size or of an independent variable, i.e. a set of the 'best' parameters, m_g,h
(where g denotes the gth parameter and h denotes the progressively increasing window size in forward or
backward direction, h = w_min to w_max), is obtained.
The main advantages of the ELSQF approach are:
(i) any trend deviating from the expected constancy of the parameters or inadequate selection of the
function to be minimised can be easily visualised, which enables simple and reliable complexation
model identification;
(ii) in most cases, erroneous points can be determined and eliminated using statistical criteria; and
(iii) any gth parameter is obtained as the mean of the m_h values (with or without erroneous points
elimination), and therefore it is more reliable than those obtained from only one experimental data
window.
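The three statistical criteria of Eqs. (6)–(8) are simple functions of the residuals and can be sketched as follows. This is only an illustrative sketch: the function names are hypothetical and are not part of the ELCHEM toolbox used by the authors, and m is taken here to be the number of fitted parameters.

```python
import math

def residual_sum_of_squares(y_obs, y_calc):
    """Sum of squared residuals, S of Eq. (4)."""
    return sum((o - c) ** 2 for o, c in zip(y_obs, y_calc))

def hamilton_r_factor(y_obs, y_calc):
    """HRF, Eq. (6): sqrt(S / sum of squared observations)."""
    s = residual_sum_of_squares(y_obs, y_calc)
    return math.sqrt(s / sum(o ** 2 for o in y_obs))

def akaike_information_criterion(y_obs, y_calc, m):
    """AIC, Eq. (7), for n observations and m fitted parameters.

    Requires n > m and a non-zero residual sum (log of zero is undefined).
    """
    n = len(y_obs)
    s = residual_sum_of_squares(y_obs, y_calc)
    return n * math.log(s / (n - m)) + 2 * m

def mean_quadratic_error_of_prediction(y_obs, y_calc):
    """MQEP, Eq. (8): mean of the squared residuals."""
    return residual_sum_of_squares(y_obs, y_calc) / len(y_obs)
```

For each candidate model (each value of m) the three numbers would be computed from the same observations, and the model giving the minimum is the one favoured by that criterion.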
In the ideal case, the parameters m_g,h obtained by the conventional or by the evolving LSQ approaches
would be practically the same. However, simulating the voltammetric data and possible errors, it was
demonstrated that the conventional approach (i) can fail in model identification, (ii) may not converge,
or (iii) can produce unreliable parameters, when the errors are relatively high although experimentally
acceptable. Of course, this depends on the model function used and on the values of parameters to be
evaluated. In the case of the Leden–DeFord–Hume model, an error in half-wave or peak potential of
1 mV and one in current of 2% blurred the chemical model of consecutively formed weak labile
complexes of lead(II) propanoates, while reliable results could be obtained [6] using forward evolving
least-squares fitting.
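The evolving-window idea can be sketched generically as below. This is a minimal sketch under stated assumptions, not the authors' implementation: `evolving_fit` and `fit_line` are hypothetical names, and a straight-line fit stands in for whatever model function is actually fitted.

```python
def fit_line(x, y):
    """Ordinary least-squares straight line y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return (mean_y - slope * mean_x, slope)

def evolving_fit(x, y, fit, w_min, direction="forward"):
    """Refit the model on windows of size h = w_min .. len(x).

    forward:  windows grow from the first data point;
    backward: windows grow from the last data point.
    Returns a list of (h, parameters) pairs.
    """
    if direction not in ("forward", "backward"):
        raise ValueError("direction must be 'forward' or 'backward'")
    results = []
    for h in range(w_min, len(x) + 1):
        if direction == "forward":
            xs, ys = x[:h], y[:h]
        else:
            xs, ys = x[-h:], y[-h:]
        results.append((h, fit(xs, ys)))
    return results
```

When the fitted model is adequate, the parameters returned for the successive windows stay constant; a systematic drift with window size signals an inadequate model. The last forward and the last backward window both contain all points and therefore coincide with the conventional single-window fit.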
The Leden–DeFord–Hume mathematical model proposed for consecutive complex formation between
metal ion and small simple ligands (D_metal ion = D_ligand) can be expressed as [7,8]:

$$F_{0,i} = \exp\left(\frac{zF}{RT}\,\Delta E_{p,i} - \ln \varphi_i\right) = 1 + \sum_{j=1}^{m} \beta_j [L]_i^{\,j} \tag{9}$$

$$F_{j,i} = \frac{F_{j-1,i} - \beta_{j-1}}{[L]_i} \tag{10}$$

where j denotes the jth complex species in the system (j = 1, 2, ..., m) and β0 = 1. So far this model
was investigated without the analysis made in backward ELSQ direction and without the procedure for
erroneous points detection and elimination. In Eqs. (9) and (10), β_j = [ML_j]/([M][L]^j), i.e. cumulative
stability constants, ΔE_p,i = (E_p,free − E_p,i,complexed) and φ_i = I_p,i,complexed/I_p,free.
Square brackets with symbols of species denote their equilibrium concentrations.
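Eqs. (9) and (10) can be sketched numerically as follows. The function names are hypothetical; z = 2 (for Pb(II)) and T = 296.15 K (23 °C) are assumed defaults chosen to match the experimental conditions described later, and β0 = 1 is used for the first step of Eq. (10).

```python
import math

F = 96485.332   # Faraday constant, C/mol
R = 8.314462    # molar gas constant, J/(mol K)

def leden_f0_from_model(ligand_conc, betas):
    """Right-hand side of Eq. (9): F0 = 1 + sum_j beta_j * [L]^j."""
    return 1.0 + sum(b * ligand_conc ** j for j, b in enumerate(betas, start=1))

def leden_f0_from_data(delta_ep_volts, phi, z=2, temp_kelvin=296.15):
    """Left-hand side of Eq. (9): F0 = exp(zF/(RT) * dEp - ln(phi)).

    delta_ep_volts: E_p,free - E_p,complexed in volts
    phi:            I_p,complexed / I_p,free
    """
    return math.exp(z * F / (R * temp_kelvin) * delta_ep_volts - math.log(phi))

def next_leden_function(f_prev, beta_prev, ligand_conc):
    """Eq. (10): F_j = (F_{j-1} - beta_{j-1}) / [L]; use beta_prev = 1 for F_1."""
    return (f_prev - beta_prev) / ligand_conc
```

For example, with β1 = 130, β2 = 1300 and β3 = 3600 and [L] = 0.1 mol/l, F0 = 1 + 13 + 13 + 3.6 = 30.6, so that F1 = (30.6 − 1)/0.1 = 296, which is a polynomial one degree lower than F0 with β1 as its intercept.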
The second model investigated is the one proposed by van Leeuwen [11–14] and is more general because
it holds for metal-ion complexation with macromolecular ligand (D_metal ion ≠ D_ligand). This model can be
expressed as:

$$\varphi_i = \left(\frac{1 + \sum_{j=1}^{m} (D_{ML_j}/D_M)\,\beta_j [L]_i^{\,j}}{1 + \sum_{j=1}^{m} \beta_j [L]_i^{\,j}}\right)^{p} \tag{11}$$

and, for the case that D_metal ion = D_ligand, it can be shown that this model is equivalent to the
DeFord–Hume model. For 1 : 1 metal-ion-to-ligand stoichiometry, Eq. (11) is usually written as:

$$\varphi_i = \left(\frac{1 + \varepsilon \beta_1 [L]_i}{1 + \beta_1 [L]_i}\right)^{p} \tag{12}$$

where ε is the ratio of diffusion coefficients of the complexed and free metal ions, ε = D_ML/D_M, and
β1 = [ML]/([M][L]). The empirical parameter p depends on the hydrodynamic conditions in the
pre-electrolysis step of stripping techniques: it is 1/2 for semi-infinite linear diffusion and 2/3 for
laminar convective diffusion. Both models assume the following:
(i) the electron transfer reaction between oxidised and reduced metal ion is sufficiently fast to make
the system reversible on the time scale of the technique;
(ii) the complex ML formed is not electroactive;
(iii) the ligand reacts only with metal ion;
(iv) the ligand concentration is in large excess over the metal-ion concentration and the ligand
equilibrium concentration is approximately equal to the bulk ligand concentration, i.e. [L] ≈ c_L;
(v) in the case that the ligand is a macromolecule with many coordinating groups, ligand L represents
only one site in the macromolecule; and
(vi) the adsorption of electroactive species on the electrode surface is absent.
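The normalised current of Eqs. (11) and (12) can be sketched as a small function. The name `van_leeuwen_phi` and the argument layout are illustrative assumptions; `eps[j-1]` stands for the ratio D_MLj/D_M of the jth complex.

```python
def van_leeuwen_phi(ligand_conc, betas, eps, p=0.5):
    """Normalised current phi of Eq. (11).

    betas: cumulative stability constants beta_1 .. beta_m
    eps:   diffusion-coefficient ratios D_MLj / D_M, one per complex
    p:     1/2 for semi-infinite linear diffusion, 2/3 for laminar
           convective diffusion
    """
    if len(betas) != len(eps):
        raise ValueError("betas and eps must have the same length")
    numerator = 1.0
    denominator = 1.0
    for j, (beta, e) in enumerate(zip(betas, eps), start=1):
        term = beta * ligand_conc ** j
        numerator += e * term
        denominator += term
    return (numerator / denominator) ** p
```

With a single complex, this reduces to Eq. (12). Two limits are easy to check: φ = 1 when no ligand is present or when all ratios equal 1 (complex and free metal diffuse equally fast), and φ approaches ε^p at large ligand excess.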
3. Experimental part
3.1. Chemicals
All chemicals were of analytical reagent grade and were used without additional purification. Ultrapure
water used for solution preparation was obtained from a Culligan (Spain) water purification system.
Lead and zinc salts were Titrisol (Merck). Sodium 2-hydroxypropanoate, sodium perchlorate and potassium
nitrate were of p.a. grade (Merck). Polymethacrylic acid (PMA) solution was reagent grade
(BDH) (average molar mass 26 000 g/mol) and it was used to prepare stock solution of 0.1 mol/l in water (in
monomers). The total number of carboxylic groups was determined by conductometric acid–base titration.
3.2. Instrumentation
Differential pulse (DP) polarograms were recorded using a drop time of 0.8 s, a pulse duration of 50 ms,
a pulse amplitude of −50 mV and a scan rate of 4 mV/s, with an Autolab System (Eco Chemie, The
Netherlands) attached to a Metrohm 663 VA stand. The mercury electrode in the static mercury-drop mode
was used as working electrode; Ag|AgCl|(3 M KCl), together with an electrolytic bridge containing
2 M NaClO4, was used as reference electrode, while glassy carbon was used as auxiliary electrode. All
solutions were initially deaerated using nitrogen for 15 min, and for 5 min after each ligand solution
addition. Measurements were performed at constant ionic strength, I = 2 M (NaClO4), constant pH = 5.7 and
constant temperature t = (23 ± 1) °C. DP polarograms were corrected for background current [15] and peak
potentials and peak currents were evaluated using parabolic fitting of the points around the peak [6]. The
peak potentials were determined within the same DP polarogram to ±0.3 mV, but the overall reproducibility
was not better than ±0.8 mV. The peak current was determined within the same polarogram to ±0.7%, but the
overall reproducibility was ±2% at 60 mM concentration level of Pb(II) ions. All DP polarograms showed
reversible behaviour having a peak potential half-width of 48 ± 1 mV.
In the voltammetric titrations of Zn(II) with PMA, differential pulse anodic stripping voltammograms
(DPASV) were obtained with a 646 VA Processor (Metrohm) and a 663 VA Stand (Metrohm) attached
to a Dosimat 665 (Metrohm) for the automatic addition of the titrant solutions. A pulse duration of 40 ms,
a pulse height of 50 mV, and a deposition potential of −1200 mV were used. The pre-electrolysis time and
the rest period used were 1 and 0.5 min, respectively. The scan rate in the stripping step was 10 mV/s.
Measurements were taken at (25 ± 1) °C after deaeration with purified nitrogen.
Acid–base conductometric titrations of PMA stock solutions were performed in an Orion cell
(k = 1.03 cm⁻¹) with an Orion 120 microprocessor conductivity meter.
More details on solution preparation and experimental procedure can be found in Refs. [6,16].
All calculations were performed using the ELCHEM toolbox for MATLAB, written in this laboratory for
electrochemical signal simulation, visualisation and manipulation [17,18].
4. Results and discussion
4.1. Leden–DeFord–Hume model
Error propagation and its influence on chemical model identification using the Leden–DeFord–Hume
model and the conventional polynomial fitting procedure were reported in a previous paper [6]. In
the same paper, the advantage of the forward evolving polynomial least-squares fitting (EPLSQF)
on simulated and experimental data of consecutively formed Pb(II) propanoate complexes was
demonstrated. In this paper, the backward as well as the forward EPLSQF were investigated, and the
procedure for detection and elimination of erroneous points is introduced. Lead(II) 2-hydroxypropanoate
was chosen as chemical model system and the values of consecutive stability constants selected for
simulation were: β1 = 130 M⁻¹, β2 = 1300 M⁻² and β3 = 3600 M⁻³, as
they are close to the ones obtained experimentally for the investigated system. The theoretical error-free
F_0,i function was calculated according to Eq. (9). Forward and backward EPLSQF to this error-free
F_0,i function were performed assuming different chemical models (i.e. number of complex species,
j = 1, 2, ..., m). The results obtained can be classified in the following three patterns:
1. When the number of species defining the chemical model is underestimated, the plot of β_j vs. [L]
does not give constant values for progressively increasing data window sizes. In Fig. 1, β_j values
together with corresponding error bars, obtained by forward ('f') and backward ('b') EPLSQF assuming
a chemical model having only one complex species (j = 1), are shown, while in Fig. 2 the
results obtained when the assumed chemical model has two consecutive complexes (j = 1 and
2) are plotted. From the results shown in Figs. 1 and 2, both chemical models can be easily
discarded because the values of the stability constants are not constant at all. The standard
errors obtained using error-free simulated data are a consequence of selecting and fitting the wrong
chemical model. The conventional PLSQF uses only one data window with all the points in it. This
approach gives only one point for β_j (the last point at highest ligand concentration in Figs. 1 and 2),
and the only way for model identification is to use the statistical criteria and physicochemical
reasoning. On the contrary, using EPLSQF, the increase in values of the stability constants, obtained with
different ligand concentration windows, can easily disqualify the wrong chemical model. It is
interesting to note that the extrapolation of the β_j values of the forward curves to zero ligand
concentration tends towards the 'true' value selected to simulate the Leden polynomials.
2. If the model is correctly identified, constant values for all stability constants are obtained
independently of the window size in both forward and backward EPLSQF directions. Error bars are not
visible on the scale of the graph shown in Fig. 3, when the error-free simulated data were evaluated.
Of course, with experimental error present, this constancy will not be so perfect, but the EPLSQ
approach will easily identify the correct chemical model, even in the presence of errors. Moreover,
this approach offers another advantage in visualisation, detection and elimination of the erroneous
points. The procedure for the latter will be explained in Section 4.2.
3. If the number of species is overestimated, the values of β_j vs. [L] for the existing complexes remain
constant and correct, as in Fig. 3; only the β_j values of the 'non-existent' higher complex(es) are
either very small (close to zero, in the case of error-free data) or negative. With simulated
error-corrupted data and with experimental data, the discrimination between models with higher j values
is made according to the lower number of points that fall within the selected interval.

Fig. 1. Values of β1 obtained by EPLSQF of error-free F0 Leden function simulated using β1 = 130 l/mol, β2 = 1300 (l/mol)² and
β3 = 3600 (l/mol)³, assuming only one complex species. Results marked with 'f' represent forward and those marked with 'b' backward
evolving direction.
Fig. 2. Values of β1 and β2 obtained by EPLSQF of error-free F0 Leden function simulated using β1 = 130 l/mol, β2 = 1300 (l/mol)² and
β3 = 3600 (l/mol)³, assuming two complex species. Results marked with 'f' represent forward and those marked with 'b' backward
evolving direction.
Fig. 3. Values of β1, β2 and β3 obtained by EPLSQF of error-free F0 Leden function simulated using β1 = 130 l/mol, β2 = 1300 (l/mol)² and
β3 = 3600 (l/mol)³, assuming three complex species. Results marked with 'f' represent forward and those marked with 'b' backward
evolving direction.
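The first pattern (an underestimated model) can be reproduced with a few lines of code. This is a sketch under stated assumptions: the 24 equally spaced ligand concentrations are hypothetical (the concentrations behind the figures are not listed here), and the one-complex model F0 − 1 = β1[L] is fitted by least squares without an intercept.

```python
TRUE_BETAS = (130.0, 1300.0, 3600.0)   # values used for simulation in the text

def f0(ligand, betas=TRUE_BETAS):
    """Error-free Leden function, right-hand side of Eq. (9)."""
    return 1.0 + sum(b * ligand ** j for j, b in enumerate(betas, start=1))

def beta1_one_complex(ligands, f0_values):
    """Least-squares beta_1 for the one-complex model F0 - 1 = beta_1 * [L]."""
    sxy = sum(l * (f - 1.0) for l, f in zip(ligands, f0_values))
    sxx = sum(l * l for l in ligands)
    return sxy / sxx

# Hypothetical ligand concentrations in mol/l (24 points).
ligands = [0.02 * k for k in range(1, 25)]
values = [f0(l) for l in ligands]

# Forward evolving fit: windows containing the first h points, h = 2 .. 24.
forward_beta1 = [beta1_one_complex(ligands[:h], values[:h])
                 for h in range(2, len(ligands) + 1)]
```

Because the neglected β2 and β3 terms contribute more at higher ligand concentration, every entry of `forward_beta1` exceeds the simulated value of 130 and the estimate grows steadily with window size, which is the trend that disqualifies the one-complex model.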
To demonstrate the effect of errors in measurement and evaluation of peak potentials and peak currents
when the Leden–DeFord–Hume model is used, random noise with the same initial pattern is added to the
error-free E_p,i, having a mean of 0 (<10⁻³ mV) and standard deviations of 0.3, 0.6 and 1.8 mV,
respectively. Random noise with a mean value of 0 and a standard deviation of 1% of the peak current is
added to the φ_i function as well. The stability constants obtained with noise-free and noise-corrupted
F0 functions are shown in Table 1. From the results shown in Table 1, it can be seen that the correct
chemical model cannot be identified unambiguously by any of the three statistical criteria used
(minimum of HRF, AIC and MQEP) and that no acceptable set of stability constants can be obtained at any
error level.
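Why sub-millivolt noise matters follows directly from the exponential in Eq. (9): an error δE in the peak-potential shift multiplies F0 by exp(zF·δE/RT). The sketch below evaluates this factor; the function name is hypothetical, and z = 2 and T = 296.15 K are assumptions matching Pb(II) at 23 °C.

```python
import math

F = 96485.332   # Faraday constant, C/mol
R = 8.314462    # molar gas constant, J/(mol K)

def f0_relative_error(delta_e_mv, z=2, temp_kelvin=296.15):
    """Relative change of F0 caused by a potential error of delta_e_mv millivolts."""
    exponent = z * F * (delta_e_mv * 1e-3) / (R * temp_kelvin)
    return math.exp(exponent) - 1.0

# Relative F0 error for the three noise levels used in the simulations.
relative_errors = {err: f0_relative_error(err) for err in (0.3, 0.6, 1.8)}
```

Under these assumptions the three noise levels correspond to relative errors in F0 of roughly 2%, 5% and 15%, respectively, before the 1% current noise is added, and these errors are amplified further when F0 is divided by small ligand concentrations in Eq. (10).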
One way of overcoming the influence of the error on fitting the polynomial to the experimental data is to
calculate the higher order Leden functions according to Eq. (10), which is a polynomial one degree lower
than F0, and to perform the conventional PLSQ fitting. The results given in Table 2 show that acceptable
sets (bold) of stability constants can be obtained just by
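fitting of the F1 polynomial, which is a simple transformation of the F0 polynomial shown in Eq. (10). When
using statistical criteria (HRF, AIC and MQEP), it seems that the most reliable and robust criterion for
chemical-model identification is the Akaike information criterion, although none of the criteria used
gave an unambiguous model identification when error-corrupted data were used.

Table 1
Log β_j values and corresponding standard errors obtained by conventional PLSQF of simulated error-free and error-corrupted F0 function for different chemical models (number of consecutively formed complexes j = 1, 2, ..., m; stability constants without standard deviation are obtained from simulated error-free data; standard error < 10⁻¹⁰ log units). Simulated values in all cases: β1 = 130 M⁻¹; β2 = 1300 M⁻²; β3 = 3600 M⁻³.

| m (a) | log β1 ± SE (f) | abs. Δ (b) | log β2 ± SE (f) | abs. Δ (b) | log β3 ± SE (f) | abs. Δ (b) | log β4 ± SE (f) | abs. Δ (b) | log β5 ± SE (f) | abs. Δ (b) | HRF (c) | AIC (d) | MQEP (e) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| F0 function (error-free) | | | | | | | | | | | | | |
| 1 | 3.01 ± 0.03 | 0.90 | | | | | | | | | 25.0 | 245 | 1410 |
| 2 | βi < 0 | | 3.53 ± 0.01 | 0.52 | | | | | | | 2.7 | 101 | 16 |
| 3 | 2.11 | 0.00 | 3.11 | 0.00 | 3.56 | 0.00 | | | | | 9×10⁻¹² | −1637 | 2×10⁻²² |
| 4 | 2.11 | 0.00 | 3.11 | 0.00 | 3.56 | 0.00 | βi < 0 | | | | 3×10⁻¹⁰ | −1412 | 2×10⁻¹⁹ |
| 5 | 2.11 | 0.00 | 3.11 | 0.00 | 3.56 | 0.00 | βi < 0 | | −6.02 ± 1.30 | 6.02 | 7×10⁻¹⁰ | −1347 | 1×10⁻¹⁸ |
| F0 function (errors in Ep = 0.3 mV and in Ip = 1%) | | | | | | | | | | | | | |
| 1 | 3.01 ± 0.03 | 0.90 | | | | | | | | | 25.7 | 248 | 1526 |
| 2 | βi < 0 | | 3.54 ± 0.01 | 0.43 | | | | | | | 4.4 | 134 | 45 |
| 3 | 2.34 ± 0.10 | 0.23 | 2.80 ± 0.20 | 0.31 | 3.69 ± 0.05 | 0.13 | | | | | 2.5 | 100 | 14 |
| 4 | βi < 0 | | 3.51 ± 0.13 | 0.40 | βi < 0 | | 4.13 ± 0.15 | 4.13 | | | 2.28 | 97 | 12.0 |
| 5 | 2.00 ± 0.49 | 0.11 | 3.14 ± 0.53 | 0.03 | 3.85 ± 0.61 | 0.29 | βi < 0 | | 4.56 ± 0.42 | 4.56 | 2.26 | 100 | 11.8 |
| F0 function (errors in Ep = 0.6 mV and in Ip = 1%) | | | | | | | | | | | | | |
| 1 | 3.02 ± 0.03 | 0.91 | | | | | | | | | 26.3 | 250 | 1622 |
| 2 | βi < 0 | | 3.55 ± 0.02 | 0.44 | | | | | | | 5.9 | 154 | 81 |
| 3 | 2.44 ± 0.13 | 0.33 | 2.29 ± 0.62 | 0.82 | 3.76 ± 0.07 | 0.20 | | | | | 4.1 | 134 | 40 |
| 4 | βi < 0 | | 3.66 ± 0.15 | 0.55 | βi < 0 | | 4.45 ± 0.15 | 4.45 | | | 3.74 | 131 | 33 |
| 5 | 1.93 ± 0.70 | 0.18 | 3.14 ± 0.70 | 0.03 | 3.99 ± 0.66 | 0.43 | βi < 0 | | 4.80 ± 0.42 | 4.80 | 3.72 | 133 | 32 |
| F0 function (errors in Ep = 1.8 mV and in Ip = 1%) | | | | | | | | | | | | | |
| 1 | 3.03 ± 0.03 | 0.92 | | | | | | | | | 26.3 | 250 | 1622 |
| 2 | βi < 0 | | [missing] | 0.47 | | | | | | | 5.9 | 154 | 81 |
| 3 | 2.71 ± 0.18 | 0.60 | βi < 0 | | 3.97 ± 0.11 | 0.41 | | | | | 4.1 | 134 | 40 |
| 4 | βi < 0 | | 3.99 ± 0.18 | 0.88 | βi < 0 | | 4.77 ± 0.15 | 4.77 | | | 3.74 | 131 | 33 |
| 5 | 1.77 ± 1.12 | 0.34 | 2.81 ± 1.35 | 0.03 | 4.42 ± 0.65 | 0.86 | βi < 0 | | 5.26 ± 0.38 | 5.26 | 3.72 | 133 | 32 |

(a) Maximal number of complex species.
(b) Absolute difference between initially assumed and obtained value of stability constants in log units.
(c) Hamilton R-factor.
(d) Akaike information criterion.
(e) Mean quadratic error of prediction.
(f) SE = standard error.

Table 2
Log β_j values and corresponding standard errors obtained by conventional PLSQF of simulated error-free and error-corrupted F1 function for different chemical models (number of consecutively formed complexes j = 2, 3, ..., m; stability constants without standard deviation are obtained from simulated error-free data; standard error < 10⁻¹⁰ log units). Simulated values in all cases: β1 = 130 M⁻¹; β2 = 1300 M⁻²; β3 = 3600 M⁻³.

| m (a) | log β1 ± SE (f) | abs. Δ (b) | log β2 ± SE (f) | abs. Δ (b) | log β3 ± SE (f) | abs. Δ (b) | log β4 ± SE (f) | abs. Δ (b) | log β5 ± SE (f) | abs. Δ (b) | HRF (c) | AIC (d) | MQEP (e) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| F1 function (error-free) | | | | | | | | | | | | | |
| 2 | 1.80 ± 0.07 | 0.31 | 3.41 ± 0.01 | 0.30 | | | | | | | 7.7 | 249 | 1586 |
| 3 | 2.11 | 0.00 | 3.11 | 0.00 | 3.56 | 0.00 | | | | | 4×10⁻¹³ | −1774 | 3×10⁻²⁴ |
| 4 | 2.11 | 0.00 | 3.11 | 0.00 | 3.56 | 0.00 | βi < 0 | | | | 7×10⁻¹² | −1575 | 1×10⁻²¹ |
| 5 | 2.11 | 0.00 | 3.11 | 0.00 | 3.56 | 0.00 | βi < 0 | | 0 ± 1.5×10⁻⁶ | 0 | 2×10⁻¹⁰ | −1359 | 8×10⁻¹⁹ |
| F1 function (errors in Ep = 0.3 mV and in Ip = 1%) | | | | | | | | | | | | | |
| 2 | 1.78 ± 0.09 | 0.33 | 3.42 ± 0.01 | 0.31 | | | | | | | 8.5 | 257 | 1979 |
| 3 | 2.11 ± 0.02 | 0.00 | 3.11 ± 0.03 | 0.00 | 3.57 ± 0.03 | 0.01 | | | | | 3.1 | 192 | 253 |
| 4 | 2.08 ± 0.03 | 0.03 | 3.20 ± 0.06 | 0.09 | 3.23 ± 0.27 | 0.33 | 3.55 ± 0.24 | 3.55 | | | 3.0 | 193 | 237 |
| 5 | 2.13 ± 0.04 | 0.02 | 2.93 ± 0.19 | 0.18 | 4.01 ± 0.17 | 0.45 | βi < 0 | | 4.64 ± 0.19 | 4.64 | 2.8 | 192 | 212 |
| F1 function (errors in Ep = 0.6 mV and in Ip = 1%) | | | | | | | | | | | | | |
| 2 | 1.76 ± 0.10 | 0.35 | 3.42 ± 0.01 | 0.31 | | | | | | | 9.6 | 264 | 2515 |
| 3 | 2.11 ± 0.04 | 0.00 | 3.10 ± 0.05 | 0.01 | 3.59 ± 0.05 | 0.03 | | | | | 5.0 | 225 | 694 |
| 4 | 2.06 ± 0.06 | 0.05 | 3.25 ± 0.09 | 0.14 | 2.68 ± 0.80 | 0.88 | 3.77 ± 0.24 | 3.77 | | | 4.9 | 226 | 651 |
| 5 | 2.14 ± 0.06 | 0.03 | 2.75 ± 0.37 | 0.36 | 4.16 ± 0.19 | 0.60 | βi < 0 | | 4.86 ± 0.19 | 4.86 | 4.6 | 226 | 582 |
| F1 function (errors in Ep = 1.8 mV and in Ip = 1%) | | | | | | | | | | | | | |
| 2 | 1.70 ± 0.17 | 0.41 | 3.44 ± 0.02 | 0.33 | | | | | | | 15.3 | 297 | 6836 |
| 3 | 2.11 ± 0.09 | 0.00 | 3.09 ± 0.13 | 0.02 | 3.63 ± 0.10 | 0.07 | | | | | 12.6 | 288 | 4612 |
| 4 | 1.95 ± 0.16 | 0.16 | 3.40 ± 0.15 | 0.39 | βi < 0 | | 4.17 ± 0.24 | 4.17 | | | 12.2 | 289 | 4340 |
| 5 | 2.13 ± 1.18 | 0.02 | 2.81 ± 0.83 | 0.30 | 3.99 ± 0.85 | 0.43 | βi < 0 | | 4.33 ± 1.28 | 4.33 | 11.5 | 291 | 3840 |

(a) Maximal number of complex species.
(b) Absolute difference between initially assumed and obtained value of stability constants in log units.
(c) Hamilton R-factor.
(d) Akaike information criterion.
(e) Mean quadratic error of prediction.
(f) Standard error.

Fig. 4. Values of β1 with corresponding error bars obtained by forward EPLSQF of error-corrupted (error equivalent to 0.3 mV in peak
potential and 1% in peak current) F0 Leden function simulated using β1 = 130 l/mol, β2 = 1300 (l/mol)² and β3 = 3600 (l/mol)³, assuming three
complex species. Result marked with 'o' represents the value obtained by conventional PLSQF.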
The other approach to minimise the error influence is the use of EPLSQF. Fig. 4 shows clearly why
EPLSQF is preferred over the conventional LSQF. In this figure, the β1 values obtained by forward
EPLSQF at an error level in peak potential of 0.3 mV are shown together with corresponding standard
errors. The conventional PLSQF of the F0 polynomial on experimental data will give the result shown as
the point at the highest ligand concentration (point marked 'o'), which even for the smallest error level in
Ep (0.3 mV) does not give an acceptable set of stability constants (see Table 1). Moreover, any data
window size can be chosen in the experimental design, which means that, using conventional PLSQF,
any of the different values of β_j shown in Fig. 4 might be obtained, and some of these fluctuations are
unacceptably large. Therefore, the mean value of a stability constant obtained using EPLSQF will be
more reliable than the single value obtained by conventional PLSQF.
The use of EPLSQF in backward direction is not recommended when using the Leden–DeFord–Hume
model for stability constant determination, unless a weighted LSQ procedure is applied, because the
polynomial values in the initial window (highest concentration points) prevail and bias the obtained
results, as can be clearly seen in Fig. 5. The results obtained by EPLSQF in backward direction ('b') are
incorrect (too far from the initial simulated value) and their standard errors are unacceptably high. For
some other mathematical models, however, backward EPLSQ analysis can be more useful than the forward
one, as will be shown later when using the van Leeuwen model.
Fig. 5. Values of β1 with corresponding error bars obtained by forward ('f') and backward ('b') EPLSQF of error-corrupted (error equivalent
to 0.3 mV in peak potential and 1% in peak current) F0 function simulated using β1 = 130 l/mol, β2 = 1300 (l/mol)² and β3 = 3600 (l/mol)³,
assuming three complex species.

Another advantage of the EPLSQF procedure is that, together with simple model identification, it
enables visualisation, detection and elimination of erroneous points. The procedure for this is as follows:
(i) Eliminate negative β_j values, because they do not have any physicochemical meaning.
(ii) Convert all positive β_j values to the log β_j domain and set the criterion of erroneous points
elimination to ±0.3 log units. This value is selected because it can be easily demonstrated, using the
t-test, that stability constants having this standard error (or standard deviation) do not differ
significantly from zero at the 95% confidence level (it corresponds to an error of 100% in
non-logarithmic units). The 0.3 log units elimination limit is taken very conservatively, because in
practice this elimination limit can be even lower, depending on the degrees of freedom.
(iii) Eliminate the points outside this limit and calculate the new mean and standard deviations.
The dispersion of points and their standard errors of the Leden polynomial is higher in the lower
ligand concentration range, partially due to division by small concentration values. Therefore, the
reference mean β_j values should be calculated from a data window in the higher ligand concentration
range in order to avoid the strongly biased erroneous points in the lower ligand concentration
range influencing the calculation of the mean and corresponding standard deviations.
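The three steps above can be sketched as a short function. This is an illustrative reading of the procedure rather than the authors' code: the function name and the `n_reference` parameter (how many of the highest-concentration windows define the reference mean) are assumptions. Note that 0.3 log units corresponds to a factor of 10^0.3 ≈ 2, i.e. an error of about 100% in non-logarithmic units.

```python
import math
from statistics import mean, stdev

def eliminate_erroneous_points(betas, n_reference, limit=0.3):
    """Apply steps (i)-(iii) to beta_j values from an evolving fit.

    betas:       beta_j estimates ordered by increasing window size
                 (i.e. towards higher ligand concentration)
    n_reference: number of trailing positive values used for the reference mean
    limit:       elimination limit in log units
    Returns (kept_log_values, new_mean, new_sd, n_eliminated).
    Needs at least two surviving points for the standard deviation.
    """
    # (i) discard physically meaningless (non-positive) values
    logs = [math.log10(b) for b in betas if b > 0]
    # reference mean taken from the higher ligand concentration range
    reference = mean(logs[-n_reference:])
    # (ii) keep only points within +/- limit log units of the reference
    kept = [v for v in logs if abs(v - reference) <= limit]
    # (iii) new mean and standard deviation of the surviving points
    return kept, mean(kept), stdev(kept), len(betas) - len(kept)
```

The count of eliminated points returned here includes the negative values removed in step (i), which matches the way eliminated/total ratios are reported for each stability constant.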
In Fig. 6, an example of detection and elimination of erroneous points is shown, together with the
discrimination between the models with higher complex species. Log β1 values obtained assuming
m = 3 (*) and those obtained assuming m = 4 consecutively formed complexes in the system are presented.
The simulated initial value is log β1 = 2.11 (– – –). The mean value of log β1 obtained with the model
having m = 3 is 2.24 (solid line) and the elimination limits of ±0.3 log units are marked with (– –).
Assuming m = 3, the points are less scattered and only six points fall outside the elimination limit
using an error of 0.3 mV in peak potential and an error of 1% in measured peak current, respectively.
Assuming m = 4, the number of points eliminated using the same criteria is almost twice as large (11 out
of 24 points eliminated). The situation is even less favourable when analysing higher constants.
Assuming m = 4, the values of β4 are unacceptable, because 13 points out of 24 have values smaller than 0,
and the mean value obtained is β4 = 1.7×10⁵ ± 1.1×10⁶, which statistically does not differ from 0.
Therefore, by using EPLSQF, the chemical model identification is simple, while the elimination of
several erroneous points gives a more reliable set of stability constants than those obtained using the
conventional PLSQF (see Table 3).
To verify the advantages of the principle of the
EPLSQF on experimental results, Pb(II) complexation
with 2-hydroxypropanoate system was investigated by
DPP at I2 M (NaClO4). Using conventional PLSQF
of both, F0 and F1 polynomials on experimental data,
three complex species are identi®ed in the system and
acceptable sets of stability constants are obtained
Fig. 6. Values of log β1 obtained by forward EPLSQF of the error-corrupted F0 function (error equivalent to 0.3 mV in peak potential and 1% in peak
current), simulated using β1 = 130 l/mol, β2 = 1300 (l/mol)^2 and β3 = 3600 (l/mol)^3, assuming (*) three and () four complex
species. (–––) the obtained mean value of log β1 = 2.24; (– – –) the initial 'true' value (log β1 = 2.11); and (– –) the elimination
intervals of ±0.3 log units.
Table 3
Evolving PLSQF of simulated error-free and error-corrupted F0 and F1 functions. Log βj values and corresponding standard deviations are
obtained as a mean of progressively increasing (in forward direction) window size assuming the chemical system with three consecutively
formed complexes (m = 3)

m   error/mV   log β1 ± SD^d   |Δ|^b   e/t^c   log β2 ± SD^d   |Δ|^b   e/t^c   log β3 ± SD^d   |Δ|^b   e/t^c
F0 function (error in Ip = 1%); β1 = 130 M^-1; β2 = 1300 M^-2; β3 = 3600 M^-3
3   0.3        2.24 ± 0.11     0.13    6/24    3.14 ± 0.14     0.03    10/24   3.64 ± 0.14     0.08    15/24
3   0.6        2.29 ± 0.13     0.18    8/24    3.14 ± 0.09     0.03    14/24   3.68 ± 0.15     0.12    17/24
3   1.8        2.43 ± 0.12     0.32    13/24   3.54 ± 0.15     0.02    17/24   4.00 ± 0.15     0.44    20/24
F1 function (error in Ip = 1%); β1 = 130 M^-1; β2 = 1300 M^-2; β3 = 3600 M^-3
3   0.3        2.10 ± 0.03     0.01    0/24    3.16 ± 0.10     0.05    1/24    3.61 ± 0.10     0.05    7/24
3   0.6        2.08 ± 0.04     0.03    0/24    3.16 ± 0.11     0.05    2/24    3.58 ± 0.11     0.02    9/24
3   1.8        2.07 ± 0.07     0.04    2/24    3.24 ± 0.14     0.13    8/24    3.51 ± 0.14     0.05    15/24

b Absolute difference between initially assumed and obtained value of stability constants in log units.
c Eliminated/total number of stability constant values.
d Standard deviation.
(Table 4, bold) with the most probable values of the stability
constants by EPLSQF of the F1 function: log(β1/M^-1) =
2.26 ± 0.01, log(β2/M^-2) = 3.13 ± 0.08 and log(β3/M^-3) =
3.79 ± 0.10, which are in very good agreement
with results reported earlier for the same system using
different electroanalytical techniques [19,20].
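As a worked illustration of what these constants imply, the following sketch evaluates the Leden functions and the species distribution at one ligand concentration. It assumes the standard definitions F0 = 1 + Σ βj[L]^j and F1 = (F0 − 1)/[L], and treats the supplied concentration as the free-ligand concentration; the function name and the chosen concentration of 0.1 mol/l are hypothetical.

```python
def leden_functions(log_betas, ligand):
    """Return F0, F1 and species fractions at free-ligand concentration `ligand` (mol/l)."""
    # Term 0 is the free metal ion; term j is beta_j * [L]^j for the complex ML_j
    terms = [1.0] + [10.0 ** lb * ligand ** (j + 1) for j, lb in enumerate(log_betas)]
    f0 = sum(terms)                       # F0 = 1 + sum_j beta_j [L]^j
    f1 = (f0 - 1.0) / ligand              # F1 = (F0 - 1) / [L]
    fractions = [t / f0 for t in terms]   # mole fraction of each metal species
    return f0, f1, fractions

# Most probable log(beta) values quoted in the text for the Pb(II) system
log_betas = [2.26, 3.13, 3.79]
f0, f1, fractions = leden_functions(log_betas, ligand=0.1)
```

The returned fractions sum to unity by construction, which gives a quick consistency check when other sets of constants are substituted.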
4.2. Van Leeuwen model
In order to demonstrate that EPLSQF is a useful
general approach for reliable system identification and
for the detection and elimination of erroneous influential points,
a generalised model which holds for metal ion and
Table 4
Log βj values and corresponding standard errors obtained by conventional PLSQF of F0 and F1 functions, obtained using DPP for different
chemical models (maximal number of consecutively formed complexes, m) in solutions of Pb(II) 2-hydroxypropanoates. c(Pb2+) = 60 mM;
I = 2 M (NaClO4); pH = 5.7 and t = (23 ± 1) °C

m   log β1 ± SE^e   log β2 ± SE^e   log β3 ± SE^e   log β4 ± SE^e   HRF^b   AIC   MQEP
F0 function
1   3.11 ± 0.03     -               -               -               25.4    262   2359
2   β1 < 0          3.64 ± 0.11     -               -               3.6     137   49
3   2.37 ± 0.08     3.04 ± 0.10     3.75 ± 0.04     -               1.6     86    9
4   1.76 ± 0.40     3.50 ± 0.11     β3 < 0          4.03 ± 0.15     1.5     83    8
F1 function
2   1.93 ± 0.08     3.51 ± 0.01     -               -               8.5     272   3113
3   2.25 ± 0.01     3.18 ± 0.02     3.69 ± 0.02     -               2.0     178   164
4   2.24 ± 0.02     3.20 ± 0.05     3.64 ± 0.11     2.98 ± 0.51     1.9     180   163

b Hamilton R-factor.
e SE = standard error.
Table 5
Evolving PLSQF of F0 and F1 functions, obtained using DPP for different chemical models (maximal number of consecutively formed
complexes, m) in solutions of Pb(II) 2-hydroxypropanoates. c(Pb2+) = 60 mM; I = 2 M (NaClO4); pH = 5.7 and t = (23 ± 1) °C. Log βj values and
corresponding standard deviations are obtained as a mean of progressively increasing (in forward direction) window size, assuming the
chemical system with three and four consecutively formed complexes (m = 3 and 4)

m   log β1 ± SD^c   e/t^b   log β2 ± SD^c   e/t^b   log β3 ± SD^c   e/t^b   log β4 ± SD^c   e/t^b
F0 function
3   2.26 ± 0.10     2/26    3.23 ± 0.12     8/26    3.64 ± 0.13     11/24   -               -
4   1.95 ± 0.13     16/26   3.50 ± 0.10     12/26   4.20 ± 0.10     20/24   4.17 ± 0.11     20/24
F1 function
3   2.26 ± 0.01     0/26    3.13 ± 0.08     0/26    3.79 ± 0.10     3/26    -               -
4   2.26 ± 0.02     0/26    3.10 ± 0.09     6/26    3.85 ± 0.12     11/26   4.60 ± 0.18     21/24

b Eliminated/total number of stability constant values.
c Standard deviation.
ligands with very different diffusion coefficients,
developed by van Leeuwen's group [11–14], has
also been analysed using simulated data. The experimental verification of EPLSQF was done using
DPASV data obtained for the complexation system of
Zn(II) with the macromolecular ligand PMA [14,16,21]
(see Table 5).
According to Eq. (12), the φ vs. [L] curve was simulated using log(β/M^-1) = 4.50, ε = 0.02 and p = 2/3. To
this curve, random errors with the same initial pattern,
a mean value of 0 and standard deviations of
0.3, 0.6, 1.1 and 1.7% of the maximum peak current
were added. These simulated data were evaluated by a
conventional LSQF procedure using analytical derivatives with respect to the three parameters (β, ε and p) in Eq. (12).
The results are presented in Table 6. As can be seen,
only error-free data gave the correct initial values.
Noise-corrupted data with 0.3 and 0.6% noise levels
gave acceptable values of β, but the evaluated values of ε
and p were unacceptable. With increased noise levels
(1.1 and 1.7%) the LSQF did not converge. This
simulation demonstrates that a three-parameter
Table 6
Log β, ε and p values from Eq. (5) and corresponding standard errors obtained by three-parameter conventional LSQF of simulated error-free
and error-corrupted (different error levels) φ vs. c(ligand) functions (m = maximal number of complex species)

(error)/%   log β1 ± SE^e      |Δ|^a   ε ± SE^e            |Δ|^a   p ± SE^e           |Δ|^a   HRF^b      AIC     MQEP
0.0         4.50 ± 9×10^-16    0.00    0.020 ± 3×10^-16    0.000   0.67 ± 1×10^-15    0.000   8×10^-15   −1401   6×10^-33
0.3         4.60 ± 0.08        0.10    −0.004 ± 0.016      0.024   0.56 ± 0.07        0.107   0.25       −219    6×10^-6
0.6         4.69 ± 0.12        0.19    −0.015 ± 0.017      0.035   0.49 ± 0.09        0.177   0.50       −193    2×10^-5
1.1         Does not converge!
1.7         Does not converge!

a Absolute difference between evaluated and initial simulated value.
b Hamilton R-factor.
e Standard error.
Table 7
Log β and ε values from Eq. (5) with p value fixed (p = 2/3) and corresponding standard errors obtained by two-parameter conventional LSQF
of simulated error-free and error-corrupted (different error levels) φ vs. c(ligand) functions (m = maximal number of complex species)

(error)/%   log β1 ± SE^e   |Δ|^a   ε ± SE^e            |Δ|^a   HRF^b   AIC    MQEP
0.0         4.50 ± 0.00     0.00    0.020 ± 4×10^-17    0.000   0.00    −∞     0.00
0.3         4.49 ± 0.01     0.01    0.019 ± 0.004       0.001   0.45    −200   2×10^-5
0.6         4.49 ± 0.02     0.02    0.017 ± 0.008       0.003   0.90    −174   8×10^-5
1.1         4.48 ± 0.03     0.03    0.014 ± 0.015       0.006   1.80    −148   3×10^-4
1.7         4.46 ± 0.05     0.05    0.011 ± 0.023       0.009   2.65    −133   7×10^-4

a Absolute difference between evaluated and initial simulated value.
b Hamilton R-factor.
e Standard error.
non-linear LSQF procedure cannot be used with the
model described by Eq. (12), because even the lowest
error level (0.3%) gives incorrect values for the evaluated parameters. Therefore, a two-parameter LSQF
procedure is recommended, fixing the value of p to
either 1/2 or 2/3, according to the electrochemical
technique used [14].
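The benefit of fixing p can be illustrated with a minimal sketch. Since Eq. (12) is not reproduced in this part of the text, the sketch assumes the commonly used form φ = [(1 + εβ[L])/(1 + β[L])]^p; it also replaces the non-linear LSQF with a linearised two-parameter fit, y − 1 = a[L] − b[L]y with y = φ^(1/p), a = εβ and b = β, which is a simplification and weights the errors differently from a direct fit of φ. The function names and the concentration grid are hypothetical.

```python
import math

def phi_model(conc, log_beta, eps, p):
    """Assumed model: phi = ((1 + eps*beta*c) / (1 + beta*c)) ** p."""
    beta = 10.0 ** log_beta
    return [((1.0 + eps * beta * c) / (1.0 + beta * c)) ** p for c in conc]

def fit_two_parameters(conc, phi, p):
    """Estimate (log beta, eps) with p fixed, using the linearised form
    y - 1 = a*c - b*(c*y), where y = phi**(1/p), a = eps*beta, b = beta."""
    y = [f ** (1.0 / p) for f in phi]
    u = list(conc)                            # regressor multiplying a
    v = [-c * yi for c, yi in zip(conc, y)]   # regressor multiplying b
    t = [yi - 1.0 for yi in y]                # left-hand side
    suu = sum(ui * ui for ui in u)
    suv = sum(ui * vi for ui, vi in zip(u, v))
    svv = sum(vi * vi for vi in v)
    sut = sum(ui * ti for ui, ti in zip(u, t))
    svt = sum(vi * ti for vi, ti in zip(v, t))
    det = suu * svv - suv * suv               # determinant of the 2x2 normal matrix
    a = (sut * svv - suv * svt) / det
    b = (suu * svt - suv * sut) / det
    return math.log10(b), a / b

# Error-free curve with the simulated values quoted in the text
conc = [2.0e-5 * i for i in range(1, 21)]     # hypothetical ligand concentrations, mol/l
phi = phi_model(conc, log_beta=4.50, eps=0.02, p=2.0 / 3.0)
log_beta_fit, eps_fit = fit_two_parameters(conc, phi, p=2.0 / 3.0)
```

With p fixed the problem reduces to two unknowns, so the error-free curve returns the simulated values; noise can be added to `phi` to explore how quickly the estimate of ε degrades relative to that of log β.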
In Table 7, the results obtained using conventional
LSQF to evaluate β and ε values from simulated noise-free and noise-corrupted φ vs. [L] curves are shown.
An acceptable set of these two parameters was
obtained only by fitting Eq. (12) to error-free data
and to data having a noise level of 0.3 and 0.6% (bold).
With higher simulated noise levels (1.1 and 1.7%) the
evaluated values of β are acceptable, but the parameter ε does not differ statistically from zero and
consequently is not reliable. Graphical representations
of log β and ε vs. forward and backward ligand
concentration data window, with a 1.1% error level
added, are shown in Fig. 7. It can be seen that the parameters obtained from the initial data windows (forward
and backward) are incorrect (far from the simulated
values) and have quite large standard errors. As the
data window increases, the parameters become constant and coincide with the simulated values. The
ELSQF approach reveals visually all the conclusions
made by van den Hoop et al. [14], who analysed
Eq. (12) analytically. From the evaluation of
simulated error-free and noise-corrupted data it can
be concluded that the van Leeuwen model for 1:1
M–ligand complexation stoichiometry is quite error-robust for the determination of stability constants but,
at the same time, very error-sensitive for the determination of ε.
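The forward and backward evolving data windows referred to above can be generated as in the following sketch. It assumes that the data points are sorted by increasing ligand concentration and that each window grows by one point at a time; the function name and the minimum window size are hypothetical.

```python
def evolving_windows(n_points, min_size, direction="forward"):
    """Yield (start, stop) slice bounds of progressively growing data windows.

    forward  : windows start at the lowest concentration and extend upward
    backward : windows start at the highest concentration and extend downward
    """
    if direction not in ("forward", "backward"):
        raise ValueError("direction must be 'forward' or 'backward'")
    for size in range(min_size, n_points + 1):
        if direction == "forward":
            yield 0, size
        else:
            yield n_points - size, n_points

# Six data points, smallest window of three points
forward = list(evolving_windows(6, 3, "forward"))
backward = list(evolving_windows(6, 3, "backward"))
```

Each pair can be used directly as `data[start:stop]`; the last window in either direction covers the complete data set, so both sequences end at the conventional full-range fit.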
Fig. 7. Log β and ε values obtained by ELSQF of Eq. (12) to simulated data with added error (mean 0 and standard deviation of 1.1%) for forward ('f') and backward ('b') ligand concentration direction.
Fig. 8. Experimental φ vs. c(-COOH, PMA)/mM (points) and LSQ-fitted (full line) curves obtained by DPASV for the Zn(II) PMA system with
(*) c(KNO3) = 0.04 M and () 0.10 M. c(Zn2+) = 1 mM; αn = 0.8; and t = (25 ± 1) °C.
To verify the ELSQF approach, experimental points
obtained by DPASV for the complexation of Zn(II) with
PMA, at αd = 0.8 and two different concentrations of
KNO3 (0.04 and 0.10 M), were used (Fig. 8). Conventional LSQ-fitted curves obtained using p = 2/3 are
shown as full lines in the same figure. In Table 8, the
results obtained by conventional LSQF using p = 2/3
and 1/2 are presented. Again, log β values are acceptable, regardless of the fixed p value, but ε values are
quite dispersed and not fully reliable. The three statistical
parameters shown in Table 8 indicate that a slightly
better fit is obtained when p = 2/3 is applied,
Table 8
Log β, ε values and corresponding standard errors obtained by two-parameter conventional LSQF. Experimental φ vs. c(-COOH, PMA) curves
were obtained by DPASV in solutions of Zn(II) PMA with c(KNO3) = 0.04 and 0.10 M. c(Zn2+) = 1 mM; αn = 0.8; t = (25 ± 1) °C. The LSQF
procedure was performed using p = 1/2 and 2/3

c(KNO3)/M   p     log β1 ± SE^d   ε ± SE^d            HRF^a   AIC    MQEP
0.04        2/3   4.54 ± 0.03     0.04 ± 0.01         2.0     −142   4×10^-4
0.04        1/2   4.72 ± 0.03     −0.0004 ± 0.0063    3.1     −126   1×10^-3
0.10        2/3   3.97 ± 0.03     0.13 ± 0.05         1.1     −135   6×10^-4
0.10        1/2   4.07 ± 0.03     0.04 ± 0.03         1.2     −133   7×10^-4

a Hamilton R-factor.
d Standard error.
Fig. 9. Log β and ε values obtained by forward ('f') and backward ('b') ELSQF of Eq. (5) to experimental data obtained by DPASV for the Zn(II)
PMA system with (*) c(KNO3) = 0.04 M and () 0.10 M. c(Zn2+) = 1 mM; αn = 0.8; t = (25 ± 1) °C.
which is theoretically expected using DPASV
[11–14].
The results obtained using the ELSQF procedure are
shown in Table 9 for the forward, backward and cumulative forward and backward evolving directions,
obtained using the procedure for erroneous points
detection and elimination. In Fig. 9, log β and ε values
are shown with corresponding error bars obtained in
solutions with 0.1 M KNO3. It can be seen that the ELSQF
approach discloses that the scattering of the obtained parameters and their standard errors is greater in the initial
data windows of the forward than of the backward
ELSQF direction, where very good constancy of the
parameters was obtained. This can be due to the
greater experimental error or deviation from the model
at lower ligand concentrations, as can be seen from the
graphs presented in Fig. 8. In Fig. 9, some of the
initial points of the forward ELSQF have been deliberately omitted because they are very far off the
scale. Therefore, it seems that for this system and
model the data obtained by ELSQF in the backward direction
are more reliable than those obtained in the forward
direction, because the latter needs elimination of more
erroneous points. Again, as can be concluded from
Table 9
Log β, ε values and corresponding standard deviations obtained by two-parameter evolving LSQF. Experimental φ vs. c(-COOH, PMA) curves were obtained by DPASV in
solutions of Zn(II) PMA with c(KNO3) = 0.04 and 0.10 M. c(Zn2+) = 1 mM; αn = 0.8; t = (25 ± 1) °C. The evolving LSQF procedure was performed in forward and backward directions
using p = 2/3

Direction              c(KNO3)/M   log β ± SD^b   e/t^a   ε ± SD^b         e/t^a
Forward                0.04        4.49 ± 0.05    0/15    0.031 ± 0.007    6/15
Forward                0.10        3.91 ± 0.08    6/15    0.11 ± 0.03      9/15
Backward               0.04        4.56 ± 0.02    0/15    0.043 ± 0.004    0/15
Backward               0.10        3.99 ± 0.04    0/15    0.14 ± 0.02      0/15
Forward and backward   0.04        4.52 ± 0.05    0/30    0.038 ± 0.07     6/30
Forward and backward   0.10        3.96 ± 0.08    6/30    0.13 ± 0.03      9/30

a Eliminated/total number of points.
b Standard deviation.
simulated data, the log β values are more reliable than
the ε values, and a simple visual and statistical detection
and elimination of influential erroneous points is
possible, because the values of the φi vs. [L]i function
are always within the scaled range (0 to 1), and an
error estimation is possible using statistical criteria.
Acknowledgements
The authors gratefully acknowledge financial support from the Spanish Ministry of Education and
Science, DGICYT Projects PB93-1055 (1994–1997)
and PB96-0379-C03-01 (1997–2000). B.S. Grabarić
also acknowledges financial support from the following: Spanish Ministry of Education and Science
(October 1996–March 1997), General Direction of
Universities, Generalitat de Catalunya (April 1997–
August 1997) and IBERDROLA (October 1997–
March 1998) for a visiting professorship of science
and technology.
References
[1] A.E. Martell, R.M. Smith, Critical Stability Constants, vols. 1–5, Plenum Press, New York, 1974–1982.
[2] J. Buffle, Complexation Reactions in Aquatic Systems. An Analytical Approach, Ellis Horwood, Chichester, 1988.
[3] H.P. van Leeuwen, R. Cleven, J. Buffle, Pure Appl. Chem. 61 (1989) 255.
[4] J. Wang, Electroanalytical Techniques in Clinical Chemistry and Laboratory Medicine, VCH, New York, 1988.
[5] M.R. Smyth, J.G. Vos (Eds.), Analytical Voltammetry, in: G. Svehla (Ed.), Wilson and Wilson's Comprehensive Analytical Chemistry, vol. XXVII, Elsevier, Amsterdam, 1992.
[6] B.S. Grabarić, Z. Grabarić, M. Esteban, E. Casassas, Anal. Chim. Acta 325 (1996) 135.
[7] I. Leden, Z. Physik. Chem. (Leipzig) 188A (1941) 160.
[8] D.D. DeFord, D.N. Hume, J. Am. Chem. Soc. 73 (1951) 5321.
[9] P. Gans, Data Fitting in the Chemical Sciences – By the Method of Least Squares, John Wiley and Sons, Chichester, 1992.
[10] M. Meloun, J. Militky, M. Forina, Chemometrics for Analytical Chemistry, PC-aided Regression and Related Methods, vol. 2, Ellis Horwood, New York, 1994.
[11] H.G. de Jong, H.P. van Leeuwen, K. Holub, J. Electroanal. Chem. 234 (1987) 1.
[12] H.G. de Jong, H.P. van Leeuwen, J. Electroanal. Chem. 234 (1987) 17.
[13] H.G. de Jong, H.P. van Leeuwen, J. Electroanal. Chem. 235 (1987) 1.
[14] M.A.G.T. van den Hoop, F.M.R. Leus, H.P. van Leeuwen, Coll. Czech. Chem. Commun. 56 (1991) 96.
[15] A.M. Bond, B.S. Grabarić, Anal. Chem. 51 (1979) 337.
[16] J.M. Díaz-Cruz, C. Ariño, M. Esteban, E. Casassas, Biophys. Chem. 45 (1992) 109.
[17] MATLAB High-Performance Numeric Computation and Visualisation Software, Reference Guide, The MathWorks, Inc., Natick, 1992.
[18] B.S. Grabarić, MATLAB toolbox ELCHEM for electrochemical signal simulation, visualisation and evaluation, Department of Analytical Chemistry, University of Barcelona, Barcelona, 1997.
[19] I. Filipović, M. Tkalčec, B. Grabarić, Inorg. Chem. 29 (1990) 1092 (and references cited therein).
[20] M. Tkalčec, B. Grabarić, I. Filipović, Anal. Chim. Acta 143 (1982) 255.
[21] M. Esteban, H.G. de Jong, H.P. van Leeuwen, Int. J. Environ. Anal. Chem. 38 (1990) 75.