Anais do XIX Congresso Brasileiro de Automática, CBA 2012.
IDENTIFICATION OF THE STATIONARITY IN BIOLOGICAL TIME SERIES
Luciana R. Nicacio∗, Carlos D. Maciel∗, David M. Simpson†, Philip L. Newland‡, Giovana Y. Nakashima§
∗ Laboratory of Signal Processing, Electrical Eng. Dept., EESC, University of São Paulo
Av. Trabalhador São Carlense, 400
São Carlos, SP, Brazil
† ISVR, University of Southampton
University Road, Highfield, SO17 1BJ
Southampton, United Kingdom
‡ School of Biological Science, University of Southampton
University Road, Highfield, SO17 1BJ
Southampton, United Kingdom
§ Federal Institute of Education, Science and Technology of São Paulo, Campus Salto
R. Rio Branco, 1780
Salto, SP, Brazil
Emails: lucynicacio@sc.usp.br, maciel@sc.usp.br, ds@isvr.soton.ac.uk, pln@soton.ac.uk, gyuko@ifsp.edu.br
Abstract— Wide-sense stationarity is a requirement that a time series must satisfy before some statistical tools can be used in its analysis. However, most biological signals are non-stationary time series, so to analyze them correctly it is necessary to find the segments of the series that satisfy the stationarity condition. Assuming that a non-stationary time series is formed by the concatenation of stationary segments, it is possible to create an algorithm able to identify these segments. For this purpose, the z-test and Bartlett’s test were used to locate the points at which statistical properties of the series, such as mean and variance, change abruptly, and the series was then split at these points. After the change points were detected, the run test and the trend test were used to verify whether the segments produced by the z-test and Bartlett’s test were indeed stationary.
Keywords— Biological Time Series, Hypothesis Test, Stationarity.
1 Introduction
The statistical functions most commonly used to describe the basic properties of random data are the mean square value, probability density functions, autocorrelation functions and power spectral density functions (Bendat and Piersol, 1966). However, these functions are easily calculated only for stationary random data, and most biological signals are non-stationary random data (Hung, 1981). Thus, to analyze biological time series correctly it is necessary to verify whether such series satisfy the condition of stationarity or, otherwise, to identify the segments of the series that are stationary.
Strict-sense stationary time series (abbreviated SSS and also known as strongly stationary) are series whose statistical properties are invariant to time translation (Papoulis and Pillai, 2002).
ISBN: 978-85-8001-069-5
When only the mean and variance of the series do not vary with time translation, the series is here called wide-sense stationary (abbreviated WSS and also called weakly stationary) (Papoulis and Pillai, 2002); the textbook definition additionally requires the autocorrelation to depend only on the time lag. Weak stationarity is sufficient for the statistical functions mentioned above to be applied (Hung, 1981), except for the probability density function, which does not require this condition.
For some practical purposes, a non-stationary time series can be seen as a concatenation of stationary segments (Fukuda et al., 2004). Under this assumption, it is possible to create a segmentation algorithm that splits a non-stationary time series into smaller segments whose statistical properties are invariant with time. In practice, this segmentation problem is a problem of detecting and locating the statistical changes of the series (Lopatka et al., 2005). However, accurately identifying all the non-stationary events that compose the time series is a computationally hard problem because of the quantity and complexity of the calculations. An exact segmentation algorithm requires a computation time that scales as $N^N$, in which N is the number of points in the time series (Fukuda et al., 2004). Hence, for long time series such an algorithm would not be practical, and the segmentation of a real-world time series must strike a trade-off between the complexity of the calculation and the desired precision of the result (Fukuda et al., 2004).
In the present paper, we propose an algorithm to split biological time series into segments that can be considered stationary. In section 2, we briefly describe the signals to be segmented, explain the segmentation problem and propose an algorithm to solve it. In section 3, we show and discuss the results. In the last section, we summarize our results and present our conclusions.
2 Materials and Methods
2.1 The non-stationary signals
The signals to be segmented are intracellular recordings made from sensory neurons that provide input to the local circuits controlling leg movements of the locust, during stimulation of the femoral chordotonal organ (FCO). The stimulus moved the apodeme of the organ, resulting in flexion and extension movements of the tibia. The apodeme was moved with a band-limited Gaussian white noise signal with a cutoff frequency of 100 Hz, generated by filtering a band-limited Gaussian white noise signal with a cutoff frequency of 200 Hz (Kondoh et al., 1995). The signals were recorded at a sampling rate of 24 kHz for approximately 30 to 60 seconds. Simulated signals formed by the concatenation of stationary segments were used to validate the algorithm before segmenting the biological series.
2.2 The segmentation problem
Let
$$\vec{x} = [x_1, x_2, x_3, \ldots, x_N] \tag{1}$$
be a non-stationary time series. The aim of this work is to split this series into smaller segments that can be considered wide-sense stationary. For this, it is necessary to detect and locate all the points of the time series at which the statistical properties change.
According to Velis (2007), a sequence of change points can be defined as
$$\vec{t} = [t_1, t_2, \ldots, t_M] \tag{2}$$
in which
$$1 = t_1 < t_2 < \ldots < t_{M-1} < t_M = N. \tag{3}$$
To find all change points, a sliding window of length 2L was moved through the time series, starting with the window center at L and ending at (N − L). For each window position, two tests were computed on the two samples of length L to the left and to the right of the window center: the z-test, to quantify the difference between the means of the two samples, and Bartlett’s test, to quantify the difference between their variances.
The z-test tests the null hypothesis, H0, that the means of two independent normally distributed samples with known variances are identical, against the alternative hypothesis, H1, that the two means are different.
According to Miller et al. (1990), for samples with length greater than or equal to 30, the statistic z is defined as
$$z = \frac{|\bar{x}_1 - \bar{x}_2|}{\sqrt{\dfrac{s_1^2}{N_1} + \dfrac{s_2^2}{N_2}}} \tag{4}$$
in which $\bar{x}_1$, $s_1^2$ and $N_1$ are, respectively, the mean, variance and length of sample 1, and $\bar{x}_2$, $s_2^2$ and $N_2$ those of sample 2. For this problem, $N_1 = N_2 = L$.
As can be observed in (4), for samples with equal means the value of z is zero. Larger values of z therefore mean that the means of the two samples are more likely to be significantly different, which makes the points with the largest values of z good candidates for change points.
Bartlett’s test, defined in Snedecor and Cochran (1989), tests the null hypothesis, H0, that the variances of k independent normally distributed samples are identical, against the alternative hypothesis, H1, that at least two samples have unequal variances.
For the segmentation problem, k = 2 and both samples have length L, thus the Bartlett statistic can be computed as
$$B = \frac{2\ln\left(\dfrac{s_1^2 + s_2^2}{2}\right) - \ln s_1^2 - \ln s_2^2}{6L - 5} \tag{5}$$
If $s_1^2 = s_2^2$, the value of B is zero and, as in the z-test, the candidate change points given by Bartlett’s test are the points with the largest values of B.
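The sliding-window computation of z and B described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, indices are 0-based, the sample variance with divisor L − 1 is assumed, and both samples are assumed to be non-constant so that the logarithms in (5) are defined.

```python
import math
import statistics

def z_statistic(sample1, sample2):
    """Eq. (4): absolute difference of the means over its standard error."""
    m1, m2 = statistics.fmean(sample1), statistics.fmean(sample2)
    v1, v2 = statistics.variance(sample1), statistics.variance(sample2)
    return abs(m1 - m2) / math.sqrt(v1 / len(sample1) + v2 / len(sample2))

def bartlett_statistic(sample1, sample2):
    """Eq. (5) for k = 2 samples of equal length L (assumes non-zero variances)."""
    L = len(sample1)
    v1, v2 = statistics.variance(sample1), statistics.variance(sample2)
    pooled = (v1 + v2) / 2
    return (2 * math.log(pooled) - math.log(v1) - math.log(v2)) / (6 * L - 5)

def sliding_statistics(series, L):
    """Return (centers, z, B) for window centers t = L .. N - L."""
    centers, z, B = [], [], []
    for t in range(L, len(series) - L + 1):
        left, right = series[t - L:t], series[t:t + L]  # the two samples of length L
        centers.append(t)
        z.append(z_statistic(left, right))
        B.append(bartlett_statistic(left, right))
    return centers, z, B
```

Recomputing the mean and variance from scratch at every position keeps the sketch short; for long series, running sums would avoid the repeated work.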
After sliding the window through the series, the variables z and B were normalized between 0 and 1. Then the mean and standard deviation of each variable were computed to determine whether its probability density function tended to be symmetric or asymmetric. For an asymmetric probability density function, the standard deviation of the variable should be greater than half of its mean (Bastos and Duquia, 2007). A standard deviation greater than half of the mean indicates that the variable presented values very distant from the mean; in the case of z and B this means that there are some peak values, and these values are related to the abrupt statistical changes. Thus, when z tended to have a symmetric probability density function, the mean of the series did not vary abruptly and the results of the z-test were not used to split the series; when this happened with B, it was the variance of the series that did not vary, and the results of Bartlett’s test were not used.
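The normalization and the asymmetry criterion can be illustrated with a short sketch. The function names are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used.

```python
import statistics

def normalize(values):
    """Rescale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant input: no spread to rescale
    return [(v - lo) / (hi - lo) for v in values]

def has_peaks(values):
    """Criterion from the text: standard deviation greater than half of the mean."""
    return statistics.pstdev(values) > 0.5 * statistics.fmean(values)
```

A statistic for which `has_peaks` returns False would be left out of the segmentation, as described above.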
If the standard deviation of z or of B was greater than half of its mean, a threshold for that variable was calculated from the estimate of the probability density function (pdf) of the statistic and a predefined significance level, α. The cumulative distribution function (cdf) was computed from the estimated pdf, and the value of the variable corresponding to cdf = (1 − α) was adopted as the threshold. The segmentation procedure then continued as follows: the point $t_{max}$ with the largest value of z, $z_{max}$, among all calculated values of z was located. The series was cut at this point, creating two new segments, and the point was inserted in the set of change points. Next, the values of z in the interval $[t_{max} - \Delta, t_{max} + \Delta]$ were set to zero, a new value of $z_{max}$ was located, the segment was cut at this location, and the procedure was repeated until $z_{max}$ was smaller than the adopted threshold.
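A possible sketch of the threshold and of the iterative peak selection is given below. It replaces the estimated pdf with the empirical cdf of the sorted values, which is a simplification introduced here and not necessarily the estimator used in the paper; the function names are hypothetical.

```python
def empirical_threshold(values, alpha=0.05):
    """Value at which the empirical cdf of the statistic reaches about 1 - alpha."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int((1 - alpha) * len(ordered)))
    return ordered[index]

def pick_change_points(values, centers, threshold, delta):
    """Take the largest statistic, zero its neighborhood, repeat while above threshold."""
    work = list(values)  # copy, so the caller's sequence is not modified
    selected = []
    while True:
        k = max(range(len(work)), key=work.__getitem__)
        if work[k] <= max(threshold, 0.0):
            break
        t_max = centers[k]
        selected.append(t_max)
        # Set the statistic to zero in [t_max - delta, t_max + delta].
        for j, t in enumerate(centers):
            if abs(t - t_max) <= delta:
                work[j] = 0.0
    return sorted(selected)
```

Zeroing the neighborhood of each selected point prevents the same statistical change from being selected more than once.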
The same procedure was carried out separately for the values of B, yielding a second set of change points.
The points of the two sets were then joined into a single set, taking care not to insert two consecutive points separated by less than ∆. When two points were this close, their statistics were compared and the point with the smaller statistic was deleted.
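The joining step can be sketched as below, under the assumption that the normalized z and B values are directly comparable, as the comparison in the text implies. The function name is hypothetical, and the end points 1 and N are given an infinite statistic here so that they are never deleted.

```python
def merge_change_points(points_z, stats_z, points_b, stats_b, delta):
    """Join two change-point lists; of two points closer than delta keep the larger statistic."""
    candidates = sorted(list(zip(points_z, stats_z)) + list(zip(points_b, stats_b)))
    merged = []
    for point, stat in candidates:
        if merged and point - merged[-1][0] < delta:
            # Too close to the previously kept point: keep whichever statistic is larger.
            if stat > merged[-1][1]:
                merged[-1] = (point, stat)
        else:
            merged.append((point, stat))
    return [point for point, _ in merged]
```

For example, with ∆ = 8000 a z-test point at 70001 with statistic 0.9 and a Bartlett point at 70428 with statistic 0.4 would be reduced to the single point 70001.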
With these two tests it was possible to locate the instants at which the signal statistics change abruptly and to split the signal at these instants, generating smaller segments that may be stationary. However, as stated earlier, a signal is considered weakly stationary when its mean and variance are invariant to time translation. Thus, to verify whether the resulting segments were weakly stationary, the run test and the trend test were used, which are able to identify the presence of trends in the mean and variance of the series over time.
To apply the run test and the trend test, each segment was divided into n equal time intervals in which the data could be considered independent. The value of n was computed by dividing the length of the segment by the desired interval length, l; n therefore varied according to the length of the segment. Then the mean value and variance of the signal were computed for each interval and arranged in time sequence, as follows
$$S_M = [\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n] \tag{6a}$$
$$S_V = [s_1^2, s_2^2, \ldots, s_n^2] \tag{6b}$$
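The construction of the two sequences in (6) can be sketched as follows. The function name is hypothetical, and discarding the samples left over when the segment length is not a multiple of l is an assumption, since the text does not say how the remainder was handled.

```python
import statistics

def interval_sequences(segment, l):
    """Split a segment into n = len(segment) // l intervals; return (S_M, S_V) of Eq. (6)."""
    n = len(segment) // l
    means, variances = [], []
    for i in range(n):
        chunk = segment[i * l:(i + 1) * l]
        means.append(statistics.fmean(chunk))
        variances.append(statistics.variance(chunk))
    return means, variances
```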
The run test, defined by Bendat and Piersol (1966), was used to classify the n observations of each of the two sequences in (6) into two categories: category zero (0) if $\bar{x}_i < \bar{x}$ for (6a) or $s_i^2 < s^2$ for (6b), and category one (1) if $\bar{x}_i > \bar{x}$ for (6a) or $s_i^2 > s^2$ for (6b), in which $\bar{x}$ and $s^2$ are, respectively, the mean and variance of the whole segment. The classifications of the intervals were then joined, yielding a sequence of zeros and ones.
According to Bendat and Piersol (1966), a run is defined as a sequence of identical observations that is followed or preceded by a different observation or no observation at all. Thus, the number of runs, r, is the number of transitions from zero to one, or from one to zero, observed in the sequence of zeros and ones, plus one.
The value of r indicates whether the observations of a sequence are independent random observations of the same random variable (Bendat and Piersol, 1966). If they are, then the sampling distribution of r is that of a random variable r(k) with the following mean value and variance
$$\mu_r = \frac{2 n_0 n_1}{n} + 1 \tag{7a}$$
$$\sigma_r^2 = \frac{2 n_0 n_1 (2 n_0 n_1 - n)}{n^2 (n - 1)} \tag{7b}$$
in which $n_0$ and $n_1$ are the numbers of zeros and ones, respectively.
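The counting of runs and the moments in (7) can be sketched as follows. The function names are hypothetical, and assigning observations exactly equal to the reference value to category zero is an assumption, since the text defines the categories only with strict inequalities.

```python
def count_runs(sequence, reference):
    """Classify each value as 0 (not above reference) or 1 (above) and count the runs."""
    labels = [1 if value > reference else 0 for value in sequence]
    transitions = sum(1 for a, b in zip(labels, labels[1:]) if a != b)
    return transitions + 1, labels.count(0), labels.count(1)

def run_moments(n0, n1):
    """Mean and variance of the number of runs, Eqs. (7a) and (7b)."""
    n = n0 + n1
    mean = 2 * n0 * n1 / n + 1
    variance = 2 * n0 * n1 * (2 * n0 * n1 - n) / (n ** 2 * (n - 1))
    return mean, variance
```

For n0 = n1 = 10, for instance, (7a) gives a mean of 11 runs and (7b) a variance of about 4.74.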
The trend test, also defined by Bendat and Piersol (1966), considers the same sequences (6a) and (6b) and counts the number of times that $\bar{x}_i > \bar{x}_j$, or $s_i^2 > s_j^2$, for i < j. Each such inequality is called a reverse arrangement. The total number of reverse arrangements is denoted by A.
In general, for the set of observations in (6), A is defined as
$$A = \sum_{i=1}^{n-1} A_i \tag{8}$$
in which
$$A_i = \sum_{j=i+1}^{n} h_{ij} \tag{9}$$
and
$$h_{ij} = \begin{cases} 1 & \text{if } x_i > x_j \\ 0 & \text{otherwise} \end{cases} \tag{10}$$
If the n observations of the sequence are independent observations of the same random variable, then the number of reverse arrangements is a random variable A(k) with mean value and variance given by
$$\mu_A = \frac{n(n-1)}{4} \tag{11a}$$
$$\sigma_A^2 = \frac{n(2n+5)(n-1)}{72} \tag{11b}$$
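The count of reverse arrangements in (8) to (10) and the moments in (11) can be sketched directly from the definitions. The function names are hypothetical, and the double loop is quadratic in n, which is acceptable here because n is the number of intervals and not the number of samples.

```python
def reverse_arrangements(sequence):
    """Total number A of pairs i < j with sequence[i] > sequence[j], Eqs. (8)-(10)."""
    total = 0
    for i in range(len(sequence) - 1):
        total += sum(1 for later in sequence[i + 1:] if sequence[i] > later)
    return total

def reverse_arrangement_moments(n):
    """Mean and variance of A for n independent observations, Eqs. (11a) and (11b)."""
    return n * (n - 1) / 4, n * (2 * n + 5) * (n - 1) / 72
```

A strictly increasing sequence gives A = 0 and a strictly decreasing one gives A = n(n − 1)/2, so values of A far from the mean in (11a) point to a monotonic trend.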
Considering that the variables r(k) and A(k) are normally distributed, it is possible to find a confidence interval for the significance level α and to verify whether the values of r and A obtained in the tests for each segment lie inside it. If they do, the analyzed segment has no trends in its mean and variance and can be considered stationary. If the value of r or A for the segment lies outside the confidence interval, the segment should be considered non-stationary.
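The decision rule can be sketched with the normal approximation stated above. A two-sided interval is assumed here, since the text does not say whether the interval was one- or two-sided, and the function name is hypothetical.

```python
from statistics import NormalDist

def inside_confidence_interval(value, mean, variance, alpha=0.05):
    """True if value lies inside the two-sided (1 - alpha) normal interval around mean."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    half_width = z_crit * variance ** 0.5
    return mean - half_width <= value <= mean + half_width
```

Under this rule, a segment would be accepted as stationary only if r and A, computed for both the sequence of means and the sequence of variances, all fall inside their respective intervals.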
Two tests are needed to verify the stationarity of each segment because the trend test is powerful for detecting monotonic trends in a sequence of observations, whereas the run test is powerful for detecting fluctuating trends. Such trends are smooth statistical changes that the z-test and Bartlett’s test are not able to identify.
2.3 The algorithm
 1: Choose values for L and ∆;
 2: for t in interval [L, N − L] do
 3:     seg1 ← sinal[t − L, t];
 4:     seg2 ← sinal[t, t + L];
 5:     z_t ← testeZ(seg1, seg2);
 6:     B_t ← testeBartlett(seg1, seg2);
 7: end for
 8: Normalize the variables z and B between 0 and 1.
 9: if standard deviation of z > half of the mean of z then
10:     Estimate the pdf of z;
11:     Define the significance level for z and compute the threshold;
12:     Start with $\vec{t}_z$ ← [1, N].
13:     Find the value z_max in the sequence z;
14:     while z_max > threshold do
15:         Find the value of t_max for which z = z_max;
16:         Add this value of t_max to the sequence $\vec{t}_z$;
17:         Do z[t_max − ∆, t_max + ∆] ← 0
18:         Find the new value of z_max in the sequence z.
19:     end while
20: else
21:     $\vec{t}_z$ = [1, N]
22: end if
23: Return to line 9 and carry out the same procedure for the values of B, yielding the $\vec{t}_B$ sequence.
24: Join the points of the two sequences $\vec{t}_z$ and $\vec{t}_B$, yielding a single $\vec{t}$ sequence, but
25: if abs(t_z − t_B) < ∆ then
26:     if z at t_z < B at t_B then
27:         Delete t_z
28:     else
29:         Delete t_B.
30:     end if
31: end if
32: Choose a value for l;
33: for t_i in $\vec{t}$ do
34:     seg ← sinal[t_i, t_{i+1}]
35:     Compute the mean and variance of the segment;
36:     Compute the value of n;
37:     Divide the segment into n intervals;
38:     Form the sequences in (6).
39:     Compute the values of r and A for each sequence.
40:     From (7), (11) and the predefined significance level, α, compute the confidence intervals for r and A, respectively.
41:     if values of r and A are inside the confidence interval then
42:         The segment is classified as stationary.
43:     else
44:         The segment is classified as non-stationary.
45:     end if
46: end for
This algorithm was implemented in the Python programming language. Because the data were very long, the execution time of the sequential program was also very long. To reduce the computing time, MPI (Message Passing Interface) for Python (the mpi4py package) was used; it provides MPI bindings for the Python programming language, allowing any Python program to exploit multiple processors (Dalcin, 2012).
2.4 Test cases
To ensure that the algorithm is able to correctly detect and locate the instants at which the signal statistics change, three test cases were run. In each case, a time series was generated by concatenating ten stationary segments of different lengths.
In test case 1 the segments differed not only in length but also in mean and variance, as indicated in the following table.
Table 1: Characteristics of the segments of the series for test case 1.
Segment   Length (×10^4)   Mean    Variance
1         4                6.007   5.997
2         3                4.967   2.003
3         3                2.004   1.007
4         2                4.011   2.993
5         4                7.997   0.996
6         4                5.988   1.988
7         4                8.990   6.991
8         2                6.020   8.003
9         1                8.028   2.024
10        4                6.003   1.006
In test case 2, only the mean of the segments varied abruptly, and the variance was kept near unity, as shown in Tab. 2.
Test case 3, as can be observed in Tab. 3, was the opposite of test case 2: the mean was kept near unity and the variance varied abruptly.
The segment lengths, and the means and variances that varied abruptly, were generated randomly.
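A test series of this kind could be generated as in the sketch below. Gaussian samples are an assumption, since the text specifies only the mean and variance of each segment; the function name and the seed are hypothetical.

```python
import random

def make_test_series(lengths, means, variances, seed=0):
    """Concatenate Gaussian segments; return the series and its real change points."""
    rng = random.Random(seed)
    series, change_points = [], [1]
    for length, mean, variance in zip(lengths, means, variances):
        sd = variance ** 0.5
        series.extend(rng.gauss(mean, sd) for _ in range(length))
        change_points.append(len(series))  # cumulative length marks the next change point
    return series, change_points
```

With the ten segment lengths of Tab. 1, this returns the change points 1, 40000, 70000, and so on up to 310000.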
Table 2: Characteristics of the segments of the series for test case 2.
Segment   Length (×10^4)   Mean    Variance
1         4                3.004   1.002
2         3                9.002   0.998
3         3                2.006   0.997
4         2                5.005   1.001
5         4                1.998   0.998
6         4                3.995   0.992
7         4                3.009   1.005
8         2                9.004   1.002
9         1                4.010   1.000
10        4                2.004   0.999

Table 3: Characteristics of the segments of the series for test case 3.
Segment   Length (×10^4)   Mean    Variance
1         4                0.978   7.971
2         3                0.989   5.961
3         3                1.036   5.993
4         2                0.973   8.963
5         4                0.982   4.990
6         4                0.999   2.989
7         4                1.016   9.052
8         2                0.995   1.992
9         1                0.946   4.010
10        4                1.020   8.002

3 Results and Discussion
To split the series of the test cases, the following values were used: half-window length L = 1500, minimum separation between two change points ∆ = 8000, interval length l = 500, and significance level α = 5% for z, B, A and r. The segmentation results for the test cases are shown in Fig. 1, Fig. 2 and Fig. 3; the real change points, the points selected by each test and the final change points obtained by joining the points of the two tests are shown in Tab. 4, Tab. 5 and Tab. 6. Tab. 6 shows that the results of the z-test were not used in the segmentation of test case 3; this happened because z did not present peaks, meaning that the series had no abrupt changes of mean.

Table 4: Real change points and the points selected by the tests for the series of test case 1.
(- = no point selected near this real change point)
Real     z-test   Bartlett’s test   Final
1        1        1                 1
40000    -        40007             40007
70000    70001    70428             70001
100000   99999    100000            100000
120000   120001   120300            120001
160000   159999   159683            159999
200000   200005   199996            199996
240000   240043   -                 240043
260000   -        260011            260011
270000   270000   270062            270000
310000   310000   310000            310000

Table 5: Real change points and the points selected by the tests for the series of test case 2.
(- = no point selected near this real change point)
Real     z-test   Bartlett’s test   Final
1        1        1                 1
40000    40000    40847             40000
70000    70000    70688             70000
100000   100000   -                 100000
120000   120000   -                 120000
160000   160000   -                 160000
200000   -        -                 -
240000   240000   239236            240000
260000   260000   260785            260000
270000   270000   -                 270000
310000   310000   310000            310000

Table 6: Real change points and the points selected by the tests for the series of test case 3.
(- = no point selected near this real change point)
Real     z-test   Bartlett’s test   Final
1        1        1                 1
40000    -        -                 -
70000    -        -                 -
100000   -        99980             99980
120000   -        120018            120018
160000   -        160001            160001
200000   -        199997            199997
240000   -        239999            239999
260000   -        259974            259974
270000   -        270002            270002
310000   310000   310000            310000
Fig. 1 shows that, for a significance level of 5%, the thresholds calculated for z and B were 0.2 and 0.24, respectively. With this threshold the z-test was not able to identify the change point between segments 1 and 2, which, according to Tab. 1, had means equal to 6.007 and 4.967, respectively, nor the one between segments 8 and 9, which had means equal to 6.020 and 8.028, respectively. In Fig. 1(c) it is possible to observe that z presented small peaks between these segments; however, the values of z were smaller than the adopted threshold, so the points corresponding to these small peaks were not considered change points. If the chosen significance level were greater, the calculated threshold would be smaller and these points would be selected. Bartlett’s test was not able to identify the point between segments 7 and 8. From Tab. 1 it is possible to observe that the variance did not vary abruptly between these segments (6.991 and 8.003), and Fig. 1(e) shows no peak of B at any point between them. Bartlett’s test identified the two points that the z-test was not able to identify, and the only point that Bartlett’s test did not identify was identified by the z-test; thus, by joining the points, the algorithm was able to identify all the change points of this time series.
Fig. 1(b), Fig. 1(d) and Fig. 1(f) show the utility of the run test and trend test, which correctly classified the non-stationary segments, such as segments 1 and 7 of Fig. 1(b), which were formed by original segments 1 and 2, and 8 and 9, respectively, whose means and variances differed, as can be seen in Tab. 1. Segment 7 of Fig. 1(d) was also correctly classified as non-stationary. However, the algorithm classified as non-stationary two segments that should be considered stationary.
Figure 1: Test case 1: S = stationary and N = non-stationary. (a) Signal generated by concatenation of segments with different lengths, means and variances, and the real change points. (b) Signal segmented at the change points selected by the z-test. (c) Value of z computed for all points as the window slides. (d) Signal segmented at the change points selected by Bartlett’s test. (e) Value of B computed for all points as the window slides. (f) Final segmentation resulting from joining the points of (b) and (d).
Figure 2: Test case 2: S = stationary and N = non-stationary. (a) Signal generated by concatenation of segments with different lengths and means, and the real change points. (b) Signal segmented at the change points selected by the z-test. (c) Value of z computed for all points as the window slides. (d) Signal segmented at the change points selected by Bartlett’s test. (e) Value of B computed for all points as the window slides. (f) Final segmentation resulting from joining the points of (b) and (d).
Figure 3: Test case 3: S = stationary and N = non-stationary. (a) Signal generated by concatenation of segments with different lengths and variances, and the real change points. (b) Signal segmented at the change points selected by Bartlett’s test. (c) Value of B computed for all points as the window slides.
In test case 2, although only the mean of the series varied abruptly, Bartlett’s test presented some peaks at points near the exact point at which the mean changed, while at the exact instant at which the mean changed the result of the test was near zero; this can be observed in Fig. 2(e). This happened because, when the right edge of the window reached the instant at which the signal mean changed, the sample on one side of the window was formed by two segments with different means, which made the variances of the two samples differ. When the window center coincided with the exact instant of the mean change, the two samples formed by the window had different means but close variances, so the value of B was near zero. As the window continued to slide, the instant of the mean change fell inside the left side of the window, and the result of Bartlett’s test behaved similarly to that described above for the right side. Thus, Bartlett’s test presented two peaks for each mean change, however not at the exact instant.
From Tab. 5 it is possible to note that the z-test failed to identify only one change point of the mean; the other points were identified accurately. The run test and trend test correctly classified segment 6 of Fig. 2(f) as non-stationary.
For test case 3 only the results of Bartlett’s test were used in the segmentation, because the standard deviation of z was not greater than half of its mean. As can be observed in Fig. 3 or Tab. 6, two change points were not selected. From Fig. 3(c) it is possible to note that the test presented only a small peak at the point between original segments 1 and 2, which have variances equal to 7.971 and 5.961, respectively; however, this peak was smaller than the threshold of 0.06, so this point was not selected. Between segments 2 and 3 there is no peak, because the variances of the two segments are very close. In Fig. 3(b) it is possible to note that the algorithm did not correctly classify the last segment, whereas the classification of the first segment is correct.
Fig. 4 shows the segmentation result for two signals recorded simultaneously. The signal in Fig. 4(a) is a biological time series consisting of the response of a spiking neuron, the FCO sensory neuron, to the stimulation signal applied to the FCO apodeme, shown in Fig. 4(b). To segment these signals the values were set to L = 5000, ∆ = 20000 and l = 3000, and a significance level of α = 5% was adopted for the variables z, B, r and A. The change points selected by the z-test and Bartlett’s test for the two signals are shown in Tab. 7 and Tab. 8. The points are given in samples in the tables, while in Fig. 4 they are converted to seconds using the sampling frequency of 24 kHz used to record the signals.
Table 7: Change points selected by the tests for the biological series.
(- = no point selected by this test)
z-test   Bartlett’s test   Final
1        1                 1
30502    30476             30476
67140    67146             67146
117612   117604            117612
159735   -                 159735
335926   -                 335926
406827   -                 406827
612985   612779            612779
699904   699904            699904
Table 8: Change points selected by the tests for the stimulation signal.
(- = no point selected by this test)
z-test   Bartlett’s test   Final
1        1                 1
-        30288             30288
-        67291             67291
-        117529            117529
-        612579            612579
699904   699904            699904
As can be observed, the stimulation signal is similar to the series of test case 3, in which only the variance of the series varies abruptly; thus only Bartlett’s test was used to segment the signal. Segment 4 of Fig. 4(a) was classified as non-stationary because of the clear variation of the amplitude of the spikes. Segments 1, 3 and 5 of Fig. 4(b) present fluctuating trends and are therefore considered non-stationary. The points between segments 4-5, 5-6 and 6-7 of the biological series were selected because at these points there is an increase or decrease in the frequency of spikes.
Figure 4: Biological signal: S = stationary and N = non-stationary. (a) Response of the spiking neuron to the stimulus of (b). (b) Stimulus of 100 Hz applied to the FCO apodeme.
4 Conclusions
This paper has shown that the z-test and Bartlett’s test are very effective in detecting, respectively, abrupt changes in the mean and variance of a time series. However, it is important to note that the change points selected by the z-test are more accurate than those selected by Bartlett’s test; this is confirmed by Tab. 4 and Tab. 5, where the z-test selected the change points nearest to the real change points. The run test and trend test, used to verify the stationarity of the segments formed by the two previous tests, proved to be very sensitive to smooth statistical changes, because they were able to identify both the linear trend and the fluctuating trend that the z-test and Bartlett’s test cannot detect. Thus, this algorithm can be used as an auxiliary tool for future analyses of biological signals, identifying the segments for which the statistical functions can be used correctly.
Acknowledgments
This work was supported by resources supplied by the Center for Scientific Computing (NCC/GridUNESP) of the São Paulo State University (UNESP). We also thank the Brazilian National Council for Scientific and Technological Development (CNPq) for a studentship.
References
Bastos, J. L. D. and Duquia, R. P. (2007). Medidas de dispersão: os valores estão próximos entre si ou variam muito?, Scientia Medica 17(1): 40–44. [online] Available at: http://revistaseletronicas.pucrs.br/ojs/index.php/scientiamedica. Accessed 15-april-2012.
Bendat, J. and Piersol, A. (1966). Measurement and analysis of random data, Wiley.
Dalcin, L. (2012). MPI for Python, [online] Available at: http://mpi4py.scipy.org/docs/usrman/index.html. Accessed 14-april-2012.
Fukuda, K., Stanley, H. E. and Amaral, L. A. N. (2004). Heuristic segmentation of a nonstationary time series, Physical Review E 69(2).
Hung, J. F. (1981). Digital processing of nonstationary signals, Master’s thesis, McMaster University.
Kondoh, Y., Okuma, J. and Newland, P. L. (1995). Dynamics of neurons controlling movements of a locust hind leg: Wiener kernel analysis of the responses of proprioceptive afferents, Journal of Neurophysiology 73(5): 1829–1842.
Lopatka, M., Laplanche, C., Adam, O., Motsch, J.-F. and Zarzycki, J. (2005). Non-stationary time-series segmentation based on the Schur prediction error analysis, IEEE/SP 13th Workshop on Statistical Signal Processing 2005 2: 2–6.
Miller, I., Freund, J. and Johnson, R. (1990). Probability and statistics for engineers, Prentice Hall.
Papoulis, A. and Pillai, S. (2002). Probability, random variables, and stochastic processes, McGraw-Hill electrical and electronic engineering series, McGraw-Hill.
Snedecor, G. W. and Cochran, W. (1989). Statistical Methods, number v. 276 in Statistical Methods, Iowa State University Press.
Velis, D. R. (2007). Statistical segmentation of geophysical log data, Mathematical Geology 39(4): 409–417.