IDENTIFICATION OF THE STATIONARITY IN BIOLOGICAL TIME
Transcription
IDENTIFICATION OF THE STATIONARITY IN BIOLOGICAL TIME
Anais do XIX Congresso Brasileiro de Automática, CBA 2012. IDENTIFICATION OF THE STATIONARITY IN BIOLOGICAL TIME SERIES Luciana R. Nicacio∗, Carlos D. Maciel∗, David M. Simpson†, Philip L. Newland‡, Giovana Y. Nakashima§ ∗ Laboratory of Signal Processing, Electrical Eng. Dept., EESC, University of São Paulo Av. Trabalhador São Carlense, 400 São Carlos, SP, Brazil † ISVR, University of Southampton University Road Highfield S017 1BJ Southampton, United Kingdom ‡ § School of Biological Science, University of Southampton University Road Highfield S017 1BJ Southampton, United Kingdom Federal Institute of Education, Science and Technology of São Paulo, Campus Salto R. Rio Branco, 1780 Salto, SP, Brazil Emails: lucynicacio@sc.usp.br, maciel@sc.usp.br, ds@isvr.soton.ac.uk, pln@soton.ac.uk, gyuko@ifsp.edu.br Abstract— Wide sense stationarity is a requirement that a time series has to satisfy so that some statistical tools can be used in its analysis. However, most of biological signal are non-stationary time series, thus to correctly analyze this biological signals is necessary to find the segments in the series that satisfy the stationarity condition. Whereas a non-stationary time series is formed by concatenation of stationary segments, it is possible create a algorithm able to identify these segments. For this, the z-test and Bartlett’s test were used to localize the points in which the statistical properties of the series, such as mean and variance, changed abruptly and then, split the series in this points. After detecting the changes points, the run test and trend test were used to verify whether segments formed by results of the z-test and the Bartlett’s test were indeed stationary. Keywords— Biological Time Series, Hypothesis Test, Stationarity. Resumo— Estacionariedade no sentido amplo é um requisito que uma série temporal tem que satisfazer para que algumas ferramentas estatı́sticas possam ser usadas em sua análise. No entanto, a maioria dos sinais biológicos são séries temporais não estacionárias, assim para analisar corretamente tais sinais biológicos é necessário encontrar segmentos na série que satisfaçam a condição da estacionariedade. Considerando que uma série temporal não estacionária seja formada pela concatenação de segmentos estacionários, é possı́vel criar um algoritmo capaz de identificar estes segmentos. Para isto, o teste z e o teste de Bartlett foram usados para localizar os pontos em que as propriedades estatı́sticas da série, tais como média e variância, alteram de forma abrupta, e então segmentar a série nestes pontos. Após detectar os pontos de alteração, o teste da corrida e o teste de tendência foram usados para verificar se os segmentos formados a partir dos resultados do teste z e do teste de Bartlett eram realmente estacionários. Palavras-chave— 1 Séries Temporais Biológicas, Teste de Hipóteses, Estacionariedade. Introduction The statistical functions most commonly used to describe the basic properties of random data are: mean square value, probability density functions, autocorrelation functions and power spectral density functions (Bendat and Piersol, 1966). However, these functions are easily calculated for stationary random data and most of biological signals are non-stationary random data (Hung, 1981). Thus, to correctly analyze biological time series it is necessary verify whether such series satisfy the condition of stationarity, or else, identify the segments in the series which are stationary. Strict sense stationary time series (abbreviated SSS and also known as strongly stationary) are series whose statistical properties are invariant to time translation (Papoulis and Pillai, 2002). ISBN: 978-85-8001-069-5 When only the mean and variance of the series do not vary with time translation, it is called wide sense stationary (abbreviated WSS and also called weakly stationary) (Papoulis and Pillai, 2002), and this condition of the weak stationarity of the series is sufficient so that the statistical functions mentioned above can be applied (Hung, 1981), except the probability density function which not require this condition. For some practical purposes, a non-stationary time series can be seen as a concatenation of stationary segments (Fukuda et al., 2004). From this consideration, it is possible create a segmentation algorithm consisting in the split of nonstationary time series into smaller segments whose statistical properties are invariant with time. In practice, this segmentation problem is a problem of detection and localization of the statistical 3482 Anais do XIX Congresso Brasileiro de Automática, CBA 2012. changes of the series (Lopatka et al., 2005). However, identify with accuracy all the non-stationary events which compose the time series is a computational problem of hard solution due quantity and complexity of the calculations. An exact segmentation algorithm requires a computation time that scales as N N , in which N is the number of points in the time series (Fukuda et al., 2004). Hence, to long time series such algorithm would not be practical, and for this reason, the segmentation of a real-world time series must accomplish a trade-off between the complexity of the calculation and the desired precision of the result (Fukuda et al., 2004). In the present paper, we propose a algorithm to split biological time series in the segments which can be considered stationary. In section 2, we briefly describe the signals to be segmented, we explain the segmentation problem and propose a algorithm to solve the problem. In section 3, we show the obtained results and discuss them. In the last section, we summarize our results and present our conclusion from the observed results. 2 2.1 1 = t1 < t2 < ... < tM −1 < tM = N. |x1 − x2 | z=s s21 s2 + 2 N1 N2 (4) The non-stationary signals in which x1 , s21 and N1 are, respectively, mean, variance and length of sample 1, and x2 , s22 and N2 of the sample 2. For this problem, N1 = N2 = L. As can be observed in (4), for samples with equal means, the value of z is zero. Thus, larger values of z means that the values of the mean of both samples are more likely to be significantly different, making the points with the largest values of z, good candidates for change points. The Bartlett’s test, defined in Snedecor and Cochran (1989), tests the null hypothesis, H0 , of that the variances of k independent normally distributed samples are identical, against the alternative hypothesis, H1 , of that at least two samples have unequal variances. For the segmentation problem, k = 2 and the length of the two samples are equal to L, thus the Bartlett statistic can be computed as The segmentation problem B= Let (3) To find all change points, a sliding window of length 2L was moved through the time series, starting with window center in L and ending in (N − L). For each position of window, two tests were computed. The z-test to quantify the difference between the means of two samples of the series, and the Bartlett’s test to quantify the difference between the variances of the same samples, of the left-side and right-side of the window center, each one with length L. The z-test tests the null hypothesis, H0 , of that the means of two independent normally distributed samples, with known variances, are identical, against the alternative hypothesis, H1 , of that the two means are different. According to Miller et al. (1990), for samples with length greater than or equal to 30, the statistic z is defined as: Materials and Methods The signals to be segmented are the intracellular recordings made from sensory neurons that provide input to the the local circuits controlling leg movements of the locust, during stimulation of organ FCO (femoral chordotonal organ). The stimulus was responsible for move the apodeme of organ resulting in movements of flexion and extension of the tibia. To move the apodeme was used a band-limited Gaussian white noise signal, with a cutoff frequency of 100 Hz generated from filtering of band-limited Gaussian white noise signal with cutoff frequency of 200 Hz (Kondoh et al., 1995). The signals were recorded to a sample rate of 24 kHz during 30 to 60 seconds, approximately. Simulated signals formed by concatenation of some stationary segments were used to validate the algorithm before of to segment the biological series. 2.2 in wich → − x = [x1 , x2 , x3 , ..., xN ] (1) be a non-stationary time series. The aim of the work is split this series into smaller segments that can be considered wide sense stationary. For this, it is necessary to detect and localize all the points of the time series in which the statistical properties change. According to Velis (2007) a sequence of change points can be defined as → − t = [t1 , t2 , ..., tM ] ISBN: 978-85-8001-069-5 (2) 2 ln( s21 +s22 2 ) − ln s21 − ln s22 6L − 5 (5) If s21 = s22 the value of B will be zero, and as in the z-test, the candidates to change points by Bartlett’s test will be the points which presented the largest values of B. After sliding the window through series, the variables z and B were normalized between 0 and 1, then the mean and standard deviation for the variables were computed to know whether the probability density function of any variables had trend to be symmetric or asymmetric. For the case of asymmetric probability density function, 3483 Anais do XIX Congresso Brasileiro de Automática, CBA 2012. the standard deviation of the variable should be greater than half of the its mean (Bastos and Duquia, 2007). Thus, deviation greater than half of the mean indicates that the variable presented values very distant of the mean, and in the case of z and B this means that exist some peak values and this values are related to the abrupt statistical changes. Thus, when z had trend to have symmetric probability density function meant that the mean of the series not varied abruptly, and the results of z-test were not used to split the series, and when this happened with B was the variance of the series which not varied, and the results of Bartlett’s test were not used. If the standard deviation of the variables z and B were greater than the half of the its mean, a threshold for z and other for B were calculated from the estimate of the probability density function (pdf) of both statistics and a predefined significance level, α. The cumulative distribution function (cdf) was computed from the estimated pdf and the value of the variable corresponding to cdf = (1 − α) was the adopted threshold. Thus, the procedure of segmentation was continued as follows: the point, tmax , with the largest value of z, zmax , between all calculated values of z, was located. Then, the series was cut in this point creating two new segments, and the point was insert in the set of change points. Next, the values of z in the interval [tmax − ∆, tmax + ∆] were set equal to zero, then a new value of zmax was again located, the segment was cut in this localization and the above procedure was accomplished until the zmax was smaller than adopted threshold. The same procedure was accomplished separately for the values of B, yielding a second set of change points. Then the points of the two sets were joined into a single set, taking care to not insert in the set two consecutive points with separation smaller than ∆. When two points were very close, their statistics were compared, and the point with smaller statistic was deleted. Through these two tests was possible localize the instants which the signal statistics change abruptly and split the signal at these instants generating small segments which can be stationary. But as reported earlier, a signal is considered as weakly stationary when its mean and variance are invariant in the time translation. Thus to verify whether the formed segments were weakly stationary, it was used the run test and the trend test which are able of identifying presence of trends in the mean and variance of the series in the time translation. To apply the run test and the trend test, each segment was divided into n equal time intervals where the data in each interval could be considered independent, the n value was computed dividing the length of segment by desired value for ISBN: 978-85-8001-069-5 intervals length, l. With this, the n value varied according to length of the segment. Then, the mean value and variance of the signal were computed for each interval and aligned in time sequence, as follows SM = [x1 , x2 , ..., xn ] (6a) SV = [s21 , s22 , ..., s2n ] (6b) The run test, defined by Bendat and Piersol (1966), was used to classify the n observations of the two sequences in (6) in two categories: category zero (0) if xi < x for (6a) or s2i < s2 for (6b) and category one (1) if xi > x for (6a) or s2i > s2 for (6b), in which x and s2 are, respectively, the mean and variance of the whole segment. Thus, the classifications of the intervals were joined yielding a sequence of zeros and ones. According to Bendat and Piersol (1966), a run is defined as a sequence of identical observations that are followed or preceded by a different observation or no observation at all. Thus, the number of runs, r, is the number of times in which was observed transition of zero to one, or contrary, in the sequence of zeros and ones, plus one. The r value encountered indicates whether the observations of a sequence are independent random observations of the same random variable (Bendat and Piersol, 1966). If the observations of a sequence were independent random observations of the same random variable, then the sampling distribution for r in the sequence is a random variable r(k) with the following mean value and variance 2n0 n1 +1 n (7a) 2n0 n1 (2n0 n1 − n) n2 (n − 1) (7b) µr = σr2 = in which, n0 and n1 are the quantity of zeros and ones, respectively. The trend test, also defined by Bendat and Piersol (1966), considers the same sequences (6a) and (6b) and count the number of time that xi > xj and s2i > s2j for i < j. Each such inequality is called a reverse arrangement. The total number of reverse arrangements is denoted by A. In general, for the set of observations in (6), A is defined as A= n−1 X Ai (8) i=1 in which Ai = n X hij (9) j=i+1 3484 Anais do XIX Congresso Brasileiro de Automática, CBA 2012. and hij = 1 0 if xi > xj otherwise (10) If the sequence of n observations are independent observations of same random variable, the the number of reverse arrangements is a random variable A(k) with a mean variable and variance given by n(n − 1) 4 (11a) n(2n + 5)(n − 1) 72 (11b) µA = 2 σA = Considering that the variables r(k) and A(k) have normal distribution, is possible to find a confidence interval for significance level α to verify whether the values of r and A encountered in the tests for each segment are inside confidence interval, and so conclude that the analyzed segment do not have trends in its mean and variance, and it can be considered as stationary. If the values of r or A for the segment are outside confidence interval, the segmented should be considered as non-stationary. The use of two test for verify the stationary of each segment, it is necessary because the trend test is powerful for detecting monotonic trends in a sequence of observations and the run test is powerful for detecting fluctuating trends, such trends are smooth statistic changes which the z-test and the Bartlett’s test are not able of identify. 2.3 1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: 26: 27: 28: 29: Delete tB . 30: end if 31: end if 32: Choose a value to l; → − 33: for ti in t do 34: seg ← sinal[ti , ti+1 ] 35: Compute the mean and variance of segment; 36: Compute the value of n; 37: Divide the segment in n intervals; 38: Form the sequences in (6). 39: Compute the values of r and A for each sequence. 40: From (7), (11) and pre-defined significance level, α, compute the confidence interval to r and A, respectively. 41: if values of r and A are inside confidence interval then 42: The segment is classified as stationary. 43: else 44: The segment is classified as non-stationary. 45: end if 46: end for This algorithm was implemented using the python programming language. Due to the data length that was very large, the execution time of the sequential program was also very large. Thus, to reduce the compute time was used the MPI (Message Passing Interface) for python (mpi4py package) which provides MPI bindings for the python programming language, allowing any python program to exploit multiple processors (Dalcin, 2012). 2.4 Test cases To ensure that algorithm is able to detect and localize correctly the instants which the signal statistics change, three test cases were accomplished. In each case, a time series was generated by concatenation of ten stationary segments with different length. The segments of the test case 1 had, beyond the length, the mean and variance different, as indicated in the following table. The algorithm Choose values to L and ∆; for t in interval [L, N − L] do seg1 ← sinal[t − L, t]; seg2 ← sinal[t, t + L]; zt ← testeZ (seg1, seg2); Bt ← testeBartlett (seg1, seg2); end for Normalize the variables z and B between 0 and 1. if standard deviation of z > half of the mean of z then Estimate the pdf of z; Define the significance level for z and compute the threshold; → − Start with tz ← [1, N ]. Find the value zmax in the sequence z; while zmax > threshold do Find the value of tmax for which z = zmax ; → − Add this value of tmax in the sequence tz ; Do z[tmax − ∆, tmax + ∆] ← 0 Find the new value of zmax in the sequence z. end while else → − tz = [1, N ] end if Return to line 9 and accomplish the same procedure for − → values B, yielding the tB sequence. → − − → Join the points of the two sequences tz and tB , yielding → − a single t sequence but if abs(tz − tB ) < ∆ then if ztz < BtB then Delete tz else ISBN: 978-85-8001-069-5 Table 1: Characteristics of the segments of the series to test case 1. Segment 1 2 3 4 5 6 7 8 9 10 Length (104 ) 4 3 3 2 4 4 4 2 1 4 Mean 6.007 4.967 2.004 4.011 7.997 5.988 8.990 6.020 8.028 6.003 Variance 5.997 2.003 1.007 2.993 0.996 1.988 6.991 8.003 2.024 1.006 To the test case 2, only the mean of the segments varied abruptly, and the variance was maintained near unit, as shown in the Tab.2. The test case 3, as can be observed in the Tab.3, was otherwise test case 2, the mean maintained it near unit, and the variance varied abruptly. The values to segments lengths, mean and variance which varied abruptly were generated randomly. 3485 Anais do XIX Congresso Brasileiro de Automática, CBA 2012. Table 2: Characteristics of the segments of the series Table 5: Real changes points and the points selected to test case 2. by tests to the series of the test case 2. Segment 1 2 3 4 5 6 7 8 9 10 Length (104 ) 4 3 3 2 4 4 4 2 1 4 Mean 3.004 9.002 2.006 5.005 1.998 3.995 3.009 9.004 4.010 2.004 Variance 1.002 0.998 0.997 1.001 0.998 0.992 1.005 1.002 1.000 0.999 Real 1 40000 70000 100000 120000 160000 200000 240000 260000 270000 310000 z-test 1 40000 70000 100000 120000 160000 240000 260000 270000 310000 Bartlett’s test 1 40847 70688 239236 260785 310000 Final 1 40000 70000 100000 120000 160000 240000 260000 270000 310000 Table 3: Characteristics of the segments of the series Table 6: Real changes points and the points selected to test case 3. Segment 1 2 3 4 5 6 7 8 9 10 3 Length (104 ) 4 3 3 2 4 4 4 2 1 4 Mean 0.978 0.989 1.036 0.973 0.982 0.999 1.016 0.995 0.946 1.020 Variance 7.971 5.961 5.993 8.963 4.990 2.989 9.052 1.992 4.010 8.002 Real 1 40000 70000 100000 120000 160000 200000 240000 260000 270000 310000 z-test 1 310000 Bartlett’s test 1 99980 120018 160001 199997 239999 259974 270002 310000 Final 1 99980 120018 160001 199997 239999 259974 270002 310000 Results and Discussion To split the series of test cases, were used the following values: length of window, L = 1500, minimum separation between two changes points, ∆ = 8000, length of intervals, l = 500, significance level to z, B, A and r, α = 5%. The results of the segmentation of the series of the test cases are shown in the Fig.1, Fig.2 and Fig.3, and the real change points, the selected by each test and the final changes points resulting of the joining of the points of the two tests are shown in the Tab.4, Tab.5 and Tab.6. The Tab. 6 shows that the results of z-test were not used in the segmentation, this happened because z not presented peaks, meaning the series not had abrupt changes of mean. Table 4: Real changes points and the points selected by tests to the series of the test case 1. Real 1 40000 70000 100000 120000 160000 200000 240000 260000 270000 310000 by tests to the series of the test case 3. z-test 1 70001 99999 120001 159999 200005 240043 270000 310000 Bartlett’s test 1 40007 70428 100000 120300 159683 199996 260011 270062 310000 Final 1 40007 70001 100000 120001 159999 199996 240043 260011 270000 310000 The Fig.1 shows that to a significance level of 5%, the threshold calculated to z and B were, respectively, 0.2 and 0.24, and with this threshold the z-test was not able to identify the change points between the segments 1 and 2 which, ac- ISBN: 978-85-8001-069-5 cording the Tab.1, had mean equal to 6.007 and 4.967, respectively, and between the segments 8 and 9 had mean equal to 6.020 and 8.028, respectively. In the Fig.1(c) is possible to observe that z presented small peaks between these segment, however the values of z were smaller than threshold adopted, thus the points corresponding to these small peaks were not considered change points. If the chosen significance level was greater, the threshold calculated would be smaller and these points would be selected. The Bartlett’s test was not able to identify the points between the segments 7 and 8. From the Tab.1, it is possible observe that the variances between these segments, not varied abruptly and in the Fig.1(e) not shows peaks of B for no points between these segments. The Bartlett’s test identified the two points that the z-test was not able to identify, and the only point that the Bartlett’s test not identified, it was identified by z-test, thus joining the points, the algorithm was able to identify all change points of this time series. The Fig.1(b), Fig.1(d) and Fig.1(f) shows the utility of the run test and trend test which classified correctly the non-stationary segments, such as the segments 1 and 7 of the Fig.1(b) which was formed by original segments 1 and 2, and 8 and 9, respectively, that had mean and variance different as can be seen in the Tab.1. The segment 7 of the Fig.1(d) was also correctly classified as nonstationary. However, the algorithm classified as non-stationary two segments considered stationary. In the test case 2, despite only the mean of 3486 Anais do XIX Congresso Brasileiro de Automática, CBA 2012. Figure 1: Test case 1: S = stationary and N = non-stationary. (a) Signal generated by concatenation of segments with length, mean and variance different and the real change points. (b) Signal segmented by change points selected by z-test. (c) Value computed of z for all points through the sliding of window. (d) Signal segmented by change points selected by Bartlett’s test. (e) Value computed of B for all points through the sliding of window. (f) Final segmentation resulting from joining of the points of (b) and (d). Figure 2: Test case 2: S = stationary and N = non-stationary. (a) Signal generated by concatenation of segments with length and mean and the real change points. (b) Signal segmented by change points selected by z-test. (c) Value computed of z for all points through the sliding of window. (d) Signal segmented by change points selected by Bartlett’s test. (e) Value computed of B for all points through the sliding of window. (f) Final segmentation resulting of the joining of the points of (b) and (d). ISBN: 978-85-8001-069-5 3487 Anais do XIX Congresso Brasileiro de Automática, CBA 2012. Figure 3: Test case 3: S = stationary and N = non-stationary. (a) Signal generated by concatenation of segments with length and variance different and the real change points. (b) Signal segmented by change points selected by Bartlett’s test. (c) Value computed of B to all points through the sliding of window. series have varied abruptly, the Bartlett’s test presented some peaks to points near to exact point in which the mean changed, and in the exact instant which the mean changed, the result of the test was near of zero, this can be observed in the Fig.2(e). This happened because when the righttail of window attained the instant which the signal mean changed, the sample of one window side was formed by two segments with different means and this was responsible by differentiate the variances of two samples. When the window center coincided with exact instant of mean variation, the two samples formed by the window presented different means but close variances, making the value of B was near of zero. With the to slide of window, the instant of mean variation was inside left-side window, and the result of the Bartlett’s test presented similar behavior to that described above for the right-side. Thus, the Bartlett’s test presented two peaks for each mean variation, however not in the exact instant. Through Tab.5, it is possible to note that the z-test only was not able to identify a change point of mean, the other points were identified with accuracy. The run test and trend test classified correctly the segment 6 of the Fig.2(f) as nonstationary. To the test case 3 only the results of the Bartlett’s test were used in the segmentation, because the standard deviation of z was greater than the half of the its mean. Thus, as can be observed in the Fig.3 or Tab.6, two changes points were not selected. Through the Fig.3(c) it is possible to note that the test presented a small peak only to the point between original segments 1 and 2 which have variances equal to 7.971 and 5.961, respectively, however this peak was smaller than threshold of 0.06, and so this point was not selected. Between the segments 2 and 3 not exist peak, this happened because the variance of the two segments are very close. In the Fig.3(b) is possible to note that the algorithm not correctly classified the last segment, already the classification of the first segment is correct. ISBN: 978-85-8001-069-5 The Fig.4 shows the segmentation result to two signals recorded simultaneously. The signals of the Fig.4(a) is a biological time series consisting the response of a spiking neuron, the FCO sensory neuron, to stimulation signal applied to FCO apodeme shown in the Fig.4(b). To segment these signals the values were set as L = 5000, ∆ = 20000, l = 3000 and it was adopted a significance level of α = 5% for variables z, B, r and A. The change points selected by the z-test and the Bartlett’s test for two signals are shown in the Tab.7 and Tab.8. The points are shown in samples in the tables and in the Fig.4 the points are converted to seconds using the sampling frequency of 24 kHz used to record the signals. Table 7: Change points selected by tests to the biological series. z-test 1 30502 67140 117612 159735 335926 406827 612985 699904 Bartlett’s test 1 30476 67146 117604 612779 699904 Final 1 30476 67146 117612 1159735 335926 406827 612779 699904 Table 8: Change points selected by tests to the stimulation signal. z-test 1 699904 Bartlett’s test 1 30288 67291 117529 612579 699904 Final 1 30288 67291 117529 612579 699904 As can be observed, the stimulation signal is similar to the series of the case test 3, in which only the variance of series varies abruptly, thus to segmented the signal it was used only the Bartlett’s test. The segment 4 of the Fig.4(a) was classified as non-stationary due the clear variation of the amplitude of the spikes. The segments 1, 3488 Anais do XIX Congresso Brasileiro de Automática, CBA 2012. Figure 4: Biological signal: S = stationary and N = non-stationary. (a) Response of the spiking neuron to stimulus of (d). (b) Stimulus of 100 Hz applied to FCO apodeme. 3 and 5 of the Fig.4(b) present fluctuating trends therefore are considered as non-stationary. The points between the segments 4-5, 5-6 and 6-7 of the biological series were selected because in this points there is a increase or decrease of the frequency of spikes. 4 Conclusions This paper shown that the z-test and Bartlett’s test are very effective in to detect, respectively, abrupt changes in the mean and variance of a time series. However, it is important to note that the changes points selected by z-test are more accurate than the points selected by Bartlett’s test, this can be confirmed by the Tab.4 and Tab.5 where can be seen that the z-test selected the change points nearest the real change points. The run test and trend test used to verify the stationarity of the segments formed by two previous tests proved be very sensitive to smooth statistical changes, because they were able to identify both the linear trend and the fluctuating trend that the z-test and Bartlett’s test can not detect. Thus, this algorithm can be used as an auxiliary tool to future analysis of biological signals and it will be responsible for identifying the segments for which the statistical functions can be used correctly. Acknowledgments This work was supported by resources supplied by the Center for Scientific Computing (NCC/GridUNESP) of the São Paulo State University (UNESP). We also thanks the Brazilian National Council for Scientific and Technological Development (CNPq) by a studentship. References Bastos, J. L. D. and Duquia, R. P. (2007). Medidas de dispersão: os valores estão próximos entre si ou variam muito?, Scientia Medica 17(1): 40–44. [online] Available at: http://revistaseletronicas.pucrs.br/ ISBN: 978-85-8001-069-5 ojs/index.php/scientiamedica. Accessed 15-april-2012. Bendat, J. and Piersol, A. (1966). Measurement and analysis of random data, Wiley. Dalcin, L. (2012). Mpi for python, [online] Available at: http://mpi4py.scipy.org/ docs/usrman/index.html. Accessed 14april-2012. Fukuda, K., Stanley, H. E. and Amaral, L. A. N. (2004). Heuristic segmentation of a nonstationary time series, Physical Review E 69(2). Hung, J. F. (1981). Digital processing of nonstationary signals, Master’s thesis, McMaster Universtiy. Kondoh, Y., Okuma, J. and Newland, P. L. (1995). Dynamics of neurons controlling movements of a locust hind leg: Wiener kernel analysis of the responses of proprioceptive afferents., Journal of Neurophysiology 73(5): 1829–1842. Lopatka, M., Laplanche, C., Adam, O., Motsch, J.-f. and Zarzycki, J. (2005). Non-stationary time-series segmentation based on the schur prediction error analysis, IEEESP 13th Workshop on Statistical Signal Processing 2005 2: 2–6. Miller, I., Freund, J. and Johnson, R. (1990). Probability and statistics for engineers, Prentice Hall. Papoulis, A. and Pillai, S. (2002). Probability, random variables, and stochastic processes, McGraw-Hill electrical and electronic engineering series, McGraw-Hill. Snedecor, G. W. and Cochran, W. (1989). Statistical Methods, number v. 276 in Statistical Methods, Iowa State University Press. Velis, D. R. (2007). Statistical segmentation of geophysical log data, Mathematical Geology 39(4): 409–417. 3489