
Recurrence Plots and Cross Recurrence Plots

Common pitfalls

Selected discussion from N. Marwan: How to avoid potential pitfalls in recurrence plot based data analysis, International Journal of Bifurcation and Chaos, 21(4), 1003-1017 (2011). DOI:10.1142/S0218127411029008

Indicators of determinism – Artifacts based on time delay embedding

The length of a diagonal line in the RP corresponds to the time during which the system evolves very similarly to its evolution at another time, i.e., a segment of the phase space trajectory runs parallel to, and within an \(\varepsilon\)-tube of, another segment of the phase space trajectory. Deterministic systems are often characterised by repeated similar state evolution (corresponding to local predictability), yielding a large number of diagonal lines in the RP. In contrast, systems with independent subsequent values, like white noise, have RPs with mostly single points. Therefore, the fraction of recurrence points forming such diagonal lines (of length \(l \ge l_{\min}\)) $$ DET = \frac{\sum_{l=l_{\min}}^N l P(l)}{\sum_{l=1}^N l P(l)} $$ can be calculated, where \(P(l)\) is the number of diagonal lines of length \(l\). In RQA this measure is called determinism.

This measure can be loosely interpreted as an indication of determinism in the data. But we should be careful in using the term determinism in a more general or mathematical sense. In a deterministic system we can calculate exactly the same state from given initial conditions, i.e., there is no stochastic process involved. Different methods can be used to test for determinism in time series, e.g., a combined modelling-surrogate approach (Small & Tse, 2003) or an analysis of the directionality of the phase space trajectory (Kaplan & Glass, 1992). High values of \(DET\) might be an indication of determinism in the studied system, but a high \(DET\) is just a necessary condition, not a sufficient one.

Even for non-deterministic processes we can find longer diagonal lines in the RP, resulting in increased \(DET\) values. For example, the following (non-deterministic) auto-regressive process \( x_i = 0.8 x_{i-1} + 0.3 x_{i-2} - 0.25 x_{i-3} + 0.8 \xi \) (where \(\xi\) is white Gaussian noise) has a \(DET\) value of \(0.6\) (embedding dimension \(m=4\), delay \(\tau = 4\), and fixed recurrence rate of \(0.1\)). As shown in Thiel et al. (2003), stochastic processes can have RPs containing longer diagonal lines just by chance (although this is very rare). In addition, embedding introduces correlations in the RP, so even uncorrelated data (e.g., from a white noise process) have spurious diagonal lines (Thiel et al., 2006; Marwan et al., 2007) (Fig. 1). Data pre-processing like low-pass filtering (smoothing) is also frequently used, and such pre-processing can introduce spurious line structures in the RP as well.

Therefore, we have to be careful in inferring from just a high value of the RQA measure \(DET\) that the studied system is deterministic. For such a conclusion we need at least one further criterion included in the RP: the directionality of the trajectory (Kaplan & Glass, 1992). One possible solution is to use iso-directional RPs (Horai et al., 2002) or perpendicular RPs (Choi et al., 1999); if the measure then reaches \(DET \approx 1\) for a very small recurrence density (i.e., \(RR<0.05\)), the underlying system will be a deterministic one (like a periodic or chaotic system).
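The definition of \(DET\) above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation used for the values quoted in the text: the helper names (`embed`, `recurrence_matrix`, `diagonal_line_lengths`, `determinism`), the maximum norm, the series length, the random seed, and the exclusion of the main diagonal are all choices made here. The \(DET\) value it prints for the auto-regressive process will therefore vary with the realisation and need not equal \(0.6\).

```python
import random

def embed(x, m, tau):
    """Time-delay embedding: vectors (x[i], x[i+tau], ..., x[i+(m-1)*tau])."""
    n = len(x) - (m - 1) * tau
    return [[x[i + k * tau] for k in range(m)] for i in range(n)]

def recurrence_matrix(vectors, rr):
    """Binary recurrence matrix (maximum norm, an assumption). The threshold is
    the rr-quantile of all off-diagonal distances, so the recurrence rate is
    approximately rr."""
    n = len(vectors)
    dist = [[max(abs(a - b) for a, b in zip(vectors[i], vectors[j]))
             for j in range(n)] for i in range(n)]
    upper = sorted(dist[i][j] for i in range(n) for j in range(i + 1, n))
    eps = upper[int(rr * len(upper))]
    return [[1 if d <= eps else 0 for d in row] for row in dist]

def diagonal_line_lengths(R):
    """Lengths of all diagonal lines above the main diagonal (LOI excluded).
    The RP is symmetric, so the upper triangle is sufficient for DET."""
    n = len(R)
    lengths = []
    for k in range(1, n):          # k-th diagonal above the LOI
        run = 0
        for i in range(n - k):
            if R[i][i + k]:
                run += 1
            elif run:
                lengths.append(run)
                run = 0
        if run:
            lengths.append(run)
    return lengths

def determinism(lengths, l_min=2):
    """Fraction of recurrence points that lie on lines of length >= l_min."""
    total = sum(lengths)
    return sum(l for l in lengths if l >= l_min) / total if total else 0.0

# One realisation of the auto-regressive process from the text.
random.seed(1)                     # arbitrary seed, for reproducibility only
x = [0.0, 0.0, 0.0]
for _ in range(500):
    x.append(0.8 * x[-1] + 0.3 * x[-2] - 0.25 * x[-3] + 0.8 * random.gauss(0, 1))
x = x[100:]                        # drop the initial transient

R = recurrence_matrix(embed(x, m=4, tau=4), rr=0.1)
det_ar = determinism(diagonal_line_lengths(R), l_min=2)
print(f"DET of the stochastic AR process: {det_ar:.2f}")
```

The point of the sketch is only that a purely stochastic process yields a \(DET\) value well above zero; the number itself says nothing about determinism.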

Figure 1:  (A) Recurrence plot of one realisation of Gaussian white noise, calculated using embedding dimension \(m=6\), delay \(\tau=1\), and a recurrence threshold of \(\varepsilon=0.2\). The embedding causes a number of long lines. (B) Correlation between a single recurrence point at \((15,30)\) and other recurrence points in the RP of white noise, demonstrating how embedding spuriously creates long diagonal lines (estimated from 1,000 realisations). (C) The histogram of line lengths found in the RP shown in (A). The maximum length is \(L_{\max}=17\), a value that would not be uncommon for a deterministic process.
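The embedding effect illustrated in Fig. 1 can be reproduced qualitatively with a small experiment: the same white-noise realisation is analysed once without embedding (\(m=1\)) and once with \(m=6\), \(\tau=1\). The sketch below is an illustration only. It assumes the maximum norm and fixes the recurrence rate at \(0.05\) for both cases (rather than the fixed \(\varepsilon=0.2\) of the figure) so that the two RPs are comparable; the function names, series length, and seed are hypothetical choices.

```python
import random

def line_lengths(x, m, tau, rr):
    """Diagonal line lengths (upper triangle, LOI excluded) of an RP built
    with maximum norm; the threshold is set by the target recurrence rate."""
    n = len(x) - (m - 1) * tau
    vec = [[x[i + k * tau] for k in range(m)] for i in range(n)]
    # dist[i][j - i - 1] holds the distance between vectors i and j > i
    dist = [[max(abs(a - b) for a, b in zip(vec[i], vec[j]))
             for j in range(i + 1, n)] for i in range(n)]
    flat = sorted(d for row in dist for d in row)
    eps = flat[int(rr * len(flat))]
    lengths = []
    for k in range(1, n):                  # k-th diagonal above the LOI
        run = 0
        for i in range(n - k):
            if dist[i][k - 1] <= eps:      # matrix entry (i, i + k)
                run += 1
            elif run:
                lengths.append(run)
                run = 0
        if run:
            lengths.append(run)
    return lengths

def det_and_lmax(lengths, l_min=2):
    """Return (DET, L_max) for a non-empty list of diagonal line lengths."""
    total = sum(lengths)
    det = sum(l for l in lengths if l >= l_min) / total
    return det, max(lengths)

random.seed(2)                             # arbitrary seed
noise = [random.gauss(0, 1) for _ in range(300)]

det_plain, lmax_plain = det_and_lmax(line_lengths(noise, m=1, tau=1, rr=0.05))
det_emb, lmax_emb = det_and_lmax(line_lengths(noise, m=6, tau=1, rr=0.05))

print(f"m=1: DET={det_plain:.2f}, L_max={lmax_plain}")
print(f"m=6: DET={det_emb:.2f}, L_max={lmax_emb}")
```

With \(\tau=1\), neighbouring embedding vectors share \(m-1\) of their \(m\) components, so a recurrence at \((i,j)\) makes a recurrence at \((i+1,j+1)\) much more likely. This is why the embedded case shows the larger \(DET\) and \(L_{\max}\) although the data are uncorrelated.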

Indicators of periodic systems

As explained in the previous section, deterministic systems cause a high value in the RQA measure \(DET\). This measure has been successfully used to detect transitions in the dynamics of complex systems. A frequently used example to demonstrate this ability is the study of the different dynamical regimes of the logistic map, where \(DET\) is able to detect the periodic windows (indicated by values of \(DET = 1\)). Therefore, it is often claimed that this measure is able to detect chaos-period transitions.

However, we can also find such high \(DET\) values for non-periodic, chaotic systems. For example, the Roessler system exhibits a transition from periodic to chaotic states in the parameter interval \(c \in [35, 45]\) (Fig. 2A). But due to the smooth phase space trajectory and the high sampling frequency (sampling time \(\Delta t = 0.1\)), the RP for the chaotic trajectory consists almost exclusively of diagonal line structures (Fig. 3), resulting in a high value of \(DET\), i.e., \(DET \approx 1\) (Fig. 2B).

Figure 2:  (A) 1st and 2nd positive Lyapunov exponents of the Roessler oscillator with parameters \(a=b=0.25\) and \(c \in [35, 45]\). A periodic window occurs between \(c=36.56\) and \(c=37.25\). (B) However, the \(DET\) measure reveals an almost constant, very high value of approximately \(DET=0.94\). Used RP parameters: dimension \(m=3\), delay \(\tau=6\), adaptive recurrence threshold to ensure \(RR = 0.05\).
Figure 3:  Recurrence plot of the Roessler oscillator with parameters \(a=b=0.25\) and \(c = 40\). For these parameters, the Roessler system is in a chaotic regime (\(\lambda_1 = 0.14\)), but the RP consists almost only of diagonal lines (same parameters as in Fig. 2).

A very high value of \(DET\) is not a clear or even sufficient indication of a periodic system. High values can be caused by very smooth phase space trajectories. This should also be considered when looking for indications of unstable periodic orbits (UPOs), where \(DET\) or mean and maximal line lengths \(L\) and \(L_\max\) may not be sufficient. A solution could be to increase the minimal length \(l_\min\) of a diagonal recurrence structure which is considered to be a line. However, a better solution is to look at the cumulative distribution of the diagonal line lengths and estimate the \(K_2\) entropy (but this requires much longer time series). Recent work has shown that measures coming from complex network theory, like clustering coefficient, applied to recurrence matrices are more powerful and reliable for the detection of periodic dynamics (Zou et al., 2010).

Indicators of chaos

The RP visualises the recurrence structure of the considered system (based on the phase space trajectory). The basic idea behind RPs comes, in general, from the study of chaos, so the RP can be considered a nonlinear tool for data analysis. But this origin alone does not justify interpreting complex structures in the RP, or high values of RQA measures, as indicators of chaos or nonlinearity in the dynamical system.

As mentioned above, uncorrelated stochastic systems have mostly short or almost no diagonal line structures in their RPs, whereas deterministic and regular systems, like periodic processes, have mostly long and continuous diagonal line structures. Chaotic processes also have diagonal lines, but shorter ones, and can have single recurrence points. Nevertheless, from the appearance of an RP alone it is difficult (almost impossible) to infer the type of dynamics; only periodic and white noise processes can be identified with some certainty.

The alternative is to look at the RQA measures quantifying the structures in an RP which are related to some dynamical characteristics of the system. As diagonal lines in the RP correspond to parallel running trajectory segments, it is clear that the length of these lines is related to the divergence behaviour of the dynamical system. The divergence rate of phase space trajectories is measured by the Lyapunov exponent. In fact, the lengths of the diagonal lines are directly related to dynamical invariants such as the \(K_2\) entropy or the \(D_2\) correlation dimension (Faure & Korn, 1998; Thiel et al., 2004). The \(K_2\) entropy is a lower bound for the sum of the positive Lyapunov exponents.
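In formulas, the relations mentioned in the preceding paragraph can be summarised as follows. The notation is an assumption made here and should be checked against the cited works (Faure & Korn, 1998; Thiel et al., 2004): \(P_c(l)\) denotes the cumulative distribution of diagonal lines of at least length \(l\), and \(\Delta t\) the sampling time. For sufficiently long lines the cumulative distribution is expected to scale as $$ P_c(l) \sim \varepsilon^{D_2} \exp\left(-K_2 \, \Delta t \, l\right), $$ so that \(K_2\) can be estimated from the slope of \(\ln P_c(l)\) versus \(l\), while the bound stated above reads $$ K_2 \le \sum_{\lambda_i > 0} \lambda_i . $$ Note that this uses the whole tail of the line length distribution, not a single line length.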

For example, RQA measures based on the length of the diagonal lines, like determinism \(DET\) and mean line length \(L\), also depend on the type of the dynamics of the system (rather low values for uncorrelated stochastic (white noise) systems, higher values for more regular, correlated and also chaotic systems). It has been suggested to measure the length of the longest diagonal line \(L_{\max}\) and interpret its inverse \(DIV = \frac{1}{L_{\max}}\) as an estimator of the maximal Lyapunov exponent (Trulla et al., 1996). However, this interpretation carries a high risk of erroneous conclusions derived from RQA.

First, the main diagonal in the RP (i.e., the line of identity, LOI) is naturally the longest diagonal line, which is why it is usually excluded from the analysis. However, due to the tangential motion of the phase space trajectory, subsequent phase space vectors are often also considered as recurrence points (known as sojourn points) (Marwan et al., 2007). Tangential motion becomes even more influential for highly sampled or smooth systems. These recurrence points lead to further continuous diagonal lines directly next to the LOI. Without excluding an appropriate corridor along the LOI (the Theiler window), \(L_{\max}\) will be artificially large (\(\approx N\)) and \(DIV\) too small.
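The effect of the Theiler window on \(L_{\max}\) and \(DIV\) can be checked on a smooth, densely sampled signal. The following sketch is an illustration under assumptions made here: no embedding, a fixed threshold \(\varepsilon = 0.2\), a quasi-periodic test signal, and a Theiler window of 20 samples. The function name `longest_diagonal` is hypothetical.

```python
import math

def longest_diagonal(x, eps, theiler=0):
    """Length of the longest diagonal line in the (unembedded) RP of x.
    Diagonals with |i - j| <= theiler are skipped; theiler=0 excludes
    only the line of identity."""
    n = len(x)
    longest = 0
    for k in range(theiler + 1, n):        # k-th diagonal above the LOI
        run = 0
        for i in range(n - k):
            if abs(x[i] - x[i + k]) <= eps:
                run += 1
                longest = max(longest, run)
            else:
                run = 0
    return longest

# Smooth, densely sampled test signal: consecutive samples differ by less
# than 0.09, which is well below the threshold eps = 0.2.
n = 300
x = [math.sin(0.05 * i) + 0.5 * math.sin(0.05 * math.sqrt(2) * i)
     for i in range(n)]
eps = 0.2

lmax_raw = longest_diagonal(x, eps, theiler=0)
lmax_tw = longest_diagonal(x, eps, theiler=20)

for label, lmax in (("LOI excluded only", lmax_raw), ("Theiler window 20", lmax_tw)):
    div = 1 / lmax if lmax else float("inf")
    print(f"{label}: L_max = {lmax}, DIV = {div:.4f}")
```

Because every pair of consecutive samples is closer than \(\varepsilon\), the diagonal directly next to the LOI is one unbroken line of length \(N-1\). Here \(L_{\max}\) reflects only the sampling rate, not the divergence of trajectories, until the corridor along the LOI is excluded.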

Second, as explained above, even white noise can have long diagonal lines, leading to a small \(DIV\) value just by chance (Fig. 1). Although the probability of such long lines occurring is rather small, the probability that lines of length two occur in RPs of stochastic processes is, on the contrary, rather high. A single line of length two is enough to get a finite value of \(DIV\), which might be misinterpreted as a finite Lyapunov exponent, suggesting that the system is chaotic instead of stochastic.
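How easily a stochastic process produces a finite \(DIV\) can be seen from a short experiment with uniform white noise. This is a sketch only: the series length, the threshold \(\varepsilon = 0.1\), the seed, and the variable names are assumptions made here, and no embedding is used.

```python
import random

random.seed(3)                      # arbitrary seed
n = 200
x = [random.random() for _ in range(n)]   # uniform white noise in [0, 1)
eps = 0.1

# Longest diagonal line above the main diagonal (LOI excluded).
l_max = 0
for k in range(1, n):
    run = 0
    for i in range(n - k):
        if abs(x[i] - x[i + k]) <= eps:
            run += 1
            l_max = max(l_max, run)
        else:
            run = 0

div = 1 / l_max if l_max else float("inf")
print(f"white noise: L_max = {l_max}, DIV = {div:.3f}")
```

The printed \(DIV\) is finite although the data contain no deterministic dynamics at all, so a finite \(DIV\) on its own cannot be read as a positive Lyapunov exponent.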

Therefore, we have to be careful in interpreting the RQA measures themselves as indicators of chaos. Moreover, such a conclusion cannot be drawn by applying a simple surrogate test where the data points are simply shuffled (such a test would only destroy the correlation structure within the data and, thus, the frequency information).
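What a shuffle surrogate does to the data can be made concrete with the auto-regressive process used earlier on this page. The sketch below is an illustration with hypothetical names and an arbitrary seed: it compares the lag-1 autocorrelation before and after shuffling and confirms that the amplitude distribution is untouched.

```python
import random

def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag))
    return cov / var

# Linear stochastic AR process from the text (one realisation).
rng = random.Random(4)              # arbitrary seed
x = [0.0, 0.0, 0.0]
for _ in range(2100):
    x.append(0.8 * x[-1] + 0.3 * x[-2] - 0.25 * x[-3] + 0.8 * rng.gauss(0, 1))
x = x[100:]                         # drop the initial transient

shuffled = x[:]
rng.shuffle(shuffled)               # shuffle surrogate: same values, random order

print(f"lag-1 autocorrelation, original: {autocorr(x, 1):.2f}")
print(f"lag-1 autocorrelation, shuffled: {autocorr(shuffled, 1):.2f}")
```

A shuffle surrogate therefore differs from the original in its linear correlation structure. Rejecting it only shows that the data are not uncorrelated noise, which a linear stochastic process such as this one already achieves without any chaos or nonlinearity.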

RP or RQA alone cannot be used to infer nonlinearity from a time series. For this purpose, advanced surrogate techniques are more appropriate (Schreiber & Schmitz, 2000; Rapp et al., 2001).

Significance of RQA measures

When analysing time series by a windowed RQA, an important question is how significant the variation of the RQA measures is. A sub-optimal scaling of the plotted RQA measures can lead to the wrong conclusion that the studied system has changed its regime or that it is nonstationary (Fig. 4A, B). Therefore, it is strongly recommended to cross-check the scaling of the presentation and to present confidence intervals (Fig. 4C, D). Confidence intervals can be calculated in various ways, but we should avoid deriving them by simply shuffling the original data. One approach could be a bootstrap resampling of the line structures in the RP (Marwan et al., 2013). Another approach fits the probability of serial dependences (diagonal lines) to a binomial distribution (Hirata et al., 2011). Whatever approach we choose, the estimation of the confidence intervals is not a trivial task, but in the future the standard software for RQA should include such tests.
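The idea of bootstrapping the line structures can be sketched as follows. This is a simplified stand-in and not the exact procedure of Marwan et al. (2013): it resamples the observed diagonal line lengths of a single window with replacement and takes percentiles of the resulting \(DET\) values. The window length of 250, \(\varepsilon = 0.3\) and the absence of embedding follow the setting of Fig. 4; the function names, the number of bootstrap samples, the 95% level, and the seed are assumptions made here.

```python
import random

def diagonal_line_lengths(x, eps):
    """Diagonal line lengths above the LOI in the unembedded RP of x."""
    n = len(x)
    lengths = []
    for k in range(1, n):
        run = 0
        for i in range(n - k):
            if abs(x[i] - x[i + k]) <= eps:
                run += 1
            elif run:
                lengths.append(run)
                run = 0
        if run:
            lengths.append(run)
    return lengths

def determinism(lengths, l_min=2):
    """Fraction of recurrence points on lines of length >= l_min."""
    total = sum(lengths)
    return sum(l for l in lengths if l >= l_min) / total if total else 0.0

def bootstrap_interval(lengths, rng, n_boot=400, alpha=0.05, l_min=2):
    """Percentile interval of DET from resampling line lengths with replacement."""
    stats = sorted(
        determinism(rng.choices(lengths, k=len(lengths)), l_min)
        for _ in range(n_boot)
    )
    lo = stats[int(alpha / 2 * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

rng = random.Random(5)              # arbitrary seed
x = [0.0, 0.0, 0.0]
for _ in range(350):
    x.append(0.8 * x[-1] + 0.3 * x[-2] - 0.25 * x[-3] + 0.8 * rng.gauss(0, 1))
window = x[-250:]                   # one RQA window of length w = 250

lengths = diagonal_line_lengths(window, eps=0.3)
det = determinism(lengths)
lo, hi = bootstrap_interval(lengths, rng)
print(f"DET = {det:.3f}, bootstrap interval = [{lo:.3f}, {hi:.3f}]")
```

Repeating this for every window gives a band around the \(DET\) curve; variations that stay inside the band should not be read as a change of regime.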

Figure 4:  Two exemplary RQA measures, (A, C) determinism \(DET\) and (B, D) laminarity (\(LAM\)), of the auto-regressive process. (A, B) The scaling of the \(y\)-axis suggests a strong variation in the RQA measures, which can lead to wrong conclusions. (C, D) Considering a 5% confidence interval of the RQA measures (details can be found in Marwan et al. (2013)) and a better value range for the \(y\)-axis, we cannot infer that the values of the RQA measures shown in (A) and (B) vary significantly. The RQA is calculated using a window size of \(w=250\) and a window step of \(ws= 20\), using maximum norm, \(\varepsilon = 0.3\) and without embedding (the RQA time point is set to the centre of the RQA window). \(LAM\) is the fraction of recurrence points forming vertical lines in an RP (analogous to \(DET\) for the diagonal lines).

A common statement on recurrence analysis is that it is useful for analysing short data series. But we have to ask: how short is short? The required length for the estimation of dynamical invariants will be discussed in the following subsection. When applying RQA we should be aware that the RQA measures are statistical measures (like an average) and need some minimal length of data before a variation can be considered significant.

Creative Commons License © 2000-2024 SOME RIGHTS RESERVED
The material of this web site is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.0 Germany License.

Please respect the copyrights! The content of this web site is protected by a Creative Commons License. You may use the text or figures, but you have to cite this source (www.recurrence-plot.tk) as well as N. Marwan, M. C. Romano, M. Thiel, J. Kurths: Recurrence Plots for the Analysis of Complex Systems, Physics Reports, 438(5-6), 237-329, 2007.
