---
title: "GPoM : 7 Retro-modelling"
author: "Sylvain Mangiarotti & Mireille Huc"
date: "`r Sys.Date()`"
output:
  pdf_document:
    number_sections: no
    latex_engine: xelatex
vignette: >
  %\VignetteIndexEntry{GPoM : 7 Retro-modelling}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
load package: "`r library(GPoM)`"
---

## Detection of miscellaneous chaotic systems
  
The question underlying the expression retro-modelling
is the following: is it possible to retrieve
the original formulation of a given dynamical system
from a set of observed time series generated by
this system  (and without any extra information)?
To answer this question, the RÃ¶ssler system^[O. RÃ¶ssler,
An Equation for Continuous Chaos, *Physics Letters*,
**57A**(5), 1976, 397-398.] was extensively used
in the previous vignettes, in particular in vignettes
`3 Modelling` and `6 Sensitivity`, to show
that the global modelling is very efficient to retrieve
this system under various conditions
of noise, subsampling, short length time series, etc.
In the present vignette, other three-dimensional chaotic
systems are considered in order to check the
generality of the technique.
Seventeen other systems are considered in the present
vignette which equations can be loaded as follows:

```{r load1, eval=TRUE}
  data("allMod_nVar3_dMax2")
  names(allMod_nVar3_dMax2)
```

These systems were used to produce time series.
These time series can also be directly loaded as follows:

```{r load2, eval=TRUE}
  data("TSallMod_nVar3_dMax2")
```

In the following experiments, these time series are used
as an input of the `GPoM` package to check if
the sets of original equations can be retrieved.


## The NosÃ©-Hoover-1986 system
The NosÃ©-Hoover system was discovered by
ShÅ«ichi NosÃ©^[S. NosÃ©, A molecular dynamics method
for simulations in the canonical ensemble,
*Molecular Physics*, **52** (2), 255-268, 1984.]
and William Hoover^[W. G. Hoover, Nonlinear conductivity
and entropy in the two-body Boltzmann gas,
*Journal of Statistical Physics*, **42**(3/4), 587-600, 1986.]
in 1986 (it was rediscovered by Sprott in 1994, known as Sprott-A).
This system reads

```{r visuNH86, eval=TRUE}
visuEq(K = allMod_nVar3_dMax2$NH86,  substit = 1)
```

### Data

This system is particularly interesting because,
despite their deterministic couplings, the three variables
are characterized by quite different signals:
```{r plotNH86, eval=TRUE, fig.align='center'}
  TS <- TSallMod_nVar3_dMax2$NH86$reconstr
  plot(TS[,1], TS[,2], type = 'l', ylim = c(-6.2,5.2), xlab = 't', ylab = '', 
       main = 'NosÃ©-Hoover', col = 'blue')
  lines(TS[,1], TS[,3], type = 'l',  col = 'red')
  lines(TS[,1], TS[,4], type = 'l',  col = 'green')
  legend(10,-3.5, c("x(t)", "y(t)", "z(t)"), col=c('blue', 'red', 'green'), lty=1, cex = 0.8)
```

And the three variables are almost completely decorrelated.
```{r corrNH86, eval=FALSE}
  # Compute the correlation
    cor(TS[,2:4])
```
Indeed, correlation does not exceed ~0.142
(the maximum correlation is observed between $x$ and $z$),
and it is almost zero between $x$ and $y$ (~0.025)
and between $y$ and $z$ (~0.016).
Obviously, for such a type of dynamic, the correlation
may be particulary misleading information for whom would
like to detect the existence of a causal coupling.

### Global modelling

The `GPoM` package ^[S. Mangiarotti, 2015. Low dimensional chaotic models
for the plague epidemic in Bombay, *Chaos, Solitons & Fractals*,
**81**(A), 184-196.]$^,$ ^[S. Mangiarotti, M. Peyre & M. Huc, 2016.
A chaotic model for the epidemic of Ebola virus disease
in West Africa (2013â€“2016). *Chaos*, **26**, 113112.]
is used to investigate the relations existing between
the three time series $x(t)$, $y(t)$ and $z(t)$ shown above.
A drastic model selection is performed by the `gPoMo`
function when trying to obtain a global model from
a set of observational time series.
The numerical integration is then used to obtain
an optimal model among the few models preselected
by the function:

```{r gpomoNH86, eval=FALSE}
# modelling from time series starting
outNH86 <- gPoMo(data = TS[,2:4], tin = TS[,1], dMax = 2, nS = c(1,1,1),
              show = 1,
              IstepMin = 10, IstepMax = 15000, nPmin = 3, nPmax = 5,
              method = 'rk4')
```

With the only exception of model #6, the preselected models
are either diverging (models #1, #2, #4, #5, #7 and #10)
or converging to a fixed point (#3, #8 and #9).
Equations for model #6 effectively correspond to the original
system:
```{r visuNH86b, eval=FALSE}
visuEq(K = outNH86$models$model6,  substit = 1, approx = 2)
```
```{r visuNH86c, eval=TRUE, echo=FALSE}
visuEq(K = data_vignetteVII$out4model6,  substit = 1, approx = 2)
```

Therefore, despite the absence of correlation observed
between the three variables, the proper formulation is directly
and completely retrieved.
Note that allowing models of larger size (e.g. with `nPmax = 6`),
three other models of nontrivial solution can be obtained
(models #12, #14 and #15).
The model #6 should be preferred from these for its more
parsimonious formulation.

The results obtained for this system are especially
interesting since providing an obvious evidence
that correlation is not a good tool for the detection
of linkages between dynamical variables.
Contrarily, the retro-modelling technique enables
not only to detect the couplings.
Furthermore, it goes far beyound, by making it
possible to retrieve the original and
complete formulation of the coupling between
the three variables.

## The Genesio-Tesi system (1992)
The following differential equations was introduced
by Roberto Genesio and Alberto Tesi in 1992^[R. Genesio & A. Tesi,
Harmonic balance methods for the analysis of chaotic dynamics in nonlinear
systems, *Automatica*, **28**(3), 531-548, 1992.]

$d^3x/dt^3 + a . d^2x/dt^2 + b.dx/dt + x (1 + x) = 0$,

(for a = 0.446 and b = 1.1)
which can be rewritten in the canonical form:

```{r visuGT92, eval=TRUE}
visuEq(K = allMod_nVar3_dMax2$GT92,  substit = 1)
```

The following set of time series $x(t)$, $y(t)$ and $z(t)$
was produced from this system.
```{r plotGT92, eval=TRUE, fig.align='center'}
  TS <- TSallMod_nVar3_dMax2$GT92$reconstr
  plot(TS[,1], TS[,2], type = 'l', ylim = c(-1.1, 0.8), xlab = 't', ylab = '', 
       main= 'Genesio-Tesi (1992)', col = 'blue')
  lines(TS[,1], TS[,3], type = 'l', col = 'red')
  lines(TS[,1], TS[,4], type = 'l', col = 'green')
  legend(22,-0.5, c("x(t)", "y(t)", "z(t)"), col=c('blue', 'red', 'green'), lty=1, cex = 0.8)

```

And the global modelling technique was then applied
to this set of time series:

```{r gpomoGT92, eval=FALSE}
# modelling from time series starting
outGT92 <- gPoMo(data = TS[,2:4], tin = TS[,1], dMax = 2, nS = c(1,1,1),
              show = 1,
              IstepMin = 10, IstepMax = 15000, nPmin = 6, nPmax = 6,
              method = 'rk4')
```

Performing the same analysis as for the system NosÃ©-Hoover
(see upper in the text), the formulation of the original
system can be directly retrieved (model #1):
  
```{r visuGT92b, eval=FALSE}
visuEq(K = outGT92$models$model1,  substit = 1, approx = 2)
```
```{r visuGT92c, eval=TRUE, echo=FALSE}
visuEq(K = data_vignetteVII$out5model1,  substit = 1, approx = 2)
```

Note that obtained models of smaller size have
period-1 cycles (these can be obtained by setting
`nPmin = 3`). The complexity of the observed
signal is thus not reproduced by these models
and these can be rejected.
Allowing models of larger size (nPmax = 7), interesting
solutions will be otbained (in terms of phase portrait),
but their formulation will be less concise and the previous
formulation obtained with model #1 should thus be preferred.


## The Sprott systems
Numerous systems were discovered (some rediscovered)
by Julian Clinton Sprott in the 1990s^[J. C. Sprott,
Some simple chaotic flows, *Physical Review E*, **50**(2),
647-650, 1994.]. Several of these are considered
in this section, namely systems
Sprott-F, H, K, O, P, G, M, Q and S.

### The Spott-F system

The Sprott-F system reads:

```{r visuSprF, eval=TRUE}
visuEq(K = allMod_nVar3_dMax2$SprF,  substit = 1)
```

A specific attention was paid uuper to the NosÃ©-Hoover system
for its very low correlation.
The Sprott-F system is interesting for the opposite reason:
its variables are characterized by a high level of correlation
(or anticorrelation) as shown by the following plots
$x(t)$, $y(t)$ and $z(t)$:
```{r plotSprF, eval=TRUE, fig.align='center'}
  TS <- TSallMod_nVar3_dMax2$SprF$reconstr
  plot(TS[,1], TS[,2], type = 'l', ylim = c(-3.6,4.2),
       xlab = 't', ylab = '', main = 'Sprott-F', col = 'blue')
  lines(TS[,1], TS[,3], type = 'l', col = 'red')
  lines(TS[,1], TS[,4], type = 'l', col = 'green')
  legend(0,4, c("x(t)", "y(t)", "z(t)"), col=c('blue', 'red', 'green'), lty=1, cex = 0.8)

```

Correlation is high between the three variables:
~0.68 between $x(t)$ and $y(t)$,
~0.74 between $x(t)$ and $-z(t)$,
and ~0.77 between  $y(t)$ and $-z(t)$:
```{r corrSprF, eval=FALSE}
  # Compute the correlation
    cor(TS[,2:4])
```

The global modelling technique was applied to this data set
using the `gPoMo` function:

```{r gpomoSprF, eval=FALSE}
# modelling from time series starting
outSprF <- gPoMo(data = TS[,2:4], tin = TS[,1], dMax = 2, nS = c(1,1,1),
              show = 1, IstepMin = 10, IstepMax = 15000, nPmin = 6, nPmax = 6,
              method = 'rk4')
```

Note that allowing a larger range of model size (`nPmin = 3` and `nPmax = 7`),
models solutions can be rejected either because they are diverging or
oversimplified (models #3-14, #16-24, #26-27 and #30-35), or because
they are less parsimonious (models #15, #25, #28 and #29).

The formulation of the original system is thus effectively retrieved:

```{r visuSprFb, eval=FALSE}
visuEq(K = outSprF$models$model5,  substit = 1, approx = 2)
```
```{r visuSprFc, eval=TRUE, echo=FALSE}
visuEq(K = data_vignetteVII$out6model5,  substit = 1, approx = 2)
```

Similarly, results are found to be straightforward for all the Sprott systems.

### The Spott-H system

The Sprott-H system reads
```{r visuSprH, eval=TRUE}
visuEq(K = allMod_nVar3_dMax2$SprH,  substit = 1)
```

Although less marked, the behavior of this system
is more or less similar to Sprott-F in terms of
correlation (or anticorrelation).

```{r plotSprH, eval=TRUE, fig.align='center'}
  TS <- TSallMod_nVar3_dMax2$SprH$reconstr
  plot(TS[,1], TS[,2], type = 'l', ylim = c(-5,5),
       xlab = 't', ylab = '', main = 'Sprott-H', col = 'blue')
  lines(TS[,1], TS[,3], type = 'l',  col = 'red')
  lines(TS[,1], TS[,4], type = 'l',  col = 'green')
  legend(0, -2, c("x(t)", "y(t)", "z(t)"), col=c('blue', 'red', 'green'), lty=1, cex = 0.8)
```

Correlation is high between $x(t)$ and $z(t)$ (~0.77),
and $x(t)$ and $-y(t)$ (~0.63),
and lower between  $y(t)$ and $z(t)$ (~0.38):
```{r corrSprH, eval=FALSE}
  # Compute the correlation
    cor(TS[,2:4])
```

The global modelling technique was applied to this data set:

```{r gpomoSprH, eval=FALSE}
# modelling from time series starting
outSprH <- gPoMo(data = TS[,2:4], tin = TS[,1], dMax = 2, nS = c(1,1,1),
              show = 1,
              IstepMin = 10, IstepMax = 15000, nPmin = 6, nPmax = 6,
              method = 'rk4')
```

The formulation of the original system is directly retrieved:
  
```{r visuSprHa, eval=FALSE}
visuEq(K = outSprH$models$model5,  substit = 1, approx = 2)
```
```{r visuSprHb, eval=TRUE, echo=FALSE}
visuEq(K = data_vignetteVII$out7model5,  substit = 1, approx = 2)
```

No simpler formulation could enable to reproduce the observed dynamic.


### The Spott-K system

The Sprott-K system reads

```{r visuSprK, eval=TRUE}
visuEq(K = allMod_nVar3_dMax2$SprK,  substit = 1)
```

The following data set was used:
```{r plotSprK, eval=TRUE, fig.align='center'}
  TS <- TSallMod_nVar3_dMax2$SprK$reconstr
  plot(TS[,1], TS[,2], type = 'l', ylim = c(-3,3.6), xlab = 't', ylab= '', 
       main = 'Sprott-K', col = 'blue')
  lines(TS[,1], TS[,3], type = 'l', col = 'red')
  lines(TS[,1], TS[,4], type = 'l',  col = 'green')
  legend(0,3.5, c("x(t)", "y(t)", "z(t)"), col=c('blue', 'red', 'green'), lty=1, cex = 0.8)

```

and the global modelling technique was applied.

```{r gpomoSprK, eval=FALSE}
# modelling from time series starting
outSprK <- gPoMo(data = TS[,2:4], tin = TS[,1], dMax = 2, nS = c(1,1,1),show = 1,
              IstepMin = 10, IstepMax = 15000, nPmin = 6, nPmax = 6, method = 'rk4')
```

The formulation of the original system is directly retrieved:
  
```{r visuSprKb, eval=FALSE}
visuEq(K = outSprK$models$model5,  substit = 1, approx = 2)
```
```{r visuSprKc, eval=TRUE, echo=FALSE}
visuEq(K = data_vignetteVII$out8model5,  substit = 1, approx = 2)
```

and no simpler formulation enabling to reproduce the observed dynamic
was obtained.

### The Spott-O system

The Sprott-O system reads

```{r visuSprO, eval=TRUE}
visuEq(K = allMod_nVar3_dMax2$SprO,  substit = 1)
```

The following data set was used:
```{r plotSprO, eval=TRUE, fig.align='center'}
  TS <- TSallMod_nVar3_dMax2$SprO$reconstr
  plot(TS[,1], TS[,2], type = 'l', ylim = c(-1.5,1), main = 'Sprott-O',
       xlab = 't', ylab='', col = 'blue')
  lines(TS[,1], TS[,3], type = 'l', col = 'red')
  lines(TS[,1], TS[,4], type = 'l', col = 'green')
  legend(0,-1, c("x(t)", "y(t)", "z(t)"), col=c('blue', 'red', 'green'), lty=1, cex = 0.8)

```

The global modelling technique was applied.

```{r gpomoSprO, eval=FALSE}
# modelling from time series starting
outSprO <- gPoMo(data = TS[,2:4], tin = TS[,1], dMax = 2, nS = c(1,1,1),
              show = 1,
              IstepMin = 10, IstepMax = 15000, nPmin = 6, nPmax = 6,
              method = 'rk4')
```

The formulation of the original was directly retrieved:
  
```{r visuSprOb, eval=FALSE}
visuEq(K = outSprO$models$model2,  substit = 1, approx = 2)
```
```{r visuSprOc, eval=TRUE, echo=FALSE}
visuEq(K = data_vignetteVII$out9model2,  substit = 1, approx = 2)
```

and no simpler formulation enabling to reproduce the observed dynamic
was obtained.

### The Spott-P system

The Sprott-P system reads

```{r visuSprP, eval=TRUE}
visuEq(K = allMod_nVar3_dMax2$SprP,  substit = 1)
```

The following data set was used:
```{r plotSprP, eval=TRUE, fig.align='center'}
  TS <- TSallMod_nVar3_dMax2$SprP$reconstr
  plot(TS[,1], TS[,2], type = 'l', ylim = c(-1.2,2.2), main='Sprott-P',
       xlab = 't', ylab = '', col = 'blue')
  lines(TS[,1], TS[,3], type = 'l', col = 'red')
  lines(TS[,1], TS[,4], type = 'l', col = 'green')
  legend(8,2, c("x(t)", "y(t)", "z(t)"), col=c('blue', 'red', 'green'), lty=1, cex = 0.8)

```

The global modelling technique was applied.

```{r gpomoSprP, eval=FALSE}
# modelling from time series starting
outSprP<- gPoMo(data = TS[,2:4], tin = TS[,1], dMax = 2, nS = c(1,1,1), show = 1,
                IstepMin = 10, IstepMax = 15000, nPmin = 6, nPmax = 6, method = 'rk4')
```

The formulation of the original system is directly retrieved:
  
```{r visuSprPb, eval=FALSE}
visuEq(K = outSprP$models$model5,  substit = 1, approx = 2)
```
```{r visuSprPc, eval=TRUE, echo=FALSE}
visuEq(K = data_vignetteVII$out10model5,  substit = 1, approx = 2)
```

No simpler formulation enabling to reproduce the observed dynamic
was obtained.

### The Spott-G system

The Sprott-G system reads

```{r visuSprG, eval=TRUE}
visuEq(K = allMod_nVar3_dMax2$SprG,  substit = 1)
```

The following data set was used:
```{r plotSprG, eval=TRUE, fig.align='center'}
  TS <- TSallMod_nVar3_dMax2$SprG$reconstr
  plot(TS[,1], TS[,2], type = 'l', ylim = c(-2.5,2),
       xlab = 't', ylab = '', main='Sprott-G', col = 'blue')
  lines(TS[,1], TS[,3], type = 'l', col = 'red')
  lines(TS[,1], TS[,4], type = 'l', col = 'green')
  legend(0, 2 , c("x(t)", "y(t)", "z(t)"), col=c('blue', 'red', 'green'), lty=1, cex = 0.8)

```

The global modelling technique was then applied.

```{r gpomoSprG, eval=FALSE}
# modelling from time series starting
outSprG <- gPoMo(data = TS[,2:4], tin = TS[,1], dMax = 2, nS = c(1,1,1),
              show = 1, IstepMin = 10, IstepMax = 15000, nPmin = 3, nPmax = 6, method = 'rk4')
```

The formulation of the original system was directly retrieved:
  
```{r visuSprGb, eval=FALSE}
visuEq(K = outSprG$models$model5,  substit = 1, approx = 2)
```
```{r visuSprGc, eval=TRUE, echo=FALSE}
visuEq(K = data_vignetteVII$out11model5,  substit = 1, approx = 2)
```

No simpler formulation enabling to reproduce the observed dynamic
was obtained.

### The Spott-M system

The Sprott-M system reads

```{r visuSprM, eval=TRUE}
visuEq(K = allMod_nVar3_dMax2$SprM,  substit = 1)
```

The following data set was used:
```{r plotSprM, eval=TRUE, fig.align='center'}
  TS <- TSallMod_nVar3_dMax2$SprM$reconstr
  plot(TS[,1], TS[,2], type = 'l', ylim = c(-5.1,2.8), main='Sprott-M',
       xlab = 't', ylab = '', col = 'blue')
  lines(TS[,1], TS[,3], type = 'l', col = 'red')
  lines(TS[,1], TS[,4], type = 'l', col = 'green')
  legend(2, -3, c("x(t)", "y(t)", "z(t)"), col=c('blue', 'red', 'green'), lty=1, cex = 0.8)
```

The global modelling technique was then applied.

```{r gpomoSprM, eval=FALSE}
# modelling from time series starting
outSprM <- gPoMo(data = TS[,2:4], tin = TS[,1], dMax = 2, nS = c(1,1,1),
              show = 1,
              IstepMin = 10, IstepMax = 15000, nPmin = 6, nPmax = 6,
              method = 'rk4')
```

The formulation of the original system was directly retrieved:
  
```{r visuSprMb, eval=FALSE}
visuEq(K = outSprM$models$model2,  substit = 1, approx = 2)
```
```{r visuSprMc, eval=TRUE, echo=FALSE}
visuEq(K = data_vignetteVII$out12model2,  substit = 1, approx = 2)
```

No simpler formulation enabling to reproduce the observed dynamic
was obtained.

### The Spott-Q system

The Sprott-Q system reads

```{r visuSprQ, eval=TRUE}
visuEq(K = allMod_nVar3_dMax2$SprQ,  substit = 1)
```

The following data set was used:
```{r plotSprQ, eval=TRUE, fig.align='center'}
  TS <- TSallMod_nVar3_dMax2$SprQ$reconstr
  plot(TS[,1], TS[,2], type = 'l', ylim = c(-7,8.8), main = 'Sprott-Q',
       xlab = 't', ylab = '', col = 'blue')
  lines(TS[,1], TS[,3], type = 'l', col = 'red')
  lines(TS[,1], TS[,4], type = 'l', col = 'green')
  legend(4, 7, c("x(t)", "y(t)", "z(t)"), col=c('blue', 'red', 'green'), lty=1, cex = 0.8)

```

The global modelling technique was applied.

```{r gpomoSprQ, eval=FALSE}
# modelling from time series starting
outSprQ <- gPoMo(data = TS[,2:4], tin = TS[,1], dMax = 2, nS = c(1,1,1), show = 1,
                 IstepMin = 10, IstepMax = 15000, nPmin = 6, nPmax = 6, method = 'rk4')
```

The formulation of the original is directly retrieved:
  
```{r visuSprQb, eval=FALSE}
visuEq(K = outSprQ$models$model2,  substit = 1, approx = 2)
```
```{r visuSprQc, eval=TRUE, echo=FALSE}
visuEq(K = data_vignetteVII$out13model2,  substit = 1, approx = 2)
```

No simpler formulation enabling to reproduce the observed dynamic
was obtained.

### The Spott-S system

The Sprott-S system reads

```{r visuSprS, eval=TRUE}
visuEq(K = allMod_nVar3_dMax2$SprS,  substit = 1)
```

The following data set was used:
```{r plotSprS, eval=TRUE, fig.align='center'}
  TS <- TSallMod_nVar3_dMax2$SprS$reconstr
  plot(TS[,1], TS[,2], type = 'l', ylim = c(-4,2), main = 'Sprott-S',
       xlab = 't', ylab = '', col = 'blue')
  lines(TS[,1], TS[,3], type = 'l', col = 'red')
  lines(TS[,1], TS[,4], type = 'l', col = 'green')
  legend(0,-2.5, c("x(t)", "y(t)", "z(t)"), col=c('blue', 'red', 'green'), lty=1, cex = 0.8)

```

The global modelling technique was applied.

```{r gpomoSprS, eval=FALSE}
# modelling from time series starting
outSprS<- gPoMo(data = TS[,2:4], tin = TS[,1], dMax = 2, nS = c(1,1,1), show = 1,
                IstepMin = 10, IstepMax = 15000, nPmin = 6, nPmax = 6, method = 'rk4')
```

The formulation of the original system is directly retrieved:
  
```{r visuSprSb, eval=FALSE}
visuEq(K = outSprS$models$model5,  substit = 1, approx = 2)
```
```{r visuSprSc, eval=TRUE, echo=FALSE}
visuEq(K = data_vignetteVII$out14model5,  substit = 1, approx = 2)
```

No simpler formulation enabling to reproduce the observed dynamic
was obtained.

## The Lorenz-1963 system

The Lorenz system^[E. N. Lorenz, Deterministic non-periodic flow,
*Journal of the Atmospheric Sciences*, **20**(2), 130â€“141 (1963).]
discovered by Edouard N. Lorenz in 1963 is now considered.
This system reads

```{r visuL63, eval=TRUE}
  visuEq(K = allMod_nVar3_dMax2$L63,  substit = 1, approx = 4)
```

The following data set was considered for the modelling:
```{r plotL63, eval=TRUE, fig.align='center'}
  TS <- TSallMod_nVar3_dMax2$L63$reconstr
  plot(TS[,1], TS[,2], type = 'l', ylim = c(-25,45), xlab = 't', ylab = '', 
       main = 'Lorenz 1963', col = 'blue')
  lines(TS[,1], TS[,3], type = 'l', col = 'red')
  lines(TS[,1], TS[,4], type = 'l', col = 'green')
  legend(8, -5, c("x(t)", "y(t)", "z(t)"), col=c('blue', 'red', 'green'), lty=1, cex = 0.8)

```
The global modelling technique was applied to this set
of series with the hope to retrieve the original equations.

```{r gpomoL63, eval=FALSE}
# modelling from time series starting
outL63 <- gPoMo(data = TS[,2:4], tin = TS[,1], dMax = 2, nS = c(1,1,1), show = 1,
                IstepMin = 10, IstepMax = 1500, nPmin = 6, nPmax = 7, method = 'rk4')
```

Surprisingly, the simplest model able to reproduce the original phase
portrait has six terms, that is, one less term than the original model:
the term  $-y$ is missing in the second equation.
This obtained morel reads

```{r visuL63b, eval=FALSE}
visuEq(K = outL63$models$model5,  substit = 1, approx = 2)
```
```{r visuL63c, eval=TRUE, echo=FALSE}
#data("data_vignetteVII$out1model5")
visuEq(K = data_vignetteVII$out1model5,  substit = 1, approx = 2)
```
This term is not detected because it is of small influence
on the dynamics (actually the dynamic of variable $y$ is more or
less similar to the dynamic of $x$ -- See upper the observed
time series -- and its coefficient is 28 times smaller).
Three 7-parameter models were also obtained, all having a phase space
very similar to the original one. The original model (#18) is
included among them :

```{r visuL63d, eval=FALSE}
visuEq(K = outL63$models$model18,  substit = 1, approx = 2)
```
```{r visuL63e, eval=TRUE, echo=FALSE}
visuEq(K = data_vignetteVII$out1model18,  substit = 1, approx = 2)
```

Since model #5 is more parsomonious, it can be considered that
six of the seven terms of the Lorenz system are thus clearly detected.
A more in-deep investigation might enable the detection of the term
of weaker influence, but this remains unsure.
The phase portraits of four potentially valid models
are very close to the original phase portraits, it may be quite
quite challenging to retrieve the full formulation of the original
equations.


```{r plotLorenzPortraits_a, eval=FALSE}
  plot(TS[,2], TS[,3], type = 'l', xlab = 'x(t)', ylab = 'y(t)', main = 'model #5')
  lines(outL63$stockoutreg$model5[,1], outL63$stockoutreg$model5[,2], 
        type = 'l', col = 'red')
```
```{r plotLorenzPortraits_a2, eval=TRUE, echo=FALSE}
  library(float)
  plot(TS[,2], TS[,3], type = 'l', xlab = 'x(t)', ylab = 'y(t)', main = 'model #5')
  lines(dbl(data_vignetteVII$outL63_stockoutreg_model5[,1]), 
        dbl(data_vignetteVII$outL63_stockoutreg_model5[,2]), type = 'l', col = 'red')
```
```{r plotLorenzPortraits_b, eval=FALSE}
  plot(TS[,2], TS[,3], type = 'l', xlab = 'x(t)', ylab = 'y(t)', main = 'model #15')
  lines(outL63$stockoutreg$model15[,1], outL63$stockoutreg$model15[,2], 
        type = 'l', col = 'red')
```
```{r plotLorenzPortraits_b2, eval=TRUE, echo=FALSE}
  plot(TS[,2], TS[,3], type = 'l', xlab = 'x(t)', ylab = 'y(t)', main = 'model #15')
  lines(dbl(data_vignetteVII$outL63_stockoutreg_model15[,1]), 
        dbl(data_vignetteVII$outL63_stockoutreg_model15[,2]), type = 'l', col = 'red')
```
```{r plotLorenzPortraits_c, eval=FALSE}
  plot(TS[,2], TS[,3], type = 'l', xlab = 'x(t)', ylab = 'y(t)', main = 'model #18')
  lines(dbl(outL63$stockoutreg$model18[,1]), dbl(outL63$stockoutreg$model18[,2]), 
        type = 'l', col = 'red')
```
```{r plotLorenzPortraits_c2, eval=TRUE, echo=FALSE}
  plot(TS[,2], TS[,3], type = 'l', xlab = 'x(t)', ylab = 'y(t)', main = 'model #18')
  lines(dbl(data_vignetteVII$outL63_stockoutreg_model18[,1]), 
        dbl(data_vignetteVII$outL63_stockoutreg_model18[,2]), type = 'l', col = 'red')
```
```{r plotLorenzPortraits_d, eval=FALSE}
  plot(TS[,2], TS[,3], type = 'l', xlab = 'x(t)', ylab = 'y(t)', main = 'model #19')
  lines(dbl(outL63$stockoutreg$model19[,1]), dbl(outL63$stockoutreg$model19[,2]), 
        type = 'l', col = 'red')
```
```{r plotLorenzPortraits_d2, eval=TRUE, echo=FALSE}
  plot(TS[,2], TS[,3], type = 'l', xlab = 'x(t)', ylab = 'y(t)', main = 'model #19')
  lines(dbl(data_vignetteVII$outL63_stockoutreg_model19[,1]), 
        dbl(data_vignetteVII$outL63_stockoutreg_model19[,2]), type = 'l', col = 'red')
```

A very refined validation technique based on
the transition matrices^[Mangiarotti S., Coudret R.,
Drapeau L. & Jarlan L., Global modelsâ€™ formulation and
associated transition matrices for the RÃ¶ssler
system, the electrodissolution of copper in phosphoric
acid and the cycles of rainfed wheat observed from satellite
in North Morocco. *Supplemental Material* to article
â€œPolynomial Model Search and Global Modelling â€“ two new
algorithms for global modelling of chaosâ€ by Mangiarotti S.,
Coudret R., Drapeau L. & Jarlan L., 2012, *Physical Review E*,
**86**(4), 046205.]
may be a powerful tool to reach such a level of precision
for the distinction.


## The Burke and Shaw system (1981)
The system discovered by Bill Burke and Robert
Show^[R. Shaw, Strange attractor, chaotic behavior
and information flow, *Zeitschrift fÃ¼r Naturforsch A*,
**36**, 80-112, 1981]
in 1981 is a Lorenz-like system. This system reads

```{r visuBQ81, eval=TRUE}
visuEq(K = allMod_nVar3_dMax2$BS81,  substit = 1, approx = 4)
```

The following set of variable was used to test
the possibility to retrieve the original set of equations:
```{r plotBQ81, eval=TRUE, fig.align='center'}
  TS <- TSallMod_nVar3_dMax2$BS81$reconstr
  plot(TS[,1], TS[,2], type = 'l', ylim = c(-2,2.2), main = 'Burke-Shaw 1981',
       xlab = 't', ylab = '', col = 'blue')
  lines(TS[,1], TS[,3], type = 'l', col = 'red')
  lines(TS[,1], TS[,4], type = 'l', col = 'green')
  legend(9, 2.2, c("x(t)", "y(t)", "z(t)"), col=c('blue', 'red', 'green'), lty=1, cex = 0.8)
  
```

The global modelling technique was applied:

```{r gpomoBQ81, eval=FALSE}
# modelling from time series starting
outBS81 <- gPoMo(data = TS[,2:4], tin = TS[,1], dMax = 2, nS = c(1,1,1), show = 1,
                 IstepMin = 10, IstepMax = 1500, nPmin = 5, nPmax = 6, method = 'rk4')
```

Here also, the simplest model has one less term than
the original model:
  
```{r visuBQ81b, eval=FALSE}
visuEq(K = outBS81$models$model3,  substit = 1, approx = 2)
```
```{r visuBQ81c, eval=TRUE, echo=FALSE}
visuEq(K = data_vignetteVII$out2model3,  substit = 1, approx = 2)
```

Three 6-parameter models were also obtained which all have a phase space
very similar to the original one. The original model

```{r visuBQ81d, eval=FALSE}
visuEq(K = outBS81$models$model11,  substit = 1, approx = 2)
```
```{r visuBQ81e, eval=TRUE, echo=FALSE}
visuEq(K = data_vignetteVII$out2model11,  substit = 1, approx = 2)
```
is included among them.

A more in-deep investigation will be required to check
if the term of weaker influence can be detected.
At this stage, it can already be observed that the phase
portrait of the model with the proper formulation
(model #11 in red) seems closer to the original
phase space (in black).

```{r plotBS81Portraits_a, eval=FALSE}
  plot(TS[,2], TS[,3], type = 'l', xlab = 'x(t)', ylab = 'y(t)', main = 'model #3')
  lines(outBS81$stockoutreg$model3[,1], outBS81$stockoutreg$model3[,2], 
        type = 'l', col = 'red')
```
```{r plotBS81Portraits_a2, echo=FALSE, eval=TRUE}
  plot(TS[,2], TS[,3], type = 'l', xlab = 'x(t)', ylab = 'y(t)', main = 'model #3')
  lines(dbl(data_vignetteVII$outBS81_stockoutreg_model3[,1]), 
        dbl(data_vignetteVII$outBS81_stockoutreg_model3[,2]), type = 'l', col = 'red')
```
```{r plotBS81Portraits_b, eval=FALSE}
  plot(TS[,2], TS[,3], type = 'l', xlab = 'x(t)', ylab = 'y(t)', main = 'model #9')
  lines(outBS81$stockoutreg$model9[,1], outBS81$stockoutreg$model9[,2], 
        type = 'l', col = 'red')
```
```{r plotBS81Portraits_b2, echo=FALSE, eval=TRUE}
  plot(TS[,2], TS[,3], type = 'l', xlab = 'x(t)', ylab = 'y(t)', main = 'model #9')
  lines(dbl(data_vignetteVII$outBS81_stockoutreg_model9[,1]), 
        dbl(data_vignetteVII$outBS81_stockoutreg_model9[,2]), type = 'l', col = 'red')
```
```{r plotBS81Portraits_c, eval=FALSE}
  plot(TS[,2], TS[,3], type = 'l', xlab = 'x(t)', ylab = 'y(t)', main = 'model #11')
  lines(outBS81$stockoutreg$model11[,1], outBS81$stockoutreg$model11[,2], 
        type = 'l', col = 'red')
```
```{r plotBS81Portraits_c2, echo=FALSE, eval=TRUE}
  plot(TS[,2], TS[,3], type = 'l', xlab = 'x(t)', ylab = 'y(t)', main = 'model #11')
  lines(dbl(data_vignetteVII$outBS81_stockoutreg_model11[,1]), 
        dbl(data_vignetteVII$outBS81_stockoutreg_model11[,2]), type = 'l', col = 'red')
```
```{r plotBS81Portraits_d, eval=FALSE}
  plot(TS[,2], TS[,3], type = 'l', xlab = 'x(t)', ylab = 'y(t)', main = 'model #12')
  lines(outBS81$stockoutreg$model12[,1], outBS81$stockoutreg$model12[,2], 
        type = 'l', col = 'red')
```
```{r plotBS81Portraits_d2, echo=FALSE, eval=TRUE}
  plot(TS[,2], TS[,3], type = 'l', xlab = 'x(t)', ylab = 'y(t)', main = 'model #12')
  lines(dbl(data_vignetteVII$outBS81_stockoutreg_model12[,1]), 
        dbl(data_vignetteVII$outBS81_stockoutreg_model12[,2]), type = 'l', col = 'red')
```

## The Lorenz-1984 system
The Lorenz-1984 system^[E. N. Lorenz, Irregularity: a fundamental property of the atmosphere, *Tellus A*, **36**, 98-110, 1984.] was discovered by E.N. Lorenz in 1984.
This system is much more complex than the systems previously
considered.

```{r visuL84, eval=TRUE}
visuEq(K = allMod_nVar3_dMax2$L84,  substit = 1)
```

It is also a quadratic system, but it includes eleven terms.
The following data set was used for the analysis:
```{r plotL84, eval=TRUE, fig.align='center'}
  TS <- TSallMod_nVar3_dMax2$L84$reconstr
  plot(TS[,1], TS[,2], type = 'l', ylim = c(-2,2.2), main = 'Lorenz 1984',
       xlab = 't', ylab = '', col = 'blue')
  lines(TS[,1], TS[,3], type = 'l', col = 'red')
  lines(TS[,1], TS[,4], type = 'l', col = 'green')
  legend(16, -1, c("x(t)", "y(t)", "z(t)"), col=c('blue', 'red', 'green'), lty=1, cex = 0.8)

```

The global modelling technique was applied to this set
of series.

```{r gpomoL84, eval=FALSE}
# modelling from time series starting
outL84 <- gPoMo(data = TS[,2:4], tin = TS[,1], dMax = 2, nS = c(1,1,1), show = 1,
              IstepMin = 10, IstepMax = 15000, nPmin = 5, nPmax = 11, method = 'rk4')
```

The simplest obtained model has only five terms
```{r visuL84b, eval=FALSE}
visuEq(K = outL84$models$model3,  substit = 1, approx = 2)
```
```{r visuL84c, eval=TRUE, echo=FALSE}
visuEq(K = data_vignetteVII$out3model3,  substit = 1, approx = 2)
```
However, its attractor is periodic and symmetric.
Therefore, it is obviously not a precise approximation
of the observed time series.
The first model showing a high level of complexity
is the model #9. This model has 6 terms
  
```{r visuL84d, eval=FALSE}
visuEq(K = outL84$models$model9,  substit = 1, approx = 2)
```
```{r visuL84e, eval=TRUE, echo=FALSE}
visuEq(K = data_vignetteVII$out3model9,  substit = 1, approx = 2)
```

Although several terms are missing (two in the first equation,
three in the second one), it can be noticed that the detected
terms are effectively present in the original system.

Many other models were obtained (models with 7 terms: #19 and #22;
with 8 terms: #34, #40 and #49; with 9 terms: #55, #64, #76 and #77;
with 10 terms: #83, #92-94, #97-99, #111 and #113; or with 11
terms: #119, #128, #129, #133, #134, #136, #139, #140-142, #147-149
and #155-158).
Most of these models have strongly negative values along
the $x$ abscissa and can thus be rejected.
Only a few of them may be considered as potentially acceptable
(models #9, #55, #77, #113, #119, #141 and #158).
A visual inspection of the phase portraits obviously shows
that the best agreement is obtained with the model #141.
  
```{r visuL84f, eval=FALSE}
visuEq(K = outL84$models$model141,  substit = 1, approx = 2)
```
```{r visuL84g, eval=TRUE, echo=FALSE}
visuEq(K = data_vignetteVII$out3model141,  substit = 1, approx = 2)
```

which formulation effectively corresponds to the original
Lorenz-1984 system.


## The Chlouverakis-Sprott system (2004)

The Chlouverakis-Sprott system was adapted from the Lorenz-1984 system.
It was introduced by Konstantinos Chlouverakis and Julian Sprott
in 2004.^[K. E. Chlouverakis & J. C. Sprott, A comparison of correlation
and Lyapunov dimensions, *Physica D*, **200**, 156-164, 2004.]

```{r visuCS2004, eval=TRUE}
visuEq(K = allMod_nVar3_dMax2$CS2004,  substit = 1)
```

The following data set was used for the analysis:
```{r plotCS2004, eval=TRUE, fig.align='center'}
  TS <- TSallMod_nVar3_dMax2$CS2004$reconstr
  plot(TS[,1], TS[,2], type = 'l', xlab = 't', ylab ='', 
       main='Chlouverakis-Sprott', col = 'blue')
  lines(TS[,1], TS[,3], type = 'l', xlab = 't', ylab = 'y(t)', col = 'red')
  lines(TS[,1], TS[,4], type = 'l', xlab = 't', ylab = 'z(t)', col = 'green')
  legend(40, -1.5, c("x(t)", "y(t)", "z(t)"), col=c('blue', 'red', 'green'), lty=1, cex = 0.8)

```

The global modelling technique was applied to this set
of time series.

```{r gpomoCS2004, eval=FALSE}
# modelling from time series starting
outCS2004 <- gPoMo(data = TS[,2:4], tin = TS[,1], dMax = 2, nS = c(1,1,1), show = 1,
              IstepMin = 10, IstepMax = 15000, nPmin = 3, nPmax = 8, method = 'rk4')
```

Three types of models were obtained
(type-1: #7, #13, #15, #23, #25, #38, #40, #43;
type-2: #12, #22, #27, #37;
and type-3: #26, #30, #41, #44-46, #49)
among which only type-3 models seem compatible with
the original phase portrait (the other types can
only reproduce a small part of the original phase
space).
Some spurious effects being observed in models #30
and #49, only five models may be potential candidates:
models #26, #41 and #44-46.
The smallest model (#26) obtained with the third type
has the following formulation

```{r visuCS2004b, eval=FALSE}
visuEq(K = outCS2004$models$model26,  substit = 1, approx = 2)
```
```{r visuCS2004c, eval=TRUE, echo=FALSE}
visuEq(K = data_vignetteVII$out15model26,  substit = 1, approx = 2)
```
It is almost identical to the original system since
only one term is missing in the second equation.

The complete formulation is obtained in model #41.
However, it is difficult to tell which model is
the best when comparing models #26, #41, #44 and #45.
These results suggest that the original system
may be reductible to a 7-term sytem.


## The Li system (2007)

A toroidal and symmetrical system was introduced
by Dequan Li in 2007^[Dequan Li, A three-scroll chaotic
attractor, *Physics Letters A*, **372**, 387-393, 2007.]:

```{r visuLi2007, eval=TRUE}
visuEq(K = allMod_nVar3_dMax2$Li2007,  substit = 1, approx = 2)
```

The following data set was obtained from this system:
```{r plotLi2007, eval=TRUE, fig.align='center'}
  TS <- TSallMod_nVar3_dMax2$Li2007$reconstr
  plot(TS[,1], TS[,2], type = 'l', ylim = c(-160, 230), xlab = 't', 
       ylab = '', main= ' Li 2007', col = 'blue')
  lines(TS[,1], TS[,3], type = 'l', xlab = 't', ylab = 'y(t)', col = 'red')
  lines(TS[,1], TS[,4], type = 'l', xlab = 't', ylab = 'z(t)', col = 'green')
  legend(0.3, -50, c("x(t)", "y(t)", "z(t)"), col=c('blue', 'red', 'green'), lty=1, cex = 0.8)

```

The global modelling technique was then applied to this set
of time series.

```{r gpomoLi2007, eval=FALSE}
# modelling from time series starting
outLi2007<- gPoMo(data = TS[,2:4], tin = TS[,1], dMax = 2, nS = c(1,1,1), show = 1,
                  IstepMin = 10, IstepMax = 15000, nPmin = 9, nPmax = 9, method = 'rk4')
```

Only the model #13 is able to reproduce the original phase portrait.

```{r visuLi2007b, eval=FALSE}
visuEq(K = outLi2007$models$model13,  substit = 1, approx = 2)
```
```{r visuLi2007c, eval=TRUE, echo=FALSE}
visuEq(K = data_vignetteVII$out16model13,  substit = 1, approx = 2)
```

The same result is obtained when allowing models
of smaller sizes (e.g. `nPmin = 3`).
The original formulation is thus appropriately retrieved.

## The Cord system (Aguirre & Letellier 2012)

The cord attractor^[C. Letellier & L. A. Aguirre, Required criteria
for recognizing new types of chaos : Application to the â€œcordâ€ attractor,
*Physical Review E*, **85**, 036204, 2012.] was introduced by Luis Aguirre and
Christophe Letellier in 2012 by a modification of the Lorenz-1984 system.
The system reads:
```{r visuCord, eval=TRUE}
visuEq(K = allMod_nVar3_dMax2$Cord2012,  substit = 1)
```

The following data set was used for the analysis:
```{r plotCord, eval=TRUE, fig.align='center'}
  TS <- TSallMod_nVar3_dMax2$Cord2012$reconstr
  plot(TS[,1], TS[,2], type = 'l', ylim = c(-20,30),
       xlab = 't', ylab = '', main = 'Cord 2012', col = 'blue')
  lines(TS[,1], TS[,3], type = 'l', xlab = 't', ylab = 'y(t)', col = 'red')
  lines(TS[,1], TS[,4], type = 'l', xlab = 't', ylab = 'z(t)', col = 'green')
  legend(17, -8, c("x(t)", "y(t)", "z(t)"), col=c('blue', 'red', 'green'), lty=1, cex = 0.8)

```

The global modelling technique was applied to this set
of time series.

```{r gpomoCord, eval=FALSE}
# modelling from time series starting
outCord2012 <- gPoMo(data = TS[,2:4], tin = TS[,1], dMax = 2, nS = c(1,1,1), show = 1,
                     IstepMin = 10, IstepMax = 15000, nPmin = 7, nPmax = 11, method = 'rk4')
```

Several obtained models were able to reproduce, more or less,
the original phase space are obtained.
The smallest models (#9 and #10) have seven terms and are
not chaotic. Their formulation is similar to the original system.
All the terms detected in model #9 are effectively included
in the original system (four terms are not detected).

```{r visuCordb, eval=FALSE}
visuEq(K = outCord2012$models$model9,  substit = 1, approx = 2)
```
```{r visuCordc, eval=TRUE, echo=FALSE}
visuEq(K = data_vignetteVII$out17model9,  substit = 1, approx = 2)
```

Only one spurious term (lower in magnitude compared to
the others) is detected in model #10 (term $z^2$
in the first equations) and five terms are not detected:

```{r visuCordd, eval=FALSE}
visuEq(K = outCord2012$models$model10,  substit = 1, approx = 2)
```
```{r visuCorde, eval=FALSE, echo=FALSE}
visuEq(K = data_vignetteVII$out17model10,  substit = 1, approx = 2)
```

Among all the obtained models, the phase portrait of model #62
is the more similar to the original one.

```{r visuPhPortrait62, eval=FALSE}
visuOutGP(out17,selecmod = 62, prioMinMax = 'model')
```

In this formulation, eight of the eleven terms are correctly detected.
The same spurious term is wrongly detected (term `z^2`
in the first equations).

```{r visuCordf, eval=FALSE}
visuEq(K = outCord2012$models$model62,  substit = 1, approx = 2)
```
```{r visuCordg, eval=FALSE, echo=FALSE}
visuEq(K = data_vignetteVII$out17model62,  substit = 1, approx = 2)
```

The cord system cannot be retrieved completely in the present case,
possibly because its formulation is too complex (too many terms)
and not minimal. The most important terms can be detected.


## Conclusion

In this vignette, it is shown that, in many cases,
the original systems can be fully retrieved.
It is in particular the case when the formulation of
the original system is concise enough.
But it is not always the case.
When the systems formulation is more complex (more terms),
an exhaustive detection becomes more difficult
and only one part of the algebraic terms can be detected.
In such conditions the global modelling technique
can be used to obtain a reduced formulation
of the dynamical systems.