First-Order Gauss-Markov Processes

A change of pace is always good and, in this spirit, this week's column aims to model the solutions of the First-order Gauss-Markov process. A First-order Gauss-Markov process is a stochastic process used in certain applications to schedule the injection of process noise into filtering methods. The basic idea is that, during certain time periods, the internal physical process modelled in the filter will be insufficient because some additional, unmodelled force turns on and off. Typical applications include the application of man-made controls to physical systems to force certain desirable behaviors.

The formulation of the First-order Gauss-Markov process is usually done in terms of a stochastic differential equation. For simplicity, the scalar form will be studied; the generalization to the case with many components is straightforward, and nothing would be gained here but notational clutter. The equation we will analyze is
\[ {\dot x} = -\frac{1}{\tau} x + w \quad ,\]
where $$x(t)$$ is defined as the state and $$w$$ is an inhomogeneous noise term. In the absence of the noise, the solution is given by a trivial time integration to yield
\[ x_{h}(t) = x_{0} e^{ -(t-t_{0})/\tau } \quad , \]
assuming that $$x(t=t_{0}) = x_{0}$$. The solution $$x_h(t)$$ is known as the homogeneous solution and, as has often been discussed in this column, it will help in integrating the inhomogeneous term $$w$$. Ignore, for the moment, that $$w$$ (and, as a result, $$x$$) is a random variable. Treated as just an ordinary function, $$w$$ is a driving term that can be handled via the state transition matrix (essentially a one-sided Green’s function for an initial-value problem). Recall that a state transition matrix (STM) is defined as the object linking the state at time $$t_{0}$$ to the state at time $$t$$ according to
\[ x(t) = \Phi(t,t_{0}) x(t_{0}) \quad , \]
thus, by the definition, the STM is obtained as
\[ \Phi(t,t_{0}) = \frac{\partial x(t)}{\partial x(t_{0})} \quad . \]
Taking the partial derivative of the homogeneous solution as required by the definition of the STM gives
\[ \Phi(t,t_{0}) = e^{ -(t-t_{0})/\tau } \quad . \]
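Since the STM is defined as a partial derivative, it can be sanity-checked numerically. Here is a minimal sketch (numpy; the values of $$\tau$$ and the times are arbitrary choices) that finite-differences the homogeneous solution with respect to its initial condition:

```python
import numpy as np

tau = 2.0           # time constant (arbitrary illustrative value)
t0, t = 0.0, 1.5    # initial and final times
eps = 1e-6          # finite-difference step on the initial condition

def x_h(x0):
    """Homogeneous solution x_h(t) = x0 * exp(-(t - t0)/tau)."""
    return x0 * np.exp(-(t - t0) / tau)

# STM from its definition: the sensitivity of x(t) to x(t0)
phi_fd = (x_h(1.0 + eps) - x_h(1.0)) / eps
phi_exact = np.exp(-(t - t0) / tau)
print(phi_fd, phi_exact)   # agree to finite-difference accuracy
```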
The solution of the inhomogeneous equation is then given as
\[ x(t) = x_{h}(t) + \int_{t_{0}}^{t} \Phi(t,t') w(t') dt' \quad .\]
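Before verifying this analytically, a quick numerical check is reassuring. The sketch below (assuming scipy is available; the sinusoidal forcing is just an arbitrary deterministic test function) evaluates the convolution integral by quadrature and compares it against a direct integration of the differential equation:

```python
import numpy as np
from scipy.integrate import quad, solve_ivp

tau, t0, x0 = 2.0, 0.0, 1.0
w = lambda t: np.sin(3.0 * t)   # arbitrary deterministic test forcing

def x_stm(t):
    """Homogeneous solution plus the STM-weighted integral of the forcing."""
    integral, _ = quad(lambda tp: np.exp(-(t - tp) / tau) * w(tp), t0, t)
    return x0 * np.exp(-(t - t0) / tau) + integral

# Direct numerical integration of xdot = -x/tau + w(t) for comparison
sol = solve_ivp(lambda t, x: -x / tau + w(t), (t0, 4.0), [x0],
                rtol=1e-10, atol=1e-12)
print(x_stm(4.0), sol.y[0, -1])   # the two values agree to solver tolerance
```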
As a check, take the time derivative of this expression to get
\[ {\dot x}(t) = {\dot x}_{h}(t) + \frac{d}{dt} \left[ \int_{t_{0}}^{t} \Phi(t,t') w(t') dt' \right] \; \]
and then eliminate the time derivatives of the two terms on the right-hand side by using the homogeneous differential equation for the first term and the Leibniz rule for the second term involving the integral
\[ {\dot x}(t) = -\frac{1}{\tau} x_{h}(t) + \Phi(t,t) w(t) + \int_{t_{0}}^{t} \frac{\partial \Phi(t,t')}{\partial t} w(t') dt' \quad . \]
Since the following two conditions
\[ \Phi(t,t) = 1 \quad , \]
and
\[ \frac{\partial \Phi(t,t')}{\partial t} = -\frac{1}{\tau} \Phi(t,t') \quad , \]
are met, they can be substituted into the above expression. Doing so yields
\[ {\dot x}(t) = -\frac{1}{\tau} x_{h}(t) + w(t) - \frac{1}{\tau} \int_{t_{0}}^{t} \Phi(t,t') w(t') dt' \; .\]
Grouping the terms in a suggestive way gives
\[ {\dot x}(t) = -\frac{1}{\tau} \left[ x_{h}(t) + \int_{t_{0}}^{t} \Phi(t,t') w(t') dt' \right] + w(t) \; \]
from which immediately springs the recognition that
\[ {\dot x}(t) = -\frac{1}{\tau} x(t) + w(t) \quad , \]
which is the original equation, confirming that the expression is indeed the solution.

It is worth noting that the condition listed for the time derivative of the STM is a specific example of the familiar form
\[ \frac{\partial \Phi(t,t')}{\partial t} = A(t) \Phi(t,t') \quad ,\]
where $$A(t)$$, known sometimes as the process matrix, is generally given by
\[ A(t) = \frac{\partial {\mathbf f}( {\mathbf x} (t) ) }{\partial {\mathbf x}(t)} \; .\]
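For the scalar process studied here, $${\mathbf f}(x) = -x/\tau$$, so the process matrix is the constant $$A = -1/\tau$$, consistent with the STM found above. A minimal numerical sketch (a central-difference Jacobian; the state value is arbitrary):

```python
import numpy as np

tau = 2.0
f = lambda x: -x / tau    # the dynamics function from this column's equation

# Central-difference Jacobian A = df/dx at an arbitrary state value
x, eps = 0.7, 1e-6
A = (f(x + eps) - f(x - eps)) / (2 * eps)
print(A, -1 / tau)        # both print -0.5: A is constant for this process
```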

With the general formalism well understood, the next step is to generate solutions. Since $$w(t)$$ is a noise term, each realization of the noise will produce a different response in $$x(t)$$ via the driving term in
\[ x(t) = x_{0} e^{ -(t-t_{0})/\tau } + \int_{t_{0}}^{t} e^{ -(t-t')/\tau } w(t') dt' \; .\]
So there are an infinite number of solutions, and there is no practical way to deal with them as a group, but meaningful statements can be made statistically. The most useful statistical characterizations are found by computing the statistical moments about the origin (i.e., $$E[x^n(t)]$$, where $$E[\cdot]$$ denotes the expectation value of the argument).

In order to do this, we need to say something about the statistical distribution of the noise. We will assume that $$w(t)$$ is a Gaussian white-noise source with zero mean and strength $$q$$, uncorrelated between distinct times. Mathematically, these assertions amount to the condition $$E[w(t)] \equiv {\bar w} = 0$$ and the condition
\[ E[(w(t)-{\bar w}) (w(s) - {\bar w}) ] = E[ w(t)w(s) - {\bar w} w(t) - {\bar w} w(s) + {\bar w}^2 ] \\ = E[ w(t)w(s) ] = q \delta(t-s) \quad . \]
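In a discrete simulation, a common stand-in for this delta-correlated noise is a sequence of independent Gaussian samples with variance $$q/\Delta t$$, so that the sample autocovariance approximates $$q \delta(t-s)$$. A quick numpy check of this convention (the parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
q, dt, n = 0.5, 0.01, 200_000   # noise strength, step size, sample count (illustrative)

# Discrete stand-in for continuous white noise: variance q/dt per sample
w = rng.normal(0.0, np.sqrt(q / dt), n)

print(w.mean())                      # near 0, matching E[w] = 0
print(np.mean(w[:-1] * w[1:]) * dt)  # near 0: distinct samples are uncorrelated
print(w.var() * dt)                  # near q: the strength of the delta function
```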

In addition, we assume that the state and the noise at different times are statistically independent, so that
\[ E[x(t) w(s) ] = E[x(t)] E[w(s)] = 0 \quad . \]
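With the noise statistics pinned down, a few realizations can be drawn directly from the stochastic equation. Below is a minimal sketch using a simple Euler-Maruyama discretization (one common choice, not the only one; all parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
tau, q = 2.0, 0.5                 # illustrative time constant and noise strength
dt, n_steps, n_paths = 0.01, 2000, 5
x = np.full(n_paths, 1.0)         # all realizations start from x0 = 1

paths = [x.copy()]
for _ in range(n_steps):
    # Euler-Maruyama step: deterministic decay plus sqrt(q*dt) white noise
    x = x + dt * (-x / tau) + np.sqrt(q * dt) * rng.standard_normal(n_paths)
    paths.append(x.copy())

paths = np.array(paths)           # shape (n_steps + 1, n_paths), one column per realization
print(paths[-1])                  # each realization ends somewhere different
```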
Now we can compute the statistical moments of $$x(t)$$ about the origin.

The first moment is given by
\[ E[x(t)] = E \left[ x_{0} e^{ -(t-t_{0})/\tau } + \int_{t_{0}}^{t} e^{ -(t-t')/\tau } w(t') dt' \right] \; .\]
Since the expectation operator is linear and since it only operates on the random variables (here the initial state $$x_{0}$$ and the noise $$w$$), this expression becomes
\[ E[x(t)] = E \left[ x_{0} \right] e^{ -(t-t_{0})/\tau } + \int_{t_{0}}^{t} e^{ -(t-t')/\tau } E \left[ w(t') \right] dt' \; .\]
From its statistical properties, $$E[w(t)] = 0$$, and so the above expression simplifies to
\[ E[x(t)] = E \left[ x_{0} \right] e^{ -(t-t_{0})/\tau } \; .\]
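This exponential decay of the ensemble mean is easy to confirm by Monte Carlo. A short sketch (same Euler-Maruyama discretization as above; parameters illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
tau, q, dt = 2.0, 0.5, 0.01
n_steps, n_paths = 400, 50_000
x0 = 1.0                          # deterministic initial condition, so E[x0] = x0

x = np.full(n_paths, x0)
for _ in range(n_steps):
    x = x + dt * (-x / tau) + np.sqrt(q * dt) * rng.standard_normal(n_paths)

t = n_steps * dt
print(x.mean(), x0 * np.exp(-t / tau))   # ensemble mean vs E[x0] e^{-(t - t0)/tau}
```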

The second moment is more involved, although no more conceptually complicated, and is given by


\[ E[ x(t)^2 ] = E \left[ x_{0}^2 e^{ -2(t-t_{0})/\tau } \\ + 2 x_{0} e^{ -(t-t_{0})/\tau } \int_{t_{0}}^{t} e^{ -(t-t')/\tau } w(t') dt' \\ + \int_{t_{0}}^{t} e^{ -(t-t')/\tau } w(t') dt' \int_{t_{0}}^{t} e^{ -(t-t'')/\tau } w(t'') dt'' \right] \; \]

Again using the facts that the expectation operator is linear, that the noise is zero mean, and that the initial state is independent of the noise, the second moment becomes

\[ E[x(t)^2] = E \left[x_{0}^2\right]e^{ -2(t-t_{0})/\tau } \\ + \int_{t_{0}}^{t} \int_{t_{0}}^{t} e^{ -(2t-t'-t'')/\tau } E \left[ w(t') w(t'') \right] dt' dt'' \; .\]

Next we use the fact that the noise auto-correlation resolves to a delta function, leaving behind

\[ E[x(t)^2] = E \left[x_{0}^2\right]e^{ -2(t-t_{0})/\tau } + \int_{t_{0}}^{t} \int_{t_{0}}^{t} e^{ -(2t-t'-t'')/\tau } q \delta(t'-t'') dt' dt'' \; .\]

From there, the final step is to use the properties of the delta function to get rid of one of the integrals and then to explicitly evaluate the remaining one to get

\[ E[x(t)^2] = E \left[x_{0}^2\right]e^{ -2(t-t_{0})/\tau } + \frac{q \tau}{2} \left( 1 - e^{ -2(t-t_{0})/\tau } \right) \;.\]

With the first two moments in hand, it is easy to get the variance

\[ E \left[x^2\right] - E\left[x\right]^2 = \left( E\left[x_{0}^2\right] - E \left[x_{0}\right]^2 \right) e^{-2(t-t_{0})/\tau} + \frac{q\tau}{2} \left(1- e^{-2(t-t_{0})/\tau} \right) \; ,\]

which can be re-written as

\[ {\mathcal P}(t) = {\mathcal P}_0 e^{ -2(t-t_0)/\tau } + \frac{q\tau}{2}\left(1-e^{-2(t-t_{0})/\tau}\right) \quad , \]

where $${\mathcal P}_0 = E\left[x_0^2\right] - E\left[x_0\right]^2$$ is the initial covariance.
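This closed form can be checked against a Monte Carlo ensemble, and it is also the form a filter's covariance time update could take over a fixed step $$\Delta t$$: $${\mathcal P}_{k+1} = e^{-2\Delta t/\tau} {\mathcal P}_{k} + \frac{q\tau}{2}\left(1-e^{-2\Delta t/\tau}\right)$$. A sketch (numpy; the initial mean and covariance are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(3)
tau, q, dt = 2.0, 0.5, 0.01
n_steps, n_paths = 400, 50_000
x0_mean, P0 = 1.0, 0.2            # illustrative initial mean and covariance

x = rng.normal(x0_mean, np.sqrt(P0), n_paths)
for _ in range(n_steps):
    x = x + dt * (-x / tau) + np.sqrt(q * dt) * rng.standard_normal(n_paths)

t = n_steps * dt
decay = np.exp(-2 * t / tau)
P_closed_form = P0 * decay + 0.5 * q * tau * (1 - decay)
print(x.var(), P_closed_form)     # agree up to discretization and sampling error
```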

The final step is to discuss exactly what these equations are telling us. The solution for the mean suggests that the mean will, after a long time, settle in at a value of zero regardless of the initial conditions for $$x_0$$. The variance equation tells us how the dispersion about this zero mean evolves: for times short compared to $$\tau$$, expanding the exponential shows that the noise contribution grows approximately linearly, $$\frac{q\tau}{2}\left(1-e^{-2(t-t_0)/\tau}\right) \approx q\,(t-t_0)$$, so the dispersion grows as $$\sqrt{t}$$, which is suggestive of a random walk; at long times the variance saturates at the steady-state value $$q\tau/2$$. One of the more useful applications of this formalism is to schedule the parameter $$q$$ so that it is non-zero only within a certain time span. During this time span, the variance of the process ‘opens up’, thereby allowing more freedom for a filter to heed measurement data rather than paying strict attention to its internal process model. Thus a First-order Gauss-Markov process, when scheduled into an underlying physical process, allows modern filters to track unmodeled forces.
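To close, here is a minimal sketch of that scheduling idea: propagating the covariance with the recursion above while $$q$$ is switched on only inside a hypothetical time window. The window and all parameter values are invented for illustration:

```python
import numpy as np

tau, dt, n_steps = 2.0, 0.01, 1000
q_on, window = 0.5, (3.0, 6.0)    # hypothetical noise strength and active time span

P, history = 0.0, []              # start from a perfectly known state
for k in range(n_steps):
    t = k * dt
    q = q_on if window[0] <= t < window[1] else 0.0
    decay = np.exp(-2 * dt / tau)
    # Covariance time update: P 'opens up' only while q is scheduled on
    P = decay * P + 0.5 * q * tau * (1 - decay)
    history.append(P)

print(max(history), history[-1])  # variance inflates inside the window, decays after
```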