In the last post, we derived a formula for the inverse of the operator $(1 + G)$, where $G$ is some matrix or differential operator that we specify.  The formal inverse was given by

\[ (1+G)^{-1} = 1 - G + G^2 - G^3 + G^4 - \cdots \; , \]

since this series can be multiplied to the right or the left of $(1+G)$ to find that all terms involving $G$ cancel to all orders.  The expansion is formally identical to the Taylor series expansion of $1/(1+x)$.  Since that Taylor series is valid only for $|x| < 1$, one is left to wonder about the convergence of the operator series as well, and about what it means for $G$ to be, in some sense, small.
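For example, multiplying the series to the left of $(1+G)$ and grouping terms order by order makes the cancellation explicit:

\[ \left( 1 - G + G^2 - G^3 + \cdots \right) (1+G) = 1 + (G - G) + (G^2 - G^2) + (G^3 - G^3) + \cdots = 1 \; . \]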

In this post, we'll explore that question in an informal way by looking at a few simple $2\times2$ matrices.  The $2\times2$ matrix was chosen because it is easy to work with, its inverse is very familiar, and it provides a simple laboratory for experimentation.  For added convenience, we will work solely with real numbers, although the extension to complex numbers shouldn't be terribly difficult.

Our prototype $2\times2$ matrix will be given by:

\[M =  \left( \begin{array}{cc} a & b \\ c & d \\ \end{array} \right) \; , \]

whose inverse is given by

\[M^{-1} =  \frac{1}{ad - bc} \left( \begin{array}{cc}d &-b \\-c &a \\ \end{array} \right) \; .\]
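As a quick sanity check, this formula can be verified symbolically in Maxima, the computer algebra system used for all the computations below (this snippet is merely illustrative):

/* symbolic check of the 2x2 inverse formula */
invert(matrix([a, b], [c, d]));
/* expect, up to simplification:
   matrix([d/(a*d-b*c), -b/(a*d-b*c)], [-c/(a*d-b*c), a/(a*d-b*c)]) */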

To apply the operator formula, we first separate $M$ into an on-diagonal matrix $A$ and an off-diagonal matrix $B$ as

\[ M = A + B = \left( \begin{array}{cc}a & 0 \\ 0 & d \\ \end{array} \right) + \left( \begin{array}{cc}0 & b \\c & 0 \\ \end{array} \right) \; . \]

Other decompositions are possible, but this division makes it relatively easy to understand the interaction between the matrix elements and in what sense some of the elements or matrices can be regarded as small compared to others.  In particular, the inverse of $A$ is also diagonal, being given by

\[ A^{-1} =  \left( \begin{array}{cc} 1/a & 0  \\ 0 & 1/d \\ \end{array} \right) \; . \]

To apply the operator series expansion formula to $M$, we first factor out $A$ to the left to get

\[ M = A \left( 1 + A^{-1} B \right) \; .\]

Now we can define $G \equiv A^{-1} B$ and then, using the operator series expansion, determine how quickly the series converges to the value of $M^{-1}$ that we already know how to calculate from the exact formula above.

The inverse of $M$ can be expressed in terms of $G$ as

\[ M^{-1} = (1 + G)^{-1} A^{-1} = \left[ 1 - G + G^2 - G^3 + G^4 - \cdots \right] A^{-1} \; . \]

There are two ways to proceed.  First, we can try to analytically wrestle with the above formula.  Second, we can try some numerical experiments and see if we can get some intuition.  We’ll try the second one first and add in some analytic meanderings as we go along.

Case 1

Start with $a=1$, $d=2$, $b=1/2$, and $c=1/3$.  The computations were all done in wxMaxima.
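Everything that follows can be reproduced with a few lines of Maxima; a minimal sketch (the helper Minv_approx is just a convenient label for the $n$th-order partial sum) is

/* Case 1 setup: split M into diagonal A and off-diagonal B */
M : matrix([1, 1/2], [1/3, 2])$
A : matrix([1, 0], [0, 2])$
B : matrix([0, 1/2], [1/3, 0])$
G : invert(A) . B$

/* partial sum of (1+G)^(-1) A^(-1) through order n */
Minv_approx(n) := (ident(2) + sum((-1)^k * G^^k, k, 1, n)) . invert(A)$

float(invert(M));       /* exact inverse */
float(Minv_approx(3));  /* third-order approximation */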

The exact inverse is given by

\[ M^{-1} =   \left( \begin{array}{cc} \frac{12}{11} & -\frac{3}{11} \\ -\frac{2}{11} & \frac{6}{11}  \\ \end{array} \right) \approx  \left( \begin{array}{cc} 1.090909 & -0.272727 \\ -0.181818 & 0.545454 \\ \end{array} \right) \; .\]

The first term in the series, which will be our first guess, is

\[ M^{-1}_{(0)} = A^{-1} =  \left( \begin{array}{cc} 1 & 0 \\ 0  & \frac{1}{2} \\ \end{array} \right) \approx  \left( \begin{array}{cc} 1.0 & 0.0 \\ 0.0 & 0.5 \\ \end{array} \right) \; ,\]

where the subscript tracks the order of the expansion.  Observe that this is a poor approximation.

The first correction is built from

\[ A^{-1} B =  \left( \begin{array}{cc} 0 & \frac{1}{2} \\ \frac{1}{6} & 0 \\ \end{array} \right) \; \]

being multiplied on the right by $A^{-1}$.  Performing that multiplication and subtracting the result from the earlier approximation, we get

\[ M^{-1}_{(1)} = A^{-1} - A^{-1} B A^{-1} =  \left( \begin{array}{cc} 1.0 & -0.25 \\ -0.166{\bar 6} & 0.5 \\ \end{array} \right) \; , \]

which is a better approximation.

The next two approximations are

\[ M^{-1}_{(2)} = A^{-1} - A^{-1} B A^{-1} + A^{-1} B A^{-1} B A^{-1}  =  \left( \begin{array}{cc} 1.08{\bar 3} & -0.25 \\ -0.166{\bar 6} & 0.541{\bar 6} \\ \end{array} \right) \;  \]

and

\[ M^{-1}_{(3)} = A^{-1} - A^{-1} B A^{-1} + A^{-1} B A^{-1} B A^{-1} - A^{-1} B A^{-1} B A^{-1} B A^{-1}  \\ =  \left( \begin{array}{cc} 1.08{\bar 3} & -0.2708{\bar 3} \\ -0.180{\bar 5} & 0.541{\bar 6} \\ \end{array} \right) \;  . \]

This pattern, wherein the on-diagonal elements are updated at the even orders and the off-diagonal elements are updated at the odd orders, persists to all higher orders examined.

At 3rd order, each element's relative error was $1/144 \approx 0.69\%$.   At 5th order (once the entire matrix has again been revised), each element's relative error was $1/1728 \approx 0.058\%$, clearly showing rapid convergence.
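The factor-of-12 improvement with each pair of orders is no accident.  For this decomposition, the square of $G$ is proportional to the identity,

\[ G^2 = \frac{bc}{ad} \, 1 = \frac{1}{12} \, 1 \; , \]

so every two additional terms in the series shrink the remaining error by another factor of $ad/bc = 12$.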

A Hypothesis

At this point, we can ask under what conditions this convergence persists if we keep $a$ and $d$ fixed numerically but allow $b$ and $c$ to vary.

Our starting matrix is

\[ M =  \left( \begin{array}{cc} 1 & b \\ c & 2 \\ \end{array} \right) \; \]

with the inverse

\[ M^{-1} =  \left( \begin{array}{cc} \frac{1}{1-\frac{bc}{2}} & -\frac{b/2}{1-\frac{bc}{2}} \\ -\frac{c/2}{1-\frac{bc}{2}} & \frac{1/2}{1-\frac{bc}{2}} \\ \end{array} \right) \; . \]

The operator series expansion should converge if it is meaningful to expand the denominator in a convergent Taylor series as

\[ \left( 1-\frac{bc}{2} \right)^{-1} = 1 + \frac{bc}{2} + \frac{b^2c^2}{4} + \frac{b^3c^3}{8} + \cdots \; , \]

which we know requires $|bc| < 2$.

Of course, expanding in this fashion cannot reproduce the terms in the operator expansion since, even to lowest order (note the prime on the subscript as a way of tracking which expansion is in play),

\[ M^{-1}_{(0’)} =  \left( \begin{array}{cc} 1 & -1/4 \\-1/6 & 1/2 \\ \end{array} \right) \; , \]

which has non-zero entries in the off-diagonals, in contrast to $A^{-1}$.
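The hypothesized threshold also follows from the standard convergence criterion for an operator (Neumann) series: the series converges precisely when every eigenvalue of $G$ has magnitude less than one.  For our decomposition,

\[ G = A^{-1} B = \left( \begin{array}{cc} 0 & b/a \\ c/d & 0 \\ \end{array} \right) \; , \]

whose eigenvalues satisfy $\lambda^2 = bc/(ad)$.  With $a = 1$ and $d = 2$, the requirement $|\lambda| < 1$ is again $|bc| < 2$.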

Case 2

To test this hypothesis, we set $b = 1$ (twice the value used in Case 1) and looked at the values $c = 1/3, 2/3, 0.9, 1.1, 1.3, 1.5, 1.7, 1.9$, picked fairly arbitrarily.  The expansion was run to 6th order, with the results shown in the table below (a sketch of the scan follows the table).

\[ \begin{array}{cc} c & \text{\% error} \\ \hline 1/3 & 0.08 \\ 2/3 & 1.23 \\ 0.9 & 4.10 \\ 1.1 & 9.15 \\ 1.3 & 17.85 \\ 1.5 & 31.69 \\ 1.7 & 52.20 \\ 1.9 & 81.45 \\ \end{array} \]
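A sketch of how this scan might look in Maxima follows; the error here is taken to be the largest element-wise relative error of the 6th-order sum, which is an assumption about the exact metric behind the table:

/* Case 2 scan: b = 1, vary c, error of the 6th-order partial sum */
b : 1$
for c in [1/3, 2/3, 0.9, 1.1, 1.3, 1.5, 1.7, 1.9] do block(
  [A, B, G, S, E],
  A : matrix([1, 0], [0, 2]),
  B : matrix([0, b], [c, 0]),
  G : invert(A) . B,
  S : (ident(2) + sum((-1)^k * G^^k, k, 1, 6)) . invert(A),
  E : invert(matrix([1, b], [c, 2])),
  /* largest element-wise relative error, as a percentage */
  print(c, float(100 * lmax(abs(list_matrix_entries(S - E))
                            / abs(list_matrix_entries(E)))))
)$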

Clearly the convergence suffers as the product $bc$ approaches the value of $2$.   Setting $c = 2.01$ causes the expansion to diverge.  Of course, one might object that with $bc$ so close to $2$ the matrix $M$ is nearly singular and therefore ill-conditioned.  More interesting is to set $c = 3$, which gives

\[ M =   \left( \begin{array}{cc} 1 & 1 \\ 3 & 2 \\ \end{array} \right) \; \]

and

\[ M^{-1} =  \left( \begin{array}{cc} -2 & 1 \\ 3 & -1 \\ \end{array} \right) \; . \]

Despite the well-behaved nature of $M$ and $M^{-1}$, the 6th order approximation is

\[ M^{-1}_{(6)} = \left( \begin{array}{cc} 8.125 & -2.375 \\ -7.125 & 4.0625 \\ \end{array} \right) \; \]

clearly showing that the series is diverging.
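Reusing the partial-sum helper from Case 1 with these entries swapped in makes the blow-up easy to watch:

/* Case 2, c = 3: successive partial sums grow without bound */
A : matrix([1, 0], [0, 2])$
B : matrix([0, 1], [3, 0])$
G : invert(A) . B$
Minv_approx(n) := (ident(2) + sum((-1)^k * G^^k, k, 1, n)) . invert(A)$
for n in [2, 4, 6] do print(n, float(Minv_approx(n)))$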

So, in summary, there are conditions on the off-diagonal elements compared to the on-diagonal ones.  The specific condition here is that $|bc| < |ad|$ (essentially $|\det(B)| < |\det(A)|$), but the general condition, applicable to all matrices, is likely far more involved and subtle, so we'll leave that for the experts.  The point is clear: for the operator expansion to be valid, despite its formal structure, the matrix $G$ has to be small in some sense when compared with the identity matrix.  This condition holds for quantum electrodynamics, which explains the ubiquity of such expansions in modern physics.