COMPUTING THE SQUARE ROOT OF A MARKOV MATRIX:

EIGENVALUES AND THE TAYLOR SERIES

Donald R. Burleson, Ph.D.

Copyright © 2005 by Donald R. Burleson. All rights reserved.


For a square (nXn) matrix A with eigenvalues λ_{i} and corresponding eigenvectors **v_{i}**, one may use (see “Eigenvalues and Non-Integer Powers of a Square Matrix”) the relation

A^{x}**v_{i}** = λ_{i}^{x}**v_{i}**

with x = ½ to *define* the square root of the matrix A by specifying that A^{1/2} must be a matrix satisfying the relation

A^{1/2}**v_{i}** = λ_{i}^{1/2}**v_{i}**

for i = 1, 2, …, n. (To see a detailed discussion of this essentially simple computational procedure, and an important application, click here.)
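As a concrete sketch of this definition, the following pure-Python fragment (the helper name is illustrative) builds the square root of a 2X2 matrix from its eigenpairs: it is the defining relation restated as B = V·D^{1/2}·V^{-1}, where V holds the eigenvectors as columns. The eigen-data shown (λ_{1} = 1 with **v_{1}** = (5,2), λ_{2} = 0.3) matches the worked example later in the article; the second eigenvector (1,-1) is inferred from the equations given there.

```python
import math

def sqrt_2x2_from_eigenpairs(lam1, v1, lam2, v2):
    """Build B satisfying B v_i = sqrt(lam_i) v_i for i = 1, 2,
    i.e. B = V * diag(sqrt(lam1), sqrt(lam2)) * V^(-1)."""
    V = [[v1[0], v2[0]],
         [v1[1], v2[1]]]                      # eigenvectors as columns
    det = V[0][0] * V[1][1] - V[0][1] * V[1][0]
    Vinv = [[ V[1][1] / det, -V[0][1] / det],
            [-V[1][0] / det,  V[0][0] / det]]
    D = [[math.sqrt(lam1), 0.0],
         [0.0, math.sqrt(lam2)]]
    def mm(X, Y):                             # 2x2 matrix product
        return [[X[i][0] * Y[0][j] + X[i][1] * Y[1][j]
                 for j in range(2)] for i in range(2)]
    return mm(mm(V, D), Vinv)

# Eigen-data from the article's worked example (v2 inferred):
B = sqrt_2x2_from_eigenpairs(1.0, (5.0, 2.0), 0.3, (1.0, -1.0))
```

Squaring B recovers the matrix whose eigenpairs were supplied, since B² has the same eigenvectors with eigenvalues λ_{i}.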

To corroborate the reasonableness of this eigenvalue approach, one may also compute A^{1/2} by making use of an analogy with the standard Taylor series expansion of a function of a real or complex variable, where the function f(x) is expanded about a center x = a:

f(x) = Σ_{n=0}^{∞} [f^{(n)}(a)/n!](x - a)^{n}.

Taking a = 1 and taking derivatives, one may expand f(x) = x^{1/2} as the series representation

x^{1/2} = 1 + (1/2)(x - 1) - (1/8)(x - 1)^{2} + (1/16)(x - 1)^{3} - (5/128)(x - 1)^{4} + …

= Σ_{n=0}^{∞} C(1/2, n)(x - 1)^{n}

(where C(1/2, n) denotes the generalized binomial coefficient) with radius of convergence ρ = 1.
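This expansion is the binomial series for (1 + u)^{1/2} with u = x - 1, and its partial sums are easy to check numerically. A minimal sketch, generating the coefficients C(1/2, n) = 1, 1/2, -1/8, 1/16, -5/128, … by the recurrence C(1/2, n+1) = C(1/2, n)·(1/2 - n)/(n + 1):

```python
def sqrt_series(x, terms):
    """Partial sum of the expansion of x**(1/2) about a = 1:
    sum over n >= 0 of C(1/2, n) * (x - 1)**n, valid for |x - 1| < 1."""
    coeff = 1.0                          # C(1/2, 0) = 1
    total = 0.0
    for n in range(terms):
        total += coeff * (x - 1.0) ** n
        coeff *= (0.5 - n) / (n + 1)     # advance to C(1/2, n+1)
    return total
```

Inside the radius of convergence the partial sums approach the true square root; for example, sqrt_series(1.44, 100) is very close to 1.2.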

The analogy, for a function of a square matrix A rather than a real or complex variable x, with a view to computing the square root of A asymptotically, is to employ the identity matrix I in the role of the number 1 (the center of the series) and to treat the powers (A-I)^{n} in terms of matrix multiplication:

A^{1/2} = I + (1/2)(A-I) - (1/8)(A-I)^{2} + (1/16)(A-I)^{3} - (5/128)(A-I)^{4} + …

under an appropriate convergence condition, which as we shall see can be readily determined, at least in the case of a Markov matrix.
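The same partial sums can be formed at the matrix level, with (A - I)^{n} computed by repeated matrix multiplication. A minimal pure-Python sketch; the 2X2 Markov matrix used is the one assumed for the worked example below (its entries are consistent with the article's trace(A-I) = -0.7 and eigenvector (5,2)):

```python
def mat_sqrt_series(A, terms):
    """Approximate A**(1/2) by summing C(1/2, n) * (A - I)**n for a
    2x2 matrix A; converges when the eigenvalues of A - I lie
    strictly inside the unit circle."""
    def mm(X, Y):
        return [[X[i][0] * Y[0][j] + X[i][1] * Y[1][j]
                 for j in range(2)] for i in range(2)]
    M = [[A[0][0] - 1.0, A[0][1]],
         [A[1][0], A[1][1] - 1.0]]       # A - I
    power = [[1.0, 0.0], [0.0, 1.0]]     # (A - I)**0 = I
    coeff = 1.0                          # C(1/2, 0)
    total = [[0.0, 0.0], [0.0, 0.0]]
    for n in range(terms):
        for i in range(2):
            for j in range(2):
                total[i][j] += coeff * power[i][j]
        power = mm(power, M)             # next power of A - I
        coeff *= (0.5 - n) / (n + 1)     # next binomial coefficient
    return total

A = [[0.8, 0.5], [0.2, 0.5]]             # assumed example matrix
S = mat_sqrt_series(A, 200)              # S*S should be close to A
```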

For example, consider the 2X2 Markov state-transition matrix

A =
| 0.8   0.5 |
| 0.2   0.5 |

with

A - I =
| -0.2   0.5 |
|  0.2  -0.5 |

When we compute powers of A-I we find

(A-I)^{2} =
|  0.14  -0.35 |
| -0.14   0.35 |
= (-0.7)(A-I)

(A-I)^{3} =
| -0.098   0.245 |
|  0.098  -0.245 |
= (-0.7)^{2}(A-I)

and in general

(A-I)^{n} = (-0.7)^{n-1}(A-I)

where, it is easy to show, for a
2X2 Markov matrix A with eigenvalues λ_{1}
and λ_{2}, the factor (-0.7) in general represents *trace*(A-I) (the sum of the eigenvalues of A-I), which is λ_{1} + λ_{2}
– 2. (Analogous things are true for
Markov matrices of higher order.)
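The pattern in these powers is no accident: A - I annihilates the λ_{1}-eigenvector (since λ_{1} - 1 = 0), so it has rank 1 in the 2X2 case, and any rank-1 matrix M satisfies M² = trace(M)·M. A quick numerical check of this identity, again assuming the example matrix above:

```python
A = [[0.8, 0.5], [0.2, 0.5]]        # assumed example Markov matrix
M = [[A[0][0] - 1.0, A[0][1]],
     [A[1][0], A[1][1] - 1.0]]      # A - I, a rank-1 matrix
t = M[0][0] + M[1][1]               # trace(A - I); here -0.7

def mm(X, Y):
    return [[X[i][0] * Y[0][j] + X[i][1] * Y[1][j]
             for j in range(2)] for i in range(2)]

M2 = mm(M, M)       # expect (A - I)^2 = t * (A - I)
M3 = mm(M2, M)      # expect (A - I)^3 = t**2 * (A - I)
```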

When one writes (-0.7)^{n-1}(A-I) for (A-I)^{n} in the matrix-valued series for A^{1/2}, one obtains

A^{1/2} = I + [ (1/2) - (1/8)(-0.7) + (1/16)(-0.7)^{2} - (5/128)(-0.7)^{3} + … ](A-I)

which, if one follows the coefficient series out for the first fourteen terms (the convergence being rather slow), produces the result

A^{1/2} ≈ I + (0.6460)(A-I)

=
| 0.8708   0.3230 |
| 0.1292   0.6770 |

This is of course an approximation, which improves upon adding further terms of the series; the result shown is good to within one-ten-thousandth.
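Since every power collapses onto a multiple of A-I, the bracketed coefficient is the scalar series Σ_{n≥1} C(1/2, n)·t^{n-1} with t = trace(A-I) = -0.7, and its complete sum even has a closed form, [(1 + t)^{1/2} - 1]/t = (1 - (0.3)^{1/2})/0.7 ≈ 0.646111. That gives an independent check on the fourteen-term partial sum:

```python
import math

t = -0.7                       # trace(A - I) for the example matrix
coeff = 0.5                    # C(1/2, 1)
partials = []
partial = 0.0
for n in range(1, 15):         # the first fourteen terms
    partial += coeff * t ** (n - 1)
    partials.append(partial)
    coeff *= (0.5 - n) / (n + 1)

# Closed form of the complete sum:
limit = (math.sqrt(1.0 + t) - 1.0) / t      # = (1 - sqrt(0.3)) / 0.7
```

Since the entries of A-I are at most 0.5 in magnitude, a coefficient error on the order of 1e-4 translates into matrix entries good to within roughly one-ten-thousandth, as stated above.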

It turns out that a sufficient condition, by the standard Ratio Test, for convergence of this coefficient series (the numerical series standing as the coefficient of the matrix A-I above), is that

| trace(A-I) | < 1.

If λ_{1} and λ_{2} are the eigenvalues of the original 2X2 matrix A, this convergence
condition is equivalent to the condition

| λ_{1} + λ_{2}
– 2 | < 1.

For a
Markov matrix, since one eigenvalue (see below for
the 2X2 case) is always λ_{1} = 1, this convergence condition is
equivalent to the condition | λ_{2} – 1 | < 1. Thus we have -1 < λ_{2} – 1
< 1, or 0 < λ_{2} < 2.
We will need this result in a moment.


In particular, for a Markov matrix

A =
| p     1-q |
| 1-p   q   |

the characteristic polynomial can be shown to be (λ - 1)[λ - (p + q - 1)], so that its zeros, the eigenvalues of A, are λ_{1} = 1 and λ_{2} = p + q - 1. If we assume, as is
customary for a Markov transition matrix, that 0 < p < 1 and 0 < q
< 1 so that 0 < p + q < 2, then it follows that p + q – 1 < 1, that
is, λ_{2} < 1. This,
together with our earlier result 0 < λ_{2} < 2, implies

0 < λ_{2}
< 1,

which is thus a necessary condition for convergence. [A “non-unity” eigenvalue for a Markov matrix is always less than 1 in absolute value, but the point here is that it might be negative. Note that even some Markov matrices do not satisfy the eigenvalue condition 0 < λ_{2} < 1 precisely because λ_{2}, though absolutely less than 1, may indeed be negative. For example, for the 2X2 Markov matrix with p = q = 0.1, the “secondary” eigenvalue λ_{2} is -0.8 and the absolute value of the trace of A-I is 1.8 > 1, so that there is no convergence of the coefficient series.]
Under the assumptions that 0 < p < 1 and 0 < q < 1 (and λ_{1} = 1) the convergence condition 0 < λ_{2} < 1 is in fact equivalent to the “trace” convergence condition stated. We have already seen that the “trace” condition implies the λ_{2} condition. Conversely, the latter condition 0 < λ_{2} < 1 implies 0 < λ_{2} < 2, which implies (since λ_{1} - 1 = 0 under the assumptions about the matrix A) 0 < λ_{1} + λ_{2} - 1 < 2, or -1 < λ_{1} + λ_{2} - 2 < 1, or | trace(A-I) | < 1. (Convergence for the matrix used in the worked-out example above was assured, since there λ_{2} = 0.3, so that we had 0 < λ_{2} < 1.) It turns out that this condition on “non-unity” eigenvalues applies to Markov matrices of higher order as well.

Had we used the eigenvalue method (which requires far less computation than the series method) to compute A^{1/2} in exact form, we would have begun by determining that the eigenvalues of A are λ_{1} = 1 with corresponding eigenvector **v_{1}** = (5,2) and λ_{2} = 0.3 with corresponding eigenvector **v_{2}** = (1,-1). Writing

A^{1/2} =
| a   b |
| c   d |

then the relation A^{1/2}**v_{i}** = λ_{i}^{1/2}**v_{i}** gives

A^{1/2}(5,2) = (5,2) and A^{1/2}(1,-1) = (0.3)^{1/2}(1,-1)

implying, by matrix multiplication, 5a + 2b = 5, 5c + 2d = 2, a - b = (0.3)^{1/2}, and c - d = -(0.3)^{1/2}. Solving these equations for a, b, c and d gives the desired result in *exact* form:

A^{1/2} = (1/7) ×
| 5 + 2(0.3)^{1/2}     5 - 5(0.3)^{1/2} |
| 2 - 2(0.3)^{1/2}     2 + 5(0.3)^{1/2} |

which, with entries shown to nine decimal places, is

A^{1/2} =
| 0.870777874   0.323055316 |
| 0.129222126   0.676944684 |

to which the fourteen-term series approximation obtained earlier is accurate to within one-ten-thousandth.

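The four linear equations can be solved mechanically and the result checked by squaring. A pure-Python sketch, assuming (consistent with the eigendata above) that the example matrix is A = [[0.8, 0.5], [0.2, 0.5]]:

```python
import math

r = math.sqrt(0.3)       # square root of the secondary eigenvalue

# From 5a + 2b = 5 and a - b = r:   a = (5 + 2r)/7, b = a - r
# From 5c + 2d = 2 and c - d = -r:  c = (2 - 2r)/7, d = c + r
a = (5.0 + 2.0 * r) / 7.0
b = a - r
c = (2.0 - 2.0 * r) / 7.0
d = c + r
B = [[a, b], [c, d]]

def mm(X, Y):
    return [[X[i][0] * Y[0][j] + X[i][1] * Y[1][j]
             for j in range(2)] for i in range(2)]

B2 = mm(B, B)            # should recover the assumed matrix A
```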
Unlike the