ON NON-INTEGER POWERS OF A SQUARE MATRIX

Donald R. Burleson, Ph.D.

Copyright (c) 2005 by Donald R. Burleson. All rights
reserved.


Given a square (*n*-by-*n*) matrix A, the zeros λ_{i} of the *n*th-degree polynomial

det(A - λI) (where I represents the identity matrix of the same size as A)

are the *eigenvalues* of A, and to each eigenvalue λ_{i} there corresponds an *eigenvector* **v**_{i} (a vector unique up to a scalar multiple) such that, for each i,

A**v**_{i} = λ_{i}**v**_{i}

That is, the nature of an eigenvalue and its corresponding eigenvector is that the effect of multiplying the matrix A onto the vector is the same as multiplying the number λ_{i} onto the vector. Also,

A^{2}**v**_{i} = A(A**v**_{i}) = A(λ_{i}**v**_{i}) = λ_{i}(A**v**_{i}) = λ_{i}^{2}**v**_{i}

and in fact one may easily show by induction on *n* that

A^{n}**v**_{i} = λ_{i}^{n}**v**_{i}

for all positive integers *n*.
Thus we have, in this context, the very economical substitution of a mere power of the number λ_{i} for the more laboriously computed power A^{n} of the given matrix, having the same multiplicative effect on the eigenvector, and potentially, by extension, the same multiplicative effect on any vector in the same space when that vector is written in terms of an eigenbasis {**v**_{i}}: for **w** = Σ k_{i}**v**_{i}, in terms of appropriate basis-representation scalar coefficients k_{i}, we have A^{n}**w** = Σ k_{i}λ_{i}^{n}**v**_{i}.
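(As a quick numerical illustration, not part of the original article: the following minimal Python/numpy sketch verifies both relations for the 2-by-2 matrix developed below; the vector **w** is chosen arbitrarily.)

```python
import numpy as np

# The 2-by-2 example matrix developed later in this article.
A = np.array([[3.0, 1.0],
              [2.0, 2.0]])

lam, V = np.linalg.eig(A)      # eigenvalues lam[i], eigenvectors V[:, i]

# Check A^n v_i = lam_i^n v_i for a positive integer n.
n = 5
An = np.linalg.matrix_power(A, n)
for i in range(2):
    assert np.allclose(An @ V[:, i], lam[i]**n * V[:, i])

# Any vector w expands in the eigenbasis, w = sum_i k_i v_i, where the
# coefficients k solve V k = w; then A^n w = sum_i k_i lam_i^n v_i.
w = np.array([5.0, -1.0])
k = np.linalg.solve(V, w)
assert np.allclose(An @ w, V @ (k * lam**n))
```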

It is natural to consider generalizing the relation A^{n}**v**_{i} = λ_{i}^{n}**v**_{i} to non-integer values of *n*.

Let’s look at the case n = 1/2, so that the relation in question becomes

A^{1/2}**v**_{i} = λ_{i}^{1/2}**v**_{i}

The “square root” of the matrix A should reasonably be *defined* as a matrix M such that MM =
A and such that M = A^{1/2} satisfies the above
relation for any given eigenvalue and corresponding
eigenvector. And indeed it turns
out that the **eigenvalues** of A provide a way
of computing it.

Take for example the 2-by-2 matrix

A = [ 3  1 ]
    [ 2  2 ]

The eigenvalues λ_{i} are the zeros of the characteristic polynomial of A:

det(A - λI) = det [ 3-λ   1  ] = (3-λ)(2-λ) - 2(1) = λ^{2} - 5λ + 4 = (λ-4)(λ-1)
                  [  2   2-λ ]

so that the *spectrum* of A, i.e. the set of all eigenvalues of A, is { λ_{1} = 4, λ_{2} = 1 }. Let an eigenvector corresponding to λ_{1} be (x,y). By putting

[ 3  1 ] [ x ]     [ x ]
[ 2  2 ] [ y ] = 4 [ y ]

so that 3x + y = 4x (and equivalently 2x + 2y = 4y), we have y = x, and we may most simply take x = y = 1, so that the corresponding eigenvector (x,y) is **v**_{1} = (1,1).

In like manner, for the other eigenvalue λ_{2} we may derive the eigenvector **v**_{2} = (1,-2).

If we denote the desired square root matrix A^{1/2} as

[ a  b ]
[ c  d ]

with a view to finding the elements a, b, c, and d, we have, from the relation A^{1/2}**v**_{1} = λ_{1}^{1/2}**v**_{1} = 2**v**_{1},

[ a  b ] [ 1 ]     [ 1 ]
[ c  d ] [ 1 ] = 2 [ 1 ]

which by matrix multiplication implies a + b = 2 and c + d = 2.
Similarly, from A^{1/2}**v**_{2} = λ_{2}^{1/2}**v**_{2} = **v**_{2} we have

[ a  b ] [  1 ]   [  1 ]
[ c  d ] [ -2 ] = [ -2 ]

which implies a - 2b = 1 and c - 2d = -2. Solving the a,b equations together and the c,d equations together gives a = 5/3, b = 1/3, c = 2/3, and d = 4/3, so that the desired square root matrix is

A^{1/2} = (1/3) [ 5  1 ]
                [ 2  4 ]

As a check, since we should have M^{2} = A, we may multiply:

M^{2} = (1/9) [ 5  1 ] [ 5  1 ] = (1/9) [ 27   9 ] = [ 3  1 ] = A
              [ 2  4 ] [ 2  4 ]         [ 18  18 ]   [ 2  2 ]
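(As a computational aside, not in the original article: a numerical library will do the same work. Here is a minimal Python/numpy sketch of the eigenvalue method just illustrated.)

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [2.0, 2.0]])

# Eigendecomposition A = V diag(lam) V^{-1}; here lam is [4, 1]
# (possibly in the other order).
lam, V = np.linalg.eig(A)

# Take square roots of the eigenvalues and reassemble.
M = V @ np.diag(np.sqrt(lam)) @ np.linalg.inv(V)

print(M)                       # approximately [[5/3, 1/3], [2/3, 4/3]]
print(np.allclose(M @ M, A))   # True: M^2 recovers A
```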

Different sorts of eigenvalues may produce different sorts of results. (For present purposes we will look only at scenarios with *distinct* eigenvalues, though similar results may be pursued in other cases.) For example, a matrix [not preserved in this copy] with eigenvalue spectrum { λ_{1} = 4, λ_{2} = -1 } and corresponding eigenvectors **v**_{1} = (1,-2) and **v**_{2} [not preserved] has λ_{2}^{1/2} = i, so the same procedure yields a square root matrix with complex entries, which, as one may verify, checks upon multiplication.
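(Since the matrix itself did not survive in this copy, here is a sketch with a hypothetical stand-in, chosen purely for illustration, having the same spectrum { 4, -1 } and eigenvector (1,-2) for the eigenvalue 4; the negative eigenvalue forces complex entries in the square root.)

```python
import numpy as np

# Hypothetical stand-in (NOT the article's original matrix, which was
# not preserved): eigenvalues 4 and -1, with eigenvector (1, -2) for 4.
B = np.array([[ 0.0, -2.0],
              [-2.0,  3.0]])

lam, V = np.linalg.eig(B)      # lam is [4, -1] in some order
# sqrt(-1) = i, so promote the eigenvalues to complex before the square root.
M = V @ np.diag(np.sqrt(lam.astype(complex))) @ np.linalg.inv(V)

print(M)                       # complex entries appear
print(np.allclose(M @ M, B))   # True: M^2 still recovers B
```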

Naturally the same process applies to larger square matrices. E.g., for a 3-by-3 triangular matrix C [not preserved in this copy], which (since it is a triangular matrix) has eigenvalues equal to the diagonal elements 1, 9, and 4, the same procedure yields the square root matrix C^{1/2}.

(The only limitation here, with larger matrices, is the possibility of finding the zeros of the characteristic polynomial; for degree *n* ≥ 5 they can in general only be approximated.)
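(Again as a sketch with a hypothetical stand-in, since the article's C was not preserved: any triangular matrix with diagonal 1, 9, 4 illustrates the point, because its eigenvalues are distinct.)

```python
import numpy as np

# Hypothetical stand-in: upper triangular with the stated diagonal 1, 9, 4
# (the article's actual off-diagonal entries were not preserved).
C = np.array([[1.0, 2.0, 0.0],
              [0.0, 9.0, 3.0],
              [0.0, 0.0, 4.0]])

lam, V = np.linalg.eig(C)      # the eigenvalues are the diagonal: 1, 9, 4
R = V @ np.diag(np.sqrt(lam)) @ np.linalg.inv(V)

print(np.allclose(R @ R, C))   # True: R is a square root of C
```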

One may of course pursue higher-index “roots” of a square matrix A, e.g. A^{1/3}, A^{1/4}, etc., by working with the appropriate fractional power. For example, with n = 1/3, the relation A^{1/3}**v**_{i} = λ_{i}^{1/3}**v**_{i}, applied to the same matrix A and eigenvectors as above with λ_{1}^{1/3} = 4^{1/3} and λ_{2}^{1/3} = 1, gives

A^{1/3} = (1/3) [ 2(4^{1/3})+1   4^{1/3}-1 ]
                [ 2(4^{1/3})-2   4^{1/3}+2 ]

which checks as (A^{1/3})^{3} = A.

In a similar fashion we may compute non-integer matrix powers in general, since they involve only raising the eigenvalues to the power in question and following the procedure shown.
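(The whole procedure condenses to a few lines of code. Here is a minimal sketch of a general real-power routine under this article's standing assumption of distinct eigenvalues; the name matrix_power_real is chosen here purely for illustration.)

```python
import numpy as np

def matrix_power_real(A, p):
    """A^p for a diagonalizable square matrix A and real p, via
    A = V diag(lam) V^{-1}  =>  A^p = V diag(lam^p) V^{-1}.
    Eigenvalues are promoted to complex so that negative ones are
    handled by their principal complex power."""
    lam, V = np.linalg.eig(A)
    lam_p = np.power(lam.astype(complex), p)
    M = V @ np.diag(lam_p) @ np.linalg.inv(V)
    # Discard a negligible imaginary part when the result is really real.
    return np.real_if_close(M)

A = np.array([[3.0, 1.0],
              [2.0, 2.0]])
R = matrix_power_real(A, 1/3)
print(np.allclose(R @ R @ R, A))   # True: the cube root checks out
```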

Remark: By using the identity (where *i* denotes the imaginary unit)

λ^{i} = (e^{ln λ})^{i} = e^{i ln λ} = cos(ln λ) + i sin(ln λ)

(where if λ < 0 we may use the principal complex value of the logarithm, i.e. ln|λ| + πi), one may extend the determination of A^{z} to complex powers: A^{z} = A^{x+iy} = A^{x}(A^{y})^{i}. In this connection it is interesting to note that for a nonsingular square matrix A, if we use the indicated method to raise A to the power *i* and then raise the resulting matrix A^{i} to the power *i*, producing (A^{i})^{i} = A^{-1}, all in terms of principal complex values, the result is the ordinary inverse of A. Since the action of raising the nonsingular matrix A to the power *i* twice in succession has the effect of producing A^{-1}, we may call the matrix A^{i} the **semi-inverse** of A. For purposes of raising the eigenvalues to the power *i*, it is helpful to recall that in principal value 1^{i} = 1, while Euler’s formula implies (-1)^{i} = (e^{iπ})^{i} = e^{-π}.
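(A numerical sketch of the semi-inverse, again not part of the original article; numpy's np.log and np.power take principal values for complex arguments, which is what the remark calls for, and the helper name matrix_power_complex is illustrative.)

```python
import numpy as np

def matrix_power_complex(A, z):
    """A^z for a diagonalizable matrix A and complex z, using the
    principal complex power lam^z = exp(z * log(lam)) of each eigenvalue."""
    lam, V = np.linalg.eig(A)
    lam_z = np.exp(z * np.log(lam.astype(complex)))
    return V @ np.diag(lam_z) @ np.linalg.inv(V)

A = np.array([[3.0, 1.0],
              [2.0, 2.0]])

Ai = matrix_power_complex(A, 1j)      # the "semi-inverse" A^i
AiAi = matrix_power_complex(Ai, 1j)   # ...raised to the power i again

print(np.allclose(AiAi, np.linalg.inv(A)))   # True: (A^i)^i = A^(-1)
```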

But let us look at an important application of the eigenvalue-based method of raising a square matrix A to real but non-integer powers.

APPLICATION:

Computing a market-share transition matrix from knowledge of cumulative market-share changes over two or more transition periods

Consider, by way of quick illustration, a simple economic scenario in which there are two brands A and B of a certain product. Suppose that over any given “transition” period (period in which buyers may or may not switch brands, e.g. possibly a week, since many people shop weekly) there is a constant pattern of consumer “defection” or non-defection, defined by the following column-stochastic transition matrix T:

T = [ 0.7  0.2 ]
    [ 0.3  0.8 ]

This means that, over one transition period, 70% of A-users will stay put with Brand A while 30% will defect to Brand B; and of B-users, 20% will defect to A and 80% will stick with B. (The first column is the “from A” column; the second column is the “from B” column; the first row is the “to A” row; the second row is the “to B” row.) If the current market share configuration is 81% for A and 19% for B, the matrix multiplication

[ 0.7  0.2 ] [ 0.81 ]   [ 0.605 ]
[ 0.3  0.8 ] [ 0.19 ] = [ 0.395 ]

shows that after one transition period the market shares will be 60.5% for A and 39.5% for B, an improvement for B because B’s retention rate is better than A’s. (Further transitions will show further improvement for B, but only up to a point. It can be shown that if the transition pattern T remains steady over sufficient time, the market share sequence (Markov chain) approaches a kind of fixed market share vector (“steady-state” vector) of (0.40, 0.60). I.e., eventually A will have 40% of the market and B will have 60%. A’s situation will not deteriorate beyond that point, and B’s situation will not further improve.)
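(One can watch the chain settle toward that steady-state vector with a few lines of numpy; a minimal sketch, not part of the original article.)

```python
import numpy as np

T = np.array([[0.7, 0.2],
              [0.3, 0.8]])
share = np.array([0.81, 0.19])

# Iterate the chain: market shares after successive transition periods.
for _ in range(20):
    share = T @ share
print(share)            # approaches the steady state (0.40, 0.60)

# Equivalently, the steady state is the eigenvector of T for eigenvalue 1,
# normalized so that its entries sum to 1.
lam, V = np.linalg.eig(T)
v = V[:, np.argmin(np.abs(lam - 1.0))]
print(v / v.sum())      # [0.4, 0.6]
```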

Now suppose the matrix

M = [ 0.55  0.30 ]
    [ 0.45  0.70 ]

is known to describe the market
share changes after *two* transition
periods. (I.e., we have empirically
observed that by final tally after two such periods, 55% of “going-in” A-users
have stuck with A while 45% have defected to B, and 30% of “going-in” B-users
have defected to A with 70% staying with B.) Suppose also that the basic transition
matrix T describing the market share changes over *one* typical transition period is
unknown, and this is what we need to find.
(Among other things, it’s useful to recover T if one plans to project
market share patterns over several further transition periods.)

Since M describes the total effect of two standard transitions T, we have TT = M, and we need to compute T = M^{1/2}, which by the above-illustrated eigenvalue method turns out to be

T = [ 0.7  0.2 ]
    [ 0.3  0.8 ]
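(In code, recovering T takes only a few lines; M has eigenvalues 1 and 0.25, whose square roots 1 and 0.5 rebuild T.)

```python
import numpy as np

M = np.array([[0.55, 0.30],
              [0.45, 0.70]])

# Eigenvalues of M are 1 and 0.25; their square roots are 1 and 0.5.
lam, V = np.linalg.eig(M)
T = np.real_if_close(V @ np.diag(np.sqrt(lam.astype(complex))) @ np.linalg.inv(V))

print(T)                       # [[0.7, 0.2], [0.3, 0.8]]
print(np.allclose(T @ T, M))   # True: two one-period transitions give M
```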

Thus from knowing the cumulative changes over *two* transition periods, we have been able to recover the transition structure that prevails over each *single* transition period, always assuming this structure to be constant for the duration in question. If the recovered matrix structure T is indeed constant, it becomes a predictor of the effects over the next transition period, as well as subsequent ones.

Similarly, if we knew that a matrix M described cumulative market share changes over *three* transition periods, we could recover the basic one-period transition matrix by computing M^{1/3}, and similarly for higher-order iterations.

In fact, if we knew, for example, that the market share changes after *five* periodic iterations were described by the matrix

M = [ 0.41875  0.3875 ]
    [ 0.58125  0.6125 ]

and if for some reason we needed to find out what matrix describes the state of things at the end of the *third* of those five transition periods (where, again, the basic transition matrix T is unknown), we could use the above eigenvalue method to compute

M^{3/5} = [ 0.475  0.35 ]
          [ 0.525  0.65 ]

finding that by the end of the third (out of five) transition periods, brand A had a 47.5% brand retention to B’s 65.0%. In this scenario, if the initial market shares are 81% (A) and 19% (B), we could recover the fact that after *three* transitions the market share configuration must have been

[ 0.475  0.35 ] [ 0.81 ]   [ 0.45125 ]
[ 0.525  0.65 ] [ 0.19 ] = [ 0.54875 ]

and this from knowing only the “end”
conditions. Clearly, in this
manner, with a fixed transition structure T, whether initially known or not, we
may recover the accumulated effects of market share changes at *any* point in the chain. And of course we may obtain the original
basic transition matrix T in similar fashion.
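(A sketch of that three-fifths-power computation, using numpy as before; the five-period matrix is the one displayed above.)

```python
import numpy as np

# Cumulative five-period transition matrix (displayed above).
M5 = np.array([[0.41875, 0.3875],
               [0.58125, 0.6125]])

# Three-fifths power: the state of things after 3 of the 5 periods.
lam, V = np.linalg.eig(M5)
M3 = np.real_if_close(V @ np.diag(np.power(lam.astype(complex), 3/5))
                      @ np.linalg.inv(V))
print(M3)                      # [[0.475, 0.35], [0.525, 0.65]]

share0 = np.array([0.81, 0.19])
print(M3 @ share0)             # [0.45125, 0.54875]
```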

Obviously there are analogous applications in any similar Markov-chain situation.
