ON NON-INTEGER POWERS OF A SQUARE MATRIX
Donald R. Burleson, Ph.D.
Copyright (c) 2005 by Donald R. Burleson. All rights reserved.
Given a square (n-by-n) matrix $A$, the zeros $\lambda_i$ of the $n$th-degree polynomial $\det(A - \lambda I)$ (where $I$ represents the identity matrix of the same size as $A$) are the eigenvalues of $A$, and to each eigenvalue $\lambda_i$ there corresponds an eigenvector $v_i$ (a vector unique up to a scalar multiple) such that, for $1 \le i \le n$,
$$A v_i = \lambda_i v_i.$$
That is, the nature of an eigenvalue and its corresponding eigenvector is that the effect of multiplying the matrix $A$ onto the vector is the same as multiplying the number $\lambda_i$ onto the vector. Also,
$$A^2 v_i = A(A v_i) = A(\lambda_i v_i) = \lambda_i (A v_i) = \lambda_i (\lambda_i v_i) = \lambda_i^2 v_i,$$
and in fact one may easily show by induction on n that
$$A^n v_i = \lambda_i^n v_i$$
for all positive integers $n$. Thus we have, in this context, the very economical substitution of a mere power of the number $\lambda_i$ for the more laboriously computed power $A^n$ of the given matrix, having the same multiplicative effect on the eigenvector. By extension, it has the same multiplicative effect on any vector in the same space, when that vector is written in terms of an eigenbasis $\{v_i\}$: for $w = \sum_i k_i v_i$, in terms of appropriate basis-representation scalar coefficients $k_i$, we would have $A^n w = \sum_i k_i \lambda_i^n v_i$, since $A$ represents a linear transformation.
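For readers who wish to experiment, here is a minimal numerical sketch of this substitution (in Python with numpy; the example matrix is the 2-by-2 matrix examined below, and the test vector $w$ is an arbitrary choice of my own):

    import numpy as np

    A = np.array([[3.0, 1.0],
                  [2.0, 2.0]])
    lam, V = np.linalg.eig(A)      # eigenvalues lam_i; columns of V are v_i

    n = 4
    An = np.linalg.matrix_power(A, n)

    # A^n v_i = lam_i^n v_i for each eigenpair
    for i in range(len(lam)):
        assert np.allclose(An @ V[:, i], lam[i] ** n * V[:, i])

    # For w = sum k_i v_i, compare A^n w with sum k_i lam_i^n v_i
    w = np.array([5.0, -1.0])
    k = np.linalg.solve(V, w)      # coordinates of w in the eigenbasis
    assert np.allclose(An @ w, V @ (k * lam ** n))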
It is natural to consider generalizing the relation $A^n v_i = \lambda_i^n v_i$ to non-integer values of the exponent $n$. [Remark: By the method about to be described, for a nonsingular matrix $A$, if one puts $n = -1$ one simply produces the multiplicative inverse matrix $A^{-1}$ by a somewhat circuitous way of computing it; all other negative-integer values of $n$ of course produce powers of $A^{-1}$.]
Let’s look at the case $n = 1/2$, so that the relation in question becomes
$$A^{1/2} v_i = \lambda_i^{1/2} v_i.$$
The “square root” of the matrix $A$ should reasonably be defined as a matrix $M$ such that $MM = A$ and such that $M = A^{1/2}$ satisfies the above relation for any given eigenvalue and corresponding eigenvector. And indeed it turns out that the eigenvalues of $A$ provide a way of computing it.
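In matrix form, the computation about to be illustrated amounts to diagonalizing $A$ and taking square roots of the eigenvalues. Here is a short sketch of that recipe (the function name frac_power is my own, and principal complex roots are assumed for negative eigenvalues):

    import numpy as np

    def frac_power(A, p):
        """A^p = V diag(lam**p) V^(-1) for a diagonalizable matrix A."""
        lam, V = np.linalg.eig(A)
        return V @ np.diag(lam.astype(complex) ** p) @ np.linalg.inv(V)

    # usage: M = frac_power(A, 0.5) gives a square root of A, with M @ M ~ A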
Take for example the 2-by-2 matrix
$$A = \begin{pmatrix} 3 & 1 \\ 2 & 2 \end{pmatrix}.$$
The eigenvalues $\lambda_i$ are the zeros of the characteristic polynomial of $A$:
$$\det(A - \lambda I) = \det\begin{pmatrix} 3-\lambda & 1 \\ 2 & 2-\lambda \end{pmatrix} = (3-\lambda)(2-\lambda) - 2(1) = \lambda^2 - 5\lambda + 4 = (\lambda-4)(\lambda-1),$$
so that the spectrum of $A$, i.e. the set of all eigenvalues of $A$, is $\{\lambda_1 = 4, \lambda_2 = 1\}$. Let an eigenvector corresponding to $\lambda_1$ be $(x, y)$. Putting
$$\begin{pmatrix} 3 & 1 \\ 2 & 2 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = 4\begin{pmatrix} x \\ y \end{pmatrix},$$
so that $3x + y = 4x$ (and equivalently $2x + 2y = 4y$), we have $y = x$, and we may most simply take $x = y = 1$, so that the corresponding eigenvector $(x, y)$ is $v_1 = (1, 1)$.
In like manner, for the other eigenvalue $\lambda_2 = 1$ we may derive the eigenvector $v_2 = (1, -2)$.
If we denote the desired square root matrix as
$$A^{1/2} = \begin{pmatrix} a & b \\ c & d \end{pmatrix},$$
with a view to finding the elements $a$, $b$, $c$, and $d$, we have, from the relation $A^{1/2} v_1 = \lambda_1^{1/2} v_1$:
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 1 \\ 1 \end{pmatrix} = 2\begin{pmatrix} 1 \\ 1 \end{pmatrix},$$
which by matrix multiplication implies $a + b = 2$ and $c + d = 2$. Similarly, from $A^{1/2} v_2 = \lambda_2^{1/2} v_2$ we have
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 1 \\ -2 \end{pmatrix} = 1 \cdot \begin{pmatrix} 1 \\ -2 \end{pmatrix},$$
which implies $a - 2b = 1$ and $c - 2d = -2$. Solving the $a, b$ equations together and the $c, d$ equations together gives $a = 5/3$, $b = 1/3$, $c = 2/3$, and $d = 4/3$, so that the desired square root matrix is
$$A^{1/2} = \frac{1}{3}\begin{pmatrix} 5 & 1 \\ 2 & 4 \end{pmatrix}.$$
As a check, since we should have $M^2 = A$, we may multiply:
$$\frac{1}{3}\begin{pmatrix} 5 & 1 \\ 2 & 4 \end{pmatrix} \cdot \frac{1}{3}\begin{pmatrix} 5 & 1 \\ 2 & 4 \end{pmatrix} = \frac{1}{9}\begin{pmatrix} 27 & 9 \\ 18 & 18 \end{pmatrix} = \begin{pmatrix} 3 & 1 \\ 2 & 2 \end{pmatrix} = A.$$
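The same check can be run numerically; a sketch in Python/numpy, using the matrices found above:

    import numpy as np

    A = np.array([[3.0, 1.0],
                  [2.0, 2.0]])
    M = np.array([[5.0, 1.0],
                  [2.0, 4.0]]) / 3.0   # the square root found above
    assert np.allclose(M @ M, A)       # M^2 reproduces A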
Different sorts of eigenvalues may produce different sorts of results. (For present purposes we will look only at scenarios with distinct eigenvalues, though similar results may be pursued in other cases.) For example, the matrix
$$B = \begin{pmatrix} 2 & -1 \\ -6 & 1 \end{pmatrix}$$
has eigenvalue spectrum $\{\lambda_1 = 4, \lambda_2 = -1\}$ with corresponding eigenvectors $v_1 = (1, -2)$ and $v_2 = (1, 3)$, and since one of the eigenvalues is negative, with $\lambda_2^{1/2} = (-1)^{1/2} = i$, the square root matrix turns out to have complex entries when we work it out. One obtains
$$B^{1/2} = \frac{1}{5}\begin{pmatrix} 6+2i & -2+i \\ -12+6i & 4+3i \end{pmatrix},$$
which, as one may verify, checks upon multiplication:
$$\frac{1}{25}\begin{pmatrix} 6+2i & -2+i \\ -12+6i & 4+3i \end{pmatrix}^2 = \frac{1}{25}\begin{pmatrix} 50 & -25 \\ -150 & 25 \end{pmatrix} = \begin{pmatrix} 2 & -1 \\ -6 & 1 \end{pmatrix} = B.$$
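Again the verification is easy to do numerically; a sketch with complex entries, using the matrices as reconstructed above:

    import numpy as np

    B = np.array([[ 2.0, -1.0],
                  [-6.0,  1.0]])
    M = np.array([[  6 + 2j, -2 + 1j],
                  [-12 + 6j,  4 + 3j]]) / 5.0
    assert np.allclose(M @ M, B)    # the complex square root squares back to B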
Naturally the same process applies to larger square matrices. E.g., for a 3-by-3 triangular matrix $C$, which (since it is triangular) has eigenvalues equal to its diagonal elements 1, 9, and 4, the same eigenvector procedure produces a square root matrix $C^{1/2}$ with $C^{1/2} C^{1/2} = C$.
(The only limitation here, with larger matrices, is the difficulty of finding the zeros of the characteristic polynomial; for degree $n \ge 5$ they can in general only be approximated.)
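In practice one simply lets a numerical routine approximate the eigenvalues. A sketch for a 3-by-3 triangular matrix (only the diagonal 1, 9, 4 is taken from the discussion above; the off-diagonal entries here are hypothetical illustrations of my own):

    import numpy as np

    C = np.array([[1.0, 2.0, 3.0],
                  [0.0, 9.0, 5.0],
                  [0.0, 0.0, 4.0]])
    lam, V = np.linalg.eig(C)       # returns the diagonal entries 1, 9, 4
    S = V @ np.diag(np.sqrt(lam)) @ np.linalg.inv(V)
    assert np.allclose(S @ S, C)    # S is a square root of C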
One may of course pursue higher-index “roots” of a square matrix $A$, e.g. $A^{1/3}$, $A^{1/4}$, etc., by working with the appropriate fractional power. For example, with $n = 1/3$, the relation $A^{1/3} v_i = \lambda_i^{1/3} v_i$ yields, by the same procedure, a matrix $M = A^{1/3}$, which checks as $MMM = A$.
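As a concrete sketch, one may take the cube root of the earlier 2-by-2 matrix $A$ (Python/numpy):

    import numpy as np

    A = np.array([[3.0, 1.0],
                  [2.0, 2.0]])
    lam, V = np.linalg.eig(A)
    M = V @ np.diag(lam ** (1.0 / 3.0)) @ np.linalg.inv(V)   # A^(1/3)
    assert np.allclose(M @ M @ M, A)    # M^3 recovers A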
In a similar fashion we may compute non-integer square-matrix powers in general, since the method only involves raising the eigenvalues to the powers in question and following the procedure shown.
Remark: By using the identity (where $i$ denotes the imaginary unit) $\lambda^i = (e^{\ln \lambda})^i = e^{i \ln \lambda} = \cos[\ln \lambda] + i \sin[\ln \lambda]$ (where if $\lambda < 0$ we may use the principal complex value of the logarithm, i.e. $\ln|\lambda| + \pi i$), one may extend the determination of $A^z$ to complex powers: $A^z = A^{x+iy} = A^x (A^y)^i$. In this connection it is interesting to note that for a nonsingular square matrix $A$, if we use the indicated method to raise $A$ to the power $i$ and then raise the resulting matrix $A^i$ to the power $i$, producing $(A^i)^i = A^{-1}$, all in terms of principal complex values, the result is the ordinary inverse of $A$. Since the action of raising the nonsingular matrix $A$ to the power $i$ twice in succession has the effect of producing $A^{-1}$, we may call the matrix $A^i$ the semi-inverse of $A$. For purposes of raising the eigenvalues to the power $i$, it is helpful to recall that Euler’s formula implies $(-1)^i = e^{-\pi}$ in principal value (while $1^i = e^{-2\pi}$ on the branch $\ln 1 = 2\pi i$; in principal value $1^i = 1$).
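A numerical sketch of the semi-inverse, again with the earlier matrix $A$ (the helper frac_power and the reliance on numpy’s principal-valued complex powers are my own assumptions):

    import numpy as np

    def frac_power(A, z):
        # eigenvalue method with principal complex values throughout
        lam, V = np.linalg.eig(A)
        return V @ np.diag(lam.astype(complex) ** z) @ np.linalg.inv(V)

    A = np.array([[3.0, 1.0],
                  [2.0, 2.0]])
    Ai = frac_power(A, 1j)          # the "semi-inverse" A^i
    assert np.allclose(frac_power(Ai, 1j), np.linalg.inv(A))  # (A^i)^i = A^(-1)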
But let us look at an important application of the eigenvalue-based method of raising a square matrix A to real but non-integer powers.
APPLICATION: Computing a market-share transition matrix from knowledge of cumulative market-share changes over two or more transition periods
Consider, by way of quick illustration, a simple economic scenario in which there are two brands A and B of a certain product. Suppose that over any given “transition” period (a period in which buyers may or may not switch brands, e.g. possibly a week, since many people shop weekly) there is a constant pattern of consumer “defection” or non-defection, defined by the following column-stochastic transition matrix $T$:
$$T = \begin{pmatrix} 0.7 & 0.2 \\ 0.3 & 0.8 \end{pmatrix}.$$
This means that, over one transition period, 70% of A-users will stay put with Brand A while 30% will defect to Brand B; and of B-users, 20% will defect to A and 80% will stick with B. (The first column is the “from A” column; the second column is the “from B” column; the first row is the “to A” row; the second row is the “to B” row.) If the current market-share configuration is 81% for A and 19% for B, the matrix multiplication
$$\begin{pmatrix} 0.7 & 0.2 \\ 0.3 & 0.8 \end{pmatrix}\begin{pmatrix} 0.81 \\ 0.19 \end{pmatrix} = \begin{pmatrix} 0.605 \\ 0.395 \end{pmatrix}$$
shows that after one transition period the market shares will be 60.5% for A and 39.5% for B, an improvement for B because B’s retention rate is better than A’s. (Further transitions will show further improvement for B, but only up to a point. It can be shown that if the transition pattern $T$ remains steady over sufficient time, the market-share sequence (a Markov chain) approaches a fixed market-share vector (the “steady-state” vector) of $(0.40, 0.60)$. I.e., eventually A will have 40% of the market and B will have 60%. A’s situation will not deteriorate beyond that point, and B’s situation will not further improve.)
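A quick sketch of this transition arithmetic, including the approach to the steady state (Python/numpy, using the figures above):

    import numpy as np

    T = np.array([[0.7, 0.2],
                  [0.3, 0.8]])      # column-stochastic transition matrix
    s = np.array([0.81, 0.19])      # current shares: 81% A, 19% B

    print(T @ s)                    # one period later: [0.605, 0.395]

    for _ in range(50):             # iterate toward the steady state
        s = T @ s
    print(s)                        # approaches [0.4, 0.6]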
Now suppose the matrix
$$M = \begin{pmatrix} 0.55 & 0.30 \\ 0.45 & 0.70 \end{pmatrix}$$
is known to describe the market-share changes after two transition periods. (I.e., we have empirically observed that by final tally after two such periods, 55% of “going-in” A-users have stuck with A while 45% have defected to B, and 30% of “going-in” B-users have defected to A with 70% staying with B.) Suppose also that the basic transition matrix $T$ describing the market-share changes over one typical transition period is unknown, and this is what we need to find. (Among other things, it is useful to recover $T$ if one plans to project market-share patterns over several further transition periods.)
Since $M$ describes the total effect of two standard transitions $T$, we have $TT = M$, and we need to compute $T = M^{1/2}$, which by the above-illustrated eigenvalue method turns out to be
$$T = \begin{pmatrix} 0.7 & 0.2 \\ 0.3 & 0.8 \end{pmatrix}.$$
Thus from knowing the cumulative changes over two transition periods, we have been able to recover the transition structure that prevails over each single transition period, always assuming this structure to be constant for the duration in question. If the recovered matrix structure T is indeed constant, it becomes a predictor of the effects over the next transition period, as well as subsequent ones.
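The recovery of $T$ is a one-line application of the square-root computation; a sketch:

    import numpy as np

    M = np.array([[0.55, 0.30],
                  [0.45, 0.70]])    # observed two-period matrix
    lam, V = np.linalg.eig(M)
    T = V @ np.diag(np.sqrt(lam)) @ np.linalg.inv(V)
    print(T)                        # recovers [[0.7, 0.2], [0.3, 0.8]]
    assert np.allclose(T @ T, M)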
Similarly, if we knew that a matrix $M$ described cumulative market-share changes over three transition periods, we could recover the basic one-period transition matrix by computing $T = M^{1/3}$, and similarly for higher-order iterations.
In fact, if we knew, for example, that the market-share changes after five periodic iterations were described by the matrix
$$M = \begin{pmatrix} 0.41875 & 0.3875 \\ 0.58125 & 0.6125 \end{pmatrix},$$
and if for some reason we needed to find out what matrix describes the state of things at the end of the third of those five transition periods (where, again, the basic transition matrix $T$ is unknown), we could use the above eigenvalue method to compute
$$M^{3/5} = \begin{pmatrix} 0.475 & 0.35 \\ 0.525 & 0.65 \end{pmatrix},$$
finding that by the end of the third (out of five) transition periods, brand A had a 47.5% brand retention to B’s 65.0%. In this scenario, if the initial market shares are 81% (A) and 19% (B), we could recover the fact that after three transitions the market-share configuration must have been
$$\begin{pmatrix} 0.475 & 0.35 \\ 0.525 & 0.65 \end{pmatrix}\begin{pmatrix} 0.81 \\ 0.19 \end{pmatrix} = \begin{pmatrix} 0.45125 \\ 0.54875 \end{pmatrix},$$
i.e. about 45.1% for A and 54.9% for B,
and this from knowing only the “end” conditions. Clearly, in this manner, with a fixed transition structure T, whether initially known or not, we may recover the accumulated effects of market share changes at any point in the chain. And of course we may obtain the original basic transition matrix T in similar fashion.
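The whole five-period computation can be sketched the same way (with the five-period matrix as reconstructed above from the stated third-period retentions under a constant $T$):

    import numpy as np

    M5 = np.array([[0.41875, 0.3875],
                   [0.58125, 0.6125]])  # cumulative effect of five periods
    lam, V = np.linalg.eig(M5)
    T3 = V @ np.diag(lam ** (3.0 / 5.0)) @ np.linalg.inv(V)   # M5^(3/5)
    print(T3)                           # [[0.475, 0.35], [0.525, 0.65]]
    print(T3 @ np.array([0.81, 0.19]))  # shares after period three: ~[0.451, 0.549]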
Obviously there are analogous applications in any similar Markov-chain situation.