Archive

Posts Tagged ‘Lemma of the Month’

No Similarity-Invariant Matrix Norm

September 4th, 2009

A matrix norm on Mn is said to be weakly unitarily-invariant if conjugating a matrix by a unitary U does not change the norm. That is,

\|X\|=\|UXU^*\|\ \ \forall \, X,U\in M_n \text{ with $U$ unitary.}

Many commonly-used matrix norms are weakly unitarily-invariant, including the operator norm, Frobenius norm, numerical radius, Ky Fan norms and Schatten p-norms. One might naturally wonder whether there are matrix norms that satisfy the slightly stronger property of similarity-invariance:

\|X\|=\|SXS^{-1}\|\ \ \forall\, X,Sin M_n\text{ with $S$ nonsingular.}

Upon first glance there doesn’t seem to be any reason why this shouldn’t be possible — one can look for simple examples that cause problems, but you’ll have trouble coming up with a matrix that causes problems if you restrict your attention to “nice” (i.e., normal) matrices. Nevertheless, we have the following lemma, which appeared as Exercise IV.4.1 in [1]:

Lemma (No Similarity-Invariant Norm). Let f : Mn → R be a function satisfying f(SXS-1) = f(X) for all X,S ∈ Mn with S invertible. Then f is not a norm.

If you’re interested in the (very short and elementary) proof of this lemma, see the pdf attached below. I would be greatly interested in seeing a proof of this fact that relies less on the structure of matrices themselves. It seems as though there should be a more general result that characterizes when we can and can not find a norm on a given vector space that is invariant with respect to some given subgroup, or some such thing. Would anyone care to enlighten me?

Related Links:

References:

  1. R. Bhatia, Matrix analysis. Volume 169 of Graduate texts in mathematics (1997).

Sylvester's Law of Inertia and *-Congruence

August 7th, 2009

Recall for matrices that a similarity transformation is one that takes a (square) matrix A to SAS-1, where S is some invertible matrix of the same size as A. Standard linear algebra courses like to beat into students just how important similarity transformations are, which makes sense from their point of view because linear algebra is really more about studying operators than matrices themselves, and similarity transformations correspond to just a change of basis. The downside is that many people end up thinking of similarity as the equivalence relation among matrices and hence the Jordan canonical form as the canonical form. This month’s Lemma of the Month is about the canonical form under a different equivalence relation: *-congruence.

What is *-Congruence?

Two square matrices A and B are said to be *-congruent if there is an invertible matrix S such that SAS* = B. Even though no inverses are present in that equation, invertibility really is required for any nice results to be obtained (for example, *-congruence would not even define an equivalence relation if S were allowed to be singular). For people who like knowing applications of mathematical tools before knowing the tools themselves, *-congruence has an important role when dealing with quadratic forms. Also, the inertia of a matrix, which will be defined in the next paragraph, comes into play when looking at sign pattern matrices (apologies for the lack of a wiki link; it seems as though sign pattern matrices are at that ugly stage after there are lots of papers written about them but before there is a Wikipedia page written about them).

Sylvester’s Law of Inertia

By far the most well-known result about *-congruence is Sylvester’s Law of Inertia, which gets its name from the seldom-used inertia of a matrix. The inertia of a Hermitian matrix A is defined to be the tuple

Inertia of a matrix

where n+ is the number of positive eigenvalues of A, n0 is the number of zero eigenvalues of A, and n- is the number of negative eigenvalues of A (recall that Hermitian matrices have real eigenvalues, so this definition makes sense). Sylvester’s Law of Inertia is as follows:

Theorem [1,2]. Let A, B ∈ Mn be Hermitian. Then A and B are *-congruent if and only if they have the same inertia.

One obvious corollary of this theorem is that every Hermitian matrix A is *-congruent to a diagonal matrix with n+ ones, n0 zeroes, and n- negative ones on the diagonal; this is the canonical form for Hermitian matrices under *-congruence.

Generalizations of Sylvester’s Law

If you’re a linear algebra nerd like me, then you might be thinking to yourself that Sylvester’s Law of Inertia seems trivial in light of the Spectral Decomposition Theorem; for any Hermitian A we could find a unitary so that UAU* is real and diagonal, and then we could find a diagonal D such that DUAU*D* is scaled appropriately to satisfy the theorem. Setting S = DU then shows us that we can always arrive at the canonical form, and the remaining steps in the proof are all more or less trivial (this proof is provided in much more detail in the PDF attached to the end of this post).

Notice, however, that the Spectral Decomposition Theorem applies not only to Hermitian matrices, but to normal matrices as well. Thus, using the same logic outlined in the previous paragraph, it stands to reason that the canonical form under *-congruence for normal matrices should be a diagonal matrix in which each diagonal entry either has modulus one or zero. Indeed, this is essentially the content of the following theorem of Ikramov, which was proved in 2001.

Theorem [3]. Let A, B ∈ Mn be normal. Then A and B are *-congruent if and only if they have the same number of eigenvalues on each open ray from the origin in the complex plane.

This theorem truly generalizes Sylvester’s Law of Inertia; in the Hermitian case, the only two open rays that eigenvalues can lie on are the positive real line and the negative real line. What I would like to know, though, is why on Earth this theorem was first proved/published so recently — it can be proved in about a paragraph using only ideas from an undergraduate linear algebra course, and I’m pretty sure that anyone who uses the Spectral Decomposition Theorem semi-regularly would see the main idea of the proof more or less instantly. Is it just a hole in the literature that wasn’t noticed until recently?

Anyway, for the really interested reader, there is a generalization of these results to the case when A and B need not even be normal, but it’s quite technical so I won’t present it here. See [4] if you’re curious.

Thanks to Roger Horn for his great series of lectures at the Summer School and Advanced Workshop on Trends and Developments in Linear Algebra, which introduced me to the generalizations of Sylvester’s Law.

Related Links:

References:

  1. Sylvester, J. J., A demonstration of the theorem that every homogeneous quadratic polynomial is reducible by real orthogonal substitutions to the form of a sum of positive and negative squares. Philosophical Magazine IV: 138–142 (1852).
  2. Horn, R. and Johnson, C., Matrix analysis. Cambridge University Press (1990).
  3. Ikramov, Kh. D., On the inertia law for normal matrices. Doklady Math. 64 (2001) 141-142.
  4. Horn, R. and Sergeichuk, V., Canonical forms for complex matrix congruence and *congruence. Linear Algebra Appl. 416 (2006) 1010-1032. arXiv:0709.2473v1 [math.RT]

A Backward Triangle Inequality for Matrices

July 3rd, 2009

This month’s Lemma of the Month comes all the way from the Summer School and Advanced Workshop on Trends and Developments in Linear Algebra in Trieste, Italy, where Professor Rajendra Bhatia presented a lecture that introduced several simple yet endlessly interesting matrix inequalities. I will briefly present the various results here without proof, though the proof of the first “stepping stone” lemma is provided in the PDF attached to the bottom of this post. The truly interested reader can find full proofs in Professor Bhatia’s notes (follow the link above) or in [1].

Recall that one of the defining properties of a matrix norm is that it satisfies the triangle inequality:

Triangle Inequality

So what can we say about generalizing the backward triangle inequality to matrices? We can of course replace A by A – B in the above equation to find the following backward triangle inequality:

Backward Triangle Inequality

However, what happens if we swap the roles of the absolute value and the matrix norm on the left-hand side? That is, if we recall that |A| is the positive semidefinite part of A (i.e., the square root of A*A), then can we say something like

Incorrect Backward Triangle Inequality

It turns out that the answer to this question is heavily norm-dependent, so we will focus on the norm that gives the simplest answer: the Frobenius norm, which I will denote by ||•||2.

Theorem [Araki-Yamagami]. Let A, B ∈ Mn. Then

Araki

Finally, it just wouldn’t seem right to post from Italy without sharing a bit of it. Thus, I leave you with a taste of the highlight of the trip (excepting the mathematics, of course).

Building Up to the Result

In order to prove the result, one can proceed by proving a series of simple commutant-type matrix norm inequalities, which are interesting in their own right.

Lemma. Let A, B ∈ Mn be positive semi-definite and let X ∈ Mn be arbitrary. Then

Lemma 1

Lemma. Let A ∈ Mn be Hermitian. Then

Lemma 2

Lemma. Let A, B ∈ Mn be Hermitian. Then

Lemma 3

Finally, it just wouldn’t seem right to post from Italy without sharing a bit of it. Thus, I leave you with a taste of the highlight of the trip (excepting the mathematics, of course).

The colosseum as of seven hours ago

The colosseum as of eight hours ago

Related Links

References

  1. H. Araki and S. Yamagami, An inequality for the Hilbert-Schmidt norm, Commun. Math. Phys., 81 (1981) 89-98.

Unital Channel Eigenvalue Majorization

June 6th, 2009

I’ve decided that, starting today, I will once a month post a mathematical result that I find interesting and/or useful, but I feel sadly gets less attention than it deserves. I will try to present all relevant preliminaries along with the result to provide context, so hopefully the results and proofs will be accessible to someone at the upper undergraduate level.

Since I’m a quantum kinda guy, it seems natural that the first such lemma deals with quantum information science. In particular, it helps quantify the behaviour of unital quantum channels acting on density operators. Before delving into the result, let’s begin with…

Quantum Preliminaries

Given the complex matrix space Mn, a quantum channel E is defined to be a completely positive, trace-preserving map. That is, it is a map of the form

Choi-Kraus representationwhere {Aj} ∈ Mn is a family of matrices. Trace-preservation of E is equivalent to the requirement that

Trace-preservationIn many physical situations we are interested in unital quantum channels; that is, channels that satisfy E(I) = I. Such channels in general exhibit much nicer behaviour than arbitrary quantum channels, and this month’s lemma will show one particular instance of this fact.

The Hardy-Littlewood-Polya Theorem

The proof of the lemma relies on a classical result known as the Hardy-Littlewood-Polya Theorem. The result explains how doubly stochastic matrices act on vectors. Since it seems to be surprisingly difficult to find on Wikipedia and other popular (read: non-research) websites, I will state it here.

Theorem [Hardy-Littlewood-Polya]. Let x,y ∈ Cn be complex vectors. Then x majorizes y if and only if y = Dx for some doubly stochastic matrix D ∈ Mn.

It might be worth mentioning that the “if” direction of the proof is borderline trivial; the real meat and potatoes of the theorem is the “only if” direction.

The Lemma Itself

The lemma makes precise something that feels quite natural when thought of physically: a unital channel (that is, a completely positive, trace-preserving map E for which E(I) = I) can only increase the impurity (or “mixedness”) of quantum states. It has several simple consequences that are of great use when dealing with unital channels, and furthermore its proof makes excellent use of classical machinery. It was originally due to Uhlmann [1,2], but has recently appeared in [3]. The proof provided in the PDF attached at the end of this post is from the latter source.

Lemma [Unital Channel Eigenvalue Majorization]. Suppose ρ = E(σ) for a unital channel E. Then the ordered spectrum r of ρ is majorised by the ordered spectrum s of σ.

One particularly useful corollary of this lemma is presented here, and its proof is omitted (and dare I say left as an exercise for the reader?)

Corollary. If E is a unital quantum channel and ρ is a positive operator, then rank(E(ρ)) ≥ rank(ρ).

Related Links

References

  1. A. Uhlmann, Commun. Math. Phys. 54, 21 (1977).
  2. I. Bengtsson, and K. Zyczkowski, Geometry of quantum states, Cambridge University Press (2006).
  3. D. W. Kribs, R. W. Spekkens, Phys. Rev. A 74, 042329 (2006). arXiv:quant-ph/0608045v2