Cholesky Decomposition

To understand the setup here, refer to the page on real semisimple algebras.

Let \(A\) be a semisimple \(\mathbb{R}\)-algebra with a positive involution \((\ )^*\). The algebra \(M_k(A)\) is also a semisimple \(\mathbb{R}\)-algebra, and the involution \((\ )^{*}\) extends to \(M_k(A)\) via the mapping \([a_{ij}] \mapsto [a_{ji}^{*}]\). We will denote this extended involution by the same notation \((\ )^{*}\). With this, the meaning of positive definite and symmetric matrices in \(M_k(A)\) is unambiguous. For clarity, we will distinguish between the norms and traces of \(A\) and \(M_k(A)\) by using the notations \(\mathop{\mathrm{tr}}_{A}, \mathop{\mathrm{N}}_{A}, \mathop{\mathrm{tr}}_{M_k(A)}, \mathop{\mathrm{N}}_{M_k(A)}\) whenever appropriate.

For any \(a \in M_k(A)\), we can define a bilinear form \[\beta_{a}:A^{k} \times A^{k} \rightarrow \mathbb{R}\] by \(\beta_{a}(x,y) = \sum_{i,j=1}^{k} \mathop{\mathrm{tr}}_{A}( x_{i}^{*} a_{ij} y_{j})\). The following lemma then confirms that the conventional intuition of positive definiteness agrees with the definition above.

An element \(a \in M_k(A)\) is positive definite if and only if the associated quadratic form \(x \mapsto \beta_{a}(x,x)\) is positive definite on \(A^{k}\) viewed as an \(\mathbb{R}\)-vector space.
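As a concrete illustration of the lemma (our example, not part of the original argument), take \(A = M_n(\mathbb{R})\) with \(\mathop{\mathrm{tr}}_{A}\) the ordinary matrix trace and the involution given by transpose, so elements of \(M_k(A)\) are block matrices. The sketch below, assuming numpy, computes the Gram matrix of \(\beta_a\) in the standard basis of \(A^{k} \cong \mathbb{R}^{kn^2}\); positive definiteness of \(a\) can then be read off from the eigenvalues of that Gram matrix. The function name `gram_of_beta` is ours.

```python
import numpy as np

def gram_of_beta(A, k, n):
    """Gram matrix of beta_a on A^k = R^(k*n*n) for A = M_n(R).
    A is the (k*n) x (k*n) real matrix of blocks a_ij; tr_A is the
    matrix trace and the involution is the transpose.
    Basis vectors: a single 1 at (slot i, row p, col q)."""
    dim = k * n * n
    # each standard basis vector of R^dim, reshaped as a k-tuple of n x n matrices
    basis = np.eye(dim).reshape(dim, k, n, n)
    G = np.zeros((dim, dim))
    for u in range(dim):
        for v in range(dim):
            x, y = basis[u], basis[v]
            # beta_a(x, y) = sum_{i,j} tr(x_i^T a_ij y_j)
            G[u, v] = sum(
                np.trace(x[i].T @ A[i*n:(i+1)*n, j*n:(j+1)*n] @ y[j])
                for i in range(k) for j in range(k))
    return G
```

Checking the eigenvalues of `gram_of_beta(A, k, n)` then agrees with the usual notion: a symmetric positive definite real matrix \(A\) yields a positive definite Gram matrix, while an indefinite symmetric one does not.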

This lemma leads to the following decomposition for quadratic forms \(\beta_{a}\) induced by symmetric positive definite matrices \(a\). What the upcoming theorem is really going to tell us is that the quadratic form \(\beta_{a}\) can be diagonalized up to a triangular change of basis.

When \(A =\mathbb{R}\), this is simply the Cholesky decomposition of real symmetric positive definite matrices. In [1], the theorem below is referred to as the Babylonian reduction theorem, perhaps because it is spiritually similar to completing the square in a quadratic equation of one variable.

Let \(a \in M_k(A)\) be a symmetric positive definite matrix. Then there is an upper triangular matrix \(t \in M_k(A)\) with \(1_{A}\) on the diagonal entries, and a diagonal matrix \(d\) with symmetric positive definite elements of \(A\) on the diagonal such that \[\begin{align} a = t^{*} d t. \end{align}\] That is, writing explicitly in terms of \(A\)-valued matrix entries, we can find \(d,t \in M_{k}(A)\) such that \[\begin{align} \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1k} \\ a_{21} & a_{22} & \dots & a_{2k} \\ \vdots & & \ddots & \\ a_{k1} & a_{k2} & \dots & a_{kk} \end{bmatrix} = & \begin{bmatrix} 1_A & & & \\ t_{12}^{*} & 1_A & & \\ \vdots & & \ddots & \\ t_{1k}^{*} & t_{2k}^{*} & \dots & 1_A \end{bmatrix} \begin{bmatrix} d_{11} & & & \\ & d_{22} & & \\ & & \ddots & \\ & & & d_{kk} \end{bmatrix} \begin{bmatrix} 1_A & t_{12} & \dots & t_{1k} \\ & 1_A & & t_{2k} \\ & & \ddots & \\ & & & 1_A \end{bmatrix}\\ = & \label{eq:decomp} \begin{bmatrix} d_{11} & d_{11}t_{12} & \dots & d_{11} t_{1k} \\ t_{12}^{*} d_{11} & t_{12}^{*}d_{11} t_{12} + d_{22} & & t_{12}^{*} d_{11} t_{1k} + d_{22}t_{2k} \\ \vdots & & \ddots & \\ t_{1k}^{*} d_{11} & & & \sum_{i=1}^{k} t_{ik}^{*} d_{ii} t_{ik} \end{bmatrix}.\\ \end{align}\]

In symbols, this is what we want: \[\begin{align} a_{ij} = \sum_{r = 1}^{ \min(i,j)} t^{*}_{ri} d_{rr} t_{rj}, \end{align}\] where the diagonal entries \(t_{ii}=1_{A}\). Note that by \(t^{*}_{ri}\), we mean \((t_{ri})^{*} = (t^{*})_{ir}\).

This implies that for \(i \le j\), we get \[\begin{align} \label{eq:dii}d_{ii} =& a_{ii} - \sum_{j = 1}^{ i-1} t_{ji}^{*} d_{jj} t_{ji},\\ \label{eq:aij}t_{ij} =& d_{ii}^{-1} \left( a_{ij} - \sum_{r = 1}^{ i-1 } t_{r i}^{*} d_{rr} t_{rj} \right) . \end{align}\] These two equations can be used to inductively generate the matrix entries \(d_{ii}\) and \(t_{ij}\) by calculating them row-wise from top to bottom. But how do we know that \(d_{ii}^{-1}\) will exist at every stage of the induction? Let us show this by induction on \(i\). For \(i=1\), it is clear that \(a_{11} = d_{11}\), and \(a_{11}\) is positive definite since for \(x \in A\), the quantity \(\mathop{\mathrm{tr}}_{A}(x^{*} a_{11} x ) = \beta_{a}\big((x,0,\dots,0),(x,0,\dots,0)\big)\) is non-negative and zero only when \(x=0\).
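The two recursions above can be run directly. A minimal sketch, again taking \(A = M_n(\mathbb{R})\) with transpose as the involution, so that multiplying by \(d_{ii}^{-1}\) becomes a linear solve; numpy is assumed and the function name `block_cholesky` is ours.

```python
import numpy as np

def block_cholesky(a, k, n):
    """Cholesky over A = M_n(R): a is a k x k nested list of n x n blocks,
    symmetric positive definite as a (k*n) x (k*n) real matrix.
    Returns (d, t) with a = t* d t, t unit upper triangular (as blocks),
    d a list of diagonal blocks. The involution on A is the transpose."""
    d = [None] * k
    t = [[np.eye(n) if i == j else np.zeros((n, n)) for j in range(k)]
         for i in range(k)]
    for i in range(k):
        # d_ii = a_ii - sum_{j<i} t_ji^* d_jj t_ji
        d[i] = a[i][i] - sum(t[j][i].T @ d[j] @ t[j][i] for j in range(i))
        for jc in range(i + 1, k):
            # t_ij = d_ii^{-1} (a_ij - sum_{r<i} t_ri^* d_rr t_rj)
            rhs = a[i][jc] - sum(t[r][i].T @ d[r] @ t[r][jc] for r in range(i))
            t[i][jc] = np.linalg.solve(d[i], rhs)
    return d, t
```

Reassembling \(t^{*} d t\) from the returned blocks recovers the input matrix, and each \(d_{ii}\) comes out symmetric positive definite, as the proof predicts.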

For the general case, note that any upper triangular matrix \(t'\) with \(1_A\) on the diagonal entries admits an inverse in \(M_{k'}(A)\). The entries of \(t'^{-1}\) are certain non-commutative polynomials in the entries of \(t'\), which one can find inductively via "forward substitution".
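For instance, over \(A = M_n(\mathbb{R})\) the inverse of a unit upper triangular block matrix can be computed entrywise by exactly this substitution, working outward from the diagonal; each computed entry is a polynomial in the blocks of \(t'\), with no division needed. A sketch assuming numpy, with our function name `unit_upper_inverse`:

```python
import numpy as np

def unit_upper_inverse(t, k, n):
    """Invert a unit upper triangular k x k matrix of n x n blocks over
    A = M_n(R). The entries of the inverse are (non-commutative)
    polynomials in the entries of t, found by substitution."""
    s = [[np.eye(n) if i == j else np.zeros((n, n)) for j in range(k)]
         for i in range(k)]
    for dist in range(1, k):          # distance from the main diagonal
        for i in range(k - dist):
            j = i + dist
            # (t s)_{ij} = 0  =>  s_ij = -sum_{r=i+1}^{j} t_ir s_rj
            s[i][j] = -sum(t[i][r] @ s[r][j] for r in range(i + 1, j + 1))
    return s
```

Note that only entries \(s_{rj}\) closer to the diagonal are used when computing \(s_{ij}\), which is why the outer loop runs over the distance \(j-i\).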

With this, we can conclude the following: whenever \(a' \in M_{k'}(A)\) is symmetric and positive definite, and a diagonal matrix \(d'\) and a unit upper triangular matrix \(t'\) exist so that \({a'} = t'^{*} d' t'\), the diagonal entries of \(d' = {t'^{*}}^{-1} a' t'^{-1}\) are automatically symmetric and positive definite. This follows from \(d'\) itself being symmetric and positive definite. The symmetry of \(d'\) is a straightforward computation, and to see positive definiteness, observe that for any \(x,y \in M_{k'}(A)\), we get \(\mathop{\mathrm{tr}}_{M_{k'}(A)}( x ^{*} d' y) = \mathop{\mathrm{tr}}_{M_{k'}(A)}( ( t'^{-1} x )^{*} a' (t'^{-1} y ))\). Hence in particular, the diagonal entries of \(d'\) are invertible in \(A\). Note that this claim is valid for all \(k' \le k\), and it will now aid us in the induction.

Suppose that \(\{ d_{ii}\}_{i=1}^{k'}\) are positive definite. Then we can compute \(d_{(k'+1)(k'+1)}\) and \(\{ t_{(k'+1)j}\}_{ j \ge k'+1 }\). Note that this gives us a solution to the statement of the theorem with \(k\) replaced by \(k'+1\). Hence each of \(\{d_{ii}\}_{i=1}^{k'+1}\) is symmetric and positive definite, and in particular \(d_{(k'+1)(k'+1)}\) is invertible.

The decomposition above is unique, because the elements \(d_{ii}\) and \(t_{ij}\) are completely determined by the two recursions above.

It is possible to view \(A^{k}\) as a \((k \dim_{\mathbb{R}}A)\)-dimensional vector space over \(\mathbb{R}\), and all the matrices in \(M_k(A)\) can be seen as block matrices with each entry \(a_{ij}\) replaced by its left-multiplication matrix as an element of \(A\). From this point of view, the theorem above is the same thing as the block matrix variant of the Cholesky decomposition.
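To make this remark concrete with a small example of ours: take \(A = \mathbb{C}\), a 2-dimensional semisimple \(\mathbb{R}\)-algebra whose positive involution is complex conjugation. The left-multiplication matrix of \(x + iy\) is \(\begin{bmatrix} x & -y \\ y & x \end{bmatrix}\), and replacing each entry of a Hermitian positive definite complex matrix by this block produces a real symmetric positive definite matrix whose factorizations mirror the complex ones. A sketch assuming numpy:

```python
import numpy as np

def realify(a):
    """Replace each complex entry of the k x k matrix a by its 2 x 2
    left-multiplication matrix over R: x + iy -> [[x, -y], [y, x]].
    This is a faithful R-algebra representation of M_k(C) in M_2k(R),
    carrying the involution (conjugate transpose) to the transpose."""
    k = a.shape[0]
    out = np.zeros((2 * k, 2 * k))
    for i in range(k):
        for j in range(k):
            x, y = a[i, j].real, a[i, j].imag
            out[2*i:2*i+2, 2*j:2*j+2] = [[x, -y], [y, x]]
    return out
```

In particular, if \(a = L L^{*}\) is the complex Cholesky factorization, then `realify(a)` equals `realify(L) @ realify(L).T`, the corresponding real block factorization.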


A further improvement is possible here, using the proposition below.

Suppose that \(a \in A\) is a positive definite symmetric element. Then, there exists another positive definite symmetric element \(b \in A\) such that \(b^{2} = a\).

Just like in the proof of the Norm-Trace inequality, using the spectral theorem for positive definite matrices, we can construct a basis \(\{ e_1, e_2,\dots, e_{d}\}\) of \(A\), orthonormal with respect to \(x \mapsto \mathop{\mathrm{tr}}(x^{*}x)\), such that the matrix \(a_{ij} = \mathop{\mathrm{tr}}(e_{i}^{*} a e_{j})\) is diagonal. The diagonal entries \(a_{ii}\) are then non-zero and positive.

Note that the map \(a \mapsto [a_{ij}]\) is a faithful representation, since the matrix \([a_{ij}]\) is just the left-multiplication matrix of \(a\) with respect to the basis \(\{ e_{i}\}_{i=1}^{d}\). Moreover, in general \(b \in A\) is positive definite if and only if the matrix \(b_{ij}= \mathop{\mathrm{tr}}(e_i^{*} b e_{j})\) is positive definite, and symmetric if and only if \(b_{ij}=(b^{*})_{ij} = b_{ji}^{*}\). In terms of this matrix representation, finding \(b \in A\) such that \(b^{2} = a\) is the same as showing that the diagonal matrix with diagonal entries \(\sqrt{a_{ii}}\) lies in the image of this representation.

To see this, note that there exists a polynomial \(f \in \mathbb{R}[X]\) such that \(f(a_{ii}) = \sqrt{a_{ii}}\) for every \(i \in \{ 1,2,\dots,d\}\), for instance a Lagrange interpolation polynomial. Then \(b = f(a) \in A\) satisfies all the requirements.
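For \(A = M_n(\mathbb{R})\), this polynomial construction can be carried out numerically: interpolate \(x \mapsto \sqrt{x}\) at the eigenvalues of \(a\) (assumed distinct here, so a single interpolating polynomial suffices) and evaluate the polynomial at \(a\) using matrix products. A sketch assuming numpy; `sqrt_via_polynomial` is our name:

```python
import numpy as np

def sqrt_via_polynomial(a):
    """Return b = f(a), where f interpolates sqrt at the eigenvalues of
    the symmetric positive definite matrix a (eigenvalues assumed
    distinct). Then b is symmetric positive definite with b @ b = a."""
    eigvals = np.linalg.eigvalsh(a)
    # degree d-1 fit through all d eigenvalues: exact Lagrange interpolation
    coeffs = np.polyfit(eigvals, np.sqrt(eigvals), len(eigvals) - 1)
    # evaluate f(a) by Horner's rule with matrix multiplication
    # (np.polyfit returns coefficients with the highest degree first)
    b = np.zeros_like(a)
    for c in coeffs:
        b = b @ a + c * np.eye(a.shape[0])
    return b
```

Since \(b = f(a)\) is a polynomial in \(a\), it automatically lies in any subalgebra containing \(a\), which is the point of the argument above.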

For every positive definite symmetric element \(a \in A\), \(a = b^{*}b\) for some positive definite symmetric \(b \in A\).

Every symmetric positive definite element \(a \in M_k(A)\) can be written in the form \[\begin{align} a = t^{*} b^{*} b t = p^{*}p, \end{align}\] where \(t \in M_k(A)\) is upper triangular with \(1_{A}\) on the diagonal, \(b \in M_k(A)\) is diagonal, and \(p \in M_k(A)\) is upper triangular.

Use the Cholesky decomposition to write \(a = t^{*}dt\). Then each diagonal entry of \(d\) can be split as \(d_{ii} = b_{ii}^{*}b_{ii}\) by the previous corollary. Taking \(b\) to be the diagonal matrix with entries \(b_{ii}\) and setting \(p = bt\) gives the result.
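Putting the pieces together over \(A = M_n(\mathbb{R})\): the recursions give \(a = t^{*} d t\), the spectral theorem gives symmetric square roots \(b_{ii}\) of the \(d_{ii}\), and \(p = bt\) is block upper triangular with \(a = p^{*}p\). A self-contained sketch assuming numpy (it repeats the recursion so it can stand alone; function names are ours):

```python
import numpy as np

def sym_sqrt(d):
    """Symmetric positive definite square root via the spectral theorem."""
    w, v = np.linalg.eigh(d)
    return v @ np.diag(np.sqrt(w)) @ v.T

def cholesky_pp(A, k, n):
    """Factor a symmetric positive definite (k*n) x (k*n) real matrix,
    viewed as an element of M_k(M_n(R)), as p^T p with p block upper
    triangular: first a = t* d t, then d_ii = b_ii^2 and p = b t."""
    a = [[A[i*n:(i+1)*n, j*n:(j+1)*n] for j in range(k)] for i in range(k)]
    d = [None] * k
    t = [[np.eye(n) if i == j else np.zeros((n, n)) for j in range(k)]
         for i in range(k)]
    for i in range(k):
        # d_ii = a_ii - sum_{j<i} t_ji^* d_jj t_ji
        d[i] = a[i][i] - sum(t[j][i].T @ d[j] @ t[j][i] for j in range(i))
        for jc in range(i + 1, k):
            # t_ij = d_ii^{-1} (a_ij - sum_{r<i} t_ri^* d_rr t_rj)
            rhs = a[i][jc] - sum(t[r][i].T @ d[r] @ t[r][jc] for r in range(i))
            t[i][jc] = np.linalg.solve(d[i], rhs)
    b = [sym_sqrt(di) for di in d]                        # d_ii = b_ii^* b_ii
    p = [[b[i] @ t[i][j] for j in range(k)] for i in range(k)]   # p = b t
    return np.block(p)
```

Since \(b\) is block diagonal, \((bt)_{ij} = b_{ii} t_{ij}\), so \(p\) inherits the upper triangular block pattern of \(t\).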

References

1. A. Weil, Discontinuous subgroups of classical groups: lectures (University of Chicago, 1958).

This page was updated on October 11, 2021.