Gram-Schmidt Orthonormalization Process

This is a generalized form of the Gram-Schmidt algorithm, carried out over a real semisimple algebra equipped with a positive involution. The classical version is recovered when we substitute \(A=\mathbb{R}\).

Let \(A\) be a real semisimple algebra and let \((\ )^{*}: A\rightarrow A\) be a positive involution.

Orthogonality in \(A^k\)

Let \(k \ge 1\). Fix a positive definite symmetric element \(a \in A\). Let us define \[\begin{align} \langle \ , \ \rangle_{A} : A^{k} \times A^{k} & \rightarrow A \\ \langle x,y\rangle_{A} & = \sum_{i=1}^{k} x_{i} a y_{i}^*. \end{align}\]

This form is \(\mathbb{R}\)-bilinear but not necessarily \(A\)-bilinear, since \(A\) may not be commutative. We can, however, define a real positive definite symmetric bilinear form on \(A^{k}\) by \[\begin{align} \langle x,y\rangle_{\mathbb{R}} = \mathop{\mathrm{tr}}\left(\langle x, y\rangle_{A}\right). \end{align}\]

We assume that \(A^{k}\) carries this real inner product.
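As a sanity check, the two forms can be instantiated numerically. The sketch below is an assumption, not part of the text: it takes \(A = M_2(\mathbb{R})\) with involution \(b^{*} = b^{T}\) and \(\mathop{\mathrm{tr}}\) the matrix trace, which is a positive involution on \(A\).

```python
import numpy as np

# Illustrative sketch (assumed instantiation): A = M_2(R), b* = b^T,
# tr = matrix trace.  A vector in A^k is stored as an array of shape (k, n, n).
rng = np.random.default_rng(0)
n, k = 2, 3
s = rng.standard_normal((n, n))
a = s @ s.T + n * np.eye(n)            # a fixed symmetric positive definite element of A

def form_A(x, y):
    """<x, y>_A = sum_i x_i a y_i^*  (here * is matrix transpose)."""
    return sum(xi @ a @ yi.T for xi, yi in zip(x, y))

def form_R(x, y):
    """The real inner product <x, y>_R = tr(<x, y>_A)."""
    return float(np.trace(form_A(x, y)))

x = rng.standard_normal((k, n, n))
y = rng.standard_normal((k, n, n))

assert np.allclose(form_A(y, x), form_A(x, y).T)   # <y, x>_A = <x, y>_A^*
assert np.isclose(form_R(x, y), form_R(y, x))      # the real form is symmetric
assert form_R(x, x) > 0                            # and positive definite
```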

The following properties are satisfied by \(\langle \ , \ \rangle_{A}\): for all \(x, y \in A^{k}\) and \(\alpha_1, \alpha_2 \in A\),

- \(\langle \alpha_1 x, \alpha_2 y\rangle_{A} = \alpha_1 \langle x,y\rangle_{A} \alpha_2^{*}\),
- \(\langle y,x\rangle_{A} = \langle x,y\rangle_{A}^{*}\), so in particular \(\langle x,x\rangle_{A}\) is symmetric,
- whenever \(\langle x,x\rangle_{A}\) is invertible, \(\langle x,x\rangle_{A}^{-1} = b^{*}b\) for some \(b \in A\).

All of them are straightforward verifications. For the last one, we must find a \(b \in A\) such that \(\langle x,x\rangle_{A}^{-1} = b^{*}b\). This can be done using the proposition given in Cholesky decomposition.

We say two vectors in \(A^{k}\) are orthogonal if the product above between them is \(0\). We call a set of vectors \(\{ x_1,x_2,\cdots,x_m\} \subseteq A^{k}\) orthonormal if \(\langle x_i,x_{j}\rangle_{A} = \delta_{ij} 1_{A}\).

Note that if \(\langle x,y\rangle_{A}= 0\), then \(\langle \alpha_1x,\alpha_2y\rangle_{A}= \alpha_1 \langle x,y\rangle_{A}\, \alpha_2^{*} = 0\) for each \(\alpha_1,\alpha_2 \in A\) and \(x,y \in A^{k}\).
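Continuing the assumed numerical instantiation \(A = M_2(\mathbb{R})\), \(b^{*} = b^{T}\), the scaling identity \(\langle \alpha_1 x, \alpha_2 y\rangle_{A} = \alpha_1 \langle x,y\rangle_{A}\alpha_2^{*}\) can be checked directly:

```python
import numpy as np

# Assumed instantiation (not from the text): A = M_2(R), b* = b^T.
rng = np.random.default_rng(1)
n, k = 2, 3
s = rng.standard_normal((n, n))
a = s @ s.T + n * np.eye(n)            # symmetric positive definite a in A

def form_A(u, v):
    """<u, v>_A = sum_i u_i a v_i^*, vectors stored as shape (k, n, n)."""
    return sum(ui @ a @ vi.T for ui, vi in zip(u, v))

x = rng.standard_normal((k, n, n))
y = rng.standard_normal((k, n, n))
a1 = rng.standard_normal((n, n))
a2 = rng.standard_normal((n, n))

lhs = form_A(a1 @ x, a2 @ y)           # a1 @ x left-multiplies each component
rhs = a1 @ form_A(x, y) @ a2.T
assert np.allclose(lhs, rhs)           # <a1 x, a2 y>_A = a1 <x, y>_A a2^*
```

In particular, if \(\langle x,y\rangle_{A}=0\) the right-hand side vanishes for every \(a_1, a_2\), which is the orthogonality statement above.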

Suppose \(x_1,x_2,\cdots,x_m \in A^{k}\) is an orthonormal set of vectors. Then \(m \le k\) and \(x_1,x_2,\cdots,x_m\) are free under the left action of \(A\). If \(m=k\), then the vectors \(x_1,\cdots,x_k\) freely generate \(A^{k}\) as a left \(A\)-module.

Observe that if \(a_1,a_2,\cdots, a_m \in A\) are such that \[\begin{align} & a_1 x_1 + a_2 x_2+ \dots + a_m x_m = 0 , \end{align}\] then evaluating the product against \(x_i\) gives \[\begin{align} 0 = \langle a_1 x_1 + a_2 x_2+ \dots + a_m x_m ,x_i \rangle_{A} = a_i \langle x_i, x_i\rangle_{A} = a_i 1_{A} = a_i. \end{align}\] Hence, each \(a_i=0\). Using this, we get that \[\begin{align} A^{m} & \rightarrow A^{k} \\ (a_1,a_2,\ldots,a_m) & \mapsto a_1x_1 + \cdots + a_m x_m \end{align}\] is an injective \(\mathbb{R}\)-linear map. Hence, by dimensional constraints, \(m \le k\), and \(m=k\) implies that it is an isomorphism.


If we were to define \[\begin{align} \langle \ , \ \rangle_{A} : A^{k} \times A^{k} & \rightarrow A \\ \langle x,y\rangle & = \sum_{i=1}^{k} x_{i}^{*} y_{i}, \end{align}\] we would obtain the same proposition as above, but for right actions instead of left.

Suppose \(x_1,x_2,\cdots,x_k \in A^{k}\) is an orthonormal set of vectors. Then for any \(v = a_1 x_1+ a_2 x_2 + \dots + a_k x_k\) with \(\{ a_{i}\}_{i=1}^{k} \subseteq A\), we have \[\begin{align} \langle v,v\rangle_{\mathbb{R}} = \mathop{\mathrm{tr}}(\langle v,v\rangle_{A}) = \sum_{i=1}^{k} \mathop{\mathrm{tr}}(a_{i}a_{i}^{*}), \end{align}\] since \(\langle a_i x_i, a_j x_j\rangle_{A} = a_i \langle x_i,x_j\rangle_{A}\, a_j^{*} = \delta_{ij}\, a_i a_j^{*}\).
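This norm computation can be checked numerically in the assumed instantiation \(A = M_2(\mathbb{R})\), \(b^{*}=b^{T}\). An explicit orthonormal set there is \(x_i = L^{-1} e_i\), where \(a = LL^{T}\) is the Cholesky factorization and \(e_i\) is the standard basis of \(A^k\); this construction is an illustration, not from the text.

```python
import numpy as np

# Assumed instantiation: A = M_2(R), b* = b^T, a = L L^T symmetric positive definite.
rng = np.random.default_rng(2)
n, k = 2, 3
s = rng.standard_normal((n, n))
a = s @ s.T + n * np.eye(n)
Linv = np.linalg.inv(np.linalg.cholesky(a))

def form_A(u, v):
    return sum(ui @ a @ vi.T for ui, vi in zip(u, v))

# x_i has L^{-1} in slot i and 0 elsewhere, so <x_i, x_j>_A = L^{-1} a L^{-T} d_ij = d_ij I
xs = np.zeros((k, k, n, n))
for i in range(k):
    xs[i, i] = Linv

coeffs = rng.standard_normal((k, n, n))          # a_1, ..., a_k in A
v = sum(coeffs[i] @ xs[i] for i in range(k))     # v = sum_i a_i x_i, componentwise

lhs = np.trace(form_A(v, v))                     # <v, v>_R
rhs = sum(np.trace(ci @ ci.T) for ci in coeffs)  # sum_i tr(a_i a_i^*)
assert np.isclose(lhs, rhs)
```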

Gram-Schmidt algorithm

The following algorithm is an analogue of Gram-Schmidt.

Suppose \(v_1,v_2,\cdots,v_k \in A^{k}\) are vectors that freely generate \(A^{k}\) as a left \(A\)-module. We claim that, starting from these vectors, it is possible to produce an orthonormal set of vectors \(x_1,x_2,\cdots,x_k\).

First, we do the following. Define for \(u,v \in A^{k}\), \[\begin{align} \mathop{\mathrm{proj}}(u, v)= \begin{cases} {\langle v,u\rangle_{A}}\langle u,u\rangle_{A}^{-1} u & \text{ if }u \neq 0 \\ 0 & \text{if } u = 0 \end{cases}. \end{align}\] This has the property that \(\langle \mathop{\mathrm{proj}}(u,v),u\rangle_{A} = \langle v,u\rangle_{A}\), so \(v - \mathop{\mathrm{proj}}(u,v)\) is orthogonal to \(u\).
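A numerical sketch of the projection, again in the assumed instantiation \(A = M_2(\mathbb{R})\), \(b^{*}=b^{T}\). The coefficient is placed on the left as \(\langle v,u\rangle_{A}\langle u,u\rangle_{A}^{-1}\), so that the remainder \(v - \mathop{\mathrm{proj}}(u,v)\) is orthogonal to \(u\):

```python
import numpy as np

# Assumed instantiation (not from the text): A = M_2(R), b* = b^T.
rng = np.random.default_rng(3)
n, k = 2, 3
s = rng.standard_normal((n, n))
a = s @ s.T + n * np.eye(n)

def form_A(u, v):
    return sum(ui @ a @ vi.T for ui, vi in zip(u, v))

def proj(u, v):
    """proj(u, v) = <v, u>_A <u, u>_A^{-1} u, applied componentwise."""
    if not np.any(u):
        return np.zeros_like(u)
    c = form_A(v, u) @ np.linalg.inv(form_A(u, u))
    return c @ u                           # left-multiply each component of u by c

u = rng.standard_normal((k, n, n))
v = rng.standard_normal((k, n, n))

assert np.allclose(form_A(proj(u, v), u), form_A(v, u))  # <proj(u,v), u>_A = <v, u>_A
assert np.allclose(form_A(v - proj(u, v), u), 0)         # the remainder is orthogonal to u
```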

Generate vectors \(x_1',x_2',\cdots,x_k'\) as follows. \[\begin{align} x_1'& = {v_1}, \\ x_{2}'& = v_{2} - \mathop{\mathrm{proj}}(x_1',v_2), \\ x_{3}'& = v_{3} - \mathop{\mathrm{proj}}(x_1',v_3) - \mathop{\mathrm{proj}}(x_2',v_3),\\ x_{4}'& = v_{4} - \mathop{\mathrm{proj}}(x_1',v_4) - \mathop{\mathrm{proj}}(x_2',v_4) - \mathop{\mathrm{proj}}(x_3',v_4),\\ & \ \ \vdots \\ x_{i}'& = v_{i} - \sum_{j=1}^{i-1} \mathop{\mathrm{proj}}(x_j',v_i). \end{align}\]

We can prove that \(\langle x_i',x_j'\rangle_{A} =0\) for \(i > j\) by induction, ordering the pairs \((i,j)\) as \((2,1),(3,1),\cdots,(k,1),(3,2),(4,2),\cdots,(k,2),(4,3),\cdots\). Since \(\langle x_j',x_i'\rangle_{A} = \langle x_i',x_j'\rangle_{A}^{*}\), this gives orthogonality for \(i < j\) as well. Now choose \(b_i\) such that \(x_i = b_i x'_i\) has \(\langle x_i,x_i\rangle_{A} = 1_{A}\); since \(b_i \langle x_i',x_i'\rangle_{A} b_i^{*} = 1_{A}\) amounts to \(\langle x_i',x_i'\rangle_{A}^{-1} = b_i^{*}b_i\), such a \(b_i\) is supplied by the Cholesky decomposition. Hence, we are done.
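The whole algorithm can be sketched in the assumed instantiation \(A = M_2(\mathbb{R})\), \(b^{*}=b^{T}\), where \(\langle x',x'\rangle_{A}\) is a symmetric positive definite matrix and the normalizer \(b\) with \(b\langle x',x'\rangle_{A} b^{*} = 1_{A}\) comes from a matrix Cholesky factorization:

```python
import numpy as np

# Assumed instantiation (not from the text): A = M_2(R), b* = b^T.
rng = np.random.default_rng(4)
n, k = 2, 3
s = rng.standard_normal((n, n))
a = s @ s.T + n * np.eye(n)

def form_A(u, v):
    return sum(ui @ a @ vi.T for ui, vi in zip(u, v))

def proj(u, v):
    """proj(u, v) = <v, u>_A <u, u>_A^{-1} u, applied componentwise."""
    if not np.any(u):
        return np.zeros_like(u)
    return (form_A(v, u) @ np.linalg.inv(form_A(u, u))) @ u

vs = [rng.standard_normal((k, n, n)) for _ in range(k)]  # generic, hence free

xs = []
for v in vs:
    xp = v - sum(proj(x, v) for x in xs)       # x_i' = v_i - sum_{j<i} proj(x_j', v_i)
    g = form_A(xp, xp)                         # symmetric positive definite
    b = np.linalg.inv(np.linalg.cholesky(g))   # b g b^T = I, i.e. g^{-1} = b^T b
    xs.append(b @ xp)                          # x_i = b_i x_i'

# orthonormality: <x_i, x_j>_A = delta_ij 1_A
for i in range(k):
    for j in range(k):
        target = np.eye(n) if i == j else np.zeros((n, n))
        assert np.allclose(form_A(xs[i], xs[j]), target)
```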

Given a real semisimple algebra \(A\) with a positive involution \((\ )^{*}\), we call the above method of generating \(x_1,x_2,\cdots,x_m\) from vectors \(v_1,v_2,\cdots,v_m\) the Gram-Schmidt algorithm. If \(\{ v_i\}_{i=1}^{m}\) are free under left \(A\) multiplication, then so are \(\{ x_i\}_{i=1}^{m}\).

Cholesky decomposition

This generalized Gram-Schmidt is related to the Cholesky decomposition in the same way the usual Gram-Schmidt is. See the Wikipedia page.

Use in Minkowski’s theorem

This Gram-Schmidt is used in the division algebra variant of Minkowski’s theorem about successive minima.


This page was updated on October 19, 2021.