Noncommutative probability I: motivations and examples

In many areas of mathematics, one can assign a commutative algebraic structure to the “space” under study. For example, instead of studying an (affine) algebraic variety, one can study the algebraic functions on that variety, which form a commutative ring. Similarly, instead of studying a topological space (with a sufficiently “nice” topology) directly, one can study the space of complex-valued continuous functions on it, which is a commutative C*-algebra. These algebraic structures have noncommutative analogues (namely noncommutative rings and noncommutative C*-algebras), which might correspond to “noncommutative varieties” or “noncommutative topological spaces”. This is probably one of the motivations for the area of noncommutative geometry.

This abstraction process also applies to probability spaces, because there is a very natural algebra on a probability space. Given a probability space (\Omega, \mathcal{F}, \mathbf{P}) , one can define the space of (essentially) bounded random variables L^\infty(\Omega) , and a linear functional \mathbf{E} on this space, namely the expected value. So we may study the algebra L^\infty(\Omega) instead of the probability space itself.

In fact, in probability theory it is very natural to study only the random variables and not the probability space itself. Usually (but not always!), the sample space plays no important role; what matters are the random variables and their associated \sigma -algebras.

This leads to the following definition/abstraction.

Definition. A (noncommutative) probability space is a pair (\mathcal{A}, \tau) , where \mathcal{A} is a complex algebra with a unit, and \tau:\mathcal{A}\to\mathbb{C} is a linear map such that \tau(1)=1 .

The requirement \tau(1)=1 can be seen as an abstraction of the fact that the sample space has probability 1. To see that this is not a meaningless generalization, we can look at the following examples.


  1. As we have seen, for a given probability space (\Omega, \mathcal{F}, \mathbf{P}), (L^\infty(\Omega), \mathbf{E}) is a noncommutative probability space.
  2. In many cases, we also care about random variables that are not bounded, for instance Gaussian random variables. So it might be more natural to study \mathcal{A} = \bigcap_{1\leq p<\infty} L^p(\Omega) with \tau = \mathbf{E}.
  3. \mathcal{A}=M_n(\mathbb{C}) with \tau(X) = \frac{1}{n}\textnormal{tr}(X).
  4. \mathcal{A} = M_n(\mathbb{C}). Fix a unit vector v\in \mathbb{C}^n (that is, \|v\|_2 = 1) and define \tau(X) = v^*Xv. Then (\mathcal{A},\tau) is also a probability space.
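
To make examples 3 and 4 concrete, here is a minimal numerical sketch, assuming NumPy is available. The helper names normalized_trace and vector_state and the matrices A, B are illustrative choices, not part of any standard API.

```python
import numpy as np

def normalized_trace(X):
    """Example 3: tau(X) = tr(X) / n."""
    return np.trace(X) / X.shape[0]

def vector_state(v):
    """Example 4: return the functional tau(X) = v^* X v for a unit vector v."""
    v = np.asarray(v, dtype=complex)
    v = v / np.linalg.norm(v)  # normalize so that tau(1) = 1
    return lambda X: np.vdot(v, X @ v)  # vdot conjugates its first argument

n = 2
one = np.eye(n, dtype=complex)
A = np.array([[0, 1], [0, 0]], dtype=complex)
B = np.array([[0, 0], [1, 0]], dtype=complex)
tau_v = vector_state([1, 0])

print(normalized_trace(one), tau_v(one))      # both functionals send the unit to 1
print(np.allclose(A @ B, B @ A))              # False: the algebra is noncommutative
print(normalized_trace(A @ B), tau_v(A @ B))  # the two functionals differ on A B
```

Both functionals are linear and normalized, so each one turns M_n(\mathbb{C}) into a probability space in the sense of the definition, even though they assign different values to the same element.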

In classical probability, we care about the distribution of a random variable, which is the pushforward of the probability measure to \mathbb{R} by the random variable. We don’t have a natural measure in the noncommutative context, but we can still define the distribution of a random variable using moments.

Definition. Let (\mathcal{A},\tau) be a probability space. We say a\in\mathcal{A} has a distribution \mu, where \mu is a probability measure on \mathbb{R} , if \tau(a^k) = \int_{-\infty}^{\infty} t^k\;\textnormal{d}\mu(t) for all k\in\mathbb{N}.

Note that even in the classical case, \mu may not be uniquely determined by the moments, so it might actually be better to view a distribution as a sequence of moments instead of a probability measure in this case.

Example. Consider (M_n(\mathbb{C}), \tau_n), where \tau_n(T) = \frac{1}{n}\textnormal{tr}(T). Let T\in M_n(\mathbb{C}) be self-adjoint. Then there exist an orthonormal basis e_1,\ldots, e_n in \mathbb{C}^n and \lambda_1,\ldots, \lambda_n\in \mathbb{R} such that Te_j = \lambda_je_j, j=1,\ldots,n. We have

\displaystyle \tau_n(T^k) = \frac{1}{n}\sum_{j=1}^n (T^k)_{j,j} = \frac{1}{n}\sum_{j=1}^n \lambda_j^k = \int_{-\infty}^\infty \lambda^k\;\textnormal{d}\mu(\lambda),

where \mu = \frac{1}{n} \sum_{j=1}^n \delta_{\lambda_j}. That is, T has distribution \mu. We call \mu the empirical eigenvalue distribution of T.
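
As a quick sanity check of this example, the following sketch compares the moments \tau_n(T^k) with the moments of the empirical eigenvalue distribution. It assumes NumPy; the matrix T and the function names are hypothetical choices for illustration.

```python
import numpy as np

def tau_n(T):
    """Normalized trace tau_n(T) = tr(T) / n."""
    return np.trace(T).real / T.shape[0]

def spectral_moment(T, k):
    """k-th moment of mu = (1/n) sum_j delta_{lambda_j} for self-adjoint T."""
    eigenvalues = np.linalg.eigvalsh(T)  # real eigenvalues of a Hermitian matrix
    return np.mean(eigenvalues ** k)

T = np.array([[2.0, 1.0], [1.0, 2.0]])  # self-adjoint, eigenvalues 1 and 3
moments_trace = [tau_n(np.linalg.matrix_power(T, k)) for k in range(5)]
moments_spectral = [spectral_moment(T, k) for k in range(5)]

print(moments_trace)
print(moments_spectral)
```

For this T the two lists agree up to rounding, as the computation above predicts: both are the moments of \frac{1}{2}(\delta_1 + \delta_3).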

We can construct the space of random matrices as follows. Given a probability space (\Omega,\mathcal{F},\mathbf{P}) and n\in\mathbb{N}, we define

\displaystyle \mathcal{A} = M_n\left(\bigcap_{1\leq p <\infty} L^p(\Omega)\right).

For X = [x_{i,j}] \in \mathcal{A}, the entries x_{i,j}:\Omega\to\mathbb{C} are random variables with \mathbf{E}|x_{i,j}|^k<\infty for all k\in\mathbb{N}. Define \tau:\mathcal{A}\to \mathbb{C} by \tau(X) = \mathbf{E}(\tau_n(X)) = \frac{1}{n}\sum_{j=1}^n \int_{\Omega} x_{j,j} \;\textnormal{d}\mathbf{P}. Then (\mathcal{A},\tau) is a probability space.

Suppose that X(\omega) = X(\omega)^* for all \omega. Then

\displaystyle \tau(X^k) = \mathbf{E} (\tau_n(X(\omega)^k)) = \mathbf{E}\left(\int_{-\infty}^\infty \lambda^k \;\textnormal{d}\mu_\omega(\lambda)\right),

where \mu_\omega is the empirical eigenvalue distribution of X(\omega). We can see that a distribution of X is given by “\mathbf{E} \mu_\omega”.
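
The averaged measure “\mathbf{E} \mu_\omega” can be computed exactly when the sample space is finite. The sketch below assumes a hypothetical two-point sample space (a fair coin) and two arbitrarily chosen self-adjoint matrices; none of this data comes from the text above, it only illustrates the formula.

```python
import numpy as np

# Hypothetical sample space: a fair coin, with X(omega) one of two self-adjoint matrices.
outcomes = [
    (0.5, np.array([[1.0, 0.0], [0.0, 3.0]])),
    (0.5, np.array([[0.0, 1.0], [1.0, 0.0]])),
]

def tau_power(k):
    """tau(X^k) = E[tau_n(X(omega)^k)]."""
    return sum(p * np.trace(np.linalg.matrix_power(X, k)) / X.shape[0]
               for p, X in outcomes)

def averaged_measure():
    """E mu_omega as a dict {atom: mass}."""
    masses = {}
    for p, X in outcomes:
        for lam in np.linalg.eigvalsh(X):
            atom = round(float(lam), 9)  # merge eigenvalues that agree up to rounding
            masses[atom] = masses.get(atom, 0.0) + p / X.shape[0]
    return masses

mu = averaged_measure()
moments_tau = [tau_power(k) for k in range(5)]
moments_mu = [sum(mass * atom ** k for atom, mass in mu.items()) for k in range(5)]

print(mu)
print(moments_tau)
print(moments_mu)
```

Here the averaged measure puts mass 1/4 at -1, mass 1/2 at 1 and mass 1/4 at 3, and its moments match \tau(X^k).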

We will now consider another example of a probability space. Let G be a group and let \mathbb{C}[G] be the group algebra of G. One can define a norm \|x\|_1 = \sum_{g\in G} |x_g| and consider the completion of \mathbb{C}[G] under this norm:

\displaystyle \ell^1(G) = \left\{\sum_{g\in G} x_gg: x_g\in\mathbb{C}, \sum_{g\in G} |x_g| <\infty\right\}.

One can show that for all x,y\in \ell^1(G), \|xy\|_1 \leq \|x\|_1\|y\|_1. Note that the unit e of G is also a multiplicative unit in \mathbb{C}[G] and \ell^1(G). We define \tau:\mathbb{C}[G]\to \mathbb{C} (or \tau:\ell^1(G)\to \mathbb{C}) by

\displaystyle\tau\left(\sum_{g\in G} x_gg\right) = x_e.

Consider a particular case, where G is isomorphic to \mathbb{Z}. Write G = \{g^n: n\in \mathbb{Z}\}. Then one can associate \sum_{n\in \mathbb{Z}} \alpha_n g^n \in \ell^1(G) with a function (in fact, a Fourier series) f(t) = \sum_{n\in \mathbb{Z}} \alpha_ne^{2\pi int} on \mathbb{T} = \mathbb{R}/\mathbb{Z}, the dual group of \mathbb{Z}. The expected value \tau(\sum_{n\in \mathbb{Z}} \alpha_n g^n ) = \alpha_0 can then be computed as \int_0^1 f(t)\;\textnormal{d}t.
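
Here is a small numerical illustration of this identification, assuming NumPy. The coefficients alpha are a made-up finitely supported element of \mathbb{C}[\mathbb{Z}]; for such a trigonometric polynomial, the average over sufficiently many equally spaced points reproduces the integral over [0,1] up to rounding.

```python
import numpy as np

# Hypothetical element sum_n alpha_n g^n of C[Z], stored as {n: alpha_n}.
alpha = {-2: 0.5j, 0: 0.25, 1: -1.0, 3: 2.0}

def f(t):
    """The associated Fourier series f(t) = sum_n alpha_n exp(2 pi i n t)."""
    return sum(c * np.exp(2j * np.pi * n * t) for n, c in alpha.items())

num_points = 16  # more points than the largest |n| appearing in alpha
t = np.arange(num_points) / num_points
integral = np.mean(f(t))         # approximates the integral of f over [0, 1]
tau_value = alpha.get(0, 0.0)    # tau picks out the coefficient of the identity

print(integral, tau_value)
```

The two printed values agree up to floating-point error, since every nonconstant exponential averages to zero.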

Let’s see an application, namely random walks on a group G . Let g_1,g_2,\ldots, g_k\in G. Define a sequence of random variables X_n:\Omega\to G by X_0 = e, and

\displaystyle \mathbf{P}(X_{n+1} = X_ng_j) = \frac{1}{2k},\quad \mathbf{P}(X_{n+1} = X_ng_j^{-1}) = \frac{1}{2k}

for all j=1,2,\ldots,k. Of interest are the numbers p_n = \mathbf{P}(X_n = e), the return probabilities.

Define a\in\mathbb{C}[G] by a = \frac{1}{2k} \sum_{j=1}^k (g_j + g_j^{-1}). Then we have p_n = \tau(a^n) for all n=0,1,2,\ldots, because if we write S = \{g_1,\ldots, g_k, g_1^{-1},\ldots, g_k^{-1}\}, then

\displaystyle a^n = \left(\frac{1}{2k}\right)^n \sum_{h_1,\ldots,h_n\in S} h_1h_2\cdots h_n,

and hence

\displaystyle \tau(a^n) = \left(\frac{1}{2k}\right)^n \#\{(h_1,\ldots,h_n): h_j\in S, h_1h_2\cdots h_n = e\}.

It is not difficult to see that the right-hand side is exactly p_n: each of the (2k)^n possible sequences of steps (h_1,\ldots,h_n) has probability (1/2k)^n, and X_n = e exactly when h_1h_2\cdots h_n = e.
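
The identity p_n = \tau(a^n) also gives a direct way to compute return probabilities. The following is a minimal sketch, using exact rational arithmetic from the Python standard library; the dictionary representation of \mathbb{C}[G] and the function names are assumptions made for illustration. The group is passed in through its multiplication, inverse and identity, and the usage example takes G = \mathbb{Z} with one generator.

```python
from fractions import Fraction
from math import comb

def multiply(x, y, op):
    """Product in the group algebra: (sum x_g g)(sum y_h h) = sum x_g y_h (gh)."""
    result = {}
    for g, xg in x.items():
        for h, yh in y.items():
            gh = op(g, h)
            result[gh] = result.get(gh, 0) + xg * yh
    return result

def return_probabilities(generators, inverse, op, identity, n_max):
    """Return [tau(a^0), ..., tau(a^n_max)] for a = (1/2k) sum_j (g_j + g_j^{-1})."""
    k = len(generators)
    a = {}
    for g in generators:
        for s in (g, inverse(g)):
            a[s] = a.get(s, 0) + Fraction(1, 2 * k)
    power = {identity: Fraction(1)}  # a^0 is the unit of the group algebra
    probabilities = []
    for _ in range(n_max + 1):
        probabilities.append(power.get(identity, Fraction(0)))  # tau = coefficient of e
        power = multiply(power, a, op)
    return probabilities

# G = Z written additively, with the single generator 1.
p = return_probabilities([1], lambda g: -g, lambda g, h: g + h, 0, 6)
expected = [Fraction(comb(n, n // 2), 2 ** n) if n % 2 == 0 else Fraction(0)
            for n in range(7)]
print(p)
print(p == expected)
```

Nothing in multiply uses commutativity, so the same code applies to a noncommutative group once op, inverse and identity are supplied. The support of a^n can grow quickly with n, so this is only practical for small n.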

Let’s consider the simplest case, G being isomorphic to \mathbb{Z}. Let S= \{g,g^{-1}\}, where g is a generator of G. As we have seen before, we can associate a= \frac{1}{2}(g+g^{-1}) with f(t) = \frac{1}{2}(e^{2\pi it} + e^{-2\pi i t}) = \cos(2\pi t). Then \tau(a^n) is given by \int_0^1 \cos^n(2\pi t)\;\textnormal{d}t, which implies p_{2n} = {2n \choose n} \frac{1}{2^{2n}}, the return probability of the simple random walk on \mathbb{Z} .
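
For completeness, here is one way to evaluate the integral, using only the binomial theorem. For every integer m\geq 0,

\displaystyle \cos^m(2\pi t) = \frac{1}{2^m}\left(e^{2\pi i t} + e^{-2\pi i t}\right)^m = \frac{1}{2^m}\sum_{j=0}^m {m \choose j} e^{2\pi i (2j-m) t}.

Since \int_0^1 e^{2\pi i \ell t}\;\textnormal{d}t equals 1 for \ell = 0 and 0 for every other integer \ell, only the term with 2j = m survives integration:

\displaystyle p_m = \tau(a^m) = \int_0^1 \cos^m(2\pi t)\;\textnormal{d}t = \begin{cases} {m \choose m/2}\frac{1}{2^m}, & m \text{ even},\\ 0, & m \text{ odd}. \end{cases}

Taking m = 2n gives the formula for p_{2n}, and the odd case reflects the fact that the walk cannot return to e in an odd number of steps.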

Next time we will talk about a more interesting and important topic: independence in noncommutative probability.

This entry was posted in Functional analysis, Probability.
