Do join us!

No matter what your level of mathematics is, if you are interested in commenting on or contributing to this blog, you are strongly encouraged to join!

Posted in Miscellaneous

Noncommutative probability II: independence

Recall that two random variables X and Y are said to be independent if \mathbf{P}(X \in A, Y \in B) = \mathbf{P}(X\in A)\mathbf{P}(Y\in B) for all Borel sets A, B. There is no probability measure behind the abstract definition of a noncommutative probability space, so we cannot define independence in this way. However, it is not difficult to show that the above definition is equivalent to the following: for any Borel-measurable functions f, g such that f(X) and g(Y) are integrable, one has \mathbf{E}[f(X)g(Y)] = \mathbf{E}[f(X)] \mathbf{E}[g(Y)] . Since we do have an expectation in the noncommutative context (namely \tau from the previous post), we can define independence in a similar way.
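The factorization \mathbf{E}[f(X)g(Y)] = \mathbf{E}[f(X)]\mathbf{E}[g(Y)] is easy to see empirically. The following Monte Carlo sketch (my own illustration, not part of the post; the distributions and test functions are arbitrary choices) checks it for a pair of independent random variables:

```python
import numpy as np

# For independent X, Y, E[f(X)g(Y)] should factor as E[f(X)] * E[g(Y)].
rng = np.random.default_rng(0)
n = 200_000
X = rng.standard_normal(n)        # X ~ N(0, 1)
Y = rng.uniform(-1.0, 1.0, n)     # Y ~ Uniform(-1, 1), independent of X

f = np.cos                        # any bounded Borel functions will do
g = np.square

lhs = np.mean(f(X) * g(Y))        # Monte Carlo estimate of E[f(X) g(Y)]
rhs = np.mean(f(X)) * np.mean(g(Y))
assert abs(lhs - rhs) < 0.02      # agreement up to Monte Carlo error
```

Here the exact values are \mathbf{E}[\cos X] = e^{-1/2} and \mathbf{E}[Y^2] = 1/3, so both sides are near 0.202.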

Definition. Let (\mathcal{A},\tau) be a noncommutative probability space. Let \mathcal{A}_1,\mathcal{A}_2\subseteq \mathcal{A} be two subalgebras such that e\in\mathcal{A}_1\cap\mathcal{A}_2. We say that \mathcal{A}_1 and \mathcal{A}_2 are classically independent if

  1. a_1a_2 = a_2a_1 if a_1\in\mathcal{A}_1,a_2\in\mathcal{A}_2,
  2. if a_1\in \mathcal{A}_1 and a_2\in \mathcal{A}_2 with \tau(a_1) = \tau(a_2)= 0, then \tau(a_1a_2) = 0.

Note that if \mathcal{A}_1 and \mathcal{A}_2 are classically independent, then for a_1\in\mathcal{A}_1 and a_2\in\mathcal{A}_2 one has \tau(a_1a_2) = \tau(a_1)\tau(a_2) (apply condition 2 to a_j - \tau(a_j)e). So this is indeed the same factorization as independence in classical probability.


Posted in Combinatorics, Probability

Noncommutative probability I: motivations and examples

In many areas of mathematics, one can usually assign a commutative algebraic structure to the “space” that one is studying. For example, instead of studying an (affine) algebraic variety, one can study algebraic functions on that variety, which gives rise to a commutative ring. Another example is that one can study the space of complex-valued continuous functions on a topological space (with some “nice” topology) instead of the topological space itself, and this space of complex-valued continuous functions is a commutative C*-algebra. These algebraic structures have their noncommutative analogues (namely noncommutative rings and noncommutative C*-algebras), and these might correspond to some “noncommutative varieties” or “noncommutative topological spaces”. This is probably one of the motivations for the area of noncommutative geometry.

This abstraction process also applies to probability spaces, because there is a very natural algebra on a probability space. Given a probability space (\Omega, \mathcal{F}, \mathbf{P}) , one can define the space of (essentially) bounded random variables L^\infty(\Omega) , and a linear functional \mathbf{E} on this space, namely the expected value. So we may study the algebra L^\infty(\Omega) instead of the probability space itself.

In fact, it is very natural in probability theory to study only the random variables and not the probability space. Usually (but not always!) the particular choice of sample space is unimportant; what matters are the random variables and their associated \sigma -algebras.

This leads to the following definition/abstraction.

Definition. A (noncommutative) probability space is a pair (\mathcal{A}, \tau) , where \mathcal{A} is a complex algebra with a unit, and \tau:\mathcal{A}\to\mathbb{C} is a linear map such that \tau(1)=1 .

The requirement \tau(1)=1 can be seen as an abstraction of the fact that the sample space has probability 1. To see that this is not a meaningless generalization, consider the following examples.

Examples.

  1. As we have seen, for a given probability space (\Omega, \mathcal{F}, \mathbf{P}), (L^\infty(\Omega), \mathbf{E}) is a noncommutative probability space.
  2. In many cases, we also care about random variables that are not bounded, for instance Gaussian random variables. So it might be more natural to study \mathcal{A} = \bigcap_{1\leq p<\infty} L^p(\Omega) with \tau = \mathbf{E}.
  3. \mathcal{A}=M_n(\mathbb{C}) with \tau(X) = \frac{1}{n}\textnormal{tr}(X).
  4. \mathcal{A} = M_n(\mathbb{C}); fix a vector v\in \mathbb{C}^n with \|v\|_2 = 1 and define \tau(X) = v^*Xv. Then (\mathcal{A},\tau) is also a noncommutative probability space.
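Examples 3 and 4 are easy to check directly. The sketch below (an illustration only; the size n and the unit vector v are arbitrary choices) verifies that both functionals send the identity to 1, and that the underlying algebra is genuinely noncommutative:

```python
import numpy as np

n = 4
I = np.eye(n, dtype=complex)

def tau_trace(X):
    return np.trace(X) / n            # Example 3: normalized trace

v = np.ones(n, dtype=complex) / np.sqrt(n)   # a unit vector in C^n
def tau_state(X):
    return v.conj() @ X @ v           # Example 4: the "vector state" v* X v

assert np.isclose(tau_trace(I), 1.0)  # tau(1) = 1 in both cases
assert np.isclose(tau_state(I), 1.0)

# M_n(C) is noncommutative for n >= 2:
A = np.diag([1.0 + 0j, 2.0, 3.0, 4.0])
B = np.ones((n, n), dtype=complex)
assert not np.allclose(A @ B, B @ A)
```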


Posted in Functional analysis, Probability

A simple proof of the Gauss-Bonnet theorem for a geodesic ball

In this short note, we will give a simple proof of the Gauss-Bonnet theorem for a geodesic ball on a surface. The only prerequisites are the first variation formula and some knowledge of Jacobi fields (the second variation formula), in particular how the second derivative of the Jacobian is related to the curvature of the surface. This is different from most standard textbook proofs at the undergraduate level. (Of course, this is just a local version of the Gauss-Bonnet theorem, and topology has not yet come into play.)

Let {M} be a surface equipped with a Riemannian metric. We will fix a point {p} in {M} and from now on {B_r} always denotes the geodesic ball of radius {r} centered at {p}, and {\partial B_r} its boundary, which is called the geodesic sphere. In geodesic polar coordinates, let the area element of {M} be locally given by

\displaystyle \begin{array}{rl} \displaystyle   dA=f(\theta, r)\, dr\, d\theta=f_\theta(r)\, dr\, d\theta, \end{array}

where {f(\theta, r)} is the Jacobian (with respect to polar coordinates). For our purpose it is more convenient to regard {f_\theta(r)} as a one-parameter family of functions in the variable {r}. It is well-known that {f_\theta} satisfies the Jacobi equation (here {'=\frac{d}{dr}} )

\displaystyle \begin{array}{rl} \displaystyle   {f_\theta}''(r)=-K(\theta, r) f_\theta(r),\quad f_\theta(0)=0,\quad {f_\theta}'(0)=1  \ \ \ \ \ (1)\end{array}

where {K=K(\theta, r)} is the Gaussian curvature (in polar coordinates). Indeed, fix geodesic polar coordinates and let {\gamma_\theta(t)} be the arc-length parametrized geodesic with initial “direction” {\theta} starting from {p}; we can then define a parallel orthonormal frame {e_1(t), e_2(t)=\gamma_\theta'(t)} along {\gamma_\theta(t)}. Then {Y(t)=f_\theta(t)e_1(t)} is a Jacobi field and so

\displaystyle \begin{array}{rl} \displaystyle   Y''(t)={f_\theta}''(t) e_1 (t) =- R(Y(t), \gamma_\theta'(t))\gamma_\theta'(t) =& \displaystyle -K(\theta, t) Y(t)\\ =& \displaystyle -K(\theta, t) f_\theta (t) e_1(t). \end{array}

From this (1) follows.
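As a sanity check (not part of the argument), one can integrate (1) numerically in a constant-curvature example. The sketch below assumes the unit sphere, {K\equiv 1}, where the exact solution is {f_\theta(r)=\sin r}:

```python
import numpy as np

def solve_jacobi(K, r_max, steps=10_000):
    """Integrate (f, f')' = (f', -K(r) f), f(0)=0, f'(0)=1, by classical RK4."""
    h = r_max / steps
    y = np.array([0.0, 1.0])                         # (f(0), f'(0))
    deriv = lambda r, s: np.array([s[1], -K(r) * s[0]])
    r = 0.0
    for _ in range(steps):
        k1 = deriv(r, y)
        k2 = deriv(r + h / 2, y + h / 2 * k1)
        k3 = deriv(r + h / 2, y + h / 2 * k2)
        k4 = deriv(r + h, y + h * k3)
        y = y + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
        r += h
    return y[0]                                      # f(r_max)

# Unit sphere, K = 1: the solution of (1) is f(r) = sin(r).
assert abs(solve_jacobi(lambda r: 1.0, 1.0) - np.sin(1.0)) < 1e-8
```

For the flat plane ({K\equiv 0}) the same routine returns {f_\theta(r)=r}, the familiar Euclidean Jacobian.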

The first variation formula says (here {s} is the arclength parameter)

\displaystyle \begin{array}{rl} \displaystyle   \frac{d}{dt} \left(\mathrm{Length}(\partial B_t)\right) =\frac{d}{dt} \left(\int_0^{2\pi}f_\theta( t) d\theta\right) =\int_{ \partial B_t}k_g ds =\int_0^{2\pi} k_g(\theta, t) f_\theta( t)d\theta. \end{array}

Here {k_g} is the geodesic curvature of the geodesic circle {\partial B_t}. (Indeed, the differential version {\frac{f_\theta'}{f_\theta}=k_g} is already true for the geodesic circle.) This implies

\displaystyle \begin{array}{rl} \displaystyle   \int_{\partial B_t}k_g ds =\int_{0}^{2\pi} f_\theta'(t) d\theta. \end{array}

So by the fundamental theorem of calculus and (1), we have

\displaystyle \begin{array}{rl} \displaystyle   \int_{\partial B_t}k_g ds =& \displaystyle \int_{0}^{2\pi}\left({f_\theta}'(0)+\int_0^t {f_\theta}''(r)dr\right)d\theta\\ =& \displaystyle \int_{0}^{2\pi}\left(1-\int_0^t K (\theta, r)f_\theta(r)dr\right)d\theta\\ =& \displaystyle 2 \pi-\int_{B_t} K dA. \end{array}

This is exactly the Gauss-Bonnet theorem (for a geodesic ball), which is usually written as

\displaystyle \begin{array}{rl} \displaystyle   \int_{B_r}K dA+\int_{\partial B_r}k_g ds=2\pi. \end{array}
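For a concrete check (an illustration only, not part of the proof): on the unit sphere {K=1} and {f_\theta(r)=\sin r}, so the boundary term is {2\pi\cos t} and the curvature integral is {2\pi(1-\cos t)}; their sum is {2\pi} for every {t}:

```python
import numpy as np

for t in (0.3, 1.0, 2.0):
    r = np.linspace(0.0, t, 100_001)
    vals = np.sin(r)                       # K * f_theta(r) with K = 1
    # trapezoid rule for int_{B_t} K dA = int_0^{2pi} int_0^t sin(r) dr dtheta
    curvature_integral = 2 * np.pi * np.sum((vals[1:] + vals[:-1]) * np.diff(r)) / 2
    boundary_integral = 2 * np.pi * np.cos(t)   # int_{dB_t} k_g ds on the sphere
    assert np.isclose(curvature_integral + boundary_integral, 2 * np.pi)
```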

Posted in Calculus, Differential equations, Differential geometry, Geometry

Least squares in a non-ordinary sense

Simple ordinary least squares regression (SOLSR) means the following. Given data (x_i,y_i)\in \mathbb{R}^2, i = 1, \ldots,N, find a line in \mathbb{R}^2, represented by y = mx+c, that fits the data in the following sense. The loss of a data point (x_i,y_i) relative to the line is

|y_i - (mx_i + c) |          for every i,

so we find (m,c) that minimizes the loss function

\displaystyle \sum_{i=1}^N [y_i - (mx_i + c)]^2.

See here for a closed-form expression for the minimizer (m,c). Instead of SOLSR, one can take as the loss the distance between a data point and a line in \mathbb{R}^2. Notice that |y_i - (mx_i + c) | is not the distance from (x_i,y_i) to the line y = mx+c unless m = 0. The new least squares problem can then be formulated as follows.

A general line in \mathbb{R}^2 can be expressed as \sin(\theta)x-\cos(\theta)y+a=0 where (\theta,a)\in \mathbb{R}^2. Thus the distance between (x_i,y_i) and this line is

|\sin(\theta) x_i - \cos(\theta) y_i + a|          for every i.

Hence we want to find (\theta,a) that minimizes the loss function

\displaystyle L(\theta,a) :=\sum_{i=1}^N (\sin(\theta) x_i - \cos(\theta) y_i + a)^2.
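A minimal numerical sketch of this minimization (my own illustration; the data, the grid scan, and the tolerance are arbitrary choices, not the closed-form treatment): for each fixed \theta, L(\theta,a) is a quadratic in a, minimized at a = -\frac{1}{N}\sum_i(\sin(\theta)x_i - \cos(\theta)y_i); substituting this in leaves a one-variable problem in \theta.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + 0.1 * rng.standard_normal(50)   # data near the line y = 2x + 1

thetas = np.linspace(0.0, np.pi, 20_000, endpoint=False)
resid = np.sin(thetas)[:, None] * x - np.cos(thetas)[:, None] * y
resid -= resid.mean(axis=1, keepdims=True)          # plug in the optimal a for each theta
theta_best = thetas[np.argmin((resid ** 2).sum(axis=1))]

# sin(t) x - cos(t) y + a = 0  means  y = tan(t) x + a / cos(t)
slope = np.tan(theta_best)
assert abs(slope - 2.0) < 0.05                      # recovers the slope of the data
```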


Posted in Applied mathematics, Calculus, Optimization, Statistics

Archimedes’ principle for hyperbolic plane

After writing the previous post, I realized that there is an exact analogue of Archimedes’ principle for the area of a geodesic disk inside the hyperbolic plane.


Archimedes (287 –  212 BC)


Let us fix notation. Let {\mathbb R^{2,1}} be the {3}-dimensional Minkowski space defined by

\displaystyle \begin{array}{rl} \displaystyle \mathbb R^{2,1}:=\{(x, y, t)\in \mathbb R^{3}\}, \end{array}

equipped with the Lorentzian metric {-dt^2+dx^2+dy^2}.

Fix {r >0} and define the hyperbolic plane {\mathbb H^2(r)} by

\displaystyle \begin{array}{rl} \displaystyle \mathbb H^2(r):=\{(x, y, t)\in \mathbb R^{2,1}: \quad t>0,\quad t^2-x^2-y^2=r^2\}, \end{array}

equipped with the induced metric.


Hyperbolic plane

It can be shown that {\mathbb H^2(r)} is a surface with constant curvature {-\frac{1}{r^2}} (analogous to the fact that the sphere of radius {r}, {\mathbb S^2(r)}, is a surface with constant curvature {\frac{1}{r^2}}). It is easy to see that {\mathbb H^2(r)} can be parametrized by

\displaystyle \begin{array}{rl} \displaystyle X(\theta, \phi)= (r\sinh \theta \cos \phi,r \sinh \theta \sin \phi, r\cosh \theta), \quad \theta\ge 0, \quad \phi\in[0, 2\pi],\end{array}

the “polar coordinates” around {P=(0, 0, r)}, with {r\theta} being the (geodesic) distance from {P}.

Finally, let the (infinite) cylinder {C} of radius {r} be defined by

\displaystyle \begin{array}{rl} \displaystyle C= \{(x, y, t)\in \mathbb R^{2,1}: x^2+y^2=r^2\}, \end{array}

again equipped with the induced metric.


Cylinder

There is a natural orthogonal projection {\Pi} from {\mathbb H^2(r)\setminus \{P\}} to {C} defined by

\displaystyle \begin{array}{rl} \displaystyle \Pi(x, y, t)=\left(\frac{r x}{\sqrt{x^2+y^2}}, \frac{r y}{\sqrt{x^2+y^2}}, t\right). \end{array}


The natural projection

In polar coordinates, this is given by

\displaystyle \begin{array}{rl} \displaystyle \Pi(X(\theta, \phi ))=(r\cos \phi, r\sin \phi, r\cosh \theta). \ \ \ \ \ (1)\end{array}

We claim that the map {\Pi} is area-preserving. If this is true, then we can easily calculate the area of the hyperbolic geodesic disk (with radius {rR})

\displaystyle \begin{array}{rl} \displaystyle B_R:=\{X(\theta, \phi): 0\le \theta\le R\}\end{array}

because the projection of {B_R\setminus \{P\}} is exactly the finite cylinder

\displaystyle \begin{array}{rl} \displaystyle \Pi(B_R\setminus \{P\})=\{(x, y, t)\in C: r<t\le r\cosh R\} \end{array}

whose area is easy to calculate. Indeed, the area of this cylinder is exactly the same as that of the ordinary cylinder in the Euclidean space {\mathbb R^3}.


A geodesic disk inside the hyperbolic plane

Now, it is easy to see that in the coordinates {(\theta, \phi)}, the area form of {\mathbb H^2(r)} is given by

\displaystyle \begin{array}{rl} \displaystyle \omega_{\mathbb H^2(r)}= r^2\sinh \theta d\theta\wedge d\phi. \ \ \ \ \ (2)\end{array}

On the other hand, the area form of the cylinder is given by

\displaystyle \begin{array}{rl} \displaystyle \omega_{C}=rdt\wedge d\phi \ \ \ \ \ (3)\end{array}

if {C} is parametrized by {(r\cos \phi, r\sin \phi, t)}.

From (1) and (3), we see that (note that {d\phi(-r \sin \phi, r\cos \phi, 0)=1})

\displaystyle \begin{array}{rl} \displaystyle \omega_C(d \Pi(X_\theta), d\Pi(X_\phi))=r^2\sinh\theta. \end{array}

Comparing with (2), we conclude that {\Pi^*\omega_C=\omega_{\mathbb H^2(r)}}, i.e. {\Pi} is area-preserving. As a corollary, we get

Corollary 1 The area of {B_R} is

\displaystyle \begin{array}{rl} \displaystyle \mathrm{Area}(B_R)=2\pi r^2(\cosh R-1). \end{array}
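Corollary 1 can also be checked by integrating the area form (2) directly, without the projection {\Pi}. A short numerical verification (the values of {r} and {R} are arbitrary choices):

```python
import numpy as np

r, R = 1.5, 2.0
theta = np.linspace(0.0, R, 200_001)
vals = r ** 2 * np.sinh(theta)
# trapezoid rule for  int_0^{2pi} int_0^R r^2 sinh(theta) dtheta dphi
area = 2 * np.pi * np.sum((vals[1:] + vals[:-1]) * np.diff(theta)) / 2
assert np.isclose(area, 2 * np.pi * r ** 2 * (np.cosh(R) - 1.0))
```

Note that for small {R} this is approximately {\pi (rR)^2}, the Euclidean area of a disk of radius {rR}, as it should be.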

Posted in Calculus, Differential geometry, Geometry

Archimedes and the area of sphere

I record here a remarkable discovery of Archimedes: the formula for the surface area of a sphere. I think the derivation is elementary enough that it can be taught in high school. (I was not taught any derivation of it the first time I learned it, but I derived it myself after learning enough calculus.) The sad thing about our math education is that we are fed tons of formulas, often without much motivation, so that even good students have to resort to mere memorization. My point is that many formulas taught in high school can be demonstrated in a rather natural and elementary way, which can help in understanding and remembering them.

About 1800 years before the discovery of calculus, Archimedes showed that the surface area of a sphere of radius {r} is {4\pi r^2}. He also showed that the volume of a ball of radius {r} is {\frac{4}{3}\pi r^3} using Cavalieri’s principle, again without calculus (this is explained in the last part of my calculus notes here).


Archimedes (287 –  212 BC)

Let us now describe how he discovered the surface area of a sphere. First, we inscribe the sphere of radius {r} in a cylinder of the same radius and height {2r} as shown.

The sphere inscribed in the cylinder

We claim that the orthogonal projection from the lateral face of the cylinder onto the sphere is area-preserving. This can of course be shown by calculus. Remarkably, Archimedes proved it without calculus. Unavoidably, the argument is not completely rigorous by today’s standards: avoiding calculus, we lack a precise definition of area and have to resort to the language of the “infinitesimals”.

The following picture is the cross section of the inscribed sphere along the {x-z} plane, which then becomes a circle of radius {r} inscribed in a square of side {2r}.

Fig. 1: cross section of the inscribed sphere


Suppose {\Delta \theta} is the “infinitesimal” increment of the angle {\theta}. Then on one hand (if we ignore the {\pm} sign),

\displaystyle \begin{array}{rl} \displaystyle \Delta z=r \Delta \theta \cos \left(\frac{\pi}{2}-\theta\right)=r\Delta \theta \sin \theta. \ \ \ \ \ (1)\end{array}

So the infinitesimal area element of the cylinder is

\displaystyle \begin{array}{rl} \displaystyle \Delta A_{\mathrm{cylinder}} =& \displaystyle 2\pi r \cdot \Delta z\\ =& \displaystyle 2\pi r ^2 \sin \theta \Delta \theta. \end{array}

On the other hand, the infinitesimal area element of the sphere (the area corresponding to the projection of \Delta A_{\mathrm{cylinder}}) is (see Fig. 1)

\displaystyle \begin{array}{rl} \displaystyle \Delta A_{\mathrm{sphere}} =& \displaystyle r \Delta \theta\cdot 2\pi r \sin \theta\\ =& \displaystyle 2\pi r^2 \sin \theta\Delta \theta\\=&\displaystyle \Delta A_{\mathrm{cylinder}}. \end{array}

We conclude that the projection is area-preserving and so the lateral face of the cylinder has the same area as the sphere. So the sphere has area {2\pi r\times 2r=4\pi r^2}.
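Summing the infinitesimal computation above is, of course, just an integral. A quick check (an illustration only; the radius is an arbitrary choice) that the area elements {2\pi r^2\sin\theta\,\Delta\theta} over {\theta\in[0,\pi]} add up to {4\pi r^2}:

```python
import numpy as np

r = 3.0
theta = np.linspace(0.0, np.pi, 200_001)
vals = 2 * np.pi * r ** 2 * np.sin(theta)           # the area element's density
area = np.sum((vals[1:] + vals[:-1]) * np.diff(theta)) / 2   # trapezoid rule
assert np.isclose(area, 4 * np.pi * r ** 2)
```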

Exercise: Make the above argument precise by using the language of differential forms. For example, as {z=r\cos \theta}, we have {dz=-r\sin \theta d\theta}, which (up to a sign) is the rigorous version of (1).

Posted in Calculus, Geometry

Zeros of random polynomials

Given a polynomial p(z) = a_0+a_1z+\cdots + a_nz^n, where the coefficients are random, what can we say about the distribution of the roots (on \mathbb{C})? Of course, it would depend on what “random” means. Here, “random” means that the sequence (a_n) is an i.i.d. sequence of complex random variables.

It turns out that, under a rather weak condition on (a_n), as n\to\infty the roots tend to be distributed near the unit circle! (There are lots of interesting discussions here, which explain why this should be true intuitively.)

I will give a rigorous proof (and a rigorous formulation) of this result. For simplicity, we will assume that (a_n) are complex standard Gaussians. That is, for any Borel set B\subseteq \mathbb{C}, we have

\displaystyle \mathbf{P}(a_j\in B) = \frac{1}{\pi} \int_B e^{-|z|^2}\,\text{d}m(z),

where m is the Lebesgue measure on the complex plane. We will also assume two basic potential theoretic results without proofs, namely

\displaystyle \frac{1}{2\pi}\Delta \max\{0,\log|z|\} = \frac{1}{2\pi}\,\text{d}\theta \quad \text{and} \quad \frac{1}{2\pi}\Delta \log|z| = \delta_0.

For p_n(z) = \sum_{j=0}^n a_jz^j = a_n\prod_{j=1}^n (z-\zeta_j) , we write Z_{p_n} = \frac{1}{n}\sum_{j=1}^n \delta_{\zeta_j}, the normalized counting measure.  We define the expected normalized counting measure, \mathbf{E}[Z_{p_n}], as follows. For any \psi\in C_c(\mathbb{C}),

\displaystyle \begin{array}{rl} \displaystyle (\mathbf{E}[Z_{p_n}],\psi) &:= \mathbf{E}[(Z_{p_n}, \psi)]\\  &\displaystyle =\frac{1}{\pi^{n+1}}\int_{\mathbb{C}^{n+1}} \frac{1}{n}\sum_{j=1}^n \psi(\zeta_j)e^{-\sum_{j=0}^n |a_j|^2} \,\text{d}m(a_0)\cdots\text{d}m(a_n). \end{array}

Intuitively, this \mathbf{E}[Z_{p_n}] tells us how the zeros of p_n are distributed “on average”.

Theorem. We have

\displaystyle \lim_{n\to\infty} \mathbf{E}[Z_{p_n}] = \frac{1}{2\pi}\text{d}\theta

in weak*-topology.
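Before the proof, an empirical illustration (my own, not from the post; the degree and random seed are arbitrary choices): sampling complex Gaussian coefficients and computing the roots numerically, the root moduli cluster near 1.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
# i.i.d. complex Gaussian coefficients a_0, ..., a_n (overall scale is irrelevant
# for the roots, so the normalization 1/sqrt(2) is omitted)
a = rng.standard_normal(n + 1) + 1j * rng.standard_normal(n + 1)

roots = np.roots(a[::-1])            # np.roots expects the highest-degree coefficient first
moduli = np.abs(roots)
assert abs(np.median(moduli) - 1.0) < 0.05   # the roots concentrate near |z| = 1
```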


Posted in Algebra, Complex analysis, Potential theory, Probability