Do join us!

No matter what your level of mathematics is, if you are interested in commenting on or contributing to this blog, you are strongly encouraged to join! Continue reading

Posted in Miscellaneous | 2 Comments

Probability that a random inscribed triangle contains another circle

Suppose {C} is the unit circle centered at {O} in the plane and {C_r} is a concentric circle with radius {r} ({r<1}). What is the probability that a random triangle inscribed in {C} contains {C_r} in its interior?

This is a problem raised by one of the winning papers in the Hang Lung Mathematics Award 2018, which I was reviewing recently. The original solution is a bit indirect, so we give two more direct methods and a generalization here. We will also consider the case where C_r is replaced by a general line segment inside the unit circle. Continue reading
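
Before reading on, one can estimate the answer by simulation. Below is a minimal Monte Carlo sketch (mine, not from the paper or the post), using two facts: a chord subtending an arc of length {s} lies at distance {\cos(s/2)} from {O}, and {O} lies inside the triangle iff every arc between adjacent vertices is less than {\pi}.

```python
import math
import random

def contains_disk(r, trials=200_000, seed=0):
    """Estimate the probability that a random triangle inscribed in the
    unit circle contains the concentric disk of radius r."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        t = sorted(rng.uniform(0.0, 2 * math.pi) for _ in range(3))
        arcs = (t[1] - t[0], t[2] - t[1], 2 * math.pi - (t[2] - t[0]))
        # O lies inside the triangle iff every arc is < pi, and the side
        # subtending an arc of length s is a chord at distance cos(s/2)
        # from O; the disk fits iff every such distance is >= r.
        if max(arcs) < math.pi and all(math.cos(s / 2) >= r for s in arcs):
            hits += 1
    return hits / trials
```

For {r=0} this recovers the classical answer {1/4} (the probability that the triangle contains the center), and the probability vanishes for {r>1/2}: each side would have to subtend an arc of length at most {2\cos^{-1}(r)<\frac{2\pi}{3}}, while the three arcs must sum to {2\pi}.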

Posted in Calculus, Geometry, Probability | Leave a comment

Noncommutative probability II: independence

Recall that two random variables X and Y are said to be independent if \mathbf{P}(X \in A, Y \in B) = \mathbf{P}(X\in A)\mathbf{P}(Y\in B) for all Borel sets A, B. We don’t have a probability measure behind the abstract definition of a noncommutative probability space, so we cannot define independence in this way. However, it is not difficult to show that the above definition is equivalent to the following: for any Borel-measurable functions f, g such that f(X), g(Y) are integrable, one has \mathbf{E}[f(X)g(Y)] = \mathbf{E}[f(X)] \mathbf{E}[g(Y)] . Since we have an expectation in the noncommutative context (namely \tau in the previous post), we can define independence in a similar way.

Definition. Let (\mathcal{A},\tau) be a noncommutative probability space. Let \mathcal{A}_1,\mathcal{A}_2\subseteq \mathcal{A} be two subalgebras such that e\in\mathcal{A}_1\cap\mathcal{A}_2. We say that \mathcal{A}_1 and \mathcal{A}_2 are classically independent if

  1. a_1a_2 = a_2a_1 if a_1\in\mathcal{A}_1,a_2\in\mathcal{A}_2,
  2. if a_1\in \mathcal{A}_1 and a_2\in \mathcal{A}_2 with \tau(a_1) = \tau(a_2)= 0, then \tau(a_1a_2) = 0.

Note that if \mathcal{A}_1 and \mathcal{A}_2 are classically independent, then \tau(a_1a_2) = \tau(a_1)\tau(a_2) for all a_1\in\mathcal{A}_1 and a_2\in\mathcal{A}_2 (by applying the definition to a_j - \tau(a_j)e). So this is indeed the same as the independence we saw in classical probability.
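
Writing out that one-step computation: since {\tau} is linear with {\tau(e)=1}, applying condition 2 to the centered elements gives

```latex
\begin{aligned}
0 &= \tau\big((a_1-\tau(a_1)e)(a_2-\tau(a_2)e)\big)\\
  &= \tau(a_1a_2)-\tau(a_2)\tau(a_1)-\tau(a_1)\tau(a_2)+\tau(a_1)\tau(a_2)\\
  &= \tau(a_1a_2)-\tau(a_1)\tau(a_2).
\end{aligned}
```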

Continue reading

Posted in Combinatorics, Probability | Leave a comment

Noncommutative probability I: motivations and examples

In many areas of mathematics, one can usually assign a commutative algebraic structure to the “space” that one is studying. For example, instead of studying an (affine) algebraic variety, one can study algebraic functions on that variety, which gives rise to a commutative ring. Another example is that one can study the space of complex-valued continuous functions on a topological space (with some “nice” topology) instead of the topological space itself, and this space of complex-valued continuous functions is a commutative C*-algebra. These algebraic structures have their noncommutative analogues (namely noncommutative rings and noncommutative C*-algebras), and these might correspond to some “noncommutative varieties” or “noncommutative topological spaces”. This is probably one of the motivations of the area of noncommutative geometry.

This abstraction process also applies to probability spaces, because there is a very natural algebra on a probability space. Given a probability space (\Omega, \mathcal{F}, \mathbf{P}) , one can define the space of (essentially) bounded random variables L^\infty(\Omega) , and a linear functional \mathbf{E} on this space, namely the expected value. So we may study the algebra L^\infty(\Omega) instead of the probability space itself.

In fact, it is very natural in probability theory to study only the random variables and not the probability space. Usually (but not always!), the role of the sample space is not important at all; the random variables and their associated \sigma -algebras are what matter.

This leads to the following definition/abstraction.

Definition. A (noncommutative) probability space is a pair (\mathcal{A}, \tau) , where \mathcal{A} is a complex algebra with a unit, and \tau:\mathcal{A}\to\mathbb{C} is a linear map such that \tau(1)=1 .

The requirement \tau(1)=1 can be seen as an abstraction of the fact that the sample space has probability 1. To see that this is not a meaningless generalization, we can look at the following examples.


  1. As we have seen, for a given probability space (\Omega, \mathcal{F}, \mathbf{P}), (L^\infty(\Omega), \mathbf{E}) is a noncommutative probability space.
  2. In many cases, we also care about random variables that are not bounded, for instance Gaussian random variables. So it might be more natural to study \mathcal{A} = \bigcap_{1\leq p<\infty} L^p(\Omega) with \tau = \mathbf{E}.
  3. \mathcal{A}=M_n(\mathbb{C}) with \tau(X) = \frac{1}{n}\textnormal{tr}(X).
  4. \mathcal{A} = M_n(\mathbb{C}); fix a unit vector v\in \mathbb{C}^n with \|v\|_2 = 1 and define \tau(X) = v^*Xv. Then (\mathcal{A},\tau) is also a noncommutative probability space.
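
Examples 3 and 4 are easy to verify numerically. The following sketch (using NumPy on randomly generated matrices; not from the original post) checks that both functionals send the unit to {1}, and that the normalized trace is moreover tracial:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Y = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# Example 3: the normalized trace on M_n(C).
def tau_tr(A):
    return np.trace(A) / n

# Example 4: the "vector state" tau(X) = v* X v for a unit vector v.
v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
v = v / np.linalg.norm(v)
def tau_v(A):
    return np.vdot(v, A @ v)   # vdot conjugates its first argument

assert np.isclose(tau_tr(np.eye(n)), 1.0)        # tau(1) = 1
assert np.isclose(tau_v(np.eye(n)), 1.0)         # tau_v(1) = 1
assert np.isclose(tau_tr(X @ Y), tau_tr(Y @ X))  # tracial property
```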

Continue reading

Posted in Functional analysis, Probability | Leave a comment

A simple proof of the Gauss-Bonnet theorem for geodesic ball

In this short note, we will give a simple proof of the Gauss-Bonnet theorem for a geodesic ball on a surface. The only prerequisites are the first variation formula and some knowledge of Jacobi fields (the second variation formula), in particular how the second derivative of a Jacobi field (or of the Jacobian) is related to the curvature of the surface. This is different from most standard textbook proofs at the undergraduate level. (Of course, this is just a local version of the Gauss-Bonnet theorem and topology has not yet come into play.)

Let {M} be a surface equipped with a Riemannian metric. We will fix a point {p} in {M} and from now on {B_r} always denotes the geodesic ball of radius {r} centered at {p}, and {\partial B_r} its boundary, which is called the geodesic sphere. In geodesic polar coordinates, let the area element of {M} be locally given by

\displaystyle \begin{array}{rl} \displaystyle   dA=f(\theta, r)\, dr\, d\theta=f_\theta(r)\, dr\, d\theta, \end{array}

where {f(\theta, r)} is the Jacobian (with respect to polar coordinates). For our purpose it is more convenient to regard {f_\theta(r)} as a one-parameter family of functions in the variable {r}. It is well-known that {f_\theta} satisfies the Jacobi equation (here {'=\frac{d}{dr}} )

\displaystyle \begin{array}{rl} \displaystyle   {f_\theta}''(r)=-K(\theta, r) f_\theta(r),\quad f_\theta(0)=0,\quad {f_\theta}'(0)=1  \ \ \ \ \ (1)\end{array}

where {K=K(\theta, r)} is the Gaussian curvature (in polar coordinates). Indeed, if we fix geodesic polar coordinates and let {\gamma_\theta(t)} be the arc-length parametrized geodesic with initial “direction” {\theta} starting from {p}, then we can define a parallel orthonormal frame {e_1(t), e_2(t)=\gamma_\theta'(t)} along {\gamma_\theta(t)}. Then {Y(t)=f_\theta(t)e_1(t)} is a Jacobi field and so

\displaystyle \begin{array}{rl} \displaystyle   Y''(t)={f_\theta}''(t) e_1 (t) =- R(Y(t), \gamma_\theta'(t))\gamma_\theta'(t) =& \displaystyle -K(\theta, t) Y(t)\\ =& \displaystyle -K(\theta, t) f_\theta (t) e_1(t). \end{array}

From this (1) follows.

The first variation formula says (here {s} is the arclength parameter)

\displaystyle \begin{array}{rl} \displaystyle   \frac{d}{dt} \left(\mathrm{Length}(\partial B_t)\right) =\frac{d}{dt} \left(\int_0^{2\pi}f_\theta( t) d\theta\right) =\int_{ \partial B_t}k_g ds =\int_0^{2\pi} k_g(\theta, t) f_\theta( t)d\theta. \end{array}

Here {k_g} is the geodesic curvature of the geodesic circle {\partial B_t}. (Indeed, the differential version {\frac{f_\theta'}{f_\theta}=k_g} is already true for the geodesic circle.) This implies

\displaystyle \begin{array}{rl} \displaystyle   \int_{\partial B_t}k_g ds =\int_{0}^{2\pi} f_\theta'(t) d\theta. \end{array}

So by the fundamental theorem of calculus and (1), we have

\displaystyle \begin{array}{rl} \displaystyle   \int_{\partial B_t}k_g ds =& \displaystyle \int_{0}^{2\pi}\left({f_\theta}'(0)+\int_0^t {f_\theta}''(r)dr\right)d\theta\\ =& \displaystyle \int_{0}^{2\pi}\left(1-\int_0^t K (\theta, r)f_\theta(r)dr\right)d\theta\\ =& \displaystyle 2 \pi-\int_{B_t} K dA. \end{array}

This is exactly the Gauss-Bonnet theorem (for a geodesic ball), which is usually written as

\displaystyle \begin{array}{rl} \displaystyle   \int_{B_r}K dA+\int_{\partial B_r}k_g ds=2\pi. \end{array}
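
As a sanity check (not part of the proof), the final identity can be tested on the round unit sphere, where {K\equiv 1} and (1) gives {f_\theta(r)=\sin r} for every {\theta}, so that {\int_{B_t}K dA=2\pi(1-\cos t)} and {\int_{\partial B_t}k_g ds=\int_0^{2\pi}f_\theta'(t)d\theta=2\pi\cos t}:

```python
import math

def gauss_bonnet_sphere(t):
    """Left-hand side of the Gauss-Bonnet identity for the geodesic ball
    of radius t on the unit sphere, where f_theta(r) = sin(r), K = 1."""
    curvature_term = 2 * math.pi * (1 - math.cos(t))   # int_{B_t} K dA
    boundary_term = 2 * math.pi * math.cos(t)          # int_{dB_t} k_g ds
    return curvature_term + boundary_term

# The sum is 2*pi for every radius 0 < t < pi.
assert all(math.isclose(gauss_bonnet_sphere(t), 2 * math.pi)
           for t in (0.3, 1.0, 2.5))
```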

Posted in Calculus, Differential equations, Differential geometry, Geometry | Leave a comment

Least squares in a non-ordinary sense

Simple ordinary least squares regression (SOLSR) means the following. Given data (x_i,y_i)\in \mathbb{R}^2, i = 1, \ldots,N, find a line in \mathbb{R}^2 represented by y = mx+c that fits the data in the following sense. The loss of each data point (x_i,y_i) to the line is

|y_i - (mx_i + c) |          for every i,

so we find (m,c) that minimizes the loss function

\displaystyle \sum_{i=1}^N [y_i - (mx_i + c)]^2.

See here for a closed-form expression for the minimizer (m,c). Instead of SOLSR, one can consider the distance between a data point and a line in \mathbb{R}^2 as the loss. Notice that |y_i - (mx_i + c) | is not the distance from (x_i,y_i) to y = mx+c unless m = 0. Then the new least squares problem can be formulated as follows.

A general line in \mathbb{R}^2 can be expressed as \sin(\theta)x-\cos(\theta)y+a=0 where (\theta,a)\in \mathbb{R}^2. Thus the distance between (x_i,y_i) and this line is

|\sin(\theta) x_i - \cos(\theta) y_i + a|          for every i.

Hence we want to find (\theta,a) that minimizes the loss function

\displaystyle L(\theta,a) :=\sum_{i=1}^N (\sin(\theta) x_i - \cos(\theta) y_i + a)^2.
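
A minimal numerical sketch (mine, not from the post): for fixed {\theta}, setting {\partial L/\partial a=0} gives {a=-(\sin(\theta)\bar{x}-\cos(\theta)\bar{y})}, where {(\bar{x},\bar{y})} is the centroid of the data (so the best line always passes through the centroid), and it then suffices to scan {\theta} over {[0,\pi)}:

```python
import math

def fit_min_distance_line(pts, steps=20000):
    """Minimize L(theta, a) = sum_i (sin(theta) x_i - cos(theta) y_i + a)^2
    by a grid search in theta, using the closed-form optimal a for each theta."""
    xbar = sum(p[0] for p in pts) / len(pts)
    ybar = sum(p[1] for p in pts) / len(pts)
    best = None
    for k in range(steps):
        th = math.pi * k / steps        # theta and theta + pi give the same line
        a = -(math.sin(th) * xbar - math.cos(th) * ybar)
        loss = sum((math.sin(th) * x - math.cos(th) * y + a) ** 2
                   for x, y in pts)
        if best is None or loss < best[0]:
            best = (loss, th, a)
    return best

# Collinear data on y = x + 1: the optimal loss is 0, with theta = pi/4.
loss, th, a = fit_min_distance_line([(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)])
```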

Continue reading

Posted in Applied mathematics, Calculus, Optimization, Statistics | Leave a comment

Archimedes’ principle for hyperbolic plane

After writing the previous post, I realized that there is an exact analogue of Archimedes’ principle for the surface area of a hyperbolic disk inside the hyperbolic plane.


Archimedes (287 –  212 BC)

Let us fix the notations. Let {\mathbb R^{2,1}} be the {3}-dimensional Minkowski space defined by

\displaystyle \begin{array}{rl} \displaystyle \mathbb R^{2,1}:=\{(x, y, t)\in \mathbb R^{3}\}, \end{array}

equipped with the Lorentzian metric {-dt^2+dx^2+dy^2}.

Fix {r >0} and define the hyperbolic plane {\mathbb H^2(r)} by

\displaystyle \begin{array}{rl} \displaystyle \mathbb H^2(r):=\{(x, y, t)\in \mathbb R^{2,1}: \quad t>0,\quad t^2-x^2-y^2=r^2\}, \end{array}

equipped with the induced metric.


Hyperbolic plane

It can be shown that {\mathbb H^2(r)} is a surface with constant curvature {-\frac{1}{r^2}} (analogous to the fact that the sphere of radius {r}, {\mathbb S^2(r)}, is a surface with constant curvature {\frac{1}{r^2}}). It is easy to see that {\mathbb H^2(r)} can be parametrized by

\displaystyle \begin{array}{rl} \displaystyle X(\theta, \phi)= (r\sinh \theta \cos \phi,r \sinh \theta \sin \phi, r\cosh \theta), \quad \theta\ge 0, \quad \phi\in[0, 2\pi],\end{array}

the “polar coordinates” around {P=(0, 0, r)}, with {r\theta} being the (geodesic) distance from {P}.

Finally, let the (infinite) cylinder {C} of radius {r} be defined by

\displaystyle \begin{array}{rl} \displaystyle C= \{(x, y, t)\in \mathbb R^{2,1}: x^2+y^2=r^2\}, \end{array}

again equipped with the induced metric.



There is a natural orthogonal projection {\Pi} from {\mathbb H^2(r)\setminus \{P\}} to {C} defined by

\displaystyle \begin{array}{rl} \displaystyle \Pi(x, y, t)=\left(\frac{r x}{\sqrt{x^2+y^2}}, \frac{r y}{\sqrt{x^2+y^2}}, t\right). \end{array}


The natural projection

In polar coordinates, this is given by

\displaystyle \begin{array}{rl} \displaystyle \Pi(X(\theta, \phi ))=(r\cos \phi, r\sin \phi, r\cosh \theta). \ \ \ \ \ (1)\end{array}

We claim that the map {\Pi} is area-preserving. If this is true, then we can easily calculate the area of the hyperbolic geodesic disk (with radius {rR})

\displaystyle \begin{array}{rl} \displaystyle B_R:=\{X(\theta, \phi): 0\le \theta\le R\}\end{array}

because the projection of {B_R\setminus \{P\}} is exactly the finite cylinder

\displaystyle \begin{array}{rl} \displaystyle \Pi(B_R\setminus \{P\})=\{(x, y, t)\in C: r<t\le r\cosh R\} \end{array}

whose area is easy to calculate. Indeed, the area of this cylinder is exactly the same as that of an ordinary cylinder in the Euclidean space {\mathbb R^3}.


A geodesic disk inside the hyperbolic plane

Now, it is easy to see that in the coordinates {(\theta, \phi)}, the area form of {\mathbb H^2(r)} is given by

\displaystyle \begin{array}{rl} \displaystyle \omega_{\mathbb H^2(r)}= r^2\sinh \theta d\theta\wedge d\phi. \ \ \ \ \ (2)\end{array}

On the other hand, the area form of the cylinder is given by

\displaystyle \begin{array}{rl} \displaystyle \omega_{C}=rdt\wedge d\phi \ \ \ \ \ (3)\end{array}

if {C} is parametrized by {(r\cos \phi, r\sin \phi, t)}.

From (1) and (3), we see that (note that {d\phi(-r \sin \phi, r\cos \phi, 0)=1})

\displaystyle \begin{array}{rl} \displaystyle \omega_C(d \Pi(X_\theta), d\Pi(X_\phi))=r^2\sinh\theta. \end{array}

Comparing with (2), we conclude that {\Pi^*\omega_C=\omega_{\mathbb H^2(r)}}, i.e. {\Pi} is area-preserving. As a corollary, we get

Corollary 1 The area of {B_R} is

\displaystyle \begin{array}{rl} \displaystyle \mathrm{Area}(B_R)=2\pi r^2(\cosh R-1). \end{array}
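
A quick numerical cross-check of Corollary 1 (a sketch of mine): integrate the area form (2) over {0\le \theta\le R}, {0\le \phi\le 2\pi} by a midpoint rule and compare with {2\pi r^2(\cosh R-1)}:

```python
import math

def hyperbolic_disk_area(r, R, n=100000):
    """Midpoint-rule integral of the area form r^2 sinh(theta) dtheta dphi
    over 0 <= theta <= R (the phi-integral contributes a factor 2*pi)."""
    h = R / n
    s = sum(math.sinh((k + 0.5) * h) for k in range(n)) * h
    return 2 * math.pi * r * r * s

# Agrees with the closed form 2*pi*r^2*(cosh(R) - 1) from Corollary 1.
exact = 2 * math.pi * 1.5 ** 2 * (math.cosh(2.0) - 1)
assert math.isclose(hyperbolic_disk_area(1.5, 2.0), exact, rel_tol=1e-6)
```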

Posted in Calculus, Differential geometry, Geometry | Leave a comment

Archimedes and the area of sphere

I record here a remarkable discovery of Archimedes about the formula for the surface area of a sphere. I think the derivation is elementary enough so that it can be taught in high school. (I think I was not taught any derivation of it the first time I learned it, but I derived it myself after learning enough calculus.) The sad thing about our math education is that we are fed with tons of formulas, often without much motivation, that even the good students have to resort to mere memorization. My point is that many formulas taught in high school can be demonstrated in a rather natural and elementary way, which can be helpful to understand or remember them.

About 1800 years prior to the discovery of calculus, Archimedes showed that the surface area of a sphere of radius {r} is {4\pi r^2}. He also showed that the volume of a ball of radius {r} is {\frac{4}{3}\pi r^3} using Cavalieri’s principle, again without calculus (this is explained in the last part of my calculus notes here).


Archimedes (287 –  212 BC)

Let us now describe how he discovered the surface area of a sphere. First, we inscribe the sphere of radius {r} in a cylinder of the same radius and height {2r} as shown.


We claim that the orthogonal projection from the lateral face of the cylinder onto the sphere is area-preserving. This of course can be shown by calculus. Remarkably, Archimedes proved it without using calculus. Unavoidably, this would not be completely rigorous by today’s standards, as we lack a precise definition of area if we avoid calculus, and have to resort to the language of the “infinitesimals”.

The following picture is the cross section of the inscribed sphere along the {x-z} plane, which then becomes a circle of radius {r} inscribed in a square of side {2r}.


Suppose {\Delta \theta} is the “infinitesimal” increment of the angle {\theta}. Then on one hand (if we ignore the {\pm} sign),

\displaystyle \begin{array}{rl} \displaystyle \Delta z=r \Delta \theta \cos \left(\frac{\pi}{2}-\theta\right)=r\Delta \theta \sin \theta. \ \ \ \ \ (1)\end{array}

So the infinitesimal area element of the cylinder is

\displaystyle \begin{array}{rl} \displaystyle \Delta A_{\mathrm{cylinder}} =& \displaystyle 2\pi r \cdot \Delta z\\ =& \displaystyle 2\pi r ^2 \sin \theta \Delta \theta. \end{array}

On the other hand, the infinitesimal area element of the sphere (the area corresponding to the projection of \Delta A_{\mathrm{cylinder}}) is (see Fig. 1)

\displaystyle \begin{array}{rl} \displaystyle \Delta A_{\mathrm{sphere}} =& \displaystyle r \Delta \theta\cdot 2\pi r \sin \theta\\ =& \displaystyle 2\pi r^2 \sin \theta\Delta \theta\\=&\displaystyle \Delta A_{\mathrm{cylinder}}. \end{array}

We conclude that the projection is area-preserving and so the lateral face of the cylinder has the same area as the sphere. So the sphere has area {2\pi r\times 2r=4\pi r^2}.

Exercise: Make the above argument precise by using the language of differential forms. For example, as {z=r\cos \theta}, we have {dz=-r\sin \theta d\theta}, which (up to a sign) is the rigorous version of (1).
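
One can also confirm Archimedes’ claim numerically (a sketch of mine, using the calculus formula for a surface of revolution rather than infinitesimals): the spherical zone {z_1\le z\le z_2} should have the same area {2\pi r(z_2-z_1)} as the corresponding band on the cylinder.

```python
import math

def spherical_zone_area(r, z1, z2, n=200000):
    """Area of the zone z1 <= z <= z2 on the sphere of radius r, computed as
    a surface of revolution: A = int 2*pi*x(z)*sqrt(1 + x'(z)^2) dz with
    x(z) = sqrt(r^2 - z^2).  The integrand simplifies to 2*pi*r, which is
    exactly Archimedes' observation."""
    h = (z2 - z1) / n
    total = 0.0
    for k in range(n):
        z = z1 + (k + 0.5) * h
        x = math.sqrt(r * r - z * z)
        total += 2 * math.pi * x * math.sqrt(1 + (z / x) ** 2) * h
    return total

# Agrees with the cylindrical band of the same height, 2*pi*r*(z2 - z1).
assert math.isclose(spherical_zone_area(1.0, -0.5, 0.5),
                    2 * math.pi * 1.0 * 1.0, rel_tol=1e-9)
```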

Posted in Calculus, Geometry | Leave a comment

Zeros of random polynomials

Given a polynomial p(z) = a_0+a_1z+\cdots + a_nz^n, where the coefficients are random, what can we say about the distribution of the roots (on \mathbb{C})? Of course, it would depend on what “random” means. Here, “random” means that the sequence (a_n) is an i.i.d. sequence of complex random variables.

It turns out that under a rather weak condition on (a_n) , as n\to\infty the roots will tend to be distributed on the unit circle! (There are lots of interesting discussions here, which explain why this should be true intuitively.)

I will give a rigorous proof (and a rigorous formulation) of this result. For simplicity, we will assume that (a_n) are complex standard Gaussians. That is, for any Borel set B\subseteq \mathbb{C}, we have

\displaystyle \mathbf{P}(a_j\in B) = \frac{1}{\pi} \int_B e^{-|z|^2}\,\text{d}m(z),

where m is the Lebesgue measure on the complex plane. We will also assume two basic potential theoretic results without proofs, namely

\displaystyle \Delta \max\{0,\log|z|\} = \frac{1}{2\pi}\text{d}\theta \quad \text{and} \quad \Delta \frac{1}{2\pi} \log|z| = \delta_0.

For p_n(z) = \sum_{j=0}^n a_jz^j = a_n\prod_{j=1}^n (z-\zeta_j) , we write Z_{p_n} = \frac{1}{n}\sum_{j=1}^n \delta_{\zeta_j}, the normalized counting measure.  We define the expected normalized counting measure, \mathbf{E}[Z_{p_n}], as follows. For any \psi\in C_c(\mathbb{C}),

\displaystyle \begin{array}{rl} \displaystyle (\mathbf{E}[Z_{p_n}],\psi) &:= \mathbf{E}[(Z_{p_n}, \psi)]\\  &\displaystyle =\frac{1}{\pi^{n+1}}\int_{\mathbb{C}^{n+1}} \frac{1}{n}\sum_{j=1}^n \psi(\zeta_j)e^{-\sum_{j=0}^n |a_j|^2} \,\text{d}m(a_0)\cdots\text{d}m(a_n). \end{array}

Intuitively, this \mathbf{E}[Z_{p_n}] tells us how the zeros of p_n are distributed “on average”.

Theorem. We have

\displaystyle \lim_{n\to\infty} \mathbf{E}[Z_{p_n}] = \frac{1}{2\pi}\text{d}\theta

in weak*-topology.
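
Before the proof, here is a quick simulation (a sketch; the use of numpy.roots and the specific thresholds are my choices) showing that the clustering near the unit circle is already visible at moderate degree:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
# Complex standard Gaussian coefficients a_0, ..., a_n (density e^{-|z|^2}/pi,
# i.e. real and imaginary parts are N(0, 1/2)).
a = (rng.standard_normal(n + 1) + 1j * rng.standard_normal(n + 1)) / np.sqrt(2)
roots = np.roots(a[::-1])     # np.roots expects the leading coefficient first
radii = np.abs(roots)

# Most of the n roots already lie in a thin annulus around |z| = 1.
assert np.median(np.abs(radii - 1)) < 0.1
assert np.mean((radii > 0.7) & (radii < 1.4)) > 0.8
```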

Continue reading

Posted in Algebra, Complex analysis, Potential theory, Probability | Leave a comment

Hopf fibration double covers circle bundle of sphere

Two days ago, I gave a seminar talk on Chern‘s proof of the generalized Gauss-Bonnet theorem. Here I record the answer to a question asked by one of my colleagues during the talk. Although not directly related to the proof of the generalized Gauss-Bonnet theorem, I think it’s quite interesting in itself.

Let me first give some background before stating the question.

Roughly speaking, the idea of the proof goes like this: let’s assume {n=2} for simplicity. Usually the curvature form (or more appropriately, the Pfaffian of the curvature form) is not exact, for otherwise the integral of the curvature form on a closed surface would be zero. However, Chern observed that the pullback of the curvature form onto the unit sphere bundle {SM} is exact, and a smooth non-degenerate vector field {X} on {M} naturally induces a diffeomorphism from {M'} onto a submanifold {\Sigma} in {SM}. Here {M'} is the open subset of {M} where {X\ne 0}. By pulling back the curvature form onto {\Sigma} and applying the Stokes theorem, we can localize the curvature integral into a sum of line integrals on small loops around the singularities of {X}, which turn out to give the sum of the indices of the vector field. Finally, by the Poincaré-Hopf theorem, this gives the Euler characteristic of {M}.

While introducing the concept of the unit sphere bundle, I was asked by one of my colleagues whether the unit sphere bundle of {\mathbb S^2} is the Hopf fibration. I didn’t know the answer at that time. But then I thought about it again and found that the answer is quite obviously no. However, I found it quite interesting that the Hopf fibration is actually the double cover of {S(\mathbb S^2)}. I think this is a good exercise in geometry and I am recording it here.

For a Riemannian manifold {M^n}, the unit sphere bundle {SM} is defined to be

\displaystyle \begin{array}{rl} \displaystyle   SM:=\{(x, v): x\in M, v\in T_xM, |v|=1\} \end{array}

with projection {\pi_S: (x, v)\mapsto x}. This is an {\mathbb S^{n-1}}-bundle over {M}. In particular, if {M} is two-dimensional, then {SM} is a circle bundle.

Proposition 1 The circle bundle {S(\mathbb S^2)} is diffeomorphic to {SO(3)}.

Proof: We regard {\mathbb S^2\subset \mathbb R^3} as the standard unit sphere, and regard {x\in \mathbb S^2} as a column vector. We can define {\Phi: S(\mathbb S^2)\rightarrow SO(3)} by

\displaystyle \begin{array}{rl} \displaystyle   \Phi(x, v):=\left[x, v, x\times v\right], \end{array}

where {x\times v} is the cross product of {x} and {v} in {\mathbb R^3}. Then

\displaystyle \begin{array}{rl} \displaystyle   \Phi^{-1}\left(\left[v_1\;v_2\;v_3\right]\right)=(v_1, v_2). \end{array}

Clearly this is a diffeomorphism. \Box
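
The proof’s formula can be checked numerically in a few lines (a sketch of mine): for a random {x\in\mathbb S^2} and a random unit tangent vector {v\perp x}, the matrix {\left[x, v, x\times v\right]} is orthogonal with determinant {1}.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal(3)
x = x / np.linalg.norm(x)            # a point on S^2
w = rng.standard_normal(3)
v = w - np.dot(w, x) * x             # project a random vector onto T_x S^2
v = v / np.linalg.norm(v)            # a unit tangent vector at x

Phi = np.column_stack([x, v, np.cross(x, v)])
assert np.allclose(Phi.T @ Phi, np.eye(3))   # orthogonal matrix
assert np.isclose(np.linalg.det(Phi), 1.0)   # orientation-preserving, so SO(3)
```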

In particular, {S(\mathbb S^2)\cong SO(3)} has fundamental group {\mathbb Z/2\mathbb Z}, with {\mathrm{Spin}(3)=Sp(1)} as its double cover (cf. here).

Recall that the Hopf fibration is given by the quotient of the action {\mathbb S^1=\{\alpha\in \mathbb C: |\alpha|=1\}} on {\mathbb S^3=\{z=(z_1,z_2)\in\mathbb C^2: |z|=1\}}, where the action is given by {\alpha\cdot (z_1,z_2)=(\alpha z_1, \alpha z_2)}. It is clear that the quotient space is {\mathbb CP^1}, which is diffeomorphic to {\mathbb S^2}. The Hopf fibration is defined to be this quotient: {\pi: \mathbb S^3\rightarrow \mathbb CP^1\cong \mathbb S^2}.

From the above discussion, it is clear that {\pi_S: S(\mathbb S^2)\rightarrow \mathbb S^2} is not the Hopf fibration. In fact, as {\mathbb S^3} is simply connected, it is not diffeomorphic to {S(\mathbb S^2)}. However, the Hopf fibration can be regarded as the double cover of {\pi_S: S(\mathbb S^2)\rightarrow \mathbb S^2}, in the sense that this diagram commutes

Here {p} is the double covering map from {Sp(1)=\mathrm{Spin}(3)} to {SO(3)}. To see this, first identify {\mathbb S^3} with the compact symplectic group {Sp(1)=\{q\in \mathbb H: |q|=1\}}, where {\mathbb H=\{a+bi+cj+dk: a, b, c, d\in \mathbb R\}} is the set of quaternions, with {i^2=j^2=k^2=-1}. We identify {\mathbb R^3} with the space of purely imaginary quaternions {\{bi+cj+dk\}}. Then we define {p(q)} by

\displaystyle \begin{array}{rl} \displaystyle   p(q): v\mapsto q^* v q, \textrm{ where } v \textrm{ is a purely imaginary quaternion}. \end{array}

It is clear that {p(q)\in SO(3)}. Under this identification, {\pi_S(p(q))=q^* i q}, as we identify {i} with {e_1}. Let {q=a+bi+cj+dk=z_0+z_1j\in Sp(1)}, where {z_0, z_1\in \mathbb C}. Suppose {z_0=a+bi}, {z_1=c+di}. Then by a direct calculation (done by Mathematica here)

\displaystyle \begin{array}{rl} \displaystyle   \pi_S(p(q))=q^* i q =(a^2+b^2-c^2-d^2)i+2(bc-ad)j+2(ac+bd)k.  \ \ \ \ \ (1)\end{array}

On the other hand, for {q=a+bi+cj+dk=z_0+z_1j\in Sp(1)}, {\pi} can be defined to be

\displaystyle \begin{array}{rl} \displaystyle   \pi(q)=\pi(z_0, z_1):= (|z_0|^2-|z_1|^2, 2\mathrm{Im}(z_0z_1^*), 2\mathrm{Re}(z_0z_1^*)). \end{array}

Expanding the above, we have

\displaystyle \begin{array}{rl} \displaystyle   (|z_0|^2-|z_1|^2, 2z_0z_1^*)=(a^2+b^2-c^2-d^2, 2(bc-ad), 2(ac+bd)). \end{array}

Comparing with (1), we have proved the commutativity.
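
The direct calculation (1) can also be replayed without Mathematica (a sketch of mine; the multiplication routine below is the standard Hamilton product):

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return np.array([a1*a2 - b1*b2 - c1*c2 - d1*d2,
                     a1*b2 + b1*a2 + c1*d2 - d1*c2,
                     a1*c2 - b1*d2 + c1*a2 + d1*b2,
                     a1*d2 + b1*c2 - c1*b2 + d1*a2])

rng = np.random.default_rng(3)
q = rng.standard_normal(4)
q = q / np.linalg.norm(q)             # a random element of Sp(1)
a, b, c, d = q
q_conj = np.array([a, -b, -c, -d])
i_quat = np.array([0.0, 1.0, 0.0, 0.0])

lhs = qmul(qmul(q_conj, i_quat), q)   # q* i q
rhs = np.array([0.0,
                a*a + b*b - c*c - d*d,   # i-component, as in (1)
                2*(b*c - a*d),           # j-component
                2*(a*c + b*d)])          # k-component
assert np.allclose(lhs, rhs)
```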

Posted in Algebra, Differential geometry, Group theory | 3 Comments

Euler’s formula e^ix = cos x + i sin x: a geometric approach

Today I mentioned the famous Euler’s formula briefly in my calculus class (when discussing hyperbolic functions, lecture notes here):

\displaystyle \begin{array}{rl} \displaystyle   \boxed{e^{ix}=\cos x+ i \sin x}  \ \ \ \ \ (1)\end{array}

where {i\in \mathbb C} is a solution to {z^2=-1} (usually denoted by “{i=\sqrt{-1}}”, but indeed there is no single-valued square root for complex numbers, or even negative real numbers).

One of the usual ways to derive this formula is by comparing the power series of the exponential function and the trigonometric functions {\sin} and {\cos}:

\displaystyle \begin{array}{rl} \displaystyle   \begin{cases} e^z=& \displaystyle 1+\frac{z^1}{1!}+\frac{z^2}{2!} +\frac{z^3}{3!} +\frac{z^4}{4!}+\cdots\\ \sin x=& \displaystyle  \frac{x^1}{1!} -\frac{x^3}{3!} +\frac{x^5}{5!}-\cdots\\ \cos x=& \displaystyle 1 -\frac{x^2}{2!} +\frac{x^4}{4!}-\cdots \end{cases} \end{array}

Putting {z=ix} in the first expansion and comparing with the remaining two, it’s easy to see that

\displaystyle \begin{array}{rl} \displaystyle   e^{ix} =& \displaystyle  \left(1-\frac{x^2}{2!} +\frac{x^4}{4!}-\cdots\right) +i\left(\frac{x^1}{1!}-\frac{x^3}{3!}+\frac{x^5}{5!}-\cdots\right)\\ =& \displaystyle \cos x+i\sin x. \end{array}

Here, I am going to give another approach which does not require any knowledge of series (thus avoiding the problem of convergence), but only basic knowledge of complex numbers (complex addition and multiplication). I am sure this approach must have been taken before, but I couldn’t find a suitable reference, especially an online one.

Let us agree that we define the Euler’s number to be

\displaystyle \begin{array}{rl} \displaystyle   e:=\lim_{n\rightarrow \infty}\left(1+\frac{1}{n}\right)^n. \end{array}

From this it is easy to see that

\displaystyle \begin{array}{rl} \displaystyle   e^x=\lim_{n\rightarrow \infty}\left(1+\frac{x}{n}\right)^n\textrm{ for }x\in \mathbb R. \end{array}

It is then natural to define the complex exponential function by

\displaystyle \begin{array}{rl} \displaystyle   e^z :=\lim_{n\rightarrow \infty}\left(1+\frac{z}{n}\right)^n\textrm{ for }z\in \mathbb C. \end{array}

Here I am cheating a little bit because I have implicitly assumed that this limit exists.

Now recall the geometry of the complex plane. We can identify a complex number {x+iy} with the point {(x, y)} on the plane. We can write a complex number in its polar form {z=r(\cos \theta+i \sin\theta)}, which is identified with {(r, \theta)} in polar coordinates. We call {r=|z|} and {\theta=\mathrm{arg}(z)} the modulus (or length) and the argument (or angle) of {z} respectively.

The complex plane

The complex multiplication of {z_1=r_1(\cos \theta_1+i\sin \theta_1)} and {z_2=r_2(\cos \theta_2+i\sin \theta_2)} is then

\displaystyle \begin{array}{rl} \displaystyle   z_1z_2 =r_1r_2\left(\cos (\theta_1+\theta_2)+i\sin (\theta_1+\theta_2)\right). \end{array}

i.e. the modulus of {z_1z_2} is the product of the two moduli and the argument of {z_1z_2} is the sum of the two arguments.

So now, let’s fix {x\in \mathbb R} and compute

\displaystyle \begin{array}{rl} \displaystyle   \lim_{n\rightarrow \infty}\left(1+\frac{ix}{n}\right)^n, \end{array}

which by definition would be {e^{ix}}. We will argue that its length is {1} and its argument is {x}, i.e. (1) holds:

\displaystyle \begin{array}{rl} \displaystyle   e^{ix}=\cos x+i \sin x.  \ \ \ \ \ (2)\end{array}

The geometry of the powers of 1+yi

To see this, let {z_n=1+\frac{ix}{n}}. Then {|z_n|=\left(1+\frac{x^2}{n^2}\right)^{\frac{1}{2}}} and so

\displaystyle \begin{array}{rl} \displaystyle   |z_n|^n=\left(1+\frac{x^2}{n^2}\right)^{\frac{n}{2}}  \end{array}

From this we have

\displaystyle \begin{array}{rl} \displaystyle   |({z_n})^n| =|z_n|^n =\left(1+\frac{x^2}{n^2}\right)^{\frac{n}{2}} =& \displaystyle \left[\left(1+\frac{x^2}{n^2}\right)^{n^2}\right]^{\frac{1}{2n}}\\ \rightarrow& \displaystyle \left(e^{x^2}\right)^{0}=1  \ \ \ \ \ (3)\end{array}

as {n\rightarrow \infty}. On the other hand, the argument of {z_n} (which is well-defined up to a multiple of {2\pi}) can be chosen to be

\displaystyle \begin{array}{rl} \displaystyle   \arg(z_n)=\theta_n=\tan^{-1}\left(\frac{x}{n}\right). \end{array}

Then by the L’Hôpital’s rule,

\displaystyle \begin{array}{rl} \displaystyle   \lim_{n\rightarrow \infty} n\theta_n= \lim_{n\rightarrow \infty} n\tan^{-1}\left(\frac{x}{n}\right) =\lim_{t\rightarrow 0^+} \frac{\tan^{-1}(tx)}{t} =\lim_{t\rightarrow 0^+} \frac{x}{1+t^2x^2}=x. \end{array}

So we have

\displaystyle \begin{array}{rl} \displaystyle  \lim_{n\rightarrow \infty} \arg [(z_n)^n ]= \lim_{n\rightarrow \infty} n\arg(z_n)= \lim_{n\rightarrow \infty} n\theta_n=x. \ \ \ \ \ (4)\end{array}

Combining (3) and (4), we have

\displaystyle \begin{array}{rl} \displaystyle   e^{ix}:=\lim_{n\rightarrow \infty}\left(1+\frac{ix}{n}\right)^n=\lim_{n\rightarrow \infty}(z_n)^n= \cos x+i\sin x. \end{array}
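
One can watch the limit converge numerically (a small sketch of mine):

```python
import math

def euler_approx(x, n):
    """(1 + ix/n)^n, which converges to cos(x) + i sin(x) as n -> infinity."""
    return (1 + 1j * x / n) ** n

x = 1.3
exact = complex(math.cos(x), math.sin(x))
# The dominant error comes from the modulus and shrinks like x^2 / (2n).
assert abs(euler_approx(x, 10**6) - exact) < 1e-4
assert abs(euler_approx(x, 10**3) - exact) > abs(euler_approx(x, 10**6) - exact)
```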

Added Nov 12, 2017:
I found a video explaining e^{i\pi}=-1 (but without giving the full mathematical details) in the above approach:

Posted in Analysis, Calculus, Complex analysis, Geometry | Leave a comment

An inequality for functions on the plane

I accidentally came across a curious inequality for functions of two variables. I would like to know if this inequality is a special case of a more general result but I was unable to find a reference. It would also be interesting if applications can be found for this inequality.

Theorem 1 For any nontrivial {C^2} function {f=f(x,y)} on {\mathbb R^2} such that {[c_1,c_2]\subset f(\mathbb R^2)} and {f^{-1}([c_1,c_2])} is compact, we have

\displaystyle \begin{array}{rl} \displaystyle  \int_{\{c_1\le f\le c_2\}} \frac{|f_{xx}{f_y}^2-2f_{xy}f_xf_y+f_{yy}{f_x}^2|}{|\nabla f|^2} dxdy \ge 2 \pi (c_2-c_1). \ \ \ \ \ (1)\end{array}

Here we define the integrand to be zero if {\nabla f=0}.

In particular, if {f} is compactly supported, then

\displaystyle \begin{array}{rl} \displaystyle   \int_{\mathbb R^2} \frac{|f_{xx}{f_y}^2-2f_{xy}f_xf_y+f_{yy}{f_x}^2|}{|\nabla f|^2} dxdy \ge 2 \pi \left(\max f-\min f\right). \end{array}
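
For a radial function {f=g(\rho)}, a direct computation (done by hand; it is not in the post) reduces the integrand to {|g'(\rho)|/\rho}, so the left-hand side becomes {2\pi\int_0^\infty |g'(\rho)| d\rho}. The sketch below tests this on the Gaussian {f(x,y)=e^{-x^2-y^2}} (not compactly supported, but decaying fast enough for the check), for which both sides equal {2\pi}:

```python
import math

# For f = g(rho) with g(rho) = exp(-rho^2) we have g'(rho) = -2*rho*exp(-rho^2),
# so the left-hand side is 2*pi * int_0^infty 2*rho*exp(-rho^2) d(rho) = 2*pi.
n, R_MAX = 200000, 10.0
h = R_MAX / n
lhs = 2 * math.pi * sum(2 * r * math.exp(-r * r)
                        for r in ((k + 0.5) * h for k in range(n))) * h
rhs = 2 * math.pi * (1.0 - 0.0)      # 2*pi*(max f - min f)
assert lhs >= rhs - 1e-6             # the inequality, with equality in this case
assert math.isclose(lhs, rhs, rel_tol=1e-6)
```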

Continue reading

Posted in Analysis, Geometry, Inequalities | Leave a comment