On integrating products of dot products

Today KKK showed me the following formula:

There exists a constant C > 0 such that

\int \langle u, x \rangle \langle v, x \rangle = C \langle u, v \rangle

for all u, v \in {\mathbb{R}}^d. Here the integral is either over the unit sphere {\mathbb{S}}^{d-1} or over the ball B(0, 1) (of course the corresponding constants are different). In this note we explain this result in a probabilistic way and give more examples.

Exercise: Find the constants in both cases.

An obvious generalization of the above formula is as follows. Here we write x = (x_1, ..., x_d) for x \in {\mathbb{R}}^d.

Proposition 1. Let \mu be a Borel probability measure on {\mathbb{R}}^d such that

\int_{{\mathbb{R}}^d} x_ix_j d\mu(x) = \delta_{ij}.

Then for all u, v \in {\mathbb{R}}^d, we have

\int_{{\mathbb{R}}^d} \langle u, x \rangle \langle v, x \rangle d\mu(x) = \langle u, v \rangle.

Proof. We calculate directly:

\int_{{\mathbb{R}}^d} \langle u, x \rangle \langle v, x \rangle d\mu(x)

= \sum_{i = 1}^d \sum_{j = 1}^d u_iv_j \int_{{\mathbb{R}}^d} x_ix_j d\mu(x)

= \sum_{i = 1}^d \sum_{j = 1}^d u_iv_j \delta_{ij} = \langle u, v \rangle. \Box
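
As a sanity check, here is a small numerical sketch of Proposition 1 for a discrete measure. The measure used (mass 1/(2d) at each of the 2d points \pm\sqrt{d} e_k) is just an illustrative choice, and the helper names are made up for this example.

```python
import numpy as np

def second_moment_matrix(points, weights):
    """Matrix of the integrals of x_i x_j against a discrete measure."""
    # Computes sum_k w_k x^(k) (x^(k))^T, where the rows of `points` are the atoms x^(k).
    return (points * weights[:, None]).T @ points

def bilinear_integral(u, v, points, weights):
    """Integral of <u, x><v, x> against the same discrete measure."""
    return float(np.sum(weights * (points @ u) * (points @ v)))

d = 4
# Illustrative measure: mass 1/(2d) at each of the 2d points +-sqrt(d) e_k.
points = np.sqrt(d) * np.vstack([np.eye(d), -np.eye(d)])
weights = np.full(2 * d, 1.0 / (2 * d))

rng = np.random.default_rng(0)
u, v = rng.standard_normal(d), rng.standard_normal(d)

M = second_moment_matrix(points, weights)        # should be the identity
lhs = bilinear_integral(u, v, points, weights)   # integral of <u, x><v, x>
rhs = float(u @ v)                               # <u, v>
print(np.allclose(M, np.eye(d)), np.isclose(lhs, rhs))
```

The hypothesis is checked first (M is the identity matrix), and then the two sides of the conclusion are compared for a random pair u, v.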

We observe that the converse is also true (why?).

To get KKK’s formula from Proposition 1, simply check that the condition is satisfied. This is an exercise in integration. (Hint: Use symmetry!)

Now we translate the above into a probabilistic framework.

Let X be a real random variable. The expectation of X is defined as

{\mathbb{E}}(X) = \int X d{\mathbb{P}}

provided that the integral exists absolutely. The variance of X is defined as

Var(X) = {\mathbb{E}}[(X - {\mathbb{E}}(X))^2]

provided the expectation exists absolutely. See Wikipedia for more details about these definitions.

Definition. Let X and Y be real random variables with finite variances. The covariance between X and Y is defined as

Cov(X, Y) = {\mathbb{E}}[(X - {\mathbb{E}}(X))(Y - {\mathbb{E}}(Y))].

We say that X and Y are uncorrelated if Cov(X, Y) = 0.

Note that Cov is bilinear and Cov(X, X) = Var(X) (it is basically the L^2-inner product). We remark that if X and Y are independent, then they are uncorrelated. The converse is not true. Statistically, Cov(X, Y) = 0 implies that there are no “linear relationships” between X and Y. This statement can be made precise using orthogonal projection. In this note, all random variables have expectation 0. In this case the formula becomes
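
To illustrate that uncorrelated does not imply independent, here is a tiny exact computation. The example (X uniform on {-1, 0, 1} and Y = X^2) is an illustrative choice and not part of the argument.

```python
from fractions import Fraction

# Illustrative example: X is uniform on {-1, 0, 1} and Y = X^2.
support = [-1, 0, 1]
p = Fraction(1, 3)

def expect(f):
    """Exact expectation of f(X) for X uniform on the support."""
    return sum(p * f(x) for x in support)

mean_x = expect(lambda x: x)
mean_y = expect(lambda x: x * x)
cov_xy = expect(lambda x: (x - mean_x) * (x * x - mean_y))

# Independence would force P(X = 0, Y = 0) = P(X = 0) P(Y = 0).
p_joint = p          # the event {X = 0} is the same as {X = 0, Y = 0}
p_product = p * p    # P(X = 0) = P(Y = 0) = 1/3
print(cov_xy, p_joint == p_product)
```

The covariance is exactly zero, yet the joint probability differs from the product of the marginals, so X and Y are not independent.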

Cov(X, Y) = {\mathbb{E}}(XY).

Let X = (X_1, ..., X_d) be a random vector on {\mathbb{R}}^d, i.e. each X_i is a random variable. The distribution of X is the (Borel) probability measure \mu on {\mathbb{R}}^d defined by

\mu(B) = {\mathbb{P}}\{(X_1, ..., X_d) \in B\}

where B is any Borel set in {\mathbb{R}}^d. We also note the following basic result: If g is a real measurable function on {\mathbb{R}}^d, then

{\mathbb{E}}(g(X)) = \int_{{\mathbb{R}}^d} g(x) d\mu(x)

provided the integrals exist. (Actually the existence of one integral implies the existence of the other.)

With the above notations, we can rephrase Proposition 1 as follows:

Proposition 2 (Probabilistic version of Proposition 1). Consider a random vector X = (X_1, ..., X_d) on {\mathbb{R}}^d with distribution \mu. Suppose that {\mathbb{E}}(X_i) = 0 for all i and

Cov(X_i, X_j) = \delta_{ij}.

Then for all u, v \in {\mathbb{R}}^d, we have

\int_{{\mathbb{R}}^d} \langle u, x \rangle \langle v, x \rangle d\mu(x) = \langle u, v \rangle.

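Proof. Since each X_i has mean 0, so do \langle u, X \rangle and \langle v, X \rangle. Hence

\int_{{\mathbb{R}}^d} \langle u, x \rangle \langle v, x \rangle d\mu(x) = {\mathbb{E}}(\langle u, X \rangle \langle v, X \rangle) = Cov(\langle u, X \rangle, \langle v, X \rangle) = \sum_{i = 1}^d \sum_{j = 1}^d u_i v_j Cov(X_i, X_j) = \langle u, v \rangle. \Box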

The idea is that checking \int_{{\mathbb{R}}^d} x_ix_j d\mu(x) = \delta_{ij} is equivalent to checking that certain random variables are uncorrelated, and sometimes the probabilistic way is more intuitive.

All these are quite trivial. Now we give some examples.

Example 3. For all u, v \in {\mathbb{R}}^d,

\int_{{\mathbb{R}}^d} \langle u, x \rangle \langle v, x \rangle \frac{1}{(2\pi)^{d/2}} \exp(-\frac{1}{2} |x|^2) dx = \langle u, v \rangle.

To see this, note that d\mu(x) = \frac{1}{(2\pi)^{d/2}} \exp(-\frac{1}{2} |x|^2) dx is the normal distribution on {\mathbb{R}}^d with mean vector 0 and covariance matrix I_d (identity). It follows that if X has this distribution, then X_1, ..., X_d are independent and identically distributed with mean 0 and variance 1.
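
The Gaussian identity can also be checked by simulation. The following is only a rough Monte Carlo sketch; the dimension, sample size, and the vectors u, v are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 3, 200_000

# Rows are independent draws from the standard normal distribution N(0, I_d).
X = rng.standard_normal((n, d))

u = np.array([1.0, -2.0, 0.5])
v = np.array([0.3, 1.0, 2.0])

# Sample mean of <u, X><v, X>, to be compared with <u, v>.
estimate = float(np.mean((X @ u) * (X @ v)))
exact = float(u @ v)
print(estimate, exact)
```

The two printed numbers should agree up to Monte Carlo error, which shrinks like 1/\sqrt{n}.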

Example 4. To get KKK’s identity for the spherical integral, we let X = (X_1, ..., X_d) be uniformly distributed on the unit sphere {\mathbb{S}}^{d-1}. Note that X_1, ..., X_d have the same distribution (but they are not independent, since X_1^2 + ... + X_d^2 = 1). By Proposition 2 (and scaling), we only need to show that X_i and X_j are uncorrelated for i \neq j. Of course this can be interpreted as an integration problem; here we give a probabilistic trick.

Now X is uniformly distributed on {\mathbb{S}}^{d-1}. We can construct a strictly positive random variable R, independent of X, such that

Z = (Z_1, ..., Z_d) := RX = (RX_1, ..., RX_d)

is standard normal as in Example 3. (The idea is similar to that of the Box–Muller transform in statistics. Roughly speaking, we first choose the length of Z, and then randomly pick a direction.)

Then, for i \neq j, by independence of Z_i and Z_j (this follows from a standard theorem in statistics) we have

0 = {\mathbb{E}}(Z_iZ_j) = {\mathbb{E}}(RX_iRX_j).

On the other hand, since R is independent of X by construction, we get

{\mathbb{E}}(RX_iRX_j) = {\mathbb{E}}(R^2) {\mathbb{E}}(X_iX_j).

Since {\mathbb{E}}(R^2) > 0, it follows that {\mathbb{E}}(X_iX_j) = 0 as well. By scaling X appropriately we can use Proposition 2 (note here that 0 < {\mathbb{E}}(X_i^2) = \frac{1}{d} < 1, since X_1^2 + ... + X_d^2 = 1 and the X_i are identically distributed).
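
The decomposition Z = RX can be run in reverse to simulate the uniform distribution on the sphere: draw a standard normal Z and normalize it. The sketch below uses this standard fact (it is an assumption of the example, not something proved in this note) to estimate {\mathbb{E}}(X_iX_j) numerically. The dimension and sample size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 4, 200_000

Z = rng.standard_normal((n, d))     # standard normal vectors
R = np.linalg.norm(Z, axis=1)       # their lengths
X = Z / R[:, None]                  # directions Z / |Z|, points on the unit sphere

# Estimate the matrix of second moments E(X_i X_j).
second_moments = X.T @ X / n
print(np.round(second_moments, 3))
```

The off-diagonal entries should be close to 0 and the diagonal entries close to 1/d, up to Monte Carlo error.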

Example 5. For KKK’s identity for the volume integral (over the ball), we can use an argument similar to that of Example 4, but with another R chosen so that RX has the correct distribution.
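
For the ball, one standard choice (assumed here, not spelled out in this note) is R = U^{1/d} with U uniform on [0, 1] and independent of X, so that {\mathbb{P}}(R \leq r) = r^d. A rough simulation sketch, with arbitrary dimension and sample size:

```python
import numpy as np

rng = np.random.default_rng(2)
d, n = 3, 200_000

Z = rng.standard_normal((n, d))
X = Z / np.linalg.norm(Z, axis=1)[:, None]   # uniform direction on the sphere
R = rng.random(n) ** (1.0 / d)               # radius with P(R <= r) = r^d
Y = R[:, None] * X                           # candidate uniform point in B(0, 1)

# Estimate the matrix of second moments E(Y_i Y_j).
second_moments = Y.T @ Y / n
off_diagonal = second_moments - np.diag(np.diag(second_moments))
print(np.round(second_moments, 3))
```

The estimated matrix should be close to a multiple of the identity. The common diagonal value is the constant asked for in the exercise above, so it is left for the reader to compare.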

4 Responses to On integrating products of dot products

  1. KKK says:

    I do not quite understand the first equal sign of the proof of proposition 2, do you require
    \mathbb{E}(\langle u, X\rangle)= \mathbb{E}(\langle v, X\rangle)=0?

  2. KKK says:

    I also have a proof which is in some sense parallel to the proof of Proposition 1, but it’s only for the “standard” sphere and ball case, so Wongting’s result is much more general.
    In another direction, I have a second proof (this time for the sphere case only, but the ball case follows by integrating the sphere case) using divergence theorem:
    Note that x is the unit outward normal n(x) to \mathbb{S}^{n-1} at x. Define the vector field V(x)=\langle u, x\rangle v on \mathbb{R}^n. Then
    \int_{x\in\mathbb{S}^{n-1}} \langle u, x\rangle \langle v, x\rangle dS(x)=\int_{x\in\mathbb{S}^{n-1}} V(x)\cdot n(x) dS(x)=\int_{\mathbb{B}_1(0)} \text{div }V(x)dx.
    It is easy to see that \text{div }V(x)=\langle \nabla(\langle u, x\rangle ),v\rangle=\langle u,v\rangle (or, write it in coordinates and apply divergence), and hence the result. (Here we regard \langle u, x\rangle as a function in x\in \mathbb{R}^n, whose gradient is of course u. )

  3. wongting says:

    Yes. In Proposition 2 I forgot to mention that the random variables are assumed to have mean 0.

  4. KKK says:

    I now have a third proof, which seems more natural to me.
    Define \displaystyle L(u,v)=\int_{\mathbb{S}^{n-1}} \langle u,x\rangle \langle v, x\rangle dS(x).
    Then clearly {L} is a symmetric bilinear form. We first claim that \displaystyle L(Au, Av)=L(u,v)\text{ for all }A\in SO(n).
    \displaystyle  \begin{array}{rcl}  \int_{\mathbb{S}^{n-1}} \langle Au,x\rangle \langle Av, x\rangle dS(x) &=&\int_{\mathbb{S}^{n-1}} \langle u,A^tx\rangle \langle v, A^tx\rangle dS(x)\\ &=&\int_{\mathbb{S}^{n-1}} \langle u,A^tx\rangle \langle v, A^tx\rangle dS(A^tx)\\ &=&\int_{\mathbb{S}^{n-1}} \langle u,y\rangle \langle v, y\rangle dS(y). \end{array}
    The second equality is due to the invariance of the spherical measure under {SO(n)}. Now, let {C= L(e_1, e_1)}. We claim that \displaystyle L(u,v)=C \langle u,v\rangle.
    This is clearly true for {(u,v)=(e_1, e_1)}, and thus by applying a suitable {A\in SO(n)} and a scaling, {L(u,u)=C\langle u,u\rangle} for all {u}. Finally, \displaystyle 2L(u,v)= L(u+v, u+v)-L(u,u)-L(v,v)= C(\langle u+v, u+v\rangle-\langle u, u\rangle-\langle v, v\rangle)=2C\langle u,v\rangle.

    Remark: this proof can also be extended to Wongting’s second theorem, with some modification.
