A proof of the Michael-Simon-Sobolev inequality

The Michael-Simon-Sobolev inequality states that

 Theorem 1 Let ${M^m}$ (${m\ge 2}$) be a smooth immersed submanifold in ${\mathbb R^n}$ and let ${f}$ be a non-negative smooth function on ${M}$ with compact support. Then $\displaystyle C_m \|f\|_{\frac{m}{m-1}}\le \|\nabla f\|_{1}+ \|fH\|_{1}. \ \ \ \ \ (1)$ Here ${\|f\|_p= (\int_M |f|^p)^{\frac 1p}}$ is the ${L^p}$ norm of ${f}$ on ${M}$, ${H}$ is the mean curvature vector of ${M}$ and ${C_m}$ depends on ${m}$ only.

In our convention, the mean curvature vector of ${\mathbb S^2\subset \mathbb R^3}$ is ${-2x}$ where ${x}$ is the position.

 Remark 1 Of course, by completion the above inequality also holds on ${W^{1,1}_{0}(M)}$ (the completion of ${C^\infty_0}$ w.r.t. the ${W^{1,1}}$ norm). When ${M}$ is ${\mathbb R^m}$ (or a subset of it), it becomes the ordinary Sobolev inequality $\displaystyle C_m \|f\|_{\frac{m}{m-1}}\le \|\nabla f\|_{1}. \ \ \ \ \ (2)$ The term ${\|fH\|_1}$ can’t be dropped in general when ${M}$ has non-zero curvature. E.g. when ${M}$ is the sphere of radius ${r}$ in ${\mathbb R^{m+1}}$ and ${f=1}$, the left-hand side of (1) is of order ${r^{m-1}}$, whereas ${\|\nabla f\|_1=0}$. As the mean curvature of the sphere is of order ${1/r}$, the ${\|fH\|_1}$ term is also of order ${r^{m-1}}$, so this inequality is quite “reasonable”.
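The scaling in the sphere example can be checked with closed-form quantities. The following is only a small sanity-check sketch, not part of the proof: the helper names `sphere_area`, `sphere_norms` and `ratio` are made up for this illustration, and it assumes the convention above in which the round sphere of radius ${r}$ in ${\mathbb R^{m+1}}$ has ${|H|=m/r}$.

```python
import math

def sphere_area(m, r):
    """Area of the m-dimensional round sphere of radius r in R^(m+1)."""
    # Area of the unit m-sphere: 2 * pi^((m+1)/2) / Gamma((m+1)/2).
    omega_m = 2 * math.pi ** ((m + 1) / 2) / math.gamma((m + 1) / 2)
    return omega_m * r ** m

def sphere_norms(m, r):
    """Return (||f||_{m/(m-1)}, ||fH||_1) for f = 1 on the sphere of radius r."""
    area = sphere_area(m, r)
    lhs_norm = area ** ((m - 1) / m)   # (integral of 1^(m/(m-1)))^((m-1)/m)
    h_norm = (m / r) * area            # |H| = m/r is constant on the sphere
    return lhs_norm, h_norm

def ratio(m, r):
    """||fH||_1 / ||f||_{m/(m-1)}; both norms scale like r^(m-1)."""
    lhs_norm, h_norm = sphere_norms(m, r)
    return h_norm / lhs_norm

# The ratio should not depend on r, and for m = 2 it should be at least
# sqrt(2*pi), the constant obtained in the proof below.
print(ratio(2, 1.0), ratio(2, 50.0), math.sqrt(2 * math.pi))
```

Since both norms scale like ${r^{m-1}}$, the ratio is independent of ${r}$, which is the sense in which the inequality is scale-invariant on spheres.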

The interesting thing about the inequality is that ${C_m}$ depends on ${m}$ only (it does not even depend on ${M}$)! (Of course, the constant in the ordinary Sobolev inequality (2) also doesn’t depend on the domain: even if we restrict to a domain ${\Omega \subset \mathbb R^m}$, a function ${f}$ compactly supported in ${\Omega}$ can always be extended (trivially, by zero) to a function compactly supported in ${\mathbb R^m}$.)

I haven’t really read the proof of the most general form (1) of this inequality, but I came across a cute proof of it when ${m=2}$ (i.e. ${M}$ is a surface), which I am going to give here. As a bonus, ${C_m}$ can be given explicitly in this case. (Actually Michael-Simon’s paper also gives an explicit constant.)

Proof: (Proof of Theorem 1 for ${m=2}$)

For an ${\mathbb R^n}$-valued vector field ${\phi}$ which is defined on ${M}$, we define

$\displaystyle \mathrm{div}_M \phi= \sum_{i=1}^m \langle \overline \nabla _{i} \phi, e_i\rangle$

where $\{e_i\}_{i=1}^m$ is a local orthonormal frame on $M$ and $\overline \nabla$ is the connection on $\mathbb R^n$.
Suppose ${N^m}$ is a smooth submanifold in ${\mathbb R^n}$ with boundary ${\Sigma}$, and let $\nu$ be the unit outward normal of $\Sigma$ w.r.t. $N$. Then, as ${\Delta x=H}$ (here for the proof),

$\displaystyle \begin{array}{rcl} \int_N \phi\cdot H = \int_N \phi\cdot \Delta x &= &\int_\Sigma \phi \cdot \langle \nabla x, \nu\rangle- \int_N \langle \nabla \phi, \nabla x\rangle\\ &= &\int_\Sigma \phi \cdot \langle \nabla x, \nu\rangle- \int_N \mathrm{div}_N \phi\\ &= &\int_\Sigma \phi \cdot \nu- \int_N \mathrm{div}_N \phi. \end{array} \ \ \ \ \ (3)$

 Remark 2 In particular, if ${N}$ has no boundary, this becomes $\displaystyle \int_N \phi\cdot H = - \int_N\mathrm{div}_N \phi.$ This becomes zero if ${\phi}$ is a tangent vector field (as ${\phi\perp H}$), thus this formula generalizes the ordinary divergence theorem.

Now, assume ${M}$ contains ${0}$ for the time being and let ${\phi= \frac {f(x)x}{|x|^m}}$.

Easy computations give ${\overline \nabla _{e_i} (|x|^{-k})= -k |x|^{-k-2} \langle x, e_i\rangle}$ and

$\displaystyle \mathrm{div}_M (\frac x{|x|^k})= \frac m {|x|^k} - \frac {k |x^T|^2} {|x|^{k+2} } = \frac{m-k}{|x|^k }+ \frac{k |x^\perp|^2}{|x|^{k+2}}. \ \ \ \ \ (4)$

In particular,

$\displaystyle \mathrm{div}_M( \frac x { |x|^m})= \frac m {|x|^{m+2}} |x^\perp|^2\ge 0 .$
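Formula (4) is easy to test numerically at a single point, because ${\mathrm{div}_M}$ at a point only involves the tangent plane there and the ambient derivatives of the field. The sketch below compares a central-difference approximation of ${\sum_i \langle \overline\nabla_{e_i}\phi, e_i\rangle}$ with the right-hand side of (4). The point `x`, the orthonormal pair `frame` and the step size `h` are arbitrary choices made for this illustration.

```python
import math

def field(x, k):
    """The vector field x / |x|^k on R^n minus the origin."""
    norm = math.sqrt(sum(c * c for c in x))
    return [c / norm ** k for c in x]

def div_along_frame(x, frame, k, h=1e-5):
    """Sum of <D_{e_i} X, e_i> over the frame, using central differences."""
    total = 0.0
    for e in frame:
        plus = field([a + h * b for a, b in zip(x, e)], k)
        minus = field([a - h * b for a, b in zip(x, e)], k)
        deriv = [(p - q) / (2 * h) for p, q in zip(plus, minus)]
        total += sum(d * b for d, b in zip(deriv, e))
    return total

def formula_4(x, frame, k):
    """Right-hand side of (4): (m - k)/|x|^k + k |x_perp|^2 / |x|^(k+2)."""
    m = len(frame)
    norm2 = sum(c * c for c in x)
    # |x^T|^2 is the squared length of the projection onto span(frame).
    tangential2 = sum(sum(a * b for a, b in zip(x, e)) ** 2 for e in frame)
    perp2 = norm2 - tangential2
    return (m - k) / norm2 ** (k / 2) + k * perp2 / norm2 ** ((k + 2) / 2)

# A hypothetical tangent plane in R^3 (orthonormal pair) and a base point.
frame = [[1.0, 0.0, 0.0], [0.0, 0.6, 0.8]]
x = [0.3, -0.5, 0.7]

for k in (1, 2, 3):
    print(k, div_along_frame(x, frame, k), formula_4(x, frame, k))
```

For ${k=m=2}$ the first term of (4) vanishes and only the non-negative ${|x^\perp|^2}$ term remains, which is the case used in the proof.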

Let ${B_r}$ be the geodesic ball around ${0}$ of radius ${r}$, ${\Sigma_r=\partial B_r}$ and ${M_r= M\setminus B_r}$. Then applying (3) on ${M_r}$,

$\displaystyle \int_{M_r} \phi\cdot H = \int_{\Sigma _r} \phi\cdot \nu - \int_{M_r} \mathrm{div}_M \phi.$

As ${r\rightarrow 0}$, we have ${\nu(x)= - \frac x{|x|}+o(1)}$ and ${\mathrm{Area}(\Sigma_r)= \omega r^{m-1} + O(r^m)}$, where ${\omega= \omega_{m-1}}$ is the area of the standard ${(m-1)}$-dimensional sphere, so ${\int_{\Sigma_r}\phi\cdot \nu\rightarrow -\omega f(0)}$. Taking ${r\rightarrow 0}$ in the above, we have

$\displaystyle \begin{array}{rcl} - \omega f(0)-\int_{M } f \frac x{|x|^m}\cdot H= \int_{M } \mathrm{div}_M \phi &= &\int_M (\nabla f \cdot \frac x{|x|^m} + f \mathrm{div}_M (\frac {x}{|x|^m}))\\ &\ge& \int_M \nabla f \cdot \frac x{|x|^m} . \end{array}$

So

$\displaystyle \omega f(0) \le \int_M (\frac{|\nabla f(x)|}{|x|^{m-1}} + \frac{f(x)|H(x)|}{|x|^{m-1}})dx. \ \ \ \ \ (5)$
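As a sanity check, (5) can be evaluated in the simplest case: ${M=\mathbb R^2}$ (so ${H=0}$, ${x^\perp=0}$ and ${\omega=2\pi}$) with a radially decreasing ${f}$. There both inequalities used in the derivation are equalities, so the two sides of (5) should agree. The sketch below assumes the test function ${f(x)=(1-|x|^2)^2}$ on the unit disc, extended by zero; this choice and the midpoint-rule resolution `n` are only for illustration.

```python
import math

def f(r):
    """Radial profile f(x) = (1 - |x|^2)^2 on the unit disc, zero outside."""
    return (1.0 - r * r) ** 2 if r < 1.0 else 0.0

def grad_norm(r):
    """|grad f| for the profile above, equal to |f'(r)| = 4 r (1 - r^2)."""
    return 4.0 * r * (1.0 - r * r) if r < 1.0 else 0.0

def rhs_of_5(n=100000):
    """Midpoint rule for the integral of |grad f| / |x| over the plane (polar coordinates)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        # Integrand |grad f| / r times the polar area element 2*pi*r dr.
        total += (grad_norm(r) / r) * (2.0 * math.pi * r) * h
    return total

lhs = 2.0 * math.pi * f(0.0)   # omega * f(0) with omega = 2*pi
rhs = rhs_of_5()
print(lhs, rhs)
```

In this flat radial case the right-hand side reduces to ${2\pi\int_0^1 |f'(r)|\,dr = 2\pi f(0)}$, so (5) cannot be improved without using more information about ${f}$ or ${M}$.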

Translating the origin to ${y}$ on ${M}$, we have

$\displaystyle \omega f(y) \le \int_M (\frac{|\nabla f(x)|}{|x-y|^{m-1}} + \frac{f(x)|H(x)|}{|x-y|^{m-1}})dx.$

Multiplying both sides by ${f(y)^{\alpha-1}}$ and integrating, we have

$\displaystyle \begin{array}{rcl} \omega \int_M f(y)^\alpha dy &\le &\int_M \left(\int_M (\frac{|\nabla f(x)|}{|x-y|^{m-1}} + \frac{f(x)|H(x)|}{|x-y|^{m-1}})dx \right)f(y)^{\alpha-1}dy\\ &= &\int_M (|\nabla f(x)|+ f(x)|H(x)|)(\int_M \frac{f(y)^{\alpha-1}}{|x-y|^{m-1}}dy )dx. \end{array} \ \ \ \ \ (6)$

Now, we take ${\alpha = m=2}$. In this case ${\omega=2\pi}$. We estimate ${\int_M \frac{f(y)}{|x-y|}dy}$ as follows. We apply (3) to ${\phi= \frac{f(y)y}{|y|}}$; this ${\phi}$ is bounded near ${0}$, so no boundary term survives when a small ball around ${0}$ is removed as before. As ${ \mathrm{div}_M (\frac y{|y|})\ge \frac 1{|y|}}$ by (4), we have

$\displaystyle \begin{array}{rcl} - \int_M f \frac y{|y|}\cdot H = \int_M \mathrm{div}_M (\frac{fy}{|y|}) &= &\int_M (\frac {y\cdot \nabla f }{|y|} + f \mathrm{div}_M (\frac y{|y|}))\\ &\ge &\int_M (\frac {y\cdot \nabla f }{|y|} + \frac f{|y|}). \end{array}$

So

$\displaystyle \int_M \frac{f}{|y|}\le \int_M (f(y)|H(y)|+ |\nabla f(y)|)dy.$

By translating the origin from ${0}$ to ${x}$, we have

$\displaystyle \int_M \frac{f(y)}{|x-y|}dy\le \int_M (f(y)|H(y)|+ |\nabla f(y)|)dy.$

Since the right-hand side does not depend on ${x}$, (6) becomes

$\displaystyle 2\pi \int_M f^2 \le \left(\int_M (|\nabla f|+f|H|)\right)^2.$

In particular the constant ${C_2}$ can be taken to be ${\sqrt{2\pi}}$. $\Box$
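The final inequality can be checked on a concrete example. The sketch below again assumes the flat case ${M=\mathbb R^2}$ (so ${H=0}$) and the hypothetical test function ${f(x)=(1-|x|^2)^2}$ on the unit disc, and compares ${2\pi\int f^2}$ with ${(\int |\nabla f|)^2}$ using a midpoint rule. It illustrates the statement on one function only and proves nothing.

```python
import math

def f(r):
    """Radial profile f(x) = (1 - |x|^2)^2 on the unit disc, zero outside."""
    return (1.0 - r * r) ** 2 if r < 1.0 else 0.0

def grad_norm(r):
    """|grad f| for the profile above, equal to 4 r (1 - r^2) on the disc."""
    return 4.0 * r * (1.0 - r * r) if r < 1.0 else 0.0

def radial_integral(g, n=100000):
    """Midpoint rule for the integral of g(|x|) over the unit disc."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += g(r) * 2.0 * math.pi * r * h   # polar area element
    return total

int_f2 = radial_integral(lambda r: f(r) ** 2)   # exact value: pi / 5
int_grad = radial_integral(grad_norm)           # exact value: 16 * pi / 15

lhs = 2.0 * math.pi * int_f2
rhs = int_grad ** 2
print(lhs, rhs, lhs <= rhs)
```

For this ${f}$ the exact values are ${\int f^2=\pi/5}$ and ${\int|\nabla f|=16\pi/15}$, so the inequality holds with room to spare.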

 Remark 3 The constant ${\sqrt{2\pi}}$ above is not optimal, in the sense that equality in (1) cannot hold for a nonzero ${f}$. Suppose it did. By translating ${M}$, we can assume ${M}$ contains ${0}$, and then tracing the equality cases in the proof shows that ${x^\perp =0}$ for all ${x\in M}$, i.e. ${x=x^T}$ for all ${x}$. However, the choice of the origin in the proof is arbitrary, so for any points ${x, y\in M}$ the vector ${x-y}$ lies in both ${T_xM}$ and ${T_yM}$. We then conclude that ${M}$ is a domain in a ${2}$-plane ${\mathbb R^2}$. Also, for fixed ${y\in M}$, by tracing the proof, we see that ${\nabla f(x)}$ is a multiple of ${x-y}$ for all ${x\in M}$. But this is impossible unless ${f}$ is zero: if ${\nabla f(x)\ne 0}$, then for any ${y\in M}$ which does not lie on the line ${\{x+t\nabla f(x)\}_{t\in \mathbb R}}$, clearly ${x-y}$ and ${\nabla f(x)}$ are not parallel.