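It is often claimed in advanced calculus that the curl of a vector field $F = (F_1, F_2, F_3)$ on $\mathbb{R}^3$,

$$\operatorname{curl} F = \left(\frac{\partial F_3}{\partial x_2} - \frac{\partial F_2}{\partial x_3},\ \frac{\partial F_1}{\partial x_3} - \frac{\partial F_3}{\partial x_1},\ \frac{\partial F_2}{\partial x_1} - \frac{\partial F_1}{\partial x_2}\right), \qquad (1)$$

measures how fast $F$ “rotates”. More specifically, suppose $\operatorname{curl} F(p) \neq 0$; then locally around $p$, the vector field $F$ behaves like a “swirl”, rotating (in the anti-clockwise sense) around the axis pointing in the same direction as $\operatorname{curl} F(p)$ and with angular speed $\frac{1}{2}|\operatorname{curl} F(p)|$. (Strictly speaking this picture is not quite true even locally: $\operatorname{curl} F(p)$ only captures the “rotational” part of $F$ around $p$. Besides the constant part $F(p)$ and the “rotational” part $\operatorname{curl} F(p)$, its first order part also contains the “flux” part, which is given by $\operatorname{div} F(p)$, but let’s ignore this for the discussion here.)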
To my dismay, while this gives a good geometric description of $\operatorname{curl} F$, I find that in many textbooks this “fact” is demonstrated either by looking at the curl of some very special vector field (like $(-x_2, x_1, 0)$), or by using Stokes’ theorem. The first approach is really too simplistic, although it gives a good model for a “rotational” vector field. The second approach, while rigorous, is “a posteriori”, as it seems that the rather obscure definition of curl given in (1) just comes from nowhere. How on earth would one come up with such a definition at the very beginning?
Remark 1 As we will see from the proof below, the rotational part (but not the divergence part) of $F$ locally around $p$ (even up to second order effects) can really be modeled by the linear rotational vector field $(-x_2, x_1, 0)$, after some suitable linear transformation (see the computation (3) below). But this is not at all obvious without some effort.
This is the problem that came to my mind as I was recently teaching Stokes’ theorem to non-math undergraduates. I came up with the following approaches to explaining (1) (which, however, I’m not going to show in my class), or rather, to explaining why this definition is natural. The first approach nevertheless still requires Stokes’ theorem and therefore differs only slightly from the “a posteriori” viewpoint above. The second one is a variation of it which does not require Stokes’ theorem.
The observation is largely motivated by physics. With hindsight, it is natural to guess that $\operatorname{curl} F(p)$ is the direction in which $F$ locally creates the largest moment. To be more precise, if we fix the origin $O$ (which we take to be the point $p$) and suppose that for a point $x$, there is a force vector $F$ whose line of action passes through $x$, then the moment vector of $F$ about $O$ is the vector $x \times F$, which measures the tendency of $F$ to produce rotation about $O$. In particular, the moment of $F$ about a fixed unit vector $n$ is $(x \times F) \cdot n$, i.e. the component of the moment vector in the direction $n$. Suppose now we have a (force) vector field $F$ instead of a single vector, and let us fix a unit vector $n$. We consider the total moment generated by $F$ when restricted to the circle of radius $r$ on the plane perpendicular to $n$, i.e. the function

$$M(r) = \int_{C_r} (x \times F(x)) \cdot n \, ds,$$

where $C_r$ is the circle of radius $r$ centered at $O$ on the plane $P$ which is perpendicular to $n$. The circle is oriented such that it rotates in the anti-clockwise sense when $n$ is pointing up. Of course, $M$ vanishes up to the first order, i.e. $M(0) = M'(0) = 0$, as clearly $|x \times F| = O(r)$ on $C_r$ and the length of $C_r$ is $2\pi r$, but a second of thought would suggest that it actually vanishes up to the second order, i.e. $M''(0) = 0$, due to the symmetry of the circle, which produces a further cancellation in the integral (note that $\int_{C_r} (x \times c) \cdot n \, ds = 0$ for any constant vector $c$). Therefore we instead look at the function

$$m(n) = \lim_{r \to 0^+} \frac{M(r)}{r^3},$$

which can be understood as the normalized total moment about $n$ by the force $F$ along the circle $C_r$ as $r \to 0$, up to a positive multiplicative constant.
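It is natural to guess that

Proposition 1 Suppose $\operatorname{curl} F(O) \neq 0$. Then among all unit vectors $n$, $m(n)$ is maximum exactly when $n$ has the same direction as $\operatorname{curl} F(O)$.

Proof: The observation is that $(x \times F) \cdot n = F \cdot (n \times x) = r \, F \cdot T$ for $x \in C_r$, where $T = \frac{1}{r} \, n \times x$ is the unit tangent vector of $C_r$. Therefore

$$M(r) = r \int_{C_r} F \cdot T \, ds = r \iint_{D_r} \operatorname{curl} F \cdot n \, dA$$

by Stokes’ theorem. Here $D_r$ is the disk of radius $r$ on $P$ bounded by $C_r$. Since $D_r$ has area $\pi r^2$, it then follows that $m(n) = \pi \operatorname{curl} F(O) \cdot n$ is maximum exactly when $n$ has the same direction as $\operatorname{curl} F(O)$, establishing our claim.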
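The claim can be sanity-checked numerically. The following is only a minimal sketch under stated assumptions: the notation $M(r) = \int_{C_r} (x \times F) \cdot n \, ds$ for the total moment along the circle of radius $r$ normal to $n$ is a reconstruction, the sample field `F` is a hypothetical choice (its curl at the origin, $(3, 1, 5)$, was computed by hand), and the helper names are made up for illustration. The sketch approximates $M(r)/r^3$ by the midpoint rule and compares it with $\pi \operatorname{curl} F(O) \cdot n$.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(c / length for c in v)

def plane_frame(n):
    """Orthonormal (u, v) spanning the plane normal to the unit vector n, with u x v = n."""
    k = min(range(3), key=lambda i: abs(n[i]))  # coordinate axis least aligned with n
    axis = tuple(1.0 if i == k else 0.0 for i in range(3))
    u = normalize(cross(axis, n))
    v = cross(n, u)
    return u, v

def normalized_moment(F, n, r, steps=2000):
    """Midpoint-rule approximation of M(r) / r^3 on the circle of radius r normal to n."""
    n = normalize(n)
    u, v = plane_frame(n)
    total = 0.0
    for k in range(steps):
        t = 2.0 * math.pi * (k + 0.5) / steps
        # anti-clockwise parametrization of the circle when n points up
        x = tuple(r * (math.cos(t) * u[i] + math.sin(t) * v[i]) for i in range(3))
        total += dot(cross(x, F(x)), n)
    arc = 2.0 * math.pi * r / steps  # arc length element ds
    return total * arc / r**3

def F(p):
    """Hypothetical sample field with constant, linear, quadratic and cubic terms."""
    x1, x2, x3 = p
    return (1.0 - 2.0 * x2 + x3 + x1 * x3 - x2**3,
            3.0 * x1 - x3 + x2**2,
            0.5 + 2.0 * x2 + x3 + x1 * x2)

# curl F at the origin, computed by hand from the definition of curl
CURL_AT_ORIGIN = (3.0, 1.0, 5.0)

if __name__ == "__main__":
    for n in [(0.0, 0.0, 1.0), (1.0, 2.0, 2.0), CURL_AT_ORIGIN]:
        predicted = math.pi * dot(CURL_AT_ORIGIN, normalize(n))
        for r in (0.1, 0.01):
            print(n, r, normalized_moment(F, n, r), predicted)
```

For this particular field the constant and quadratic terms drop out by symmetry, and only the cubic term produces a discrepancy, which shrinks like $r^2$. Among the three directions tried, the normalized moment is largest along $(3, 1, 5)$.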
While the above argument has a nice physical interpretation, it uses Stokes’ theorem and therefore doesn’t satisfactorily explain the origin of the definition of $\operatorname{curl} F$. We now modify the argument to eliminate the use of Stokes’ theorem altogether.
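By Taylor’s theorem, we can express $F$ locally around $O$ as

$$F(x) = F(O) + L(x) + Q(x) + o(|x|^2)$$

as $x \to O$, where $L$ and $Q$ denote the first order and second order parts respectively. Note that the second order part is even in $x$, i.e. $Q(-x) = Q(x)$, and therefore $\int_{C_r} (x \times Q(x)) \cdot n \, ds = 0$, as the integrand is odd and $C_r$ is symmetric about $O$. Note also that $L$ is the linear vector field

$$L(x) = Ax, \qquad A = (a_{ij}) = \left(\frac{\partial F_i}{\partial x_j}(O)\right).$$

So we deduce

$$M(r) = \int_{C_r} (x \times Ax) \cdot n \, ds + o(r^4).$$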
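Lemma 2 Let $L(x) = Ax$ be a linear vector field, where $A = (a_{ij})$ is a $3 \times 3$ matrix. If $n = e_3$, then

$$\int_{C_r} (x \times Ax) \cdot n \, ds = \pi r^3 (a_{21} - a_{12}).$$

Proof: On $C_r$ we have $x_3 = 0$, so $(x \times Ax) \cdot e_3 = x_1 (Ax)_2 - x_2 (Ax)_1 = a_{21} x_1^2 + (a_{22} - a_{11}) x_1 x_2 - a_{12} x_2^2$. By the symmetry of $C_r$, $\int_{C_r} x_i x_j \, ds = 0$ if $i \neq j$, so

$$\int_{C_r} (x \times Ax) \cdot e_3 \, ds = \int_{C_r} \left(a_{21} x_1^2 - a_{12} x_2^2\right) ds.$$

Using $x_1^2 + x_2^2 = r^2$ on $C_r$, we have $\int_{C_r} x_1^2 \, ds = \int_{C_r} x_2^2 \, ds = \frac{1}{2} \cdot r^2 \cdot 2\pi r = \pi r^3$, which proves the lemma. So, for $n = e_3$ (the general case follows by rotating the coordinates),

$$m(n) = \lim_{r \to 0^+} \frac{M(r)}{r^3} = \pi \left(\frac{\partial F_2}{\partial x_1} - \frac{\partial F_1}{\partial x_2}\right)(O) = \pi \operatorname{curl} F(O) \cdot n, \qquad (3)$$

where $\operatorname{curl} F$ is defined as in (1). In particular, this provides an alternative proof of Proposition 1, which at the same time justifies the definition of $\operatorname{curl} F$. This can also be regarded as a local (but much weaker) version of Stokes’ theorem.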
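The lemma for linear fields is easy to check numerically. The sketch below is an illustration only: it assumes the reading of Lemma 2 as $\int_{C_r} (x \times Ax) \cdot e_3 \, ds = \pi r^3 (a_{21} - a_{12})$, and the matrix `A`, the radius and the function name are hypothetical choices.

```python
import math

def moment_of_linear_field(A, r, steps=1000):
    """Midpoint-rule approximation of the integral of (x × Ax)·e3 over the circle
    of radius r in the x1x2-plane, traversed anti-clockwise."""
    total = 0.0
    for k in range(steps):
        t = 2.0 * math.pi * (k + 0.5) / steps
        x = (r * math.cos(t), r * math.sin(t), 0.0)
        Ax = tuple(sum(A[i][j] * x[j] for j in range(3)) for i in range(3))
        # third component of the cross product x × Ax
        total += x[0] * Ax[1] - x[1] * Ax[0]
    return total * (2.0 * math.pi * r / steps)

# Hypothetical 3x3 matrix; only a21 = A[1][0] and a12 = A[0][1] should matter.
A = [[1.0, -2.0, 0.5],
     [3.0, 4.0, -1.0],
     [0.0, 2.0, 7.0]]
r = 0.3

numeric = moment_of_linear_field(A, r)
closed_form = math.pi * r**3 * (A[1][0] - A[0][1])

if __name__ == "__main__":
    print(numeric, closed_form)
```

Changing any entry of `A` other than `A[1][0]` and `A[0][1]` should leave the result unchanged, which reflects the cancellation of the mixed terms $x_i x_j$, $i \neq j$, and of the $x_3$ terms on the circle.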
Remark 2 The definition of the divergence $\operatorname{div} F$ can be similarly obtained by considering the limit of the normalized flux of the vector field across shrinking concentric spheres. The interested reader may try.
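For readers who want to experiment before doing the computation by hand, here is a minimal numerical sketch. It rests on one particular reading of the remark, which is an assumption: the quantity considered is the flux $\int_{S_r} F \cdot \nu \, dA$ through the sphere $S_r$ of radius $r$ centered at the origin, divided by $r^3$. For a linear field $Ax$ the identity $\int_{S_r} x_i x_j \, dA = \frac{4\pi r^4}{3} \delta_{ij}$ gives the flux $\frac{4\pi r^3}{3} \operatorname{tr} A$, so the limit should be $\frac{4\pi}{3} \operatorname{div} F(O)$. The sample field and the function name are hypothetical.

```python
import math

def flux_over_r3(F, r, steps=200):
    """Midpoint-rule approximation of (flux of F through the sphere S_r) / r^3."""
    d_phi = math.pi / steps          # polar angle step
    d_theta = 2.0 * math.pi / steps  # azimuthal angle step
    total = 0.0
    for i in range(steps):
        phi = (i + 0.5) * d_phi
        for j in range(steps):
            theta = (j + 0.5) * d_theta
            # outward unit normal at the sample point
            nu = (math.sin(phi) * math.cos(theta),
                  math.sin(phi) * math.sin(theta),
                  math.cos(phi))
            x = tuple(r * c for c in nu)
            Fx = F(x)
            total += sum(f * c for f, c in zip(Fx, nu)) * math.sin(phi)
    # area element on S_r is r^2 sin(phi) d_phi d_theta
    return total * r**2 * d_phi * d_theta / r**3

def F(p):
    """Hypothetical sample field; its divergence is 2 + 3 - 1 = 4 (computed by hand)."""
    x1, x2, x3 = p
    return (0.3 + 2.0 * x1 + x2 - x2**2,
            3.0 * x2 + x1 * x3,
            1.0 - x3 + x1 * x2)

DIV_AT_ORIGIN = 4.0

if __name__ == "__main__":
    for r in (0.5, 0.05):
        print(r, flux_over_r3(F, r), 4.0 * math.pi / 3.0 * DIV_AT_ORIGIN)
```

As with the circle, the constant and quadratic parts of the field cancel by symmetry under $x \mapsto -x$, so only the trace of the linear part survives.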
1. Higher dimensional and co-dimensional case
Of course, in the higher dimensional and co-dimensional case, it is better to formulate Stokes’ theorem using the language of differential forms rather than vector fields. In this language Stokes’ theorem can be elegantly written as

$$\int_{\partial \Omega} \omega = \int_{\Omega} d\omega.$$
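Nevertheless, the analysis above can still be done, with some modification. For example, consider a vector field $F$ in $\mathbb{R}^4$ and consider a ($1$-dimensional) circle $C_r$ of radius $r$ centered at the origin $O$ which lies on the $2$-plane perpendicular to two orthonormal vectors $n_1$, $n_2$. Now consider the function

$$M(r) = \int_{C_r} *(x \wedge F \wedge n_1 \wedge n_2) \, ds.$$

Here the orientation of $C_r$ is chosen such that its unit tangent vector $T$ satisfies $*(x \wedge T \wedge n_1 \wedge n_2) > 0$, $\wedge$ is the wedge product and $*$ is the Hodge dual on $\mathbb{R}^4$ (note that there is no cross product for two vectors in $\mathbb{R}^4$). The above computation ((3)) can be easily adapted to this case: w.l.o.g. assume $n_1 = e_3$, $n_2 = e_4$, then as $r \to 0$, we have

$$\frac{M(r)}{r^3} = \frac{1}{r^3} \int_{C_r} (x_1 F_2 - x_2 F_1) \, ds \to \pi \left(\frac{\partial F_2}{\partial x_1} - \frac{\partial F_1}{\partial x_2}\right)(O) = \pi \, dF^\flat(O)\big(*(n_1 \wedge n_2)\big),$$

where $F^\flat = \sum_i F_i \, dx_i$ is the $1$-form dual to $F$ and $*(n_1 \wedge n_2) = e_1 \wedge e_2$.

Therefore $\lim_{r \to 0} M(r)/r^3$ is maximum when the $2$-vector $*(n_1 \wedge n_2)$ points to the same “direction” as $dF^\flat(O)$. Moreover, this can be regarded as the local version of Stokes’ theorem: $\int_{C_r} F^\flat = \int_{D_r} dF^\flat + o(r^2)$, where $D_r$ is the disk bounded by $C_r$.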
In any case, I think the above derivation of a local version of Stokes’ theorem (by first reducing a general vector field/differential form to the linear case, and then integrating over a small circle/sphere) could provide some motivation for why we should define the curl (or more generally, the exterior derivative) this way and why it appears in Stokes’ theorem.