We answer the following question:

Let and be positive integers with . What is the number of surjective maps from to ?

The solution is, of course, well-known. We discuss it to illustrate a useful technique in combinatorics and discrete probability: **inclusion-exclusion**.

To aid intuition, we use the language of elementary probability. This is not necessary and you may interpret the following without using any “randomization”.

Let be the set of all maps from to , and let be a *random element *of , i.e. for each fixed map ,

.

Let be the event that is *not* surjective, i.e.

.

We decompose as follows:

.

To calculate the probability of a union, the **inclusion-exclusion formula** is frequently useful.

Theorem 1 (Inclusion-exclusion formula). Let be events. Thenwhere for each , the indexes are distinct (and unordered).

We include a proof of this formula at the end.

We apply the inclusion-exclusion formula with . To do this we need to know the probabilities of arbitrary intersections of the s. We need the following simple but important fact, whose proof is left as an exercise.

Lemma 2. The random variables are independent and identically distributed.

Now let be any distinct elements of . By independence, the intersection has probability

.

(spell this out to see how independence is used) Since there are ways to pick , by the inclusion exclusion formula we get

.

(Note that we add an extra term to make the formula more “elegant”.) On the other hand, by definition of . Hence, we have proved

Theorem 3..

Putting and noting that , we have the following remarkable – but rather mysterious if you do not know inclusion-exclusion – formula:

Corollary 4..

The above formulas may be found in the exercises in Chapter 2 of Feller’s classic *Introduction to Probability Theory and Its Application: Volume 1*. I strongly recommend all of you to read this book.

Questions.(1) Come up with other approaches.(2) The above formula is quite complicated. Derive recursive relations (or otherwise) so that we can compute for large easily.

(3) Find an asymptotic formula for (something like Stirling’s approximation). In particular, we would like to know the

rate of growthof .

Finally, we prove the inclusion-exclusion formula.

*Proof of inclusion-exclusion formula*: For any event , let denote the **indicator function** of . Let be given. Then

.

Expanding and taking expectation on both sides, we get the result.

Remark(Spoiler alert) This is almost the famous Stirling number of the second kind. Indeed, if we define an equivalence relation on by for each surjective map f, we obtain a partition of into m classes, and there is a one-one correspondence between such partition and the surjective maps, up to relabeling the classes by 1 to m. Therefore, .