## An exercise in combinatorics

Let $n$ and $m$ be positive integers with $n \geq m$. What is the number $C_{n, m}$ of surjective maps from $\{1, ..., n\}$ to $\{1, ..., m\}$?

The solution is, of course, well-known. We discuss it to illustrate a useful technique in combinatorics and discrete probability: inclusion-exclusion.

To aid intuition, we use the language of elementary probability. This is not necessary and you may interpret the following without using any “randomization”.

Let $\Omega = \{1, ..., m\}^{\{1, ..., n\}}$ be the set of all maps from $\{1, ..., n\}$ to $\{1, ..., m\}$, and let $X$ be a random element of $\Omega$, i.e. for each fixed map $f$,

${\mathbb{P}}(X = f) = \frac{1}{|\Omega|} = \frac{1}{m^n}$.

Let $A$ be the event that $X$ is not surjective, i.e.

$A = \{X \in \Omega: \text{there exists } j \text{ such that } X(i) \neq j \text{ for all } i\}$.

We decompose $A$ as follows:

$A = \bigcup_{j = 1}^m \{X \in \Omega: X(i) \neq j \text{ for all } i\} = \bigcup_{j = 1}^m A_j$.

To calculate the probability of a union, the inclusion-exclusion formula is frequently useful.

Theorem 1 (Inclusion-exclusion formula). Let $E_1, E_2, ..., E_m$ be events. Then

${\mathbb{P}}(\bigcup_{j = 1}^m E_j) = \sum_{k = 1}^m (-1)^{k+1} \sum_{j_1, ..., j_k} {\mathbb{P}}(E_{j_1}...E_{j_k})$

where for each $k$, the indexes $j_1, ..., j_k$ are distinct (and unordered).

We include a proof of this formula at the end.

We apply the inclusion-exclusion formula with $E_j = A_j$. To do this we need to know the probabilities of arbitrary intersections of the $A_j$s. We need the following simple but important fact, whose proof is left as an exercise.

Lemma 2. The random variables $X(1), ..., X(n)$ are independent and identically distributed.

Now let $j_1, ..., j_k$ be any distinct elements of $\{1, ..., m\}$. By independence, the intersection $A_{j_1}...A_{j_k}$ has probability

${\mathbb{P}}(A_{j_1}...A_{j_k}) = \left(\frac{m-k}{m}\right)^n$.

(spell this out to see how independence is used) Since there are $C_k^m$ ways to pick $j_1, ..., j_k$, by the inclusion exclusion formula we get

${\mathbb{P}}(A) = 1 - \sum_{k = 0}^m (-1)^k C_k^m \left(\frac{m - k}{m}\right)^n$.

(Note that we add an extra $k = 0$ term to make the formula more “elegant”.) On the other hand, ${\mathbb{P}}(A) = 1 - \frac{C_{n, m}}{m^n}$ by definition of $C_{n, m}$. Hence, we have proved

Theorem 3. $C_{n, m} = \sum_{k = 0}^m (-1)^k C_k^m (m - k)^n$.

Putting $n = m$ and noting that $C_{n, n} = n!$, we have the following remarkable – but rather mysterious if you do not know inclusion-exclusion – formula:

Corollary 4. $n! = \sum_{k = 0}^n (-1)^k C_k^n (n - k)^n$.

The above formulas may be found in the exercises in Chapter 2 of Feller’s classic Introduction to Probability Theory and Its Application: Volume 1. I strongly recommend all of you to read this book.

Questions. (1) Come up with other approaches.

(2) The above formula is quite complicated. Derive recursive relations (or otherwise) so that we can compute $C_{n, m}$ for large $n, m$ easily.

(3) Find an asymptotic formula for $C_{n, m}$ (something like Stirling’s approximation). In particular, we would like to know the rate of growth of $C_{n, m}$.

Finally, we prove the inclusion-exclusion formula.

Proof of inclusion-exclusion formula: For any event $E$, let $1_E$ denote the indicator function of $E$. Let $E_1, E_2, ..., E_m$ be given. Then

$1_{\bigcup_{j = 1}^m E_j} = 1 - 1_{\bigcap_{j = 1}^m E_j^c} = 1 - \prod_{j = 1}^m (1 - 1_{E_j})$.

Expanding and taking expectation on both sides, we get the result. $\blacksquare$

Remark(Spoiler alert) This is almost the famous Stirling number of the second kind. Indeed, if we define an equivalence relation on $\{ 1, \hdots, n \}$ by $i \sim j \Leftrightarrow f(i) = f(j)$ for each surjective map f, we obtain a partition of $\{ 1, \hdots, n \}$ into m classes, and there is a one-one correspondence between such partition and the surjective maps, up to relabeling the classes by 1 to m. Therefore, $\frac{C_{n,m}}{m!} = S(n, m)$.