An exercise in combinatorics

We answer the following question:

Let n and m be positive integers with n \geq m. What is the number C_{n, m} of surjective maps from \{1, ..., n\} to \{1, ..., m\}?

The solution is, of course, well-known. We discuss it to illustrate a useful technique in combinatorics and discrete probability: inclusion-exclusion.

To aid intuition, we use the language of elementary probability. This is not necessary and you may interpret the following without using any “randomization”.

Let \Omega = \{1, ..., m\}^{\{1, ..., n\}} be the set of all maps from \{1, ..., n\} to \{1, ..., m\}, and let X be a random element of \Omega, i.e. for each fixed map f,

{\mathbb{P}}(X = f) = \frac{1}{|\Omega|} = \frac{1}{m^n}.

Let A be the event that X is not surjective, i.e.

A = \{X \in \Omega: \text{there exists } j \text{ such that } X(i) \neq j \text{ for all } i\}.

We decompose A as follows:

A = \bigcup_{j = 1}^m \{X \in \Omega: X(i) \neq j \text{ for all } i\} = \bigcup_{j = 1}^m A_j.

To calculate the probability of a union, the inclusion-exclusion formula is frequently useful.

Theorem 1 (Inclusion-exclusion formula). Let E_1, E_2, ..., E_m be events. Then

{\mathbb{P}}(\bigcup_{j = 1}^m E_j) = \sum_{k = 1}^m (-1)^{k+1} \sum_{j_1, ..., j_k} {\mathbb{P}}(E_{j_1}...E_{j_k})

where for each k, the indexes j_1, ..., j_k are distinct (and unordered).

We include a proof of this formula at the end.

We apply the inclusion-exclusion formula with E_j = A_j. To do this we need to know the probabilities of arbitrary intersections of the A_js. We need the following simple but important fact, whose proof is left as an exercise.

Lemma 2. The random variables X(1), ..., X(n) are independent and identically distributed.

Now let j_1, ..., j_k be any distinct elements of \{1, ..., m\}. By independence, the intersection A_{j_1}...A_{j_k} has probability

{\mathbb{P}}(A_{j_1}...A_{j_k}) = \left(\frac{m-k}{m}\right)^n.

(spell this out to see how independence is used) Since there are C_k^m ways to pick j_1, ..., j_k, by the inclusion exclusion formula we get

{\mathbb{P}}(A) = 1 - \sum_{k = 0}^m (-1)^k C_k^m \left(\frac{m - k}{m}\right)^n.

(Note that we add an extra k = 0 term to make the formula more “elegant”.) On the other hand, {\mathbb{P}}(A) = 1 - \frac{C_{n, m}}{m^n} by definition of C_{n, m}. Hence, we have proved

Theorem 3. C_{n, m} = \sum_{k = 0}^m (-1)^k C_k^m (m - k)^n.

Putting n = m and noting that C_{n, n} = n!, we have the following remarkable – but rather mysterious if you do not know inclusion-exclusion – formula:

Corollary 4. n! = \sum_{k = 0}^n (-1)^k C_k^n (n - k)^n.

The above formulas may be found in the exercises in Chapter 2 of Feller’s classic Introduction to Probability Theory and Its Application: Volume 1. I strongly recommend all of you to read this book.

Questions. (1) Come up with other approaches.

(2) The above formula is quite complicated. Derive recursive relations (or otherwise) so that we can compute C_{n, m} for large n, m easily.

(3) Find an asymptotic formula for C_{n, m} (something like Stirling’s approximation). In particular, we would like to know the rate of growth of C_{n, m}.

Finally, we prove the inclusion-exclusion formula.

Proof of inclusion-exclusion formula: For any event E, let 1_E denote the indicator function of E. Let E_1, E_2, ..., E_m be given. Then

1_{\bigcup_{j = 1}^m E_j} = 1 - 1_{\bigcap_{j = 1}^m E_j^c} = 1 - \prod_{j = 1}^m (1 - 1_{E_j}).

Expanding and taking expectation on both sides, we get the result. \blacksquare

This entry was posted in Miscellaneous, Probability. Bookmark the permalink.

One Response to An exercise in combinatorics

  1. lemontea says:

    Remark(Spoiler alert) This is almost the famous Stirling number of the second kind. Indeed, if we define an equivalence relation on \{ 1, \hdots, n \} by i \sim j \Leftrightarrow f(i) = f(j) for each surjective map f, we obtain a partition of \{ 1, \hdots, n \} into m classes, and there is a one-one correspondence between such partition and the surjective maps, up to relabeling the classes by 1 to m. Therefore, \frac{C_{n,m}}{m!} = S(n, m).

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s