Three proofs

In how many ways can you choose a sample of size k from n distinguishable objects?

As is well known, the answer depends on the sampling rules: Does the order of selection matter, or not? Are items allowed to be selected more than once, or not?

Three of the answers are very easy. If order is significant, the answer is nk if repetition is allowed, and the “falling factorial” n(n−1)···(nk+1) if not. If repetition is not allowed and order is not significant, the answer is the preceding number divided by k!, which is the binomial coefficient {n\choose k}.

The fourth case is the most difficult. The answer is the “rising factorial” n(n+1)···(n+k−1) divided by k!, which is the binomial coefficient {n+k-1\choose k}.

Rosemary and I have assembled three quite different proofs of the last formula for your festive entertainment.

First proof

This one I regard as the “standard” proof. A selection of k items from n with repetition allowed is specified by giving the numbers of occurrences of the first, second, … nth object; these numbers must add up to k. We can choose a specification as follows.

Take a row of n+k−1 boxes. Choose n−1 of them, and put “barriers” in these; then k empty boxes remain. Now put copies of the first object in the boxes before the first barrier; copies of the second object in the boxes between the first and second barrier; … and copies of the nth object in the boxes after the last barrier. This gives a selection. Conversely, any selection can be decoded into positions for the barriers. So the required number is {n+k-1\choose n-1}={n+k-1\choose k}.

Second proof

This proof more directly chooses k distinct objects from a set of n+k−1. The set consists of the given objects A1, … An, together with k−1 “dummy” objects X1, … Xk−1, whose purpose is to tell us about repeats.

Choose a subset of k elements from this set, and order it with the As first (in order), and then the Xs (in order).

If the chosen set of size k consists only of As, then we have a selection with no repeats. If there is just one dummy element, say Xi, then we repeat the ith object in the list. If there are more than one dummy, we repeat the procedure, working from left to right through the set of dummies.

For example, A1A3X1X2 is translated first into A1A1A3X2, and then into A1A1A1A3; for A1A3X1X3, the first step is the same but at the second step we obtain A1A1A3A3.

It is slightly more difficult to show that this gives the required bijection.

Third proof

This proof is by induction.

One reason for having more than one proof is that different proofs may give different extra information. It follows from the preceding proof that the number of selections which involve just m distinct objects is {n\choose m}{k-1\choose k-m}, since the As chosen in the first step are all the distinct objects which we see.

Now the Vandermonde convolution shows that

\sum_m{n\choose m}{k-1\choose k-m}={n+k-1\choose k},

where the sum is over all m for which the binomial coefficients are non-zero. So in order to prove the formula, it would suffice to have a direct proof of the formula in the second paragraph.

This can be done as follows. Clearly {n\choose m} is the number of ways of choosing the m objects which actually appear. Once they are chose, we have to select k from this set of m in such a way that all of the m objects actually appear. This can be done by first including each object once in the selection, and then choosing km more (with repetitions allowed). Since the numbers are smaller than in the original instance, the induction hypothesis guarantees that the number of ways of doing this is {m+(k-m)-1\choose k-m}={k-1\choose k-m}.

I leave the problem of starting the induction to the reader.

About Peter Cameron

I count all the things that need to be counted.
This entry was posted in exposition and tagged , . Bookmark the permalink.

4 Responses to Three proofs

  1. sris says:

    in the third para, you mean $n^k$ not $nk?$

  2. Robin Chapman says:

    Another proof is to apply Ehrhart’s theorem to the simplex given by the equation
    0\le x_1\le x_2\le\ldots\le x_n\le 1.

    Of course, this is a bit of a sledgehammer, but Ehrhart’s theorem subsumes
    many “combinatorial reciprocity” theorems. Another example is Richard
    Stanley’s theorem on acyclic orientations of graphs.

  3. Thanks. Nice — but as you realised, I was trying to avoid the heavier sledgehammers.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.