We start with a naive model of the situation.
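Model 1. Suppose that I apply to $n$ schools, numbered $1, 2, \dots, n$. I estimate that I will be admitted to school $i$ with probability $p_i$. Thus, if $A_i$ denotes the event that I am admitted by the $i$th school, then $P(A_i) = p_i$. I am interested in the probability $P(A_1 \cup \dots \cup A_n)$ that at least one school admits me.
Suppose that the admission results are independent. Then
$$P(A_1 \cup \dots \cup A_n) = 1 - \prod_{i=1}^{n} (1 - p_i).$$
For example, if $n = 8$ and $p_i = 0.1$ for every $i$, then $P(A_1 \cup \dots \cup A_n) = 1 - 0.9^8$ is about 0.57, which is not too bad…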
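As a quick numerical check of the independent case, here is a minimal Python sketch. The function name `prob_at_least_one` is made up for illustration, and the inputs (8 schools, each with admission probability 0.1) are assumed values chosen because they reproduce the figures 0.1 and 0.57 quoted in this post.

```python
from math import prod


def prob_at_least_one(probs):
    """Probability of at least one admission when the results are independent.

    probs[i] is the (assumed) probability of being admitted to school i.
    """
    # Complement rule: fail everywhere with probability prod(1 - p_i).
    return 1.0 - prod(1.0 - p for p in probs)


# Assumed example: 8 schools, 10% chance at each.
print(round(prob_at_least_one([0.1] * 8), 4))  # 0.5695
```

Changing the list of probabilities shows how quickly (or slowly) extra applications help under independence.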
Discussion. The assumption of independence is an over-simplification. At the other extreme, suppose that all schools receive the same applications and the admission committees share the same preference. Then the admission events $A_1, \dots, A_n$ are really the same event, and $P(A_1 \cup \dots \cup A_n) = P(A_1) = 0.1$ (using the same number). Hence applying to multiple schools has no “effect of diversification”. From this, we may expect that the “actual probability” lies between 0.1 (complete dependence) and 0.57 (complete independence), and the exact probability depends on the “dependence structure”.
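Of course, if the events $A_i$ are disjoint (when will this happen?), then $P(A_1 \cup \dots \cup A_n) = \sum_{i=1}^{n} P(A_i)$. In the example, this gives the absolute upper bound of $8 \times 0.1 = 0.8$.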
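In the general case, the probability of the union is given by
$$P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i} P(A_i) - \sum_{i<j} P(A_i \cap A_j) + \sum_{i<j<k} P(A_i \cap A_j \cap A_k) - \dots + (-1)^{n+1} P(A_1 \cap \dots \cap A_n).$$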
(This is just the inclusion-exclusion formula. See my earlier post here.) Each term represents the joint admission for a subcollection of schools. Specifying these probabilities is equivalent to specifying the dependence structure.
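To see how the joint probabilities determine the answer, here is a rough Python sketch that evaluates the inclusion-exclusion sum for three dependence structures mentioned above: independent, identical, and disjoint events. All function names are hypothetical, and the setup (8 schools, probability 0.1 each) is an assumption consistent with the numbers quoted in this post.

```python
from itertools import combinations
from math import prod


def union_by_inclusion_exclusion(joint, n):
    """Evaluate P(A_1 ∪ ... ∪ A_n) from joint admission probabilities.

    joint(subset) must return the probability of being admitted by every
    school whose index is in `subset` (a tuple of indices in 0..n-1).
    """
    total = 0.0
    for k in range(1, n + 1):
        sign = 1.0 if k % 2 == 1 else -1.0  # +, -, +, ... by subset size
        for subset in combinations(range(n), k):
            total += sign * joint(subset)
    return total


n = 8            # assumed number of schools
p = [0.1] * n    # assumed individual admission probabilities


def independent(subset):
    # Independence: joint probability is the product of the marginals.
    return prod(p[i] for i in subset)


def identical(subset):
    # All events coincide, so every intersection is the same event.
    return 0.1


def disjoint(subset):
    # Disjoint events: two or more admissions never happen together.
    return p[subset[0]] if len(subset) == 1 else 0.0


for name, joint in [("independent", independent),
                    ("identical", identical),
                    ("disjoint", disjoint)]:
    print(name, round(union_by_inclusion_exclusion(joint, n), 4))
```

The three results should come out near 0.57, 0.1 and 0.8 respectively, matching the range discussed above. Any other dependence structure can be tried by swapping in a different `joint` function.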
At this point, the reader may try to derive a better model.
Model 2.1. We consider a “micro”-model for a single school. Here are the ingredients:
1. Let $\theta \in (0, 1)$ be fixed. It is a parameter that represents your “ability”. A parameter close to $1$ means that you are very bright. A parameter close to $0$ means that you are dim. If the GRE math subject test is somewhat reliable, a reasonable range is . (As we shall see, this is consistent with the conclusion of the model.)
2. Suppose $N$ students apply to the same school. Looking at the websites, a typical range for $N$ is . Each student has his/her “ability index”. For simplicity, we are only interested in whether a given student is better than you. Also, we assume that the abilities of the other students are independent of each other. For $i = 1, \dots, N$, let $X_i$ be an indicator random variable that equals $1$ if student $i$ is better than you. We assume that
$$P(X_i = 1) = 1 - \theta, \qquad P(X_i = 0) = \theta.$$
Hence, the parameter $\theta$ is interpreted as the probability that you are better than a randomly chosen applicant (other than you).
3. We suppose that the school admits the best $100q$% of the applicants. So it admits $Nq$ students. A typical range for $q$ is .
From the above assumptions, the probability that you are NOT admitted to the school is
$$P\left(\sum_{i=1}^{N} X_i \ge Nq\right)$$
if $Nq$ is an integer. The sum $\sum_{i=1}^{N} X_i$ is a Binomial$(N, 1 - \theta)$ random variable, and it is natural to try the normal approximation (we address its accuracy only later):
$$P\left(\sum_{i=1}^{N} X_i \ge Nq\right) \approx 1 - \Phi(z),$$
where $z = \dfrac{Nq - N(1 - \theta)}{\sqrt{N\theta(1 - \theta)}}$ and $\Phi$ is the cdf of the standard normal distribution. Equivalently, your probability of success is about
$$\Phi\left(\frac{\sqrt{N}\,(\theta + q - 1)}{\sqrt{\theta(1 - \theta)}}\right).$$
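The following Python sketch compares the exact Binomial probability of admission with the normal approximation. It is only an illustration under stated assumptions: the function names are made up, the ability parameter is called `theta`, the count of applicants is taken to mean the applicants other than you, and the numbers (200 applicants, 15 admitted) are the ones used in the example that follows.

```python
from math import comb, erf, sqrt


def normal_cdf(x):
    """Cdf of the standard normal distribution."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def admit_prob_exact(theta, n_others, n_admitted):
    """Exact probability that fewer than n_admitted applicants beat you.

    Each of the n_others applicants is better than you with probability
    1 - theta, independently (the assumption of the micro-model).
    """
    r = 1.0 - theta
    return sum(comb(n_others, j) * r**j * (1.0 - r)**(n_others - j)
               for j in range(n_admitted))


def admit_prob_normal(theta, n_others, n_admitted):
    """Normal approximation to the same probability (no continuity correction)."""
    mean = n_others * (1.0 - theta)
    sd = sqrt(n_others * theta * (1.0 - theta))
    return normal_cdf((n_admitted - mean) / sd)


# Assumed example: 200 applicants, 15 places.
for theta in (0.88, 0.90, 0.925, 0.95):
    exact = admit_prob_exact(theta, 200, 15)
    approx = admit_prob_normal(theta, 200, 15)
    print(theta, round(exact, 3), round(approx, 3))
```

Printing both columns side by side gives a first impression of how far the approximation is from the exact value for these parameters.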
Now we may draw a few things from the model.
First, let us plot a graph:
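1. In this example, the school admits 15 students out of 200 applicants, so the admitted fraction is $q = 0.075$. You may check that the slope is maximum around $\theta \approx 0.93$, where $\theta$ is your ability parameter. You have a reasonable chance of being admitted if $\theta$ is not much below that value. You get a figure close to the example in Model 1 ($0.1$) if $\theta \approx 0.9$. If $\theta = 1 - q = 0.925$, then your chance is 50-50.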
2. If the admitted fraction $q$ increases (decreases), then the curve shifts to the left (right).
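3. If the number of applicants $N$ increases and the admitted fraction $q$ stays the same, then the graph becomes “steeper”. In fact, it is easy to see that
$$\lim_{N \to \infty} \Phi\left(\frac{\sqrt{N}\,(\theta + q - 1)}{\sqrt{\theta(1 - \theta)}}\right) = 1$$
if $\theta > 1 - q$, and the limit is $0$ if $\theta < 1 - q$.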
4. From 3 (and a little more calculation), if you believe that your ability satisfies $\theta > 1 - q$, you prefer to have $N$ large.
Otherwise, if you believe that $\theta < 1 - q$, you should apply to schools with $N$ small, so that the greater fluctuation might save you.
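The effect of the number of applicants can be checked numerically. The sketch below evaluates the normal-approximation success probability for several values of the number of applicants, keeping the admitted fraction fixed. The function name `success_prob`, the symbol names and the chosen values (admitted fraction 0.075, abilities 0.95 and 0.90) are assumptions made for illustration only.

```python
from math import erf, sqrt


def success_prob(theta, q, n):
    """Normal-approximation probability of admission.

    theta: probability that you beat a randomly chosen applicant
    q:     fraction of applicants admitted
    n:     number of applicants
    """
    z = sqrt(n) * (theta + q - 1.0) / sqrt(theta * (1.0 - theta))
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))  # standard normal cdf at z


q = 0.075  # assumed admitted fraction (15 out of 200)
for n in (50, 200, 800, 3200):
    strong = success_prob(0.95, q, n)  # ability above the threshold 1 - q
    weak = success_prob(0.90, q, n)    # ability below the threshold 1 - q
    print(n, round(strong, 3), round(weak, 3))
```

As the number of applicants grows, the first column should drift toward 1 and the second toward 0, which is the “steepening” described in item 3 and the reason behind the advice in item 4.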
Later, we will extend this model to cover the case of multiple schools with correlation.