Friday, July 18, 2008

Tank production

A story is told from World War II, when the Allies were considering a ground invasion of Europe. Critical to their success would be the number of Mark V tanks fielded by the Germans. Intelligence, however, gave contradictory estimates as to the number of tanks that were being produced per month.

It was noticed that captured tanks had serial numbers, which were apparently assigned in order of production. Statisticians came up with a formula which would give an estimate of the total number of tanks produced given a sample of serial numbers. After the war, German production records showed that the estimate was almost exactly correct.

Suppose tanks are numbered 1, 2, 3, 4, 5, ... N, where N is the total number of tanks produced. Given a sample of, say, five serial numbers, how would you estimate the value of N?

4 comments:

RWH said...

Let the serial numbers of the captured tanks be x1, x2, x3, ...xm. Let P(N) be the probability that xi <= xm for all i <= m given that the max serial number is N.

We can calculate some values of P(N) for various values of N, then do something smart with that information.

Obviously, P(x) is 0 for x < xm, and P(xm) = 1.

Then p(xm + k) = [xm/(xm + k)] * [(xm - 1)/(xm + k - 1)] * [(xm - 2)/(xm + k - 2)] * [(xm - 3)/(xm + k - 3)] * [(xm - 4)/(xm + k - 4)] for k >= 1.

Then let's assume that the probability distribution of k is constant for 0 < k < infinity. Then the expected total number of tanks would seem to be the limit as j goes to infinity of the sum from 0 <= k <= j of (xm + k) * P(xm + k).

Evaluation left as an exercise for the reader.

The probability distribution is not constant, of course - it tails off as k gets large, since there's obviously an upper limit on how many tanks can be produced. But at that point P(N) is getting minuscule anyway, so it doesn't seem to matter much.

RWH said...

Oh no, wait, I left out the denominator - I think that limit should have been:

the limit as j goes to infinity of [the sum from 0 <= k <= j of (xm + k) * P(xm + k)]/(j + 1).

Steve said...

Wired recently covered this problem, pointing to a Wikipedia article that was in stub when I posted the problem here.

RWH said...

I went back and reread my original comments and I have no idea what I was talking about.