Putting a proposition before a large population of voters can be expensive, so
an organization wishing to do so would like to have a reasonable assurance that
a given proposition will pass. One approach is to take a survey of a randomly
chosen subset of voters and use the results to estimate the proposition’s
chances among the general population. The larger the survey and the larger
the margin by which the proposition passes in the survey, the greater the
chance that the proposition would pass in a general vote.
The mathematics for computing the required survey size was already worked out
in "Surveying for a Voter Proposition", assuming an infinite population. The
exact formula
for a finite population is worked out here. The result still requires
Now define, for the general population:
- n – the total number of voters
- k – the number of voters who would vote yes
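The formula for the distribution of votes is the same as for the survey,

$$P(k \mid n, p) = \binom{n}{k} p^{k} (1-p)^{n-k},$$

except that p, rather than being fixed, has the distribution given by
equation (3). This combined distribution has the name Beta-Binomial, and is

$$P(k \mid n, \alpha, \beta) = \binom{n}{k} \frac{B(k+\alpha,\ n-k+\beta)}{B(\alpha, \beta)},$$

with α = y + 1 and β = s − y + 1, where B is the Beta function.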
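As a quick numerical check of the Beta-Binomial distribution, the following is a minimal sketch that evaluates it through log-gamma functions and confirms that the probabilities sum to one. The function names and the survey numbers (s = 20, y = 13, n = 100) are hypothetical and chosen only for illustration.

```python
from math import lgamma, exp

def log_beta(a, b):
    """Natural log of the Beta function B(a, b), computed via log-gamma."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_binomial_pmf(k, n, alpha, beta):
    """Probability of exactly k yes votes out of n under the Beta-Binomial."""
    if k < 0 or k > n:
        return 0.0
    # log of the binomial coefficient C(n, k)
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose
               + log_beta(k + alpha, n - k + beta)
               - log_beta(alpha, beta))

# Hypothetical numbers: y = 13 yes answers in a survey of s = 20 people,
# applied to a population of n = 100 voters.
s, y, n = 20, 13, 100
alpha, beta = y + 1, s - y + 1
total = sum(beta_binomial_pmf(k, n, alpha, beta) for k in range(n + 1))
print(total)  # should be very close to 1
```

Working in log space keeps the factorial-sized intermediate values from overflowing when n is large.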
What the organization wants to know is the smallest number of polled people
y required such that the chance of the vote not passing by a majority
is less than some given number δ. This means that we need the cumulative
distribution, which is given by
where F is the generalized hypergeometric function.
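For reference, the Beta-Binomial CDF is commonly tabulated in the following form. This is the standard expression, assumed here to correspond to the one intended; F is written as ${}_3F_2$ to make its three upper and two lower parameters explicit.

```latex
\[
F(k \mid n, \alpha, \beta) = 1 -
  \frac{B(\beta + n - k - 1,\ \alpha + k + 1)\;
        {}_3F_2\!\left(1,\ \alpha + k + 1,\ -n + k + 1;\ 
                       k + 2,\ -\beta - n + k + 2;\ 1\right)}
       {B(\alpha, \beta)\; B(n - k,\ k + 2)\; (n + 1)},
\qquad 0 \le k < n .
\]
```

The prefactor multiplying ${}_3F_2$ equals the probability of exactly k + 1 yes votes, so the second term is the upper tail of the distribution.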
Now substituting in for α and β gives
Given that a majority vote is required, the CDF needs to be evaluated at
n = 2k. We can use this to eliminate n from the CDF
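The quantity of interest can also be computed by brute force, which is useful for checking any closed form on small cases. The sketch below sums the Beta-Binomial probabilities directly at n = 2k and then scans for the smallest y that meets a given δ. The function names are hypothetical, and treating a tie (exactly k yes votes out of 2k) as "not passing" is an assumption about the intended convention.

```python
from math import lgamma, exp

def beta_binomial_pmf(k, n, alpha, beta):
    """Beta-Binomial probability of exactly k yes votes out of n."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    log_num = lgamma(k + alpha) + lgamma(n - k + beta) - lgamma(n + alpha + beta)
    log_den = lgamma(alpha) + lgamma(beta) - lgamma(alpha + beta)
    return exp(log_choose + log_num - log_den)

def prob_not_passing(s, y, k):
    """Chance of at most k yes votes among n = 2k voters.

    Assumption: a tie (exactly k yes votes) counts as not passing.
    """
    n = 2 * k
    alpha, beta = y + 1, s - y + 1
    return sum(beta_binomial_pmf(j, n, alpha, beta) for j in range(k + 1))

def smallest_yes_count(s, k, delta):
    """Smallest y in 0..s with prob_not_passing(s, y, k) < delta, else None."""
    for y in range(s + 1):
        if prob_not_passing(s, y, k) < delta:
            return y
    return None

# Hypothetical example: survey of s = 20, population of 2k = 100, delta = 5%.
print(smallest_yes_count(20, 50, 0.05))
```

Each call to `prob_not_passing` costs k + 1 evaluations, so this direct approach is only a reference implementation for modest population sizes.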
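The generalized hypergeometric function F may be expanded out as a sum,

$$F(a_1, \ldots, a_p;\ b_1, \ldots, b_q;\ z) = \sum_{i=0}^{\infty} \frac{(a_1)_i \cdots (a_p)_i}{(b_1)_i \cdots (b_q)_i} \frac{z^i}{i!},$$

where $(a)_i$ is the rising factorial or Pochhammer symbol,
$(a)_0 = 1$, $(a)_i = a(a+1)(a+2)\cdots(a+i-1)$. The summation
series terminates at the first zero term, which occurs when one of the
upper parameters is a non-positive integer.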
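The rising factorial and the "stop at the first zero term" rule translate directly into code. The following is a minimal sketch using exact rational arithmetic. The function names are hypothetical, and the parameters are assumed to be integers with at least one non-positive upper parameter so that the series terminates.

```python
from fractions import Fraction
from math import factorial

def pochhammer(a, i):
    """Rising factorial (a)_i = a (a+1) ... (a+i-1), with (a)_0 = 1."""
    result = 1
    for j in range(i):
        result *= a + j
    return result

def hypergeometric_terminating(upper, lower, z=1, max_terms=100_000):
    """Sum a generalized hypergeometric series up to its first zero term.

    Assumes integer parameters; raises if no zero term is reached.
    """
    total = Fraction(0)
    for i in range(max_terms):
        numerator = 1
        for a in upper:
            numerator *= pochhammer(a, i)
        if numerator == 0:
            # First zero term: every later term is zero as well.
            return total
        denominator = factorial(i)
        for b in lower:
            denominator *= pochhammer(b, i)
        total += Fraction(numerator, denominator) * Fraction(z) ** i
    raise ValueError("series did not terminate within max_terms")

# Sanity check against the Chu-Vandermonde identity:
# 2F1(-4, 3; 7; 1) = (7-3)_4 / (7)_4
print(hypergeometric_terminating([-4, 3], [7]))
print(Fraction(pochhammer(4, 4), pochhammer(7, 4)))
```

Checking the numerator before dividing matters here, because a lower parameter can also be a negative integer and reach zero at the same index.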
Next eliminate the Beta functions, giving the final “closed form” solution,
The number of terms in the sum is k+1. Summing a series which has as
many terms as half the general population may not be realistic. Another
approach is required for practical computations regarding large populations.
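One candidate, offered as an assumption rather than as the approach the author has in mind, is to fall back on the infinite-population limit when n is very large. There p itself follows a Beta distribution with α = y + 1 and β = s − y + 1, and for integer parameters the chance that p ≤ 1/2 reduces to a binomial tail whose length depends only on the survey size s. The function name below is hypothetical.

```python
from math import comb

def prob_not_passing_infinite(s, y):
    """P(p <= 1/2) for p ~ Beta(y + 1, s - y + 1).

    Uses the identity between the regularized incomplete beta function with
    integer parameters and a binomial tail:
        I_x(a, b) = sum_{j=a}^{a+b-1} C(a+b-1, j) x^j (1-x)^(a+b-1-j)
    evaluated at x = 1/2 with a = y + 1 and b = s - y + 1.
    """
    m = s + 1  # equals a + b - 1
    tail = sum(comb(m, j) for j in range(y + 1, m + 1))
    return tail / 2 ** m

# Hypothetical example: y = 13 yes answers in a survey of s = 20 people.
print(prob_not_passing_infinite(20, 13))
```

The sum has at most s + 1 terms regardless of the population size, at the cost of ignoring the finite-population correction.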
This computation was suggested to me by David Chaum, who wanted to
know if a closed-form solution existed. See rsvoting.org for more information.