Up: Issue 11 Next: Paper 5 Previous: Paper 3

Voting matters - Issue 11, April 2000

Checking two STV programs

B A Wichmann

Last year, I received a request from the Electoral Reform Ballot Services to 'validate' the computer software that they use to perform elections for their customers. Before that work was finished, I had another request from ERS itself to re-certify the program used to perform elections in the Church of England. Since there was a substantial overlap between both of these activities, these are reported together.

The checking undertaken was merely to ensure that the election results reported were as required by the respective rules. Hence many issues which might be of interest were not examined, such as: the user-friendliness, speed and memory requirements, number of satisfied users, maturity of the program, etc. In fact, the two programs which were tested are very different: David Hill's program is a complete system for data entry and edit, counting and presentation of the results and has been available for some years. In contrast, Keith Edkins' program is solely a counting program and is a recent development.

ERBS's requirements were identified as mainly to check a program that implements the ERS rules that were published in 1997(ERS97). However, their requirements are significant in terms of the capacity required, amounting to the ability to handle up to 350 candidates and up to 250,000 votes. In principle, modern computers have no inherent difficulty in handling elections of that size, provided the software is designed appropriately.

If software is to be shown to be reliable, then a large number of test cases need to be run, or an alternative means needs to be devised to show logically that all the relevant functionality is correctly implemented. In performing the first certification of the Church of England rules in 1990, the technique adopted was to ensure that all the code in David Hill's counting engine was executed, and that the election results obtained were correct (checked by Eric Syddique). It was not thought that the same technique could be applied effectively for the ERBS validation, so the use of many tests was used instead.

If high reliability is to be demonstrated then several hundred tests should be run (corresponding to some years of use by ERBS). This immediately gives a difficult problem - how can one be assured that the result produced by the computer is correct? Initially an attempt was made to determine a small number of tests which performs all the relevant functionality which would then make manual checking feasible. However, the individual actions in ERS97 are quite numerous and difficult to identify - for instance, the result sheet does not state many specific actions undertaken during a count. Hence it seemed that the best means for undertaking the checking was to compare two programs for the ERS97 rules which were available.

Comparing two programs to increase reliability is not widely regarded2, but in this case, the two programs were known to have very different internal workings and were quite independently developed. Hence it was thought that the comparison would be effective.

Unless comparisons can be made automatically by program, the number of tests will be limited to a level which would not give the assurance needed. Hence to facilitate such comparison and to avoid the need for the STV programs to produce elaborate printing, an output format was designed that could be input into a spreadsheet for printing. This format is logically just the conventional Result Sheet, but specified so that mechanical checks, such as those on row and column arithmetic, can be made. I am grateful to both authors that they amended their programs to produce this output since the testing would have been very tedious without that. Two small differences were located between the programs but an analysis showed that neither could change the result. Finally, the comparisons were automated which resulted in a successful validation of Keith Edkins' program.

No formal validation was undertaken of David Hill's program for these rules, but, of course, the same results were obtained. The program is not designed to handle ERBS's very large elections. It currently has 50 as its maximum number of candidates. ERBS would also wish for Colin Rosenstiel's interpretation of the quota reduction rules to be applied, but this has not been implemented, as explained in David Hill's article.

A number of issues arose from the validation as follows:

Quota reduction

A logical problem has been noted by David Hill in ERS97 which arises when the quota is reduced before any candidate is elected. This issue is defined and discussed in a separate article in this issue. The consequences for this validation was that no comparison was possible when this situation arose since David Hill's program does not produce a result, due to the uncertainty in the meaning of the rules. The problem can be regarded as serious, since around 25% of those tests which are based upon real elections involved quota reduction. I decided that I could not formally sign my validation report, since, in my opinion, the meaning of the rules was sufficiently uncertain in this respect. Subsequent to undertaking this work, an analysis showed that the problem could only arise when transfers occurred after quota reduction. For instance, this cannot happen when there is only one seat. An analysis of my election data suggests that the quota reduction problem actually arises in about 12% of real elections. Readers can decide for themselves the significance of this problem from the two articles about quota reduction in this issue3, 8.

New data base

The data base of election data described in Voting matters has been substantially enhanced as a result of both validations. This data is now available on a CD-ROM. In order to facilitate the collection of data from real elections, a program has been written, available as a MS-DOS/Windows program, which produces an anonymous version of election data by taking a statistical sample. Anybody can therefore add data to the collection without concern for the confidentiality of the source. (The data base contains the results for each election for the two rules being considered here, and also for the Meek rules.)

Capacity tests

In order to check that large elections could be handled, a program was written to generate large test data together with the results in result sheet format. This technique showed that these large tests can indeed be handled by any modern PC.

Tie-breaks

If an election requires the use of a tie-break, then a computer program makes a random choice. When comparing two programs, such a tie-break can result in two valid, but different results. This made the validation awkward, since either that election had to be ignored, or one of the programs had to be re-run with the option taken by the other program enforced. In most such elections the results were not compared, and as a result, a small difference between the two programs was not detected. The proposal to resolve tie-breaks by Borda scores would largely avoid this problem.

Church of England validation

Since the objective here was to revalidate David Hill's program, little would be gained in repeating the activity undertaken for the first validation. There were two changes to the Church's specification: a small change to rectify the Lichfield anomaly (which influences the main counting logic, see below), and the much larger change to add the handling of constraints. The logic used to handle constraints is specified in Voting matters.

The testing of the main counting logic relied upon the previous testing and the clearance of the Lichfield anomaly. Also, all the tests run were checked for the correctness of the row and column arithmetic. Hence the main effort was in checking the constraint handling.

The new Church of England rules (GS1327) merely specify the actions to be taken during the count using the concept of candidates which are doomed or guarded. A doomed candidate is one that cannot be elected if a conformant result is to be obtained. A guarded candidate is one that must be elected if a conformant result is to be obtained. GS1327 does not specify the forms that the constraint might take, although it is understood that David Hill's program provides direct support for the constraints that are actually used by the Church. The program requires that every candidate is a member of one and only one constraint group. The constraints themselves specify the maximum and minimum number in a set of constraint groups.

A concern was that it might be possible to specify some constraints which would cause the program to compute for an effectively unbounded length of time. This does not seem possible, basically because the constraints are linear. However, a test was devised which produced a very large table of potential solutions which caused the program to produce a message that insufficient computer storage was available. David Hill has subsequently modified his program to use a file for the table within the counting process which now handles even this case.

Although the program provides direct support for only one form of constraint, indirect support is provided for a much larger range of constraints. As an example, suppose that the constraint groups are Scottish, English and Welsh. A constraint that is not directly supported would be that the number of English elected is greater than the number of Scottish elected. However, the indirect method was capable of handling this case.

The approach to testing constraints was to take some elections from the data base (which are like real elections) and add constraints and then check for a conformant result. It was thought that 13 tests adequately covered the implementation of the constraint logic. It appears that the released program handles constraints which are very much more complex that would arise with Church of England counts.

Lichfield anomaly

A problem arose with the use of previous rules which resulted in the change to the rules even when constraints are not being used. This is called the Lichfield anomaly after the diocese where it arose. A simple test case (based upon an example from David Hill) would be to elect 2 from 5 with the following voting pattern:

 20 AC
 13 B
 12 C
 2 DB
 1 EB
Under the old rules, even though exclusions were one at a time, A's surplus redistribution would be deferred, because it could not change who were the bottom two. Under the new rules it is not deferred because it could change who is the bottom one.

Old rules

A 20  20            Elected
B 13  +1  14 +2  16 Elected
C 12      12
D  2       2 -2  0
E  1  -1   0
New rules
A 20 -4 16 Elected
B 13
C 12 +4 16 Elected
D  2
E  1

A large election

The original certification of David Hill's program did not cover (as it really should have done) the data preparation side. Hence this time, an effort was made to use and test the input logic of the program. A large election was input, both by use of a text editor, and by use of David Hill's program with all the checking options enabled. The conclusion from this was that double-entry should be used in almost all circumstances, since several data entry errors would otherwise be undetected. On the other hand, the program behaved perfectly. (A few points were noted on the user interface, which has resulted in some improvements to the released version.)

Conclusions

Suitable techniques can be used to check STV software. The results have revealed some defects in the programs involved, which, of course, have been removed. However, in fairness to the authors, it is unclear if any of these defects would have remained undetected. Hence the main gain is additional confidence in the software and a reduced risk that such a program would fail during an actual count.

Copies of the full report on both validations are available from the author. Electronic copies are available by mailing a request to Brian.Wichmann@freenet.co.uk.

References

  1. R A Newland and F S Britton. How to conduct an election by the Single Transferable Vote. 3rd Edition, ERS. 1997.
  2. J C Knight and N G Leveson. An experimental evaluation of the assumption of independence in multiversion programming. IEEE Trans. Software Eng. Volume 12, No 1. 1986.
  3. I D Hill. Quota Reduction in hand-counting STV rules. Voting matters, Issue 11 pp8-9. 2000.
  4. B A Wichmann. An STV Database. Voting matters, Issue 2. p9. 1994.
  5. I D Hill. STV with constraints. Voting matters, Issue 9 pp2-4. 1998.
  6. GS1327: General Synod, Single Transferable Vote regulations 1990 and 1998. (Obtainable from Church House Bookshop, Great Smith Street, London SW1P 3BN.)
  7. Earl Kitchener. Tie-breaking in STV. Voting matters, Issue 11 p5. 2000.
  8. C Rosenstiel. The problem of surpluses when the quota is reduced. Voting matters. Issue 11 p10. 2000.

Up: Issue 11 Next: Paper 5 Previous: Paper 3