Fair and Safe Elections: A Mathematical and Computational Approach

Posted on Jul 16, 2021

Beginning Remarks

In light of what I know about the current state of computer processor backdoors and the lack of auditable silicon, I'd trust modern electronic voting systems about as far as I can throw them.

In this article I claim that by:

  1. making electronic voting machines 100% auditable down to the silicon
  2. providing temporal anonymity with one way trapdoor functions that map from real ID to a public alias
  3. using a public commit/block vote chains tagged by voter public alias

we can provide a free and safe election of unprecedented trustworthiness, exceeding the trustworthiness of previous paper ballot schemes, and certainly previous electronic schemes, largely because of the fact that nearly the entire process is now auditable and accountable to the ones who would be affected by it most, the people.

The only component of the process that is not entirely auditable is the pooling of public IDs and their batch mapping to aliases, which are then made public.

It's no use if you can watch somebody get assigned an alias, you still know their real name.

It is however possible, by random sampling, by perhaps, an independent third party organization, to collect aliases from x% of the population, and verify that the aliases coincide with exactly x% of the public alias pool.

Nevertheless, for this scheme to work, there needs to be at least one group with access to all of the population's IDs that can then generate a public alias pool. It would be wise to structure the alias generation process such that the group(s) generating the public alias pool is/are incentivized to be as scrupulous as possible. Such a structure would likely involve multiple competing groups that must then resolve any differences between their public alias pools. Once all pools are identical, all public pools are deemed canonical.

Lastly, I address the practical concern that most public IDs are not issued from sparse integer-valued spaces, i.e., if we have 300e6 people in the U.S., drivers license ID number are probably pretty densely packed in between a range such as [0, 300e6], allowing an individual to easily guess public IDs. This problem can be circumvented be issuing new voter IDs from a sparse integer valued space during voter registration.

Crude Definitions

Before we move on to more rigorous definitions, let us define a fair and safe election as an election with the following properties:

  1. the election results are completely verifiable
  2. the voter identities remain completely anonymous

The astute reader will quickly remark about 1. "but verifiable by whom?!" Yes, without more constraints, 1. is a bit of a weak condition. Don't worry, we shall refine our properties later on.

Interestingly enough, 2. is somewhat difficult to achieve in the strongest sense. If everyone votes identically, a mere vote tally leaks everyones' votes. The paper mentioned below does put some restrictions on 2. to make it's proof more robust.

Anyways, it turns out that mathematically, a completely verifiable and anonymous election are incompatible properties as proven by Chevallier-Mames et al. in 2006.

So we're done right? This problem can't be solved? End of post?

Not quite...

The Mathematical Escape Hatch

In Mathematics, there is a beautiful world I like to call escape hatches better known as trap-door functions.

For example:

  • \(a \cdot b\) is much easier to compute than \( \frac {a} {b} \).
  • \( \frac {d} {dx} f(x) \) should it exist, is usually easier to compute than \( \int _0^x {f(x)} \), again should it exist.

You get the idea... There are many more trapdoor functions.

Some of the most common trapdoor function used in cryptography use hash functions and prime numbers. In fact, prime number encryption uses large primes, because, as it turns out, nobody, as of the writing of this post, has found a way to quickly factor large numbers. If they did, I suspect most of modern day encryption would be worthless.

Applying a Trapdoor Function to Achieve Temporal Anonymity

Consider the following scheme. All voters vote with an alias identity that is extremely hard to trace back to them. Only long after they die could their alias identity and their voting record linked back to them.

Most people would probably be OK with the above voting scheme.

It turns out that by applying a trapdoor function, we can make it such that it is very easy to convert a voters true identity into their alias when they cast their ballot, but extremely difficult to decode their alias to their actual identity. When I say extremely difficult, I mean that it might take 200 years before we can solve for true identity given alias.

This scheme has the nice property of anonymity during the voter's lifetime, which what most people actually mean when they claim they want anonymity or privacy.

Privacy or anonymity doesn't usually do dead people any good.

In this context, the SHA-2 function would probably be a good candidate for a trapdoor function.

More Properties

Before we flesh out any more mechanics of the ideal process surrounding electronic voting, let's return to the properties of a fair and safe election.

First, let's consider some of the common problems that could arise during voting:

  • voting twice
  • fake ballots
  • false counts

There are probably more problems, but, following properties copied straight out of the paper satisfy what the paper defines as desirable election properties, as well as deals with the problems mentioned above.

  • Detection of individual fraud
  • Vote tally can be computed.
  • The set of eligible voters and voters who actually voted can be determined

We also assume implicitly the property of universal verifiability which means that any individual should be able to:

  1. Detect election fraud.
  2. Compute the vote tally.
  3. Know the set of eligible voters and actual voters.

Practical Considerations for Universal Verifiability

For small populations, it is not so difficult to enable the entire population to verify results. Imagine there are only 10 people who can vote. While 1 person votes, the other 9 stand and watch the voter making sure that there is no shady stuff going on. For example, perhaps the voter attempts to remove an already cast ballot from the lockbox. By giving the entire population complete insight into the voting process and mechanics, it becomes incredibly difficult for bad behavior to occur from any party. This includes, voters, vote counters, and even set-up/tear-down staff.

Providing complete transparency for a small population is very straightforward. Providing transparency for 300 million people is incredibly difficult.

Enter technology.

Technology Enabled Near-Complete Transparency

Every single part of the election process, down to the silicon in the voting machines should be well documented and public. During the actual elections, every single transaction should be logged into a tamper resistant commit chain. Block-chains are an easy way to do this.

The Scheme

Now we can address the good stuff. The scheme in its entirety. I present what I would consider an ideal or close to ideal scheme, although variations are allowed.

0. Auditable Machinery

Step-0 is auditable machinery. It is an absolute must. All machines that do hashing, vote consumption, vote tallying, public alias pooling, vote printing, and even sparse-space ID generation etc... All of them must be 100% auditable down to the silicon, as well as the software they run.

It is also advised that the machines run:

  1. Tamper resistant bootloaders.
  2. Exclusively provably memory safe software, which probably means running a small RTOS, perhaps written in Rust.
  3. Isolated hypervisor partitions when moving sensitive data to and from the outside world.

1. Registering to Vote

Everybody who wishes to vote is required to register to vote.

Registration would likely occur with multiple approved third parties. For the purpose of this example, let us say that there are two third parties, A and B.

During registration, an individual presents some sort of ID such as a drivers license number or passport number that can be validated to party A. The third party verifies that the number is authentic and that the individual is indeed eligible to vote. The third party then hashes the valid ID into an alias which is released publicly. The individual records their alias for their own personal records.

The individual then repeats this process with party B.

Parties A and B update their respective public alias pools in realtime and are required to make sure that they match at the end. Any entries that don't exist in both pools are deleted.

Since the alias pools are public, an individual can check if their alias exists in both pools. If not, the individual must attempt to register to vote again.

An individual who later attempts to vote, for whom a public alias does not exist, will not be allowed to vote.

Given that this process is a bit involved, it may make sense to allow registration to occur a year ahead of time.

2. Voting

The individual takes their personal ID to a voting machine. The machine computes the individual's alias, and checks to see if the alias exists in a public pool. If the alias does not exists, the individual may not vote.

If the alias exists, the machine checks a public commit chain to see if there already exists a vote entry tagged with the individuals alias. If the entry exists, the individual is not allowed to vote.

This means that either the individual is double voting, or someone earlier stole their ID and voted. Either way, the individual cannot vote in this case, thus, the individual is incentivized to keep their ID from being stolen, which is an entirely reasonable requirement.

If all goes well, the machine records the vote, and commits the vote to the public chain.

Since people like paper trails, the machine can also print a paper ballot to be cast into a lockbox. The paper ballot should have a signature at the bottom that is a hash of the character contents of the paper. The paper ballot should also have the individual's alias printed onto it.

3. Vote Counting

This one is very easy. Any individual can download the commit chain and run a simple script, perhaps a Python script to compute the vote results.

4. Bad Behavior

I wouldn't be surprised if issued passports or driver's license numbers have a dense distribution. This means it is possible to guess ID's fairly easily and compute their public alias.

An easy way around this would be to randomly issue unique new IDs during voter registration from a space much larger than 300,000,000 values such as the space of all integer values contained in a 4096 bit digits.

It would then be these new IDs that get hashed into the public alias pool, as well as these new IDs that are brought to the voting booths.

In a scheme where two parties are responsible for creating a public pool, only one of the parties would be required to issue the voting ID. The individual can then take this issued ID to the second party and have them update their public pool with the hash of the new ID.