Veil of Ignorance: Combating Manipulation with Artificial Empathy and Risk Avoidance

Hi Voting Theory people! I want to discuss the concept of the “Veil of Ignorance,” and ways that it might be used to mitigate the manipulation of voting systems via strategic voting.

Discussion on Manipulation in General

Manipulation of voting systems can occur in many ways that have been discussed at length on this forum. Manipulation is often described in terms of casting a “dishonest” ballot in order to gain a strategic advantage regarding the probable outcome of an election. This is a good way to acquire an intuition about manipulation, but it is an informal description that fails to capture the full scope. It has been correctly noted on this forum that the notion of an “honest” versus a “dishonest” ballot is poorly defined, and is at best a social construct that is both difficult to measure and up to debate. Furthermore, it is important to categorize manipulation according to scale, where, as a model, the predominant strategies are segregated by the boundaries of chosen coalitions of appropriate sizes. Coalitions can range in size from a single individual to a majority of the electorate, and it is important in game-theoretical analysis to select coalitions that can be used to effectively and efficiently model the functional interests, incentives, and influences of the groups within a population. Otherwise it will be almost impossible to reliably model electoral behavior.

In a society, it seems there will always be many diverse coalitions, so choosing the coalitions to analyze functionally will depend on the kind of information we want to glean about the society. I think an “ideal” voting system would be one where an analysis based on segregated individual interests could effectively predict outcomes. That is, one in which each voter is expected to cast a ballot that does not vary with the expected ballots cast by other coalitions of voters. We might say that we want ballots cast by individual voters to be invariant under changes in information about the behavior of external coalitions, i.e. the strategy of any single voter is not contingent or dependent on the strategies of other voters. Unfortunately, satisfying this goal completely is known to be impossible by Gibbard’s Theorem, but it may be possible to approximate the situation in a way that is stable and persists over time.

Here is a descriptive hypothesis: small coalitions tend to form spontaneously, and then as a reaction, other small coalitions form to counterbalance the advantages gained by the spontaneously formed coalitions. Eventually, those small coalitions will spontaneously coalesce into larger ones, etc., until one is left with a small number of large political parties. The analogy I have in mind is the freezing of a liquid—first, small, random impurities act as nucleation sites for solidification, and then those small solid particles act as more potent nucleation sites for further solidification, until the whole substance is frozen. A “solid” electorate is what we want to avoid. The ideal liquid electorate is impossible by Gibbard’s theorem—but perhaps a slushy or goopy electorate might be attainable. What we need to accomplish to achieve this is to somehow reduce the incentives that exist for individuals to form strategic coalitions, and also reduce the incentives that exist for smaller coalitions to coalesce strategically into larger ones.

As a somewhat obvious example, plurality voting does the exact opposite of what we want: it is essentially a catalyst for the solidification of the electorate, which is what “Duverger’s Law” recognizes. But regardless of the precise mechanisms that cause this unfavorable catalysis (which, in the case of plurality voting, I think can mostly be pinned on “vote splitting”), the catalysis itself is, in my opinion, the crux of its failure.

The Veil of Ignorance

To try to counteract coalition formation and manipulation, I believe we must be able to somehow cause voters to empathize with each other on a large scale. Smaller coalitions are less likely to form among a healthy group of friends, for example, since each friend tends to naturally empathize with each other friend, and also tends to consider each other friend’s well-being as an influential interest regarding their own behavior. Small coalitions will form as small groups empathize amongst themselves, and as their members fail to empathize in a relevant way beyond the confines of their coalition.

The Veil of Ignorance can be used, on one hand, to require voters to put themselves in the position of a “typical” voter. Here is an example to illustrate how this might be accomplished, where manipulations are for the moment ignored:

Suppose a group of friends are all trying to decide on where to go to dinner. Several different restaurants are suggested by the group, and trying to be fair they put the matter to a vote. Because they don’t want any bias and want to make sure everybody is reasonably pleased with the result, they do the following: First, each of them (in good faith) scores each restaurant with an integer from 0 to 5, with 0 meaning the restaurant is tied for their least favorite, and differences between scores corresponding to degrees of preference. Next, the friends look only at the distributions of the scores, without reference to the name of the restaurant each belongs to, and vote together on the distribution. The restaurant with the winning distribution is where they all agree to go.
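The first two steps of the procedure (scoring, then hiding names behind the score distributions) can be sketched roughly as follows. This is only an illustration under assumptions of mine: the restaurant names, the three ballots, and the helper names `score_distributions` and `anonymize` are made up, and every voter is assumed to score every restaurant.

```python
import random
from collections import Counter

def score_distributions(ballots, max_score=5):
    """Return {option: [number of 0s, number of 1s, ..., number of max_score]}.

    Assumes every ballot scores every option with an integer in 0..max_score.
    """
    options = sorted({name for ballot in ballots for name in ballot})
    dists = {}
    for name in options:
        counts = Counter(ballot[name] for ballot in ballots)
        dists[name] = [counts.get(s, 0) for s in range(max_score + 1)]
    return dists

def anonymize(dists, rng):
    """Shuffle the options and replace their names with neutral labels.

    Returns the veiled distributions and the secret key (label -> real name).
    """
    names = list(dists)
    rng.shuffle(names)
    key = {f"Option {i + 1}": name for i, name in enumerate(names)}
    veiled = {label: dists[name] for label, name in key.items()}
    return veiled, key

# Hypothetical ballots, for illustration only.
ballots = [
    {"Thai": 5, "Pizza": 2, "Sushi": 0},
    {"Thai": 3, "Pizza": 5, "Sushi": 0},
    {"Thai": 0, "Pizza": 4, "Sushi": 5},
]
dists = score_distributions(ballots)
veiled, key = anonymize(dists, random.Random())

for label, dist in veiled.items():
    print(label, dist)  # what the friends see before the second vote
```

The friends would then vote on the labels in `veiled`, and `key` is only consulted once a winning distribution has been chosen.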

Notice that by ignoring the name of the restaurant when voting for distributions, the friends have been able to eliminate some degree of self-interest—or rather, they have been able to re-direct it. A reasonable friend will choose a distribution that balances his own personal risks of going to a restaurant that he gave a low score with the reward of going to a restaurant that he gave a high score. Assuming that no voter can identify restaurant names corresponding with certain distributions, each voter will have been placed in a state of “artificial empathy” with all of the other voters.

On the other hand, the Veil of Ignorance can be used to discourage dishonesty! Here is an elaboration: Consider the same friends trying to go out to eat, only this time, they are worried that some friends might score “dishonestly” to manipulate the end result of the vote—for instance, several of the friends might agree to score their mutually favored restaurant as a 3, and then vote for the distribution with the most 3s. So they come up with the following idea: with a 50% chance, they will vote on the distribution themselves, and with the other 50% chance, some impartial algorithm will determine the winning distribution. This way, if any particular friend wants to avoid risk, they must construct a ballot that is likely to yield a decent outcome for them in either situation. If the algorithm operates such that higher scores are more favorable for the possibility of going to each restaurant, then this dramatically discourages the dishonest voting that was described above. There is still the possibility that a voter might be able to identify which of the restaurants correspond to certain distributions, but with many voters this becomes unlikely.
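A rough sketch of that 50/50 mechanism is below. The post does not say which impartial algorithm or which voting method on distributions would be used, so both are assumptions here: the algorithm picks the highest total score (one rule for which higher scores are always more favorable), and the friends' own vote is a simple plurality over the anonymized labels. The labels, distributions, and the function name `decide` are hypothetical.

```python
import random
from collections import Counter

def total_score(dist):
    """Sum of scores for a distribution given as counts indexed by score."""
    return sum(score * count for score, count in enumerate(dist))

def decide(veiled, distribution_votes, rng, p_algorithm=0.5):
    """Pick a winning label.

    With probability p_algorithm, a fixed rule (highest total score, an
    assumption) decides; otherwise the plurality of the voters' choices among
    the anonymized distributions decides. Ties go to the first label seen.
    """
    if rng.random() < p_algorithm:
        winner = max(veiled, key=lambda label: total_score(veiled[label]))
        return winner, "algorithm"
    tally = Counter(distribution_votes)
    return tally.most_common(1)[0][0], "vote"

# Hypothetical anonymized distributions (counts of scores 0..5) and votes.
veiled = {
    "A": [2, 0, 0, 0, 0, 1],
    "B": [0, 0, 1, 0, 1, 1],
    "C": [1, 0, 0, 1, 0, 1],
}
votes = ["C", "C", "B"]

print(decide(veiled, votes, random.Random()))
```

A voter who inflates or deflates scores to steer the second vote also shifts the totals that the algorithm branch would use, which is the risk the paragraph above describes.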

To generalize this procedure, it may also be a good idea to keep secret which particular voting system will be used, i.e. to allow voters the information about which systems will potentially be used, but to select one at random without their prior knowledge. Then, if the systems are chosen carefully, the risk of using an inappropriate manipulation tactic for the randomly chosen system may also discourage manipulation in general.

That’s all I have to say for now. Thanks for reading and I’m definitely interested in any thoughts on this subject. I don’t think we want to be hooking voters up to polygraphs, but we want voters to be “honest,” so it’s an issue.

I mean if you really want honest voters, proxy and asset systems would probably be an easier route.

Yes, I’ve considered proxy voting. I don’t know if it is palatable; it seems vulnerable to bribery and raises the question of how those representatives ought to be selected. I would prefer no proxies and as little room for electoral control by the candidates as possible.

Despite this maybe being lengthy, I don’t think it’s really all that difficult. Furthermore, asset voting can’t be used for candidates that aren’t people or groups (or artificial negotiators).

I think this is all quite interesting. If it were all about wanting honest ballots, you could just go for the random ballot option!

But generally, I think it would be difficult to go into this sort of vote without the friends knowing enough about each other’s preferences to have a good chance at guessing which restaurant has which score distribution.

Edit - also what voting method do they use for the distributions?

I think it could be interesting. I know that many people are averse to nondeterministic systems, but I would rather have a nondeterministic system that provides consistently good results than a deterministic system that doesn’t.

What do you mean by going into this sort of vote? It would be difficult for the friends to identify the restaurants based on their distributions, but it might be possible for example if some large-enough coalition purposely indicated a characteristic score for each restaurant. Or for example, if a voter indicated a 3 for some restaurant, and only one distribution showed a rating of a 3, then they would be able to reason out the identity of that restaurant.
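To make the identification risk concrete, here is a small sketch that lists, for one voter, which anonymized distributions are even compatible with the score that voter gave each restaurant. The ballot, labels, and the helper name `compatible_labels` are hypothetical, and it only checks the simplest condition (the voter's own score must appear in the distribution).

```python
def compatible_labels(my_ballot, veiled):
    """For each option, list the labels whose distribution contains my score.

    If an option is compatible with exactly one label, the veil is lifted
    for that option as far as this voter is concerned.
    """
    return {
        option: sorted(label for label, dist in veiled.items() if dist[score] > 0)
        for option, score in my_ballot.items()
    }

# Hypothetical data: counts of scores 0..5 for three anonymized options.
veiled = {
    "A": [2, 0, 0, 0, 0, 1],
    "B": [0, 0, 1, 0, 1, 1],
    "C": [1, 0, 0, 1, 0, 1],
}
my_ballot = {"Thai": 5, "Pizza": 2, "Sushi": 0}

matches = compatible_labels(my_ballot, veiled)
for option, labels in matches.items():
    print(option, labels)
```

In this toy case only one distribution contains a 2, so the voter who gave Pizza a 2 can tell which distribution is Pizza's. With many voters, every score bin tends to be populated and this check narrows things down much less.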

Random noise could be used to eliminate that possibility, but it shouldn’t be too noisy or else the distributions become unreliable.

The voting method for the distribution is up for discussion. I think even a simple plurality vote would be effective. You would expect there to be a normative, rationally superior distribution based on how typical individuals process risks.

I mean approaching the vote. Before they have the vote, it’s not as if they’re strangers. They probably largely already know each other’s preferences.

That’s true for the friends, but for people who are not friends this kind of system would essentially force them to consider the distribution of preferences over all of the voters for each candidate. A small enough group of friends should in principle be able to come to an agreement about where to go without the formalities of a voting system anyway. They should have an informal, sociable system that works for them.

The idea of voting twice may be a good one, precisely because of the information the first vote provides (even your anonymized distributions are new information). And randomness may be useful for the same reason (namely fairness) deterministic ways of dividing the office are useful; however, I don’t see how the introduction of ignorance (either by anonymizing the distributions or by randomizing which selection method is used) improves the outcome.

As for the problem of anonymizing the distributions, it’s two-fold. First, even if you didn’t vote, your knowledge of others helps you identify the candidate by his scores. Second, even if you can’t identify the candidates, when you go to evaluate a distribution, the weight you give a given score is going to be equal to the probability of its being yours, which is the number of candidates you gave that score to divided by that score’s total number of appearances.
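The weighting described in the second point can be written out as a short sketch. The weight for each score follows the definition in the paragraph above; turning those weights into a single estimate for a distribution (a normalized weighted mean) is my own assumption about how they would be used, and the data and function names are hypothetical.

```python
from fractions import Fraction

def score_weights(my_scores, dists, max_score=5):
    """Weight of each score value = (candidates I gave it to) / (its total appearances)."""
    mine = [0] * (max_score + 1)
    for s in my_scores:
        mine[s] += 1
    totals = [sum(d[s] for d in dists) for s in range(max_score + 1)]
    return [
        Fraction(mine[s], totals[s]) if totals[s] else Fraction(0)
        for s in range(max_score + 1)
    ]

def weighted_estimate(dist, weights):
    """Weighted mean score of one distribution (the normalization is an assumption)."""
    mass = sum(count * weights[s] for s, count in enumerate(dist))
    if mass == 0:
        return None  # none of my scores could be in this distribution
    return sum(s * count * weights[s] for s, count in enumerate(dist)) / mass

# Hypothetical distributions (counts of scores 0..5) and one voter's own scores.
dists = [
    [1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1],
    [2, 0, 0, 0, 0, 1],
]
my_scores = [5, 2, 0]

weights = score_weights(my_scores, dists)
estimates = [weighted_estimate(d, weights) for d in dists]
print(weights)
print(estimates)
```

Scores the voter never used get weight 0, so they drop out of that voter's evaluation entirely.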

As for anonymizing the distributions via random noise, I see no basis for your assumption that there is a Goldilocks zone where both the anonymity and the reliability of the distributions are high; it’s a trade-off whose optimum may well be 0 noise. It’s more feasible, I think, that reducing the distributions to their p-means would partially anonymize them while fully preserving their reliability. After all, isn’t that what voters would be doing anyway if you somehow showed them the true scores without tipping them off to whom they belonged to? Judging a distribution by the p-mean appropriate to their level of what you call risk-aversion and I call instability-aversion?
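For readers unfamiliar with p-means, here is a minimal sketch of reducing a score distribution to its power mean. The handling of zero scores for p <= 0 (returning 0) is a convention I am assuming, since a 0 to 5 scale includes zeros; the example distribution is hypothetical.

```python
import math

def power_mean(dist, p):
    """Power mean (p-mean) of the scores in a distribution of counts indexed by score.

    p = 1 is the arithmetic mean, p = 0 the geometric mean, and lower p
    weights low scores more heavily. For p <= 0 any zero score forces the
    mean to 0 (the limiting value), which is the convention assumed here.
    """
    n = sum(dist)
    if n == 0:
        raise ValueError("empty distribution")
    if p <= 0 and dist[0] > 0:
        return 0.0
    if p == 0:
        return math.exp(sum(c * math.log(s) for s, c in enumerate(dist) if c) / n)
    return (sum(c * s ** p for s, c in enumerate(dist) if c) / n) ** (1 / p)

# Hypothetical distribution: one voter each gave scores 2, 4 and 5.
dist = [0, 0, 1, 0, 1, 1]
for p in (1, 0, -1):
    print(p, round(power_mean(dist, p), 3))
```

Lowering p pulls the summary toward the lowest scores, which is how a single number can stand in for a more risk-averse (or instability-averse) reading of the same distribution.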

The hope is of course that their risk- or instability-aversion will result in a fairer outcome than if we, say, automatically chose the candidate with the highest total/average score (i.e. p=1). But if instead you revealed which candidates were which, the voters would just score the candidates again, only now with knowledge that might help them calibrate their approval (i.e. min-max) thresholds better (i.e. less dependently on irrelevant candidates), resulting in a more even distribution of influence. And the winner of that vote needn’t be the candidate with the highest total/average score. Indeed, if these voters are risk-averse with respect to scores, why would they have agreed to use p=1 to decide the outcome of the second vote in the first place?

As for randomizing the selection method, you did not demonstrate that it incentivizes honesty. To a rational voter, the candidate’s prospective rating would simply be the average of his prospective ratings for the methods that might be selected. Even if the number of methods leaves him with only enough time to roughly estimate the prospective ratings for each method, wouldn’t that be more rational than voting with utter disregard for the consequences of his vote? Randomization randomizes the gains from honest and strategic votes alike, so no amount of risk aversion would make it incentivize honesty.

I can’t help but feel like that isn’t going to be very good information. It seems like you would not be using all of the relevant information revealed to you by the distributions themselves. It also doesn’t seem to take into account risk at all, i.e. the risk of electing a candidate you gave a low score versus one you gave a high score. If you could try to explain this to me in terms of conditional reasoning I would appreciate it.

The introduction of ignorance at least partially disintegrates the strategically aligned block majority (or in fact any strategically aligned group) back into the more diffuse voter pool. This will at least partially mitigate majoritarianism, which I think is an important thing to attempt, since I believe majority rule is too pathological to give distributionally just outcomes enough of the time.

You mention p-means—do you mean taking the Lp norm? I think reducing the distribution to a single parameter defeats the whole purpose of this system, which encourages voters to deal with the raw information about how many voters scored a candidate into whatever bin.

Also, I take issue with this definition of a rational voter. A rational person will not simply ignore the distribution and evaluate based on its mean. The mean is only useful for making predictions about repeated experiments, but in the case of this system, the experiment is only performed once, and in the future the distribution will have changed.

If your knowledge of others enables you to use even the scores that aren’t yours to help you estimate the probability a given distribution belongs to a given candidate, all the better. If not, you could at least weight each score equally to the probability of its being yours; a score that’s not yours is irrelevant in directly estimating the value of a distribution, for whatever concern you have for others would have influenced your scores in the first place. If you’re risk averse with respect to scores (a dubious assumption as, again, your risk aversion influences your scores themselves), that wouldn’t change the weights, it would change the estimator (e.g. from mean to geomean).

Majority rule has the same outcome as consensus rule when voters are individually rational. This is because any unfair majority proposal is, by definition, vulnerable to defection by a subset of that majority to an alternative majority that offers it a better deal. A tyrannical majority is thus inherently unstable without the introduction of kin-selected altruism and spite. But there are two ways to manifest altruism and spite. The first is preferentially allying with relatives; the second is differentially factoring others’ interests into the perceived utilities of the candidates in the first place. Even if your veil successfully eliminated the first option, all that would do is guarantee the second.

But you don’t want to show the raw information anyway. You want to add noise because you understand showing the raw information lifts the veil. I think power means would give more useful information at less cost to the veil than distorted data, but you may be right; hiding the raw data may defeat the purpose, meaning the veil of ignorance is impractical.

A prospective rating is not a prediction, it’s a measure of the value of voting for a given candidate. And it’s already based on utility, which factors in risk aversion, so the probability-weighted mean is the appropriate total. Only when the outcome is defined by the resource bet are outcomes aggregated differently depending on whether the bet is repeated, and even then the difference is the opposite of what you indicated (see the Kelly criterion). But regardless of how best to aggregate conditional prospective ratings, it’s obviously better than voting honestly.

Nothing on that page backs you up. In fact, neither “repeated” nor any other derivation of “repeat” appears anywhere.

The point of this system is to increase the incentives for voters to select a candidate with a favorable distribution for everyone. Just like evaluating a musician based only on how they sound by having them play behind a curtain. If you are familiar with the musician you may be able to guess their identity based on certain quirks or tone or impressions, but it’s still more ethical—and practical—to try to eliminate certainty in that regard.

The very fact that the expected value is used intrinsically connects your prospective rating with a long-run prediction. You can read this if you would like:

https://www.quora.com/q/vxyolmbprxfkpixg/Expected-Values-Convolutions-and-the-Central-Limit-Theorem-for-Continuous-Random-Variables?ch=10&share=0274736d

Regardless, do you think it would be better then to display the distributions to a group of representatives who abstain from voting, and who then must come to a transparent decision about which distribution to elect?

I find that analogy problematic for two reasons. First, while anonymizing the musicians would not interfere with the evaluation of that night’s performance, it arguably would interfere with a more analogous evaluation: how well the musician is expected to perform in the future. Second, is it not an important difference that a distribution is itself composed of evaluations? Why would you expect the method, agreed upon in advance, of automatically selecting a winner based on those evaluations to be worse than evaluating the distributions of evaluations and then automatically selecting the plurality winner (or whatever) of those second-order evaluations? Is it possible to make an argument for second-order evaluations that doesn’t beg the question, why stop there?

I didn’t see “The mean is only useful for making predictions about repeated experiments” or anything similar in there. Could you point to the right section? Anyway, I don’t see what the appropriateness of the mean has to do with the question at hand. Strategic voting is based on probabilities and occurs in an experiment performed once (in the future, the other votes and even the candidates and voters will have changed) even when you don’t randomize the selection method. Why would randomizing the selection method mean the difference between honest and strategic voting? It would barely complexify the problem; even if it greatly complexified it, it wouldn’t matter, because real strategic voters use heuristics rather than actually solving the problem.

Maybe, provided the representatives were sufficiently ignorant of others’ preferences and voting intentions. Maybe you could sequester them like a jury.

Yes, it does beg the question. You could iterate it indefinitely, but the reason to stop is that voters don’t want to vote many many times.

Also, I don’t think it is an important difference that the distributions are composed of evaluations. We want a candidate with a distribution that is pleasing to the electorate as a whole. You could argue that the music one hears is also composed of evaluations, i.e. musical evaluations, where the judges observe and then evaluate certain aspects of the music that they like or that they don’t like. Only in this case, the evaluations would hopefully be made with regard to how well the candidate pleases the whole of the electorate compared with the other candidates, which, if the incentives are correct, should be reflected in the distribution of scores given to that candidate relative to the others.

Also I would expect a totally formal method to do worse than a method that incorporates certain social informalities in an appropriate way, since society is not really a formal construct. Often a “majoritarian” winner of the first round of voting would not be the majoritarian winner of the second round, which would be an indication that the Veil of Ignorance successfully disintegrated those block majorities and provided a more distributionally just outcome. As more rounds of voting on distributions are introduced, the majoritarian outcome should logically converge to the election of a fixed candidate.

This is a consequence of the definition of the expected value as made in that post, which turns out to be logically equivalent to the usual computational definition whenever the integral value defining the usual expected value exists, and a generalization of it when the computational definition fails—for example, it gives an expected value of 0 for the standard Cauchy distribution, whereas the computational definition is not adequate to do so since the integral does not converge.
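For anyone who wants to check the Cauchy remark, the standard calculation (textbook material, not taken from the linked post) uses the density f(x) = 1/(π(1+x²)). The one-sided integral diverges, while the symmetric limit is 0:

```latex
\int_0^{\infty} \frac{x}{\pi(1+x^2)}\,dx
  = \lim_{b\to\infty} \frac{1}{2\pi}\ln\!\left(1+b^2\right) = \infty,
\qquad
\lim_{b\to\infty} \int_{-b}^{b} \frac{x}{\pi(1+x^2)}\,dx = 0 .
```

So the usual expectation integral does not converge absolutely, even though a symmetric notion of center gives 0.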

Sorry if that post is a bit wordy. But the definition of the expected value includes already the construction of a random variable that is the sum of a number of independent copies of the random variable in question. The question that the expected value (approximately, i.e. to first order) answers is, “If I take N copies and add them together, for what value E will it be equally likely to observe a sum smaller than E as to observe a sum greater than E?” If you follow through the analysis, you will see that it requires N to grow large so that convergence of the sample mean to the expected value can occur, or equivalently for the sample mean distribution to converge to one that is Gaussian. What that means is that many independent copies of the random variable must be observed, and that means (1) it must in fact be the same random variable, i.e. the “same game” must be played, and (2) that same game must be repeated many times. Not even the first criterion is met for a voting system except in an abstract way that would need to be explicitly elaborated and modeled, for example perhaps regarding “types” of candidates and how they relate to each other and the electorate.

I do think I like the idea of a jury more, although you would have to prevent the members of that jury from casting a ballot.

Also, I want to illustrate a point having to do with your analysis of the problem the voter is faced with when they cast a second ballot. Even if the voter had his ballot removed entirely, the distributions would hardly change at all, given that the electorate is large. So how is it that he can use his own ballot to obtain information about the candidates or even to evaluate which candidate is more likely to be his favorite, except in cases where its removal has a decisive effect on the distributions in some way? After all, the situation is virtually identical to having been sequestered to form a member of the jury, and hence not having your vote counted at all. But the abstinent jury does totally solve the problem of noise.

I’d call it a formal construct. An electorate is the contents of a political geographical area, which is hardly organic. And communities close-knit enough to have functional informal customs hardly need a social planner to come in and replace them by incorporating them into a system premised on the notion that an informed individual cannot be trusted. Or maybe I’m missing something. How would you incorporate these informalities?

That’s just as true for repeated balloting without a veil. Granted, the veil would cause the converged-upon candidate to be better if the original votes were honest, but I see no reason why they would be.

Whatever uses that question may have, strategy isn’t one of them. In strategy, it matters how much smaller or greater the sum will be if it is. I’m going to take a one-off bet of $1 on a 49% chance of winning $1 trillion without hesitation, despite the fact that the outcome is likely to be a loss, and I think you’re smart enough to do the same.
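The arithmetic behind that bet, as a quick sketch. I am assuming the $1 stake is simply lost on a loss and the $1 trillion is the net gain on a win; the function name is made up.

```python
def one_off_bet(stake, prize, p_win):
    """Return (expected net gain, most likely net gain) for a single bet."""
    expected_net = p_win * prize - (1 - p_win) * stake
    most_likely_net = prize if p_win > 0.5 else -stake
    return expected_net, most_likely_net

expected_net, most_likely_net = one_off_bet(
    stake=1, prize=1_000_000_000_000, p_win=0.49
)
print(f"expected net gain:    ${expected_net:,.2f}")
print(f"most likely net gain: ${most_likely_net:,}")
```

The most likely outcome is losing the dollar, yet the expected gain is roughly $490 billion, which is the gap between "which outcome is more likely" and "how much is at stake" that the paragraph above points to.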

The impact on the distribution is small, but the impact on strategy is great. A score you didn’t give to any candidate is not yours and is thus irrelevant. A score you did give to some candidate(s) might be yours and is thus relevant; its import is equal to its probability of being yours. If you didn’t vote, all scores have zero probability of being yours, so the question becomes, do you love the average other person or hate him? If you love him, you’ll choose the best distribution; if you hate him you’ll choose the worst. Even if we assume voters are more altruistic than spiteful, I worry that the veil reduces the incentive to put effort into the decision.