Wolf Committee Results

It was quite a while ago that I went through the thought process that convinced me that Ebert’s Method needed the KP-transformation rather than dealing with raw scores, but thinking about it again now, it was largely because if a candidate was elected with mainly less-than-max scores, then the voters would still have the same load as if they’d given a max score. For a really simple example, all voters could vote the same way (max 10, 2 to elect):

All voters: A=10, B=10, C=1, D=1

For any candidate that gets elected, the load would be spread equally among the voters, so CD would be the same as AB. It has to be that way for the total load to be spread out among the voters. So the KP-transformation (or something like it) is needed to make it known that C and D each only have a tenth of the support. I know you could argue that in practice there will be a range of scores for each candidate and it’s unlikely to make much difference overall, but it just says to me that it’s not The Right Way, so to speak.
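
To make this concrete, here is a small sketch (assumed details: any positive raw score counts as support in the untransformed case, each winner's unit of load is spread equally over its supporters, and Ebert's measure is the sum of squared voter loads; function names are made up):

```python
def kp_transform(ballots, max_score):
    # Layer k (k = 0 .. max_score-1) approves every candidate scored above k.
    return [{c for c, s in b.items() if s > k}
            for b in ballots for k in range(max_score)]

def ebert_cost(approval_ballots, committee):
    # Each winner spreads one unit of load equally over its approvers;
    # Ebert's measure is the sum of squared voter loads.
    loads = [0.0] * len(approval_ballots)
    for w in committee:
        supporters = [i for i, b in enumerate(approval_ballots) if w in b]
        for i in supporters:
            loads[i] += 1.0 / len(supporters)
    return sum(l * l for l in loads)

voters = [{"A": 10, "B": 10, "C": 1, "D": 1}] * 5   # five identical voters
raw = [{c for c, s in b.items() if s > 0} for b in voters]
split = kp_transform(voters, 10)

print(ebert_cost(raw, ["A", "B"]), ebert_cost(raw, ["C", "D"]))      # identical
print(ebert_cost(split, ["A", "B"]), ebert_cost(split, ["C", "D"]))  # AB far lower
```

On the raw ballots the two committees are indistinguishable; after the KP transform only a tenth of each voter's layers support C and D, so the CD committee's load concentrates and its cost is much higher.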

That is a very valid point. It seems we are in a bit of a weird bind.

Without the KP transform, the total score is lost in the normalization. The metric is really only checking the variance of the support across voters, so this is somewhat an “as designed” problem.

With the KP transform, each voter is split into MAX_SCORE voters. The metric then checks the variance across the split voters, not across the voters themselves. The voters could be reconstructed as the sum of the squares of their scores, but this seems somewhat unnatural. There must be a better way to move to score.

Is there a way we could translate this into the more common framework we have been putting each voter into so far? For example, consider the following steps:

  1. Select Score winner
  2. Calculate the deviation from the ideal load C/V for each voter, i.e. C/V − load
  3. Set the ballot weight to C/V − load
  4. Return to step 1

Maybe we do not want to have negative ballot weights. You could just take people out of influencing the election once they reach C/V by setting the ballot weight to Max(C/V − load, 0). I know this is pretty far from the original system, but it might just be the case that Ebert’s method is really only suited as an Optimal Approval system. Thoughts?
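
A sketch of those steps (the rule for assigning load is not pinned down above, so I am assuming a Phragmen-style rule where each winner's unit of load is split equally among the ballots that scored it positively; names are hypothetical):

```python
def deviation_sequential(ballots, candidates, seats):
    # ballots: list of dicts, candidate -> score
    n = len(ballots)
    ideal = seats / n                      # the ideal load C/V
    loads = [0.0] * n
    winners = []
    for _ in range(seats):
        # Step 3: ballot weight = max(C/V - load, 0), per the suggestion above
        weights = [max(ideal - loads[i], 0.0) for i in range(n)]
        # Step 1: elect the weighted Score winner
        totals = {c: sum(weights[i] * b.get(c, 0) for i, b in enumerate(ballots))
                  for c in candidates if c not in winners}
        w = max(totals, key=totals.get)
        winners.append(w)
        # Step 2 (assumed rule): spread the winner's unit of load equally
        # over the ballots that gave it a positive score
        supporters = [i for i, b in enumerate(ballots) if b.get(w, 0) > 0]
        for i in supporters:
            loads[i] += 1.0 / len(supporters)
    return winners

ballots = [{"A1": 10, "A2": 10}] * 3 + [{"B": 10}] * 2
print(deviation_sequential(ballots, ["A1", "A2", "B"], 2))  # ['A1', 'B']
```

In the toy run, after the A bloc elects A1 its ballots are nearly exhausted (load 1/3 against an ideal of 2/5), so the B bloc takes the second seat.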

It might seem intuitively unnatural, but I think it works well in practice, and I would argue that it makes RRV better as well, and also converts psi voting into harmonic voting, which reduces to score voting for a single winner (which I see as a good thing). It also generally preserves what I think it should preserve - primarily scale invariance and Pareto domination. I don’t think splitting voters in this way causes the loss of any important information that could cause bad results. But basically I don’t consider the fact that it seems unnatural as a significant negative.

Because of what I put above, I’d deem it unnecessary to make changes. Certainly from the perspective of the KP-transformation, but if it’s too awkward to code then obviously that’s another matter.

Coming back to this, it may be the case that you’d consider my examples not too bad for Sequentially Spent Score or for Hamilton more generally. However, philosophically I see Hamilton as the wrong direction in the first place, and would consider highest averages/divisor methods (e.g. D’Hondt or Sainte-Laguë) a far superior starting point. So instead of going with a Hamilton method and arguing that the wrong/weird results it throws up aren’t really that bad, you can cut that out by going straight for highest averages.

With Hamilton, you are building failure of IIA into the system from the start. I’d also question using Hare quotas (or quotas generally) in the way Sequentially Spent Score does. I presume a Hare quota in this case just means 1/s of the maximum possible score for a candidate, where s is the number of seats. In cases where lots of voters only give scores to candidates that have no chance of being elected, the Hare quota and the surplus handling may never come into play. But with exactly the same credible candidates and the same ratio of support for them, just without the “non-entities”, the quota might be hit more often, the reweighting would change, and a different result could come out.

And this comes back to highest averages vs largest remainder. Under highest averages party list systems, it doesn’t matter if there are smaller irrelevant parties “stealing” the quota, because it’s not used so they pass IIA. Under largest remainder, irrelevant alternatives make a difference because they change the proportion of the quota that the other parties reach.

I understand that vote unitarity came about because you didn’t like RRV’s reweighting system. But I actually think that this was an intuitive dislike and that it is actually wrongheaded. That’s not to say that I like RRV generally, but highest averages methods such as D’Hondt and Sainte-Laguë work by using these divisors, and I would say that they are mathematically correct. So it’s not the divisors that are wrong, but other things that RRV does.

Basically I would struggle to back any system that doesn’t reduce to a highest averages/divisor method for party voting. It’s the basic starting point to build on and that’s the easy part. Getting a voting system to behave in the manner we would like when there isn’t party voting is the hard part. But I don’t see the point in deliberately falling at the first hurdle.

And the final thing I want to say is about the metrics. I would say that the most important thing is to come up with a sensible definition of proportional that works for non-party voting, and find a system that obeys this as well as other basic criteria (or at least fails as little as possible) such as monotonicity, IIA etc. Because if it’s these other metrics you want to optimise, you might not necessarily be chasing a proportional method. So I see them all as a red herring really.

Well, the other consideration that I would take seriously is strategic voting. Basically my idea would be to find the “best” optimising system you can using the criteria of proportionality etc. and then use this as the measure of all systems you’re testing under various strategic assumptions and see which comes out best overall. This “best” system might not win because it might come out poorly by its own measure when voters are strategic and when candidates are elected sequentially.

OK, point taken. This brings up a whole new way to make Sequential Systems from Optimal Systems. It is sort of a bootstrap method which is recursive: you find the Optimal System answer for 1 winner, then find it for 2 while keeping the first fixed, then for 3 keeping the first 2 fixed, and so on.
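
The bootstrap can be written generically against any optimal objective (the cost function below is a made-up toy, just to show the greedy fixing of earlier winners):

```python
def sequentialize(cost, candidates, seats):
    # Greedily extend the committee: at each step, fix the winners so far
    # and add whichever candidate minimizes the optimal objective for the
    # enlarged committee.
    committee = []
    for _ in range(seats):
        best = min((c for c in candidates if c not in committee),
                   key=lambda c: cost(committee + [c]))
        committee.append(best)
    return committee

# Toy objective (hypothetical): maximize total committee score,
# i.e. minimize its negative.
scores = {"A": 5, "B": 3, "C": 4}
print(sequentialize(lambda com: -sum(scores[c] for c in com),
                    list(scores), 2))  # ['A', 'C']
```

Any optimal measure (Ebert, Harmonic, Monroe) could be dropped in as `cost`; the sort order of the winners then comes from the sequence, not from their scores.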

This is a fundamentally new concept for me. I think it is good, but it is worth pointing out that it is different. For example, applying this method to Harmonic Voting does not produce RRV. The sort order will be different in general, since one bases it on the sequence and the other on the score. If we are going to do it this way for Ebert’s method, why not Harmonic or Monroe?

Also, there is a variant I referenced in another post called Sequential Phragmen which might actually be better. Unless it is the same in which case you should write to the authors. It just seems like a little research still needs to be done here before we code something up.

In any case, it is a bunch of coding to add methods which run in this way. I do not have the time to do that now, so I think it might be best if I run with the existing code to get results and we can circle back later. Hopefully this iteration can exclude some systems. Does that seem fair, or would you rather wait to have this added before the next run?

Sequentially Spent Score does not fail IIA. Or at least not as it is defined here.

Yes it is a Hare quota of score. This is 100% consistent and actually inspired by the KP transform so I would think that you would like it. A 5-score system under KP-Transform turns every ballot into 5 approval ballots, right?

The Balinski–Young theorem works out in a way that you can either satisfy the Quota Rule with Hamilton or avoid the Alabama and Population Paradoxes with divisors. In the situation where this theorem arose, the Population and Alabama Paradoxes are real problems; I do not see how it makes sense to consider them a problem in the situation we are talking about. The number of representatives for a state is a fundamentally different problem than representing people. Only the quota rule matters in our case, and in that situation you want Hamilton. By using quotas of score we get a finer-grained decision and a better result.

I am not sure that maximizing proportional representation under any definition is best. I would say satisfying utility within the constraints of proportional representation is better. The other argument is to maximize total utility within these constraints, the difference being the addition of vote unitarity in the former. Vote unitarity implies passing a quota rule at the score level, not the whole-ballot level. Sequentially Spent Score is monotonic, passes IIA, and is stricter about PR than other systems. The example you gave above is a good one: you can have a total utility of 62 and satisfy 31 people with {A,A}, or have a total utility of 41 and satisfy 41 people with {A,B}. I admit that it is a bit axiomatic which is better, but I think we should look at the frequency of each situation for each method.

I do not have a good way to tackle this problem

I knew it wouldn’t be the same for RRV because of the way it elects the candidate with the highest score first rather than the one which maximises the harmonic sum, but I think it might be the same if you use KP + SPAV. In any case, I think it is the most logical way to convert from optimal to sequential. Reweighting works if the two produce the same result, or if there isn’t an optimal system that it’s based on.

I will have a look through the papers you’ve linked to in that post.

I think it’s fine to do it as you are at the moment, especially if you’re intending to do a run fairly soon.

Yes, sorry - it doesn’t fail IIA. My mistake. But it does fail what I consider to be the worst sort of IIB, where adding ballots that only score/approve candidates that have no chance of being elected (essentially blank ballots) can change the quota and therefore the result.

I’m not sure I see the connection between using a Hare quota and the KP-transform. But my problem with using the Hare quota (or any quota) is the irrelevant ballots that I mentioned above.

I suppose the Alabama paradox is not the main problem but the IIB failure I mentioned. I don’t think the quota rule is something that is essential to be passed, as long as what @AssetVotingAdvocacy called the relevant quota rule is passed.

I know you didn’t think the IIB failure example I gave was a concern, so I might think about other examples. But I suppose we probably have different philosophies about this.

Maybe this is right and as things stand I haven’t come across any method at all that I feel is definitely The Right Way, so it does make sense to evaluate the existing methods on a variety of criteria.

That’s fair enough then.

Please do. It is pretty cutting edge stuff and you know more about the Phragmen theory than me.


If you have 100 voters in a score-5 system, the KP transform would turn this into 500 approval ballots. A quota for 5 winners goes from 100/5 = 20 to 500/5 = 100. What I am doing is counting score instead of ballots and using the quota of 100. It is somewhat of a half step to the KP transform: I lower the granularity to the KP approval level, but I still keep track of each voter individually. I understand your issue with quotas and am also troubled by some of their properties, like free riding.
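
The quota arithmetic from that paragraph, spelled out:

```python
# Assumed setup from the example: 100 voters, scores 0-5, 5 winners.
voters, max_score, seats = 100, 5, 5

ballot_quota = voters / seats             # 100 / 5 = 20 ballots per seat
kp_quota = (voters * max_score) / seats   # 500 / 5 = 100 of the split ballots
# The "half step": keep whole voters but count score, using the same quota
# of 100 score points per seat.
print(ballot_quota, kp_quota)             # 20.0 100.0
```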

Is there documentation on IIB? I have heard people talk about it, but I have never really understood why it would be something to be bothered by. These irrelevant ballots come from real voters, so ignoring them seems wrong.

All divisor methods fail the quota rule. This “relevant quota rule” is not really part of the standard literature; it looks like a property invented so that divisor methods will pass it. In the end it does come back to IIB. I gave my case for why I think satisfying more voters is better than more total utility. I suppose that might just be a philosophical difference.

If this work ends up with a few options with differences like Utility vs Satisfaction, that is totally fine. I view the single-winner ecosystem like this: if you want simplicity, then Approval; if you want utility, then Score; and if you want something more majoritarian with some strategy resistance, then STAR. I would have a hard time saying one is always better for all situations. I recently consulted on a project where the answer was Borda count because it avoided a weird strategic issue.

It seems to me even the Relevant Quota Rule is failed by divisor methods. Look at D’s Quota vs #seats in this example

actually every “divisor method” violates quota in this scenario:

State   Population   Quota    #seats (Webster & Hill, divisor = 8.43)
A       13           1.625    2
B       13           1.625    2
C       13           1.625    2
D       105          13.125   12
Total   144          18       18

Every state is relevant here (all get seats) but D gets less than 13 seats.


Under D’Hondt, D still gets at least 13 seats in that example. It’s impossible for a party to get fewer than their rounded-down Droop quota number of seats in D’Hondt. I think Warren made a mistake in that sentence, although I think his general analysis of apportionment is correct. That is an interesting example though. Sainte-Laguë is often seen as the purest proportional method but fails to give 13 seats to D. But thinking about it, it doesn’t necessarily seem so unreasonable. If a party is due exactly 10 seats, and quite a few others are due 0.9, then if you look at it proportionally, getting 9 instead of 10 is a smaller drop than getting 0 instead of 0.9.

But often an “entry level” definition of proportionality is that under party voting, a party should always get at least the number of seats that corresponds to the number of Hare quotas of votes they get rounded down, and Sainte-Laguë fails this. But I would still regard it as a proportional method.

But for various reasons, I think if I found what I considered to be the “best” voting method by my own criteria, it’s more likely to end up D’Hondt rather than Sainte-Laguë, although ideally there would be two possible versions of it.


I don’t know how much there is about it. As you say it’s come up on here, and also on the old Google Group. Chris Benham often cites the criterion on the electorama mailing list.

I would define it as follows: the addition of a ballot that is indifferent between A and B should not change the winner from A to B or vice versa.

And looking at it like that, I see it as quite close to the participation criterion. If a ballot rating A and B equally can cause B to be elected over A, then it’s not much of a stretch to imagine a ballot that rates A marginally ahead of B could also cause B to be elected over A. And I would suggest that methods that fail this probably fail participation. Methods that pass one probably pass both.

It seems that Chris Benham actually uses it to mean essentially blank ballots where a ballot just plumps for a candidate with hardly any votes, rather than in the wider sense that I would use it.

It’s not a standard term because I think @AssetVotingAdvocacy only coined the term the other day. I don’t think it’s necessarily a fudge though just to get divisor methods to pass.

In any case, I might come back with some examples to highlight the differences between divisor methods and Hamilton.

I tried a few calculators, and D’Hondt gives D 15 seats, which is greater than 13.125 rounded up = 14 seats, and thus still a failure of Relevant Quota.
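
A small highest-averages calculator confirms both allocations (a sketch; ties are broken by listing order):

```python
def highest_averages(pops, seats, divisor):
    # divisor(k) is the divisor applied to a party already holding k seats;
    # each seat goes to the party with the highest current quotient.
    alloc = {p: 0 for p in pops}
    for _ in range(seats):
        p = max(pops, key=lambda q: pops[q] / divisor(alloc[q]))
        alloc[p] += 1
    return alloc

pops = {"A": 13, "B": 13, "C": 13, "D": 105}
dhondt  = highest_averages(pops, 18, lambda k: k + 1)      # divisors 1,2,3,...
webster = highest_averages(pops, 18, lambda k: 2 * k + 1)  # divisors 1,3,5,...
print(dhondt)   # D gets 15 seats
print(webster)  # D gets 12 seats
```

So D’Hondt overshoots D’s upper quota of 14 while Webster/Sainte-Laguë undershoots the lower quota of 13, from opposite directions.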


Something maybe to think about is that where a Score voter can vote A10 B9, this can be simulated in non-cardinal methods as a bloc of 10 voters who support A, 9 of whom also support B.

OK, I thought you/Warren meant that all divisor methods would under-represent D. I knew that D would get at least 13 under D’Hondt but didn’t calculate it exactly.

But looking at the example again:

If we just look at D with one of the others (say A, but it makes no difference) and we have 14 seats, the population would be 118 - A would be due 1.54 seats and D 12.46. Sainte-Laguë logic would award A 2 seats and D 12, and by IIB the result in the table must follow. There’s no quota violation when you consider pairs at a time, and that could be one interpretation of the “relevant quota rule”, though I’m not sure what methods would fail it in that case. But I do think that the standard quota rule is not really something to worry about. I think IIB is more important.


Going back to this, there are definitely gradations of this criterion. I think the worst one to fail is where A and B have a zero rating on a ballot (or neither is ranked above any candidate on an ordinal ballot), and perhaps also where no candidate rated above zero on the ballot is elected. And looking at ratings, since we’re considering cardinal methods, Hamilton fails this. The scenario I gave previously is an example of this.

Are the quota rule and IIB incompatible?

But how did you decide on 14 seats?

I think they probably are incompatible. Anyway, I’ve been looking at Hamilton methods a bit further, and the following gives a participation failure:

2 seats, max score 10

30 voters: A=10, B=0
9 voters: A=0, B=10
1 voter: A=1, B=10

Without the 1 voter giving a score of 1 to A, it would be a tie for the second seat so this score means that A gets both seats. But then if you have the following ballots:

30 voters: A=10, B=0
9 voters: A=0, B=10
1 voter: A=1, B=10
1 voter: A=1, C=10

The extra voter changes the quota size and causes B to get the second seat, even though they prefer A to B. I think that’s what would happen anyway. Obviously Hamilton normally refers to party list, and SSS works slightly differently. But in terms of raw quotas, A has 302/205 = 1.473 and B has 100/205 = 0.488.

Anyway, what happens in Hamilton is that when two parties are due exactly something-point-five seats, they tie for a seat, and it is the same as in Sainte-Laguë. But when other voters are added/removed to move the fractional part above half (with the ratio of the two parties’ voters staying constant), this will favour the larger party, which sees its fractional part rise faster. However, if the fractional part drops below 0.5, it will favour the smaller party, which sees its fractional part drop at a slower rate. Take the following example with 3 seats:

Party A: 30 voters
Party B: 10 voters
Party C: 20 voters

Party A is due 1.5 seats, B is due 0.5 and C is due exactly 1. So A and B tie for the final seat. But then add a C voter:

Party A: 30 voters
Party B: 10 voters
Party C: 21 voters

Party A is due 1.475 seats, B is due 0.492 seats and C 1.033. In this case Party B claims that final seat over A because of the more slowly dropping fractional part. Now remove two C voters from this example:

Party A: 30 voters
Party B: 10 voters
Party C: 19 voters

Party A is due 1.525 seats, B is due 0.508 and C 0.966. So A gets that final seat over B because of the faster rising fractional part.

The problem is that Hamilton is a mixture of ratio and absolute difference and while it might look right intuitively, I don’t think it stands up to scrutiny. Sainte-Laguë party list would give a tie for the final seat in all of these cases, which I think is the right result.
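
A quick largest-remainder sketch confirms the two non-tied cases above:

```python
def hamilton(votes, seats):
    # Largest remainder: give each party the floor of its quota, then hand
    # the leftover seats to the largest fractional remainders.
    total = sum(votes.values())
    quotas = {p: v * seats / total for p, v in votes.items()}
    alloc = {p: int(q) for p, q in quotas.items()}
    leftover = seats - sum(alloc.values())
    for p in sorted(quotas, key=lambda p: quotas[p] - alloc[p],
                    reverse=True)[:leftover]:
        alloc[p] += 1
    return alloc

print(hamilton({"A": 30, "B": 10, "C": 21}, 3))  # B beats A to the last seat
print(hamilton({"A": 30, "B": 10, "C": 19}, 3))  # A beats B to the last seat
```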

That was the number of seats A and D had between them before removing B and C. And it was just to show that the 2:12 seat ratio looks acceptable when only those two are considered, so it should also be acceptable when other alternatives are added. And given that B and C are the same as A, it should follow that 2:2:2:12 is acceptable as well.

Let’s do the math for Sequentially Spent Score.

Score quota = 40*10/2=200

A is the first winner with 301

Ballots spend 200/301 times their support for A

30 voters: A=3.4, B=0
9 voters: A=0, B=10
1 voter: A=1, B=9.33

A has 101.66 (counting the A=1 voter’s remaining 1 point) and B has 99.33. Close, but the winner set is {A,A}

Now let’s try

30 voters: A=10, B=0
9 voters: A=0, B=10
1 voter: A=1, B=10
1 voter: A=1, C=10

Score quota = 41*10/2=205

A is the first winner with 302

Ballots spend 205/302 times their support for A

30 voters: A=3.2, B=0
9 voters: A=0, B=10
1 voter: A=1, B=9.32
1 voter: A=1, C=10

A has 98.36 and B has 99.32. Close, but the winner set is {A,B}. So yeah, the C supporter ruined it for the second A candidate.
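
A sketch of this calculation (my assumed reading of SSS: each ballot’s contribution to a candidate is min(score, remaining budget), and when a winner exceeds the quota its supporters pay back quota/total of their contribution; A’s two seats are modelled as clones A1 and A2):

```python
def sss(ballots, candidates, seats, max_score):
    # Every ballot starts with a budget of max_score "score points".
    quota = len(ballots) * max_score / seats
    budgets = [float(max_score)] * len(ballots)
    winners = []
    for _ in range(seats):
        # A ballot's contribution is its score, capped by its remaining budget.
        totals = {c: sum(min(b.get(c, 0), budgets[i])
                         for i, b in enumerate(ballots))
                  for c in candidates if c not in winners}
        w = max(totals, key=totals.get)
        winners.append(w)
        # Supporters spend quota/total of their contribution (all of it if
        # the winner fell short of the quota).
        factor = min(1.0, quota / totals[w]) if totals[w] > 0 else 0.0
        for i, b in enumerate(ballots):
            budgets[i] -= factor * min(b.get(w, 0), budgets[i])
    return winners

case1 = [{"A1": 10, "A2": 10}] * 30 + [{"B": 10}] * 9 + [{"A1": 1, "A2": 1, "B": 10}]
case2 = case1 + [{"A1": 1, "A2": 1, "C": 10}]
print(sss(case1, ["A1", "A2", "B"], 2, 10))       # ['A1', 'A2']
print(sss(case2, ["A1", "A2", "B", "C"], 2, 10))  # ['A1', 'B']
```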

I’ll believe you on the second case and not bother with the math. This definitely gives me something to think about, and I do see that this is an issue. There is somewhat of a flip when the population changes, so it is not the Alabama paradox which is the issue but the population paradox. As you formulated it, the population paradox manifests itself as a participation paradox. I would still like to make the point that this is a very contrived example, and there are examples for the other systems where the quota rule is failed. To me this needs to be done empirically: what would these effects result in, how do we measure them, and what is the cost for each method?

We should really be talking about non-partisan voting. Of course your examples can be reproduced with approval or even score ballots for candidates, but having many voters land exactly on these edges might be rare. I am not saying that what you have pointed out is not an issue. I am saying that failing the quota rule and exhibiting the population paradox are both bad, and I am unclear how to measure which is worse. It is not clear that failing participation in this way is a clear veto on the system.

This sort of brings us back to this other post

Justified representation seems like an extension of a quota rule, which means it would have similar issues. Did you review this Sequential Phragmen system?

Correct me if I am wrong, but all this should also apply to Allocated Score, since in the approval case it is the same system as SSS. Does Sequential Monroe manage to evade this?

It might be rare, and I’m not sure how often failures would come up for this or any method where they can fail certain criteria.

It’s not necessarily a veto itself, but I would still maintain that failing the quota rule is not actually bad. I think it’s an “intuitive” criterion rather than a “mathematical” one, and I think failing it is the correct behaviour in some cases. I think its biggest problem is having to potentially explain its failure to people who might be adopting a voting system.


It seems that to pass Relevant Quota, you’d need an IIB-passing highest-remainders method.