Unifying Thiele and Unitary philosophy

@parker_friedland gave an explanation for Jefferson party list reweighting that got me thinking.

Essentially you are taking a conserved vote power and splitting it between previous winners and potential winners. Whichever potential winner has the most distributed vote power available to them wins. That is a good and compelling explanation. It also fits well with the concept of Vote Unitarity. I think it also carries over to SPAV: each voter has their vote split between all their approved winners and the next potential one. The formula for their ballot weight is 1/(1+W), but that is deceptive. The 1 in the denominator is actually the check of whether they approve of the other candidate or not. If we instead want to reweight the approval matrix, A, for voters, v, candidates, c, and current winners, w, we can express SPAV as

$$A'_{v,c} = \frac{A_{v,c}}{A_{v,c} + \sum_{j \in w} A_{v,j}}$$

The approval matrix in the numerator comes from the distributive law, and the one in the denominator comes from the fact that it is 0 or 1 but only 1 when the numerator is non-zero. Clearly this is correct, but it is not a simplification. It does however show that what Parker was saying is true. You choose the winner who would win if everybody had to spread their vote weight over all winners and the next potential winner.
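A quick way to convince yourself of this is to compare the two expressions over a few approval patterns. This is only a sketch: the function names are made up for illustration, and treating a non-approval as contributing 0 (rather than 0/0) is an assumption.

```python
from fractions import Fraction

def spav_contribution(approves_candidate, approved_winners):
    """Standard SPAV: an approval (0 or 1) times the ballot weight 1/(1+W)."""
    return Fraction(approves_candidate, 1 + approved_winners)

def split_contribution(approves_candidate, winner_approvals):
    """A / (A + sum of this voter's approvals over the current winners)."""
    if approves_candidate == 0:
        return Fraction(0)  # assumption: a voter who does not approve contributes 0
    return Fraction(approves_candidate, approves_candidate + sum(winner_approvals))

# Each inner list holds one voter's 0/1 approvals of the winners elected so far.
cases = [[], [1], [0, 1], [1, 1, 0], [0, 0]]
all_match = all(
    spav_contribution(a, sum(ws)) == split_contribution(a, ws)
    for a in (0, 1)
    for ws in cases
)
print(all_match)
```

Both forms agree on every case here, which is all the "not a simplification" rewrite is claiming.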

The idea I had today is to just do the same update on each round, giving a new score matrix derived from the original scores. That would give

$$S'_{v,c} = \frac{S_{v,c}}{S_{v,c} + \sum_{j \in w} S_{v,j}}$$

Recall that score matrices are always normalized to [0,1].

This means you spread your vote power over each to-be-elected candidate by the ratio of the score you gave to them over the total amount of score you would have spent if they were elected. This totally seems like a reasonable system to me. It is a bit like Phragmén, but with voter weight rather than candidate weight.
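Here is a minimal sketch of that update for a single voter's row. The function name `spread_scores` is hypothetical, and leaving a ballot untouched when it has given nothing to any winner is an assumption of the sketch, not something the formula itself settles.

```python
from fractions import Fraction

def spread_scores(scores, winners):
    """Reweight one voter's row to S_c / (S_c + total score given to winners).

    scores: {candidate: original score normalized to [0, 1]}.
    """
    spent = sum(Fraction(scores.get(w, 0)) for w in winners)
    if spent == 0:
        # Assumption: a voter with no support for any winner keeps their scores.
        return {c: Fraction(s) for c, s in scores.items() if c not in winners}
    new_row = {}
    for c, s in scores.items():
        if c in winners:
            continue  # elected candidates drop out of the row
        s = Fraction(s)
        new_row[c] = s / (s + spent)
    return new_row

voter = {"A1": 1, "A2": 1, "A3": Fraction(9, 10), "B1": 0}
row = spread_scores(voter, ["A1"])
print(row)
```

With A1 elected, this voter's A2 entry becomes 1/2 and the A3 entry becomes 9/19, i.e. the share of their spent score that each candidate would account for.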

The interesting thing is that we were trying to motivate RRV theoretically. This is not RRV. RRV does not have the extra score matrix in the denominator. Since it is a different system I am going to call it Sequential Thiele Voting to conform with Sequential Monroe Voting and Sequential Phragmen Voting. Also Distributed Score Voting has already been claimed by @Essenzia.

I like the idea of adjusting the score matrix better than the ballot weight. It is more fine grained. This is where it gets the alignment with Vote Unitarity. SSS updates the score matrix too. So maybe this is the implication for adding Vote Unitarity to different philosophies. For a while I have been trying to determine if Vote Unitarity is a different philosophy to Monroe, but if what I have said above is accurate then it can equally well be applied to Thiele. This makes it really more like an additional constraint.

Relative to RRV it down weights less than the Jefferson variant. This is bad since the Jefferson variant seems to already not down-weight enough. We could have a Webster variant by putting a 2 in front of the sum. It would then be somewhere between Webster RRV and Jefferson RRV. It seems hackish to just say already won candidates are worth twice what potential candidates are worth but that seems to be what Webster does.

I could also be totally wrong and this is just a terrible system. I only came up with it today while trying to sort out how to justify RRV with words alone. I am sure you will all tell me if it is terrible.

Could you illustrate it with an example of what happens in a score voting scenario?

So would it still be the same as SPAV if everyone voted approval-style?

This system is the same as SPAV and RRV for approval ballots.

I could work through an example if you have one in mind. Maybe one where RRV does something undesirable. Like the scale invariance one

Yes, it would be interesting to see what it would do in the case where RRV would fail scale invariance.

This method should be scale invariant.

Though it also becomes undefined when electing the first winner: voters who give a candidate a non-zero score all add 1 to that candidate’s total, and for voters who give the candidate a score of zero, the amount they contribute becomes undefined (0/0). However, since these equations leave the selection of the first winner undefined, you could simply define the method to select the score winner in that round.
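To make the first-round issue concrete, here is a small sketch of the unsquared formula with an empty winner set. The helper name `first_round_share` is made up for illustration.

```python
from fractions import Fraction

def first_round_share(score):
    """S / (S + 0): the unsquared formula when no one has been elected yet."""
    score = Fraction(score)
    return score / (score + 0)

# Any non-zero score collapses to the same contribution.
shares = [first_round_share(s) for s in (Fraction(1, 10), Fraction(1, 2), 1)]

# A zero score gives 0/0, which Python reports as a ZeroDivisionError.
try:
    first_round_share(0)
    zero_is_defined = True
except ZeroDivisionError:
    zero_is_defined = False

print(shares, zero_is_defined)
```

So in that round the size of a non-zero score carries no information, which is why some extra rule is needed to pick the first winner.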

There are two big problems with this name.

  1. Saying that this is the definitive sequential method version of the Thiele way of thinking about proportionality is a pretty big claim, especially when there are so many other Thiele-based sequential methods.

  2. Its acronym is STV.

@Toby_Pereira suggested I calculate the following example from this page. The initial score matrix, compressed into the two types of identical rows, is

200 voters: A1=10, A2=10, A3=9, A4=9, B1=0, B2=0
100 voters: A1= 0, A2= 0, A3=0, A4=0, B1=10, B2=9

First normalize the matrix

200 voters: A1=1, A2=1, A3=0.9, A4=0.9, B1=0, B2=0
100 voters: A1=0, A2=0, A3= 0, A4=0, B1=1, B2=0.9

The first winner is A1 with 200. The reweighted matrix is

200 voters: A1=x, A2=0.5, A3=0.47, A4=0.47, B1=0, B2=0
100 voters: A1=x, A2=0, A3= 0, A4=0, B1=1, B2=0.9

The next winner would be either A2 or B1. Either way, the first three will be A1, A2 and B1. After that the score matrix will be

200 voters: A1=x, A2=x, A3=0.31, A4=0.31, B1=x, B2=0
100 voters: A1=x, A2=x, A3= 0, A4=0, B1=x, B2=0.47

A3 wins next giving

200 voters: A1=x, A2=x, A3=x, A4=0.23, B1=x, B2=0
100 voters: A1=x, A2=x, A3=x, A4=0, B1=x, B2=0.47

This looks like the tie we want, but it’s hard to tell with rounding.

A4 has twice as many voters giving a reweighted score of 0.9/(0.9+2.9) as the B2 reweighted score of 0.9/(0.9+1). So it is a tie. Neat. This was honestly not something I was thinking about when I invented this.
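Since rounding was getting in the way, the same comparison can be done in exact fractions. This is just the arithmetic from the line above; the variable names are made up for illustration.

```python
from fractions import Fraction

s = Fraction(9, 10)  # the normalized 0.9 score

# 200 A voters: A4 scored 0.9, already elected A1 (1), A2 (1) and A3 (0.9).
a4_total = 200 * s / (s + 1 + 1 + s)

# 100 B voters: B2 scored 0.9, already elected B1 (1).
b2_total = 100 * s / (s + 1)

print(a4_total, b2_total, a4_total == b2_total)
```

Both totals come out to exactly 900/19, so it really is a tie and not an artifact of rounding.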

All sequential systems select, then reweight, then select again, and so on. Why would the question of reweighting before the first selection even come up?

That is a bit funny coming from the person who called his system Sequential Monroe Voting.

Good. I do not like Single Transferable Vote. I can call this “New STV” and really bother the FairVote guys. :stuck_out_tongue: But whatever. If no major flaws are found I’ll come up with a better name. Maybe at this point I start being egotistical and call it “Sequential Thiele Edmonds Voting”. STEV is a pronounceable name similar to Steve. But the question remains, are there any flaws?

They don’t need to re-weight in the first round because all the current re-weighting formulas would set the weight of votes to one in the first round anyway, reducing to score voting. Though you could push this problem to the 2nd round too by having 50% + 1 of the voters bullet vote for one candidate whom all of the other voters rate 0. Then every (n+1)th round is effectively the nth round among the other 50% of the voters, and the effective 1st round selection now happens after the first re-weighting.

There is only one Monroe based quality function while there are several Thiele based quality functions. The only big competition SMV has for being the most Monroe sequential method is allocated score voting, and I explained why SMV is better designed to pick results that optimize the Monroe function than allocated score when I created it. What Thiele based optimal method is your method better designed to optimize than all of the other Thiele based sequential methods? Perhaps your method should be called sequential [that Thiele based optimal method] voting.

That works.

There is one potential problem I see: what happens when a voter rates a candidate 0 but also hasn’t given a non-zero rating to any elected candidate? If you want your method to reduce to score voting in the first round (and in effective first rounds like the one I described above), you should make this 1 (so your method becomes undefined and can use score voting as a tie breaker). If you instead make it 0, it will always choose the candidate to whom the most voters gave non-zero ratings in the first round. However, if you make this 1 then the method no longer reduces to SPAV (and some voters now give candidates they scored at the minimum just as much support as candidates they scored at the maximum, as long as they have not yet supported any elected candidate).

Also, no matter how you define this 0/0 case, for the voters who have not yet given a non-zero rating to an elected candidate there is still the problem that whatever score those voters give a candidate is irrelevant as long as it’s not zero.

Other than that, I’m not sure yet.

This isn’t necessarily a flaw, but I don’t really see an optimal method that this method is more tied to than the other Thiele based quality functions. Could you also create an optimal version of this method?

One potential fix to all this is to just select the candidate that maximizes the sum of Svc^2/(Svc + sum_c_in_w(Svc)) instead of Svc/(Svc + sum_c_in_w(Svc)) (and assume that the 0/0 cases now all go to 0 because the 0^2 overpowers the 0). Squaring Svc feels a bit weird, but if it works it works.

OK, I see the point now. In the example I gave above for scale invariance, after the first round I only applied the rescaling formula to group A and not group B. If I had applied it to group B, all their scores would have been 1 or 0/0. I guess I sort of cheated by just not applying the formula to voters who don’t have a winner yet.

I have thought of that. The whole point of doing this was to have a theoretical derivation of something like RRV. This clearly fixes the issue mathematically but it is not really motivated. It still reduces to SPAV since 1^2=1, but why square? I feel like we need better motivation or some empirical evidence to do something like this. The fact that this method got scale invariance without my attempting it seems to imply it is on the right track. What about the method when it has the square? Does it still have scale invariance? It downweights more than Jefferson, which is good. I need to think on this. What is so bad about saying that the voters who get winners get their scores scaled and those that do not stay the same?

I have been thinking about that too. This is sort of like the opposite of Phragmén’s method: instead of winners distributing a representation load, voters are distributing a ballot load. The problem is the same: if the load is distributed then you can’t just have a simple sum as the quality metric or you will just get back the total load. The symmetry needs to be broken. As far as I know there are two branches of such methods, the Max Phragmen type and the Ebert’s method type. I suspect there will be analogous variants for this method in the optimal form. @Toby_Pereira is really the Phragmen expert around here. Maybe he has an opinion on the best way to do this.

Checked the math. It is scale invariant with the square.

2 * 0.9 * 0.9 / (0.9 + 2.9) = 0.9 * 0.9 / (0.9 + 1)

I guess one example does not really prove it though.

If it is scale invariant and the method downweights somewhere between Webster and Jefferson then this is actually better. The difference is that instead of setting the score to the fraction of load that candidate would represent it is scaling by that fraction. Does that make intuitive sense?

It’s definitely scale invariant: if you multiply all scores from every voter by a constant C, then instead of maximizing the sum among all voters of Svc^2/(Svc + sum_c_in_w(Svc)) you maximize the sum among all voters of (C*Svc)^2/(C*Svc + sum_c_in_w(C*Svc)), which becomes C * (Svc^2/(Svc + sum_c_in_w(Svc))). Because each candidate’s sum just gets multiplied by C, this doesn’t change who has the highest sum in any round.
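Written out term by term, the cancellation goes like this. Here C > 0 is the rescaling constant, and the index j running over the winner set w is just notation chosen for this sketch:

```latex
\[
\frac{(C\,S_{v,c})^2}{C\,S_{v,c} + \sum_{j \in w} C\,S_{v,j}}
= \frac{C^2\,S_{v,c}^2}{C\left(S_{v,c} + \sum_{j \in w} S_{v,j}\right)}
= C \cdot \frac{S_{v,c}^2}{S_{v,c} + \sum_{j \in w} S_{v,j}}
\]
```

Every voter's term picks up the same factor C, so every candidate's sum does too, and the ordering of candidates in each round is unchanged.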

Possibly, though how (I’m not sure about the why) the method works is a lot more complicated and because how much weight each ballot has depends on which candidate is being considered to be elected, it might also be a lot harder to tabulate.

This is sort of the point. You reweight the scores not the ballot. You can throw out the idea of a ballot weight. This would be hard to tabulate by hand. I would think this is also true for RRV due to the arithmetic. On the other hand it is very straightforward computationally. Not really worse than RRV and still much simpler than SMV. I could add it to the simulation code in like 30min. I am not going to do that until we have a little better understanding of the theory.

The more I think about it, the more it makes sense with the square. Svc/(Svc + sum_c_in_w(Svc)) is the fractional load distributed to each potential candidate. It is something like a weight of how much ballot weight could be assigned. This should be applied to the score; it should not replace the score. This then gives a square when you update the score matrix. I think I was following the math too literally in my derivation and missed this. Good catch. I was a little annoyed that the original version was not downweighting even as much as the standard RRV. This one downweights more, and I suppose only a simulation will tell us how close this is to other systems like SSS and SMV for results.
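For a feel of how the three rules order, here is a small sketch comparing one voter's contribution under each of them. It assumes normalized scores, uses the Jefferson form of RRV (ballot weight 1/(1 + score already spent on winners)), and the function names are made up for illustration.

```python
from fractions import Fraction

def rrv(score, spent):
    """Jefferson RRV: score times the ballot weight 1 / (1 + spent)."""
    return score / (1 + spent)

def original_rule(score, spent):
    """The first proposal: S / (S + spent)."""
    return score / (score + spent)

def squared_rule(score, spent):
    """The squared update: S^2 / (S + spent)."""
    return score * score / (score + spent)

# One voter who scored the candidate 0.9 and has one max-scored winner already.
score, spent = Fraction(9, 10), Fraction(1)
values = (original_rule(score, spent), rrv(score, spent), squared_rule(score, spent))
print([float(v) for v in values])
```

For this non-max score the original rule leaves the most weight, RRV sits in the middle, and the squared rule leaves the least. At a max score of 1 all three coincide.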

What about criteria? It seems pretty obviously monotonic. I guess it should have all the same criteria as RRV since it is very similar.

If this does reduce to SPAV in the approval voting case and it’s scale invariant like SPAV + KP, in what scenarios would they give different results, and what is the argument for this being better?

If it’s SPAV for approvals, then in the approval case at least, it must weight down the same as RRV (Jefferson/Webster/whichever divisor method you’re using). And would it not weight down more than RRV for scores? Normal RRV fails scale invariance because the weighting down for non-max scores is insufficient, and it becomes more majoritarian. And this new method passes scale invariance.

Honestly no idea at the moment. I guess if it gets the exact same results we have done something right.

The original idea downweighted less than RRV for non-max scores. Parker’s update downweights more than RRV for non-max scores. They are both scale invariant.

Just to be clear, I now advocate for this reweighting of the score matrix on each round:

$$S'_{v,c} = \frac{S_{v,c}^2}{S_{v,c} + \sum_{j \in w} S_{v,j}}$$

For those of you who have never taken a course on set theory, the symbol ∈ means "is an element of", so the sum is taken over the candidates who are in the winner set w. Hopefully this is reasonable notation.

I think it would be useful to also describe the whole thing in English as best as you can anyway.

  1. Acquire candidate scores from voters and normalize them to [0,1] in a matrix. Voter v scores candidate c with score S_{v,c}.
  2. Determine the candidate with the max summed (over voters) score. That candidate is elected.
  3. Update the entire score matrix. The v,c entry of the new matrix is
    S_{v,c}^2 / (S_{v,c} + SUM_{winners j} S_{v,j})
    where S_{v,c} is the ORIGINAL score matrix, not the one from the prior round.
  4. Go back to step 2 until the desired number of winners have been elected.
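The four steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the names `reweight` and `elect` are made up, ballots are grouped as (voter count, scores) pairs, a zero score always contributes 0 (the 0/0 convention suggested earlier in the thread), and ties go to whichever candidate is listed first.

```python
from fractions import Fraction

def reweight(score, winner_scores):
    """S^2 / (S + sum of this voter's ORIGINAL scores for the current winners)."""
    score = Fraction(score)
    if score == 0:
        return Fraction(0)  # 0/0 is taken to be 0 (assumption)
    return score * score / (score + sum(Fraction(x) for x in winner_scores))

def elect(ballots, candidates, seats):
    """Sequentially elect `seats` winners.

    ballots: list of (voter_count, {candidate: score normalized to [0, 1]}).
    Ties go to the candidate listed first (assumption).
    """
    winners = []
    while len(winners) < min(seats, len(candidates)):
        best, best_total = None, None
        for c in candidates:
            if c in winners:
                continue
            # Steps 2 and 3: sum the reweighted scores over all voters.
            total = sum(
                count * reweight(scores.get(c, 0), [scores.get(w, 0) for w in winners])
                for count, scores in ballots
            )
            if best is None or total > best_total:
                best, best_total = c, total
        winners.append(best)
    return winners

nine_tenths = Fraction(9, 10)
ballots = [
    (200, {"A1": 1, "A2": 1, "A3": nine_tenths, "A4": nine_tenths}),
    (100, {"B1": 1, "B2": nine_tenths}),
]
candidates = ["A1", "A2", "A3", "A4", "B1", "B2"]
first_four = elect(ballots, candidates, 4)
print(first_four)
```

With no winners yet the formula is S^2/S = S, so the first round is plain score voting without a special case. Exact fractions are used so that genuine ties are not hidden by floating point rounding.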

The first winner is the score winner. Then we allow ballot weight to be distributed between the first winner and all potential next winners according to the score given. The second winner is the candidate who has the highest sum score when it is downweighted by the ballot weight they are supported with by each voter. Then we allow ballot weight to be distributed between the first two winners and all potential next winners according to the score given. The third winner is the candidate who has the highest sum score when it is downweighted by the ballot weight they are supported with by each voter. And so on…

It is similar to RRV except the same ballot weight is applied to all next potential winners. And FYI I am discussing this modification with Warren. I think he will agree it is better than RRV although he has not done so just yet.

The only thing I am trying to puzzle out is the Webster vs Jefferson difference. In some sense I derived Jefferson from the theory statement @parker_friedland gave. The formula in step 3 above is S_{v,c}^2 / (S_{v,c} + SUM_{winners j} S_{v,j}), which reduces to Jefferson SPAV. Other systems like SSS or SMV give results with much lower variance in the resulting sum(score) between voters. Altering the formula to S_{v,c}^2 / (S_{v,c} + 2 SUM_{winners j} S_{v,j}) would give the version of this method which reduces to Webster SPAV. This is much more similar in results to SSS and SMV. Adding the 2 into the formula is the same as saying ‘A Bird in the Hand is Worth Two in the Bush’. While this is a common phrase, it does not seem like something that can be used to motivate an electoral system. Is there a way to theoretically motivate that current winners get twice the weight of the potential next winner?

SSS and SMV are Monroe type systems. It could be that Monroe systems are just fundamentally different and we should not expect similar results. This would mean that the Webster version is not motivated theoretically but only by a desire to match Monroe methods. If that is the case then I think we should stick with Jefferson.

One consideration is free riding. The Jefferson version almost completely avoids free riding. Can we use that to motivate the factor of 2? Maybe it should be 1.5 to have a fair balance of Monroe type PR and low free riding. @Marylander is there a way to do such a calculation? What is the best value of K in S_{v,c}^2 / (S_{v,c} + K SUM_{winners j} S_{v,j})?
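To see what K does, here is a small sketch of the K version of the formula applied to an approval-style voter who gave 1 to every winner so far. The function name `reweighted` is made up, and a zero score is taken to contribute 0 (an assumption).

```python
from fractions import Fraction

def reweighted(score, winner_scores, k=1):
    """S^2 / (S + K * sum of this voter's original scores for the winners)."""
    score = Fraction(score)
    if score == 0:
        return Fraction(0)  # 0/0 is taken to be 0 (assumption)
    return score * score / (score + k * sum(Fraction(x) for x in winner_scores))

# Contribution of a max-score voter after 0, 1, 2, 3 of their winners are elected.
jefferson = [reweighted(1, [1] * n, k=1) for n in range(4)]
webster = [reweighted(1, [1] * n, k=2) for n in range(4)]
between = [reweighted(1, [1] * n, k=Fraction(3, 2)) for n in range(4)]
print(jefferson, webster, between)
```

K=1 gives the Jefferson divisors 1, 1/2, 1/3, 1/4 and K=2 gives the Webster divisors 1, 1/3, 1/5, 1/7, so any K in between, such as 1.5, simply interpolates the downweighting between the two.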

Which feature of Monroe do you have in mind when you call a system “Monroe type”? Surely not Monroe selection.

Anyway, the defining difference between this system and RRV is that the lower a candidate is scored on a ballot, the greater the proportion of points it loses (with RRV, each ballot contributes unweighted score*weight factor points to a candidate, while with STEV, the weight factor is an increasing function with respect to the unweighted score). So I would expect this method to be more similar to Monroe than either RRV or SSS with respect to selection (although unlike Sequential Monroe and SSS, this lacks ULC-PR).

I’m not sure which calculation you are referring to.

I mean the Monroe interpretation of PR defined here

It is the same as RRV if everybody bullet votes so I doubt that.

I would expect this to have very similar properties to RRV in terms of criteria.

If you follow my logic in how I derived the formula S_{v,c}^2 / (S_{v,c} + K SUM_{winners j} S_{v,j}), I am curious if there is a good justification for having K>1. Why would we count prior winners with more weight than the next potential winners? You had said before that the difference between Jefferson and Webster had to do with free riding, so I was hoping there was a derivation for getting the 2 in Webster.

I would like to ask you about this example, to better understand the method:
You have the candidates A1, A2, A3, A4, A5, A6 and B1, B2, B3, B4, B5, B6.
People vote as follows (range 0 to 1):
75 V1: A [1,1,1,1,1,1], B [0,0,0,0,0,0]
25 V2: A [0,0,0,0,0,0], B [1,1,1,1,1,1]
Written in a shorter way:
75 V1: A [1], B [0]
25 V2: A [0], B [1]
With your method, what would the 4 winners be? and the 8 winners?
Expected results: 75% A and 25% B