A new(?) STAR variant

STAR does not seem to get the right answer in the following situation.

Red 51%: A[5] B[4] C[0] D[0]
Blue 49%: A[0] B[4] C[0] D[5]

It elects A when I feel like the correct answer is B. Both Score and Approval elect B but there are other examples where Score and Approval seem to get the wrong answer.

STAR argues that it is better because it recovers the way that people would have voted given the top two utilitarian winners. STAR is basically Cardinal Baldwin, but only doing the last round instead of all rounds. It is ONLY applying Baldwin's method in the last step to normalize without losing monotonicity. Maybe the Baldwin normalization is not the best form of normalization since it reintroduces polarization and majoritarianism.

The IRNR normalization seems better. Why not do what is done with STAR but use the normalization from IRNR on the last step? This would preserve almost all of the things you want.

In the above example the top two are A and B. After normalization (choosing 100 as the total for each ballot) this becomes:

Red 51%: A[56] B[44]
Blue 49%: A[0] B[100]

This means B clearly wins.
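For anyone who wants to check the arithmetic, here is a rough Python sketch of the example above. It assumes the normalization is "divide each finalist score by the ballot's sum over the two finalists", and that a ballot scoring both finalists 0 simply abstains. The function names are hypothetical and not taken from any existing library.

```python
# Ballot groups from the example: (share of voters in %, scores on a 0-5 scale).
GROUPS = [
    (51, {"A": 5, "B": 4, "C": 0, "D": 0}),  # Red
    (49, {"A": 0, "B": 4, "C": 0, "D": 5}),  # Blue
]


def score_totals(groups):
    """Weighted sum of scores per candidate (plain Score voting)."""
    totals = {}
    for weight, ballot in groups:
        for cand, score in ballot.items():
            totals[cand] = totals.get(cand, 0) + weight * score
    return totals


def top_two(groups):
    """The two candidates with the highest score totals."""
    totals = score_totals(groups)
    return sorted(totals, key=totals.get, reverse=True)[:2]


def star_runoff(groups, x, y):
    """Classic STAR runoff: a ballot counts fully for the finalist it scores higher."""
    votes = {x: 0, y: 0}
    for weight, ballot in groups:
        if ballot[x] > ballot[y]:
            votes[x] += weight
        elif ballot[y] > ballot[x]:
            votes[y] += weight
    return votes


def sum_normalized_runoff(groups, x, y, scale=100):
    """Runoff where each ballot's two finalist scores are rescaled to sum to `scale`."""
    votes = {x: 0.0, y: 0.0}
    for weight, ballot in groups:
        total = ballot[x] + ballot[y]
        if total == 0:
            continue  # assumption: a ballot scoring both finalists 0 abstains
        votes[x] += weight * scale * ballot[x] / total
        votes[y] += weight * scale * ballot[y] / total
    return votes


finalists = top_two(GROUPS)
star = star_runoff(GROUPS, *finalists)
ustar = sum_normalized_runoff(GROUPS, *finalists)
print("finalists:", finalists)
print("STAR winner:", max(star, key=star.get))
print("sum-normalized winner:", max(ustar, key=ustar.get))
```

On this profile the plain STAR runoff is 51 to 49 for A, while the sum-normalized runoff favours B by a wide margin.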

The issue here is that there is an incentive for the 51% group to lower their support for B. But if they do lower support for B then they risk D winning, since they may not know they are the majority. It could just as well be that Blue is the group with 51%.

What do you guys think of this Utilitarian STAR (U-STAR)? It seems to be better in the above example over Majoritarian STAR. Am I missing something important? Maybe we should just check all the examples of when STAR is better than Score and vice versa to see if U-STAR splits the difference.

Also, did I just invent this?

Is your system related to this?

It is exactly that system. I have said many times that it is almost impossible to come up with a new single winner system.

I guess this means I do not get to name it :frowning: I was thinking saturn = Score Average Then utilitarian runoff normalization

Anyway, I tried to read through the Reddit thread. I did not find any good argument against this system, but it was not that easy to follow. Most of it was just criticism of Score or STAR. It seems that this is a middle ground.

If anybody has good criticism of this system please let me know. I am especially interested in hearing what @ClayShentrup and @Sara_Wolf think of this.

I don’t know whether the system you describe exists, and I don’t have time to deeply analyze whether I think it would be an overall good system. Sorry. :slight_smile:

However, I think the problem you call out here (specifically with STAR) is not really a significant problem in the real world, and is simply the result of clustering the voters into two hard-edged groups, with no fuzz around the edges.

I agree with you that B is the “best” result, while also arguing that STAR is probably doing the right thing by picking A. I know that sounds contradictory, but hear me out.

The thing is, the majority have A as a first choice. That means the median voter’s first choice is A. And I think there is good reason to say that – if it can be determined who the median voter is – the best system should always pick that voter’s first choice.

The reason for the seeming contradiction, as I allude to above, is that your example is very unnaturally contrived and hard-edged. A more realistic scenario might be this, which is almost identical to yours:

Red 49.5%: A[5] B[4] C[0] D[0]
Purply-Red 2%: A[4] B[5] C[0] D[0]
Purply-Blue 1%: A[0] B[5] C[0] D[4]
Blue 47.5%: A[0] B[4] C[0] D[5]

I’m a graphics guy, so I would describe what I did there as simply applying a small amount of blurring. Thanks for using Red and Blue as the voter groups, since blurring red and blue gives you some purple :slight_smile:

In this case, the median voter is Purply-Red, even though that group only represents 2% of the total. And STAR should pick B, the first choice of that median voter. All you need is a tiny percentage of people who actually like B the best (here 3%, the Purply-Red and Purply-Blue groups combined), and it could be even lower.
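A quick way to check the blurred profile is to run it through a small STAR tally. The sketch below is only an illustration: the function name `star` is made up, the weights are written in tenths of a percent to keep everything in integers, and the tie rule in the runoff is an assumption that does not matter for this profile.

```python
# Blurred profile from above; weights are in tenths of a percent to stay in integers.
GROUPS = [
    (495, {"A": 5, "B": 4, "C": 0, "D": 0}),  # Red
    (20,  {"A": 4, "B": 5, "C": 0, "D": 0}),  # Purply-Red
    (10,  {"A": 0, "B": 5, "C": 0, "D": 4}),  # Purply-Blue
    (475, {"A": 0, "B": 4, "C": 0, "D": 5}),  # Blue
]


def star(groups):
    """Return (winner, runoff tallies) under classic STAR."""
    totals = {}
    for weight, ballot in groups:
        for cand, score in ballot.items():
            totals[cand] = totals.get(cand, 0) + weight * score
    # Scoring round: keep the two highest totals.
    x, y = sorted(totals, key=totals.get, reverse=True)[:2]
    # Runoff: each ballot goes to whichever finalist it scores higher.
    runoff = {
        x: sum(w for w, b in groups if b[x] > b[y]),
        y: sum(w for w, b in groups if b[y] > b[x]),
    }
    # Assumption: a runoff tie goes to the higher-scoring finalist x.
    winner = x if runoff[x] >= runoff[y] else y
    return winner, runoff


winner, runoff = star(GROUPS)
print(winner, runoff)
```

The finalists are still A and B, but the runoff now goes to B by 50.5% to 49.5%.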

Even more realistic would be to blur it some more, and have more like 10 to 20% have B as a first choice, but I went with a small amount of blurring to make it very close to your example. Regardless, I hope you can see that it is very, very unlikely to have absolutely no one rate the middle ground candidate as their top choice, given that everyone has a pretty high opinion of that same candidate.

There is also the oddness of having B be so highly rated by both sides. You’d think that if so many rate B as 4, there’d be at least some who would rate B as 1, 2 or 3. If you were to try to explain this scenario in a “Map of Tennessee” sort of thing, I don’t think you could do it no matter where you put the proposed capitals and the voters.

The more people you have voting, the less likely it is to have these highly polarized situations that seem to show a flaw, but they really just show a very unlikely (i.e. weirdly contrived to trigger a flaw) scenario.

Also, I don’t want to pick on Approval, but you can’t know for sure that it will handle this situation better, because it is so dependent on who the voters think will be the front runners. A bad poll, or incorrect speculation as to how voters will set their thresholds based on a poll, could cause any of the three to win (either in your scenario or my “blurred” one). For instance, say the polling says that A and B will likely be the front runners. Now the savvy Red voters will approve only A, while the savvy Blue voters will approve both B and D. B will be at a big disadvantage, getting fewer votes than they would have if the polls had predicted it would come down to A and D.
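The polling sensitivity can be sketched in a few lines of Python using the original 51/49 profile. The first approval pattern is the one described above. The second (both sides also approving B when the poll says A vs D) is an assumed reading of how voters would set their thresholds, not something measured. `approval_totals` and the dictionary names are made up for illustration.

```python
SHARES = {"Red": 51, "Blue": 49}  # percent of voters in each group


def approval_totals(approvals):
    """Sum each group's share over the candidates that group approves."""
    totals = {cand: 0 for cand in "ABCD"}
    for group, approved in approvals.items():
        for cand in approved:
            totals[cand] += SHARES[group]
    return totals


# Poll says A vs B: Red approves only A, Blue approves B and D.
poll_a_vs_b = {"Red": {"A"}, "Blue": {"B", "D"}}
# Poll says A vs D: assumed here that both sides also approve the compromise B.
poll_a_vs_d = {"Red": {"A", "B"}, "Blue": {"B", "D"}}

for label, approvals in [("A vs B poll", poll_a_vs_b), ("A vs D poll", poll_a_vs_d)]:
    totals = approval_totals(approvals)
    print(label, totals, "winner:", max(totals, key=totals.get))
```

Under these assumptions the same electorate elects A after one poll and B after the other.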

And B would still be disadvantaged in my more realistic scenario (i.e. the blurred one with some purple voters), under Approval, if there was polling or speculation that led people astray.

So STAR might be sensitive to highly unnatural clustering of preferences, but Approval can be sensitive to incorrect polling or speculation.

Seems like a cool idea for maximizing consensus in those edge case scenarios, but in a complex and non transparent way. I don’t see this having legs beyond in a theoretical space where advocacy isn’t relevant.

If the 2 probable winners are A and B, and the voter prefers A over B, then:

  • in the “classic” STAR it is only necessary that A has more points than B (in reality, given that the sum of the points also counts, a voter still tends to give 0 to B, to minimize the chance that B ends up among the 2 finalists).
  • in “your STAR” instead the voter is fully pushed to give 10 points to A and 0 to B, precisely because it is utilitarian. This nullifies the STAR benefit (which avoided the need to give 10 points to A).

I am not really hearing any well-defined flaws here. My takeaways are:

  1. It may not make much difference. This is likely true but similar things are said about STAR’s advantage over Score. It does not matter at all until it matters.

  2. It is more complicated. Yes, this is slightly more complicated than the majoritarian method. Complexity is a valid concern because at some point the public has trouble grasping it. Again I can make reference to STAR vs Score. STAR is more complex than Score. Who is to say where the public draws the line? That said, @Sara_Wolf has a ton of hands-on experience so I'll trust her instincts.

@Essenzia I am not sure I understand your argument. I think you are saying that there is an incentive to exaggerate the distance between their favourite and second favourite. This would then lower the distance between the second favourite and their most hated, so there is a counterbalance to that. I did make a similar argument in my original post.

I am not sure if there would be a net effect on honesty or polarization. I am surprised that you prefer STAR to this new STAR since your favoured system, Distributed Voting, is a variant of IRNR. If you think the normalization from IRNR is better than Baldwin in general then why not if you only do the last step?

Maybe it is worth checking the Bayesian Regret simulations for this to compare with STAR and Score.

You’re right, I was wrong. The new STAR has no poll dependency and is more utilitarian than the classic STAR (so it’s better).

I have no idea how you’re getting these numbers.

I am currently knocking on doors for our new STAR voting initiative in Troutdale, and let me tell you that “majority winner from between the two highest scored” is already plenty to explain and justify.

The Red faction gives A a 5/5 and B a 4/5. Add 5+4=9. Now divide A’s score of 5 by 9, you get ~0.56, and B’s score of 4 by 9 gives ~0.44.

This makes no sense. You could add a constant to both numbers and get a different value. E.g. 1/1 vs. 2/3 are not the same.

What would be the theoretical motivation for that? The IRNR system has deep theoretical motivation developed by Brian Olson. The normalisation Olson invented makes logical sense and is often reinvented. You could add arbitrary stuff to almost all systems so that is not a reasonable critique.

How can you justify changing a result between X and Y when a voter increases support for X and Y by a constant?

The simple answer is that this is the information they gave us. If they change from X:1, Y:2 to X:2, Y:3, they are giving a different ballot. In the first ballot they state that Y is twice as favoured as X, so a normalized (to 100) ballot would be X:33, Y:67. In the second they state that Y is only 50% more favoured than X, so a normalized ballot would be X:40, Y:60. I think you are trying to say that the second ballot could result in X being the group compromise winner when this would not have occurred with the first. I see no issue here. The voter would be told that this is how to vote. In fact, a common criticism of Score is that the scale is not explicit. This makes it explicit.
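The two ballots in this reply can be checked with a small sketch. `normalize_to_sum` is a hypothetical helper, and exact fractions are used only to avoid rounding noise. It shows that adding 1 to both scores shrinks the normalized margin between Y and X.

```python
from fractions import Fraction


def normalize_to_sum(x, y, scale=100):
    """Rescale two finalist scores so they sum to `scale`, keeping their ratio."""
    total = x + y
    if total == 0:
        raise ValueError("both scores are zero; the ratio is undefined")
    return Fraction(scale * x, total), Fraction(scale * y, total)


for x, y in [(1, 2), (2, 3)]:
    nx, ny = normalize_to_sum(x, y)
    print(f"X:{x}, Y:{y} -> X:{float(nx):.1f}, Y:{float(ny):.1f}, margin {float(ny - nx):.1f}")
```

The raw margin is 1 point on both ballots, but after normalization it is a third of the ballot's weight in the first case and a fifth in the second, which is exactly the behaviour being questioned.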

I think you might be asking a bit of a deeper question. Correct me if I am reading too much into it. First I am going to make some definitions for clarity. I am going to name them after who I think their inventors are, but please correct me if these are not the original inventors or if there is better nomenclature.

Baldwin Normalization: Scores are normalized such that there is at least one candidate scored MAX and at least one candidate Scored MIN. Intermediate scored candidates have their score normalized linearly. There is an explicit formula for that here but it does not matter since we are only talking about the top 2.

Olson Normalization: Scores are normalized by dividing by the sum of all scores.
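As a rough sketch of the two definitions for a ballot with any number of remaining candidates: the linear stretch below (lowest score to 0, highest to MAX) is one plausible reading of "normalized linearly" and is an assumption, since the explicit formula is not reproduced in this thread. MIN is taken to be 0, and the function names are hypothetical.

```python
def minmax_normalize(scores, max_score=5):
    """Baldwin-style: stretch linearly so the lowest score maps to 0 and the highest to max_score."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # Assumption: a flat ballot expresses no preference and contributes nothing.
        return {cand: 0.0 for cand in scores}
    return {cand: max_score * (s - lo) / (hi - lo) for cand, s in scores.items()}


def sum_normalize(scores, total=100):
    """Olson-style: divide every score by the ballot's sum, then scale to `total`."""
    ballot_sum = sum(scores.values())
    if ballot_sum == 0:
        # Assumption: an all-zero ballot contributes nothing.
        return {cand: 0.0 for cand in scores}
    return {cand: total * s / ballot_sum for cand, s in scores.items()}


red = {"A": 5, "B": 4}   # Red ballot restricted to the two finalists
blue = {"A": 0, "B": 4}  # Blue ballot restricted to the two finalists
for name, ballot in [("Red", red), ("Blue", blue)]:
    print(name, minmax_normalize(ballot), sum_normalize(ballot))
```

With only two finalists the first function reduces to "MAX for the preferred one, 0 for the other", which is the STAR runoff.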

STAR voting uses Baldwin Normalization, which maximizes the impact of each ballot. It is majoritarian in nature and as a result it causes some degree of polarization, as can be seen in the example in the original post.

Olson Normalization is an attempt to fix the issue where a voter’s total vote impact is reduced as a result of them not knowing who would be in the final run off. Baldwin does this as well but it does it in a maximal way by being majoritarian. Olson is an attempt to maintain the Utilitarian underpinning of score by keeping relative utility constant and adjusting the total power to be the same for everybody.

The deep question I think you may be asking is how do we know that Olson normalization is justifiable and not just arbitrary. As in how do we know that the voter really wants to preserve the relative score? Clearly they would want to maximize their impact through Baldwin. This is just another way of stating the difference between Majoritarian and Utilitarian. Olson takes the information and sort of forces the compromise.

Where this gets deep is that it might seem that I am assuming something about the voter's intent when I do Olson normalization which I am not when I do Baldwin. This is not the case. The fact that Baldwin uses less of the score information does not mean that it makes less of an assumption.

Is this what you were getting at? If you further want to talk about why Olson and Baldwin are the two natural transforms I can get into that. Baldwin should be fairly obvious. Olson derives from the assumption that utility follows Cauchy’s functional equation, so modelling of it should be linear. This means a mapping of two utilities from one space to a normalized space should preserve the multiplicative constant between them.

I hope this clears things up a little and does not just come off as ramblings. To be clear, I support STAR with Baldwin normalization for reform and understand that we are limited by complexity. Much of what I do here is about the math and not really intended to be implemented.

Crap, I just realized that Olson Normalization is not the only utilitarian normalization.
The Olson method turns these top two:

A[4] B[2]

into this:

A[4/6] B[2/6]

However, you could also do:

A[5], B[2.5]

The relative proportion is still that A is twice as favoured. This new method puts the higher one to MAX and then scales the other accordingly.

My assumption was that the total summing to a constant was the same as each person having the same effective power. That is one way to define it, but having at least one at the MAX is another. @Essenzia do you have any insight? I know you have thought about this problem a lot.

Yes, in practice it’s equivalent to saying that if you vote with a range like [5,4,2,0], then removing the 5 turns it into [5,2.5,0].
If, however, I removed the 2, it would stay like this: [5,4,0].
If I removed the 0, it would stay like this: [5,4,2].
It only changes when I remove the maximum.

I preferred to use 100 points because they are equivalent to a %.
This vote with range [5,4,1,0] (that is [50%,40%,10%,0] ), removing 5 becomes [80%, 20%, 0] (that is [80,20,0] in the DV).
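These removals are easy to reproduce. The sketch below is only an illustration with made-up helper names: `rescale_to_max` scales the remaining scores so the highest is back at MAX, and `rescale_to_sum` redistributes 100 points in proportion to the remaining scores.

```python
def remove_at(scores, index):
    """Drop the candidate at position `index` from the ballot."""
    return scores[:index] + scores[index + 1:]


def rescale_to_max(scores, max_score=5):
    """Scale by one constant so the highest remaining score equals max_score."""
    top = max(scores)
    if top == 0:
        return [float(s) for s in scores]
    return [max_score * s / top for s in scores]


def rescale_to_sum(scores, total=100):
    """Scale by one constant so the remaining scores sum to `total`."""
    ballot_sum = sum(scores)
    if ballot_sum == 0:
        return [float(s) for s in scores]
    return [total * s / ballot_sum for s in scores]


ballot = [5, 4, 2, 0]
print(rescale_to_max(remove_at(ballot, 0)))  # the 5 is removed
print(rescale_to_max(remove_at(ballot, 2)))  # the 2 is removed
print(rescale_to_max(remove_at(ballot, 3)))  # the 0 is removed

dv_ballot = [5, 4, 1, 0]
print(rescale_to_sum(dv_ballot))                # the ballot as percentages
print(rescale_to_sum(remove_at(dv_ballot, 0)))  # after the 5 is removed
```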

After thinking this through I think this newest method is better than Baldwin or Olson. So I am going to define

Edmonds Normalization: The two scores are scaled by a constant such that the higher = MAX. I do not think this is defined for more than 2 candidates which is why I don’t think it exists in the literature. If anybody has seen it elsewhere let me know and I will rename.

Here is my logic for why it is better. Everybody gets to have their favourite at MAX. This is also what Baldwin normalization does. However, Baldwin puts the other at MIN. Since the impact on the outcome of the final winner is the difference between the two scores, Baldwin maximises this. This is why it is majoritarian: it gives everybody's preference the same power. If we wanted to be utilitarian we would try to scale the impact of each ballot to the utility gained. I think this is what Edmonds normalization does.
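Applied to the Red/Blue example from the original post, a runoff with this max-scaling could look like the sketch below. It assumes MAX = 5 and that a ballot scoring both finalists 0 abstains. `max_scaled_runoff` is a hypothetical name.

```python
# Finalist scores only, from the original example: (share of voters in %, scores).
GROUPS = [
    (51, {"A": 5, "B": 4}),  # Red
    (49, {"A": 0, "B": 4}),  # Blue
]
MAX_SCORE = 5


def max_scaled_runoff(groups, x, y, max_score=MAX_SCORE):
    """Scale each ballot so its higher finalist score equals max_score, then sum."""
    totals = {x: 0.0, y: 0.0}
    for weight, ballot in groups:
        top = max(ballot[x], ballot[y])
        if top == 0:
            continue  # assumption: a ballot scoring both finalists 0 abstains
        factor = max_score / top
        totals[x] += weight * ballot[x] * factor
        totals[y] += weight * ballot[y] * factor
    return totals


result = max_scaled_runoff(GROUPS, "A", "B")
print(result, "winner:", max(result, key=result.get))
```

Red's ballot is already at MAX for A and is left unchanged, Blue's 4 for B is scaled up to 5, and B wins the runoff on this profile.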

Although this is in a sense a STAR variant, it seems to have a completely different philosophy. STAR is saying “Rate the candidates how you want, but don’t worry because when it comes to the run-off your vote will still be a full one.” This new system doesn’t do that.

If I have two candidates I like and give them 5 and 4 and they also happen to be frontrunners, it’s good for me in STAR that if they make the run-off it becomes 1 and 0. I don’t know what this new method would do in terms of the overall utility of the result etc., but in terms of what it does for an individual with their vote, it’s worse than STAR.

I also don’t agree with this form of normalisation anyway, and I think @ClayShentrup has exactly the same idea. Scores are a proxy for utilities, and it makes more sense to look at the difference between utilities than the ratios between them, primarily because the zero point is arbitrary.

If I give candidates A, B and C 0, 1 and 2 respectively, then from a utility point of view electing candidate B is the same (for me) as a 50/50 lottery between A and C. This is also the case if I gave scores 1, 2, 3, or 2, 3, 4 etc. It’s the numerical differences not the ratios that are important.

Also I might give my least favourite two candidates 0 and 1, and my top two 4 and 5. From my point of view and from a utility point of view these situations are symmetrical. Yet with this type of normalisation if my worst two make the run-off, then my vote counts for a full one, whereas if my top two make the run-off I get 1/9 of a vote in the run-off. (Edit - or maybe actually 1/5 of a vote if the normalisation gives the max to your favourite).
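The asymmetry can be put in numbers. The sketch below, with hypothetical function names, computes how much of a full runoff vote a ballot delivers for a pair of finalist scores under each of the three normalizations discussed in this thread.

```python
from fractions import Fraction


def runoff_weight_minmax(a, b):
    """Higher finalist to MAX, lower to MIN: any strict preference is a full vote."""
    return Fraction(0) if a == b else Fraction(1)


def runoff_weight_sum(a, b):
    """Scores divided by their sum: the margin is |a - b| / (a + b)."""
    return Fraction(abs(a - b), a + b) if a + b else Fraction(0)


def runoff_weight_max(a, b):
    """Higher score scaled to MAX: the margin is |a - b| / max(a, b)."""
    return Fraction(abs(a - b), max(a, b)) if max(a, b) else Fraction(0)


for pair in [(0, 1), (4, 5)]:
    print(
        pair,
        "min-max:", runoff_weight_minmax(*pair),
        "sum:", runoff_weight_sum(*pair),
        "max:", runoff_weight_max(*pair),
    )
```

For the bottom pair (0, 1) all three give a full vote. For the top pair (4, 5) the sum rule gives 1/9 and the max rule gives 1/5, even though the raw difference is 1 point in both cases.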

Nailed it. …

Does this apply to cardinal PR as well?