Score/STAR ballots and the concept of liking someone "twice as much" as another

Another thread ( A new(?) STAR variant ) was going down the rabbit hole of discussing whether it makes sense to say that a score of 4 means that the voter likes that candidate “twice as much” as a candidate that they rank as 2. This is relevant in determining if it makes sense to consider a ballot that rates 3 candidates as [100,75,50] as providing significantly different information compared to one that rates them as [100,50,0].

I decided to break it off into this new thread, since it was detracting from and getting mixed up in the original topic (even if it was relevant to it).

My position is that the first voter rather foolishly weakened their vote (unless of course the tabulation system ignores the difference between them by pre-normalizing them). Instead of trying to consider some absolute scale of bad to good (where zero represents some absolutely defined degree of dislike, variously described as “no support”, “hatred”, or just “meh”), I’d suggest that voters should just pre-calibrate their scales such that their least favorite candidate represents zero and their favorite is 100 (a.k.a. “max”). In other words, the numerical utilities should be considered to be relative to the field of candidates who are running, rather than to some absolute concept of goodness and badness.

I can respect that others don’t agree with my position that liking something “half as much” as something else is meaningless. Regardless, a whole lot of mainstream economic/social choice theory appears to agree with my position on that.

And it’s of course true that many people, from babies to random adult “advocacy targets,” have an intuitive sense of “absolute utilities.” An often heard counterpoint to that intuition is the expression “first world problem,” which calls attention to the intrinsic relativity of any measure of utility.

This is Wikipedia’s version of exactly what I was trying to say in the other thread (from the article on Utility):

One cannot conclude, however, that the cup of tea is two thirds of the goodness of the cup of juice, because this conclusion would depend not only on magnitudes of utility differences, but also on the “zero” of utility. For example, if the “zero” of utility was located at -40, then a cup of orange juice would be 160 utils more than zero, a cup of tea 120 utils more than zero.
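To make the arithmetic in that quote concrete, here is a minimal Python sketch. The baseline figures of 120 utils for juice and 80 for tea are an assumption, chosen so that they match the quoted "two thirds" and the shifted values of 160 and 120; the function name is made up for illustration.

```python
from fractions import Fraction

def shift_zero(utilities, new_zero):
    """Re-express each utility relative to a different zero point."""
    return {name: value - new_zero for name, value in utilities.items()}

# Assumed baseline (not stated in the quote): juice = 120 utils, tea = 80 utils.
original = {"juice": 120, "tea": 80}
shifted = shift_zero(original, -40)  # relocate the zero to -40

ratio_before = Fraction(original["tea"], original["juice"])
ratio_after = Fraction(shifted["tea"], shifted["juice"])
diff_before = original["juice"] - original["tea"]
diff_after = shifted["juice"] - shifted["tea"]

print(ratio_before, ratio_after)  # 2/3 3/4
print(diff_before, diff_after)    # 40 40
```

The ratio moves from 2/3 to 3/4 when the zero moves, while the difference stays at 40 utils either way.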

Some people take a stronger position than my own, rejecting cardinal utility outright (and thinking only ordinal is meaningfully measurable). That seems to be the general position of this article, but the quote below is something I agree with:

A ratio-level measure requires a nonarbitrary zero point, but there is no way of finding that. Zero utility is obviously nonsense; again, this is shown to be true before we ask whether my zero point and yours are the same or different

This one discusses it in a lot of detail, with a key line that concisely represents my position:

I do not think there is any way of measuring, or even conceptualising, the zero-point in the absence of specific choice options.

Rejecting “ratioed utility” seems pretty much universal everywhere I can find.

Again, it’s fine if people want to take another position, but I’d hope they can recognize that for people who view it as I do, it isn’t for lack of someone explaining it to them so they can wrap their little heads around it. Honestly, I find such suggestions to be quite condescending. It’s a valid perspective that sure appears to be shared by most people who approach this with academic rigor, as opposed to basic intuition. (or, what works in the context of STAR/Score advocacy)

That said, if anyone can find any literature that makes the case for the meaningfulness of such ratioed utilities, I’d happily read and consider it.


I posted about this in one of the other threads as well. I agree with your position. The zero point is essentially arbitrary, and points differences between candidates rather than ratios are what I think you should be looking at. So giving two candidates 0, 1 indicates the same difference as giving them 4, 5.

Also, giving three candidates 0, 1, 2 means you would give equal preference to the 1 candidate as a 50/50 lottery between the 0 candidate and the 2 candidate. The same principle would apply for 1, 2, 3 or 2, 3, 4 etc.
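As a small check of that lottery claim, here is a Python sketch. It assumes scores are read as expected-utility values, so that a lottery is worth the probability-weighted average of its outcomes; the function names are hypothetical.

```python
def expected_score(lottery):
    """Expected score of a lottery given as (probability, score) pairs."""
    return sum(p * s for p, s in lottery)

def indifferent_to_middle(low, mid, high):
    """True if `mid` is worth exactly a 50/50 lottery between `low` and `high`."""
    return expected_score([(0.5, low), (0.5, high)]) == mid

# The same indifference holds wherever the three scores sit on the scale.
for triple in [(0, 1, 2), (1, 2, 3), (2, 3, 4)]:
    print(triple, indifferent_to_middle(*triple))  # True for each triple
```

Shifting all three scores by the same amount leaves the result unchanged, which is the sense in which only differences carry information here.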

This is why I’m not a big fan of the ratio-based normalisation that some seem to be keen on.

Using this logic, if you’re doing score + some sort of run-off, I’m not sure there is any sensible alternative to STAR. 0, 1 would have to be treated in the same way as 4, 5. Well, you could come up with some alternatives, I suppose: you could say that there is a maximum amount that you can increase the gap by, and you’d still be sticking to differences rather than ratios, but I’m not sure what the motivation would be.

But I think I’m largely repeating myself from the other thread.


Somewhat related to this would be saying “it’s twice as important to you that your 2 beats the 0 as your 1 beats the 0”, right?

If by this you mean to some extent “an acceptable variation of STAR (for you) would allow the voter to weaken their vote below what they could give in STAR”, then I think a reasonable motivation for that would simply be to allow the voter to avoid imposing their preferences on others if they want to. It’s the same reason why some prefer to use Score voting not only in 3+ candidate elections, but also with only 2 candidates, despite FPTP probably being simpler to use in such cases (and only differing in result for strategic voters if they’re taking write-in candidates into account).

Yeah, that’s pretty reasonable. You’ve chosen a candidate to represent zero, rather than just giving a zero some arbitrary value of badness.

Right. Can you imagine, though, if a 2 person election were presented this way?

Which candidate do you prefer?
[ ] Alexander Hamilton
[ ] Thomas Jefferson

How much do you want your vote to count?
[ ] 100% of maximum
[ ] 90% of maximum
[ ] 70% of maximum
[ ] 50% of maximum
[ ] 30% of maximum
[ ] 10% of maximum

Seems kind of silly, but that is no different than using score voting for the two candidates.

It’s kind of like saying, “one person, one vote, unless that person prefers to give some lesser amount”

I think that for some people, they waver between voting and not voting because they feel sort of morally bad if they give too much support to a lesser evil candidate they don’t prefer that much over any of the others. Such people might actually be more likely to politically participate if they can sort of send a signal to the candidates that “I’m here, but if you want more support from me, then you need to reach out more.”
I sort of think of it like this: if every voter had 9 clones who they could direct to vote in any manner, then should we always expect a voter to have all 9 cast the same votes they do? Or could there be some room for voters who feel that they don’t want to impose their preference too much on others?
In addition, there’s a “how well do you know the candidates” type of argument here too: if I barely do any research, and feel that one of the candidates may be better than the others, then is it really appropriate for me to vote all-or-nothing on that candidate?

Also, I’m surprised that you understand this so well but not rated pairwise.

I’d like to hear if anyone actually feels that way. (I doubt it is a common sentiment but who knows)

I think more people skip voting because they think their vote is so insignificant that it is not going to change anything (which, ya know… is kinda true!), not because their vote is going to be too significant.

Not sure what you mean by that.

@RobBrown I 100% agree with all your statements and citations but 100% disagree with your conclusion. I think you are inadvertently straw-manning what I am trying to propose. The problem with score is that everybody is on a different scale. I even have a section on this problem on the STLR page. Even though there is ample empirical evidence that we cannot get people to score with MIN as their least favourite and MAX as their most favourite with others spaced in between, this only maximizes the problem. We both agree on the problem; the question is whether anything can be done about it.

This is the straw-man. I never have proposed, nor would I ever propose, such a thing. To illustrate my point, let’s say there is an absolute scale of utility from negative to positive infinity. For any given voter, the most and least liked candidates on the ballot can be any two points on this objective scale. When comparing two voters, the ranges between MAX and MIN may not even overlap on this objective scale. So how are we going to reconcile this issue when we do not have access to the true scale but can only ask people to give us a few numbers?

One thing to start with is that we want to give everybody an equal vote. So even if this election is the most important thing in somebody’s life, and they end up in a gulag if it goes the wrong way, we give them the same vote power as somebody who will be unaffected by the outcome. I do not think anybody is here to argue otherwise, and in the West the stakes are normally not so different between people.

So, given that assumption, I need an anchoring point. What I want to anchor on is the flip point between want and do not want. Much of this is practical. For game-theoretic reasons it is dumb to score anybody you dislike greater than zero. As I said above, it is borne out in data that people do this. That puts the anchoring point for zero at neutrality. Yes, this is arbitrary, but it is somewhat naturally motivated and relatively easy to explain. What you were arguing on the other post was not the sophisticated argument about utilities but an argument that you did not know your own preference: that you did not know what the zero meant to you because you did not know the flipping point between like and dislike. I honestly find that hard to believe, but I’ll humour it for now.

You have mentioned before that you like the analogy to economics. I like to think of it that way too. Score is then thought of similarly to money. So if there were tasks it would be pretty easy to know if you wanted to do them. Would you need to be paid to do it or would you be willing to pay to do it? Getting punched in the face is likely not something you would want so you would not pay for it. Similarly watching paint dry is not any fun so that is also worth zero dollars to you. I would tend to think that people can make similar assessments of what candidates are worth to themselves or at least for game theoretic reasons come up with such scores.

So now that we have a zero, which admittedly is not perfect, we need to talk about the MAX end of the scale. This is intended to be the most preferred candidate. The difference in utility between the most preferred and zero will be different for every person. This is what the levelling in the runoff is for, but let’s talk about the metric space of the scale, i.e. the mapping function from utility to score. You state

Not only does this go against @Toby_Pereira’s assertion that it is the difference which people think in, but you have scaled linearly, exactly as I would expect. You have basically proven that it is somewhat intuitive to you. You went from candidate C at 50 to 0 and scaled B linearly.

Anyway, since we have already anchored zero, it is MAX which is important. Since we know zero is neutrality and MAX is favourite, it seems pretty clear that a linear scale is a straightforward way to define it. Twice the score is twice the endorsement, since all values greater than zero are considered endorsement. Many argue that people tend to think on different scales, like log. Your above example shows you do not, but suppose others do. The system is designed to use linear and people will be told to use linear. So instead of letting people all do it in a different way, we are pinning it down to something specific.

The linear choice and the zero choice are choices. Both are natural and simple, but there is a deeper goal they serve. The point of the choices is twofold: 1) it puts people on a bit of a more similar scale; 2) it allows for a very simple normalization metric where all positive values are multiplied by a constant. If we have a top-two runoff, we can use this normalization (dubbed levelling) to put the favoured of the two up to MAX and put the other at a score value mapped based on the above choices. Since we agree that each person gets the same vote power, everybody’s favoured of the two should be at MAX. The influence should then be reduced by SUBTRACTING off the relative utility of the other option. We can calculate that relative utility because of all the choices made above.
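The levelling step described in this post could be sketched roughly as follows in Python. This is one reading of the paragraph above, not an official STLR definition: the 0 to 5 range, the function name, and the handling of tied or all-zero ballots are assumptions made for illustration.

```python
MAX_SCORE = 5  # assumed ballot range of 0..5

def levelled_influence(score_a, score_b, max_score=MAX_SCORE):
    """Return (weight for A, weight for B) that one ballot adds in a runoff.

    The favoured finalist is scaled up to max_score, the other finalist is
    scaled by the same constant, and the ballot's weight is the difference.
    """
    top = max(score_a, score_b)
    if top == 0 or score_a == score_b:
        return 0.0, 0.0  # assumption: no expressed preference, no weight
    factor = max_score / top  # the constant applied to all positive scores
    levelled_a, levelled_b = score_a * factor, score_b * factor
    gap = abs(levelled_a - levelled_b)  # a subtraction, not a ratio
    return (gap, 0.0) if levelled_a > levelled_b else (0.0, gap)

print(levelled_influence(5, 4))  # (1.0, 0.0)
print(levelled_influence(1, 0))  # (5.0, 0.0)
print(levelled_influence(2, 4))  # (0.0, 2.5)
```

Under this reading, a ballot scoring the finalists 1 and 0 carries full weight, while one scoring them 5 and 4 carries a single point.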

I am not claiming to have solved the scale problem in score. The claim is that STLR will mitigate it to some extent. STAR does a similar thing but it makes different assumptions and ends up being majoritarian which I do not like. My point is that it is better than score because it takes out some of the problem of arbitrary scales and it is better than STAR because it is not majoritarian.

One final point about the “it’s the difference, not the ratio, which matters” argument. I am not sure how to explain this succinctly without getting deep into the math, but here goes. Yes, it is the difference which matters for true utility, but we do not have that. We have the score, and the difference of one score point represents a different utility change for each person. So when I anchor the zero and then do the ratio, I am actually just doing the difference with an adjusted mapping to the true utility space. This is why the final influence is the difference of the two in the runoff. I need to put them on an approximated scale first, though, and that requires the ratio. The ratio is for the mapping between the spaces, not the calculation of differing utility/impact. That is still a subtraction, as I said two paragraphs ago.

What I saw and commented on was this:

Score your level of endorsement. No endorsement is a zero. Give your favourite the MAX (ie 5 or 9 depending). Give somebody you like half as much as your favourite half the score.

“No endorsement” is one of those absolutely defined terms. But more importantly, the use of “like half as much” is what I was referring to; that doesn’t compute for me. I don’t think I have ever described myself as liking something half as much as something else. There would need to be a clearly defined baseline, and there isn’t in a voting context.

I’m not sure where you saw me say that “I did not know my own preference.” Maybe you can show me where I said that?

To be clear, I understand that when I use the word “like” and “dislike” they are dependent on context, and therefore are relative. Again, I know my preferences, but whether or not I apply the words “like” and “dislike” is a different matter.

Likewise, if I say my favorite color is #ffc524, I do know my own preference… quite precisely actually. Whether or not I want to use the word “yellow” or “orange” is a different matter. The latter has nothing to do with my preference, and is simply semantics.

If I am talking about what foods I like or dislike, it is relative to the foods that are generally available in my lifestyle and my culture. If I suddenly find myself homeless or living in a post apocalyptic world or whatever, then the baseline changes.

If I say I like my job, it doesn’t mean I’d go to work even if they didn’t pay me. It is relative to what other options I could imagine might be available to me, what jobs I see other people in my peer group have, etc.

It’s ok if I use the words as shortcuts in regular conversation, but, like many people who give these things a lot of thought, I understand that they are simply shortcuts.

Back to elections… I know a lot of people who voted against John McCain in 2008, and voted for Hillary Clinton in 2016, while actually liking McCain more than Clinton. Whether or not they say they “like” either of them, really shouldn’t matter in a voting situation… what should matter is how those candidates compare to others that are running or who have a chance of winning.

I’ll do my best to read the rest of your post carefully and consider it as I have time, but that jumped out at me.

I see it like this:
Considering the single interest, for example “pollution increase / reduction”, if the candidates were to be voted, considering only how much their ideas produce or reduce pollution, then I would say that there is an “absolute” (zero) point and:

  • all candidates who support, little or much, a reduction in pollution, will have positive values.
  • all candidates who support, little or much, an increase in pollution, will have negative values.

Min and Max are still relative, but zero is not.

Regarding the voter scale:
If I say that “the candidate with the highest sum of points wins (as in SV)”, then I am saying that “if the vote is A[10] B[5] C[0], then the voter likes A 2 times as much as B”.
If you vote B[5], you know that another vote of B[5] will be needed to put B on par with A[10]:
2 times B[5] = 1 time A[10].
If a voter votes A[10] B[5] C[0] without thinking that A is twice as good as B, then they have not understood how the counting (sum) works.
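The counting argument can be shown with a short Python sketch of plain score summation. The second ballot is invented for illustration, and the function name is hypothetical.

```python
def score_totals(ballots):
    """Sum each candidate's points across all ballots."""
    totals = {}
    for ballot in ballots:
        for candidate, score in ballot.items():
            totals[candidate] = totals.get(candidate, 0) + score
    return totals

ballots = [
    {"A": 10, "B": 5, "C": 0},  # the ballot from the post
    {"A": 0, "B": 5, "C": 0},   # assumed second ballot giving B another 5
]
print(score_totals(ballots))  # {'A': 10, 'B': 10, 'C': 0}
```

It takes two ballots giving B five points to match one ballot giving A ten, which is the sense of "2 times B[5] = 1 time A[10]".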

I’d point, to some extent, to communities that feel ignored. There’s been a long-time movement among some (maybe very few) black Democrats to try to force the Dems to listen to their concerns and not take them for granted as a base voting bloc, by either not voting or going to the Republicans.

By rated pairwise, I’m referring to the How should transitivity be handled with rated pairwise preferences? discussion. In this topic, you are showing clear understanding that Score voting in a two-candidate election is equivalent to being able to vote at varying degrees of power. So it surprises me that, when we were discussing Score voting used in every pairwise matchup, you didn’t see that as the same (i.e. that you’re allowed to give your full vote in each matchup, like in Condorcet, but also less than that if desired).

Yeah, I suppose it could be viewed that way.

Yes, but then we could just have score voting. Also, when it comes to complexity of a voting method, I’m not sure it’s worth adding extra complexity in to find something that’s midway between score and STAR.

This isn’t necessarily the case though. It can depend on the likely outcome of the election. The two frontrunners both might be candidates I dislike, but I might still dislike one more than the other. In that case, it would make sense for me to register a difference between them. If I gave them 0 and 1, then under STAR and the system you proposed, this would give me a maximum vote in the run-off.

It could be as simple as allowing the voter to add however many points they like to the gap they indicated on their ballot. So if you say you want to add 3 points to your gap in the runoff, and it ends up being held between candidates you scored a 5 and 4, that would create a 4-point gap in the runoff.
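A minimal Python sketch of that "add points to the gap" idea follows. The function name is hypothetical, and the rule that a tied pair stays at zero regardless of the added points is an assumption, since the post does not cover that case.

```python
def runoff_gap(score_x, score_y, extra_points=0):
    """Gap a ballot contributes in the runoff after the voter's chosen boost."""
    base_gap = abs(score_x - score_y)
    if base_gap == 0:
        return 0  # assumption: no preference between the finalists, no weight
    return base_gap + extra_points

print(runoff_gap(5, 4, extra_points=3))  # 4
```

A voter who adds nothing keeps the gap they wrote on the ballot, so the result stays a function of differences rather than ratios.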

I think we are talking past each other a bit here. What you are arguing against is not what I am arguing for. I wrote a lot in the last comment about what I mean. Do you still believe that you could not fill out a ballot to that specification? The comment you quote was an answer to what I would say to the average person filling out a ballot as instructions. I would be surprised if nearly all people would not be able to fill in a ballot correctly with such instructions. If I am wrong, then can you find a way to phrase my intent better than I have?

You just said it again. I think we are just talking past each other.

Yes, the nuanced conversation about utility and how it is dependent on the situation is interesting, but it does not apply. We are talking about voting in a specific election with specific candidates. That is the context. I am not interested in having a long conversation about how context changes preference. I find that obvious and boring.

Yes, given the set of candidates as context, who do you endorse? This is the exact same question as in approval voting. I would think that the same people who have 0 in approval would have a 0 in STLR voting.

Most of this stuff is not exactly true. It is mostly true and good enough. We can always find strategic situations. On the other post I said that things get tricky when you do not endorse anybody on the ballot. I do not claim STLR fixes all these problems for all people. My claim is that it lowers strategic incentives and mitigates some of these problems relative to score. The claim is not that STLR is perfect but that it is better.

Still not sure I understand, but… are you thinking I am advocating that? I think voters should always give their “full vote”, and I don’t think we should complicate things to allow them to weaken their votes.

Of course I could fill out my ballot, but if you want to get technical, I would not agree it is “to that specification” and I would not know if it qualifies as “correctly” to you. As a voter reading those instructions, I’d just say “whatever” and vote how I see fit. Just like if someone instructs me, under plurality, to “vote for the candidate you like the most,” I feel free to not vote for my actual favorite, if it happens to be a third party candidate that I don’t expect to be a front runner. I am certainly capable of dealing with simplistic or sloppily worded instructions and getting on with my life.

Still, I think the words “give somebody you like half as much as your favourite half the score” would cause me to do a double take and say “that doesn’t make sense.” It would be like someone asked me to make the room half as warm as it currently is.

I suspect a lot of people would find that wording weird and ultimately nonsensical. Better wording is “rate the candidates from zero to five.” I don’t think you need to be more specific than that, and no one will be confused.

But still, if all you are saying is “this is what we say to an average person”, I guess that’s fine. It wouldn’t be my choice of wording, but that’s not all that big a concern.

But I don’t think, here in a forum on theory, that is a good way to explain what a vote is expected to mean, especially if you are very specifically using that logic as a basis for improving voting methods. That’s my main objection.

The problem is I think your intent is either unclear, or based on an error in logic. It appears to me that you have an intuitive “baseline” (i.e. what is represented by zero), that you consider that this baseline is a different thing than “the least preferred candidate running”, and you are unable to understand that others do not have the same sort of baseline.

Again, I go back to the temperature analogy. Can you rephrase the instruction “please adjust the thermostat so the room is half the temperature as it is now” to better capture the speaker’s intent? Or are you like me, and have no idea what their intent is because what they asked is completely unclear?

I do understand that others have another baseline or zero point. I am telling them to use a specific one. The one I propose is the one we see very often in data and the one which a game-theoretically optimal voter would use. I am trying to nail it down. I am not saying that all people find what I have chosen to be intuitive. You keep trying to put words in my mouth.

This is a silly straw man for a number of reasons and I think you know it. I have said repeatedly that this analogy does not capture what I am asking. Since you like the temperature scale so much, let’s just rephrase it in terms of what I am actually doing so you can’t mischaracterize my argument any more.

Step 1. Establish a zero: Zero temperature will be the temperature that you feel is neither warm nor cold. That nice in-between middle temperature. This is subjective, as I have admitted a number of times, but most people will choose room temperature. So we have nailed it down to within a few degrees. Had I set zero to “cold” then it would be different for everybody by a larger degree than choosing neutral.

Step 2. Zero out lower scores/temps: From the list of items, all temperatures which you think are lower than the zero you established in step 1 are zero.

Step 3. Set the max: The hottest thing on the list is fixed to temperature 5.

Step 4. Fill in the others linearly: Assign all remaining items on the list temperatures such that half the number is half as hot. This will be subjective too, but it’s up to the person to decide. In the score system it is more tangible, since half the score would have half the eventual impact.
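A rough Python sketch of steps 1 to 4, assuming (purely for illustration) that a person could attach numbers to how warm each item feels. The function name, the item names, and the numbers are all hypothetical.

```python
def scores_from_values(values, neutral, max_score=5):
    """Map subjective values to scores following the four steps above."""
    # Steps 1 and 2: measure from the neutral point; anything below it is zero.
    above = {item: max(value - neutral, 0) for item, value in values.items()}
    top = max(above.values())
    if top == 0:
        return {item: 0.0 for item in values}  # nothing is above neutral
    # Steps 3 and 4: the hottest item gets max_score, the rest scale linearly.
    return {item: max_score * value / top for item, value in above.items()}

felt = {"ice": 0, "room": 21, "soup": 41, "tea": 61}  # hypothetical numbers
print(scores_from_values(felt, neutral=21))
# {'ice': 0.0, 'room': 0.0, 'soup': 2.5, 'tea': 5.0}
```

Everything depends on where `neutral` is placed: moving it changes every score below the top, which is where the disagreement in this thread sits.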

Temperature is a bad analogy because two people will not have two different temperatures inverted so that one says something is hot and the other says something is cold. But whatever, it is your analogy. Also, the process I have described above is actually how all temperature scales have been made historically. They choose two temperature points and a number of gradations which separate them. This is why when I say 20 degrees is twice as hot as 10 it has meaning.

So voila, here is what I mean in your analogy. And again, as I have said before I know it is not perfect. The solution you are pretending I am advocating for is that of score. Just saying to let people score and then we do the sum is exactly the system I think is flawed. If we have a system with some constraints then it will get better results because it mitigates these issues with score. Also, note that I am not saying “vote this way”, I am saying “your ballot will be treated as if you voted this way”. Imposing the constraints that way means that people can vote more honestly because the system compensates in a fair manner. People will still likely try to vote strategically but STLR is bounded by STAR and Score so I am in a reasonable space.

Ok, I am not following this. My complaint is not about your method in general (it seems to be a bit of a blend of STAR and Score, and I guess that is fine), it was over your using ratioed utilities in the discussion/explanation of them (speaking of “liking a candidate twice as much” as another).

Please stop telling me what I know. I don’t.

The temperature analogy is precisely the same to me. I am not stupid, and I am not being insincere.

Like vs. dislike are identical to warm vs. cold to me, in this sense. For each, there is no meaningful zero. To me, saying “like half as much” is exactly as meaningless as saying “half as warm,” and they are meaningless in the exact same way. Since both of them represent a roughly symmetrical polarity, I can also mention that saying “half as cold” and “dislike half as much” are just as meaningless.

Forgive me for being lost, but please clarify what that specific baseline/zero point is. I’ve tried to say what I’ve understood you to have said (“no endorsement”, “meh,” and at some point you mentioned hatred), but you accuse me of mischaracterizing your words and attacking straw men. I am genuinely trying to understand what you intend to be meant by “zero” if you are saying that it is anything other than “your least favorite candidate running.”

I am also fine with zero simply being defined by usage, a la “you know how the tabulation works, so give a zero to any candidate you want to give a zero to.” This is how I interpret the term “no support” that I see on some example Score ballots… it just means “I want to give them zero points.”

At this point I do not think there is more I can say. We are stuck in a loop. You say that you disagree with something I also find clearly false and invented a whole system to mitigate. Then I give a detailed explanation of what I mean and why the system mitigates it. You don’t reply to my explanation but instead go back to saying what we both disagree with. I do not understand what you do not understand about this. Please go through the temperature example and tell me what it is you disagree with or where you get lost.

So you would define the range on the ballot? One problem with that is increased complexity, or people not voting “properly”. You want the ballot instructions to be as simple as possible. You don’t want it more complex than “Rate the candidates from 0 to 5 where higher is better” or words to that effect (don’t hold me to that specific wording). You don’t want to have to add “0 means… 5 means…”