Cardinal Baldwin Spinoff

Cardinal Baldwin seems like a very good voting method. Thinking about the “rubber band” renormalization, I was having a difficult time justifying to myself why the whole ballot should be “stretched,” rather than simply replacing the score of the highest-rated candidate(s) still in the running with the maximum.

I think it could give voters even less incentive to exaggerate their scores. Basically, giving a candidate a higher score would help protect them against elimination, but that protection would be tempered by potentially competing with the voter’s most preferred choice. The mathematics is also simpler: rather than dealing with rational numbers, only integers are ever involved.
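
To make the difference concrete, here is a minimal sketch of the two renormalization rules, assuming a 0 to 10 scale (the function names are mine, just for illustration):

```
MAX = 10

def rubber_band(ballot):
    """Standard Cardinal Baldwin: linearly stretch the remaining scores to span 0..MAX."""
    lo, hi = min(ballot.values()), max(ballot.values())
    if hi == lo:
        return {c: MAX for c in ballot}  # degenerate case: all remaining scores equal
    return {c: (s - lo) * MAX / (hi - lo) for c, s in ballot.items()}

def promote_max(ballot):
    """The variant described here: only the top remaining candidate(s) jump to MAX;
    every other score is left untouched, so everything stays an integer."""
    hi = max(ballot.values())
    return {c: (MAX if s == hi else s) for c, s in ballot.items()}

# After eliminating A from A=10, B=5, C=2, D=0:
remaining = {"B": 5, "C": 2, "D": 0}
print(rubber_band(remaining))   # {'B': 10.0, 'C': 4.0, 'D': 0.0}
print(promote_max(remaining))   # {'B': 10, 'C': 2, 'D': 0}
```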

I understand that part of the purpose of scoring is for candidates to be scored “independently,” but I don’t think that is ever going to be plausible in a single-winner system. This way voters would be forced to balance out their preferences. I could see it potentially introducing a degree of regret in hindsight, which is something people dislike. I’m not sure that regret would be totally unwarranted, though.

Just a thought, again. Any comments are welcome. Hopefully I’m not spamming with too many topics, I just find this all very interesting.

@Essenzia suggested this a while back. There were a bunch of threads about the differing ways to do this.

I think the same normalization should be applied to all scores. I also think monotonicity is important, so it would be a top-two runoff. Finally, I do not want the final method to be majoritarian. From this I came up with STLR voting.

I don’t understand what you mean here. Maybe you could implement it in a fork of the CodePen, or help me do it: https://codepen.io/karmatics/pen/eYJxXge

The idea behind Cardinal Baldwin, as I see it, is this:

It allows you to determine who the front runners are, and then apply everyone’s vote as if those were the only candidates running.

However, determining who the front runners are is tricky. So let’s narrow them down one by one, trying not to eliminate anyone prematurely.

We’ll then use the best estimate of how voters would have voted, if the ones we have eliminated weren’t on the ballot in the first place. (that’s what the normalizing does, with the assumption that everyone wants to “maximize” the power of their vote)

We can be pretty sure that, if a candidate gets the very lowest average score (or total score…same thing) of all candidates, they aren’t likely to be a front runner. So hopefully we are never eliminating a candidate that could have beaten the front runner pairwise. (or if we are… it is an actual near tie / Condorcet cycle situation, as opposed to a split vote)
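
As a sketch of that elimination loop (assuming the rubber-band rescaling, totals rather than averages, and ignoring ties; the function names are mine):

```
MAX = 10

def normalize(ballot):
    """Stretch a ballot's remaining scores so they span the full 0..MAX range."""
    lo, hi = min(ballot.values()), max(ballot.values())
    if hi == lo:
        return {c: MAX for c in ballot}
    return {c: (s - lo) * MAX / (hi - lo) for c, s in ballot.items()}

def cardinal_baldwin(ballots):
    """Repeatedly drop the candidate with the lowest total score, re-normalizing
    every ballot over the candidates still in the running."""
    candidates = set(ballots[0])
    while len(candidates) > 1:
        current = [normalize({c: b[c] for c in candidates}) for b in ballots]
        totals = {c: sum(b[c] for b in current) for c in candidates}
        candidates.remove(min(totals, key=totals.get))
    return candidates.pop()

ballots = [
    {"A": 10, "B": 5, "C": 2, "D": 0},
    {"A": 0, "B": 6, "C": 10, "D": 3},
    {"A": 2, "B": 10, "C": 7, "D": 0},
]
print(cardinal_baldwin(ballots))  # prints B for this toy example
```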

I would like to see an example where a voter (or block of voters with similar preferences) can gain an advantage by exaggerating under Cardinal Baldwin. (same goes for most Condorcet methods or “near Condorcet methods”) Keep in mind, by my definition, people don’t have an incentive to exaggerate if there is nothing to gain by doing so. (it doesn’t count if they erroneously think there might be)

You know I’ve been pushing the CodePen stuff because we can try out these varieties of methods and see if we can create a scenario where people’s concerns actually cause problems. So feel free to play around with the one I linked and create a ballot set where voters could gain an advantage by exaggerating.

From my perspective, the reason we use 0 - 5, without intermediate scores like 3.625, is not to keep the math simpler. It is to keep the ballots simpler. That concern doesn’t apply to intermediate scores that arise during tabulation. There may be good reason to convert the ballots to, say, 0 - 500, and round intermediate numbers to an integer, so we avoid the fuzziness of floating point math. But simply forcing things to integers with an already small range doesn’t make things better, in my opinion.

Sorry, I will have to look into CodePen and see if I can implement some of these changes I’m describing. But maybe I can explain more clearly if you’ll bear with me through a small example.

Say the vote is like this:

A[10] B[5] C[2] D[0]

Let’s say A gets eliminated. Under normal CB, the vote becomes

B[10] C[4] D[0]

I’m suggesting instead

B[10] C[2] D[0]

So only the maximum moves to the top. The score for C remains unaffected. Similarly we could also have the minimum move to the bottom.

It’s reasonable to assume that if A had never been on the ballot, the voter would have indeed voted:

B[10] C[4] D[0]

So why convert their ballot to

B[10] C[2] D[0] ?

That doesn’t make sense to me. It isn’t consistent with the logic of “You don’t have to worry about guessing how other people will vote. We’ll adjust your ballot to maximize your vote, after we see how things are going.”

If the point is to simplify, ok, I guess an argument could be made that it is simpler. If the point is to make it more immune to exaggeration, I’m quite sure it only does the opposite.

It’s very hard to argue with that, because it does seem reasonable. I’m perhaps inexplicably skeptical of it though. It comes back down to the concept of liking someone “twice as much,” right?

I feel like the scale is linear because voters are forced to compress their ballot into points on a line and the CB method itself scales linearly. I suppose if that’s what voters expect, then it makes sense to preserve the linear structure of the space into which they’ve compressed their ballot. Otherwise the meaning of the space is no longer “I like A [X] times more than B.” But does it really even mean that as it is? It seems like if that’s true, there should be an invariant zero point indicating total indifference, and negative values as well. But it isn’t necessarily possible to establish an invariant zero point with the linear scaling, unless positive and negative scores are scaled independently… If the scale is nonlinear to begin with, then I see no particular reason to try to preserve linearity.

I’m not sure honestly. This, to me, is a philosophically tricky topic. If there are negative scores, it seems like people would score all of their disliked candidates as the absolute minimum in every case. Then it becomes a sort of approval system. But maybe that’s fine.

Or perhaps indicating their relative dislike would still be a good idea.

Yes I don’t accept the concept of liking someone “twice as much.” It is fine in everyday conversation, but it fails when you put it in contexts such as these. It seems to be roundly rejected in any literature on economics, social welfare theory, game theory, etc. As came up in a previous topic: “One cannot conclude, however, that the cup of tea is two thirds of the goodness of the cup of juice, because this conclusion would depend not only on magnitudes of utility differences, but also on the “zero” of utility.” https://en.wikipedia.org/wiki/Utility#Cardinal

I think it would take a lot to overturn the current thought on that, even if it doesn’t jibe with some people’s everyday, intuitive sense of it.

If you think that the ratings you give people have some sort of absolute value, consider whether an Elizabeth Warren fan would change their rating for Biden when voting in the primary (against Warren, Sanders, Klobuchar, etc), vs. the general election against the incumbent president. I’d expect that almost all of them would raise Biden’s rating significantly. Their preferences haven’t changed, they are just expressing them relative to a different field of choices.

Even IF you are still insisting on absolute utilities being a meaningful concept, I’m not seeing the logic for maximizing one rating while leaving other ratings in place. That just seems arbitrary, and I haven’t seen an argument for why it does anything positive. Say it was

A[10] B[5] C[4] D[0]

This ballot says “I like B slightly more than C”

When you eliminate A, the ballot changes to

B[10] C[4] D[0]

That says “I like B a lot more than C”

So I’m left asking, why change its meaning like that?

I think I agree with you about the scaling. That makes sense to me. Sorry, I’m still developing my thoughts about all this, so talking through some of these issues is helpful for me, hopefully it’s at least mildly interesting for you as well.

I also don’t think it makes sense to talk about liking something “twice as much,” what does that even mean? In addition to other things, it doesn’t seem to account for diminishing marginal utility. For example, even reduced to identical goods, people would probably claim to like $15,000 1.5 times as much as they like $10,000, but the law of diminishing marginal utility suggests otherwise.

Like you’re suggesting, it more comes down to “I like X a little more than Y,” “I like X a lot more than Y.” I would also add statements like “I like X and I dislike Y,” or “I am indifferent to X.” Etc.

I do think a system where voters can score candidates positively and negatively and where the positive and negative axes scale independently would be interesting to consider.

Basically three categories: a scale from 1 to MAX, a scale from -1 to -MAX, and a ZERO. Maybe also UNMARKED.

For example, you might see a ballot like

A[10] B[5] C[1] D[0] E[-5] F[-10]

and if A and then F were eliminated, it would become

B[10] C[2] D[0] E[-10]
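
Here is a rough sketch of how I imagine that renormalization working, assuming a -10 to +10 scale (the edge-case choices are mine and not settled):

```
MAX = 10

def renormalize_signed(ballot):
    """Scale the positive and negative scores independently, so the largest positive
    score reaches +MAX and the most negative reaches -MAX, while 0 stays fixed."""
    pos = [s for s in ballot.values() if s > 0]
    neg = [s for s in ballot.values() if s < 0]
    pos_factor = MAX / max(pos) if pos else 1
    neg_factor = MAX / abs(min(neg)) if neg else 1
    return {c: s * (pos_factor if s > 0 else neg_factor) for c, s in ballot.items()}

# Eliminating A and F from A=10, B=5, C=1, D=0, E=-5, F=-10:
remaining = {"B": 5, "C": 1, "D": 0, "E": -5}
print(renormalize_signed(remaining))  # {'B': 10.0, 'C': 2.0, 'D': 0.0, 'E': -10.0}
```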

I wouldn’t be against negative ratings, I actually kind of prefer that zero mean “neutral” and the symmetry that implies.

In practice, I think the biggest difference would be that candidates you haven’t given much thought to are the ones you would be more likely to leave at zero (i.e. neutral), rather than give the worst possible score. In other words, neutral would be the default, rather than “strongly dislike.”

That said, I think that 5 stars (which means six possible ratings) is probably the best user interface, considering how common it is in places like Amazon, Yelp, etc. (although I’m not sure they allow zero ratings) It feels familiar at this point, and that counts in its favor.

I was thinking about that too a bit. One could still use the average ratings only of candidates that pass a certain threshold. I don’t know how necessary that would be, because it would be very unlikely for a candidate who is marked on many ballots to end up with a sum rating of 0 and then be compared with a candidate who received almost no markings. It also seems like people would probably almost never actually mark a 0 score.

I think it might make sense to consider only candidates who were marked on some fixed percentage of the ballots. But again, I’m not sure that would be necessary. It would just be a bit weird for some candidate most people are indifferent to, and whom hardly anybody even considered, to beat somebody else, even if that somebody were widely considered a bad candidate.

I think it might make more sense for unmarked candidates just to be rated as -1 by default or something. Maybe that’s dumb though. Kind of defeats the whole purpose of the ZERO point lol.

Original vote: A[10] B[5] C[2] D[0]

CB uses differences, so there is no correct answer.
In this vote A [10] B [5] C [2] D [0] the voter says:

  1. C deserves 2 points more than D.
  2. B deserves 3 points more than C.

If I delete A, maximizing the voting power, then I have these possibilities:

  • I keep 1 valid, that is: B [10] C [2] D [0]
  • I keep 2 valid, that is: B [10] C [7] D [0]
  • “half way”, that is: B [10] C [4] D [0]

You assume that the voter chooses the third option, but there is no real correct answer: the original differences cannot all be maintained, and depending on which ones the voter decides to keep, the vote changes.
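
Spelled out in code for this ballot (the labels are mine):

```
MAX = 10
b, c, d = 5, 2, 0   # the original scores of B, C, D after A is deleted

keep_1  = {"B": MAX, "C": c - d, "D": 0}                    # keep C-D = 2 valid
keep_2  = {"B": MAX, "C": MAX - (b - c), "D": 0}            # keep B-C = 3 valid
halfway = {"B": MAX, "C": (c - d) * MAX / (b - d), "D": 0}  # stretch linearly

print(keep_1)   # {'B': 10, 'C': 2, 'D': 0}
print(keep_2)   # {'B': 10, 'C': 7, 'D': 0}
print(halfway)  # {'B': 10, 'C': 4.0, 'D': 0}
```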

The logic of DV, for example, says “distribute 100 points based on the preferences expressed,” so if I consider candidates A, B, C, D, the points will be distributed based on this original vote:
A[10] B[5] C[2] D[0] --> A[58] B[29] C[13] D[0]
if instead I consider candidates B, C, D, then the points will be distributed considering this vote (which is the original, without changes):
B[5] C[2] D[0] --> B[72] C[28] D[0]
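
As a rough sketch of that redistribution (ignoring the rounding to whole points, which DV handles in its own way), something like:

```
def distribute(ballot, points=100):
    """Spread the points across candidates in proportion to their original scores."""
    total = sum(ballot.values())
    if total == 0:
        return {c: 0 for c in ballot}
    return {c: points * s / total for c, s in ballot.items()}

print(distribute({"A": 10, "B": 5, "C": 2, "D": 0}))  # ~ A 58.8, B 29.4, C 11.8, D 0
print(distribute({"B": 5, "C": 2, "D": 0}))           # ~ B 71.4, C 28.6, D 0
```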

@cfrank about the zero point:
The thing that seems not to be understood is that, in SV-style voting methods with range [0,10], a voter who wants to use negative scores can put all neutral candidates (such as unknown ones) at 5 and only the disapproved ones at 0, exactly as if the range were [-5,+5].
Using ranges with only positive values such as [0,10] does not eliminate the problem of negative scores; it just masks it.
In DV, instead, the rating range [0,9] is an effective range in which 0 really plays the role of “minimum value of the range” (with no masked negative values).

I used DV as an example to make it clear that alternatives exist to avoid or reduce the problems described.

No, I do understand that. The zero point I’m talking about is more than just an unmasking. The positive and negative values are scaled independently—two different rubber bands looped around a single immovable pin, rather than just a single rubber band.

For example, say we look at ordinary integer scores in CB from 0 to 10, and suppose a voter indicates

A[10] B[5] C[1] D[0]

There is no canonical way to convert this into a vote using independently scaled positive and negative values that will serve the same function. For example, here is one possibility obtained by subtracting 5 to set the zero point, and scaling by a factor of 2:

A[10] B[0] C[-8] D[-10]

However, let’s compare what happens in each case when A is eliminated. In the first case, the vote becomes

B[10] C[2] D[0]

Which, using the same transformation into the positive and negative score space, would give

B[10] C[-6] D[-10]

But with independent scaling, the vote

A[10] B[0] C[-8] D[-10]

Would become

B[0] C[-8] D[-10]

Which serves quite a different function than the other vote.

Choosing the zero point along the single rubber band is arbitrary, but the zero point in this other system serves the invariant function of indicating indifference.

It’s true that not all normalizations are subject to this problem, but:

  • the normalization of CB, which consists of stretching a range, is subject to this problem;
  • the normalization that sets the maximum value to +10 and the minimum value to -10 is also subject to it.

The positive and negative values are scaled independently

If you mean that removing the value +10 only changes the positive values (above 0), then you might be right, but problems remain: given this vote A [10] B [0] C [-4] D [-10]

  • if I remove A, what happens? To maximize voting power I would have to set B to 10, but how would the negative values change?
  • If I take away A and B, is C set to 0, set to 10, or left unchanged?

The answers to these questions are very arbitrary (therefore, it’s very difficult for everyone to accept your subjective answer).

Furthermore, this way of thinking causes unknown (unrated) candidates to end up at 0 by default, drastically favored over those with negative values; this can produce the victory of a candidate unknown to more than 50% of the voters.

The way I imagined it, zero scores always remain zero. If a voter wanted B to count positively, they should have given B a positive score. Scores only increase in absolute value, being scaled outward from 0. Almost nobody rational would ever actually mark a 0 on their ballot; 0 is essentially reserved for unmarked candidates, unless a voter wants to indicate a candidate they are legitimately indifferent to.

Furthermore, you can still only consider the average scores of candidates that are indicated by at least 50% of voters. Ordinary score has the same problem if unmarked candidates are given a midpoint score by default.

On that subject, you could also have unmarked candidates set by default to -1 or -10. I’m not sure, but the default setting is an arbitrary/subjective matter in ordinary score as well.

I don’t believe the logic about maximizing voting power in the context of the original system applies to the different ballot constraints on this different system.

Almost nobody is ever going to mark a 0 on their ballot; 0 is reserved for unmarked candidates.

Ok, then you should create a ballot on which 0 is not among the values to choose from, so you can be sure that candidates left at 0 are candidates who were not evaluated.

Furthermore, you can still only consider the average scores of candidates that are indicated by at least 50% of voters.

This solution could prevent unknown candidates from being favored (giving them -10 by default would too), but I don’t understand why you should also use negative scores instead of the range [0,10].
What would be the improvement of putting a fixed point (0) in the middle of the range?

Ordinary score has the same problem if unmarked candidates are given a midpoint score by default.

This is in fact the criticism I make of all range-style votes. DV is a cumulative voting system disguised as a range ballot, to make it easier to write out.
In cumulative methods (a fixed number of points to distribute), a voter would not give their limited points to unknown candidates when they could instead give those points to an approved candidate.

I’m not sure what improvements there would be, if any exist at all; it’s just a suggestion that I think is interesting to consider. It probably has both pros and cons. It treats like and dislike differently, and allows voters to approximate statements like “I like A, but I don’t like B,” “I dislike D twice as much as I dislike C,” etc.

It’s just a different way of doing it, just as arbitrary as any. That is a fair criticism of score voting in general. But there are fair criticisms of every voting system.

Perhaps if all positive or all negative candidates are eliminated, it could revert to ordinary CB scoring on the single remaining rubber band over the whole range. I feel like that might solve the problems you’re seeing.

Again, I’m just trying to wrap my head around these systems. It’s helpful for me to talk with others about alternatives and modifications that come to mind, because if they can be convincingly argued against it gives me more confidence that the original system is somewhat less arbitrary.

Ok, but this is nothing more than one of the many possible normalizations I talked about in this example:

In this vote A [10] B [5] C [2] D [0] the voter says:

  1. C deserves 2 points more than D.
  2. B deserves 3 points more than C.

If I delete A, maximizing the voting power, then I have these possibilities:

  • I keep 1 valid, that is: B [10] C [2] D [0]
  • I keep 2 valid, that is: B [10] C [7] D [0]
  • “half way”, that is: B [10] C [4] D [0]
  • “half way”, in which the value 5 is kept fixed during normalization; 5 would be 0 if the range had half negative and positive values (I have added this possibility now).

I’m not saying it’s wrong as a normalization, but it seems too subjective to me.
In general, it makes no sense to me to alter the voter’s original vote.

My personal opinion is that arbitrariness is not really a valid criticism of a voting system. Perhaps complexity is, but not arbitrariness. Every voting system is totally arbitrary. All that’s important to me is that they function and meet practical (i.e. not abstract) functional demands.

Even though a rock isn’t a hammer, I can still use it to whack in a nail.

Voting systems are certainly not all equally arbitrary.

I would argue that the only arbitrariness in a Condorcet-compatible method is in resolving the cycles, and even that can be minimized to a degree where it is generally insignificant.

So I certainly can’t get on board with the “arbitrariness is ok” idea.

I’m of the opinion that the Condorcet criterion is also arbitrary. The importance of it is functional, not abstract. It’s possible to find a replacement of Condorcet that serves more or less the same function. As you’ve noted, CB is Condorcet compliant “for all practical purposes.”