I worked a bit more on some Codepens, both to (hopefully) provide some generally useful tools and to dig into the sorts of ballot scenarios where Score and STAR produce different results.

I am exploring a concept I mentioned earlier, “blurring”, which means taking an example ballot set and adding a few more ballots that are a semi-randomized “middle ground” between the ballots in the example. (To produce each one, it picks a few random ballots from the set, averages them, then normalizes the result.)
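A rough sketch of the blurring idea, not the actual Codepen code. The function name `blurBallots`, its parameters, and the ballot shape (an object mapping candidate to score) are all hypothetical. "Normalize" is assumed here to mean stretching each averaged ballot so its lowest score becomes 0 and its highest becomes the maximum score, then rounding; the real tool may do this differently.

```javascript
// Hypothetical sketch: each ballot is an object like { A: 5, B: 4, C: 0 }.
// rng is injectable so the randomness can be made repeatable for testing.
function blurBallots(ballots, count, sampleSize = 3, maxScore = 5, rng = Math.random) {
  const candidates = Object.keys(ballots[0]);
  const blurred = [];
  for (let i = 0; i < count; i++) {
    // Pick a few random ballots from the set (with replacement).
    const picks = [];
    for (let j = 0; j < sampleSize; j++) {
      picks.push(ballots[Math.floor(rng() * ballots.length)]);
    }
    // Average the picked ballots, candidate by candidate.
    const avg = {};
    for (const c of candidates) {
      avg[c] = picks.reduce((sum, b) => sum + b[c], 0) / picks.length;
    }
    // Normalize (assumption): rescale so min -> 0 and max -> maxScore, then round.
    const values = Object.values(avg);
    const lo = Math.min(...values);
    const hi = Math.max(...values);
    const ballot = {};
    for (const c of candidates) {
      ballot[c] = hi === lo
        ? Math.round(avg[c]) // all scores equal: nothing to stretch
        : Math.round(((avg[c] - lo) / (hi - lo)) * maxScore);
    }
    blurred.push(ballot);
  }
  return blurred;
}
```

Calling something like `blurBallots(ballots, 6)` would give six extra ballots to append to the original set before tallying.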

I’m also exploring a method which is Score but using the interpolated median. Like the regular median, it has the “exaggeration resistant” property, but it isn’t nearly as likely to produce a tie that needs to be resolved. It seems to produce results very similar to STAR, with the nice side effect that the results are easy to read, since there is only one round.
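For anyone unfamiliar with interpolated median, here is a minimal sketch of one common definition. It treats each integer score as a bin one unit wide, finds the bin containing the median, and interpolates within that bin based on how many ballots fall below it. The function name is hypothetical, and the Codepen may compute it differently.

```javascript
// Sketch of an interpolated median for integer scores (unit-width bins assumed).
function interpolatedMedian(scores) {
  const sorted = [...scores].sort((a, b) => a - b);
  const n = sorted.length;
  if (n === 0) return NaN;
  // Use the lower-middle value as the median bin, so it is always a score that occurs.
  const m = sorted[Math.floor((n - 1) / 2)];
  const below = sorted.filter((s) => s < m).length;
  const equal = sorted.filter((s) => s === m).length;
  // Start at the bottom edge of the bin (m - 0.5) and move up by the
  // fraction of the bin's ballots needed to reach the halfway point.
  return m - 0.5 + (n / 2 - below) / equal;
}
```

The candidate with the highest interpolated median wins. Two candidates whose plain median is the same will usually still get different interpolated values, which is why ties become much less likely.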

The Codepen is at https://codepen.io/karmatics/pen/jOWQgBJ , and it references the code from a couple of other Codepens (for things such as parsing and writing score ballots).

Here is a screenshot. The ballot sample comes from a post here. You can see that Score has B winning, while STAR and Interpolated Median have A win. (A is the Condorcet winner, as 51% give A a 5.) B wins under Score because B is liked by everyone (getting a 4 from 100%), while 49% really dislike A.
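To make the Score versus STAR split concrete, here is a small sketch using made-up ballots that match the description: 51 voters give A a 5 and B a 4, and 49 voters give B a 4. The exact score the 49 give A is not stated above, so a 0 is assumed, as is C getting 0 from everyone. The function names are hypothetical, and tie handling is left out.

```javascript
// Hypothetical ballots consistent with the description (A's score from the 49 is assumed to be 0).
const exampleBallots = [
  ...Array(51).fill({ A: 5, B: 4, C: 0 }),
  ...Array(49).fill({ A: 0, B: 4, C: 0 }),
];

// Sum each candidate's scores across all ballots.
function scoreTotals(ballots) {
  const totals = {};
  for (const b of ballots) {
    for (const [c, s] of Object.entries(b)) totals[c] = (totals[c] || 0) + s;
  }
  return totals;
}

// Score: highest total wins.
function scoreWinner(ballots) {
  const totals = scoreTotals(ballots);
  return Object.keys(totals).sort((x, y) => totals[y] - totals[x])[0];
}

// STAR: the top two by total go to a runoff, decided by how many
// ballots score one finalist above the other.
function starWinner(ballots) {
  const totals = scoreTotals(ballots);
  const [first, second] = Object.keys(totals).sort((x, y) => totals[y] - totals[x]);
  let prefFirst = 0;
  let prefSecond = 0;
  for (const b of ballots) {
    if (b[first] > b[second]) prefFirst++;
    else if (b[second] > b[first]) prefSecond++;
  }
  return prefSecond > prefFirst ? second : first;
}

console.log(scoreTotals(exampleBallots)); // { A: 255, B: 400, C: 0 }
console.log(scoreWinner(exampleBallots)); // B
console.log(starWinner(exampleBallots));  // A
```

With these numbers B has the higher total (400 to 255), but in the runoff 51 ballots prefer A to B and only 49 prefer B to A.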

In the second image, I have created 6 additional “blurred” ballots, which you see in the middle text area. Since these have an element of randomness, they’ll be different every time, but they aren’t completely random. For instance, all of them have B scoring high, and all have C scoring zero. This is consistent with the ballots in the sample.

If you add far fewer than 6 blurred ballots, A will typically still win, but if you add many more, B will almost always win.

To me this suggests that the example is unnaturally contrived, since in a real-world election you’d expect the electorate to include some number of people who are more in the middle. Methods like STAR or Condorcet methods seem to require a little bit of “fuzz” in the electorate to work well.

Anyway, I’m interested in thoughts on this, but mostly I’m hoping that it inspires some people to get interested in doing Codepens, since they are so easy to share and collaborate on. (Remember that non-coders can go in and play with different sets of ballots and different amounts of blurring on this one, and as time goes on there should be a ton more things to experiment with.)