This is a spin-off from this other post
I want to dig deeper into evaluation methods. In multimember sequential score systems, all of our basic criteria other than accountability and local representation are covered. This puts us ahead of all other “PR Systems”, but there are still many such systems. There are 6 in the post, and there are many others, which come in allocation or optimal types. So how do we decide which is “best”, and what metrics can we evaluate them under? So far in the post we have come up with the following metrics:
Total Utility = The total amount of score which each voter attributed to the winning candidates on their ballot
Total Log Utility = sum of ln(score spent) for each voter [I actually use ln(1 + x) to avoid infinities]
Total Favored Winner Utility = The total utility of each voter's most highly scored winner
Total Unsatisfied Utility = The sum of the score left to be spent by all voters, i.e. 5 - sum(score) for each voter, then summed over all voters
Fully Satisfied Voters = The number of voters who got candidates with a total score of 5 or more
Wasted Voters = Voters who did not get any candidate they scored
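To make the definitions above concrete, here is a rough sketch of how they could be computed. The names (utility_metrics, ballots, winners, MAX_SCORE) are made up for illustration. It assumes ballots are a voters-by-candidates array of scores from 0 to 5, that "score spent" means the total score a voter gave to the winners, and that unsatisfied utility is floored at zero for voters who got more than 5. Those are my assumptions and a given system may account for spent score differently.

```python
import numpy as np

MAX_SCORE = 5  # assumed ballot maximum, matching the "5" in the definitions above


def utility_metrics(ballots, winners):
    """ballots: (n_voters, n_candidates) scores in 0..MAX_SCORE; winners: candidate indices."""
    # Keep only the columns belonging to winning candidates.
    won = np.asarray(ballots, dtype=float)[:, list(winners)]
    # Assumption: "score spent" = total score each voter gave to the winners.
    per_voter = won.sum(axis=1)
    return {
        "total_utility": float(per_voter.sum()),
        # ln(1 + x) so voters with zero utility do not produce -infinity.
        "total_log_utility": float(np.log1p(per_voter).sum()),
        "total_favored_winner_utility": float(won.max(axis=1).sum()),
        # Assumption: floored at 0 so over-satisfied voters do not go negative.
        "total_unsatisfied_utility": float(np.clip(MAX_SCORE - per_voter, 0, None).sum()),
        "fully_satisfied_voters": int((per_voter >= MAX_SCORE).sum()),
        "wasted_voters": int((per_voter == 0).sum()),
    }


# Toy example: 4 voters, 3 candidates, candidates 0 and 1 elected.
example_ballots = [[5, 0, 3], [0, 5, 0], [2, 2, 0], [0, 0, 4]]
example_metrics = utility_metrics(example_ballots, winners=[0, 1])
```

In the toy example the per-voter totals are 5, 5, 4 and 0, so total utility is 14, favored winner utility is 12, unsatisfied utility is 6, two voters are fully satisfied and one is wasted.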
The problem is the same as the reason why I put “PR Systems” in quotation marks above. Basically, there is no real way to define PR in a nonpartisan sense. The standard check is just that a system matches partisan PR in the limiting case of bullet voting. I have been using the term “Ideal Representation” to refer to a fair downsampling like what you would get from a sortition. There is no real way to calculate this. Even if we had all the voters' and candidates' positions in the ideological space, the sample size is too small to do something like a chi-squared test for independence.
So the question is, what are other metrics which would probe something like PR? The above ones have Utility sliced in many ways but I feel like we are missing the issue of spread/variation.
I was thinking of just looking at the variance of the scores for the winners. When comparing two systems, the variance of the scores would be a decent relative metric for polarization/centrist bias. This is not really the same thing as PR, since MMP is polarizing and is considered 100% PR. However, it could at least be a way to tell whether a system is polarizing. STV, for example, could be used as a reference point for a polarized system. Specifically, the metrics for comparing the methods could be:

The standard deviation of all scores for each individual winner: one metric for winner 1, then 2, then 3, etc. I would expect the standard deviation to get larger for subsequent winners.

The mean of the above across all winners. This would give a metric for the system as a whole since the order of the winner set is not important.

The standard deviation of all scores given to all winners.
This way we can at least see if the variance of the scores is too high. I am using standard deviation because it is the standard. Median absolute deviation and Average absolute deviation are also options but I am not sure if they are better. Thoughts?
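For anyone who wants to try this, here is a minimal sketch of the three spread metrics plus the two alternative deviations mentioned. The function names are hypothetical, the ballots are assumed to be a voters-by-candidates score array, and I use the population standard deviation (ddof=0). Using the sample version instead would be just as defensible.

```python
import numpy as np


def spread_metrics(ballots, winners):
    """ballots: (n_voters, n_candidates) scores; winners: candidate indices in election order."""
    won = np.asarray(ballots, dtype=float)[:, list(winners)]
    # Metric 1: SD of the scores each individual winner received (one value per winner).
    per_winner_sd = won.std(axis=0)  # population SD; an assumption, not the only choice
    return {
        "per_winner_sd": per_winner_sd.tolist(),
        # Metric 2: mean of the per-winner SDs, so winner order does not matter.
        "mean_winner_sd": float(per_winner_sd.mean()),
        # Metric 3: SD of all scores given to all winners, pooled together.
        "pooled_sd": float(won.std()),
    }


def median_abs_dev(scores):
    """Median absolute deviation around the median."""
    x = np.asarray(scores, dtype=float)
    return float(np.median(np.abs(x - np.median(x))))


def mean_abs_dev(scores):
    """Average absolute deviation around the mean."""
    x = np.asarray(scores, dtype=float)
    return float(np.mean(np.abs(x - x.mean())))


# Toy example: 4 voters, 3 candidates, candidates 0 and 2 elected.
example_ballots = [[5, 0, 3], [0, 5, 0], [2, 2, 0], [0, 0, 4]]
example_spread = spread_metrics(example_ballots, winners=[0, 2])
```

The two deviation helpers take any flat list of scores, so they can be swapped in per winner or on the pooled scores to see whether the ranking of systems changes.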