CORDIAL MINUET ENSEMBLE



#1 2014-12-27 21:13:08

jasonrohrer
Administrator
Registered: 2014-11-20
Posts: 802

Elo ratings?

Any thoughts on implementing Elo ratings for players in this game?  The most concise summary I could find was here:

http://leagueoflegends.wikia.com/wiki/Elo_rating_system


The most obvious way would simply count leaving a table ahead as a win and leaving a table behind as a loss.

However, it seems like that would encourage weaker players to win one chip and then leave against a stronger player.

Similarly, instead of a score of 1 or 0 like chess, you could imagine a score based on the number of chips taken at the table (a score between -100 and +100, ignoring the tribute).  But that's weird too, because a high ranking player would be expected to take more chips from a low ranking player, and if the high ranking player took less than expected, both players' scores would change.  That doesn't capture it either.  The weaker player would still want to take one chip and leave.

The system could ignore all games where less than 50% of chips change hands, and then count binary wins and losses for games that exceed 50% transfer (or maybe 25%---whatever threshold we want).  We couldn't just count the sub-threshold games as "draws" because draws would move the weaker player up and stronger player down (so counting draws would still encourage leaving after the first hand for the weaker player).

Still, whatever threshold we pick would have an effect.  The losing player would want to leave the table before the threshold was reached (so the game wouldn't count).
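
For reference, a minimal sketch of the standard Elo calculation being discussed, assuming the usual 400-point logistic scale and a step size of K = 32 (both are assumptions here, not values confirmed for this game). It also shows why a draw moves the weaker player up:

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (between 0 and 1) for player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def updated_rating(rating: float, expected: float, actual: float, k: float = 32.0) -> float:
    """Move the rating toward the observed result; k is an assumed step size."""
    return rating + k * (actual - expected)


# Two equally rated players: the winner gains exactly what the loser gives up.
e = expected_score(1000, 1000)
winner = updated_rating(1000, e, 1.0)
loser = updated_rating(1000, 1.0 - e, 0.0)

# A draw (actual score 0.5) between unequal players lifts the weaker one.
weak_after_draw = updated_rating(900, expected_score(900, 1100), 0.5)
```

Because the weaker player's expected score is below 0.5, any result scored as a draw raises their rating, which is the incentive problem with counting sub-threshold games as draws.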

Offline

#2 2014-12-27 21:39:39

Nate
Member
Registered: 2014-12-23
Posts: 52

Re: Elo ratings?

One problem that you didn't mention for the main game is smurfing, where I could lose 50 one-cent matches to lower my Elo, and then pubstomp my way back to the middle using $1 matches.

Plus I feel like this whole system just encourages us to try to maintain as low an Elo as possible... at least if our end goal is making money off of people.  You'll see a lot of high-Elo players make new accounts to reset it.

For a tournament structure though, something like this could work I think.

Offline

#3 2014-12-27 21:56:23

Nate
Member
Registered: 2014-12-23
Posts: 52

Re: Elo ratings?

Although, if the Elo system didn't affect matchmaking and was only visible to yourself, it wouldn't be bad at all, contrary to what I said.

But even having Elos on the leaderboards could lead people to scrape the leaderboards to find out how skilled their opponent is, at least if they play multiple games.

Actually, people might be doing that anyway, so.... xD

Even with a ton more players, you could still narrow down your opponent by noticing the timing of changes in the leaderboard compared to your own game quitting. Maybe the leaderboards would just need a random, long refresh interval, maybe 15 seconds to a minute, chosen randomly on each refresh.

edit: I just don't want my opponents to have their d-space calculations done for them by scraping leaderboards ;)

Last edited by Nate (2014-12-27 22:06:26)

Offline

#4 2014-12-27 22:03:57

computermouth
Member
Registered: 2014-12-27
Posts: 134

Re: Elo ratings?

Nate wrote:

One problem that you didn't mention for the main game is smurfing, where I could lose 50 one-cent matches to lower my Elo, and then pubstomp my way back to the middle using $1 matches.

^Agreed. Plus, that's part of the fun of head to head stuff like this. You never know if someone's got the nuts, IS nuts, or is just bluffing. :D


Try Linux, get free. #!++ (CrunchbangPlusPlus) is a stable distribution based on Debian 8. Keep it fast, keep it pretty.

Offline

#5 2014-12-27 22:22:52

jasonrohrer
Administrator
Registered: 2014-11-20
Posts: 802

Re: Elo ratings?

Oh, I wasn't suggesting it would be used to match players, or to give you more info when picking your opponent.

It wouldn't be displayed in game at all.

I'm talking about a 5th kind of leaderboard that gets closer to picking out "better players" than what we have now.


As a fix for the leaving problem that I talked about above, just count any leave before the chip threshold is reached as a loss for the player that left.  So, say the threshold is 25.  If you leave at 124, it counts as an Elo loss for you (your opponent would win, Elo-wise, even though you left with more chips).  If your opponent leaves when you are at 124, it would count as a win for you.  If you leave at 125, it would count as an Elo win for you.

I'm not sure what the threshold should be, but I think that would give us a pretty good Elo leaderboard.  Thoughts?
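
The threshold rule above can be sketched as follows. The function names are hypothetical, the 100-chip buy-in and threshold of 25 come from the example, and the tribute is ignored:

```python
def leaver_elo_result(leaver_chips: int, buy_in: int = 100, threshold: int = 25) -> int:
    """Return 1 if the player who left gets an Elo win, else 0.

    Leaving only counts as a win once the leaver is up by at least
    `threshold` chips; any earlier exit is scored as a loss.
    """
    return 1 if leaver_chips >= buy_in + threshold else 0


def stayer_elo_result(leaver_chips: int, buy_in: int = 100, threshold: int = 25) -> int:
    """The player who stayed at the table gets the opposite result."""
    return 1 - leaver_elo_result(leaver_chips, buy_in, threshold)


# Leaving at 124 is a loss, leaving at 125 is a win.
left_at_124 = leaver_elo_result(124)
left_at_125 = leaver_elo_result(125)
# Opponent leaves holding 76 (you hold 124): a win for you.
you_stayed = stayer_elo_result(76)
```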

Offline

#6 2014-12-27 22:24:13

computermouth
Member
Registered: 2014-12-27
Posts: 134

Re: Elo ratings?

Ohhhhhh, in that case, it sounds rad!


Try Linux, get free. #!++ (CrunchbangPlusPlus) is a stable distribution based on Debian 8. Keep it fast, keep it pretty.

Offline

#7 2014-12-27 22:42:14

mzo
Member
Registered: 2014-12-09
Posts: 50

Re: Elo ratings?

Why don't we first try and explicitly define what a better player is in terms of CM? It seems like the current leaderboards are all different views on that, but none really establishes the qualities of "better" in a singular way. Skill in this game takes a few different factors (creatures strategy post lays that out well). I feel like lack of clarity in this definition is making it harder to determine an ideal tournament structure, as the structure will generally skew winning playing style.

Offline

#8 2014-12-28 00:44:19

jasonrohrer
Administrator
Registered: 2014-11-20
Posts: 802

Re: Elo ratings?

Oh, well, the Elo idea was separate from my thoughts on tournament structure.

As I was tweaking the leaderboard formulas, I realized how much the formulas affect our perception of "better."  Like, is Creature Expression the best CM player?  It sorta seems like it based on the old profit ratio.  But then again, he has played tons of games, so, that could inflate his profit.  Also, how would he do toe-to-toe against other really good players?  Profit ratio doesn't answer that.  The new profit ratio, which adjusts for total buy-ins, also doesn't, because "winning more on a given buy-in" doesn't necessarily make you good.  Maybe you've just been up against a lot of fish who foolishly go all-in.


I think that Elo would, though.  It's explicitly a measure of how likely one player would be to win against another.  I was just trying to sketch out what "win" would mean in this game.

Offline

#9 2014-12-28 01:15:05

mzo
Member
Registered: 2014-12-09
Posts: 50

Re: Elo ratings?

In a way you've established some parameters for "winning" for a 1v1 table by giving fixed starting coins of 100, thus providing two losing conditions and giving players an equal starting place (unlike poker where you may lose because of a smaller bankroll). The basic losing condition is something you mentioned above, leaving the table with fewer coins than the other player. Note that I'm saying YOU leaving with fewer, not based on the other player leaving. Imo the true competition between highly skilled players is that whoever leaves the table loses (assuming you are forced to leave at 0 coins). This means you win by either chipping out the other player or getting them to leave to preserve their bankroll.

I consider this the best definition of better player because the better player should (given enough time) eventually chip out the other player unless the other player forfeits first. Obviously real life intervenes and matches between evenly skilled players can go on for an indefinite period of time. It's likely though that eventually one of the players will make a mistake and the balance will shift.

Perhaps there does need to be a side pot rake that slowly whittles away at both players coins and the player who doesn't leave the game gets it. This way there's at least a finite number of matches.

I don't think people should be punished heavily for leaving games, but I also don't think they should be encouraged to table hop too much either, as it diminishes the value of learning anything about your opponent. I even purposely leave and rejoin games just to make my opponent think I could be someone new, and I think that exploits the anonymity TOO MUCH. If I rarely ever know much about my opponent it feels more like a slot machine than a poker game.

Offline

#10 2014-12-28 02:24:32

Pox
Member
From: Canberra
Registered: 2014-12-26
Posts: 15

Re: Elo ratings?

I don't think a system using binary win/loss per table is appropriate unless the game itself places a lot of emphasis on playing a table out. As it is now you can walk away at any point (which I like), so I think the outcome of a game should be measured with more granularity. I'd also recommend using a Glicko-like system, which deals with uncertainty better.

Perhaps take the score of a player to be the fraction of total chips won that was won by them, and when recalculating ratings weight the impact of a given table by the total number of chips wagered. This eliminates the issue of a weaker player leaving with 99/101 or 101/99: in either case there will be basically no impact on ratings anyway. It also means that long, back-and-forth games (where the total chips wagered can often exceed 200) will be weighted more heavily, but this makes sense to me - after all, if you swing back and forth and get back to even scores, restarting the table has very little effect.

It also seems somewhat natural to weight everything by the stakes, but this may make it very difficult for penny players to move their rating - perhaps have separate ladder rankings for various stakes brackets? Not sure on this.

Offline

#11 2014-12-28 05:26:20

jasonrohrer
Administrator
Registered: 2014-11-20
Posts: 802

Re: Elo ratings?

Well, I think in terms of Elo, a penny should be the same as $100.  If you're playing to win, then it doesn't matter what the chips are worth.

Your solution is good, except it doesn't punish leaving at 99/101 enough.  As you point out, the weight will be so small that it won't affect Elo in that case, but that's a problem, because a weaker player who senses they are beat, and cares about their Elo, would be motivated to bail before too many chips are wagered.   "Get me out of here, I don't want this to hurt my Elo."

I really want EVERY game to count, otherwise people who care about Elo are motivated to bail "before the game really counts."  Likewise, all games should count the same amount.

You know, imagine if Chess Elo only counted "all games lasting at least 5 minutes" or weighted games based on how long they were.  A bunch of players would start getting stomach aches at the 4:30 mark.... or for the one that weights by game length, be motivated to throw the game before it went on too long to save their Elo.

What I want is an Elo that means something to players, and that players care about, and this should encourage players to play their best and play a lot, just like it does for other games.  You know, working on your Elo as a beginner at Chess.

If we weight game length, then the player who is winning will be like, "Oh, yes, let's slog this out forever so it will count a lot," while the losing player will be like "Oh, no, let's end this now so that it won't count so much."  What I want is both players motivated to keep playing their best.  The behind player should be trying for a comeback.  The ahead player should be trying to stay on top.

Granted, the desire to raise your Elo may be at cross purposes with the desire to make money... I think that's okay.

Also, yes, walking away from a table at any moment is a big part of this game.  However, being "good" at the game against other skilled players seems very hard to measure if we aren't able to stick those two skilled players together at a table for a while.  If you're in a match against a skilled player and you walk away, all we know is that you walked away.  We can't ignore that fact, though, when computing your Elo.


Still, I'm not totally happy with a "can-leave-now" threshold for Elo either.  If it's 25, and the winning player can leave at 125 while still having it count as a win, then the winning players are motivated to leave at that point (instead of risking a turn-around and risking lowering their Elo).

Jeez... maybe.... it doesn't count as a win for you unless:

A.  Your opponent leaves first
or
B.  Your opponent has no coins left.

So, if you care about Elo, and you're currently winning, you stay at the table until the end.  You don't say, "Oh, I've won enough to count for Elo, I'm outta here."

Otherwise, we don't know that you would have really won if you had stuck around.  Just 'cause you're up to 125 doesn't mean you would stay up.  If your opponent leaves, okay, you won.  But if your 75-coin opponent is willing to stay, we don't know what will happen unless you stay too.
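
As a tiny sketch, conditions A and B reduce to a single predicate (the function and parameter names are hypothetical):

```python
def counts_as_win(opponent_left_first: bool, opponent_chips: int) -> bool:
    """A result is an Elo win only if the opponent left first (A)
    or the opponent has no coins left (B)."""
    return opponent_left_first or opponent_chips == 0


# Walking away at 125 while a 75-coin opponent is still seated is not a win.
walked_away_ahead = counts_as_win(opponent_left_first=False, opponent_chips=75)
```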


(Also, Glicko seems too complicated---I'd really just want one number for each player.)

Offline

#12 2014-12-28 07:01:45

Pox
Member
From: Canberra
Registered: 2014-12-26
Posts: 15

Re: Elo ratings?

jasonrohrer wrote:

Your solution is good, except it doesn't punish leaving at 99/101 enough.

Very true - if you're able to quickly identify a playstyle you're weak against then this lets you avoid that weakness being reflected in your rating. However, I would argue that the very ability to make such a quick identification is a valuable skill in a gambling scenario; and indeed leaving these games quickly is a good strategy if you're trying to maximize your bankroll. In order for this kind of strategy to help you in terms of your Elo, of course, you do need to be playing some high-ranked players: if you only play fish then increasing your rating much past fish level should become impossible.

So the question is: if you have the ability to beat some high-ranked players and not others, and you also have the ability to identify those you can't beat before losing much to them, should this be punished? The answer depends upon what you want the rating to measure - if it's designed to highlight good tournament players then it should be punished, but if it's designed to highlight online money-making skills then it should not.

jasonrohrer wrote:

I really want EVERY game to count, otherwise people who care about Elo are motivated to bail "before the game really counts."  Likewise, all games should count the same amount.
You know, imagine if Chess Elo only counted "all games lasting at least 5 minutes" or weighted games based on how long they were.  A bunch of players would start getting stomach aches at the 4:30 mark.... or for the one that weights by game length, be motivated to throw the game before it went on too long to save their Elo.

"One game" is a much less meaningful unit here than in chess. If two players fight it out for a bunch of rounds and end up at 95-95, they could leave the game and start a new one with very little effect, and this first game would be treated as a tie by the ratings system. It just feels natural to me that the rating system should be "additive" - if they instead play until one chips out from the 95-95, the long, close fight should be taken into account. Granted this is very different to how ratings systems work in other games - a Fool's Mate and a drawn-out endgame are weighted the same in chess. The reason I would treat this differently is that in chess all that matters is whether you win or you lose; while in a gambling game with the option of leaving early, there are various degrees of success and failure.

Additionally, weighting by the number of times the chips change hands is very different than weighting by the length of time (or number of rounds, etc.) that passes - here I feel like we're really just getting back to the question of punishing early leavers.

I guess overall I just think it's best to make "good play" be universal, whether you're trying to maximize your rating or bankroll. If (as you said) you don't think this matters, then most of my points are irrelevant; but you end up with conflicting motivations for your players, and perhaps discourage people who care about their rating from playing high-stakes games.

Here's a question for you: have you considered a separate "ranked" mode where you compete for rating only, instead of for cash? This would avoid all the issues you're having - as you say, if you leave before your opponent chips out, you lose. If you put this rule in place without splitting the gamemodes then I feel like you are to some extent restricting people to care about either rating or making money, but not both.

jasonrohrer wrote:

(Also, Glicko seems too complicated---I'd really just want one number for each player.)

You don't need to expose Glicko's RD in the leaderboard - it's just a nice number to keep track of internally, and solves the issue of skilled new players/accounts demolishing the ratings of the players they are matched up against while climbing the ladder. The arithmetic is a little more complicated, but I'd say it's worth implementing.

Last edited by Pox (2014-12-28 07:04:55)

Offline

#13 2014-12-28 15:03:19

..
Member
Registered: 2014-11-21
Posts: 259

Re: Elo ratings?

Great, I was hoping that you would implement something like this! I also wanted to suggest Glicko rather than Elo, though I don't know what your tolerance for converting mathematics to code is. Elo seems to be overly simple (and has dubious assumptions built into it), so you could always just use one of the improved versions of it.

Regarding trying not to introduce perverse incentives to players to act in a certain way to increase their rating, I suggest making sure that the incentive caused by the effects on a player's rating is totally aligned with their normal incentives: winning and losing money.  In other words, you could just make the weight of the game equal to the profit. If a player realises that their opponent is much stronger then they'd want to leave before losing more money anyway, so IMO there's no additional incentive to leaving early or after you happen to reach 100 coins by random walk or martingale betting. On the other hand, artificial binary rules for determining win/lose/draw are doomed to introduce perverse incentives.

Last edited by .. (2014-12-28 15:05:10)

Offline

#14 2014-12-28 22:33:38

jasonrohrer
Administrator
Registered: 2014-11-20
Posts: 802

Re: Elo ratings?

Well, I'm not afraid to implement math in code... it's just that everyone has heard of Elo, to the point where "elo" is synonymous with "rating" in many contexts.  The formula makes obvious intuitive sense to me, and it's well documented (much longer Wikipedia page about it, for example).

And here, where we're talking about adapting a rating system to support margins instead of binary win/loss, analyzing Elo in this regard is much more straightforward.


So... would you just let your E expected score be the number of chips you're expected to take from (or lose to) the other player?  1 if you take all their chips, 0 if they take all your chips, and 0.5 if you leave the table when stacks are even.
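
A minimal sketch of that margin-based score, assuming a 100-chip buy-in per player, ignoring the tribute, and using an assumed K of 32:

```python
def chip_fraction_score(final_chips: int, buy_in: int = 100) -> float:
    """Actual score: 1.0 with all the chips, 0.0 with none, 0.5 when even."""
    return final_chips / (2 * buy_in)


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expectation, read here as an expected chip fraction."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def rating_after(rating: float, opp_rating: float, final_chips: int, k: float = 32.0) -> float:
    """Update a rating from the chip fraction actually taken (k is assumed)."""
    return rating + k * (chip_fraction_score(final_chips) - expected_score(rating, opp_rating))


# A 1200 player whose 1000-rated opponent leaves with stacks even (100/100)
# took fewer chips than expected, so the stronger player's rating drops.
strong_after = rating_after(1200, 1000, 100)
```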

I was at first thinking it's a problem that a weak player can leave early and thus reduce the Elo of their strong opponent (opponent didn't take as many chips as expected, so their Elo would go down), BUT, as you point out, this is kinda the name of the game.  If you get outta there before the strong player gets you, they're not really so strong, are they?  The strongest player is the player who can take lots of chips from everyone, no matter what.  The player who can charm a weaker player into sticking around, etc.  If the stronger player scares the weaker player away early, the stronger player isn't playing so well.

Offline

#15 2014-12-29 02:18:51

jasonrohrer
Administrator
Registered: 2014-11-20
Posts: 802

Re: Elo ratings?

Okay, this is implemented now.  There's another thread describing the formula, and you can look at the new leaderboards as well.

There's one more problem:  Elo ratings tend to encourage strong players to stop playing to protect their high rating.  That's a problem for another day, though.

Offline

#16 2014-12-29 02:28:27

jasonrohrer
Administrator
Registered: 2014-11-20
Posts: 802

Re: Elo ratings?

Oh, and forgot to mention that the Elo ratings in place now were computed using the full game history logs.  So, I essentially went back in time and imagined that everyone started as a provisional Elo 1000 player whenever they started playing, and watched the outcome of each game in order, changing each player's ranking per game just as if those games had taken place in that same order today, and updating each player from provisional to established status once they hit 20 games whenever they did in the past.

Thus, your Elo today is your real Elo, reflecting your full history in the game so far.
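
A rough sketch of that kind of replay, not the server's actual code: the game-tuple format and K = 32 are assumptions, and the special handling of provisional players is left out for brevity (only the 20-game status flip is tracked).

```python
from collections import defaultdict


def replay_history(games, k=32.0, start=1000.0, provisional_games=20):
    """Recompute ratings from scratch.

    `games` is an iterable of (player_a, player_b, score_a) tuples in
    chronological order, where score_a is A's score between 0 and 1.
    """
    ratings = defaultdict(lambda: start)
    played = defaultdict(int)
    for a, b, score_a in games:
        exp_a = 1.0 / (1.0 + 10.0 ** ((ratings[b] - ratings[a]) / 400.0))
        delta = k * (score_a - exp_a)
        ratings[a] += delta
        ratings[b] -= delta
        played[a] += 1
        played[b] += 1
    # Players with enough games are treated as established.
    established = {p for p, n in played.items() if n >= provisional_games}
    return dict(ratings), established


ratings, established = replay_history([("x", "y", 1.0)])
```

Because the whole history is replayed in order, changing the formula only means changing the update inside the loop and running it again.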

Offline

#17 2014-12-29 14:09:57

..
Member
Registered: 2014-11-21
Posts: 259

Re: Elo ratings?

Great, thanks!

jasonrohrer wrote:

I was at first thinking it's a problem that a weak player can leave early and thus reduce the Elo of their strong opponent (opponent didn't take as many chips as expected, so their Elo would go down), BUT, as you point out, this is kinda the name of the game.  If you get outta there before the strong player gets you, they're not really so strong, are they?

Actually, I've realised that I made an error: using profit still creates an extra incentive to quit games immediately. Someone could watch the profit or dollar leaderboard to detect when they've joined a game with a higher ranked opponent, and leave immediately with even coins, dragging up their Elo ranking. So there is an argument for decreasing the influence of very short games, though I hope few people would do that just to increase an anonymous score.

Closely related, regularly leaving and recreating a game (against the same player) will have completely different effects on the Elo ranking even if the overall profit would be the same. Let's say that your expected score against the other player is 0.3. If you play one game and lose 40 coins (actual score = 0.3), your ranking stays the same. If you split the game into 4 and lose 10 coins in each (actual score = 0.45), then your score increases by approximately 4 * K * (0.45 - 0.3) = 19! Maybe it's feasible to combine the results of multiple games played soon after each other (if that means a total score below 0 or above 1 that's totally fine for the calculation) but it's probably not very significant.
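
The arithmetic above can be checked with a short sketch. K = 32 is an assumption (it is the value consistent with the "approximately 19" figure), and the expected score is held fixed at 0.3 across the four short games for simplicity, although in practice it would shift slightly after each one:

```python
K = 32.0          # assumed step size
EXPECTED = 0.3    # expected score against this opponent, held fixed
BUY_IN = 100


def score_after_losing(coins_lost: int) -> float:
    """Fraction of the 200 chips on the table that you walk away with."""
    return (BUY_IN - coins_lost) / (2 * BUY_IN)


# One game, losing 40 coins: actual score matches the expectation.
one_game = K * (score_after_losing(40) - EXPECTED)

# Four games, losing 10 coins each: every game beats the expectation.
four_games = 4 * K * (score_after_losing(10) - EXPECTED)

print(one_game, four_games)
```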

Offline

#18 2014-12-29 16:07:23

jasonrohrer
Administrator
Registered: 2014-11-20
Posts: 802

Re: Elo ratings?

Yeah, there are problems here for sure.

The good news is that I can manually recompute Elo for everyone using the full history at the push of a button, so we can change the formula as much as we want going forward.

Right now, "fraction of chips taken" is your actual (and expected) score.  We expect you to take X% of the chips based on your relative rating with your opponent.  That doesn't really make sense intuitively, as you point out above with the player who left three times and lost the same amount.

However, if this is the way the Elo works, then you could imagine that this would all work itself out, with the expected number of chips taken floating around how many chips are taken in practice, etc.  Leaving before you lose too many chips would never be as good for your Elo as staying and making a comeback.

The other option would be to track binary win/loss like Chess, but then factor the number of chips taken in the K factor.  "Your rating will go up whenever you win chips, but it will go up more if you win more chips."

Then at least your rating would never go DOWN as the result of winning chips.  Currently, you can win fewer than we expect you to win and have your rating go down.
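
One way this option could look, as a sketch only: the linear scaling of K by chips won is a hypothetical choice, as are the base K of 32 and the 100-chip buy-in.

```python
def scaled_k_update(rating: float, opp_rating: float, chips_won: int,
                    buy_in: int = 100, base_k: float = 32.0) -> float:
    """Binary win/loss score, with K scaled by the size of the chip swing."""
    expected = 1.0 / (1.0 + 10.0 ** ((opp_rating - rating) / 400.0))
    if chips_won > 0:
        actual = 1.0
    elif chips_won < 0:
        actual = 0.0
    else:
        actual = 0.5
    k = base_k * abs(chips_won) / buy_in   # hypothetical linear scaling
    return rating + k * (actual - expected)


small_win = scaled_k_update(1200, 1000, 10)
big_win = scaled_k_update(1200, 1000, 40)
```

Since the actual score is 1 whenever chips are won and the expected score is always below 1, the rating cannot drop after a winning session under this scheme.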

Looking at the top established Elo vs the bottom, we expect that top player to take 77% of the chips in each match with the bottom player.  If they take only 50%, their Elo will go down!
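
To put that 77% in rating terms, the standard Elo expectation can be inverted. This sketch assumes the usual 400-point logistic scale and K = 32, neither of which is confirmed in this thread:

```python
import math


def rating_gap(expected: float) -> float:
    """Rating difference implied by an expected score on the 400-point scale."""
    return 400.0 * math.log10(expected / (1.0 - expected))


gap = rating_gap(0.77)           # roughly 210 rating points
change = 32.0 * (0.5 - 0.77)     # top player's change after taking only 50%

print(round(gap), round(change, 2))
```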


It does suck that the other leaderboards give realtime information about who just bought-in.... hmm...  I'll have to figure out how to fix that.

Offline

#19 2014-12-30 06:20:15

..
Member
Registered: 2014-11-21
Posts: 259

Re: Elo ratings?

I'm coming around to the idea that binary win/loss is better after all. Positive/negative profit will indicate which player performed best, which is the only information that the Elo system assumes is available anyway. Counting the loss of 1 coin due to leaving immediately as a lost game would discourage people from doing that, and is exactly the same as a chess player leaving on turn 1 and forfeiting the game. Ugh, we've argued our way back to the beginning! Adjusting K by magnitude seems fine (though if you go through the maths I expect you'd get some other result), and has the nice effect that those 4 consecutive losses of 10 coins could have a similar effect to one loss of 40 coins.

If the other leaderboards only updated when you left a game rather than both when you left and when you entered it, then they'd be harder to exploit... while adding yet another incentive to quit games immediately : ( Alternatively, don't include games in the last X minutes in updates.

EDIT: Oh, I didn't see that you've already changed it. I think the new formula is pretty good and there's little to complain about. I doubt that the total Elo ranking of players decreasing over time due to the non-zero-sum effect will be significant. It'll be balanced by Elo getting injected due to new players.

Last edited by .. (2014-12-30 06:42:04)

Offline

#20 2014-12-30 15:27:07

jasonrohrer
Administrator
Registered: 2014-11-20
Posts: 802

Re: Elo ratings?

Well, that's the thing... Elo isn't really injected by new players.

Everyone starts at 1000, so if we left it at that, the average would always be 1000.

But to combat inflation of existing players who feed off of overrated new players, new players have provisional Elos that change for 20 games WITHOUT changing the Elos of their opponents at all.  This also combats deflation of existing players who get trounced by an underrated new player.  But in general, we'd expect new players to be worse than average, so we'd expect them to be overrated.

Anyway, those extra points, until their Elos settle after 20 games, are lost.  This means that the average will slowly drift down from 1000, assuming that most new players are overrated.
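
A sketch of the provisional rule as described here, with hypothetical names and an assumed K of 32: a provisional player's rating always moves, while an established player's rating is left alone when the opponent is provisional.

```python
def play_game(ratings, games_played, a, b, score_a, k=32.0, provisional_games=20):
    """Apply one result in place, honoring provisional status."""
    exp_a = 1.0 / (1.0 + 10.0 ** ((ratings[b] - ratings[a]) / 400.0))
    delta = k * (score_a - exp_a)
    a_prov = games_played[a] < provisional_games
    b_prov = games_played[b] < provisional_games
    # A rating moves unless its owner is established and facing a provisional player.
    if a_prov or not b_prov:
        ratings[a] += delta
    if b_prov or not a_prov:
        ratings[b] -= delta
    games_played[a] += 1
    games_played[b] += 1


ratings = {"new": 1000.0, "old": 1000.0}
games_played = {"new": 0, "old": 50}
play_game(ratings, games_played, "new", "old", score_a=0.0)  # the new player loses
total = ratings["new"] + ratings["old"]
```

In this example the new player drops to 984 while the established player stays at 1000, so 16 points leave the system, which is the downward drift in the average.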

Offline

#21 2014-12-30 17:37:26

jasonrohrer
Administrator
Registered: 2014-11-20
Posts: 802

Re: Elo ratings?

Okay, leaderboards have been fixed so that the live game buy-in is taken out of all the formulas.

Thus, you can no longer watch the leaderboard for small changes to figure out who you just joined a game with.

The leaderboards update instantly as soon as a game ends, however.  So, you might be able to tell who you just played against.

Offline

#22 2014-12-31 15:44:51

..
Member
Registered: 2014-11-21
Posts: 259

Re: Elo ratings?

Oh, I was thinking of the case where there is a floor on the minimum possible Elo rating. That causes new or bad players with a score already equal to the floor to inject points into the system. But since you aren't doing this, new players aren't a source of points. Instead, they're a drain: assuming that provisional players are more likely to lose to non-provisional ones, when a provisional player plays a non-provisional one, the provisional player will typically lose points and decrease the average Elo rating of the whole playerbase. This is on top of the drain on points when both players end with less than 100 coins.

So, given this, why are Elo scores for both provisional and non-provisional players on average above 1000? It's a bug in the server. I noticed today that after winning a game my score increased by 24 points and my opponent's decreased by 8. I worked out that my opponent should have lost 24. So I picked over cm_computeNewElo and finally found the bug:

    $bS;
    if( $payoutB > $buyIn ) {
        $bS = 1;
        }
    else if( $payoutA < $buyIn ) {   // <--- bug: should be $payoutB
        $bS = 0;
        }
    else {
        $bS = 0.5;
        }

After you fix this bug and recompute Elo scores, they will drop massively so that the average is below 1000. You are going to have to introduce a source of rating points into the system. (You don't want the leaderboards to claim that most players are worse than a newbie.) The most traditional is a rating floor. The floor could be either equal to the starting score or a bit below it. Because new players are going to be rapidly improving (as compared to e.g. chess players joining a score board, who already know chess) I suggest making the floor equal to the starting rating: losing your first several games says little about what your skill will be in a week. A rating floor has the advantage that points are added at the bottom, so it doesn't cause the ratings of top players to keep ballooning, though what the top ratings are will depend on the total number of active players. However because provisional players don't add to the score of non-provisional ones this will have a slower effect.
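
For illustration, the corrected outcome logic from the snippet above can be sketched in Python (a port for clarity, not the server's PHP; the function name is hypothetical). Each player's score depends only on that player's own payout versus the buy-in:

```python
def binary_scores(payout_a: float, payout_b: float, buy_in: float):
    """Return (score_a, score_b): 1 for a profit, 0 for a loss, 0.5 for break-even."""
    def score(payout: float) -> float:
        if payout > buy_in:
            return 1.0
        if payout < buy_in:
            return 0.0
        return 0.5
    return score(payout_a), score(payout_b)


a_wins = binary_scores(120, 80, 100)      # A profits, B loses
both_lose = binary_scores(95, 95, 100)    # both chipped down below the buy-in
```

With the buggy comparison, a losing player B facing a winning player A would have fallen through to the 0.5 branch. The second example shows the separate point drain mentioned earlier: when both players finish below the buy-in, both are scored 0.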

See also http://en.wikipedia.org/wiki/Elo_rating … _deflation. Notice that it mentions another drain on points:

However, players tend to enter the system as novices with a low rating and retire from the system as experienced players with a high rating. Therefore, in the long run a system with strictly equal transactions tends to result in rating deflation.

Unless you have a big enough source of rating points to overpower those three point drains, I think it's likely that you'll have to get rid of the drain in the situation where both players walk away with less than 100 coins, by redefining victory as ending with more coins than your opponent. That means you played better, after all.

Last edited by .. (2014-12-31 15:49:34)

Offline

#23 2014-12-31 19:38:10

jasonrohrer
Administrator
Registered: 2014-11-20
Posts: 802

Re: Elo ratings?

Whoa, that was a huge bug.  Fixed it and recomputed.

New global average Elo:  987
New non-provisional average Elo:  1011
New provisional average Elo:  983


Well, I wanted "Victory" to match with money victory and not be something else.  If you're expected to win money against a weak opponent, but your opponent chips you down slowly over time and you walk away with less money, then you didn't match our expectation (your opponent played better against you than we expected).

This is certainly a little weird, because the expected score is off for evenly matched opponents.  If players have identical Elo, our expected score is "0.5", or "tie," but evenly matched players in reality wouldn't tie, they'd chip each other down and both lose money.

Still, I wonder how much this happens in practice.

Okay, data.  In CM history of 1522 games, both players have walked away at a loss in 27 games, or less than 2%.

Anyway, this is where I got the idea of provisional ratings to prevent overrated new players from pushing up the scores of existing players (inflation of the existing player base Elo when new players come in):

http://stackoverflow.com/questions/1881 … tart-value

This would obviously cause deflation in the average, while at the same time maintaining the average of the established playerbase.

It's also interesting that if there's a balance between over- and under-rated players coming in, then there wouldn't be inflation here (some players would lose points to nobody, others would gain points that are taken from nobody).

It seems like a rating floor would inflate the ratings of any stronger players who played against the sub-floor players.  E.g, if the floor is 1000, why play against a true-score 1001 player to gain X points when you could play against a true-200 (floored at 1000) and gain the same number of points?  Granted, we can't pick our opponents here, but any player that did "luck out" and get paired against a below-floor player would get an unfair Elo boost.

The same problem occurs if you assume new players are correctly rated at 1000 and let them into the pool.  If they are overrated, anyone paired against them gets an unfair Elo boost, while players paired with true-1000 players don't get the same boost.

It is also interesting that new players can affect other new players in exactly this way.  I could fix that so that provisional players don't affect the ratings of other provisional players (your rating only changes when you play a non-provisional player), but then bootstrapping the whole thing becomes a problem (when I recompute Elo, I roll the whole system back and start everyone as provisional 1000).

Offline

#24 2014-12-31 20:06:52

jasonrohrer
Administrator
Registered: 2014-11-20
Posts: 802

Re: Elo ratings?

Okay, I changed my mind here.

I can imagine it's demoralizing as a new player to watch your rating get beaten down.  So, all players start at 1 now, and that's the floor.  If you lose a game at 1, you stay at 1.

I was worried about how a floor would inflate upper ratings, but given that all players are in the same pool, and everything is relative, it kinda all works itself out, because the upper players will just have ratings that are that much higher.

Finally, if we're going to have a floor, let's do away with this arbitrary 1000 nonsense.  You start at level 1 and go up from there (zero is itself a demoralizing number).
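
A minimal sketch of a floored update, assuming a binary win/loss score and K = 32 (the function names are hypothetical). It shows how a loss at the floor adds points to the pool instead of moving them:

```python
def expected_score(r_a: float, r_b: float) -> float:
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))


def update_with_floor(r_winner: float, r_loser: float, k: float = 32.0, floor: float = 1.0):
    """Winner gains the usual amount; loser's rating is clamped at the floor."""
    delta = k * (1.0 - expected_score(r_winner, r_loser))
    return r_winner + delta, max(floor, r_loser - delta)


# Two players sitting at the floor: the winner climbs, the loser stays at 1.
winner, loser = update_with_floor(1.0, 1.0)
```

Here the combined rating goes from 2 to 18, so 16 points have been injected at the bottom of the ladder.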

It is interesting that putting a floor in changes the order of the resulting ratings on recompute.  For example, Vegetable Duty was 10 before and is now number 3.

Offline

#25 2015-01-01 09:50:13

..
Member
Registered: 2014-11-21
Posts: 259

Re: Elo ratings?

I don't think there's such a thing as an "unfair Elo boost". Your rating goes up and down randomly. If the system is working, then your average rating over time will be close to your true rating. If your rating happens to get knocked way up or way down by something, then it will tend to be quickly corrected.

The playerbase has some true distribution of relative ratings to each other. The total range of ratings for 99% of players may in the long term be something like 1000 points, but as long as there are sources and sinks of points, this distribution will move about relative to the ratings floor (the average rating will adjust) until the loss of points is equal to the gain, because the more players that have a rating equal to the floor, the faster the injection of points. I think that the distribution of ratings right now is way below where it will be eventually. At that point, only a small percentage of non-provisional players should have a rating equal to the floor (currently at 4 out of 33, it's still very high). I'm surprised at how high the ratings of many provisional players are (they're probably genuinely that good), and the low maximum rating.

I compared the ranking of non-provisional players using the latest Elo scores to the first version (with fractional wins). Most players are within 3 places of their old rank. E.g. vegetable duty went from 3rd to 4th.

Last edited by .. (2015-01-01 09:53:56)

Offline
