More Thoughts on Cordial Minuet/The Problem With Tributes

AnoHito · 2014-11-28 09:06:05

I have been delving deep into understanding how Cordial Minuet works, and I think I have gained some valuable insight. First of all, it is absolutely possible to achieve Nash Equilibrium in your picking strategy. It's not as simple as just picking randomly, because if you pick completely randomly you will lose to a player that is utilizing information on how the current board layout affects your odds of winning. But it is possible. That means that you can reduce the picking game of Cordial Minuet to nothing. You can play the game so you don't even think about what numbers you pick, and instead only focus on betting strategy. Additionally, using Nash Equilibrium picking as a basis, it is possible to introduce subtle biases into your picking that would take advantage of someone picking numbers based on another system. In other words, if you identify someone picking with a strategy the contains bias, you can exploit it, and if they are not you should avoid playing them because the best you can hope for is a tie. A tie that will cause you to lose money, thanks to the tribute.

Which brings me to my next point. The tribute system as it stands now, is completely exploitable. I strongly believe that a bot using an optimal betting strategy would be able to consistently break even playing over the long term. But because breaking even still makes money for the person running the game, they could easily write a bot for the sole purpose of breaking even, and thereby increase their profits in a subtle and hidden way. You couldn't do it too much, or people would start to catch on. But, especially if high stakes games become the norm, you could significantly increase your profits by doing this.

Even putting this concern aside, given that the optimal strategy for Cordial Minuet is increasing looking like it will be weighted randomness with compensations to exploit other players' biases, the tribute system makes the game very undesirable. Two players playing with such a strategy could never make money playing Cordial Minuet with and having to pay the tributes. If things stay the way the are now, I believe that tributes will severely discourage people from playing the game, due to their inability to make money without resorting to preying on weaker players. This will further exacerbate the issue I mentioned earlier, regarding weaker players abandoning the game due to consistently losing money.

Any thoughts/comments are welcome.

Asminthe · 2014-11-28 09:41:43

You are vastly underestimating the significance and difficulty of betting strategies. Every theory you have had about the game is based on an incredibly naive view that having the bigger score more often than your opponent results in you beating them, or at least that it is a relatively minor jump between those two points.

Everything you said is exactly as applicable to No Limit Hold'em as it is to Cordial Minuet, where we see none of the consequences you express concern for. Well, except for people making money primarily off of weaker players, but that's kind of the entire point of a skill based gambling game and I don't understand what you want to change there.

Actually, most of what you've said is simply some variation of "the optimal strategy is whatever turns out to be the optimal strategy" or "if there's an optimal column picking strategy then the game is broken". One of those things is obviously true and the other is highly suspect and not substantiated by any of your claims.

The tributes, at their current rates, are fairly generous to the players and give plenty of room to profit in the long term with even a relatively minor skill advantage over your opponents.

Either demonstrate an optimal strategy (which requires solving the entire game) or stop constantly panicking about the fact that there theoretically is one, because that is true of every game that exists.

Last edited by Asminthe (2014-11-28 09:44:31)

AnoHito · 2014-11-28 11:09:13

Asminthe wrote:

You are vastly underestimating the significance and difficulty of betting strategies. Every theory you have had about the game is based on an incredibly naive view that having the bigger score more often than your opponent results in you beating them, or at least that it is a relatively minor jump between those two points.
Everything you said is exactly as applicable to No Limit Hold'em as it is to Cordial Minuet, where we see none of the consequences you express concern for. Well, except for people making money primarily off of weaker players, but that's kind of the entire point of a skill based gambling game and I don't understand what you want to change there.

Betting strategies are complex, it's true. But part of what makes them complex is the game that you're betting on. For example, betting on a coin flip is technically gambling, but unless someone is using a rigged coin, no one is going to make any money in the long term, and there really is no betting strategy that is necessarily better than any other. In Cordial Minuet, by playing to achieve Nash Equilibrium you can actually deconstruct the game into a random number generator. Every round, you get a number that is either lower or higher than the average number, and your bet should be based on the value of that number. In practice it is more complicated, since to determine the true value of your position you must take into account not only what number you have, but what numbers you know the other player cannot have, and what choices you will have left in future rounds. Nevertheless, by correctly accounting for this information, it is possible to gain perfect statistical information on the value of your position. In other words, you have a number that tells you the odds that you will win. So in that sense, once you have successfully deconstructed the game to the point where you can achieve Nash Equilibrium, there is no game beyond the betting strategy.

But okay, so that's not really any different that Texas Hold'em, is it? In Texas Hold 'em, it's possible to calculate your odds of winning at every step of the game too. But the difference is (and this is key) in Cordial Minuet you often definitely know that the value if your position is better than the other players' (or at least probably is), while in Texas Hold 'em having that information is relatively rare. This greatly simplifies the betting strategy, and I've been narrowing in on something I believe is an ideal strategy for betting. I know at least, that if I can control the pace of the betting enough that the other player will not raise too often at times when there is a small gap between the value of our positions, I can probably come out ahead. But it's still very much a work in progress.

Asminthe wrote:

Actually, most of what you've said is simply some variation of "the optimal strategy is whatever turns out to be the optimal strategy" or "if there's an optimal column picking strategy then the game is broken". One of those things is obviously true and the other is highly suspect and not substantiated by any of your claims.

What I actually wasn't thinking of, and what I really should have realized sooner, is that any fundamentally fair game of chance results in a 50% chance of winning. It really sounds obvious when you say it that way, but still... A coin flip is a fundamentally fair game of chance, and the odds of winning are 50%. If you are playing Cordial Minuet in such a way that you have perfect information (or at least information that is not less accurate than the other players') about your chances of winning, and you play with a picking strategy designed to achieve Nash Equilibrium, the game become little more than a series of coin flips.

But what I was saying earlier about introducing a subtle bias is a point I should elaborate on. In Cordial Minuet there are choices that are objectively better than other choices. This is why you will actually tend to get lower values than the other player if you just pick randomly. Choices that give you a higher possible final score in relation to your opponent's are better than choices that give them a higher possible final score. By correctly weighing the relative values of your column choices, you can then pick a random column based on those weights, and achieve Nash Equilibrium. The second and third rounds are a bit more complex than the first, because they have additional asymmetric aspects you need to take into account, but it should still be possible to choose in a way that achieves equilibrium. Once you know how to generate weights that achieve equilibrium in each round, you can actually introduce a subtle bias in your choices that tends to favor columns that are better for you even more than their weight suggests you should. I believe that this is what you could call an optimal strategy for playing to win, because it would be incredibly difficult to figure out what you were doing. Another human player would probably never be able to make better choices without risking exposing an exploitable bias in their behavior. A machine playing for perfect Nash Equilibrium on the other hand, would on average beat you, but assuming that more players are not picking for equilibrium than are, you should still come out ahead.

Asminthe wrote:

The tributes, at their current rates, are fairly generous to the players and give plenty of room to profit in the long term with even a relatively minor skill advantage over your opponents.

Even if that were true, how would you address the possibility of bots playing to force a draw in order to farm money for the server owner?

Last edited by AnoHito (2014-11-28 11:13:16)

jere · 2014-11-28 13:59:51

Some of these ideas were discussed in this TCD thread.

Even if that were true, how would you address the possibility of bots playing to force a draw in order to farm money for the server owner?

A few things here. Wouldn't this apply equally to online poker? Since the skill gap seems higher in poker, the bot could easily be making money off weaker players AND driving rake to the house. And anyway is there anything legally problematic with that? In the above thread, it's mentioned that casinos sometimes hire players just to a) eat up weaker players and b) consistently provide opponents at the table.

In fact bots might be advantageous for CM for that last reason. I'd like to think finding opponents will be a non-issue when the game goes live, but knowing the fate of TCD (and the parity issue), I worry about it.

And finally, maybe this sounds naive, but Jason is the last person I'd expect to do something like that considering his publicly stated financial requirements, revenue, and just how plain stupid it would be.

In Cordial Minuet, by playing to achieve Nash Equilibrium you can actually deconstruct the game into a random number generator.

This is expected though, right? At least the column picking part. Frankly, I've already started using an RNG for both some column picking and early raising. Otherwise, I imagine I'd be rather predictable.

For the betting part, it's been said here that bots are pretty bad at no-limits betting.

But the difference is (and this is key) in Cordial Minuet you often definitely know that the value if your position is better than the other players' (or at least probably is), while in Texas Hold 'em having that information is relatively rare.

True, you sometimes know you have absolutely won. But more often you don't know. And even when you know for sure, the most important thing is reading the other person so you know how much you get them sink into a losing hand. Seems pretty intuitive to me why bots would be bad at that.

By the way, I'm having a little trouble wrapping my head around the probability in the very last betting round. It seems like each option is 50-50, but it also reminds me a little bit of the Monty Hall problem.....

jasonrohrer · 2014-11-28 15:57:56

Yes, a mixed strategy Nash Equilibrium exists in this game.

Finding that strategy is a different matter, however. The Nash Equilibrium is different for each board and it would need to be computed in realtime for each new game by exploring the full, 500,000-leaf game tree. This may or may not be feasible in terms of computational complexity. If it is claimed to be feasible, we need to start talking about what algorithm would be used and what it's complexity characteristics are.

Optimal play of Holdem with no rake is 0 EV if your opponent is playing optimally. With the rake, it's negative EV.

But the point of Holdem, and why it's interesting, is donkeyspace, and even more important, higher-order donkeyspace. Here's how I explained it in an email:

Simplify it to heads-up, no-limit holdem with a rake. If both players play optimally, computing pot odds and all that, then the expected value for both players is negative. First order donkeyspace is making moves that might convince your opponent that you aren't actually CAPABLE playing optimally, to get them to try to raise their EV by taking advantage of your sub-optimal play, which effectively pushes them into sub-optimal play that you can take advantage of in turn. But of course, your opponent knows that you're probably doing this on purpose, that you're pretending to be a donkey and not actually a donkey. So then you push into second-order donkeyspace, where the players are trying to figure out just exactly how far their opponent's pretending is going to go, while modulating their own level of pretending in turn. It's no longer, "Hmm... how bad is my opponent at this game?" but instead, "Hmm... how far out into sub-optimal play is my opponent actually going at this moment in the game, and are they out far enough on their limb so that with a little extra downward pressure, by standing a bit further out on my own limb, I can snap theirs."
It seems quite similar to a martial arts contest, like fencing (or JSB Joust!), where risk is a resource that players are trading in (and obviously fighting video games).
I haven't played complete information, turn-based games (like Chess or Go) extensively enough to know whether this kind of dance is totally eliminated by the practically-infinite layers of "even more optimal play" that never hit bottom. I imagine scenarios in Chess, like traps, where your opponent does something that looks stupid, but you know your opponent is not stupid, so there must be something more going on here several moves in the future. But you CAN figure it out, if you think hard enough. There is a right answer (or set of equivalent answers) to any trap or puzzle that arises. Or did your opponent actually just have a small stroke, or crack under the pressure? Strangely, there is an answer to even that question lurking deep in the game's current state.
In Poker, that answer is unknowable before the showdown.

And donkeyspace is this game's bread-and-butter.

So, even if computing the Nash equilibrium were trivial, there'd still be a non-trivial game in donkeyspace. But unlike RPS or "odd or even," computing the Nash equilibrium is not trivial, and is at least out of reach for humans without computer assistance (and perhaps out of reach for computers too), so this game has an extra layer that RPS doesn't have.

(Building an interesting betting structure around RPS is an open problem.)

As a starting point toward seeking the Nash equilibrium, my empirical tests match your observation that picking your first column randomly with 1/6 weight is not optimal (AI bots that I've written can beat a random opponent by looking at the game tree and picking the column that has a slight advantage game-tree wise).

AnoHito · 2014-11-28 16:14:42

jere wrote:

And finally, maybe this sounds naive, but Jason is the last person I'd expect to do something like that considering his publicly stated financial requirements, revenue, and just how plain stupid it would be.

I tend to agree, but still, money is a strong motivator...

jere wrote:

This is expected though, right? At least the column picking part. Frankly, I've already started using an RNG for both some column picking and early raising. Otherwise, I imagine I'd be rather predictable.

As have I. But do keep in mind that an unweighted RNG will tend to give you lower number than the other player if they are picking the column choices that are the most advantageous in the first round.

jere wrote:

For the betting part, it's been said here that bots are pretty bad at no-limits betting.

So say for example I, or someone else, released a bot to the public that automatically selected numbers for you with perfect Nash Equilibrium, and all you had to do was decide what to bet. Would anyone not use the bot? If you didn't, you would do worse than other players who were using the bot. Would anyone still want to play Cordial Minuet if you removed the picking game from the equation entirely? This is a problem.

jere wrote:

True, you sometimes know you have absolutely won. But more often you don't know. And even when you know for sure, the most important thing is reading the other person so you know how much you get them sink into a losing hand. Seems pretty intuitive to me why bots would be bad at that.

Well, what I have found is that when betting in Cordial Minuet raising by more that one in the first round is almost never a good idea. You need to raise in the first round if you have a strong position, otherwise you will never be able to make money because of the high probability the other player will fold if you raise later. But if you give any indication of the value of your position in your raise, it increases the chances the other player will fold. Additionally, betting a low number maximizes the chance that the other player will call with a bad position, if they are impatient or simply don't want to be seen as predictable. Betting more than one will only expose you to more risk, because you need to bluff at least some of the time (or the other player will always know if you have a strong position), and betting more than one in this situation will just cause you to lose more money if your bluff is called.

The aspect of the betting strategy I haven't quite figured out you, is how to handle a situation where you commit yourself to a round by matching a large bet in the first round because you know you had the higher number, but then get a low number in the second round. This is a perfect situation for your opponent to bluff, although you do have the advantage of knowing that because they can't be certain of what number you have, it's unlikely they would bluff in this situation that often. Still, you can't always fold whenever this happens or the other player could take advantage of you, and I'm not sure what the threshold for deciding to match their bet should be.

By the way, I'm having a little trouble wrapping my head around the probability in the very last betting round. It seems like each option is 50-50, but it also reminds me a little bit of the Monty Hall problem.....

Hmm... not quite sure what you mean. The Monty Hall problem only applies when you have the option to change your choice after another choice has been eliminated.

AnoHito · 2014-11-28 16:27:20

jasonrohrer wrote:

Yes, a mixed strategy Nash Equilibrium exists in this game.
Finding that strategy is a different matter, however. The Nash Equilibrium is different for each board and it would need to be computed in realtime for each new game by exploring the full, 500,000-leaf game tree. This may or may not be feasible in terms of computational complexity. If it is claimed to be feasible, we need to start talking about what algorithm would be used and what it's complexity characteristics are.

And I'm probably not going to be the one to write it. The fact that I know it exists is enough to discourage me for seriously perusing this game, and I'm not really that interesting in writing a bot to farm money from human players. You should also consider, that even if achieving Nash Equilibrium in computationally infeasible (and I don't think it is for various reasons), you could still probably write a program that can approximate it better than any human. Even coming up with a weighting algorithm that gets better results than the average human in the first round would be enough to make it difficult for an unassisted human to be successful at this game.

AnoHito · 2014-11-28 16:38:24

Oh, one other thing I was thinking of though. Cordial Minuet is played on a 6x6 grid, 6 being an even number. Games with this kind of symmetry tend to be easily solvable, as it is asymmetry that tends to give rise to complexity. For example, in Go, all valid board sizes have an odd width/height. The reason being, if the board size were even, the best way to play would simply be to copy the last move of the other player. It would be an unbeatable strategy. But having an odd board size creates an asymmetry, the center space. Only one player can occupy this space, and there is no way to maintain a symmetrical board once this space is taken. That one space in the center turns an easily solvable game, into one of the most complex games known to exist. Something to think about.

Last edited by AnoHito (2014-11-28 16:39:23)

jasonrohrer · 2014-11-28 17:17:18

The following post ignores betting at first:

The Nash equilibrium is only "optimal" in a particular, narrowly-defined way, essentially guaranteeing that you lose the least no matter what strategy your opponent is using and that you stand to gain nothing by switching away from the Nash eq yourself if your opponent is also playing the Nash eq. It's an attractor in the strategy space, a gravity well.

But it is NOT optimal in terms of how much you could win against various different strategies. If your opponent is not playing the Nash eq, you'd better switch strategies, because by playing the Nash eq, you've stuck yourself with no net winnings and rewarded your opponent with more winnings than they deserve. Let's say their strategy is as simple as "always pick column 1 for me and 2 for you." If you're sticking with your Nash eq strategy, you will win 50% of the rounds. Your opponent's strategy is just as good as your Nash eq strategy in terms of the outcome.

The only way you can punish your opponent for their bad strategy, and thereby take more than 50% of the money, is to deviate from your Nash eq strategy. In fact, you could jump up to winning 100% of the rounds.

But at that point, your strategy has become exploitable too. Back when you were playing Nash eq, your opponent had no motivation to switch from "c1 c2," because doing so would net them no more wins against Nash eq. But now that you've deviated, they are motivated to deviate. Note that they won't be motivated to switch to Nash eq themselves at this point, because that would only get them back to 50% wins. They could climb above that by exploiting the weakness in your new, non-eq strategy.

Here is the interesting, counter-intuitive crux: the only way to make money at the game is to play a non-optimal strategy.

Taking the rake into account, if you are playing the Nash eq strategy, you've guaranteed that you will lose money over time, no matter what your opponent is doing. There's a guarantee that you won't lose more than a certain amount, but a guarantee that you will win nothing. If your opponent is playing the Nash eq strategy, all your strategies become equal, all losing the same limited amount of money over time, and tying the so-called optimal strategy.

The only way that you can possibly win money is to play a non-optimal strategy. If your opponent sticks with Nash eq, you will do no worse. However, if your opponent ever deviates, you will be able to exploit their deviation.

Baiting them into deviating is donkeyspace.

Who would spend time writing a bot that plays the Nash eq strategy? A bot that is guaranteed to lose money, no matter what? No, in the real world, the Nash eq is not actually the best bot. Step one is an evolutionary bot that adapts to the opponent's behavior. Step ten is far beyond that:

http://edge.org/conversation/on-iterate … er-dilemma

Yes, the Nash eq is a kind of strategic gravity well, but a well that has many holes when you're down in it in the real world. Especially when the Nash eq itself is complex, the smallest crack in one opponent's strategy will quickly cause both players to slip out of that well, chasing each other's exploitabilities in a rising cyclone that is unlikely to return.

I get your point that I could profit by building such a Nash eq bot and running it. Then I'd have a game where everyone, the bot included, would lose a fixed amount of money over time, no matter what people were trying to do against the bot. I'd think I'd be shooting myself in the foot by doing that, because people would stop playing.

In conclusion, the only player that would be willing to play the Nash eq for long would be a bot that was funded by the house, at which point I'd be making $0 (me running a stable full of bots that were all losing money to me).

Now add betting strategies.

jasonrohrer · 2014-11-28 17:20:01

http://en.wikipedia.org/wiki/Mirror_go

AnoHito · 2014-11-28 18:10:31

I know that you can't beat the other player using Nash Equilibrium, at least not in terms of picking. But there would probably always be enough human players around with bad betting strategies that you could find and exploit, and eliminating picking as a factor would make it much more simple to do so. Putting that aside, as I said earlier, I believe that in any case the best possible strategy for actually winning would be to use an equilibrium strategy as you default strategy, and then at random times deviate from it in order to pick columns that are "better" than the ones your equilibrium strategy tells you to pick. The ideal amount of deviation would be the absolute minimum required to increase your odds to the point where you could be profitable. I believe that it would be next to impossible for a human to effectively counter this tactic, so it would limit the ability of any human player to win if bots were involved.

Oh, and I forgot about the ladder strategy. I hadn't even heard of the superko strategy. The point that I was trying to make, is that I think there needs to be some aspect to Cordial Minuet that adds some additional asymmetry to the picking. Something it would not be trivial for a bot to deal with. However, I have no idea what that should be.

zed · 2014-11-28 22:47:21

jasonrohrer:
> But it is NOT optimal in terms of how much you could win against various
> different strategies. If your opponent is not playing the Nash eq, you'd
> better switch strategies, because by playing the Nash eq, you've stuck
> yourself with no net winnings and rewarded your opponent with more winnings
> than they deserve. Let's say their strategy is as simple as "always pick
> column 1 for me and 2 for you." If you're sticking with your Nash eq
> strategy, you will win 50% of the rounds. Your opponent's strategy is just
> as good as your Nash eq strategy in terms of the outcome.

That's only correct if the oppenent is not playing a recursively dominated
strategy (being one which is dominated once you ignore recursively dominated
strategies). The equilibrium strategy wins against such strategies more than
half the time. As far as I know, it's infeasible to actually do the
computations for this game, so I don't know how common domination is. But I
would be surprised if humans didn't regularly play recursively dominated
strategies.

jere · 2014-11-28 23:29:24

Hmm... not quite sure what you mean. The Monty Hall problem only applies when you have the option to change your choice after another choice has been eliminated.

I said it reminded me of the Monty Hall problem. Not that they're the same. Here's my reasoning and like I said, I'm confused on it, so if someone can flat out say "no, it's actually 50-50", I'd be very happy with that.

What I mean is in the penultimate betting round, you're usually shown up to 6 possible red outcomes. If you treat your opponent's row selection as essential random (only for the point of analysis here), then all the outcomes appear equally likely. So consider the case where my final score is higher than 5 out of the 6 of the opponent's possible scores, I'm tempted to say I have an 83% chance of winning. I think that's reasonable given the information I have at that point, but the next part is more muddled....

In the final round, I'm given new information and only two options remain. This seems intuitively like each is equally likely, which is why I'm reminded of Monty Hall. That problem also presents a probability that appears 50-50 at first but is not.

There's something interesting about the highest value. My opponent will almost certainly choose to reveal that highest option if they can. Otherwise (assuming again I have a score higher than 5/6 of the previous outcomes), they've letting me know for sure I've won. And like in Monty Hall, the rest of the options (besides the two I'm shown) have been eliminated.

Can someone clear this up? Is it 50-50, 83-17, or something else?

Games with this kind of symmetry tend to be easily solvable, as it is asymmetry that tends to give rise to complexity.

The board size seems pretty tightly constrained by the theme (all numbers adding to 666). And I think symmetry is intentional, no? Even if it was 7x7 and one column/row left unselected, I don't see how that becomes much more complex.

As far as I know, it's infeasible to actually do the computations for this game

I believe you, but can you elaborate? I'm not getting it. Each board has only 518,400 outcomes, no?, which is like the search space in a couple moves in Go. I need to think about this one more.

Last edited by jere (2014-11-28 23:36:49)

zed · 2014-11-28 23:53:36

jere:
> > As far as I know, it's infeasible to actually do the computations for this game
> I believe you, but can you elaborate? I'm not getting it. Each board has
> only 518,400 outcomes, no?, which is like the search space in a couple moves
> in Go. I need to think about this one more.

But it isn't outcomes you have to work with, it's strategies.

There may well be a more cunning way to approach it, but the only way I've
seen to try to fully "solve" the whole (betless) game is to reduce it to a
single-move simultaneous move game. You can do that by having a strategy in
the one-move version be the choices the player will follow in the actual game
according to revealed information. So that's 30 possibilities for the first
move, then, for each of the 6 possible revealed rows after the first move, 12
possibilities for the second move, then for each of the 5 possible revealed
rows after the second move, another 2 possibilities for the last move. So
that's 30 * (12*2^5)^6 = 9.6*10^16 possible strategies. Utterly infeasible.

jasonrohrer · 2014-11-29 03:36:28

Good stuff, as usual, Zed.

I was, admittedly, working off of my understanding of Nash eq as applied to much simpler games than CM. Like, in RPS, where the Nash eq is to pick randomly with 1/3 weights, and all that does is guarantee that you win 50/50, regardless of whether your opponent plays rock every time or not.

Still, I think the reasoning is sound that even in CM, you wouldn't want to stick with the Nash eq "no matter what," because it offers no guarantee of winning a given round. If your opponent is playing a known deterministic strategy, for example, there is clear strategy that will always beat that strategy (every round).

I've heard from someone else who came the the same conclusion as you regarding the infeasibility of applying minimax to the betless game as a single move. There was some reference to a linear programming algorithm for computing such things that had a specific running time given the number of possible strategies, but I can't find it now.

Can you post a link that explains "recursively dominated?" My searches aren't turning up much on that topic.

jasonrohrer · 2014-11-29 06:21:27

I feel sure that it's true that there is no PURE Nash eq here, which essentially means that if your opponent is playing a deterministic strategy, you can always beat them, every round (even if they're searching the game tree in a deterministic way), and I think a greedy per-turn strategy would win here.

But I have no proof of this truth. It's clear that on turn 1, if you know what they will pick, you can always give yourself a higher number than they give you, but this also affects future possible picks in complicated ways.

Perhaps the proof can happen by working backward.

Assuming that you're opponent is using a known, deterministic strategy (you can run the same algorithm as them and know what they will pick in any situation), in the last pick, there are 4 squares left on the board, and you know which ones are left. Say they are in A, B, C, D order, with A the highest and D the lowest.

Worst case possible pattern:

A B
C D

They will pick the top row for themselves, and no matter what, you will get a lower turn score that final turn.

Is there a way to play, on some board layouts, that allows them to force this situation if you are making greedy choices (given that you know what they will pick) on turn 1 and 2? Yes, I've found this to be true on an example board. By controlling what you pick on turn 1 and 2, against a per-turn greedy omniscient player, you can force a higher score for yourself than them on turn 3.

What about turn 2, where there are 16 squares left... (A, B, C, D, ... O, P) Could there be one row where all 4 numbers left in that row are higher than the remaining 12 numbers? That could be a row that looks like this:

1 A B C D 2

Where they forced you to pick the first and last columns on turn 1 as your greedy choices. This row could work:

1 25 26 28 29 2

And I've seen rows like this on an example board. In this case, it is possible to force the per-turn greedy, omniscient player to take a lower score than you on turn 2.

But can you do it on both turn 2 and 3, in series? Certainly you cannot do it on turn 1---the greedy omniscient player can always give themselves a higher score than you on turn 1. But, in so doing, can you set them up to take a lower score in both turn 2 and 3 that lets you win?

On turn 1, the average score of a number is 18.5. Thus, the best case highest number you can let them take (if you give them a column with the lowest high number) is 21 (the worst-for-them possible column is 16 17 18 19 20 21), and the best-case lowest number you can let them give you (if you give yourself a column with the highest low number) is 13 (suppose you also, in this hypothetical best-for-you board, had a column like 13 14 15 22 23 24). So, best case, you might be able to only allow them a score gap of 8 on turn 1. That's the lower bound---the score gap the greedy omniscient player can force on turn 1 will never be lower than 8.

So, given that you're always at least 8 behind after turn 1 against greedy omniscient, can you recover 8 points by forcing certain columns in turns 2 and 3?

Seems like I need to code up some AIs to search this space. If your opponent is peeking and given themselves the highest score-gap per turn, is there a leaf in the game tree where you have a higher total score?

To rephrase it, in the turn-based, full-information version (where p2 sees p1's picks before picking), do any boards exist that are not an automatic win for player 2 using a per-pick greedy strategy? Is the turn-based, full information version trivial to solve for every board as a win for p2? Or can you force p2 to explore the game tree? Is p2 guaranteed to win if p2 explores the game tree, or can P1 force a win on certain boards?

Finally, I will note that I feel like I'm falling down a Turing/Godel rabbit hole here. We're talking about reacting to deterministic strategies in a deterministic way by taking the strategy into account. Asking whether any deterministic strategy A has a deterministic strategy B that always beats it, and whether a function F(A) = B exists. But what if the A[] strategy takes a strategy S as as an input and soundly beats the input strategy? What is B = F( A[ B ] )?

Like a Turing Machine that runs the halting-detection algorithm on itself and does the opposite.

AnoHito · 2014-11-29 12:56:24

jere wrote:

I said it reminded me of the Monty Hall problem. Not that they're the same. Here's my reasoning and like I said, I'm confused on it, so if someone can flat out say "no, it's actually 50-50", I'd be very happy with that.
What I mean is in the penultimate betting round, you're usually shown up to 6 possible red outcomes. If you treat your opponent's row selection as essential random (only for the point of analysis here), then all the outcomes appear equally likely. So consider the case where my final score is higher than 5 out of the 6 of the opponent's possible scores, I'm tempted to say I have an 83% chance of winning. I think that's reasonable given the information I have at that point, but the next part is more muddled....
In the final round, I'm given new information and only two options remain. This seems intuitively like each is equally likely, which is why I'm reminded of Monty Hall. That problem also presents a probability that appears 50-50 at first but is not.
There's something interesting about the highest value. My opponent will almost certainly choose to reveal that highest option if they can. Otherwise (assuming again I have a score higher than 5/6 of the previous outcomes), they've letting me know for sure I've won. And like in Monty Hall, the rest of the options (besides the two I'm shown) have been eliminated.

Ah, okay, I see what you mean now. As far as the Monty Hall problem is concerned, the logical fallacy comes from the natural human assumption that when picking from a set of two choices, you have a 50% chance of picking the choice that yields the most desirable outcome. But this is actually only true if you completely ignore any and all information on the probability that one choice is better than other. In the Money Hall problem, if you select door one in the first round, and then door two is eliminated, and then you pick randomly between doors one and three, you do indeed have a 50% chance of guessing correctly. But let's go back to the first round and examine what happened in more detail. Your original choice, had a 1/3 chance of being correct no matter which door you picked. So let's divide the choices into two sets, one with the door you picked (1/3 chance of being desirable), and one with the doors you didn't pick (2/3 chance of being desirable). If you had the choice of picking one of these sets (assuming the final door you would get was the best possible in the set), the obvious choice would be the set with a 2/3 probability of being most desirable. When they eliminate door two, they eliminate a number from the second set, but since we know they have eliminated only the door with the least desirable outcome in that set, the probability the set contains the most desirable door remains the same. Now you are left with two choices, door one, which you know has a 1/3 chance of being the most desirable, and door three, which you know has a 2/3 chance of being the most desirable. Not changing your original selection at this point would be like someone outright telling you it is twice as likely that door three has the prize, and then picking door one anyway. It's actually not that difficult to understand once you realize that the outcome of the first round gives you new information that you didn't have when making your original choice.

Given this, let's look at how the number the other player shows gives us more information about our odds of winning. If we assume the other player is picking a number to show us randomly, we learn nothing. What we know about whether we will win at the end is accurately represented by the remaining bars on the right. But if we assume the other player has a bias toward picking their best possible number to show us, that bias will in fact affect the reliability of the information we gain from the bars displayed after they make their choice. The more often they pick their highest number, the more it will tend to lower the relative reliability of the information displayed in the bars after they show us their number, compared to the information the bars gave us before they showed us a number.

jere wrote:

Can someone clear this up? Is it 50-50, 83-17, or something else?

Based on the above conjecture, if we know they always show us their highest number, the odds do indeed remain 83-17. If we know they always choose randomly, the odds would then become 50-50. And any combination thereof would skew the odds toward 83-17, depending on the strength of their their bias.

Last edited by AnoHito (2014-11-29 12:58:22)

zed · 2014-11-29 14:55:11

> I've heard from someone else who came the the same conclusion as you
> regarding the infeasibility of applying minimax to the betless game as a
> single move. There was some reference to a linear programming algorithm for
> computing such things that had a specific running time given the number of
> possible strategies, but I can't find it now.

That was probably me, by email, a couple of months ago.

The algorithm is roughly cubic in n where n is the number of strategies for
each player.

> Can you post a link that explains "recursively dominated?" My searches
> aren't turning up much on that topic.

I don't know a standard reference, but searches gave me this:
http://www.umass.edu/preferen/Game Theory Evolving/GTE Public/GTE Eliminating Dominated Strategies.pdf
which gives the basic definitions and has a nice 3x3 example where elimination
collapses it to 1x1. There's a zero-sum example in the exercises (f).

jasonrohrer · 2014-11-30 05:05:43

Thanks for that PDF link, Zed!

And yeah, that must have been you that emailed me.

.. · 2014-11-30 15:40:31

zed wrote:

jasonrohrer:
> But it is NOT optimal in terms of how much you could win against various
> different strategies. If your opponent is not playing the Nash eq, you'd
> better switch strategies, because by playing the Nash eq, you've stuck
> yourself with no net winnings and rewarded your opponent with more winnings
> than they deserve. Let's say their strategy is as simple as "always pick
> column 1 for me and 2 for you." If you're sticking with your Nash eq
> strategy, you will win 50% of the rounds. Your opponent's strategy is just
> as good as your Nash eq strategy in terms of the outcome.
That's only correct if the oppenent is not playing a recursively dominated
strategy (being one which is dominated once you ignore recursively dominated
strategies). The equilibrium strategy wins against such strategies more than
half the time.

It's stronger than that. If you play to a Nash equilibrium and your opponent doesn't, then by definition (since they aren't playing their optimal strategy) you will do better than if they did (because it's zero-sum), eg, making a profit in CM. It would be interesting to know on what scale the profit rate against other strategies would be in CM.

There may well be a more cunning way to approach it, but the only way I've
seen to try to fully "solve" the whole (betless) game is to reduce it to a
single-move simultaneous move game. You can do that by having a strategy in
the one-move version be the choices the player will follow in the actual game
according to revealed information. So that's 30 possibilities for the first
move, then, for each of the 6 possible revealed rows after the first move, 12
possibilities for the second move, then for each of the 5 possible revealed
rows after the second move, another 2 possibilities for the last move. So
that's 30 * (12*2^5)^6 = 9.6*10^16 possible strategies. Utterly infeasible.

Provided that a game is one with perfect recall (the players remember their previous moves, true here) you can also exactly solve games in extensive (game tree) form directly without converting to normal form (payoff matrix), which causes the exponential blowup that you describe. See for example:
Koller, Megiddo, & von Stengel. (1996). Efficient Computation of Equilibria for Extensive Two-Person Games
This translates into a linear programming problem linear in the game tree size. There's various other papers on ways to do this. I found one in particular that looks like it could be practical for the betless version of CM because it (sort of) prunes the game tree. If I understand correctly, they successfully ran it on games with up to 10^11 game tree nodes, and can run in half a minute if the tree is pruned to a few thousand nodes.
Bosansky, Branislav, et al. (2013) "Double-oracle algorithm for computing an exact nash equilibrium in zero-sum extensive-form games."

Cordial Minuet with betting is definitely infeasible to solve, and not just because there's a huge number of different possible bet amounts. But this is all moot anyway, because if you were to write a bot to play against humans you'd want to use approximate solutions instead, and opponent modelling at least for the betting.

---

Noone has mentioned yet that adding betting to the game changes how you should pick columns, for several reasons. These include that the game can end early, and that it's not just score that matters at the end, but also the amount of information you have (and not just information revealed through the column reveal and early bets). I suspect that a bot that computes an optimal (for the restricted game) column picking strategy by ignoring betting, and then separately an optimal strategy for betting, would be highly suboptimal because information is very valuable.

Here's one data point: a day or two ago I played the most unbelievably predictable player I've ever seen. I could pick the first row they gave me 90% of the time, and also knew how they picked the 2nd turn rows (hint: the same way a lot of other people seem to). Despite this enormous advantage, I was losing badly, because they were so much better at betting than me. Now, I admit, I must be horrible at betting. But wait, I checked the scoreboards to see who this was. This player ranked at the top.

---

If you want to brush up on game theory, I can recommend this, as I just read some parts of it myself. Easy to read and lots of examples:
http://www.math.ucla.edu/~tom/Game_Theory/mat.pdf

---

The endgame of CM is very interesting; I agree that it's a lot like the Monty Hall problem. But until Monty Hall, I don't think we can work out the optimal strategy without computer assistance or a lot of maths, because it's another complication on top of the endgame betting, which is itself very complicated (Ferguson and Ferguson (2007) "The Endgame in Poker" looks interesting).

Last edited by .. (2014-11-30 16:39:40)

zed · 2014-11-30 18:39:55

..:
> Koller, Megiddo, & von Stengel. (1996). Efficient Computation of Equilibria
> for Extensive Two-Person Games
> This translates into a linear programming problem linear in the game tree
> size.

Just the kind of trick I was hoping for, thanks. So if I understand correctly,
that reduces the dimension to a little over 30 * (12*2*5)*6 = 21600, which is
rather more manageable! Still not bruteforceable in the timeframes required by
a bot, I believe, but it might be interesting to see if that "double-oracle"
algorithm does it.

I'll be a bit disappointed if it does, to be honest. I know that betting is
the main point of the game, but I rather feel that it's a cheap and
unsatisfying way to inject complexity into a simultaneous move game. I believe
the world needs more deep (in particular computationally intractable) discrete
simultaneous move games. This seems to be harder to achieve than I'd realised!

Asminthe · 2014-11-30 19:53:09

zed wrote:

..:
I know that betting is the main point of the game, but I rather feel that it's a cheap and
unsatisfying way to inject complexity into a simultaneous move game.

I'm curious why you think that betting actions are cheap and unsatisfying. In my experience with poker I've found bets to be a truly fascinating type of game move and so far I really like the way their existence in this game influences the other actions as well.

jasonrohrer · 2014-11-30 21:33:33

Yeah, when I designed this game, I wasn't so concerned with whether it was solvable or not in terms of the picking game. I wanted to ensure that there was no pure Nash equilibrium (which would make it broken), and the "complexity" of the game, in terms of multiple rounds on the same grid (as compared to RPS) is just there to build an interesting betting structure around a progressive narrowing of hidden information.

Mixed strategies, even if known to all, are fine in terms of my design goals, because mixed strategies leave room for donkeyspace. Even RPS has this (playing rock 4x in a row to bait your opponent into playing paper). But RPS has no obvious spot on which to hook an interesting betting structure. Also, one of my other design goals was that it be a truly novel game.

I now believe that I was wrong about the 50% upper/lower bound when playing the Nash eq even if you're opponent is not, but I stand by my claim that the Nash eq is not optimal if you're opponent is not playing Nash eq. Yes, a rational opponent would move toward Nash eq to raise their win percentage against your Nash eq strategy---but I'm talking about an irrational opponent. Nash eq. is unexploitable. But your irrational opponent's strategy IS exploitable, and there is nothing about the Nash eq. strategy that causes it to take full advantage of an exploitable opponent.

Thus, the Nash eq. and the optimal strategy against a given opponent are two different things (unless the given opponent is playing the Nash eq. strategy, in which case they are the same thing).

Thus, even iterated RPS or iterated prisoner's dilemma is not simply "solved" through the Nash equilibrium. That's why there are bot competitions for these games and active research into more complex strategies.

As for simultaneous decision games that are intractable to solve, I think space of possible strategies is exponential in the number of rows/columns. So... wouldn't 8x8 or 10x10 do the trick there? 10x10 gives us 10^13 leaves. Not that I want to make that game, because that kind of intractability is not what I'm shooting for anyway (and I believe a bigger grid would make a less interesting betting game).

AnoHito · 2014-11-30 22:48:05

As far as the betting game, I am beginning to notice a trend. When the betting is mostly done with low stakes bets (which I believe is always the best strategy as it minimizes risk to yourself, and increases the risk to the other player if they bet more than you), you will actually tend to lose more money from the tribute than you will to the other player. For example, it's not uncommon at all to see games where one player is breaking even and the other player is losing money. This creates a game of chicken, because the lower the losing player's stack becomes, the more pressure they feel to take risks in order to avoid losing more money to the tribute. I cannot stress this enough, I think that taking tributes out of individual rounds is a bad decision in terms of the future of the game. It makes much more sense to just take a tribute from both players at the beginning of the match. Not only does this encourage more sensible betting, but it also discourages players from abandoning matches the second they gain a lead.

Edit: No wait I'm an idiot. ;P It should be taken at the end of a match depending on how much money has changed hands. If you take it at the beginning the other player could leave the match right away just to troll you.

I also feel that I should point out, I am noticing that variations on my original strategy of giving the other player their worst possible row are emerging as a dominant strategy. While you can't do it all the time or it would become predictable and take away your advantage, doing it as often as you can get away with it appears to be better strategy than any other that I have observed. Even if the picking game couldn't be broken to the point of achieving pure Nash Equilibrium, the existence of a dominant strategy like this takes so much out of the picking game that there is very little room for alternative strategies.

I was also wondering if you read my post on the Monty Hall problem and how it relates to your odds with winning in the final betting round. In my opinion, given the fact that the decision of which number you show the other player will be biased toward presenting your position as stronger (or weaker if you are certain you will win), the possible outcomes shown in the final round will always give a less reliable indication of your odds of winning than the possible outcomes shown before the numbers are revealed. The only way this could not be true is if you selected your number to show the other players randomly, but no one would ever do this because there is no advantage in doing so. This renders the final betting round pointless in many cases, because the information available in the final round is almost always less reliable than the previous round. The only exceptions are the cases where the information available in the final round tells you for certain you will win or lose. But in most cases you would have already known this before the numbers were revealed anyway, or at least known if one player had a much higher chance of winning than the other. So the whole showing a number thing just becomes reduced to a gimmick, and if both players are playing correctly, they should usually ignore the possible outcomes shown in the final round completely.

Last edited by AnoHito (2014-11-30 23:23:27)

Asminthe · 2014-11-30 23:41:54

There's nothing wrong with the current tribute system. It works very much like a good poker rake does.

If it were changed to be up front, every game would have to last until someone busted, which can take an arbitrarily long time, especially if no tributes are coming out of the pots. Someone who simply has more time to play could just fold every round as slowly as possible and hope the other person has to leave first. Every malicious player who was ever upset about having lost a pot would simply make the game miserable for the other player.

You're trying to solve problems that aren't problems and are suggesting solutions that would make the game bad.

Edit: Just saw your edit. While taking the tribute at the end would resolve the problem of forcing the game to last until someone busts, I don't know how it is an improvement over the current system, especially considering the tribute percentage would probably have to be increased in order to maintain the same profit for the house.

Last edited by Asminthe (2014-11-30 23:50:02)

CORDIAL MINUET ENSEMBLE

#1 2014-11-28 09:06:05

More Thoughts on Cordial Minuet/The Problem With Tributes

#2 2014-11-28 09:41:43

Re: More Thoughts on Cordial Minuet/The Problem With Tributes

#3 2014-11-28 11:09:13

Re: More Thoughts on Cordial Minuet/The Problem With Tributes

#4 2014-11-28 13:59:51

Re: More Thoughts on Cordial Minuet/The Problem With Tributes

#5 2014-11-28 15:57:56

Re: More Thoughts on Cordial Minuet/The Problem With Tributes

#6 2014-11-28 16:14:42

Re: More Thoughts on Cordial Minuet/The Problem With Tributes

#7 2014-11-28 16:27:20

Re: More Thoughts on Cordial Minuet/The Problem With Tributes

#8 2014-11-28 16:38:24

Re: More Thoughts on Cordial Minuet/The Problem With Tributes

#9 2014-11-28 17:17:18

Re: More Thoughts on Cordial Minuet/The Problem With Tributes

#10 2014-11-28 17:20:01

Re: More Thoughts on Cordial Minuet/The Problem With Tributes

#11 2014-11-28 18:10:31

Re: More Thoughts on Cordial Minuet/The Problem With Tributes

#12 2014-11-28 22:47:21

Re: More Thoughts on Cordial Minuet/The Problem With Tributes

#13 2014-11-28 23:29:24

Re: More Thoughts on Cordial Minuet/The Problem With Tributes

#14 2014-11-28 23:53:36

Re: More Thoughts on Cordial Minuet/The Problem With Tributes

#15 2014-11-29 03:36:28

Re: More Thoughts on Cordial Minuet/The Problem With Tributes

#16 2014-11-29 06:21:27

Re: More Thoughts on Cordial Minuet/The Problem With Tributes

#17 2014-11-29 12:56:24

Re: More Thoughts on Cordial Minuet/The Problem With Tributes

#18 2014-11-29 14:55:11

Re: More Thoughts on Cordial Minuet/The Problem With Tributes

#19 2014-11-30 05:05:43

Re: More Thoughts on Cordial Minuet/The Problem With Tributes

#20 2014-11-30 15:40:31

Re: More Thoughts on Cordial Minuet/The Problem With Tributes

#21 2014-11-30 18:39:55

Re: More Thoughts on Cordial Minuet/The Problem With Tributes

#22 2014-11-30 19:53:09

Re: More Thoughts on Cordial Minuet/The Problem With Tributes

#23 2014-11-30 21:33:33

Re: More Thoughts on Cordial Minuet/The Problem With Tributes

#24 2014-11-30 22:48:05

Re: More Thoughts on Cordial Minuet/The Problem With Tributes

#25 2014-11-30 23:41:54

Re: More Thoughts on Cordial Minuet/The Problem With Tributes

Board footer