pandaant / poker-cfrm
A NLTH Poker Agent using Counterfactual Regret Minimization
Hi Mark,
I was wondering whether the potential-aware abstraction you created is based on this paper: http://www.cs.cmu.edu/afs/cs/Web/People/sandholm/potential-aware_imperfect-recall.aaai14.pdf
Secondly, the readme says that the clusters in potential-abs are pre-calculated. Is there a way to regenerate them for a custom target? For example, if I wanted to create a potential-aware abstraction of {169, 200, 200, 200} or {169, 500, 500, 500}, etc., would this be possible somehow?
Also, is cfrm a version of MCCFR with external sampling?
With kind regards
I ran some tests on two-player no-limit hold'em games with 1|2 small|big blinds, 200|200 stacks, and maxRaises of 3 4 4 4. For every cluster abstraction run I kept nb-samples at "0,2,500,500", the buckets at 169,5,10,500, the error bounds at .01,.01,.01,.01, and nb-hist-samples-per-round at 0,1,200,200. For all tests I held the action-abstraction to polrelative with raises of 0.4,0.8,1.2,2,5,9999. For CFR learning I used 12 threads and run times of 8 hours, and sometimes 16 and 24 hours.
I ran the head-to-heads, specifically NSSS against each of NOOO, NEES, and NEEO. I expected NSSS to perform the worst, meaning lose money (a negative average win), and NEEO to be the best. Instead, NSSS comes out best! Here's a table of results. As you can see, I ran CFR's learning phase for the most sophisticated strategy, NEEO, for longer and longer times (8 hours, then 16, then 24), but that didn't change things. Any ideas on what to experiment with to get the results to align with expectations, meaning NEEO, NEES, and NOOO all beating NSSS?
Update: thinking harder, I'm wondering whether the clustering abstraction is too coarse, so that I need to make it finer by increasing nb-samples and nb-hist-samples. Any ideas on the combinatorics around this to see what's appropriate?
| Abs | cfrm runtime (secs / 1000) | Opponent abs | Opponent cfrm runtime (secs / 1000) | Avg win | Var | Num games | Seed | Median win |
|---|---|---|---|---|---|---|---|---|
| NSSS | 28 | NOOO | 28 | 2.94 | 7167 | 500000 | 7534 | 0 |
| NSSS | 28 | NOOO | 28 | 2.76 | 7119 | 100000 | 3575 | 0 |
| NSSS | 28 | NOOO | 28 | 2.95 | 7163 | 100000 | 8379 | 1.5 |
| NSSS | 28 | NEES | 28 | 3.4 | 7564 | 100000 | 8379 | 1.5 |
| NSSS | 28 | NEES | 28 | 3.03 | 3575 | 100000 | 3575 | 1.5 |
| NSSS | 28 | NEEO | 28 | 5.07 | 7475 | 100000 | 8379 | 1.5 |
| NSSS | 28 | NEEO | 57 | 4.53 | 7118 | 100000 | 7534 | 1.5 |
| NSSS | 28 | NEEO | 86 | 5.05 | 7141 | 100000 | 7534 | 1.5 |
| NSSS | 28 | NEEO | 86 | 4.84 | 7138 | 100000 | 8370 | 1.5 |
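Before changing the abstraction, it may help to check whether the reported edges are larger than sampling noise. The sketch below computes a standard error and an approximate 95% interval for a few rows of the table. It assumes that the Var column is the per-game variance of winnings, that games are independent, and that Avg win is measured from the NSSS side; none of this is confirmed by the tool's output. `win_rate_interval` is a hypothetical helper, not part of poker-cfrm.

```python
import math

def win_rate_interval(avg_win, variance, num_games, z=1.96):
    """Return (low, high, standard_error) for a mean win per game.

    Assumes `variance` is the per-game variance and games are independent.
    """
    standard_error = math.sqrt(variance / num_games)
    margin = z * standard_error
    return avg_win - margin, avg_win + margin, standard_error

# (opponent abstraction, avg win, variance, num games), copied from the table.
rows = [
    ("NOOO", 2.94, 7167, 500000),
    ("NEES", 3.40, 7564, 100000),
    ("NEEO", 5.07, 7475, 100000),
]

for opponent, avg_win, variance, num_games in rows:
    low, high, se = win_rate_interval(avg_win, variance, num_games)
    print(f"NSSS vs {opponent}: {avg_win:.2f} (se {se:.3f}), "
          f"approx. 95% interval [{low:.2f}, {high:.2f}]")
```

If an interval excludes zero, the NSSS edge in that row is unlikely to be explained by too few evaluation games alone, which would point toward the abstraction or training settings instead.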
So which game has this repo been used for? And what are the training requirements (CPU cores, etc.) and the performance?
Hi Pandaant,
Nice work! I would like to know whether you implemented the "P" for the potential-aware abstraction, because I don't see it in the code available on GitHub.
If I use cluster_abs with -m mixed_npeo, it just computes the mixed_nooo abstraction; there is something written like that in the scripts directory.
Are the files on GitHub up to date?
I have a particular interest in this algorithm.
Thanks for your time
Hi, sorry to bother you,
if I wanted to obtain for each action in each state an expected value of taking that action, would the basic idea be to modify the train function in cfrm.cpp to keep track of what is returned by each action?
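The idea in the question can be illustrated on a toy game. The sketch below is not the train function from cfrm.cpp; it is a minimal regret-matching self-play loop for rock-paper-scissors in which each node also accumulates the value computed for every action. The names `Node`, `value_sum`, and `average_action_values` are hypothetical. In a full game tree the per-action values would be counterfactual values (weighted by reach), and with sampling they would be noisy estimates rather than exact expectations.

```python
# Toy game: PAYOFF[a][b] is the utility of playing a against b
# (0 = rock, 1 = paper, 2 = scissors).
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]

class Node:
    def __init__(self, num_actions):
        self.regret_sum = [0.0] * num_actions
        self.strategy_sum = [0.0] * num_actions
        self.value_sum = [0.0] * num_actions  # extra bookkeeping: summed action values
        self.visits = 0

    def strategy(self):
        # Regret matching: play in proportion to positive cumulative regret.
        positive = [max(r, 0.0) for r in self.regret_sum]
        total = sum(positive)
        n = len(positive)
        return [p / total for p in positive] if total > 0 else [1.0 / n] * n

    def average_strategy(self):
        total = sum(self.strategy_sum)
        n = len(self.strategy_sum)
        return [s / total for s in self.strategy_sum] if total > 0 else [1.0 / n] * n

    def average_action_values(self):
        if self.visits == 0:
            return [0.0] * len(self.value_sum)
        return [v / self.visits for v in self.value_sum]

def train(iterations):
    nodes = [Node(3), Node(3)]
    nodes[0].regret_sum[0] = 1.0  # asymmetric start so play does not sit at uniform
    for _ in range(iterations):
        strategies = [node.strategy() for node in nodes]
        for player in (0, 1):
            opponent = strategies[1 - player]
            # Value of each action against the opponent's current strategy.
            action_values = [sum(opponent[b] * PAYOFF[a][b] for b in range(3))
                             for a in range(3)]
            node_value = sum(strategies[player][a] * action_values[a] for a in range(3))
            node = nodes[player]
            for a in range(3):
                node.regret_sum[a] += action_values[a] - node_value
                node.strategy_sum[a] += strategies[player][a]
                node.value_sum[a] += action_values[a]  # the per-action tracking
            node.visits += 1
    return nodes

if __name__ == "__main__":
    trained = train(20000)
    print("average strategy:", trained[0].average_strategy())
    print("average action values:", trained[0].average_action_values())
```

In this toy setting the averaged action values equal the payoff of each action against the opponent's average strategy, so they approach the game value (zero) as the average strategies approach equilibrium.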
Hello Pandaant,
Great work with the project and thanks for sharing. I have run into the following issue, kindly share your thoughts.
I have not made any changes to the code apart from increasing the number of threads. The strategy file was computed, 17GB in total. But when I run "player" I get the error "too many abstract actions". It looks like the maximum allowed is defined to be 20. The blinds are 1 and 2, and the stack sizes are 200 and 200. Since the maximum allowed raises per round are defined to be 3 4 4 4, my calculation for the maximum number of actions is 4+5+5+5=19, so I am not sure why I am getting this error.
If I do change "MAX_ABSTRACT_ACTIONS" to, say, 30, do I have to repeat all the steps (gen_eval_table, cluster_abs, potential_abs, and cfrm) again?
What is the idea behind code duplication in tools/ehs_gen/src and src?