Comments (19)
What is your GPU usage?
What is your RAM footprint?
from alphazero.jl.
What is your GPU usage?
What is your RAM footprint?
GPU: around 0-1%.
RAM: around 4 GB currently.
I'm using WSL, but typically with the connect-four example the progress bar appears within 3 minutes. I left it running all last night and no progress bar appeared. I've swapped to significantly smaller games and smaller per-turn input data, but this doesn't seem to be helping.
Going from memory, I believe this issue started once I made the changes to vectorize the state.
So I'm taking my CSV, converting it with Tables.matrix, then reshaping it:
return Array{Float32}(reshape(state.board, (24, 15, 16)))
return Array{Float32}(reshape(state.board, (6, 4, 10)))
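For context, here is a minimal, self-contained sketch of that conversion. The `State` struct and `vectorize_state` signature are illustrative names, not AlphaZero.jl's exact interface: a flat board vector is reshaped into the (24, 15, 16) tensor and converted to Float32 in one pass.

```julia
# Illustrative sketch: reshape a flat board vector (e.g. loaded from a
# CSV via Tables.matrix) into the 3D Float32 tensor the network expects.

struct State
    board::Vector{Float64}  # flat data, length must equal 24 * 15 * 16
end

function vectorize_state(state::State)
    @assert length(state.board) == 24 * 15 * 16 "board size mismatch"
    return Array{Float32}(reshape(state.board, (24, 15, 16)))
end

s = State(rand(24 * 15 * 16))
v = vectorize_state(s)
@assert size(v) == (24, 15, 16)
@assert eltype(v) == Float32
```

The reshape itself is cheap (it only reinterprets the layout); the Float32 conversion is the only copy.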
I've tested whether the reshaping itself takes significant time by using a constant state variable instead, but it doesn't seem to have an effect.
Could something be broken in my vectorize state that's causing this? Also - do you have discord or anything?
Did you manage to train the connect four agent on your hardware and how much time did it take?
Could something be broken in my vectorize state that's causing this?
It seems possible but unlikely. Are you sure the problem is not simply that your states are too big for your hardware?
There are scripts in scripts/profile that you can use to profile different parts of AlphaZero. You can start by profiling the time it takes for the neural network to evaluate a batch of states in your game using scripts/profile/inference.jl, and then compare the numbers you get to the connect four example.
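This is not the actual scripts/profile/inference.jl, just a sketch of what such a benchmark measures, with a plain matrix multiply standing in for the network's forward pass:

```julia
# Simplified inference-benchmark sketch (assumed shapes, not the real
# script): time how long a batch of vectorized states takes to evaluate,
# then report the per-state cost. The matrix multiply is a placeholder
# for the real network forward pass.

batch_size = 256
state_dim  = 24 * 15 * 16
W     = randn(Float32, 128, state_dim)      # stand-in "network" weights
batch = randn(Float32, state_dim, batch_size)

W * batch                                   # warm up (JIT compilation)
t = @elapsed W * batch
println("time per state: ", t / batch_size * 1e6, " µs")
@assert t > 0
```

Comparing this per-state number between your game and connect four shows whether the larger state tensor alone explains the slowdown.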
Also - do you have discord or anything?
Unfortunately not. I am currently lacking the time to be present on discussion platforms such as Slack or Discord.
Did you manage to train the connect four agent on your hardware and how much time did it take?
Not fully, but the progress bar was filling quite quickly. I confirmed good GPU usage; unsure about CPU.
1080 Ti, i7-8770K clocked at 5.0 GHz.
Could something be broken in my vectorize state that's causing this?
It seems possible but unlikely. Are you sure the problem is not simply that your states are too big for your hardware?
Hm I'll dig into this more...
There are scripts in scripts/profile that you can use to profile different parts of AlphaZero. You can start by profiling the time it takes for the neural network to evaluate a batch of states in your game using scripts/profile/inference.jl, and then compare the numbers you get to the connect four example.
Ok I'll try to take a look at this.
Also - do you have discord or anything?
Unfortunately not. I am currently lacking the time to be present on discussion platforms such as Slack or Discord.
That makes sense given how productive you are :)
I was just thinking that if you did, I could add you to the private repo and you could tell me in private how poorly it is configured!
Seem to have gotten the progress bar to at least appear, but now I'm getting this error.
I think I know the cause but am not entirely sure...
This means that the network is outputting something that is not a valid probability distribution. Maybe you should add some print statements to see what's going on.
Another source of concern here is the 98.4% redundancy figure, which may indicate that you are not doing enough exploration.
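One way to add the suggested print statements is a small validation helper (`check_policy` is a hypothetical name) that flags the usual ways a policy vector goes wrong: NaN entries, negative entries, or a total probability far from one.

```julia
# Validate a policy vector before it is used: returns "ok" or a short
# diagnosis string describing why it is not a probability distribution.

function check_policy(p::AbstractVector{<:Real}; atol=1e-3)
    any(isnan, p)      && return "contains NaN"
    any(x -> x < 0, p) && return "negative probability"
    isapprox(sum(p), 1; atol=atol) || return "sums to $(sum(p)), not 1"
    return "ok"
end

@assert check_policy([0.2, 0.3, 0.5]) == "ok"
@assert check_policy([0.2, NaN, 0.5]) == "contains NaN"
@assert check_policy([0.9, 0.9]) != "ok"   # sums to 1.8
```

Printing the diagnosis alongside the offending state usually narrows down whether the problem comes from the network itself or from the actions mask.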
I fixed it. Would rigging this up to multiple machines increase benchmark speed?
It's currently looking quite grim in terms of how many instances I'd need to generate this. My estimate is somewhere around 100-1000.
Can you give me the numbers you are getting with scripts/profile/inference.jl? How much time does it take to run a typical batch of your states with your neural network?
Using multiple machines would increase speed, although right now only the self-play data generation is distributed, meaning that you would start hitting diminishing returns after about a dozen machines as training the network becomes the new bottleneck. (This will be improved in future releases but I am thinking about the best way to do it without adding too much complexity.)
If your problem is too complex for the hardware you have, you may want to reformulate it so that you end up with smaller states, or use a smaller neural network architecture (or even use faster ML models such as gradient boosted trees).
I multiplied the amount used for previous tests by 6x
So this is what I'm reading at
I left it running all last night and when I woke up it was about 25% through a single training iteration.
My goal would be 24x this memory footprint.
My CPU usage is incredibly low; I've checked to ensure WSL is set to use all my CPU cores. I suspect I'm RAM-speed bottlenecked at the moment, which matches my experience with MuZero. Unfortunately, I'm not aware of an easy way to check RAM bandwidth utilization. I suspect speed will increase dramatically once I rig this up to Azure, since the instances I plan to use are configured so each core has its own independent memory.
as training the network becomes the new bottleneck.
This can be done on the GPU, correct?
If not, what is the bottleneck in this case? I'd assume single-core speed?
If your problem is too complex for the hardware you have, you may want to reformulate it so that you end up with smaller states, or use a smaller neural network architecture (or even use faster ML models such as gradient boosted trees).
Based on my limited understanding, I don't think I can decrease the size of the states. My understanding is that this is connected to the size of the vectorized states, in which I'm storing the 'board data'. This 'board data' only contains the data used to decide which action to take; it doesn't include which actions have been taken in the past (though I'm not quite sure whether you told me to include this, as it affects future actions).
One thing to note, and perhaps something you have a suggestion for: my 'board' is the current time step along with the previous 60 steps in my dataset, the idea being that this window is used to make the decision at the current step. After the step is complete, the board advances to the next step along with the 60 steps before that. I'm trying to think of a way to reuse this data, but am pretty sure AlphaZero doesn't have any sort of LSTM.
As for hardware, I'm not too worried about my personal hardware and am worried about if I'm able to throw cloud instances at the problem. I plan to test tonight to see the speed on azure.
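The sliding-window state construction described above could be sketched like this (`window_state` is a hypothetical helper; the dataset is assumed to be a plain matrix of time steps by features). It also makes the redundancy concrete: consecutive states share 60 of their 61 rows.

```julia
# Sliding-window state: the "board" at step t is the current row of the
# dataset plus the 60 previous rows.

function window_state(data::Matrix{Float64}, t::Int; history=60)
    @assert t > history "need at least $history earlier steps"
    return data[(t - history):t, :]   # (history + 1) rows, all columns
end

data = rand(200, 5)                   # 200 time steps, 5 features
s1 = window_state(data, 100)
s2 = window_state(data, 101)
@assert size(s1) == (61, 5)
@assert s1[2:end, :] == s2[1:end-1, :]  # 60 rows shared between steps
```

Since only one row changes per step, storing the raw series once and slicing windows on demand avoids materializing each overlapping state separately.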
If I could figure out a way to integrate an LSTM into this, it would cut the data by 1500x, but even then, wouldn't I still need to store that state?
I need to research AlphaZero further and get a deep enough understanding to solve this. But in your expert opinion, is there any solution to this 'refeeding' of data problem?
Apparently OpenAI's AlphaZero stores previous states for its calculations; does AlphaZero.jl do the same in any capacity?
Apparently OpenAI's AlphaZero stores previous states for its calculations; does AlphaZero.jl do the same in any capacity?
I am not sure what you mean by this. Previous states are typically stored in the MCTS tree but there is no need to send previous states to the network (almost by definition of a state).
More generally, I am wondering if you're not trying to apply AlphaZero on a problem where it does not really apply.
If the problem is to make online prediction on sequential time series data for example, AlphaZero does not really look like a good fit.
Also, here is some advice. Despite many people's efforts to make AlphaZero more accessible (including my own efforts with this library), AlphaZero is not an algorithm you should expect to use on your problems as a black box. Even for simple board games, you will need some deep knowledge of the algorithm to tune the hyperparameters properly without spending too much compute. For more unusual applications, you should be ready to modify the implementation itself so as to generalize some components or integrate domain knowledge to improve sample efficiency.
So my suggestion is to learn more about AlphaZero and make sure that AlphaZero is the most appropriate algorithm for solving your problem. Here, it may be useful to write a post on Julia's Discourse or an ML forum where you explain your problem clearly and where people can chime in and give suggestions on what approaches make sense.
Think I figured out how to simplify it. It was quite obvious but somehow escaped me. Once again, sorry to be a pain, but I'm struggling with this bug and can't figure out its cause. I've added debug prints to try to narrow it down, but I can't tell where it's coming from. Perhaps something with vectorize state?
Seem to have figured it out. It doesn't appear to like my 'clever' way of simplifying things.
Something about limiting the amask to a single action causes it to break.
So yeah, is there any way to continue a game whilst forcing a single move?
I.e. each player is allowed to jump once, then they must sit until the end of the game?
I can't end the game early, since that would end the other player's game too, right?
Should I just add an extra possible move so that it doesn't break?
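The "extra move" idea could be sketched as follows (hypothetical game fields, not AlphaZero.jl's actual interface): append a pass action that is always legal, so the actions mask never narrows to a single forced entry after a player has used their one jump.

```julia
# Sketch: a two-action mask over [JUMP, PASS], where jumping is legal
# only once per player and passing is always legal.

const JUMP = 1
const PASS = 2

struct Player
    has_jumped::Bool
end

actions_mask(p::Player) = [!p.has_jumped, true]

@assert actions_mask(Player(false)) == [true, true]
@assert actions_mask(Player(true))  == [false, true]
@assert count(actions_mask(Player(true))) >= 1  # mask is never empty
```

With a single legal action the search degenerates (every policy collapses onto one entry), which is a plausible place for the breakage you describe; the always-legal pass keeps the mask well-formed either way.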
Final update: I attempted to add an extra possible move that can always be made, but it doesn't seem to fix the problem. I have no idea why, but this recent logic I added causes it to break. Something regarding the actions and the amask.
I think it's something busted with the state, maybe? Like perhaps some sort of desync?
I don't understand why this behavior presents itself, given when it appears.
Could this be due to set_state! and the fact that I'm not setting all my variables from the state in set_state!?
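That kind of desync is easy to reproduce with a toy example (the `Env` type and field names below are hypothetical, not AlphaZero.jl's actual API): if set_state! copies only the board and forgets auxiliary variables, the game environment silently disagrees with the state it was supposed to load.

```julia
# Toy illustration of a partial vs. complete set_state!.

mutable struct Env
    board::Vector{Int}
    curplayer::Int
    has_jumped::Bool
end

# Buggy version: restores the board but not the other variables.
set_state_partial!(env, s) = (env.board = copy(s.board); env)

# Fixed version: every field derived from the state is restored.
function set_state_full!(env, s)
    env.board      = copy(s.board)
    env.curplayer  = s.curplayer
    env.has_jumped = s.has_jumped
    return env
end

s   = Env([1, 2, 3], 2, true)
env = Env([0, 0, 0], 1, false)
set_state_partial!(env, s)
@assert env.curplayer != s.curplayer     # the desync
set_state_full!(env, s)
@assert env.curplayer == s.curplayer && env.has_jumped
```

Since MCTS repeatedly saves and restores states during search, any field that influences the actions mask but is not restored would produce exactly the kind of intermittent breakage you describe.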