Comments (1)
Hi, thanks for reaching out.
I can definitely revamp the READMEs to better explain the project. Lately I've mostly been doing training runs and tweaking the model. On a recent good run it beat my max-damage heuristic agent in a bit under half of its matches, but it still usually loses to me or any moderately competent human player. Soon I want to try more ideas to expand and improve the model and algorithm, ideally ending up with a model I'd be confident testing on the ladder later down the line (which I haven't done at all yet).
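For anyone unfamiliar with the term, a max-damage heuristic agent typically just picks whichever legal move is estimated to deal the most damage each turn. Here's a minimal TypeScript sketch of that idea. The `MoveChoice` type and its `expectedDamage` field are hypothetical names for illustration, not the project's actual code, and the damage estimate is assumed to come from elsewhere (e.g. a damage calculator).

```typescript
// Hypothetical shape of a legal move choice; not the project's actual types.
interface MoveChoice {
    name: string;
    // Assumed to be supplied by some damage estimate computed elsewhere.
    expectedDamage: number;
}

// Returns the choice with the highest estimated damage.
// Ties keep the earliest choice; an empty list yields undefined.
function pickMaxDamage(
    choices: readonly MoveChoice[],
): MoveChoice | undefined {
    let best: MoveChoice | undefined;
    for (const choice of choices) {
        if (best === undefined || choice.expectedDamage > best.expectedDamage) {
            best = choice;
        }
    }
    return best;
}
```

A baseline like this never switches or sets up, which is roughly why it makes a useful sanity check but a weak stand-in for a human opponent.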
As for the initial choices and setting, I chose Gen 4 because it was simpler and more familiar to me, so coding an event parser and state vector representation for it would be somewhat easy, though there were still some issues with figuring out how to represent e.g. Ditto's state and all its corner cases. As this is mostly a hobby project, I don't foresee supporting other generations, at least until I can reliably produce a good Gen 4 Random Battles model as a proof of concept. I went with randomly generated teams because a fixed team felt too limited and too easy to overfit to without good signal, so I picked Showdown's random battles format to sidestep having to pick a sufficiently balanced team. With the amount of variance already in the game, though, I haven't confirmed whether this was the best choice, at least when starting out, since the model now has to generalize and adapt to every playstyle based on its starting team and matchups.
This should probably go into the README now that I think about it, so I'll do that along with the rest of your points.
Lemme know if there's anything else you wanted to know about.
from pokemonshowdown-ai.
Related Issues (20)
- Overlap rollout and update stages in training script
- Reward is always -1
- Reduce game log size
- Rollout model only ever exploring
- Implement TD learning
- Implement multi-step learning
- Use legal actions only when calculating TD target
- Encore not handling Pursuit properly
- Add more model evaluation baselines
- Training memory leak
- Use multiple threads for inference during training
- Allow multiple training games per thread
- Partial Python rewrite
- Add recurrent DQN option
- Add prioritized replay
- Add noisy networks
- Error when running training
- Simplify battle stream interface
- Request for Assistance with Locking a Party of 6 Pokemon