Comments (4)
Hi @Acejoy, here is a blog post mentioning how LSTM-DQN can be adapted to text-based games.
Code can be found here https://github.com/microsoft/tdqn
Regarding access to the possible actions, you can request admissible_commands when building the EnvInfos object (see documentation). Also, you can check Building a simple agent.ipynb, where admissible_commands is used in RandomAgent.
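For anyone reading along, here is a minimal sketch of what requesting admissible commands might look like. It assumes a recent TextWorld release where textworld.gym.register_game and textworld.gym.make are available (older releases used gym.make(env_id) instead); make_env and choose_random_command are hypothetical helper names, not part of TextWorld.

```python
import random


def make_env(game_file, max_steps=50):
    """Build a TextWorld Gym environment that reports admissible commands.

    Assumes TextWorld is installed. The imports are local so the helper
    below can be used without it.
    """
    import textworld.gym
    from textworld import EnvInfos

    # Ask the environment to include the valid commands in `infos` at each step.
    request_infos = EnvInfos(admissible_commands=True)
    env_id = textworld.gym.register_game(
        game_file, request_infos=request_infos, max_episode_steps=max_steps
    )
    # Assumption: recent releases expose textworld.gym.make;
    # older ones relied on gym.make(env_id).
    return textworld.gym.make(env_id)


def choose_random_command(infos, rng=random):
    """Pick one admissible command uniformly at random (hypothetical helper)."""
    return rng.choice(infos["admissible_commands"])
```

With such an environment, obs, infos = env.reset() followed by env.step(choose_random_command(infos)) would play one random step. Note that the admissible commands change from step to step, so they are not a fixed list the way env.action_space is in other Gym environments.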
from textworld.
Hi @mohanr, thank you for your feedback. We are going to better describe what's going on in the agent example code. Here's a high-level description in case you need it now.
The agent receives as input the concatenation of the description of the room (output of the look command), its inventory contents (output of the inventory command) and the game's narrative (previous command feedback). A GRU encoder (encoder_gru) is used to parse the input text and extract features at the current game step. Those features are then provided as input to another GRU (state_gru) that serves as a state history and spans the whole episode.
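As a rough illustration of the input side only, the sketch below builds the concatenated text and turns it into word ids that an embedding layer and encoder_gru could consume. The GRUs themselves are left out. The concatenation order, the regex tokenizer and the Vocabulary class are assumptions made for this example, not necessarily what the notebook does.

```python
import re


def build_input(description, inventory, feedback):
    """Concatenate the three text sources into one input string.

    The order used here is an assumption; the text only says they are concatenated.
    """
    return "{}\n{}\n{}".format(description, inventory, feedback)


class Vocabulary:
    """Hypothetical word-to-id mapping that can grow as new words are seen."""

    def __init__(self):
        self.word2id = {"<PAD>": 0, "<UNK>": 1}

    def encode(self, text, grow=True):
        ids = []
        # Lowercase and split on non-word characters (a deliberately simple tokenizer).
        for word in re.findall(r"\w+", text.lower()):
            if word not in self.word2id:
                if not grow:
                    ids.append(self.word2id["<UNK>"])
                    continue
                self.word2id[word] = len(self.word2id)
            ids.append(self.word2id[word])
        return ids


vocab = Vocabulary()
text = build_input("You are in a kitchen.", "You carry a key.", "You open the door.")
token_ids = vocab.encode(text)  # one id per word, ready for an embedding lookup
```

The resulting token_ids would be fed through the embedding and encoder_gru at every game step, and the encoder's final hidden state would become the next input to state_gru.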
To select which command to send to the game, the agent gets the list of all admissible commands (provided by TextWorld) and scores each of them conditioned on the current hidden state of the state_gru. To do that, another GRU encoder (cmd_encoder_gru) is used to encode each text command separately. Then, a simple linear layer (att_cmd) is used to output a score given the concatenation of an encoded command and the current hidden state of the state_gru.
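To make the scoring step concrete, here is a small pure-Python sketch. The plain lists stand in for the outputs of cmd_encoder_gru and state_gru, and weights plays the role of the att_cmd linear layer. Turning the scores into probabilities with a softmax and sampling from them is an assumption about how a command gets picked; the description above only says that each command is scored.

```python
import math
import random


def score_commands(state, encoded_cmds, weights, bias=0.0):
    """Linear score for each command from the concatenation [command; state]."""
    scores = []
    for cmd in encoded_cmds:
        features = list(cmd) + list(state)
        scores.append(sum(w * x for w, x in zip(weights, features)) + bias)
    return scores


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(probs, rng=random):
    """Sample the index of the command to send, proportionally to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Toy values: a 2-d state vector and two commands encoded as 1-d vectors.
state = [1.0, 0.0]
encoded_cmds = [[1.0], [0.0]]
weights = [2.0, 0.5, 0.0]  # one weight per entry of [command; state]
scores = score_commands(state, encoded_cmds, weights)
probs = softmax(scores)
```

Because every admissible command is scored separately, the number of commands can change from step to step without changing the network, which is what removes the need for a fixed action list.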
Note that the word embedding (embedding) is learned from scratch, but one could use a pre-trained one like word2vec, GloVe, ELMo or BERT.
The agent is trained using A2C (batch size of 1). The critic shares the same bottom layers as the agent and uses a simple linear layer (critic) to output a single scalar value given the current hidden state of the state_gru.
The model also uses entropy regularization to promote exploration.
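For readers new to A2C, the arithmetic behind the training signal can be sketched as follows. This is a generic textbook-style formulation in plain Python, not the notebook's code; gamma, value_coef and entropy_coef are hypothetical values chosen for illustration, and in a real implementation the advantage would be detached from the gradient in the policy term.

```python
import math


def discounted_returns(rewards, gamma=0.9, bootstrap=0.0):
    """Return-to-go for each step, computed backwards through the rewards."""
    returns = []
    running = bootstrap
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def entropy(probs):
    """Shannon entropy of a distribution over commands."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def a2c_loss(log_probs, values, entropies, returns,
             value_coef=0.5, entropy_coef=0.1):
    """Sum of policy loss, value loss and an entropy bonus over the steps."""
    total = 0.0
    for logp, value, ent, ret in zip(log_probs, values, entropies, returns):
        advantage = ret - value              # how much better than the critic expected
        total += -logp * advantage           # actor (policy gradient) term
        total += value_coef * advantage ** 2  # critic regression term
        total -= entropy_coef * ent          # higher entropy lowers the loss
    return total
```

The entropy term rewards keeping the command distribution spread out, which is how the regularization promotes exploration.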
Hey, I am new to RL and got interested in its applications in NLP.
I have a few queries:
- I was wondering how one would implement LSTM-DQN (https://arxiv.org/pdf/1506.08941.pdf) using TextWorld. Here we need to know all possible actions beforehand, and the final layer outputs max_actions Q-values, one for each action. I tried to find all possible actions from the game file, but wasn't successful. In OpenAI's Gym, env.action_space gives the action list. Is there any way to get it in TextWorld?
- Also, in the above implementation, how are actions stored? In a normal DQN, we store actions in the transitions. But in the above implementation, where are actions stored in self.transitions:
self.transitions.append([None, indexes, outputs, values]) # Reward will be set on the next call
Thank you for the reply.