Published at ICML 2024: [Paper] | [Website]
We present an algorithm for skill discovery from expert demonstrations. The algorithm first utilizes Large Language Models (LLMs) to propose an initial segmentation of the trajectories. Following that, a hierarchical variational inference framework incorporates the LLM-generated segmentation information to discover reusable skills by merging trajectory segments. To further control the trade-off between compression and reusability, we introduce a novel auxiliary objective based on the Minimum Description Length principle that helps guide this skill discovery process.
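For intuition, the compression-vs-reusability trade-off can be sketched with a toy description-length score. This is an illustrative sketch only, not the objective implemented in this repo; the function name and cost constants below are hypothetical:

```python
def description_length(segments, skill_cost=5.0, use_cost=1.0):
    """Toy MDL score for a segmentation of one trajectory.

    Library cost: each distinct skill is stored once, paying a fixed
    overhead plus one unit per primitive action it contains.
    Encoding cost: each segment in the trajectory is one skill reference.
    """
    skills = {tuple(seg) for seg in segments}
    library = sum(skill_cost + len(s) for s in skills)
    encoding = use_cost * len(segments)
    return library + encoding

actions = ["goto", "open", "pick", "goto", "open", "pick"]

# Three candidate segmentations of the same trajectory:
per_action = [[a] for a in actions]      # no compression
reusable   = [actions[:3], actions[3:]]  # one 3-step skill, reused twice
monolithic = [actions]                   # whole trajectory as one skill

print(description_length(reusable))    # 10.0
print(description_length(monolithic))  # 12.0
print(description_length(per_action))  # 24.0
```

Merging segments into reusable skills lowers the score, but over-merging into a single monolithic skill raises it again; controlling this trade-off is the role of the MDL-based auxiliary objective.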
TODO: online hierarchical RL with the learned skills
Clone the repo:
$ git clone https://github.com/Minusadd/LAST.git LAST
Install requirements:
$ virtualenv -p $(which python3.9) last
$ source last/bin/activate
$ cd LAST
$ pip install --upgrade pip
$ pip install -r requirements.txt
Install ALFRED and download the dataset:
$ git clone https://github.com/askforalfred/alfred.git alfred
$ cd alfred/data
$ sh download_data.sh json_feat
$ cd ../..
(Optional) Download the preprocessed features & LLM-generated data from Google Drive.
Set up the OpenAI API key:
$ export OPENAI_API_KEY='your api key'
$ export OPENAI_API_BASE='your api base'
$ export OPENAI_API_TYPE='your api type'
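Optionally, you can sanity-check that these variables are visible to Python before running the generation scripts (a minimal snippet, not part of the repo):

```python
import os

# The GPT-4 generation scripts read these variables from the environment.
required = ["OPENAI_API_KEY", "OPENAI_API_BASE", "OPENAI_API_TYPE"]
missing = [name for name in required if not os.environ.get(name)]
print("missing:", missing)  # expect an empty list if everything is exported
```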
Generate trajectory data using GPT-4:
$ mkdir data_gpt4
$ python alfred_steps.py --data_dir ./alfred/data/json_2.1.0 --output_dir data_gpt4/ --n_workers 4
Note 1: The .jpeg images from the full dataset differ from the images rendered during evaluation because of JPEG compression. We therefore generated the images for all the trajectories ourselves. We are still trying to figure out how to share them, but you can generate them on your own with the code provided in ET.
Note 2: You can directly download the GPT-4-generated dataset we used from Google Drive and skip this step.
Process the image and language data given the initial segmentation results:
$ python process_data.py
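As a rough illustration of what this step does with the switching points, the hypothetical helper below splits a trajectory's action sequence into segments at LLM-proposed boundary indices; the actual logic lives in process_data.py:

```python
def split_at_switch_points(actions, switch_points):
    """Split an action sequence into segments; each switch point
    (an index into `actions`, excluding 0) starts a new segment."""
    boundaries = [0] + sorted(switch_points) + [len(actions)]
    return [actions[s:e] for s, e in zip(boundaries, boundaries[1:])]

actions = ["GotoLocation", "OpenObject", "PickupObject",
           "CloseObject", "GotoLocation", "PutObject"]
segments = split_at_switch_points(actions, [4])
print(segments)
# [['GotoLocation', 'OpenObject', 'PickupObject', 'CloseObject'],
#  ['GotoLocation', 'PutObject']]
```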
Note: You can directly download the processed data we used from Google Drive and skip this step: FasterRCNN, MaskRCNN, Image features, Language features, Goal features, Masks, Action sequences, Switching points, and Processed trajectory data (GPT-4). You will need to put all the downloaded files into the data/ folder.
$ mkdir data
Train a LAST agent:
$ python algorithm.py --name train_last --train 1 --include_goal 1 --ent_weight 0.1 --kl_weight 0.0001
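The --ent_weight and --kl_weight flags scale auxiliary terms in the variational training objective. Schematically (a hypothetical sketch for intuition only; the actual terms and signs are defined in algorithm.py):

```python
def total_loss(recon_loss, entropy, kl, ent_weight=0.1, kl_weight=1e-4):
    # Schematic objective: reconstruction loss, minus an entropy bonus
    # scaled by ent_weight, plus a KL regularizer scaled by kl_weight
    # that keeps the variational posterior close to the prior.
    return recon_loss - ent_weight * entropy + kl_weight * kl

print(total_loss(recon_loss=1.0, entropy=2.0, kl=50.0))  # 0.805
```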
Evaluate the agent on the dataset:
First, download our checkpoint and put it into the saved_nets/ folder, then run:
$ python algorithm.py --name test_last --train 0 --include_goal 1 --ent_weight 0.1 --kl_weight 0.0001 --model saved_nets/Model_epoch70
If you find this repository useful, please cite our work:
@inproceedings{fu2024languageskill,
title = {Language-guided Skill Learning with Temporal Variational Inference},
author = {Haotian Fu and Pratyusha Sharma and Elias Stengel-Eskin and George Konidaris and Nicolas Le Roux and Marc-Alexandre Côté and Xingdi Yuan},
booktitle = {ICML},
year = {2024},
}