More Efficient Randomized Exploration in RL with Approximate Sampling

This repository contains the source code for the paper titled More Efficient Randomized Exploration for Reinforcement Learning via Approximate Sampling.

Installation Requirements

  • Python: >= 3.8
  • Tianshou: ==0.4.10
  • Envpool: ==0.6.6
  • Additional dependencies can be found in requirements.txt.

Running Experiments

Setting Up and Executing Experiments

Hyperparameters and grid search values are defined in a configuration file in the configs folder. To run an experiment, pass a configuration index; the index selects one combination of values from the file, which is expanded into a dictionary that defines the experiment setup. All outputs, including logs, are written to the logs folder. For details, refer to the source code.
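The mapping from an index to an experiment dictionary can be sketched as follows. This is only an illustration, not the repository's sweeper: the function name config_for_index, the example grid, and the enumeration order (sorted keys, last key varying fastest) are all assumptions. The wrap-around for indices larger than the number of combinations mirrors the multiple-runs convention described later in this README.

```python
from itertools import product


def config_for_index(grid, idx):
    """Return the parameter combination selected by a 1-based index.

    Hypothetical helper: the real enumeration order in utils/sweeper.py
    may differ. Indices beyond the number of combinations wrap around.
    """
    keys = sorted(grid)
    combos = list(product(*(grid[k] for k in keys)))
    chosen = combos[(idx - 1) % len(combos)]
    return dict(zip(keys, chosen))


# Made-up grid with 2 * 3 = 6 combinations (not taken from the repo's configs).
grid = {"lr": [1e-3, 1e-4], "noise_scale": [0.1, 0.01, 0.001]}
print(config_for_index(grid, 1))
print(config_for_index(grid, 7))  # wraps around to the same combination as index 1
```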

To launch an experiment using the configuration file atari8_fg_aULMC.json with the index 1, execute:

python main.py --config_file ./configs/atari8_fg_aULMC.json --config_idx 1

Optional: Grid Search

To identify the total number of parameter combinations for a given configuration (for instance, atari8_fg_aULMC.json), run:

python utils/sweeper.py

This command outputs the total combinations:

Number of total combinations in atari8_fg_aULMC.json: 1728

To run every combination in turn (indices 1 to 1728), you can use a bash loop:

for index in {1..1728}
do
  python main.py --config_file ./configs/atari8_fg_aULMC.json --config_idx $index
done

For handling a large batch of experiments, GNU Parallel is recommended for job scheduling:

parallel --eta --ungroup python main.py --config_file ./configs/atari8_fg_aULMC.json --config_idx {1} ::: $(seq 1 1728)

If conducting multiple runs for the same configuration index, increment the index by the total number of combinations. For instance, to perform 5 runs for index 1:

for index in 1 1729 3457 5185 6913
do
  python main.py --config_file ./configs/atari8_fg_aULMC.json --config_idx $index
done

Alternatively, for simplicity:

parallel --eta --ungroup python main.py --config_file ./configs/atari8_fg_aULMC.json --config_idx {1} ::: $(seq 1 1728 8640)
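The index arithmetic behind the repeated runs above can be written out as a small helper. This is a minimal sketch; run_indices is a hypothetical name and is not part of the repository.

```python
def run_indices(config_idx, total_combinations, num_runs):
    """List the indices that repeat one configuration num_runs times.

    Each extra run adds the total number of combinations to the base index.
    """
    return [config_idx + k * total_combinations for k in range(num_runs)]


print(run_indices(1, 1728, 5))  # [1, 1729, 3457, 5185, 6913]
```

The printed list matches the indices used in the bash loop and in the seq 1 1728 8640 call.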

Optional: Analyzing Results

To analyze experiment outcomes, simply execute:

python analysis.py

This script identifies unfinished experiments by checking for missing result files, reports memory usage, and produces a histogram of memory utilization for the logs/atari8_fg_aULMC/0 directory. It also generates CSV files summarizing the training and testing outcomes. For comprehensive details, see analysis.py. Additional analysis tools are available in utils/plotter.py.
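Detecting unfinished experiments by looking for missing result files could look roughly like the sketch below. It assumes one sub-directory per configuration index under the experiment's log directory and a result file named result.csv; both the layout and the file name are assumptions made for illustration, so check analysis.py for what the repository actually uses.

```python
from pathlib import Path


def unfinished_indices(exp_dir, indices, result_name="result.csv"):
    """Return the indices whose run directory has no result file.

    Assumed layout (hypothetical): <exp_dir>/<index>/<result_name>.
    """
    exp_dir = Path(exp_dir)
    return [i for i in indices if not (exp_dir / str(i) / result_name).is_file()]


# Example call; the path follows the log directory naming used in this README.
missing = unfinished_indices("logs/atari8_fg_aULMC", range(1, 1729))
print(f"{len(missing)} runs without a result file")
```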