farama-foundation / d4rl

A collection of reference environments for offline reinforcement learning

License: Apache License 2.0

Python 99.98% Shell 0.02%

d4rl's Issues

Question about data collection

Hi, thank you for providing such great work. I have a question about the data collection method. I looked through the code and found that this may be relevant:

https://github.com/rail-berkeley/d4rl/blob/1ed16f94b74d9d7ee60fa399746ace754bc0b838/scripts/ope_rollout.py#L29-L30

However, since the model is referred to as an ONNX model, I am confused about how the noise is used when generating actions. Is the noise added directly to the deterministic output of the actor, used as a latent code by a VAE-style actor, or something else?

Installing D4RL using pip install -e causes error

How to reproduce:
git clone https://github.com/rail-berkeley/d4rl.git
cd d4rl
pip install -e .

Error message text:

Exception:
Traceback (most recent call last):
File "/home/kamran/rlfd_env/lib/python3.6/site-packages/pip/basecommand.py", line 215, in main
status = self.run(options, args)
File "/home/kamran/rlfd_env/lib/python3.6/site-packages/pip/commands/install.py", line 353, in run
wb.build(autobuilding=True)
File "/home/kamran/rlfd_env/lib/python3.6/site-packages/pip/wheel.py", line 749, in build
self.requirement_set.prepare_files(self.finder)
File "/home/kamran/rlfd_env/lib/python3.6/site-packages/pip/req/req_set.py", line 380, in prepare_files
ignore_dependencies=self.ignore_dependencies))
File "/home/kamran/rlfd_env/lib/python3.6/site-packages/pip/req/req_set.py", line 554, in _prepare_file
require_hashes
File "/home/kamran/rlfd_env/lib/python3.6/site-packages/pip/req/req_install.py", line 278, in populate_link
self.link = finder.find_requirement(self, upgrade)
File "/home/kamran/rlfd_env/lib/python3.6/site-packages/pip/index.py", line 465, in find_requirement
all_candidates = self.find_all_candidates(req.name)
File "/home/kamran/rlfd_env/lib/python3.6/site-packages/pip/index.py", line 423, in find_all_candidates
for page in self._get_pages(url_locations, project_name):
File "/home/kamran/rlfd_env/lib/python3.6/site-packages/pip/index.py", line 568, in _get_pages
page = self._get_page(location)
File "/home/kamran/rlfd_env/lib/python3.6/site-packages/pip/index.py", line 683, in _get_page
return HTMLPage.get_page(link, session=self.session)
File "/home/kamran/rlfd_env/lib/python3.6/site-packages/pip/index.py", line 795, in get_page
resp.raise_for_status()
File "/home/kamran/rlfd_env/share/python-wheels/requests-2.18.4-py2.py3-none-any.whl/requests/models.py", line 935, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://pypi.org/simple/mjrl/

Thanks in advance for the help.

MuJoCo Datasets

It seems 'next_observations' has been removed from the current version of the MuJoCo datasets. Is it possible to include it again, as in the previous version? Are the data points stored in order, so that I can always use obs[t+1] as the next observation when the terminal flag is false? Thank you.
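
If the transitions are stored in trajectory order, next observations can be rebuilt by shifting the observation array by one step and dropping the last step of every episode. The following is a minimal sketch on synthetic arrays, not the dataset's own loader. It assumes the arrays are in trajectory order, and `episode_ends` is a hypothetical boolean array (not a dataset key) that marks the last stored step of each episode.

```python
import numpy as np


def add_next_observations(observations, episode_ends):
    """Pair obs[t] with obs[t+1], skipping the last step of each episode.

    `episode_ends` is a hypothetical boolean array (True on the final stored
    step of an episode); it is an assumption of this sketch, not a dataset key.
    """
    observations = np.asarray(observations)
    episode_ends = np.asarray(episode_ends, dtype=bool)
    # The final row has no successor, so it is always dropped.
    keep = ~episode_ends[:-1]
    return observations[:-1][keep], observations[1:][keep], keep


# Two synthetic episodes of length 3 with 1-D observations.
obs = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
ends = np.array([False, False, True, False, False, True])
cur, nxt, keep = add_next_observations(obs, ends)
```

The returned `keep` mask can be applied to the other arrays in the same way (for example `actions[:-1][keep]`) so that every field stays aligned.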

hopper-medium-expert-v0

hopper-medium-expert-v0 has 1200919 samples.
It doesn't seem to be properly combined.

d4rl/gym_mujoco/__init__.py
The kwargs passed to register for 'ant-medium-expert-v0' don't include 'ref_min_score' and 'ref_max_score'.

KeyError: 'mini-kitchen-microwave-kettle-light-slider-v0'

Hi, I want to test the kitchen environment, but I get an error. Can you help me? Thanks very much!

Python 3.7.8 | packaged by conda-forge | (default, Jul 31 2020, 02:25:08) 
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import gym
>>> import d4rl
Warning: Flow failed to import. Set the environment variable D4RL_SUPPRESS_IMPORT_ERROR=1 to suppress this message.
No module named 'flow'
/home/fangyu/.conda/envs/d4rl/lib/python3.7/site-packages/glfw/__init__.py:834: GLFWError: (65544) b'X11: The DISPLAY environment variable is missing'
  warnings.warn(message, GLFWError)
Warning: CARLA failed to import. Set the environment variable D4RL_SUPPRESS_IMPORT_ERROR=1 to suppress this message.
No module named 'carla'
>>> env = gym.make('mini-kitchen-microwave-kettle-light-slider-v0')
Traceback (most recent call last):
  File "/home/fangyu/.conda/envs/d4rl/lib/python3.7/site-packages/gym/envs/registration.py", line 121, in spec
    return self.env_specs[id]
KeyError: 'mini-kitchen-microwave-kettle-light-slider-v0'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/fangyu/.conda/envs/d4rl/lib/python3.7/site-packages/gym/envs/registration.py", line 145, in make
    return registry.make(id, **kwargs)
  File "/home/fangyu/.conda/envs/d4rl/lib/python3.7/site-packages/gym/envs/registration.py", line 89, in make
    spec = self.spec(path)
  File "/home/fangyu/.conda/envs/d4rl/lib/python3.7/site-packages/gym/envs/registration.py", line 131, in spec
    raise error.UnregisteredEnv('No registered env with id: {}'.format(id))
gym.error.UnregisteredEnv: No registered env with id: mini-kitchen-microwave-kettle-light-slider-v0

logging policy propensities available?

Is it possible to access the propensities (pi(a|s) for each transition s, a, r) of the logging policy used while making the dataset? This would be useful for algorithms that aren't policy agnostic.

CustomMDPPathCollector

Firstly, thank you.

After running python antmaze_bear.py, it seems that rlkit (latest version already installed) doesn't contain CustomMDPPathCollector. Is this your custom class?

adroit task names in wiki are wrong

In the wiki, the Adroit task names are written as 'task-demo-v0', but they are registered as 'task-human-v0'.

Unable to make environments

I get errors when trying to use the following environments:

Ant maze:
FileNotFoundError: [Errno 2] No such file or directory: '~/anaconda3/envs/myenv/lib/python3.8/site-packages/d4rl/locomotion/assets/ant.xml'

Adroit:
OSError: File ~/anaconda3/envs/myenv/lib/python3.8/site-packages/d4rl/hand_manipulation_suite/assets/DAPG_pen.xml does not exist

Where can I get those environments from? Thanks in advance!

Antmaze datasets broken

The difference in x/y values between current observations and next observations has some extremely high values that imply that the ant jumped across the whole maze in a single step.

There are a number of such samples, which makes me wonder whether the dataset is broken in general.

Could you please comment on that?

Use this to reproduce:

import gym
import d4rl

env = gym.make('antmaze-medium-diverse-v0')
dataset = d4rl.qlearning_dataset(env)

(dataset['next_observations'] - dataset['observations'])[:, 0].max()
>> 21.227503

(dataset['next_observations'] - dataset['observations'])[:, 1].max()
>> 16.82595
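
To see how many transitions are affected and where they sit, the maxima above can be extended to a list of indices. This is a minimal sketch on synthetic arrays. The threshold `max_step` is a hypothetical value picked for illustration, and the sketch does not say what causes the jumps.

```python
import numpy as np


def find_position_jumps(observations, next_observations, max_step=1.0):
    """Return indices where the x/y displacement exceeds `max_step`.

    `max_step` is a hypothetical threshold chosen for illustration; a sensible
    value depends on the environment's units and frame skip.
    """
    delta = np.asarray(next_observations)[:, :2] - np.asarray(observations)[:, :2]
    distance = np.linalg.norm(delta, axis=1)
    return np.flatnonzero(distance > max_step)


# Synthetic data: the third transition jumps far across the plane.
obs = np.array([[0.0, 0.0], [0.1, 0.0], [0.2, 0.1]])
next_obs = np.array([[0.1, 0.0], [0.2, 0.1], [20.0, 16.0]])
jumps = find_position_jumps(obs, next_obs)
```

Comparing the returned indices with the positions of episode boundaries would show whether the jumps coincide with resets or occur in the middle of trajectories.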

http://rail.eecs.berkeley.edu/datasets/offline_rl/ope_policies/ 404 error

I'm interested in experimenting with https://github.com/rail-berkeley/d4rl/wiki/Off-Policy-Evaluation

How can I visualize the action?

Hi, I want to visualize the actions after training, but I noticed that "d4rl/scripts/visualize_dataset.py" says 'Only MuJoCo-based environments can be visualized'. If I want to visualize 'kitchen-complete-v0', how can I do that? Thanks very much!

Missing next observation in dataset.

Hi! I've recently been exploring this dataset to study offline RL.

But I'm wondering why the dataset does not contain the next_observation of each transition. Is this on purpose or just a mistake?

Here is what i got:

>>> env = gym.make("maze2d-large-v1")

>>> dataset = env.env.get_dataset()

Downloading dataset: http://rail.eecs.berkeley.edu/datasets/offline_rl/maze2d/maze2d-large-sparse-v1.hdf5 to xxxxxxxxxxxxxxxxxxxxx

>>> dataset.keys()

dict_keys(['actions', 'infos/goal', 'infos/qpos', 'infos/qvel', 'observations', 'rewards', 'terminals'])

Datasets broken after update

Ant maze and maze2d datasets have different numbers of samples for observations/actions/... after pulling the newest version.
@justinjfu Did you break something with your update yesterday?

AttributeError: module 'd4rl' has no attribute 'flow'

https://github.com/rail-berkeley/d4rl/blob/19ff42dfca15a7ecef38c380711da5164a86e26f/d4rl/flow/__init__.py#L26

This causes circular references. Maybe it should be:
from d4rl.flow import traffic_light_grid as traffic_light_grid
from d4rl.flow import merge as merge
from d4rl.flow import bottleneck as bottleneck

In MuJoCo dataset, how can we figure out the beginning of each episode?

Hi, I would like to use d4rl dataset for DICE scenarios, where sampling from initial states is required. I thought the termination flag could be helpful at first glance, but I've noticed from #34 that termination=False when an agent reaches the maximum length of episodes.

Is there another recommended method for this issue?

Thanks for your support in advance!
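
One possible workaround when time-limit cuts are not flagged is to count steps while scanning the terminal flags. This is only a sketch under strong assumptions: episodes are stored back to back in order, and every episode that does not terminate early contains exactly `max_episode_steps` stored steps. `max_episode_steps` is a hypothetical parameter that the caller has to supply.

```python
import numpy as np


def episode_start_indices(terminals, max_episode_steps):
    """Recover episode start indices from terminal flags plus a step limit.

    Assumes episodes are concatenated in order and that an episode ends either
    on a True terminal flag or after exactly `max_episode_steps` stored steps.
    """
    terminals = np.asarray(terminals, dtype=bool)
    starts = [0]
    steps_in_episode = 0
    # The last index cannot be followed by a new episode, so stop before it.
    for t in range(len(terminals) - 1):
        steps_in_episode += 1
        if terminals[t] or steps_in_episode == max_episode_steps:
            starts.append(t + 1)
            steps_in_episode = 0
    return np.array(starts)


# Synthetic flags: a 3-step episode that terminates, a 4-step one that times
# out, and one trailing step of a third episode.
terminals = [False, False, True, False, False, False, False, False]
starts = episode_start_indices(terminals, max_episode_steps=4)
```

The observations at `starts` can then serve as samples of the initial-state distribution, provided the assumptions above actually hold for the dataset in question.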

Way to convert agent position to maze coverage

Hi,

For the maze environments, I'm interested in converting an agent's (x, y)-position into the maze cell it is located in, so I can easily compute the percentage of the maze the agent has explored.

In the code, I saw that each maze environment has its own representation, e.g.:

U_MAZE = \
        "#####\\"+\
        "#GOO#\\"+\
        "###O#\\"+\
        "#OOO#\\"+\
        "#####"

but what I haven't been able to find are the centers of each of the open positions and their corresponding height and width (for both the point maze and ant maze).

Thanks!
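
Once the cell centers and sizes are known, the coverage computation itself is short. The sketch below parses a maze string like the one above and maps positions to cells. The layout convention (cell (i, j) centered at `origin + (i, j) * cell_size`, with x indexing rows and y indexing columns) and the default values of `cell_size` and `origin` are assumptions of this sketch, not values taken from the environments.

```python
def parse_maze(maze_str):
    """Split a maze spec such as U_MAZE into rows ('#' marks a wall)."""
    return maze_str.split("\\")


def position_to_cell(x, y, cell_size=1.0, origin=(0.0, 0.0)):
    """Map a continuous position to integer (row, col) indices.

    Assumes cell (i, j) is centered at origin + (i, j) * cell_size. Both the
    convention and the defaults are hypothetical.
    """
    row = int(round((x - origin[0]) / cell_size))
    col = int(round((y - origin[1]) / cell_size))
    return row, col


def coverage(positions, maze_str, **kwargs):
    """Fraction of open (non-wall) cells visited by the given positions."""
    rows = parse_maze(maze_str)
    open_cells = {
        (i, j)
        for i, row in enumerate(rows)
        for j, char in enumerate(row)
        if char != "#"
    }
    visited = {position_to_cell(x, y, **kwargs) for x, y in positions}
    return len(visited & open_cells) / len(open_cells)


U_MAZE = "#####\\" + "#GOO#\\" + "###O#\\" + "#OOO#\\" + "#####"
frac = coverage([(1.0, 1.1), (1.2, 2.0), (0.9, 1.0)], U_MAZE)
```

In this example the three positions fall into two of the seven open cells. For the ant maze, `cell_size` and `origin` would have to be replaced with the scaling the environment actually uses.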

env.seed doesn't seem to work

env.seed(seed=x) doesn't seem to work.
After running the previous command and calling env.reset(), I get different states every time.
It would be nice to get reproducible states to compare algorithms during evaluation.

Which env version for Mujoco tasks?

Hi, which MuJoCo env version was used to generate the offline data, e.g., Hopper-v1 or Hopper-v3? And how should the agent's performance be evaluated after training? By testing the agent online, i.e., running it in the env? Thank you.

'terminals' in 'halfcheetah-medium-v0' are all zero

Hi guys,

I find that 'terminals' in 'halfcheetah-medium-v0' are all zero.
Is this a bug?

Best,
Rui

Couple things

Need to grant permissions on the mixed mujoco envs (maybe other non-mujoco ones as well). Getting error AccessDeniedException: 403 ... does not have storage.objects.list access to justinjfu-public.
You should wrap your environments in a TimeLimit wrapper: https://github.com/openai/gym/blob/master/gym/wrappers/time_limit.py. Right now they must be automatically terminated else infinite loop.
Missing the Ant mujoco from Bear paper =(
Not really an issue with the code, but curious for your thoughts: do you have reason to believe that the environments on which none of your tested methods achieve any reasonable score are even solvable / well-posed problems for the tabula rasa, completely offline setting? It seems a bit silly to throw these out there as benchmarks. Potentially better approach 1: start with too much data, so that a reasonable baseline (e.g., offline SAC) solves the task. Then the benchmark becomes not performance, but whether equivalent performance can be obtained from random subsamples of the data. Potentially better approach 2: offer these as warm-up / demo-like datasets, to see how fast an agent that is able to explore can reach good performance when bootstrapping from them.
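
The time-limit point in the list above can be illustrated without depending on gym. The class below is a hand-written stand-in that shows the idea. It is not gym's `TimeLimit` implementation, and both the toy environment and the `truncated` info key are hypothetical.

```python
class StepLimit:
    """Minimal stand-in for a time-limit wrapper (not gym's implementation)."""

    def __init__(self, env, max_episode_steps):
        self.env = env
        self.max_episode_steps = max_episode_steps
        self._elapsed = 0

    def reset(self):
        self._elapsed = 0
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self._elapsed += 1
        if self._elapsed >= self.max_episode_steps:
            # Record that the episode was cut off rather than truly finished.
            info = dict(info, truncated=not done)
            done = True
        return obs, reward, done, info


class NeverEndingEnv:
    """Hypothetical toy environment that never reports done on its own."""

    def reset(self):
        return 0

    def step(self, action):
        return 0, 1.0, False, {}


env = StepLimit(NeverEndingEnv(), max_episode_steps=5)
env.reset()
steps, done = 0, False
while not done:
    _, _, done, info = env.step(None)
    steps += 1
```

With the wrapper in place, the evaluation loop stops after five steps even though the underlying environment never signals termination.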

Anyways, thanks for making the Mujoco datasets available. I have the following "better" expert agents to produce datasets if you are interested (probably can do even better, wasn't sure why Bear paper stopped short on the expert perf): Ant 6900, HalfCheetah 16700, Hopper 4200, Walker 6600.

Some dataset cannot be imported correctly.

When testing in gym, some datasets in d4rl don't seem to work.

res = ['ant_expert', 'ant_medium', 'ant_medium_expert', 'ant_mixed', 'ant_random', 'ant_random_expert',
       'halfcheetah_expert', 'halfcheetah_medium', 'halfcheetah_medium_expert', 'halfcheetah_mixed',
       'halfcheetah_random', 'hopper_expert', 'hopper_medium', 'hopper_medium_expert', 'hopper_mixed',
       'hopper_random', 'walker2d_expert', 'walker2d_medium', 'walker2d_medium_expert',
       'walker2d_random', 'walker_mixed'] 


res = [x.replace('_', '-') + '-v0' for x in res]

import d4rl
import gym
import mujoco_py

bad_case = []
good_case = []
for x in res:
    try:
        env = gym.make(x)
        good_case.append(x)
    except Exception:  # avoid a bare except, which would also swallow KeyboardInterrupt
        bad_case.append(x)

print("Bad case", bad_case)

And the output:

Bad case ['ant-mixed-v0', 'ant-random-expert-v0', 'halfcheetah-mixed-v0', 'hopper-mixed-v0', 'walker-mixed-v0']

carla-town-v0

The carla dataset for carla-town-v0 seems to be faulty. All the observations from index = 16 and on are exactly the same. Same goes for the full dataset.

Backward error when running BEAR

I am running the bear algorithm, using the example command:

python examples/bear_hdf5_d4rl.py --env='halfcheetah-medium-v0' --policy_lr=1e-4 --num_samples=100
and I run into an error during backpropagation on the policy_loss.
This already happens at training epoch=1.
I get:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [256, 6]], which is output 0 of TBackward, is at version 2; expected version 1 instead [...]

Computing normalized score

Hi,

Thanks again for this repository and data.
I was wondering whether REF_MIN_SCORE and REF_MAX_SCORE (here) are used in the score normalization described in section 5.1:
normalized_score = 100* (score - REF_MIN_SCORE) / (REF_MAX_SCORE - REF_MIN_SCORE )

where

REF_MIN_SCORE == random_score
REF_MAX_SCORE == expert_score

If not, would you please provide random_score and expert_score (section 5.1 in the paper) so that comparisons are correct? Otherwise, it would be very hard to make a correct and fair comparison.

Thanks for your help.
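
The normalization formula quoted above can be written as a small helper. This is a minimal sketch of that formula only. The reference values in the example are placeholders and not the repository's actual constants.

```python
def normalized_score(score, ref_min_score, ref_max_score):
    """Scale a raw return so ref_min_score maps to 0 and ref_max_score to 100."""
    return 100.0 * (score - ref_min_score) / (ref_max_score - ref_min_score)


# Placeholder reference values chosen for illustration only.
example = normalized_score(score=50.0, ref_min_score=0.0, ref_max_score=200.0)
```

Under this definition a policy matching the random reference scores 0, one matching the expert reference scores 100, and values above 100 or below 0 are possible.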

About the server setting and the training time

Hi guys,

Thank you for the great work!
What kind of server (CPU cluster or GPU machine) did you use to run the experiments?
Is the GPU acceleration significant?
And how long does it normally take to run each experiment?

Thank you!

Best,
Rui

Maze env: out-of-grid goal location?

The goal locations (shown as a red dot) are not contained within the maze. Is this a bug in the environment?

This actually happens to ALL pointmass maze environments.

Why MuJoCo instead of pyBullet?

No offense, but I just wondered: wouldn't it be better to use bullet3 and pyBullet as the baseline for this project, since they are open source and have better community support in general?

Relative imports break module

Because of the relative imports (example), the package cannot be imported when it is installed with pip directly from GitHub instead of being cloned and installed locally:

(base) ➜  Developer conda create -y -n offline_rl python=3.6.9
Collecting package metadata (current_repodata.json): done
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: done


==> WARNING: A newer version of conda exists. <==
  current version: 4.7.12
  latest version: 4.8.3

Please update conda by running

    $ conda update -n base -c defaults conda



## Package Plan ##

  environment location: /usr/local/Caskroom/miniconda/base/envs/offline_rl

  added / updated specs:
    - python=3.6.9


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    xz-5.2.5                   |       h1de35cc_0         282 KB
    ------------------------------------------------------------
                                           Total:         282 KB

The following NEW packages will be INSTALLED:

  ca-certificates    pkgs/main/osx-64::ca-certificates-2020.1.1-0
  certifi            pkgs/main/osx-64::certifi-2020.4.5.1-py36_0
  libcxx             pkgs/main/osx-64::libcxx-4.0.1-hcfea43d_1
  libcxxabi          pkgs/main/osx-64::libcxxabi-4.0.1-hcfea43d_1
  libedit            pkgs/main/osx-64::libedit-3.1.20181209-hb402a30_0
  libffi             pkgs/main/osx-64::libffi-3.2.1-h475c297_4
  ncurses            pkgs/main/osx-64::ncurses-6.2-h0a44026_0
  openssl            pkgs/main/osx-64::openssl-1.1.1f-h1de35cc_0
  pip                pkgs/main/osx-64::pip-20.0.2-py36_1
  python             pkgs/main/osx-64::python-3.6.9-h359304d_0
  readline           pkgs/main/osx-64::readline-7.0-h1de35cc_5
  setuptools         pkgs/main/osx-64::setuptools-46.1.3-py36_0
  sqlite             pkgs/main/osx-64::sqlite-3.31.1-ha441bb4_0
  tk                 pkgs/main/osx-64::tk-8.6.8-ha441bb4_0
  wheel              pkgs/main/osx-64::wheel-0.34.2-py36_0
  xz                 pkgs/main/osx-64::xz-5.2.5-h1de35cc_0
  zlib               pkgs/main/osx-64::zlib-1.2.11-h1de35cc_3



Downloading and Extracting Packages
xz-5.2.5             | 282 KB    | ################################################################################################################################################################################################################################# | 100%
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate offline_rl
#
# To deactivate an active environment, use
#
#     $ conda deactivate

(base) ➜  Developer conda activate offline_rl
(offline_rl) ➜  Developer pip install git+https://github.com/rail-berkeley/offline_rl@master#egg=offline-rl

Collecting offline-rl
  Cloning https://github.com/rail-berkeley/offline_rl (to revision master) to /private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-install-718sfz13/offline-rl
  Running command git clone -q https://github.com/rail-berkeley/offline_rl /private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-install-718sfz13/offline-rl
Processing /Users/alpha/Library/Caches/pip/wheels/83/05/c5/c585041ad642c75ce98ecee9a930e34dab7eb64fd5305972be/gym-0.17.1-py3-none-any.whl
Collecting numpy
  Using cached numpy-1.18.2-cp36-cp36m-macosx_10_9_x86_64.whl (15.2 MB)
Collecting mujoco_py
  Downloading mujoco-py-2.0.2.9.tar.gz (777 kB)
     |████████████████████████████████| 777 kB 426 kB/s
  Installing build dependencies ... done
  WARNING: Missing build requirements in pyproject.toml for mujoco_py from https://files.pythonhosted.org/packages/a2/30/21abd0cf2734bf5f34a7a8967789b12dee55f1e51e9c1c60af1cba549123/mujoco-py-2.0.2.9.tar.gz#sha256=6ae20ca9509203758f5e30a7a4019cb2d581b6d40dc2c2669dbe3229cfdf05e8 (from offline-rl).
  WARNING: The project does not specify a build backend, and pip cannot fall back to setuptools without 'wheel'.
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
    Preparing wheel metadata ... done
Collecting h5py
  Using cached h5py-2.10.0-cp36-cp36m-macosx_10_6_intel.whl (3.0 MB)
Collecting mjrl@ git+git://github.com/aravindr93/mjrl@master#egg=mjrl
  Cloning git://github.com/aravindr93/mjrl (to revision master) to /private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-install-718sfz13/mjrl
  Running command git clone -q git://github.com/aravindr93/mjrl /private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-install-718sfz13/mjrl
Collecting six
  Using cached six-1.14.0-py2.py3-none-any.whl (10 kB)
Collecting cloudpickle<1.4.0,>=1.2.0
  Using cached cloudpickle-1.3.0-py2.py3-none-any.whl (26 kB)
Collecting scipy
  Using cached scipy-1.4.1-cp36-cp36m-macosx_10_6_intel.whl (28.5 MB)
Collecting pyglet<=1.5.0,>=1.4.0
  Using cached pyglet-1.5.0-py2.py3-none-any.whl (1.0 MB)
Collecting imageio>=2.1.2
  Using cached imageio-2.8.0-py3-none-any.whl (3.3 MB)
Collecting glfw>=1.4.0
  Using cached glfw-1.11.0-py2.py27.py3.py30.py31.py32.py33.py34.py35.py36.py37.py38-none-macosx_10_6_intel.whl (100 kB)
Collecting cffi>=1.10
  Using cached cffi-1.14.0-cp36-cp36m-macosx_10_9_x86_64.whl (174 kB)
Collecting Cython>=0.27.2
  Using cached Cython-0.29.16-cp36-cp36m-macosx_10_9_x86_64.whl (2.0 MB)
Collecting fasteners~=0.15
  Using cached fasteners-0.15-py2.py3-none-any.whl (23 kB)
Processing /Users/alpha/Library/Caches/pip/wheels/6e/9c/ed/4499c9865ac1002697793e0ae05ba6be33553d098f3347fb94/future-0.18.2-py3-none-any.whl
Collecting pillow
  Using cached Pillow-7.1.1-cp36-cp36m-macosx_10_10_x86_64.whl (2.2 MB)
Collecting pycparser
  Using cached pycparser-2.20-py2.py3-none-any.whl (112 kB)
Collecting monotonic>=0.1
  Using cached monotonic-1.5-py2.py3-none-any.whl (5.3 kB)
Building wheels for collected packages: offline-rl, mujoco-py, mjrl
  Building wheel for offline-rl (setup.py) ... done
  Created wheel for offline-rl: filename=offline_rl-1.0-py3-none-any.whl size=70315 sha256=00e52eb5520e1b573e8867a6181e78feaf798b2ef19bc2fb36df4436b197b4ca
  Stored in directory: /private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-ephem-wheel-cache-zykbok5i/wheels/75/a8/3e/ed95ca3abac0062f8d6cfc73e017e6b87c83926a15f4a7678a
  Building wheel for mujoco-py (PEP 517) ... error
  ERROR: Command errored out with exit status 1:
   command: /usr/local/Caskroom/miniconda/base/envs/offline_rl/bin/python /usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/site-packages/pip/_vendor/pep517/_in_process.py build_wheel /var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/tmppdxeqqa2
       cwd: /private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-install-718sfz13/mujoco-py
  Complete output (67 lines):
  running bdist_wheel
  running build

  You appear to be missing MuJoCo.  We expected to find the file here: /Users/alpha/.mujoco/mujoco200

  This package only provides python bindings, the library must be installed separately.

  Please follow the instructions on the README to install MuJoCo

      https://github.com/openai/mujoco-py#install-mujoco

  Which can be downloaded from the website

      https://www.roboti.us/index.html

  Traceback (most recent call last):
    File "/usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/site-packages/pip/_vendor/pep517/_in_process.py", line 257, in <module>
      main()
    File "/usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/site-packages/pip/_vendor/pep517/_in_process.py", line 240, in main
      json_out['return_val'] = hook(**hook_input['kwargs'])
    File "/usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/site-packages/pip/_vendor/pep517/_in_process.py", line 182, in build_wheel
      metadata_directory)
    File "/private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-build-env-_nw630od/overlay/lib/python3.6/site-packages/setuptools/build_meta.py", line 213, in build_wheel
      wheel_directory, config_settings)
    File "/private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-build-env-_nw630od/overlay/lib/python3.6/site-packages/setuptools/build_meta.py", line 198, in _build_with_temp_dir
      self.run_setup()
    File "/private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-build-env-_nw630od/overlay/lib/python3.6/site-packages/setuptools/build_meta.py", line 250, in run_setup
      self).run_setup(setup_script=setup_script)
    File "/private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-build-env-_nw630od/overlay/lib/python3.6/site-packages/setuptools/build_meta.py", line 143, in run_setup
      exec(compile(code, __file__, 'exec'), locals())
    File "setup.py", line 51, in <module>
      'Programming Language :: Python :: 3 :: Only',
    File "/private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-build-env-_nw630od/overlay/lib/python3.6/site-packages/setuptools/__init__.py", line 144, in setup
      return distutils.core.setup(**attrs)
    File "/usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/distutils/core.py", line 148, in setup
      dist.run_commands()
    File "/usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/distutils/dist.py", line 955, in run_commands
      self.run_command(cmd)
    File "/usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/distutils/dist.py", line 974, in run_command
      cmd_obj.run()
    File "/private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-build-env-_nw630od/normal/lib/python3.6/site-packages/wheel/bdist_wheel.py", line 223, in run
      self.run_command('build')
    File "/usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/distutils/cmd.py", line 313, in run_command
      self.distribution.run_command(command)
    File "/usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/distutils/dist.py", line 974, in run_command
      cmd_obj.run()
    File "setup.py", line 29, in run
      import mujoco_py  # noqa: force build
    File "/private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-install-718sfz13/mujoco-py/mujoco_py/__init__.py", line 3, in <module>
      from mujoco_py.builder import cymj, ignore_mujoco_warnings, functions, MujocoException
    File "/private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-install-718sfz13/mujoco-py/mujoco_py/builder.py", line 509, in <module>
      mujoco_path, key_path = discover_mujoco()
    File "/private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-install-718sfz13/mujoco-py/mujoco_py/utils.py", line 93, in discover_mujoco
      raise Exception(message)
  Exception:
  You appear to be missing MuJoCo.  We expected to find the file here: /Users/alpha/.mujoco/mujoco200

  This package only provides python bindings, the library must be installed separately.

  Please follow the instructions on the README to install MuJoCo

      https://github.com/openai/mujoco-py#install-mujoco

  Which can be downloaded from the website

      https://www.roboti.us/index.html

  ----------------------------------------
  ERROR: Failed building wheel for mujoco-py
  Building wheel for mjrl (setup.py) ... done
  Created wheel for mjrl: filename=mjrl-1.0.0-py3-none-any.whl size=53719 sha256=b900484aee5a924770e99e0b244716d68dff2bf350b32ad903110af39aebb280
  Stored in directory: /private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-ephem-wheel-cache-zykbok5i/wheels/12/ab/6b/8bf2aeb28732954b2b712465bed617ca07213ce1070053cb34
Successfully built offline-rl mjrl
Failed to build mujoco-py
ERROR: Could not build wheels for mujoco-py which use PEP 517 and cannot be installed directly
(offline_rl) ➜  Developer
(offline_rl) ➜  Developer ls ~/.mujoco
mjkey.txt       mujoco200_macos
(offline_rl) ➜  Developer cp ~/.mujoco/mujoco200_macos ~/.mujoco/mujoco200
cp: /Users/alpha/.mujoco/mujoco200_macos is a directory (not copied).
(offline_rl) ➜  Developer pip install git+https://github.com/rail-berkeley/offline_rl@master#egg=offline-rl
Collecting offline-rl
  Cloning https://github.com/rail-berkeley/offline_rl (to revision master) to /private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-install-_n8vei1l/offline-rl
  Running command git clone -q https://github.com/rail-berkeley/offline_rl /private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-install-_n8vei1l/offline-rl
Processing /Users/alpha/Library/Caches/pip/wheels/83/05/c5/c585041ad642c75ce98ecee9a930e34dab7eb64fd5305972be/gym-0.17.1-py3-none-any.whl
Collecting numpy
  Using cached numpy-1.18.2-cp36-cp36m-macosx_10_9_x86_64.whl (15.2 MB)
Collecting mujoco_py
  Using cached mujoco-py-2.0.2.9.tar.gz (777 kB)
  Installing build dependencies ... done
  WARNING: Missing build requirements in pyproject.toml for mujoco_py from https://files.pythonhosted.org/packages/a2/30/21abd0cf2734bf5f34a7a8967789b12dee55f1e51e9c1c60af1cba549123/mujoco-py-2.0.2.9.tar.gz#sha256=6ae20ca9509203758f5e30a7a4019cb2d581b6d40dc2c2669dbe3229cfdf05e8 (from offline-rl).
  WARNING: The project does not specify a build backend, and pip cannot fall back to setuptools without 'wheel'.
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
    Preparing wheel metadata ... done
Collecting h5py
  Using cached h5py-2.10.0-cp36-cp36m-macosx_10_6_intel.whl (3.0 MB)
Collecting mjrl@ git+git://github.com/aravindr93/mjrl@master#egg=mjrl
  Cloning git://github.com/aravindr93/mjrl (to revision master) to /private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-install-_n8vei1l/mjrl
  Running command git clone -q git://github.com/aravindr93/mjrl /private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-install-_n8vei1l/mjrl
Collecting pyglet<=1.5.0,>=1.4.0
  Using cached pyglet-1.5.0-py2.py3-none-any.whl (1.0 MB)
Collecting six
  Using cached six-1.14.0-py2.py3-none-any.whl (10 kB)
Collecting cloudpickle<1.4.0,>=1.2.0
  Using cached cloudpickle-1.3.0-py2.py3-none-any.whl (26 kB)
Collecting scipy
  Using cached scipy-1.4.1-cp36-cp36m-macosx_10_6_intel.whl (28.5 MB)
Collecting cffi>=1.10
  Using cached cffi-1.14.0-cp36-cp36m-macosx_10_9_x86_64.whl (174 kB)
Collecting fasteners~=0.15
  Using cached fasteners-0.15-py2.py3-none-any.whl (23 kB)
Collecting Cython>=0.27.2
  Using cached Cython-0.29.16-cp36-cp36m-macosx_10_9_x86_64.whl (2.0 MB)
Collecting imageio>=2.1.2
  Using cached imageio-2.8.0-py3-none-any.whl (3.3 MB)
Collecting glfw>=1.4.0
  Using cached glfw-1.11.0-py2.py27.py3.py30.py31.py32.py33.py34.py35.py36.py37.py38-none-macosx_10_6_intel.whl (100 kB)
Processing /Users/alpha/Library/Caches/pip/wheels/6e/9c/ed/4499c9865ac1002697793e0ae05ba6be33553d098f3347fb94/future-0.18.2-py3-none-any.whl
Collecting pycparser
  Using cached pycparser-2.20-py2.py3-none-any.whl (112 kB)
Collecting monotonic>=0.1
  Using cached monotonic-1.5-py2.py3-none-any.whl (5.3 kB)
Collecting pillow
  Using cached Pillow-7.1.1-cp36-cp36m-macosx_10_10_x86_64.whl (2.2 MB)
Building wheels for collected packages: offline-rl, mujoco-py, mjrl
  Building wheel for offline-rl (setup.py) ... done
  Created wheel for offline-rl: filename=offline_rl-1.0-py3-none-any.whl size=70315 sha256=96100150f0abebcd45a63382bd7948c9ac8d19ddde468f91008f227e9e1c8ff8
  Stored in directory: /private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-ephem-wheel-cache-wv7hgl2i/wheels/75/a8/3e/ed95ca3abac0062f8d6cfc73e017e6b87c83926a15f4a7678a
  Building wheel for mujoco-py (PEP 517) ... error
  ERROR: Command errored out with exit status 1:
   command: /usr/local/Caskroom/miniconda/base/envs/offline_rl/bin/python /usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/site-packages/pip/_vendor/pep517/_in_process.py build_wheel /var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/tmpq9x05kh0
       cwd: /private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-install-_n8vei1l/mujoco-py
  Complete output (67 lines):
  running bdist_wheel
  running build

  You appear to be missing MuJoCo.  We expected to find the file here: /Users/alpha/.mujoco/mujoco200

  This package only provides python bindings, the library must be installed separately.

  Please follow the instructions on the README to install MuJoCo

      https://github.com/openai/mujoco-py#install-mujoco

  Which can be downloaded from the website

      https://www.roboti.us/index.html

  Traceback (most recent call last):
    File "/usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/site-packages/pip/_vendor/pep517/_in_process.py", line 257, in <module>
      main()
    File "/usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/site-packages/pip/_vendor/pep517/_in_process.py", line 240, in main
      json_out['return_val'] = hook(**hook_input['kwargs'])
    File "/usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/site-packages/pip/_vendor/pep517/_in_process.py", line 182, in build_wheel
      metadata_directory)
    File "/private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-build-env-03tr64q6/overlay/lib/python3.6/site-packages/setuptools/build_meta.py", line 213, in build_wheel
      wheel_directory, config_settings)
    File "/private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-build-env-03tr64q6/overlay/lib/python3.6/site-packages/setuptools/build_meta.py", line 198, in _build_with_temp_dir
      self.run_setup()
    File "/private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-build-env-03tr64q6/overlay/lib/python3.6/site-packages/setuptools/build_meta.py", line 250, in run_setup
      self).run_setup(setup_script=setup_script)
    File "/private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-build-env-03tr64q6/overlay/lib/python3.6/site-packages/setuptools/build_meta.py", line 143, in run_setup
      exec(compile(code, __file__, 'exec'), locals())
    File "setup.py", line 51, in <module>
      'Programming Language :: Python :: 3 :: Only',
    File "/private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-build-env-03tr64q6/overlay/lib/python3.6/site-packages/setuptools/__init__.py", line 144, in setup
      return distutils.core.setup(**attrs)
    File "/usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/distutils/core.py", line 148, in setup
      dist.run_commands()
    File "/usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/distutils/dist.py", line 955, in run_commands
      self.run_command(cmd)
    File "/usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/distutils/dist.py", line 974, in run_command
      cmd_obj.run()
    File "/private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-build-env-03tr64q6/normal/lib/python3.6/site-packages/wheel/bdist_wheel.py", line 223, in run
      self.run_command('build')
    File "/usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/distutils/cmd.py", line 313, in run_command
      self.distribution.run_command(command)
    File "/usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/distutils/dist.py", line 974, in run_command
      cmd_obj.run()
    File "setup.py", line 29, in run
      import mujoco_py  # noqa: force build
    File "/private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-install-_n8vei1l/mujoco-py/mujoco_py/__init__.py", line 3, in <module>
      from mujoco_py.builder import cymj, ignore_mujoco_warnings, functions, MujocoException
    File "/private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-install-_n8vei1l/mujoco-py/mujoco_py/builder.py", line 509, in <module>
      mujoco_path, key_path = discover_mujoco()
    File "/private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-install-_n8vei1l/mujoco-py/mujoco_py/utils.py", line 93, in discover_mujoco
      raise Exception(message)
  Exception:
  You appear to be missing MuJoCo.  We expected to find the file here: /Users/alpha/.mujoco/mujoco200

  This package only provides python bindings, the library must be installed separately.

  Please follow the instructions on the README to install MuJoCo

      https://github.com/openai/mujoco-py#install-mujoco

  Which can be downloaded from the website

      https://www.roboti.us/index.html

  ----------------------------------------
  ERROR: Failed building wheel for mujoco-py
  Building wheel for mjrl (setup.py) ... done
  Created wheel for mjrl: filename=mjrl-1.0.0-py3-none-any.whl size=53719 sha256=1ac9f0f9dadd8206e0fd2aac9408051cca10ef373f34e763683b03501f42668e
  Stored in directory: /private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-ephem-wheel-cache-wv7hgl2i/wheels/12/ab/6b/8bf2aeb28732954b2b712465bed617ca07213ce1070053cb34
Successfully built offline-rl mjrl
Failed to build mujoco-py
ERROR: Could not build wheels for mujoco-py which use PEP 517 and cannot be installed directly
(offline_rl) ➜  Developer λσ /Users/alpha/.mujoco/mujoco200
zsh: command not found: λσ
(offline_rl) ➜  Developer ls /Users/alpha/.mujoco/mujoco200
ls: /Users/alpha/.mujoco/mujoco200: No such file or directory
(offline_rl) ➜  Developer ls /Users/alpha/.mujoco/
mjkey.txt       mujoco200_macos
(offline_rl) ➜  Developer cp ~/.m
(offline_rl) ➜  Developer ls
dotfiles                              keygen                                robotrader                            tex-uapd                              uncertainty-aware-policy-distillation yagw
dreamer                               pfduu                                 spinningup                            tom                                   website
(offline_rl) ➜  Developer cp -r ~/.mujoco/mujoco200_macos ~/.mujoco/mujoco200
(offline_rl) ➜  Developer ls ~/.mujoco
mjkey.txt       mujoco200       mujoco200_macos
(offline_rl) ➜  Developer pip install git+https://github.com/rail-berkeley/offline_rl@master#egg=offline-rl
Collecting offline-rl
  Cloning https://github.com/rail-berkeley/offline_rl (to revision master) to /private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-install-3hep6jfx/offline-rl
  Running command git clone -q https://github.com/rail-berkeley/offline_rl /private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-install-3hep6jfx/offline-rl
Processing /Users/alpha/Library/Caches/pip/wheels/83/05/c5/c585041ad642c75ce98ecee9a930e34dab7eb64fd5305972be/gym-0.17.1-py3-none-any.whl
Collecting numpy
  Using cached numpy-1.18.2-cp36-cp36m-macosx_10_9_x86_64.whl (15.2 MB)
Collecting mujoco_py
  Using cached mujoco-py-2.0.2.9.tar.gz (777 kB)
  Installing build dependencies ... done
  WARNING: Missing build requirements in pyproject.toml for mujoco_py from https://files.pythonhosted.org/packages/a2/30/21abd0cf2734bf5f34a7a8967789b12dee55f1e51e9c1c60af1cba549123/mujoco-py-2.0.2.9.tar.gz#sha256=6ae20ca9509203758f5e30a7a4019cb2d581b6d40dc2c2669dbe3229cfdf05e8 (from offline-rl).
  WARNING: The project does not specify a build backend, and pip cannot fall back to setuptools without 'wheel'.
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
    Preparing wheel metadata ... done
Collecting h5py
  Using cached h5py-2.10.0-cp36-cp36m-macosx_10_6_intel.whl (3.0 MB)
Collecting mjrl@ git+git://github.com/aravindr93/mjrl@master#egg=mjrl
  Cloning git://github.com/aravindr93/mjrl (to revision master) to /private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-install-3hep6jfx/mjrl
  Running command git clone -q git://github.com/aravindr93/mjrl /private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-install-3hep6jfx/mjrl
Collecting scipy
  Using cached scipy-1.4.1-cp36-cp36m-macosx_10_6_intel.whl (28.5 MB)
Collecting six
  Using cached six-1.14.0-py2.py3-none-any.whl (10 kB)
Collecting cloudpickle<1.4.0,>=1.2.0
  Using cached cloudpickle-1.3.0-py2.py3-none-any.whl (26 kB)
Collecting pyglet<=1.5.0,>=1.4.0
  Using cached pyglet-1.5.0-py2.py3-none-any.whl (1.0 MB)
Collecting fasteners~=0.15
  Using cached fasteners-0.15-py2.py3-none-any.whl (23 kB)
Collecting glfw>=1.4.0
  Using cached glfw-1.11.0-py2.py27.py3.py30.py31.py32.py33.py34.py35.py36.py37.py38-none-macosx_10_6_intel.whl (100 kB)
Collecting cffi>=1.10
  Using cached cffi-1.14.0-cp36-cp36m-macosx_10_9_x86_64.whl (174 kB)
Collecting Cython>=0.27.2
  Using cached Cython-0.29.16-cp36-cp36m-macosx_10_9_x86_64.whl (2.0 MB)
Collecting imageio>=2.1.2
  Using cached imageio-2.8.0-py3-none-any.whl (3.3 MB)
Processing /Users/alpha/Library/Caches/pip/wheels/6e/9c/ed/4499c9865ac1002697793e0ae05ba6be33553d098f3347fb94/future-0.18.2-py3-none-any.whl
Collecting monotonic>=0.1
  Using cached monotonic-1.5-py2.py3-none-any.whl (5.3 kB)
Collecting pycparser
  Using cached pycparser-2.20-py2.py3-none-any.whl (112 kB)
Collecting pillow
  Using cached Pillow-7.1.1-cp36-cp36m-macosx_10_10_x86_64.whl (2.2 MB)
Building wheels for collected packages: offline-rl, mujoco-py, mjrl
  Building wheel for offline-rl (setup.py) ... done
  Created wheel for offline-rl: filename=offline_rl-1.0-py3-none-any.whl size=70315 sha256=3673fc821caf94748e07b095ec7a1f87a14e0f37a68058e3e88f96a30b71920c
  Stored in directory: /private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-ephem-wheel-cache-dp_kivxz/wheels/75/a8/3e/ed95ca3abac0062f8d6cfc73e017e6b87c83926a15f4a7678a
  Building wheel for mujoco-py (PEP 517) ... done
  Created wheel for mujoco-py: filename=mujoco_py-2.0.2.9-py3-none-any.whl size=1689115 sha256=c6e49fa4a6ef64965936e6d7e9ad4baa796d3cf577666ced63639cd3017d1426
  Stored in directory: /Users/alpha/Library/Caches/pip/wheels/04/48/e0/82745eebaf57a4a96ff15db7fd4336aba0ac5a49c6e414d59f
  Building wheel for mjrl (setup.py) ... done
  Created wheel for mjrl: filename=mjrl-1.0.0-py3-none-any.whl size=53719 sha256=9a372215bd4965c701d78de53e757a775e28feeb7a0a93a3d6eacac73f8c3896
  Stored in directory: /private/var/folders/t4/2083q_815sl0312trk4k_4l00000gn/T/pip-ephem-wheel-cache-dp_kivxz/wheels/12/ab/6b/8bf2aeb28732954b2b712465bed617ca07213ce1070053cb34
Successfully built offline-rl mujoco-py mjrl
Installing collected packages: numpy, scipy, six, cloudpickle, future, pyglet, gym, monotonic, fasteners, glfw, pycparser, cffi, Cython, pillow, imageio, mujoco-py, h5py, mjrl, offline-rl
Successfully installed Cython-0.29.16 cffi-1.14.0 cloudpickle-1.3.0 fasteners-0.15 future-0.18.2 glfw-1.11.0 gym-0.17.1 h5py-2.10.0 imageio-2.8.0 mjrl-1.0.0 monotonic-1.5 mujoco-py-2.0.2.9 numpy-1.18.2 offline-rl-1.0 pillow-7.1.1 pycparser-2.20 pyglet-1.5.0 scipy-1.4.1 six-1.14.0
(offline_rl) ➜  Developer pip install ipython
Collecting ipython
  Using cached ipython-7.13.0-py3-none-any.whl (780 kB)
Processing /Users/alpha/Library/Caches/pip/wheels/b4/cb/f1/d142b3bb45d488612cf3943d8a1db090eb95e6687045ba61d1/backcall-0.1.0-py3-none-any.whl
Collecting pickleshare
  Using cached pickleshare-0.7.5-py2.py3-none-any.whl (6.9 kB)
Collecting jedi>=0.10
  Downloading jedi-0.17.0-py2.py3-none-any.whl (1.1 MB)
     |████████████████████████████████| 1.1 MB 559 kB/s
Collecting pexpect; sys_platform != "win32"
  Using cached pexpect-4.8.0-py2.py3-none-any.whl (59 kB)
Collecting traitlets>=4.2
  Using cached traitlets-4.3.3-py2.py3-none-any.whl (75 kB)
Requirement already satisfied: setuptools>=18.5 in /usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/site-packages (from ipython) (46.1.3.post20200330)
Collecting prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0
  Using cached prompt_toolkit-3.0.5-py3-none-any.whl (351 kB)
Collecting pygments
  Using cached Pygments-2.6.1-py3-none-any.whl (914 kB)
Collecting appnope; sys_platform == "darwin"
  Using cached appnope-0.1.0-py2.py3-none-any.whl (4.0 kB)
Collecting decorator
  Using cached decorator-4.4.2-py2.py3-none-any.whl (9.2 kB)
Collecting parso>=0.7.0
  Downloading parso-0.7.0-py2.py3-none-any.whl (100 kB)
     |████████████████████████████████| 100 kB 2.3 MB/s
Collecting ptyprocess>=0.5
  Using cached ptyprocess-0.6.0-py2.py3-none-any.whl (39 kB)
Requirement already satisfied: six in /usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/site-packages (from traitlets>=4.2->ipython) (1.14.0)
Collecting ipython-genutils
  Using cached ipython_genutils-0.2.0-py2.py3-none-any.whl (26 kB)
Collecting wcwidth
  Using cached wcwidth-0.1.9-py2.py3-none-any.whl (19 kB)
Installing collected packages: backcall, pickleshare, parso, jedi, ptyprocess, pexpect, ipython-genutils, decorator, traitlets, wcwidth, prompt-toolkit, pygments, appnope, ipython
Successfully installed appnope-0.1.0 backcall-0.1.0 decorator-4.4.2 ipython-7.13.0 ipython-genutils-0.2.0 jedi-0.17.0 parso-0.7.0 pexpect-4.8.0 pickleshare-0.7.5 prompt-toolkit-3.0.5 ptyprocess-0.6.0 pygments-2.6.1 traitlets-4.3.3 wcwidth-0.1.9
(offline_rl) ➜  Developer ipython
Python 3.6.9 |Anaconda, Inc.| (default, Jul 30 2019, 13:42:17)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.13.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import gym

In [2]: import offline_rl
objc[13133]: Class GLFWApplicationDelegate is implemented in both /Users/alpha/.mujoco/mujoco200/bin/libglfw.3.dylib (0x10a9d6778) and /usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/site-packages/glfw/libglfw.3.dylib (0x10ab166e8). One of the two will be used. Which one is undefined.
objc[13133]: Class GLFWWindowDelegate is implemented in both /Users/alpha/.mujoco/mujoco200/bin/libglfw.3.dylib (0x10a9d6700) and /usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/site-packages/glfw/libglfw.3.dylib (0x10ab16710). One of the two will be used. Which one is undefined.
objc[13133]: Class GLFWContentView is implemented in both /Users/alpha/.mujoco/mujoco200/bin/libglfw.3.dylib (0x10a9d67a0) and /usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/site-packages/glfw/libglfw.3.dylib (0x10ab16760). One of the two will be used. Which one is undefined.
objc[13133]: Class GLFWWindow is implemented in both /Users/alpha/.mujoco/mujoco200/bin/libglfw.3.dylib (0x10a9d6818) and /usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/site-packages/glfw/libglfw.3.dylib (0x10ab167d8). One of the two will be used. Which one is undefined.
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-3-ac190704bf6b> in <module>
----> 1 import offline_rl

/usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/site-packages/offline_rl/__init__.py in <module>
      1 import offline_rl.locomotion
----> 2 import offline_rl.hand_manipulation_suite
      3 import offline_rl.pointmaze
      4 import offline_rl.gym_minigrid
      5 import offline_rl.gym_mujoco

/usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/site-packages/offline_rl/hand_manipulation_suite/__init__.py in <module>
      2 from mjrl.envs.mujoco_env import MujocoEnv
      3 from offline_rl.hand_manipulation_suite.door_v0 import DoorEnvV0
----> 4 from offline_rl.hand_manipulation_suite.hammer_v0 import HammerEnvV0
      5 from offline_rl.hand_manipulation_suite.pen_v0 import PenEnvV0
      6 from offline_rl.hand_manipulation_suite.relocate_v0 import RelocateEnvV0

/usr/local/Caskroom/miniconda/base/envs/offline_rl/lib/python3.6/site-packages/offline_rl/hand_manipulation_suite/hammer_v0.py in <module>
      4 from mjrl.envs import mujoco_env
      5 from mujoco_py import MjViewer
----> 6 from ..utils.quatmath import quat2euler
      7 from .. import offline_env
      8 import os

ModuleNotFoundError: No module named 'offline_rl.utils'

This problem goes away when the repo is cloned locally and installed via pip install -e.

I suggest you replace the relative imports with absolute ones and this will be resolved!
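
To see which of the modules named in the traceback are actually present in the installed copy, a small diagnostic can help. This is only a sketch: `is_importable` is a hypothetical helper (not part of the package), and it only reports whether the import system can locate a module; it does not by itself say why a module is missing.

```python
import importlib.util


def is_importable(module_name):
    """Return True if the import system can locate `module_name`."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package is missing or fails to import.
        return False


if __name__ == "__main__":
    # Module names taken from the traceback above.
    for name in ("offline_rl", "offline_rl.utils", "offline_rl.utils.quatmath"):
        print(name, is_importable(name))
```

Running this once in the environment with the git-based install and once with the local `pip install -e` install should show where the two differ.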

Discrepancy between results reported in the CQL and D4RL papers

Hi,

I noticed that the results reported for this benchmark differ between the CQL paper and the D4RL paper. Since the two papers share some authors, could you please comment on which results should be used as the reference?
Compare Tables 1 and 2 in the CQL paper with Tables 1 and 3 in the D4RL paper.

CQL: Conservative Q-Learning for Offline Reinforcement Learning

About the results reported in the D4RL paper table 1

Hi guys,

I have a question about the results reported in the D4RL paper table 1.

Are the reported results the best undiscounted return during training, averaged over multiple random seeds?
Or are they the latest undiscounted return (at the 1000th training epoch?), averaged over multiple random seeds?
Or is it something else?

Please let me know which one you used in the paper!

Thank you!

Best,
Rui
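
The two reporting conventions asked about above can give quite different numbers, which a small NumPy sketch makes concrete. The array layout (one row per seed, one column per evaluation point) and the function name `summarize_returns` are assumptions for illustration; nothing here is taken from the paper's evaluation code.

```python
import numpy as np


def summarize_returns(returns):
    """Summarize returns of shape (num_seeds, num_evals).

    Each row holds the undiscounted evaluation returns of one seed,
    ordered by training time.
    """
    returns = np.asarray(returns, dtype=float)
    best_per_seed = returns.max(axis=1)  # best evaluation of each seed
    final_per_seed = returns[:, -1]      # last evaluation of each seed
    return {
        "mean_of_best": float(best_per_seed.mean()),
        "mean_of_final": float(final_per_seed.mean()),
    }
```

Taking the best evaluation per seed can never give a lower number than taking the final one, so it matters which convention a table uses.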

random-expert dataset?

Has the random-expert dataset used in the CQL paper been published for Hopper, Walker2d, and HalfCheetah? Thanks!

import error: free(): invalid pointer

When I import d4rl, I get the following error:
Warning: Flow failed to import. Set the environment variable D4RL_SUPPRESS_IMPORT_ERROR=1 to suppress this message.
No module named 'flow.envs'
free(): invalid pointer
Aborted (core dumped)
Any suggestions? Thank you.
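
For reference, the variable named in the warning can be set from Python before the first import of d4rl, as sketched below. Going by the wording of the warning, this only silences the message about Flow. It is not expected to fix the `free(): invalid pointer` abort, which happens in native code and cannot be caught with `try`/`except`.

```python
import os

# Assumption: the variable has to be set before d4rl is first imported.
os.environ["D4RL_SUPPRESS_IMPORT_ERROR"] = "1"

try:
    import d4rl  # noqa: F401
except ImportError as exc:
    # Only ordinary Python import failures end up here.
    print("d4rl is not importable:", exc)
```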

Episode information

Hello, could you help me figure out how to segment the dataset into trajectories/episodes?
For example, the halfcheetah datasets contain no terminal states, so I assumed that the first 1000 data points correspond to the first episode, the next 1000 to the second episode, and so on.

The qlearning_dataset function from
https://github.com/rail-berkeley/d4rl/blob/19ff42dfca15a7ecef38c380711da5164a86e26f/d4rl/__init__.py#L38
seems to support my guess.

But then when I look into the observations more closely, I get, for example for the halfcheetah-medium-v0 environment,
print(obs[9000])
array([-0.03586214, -0.11854211, -0.5709649 , 0.59237313, 0.25799525, 0.24009612, 0.04685834, -0.180547 , 3.6944256 , 1.2065067 , -1.4099166 , 1.4494822 , -6.9591885 , 15.26705 , 3.7576292 , 7.2117867 , -9.449336 ], dtype=float32)
Or
print(obs[10000])
array([ -0.09099437, 0.11028791, 0.32212004, -0.32907683, 0.66518635, -0.03972699, 0.13382532, -0.56899893, 6.4752765 , 0.26289114, -2.9174874 , 19.897604 , -20.984861 , 4.177299 , 20.101353 , -6.1993895 , 1.4919064 ], dtype=float32)

These do not look like initial observations. AFAIK, the initial observation for halfcheetah is the concatenation of qpos[1:] and qvel, which, upon reset, are small uniform noise around 0 and standard Gaussian noise around 0, respectively.

Any help would be greatly appreciated. Thanks!
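
The segmentation described above can be written down as a small helper, which makes it easier to test different assumptions about where episodes end. This is a sketch only: `episode_boundaries` is a hypothetical function, the optional `timeouts` array is an assumption (it may not exist in every dataset version), and the fixed-length fallback simply encodes the guess from this post rather than a documented property of the data.

```python
import numpy as np


def episode_boundaries(terminals, timeouts=None, max_episode_steps=1000):
    """Return a list of (start, end) index pairs, with end exclusive.

    An episode ends at a terminal flag, at a timeout flag (if given),
    or after max_episode_steps transitions, whichever comes first.
    """
    terminals = np.asarray(terminals, dtype=bool)
    n = len(terminals)
    if timeouts is None:
        timeouts = np.zeros(n, dtype=bool)
    else:
        timeouts = np.asarray(timeouts, dtype=bool)

    bounds, start = [], 0
    for i in range(n):
        length = i - start + 1
        if terminals[i] or timeouts[i] or length == max_episode_steps:
            bounds.append((start, i + 1))
            start = i + 1
    if start < n:
        bounds.append((start, n))  # trailing partial episode
    return bounds
```

Printing `observations[start]` for each returned pair then shows directly whether the assumed episode starts look like reset states.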

Invalid observations in maze2d-umaze dataset

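import gym
import d4rl

env = gym.make('maze2d-umaze-v1')
dataset = d4rl.qlearning_dataset(env)

# Count, per observation dimension, how many entries are exactly zero.
(dataset['observations'] == [0, 0, 0, 0]).sum(axis=0)
# Output: array([12459, 12459, 12459, 12459])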

What's the deal with those?
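
Note that the expression above counts zeros per column. To confirm that entire rows are zero, and to find where they sit in the dataset, a row-wise check is more direct. A minimal NumPy sketch, with `count_all_zero_rows` as a hypothetical helper name:

```python
import numpy as np


def count_all_zero_rows(observations):
    """Return (count, indices) of rows whose entries are all exactly zero."""
    observations = np.asarray(observations)
    zero_rows = np.all(observations == 0, axis=1)
    return int(zero_rows.sum()), np.flatnonzero(zero_rows)
```

The returned indices make it possible to check whether the zero rows are spread throughout the dataset or clustered in one place, for example at the end of the array.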

Data Coverage in UMaze Env

I visualized the state coverage of the provided data in the 2D maze environments. For the small UMaze environment the downloadable dataset seems to have a skewed data distribution where the agent never fully explores one of the sides:

The right shows a scatter plot of all positions in the downloaded dataset. Was there a bug in data generation or is this intended? (the data that is used to generate the GIF on the website does not seem to have this issue)

The other two maze environments seem to have full coverage of the maze:

I slightly modified the visualized_dataset.py script for these plots:

import argparse

import d4rl  # noqa: F401  (importing d4rl registers the environments with gym)
import gym
import matplotlib.pyplot as plt


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('--env_name', type=str, default='maze2d-umaze-v0')
    args = parser.parse_args()

    env = gym.make(args.env_name)

    dataset = env.get_dataset()
    if 'infos/qpos' not in dataset:
        raise ValueError('Only MuJoCo-based environments can be visualized')
    qpos = dataset['infos/qpos']
    qvel = dataset['infos/qvel']
    rewards = dataset['rewards']
    actions = dataset['actions']

    # Scatter the (x, y) position of every transition in the dataset.
    plt.scatter(qpos[:, 0], qpos[:, 1])
    plt.axis('equal')
    plt.show()
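
The scatter plot shows the skew qualitatively. To compare datasets numerically, the positions can be binned into a grid and the visited cells counted. This is a sketch under assumptions: `visited_fraction` is a hypothetical helper, the grid spans only the bounding box of the data, and cells that fall inside walls also count as unvisited, so the number is mainly useful for comparing datasets of the same maze.

```python
import numpy as np


def visited_fraction(xy, bins=20):
    """Return (fraction of grid cells with at least one sample, counts).

    xy is an array of shape (N, 2) holding x and y positions.
    """
    xy = np.asarray(xy, dtype=float)
    counts, _, _ = np.histogram2d(xy[:, 0], xy[:, 1], bins=bins)
    return float((counts > 0).mean()), counts
```

With the script above, `visited_fraction(qpos[:, :2])` would be the call to try. The returned `counts` array can also be passed to `plt.imshow` to view the density rather than individual points.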

Datasets incorrect? Expert data rewards seem worse than medium data

Hello, I'm noticing that the expert data is of very low quality as shown below. Could the datasets be incorrect, or am I processing the data incorrectly? Thanks!

import gym
import d4rl
import numpy as np

e_medium = gym.make('hopper-medium-v1')
data_medium = e_medium.get_dataset()
e_expert = gym.make('hopper-expert-v1')
data_expert = e_expert.get_dataset()

print(np.mean(data_medium['rewards']), np.mean(data_expert['rewards']))

Results in the following output which seems incorrect to me.

3.47  1.50
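
The comparison above uses the mean per-step reward, which is a different statistic from the per-episode return that results are usually reported in, since episodes can differ in length. A sketch for computing returns per episode follows. `episode_returns` is a hypothetical helper, and the `episode_ends` flags are an assumption: for these datasets they would have to be built from whatever end-of-episode markers the dataset provides. This does not by itself explain the numbers above; it only gives a second quantity to compare.

```python
import numpy as np


def episode_returns(rewards, episode_ends):
    """Sum rewards within each episode.

    episode_ends[i] is True if transition i is the last one of its episode.
    Rewards after the final end flag (a partial episode) are ignored.
    """
    rewards = np.asarray(rewards, dtype=float)
    episode_ends = np.asarray(episode_ends, dtype=bool)
    returns, total = [], 0.0
    for reward, end in zip(rewards, episode_ends):
        total += reward
        if end:
            returns.append(total)
            total = 0.0
    return np.array(returns)
```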

Failed to import flow

Hi! I successfully installed flow and d4rl. However, I got the error below. Do you have any idea how to resolve it?

Flow and SUMO installed successfully, and I can import flow from a separate python command as well. Thank you!

Python 3.7.4 (default, Aug 13 2019, 20:35:49)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import gym
>>> import d4rl
Warning: Flow failed to import. Set the environment variable D4RL_SUPPRESS_IMPORT_ERROR=1 to suppress this message.
/opt/ros/kinetic/lib/python2.7/dist-packages/cv2.so: undefined symbol: PyCObject_Type
*** Error in python': free(): invalid pointer: 0x00007fe5859776e0 *** ======= Backtrace: ========= /lib/x86_64-linux-gnu/libc.so.6(+0x777f5)[0x7fe5a7d747f5] /lib/x86_64-linux-gnu/libc.so.6(+0x8038a)[0x7fe5a7d7d38a] /lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7fe5a7d8158c] /home/aswin/anaconda3/envs/mujoco-gym/lib/python3.7/site-packages/carla/libcarla.cpython-37m-x86_64-linux-gnu.so(_ZNSt6locale5_Impl16_M_install_facetEPKNS_2idEPKNS_5facetE+0x129)[0x7fe5855d49d9] /home/aswin/anaconda3/envs/mujoco-gym/lib/python3.7/site-packages/carla/libcarla.cpython-37m-x86_64-linux-gnu.so(_ZNSt6locale5_ImplC1Em+0x1c6)[0x7fe5855d12d6] /home/aswin/anaconda3/envs/mujoco-gym/lib/python3.7/site-packages/carla/libcarla.cpython-37m-x86_64-linux-gnu.so(_ZNSt6locale18_S_initialize_onceEv+0x15)[0x7fe5855d2195] /lib/x86_64-linux-gnu/libpthread.so.0(+0xea99)[0x7fe5a80d5a99] /home/aswin/anaconda3/envs/mujoco-gym/lib/python3.7/site-packages/carla/libcarla.cpython-37m-x86_64-linux-gnu.so(_ZNSt6locale13_S_initializeEv+0x21)[0x7fe5855d21e1] /home/aswin/anaconda3/envs/mujoco-gym/lib/python3.7/site-packages/carla/libcarla.cpython-37m-x86_64-linux-gnu.so(_ZNSt6localeC1Ev+0x13)[0x7fe5855d2243] /home/aswin/anaconda3/envs/mujoco-gym/lib/python3.7/site-packages/carla/libcarla.cpython-37m-x86_64-linux-gnu.so(_ZNSt8ios_base4InitC1Ev+0xc1)[0x7fe5855d2fe1] /home/aswin/anaconda3/envs/mujoco-gym/lib/python3.7/site-packages/carla/libcarla.cpython-37m-x86_64-linux-gnu.so(+0x3122f2)[0x7fe5853d82f2] /lib64/ld-linux-x86-64.so.2(+0x106fa)[0x7fe5a85776fa] /lib64/ld-linux-x86-64.so.2(+0x1080b)[0x7fe5a857780b] /lib64/ld-linux-x86-64.so.2(+0x15922)[0x7fe5a857c922] /lib64/ld-linux-x86-64.so.2(+0x105a4)[0x7fe5a85775a4] /lib64/ld-linux-x86-64.so.2(+0x14de9)[0x7fe5a857bde9] /lib/x86_64-linux-gnu/libdl.so.2(+0xf09)[0x7fe5a7af9f09] /lib64/ld-linux-x86-64.so.2(+0x105a4)[0x7fe5a85775a4] /lib/x86_64-linux-gnu/libdl.so.2(+0x1571)[0x7fe5a7afa571] /lib/x86_64-linux-gnu/libdl.so.2(dlopen+0x31)[0x7fe5a7af9fa1] 
python(_PyImport_FindSharedFuncptr+0x8d)[0x556a4e86787d] python(_PyImport_LoadDynamicModuleWithSpec+0x140)[0x556a4e88b850] python(+0x22daa9)[0x556a4e88baa9] python(_PyMethodDef_RawFastCallDict+0x2b2)[0x556a4e7936a2] python(_PyCFunction_FastCallDict+0x21)[0x556a4e7937c1] python(_PyEval_EvalFrameDefault+0x5c17)[0x556a4e82f007] python(_PyEval_EvalCodeWithName+0x2f9)[0x556a4e772539] python(_PyFunction_FastCallKeywords+0x387)[0x556a4e7c1f57] python(_PyEval_EvalFrameDefault+0x4b39)[0x556a4e82df29] python(_PyFunction_FastCallKeywords+0xfb)[0x556a4e7c1ccb] python(_PyEval_EvalFrameDefault+0x6a3)[0x556a4e829a93] python(_PyFunction_FastCallKeywords+0xfb)[0x556a4e7c1ccb] python(_PyEval_EvalFrameDefault+0x416)[0x556a4e829806] python(_PyFunction_FastCallKeywords+0xfb)[0x556a4e7c1ccb] python(_PyEval_EvalFrameDefault+0x416)[0x556a4e829806] python(_PyFunction_FastCallKeywords+0xfb)[0x556a4e7c1ccb] python(_PyEval_EvalFrameDefault+0x416)[0x556a4e829806] python(_PyFunction_FastCallDict+0x10b)[0x556a4e77356b] python(+0x13414e)[0x556a4e79214e] python(_PyObject_CallMethodIdObjArgs+0xbd)[0x556a4e7ed68d] python(PyImport_ImportModuleLevelObject+0x29c)[0x556a4e77becc] python(_PyEval_EvalFrameDefault+0x2b49)[0x556a4e82bf39] python(_PyEval_EvalCodeWithName+0x2f9)[0x556a4e772539] python(PyEval_EvalCodeEx+0x44)[0x556a4e773424] python(PyEval_EvalCode+0x1c)[0x556a4e77344c] python(+0x1dae81)[0x556a4e838e81] python(_PyMethodDef_RawFastCallDict+0x2b2)[0x556a4e7936a2] python(_PyCFunction_FastCallDict+0x21)[0x556a4e7937c1] python(_PyEval_EvalFrameDefault+0x5c17)[0x556a4e82f007] python(_PyEval_EvalCodeWithName+0x2f9)[0x556a4e772539] python(_PyFunction_FastCallKeywords+0x387)[0x556a4e7c1f57] python(_PyEval_EvalFrameDefault+0x4b39)[0x556a4e82df29] python(_PyFunction_FastCallKeywords+0xfb)[0x556a4e7c1ccb] python(_PyEval_EvalFrameDefault+0x6a3)[0x556a4e829a93] python(_PyFunction_FastCallKeywords+0xfb)[0x556a4e7c1ccb] python(_PyEval_EvalFrameDefault+0x416)[0x556a4e829806] 
python(_PyFunction_FastCallKeywords+0xfb)[0x556a4e7c1ccb] python(_PyEval_EvalFrameDefault+0x416)[0x556a4e829806] python(_PyFunction_FastCallDict+0x10b)[0x556a4e77356b] python(+0x13414e)[0x556a4e79214e] python(_PyObject_CallMethodIdObjArgs+0xbd)[0x556a4e7ed68d] python(PyImport_ImportModuleLevelObject+0x29c)[0x556a4e77becc] python(_PyEval_EvalFrameDefault+0x2b49)[0x556a4e82bf39] ======= Memory map: ======== 556a4e65e000-556a4e6b9000 r--p 00000000 08:07 4726427 /home/aswin/anaconda3/envs/mujoco-gym/bin/python3.7 556a4e6b9000-556a4e895000 r-xp 0005b000 08:07 4726427 /home/aswin/anaconda3/envs/mujoco-gym/bin/python3.7 556a4e895000-556a4e93c000 r--p 00237000 08:07 4726427 /home/aswin/anaconda3/envs/mujoco-gym/bin/python3.7 556a4e93d000-556a4e940000 r--p 002de000 08:07 4726427 /home/aswin/anaconda3/envs/mujoco-gym/bin/python3.7 556a4e940000-556a4e9a9000 rw-p 002e1000 08:07 4726427 /home/aswin/anaconda3/envs/mujoco-gym/bin/python3.7 556a4e9a9000-556a4e9c9000 rw-p 00000000 00:00 0 556a4ec1a000-556a4f6b9000 rw-p 00000000 00:00 0 [heap] 7fe574000000-7fe574021000 rw-p 00000000 00:00 0 7fe574021000-7fe578000000 ---p 00000000 00:00 0 7fe57869a000-7fe5786a0000 r--p 00000000 08:07 2885855 /home/aswin/anaconda3/envs/mujoco-gym/lib/libpng16.so.16.37.0 7fe5786a0000-7fe5786c6000 r-xp 00006000 08:07 2885855 /home/aswin/anaconda3/envs/mujoco-gym/lib/libpng16.so.16.37.0 7fe5786c6000-7fe5786d1000 r--p 0002c000 08:07 2885855 /home/aswin/anaconda3/envs/mujoco-gym/lib/libpng16.so.16.37.0 7fe5786d1000-7fe5786d2000 r--p 00036000 08:07 2885855 /home/aswin/anaconda3/envs/mujoco-gym/lib/libpng16.so.16.37.0 7fe5786d2000-7fe5786d3000 rw-p 00037000 08:07 2885855 /home/aswin/anaconda3/envs/mujoco-gym/lib/libpng16.so.16.37.0 7fe57a418000-7fe57a422000 r-xp 00000000 08:07 3024489 /usr/lib/x86_64-linux-gnu/libnuma.so.1.0.0 7fe57a422000-7fe57a621000 ---p 0000a000 08:07 3024489 /usr/lib/x86_64-linux-gnu/libnuma.so.1.0.0 7fe57a621000-7fe57a622000 r--p 00009000 08:07 3024489 
/usr/lib/x86_64-linux-gnu/libnuma.so.1.0.0 7fe57a622000-7fe57a623000 rw-p 0000a000 08:07 3024489 /usr/lib/x86_64-linux-gnu/libnuma.so.1.0.0 7fe57a888000-7fe57c13e000 r-xp 00000000 08:07 3022340 /usr/lib/x86_64-linux-gnu/libicudata.so.55.1 7fe57c13e000-7fe57c33d000 ---p 018b6000 08:07 3022340 /usr/lib/x86_64-linux-gnu/libicudata.so.55.1 7fe57c33d000-7fe57c33e000 r--p 018b5000 08:07 3022340 /usr/lib/x86_64-linux-gnu/libicudata.so.55.1 7fe57c33e000-7fe57c33f000 rw-p 018b6000 08:07 3022340 /usr/lib/x86_64-linux-gnu/libicudata.so.55.1 7fe57c33f000-7fe57c363000 r-xp 00000000 08:07 3024133 /usr/lib/x86_64-linux-gnu/libgraphite2.so.3.0.1 7fe57c363000-7fe57c562000 ---p 00024000 08:07 3024133 /usr/lib/x86_64-linux-gnu/libgraphite2.so.3.0.1 7fe57c562000-7fe57c564000 r--p 00023000 08:07 3024133 /usr/lib/x86_64-linux-gnu/libgraphite2.so.3.0.1 7fe57c564000-7fe57c565000 rw-p 00025000 08:07 3024133 /usr/lib/x86_64-linux-gnu/libgraphite2.so.3.0.1 7fe57c565000-7fe57c572000 r--p 00000000 08:07 4470157 /home/aswin/anaconda3/envs/mujoco-gym/lib/libfreetype.so.6.16.1 7fe57c572000-7fe57c5e1000 r-xp 0000d000 08:07 4470157 /home/aswin/anaconda3/envs/mujoco-gym/lib/libfreetype.so.6.16.1 7fe57c5e1000-7fe57c608000 r--p 0007c000 08:07 4470157 /home/aswin/anaconda3/envs/mujoco-gym/lib/libfreetype.so.6.16.1 7fe57c608000-7fe57c609000 ---p 000a3000 08:07 4470157 /home/aswin/anaconda3/envs/mujoco-gym/lib/libfreetype.so.6.16.1 7fe57c609000-7fe57c610000 r--p 000a3000 08:07 4470157 /home/aswin/anaconda3/envs/mujoco-gym/lib/libfreetype.so.6.16.1 7fe57c610000-7fe57c611000 rw-p 000aa000 08:07 4470157 /home/aswin/anaconda3/envs/mujoco-gym/lib/libfreetype.so.6.16.1 7fe57c611000-7fe57c67f000 r-xp 00000000 08:07 4461934 /lib/x86_64-linux-gnu/libpcre.so.3.13.2 7fe57c67f000-7fe57c87f000 ---p 0006e000 08:07 4461934 /lib/x86_64-linux-gnu/libpcre.so.3.13.2 7fe57c87f000-7fe57c880000 r--p 0006e000 08:07 4461934 /lib/x86_64-linux-gnu/libpcre.so.3.13.2 7fe57c880000-7fe57c881000 rw-p 0006f000 08:07 4461934 
/lib/x86_64-linux-gnu/libpcre.so.3.13.2 7fe57ca89000-7fe57cac4000 r-xp 00000000 08:07 4594927 /home/aswin/anaconda3/envs/mujoco-gym/lib/libjpeg.so.9.2.0 7fe57cac4000-7fe57ccc3000 ---p 0003b000 08:07 4594927 /home/aswin/anaconda3/envs/mujoco-gym/lib/libjpeg.so.9.2.0 7fe57ccc3000-7fe57ccc4000 r--p 0003a000 08:07 4594927 /home/aswin/anaconda3/envs/mujoco-gym/lib/libjpeg.so.9.2.0 7fe57ccc4000-7fe57ccc5000 rw-p 0003b000 08:07 4594927 /home/aswin/anaconda3/envs/mujoco-gym/lib/libjpeg.so.9.2.0 7fe57ccc5000-7fe57cccf000 r--p 00000000 08:07 2626091 /home/aswin/anaconda3/envs/mujoco-gym/lib/libzstd.so.1.3.7 7fe57cccf000-7fe57cd5b000 r-xp 0000a000 08:07 2626091 /home/aswin/anaconda3/envs/mujoco-gym/lib/libzstd.so.1.3.7 7fe57cd5b000-7fe57cd67000 r--p 00096000 08:07 2626091 /home/aswin/anaconda3/envs/mujoco-gym/lib/libzstd.so.1.3.7 7fe57cd67000-7fe57cd68000 ---p 000a2000 08:07 2626091 /home/aswin/anaconda3/envs/mujoco-gym/lib/libzstd.so.1.3.7 7fe57cd68000-7fe57cd69000 r--p 000a2000 08:07 2626091 /home/aswin/anaconda3/envs/mujoco-gym/lib/libzstd.so.1.3.7 7fe57cd69000-7fe57cd6a000 rw-p 000a3000 08:07 2626091 /home/aswin/anaconda3/envs/mujoco-gym/lib/libzstd.so.1.3.7 7fe580881000-7fe58128e000 r-xp 00000000 08:07 3017762 /usr/lib/x86_64-linux-gnu/libx265.so.79 7fe58128e000-7fe58148d000 ---p 00a0d000 08:07 3017762 /usr/lib/x86_64-linux-gnu/libx265.so.79 7fe58148d000-7fe581490000 r--p 00a0c000 08:07 3017762 /usr/lib/x86_64-linux-gnu/libx265.so.79 7fe581490000-7fe581493000 rw-p 00a0f000 08:07 3017762 /usr/lib/x86_64-linux-gnu/libx265.so.79 7fe581493000-7fe5814a0000 rw-p 00000000 00:00 0 7fe581e78000-7fe581edc000 r-xp 00000000 08:07 3024545 /usr/lib/x86_64-linux-gnu/libpcre16.so.3.13.2 7fe581edc000-7fe5820dc000 ---p 00064000 08:07 3024545 /usr/lib/x86_64-linux-gnu/libpcre16.so.3.13.2 7fe5820dc000-7fe5820dd000 r--p 00064000 08:07 3024545 /usr/lib/x86_64-linux-gnu/libpcre16.so.3.13.2 7fe5820dd000-7fe5820de000 rw-p 00065000 08:07 3024545 /usr/lib/x86_64-linux-gnu/libpcre16.so.3.13.2 
7fe5820de000-7fe58225d000 r-xp 00000000 08:07 3022329 /usr/lib/x86_64-linux-gnu/libicuuc.so.55.1 7fe58225d000-7fe58245d000 ---p 0017f000 08:07 3022329 /usr/lib/x86_64-linux-gnu/libicuuc.so.55.1 7fe58245d000-7fe58246d000 r--p 0017f000 08:07 3022329 /usr/lib/x86_64-linux-gnu/libicuuc.so.55.1 7fe58246d000-7fe58246e000 rw-p 0018f000 08:07 3022329 /usr/lib/x86_64-linux-gnu/libicuuc.so.55.1 7fe58246e000-7fe582472000 rw-p 00000000 00:00 0 7fe582472000-7fe5826c4000 r-xp 00000000 08:07 3022336 /usr/lib/x86_64-linux-gnu/libicui18n.so.55.1 7fe5826c4000-7fe5828c4000 ---p 00252000 08:07 3022336 /usr/lib/x86_64-linux-gnu/libicui18n.so.55.1 7fe5828c4000-7fe5828d3000 r--p 00252000 08:07 3022336 /usr/lib/x86_64-linux-gnu/libicui18n.so.55.1 7fe5828d3000-7fe5828d4000 rw-p 00261000 08:07 3022336 /usr/lib/x86_64-linux-gnu/libicui18n.so.55.1 7fe5828d4000-7fe582930000 r-xp 00000000 08:07 3024221 /usr/lib/x86_64-linux-gnu/libharfbuzz.so.0.10000.1 7fe582930000-7fe582b30000 ---p 0005c000 08:07 3024221 /usr/lib/x86_64-linux-gnu/libharfbuzz.so.0.10000.1 7fe582b30000-7fe582b31000 r--p 0005c000 08:07 3024221 /usr/lib/x86_64-linux-gnu/libharfbuzz.so.0.10000.1 7fe582b31000-7fe582b32000 rw-p 0005d000 08:07 3024221 /usr/lib/x86_64-linux-gnu/libharfbuzz.so.0.10000.1 7fe582b32000-7fe582c41000 r-xp 00000000 08:07 4461424 /lib/x86_64-linux-gnu/libglib-2.0.so.0.4800.2 7fe582c41000-7fe582e40000 ---p 0010f000 08:07 4461424 /lib/x86_64-linux-gnu/libglib-2.0.so.0.4800.2 7fe582e40000-7fe582e41000 r--p 0010e000 08:07 4461424 /lib/x86_64-linux-gnu/libglib-2.0.so.0.4800.2 7fe582e41000-7fe582e42000 rw-p 0010f000 08:07 4461424 /lib/x86_64-linux-gnu/libglib-2.0.so.0.4800.2 7fe582e42000-7fe582e43000 rw-p 00000000 00:00 0 7fe582e43000-7fe582e95000 r-xp 00000000 08:07 3016095 /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0.4800.2 7fe582e95000-7fe583094000 ---p 00052000 08:07 3016095 /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0.4800.2 7fe583094000-7fe583095000 r--p 00051000 08:07 3016095 
/usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0.4800.2 7fe583095000-7fe583096000 rw-p 00052000 08:07 3016095 /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0.4800.2 7fe583299000-7fe5832e3000 r-xp 00000000 08:07 3024313 /usr/lib/x86_64-linux-gnu/libjasper.so.1.0.0 7fe5832e3000-7fe5834e2000 ---p 0004a000 08:07 3024313 /usr/lib/x86_64-linux-gnu/libjasper.so.1.0.0 7fe5834e2000-7fe5834e3000 r--p 00049000 08:07 3024313 /usr/lib/x86_64-linux-gnu/libjasper.so.1.0.0 7fe5834e3000-7fe5834e7000 rw-p 0004a000 08:07 3024313 /usr/lib/x86_64-linux-gnu/libjasper.so.1.0.0 7fe5834e7000-7fe5834ee000 rw-p 00000000 00:00 0 7fe5834ee000-7fe5834f8000 r--p 00000000 08:07 3555323 /home/aswin/anaconda3/envs/mujoco-gym/lib/libtiff.so.5.5.0 7fe5834f8000-7fe58353d000 r-xp 0000a000 08:07 3555323 /home/aswin/anaconda3/envs/mujoco-gym/lib/libtiff.so.5.5.0 7fe58353d000-7fe583569000 r--p 0004f000 08:07 3555323 /home/aswin/anaconda3/envs/mujoco-gym/lib/libtiff.so.5.5.0 7fe583569000-7fe58356d000 r--p 0007a000 08:07 3555323 /home/aswin/anaconda3/envs/mujoco-gym/lib/libtiff.so.5.5.0 7fe58356d000-7fe58356e000 rw-p 0007e000 08:07 3555323 /home/aswin/anaconda3/envs/mujoco-gym/lib/libtiff.so.5.5.0 7fe58356e000-7fe583592000 r-xp 00000000 08:07 4461946 /lib/x86_64-linux-gnu/libpng12.so.0.54.0 7fe583592000-7fe583791000 ---p 00024000 08:07 4461946 /lib/x86_64-linux-gnu/libpng12.so.0.54.0 7fe583791000-7fe583792000 r--p 00023000 08:07 4461946 /lib/x86_64-linux-gnu/libpng12.so.0.54.0 7fe583792000-7fe583793000 rw-p 00024000 08:07 4461946 /lib/x86_64-linux-gnu/libpng12.so.0.54.0 7fe583793000-7fe5837ec000 r-xp 00000000 08:07 3024892 /usr/lib/x86_64-linux-gnu/libwebp.so.5.0.4 7fe5837ec000-7fe5839ec000 ---p 00059000 08:07 3024892 /usr/lib/x86_64-linux-gnu/libwebp.so.5.0.4 7fe5839ec000-7fe5839ed000 r--p 00059000 08:07 3024892 /usr/lib/x86_64-linux-gnu/libwebp.so.5.0.4 7fe5839ed000-7fe5839ef000 rw-p 0005a000 08:07 3024892 /usr/lib/x86_64-linux-gnu/libwebp.so.5.0.4 7fe5839ef000-7fe583a46000 r-xp 00000000 08:07 3029395 
/usr/lib/x86_64-linux-gnu/libjpeg.so.8.0.2 7fe583a46000-7fe583c46000 ---p 00057000 08:07 3029395 /usr/lib/x86_64-linux-gnu/libjpeg.so.8.0.2 7fe583c46000-7fe583c47000 r--p 00057000 08:07 3029395 /usr/lib/x86_64-linux-gnu/libjpeg.so.8.0.2 7fe583c47000-7fe583c48000 rw-p 00058000 08:07 3029395 /usr/lib/x86_64-linux-gnu/libjpeg.so.8.0.2 7fe5850c6000-7fe58573c000 r-xp 00000000 08:07 930533 /home/aswin/anaconda3/envs/mujoco-gym/lib/python3.7/site-packages/carla/libcarla.cpython-37m-x86_64-linux-gnu.so 7fe58573c000-7fe58593b000 ---p 00676000 08:07 930533 /home/aswin/anaconda3/envs/mujoco-gym/lib/python3.7/site-packages/carla/libcarla.cpython-37m-x86_64-linux-gnu.so 7fe58593b000-7fe58595c000 r--p 00675000 08:07 930533 /home/aswin/anaconda3/envs/mujoco-gym/lib/python3.7/site-packages/carla/libcarla.cpython-37m-x86_64-linux-gnu.so 7fe58595c000-7fe58596a000 rw-p 00696000 08:07 930533 /home/aswin/anaconda3/envs/mujoco-gym/lib/python3.7/site-packages/carla/libcarla.cpython-37m-x86_64-linux-gnu.so 7fe58596a000-7fe585979000 rw-p 00000000 00:00 0 7fe585979000-7fe585e3d000 r-xp 00000000 08:07 3017530 /usr/lib/x86_64-linux-gnu/libQt5Core.so.5.5.1 7fe585e3d000-7fe585e3e000 ---p 004c4000 08:07 3017530 /usr/lib/x86_64-linux-gnu/libQt5Core.so.5.5.1 7fe585e3e000-7fe585e4a000 r--p 004c4000 08:07 3017530 /usr/lib/x86_64-linux-gnu/libQt5Core.so.5.5.1 7fe585e4a000-7fe585e4b000 rw-p 004d0000 08:07 3017530 /usr/lib/x86_64-linux-gnu/libQt5Core.so.5.5.1 7fe585e4b000-7fe585e4f000 rw-p 00000000 00:00 0 7fe585e4f000-7fe586376000 r-xp 00000000 08:07 3017538 /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5.5.1 7fe586376000-7fe586377000 ---p 00527000 08:07 3017538 /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5.5.1 7fe586377000-7fe58638c000 r--p 00527000 08:07 3017538 /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5.5.1 7fe58638c000-7fe586392000 rw-p 0053c000 08:07 3017538 /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5.5.1 7fe586392000-7fe586397000 rw-p 00000000 00:00 0 7fe586397000-7fe5863c5000 r-xp 00000000 08:07 3017564 
/usr/lib/x86_64-linux-gnu/libQt5Test.so.5.5.1 7fe5863c5000-7fe5863c6000 r--p 0002d000 08:07 3017564 /usr/lib/x86_64-linux-gnu/libQt5Test.so.5.5.1 7fe5863c6000-7fe5863c7000 rw-p 0002e000 08:07 3017564 /usr/lib/x86_64-linux-gnu/libQt5Test.so.5.5.1 7fe5863c7000-7fe5863cb000 rw-p 00000000 00:00 0 7fe5863cb000-7fe586a24000 r-xp 00000000 08:07 3017543 /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5.5.1 7fe586a24000-7fe586a52000 r--p 00658000 08:07 3017543 /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5.5.1 7fe586a52000-7fe586a57000 rw-p 00686000 08:07 3017543 /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5.5.1 7fe586a57000-7fe586a58000 rw-p 00000000 00:00 0 7fe586c15000-7fe586c28000 r--p 00000000 08:07 4471369 /home/aswin/anaconda3/envs/mujoco-gym/lib/python3.7/site-packages/PIL/_imaging.cpython-37m-x86_64-linux-gnu.so 7fe586c28000-7fe586c7c000 r-xp 00013000 08:07 4471369 /home/aswin/anaconda3/envs/mujoco-gym/lib/python3.7/site-packages/PIL/_imaging.cpython-37m-x86_64-linux-gnu.so 7fe586c7c000-7fe586c8b000 r--p 00067000 08:07 4471369 /home/aswin/anaconda3/envs/mujoco-gym/lib/python3.7/site-packages/PIL/_imaging.cpython-37m-x86_64-linux-gnu.so 7fe586c8b000-7fe586c8f000 r--p 00075000 08:07 4471369 /home/aswin/anaconda3/envs/mujoco-gym/lib/python3.7/site-packages/PIL/_imaging.cpython-37m-x86_64-linux-gnu.so 7fe586c8f000-7fe586c92000 rw-p 00079000 08:07 4471369 /home/aswin/anaconda3/envs/mujoco-gym/lib/python3.7/site-packages/PIL/_imaging.cpython-37m-x86_64-linux-gnu.so 7fe586c92000-7fe586d93000 rw-p 00000000 00:00 0 7fe586d93000-7fe586f32000 r-xp 00000000 08:07 262409 /home/aswin/.mujoco/mujoco200_linux/bin/libmujoco200.so 7fe586f32000-7fe587132000 ---p 0019f000 08:07 262409 /home/aswin/.mujoco/mujoco200_linux/bin/libmujoco200.so 7fe587132000-7fe587133000 r--p 0019f000 08:07 262409 /home/aswin/.mujoco/mujoco200_linux/bin/libmujoco200.so 7fe587133000-7fe587141000 rw-p 001a0000 08:07 262409 /home/aswin/.mujoco/mujoco200_linux/bin/libmujoco200.so 7fe587141000-7fe587146000 rw-p 00000000 
00:00 0 7fe587146000-7fe587e49000 r-xp 00000000 08:07 1316166 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_core3.so.3.3.1 7fe587e49000-7fe588049000 ---p 00d03000 08:07 1316166 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_core3.so.3.3.1 7fe588049000-7fe588055000 r--p 00d03000 08:07 1316166 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_core3.so.3.3.1 7fe588055000-7fe58807b000 rw-p 00d0f000 08:07 1316166 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_core3.so.3.3.1 7fe58807b000-7fe588080000 rw-p 00000000 00:00 0 7fe588080000-7fe58a69f000 r-xp 00000000 08:07 1314922 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_imgproc3.so.3.3.1 7fe58a69f000-7fe58a89f000 ---p 0261f000 08:07 1314922 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_imgproc3.so.3.3.1 7fe58a89f000-7fe58a8b5000 r--p 0261f000 08:07 1314922 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_imgproc3.so.3.3.1 7fe58a8b5000-7fe58a8d7000 rw-p 02635000 08:07 1314922 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_imgproc3.so.3.3.1 7fe58a8d7000-7fe58a976000 rw-p 00000000 00:00 0 7fe58a976000-7fe58ad93000 r-xp 00000000 08:07 1315421 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_imgcodecs3.so.3.3.1 7fe58ad93000-7fe58af93000 ---p 0041d000 08:07 1315421 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_imgcodecs3.so.3.3.1 7fe58af93000-7fe58af99000 r--p 0041d000 08:07 1315421 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_imgcodecs3.so.3.3.1 7fe58af99000-7fe58afa4000 rw-p 00423000 08:07 1315421 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_imgcodecs3.so.3.3.1 7fe58afa4000-7fe58afa5000 rw-p 00000000 00:00 0 7fe58afd6000-7fe58afd7000 r-xp 00000000 08:07 3022503 /usr/lib/x86_64-linux-gnu/libX11-xcb.so.1.0.0 7fe58afd7000-7fe58b1d6000 ---p 00001000 08:07 3022503 /usr/lib/x86_64-linux-gnu/libX11-xcb.so.1.0.0 7fe58b1d6000-7fe58b1d7000 r--p 00000000 08:07 3022503 /usr/lib/x86_64-linux-gnu/libX11-xcb.so.1.0.0 7fe58b1d7000-7fe58b1d8000 rw-p 00001000 08:07 3022503 /usr/lib/x86_64-linux-gnu/libX11-xcb.so.1.0.0 
7fe58b1d8000-7fe58b216000 r-xp 00000000 08:07 1316242 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_highgui3.so.3.3.1 7fe58b216000-7fe58b415000 ---p 0003e000 08:07 1316242 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_highgui3.so.3.3.1 7fe58b415000-7fe58b418000 r--p 0003d000 08:07 1316242 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_highgui3.so.3.3.1 7fe58b418000-7fe58b419000 rw-p 00040000 08:07 1316242 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_highgui3.so.3.3.1 7fe58b419000-7fe58b470000 r-xp 00000000 08:07 1315411 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_flann3.so.3.3.1 7fe58b470000-7fe58b670000 ---p 00057000 08:07 1315411 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_flann3.so.3.3.1 7fe58b670000-7fe58b672000 r--p 00057000 08:07 1315411 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_flann3.so.3.3.1 7fe58b672000-7fe58b673000 rw-p 00059000 08:07 1315411 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_flann3.so.3.3.1 7fe58b673000-7fe58b74e000 r-xp 00000000 08:07 1315472 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_features2d3.so.3.3.1 7fe58b74e000-7fe58b94e000 ---p 000db000 08:07 1315472 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_features2d3.so.3.3.1 7fe58b94e000-7fe58b953000 r--p 000db000 08:07 1315472 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_features2d3.so.3.3.1 7fe58b953000-7fe58b955000 rw-p 000e0000 08:07 1315472 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_features2d3.so.3.3.1 7fe58b955000-7fe58b956000 rw-p 00000000 00:00 0 7fe58b956000-7fe58baf7000 r-xp 00000000 08:07 1315473 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_calib3d3.so.3.3.1 7fe58baf7000-7fe58bcf7000 ---p 001a1000 08:07 1315473 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_calib3d3.so.3.3.1 7fe58bcf7000-7fe58bcf9000 r--p 001a1000 08:07 1315473 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_calib3d3.so.3.3.1 7fe58bcf9000-7fe58bcfa000 rw-p 001a3000 08:07 1315473 /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_calib3d3.so.3.3.1 7fe58bcfa000-7fe58bcfb000 
rw-p 00000000 00:00 0

BEAR Benchmark

Hi Everyone

I am interested in benchmarking BEAR, but I cannot execute the corresponding script (antmaze_bear.py). I get the error message "No module named 'rlkit.torch.sac.bear'". Unfortunately, I cannot find a module corresponding to BEAR anywhere in rlkit. Can anyone tell me where I can find this code, or share their code with me?

Best regards
Nils

Unable to run antmaze_sac.py or antmaze_sac_rlkit.py

RLkit has changed a lot since these scripts were written; for example, FlattenMlp no longer exists.
Could you recommend an RLkit version that is compatible with your repository?

Thank you in advance. :)
Best

Question about the normalization of results and the benchmarked CQL performance in Table 2 (ICLR submission)

Hi,

Thanks so much for your work. I have a question about the normalization of results. In the Gym domain, for example, each result is normalized using the returns of the expert policy (SAC) and the random policy. But which reference numbers should we use? The Wiki page "Off policy evaluation" has a table that includes expert-policy and random-policy results; should we refer to these? Also, the expert-policy results there differ from the SAC results in Table 3 (ICLR), so which one should we use?

I also noticed that in Tables 2 and 3 (ICLR) the CQL result for 'hopper-medium' does not seem to match. Could you please confirm this (and perhaps also the CQL result for 'walker2d-medium')?

Thanks.
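
For readers following this question: the normalization discussed above rescales a raw return so that the random policy maps to 0 and the expert policy maps to 100. The sketch below only illustrates that arithmetic. The function name and the two reference returns are placeholders chosen for this example; they are not values from the paper or the wiki, and they do not answer which reference numbers are the right ones.

```python
def normalized_score(raw_return, random_return, expert_return):
    """Rescale a raw return so that random -> 0 and expert -> 100."""
    span = expert_return - random_return
    if span == 0:
        raise ValueError("expert and random reference returns must differ")
    return 100.0 * (raw_return - random_return) / span


# Placeholder reference returns, used only to illustrate the arithmetic.
RANDOM_RETURN = -20.0
EXPERT_RETURN = 180.0

print(normalized_score(80.0, RANDOM_RETURN, EXPERT_RETURN))  # 50.0
```

Because the result depends linearly on both reference returns, two tables built from different expert numbers will disagree even when the raw returns are identical, which is why the choice of reference matters.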

Wrong URLs for downloading dataset

I got HTTPError: HTTP Error 404: Not Found when I tried to download some datasets.

In [56]: env = gym.make('halfcheetah-random-expert-v0')
In [57]: dataset = env.get_dataset()
Downloading dataset: http://rail.eecs.berkeley.edu/datasets/offline_rl/gym_mujoco/halfcheetah_random_expert.hdf5 to /root/d4rl_dataset/halfcheetah_random_expert.hdf5
---------------------------------------------------------------------------
HTTPError                 Traceback (most recent call last)
<ipython-input-57-3347b5d22c17> in <module>
----> 1 dataset = env.get_dataset()

~/matsushima/furuta/d4rl/d4rl/offline_env.py in get_dataset(self, h5path)
   53       if not os.path.exists(self.dataset_filepath):
   54         print('Downloading dataset:', self._dataset_url, 'to', self.dataset_filepath)
---> 55         urllib.request.urlretrieve(self._dataset_url, self.dataset_filepath)
   56 
   57       if not os.path.exists(self.dataset_filepath):

~/.pyenv/versions/3.7.4/lib/python3.7/urllib/request.py in urlretrieve(url, filename, reporthook, data)
  245   url_type, path = splittype(url)
  246 
--> 247   with contextlib.closing(urlopen(url, data)) as fp:
  248     headers = fp.info()
  249 

~/.pyenv/versions/3.7.4/lib/python3.7/urllib/request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
  220   else:
  221     opener = _opener
--> 222   return opener.open(url, data, timeout)
  223 
  224 def install_opener(opener):

~/.pyenv/versions/3.7.4/lib/python3.7/urllib/request.py in open(self, fullurl, data, timeout)
  529     for processor in self.process_response.get(protocol, []):
  530       meth = getattr(processor, meth_name)
--> 531       response = meth(req, response)
  532 
  533     return response

~/.pyenv/versions/3.7.4/lib/python3.7/urllib/request.py in http_response(self, request, response)
  639     if not (200 <= code < 300):
  640       response = self.parent.error(
--> 641         'http', request, response, code, msg, hdrs)
  642 
  643     return response

~/.pyenv/versions/3.7.4/lib/python3.7/urllib/request.py in error(self, proto, *args)
  567     if http_err:
  568       args = (dict, 'default', 'http_error_default') + orig_args
--> 569       return self._call_chain(*args)
  570 
  571 # XXX probably also want an abstract factory that knows when it makes

~/.pyenv/versions/3.7.4/lib/python3.7/urllib/request.py in _call_chain(self, chain, kind, meth_name, *args)
  501     for handler in handlers:
  502       func = getattr(handler, meth_name)
--> 503       result = func(*args)
  504       if result is not None:
  505         return result

~/.pyenv/versions/3.7.4/lib/python3.7/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs)
  647 class HTTPDefaultErrorHandler(BaseHandler):
  648   def http_error_default(self, req, fp, code, msg, hdrs):
--> 649     raise HTTPError(req.full_url, code, msg, hdrs, fp)
  650 
  651 class HTTPRedirectHandler(BaseHandler):

HTTPError: HTTP Error 404: Not Found

The same error occurs for walker2d-random-expert-v0, door-only-cloned-v0, hammer-only-cloned-v0, pen-only-cloned-v0, and relocate-only-cloned-v0.
Perhaps some of these datasets have not been made available yet. If possible, could you check?
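
A quick way to tell a missing file from a local problem is to ask the server for the status of the dataset URL without downloading the file. This is a minimal sketch using only the standard library; `dataset_url_status` is a hypothetical helper written for this example and is not part of d4rl. It assumes the server answers HEAD requests.

```python
import urllib.error
import urllib.request


def dataset_url_status(url, opener=urllib.request.urlopen, timeout=10):
    """Return the HTTP status code for `url` without downloading the body.

    `opener` is injectable so the function can be exercised without network
    access; by default it is urllib.request.urlopen.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; report their code instead.
        return err.code
```

Passing the URL printed in the "Downloading dataset:" line of the log above to `dataset_url_status` would be expected to return 404 for as long as the server reports the file missing. Connection failures (for example `URLError`) are deliberately not caught, so they stay distinguishable from a 404.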

Missing antmaze evaluation environments

What are the environment IDs for the antmaze evaluation environments?

Running [env for env in gym.envs.registry.all() if 'eval' in env.id] after importing d4rl only returns the maze2d eval environments.

Unable to run antmaze_sac.py

Similar to #9, I believe an appropriate version of rlkit is also needed to run antmaze_sac.py. Is there an estimate of when a compatible version of rlkit will be made available?

In particular, the current code crashes at MdpPathCollector with the error: __init__() got an unexpected keyword argument 'sparse_reward'

BRAC dependencies versions seem to be incorrect.

I failed to run train_brac.py with the given requirements.txt. The following dependency versions work for me:

tensorflow==1.15.0
tensorflow-probability==0.8.0rc0
tf-agents==0.3.0

The previously pinned tensorflow==1.14.0 causes "module 'tensorflow' has no attribute 'TypeSpec'", and tensorflow-probability==0.7.0rc0 causes "TypeError: Tensor is unhashable if Tensor equality is enabled".

Clarification about Training and Evaluation Task Split

Hi,

Thanks for sharing this repository. It is great.
I'd like to ask about the "Training and Evaluation Task Split" in Appendix D and how the results in Tables 1 and 3 are reported. I am a bit confused about how this was done.
For simplicity, let's assume method A and Maze2D are being used. Which of the following correctly describes what was done in the paper?

A is trained on "maze2d-umaze-v1". Then the learned model is used to report results on "maze2d-eval-umaze-v1"? In other words, maze2d-eval-umaze-v1 is not used for training and is only used to report results?
A's hyperparameters are tuned on "maze2d-umaze-v1". Then A is trained with those hyperparameters and evaluated on "maze2d-eval-umaze-v1"? In other words, maze2d-eval-umaze-v1 is used for both training and evaluation?
Or some other scenario?

Thanks for your help.

Maze2d tasks don't have a goal location in the state

(Crosspost of this issue at d4rl_evaluations)

Hi,

I find it irritating that the observations in the maze2d tasks only contain the 2D positions and velocities. If the agent is not informed about the goal location (which can be found in info/goal in the dataset), it can't decide whether to go, e.g., left or right, as the goal might be on either side.

How was that dealt with in the experiments from the paper? Is the agent conditioned on the goal in some form?

Thanks,
-Justin

Easy way to compute maze coverage

Hi,

In the code, I saw that each maze environment has its own string representation, for example:

U_MAZE = \
        "#####\\"+\
        "#GOO#\\"+\
        "###O#\\"+\
        "#OOO#\\"+\
        "#####"

but what I haven't been able to find are the centers of each of the open positions and their corresponding height and width (for both the point maze and ant maze).

Thanks!
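
One way to approach this, sketched under loud assumptions: treat every non-wall symbol in the maze string as a square cell and derive its center from its row and column index. The mapping used here, center = ((col + 0.5) * cell_size, (row + 0.5) * cell_size), is a hypothetical convention chosen for illustration. The coordinate frame, axis order, offsets, and scaling actually used by the point maze and ant maze environments may differ and would need to be checked against the environment code. Both function names are made up for this example.

```python
# Same layout as the U_MAZE string quoted above: rows are separated by a backslash.
U_MAZE = "#####\\#GOO#\\###O#\\#OOO#\\#####"


def open_cell_centers(maze_spec, cell_size=1.0):
    """Return {(row, col): (x, y)} for every non-wall cell.

    Assumption (not taken from d4rl): cell (row, col) is a square of side
    `cell_size` centred at ((col + 0.5) * cell_size, (row + 0.5) * cell_size).
    """
    centers = {}
    for row, line in enumerate(maze_spec.split("\\")):
        for col, symbol in enumerate(line):
            if symbol != "#":  # 'O' (open) and 'G' (goal) both count as free space
                centers[(row, col)] = ((col + 0.5) * cell_size,
                                       (row + 0.5) * cell_size)
    return centers


def coverage(positions, maze_spec, cell_size=1.0):
    """Fraction of open cells that contain at least one (x, y) position."""
    open_cells = set(open_cell_centers(maze_spec, cell_size))
    visited = {(int(y // cell_size), int(x // cell_size)) for x, y in positions}
    return len(visited & open_cells) / len(open_cells)


print(len(open_cell_centers(U_MAZE)))  # 7
```

Under this convention every cell has width and height equal to `cell_size`, so coverage reduces to counting which open cells the visited positions fall into.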

Cannot install

When I try to install this, I encounter "ERROR: Command errored out with exit status 128: git clone -q git://github.com/deepmind/dm_control /tmp/pip-install-z7djkhsc/dm-control_750010d3edcd47c991d434681980f049 Check the logs for full command output."

Carla env and dataset issue

In the __init__.py file in the carla folder, the config for carla-lane-render-v0 environment is wrong.
The entry_point should be d4rl.carla:CarlaObsEnv instead of d4rl.carla:CarlaDictEnv; and the dataset_url should be the same as that of the carla-lane-v0 env. The current URL gives a 404 error.
