aidudezzz / deepworlds

Examples and use cases using the deepbots framework (https://github.com/aidudezzz/deepbots) with the Webots robot simulator.

License: GNU General Public License v3.0

deepworlds's Introduction

Deepworlds is a support repository for the deepbots framework, containing examples of the framework's usage on the Webots robot simulator.

If the following sections feel overwhelming, feel free to start on our deepbots-tutorials repository for a beginner's in-depth introduction to the way the deepbots framework is used.

Run an example in Webots

Clone the repository using:

git clone https://github.com/aidudezzz/deepworlds.git

Install specific packages for each example you want to use by running the following:
```
pip install -r <path to requirements file>
```
You can find the requirements file of each example at its /requirements/<example-name>.txt path, e.g., under /examples/cartpole/cartpole_discrete/requirements/.
Through Webots, open the .wbt file of the example you are interested in and hit run to train the provided agent. You can find the .wbt files under /worlds/, e.g., /examples/cartpole/cartpole_discrete/worlds/.

For more information on the examples, refer to each one's README, and examine the code within their /controllers/ directory.

Some important notes

Each example might be split into discrete and continuous action space cases. The reason for this split is that depending on the action space, different kinds of reinforcement learning agents need to be used, and thus quite large changes are needed in the code.

Keep in mind that each example can have multiple solutions provided using the two schemes of deepbots (robot supervisor and emitter-receiver) and with different reinforcement learning agents, backends, etc.

We suggest starting your exploration from the discrete cartpole example using the robot supervisor scheme, as it is also the example used in the tutorial. The main class/controller implementation can be found here, and the corresponding tutorial to create it from scratch is here.

Directory structure

\deepworlds
    \examples
        \cartpole
            \cartpole_discrete
                \controllers
                \requirements
                \worlds
            \cartpole_continuous
                \controllers
                \requirements
                \worlds
            \...
        \find_and_avoid
            \find_and_avoid_continuous
                \controllers
                \requirements
                \worlds
            \...
        \pit_escape
            \pit_escape_discrete
                \controllers
                \requirements
                \worlds
            \...
        \(more examples)

Contributors ✨

Thanks goes to these wonderful people (emoji key):

Kostas Tsampazis
🐛 💻 📖 💡 🤔 🚧 📆 💬 👀

Manos Kirtas
🐛 💻 📖 💡 🤔 🚧 📆 💬 👀

RKJ
🤔

wakeupppp
🐛

Jiun Kai Yang
💻 📖 💡 🤔 👀 🚧 📆 🐛 💬

Nikolaos Kokkinis-Ntrenis
💻 📖 💡

This project follows the all-contributors specification. Contributions of any kind welcome!

Special thanks to Papanikolaou Evangelia for designing the project's logo!

deepworlds's Issues

Update all reset methods to use new Webots capabilities

Webots 2020a rev2 will fix some issues related to resetting the Webots world without resetting the controllers (https://cyberbotics.com/doc/reference/supervisor#wb_supervisor_simulation_reset).

When the new version is released, deepworlds examples should use this method to reset the world instead of reloading the robot node.

See aidudezzz/deepbots#25
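
The reset described above could look roughly like the following. This is a minimal sketch, not the code used in deepbots: `reset_world` is a hypothetical helper, and `RecordingSupervisor` is a stand-in object that only exists so the sketch runs outside Webots. The methods called on it (`getBasicTimeStep`, `simulationReset`, `simulationResetPhysics`, `step`) mirror the Webots Supervisor Python API.

```python
def reset_world(supervisor):
    """Reset the simulation in place instead of reloading the robot node."""
    timestep = int(supervisor.getBasicTimeStep())
    supervisor.simulationReset()         # restore the world to its initial state
    supervisor.simulationResetPhysics()  # clear residual velocities and forces
    supervisor.step(timestep)            # advance once so the reset takes effect
    return timestep


class RecordingSupervisor:
    """Hypothetical stand-in that records calls, so the sketch runs outside Webots."""

    def __init__(self, basic_timestep=32.0):
        self.basic_timestep = basic_timestep
        self.calls = []

    def getBasicTimeStep(self):
        return self.basic_timestep

    def simulationReset(self):
        self.calls.append("simulationReset")

    def simulationResetPhysics(self):
        self.calls.append("simulationResetPhysics")

    def step(self, timestep):
        self.calls.append(("step", timestep))
        return 0
```

Inside a real controller, the supervisor instance itself would be passed in place of the recording stand-in.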

Panda robot environment outputs IndexedFaceSet errors

Hello,

I was trying the panda environment from the example files,
but when I start Webots and the controller I get the following errors:

ERROR: '/home/ubuntu/Downloads/deepbots-env/examples/panda/panda_goal_reaching/worlds/panda_goal_reaching.wbt':44:9: error: Skipped unknown 'solid' field in IndexedFaceSet node.
ERROR: '/home/ubuntu/Downloads/deepbots-env/examples/panda/panda_goal_reaching/worlds/panda_goal_reaching.wbt':206:55: error: Skipped unknown 'scale' field in Solid node.
ERROR: '/home/ubuntu/Downloads/deepbots-env/examples/panda/panda_goal_reaching/worlds/panda_goal_reaching.wbt':284:63: error: Skipped unknown 'solid' field in IndexedFaceSet node.
ERROR: '/home/ubuntu/Downloads/deepbots-env/examples/panda/panda_goal_reaching/worlds/panda_goal_reaching.wbt':371:63: error: Skipped unknown 'solid' field in IndexedFaceSet node.
ERROR: '/home/ubuntu/Downloads/deepbots-env/examples/panda/panda_goal_reaching/worlds/panda_goal_reaching.wbt':409:57: error: Skipped unknown 'solid' field in IndexedFaceSet node.
ERROR: '/home/ubuntu/Downloads/deepbots-env/examples/panda/panda_goal_reaching/worlds/panda_goal_reaching.wbt':445:51: error: Skipped unknown 'solid' field in IndexedFaceSet node.
ERROR: '/home/ubuntu/Downloads/deepbots-env/examples/panda/panda_goal_reaching/worlds/panda_goal_reaching.wbt':481:45: error: Skipped unknown 'solid' field in IndexedFaceSet node.
ERROR: '/home/ubuntu/Downloads/deepbots-env/examples/panda/panda_goal_reaching/worlds/panda_goal_reaching.wbt':517:39: error: Skipped unknown 'solid' field in IndexedFaceSet node.
ERROR: '/home/ubuntu/Downloads/deepbots-env/examples/panda/panda_goal_reaching/worlds/panda_goal_reaching.wbt':553:33: error: Skipped unknown 'solid' field in IndexedFaceSet node.
ERROR: '/home/ubuntu/Downloads/deepbots-env/examples/panda/panda_goal_reaching/worlds/panda_goal_reaching.wbt':589:27: error: Skipped unknown 'solid' field in IndexedFaceSet node.
ERROR: '/home/ubuntu/Downloads/deepbots-env/examples/panda/panda_goal_reaching/worlds/panda_goal_reaching.wbt':625:21: error: Skipped unknown 'solid' field in IndexedFaceSet node.
ERROR: '/home/ubuntu/Downloads/deepbots-env/examples/panda/panda_goal_reaching/worlds/panda_goal_reaching.wbt':661:15: error: Skipped unknown 'solid' field in IndexedFaceSet node.
ERROR: '/home/ubuntu/Downloads/deepbots-env/examples/panda/panda_goal_reaching/worlds/panda_goal_reaching.wbt':693:3: error: Skipped unknown 'scale' field in Solid node.
ERROR: '/home/ubuntu/Downloads/deepbots-env/examples/panda/panda_goal_reaching/worlds/panda_goal_reaching.wbt':711:3: error: Skipped unknown 'scale' field in Solid node.
ERROR: '/home/ubuntu/Downloads/deepbots-env/examples/panda/panda_goal_reaching/worlds/panda_goal_reaching.wbt':729:3: error: Skipped unknown 'scale' field in Solid node.
ERROR: '/home/ubuntu/Downloads/deepbots-env/examples/panda/panda_goal_reaching/worlds/panda_goal_reaching.wbt':747:3: error: Skipped unknown 'scale' field in Solid node.
ERROR: '/home/ubuntu/Downloads/deepbots-env/examples/panda/panda_goal_reaching/worlds/panda_goal_reaching.wbt':765:3: error: Skipped unknown 'scale' field in Solid node.
ERROR: '/home/ubuntu/Downloads/deepbots-env/examples/panda/panda_goal_reaching/worlds/panda_goal_reaching.wbt':783:3: error: Skipped unknown 'scale' field in Solid node.
ERROR: '/home/ubuntu/Downloads/deepbots-env/examples/panda/panda_goal_reaching/worlds/panda_goal_reaching.wbt':801:3: error: Skipped unknown 'scale' field in Solid node.
ERROR: '/home/ubuntu/Downloads/deepbots-env/examples/panda/panda_goal_reaching/worlds/panda_goal_reaching.wbt':819:3: error: Skipped unknown 'scale' field in Solid node.
ERROR: '/home/ubuntu/Downloads/deepbots-env/examples/panda/panda_goal_reaching/worlds/panda_goal_reaching.wbt':837:3: error: Skipped unknown 'scale' field in Solid node.
ERROR: 'https://raw.githubusercontent.com/cyberbotics/webots/R2022b/projects/objects/lights/protos/ConstructionLamp.proto':25:5: error: Skipped unknown 'scale' field in Solid node.

For some reason the IndexedFaceSet node has been used instead of CadShape, which means that the entire content of the CAD Collada files has been copied literally into that Webots node (this is why the world .wbt file is 6 MB in size).
Could you upload the CAD files (the robot parts/links) to the repository? I would replace them in the robot and then open a pull request.

Several worlds still have old rotations

CartPole Discrete

cartPoleWorldEmitterReceiver.wbt new fixed one works ok, but uses NUE coordinate system, instead of ENU which is default (fix in #60)
cartPoleWorldRobotSupervisor.wbt is not updated at all - broken (fix in #66)
cartPoleWorldRobotSupervisorStableBaselines.wbt is not updated at all - broken (fix in #71)

CartPole Continuous

cartPoleWorldEmitterReceiver.wbt is not updated at all - broken (fixed in #59)
cartPoleWorldRobotSupervisor.wbt is not updated at all - broken (fixed in #67)

Find and Avoid

smallWorld.wbt is not updated at all, automatic rotation resolver partially fixes it (fix in #63 - Thank you @KelvinYang0320)

KHR-3HV

khr-3hv.wbt is not updated at all, seems to be working correctly (fix in #76 , along with a wealth of quality improvements)

Panda Goal Reaching

pandaRL_training.wbt is not updated at all, seems to be working correctly (fix in #61 - Thank you @KelvinYang0320)

Pit Escape

pit_escape.wbt fixed in #56 and works ok, but still uses NUE coordinate system, instead of ENU (same as cartpole emitter-receiver discrete)

Usage of snake_case

Make sure all deepworlds variables follow the snake_case convention.
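
As an illustration of the convention, a small helper can show what a camelCase name would look like in snake_case. This is only a sketch for spotting and renaming candidates; `to_snake_case` is a hypothetical name and is not part of deepworlds or deepbots.

```python
import re


def to_snake_case(name):
    """Convert a camelCase or PascalCase identifier to snake_case."""
    # Split before a capitalized word, e.g. "cartPole" -> "cart_Pole".
    partial = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", name)
    # Split any remaining lowercase/digit followed by an uppercase letter.
    partial = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", partial)
    return partial.lower()
```

Names that already follow the convention, such as `step_window`, pass through unchanged.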

Deep Mimic example

Relevant discussion here #18, suggestion by rohit-kumar-j.

Original Deep Mimic implementation

Basic Deep Mimic example could include a "teacher" cartpole robot that uses a PID controller and a "student" cartpole robot that is exactly the same as the existing cartpole example using RobotSupervisor, plus an emitter/receiver scheme to receive information from the "teacher" cartpole robot.
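
To make the "teacher" side concrete, here is a minimal textbook PID controller. It is a sketch under assumptions: the class name, the gains, and the choice of the pole angle as the measured quantity are all hypothetical and do not come from the repository.

```python
class PIDController:
    """Textbook PID controller: output = kp*e + ki*integral(e) + kd*de/dt."""

    def __init__(self, kp, ki, kd, setpoint=0.0):
        self.kp = kp
        self.ki = ki
        self.kd = kd
        self.setpoint = setpoint
        self.integral = 0.0
        self.prev_error = None

    def update(self, measurement, dt):
        """Return the control output for one measurement taken dt seconds after the last."""
        error = self.setpoint - measurement
        self.integral += error * dt
        # No derivative term on the first call, since there is no previous error yet.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

A teacher robot could call `update(pole_angle, dt)` every step and apply the output as motor velocity, while the student receives the teacher's state through the emitter/receiver channel.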

add constants at robot_supervisor_manager.py for all examples

We can solve the cross-import problem by adding the line if __name__ == '__main__': to robot_supervisor_manager.py; the constants will then be visible to the end user in robot_supervisor_manager.py.

Do you think it is better to add the constants to robot_supervisor_manager.py?
Originally posted by @ManosMagnus in #33 (comment)
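
A minimal sketch of the idea, assuming hypothetical constant names and values: the constants live at module level, and the entry point is guarded so that other modules can import the constants without starting a training run.

```python
# Sketch of a robot_supervisor_manager.py layout (names and values are hypothetical).
EPISODE_LIMIT = 10000
STEPS_PER_EPISODE = 200


def run():
    """Placeholder for the training entry point."""
    return f"training for {EPISODE_LIMIT} episodes of {STEPS_PER_EPISODE} steps"


# Only start training when the file is executed directly, not when it is imported.
if __name__ == "__main__":
    print(run())
```

With this layout, another module can import `EPISODE_LIMIT` from the manager without triggering `run()`, which avoids the circular execution that cross imports would otherwise cause.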

About panda robot demo

I tried to use the panda world file from the demo with ikpy to get the links, but I got the wrong number of links. Does anyone know the reason?

error: DDPG_runner.run()

Hello! My DDPG_runner.run() fails with an error like: Can't instantiate abstract class CartPoleSupervisor with abstract methods get_default_observation. PPO gives the same error. I don't know if it is caused by my setup (python3 + torch 1.6.0); have you seen a similar problem?
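
This kind of error comes from Python's `abc` machinery: it is raised whenever a subclass leaves an abstract method of its base class unimplemented, independently of the torch version. The sketch below reproduces it with hypothetical stand-in classes (`BaseEnv`, `IncompleteSupervisor`, `CompleteSupervisor`), not the actual deepbots classes.

```python
from abc import ABC, abstractmethod


class BaseEnv(ABC):
    """Hypothetical stand-in for a framework base class with abstract methods."""

    @abstractmethod
    def get_observations(self):
        ...

    @abstractmethod
    def get_default_observation(self):
        ...


class IncompleteSupervisor(BaseEnv):
    """Implements only one of the two abstract methods."""

    def get_observations(self):
        return [0.0] * 4


class CompleteSupervisor(IncompleteSupervisor):
    """Adds the missing method, so the class can be instantiated."""

    def get_default_observation(self):
        return [0.0] * 4


try:
    IncompleteSupervisor()
except TypeError as exc:
    error_message = str(exc)  # names the missing get_default_observation method
```

If the base class in the installed framework version declares a method that the example's supervisor does not define, adding that method to the supervisor is one way to resolve the error.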

A question about the behavior the agent finally converged to

Hi, sorry to trouble you again. In recent months I implemented a 'find-and-avoid' project of my own, following your ideas for the algorithm and other papers I read for the reward. However, the agent ends up going around in circles. After checking repeatedly, I think the most likely reason is reward sparsity, which commonly leads to poor exploration, so that the behavior gets stuck in a local optimum. Based on this, I would like to discuss two questions with you; I hope this will not take too much of your time!
Q1. Based on your experience, how much could it help to use curiosity-driven learning to address the poor exploration?
Q2. Another difference from your implementation is the action mask, which I have not deployed yet. May I ask how much improvement this method would bring?
Thank you very much! I'll wait for your reply.
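
For readers unfamiliar with the action mask mentioned in Q2, the usual idea is to give invalid actions zero probability before sampling. The sketch below shows one common way to do that on a list of logits; `masked_softmax` is a hypothetical helper and does not reproduce the find-and-avoid implementation.

```python
import math


def masked_softmax(logits, mask):
    """Softmax over logits where actions with a falsy mask entry get probability 0."""
    if not any(mask):
        raise ValueError("at least one action must be allowed")
    # Invalid actions are pushed to -inf so that exp() maps them to exactly 0.
    masked = [logit if allowed else -math.inf for logit, allowed in zip(logits, mask)]
    top = max(masked)  # subtract the maximum for numerical stability
    exps = [math.exp(value - top) for value in masked]
    total = sum(exps)
    return [value / total for value in exps]
```

For example, `masked_softmax([1.0, 2.0, 3.0], [True, False, True])` redistributes all probability mass over the first and third actions.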

Cartpole continuous directory typo

Directory is named cartpole_continous instead of cartpole_continuous

A question about the observation window

Hi, I am studying the 'find and avoid v2' implementation to improve my own project. I noticed that there are two variables, 'step_window' and 'seconds_window', documented in the original code as follows:

...
:param step_window: How many steps of observations to add in the observation window, defaults to 1
:type step_window: int, optional
:param seconds_window: How many seconds of observations to add in the observation window, defaults to 1
:type seconds_window: int, optional
...

This is how the minimum and maximum observation vectors are defined based on these two variables:

...
self.single_obs_size = len(single_obs_low)
obs_low = []
obs_high = []
for _ in range(self.step_window + self.seconds_window):
    obs_low.extend(single_obs_low)
    obs_high.extend(single_obs_high)
    self.obs_list.extend([0.0 for _ in range(self.single_obs_size)])
# Memory is used for creating the windows in get_observation()
self.obs_memory = [[0.0 for _ in range(self.single_obs_size)]
                   for _ in range((self.seconds_window * int(np.ceil(1000 / self.timestep))) +
                                  self.step_window)]
self.observation_counter_limit = int(np.ceil(1000 / self.timestep))
self.observation_counter = self.observation_counter_limit
...

According to the comment and the corresponding code, it seems that the two 'window' variables can be used to expand the memory batch size or even expand the size of the two bound vectors (obs_low and obs_high) of the Box() space class. However, I still do not understand why it is necessary to expand the vector size and the memory like this. Could you please enlighten me on the correct way to think about these two variables and how to use them? Many thanks!
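
One plausible reading of the quoted code, sketched under assumptions: the observation handed to the agent is a concatenation of the latest `step_window` single observations plus one observation for each of the last `seconds_window` seconds. That would explain why the bounds are repeated `step_window + seconds_window` times, while the memory must hold `seconds_window * ceil(1000 / timestep) + step_window` entries. The helper names below are hypothetical and this is not the repository's `get_observation()`.

```python
import math
from collections import deque


def make_memory(single_obs_size, step_window, seconds_window, timestep_ms):
    """Create a zero-filled memory long enough to look seconds_window seconds back."""
    steps_per_second = math.ceil(1000 / timestep_ms)
    length = seconds_window * steps_per_second + step_window
    memory = deque([[0.0] * single_obs_size for _ in range(length)], maxlen=length)
    return memory, steps_per_second


def windowed_observation(memory, step_window, seconds_window, steps_per_second):
    """Flatten the latest steps plus one observation per past second into one vector."""
    mem = list(memory)  # oldest first, newest last
    window = []
    # The most recent step_window observations.
    for obs in mem[len(mem) - step_window:]:
        window.extend(obs)
    # One observation from 1, 2, ..., seconds_window seconds before those.
    for second in range(1, seconds_window + 1):
        window.extend(mem[len(mem) - step_window - second * steps_per_second])
    return window
```

The resulting vector has `(step_window + seconds_window) * single_obs_size` entries, which matches the length of `obs_low` and `obs_high` in the quoted snippet. Stacking past observations this way gives a feed-forward agent some short-term and longer-term history without a recurrent network.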

Deep Mimic example in Webots?

Hi! Could you integrate/translate a deep mimic environment example from HERE.?

Add setup tool

General

The setup tool is a command-line wizard for creating new examples or easily using the existing ones.
The current structure is:

./examples
- ./examples/torch
- ./examples/torch/
Every example should have a description file.
- It should be written either in a markup language or as simple text.
Every example should have a requirements.txt.
- It should list only the imported libraries, nothing else.

Interface design

Generally, the deepworlds setup tool is going to work like a common command-line tool. Users will use it as an out-of-the-box wizard to create new worlds or use the existing ones.

Consequently, every example might have different worlds or even different solutions (supervisor, agent, robots). It is essential that the deepworlds setup tool knows these different aspects of every problem. As a result, every example should have a formatted string that can be interpreted by the setup tool. This string should be in JSON format or even a Python dictionary.

Name
Path
Short description
Long Description
Requirements (or the path only)
Authors
- Name
- Email
- (optional) Github Account
Webots version
Deepbots version
Worlds
Supervisors-Robots (array or dict)
(optional) Framework which is used
- Version of the framework
(optional) Future Work
(optional) Related papers
(optional) Other
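
The field list above could be expressed as a Python dictionary that also serializes to JSON. The sketch below is only an illustration: the key names, the placeholder values, and the `missing_fields` helper are all hypothetical, since the issue does not fix a schema.

```python
import json

# Keys the wizard would require; the optional fields are left out of this tuple.
REQUIRED_FIELDS = (
    "name", "path", "short_description", "long_description", "requirements",
    "authors", "webots_version", "deepbots_version", "worlds", "supervisors_robots",
)

# Placeholder values only, not real metadata of any example.
example_description = {
    "name": "cartpole_discrete",
    "path": "examples/cartpole/cartpole_discrete",
    "short_description": "placeholder short description",
    "long_description": "placeholder long description",
    "requirements": "requirements/<example-name>.txt",
    "authors": [{"name": "Jane Doe", "email": "jane@example.com"}],
    "webots_version": "<webots version>",
    "deepbots_version": "<deepbots version>",
    "worlds": ["<world file>.wbt"],
    "supervisors_robots": {"<supervisor>": ["<robot>"]},
}


def missing_fields(description):
    """Return the required fields that a description does not provide, in order."""
    return [field for field in REQUIRED_FIELDS if field not in description]


# The same description as a JSON string, as the setup tool might store it.
description_json = json.dumps(example_description, indent=2)
```

A wizard could call `missing_fields` on each example's description and prompt the developer for whatever is still absent.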

This information should be easily accessible to both developers and users. Developers should have the option to add this information when they are about to start a project; this could happen through a command-line wizard. Users, on the other hand, should be able to "scroll" through this information in order to choose what best fits their case. In addition, users should be able to start from a template world. This means that another directory named templates should also be added to the deepworlds repository.

The difference between templates and examples is that templates provide only a basic code structure for a fresh project, while examples are complete solutions to problems.

A problem about timestep in Deepbots

Hi, sorry to trouble you again, and I hope you are well. I ran into a problem while creating environments for reinforcement learning training.
I remember you once suggested that with deepbots we can construct worlds similar to many gym environments, in which the agent does not drive continuously but moves one cell at a time in a chosen direction (such as cliff walking, a typical grid world). Thanks to your suggestion, I am now trying to create a grid world using an arena with a black-and-white checkered floor.
My main idea is to create an action space in which each action moves the robot a constant distance, i.e., one grid cell forward, back, left, or right. However, I ran into a problem with the timestep.
As you know, timestep is an attribute of the Supervisor class, which is the superclass of all the related subclasses. When an iteration starts, the step() function of the subclass calls the step() function of the Supervisor class, which uses the timestep attribute to advance the 3D simulation along the timeline. The trouble is that if I create an action space with 'grid moving style' actions (such as moving one grid cell or rotating by a given angle in one iteration step), those functions also use the timestep. For example, this is the original step function in the Supervisor class:

    def step(self, action):
        self.handle_emitter(action)
        if super(Supervisor, self).step(self.timestep) == -1:
            exit()

        return (
            self.get_observations(),
            self.get_reward(action),
            self.is_done(),
            self.get_info(),
        )

And based on some resources online, one method to implement 'grid moving style' action can be achieved like this:

def turn_left(angle):
    # Spin in place: the wheels turn in opposite directions.
    l_speed = 1.0
    r_speed = -1.0
    leftMotor.setVelocity(l_speed)
    rightMotor.setVelocity(r_speed)

    # Keep stepping the simulation until the target angle (e.g. 1.5708 rad) is reached.
    while robot.step(timestep) != -1:
        if robot.getFromDef('agent').getField('rotation').getSFRotation()[3] >= angle:
            leftMotor.setVelocity(0.0)
            rightMotor.setVelocity(0.0)
            break

Here, timestep cannot be used simultaneously by two processes. Do you have any ideas for solving this problem, or any better way to create a grid world in Webots? Many thanks for any help or advice!
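
One possible way to think about this, sketched under assumptions: a single reinforcement learning step does not have to map to a single simulation step. The environment's step() can apply the action and then advance the simulation repeatedly until the grid move is finished, and only then compute the observation and reward. `run_until` and `FakeTurningRobot` below are hypothetical; the fake robot only exists so the sketch runs outside Webots.

```python
def run_until(step_simulation, timestep, is_finished, max_steps=1000):
    """Advance the simulation until is_finished() is true.

    Returns the number of simulation steps taken, or -1 if the simulation ended.
    """
    for steps_taken in range(1, max_steps + 1):
        if step_simulation(timestep) == -1:
            return -1
        if is_finished():
            return steps_taken
    return max_steps


class FakeTurningRobot:
    """Hypothetical stand-in for a Webots robot: turns 0.1 rad per simulation step."""

    def __init__(self):
        self.heading = 0.0

    def step(self, timestep):
        self.heading += 0.1
        return 0


robot = FakeTurningRobot()
# One "grid style" action: keep stepping until a quarter turn is completed.
steps = run_until(robot.step, 32, lambda: robot.heading >= 1.5708)
```

In a real controller, the simulation step function of the supervisor would be passed as `step_simulation`, and the environment would build its observation, reward, and done flag after `run_until` returns.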

Robot-supervisor: Discrete CartPole
- https://drive.google.com/drive/folders/1n6HeZWg9zxwABPxg3u0nSldvw5DfFD84?usp=sharing
Robot-supervisor: Discrete CartPole (SB3)
- https://drive.google.com/drive/folders/1s-ua1fPanaMTzEED48otAdBpfBGh3Keg?usp=sharing
Robot-supervisor: Continuous CartPole
- https://drive.google.com/drive/folders/1AQme2Z-kH4XHhH6aBudi-WkazSkodfjb?usp=sharing
Emitter-receiver: Discrete CartPole
- Task is not solved in 2000 episodes, but the agent works well in test mode and the score is always 197.
- https://drive.google.com/drive/folders/1P7VPB4ac-_5asCbWCSDtIencIDll8cwH?usp=sharing
Emitter-receiver: Continuous CartPole
- Task is not solved in 10000 episodes, but the agent works well in test mode and the score is always 197.
- https://drive.google.com/drive/folders/1P7VPB4ac-_5asCbWCSDtIencIDll8cwH?usp=sharing

Wrong argument name in docstring

action != message

deepworlds/examples/cartpole/cartpole_continuous/controllers/robot_supervisor_manager/robot_supervisor.py

Lines 153 to 159 in 5e815cf

 def apply_action(self, action):
     """
     This method uses the action list provided, which contains the next action to be executed by the robot.
     The message contains a float value that is applied on all wheels as velocity.

     :param message: The message the supervisor sent containing the next action.
     :type message: list of strings

Broken links in Cartpole README.md

OpenAI gym cartpole link broken as well as cartpole continuous project link

[KHR-3HV] Fix convergence issues

Currently, we still have a few training problems in KHR-3HV.
#76 (comment)

Normalization:
- normalize_to_range()
- #76 (comment)
Synchronization:
- #76 (comment)
Reward function:
- #76 (comment)
Warning messages:
- #76 (comment)
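
For the normalization item, a linear range mapping is the usual approach. The function below is a sketch with an assumed signature; it is not necessarily identical to the normalize_to_range() used in the repository.

```python
def normalize_to_range(value, min_val, max_val, new_min, new_max, clip=False):
    """Linearly map value from [min_val, max_val] to [new_min, new_max]."""
    value = float(value)
    scaled = (new_max - new_min) / (max_val - min_val) * (value - max_val) + new_max
    if clip:
        # Keep out-of-range inputs inside the target interval.
        scaled = max(new_min, min(new_max, scaled))
    return scaled
```

Without `clip=True`, inputs outside `[min_val, max_val]` map outside the target range, which can silently break observation bounds during training.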

Additional contributions

@all-contributors please add @KelvinYang0320 for projectManagement, bug, question

	def apply_action(self, action):
	"""
	This method uses the action list provided, which contains the next action to be executed by the robot.
	The message contains a float value that is applied on all wheels as velocity.

	:param message: The message the supervisor sent containing the next action.
	:type message: list of strings