Running Popular RL algorithms
### 1. td3
I want to run these algorithms in a multiprocessing run for the purpose of collecting data.
```
import multiprocessing as mp

import pandas as pd
from tqdm import tqdm

# Change the data file '../4738-learningfrommodels/benin.csv' to the location of the data file
location1 = "https://gist.githubusercontent.com/slremy/5adf90df6f4c096258f8d8af0f3039dc/raw/d85ac5f204771657b015055eb3b6b9f5dcf33956/location1.csv"  # data from togo
location2 = "https://gist.githubusercontent.com/slremy/5adf90df6f4c096258f8d8af0f3039dc/raw/d85ac5f204771657b015055eb3b6b9f5dcf33956/location2.csv"  # data from benin
# task = Task.init(project_name="RLtesting/non_garage", task_name="all2")

# test_td3, test_ddpg, test_sarsa, test_dqn and test_sac are assumed to be
# defined or imported elsewhere in this project.
if __name__ == "__main__":
    all_results = []
    for algo in [test_td3, test_ddpg, test_sarsa, test_dqn, test_sac]:  # ][1:3]
        for location in [location1, location2]:
            with mp.Pool(processes=10) as pool:
                results = pool.map(algo, [location] * 5)  # changed from 30 to 5
            df = pd.DataFrame(results)
            df["algorithm"] = algo.__name__
            # Compare against the variable: neither URL contains the word "togo"
            df["location"] = "location1" if location == location1 else "location2"
            df.to_csv(df["algorithm"][0] + df["location"][0] + "_2")
            all_results.append(df)
    # Write every run, not just the last DataFrame
    pd.concat(all_results).to_csv("Results.csv")
```
My first error when running td3 was:
```
File "/Users/keziawangeciwahome/miniconda3/envs/dqn/lib/python3.10/site-packages/Popular_RL_Algorithms/td3.py", line 22, in <module>
    from reacher import Reacher
ModuleNotFoundError: No module named 'reacher'
"""
```
The above exception was the direct cause of the following exception:
```
File "/Users/keziawangeciwahome/miniconda3/envs/dqn/lib/python3.10/multiprocessing/pool.py", line 774, in get
    raise self._value
ModuleNotFoundError: No module named 'reacher'
```
To solve this error, I ran pip install reacher
and then got this error:
ImportError: cannot import name 'Reacher' from 'reacher' (unknown location)
I will put td3 on pause because of this error.
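One way to narrow down import errors like these is to ask Python where (or whether) it resolves a module name. The helper below is a minimal sketch; `locate` is a hypothetical name, not part of Popular_RL_Algorithms. A result of "namespace package" would be consistent with the "(unknown location)" wording, which usually appears when the name resolves to a directory that has no `__init__.py`, though the traces above do not confirm that this is the cause here.

```python
import importlib.util


def locate(module_name):
    """Return where Python would load `module_name` from, or None if it cannot be found.

    `locate` is a hypothetical helper written for this note.
    """
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        # Raised, for example, when the parent package of a dotted name is missing
        return None
    if spec is None:
        return None
    # Namespace packages (directories without __init__.py) have no origin file
    return spec.origin or "namespace package (no __init__.py)"


for name in ["reacher", "common.buffers"]:
    print(name, "->", locate(name))
```

If a name prints `None`, the module is not on the import path of that environment; if it prints a path outside the Popular_RL_Algorithms folder, a different package with the same name is being picked up.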
### 2. ddpg
The error is:
```
ModuleNotFoundError: No module named 'common.buffers'
"""
```
The above exception was the direct cause of the following exception:
```
File "/Users/keziawangeciwahome/miniconda3/envs/dqn/lib/python3.10/multiprocessing/pool.py", line 774, in get
    raise self._value
ModuleNotFoundError: No module named 'common.buffers'
```
I will put ddpg on pause too.
For this experiment, I will run the dqn, sarsa and sac algorithms, collect their best rewards in one location, then plot and evaluate their performance.
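Instead of removing the failing algorithms from the list by hand, each call could be wrapped so that an import error is recorded and the remaining algorithms still run. This is only a sketch: `run_safely`, `collect`, `fake_dqn` and `fake_td3` are hypothetical names, and the fake functions stand in for the real `test_*` functions.

```python
from functools import partial


def run_safely(algo, location):
    """Call algo(location) and return a record; capture the exception instead of raising."""
    record = {"algorithm": algo.__name__, "location": location,
              "reward": None, "error": None}
    try:
        record["reward"] = algo(location)
    except Exception as exc:  # includes ModuleNotFoundError and ImportError
        record["error"] = repr(exc)
    return record


def collect(algos, locations, runs=5, map_fn=map):
    """Run every algorithm on every location `runs` times using `map_fn`."""
    records = []
    for algo in algos:
        for location in locations:
            records.extend(map_fn(partial(run_safely, algo), [location] * runs))
    return records


# Hypothetical stand-ins for the real test_* functions
def fake_dqn(location):
    return 1.0


def fake_td3(location):
    raise ModuleNotFoundError("No module named 'reacher'")


records = collect([fake_dqn, fake_td3], ["location1.csv"], runs=2)
for record in records:
    print(record)
```

With a process pool, `map_fn=pool.map` could be passed in place of the built-in `map`, provided the algorithm functions are defined at module level so they can be pickled.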
Installing Conda and GT4SD
First things first:
- Installing Miniconda (a lightweight version of Anaconda)
Log into the remote server. If you have already configured the virtual machine, this should log you in automatically without asking for a password.
If not, these are the steps I followed: here
Download the Miniconda installer by running this command in the terminal:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
After the installer has run, check the installed version with conda -V
Getting Started!
- First things first: set up my virtual machine
Commands used:
ssh username@IP_address
password
Note: The username and password were provided by the server admin.
ssh creates a communication channel between my local host and the server's virtual machine (a Linux system).
This ensures I am working on Linux and creating my development environment in the Linux system.
- Creating SSH Keys
ssh-keygen
Expected output:
Generating public/private rsa key pair.
Enter file in which to save the key (/home/username/.ssh/id_rsa):
Possible error at this point:
Check whether a directory named after your username exists under /home by running cd /home and then ls -a
If it is not there, the system throws an error saying that no such file or directory was found.
To resolve this, create it with mkdir username,
or ask the remote server admin to create it.
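The same check can be scripted. The sketch below assumes the default key location shown in the expected output above (`.ssh/id_rsa` under the home directory); `ssh_key_problems` is a hypothetical helper name.

```python
from pathlib import Path


def ssh_key_problems(home):
    """Return a list of human-readable problems with the SSH key setup under `home`."""
    home = Path(home)
    problems = []
    if not home.is_dir():
        # This is the "no such file or directory" case described above
        problems.append(f"home directory {home} does not exist")
        return problems
    if not (home / ".ssh" / "id_rsa.pub").is_file():
        problems.append("no public key at .ssh/id_rsa.pub; run ssh-keygen first")
    return problems


# Check the current user's home directory
print(ssh_key_problems(Path.home()) or "SSH key setup looks fine")
```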
- After creating the SSH key, I copied the SSH public key to my remote server using ssh-copy-id:
ssh-copy-id username@IP_address
cat ~/.ssh/id_rsa.pub
This outputs the public key.
I copied the key and added it to GitHub:
Settings --> SSH and GPG keys
Add new SSH key
- Authenticating to Your Server Using SSH Keys
ssh username@remote_host
This now allows me to log into the server without the account's password.
- Last Step
I also created an SSH key for my local host.
With the local host as root user, I repeated the same process from ssh-keygen
and saved the public key on GitHub.
This allows me to connect securely to GitHub and to the server without having to enter the remote username and password every time.
Resources I found useful to understand :
Another one
We were trying to plot the rewards after each episode.
These are some of the ideas we tried.
First, we created another file, visualize.py, in the directory ICLR23Workshop.
In this file we ran the following code to visualize the dqn function.
- Our initial understanding was that the number of episodes is 10000, as set in the dqn function.
We tried this first.
This code would have iterated 10000 times
(we did not wait for it to finish).
The code was:
```
import matplotlib.pyplot as plt

# Import the function we are running from collectdata
from collectdata import test_dqn

# List to store the average reward per block of episodes
# Total episodes are 10000
rewards_per_episode = []
total_rewards = 0

# The loop that runs test_dqn and saves the rewards in the list
for episode in range(1, 10001):
    # Run the test_dqn function and get the reward
    reward = test_dqn("https://gist.githubusercontent.com/slremy/5adf90df6f4c096258f8d8af0f3039dc/raw/d85ac5f204771657b015055eb3b6b9f5dcf33956/location1.csv")
    # Add the reward to total_rewards
    total_rewards += reward
    # If the episode number is a multiple of 500, append the average reward
    # over the last 500 episodes to rewards_per_episode and reset total_rewards
    if episode % 500 == 0:
        rewards_per_episode.append(total_rewards / 500)
        total_rewards = 0

# The range of episode numbers for plotting
episodes = range(500, 10001, 500)

# Plot rewards_per_episode vs. episodes
plt.plot(episodes, rewards_per_episode)
plt.xlabel('Episodes')
plt.ylabel('Reward')
plt.title('Reward per Episode')
plt.show()
```
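The averaging step in this first attempt does not depend on how the rewards are collected, so it can be checked on its own with a small list before committing to a long run. A minimal sketch, with `block_means` as a hypothetical helper name and a made-up reward list:

```python
def block_means(rewards, block_size):
    """Average consecutive, non-overlapping blocks of `block_size` rewards.

    A trailing partial block is ignored, matching the `episode % 500 == 0` check.
    """
    means = []
    for start in range(0, len(rewards) - block_size + 1, block_size):
        block = rewards[start:start + block_size]
        means.append(sum(block) / block_size)
    return means


# Made-up rewards, only to exercise the function
rewards = [1, 2, 3, 4, 5, 6]
means = block_means(rewards, block_size=2)
# x positions: the episode number at the end of each block
x = [2 * (i + 1) for i in range(len(means))]
print(x, means)
```

Because `x` is derived from `len(means)`, the two sequences passed to a plot always have the same length.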
- The second idea was to treat the 10000 episodes as single actions making up one episode, create a range, and have a for loop run n times to collect the total rewards.
This was the code:
```
import matplotlib.pyplot as plt

# Import the function test_dqn from the collectdata module, which we will be running
from collectdata import test_dqn

rewards = []  # Empty list to store the reward obtained in each episode

# Loop over 5 episodes
for episode in range(1, 6):
    # Call test_dqn with the given URL and store the returned reward value
    reward = test_dqn("https://gist.githubusercontent.com/slremy/5adf90df6f4c096258f8d8af0f3039dc/raw/d85ac5f204771657b015055eb3b6b9f5dcf33956/location1.csv")
    # Append the reward value to the rewards list
    rewards.append(reward)

episodes = [1, 2, 3, 4, 5]  # List of episode numbers

# Plot a line graph with episode numbers on the x-axis and rewards on the y-axis
plt.plot(episodes, rewards)
plt.xlabel('Episodes')  # Label the x-axis as 'Episodes'
plt.ylabel('Reward')  # Label the y-axis as 'Reward'
plt.title('Reward per Episode')  # Set the title of the plot
plt.show()  # Display the plot on the screen
```
It plotted the graph, but the logic felt wrong, since we were treating the 10000 episodes as actions making up one episode (from our understanding).
- Another idea we tried was to edit the collectdata.py file directly and add this code to the while loop in the dqn function:
```
rewards_per_episode = []
while not done:
    a = model.choose_action(s)
    s, r, done, truncated, info = env.step(a)
    episode_reward += r
    rewards_per_episode.append(episode_reward)
```
And this was the visualize.py file:
```
# Import matplotlib
import matplotlib.pyplot as plt
# Import collectdata
import collectdata

num_episodes = 10000
collectdata.test_dqn("https://gist.githubusercontent.com/slremy/5adf90df6f4c096258f8d8af0f3039dc/raw/d85ac5f204771657b015055eb3b6b9f5dcf33956/location1.csv")

# Plot rewards per episode
plt.plot(range(num_episodes), collectdata.rewards_per_episode)
plt.xlabel('Episodes')
plt.ylabel('Reward')
plt.title('Reward per Episode')
plt.show()
```
This gave us a very interesting output: a list of only 10 rewards.
It could not plot the visualization because of this ValueError:
ValueError: x and y must have same first dimension, but have shapes (10000,) and (10,)
@ What can you advise on the actions and episodes this is running?
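The ValueError arises because the x-axis was built from the hard-coded num_episodes = 10000 while the recorded list held only 10 values. The sketch below separates the two notions in question: an action is one pass through the inner while loop, and an episode is everything from reset until done. `ToyEnv` is a hypothetical stand-in with a fixed number of steps per episode, not the real environment, and `choose_action` stands in for `model.choose_action`.

```python
import random


class ToyEnv:
    """Hypothetical environment: every episode lasts exactly `horizon` steps."""

    def __init__(self, horizon=10, seed=0):
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0

    def reset(self):
        self.t = 0
        return 0, {}

    def step(self, action):
        self.t += 1
        reward = self.rng.random()
        done = self.t >= self.horizon
        return self.t, reward, done, False, {}


def run_episodes(env, choose_action, num_episodes):
    """Return (total reward per episode, number of actions per episode)."""
    rewards_per_episode = []
    steps_per_episode = []
    for _ in range(num_episodes):
        s, info = env.reset()
        done = False
        episode_reward = 0.0
        steps = 0
        while not done:  # one iteration = one action
            a = choose_action(s)
            s, r, done, truncated, info = env.step(a)
            done = done or truncated
            episode_reward += r
            steps += 1
        # Record once per episode, after the inner loop has finished
        rewards_per_episode.append(episode_reward)
        steps_per_episode.append(steps)
    return rewards_per_episode, steps_per_episode


rewards, steps = run_episodes(ToyEnv(horizon=10), lambda s: 0, num_episodes=3)
# Build the x-axis from the data so both sequences always have the same length
episodes = list(range(1, len(rewards) + 1))
print(episodes, steps)
```

Passing `episodes` and `rewards` to `plt.plot` then avoids the shape mismatch, whatever the number of recorded episodes turns out to be.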