wahomekezia / demowith_issues

demowith_issues's Introduction

Demowith_issues

Kezia Wahome

demowith_issues's Issues

Running Popular RL algorithms

### 1. td3
I want to run these algorithms in a multiprocessing run for the purpose of collecting data.

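    import multiprocessing as mp

    import pandas as pd
    from tqdm import tqdm

    # test_td3, test_ddpg, test_sarsa, test_dqn and test_sac are assumed to be imported or defined above.
    # change the data file '../4738-learningfrommodels/benin.csv' to the location of the data file
    location1 = "https://gist.githubusercontent.com/slremy/5adf90df6f4c096258f8d8af0f3039dc/raw/d85ac5f204771657b015055eb3b6b9f5dcf33956/location1.csv"  # data from Togo
    location2 = "https://gist.githubusercontent.com/slremy/5adf90df6f4c096258f8d8af0f3039dc/raw/d85ac5f204771657b015055eb3b6b9f5dcf33956/location2.csv"  # data from Benin

    # task = Task.init(project_name="RLtesting/non_garage", task_name="all2")
    if __name__ == "__main__":  # needed where worker processes are spawned (e.g. macOS)
        frames = []  # one DataFrame per (algorithm, location) pair
        for algo in [test_td3, test_ddpg, test_sarsa, test_dqn, test_sac]:
            for location in [location1, location2]:
                with mp.Pool(processes=10) as pool:
                    results = pool.map(algo, [location] * 5)  # changed from 30 to 5
                df = pd.DataFrame(results)
                df["algorithm"] = algo.__name__
                # The URLs contain "location1"/"location2", not the country names.
                df["location"] = "location1" if "location1" in location else "location2"
                df.to_csv(df["algorithm"][0] + df["location"][0] + "_2")
                frames.append(df)  # keep the per-run list and the frame list separate

        # Write all runs, not only the last DataFrame.
        pd.concat(frames, ignore_index=True).to_csv("Results.csv")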

My first error when running td3 is:

```
File "/Users/keziawangeciwahome/miniconda3/envs/dqn/lib/python3.10/site-packages/Popular_RL_Algorithms/td3.py", line 22, in <module>
    from reacher import Reacher
ModuleNotFoundError: No module named 'reacher'
"""
```

The above exception was the direct cause of the following exception:

```
File "/Users/keziawangeciwahome/miniconda3/envs/dqn/lib/python3.10/multiprocessing/pool.py", line 774, in get
    raise self._value
ModuleNotFoundError: No module named 'reacher'
```

To solve this error, I ran pip install reacher and got this next error:

```
ImportError: cannot import name 'Reacher' from 'reacher' (unknown location)
```

I will pause td3 because of this error.

### 2. ddpg

The error is:

```
ModuleNotFoundError: No module named 'common.buffers'
"""
```

The above exception was the direct cause of the following exception:

```
  File "/Users/keziawangeciwahome/miniconda3/envs/dqn/lib/python3.10/multiprocessing/pool.py", line 774, in get
    raise self._value
ModuleNotFoundError: No module named 'common.buffers'
```

I will pause ddpg too.
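Before pausing, one way to see what Python actually resolves for these names is to ask the import system directly. The sketch below is only a diagnostic aid: `locate_module` is a hypothetical helper, and whether `reacher` and `common.buffers` are meant to be files shipped alongside `td3.py` rather than packages from PyPI is an assumption that this check can help confirm or rule out.

```python
import importlib.util


def locate_module(name):
    """Return the file a module would be imported from, or None if it is not found."""
    try:
        spec = importlib.util.find_spec(name)
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. "common" for "common.buffers") is missing.
        return None
    if spec is None:
        return None
    # origin is None for namespace packages, which have no file of their own.
    return spec.origin


for name in ["reacher", "common.buffers"]:
    print(name, "->", locate_module(name))
```

If a name prints `None`, Python either cannot find the module on the current search path or finds only a directory without a module file, which would be consistent with the "(unknown location)" hint in the ImportError.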

For this experiment, I will run the dqn, sarsa and sac algorithms in one location, collect their best rewards, then plot and evaluate their performance.

Installing Conda and GT4SD

First things first:

Installing Miniconda (a lightweight version of Anaconda)

Log into the remote server. If you have configured the virtual machine, this should log in automatically without asking for a password.
If not, these are the steps I followed: here

Download the Miniconda installer by running this command in the terminal:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh

Then run the installer:
bash miniconda.sh

To check the installed version, run conda -V

Getting Started!

Getting started

First things first: set up my virtual machine.
Commands used:
ssh username@IP_address
password
Note: The username and password were provided by the server admin.
The ssh command creates a communication channel between my local host and the server virtual machine (a Linux system).
This ensures I am working on Linux and creating my development environment in the Linux system.
Creating SSH Keys
ssh-keygen
Expected Output
Generating public/private rsa key pair.
Enter file in which to save the key (/home/username/.ssh/id_rsa):

Possible Error at this point
Check whether a directory named after your username exists under /home (cd /home, then ls -a). If it does not, the system reports that no such file or directory was found.
To resolve this, create it with mkdir username or ask the remote server admin to create it.

After creating the SSH key, I copied the SSH public key to my remote server using ssh-copy-id:
ssh-copy-id username@IP_address
cat ~/.ssh/id_rsa.pub
This will output the public key.

I copied the key and added it to GitHub:
Settings --> SSH and GPG keys
New SSH key

Authenticating to Your Server Using SSH Keys
ssh username@remote_host
This now allows me to log into the server without the account's password.
Last Step
I also created an SSH key for my local host.
With the local host as root user, I repeated the same process from ssh-keygen and saved the public key on GitHub.

This allows me to connect securely to GitHub and to the server without having to enter the remote username and password every time.

Resources I found useful to understand :

Another one

We were trying to plot the rewards after each episode.

These are some of the ideas we tried.

First, we created another file, visualize.py, in the directory ICLR23Workshop.

In this file we ran the following code to visualize the dqn function.

Our initial understanding was that the number of episodes is 10000, as set in the dqn function.
We tried this first.

This code would iterate 10000 times.
(We did not wait for it to finish.)

The code was:

```
import matplotlib.pyplot as plt

# importing the function we are running from collectdata
from collectdata import test_dqn

# creating a list to store the reward per episode
# Total episodes are 10000
rewards_per_episode = []
total_rewards = 0

# the loop that will run and save the rewards in the list
for episode in range(1, 10001):
    # run the test_dqn function and get the reward
    reward = test_dqn("https://gist.githubusercontent.com/slremy/5adf90df6f4c096258f8d8af0f3039dc/raw/d85ac5f204771657b015055eb3b6b9f5dcf33956/location1.csv")

    # add the reward to the total_rewards
    total_rewards += reward

    # if the episode number is a multiple of 500, append the average reward per episode
    # to the rewards_per_episode list and reset the total_rewards
    if episode % 500 == 0:
        rewards_per_episode.append(total_rewards / 500)
        total_rewards = 0

# declaring the range for plotting
episodes = range(500, 10001, 500)

# plotting the rewards_per_episode vs. episodes
plt.plot(episodes, rewards_per_episode)
plt.xlabel('Episodes')
plt.ylabel('Reward')
plt.title('Reward per Episode')
plt.show()
```
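The loop above calls test_dqn once per iteration, so if test_dqn itself runs the 10000 episodes set in the dqn function, the whole run is repeated 10000 times, which may explain why it never finished. The averaging step can be separated from the training call. Below is a minimal sketch, assuming a list of per-episode rewards is already available; `block_means` and `example_rewards` are hypothetical names, and the numbers are made up for illustration.

```python
def block_means(rewards, block_size):
    """Average consecutive blocks of rewards; a trailing partial block is dropped."""
    means = []
    for start in range(0, len(rewards) - block_size + 1, block_size):
        block = rewards[start:start + block_size]
        means.append(sum(block) / block_size)
    return means


# Hypothetical per-episode rewards standing in for real training output.
example_rewards = [float(i % 7) for i in range(2000)]

averaged = block_means(example_rewards, block_size=500)
# x-axis values derived from the data: the last episode number of each block.
episodes = [500 * (i + 1) for i in range(len(averaged))]
print(list(zip(episodes, averaged)))
```

Because `episodes` is built from `len(averaged)`, the x and y sequences always have the same length, whatever the number of recorded rewards.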

The second idea was to treat the 10000 episodes as actions making up a single episode, and to run a for loop n times over a range to collect the total rewards.

This was the code:


```
import matplotlib.pyplot as plt

# Importing the function test_dqn from collectdata module which we will be running
from collectdata import test_dqn

rewards = []   # Initializing an empty list to store rewards obtained in each episode

# Looping over 5 episodes
for episode in range(1, 6):
    # Calling the test_dqn function with the given URL as argument and storing the returned reward value
    reward = test_dqn("https://gist.githubusercontent.com/slremy/5adf90df6f4c096258f8d8af0f3039dc/raw/d85ac5f204771657b015055eb3b6b9f5dcf33956/location1.csv")
    # Appending the reward value to the rewards list
    rewards.append(reward)

episodes = [1, 2, 3, 4, 5]   # Initializing a list of episode numbers

# Plotting a line graph with episode numbers on x-axis and rewards on y-axis
plt.plot(episodes, rewards)
plt.xlabel('Episodes')   # Labeling the x-axis as 'Episodes'
plt.ylabel('Reward')   # Labeling the y-axis as 'Reward'
plt.title('Reward per Episode')   # Setting the title of the plot
plt.show()   # Displaying the plot on the screen
```

It plotted the graph, but the logic felt wrong, since we were treating the 10000 episodes as actions making up one episode (from our understanding).

Another idea we tried was to directly edit the collectdata.py file and add this code to the while loop in the dqn function:

    rewards_per_episode = []
    while not done:
        a = model.choose_action(s)
        s, r, done, truncated, info = env.step(a)
        episode_reward+=r
        rewards_per_episode.append(episode_reward)

And this was the visualize.py file:

```
# importing matplotlib
import matplotlib.pyplot as plt
# importing collectdata
import collectdata

num_episodes = 10000

collectdata.test_dqn("https://gist.githubusercontent.com/slremy/5adf90df6f4c096258f8d8af0f3039dc/raw/d85ac5f204771657b015055eb3b6b9f5dcf33956/location1.csv")

# Plot rewards per episode
plt.plot(range(num_episodes), collectdata.rewards_per_episode)
plt.xlabel('Episodes')
plt.ylabel('Reward')
plt.title('Reward per Episode')
plt.show()
```

This gave us a very interesting output: a list of only 10 rewards.

It could not plot the visualization because of this ValueError:

ValueError: x and y must have same first dimension, but have shapes (10000,) and (10,)
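The shapes in the error show that the x-axis was built from num_episodes (10000) while the recorded list holds only 10 values. One possible reading, which is an assumption, is that the list gained one entry per step of the while loop, so the 10 values are the running reward over the 10 steps of a single episode. The sketch below shows the difference between appending once per step and once per episode, and derives the x-axis from the data. `ToyEnv`, `collect_episode_rewards` and `plot_rewards` are hypothetical names, not the actual collectdata code, and the toy environment only mimics the five-value `step` return seen above.

```python
import random


class ToyEnv:
    """Hypothetical stand-in environment: every episode lasts exactly 10 steps."""

    def reset(self):
        self.t = 0
        return 0, {}

    def step(self, action):
        self.t += 1
        reward = random.random()  # made-up reward in [0, 1)
        done = self.t >= 10
        return self.t, reward, done, False, {}


def collect_episode_rewards(env, choose_action, num_episodes):
    """Return one total reward per episode (not one entry per step)."""
    rewards_per_episode = []
    for _ in range(num_episodes):
        s, info = env.reset()
        done = truncated = False
        episode_reward = 0.0
        while not (done or truncated):
            a = choose_action(s)
            s, r, done, truncated, info = env.step(a)
            episode_reward += r
        # Appended after the while loop, so once per episode.
        rewards_per_episode.append(episode_reward)
    return rewards_per_episode


def plot_rewards(rewards_per_episode):
    """Plot rewards against an x-axis derived from the data, so the shapes match."""
    import matplotlib.pyplot as plt  # imported here so collection works without matplotlib

    episodes = range(1, len(rewards_per_episode) + 1)
    plt.plot(episodes, rewards_per_episode)
    plt.xlabel('Episodes')
    plt.ylabel('Reward')
    plt.title('Reward per Episode')
    plt.show()


rewards = collect_episode_rewards(ToyEnv(), lambda s: 0, num_episodes=5)
print(len(rewards))
```

Calling `plot_rewards(rewards)` then plots one point per episode. With this structure, each pass through the while loop is one action (step), and each pass through the outer for loop is one episode.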

@ what can you advise on the actions and episodes this is running?
