Contributors: wahomekezia

demowith_issues's Issues

Running Popular RL algorithms

### 1. td3
I want to run these algorithms in a multiprocessing run to collect data.

    import multiprocessing as mp

    import pandas as pd
    from tqdm import tqdm
    # Assumption: the other test_* functions live in collectdata next to test_dqn.
    from collectdata import test_td3, test_ddpg, test_sarsa, test_dqn, test_sac

    # Change these to the location of your data file, e.g. '../4738-learningfrommodels/benin.csv'
    location1 = "https://gist.githubusercontent.com/slremy/5adf90df6f4c096258f8d8af0f3039dc/raw/d85ac5f204771657b015055eb3b6b9f5dcf33956/location1.csv"  # data from Togo
    location2 = "https://gist.githubusercontent.com/slremy/5adf90df6f4c096258f8d8af0f3039dc/raw/d85ac5f204771657b015055eb3b6b9f5dcf33956/location2.csv"  # data from Benin

    # task = Task.init(project_name="RLtesting/non_garage", task_name="all2")
    all_results = []  # one DataFrame per (algorithm, location) pair
    for algo in [test_td3, test_ddpg, test_sarsa, test_dqn, test_sac]:
        for location in [location1, location2]:
            with mp.Pool(processes=10) as pool:
                results = pool.map(algo, [location] * 5)  # changed from 30 to 5
            df = pd.DataFrame(results)
            df["algorithm"] = algo.__name__
            # Compare against the URL itself: neither URL contains the word "togo".
            df["location"] = "location1" if location == location1 else "location2"
            df.to_csv(df["algorithm"][0] + df["location"][0] + "_2")
            all_results.append(df)

    # Write every run to a single file once all algorithms have finished.
    pd.concat(all_results, ignore_index=True).to_csv("Results.csv")

My first error when running td3 is:

```
File "/Users/keziawangeciwahome/miniconda3/envs/dqn/lib/python3.10/site-packages/Popular_RL_Algorithms/td3.py", line 22, in <module>
    from reacher import Reacher
ModuleNotFoundError: No module named 'reacher'
"""
```

The above exception was the direct cause of the following exception:

```
File "/Users/keziawangeciwahome/miniconda3/envs/dqn/lib/python3.10/multiprocessing/pool.py", line 774, in get
    raise self._value
ModuleNotFoundError: No module named 'reacher'
```

To solve this error, I ran pip install reacher and then got this next error:

      ImportError: cannot import name 'Reacher' from 'reacher' (unknown location)

I will pause td3 because of this error.
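The two errors above suggest that td3.py expects a module named reacher that defines Reacher, and that the package installed by pip does not provide that name. A minimal sketch for checking where Python resolves a module from, and for putting a folder of local modules on the import path. It assumes reacher.py is a local file shipped next to td3.py (this layout is an assumption), and where_is and add_local_modules are hypothetical helper names:

```python
import importlib
import importlib.util
import sys
from pathlib import Path


def where_is(module_name):
    """Return where a top-level module would be loaded from, or None if it is not importable."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    # Namespace packages have no origin; report their search locations instead.
    return spec.origin or str(list(spec.submodule_search_locations or []))


def add_local_modules(directory):
    """Put a directory at the front of sys.path so its modules are found first."""
    path = str(Path(directory).resolve())
    if path not in sys.path:
        sys.path.insert(0, path)
    importlib.invalidate_caches()
    return path
```

Calling where_is("reacher") before and after add_local_modules(...) with the folder that holds td3.py shows whether the import now resolves to a local reacher.py, if the assumption about the layout holds.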

### 2. ddpg

The error is:

```
ModuleNotFoundError: No module named 'common.buffers'
"""
```

The above exception was the direct cause of the following exception:

```
  File "/Users/keziawangeciwahome/miniconda3/envs/dqn/lib/python3.10/multiprocessing/pool.py", line 774, in get
    raise self._value
ModuleNotFoundError: No module named 'common.buffers'
```

I will pause ddpg too.

For this experiment, I will run the dqn, sarsa, and sac algorithms, collect their best rewards in one location, plot their performance, and evaluate it.
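Since td3 and ddpg currently fail inside the worker, it can help to record a failure as data instead of letting it abort the whole collection. A minimal sketch, assuming each test_* function takes a location and returns a single numeric reward (the return type is an assumption; safe_call, collect, and best_rewards are hypothetical helper names):

```python
from functools import partial


def safe_call(algo, location):
    """Run one algorithm on one location and capture any exception as text."""
    try:
        return {"algorithm": algo.__name__, "location": location,
                "reward": algo(location), "error": None}
    except Exception as exc:  # includes ModuleNotFoundError raised inside the worker
        return {"algorithm": algo.__name__, "location": location,
                "reward": None, "error": f"{type(exc).__name__}: {exc}"}


def collect(algos, location, runs=5, map_fn=map):
    """Run every algorithm `runs` times; pass pool.map as map_fn to parallelise."""
    rows = []
    for algo in algos:
        rows.extend(map_fn(partial(safe_call, algo), [location] * runs))
    return rows


def best_rewards(rows):
    """Return the highest reward per algorithm, skipping failed runs."""
    best = {}
    for row in rows:
        if row["error"] is None:
            name = row["algorithm"]
            best[name] = max(best.get(name, row["reward"]), row["reward"])
    return best
```

With a pool, collect(algos, location1, map_fn=pool.map) keeps the same structure, as long as the algorithms are module-level functions that can be pickled. The rows also load directly into pd.DataFrame(rows).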

Installing Conda and GT4SD

First things first:

  • Installing Miniconda (a lightweight version of Anaconda)

Log into the remote server. If you have configured the virtual machine, this should log you in automatically without asking for a password.
If not, these are the steps I followed: here

Download the Miniconda installer by running this command in the terminal:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh

This only downloads the installer; run bash miniconda.sh to perform the installation.

To check the installed version, run conda -V

Getting Started!

Getting started

  • First things first: set up my virtual machine
    Commands used:
    ssh username@IP_address
    password
    Note: The username and password were provided by the server admin.
    SSH creates a communication channel between my local host and the server's virtual machine (a Linux system).
    This ensures I am working on Linux and creating my development environment in the Linux system.

  • Creating SSH Keys
    ssh-keygen
    Expected output:
    Generating public/private rsa key pair.
    Enter file in which to save the key (/home/username/.ssh/id_rsa):

Possible error at this point:
Check whether a directory named after your username exists under /home (cd /home, then ls -a). If it does not, the system throws an error saying that no such file or directory was found.
To resolve this, create it with mkdir username or ask the remote server admin to create it.

  • After creating the SSH key, I copied the SSH public key to my remote server using ssh-copy-id
    ssh-copy-id username@IP_address
    cat ~/.ssh/id_rsa.pub
    This outputs the public key.

I copied the key and added it to GitHub:
Settings --> SSH and GPG keys
New SSH key

  • Authenticating to Your Server Using SSH Keys
    ssh username@remote_host
    This now allows me to log into the server without the account's password.

  • Last step
    I also created an SSH key for my local host.
    With the local host as root user, I repeated the same process from ssh-keygen and saved the public key on GitHub.

This allows me to connect securely to GitHub and to the server without having to enter the remote username and password every time.

Resources I found useful:

Another one

We were trying to plot the rewards after each episode.

These are some of the ideas we ran.

First, we created another file, visualize.py, in the ICLR23Workshop directory.

In this file we ran the following code to visualize the dqn function.

  • Our initial understanding was that the number of episodes is 10000, as set in the dqn function, so we tried this first.

This code would iterate 10000 times
(we did not wait for it to finish).

The code was:

    import matplotlib.pyplot as plt

    # Import the function we are running from collectdata
    from collectdata import test_dqn

    # List to store the average reward per block of episodes
    # Total episodes are 10000
    rewards_per_episode = []
    total_rewards = 0

    # Loop that runs and saves the rewards in the list
    for episode in range(1, 10001):
        # Run the test_dqn function and get the reward
        reward = test_dqn("https://gist.githubusercontent.com/slremy/5adf90df6f4c096258f8d8af0f3039dc/raw/d85ac5f204771657b015055eb3b6b9f5dcf33956/location1.csv")

        # Add the reward to total_rewards
        total_rewards += reward

        # Every 500 episodes, append the average reward to rewards_per_episode and reset total_rewards
        if episode % 500 == 0:
            rewards_per_episode.append(total_rewards / 500)
            total_rewards = 0

    # Range of episode numbers for plotting
    episodes = range(500, 10001, 500)

    # Plot rewards_per_episode against episodes
    plt.plot(episodes, rewards_per_episode)
    plt.xlabel('Episodes')
    plt.ylabel('Reward')
    plt.title('Reward per Episode')
    plt.show()
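The averaging step in the loop above can also be applied afterwards to a flat list of rewards. A minimal sketch, assuming the rewards are plain numbers; block_means and block_ends are hypothetical helpers, not part of collectdata:

```python
def block_means(rewards, block_size=500):
    """Average consecutive blocks of rewards; a trailing partial block is averaged too."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    means = []
    for start in range(0, len(rewards), block_size):
        block = rewards[start:start + block_size]
        means.append(sum(block) / len(block))
    return means


def block_ends(num_rewards, block_size=500):
    """Episode number at the end of each block, for use as the x-axis."""
    return [min(start + block_size, num_rewards)
            for start in range(0, num_rewards, block_size)]
```

Calling plt.plot(block_ends(len(rewards)), block_means(rewards)) keeps the x and y values the same length by construction, whatever the number of rewards collected.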
  • The second idea was to treat the 10000 episodes as single actions making up one episode, create a range, and have a for loop run n times to collect the total rewards.

This was the code:


    import matplotlib.pyplot as plt

    # Import the test_dqn function we will be running from the collectdata module
    from collectdata import test_dqn

    rewards = []   # Rewards obtained in each episode

    # Loop over 5 episodes
    for episode in range(1, 6):
        # Call test_dqn with the given URL and store the returned reward value
        reward = test_dqn("https://gist.githubusercontent.com/slremy/5adf90df6f4c096258f8d8af0f3039dc/raw/d85ac5f204771657b015055eb3b6b9f5dcf33956/location1.csv")
        # Append the reward value to the rewards list
        rewards.append(reward)

    episodes = [1, 2, 3, 4, 5]   # Episode numbers

    # Plot a line graph with episode numbers on the x-axis and rewards on the y-axis
    plt.plot(episodes, rewards)
    plt.xlabel('Episodes')   # Label the x-axis as 'Episodes'
    plt.ylabel('Reward')   # Label the y-axis as 'Reward'
    plt.title('Reward per Episode')   # Set the title of the plot
    plt.show()   # Display the plot on the screen

It plotted the graph, but the logic felt wrong, since we were treating the 10000 episodes as actions making up one episode (from our understanding).

  • Another idea we tried was to edit the collectdata.py file directly and add this code to the while loop in the dqn function:
    rewards_per_episode = []
    while not done:
        a = model.choose_action(s)
        s, r, done, truncated, info = env.step(a)
        episode_reward+=r
        rewards_per_episode.append(episode_reward)

This was the visualize.py file:

    # Import matplotlib
    import matplotlib.pyplot as plt
    # Import collectdata
    import collectdata

    num_episodes = 10000

    collectdata.test_dqn("https://gist.githubusercontent.com/slremy/5adf90df6f4c096258f8d8af0f3039dc/raw/d85ac5f204771657b015055eb3b6b9f5dcf33956/location1.csv")

    # Plot rewards per episode
    plt.plot(range(num_episodes), collectdata.rewards_per_episode)
    plt.xlabel('Episodes')
    plt.ylabel('Reward')
    plt.title('Reward per Episode')
    plt.show()

This gave us a very interesting output: a list of only 10 rewards.

We could not plot the visualization because of this ValueError:

ValueError: x and y must have same first dimension, but have shapes (10000,) and (10,)
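One possible reading of this result: the list is appended to once per step inside the while loop, so its length is the number of steps in a single episode (10 here) and not the number of episodes. A minimal sketch of logging one total per episode instead, using a toy environment and a random policy as stand-ins. ToyEnv, run_episode, run_training, and the 10-step episode length are assumptions for illustration, not the real collectdata code:

```python
import random


class ToyEnv:
    """Stand-in environment whose episodes end after a fixed number of steps."""

    def __init__(self, steps_per_episode=10):
        self.steps_per_episode = steps_per_episode
        self.t = 0

    def reset(self):
        self.t = 0
        return 0, {}  # (state, info)

    def step(self, action):
        self.t += 1
        reward = float(action)
        done = self.t >= self.steps_per_episode
        return self.t, reward, done, False, {}  # (state, reward, done, truncated, info)


def run_episode(env, choose_action):
    """Play one episode and return its total reward and its per-step rewards."""
    state, _ = env.reset()
    step_rewards = []
    done = truncated = False
    while not (done or truncated):
        state, reward, done, truncated, _ = env.step(choose_action(state))
        step_rewards.append(reward)
    return sum(step_rewards), step_rewards


def run_training(env, choose_action, num_episodes):
    """Collect one total reward per episode, so the list length equals num_episodes."""
    return [run_episode(env, choose_action)[0] for _ in range(num_episodes)]


rng = random.Random(0)
episode_rewards = run_training(ToyEnv(), lambda s: rng.choice([0, 1]), num_episodes=20)
episodes = range(1, len(episode_rewards) + 1)  # x-axis built from the data length
```

Plotting with plt.plot(episodes, episode_rewards) then avoids the shape mismatch, because the x-axis is derived from the length of the data and not from a separate constant. Whether the 10000 in the dqn function counts episodes or training steps would still need to be checked in collectdata.py.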

@ what can you advise on the actions and episodes that are running?
