
Comments (6)

JaswanthBadvelu commented on July 27, 2024

Did you figure anything out? I am also trying to do the same thing, along with time windows.

from adm-vrp.

Monikasinghjmi commented on July 27, 2024

@JaswanthBadvelu No, not yet


JaswanthBadvelu commented on July 27, 2024

I am using this model in my project and have a few questions. Do you mind connecting with me on LinkedIn so we can discuss more?
https://www.linkedin.com/in/jaswanth-badvelu/
Thanks


d-eremeev commented on July 27, 2024

@Monikasinghjmi If you mean the multi-depot problem for a single agent, you could consider changing the masking procedure and adding embeddings for all depots in the graph attention encoder.

There is a function get_mask in environment.py where we mask all visited nodes and all nodes whose demand exceeds the agent's current available capacity (along with some additional logic for the dynamic version of the model presented in the paper) and decide whether the agent is allowed to go to the depot. This mask is used in the MHA decoding process in attention_dynamic_model.py. There is also a mask defined in get_att_mask in environment.py. This one is used for graph encoding (before the decoding steps) in the same file.

Make sure these functions keep all depots unmasked at the appropriate times, so that the attention mechanism can properly encode all depots and then decide which one to choose.
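
To make the masking idea concrete, here is a minimal sketch in plain Python. It is not the repository's get_mask: the function name multi_depot_mask, its arguments, the node layout (depots first, then customers) and the rule that depot-to-depot moves are blocked while a customer is still reachable are all assumptions made for illustration.

```python
from typing import List, Sequence


def multi_depot_mask(
    visited: Sequence[bool],
    demand: Sequence[float],
    remaining_capacity: float,
    num_depots: int,
    at_depot: bool,
) -> List[bool]:
    """Return a mask where True means "this node may NOT be chosen next".

    Hypothetical layout: nodes 0..num_depots-1 are depots, the rest are customers.
    """
    n = len(demand)
    mask = [False] * n
    customer_available = False

    # Customers: blocked if already visited or if demand exceeds remaining capacity.
    for i in range(num_depots, n):
        blocked = visited[i] or demand[i] > remaining_capacity
        mask[i] = blocked
        if not blocked:
            customer_available = True

    # Depots: all of them stay open, except that (by assumption) we forbid
    # depot-to-depot moves while at least one customer can still be served.
    block_depots = at_depot and customer_available
    for d in range(num_depots):
        mask[d] = block_depots

    return mask


visited = [False, False, True, False, False]
demand = [0, 0, 3, 5, 2]
print(multi_depot_mask(visited, demand, remaining_capacity=4, num_depots=2, at_depot=False))
# [False, False, True, True, False]
```

With this rule at least one node is always selectable: if no customer is feasible, every depot is unmasked, so the decoder's softmax never sees a fully masked row.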


Monikasinghjmi commented on July 27, 2024

@d-eremeev Thanks for the explanation. My understanding is that the REINFORCE algorithm used for calculating the cost function has to be updated for the MDVRP. Is this correct?


d-eremeev commented on July 27, 2024

@Monikasinghjmi I'm not sure what you mean by "has to be updated".
REINFORCE is a classical policy gradient algorithm. Basically, we want to extremize the expected return of the whole episode: the sum of the rewards (~ length) over the whole trajectory, weighted by the corresponding probabilities. REINFORCE is a Monte Carlo method that "tells" us a "convenient" form of the gradient of the expected return. In this sense, it should not be changed.

Of course, there are several components in the formula: the length of your path and the probabilities of nodes returned by the decoder. There is also a "baseline" added, which involves a copy of the model from one of the preceding epochs. If you change these components, then in that sense REINFORCE with baseline would be "updated".
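
As a rough illustration of how these components combine, here is a minimal sketch of the REINFORCE-with-baseline surrogate loss in plain Python. The function name and arguments are hypothetical and this is not the repository's training code. It assumes costs holds the tour lengths from the current model, baseline_costs the tour lengths from the baseline copy on the same instances, and log_probs the summed log-probabilities of the nodes chosen in each tour.

```python
from typing import Sequence


def reinforce_with_baseline_loss(
    costs: Sequence[float],
    baseline_costs: Sequence[float],
    log_probs: Sequence[float],
) -> float:
    """Batch mean of (cost - baseline_cost) * log_prob.

    In an autodiff framework the advantage (cost - baseline_cost) would be
    treated as a constant, so gradients flow only through log_prob.
    """
    if not (len(costs) == len(baseline_costs) == len(log_probs)):
        raise ValueError("all inputs must have the same length")
    if len(costs) == 0:
        raise ValueError("inputs must not be empty")

    total = 0.0
    for cost, baseline, log_prob in zip(costs, baseline_costs, log_probs):
        advantage = cost - baseline  # positive when the model did worse than the baseline
        total += advantage * log_prob
    return total / len(costs)


print(reinforce_with_baseline_loss([10.0, 12.0], [11.0, 11.0], [-2.0, -3.0]))
# -0.5
```

For a multi-depot variant the formula itself stays the same. What changes is how the cost is computed (for example, which depot each route starts from and returns to) and which nodes the decoder is allowed to assign probability to.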

For educational purposes, you might want to check, for example, the following link:
RL — Policy Gradient Explained by Jonathan Hui.

