np-hard-deep-reinforcement-learning's Introduction

Combinatorial optimization with DL/RL: IPython tutorials

This tutorial demonstrates a technique for solving combinatorial optimization problems such as the well-known travelling salesman problem (TSP). The method was presented in the paper Neural Combinatorial Optimization with Reinforcement Learning.
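
For the TSP, the quantity being minimized is the length of the closed tour through all cities. The following is a minimal sketch of that objective in plain Python; the function name tour_length and the four-city toy instance are illustrative assumptions and are not taken from the notebooks.

```python
import math


def tour_length(cities, order):
    """Length of the closed tour that visits `cities` in the given `order`."""
    total = 0.0
    for k in range(len(order)):
        x1, y1 = cities[order[k]]
        # Wrap around so the tour returns to its starting city.
        x2, y2 = cities[order[(k + 1) % len(order)]]
        total += math.hypot(x2 - x1, y2 - y1)
    return total


# Hypothetical toy instance: the four corners of a unit square.
cities = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
perimeter_tour = [0, 1, 2, 3]  # walks around the square
crossing_tour = [0, 2, 1, 3]   # uses both diagonals

print(tour_length(cities, perimeter_tour))
print(tour_length(cities, crossing_tour))
```

A model that outputs a permutation of the input cities can be scored with a function like this one; a shorter tour is better.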

The algorithm applies the pointer network architecture, in which an attention mechanism points to elements of the input sequence, allowing a decoder to output those elements. The network is trained by reinforcement learning using an actor-critic method.
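
The pointing step can be sketched as follows. This is a dependency-free illustration in plain Python rather than the PyTorch code used in the notebooks. It assumes the additive scoring form u_j = v . tanh(W_ref e_j + W_q d) from the Pointer Networks paper and masks positions that were already selected. All names (pointer_probs, W_ref, W_q, v, visited) and the toy weights are assumptions made for this example.

```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def pointer_probs(enc_states, dec_state, W_ref, W_q, v, visited=()):
    """Softmax distribution over input positions (the "pointer").

    Assumes at least one position is not in `visited`.
    """
    q = matvec(W_q, dec_state)
    scores = []
    for j, e in enumerate(enc_states):
        if j in visited:
            # Masked positions get -inf so their probability is exactly 0.
            scores.append(-math.inf)
            continue
        r = matvec(W_ref, e)
        scores.append(sum(vk * math.tanh(rk + qk) for vk, rk, qk in zip(v, r, q)))
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [x / z for x in exps]


# Toy inputs: three encoder states of size 2, identity projections.
enc_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
dec_state = [0.5, -0.5]
identity = [[1.0, 0.0], [0.0, 1.0]]
probs = pointer_probs(enc_states, dec_state, identity, identity,
                      v=[1.0, 1.0], visited={1})
print(probs)
```

The decoder then picks (or samples) an index from this distribution, adds it to the visited set, and repeats until every input element has been output.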

Note: this model does not beat existing baselines for TSP. Moreover, the local search method LK-H solves these TSP tasks to optimality in seconds on a CPU, whereas this model reaches only suboptimal results after several hours on a GPU.

The algorithm consists of two parts:

Pointer Network

Intro to PN for a simple sorting task: Intro to Pointer Network.ipynb.

Paper: Pointer Networks.

Blog post by FastML: Introduction to pointer networks.

Neural Combinatorial Optimization

Neural Combinatorial Optimization.ipynb

Paper: Neural Combinatorial Optimization with Reinforcement Learning
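
The actor-critic training mentioned in the introduction can be sketched roughly as below. This assumes the REINFORCE-with-baseline form described in the Neural Combinatorial Optimization paper: the actor is pushed toward tours that are shorter than the critic's prediction, and the critic is fit to the observed tour lengths by mean squared error. The function name and the numbers are hypothetical, and plain Python floats stand in for tensors.

```python
def actor_critic_losses(tour_lengths, baselines, log_probs):
    """Surrogate losses for one batch of sampled tours.

    tour_lengths: observed length of each sampled tour
    baselines:    the critic's predicted length for each instance
    log_probs:    log-probability the actor assigned to each sampled tour
    """
    n = len(tour_lengths)
    advantages = [length - b for length, b in zip(tour_lengths, baselines)]
    # In an autodiff framework the advantage is treated as a constant here,
    # so gradients flow only through log_probs.
    actor_loss = sum(a * lp for a, lp in zip(advantages, log_probs)) / n
    # The critic is regressed onto the observed tour lengths.
    critic_loss = sum(a * a for a in advantages) / n
    return actor_loss, critic_loss


# Hypothetical batch of two sampled tours.
actor_loss, critic_loss = actor_critic_losses(
    tour_lengths=[4.0, 5.0],
    baselines=[4.5, 4.5],
    log_probs=[-1.0, -2.0],
)
print(actor_loss, critic_loss)
```

Minimizing the actor loss raises the log-probability of tours with negative advantage (shorter than predicted) and lowers it for tours with positive advantage.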

np-hard-deep-reinforcement-learning's People

Contributors

higgsfield

np-hard-deep-reinforcement-learning's Issues

(At least) the sorting notebook is erroneous

From a brief look at the code:

    for i in range(seq_len):
        ...
        for i in range(self.n_glimpses):
            ...
        loss += self.criterion(logits, target[:,i])

The inner loop reuses the name i, so by the time the loss is computed i is always self.n_glimpses - 1. The displayed loss is therefore meaningless, and so is the sorting predicted by the model. After fixing this, the loss does not change and the model does not train at all. Logits are not softmaxed after pointing. Masks are never applied, since idxs is never initialized.
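
The index problem described above can be reproduced without the model. The sketch below records which index would be used for target[:, i] in each outer step; the function names are hypothetical and the loop bodies are placeholders for the notebook's actual computation.

```python
def indices_used_shadowed(seq_len, n_glimpses):
    """Mimics the reported loop structure: the inner loop reuses `i`."""
    used = []
    for i in range(seq_len):
        for i in range(n_glimpses):  # overwrites the outer loop index
            pass
        used.append(i)  # stands in for target[:, i]
    return used


def indices_used_fixed(seq_len, n_glimpses):
    """Same structure with a separate name for the inner loop variable."""
    used = []
    for i in range(seq_len):
        for _ in range(n_glimpses):
            pass
        used.append(i)
    return used


print(indices_used_shadowed(4, 1))
print(indices_used_fixed(4, 1))
```

With one glimpse, the shadowed version compares every decoding step against target column 0, while the fixed version steps through columns 0 to 3. Renaming the inner variable addresses only the indexing; the other points raised in the report are separate.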

BrokenPipeError: [Errno 32] Broken pipe

I am getting a broken pipe error. I have worked on this problem for two months but am still unable to resolve it. Can you suggest a solution?
Regards
Tariq Shah
Ph.D. student
Thammasat University, Thailand

BrokenPipeError: [Errno 32] Broken pipe

Good day,

I want to test the model on Windows 10. I assume the code was written for Linux, which is why I am running into problems related to the pipe. I found that I should add "if __name__ == '__main__':" to protect the main code; however, I was not successful in fixing the problem this way.

Could someone help to solve this issue?

Solution:
https://stackoverflow.com/questions/18204782/runtimeerror-on-windows-trying-python-multiprocessing/18205006#18205006

Explanation:
https://stackoverflow.com/questions/14207708/ioerror-errno-32-broken-pipe-python
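
The guard mentioned above can be illustrated with the standard library alone. This is a generic sketch of the pattern, not the notebook's data-loading code: square and main are hypothetical names, and the worker function is defined at module top level so that child processes can import it.

```python
import multiprocessing as mp


def square(x):
    """Work performed in a child process."""
    return x * x


def main():
    with mp.Pool(processes=2) as pool:
        # The timeout makes a misconfigured environment fail loudly
        # instead of hanging.
        results = pool.map_async(square, range(5)).get(timeout=60)
    print(results)


if __name__ == "__main__":
    # On Windows, child processes are started by re-importing this module.
    # Without this guard, each child would try to start children of its own.
    main()
```

This pattern applies to scripts. Inside a Jupyter notebook on Windows, a workaround that is often suggested for PyTorch is to pass num_workers=0 to the DataLoader so that no worker processes are started; whether that is acceptable depends on how expensive the data loading is.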

Error:

BrokenPipeError                           Traceback (most recent call last)
in ()
      4
      5 for epoch in range(n_epochs):
----> 6     for batch_id, sample_batch in enumerate(train_loader):
      7
      8         inputs = Variable(sample_batch)

~\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py in __iter__(self)
    308
    309     def __iter__(self):
--> 310         return DataLoaderIter(self)
    311
    312     def __len__(self):

~\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py in __init__(self, loader)
    165             for w in self.workers:
    166                 w.daemon = True # ensure that the worker exits on process exit
--> 167                 w.start()
    168
    169             if self.pin_memory:

~\Anaconda3\lib\multiprocessing\process.py in start(self)
    103                'daemonic processes are not allowed to have children'
    104         _cleanup()
--> 105         self._popen = self._Popen(self)
    106         self._sentinel = self._popen.sentinel
    107         # Avoid a refcycle if the target function holds an indirect

~\Anaconda3\lib\multiprocessing\context.py in _Popen(process_obj)
    221     @staticmethod
    222     def _Popen(process_obj):
--> 223         return _default_context.get_context().Process._Popen(process_obj)
    224
    225 class DefaultContext(BaseContext):

~\Anaconda3\lib\multiprocessing\context.py in _Popen(process_obj)
    320         def _Popen(process_obj):
    321             from .popen_spawn_win32 import Popen
--> 322             return Popen(process_obj)
    323
    324     class SpawnContext(BaseContext):

~\Anaconda3\lib\multiprocessing\popen_spawn_win32.py in __init__(self, process_obj)
     63             try:
     64                 reduction.dump(prep_data, to_child)
---> 65                 reduction.dump(process_obj, to_child)
     66             finally:
     67                 set_spawning_popen(None)

~\Anaconda3\lib\multiprocessing\reduction.py in dump(obj, file, protocol)
     58 def dump(obj, file, protocol=None):
     59     '''Replacement for pickle.dump() using ForkingPickler.'''
---> 60     ForkingPickler(file, protocol).dump(obj)
     61
     62 #

BrokenPipeError: [Errno 32] Broken pipe

RuntimeError: backward_input can only be called in training mode

In the Pointer Network Jupyter notebook, I get the following error:

/pytorch_test/env2/lib/python2.7/site-packages/ipykernel_launcher.py:46: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
Exception NameError: "global name 'FileNotFoundError' is not defined" in <bound method _DataLoaderIter.__del__ of <torch.utils.data.dataloader._DataLoaderIter object at 0x7fdc900d1cd0>> ignored
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-21-0299c228a1ba> in <module>()
     15 
     16         adam.zero_grad()
---> 17         loss.backward()
     18         adam.step()
     19 

/pytorch_test/env2/local/lib/python2.7/site-packages/torch/tensor.pyc in backward(self, gradient, retain_graph, create_graph)
     91                 products. Defaults to ``False``.
     92         """
---> 93         torch.autograd.backward(self, gradient, retain_graph, create_graph)
     94 
     95     def register_hook(self, hook):

/pytorch_test/env2/local/lib/python2.7/site-packages/torch/autograd/__init__.pyc in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
     87     Variable._execution_engine.run_backward(
     88         tensors, grad_tensors, retain_graph, create_graph,
---> 89         allow_unreachable=True)  # allow_unreachable flag
     90 
     91 

RuntimeError: backward_input can only be called in training mode