Comments (11)

torressliu commented on July 19, 2024

Thanks for your answer, but I have run the project from github.com/gml16/rl-medical and it still does not work.

image
P.S. 段错误(核心已转储) means "Segmentation fault (core dumped)".
I am sorry that my Linux system language is Chinese.

from rl-medical.

gml16 commented on July 19, 2024

Would you mind sharing the command you are using and any traceback you receive, so I can diagnose the problem?
The multi-agent project has been exported to github.com/gml16/rl-medical. Could you try running that one as well? Sorry for the confusion, I will add it to the readme.

gml16 commented on July 19, 2024

Oh I see, not much information there. I don't remember ever having encountered this issue. Would you mind sharing your py36torch environment so I can try to reproduce it? If it comes from the environment, I will add an environment file with the exact Python and package versions I am using. PS: no need to apologise for the language setting :)

EDIT: does the evaluation command work for you, or does it also return a segmentation fault?
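
Since a bare segmentation fault gives so little to go on, one generic way to get more detail is Python's standard faulthandler module, which dumps the Python stack when the interpreter receives a fatal signal such as SIGSEGV. The sketch below is not part of rl-medical; the helper name enable_crash_traceback is hypothetical, and it assumes you can add a couple of lines at the top of whichever script you launch.

```python
import faulthandler
import sys


def enable_crash_traceback(stream=None):
    """Dump the Python stack of all threads on fatal signals (e.g. SIGSEGV).

    `stream` must be a real file with a file descriptor; it defaults to
    stderr. Returns True once the handler is active.
    """
    if not faulthandler.is_enabled():
        faulthandler.enable(file=stream or sys.stderr, all_threads=True)
    return faulthandler.is_enabled()


if __name__ == "__main__":
    # Call this before importing the heavy native libraries, so that a crash
    # inside one of them still produces a Python-level traceback.
    enable_crash_traceback()
```

The same effect is available without editing any file by starting the interpreter with python -X faulthandler followed by the usual script and arguments.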

torressliu commented on July 19, 2024

image
This is my environment. Unfortunately, the evaluation command also returns a segmentation fault 👀 👀

gml16 commented on July 19, 2024

Thanks for sharing your environment. I created a conda environment with only the necessary modules, using the same versions as you. I didn't get a segmentation fault, but I did get an error from NumPy (updating to a newer version solved it), and OpenCV is required to run the code but isn't in your environment (as far as I can tell).
I've exported my minimal conda environment using Python 3.6 (which isn't so small after all). Would you mind saving the text below as env.yml, running conda env create -f env.yml, then conda activate rl-medical-36, and trying again?
Hopefully this solves this cryptic seg fault.

Edit: if you prefer, I've added an environment.yml file to the gml16/rl-medical repo that uses Python 3.8 and that I have tested on two machines.

name: rl-medical-36
channels:
  - pytorch
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - blas=1.0=mkl
  - ca-certificates=2020.7.22=0
  - certifi=2020.6.20=py36_0
  - cudatoolkit=10.1.243=h6bb024c_0
  - intel-openmp=2020.2=254
  - ld_impl_linux-64=2.33.1=h53a641e_7
  - libedit=3.1.20191231=h14c3975_1
  - libffi=3.3=he6710b0_2
  - libgcc-ng=9.1.0=hdf63c60_0
  - libstdcxx-ng=9.1.0=hdf63c60_0
  - mkl=2020.2=256
  - mkl-service=2.3.0=py36he904b0f_0
  - mkl_fft=1.2.0=py36h23d657b_0
  - mkl_random=1.1.1=py36h0573a6f_0
  - ncurses=6.2=he6710b0_1
  - ninja=1.10.1=py36hfd86e86_0
  - openssl=1.1.1h=h7b6447c_0
  - pip=20.2.3=py36_0
  - python=3.6.12=hcff3b4d_2
  - pytorch=1.4.0=py3.6_cuda10.1.243_cudnn7.6.3_0
  - readline=8.0=h7b6447c_0
  - setuptools=49.6.0=py36_1
  - six=1.15.0=py_0
  - sqlite=3.33.0=h62c20be_0
  - tk=8.6.10=hbc83047_0
  - wheel=0.35.1=py_0
  - xz=5.2.5=h7b6447c_0
  - zlib=1.2.11=h7b6447c_3
  - pip:
    - absl-py==0.10.0
    - aiohttp==3.6.2
    - async-timeout==3.0.1
    - attrs==20.2.0
    - cachetools==4.1.1
    - chardet==3.0.4
    - cloudpickle==1.6.0
    - cycler==0.10.0
    - future==0.18.2
    - google-auth==1.22.0
    - google-auth-oauthlib==0.4.1
    - grpcio==1.32.0
    - gym==0.17.3
    - idna==2.10
    - idna-ssl==1.1.0
    - importlib-metadata==2.0.0
    - markdown==3.2.2
    - matplotlib==2.0.2
    - msgpack==1.0.0
    - msgpack-numpy==0.4.7.1
    - multidict==4.7.6
    - numpy==1.19.2
    - oauthlib==3.1.0
    - olefile==0.46
    - opencv-python==4.4.0.44
    - pillow==4.2.1
    - protobuf==3.13.0
    - psutil==5.7.2
    - pyasn1==0.4.8
    - pyasn1-modules==0.2.8
    - pyglet==1.5.0
    - pyparsing==2.4.7
    - python-dateutil==2.8.1
    - pytz==2020.1
    - pyzmq==19.0.2
    - requests==2.24.0
    - requests-oauthlib==1.3.0
    - rsa==4.6
    - scipy==1.5.2
    - simpleitk==1.2.4
    - tabulate==0.8.7
    - tensorboard==2.3.0
    - tensorboard-plugin-wit==1.7.0
    - tensorpack==0.9.5
    - termcolor==1.1.0
    - tqdm==4.50.0
    - typing-extensions==3.7.4.3
    - urllib3==1.25.10
    - werkzeug==1.0.1
    - yarl==1.6.0
    - zipp==3.3.0
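
Once the environment is activated, a quick sanity check can confirm that the main packages are visible to Python before launching training. This is only a sketch: the module list is an assumption derived from package names in the environment file above (opencv-python imports as cv2, pytorch as torch, simpleitk as SimpleITK), not from the project's actual import statements, and missing_modules is a hypothetical helper.

```python
import importlib.util

# Assumed import names, taken from the environment file rather than from the
# project's source code. Adjust the list to match what the code really imports.
REQUIRED_MODULES = ["numpy", "cv2", "torch", "SimpleITK", "gym", "tensorpack"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("missing:", ", ".join(missing) if missing else "none")
```

find_spec only locates a module without importing it, so this check will not itself trigger a crash in a broken native library; it just reports packages that are absent.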

torressliu commented on July 19, 2024

Thanks for your help. However, when I used your environment.yml file to run the project, I got a new error:
image
The error says it needs 42.4GB of video card memory?

gml16 commented on July 19, 2024

Great, I think it should be working then. This memory error occurs because the default replay buffer is quite large. Try setting the flag --memory_size to a lower value, such as 1000, and similarly set --init_memory_size to 500.
Do you know if the evaluation works now?

Edit: the default training values are the ones I used to produce the models presented in the paper. I will add a note to the Readme about reducing the memory size if it is an issue.
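
To see why lowering --memory_size helps, here is a back-of-the-envelope estimate of how a replay buffer's footprint scales. It is a rough sketch only: it assumes one stored frame per transition per agent, and the frame shape, voxel size, and the larger memory size in the example are hypothetical values, not the project's defaults or its actual buffer layout.

```python
def replay_buffer_bytes(memory_size, frame_shape, agents=1, bytes_per_voxel=1):
    """Rough size estimate: one stored frame per transition per agent."""
    voxels = 1
    for dim in frame_shape:
        voxels *= dim
    return memory_size * agents * voxels * bytes_per_voxel


def to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / 2**30


if __name__ == "__main__":
    shape = (45, 45, 45)  # hypothetical frame shape, uint8 voxels
    for size in (100_000, 1_000):  # hypothetical large value vs. the suggested 1000
        print(size, "transitions ->", round(to_gib(replay_buffer_bytes(size, shape)), 3), "GiB")
```

Under these assumptions the footprint grows linearly with the number of stored transitions, so cutting --memory_size by a factor of 100 cuts the buffer by the same factor.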

torressliu commented on July 19, 2024

Yes, I set a lower value and it works! I tried the evaluation command and got this error: pyglet.canvas.xlib.NoSuchDisplayException: Cannot connect to "None"
That is probably a problem on my side: I think the error comes from the render function, since my server doesn't have a virtual display installed yet. How do you get training images from the server?

gml16 commented on July 19, 2024

Good news :) Yes, you get this error because the server cannot render. You can use the flag --viz 0 to disable rendering.
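
If you launch runs from a wrapper script, a small check like the following can decide whether to add that flag automatically. It is a sketch under assumptions: has_display and viz_args are hypothetical helpers, and it relies on the DISPLAY environment variable being unset on a headless Linux server, which is consistent with the Cannot connect to "None" message.

```python
import os


def has_display(env=None):
    """True if an X11 display is advertised through the DISPLAY variable."""
    env = os.environ if env is None else env
    return bool(env.get("DISPLAY"))


def viz_args(env=None):
    """Extra command-line arguments: disable rendering when no display exists."""
    return [] if has_display(env) else ["--viz", "0"]


if __name__ == "__main__":
    # Append these to the usual training or evaluation command.
    print(" ".join(viz_args()) or "(display found, rendering left enabled)")
```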

torressliu commented on July 19, 2024

Thank you for your patient guidance. Your idea inspires me. 😁😁

gml16 commented on July 19, 2024

Glad I could help :)
You can also use the flag --saveGif to save the evaluation as a gif. That can be useful on headless machines.
