when I run the multi-agents landmark detect project, there is always a Segmentation f

multi-agent landmark detection about rl-medical HOT 11 CLOSED

amiralansary commented on July 19, 2024

multi-agent landmark detection

from rl-medical.

Comments (11)

torressliu commented on July 19, 2024

Thanks for your answer, but I have run the project from github.com/gml16/rl-medical and it still does not work.

P.S. 段错误（核心已转储）means "Segmentation fault (core dumped)".
I am sorry that my Linux language is set to Chinese.

gml16 commented on July 19, 2024

Would you mind sharing the command you are using and any traceback you receive please so I can diagnose the problem?
The multi-agent project has been exported to github.com/gml16/rl-medical, could you try running this one instead as well? Sorry for the confusion, I will add it to the readme.

gml16 commented on July 19, 2024

Oh I see, not much information there. I don't remember ever having encountered this issue. Would you mind sharing your py36torch environment so I could try to reproduce it? If it comes from the environment, I will add an environment file with the exact Python and package versions I am using. PS: no need to apologise for the language setting :)

EDIT: does the evaluation command work for you, or does it also return a segmentation fault?
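
A bare segmentation fault gives no Python traceback, which is why there is so little to go on here. One general way to get more out of such a crash, not something that was tried in this thread, is the standard-library faulthandler module. The sketch below is a minimal example; the helper name enable_crash_traceback is hypothetical, and where to call it in the project is left open.

```python
import faulthandler
import sys


def enable_crash_traceback(stream=None):
    """Ask the interpreter to dump the Python stack if the process crashes.

    `stream` must be a file object with a real file descriptor; it defaults
    to sys.stderr. Returns True once the fault handler is active.
    """
    if stream is None:
        stream = sys.stderr
    if not faulthandler.is_enabled():
        # all_threads=True also dumps the stacks of background threads.
        faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()
```

Calling this helper at the very top of the entry script, before the heavy imports, is usually enough. The same effect is available without code changes by starting the interpreter with python -X faulthandler followed by the usual script and arguments.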

torressliu commented on July 19, 2024

This is my environment. Unfortunately, the evaluation command also returns a segmentation fault 👀 👀

gml16 commented on July 19, 2024

Thanks for sharing your environment. I created a conda environment with only the necessary modules, using the same versions as you. I didn't get a segmentation fault, but I did get an error from NumPy (updating to a newer version solved it), and OpenCV is required to run the code but isn't in your environment (as far as I can tell).
I've exported my minimal conda environment using Python 3.6 (which isn't so small after all). Would you mind saving the text below as env.yml, running conda env create -f env.yml, then conda activate rl-medical-36, and trying again?
Hopefully this solves this cryptic seg fault.

Edit: if you prefer, I've added an environment.yml file to the gml16/rl-medical repo. It uses Python 3.8 and I have tested it on two machines.

name: rl-medical-36
channels:
  - pytorch
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - blas=1.0=mkl
  - ca-certificates=2020.7.22=0
  - certifi=2020.6.20=py36_0
  - cudatoolkit=10.1.243=h6bb024c_0
  - intel-openmp=2020.2=254
  - ld_impl_linux-64=2.33.1=h53a641e_7
  - libedit=3.1.20191231=h14c3975_1
  - libffi=3.3=he6710b0_2
  - libgcc-ng=9.1.0=hdf63c60_0
  - libstdcxx-ng=9.1.0=hdf63c60_0
  - mkl=2020.2=256
  - mkl-service=2.3.0=py36he904b0f_0
  - mkl_fft=1.2.0=py36h23d657b_0
  - mkl_random=1.1.1=py36h0573a6f_0
  - ncurses=6.2=he6710b0_1
  - ninja=1.10.1=py36hfd86e86_0
  - openssl=1.1.1h=h7b6447c_0
  - pip=20.2.3=py36_0
  - python=3.6.12=hcff3b4d_2
  - pytorch=1.4.0=py3.6_cuda10.1.243_cudnn7.6.3_0
  - readline=8.0=h7b6447c_0
  - setuptools=49.6.0=py36_1
  - six=1.15.0=py_0
  - sqlite=3.33.0=h62c20be_0
  - tk=8.6.10=hbc83047_0
  - wheel=0.35.1=py_0
  - xz=5.2.5=h7b6447c_0
  - zlib=1.2.11=h7b6447c_3
  - pip:
    - absl-py==0.10.0
    - aiohttp==3.6.2
    - async-timeout==3.0.1
    - attrs==20.2.0
    - cachetools==4.1.1
    - chardet==3.0.4
    - cloudpickle==1.6.0
    - cycler==0.10.0
    - future==0.18.2
    - google-auth==1.22.0
    - google-auth-oauthlib==0.4.1
    - grpcio==1.32.0
    - gym==0.17.3
    - idna==2.10
    - idna-ssl==1.1.0
    - importlib-metadata==2.0.0
    - markdown==3.2.2
    - matplotlib==2.0.2
    - msgpack==1.0.0
    - msgpack-numpy==0.4.7.1
    - multidict==4.7.6
    - numpy==1.19.2
    - oauthlib==3.1.0
    - olefile==0.46
    - opencv-python==4.4.0.44
    - pillow==4.2.1
    - protobuf==3.13.0
    - psutil==5.7.2
    - pyasn1==0.4.8
    - pyasn1-modules==0.2.8
    - pyglet==1.5.0
    - pyparsing==2.4.7
    - python-dateutil==2.8.1
    - pytz==2020.1
    - pyzmq==19.0.2
    - requests==2.24.0
    - requests-oauthlib==1.3.0
    - rsa==4.6
    - scipy==1.5.2
    - simpleitk==1.2.4
    - tabulate==0.8.7
    - tensorboard==2.3.0
    - tensorboard-plugin-wit==1.7.0
    - tensorpack==0.9.5
    - termcolor==1.1.0
    - tqdm==4.50.0
    - typing-extensions==3.7.4.3
    - urllib3==1.25.10
    - werkzeug==1.0.1
    - yarl==1.6.0
    - zipp==3.3.0
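
After creating and activating the environment, a quick check that the key packages are importable can rule out the missing-OpenCV problem mentioned above. This is only a sketch: the helper name missing_modules and the REQUIRED list are assumptions made for illustration, not part of the repository. Note that opencv-python is imported as cv2 and simpleitk as SimpleITK.

```python
import importlib.util

# Import names assumed for this sketch; adjust to what the code actually imports.
REQUIRED = ["numpy", "cv2", "torch", "SimpleITK"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("missing:", ", ".join(missing) if missing else "none")
```

An empty result only shows that the packages are installed; it does not confirm that their versions match the ones pinned in the file.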

torressliu commented on July 19, 2024

Thanks for your help. However, when I use your environment.yml file to run this project, I get a new error:

The error says it needs 42.4 GB of video card memory?

gml16 commented on July 19, 2024

Great, I think it should be working then. This memory error occurs because the default replay buffer is quite large. Try setting the flag --memory_size to a lower value, such as 1000, and similarly set --init_memory_size to 500.
Do you know if the evaluation works now?

Edit: the default training values are the ones I used to produce the models presented in the paper. I will add a note to the README about reducing the memory size if it is an issue.
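
To see why lowering --memory_size helps, here is a rough back-of-the-envelope estimate of how a replay buffer's footprint scales with the number of stored states. It is a sketch under stated assumptions: the function name replay_buffer_gib, the state shape, the number of agents and the bytes per value are all hypothetical placeholders, not the project's actual defaults, so it is not meant to reproduce the 42.4 GB figure.

```python
def replay_buffer_gib(memory_size, state_shape, agents=1, bytes_per_value=1):
    """Rough size in GiB of a buffer holding `memory_size` states per agent.

    Ignores actions, rewards and any framework overhead.
    """
    values_per_state = 1
    for dim in state_shape:
        values_per_state *= dim
    total_bytes = memory_size * agents * values_per_state * bytes_per_value
    return total_bytes / 2**30


if __name__ == "__main__":
    # Hypothetical 3D patch size, chosen only to illustrate the scaling.
    shape = (45, 45, 45)
    for size in (100_000, 1_000):
        print(size, "states ->", round(replay_buffer_gib(size, shape), 3), "GiB")
```

The footprint grows linearly with memory_size, so dropping it by a factor of 100 shrinks the buffer by the same factor.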

torressliu commented on July 19, 2024

Yes, I set a lower value and it works! I tried the evaluation command and received this error: pyglet.canvas.xlib.NoSuchDisplayException: Cannot connect to "None"
That is probably a problem on my side. I think the error comes from the render function: my server doesn't have a virtual display installed yet. How do you get training images from the server?

gml16 commented on July 19, 2024

Good news :) Yes, you get this error because you cannot render. You can use the flag --viz 0 to disable rendering.
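
On Linux, the Cannot connect to "None" message typically means the DISPLAY environment variable is unset, so there is no X server to draw on. A launcher script could detect this and add the flag automatically. The sketch below assumes a Linux/X11 setup; the helper name viz_args is hypothetical, and only the --viz 0 flag itself comes from this thread.

```python
import os


def viz_args(environ=None):
    """Return extra CLI arguments that disable rendering when no X display is set.

    `environ` defaults to os.environ; passing a dict makes the check testable.
    """
    env = os.environ if environ is None else environ
    has_display = bool(env.get("DISPLAY"))
    # With a display available, leave the rendering flags untouched.
    return [] if has_display else ["--viz", "0"]


if __name__ == "__main__":
    print("extra arguments:", viz_args())
```

A set DISPLAY does not guarantee that the X server is reachable, so this check is a convenience and not a full test of rendering support.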

torressliu commented on July 19, 2024

Thank you for your patient guidance, your idea inspires me. 😁😁

gml16 commented on July 19, 2024

Glad I could help :)
You can also use the flag --saveGif to save the evaluation as a gif. That can be useful on headless machines.
