@unsky, nice work!
When training, the following error occurs. The details are below:
#########################################train#######################################
./data/VOCdevkit2007/VOC2007/JPEGImages/2009_002123.jpg
./data/VOCdevkit2007/VOC2007/JPEGImages/000783.jpg
[08:24:35] /home/chengshuai/mx-maskrcnn-master1/incubator-mxnet/dmlc-core/include/dmlc/logging.h:308: [08:24:35] /home/chengshuai/mx-maskrcnn-master1/incubator-mxnet/mshadow/mshadow/././././cuda/tensor_gpu-inl.cuh:58: too large launch parameter: Softmax[89847,1], [256,1,1]
Stack trace returned 10 entries:
[bt] (0) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3c) [0x7f0b7ad7f70c]
[bt] (1) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(_ZN7mshadow4cuda16CheckLaunchParamE4dim3S1_PKc+0x165) [0x7f0b7d3e83f5]
[bt] (2) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(ZN7mshadow4cuda7SoftmaxIfEEvRKNS_6TensorINS_3gpuELi2ET_EES7+0xfa) [0x7f0b7e3ec24a]
[bt] (3) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(ZN5mxnet2op19SoftmaxActivationOpIN7mshadow3gpuEE7ForwardERKNS_9OpContextERKSt6vectorINS_5TBlobESaIS9_EERKS8_INS_9OpReqTypeESaISE_EESD_SD+0x20b) [0x7f0b7e4fe57b]
[bt] (4) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(ZN5mxnet2op13OperatorState7ForwardERKNS_9OpContextERKSt6vectorINS_5TBlobESaIS6_EERKS5_INS_9OpReqTypeESaISB_EESA+0x354) [0x7f0b7d04a524]
[bt] (5) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(ZZN5mxnet10imperative12PushOperatorERKNS_10OpStatePtrEPKN4nnvm2OpERKNS4_9NodeAttrsERKNS_7ContextERKSt6vectorIPNS_6engine3VarESaISH_EESL_RKSE_INS_8ResourceESaISM_EERKSE_IPNS_7NDArrayESaISS_EESW_RKSE_IjSaIjEERKSE_INS_9OpReqTypeESaIS11_EENS_12DispatchModeEENKUlNS_10RunContextENSF_18CallbackOnCompleteEE0_clES17_S18+0x2a0) [0x7f0b7cec2950]
[bt] (6) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet6engine14ThreadedEngine15ExecuteOprBlockENS_10RunContextEPNS0_8OprBlockE+0x9d) [0x7f0b7ce3fc6d]
[bt] (7) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet6engine23ThreadedEnginePerDevice9GPUWorkerILN4dmlc19ConcurrentQueueTypeE0EEEvNS_7ContextEbPNS1_17ThreadWorkerBlockIXT_EEESt10shared_ptrINS0_10ThreadPool11SimpleEventEE+0xf3) [0x7f0b7ce43cb3]
[bt] (8) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(ZNSt17_Function_handlerIFvSt10shared_ptrIN5mxnet6engine10ThreadPool11SimpleEventEEEZZNS2_23ThreadedEnginePerDevice13PushToExecuteEPNS2_8OprBlockEbENKUlvE1_clEvEUlS5_E_E9_M_invokeERKSt9_Any_dataS5+0x56) [0x7f0b7ce43e96]
[bt] (9) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(_ZNSt6thread5_ImplISt12_Bind_simpleIFSt8functionIFvSt10shared_ptrIN5mxnet6engine10ThreadPool11SimpleEventEEEES8_EEE6_M_runEv+0x3b) [0x7f0b7ce410cb]
[08:24:35] /home/chengshuai/mx-maskrcnn-master1/incubator-mxnet/dmlc-core/include/dmlc/logging.h:308: [08:24:35] src/engine/./threaded_engine.h:370: [08:24:35] /home/chengshuai/mx-maskrcnn-master1/incubator-mxnet/mshadow/mshadow/././././cuda/tensor_gpu-inl.cuh:58: too large launch parameter: Softmax[89847,1], [256,1,1]
Stack trace returned 10 entries:
[bt] (0) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3c) [0x7f0b7ad7f70c]
[bt] (1) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(_ZN7mshadow4cuda16CheckLaunchParamE4dim3S1_PKc+0x165) [0x7f0b7d3e83f5]
[bt] (2) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(ZN7mshadow4cuda7SoftmaxIfEEvRKNS_6TensorINS_3gpuELi2ET_EES7+0xfa) [0x7f0b7e3ec24a]
[bt] (3) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(ZN5mxnet2op19SoftmaxActivationOpIN7mshadow3gpuEE7ForwardERKNS_9OpContextERKSt6vectorINS_5TBlobESaIS9_EERKS8_INS_9OpReqTypeESaISE_EESD_SD+0x20b) [0x7f0b7e4fe57b]
[bt] (4) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(ZN5mxnet2op13OperatorState7ForwardERKNS_9OpContextERKSt6vectorINS_5TBlobESaIS6_EERKS5_INS_9OpReqTypeESaISB_EESA+0x354) [0x7f0b7d04a524]
[bt] (5) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(ZZN5mxnet10imperative12PushOperatorERKNS_10OpStatePtrEPKN4nnvm2OpERKNS4_9NodeAttrsERKNS_7ContextERKSt6vectorIPNS_6engine3VarESaISH_EESL_RKSE_INS_8ResourceESaISM_EERKSE_IPNS_7NDArrayESaISS_EESW_RKSE_IjSaIjEERKSE_INS_9OpReqTypeESaIS11_EENS_12DispatchModeEENKUlNS_10RunContextENSF_18CallbackOnCompleteEE0_clES17_S18+0x2a0) [0x7f0b7cec2950]
[bt] (6) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet6engine14ThreadedEngine15ExecuteOprBlockENS_10RunContextEPNS0_8OprBlockE+0x9d) [0x7f0b7ce3fc6d]
[bt] (7) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet6engine23ThreadedEnginePerDevice9GPUWorkerILN4dmlc19ConcurrentQueueTypeE0EEEvNS_7ContextEbPNS1_17ThreadWorkerBlockIXT_EEESt10shared_ptrINS0_10ThreadPool11SimpleEventEE+0xf3) [0x7f0b7ce43cb3]
[bt] (8) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(ZNSt17_Function_handlerIFvSt10shared_ptrIN5mxnet6engine10ThreadPool11SimpleEventEEEZZNS2_23ThreadedEnginePerDevice13PushToExecuteEPNS2_8OprBlockEbENKUlvE1_clEvEUlS5_E_E9_M_invokeERKSt9_Any_dataS5+0x56) [0x7f0b7ce43e96]
[bt] (9) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(_ZNSt6thread5_ImplISt12_Bind_simpleIFSt8functionIFvSt10shared_ptrIN5mxnet6engine10ThreadPool11SimpleEventEEEES8_EEE6_M_runEv+0x3b) [0x7f0b7ce410cb]
A fatal error occurred in asynchronous engine operation. If you do not know what caused this error, you can try set environment variable MXNET_ENGINE_TYPE to NaiveEngine and run with debugger (i.e. gdb). This will force all operations to be synchronous and backtrace will give you the series of calls that lead to this error. Remember to set MXNET_ENGINE_TYPE back to empty after debugging.
Stack trace returned 8 entries:
[bt] (0) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3c) [0x7f0b7ad7f70c]
[bt] (1) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet6engine14ThreadedEngine15ExecuteOprBlockENS_10RunContextEPNS0_8OprBlockE+0x3a0) [0x7f0b7ce3ff70]
[bt] (2) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet6engine23ThreadedEnginePerDevice9GPUWorkerILN4dmlc19ConcurrentQueueTypeE0EEEvNS_7ContextEbPNS1_17ThreadWorkerBlockIXT_EEESt10shared_ptrINS0_10ThreadPool11SimpleEventEE+0xf3) [0x7f0b7ce43cb3]
[bt] (3) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(ZNSt17_Function_handlerIFvSt10shared_ptrIN5mxnet6engine10ThreadPool11SimpleEventEEEZZNS2_23ThreadedEnginePerDevice13PushToExecuteEPNS2_8OprBlockEbENKUlvE1_clEvEUlS5_E_E9_M_invokeERKSt9_Any_dataS5+0x56) [0x7f0b7ce43e96]
[bt] (4) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(_ZNSt6thread5_ImplISt12_Bind_simpleIFSt8functionIFvSt10shared_ptrIN5mxnet6engine10ThreadPool11SimpleEventEEEES8_EEE6_M_runEv+0x3b) [0x7f0b7ce410cb]
[bt] (5) /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xb1a60) [0x7f0b9d5c7a60]
[bt] (6) /lib/x86_64-linux-gnu/libpthread.so.0(+0x8182) [0x7f0ba193a182]
[bt] (7) /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f0ba166747d]
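
The message above suggests setting `MXNET_ENGINE_TYPE` to `NaiveEngine` to get a synchronous backtrace. A minimal sketch of how that might be done from Python follows; the helper name `naive_engine_env` is hypothetical, and it assumes the variable has to be in the environment before MXNet starts.

```python
import os


def naive_engine_env(base_env=None):
    """Return a copy of the environment with the naive engine selected.

    The variable name and value come from the error message itself.
    The helper is hypothetical and does not modify the caller's environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MXNET_ENGINE_TYPE"] = "NaiveEngine"
    return env


if __name__ == "__main__":
    # Show the value a child process would see.
    print(naive_engine_env()["MXNET_ENGINE_TYPE"])
```

The returned dictionary could be passed as `env=` to `subprocess.call` when launching the training script, so the setting only applies to that debugging run and does not need to be reset afterwards.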
terminate called after throwing an instance of 'dmlc::Error'
what(): [08:24:35] src/engine/./threaded_engine.h:370: [08:24:35] /home/chengshuai/mx-maskrcnn-master1/incubator-mxnet/mshadow/mshadow/././././cuda/tensor_gpu-inl.cuh:58: too large launch parameter: Softmax[89847,1], [256,1,1]
Stack trace returned 10 entries:
[bt] (0) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3c) [0x7f0b7ad7f70c]
[bt] (1) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(_ZN7mshadow4cuda16CheckLaunchParamE4dim3S1_PKc+0x165) [0x7f0b7d3e83f5]
[bt] (2) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(ZN7mshadow4cuda7SoftmaxIfEEvRKNS_6TensorINS_3gpuELi2ET_EES7+0xfa) [0x7f0b7e3ec24a]
[bt] (3) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(ZN5mxnet2op19SoftmaxActivationOpIN7mshadow3gpuEE7ForwardERKNS_9OpContextERKSt6vectorINS_5TBlobESaIS9_EERKS8_INS_9OpReqTypeESaISE_EESD_SD+0x20b) [0x7f0b7e4fe57b]
[bt] (4) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(ZN5mxnet2op13OperatorState7ForwardERKNS_9OpContextERKSt6vectorINS_5TBlobESaIS6_EERKS5_INS_9OpReqTypeESaISB_EESA+0x354) [0x7f0b7d04a524]
[bt] (5) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(ZZN5mxnet10imperative12PushOperatorERKNS_10OpStatePtrEPKN4nnvm2OpERKNS4_9NodeAttrsERKNS_7ContextERKSt6vectorIPNS_6engine3VarESaISH_EESL_RKSE_INS_8ResourceESaISM_EERKSE_IPNS_7NDArrayESaISS_EESW_RKSE_IjSaIjEERKSE_INS_9OpReqTypeESaIS11_EENS_12DispatchModeEENKUlNS_10RunContextENSF_18CallbackOnCompleteEE0_clES17_S18+0x2a0) [0x7f0b7cec2950]
[bt] (6) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet6engine14ThreadedEngine15ExecuteOprBlockENS_10RunContextEPNS0_8OprBlockE+0x9d) [0x7f0b7ce3fc6d]
[bt] (7) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet6engine23ThreadedEnginePerDevice9GPUWorkerILN4dmlc19ConcurrentQueueTypeE0EEEvNS_7ContextEbPNS1_17ThreadWorkerBlockIXT_EEESt10shared_ptrINS0_10ThreadPool11SimpleEventEE+0xf3) [0x7f0b7ce43cb3]
[bt] (8) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(ZNSt17_Function_handlerIFvSt10shared_ptrIN5mxnet6engine10ThreadPool11SimpleEventEEEZZNS2_23ThreadedEnginePerDevice13PushToExecuteEPNS2_8OprBlockEbENKUlvE1_clEvEUlS5_E_E9_M_invokeERKSt9_Any_dataS5+0x56) [0x7f0b7ce43e96]
[bt] (9) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(_ZNSt6thread5_ImplISt12_Bind_simpleIFSt8functionIFvSt10shared_ptrIN5mxnet6engine10ThreadPool11SimpleEventEEEES8_EEE6_M_runEv+0x3b) [0x7f0b7ce410cb]
A fatal error occurred in asynchronous engine operation. If you do not know what caused this error, you can try set environment variable MXNET_ENGINE_TYPE to NaiveEngine and run with debugger (i.e. gdb). This will force all operations to be synchronous and backtrace will give you the series of calls that lead to this error. Remember to set MXNET_ENGINE_TYPE back to empty after debugging.
Stack trace returned 8 entries:
[bt] (0) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3c) [0x7f0b7ad7f70c]
[bt] (1) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet6engine14ThreadedEngine15ExecuteOprBlockENS_10RunContextEPNS0_8OprBlockE+0x3a0) [0x7f0b7ce3ff70]
[bt] (2) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet6engine23ThreadedEnginePerDevice9GPUWorkerILN4dmlc19ConcurrentQueueTypeE0EEEvNS_7ContextEbPNS1_17ThreadWorkerBlockIXT_EEESt10shared_ptrINS0_10ThreadPool11SimpleEventEE+0xf3) [0x7f0b7ce43cb3]
[bt] (3) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(ZNSt17_Function_handlerIFvSt10shared_ptrIN5mxnet6engine10ThreadPool11SimpleEventEEEZZNS2_23ThreadedEnginePerDevice13PushToExecuteEPNS2_8OprBlockEbENKUlvE1_clEvEUlS5_E_E9_M_invokeERKSt9_Any_dataS5+0x56) [0x7f0b7ce43e96]
[bt] (4) /usr/local/lib/python2.7/dist-packages/mxnet-0.12.1-py2.7.egg/mxnet/libmxnet.so(_ZNSt6thread5_ImplISt12_Bind_simpleIFSt8functionIFvSt10shared_ptrIN5mxnet6engine10ThreadPool11SimpleEventEEEES8_EEE6_M_runEv+0x3b) [0x7f0b7ce410cb]
[bt] (5) /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xb1a60) [0x7f0b9d5c7a60]
[bt] (6) /lib/x86_64-linux-gnu/libpthread.so.0(+0x8182) [0x7f0ba193a182]
[bt] (7) /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f0ba166747d]
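
One possible reading of `too large launch parameter: Softmax[89847,1], [256,1,1]` is that the Softmax kernel was launched with a grid of 89847 blocks of 256 threads each, and that the grid size is over the allowed limit. The sketch below parses the message and checks it against two limits that are assumptions on my side and not stated in the log: 65535 per grid dimension and 1024 threads per block. The function names are hypothetical.

```python
import re

# Assumed limits, not taken from the log above.
ASSUMED_MAX_GRID_DIM = 65535
ASSUMED_MAX_THREADS_PER_BLOCK = 1024

_PATTERN = re.compile(
    r"too large launch parameter: (\w+)\[(\d+),(\d+)\], \[(\d+),(\d+),(\d+)\]"
)


def parse_launch_error(message):
    """Extract (kernel, grid, block) from the error text, or None if absent."""
    match = _PATTERN.search(message)
    if match is None:
        return None
    kernel = match.group(1)
    grid = (int(match.group(2)), int(match.group(3)))
    block = tuple(int(match.group(i)) for i in (4, 5, 6))
    return kernel, grid, block


def explain_launch_error(message):
    """Report which of the assumed limits the launch parameters exceed."""
    parsed = parse_launch_error(message)
    if parsed is None:
        return None
    kernel, grid, block = parsed
    threads = block[0] * block[1] * block[2]
    return {
        "kernel": kernel,
        "grid": grid,
        "block": block,
        "grid_too_large": max(grid) > ASSUMED_MAX_GRID_DIM,
        "block_too_large": threads > ASSUMED_MAX_THREADS_PER_BLOCK,
    }


if __name__ == "__main__":
    line = "too large launch parameter: Softmax[89847,1], [256,1,1]"
    print(explain_launch_error(line))
```

If those assumed limits are right, only the grid dimension (89847) is over the limit, which would point at the number of rows fed to the softmax rather than at the block size.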
What is the problem? Could it be caused by the MXNet version?
Thanks!