Comments (14)
Hi @scottneuhoff, thank you for your interest in trying our async VOL connector and for providing feedback.
I have updated the README with the correct path for the installed libraries as you suggested.
The errors you are seeing with "./async_test_serial_event_set_error_stack.exe" are likely because the HDF5_VOL_CONNECTOR environment variable is not set properly, so the test program is not using the async connector. Can you run "echo $HDF5_VOL_CONNECTOR" before your run and make sure it prints "async under_vol=0;under_info={}"? It would also be good to update the HDF5 code and async code to the latest version with "git pull", since we have been fixing bugs.
For the parallel tests, can you check the content of async_vol_test.err, or just run "mpirun -np 4 ./async_test_parallel.exe" and see if there are any errors?
Which Linux distribution and version are you running? You can check with "cat /etc/os-release".
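The HDF5_VOL_CONNECTOR check suggested in this comment could be scripted roughly as follows. This is only a sketch: `check_vol_connector` is a hypothetical helper name, not part of vol-async, and the expected string is simply the value quoted above.

```shell
#!/bin/sh
# Sketch: confirm HDF5_VOL_CONNECTOR matches the value quoted in this thread
# before launching a test. check_vol_connector is a hypothetical helper.
expected_connector='async under_vol=0;under_info={}'

check_vol_connector() {
    if [ "${HDF5_VOL_CONNECTOR:-}" = "$expected_connector" ]; then
        echo "HDF5_VOL_CONNECTOR is set as expected"
        return 0
    fi
    # Report what was found (or that the variable is unset) on stderr.
    echo "HDF5_VOL_CONNECTOR is '${HDF5_VOL_CONNECTOR:-<unset>}', expected '$expected_connector'" >&2
    return 1
}

# Example usage (uncomment to run the test only when the check passes):
# export HDF5_VOL_CONNECTOR="async under_vol=0;under_info={}"
# check_vol_connector && ./async_test_serial_event_set_error_stack.exe
```

The value contains a space and a semicolon, so it needs to be quoted when exported; an unquoted assignment would be cut off at the space.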
from vol-async.
@houjun, thanks for your feedback, directory structures look much better. However, I am still unable to pass the multifile test. When I run make check from $VOL_DIR/tests, I get the following:
./pytest.py -p
Running serial tests
Test # 1 : async_test_serial.exe PASSED
Test # 2 : async_test_serial2.exe PASSED
ERROR: Test async_test_multifile.exe : returned non-zero exit status= -6 aborting test
run_cmd= ./async_test_multifile.exe
pytest was unsuccessful
Running the multifile test individually and/or checking async_vol_test.err yields the same message:
async_test_multifile.exe: H5CX.c:3610: H5CX__pop_common: Assertion `head && *head' failed.
This was after double-checking my directories and environment variables and pulling the most recent updates from git. I am currently running SUSE Linux, SLES12.
As for async_test_parallel.exe, when I try to run that individually, I get the following:
HDF5-DIAG: Error detected in HDF5 (1.13.0) MPI-process 0:
#000: H5.c line 1010 in H5open(): library initialization failed
major: Function entry/exit
minor: Unable to initialize object
#001: H5.c line 277 in H5_init_library(): unable to initialize vol interface
major: Function entry/exit
minor: Unable to initialize object
#002: H5VLint.c line 202 in H5VL_init_phase2(): unable to set default VOL connector
major: Virtual Object Layer
minor: Can't set value
#003: H5VLint.c line 444 in H5VL__set_def_conn(): can't register connector
major: Virtual Object Layer
minor: Unable to register new ID
#004: H5VLint.c line 1371 in H5VL__register_connector_by_name(): unable to load VOL connector
major: Virtual Object Layer
minor: Unable to initialize object
HDF5-DIAG: Error detected in HDF5 (1.13.0) MPI-process 0:
#000: H5VL.c line 144 in H5VLregister_connector_by_name(): unable to register VOL connector
major: Virtual Object Layer
minor: Unable to register new ID
#001: H5VLint.c line 1371 in H5VL__register_connector_by_name(): unable to load VOL connector
major: Virtual Object Layer
minor: Unable to initialize object
[ASYNC VOL ERROR] with H5VLregister_connector_by_name
[ASYNC VOL ERROR] H5Pset_vol_async: async_setup
async_test_parallel.exe: async_test_parallel.c:41: main: Assertion `status >= 0' failed.
MPT ERROR: Rank 0(g:0) received signal SIGABRT/SIGIOT(6).
from vol-async.
@scottneuhoff - This looks like the bug I fixed last week on the async_vol_register_optional branch of the HPC-IO org's HDF5 git repo. Can you please pull the latest HDF5 code from that branch, rebuild & install, then try this test again?
from vol-async.
@scottneuhoff I just pushed a change to the parallel testing that removes H5Pset_vol_async() calls which are not necessary. Can you try again? If there are still errors, maybe we can find a time for a zoom session to go through the tests together?
from vol-async.
@qkoziol @houjun Thanks for your quick responses. I switched to another machine that I hoped would be less complicated to work with (Red Hat Linux) and went through the process from a fresh directory, which ensured that I had cloned the most recent versions of the repos, so all the changes you refer to should be in. However, I get to exactly the same place as before: I set those environment variables, go into $VOL_DIR/test, edit my Makefile, run make check, and hit:
./pytest.py -p
Running serial tests
Test # 1 : async_test_serial.exe PASSED
Test # 2 : async_test_serial2.exe PASSED
ERROR: Test async_test_multifile.exe : returned non-zero exit status= -6 aborting test
run_cmd= ./async_test_multifile.exe
pytest was unsuccessful
Again, running async_test_multifile.exe individually (or checking async_vol_test.err) tells me:
async_test_multifile.exe: H5CX.c:3610: H5CX__pop_common: Assertion `head && *head' failed.
Interestingly, I also found that earlier in the install process, when running make check in my $H5_DIR (the hdf5 directory), I pass most of those checks but then reach test_mirror.sh, which fails with the same error:
============================
Testing test_mirror.sh
============================
test_mirror.sh Test Log
============================
mkdir: cannot create directory ‘mirror_vfd_test’: File exists
Launching Mirror Server
Mirror VFD was not built -- cannot launch server.
mirror_vfd: H5CX.c:3610: H5CX__pop_common: Assertion `head && *head' failed.
test_mirror.sh: line 80: 13012 Aborted (core dumped) ./mirror_vfd
Stopping Mirror Server
Mirror VFD not built -- unable to perform shutdown.
Mirror VFD tests FAILED.
0.05user 0.11system 0:00.29elapsed 58%CPU (0avgtext+0avgdata 3376maxresident)k
0inputs+1656outputs (0major+18419minor)pagefaults 0swaps
make[4]: *** [Makefile:3536: test_mirror.sh.chkexe_] Error 1
It's the same Assertion `head && *head' failed.
@houjun, if a call would make solving this issue smoother, then please email me at [email protected] and we can figure it out. Thanks~
from vol-async.
@scottneuhoff - Although the assert is the same, I have a feeling that these have different root causes. I'm happy to "pair debug" in a call as well; I'll send you and Tang an email to set something up.
from vol-async.
Closing this as we solved the problem in the call.
from vol-async.
I am trying the async VOL today and am getting the same head && *head assertion failure. My application accesses HDF5 through either the netCDF or CGNS library (the assertion fails with both). In the assert, head is non-null, but *head is NULL.
Not sure if it has the same cause as this issue, but the failure looks the same, so perhaps the solution is the same?
from vol-async.
Hi @gsjaardema, the previous problem was due to an environment variable setting: HDF5_PLUGIN_PATH should be set to "$VOL_DIR/src" instead of "$VOL_DIR". Can you check the HDF5_PLUGIN_PATH value in your environment? (Also, please update the HDF5 library as well as the async VOL to the latest version.)
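One way to check this setting is to confirm that a directory listed in HDF5_PLUGIN_PATH actually contains the connector library. The sketch below assumes the library is named libh5async.so (the name used later in this thread) and that the variable may hold several colon-separated directories; `find_async_plugin` is a hypothetical helper, not something shipped with HDF5 or vol-async.

```shell
#!/bin/sh
# Sketch: look for the async connector library in each directory of a
# colon-separated plugin path. find_async_plugin is a hypothetical helper.
# The function body runs in a subshell, so the IFS and globbing changes
# below do not leak into the calling shell.
find_async_plugin() (
    plugin_path="${1:-${HDF5_PLUGIN_PATH:-}}"   # default: the environment value
    lib_name="${2:-libh5async.so}"              # assumed library file name
    IFS=':'    # split the search path on colons
    set -f     # disable globbing while splitting
    for dir in $plugin_path; do
        if [ -n "$dir" ] && [ -e "$dir/$lib_name" ]; then
            echo "$dir/$lib_name"
            return 0
        fi
    done
    echo "no $lib_name found in plugin path '$plugin_path'" >&2
    return 1
)

# Example usage:
# export HDF5_PLUGIN_PATH="$VOL_DIR/src"
# find_async_plugin || echo "check HDF5_PLUGIN_PATH"
```

With HDF5_PLUGIN_PATH pointing at "$VOL_DIR" rather than "$VOL_DIR/src", this check would fail because the library is not in that directory itself.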
from vol-async.
The HDF5_PLUGIN_PATH is set to the correct location and the HDF5 library is up-to-date as of yesterday. I will do some more debugging and make sure the plugin is being found and loaded and then open a new issue if I still can't determine the problem.
from vol-async.
This also seems similar to a problem that I fixed in the incoming branch. Can you try again with the 'async_vol_register_optional' branch of both the HDF5 and vol-async repos from the HPC-IO org's forks:
https://github.com/hpc-io/hdf5/tree/async_vol_register_optional
https://github.com/hpc-io/vol-async/tree/async_vol_register_optional
from vol-async.
I am on the async_vol_register_optional branch of both HDF5 and vol-async repos and code is up-to-date. For some reason the dlopen is failing in H5PL__open. The path that it is using in that routine points to the correct plugin library, but for some reason, the handle coming back is NULL.
I made a simple C program to just do a dlopen on the same file and a dlsym on the symbol and that works correctly.
Not sure why it can't open the libh5async.so library in the app, but can in my simple program... The tests in vol-async run correctly, so it is opening the library correctly there...
from vol-async.
OK, I reran with LD_DEBUG=libs, which outputs debug information related to shared libraries, and it showed that the application could not find libabt.so. I then reran with the LD_LIBRARY_PATH specifying the path to libabt.so and everything seems to be working.
I'm not sure why it isn't finding the library without LD_LIBRARY_PATH being set since the executable has that location specified with the -rpath option... But, seems to be working for now. Will see if I can figure out why it needs LD_LIBRARY_PATH.
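A quicker way to spot this kind of failure, before reaching for LD_DEBUG=libs, is to run ldd on the plugin itself and look for unresolved dependencies. The sketch below is one possible approach, not something from the vol-async documentation: `list_missing_libs` is a hypothetical helper that filters ldd output, and the paths in the usage comments are placeholders.

```shell
#!/bin/sh
# Sketch: print the names of shared libraries that ldd reports as
# "not found". list_missing_libs is a hypothetical helper that reads
# ldd output on standard input.
list_missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example usage (paths are placeholders, adjust to your installation):
# ldd "$VOL_DIR/src/libh5async.so" | list_missing_libs
# LD_DEBUG=libs ./my_app 2>&1 | grep libabt
# LD_LIBRARY_PATH="/path/to/dir/containing/libabt:$LD_LIBRARY_PATH" ./my_app
```

One possible explanation for the -rpath behavior, not confirmed in this thread: if the linker recorded the executable's -rpath as DT_RUNPATH rather than DT_RPATH, the glibc loader applies it only to the executable's direct dependencies, so it would not be used when resolving the dependencies of a plugin that is loaded later through dlopen. Running `readelf -d` on the executable and on libh5async.so shows which of the two tags each one carries.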
from vol-async.
Ah, very cool! Annoying about the LD_LIBRARY_PATH though - I would tend to agree with you, it should have been linked into the async VOL connector and shouldn't need to be added to the dynamic library path.
from vol-async.