Comments (8)

mwinkel-dev commented on July 22, 2024

Using Ubuntu 20, created a new Docker image for Ubuntu 18 (instead of using the archived image on Docker Hub). However, when building with mdsplus/deploy/build.sh --os=ubuntu18 --test, it still hangs on TreeSegmentTest.

According to posts on the web, this issue is likely caused by the version of GLIBC used. Probably will have to run the debugger to figure out which thread owns the mutex and why the deadlock is occurring.

from mdsplus.

mwinkel-dev commented on July 22, 2024

Using the debugger, noticed that the three threads appear to be hung in the fcntl() call made by io_lock_local() in treeshr/RemoteAccess.c at line 1544 (https://github.com/MDSplus/mdsplus/blob/alpha/treeshr/RemoteAccess.c#L1554).

This debugging was done with a local build (not using Docker) on Ubuntu 20. Note that Ubuntu 20 uses GLIBC 2.31 while Ubuntu 18 uses GLIBC 2.27.

According to posts on the web, there was a significant change in GLIBC 2.28 -- apparently it redefined fcntl() to call fcntl64(). The following posts might provide some clues.

mwinkel-dev commented on July 22, 2024

Using Ubuntu 20 (with GLIBC 2.31), the TreeSegmentTest program is deadlocking when each of the three threads attempts to write its first segment. The lslocks command shows that there are three "advisory" write locks on different regions of the tree_test_001.characteristics file. The debugger reveals that they are open file description (OFD) locks requested with F_OFD_SETLKW, which means fcntl64() blocks until each lock can be acquired. The three locks are deadlocking, so the threads never run to completion, which is why the pthread_join() call hangs.
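
For readers unfamiliar with this kind of lock, here is a minimal sketch of a blocking write lock on a byte range taken with F_OFD_SETLKW. It is not the MDSplus code: the helper names ofd_region() and ofd_region_demo(), the scratch path, and the 16-byte range are all made up for illustration, and the fallback value for F_OFD_SETLKW is the Linux one.

```c
#define _GNU_SOURCE /* asks glibc's <fcntl.h> to expose the F_OFD_* commands */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef F_OFD_SETLKW
#define F_OFD_SETLKW 38 /* Linux value, used only if the headers did not define it */
#endif

/* Hypothetical helper: apply F_WRLCK, F_RDLCK or F_UNLCK to the bytes
 * [start, start + len) and block until the request is granted.
 * Returns 0 on success, -1 on failure with errno set by fcntl(). */
static int ofd_region(int fd, short type, off_t start, off_t len)
{
    struct flock fl;

    assert(fd >= 0);
    memset(&fl, 0, sizeof fl); /* l_pid must be 0 for the OFD commands */
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;
    return fcntl(fd, F_OFD_SETLKW, &fl);
}

/* Demo: write-lock, then unlock, the first 16 bytes of a scratch file. */
static int ofd_region_demo(void)
{
    char path[] = "/tmp/ofd_demo_XXXXXX"; /* illustrative scratch file */
    int fd = mkstemp(path);
    int rc;

    if (fd < 0)
        return -1;
    rc = ofd_region(fd, F_WRLCK, 0, 16);
    if (rc == 0)
        rc = ofd_region(fd, F_UNLCK, 0, 16);
    close(fd);
    unlink(path);
    return rc;
}
```

Calling ofd_region_demo() returns 0 when both the lock and the unlock succeed. OFD locks belong to the open file description, so threads that should exclude each other each need their own open() of the file.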

Note: eventually this bug will be given a new title. The primary issue is with MDSplus threads / locking / GLIBC version. The problems this creates for the build scripts / system are a secondary issue.

mwinkel-dev commented on July 22, 2024

Surprisingly, the TreeSegmentTest program is not deadlocked. It does run to completion -- just extremely slowly.

When running TreeSegmentTest 2 1 2 (two threads, one segment per thread, and two elements per segment), each lock takes ~30 seconds (presumably when writing to the *.datafile). Found this out by adding some debug print statements to treeshr/RemoteAccess.c and running with the debugger.

gdb ./TreeSegmentTest
r 2 1 2
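
The per-lock delay can also be measured without a debugger by timing the fcntl() call directly. The sketch below is only an illustration of that idea, not the debug code that was added to treeshr/RemoteAccess.c; the helper name timed_write_lock() and the message format are assumptions.

```c
#define _GNU_SOURCE /* asks glibc's <fcntl.h> to expose the F_OFD_* commands */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#ifndef F_OFD_SETLKW
#define F_OFD_SETLKW 38 /* Linux value, used only if the headers did not define it */
#endif

/* Hypothetical helper: take a blocking OFD write lock on [start, start + len)
 * and report how long fcntl() took. Returns the elapsed seconds, or -1.0 if
 * the lock request failed. */
static double timed_write_lock(int fd, off_t start, off_t len)
{
    struct flock fl;
    struct timespec t0, t1;
    double secs;

    assert(fd >= 0);
    memset(&fl, 0, sizeof fl); /* l_pid must be 0 for the OFD commands */
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    if (fcntl(fd, F_OFD_SETLKW, &fl) != 0)
        return -1.0;
    clock_gettime(CLOCK_MONOTONIC, &t1);

    secs = (double)(t1.tv_sec - t0.tv_sec) +
           (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    fprintf(stderr, "write lock at offset %lld, length %lld: %.6f s\n",
            (long long)start, (long long)len, secs);
    return secs;
}
```

Running the same helper against a file on a local file system and against a file on the suspect volume gives a direct comparison of the lock latency.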

mwinkel-dev commented on July 22, 2024

The TreeSegmentTest uses pthread_cond_wait and pthread_cond_broadcast.

Turns out there is a gnarly bug with pthread_cond_* that was found in GLIBC 2.27 and still hasn't been fixed in GLIBC 2.31.
References: Sourceware.org bug 25847 and Red Hat bug 1889892.

Fortunately, the TreeSegmentTest usage is so simple, it surely doesn't trigger that bug.
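
As an illustration of what a simple use of these two calls looks like, here is a sketch of a "start gate": worker threads wait on a condition variable until the main thread broadcasts. This is not the TreeSegmentTest source; the names gate_open, worker() and run_start_gate() and the worker count of 3 are assumptions made for the example.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_WORKERS 3

static pthread_mutex_t gate_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t gate_cond = PTHREAD_COND_INITIALIZER;
static int gate_open = 0; /* predicate, guarded by gate_mutex */
static int finished = 0;  /* workers that passed the gate, guarded by gate_mutex */

static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&gate_mutex);
    while (!gate_open) /* re-check the predicate: wakeups may be spurious */
        pthread_cond_wait(&gate_cond, &gate_mutex);
    finished++;
    pthread_mutex_unlock(&gate_mutex);
    return NULL;
}

/* Start the workers, release them together, and return how many finished. */
static int run_start_gate(void)
{
    pthread_t tid[NUM_WORKERS];
    int created = 0;
    int i;

    for (i = 0; i < NUM_WORKERS; i++) {
        if (pthread_create(&tid[created], NULL, worker, NULL) == 0)
            created++;
    }

    /* Change the predicate under the mutex, then wake every waiter. */
    pthread_mutex_lock(&gate_mutex);
    gate_open = 1;
    pthread_cond_broadcast(&gate_cond);
    pthread_mutex_unlock(&gate_mutex);

    for (i = 0; i < created; i++)
        pthread_join(tid[i], NULL);

    assert(finished <= created);
    return finished;
}
```

The predicate is changed and tested only while holding the mutex, and the wait sits inside a loop, so a worker that starts late or wakes spuriously still behaves correctly.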

zack-vii commented on July 22, 2024

The issue may be related to an issue recently under investigation (to be filed here). There seems to be a problem with the locking scheme for appending data to the end of a file with competing concurrent writers. I think the problem may be a race condition between the SEEK_END used for the locking and the SEEK_END used for the writing. However, I also seem to have problems properly locking beyond the size of the file (documented to be possible): the lock seems to succeed even if another process is already holding a competing lock in that region. According to this post, https://nullprogram.com/blog/2016/08/03/, it's quite challenging to append larger continuous blocks of data.
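
To make the pattern under discussion concrete, here is a sketch of one way to append under a lock that starts at the end of the file. It is not the MDSplus locking scheme and it has not been checked on NFS; the helper names locked_append() and locked_append_demo() are made up. The end of file is looked up again only after the lock has been granted, which is one way to avoid depending on two separate SEEK_END lookups agreeing with each other.

```c
#define _GNU_SOURCE /* asks glibc's <fcntl.h> to expose the F_OFD_* commands */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef F_OFD_SETLKW
#define F_OFD_SETLKW 38 /* Linux value, used only if the headers did not define it */
#endif

/* Hypothetical helper: append len bytes while holding a write lock that runs
 * from the current end of file onward. Returns the byte count reported by
 * write(), or -1 on failure. Each competing writer must use its own open()
 * of the file, because OFD locks do not conflict within one description. */
static ssize_t locked_append(int fd, const void *buf, size_t len)
{
    struct flock fl;
    ssize_t written = -1;

    assert(fd >= 0 && buf != NULL);
    memset(&fl, 0, sizeof fl);
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_END; /* range starts at the end of file seen by this call */
    fl.l_start = 0;
    fl.l_len = 0;           /* 0 means "to the end of the file, however far it grows" */
    if (fcntl(fd, F_OFD_SETLKW, &fl) != 0)
        return -1;

    /* Look up the end of file again now that the lock is held. */
    if (lseek(fd, 0, SEEK_END) != (off_t)-1)
        written = write(fd, buf, len); /* a short write is possible */

    /* Unlock the whole file: the end of file has moved, so unlocking
     * relative to SEEK_END would leave the appended bytes locked.
     * This also drops any other lock held through this description. */
    fl.l_type = F_UNLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;
    fcntl(fd, F_OFD_SETLKW, &fl);
    return written;
}

/* Demo: append two 5-byte records to a scratch file, return its final size. */
static long locked_append_demo(void)
{
    char path[] = "/tmp/append_demo_XXXXXX"; /* illustrative scratch file */
    int fd = mkstemp(path);
    long size = -1;

    if (fd < 0)
        return -1;
    if (locked_append(fd, "hello", 5) == 5 && locked_append(fd, "world", 5) == 5)
        size = (long)lseek(fd, 0, SEEK_END);
    close(fd);
    unlink(path);
    return size;
}
```

The locks are advisory, so this only helps when every writer follows the same protocol.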

mwinkel-dev commented on July 22, 2024

Mystery solved. Should have used a local file system and not an NFS mounted volume.

However, there might indeed also be a residual locking problem as per the preceding post by @zack-vii .

All of the above experiments were conducted on x86_64 with Ubuntu 20.04 (GLIBC 2.31) as the host OS, and all were run on an NFS-mounted volume. (However, the reference system with Ubuntu 18.04 as the host OS used a local file system.)

Turns out that in GLIBC 2.27 and earlier, there was an issue handling "open file description" locks (e.g., F_OFD_SETLKW) on systems that support large file sizes (> 2 GB or > 4 GB). The issue / bug apparently has to do with "cancelling" a lock when a thread is terminated via a signal. Note that Ubuntu 18.04 uses GLIBC 2.27.

So in GLIBC 2.28, that problem was fixed by using the SYSCALL_CANCEL macro to run fcntl64 to make the F_OFD_SETLKW lock. And in GLIBC's sysdeps/unix/sysdep.h there is logic for multi-threaded code to use three steps:

  • first change the cancellation mode to LIBC_CANCEL_ASYNC,
  • then run the system call (which is fcntl64 in this case and is what creates the lock),
  • and then restore the original cancellation mode with LIBC_CANCEL_RESET.

Those three steps run quickly on a local file system but are glacial when creating F_OFD_SETLKW locks on an NFS-mounted volume. That likely explains why, in the above experiments, it was taking ~30 seconds per lock on the *.datafile.
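
The three steps can be pictured with the public pthread API. The sketch below is only an analogy: GLIBC's LIBC_CANCEL_ASYNC and LIBC_CANCEL_RESET are internal macros that application code cannot call, so pthread_setcanceltype() stands in for them here, and the helper names ofd_lock_async_cancel() and async_cancel_demo() are made up.

```c
#define _GNU_SOURCE /* asks glibc's <fcntl.h> to expose the F_OFD_* commands */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef F_OFD_SETLKW
#define F_OFD_SETLKW 38 /* Linux value, used only if the headers did not define it */
#endif

/* Hypothetical helper mirroring the three steps with public calls. */
static int ofd_lock_async_cancel(int fd, struct flock *fl)
{
    int old_type, ignored, rc, saved_errno;

    assert(fl != NULL);
    /* Step 1: switch to asynchronous cancellation while the call may block. */
    pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, &old_type);
    /* Step 2: the system call that requests the lock. */
    rc = fcntl(fd, F_OFD_SETLKW, fl);
    saved_errno = errno;
    /* Step 3: restore whatever cancellation type was in effect before. */
    pthread_setcanceltype(old_type, &ignored);
    errno = saved_errno;
    return rc;
}

/* Demo: lock the first 8 bytes of a scratch file through the helper. */
static int async_cancel_demo(void)
{
    char path[] = "/tmp/cancel_demo_XXXXXX"; /* illustrative scratch file */
    int fd = mkstemp(path);
    struct flock fl;
    int rc;

    if (fd < 0)
        return -1;
    memset(&fl, 0, sizeof fl); /* l_pid must be 0 for the OFD commands */
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 8;
    rc = ofd_lock_async_cancel(fd, &fl);
    close(fd); /* closing the last descriptor releases the OFD lock */
    unlink(path);
    return rc;
}
```

Steps 1 and 3 only change per-thread state, so in this sketch the time spent waiting is all inside the fcntl() call in step 2.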

Note that Ubuntu 20.04 uses GLIBC 2.31 (which of course includes the GLIBC 2.28 changes).

After switching to a local file system, the TreeSegmentTest ran fine with the default settings: 3 threads, 100 segments per thread, and 10,000 elements per segment.

Will do some more testing to confirm there are no more gremlins lurking in the weeds, and then will close this issue. The likely resolution is merely to update the documentation on the MdsWiki to explain that building / testing should use local file systems.

mwinkel-dev commented on July 22, 2024

It is a known issue that doing data acquisition to tree files on NFS volumes degrades MDSplus performance because of the overhead NFS adds to file locking. There is no easy fix for this issue (i.e., it would require architectural changes such as data shards).

Edits have been added to two pages on www.mdsplus.org recommending that tree files be kept in local file systems when using MDSplus for data acquisition and also when building / testing MDSplus from sources.
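
A build or test script could also check up front whether a directory lives on NFS. The following is a Linux-only sketch using statfs(); it is not part of MDSplus, and the helper name is_on_nfs() is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/vfs.h>     /* statfs() */
#include <linux/magic.h> /* NFS_SUPER_MAGIC */

#ifndef NFS_SUPER_MAGIC
#define NFS_SUPER_MAGIC 0x6969 /* Linux value, used only if the header lacks it */
#endif

/* Hypothetical helper: returns 1 if path is on an NFS mount, 0 if it is on
 * some other file system, and -1 if statfs() fails (errno is set). */
static int is_on_nfs(const char *path)
{
    struct statfs sfs;

    assert(path != NULL);
    if (statfs(path, &sfs) != 0)
        return -1;
    return sfs.f_type == NFS_SUPER_MAGIC ? 1 : 0;
}
```

For example, a test runner could call is_on_nfs() on the directory that will hold the tree files and print a warning when it returns 1.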
