
Comments (6)

gheber commented on August 10, 2024

Let me rephrase what you are saying to make sure I understand the problem. You have two independent processes A (HDFView) and B (writer app), which do not employ any kind of IPC (inter-process communication) mechanism to coordinate. Process B is writing (an attribute) to an HDF5 file F and A is reading from it. Correct?

In such a scenario, HDF5's behavior is undefined. Process A might:

  1. See no attribute at all
  2. See an attribute, but with a default or garbage value
  3. See exactly what you expect (congratulations, you've won the lottery ;-)
  4. Throw an exception or crash, because it is reading partially written data

A similar scenario is supported only in HDF5 1.10 with SWMR (Single Writer Multiple Reader), and only for dataset appends. With SWMR, a single writer may append data to datasets in an HDF5 file while multiple readers, which do not coordinate their read operations with the writer or among themselves, are guaranteed to always see a consistent HDF5 file and, eventually, the latest data.
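
For what it's worth, the append-and-refresh cycle described above looks roughly like the following sketch. It uses h5py (the Python bindings) as an illustrative stand-in for HDF.PInvoke; the file and dataset names are made up, and the C-level equivalents are H5Fstart_swmr_write, H5Dflush, and H5Drefresh.

```python
import os
import h5py
import numpy as np

path = "swmr_demo.h5"  # hypothetical file name

# Writer: SWMR requires the HDF5 1.10 file format, hence libver="latest".
# All objects must be created BEFORE switching the file into SWMR mode.
writer = h5py.File(path, "w", libver="latest")
dset = writer.create_dataset("data", shape=(0,), maxshape=(None,), dtype="f8")
writer.swmr_mode = True  # from here on, uncoordinated readers are allowed

# Reader: a separate handle (it could just as well be another process)
# opens the same file in SWMR read mode, without coordinating with the writer.
reader = h5py.File(path, "r", libver="latest", swmr=True)
rdset = reader["data"]

# The writer appends; the reader refreshes to pick up the new extent.
dset.resize((3,))
dset[:] = np.arange(3.0)
dset.flush()     # make the appended rows visible to readers
rdset.refresh()  # re-read the dataset's metadata

result = rdset[:].tolist()

reader.close()
writer.close()
os.remove(path)
```

Note the ordering: the writer creates all datasets first, then enables SWMR; the reader only ever sees appends, never structural changes.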

In summary, the behavior you are seeing is neither expected nor unexpected, it's UNSUPPORTED. If you want predictable outcomes, you need some form of IPC and coordination among those processes.

from hdf.pinvoke.

riegaz commented on August 10, 2024

Thanks for describing this scenario in more detail. I now understand how to deal with HDF5 files when there is supposed to be a parallel readout.

However, in my scenario, let's say, to keep it simple, that I have one process both writing and reading the HDF5 file. So there is only one process communicating with the HDF5 library.

My question now is how to guarantee that none of the information is lost if the application crashes. Of course, if it crashes before anything is written to the file, that data is lost; that's clear.

Scenario:
1. User opens the HDF5 file
2. User fills an input form for 10 minutes --> here I write changes directly to the HDF5 file
3. User imports data into datasets --> here I write changes directly to the HDF5 file
4. User visualizes data
5. User analyzes data
6. User closes the HDF5 file

For example, how do I guarantee that the information stored in step 2 is not lost because the application crashed in step 5 (or did not even crash; say my notebook goes to sleep, the battery dies, or the connection to my file on the network drive is interrupted)? Until now, I thought I could open the file at the beginning and, once I write (an attribute, a dataset, ...), I'm on the safe side.

However, it seems to be more complicated, because if I do not call H5.close(), everything is lost.

What would you propose? It could also be the case that the user jumps from step 5 back to step 2 to change something and then goes directly to step 5 again... I hope I could make myself clear.


epourmal commented on August 10, 2024

Hello,

An HDF5 attribute is stored in the object header, which itself sits in the metadata cache until it is evicted and written to the file. If you flush the dataset (H5D.flush), HDFView should see it (modulo the caveats in Gerd's reply).

Elena


On Mar 4, 2016, at 11:51 AM, riegaz wrote:

I created a variable-length string and stored it in an attribute. If I rewrite the string, HDFView does not see it until H5.close() is called. This is not the case for datasets; why is it the case for attributes?

Probably this is also the case with freshly written attributes, but I have not tested that yet.

Pseudocode:

H5.open()
H5F.open()
H5A.write()
H5F.close()
----------------- HDFView does not see the change
H5.close()
----------------- HDFView does see the change



riegaz commented on August 10, 2024

@epourmal thanks. I did not find H5D.flush(), but I found H5F.flush(), which does exactly what you described. So, for the scenario in my previous comment, what should I do?

  1. Close and reopen the file several times, calling H5.open() and H5.close()
  2. Call H5F.flush() after each write command

Which one is the right way to go?


epourmal commented on August 10, 2024

Hi,


Both will work. H5F.flush is the cheaper call.
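
The flush-after-each-write pattern can be sketched as follows, using h5py as an illustrative stand-in for HDF.PInvoke (where the equivalent call is H5F.flush, wrapping the C API's H5Fflush); the file name and attribute are made up:

```python
import os
import h5py

path = "flush_demo.h5"  # hypothetical file name

f = h5py.File(path, "w")
f.attrs["form_data"] = "user input"  # lands in the metadata cache first
f.flush()  # H5Fflush: push cached metadata and raw data down to the OS

# If the process died at this point, the flushed attribute would already
# be in the file (modulo OS-level write buffering), even though close()
# was never reached.
f.close()

# Reopen read-only to confirm the attribute survived.
with h5py.File(path, "r") as r:
    value = r.attrs["form_data"]
os.remove(path)
```

Note that flushing hands the data to the operating system but does not force it to physical disk, so a power loss at exactly the wrong moment can still lose the last writes.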

Elena Pourmal  The HDF Group  http://hdfgroup.org




riegaz commented on August 10, 2024

@epourmal thanks. That helped me a lot!

