Giter Club home page Giter Club logo

Comments (11)

akalin avatar akalin commented on July 19, 2024

Thanks for the bug fix and interest! I'd be happy to give pointers and review accept pull requests, although I'm unlikely to have the time to do the coding myself. :)

It's not entirely clear to me the situations in which the various return values are returned in par2cmdline, but you may be more familiar with its behavior. For example, if repairable damage is found and it's fixed, is it SUCCESS or REPAIRABLE? Maybe REPAIRABLE and IRREPAIRABLE are only returned in validate modes, whereas SUCCESS or FAIL is used when repair is attempted?

In any case, as you said, the important part is distinguishing repairable vs. irrepairable damage, and that should be pretty easy to implement.

func (par2LogDecoderDelegate) OnDetectCorruptDataChunk(fileID [16]byte, filename string, startByteOffset, endByteOffset int) {
is a delegate method that's called when any damage is detected, so you can probably pass a bool pointer to the delegate and set it to true when https://github.com/akalin/gopar/blob/93c63e81c01b1e19320146c9fdbd6dd44adbab16/cmd/par/main.go#L157 is called.

Also

if len(input) < c.dataShards {
checks whether the damage is repairable or not and returns an error if so -- you can then define a new error type for irrepairable error, and then check for it in the main function.

That should be enough to get you the important errors codes, I think. Let me know if you have any more questions!

from gopar.

brenthuisman avatar brenthuisman commented on July 19, 2024

Thanks for the reply! I've done some more spelunking, Go and in particular the interface and delegation pattern you use a lot are kinda new for me. You've got afwul code incoming ;)

par2cmdline not being entirely clear is a good description of the problem I have with it :) REPAIRABLE and IRREPAIRABLE are returned on a verify. I see that in gopar a repair needs to be attempted in order to learn of the damage is REPAIRABLE. This check now is in the repair routine (coder):

if len(input) < c.dataShards {

I've added a TestReconstructData that exits at:

func (par2LogDecoderDelegate) OnDetectCorruptDataChunk(fileID [16]byte, filename string, startByteOffset, endByteOffset int) {

to rsec16/coder.go. I'm calling it after this block (I need coder, dataShards):

However, the function already exits at:

for _, info := range d.fileIntegrityInfos {

Moving the call to TestReconstructData above fileIntegrityInfos seems the wrong order though. What do you think?

from gopar.

akalin avatar akalin commented on July 19, 2024

OK, I refreshed my memory on this code (i wish past-me commented it a bit more!) and I think you're on the right track. We do need a TestReconstructData function on rsec16.Coder, although I'd have to think a bit on how best to make par2.Decoder.Verify use that function. Also, do you care about the PAR1 side at all? If so, I delegate that to https://github.com/klauspost/reedsolomon so a similar function would have to be added to it to detect repairable vs. non-repairable.

So here's what I suggest the best order to do things is:

  1. One PR to add a function CanReconstructData() (I slightly prefer this name) to rsec16.Coder, which returns something (e.g., an enum) that distinguishes between: 'nothing to do', 'repairable errors', 'non-repairable errors'. (I don't think it has to call any delegate functions.)
  2. (Optional, if you care about PAR1) A similar PR for klauspost/reedsolomon that adds a similar function.
  3. One PR to make par2.Decoder.Verify and maybe par1.Decoder.Verify use CanReconstructData() -- this is still a bit fuzzy in my head, but I think once we finish steps 1 and 2 I'll have a better idea of what to do here (and I'll have an answer to your question about the order of things).

Does that make sense?

from gopar.

brenthuisman avatar brenthuisman commented on July 19, 2024

I don't care about par1 so I was hoping to leave that aside for now. I have a CanReconstructData that distinguishes repairable from irrepairable and I added an int with the appropriate return value. I've added this to par1's methods too, to satisfy the expected function signature, but I would rather not as I won't be able to test it.

Should I add to CanReconstructData the OnDetectCorruptDataChunk test? Or would that be double work (it is already tested somewhere) and is it better to have it also set an enum appropriately? What would be a good place to keep track of these enums? CanReconstructData could store it in a Coder, and OnDetectCorruptDataChunk on an decoderInputFileInfo? Or both on decoderInputFileInfo? Or on the Decoder struct?

from gopar.

akalin avatar akalin commented on July 19, 2024

Maybe I don't quite understand your question, but I don't think you have to mess with OnDetectCorruptDataChunk after all -- those delegate functions are intended mainly for logging/informational purposes. I think the enum would go around rsec16.Coder for now, and it would just be a return value for CanReconstructData, it doesn't have to be stored anywhere (I'd imagine par2.Decoder.Verify would examine the return value for CanReconstructData and return it / continue with other verification tasks).

Do you have a PR you're working on currently? Feel free to send it my way even if it's incomplete, it might be easier to make comments inline.

from gopar.

brenthuisman avatar brenthuisman commented on July 19, 2024

The result of a verification in par2cmdline et al. involves some of that logging: return values of 0,1,2 and 4 are needed. Basically: is the file (and/or parity) corrupt yes/no, if not, is it repairable yes/no? That's a minimum of four states we need to distinguish (OK, REPAIRABLE, IRREPAIRABLE); the fourth is any other error, which I don't want to silently roll up into irrepairable corruption.

The repairable bool/enum can be made member of coder, but what about catching all other errors?

if len(input) < c.dataShards {

Are we sure that if the above test fails, we're in the IRREPAIRABLE state or can any other error be the source of not passing?

from gopar.

brenthuisman avatar brenthuisman commented on July 19, 2024

Maybe some help with understanding Verify() will help me with this. I don't know much about par2 either in fact ;)

It seems to me that Verify() should return the above described enum. There is a succession of tests in it, which I don't all understand. Are they independent (can I reorder them)? And could you tell me what those tests mean in terms of the enum state?

My temp branch: https://github.com/brenthuisman/gopar/commits/retvals

from gopar.

akalin avatar akalin commented on July 19, 2024

The result of a verification in par2cmdline et al. involves some of that logging: return values of 0,1,2 and 4 are needed. Basically: is the file (and/or parity) corrupt yes/no, if not, is it repairable yes/no? That's a minimum of four states we need to distinguish (OK, REPAIRABLE, IRREPAIRABLE); the fourth is any other error, which I don't want to silently roll up into irrepairable corruption.

The repairable bool/enum can be made member of coder, but what about catching all other errors?

if len(input) < c.dataShards {

Well Verify already returns an err parameter, so we can roll up all other errors to that one.

Are we sure that if the above test fails, we're in the IRREPAIRABLE state or can any other error be the source of not passing?

Well I'd have to look at the par2cmdline logic to see what they're doing -- do you have a pointer? It's conceivable that a problem with the parity chunks could lead to that above state, which might be what the UNUSABLE error is for, but practically I don't know if there's a real difference between the two. The user would have to find more parity chunks regardless...

Maybe some help with understanding Verify() will help me with this. I don't know much about par2 either in fact ;)

It seems to me that Verify() should return the above described enum. There is a succession of tests in it, which I don't all understand. Are they independent (can I reorder them)? And could you tell me what those tests mean in terms of the enum state?

Yes, Verify should probably return the enum. The tests in there were the minimum viable product of what I thought a Verify function should do. If anything, once rsec16.Coder.CanReconstructData returns the enum, that can be the first test to be called by Verify.

from gopar.

brenthuisman avatar brenthuisman commented on July 19, 2024

The enum used in par2cmdline is defined here:
https://github.com/brenthuisman/libpar2/blob/master/src/libpar2.h#L109
The associated .cpp is at https://github.com/brenthuisman/libpar2/blob/master/src/libpar2.cpp, but I think you are interested in Par2Repairer::Process() (a dorepair bool is the difference between verify and repair):
https://github.com/brenthuisman/libpar2/blob/master/src/par2repairer.cpp#L126

Yes, Verify should probably return the enum. The tests in there were the minimum viable product of what I thought a Verify function should do. If anything, once rsec16.Coder.CanReconstructData returns the enum, that can be the first test to be called by Verify.

OK. The first step must be d.newCoderAndShards(). What's the meaning of the err != nil test after this?

from gopar.

akalin avatar akalin commented on July 19, 2024

newCoderAndShards delegates to

func NewCoderPAR2Vandermonde(dataShards, parityShards, numGoroutines int) (Coder, error) {
, which returns an error for some very unlikely conditions -- basically if there are too many data shards or parity shards. You can probably treat that as an 'other' error; in general, you can probably do that with any other error you encounter except for the one you're checking specifically.

from gopar.

brenthuisman avatar brenthuisman commented on July 19, 2024

Closed by #4

from gopar.

Related Issues (8)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.