Giter Club home page Giter Club logo

Comments (16)

deanishe avatar deanishe commented on July 28, 2024

Hi,

What Queue Mode are you using in your Script Filters? Do you have any set to "Terminate previous script"?

AW doesn't (yet) protect the cache/settings writing code from kill signals, so I could imagine Alfred is killing the script after it opens settings.json but before it writes the new JSON to the file. That would be my best guess.

AW should be smarter about overwriting settings.json and other files, writing to a temp file first, and then replacing the destination file only if the write succeeds. File writes also need protecting from kill signals.

I think I'd prefer an error message, e.g. "settings.json is corrupt. The previous good version is saved as settings.json.bak", rather than automatically recovering from the old version. The latter behaviour would hide the fact that an error has occurred (the workflow still works, just incorrectly).

Perhaps the best solution would be for AW to raise a specific exception, e.g. InvalidSettings, which workflow authors could catch and decide what to do (rollback to previous settings, replace with default settings etc.)

What do you think?

Regarding v2: I'm still kinda stuck at how to implement the serialisers. I need to sit down and try a few different things to see what works, but haven't had the time or inclination. Most of the other stuff that needs doing depends on the serialisers. I'm sure I'll have some time to look at it now summer's here and there's not so much work.

from alfred-workflow.

owenwater avatar owenwater commented on July 28, 2024

I didn't think about that Terminate Mode. Everything makes sense now.

If that is the case, writing to a temp file should be a good idea since os.rename() is atomic and moving file under the same directory won't be expensive.

With this change, I'm ok with using exception instead of auto recover. At least it points out which file is broken that current exception can't.

Also, we shall probably do the same thing for the store_data() and cache_data() functions.

from alfred-workflow.

deanishe avatar deanishe commented on July 28, 2024

Renaming is atomic, but the previous version needs to be renamed/deleted as well, so the code will still have to be protected from kill signals.

As you say, the same goes for the cache- and data-writing code.

I think a single context manager class should do it. It'd catch any signals, write to a temp file, and if that write succeeds, replace the previous version. If it caught a signal, it should then raise SystemExit.

from alfred-workflow.

owenwater avatar owenwater commented on July 28, 2024

I don't think the previous version does the matter even it needs to be deleted first. Here is the man page of mac:

The rename() system call guarantees that an instance of new will always exist,
even if the system should crash in the middle of the operation.

As long as there is a valid setting file, I don't think we need to care about any system signal.

from alfred-workflow.

deanishe avatar deanishe commented on July 28, 2024

You're right: on UNIX, os.rename() will overwrite any existing file, so it's fundamentally atomic.

from alfred-workflow.

owenwater avatar owenwater commented on July 28, 2024

Cool. I can find a time to write a patch for it.

from alfred-workflow.

owenwater avatar owenwater commented on July 28, 2024

One more thing. I just noticed that store_data() contains two file write operators. As atomic writing to multiple files is complicated. My suggestion is

  • Real data is written before metadata. If system failed between two writing operators, the serializer will be guessed based on file suffix. If multiple suffixes is founded, the latest one will be used.

OR

  • Put metadata in the first line of the file. Using dumps()/loads() instead of dump()/load(). Also stop using serializer_name as file name suffix.

I prefer the first one because it's 100% backward compatible.

from alfred-workflow.

deanishe avatar deanishe commented on July 28, 2024

I don't think all that complexity is necessary.

The fundamental problem is scripts being killed in the middle of write operations. The solution is, imo, stopping them being killed during write operations.

A context manager that catches signals and reraises them when it exits seems much simpler.

from alfred-workflow.

owenwater avatar owenwater commented on July 28, 2024

I was trying not to catch the signal because:

  • Signal handler is a global thing. So we need to be very careful not to overwrite the existing one. Also, there is no way to guarantee ours won't be overwritten by others.
  • Because handler is global, a global variable will be needed to track the file name.
  • We can't just simply raise SystemExit because it can be caught accidentally by user while SIGTERM won't. os._exit() can avoid the exception but I'm still not quite sure if there is any other side effect.
  • System can still crush in the handler.
  • Even Alfred is now using SIGTERM to kill the process, there are also lots of other ways. Do we want to catch them all?

I'm not saying signal is a bad idea but just don't think it's that simple especially in a helper library.

from alfred-workflow.

deanishe avatar deanishe commented on July 28, 2024

SystemExit can't be caught accidentally because it's a subclass of BaseException, not Exception. Raising it is equivalent to calling sys.exit(). If users are catching BaseException, they're doing it wrong, and it's not our job to save them from themselves.

If a handler has already been registered, cache it and put it back when the context manager exits. If a signal is caught and there was an existing handler, call the handler instead of raising SystemExit.

I don't think other signals matter, nor the fact that the system can crush the handler. It's a workflow library, not an RDBMS.

The only purpose is to protect AW from being killed by Alfred in the middle of something critical, not against whatever else might come up.

The only real issue here is data being corrupted because Alfred killed the workflow.

from alfred-workflow.

owenwater avatar owenwater commented on July 28, 2024

Ok, I'll do rename() for settings.json and cache_data() and catch signal for store_data()

from alfred-workflow.

owenwater avatar owenwater commented on July 28, 2024

Hi @deanishe ,

I'm writing the signal handler and does store_data() support multiple threads?

If so, because signal handler registration can only be done in the main thread, we need a synchronized queue to track all in progress file writing opeartor. Do you have any better idea because it's getting more complicated.

from alfred-workflow.

deanishe avatar deanishe commented on July 28, 2024

Don't worry about any of that stuff. There is no thread safety in AW and almost certainly never will be.

The expectation is that anyone who wants to run code in parallel will use background.run_in_background() to start a separate process, where it can't hold up the main script.

Writing threaded code is a can of worms anyway, and anyone capable of doing it properly can implement their own queues and/or locks and ensure AW is only called from the main thread.

At the very most, a mention in the docs is warranted, but I don't even think that is necessary: in almost every case, if a library's docs says nothing about thread safety, it's not thread safe.

from alfred-workflow.

owenwater avatar owenwater commented on July 28, 2024

Ok, that saves me lots of work. I'll mentioned that in the document. Because what I wanna say is that there is not only thread safe problem without queue, it's just simply doesn't work.

from alfred-workflow.

deanishe avatar deanishe commented on July 28, 2024

Thanks for the great work. Added to v1.12.

I refactored the code a little bit and renamed atomic_multiple_files_writer to uninterruptible, seeing as it doesn't really have anything to do with writing files and is applicable to any critical code that shouldn't be interrupted. It might be useful to users, too.

from alfred-workflow.

owenwater avatar owenwater commented on July 28, 2024

On thing I'm thinking about now is the previous workflow may not die fast enough. Because I'm using Terminate previous script + Immediately after each character typed. So the scenario may be

first character typed -> first WF open tmp file -> first WF write to tmp file -> second character typed -> second WF open tmp file(now tmp file is empty again) -> third character typed -> second WF is killed ->
first WF rename tmp file to settings.json -> first WF is killed. -> third WF try to read settings.json and it's empty.

I'm not sure if it's possible that happened. It may be hard to produce this scenario.

from alfred-workflow.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.