Comments (7)
FYI, this was started initially in rstudio/promises#86 in a related effort; the double-reporting issue there has, I think, been resolved in dev, but I believe the remaining failure is future-specific.
from future.
Thanks for this. Some quick initial comments. I think I understand what's going on: the time-out limit is not reset on the worker after the future expression has finished, which means it also applies to the code that wraps up the results and sends them back to the parent R process.
Depending on exactly when this timeout kicks in, you'll get different types of errors. Your different examples show this and make sense to me. For example, the "unexpected result" error message comes from an internal sanity check that makes sure the proper data structure has been received by the parent R process; in your example it received nothing, indicating that the connection was interrupted. That error is actually of class `UnexpectedFutureResultError`, which is a special instance of class `FutureError`. To distinguish them from regular run-time evaluation errors, the `FutureError` class indicates an error in the "orchestration" of futures. When those happen, they can be quite disruptive, and depending on the parallel backend, you might have to reset the backend, or even restart R.
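Because a `FutureError` carries its own class, it can be handled separately from ordinary evaluation errors. A minimal sketch of that idea (the `multisession` plan and the placeholder expression are just for illustration; `tryCatch()` dispatches on the condition's class vector, so the `FutureError` handler is tried before the generic `error` one):

```r
library(future)
plan(multisession)

f <- future({ Sys.sleep(1); 42L })

res <- tryCatch(value(f),
  FutureError = function(e) {
    ## Orchestration problem (worker died, connection broken, ...);
    ## a real application might reset the backend here.
    message("backend problem: ", conditionMessage(e))
    NA_integer_
  },
  error = function(e) {
    ## Regular run-time error raised by the future expression itself.
    message("evaluation error: ", conditionMessage(e))
    NA_integer_
  }
)
```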
I think that if you add `setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE)` to the end of your future expression, e.g.

```r
thisfut <- future({
  on.exit(setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE))  ## TESTING ONLY!
  setTimeLimit(elapsed = 2, transient = TRUE)
  Sys.sleep(3)
  1L
})
```

then this problem should go away? Is that the case? If so, I'll look into possibly adding a `setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE)` automatically.
The reason I say `TESTING ONLY!` is that I have not yet decided whether it is a good idea to use `on.exit()` in future expressions. It works because of how they are evaluated internally, but I'm not yet ready to promise that behavior; it's good enough for our testing purposes here.
BTW, `Sys.sleep()` is very special when it comes to timeouts, and its behavior varies with the OS, cf. HenrikBengtsson/Wishlist-for-R#47.
---
Thank you for the prompt and detailed response!

`Sys.sleep()` is just a placeholder for longer-running expressions that I need to be bounded, ergo `setTimeLimit()`. Note taken. (And I learned something, thanks!)

Especially with the explicit `transient = TRUE`, I'm surprised that the `on.exit(..)` changes any behavior, much less fixes this problem. My interpretation of `transient = TRUE` is that as soon as the current computation is complete (i.e., everything within the `future({...})`), it should automatically return to the default `Inf` limits. Do you think this could be a bug in `setTimeLimit()`, or am I misinterpreting it?

Edit: on re-reading your comment, the `transient=` appears to be insufficient due to the extra code you're running for futures management, I get it, so some of your code is also not being allowed to continue operation. Would it change any behavior to run `setTimeLimit(.)` and my real code within a `local(.)` or other further-constricting environment?

I see that this (`on.exit(..)`) trick is what you do in `R.utils::withTimeout()`, which suggests to me that it's just a "feature" (perhaps not quite a "known deficiency") of `setTimeLimit()`.

I understand that you are concerned with imposing this as default behavior. For now I'm going to have to use it in production (so that failed futures don't leak workers, a problem I've been dealing with). Do you think the use of `reg.finalizer()` is any more or less risky than `on.exit()`?
---
> I understand that you are concerned with imposing this as default behavior. For now I'm going to have to use it in production (so that failed futures don't leak workers, a problem I've been dealing with). Do you think the use of `reg.finalizer()` is any more or less risky than `on.exit()`?
I'm only hesitant because I haven't thought deeply about what it would entail to allow `on.exit()` within a future expression. Quite simplified, you can think of the current implementation for evaluating futures as looking something like:

```r
setup_future_code()
result <- tryCatch({
  local(<future expression here>)
}, error = function(ex) {
  ...
})
teardown_future_code()
send_results_back(result)
```
Since it's evaluated within a `local()` already, `on.exit()` works already now. I doubt that will change, but I do not want to make a full commitment to it right now. An advantage of recommending/requiring an explicit `local()` is that you need that in vanilla sequential code too, e.g.

```r
res <- local({
  on.exit(...)
  42L
})
```

You cannot just do:

```r
res <- {
  on.exit(...)
  42L
}
```

So, for `future()` to be analogous to regular evaluation, requiring `local()` makes more sense, especially so when replacing `<-` with future's `%<-%` parallel assignment operator.
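For illustration, a minimal sketch of that pattern with the `%<-%` operator (the backend choice and the 2-second limit are just placeholders):

```r
library(future)
plan(multisession)

## The explicit local() gives on.exit() a well-defined frame to hook
## into, so the cleanup works with or without futures.
res %<-% local({
  on.exit(setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE))
  setTimeLimit(elapsed = 2, transient = TRUE)
  42L
})
res
```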
> Would it change any behavior to run `setTimeLimit(.)` and my real code within a `local(.)` or other further-constricting environment?
Yes, using an explicit `local()` would guarantee that `on.exit()` will always work, with or without futures. So,

```r
thisfut <- future(local({
  on.exit(setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE))
  setTimeLimit(elapsed = 2, transient = TRUE)
  Sys.sleep(3)
  1L
}))
```

would be the most future-proof solution (pun intended), and it explicitly communicates the scope and intent.

That said, I will consider adding something like `on.exit(setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE))` internally, to protect against hidden timeouts, which might be a left-over from some deep-down dependency. Doing so would be in line with recent attempts to undo R options and environment variables set within a future, so they do not spill over to future (pun 2) futures. To clarify, with the current behavior, you can create a long-term timeout in one future, which will remain in place if run on a persistent worker (e.g. multisession), and might trigger a timeout, say, one hour from now.
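To make that spill-over concrete, a hypothetical sketch of how a non-transient limit set in one future could linger on a persistent worker (the single-worker plan is chosen deliberately so both futures land on the same process; the timings are placeholders):

```r
library(future)
plan(multisession, workers = 1)

## First future leaves a non-transient limit behind on the worker.
f1 <- future({
  setTimeLimit(elapsed = 0.2, transient = FALSE)
  1L
})
value(f1)

## A later, unrelated future on the same worker may now be killed by
## the stale limit, even though it never called setTimeLimit() itself.
f2 <- future({
  Sys.sleep(1)
  2L
})
try(value(f2))  ## may fail with a time-limit or orchestration error
```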
---
Actually, I'm having a difficult time meeting my expectation of `setTimeLimit()` even outside of future.

```r
print({
  setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE)
  e <- environment()
  reg.finalizer(e,
                function(ign) { cat('fin1\n'); setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE) },
                onexit = TRUE)
  on.exit({ cat('fin2\n'); setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE) })
  setTimeLimit(cpu = 1, elapsed = 1, transient = TRUE)
  system("sleep 3")
  # Sys.sleep(3)
  message("hey")
  setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE)
  1L
})
```
Also done with `print(local({...}))`. Also done using `Sys.sleep()` instead of `system()`. Clearly overkill.

Some of the time I get `Error: reached elapsed time limit`, but usually it happily messages `"hey"` and returns `1L`. I can't seem to find a way to get any expression to reliably fail on a time limit, either in future or not ... off-topic for this issue, but I think I'd like to shore up this issue so that it behaves as expected both inside and outside of a future. Do you have any insight?

My expectation is that this should always: fail, show the error message, not display `"hey"`, and not return `1L`. Clearly I'm confused about something :-/
---
> Do you have any insight?
I only skimmed your comment here, but my first instinct is that `system()` might also not be interruptible, or at least have a delayed interruption. Try with `for (kk in 1:300) Sys.sleep(0.01)` instead and see if that makes a difference.

Also, it doesn't look like that finalizer makes a difference in this example, since it'll only kick in if you delete `e`, and then it's unpredictable when the GC will clean it up.
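A minimal, standalone version of that suggestion (timings are placeholders; exact behavior can vary by OS, as noted above):

```r
## One long blocking call like system("sleep 3") may never be checked
## against the time limit; many short Sys.sleep() calls give R a chance
## to notice the limit between iterations.
setTimeLimit(elapsed = 1, transient = TRUE)
res <- tryCatch({
  for (kk in 1:300) Sys.sleep(0.01)
  "finished"
}, error = function(e) conditionMessage(e))
setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE)  ## reset, just in case
print(res)  ## often "reached elapsed time limit", rather than "finished"
```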
---
Okay, I think it's finally sinking in: neither `Sys.sleep()` nor `system()` in this usage is a good test for `setTimeLimit()`, locally or in a future. Replacing `system("sleep 3")` and `Sys.sleep(3)` with something like `(function() replicate(300, Sys.sleep(0.1)))()` (I was too lazy to define it as a named function) seems to remedy the symptoms. Incidentally, so does `processx::run("sleep", "3")` (I really don't like `system()`/`system2()` :-/)
For `reg.finalizer()`, I had assumed that once we extracted the value from a multicore worker, the forked process, and therefore the environment `e`, would be harvested, so it should have been finalized. Perhaps, as with many things GC-related, it just isn't happening as quickly as I assumed.
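As a side note on that timing: a finalizer registered with `reg.finalizer()` runs only when the garbage collector actually reclaims the object, which can be arbitrarily late. A minimal sketch (the message text is just for illustration):

```r
e <- new.env()
reg.finalizer(e, function(x) cat("finalizer ran\n"))

## Dropping the last reference does not run the finalizer immediately ...
rm(e)

## ... it runs only once the GC collects the environment (or at session
## exit, if onexit = TRUE was used when registering it).
invisible(gc())  ## usually triggers the finalizer here
```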
I should have listened more closely to your first cautions: `on.exit(..)` does not seem necessary when using either the looped `Sys.sleep()` or the `processx` variant.

Thank you for the dialogue!