Comments (5)

coroa commented on May 23, 2024

A direct consequence of making the cutout preparation work for sarah was that I had to give up cleaning up the tmp files automatically once the file handle is released (the weakref.finalize(..., noisy_unlink, ...) bit). dask is often configured to use multiple processes, and then whether a file handle is released in one process is not a good indicator of whether we still need the tmp file. Thus, I was forced to switch to a scheme that creates all files in a temporary directory and cleans up the directory when the cutout is ready.
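A minimal sketch of that directory-based scheme, using only the standard library. It is not the atlite implementation: `build_with_tmpdir` and `write_parts` are hypothetical names, and the two small files stand in for the downloaded data.

```python
import os
import shutil
import tempfile


def build_with_tmpdir(write_parts, tmpdir=None):
    """Create intermediate files in one directory and remove it at the end.

    `write_parts` is a hypothetical callable that writes files into the
    directory it is given and returns a result.
    """
    # A caller-supplied directory is kept; only a directory created here is removed.
    keep_tmpdir = tmpdir is not None
    if not keep_tmpdir:
        tmpdir = tempfile.mkdtemp()
    try:
        return write_parts(tmpdir)
    finally:
        if not keep_tmpdir:
            shutil.rmtree(tmpdir)


def write_parts(directory):
    # Stand-in for the per-feature downloads: two small files.
    paths = []
    for name in ("a.nc", "b.nc"):
        path = os.path.join(directory, name)
        with open(path, "w") as handle:
            handle.write("data")
        paths.append(path)
    return paths


paths = build_with_tmpdir(write_parts)
```

The cleanup no longer depends on when any single process releases a file handle; it only depends on the whole build step having finished.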

What probably happens for you now is that some exception is raised during the preparation of the cutout. During the error handling I try to clean up the temporary files, but the file handles have not been released and Windows prohibits deleting the files. The easiest way to debug this now is to supply a tmpdir="<some_empty_dir_that_you_created>" argument to cutout.prepare, so that keep_tmpdir is set and the OSError does not overshadow the real exception underneath. Then we can find out who holds onto its file handles in this situation.

Ideally, all file handles should be released before a scope is exited with an exception (i.e. by using with or a finally clause). As another mitigation, it would probably be good to wrap the rmtree in a try-except clause and turn the OSError into a log message, so it does not hide the true exception.
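A rough sketch of the second mitigation, assuming a plain `logging` logger. `remove_tmpdir_quietly` and `prepare_like` are hypothetical helpers for illustration, not atlite functions.

```python
import logging
import shutil

logger = logging.getLogger(__name__)


def remove_tmpdir_quietly(tmpdir):
    """Try to remove tmpdir; log instead of raising if the OS refuses."""
    try:
        shutil.rmtree(tmpdir)
        return True
    except OSError as exc:
        # On Windows this is where a still-open file would surface.
        logger.warning("Could not remove temporary directory %s: %s", tmpdir, exc)
        return False


def prepare_like(work, tmpdir):
    """Run `work` (a stand-in for the real preparation) and always clean up."""
    try:
        return work(tmpdir)
    finally:
        # A failed cleanup is only logged, so an exception raised by
        # `work` is the one the caller sees.
        remove_tmpdir_quietly(tmpdir)
```

With this shape, a failing `work` still propagates its own exception, and a locked directory shows up as a warning in the log rather than as the final traceback.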

from atlite.

euronion commented on May 23, 2024

By specifying a tmpdir the operation runs without error:

$ cutout.prepare(tmpdir="./localtmp/")
INFO:atlite.cutout:Cutout uk-2011-01-2 not found in directory ./, building new one
INFO:cdsapi:Sending request to https://cds.climate.copernicus.eu/api/v2/resources/reanalysis-era5-single-levels
INFO:cdsapi:Sending request to https://cds.climate.copernicus.eu/api/v2/resources/reanalysis-era5-single-levels
INFO:cdsapi:Sending request to https://cds.climate.copernicus.eu/api/v2/resources/reanalysis-era5-single-levels
INFO:cdsapi:Sending request to https://cds.climate.copernicus.eu/api/v2/resources/reanalysis-era5-single-levels
INFO:cdsapi:Sending request to https://cds.climate.copernicus.eu/api/v2/resources/reanalysis-era5-single-levels
INFO:cdsapi:Request is completed
INFO:atlite.datasets.common:Downloading request for 1 variables to C:\Users\J. Hampp\Documents\GitHub\atlite\examples\localtmp\tmpybiq8b4h.nc
INFO:cdsapi:Downloading http://136.156.132.198/cache-compute-0003/cache/data5/adaptor.mars.internal-1568637651.461136-31786-5-0d10c442-82a6-4f46-8be4-a73443c1b3ec.nc to C:\Users\J. Hampp\Documents\GitHub\atlite\examples\localtmp\tmpybiq8b4h.nc (3.9M)
INFO:cdsapi:Request is completed
INFO:cdsapi:Request is completed
INFO:cdsapi:Request is completed
INFO:cdsapi:Request is completed
INFO:atlite.datasets.common:Downloading request for 3 variables to C:\Users\J. Hampp\Documents\GitHub\atlite\examples\localtmp\tmplpevg6_5.nc
INFO:atlite.datasets.common:Downloading request for 9 variables to C:\Users\J. Hampp\Documents\GitHub\atlite\examples\localtmp\tmpvoopfrne.nc
INFO:cdsapi:Downloading http://136.156.133.39/cache-compute-0012/cache/data3/adaptor.mars.internal-1568637651.4501736-25378-15-3562dc67-d581-4f22-a699-4d06068c8af5.nc to C:\Users\J. Hampp\Documents\GitHub\atlite\examples\localtmp\tmplpevg6_5.nc (11.6M)
INFO:cdsapi:Downloading http://136.156.132.210/cache-compute-0005/cache/data7/adaptor.mars.internal-1568702524.1244705-7942-5-a3a39405-ad59-4a37-aba5-4ea662c1747b.nc to C:\Users\J. Hampp\Documents\GitHub\atlite\examples\localtmp\tmpvoopfrne.nc (6.8K)
INFO:atlite.datasets.common:Downloading request for 2 variables to C:\Users\J. Hampp\Documents\GitHub\atlite\examples\localtmp\tmpcutwpaeh.nc
INFO:atlite.datasets.common:Downloading request for 4 variables to C:\Users\J. Hampp\Documents\GitHub\atlite\examples\localtmp\tmpaao4up9g.nc
INFO:cdsapi:Downloading http://136.156.132.198/cache-compute-0003/cache/data2/adaptor.mars.internal-1568637672.4669356-30780-7-f10d482b-f931-4644-9c61-209b361f0a76.nc to C:\Users\J. Hampp\Documents\GitHub\atlite\examples\localtmp\tmpcutwpaeh.nc (7.7M)
INFO:cdsapi:Downloading http://136.156.133.37/cache-compute-0011/cache/data8/adaptor.mars.internal-1568637651.4599202-22719-5-02ff42ba-49a2-420a-a401-8d45c3a3c8c4.nc to C:\Users\J. Hampp\Documents\GitHub\atlite\examples\localtmp\tmpaao4up9g.nc (15.5M)
INFO:cdsapi:Download rate 88.7K/s
INFO:cdsapi:Download rate 2.8M/s
INFO:cdsapi:Download rate 4.8M/s
INFO:cdsapi:Download rate 4.2M/s
INFO:cdsapi:Download rate 3.9M/s

[                                        ] | 0% Completed |  0.0s

C:\anaconda\envs\atlite\lib\site-packages\dask\core.py:119: RuntimeWarning: divide by zero encountered in true_divide
  return func(*args2)
C:\anaconda\envs\atlite\lib\site-packages\dask\core.py:119: RuntimeWarning: invalid value encountered in true_divide
  return func(*args2)

[########################################] | 100% Completed |  0.4s

The cutout is properly prepared.
It seems to me some file handles are not released properly before the finally clause.
For now this is an acceptable workaround, but maybe we can pinpoint and resolve the cause.

coroa commented on May 23, 2024

Should be fixed in v0.2. Re-open if the problem persists.

euronion commented on May 23, 2024

I can confirm that this works for now in my limited test case.

euronion commented on May 23, 2024

Cheering a bit too early.

This introduced a minor bug:
If the cutout is already fully created, then

atlite/atlite/data.py

Lines 199 to 202 in 09714d5

    if not missing_features and not overwrite:
        logger.info(f"All available features {cutout.available_features} have already been prepared, so nothing to do."
                    f" Use `overwrite=True` to re-create {cutout.name}.nc and {cutout.name}.sindex.pickle.")
        return

returns, but first the finally of the block

atlite/atlite/data.py

Lines 227 to 236 in 09714d5

    finally:
        # ds is the last reference to the temporary files:
        # - we remove it from this scope, and
        # - fire up the garbage collector,
        # => xarray's file manager closes them and we can remove tmpdir
        del ds
        gc.collect()
        if not keep_tmpdir:
            rmtree(tmpdir)

gets executed.
Problem:
In this case ds is never defined, raising

    230         # - fire up the garbage collector,
    231         # => xarray's file manager closes them and we can remove tmpdir
--> 232         del ds
    233         gc.collect()
    234 

UnboundLocalError: local variable 'ds' referenced before assignment

Maybe wrap the del ds inside its own

try:
    del ds
except NameError:
    # that's ok: ds was never assigned, so there is nothing to release.
    pass
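An alternative that avoids the extra try-except is to bind the name before the try block, so the finally clause can always refer to it. This is only a sketch: `prepare_sketch` is a hypothetical stand-in for the real function, and `object()` stands in for the dataset.

```python
import gc


def prepare_sketch(nothing_to_do):
    ds = None  # bound up front so the finally block can always refer to it
    try:
        if nothing_to_do:
            # Early return: the finally block below still runs.
            return "skipped"
        ds = object()  # stand-in for the dataset holding the file handles
        return "prepared"
    finally:
        # Drop the last reference and collect, mirroring the original cleanup.
        ds = None
        gc.collect()


# UnboundLocalError is a subclass of NameError, so `except NameError`
# also covers the error shown in the traceback.
assert issubclass(UnboundLocalError, NameError)
```

Either variant works; pre-binding the name keeps the cleanup path free of exception handling.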
