Comments (20)

adamjstewart commented on May 23, 2024

Your screenshot doesn't contain the full stack trace, and I also can't copy-n-paste error messages from screenshots...

from torchgeo.

calebrob6 commented on May 23, 2024

One bug here is that if I do:
dataset = CDL(paths="data/", years=[2017], download=True)

and the data/ directory is empty, then the 2017 layer is downloaded as expected. However, if I then do:

dataset = CDL(paths="data/", years=[2023], download=True)

the second download of the 2023 layer does not happen.

Edit: It seems @yichiac and I discovered this at the same time 🙂

In ._verify(self), the following code should take into account the layers currently requested:

pathname = os.path.join(
    self.paths, self.zipfile_glob.replace("*", str(year))
)

tchaton commented on May 23, 2024

Hey,

I can reproduce the same issue in a Studio on Lightning.Ai. The hanging seems to be coming from torchvision:

Here is a minimal repro.

import urllib
import urllib.error
import urllib.request

USER_AGENT = "pytorch/vision"

def _get_redirect_url(url: str, max_hops: int = 3) -> str:
    initial_url = url
    headers = {"Method": "HEAD", "User-Agent": USER_AGENT}

    for _ in range(max_hops + 1):
        with urllib.request.urlopen(urllib.request.Request(url, headers=headers)) as response:
            if response.url == url or response.url is None:
                return url

            url = response.url
    else:
        raise RecursionError(
            f"Request to {initial_url} exceeded {max_hops} redirects. The last redirect points to {url}."
        )


url = "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2022_30m_cdls.zip"
resolved_url = _get_redirect_url(url)  # the call that appears to hang
print(resolved_url)
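
A small aside on the repro, as a sketch rather than a diagnosis: in `urllib`, passing `"Method": "HEAD"` inside `headers` only adds a header called `Method`, so the request is still sent as a GET. The snippet below shows the difference and adds a hypothetical `probe` helper that passes a `timeout` to `urlopen`, so a stalled connection should fail instead of hanging indefinitely. The `probe` name and the 10-second default are assumptions for illustration; nothing here explains why the server stalls.

```python
import urllib.request

USER_AGENT = "pytorch/vision"
url = "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2022_30m_cdls.zip"

# As in the repro: "Method" is only an extra header, so urllib still issues a GET.
header_request = urllib.request.Request(
    url, headers={"Method": "HEAD", "User-Agent": USER_AGENT}
)
print(header_request.get_method())  # GET

# A real HEAD request has to be requested through the `method` argument.
head_request = urllib.request.Request(
    url, headers={"User-Agent": USER_AGENT}, method="HEAD"
)
print(head_request.get_method())  # HEAD


def probe(request, timeout=10.0):
    """Hypothetical helper: open `request`, giving up after `timeout` seconds."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.url
```

Calling `probe(header_request)` and `probe(head_request)` from the affected machine, with and without the `User-Agent` header, might help narrow down which part of the request the server reacts to.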

robmarkcole commented on May 23, 2024

Nothing wrong with the connection; I can download the file manually.

adamjstewart commented on May 23, 2024

I'm unable to reproduce this issue. I tried both 2017 and 2022 and both downloaded fine on my system. What version of torchvision are you using? Can you try upgrading to the newest version?

robmarkcole commented on May 23, 2024

I have torchvision==0.17.1+cu121

I upgraded, and the cell now executes immediately, but no data is downloaded (2022).

adamjstewart commented on May 23, 2024

Is it possible that you already have some CDL data somewhere in that folder recursively?

robmarkcole commented on May 23, 2024

Don't see anything:

⚡ ~ find data -type f 
data/2017_30m_cdls.aux
data/2017_30m_cdls.tfw
data/Metadata_Cropland-Data-Layer.htm
data/2017_30m_cdls.zip
data/2017_30m_cdls.tif
data/2017_30m_cdls.tif.ovr

Also, even the manually downloaded dataset doesn't look correct, shouldn't this work?:
image

robmarkcole commented on May 23, 2024


'0.6.0.dev0'
CDL Dataset
    type: GeoDataset
    bbox: BoundingBox(minx=-127.88721217969017, maxx=-65.34561975376272, miny=22.94022503977174, maxy=51.60512156832182, mint=1483228800.0, maxt=1514764799.999999)
    size: 1
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[8], line 4
      1 sampler = RandomGeoSampler(dataset, size=224, length=3)
      2 dataloader = DataLoader(dataset, sampler=sampler, collate_fn=stack_samples)
----> 4 for batch in dataloader:
      5     sample = unbind_samples(batch)[0]
      6     dataset.plot(sample)

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/torch/utils/data/dataloader.py:631, in _BaseDataLoaderIter.__next__(self)
    628 if self._sampler_iter is None:
    629     # TODO(https://github.com/pytorch/pytorch/issues/76750)
    630     self._reset()  # type: ignore[call-arg]
--> 631 data = self._next_data()
    632 self._num_yielded += 1
    633 if self._dataset_kind == _DatasetKind.Iterable and \
    634         self._IterableDataset_len_called is not None and \
    635         self._num_yielded > self._IterableDataset_len_called:

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/torch/utils/data/dataloader.py:674, in _SingleProcessDataLoaderIter._next_data(self)
    673 def _next_data(self):
--> 674     index = self._next_index()  # may raise StopIteration
    675     data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
    676     if self._pin_memory:

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/torch/utils/data/dataloader.py:621, in _BaseDataLoaderIter._next_index(self)
    620 def _next_index(self):
--> 621     return next(self._sampler_iter)

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/torch/utils/data/sampler.py:287, in BatchSampler.__iter__(self)
    285 batch = [0] * self.batch_size
    286 idx_in_batch = 0
--> 287 for idx in self.sampler:
    288     batch[idx_in_batch] = idx
    289     idx_in_batch += 1

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/torchgeo/samplers/single.py:140, in RandomGeoSampler.__iter__(self)
    133 """Return the index of a dataset.
    134 
    135 Returns:
    136     (minx, maxx, miny, maxy, mint, maxt) coordinates to index a dataset
    137 """
    138 for _ in range(len(self)):
    139     # Choose a random tile, weighted by area
--> 140     idx = torch.multinomial(self.areas, 1)
    141     hit = self.hits[idx]
    142     bounds = BoundingBox(*hit.bounds)

RuntimeError: cannot sample n_sample > prob_dist.size(-1) samples without replacement

adamjstewart commented on May 23, 2024

Never seen this error before, interesting...

We still need to figure out how to reproduce this. Are you able to reproduce this in Google Colab or some other shared computing resource I can access? That will make it easier to debug.

robmarkcole commented on May 23, 2024

If you create an account on https://lightning.ai/ I can grant you access!

calebrob6 commented on May 23, 2024

I can't reproduce this locally with main branch

calebrob6 commented on May 23, 2024

One thing I am noticing is that the bounds shown in the output of your print(dataset) seem to be in lat/lon while mine are not:

bbox: BoundingBox(minx=-2356095.0, maxx=2258235.0, miny=276915.0, maxy=3172605.0, mint=1483228800.0, maxt=1514764799.999999)

Is there anything else in the data/ directory?

yichiac commented on May 23, 2024

I cannot reproduce the issue either; the dataset downloads immediately. However, I did find that other years can't be downloaded once some years have already been downloaded. For example:

from torchgeo.datasets import CDL
dataset = CDL(years=[2022], download=True)

This downloads the corresponding year without issues. But if I restart the terminal and run

from torchgeo.datasets import CDL
dataset = CDL(years=[2023], download=True)

it won't download anything. It seems that the download function only works the first time, when the data directory doesn't yet contain any downloaded CDL files. The issue is not tied to particular years; I tried different combinations of years.

robmarkcole commented on May 23, 2024

I can confirm (for my own sanity) that I only see this bug on lightning.ai; I will ask them.

adamjstewart commented on May 23, 2024

The problem is actually higher up:

# Check if the extracted files already exist
if self.files:
    return

If any CDL files are found, the method exits, even if the specific years you requested aren't there. This broke in #1442. The fix would be to check for the specific years requested. However, this is difficult if you can't know whether paths is a directory or a list of files. Anyone want to take a stab at fixing this?
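
One possible shape for that per-year check, as a minimal sketch only: it assumes `paths` is a single directory and that extracted layers are named `<year>_30m_cdls.tif`, as in the `find` listing earlier in the thread. The `missing_years` helper is hypothetical and not part of torchgeo, and it does not handle the harder case where `paths` is a list of files.

```python
import os
import tempfile


def missing_years(root, years, suffix="_30m_cdls.tif"):
    """Return the requested years that have no extracted GeoTIFF in `root`.

    Hypothetical helper: assumes `root` is one directory and that extracted
    layers are named "<year>_30m_cdls.tif".
    """
    return [
        year
        for year in years
        if not os.path.isfile(os.path.join(root, f"{year}{suffix}"))
    ]


# Simulate a data directory that already holds only the 2017 layer.
with tempfile.TemporaryDirectory() as root:
    open(os.path.join(root, "2017_30m_cdls.tif"), "w").close()
    print(missing_years(root, [2017, 2023]))  # [2023]
```

With something along these lines, the early return could fire only when the list of missing years is empty, and the download loop could iterate over the missing years instead of all requested years.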

calebrob6 commented on May 23, 2024

> The problem is actually higher up:

Yes, I just discovered this as well.

robmarkcole commented on May 23, 2024

I found that if I run the command in a terminal (rather than Jupyter), I get a warning. I pointed it to a fresh directory (data2):

>>> from torchgeo.datasets import CDL
>>> dataset = CDL(paths='/teamspace/studios/this_studio/data2/', years=[2010], download=True, checksum=False, crs="EPSG:4326") 
/home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/torchgeo/datasets/geo.py:313: UserWarning: Could not find any relevant files for provided path '/teamspace/studios/this_studio/data2/'. Path was ignored.
  warnings.warn(

It appears to be ignoring the path and hanging. If I interrupt and rerun the command, I do not get the warning.
On keyboard interrupt, I get the following:

/home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/torchgeo/datasets/geo.py:313: UserWarning: Could not find any relevant files for provided path 'data'. Path was ignored.
  warnings.warn(
---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
Cell In[2], line 2
      1 # dataset = CDL(years=[2017], download=False, checksum=False, crs="EPSG:4326") # manually downloaded
----> 2 dataset = CDL(years=[2020], download=True, checksum=False, crs="EPSG:4326") # 

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/torchgeo/datasets/cdl.py:263, in CDL.__init__(self, paths, crs, res, years, classes, transforms, cache, download, checksum)
    260 self.ordinal_map = torch.zeros(max(self.cmap.keys()) + 1, dtype=self.dtype)
    261 self.ordinal_cmap = torch.zeros((len(self.classes), 4), dtype=torch.uint8)
--> 263 self._verify()
    265 super().__init__(paths, crs, res, transforms=transforms, cache=cache)
    267 # Map chosen classes to ordinal numbers, all others mapped to background class

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/torchgeo/datasets/cdl.py:315, in CDL._verify(self)
    312     raise DatasetNotFoundError(self)
    314 # Download the dataset
--> 315 self._download()
    316 self._extract()

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/torchgeo/datasets/cdl.py:321, in CDL._download(self)
    319 """Download the dataset."""
    320 for year in self.years:
--> 321     download_url(
    322         self.url.format(year),
    323         self.paths,
    324         md5=self.md5s[year] if self.checksum else None,
    325     )

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/torchvision/datasets/utils.py:130, in download_url(url, root, filename, md5, max_redirect_hops)
    127     _download_file_from_remote_location(fpath, url)
    128 else:
    129     # expand redirect chain if needed
--> 130     url = _get_redirect_url(url, max_hops=max_redirect_hops)
    132     # check if file is located on Google Drive
    133     file_id = _get_google_drive_file_id(url)

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/torchvision/datasets/utils.py:78, in _get_redirect_url(url, max_hops)
     75 headers = {"Method": "HEAD", "User-Agent": USER_AGENT}
     77 for _ in range(max_hops + 1):
---> 78     with urllib.request.urlopen(urllib.request.Request(url, headers=headers)) as response:
     79         if response.url == url or response.url is None:
     80             return url

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/urllib/request.py:216, in urlopen(url, data, timeout, cafile, capath, cadefault, context)
    214 else:
    215     opener = _opener
--> 216 return opener.open(url, data, timeout)

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/urllib/request.py:519, in OpenerDirector.open(self, fullurl, data, timeout)
    516     req = meth(req)
    518 sys.audit('urllib.Request', req.full_url, req.data, req.headers, req.get_method())
--> 519 response = self._open(req, data)
    521 # post-process response
    522 meth_name = protocol+"_response"

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/urllib/request.py:536, in OpenerDirector._open(self, req, data)
    533     return result
    535 protocol = req.type
--> 536 result = self._call_chain(self.handle_open, protocol, protocol +
    537                           '_open', req)
    538 if result:
    539     return result

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/urllib/request.py:496, in OpenerDirector._call_chain(self, chain, kind, meth_name, *args)
    494 for handler in handlers:
    495     func = getattr(handler, meth_name)
--> 496     result = func(*args)
    497     if result is not None:
    498         return result

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/urllib/request.py:1391, in HTTPSHandler.https_open(self, req)
   1390 def https_open(self, req):
-> 1391     return self.do_open(http.client.HTTPSConnection, req,
   1392         context=self._context, check_hostname=self._check_hostname)

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/urllib/request.py:1352, in AbstractHTTPHandler.do_open(self, http_class, req, **http_conn_args)
   1350     except OSError as err: # timeout error
   1351         raise URLError(err)
-> 1352     r = h.getresponse()
   1353 except:
   1354     h.close()

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/http/client.py:1374, in HTTPConnection.getresponse(self)
   1372 try:
   1373     try:
-> 1374         response.begin()
   1375     except ConnectionError:
   1376         self.close()

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/http/client.py:318, in HTTPResponse.begin(self)
    316 # read until we get a non-100 response
    317 while True:
--> 318     version, status, reason = self._read_status()
    319     if status != CONTINUE:
    320         break

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/http/client.py:279, in HTTPResponse._read_status(self)
    278 def _read_status(self):
--> 279     line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
    280     if len(line) > _MAXLINE:
    281         raise LineTooLong("status line")

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/socket.py:705, in SocketIO.readinto(self, b)
    703 while True:
    704     try:
--> 705         return self._sock.recv_into(b)
    706     except timeout:
    707         self._timeout_occurred = True

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/ssl.py:1274, in SSLSocket.recv_into(self, buffer, nbytes, flags)
   1270     if flags != 0:
   1271         raise ValueError(
   1272           "non-zero flags not allowed in calls to recv_into() on %s" %
   1273           self.__class__)
-> 1274     return self.read(nbytes, buffer)
   1275 else:
   1276     return super().recv_into(buffer, nbytes, flags)

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/ssl.py:1130, in SSLSocket.read(self, len, buffer)
   1128 try:
   1129     if buffer is not None:
-> 1130         return self._sslobj.read(len, buffer)
   1131     else:
   1132         return self._sslobj.read(len)

KeyboardInterrupt: 

tchaton commented on May 23, 2024

Interestingly enough, it works if I remove the "User-Agent": USER_AGENT from the headers.

robmarkcole commented on May 23, 2024

A temporary workaround on lightning.ai, thanks to @tchaton:

from torchgeo.datasets import CDL

# Patch urllib.request.Request to drop the User-Agent header until we figure out why it hangs
from torchvision.datasets.utils import urllib

original_request = urllib.request.Request

def Request(*args, headers, **kwargs):
    # Work on a copy so the caller's headers dict is not mutated
    headers = dict(headers)
    headers.pop("User-Agent", None)
    return original_request(*args, headers=headers, **kwargs)

urllib.request.Request = Request

dataset = CDL(years=[2022], download=True, paths="./data")

print(dataset)

However, when I go to plot a sample, I get this error:

RuntimeError: cannot sample n_sample > prob_dist.size(-1) samples without replacement

I suspect this error comes from setting a CRS that differs from the dataset's native CRS, since there is no error when I don't do this.
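
As a variation on the User-Agent workaround earlier in this comment, the patch could be scoped to a `with` block so that `urllib.request.Request` is restored afterwards. This is only a sketch: `without_user_agent` is a hypothetical name, and it relies on `torchvision.datasets.utils` using the standard `urllib.request` module, which the import in the workaround suggests. As far as I know, `urllib` then falls back to its own default `Python-urllib` user agent.

```python
import contextlib
import urllib.request
from unittest import mock


@contextlib.contextmanager
def without_user_agent():
    """Temporarily make urllib.request.Request drop any User-Agent header."""
    original_request = urllib.request.Request

    def request_without_user_agent(url, data=None, headers=None, *args, **kwargs):
        # Copy the headers, leaving out User-Agent (compared case-insensitively).
        cleaned = {
            key: value
            for key, value in (headers or {}).items()
            if key.lower() != "user-agent"
        }
        return original_request(url, data, cleaned, *args, **kwargs)

    # mock.patch.object restores the original class when the block exits.
    with mock.patch.object(urllib.request, "Request", request_without_user_agent):
        yield


# Intended usage (not run here, since it needs torchgeo and network access):
# with without_user_agent():
#     dataset = CDL(years=[2022], download=True, paths="./data")
```

Scoping the patch this way keeps other code in the same session unaffected once the download has finished.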
