microsoft / planetarycomputerexamples

Examples of using the Planetary Computer

License: MIT License

Jupyter Notebook 99.99% Shell 0.01% Dockerfile 0.01% Python 0.01%

planetarycomputerexamples's Introduction

Planetary Computer Hub

Welcome to the Planetary Computer Hub, a development environment that makes our data and APIs accessible through familiar, open-source tools, and allows users to easily scale their analyses.

If you're viewing this repository from GitHub, you might want to browse the rendered examples on nbviewer, including our quickstarts, dataset examples, and tutorials.

Quickstarts

These quickstarts give high-level introductions to a single topic.

Datasets

These examples introduce specific datasets. They give some details about the datasets and example code for working with them.

Tutorials

These tutorials introduce a large topic and cover it in detail.

Learn More

Data Catalog: https://planetarycomputer.microsoft.com/catalog
Documentation: https://planetarycomputer.microsoft.com/docs/overview/about
Discussions: https://github.com/Microsoft/PlanetaryComputer/discussions

planetarycomputerexamples's Issues

Outdated module in indicators.ipynb

Running the first cell in indicators.ipynb produces this error:

ModuleNotFoundError Traceback (most recent call last)
Cell In[1], line 9
6 import xarray as xr
8 # climate indicators with xclim
----> 9 import xclim.indicators
11 # optional imports used in this notebook
12 from dask.diagnostics import ProgressBar

File /srv/conda/envs/notebook/lib/python3.10/site-packages/xclim/__init__.py:4
1 """Climate indices computation package based on Xarray."""
2 from importlib.resources import contents, path
----> 4 from xclim.core import units # noqa
5 from xclim.core.indicator import build_indicator_module_from_yaml
6 from xclim.core.locales import load_locale

File /srv/conda/envs/notebook/lib/python3.10/site-packages/xclim/core/units.py:15
12 from typing import Any, Callable, Optional, Tuple, Union
14 import pint.converters
---> 15 import pint.unit
16 import xarray as xr
17 from boltons.funcutils import wraps

ModuleNotFoundError: No module named 'pint.unit'
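
The traceback suggests that the installed pint no longer provides the `pint.unit` submodule that this xclim release imports. A minimal diagnostic sketch, using only the standard library, reports the installed versions and checks whether a dotted module path can be located. The helper names `module_available` and `installed_version` are illustrative and not part of xclim or pint.

```python
import importlib.util
from importlib import metadata


def module_available(dotted_name):
    """Return True if `dotted_name` can be located (parent packages get imported)."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (for example "pint") is missing.
        return False


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


for name in ("pint", "xclim"):
    print(name, installed_version(name))
print("pint.unit importable:", module_available("pint.unit"))
```

If `pint.unit` is reported as not importable while both packages are installed, the two versions are probably incompatible, and aligning them (by upgrading xclim or pinning pint) would be the thing to try.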

Landsat TM/MSS Collection 2 - NASA CMR API Authentication

Hi,

I've been having a look at your "Landsat TM/MSS Collection 2" dataset that is currently in preview and have encountered some difficulty in passing authentication details to the NASA CMR API.

Where do I go to generate a token for the NASA CMR API? Can a token be generated for free or are there data partnership constraints?

Thanks,

Ben

Dask scheduler version with custom Docker image

Hello,

I am attempting to run a modified version of the land cover classification tutorial, where I'd like to load my own segmentation_models_pytorch model that was trained on Sentinel-2 imagery. We have some custom functions we need from our own pip package, and we are using the latest segmentation_models_pytorch built from source. Thus, I have built my own Docker image based on the pangeo-docker-images/pangeo-notebook image (tag 2021.05.04), which uses both dask==2021.04.1 and distributed==2021.04.1. I have verified those versions are correct within a container.

I point the cluster to this image using options['image'] = 'drollend/apl-gpu-pytorch-notebook:latest', but I receive the distributed client VersionMismatchWarning:

+-------------+-----------+-----------+-----------+
| Package     | client    | scheduler | workers   |
+-------------+-----------+-----------+-----------+
| dask        | 2021.04.1 | 2021.05.0 | 2021.04.1 |
| distributed | 2021.04.1 | 2021.05.0 | 2021.04.1 |
+-------------+-----------+-----------+-----------+

I am unsure of where the scheduler lives and why it is a different version. Any pointers you can provide on fixing the scheduler version are much appreciated. Thanks in advance and thanks for creating the Planetary Computer!
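
To make the mismatch explicit, the table above can be reduced to the roles whose versions differ from the client's. This is a small sketch over the values copied from the warning; `find_mismatches` is an illustrative helper, not a dask or distributed function. On a live cluster, `Client.get_versions(check=True)` from distributed reports the same kind of information.

```python
# Versions copied from the VersionMismatchWarning table above.
versions = {
    "dask": {"client": "2021.04.1", "scheduler": "2021.05.0", "workers": "2021.04.1"},
    "distributed": {"client": "2021.04.1", "scheduler": "2021.05.0", "workers": "2021.04.1"},
}


def find_mismatches(versions):
    """Map each package to the roles whose version differs from the client's."""
    mismatches = {}
    for package, by_role in versions.items():
        expected = by_role["client"]
        differing = {role: v for role, v in by_role.items() if v != expected}
        if differing:
            mismatches[package] = differing
    return mismatches


mismatches = find_mismatches(versions)
print(mismatches)
```

Only the scheduler differs here, which suggests the custom image is being applied to the workers but not to the scheduler.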

Catalog search doesn't work

Hello all,

I want to load all Sentinel-2 products for January 2019 in an area of interest from the Planetary Computer JupyterLab with:

from pystac_client import Client
import planetary_computer

catalog = Client.open(
    "https://planetarycomputer.microsoft.com/api/stac/v1",
    modifier=planetary_computer.sign_inplace)

x_min = 5.7
x_max = 16
y_min = 47.025
y_max = 50.812

area_of_interest = {
    "type": "Polygon",
    "coordinates": [
        [
            [x_max, y_min],
            [x_min, y_min],
            [x_min, y_max],
            [x_max, y_max],
            [x_max, y_min],
        ]
    ],}

time_of_interest = "2019-01"
search = catalog.search(collections=["sentinel-2-l2a"],
                        intersects=area_of_interest,
                        datetime=time_of_interest,
                        #query={"eo:cloud_cover": {"lt": 100}},
)
items = search.item_collection()

Running this gives me an APIError:

APIError: The request exceeded the maximum allowed time, please try again. If the issue persists, please contact [email protected].

Debug information for support: 0aSG1ZAAAAAC5yLOMbWyXS4Mq4pR59StDQU1TMDRFREdFMTgwNgA5MjdhYmZhNi0xOWY2LTRhZjEtYTA5ZC1jOTU5ZDlhMWU2NDQ=

Does anyone know what went wrong?
Is this a known issue?
Thank you very much for your support!

Best,
Sebastian

Full error:

---------------------------------------------------------------------------
APIError                                  Traceback (most recent call last)
Cell In[6], line 31
     25 time_of_interest = "2019-01"
     26 search = catalog.search(collections=["sentinel-2-l2a"],
     27                         intersects=area_of_interest,
     28                         datetime=time_of_interest,
     29                         #query={"eo:cloud_cover": {"lt": 100}},
     30 )
---> 31 items = search.item_collection()

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pystac_client/item_search.py:756, in ItemSearch.item_collection(self)
    748 """
    749 Get the matching items as a :py:class:`pystac.ItemCollection`.
    750 
    751 Return:
    752     ItemCollection: The item collection
    753 """
    754 # Bypass the cache here, so that we can pass __preserve_dict__
    755 # without mutating what's in the cache.
--> 756 feature_collection = self.item_collection_as_dict.__wrapped__(self)
    757 # already signed in item_collection_as_dict
    758 return ItemCollection.from_dict(
    759     feature_collection, preserve_dict=False, root=self.client
    760 )

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pystac_client/item_search.py:777, in ItemSearch.item_collection_as_dict(self)
    764 """
    765 Get the matching items as an item-collection-like dict.
    766 
   (...)
    774     Dict : A GeoJSON FeatureCollection
    775 """
    776 features = []
--> 777 for page in self.pages_as_dicts():
    778     for feature in page["features"]:
    779         features.append(feature)

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pystac_client/item_search.py:727, in ItemSearch.pages_as_dicts(self)
    725 if isinstance(self._stac_io, StacApiIO):
    726     num_items = 0
--> 727     for page in self._stac_io.get_pages(
    728         self.url, self.method, self.get_parameters()
    729     ):
    730         call_modifier(self.modifier, page)
    731         features = page.get("features", [])

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pystac_client/stac_api_io.py:304, in StacApiIO.get_pages(self, url, method, parameters)
    302 while next_link:
    303     link = Link.from_dict(next_link)
--> 304     page = self.read_json(link, parameters=parameters)
    305     if not (page.get("features") or page.get("collections")):
    306         return None

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pystac/stac_io.py:202, in StacIO.read_json(self, source, *args, **kwargs)
    185 def read_json(self, source: HREF, *args: Any, **kwargs: Any) -> Dict[str, Any]:
    186     """Read a dict from the given source.
    187 
    188     See :func:`StacIO.read_text <pystac.StacIO.read_text>` for usage of
   (...)
    200         given source.
    201     """
--> 202     txt = self.read_text(source, *args, **kwargs)
    203     return self.json_loads(txt)

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pystac_client/stac_api_io.py:161, in StacApiIO.read_text(self, source, *args, **kwargs)
    157     else:
    158         # parameters are already in the link href
    159         parameters = {}
--> 161     return self.request(
    162         href, method=method, headers=headers, parameters=parameters
    163     )
    164 else:  # str or something that can be str'ed
    165     href = str(source)

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pystac_client/stac_api_io.py:216, in StacApiIO.request(self, href, method, headers, parameters)
    214     raise APIError(str(err))
    215 if resp.status_code != 200:
--> 216     raise APIError.from_response(resp)
    217 try:
    218     return resp.content.decode("utf-8")

APIError: The request exceeded the maximum allowed time, please try again. If the issue persists, please contact [email protected].

Debug information for support: 0aSG1ZAAAAAC5yLOMbWyXS4Mq4pR59StDQU1TMDRFREdFMTgwNgA5MjdhYmZhNi0xOWY2LTRhZjEtYTA5ZC1jOTU5ZDlhMWU2NDQ=
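
The search above covers a large area for a full month, so one possible workaround for the timeout is to issue several smaller searches. The sketch below splits the month into week-long `datetime` windows and runs one search per window. Both helper names are illustrative, `catalog` is assumed to be the already opened `pystac_client.Client` from the issue, and whether smaller windows actually avoid the timeout is an assumption to verify.

```python
from datetime import date, timedelta


def split_date_range(start, end, days):
    """Yield 'YYYY-MM-DD/YYYY-MM-DD' strings covering [start, end] in chunks."""
    current = start
    while current <= end:
        chunk_end = min(current + timedelta(days=days - 1), end)
        yield f"{current.isoformat()}/{chunk_end.isoformat()}"
        current = chunk_end + timedelta(days=1)


def search_by_window(catalog, windows, **search_kwargs):
    """Run one STAC search per window and collect all matching items.

    `catalog` is assumed to be an already opened pystac_client.Client.
    """
    items = []
    for window in windows:
        search = catalog.search(datetime=window, **search_kwargs)
        items.extend(search.item_collection())
    return items


windows = list(split_date_range(date(2019, 1, 1), date(2019, 1, 31), days=7))
print(windows)
```

With the objects from the issue, the call would look like `search_by_window(catalog, windows, collections=["sentinel-2-l2a"], intersects=area_of_interest)`. Splitting the area of interest into smaller polygons is a second option along the same lines.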

Customizable RTC with Sentinel-1

While following the Customizable RTC with Sentinel-1 tutorial, I get this error with sarsen v0.9.3:

TypeError                                 Traceback (most recent call last)
Cell 18 line 1
----> gtc = apps.terrain_correction(
    product_urlpath=grd_local_path,
    measurement_group=measurement_group,
    dem_urlpath=dem_path,
    output_urlpath=os.path.join(
        tmp_dir, os.path.basename(product_folder) + ".10m.GTC.tif"
    ),
)

TypeError: terrain_correction() got an unexpected keyword argument 'product_urlpath'

This happens when running cell In [13] of the tutorial, which contains the following code:

gtc = apps.terrain_correction(
    product_urlpath=grd_local_path,
    measurement_group=measurement_group,
    dem_urlpath=dem_path,
    output_urlpath=os.path.join(
        tmp_dir, os.path.basename(product_folder) + ".10m.GTC.tif"
    ),
)

Looking at the sarsen API, it seems to have changed since the tutorial was written. However, I cannot find the equivalent call in the new API.
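
One way to find out which keyword arguments the installed version accepts is to inspect the function signature at runtime. The sketch below uses the standard library's `inspect` module on a hypothetical stand-in function; the stand-in's parameter names are made up and say nothing about sarsen's actual API. In a session where sarsen is installed, `apps.terrain_correction` could be passed instead.

```python
import inspect


def describe_parameters(func):
    """Return {name: default} for each parameter of `func`.

    Parameters without a default map to '<required>'; *args and **kwargs
    style parameters map to '<variadic>'.
    """
    variadic = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)
    described = {}
    for name, param in inspect.signature(func).parameters.items():
        if param.kind in variadic:
            described[name] = "<variadic>"
        elif param.default is inspect.Parameter.empty:
            described[name] = "<required>"
        else:
            described[name] = param.default
    return described


# Hypothetical stand-in used only to make the sketch runnable.
def example_function(first_path, second_path, output_path="out.tif", **extra):
    return output_path


print(describe_parameters(example_function))
```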

workers losing communication with scheduler

Let me know if you think this is a more general Dask issue and not PC-specific, but we're running into an issue with workers seemingly randomly losing communications during processing. As a result, compute jobs hang indefinitely because their tasks get assigned to and fragmented amongst the lost workers.

Here's the relevant section of logs from one of the lost workers (anecdotally, they always seem to start with the connection to the scheduler being broken):

distributed.worker - INFO - Run out-of-band function '_configure_helper'

distributed.worker - INFO - Starting Worker plugin pc_utils.py0f4c310e-149e-4082-818a-05fb40d056ad
distributed.worker - INFO - Connection to scheduler broken. Reconnecting...

distributed.worker - WARNING - Compute Failed Function: execute_task args: ((<built-in function getitem>, (<built-in function getitem>, (<built-in function getitem>, (subgraph_callable, array(nan), 0, (slice(5120, 5493, None), slice(0, 1024, None)), array([[('https://sentinel2l2a01.blob.core.windows.net/sentinel2-l2/29/T/PF/2017/10/12/S2A_MSIL2A_20171012T112111_N0212_R037_T29TPF_20201015T000115.SAFE/GRANULE/L2A_T29TPF_A012046_20171012T112713/IMG_DATA/R10m/T29TPF_20171012T112111_B02_10m.tif<shortened>', [-7.81885207, 40.53618553, -6.52280188, 41.54559377])]], dtype=[('url', 'O'), ('bounds', '<f8', (4,))]), (<function apply at 0x7f30e7b931f0>, <class 'stackstac.raster_spec.RasterSpec'>, (), (<class 'dict'>, [['epsg', 4326], ['bounds', (<class 'tuple'>, [-7.2601718149277765, kwargs: {} Exception: CommClosedError()

distributed.worker - INFO - Comm closed

distributed.worker - WARNING - Compute Failed Function: execute_task args: ((<built-in function getitem>, (<built-in function getitem>, (<built-in function getitem>, (subgraph_callable, array(nan), 0, (slice(5120, 5493, None), slice(0, 1024, None)), array([[('https://sentinel2l2a01.blob.core.windows.net/sentinel2-l2/29/T/PF/2017/12/23/S2B_MSIL2A_20171223T111449_N0212_R137_T29TPF_20201014T155436.SAFE/GRANULE/L2A_T29TPF_A004167_20171223T111445/IMG_DATA/R10m/T29TPF_20171223T111449_B02_10m.tif<shortened>', [-7.81885207, 40.53618553, -6.52280188, 41.54559377])]], dtype=[('url', 'O'), ('bounds', '<f8', (4,))]), (<function apply at 0x7f30e7b931f0>, <class 'stackstac.raster_spec.RasterSpec'>, (), (<class 'dict'>, [['epsg', 4326], ['bounds', (<class 'tuple'>, [-7.2601718149277765, kwargs: {} Exception: CommClosedError()

distributed.worker - INFO - Comm closed

distributed.worker - WARNING - Compute Failed Function: execute_task args: ((<built-in function getitem>, (<built-in function getitem>, (<built-in function getitem>, (subgraph_callable, array(nan), 0, (slice(5120, 5493, None), slice(0, 1024, None)), array([[('https://sentinel2l2a01.blob.core.windows.net/sentinel2-l2/29/T/PF/2017/07/16/S2B_MSIL2A_20170716T110649_N0212_R137_T29TPF_20210210T100246.SAFE/GRANULE/L2A_T29TPF_A001879_20170716T111603/IMG_DATA/R10m/T29TPF_20170716T110649_B02_10m.tif<shortened>', [-7.81885207, 40.53618553, -6.52280188, 41.54559377])]], dtype=[('url', 'O'), ('bounds', '<f8', (4,))]), (<function apply at 0x7f30e7b931f0>, <class 'stackstac.raster_spec.RasterSpec'>, (), (<class 'dict'>, [['epsg', 4326], ['bounds', (<class 'tuple'>, [-7.2601718149277765, kwargs: {} Exception: CommClosedError()

distributed.worker - INFO - Comm closed

distributed.worker - INFO - Comm closed

How can I get a single-pixel time series in the most efficient way?

I would like to do this the way it works on Google Earth Engine. It is hard to do on your service: I have to open all the data I need and then subset it, which is an unfriendly process. I would appreciate your help in improving this.

query STAC extension support

Hi,

Would it be possible to add query support to your STAC endpoints?

As your team noted here, the pystac-client is a little slow when instantiating many items. The query extension would enable some pre-filtering (e.g., based on cloud cover) to help reduce the instantiation of unnecessary items.

For example, a search with the query command below:

api = pystac_client.Client.open(
    "https://planetarycomputer-staging.microsoft.com/api/stac/v1/"
)
s2_search = api.search(
    datetime="2017-09-01/2017-09-05",
    limit=500,
    collections=["sentinel-2-l2a"],
    query=['eo:cloud_cover<5']
)

Currently fails with the following error:

APIError: The 'query' STAC API extension is not yet implemented. You may perform queries against collection IDs, geometry (intersects or bbox), and datetime only. Stay tuned!
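
Until the endpoint supports `query`, one possible fallback is to filter on the client side, working on the raw GeoJSON feature dicts so that unwanted items are discarded before any item objects are built. The sketch below is a plain-Python illustration; `filter_by_cloud_cover` is an illustrative helper and the three features are made-up placeholders, not real search results.

```python
def filter_by_cloud_cover(features, max_cloud_cover):
    """Keep feature dicts whose eo:cloud_cover is below the threshold.

    Features without the property are dropped.
    """
    kept = []
    for feature in features:
        cloud_cover = feature.get("properties", {}).get("eo:cloud_cover")
        if cloud_cover is not None and cloud_cover < max_cloud_cover:
            kept.append(feature)
    return kept


# Made-up placeholder features for illustration only.
features = [
    {"id": "a", "properties": {"eo:cloud_cover": 1.2}},
    {"id": "b", "properties": {"eo:cloud_cover": 42.0}},
    {"id": "c", "properties": {}},
]

kept_ids = [f["id"] for f in filter_by_cloud_cover(features, 5)]
print(kept_ids)
```

This still transfers every feature over the network, so it only saves the cost of instantiating items, not the cost of the search itself.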

SSL: CERTIFICATE_VERIFY_FAILED error

I tried running the CMIP6 ensemble notebook locally (not on the hub) at my current organization and got:

SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate in certificate chain

I know I'm supposed to point at a custom .pem file I have in my home dir, but I'm not sure how to specify that in the context of the Planetary Computer notebooks.

In another notebook, I was able to overcome this by adding verify=false to the fsspec client_kwargs, but (a) I know that's not a great solution, and (b) I'm not sure how to even do that here.
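
One approach that avoids disabling verification is to set the environment variables that common HTTP stacks read for a custom CA bundle before any connection is opened. The sketch below sets `REQUESTS_CA_BUNDLE` (read by requests) and `SSL_CERT_FILE` (read by OpenSSL-based default contexts). The helper name is illustrative, and whether every library used by a given notebook honours these variables is an assumption to verify.

```python
import os
from pathlib import Path


def use_custom_ca_bundle(pem_path):
    """Point common HTTP libraries at a custom CA bundle via environment variables."""
    path = Path(pem_path).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"CA bundle not found: {path}")
    # Read by the requests library.
    os.environ["REQUESTS_CA_BUNDLE"] = str(path)
    # Read by OpenSSL when default certificate locations are loaded.
    os.environ["SSL_CERT_FILE"] = str(path)
    return str(path)


# Example call, with a hypothetical file name; run it at the top of the notebook:
# use_custom_ca_bundle("~/my-organization-ca.pem")
```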

High memory usage on proximity notebook

The cell

extent_data = data.sel(band="extent")

extent_proximity_default = proximity(extent_data).compute()

is currently failing on staging because the workers are using too much memory. The notebook output has

distributed.nanny - WARNING - Worker exceeded 95% memory budget. Restarting
distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:39389 -> tcp://127.0.0.1:36541
Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/comm/tcp.py", line 198, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed

Here's a reproducer with just xrspatial, dask, and xarray

import dask.array as da
import xarray as xr
from xrspatial.proximity import proximity

a = xr.DataArray(da.ones((5405, 5766), dtype="float64", chunks=(3000, 3000)), dims=("y", "x"))

# Call the imported function directly; the bare name `xrspatial` is never imported.
proximity(a).compute()

cc @thuydotm, does this look like an issue in xrspatial? Or do you think it might be upstream in dask?
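
A quick size estimate for the reproducer's input helps frame the question. The sketch below is back-of-the-envelope arithmetic from the shape, chunking, and dtype given above, not a measurement of actual worker memory.

```python
import math

shape = (5405, 5766)    # array shape from the reproducer
chunks = (3000, 3000)   # chunk shape from the reproducer
itemsize = 8            # bytes per float64 element

array_bytes = shape[0] * shape[1] * itemsize
largest_chunk_bytes = min(shape[0], chunks[0]) * min(shape[1], chunks[1]) * itemsize
n_chunks = math.ceil(shape[0] / chunks[0]) * math.ceil(shape[1] / chunks[1])

print(f"whole array:   {array_bytes / 1e6:.1f} MB")
print(f"largest chunk: {largest_chunk_bytes / 1e6:.1f} MB")
print(f"chunk count:   {n_chunks}")
```

The input is roughly 250 MB in four chunks, so if workers exceed their memory budget, intermediate arrays created during the computation are a more likely cause than the input itself.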

Sentinel-2-l2a band 10 availability

Hi,

When checking the available assets for the sentinel-2-l2a dataset, I noticed that band B10 appears to be absent.
Will this band be accessible through the Planetary Computer API in the future?

Best wishes,
Ben

radiant-mlhub-landcovernet example failing

The landcovernet example at https://github.com/microsoft/PlanetaryComputerExamples/blob/main/tutorials/radiant-mlhub-landcovernet.ipynb is currently failing for me.

>>> collection_id = "ref_landcovernet_v1_labels"
>>> collection = client.get_collection(collection_id)

---------------------------------------------------------------------------
APIError                                  Traceback (most recent call last)
File /srv/conda/envs/notebook/lib/python3.8/site-packages/pystac_client/stac_api_io.py:136, in StacApiIO.request(self, href, method, headers, parameters)
    135 if resp.status_code != 200:
--> 136     raise APIError(resp.text)
    137 return resp.content.decode("utf-8")

APIError: {"detail":"Collection ref_landcovernet_v1_labels does not exist."}

During handling of the above exception, another exception occurred:

APIError                                  Traceback (most recent call last)
Input In [4], in <cell line: 3>()
      1 collection_id = "ref_landcovernet_v1_labels"
----> 3 collection = client.get_collection(collection_id)
      4 collection_sci_ext = ScientificExtension.ext(collection)
      5 print(f"Description: {collection.description}")

File /srv/conda/envs/notebook/lib/python3.8/site-packages/pystac_client/client.py:92, in Client.get_collection(self, collection_id)
     90 if self._stac_io.conforms_to(ConformanceClasses.COLLECTIONS):
     91     url = f"{self.get_self_href()}/collections/{collection_id}"
---> 92     collection = CollectionClient.from_dict(self._stac_io.read_json(url), root=self)
     93     return collection
     94 else:

File /srv/conda/envs/notebook/lib/python3.8/site-packages/pystac/stac_io.py:197, in StacIO.read_json(self, source, *args, **kwargs)
    178 def read_json(
    179     self, source: Union[str, "Link_Type"], *args: Any, **kwargs: Any
    180 ) -> Dict[str, Any]:
    181     """Read a dict from the given source.
    182 
    183     See :func:`StacIO.read_text <pystac.StacIO.read_text>` for usage of
   (...)
    195         given source.
    196     """
--> 197     txt = self.read_text(source, *args, **kwargs)
    198     return self.json_loads(txt)

File /srv/conda/envs/notebook/lib/python3.8/site-packages/pystac_client/stac_api_io.py:77, in StacApiIO.read_text(self, source, parameters, *args, **kwargs)
     75 href = source
     76 if bool(urlparse(href).scheme):
---> 77     return self.request(href, *args, parameters=parameters, **kwargs)
     78 else:
     79     with open(href) as f:

File /srv/conda/envs/notebook/lib/python3.8/site-packages/pystac_client/stac_api_io.py:139, in StacApiIO.request(self, href, method, headers, parameters)
    137     return resp.content.decode("utf-8")
    138 except Exception as err:
--> 139     raise APIError(str(err))

APIError: {"detail":"Collection ref_landcovernet_v1_labels does not exist."}

cc @KennSmithDS. Do you know if that collection ID was deliberately removed? Is there a good alternative?

ExtensionNotImplemented error

As suggested in a comment in #182, I tried the label collection in Africa (ref_landcovernet_af_v1_labels). However, I now receive the following error.

---------------------------------------------------------------------------
ExtensionNotImplemented                   Traceback (most recent call last)
/tmp/ipykernel_17820/416491967.py in <module>
      2 
      3 first_item = next(item_search.get_items())
----> 4 first_item_label_ext = LabelExtension.ext(first_item)
      5 
      6 label_classes = first_item_label_ext.label_classes

/opt/conda/lib/python3.8/site-packages/pystac/extensions/label.py in ext(cls, obj, add_if_missing)
    699         """
    700         if isinstance(obj, pystac.Item):
--> 701             cls.validate_has_extension(obj, add_if_missing)
    702             return cls(obj)
    703         else:

/opt/conda/lib/python3.8/site-packages/pystac/extensions/base.py in validate_has_extension(cls, obj, add_if_missing)
    174 
    175         if cls.get_schema_uri() not in obj.stac_extensions:
--> 176             raise pystac.ExtensionNotImplemented(
    177                 f"Could not find extension schema URI {cls.get_schema_uri()} in object."
    178             )

ExtensionNotImplemented: Could not find extension schema URI https://stac-extensions.github.io/label/v1.0.1/schema.json in object.
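
The error means the schema URI that pystac expects is not listed in the item's `stac_extensions`. A small sketch for checking what the item actually declares is shown below; `declared_extensions` is an illustrative helper, and the item dict is a made-up placeholder rather than the real item from the collection. With a real `pystac.Item`, the list is available as `item.stac_extensions`.

```python
def declared_extensions(item_dict, keyword):
    """Return the entries in `stac_extensions` that mention `keyword`."""
    return [uri for uri in item_dict.get("stac_extensions", []) if keyword in uri]


# Schema URI taken from the error message above.
expected_uri = "https://stac-extensions.github.io/label/v1.0.1/schema.json"

# Made-up placeholder item, assumed here to declare a different label version.
item_dict = {
    "id": "example-item",
    "stac_extensions": ["https://stac-extensions.github.io/label/v1.0.0/schema.json"],
}

found = declared_extensions(item_dict, "label")
print("declared label extensions:", found)
print("expected URI present:", expected_uri in item_dict["stac_extensions"])
```

If the item declares the label extension under a different version or an older short name, the mismatch is between the catalog's metadata and the installed pystac version rather than a missing extension.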

Where is last_provisional_slice and/or last_permanent_slice stored in the gridMET Zarr file?

I love the example notebooks. In the gridMET example, notes 3 and 4 mention last_permanent_slice and last_provisional_slice, but where are those values stored in the Zarr store? I've looked through the various attrs but don't see them, so a user cannot tell where the provisional or permanent data ends. Thanks in advance for an answer.

unable to initialize dask_cuda LocalCUDACluster

Using GPU-PyTorch profile, unable to initialize LocalCUDACluster.

It does not fail on import:
from dask.distributed import Client
from dask_cuda import LocalCUDACluster

It fails on initializing the Cluster:
cluster = LocalCUDACluster(threads_per_worker=4)

Full error message:

distributed.diskutils - INFO - Found stale lock file and directory '/home/jovyan/PlanetaryComputerExamples/tutorials/dask-worker-space/worker-o6h81bfq', purging
distributed.preloading - INFO - Import preload module: dask_cuda.initialize
distributed.nanny - ERROR - Failed to start worker
Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/nanny.py", line 889, in run
    await worker
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/core.py", line 283, in _
    await self.start()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py", line 1502, in start
    await self._register_with_scheduler()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py", line 1198, in _register_with_scheduler
    types={k: typename(v) for k, v in self.data.items()},
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py", line 1198, in <dictcomp>
    types={k: typename(v) for k, v in self.data.items()},
  File "/srv/conda/envs/notebook/lib/python3.8/_collections_abc.py", line 743, in __iter__
    for key in self._mapping:
RuntimeError: Set changed size during iteration
distributed.nanny - ERROR - Failed while trying to start worker process: Set changed size during iteration
Task exception was never retrieved
future: <Task finished name='Task-22' coro=<_wrap_awaitable() done, defined at /srv/conda/envs/notebook/lib/python3.8/asyncio/tasks.py:688> exception=RuntimeError('Set changed size during iteration')>
Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.8/asyncio/tasks.py", line 695, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/core.py", line 283, in _
    await self.start()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/nanny.py", line 338, in start
    response = await self.instantiate()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/nanny.py", line 421, in instantiate
    result = await self.process.start()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/nanny.py", line 698, in start
    msg = await self._wait_until_connected(uid)
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/nanny.py", line 817, in _wait_until_connected
    raise msg["exception"]
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/nanny.py", line 889, in run
    await worker
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/core.py", line 283, in _
    await self.start()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py", line 1502, in start
    await self._register_with_scheduler()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py", line 1198, in _register_with_scheduler
    types={k: typename(v) for k, v in self.data.items()},
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py", line 1198, in <dictcomp>
    types={k: typename(v) for k, v in self.data.items()},
  File "/srv/conda/envs/notebook/lib/python3.8/_collections_abc.py", line 743, in __iter__
    for key in self._mapping:
RuntimeError: Set changed size during iteration
distributed.diskutils - INFO - Found stale lock file and directory '/home/jovyan/PlanetaryComputerExamples/tutorials/dask-worker-space/worker-fz13r6ye', purging
distributed.preloading - INFO - Import preload module: dask_cuda.initialize
distributed.nanny - ERROR - Failed to start worker
Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/nanny.py", line 889, in run
    await worker
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/core.py", line 283, in _
    await self.start()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py", line 1502, in start
    await self._register_with_scheduler()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py", line 1198, in _register_with_scheduler
    types={k: typename(v) for k, v in self.data.items()},
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py", line 1198, in <dictcomp>
    types={k: typename(v) for k, v in self.data.items()},
  File "/srv/conda/envs/notebook/lib/python3.8/_collections_abc.py", line 743, in __iter__
    for key in self._mapping:
RuntimeError: Set changed size during iteration
distributed.nanny - ERROR - Failed while trying to start worker process: Set changed size during iteration
tornado.application - ERROR - Exception in callback functools.partial(<bound method IOLoop._discard_future_result of <zmq.eventloop.ioloop.ZMQIOLoop object at 0x7f338725d970>>, <Task finished name='Task-21' coro=<SpecCluster._correct_state_internal() done, defined at /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/deploy/spec.py:325> exception=RuntimeError('Set changed size during iteration')>)
Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/tornado/ioloop.py", line 741, in _run_callback
    ret = callback()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/tornado/ioloop.py", line 765, in _discard_future_result
    future.result()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/deploy/spec.py", line 363, in _correct_state_internal
    await w  # for tornado gen.coroutine support
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/core.py", line 283, in _
    await self.start()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/nanny.py", line 338, in start
    response = await self.instantiate()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/nanny.py", line 421, in instantiate
    result = await self.process.start()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/nanny.py", line 698, in start
    msg = await self._wait_until_connected(uid)
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/nanny.py", line 817, in _wait_until_connected
    raise msg["exception"]
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/nanny.py", line 889, in run
    await worker
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/core.py", line 283, in _
    await self.start()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py", line 1502, in start
    await self._register_with_scheduler()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py", line 1198, in _register_with_scheduler
    types={k: typename(v) for k, v in self.data.items()},
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py", line 1198, in <dictcomp>
    types={k: typename(v) for k, v in self.data.items()},
  File "/srv/conda/envs/notebook/lib/python3.8/_collections_abc.py", line 743, in __iter__
    for key in self._mapping:
RuntimeError: Set changed size during iteration
distributed.diskutils - INFO - Found stale lock file and directory '/home/jovyan/PlanetaryComputerExamples/tutorials/dask-worker-space/worker-zyrogkjh', purging
distributed.preloading - INFO - Import preload module: dask_cuda.initialize
distributed.nanny - ERROR - Failed to start worker
Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/nanny.py", line 889, in run
    await worker
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/core.py", line 283, in _
    await self.start()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py", line 1502, in start
    await self._register_with_scheduler()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py", line 1198, in _register_with_scheduler
    types={k: typename(v) for k, v in self.data.items()},
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py", line 1198, in <dictcomp>
    types={k: typename(v) for k, v in self.data.items()},
  File "/srv/conda/envs/notebook/lib/python3.8/_collections_abc.py", line 743, in __iter__
    for key in self._mapping:
RuntimeError: Set changed size during iteration
distributed.nanny - ERROR - Failed while trying to start worker process: Set changed size during iteration
Task exception was never retrieved
future: <Task finished name='Task-47' coro=<_wrap_awaitable() done, defined at /srv/conda/envs/notebook/lib/python3.8/asyncio/tasks.py:688> exception=RuntimeError('Set changed size during iteration')>
Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.8/asyncio/tasks.py", line 695, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/core.py", line 283, in _
    await self.start()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/nanny.py", line 338, in start
    response = await self.instantiate()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/nanny.py", line 421, in instantiate
    result = await self.process.start()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/nanny.py", line 698, in start
    msg = await self._wait_until_connected(uid)
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/nanny.py", line 817, in _wait_until_connected
    raise msg["exception"]
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/nanny.py", line 889, in run
    await worker
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/core.py", line 283, in _
    await self.start()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py", line 1502, in start
    await self._register_with_scheduler()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py", line 1198, in _register_with_scheduler
    types={k: typename(v) for k, v in self.data.items()},
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py", line 1198, in <dictcomp>
    types={k: typename(v) for k, v in self.data.items()},
  File "/srv/conda/envs/notebook/lib/python3.8/_collections_abc.py", line 743, in __iter__
    for key in self._mapping:
RuntimeError: Set changed size during iteration
distributed.diskutils - INFO - Found stale lock file and directory '/home/jovyan/PlanetaryComputerExamples/tutorials/dask-worker-space/worker-jpr2r_q4', purging
distributed.preloading - INFO - Import preload module: dask_cuda.initialize
distributed.nanny - ERROR - Failed to start worker
Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/nanny.py", line 889, in run
    await worker
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/core.py", line 283, in _
    await self.start()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py", line 1502, in start
    await self._register_with_scheduler()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py", line 1198, in _register_with_scheduler
    types={k: typename(v) for k, v in self.data.items()},
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py", line 1198, in <dictcomp>
    types={k: typename(v) for k, v in self.data.items()},
  File "/srv/conda/envs/notebook/lib/python3.8/_collections_abc.py", line 743, in __iter__
    for key in self._mapping:
RuntimeError: Set changed size during iteration
distributed.nanny - ERROR - Failed while trying to start worker process: Set changed size during iteration
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Input In [1], in <module>
      1 from dask.distributed import Client
      2 from dask_cuda import LocalCUDACluster
----> 4 cluster = LocalCUDACluster(threads_per_worker=4)
      5 client = Client(cluster)
      6 print(f"/proxy/{client.scheduler_info()['services']['dashboard']}/status")

File /srv/conda/envs/notebook/lib/python3.8/site-packages/dask_cuda/local_cuda_cluster.py:361, in LocalCUDACluster.__init__(self, CUDA_VISIBLE_DEVICES, n_workers, threads_per_worker, memory_limit, device_memory_limit, data, local_directory, protocol, enable_tcp_over_ucx, enable_infiniband, enable_nvlink, enable_rdmacm, ucx_net_devices, rmm_pool_size, rmm_managed_memory, rmm_async, rmm_log_directory, jit_unspill, log_spilling, worker_class, **kwargs)
    359 self.cuda_visible_devices = CUDA_VISIBLE_DEVICES
    360 self.scale(n_workers)
--> 361 self.sync(self._correct_state)

File /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/deploy/cluster.py:258, in Cluster.sync(self, func, asynchronous, callback_timeout, *args, **kwargs)
    256     return future
    257 else:
--> 258     return sync(self.loop, func, *args, **kwargs)

File /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/utils.py:332, in sync(loop, func, callback_timeout, *args, **kwargs)
    330 if error[0]:
    331     typ, exc, tb = error[0]
--> 332     raise exc.with_traceback(tb)
    333 else:
    334     return result[0]

File /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/utils.py:315, in sync.<locals>.f()
    313     if callback_timeout is not None:
    314         future = asyncio.wait_for(future, callback_timeout)
--> 315     result[0] = yield future
    316 except Exception:
    317     error[0] = sys.exc_info()

File /srv/conda/envs/notebook/lib/python3.8/site-packages/tornado/gen.py:762, in Runner.run(self)
    759 exc_info = None
    761 try:
--> 762     value = future.result()
    763 except Exception:
    764     exc_info = sys.exc_info()

File /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/deploy/spec.py:363, in SpecCluster._correct_state_internal(self)
    361     for w in workers:
    362         w._cluster = weakref.ref(self)
--> 363         await w  # for tornado gen.coroutine support
    364 self.workers.update(dict(zip(to_open, workers)))

File /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/core.py:283, in Server.__await__.<locals>._()
    277             raise TimeoutError(
    278                 "{} failed to start in {} seconds".format(
    279                     type(self).__name__, timeout
    280                 )
    281             )
    282     else:
--> 283         await self.start()
    284         self.status = Status.running
    285 return self

File /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/nanny.py:338, in Nanny.start(self)
    335     await self.plugin_add(plugin=plugin, name=name)
    337 logger.info("        Start Nanny at: %r", self.address)
--> 338 response = await self.instantiate()
    339 if response == Status.running:
    340     assert self.worker_address

File /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/nanny.py:421, in Nanny.instantiate(self, comm)
    419 else:
    420     try:
--> 421         result = await self.process.start()
    422     except Exception:
    423         await self.close()

File /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/nanny.py:698, in WorkerProcess.start(self)
    696     return self.status
    697 try:
--> 698     msg = await self._wait_until_connected(uid)
    699 except Exception:
    700     self.status = Status.failed

File /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/nanny.py:817, in WorkerProcess._wait_until_connected(self, uid)
    813 if "exception" in msg:
    814     logger.error(
    815         "Failed while trying to start worker process: %s", msg["exception"]
    816     )
--> 817     raise msg["exception"]
    818 else:
    819     return msg

File /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/nanny.py:889, in run()
    885 """
    886 Try to start worker and inform parent of outcome.
    887 """
    888 try:
--> 889     await worker
    890 except Exception as e:
    891     logger.exception("Failed to start worker")

File /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/core.py:283, in _()
    277             raise TimeoutError(
    278                 "{} failed to start in {} seconds".format(
    279                     type(self).__name__, timeout
    280                 )
    281             )
    282     else:
--> 283         await self.start()
    284         self.status = Status.running
    285 return self

File /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py:1502, in start()
   1497 await asyncio.gather(
   1498     *(self.plugin_add(plugin=plugin) for plugin in self._pending_plugins)
   1499 )
   1500 self._pending_plugins = ()
-> 1502 await self._register_with_scheduler()
   1504 self.start_periodic_callbacks()
   1505 return self

File /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py:1198, in _register_with_scheduler()
   1179 comm.name = "Worker->Scheduler"
   1180 comm._server = weakref.ref(self)
   1181 await comm.write(
   1182     dict(
   1183         op="register-worker",
   1184         reply=False,
   1185         address=self.contact_address,
   1186         status=self.status.name,
   1187         keys=list(self.data),
   1188         nthreads=self.nthreads,
   1189         name=self.name,
   1190         nbytes={
   1191             ts.key: ts.get_nbytes()
   1192             for ts in self.tasks.values()
   1193             # Only if the task is in memory this is a sensible
   1194             # result since otherwise it simply submits the
   1195             # default value
   1196             if ts.state == "memory"
   1197         },
-> 1198         types={k: typename(v) for k, v in self.data.items()},
   1199         now=time(),
   1200         resources=self.total_resources,
   1201         memory_limit=self.memory_limit,
   1202         local_directory=self.local_directory,
   1203         services=self.service_ports,
   1204         nanny=self.nanny,
   1205         pid=os.getpid(),
   1206         versions=get_versions(),
   1207         metrics=await self.get_metrics(),
   1208         extra=await self.get_startup_information(),
   1209     ),
   1210     serializers=["msgpack"],
   1211 )
   1212 future = comm.read(deserializers=["msgpack"])
   1214 response = await future

File /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/worker.py:1198, in <dictcomp>()
   1179 comm.name = "Worker->Scheduler"
   1180 comm._server = weakref.ref(self)
   1181 await comm.write(
   1182     dict(
   1183         op="register-worker",
   1184         reply=False,
   1185         address=self.contact_address,
   1186         status=self.status.name,
   1187         keys=list(self.data),
   1188         nthreads=self.nthreads,
   1189         name=self.name,
   1190         nbytes={
   1191             ts.key: ts.get_nbytes()
   1192             for ts in self.tasks.values()
   1193             # Only if the task is in memory this is a sensible
   1194             # result since otherwise it simply submits the
   1195             # default value
   1196             if ts.state == "memory"
   1197         },
-> 1198         types={k: typename(v) for k, v in self.data.items()},
   1199         now=time(),
   1200         resources=self.total_resources,
   1201         memory_limit=self.memory_limit,
   1202         local_directory=self.local_directory,
   1203         services=self.service_ports,
   1204         nanny=self.nanny,
   1205         pid=os.getpid(),
   1206         versions=get_versions(),
   1207         metrics=await self.get_metrics(),
   1208         extra=await self.get_startup_information(),
   1209     ),
   1210     serializers=["msgpack"],
   1211 )
   1212 future = comm.read(deserializers=["msgpack"])
   1214 response = await future

File /srv/conda/envs/notebook/lib/python3.8/_collections_abc.py:743, in __iter__()
    742 def __iter__(self):
--> 743     for key in self._mapping:
    744         yield (key, self._mapping[key])

RuntimeError: Set changed size during iteration

Update us-census example to use storage_options

geopandas/geopandas#2107 added storage options to geopandas' read_parquet. Once it's released, we should update the census notebook to replace

protocol, url = signed_asset.href.split("://")
fs = fsspec.filesystem(protocol, **signed_asset.extra_fields["table:storage_options"])
df = geopandas.read_parquet(url, filesystem=fs)

with

df = geopandas.read_parquet(url, storage_options=signed_asset.extra_fields["table:storage_options"])

Bad image in STAC?

Received this error during some batch processing, so maybe this image in particular is corrupt?

RasterioIOError: '/vsicurl/https://sentinel2l2a01.blob.core.windows.net/sentinel2-l2/17/Q/LA/2017/12/27/S2B_MSIL2A_20171227T160459_N0212_R054_T17QLA_20201014T165101.SAFE/GRANULE/L2A_T17QLA_A004227_20171227T160750/IMG_DATA/R10m/T17QLA_20171227T160459_B04_10m.tif<url_shortened>' not recognized as a supported file format.

Accessed via this STAC API endpoint:
"https://planetarycomputer-staging.microsoft.com/api/stac/v1/"

GPU environment has corrupted GDAL installation

Apologies if this isn't the right place to be posting this issue.

All GDAL dependencies seem to work properly in the CPU environment with the Pangeo Notebook. However, the GDAL/rasterio installations seem to be broken when you spin up a GPU machine on the Hub. If I open a terminal and type in gdalinfo I get this error:

gdalinfo: error while loading shared libraries: libdeflate.so.0: cannot open shared object file: no such file or directory

Inaccessibility of Aqua Snow Cover Daily product with id "MYD10A1"

Hi, your example works fine with the Terra product. I tried to access the Aqua Snow Cover Daily product with its id "MYD10A1", but it is not accessible. Are you going to add this product in the near future?
I adapted your example, but it does not work:

for datetime in datelist:
    print(f"Fetching {datetime}")
    search = catalog.search(
        collections=["modis-MYD10A1-061"],
        intersects=location,
        datetime=str(datetime),
    )
    item_aq = search.get_all_items()[0]
    items_aq[datetime] = planetary_computer.sign(item_aq)

print(items_aq)

label-maker-dask.ipynb example tutorial fails

The https://github.com/microsoft/PlanetaryComputerExamples/blob/main/tutorials/label-maker-dask.ipynb tutorial fails for me.

When running lmj.execute_job() I get many repetitions of the following warning:

distributed.worker - WARNING - Compute Failed
Function:  execute_task
args:      ((<function tile_to_label at 0x7f6dd4e45c10>, Tile(x=15550, y=12548, z=15), 'segmentation', [(<class 'dict'>, [['name', 'Roads'], ['filter', ['has', 'highway']]]), (<class 'dict'>, [['name', 'Buildings'], ['filter', ['has', 'building']]])], 'https://qa-tiles-server-dev.ds.io/services/z17/tiles/{z}/{x}/{y}.pbf'))
kwargs:    {}
Exception: 'SSLError(MaxRetryError(\'HTTPSConnectionPool(host=\\\'qa-tiles-server-dev.ds.io\\\', port=443): Max retries exceeded with url: /services/z17/tiles/15/15550/12548.pbf (Caused by SSLError(CertificateError("hostname \\\'qa-tiles-server-dev.ds.io\\\' doesn\\\'t match either of \\\'*.azure-api.net\\\', \\\'*.portal.azure-api.net\\\', \\\'*.management.azure-api.net\\\', \\\'*.scm.azure-api.net\\\', \\\'*.configuration.azure-api.net\\\', \\\'*.regional.azure-api.net\\\', \\\'*.developer.azure-api.net\\\', \\\'*.data.azure-api.net\\\'")))\'))'

This is followed by an SSLError:

---------------------------------------------------------------------------
SSLError                                  Traceback (most recent call last)
Input In [7], in <cell line: 1>()
----> 1 lmj.execute_job()

File /srv/conda/envs/notebook/lib/python3.8/site-packages/label_maker_dask/main.py:108, in LabelMakerJob.execute_job(self)
    106 def execute_job(self):
    107     """compute the labels and images"""
--> 108     self.results = dask.compute(*self.tasks)

File /srv/conda/envs/notebook/lib/python3.8/site-packages/dask/base.py:573, in compute(traverse, optimize_graph, scheduler, get, *args, **kwargs)
    570     keys.append(x.__dask_keys__())
    571     postcomputes.append(x.__dask_postcompute__())
--> 573 results = schedule(dsk, keys, **kwargs)
    574 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])

File /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/client.py:3010, in Client.get(self, dsk, keys, workers, allow_other_workers, resources, sync, asynchronous, direct, retries, priority, fifo_timeout, actors, **kwargs)
   3008         should_rejoin = False
   3009 try:
-> 3010     results = self.gather(packed, asynchronous=asynchronous, direct=direct)
   3011 finally:
   3012     for f in futures.values():

File /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/client.py:2162, in Client.gather(self, futures, errors, direct, asynchronous)
   2160 else:
   2161     local_worker = None
-> 2162 return self.sync(
   2163     self._gather,
   2164     futures,
   2165     errors=errors,
   2166     direct=direct,
   2167     local_worker=local_worker,
   2168     asynchronous=asynchronous,
   2169 )

File /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/utils.py:311, in SyncMethodMixin.sync(self, func, asynchronous, callback_timeout, *args, **kwargs)
    309     return future
    310 else:
--> 311     return sync(
    312         self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
    313     )

File /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/utils.py:378, in sync(loop, func, callback_timeout, *args, **kwargs)
    376 if error:
    377     typ, exc, tb = error
--> 378     raise exc.with_traceback(tb)
    379 else:
    380     return result

File /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/utils.py:351, in sync.<locals>.f()
    349         future = asyncio.wait_for(future, callback_timeout)
    350     future = asyncio.ensure_future(future)
--> 351     result = yield future
    352 except Exception:
    353     error = sys.exc_info()

File /srv/conda/envs/notebook/lib/python3.8/site-packages/tornado/gen.py:762, in Runner.run(self)
    759 exc_info = None
    761 try:
--> 762     value = future.result()
    763 except Exception:
    764     exc_info = sys.exc_info()

File /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/client.py:2025, in Client._gather(self, futures, errors, direct, local_worker)
   2023         exc = CancelledError(key)
   2024     else:
-> 2025         raise exception.with_traceback(traceback)
   2026     raise exc
   2027 if errors == "skip":

File /srv/conda/envs/notebook/lib/python3.8/site-packages/label_maker_dask/main.py:38, in tile_to_label()
     22 """
     23 Parameters
     24 ------------
   (...)
     34     representing the label of the tile
     35 """
     37 url = label_source.format(x=tile.x, y=tile.y, z=tile.z)
---> 38 r = requests.get(url)
     39 r.raise_for_status()
     41 tile_data = mapbox_vector_tile.decode(r.content)

File /srv/conda/envs/notebook/lib/python3.8/site-packages/requests/api.py:75, in get()
     64 def get(url, params=None, **kwargs):
     65     r"""Sends a GET request.
     66 
     67     :param url: URL for the new :class:`Request` object.
   (...)
     72     :rtype: requests.Response
     73     """
---> 75     return request('get', url, params=params, **kwargs)

File /srv/conda/envs/notebook/lib/python3.8/site-packages/requests/api.py:61, in request()
     57 # By using the 'with' statement we are sure the session is closed, thus we
     58 # avoid leaving sockets open which can trigger a ResourceWarning in some
     59 # cases, and look like a memory leak in others.
     60 with sessions.Session() as session:
---> 61     return session.request(method=method, url=url, **kwargs)

File /srv/conda/envs/notebook/lib/python3.8/site-packages/requests/sessions.py:529, in request()
    524 send_kwargs = {
    525     'timeout': timeout,
    526     'allow_redirects': allow_redirects,
    527 }
    528 send_kwargs.update(settings)
--> 529 resp = self.send(prep, **send_kwargs)
    531 return resp

File /srv/conda/envs/notebook/lib/python3.8/site-packages/requests/sessions.py:645, in send()
    642 start = preferred_clock()
    644 # Send the request
--> 645 r = adapter.send(request, **kwargs)
    647 # Total elapsed time of the request (approximately)
    648 elapsed = preferred_clock() - start

File /srv/conda/envs/notebook/lib/python3.8/site-packages/requests/adapters.py:517, in send()
    513         raise ProxyError(e, request=request)
    515     if isinstance(e.reason, _SSLError):
    516         # This branch is for urllib3 v1.22 and later.
--> 517         raise SSLError(e, request=request)
    519     raise ConnectionError(e, request=request)
    521 except ClosedPoolError as e:

SSLError: None: Max retries exceeded with url: /services/z17/tiles/15/15550/12548.pbf (Caused by None)

It looks like the label_source suggested in the notebook does not work:

label_source: str. A template string for a tile server providing OpenStreetMap QA tiles. Planetary Computer hosts a tile server supporting this format at https://qa-tiles-server-dev.ds.io/services/z17/tiles/{z}/{x}/{y}.pbf
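
To see which URL each task actually requests, the template can be expanded by hand. The following is a minimal sketch that mirrors the label_source.format(...) call shown in the traceback, using the tile from the warning above. The Tile namedtuple is only a stand-in for the real tile type and is an assumption.

```python
from collections import namedtuple

# Stand-in for the tile type used by label-maker-dask (assumed to expose x, y, z).
Tile = namedtuple("Tile", ["x", "y", "z"])

label_source = "https://qa-tiles-server-dev.ds.io/services/z17/tiles/{z}/{x}/{y}.pbf"
tile = Tile(x=15550, y=12548, z=15)

# Same substitution as in tile_to_label(): fill the {z}/{x}/{y} placeholders.
url = label_source.format(x=tile.x, y=tile.y, z=tile.z)
print(url)
```

The resulting path matches the one reported in the SSLError, which suggests the failure lies in reaching the host rather than in how the template is expanded.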

stac_vrt missing when launching GPU-PyTorch profile for landcover.ipynb

When running the landcover tutorial, import stac_vrt failed.

Adding a cell above the import that runs pip install stac-vrt fixes the issue.

Querying the STAC API for several models at the same time

Hi,

I am new to the Planetary Computer and am looking for advice on how to load several climate models at the same time. I am using the nasa-nex-gddp-cmip6 data and found this example for a single model:

search = catalog.search(
    collections=["nasa-nex-gddp-cmip6"],
    datetime="1950/2000",
    query={"cmip6:model": {"eq": "ACCESS-CM2"}},
)
items = search.get_all_items()
len(items)

However, I need more than just "ACCESS-CM2", for example also "ACCESS-ESM1-5", "BCC-CSM2-MR", "CESM2", and "CESM2-WACCM". Could you please let me know the syntax to load all, or at least several, climate models at the same time?

Many thanks,

Ivana
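
One possible approach, sketched under the assumption that the Planetary Computer STAC API supports the `in` operator of the STAC API query extension, is to pass a list of model names instead of a single `eq` value. The helper name build_model_query is hypothetical, and the search call itself is left commented out because it needs network access and the catalog object from the example above.

```python
def build_model_query(models):
    """Build a `query` argument that matches any of the given CMIP6 models."""
    # "in" is an operator defined by the STAC API query extension; support on
    # the server side is an assumption here.
    return {"cmip6:model": {"in": list(models)}}


models = ["ACCESS-CM2", "ACCESS-ESM1-5", "BCC-CSM2-MR", "CESM2", "CESM2-WACCM"]
query = build_model_query(models)
print(query)

# Usage with the catalog from the example above (requires network access):
# search = catalog.search(
#     collections=["nasa-nex-gddp-cmip6"],
#     datetime="1950/2000",
#     query=query,
# )
# items = search.get_all_items()
```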

pystac_client ConformanceError

While following the Reading Data from the STAC API quickstart, I hit a ConformanceError raised by pystac_client. Apparently the API "must contain one of the following URIs".

from pystac_client import Client
catalog = Client.open("https://planetarycomputer.microsoft.com/api/stac/v1")

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\smurphy1\Miniconda3\envs\mpc\lib\site-packages\pystac_client\client.py", line 98, in open
    catalog = cls.from_file(url)
  File "C:\Users\smurphy1\Miniconda3\envs\mpc\lib\site-packages\pystac\stac_object.py", line 496, in from_file
    o = cls.from_dict(d, href=href)
  File "C:\Users\smurphy1\Miniconda3\envs\mpc\lib\site-packages\pystac_client\client.py", line 134, in from_dict
    catalog = cls(
  File "C:\Users\smurphy1\Miniconda3\envs\mpc\lib\site-packages\pystac_client\client.py", line 68, in __init__
    raise ConformanceError(
pystac_client.exceptions.ConformanceError: API does not conform to {ConformanceClasses.STAC_API_CORE}. Must contain one
of the following URIs to conform (preferably the first):
        https://api.stacspec.org/v1.0.0-beta.1/core
        http://stacspec.org/spec/api/1.0.0-beta.1/core.

Steps to reproduce

Create a new conda environment in Windows
pip install planetary-computer
pip install pystac-client

Versions

Python 3.9.6
pystac-client      0.1.1
pystac             0.5.6
planetary-computer 0.2.2

Xclim example not running

Hi, the notebook at https://planetarycomputer.microsoft.com/dataset/cil-gdpcir-cc-by#Climate-indicators is not working.
I am aware that xclim has a new release, and apparently your server has not been updated for it yet.

module 'pystac_client' has no attribute 'ItemCollection'

I ran this example: https://planetarycomputer.microsoft.com/docs/tutorials/cloudless-mosaic-sentinel2/
and got the following error:

AttributeError                            Traceback (most recent call last)
Cell In[23], line 1
----> 1 signed_items = pystac_client.ItemCollection([
      2     planetary_computer.sign_assets(item) for item in items
      3 ])
      4 #print(signed_items)
      5 data = (
      6     stackstac.stack(
      7         items,
   (...)
     13     .assign_coords(band=lambda x: x.common_name.rename("band"))  # use common names
     14 )
AttributeError: module 'pystac_client' has no attribute 'ItemCollection'

How can I solve this problem?

Unable to set up Dask Gateway on local system

I am trying to set up Dask Gateway on my local system, but when I click on Dask Gateway from the Hub it shows a 404 error.
Can you suggest how I can set up Dask Gateway?

Unable to use xclim library even after downgrading the xclim library version

I was trying to use xclim.indicators in my JupyterHub but got an error that pint.unit is not available.

Issue in accessing elements of data variable using Xarray with Dask Clusters

I am not sure whether this question should be asked here. I am trying to access elements of the "re" dataset using re.variable[0,0,0].values.item(). It works perfectly without a Dask cluster, but it stops working as soon as I create one. I tried re.persist() and load(), but the issue is still not solved. I am using the Planetary Computer environment (Jupyter Notebook). The error details are below; I would appreciate any help.

---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Input In [33], in
----> 1 da[0,0,0].values.item()

File /srv/conda/envs/notebook/lib/python3.8/site-packages/xarray/core/dataarray.py:641, in DataArray.values(self)
632 @property
633 def values(self) -> np.ndarray:
634 """
635 The array's data as a numpy.ndarray.
636
(...)
639 type does not support coercion like this (e.g. cupy).
640 """
--> 641 return self.variable.values

File /srv/conda/envs/notebook/lib/python3.8/site-packages/xarray/core/variable.py:510, in Variable.values(self)
507 @property
508 def values(self):
509 """The variable's data as a numpy.ndarray"""
--> 510 return _as_array_or_item(self._data)

File /srv/conda/envs/notebook/lib/python3.8/site-packages/xarray/core/variable.py:250, in _as_array_or_item(data)
236 def _as_array_or_item(data):
237 """Return the given values as a numpy array, or as an individual item if
238 it's a 0d datetime64 or timedelta64 array.
239
(...)
248 TODO: remove this (replace with np.asarray) once these issues are fixed
249 """
--> 250 data = np.asarray(data)
251 if data.ndim == 0:
252 if data.dtype.kind == "M":

File /srv/conda/envs/notebook/lib/python3.8/site-packages/numpy/core/_asarray.py:102, in asarray(a, dtype, order, like)
99 if like is not None:
100 return _asarray_with_like(a, dtype=dtype, order=order, like=like)
--> 102 return array(a, dtype, copy=False, order=order)

File /srv/conda/envs/notebook/lib/python3.8/site-packages/dask/array/core.py:1541, in Array.__array__(self, dtype, **kwargs)
1540 def __array__(self, dtype=None, **kwargs):
-> 1541 x = self.compute()
1542 if dtype and x.dtype != dtype:
1543 x = x.astype(dtype)

File /srv/conda/envs/notebook/lib/python3.8/site-packages/dask/base.py:288, in DaskMethodsMixin.compute(self, **kwargs)
264 def compute(self, **kwargs):
265 """Compute this dask collection
266
267 This turns a lazy Dask collection into its in-memory equivalent.
(...)
286 dask.base.compute
287 """
--> 288 (result,) = compute(self, traverse=False, **kwargs)
289 return result

File /srv/conda/envs/notebook/lib/python3.8/site-packages/dask/base.py:571, in compute(traverse, optimize_graph, scheduler, get, *args, **kwargs)
568 keys.append(x.__dask_keys__())
569 postcomputes.append(x.__dask_postcompute__())
--> 571 results = schedule(dsk, keys, **kwargs)
572 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])

File /srv/conda/envs/notebook/lib/python3.8/site-packages/dask/threaded.py:79, in get(dsk, result, cache, num_workers, pool, **kwargs)
76 elif isinstance(pool, multiprocessing.pool.Pool):
77 pool = MultiprocessingPoolExecutor(pool)
---> 79 results = get_async(
80 pool.submit,
81 pool._max_workers,
82 dsk,
83 result,
84 cache=cache,
85 get_id=_thread_get_id,
86 pack_exception=pack_exception,
87 **kwargs,
88 )
90 # Cleanup pools associated to dead threads
91 with pools_lock:

File /srv/conda/envs/notebook/lib/python3.8/site-packages/dask/local.py:507, in get_async(submit, num_workers, dsk, result, cache, get_id, rerun_exceptions_locally, pack_exception, raise_exception, callbacks, dumps, loads, chunksize, **kwargs)
505 _execute_task(task, data) # Re-execute locally
506 else:
--> 507 raise_exception(exc, tb)
508 res, worker_id = loads(res_info)
509 state["cache"][key] = res

File /srv/conda/envs/notebook/lib/python3.8/site-packages/dask/local.py:315, in reraise(exc, tb)
313 if exc.__traceback__ is not tb:
314 raise exc.with_traceback(tb)
--> 315 raise exc

File /srv/conda/envs/notebook/lib/python3.8/site-packages/dask/local.py:220, in execute_task(key, task_info, dumps, loads, get_id, pack_exception)
218 try:
219 task, data = loads(task_info)
--> 220 result = _execute_task(task, data)
221 id = get_id()
222 result = dumps((result, id))

File /srv/conda/envs/notebook/lib/python3.8/site-packages/dask/core.py:119, in _execute_task(arg, cache, dsk)
115 func, args = arg[0], arg[1:]
116 # Note: Don't assign the subtask results to a variable. numpy detects
117 # temporaries by their reference count and can execute certain
118 # operations in-place.
--> 119 return func(*(_execute_task(a, cache) for a in args))
120 elif not ishashable(arg):
121 return arg

File /srv/conda/envs/notebook/lib/python3.8/site-packages/dask/array/chunk.py:422, in getitem(obj, index)
401 def getitem(obj, index):
402 """Getitem function
403
404 This function creates a copy of the desired selection for array-like
(...)
420
421 """
--> 422 result = obj[index]
423 try:
424 if not result.flags.owndata and obj.size >= 2 * result.size:

TypeError: 'Future' object is not subscriptable

Is there a list of dependencies to install?

I apologize in advance if this information is available in an obvious location, but I couldn't find it: is there a list of dependencies or a specific environment file (for conda)?

I am trying to run the landcover.ipynb on my local machine, but the kernel keeps crashing after attempting to import segmentation_models_pytorch. It would help to know the exact versions of the packages to install.

Read or write failed. IReadBlock failed at X offset 0, Y offset 0: IReadBlock failed at X offset 20, Y offset 12: TIFFReadEncodedTile() failed

I sometimes get the following error when running stackstac.stack. The error does not occur every time I run the same code. For now, I have just implemented a retry mechanism.

data = (
    stackstac.stack(
        items,
        assets=["B02", "B03", "B04", "B08"], 
        chunksize=2304,
        resolution=10,
        epsg=most_common_epsg,
        bounds_latlon=bbox,
    )
    .where(lambda x: x > 0, other=np.nan)  # sentinel-2 uses 0 as nodata
    .assign_coords(band=lambda x: x.common_name.rename("band"))  # use common names
)

Error reading Window(col_off=0, row_off=0, width=1950, height=1950) from 'https://sentinel2l2a01.blob.core.windows.net/sentinel2-l2/11/S/LT/2017/06/13/S2A_MSIL2A_20170613T182921_N0212_R027_T11SLT_20210209T224230.SAFE/GRANULE/L2A_T11SLT_A010320_20170613T183355/IMG_DATA/R10m/T11SLT_20170613T182921_B04_10m.tif?st=2023-10-02T19%3A26%3A13Z&se=2023-10-03T20%3A11%3A13Z&sp=rl&sv=2021-06-08&sr=c&skoid=c85c15d6-d1ae-42d4-af60-e2ca0f81359b&sktid=72f988bf-86f1-41af-91ab-2d7cd011db47&skt=2023-10-03T17%3A11%3A45Z&ske=2023-10-10T17%3A11%3A45Z&sks=b&skv=2021-06-08&sig=uo6P2FvIfahk3On4ADj8egCudxKiGsSaX4j/hmdn2WQ%3D': RasterioIOError('Read or write failed. IReadBlock failed at X offset 0, Y offset 0: IReadBlock failed at X offset 20, Y offset 12: TIFFReadEncodedTile() failed.')

I also sometimes hit other, similar errors that are resolved by retrying:

Error opening 'https://sentinel2l2a01.blob.core.windows.net/sentinel2-l2/10/S/EH/2020/08/22/S2A_MSIL2A_20200822T184921_N0212_R113_T10SEH_20200825T124924.SAFE/GRANULE/L2A_T10SEH_A026994_20200822T190121/IMG_DATA/R10m/T10SEH_20200822T184921_B08_10m.tif?st=2023-10-02T20%3A57%3A02Z&se=2023-10-03T21%3A42%3A02Z&sp=rl&sv=2021-06-08&sr=c&skoid=c85c15d6-d1ae-42d4-af60-e2ca0f81359b&sktid=72f988bf-86f1-41af-91ab-2d7cd011db47&skt=2023-10-03T13%3A31%3A17Z&ske=2023-10-10T13%3A31%3A17Z&sks=b&skv=2021-06-08&sig=7J/j2FJjYw3pdyGyZXFeDbzrJip7fwmmCiDWF2lbbtM%3D': RasterioIOError('HTTP response code: 503')

Error opening 'https://sentinel2l2a01.blob.core.windows.net/sentinel2-l2/11/S/MS/2020/12/19/S2B_MSIL2A_20201219T183759_N0212_R027_T11SMS_20201220T165730.SAFE/GRANULE/L2A_T11SMS_A019787_20201219T184458/IMG_DATA/R10m/T11SMS_20201219T183759_B08_10m.tif?st=2023-10-03T13%3A35%3A38Z&se=2023-10-04T14%3A20%3A38Z&sp=rl&sv=2021-06-08&sr=c&skoid=c85c15d6-d1ae-42d4-af60-e2ca0f81359b&sktid=72f988bf-86f1-41af-91ab-2d7cd011db47&skt=2023-10-04T13%3A24%3A10Z&ske=2023-10-11T13%3A24%3A10Z&sks=b&skv=2021-06-08&sig=FyHq7ke3q%2BmAaVvIScoez1S/foziLy8SX0QKohxlGtY%3D': RasterioIOError("'/vsicurl/https://sentinel2l2a01.blob.core.windows.net/sentinel2-l2/11/S/MS/2020/12/19/S2B_MSIL2A_20201219T183759_N0212_R027_T11SMS_20201220T165730.SAFE/GRANULE/L2A_T11SMS_A019787_20201219T184458/IMG_DATA/R10m/T11SMS_20201219T183759_B08_10m.tif?st=2023-10-03T13%3A35%3A38Z&se=2023-10-04T14%3A20%3A38Z&sp=rl&sv=2021-06-08&sr=c&skoid=c85c15d6-d1ae-42d4-af60-e2ca0f81359b&sktid=72f988bf-86f1-41af-91ab-2d7cd011db47&skt=2023-10-04T13%3A24%3A10Z&ske=2023-10-11T13%3A24%3A10Z&sks=b&skv=2021-06-08&sig=FyHq7ke3q%2BmAaVvIScoez1S/foziLy8SX0QKohxlGtY%3D' not recognized as a supported file format.")
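The retry mechanism mentioned above is not shown in the report. A minimal sketch of one, assuming the failures are transient, might look like the following. The helper name `retry` and its parameters are hypothetical, and it catches any `Exception` by default because the report does not say which exception type reaches the caller.

```python
import time


def retry(func, attempts=3, base_delay=1.0, exceptions=(Exception,), sleep=time.sleep):
    """Call func() until it succeeds or the attempts run out.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... seconds between tries.
    `retry` is an illustrative helper, not part of stackstac or rasterio.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: surface the last error unchanged
            sleep(base_delay * 2 ** (attempt - 1))
```

Because stackstac.stack builds a lazy, Dask-backed array, the read errors typically surface when the data is computed, so the call to wrap would be something like `retry(lambda: data.compute())` rather than the `stackstac.stack(...)` call itself.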

Reading Data from the STAC API tutorial - Conformance Error

I was trying to run the tutorial here https://github.com/microsoft/PlanetaryComputerExamples/blob/main/quickstarts/reading-stac.ipynb, and I received the following error message. I also tried upgrading to the pystac-client v0.2.0b2 version but received the same error.

landcover.ipynb predictions misalignment?

It seems that there might be some misalignment of imagery within the landcover example? Looking at this output, the labels don't appear to be from the same portion of the imagery.

CPLE_OpenFailedError

import pystac_client
import planetary_computer
import odc.stac
from pystac.extensions.eo import EOExtension as eo
catalog = pystac_client.Client.open(
    "https://planetarycomputer.microsoft.com/api/stac/v1",
    modifier=planetary_computer.sign_inplace,
)
bbox_of_interest = [91.77, 22.33, 92.78, 25.349] # lon lat
time_of_interest = "2022-05-01/2022-06-30"
search = catalog.search(
    collections=["landsat-c2-l2"],
    bbox=bbox_of_interest,
    datetime=time_of_interest,
    query={"eo:cloud_cover": {"lt": 1}},
)

items = search.item_collection()
print(f"Returned {len(items)} Items")
selected_item = min(items, key=lambda item: eo.ext(item).cloud_cover)

print(
    f"Choosing {selected_item.id} from {selected_item.datetime.date()}"
    + f" with {selected_item.properties['eo:cloud_cover']}% cloud cover"
)
bands_of_interest = ["nir08", "red", "green", "blue", "qa_pixel", "lwir11"]
data = odc.stac.stac_load(
    [selected_item], bands=bands_of_interest, bbox=bbox_of_interest
).isel(time=0)
data

CPLE_OpenFailedError Traceback (most recent call last)
File rasterio/_base.pyx:302, in rasterio._base.DatasetBase.__init__()

File rasterio/_base.pyx:213, in rasterio._base.open_dataset()

File rasterio/_err.pyx:217, in rasterio._err.exc_wrap_pointer()

CPLE_OpenFailedError: '/vsicurl/https://landsateuwest.blob.core.windows.net/landsat-c2/level-2/standard/oli-tirs/2022/137/043/LC09_L2SP_137043_20221022_20221025_02_T1/LC09_L2SP_137043_20221022_20221025_02_T1_SR_B3.TIF?st=2022-11-15T04%3A35%3A21Z&se=2022-11-23T04%3A35%3A21Z&sp=rl&sv=2021-06-08&sr=c&skoid=c85c15d6-d1ae-42d4-af60-e2ca0f81359b&sktid=72f988bf-86f1-41af-91ab-2d7cd011db47&skt=2022-11-16T04%3A35%3A20Z&ske=2022-11-23T04%3A35%3A20Z&sks=b&skv=2021-06-08&sig=fNaHgF31zqZk2j8ygij66ABhGFKwqYfvsnsaypqdCso%3D' not recognized as a supported file format.

During handling of the above exception, another exception occurred:

RasterioIOError Traceback (most recent call last)
Cell In [94], line 2
1 bands_of_interest = ["nir08", "red", "green", "blue", "qa_pixel", "lwir11"]
----> 2 data = odc.stac.stac_load(
3 [selected_item], bands=bands_of_interest, bbox=bbox_of_interest
4 ).isel(time=0)
5 data

File /srv/conda/envs/notebook/lib/python3.10/site-packages/odc/stac/_load.py:595, in load(items, bands, groupby, resampling, dtype, chunks, pool, crs, resolution, anchor, geobox, bbox, lon, lat, x, y, like, geopolygon, progress, stac_cfg, patch_url, preserve_original_order, **kw)
592 if progress is not None:
593 _work = progress(SizedIterable(_work, total_tasks))
--> 595 for _ in _work:
596 pass
598 return _with_debug_info(ds, tasks=_tasks)

File /srv/conda/envs/notebook/lib/python3.10/site-packages/odc/stac/_utils.py:38, in pmap(func, inputs, pool)
34 """
35 Wrapper for ThreadPoolExecutor.map
36 """
37 if pool is None:
---> 38 yield from map(func, inputs)
39 return
41 if isinstance(pool, int):

File /srv/conda/envs/notebook/lib/python3.10/site-packages/odc/stac/_load.py:586, in load.<locals>._do_one(task)
580 srcs = [
581 src
582 for src in (_parsed[idx].get(band, None) for idx, band in task.srcs)
583 if src is not None
584 ]
585 with rio_env(**_rio_env):
--> 586 _ = _fill_2d_slice(srcs, task.dst_gbox, task.cfg, dst_slice)
587 t, y, x = task.idx_tyx
588 return (task.band, t, y, x)

File /srv/conda/envs/notebook/lib/python3.10/site-packages/odc/stac/_load.py:681, in _fill_2d_slice(srcs, dst_gbox, cfg, dst)
678 return dst
680 src, *rest = srcs
--> 681 _roi, pix = rio_read(src, cfg, dst_gbox, dst=dst)
683 for src in rest:
684 # first valid pixel takes precedence over others
685 _roi, pix = rio_read(src, cfg, dst_gbox)

File /srv/conda/envs/notebook/lib/python3.10/site-packages/odc/stac/_reader.py:185, in rio_read(src, cfg, dst_geobox, dst)
181 # if resampling is nearest then ignore sub-pixel translation when deciding
182 # whether we can just paste source into destination
183 ttol = 0.9 if cfg.nearest else 0.05
--> 185 with rasterio.open(src.uri, "r", sharing=False) as rdr:
186 assert isinstance(rdr, rasterio.DatasetReader)
187 ovr_idx: Optional[int] = None

File /srv/conda/envs/notebook/lib/python3.10/site-packages/rasterio/env.py:442, in ensure_env_with_credentials.<locals>.wrapper(*args, **kwds)
439 session = DummySession()
441 with env_ctor(session=session):
--> 442 return f(*args, **kwds)

File /srv/conda/envs/notebook/lib/python3.10/site-packages/rasterio/__init__.py:277, in open(fp, mode, driver, width, height, count, crs, transform, dtype, nodata, sharing, **kwargs)
274 path = _parse_path(raw_dataset_path)
276 if mode == "r":
--> 277 dataset = DatasetReader(path, driver=driver, sharing=sharing, **kwargs)
278 elif mode == "r+":
279 dataset = get_writer_for_path(path, driver=driver)(
280 path, mode, driver=driver, sharing=sharing, **kwargs
281 )

File rasterio/_base.pyx:304, in rasterio._base.DatasetBase.__init__()

RasterioIOError: '/vsicurl/https://landsateuwest.blob.core.windows.net/landsat-c2/level-2/standard/oli-tirs/2022/137/043/LC09_L2SP_137043_20221022_20221025_02_T1/LC09_L2SP_137043_20221022_20221025_02_T1_SR_B3.TIF?st=2022-11-15T04%3A35%3A21Z&se=2022-11-23T04%3A35%3A21Z&sp=rl&sv=2021-06-08&sr=c&skoid=c85c15d6-d1ae-42d4-af60-e2ca0f81359b&sktid=72f988bf-86f1-41af-91ab-2d7cd011db47&skt=2022-11-16T04%3A35%3A20Z&ske=2022-11-23T04%3A35%3A20Z&sks=b&skv=2021-06-08&sig=fNaHgF31zqZk2j8ygij66ABhGFKwqYfvsnsaypqdCso%3D' not recognized as a supported file format.
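The signed URL in the traceback carries its validity window in the `st` (start) and `se` (expiry) query parameters. One thing worth ruling out, as an assumption rather than a confirmed cause of this error, is that the token had already expired when the read happened. A small sketch using only the standard library to read the expiry from such a URL (the function names `sas_expiry` and `is_expired` are hypothetical):

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def sas_expiry(url):
    """Return the expiry time from a signed URL's `se` parameter, or None if absent."""
    values = parse_qs(urlsplit(url).query).get("se")
    if not values:
        return None
    # parse_qs percent-decodes the value: "2022-11-23T04%3A35%3A21Z" -> "2022-11-23T04:35:21Z"
    parsed = datetime.strptime(values[0], "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def is_expired(url, now=None):
    """Return True if the URL has an `se` parameter that is at or before `now` (UTC)."""
    expiry = sas_expiry(url)
    if expiry is None:
        return False
    now = now or datetime.now(timezone.utc)
    return now >= expiry
```

If `is_expired(url)` is true for the failing asset, searching again (so the items are signed again) before loading would be the first thing to try.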

which version of IMERG?

I suggest that the header at https://github.com/microsoft/PlanetaryComputerExamples/blob/main/datasets/gpm-imerg-hhr/gpm-imerg-hhr-example.ipynb make it clear which version of IMERG is being used (presumably final run).

Landcover.ipynb Exception: TypeError('Could not serialize object of type Tensor...') in

The landcover.ipynb notebook example is amazing. Thanks @TomAugspurger for putting it together!

I'm fairly new to PyTorch and GPUs and am encountering tracebacks in the default environment, perhaps related to version changes.

remote_model = client.scatter(model, broadcast=True)

(abbreviated traceback):

TypeError: Could not serialize object of type Tensor.
Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/protocol/serialize.py", line 340, in serialize
    header, frames = dumps(x, context=context) if wants_context else dumps(x)
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/protocol/serialize.py", line 43, in dask_dumps
    sub_header, frames = dumps(x)
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/protocol/torch.py", line 19, in serialize_torch_Tensor
    sub_header, frames = serialize(t.numpy())
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

Naively, I tried remote_model = client.scatter(model.cpu(), broadcast=True), which runs (though wouldn't that forgo the GPU?), but I then run into the following with predictions[:, :200, :200].compute():

distributed.worker - WARNING - Compute Failed
Function:  execute_task
kwargs:    {}
Exception: "RuntimeError('Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same')"

Data not available in benchmark-tutorial.ipynb

While working through competitions/cloud-cover/benchmark-tutorial.ipynb on a hub instance, I found that the notebook states the data should be available in a volume, but this is not the case:

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Input In [4], in <cell line: 5>()
      2 TRAIN_FEATURES = DATA_DIR / "train_features"
      3 TRAIN_LABELS = DATA_DIR / "train_labels"
----> 5 assert TRAIN_FEATURES.exists()

AssertionError:
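A bare `assert` does not say which path is missing. A slightly more informative check, sketched here with a hypothetical helper `missing_dirs`, lists the expected subdirectories that are absent. The directory names come from the traceback; the location of `DATA_DIR` is not shown in the report, so it is passed in as an argument.

```python
from pathlib import Path


def missing_dirs(data_dir, names=("train_features", "train_labels")):
    """Return the expected subdirectories of data_dir that do not exist."""
    data_dir = Path(data_dir)
    return [data_dir / name for name in names if not (data_dir / name).is_dir()]
```

Calling `missing_dirs(DATA_DIR)` in the notebook returns an empty list when the volume is mounted as expected, and otherwise shows exactly which paths could not be found.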

multiple reads of nasa-nex-gddp-cmip6 dataset from MultiZarrToZarr concatenated metadata returns all nans

Simply running cells 13 and 14 of the notebook twice reproduces the issue. These cells read and plot a single-variable time series for a point: the first run returns valid values, but the second returns all NaNs. I encountered this when parallelizing the file reads with dask, which results in multiple reads and the same unexpected result.

add reviewnb to this repo

May help with notebook PRs

https://www.reviewnb.com/

"ConformanceError: API does not conform"

Hi,

I am new to EO data and have started following your 'Cloudless Mosaic' tutorial to familiarise myself with accessing satellite data through the API. I have pip-installed pystac_client; however, when I run the following command:

stac = pystac_client.Client.open("https://planetarycomputer.microsoft.com/api/stac/v1")

I receive the error:

ConformanceError: API does not conform to {ConformanceClasses.STAC_API_CORE}. Must contain one of the following URIs to conform (preferably the first):
https://api.stacspec.org/v1.0.0-beta.1/core
http://stacspec.org/spec/api/1.0.0-beta.1/core.

Any help or guidance would be greatly appreciated. Thank you.

Ben