stac-utils / stac-fastapi
STAC API implementation with FastAPI.
Home Page: https://stac-utils.github.io/stac-fastapi/
License: MIT License
I would propose moving the landing page method from the sqlalchemy code into the BaseCoreClient and removing the abstractmethod decorator as this code is generic.
When trying to install one namespace package from another, pipenv cannot find the package because it is not available on PyPI. Not a huge deal because we can still install from the setup.py, but it would be nice to keep the lock files up to date.
Another solution is to get pipenv to install from the local source, but I've so far been unsuccessful at doing this.
TiTiler is pinned to 0.1a2, which I think is incompatible with the latest rio-tiler version. We should either pin rio-tiler (<= 2.0.0rc1) or update to TiTiler 0.1.0a12, which has some nice improvements.
Also worth noting that we removed any use of pkg_resources in https://github.com/developmentseed/titiler/blob/master/CHANGES.md#010-alpha7-2020-10-13
You can now replace
https://github.com/arturo-ai/arturo-stac-api/blob/457cf8250b78c1e6e6fd519589b99ccee04eff43/stac_api/api/extensions/tiles.py#L17-L28
with
from titiler.templates import templates
We are building API docs with pdocs, but the GitHub action to rebuild documentation on master builds doesn't trigger on changes to Python files, which means the API docs may be out of date if function signatures are changed, docstrings are updated, or new functionality is added by the PR being merged.
https://github.com/stac-utils/stac-fastapi/blob/master/.github/workflows/deploy_mkdocs.yml#L9-L11
STAC spec defines transactions endpoints for POSTing new data. This is quite slow for bulk data ingest, as these operations are atomic and require a single INSERT and commit for each row. Also this is currently done through sqlalchemy's ORM which is slow for bulk ingest (sqlalchemy core is much faster for this).
It would be great to expose a bulk transactions extension which allows for more efficient ingest of large amounts of items. I imagine this functionality wouldn't be exposed through the API layer, but instead provide a way to load data server-side without having to write a custom script every time.
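To make the cost difference concrete, here is a minimal sketch of chunked bulk loading. It uses the standard library's sqlite3 as a stand-in for the real Postgres/sqlalchemy backend, so the table layout, the bulk_insert_items name and the chunk size are all assumptions for illustration, not the project's actual implementation. The point is only that rows are sent in batches and committed once per batch instead of once per row.

```python
import json
import sqlite3


def bulk_insert_items(conn, items, chunk_size=1000):
    """Insert items in chunks, committing once per chunk rather than once per row.

    Hypothetical helper: the table and column names are assumptions.
    """
    inserted = 0
    for start in range(0, len(items), chunk_size):
        chunk = items[start:start + chunk_size]
        rows = [(item["id"], item["collection"], json.dumps(item)) for item in chunk]
        # The connection context manager wraps the batch in a single transaction.
        with conn:
            conn.executemany(
                "INSERT INTO items (id, collection, content) VALUES (?, ?, ?)",
                rows,
            )
        inserted += len(rows)
    return inserted


# In-memory database standing in for the real backend.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id TEXT PRIMARY KEY, collection TEXT, content TEXT)")

sample = [{"id": f"item-{i}", "collection": "joplin"} for i in range(2500)]
count = bulk_insert_items(conn, sample)
```

With 2500 items and the default chunk size this issues three transactions instead of 2500.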
item['collection']) needs to match the global id
docker-compose up --build to rebuild the underlying containers, e.g. if the dependencies have changed.
I am running into this issue when updating existing items with a PUT request on /collections/{collectionId}/items
Found this in the code where we are dropping geometry before forming update query. Is this an expected behaviour as per STAC spec?
timvt is a good example of this: https://github.com/developmentseed/timvt
It's really useful to calculate the Jaccard score (IoU, intersection over union) when doing spatial queries against really any catalog. The score is 1.0 if the search geometry and item geometry are identical, and falls towards 0.0 as the two geometries overlap less (it is exactly 0.0 when they don't overlap at all). You can make the Jaccard score inclusive by instead using the intersection of the search and item geometry for the calculation. This effectively returns a score of 1.0 if the search geometry is completely contained by an asset.
This is really useful for clients which care about how item geometries compare to the request geometry past the typical intersects/contains operations. A good example of this is mosaic tiling, where the goal is to use some sort of index (whether it is mosaicjson, postgres etc.) to reduce the number of HTTP requests sent to rasters by minimizing the search space. In this case, the tiler can sort the /search response by score and more intelligently send requests to fill that particular tile.
This is one of the downsides of mosaicjson. Because you have to seed it at a particular quadkey (which may or may not align well with your data), the tiler really doesn't have an understanding of what assets cover which tiles at higher zooms than the seed. The result is that the tiler has to naively send tile requests to every asset until the particular tile is full, which could be avoided if the index did a better job of minimizing the search space.
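As a rough illustration of the two scores, here is a sketch that works on axis-aligned bounding boxes (minx, miny, maxx, maxy) so it needs no geometry library. A real implementation would more likely compute this on full geometries in PostGIS or similar. The function names are hypothetical, and the "inclusive" variant is my reading of the description above: intersection area divided by the area of the search geometry.

```python
def bbox_area(bbox):
    """Area of a (minx, miny, maxx, maxy) box; empty or inverted boxes count as 0."""
    minx, miny, maxx, maxy = bbox
    return max(0.0, maxx - minx) * max(0.0, maxy - miny)


def bbox_intersection(a, b):
    """Intersection box; it comes out inverted (zero area) when the boxes are disjoint."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))


def jaccard(search, item):
    """Intersection over union of the two boxes."""
    inter = bbox_area(bbox_intersection(search, item))
    union = bbox_area(search) + bbox_area(item) - inter
    return inter / union if union else 0.0


def inclusive_score(search, item):
    """Share of the search box covered by the item (assumed 'inclusive' variant)."""
    search_area = bbox_area(search)
    inter = bbox_area(bbox_intersection(search, item))
    return inter / search_area if search_area else 0.0


search = (0.0, 0.0, 2.0, 2.0)
items = {
    "identical": (0.0, 0.0, 2.0, 2.0),
    "shifted": (1.0, 0.0, 3.0, 2.0),
    "covering": (-1.0, -1.0, 3.0, 3.0),
    "disjoint": (5.0, 5.0, 6.0, 6.0),
}

# A tiler could request assets in this order to fill a tile with fewer requests.
ranked = sorted(items, key=lambda name: jaccard(search, items[name]), reverse=True)
```

Note that the "covering" item only scores 0.25 on the plain Jaccard score but 1.0 on the inclusive one, which is the case the inclusive variant is meant to reward.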
we define the minimal default fields to return (so pydantic is happy). In some personal use cases I've made (/search) requests where I wanted only the id returned, or the item without the geometry...
Hi all,
First of all, thanks for this great software!
When using "STAC browser" on a catalog created with stac-fastapi I get an error message that normally indicates that CORS headers are not present.
"NetworkError when attempting to fetch resource.
Please note that some servers don't allow external access via web browsers (e.g., when CORS headers are not present).
Errored URL: https://localhost:8081"
Any comment is appreciated! If you point me to a location in the source code I can also try to include it.
StacAPI should accept title and version parameters which should then be used in OpenAPI generation
r = requests.delete('http://localhost:8081/collections/sentinel-s2-l2a')
r.json()
# {'detail': 'collection does not exist'}
But the collection does exist
r = requests.get('http://localhost:8081/collections/sentinel-s2-l2a')
r.json()
{'id': 'sentinel-s2-l2a',
'description': 'Sentinel-2a and Sentinel-2b imagery, processed to Level 2A (Surface Reflectance)',
'stac_version': '1.0.0-beta.2',
setup.py to point to new repo.
This code is not linked to any DB call ;-)
dataclass when we really need to be using attrs. Mostly because attrs is more flexible when it comes to the definition of optional and optionally required attributes.
Originally posted by @geospatial-jeff in #71 (comment)
First off, thanks so much for this superb Python library. I have a small dataset that I would like to make publicly available to everyone using stac-api.
I was trying to activate the OGC and Titiler routes in my STAC FastAPI app, like in this video.
from stac_api.config import ApiSettings
from stac_api.api import create_app
settings = ApiSettings(
add_ons=["tiles"]
)
app = create_app(settings)
But it seems that the API has changed a bit. I tried to solve it by adding TilesExtension to stac_fastapi/server/app.py:
from stac_fastapi.api.app import StacApi
from stac_fastapi.extensions.core import (
FieldsExtension,
QueryExtension,
SortExtension,
TransactionExtension,
)
from stac_fastapi.extensions.third_party import TilesExtension, BaseTilesClient
from stac_fastapi.extensions.third_party import BulkTransactionExtension
from stac_fastapi.sqlalchemy.config import SqlalchemySettings
from stac_fastapi.sqlalchemy.core import CoreCrudClient
from stac_fastapi.sqlalchemy.session import Session
from stac_fastapi.sqlalchemy.transactions import (
BulkTransactionsClient,
TransactionsClient,
)
settings = SqlalchemySettings()
session = Session.create_from_settings(settings)
api = StacApi(
settings=settings,
extensions=[
TilesExtension(client=BaseTilesClient()),
TransactionExtension(client=TransactionsClient(session=session)),
BulkTransactionExtension(client=BulkTransactionsClient(session=session)),
FieldsExtension(),
QueryExtension(),
SortExtension()
],
client=CoreCrudClient(session=session),
)
app = api.app
But it doesn't work. I'm a very basic user, so sorry if this is a silly question, but I would really appreciate any help.
Collection summaries are not stored in the database, but they really should be, to enable collection search for stac-index.
This was written as its own class because, before the release of v1.0.0-beta.1, paging was listed as an API extension. Since then this has been changed and pagination has become a part of core (I think for alignment with OGC), so it makes sense to push the pagination code into core here as well.
The GET /search pagination link is currently returning a POST pagination link
stac_extensions is included in the alembic migrations but not present in the item and collection database models
Hi again :)
Although the file tests/data/joplin/index.geojson has some stac_extensions (eo and proj), they are not imported into the database in the deployment, so they are not shown in the response. I think it may be related to the STAC Pydantic models.
index.geojson
{
"id": "f2cca2a3-288b-4518-8a3e-a4492bb60b08",
"type": "Feature",
"collection": "joplin",
"links": [],
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
-94.6884155,
37.0595608
],
[
-94.6884155,
37.0332547
],
[
-94.6554565,
37.0332547
],
[
-94.6554565,
37.0595608
],
[
-94.6884155,
37.0595608
]
]
]
},
"properties": {
"proj:epsg": 3857,
"orientation": "nadir",
"height": 2500,
"width": 2500,
"datetime": "2000-02-02T00:00:00Z",
"gsd": 0.5971642834779395
},
"assets": {
"COG": {
"type": "image/tiff; application=geotiff; profile=cloud-optimized",
"href": "https://arturo-stac-api-test-data.s3.amazonaws.com/joplin/images/may24C350000e4102500n.tif",
"title": "NOAA STORM COG"
}
},
"bbox": [
-94.6884155,
37.0332547,
-94.6554565,
37.0595608
],
"stac_extensions": [
"eo", <----------- HERE
"proj" <----------- HERE
],
"stac_version": "1.0.0-beta.2"
}, ....
Run:
docker-compose up --build
Local Browser
http://127.0.0.1:8081/collections/joplin/items/29c53e17-d7d1-4394-a80f-36763c8f42dc
https://github.com/stac-utils/stac-fastapi/blob/master/tox.ini#L16 there aren't any more arturo modules
In several instances a urljoin is used with FastAPI's base_url, where a leading / is used during the join. This works if there's no root_path set, but in the case where base_url contains a path prefix, the leading / makes urljoin resolve against the host only, so the root_path is discarded.
Examples:
BaseLinks.root - joining to "/" erases the root_path, should just use str(self.base_url)
The goal of this issue is to find all the instances where a urljoin is used with a leading slash and join to a relative path instead (or avoid the join entirely where the base_url can be used directly).
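The behaviour can be reproduced with the standard library alone. This is a small sketch, assuming a hypothetical base_url that carries a /stac/v1 path prefix and ends with a trailing slash.

```python
from urllib.parse import urljoin

# Hypothetical base URL: the app is mounted under the root_path "/stac/v1".
base_url = "http://localhost:8081/stac/v1/"

# A leading slash resolves against the host, so the prefix is lost.
absolute = urljoin(base_url, "/collections")
# → "http://localhost:8081/collections"

# A relative path keeps the prefix.
relative = urljoin(base_url, "collections")
# → "http://localhost:8081/stac/v1/collections"

# For the root link no join is needed at all.
root = str(base_url)
```

The relative join only keeps the full prefix when the base ends with a slash. Without it, urljoin replaces the last path segment, so "http://localhost:8081/stac/v1" joined with "collections" gives "http://localhost:8081/stac/collections".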
When building the docker-compose stack (docker-compose up) we run a python script which ingests a sample dataset into the database (https://github.com/arturo-ai/arturo-stac-api/blob/master/scripts/ingest_joplin.py). If the stack is built when the database container already exists (maybe from a previous build), the POST request to create a new collection returns a 409 Conflict which causes the ingest script to raise an exception.
This exception is confusing because it isn't really an error, it just implies that the collection is already in the database which is after all the purpose of the script in the first place. I think a good solution is to only raise an exception on 5XX codes.
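A minimal sketch of that rule follows, kept independent of any HTTP client so it only looks at the status code. The IngestError class and the check_ingest_status function are hypothetical names, not part of the existing script.

```python
class IngestError(RuntimeError):
    """Raised only when the server itself failed (hypothetical helper class)."""


def check_ingest_status(status_code, url):
    """Return True if the resource was newly created, False otherwise.

    Only 5XX responses raise; a 409 Conflict just means the resource is
    already in the database, which is the state the script wants anyway.
    """
    if 500 <= status_code < 600:
        raise IngestError(f"Server error {status_code} while posting to {url}")
    return 200 <= status_code < 300


created = check_ingest_status(201, "http://localhost:8081/collections")
already_there = check_ingest_status(409, "http://localhost:8081/collections")
```

In the ingest script this would replace an unconditional raise on any non-2XX response, for example by passing it the status code of each POST response.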
Timestamp without time zone should always be avoided as it is ambiguous as to the timezone represented by the field and can be lossy at the DST transition time.
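The same ambiguity shows up on the Python side with naive datetime objects. The sketch below uses only the standard library, and require_aware is a hypothetical guard rather than an existing function in the codebase.

```python
from datetime import datetime, timezone

naive = datetime(2000, 2, 2, 0, 0, 0)                       # no zone: which instant is this?
aware = datetime(2000, 2, 2, 0, 0, 0, tzinfo=timezone.utc)  # unambiguous instant

naive_text = naive.isoformat()  # "2000-02-02T00:00:00"
aware_text = aware.isoformat()  # "2000-02-02T00:00:00+00:00"


def require_aware(dt):
    """Reject naive datetimes before they reach the database (hypothetical guard)."""
    if dt.tzinfo is None or dt.utcoffset() is None:
        raise ValueError("datetime must carry a time zone")
    return dt.astimezone(timezone.utc)


# Ordering a naive value against an aware one is refused by Python itself.
try:
    naive < aware
    comparable = True
except TypeError:
    comparable = False
```

On the database side the equivalent choice is a timestamp with time zone column, so the stored value always identifies a single instant.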
Stac pydantic 1.3.x supports 1.0.0-beta.2.
With the recent update we've split the module into multiple sub-packages (namespaced). In the current repo architecture those packages are placed at the top level and then dynamically linked in /stac-fastapi.
While the current structure gives a quick overview of all the sub-packages, I'm not sure it's well aligned with the namespace convention.
For asyncpg or sqlalchemy>1.4 we need to be able to specify async def endpoints so the code may be executed by the event loop rather than a background thread. Right now it only supports def (sync) endpoints.
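FastAPI runs def endpoints in a thread pool and async def endpoints directly on the event loop, so any blocking call left inside an async def endpoint stalls every other request. The sketch below uses plain asyncio with hypothetical function names: it shows an async handler that pushes a blocking call into a worker thread, which is the fallback needed until the backend uses a truly async driver.

```python
import asyncio
import time


def blocking_query(item_id):
    """Stand-in for a synchronous database call (hypothetical)."""
    time.sleep(0.05)
    return {"id": item_id}


async def get_item(item_id):
    """Async handler that keeps the event loop free while the query runs."""
    loop = asyncio.get_running_loop()
    # None selects the loop's default thread pool executor.
    return await loop.run_in_executor(None, blocking_query, item_id)


async def main():
    # Both lookups overlap instead of running one after the other.
    return await asyncio.gather(get_item("a"), get_item("b"))


results = asyncio.run(main())
```

With an async driver such as asyncpg, the handler would await the driver call directly and the executor hop would go away.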
Currently there is no good way to use a different data model than what is defined in models/database.py. This is difficult to change in a way that is sustainable long term, but it should be much easier once #57 is resolved.
Running stac-validator against the app produces the following results:
$ stac_validator http://localhost:8081/
[
{
"path": "http://localhost:8081/",
"asset_type": "catalog",
"valid_stac": false,
"error_type": "KeyError",
"error_message": "Key Error: 'id'"
}
]
$ stac_validator http://localhost:8081/collections/joplin
[
{
"path": "http://localhost:8081/collections/joplin",
"asset_type": "collection",
"id": "joplin",
"validated_version": "1.0.0-beta.2",
"valid_stac": true
}
]
$ stac_validator http://localhost:8081/collections/joplin/items/047ab5f0-dce1-4166-a00d-425a3dbefe02
[
{
"path": "http://localhost:8081/collections/joplin/items/047ab5f0-dce1-4166-a00d-425a3dbefe02",
"asset_type": "item",
"id": "047ab5f0-dce1-4166-a00d-425a3dbefe02",
"validated_version": "1.0.0-beta.2",
"valid_stac": true
}
]
Just need to add an id to the landing page.
Makes sense. TilesClient arguably shouldn't subclass CoreCrudClient either. I think composition is a better pattern here.
Originally posted by @geospatial-jeff in #97 (comment)
they are parameterized in different places, but should probably be set to the same thing.
If I make a request to an endpoint for a collection that doesn't exist, I get a 404
In [14]: import requests
In [19]: r = requests.get("https://pct-pqe-staging.westeurope.cloudapp.azure.com/stac/v1/collections/not-a-collection")
In [20]: r.status_code
Out[20]: 404
But if I make a request to that collection's /items I get a 200, and the response includes an empty FeatureCollection.
In [21]: r = requests.get("https://pct-pqe-staging.westeurope.cloudapp.azure.com/stac/v1/collections/not-a-collection/items")
In [22]: r.status_code
Out[22]: 200
In [23]: r.json()
Out[23]:
{'type': 'FeatureCollection',
'features': [],
'links': [],
'context': {'returned': 0, 'matched': 0}}
I wanted to verify that this is the expected behavior. I didn't find anything in the API spec, but I admittedly didn't look too closely.
Reported in TomAugspurger/stac-dask-discussion#1
would also require a sprinkling of async/await syntax, as well as a review of the code to make sure any blocking calls are being run in a separate thread.
There are still some places where the backend is coupled to the API. For example, the sqlalchemy engine and session are created during app startup. This coupling makes it difficult to support additional backends, and forces us to do some hacky things in the code.
A similar treatment was applied to api extensions in #54.
Endpoint factories wrap a callable in a function that can be executed as a FastAPI route. Doing so lets us "decorate" the callable with specific request/response models. Currently we have two factories, one for routes which define the request using a dataclass and one for routes which define it as a pydantic model.
It would be much better to have a single factory instead, which means either (1) using a single request type for all routes or (2) having one factory that understands both dataclass and pydantic models.
It seems that right now only Polygon (POST) and bbox (GET) are supported
https://github.com/arturo-ai/arturo-stac-api/blob/7d4c9572981e935de2521441878f3ffb78f6b9b7/stac_api/clients/postgres/core.py#L321-L327
https://github.com/arturo-ai/arturo-stac-api/blob/master/stac_api/models/schemas.py#L228-L235
The STAC API spec says:
Searches items by performing intersection between their geometry and provided GeoJSON geometry. All GeoJSON geometry types must be supported.
ref:
https://github.com/radiantearth/stac-api-spec/blob/f64a08235cb0ae04dfdb37bd8d6940c3814d057c/item-search/README.md#query-parameter-table
The /collections route according to the API spec returns an object with collections and links keys --
The deployed version of the API that I saw this on was based on this branch so I don't know if this is also true on master. If not I won't be offended by a quick close.
The URL structure for the STAC API makes it clear that collection IDs are potentially required to access items by their ID. A small change to the ItemUri model should do the trick: https://github.com/stac-utils/stac-fastapi/blob/master/stac_fastapi/api/stac_fastapi/api/models.py#L66
I kept getting
stac-api | sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) FATAL: role "username" does not exist
when running docker-compose up.
It turned out (with help from this SO answer) that I had a global system Postgres on my Mac that was also running on port 5432. So the Postgres in Docker was being hidden by the system Postgres. When I shut down the system Postgres, docker-compose up worked.
The migration command in docker-compose.yml is:
https://github.com/arturo-ai/arturo-stac-api/blob/4f0ba30a2300fc3273aca83ed6e118def2529b75/docker-compose.yml#L48-L49
It looks like the sleep 10 is just to make sure the app service is deployed first? Could you just add
depends_on:
- database
- app
to the migration config? Or would that not work?
With the default settings, a query to the root catalog doesn't include stac_version. I believe that the spec says it should be included: https://github.com/radiantearth/stac-api-spec/tree/master/core
$ docker-compose build
$ docker-compose up
$ curl --silent http://localhost:8081 | jq .stac_version
null
I think this is because the landing page sets response_model_exclude_unset=True. If I make this change
diff --git a/stac_fastapi_api/stac_fastapi/api/app.py b/stac_fastapi_api/stac_fastapi/api/app.py
index 56b9493..a336900 100644
--- a/stac_fastapi_api/stac_fastapi/api/app.py
+++ b/stac_fastapi_api/stac_fastapi/api/app.py
@@ -99,7 +99,7 @@ class StacApi:
name="Landing Page",
path="/",
response_model=LandingPage,
- response_model_exclude_unset=True,
+ response_model_exclude_unset=False,
response_model_exclude_none=True,
methods=["GET"],
endpoint=create_endpoint_with_depends(
then we're able to get the STAC version
$ docker-compose build
$ docker-compose up
$ curl --silent http://localhost:8081 | jq .stac_version
"1.0.0-beta.2"
Does that seem like the right fix, or will it have unintended consequences? Are there other places we should look at?
According to the gitter conversation here, the Landing Page should have a link with rel=data to the collections endpoint.
The ingest_joplin.py script is looking for data in the wrong place: https://github.com/stac-utils/stac-fastapi/blob/master/scripts/ingest_joplin.py#L12. The location of the test data was changed when tests were moved into the sqlalchemy backend.
like titiler? https://github.com/developmentseed/titiler
exclude_unset here should be False so that default fields (e.g. stac_version) are included in the output JSON