stac-fastapi's Issues

relock pipfiles, maybe after pypi release

When trying to install one namespace package from another, pipenv cannot find the package because it is not available on PyPI. Not a huge deal because we can still install from the setup.py, but it would be nice to keep the lock files up to date.

Another solution is to get pipenv to install from the local source, but I've so far been unsuccessful at doing this.

Update TiTiler (or pin rio-tiler)

TiTiler is pinned to 0.1a2, which I think is incompatible with the latest rio-tiler version. We should either pin rio-tiler (<= 2.0.0rc1) or update to TiTiler 0.1.0a12, which has some nice improvements.

Also worth noting that we removed any use of pkg_resources in https://github.com/developmentseed/titiler/blob/master/CHANGES.md#010-alpha7-2020-10-13

You can now replace
https://github.com/arturo-ai/arturo-stac-api/blob/457cf8250b78c1e6e6fd519589b99ccee04eff43/stac_api/api/extensions/tiles.py#L17-L28

with

from titiler.templates import templates

bulk transactions

STAC spec defines transactions endpoints for POSTing new data. This is quite slow for bulk data ingest, as these operations are atomic and require a single INSERT and commit for each row. Also this is currently done through sqlalchemy's ORM which is slow for bulk ingest (sqlalchemy core is much faster for this).

It would be great to expose a bulk transactions extension which allows for more efficient ingest of large amounts of items. I imagine this functionality wouldn't be exposed through the API layer, but instead provide a way to load data server-side without having to write a custom script every time.

Ref https://docs.sqlalchemy.org/en/13/faq/performance.html#i-m-inserting-400-000-rows-with-the-orm-and-it-s-really-slow
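
The cost difference described in this issue can be illustrated with a minimal sketch. It uses Python's standard-library sqlite3 module rather than sqlalchemy, purely to keep the example self-contained. The `items` table and all function names are hypothetical and not part of stac-fastapi. The first function issues one INSERT and one commit per row; the second sends all rows in a single `executemany` call and commits once.

```python
import sqlite3


def make_db():
    # In-memory database with a hypothetical "items" table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id TEXT PRIMARY KEY, collection TEXT)")
    return conn


def insert_one_by_one(conn, rows):
    # One INSERT and one commit per row (slow for large ingests).
    for row in rows:
        conn.execute("INSERT INTO items (id, collection) VALUES (?, ?)", row)
        conn.commit()


def insert_bulk(conn, rows):
    # All rows in a single executemany call, followed by a single commit.
    conn.executemany("INSERT INTO items (id, collection) VALUES (?, ?)", rows)
    conn.commit()


def count_items(conn):
    return conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]


rows = [(f"item-{i}", "joplin") for i in range(1000)]

slow_conn = make_db()
insert_one_by_one(slow_conn, rows)

fast_conn = make_db()
insert_bulk(fast_conn, rows)

print(count_items(slow_conn), count_items(fast_conn))
```

Both paths end with the same rows in the table; the difference is only in how many statements and commits are issued along the way.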

Docs updates

  • When inserting records, the collection id in each item (item['collection']) needs to match the global id
  • If you pull a new version, you may need to use docker-compose up --build to rebuild the underlying containers, e.g. if the dependencies have changed.
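
The first point could be checked before ingest with a small helper. This is only a sketch; `check_item_collection` is a hypothetical name, not a stac-fastapi function.

```python
def check_item_collection(item, collection_id):
    """Raise ValueError if the item's collection does not match collection_id."""
    found = item.get("collection")
    if found != collection_id:
        raise ValueError(
            f"item {item.get('id')!r} has collection {found!r}, "
            f"expected {collection_id!r}"
        )
    return item


item = {"id": "example-item", "collection": "joplin"}
check_item_collection(item, "joplin")  # passes silently
```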

PUT request for item doesn't update the geometry

I am running into this issue when updating existing items with a PUT request on /collections/{collectionId}/items

I found this in the code, where we drop the geometry before forming the update query. Is this expected behaviour per the STAC spec?

jaccards API extension

It's really useful to calculate the Jaccard score (IOU) when doing spatial queries against really any catalog. The score is 1.0 if the search geometry and item geometry are identical, and 0.0 if the two geometries do not overlap at all. You can make the Jaccard score inclusive by instead using the intersection of the search and item geometry for the calculation. This effectively returns a score of 1.0 if the search geometry is completely contained by an asset.

This is really useful for clients which care about how item geometries compare to the request geometry past the typical intersects/contains operations. A good example of this is mosaic tiling, where the goal is to use some sort of index (whether it is mosaicjson, postgres etc.) to reduce the number of HTTP requests sent to rasters by minimizing the search space. In this case, the tiler can sort the /search response by score and more intelligently send requests to fill that particular tile.

This is one of the downsides of mosaicjson. Because you have to seed it at a particular quadkey (which may or may not align well with your data), the tiler really doesn't have an understanding of which assets cover which tiles at higher zooms than the seed. The result is that the tiler has to naively send tile requests to every asset until the particular tile is full, which could be avoided if the index did a better job of minimizing the search space.
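
As a rough illustration of the scoring idea, here is a minimal sketch on axis-aligned bounding boxes in plain Python. Real item geometries are arbitrary polygons and would need a geometry library or PostGIS; the function names are hypothetical. The "inclusive" variant is written under one reading of the description above, namely intersection area divided by the area of the search geometry.

```python
def bbox_area(bbox):
    # bbox is (min_x, min_y, max_x, max_y); inverted boxes count as zero area.
    min_x, min_y, max_x, max_y = bbox
    return max(0.0, max_x - min_x) * max(0.0, max_y - min_y)


def bbox_intersection_area(a, b):
    # Area of the overlap between two boxes (0.0 if they do not overlap).
    overlap = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    return bbox_area(overlap)


def jaccard(search, item):
    # Intersection over union of the two boxes.
    inter = bbox_intersection_area(search, item)
    union = bbox_area(search) + bbox_area(item) - inter
    return inter / union if union > 0 else 0.0


def inclusive_score(search, item):
    # Assumed reading of the "inclusive" variant: the fraction of the
    # search geometry that is covered by the item.
    search_area = bbox_area(search)
    if search_area == 0:
        return 0.0
    return bbox_intersection_area(search, item) / search_area


search = (0.0, 0.0, 2.0, 2.0)
items = {
    "identical": (0.0, 0.0, 2.0, 2.0),
    "large": (-1.0, -1.0, 3.0, 3.0),
    "partial": (1.0, 0.0, 3.0, 2.0),
    "disjoint": (5.0, 5.0, 6.0, 6.0),
}

# Sort item ids by Jaccard score, best match first.
ranked = sorted(items, key=lambda name: jaccard(search, items[name]), reverse=True)
print(ranked)
```

A tiler could use such a ranking to request the best-covering items first instead of querying every candidate.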

stac-fastapi does not provide CORS header (Cross-Origin Resource Sharing)

Hi all,

First of all, thanks for this great software!

When using "STAC Browser" on a catalog created with stac-fastapi, I get an error message that normally indicates that CORS headers are not present.

"NetworkError when attempting to fetch resource.
Please note that some servers don't allow external access via web browsers (e.g., when CORS headers are not present).
Errored URL: https://localhost:8081"

Any comment is appreciated! If you point me to a location in the source code I can also try to include it.

Delete collection fails with 'collection does not exist'

import requests

r = requests.delete('http://localhost:8081/collections/sentinel-s2-l2a')
r.json()
# {'detail': 'collection does not exist'}

But the collection does exist

r = requests.get('http://localhost:8081/collections/sentinel-s2-l2a')
r.json()
{'id': 'sentinel-s2-l2a',
 'description': 'Sentinel-2a and Sentinel-2b imagery, processed to Level 2A (Surface Reflectance)',
 'stac_version': '1.0.0-beta.2',

switch to stac-utils

  • Update setup.py to point to new repo.
  • Do a pypi release to update pypi metadata.
  • Mention arturo in the readme

switch to attrs

  1. We are using dataclass when we really need to be using attrs. Mostly because attrs is more flexible when it comes to the definition of optional and optionally required attributes.

Originally posted by @geospatial-jeff in #71 (comment)

Help to activate TiTiler routes in the new stac-fastapi version

First off, thanks so much for this superb Python library. I have a small dataset that I would like to make publicly available to everyone using stac-api.

I was trying to activate the OGC and Titiler routes in my STAC FastAPI app, like in this video.

from stac_api.config import ApiSettings
from stac_api.api import create_app

settings = ApiSettings(
    add_ons=["tiles"]
)
app = create_app(settings)

But it seems that the API has changed a bit. I tried to solve it by adding TilesExtension to stac_fastapi/server/app.py:

from stac_fastapi.api.app import StacApi
from stac_fastapi.extensions.core import (
    FieldsExtension,
    QueryExtension,
    SortExtension,
    TransactionExtension,
)
from stac_fastapi.extensions.third_party import TilesExtension, BaseTilesClient
from stac_fastapi.extensions.third_party import BulkTransactionExtension
from stac_fastapi.sqlalchemy.config import SqlalchemySettings
from stac_fastapi.sqlalchemy.core import CoreCrudClient
from stac_fastapi.sqlalchemy.session import Session
from stac_fastapi.sqlalchemy.transactions import (
    BulkTransactionsClient,
    TransactionsClient,
)

settings = SqlalchemySettings()
session = Session.create_from_settings(settings)
api = StacApi(
    settings=settings,
    extensions=[
        TilesExtension(client=BaseTilesClient()),
        TransactionExtension(client=TransactionsClient(session=session)),
        BulkTransactionExtension(client=BulkTransactionsClient(session=session)),
        FieldsExtension(),
        QueryExtension(),
        SortExtension()
    ],
    client=CoreCrudClient(session=session),
)
app = api.app

But it doesn't work. I'm a very basic user, so sorry if this is a silly question, but I would really appreciate any help.

Support collection summaries

Collection summaries are not stored in the database, but they really should be, to enable collection search for stac-index.

remove `PaginationClient`, support paging in core

This was written as its own class because, before the release of v1.0.0-beta.1, paging was listed as an API extension. Since then this has changed and pagination has become part of core (I think for alignment with OGC), so it makes sense to push the pagination code into core here as well.

items does not display the stac_extensions after deployment

Hi again :)

Although the file tests/data/joplin/index.geojson has some stac_extensions (eo and proj), they are not imported into the database during deployment, so they are not shown in the end. I think it may be related to the STAC Pydantic models.

index.geojson

        {
            "id": "f2cca2a3-288b-4518-8a3e-a4492bb60b08",
            "type": "Feature",
            "collection": "joplin",
            "links": [],
            "geometry": {
                "type": "Polygon",
                "coordinates": [
                    [
                        [
                            -94.6884155,
                            37.0595608
                        ],
                        [
                            -94.6884155,
                            37.0332547
                        ],
                        [
                            -94.6554565,
                            37.0332547
                        ],
                        [
                            -94.6554565,
                            37.0595608
                        ],
                        [
                            -94.6884155,
                            37.0595608
                        ]
                    ]
                ]
            },
            "properties": {
                "proj:epsg": 3857,
                "orientation": "nadir",
                "height": 2500,
                "width": 2500,
                "datetime": "2000-02-02T00:00:00Z",
                "gsd": 0.5971642834779395
            },
            "assets": {
                "COG": {
                    "type": "image/tiff; application=geotiff; profile=cloud-optimized",
                    "href": "https://arturo-stac-api-test-data.s3.amazonaws.com/joplin/images/may24C350000e4102500n.tif",
                    "title": "NOAA STORM COG"
                }
            },
            "bbox": [
                -94.6884155,
                37.0332547,
                -94.6554565,
                37.0595608
            ],
            "stac_extensions": [
                "eo", <----------- HERE
                "proj" <----------- HERE
            ],
            "stac_version": "1.0.0-beta.2"
        }, ....

Run:

docker-compose up --build

Local Browser

http://127.0.0.1:8081/collections/joplin/items/29c53e17-d7d1-4394-a80f-36763c8f42dc

image

item schema

Ensure urljoins with base_url use relative paths

In several instances a urljoin is used with FastAPI's base_url, where a leading / is used during the join. This works if there's no root_path set, but when base_url contains a path prefix, the leading / makes the join resolve against the host only and disregards the root_path.

Examples:
BaseLinks.root - joining to "/" erases the root_path, should just use str(self.base_url)

The goal of this issue is to find all the instances where a urljoin is used with a leading slash, and to join to a relative path instead (or avoid the join where the base_url can be used directly).
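
The behaviour can be reproduced with the standard library alone. In this small sketch the `base_url` value and the `/stac/v1` prefix are made up for illustration; only `urllib.parse.urljoin` itself is real.

```python
from urllib.parse import urljoin

# Hypothetical base_url for an app served under the root_path "/stac/v1".
base_url = "http://localhost:8081/stac/v1/"

# A leading slash resolves against the host and discards the path prefix.
absolute = urljoin(base_url, "/collections")

# A relative path is resolved underneath the prefix.
relative = urljoin(base_url, "collections")

print(absolute)  # http://localhost:8081/collections
print(relative)  # http://localhost:8081/stac/v1/collections
```

Note that the relative form depends on the base ending in a slash; without it, urljoin replaces the last path segment of the base.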

Ignore conflict errors in ingest script

When building the docker-compose stack (docker-compose up) we run a python script which ingests a sample dataset into the database (https://github.com/arturo-ai/arturo-stac-api/blob/master/scripts/ingest_joplin.py). If the stack is built when the database container already exists (maybe from a previous build), the POST request to create a new collection returns a 409 Conflict which causes the ingest script to raise an exception.

This exception is confusing because it isn't really an error, it just implies that the collection is already in the database which is after all the purpose of the script in the first place. I think a good solution is to only raise an exception on 5XX codes.
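
A minimal sketch of the proposed rule, kept independent of any HTTP library. The function names and return strings are hypothetical and are not taken from the ingest script.

```python
def should_raise(status_code):
    # Only server errors (5XX) are treated as fatal.
    return 500 <= status_code <= 599


def describe_response(status_code):
    # Raise on 5XX; otherwise report what happened and let the script continue.
    if should_raise(status_code):
        raise RuntimeError(f"ingest request failed with status {status_code}")
    if status_code == 409:
        return "already exists, skipping"
    if 200 <= status_code <= 299:
        return "created"
    return "ignored"


print(describe_response(201))
print(describe_response(409))
```

One consequence of this rule is that other 4XX responses, such as validation errors, would also pass without an exception, so the script may still want to log them.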

[needs discussion] follow namespace package convention?

With the recent update we've split the module into multiple sub-packages (namespaced). In the current repo architecture those packages are placed at the top level and then dynamically linked in /stac-fastapi.

While the current structure gives a quick overview of all the sub-packages I'm not sure it's well aligned with the namespace convention.

cc @geospatial-jeff @kylebarron

support custom data models

Currently there is no good way to use a different data model than the one defined in models/database.py. This is difficult to change in a way that is sustainable long term, but should be much easier once #57 is resolved.

stac validator

Running stac-validator against the app produces the following results:

Landing Page

$ stac_validator http://localhost:8081/

[
    {
        "path": "http://localhost:8081/",
        "asset_type": "catalog",
        "valid_stac": false,
        "error_type": "KeyError",
        "error_message": "Key Error: 'id'"
    }
]

Collection

$ stac_validator http://localhost:8081/collections/joplin

[
    {
        "path": "http://localhost:8081/collections/joplin",
        "asset_type": "collection",
        "id": "joplin",
        "validated_version": "1.0.0-beta.2",
        "valid_stac": true
    }
]

Item

$ stac_validator http://localhost:8081/collections/joplin/items/047ab5f0-dce1-4166-a00d-425a3dbefe02
[
    {
        "path": "http://localhost:8081/collections/joplin/items/047ab5f0-dce1-4166-a00d-425a3dbefe02",
        "asset_type": "item",
        "id": "047ab5f0-dce1-4166-a00d-425a3dbefe02",
        "validated_version": "1.0.0-beta.2",
        "valid_stac": true
    }
]

Just need to add an id to the landing page.

Request to /collections/<missing>/items returns 200

If I make a request to an endpoint for a collection that doesn't exist, I get a 404

In [14]: import requests

In [19]: r = requests.get("https://pct-pqe-staging.westeurope.cloudapp.azure.com/stac/v1/collections/not-a-collection")

In [20]: r.status_code
Out[20]: 404

But if I make a request to that collection's /items I get a 200, and the response includes an empty FeatureCollection.

In [21]: r = requests.get("https://pct-pqe-staging.westeurope.cloudapp.azure.com/stac/v1/collections/not-a-collection/items")

In [22]: r.status_code
Out[22]: 200

In [23]: r.json()
Out[23]:
{'type': 'FeatureCollection',
 'features': [],
 'links': [],
 'context': {'returned': 0, 'matched': 0}}

I wanted to verify that this is the expected behavior. I didn't find anything in the API spec, but I admittedly didn't look too closely.

Reported in TomAugspurger/stac-dask-discussion#1

decouple backends from api layer

There are still some places where the backend is coupled to the API. For example, the sqlalchemy engine and session are created during app startup. This coupling makes it difficult to support additional backends, and forces us to do some hacky things in the code.

A similar treatment was applied to api extensions in #54.

use a single endpoint factory

https://github.com/arturo-ai/arturo-stac-api/blob/fb47dedfbc45df4488f7fa169b76ca1b30a420f1/stac_api/api/routes.py

Endpoint factories wrap a callable in a function that can be executed as a FastAPI route. Doing so lets us "decorate" the callable with specific request/response models. Currently we have two factories: one for routes that define the request using a dataclass, and one for routes that define it using a pydantic model.

  • Dataclasses are used because of their support for dependency injection.
  • Pydantic models are used for static types (no dependency injection).

It would be much better to have a single factory instead, which means either (1) using a single request type for all routes or (2) writing one factory that can understand both dataclasses and pydantic models.

allow more geometry types for search

Add note to README that `docker-compose up` won't work when other postgres is running

I kept getting

stac-api     | sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) FATAL:  role "username" does not exist

when running docker-compose up.

It turned out (with help from this SO answer) that I had a global system Postgres on my Mac that was also running on port 5432. So the Postgres in Docker was being hidden by the system Postgres. When I shut down the system postgres, docker-compose up worked.
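
A quick way to spot this situation before starting the stack is to check whether anything is already listening on the port. This is a minimal sketch using only the standard library; `port_in_use` is a hypothetical helper, and it only shows that something accepts connections on the port, not that it is Postgres.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=1.0):
    # connect_ex returns 0 when something accepts the TCP connection.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0


if port_in_use(5432):
    print("Port 5432 is already in use; another Postgres may be running.")
else:
    print("Port 5432 is free.")
```

Run it while the docker-compose stack is stopped; otherwise the containerised Postgres itself will show up as the listener.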

Default settings exclude stac_version from the Landing page.

With the default settings, a query to the root catalog doesn't include stac_version. I believe that the spec says it should be included: https://github.com/radiantearth/stac-api-spec/tree/master/core

$ docker-compose build
$ docker-compose up
$ curl --silent http://localhost:8081 | jq .stac_version
null

I think this is because the landing page sets response_model_exclude_unset=True. If I make this change

diff --git a/stac_fastapi_api/stac_fastapi/api/app.py b/stac_fastapi_api/stac_fastapi/api/app.py
index 56b9493..a336900 100644
--- a/stac_fastapi_api/stac_fastapi/api/app.py
+++ b/stac_fastapi_api/stac_fastapi/api/app.py
@@ -99,7 +99,7 @@ class StacApi:
             name="Landing Page",
             path="/",
             response_model=LandingPage,
-            response_model_exclude_unset=True,
+            response_model_exclude_unset=False,
             response_model_exclude_none=True,
             methods=["GET"],
             endpoint=create_endpoint_with_depends(

then we're able to get the STAC version

$ docker-compose build
$ docker-compose up
$ curl --silent http://localhost:8081 | jq .stac_version
"1.0.0-beta.2"

Does that seem like the right fix, or will it have unintended consequences? Are there other places we should look at?
