Comments (17)

StephenPCG avatar StephenPCG commented on September 15, 2024

Sure, I will create a PR tomorrow :)

from spiff-arena.

burnettk avatar burnettk commented on September 15, 2024

i guess you modified the docker compose file at spiff-arena/docker-compose.yml for postgres? here's another version that is known to work with postgres: https://github.com/sartography/arena-compose-postgres/. does it work for you? we're happy to add configuration options if they are needed, but we've seen postgres work without this option.

StephenPCG avatar StephenPCG commented on September 15, 2024

I didn't notice that there was a docker compose file for postgres.

I'm actually deploying arena in k8s. I translated the docker compose file into k8s resources (deployment, configmap, etc.) and provided an entrypoint that installs libpq5 on container startup (so I don't have to build a custom image and push it to a registry).

Here is the entrypoint:

#!/bin/bash
set -e

apt-get update
DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends libpq5

exec /app/bin/boot_server_in_docker

Here are the environment variables set on the spiffworkflow-backend container:

  SPIFFWORKFLOW_BACKEND_ENV: production # NOT SURE what values are allowed here? 
  FLASK_DEBUG: "0"
  FLASK_SESSION_SECRET_KEY: "[REDACTED]"
  SPIFFWORKFLOW_BACKEND_URL: "[REDACTED]"
  SPIFFWORKFLOW_BACKEND_BPMN_SPEC_ABSOLUTE_DIR: /app/process_models
  SPIFFWORKFLOW_BACKEND_CONNECTOR_PROXY_URL: http://spiffworkflow-connector
  SPIFFWORKFLOW_BACKEND_DATABASE_TYPE: postgres
  SPIFFWORKFLOW_BACKEND_DATABASE_URI: postgresql://USER:PASS@HOST/DB
  SPIFFWORKFLOW_BACKEND_LOAD_FIXTURE_DATA: "false"
  SPIFFWORKFLOW_BACKEND_LOG_LEVEL: "debug"
  SPIFFWORKFLOW_BACKEND_OPEN_ID_CLIENT_ID: "[REDACTED]"
  SPIFFWORKFLOW_BACKEND_OPEN_ID_CLIENT_SECRET_KEY: "[REDACTED]"
  SPIFFWORKFLOW_BACKEND_OPEN_ID_SERVER_URL: "[REDACTED]" # I have a local deployment of dex: https://github.com/dexidp/dex
  #SPIFFWORKFLOW_BACKEND_PERMISSIONS_FILE_NAME: example.yml
  SPIFFWORKFLOW_BACKEND_PERMISSIONS_FILE_ABSOLUTE_PATH: /app/permissions.yaml
  SPIFFWORKFLOW_BACKEND_PORT: "8000"
  SPIFFWORKFLOW_BACKEND_RUN_BACKGROUND_SCHEDULER_IN_CREATE_APP: "true"
  SPIFFWORKFLOW_BACKEND_UPGRADE_DB: "true"
  SPIFFWORKFLOW_BACKEND_URL_FOR_FRONTEND: "[REDACTED]"
  FORWARDED_ALLOW_IPS: "*"

I have examined arena-compose-postgres, and there seems to be no difference in the env variable settings, except that it builds an image with libpq-dev (I think libpq5 is enough; libpq-dev is for compiling software that depends on libpq, e.g. when installing psycopg2).

I will try arena-compose-postgres locally and report back later (since the problem happens irregularly, I have to wait some time to see whether it occurs).

StephenPCG avatar StephenPCG commented on September 15, 2024

BTW, the documentation says the backend can be separated into three deployments (API, Background, and Celery Worker), but I didn't find any hints on how to deploy it like this.

I searched the scripts in spiffworkflow-backend/bin/ and found start_celery_worker, which I think starts the Celery Worker, but I didn't find any script to start Backend. Could you please provide some instructions?

StephenPCG avatar StephenPCG commented on September 15, 2024

There is another problem. I'm using dex as the OpenID Connect provider. It works fine, but after some time (usually overnight) login fails, and the /v1.0/login_return endpoint returns:

{
    "error_code": "invalid_token",
    "message": "Cannot decode token.",
    "status_code": 401
}

If I restart the spiffworkflow-backend container, it works again. I don't know the details of the OpenID auth flow and am not sure how to debug this problem. Can you provide some instructions? Is it caused by misconfiguration?

burnettk avatar burnettk commented on September 15, 2024

ok, will be interested to see if the arena compose postgres replicates the issue.

here's a command to start the background container, aka apscheduler: ["./bin/start_blocking_apscheduler"]

when you get Cannot decode token, the logs in the API container might be interesting. that's interesting that bouncing the spiff container fixes it.

StephenPCG avatar StephenPCG commented on September 15, 2024

I searched the source code; the error should be returned from here:

def _get_decoded_token(token: str) -> dict:
    try:
        decoded_token: dict = AuthenticationService.parse_jwt_token(_get_authentication_identifier_from_request(), token)
    except Exception as e:
        current_app.logger.warning(f"Received exception when attempting to decode token: {e.__class__.__name__}: {str(e)}")
        AuthenticationService.set_user_has_logged_out()
        raise ApiError(error_code="invalid_token", message="Cannot decode token.", status_code=401) from e

        ...

Then I searched the log and found this:

{"level": "WARNING", "message": "Received exception when attempting to decode token: StopIteration: ", "loggerName": "spiffworkflow_backend", "processName": "MainProcess", "processID": 149, "threadName": "ThreadPoolExecutor-1_1", "threadID": 139777974761152, "timestamp": "2024-07-09T10:35:14.537Z"}
{"level": "WARNING", "message": "Received exception: ApiError: Cannot decode token.. . Since we do not want this particular exception in sentry, we cannot use logger.exception or logger.error, so there will be no backtrace. see api_error.py", "loggerName": "spiffworkflow_backend", "processName": "MainProcess", "processID": 149, "threadName": "ThreadPoolExecutor-1_1", "threadID": 139777974761152, "timestamp": "2024-07-09T10:35:14.537Z"}

The StopIteration exception should be thrown from parse_jwt_token, but I have no idea how it can be raised. It is usually used inside iteration and almost never escapes it.

I changed that logger.warning to logger.error(..., stack_info=True, exc_info=True), and will check whether anything interesting is logged the next time it happens.

StephenPCG avatar StephenPCG commented on September 15, 2024

I believe the first problem, psycopg2.OperationalError, also affects arena-compose-postgres.

I didn't wait for the error to happen; I just used psql to kill the connections from the db server side:

SELECT pg_terminate_backend(pg_stat_activity.pid) FROM pg_stat_activity WHERE pg_stat_activity.datname = 'spiffworkflow' AND pid <> pg_backend_pid();

Then I visited the arena frontend, and the error occurred.

In contrast, I made another deployment, modified /app/src/spiffworkflow_backend/config/__init__.py, and manually added the pool_pre_ping param:

app.config["SQLALCHEMY_ENGINE_OPTIONS"]["pool_pre_ping"] = True

Then I killed the connections from the db side again, and everything worked fine.

SQLAlchemy maintains a connection pool. Whenever a connection is required, SQLAlchemy checks one out from the pool. If that connection was terminated for some reason (usually an idle timeout, which I think is not specific to postgres; mysql has an 8 hour idle timeout by default), an error occurs. If pool_pre_ping is configured, SQLAlchemy issues SELECT 1 on every checkout to check that the connection is healthy, and checks out another one if it is not.

So I believe pool_pre_ping is a must-have option, especially for low-traffic sites.
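
To make the checkout behaviour described above concrete, here is a minimal sketch of a pool with and without a pre-ping. It is a toy written for this thread, not SQLAlchemy's implementation: TinyPool, kill_idle and query_after_kill are hypothetical names, and sqlite3 stands in for postgres so the example only needs the standard library.

```python
import sqlite3
from collections import deque


class TinyPool:
    """Toy connection pool (hypothetical; NOT how SQLAlchemy is implemented)."""

    def __init__(self, size: int, pre_ping: bool) -> None:
        self.pre_ping = pre_ping
        self._idle = deque(self._connect() for _ in range(size))

    @staticmethod
    def _connect() -> sqlite3.Connection:
        # sqlite3 is only a stand-in for a real postgres connection.
        return sqlite3.connect(":memory:")

    def checkout(self) -> sqlite3.Connection:
        conn = self._idle.popleft() if self._idle else self._connect()
        if self.pre_ping:
            try:
                conn.execute("SELECT 1")  # cheap liveness probe on checkout
            except sqlite3.Error:
                conn = self._connect()  # drop the dead connection, open a new one
        return conn

    def kill_idle(self) -> None:
        """Simulate the server terminating every idle connection."""
        for conn in self._idle:
            conn.close()


def query_after_kill(pre_ping: bool) -> str:
    """Run one query after all pooled connections were killed."""
    pool = TinyPool(size=2, pre_ping=pre_ping)
    pool.kill_idle()
    conn = pool.checkout()
    try:
        conn.execute("SELECT 1").fetchone()
        return "ok"
    except sqlite3.Error as exc:
        return type(exc).__name__
```

With pre_ping=True the stale connection is replaced before the caller sees it, so query_after_kill(True) returns "ok". With pre_ping=False the caller gets the dead connection and the query fails, which mirrors the psycopg2.OperationalError seen after the connections were killed.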

burnettk avatar burnettk commented on September 15, 2024

I’m convinced, thank you for the research and doing the experiment. If you want to add the config option (maybe to default.py), we’d gladly accept a PR.

StephenPCG avatar StephenPCG commented on September 15, 2024

I configured a local sentry and caught the StopIteration exception:
[image: sentry capture of the StopIteration exception]

key_id does not exist in any of jwks_configs["keys"], so next() raised StopIteration.
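
A small sketch of why a failed lookup surfaces as StopIteration: next() on an exhausted generator raises it unless a default is passed. The key list and the two helper names below are made up for illustration and are not part of spiffworkflow-backend.

```python
# Hypothetical cached key set; the real one comes from the provider's JWKS endpoint.
cached_keys = [{"kid": "new-1"}, {"kid": "new-2"}]


def find_key_strict(keys: list, key_id: str) -> dict:
    # No default: raises StopIteration when no key matches.
    return next(jk for jk in keys if jk["kid"] == key_id)


def find_key_or_none(keys: list, key_id: str):
    # With a default: returns None instead of raising.
    return next((jk for jk in keys if jk["kid"] == key_id), None)
```

find_key_strict(cached_keys, "rotated-out") raises StopIteration, while find_key_or_none returns None for the same input, which leaves room to raise a more descriptive error or refresh the cache.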

I'm really not familiar with openid authentication internals, so I cannot understand the problem. I just tried to follow the code.

    @classmethod
    def jwks_public_key_for_key_id(cls, authentication_identifier: str, key_id: str) -> dict:
        jwks_uri = cls.open_id_endpoint_for_name("jwks_uri", authentication_identifier)
        jwks_configs = cls.get_jwks_config_from_uri(jwks_uri)
        json_key_configs: dict = next(jk for jk in jwks_configs["keys"] if jk["kid"] == key_id)
        return json_key_configs

It tries to get the key config for the given key_id from the openid provider server (which I guessed from the code). get_jwks_config_from_uri() maintains a cache and makes a request to the openid server only if the cache is empty.

Here, I believe the cache was not empty, so the code looked for key_id in the cached configs and failed to find it.

I manually made a request to jwks_uri (https://dex.my.domain/keys) and it returned 5 keys. I compared them with the 5 keys logged in sentry: only 1 matched, and the other 4 were different.

So I guess dex has a key rotation mechanism, and each key is only valid for a relatively short period. I then did some googling with the keywords jwks + rotation and found this answer, which indicates that the provider can revoke a key at any time for any reason.

So I believe spiffworkflow should either not cache jwks at all, or cache keys by key_id instead of caching a provider's whole key set by jwks_uri.
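
One way to sketch the "cache by key_id" idea: index the cached keys by kid and refetch from the provider when an unknown kid shows up. This is only an illustration under assumptions, not the fix that went into spiff-arena; JwksCache and its fetch callable are hypothetical, and fetch stands in for the HTTP request to jwks_uri.

```python
from typing import Callable, Dict


class JwksCache:
    """Hypothetical JWKS cache that refetches when a key id is unknown."""

    def __init__(self, fetch: Callable[[str], dict]) -> None:
        # fetch(jwks_uri) must return the provider's JWKS document,
        # i.e. a dict shaped like {"keys": [{"kid": ...}, ...]}.
        self._fetch = fetch
        self._keys_by_uri: Dict[str, Dict[str, dict]] = {}

    def _refresh(self, jwks_uri: str) -> Dict[str, dict]:
        keys = {jk["kid"]: jk for jk in self._fetch(jwks_uri)["keys"]}
        self._keys_by_uri[jwks_uri] = keys
        return keys

    def public_key_for(self, jwks_uri: str, key_id: str) -> dict:
        keys = self._keys_by_uri.get(jwks_uri)
        if keys is None or key_id not in keys:
            # Cache miss or rotated key: ask the provider once more.
            keys = self._refresh(jwks_uri)
        if key_id not in keys:
            raise KeyError(f"key id {key_id!r} not published at {jwks_uri}")
        return keys[key_id]
```

Known key ids are served from the cache, so the provider is only contacted on the first lookup and after a rotation. A production version would probably also limit how often an unknown kid can trigger a refetch, since otherwise tokens with bogus key ids cause a request to the provider every time.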

burnettk avatar burnettk commented on September 15, 2024

we updated the jwks handling to hopefully handle this key rotation. it's in main. are you using :latest tags? if so, since we only bump :latest on release, perhaps you could switch to the latest timestamped main tag of backend based on this commit: https://github.com/sartography/spiff-arena/actions/runs/9880631567

StephenPCG avatar StephenPCG commented on September 15, 2024

I have submitted a PR to add the pool_pre_ping option.

BTW, would you mind installing libpq5 in the spiffworkflow-backend docker image, so there is no need to build a dedicated image for the postgres use case?

burnettk avatar burnettk commented on September 15, 2024

@StephenPCG sure, we added libpq5 to the backend docker image.

StephenPCG avatar StephenPCG commented on September 15, 2024

@burnettk Wow! Thank you!

However, judging from the commit diff 42a3110, it seems only two lines of comments were added and libpq5 was not installed 😂

burnettk avatar burnettk commented on September 15, 2024

@StephenPCG lol, yes; i've found it's harder to add new bugs if you only add comments. :D

thanks for catching! db8433d actually adds the package.

StephenPCG avatar StephenPCG commented on September 15, 2024

It works like a charm now! Thank you very much!

burnettk avatar burnettk commented on September 15, 2024

@StephenPCG great to hear. it's been lovely working with you. i hope you will keep us informed on your progress on here or discord.
