
envoy-operator's Introduction

Envoy Operator

Overview

This charm encompasses the Kubernetes Python operator for Envoy (see CharmHub).

The Envoy operator is a Python script that wraps the latest released version of Envoy, providing lifecycle management and handling events such as install, upgrade, integrate, and remove.
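As a rough illustration (the handler names here are assumptions, not the charm's actual code), a charm built on the ops framework wires those lifecycle events to handlers like this:

from ops.charm import CharmBase
from ops.main import main


class Operator(CharmBase):
    """Illustrative skeleton of lifecycle event handling."""

    def __init__(self, *args):
        super().__init__(*args)
        self.framework.observe(self.on.install, self._on_install)
        self.framework.observe(self.on.upgrade_charm, self._on_upgrade_charm)
        self.framework.observe(self.on.remove, self._on_remove)

    def _on_install(self, event):
        ...  # set up the Envoy workload

    def _on_upgrade_charm(self, event):
        ...  # refresh the workload after a charm upgrade

    def _on_remove(self, event):
        ...  # clean up resources on removal


if __name__ == "__main__":
    main(Operator)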

Install

To install Envoy, run:

juju deploy envoy

For more information, see https://juju.is/docs

envoy-operator's People

Contributors

beliaev-maksim, ca-scribner, dnplas, i-chvets, kimwnasptd, knkski, misohu, natalian98, nohaihab, orfeas-k, renovate[bot], wrfitch


Forkers

aym-frikha

envoy-operator's Issues

Sidecar rewrite: envoy

Context

We are rewriting all of our charms to use the sidecar with base charm pattern instead of the old podspec approach.

What needs to get done

Rewrite the charm using the sidecar with base charm pattern.
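For illustration, a minimal sketch of the target pattern using ops and Pebble; the container and service names are assumptions, not the final implementation:

from ops.charm import CharmBase
from ops.main import main
from ops.model import ActiveStatus, WaitingStatus
from ops.pebble import Layer


class EnvoyOperator(CharmBase):
    def __init__(self, *args):
        super().__init__(*args)
        # The workload runs as a sidecar container managed via Pebble.
        self.framework.observe(self.on.envoy_pebble_ready, self._on_pebble_ready)

    def _on_pebble_ready(self, event):
        container = self.unit.get_container("envoy")
        if not container.can_connect():
            self.unit.status = WaitingStatus("waiting for Pebble")
            event.defer()
            return
        layer = Layer({
            "services": {
                "envoy": {
                    "override": "replace",
                    "command": "envoy -c /etc/envoy/envoy.yaml",
                    "startup": "enabled",
                }
            }
        })
        container.add_layer("envoy", layer, combine=True)
        container.replan()
        self.unit.status = ActiveStatus()


if __name__ == "__main__":
    main(EnvoyOperator)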

Definition of Done

Charm is rewritten with the sidecar with base charm pattern.
All of the tests are rewritten and passing.

envoy responds with "upstream connect error: connection termination"

When KFP-UI sends a request to envoy in order to fetch ML metadata from metadata-grpc-service, we see the following error:

Cannot find context with {"typeName":"system.PipelineRun","contextName":"2b2eae10-e8fc-4427-8142-7e8c30fcfb27"}: upstream connect error or disconnect/reset before headers. reset reason: connection termination

This message comes from the envoy service, which is probably not configured properly.

Debug

Applying the upstream envoy Deployment, Service, and VirtualService solved the issue, which confirms that our envoy configuration is the cause.

When deploying in an air-gapped environment, envoy gets stuck installing

Bug Description

Trying to deploy air-gapped, I am seeing a couple of issues with multiple charms.

Now, AFAIU there seems to be at least a dependency chain like this:
kfp-metadata-writer -> mlmd -> envoy

And envoy shows as stuck "installing", with no useful logs that I could find.

To Reproduce

# [..] Deploy other charms offline as per kubeflow-bundle repo

juju deploy --trust --debug ./envoy envoy --resource oci-image=10.10.11.39:32000/gcr.io/ml-pipeline/metadata-envoy:2.0.2

# [..] Add relations as per kubeflow-bundle repo

Environment

Kubeflow 1.8/stable
Microk8s 1.28-strict/stable
Juju 3.1.7/stable

Air Gapped

Relevant Log Output

# juju status | grep -v active
Model     Controller  Cloud/Region              Version  SLA          Timestamp
kubeflow  lxd-mgmt    microk8s-train/localhost  3.1.7    unsupported  06:04:11Z

App                        Version                         Status       Scale  Charm                    Channel  Rev  Address         Exposed  Message
envoy                                                      maintenance      1  envoy                               0                  no       installing charm software
istio-pilot                                                waiting          1  istio-pilot                         2  10.152.183.244  no       installing agent
kfp-metadata-writer                                        waiting          1  kfp-metadata-writer                 0  10.152.183.26   no       installing agent
kfp-profile-controller                                     waiting          1  kfp-profile-controller              0  10.152.183.27   no       installing agent
kfp-ui                                                     waiting          1  kfp-ui                              0  10.152.183.127  no       installing agent
mlmd                       .../tfx-oss-public/ml_metad...  waiting          1  mlmd                                0  10.152.183.212  no       List of <ops.model.Relation grpc:25> versions not found for apps: envoy
oidc-gatekeeper                                            waiting          1  oidc-gatekeeper                     0  10.152.183.84   no       installing agent

Unit                          Workload     Agent      Address       Ports          Message
envoy/0*                      maintenance  executing                               (leader-elected) installing charm software
istio-pilot/0*                waiting      idle       10.1.195.250                 Execution handled 1 errors.  See logs for details.
kfp-metadata-writer/0*        blocked      idle       10.1.195.245                 [relation:grpc] Expected data from exactly 1 related applications - got 0.
kfp-profile-controller/0*     maintenance  idle       10.1.195.197                 Reconciling charm: executing component container:kfp-profile-controller
kfp-ui/0*                     waiting      idle       10.1.195.231                 [container:ml-pipeline-ui] Waiting for Pebble services (ml-pipeline-ui).  If this persists, it could be a blocking co...
mlmd/0*                       waiting      idle       10.1.195.212  8080/TCP       List of <ops.model.Relation grpc:25> versions not found for apps: envoy
oidc-gatekeeper/0*            blocked      idle       10.1.195.225                 Failed to replan

# kk logs envoy-operator-0
Defaulted container "juju-operator" out of: juju-operator, juju-init (init)
2024-02-27 04:53:19 INFO juju.cmd supercommand.go:56 running jujud [3.1.7 0cd207d999fef1fc8b965c410e9f58fafe7ee335 gc go1.21.5]
2024-02-27 04:53:19 DEBUG juju.cmd supercommand.go:57   args: []string{"/var/lib/juju/tools/jujud", "caasoperator", "--application-name=envoy", "--debug"}
2024-02-27 04:53:19 DEBUG juju.agent agent.go:593 read agent config, format "2.0"
2024-02-27 04:53:19 INFO juju.worker.upgradesteps worker.go:60 upgrade steps for 3.1.7 have already been run.
2024-02-27 04:53:19 INFO juju.cmd.jujud caasoperator.go:205 caas operator application-envoy start (3.1.7 [gc])
2024-02-27 04:53:19 DEBUG juju.cmd.jujud runner.go:402 start "api"
2024-02-27 04:53:19 INFO juju.cmd.jujud runner.go:578 start "api"
2024-02-27 04:53:19 DEBUG juju.worker.dependency engine.go:580 "caas-units-manager" manifold worker started at 2024-02-27 04:53:19.415579542 +0000 UTC
2024-02-27 04:53:19 DEBUG juju.worker.dependency engine.go:580 "clock" manifold worker started at 2024-02-27 04:53:19.416484295 +0000 UTC
2024-02-27 04:53:19 DEBUG juju.worker.dependency engine.go:580 "upgrade-steps-gate" manifold worker started at 2024-02-27 04:53:19.416651233 +0000 UTC
2024-02-27 04:53:19 DEBUG juju.worker.dependency engine.go:580 "agent" manifold worker started at 2024-02-27 04:53:19.417316281 +0000 UTC
2024-02-27 04:53:19 DEBUG juju.worker.dependency engine.go:603 "caas-units-manager" manifold worker completed successfully
2024-02-27 04:53:19 DEBUG juju.worker.introspection worker.go:135 introspection worker listening on "@jujud-application-envoy"
2024-02-27 04:53:19 DEBUG juju.cmd.jujud runner.go:410 "api" started
2024-02-27 04:53:19 DEBUG juju.worker.introspection worker.go:161 stats worker now serving
2024-02-27 04:53:19 DEBUG juju.worker.dependency engine.go:580 "upgrade-steps-flag" manifold worker started at 2024-02-27 04:53:19.426088114 +0000 UTC
2024-02-27 04:53:19 DEBUG juju.worker.dependency engine.go:580 "caas-units-manager" manifold worker started at 2024-02-27 04:53:19.426203073 +0000 UTC
2024-02-27 04:53:19 DEBUG juju.worker.apicaller connect.go:129 connecting with old password
2024-02-27 04:53:19 DEBUG juju.worker.dependency engine.go:580 "api-config-watcher" manifold worker started at 2024-02-27 04:53:19.428977713 +0000 UTC
2024-02-27 04:53:19 DEBUG juju.worker.dependency engine.go:580 "migration-fortress" manifold worker started at 2024-02-27 04:53:19.436578166 +0000 UTC
2024-02-27 04:53:19 DEBUG juju.api apiclient.go:1172 successfully dialed "wss://10.10.11.54:17070/model/c41734e0-aa2c-4028-8105-ccefc9d4111e/api"
2024-02-27 04:53:19 INFO juju.api apiclient.go:707 connection established to "wss://10.10.11.54:17070/model/c41734e0-aa2c-4028-8105-ccefc9d4111e/api"
2024-02-27 04:53:19 INFO juju.worker.apicaller connect.go:163 [c41734] "application-envoy" successfully connected to "10.10.11.54:17070"
2024-02-27 04:53:19 DEBUG juju.api monitor.go:35 RPC connection died
2024-02-27 04:53:19 DEBUG juju.worker.dependency engine.go:603 "api-caller" manifold worker completed successfully
2024-02-27 04:53:19 DEBUG juju.worker.apicaller connect.go:129 connecting with old password
2024-02-27 04:53:19 DEBUG juju.api apiclient.go:1172 successfully dialed "wss://10.10.11.54:17070/model/c41734e0-aa2c-4028-8105-ccefc9d4111e/api"
2024-02-27 04:53:19 INFO juju.api apiclient.go:707 connection established to "wss://10.10.11.54:17070/model/c41734e0-aa2c-4028-8105-ccefc9d4111e/api"
2024-02-27 04:53:19 INFO juju.worker.apicaller connect.go:163 [c41734] "application-envoy" successfully connected to "10.10.11.54:17070"
2024-02-27 04:53:19 DEBUG juju.worker.dependency engine.go:580 "api-caller" manifold worker started at 2024-02-27 04:53:19.496843869 +0000 UTC
2024-02-27 04:53:19 DEBUG juju.worker.dependency engine.go:603 "caas-units-manager" manifold worker completed successfully
2024-02-27 04:53:19 DEBUG juju.worker.dependency engine.go:580 "caas-units-manager" manifold worker started at 2024-02-27 04:53:19.50550414 +0000 UTC
2024-02-27 04:53:19 DEBUG juju.worker.dependency engine.go:580 "migration-minion" manifold worker started at 2024-02-27 04:53:19.507115444 +0000 UTC
2024-02-27 04:53:19 DEBUG juju.worker.dependency engine.go:580 "upgrader" manifold worker started at 2024-02-27 04:53:19.507267948 +0000 UTC
2024-02-27 04:53:19 DEBUG juju.worker.dependency engine.go:580 "log-sender" manifold worker started at 2024-02-27 04:53:19.507510505 +0000 UTC
2024-02-27 04:53:19 DEBUG juju.worker.dependency engine.go:580 "upgrade-steps-runner" manifold worker started at 2024-02-27 04:53:19.509256653 +0000 UTC
2024-02-27 04:53:19 DEBUG juju.worker.dependency engine.go:603 "upgrade-steps-runner" manifold worker completed successfully
2024-02-27 04:53:19 DEBUG juju.worker.dependency engine.go:580 "migration-inactive-flag" manifold worker started at 2024-02-27 04:53:19.512624978 +0000 UTC
2024-02-27 04:53:19 INFO juju.worker.caasupgrader upgrader.go:113 abort check blocked until version event received
2024-02-27 04:53:19 DEBUG juju.worker.caasupgrader upgrader.go:128 current agent binary version: 3.1.7
2024-02-27 04:53:19 INFO juju.worker.caasupgrader upgrader.go:119 unblocking abort check
2024-02-27 04:53:19 INFO juju.worker.migrationminion worker.go:142 migration phase is now: NONE
2024-02-27 04:53:19 DEBUG juju.worker.dependency engine.go:580 "charm-dir" manifold worker started at 2024-02-27 04:53:19.522822714 +0000 UTC
2024-02-27 04:53:19 DEBUG juju.worker.dependency engine.go:618 "operator" manifold worker stopped: fortress operation aborted
stack trace:
github.com/juju/juju/worker/fortress.init:43: fortress operation aborted
github.com/juju/juju/worker/fortress.Occupy:60:
github.com/juju/juju/cmd/jujud/agent/engine.Housing.Decorate.occupyStart.func1:93:
2024-02-27 04:53:19 DEBUG juju.worker.dependency engine.go:580 "secret-drain-worker" manifold worker started at 2024-02-27 04:53:19.523128737 +0000 UTC
2024-02-27 04:53:19 DEBUG juju.worker.dependency engine.go:580 "api-address-updater" manifold worker started at 2024-02-27 04:53:19.523210402 +0000 UTC
2024-02-27 04:53:19 DEBUG juju.worker.logger logger.go:65 initial log config: "<root>=DEBUG"
2024-02-27 04:53:19 DEBUG juju.worker.dependency engine.go:580 "logging-config-updater" manifold worker started at 2024-02-27 04:53:19.523368699 +0000 UTC
2024-02-27 04:53:19 INFO juju.worker.logger logger.go:120 logger worker started
2024-02-27 04:53:19 DEBUG juju.worker.dependency engine.go:580 "proxy-config-updater" manifold worker started at 2024-02-27 04:53:19.524860739 +0000 UTC
2024-02-27 04:53:19 DEBUG juju.worker.logger logger.go:93 reconfiguring logging from "<root>=DEBUG" to "<root>=INFO"
2024-02-27 04:53:19 WARNING juju.worker.proxyupdater proxyupdater.go:241 unable to set snap core settings [proxy.http= proxy.https= proxy.store=]: exec: "snap": executable file not found in $PATH, output: ""
2024-02-27 04:53:19 INFO juju.worker.caasoperator.charm bundles.go:81 downloading local:focal/envoy-0 from API server
2024-02-27 04:53:19 INFO juju.downloader download.go:109 downloading from local:focal/envoy-0
2024-02-27 04:53:19 INFO juju.downloader download.go:92 download complete ("local:focal/envoy-0")
2024-02-27 04:53:19 INFO juju.downloader download.go:172 download verified ("local:focal/envoy-0")
2024-02-27 04:53:23 INFO juju.worker.caasoperator caasoperator.go:430 operator "envoy" started
2024-02-27 04:53:23 INFO juju.worker.caasoperator.runner runner.go:578 start "envoy/0"
2024-02-27 04:53:23 INFO juju.worker.leadership tracker.go:194 envoy/0 promoted to leadership of envoy
2024-02-27 04:53:23 INFO juju.agent.tools symlinks.go:20 ensure jujuc symlinks in /var/lib/juju/tools/unit-envoy-0
2024-02-27 04:53:23 INFO juju.worker.caasoperator.uniter.envoy/0 uniter.go:363 unit "envoy/0" started
2024-02-27 04:53:23 INFO juju.worker.caasoperator.uniter.envoy/0 uniter.go:689 resuming charm install
2024-02-27 04:53:23 INFO juju.worker.caasoperator.uniter.envoy/0.charm bundles.go:81 downloading local:focal/envoy-0 from API server
2024-02-27 04:53:23 INFO juju.downloader download.go:109 downloading from local:focal/envoy-0
2024-02-27 04:53:23 INFO juju.downloader download.go:92 download complete ("local:focal/envoy-0")
2024-02-27 04:53:24 INFO juju.downloader download.go:172 download verified ("local:focal/envoy-0")
2024-02-27 04:53:27 INFO juju.worker.caasoperator.uniter.envoy/0 uniter.go:389 hooks are retried true
2024-02-27 04:53:27 INFO juju.worker.caasoperator.uniter.envoy/0 resolver.go:165 found queued "install" hook
2024-02-27 04:53:28 INFO juju-log Running legacy hooks/install.
2024-02-27 04:53:29 WARNING juju-log 0 containers are present in metadata.yaml and refresh_event was not specified. Defaulting to update_status. Metrics IP may not be set in a timely fashion.
2024-02-27 04:53:30 INFO juju.worker.caasoperator.uniter.envoy/0.operation runhook.go:186 ran "install" hook (via hook dispatching script: dispatch)
2024-02-27 04:53:30 INFO juju.worker.caasoperator.uniter.envoy/0 resolver.go:165 found queued "leader-elected" hook
2024-02-27 04:53:31 WARNING juju-log 0 containers are present in metadata.yaml and refresh_event was not specified. Defaulting to update_status. Metrics IP may not be set in a timely fashion.

Additional Context

No response

Publish envoy-operator charm to `metadata-envoy` instead of `envoy`

Context

This charm is for KFP's metadata-envoy component and not a general envoy component. During the update for the CKF 1.8 release, we changed from using the image envoyproxy/envoy:v1.12.2 to gcr.io/ml-pipeline/metadata-envoy:2.0.2. This made sense, since the charm is not really a configurable general-purpose envoy charm, but rather configures envoy to replicate KFP's metadata-envoy functionality.

At the same time, there are two charms in Charmhub: envoy and metadata-envoy.

Proposal

We should archive the envoy charm in Charmhub and start publishing under metadata-envoy. This way, we'll avoid confusion and also make it explicit that this is not a generalized envoy charm.

Debug `envoy` in airgapped environments in `track/2.0`

Context

#72 reports envoy not working in an airgapped environment. We need to debug this, fix the issues, and release the fixes to track/2.0.

It is unclear what version of envoy was used in the previous failed CKF 1.8 deployments; it's possible that:

  • track/2.0 does actually work in airgapped
  • our current main charm works in airgapped now based on other recent changes.

If neither works, we need to debug and implement a fix in track/2.0.

What needs to get done

  1. Investigate whether envoy is broken in airgapped environments
  2. Fix the issue and land the fix in track/2.0

Definition of Done

  1. track/2.0 works in airgapped

Difficulty fetching oci-image creates transient blocked status

When running the build_and_deploy integration tests in this PR, the test fails if raise_on_blocked is set to True when waiting for idle. This is due to a transient blocked status, which also shows up in the next test in the file, where the charm hangs on "getting oci-image" for around 5 minutes. Since a blocked status means "I require human intervention", this might just need updating to a maintenance/waiting status.
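A minimal sketch of the suggested change, assuming the charm fetches its image with the oci-image resource library as podspec charms typically do (the class and handler names are illustrative, not the charm's actual code):

from oci_image import OCIImageResource, OCIImageResourceError
from ops.charm import CharmBase
from ops.model import MaintenanceStatus


class Operator(CharmBase):
    def __init__(self, *args):
        super().__init__(*args)
        self.image = OCIImageResource(self, "oci-image")

    def _set_pod_spec(self, event):
        try:
            image_details = self.image.fetch()
        except OCIImageResourceError:
            # Previously this condition surfaced as a blocked status; a
            # transient fetch failure is better reported as maintenance,
            # so tests using raise_on_blocked don't trip on it.
            self.unit.status = MaintenanceStatus("getting oci-image")
            event.defer()
            return
        ...  # build and set the pod spec from image_details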

Test Logs:

Grafana integration leaves "no data" dashboards

Bug Description

I deployed COS and integrated it with envoy:

$   juju status --relations | grep envoy
envoy                                  res:oci-image@cc06b3e    active       1  envoy                     2.0/stable           101  10.152.183.122  no       
grafana-agent-envoy                    0.35.2                   active       1  grafana-agent-k8s         latest/stable         58  10.152.183.22   no       
envoy/5*                                  active       idle   192.168.13.250   9090,9901/TCP  
grafana-agent-envoy/0*                    active       idle   192.168.212.194                 
cos-loki:logging                                                   grafana-agent-envoy:logging-consumer                     loki_push_api             regular  
cos-prometheus:receive-remote-write                                grafana-agent-envoy:send-remote-write                    prometheus_remote_write   regular  
envoy:grafana-dashboards                                           cos-grafana:grafana-dashboard                            grafana_dashboard         regular  joining  
envoy:grafana-dashboards                                           grafana-agent-envoy:grafana-dashboards-consumer          grafana_dashboard         regular  
envoy:metrics-endpoint                                             grafana-agent-envoy:metrics-endpoint                     prometheus_scrape         regular  
grafana-agent-envoy:grafana-dashboards-provider                    cos-grafana:grafana-dashboard                            grafana_dashboard         regular  
grafana-agent-envoy:peers                                          grafana-agent-envoy:peers                                grafana_agent_replica     peer     
istio-pilot:ingress                                                envoy:ingress                                            ingress                   regular  
mlmd:grpc                                                          envoy:grpc                                               grpc                      regular  

As you can see, a lot of the dashboards have no data:
[Screenshot from 2024-03-04 12-15-19]

To Reproduce

  1. Juju deploy COS
  2. Juju deploy kubeflow 1.8
  3. Integrate it

Environment

Relevant Log Output

-

Additional Context

No response

Airgapped: envoy gets stuck installing with `Failed to resolve 'raw.githubusercontent.com'` attempting to download relation schema

Bug Description

When deploying CKF 1.8/stable in airgapped, the envoy charm gets stuck in maintenance with the unit message being (leader-elected) installing charm software.
In the relevant logs, it is observed that the envoy charm is trying to download the schema for the grpc relation.
Looking at the metadata.yaml, the relation is defined as follows:

  grpc:
    interface: grpc
    schema: https://raw.githubusercontent.com/canonical/operator-schemas/master/grpc.yaml
    versions: [v1]

Indeed, we are using a remote reference for the schema; this is a blocker for deploying envoy in airgapped environments.
The same applies for the grpc-web relation.

To fix this, we can define the schema inline in the metadata.yaml and have the url set in the __schema_source attribute for reference.
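A sketch of what that could look like; the schema body below only illustrates the shape and should be copied verbatim from the upstream grpc.yaml rather than from here:

  grpc:
    interface: grpc
    schema:
      v1:
        provides:
          type: object
          properties:
            service:
              type: string
            port:
              type: string
          required:
            - service
            - port
    # Kept for reference only; no longer fetched at runtime.
    __schema_source: https://raw.githubusercontent.com/canonical/operator-schemas/master/grpc.yaml
    versions: [v1]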

This was reported in canonical/bundle-kubeflow#818, see point 1.

To Reproduce

Deploy envoy 2.0/stable in airgapped

Environment

microk8s 1.25-strict/stable
juju 3.1/stable

Relevant Log Output

2024-05-24 07:19:00 ERROR juju-log Uncaught exception while in charm code:
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-envoy-0/charm/venv/urllib3/connection.py", line 203, in _new_conn
    sock = connection.create_connection(
  File "/var/lib/juju/agents/unit-envoy-0/charm/venv/urllib3/util/connection.py", line 60, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/usr/lib/python3.8/socket.py", line 918, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Temporary failure in name resolution

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-envoy-0/charm/venv/urllib3/connectionpool.py", line 790, in urlopen
    response = self._make_request(
  File "/var/lib/juju/agents/unit-envoy-0/charm/venv/urllib3/connectionpool.py", line 491, in _make_request
    raise new_e
  File "/var/lib/juju/agents/unit-envoy-0/charm/venv/urllib3/connectionpool.py", line 467, in _make_request
    self._validate_conn(conn)
  File "/var/lib/juju/agents/unit-envoy-0/charm/venv/urllib3/connectionpool.py", line 1092, in _validate_conn
    conn.connect()
  File "/var/lib/juju/agents/unit-envoy-0/charm/venv/urllib3/connection.py", line 611, in connect
    self.sock = sock = self._new_conn()
  File "/var/lib/juju/agents/unit-envoy-0/charm/venv/urllib3/connection.py", line 210, in _new_conn
    raise NameResolutionError(self.host, self, e) from e
urllib3.exceptions.NameResolutionError: <urllib3.connection.HTTPSConnection object at 0x7bab20b7d940>: Failed to resolve 'raw.githubusercontent.com' ([Errno -3] Temporary failure in name resolution)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-envoy-0/charm/venv/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
  File "/var/lib/juju/agents/unit-envoy-0/charm/venv/urllib3/connectionpool.py", line 844, in urlopen
    retries = retries.increment(
  File "/var/lib/juju/agents/unit-envoy-0/charm/venv/urllib3/util/retry.py", line 515, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='raw.githubusercontent.com', port=443): Max retries exceeded with url: /canonical/operator-schemas/master/grpc.yaml (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7bab20b7d940>: Failed to resolve 'raw.githubusercontent.com' ([Errno -3] Temporary failure in name resolution)"))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./src/charm.py", line 308, in <module>
    main(Operator)
  File "/var/lib/juju/agents/unit-envoy-0/charm/venv/ops/main.py", line 441, in main
    _emit_charm_event(charm, dispatcher.event_name)
  File "/var/lib/juju/agents/unit-envoy-0/charm/venv/ops/main.py", line 149, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-envoy-0/charm/venv/ops/framework.py", line 342, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-envoy-0/charm/venv/ops/framework.py", line 839, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-envoy-0/charm/venv/ops/framework.py", line 928, in _reemit
    custom_handler(event)
  File "./src/charm.py", line 165, in set_pod_spec
    interfaces = self._get_interfaces()
  File "./src/charm.py", line 261, in _get_interfaces
    interfaces = get_interfaces(self)
  File "/var/lib/juju/agents/unit-envoy-0/charm/venv/serialized_data_interface/sdi.py", line 351, in get_interfaces
    return {
  File "/var/lib/juju/agents/unit-envoy-0/charm/venv/serialized_data_interface/sdi.py", line 352, in <dictcomp>
    endpoint: get_interface(charm, endpoint)
  File "/var/lib/juju/agents/unit-envoy-0/charm/venv/serialized_data_interface/sdi.py", line 375, in get_interface
    schema = utils.get_schema(interface["schema"])
  File "/var/lib/juju/agents/unit-envoy-0/charm/venv/serialized_data_interface/utils.py", line 68, in get_schema
    response = _get_schema_response_from_remote(schema)
  File "/var/lib/juju/agents/unit-envoy-0/charm/venv/serialized_data_interface/utils.py", line 93, in _get_schema_response_from_remote
    response = requests.get(url=url, proxies=proxies)
  File "/var/lib/juju/agents/unit-envoy-0/charm/venv/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
  File "/var/lib/juju/agents/unit-envoy-0/charm/venv/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/var/lib/juju/agents/unit-envoy-0/charm/venv/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/var/lib/juju/agents/unit-envoy-0/charm/venv/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/var/lib/juju/agents/unit-envoy-0/charm/venv/requests/adapters.py", line 519, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='raw.githubusercontent.com', port=443): Max retries exceeded with url: /canonical/operator-schemas/master/grpc.yaml (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7bab20b7d940>: Failed to resolve 'raw.githubusercontent.com' ([Errno -3] Temporary failure in name resolution)"))

Additional Context

No response

Refactor envoy configuration to use the upstream yaml file

Context

We should refactor the charm to configure envoy using a simple yaml file instead of generating the configuration with the envoy_data_plane package, for the following reasons:

  1. During the update of envoy for Kubeflow 1.9, there were updates to the envoy.yaml file that are visible in PR kubeflow/pipelines#10589. Regarding those:
     1. In general, it is a more cumbersome developer experience to map changes in the envoy.yaml file to the corresponding envoy_data_plane code. In other words, it's easier to just copy changes from upstream if we also use a yaml file.
     2. This has caused us trouble in the past, e.g. in PR #64 (comment) we had to revert changes because the python package didn't work the same way as the yaml file (see the bullet about reverting max_grpc_timeout). This also resulted in us having a slightly different configuration than upstream.

Thus, in order to make our lives easier, I think we should refactor the charm to consume a yaml file directly.

What needs to get done

Replace the config generator component with a simple yaml template that is rendered and pushed into the container.
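A minimal sketch of what that could look like in the charm, assuming a Jinja2 template stored in the charm source; the template path, context keys, and helper names are assumptions:

from pathlib import Path

from jinja2 import Template


def _render_envoy_config(self) -> str:
    # Template copied from upstream KFP's metadata-envoy envoy.yaml, with
    # the upstream (mlmd) address parameterized.
    template = Template(Path("src/templates/envoy.yaml.j2").read_text())
    return template.render(
        upstream_host=self._upstream_host,  # assumed attribute from the grpc relation
        upstream_port=self._upstream_port,  # assumed attribute from the grpc relation
    )


def _update_layer(self, event):
    container = self.unit.get_container("envoy")
    if not container.can_connect():
        event.defer()
        return
    # Push the rendered config into the workload container and restart envoy.
    container.push("/etc/envoy/envoy.yaml", self._render_envoy_config(), make_dirs=True)
    container.restart("envoy")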

Definition of Done

Envoy is configurable through a yaml file.

Update `envoy` manifests

Context

Each charm has a set of manifest files that have to be upgraded to their target version. The process of upgrading manifest files usually means going to the component’s upstream repository, comparing the charm’s manifest against the one in the repository and adding the missing bits in the charm’s manifest.

What needs to get done

https://docs.google.com/document/d/1a4obWw98U_Ndx-ZKRoojLf4Cym8tFb_2S7dq5dtRQqs/edit?pli=1#heading=h.jt5e3qx0jypg

Definition of Done

  1. Manifests are updated
  2. Upstream image is used

update the grpc relation to use the mlops-libs k8s_service_info library

Context

Previous versions of this charm use an SDI-backed implementation of the grpc relation with mlmd. For the Charmed Kubeflow 1.9 release, mlmd's relation handling is changing to use the mlops-libs k8s_service_info library. We need to update that here as well to keep them compatible.

What needs to get done

  1. update the relation handling here to use the mlops-libs k8s_service_info library
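A hedged sketch of what the requirer side could look like; the import path and API shape are assumptions based on the library's naming, so check the published mlops-libs k8s_service_info library for the actual interface:

from charms.mlops_libs.v0.k8s_service_info import (  # assumed import path
    KubernetesServiceInfoRequirer,
)
from ops.charm import CharmBase


class Operator(CharmBase):
    def __init__(self, *args):
        super().__init__(*args)
        # Consume mlmd's Kubernetes Service name/port over the grpc relation.
        self._grpc = KubernetesServiceInfoRequirer(self, relation_name="grpc")

    def _upstream_address(self) -> str:
        # get_data() is assumed to return an object carrying the related
        # app's service name and port.
        svc = self._grpc.get_data()
        return f"{svc.name}:{svc.port}"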

Definition of Done

  1. Charm has the new relation handling and is demonstrated working. If this is implemented before mlmd is upgraded, we might need to demonstrate this using a dummy charm or an intermediate implementation of mlmd, because we have no other charms that use this library yet.
