Comments (7)

narqo avatar narqo commented on May 30, 2024

Whenever I try to redeploy the mimir smoke-test I get this error.

To double-check, could you clarify, what do you mean by redeploying the smoke-test job; do you manually run helm test?

Could you check whether the output of kubectl get -l app.kubernetes.io/component=smoke-test and kubectl get pod -l app.kubernetes.io/component=smoke-test indicates any failures from the testing pod?

from mimir.

crisjaytomas avatar crisjaytomas commented on May 30, 2024

Hi @narqo

To double-check, could you clarify, what do you mean by redeploying the smoke-test job; do you manually run helm test?

This error happens only when I change something in mimir and try to redeploy it using helm install. I believe the smoke-test is part of the mimir deployment. I don't run helm test. I'm looking for a way to disable this smoke-test or suppress the error.

Could you check if the output from kubectl get -l app.kubernetes.io/component=smoke-test and kubectl get pod -l app.kubernetes.io/component=smoke-test indicates of any failures from the testing pod.

I don't get any output from the first command. Maybe you meant kubectl get job -l app.kubernetes.io/component=smoke-test? Here is the output of get job:

NAME                                COMPLETIONS   DURATION   AGE
cloud-monitoring-mimir-smoke-test   0/1           97m        97m

For the second command, I don't have any pod running for the smoke-test.

narqo avatar narqo commented on May 30, 2024

This error happens only when I change something on mimir and try to redeploy it using helm install. I believe that smoke-test is part of the deployment of mimir.

"smoke-test" is a chart hook, which Helm runs on helm test. This is defined via the "helm.sh/hook": test annotation on the job (refer to Helm's "Chart Tests"). I am not sure Helm is supposed to run the tests automatically on install.

To help debugging that,

  • Could you provide the output of the helm version command?
  • Could you show the values you pass to helm install and helm upgrade (stripping out any sensitive details from the values)?

crisjaytomas avatar crisjaytomas commented on May 30, 2024

I'll try to explain what I actually do, to give you more detail.

  1. We run helm template to produce an output.yml manifest:
helm template cloud-monitoring "$MONITORING_CLUSTER_VALUES_PATH" -f "$MONITORING_CLUSTER_VALUES_PATH/values.yaml" \
                    --namespace="$NAMESPACE" \
                    --api-versions=policy/v1/PodDisruptionBudget \
                    --set mimir-distributed.rollout_operator.image.repository="$BUILD_ACR".azurecr.io/rollout-operator \
                    --set mimir-distributed.image.repository=acr.azurecr.io/mimir \
                    --set mimir-distributed.nginx.image.registry="$BUILD_ACR".azurecr.io > output.yml
  2. Then we apply output.yml to Kubernetes:
kubectl apply -f output.yml
  3. When I inspect output.yml, it contains a smoke-test job that I want to get rid of. If the smoke-test job already exists, it causes the error I described in the "Describe the bug" section of this issue. What we normally do is remove that smoke-test job first and then redeploy mimir.
---
# Source: cloud-monitoring/charts/mimir-distributed/templates/smoke-test/smoke-test-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: cloud-monitoring-mimir-smoke-test
  labels:
    helm.sh/chart: mimir-distributed-4.5.0-weekly.245
    app.kubernetes.io/name: mimir
    app.kubernetes.io/instance: cloud-monitoring
    app.kubernetes.io/component: smoke-test
    app.kubernetes.io/version: "r245"
    app.kubernetes.io/managed-by: Helm
  annotations:
    "helm.sh/hook": test
  namespace: "monitoring"
  4. What I wanted to ask: I've seen that you can set continuous_test enabled to false. Is there a similar setting for the smoke_test? Or is there a way to completely remove the smoke test from my template?

narqo avatar narqo commented on May 30, 2024

Thank you for the detailed explanation. That makes it much clearer now.

We run a helm template to produce an output.yaml manifest. [..] Is there a way to completely remove smoke test on my template.

You can use Helm's --no-hooks flag to skip generating resources that are marked as Helm hooks.

Another popular trick to tweak the output of helm template for your own infra's needs is to pass the output.yaml through kustomize, to "massage" the manifests further.
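
As a rough sketch of both ideas, assuming the same chart path, values file, and namespace variables as in the helm template command quoted earlier in this thread: the first function only adds --no-hooks to that command, and the second is a hypothetical post-filter (strip_hook_docs is not part of Helm or kustomize) that drops any YAML document mentioning helm.sh/hook. The filter matches on plain text, so it would also drop a document that merely mentions the annotation in a comment.

```shell
#!/bin/sh
# Option 1 (sketch): ask Helm to leave hook resources out of the rendered output.
# Not invoked here; add the same --set flags as in the original command.
# Newer Helm 3 releases also list a --skip-tests flag for helm template;
# check `helm template --help` for your version.
render_without_hooks() {
  helm template cloud-monitoring "$MONITORING_CLUSTER_VALUES_PATH" \
    -f "$MONITORING_CLUSTER_VALUES_PATH/values.yaml" \
    --namespace="$NAMESPACE" \
    --no-hooks
}

# Option 2 (hypothetical helper): read a multi-document manifest on stdin and
# print it without the documents that contain the text "helm.sh/hook".
strip_hook_docs() {
  awk '
    # Print the buffered document unless it was flagged as a hook, then reset.
    function emit_doc() {
      if (buf != "" && !hook) printf "%s", buf
      buf = ""
      hook = 0
    }
    # A "---" line starts a new document.
    /^--- *$/ { emit_doc(); buf = $0 "\n"; next }
    {
      buf = buf $0 "\n"
      if ($0 ~ /helm\.sh\/hook/) hook = 1
    }
    END { emit_doc() }
  '
}
```

With these definitions, render_without_hooks > output.yml would produce the manifest without hooks, and strip_hook_docs < output.yml > output-no-hooks.yml would filter an already rendered file (output-no-hooks.yml is an illustrative name).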

narqo avatar narqo commented on May 30, 2024

Also, regarding the original error,

Here is the output of the get job:

NAME                                COMPLETIONS   DURATION   AGE
cloud-monitoring-mimir-smoke-test   0/1           97m        97m

I wonder if this indicates a mismatch in the configuration. My guess is that job/cloud-monitoring-mimir-smoke-test creates a smoke-test pod, but the pod fails for some reason. Once the job reaches its "backoff limit", the pod is removed, so you don't see it in the output of kubectl get pod. This leaves the job itself hanging with zero completions, and that is what breaks the subsequent kubectl apply.
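
One way to check that guess is to read the job's status directly. This is a minimal sketch: job_failure_summary is a hypothetical helper name, and the KUBECTL variable is only there so the command can be swapped out for a dry run. A job that gave up after its retries typically shows a Failed condition with the reason BackoffLimitExceeded.

```shell
#!/bin/sh
# KUBECTL can be overridden (for example with a stub) to dry-run the calls.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: print the failed-pod count and the conditions of a job.
job_failure_summary() {
  job="$1"
  ns="$2"
  # Number of this job's pods that ended in failure (empty if none recorded).
  $KUBECTL get job "$job" -n "$ns" -o jsonpath='{.status.failed}{"\n"}'
  # One "type=reason" pair per job condition.
  $KUBECTL get job "$job" -n "$ns" \
    -o jsonpath='{range .status.conditions[*]}{.type}={.reason}{"\n"}{end}'
}
```

For the job in this thread, the call would be job_failure_summary cloud-monitoring-mimir-smoke-test monitoring.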

Instead of trying to disable the smoke-test, I suggest investigating why the smoke-test's pod fails:

  1. Remove job/cloud-monitoring-mimir-smoke-test.
  2. Apply the Kubernetes manifest.
  3. While the smoke-test job's pod is in backoff retry, check its logs to see why the pod fails.
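
The steps above can be sketched as a small script. It assumes the job name and namespace from the manifest posted earlier; debug_smoke_test is a hypothetical helper name, and KUBECTL is overridable only so the sequence can be dry-run. The logs call may need a retry if the pod has not started yet.

```shell
#!/bin/sh
KUBECTL="${KUBECTL:-kubectl}"
JOB="cloud-monitoring-mimir-smoke-test"   # job name from the rendered manifest
NS="monitoring"                           # namespace from the rendered manifest

# Hypothetical helper: recreate the smoke-test job and look at why it fails.
debug_smoke_test() {
  manifest="$1"
  # 1. Remove the stale job; --ignore-not-found makes this safe to repeat.
  $KUBECTL delete job "$JOB" -n "$NS" --ignore-not-found
  # 2. Re-apply the rendered manifest, which recreates the job.
  $KUBECTL apply -f "$manifest"
  # 3. Show the job's events, then stream the logs of its pod while it retries.
  $KUBECTL describe job "$JOB" -n "$NS"
  $KUBECTL logs "job/$JOB" -n "$NS" --all-containers --follow
}
```

Usage would be debug_smoke_test output.yml, run from the directory that holds the rendered manifest.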

narqo avatar narqo commented on May 30, 2024

Please re-open and provide more details on any new findings if you think there is anything we can help with. I will close this for now.
