kyma-incubator / octopus
The test runner for acceptance tests
License: Apache License 2.0
apiVersion: testing.kyma-project.io/v1alpha1
kind: TestDefinition
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
    dependency/kubeless: "true"
    component/service-catalog: "true" # the user is free to define labels
  name: test-ls
spec:
  skip: false
  disableConcurrency: false
  timeout: 3m
  template:
    spec:
      serviceAccountName: doesnotexist
      containers:
      - name: test
        image: alpine:latest
        command:
        - "ls"
A ClusterTestSuite that tries to run such a TestDefinition stays in the Running phase indefinitely.
apiVersion: v1
items:
- apiVersion: testing.kyma-project.io/v1alpha1
  kind: ClusterTestSuite
  metadata:
    creationTimestamp: 2019-04-04T09:51:57Z
    generation: 1
    labels:
      controller-tools.k8s.io: "1.0"
    name: testsuite-all
    resourceVersion: "539"
    selfLink: /apis/testing.kyma-project.io/v1alpha1/clustertestsuites/testsuite-all
    uid: 48d78903-56bf-11e9-afde-cec002874220
  spec:
    concurrency: 1
    count: 1
    maxRetries: 0
  status:
    conditions:
    - status: "True"
      type: Running
    results:
    - executions: []
      name: test-ls
      namespace: default
      status: NotYetScheduled
    startTime: 2019-04-04T09:52:15Z
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
The user does not get any feedback about what is wrong. The necessary information is reported only in the Octopus logs:
{"level":"error","ts":1554371706.9390872,"logger":"kubebuilder.controller","msg":"Reconciler error","controller":"testsuite-controller","request":"/testsuite-all","error":"while scheduling next testing pod for suite [testsuite-all]: while creating testing pod for suite [testsuite-all] and test definition [name: test-ls, namespace: default]: pods \"oct-tp-testsuite-all-test-ls-0\" is forbidden: error looking up service account default/doesnotexist: serviceaccount \"doesnotexist\" not found","errorVerbose":"pods \"oct-tp-testsuite-all-test-ls-0\" is forbidden: error looking up service account default/doesnotexist: serviceaccount \"doesnotexist\" not found\nwhile creating testing pod for suite [testsuite-all] and test definition [name: test-ls, namespace: default]
MaxRetries allows you to retry a test in case of failure. Currently, this field is ignored by Octopus.
AC:
I should be able to set, on the TestDefinition level, that a test should be skipped. Such a test should be marked as skipped in the status of a ClusterTestSuite.
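A minimal sketch of a TestDefinition using the existing skip field (the exact status name the suite would report, e.g. Skipped, is an assumption):

```yaml
apiVersion: testing.kyma-project.io/v1alpha1
kind: TestDefinition
metadata:
  name: test-flaky
spec:
  skip: true          # this test is not executed by any suite
  template:
    spec:
      containers:
      - name: test
        image: alpine:latest
        command: ["ls"]
```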
Description
A test definition might require specific resources to exist in order to run, and the same requirement can apply to multiple test definitions. As an example, take a look at the testing.sh
of Kyma, where the suite execution requires a Helm Broker testing bundle to be configured upfront. That configuration is not part of the default Kyma installation, but many test definitions require it, so it should be configured upfront as part of the suite execution.
Another example is the Dex static password requirement defined in the same script in Kyma. Here, the test needs to be sure that a specific condition is established; otherwise, the test should be skipped.
Reasons
To remove Kyma's testing.sh
and use plain Octopus tooling instead, all such workarounds need to be solved. The two examples mentioned are the specific problems to be solved in a generic way. However, it is also a general problem, as it is a common pattern that tests have preconditions which are either established in a setup phase or lead to the test being skipped.
Details
A solution might look like this:
A test definition expresses a condition through labels on the TestDefinition resource, for example:
labels:
  requires-testing-bundle: "true"
  requires-test-user: "true"
When creating a TestSuite, requires labels are by default not satisfied, and the TestSuite will not execute any test carrying a requires label; instead, it sets the test's status to skipped with a reason text.
To satisfy a requires label, we need a new CRD, TestRequirement, specifying the label condition and a Job (or simply an in-line script?) to evaluate the condition. Optionally, it can define a setUp and tearDown Job (or simply an in-line script?) to establish the condition.
kind: TestRequirement
metadata:
  name: testing-bundle
spec:
  evaluationExpression: "kubectl get configmap testing-bundles-repos"
  setupExpression: "kubectl create configmap testing-bundles-repos --from-literal=...; kubectl label configmap testing-bundles-repos helm-broker-repo=true"
  tearDownExpression: "kubectl delete configmap testing-bundles-repos"
kind: TestRequirement
metadata:
  name: testing-user
spec:
  evaluationExpression: '${kc} get cm dex-config -n kyma-system -ojsonpath="{.data}" | grep --silent "#__STATIC_PASSWORDS__"'
The TestSuite finds all TestRequirement resources in the cluster and can then evaluate the requires labels. With that, test requirements might already be satisfied by the resources present on the cluster. Additionally, a TestRequirement can have a setup which turns the evaluation into a success after execution. On TestSuite deletion, the tearDown expressions should be executed.
Description
Octopus is using controller-runtime version 0.1.1. Since then, many breaking changes were introduced, and we stumbled into them in Kyma 1.5.0.
It is now not possible to import both the Kyma and the Octopus APIs and compile them together.
We need to upgrade the controller-runtime in Octopus to at least version 0.2.0.
Additionally, there seem to be issues between dep and Go modules on those versions.
Reasons
With Kubernetes 1.16, Octopus will stop working altogether.
Description
The pod definition generated by Octopus is missing/ignoring a vital option in the manifests. A running pod does not have shareProcessNamespace: true
set, which can break tests that use an Istio sidecar. Without this option it is impossible to kill the proxy process after a finished test, resulting in a never-ending pod.
Expected result
The pod option is present in the pod definition
Actual result
The pod option is missing in the running pod
Steps to reproduce
On a cluster with tests running:
kubectl get pod -n kyma-system oct-tp-testsuite-all-2019-05-17-08-57-test-core-kubeless-int-0 -o yaml | grep "shareProcessNamespace"
This does not find anything, meaning that the option is not set in the pod spec, even though we can see in the code that it is present.
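For reference, a pod manifest with the option set looks like this (a minimal hand-written sketch, not the exact Octopus-generated pod):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-with-shared-pid
spec:
  shareProcessNamespace: true   # containers share one PID namespace,
                                # so the test can signal the istio-proxy process
  containers:
  - name: test
    image: alpine:latest
    command: ["ls"]
```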
Troubleshooting
Currently, we log every message on the Info level. Some logs are rather debugging information, for example from scheduler.go:
Cannot get next test to schedule, max concurrency reached
No tests to execute right now
Setting the verbosity of logs is not a trivial task, see the discussion here: kubernetes-sigs/controller-runtime#167
Our public documentation under https://kyma-project.io/docs/#details-testing-kyma-tests-execution points to .../installation/scripts/testing.sh as the way to run all tests:
"To run all tests, use the testing.sh script located in the /installation/scripts/ directory. Internally, the ClusterTestSuite resource is defined. It fetches all TestDefinitions and executes them."
Whereas the script says:
echo "The testing.sh script is deprecated and will be removed. Use Kyma CLI instead."
We should update the documentation to point to Kyma CLI as the default way of running tests.
Description
Every test within a TestSuite could run in a namespace created exclusively for this TestSuite.
Reasons
Prerequisite
For this to make any difference, all Kyma tests must accept the namespace as an argument. Currently, tests usually have the namespace hardcoded.
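One way to sketch this is passing the namespace to the test container through an environment variable (the TEST_NAMESPACE name and the injection mechanism are hypothetical, not an existing Octopus feature):

```yaml
apiVersion: testing.kyma-project.io/v1alpha1
kind: TestDefinition
metadata:
  name: test-parametrized-ns
spec:
  template:
    spec:
      containers:
      - name: test
        image: alpine:latest
        # the test reads its target namespace instead of hardcoding it
        command: ["sh", "-c", "echo testing in namespace $TEST_NAMESPACE"]
        env:
        - name: TEST_NAMESPACE    # hypothetical: injected by Octopus per suite
          value: default
```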
In the CRDs we have two parameters that specify a timeout after which the execution of a test or a suite should be stopped:
ClusterTestSuite.SuiteTimeout
TestDefinition.Timeout
The purpose of this issue is to implement the logic around timeouts.
AC:
A test execution is stopped when TestDefinition.Timeout elapses. The suite can be continued.
The whole suite execution is stopped when ClusterTestSuite.SuiteTimeout elapses.
A ClusterTestSuite has defined selectors:
matchNames
matchLabels
Currently, we run all the tests. Implement running only the tests defined by the selectors.
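A sketch of a suite using both selector kinds; the exact field shapes are assumed from the manifest examples appearing later in this document:

```yaml
apiVersion: testing.kyma-project.io/v1alpha1
kind: ClusterTestSuite
metadata:
  name: testsuite-selected
spec:
  count: 1
  concurrency: 1
  selectors:
    matchNames:                  # run these specific TestDefinitions
    - name: test-ls
      namespace: default
    matchLabelExpressions:       # plus any TestDefinition matching these labels
    - {key: component/service-catalog, operator: In, values: ["true"]}
```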
Description
In Kyma we have some issues with the execution of our tests which apply to every test pod.
When we start a test, we always need to add testing-addons
to our cluster; otherwise, the tests will fail. The number of things our tests require can grow over time, so it is better to implement this here rather than in many different places later, which would be much harder to maintain.
All of our test pods contain a sidecar which blocks the Pod's connections to the API server until the sidecar is ready. This may fail tests that do not wait for it.
The above issues can be addressed in Octopus by implementing logic which is executed once before the test suite execution AND logic which is executed before each test.
It should be configurable on the TestSuite level.
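One possible shape for such configuration (the beforeSuite and beforeEach fields are purely hypothetical, not an existing Octopus API):

```yaml
apiVersion: testing.kyma-project.io/v1alpha1
kind: ClusterTestSuite
metadata:
  name: testsuite-with-hooks
spec:
  count: 1
  beforeSuite:                  # hypothetical: run once before any test starts
  - name: install-testing-addons
    image: alpine:latest
    command: ["sh", "-c", "echo install testing-addons here"]
  beforeEach:                   # hypothetical: run before every single test
  - name: wait-for-sidecar
    image: alpine:latest
    command: ["sh", "-c", "echo wait for the sidecar to become ready"]
```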
Reasons
Many tests share common issues. This implementation would let us avoid many of them, and a lot of boilerplate code, in the future.
Attachments
Description
When I triggered tests for the Kyma 1.3-RC0 release, the kyma-gke-integration job failed. It timed out at the ClusterTestSuite step. I see in the logs that the last test, "core-core-ui-acceptance", is still in progress, but the "oct-tp-testsuite-all-2019-07-04-10-32-test-core-core-ui-acceptance-0" pod is completed.
Expected result
The test should be terminated when its pod is "Completed".
Actual result
The test is pending, but the corresponding pod is "Completed".
Steps to reproduce
It also happened in the kyma-integration job: https://storage.googleapis.com/kyma-prow-logs/pr-logs/pull/kyma-project_kyma/4746/pre-rel13-kyma-integration/1146721314044645376/build-log.txt
Kyma 1.3-RC0 PR: kyma-project/kyma#4746
Troubleshooting
1146720810015133697-pre-rel13-kyma-gke-integration.log
testing-octopus-0.log
testsuite-all-2019-07-04-10-32.log
pods.log
Hi,
We are looking into something that is easier to maintain than Helm tests, and we found this project.
First of all, would you say go for it, or is this still too 'alpha'?
Our issue is that in our EKS cluster, running a test suite executes all test definitions regardless of the selectors.
I tried matching on labels, name, and namespace, specifying label selectors with the simple key=value notation as well as the {key: in: values: []} form,
but whenever I apply a test suite with kubectl and then check the results with kubectl get cts -oyaml,
I see all the test definitions running.
kc_test get cts testsuite-sre-go-grpc-hibob-labels -ojson | jq -r '.status | .results | .[] | {name, status}'
{
  "name": "another-test-in-another-ns",
  "status": "Running"
}
{
  "name": "test-definition-go-grpc-hibob",
  "status": "Succeeded"
}
Here you can see that test-definition-go-grpc-hibob succeeded (it is just pwd, so it is fast) and the real test is still running, but with a different name, namespace, and labels.
We deploy our test definitions in the namespaces of the different domains, and the test suite in a cicd namespace used for deployments and tests.
apiVersion: testing.kyma-project.io/v1alpha1
kind: TestDefinition
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
    testsuite.group: sre
    testsuite.project: go-grpc-hibob
  name: test-definition-go-grpc-hibob
spec:
  template:
    spec:
      containers:
      - name: test-go-grpc-hibob
        image: alpine:latest
        command:
        - "pwd"
apiVersion: testing.kyma-project.io/v1alpha1
kind: ClusterTestSuite
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
  name: testsuite-sre-go-grpc-hibob-labels
spec:
  count: 1
  maxRetries: 0
  concurrency: 3
  selectors:
    matchLabelExpressions:
    - {key: testsuite.group, operator: In, values: [sre]}
    - {key: testsuite.project, operator: In, values: [go-grpc-hibob]}
apiVersion: testing.kyma-project.io
kind: ClusterTestSuite
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
  name: testsuite-sre-go-grpc-hibob-names
spec:
  count: 1
  maxRetries: 0
  concurrency: 1
  selectors:
    matchNames:
    - name: test-definition-go-grpc-hibob
    - namespace: sre
Description
Improvements in how Octopus schedules tests.
AC:
A ClusterTestSuite can define a concurrency level to run many tests at the same time. So far this field is ignored.
Description
Provide a mechanism for a delay between test retries. A fixed delay between retries should be configurable per ClusterTestSuite.
Reasons
Sometimes the infrastructure has problems, and retrying right away is not the best option. We should give users an option to make their tests more resilient.
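A possible shape for this setting (retryDelay is a hypothetical field name, not part of the current CRD):

```yaml
apiVersion: testing.kyma-project.io/v1alpha1
kind: ClusterTestSuite
metadata:
  name: testsuite-resilient
spec:
  count: 1
  maxRetries: 3
  retryDelay: 30s   # hypothetical: fixed delay between consecutive retries
```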
Attachments
Description
If the testsuite-controller fails at reconciling a TestDefinition (due to an error in the TestDefinition, e.g. a missing service account), the execution of the entire test suite gets stuck. Instead of constantly retrying to create the test pod, the controller should just mark the single test as failed (potentially with a descriptive error) and continue running the test suite.
Another approach would be to immediately fail the entire test suite.
Reasons
When running in CI, you cannot see this issue at the moment, and the test execution will just time out (if timeouts are implemented correctly in the CI).
Attachments
{
"level":"error",
"ts":1599740144.8231885,
"logger":"controller-runtime.controller",
"msg":"Reconciler error",
"controller":"testsuite-controller",
"request":"/testsuite-all",
"error":"while scheduling next testing pod for suite [testsuite-all]: while creating testing pod for suite [testsuite-all] and test definition [name: knative-serving, namespace: knative-serving]: pods \"oct-tp-testsuite-all-knative-serving-0\" is forbidden: error looking up service account knative-serving/knative-serving-tests: serviceaccount \"knative-serving-tests\" not found","errorVerbose":"pods \"oct-tp-testsuite-all-knative-serving-0\" is forbidden: error looking up service account knative-serving/knative-serving-tests: serviceaccount \"knative-serving-tests\" not found\nwhile creating testing pod for suite [testsuite-all] and test definition [name: knative-serving, namespace: knative-serving]\ngithub.com/kyma-incubator/octopus/pkg/scheduler.(*Service).startPod\n\t/go/src/github.com/kyma-incubator/octopus/pkg/scheduler/scheduler.go:165\ngithub.com/kyma-incubator/octopus/pkg/scheduler.(*Service).TrySchedule\n\t/go/src/github.com/kyma-incubator/octopus/pkg/scheduler/scheduler.go:61\ngithub.com/kyma-incubator/octopus/pkg/controller/testsuite.(*ReconcileTestSuite).Reconcile\n\t/go/src/github.com/kyma-incubator/octopus/pkg/controller/testsuite/testsuite_controller.go:171\ngithub.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/github.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:216\ngithub.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:192\ngithub.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/src/github.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:171\ngithub.com/kyma-incubator/octopus/vendor/k8s.io/apimachi
nery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\ngithub.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\ngithub.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1337\nwhile scheduling next testing pod for suite [testsuite-all]\ngithub.com/kyma-incubator/octopus/pkg/controller/testsuite.(*ReconcileTestSuite).Reconcile\n\t/go/src/github.com/kyma-incubator/octopus/pkg/controller/testsuite/testsuite_controller.go:173\ngithub.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/github.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:216\ngithub.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:192\ngithub.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/src/github.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:171\ngithub.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\ngithub.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\ngithub.com/kyma-incubator/octopus/vendor/k8s.io/apimachi
nery/pkg/util/wait.Until\n\t/go/src/github.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1337","stacktrace":"github.com/kyma-incubator/octopus/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/kyma-incubator/octopus/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/github.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:218\ngithub.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:192\ngithub.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/src/github.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:171\ngithub.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\ngithub.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\ngithub.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}
Description
Sometimes tests are flaky. In this case they usually get disabled, which means they do not get executed at all.
We should allow marking a test as optional.
An optional test would be executed, and even retried, according to the TestSuite definition, but its result would not affect the TestSuite's result.
Reasons
A test can fail for different reasons.
Here it would be useful to mark a test as optional. This way we could retrieve the test logs as if the test were enabled, but also not break builds, as it would not fail the test suite.
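A possible shape for this flag (the optional field is hypothetical, not part of the current TestDefinition spec):

```yaml
apiVersion: testing.kyma-project.io/v1alpha1
kind: TestDefinition
metadata:
  name: test-flaky-network
spec:
  optional: true    # hypothetical: executed and retried, but the result
                    # does not affect the ClusterTestSuite result
  template:
    spec:
      containers:
      - name: test
        image: alpine:latest
        command: ["ls"]
```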
Attachments