kyma-incubator / octopus
The test runner for acceptance tests
License: Apache License 2.0
apiVersion: testing.kyma-project.io/v1alpha1
kind: TestDefinition
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
    dependency/kubeless: "true"
    component/service-catalog: "true" # the user is free to define labels
  name: test-ls
spec:
  skip: false
  disableConcurrency: false
  timeout: 3m
  template:
    spec:
      serviceAccountName: doesnotexist
      containers:
      - name: test
        image: alpine:latest
        command:
        - "ls"
A ClusterTestSuite that tries to run such a TestDefinition stays in the Running phase indefinitely.
apiVersion: v1
items:
- apiVersion: testing.kyma-project.io/v1alpha1
  kind: ClusterTestSuite
  metadata:
    creationTimestamp: 2019-04-04T09:51:57Z
    generation: 1
    labels:
      controller-tools.k8s.io: "1.0"
    name: testsuite-all
    resourceVersion: "539"
    selfLink: /apis/testing.kyma-project.io/v1alpha1/clustertestsuites/testsuite-all
    uid: 48d78903-56bf-11e9-afde-cec002874220
  spec:
    concurrency: 1
    count: 1
    maxRetries: 0
  status:
    conditions:
    - status: "True"
      type: Running
    results:
    - executions: []
      name: test-ls
      namespace: default
      status: NotYetScheduled
    startTime: 2019-04-04T09:52:15Z
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
The user does not get any feedback about what is wrong. The necessary information is reported only in the Octopus logs:
{"level":"error","ts":1554371706.9390872,"logger":"kubebuilder.controller","msg":"Reconciler error","controller":"testsuite-controller","request":"/testsuite-all","error":"while scheduling next testing pod for suite [testsuite-all]: while creating testing pod for suite [testsuite-all] and test definition [name: test-ls, namespace: default]: pods \"oct-tp-testsuite-all-test-ls-0\" is forbidden: error looking up service account default/doesnotexist: serviceaccount \"doesnotexist\" not found","errorVerbose":"pods \"oct-tp-testsuite-all-test-ls-0\" is forbidden: error looking up service account default/doesnotexist: serviceaccount \"doesnotexist\" not found\nwhile creating testing pod for suite [testsuite-all] and test definition [name: test-ls, namespace: default]
MaxRetries allows you to retry a test in case of failure. Currently, this field is ignored by Octopus.
AC:
I should be able to set, on the TestDefinition level, that a test should be skipped. Such a test should be marked as skipped in the status of a ClusterTestSuite.
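A minimal sketch of a TestDefinition using the existing skip field (the exact status name the suite would report, e.g. Skipped, is an assumption):

```yaml
apiVersion: testing.kyma-project.io/v1alpha1
kind: TestDefinition
metadata:
  name: test-flaky
spec:
  skip: true          # this test is not executed by any suite
  template:
    spec:
      containers:
      - name: test
        image: alpine:latest
        command: ["ls"]
```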
Description
A test definition might require specific resources to exist in order to run, and the same requirement can apply to multiple test definitions. As an example, take a look at the testing.sh
of Kyma, where the suite execution requires a Helm Broker testing bundle to be configured upfront. That configuration is not part of the default Kyma installation, but many test definitions require it, so it should be configured upfront as part of the suite execution.
Another example is the Dex static password requirement defined in the same script in Kyma. Here, the test needs to be sure that a specific condition is established; otherwise, the test should be skipped.
Reasons
To remove Kyma's testing.sh
and use plain Octopus tooling instead, all such workarounds need to be solved. The two examples mentioned are the specific problems to be solved in a generic way. However, it is also a general problem, as it is a common pattern that tests have preconditions which are either established in a setup phase or lead to the test being skipped.
Details
A solution might look like this:
A test definition expresses a condition through labels on the TestDefinition resource, for example:
labels:
  requires-testing-bundle: "true"
  requires-test-user: "true"
When creating a TestSuite, requires labels are by default not satisfied, and the TestSuite will not execute any test carrying a requires label; instead, it sets the test's status to skipped with a reason text.
To satisfy a requires label, we need a new CRD, TestRequirement, specifying the label condition and a Job (or simply an in-line script?) to evaluate the condition. Optionally, it can define a setUp and tearDown Job (or simply an in-line script?) to establish the condition.
kind: TestRequirement
metadata:
  name: testing-bundle
spec:
  evaluationExpression: "kubectl get configmap testing-bundles-repos"
  setupExpression: "kubectl create configmap testing-bundles-repos --from-literal=...; kubectl label configmap testing-bundles-repos helm-broker-repo=true"
  tearDownExpression: "kubectl delete configmap testing-bundles-repos"
kind: TestRequirement
metadata:
  name: testing-user
spec:
  evaluationExpression: '${kc} get cm dex-config -n kyma-system -ojsonpath="{.data}" | grep --silent "#__STATIC_PASSWORDS__"'
The TestSuite finds all TestRequirement resources in the cluster and can then evaluate the requires labels. With that, test requirements might already be satisfied by the resources present on the cluster. Additionally, a TestRequirement can have a setup which turns the evaluation into a success after execution. On TestSuite deletion, the tearDown expressions should be executed.
Description
Octopus is using controller-runtime version 0.1.1. Since then, many breaking changes were introduced, and we stumbled into them in Kyma 1.5.0.
It is now not possible to import both the Kyma and the Octopus APIs and compile them together.
We need to upgrade the controller-runtime in Octopus to at least version 0.2.0.
Additionally, there seem to be issues between dep and Go modules on those versions.
Reasons
With Kubernetes 1.16, Octopus will stop working altogether.
Description
The pod definition generated by Octopus is missing/ignoring a vital option in the manifests. A running pod does not have shareProcessNamespace: true
set, which can break tests that use an Istio sidecar. Without this option it is impossible to kill the proxy process after a finished test, resulting in a never-ending pod.
Expected result
The pod option is present in the pod definition
Actual result
The pod option is missing in the running pod
Steps to reproduce
On a cluster with tests running:
kubectl get pod -n kyma-system oct-tp-testsuite-all-2019-05-17-08-57-test-core-kubeless-int-0 -o yaml | grep "shareProcessNamespace"
This does not find anything, meaning that the option is not set in the pod spec, even though we can see in the code that it is present.
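For reference, a pod manifest with the option set looks like this (a minimal hand-written sketch, not the exact Octopus-generated pod):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-with-shared-pid
spec:
  shareProcessNamespace: true   # containers share one PID namespace,
                                # so the test can signal the istio-proxy process
  containers:
  - name: test
    image: alpine:latest
    command: ["ls"]
```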
Troubleshooting
Currently, we log every message on the Info level. Some logs are rather debugging information, for example from scheduler.go:
Cannot get next test to schedule, max concurrency reached
No tests to execute right now
Setting the verbosity of logs is not a trivial task, see the discussion here: kubernetes-sigs/controller-runtime#167
Our public documentation under https://kyma-project.io/docs/#details-testing-kyma-tests-execution points to .../installation/scripts/testing.sh as the way to run all tests:
"To run all tests, use the testing.sh script located in the /installation/scripts/ directory. Internally, the ClusterTestSuite resource is defined. It fetches all TestDefinitions and executes them."
Whereas the script says:
echo "The testing.sh script is deprecated and will be removed. Use Kyma CLI instead."
We should update the documentation to point to Kyma CLI as the default way of running tests.
Description
Every test within a TestSuite could run in a namespace created exclusively for this TestSuite.
Reasons
Prerequisite
For this to make any difference, all Kyma tests must accept the namespace as an argument. Currently, tests usually have the namespace hardcoded.
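One way to sketch this is passing the namespace to the test container through an environment variable (the TEST_NAMESPACE name and the injection mechanism are hypothetical, not an existing Octopus feature):

```yaml
apiVersion: testing.kyma-project.io/v1alpha1
kind: TestDefinition
metadata:
  name: test-parametrized-ns
spec:
  template:
    spec:
      containers:
      - name: test
        image: alpine:latest
        # the test reads its target namespace instead of hardcoding it
        command: ["sh", "-c", "echo testing in namespace $TEST_NAMESPACE"]
        env:
        - name: TEST_NAMESPACE    # hypothetical: injected by Octopus per suite
          value: default
```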
In the CRDs we have two parameters that specify a timeout after which the execution of a test or a suite should be stopped:
ClusterTestSuite.SuiteTimeout
TestDefinition.Timeout
The purpose of this issue is to implement the logic around timeouts.
AC:
A test execution is stopped when TestDefinition.Timeout elapses. The suite can be continued.
The whole suite execution is stopped when ClusterTestSuite.SuiteTimeout elapses.
A ClusterTestSuite has defined selectors:
matchNames
matchLabels
Currently, we run all the tests. Implement running only the tests defined by the selectors.
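A sketch of a suite using both selector kinds; the exact field shapes are assumed from the manifest examples appearing later in this document:

```yaml
apiVersion: testing.kyma-project.io/v1alpha1
kind: ClusterTestSuite
metadata:
  name: testsuite-selected
spec:
  count: 1
  concurrency: 1
  selectors:
    matchNames:                  # run these specific TestDefinitions
    - name: test-ls
      namespace: default
    matchLabelExpressions:       # plus any TestDefinition matching these labels
    - {key: component/service-catalog, operator: In, values: ["true"]}
```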
Description
In Kyma we have some issues with the execution of our tests which apply to every test pod.
When we start a test, we always need to add testing-addons
to our cluster; otherwise, the tests will fail. The number of things our tests require can grow over time, so it is better to implement this here rather than in many different places later, which would be much harder to maintain.
All of our test pods contain a sidecar which blocks the Pod's connections to the API server until the sidecar is ready. This may fail tests that do not wait for it.
The above issues can be addressed in Octopus by implementing logic which is executed once before the test suite execution AND logic which is executed before each test.
It should be configurable on the TestSuite level.
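One possible shape for such configuration (the beforeSuite and beforeEach fields are purely hypothetical, not an existing Octopus API):

```yaml
apiVersion: testing.kyma-project.io/v1alpha1
kind: ClusterTestSuite
metadata:
  name: testsuite-with-hooks
spec:
  count: 1
  beforeSuite:                  # hypothetical: run once before any test starts
  - name: install-testing-addons
    image: alpine:latest
    command: ["sh", "-c", "echo install testing-addons here"]
  beforeEach:                   # hypothetical: run before every single test
  - name: wait-for-sidecar
    image: alpine:latest
    command: ["sh", "-c", "echo wait for the sidecar to become ready"]
```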
Reasons
Many tests share common issues. This implementation would let us avoid many of them, and a lot of boilerplate code, in the future.
Attachments
Description
When I triggered tests for the Kyma 1.3-RC0 release, the kyma-gke-integration job failed. It timed out at the ClusterTestSuite step. I see in the logs that the last test, "core-core-ui-acceptance", is still in progress, but the "oct-tp-testsuite-all-2019-07-04-10-32-test-core-core-ui-acceptance-0" pod is completed.
Expected result
The test should be terminated when its pod is "Completed".
Actual result
The test is pending, but the corresponding pod is "Completed".
Steps to reproduce
It also happened in the kyma-integration job: https://storage.googleapis.com/kyma-prow-logs/pr-logs/pull/kyma-project_kyma/4746/pre-rel13-kyma-integration/1146721314044645376/build-log.txt
Kyma 1.3-RC0 PR: kyma-project/kyma#4746
Troubleshooting
1146720810015133697-pre-rel13-kyma-gke-integration.log
testing-octopus-0.log
testsuite-all-2019-07-04-10-32.log
pods.log
Hi,
We are looking into something that is easier to maintain than Helm tests, and we found this project.
First of all, would you say go for it, or is this still too 'alpha'?
Our issue is that in our EKS cluster, running a test suite executes all test definitions regardless of the selectors.
I tried matching on labels, name, and namespace, specifying label selectors with the simple key=value notation as well as the {key: in: values: []} form,
but whenever I apply a test suite with kubectl and then check the results with kubectl get cts -oyaml,
I see all the test definitions running.
kc_test get cts testsuite-sre-go-grpc-hibob-labels -ojson | jq -r '.status | .results | .[] | {name, status}'
{
  "name": "another-test-in-another-ns",
  "status": "Running"
}
{
  "name": "test-definition-go-grpc-hibob",
  "status": "Succeeded"
}
Here you can see that test-definition-go-grpc-hibob succeeded (it is just pwd, so it is fast) and the real test is still running, but with a different name, namespace, and labels.
We deploy our test definitions in the namespaces of the different domains, and the test suite in a cicd namespace used for deployments and tests.
apiVersion: testing.kyma-project.io/v1alpha1
kind: TestDefinition
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
    testsuite.group: sre
    testsuite.project: go-grpc-hibob
  name: test-definition-go-grpc-hibob
spec:
  template:
    spec:
      containers:
      - name: test-go-grpc-hibob
        image: alpine:latest
        command:
        - "pwd"
apiVersion: testing.kyma-project.io/v1alpha1
kind: ClusterTestSuite
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
  name: testsuite-sre-go-grpc-hibob-labels
spec:
  count: 1
  maxRetries: 0
  concurrency: 3
  selectors:
    matchLabelExpressions:
    - {key: testsuite.group, operator: In, values: [sre]}
    - {key: testsuite.project, operator: In, values: [go-grpc-hibob]}
apiVersion: testing.kyma-project.io
kind: ClusterTestSuite
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
  name: testsuite-sre-go-grpc-hibob-names
spec:
  count: 1
  maxRetries: 0
  concurrency: 1
  selectors:
    matchNames:
    - name: test-definition-go-grpc-hibob
    - namespace: sre
Description
Improvements in how Octopus schedules tests.
AC:
A ClusterTestSuite can define a concurrency level to run many tests at the same time. So far this field is ignored.
Description
Provide a mechanism for a delay between test retries. A fixed delay between retries should be configurable per ClusterTestSuite.
Reasons
Sometimes the infrastructure has problems, and retrying right away is not the best option. We should give users an option to make their tests more resilient.
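A possible shape for this setting (retryDelay is a hypothetical field name, not part of the current CRD):

```yaml
apiVersion: testing.kyma-project.io/v1alpha1
kind: ClusterTestSuite
metadata:
  name: testsuite-resilient
spec:
  count: 1
  maxRetries: 3
  retryDelay: 30s   # hypothetical: fixed delay between consecutive retries
```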
Attachments
Description
If the testsuite-controller fails at reconciling a TestDefinition (due to an error in the TestDefinition, e.g. a missing service account), the execution of the entire test suite gets stuck. Instead of constantly retrying to create the test pod, the controller should just mark the single test as failed (potentially with a descriptive error) and continue running the test suite.
Another approach would be to immediately fail the entire test suite.
Reasons
When running in CI, you cannot see this issue at the moment, and the test execution will just time out (if timeouts are implemented correctly in the CI).
Attachments
{
"level":"error",
"ts":1599740144.8231885,
"logger":"controller-runtime.controller",
"msg":"Reconciler error",
"controller":"testsuite-controller",
"request":"/testsuite-all",
"error":"while scheduling next testing pod for suite [testsuite-all]: while creating testing pod for suite [testsuite-all] and test definition [name: knative-serving, namespace: knative-serving]: pods \"oct-tp-testsuite-all-knative-serving-0\" is forbidden: error looking up service account knative-serving/knative-serving-tests: serviceaccount \"knative-serving-tests\" not found","errorVerbose":"pods \"oct-tp-testsuite-all-knative-serving-0\" is forbidden: error looking up service account knative-serving/knative-serving-tests: serviceaccount \"knative-serving-tests\" not found\nwhile creating testing pod for suite [testsuite-all] and test definition [name: knative-serving, namespace: knative-serving]\ngithub.com/kyma-incubator/octopus/pkg/scheduler.(*Service).startPod\n\t/go/src/github.com/kyma-incubator/octopus/pkg/scheduler/scheduler.go:165\ngithub.com/kyma-incubator/octopus/pkg/scheduler.(*Service).TrySchedule\n\t/go/src/github.com/kyma-incubator/octopus/pkg/scheduler/scheduler.go:61\ngithub.com/kyma-incubator/octopus/pkg/controller/testsuite.(*ReconcileTestSuite).Reconcile\n\t/go/src/github.com/kyma-incubator/octopus/pkg/controller/testsuite/testsuite_controller.go:171\ngithub.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/github.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:216\ngithub.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:192\ngithub.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/src/github.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:171\ngithub.com/kyma-incubator/octopus/vendor/k8s.io/apimachi
nery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\ngithub.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\ngithub.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1337\nwhile scheduling next testing pod for suite [testsuite-all]\ngithub.com/kyma-incubator/octopus/pkg/controller/testsuite.(*ReconcileTestSuite).Reconcile\n\t/go/src/github.com/kyma-incubator/octopus/pkg/controller/testsuite/testsuite_controller.go:173\ngithub.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/github.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:216\ngithub.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:192\ngithub.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/src/github.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:171\ngithub.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\ngithub.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\ngithub.com/kyma-incubator/octopus/vendor/k8s.io/apimachi
nery/pkg/util/wait.Until\n\t/go/src/github.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1337","stacktrace":"github.com/kyma-incubator/octopus/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/kyma-incubator/octopus/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/github.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:218\ngithub.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:192\ngithub.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/src/github.com/kyma-incubator/octopus/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:171\ngithub.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\ngithub.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\ngithub.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/kyma-incubator/octopus/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}
Description
Sometimes tests are flaky. In this case they usually get disabled, which means they do not get executed at all.
We should allow marking a test as optional.
An optional test would be executed, and even retried, according to the TestSuite definition, but its result would not affect the TestSuite's result.
Reasons
A test can fail for different reasons.
Here it would be useful to mark a test as optional. This way we could retrieve the test logs as if the test were enabled, but also not break builds, as it would not fail the test suite.
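A possible shape for this flag (the optional field is hypothetical, not part of the current TestDefinition spec):

```yaml
apiVersion: testing.kyma-project.io/v1alpha1
kind: TestDefinition
metadata:
  name: test-flaky-network
spec:
  optional: true    # hypothetical: executed and retried, but the result
                    # does not affect the ClusterTestSuite result
  template:
    spec:
      containers:
      - name: test
        image: alpine:latest
        command: ["ls"]
```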
Attachments