
file-integrity-operator's Issues

feature request: add `s390x` / multi-arch support

I would like to request support for s390x architecture.
Or maybe there is already a roadmap for releasing the official operator on supported architectures such as s390x and ppc64le that you can share?

I just saw that the compliance-operator, for example, has had some multi-arch PRs merged recently, so maybe something is already in the works here as well?

Looking forward to some feedback :) Thank you!

[OCP 4.8] FIO ServiceMonitor not accessible by the openshift-user-workload-monitoring Prometheus instance

file-integrity-operator.v0.1.18 is running on OCP 4.8.5 with the default configuration for openshift-user-workload-monitoring.
In the logs of the prometheus-operator in the openshift-user-workload-monitoring namespace I can see entries like this:

level=warn ts=2021-08-23T21:16:00.305307242Z caller=operator.go:1703 component=prometheusoperator msg="skipping servicemonitor" error="it accesses file system via bearer token file which Prometheus specification prohibits" servicemonitor=openshift-file-integrity/metrics namespace=openshift-user-workload-monitoring prometheus=user-workload

Are there already plans to change this default behaviour, or to make it configurable?
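
For context, the warning is triggered by the ServiceMonitor's use of bearerTokenFile: the user-workload Prometheus is configured to reject ServiceMonitors that read credentials from the file system, so the token would need to be referenced from a Secret instead (e.g. via bearerTokenSecret). A minimal sketch of a shape it would accept is below; the port name, selector label, and Secret name are illustrative assumptions, not the operator's actual values.

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: metrics
  namespace: openshift-file-integrity
spec:
  endpoints:
  - port: metrics                 # assumed name of the port on the metrics Service
    interval: 30s
    bearerTokenSecret:            # token comes from a Secret instead of a file on disk
      name: metrics-token         # hypothetical Secret holding the scrape token
      key: token
  selector:
    matchLabels:
      name: file-integrity-operator   # assumed label on the metrics Service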

The setup-envtest dependency isn't tracked using tools.go

From a build perspective, it would be handy to have this dependency tracked using go tooling so that we (as developers, or build systems) don't need to manage it separately.

One possible way to do this is to use tools.go, similar to what the node-observability-operator does.

This was originally lumped into issue #245, but I decided to break it into its own issue because controller-tools pulls in a lot of transitive dependencies that affect FIO dependencies. I thought it would be good to sort out these dependencies individually so we can keep patches as small as possible.

https://github.com/openshift/node-observability-operator/blob/main/tools/tools.go#L12

End-to-end CI is broken

CI has been failing consistently over the last week with errors like the following:

=== RUN   TestFileIntegrityLogAndReinitDatabase
&Namespace{ObjectMeta:{osdk-e2e-46b19674-f9e8-423b-981d-0920db85c117      0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[pod-security.kubernetes.io/enforce:privileged security.openshift.io/scc.podSecurityLabelSync:false] map[] [] [] []},Spec:NamespaceSpec{Finalizers:[],},Status:NamespaceStatus{Phase:,Conditions:[]NamespaceCondition{},},}
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-46b19674-f9e8-423b-981d-0920db85c117/file-integrity-daemon) created
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-46b19674-f9e8-423b-981d-0920db85c117/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-46b19674-f9e8-423b-981d-0920db85c117/file-integrity-daemon) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-46b19674-f9e8-423b-981d-0920db85c117/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-46b19674-f9e8-423b-981d-0920db85c117/leader-election-role) created
    client.go:47: resource type ClusterRole with namespace/name (/file-integrity-operator) created
    client.go:47: resource type ClusterRole with namespace/name (/file-integrity-operator-metrics) created
    client.go:47: resource type ClusterRole with namespace/name (/fileintegrity-editor-role) created
    client.go:47: resource type ClusterRole with namespace/name (/fileintegrity-viewer-role) created
    client.go:47: resource type RoleBinding with namespace/name (osdk-e2e-46b19674-f9e8-423b-981d-0920db85c117/file-integrity-daemon) created
    client.go:47: resource type RoleBinding with namespace/name (osdk-e2e-46b19674-f9e8-423b-981d-0920db85c117/file-integrity-operator) created
    client.go:47: resource type RoleBinding with namespace/name (osdk-e2e-46b19674-f9e8-423b-981d-0920db85c117/file-integrity-operator-metrics) created
    client.go:47: resource type RoleBinding with namespace/name (osdk-e2e-46b19674-f9e8-423b-981d-0920db85c117/leader-election-rolebinding) created
    client.go:47: resource type ClusterRoleBinding with namespace/name (/file-integrity-operator) created
    client.go:47: resource type Deployment with namespace/name (osdk-e2e-46b19674-f9e8-423b-981d-0920db85c117/file-integrity-operator) created
    helpers.go:278: Initialized cluster resources
    wait_util.go:52: Waiting for full availability of file-integrity-operator deployment (0/1)
    wait_util.go:52: Waiting for full availability of file-integrity-operator deployment (0/1)
    wait_util.go:59: Deployment available (1/1)
    client.go:47: resource type  with namespace/name (osdk-e2e-46b19674-f9e8-423b-981d-0920db85c117/e2e-test-reinitdb) created
    helpers.go:389: Created FileIntegrity: &{TypeMeta:{Kind: APIVersion:} ObjectMeta:{Name:e2e-test-reinitdb GenerateName: Namespace:osdk-e2e-46b19674-f9e8-423b-981d-0920db85c117 SelfLink: UID:7b66a89d-de9c-4797-960a-6f2fb04adcfb ResourceVersion:30496 Generation:1 CreationTimestamp:2023-07-10 17:13:08 +0000 UTC DeletionTimestamp:<nil> DeletionGracePeriodSeconds:<nil> Labels:map[] Annotations:map[] OwnerReferences:[] Finalizers:[] ManagedFields:[{Manager:Go-http-client Operation:Update APIVersion:fileintegrity.openshift.io/v1alpha1 Time:2023-07-10 17:13:08 +0000 UTC FieldsType:FieldsV1 FieldsV1:{"f:spec":{".":{},"f:config":{".":{},"f:gracePeriod":{},"f:maxBackups":{}},"f:debug":{},"f:nodeSelector":{".":{},"f:node-role.kubernetes.io/worker":{}},"f:tolerations":{}}} Subresource:}]} Spec:{NodeSelector:map[node-role.kubernetes.io/worker:] Config:{Name: Namespace: Key: GracePeriod:20 MaxBackups:5 InitialDelay:0} Debug:true Tolerations:[{Key:node-role.kubernetes.io/master Operator:Exists Value: Effect:NoSchedule TolerationSeconds:<nil>} {Key:node-role.kubernetes.io/infra Operator:Exists Value: Effect:NoSchedule TolerationSeconds:<nil>}]} Status:{Phase:}}
    helpers.go:906: FileIntegrity never reached expected phase (Active)
    helpers.go:424: Timed out waiting for scan status to go Active
--- FAIL: TestFileIntegrityLogAndReinitDatabase (1829.78s)

The traces are all the same since the end-to-end tests run serially: the first test fails, then cascades into other failures because it isn't cleaned up properly.

Duplicate CRDs

In v0.1.3, the CRDs are duplicated:

  customresourcedefinitions:
    owned:
    - description: FileIntegrity is the Schema for the fileintegrities API
      kind: FileIntegrity
      name: fileintegrities.fileintegrity.openshift.io
      version: v1alpha1
    - description: FileIntegrity is the Schema for the fileintegrities API
      kind: FileIntegrity
      name: fileintegrities.fileintegrity.openshift.io
      version: v1alpha1
    - description: FileIntegrityNodeStatus defines the status of a specific node
      kind: FileIntegrityNodeStatus
      name: fileintegritynodestatuses.fileintegrity.openshift.io
      version: v1alpha1
    - description: FileIntegrityNodeStatus defines the status of a specific node
      kind: FileIntegrityNodeStatus
      name: fileintegritynodestatuses.fileintegrity.openshift.io
      version: v1alpha1

The same issue is also present in v0.1.2.
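
For reference, the owned list should presumably contain a single entry per CRD:

  customresourcedefinitions:
    owned:
    - description: FileIntegrity is the Schema for the fileintegrities API
      kind: FileIntegrity
      name: fileintegrities.fileintegrity.openshift.io
      version: v1alpha1
    - description: FileIntegrityNodeStatus defines the status of a specific node
      kind: FileIntegrityNodeStatus
      name: fileintegritynodestatuses.fileintegrity.openshift.io
      version: v1alpha1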

Using *undeploy Makefile targets outputs Not Found errors

I was recently using the make catalog-deploy target to deploy changes into one of my clusters. When I needed to remove, or uninstall, the operator, I used make catalog-undeploy, which output the following:

$ make catalog-undeploy
<snip>
namespace "openshift-file-integrity" deleted
customresourcedefinition.apiextensions.k8s.io "fileintegrities.fileintegrity.openshift.io" deleted
customresourcedefinition.apiextensions.k8s.io "fileintegritynodestatuses.fileintegrity.openshift.io" deleted
serviceaccount "file-integrity-daemon" deleted
serviceaccount "file-integrity-operator" deleted
role.rbac.authorization.k8s.io "leader-election-role" deleted
clusterrole.rbac.authorization.k8s.io "file-integrity-operator-metrics" deleted
clusterrole.rbac.authorization.k8s.io "fileintegrity-editor-role" deleted
clusterrole.rbac.authorization.k8s.io "fileintegrity-viewer-role" deleted
rolebinding.rbac.authorization.k8s.io "file-integrity-daemon" deleted
rolebinding.rbac.authorization.k8s.io "leader-election-rolebinding" deleted
rolebinding.rbac.authorization.k8s.io "prometheus-k8s" deleted
clusterrolebinding.rbac.authorization.k8s.io "file-integrity-operator-metrics" deleted
deployment.apps "file-integrity-operator" deleted
Error from server (NotFound): error when deleting "STDIN": roles.rbac.authorization.k8s.io "file-integrity-daemon" not found
Error from server (NotFound): error when deleting "STDIN": roles.rbac.authorization.k8s.io "file-integrity-operator" not found
Error from server (NotFound): error when deleting "STDIN": clusterroles.rbac.authorization.k8s.io "file-integrity-operator" not found
Error from server (NotFound): error when deleting "STDIN": rolebindings.rbac.authorization.k8s.io "file-integrity-operator" not found
Error from server (NotFound): error when deleting "STDIN": clusterrolebindings.rbac.authorization.k8s.io "file-integrity-operator" not found
make: *** [Makefile:357: undeploy] Error 1

It appears some of the resources were cleaned up properly, but others weren't. This may be confusing to contributors (do I need to go clean things up manually?).

Aide database stored on filesystem

I question the usefulness of this tool when the database is stored on the same file system that is being checked. A motivated attacker could cover their tracks by altering the database directly. It should be stored somewhere an attacker cannot write to, such as off-cluster, or encrypted with keys that are stored off disk, for example in a Trusted Platform Module (TPM).

I think other file integrity checkers, such as linux-ima, can use a TPM, and that should be used for Secure Boot in CentOS/RHEL 9. That project is also under more active development than AIDE. We seem to be including a version from CentOS/RHEL 7, which is 10 years old, although the project has been updated as recently as last year.

e2e test node cleanup does not run

Our e2e tests show these events:
"message": "Error creating: pods \"aide-clean-\" is forbidden: error looking up service account openshift-file-integrity/file-integrity-operator: serviceaccount \"file-integrity-operator\" not found",
The service account is being cleaned up before the aide-clean deployment is created.

E2E testing fails due to failed node taints

We're experiencing an issue in CI where the E2E testing fails because a node can't be tainted, which manifests in the following error:

 === RUN   TestFileIntegrityTolerations
&Namespace{ObjectMeta:{osdk-e2e-6c36263d-255d-4a22-9659-ba12e7965054      0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[pod-security.kubernetes.io/enforce:privileged security.openshift.io/scc.podSecurityLabelSync:false] map[] [] [] []},Spec:NamespaceSpec{Finalizers:[],},Status:NamespaceStatus{Phase:,Conditions:[]NamespaceCondition{},},}
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-6c36263d-255d-4a22-9659-ba12e7965054/file-integrity-daemon) created
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-6c36263d-255d-4a22-9659-ba12e7965054/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-6c36263d-255d-4a22-9659-ba12e7965054/file-integrity-daemon) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-6c36263d-255d-4a22-9659-ba12e7965054/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-6c36263d-255d-4a22-9659-ba12e7965054/leader-election-role) created
    client.go:47: resource type ClusterRole with namespace/name (/file-integrity-operator) created
    client.go:47: resource type ClusterRole with namespace/name (/file-integrity-operator-metrics) created
    client.go:47: resource type ClusterRole with namespace/name (/fileintegrity-editor-role) created
    client.go:47: resource type ClusterRole with namespace/name (/fileintegrity-viewer-role) created
    client.go:47: resource type RoleBinding with namespace/name (osdk-e2e-6c36263d-255d-4a22-9659-ba12e7965054/file-integrity-daemon) created
    client.go:47: resource type RoleBinding with namespace/name (osdk-e2e-6c36263d-255d-4a22-9659-ba12e7965054/file-integrity-operator) created
    client.go:47: resource type RoleBinding with namespace/name (osdk-e2e-6c36263d-255d-4a22-9659-ba12e7965054/file-integrity-operator-metrics) created
    client.go:47: resource type RoleBinding with namespace/name (osdk-e2e-6c36263d-255d-4a22-9659-ba12e7965054/leader-election-rolebinding) created
    client.go:47: resource type ClusterRoleBinding with namespace/name (/file-integrity-operator) created
    client.go:47: resource type Deployment with namespace/name (osdk-e2e-6c36263d-255d-4a22-9659-ba12e7965054/file-integrity-operator) created
    helpers.go:282: Initialized cluster resources
    wait_util.go:59: Deployment available (1/1)
    helpers.go:1763: Tainting node: ip-10-0-104-211.ec2.internal
    helpers.go:710: Tainting node failed
--- FAIL: TestFileIntegrityTolerations (30.59s)
=== RUN   TestFileIntegrityLogCompress 

Here is an example patch that causes this issue: #426

Since the E2E tests run serially, this causes failures to cascade through the entire suite.

Add initialDelay option

When a user deploys FIO shortly after creating a new cluster, it becomes active while updates are still being rolled out to the cluster and the MachineConfigPools are still being updated. The resulting file changes trigger failed FIO checks, producing unwanted alerts on every new cluster. To address this, we will discuss implementing an initialDelay option in FIO (sketched below).
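
A rough sketch of how the option could surface on the FileIntegrity CR is below; the field name, its placement under spec.config, and seconds as the unit are assumptions for discussion, not a settled design.

apiVersion: fileintegrity.openshift.io/v1alpha1
kind: FileIntegrity
metadata:
  name: example-fileintegrity
  namespace: openshift-file-integrity
spec:
  nodeSelector:
    node-role.kubernetes.io/worker: ""
  config:
    gracePeriod: 900      # existing option: seconds to pause between AIDE checks
    initialDelay: 600     # proposed: wait this long after deployment before the first AIDE run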

The kustomize dependency isn't tracked using tools.go

From a build perspective, it would be handy to have this dependency tracked using go tooling so that we (as developers, or build systems) don't need to manage it separately.

One possible way to do this is to use tools.go, similar to what the node-observability-operator does.

This was originally lumped into issue #245, but I decided to break it into its own issue because kustomize pulls in a lot of transitive dependencies that affect FIO dependencies. I thought it would be good to sort out these dependencies individually so we can keep patches as small as possible.

https://github.com/openshift/node-observability-operator/blob/main/tools/tools.go#L12

license for the operator

It would be good if the license for the file-integrity-operator could be explicitly specified.
Under the hood, AIDE is licensed under GPLv2, and thus the operator's license needs to be aligned with it.

Unable to run end-to-end tests using operator installed from catalog source

In the process of debugging an upgrade issue from 0.1.27 to 0.1.28, we started working on a CI job that exercises upgrades. To use the OpenShift CI tooling developed to help test this case, we need to install the operator using catalog sources.

The basic flow for the test is to:

1.) Install $OPERATOR_VERSION (e.g., 0.1.27) on the cluster using catalog sources (see the sketch after this list)
2.) Upgrade the operator to $OPERATOR_VERSION + 1 (0.1.28)
3.) Run make e2e against the upgraded operator
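
For step 1, the install roughly amounts to creating a CatalogSource for the index image plus a Subscription pinned to the starting version. A sketch, where the channel name and starting CSV are assumptions:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: file-integrity-operator
  namespace: openshift-file-integrity
spec:
  channel: release-0.1                           # assumed channel name in the catalog
  name: file-integrity-operator
  source: file-integrity-operator                # CatalogSource created from the index image
  sourceNamespace: openshift-marketplace
  installPlanApproval: Automatic
  startingCSV: file-integrity-operator.v0.1.27   # assumed CSV name for $OPERATOR_VERSION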

In the process of developing this job, I attempted to run the end-to-end tests against FIO installed from a catalog source and saw the following failures:

=== RUN   TestFileIntegrityBadConfig
I0719 16:23:21.705052   80310 request.go:665] Waited for 2.650165646s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/authorization.k8s.io/v1?timeout=32s
I0719 16:23:31.705114   80310 request.go:665] Waited for 12.650244158s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/events.k8s.io/v1?timeout=32s
I0719 16:23:41.904659   80310 request.go:665] Waited for 6.557941133s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/authentication.k8s.io/v1?timeout=32s
I0719 16:23:52.903704   80310 request.go:665] Waited for 1.157020028s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/events.k8s.io/v1beta1?timeout=32s
I0719 16:24:03.102727   80310 request.go:665] Waited for 11.355830053s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/storage.k8s.io/v1beta1?timeout=32s
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-2ac8be42-1c4c-4d23-8265-36a0b870dfc7/file-integrity-daemon) created
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-2ac8be42-1c4c-4d23-8265-36a0b870dfc7/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-2ac8be42-1c4c-4d23-8265-36a0b870dfc7/file-integrity-daemon) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-2ac8be42-1c4c-4d23-8265-36a0b870dfc7/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-2ac8be42-1c4c-4d23-8265-36a0b870dfc7/leader-election-role) created
    helpers.go:265: failed to initialize cluster resources: clusterroles.rbac.authorization.k8s.io "file-integrity-operator" already exists
--- FAIL: TestFileIntegrityBadConfig (49.22s)
=== RUN   TestFileIntegrityTolerations
I0719 16:24:13.301850   80310 request.go:665] Waited for 5.02631644s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/whereabouts.cni.cncf.io/v1alpha1?timeout=32s
I0719 16:24:23.501172   80310 request.go:665] Waited for 15.225555488s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/route.openshift.io/v1?timeout=32s
I0719 16:24:33.501605   80310 request.go:665] Waited for 8.958257296s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/discovery.k8s.io/v1beta1?timeout=32s
I0719 16:24:43.700779   80310 request.go:665] Waited for 2.757043945s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/console.openshift.io/v1?timeout=32s
I0719 16:24:53.900198   80310 request.go:665] Waited for 12.956345588s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/route.openshift.io/v1?timeout=32s
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-3fba73bd-843a-4e44-a243-bcfc3c6b20bc/file-integrity-daemon) created
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-3fba73bd-843a-4e44-a243-bcfc3c6b20bc/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-3fba73bd-843a-4e44-a243-bcfc3c6b20bc/file-integrity-daemon) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-3fba73bd-843a-4e44-a243-bcfc3c6b20bc/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-3fba73bd-843a-4e44-a243-bcfc3c6b20bc/leader-election-role) created
    helpers.go:265: failed to initialize cluster resources: clusterroles.rbac.authorization.k8s.io "file-integrity-operator" already exists
--- FAIL: TestFileIntegrityTolerations (49.23s)
=== RUN   TestFileIntegrityLogCompress
I0719 16:25:03.900496   80310 request.go:665] Waited for 6.393417622s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/flowcontrol.apiserver.k8s.io/v1beta1?timeout=32s
I0719 16:25:14.899662   80310 request.go:665] Waited for 1.159087399s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/network.openshift.io/v1?timeout=32s
I0719 16:25:24.900039   80310 request.go:665] Waited for 11.159307613s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/flowcontrol.apiserver.k8s.io/v1beta1?timeout=32s
I0719 16:25:35.099801   80310 request.go:665] Waited for 4.958409848s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/monitoring.coreos.com/v1alpha1?timeout=32s
I0719 16:25:45.298337   80310 request.go:665] Waited for 15.156909324s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/network.openshift.io/v1?timeout=32s
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-466eb796-6b8b-4aa5-9ecd-a1d4a4dc431e/file-integrity-daemon) created
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-466eb796-6b8b-4aa5-9ecd-a1d4a4dc431e/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-466eb796-6b8b-4aa5-9ecd-a1d4a4dc431e/file-integrity-daemon) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-466eb796-6b8b-4aa5-9ecd-a1d4a4dc431e/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-466eb796-6b8b-4aa5-9ecd-a1d4a4dc431e/leader-election-role) created
    helpers.go:265: failed to initialize cluster resources: clusterroles.rbac.authorization.k8s.io "file-integrity-operator" already exists
--- FAIL: TestFileIntegrityLogCompress (49.15s)
=== RUN   TestFileIntegrityAcceptsExpectedChange
I0719 16:25:55.298845   80310 request.go:665] Waited for 8.646337137s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/flowcontrol.apiserver.k8s.io/v1beta2?timeout=32s
I0719 16:26:05.497989   80310 request.go:665] Waited for 2.556343286s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/autoscaling.openshift.io/v1beta1?timeout=32s
I0719 16:26:15.697948   80310 request.go:665] Waited for 12.756205324s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/flowcontrol.apiserver.k8s.io/v1beta1?timeout=32s
I0719 16:26:25.897570   80310 request.go:665] Waited for 6.554694131s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/route.openshift.io/v1?timeout=32s
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-d668790f-c305-4625-a136-5f21d13bf92d/file-integrity-daemon) created
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-d668790f-c305-4625-a136-5f21d13bf92d/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-d668790f-c305-4625-a136-5f21d13bf92d/file-integrity-daemon) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-d668790f-c305-4625-a136-5f21d13bf92d/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-d668790f-c305-4625-a136-5f21d13bf92d/leader-election-role) created
    helpers.go:265: failed to initialize cluster resources: clusterroles.rbac.authorization.k8s.io "file-integrity-operator" already exists
--- FAIL: TestFileIntegrityAcceptsExpectedChange (49.26s)
FAIL
FAIL    github.com/openshift/file-integrity-operator/tests/e2e  636.762s
FAIL
make: *** [Makefile:422: e2e] Error 1

@mrogers950 noted this in his PR for the test: openshift/release#30613

Opening this bug to track the work we need to do to run end-to-end tests against an operator deployed from the catalog.

Operator not cleaning up old aide.log.backup and aide.db.gz.backup files

We are currently running the file-integrity-operator, but it is filling up the /etc/kubernetes directory with backup files dating back more than 3-4 months. Is there a way to clean up the files automatically after a certain period (e.g., 30 days) so they don't use up space and fill the disk? Cleaning them up with logrotate would also be a feasible option, since it is already available in CoreOS.

sh-4.4# ls -al aide.db.gz.backup* |wc -l
252

sh-4.4# ls -al aide.log.backup* |wc -l
252
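
For what it's worth, the FileIntegrity CRs dumped in other issues on this page show a spec.config.maxBackups field. Assuming it does what the name suggests, a config along these lines might cap how many backups the daemon keeps per node (whether the field exists in the version you are running is worth checking):

apiVersion: fileintegrity.openshift.io/v1alpha1
kind: FileIntegrity
metadata:
  name: example-fileintegrity
  namespace: openshift-file-integrity
spec:
  config:
    maxBackups: 5   # assumed: keep at most 5 aide.log / aide.db.gz backups per node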

Makefile release targets assume repository conventions

The Makefile release targets we document for the release process contain git commands that assume your origin remote points to the upstream openshift/file-integrity-operator repository.

This assumption breaks the release process for folks who have origin pointing to their fork and a separate remote for the upstream repository. For example:

$ git remote -v
origin git@github-redhat:rhmdnd/file-integrity-operator.git (fetch)
origin git@github-redhat:rhmdnd/file-integrity-operator.git (push)
upstream git@github-redhat:openshift/file-integrity-operator.git (fetch)
upstream git@github-redhat:openshift/file-integrity-operator.git (push)

Opening this issue to see if there are some ways we can generalize the release process so that it will work regardless of the remote configuration, making it easier for more people to do releases.

image file-integrity-operator-index has no latest tag

The catalog-source points to the latest tag:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: file-integrity-operator
  namespace: openshift-marketplace
spec:
  displayName: File Integrity Operator
  publisher: github.com/openshift/file-integrity-operator
  sourceType: grpc
  image: quay.io/file-integrity-operator/file-integrity-operator-index:latest

There is no such tag in the registry; if I edit the catalog to point at tag 0.1.6, it works correctly.

https://quay.io/repository/file-integrity-operator/file-integrity-operator-index?tab=tags
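
As a workaround, pinning the index image to a tag that actually exists in the registry (0.1.6, per the above) lets the CatalogSource resolve:

spec:
  displayName: File Integrity Operator
  publisher: github.com/openshift/file-integrity-operator
  sourceType: grpc
  image: quay.io/file-integrity-operator/file-integrity-operator-index:0.1.6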

Move aide.reinit out of `/etc`

This is related to ostreedev/ostree#2453 - see also https://bugzilla.redhat.com/show_bug.cgi?id=1945274

Basically, it looks like this operator is writing a file into /etc at runtime. Since this is dynamic state, not persistent state, it should go in /run. This would avoid a race condition seen in the wild at shutdown time, where ostree wants to propagate the modified config file but the daemonset has deleted it in the meantime.

(Arguably, ostree should be OK with files vanishing from /etc at shutdown time, but on the other hand it can mean it captures an inconsistent snapshot of `/etc`.)

Set resource limits for containers

As a platform engineer, I need to control CPU and memory usage per container.
Please add resource limits, for example:

resources:
  requests:
    cpu: 100m
    memory: 100Mi
  limits:
    memory: 200Mi
    cpu: 400m

The AIDE pods had been running for a day and were using 1.6 GB of memory for no apparent reason.

cheers

E2E testing is broken waiting for nodes to start

Seeing this consistently across various patches (e.g., dependency bumps).

#432

     wait_util.go:59: Deployment available (1/1)
    client.go:47: resource type  with namespace/name (osdk-e2e-0c35dce2-3be6-43fd-9624-566a1176cb0f/e2e-test-certrotation) created
    helpers.go:393: Created FileIntegrity: &{TypeMeta:{Kind: APIVersion:} ObjectMeta:{Name:e2e-test-certrotation GenerateName: Namespace:osdk-e2e-0c35dce2-3be6-43fd-9624-566a1176cb0f SelfLink: UID:576fd4be-282f-4198-9c13-135c08c2818f ResourceVersion:62749 Generation:1 CreationTimestamp:2023-09-06 21:45:15 +0000 UTC DeletionTimestamp:<nil> DeletionGracePeriodSeconds:<nil> Labels:map[] Annotations:map[] OwnerReferences:[] Finalizers:[] ManagedFields:[{Manager:Go-http-client Operation:Update APIVersion:fileintegrity.openshift.io/v1alpha1 Time:2023-09-06 21:45:15 +0000 UTC FieldsType:FieldsV1 FieldsV1:{"f:spec":{".":{},"f:config":{".":{},"f:gracePeriod":{},"f:maxBackups":{}},"f:debug":{},"f:nodeSelector":{".":{},"f:node-role.kubernetes.io/worker":{}},"f:tolerations":{}}} Subresource:}]} Spec:{NodeSelector:map[node-role.kubernetes.io/worker:] Config:{Name: Namespace: Key: GracePeriod:20 MaxBackups:5 InitialDelay:0} Debug:true Tolerations:[{Key:node-role.kubernetes.io/master Operator:Exists Value: Effect:NoSchedule TolerationSeconds:<nil>} {Key:node-role.kubernetes.io/infra Operator:Exists Value: Effect:NoSchedule TolerationSeconds:<nil>}]} Status:{Phase:}}
    helpers.go:902: Got (Active) result #1 out of 0 needed.
    helpers.go:913: FileIntegrity ready (Active)
    helpers.go:430: FileIntegrity deployed successfully
    helpers.go:902: Got (Active) result #1 out of 0 needed.
    helpers.go:913: FileIntegrity ready (Active)
    e2e_test.go:982: Asserting that the FileIntegrity check is in a SUCCESS state after deploying it
    helpers.go:1258: ip-10-0-100-129.us-east-2.compute.internal Succeeded
    helpers.go:1258: ip-10-0-100-129.us-east-2.compute.internal Succeeded
    helpers.go:1258: ip-10-0-123-7.us-east-2.compute.internal Succeeded
    helpers.go:1258: ip-10-0-17-136.us-east-2.compute.internal Succeeded
    e2e_test.go:985: Rotating kube-apiserver-to-kubelet-client-ca certificate
    e2e_test.go:988: Waiting for Nodes to start updating
    helpers.go:1068: Timeout waiting for nodes to start updating, err:  timed out waiting for the condition
    e2e_test.go:995: Asserting that the FileIntegrity is in a SUCCESS state after rotating kube-apiserver-to-kubelet-client-ca certificate
    helpers.go:1258: ip-10-0-100-129.us-east-2.compute.internal Succeeded
    helpers.go:1258: ip-10-0-123-7.us-east-2.compute.internal Succeeded
    helpers.go:1258: ip-10-0-17-136.us-east-2.compute.internal Succeeded
    helpers.go:1875: wrote logs for file-integrity-operator-64df45ddb7-bqwqd/self 

Unable to install kustomize with golang 1.18

I'm unable to use kustomize, or any Makefile targets that leverage kustomize, with golang 1.18:

$ ls build/kustomize
ls: cannot access 'build/kustomize': No such file or directory
$ make kustomize
go: creating new go.mod: module tmp
Downloading sigs.k8s.io/kustomize/kustomize/v3@v3.8.7
go: added cloud.google.com/go v0.38.0
go: added github.com/Azure/go-autorest/autorest v0.9.0
go: added github.com/Azure/go-autorest/autorest/adal v0.5.0
go: added github.com/Azure/go-autorest/autorest/date v0.1.0
go: added github.com/Azure/go-autorest/logger v0.1.0
go: added github.com/Azure/go-autorest/tracing v0.5.0
go: added github.com/PuerkitoBio/purell v1.1.1
go: added github.com/PuerkitoBio/urlesc v0.0.0-20170810143723-de5bf2ad4578
go: added github.com/asaskevich/govalidator v0.0.0-20190424111038-f61b66f89f4a
go: added github.com/bgentry/go-netrc v0.0.0-20140422174119-9fd32a8b3d3d
go: added github.com/davecgh/go-spew v1.1.1
go: added github.com/dgrijalva/jwt-go v3.2.0+incompatible
go: added github.com/emicklei/go-restful v0.0.0-20170410110728-ff4f55a20633
go: added github.com/evanphx/json-patch v4.9.0+incompatible
go: added github.com/go-errors/errors v1.0.1
go: added github.com/go-openapi/analysis v0.19.5
go: added github.com/go-openapi/errors v0.19.2
go: added github.com/go-openapi/jsonpointer v0.19.3
go: added github.com/go-openapi/jsonreference v0.19.3
go: added github.com/go-openapi/loads v0.19.4
go: added github.com/go-openapi/runtime v0.19.4
go: added github.com/go-openapi/spec v0.19.5
go: added github.com/go-openapi/strfmt v0.19.5
go: added github.com/go-openapi/swag v0.19.5
go: added github.com/go-openapi/validate v0.19.8
go: added github.com/go-stack/stack v1.8.0
go: added github.com/gogo/protobuf v1.3.1
go: added github.com/golang/protobuf v1.3.2
go: added github.com/google/gofuzz v1.1.0
go: added github.com/google/shlex v0.0.0-20191202100458-e7afc7fbc510
go: added github.com/googleapis/gnostic v0.1.0
go: added github.com/gophercloud/gophercloud v0.1.0
go: added github.com/hashicorp/errwrap v1.0.0
go: added github.com/hashicorp/go-cleanhttp v0.5.0
go: added github.com/hashicorp/go-multierror v1.1.0
go: added github.com/hashicorp/go-safetemp v1.0.0
go: added github.com/hashicorp/go-version v1.1.0
go: added github.com/inconshreveable/mousetrap v1.0.0
go: added github.com/json-iterator/go v1.1.8
go: added github.com/mailru/easyjson v0.7.0
go: added github.com/mattn/go-runewidth v0.0.7
go: added github.com/mitchellh/go-homedir v1.1.0
go: added github.com/mitchellh/go-testing-interface v1.0.0
go: added github.com/mitchellh/mapstructure v1.1.2
go: added github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd
go: added github.com/modern-go/reflect2 v1.0.1
go: added github.com/monochromegane/go-gitignore v0.0.0-20200626010858-205db1a8cc00
go: added github.com/olekukonko/tablewriter v0.0.4
go: added github.com/pkg/errors v0.9.1
go: added github.com/pmezard/go-difflib v1.0.0
go: added github.com/qri-io/starlib v0.4.2-0.20200213133954-ff2e8cd5ef8d
go: added github.com/spf13/cobra v1.0.0
go: added github.com/spf13/pflag v1.0.5
go: added github.com/stretchr/testify v1.6.1
go: added github.com/ulikunitz/xz v0.5.5
go: added github.com/xlab/treeprint v0.0.0-20181112141820-a009c3971eca
go: added github.com/yujunz/go-getter v1.4.1-lite
go: added go.mongodb.org/mongo-driver v1.1.2
go: added go.starlark.net v0.0.0-20200306205701-8dd3e2ee1dd5
go: added golang.org/x/crypto v0.0.0-20200622213623-75b288015ac9
go: added golang.org/x/net v0.0.0-20200625001655-4c5254603344
go: added golang.org/x/oauth2 v0.0.0-20190604053449-0f29369cfe45
go: added golang.org/x/sys v0.0.0-20200323222414-85ca7c5b95cd
go: added golang.org/x/text v0.3.2
go: added golang.org/x/time v0.0.0-20190308202827-9d24e82272b4
go: added google.golang.org/appengine v1.5.0
go: added gopkg.in/inf.v0 v0.9.1
go: added gopkg.in/yaml.v2 v2.3.0
go: added gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c
go: added k8s.io/api v0.18.10
go: added k8s.io/apimachinery v0.18.10
go: added k8s.io/client-go v0.18.10
go: added k8s.io/klog v1.0.0
go: added k8s.io/kube-openapi v0.0.0-20200410145947-61e04a5be9a6
go: added k8s.io/utils v0.0.0-20200324210504-a9aa75ae1b89
go: added sigs.k8s.io/kustomize/api v0.6.5
go: added sigs.k8s.io/kustomize/cmd/config v0.8.5
go: added sigs.k8s.io/kustomize/kustomize/v3 v3.8.7
go: added sigs.k8s.io/kustomize/kyaml v0.9.4
go: added sigs.k8s.io/structured-merge-diff/v3 v3.0.0
go: added sigs.k8s.io/yaml v1.2.0
$ ls build/kustomize
ls: cannot access 'build/kustomize': No such file or directory

We hit this same issue in the compliance-operator.

ComplianceAsCode/compliance-operator#145

The make release-images target doesn't properly handle the latest tag

The final step of the release process is to push the images to a public registry [0].

When we prepare and push a new release, we specify a version using the VERSION environment variable, which gets translated to a TAG during the image build and push process. The problem is that after we build the images, we don't go back and tag the newly released version (e.g., 0.1.29) as latest.

It's confusing to have a latest tag that doesn't point to the most recent version in the repository. We should tag the release we're publishing as latest so that the tags accurately reflect the repository and, more importantly, so that anyone pulling the latest tag gets the most recent images.

[0] https://quay.io/organization/file-integrity-operator
