file-integrity-operator

The file-integrity-operator is an OpenShift Operator that continually runs file integrity checks on the cluster nodes. It deploys a DaemonSet that initializes and runs privileged AIDE (Advanced Intrusion Detection Environment) containers on each node, providing a log of files that have been modified since the initial run of the DaemonSet pods.

Deploying:

To deploy the operator using the latest released file-integrity-operator image available on quay.io, run:

$ make deploy

Alternatively, to deploy the latest release through OLM, run:

$ make catalog-deploy

Building and deploying from source:

First, set an image repo and tag to use. Make sure that you have permission to push file-integrity-operator* images (with the relevant tag) to that repo.

$ export IMAGE_REPO=quay.io/myrepo
$ export TAG=mytag

With these set, they will apply to the rest of the Makefile targets. Next, build and push the operator and bundle images by running:

$ make images && make push

Finally, deploy the operator with the built images,

$ make deploy

or build a catalog and deploy from OLM:

$ make catalog && make catalog-deploy

FileIntegrity API:

The operator works with FileIntegrity objects. Each of these objects represents a managed deployment of AIDE on one or more nodes.

apiVersion: fileintegrity.openshift.io/v1alpha1
kind: FileIntegrity
metadata:
  name: example-fileintegrity
  namespace: openshift-file-integrity
spec:
  nodeSelector:
    kubernetes.io/hostname: "ip-10-10-10-1"
  tolerations:
  - key: "myNode"
    operator: "Exists"
    effect: "NoSchedule"
  config:
    name: "myconfig"
    namespace: "openshift-file-integrity"
    key: "config"
    gracePeriod: 20
    maxBackups: 5
  debug: false
status:
  phase: Active

In the spec:

  • nodeSelector: Selector for nodes to schedule the scan instances on.
  • tolerations: Specify tolerations to schedule on nodes with custom taints. When not specified, a default toleration allowing running on master and infra nodes is applied.
  • config: Point to a ConfigMap containing an AIDE configuration to use instead of the CoreOS-optimized default. See "Overriding the AIDE configuration" below.
  • config.gracePeriod: The number of seconds to pause in between AIDE integrity checks. Frequent AIDE checks on a node may be resource intensive, so it can be useful to specify a longer interval. Defaults to 900 (15 mins).
  • config.maxBackups: The maximum number of AIDE database and log backups (leftover from the re-init process) to keep on a node. Older backups beyond this number are automatically pruned by the daemon. Defaults to 5.
  • config.initialDelay: An optional field. The number of seconds to wait before starting the first AIDE integrity check. Defaults to 0.
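
For example, a spec that waits 60 seconds before the first check, pauses 900 seconds between checks, and keeps 3 backups might look like the following (the values are purely illustrative):

apiVersion: fileintegrity.openshift.io/v1alpha1
kind: FileIntegrity
metadata:
  name: example-fileintegrity
  namespace: openshift-file-integrity
spec:
  config:
    gracePeriod: 900
    maxBackups: 3
    initialDelay: 60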

In the status:

  • phase: The running status of the FileIntegrity instance. Can be Initializing, Pending, or Active. Initializing is displayed if the FileIntegrity is currently initializing or re-initializing the AIDE database, Pending if the FileIntegrity deployment is still being created, and Active if the scans are active and ongoing. For node scan results, see the FileIntegrityNodeStatus objects explained below.

Usage:

After deploying the operator, you must create a FileIntegrity object. The following example will enable scanning on all nodes.

apiVersion: fileintegrity.openshift.io/v1alpha1
kind: FileIntegrity
metadata:
  name: example-fileintegrity
  namespace: openshift-file-integrity
spec:
  config: {}
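
To create it from the command line (assuming the manifest above is saved as fileintegrity.yaml):

$ oc apply -f fileintegrity.yaml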

Viewing the scan phase: An Active phase indicates that on each node, the AIDE database has been initialized and periodic scanning is enabled:

$ oc get fileintegrities -n openshift-file-integrity
NAME                    AGE
example-fileintegrity   11m

$ oc get fileintegrities/example-fileintegrity -n openshift-file-integrity -o jsonpath="{ .status.phase }"
Active
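
To block until the instance reaches the Active phase, for example in scripts, a jsonpath-based wait can be used (assuming a reasonably recent oc client):

$ oc wait fileintegrity/example-fileintegrity -n openshift-file-integrity --for=jsonpath='{.status.phase}'=Active --timeout=300s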

Each node will have a corresponding FileIntegrityNodeStatus object:

$ oc get fileintegritynodestatuses
NAME                                                               AGE
example-fileintegrity-ip-10-0-139-137.us-east-2.compute.internal   4h24m
example-fileintegrity-ip-10-0-140-35.us-east-2.compute.internal    4h24m
example-fileintegrity-ip-10-0-162-216.us-east-2.compute.internal   4h24m
example-fileintegrity-ip-10-0-172-188.us-east-2.compute.internal   4h24m
example-fileintegrity-ip-10-0-210-181.us-east-2.compute.internal   4h24m
example-fileintegrity-ip-10-0-210-89.us-east-2.compute.internal    4h24m

The results field can contain up to three entries: the most recent Succeeded scan, the most recent Failed scan (if any), and the most recent Errored scan (if any). When there are multiple entries, the newest lastProbeTime indicates the current status.

A Failed scan indicates that there were changes to the files that AIDE monitors, and displays a brief status. The resultConfigMap fields point to a ConfigMap containing a more detailed report.

Note: Currently the failure log is only exposed to the admin through this result ConfigMap. In order to provide some permanence of record, the result ConfigMaps are not owned by the FileIntegrity object, so manual cleanup is necessary. Additionally, deleting the FileIntegrity object leaves the AIDE database on the nodes, and the scan state will resume if the FileIntegrity is re-created.
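
As a sketch, the failure-result ConfigMaps can be pruned manually by relying on the -failed naming convention shown below (verify the list before deleting anything):

$ oc get cm -n openshift-file-integrity -o name | grep -- '-failed$' | xargs -r oc delete -n openshift-file-integrity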

$ oc get fileintegritynodestatus/example-fileintegrity-ip-10-0-139-137.us-east-2.compute.internal -o yaml
apiVersion: fileintegrity.openshift.io/v1alpha1
kind: FileIntegrityNodeStatus
...
nodeName: ip-10-0-139-137.us-east-2.compute.internal
results:
- condition: Succeeded
  lastProbeTime: "2020-06-18T01:17:14Z"
- condition: Failed
  filesAdded: 1
  filesChanged: 1
  lastProbeTime: "2020-06-18T01:28:57Z"
  resultConfigMapName: aide-ds-example-fileintegrity-ip-10-0-139-137.us-east-2.compute.internal-failed
  resultConfigMapNamespace: openshift-file-integrity

$ oc get cm/aide-ds-example-fileintegrity-ip-10-0-139-137.us-east-2.compute.internal-failed -n openshift-file-integrity -o jsonpath="{ .data.integritylog }"
AIDE 0.15.1 found differences between database and filesystem!!
Start timestamp: 2020-06-18 02:00:38

Summary:
  Total number of files:        29447
  Added files:                  1
  Removed files:                0
  Changed files:                1


---------------------------------------------------
Added files:
---------------------------------------------------

added: /hostroot/root/.bash_history

---------------------------------------------------
Changed files:
---------------------------------------------------

changed: /hostroot/etc/resolv.conf

---------------------------------------------------
Detailed information about changes:
---------------------------------------------------


File: /hostroot/etc/resolv.conf
 SHA512   : Xl2pzxjmRPtW8bl6Kj49SkKOSBVJgsCI , tebxD8QZd/5/SqsVkExCwVqVO22zxmcq

AIDE logs over 1MB are gzip compressed and base64 encoded, due to the ConfigMap data size limit. In this case, you will want to pipe the output of the above command to base64 -d | gunzip. Compressed logs are indicated by the presence of a file-integrity.openshift.io/compressed annotation key in the ConfigMap.
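
For example, to view a compressed log (substitute the actual ConfigMap name reported in the node status):

$ oc get cm/<failed-result-configmap> -n openshift-file-integrity -o jsonpath="{ .data.integritylog }" | base64 -d | gunzip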

Events

Transitions in the status of the FileIntegrity and FileIntegrityNodeStatus objects are also logged by events. The creation time of the event reflects the latest transition (e.g., Initializing to Active), and not necessarily the latest scan result. However, the newest event will always reflect the most recent status.

$ oc get events --field-selector reason=FileIntegrityStatus
LAST SEEN   TYPE     REASON                OBJECT                                MESSAGE
97s         Normal   FileIntegrityStatus   fileintegrity/example-fileintegrity   Pending
67s         Normal   FileIntegrityStatus   fileintegrity/example-fileintegrity   Initializing
37s         Normal   FileIntegrityStatus   fileintegrity/example-fileintegrity   Active

When a node has a failed scan, an event is created with the added/changed/removed counts and the ConfigMap information.

$ oc get events --field-selector reason=NodeIntegrityStatus
LAST SEEN   TYPE      REASON                OBJECT                                MESSAGE
114m        Normal    NodeIntegrityStatus   fileintegrity/example-fileintegrity   no changes to node ip-10-0-134-173.ec2.internal
114m        Normal    NodeIntegrityStatus   fileintegrity/example-fileintegrity   no changes to node ip-10-0-168-238.ec2.internal
114m        Normal    NodeIntegrityStatus   fileintegrity/example-fileintegrity   no changes to node ip-10-0-169-175.ec2.internal
114m        Normal    NodeIntegrityStatus   fileintegrity/example-fileintegrity   no changes to node ip-10-0-152-92.ec2.internal
114m        Normal    NodeIntegrityStatus   fileintegrity/example-fileintegrity   no changes to node ip-10-0-158-144.ec2.internal
114m        Normal    NodeIntegrityStatus   fileintegrity/example-fileintegrity   no changes to node ip-10-0-131-30.ec2.internal
87m         Warning   NodeIntegrityStatus   fileintegrity/example-fileintegrity   node ip-10-0-152-92.ec2.internal has changed! a:1,c:1,r:0 log:openshift-file-integrity/aide-ds-example-fileintegrity-ip-10-0-152-92.ec2.internal-failed

Changes to the number of added/changed/removed files will result in a new event, even if the status of the node has not transitioned.

$ oc get events --field-selector reason=NodeIntegrityStatus
LAST SEEN   TYPE      REASON                OBJECT                                MESSAGE
114m        Normal    NodeIntegrityStatus   fileintegrity/example-fileintegrity   no changes to node ip-10-0-134-173.ec2.internal
114m        Normal    NodeIntegrityStatus   fileintegrity/example-fileintegrity   no changes to node ip-10-0-168-238.ec2.internal
114m        Normal    NodeIntegrityStatus   fileintegrity/example-fileintegrity   no changes to node ip-10-0-169-175.ec2.internal
114m        Normal    NodeIntegrityStatus   fileintegrity/example-fileintegrity   no changes to node ip-10-0-152-92.ec2.internal
114m        Normal    NodeIntegrityStatus   fileintegrity/example-fileintegrity   no changes to node ip-10-0-158-144.ec2.internal
114m        Normal    NodeIntegrityStatus   fileintegrity/example-fileintegrity   no changes to node ip-10-0-131-30.ec2.internal
87m         Warning   NodeIntegrityStatus   fileintegrity/example-fileintegrity   node ip-10-0-152-92.ec2.internal has changed! a:1,c:1,r:0 log:openshift-file-integrity/aide-ds-example-fileintegrity-ip-10-0-152-92.ec2.internal-failed
40m         Warning   NodeIntegrityStatus   fileintegrity/example-fileintegrity   node ip-10-0-152-92.ec2.internal has changed! a:3,c:1,r:0 log:openshift-file-integrity/aide-ds-example-fileintegrity-ip-10-0-152-92.ec2.internal-failed
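
To focus on integrity failures only, the event type can be added to the field selector, for example:

$ oc get events --field-selector reason=NodeIntegrityStatus,type=Warning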

Testing

Unit

$ make test-unit

Local

$ make run

End-to-end

$ make e2e

Running the e2e suite normally handles the operator deployment for each test case. The e2e suite can also be run against an existing deployment with the TEST_BUNDLE_INSTALL variable (set to 1 or true). The following example builds development images including the bundle and catalog, deploys them to a running cluster, and executes the e2e suite against the deployment.

$ export IMAGE_REPO=myrepo
$ export TAG=testing
$ make images && make push && make catalog && make catalog-deploy
$ TEST_BUNDLE_INSTALL=1 TEST_WATCH_NAMESPACE=openshift-file-integrity TEST_OPERATOR_NAMESPACE=openshift-file-integrity make e2e

Overriding the AIDE configuration

By default the AIDE containers run with an aide.conf that is tailored to a default RHCOS node. If you need to add or exclude files on nodes that are not covered by the default config, you can override it with a modified config.

  • Create a ConfigMap containing the aide.conf, e.g.,
$ oc project openshift-file-integrity
$ oc create configmap myconf --from-file=aide-conf=aide.conf.rhel8
  • Post the FileIntegrity CR, referencing the name, namespace, and data key of the aide.conf ConfigMap in the spec.
apiVersion: fileintegrity.openshift.io/v1alpha1
kind: FileIntegrity
metadata:
  name: example-fileintegrity
  namespace: openshift-file-integrity
spec:
  config:
    name: myconf
    namespace: openshift-file-integrity
    key: aide-conf
  • At this point the operator will update the active AIDE config and perform a re-initialization of the AIDE database, as well as a restart of the AIDE pods to begin scanning with the new configuration. Backups of the logs and database from the previously applied configurations are left on the nodes under /etc/kubernetes.
  • The operator automatically converts the database, database_out, report_url, DBDIR, and LOGDIR options in the configuration to accommodate running inside of a pod.
  • Removing the config section from the FileIntegrity resource when active reverts the running config to the default and re-initializes the database.
  • In cases where only small modifications are needed (such as excluding a file or directory), it's recommended to copy the default config to a new ConfigMap and extend it as needed.
  • Some AIDE configuration options may not be supported by the AIDE container. For example, the mhash digest types are not supported. For digest selection, it is recommended to use the default config's CONTENT_EX group.
  • Manually re-initializing the AIDE database can be done by adding the annotation key file-integrity.openshift.io/re-init to the FileIntegrity object.
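
As a sketch, a manual re-initialization could be triggered with oc annotate (the presence of the annotation key is what matters; the value is left empty here):

$ oc annotate fileintegrities/example-fileintegrity -n openshift-file-integrity file-integrity.openshift.io/re-init=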

Controller metrics

The file-integrity-operator exposes the following FileIntegrity-related metrics to Prometheus when cluster-monitoring is available.

# HELP file_integrity_operator_phase_total The total number of transitions to the FileIntegrity phase
# TYPE file_integrity_operator_phase_total counter
file_integrity_operator_phase_total{phase="Active"} 1
file_integrity_operator_phase_total{phase="Initializing"} 1
file_integrity_operator_phase_total{phase="Pending"} 1

# HELP file_integrity_operator_error_total The total number of FileIntegrity phase errors, per error
# TYPE file_integrity_operator_error_total counter
file_integrity_operator_error_total{error="foo"} 1

# HELP file_integrity_operator_pause_total The total number of FileIntegrity scan pause actions (during node updates)
# TYPE file_integrity_operator_pause_total counter
file_integrity_operator_pause_total{node="node-a"} 1

# HELP file_integrity_operator_unpause_total The total number of FileIntegrity scan unpause actions (during node updates)
# TYPE file_integrity_operator_unpause_total counter
file_integrity_operator_unpause_total{node="node-a"} 1

# HELP file_integrity_operator_reinit_total The total number of FileIntegrity database re-initialization triggers (annotation), per method and node
# TYPE file_integrity_operator_reinit_total counter
file_integrity_operator_reinit_total{by="node", node="node-a"} 1
file_integrity_operator_reinit_total{by="demand", node="node-a"} 1
file_integrity_operator_reinit_total{by="config", node=""} 1

# HELP file_integrity_operator_node_status_total The total number of FileIntegrityNodeStatus transitions, per condition and node
# TYPE file_integrity_operator_node_status_total counter
file_integrity_operator_node_status_total{condition="Failed",node="node-a"} 1
file_integrity_operator_node_status_total{condition="Succeeded",node="node-b"} 1
file_integrity_operator_node_status_total{condition="Errored",node="node-c"} 1

# HELP file_integrity_operator_node_status_error_total The total number of FileIntegrityNodeStatus errors, per error and node
# TYPE file_integrity_operator_node_status_error_total counter
file_integrity_operator_node_status_error_total{error="foo",node="node-a"} 1

# HELP file_integrity_operator_daemonset_update_total The total number of updates to the FileIntegrity AIDE daemonSet
# TYPE file_integrity_operator_daemonset_update_total counter
file_integrity_operator_daemonset_update_total{operation="update"} 1
file_integrity_operator_daemonset_update_total{operation="delete"} 1
file_integrity_operator_daemonset_update_total{operation="podkill"} 1

# HELP file_integrity_operator_reinit_daemonset_update_total The total number of updates to the FileIntegrity re-init signaling daemonSet
# TYPE file_integrity_operator_reinit_daemonset_update_total counter
file_integrity_operator_reinit_daemonset_update_total{operation="update"} 1
file_integrity_operator_reinit_daemonset_update_total{operation="delete"} 1

# HELP file_integrity_operator_node_failed A gauge that is set to 1 when a node has unresolved integrity failures, and 0 when it is healthy
# TYPE file_integrity_operator_node_failed gauge
file_integrity_operator_node_failed{node="node-a"} 1
file_integrity_operator_node_failed{node="node-b"} 1

After logging into the console and navigating to Monitoring -> Metrics, the file_integrity_operator* metrics can be queried using the metrics dashboard. The {__name__=~"file_integrity.*"} query can be used to view the full set of metrics.

The metrics can also be queried directly from the CLI with a pod that curls the metrics service. This is useful for troubleshooting.

$ oc run --rm -i --restart=Never --image=registry.fedoraproject.org/fedora-minimal:latest -n openshift-file-integrity metrics-test -- bash -c 'curl -ks -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" https://metrics.openshift-file-integrity.svc:8585/metrics-fio' | grep file

Integrity failure alerts

The operator creates the following default alert (based on the file_integrity_operator_node_failed gauge) in the operator namespace, which fires when a node has been in a failure state for more than 1 second:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: file-integrity
  namespace: openshift-file-integrity
spec:
  groups:
  - name: node-failed
    rules:
    - alert: NodeHasIntegrityFailure
      annotations:
        description: Node {{ $labels.node }} has an integrity check status of Failed for
          more than 1 second.
        summary: Node {{ $labels.node }} has a file integrity failure
      expr: file_integrity_operator_node_failed{node=~".+"} * on(node) kube_node_info > 0
      for: 1s
      labels:
        severity: warning

The severity label and the for duration may be adjusted to taste.
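
For example, to alert only after a sustained failure, the for duration could be raised with a JSON patch (a sketch; note that the operator may reconcile changes to objects it manages):

$ oc patch prometheusrule/file-integrity -n openshift-file-integrity --type=json -p '[{"op": "replace", "path": "/spec/groups/0/rules/0/for", "value": "5m"}]'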

Contributor Guide

This guide provides useful information for contributors.

Proposing Releases

The release process is separated into three phases, with dedicated make targets. All targets require that you supply VERSION prior to running make, which should be a semantic version string (e.g., VERSION=0.1.49). Additionally, you should ensure that the IMAGE_REPO and TAG environment variables are unset before running the targets.

Preparing the Release

The first phase of the release process is preparing the release locally. You can do this by running the make prepare-release target. All changes are staged locally. This is intentional so that you have the opportunity to review the changes before proposing the release in the next step.

Proposing the Release

The second phase of the release is to push the release to a dedicated branch against the origin repository. You can perform this step using the make push-release target.

Please note, this step makes changes to the upstream repository, so it is imperative that you review the changes you're committing prior to this step. This step also requires that you have necessary permissions on the repository.

Releasing Images

The third and final step of the release is to build new images and push them to an official image registry. You can build new images and push using make release-images. Note that this operation also requires you have proper permissions on the remote registry. By default, make release-images will push images to Quay. You can specify a different repository using the IMAGE_REPO environment variable.
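
Putting the three phases together, a release flow might look like the following (0.1.50 is a placeholder version):

$ unset IMAGE_REPO TAG
$ VERSION=0.1.50 make prepare-release
# review the locally staged changes
$ VERSION=0.1.50 make push-release
$ VERSION=0.1.50 make release-images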

file-integrity-operator's Issues

The setup-envtest dependency isn't tracked using tools.go

From a build perspective, it would be handy to have this dependency tracked using go tooling so that we (as developers, or build systems) don't need to manage it separately.

One possible way to do this is to use tools.go, similar to what the node-observability-operator does.

This was originally lumped into issue #245 but I decided to break it into its own issue because controller-tools pulls in a lot of transitive dependencies that affect FIO dependencies. I thought it would be good to sort these dependencies individually, so we can keep patches as small as possible.

https://github.com/openshift/node-observability-operator/blob/main/tools/tools.go#L12

license for the operator

It would be good if the license for the file-integrity-operator could be explicitly specified.
Under the hood, AIDE is licensed under GPLv2, and thus the operator license needs to be aligned with this.

Using *undeploy Makefile targets outputs Not Found errors

I was recently using the make catalog-deploy target to deploy changes into one of my clusters. When I needed to remove, or uninstall, the operator I used make catalog-undeploy, which outputs the following:

$ make catalog-undeploy
<snip>
namespace "openshift-file-integrity" deleted
customresourcedefinition.apiextensions.k8s.io "fileintegrities.fileintegrity.openshift.io" deleted
customresourcedefinition.apiextensions.k8s.io "fileintegritynodestatuses.fileintegrity.openshift.io" deleted
serviceaccount "file-integrity-daemon" deleted
serviceaccount "file-integrity-operator" deleted
role.rbac.authorization.k8s.io "leader-election-role" deleted
clusterrole.rbac.authorization.k8s.io "file-integrity-operator-metrics" deleted
clusterrole.rbac.authorization.k8s.io "fileintegrity-editor-role" deleted
clusterrole.rbac.authorization.k8s.io "fileintegrity-viewer-role" deleted
rolebinding.rbac.authorization.k8s.io "file-integrity-daemon" deleted
rolebinding.rbac.authorization.k8s.io "leader-election-rolebinding" deleted
rolebinding.rbac.authorization.k8s.io "prometheus-k8s" deleted
clusterrolebinding.rbac.authorization.k8s.io "file-integrity-operator-metrics" deleted
deployment.apps "file-integrity-operator" deleted
Error from server (NotFound): error when deleting "STDIN": roles.rbac.authorization.k8s.io "file-integrity-daemon" not found
Error from server (NotFound): error when deleting "STDIN": roles.rbac.authorization.k8s.io "file-integrity-operator" not found
Error from server (NotFound): error when deleting "STDIN": clusterroles.rbac.authorization.k8s.io "file-integrity-operator" not found
Error from server (NotFound): error when deleting "STDIN": rolebindings.rbac.authorization.k8s.io "file-integrity-operator" not found
Error from server (NotFound): error when deleting "STDIN": clusterrolebindings.rbac.authorization.k8s.io "file-integrity-operator" not found
make: *** [Makefile:357: undeploy] Error 1

It appears some of the resources were cleaned up properly, but others weren't. This may be confusing to contributors (do I need to go clean things up manually?)

Makefile release targets assume repository conventions

The Makefile release targets we document for the release process contain git commands that assume your origin remote points to the upstream openshift/file-integrity-operator repository.

This assumption breaks the release process for folks who have origin pointing to their fork and a separate remote for the upstream repository. For example:

$ git remote -v
origin git@github-redhat:rhmdnd/file-integrity-operator.git (fetch)
origin git@github-redhat:rhmdnd/file-integrity-operator.git (push)
upstream git@github-redhat:openshift/file-integrity-operator.git (fetch)
upstream git@github-redhat:openshift/file-integrity-operator.git (push)

Opening this issue to see if there are some ways we can generalize the release process so that it will work regardless of the remote configuration, making it easier for more people to do releases.

Aide database stored on filesystem

I question the usefulness of this tool when the database is stored on the file system which is being checked. A motivated attacker could cover their tracks by altering the database directly. It should be stored in a place not writable by an attacker, such as off-cluster, or encrypted with keys that are stored off disk, such as in a Trusted Platform Module (TPM).

I think other file integrity checkers can use a TPM, such as linux-ima, which should be used for secure boot in CentOS/RHEL 9. That project is also under more recent development than AIDE. We seem to be including a version from CentOS/RHEL 7, which is about 10 years old, although the project has been updated as recently as last year.

[ocp4.8] FIO Servicemonitor not accessible by openshift-user-workload-monitoring prometheus instance

file-integrity-operator.v0.1.18 running on OCP 4.8.5 with default config for openshift-user-workload-monitoring.
In the logs of prometheus-operator in the openshift-user-workload-monitoring namespace I can see logs like this:

level=warn ts=2021-08-23T21:16:00.305307242Z caller=operator.go:1703 component=prometheusoperator msg="skipping servicemonitor" error="it accesses file system via bearer token file which Prometheus specification prohibits" servicemonitor=openshift-file-integrity/metrics namespace=openshift-user-workload-monitoring prometheus=user-workload

Are there already plans to change this default behaviour, or to make it configurable?

The kustomize dependency isn't tracked using tools.go

From a build perspective, it would be handy to have this dependency tracked using go tooling so that we (as developers, or build systems) don't need to manage it separately.

One possible way to do this is to use tools.go, similar to what the node-observability-operator does.

This was originally lumped into issue #245 but I decided to break it into its own issue because kustomize pulls in a lot of transitive dependencies that affect FIO dependencies. I thought it would be good to sort these dependencies individually, so we can keep patches as small as possible.

https://github.com/openshift/node-observability-operator/blob/main/tools/tools.go#L12

E2E testing is broken waiting for nodes to start

Seeing this consistently across various patches (e.g., dependency bumps).

#432

     wait_util.go:59: Deployment available (1/1)
    client.go:47: resource type  with namespace/name (osdk-e2e-0c35dce2-3be6-43fd-9624-566a1176cb0f/e2e-test-certrotation) created
    helpers.go:393: Created FileIntegrity: &{TypeMeta:{Kind: APIVersion:} ObjectMeta:{Name:e2e-test-certrotation GenerateName: Namespace:osdk-e2e-0c35dce2-3be6-43fd-9624-566a1176cb0f SelfLink: UID:576fd4be-282f-4198-9c13-135c08c2818f ResourceVersion:62749 Generation:1 CreationTimestamp:2023-09-06 21:45:15 +0000 UTC DeletionTimestamp:<nil> DeletionGracePeriodSeconds:<nil> Labels:map[] Annotations:map[] OwnerReferences:[] Finalizers:[] ManagedFields:[{Manager:Go-http-client Operation:Update APIVersion:fileintegrity.openshift.io/v1alpha1 Time:2023-09-06 21:45:15 +0000 UTC FieldsType:FieldsV1 FieldsV1:{"f:spec":{".":{},"f:config":{".":{},"f:gracePeriod":{},"f:maxBackups":{}},"f:debug":{},"f:nodeSelector":{".":{},"f:node-role.kubernetes.io/worker":{}},"f:tolerations":{}}} Subresource:}]} Spec:{NodeSelector:map[node-role.kubernetes.io/worker:] Config:{Name: Namespace: Key: GracePeriod:20 MaxBackups:5 InitialDelay:0} Debug:true Tolerations:[{Key:node-role.kubernetes.io/master Operator:Exists Value: Effect:NoSchedule TolerationSeconds:<nil>} {Key:node-role.kubernetes.io/infra Operator:Exists Value: Effect:NoSchedule TolerationSeconds:<nil>}]} Status:{Phase:}}
    helpers.go:902: Got (Active) result #1 out of 0 needed.
    helpers.go:913: FileIntegrity ready (Active)
    helpers.go:430: FileIntegrity deployed successfully
    helpers.go:902: Got (Active) result #1 out of 0 needed.
    helpers.go:913: FileIntegrity ready (Active)
    e2e_test.go:982: Asserting that the FileIntegrity check is in a SUCCESS state after deploying it
    helpers.go:1258: ip-10-0-100-129.us-east-2.compute.internal Succeeded
    helpers.go:1258: ip-10-0-100-129.us-east-2.compute.internal Succeeded
    helpers.go:1258: ip-10-0-123-7.us-east-2.compute.internal Succeeded
    helpers.go:1258: ip-10-0-17-136.us-east-2.compute.internal Succeeded
    e2e_test.go:985: Rotating kube-apiserver-to-kubelet-client-ca certificate
    e2e_test.go:988: Waiting for Nodes to start updating
    helpers.go:1068: Timeout waiting for nodes to start updating, err:  timed out waiting for the condition
    e2e_test.go:995: Asserting that the FileIntegrity is in a SUCCESS state after rotating kube-apiserver-to-kubelet-client-ca certificate
    helpers.go:1258: ip-10-0-100-129.us-east-2.compute.internal Succeeded
    helpers.go:1258: ip-10-0-123-7.us-east-2.compute.internal Succeeded
    helpers.go:1258: ip-10-0-17-136.us-east-2.compute.internal Succeeded
    helpers.go:1875: wrote logs for file-integrity-operator-64df45ddb7-bqwqd/self 

Duplicate CRDs

In v0.1.3, the CRDs are duplicated:

  customresourcedefinitions:
    owned:
    - description: FileIntegrity is the Schema for the fileintegrities API
      kind: FileIntegrity
      name: fileintegrities.fileintegrity.openshift.io
      version: v1alpha1
    - description: FileIntegrity is the Schema for the fileintegrities API
      kind: FileIntegrity
      name: fileintegrities.fileintegrity.openshift.io
      version: v1alpha1
    - description: FileIntegrityNodeStatus defines the status of a specific node
      kind: FileIntegrityNodeStatus
      name: fileintegritynodestatuses.fileintegrity.openshift.io
      version: v1alpha1
    - description: FileIntegrityNodeStatus defines the status of a specific node
      kind: FileIntegrityNodeStatus
      name: fileintegritynodestatuses.fileintegrity.openshift.io
      version: v1alpha1

The same issue also exists in v0.1.2.

End-to-end CI is broken

CI has been failing consistently over the last week with errors like the following:

=== RUN   TestFileIntegrityLogAndReinitDatabase
&Namespace{ObjectMeta:{osdk-e2e-46b19674-f9e8-423b-981d-0920db85c117      0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[pod-security.kubernetes.io/enforce:privileged security.openshift.io/scc.podSecurityLabelSync:false] map[] [] [] []},Spec:NamespaceSpec{Finalizers:[],},Status:NamespaceStatus{Phase:,Conditions:[]NamespaceCondition{},},}
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-46b19674-f9e8-423b-981d-0920db85c117/file-integrity-daemon) created
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-46b19674-f9e8-423b-981d-0920db85c117/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-46b19674-f9e8-423b-981d-0920db85c117/file-integrity-daemon) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-46b19674-f9e8-423b-981d-0920db85c117/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-46b19674-f9e8-423b-981d-0920db85c117/leader-election-role) created
    client.go:47: resource type ClusterRole with namespace/name (/file-integrity-operator) created
    client.go:47: resource type ClusterRole with namespace/name (/file-integrity-operator-metrics) created
    client.go:47: resource type ClusterRole with namespace/name (/fileintegrity-editor-role) created
    client.go:47: resource type ClusterRole with namespace/name (/fileintegrity-viewer-role) created
    client.go:47: resource type RoleBinding with namespace/name (osdk-e2e-46b19674-f9e8-423b-981d-0920db85c117/file-integrity-daemon) created
    client.go:47: resource type RoleBinding with namespace/name (osdk-e2e-46b19674-f9e8-423b-981d-0920db85c117/file-integrity-operator) created
    client.go:47: resource type RoleBinding with namespace/name (osdk-e2e-46b19674-f9e8-423b-981d-0920db85c117/file-integrity-operator-metrics) created
    client.go:47: resource type RoleBinding with namespace/name (osdk-e2e-46b19674-f9e8-423b-981d-0920db85c117/leader-election-rolebinding) created
    client.go:47: resource type ClusterRoleBinding with namespace/name (/file-integrity-operator) created
    client.go:47: resource type Deployment with namespace/name (osdk-e2e-46b19674-f9e8-423b-981d-0920db85c117/file-integrity-operator) created
    helpers.go:278: Initialized cluster resources
    wait_util.go:52: Waiting for full availability of file-integrity-operator deployment (0/1)
    wait_util.go:52: Waiting for full availability of file-integrity-operator deployment (0/1)
    wait_util.go:59: Deployment available (1/1)
    client.go:47: resource type  with namespace/name (osdk-e2e-46b19674-f9e8-423b-981d-0920db85c117/e2e-test-reinitdb) created
    helpers.go:389: Created FileIntegrity: &{TypeMeta:{Kind: APIVersion:} ObjectMeta:{Name:e2e-test-reinitdb GenerateName: Namespace:osdk-e2e-46b19674-f9e8-423b-981d-0920db85c117 SelfLink: UID:7b66a89d-de9c-4797-960a-6f2fb04adcfb ResourceVersion:30496 Generation:1 CreationTimestamp:2023-07-10 17:13:08 +0000 UTC DeletionTimestamp:<nil> DeletionGracePeriodSeconds:<nil> Labels:map[] Annotations:map[] OwnerReferences:[] Finalizers:[] ManagedFields:[{Manager:Go-http-client Operation:Update APIVersion:fileintegrity.openshift.io/v1alpha1 Time:2023-07-10 17:13:08 +0000 UTC FieldsType:FieldsV1 FieldsV1:{"f:spec":{".":{},"f:config":{".":{},"f:gracePeriod":{},"f:maxBackups":{}},"f:debug":{},"f:nodeSelector":{".":{},"f:node-role.kubernetes.io/worker":{}},"f:tolerations":{}}} Subresource:}]} Spec:{NodeSelector:map[node-role.kubernetes.io/worker:] Config:{Name: Namespace: Key: GracePeriod:20 MaxBackups:5 InitialDelay:0} Debug:true Tolerations:[{Key:node-role.kubernetes.io/master Operator:Exists Value: Effect:NoSchedule TolerationSeconds:<nil>} {Key:node-role.kubernetes.io/infra Operator:Exists Value: Effect:NoSchedule TolerationSeconds:<nil>}]} Status:{Phase:}}
    helpers.go:906: FileIntegrity never reached expected phase (Active)
    helpers.go:424: Timed out waiting for scan status to go Active
--- FAIL: TestFileIntegrityLogAndReinitDatabase (1829.78s)

The traces are all the same since all end-to-end tests are run serially. The first test fails, then cascades into other failures because the first test isn't cleaned up properly.

feature request: add `s390x` / multi arch support

I would like to request support for the s390x architecture.
Or maybe you already have a roadmap for releasing the official operators for supported architectures like s390x and ppc64le, and can share that?

I just saw that the compliance-operator, for example, has some multi-arch PRs merged recently, so maybe there is something in the works here as well?

Looking forward to some feedback :) Thank you!

image file-integrity-operator-index has no latest tag

The catalog-source points to the latest tag:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: file-integrity-operator
  namespace: openshift-marketplace
spec:
  displayName: File Integrity Operator
  publisher: github.com/openshift/file-integrity-operator
  sourceType: grpc
  image: quay.io/file-integrity-operator/file-integrity-operator-index:latest

In the registry there is no such tag; if I edit the catalog to point to tag 0.1.6, it works correctly.

https://quay.io/repository/file-integrity-operator/file-integrity-operator-index?tab=tags

Set resource limits for containers

As a Platform Engineer, I need to control the CPU and memory usage per container.
Please add resource limits:

resources:
  requests:
    cpu: 100m
    memory: 100Mi
  limits:
    memory: 200Mi
    cpu: 400m

The AIDE pods were running for a day and used 1.6 GB of memory for no reason.

cheers

Operator not cleaning up old aide.log.backup and aide.db.gz.backup files

Presently we are running the file-integrity-operator; however, it is filling up the /etc/kubernetes directory with files dating back more than 3-4 months. Is there a way for us to clean up the files automatically after 30 days or some other period, so that they don't use up the space and fill up the disk drive? Cleanup using logrotate would also be a feasible option, since it is already available in CoreOS.

sh-4.4# ls -al aide.db.gz.backup* | wc -l
252

sh-4.4# ls -al aide.log.backup* | wc -l
252

Add initialDelay option

When a user deploys FIO shortly after creating a new cluster, it activates quickly after the cluster is built. This can lead to problems when updates are still being rolled out to the cluster and the MachineConfigPools are still updating. The resulting file changes trigger failed FIO checks, producing unwanted alerts for every new cluster. To address this, we will discuss implementing an initialDelay option in FIO.

e2e test node cleanup does not run

Our e2e tests show these events:
"message": "Error creating: pods \"aide-clean-\" is forbidden: error looking up service account openshift-file-integrity/file-integrity-operator: serviceaccount \"file-integrity-operator\" not found",
The service account is being cleaned up before the aide-clean deployment is created.

E2E testing fails due to failed node taints

We're experiencing an issue in CI where the E2E testing fails because a node can't be tainted, which manifests in the following error:

 === RUN   TestFileIntegrityTolerations
&Namespace{ObjectMeta:{osdk-e2e-6c36263d-255d-4a22-9659-ba12e7965054      0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[pod-security.kubernetes.io/enforce:privileged security.openshift.io/scc.podSecurityLabelSync:false] map[] [] [] []},Spec:NamespaceSpec{Finalizers:[],},Status:NamespaceStatus{Phase:,Conditions:[]NamespaceCondition{},},}
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-6c36263d-255d-4a22-9659-ba12e7965054/file-integrity-daemon) created
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-6c36263d-255d-4a22-9659-ba12e7965054/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-6c36263d-255d-4a22-9659-ba12e7965054/file-integrity-daemon) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-6c36263d-255d-4a22-9659-ba12e7965054/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-6c36263d-255d-4a22-9659-ba12e7965054/leader-election-role) created
    client.go:47: resource type ClusterRole with namespace/name (/file-integrity-operator) created
    client.go:47: resource type ClusterRole with namespace/name (/file-integrity-operator-metrics) created
    client.go:47: resource type ClusterRole with namespace/name (/fileintegrity-editor-role) created
    client.go:47: resource type ClusterRole with namespace/name (/fileintegrity-viewer-role) created
    client.go:47: resource type RoleBinding with namespace/name (osdk-e2e-6c36263d-255d-4a22-9659-ba12e7965054/file-integrity-daemon) created
    client.go:47: resource type RoleBinding with namespace/name (osdk-e2e-6c36263d-255d-4a22-9659-ba12e7965054/file-integrity-operator) created
    client.go:47: resource type RoleBinding with namespace/name (osdk-e2e-6c36263d-255d-4a22-9659-ba12e7965054/file-integrity-operator-metrics) created
    client.go:47: resource type RoleBinding with namespace/name (osdk-e2e-6c36263d-255d-4a22-9659-ba12e7965054/leader-election-rolebinding) created
    client.go:47: resource type ClusterRoleBinding with namespace/name (/file-integrity-operator) created
    client.go:47: resource type Deployment with namespace/name (osdk-e2e-6c36263d-255d-4a22-9659-ba12e7965054/file-integrity-operator) created
    helpers.go:282: Initialized cluster resources
    wait_util.go:59: Deployment available (1/1)
    helpers.go:1763: Tainting node: ip-10-0-104-211.ec2.internal
    helpers.go:710: Tainting node failed
--- FAIL: TestFileIntegrityTolerations (30.59s)
=== RUN   TestFileIntegrityLogCompress 

Here is an example patch that causes this issue: #426

Since E2E tests are run serially, this causes failures to cascade through the entire suite.

Unable to install kustomize with golang 1.18

I'm unable to use kustomize, or any Makefile targets that leverage kustomize, with golang 1.18:

$ ls build/kustomize
ls: cannot access 'build/kustomize': No such file or directory
$ make kustomize
go: creating new go.mod: module tmp
Downloading sigs.k8s.io/kustomize/kustomize/v3@v3.8.7
go: added cloud.google.com/go v0.38.0
go: added github.com/Azure/go-autorest/autorest v0.9.0
go: added github.com/Azure/go-autorest/autorest/adal v0.5.0
go: added github.com/Azure/go-autorest/autorest/date v0.1.0
go: added github.com/Azure/go-autorest/logger v0.1.0
go: added github.com/Azure/go-autorest/tracing v0.5.0
go: added github.com/PuerkitoBio/purell v1.1.1
go: added github.com/PuerkitoBio/urlesc v0.0.0-20170810143723-de5bf2ad4578
go: added github.com/asaskevich/govalidator v0.0.0-20190424111038-f61b66f89f4a
go: added github.com/bgentry/go-netrc v0.0.0-20140422174119-9fd32a8b3d3d
go: added github.com/davecgh/go-spew v1.1.1
go: added github.com/dgrijalva/jwt-go v3.2.0+incompatible
go: added github.com/emicklei/go-restful v0.0.0-20170410110728-ff4f55a20633
go: added github.com/evanphx/json-patch v4.9.0+incompatible
go: added github.com/go-errors/errors v1.0.1
go: added github.com/go-openapi/analysis v0.19.5
go: added github.com/go-openapi/errors v0.19.2
go: added github.com/go-openapi/jsonpointer v0.19.3
go: added github.com/go-openapi/jsonreference v0.19.3
go: added github.com/go-openapi/loads v0.19.4
go: added github.com/go-openapi/runtime v0.19.4
go: added github.com/go-openapi/spec v0.19.5
go: added github.com/go-openapi/strfmt v0.19.5
go: added github.com/go-openapi/swag v0.19.5
go: added github.com/go-openapi/validate v0.19.8
go: added github.com/go-stack/stack v1.8.0
go: added github.com/gogo/protobuf v1.3.1
go: added github.com/golang/protobuf v1.3.2
go: added github.com/google/gofuzz v1.1.0
go: added github.com/google/shlex v0.0.0-20191202100458-e7afc7fbc510
go: added github.com/googleapis/gnostic v0.1.0
go: added github.com/gophercloud/gophercloud v0.1.0
go: added github.com/hashicorp/errwrap v1.0.0
go: added github.com/hashicorp/go-cleanhttp v0.5.0
go: added github.com/hashicorp/go-multierror v1.1.0
go: added github.com/hashicorp/go-safetemp v1.0.0
go: added github.com/hashicorp/go-version v1.1.0
go: added github.com/inconshreveable/mousetrap v1.0.0
go: added github.com/json-iterator/go v1.1.8
go: added github.com/mailru/easyjson v0.7.0
go: added github.com/mattn/go-runewidth v0.0.7
go: added github.com/mitchellh/go-homedir v1.1.0
go: added github.com/mitchellh/go-testing-interface v1.0.0
go: added github.com/mitchellh/mapstructure v1.1.2
go: added github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd
go: added github.com/modern-go/reflect2 v1.0.1
go: added github.com/monochromegane/go-gitignore v0.0.0-20200626010858-205db1a8cc00
go: added github.com/olekukonko/tablewriter v0.0.4
go: added github.com/pkg/errors v0.9.1
go: added github.com/pmezard/go-difflib v1.0.0
go: added github.com/qri-io/starlib v0.4.2-0.20200213133954-ff2e8cd5ef8d
go: added github.com/spf13/cobra v1.0.0
go: added github.com/spf13/pflag v1.0.5
go: added github.com/stretchr/testify v1.6.1
go: added github.com/ulikunitz/xz v0.5.5
go: added github.com/xlab/treeprint v0.0.0-20181112141820-a009c3971eca
go: added github.com/yujunz/go-getter v1.4.1-lite
go: added go.mongodb.org/mongo-driver v1.1.2
go: added go.starlark.net v0.0.0-20200306205701-8dd3e2ee1dd5
go: added golang.org/x/crypto v0.0.0-20200622213623-75b288015ac9
go: added golang.org/x/net v0.0.0-20200625001655-4c5254603344
go: added golang.org/x/oauth2 v0.0.0-20190604053449-0f29369cfe45
go: added golang.org/x/sys v0.0.0-20200323222414-85ca7c5b95cd
go: added golang.org/x/text v0.3.2
go: added golang.org/x/time v0.0.0-20190308202827-9d24e82272b4
go: added google.golang.org/appengine v1.5.0
go: added gopkg.in/inf.v0 v0.9.1
go: added gopkg.in/yaml.v2 v2.3.0
go: added gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c
go: added k8s.io/api v0.18.10
go: added k8s.io/apimachinery v0.18.10
go: added k8s.io/client-go v0.18.10
go: added k8s.io/klog v1.0.0
go: added k8s.io/kube-openapi v0.0.0-20200410145947-61e04a5be9a6
go: added k8s.io/utils v0.0.0-20200324210504-a9aa75ae1b89
go: added sigs.k8s.io/kustomize/api v0.6.5
go: added sigs.k8s.io/kustomize/cmd/config v0.8.5
go: added sigs.k8s.io/kustomize/kustomize/v3 v3.8.7
go: added sigs.k8s.io/kustomize/kyaml v0.9.4
go: added sigs.k8s.io/structured-merge-diff/v3 v3.0.0
go: added sigs.k8s.io/yaml v1.2.0
$ ls build/kustomize
ls: cannot access 'build/kustomize': No such file or directory

We hit this same issue in the compliance-operator.

ComplianceAsCode/compliance-operator#145

Move aide.reinit out of `/etc`

This is related to ostreedev/ostree#2453 - see also https://bugzilla.redhat.com/show_bug.cgi?id=1945274

Basically it looks like this operator is writing a file into /etc at runtime. Since this is for dynamic state, not persistent state, it should go in /run. This would avoid a seen-in-the-wild race condition at shutdown time where ostree wants to propagate the modified config file, but the daemonset has deleted it in the meantime.

(Arguably, ostree should be OK with files vanishing in /etc at shutdown time, but OTOH it can mean it's capturing an inconsistent snapshot of `/etc`.)

Unable to run end-to-end tests using operator installed from catalog source

In the process of debugging an upgrade issue from 0.1.27 to 0.1.28, we started working on an upgrade CI job that exercises upgrades. To use the OpenShift CI tooling developed to help test this case, we need to install the operator using catalog sources.

The basic flow for the test is to:

1.) Install $OPERATOR_VERSION (e.g., 0.1.27) to the cluster using catalog sources
2.) Upgrade the operator to $OPERATOR_VERSION + 1 (0.1.28)
3.) Run make e2e against the upgraded operator

In the process of developing this job, I attempted to run the end-to-end tests against FIO installed from a catalog source and saw the following failures:

=== RUN   TestFileIntegrityBadConfig
I0719 16:23:21.705052   80310 request.go:665] Waited for 2.650165646s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/authorization.k8s.io/v1?timeout=32s
I0719 16:23:31.705114   80310 request.go:665] Waited for 12.650244158s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/events.k8s.io/v1?timeout=32s
I0719 16:23:41.904659   80310 request.go:665] Waited for 6.557941133s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/authentication.k8s.io/v1?timeout=32s
I0719 16:23:52.903704   80310 request.go:665] Waited for 1.157020028s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/events.k8s.io/v1beta1?timeout=32s
I0719 16:24:03.102727   80310 request.go:665] Waited for 11.355830053s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/storage.k8s.io/v1beta1?timeout=32s
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-2ac8be42-1c4c-4d23-8265-36a0b870dfc7/file-integrity-daemon) created
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-2ac8be42-1c4c-4d23-8265-36a0b870dfc7/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-2ac8be42-1c4c-4d23-8265-36a0b870dfc7/file-integrity-daemon) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-2ac8be42-1c4c-4d23-8265-36a0b870dfc7/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-2ac8be42-1c4c-4d23-8265-36a0b870dfc7/leader-election-role) created
    helpers.go:265: failed to initialize cluster resources: clusterroles.rbac.authorization.k8s.io "file-integrity-operator" already exists
--- FAIL: TestFileIntegrityBadConfig (49.22s)
=== RUN   TestFileIntegrityTolerations
I0719 16:24:13.301850   80310 request.go:665] Waited for 5.02631644s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/whereabouts.cni.cncf.io/v1alpha1?timeout=32s
I0719 16:24:23.501172   80310 request.go:665] Waited for 15.225555488s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/route.openshift.io/v1?timeout=32s
I0719 16:24:33.501605   80310 request.go:665] Waited for 8.958257296s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/discovery.k8s.io/v1beta1?timeout=32s
I0719 16:24:43.700779   80310 request.go:665] Waited for 2.757043945s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/console.openshift.io/v1?timeout=32s
I0719 16:24:53.900198   80310 request.go:665] Waited for 12.956345588s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/route.openshift.io/v1?timeout=32s
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-3fba73bd-843a-4e44-a243-bcfc3c6b20bc/file-integrity-daemon) created
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-3fba73bd-843a-4e44-a243-bcfc3c6b20bc/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-3fba73bd-843a-4e44-a243-bcfc3c6b20bc/file-integrity-daemon) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-3fba73bd-843a-4e44-a243-bcfc3c6b20bc/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-3fba73bd-843a-4e44-a243-bcfc3c6b20bc/leader-election-role) created
    helpers.go:265: failed to initialize cluster resources: clusterroles.rbac.authorization.k8s.io "file-integrity-operator" already exists
--- FAIL: TestFileIntegrityTolerations (49.23s)
=== RUN   TestFileIntegrityLogCompress
I0719 16:25:03.900496   80310 request.go:665] Waited for 6.393417622s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/flowcontrol.apiserver.k8s.io/v1beta1?timeout=32s
I0719 16:25:14.899662   80310 request.go:665] Waited for 1.159087399s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/network.openshift.io/v1?timeout=32s
I0719 16:25:24.900039   80310 request.go:665] Waited for 11.159307613s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/flowcontrol.apiserver.k8s.io/v1beta1?timeout=32s
I0719 16:25:35.099801   80310 request.go:665] Waited for 4.958409848s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/monitoring.coreos.com/v1alpha1?timeout=32s
I0719 16:25:45.298337   80310 request.go:665] Waited for 15.156909324s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/network.openshift.io/v1?timeout=32s
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-466eb796-6b8b-4aa5-9ecd-a1d4a4dc431e/file-integrity-daemon) created
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-466eb796-6b8b-4aa5-9ecd-a1d4a4dc431e/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-466eb796-6b8b-4aa5-9ecd-a1d4a4dc431e/file-integrity-daemon) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-466eb796-6b8b-4aa5-9ecd-a1d4a4dc431e/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-466eb796-6b8b-4aa5-9ecd-a1d4a4dc431e/leader-election-role) created
    helpers.go:265: failed to initialize cluster resources: clusterroles.rbac.authorization.k8s.io "file-integrity-operator" already exists
--- FAIL: TestFileIntegrityLogCompress (49.15s)
=== RUN   TestFileIntegrityAcceptsExpectedChange
I0719 16:25:55.298845   80310 request.go:665] Waited for 8.646337137s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/flowcontrol.apiserver.k8s.io/v1beta2?timeout=32s
I0719 16:26:05.497989   80310 request.go:665] Waited for 2.556343286s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/autoscaling.openshift.io/v1beta1?timeout=32s
I0719 16:26:15.697948   80310 request.go:665] Waited for 12.756205324s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/flowcontrol.apiserver.k8s.io/v1beta1?timeout=32s
I0719 16:26:25.897570   80310 request.go:665] Waited for 6.554694131s due to client-side throttling, not priority and fairness, request: GET:https://api.lbragstad-dev.devcluster.openshift.com:6443/apis/route.openshift.io/v1?timeout=32s
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-d668790f-c305-4625-a136-5f21d13bf92d/file-integrity-daemon) created
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-d668790f-c305-4625-a136-5f21d13bf92d/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-d668790f-c305-4625-a136-5f21d13bf92d/file-integrity-daemon) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-d668790f-c305-4625-a136-5f21d13bf92d/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-d668790f-c305-4625-a136-5f21d13bf92d/leader-election-role) created
    helpers.go:265: failed to initialize cluster resources: clusterroles.rbac.authorization.k8s.io "file-integrity-operator" already exists
--- FAIL: TestFileIntegrityAcceptsExpectedChange (49.26s)
FAIL
FAIL    github.com/openshift/file-integrity-operator/tests/e2e  636.762s
FAIL
make: *** [Makefile:422: e2e] Error 1

@mrogers950 noted this in his PR for the test openshift/release#30613

Opening this bug to track the work we need to do to run end-to-end tests against an operator deployed from the catalog.

The make release-images target doesn't properly handle the latest tag

The final step of the release process is to push the images to a public registry [0].

When we prepare and push a new release, we specify a version using the VERSION environment variable, which gets translated to a TAG during the image build and push process. The problem is that after we build the images, we don't go back and tag the most recent version (e.g., 0.1.29) as latest.

It's confusing to have a latest tag that doesn't point to the most recent version in the repository. We should tag the latest release with the version we're releasing so that the tags accurately represent the repository and, more importantly, so that using the latest tag retrieves the most recent images.

[0] https://quay.io/organization/file-integrity-operator
