sysdiglabs / charts

The official source for Sysdig’s Helm charts

Home Page: https://charts.sysdig.com

Shell 4.28% Mustache 53.51% Smarty 39.19% Makefile 0.64% Python 2.34% Just 0.05%

sysdig-helm-charts helm chart

charts's Issues

dnsPolicy not set for node-analyzer daemonset while hostNetworking set true

The node-analyzer DaemonSet sets hostNetwork: true but does not specify a dnsPolicy.

As a result, dnsPolicy defaults to ClusterFirst. For pods running on the host network it should be set explicitly to ClusterFirstWithHostNet, as the Kubernetes docs recommend [1].

[1] https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy
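
A minimal sketch of the requested change, written as a bare pod template fragment rather than the chart's actual template. The surrounding structure is an assumption; only hostNetwork and dnsPolicy come from this issue:

```yaml
# Hypothetical DaemonSet fragment; not copied from the node-analyzer chart
spec:
  template:
    spec:
      hostNetwork: true
      # Keeps cluster DNS resolution for pods that share the node's network namespace
      dnsPolicy: ClusterFirstWithHostNet
```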

Specifying runtimeScanner maxFileSizeAllowed as integer can result in failures

Summary

If users specify the nodeAnalyzer.runtimeScanner.settings.maxFileSizeAllowed setting as an integer, the quote function renders small values correctly but renders larger values in floating-point scientific notation (for example "2.62144e+08"), which causes the runtime scanner container to fail to start.

Current behavior

When maxFileSizeAllowed is given as an integer (rather than a string), an incorrect value appears in the resulting ConfigMap, which prevents the scanner from starting correctly.

Desired behavior

The chart should accept the value as either a string or an int64 and render the same plain integer in both cases.

Steps to reproduce

Use the following values file and save it somewhere as values.yaml:

sysdig:
  accessKey: foo

clusterName: foo

nodeAnalyzer:
  deploy: true
  apiEndpoint: https://api.example.com
  runtimeScanner:
    deploy: true
    settings:
      maxFileSizeAllowed: 262144000

Run helm template sysdig/sysdig --values=values.yaml (it likely also works when running locally against this repo, but I have not checked)

Look at the output; for me, it looks like this:

# Source: sysdig/templates/runtimeScanner/runtime-scanner-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: release-name-sysdig-runtime-scanner
  labels:
    helm.sh/chart: sysdig-1.15.87
    app.kubernetes.io/instance: release-name
    app.kubernetes.io/version: "12.14.0"
    app.kubernetes.io/managed-by: Helm
    app: "sysdig-agent"
data:
  api_endpoint: https://https://api.example.com
  cluster_name: foo
  debug: "false"
  eve_integration_enabled: "true"
  prom_port: "25001"
  analyzer.maxFileSizeAllowed: "2.62144e+08"

You can also try changing the maxFileSizeAllowed to a smaller number, and then quote behaves as we expect:

# Source: sysdig/templates/runtimeScanner/runtime-scanner-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: release-name-sysdig-runtime-scanner
  labels:
    helm.sh/chart: sysdig-1.15.87
    app.kubernetes.io/instance: release-name
    app.kubernetes.io/version: "12.14.0"
    app.kubernetes.io/managed-by: Helm
    app: "sysdig-agent"
data:
  api_endpoint: https://https://api.example.com
  cluster_name: foo
  debug: "false"
  eve_integration_enabled: "true"
  prom_port: "25001"
  analyzer.maxFileSizeAllowed: "2621"

I also verified that it is possible to reproduce this with the new sysdig-deploy chart using this values file:

global:
  sysdig:
    accessKey: foo

agent:
  enabled: false

nodeAnalyzer:
  enabled: true
  clusterName: foo
  apiEndpoint: https://api.example.com
  secure:
    enabled: true
    vulnerabilityManagement:
      newEngineOnly: true
  nodeAnalyzer:
    enabled: true
    runtimeScanner:
      deploy: true
      settings:
        maxFileSizeAllowed: 2621121212

This is the output:

# Source: sysdig-deploy/charts/nodeAnalyzer/templates/runtimeScanner/runtime-scanner-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: sysdig-runtime-scanner
  labels:
    helm.sh/chart: nodeAnalyzer-1.8.50
    app.kubernetes.io/instance: sysdig
    app.kubernetes.io/version: "12.6.0"
    app.kubernetes.io/managed-by: Helm
data:
  api_endpoint: https://secure.sysdig.com
  cluster_name: foo
  debug: "false"
  eve_integration_enabled: "true"
  prom_port: "25001"
  analyzer.maxFileSizeAllowed: "2.621121212e+09"

Workaround

Pass the value as a quoted string, as the chart's default values do.
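
For the sysdig chart values used in the reproduction steps, the workaround would look roughly like this. Only the quoting changes:

```yaml
nodeAnalyzer:
  runtimeScanner:
    settings:
      # Quoted, so the YAML parser yields a string instead of a number
      maxFileSizeAllowed: "262144000"
```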

Other details

This chart pipes the value through | quote, but by then it has already been parsed as a floating-point number (understandable, since Helm's values parsing goes through Go's JSON decoding, which represents every untyped number as a float64):

charts/charts/node-analyzer/templates/runtimeScanner/runtime-scanner-configmap.yaml

Line 37 in 006d2fc

{{- if .Values.nodeAnalyzer.runtimeScanner.settings.maxFileSizeAllowed }}

A possible solution may be to use one of Helm's type conversion functions to force it to be treated as an int64, but I don't know if that causes issues when values are passed in as a string: https://helm.sh/docs/chart_template_guide/function_list/#type-conversion-functions
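
A sketch of that idea, not tested against this chart. int64 is one of the type-conversion functions on the linked page, and the analyzer.maxFileSizeAllowed key is taken from the rendered ConfigMap above. The template line that emits the key is not quoted in this issue, so its shape here is an assumption:

```yaml
{{- if .Values.nodeAnalyzer.runtimeScanner.settings.maxFileSizeAllowed }}
  # Assumed shape of the template line; int64 turns the float64 back into a whole number before quoting
  analyzer.maxFileSizeAllowed: {{ .Values.nodeAnalyzer.runtimeScanner.settings.maxFileSizeAllowed | int64 | quote }}
{{- end }}
```

A plain numeric string such as "262144000" should convert the same way, but a string that is not a plain integer would probably not survive the conversion, so this needs checking against the values the scanner accepts.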

Support using cert-manager for admission controller certs

It would be great if we could use cert-manager to provision the certs for the admission controller. KEDA supports this, for example: https://github.com/kedacore/charts/tree/main/keda/templates/cert-manager

I plan to prepare a PR for review.

sysdig-agent daemonset does not respect rolling update strategy

I have observed that the sysdig-agent DaemonSet ignores the rolling update strategy, and its pods take around 10 minutes to pass readiness checks.

Upon upgrading the Helm chart, all the agent pods are replaced with new pods even though the readiness check is still failing on the newly created pods.

  updateStrategy:
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
    type: RollingUpdate

However, node-analyzer with the same updateStrategy works as expected.

I understand this could be unrelated to the helm charts and please feel free to point me in the right direction. Thank you!

[admission-controller] Helm chart generates duplicate ValidatingWebhookConfiguration resources

Hi,

Previously we used Terraform to deploy the sysdig-deploy Helm chart to our cluster, but we are now trying to move the setup to Argo CD using Kustomize. This works for all our other charts, but the admission-controller sub-chart seems to generate multiple resources with the same name/ID, which causes Kustomize to fail with the following message:

Error: may not add resource with an already registered id: admissionregistration.k8s.io_v1_ValidatingWebhookConfiguration|sysdig-agent|sysdig-cluster-agent-admissioncontroller-webhook

Our kustomization.yaml is shown here; the values.yaml it references sets admissionController.enabled: true:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
helmCharts:
  - name: sysdig-deploy
    namespace: sysdig-agent
    repo: https://charts.sysdig.com/
    releaseName: sysdig-cluster-agent
    valuesFile: values.yaml
    version: "1.19.2"
    includeCRDs: true

Running helm template sysdig sysdig/sysdig-deploy --values values.yaml works, but the output contains two resources with the same name:

# Source: sysdig-deploy/charts/admissionController/templates/webhook/admissionregistration.yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: sysdig-admissioncontroller-webhook
  namespace: default
webhooks: []

and

# Source: sysdig-deploy/charts/admissionController/templates/webhook/admissionregistration.yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: sysdig-admissioncontroller-webhook
  namespace: default
  annotations:
    "helm.sh/hook": "post-install, post-upgrade"
    meta.helm.sh/release-name: sysdig
    meta.helm.sh/release-namespace: default
  labels:
    app.kubernetes.io/managed-by: Helm
webhooks:
- name: audit.secure.sysdig.com
  matchPolicy: Equivalent
  rules:
    <truncated>
  clientConfig:
    <truncated>
  admissionReviewVersions: ["v1", "v1beta1"]
  sideEffects: None
  timeoutSeconds: 5
  failurePolicy: Ignore

I don't know the reason for creating an empty ValidatingWebhookConfiguration and then overwriting it. It seemingly works with Helm, but it causes problems for Kustomize. Let me know whether you would like me to open a PR with a suggested solution.

Node Analyzer Benchmark Write Permissions Toggle Statements Duplicated

In the Benchmark Runner ClusterRole there is a toggle based on .Values.nodeAnalyzer.benchmarkRunner.includeSensitivePermissions that should strip out the write permissions in the role. Above it, however, there is a static statement that grants the Node Analyzer ClusterRole those permissions regardless. If the value is set to true, the same rules show up twice in the role definition.

This is the only thing that requires cluster-wide pod creation and deletion permissions, so if it is not needed I would like to strip these permissions out. If they are needed for Node Analyzer's standard operation, please remove the toggle to avoid confusion about whether the permissions are necessary.

# https://github.com/sysdiglabs/charts/blob/master/charts/node-analyzer/templates/clusterrole-node-analyzer.yaml#L36
- apiGroups:
    - ""
  resources:
    - "pods"
    - "pods/exec"
  verbs:
    - "create"
- apiGroups:
    - ""
  resources:
    - "pods"
  verbs:
    - "delete"

# https://github.com/sysdiglabs/charts/blob/master/charts/node-analyzer/templates/clusterrole-node-analyzer.yaml#L99
{{- if .Values.nodeAnalyzer.benchmarkRunner.includeSensitivePermissions  }}
- apiGroups:
    - ""
  resources:
    - pods/exec
  verbs:
    - create
- apiGroups:
    - ""
  resources:
    - pods
  verbs:
    - create
    - delete
{{- end }}
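
For reference, the toggle from the second snippet would be switched off in values like this. As described above, the static rule from the first snippet would still grant create and delete on pods, so this setting alone does not remove the permissions:

```yaml
nodeAnalyzer:
  benchmarkRunner:
    # Value path taken from the template condition quoted above
    includeSensitivePermissions: false
```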

Updating PSP

Hey Team,

We are planning to upgrade our Kubernetes version, and we got an alert about the PodSecurityPolicy (PSP) used by the kspm-collector. Would it be possible to update the template so that it uses Pod Security Admission (PSA) on the namespace instead? PSP is removed as of Kubernetes 1.25.
Ref: https://github.com/sysdiglabs/charts/blob/master/charts/kspm-collector/templates/psp.yaml#L2C1-L3C1
https://kubernetes.io/docs/reference/using-api/deprecation-guide/

If this is already available, can you guide me through the update?

Cheers,
Vikranth
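
For context, Pod Security Admission is configured through labels on the namespace rather than through a separate policy resource. A minimal sketch, where the namespace name and the chosen levels are assumptions (the level the kspm-collector actually needs is not stated in this issue):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: sysdig-agent  # hypothetical namespace name
  labels:
    # One of: privileged, baseline, restricted
    pod-security.kubernetes.io/enforce: privileged
    # Optional: warn about pods that would violate a stricter level
    pod-security.kubernetes.io/warn: baseline
```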

Using old beta annotations for arch and os

The Sysdig charts still use the deprecated beta node labels in their node affinity:

Warning: spec.template.spec.affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms[1].matchExpressions[0].key: beta.kubernetes.io/arch is deprecated since v1.14; use "kubernetes.io/arch" instead
Warning: spec.template.spec.affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms[1].matchExpressions[1].key: beta.kubernetes.io/os is deprecated since v1.14; use "kubernetes.io/os" instead
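
The replacement the warnings ask for might look like the fragment below. The keys come from the warning text; the operator and the amd64 and linux values are assumptions for illustration:

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/arch   # replaces beta.kubernetes.io/arch
              operator: In
              values:
                - amd64                 # assumed value
            - key: kubernetes.io/os     # replaces beta.kubernetes.io/os
              operator: In
              values:
                - linux                 # assumed value
```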

sysdig-clustershield deployment does not restart when configmap, secrets, or the webhook change.

Imagine the following scenario: the clustershield Helm chart is
re-deployed and the contents of its configmap, secrets, or
webhook change, but the clustershield deployment itself does not
change. The clustershield deployment won't restart because it does not
know that anything has changed, even though the things it depends on
have changed.

This problem is especially pronounced for clustershield because with
the default behavior it will auto-generate the certificates used for
communication between the webhook and the deployment. If the webhook
certificate changes and the deployment doesn't restart, the webhook
won't be able to communicate with the pods in the deployment.
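
One common way to handle this is the checksum-annotation pattern described in the Helm documentation: hash the rendered dependency into a pod template annotation, so that any change produces a new pod template and triggers a rollout. A sketch with hypothetical template file names (the chart's actual file names are not shown in this issue):

```yaml
kind: Deployment
spec:
  template:
    metadata:
      annotations:
        # Hypothetical paths; each hash changes whenever the rendered file changes
        checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
        checksum/secrets: {{ include (print $.Template.BasePath "/secrets.yaml") . | sha256sum }}
```

If the certificates are regenerated on every render, this would presumably restart the pods on every upgrade, which may or may not be acceptable.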

Test resources are being rendered via helm

Hey, it seems that test resources are being rendered in the latest version even though they should be excluded by the .helmignore file.

For example, this resource can be seen when running: helm template sysdig-deploy --namespace sysdig-agent templates/platform/sysdig-agent/helm

Can you fix it, please? It is blocking us from updating to the latest Helm chart.

Thanks :)

Related ticket: 00026796

cgroups v2 for sysdig agent

Please update the Sysdig agent to be compatible with AKS.
Beginning with AKS 1.25 and the upgrade to Ubuntu 22.04.2 LTS, the Sysdig agent breaks our standard environment for deploying Java apps because of the cgroup v2 setting.
This makes Sysdig Secure unusable for us.

@aroberts87

[sysdig-deploy] Duplicated k8s_coldstart settings

When installing the sysdig/sysdig-deploy chart with cold-start values, e.g.:

  sysdig:
    settings:
      k8s_coldstart:
        max_parallel_cold_start: 0

The generated agent ConfigMap contains a duplicate, empty k8s_coldstart key, i.e.:

apiVersion: v1
data:
  dragent.yaml: |
    new_k8s: true
    k8s_cluster_name: esc-3113
    collector: collector-staging.sysdigcloud.com
    collector_port: 6443
    security:
      enabled: true
      k8s_audit_server_enabled: true
      k8s_audit_server_port: 7765
      k8s_audit_server_url: 0.0.0.0
    k8s_coldstart: {}


    k8s_coldstart:
      max_parallel_cold_start: 0

Cluster Scoped resource is installed in namespace.

The cluster-scoped resource ValidatingWebhookConfiguration is being rendered with a namespace here. While Helm and Kubernetes ignore the error and deploy the chart, Google Config Sync has stricter validation and does not deploy the resource at all.
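
What a strict validator expects is the same object without metadata.namespace. A sketch, reusing the resource name from the helm template output in the earlier admission-controller issue; the empty webhooks list is a placeholder:

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  # Cluster-scoped: no metadata.namespace field
  name: sysdig-admissioncontroller-webhook
webhooks: []
```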

`cluster-scanner` must not set AMD64 `nodeSelector`

Hi,

cluster-scanner appears to be ARM compatible, but it still has the nodeSelector property set so that it is scheduled on AMD64 only.

Could this please be removed?

Thank you
