Giter Club home page Giter Club logo

Comments (12)

sbabalol avatar sbabalol commented on August 24, 2024 18

Any ETA on this fix?

from amazon-cloudwatch-agent.

ecerulm avatar ecerulm commented on August 24, 2024 1

Restrict the use of host networking and block access to instance metadata service
sort of recommends blocking IMDS from the pods:

While these privileges are required for the node to operate effectively, it is not usually desirable that the pods running on the node inherit these privileges.

But IMDS access seems like hard requirement for using cloudwatch agent for kubernetes container insights with enhanced observability (enhanced_container_insights).

The alternatives are

  • Increase IMDSv2 hop limit to 2 (that allows all pods to get access to IMDS and use the node IAM role)
  • Increase IMDSv2 hop limit and add k8s NetworkPolicy to each Namespace to block access to 169.254.0.0/16, this is cumbersome since you need to add it to all namespaces (there is no GlobalNetworkPolicy in vanilla Kubernetes) and there is no explicity deny either.
  • Keep IMDSV2 hop limit 1 and run cloudwatch agent daemonset with pod.spec.hostNetwork = true
    • you need to enforce that untrusted pods do not use host networking, with admission controllers

from amazon-cloudwatch-agent.

faizanshah-tp avatar faizanshah-tp commented on August 24, 2024

facing the same while runing cloudwatch agent as a daemonset.

D! [EC2] Found active network interface
I! imds retry client will retry 1 timesD! should retry true for imds error : RequestError: send request failed
caused by: Put "http://169.254.169.254/latest/api/token": context deadline exceeded (Client.Timeout exceeded while awaiting headers)D! should retry true for imds error : RequestError: send request failed
caused by: Put "http://169.254.169.254/latest/api/token": context deadline exceeded (Client.Timeout exceeded while awaiting headers)D! could not get hostname without imds v1 fallback enable thus enable fallback
E! [EC2] Fetch hostname from EC2 metadata fail: EC2MetadataError: failed to make EC2Metadata request

	status code: 401, request id: 
D! should retry true for imds error : RequestError: send request failed
caused by: Put "http://169.254.169.254/latest/api/token": context deadline exceeded (Client.Timeout exceeded while awaiting headers)D! should retry true for imds error : RequestError: send request failed
caused by: Put "http://169.254.169.254/latest/api/token": context deadline exceeded (Client.Timeout exceeded while awaiting headers)D! could not get instance document without imds v1 fallback enable thus enable fallback

from amazon-cloudwatch-agent.

ecerulm avatar ecerulm commented on August 24, 2024

I uninstalled the amazon-cloudwatch-observability eks add-on and installed using the the instructions at https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Container-Insights-setup-metrics.html
and I'm getting the same result.

But I can set RUN_WITH_IRSA="True" (I actually have a service account with IRSA annotation) in the DaemonSet and that makes it detect

I! Detected from ENV RUN_WITH_IRSA is True

The RUN_WITH_IRSA environment variable does not seem to be documented but it's in the source code and it works the values needs to be True (not true or 1).

from amazon-cloudwatch-agent.

ecerulm avatar ecerulm commented on August 24, 2024

Although, using RUN_WITH_IRSA allows the DaemonSet to run and some metrics are sent to CloudWatch . I can see that it still tries to use the IMDS to get the EC2 metadata (instance id, image id, and instance type) I guess it's not possible to get those in any other way currently

The CloudWatch Container Insights still lacks the "Top 10 Nodes by CPU Utilization" , etc. I guess all metrics that have NodeName as dimension are missing.

I guess this means that amazon-cloudwatch-agent really needs IMDS , and maybe it should documented so. You can't have it without it, can you?


2024-03-25T14:14:10Z I! {"caller":"host/ec2metadata.go:78","msg":"Fetch instance id and type from ec2 metadata","kind":"receiver","name":"awscontainerinsightreceiver","data_type":"metrics"}
2024-03-25T14:14:11.425Z	DEBUG	[email protected]/imdsretryer.go:45	imds error : 	{"shouldRetry": true, "error": "RequestError: send request failed\ncaused by: Put \"http://169.254.169.254/latest/api/token\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"}

from amazon-cloudwatch-agent.

jefchien avatar jefchien commented on August 24, 2024

Hi @ecerulm, we are aware of the current IMDS requirement and are tracking an alternative for it when IMDS is unavailable internally.

from amazon-cloudwatch-agent.

AaronFriel avatar AaronFriel commented on August 24, 2024

But IMDS access seems like hard requirement for using cloudwatch agent for kubernetes container insights with enhanced observability (enhanced_container_insights).

This isn't the case - if you edit the operator's agent resource configuration like so, you will see that it is capable of using IRSA:

$ kubectl -n amazon-cloudwatch edit amazoncloudwatchagents.cloudwatch.aws.amazon.com 

Apply this change:

 apiVersion: v1
 items:
 - apiVersion: cloudwatch.aws.amazon.com/v1alpha1
   kind: AmazonCloudWatchAgent
   metadata:
     annotations:
       pulumi.com/patchForce: "true"
     creationTimestamp: "2024-04-01T08:21:38Z"
     generation: 5
     labels:
       app.kubernetes.io/managed-by: amazon-cloudwatch-agent-operator
     name: cloudwatch-agent
     namespace: amazon-cloudwatch
     resourceVersion: "3839446"
     uid: 542fecd4-0368-4ab1-8d8b-e7e5ad47c538
   spec:
     config: '{"agent":{"region":"us-west-2"},"logs":{"metrics_collected":{"app_signals":{"hosted_in":"opal-quokka-6860d02"},"kubernetes":{"cluster_name":"opal-quokka-6860d02","enhanced_container_insights":true}}},"traces":{"traces_collected":{"app_signals":{}}}}'
     env:
+  - name: RUN_WITH_IRSA
+    value: true  
   - name: K8S_NODE_NAME
     valueFrom:
       fieldRef:
         fieldPath: spec.nodeName

from amazon-cloudwatch-agent.

ecerulm avatar ecerulm commented on August 24, 2024

@AaronFriel

Like I commented at #1101 (comment) even with RUN_WITH_IRSA it still goes to IMDS to obtain the instance id, etc

Although, using RUN_WITH_IRSA allows the DaemonSet to run and some metrics are sent to CloudWatch . I can see that it still tries to use the IMDS to get the EC2 metadata (instance id, image id, and instance type)

The instance id, etc are needed metrics for “ kubernetes container insights with enhanced observability (enhanced_container_insights)” and since they can’t be obtained those metric are not sent.

I don’t think there is anyway to pass the instance id, etc by any other means today but @jefchien seems to be indicating that there may be working in some alternative.

from amazon-cloudwatch-agent.

ArthurMelin avatar ArthurMelin commented on August 24, 2024

I don’t think there is anyway to pass the instance id, etc by any other means today but @jefchien seems to be indicating that there may be working in some alternative.

Maybe the agent can grab the instance ID from the spec.providerID field on the Node object which it can be fetched from the kube-api?

from amazon-cloudwatch-agent.

Cornul11 avatar Cornul11 commented on August 24, 2024

But IMDS access seems like hard requirement for using cloudwatch agent for kubernetes container insights with enhanced observability (enhanced_container_insights).

This isn't the case - if you edit the operator's agent resource configuration like so, you will see that it is capable of using IRSA:

$ kubectl -n amazon-cloudwatch edit amazoncloudwatchagents.cloudwatch.aws.amazon.com 

Apply this change:

 apiVersion: v1
 items:
 - apiVersion: cloudwatch.aws.amazon.com/v1alpha1
   kind: AmazonCloudWatchAgent
   metadata:
     annotations:
       pulumi.com/patchForce: "true"
     creationTimestamp: "2024-04-01T08:21:38Z"
     generation: 5
     labels:
       app.kubernetes.io/managed-by: amazon-cloudwatch-agent-operator
     name: cloudwatch-agent
     namespace: amazon-cloudwatch
     resourceVersion: "3839446"
     uid: 542fecd4-0368-4ab1-8d8b-e7e5ad47c538
   spec:
     config: '{"agent":{"region":"us-west-2"},"logs":{"metrics_collected":{"app_signals":{"hosted_in":"opal-quokka-6860d02"},"kubernetes":{"cluster_name":"opal-quokka-6860d02","enhanced_container_insights":true}}},"traces":{"traces_collected":{"app_signals":{}}}}'
     env:
+  - name: RUN_WITH_IRSA
+    value: true  
   - name: K8S_NODE_NAME
     valueFrom:
       fieldRef:
         fieldPath: spec.nodeName

With this change applied to the agent, the issue still persists.

As mentioned in another comment, I created a custom launch template with an increased number of max hops, which solved the issue. I do understand, however, that this may be a security concern and should be avoided, but, as a temporary measure until the addon is fixed, it is acceptable for our use case.

from amazon-cloudwatch-agent.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.