
Comments (27)

longwuyuan commented on June 14, 2024

/remove-kind bug

The bug-report template clearly asks for related information which you have skipped. For example, why does your issue description imply that an ingress using ssl-passthrough is in the same use case as a regular ingress NOT using the ssl-passthrough annotation?

It would help to know the details asked for in the new bug-report template, because when the connection has passed through the controller and TLS is terminated directly on the backend pod, why does your issue description imply HTTP/HTTPS termination on the ingress controller?

/kind support
/triage needs-information

from ingress-nginx.

virtualdxs commented on June 14, 2024

As far as I know, the only thing I skipped was the reproduction instructions, which I planned to write out and provide tomorrow.

To answer your questions, I have several Kubernetes clusters whose APIs are behind this ingress controller, which needs SSL passthrough. Separately, I have typical Ingress services, where TLS is terminated on the Ingress controller. I needed to enable SSL passthrough for the K8s APIs, which broke client IP address detection for everything. To fix that, I set the PROXY protocol config options, which fixed the IP address detection but broke HTTP for all Ingress resources.

More concisely, configuring the Ingress controller to allow for SSL passthrough breaks HTTP for all Ingress resources, including non-passthrough ones.

longwuyuan commented on June 14, 2024

Thanks for the update; it helps.

There is almost no data to analyze or reason from, and the questions are in the new bug-report template. Please look at it and edit your issue description to answer them.

One example of required information that you skipped: there is zero context on proxy-protocol. I can see this is bare metal, and proxy-protocol needs to be enabled on all the hops. You have not even told us what LoadBalancer you enabled proxy-protocol on, above the controller's Service of --type LoadBalancer.

Another example: the template asks for k get all, but we don't even see the output of k get anything in your issue description.

With the limited info in the issue description, you could not be using proxy-protocol as recommended, because the controller needs the protocol enabled on the LB above it; that is how the client info is retained.

And this cannot be reproduced in a minikube/kind cluster, because MetalLB can NOT use proxy-protocol, and other LBs that run on bare metal and can be configured with proxy-protocol are not well known, even if one exists.

virtualdxs commented on June 14, 2024

One example of required information that you skipped: there is zero context on proxy-protocol. I can see this is bare metal, and proxy-protocol needs to be enabled on all the hops. You have not even told us what LoadBalancer you enabled proxy-protocol on, above the controller's Service of --type LoadBalancer.

The LB is irrelevant here: the PROXY protocol is spoken only within the nginx pod, not by the load balancer. I will update with kind reproduction instructions when I can. The LB in use is MetalLB, but again, that is irrelevant, as it is not what speaks the PROXY protocol.

Another example: the template asks for k get all, but we don't even see the output of k get anything in your issue description.

Check the "Current State of the controller:" section, which I collapsed with <details> as the issue template requested.

virtualdxs commented on June 14, 2024

This comment is what prompted me to configure the proxy protocol: #8052 (comment)

longwuyuan commented on June 14, 2024

Thanks. You can review the collapsed details info. I was trying to figure out how you get client info over the LB, but you can see it's all just a long stream of flat, unformatted text. If you can fix the md formatting, maybe it will become legible.

I can guess reasons why you say that proxy-protocol on the LB above the controller is not relevant. But it's not something the project can extensively dig into. So far, the docs and the understanding are that the LB above the controller has proxy-protocol enabled and the controller also has proxy-protocol enabled; that way the client info is retained across hops. If the LB is an L7 LB, then I think the X-Forwarded headers contain the client info.

In any case, I will wait for the reproduce procedure. But if your reproduce procedure does not simulate a real-world use case, like TLS terminating on an LB running L4 proxy-protocol, it will cause a lot of confusion.

longwuyuan commented on June 14, 2024

I just now saw the comment you linked.
I will wait for you to md-format the info you said is in the collapsed details section.

longwuyuan commented on June 14, 2024

Note the mention of the LB above the controller at these links:

https://kubernetes.github.io/ingress-nginx/user-guide/miscellaneous/#proxy-protocol
https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/configmap/#use-proxy-protocol

virtualdxs commented on June 14, 2024

I have updated the issue with a reproduction in kind, without any LB. Apologies for the formatting of the cluster-state section; I forgot to check that before submitting. It has been updated as well.

longwuyuan commented on June 14, 2024

Thanks. The reproduce steps are super clear and helpful. The reproduced use case is also super clear and now describes your report appropriately.

However, I cannot reproduce. Both the HTTP and the HTTPS requests returned status code 200 for me. See below.

(screenshot)

I also concur with your two conclusions.
But I have some observations, listed below.

  • Both the HTTP & the HTTPS calls failed for me even WITHOUT ssl-passthrough (and proxy-protocol enabled)
    (screenshot)

  • There is no TLS section visible in the yaml of the ingress resource you pasted, so it makes sense that you want ssl-passthrough. But this test is invalid for me: without a TLS section, why should I make an HTTPS call to this ingress resource's hostname? There is not even a --default-ssl-certificate configured on the controller here.

  • The very definition of proxy-protocol involves receiving information from a proxy. In this test, the client information reaches the controller directly from the client (and the client is the controller pod itself), so the controller is not receiving any client information from any proxy. So this test is invalid for me.

  • The log messages clearly indicate my observations. They state that the header is broken while reading the PROXY protocol.

2024/05/17 00:32:04 [error] 50#50: *22948 broken header: "ߜ;$;EI7A;r V!#g&9!/D}t{dkj&Xmc#C>,0̨̩̪+/$(k#'g" while reading PROXY protocol, client: 127.0.0.1, server: 0.0.0.0:443
2024/05/17 00:32:37 [error] 43#43: *24541 broken header: "HEAD / HTTP/1.1" while reading PROXY protocol, client: 127.0.0.1, server: 0.0.0.0:80

  • The src IP is 127.0.0.1, which again makes this test invalid for me, because if proxy-protocol were working, I should see a real client IP here. It's moot in the context of the above observations, but it's a bullet point worth emphasizing nevertheless.
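For context on the "broken header" log lines: PROXY protocol v1 requires the very first bytes of a connection to be a single "PROXY ..." line, so anything else (raw TLS bytes, or a plain HTTP request line) is rejected. The following is an illustrative Python sketch of that check, not nginx source:

```python
# Illustrative sketch of PROXY protocol v1 header parsing. A valid header
# is one line, e.g.:  PROXY TCP4 192.0.2.10 198.51.100.1 56324 443\r\n
# Any connection that starts with anything else is a "broken header".

def parse_proxy_v1(data: bytes):
    """Return (src_ip, dst_ip, src_port, dst_port) or raise ValueError."""
    line, sep, _rest = data.partition(b"\r\n")
    if not sep or not line.startswith(b"PROXY "):
        raise ValueError("broken header: %r" % line[:32])
    parts = line.split(b" ")
    if len(parts) != 6 or parts[1] not in (b"TCP4", b"TCP6"):
        raise ValueError("broken header: %r" % line[:32])
    return (parts[2].decode(), parts[3].decode(), int(parts[4]), int(parts[5]))

# A peer that speaks PROXY protocol first parses fine:
print(parse_proxy_v1(b"PROXY TCP4 192.0.2.10 198.51.100.1 56324 443\r\nGET / HTTP/1.1\r\n"))

# A plain HTTP client (like curl hitting port 80) does not, which matches
# the logged error 'broken header: "HEAD / HTTP/1.1"':
try:
    parse_proxy_v1(b"HEAD / HTTP/1.1\r\nHost: foo.bar\r\n")
except ValueError as e:
    print(e)
```

This is why both log lines above appear: port 443 received raw TLS bytes, and port 80 received a raw HTTP request, while nginx expected a PROXY header on both.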

So I agree with your observations and your use case. But the project is designed for multiple use cases, and the largest use case is on a cloud, or in an environment where there is an LB before the controller. So your test is not exposing a bug or a problem with the controller, because the users who have ssl-passthrough enabled on AWS, GCP, or Azure, with proxy-protocol enabled on the cloud-provider LB, have not reported this. If possible, I hope we can reproduce this on AWS, GCP, or Azure to confirm what you are claiming.

These are my opinions anyway. Please wait for other comments.

virtualdxs commented on June 14, 2024

It appears there is still miscommunication here, and I am wondering how the reproduction failed. I have re-run through the reproduction steps, and found:

  • Without ssl-passthrough or proxy-protocol: HTTP 200, HTTPS 200
  • With ssl-passthrough; without proxy-protocol: HTTP 200, HTTPS 200
  • Without ssl-passthrough; with proxy-protocol: HTTP error; HTTPS error
  • With ssl-passthrough and proxy-protocol: HTTP error; HTTPS 200

Below is complete terminal output demonstrating this.

15:04:13 dxs@bravo ~ $ kind create cluster
Creating cluster "kind" ...
 ✓ Ensuring node image (kindest/node:v1.30.0) 🖼
 ✓ Preparing nodes 📦
 ✓ Writing configuration 📜
 ✓ Starting control-plane 🕹️
 ✓ Installing CNI 🔌
 ✓ Installing StorageClass 💾
Set kubectl context to "kind-kind"
You can now use your cluster with:

kubectl cluster-info --context kind-kind

Not sure what to do next? 😅  Check out https://kind.sigs.k8s.io/docs/user/quick-start/
15:04:31 dxs@bravo ~ $ kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/deploy/static/provider/baremetal/deploy.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/docs/examples/http-svc.yaml
namespace/ingress-nginx created
serviceaccount/ingress-nginx created
serviceaccount/ingress-nginx-admission created
role.rbac.authorization.k8s.io/ingress-nginx created
role.rbac.authorization.k8s.io/ingress-nginx-admission created
clusterrole.rbac.authorization.k8s.io/ingress-nginx created
clusterrole.rbac.authorization.k8s.io/ingress-nginx-admission created
rolebinding.rbac.authorization.k8s.io/ingress-nginx created
rolebinding.rbac.authorization.k8s.io/ingress-nginx-admission created
clusterrolebinding.rbac.authorization.k8s.io/ingress-nginx created
clusterrolebinding.rbac.authorization.k8s.io/ingress-nginx-admission created
configmap/ingress-nginx-controller created
service/ingress-nginx-controller created
service/ingress-nginx-controller-admission created
deployment.apps/ingress-nginx-controller created
job.batch/ingress-nginx-admission-create created
job.batch/ingress-nginx-admission-patch created
ingressclass.networking.k8s.io/nginx created
validatingwebhookconfiguration.admissionregistration.k8s.io/ingress-nginx-admission created
deployment.apps/http-svc created
service/http-svc created
15:05:16 dxs@bravo ~ $ true # pause here for controller to come up
15:06:37 dxs@bravo ~ $ kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: foo-bar
spec:
  ingressClassName: nginx
  rules:
  - host: foo.bar
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: http-svc
            port:
              number: 80
EOF
ingress.networking.k8s.io/foo-bar created
15:06:48 dxs@bravo ~ $ POD_NAME=$(k get pods -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx -o NAME|grep controller)
kubectl exec -it -n ingress-nginx $POD_NAME -- curl -IH 'Host: foo.bar' localhost
kubectl exec -it -n ingress-nginx $POD_NAME -- curl -kIH 'Host: foo.bar' https://localhost
HTTP/1.1 200 OK
Date: Fri, 17 May 2024 22:07:07 GMT
Content-Type: text/plain
Connection: keep-alive

HTTP/2 200
date: Fri, 17 May 2024 22:07:07 GMT
content-type: text/plain
strict-transport-security: max-age=31536000; includeSubDomains

15:07:07 dxs@bravo ~ $ kubectl get deployment ingress-nginx-controller -n ingress-nginx -o yaml | yq -Y '.spec.template.spec.containers[0].args += ["--enable-ssl-passthrough"]' | kubectl apply -f -
deployment.apps/ingress-nginx-controller configured
15:07:28 dxs@bravo ~ $ true # pause here for controller to come back up
15:08:08 dxs@bravo ~ $ POD_NAME=$(k get pods -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx -o NAME|grep controller)
kubectl exec -it -n ingress-nginx $POD_NAME -- curl -IH 'Host: foo.bar' localhost
kubectl exec -it -n ingress-nginx $POD_NAME -- curl -kIH 'Host: foo.bar' https://localhost
HTTP/1.1 200 OK
Date: Fri, 17 May 2024 22:08:18 GMT
Content-Type: text/plain
Connection: keep-alive

HTTP/2 200
date: Fri, 17 May 2024 22:08:18 GMT
content-type: text/plain
strict-transport-security: max-age=31536000; includeSubDomains

15:08:18 dxs@bravo ~ $ k -n ingress-nginx rollout undo deployment ingress-nginx-controller
deployment.apps/ingress-nginx-controller rolled back
15:09:54 dxs@bravo ~ $ true # pause here for controller to come back up
15:10:06 dxs@bravo ~ $ kubectl apply -f - <<EOF
apiVersion: v1
data:
  forwarded-for-header: proxy_protocol
  use-proxy-protocol: "true"
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/component: controller
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
    app.kubernetes.io/version: 1.10.1
  name: ingress-nginx-controller
  namespace: ingress-nginx
EOF
configmap/ingress-nginx-controller configured
15:10:28 dxs@bravo ~ $ POD_NAME=$(k get pods -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx -o NAME|grep controller)
kubectl exec -it -n ingress-nginx $POD_NAME -- curl -IH 'Host: foo.bar' localhost
kubectl exec -it -n ingress-nginx $POD_NAME -- curl -kIH 'Host: foo.bar' https://localhost
curl: (52) Empty reply from server
command terminated with exit code 52
curl: (35) Recv failure: Connection reset by peer
command terminated with exit code 35
15:10:38 dxs@bravo ~ 35 $ kubectl get deployment ingress-nginx-controller -n ingress-nginx -o yaml | yq -Y '.spec.template.spec.containers[0].args += ["--enable-ssl-passthrough"]' | kubectl apply -f -
deployment.apps/ingress-nginx-controller configured
15:10:50 dxs@bravo ~ $ POD_NAME=$(k get pods -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx -o NAME|grep controller)
kubectl exec -it -n ingress-nginx $POD_NAME -- curl -IH 'Host: foo.bar' localhost
kubectl exec -it -n ingress-nginx $POD_NAME -- curl -kIH 'Host: foo.bar' https://localhost
curl: (52) Empty reply from server
command terminated with exit code 52
HTTP/2 200
date: Fri, 17 May 2024 22:11:17 GMT
content-type: text/plain
strict-transport-security: max-age=31536000; includeSubDomains

15:11:17 dxs@bravo ~ $

This leads me to believe that you did not properly enable ssl-passthrough in the test where you received the error on HTTPS.

To be clear, this minimal test does not actually involve an Ingress that uses SSL passthrough - it simply involves the controller being configured to allow it. You are correct that there is no TLS section and no default certificate - this causes the controller to use the default fake certificate, as reflected in the following snippet from curl -v output:

* Server certificate:
*  subject: O=Acme Co; CN=Kubernetes Ingress Controller Fake Certificate
*  start date: May 17 22:10:51 2024 GMT
*  expire date: May 17 22:10:51 2025 GMT
*  issuer: O=Acme Co; CN=Kubernetes Ingress Controller Fake Certificate
*  SSL certificate verify result: self-signed certificate (18), continuing anyway.
*   Certificate level 0: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption

The very definition of proxy-protocol involves receiving information from a proxy

This is correct. When one enables SSL passthrough on the controller, the controller sets up a TCP proxy (in the controller itself, not in nginx) which listens on 443, intercepting all connections and making decisions about whether they should be routed to nginx (for ssl termination) or directly to the service (for ssl passthrough). This is the proxy that speaks the PROXY protocol to nginx, which is in this case listening on 442, not 443. The problem is that this interception happens only on 443, not on 80. So when we have proxy-protocol enabled, for HTTPS clients go to the TCP proxy which wraps the connection in proxy protocol before hitting nginx on 442, but for HTTP they go direct to nginx without proxy protocol on 80.
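The asymmetry described above can be restated as a toy model. This is a simplified Python sketch of the behavior as described in this thread (the port numbers 80/442/443 follow the comment; it is not the controller's actual Go implementation):

```python
# Toy model of the listener layout when --enable-ssl-passthrough is set:
# the controller's internal TCP proxy listens on :443 and prepends a PROXY
# protocol v1 header before forwarding to nginx on :442. HTTP on :80 goes
# straight to nginx with no header prepended.

def wrap_with_proxy_v1(payload: bytes, src: str, dst: str,
                       src_port: int, dst_port: int) -> bytes:
    header = f"PROXY TCP4 {src} {dst} {src_port} {dst_port}\r\n".encode()
    return header + payload

def bytes_seen_by_nginx(port: int, payload: bytes, client_ip: str) -> bytes:
    if port == 443:  # intercepted by the controller's TCP proxy
        return wrap_with_proxy_v1(payload, client_ip, "10.0.0.1", 55000, 443)
    return payload   # port 80: nginx gets the raw client bytes

# With use-proxy-protocol enabled, nginx expects every connection to begin
# with "PROXY ...". HTTPS satisfies that; HTTP does not:
https_bytes = bytes_seen_by_nginx(443, b"\x16\x03\x01...", "192.0.2.10")
http_bytes = bytes_seen_by_nginx(80, b"HEAD / HTTP/1.1\r\n", "192.0.2.10")
print(https_bytes.startswith(b"PROXY "))  # True  -> parses fine
print(http_bytes.startswith(b"PROXY "))   # False -> "broken header" error
```

The destination IP and ports in the sketch are made-up placeholders; only the 80-vs-443 split is the point.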

longwuyuan commented on June 14, 2024

Thanks.

  • I think I followed your reproduce steps too verbatim and did not put the proxy-protocol key:value pair in the configMap to start with.
  • I put the configMap key:value pairs in later on. This is my hindsight.
  • I will try to go over the steps again.
  • If you can make the reproduce steps in the issue description more precise and more specific, please do so. My intention here is not nitpicking but to get precise about your use case.
  • I don't know how I can get an LB to enable proxy-protocol on, for free, because MetalLB is incapable of using proxy-protocol.
  • So I am not sure how to test further. I will see what I can do while we wait for other comments.

Generally speaking, it seems we are trying to establish that with proxy-protocol enabled, if any user enables the --enable-ssl-passthrough flag on the controller executable, then HTTP requests fail.

But I am personally confused though. Below are the reasons.

I cannot understand how using the controller pod itself as a client is a valid test that proves what you say. This is because the error message clearly states that the header in the proxy-protocol is broken. All I can assume here is that the header is broken for the following reasons; an expert needs to verify this comment, though.

Other confusions for me are below.

  • A valid client is a SRC TCP/IP stack that is outside the cluster, not the controller pod itself
  • Any valid client-info, like real-ip, has to be a different SRC and can NOT be the controller pod's own IP address
  • Any valid client-info is expected to have traversed multiple network hops, but that is not happening in this test
  • Any valid client-info is expected to have been retained while travelling across network hops and NOT dropped, but that is not happening in this test
  • Any valid client-info is expected to have traversed at least one proxy before reaching the controller, but that is not happening in this test

My opinion of a valid test

Phase 1

  • Create a cluster on cloud like EKS or GCP or Azure
  • Make sure to enable proxy-protocol on the Cloud-LB and also in the ingress-controller configMap
  • Use nginx:alpine as image and create a deployment called app1
  • Create ingress for app1 with a TLS section so TLS connection terminates on controller
  • Use a valid non-selfsigned cert
  • Send a HTTP request to app1 and note the request and logs
  • Send a HTTPS request to app1 and note the response and logs
  • TLS section in ingress automatically causes redirection of HTTP to HTTPS by default. So I can guess results and I already know from history, how this goes

Phase 2

  • NOW create a deployment called app2
  • Use the image that we discuss in this issue-description ( kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/docs/examples/http-svc.yaml)
  • This app2 has a self-signed cert inside the pod
  • Terminate TLS for app2 pod and not on the ingress-nginx controller
  • Update the ingress-controller deployment and add the --enable-ssl-passthrough flag
  • Create an ingress for app2
  • Configure the ssl-passthrough annotation in ingress for app2
  • Send a HTTP request to app2 and note the request and logs
  • Send a HTTPS request to app2 and note the response and logs

Phase 3

  • Configure the ssl-redirect related annotations in ingress for app2
  • Send a HTTP request to app1 and app2 and note the request and logs
  • Send a HTTPS request to app1 and app2 and note the response and logs

I am not sure when I can test like this, or even if I can. However, these are my opinions, so let's wait for other comments.

longwuyuan commented on June 14, 2024

There is a big factor here: the test talks about ssl-passthrough, but there is no ingress using the ssl-passthrough annotation. To me, if ssl-passthrough is being tested or used, then the ingress has to be configured with the ssl-passthrough annotation. So this is another of the confusions I forgot to mention earlier.

longwuyuan commented on June 14, 2024

https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/annotations/#ssl-passthrough

virtualdxs commented on June 14, 2024

I don't know how I can get an LB to enable proxy-protocol on, for free, because MetalLB is incapable of using proxy-protocol

That is irrelevant to this. The use case that this issue is about does not involve an LB that speaks proxy protocol. As stated in my previous comment, the thing speaking proxy protocol is the controller itself, which listens on 443 and proxies to nginx listening on 442. Nginx does not listen on 443 when ssl-passthrough is enabled; the controller tcp proxy (which speaks proxy protocol) does.

Generally speaking, it seems we are trying to establish that with proxy-protocol enabled, if any user enables the --enable-ssl-passthrough flag on the controller executable, then HTTP requests fail.

Yes, but it is also important to note that the HTTPS request does not fail, indicating that the configuration to use proxy protocol is working.

I cannot understand how using the controller pod itself as a client is a valid test that proves what you say. This is because the error message clearly states that the header in the proxy-protocol is broken.

Correct, because when talking HTTP there is no proxy, whereas when talking HTTPS there is (the TCP proxy on 443). Hence why I want to either be able to enable proxy protocol on HTTPS without enabling it on HTTP, or to have another built-in TCP proxy on 80. The controller itself as a client is a valid test because in the use case there is nothing sitting in front of the controller other than a non-proxy-protocol LB (MetalLB) which has no bearing on this. To reiterate, in my environment, nothing is speaking proxy protocol outside of the nginx ingress controller pod.

Other confusions for me are below.

Most of these seem to have no bearing on the result. This should be reproducible with a nodeport service. As far as client-info, that is not something I am having an issue with here.

Any valid client-info is expected to have traversed at least one proxy before reaching the controller, but that is not happening in this test

This is the exception to the above statement. The only proxy involved here is the built-in TCP proxy. I do not want to deploy another proxy in front of my ingress controller

My opinion of a valid test

The test that you describe here is invalid by nature of having an LB that speaks proxy protocol. This issue does not involve an LB that speaks proxy protocol, and introducing one creates a very different use case.

Reproduction instructions should be minimal, and so I did not include any of the complexity of my actual environment that was not required to reproduce the exact same issue. In my actual environment, the network path looks something like this:

  • Client
  • Client network
  • Router port-forwarding 443 (does not speak proxy protocol)
  • Cluster external network
  • MetalLB (does not speak proxy protocol)
  • Cluster internal network
  • Nginx Ingress controller TCP proxy (inside ingress controller pod, speaks proxy protocol)
  • Nginx (inside ingress controller pod)
  • Cluster internal network
  • Backend

Or for HTTP:

  • Client
  • Client network
  • Router port-forwarding 80 (does not speak proxy protocol)
  • Cluster external network
  • MetalLB (does not speak proxy protocol)
  • Cluster internal network
  • Nginx (inside ingress controller pod)
  • Cluster internal network
  • Backend

The exact same issue appears in the kind environment which I wrote reproduction instructions for, in the test case I provided, even though the path is much more minimal:

  • Client
  • Nginx Ingress controller TCP proxy (inside ingress controller pod, speaks proxy protocol)
  • Nginx (inside ingress controller pod)
  • Cluster internal network
  • Backend

Or for HTTP:

  • Client
  • Nginx (inside ingress controller pod)
  • Cluster internal network
  • Backend

Similarly, this is the reason I did not add an ingress using the ssl-passthrough annotation. The factor that introduces the bug is not whether an ssl-passthrough ingress is present, but whether --enable-ssl-passthrough is passed to the controller. If it is not, then nginx listens on 443 and no TCP proxy is started. If it is, then nginx listens on 442 and a TCP proxy is started on 443. See here and here. Feel free to perform my reproduction instructions with any additional ingress resources, they should have no effect on the reproducibility as long as --enable-ssl-passthrough is set and use-proxy-protocol is true.

longwuyuan commented on June 14, 2024

Thanks for all the comments.

  • I am clear now on what kind of miscommunication we have and where. I think I can understand this issue and get on the same page with you only if I converse with you on a video call. If there is any chance we can get on a meet.jit.si session, please send me a link on Slack.

  • From your comments, it seems you are pretty sure that the implementation of the ssl-passthrough feature is broken. But we don't have multiple users reporting this. So that is stopping me from accepting this triage. This means a real expert or a developer of the ingress-nginx controller has to comment and advise here.

  • I completely agree with you that, to demo a break in the ssl-passthrough implementation, you find it adequate to just use the controller pod itself as the client for the HTTP request. However, the revelation now appears to be that you are not referring to HTTP being broken for non-ssl-passthrough ingresses; it seems you are referring to HTTP being broken for requests sent to the backend that is the destination for ssl-passthrough. Is this correct, or did I misunderstand?

  • If the ssl-passthrough feature is broken, then all real-world users of the ingress-nginx controller who have configured both features, viz. the ssl-passthrough annotation as well as the use-proxy-protocol configMap key, in their ingress-nginx installation, will report that all their HTTP requests are failing. That is not happening, and it's confusing me.

  • So, from what is contained in this issue's posts, I hope you agree that anybody can experience the HTTP error and reproduce the bug if they use both features (ssl-passthrough & proxy-protocol) on an EKS/GCP/Azure/other cloud cluster, just as well.

  • Lastly, I think I don't have enough knowledge to understand this issue. A test that enables the --enable-ssl-passthrough flag in the controller executable's args spec, but does not use the ssl-passthrough annotation in any ingress, and yet asserts that the test result shows a bug in the controller, is beyond my current comprehension. That, coupled with other users of the same combo of features not reporting broken HTTP, is baffling me, and I am unable to wrap my head around this.

I will wait for comments from experts.

The project ingress-nginx community meetings schedule is here https://github.com/kubernetes/community/tree/master/sig-network . My request is that you kindly join a community-meeting (after the 27th of May, 2024) and explain this issue because the maintainers are on those meetings and they can advise what to do next.

virtualdxs commented on June 14, 2024

appears that you are not referring to HTTP being broken for non-ssl-passthrough ingresses. It seems you are referring to HTTP being broken for requests sent to the backend that is the destination for ssl-passthrough. Is this correct, or did I misunderstand?

I think you're misunderstanding still; the issue is that HTTP requests to non-ssl-passthrough ingresses fail if ssl-passthrough is enabled on the controller (regardless of whether there exists an ssl passthrough ingress).

If it helps clarify, the way I discovered the bug is as follows:

  • I had set up multiple standard (non-ssl-passthrough) Ingress resources with cert-manager to handle Let's Encrypt certs via HTTP-01, which worked well
  • I started exploring vCluster, and in particular enabled the option that creates an ssl passthrough ingress for the vCluster k8s API
  • In order for that to work I needed to enable SSL passthrough on the controller, which I did, and those ssl-passthrough ingresses started working
  • I noticed my IP restrictions started blocking everything (on HTTPS at least; I had not checked HTTP), which led me to discover the flaw that, with ssl-passthrough enabled in its default configuration, all traffic shows as coming from 127.0.0.1. That is resolved by enabling use-proxy-protocol, per the issue comment I linked earlier. I did so, and the client IPs for HTTPS were showing correctly again. (This is where HTTP broke, but I did not realize it at the time.)
  • Later I added another standard (non-ssl-passthrough) ingress for an unrelated service
  • I noticed that it was failing to obtain a TLS certificate via cert-manager
  • After some troubleshooting, I realized that HTTPS for all the apps was working well, but HTTP was failing for all apps, causing the Let's Encrypt HTTP-01 challenge to fail
  • I noticed the "proxy protocol: bad header" messages in the log, which led to me digging into the ssl-passthrough implementation and opening this bug report

If the ssl-passthrough feature is broken, then all real-world users of the ingress-nginx controller who have configured both features, viz. the ssl-passthrough annotation as well as the use-proxy-protocol configMap key, in their ingress-nginx installation, will report that all their HTTP requests are failing. That is not happening, and it's confusing me.

This is confusing me as well, but I think it's simply that this is a relatively uncommon use case, as it requires all the following factors:

  • ingress-nginx
  • Need for ssl-passthrough
  • Need for client IPs to be correct
  • Need for HTTP to function
  • No proxy-protocol-aware LB in front of nginx (this would work around the issue as the TCP proxy supports proxy protocol on both ends)

And given that there is no documentation on the combination of ssl-passthrough and use-proxy-protocol outside of a GitHub issue comment, that also seems to indicate that this is a rare edge case.

So, from what is contained in this issue's posts, I hope you agree that anybody can experience the HTTP error and reproduce the bug if they use both features (ssl-passthrough & proxy-protocol) on an EKS/GCP/Azure/other cloud cluster, just as well.

Yes, with the exception of if there is a proxy-protocol-aware LB in front of ingress-nginx, as that would mask the issue.

Lastly, I think I don't have enough knowledge to understand this issue. A test that enables the --enable-ssl-passthrough flag in the controller executable's args spec, but does not use the ssl-passthrough annotation in any ingress, and yet asserts that the test result shows a bug in the controller, is beyond my current comprehension.

So the issue is that setting --enable-ssl-passthrough fundamentally changes how the controller is set up, particularly the data path. Without the flag, the service points directly at nginx for both HTTP and HTTPS. With the flag, the HTTP path remains the same, but the HTTPS path is changed to add the TCP proxy between the service and nginx. This change in the data path is the cause of the issue, hence the lack of a need to create an ssl-passthrough ingress.
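The data-path change described in the paragraph above can be sketched as a toy routing model. This is my restatement of the comment for illustration (assumed port labels per this thread, not controller source code):

```python
# Toy model of how --enable-ssl-passthrough changes the data path inside
# the controller pod, per the explanation in this thread.

def data_path(port: int, ssl_passthrough_enabled: bool):
    """Return the ordered hops a connection traverses inside the pod."""
    if port == 443 and ssl_passthrough_enabled:
        # The TCP proxy intercepts :443; nginx moves to :442 behind it and
        # receives the connection wrapped in a PROXY protocol header.
        return ["service", "tcp-proxy:443 (adds PROXY header)", "nginx:442"]
    # Without the flag, or on port 80, the service points straight at nginx.
    return ["service", f"nginx:{port} (no PROXY header)"]

# HTTPS gains the PROXY-speaking hop; HTTP keeps the direct path. This is
# why use-proxy-protocol then breaks port 80 but not port 443:
print(data_path(443, True))
print(data_path(80, True))
print(data_path(443, False))
```

In this model, setting use-proxy-protocol makes nginx demand a PROXY header on every listener, but only the 443 path has a component that supplies one.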

I will message you on Slack to set up a Jitsi call.

longwuyuan commented on June 14, 2024

Hi @virtualdxs, I saw your ping on Slack and responded. Can you please check?

longwuyuan commented on June 14, 2024

Your latest update above is informative. I will need to get precise now and start nitpicking to make progress.

These are the factors you listed ;

  1. ingress-nginx-controller
  2. Need for ssl-passthrough
  3. Need for client IPs to be correct
  4. Need for HTTP to function
  5. No proxy-protocol-aware LB in front of ingress-nginx-controller (this would work around the issue as the TCP proxy supports proxy protocol on both ends)

Do you acknowledge that bullet points 3 and 5 above are interrelated and have a unique standing?

In this list of factors, is proxy-protocol on the ingress-nginx-controller implied to be enabled or not?

  • If proxy-protocol is enabled in the controller ConfigMap, then it's moot and invalid, because there is no proxy-protocol-aware LB in front of it
  • If proxy-protocol is NOT enabled in the controller ConfigMap, then HTTP and HTTPS work as expected for both ssl-passthrough and NON-ssl-passthrough Ingresses. I tested this on a cluster with Metallb installed

Cluster and controller state below

image

HTTP request to a NON-ssl-passthrough ingress shows response code 200 below

% k describe ing test0 
Name:             test0
Labels:           <none>
Namespace:        default
Address:          172.19.0.2
Ingress Class:    nginx
Default backend:  <default>
Rules:
  Host                Path  Backends
  ----                ----  --------
  test0.mydomain.com  
                      /   test0:80 (10.244.0.43:80)
Annotations:          <none>
Events:
  Type    Reason  Age                From                      Message
  ----    ------  ----               ----                      -------
  Normal  Sync    33m (x2 over 33m)  nginx-ingress-controller  Scheduled for sync
[~] 

image

HTTPS request to a NON-ssl-passthrough ingress shows response code 200 below

% k -n observability describe ing grafana 
Name:             grafana
Labels:           <none>
Namespace:        observability
Address:          172.19.0.2
Ingress Class:    nginx
Default backend:  <default>
TLS:
  SNI routes grafana.dev.enjoydevops.com
Rules:
  Host                         Path  Backends
  ----                         ----  --------
  grafana.dev.enjoydevops.com  
                               /   prometheusgrafana:80 (10.244.0.35:3000)
Annotations:                   <none>
Events:                        <none>

image

from ingress-nginx.

virtualdxs avatar virtualdxs commented on June 14, 2024

Do you acknowledge that bullet point 3 and point 5 above are inter-related and have a unique standing ?

You are absolutely correct, and my sincere apologies for neglecting to mention this in the initial post. I do in fact have my environment configured with externalTrafficPolicy=Local. So I disagree that "Client-IP can NOT be correct if there is no proxy-protocol-enabled-LB in front of the ingress-nginx-controller," because it is possible with externalTrafficPolicy=Local.
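For reference, the setting mentioned here lives on the controller's Service. A minimal sketch, assuming the default `ingress-nginx` namespace and Service name (hypothetical for any given cluster):

```shell
# Sketch: with externalTrafficPolicy: Local, kube-proxy only routes
# external traffic to pods on the node that received it and does not
# SNAT, so the client source IP is preserved without PROXY protocol.
kubectl -n ingress-nginx patch service ingress-nginx-controller \
  -p '{"spec":{"externalTrafficPolicy":"Local"}}'
```

The trade-off is that nodes without a controller pod drop the traffic, which is why this setting is usually paired with a health-check-aware load balancer.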

Is the proxy-protocol enabled on the ingress-nginx-controller, implied or not implied ?

Yes, use-proxy-protocol is enabled, because that is required for nginx to get the correct IP from the controller's built-in TCP proxy, which speaks proxy-protocol.

then its moot and invalid because there is no proxy-protocol-aware LB in front of it

No, because the built-in TCP proxy is in front of it and speaks proxy-protocol.

from ingress-nginx.

longwuyuan avatar longwuyuan commented on June 14, 2024

Some progress on my testing.

  • I have luckily reproduced the problem you have been talking about since the beginning. See below screenshot of the test

image

  • AFAIK, we do not support the use-proxy-protocol feature in the controller behind a Layer 4 LB if proxy-protocol is not enabled on that LB

  • Please join the community meeting as per info I mentioned earlier so that you can talk to the maintainers about this

  • You can try NOT using proxy-protocol for client IPs and instead set externalTrafficPolicy: Local on the controller's Service to get the real client IP

from ingress-nginx.

longwuyuan avatar longwuyuan commented on June 14, 2024

I verified that as soon as I remove proxy-protocol from the controller's configMap, the same HTTP request that failed in the previous post now succeeds with response code 200

image

from ingress-nginx.

virtualdxs avatar virtualdxs commented on June 14, 2024

AFAIK, we do not support use-proxy-protocol feature in the controller, behind a Layer4 LB, if proxy-protocol is not enabled on that LB
You can try NOT to use the proxy-protocol for client-ip and use the externalTrafficPolicy: Local in the controller's service, to get the real-client-ip

use-proxy-protocol is necessary because without it, the built-in TCP proxy masks the client IP for HTTPS connections. I already use externalTrafficPolicy: Local in my environment, and have done so since before I needed ssl-passthrough.

from ingress-nginx.

longwuyuan avatar longwuyuan commented on June 14, 2024

After a long discussion and triaging over DM on Slack, the summary of the issue is seen in the screenshot below

  • Use a on-premises cluster or minikube with Metallb or Kind with Metallb
  • Enable ssl-passthrough
  • Enable use-proxy-protocol
  • HTTPS works
  • HTTP breaks with nginx's "broken header" error while reading the PROXY protocol

image
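The summarized failure follows directly from what nginx expects once use-proxy-protocol is on: the first bytes of every accepted connection must be a PROXY protocol preamble. The internal TCP proxy prepends one on the HTTPS path, nothing prepends one on the HTTP path, so a plain GET trips nginx's parser. A sketch of a v1 preamble (addresses and ports are illustrative):

```shell
# PROXY protocol v1 preamble: "PROXY", transport, client IP, proxy IP,
# client port, destination port, terminated by CRLF. nginx reads this
# before the HTTP request line when use-proxy-protocol is enabled.
printf 'PROXY TCP4 203.0.113.7 10.0.0.2 51234 443\r\n'
```

As a side note, `curl --haproxy-protocol` (curl 7.60+) sends such a preamble itself, which is one way to confirm that the plain-HTTP failure is only the missing preamble.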

The observation is that the project does not support use-proxy-protocol without proxy-protocol also being enabled on an L4 LB in front of the controller.

But the user expects the real client IP on an on-premises cluster, along with ssl-passthrough enabled on the controller for some Ingresses.

from ingress-nginx.

longwuyuan avatar longwuyuan commented on June 14, 2024

/triage-accepted

/remove-triage needs-information

from ingress-nginx.

longwuyuan avatar longwuyuan commented on June 14, 2024

/triage accepted

from ingress-nginx.

Gacko avatar Gacko commented on June 14, 2024

/assign

from ingress-nginx.
