
Comments (11)

stepro commented on August 27, 2024

Thanks for reporting this issue. Your analysis of the issue is correct and identifies a bug in the mindaro-proxy component. Do you have a Helm chart or raw Kubernetes yaml files you are using to install the specific redis cluster setup? This would help us greatly in recreating the problem on our side. Thanks!

from dev-spaces.

antogh commented on August 27, 2024

Hi @stepro, thanks. I'm writing a quick answer right before entering... a meeting :(
You just need the stateful set to recreate the problem. If redis-0 isn't able to set a key to a value, you have the problem. The problem shows up the second time you restart the stateful set.

Here is the manifest (as JSON) to create the redis stateful set:

{
"kind": "StatefulSet",
"apiVersion": "apps/v1beta2",
"metadata": {
"name": "redis",
"namespace": "default",
"selfLink": "/apis/apps/v1beta2/namespaces/default/statefulsets/redis",
"uid": "23c3d74d-d7a1-11e8-a77b-ae2b0ed1f96f",
"resourceVersion": "3214234",
"generation": 1,
"creationTimestamp": "2018-10-24T15:26:12Z",
"labels": {
"app": "redis"
},
"annotations": {
"kubectl.kubernetes.io/last-applied-configuration": "{"apiVersion":"apps/v1beta1","kind":"StatefulSet","metadata":{"annotations":{},"name":"redis","namespace":"default"},"spec":{"replicas":3,"serviceName":"redis","template":{"metadata":{"labels":{"app":"redis"}},"spec":{"containers":[{"command":["sh","-c","source /redis-config/init.sh"],"image":"redis:4.0.11-alpine","name":"redis","ports":[{"containerPort":6379,"name":"redis"}],"volumeMounts":[{"mountPath":"/redis-config","name":"config"},{"mountPath":"/redis-data","name":"data"}]},{"command":["sh","-c","source /redis-config-src/sentinel.sh"],"image":"redis:4.0.11-alpine","name":"sentinel","volumeMounts":[{"mountPath":"/redis-config-src","name":"config"},{"mountPath":"/redis-config","name":"data"}]}],"volumes":[{"configMap":{"defaultMode":420,"name":"redis-config"},"name":"config"},{"emptyDir":null,"name":"data"}]}}}}\n"
}
},
"spec": {
"replicas": 3,
"selector": {
"matchLabels": {
"app": "redis"
}
},
"template": {
"metadata": {
"creationTimestamp": null,
"labels": {
"app": "redis"
}
},
"spec": {
"volumes": [
{
"name": "config",
"configMap": {
"name": "redis-config",
"defaultMode": 420
}
},
{
"name": "data",
"emptyDir": {}
}
],
"containers": [
{
"name": "redis",
"image": "redis:4.0.11-alpine",
"command": [
"sh",
"-c",
"source /redis-config/init.sh"
],
"ports": [
{
"name": "redis",
"containerPort": 6379,
"protocol": "TCP"
}
],
"resources": {},
"volumeMounts": [
{
"name": "config",
"mountPath": "/redis-config"
},
{
"name": "data",
"mountPath": "/redis-data"
}
],
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "File",
"imagePullPolicy": "IfNotPresent"
},
{
"name": "sentinel",
"image": "redis:4.0.11-alpine",
"command": [
"sh",
"-c",
"source /redis-config-src/sentinel.sh"
],
"resources": {},
"volumeMounts": [
{
"name": "config",
"mountPath": "/redis-config-src"
},
{
"name": "data",
"mountPath": "/redis-config"
}
],
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "File",
"imagePullPolicy": "IfNotPresent"
}
],
"restartPolicy": "Always",
"terminationGracePeriodSeconds": 30,
"dnsPolicy": "ClusterFirst",
"securityContext": {},
"schedulerName": "default-scheduler"
}
},
"serviceName": "redis",
"podManagementPolicy": "OrderedReady",
"updateStrategy": {
"type": "OnDelete"
},
"revisionHistoryLimit": 10
},
"status": {
"observedGeneration": 1,
"replicas": 3,
"readyReplicas": 3,
"currentReplicas": 3,
"currentRevision": "redis-5bd6f7877b",
"updateRevision": "redis-5bd6f7877b",
"collisionCount": 0
}
}

And here is the config map:

{
"kind": "ConfigMap",
"apiVersion": "v1",
"metadata": {
"name": "redis-config",
"namespace": "default",
"selfLink": "/api/v1/namespaces/default/configmaps/redis-config",
"uid": "e147303d-cbd5-11e8-9b5c-6e6eccc149a1",
"resourceVersion": "1625941",
"creationTimestamp": "2018-10-09T15:13:30Z",
"annotations": {
"kubectl.kubernetes.io/last-applied-configuration": "{"apiVersion":"v1","data":{"init.sh":"#!/bin/bash\nif [[ ${HOSTNAME} == 'redis-0' ]]\nthen\n redis-server /redis-config/master.conf\nelse\n redis-server /redis-config/slave.conf\nfi","master.conf":"bind 0.0.0.0\nport 6379\n\ndir /redis-data","sentinel.conf":"bind 0.0.0.0\nport 26379\n\nsentinel monitor redis redis-0.redis 6379 2\nsentinel parallel-syncs redis 1\nsentinel down-after-milliseconds redis 10000\nsentinel failover-timeout redis 20000","sentinel.sh":"#!/bin/bash\ncp /redis-config-src/. /redis-config\nwhile ! ping -c 1 redis-0.redis; do\n echo 'Waiting for server'\n sleep 1\ndone\n\nredis-sentinel /redis-config/sentinel.conf","slave.conf":"bind 0.0.0.0\nport 6379\n\ndir .\n\nslaveof redis-0.redis 6379"},"kind":"ConfigMap","metadata":{"annotations":{},"creationTimestamp":null,"name":"redis-config","namespace":"default"}}\n"
}
},
"data": {
"init.sh": "#!/bin/bash\nif [[ ${HOSTNAME} == 'redis-0' ]]\nthen\n redis-server /redis-config/master.conf\nelse\n redis-server /redis-config/slave.conf\nfi",
"master.conf": "bind 0.0.0.0\nport 6379\n\ndir /redis-data",
"sentinel.conf": "bind 0.0.0.0\nport 26379\n\nsentinel monitor redis redis-0.redis 6379 2\nsentinel parallel-syncs redis 1\nsentinel down-after-milliseconds redis 10000\nsentinel failover-timeout redis 20000",
"sentinel.sh": "#!/bin/bash\ncp /redis-config-src/. /redis-config\nwhile ! ping -c 1 redis-0.redis; do\n echo 'Waiting for server'\n sleep 1\ndone\n\nredis-sentinel /redis-config/sentinel.conf",
"slave.conf": "bind 0.0.0.0\nport 6379\n\ndir .\n\nslaveof redis-0.redis 6379"
}
}


stepro commented on August 27, 2024

Thanks, I'll take a look.


stepro commented on August 27, 2024

Thanks @antogh for your patience on this issue. I needed to create a headless service object and fix a problem in the sentinel.sh script (the cp /redis-config-src/. /redis-config command didn't work; it needed to be cp /redis-config-src/* /redis-config) to get to the point where I could reproduce the issue.

Unfortunately, the issue here is a general problem that occurs when injecting any kind of intercepting proxy such as for dev spaces or other solutions like istio. I believe the specific problem is that when an intended slave (e.g. redis-1) connects to its master (e.g. redis-0) to register itself as a slave, the master uses the getpeername() API to determine the IP and port of the slave. When this is done through an intercepting proxy, this IP and port always end up being the master's IP and port, and the master then ends up turning itself into a slave. This causes the whole system to be stuck in the initialization phase.
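
As a rough illustration of the getpeername() behaviour described above, here is a minimal Python sketch. It is not Redis or mindaro-proxy code: the `start_server` and `run_demo` helpers, the loopback addresses, and the one-message relay are all assumptions made for the example. It only shows that a server sitting behind a forwarding proxy sees the proxy's outbound socket as its peer, not the original client.

```python
import socket
import threading


def start_server(handler):
    """Listen on an ephemeral loopback port; run handler(conn) for one client."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    address = listener.getsockname()

    def serve():
        conn, _ = listener.accept()
        with conn:
            handler(conn)
        listener.close()

    thread = threading.Thread(target=serve, daemon=True)
    thread.start()
    return address, thread


def run_demo():
    seen = {}

    # Stand-in for the master: it records who it thinks connected.
    def master(conn):
        seen["peer_seen_by_master"] = conn.getpeername()
        conn.sendall(b"OK")

    master_addr, master_thread = start_server(master)

    # Stand-in for an intercepting proxy: it opens its own connection
    # to the master and relays a single reply back to the client.
    def proxy(conn):
        with socket.create_connection(master_addr) as upstream:
            seen["proxy_outbound"] = upstream.getsockname()
            conn.sendall(upstream.recv(16))

    proxy_addr, proxy_thread = start_server(proxy)

    # Stand-in for the replica: it only ever talks to the proxy.
    with socket.create_connection(proxy_addr) as replica:
        seen["replica"] = replica.getsockname()
        reply = replica.recv(16)

    master_thread.join()
    proxy_thread.join()
    return seen, reply


if __name__ == "__main__":
    seen, reply = run_demo()
    for name, address in seen.items():
        print(name, address)
```

Running it prints three addresses. The peer recorded by the master matches the proxy's outbound socket, so the replica's own address never reaches the master. With a sidecar that runs inside the master's pod, that outbound address would presumably belong to the master's own pod, which matches the symptom described above.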

The closest related issue I could find was this one for istio, where you'll notice the attached yaml files already disable the istio sidecar from the master and slave pods using a special istio annotation. The Helm chart did not generate these annotations so I'm not sure how it was determined that istio needed to be disabled for these pods. The actual issue here looks to be some problem with istio still getting in the way when it was told to get out of the way.

For dev spaces, we do not currently have a mechanism for a pod to opt out of being instrumented for dev spaces with the sidecar proxy. Your best option would be to run the redis cache in a different Kubernetes namespace that has not been upgraded to a dev space. We will look into providing a label or annotation similar to istio that will allow you to opt out of the sidecar proxy for specific pods.


antogh commented on August 27, 2024

Thanks @stepro
I read your message with interest; it makes total sense and corresponds to what I found out.

I tried some hacks to have the redis pod opt out of the mindaro-proxy. Unfortunately, Kubernetes does not allow removing a container from a pod by updating its yaml, so I tried changing the image name to a neutral "alpine" image. It worked for some time (the redis log showed a successful initialization), but then the AKS agent noticed that the hash for the mindaro-proxy container had changed and restarted the whole pod, causing an infinite crash loop :(

In the end I came to the same conclusion you suggested: placing redis pods into a different namespace not affected by dev spaces. Redis works fine again now.

But unfortunately the problems never end. Now VS no longer debugs with Azure Dev Spaces. It worked fine the first time I tried, but it doesn't work anymore. I completely removed redis and the new namespace, but the problem persists. VS is able to create the service and deployment on the cluster but then fails (after 10 minutes of silence) to create the pod with the actual application that would be port-forwarded to my local machine. It seems to be a communication problem. VS can create the container locally without a problem, so it's not a local Docker issue; it can't send the container image to the pod in the cluster.

Do you have any idea what it could be? Remote debugging inside Kubernetes is really valuable for speeding up development, and I'd really like to use this feature.

BTW I opened another issue here about this problem.

Thanks again


stepro commented on August 27, 2024

I just discovered this article and will be investigating if there is anything we can do to make this scenario work.

Thanks for opening the other issue - someone on the team familiar with these connectivity issues will be able to help you.


antogh commented on August 27, 2024

@stepro
Interesting article.

However, after one day of pain, I'm very happy with the setup I have now; it works like a charm.
I have created a dedicated namespace for dev spaces and installed it there. Now it can no longer interfere with other pods. In the meantime, the communication problems I mentioned yesterday have been resolved (it seems some maintenance was going on in the West Europe region, where my AKS cluster is), and I can debug flawlessly from my custom namespace and interact from my app under debugging with all the other pods in other namespaces.

Allow me to give you a suggestion: I would write a disclaimer in the dev spaces doc here:
https://docs.microsoft.com/en-us/azure/dev-spaces/get-started-netcore-visualstudio
https://docs.microsoft.com/en-us/azure/dev-spaces/troubleshooting

Something like:
This service is still in preview and we are continuously working on improvements. At the current stage, the proxy agent that allows remote debugging might interfere with some other pods in the same namespace (we know this happens with redis master/slave stateful sets). If you encounter this problem, please install az dev spaces in a dedicated namespace, separate from the other pods.


lisaguthrie commented on August 27, 2024

Thanks @antogh - I've submitted a request to get this added to our troubleshooting documentation.


AceHack commented on August 27, 2024

Please add an annotation to disable the sidecar proxy as soon as possible; that would be a great feature.


stepro commented on August 27, 2024

We've checked in the ability to disable it, and it should be available in a couple of weeks.


YuzorMa commented on August 27, 2024

This should be fixed in the latest versions of Dev Spaces. Please let us know if you continue to see issues.

