Comments (16)

mcastelino commented on July 24, 2024

/cc @egernst

from cloud-native-setup.

krsna1729 commented on July 24, 2024

Another data point that this is not a kata-deploy issue: check out the v1.2 tag and apply the rsync hotfix, and you get the same behavior.

Every 2.0s: kubectl get po --all-namespaces -owide                                                                                                              clr-01: Thu Mar 14 15:56:44 2019

NAMESPACE     NAME                              READY   STATUS         RESTARTS   AGE     IP                NODE     NOMINATED NODE   READINESS GATES
kube-system   canal-c8zkf                       2/3     Running        0          40s     192.168.121.30    clr-01   <none>           <none>
kube-system   canal-j6hgs                       0/3     ErrImagePull   0          40s     192.168.121.215   clr-02   <none>           <none>
kube-system   canal-wxlf9                       0/3     ErrImagePull   0          40s     192.168.121.137   clr-03   <none>           <none>
kube-system   coredns-86c58d9df4-8zrnw          1/1     Running        0          9m32s   10.244.0.3        clr-01   <none>           <none>
kube-system   coredns-86c58d9df4-cwr4n          1/1     Running        0          9m32s   10.244.0.2        clr-01   <none>           <none>
kube-system   etcd-clr-01                       1/1     Running        0          8m35s   192.168.121.30    clr-01   <none>           <none>
kube-system   kube-apiserver-clr-01             1/1     Running        0          8m38s   192.168.121.30    clr-01   <none>           <none>
kube-system   kube-controller-manager-clr-01    1/1     Running        0          8m41s   192.168.121.30    clr-01   <none>           <none>
kube-system   kube-proxy-44v49                  1/1     Running        0          55s     192.168.121.215   clr-02   <none>           <none>
kube-system   kube-proxy-dnn28                  1/1     Running        0          9m32s   192.168.121.30    clr-01   <none>           <none>
kube-system   kube-proxy-qrmn7                  1/1     Running        0          44s     192.168.121.137   clr-03   <none>           <none>
kube-system   kube-scheduler-clr-01             1/1     Running        0          8m33s   192.168.121.30    clr-01   <none>           <none>
kube-system   metrics-server-5465bdb788-snf2d   0/1     Pending        0          40s     <none>            <none>   <none>           <none>

mcastelino commented on July 24, 2024

@krsna1729 so this lockup happens even w/o kata-deploy, which means crio is stuck on the download for some reason. Can you paste the kubectl get po output for the pods that are stuck?
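
One way to pull just the stuck pods out of a listing like the one above is to filter on the READY column. This is only a minimal sketch: it assumes the column order shown in this thread (NAMESPACE, NAME, READY, STATUS, ...), and stuck_pods is a hypothetical helper name, not something kubectl provides.

```shell
#!/bin/sh
# stuck_pods: read `kubectl get po --all-namespaces -owide` output on stdin
# and print "namespace/name status" for every pod whose READY count is short.
# Assumption: column 3 is READY (e.g. "2/3") and column 4 is STATUS.
stuck_pods() {
    awk '$3 ~ /^[0-9]+\/[0-9]+$/ {       # skips the header and any `watch` banner
        split($3, ready, "/")            # ready[1] = containers ready, ready[2] = total
        if (ready[1] != ready[2])        # fewer containers ready than expected
            print $1 "/" $2, $4
    }'
}

# Usage against a live cluster (needs kubectl and cluster access):
#   kubectl get po --all-namespaces -owide | stuck_pods
```

Fed the listing above, this would report the three canal pods and the pending metrics-server pod, which are the ones worth running kubectl describe on.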

AntonioMeireles commented on July 24, 2024

kubectl describe pod -n kube-system canal-j6hgs output please

mcastelino commented on July 24, 2024

@krsna1729 after you get the output of kubectl describe pod -n kube-system canal-j6hgs, also restart crio in place and see if it recovers. If it is an upstream issue, we can report it once we can reproduce it consistently.

krsna1729 commented on July 24, 2024

@mcastelino crio journalctl
http://paste.ubuntu.com/p/WcZm7QThzw/

mcastelino commented on July 24, 2024

@krsna1729 is this before the restart?

mcastelino commented on July 24, 2024

@krsna1729 @ganeshmaharaj. This one seems to be more stable

vagrant box list
AntonioMeireles/ClearLinux (libvirt, 28230)

Kernel: 4.19.27-436.lts

AntonioMeireles commented on July 24, 2024

@mcastelino so it seems the latest cri-o bump behaves for updates (of already running stuff) but not for fresh installs :/ ... (still trying to repro locally with my c8s stack)

ganeshmaharaj commented on July 24, 2024

@krsna1729 sharing your panic here.
https://gist.githubusercontent.com/krsna1729/fb1ef568051e578cd00e4a6dd2581855/raw/1b478fa1d61c204074c04df3acce0b09dd66e643/gistfile1.txt

ganeshmaharaj commented on July 24, 2024

Logs from my setup too, in case they help.
kubectl describe logs -> https://hastebin.com/amuxavojec.coffeescript

CL Vagrant box: 28300
Kernel: 4.19.28-10.lts2018
Crio:

clear@clr-03 ~ $ crio --version
crio version 1.13.1
commit: ""

ganeshmaharaj commented on July 24, 2024

This setup fails consistently. cc @AntonioMeireles

mcastelino commented on July 24, 2024

/cc @bryteise the latest kernel seems to be the issue. We do not see the problem with the same crio version on the older kernel.

Kernel: 4.19.28-10.lts2018 -> crio panics
Kernel: 4.19.27-436.lts -> crio does not panic

bryteise commented on July 24, 2024

We just updated to 4.19.29 so this might be fixed in release 28310. Otherwise we will need to look through kernel patches.

krsna1729 commented on July 24, 2024

At the moment the issue is narrowed down to the devicemapper and crio setup we are doing in setup_kata_firecracker.sh. If we do not provision with that script in the Vagrantfile, everything comes up fine.

ahsan518 commented on July 24, 2024

Verified: this is no longer valid. Crio does not panic anymore.
