
superedge's Introduction

English | 简体中文

SuperEdge



👋 SuperEdge v0.9.0 now supports the Kins [English, 中文] feature, which provisions a lightweight K3s cluster on an edge NodeUnit that can run completely offline. Welcome to check out and try edgeadm.


What is SuperEdge?

SuperEdge is an open source container management system for edge computing that manages compute resources and container applications across multiple edge regions. These resources and applications are managed as a single Kubernetes cluster. A native Kubernetes cluster can easily be converted into a SuperEdge cluster.

SuperEdge has the following characteristics:

  • Kubernetes-native: SuperEdge extends the powerful container orchestration and scheduling capabilities of Kubernetes to the edge. It makes nonintrusive enhancements to Kubernetes and is fully compatible with all Kubernetes APIs and resources. Kubernetes users can leverage SuperEdge easily for edge environments with minimal learning.

  • Edge autonomy: SuperEdge now provides up to L4/L5 edge autonomy.

    L3: When the network connection between the edge and the cloud is unstable, or the edge node is offline, the node can still work independently. However, in this state the edge node cannot perform write operations such as create/delete/update.

    L4/L5: SuperEdge v0.9.0 introduces the Kins [English, 中文] feature. Kins provisions a lightweight K3s cluster on top of SuperEdge that can be operated completely offline. L4 autonomy uses a single-master edge K3s cluster, while L5 provisions a three-master edge K3s cluster.

  • Distributed node health monitoring: SuperEdge provides edge-side health monitoring capabilities. It continuously monitors the processes on the edge side and collects health information for faster and more accurate problem discovery and reporting. In addition, its distributed design provides multi-region monitoring and management.

  • Built-in edge orchestration capability: SuperEdge supports automatic deployment of multi-regional microservices. Edge-side service traffic is closed-looped within each region, which effectively reduces operational overhead and improves the fault tolerance and availability of the system.

  • Network tunneling: SuperEdge ensures that Kubernetes nodes can operate under different network conditions. It supports network tunneling over TCP, HTTP, HTTPS, and SSH.

SuperEdge was initiated by the following companies: Tencent, Intel, VMware, Huya, Cambricon, CapitalOnline, and Meituan.

Architecture

Cloud components:

  • tunnel-cloud: Maintains a persistent network connection to tunnel-edge services. Supports TCP/HTTP/HTTPS network proxies.
  • application-grid controller: A Kubernetes CRD controller, part of ServiceGroup. It manages the DeploymentGrid, StatefulSetGrid and ServiceGrid CRDs and controls applications and network traffic on edge worker nodes.
  • edge-health admission: Assists Kubernetes controllers by providing real-time health check status from edge-health services distributed on all edge worker nodes.
  • site-manager: A Kubernetes CRD controller that manages NodeUnit and NodeGroup resources. It is also responsible for the Kins feature.

Edge components:

  • lite-apiserver: Lightweight kube-apiserver for edge autonomy. It caches and proxies edge components' requests and critical events to cloud kube-apiserver.
  • edge-health: Monitors the health status of edge nodes in the same edge region.
  • tunnel-edge: Maintains persistent connection to tunnel-cloud to retrieve API requests to the controllers on the edge.
  • application-grid wrapper: Managed by application-grid controller to provide independent internal network space for services within the same ServiceGrid.

Quickstart Guide

Please refer to the sub-project edgeadm. If you want to bring up a SuperEdge cluster from scratch, check the manual One-click install of edge Kubernetes cluster.

  • Download the installation package

    The supported version:

    • CPU arch [amd64, arm64], kubernetes version [1.22.6], version: v0.9.0

    • CPU arch [amd64, arm64], kubernetes version [1.22.6, 1.20.6], version: v0.8.2

    arch=amd64 version=v0.9.0 kubernetesVersion=1.22.6 && rm -rf edgeadm-linux-* && \
      wget https://superedge-1253687700.cos.ap-guangzhou.myqcloud.com/$version/$arch/edgeadm-linux-$arch-$version-k8s-$kubernetesVersion.tgz && \
      tar -xzvf edgeadm-linux-* && \
      cd edgeadm-linux-$arch-$version-k8s-$kubernetesVersion && ./edgeadm
  • Install edge Kubernetes master node

    ./edgeadm init --kubernetes-version=1.22.6 --image-repository superedge.tencentcloudcr.com/superedge --service-cidr=10.96.0.0/12 --pod-network-cidr=192.168.0.0/16 --install-pkg-path ./kube-linux-*.tar.gz --apiserver-cert-extra-sans=<Master Public IP> --apiserver-advertise-address=<Master Intranet IP> --enable-edge=true --edge-version=0.9.0
  • Join edge node

    ./edgeadm join <Master Public/Intranet IP Or Domain>:Port --token xxxx --discovery-token-ca-cert-hash sha256:xxxxxxxxxx --install-pkg-path <edgeadm kube-* install package address path> --enable-edge=true

For other installation, deployment, and administration topics, see our Tutorial.

More details

Contact

For any question or support, feel free to contact us via:

Contributing

Contributions to improve SuperEdge are welcome.

Troubleshooting and Feedback

If you encounter any failure while using SuperEdge, you can reach us via the Contact section above, or give us feedback via Troubleshooting and Feedback.

License

Apache License 2.0



superedge's Issues

/etc/sysctl.conf invalid syntax

What happened:
stderr: cni0: ERROR while getting interface flags: No such device
flannel.1: ERROR while getting interface flags: No such device
Cannot find device "cni0"
Cannot find device "flannel.1"
sed: can't read /etc/sysconfig/selinux: No such file or directory
sed: can't read /etc/selinux/config: No such file or directory
sysctl: /etc/sysctl.d/99-sysctl.conf(61): invalid syntax, continuing...
sysctl: /etc/sysctl.d/99-sysctl.conf(62): invalid syntax, continuing...
sysctl: /etc/sysctl.d/99-sysctl.conf(63): invalid syntax, continuing...
sysctl: /etc/sysctl.d/99-sysctl.conf(64): invalid syntax, continuing...
sysctl: /etc/sysctl.d/99-sysctl.conf(65): invalid syntax, continuing...
sysctl: /etc/sysctl.d/99-sysctl.conf(66): invalid syntax, continuing...
sysctl: /etc/sysctl.conf(61): invalid syntax, continuing...
sysctl: /etc/sysctl.conf(62): invalid syntax, continuing...
sysctl: /etc/sysctl.conf(63): invalid syntax, continuing...
sysctl: /etc/sysctl.conf(64): invalid syntax, continuing...
sysctl: /etc/sysctl.conf(65): invalid syntax, continuing...
sysctl: /etc/sysctl.conf(66): invalid syntax, continuing...
I0624 23:56:30.961537 2691 init_node.go:52] Init node success

What you expected to happen:
no error.

How to reproduce it (as minimally and precisely as possible):
Based on ubuntu/xenial64 (16.04).
box address is https://app.vagrantup.com/ubuntu/boxes/xenial64

Environment:
vagrant@ubuntu-xenial:~/edgeadm-linux-amd64-v0.4.0$ uname -a
Linux ubuntu-xenial 4.4.0-170-generic #199-Ubuntu SMP Thu Nov 14 01:45:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

  • SuperEdge version: v0.4.0

  • Kubernetes version (use kubectl version):
    vagrant@ubuntu-xenial:~/edgeadm-linux-amd64-v0.4.0$ kubectl version
    Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2", GitCommit:"31a3f7703ac622535d4d453fe366f9746b450463", GitTreeState:"clean", BuildDate:"2020-10-13T12:50:07Z", GoVersion:"go1.14.4", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2", GitCommit:"52c56ce7a8272c798dbc29846288d7cd9fbae032", GitTreeState:"clean", BuildDate:"2020-04-16T11:48:36Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}

  • OS (e.g. cat /etc/os-release):
    vagrant@ubuntu-xenial:~/edgeadm-linux-amd64-v0.4.0$ cat /etc/os-release
    NAME="Ubuntu"
    VERSION="16.04.6 LTS (Xenial Xerus)"
    ID=ubuntu
    ID_LIKE=debian
    PRETTY_NAME="Ubuntu 16.04.6 LTS"
    VERSION_ID="16.04"
    HOME_URL="http://www.ubuntu.com/"
    SUPPORT_URL="http://help.ubuntu.com/"
    BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
    VERSION_CODENAME=xenial
    UBUNTU_CODENAME=xenial

  • Kernel (e.g. uname -a):
    vagrant@ubuntu-xenial:~/edgeadm-linux-amd64-v0.4.0$ uname -a
    Linux ubuntu-xenial 4.4.0-170-generic #199-Ubuntu SMP Thu Nov 14 01:45:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

  • Hardware configuration (e.g. lscpu):
    vagrant@ubuntu-xenial:~/edgeadm-linux-amd64-v0.4.0$ lscpu
    Architecture: x86_64
    CPU op-mode(s): 32-bit, 64-bit
    Byte Order: Little Endian
    CPU(s): 2
    On-line CPU(s) list: 0,1
    Thread(s) per core: 1
    Core(s) per socket: 2
    Socket(s): 1
    NUMA node(s): 1
    Vendor ID: GenuineIntel
    CPU family: 6
    Model: 60
    Model name: Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
    Stepping: 3
    CPU MHz: 4000.004
    BogoMIPS: 8000.00
    Hypervisor vendor: KVM
    Virtualization type: full
    L1d cache: 32K
    L1i cache: 32K
    L2 cache: 256K
    L3 cache: 8192K
    NUMA node0 CPU(s): 0,1
    Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc tsc_known_freq pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx rdrand hypervisor lahf_lm abm invpcid_single kaiser fsgsbase avx2 invpcid md_clear flush_l1d

  • Go Version (e.g. go version)
    go version go1.16.5

[Question] CNCF SIG-Runtime Discussion/Presentation?

Hello SuperEdge team,

I'm one of the co-chairs of CNCF SIG-Runtime. I'm reaching out because I think it would be great for you to present and discuss the project at one of our meetings. An overview of the project would be great.

Let me know if this is something you'd be interested in doing. If so, please feel free to add it to our agenda or reach out to me (raravena80 at gmail.com).

Thanks!

lite-apiserver should listen on a dedicated IP not 127.0.0.1

What would you like to be added/modified:
Add an argument for lite-apiserver to set the listening IP.

Why is this needed:
Currently lite-apiserver listens on 127.0.0.1, which may cause some problems:

  1. In IPVS mode, kube-proxy builds an IPVS rule that redirects requests for the Kubernetes service (such as 10.96.0.1:443) to the local lite-apiserver (such as 127.0.0.1:51003). But since Kubernetes v1.19, kubelet adds an iptables rule to the KUBE-FIREWALL chain that blocks this redirect (DROP all -- !127.0.0.0/8 127.0.0.0/8 /* block incoming localnet connections */ ! ctstate RELATED,ESTABLISHED,DNAT). See kubernetes/kubernetes#90259.
  2. Containers running on an edge node without host network cannot access the Kubernetes service, because the request is redirected to 127.0.0.1:51003 in the container's network namespace rather than on the host's loopback interface.

Given these problems, I suggest adding an argument to lite-apiserver for setting the listening IP (see the commands below for how the current behaviour shows up on a node).
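The two failure modes above can be inspected directly on an edge node. A minimal sketch, assuming the default kubernetes service address 10.96.0.1:443 and lite-apiserver on 127.0.0.1:51003 as in the example, with iptables and ipvsadm installed:

# show the KUBE-FIREWALL chain; on v1.19+ it contains the "block incoming localnet connections" DROP rule
iptables -t filter -S KUBE-FIREWALL
# show the IPVS virtual service that kube-proxy programmed for the kubernetes service
ipvsadm -Ln -t 10.96.0.1:443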

One-click install errors

Steps as follows:

  • hj@master1:/opt/edgeadm-linux-amd64-v0.3.0-beta.0$ uname -a
    Linux master1 5.4.0-73-generic #82-Ubuntu SMP Wed Apr 14 17:39:42 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

  • wget 'https://github.com/superedge/superedge/releases/download/v0.3.0-beta.0/edgeadm-linux-amd64-v0.3.0-beta.0.tgz'

  • sudo tar zxvf edgeadm-linux-amd64-v0.3.0-beta.0.tgz

  • cd edgeadm-linux-amd64-v0.3.0-beta.0

  • ./edgeadm init \
    --kubernetes-version=1.18.2 \
    --image-repository superedge.tencentcloudcr.com/superedge \
    --service-cidr=10.192.0.0/12 \
    --pod-network-cidr=10.10.0.0/16 \
    --install-pkg-path ./kube-linux-*.tar.gz \
    --apiserver-advertise-address=192.168.1.245 \
    --enable-edge=false \
    -v=6

E0518 07:26:18.719014  714423 runcmd.go:52] Wait command: systemctl stop firewalld && systemctl disable firewalld exec finish error: exit status 5
I0518 07:26:18.719026  714423 runcmd.go:40] Run command: 'systemctl stop firewalld && systemctl disable firewalld'
 stdout:
 stderr: Failed to stop firewalld.service: Unit firewalld.service not loaded.

After I ran sudo apt install firewalld, this step passed; then I hit this error step:

I0518 07:22:39.641457  739969 runcmd.go:40] Run command: 'cat <<EOF >/etc/sysconfig/modules/ipvs.modules
modprobe -- ip_vs
modprobe -- ip_vs_sh
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- iptable_nat
modprobe -- nf_conntrack_ipv4
modprobe -- br_netfilter

EOF'
 stdout:
 stderr:
E0518 07:22:39.642261  739969 runcmd.go:52] Wait command: chmod 755 /etc/sysconfig/modules/ipvs.modules &&
source /etc/sysconfig/modules/ipvs.modules &&
lsmod | grep -e ip_vs -e nf_conntrack_ipv4
 exec finish error: exit status 127
I0518 07:22:39.642277  739969 runcmd.go:40] Run command: 'chmod 755 /etc/sysconfig/modules/ipvs.modules &&
source /etc/sysconfig/modules/ipvs.modules &&
lsmod | grep -e ip_vs -e nf_conntrack_ipv4
'
 stdout:
 stderr: /bin/sh: 2: source: not found

Error: error execution phase init-node/load-kernel: exit status 127
Usage:
  edgeadm init [flags]
  edgeadm init [command]
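A note on the exit status 127 above: source is a bash builtin, while /bin/sh on Ubuntu is dash, which only understands the POSIX . command. A sketch of the equivalent POSIX form of the failing step:

chmod 755 /etc/sysconfig/modules/ipvs.modules
. /etc/sysconfig/modules/ipvs.modules      # '.' instead of 'source' works under dash
lsmod | grep -e ip_vs -e nf_conntrack_ipv4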

[Question] Missing file /lib/x86_64-linux-gnu/libdevmapper.so.1.02

Has been successfully installed container runtime!

stderr: Removed symlink /etc/systemd/system/multi-user.target.wants/docker.service.
Removed symlink /etc/systemd/system/multi-user.target.wants/containerd.service.
Failed to stop dockerd.service: Unit dockerd.service not loaded.
Failed to execute operation: No such file or directory
dpkg: error: dpkg frontend is locked by another process
dpkg: error: dpkg frontend is locked by another process
dpkg: error: dpkg frontend is locked by another process
dpkg: error: dpkg frontend is locked by another process
ln: failed to create symbolic link '/lib/x86_64-linux-gnu/libdevmapper.so.1.02': File exists
E: Could not get lock /var/lib/dpkg/lock-frontend - open (11: Resource temporarily unavailable)
E: Unable to acquire the dpkg frontend lock (/var/lib/dpkg/lock-frontend), is another process using it?
Created symlink from /etc/systemd/system/multi-user.target.wants/containerd.service to /usr/lib/systemd/system/containerd.service.
Created symlink from /etc/systemd/system/multi-user.target.wants/docker.service to /usr/lib/systemd/system/docker.service.

【SuperEdge User Registration】

On behalf of the developers of SuperEdge, we thank you for using our open source product.
In order to serve you better and make sure the features behind SuperEdge directly solve your problems, we hope to understand your usage scenarios and build a product that truly addresses user pain points.

If you want to cooperate with us, participate in planning future SuperEdge releases, or want us to provide you with better features, you can contact us directly through the following contact information:

Or you can leave your contact information in the following format so that we can get in touch with you:

country:
organization:
contact details:
usage scenario:
other information: 

For example:

country: China
organization: Tencentcloud
contact details: [email protected]
usage scenario: Use SuperEdge to unify management of machines and equipment in various regions
other information: Hope to cooperate with SuperEdge

[Question] cni config uninitialized

Lease:
  HolderIdentity:  ubuntu-xenial
  AcquireTime:     <unset>
  RenewTime:       Fri, 25 Jun 2021 00:55:32 +0000
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Fri, 25 Jun 2021 00:52:33 +0000   Thu, 24 Jun 2021 23:57:01 +0000   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Fri, 25 Jun 2021 00:52:33 +0000   Thu, 24 Jun 2021 23:57:01 +0000   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Fri, 25 Jun 2021 00:52:33 +0000   Thu, 24 Jun 2021 23:57:01 +0000   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            False   Fri, 25 Jun 2021 00:52:33 +0000   Thu, 24 Jun 2021 23:57:01 +0000   KubeletNotReady              runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Addresses:
  InternalIP:  10.0.2.15
  Hostname:    ubuntu-xenial

When installing with edgeadm, edge nodes have no coredns service, so pods cannot access the external network

What happened:
After using edgeadm v0.4.0 to set up a cluster with 1 master (Tencent Cloud, 2C4G, with a public IP) + 2 nodes (ESXi VMs, 8C16G, no public IP, NAT network), edge pods cannot access the external network.
What you expected to happen:
Edge pods can access the external network normally.
How to reproduce it (as minimally and precisely as possible):
Installing a cluster with edgeadm v0.4.0 reproduces it.
Anything else we need to know?:
Changing the DNS server in /etc/resolv.conf inside an edge pod restores external network access.
Environment:

  • SuperEdge version: v0.4.0
  • Kubernetes version (use kubectl version): v1.18.2 on all nodes
  • OS (e.g. cat /etc/os-release): Ubuntu 20.04 LTS on all nodes
  • Kernel (e.g. uname -a): 5.4.0-72 on all nodes
  • Hardware configuration (e.g. lscpu): master 2C4G, nodes 8C16G
  • Go Version (e.g. go version): not installed
  • Others:

Refactor the tcp module of the tunnel component

What would you like to be added/modified:

  1. When the edge node is disconnected, the request connection forwarded by the tcp module will be actively closed
  2. If the tcp connection on the cloud or edge is disconnected, the peer will be notified of the disconnection
  3. You can configure the edge node to which tcp is forwarded, so that one listening port corresponds to one edge node
  4. tunnel supports sub-module loading

Why is this needed:

Support changing edge nodes by label.

What would you like to be added/modified:

At present, the edgeadm change command converts all kubeadm Kubernetes nodes into edge nodes, but I only want to convert some of my nodes. So, add an edgeadm change label subcommand that adds a label to the nodes to be changed.

Why is this needed:
I just want to change some of my nodes to edge nodes.

'docs/installation/install_manually.md' has two things to be fixed.

What would you like to be added/modified:
docs/installation/install_manually.md has two things to be fixed.

  1. cat << EOF >lite_apiserver.conf should be replaced with cat << EOF >lite-apiserver.conf
  2. "Copy lite-apiserver.crt and lite-apiserver.key into edge worker node, path at /etc/kubernetes/pki/" should be replaced with "Copy lite-apiserver.crt and lite-apiserver.key into edge worker node, path at /etc/kubernetes/edge/" (see the sketch after this section)

Why is this needed:
The manual installation guide for SuperEdge is imperfect.
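For reference, fix 1 is purely a filename change (lite-apiserver.conf, with a hyphen). A rough sketch of fix 2, assuming scp access to the edge worker node (the node address is a placeholder):

# copy the lite-apiserver certificate and key to the corrected path on the edge worker node
scp lite-apiserver.crt lite-apiserver.key root@<edge-worker-node>:/etc/kubernetes/edge/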

Remove the DNS hijacking while keeping kube-apiserver able to access kubelet at the edge.

What would you like to be added/modified:
SuperEdge hijacks kube-apiserver's requests to kubelet on edge nodes and sends them to tunnel-cloud, which then forwards them to tunnel-edge, and finally they reach kubelet. But this implementation requires modifying kube-apiserver's DNS parameters and restarting it, which is particularly inelegant. See: https://github.com/superedge/superedge/blob/main/docs/installation/install_manually.md

Why is this needed:

I don't want to modify and restart kube-apiserver, while kube-apiserver can still access kubelet on edge nodes (kubectl logs, kubectl exec, etc. still work normally).

[Edge DNS] dns for edge

CoreDNS is usually run on a Kubernetes cluster as a Deployment; see coredns.yaml.

In edge computing, it is often hard to reach the kube-dns service due to network constraints.
We can instead deploy CoreDNS as a DaemonSet with hostNetwork, and then point kubelet's --cluster-dns at the host IP.

  1. apply the following coredns-daemonset.yaml
    kubectl apply -f coredns-daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: coredns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/name: "CoreDNS"
spec:
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
    spec:
      priorityClassName: system-cluster-critical
      serviceAccountName: coredns
      tolerations:
        - key: "CriticalAddonsOnly"
          operator: "Exists"
      nodeSelector:
        kubernetes.io/os: linux
      containers:
        - name: coredns
          image: coredns/coredns:1.6.7
          imagePullPolicy: IfNotPresent
          args: [ "-conf", "/etc/coredns/Corefile" ]
          resources:
            limits:
              memory: 170Mi
            requests:
              cpu: 100m
              memory: 70Mi
          volumeMounts:
            - name: config-volume
              mountPath: /etc/coredns
              readOnly: true
            - name: tmp
              mountPath: /tmp
          ports:
            - containerPort: 53
              name: dns
              protocol: UDP
            - containerPort: 53
              name: dns-tcp
              protocol: TCP
            - containerPort: 9153
              name: metrics
              protocol: TCP
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              add:
                - NET_BIND_SERVICE
              drop:
                - all
            readOnlyRootFilesystem: true
      dnsPolicy: Default
      hostNetwork: true
      volumes:
        - name: config-volume
          configMap:
            name: coredns
            items:
              - key: Corefile
                path: Corefile
        - name: tmp
          hostPath:
            path: /tmp
  2. delete the CoreDNS Deployment
    kubectl delete deployment coredns -n kube-system

  3. update the kubelet config (see the sketch below)
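A minimal sketch of step 3, assuming kubelet reads its configuration from /var/lib/kubelet/config.yaml (the path and the host IP are assumptions; some installs pass --cluster-dns on the kubelet command line instead):

# point kubelet's cluster DNS at the node's host IP, where the hostNetwork CoreDNS listens, e.g.
#   clusterDNS:
#   - <node host IP>
vi /var/lib/kubelet/config.yaml
systemctl restart kubelet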

edgeadm didn't configure kube-apiserver when converting a common cluster to an edge cluster

What happened:
After I used edgeadm to convert a common k8s cluster to an edge cluster, I found that kube-apiserver was not reconfigured for the edge cluster.
I had to adjust the apiserver's args manually to make it work.
The cluster is deployed by kubeadm v1.19.9.
I hope edgeadm can configure kube-apiserver.

The following are the changes I made to kube-apiserver:

  1. change --kubelet-preferred-address-types=Hostname,InternalIP,ExternalIP, the original value is Hostname,InternalIP,ExternalIP
  2. add dnsConfig:

dnsConfig: { nameservers: [10.2.63.215] }

  3. set dnsPolicy to None, because the host's nameserver may disturb apiserver's DNS (see the sketch below)

dnsPolicy: None
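A sketch of what changes 2 and 3 above would look like on a kubeadm cluster, assuming kube-apiserver runs as the static pod /etc/kubernetes/manifests/kube-apiserver.yaml (kubelet re-creates the pod when the file is saved); the nameserver is the one reported above:

# add the pod-level DNS fields to the kube-apiserver static pod spec:
#   spec:
#     dnsPolicy: None
#     dnsConfig:
#       nameservers:
#       - 10.2.63.215
vi /etc/kubernetes/manifests/kube-apiserver.yaml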

What you expected to happen:

edgeadm should configure kube-apiserver automatically.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • SuperEdge version: 0.4.0
  • Kubernetes version (use kubectl version):
  • OS (e.g. cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Hardware configuration (e.g. lscpu)
  • Go Version (e.g. go version)
  • Others: kubeadm v1.19.9

After an edge node disconnects from the cloud and then reconnects, the pods on the node are restarted

Condition: When the edge node can reach the cloud apiserver, the node is Ready and its pods run normally. After a network failure causes disconnection, the node becomes NotReady (which is expected), and the pod status is shown below.

After the network recovers, the pod and node status return to normal, but the pods on the node have been restarted.

Should there be some mechanism to check the pod status on the edge node, so that if it matches the state recorded in the cloud, the pods do not need to be restarted?

[Question] Node communication in edge scenarios

CNI plugins such as flannel require that the network between nodes is reachable, but in edge scenarios it often happens that edge nodes and cloud nodes belong to different local area networks, and cloud nodes are generally assigned public IPs while edge nodes are not. In this scenario, how should we solve the communication problem between edge nodes and cloud nodes? By extending the CNI to the edge nodes, or through traffic interception at the application or transport layer? Does SuperEdge consider this scenario?
Thanks

Support edgeadm change for newly joined edge and master nodes

What would you like to be added/modified:

Currently, edgeadm does not support running change again when a new kubeadm node joins and wants to become an edge node after the kubeadm cluster has already been changed once. So I want to add two subcommands under edgeadm change to support adding new master and edge nodes.

Why is this needed:

Support new nodes joining the edge Kubernetes cluster.

【SuperEdge Developer Registration】

Since SuperEdge is open source, some developers and organizations want to participate in SuperEdge. We hope to work with you on feature plans and development roadmaps to build SuperEdge together.
On behalf of the initiators of SuperEdge, we sincerely invite you to become a developer of SuperEdge. There are three ways to become a long-term SuperEdge developer:

  • Reply '/contributor' under this issue;
  • Contact us directly via Slack, the Discussion Forum, or the person in charge of the WeChat group;

Or leave your contact information so that we can contact you, format:

country:
organization:
contact details:

For example:

country: China
organization: TencentCloud
contact details: [email protected]

install_manually_CN.md has a wrong description of tunnel-cloud's certificate

What happened:

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • SuperEdge version: v0.4.0 main
  • Kubernetes version (use kubectl version):
  • OS (e.g. cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Hardware configuration (e.g. lscpu)
  • Go Version (e.g. go version)
  • Others:

The node-name parameter does not support ipv4 format

What would you like to be added/modified:
Check whether the node-name parameter is valid on edgeadm init and edgeadm join. An IPv4 address should not be accepted as the parameter value.

Why is this needed:
SuperEdge's tunnel forwarding has a restriction that the node name cannot be in IPv4 format, so when --node-name is used to specify the node name, users need to be prevented from using an IPv4 address (see the sketch below).
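A minimal sketch of the proposed check, in shell form for illustration only (edgeadm itself is written in Go; the argument handling here is hypothetical):

node_name="$1"
if [[ "$node_name" =~ ^([0-9]{1,3}\.){3}[0-9]{1,3}$ ]]; then
  echo "error: --node-name must not be an IPv4 address" >&2
  exit 1
fi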

Failed to initialize the master node when the node bandwidth is small

What would you like to be added/modified:
Have edgeadm itself submit the DeploymentGrid and ServiceGrid CRD definitions customized by SuperEdge.

Why is this needed:
CoreDNS is deployed via DeploymentGrid and ServiceGrid, which are CRDs customized by SuperEdge. At present, the CRD definitions are registered by the grid-controller. When the master node pulls the grid-controller image slowly or the pull fails, edgeadm interrupts the process of initializing the master node, resulting in the failure of cluster creation.

The cloud apiserver cannot access the webhook server on the edge

I deployed the nginx ingress controller on the edge cluster, hoping to access applications through Ingress.
An error is reported when creating an Ingress for the application.

The root cause was later identified: the cloud apiserver cannot reach the webhook server on the edge.

Does deploying an application container to edge nodes support multiple replicas?

Does deploying an application container to edge nodes support multiple replicas? The DeploymentGrid in question is below (see the sketch after the YAML).

apiVersion: superedge.io/v1
kind: DeploymentGrid
metadata:
  name: deploymentgrid-demo
  namespace: default
spec:
  replicas: 2 # Can it be used here?
  gridUniqKey: zone
  template:
    selector:
      matchLabels:
        appGrid: nginx
    template:
      metadata:
        labels:
          appGrid: nginx
      spec:
        containers:
          - name: nginx
            image: nginx:1.7.9
            ports:
              - containerPort: 80
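Based on the other DeploymentGrid examples later in this document, replicas sits under spec.template (the embedded Deployment spec) rather than directly under spec, so a sketch of the same manifest with the field moved there would be (whether the value is honored per NodeUnit should be confirmed against the CRD):

kubectl apply -f - <<EOF
apiVersion: superedge.io/v1
kind: DeploymentGrid
metadata:
  name: deploymentgrid-demo
  namespace: default
spec:
  gridUniqKey: zone
  template:
    replicas: 2          # replica count goes inside the embedded Deployment spec
    selector:
      matchLabels:
        appGrid: nginx
    template:
      metadata:
        labels:
          appGrid: nginx
      spec:
        containers:
          - name: nginx
            image: nginx:1.7.9
            ports:
              - containerPort: 80
EOF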

When using edgeadm-v0.3.0 to deploy an edge cluster, docker cannot start, and restarting the edge node causes the node to fail to start normally

What happened:
When using edgeadm-v0.3.0 to deploy an edge cluster, docker cannot start, and restarting the edge node causes the node to fail to start normally.

What you expected to happen:
Docker should start normally, and after restarting the edge node, the node should rejoin the cluster normally.

How to reproduce it (as minimally and precisely as possible):

  1. version of centOS: CentOS Linux release 7.8.2003 (Core)
  2. firewalld is activated by default
  3. selinux is started by default

Environment:

  • SuperEdge version:
    {
    "gitVersion": "v0.3.0",
    "gitBranch": "main",
    "gitCommit": "9982477207523ce376144e64fe5c59b648b944c9",
    "gitTreeState": "clean",
    "buildDate": "2021-05-20T04:01:30Z",
    "goVersion": "go1.14.14",
    "compiler": "gc",
    "platform": "linux/amd64"
    }

  • Kubernetes version (use kubectl version):
    Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2", GitCommit:"31a3f7703ac622535d4d453fe366f9746b450463", GitTreeState:"clean", BuildDate:"2020-10-13T12:50:07Z", GoVersion:"go1.14.4", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2", GitCommit:"52c56ce7a8272c798dbc29846288d7cd9fbae032", GitTreeState:"clean", BuildDate:"2020-04-16T11:48:36Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}

  • OS (e.g. cat /etc/os-release):
    NAME="CentOS Linux"
    VERSION="7 (Core)"
    ID="centos"
    ID_LIKE="rhel fedora"
    VERSION_ID="7"
    PRETTY_NAME="CentOS Linux 7 (Core)"
    ANSI_COLOR="0;31"
    CPE_NAME="cpe:/o:centos:centos:7"
    HOME_URL="https://www.centos.org/"
    BUG_REPORT_URL="https://bugs.centos.org/"

    CENTOS_MANTISBT_PROJECT="CentOS-7"
    CENTOS_MANTISBT_PROJECT_VERSION="7"
    REDHAT_SUPPORT_PRODUCT="centos"
    REDHAT_SUPPORT_PRODUCT_VERSION="7"

  • Kernel (e.g. uname -a): Linux k8s-edge01 3.10.0-1127.el7.x86_64 #1 SMP Tue Mar 31 23:36:51 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

  • Hardware configuration (e.g. lscpu) :
    Architecture: x86_64
    CPU op-mode(s): 32-bit, 64-bit
    Byte Order: Little Endian
    CPU(s): 4
    On-line CPU(s) list: 0-3
    Thread(s) per core: 1
    Core(s) per socket: 4
    Socket(s): 1
    NUMA node(s): 1
    Vendor ID: GenuineIntel
    CPU family: 6
    Model: 142
    Model name: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
    Stepping: 10
    CPU MHz: 1991.999
    BogoMIPS: 3983.99
    Hypervisor vendor: KVM
    Virtualization type: full
    L1d cache: 32K
    L1i cache: 32K
    L2 cache: 256K
    L3 cache: 8192K
    NUMA node0 CPU(s): 0-3
    Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc eagerfpu pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single fsgsbase avx2 invpcid rdseed clflushopt md_clear flush_l1d

  • Go Version (e.g. go version): go version go1.16.4 linux/amd64

  • Others:

Reduce the time for multiple tunnel-cloud pods to update the node name and IP mapping in the configmap

What would you like to be added/modified:
When a tunnel-cloud pod updates the configmap, it should only update the node-name information related to that pod.
Why is this needed:
When the tunnel-cloud deployment has multiple replicas and multiple edge nodes connect to the cloud, conflicts occur when tunnel-cloud updates the configmap used by the coredns hosts plug-in. The tunnel retries the configmap update on a one-minute cycle, so it can take a long time for the information of nodes connected to the cloud to be written to the configmap.

[Question] How does an application access a ServiceGrid?

After deploying the ServiceGrid, the application cannot access it; the error is shown below.

Database deployment YAML:

---
apiVersion: superedge.io/v1
kind: DeploymentGrid
metadata:
  annotations: {}
  labels:
    k8s.kuboard.cn/layer: db
    k8s.kuboard.cn/name: database
  name: database
  namespace: mkwszc
spec:
  gridUniqKey: zone
  progressDeadlineSeconds: 600
  template:
    replicas: 1
    selector:
      matchLabels:
        k8s.kuboard.cn/layer: db
        k8s.kuboard.cn/name: database
    strategy:
      type: Recreate
    template:
      metadata:
        creationTimestamp: null
        labels:
          k8s.kuboard.cn/layer: db
          k8s.kuboard.cn/name: database
      spec:
        containers:
          - args:
              - mysqld
              - '--character-set-server=utf8mb4'
              - '--collation-server=utf8mb4_general_ci'
              - '--default_authentication_plugin=mysql_native_password'
              - '--explicit_defaults_for_timestamp=true'
              - '--lower_case_table_names=1'
              - '--max_allowed_packet=128M'
              - '--max_connections=500'
              - >-
                --sql-mode=STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION,ERROR_FOR_DIVISION_BY_ZERO
            env:
              - name: MYSQL_ROOT_PASSWORD
                value: '123456'
            image: 'mysql:8.0.25'
            imagePullPolicy: IfNotPresent
            name: mysql
            ports:
              - containerPort: 3306
                name: mysql
                protocol: TCP
            resources:
              limits:
                cpu: '1'
                memory: 1Gi
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
            volumeMounts:
              - mountPath: /var/lib/mysql
                name: volume-mysql
        dnsPolicy: ClusterFirst
        nodeSelector:
          project: mkwszc
        restartPolicy: Always
        schedulerName: default-scheduler
        securityContext: {}
        terminationGracePeriodSeconds: 30
        volumes:
          - hostPath:
              path: /var/lib/mysql
              type: DirectoryOrCreate
            name: volume-mysql

---
apiVersion: superedge.io/v1
kind: ServiceGrid
metadata:
  annotations: {}
  labels:
    k8s.kuboard.cn/layer: db
    k8s.kuboard.cn/name: database
  name: database
  namespace: mkwszc
spec:
  gridUniqKey: zone
  externalTrafficPolicy: Cluster
  template:
    ports:
      - name: mysql
        nodePort: 3306
        port: 3306
        protocol: TCP
        targetPort: 3306
    selector:
      k8s.kuboard.cn/layer: db
      k8s.kuboard.cn/name: database
  sessionAffinity: None
  type: NodePort

Application deployment:

---
apiVersion: superedge.io/v1
kind: DeploymentGrid
metadata:
  annotations: {}
  labels:
    k8s.kuboard.cn/layer: svc
    k8s.kuboard.cn/name: webapp
  name: webapp
  namespace: mkwszc
spec:
  gridUniqKey: zone
  progressDeadlineSeconds: 600
  template:
    replicas: 1
    selector:
      matchLabels:
        k8s.kuboard.cn/layer: svc
        k8s.kuboard.cn/name: webapp
    strategy:
      type: Recreate
    template:
      metadata:
        creationTimestamp: null
        labels:
          k8s.kuboard.cn/layer: svc
          k8s.kuboard.cn/name: webapp
      spec:
        containers:
          - image: 'webapp:latest'
            imagePullPolicy: IfNotPresent
            name: webapp
            ports:
              - containerPort: 8061
                name: webapp
                protocol: TCP
            resources:
              limits:
                cpu: '1'
                memory: 1Gi
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
        dnsPolicy: ClusterFirst
        nodeSelector:
          project: mkwszc
        restartPolicy: Always
        schedulerName: default-scheduler
        securityContext: {}
        terminationGracePeriodSeconds: 30

---
apiVersion: superedge.io/v1
kind: ServiceGrid
metadata:
  annotations: {}
  labels:
    k8s.kuboard.cn/layer: svc
    k8s.kuboard.cn/name: webapp
  name: webapp
  namespace: mkwszc
spec:
  gridUniqKey: zone
  externalTrafficPolicy: Cluster
  template:
    ports:
      - name: webapp
        nodePort: 8061
        port: 8061
        protocol: TCP
        targetPort: 8061
    selector:
      k8s.kuboard.cn/layer: svc
      k8s.kuboard.cn/name: webapp
  sessionAffinity: None
  type: NodePort

Database connection string:
Data Source=database;Port=3306;User ID=root;Password=123456;Initial Catalog=jy_eadp_mkwszc;Charset=utf8mb4;SslMode=none;Min pool size=1

Support Kubernetes 1.20

What would you like to be added/modified:
application-grid-wrapper/application-grid-controller/statefulset-grid-daemon should be modified to work with Kubernetes EndpointSlices.

Why is this needed:
Kubernetes 1.20 enables EndpointSlices by default, and this feature does not work well with SuperEdge yet (see the check below).
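A quick way to see what the cluster serves for a given Service (EndpointSlices vs. legacy Endpoints); the service name and namespace are placeholders:

kubectl get endpointslices -n <namespace> -l kubernetes.io/service-name=<service-name>
kubectl get endpoints <service-name> -n <namespace>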

Change the installation namespace to 'edge-system'

What would you like to be added/modified:
Change the installation namespace to edge-system. The changes include:

  • The yaml templates and files that use kube-system.
  • The webhook certificates that use kube-system as part of the DNS domain.
  • The docs that use kube-system to install the superedge components.

Why is this needed:
Currently SuperEdge is installed into the kube-system namespace, which brings management problems and pollutes kube-system.

Build error: checksum mismatch

The project fails to compile.

Go version: 1.15.2

What happened:

make build

The error output is as follows:

===========> Building binary application-grid-controller v0.2.0.2.g7318886 for darwin amd64
===========> Building binary application-grid-wrapper v0.2.0.2.g7318886 for darwin amd64
===========> Building binary edge-health v0.2.0.2.g7318886 for darwin amd64
===========> Building binary edge-health-admission v0.2.0.2.g7318886 for darwin amd64
===========> Building binary edgeadm v0.2.0.2.g7318886 for darwin amd64
===========> Building binary helper-job v0.2.0.2.g7318886 for darwin amd64
===========> Building binary lite-apiserver v0.2.0.2.g7318886 for darwin amd64
verifying github.com/google/[email protected]: checksum mismatch
        downloaded: h1:/PtAHvnBY4Kqnx/xCQ3OIV9uYcSFGScBsWI3Oogeh6w=
        go.sum:     h1:N8EguYFm2wwdpoNcpchQY0tPs85vOJkboFb2dPxmixo=

SECURITY ERROR
This download does NOT match an earlier download recorded in go.sum.
The bits may have been replaced on the origin server, or an attacker may
have intercepted the download attempt.

For more information, see 'go help module-auth'.
make[1]: *** [go.build.darwin_amd64.lite-apiserver] Error 1
make: *** [build] Error 2
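A commonly tried first step for this class of go.sum mismatch (not specific to SuperEdge) is to clear the local module cache and rebuild, so the modules are re-downloaded and re-verified against go.sum:

go clean -modcache     # drop the locally cached module downloads
make build             # re-download; checksums are re-verified against go.sum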

SuperEdge deployment -- build environment

The environment is Kubernetes v1.16.15, docker-ce 18.09.9, golang 1.15.5.
A question: running the following command reports an error:
[root@master ~]# kubectl apply -f tunnel-coredns.yaml
error: error parsing tunnel-coredns.yaml: error converting YAML to JSON: yaml: line 90: mapping values are not allowed in this context
Is a build environment required? Where can I get the compiled binary tar.gz?
Thanks!

kubelet error log: node "xxxx" not found

What happened

/etc/kubernetes/manifests/lite-apiserver.yaml was configured in advance with the file cache type: --file-cache-path=/data/lite-apiserver/cache.
kubelet's server address was pointed at 127.0.0.1:51003, then systemctl start kubelet was run. lite-apiserver starts normally and kubectl get node shows the node registered, but the kubelet log keeps showing "node not found"; kubelet does not work properly and does not automatically create DaemonSet pods.
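A sketch of the kubelet redirection described above, assuming kubelet's kubeconfig lives at /etc/kubernetes/kubelet.conf (the path varies between installs):

# point kubelet at the local lite-apiserver instead of the cloud kube-apiserver
sed -i 's#server: https://.*#server: https://127.0.0.1:51003#' /etc/kubernetes/kubelet.conf
systemctl restart kubelet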

Temporary workaround

Restarting lite-apiserver or the kubelet service restores normal operation.

Kubernetes version

kube-apiserver and kubelet version: v1.14.10

kubelet log

systemd[1]: Started kubelet: The Kubernetes Node Agent.
kubelet[4744]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
kubelet[4744]: Flag --kube-reserved has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
kubelet[4744]: Flag --feature-gates has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
kubelet[4744]: Flag --eviction-hard has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
kubelet[4744]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
kubelet[4744]: Flag --kube-reserved has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
kubelet[4744]: Flag --feature-gates has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
kubelet[4744]: Flag --eviction-hard has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
kubelet[4744]: I0421 15:53:38.199076    4744 server.go:418] Version: v1.14.10-tk8s.1.4+2f7b4c68ce2039
kubelet[4744]: I0421 15:53:38.199326    4744 plugins.go:103] No cloud provider specified.
kubelet[4744]: I0421 15:53:38.259712    4744 server.go:629] --cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /
kubelet[4744]: GetCgroupMounts.MountPoints: []cgroups.Mount{cgroups.Mount{Mountpoint:"/sys/fs/cgroup/systemd", Root:"/", Subsystems:[]string{"systemd"}}, cgroups.Mount{Mountpoint:"/sys/fs/cgroup/net_cls", Root:"/", Subsystems:[]string{"net_cls"}}, cgroups.Mount{Mountpoint:"/sys/fs/cgroup/cpu,cpuacct", Root:"/", Subsystems:[]string{"cpuacct", "cpu"}}, cgroups.Mount{Mountpoint:"/sys/fs/cgroup/hugetlb", Root:"/", Subsystems:[]string{"hugetlb"}}, cgroups.Mount{Mountpoint:"/sys/fs/cgroup/freezer", Root:"/", Subsystems:[]string{"freezer"}}, cgroups.Mount{Mountpoint:"/sys/fs/cgroup/oom", Root:"/", Subsystems:[]string{"oom"}}, cgroups.Mount{Mountpoint:"/sys/fs/cgroup/perf_event", Root:"/", Subsystems:[]string{"perf_event"}}, cgroups.Mount{Mountpoint:"/sys/fs/cgroup/devices", Root:"/", Subsystems:[]string{"devices"}}, cgroups.Mount{Mountpoint:"/sys/fs/cgroup/cpuset", Root:"/", Subsystems:[]string{"cpuset"}}, cgroups.Mount{Mountpoint:"/sys/fs/cgroup/memory", Root:"/", Subsystems:[]string{"memory"}}, cgroups.Mount{Mountpoint:"/sys/fs/cgroup/pids", Root:"/", Subsystems:[]string{"pids"}}, cgroups.Mount{Mountpoint:"/sys/fs/cgroup/blkio", Root:"/", Subsystems:[]string{"blkio"}}}
kubelet[4744]: I0421 15:53:38.295169    4744 container_manager_linux.go:261] container manager verified user specified cgroup-root exists: []
kubelet[4744]: I0421 15:53:38.295182    4744 container_manager_linux.go:266] Creating Container Manager object based on Node Config: {RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:cgroupfs KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[cpu:{i:{value:60 scale:-3} d:{Dec:<nil>} s:60m Format:DecimalSI} memory:{i:{value:167772160 scale:0} d:{Dec:<nil>} s: Format:BinarySI}] SystemReserved:map[] HardEvictionThresholds:[]} QOSReserved:map[] ExperimentalCPUManagerPolicy:static ExperimentalCPUManagerReconcilePeriod:10s ExperimentalPodPidsLimit:-1 EnforceCPULimits:true CPUCFSQuotaPeriod:100ms CPUReservedEnabled:false}
kubelet[4744]: I0421 15:53:38.295246    4744 container_manager_linux.go:286] Creating device plugin manager: true
kubelet[4744]: I0421 15:53:38.295356    4744 cpu_manager.go:122] [cpumanager] detected CPU topology: &{24 12 2 map[0:{0 0} 1:{0 1} 2:{0 2} 3:{0 3} 4:{0 4} 5:{0 5} 6:{1 6} 7:{1 7} 8:{1 8} 9:{1 9} 10:{1 10} 11:{1 11} 12:{0 0} 13:{0 1} 14:{0 2} 15:{0 3} 16:{0 4} 17:{0 5} 18:{1 6} 19:{1 7} 20:{1 8} 21:{1 9} 22:{1 10} 23:{1 11}]}
kubelet[4744]: I0421 15:53:38.295708    4744 policy_static.go:98] [cpumanager] reserved 1 CPUs ("0") not available for exclusive assignment
kubelet[4744]: I0421 15:53:38.295730    4744 state_mem.go:38] [cpumanager] initializing new in-memory state store
kubelet[4744]: I0421 15:53:38.297188    4744 kubelet.go:304] Watching apiserver
kubelet[4744]: I0421 15:53:38.298770    4744 client.go:75] Connecting to docker on unix:///var/run/docker.sock
kubelet[4744]: I0421 15:53:38.298786    4744 client.go:104] Start docker client with request timeout=2m0s
kubelet[4744]: W0421 15:53:38.309166    4744 docker_service.go:561] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
kubelet[4744]: I0421 15:53:38.309262    4744 docker_service.go:238] Hairpin mode set to "hairpin-veth"
kubelet[4744]: W0421 15:53:38.311033    4744 hostport_manager.go:68] The binary conntrack is not installed, this can cause failures in network connection cleanup.
kubelet[4744]: I0421 15:53:38.311188    4744 docker_service.go:253] Docker cri networking managed by cni
kubelet[4744]: I0421 15:53:38.316247    4744 docker_service.go:258] Docker Info: &{ID:LIBD:ZZBK:IBWK:2QOV:HFAH:CVCF:3U2O:DNVB:KXMD:P5MJ:7KH5:R2XH Containers:0 ContainersRunning:0 ContainersPaused:0 ContainersStopped:0 Images:6 Driver:overlay2 DriverStatus:[[Backing Filesystem extfs] [Supports d_type true] [Native Overlay Diff false]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge host ipvlan macvlan null overlay] Authorization:[] Log:[awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog]} MemoryLimit:true SwapLimit:true KernelMemory:true CPUCfsPeriod:true CPUCfsQuota:true CPUShares:true CPUSet:true IPv4Forwarding:true BridgeNfIptables:true BridgeNfIP6tables:true Debug:false NFd:24 OomKillDisable:true NGoroutines:42 SystemTime:2021-04-21T15:53:38.311837751+08:00 LoggingDriver:json-file CgroupDriver:cgroupfs NEventsListener:0 KernelVersion:3.10.107-1-tlinux2-0051 OperatingSystem:Tencent tlinux 2.2 (Final) OSType:linux Architecture:x86_64 IndexServerAddress:https://index.docker.io/v1/ RegistryConfig:0xc0005c0310 NCPU:24 MemTotal:33268166656 GenericResources:[] DockerRootDir:/var/lib/docker HTTPProxy: HTTPSProxy: NoProxy: Name:TENCENT64.site Labels:[] ExperimentalBuild:false ServerVersion:19.03.9 ClusterStore: ClusterAdvertise: Runtimes:map[runc:{Path:runc Args:[]}] DefaultRuntime:runc Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:[] Nodes:0 Managers:0 Cluster:<nil>} LiveRestoreEnabled:true Isolation: InitBinary:docker-init ContainerdCommit:{ID:7ad184331fa3e55e52b890ea95e65ba581ae3429 Expected:7ad184331fa3e55e52b890ea95e65ba581ae3429} RuncCommit:{ID:dc9208a3303feef5b3839f4323d9beb36df0a9dd Expected:dc9208a3303feef5b3839f4323d9beb36df0a9dd} InitCommit:{ID:fec3683 Expected:fec3683} SecurityOptions:[name=seccomp,profile=default]}
kubelet[4744]: I0421 15:53:38.316340    4744 docker_service.go:271] Setting cgroupDriver to cgroupfs
kubelet[4744]: I0421 15:53:38.321542    4744 remote_runtime.go:62] parsed scheme: ""
kubelet[4744]: I0421 15:53:38.321556    4744 remote_runtime.go:62] scheme "" not registered, fallback to default scheme
kubelet[4744]: I0421 15:53:38.321584    4744 remote_image.go:50] parsed scheme: ""
kubelet[4744]: I0421 15:53:38.321590    4744 remote_image.go:50] scheme "" not registered, fallback to default scheme
kubelet[4744]: I0421 15:53:38.321673    4744 asm_amd64.s:1337] ccResolverWrapper: sending new addresses to cc: [{/var/run/dockershim.sock 0  <nil>}]
kubelet[4744]: I0421 15:53:38.321694    4744 clientconn.go:796] ClientConn switching balancer to "pick_first"
kubelet[4744]: I0421 15:53:38.321730    4744 asm_amd64.s:1337] ccResolverWrapper: sending new addresses to cc: [{/var/run/dockershim.sock 0  <nil>}]
kubelet[4744]: I0421 15:53:38.321748    4744 clientconn.go:796] ClientConn switching balancer to "pick_first"
kubelet[4744]: I0421 15:53:38.321764    4744 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc0002fd4d0, CONNECTING
kubelet[4744]: I0421 15:53:38.321789    4744 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc000235470, CONNECTING
kubelet[4744]: I0421 15:53:38.321942    4744 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc0002fd4d0, READY
kubelet[4744]: I0421 15:53:38.321947    4744 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc000235470, READY
kubelet[4744]: E0421 15:53:38.591631    4744 reflector.go:126] k8s.io/kubernetes/pkg/kubelet/kubelet.go:442: Failed to list *v1.Service: services is forbidden: User "system:anonymous" cannot list resource "services" in API group "" at the cluster scope
kubelet[4744]: E0421 15:53:38.604531    4744 reflector.go:126] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: pods is forbidden: User "system:anonymous" cannot list resource "pods" in API group "" at the cluster scope
kubelet[4744]: E0421 15:53:38.612940    4744 reflector.go:126] k8s.io/kubernetes/pkg/kubelet/kubelet.go:451: Failed to list *v1.Node: nodes "11-22-00-ff-bb-cd" is forbidden: User "system:anonymous" cannot list resource "nodes" in API group "" at the cluster scope
kubelet[4744]: E0421 15:53:58.690462    4744 aws_credentials.go:77] while getting AWS credentials NoCredentialProviders: no valid providers in chain. Deprecated.
kubelet[4744]: For verbose messaging see aws.Config.CredentialsChainVerboseErrors
kubelet[4744]: I0421 15:53:58.695520    4744 kuberuntime_manager.go:215] Container runtime docker initialized, version: 19.03.9, apiVersion: 1.40.0
kubelet[4744]: I0421 15:53:58.699860    4744 server.go:1056] Started kubelet
kubelet[4744]: E0421 15:53:58.699948    4744 kubelet.go:1283] Image garbage collection failed once. Stats initialization may not have completed yet: failed to get imageFs info: unable to find data in memory cache
kubelet[4744]: I0421 15:53:58.699915    4744 server.go:141] Starting to listen on 0.0.0.0:10250
kubelet[4744]: I0421 15:53:58.700742    4744 server.go:344] Adding debug handlers to kubelet server.
kubelet[4744]: I0421 15:53:58.700746    4744 fs_resource_analyzer.go:64] Starting FS ResourceAnalyzer
kubelet[4744]: I0421 15:53:58.700793    4744 status_manager.go:152] Starting to sync pod status with apiserver
kubelet[4744]: I0421 15:53:58.700822    4744 volume_manager.go:248] Starting Kubelet Volume Manager
kubelet[4744]: I0421 15:53:58.700830    4744 kubelet.go:1809] Starting kubelet main sync loop.
kubelet[4744]: I0421 15:53:58.700849    4744 desired_state_of_world_populator.go:130] Desired state populator starts to run
kubelet[4744]: I0421 15:53:58.700851    4744 kubelet.go:1826] skipping pod synchronization - [container runtime status check may not have completed yet., PLEG is not healthy: pleg has yet to be successful.]
kubelet[4744]: E0421 15:53:58.707219    4744 docker_sandbox.go:538] Failed to retrieve checkpoint for sandbox "a8904d14dded5d2f8c51e7da22da9009a7f8f93e8a99eae7e40c3a453bf0a371": checkpoint is not found
kubelet[4744]: I0421 15:53:58.782412    4744 kubelet_node_status.go:286] Setting node annotation to enable volume controller attach/detach
kubelet[4744]: E0421 15:53:58.800896    4744 kubelet.go:2247] node "11-22-00-ff-bb-cd" not found
kubelet[4744]: I0421 15:53:58.800897    4744 kubelet_node_status.go:286] Setting node annotation to enable volume controller attach/detach
kubelet[4744]: I0421 15:53:58.800962    4744 kubelet.go:1826] skipping pod synchronization - container runtime status check may not have completed yet.
kubelet[4744]: E0421 15:53:58.900998    4744 kubelet.go:2247] node "11-22-00-ff-bb-cd" not found
kubelet[4744]: I0421 15:53:59.001063    4744 kubelet.go:1826] skipping pod synchronization - container runtime status check may not have completed yet.
kubelet[4744]: E0421 15:53:59.001085    4744 kubelet.go:2247] node "11-22-00-ff-bb-cd" not found
kubelet[4744]: E0421 15:53:59.027756    4744 controller.go:204] failed to get node "11-22-00-ff-bb-cd" when trying to set owner ref to the node lease: nodes "11-22-00-ff-bb-cd" not found
kubelet[4744]: E0421 15:53:59.050947    4744 streamwatcher.go:109] Unable to decode an event from the watch stream: got short buffer with n=0, base=168, cap=2688
kubelet[4744]: W0421 15:53:59.050970    4744 reflector.go:289] k8s.io/client-go/informers/factory.go:133: watch of *v1beta1.CSIDriver ended with: very short watch: k8s.io/client-go/informers/factory.go:133: Unexpected watch close - watch lasted less than a second and no items received
kubelet[4744]: E0421 15:53:59.101164    4744 kubelet.go:2247] node "11-22-00-ff-bb-cd" not found
kubelet[4744]: E0421 15:53:59.201277    4744 kubelet.go:2247] node "11-22-00-ff-bb-cd" not found
kubelet[4744]: I0421 15:53:59.213443    4744 cpu_manager.go:202] [cpumanager] starting with static policy
kubelet[4744]: I0421 15:53:59.213452    4744 kubelet_node_status.go:72] Attempting to register node 11-22-00-ff-bb-cd
kubelet[4744]: I0421 15:53:59.213455    4744 cpu_manager.go:203] [cpumanager] reconciling every 10s
kubelet[4744]: I0421 15:53:59.213549    4744 state_mem.go:96] [cpumanager] updated default cpuset: "0-23"
kubelet[4744]: W0421 15:53:59.215017    4744 manager.go:540] Failed to retrieve checkpoint for "kubelet_internal_checkpoint": checkpoint is not found
kubelet[4744]: E0421 15:53:59.215285    4744 eviction_manager.go:287] eviction manager: failed to get summary stats: failed to get node info: node "11-22-00-ff-bb-cd" not found
kubelet[4744]: I0421 15:53:59.271316    4744 kubelet_node_status.go:75] Successfully registered node 11-22-00-ff-bb-cd
kubelet[4744]: E0421 15:53:59.301381    4744 kubelet.go:2247] node "11-22-00-ff-bb-cd" not found
kubelet[4744]: I0421 15:53:59.326104    4744 kuberuntime_manager.go:1030] updating runtime config through cri with podcidr 192.168.18.0/24
kubelet[4744]: I0421 15:53:59.326254    4744 docker_service.go:353] docker cri received runtime config &RuntimeConfig{NetworkConfig:&NetworkConfig{PodCidr:192.168.18.0/24,},}
kubelet[4744]: I0421 15:53:59.326416    4744 kubelet_network.go:77] Setting Pod CIDR:  -> 192.168.18.0/24
kubelet[4744]: W0421 15:53:59.401203    4744 pod_container_deletor.go:75] Container "7caa8bbd2ab47fb43bb3165cbadda047212a9ad4178caa01a06c63d3dcbf0304" not found in pod's containers
kubelet[4744]: W0421 15:53:59.401218    4744 pod_container_deletor.go:75] Container "80511f792a67d2aa510d2dc61365ee335cd230ad2f5df7c0323aa7c97a653a91" not found in pod's containers
kubelet[4744]: W0421 15:53:59.401226    4744 pod_container_deletor.go:75] Container "a8904d14dded5d2f8c51e7da22da9009a7f8f93e8a99eae7e40c3a453bf0a371" not found in pod's containers
kubelet[4744]: W0421 15:53:59.401232    4744 pod_container_deletor.go:75] Container "d127ca41b22e2b63de55a34de340bab7d0774f962aa94381fc73faa34ee40032" not found in pod's containers
kubelet[4744]: E0421 15:53:59.401467    4744 kubelet.go:2247] node "11-22-00-ff-bb-cd" not found
kubelet[4744]: E0421 15:53:59.501550    4744 kubelet.go:2247] node "11-22-00-ff-bb-cd" not found
kubelet[4744]: I0421 15:53:59.502100    4744 reconciler.go:154] Reconciler: start to sync state
kubelet[4744]: E0421 15:53:59.601658    4744 kubelet.go:2247] node "11-22-00-ff-bb-cd" not found
kubelet[4744]: E0421 15:53:59.701753    4744 kubelet.go:2247] node "11-22-00-ff-bb-cd" not found
kubelet[4744]: E0421 15:53:59.801846    4744 kubelet.go:2247] node "11-22-00-ff-bb-cd" not found
kubelet[4744]: E0421 15:53:59.901930    4744 kubelet.go:2247] node "11-22-00-ff-bb-cd" not found
kubelet[4744]: E0421 15:54:00.002009    4744 kubelet.go:2247] node "11-22-00-ff-bb-cd" not found
kubelet[4744]: E0421 15:54:00.102092    4744 kubelet.go:2247] node "11-22-00-ff-bb-cd" not found
kubelet[4744]: E0421 15:54:00.202185    4744 kubelet.go:2247] node "11-22-00-ff-bb-cd" not found
kubelet[4744]: E0421 15:54:00.302270    4744 kubelet.go:2247] node "11-22-00-ff-bb-cd" not found
kubelet[4744]: E0421 15:54:00.402350    4744 kubelet.go:2247] node "11-22-00-ff-bb-cd" not found

lite-apiserver logs

 proxy.go:69] New request: method->GET, url->/apis/coordination.k8s.io/v1beta1/namespaces/kube-node-lease/leases/11-22-00-ff-bb-cd?timeout=10s
 proxy.go:199] request resourceInfo=&{IsResourceRequest:true Path:/apis/coordination.k8s.io/v1beta1/namespaces/kube-node-lease/leases/11-22-00-ff-bb-cd Verb:get APIPrefix:apis APIGroup:coordination.k8s.io APIVersion:v1beta1 Namespace:kube-node-lease Resource:leases Subresource: Name:11-22-00-ff-bb-cd Parts:[leases 11-22-00-ff-bb-cd]}
 cache_mgr.go:60] cache for kubelet/v1.14.10_kube-node-lease_leases_11-22-00-ff-bb-cd_
 cache_mgr.go:78] cache 586 bytes body from response for kubelet/v1.14.10_kube-node-lease_leases_11-22-00-ff-bb-cd_
 proxy.go:69] New request: method->PUT, url->/apis/coordination.k8s.io/v1beta1/namespaces/kube-node-lease/leases/11-22-00-ff-bb-cd?timeout=10s
 proxy.go:199] request resourceInfo=&{IsResourceRequest:true Path:/apis/coordination.k8s.io/v1beta1/namespaces/kube-node-lease/leases/11-22-00-ff-bb-cd Verb:update APIPrefix:apis APIGroup:coordination.k8s.io APIVersion:v1beta1 Namespace:kube-node-lease Resource:leases Subresource: Name:11-22-00-ff-bb-cd Parts:[leases 11-22-00-ff-bb-cd]}
 proxy.go:69] New request: method->GET, url->/api/v1/nodes/11-22-00-ff-bb-cd?resourceVersion=0&timeout=10s
 proxy.go:199] request resourceInfo=&{IsResourceRequest:true Path:/api/v1/nodes/11-22-00-ff-bb-cd Verb:get APIPrefix:api APIGroup: APIVersion:v1 Namespace: Resource:nodes Subresource: Name:11-22-00-ff-bb-cd Parts:[nodes 11-22-00-ff-bb-cd]}
 cache_mgr.go:60] cache for kubelet/v1.14.10__nodes_11-22-00-ff-bb-cd_
 cache_mgr.go:78] cache 3660 bytes body from response for kubelet/v1.14.10__nodes_11-22-00-ff-bb-cd_

What you expected to happen:

Under normal circumstances, kubelet should come up and serve normally after starting with the default configuration.

Support custom configuration when running the install script

What would you like to be added/modified:

  • When users run the auto install script (for example: ./edgeadm init --kubernetes-version=1.18.2 --image-repository superedge.tencentcloudcr.com/superedge --service-cidr=10.96.0.0/12 --pod-network-cidr=192.168.0.0/16 --install-pkg-path ./kube-linux-*.tar.gz --apiserver-cert-extra-sans=<Master public IP> --apiserver-advertise-address=<master Intranet IP> --enable-edge=true -v=6), we want to allow them to supply their own configuration, such as a custom Docker root path (e.g. /data/docker instead of the default /var/lib/docker).

Why is this needed:

  • The default arguments, such as the /var/lib/docker root path, are not production-ready and should only be used for testing (see the sketch below for the kind of customization intended).
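
As an illustration of the kind of customization intended (this is an assumed manual step, not an existing edgeadm option), and assuming Docker is the container runtime, the Docker root directory could be moved by setting data-root in /etc/docker/daemon.json before workloads are scheduled:

# Hypothetical example: /data/docker is an illustrative path, not an edgeadm default.
mkdir -p /data/docker
cat > /etc/docker/daemon.json <<'EOF'
{
  "data-root": "/data/docker"
}
EOF
systemctl restart docker

The request is essentially for the install script to render this kind of setting from a user-supplied value instead of requiring a manual edit after installation.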

[Bug]: Cannot join edge nodes into an existing cluster via the public apiserver IP

What happened:
I cannot join an edge node into an existing cluster using the public apiserver IP.

How to reproduce it (as minimally and precisely as possible):

  1. Install the edge-apps add-on (master node)
./edgeadm addon edge-apps  --ca.cert /etc/kubernetes/ca.crt --ca.key /etc/kubernetes/ca.key --master-public-addr <apiserver public address/ domain>:<port>
  2. Print the join command (master node)
./edgeadm token create --print-join-command
  3. Join an edge node
./edgeadm join <apiserver public address/ internal address/ domain>:<port> --token xxxx --discovery-token-ca-cert-hash sha256:xxxxxxxxxx --install-pkg-path <path of installation package> --enable-edge=true
  4. Got an x509 certificate error
I0626 19:22:55.099493    9543 join.go:511] [preflight] Discovering cluster-info
I0626 19:22:55.099538    9543 token.go:78] [discovery] Created cluster-info discovery client, requesting info from "152.136.179.199:6443"
I0626 19:22:55.219936    9543 token.go:116] [discovery] Requesting info from "152.136.179.199:6443" again to validate TLS against the pinned public key
I0626 19:22:55.295476    9543 token.go:215] [discovery] Failed to request cluster-info, will try again: Get "https://152.136.179.199:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s": x509: certificate is valid for 10.96.0.1, 192.168.11.8, not 152.136.179.199
I0626 19:23:01.783490    9543 token.go:215] [discovery] Failed to request cluster-info, will try again: Get "https://152.136.179.199:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s": x509: certificate is valid for 10.96.0.1, 192.168.11.8, not 152.136.179.199

Environment:

  • SuperEdge version:
    v0.4.0
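
The x509 error above indicates that the kube-apiserver serving certificate only contains 10.96.0.1 and 192.168.11.8 in its SANs, so TLS validation fails when the edge node connects via the public IP 152.136.179.199. As a hedged workaround sketch (assuming a kubeadm-provisioned control plane; this is not an official SuperEdge fix), the serving certificate could be regenerated with the public address added as an extra SAN:

# Run on the master node; back up the existing serving certificate first.
mv /etc/kubernetes/pki/apiserver.crt /etc/kubernetes/pki/apiserver.crt.bak
mv /etc/kubernetes/pki/apiserver.key /etc/kubernetes/pki/apiserver.key.bak
# Re-issue the apiserver certificate including the public IP in its SANs.
kubeadm init phase certs apiserver --apiserver-cert-extra-sans=152.136.179.199
# Restart the kube-apiserver static pod (e.g. by moving its manifest out of
# /etc/kubernetes/manifests and back) so it picks up the new certificate.

Alternatively, passing the public address via --apiserver-cert-extra-sans at edgeadm init time (as in the install command shown earlier) avoids the problem for newly created clusters.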

SEP: ServiceGroup StatefulSetGrid Design Specification

This is my initial design of ServiceGroup StatefulSetGrid, shown below:

[design diagram]

A sample StatefulSetGrid custom resource is shown below:

apiVersion: superedge.io/v1
kind: StatefulSetGrid
metadata:
  name: statefulsetgrid-demo
  namespace: default
spec:
  gridUniqKey: zone1
  template:
    selector:
      matchLabels:
        appGrid: nginx # has to match .spec.template.metadata.labels
    serviceName: "servicegrid-demo-svc"
    replicas: 3 # by default is 1
    template:
      metadata:
        labels:
          appGrid: nginx # has to match .spec.selector.matchLabels
      spec:
        terminationGracePeriodSeconds: 10
        containers:
        - name: nginx
          image: k8s.gcr.io/nginx-slim:0.8
          ports:
          - containerPort: 80
            name: web
          volumeMounts:
          - name: www
            mountPath: /usr/share/nginx/html
    volumeClaimTemplates:
    - metadata:
        name: www
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: "my-storage-class"
        resources:
          requests:
            storage: 1Gi

ServiceGrid stays unchanged; a sample is shown below:

apiVersion: superedge.io/v1
kind: ServiceGrid
metadata:
  name: servicegrid-demo
  namespace: default
spec:
  gridUniqKey: zone1
  template:
    selector:
      appGrid: nginx
    ports:
    - port: 80
      name: web
    clusterIP: None                                                       

This design works as follows:

  1. StatefulSetGrid will generate a corresponding Kubernetes StatefulSet workload for each NodeUnit, just as DeploymentGrid does (named {StatefulSetGrid}-{NodeUnit}; see the sketch after this list).
  2. The StatefulSet headless service will be accessed through domains of the form {StatefulSetGrid}-{0..N-1}.{ServiceGrid}-svc.default.svc.cluster.local, which resolve to the actual StatefulSet workload pods {StatefulSetGrid}-{NodeUnit}-{0..N-1}.{ServiceGrid}-svc.default.svc.cluster.local, hiding the differences between NodeUnits from callers.
  3. Normal ClusterIP service access to the StatefulSet stays unchanged, relying on the service topology mechanism already in place (i.e. {ServiceGrid}-svc).
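
For illustration, below is a minimal sketch of the per-NodeUnit workload the controller could generate from the StatefulSetGrid sample above, for a hypothetical NodeUnit named zone1-unit1 (the NodeUnit name and the nodeSelector wiring are assumptions; the controller's actual output may differ, and volumeClaimTemplates are omitted for brevity):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: statefulsetgrid-demo-zone1-unit1   # {StatefulSetGrid}-{NodeUnit}
  namespace: default
spec:
  serviceName: "servicegrid-demo-svc"
  replicas: 3
  selector:
    matchLabels:
      appGrid: nginx
  template:
    metadata:
      labels:
        appGrid: nginx
    spec:
      nodeSelector:
        zone1: zone1-unit1                 # assumed: pods pinned to the NodeUnit via the gridUniqKey label
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web

Per point 2, pod statefulsetgrid-demo-zone1-unit1-0 would then be addressed from within its NodeUnit as statefulsetgrid-demo-0.servicegrid-demo-svc.default.svc.cluster.local.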
