windows-machine-config-bootstrapper's Introduction

Windows Machine Config Bootstrapper

The bootstrapper is the component responsible for bootstrapping a Windows node. Its current scope is to perform a one-shot configuration of the Windows node so that it can become a worker node. The bootstrapper does the following jobs:

  • Parses the worker ignition file to get the bootstrap kubeconfig
  • Ensures that the kubelet gets the correct kubelet config
  • Runs the kubelet as a Windows service

Once the bootstrapper has been run and the CSR associated with the Windows node is approved, the Windows node will have a taint called os=Windows:NoSchedule. Only pods with a matching toleration can be scheduled onto the Windows node. An example pod spec with the toleration would be:

tolerations:
  - key: "os"
    operator: "Equal"
    value: "Windows"
    effect: "NoSchedule"

The bootstrapper is invoked remotely from an Ansible script, or it can be run locally.

Requirements

  • Must be run on Windows Server 2019
  • Must be run as administrator
  • A worker ignition file generated by the cluster must be on disk
  • The kubelet you wish to use must be on disk. Currently we support v1.16.2.
  • If running on AWS, the Windows instance must have the same tags as the other worker nodes in the cluster

Usage

make build
wmcb initialize-kubelet --ignition-file $IGNITION_FILE_PATH --kubelet-path $KUBELET_PATH

The initialize-kubelet command provides the following optional parameters:

  • --cluster-dns is the DNS server IP passed to the kubelet, which uses it to configure DNS resolution for all containers. If unset, the kubelet determines the DNS server to use. See the clusterDNS option in KubeletConfiguration.
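
As a rough sketch of how the optional flag combines with the required ones, the snippet below assembles the command line and prints it instead of running it. The file names and the address 172.30.0.10 are placeholder assumptions, not values taken from this README.

```shell
# Placeholder values (assumptions): substitute your own paths and DNS server IP.
IGNITION_FILE_PATH='worker.ign'
KUBELET_PATH='kubelet.exe'
CLUSTER_DNS='172.30.0.10'

# Dry run: build the command string and print it rather than executing wmcb.
cmd="wmcb initialize-kubelet --ignition-file $IGNITION_FILE_PATH --kubelet-path $KUBELET_PATH --cluster-dns $CLUSTER_DNS"
printf '%s\n' "$cmd"
```

Printing first makes it easy to check the arguments before running the command on the Windows node as administrator.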

Testing

Windows Machine Config Bootstrapper

End to end testing

The following environment variables need to be set for running the end to end tests:

  • ARTIFACT_DIR
    • This can be set to any directory
  • AWS_SHARED_CREDENTIALS_FILE
    • Set this to point to your AWS credentials file
  • KUBE_SSH_KEY_PATH
    • The ssh key used to bring up the VM
  • WMCB_IMAGE
    • Registry URL for the remote WMCB image that needs to be tested, e.g. quay.io/<USERNAME>/<IMAGE>:<TAG>
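
The variables above could be exported along these lines before running the tests. This is a minimal sketch: every path is a hypothetical placeholder, and the image reference keeps the same <USERNAME>/<IMAGE>:<TAG> placeholders used in the build command.

```shell
# Hypothetical paths; point these at your own files and directories.
export ARTIFACT_DIR="/tmp/wmcb-artifacts"
export AWS_SHARED_CREDENTIALS_FILE="$HOME/.aws/credentials"
export KUBE_SSH_KEY_PATH="$HOME/.ssh/wmcb-test-key.pem"
export WMCB_IMAGE="quay.io/<USERNAME>/<IMAGE>:<TAG>"

# Report any required variable that is still empty.
missing=0
for name in ARTIFACT_DIR AWS_SHARED_CREDENTIALS_FILE KUBE_SSH_KEY_PATH WMCB_IMAGE; do
  eval "value=\${$name:-}"
  if [ -z "$value" ]; then
    echo "missing: $name" >&2
    missing=1
  fi
done
echo "missing=$missing"
```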

To build the WMCB image, execute:

podman build -f Dockerfile.tools -t quay.io/<USERNAME>/<IMAGE>:<TAG> .

The WMCB image needs to be pushed to a remote repository:

podman push quay.io/<USERNAME>/<IMAGE>:<TAG>

Once the above variables are set, you can run the unit and end to end tests by executing:

$ hack/run-wmcb-ci-e2e-test.sh

To skip MachineSet setup, add the -skipVMSetup argument to the args field in internal/test/wmcb/deploy/job.yaml. To use -skipVMSetup, a MachineSet with the label machine.openshift.io/os-id=Windows must already have been created, and its Machine must be in the Provisioned state. The test suite will use the mounted private key to access that Machine. Using an already provisioned VM reduces the wait time to run the tests from 12 minutes to just 1 minute.

windows-machine-config-bootstrapper's People

Contributors

aravindhp, codyhoag, fedosin, gmarkley-vi, jrvaldes, leonjia0112, lorbuschris, mansikulkarni96, openshift-ci[bot], openshift-merge-bot[bot], openshift-merge-robot, pratikmahajan, ravisantoshgudimetla, saifshaikh48, sebsoto, suhanime, thrasher-redhat, vaishnavihire, vinaykns

windows-machine-config-bootstrapper's Issues

Network OVNKubernetesHybridOverlayNetwork not found

After following these instructions (https://github.com/openshift/windows-machine-config-bootstrapper/blob/master/tools/ansible/docs/azure/azure-with-windows-server.md) my cluster appears to be up and running -

$ oc get node
NAME                                STATUS   ROLES    AGE   VERSION
ocp447-nffjf-master-0               Ready    master   19h   v1.17.1+f5fb168
ocp447-nffjf-master-1               Ready    master   19h   v1.17.1+f5fb168
ocp447-nffjf-master-2               Ready    master   19h   v1.17.1+f5fb168
ocp447-nffjf-worker-eastus1-959mg   Ready    worker   19h   v1.17.1+f5fb168
ocp447-nffjf-worker-eastus2-pxtzc   Ready    worker   19h   v1.17.1+f5fb168
ocp447-nffjf-worker-eastus3-vtrqr   Ready    worker   19h   v1.17.1+f5fb168
win-worker1                         Ready    worker   17h   v1.17.1

But when running the test on the default namespace I see -

$ oc create -f https://gist.githubusercontent.com/suhanime/683ee7b5a2f55c11e3a26a4223170582/raw/d893db98944bf615fccfe73e6e4fb19549a362a5/WinWebServer.yaml -n default
service/win-webserver created
deployment.apps/win-webserver created

But then -

$ oc describe pod/win-webserver-6f5bdc5b95-nl7rz
...
Events:
  Type     Reason                  Age                  From                  Message
  ----     ------                  ----                 ----                  -------
  Normal   Scheduled               116s                 default-scheduler     Successfully assigned default/win-webserver-6f5bdc5b95-nl7rz to win-worker1
  Warning  FailedCreatePodSandBox  114s                 kubelet, win-worker1  Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "d63e0d5ebef238519b4ac6329558db3eaefadfc0c5a7005afc8705b5986c0592" network for pod "win-webserver-6f5bdc5b95-nl7rz": networkPlugin cni failed to set up pod "win-webserver-6f5bdc5b95-nl7rz_default" network: error while GETHNSNewtorkByName(OVNKubernetesHybridOverlayNetwork): Network OVNKubernetesHybridOverlayNetwork not found

Any advice on how to diagnose and fix this? Thanks.

Ansible issue when worker nodes have common string in the IP

I have a set of OpenShift worker nodes with IPs 192.168.200.2, 192.168.200.22, 192.168.200.234 etc., where '192.168.200.2' is a common string in all the IP addresses.

Due to this, the below line in windows-machine-config-bootstrapper/tools/ansible/tasks/wsu/main.yaml (line 329) breaks the ansible automation, since the output of it contains multiple nodes, not just the node I am trying to bootstrap.
shell: "oc get node -o wide |awk '/{{ inventory_hostname }}/ || /{{ private_ip }}/ {print $1}'"

For example, I am trying to bootstrap a Windows node with the IP 192.168.200.2. But since I have other nodes with this common string in their IP addresses, the ansible step fails because it outputs all the nodes with similar IP addresses.

If I run above command,

# oc get node -o wide |awk '/192.168.200.2/ || /192.168.200.2/ {print $1}'
win2k19dc
worker-00
worker-01
worker-02

It was supposed to show only the node 'win2k19dc', which I was trying to bootstrap. But it also shows the other 3 nodes, which have '192.168.200.2' at the beginning of their IPs.

As a workaround I modified the line to add a space at the end of each pattern; I am not sure if this is the right way.
shell: "oc get node -o wide |awk '/{{ inventory_hostname }} / || /{{ private_ip }} / {print $1}'"
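
An alternative to the trailing space, offered only as a sketch, is to compare whole fields instead of matching a regular expression, which also avoids the unescaped dots in the pattern matching any character. The node table below is hypothetical sample data, and the sketch assumes the internal IP is the sixth column of `oc get node -o wide`, as in the default column layout.

```shell
# Hypothetical sample of `oc get node -o wide` output, trimmed to six columns.
nodes='NAME        STATUS   ROLES    AGE   VERSION   INTERNAL-IP
win2k19dc   Ready    worker   1d    v1.17.1   192.168.200.2
worker-00   Ready    worker   1d    v1.17.1   192.168.200.22
worker-01   Ready    worker   1d    v1.17.1   192.168.200.234
worker-02   Ready    worker   1d    v1.17.1   192.168.200.25'

# Print the node name only when the NAME or INTERNAL-IP field equals the target exactly.
find_node() {
  printf '%s\n' "$nodes" | awk -v host="$1" -v ip="$2" '$1 == host || $6 == ip {print $1}'
}

result=$(find_node "win2k19dc" "192.168.200.2")
echo "$result"   # prints win2k19dc
```

In the playbook the same awk program would read from `oc get node -o wide` instead of the sample variable, with the hostname and IP passed through `-v`.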

Support ocp 4.4.7

With OCP 4.4.7 I see:

$ oc version -o json | jq -r '.serverVersion.gitVersion'
v1.17.1+f5fb168

This breaks tools/ansible/tasks/wsu/main.yaml ... I think we need to remove the +f5fb168 suffix
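
A minimal sketch of stripping the suffix with POSIX parameter expansion follows. Whether the bare version is all the playbook needs is an assumption; the variable names are illustrative only.

```shell
# Version string as reported by the command above.
server_version='v1.17.1+f5fb168'

# Drop the first "+" and everything after it; a string without "+" is left unchanged.
base_version="${server_version%%+*}"
echo "$base_version"   # prints v1.17.1
```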

Unable to pull from imagestream

With this configuration on Azure, I found that pulling from the ImageStream didn't work. I see -

$ oc describe pod pmml-scoring-service-77757499d7-qqjnq
...
Events:
  Type     Reason     Age                From                      Message
  ----     ------     ----               ----                      -------
  Normal   Scheduled  <unknown>          default-scheduler         Successfully assigned default/pmml-scoring-service-77757499d7-qqjnq to winworker-svad8
  Warning  Failed     27s (x3 over 54s)  kubelet, winworker-svad8  Back-off pulling image "image-registry.openshift-image-registry.svc:5000/default/scoring-service-base-win:latest"
  Normal   Pulling    12s (x3 over 54s)  kubelet, winworker-svad8  Pulling image "image-registry.openshift-image-registry.svc:5000/default/scoring-service-base-win:latest"
  Warning  Failed     12s (x3 over 54s)  kubelet, winworker-svad8  Failed to pull image "image-registry.openshift-image-registry.svc:5000/default/scoring-service-base-win:latest": rpc error: code = Unknown desc = Error response from daemon: Get https://image-registry.openshift-image-registry.svc:5000/v2/: dial tcp: lookup image-registry.openshift-image-registry.svc: no such host

However I found the following did work -

  • log in to the Windows server
  • add the public imagestream address to C:\ProgramData\docker\config\daemon.json as an insecure registry
  • Restart-Service *docker*
  • docker login ... public imagestream address
  • docker pull public imagestream address/default/scoring-service-base-win:latest
  • docker tag public imagestream address/default/scoring-service-base-win:latest scoring-service-base-win:latest
  • update the yaml to use image address scoring-service-base-win:latest

It would be good to know if this configuration is designed to use the ImageStream, or if there is a way to push an image to the Windows server.

Thanks !
