
ocp4-upi-powervm-hmc's People

Contributors

cs-zhang


ocp4-upi-powervm-hmc's Issues

Fatal undefined variable error when running ansible-playbook -e @vars-powervm.yaml playbooks/main.yaml

Seeing this error when running: ansible-playbook -e @vars-powervm.yaml playbooks/main.yaml

fatal: [9.114.219.127]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'product_repo'\n\nThe error appears to be in '/home/wilder/ocp4.8-install/ocp4-upi-powervm-hmc/playbooks/ocp4-helpernode/tasks/set_facts_.yaml': line 41, column 5, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n - set_fact:\n ^ here\n"}

I found that commit 4960f6d removed these lines from examples/vars-powervm.yaml.

 setup_registry:
   deploy: false
-  autosync_registry: true
-  registry_image: docker.io/ibmcom/registry-ppc64le:2.6.2.5
-  local_repo: "ocp4/openshift4"
-  product_repo: "openshift-release-dev"
-  release_name: "ocp-release"
-  release_tag: "4.6.5-ppc64le"

When I added them back into my vars-powervm.yaml the error went away.

(FYI: After making the change I ran into a later error, "/bin/sh: openshift-install: command not found", but I think this is unrelated.)
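A pre-flight check along these lines could flag the missing keys before the playbook runs. This is only a sketch, not part of the repo: the function name `missing_registry_keys` is hypothetical, the vars file is represented as an already-parsed dict, and the key list is simply the set of lines removed in the diff above. The error itself only names `product_repo`, so treating all six keys as required is an assumption.

```python
# Keys that the diff above shows were removed from examples/vars-powervm.yaml.
# Assumption: the playbook may reference any of them; the error only named product_repo.
REQUIRED_REGISTRY_KEYS = [
    "autosync_registry",
    "registry_image",
    "local_repo",
    "product_repo",
    "release_name",
    "release_tag",
]


def missing_registry_keys(vars_data):
    """Return the setup_registry keys that are absent from the parsed vars dict."""
    registry = vars_data.get("setup_registry") or {}
    return [key for key in REQUIRED_REGISTRY_KEYS if key not in registry]


# The trimmed example file only carries 'deploy', so every other key is reported.
trimmed = {"setup_registry": {"deploy": False}}
print(missing_registry_keys(trimmed))
```

An empty list means every key from the diff is present under `setup_registry`.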

Update ocp4-helpernode submodule to avoid ansible bug

The ansible-core package from EPEL on RHEL 8 is at version 2.13.3, and running this repo's playbooks with it fails with this error:

TASK [Enable restart always for critical services] *********************************************************************************************************************************************************
fatal: [localhost]: FAILED! => {"msg": "Invalid data passed to 'loop', it requires a list, got this instead: ['httpd', 'named', 'haproxy'] + [ 'dhcpd' ]. Hint: If you passed a list/dict of just one element, try adding wantlist=True to your lookup invocation or use q/query instead of lookup."}

This error was fixed by PR redhat-cop/ocp4-helpernode#300.

Can you update the ocp4-helpernode submodule, @cs-zhang?

Installer failed/timed out at master-0 node config step

I am trying to use this project to install OCP 4.10 on a Tech Zone/CECC kit. I followed the README instructions and vars-powervm.yaml looks good to me, yet the installation repeatedly fails with a timeout during TASK [nodes-config : Check connection] for the master-0 node. The timeout is set to 2700s (45 min) in the vars YAML file; the task waited the full 45 minutes and then failed.

TASK [ocp-config : Skip config if install workdir exist] **************************************************************************
ok: [129.40.126.241]

TASK [ocp-config : meta] **********************************************************************************************************

PLAY [Check and configure bootstrap node] *****************************************************************************************

TASK [nodes-config : Check connection] ********************************************************************************************
ok: [129.40.126.242]

TASK [nodes-config : Configure node] **********************************************************************************************
[WARNING]: Distribution redhat 4.10 on host 129.40.126.242 should use /usr/bin/python, but is using /usr/libexec/platform-python,
since the discovered platform python interpreter was not present. See https://docs.ansible.com/ansible-
core/2.12/reference_appendices/interpreter_discovery.html for more information.
changed: [129.40.126.242]

PLAY [Check and configure control-plane nodes] ************************************************************************************

TASK [nodes-config : Check connection] ********************************************************************************************
fatal: [129.40.126.243]: FAILED! => {"changed": false, "elapsed": 2715, "msg": "timed out waiting for ping module test: Failed to connect to the host via ssh: ssh: connect to host 129.40.126.243 port 22: Connection refused"}

NO MORE HOSTS LEFT ****************************************************************************************************************

PLAY RECAP ************************************************************************************************************************
129.40.126.241             : ok=141  changed=55   unreachable=0    failed=0    skipped=135  rescued=1    ignored=0   
129.40.126.242             : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
129.40.126.243             : ok=0    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0   

[root@p664-bastion ocp4-upi-powervm-hmc]#

The logs contain no useful information, even after raising the log level to debug. I only see this block repeated in /var/log/messages every 30 seconds:

Oct 19 01:21:57 p664-bastion systemd[1]: helper-tftp.service: Succeeded.
Oct 19 01:22:27 p664-bastion systemd[1]: helper-tftp.service: Service RestartSec=30s expired, scheduling restart.
Oct 19 01:22:27 p664-bastion systemd[1]: helper-tftp.service: Scheduled restart job, restart counter is at 16735.
Oct 19 01:22:27 p664-bastion systemd[1]: Stopped Starts TFTP on boot because of reasons.
Oct 19 01:22:27 p664-bastion systemd[1]: Started Starts TFTP on boot because of reasons.
Oct 19 01:22:27 p664-bastion systemd[1]: helper-tftp.service: Succeeded.
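The Ansible failure above reports "Connection refused" rather than a timeout. That typically means the address answered but nothing was accepting connections on port 22 at that moment. A small probe run from the bastion can distinguish the two cases while the node is booting. This is a sketch only; `probe_tcp` is a hypothetical helper, not something the playbooks provide.

```python
import socket


def probe_tcp(host, port=22, timeout=5.0):
    """Classify one TCP connection attempt as 'open', 'refused', 'timeout' or 'error'."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "open"
    except ConnectionRefusedError:
        # The peer actively rejected the connection: reachable, but no listener.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout: host down or packets dropped.
        return "timeout"
    except OSError:
        # Anything else, for example "no route to host".
        return "error"
    finally:
        sock.close()


# Example call (uses the master-0 address from the log above):
#   print(probe_tcp("129.40.126.243"))
```

Repeating the probe every few seconds during the 45-minute wait would show whether the node ever moves from "timeout" to "refused" to "open".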

LPAR netboot error

I've installed everything as per the steps, but when I run the playbook:

ansible-playbook -e @vars-powervm.yaml playbooks/main.yaml

it keeps running until it reaches this stage:

+++++++++++++++++++++++
TASK [bootup-nodes : netboot bootstrap node] *********************************************************************************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": "orig_mac=02:DE:E7:68:23:02\npvm_mac=echo ${orig_mac//:}\npvmcec=S922a\npvmlpar=ocp4-bootstrap\npvm_profile=ssh -o StrictHostKeyChecking=no [email protected]. \"lssyscfg -r lpar -m ${pvmcec} --filter lpar_names=${pvmlpar} -F curr_profile\"\nremote_cmd="lpar_netboot -f -t ent -m ${pvm_mac} -s auto -d auto ${pvmlpar} ${pvm_profile} ${pvmcec}"\nssh -o StrictHostKeyChecking=no [email protected]. "${remote_cmd}"\n", "delta": "0:02:17.963866", "end": "2021-02-26 09:08:29.950819", "msg": "non-zero return code", "rc": 1, "start": "2021-02-26 09:06:11.986953", "stderr": "lpar_netboot: Error : Close command sent/bin/stty: standard input: Inappropriate ioctl for device\nlpar_netboot: The network boot ended in an error.", "stderr_lines": ["lpar_netboot: Error : Close command sent/bin/stty: standard input: Inappropriate ioctl for device", "lpar_netboot: The network boot ended in an error."], "stdout": "# Connecting to ocp4-bootstrap\n# Connected\n# Checking for power off.\n# Power off the node\n# Wait for power off.\n# Power off complete.\n# Power on ocp4-bootstrap to Open Firmware.\n# Power on complete.\n# Getting adapter location codes.\nBOOTP initiated!\n# bootp sent over network.\n Parameters: \r\n---------------- \r\nchosen-network-type = ethernet,auto,none,auto\r\nserver IP = 0.0.0.0\r\nclient IP = 0.0.0.0\r\ngateway IP = 0.0.0.0\r\ndevice = /vdevice/l-lan@30000002\r\nMAC address = 02 de e7 68 23 02 \r\nloc-code = U9009.22A.zzzzzzz-V20-C2-T1\r\n\r\nBOOTP request retry attempt: 1 \r\nBOOTP request retry attempt: 2 \r\nBOOTP request retry attempt: 3 \r\nBOOTP request retry attempt: 4 \r\n\t!BA01B015", "stdout_lines": ["# Connecting to ocp4-bootstrap", "# Connected", "# Checking for power off.", "# Power off the node", "# Wait for power off.", "# Power off complete.", "# Power on ocp4-bootstrap to Open Firmware.", "# Power on complete.", "# Getting adapter location codes.", "BOOTP initiated!", "# bootp sent over 
network.", " Parameters: ", "---------------- ", "chosen-network-type = ethernet,auto,none,auto", "server IP = 0.0.0.0", "client IP = 0.0.0.0", "gateway IP = 0.0.0.0", "device = /vdevice/l-lan@30000002", "MAC address = 02 de e7 68 23 02 ", "loc-code = U9009.22A.zzzzzz-V20-C2-T1", "", "BOOTP request retry attempt: 1 ", "BOOTP request retry attempt: 2 ", "BOOTP request retry attempt: 3 ", "BOOTP request retry attempt: 4 ", "\t!BA01B015"]}

+++++++++++++++++++++++++++

I use static IPs!

Thanks
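The stdout in the failure above is hard to read as a single JSON string. A short parser can pull out the parameters and count the BOOTP retries. This is a sketch under assumptions: `summarize_netboot` is a hypothetical helper, and the sample text is copied from a few lines of the output above rather than from a full run. In that output the server, client and gateway IPs are all 0.0.0.0, which may be worth comparing against the static-IP setup.

```python
def summarize_netboot(stdout):
    """Extract 'key = value' parameters and count BOOTP retries in lpar_netboot output."""
    params = {}
    retries = 0
    for raw in stdout.splitlines():
        line = raw.strip()
        if line.startswith("BOOTP request retry attempt:"):
            retries += 1
        elif " = " in line:
            key, value = line.split(" = ", 1)
            params[key.strip()] = value.strip()
    return params, retries


# A few lines taken from the stdout field in the error above.
sample = (
    "server IP = 0.0.0.0\r\n"
    "client IP = 0.0.0.0\r\n"
    "gateway IP = 0.0.0.0\r\n"
    "device = /vdevice/l-lan@30000002\r\n"
    "BOOTP request retry attempt: 1 \r\n"
    "BOOTP request retry attempt: 2 \r\n"
)
params, retries = summarize_netboot(sample)
print(params["server IP"], params["client IP"], retries)
```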

update_ignition_bootstrap.py is missing filesystem stanza for chrony.conf entry

Issue is with: ocp4-playbooks/playbooks/roles/ocp-config/files/update_ignition_bootstrap.py
The script fails to write the "filesystem" stanza for the chrony.conf file.
Correct content needs to look like this:
...
    if os.path.isfile('/tmp/chrony.conf.tmp'):
        with open("/tmp/chrony.conf.tmp", "rb") as chronyconf:
            chrony_b64 = base64.standard_b64encode(chronyconf.read()).decode().strip()
        files.append(
            {
                'filesystem': 'root',
                'path': '/etc/chrony.conf',
                'user': {
                    'name': 'root'
                },
                'mode': 420,
                'contents': {
                    'source': 'data:text/plain;charset=utf-8;base64,' + chrony_b64,
                    'verification': {}
                },
                'overwrite': True
            })
....
Note the added line:
'filesystem': 'root',

With that change I was able to successfully install OCP 4.5 with the provided playbooks.
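For anyone who wants to check the shape of the entry outside the playbook, the same dict can be built in isolation. This is a standalone sketch, not the script itself: `chrony_file_entry` is a hypothetical wrapper, and the sample chrony.conf line (with a documentation-range address) is invented for illustration. Note that the decimal mode 420 is octal 0644.

```python
import base64


def chrony_file_entry(chrony_conf: bytes) -> dict:
    """Build the Ignition file entry for /etc/chrony.conf, including 'filesystem'."""
    chrony_b64 = base64.standard_b64encode(chrony_conf).decode().strip()
    return {
        'filesystem': 'root',  # the line that was missing
        'path': '/etc/chrony.conf',
        'user': {'name': 'root'},
        'mode': 420,  # decimal 420 == octal 0o644
        'contents': {
            'source': 'data:text/plain;charset=utf-8;base64,' + chrony_b64,
            'verification': {},
        },
        'overwrite': True,
    }


# Hypothetical sample content; the real script reads /tmp/chrony.conf.tmp.
sample_conf = b"server 192.0.2.1 iburst\n"
entry = chrony_file_entry(sample_conf)

# Decode the data URL again to confirm the content survives the round trip.
encoded = entry['contents']['source'].split('base64,', 1)[1]
print(base64.standard_b64decode(encoded) == sample_conf, oct(entry['mode']))
```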
