
assisted-service's Introduction

assisted-service

Swagger API

Go Report Card License Apache

About

This repository provides a service that installs OpenShift. Its main benefits are minimal prerequisites on the user's infrastructure and comprehensive pre-flight validations that ensure a successful installation. The service exposes a REST API, or it can be deployed as an Operator that exposes a Kubernetes-native API via Custom Resources. A UI that uses the REST API is also available.

The Assisted Service can currently install clusters with highly-available control planes (3 hosts and above) and can also install Single-Node OpenShift (SNO). Highly available clusters are configured to use OpenShift's baremetal platform (typically used in bare metal IPI deployments), while SNO uses the none platform (typically used in UPI deployments).

The basic flow for creating a new OpenShift cluster using the Assisted Service via the UI or REST API is:

  1. Create a new Cluster resource with the minimal required properties.
  2. Generate and download a bootable image which is customized for that cluster. This image is based on RHCOS and is customized to automatically run an agent upon boot.
  3. Boot the hosts that will form the cluster with the image from the previous step. The boot method is left to the user (e.g., USB drive, virtual media, PXE, etc.).
  4. The agent running on each host contacts the Assisted Service via REST API and performs discovery (sends hardware inventory and connectivity information).
  5. The UI guides the user through the installation, with the service performing validations along the way. Alternatively, this can be done via API.
  6. Once all validations pass, the user may initiate the installation. Progress may be viewed via the UI or API, and logs are made available for download directly from the service.
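
For example, step 1 can be done directly against the REST API. The following is a minimal sketch only, assuming SERVICE_URL points at a running Assisted Service and pull_secret.json holds your pull secret (both names are illustrative); the field names follow the v2 API:

# Sketch: register a new cluster via the v2 REST API
jq -n --arg ps "$(cat pull_secret.json)" \
  '{name: "my-cluster", openshift_version: "4.14", base_dns_domain: "example.com", pull_secret: $ps}' |
  curl -s -X POST "$SERVICE_URL/api/assisted-install/v2/clusters" \
    -H "Content-Type: application/json" -d @-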

Demos and blog posts

Below are some recent demos and blog posts:

User documentation

By continuing to read this document you will learn how to build and deploy Assisted Service. If you are interested in using Assisted Service to deploy an OCP cluster, please refer to the User Documentation.

Development Prerequisites

  1. Docker or Podman.
    For Podman, make sure to enable the Podman socket and update skipper.yaml to map it properly (see the sketch after this list).
  2. skipper https://github.com/stratoscale/skipper
  3. minikube (for tests)
  4. kubectl
  5. Python modules: pip install waiting
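
For the Podman option above, a possible setup is sketched below; the socket path and the skipper.yaml mapping are illustrative and depend on your environment:

# Sketch: enable the rootless Podman API socket
systemctl --user enable --now podman.socket

# then map it for skipper, e.g. in skipper.yaml (illustrative):
# volumes:
#   - $XDG_RUNTIME_DIR/podman/podman.sock:/var/run/docker.sock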

First Setup

To push your build to a Docker registry, you first need to change the default target.

  1. Create a quay.io or Docker Hub account if you don't already have one. These instructions refer to quay.io; Docker Hub is similar.
  2. Create a repository called assisted-service.
  3. Make sure your ~/.docker/config.json file is set up to point to your account. For quay.io, go to quay.io -> User Settings, and click "Generate Encrypted Password" under "Docker CLI Password".
  4. Log in to quay.io using docker login quay.io.
  5. Set the SERVICE environment variable to your Docker registry repository, with a tag of your choice, e.g., "test":
export SERVICE=quay.io/<username>/assisted-service:<tag>

For the first build of the build container run:

skipper build assisted-service-build

Build

skipper make all

Generate code after swagger changes

After every change to the API (swagger.yaml), the code must be regenerated and the build must pass.

skipper make generate-from-swagger

Testing

More information is available here: Assisted Installer Testing

Update Discovery Image base OS

If you want to update the underlying operating system image used by the discovery ISO, follow these steps:

  1. Choose the base OS image you want to use

    1. RHCOS: https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/
    2. Fedora CoreOS: https://getfedora.org/en/coreos/download?tab=metal_virtualized&stream=stable
  2. Build the new ISO generator image

    # Example with RHCOS
    BASE_OS_IMAGE=https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/pre-release/latest/rhcos-4.6.0-0.nightly-2020-08-26-093617-x86_64-live.x86_64.iso make build-assisted-iso-generator-image
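
    The same target can also be pointed at a Fedora CoreOS live ISO; the URL below is a placeholder, to be replaced with an actual download link from the Fedora CoreOS page listed above:

    # Example with Fedora CoreOS (placeholder URL)
    BASE_OS_IMAGE=<fedora-coreos-live-iso-url> make build-assisted-iso-generator-image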

Deployment

Deploy to minikube

This is a full system deployment: it contains all the components the service needs for its operations (where implemented), including an S3 service (MinIO) and a database. The image generator is used to create the images in the deployed S3 service, and the relevant bucket is created there as well.

skipper make deploy-all

Note: when deployed in minikube, minikube tunnel needs to be running, as some services are of type LoadBalancer and need to be exposed:

nohup minikube tunnel &>/dev/null

Deploy to OpenShift

Besides the default minikube deployment, the service supports deployment to an OpenShift cluster, using an ingress as the access point to the service.

skipper make deploy-all TARGET=oc-ingress

This deployment option has multiple optional parameters that should be used if you are not the admin of the cluster:

  1. APPLY_NAMESPACE - True by default. Tries to deploy the "assisted-installer" namespace; if you are not the cluster admin or don't have permission for this operation, you may skip namespace deployment.
  2. INGRESS_DOMAIN - By default the deployment script tries to get the domain prefix from the OpenShift ingress controller. If you don't have access to it, you may specify the domain yourself. For example: apps.ocp.prod.psi.redhat.com
  3. DISABLE_TLS - Depending on the target environment, the routes being used are HTTPS. Setting this to true will create HTTP routes instead.

To set the parameters, simply add them at the end of the command, for example:

skipper make deploy-all TARGET=oc-ingress APPLY_NAMESPACE=False INGRESS_DOMAIN=apps.ocp.prod.psi.redhat.com

Note: All deployment configurations are under the deploy directory in case more detailed configuration is required.

Deploy UI

This service supports optional UI deployment.

skipper make deploy-ui

* If you are using Podman, run the above command without skipper.

For OpenShift users, see the service deployment options for the OpenShift platform above.

Deploy Monitoring

Note: This target is intended for development purposes only.

This allows you to deploy Prometheus and Grafana, already integrated with the Assisted Installer:

  • On Minikube
# Step by step
make deploy-olm
make deploy-prometheus
make deploy-grafana

# Or just all-in
make deploy-monitoring
  • On OpenShift
# Step by step
make deploy-prometheus TARGET=oc-ingress APPLY_NAMESPACE=false
make deploy-grafana TARGET=oc-ingress APPLY_NAMESPACE=false

# Or just all-in
make deploy-monitoring TARGET=oc-ingress APPLY_NAMESPACE=false

NOTE: To expose the monitoring UIs on your local environment, you can follow these steps:

kubectl config set-context $(kubectl config current-context) --namespace assisted-installer

# To expose Prometheus
kubectl port-forward svc/prometheus-k8s 9090:9090

# To expose Grafana
kubectl port-forward svc/grafana 3000:3000

Now you just need to access http://127.0.0.1:3000 for your Grafana deployment, or http://127.0.0.1:9090 for Prometheus.

Deploy by tag

This feature is for internal use and is not recommended for external users. This option selects the tag that will be used for each dependency. If deploy-all uses a new tag, the update is done automatically and there is no need to restart or roll out any deployment.

Deploy images according to the manifest:

skipper make deploy-all DEPLOY_MANIFEST_PATH=./assisted-installer.yaml

Deploy images according to the manifest in the assisted-installer-deployment repo (requires a git tag/branch/hash):

skipper make deploy-all DEPLOY_MANIFEST_TAG=master

Deploy all the images with the same tag. The tag is not validated, so you need to make sure it actually exists.

skipper make deploy-all DEPLOY_TAG=<tag>

The default tag is latest.

Deploy without a Kubernetes cluster

There are two ways the assisted service can be deployed without using a Kubernetes cluster:

Using containers on your local host

In this scenario the service and associated components are deployed onto your local host as a pod using Podman.

See the README for details.

Storage

assisted-service maintains a cache of openshift-baremetal-install binaries at $WORK_DIR/installercache/. Persistent storage can optionally be mounted there to persist the cache across container restarts. However, that storage should not be shared across multiple assisted-service processes.
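
For example, when running the service as a single container with Podman, the cache directory could be persisted with a named volume. This is only a sketch and assumes the service's work directory is /data; adjust the mount path and the rest of the container configuration to your deployment:

# Sketch: persist the installer cache for a single assisted-service container
podman volume create assisted-installercache
podman run -d --name assisted-service \
  -v assisted-installercache:/data/installercache \
  quay.io/edge-infrastructure/assisted-service:latest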

Cache Expiration

Currently there is no mechanism to expire openshift-baremetal-install binaries out of the cache. The recommendation for now is to allow the cache to use the container's own local storage that will vanish when the Pod gets replaced, for example during upgrade. That will prevent the cache from growing forever while allowing it to be effective most of the time.

Troubleshooting

A document that can assist troubleshooting: link

Documentation

Markdown formatted documentation is available in the docs directory.

Linked repositories

coreos_installation_iso

https://github.com/oshercc/coreos_installation_iso

This image is in charge of generating the Fedora CoreOS image used to install the host with the relevant Ignition file.
The image is uploaded to the deployed S3 under the name template "installer-image-<cluster-id>".

Assisted Service on console.redhat.com

The Assisted Installer is also available to users as SaaS hosted at console.redhat.com.

More information is available here: Assisted Installer on console.redhat.com

Setting a custom discovery ISO password

It's possible to modify the discovery ISO (via the API) to enable password login for troubleshooting purposes.

More information is available here: Set discovery ISO user password example
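
As a rough sketch (SERVICE_URL, INFRA_ENV_ID, and the password hash are placeholders, and the exact payload is described in the linked example), the discovery ISO can be customized by patching the infra-env with an Ignition override that sets a password for the core user:

# Sketch: set a password for the "core" user in the discovery ISO
curl -s -X PATCH "$SERVICE_URL/api/assisted-install/v2/infra-envs/$INFRA_ENV_ID" \
  -H "Content-Type: application/json" \
  -d '{"ignition_config_override": "{\"ignition\": {\"version\": \"3.1.0\"}, \"passwd\": {\"users\": [{\"name\": \"core\", \"passwordHash\": \"<crypted-password-hash>\"}]}}"}'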

Contributing

Please, read our CONTRIBUTING guidelines for more info about how to create, document, and review PRs.

assisted-service's People

Contributors

avishayt, carbonin, crystalchun, danielerez, danmanor, dependabot[bot], djzager, eliorerz, empovit, eranco74, filanov, flaper87, mkowalski, omertuc, openshift-edge-bot, ori-amizur, oshercc, osherdp, paul-maidment, pawanpinjarkar, razregev, rccrdpccl, red-hat-konflux[bot], rollandf, rwsu, slaviered, tsorya, ybettan, yevgeny-shnaidman, yuvigold


assisted-service's Issues

waiting for host

Hello, I use ''podman play kube --configmap okd-configmap.yml pod.yml'' to run the service and install OKD, changing only okd-configmap.yml. After booting the ISO generated by the web UI, the web page always displays "waiting for host". The firewall has been turned off, the host IP is 10.0.7.1, and the web interface can only be accessed from the virtual machine; other computers cannot access it. Are there any settings that I have made wrong?
Thanks
Screenshot 2024-01-27 21:10:25

Screenshot 2024-01-27 21:10:39

apiVersion: v1
kind: ConfigMap
metadata:
name: config
data:
ASSISTED_SERVICE_HOST: 127.0.0.1:8090
ASSISTED_SERVICE_SCHEME: http
AUTH_TYPE: none
DB_HOST: 127.0.0.1
DB_NAME: installer
DB_PASS: admin
DB_PORT: "5432"
DB_USER: admin
DEPLOY_TARGET: onprem
DISK_ENCRYPTION_SUPPORT: "false"
DUMMY_IGNITION: "false"
ENABLE_SINGLE_NODE_DNSMASQ: "false"
HW_VALIDATOR_REQUIREMENTS: '[{"version":"default","master":{"cpu_cores":4,"ram_mib":16384,"disk_size_gb":100,">
IMAGE_SERVICE_BASE_URL: http://10.0.7.1:8888
IPV6_SUPPORT: "true"
ISO_IMAGE_TYPE: "full-iso"
LISTEN_PORT: "8888"
NTP_DEFAULT_SERVER: ""
POSTGRESQL_DATABASE: installer
POSTGRESQL_PASSWORD: admin
POSTGRESQL_USER: admin
PUBLIC_CONTAINER_REGISTRIES: 'quay.io'
SERVICE_BASE_URL: http://127.0.0.1:8090
STORAGE: filesystem
OS_IMAGES: '[{"openshift_version":"4.12","cpu_architecture":"x86_64","url":"https://builds.coreos.fedoraprojec>
RELEASE_IMAGES: '[{"openshift_version":"4.12","cpu_architecture":"x86_64","cpu_architectures":["x86_64"],"url">
ENABLE_UPGRADE_AGENT: "false"
ENABLE_OKD_SUPPORT: "true"


L2ConnectivityMajorityGroup is empty

I am trying to deploy a 5-node IPv6-only cluster (OCP 4.12) using ACM (2.8). The deployment is stuck with status "insufficient- no connectivity to majority of host in the cluster".
All hosts are able to ping each other successfully.

L3ConnectivityGroup has all node UUIDs listed.
L2ConnectivityGroup is empty, causing this error. Is this a bug?

connectivityMajorityGroups: '{"XXXX:XXXX:XXXX:XXXX::/64":[],"IPv4":["52657e37-3fd0-9798-c148-333c8b596562","c26a7a93-9abd-5dd8-94eb-a215b7908b58","c35283f4-d545-5ce7-6726-c66bdef30e27"],"IPv6":["52657e37-3fd0-9798-c148-333c8b596562","c26a7a93-9abd-5dd8-94eb-a215b7908b58","c35283f4-d545-5ce7-6726-c66bdef30e27"]}'

Corresponding Code validation
https://github.com/openshift/assisted-service/blob/master/internal/host/validator.go#L690

Installer failed - Dependency failed for Generate New UUID for Boot Disk GPT

Hi,

This is my first time with the Assisted Installer.
I'm trying to deploy OpenShift 4.8.9 Single Node edition on a bare-metal server.
The installer failed with the job dev-disk-by\x2dlabel-boot.device (timed out).
As a consequence, I had this error:
Dependency failed for Generate New UUID for Boot Disk GPT.

Disk layout:
SDA : SSD 1.92T
SDB : SSD 1.92T
SDC : ISO

Extract from installer.logs :

Sep 20 14:23:41 bse19mw installer[2877]: time="2021-09-20T14:23:41Z" level=info msg="Setting efibootmgr to boot from disk"
Sep 20 14:23:41 bse19mw installer[2877]: time="2021-09-20T14:23:41Z" level=info msg="efibootmgr: ** Warning ** : Boot0006 has same label Red Hat Enterprise Linux\n"
Sep 20 14:23:45 bse19mw installer[2877]: time="2021-09-20T14:23:45Z" level=info msg="BootCurrent: 0006\nTimeout: 3 seconds\nBootOrder: 0007,2001,2003,0005,2002\nBoot0000* EFI USB Device (Linux File-Stor Gadget)\nBoot0001* EFI PXE 0 for IPv4 (08-00-38-BF-BA-86) \nBoot0002* EFI PXE 1 for IPv4 (08-00-38-BF-BA-87) \nBoot0003* EFI PXE 2 for IPv4 (08-00-38-BF-BA-88) \nBoot0004* EFI PXE 3 for IPv4 (08-00-38-BF-BA-89) \nBoot0005* Internal EFI Shell\nBoot0006* Red Hat Enterprise Linux\nBoot2001* EFI USB Device\nBoot2002* EFI DVD/CDROM\nBoot2003* EFI Network\nBoot0007* Red Hat Enterprise Linux\n"
Sep 20 14:23:45 bse19mw installer[2877]: time="2021-09-20T14:23:45Z" level=info msg="MirroredPercentageAbove4G: 0.00\nMirrorMemoryBelow4GB: false\n"
Sep 20 14:23:45 bse19mw installer[2877]: time="2021-09-20T14:23:45Z" level=info msg="Uploading logs and reporting status before rebooting the node 557742c9-40f3-92fa-90b3-04542055f162 for cluster 894f19a4-0fb8-415d-8526-01883054915e"

Support removed for rootless containers?

I can no longer run Assisted-Install using a rootless podman pod.
This has never been a problem before.

as myself

$ podman run --rm -ti quay.io/centos7/postgresql-12-centos7:latest bash
bash-4.2$ cgroup-limits 
Warning: Can't detect cpu quota from cgroups
Warning: Can't detect cpuset size from cgroups
Traceback (most recent call last):
  File "/usr/bin/cgroup-limits", line 143, in <module>
    "NUMBER_OF_CORES": get_number_of_cores()
  File "/usr/bin/cgroup-limits", line 76, in get_number_of_cores
    return min([l for l in limits if l])
ValueError: min() arg is an empty sequence
bash-4.2$ id  
uid=26(postgres) gid=26(postgres) groups=26(postgres),0(root)
bash-4.2$ 

as root

$ sudo podman run --rm -ti quay.io/centos7/postgresql-12-centos7:latest bash
bash-4.2$ cgroup-limits 
NO_MEMORY_LIMIT=true
MAX_MEMORY_LIMIT_IN_BYTES=9223372036854775807
MEMORY_LIMIT_IN_BYTES=9223372036854775807
NUMBER_OF_CORES=20
bash-4.2$ 

Was this an accidental or intentional change in behavior?

Make it work with OKD

The docs should contain a hint on how the Assisted Installer (AI) can be used to deploy OKD instead of OCP.

Inconsistencies possible between levels of assisted-install in hub and agent in discovery iso deployments

I was trying to install OCP 4.11.20 SNO with assisted-installer and the cluster node was stuck with this message repeating in the agent container log

time="2023-01-22T21:43:46Z" level=error msg="Failed to update node nuc10 installation status" func="github.com/openshift/assisted-installer/src/assisted_installer_controller.(*controller).waitAndUpdateNodesStatus" file="/go/src/github.com/openshift/assisted-installer/src/assisted_installer_controller/assisted_installer_controller.go:255" error="response status code does not match any response statuses defined for this endpoint in the swagger spec (status 409): {}" request_id=b86c5c8c-e90c-4aab-b6c5-aeb7893d0931

and on the server the assisted-service container log had this repeating

time="2023-01-22T21:43:46Z" level=error msg="failed to update host d001adcc-f6cc-f1cb-7b10-bb5011add461 progress" func="github.com/openshift/assisted-service/internal/bminventory.(*bareMetalInventory).V2UpdateHostInstallProgressInternal" file="/assisted-service/internal/bminventory/inventory.go:5301" error="Stages Joined isn't available for host role master bootstrap true" go-id=430202 host_id=d001adcc-f6cc-f1cb-7b10-bb5011add461 infra_env_id=f092fd1f-732a-49f6-a4b2-bc1c8fbd8952 pkg=Inventory request_id=7d8a05d4-dd03-46a8-acf9-17eb8ca71937
time="2023-01-22T21:43:46Z" level=info msg="Update host d001adcc-f6cc-f1cb-7b10-bb5011add461 install progress" func="github.com/openshift/assisted-service/internal/bminventory.(*bareMetalInventory).V2UpdateHostInstallProgressInternal" file="/assisted-service/internal/bminventory/inventory.go:5283" go-id=430202 host_id=d001adcc-f6cc-f1cb-7b10-bb5011add461 infra_env_id=f092fd1f-732a-49f6-a4b2-bc1c8fbd8952 pkg=Inventory request_id=7d8a05d4-dd03-46a8-acf9-17eb8ca71937

I was able to reproduce this on other hardware. My theory is that some incompatibility occurred between the hub, which had been running for several weeks on an older release, and the cluster node, which started with a newer release of the agent.

I would like to suggest that it might be better to reference a SHA digest of a compatible agent in the generated discovery ISO instead of using the latest tag, i.e. choose a better default for AgentDockerImg than quay.io/edge-infrastructure/assisted-installer-agent:latest

assisted service pods are not working

Hi all,

We have deployed ACM in our OCP cluster and are now trying to deploy a new cluster using ACM. For that we are using the assisted service for static IP allocation for the nodes, but our pods (assisted-image-service-xxxx & assisted-service-64c7b5cfff-sqt9z) are going into CrashLoopBackOff.

Below is a log excerpt; we are also attaching a .tar file for your reference. Please help us.


oc logs assisted-image-service-759c6bc744-snbgj
{"file":"/remote-source/app/pkg/imagestore/imagestore.go:149","func":"github.com/openshift/assisted-image-service/pkg/imagestore.(*rhcosStore).Populate.func1","level":"info","msg":"Downloading iso from http://172.90.11.40:8000/caas/rhcos/images/latest/rhcos-4.8.2-x86_64-live.x86_64.iso to /data/rhcos-full-iso-4.8-x86_64.iso","time":"2021-12-16T06:25:25Z"}
{"file":"/remote-source/app/main.go:40","func":"main.main","level":"fatal","msg":"Failed to populate image store: failed to download http://172.90.11.40:8000/caas/rhcos/images/latest/rhcos-4.8.2-x86_64-live.x86_64.iso: Get "http://172.90.11.40:8000/caas/rhcos/images/latest/rhcos-4.8.2-x86_64-live.x86_64.iso\": dial tcp 172.90.11.40:8000: i/o timeout\n","time":"2021-12-16T06:25:55Z"}

NOTE: we are able to wget & curl this URL from the same system.


BR,
Shashan
logs.zip

Cannot run unit-tests locally using Podman

The Makefile unit-test target should be adapted to work with Podman.

When I try to run them locally (skipper make unit-test) it fails with:

docker: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?.

Attempted with RHEL8.2

skipper make all is failing

I am behind a proxy, but the proxy settings are present in the environment. I get the following error running "skipper make all" with Podman.

INFO [config_reader] Config search paths: [./ /home/lansys/assisted-service /home/lansys /home /]
INFO [config_reader] Used config file .golangci.yml
INFO [lintersdb] Active 14 linters: [errcheck gci gocyclo gofmt goimports gosec gosimple govet ineffassign scopelint staticcheck typecheck unconvert unused]
INFO [loader] Go packages loading at mode 575 (files|deps|exports_file|imports|name|types_sizes|compiled_files) took 1.65132016s
WARN [runner] The linter 'scopelint' is deprecated (since v1.39.0) due to: The repository of the linter has been deprecated by the owner. Replaced by exportloopref.
INFO [runner/filename_unadjuster] Pre-built 0 adjustments in 158.186717ms
INFO [linters_context/goanalysis] analyzers took 0s with no stages
INFO [runner/skip_dirs] Skipped 1 issues from dir restapi/operations/operators by pattern restapi
INFO [runner/skip_dirs] Skipped 22 issues from dir restapi/operations/installer by pattern restapi
INFO [runner/skip_dirs] Skipped 2 issues from dir restapi/operations/manifests by pattern restapi
INFO [runner/skip_dirs] Skipped 2 issues from dir restapi/operations/events by pattern restapi
INFO [runner/skip_dirs] Skipped 2 issues from dir restapi/operations by pattern restapi
INFO [runner/skip_dirs] Skipped 6 issues from dir internal/installcfg/builder by pattern build
INFO [runner/skip_dirs] Skipped 4 issues from dir pkg/k8sclient by pattern client
INFO [runner/skip_dirs] Skipped 13 issues from dir cmd/agentbasedinstaller/client by pattern client
WARN [runner/nolint] Found unknown linters in //nolint directives: shadow
INFO [runner] Issues before processing: 2537, after processing: 0
INFO [runner] Processors filtering stat (out/in): filename_unadjuster: 2537/2537, skip_dirs: 2485/2537, cgo: 2537/2537, exclude: 268/281, nolint: 0/13, skip_files: 2537/2537, identifier_marker: 281/281, exclude-rules: 13/268, path_prettifier: 2537/2537, autogenerated_exclude: 281/2485
INFO [runner] processing took 17.042514ms with stages: nolint: 6.460019ms, identifier_marker: 3.078127ms, autogenerated_exclude: 2.375033ms, path_prettifier: 2.124256ms, exclude-rules: 1.773021ms, skip_dirs: 610.337µs, exclude: 311.375µs, cgo: 170.126µs, filename_unadjuster: 137.349µs, max_same_issues: 570ns, skip_files: 329ns, source_code: 323ns, diff: 256ns, fixer: 243ns, severity-rules: 215ns, sort_results: 204ns, uniq_by_line: 183ns, max_from_linter: 181ns, path_shortener: 138ns, path_prefixer: 126ns, max_per_file_from_linter: 103ns
INFO [runner] linters took 464.777764ms with stages: goanalysis_metalinter: 447.663062ms
INFO File cache stats: 0 entries of total size 0B
INFO Memory: 24 samples, avg is 47.2MB, max is 83.5MB
INFO Execution took 2.287641556s
podman-remote ps -q --filter "name=postgres" | xargs -r podman-remote kill && sleep 3
Cannot connect to Podman. Please verify your connection to the Linux system using podman system connection list, or try podman machine init and podman machine start to manage a new Linux VM
Error: unable to connect to Podman socket: Get "http://d/v4.1.1/libpod/_ping": dial unix /run//podman/podman.sock: connect: connection refused
podman-remote run -d --rm --tmpfs /var/lib/pgsql/data --name postgres -e POSTGRESQL_ADMIN_PASSWORD=admin -e POSTGRESQL_MAX_CONNECTIONS=10000 -p 127.0.0.1:5432:5432
quay.io/centos7/postgresql-12-centos7:latest
Cannot connect to Podman. Please verify your connection to the Linux system using podman system connection list, or try podman machine init and podman machine start to manage a new Linux VM
Error: unable to connect to Podman socket: Get "http://d/v4.1.1/libpod/_ping": dial unix /run//podman/podman.sock: connect: connection refused
make: *** [Makefile:510: run-db-container] Error 125

Hostprefix should not be required for non OpenshiftSDN / OVNkubernetes networktypes

For plugins like Calico, which don't use the hostPrefix for anything, it does not make sense to validate this field and force customers to set specific values just to pass validation, considering the field is not even used in these plugins.

install-config example:

apiVersion: v1
baseDomain: dummy.hostname.com
metadata:
  name: ref3
networking:
  machineNetwork:
    - cidr: 10.10.127.0/24
  clusterNetwork:
    - cidr: 10.128.0.0/14
  serviceNetwork:
    - 172.30.0.0/16
  networkType: Calico
compute:
- name: worker
  hyperthreading: Disabled
  replicas: 6
controlPlane:
  name: master
  hyperthreading: Enabled
  replicas: 3
  platform:
    baremetal: {}

Because I did not specify the hostPrefix, it fails with the following error in the assisted installer:
Invalid Cluster Network prefix: Host prefix, now 0, must be a positive integer.
https://github.com/openshift/assisted-service/blob/master/internal/network/cidr_validations.go#L148

It should only fail for OVN/Openshift SDN networktypes. See similar bugzilla for other installers: https://bugzilla.redhat.com/show_bug.cgi?id=1852112

save-partlabel and save-partindex coreos-installer arguments are not honored because the partition is formatted previously

The arguments that can be set to keep a partition of the installation disk are not honored, because the installation disk is formatted before the coreos-installer utility runs here, and then a disk performance test is run.

From the assisted service I can see that the arguments are passed properly to the coreos-installer binary:

Apr 13 09:20:07 snonode.virt01.eko4.cloud.lab.eng.bos.redhat.com installer[8571]: time="2022-04-13T09:20:07Z" level=info msg="Writing image and ignition to disk with arguments: [install --insecure -i /opt/openshift/master.ign --image-url 
http://10.19.140.20/rhcos-4.10.3-x86_64-metal.x86_64.raw.gz --save-partlabel data --append-karg ip=ens3:dhcp /dev/vda]"

but the partition is not saved to /dev/vda5 as expected:

[root@snonode ~]# lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
loop0    7:0    0  15.7G  0 loop /run/ephemeral
loop1    7:1    0 882.7M  0 loop /sysroot
sr0     11:0    1   102M  0 rom  
vda    252:0    0   120G  0 disk 
├─vda1 252:1    0     1M  0 part 
├─vda2 252:2    0   127M  0 part 
├─vda3 252:3    0   384M  0 part 
└─vda4 252:4    0   3.3G  0 part 

It works if I run coreos-installer with the same arguments on the command line. I also think the device formatting is something coreos-installer itself handles, judging by the help output:

        --preserve-on-error         
            Don't clear partition table on error
            
            If installation fails, coreos-installer normally clears the destination's partition table to prevent booting from invalid boot media.  Skip clearing the partition table as a
            debugging aid.

Pods work fine but unable to bring up VMs

When I try to bring up a VM, I get the following error: server error.
command SyncVMI failed: "LibvirtError(Code=1, Domain=10, Message='internal error: process exited while connecting to monitor: 2021-12-14T12:57:02.789388Z qemu-kvm: -blockdev {"driver":"file","filename":"/var/run/kubevirt-private/vmi-disks/rootdisk/disk.img","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}: Could not open '/var/run/kubevirt-private/vmi-disks/rootdisk/disk.img': Permission denied')"

Strange thing is that there is no var/run/kubevirt-private/vmi-disks/rootdisk/disk.img file on the node.

I am using NFS based storage to spin up Pods and VM. I have no issues with Pods.

Unable to create cluster with name starting with a number

Currently this line

clusterNameRegex    = "^([a-z]([-a-z0-9]*[a-z0-9])?)*$"

limits cluster names to start with a letter. Is there an underlying reason for that?

If not, I'd like to add a PR to widen the naming requirements to be able to name clusters starting with a number.

bundle.Dockerfile ref wrong location

After the bundle build, the bundle.Dockerfile references the wrong file locations.

output from the bundle build command

INFO[0000] Writing bundle.Dockerfile in /home/kni/assisted-installer-home-dir/assisted-service 
mv /home/kni/assisted-installer-home-dir/assisted-service/build/assisted-installer/bundle/temp1/* /home/kni/assisted-installer-home-dir/assisted-service/build/assisted-installer/bundle
mv /home/kni/assisted-installer-home-dir/assisted-service/build/assisted-installer/bundle/temp2/metadata /home/kni/assisted-installer-home-dir/assisted-service/build/assisted-installer/bundle
rm -rf /home/kni/assisted-installer-home-dir/assisted-service/build/assisted-installer/bundle/temp1
rm -rf /home/kni/assisted-installer-home-dir/assisted-service/build/assisted-installer/bundle/temp2


FROM scratch

LABEL operators.operatorframework.io.bundle.mediatype.v1=registry+v1
LABEL operators.operatorframework.io.bundle.manifests.v1=manifests/
LABEL operators.operatorframework.io.bundle.metadata.v1=metadata/
LABEL operators.operatorframework.io.bundle.package.v1=assisted-service-operator
LABEL operators.operatorframework.io.bundle.channels.v1=alpha
LABEL operators.operatorframework.io.metrics.builder=operator-sdk-v1.4.2
LABEL operators.operatorframework.io.metrics.mediatype.v1=metrics+v1
LABEL operators.operatorframework.io.metrics.project_layout=go.kubebuilder.io/v3
LABEL operators.operatorframework.io.test.config.v1=tests/scorecard/
LABEL operators.operatorframework.io.test.mediatype.v1=scorecard+v1
COPY build/assisted-installer/bundle/temp2/manifests /manifests/
COPY build/assisted-installer/bundle/temp2/metadata /metadata/

Only first mac-address interface mapped

I have multi-homed nodes (2 NICs), but only the first NIC gets its MAC address mapped to an interface name.

  • only d8:5e:d3:42:c8:a0 gets mapped to enp3s0f0
  • but d8:5e:d3:42:c8:a1 doesn't get mapped (to enp3s0f1).

Using the following parameter file:

kind: Cluster
static_network_config:
- interfaces:
  - name: enp3s0f0
    ipv4:
      address:
      - ip: 15.235.80.193
        prefix-length: 24
      dhcp: false
      enabled: true
    ipv6:
      enabled: false
    mtu: 1500
    mac-address: d8:5e:d3:42:c8:a0
    state: up
    type: ethernet
  - name: enp3s0f1
    ipv4:
      address:
      - ip: 192.168.100.10
        prefix-length: 24
      dhcp: false
      enabled: true
    ipv6:
      enabled: false
    mtu: 1500
    mac-address: d8:5e:d3:42:c8:a1      
    state: up
    type: ethernet
  routes:
    config:
    - destination: 0.0.0.0/0
      next-hop-address: 1.2.80.254
      next-hop-interface: enp3s0f0
      table-id: 254
  dns-resolver:
    config:
      search:
      - mylab.nl
      server:
      - 1.2.33.99
      - 1.1.1.1
aicli --debug create cluster dev2 --paramfile nodes.yaml
send: b'POST //api/assisted-install/v2/infra-envs HTTP/1.1\r\nHost: okd-assisted-installer.apps.workersno.lab01.mylab.eu\r\nAccept-Encoding: identity\r\nContent-Length: 9012\r\nAccept: application/json\r\nContent-Type: application/json\r\nUser-Agent: Swagger-Codegen/1.0.0/python\r\n\r\n'
send: b'{"name": "dev2", "ssh_authorized_key": "..", "pull_secret": "...", "static_network_config": [{"network_yaml": "dns-resolver:\\n  config:\\n    search:\\n    - mylab.com\\n    server:\\n    - 213.186.33.99\\n    - 1.1.1.1\\ninterfaces:\\n- ipv4:\\n    address:\\n    - ip: 1.2.80.193\\n      prefix-length: 24\\n    - ip: 1.2.97.99\\n      prefix-length: 24\\n    - ip: 1.2.97.100\\n      prefix-length: 24\\n    dhcp: false\\n    enabled: true\\n  ipv6:\\n    enabled: false\\n  mac-address: d8:5e:d3:42:c8:a0\\n  mtu: 1500\\n  name: enp3s0f0\\n  state: up\\n  type: ethernet\\n- ipv4:\\n    address:\\n    - ip: 192.168.100.10\\n      prefix-length: 24\\n    dhcp: false\\n    enabled: true\\n  ipv6:\\n    enabled: false\\n  mac-address: d8:5e:d3:42:c8:a1\\n  mtu: 1500\\n  name: enp3s0f1\\n  state: up\\n  type: ethernet\\nroutes:\\n  config:\\n  - destination: 0.0.0.0/0\\n    next-hop-address: 1.2.80.254\\n    next-hop-interface: enp3s0f0\\n    table-id: 254\\n", "mac_interface_map": [{"mac_address": "d8:5e:d3:42:c8:a0", "logical_nic_name": "enp3s0f0"}]}, {"network_yaml": "dns-resolver:\\n  config:\\n    search:\\n    - mylab.com\\n    server:\\n    - 213.186.33.99\\n    - 1.1.1.1\\ninterfaces:\\n- ipv4:\\n    address:\\n    - ip: 1.2.80.192\\n      prefix-length: 24\\n    dhcp: false\\n    enabled: true\\n  ipv6:\\n    enabled: false\\n  mac-address: d8:5e:d3:42:c8:6a\\n  mtu: 1500\\n  name: enp3s0f0\\n  state: up\\n  type: ethernet\\n- ipv4:\\n    address:\\n    - ip: 192.168.100.11\\n      prefix-length: 24\\n    dhcp: false\\n    enabled: true\\n  ipv6:\\n    enabled: false\\n  mac-address: d8:5e:d3:42:c8:6b\\n  mtu: 1500\\n  name: enp3s0f1.2101\\n  state: up\\n  type: vlan\\n  vlan:\\n    base-iface: enp3s0f1\\n    id: 2101\\n- name: enp0s20f0u9u3c2\\n  state: down\\n  type: ethernet\\nroutes:\\n  config:\\n  - destination: 0.0.0.0/0\\n    next-hop-address: 1.2.80.254\\n    next-hop-interface: enp3s0f0\\n    table-id: 254\\n", "mac_interface_map": [{"mac_address": "d8:5e:d3:42:c8:6a", "logical_nic_name": "enp3s0f0"}]}, {"network_yaml": "dns-resolver:\\n  config:\\n    search:\\n    - mylab.com\\n    server:\\n    - 213.186.33.99\\n    - 1.1.1.1\\ninterfaces:\\n- ipv4:\\n    address:\\n    - ip: 1.2.80.191\\n      prefix-length: 24\\n    dhcp: false\\n    enabled: true\\n  ipv6:\\n    enabled: false\\n  mac-address: d8:5e:d3:42:c9:90\\n  mtu: 1500\\n  name: enp3s0f0\\n  state: up\\n  type: ethernet\\n- ipv4:\\n    address:\\n    - ip: 192.168.100.12\\n      prefix-length: 24\\n    dhcp: false\\n    enabled: true\\n  ipv6:\\n    enabled: false\\n  mac-address: d8:5e:d3:42:c9:91\\n  mtu: 1500\\n  name: enp3s0f1.2101\\n  state: up\\n  type: vlan\\n  vlan:\\n    base-iface: enp3s0f1\\n    id: 2101\\nroutes:\\n  config:\\n  - destination: 0.0.0.0/0\\n    next-hop-address: 1.2.80.254\\n    next-hop-interface: enp3s0f0\\n    table-id: 254\\n", "mac_interface_map": [{"mac_address": "d8:5e:d3:42:c9:90", "logical_nic_name": "enp3s0f0"}]}, {"network_yaml": "dns-resolver:\\n  config:\\n    search:\\n    - mylab.com\\n    server:\\n    - 213.186.33.99\\n    - 1.1.1.1\\ninterfaces:\\n- ipv4:\\n    address:\\n    - ip: 1.2.54.21\\n      prefix-length: 24\\n    dhcp: false\\n    enabled: true\\n  ipv6:\\n    enabled: false\\n  mac-address: d0:50:99:ff:ab:ba\\n  mtu: 1500\\n  name: eno1\\n  state: up\\n  type: ethernet\\n- ipv4:\\n    address:\\n    - ip: 192.168.100.20\\n      prefix-length: 24\\n    - ip: 192.168.100.1\\n      prefix-length: 24\\n    - ip: 192.168.100.2\\n      prefix-length: 24\\n   
 dhcp: false\\n    enabled: true\\n  ipv6:\\n    enabled: false\\n  mac-address: d0:50:99:ff:ab:bb\\n  mtu: 1500\\n  name: enp4s0f1.2101\\n  state: up\\n  type: vlan\\n  vlan:\\n    base-iface: enp4s0f1\\n    id: 2101\\nroutes:\\n  config:\\n  - destination: 0.0.0.0/0\\n    next-hop-address: 1.2.54.254\\n    next-hop-interface: eno1\\n    table-id: 254\\n", "mac_interface_map": [{"mac_address": "d0:50:99:ff:ab:ba", "logical_nic_name": "eno1"}]}, {"network_yaml": "dns-resolver:\\n  config:\\n    search:\\n    - mylab.com\\n    server:\\n    - 213.186.33.99\\n    - 1.1.1.1\\ninterfaces:\\n- ipv4:\\n    address:\\n    - ip: 1.2.65.48\\n      prefix-length: 24\\n    dhcp: false\\n    enabled: true\\n  ipv6:\\n    enabled: false\\n  mac-address: d8:5e:d3:42:c9:24\\n  mtu: 1500\\n  name: enp4s0f0\\n  state: up\\n  type: ethernet\\n- ipv4:\\n    address:\\n    - ip: 192.168.100.21\\n      prefix-length: 24\\n    dhcp: false\\n    enabled: true\\n  ipv6:\\n    enabled: false\\n  mac-address: d8:5e:d3:42:c9:25\\n  mtu: 1500\\n  name: enp4s0f1.2101\\n  state: up\\n  type: vlan\\n  vlan:\\n    base-iface: enp4s0f1\\n    id: 2101\\nroutes:\\n  config:\\n  - destination: 0.0.0.0/0\\n    next-hop-address: 1.2.65.254\\n    next-hop-interface: enp4s0f0\\n    table-id: 254\\n", "mac_interface_map": [{"mac_address": "d8:5e:d3:42:c9:24", "logical_nic_name": "enp4s0f0"}]}, {"network_yaml": "dns-resolver:\\n  config:\\n    search:\\n    - mylab.com\\n    server:\\n    - 213.186.33.99\\n    - 1.1.1.1\\ninterfaces:\\n- ipv4:\\n    address:\\n    - ip: 1.2.65.47\\n      prefix-length: 24\\n    dhcp: false\\n    enabled: true\\n  ipv6:\\n    enabled: false\\n  mac-address: d8:5e:d3:42:c8:74\\n  mtu: 1500\\n  name: enp4s0f0\\n  state: up\\n  type: ethernet\\n- ipv4:\\n    address:\\n    - ip: 192.168.100.22\\n      prefix-length: 24\\n    dhcp: false\\n    enabled: true\\n  ipv6:\\n    enabled: false\\n  mac-address: d8:5e:d3:42:c8:75\\n  mtu: 1500\\n  name: enp4s0f1.2101\\n  state: up\\n  type: vlan\\n  vlan:\\n    base-iface: enp4s0f1\\n    id: 2101\\nroutes:\\n  config:\\n  - destination: 0.0.0.0/0\\n    next-hop-address: 1.2.65.254\\n    next-hop-interface: enp4s0f0\\n    table-id: 254\\n", "mac_interface_map": [{"mac_address": "d8:5e:d3:42:c8:74", "logical_nic_name": "enp4s0f0"}]}], "image_type": "full-iso", "cluster_id": "f7045b5c-6315-4841-836d-2e0fdfc212c7", "openshift_version": "4.10.0-0.okd-2022-07-09-073606", "cpu_architecture": "x86_64"}'
reply: 'HTTP/1.1 201 Created\r\n'
header: server: nginx/1.20.1
header: date: Mon, 25 Jul 2022 20:08:23 GMT
header: content-type: application/json
header: transfer-encoding: chunked
header: set-cookie: bd75429d2952548245bbda62f57a34bd=466e25cdee6ae9d872d6803320389a74; path=/; HttpOnly
2022-07-25 22:08:23,457 DEBUG https://okd-assisted-installer.apps.workersno.lab01.mylab.eu:443 "POST //api/assisted-install/v2/infra-envs HTTP/1.1" 201 None
2022-07-25 22:08:23,458 DEBUG response body: {"cluster_id":"f7045b5c-6315-4841-836d-2e0fdfc212c7","cpu_architecture":"x86_64","created_at":"2022-07-25T20:08:17.157948Z","download_url":"https://okd-assisted-installer.apps.workersno.lab01.mylab.eu:443/images/df6cca1d-bbae-42ef-9e88-78469dfe9d57?arch=x86_64&type=full-iso&version=4.10","email_domain":"Unknown","expires_at":"0001-01-01T00:00:00.000Z","href":"/api/assisted-install/v2/infra-envs/df6cca1d-bbae-42ef-9e88-78469dfe9d57","id":"df6cca1d-bbae-42ef-9e88-78469dfe9d57","kind":"InfraEnv","name":"dev2","openshift_version":"4.10","proxy":{},"pull_secret_set":true,"ssh_authorized_key":"...","static_network_config":"[{\"mac_interface_map\":[{\"logical_nic_name\":\"eno1\",\"mac_address\":\"d0:50:99:ff:ab:ba\"}],\"network_yaml\":\"dns-resolver:\\n  config:\\n    search:\\n    - mylab.com\\n    server:\\n    - 213.186.33.99\\n    - 1.1.1.1\\ninterfaces:\\n- ipv4:\\n    address:\\n    - ip: 1.2.54.21\\n      prefix-length: 24\\n    dhcp: false\\n    enabled: true\\n  ipv6:\\n    enabled: false\\n  mac-address: d0:50:99:ff:ab:ba\\n  mtu: 1500\\n  name: eno1\\n  state: up\\n  type: ethernet\\n- ipv4:\\n    address:\\n    - ip: 192.168.100.20\\n      prefix-length: 24\\n    - ip: 192.168.100.1\\n      prefix-length: 24\\n    - ip: 192.168.100.2\\n      prefix-length: 24\\n    dhcp: false\\n    enabled: true\\n  ipv6:\\n    enabled: false\\n  mac-address: d0:50:99:ff:ab:bb\\n  mtu: 1500\\n  name: enp4s0f1.2101\\n  state: up\\n  type: vlan\\n  vlan:\\n    base-iface: enp4s0f1\\n    id: 2101\\nroutes:\\n  config:\\n  - destination: 0.0.0.0/0\\n    next-hop-address: 1.2.54.254\\n    next-hop-interface: eno1\\n    table-id: 254\\n\"},{\"mac_interface_map\":[{\"logical_nic_name\":\"enp4s0f0\",\"mac_address\":\"d8:5e:d3:42:c8:74\"}],\"network_yaml\":\"dns-resolver:\\n  config:\\n    search:\\n    - mylab.com\\n    server:\\n    - 213.186.33.99\\n    - 1.1.1.1\\ninterfaces:\\n- ipv4:\\n    address:\\n    - ip: 1.2.65.47\\n      prefix-length: 24\\n    dhcp: false\\n    enabled: true\\n  ipv6:\\n    enabled: false\\n  mac-address: d8:5e:d3:42:c8:74\\n  mtu: 1500\\n  name: enp4s0f0\\n  state: up\\n  type: ethernet\\n- ipv4:\\n    address:\\n    - ip: 192.168.100.22\\n      prefix-length: 24\\n    dhcp: false\\n    enabled: true\\n  ipv6:\\n    enabled: false\\n  mac-address: d8:5e:d3:42:c8:75\\n  mtu: 1500\\n  name: enp4s0f1.2101\\n  state: up\\n  type: vlan\\n  vlan:\\n    base-iface: enp4s0f1\\n    id: 2101\\nroutes:\\n  config:\\n  - destination: 0.0.0.0/0\\n    next-hop-address: 1.2.65.254\\n    next-hop-interface: enp4s0f0\\n    table-id: 254\\n\"},{\"mac_interface_map\":[{\"logical_nic_name\":\"enp4s0f0\",\"mac_address\":\"d8:5e:d3:42:c9:24\"}],\"network_yaml\":\"dns-resolver:\\n  config:\\n    search:\\n    - mylab.com\\n    server:\\n    - 213.186.33.99\\n    - 1.1.1.1\\ninterfaces:\\n- ipv4:\\n    address:\\n    - ip: 1.2.65.48\\n      prefix-length: 24\\n    dhcp: false\\n    enabled: true\\n  ipv6:\\n    enabled: false\\n  mac-address: d8:5e:d3:42:c9:24\\n  mtu: 1500\\n  name: enp4s0f0\\n  state: up\\n  type: ethernet\\n- ipv4:\\n    address:\\n    - ip: 192.168.100.21\\n      prefix-length: 24\\n    dhcp: false\\n    enabled: true\\n  ipv6:\\n    enabled: false\\n  mac-address: d8:5e:d3:42:c9:25\\n  mtu: 1500\\n  name: enp4s0f1.2101\\n  state: up\\n  type: vlan\\n  vlan:\\n    base-iface: enp4s0f1\\n    id: 2101\\nroutes:\\n  config:\\n  - destination: 0.0.0.0/0\\n    next-hop-address: 1.2.65.254\\n    next-hop-interface: enp4s0f0\\n    table-id: 
254\\n\"},{\"mac_interface_map\":[{\"logical_nic_name\":\"enp3s0f0\",\"mac_address\":\"d8:5e:d3:42:c9:90\"}],\"network_yaml\":\"dns-resolver:\\n  config:\\n    search:\\n    - mylab.com\\n    server:\\n    - 213.186.33.99\\n    - 1.1.1.1\\ninterfaces:\\n- ipv4:\\n    address:\\n    - ip: 1.2.80.191\\n      prefix-length: 24\\n    dhcp: false\\n    enabled: true\\n  ipv6:\\n    enabled: false\\n  mac-address: d8:5e:d3:42:c9:90\\n  mtu: 1500\\n  name: enp3s0f0\\n  state: up\\n  type: ethernet\\n- ipv4:\\n    address:\\n    - ip: 192.168.100.12\\n      prefix-length: 24\\n    dhcp: false\\n    enabled: true\\n  ipv6:\\n    enabled: false\\n  mac-address: d8:5e:d3:42:c9:91\\n  mtu: 1500\\n  name: enp3s0f1.2101\\n  state: up\\n  type: vlan\\n  vlan:\\n    base-iface: enp3s0f1\\n    id: 2101\\nroutes:\\n  config:\\n  - destination: 0.0.0.0/0\\n    next-hop-address: 1.2.80.254\\n    next-hop-interface: enp3s0f0\\n    table-id: 254\\n\"},{\"mac_interface_map\":[{\"logical_nic_name\":\"enp3s0f0\",\"mac_address\":\"d8:5e:d3:42:c8:6a\"}],\"network_yaml\":\"dns-resolver:\\n  config:\\n    search:\\n    - mylab.com\\n    server:\\n    - 213.186.33.99\\n    - 1.1.1.1\\ninterfaces:\\n- ipv4:\\n    address:\\n    - ip: 1.2.80.192\\n      prefix-length: 24\\n    dhcp: false\\n    enabled: true\\n  ipv6:\\n    enabled: false\\n  mac-address: d8:5e:d3:42:c8:6a\\n  mtu: 1500\\n  name: enp3s0f0\\n  state: up\\n  type: ethernet\\n- ipv4:\\n    address:\\n    - ip: 192.168.100.11\\n      prefix-length: 24\\n    dhcp: false\\n    enabled: true\\n  ipv6:\\n    enabled: false\\n  mac-address: d8:5e:d3:42:c8:6b\\n  mtu: 1500\\n  name: enp3s0f1.2101\\n  state: up\\n  type: vlan\\n  vlan:\\n    base-iface: enp3s0f1\\n    id: 2101\\n- name: enp0s20f0u9u3c2\\n  state: down\\n  type: ethernet\\nroutes:\\n  config:\\n  - destination: 0.0.0.0/0\\n    next-hop-address: 1.2.80.254\\n    next-hop-interface: enp3s0f0\\n    table-id: 254\\n\"},{\"mac_interface_map\":[{\"logical_nic_name\":\"enp3s0f0\",\"mac_address\":\"d8:5e:d3:42:c8:a0\"}],\"network_yaml\":\"dns-resolver:\\n  config:\\n    search:\\n    - mylab.com\\n    server:\\n    - 213.186.33.99\\n    - 1.1.1.1\\ninterfaces:\\n- ipv4:\\n    address:\\n    - ip: 1.2.80.193\\n      prefix-length: 24\\n    - ip: 1.2.97.99\\n      prefix-length: 24\\n    - ip: 1.2.97.100\\n      prefix-length: 24\\n    dhcp: false\\n    enabled: true\\n  ipv6:\\n    enabled: false\\n  mac-address: d8:5e:d3:42:c8:a0\\n  mtu: 1500\\n  name: enp3s0f0\\n  state: up\\n  type: ethernet\\n- ipv4:\\n    address:\\n    - ip: 192.168.100.10\\n      prefix-length: 24\\n    dhcp: false\\n    enabled: true\\n  ipv6:\\n    enabled: false\\n  mac-address: d8:5e:d3:42:c8:a1\\n  mtu: 1500\\n  name: enp3s0f1\\n  state: up\\n  type: ethernet\\nroutes:\\n  config:\\n  - destination: 0.0.0.0/0\\n    next-hop-address: 1.2.80.254\\n    next-hop-interface: enp3s0f0\\n    table-id: 254\\n\"}]","type":"full-iso","updated_at":"2022-07-25T20:08:23.343048Z","user_name":"admin"}

creating storage class for virtualization

For virtualization workloads, I need a StorageClass. How do I create a storage class that uses local storage on the node? I configured the Local Storage Operator and it created a single PV for the whole disk. I need an SC that can support multiple PVCs using local storage on the node.

Incorrect modes in the discovery.ign template

Ignition storage file modes are represented in decimal while their natural representation in Linux is typically octal, i.e. 755, 644, etc.

These two modes are incorrect in the discovery.ign template:

  "path": "/usr/local/bin/agent-fix-bz1964591",
  "mode": 755,

and

  "path": "/usr/local/bin/okd-binaries.sh",
  "mode": 755,

both should be changed to:

    "mode": 493,

assisted-service is in a degraded state because its container (installer) has exited.

Running the crucible playbook for installation of an OpenShift 4.10.18 cluster with RHEL 8.4. After running the playbook "deploy_assisted_installer_onprem.yml", all the assisted-service containers are created, but the installer container is in a stopped state; because of that, assisted-service shows as degraded.

podman pod ps
POD ID NAME STATUS CREATED INFRA ID # OF CONTAINERS
d8550189ca2b assisted-service Degraded 20 hours ago 6ac2b790faf3 4
b28d83ab59e5 http_store_pod Running 2 days ago 43bd63e63a86 2

podman ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
c3d33f14e48c docker.io/library/registry:latest /etc/docker/regis... 3 weeks ago Up 2 days ago 0.0.0.0:5002->5000/tcp Mirror-Registry2
43bd63e63a86 registry-quay.sno.localdomain:5002/ubi8/pause:latest 2 days ago Up 2 days ago 0.0.0.0:80->8080/tcp b28d83ab59e5-infra
fc718330670e registry.centos.org/centos/httpd-24-centos7:latest /usr/bin/run-http... 2 days ago Up 2 days ago 0.0.0.0:80->8080/tcp http_store
6ac2b790faf3 registry-quay.sno.localdomain:5002/ubi8/pause:latest 20 hours ago Up 20 hours ago 0.0.0.0:8000->8000/tcp, 0.0.0.0:8080->8080/tcp, 0.0.0.0:8090->8090/tcp d8550189ca2b-infra
3c3b07228209 registry-quay.sno.localdomain:5002/ocpmetal/postgresql-12:v1.1 run-postgresql 20 hours ago Up 20 hours ago 0.0.0.0:8000->8000/tcp, 0.0.0.0:8080->8080/tcp, 0.0.0.0:8090->8090/tcp db
5737d15f4f11 registry-quay.sno.localdomain:5002/ocpmetal/ocp-metal-ui:latest /opt/bitnami/scri... 20 hours ago Up 20 hours ago 0.0.0.0:8000->8000/tcp, 0.0.0.0:8080->8080/tcp, 0.0.0.0:8090->8090/tcp gui
5613b17ed014 registry-quay.sno.localdomain:5002/ocpmetal/assisted-service:latest /assisted-service 20 hours ago Exited (1) 20 hours ago 0.0.0.0:8000->8000/tcp, 0.0.0.0:8080->8080/tcp, 0.0.0.0:8090->8090/tcp installer

After looking into the logs of the stopped container, we got the error message below:

time="2022-07-25T09:27:40Z" level=info msg="Setting log format: text" func=main.InitLogs file="/go/src/github.com/openshift/origin/cmd/main.go:159"
time="2022-07-25T09:27:40Z" level=info msg="Setting Log Level: info" func=main.InitLogs file="/go/src/github.com/openshift/origin/cmd/main.go:165"
time="2022-07-25T09:27:40Z" level=info msg="Starting bm service" func=main.main file="/go/src/github.com/openshift/origin/cmd/main.go:203"
time="2022-07-25T09:27:40Z" level=fatal msg="Failed to parse OS_IMAGES json "http:///discovery/rhcos-4.10.3-x86_64-live.x86_64.iso"" func=main.main.func1 file="/go/src/github.com/openshift/origin/cmd/main.go:196" error="json: cannot unmarshal string into Go value of type models.OsImages"

Observation:
I checked the template file used during deployment; it has the same OS images parameter:

POSTGRESQL_DATABASE=installer
POSTGRESQL_PASSWORD=admin
POSTGRESQL_USER=admin
DB_HOST=127.0.0.1
DB_PORT=5432
DB_USER=admin
DB_PASS=admin
DB_NAME=installer
IMAGE_SERVICE_BASE_URL=http://{{ host | default("ansible_fqdn") }}:8090
#SERVICE_BASE_URL=http://{{ host | default("ansible_fqdn") }}:8090
DEPLOY_TARGET=onprem
STORAGE=filesystem
DUMMY_IGNITION=false
#OS_IMAGES={{ assisted_service_openshift_versions }}
OS_IMAGES="{{ assisted_service_openshift_versions[openshift_version]['rhcos_image'] }}"
OPENSHIFT_VERSIONS={{ assisted_service_openshift_versions | to_json }}
ENABLE_SINGLE_NODE_DNSMASQ=true
PUBLIC_CONTAINER_REGISTRIES=quay.io
NTP_DEFAULT_SERVER=
IPV6_SUPPORT=true
AUTH_TYPE=none

SKIP_CERT_VERIFICATION=true

SELF_VERSION={{ assisted_service_image }}
INSTALLER_IMAGE={{ assisted_installer_image }}
CONTROLLER_IMAGE={{ assisted_installer_controller_image }}
AGENT_DOCKER_IMAGE={{ assisted_installer_agent_image }}

HW_VALIDATOR_MIN_DISK_SIZE_GIB=20

Format has changed and likely will continue to change for HW validation (Link: https://github.com/openshift/assisted-service/blob/master/onprem-environment#L19)

HW_VALIDATOR_REQUIREMENTS={{ assisted_installer_hardware_validation | to_json }}

Kindly suggest a solution for this.

Question about functionality provided

Hi, I am looking at the repo and trying to understand its functionality.
If I visit https://cloud.redhat.com/openshift/assisted-installer/clusters and try to create a cluster, it generates a discovery ISO that I can boot a node with; the node gets discovered in the dashboard so I can install OpenShift 4.

I am currently trying to do the same process for ppc64le nodes, so my idea was to use this repo, build the discovery ISO for ppc64le, and follow the same process in the cloud.redhat.com portal to install OpenShift 4.

Would something like the above be possible using the current repo?

The python client returns a list of dicts instead of a list of Cluster when invoking the list_clusters() method

Printing the type of the objects in the list with the following code:

clusters = api.list_clusters()
print([type(cluster) for cluster in clusters])

returns:

[<class 'dict'>]

If we get the cluster info from the API:

print([type(api.get_cluster(cluster["id"])) for cluster in clusters])

the type is correct:

[<class 'assisted_service_client.models.cluster.Cluster'>]

According to the specification in the swagger.yaml file the returned list should contain a list of clusters, not a list of dicts.

Question: Difference with OpenShift Hive

I was looking at the OpenShift Assisted Installer recently, and found that it's driven by the assisted service backend, which is hosted in this repo.

However, I was a bit confused about the relationship between this project and OpenShift Hive, because it looks like both projects address OpenShift provisioning, and Hive seems more cloud-native since it provides K8s CRs that allow people to describe the desired cluster to be provisioned.

I know the assisted service has integration with Hive, but I wonder why there are two projects and in which cases I should pick one over the other. Thanks!

SNO disconnted environment openshift cluster installer failed with assisted-service in failed state and its pod showing in degraded state

Installing a single-node OpenShift cluster via the crucible playbook; while running the assisted-installer on-prem job, it fails with the assisted-service pod in a degraded state.

podman pod ps
POD ID NAME STATUS CREATED INFRA ID # OF CONTAINERS
20daebe9ee8d assisted-service Degraded 28 hours ago 8b34257730b0 4
b142bdaae025 http_store_pod Running 30 hours ago 23905c0c0d9d 2

"SNO" installation on Azure support ? cfr. .vhd instead of discovery ISO

Azure doesn't support .iso images. To install the VM via the assisted installer, one must start it with the provided .iso image.

It seems that 'just' converting the .iso to .vhd is not that simple (is it even possible?).

(I also couldn't find an alternative way of just booting an Azure VM with the generic RHCOS image and starting the assisted installer agent manually on the VM.)

Any insights that could help? Thanks in advance!

static IP support

Hi

Is this the right place to ask about a feature that was available in version quay.io/ocpmetal/assisted-service:v1.0.16.2, namely support for static IPs? It is missing from the latest version, and we desperately need it for a big telco lab.

Now using quay.io/ocpmetal/assisted-service:stable.10.08.2021-22.10 and this feature is no longer there.

I am trying to map all the dependencies and versions so was checking the API here: https://generator.swagger.io/?url=https://raw.githubusercontent.com/openshift/assisted-service/master/swagger.yaml

Is the above Swagger API referring to the master branch?

Thanks

Allow machine CIDR to be empty when having a Dual-stack cluster and userManagedNetworking: true

Hi,

We can deploy a ipv4 only cluster just fine, but when we enable a dual-stack cluster we get this error:
error: Reason: Dual-stack cluster cannot be created with empty Machine Networks
https://github.com/openshift/assisted-service/blob/master/internal/cluster/validations/validations.go#L776

We have an external loadbalancer for our ingress & API traffic which is on subnet 10.123.127.0/24
Our machinecidr range is on 10.124.0.0/24.

When userManagedNetworking: true is specified, it should not check whether the machine CIDR range is specified, even for dual-stack clusters.

I cannot specify the machine CIDR range because then I get this error: No suitable matching CIDR found for VIP. It expects the Ingress / API VIP to be in the machine CIDR range, which is not the case for us.

https://github.com/openshift/assisted-service/blob/master/internal/network/machine_network_cidr.go#L73

         10.123.127.0/24

┌───────────┐     ┌──────────┐
│           │     │          │
│           │     │          │
│  Lb1      │     │  Lb2     │
│           │     │          │
└─────────┬─┘     └─┬────────┘
          │         │ VIPS:
          └────┬────┘ 10.123.127.10 (API)
               │      10.123.127.11 (ingress)
               │
               │
               │
               │
      ┌────────┴────────┐
      │                 │
┌─────┴─────┐     ┌─────┴─────┐
│           │     │           │
│ Worker1   │     │ Worker2   │
│ (ingress) │     │ (ingress) │
└───────────┘     └───────────┘
   10.124.0.5       10.124.0.6

            10.124.0.0/24

Regards,
Paul

OKD install fails using podman

I am using the latest podman image to try to deploy OKD. I tried using a fake pull secret ({"auths":{"fake":{"auth":"aWQ6cGFzcwo="}}}) and my pull secret from Red Hat, with the same result.

(screenshot)

Here is the user.ign file (I think this is the ignition file it is complaining about?):
(screenshot)

The assisted-service pod deployment with podman play kube fails by initdb failure

Description of the problem:

The assisted-service local deployment with podman play kube fails.

On the latest CentOS Stream 8 (but it may occur on other distros as well):

[root@assistedservice podmandeploy]# podman version
Version:      3.0.2-dev
API Version:  3.0.0
Go Version:   go1.16.1
Built:        Sat Mar 27 05:39:59 2021
OS/Arch:      linux/amd64
[root@assistedservice podmandeploy]# podman pod list
POD ID        NAME                STATUS    CREATED             INFRA ID      # OF CONTAINERS                                                                                              
72869e6c2649  assisted-installer  Degraded  About a minute ago  26c2fceccfb6  5
[root@assistedservice podmandeploy]# podman ps -a
CONTAINER ID  IMAGE                                                       COMMAND               CREATED             STATUS                     PORTS                                                                                                                                  NAMES
991180d86b6f  gcr.io/k8s-minikube/kicbase:v0.0.29                                               9 days ago          Created                    127.0.0.1:36941->22/tcp, 127.0.0.1:38681->2376/tcp, 127.0.0.1:41045->5000/tcp, 127.0.0.1:42147->8443/tcp, 127.0.0.1:39247->32443/tcp  minikube
26c2fceccfb6  registry.access.redhat.com/ubi8/pause:latest                                      About a minute ago  Up 41 seconds ago          0.0.0.0:8080->8080/tcp, 0.0.0.0:8090->8090/tcp, 0.0.0.0:8888->8888/tcp                                                                72869e6c2649-infra
9eab4cdd74c0  quay.io/centos7/postgresql-12-centos7:latest                run-postgresql        About a minute ago  Exited (1) 41 seconds ago  0.0.0.0:8080->8080/tcp, 0.0.0.0:8090->8090/tcp, 0.0.0.0:8888->8888/tcp                                                                assisted-installer-db
b04fb071cf3a  quay.io/edge-infrastructure/assisted-installer-ui:latest    /deploy/start.sh      About a minute ago  Up 41 seconds ago          0.0.0.0:8080->8080/tcp, 0.0.0.0:8090->8090/tcp, 0.0.0.0:8888->8888/tcp                                                                assisted-installer-ui
c94f19107438  quay.io/edge-infrastructure/assisted-image-service:latest   /assisted-image-s...  About a minute ago  Up 40 seconds ago          0.0.0.0:8080->8080/tcp, 0.0.0.0:8090->8090/tcp, 0.0.0.0:8888->8888/tcp                                                                assisted-installer-image-service
6b4de76ac577  quay.io/edge-infrastructure/assisted-service:latest         /assisted-service     40 seconds ago      Exited (1) 31 seconds ago  0.0.0.0:8080->8080/tcp, 0.0.0.0:8090->8090/tcp, 0.0.0.0:8888->8888/tcp                                                                assisted-installer-service
[root@assistedservice podmandeploy]# podman logs 9eab4cdd74c0
initdb: error: cannot be run as root
Please log in (using, e.g., "su") as the (unprivileged) user that will
own the server process.

This issue occurs in both rootful and rootless mode.

It can be worked around by adding securityContext/runAsUser: 26 (the postgres uid) to the run-postgresql container in deploy/podman/pod.yml, but I don't understand why it needs to be fixed this way. I'd be happy to get comments on this.
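
For reference, a hedged sketch of where that workaround would go in deploy/podman/pod.yml (the container name and surrounding fields here are illustrative, not copied from the actual file):

    # Sketch only; the container name is illustrative, not taken from pod.yml.
    containers:
      - name: db
        image: quay.io/centos7/postgresql-12-centos7:latest
        args: ["run-postgresql"]
        securityContext:
          runAsUser: 26   # postgres uid, per the workaround described above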

How reproducible:

always

Steps to reproduce:

  1. Install podman in a freshly installed CentOS Stream 8 environment

  2. Clone the assisted-service repo (https://github.com/openshift/assisted-service)

  3. Run podman play kube with "deploy/podman/pod.yml"

   #cd assisted-service/deploy/podman

   #podman play kube --configmap configmap.yml pod.yml

Actual results:

The assisted-service pod is in Degraded status.
Running quay.io/centos7/postgresql-12-centos7:latest fails due to an initdb error.

Expected results:

The assisted-service pod runs successfully.

Documentation lacking for airgapped envs

I am attempting to install OCP 4.10.20 in an air-gapped bare metal environment using the standalone podman deployment instructions. I've successfully mirrored the OpenShift install images to my air-gapped environment, as well as the Assisted Installer images for running in podman.

The Makefile has the following, which implies I will have to rebuild the image to add mirror registry support:
MIRROR_REG_CA_FILE = mirror_ca.crt
REGISTRIES_FILE_PATH = registries.conf
MIRROR_REGISTRY_SUPPORT := $(or ${MIRROR_REGISTRY_SUPPORT},False)

I have some questions:

  1. What is the format of the registries.conf file referred to above?

  2. What modifications do I have to make to the configmap.yml file that I use to run the AI in podman? In particular, what should RELEASE_IMAGES and PUBLIC_CONTAINER_REGISTRIES be set to?

  3. Do I have to patch the install-config file via the AI API to add something like this?

imageContentSources:
- mirrors:
  - ocp-utilities:8443/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-release
- mirrors:
  - ocp-utilities:8443/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
  4. Can I use a local mirror registry in insecure mode (i.e., skip SSL verification)?

How to change the URL for edge-infrastructure/assisted-installer-service-agent to a local repo for disconnected sites?

Hello everyone!

I am testing the AIS deployment method for a disconnected site. One thing I noticed is that once RHCOS starts initializing, it fails to download the AIS agent, which is required for it to communicate with the AIS platform. In a disconnected site, where the internet is unavailable, how can we point the AIS agent to our local repository?

The second thing I noted is that there is an ignition.go file which includes the URL; does that mean we need to recompile the entire image from scratch?

Error message:
Error: initializing source docker://quay.io/edge-infrastructure/assisted-installer-agent:latest: pinging container registry quay.io: Get "https://quay.io/v2/": dial tcp: lookup quay.io on x.y.40.17:53: server misbehaving
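
One avenue to check, stated as an assumption rather than a confirmed answer: the agent image reference may be configurable through the service configmap (e.g., via a variable such as AGENT_DOCKER_IMAGE in deploy/podman/configmap.yml), in which case pointing it at a local mirror might look roughly like the sketch below. The variable name and the mirror hostname are assumptions to verify against your release.

    # Sketch only: verify the exact variable name in deploy/podman/configmap.yml
    # for your release; the mirror hostname below is illustrative.
    data:
      AGENT_DOCKER_IMAGE: "registry.local.example:5000/edge-infrastructure/assisted-installer-agent:latest"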

SNO disconnected installation fails with error msg: "invalid \"install-config.yaml\" file: platform.baremetal.Hosts: Required value: not enough hosts found (0) to support all the configured ControlPlane replicas (1)"

Hi,

Environment Details:

AI version : 2.5.0
RHOCP : 4.10.20
Single Node OCP - disconnected Installation

Problem:

OCP installation failed on the single node stating:

"time="2022-07-07T14:23:58Z" level=warning msg="Failed to prepare installation of cluster 962f5c7b-6a5e-4742-8f9a-89eaf357d2be" func="github.com/openshift/assisted-service/internal/cluster.(*Manager).HandlePreInstallError" file="/go/src/github.com/openshift/origin/internal/cluster/cluster.go:885" cluster_id=962f5c7b-6a5e-4742-8f9a-89eaf357d2be error="failed generating install config for cluster 962f5c7b-6a5e-4742-8f9a-89eaf357d2be: error running openshift-install manifests, level=fatal msg=failed to fetch Master Machines: failed to load asset "Install Config": invalid "install-config.yaml" file: platform.baremetal.Hosts: Required value: not enough hosts found (0) to support all the configured ControlPlane replicas (1)\n: exit status 1" go-id=11714 pkg=cluster-state request_id="

The hosts section under "platform.baremetal" in the generated install-config.yaml is empty:
...
platform:
  baremetal:
    provisioningNetwork: ""
    apiVIP: ""
    ingressVIP: ""
    hosts: []
    clusterosimage: http://hostname.domain.net:8080/rhcos-4.10.16-x86_64-openstack.x86_64.qcow2.gz?sha256=e0a1d8a99c5869150a56b8de475ea7952ca2fa3aacad7ca48533d1176df503ab
  none: {}
  vsphere: null
...

REST API - GET /v1/clusters cannot request unregistered clusters

I am trying to automate deleting all unregistered clusters by saving all unregistered cluster IDs and deleting each entry by its ID.
However, you need Org Admin credentials to delete these clusters, because the Bearer token being passed for each cluster generation is user-level and not account-owner level.
Is there a workaround for this?

The FCOS live image of OKD fails to boot in UEFI mode

Description of the problem:

I'd like to file this issue for anyone trying to install OKD with assisted-service who faces the same problem.

It can already be worked around and has been fixed in the latest release of the FCOS live image.
We'll be able to close this issue once deploy/podman/okd-configmap.yml is updated to use that later release of the FCOS live image.

The FCOS live image for OKD installation (https://github.com/openshift/assisted-service/blob/master/deploy/podman/okd-configmap.yml#L29) has a bug that causes UEFI boot to fail. It cannot be fixed until the ISO is updated to 34.20210904.2.0 or later.

coreos/fedora-coreos-tracker#953

If you want to set up an OKD cluster with the discovery ISO in UEFI mode, this workaround may help.

At the grub console you are dropped to, type these commands. (hd0) may be correct for you, but you may need to adjust it to your environment.

grub> set prefix=(hd0)/EFI/fedora
grub> configfile $prefix/grub.cfg

Then, once you are in the FCOS live boot menu, press e to edit the boot entry and add the (hd0) prefix to the paths of the vmlinuz, initrd, and ignition images:

linux (hd0)/images/pxeboot/vmlinuz mitigations=auto,nosmt coreos.liveiso=fedora-coreos-34.20210626.3.1 ignition.firstboot ignition.platform.id=metal
initrd (hd0)/images/pxeboot/initrd.img (hd0)/images/ignition.img

Type Ctrl+x to boot FCOS.

How reproducible:

always

Steps to reproduce:

  1. Launch the assisted-service in the local environment with podman play kube, using okd-configmap.yml
  2. Create an OKD cluster in the assisted-service UI and generate the discovery ISO
  3. Boot a machine with the ISO in pure UEFI mode

Actual results:
Boot fails and you are dropped to the grub console.

Expected results:
The grub menu of the FCOS live image is shown and the machine boots successfully.

DNS validation fails but it's wrong

I've tried to install a cluster with

Screenshot from 2022-10-09 23-12-33

with the following DNS settings

Screenshot from 2022-10-09 23-10-35

but the validation fails due to the DNS wildcard entry (which does not exist at all!)

Screenshot from 2022-10-09 23-09-03

Is that a known issue?
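
For context, here is a hedged Go sketch of how a "no DNS wildcard" validation can behave (an illustration, not necessarily the assisted-service implementation): the check resolves a name that should not exist under the cluster domain, and if that name resolves anyway, it concludes that a wildcard record (or a catch-all resolver) is answering for it. That is one way this validation can fail even when you never created a wildcard record yourself.

    // Illustrative sketch only (not the assisted-service code): probe a name
    // that should not exist under the cluster domain; if it resolves, some
    // wildcard record or catch-all resolver is answering for it.
    package main

    import (
        "fmt"
        "net"
    )

    func wildcardDetected(clusterName, baseDomain string) bool {
        // Sentinel label nobody would create on purpose (hypothetical name).
        probe := fmt.Sprintf("nonexistent-probe.%s.%s", clusterName, baseDomain)
        addrs, err := net.LookupHost(probe)
        return err == nil && len(addrs) > 0
    }

    func main() {
        if wildcardDetected("mycluster", "example.com") {
            fmt.Println("validation fails: the probe name resolved, so a wildcard or catch-all DNS answer exists")
        } else {
            fmt.Println("validation passes: no wildcard DNS detected")
        }
    }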
