
Comments (50)

schaefi avatar schaefi commented on September 27, 2024 1

ok 👍

from azure-li-services.

rjschwei avatar rjschwei commented on September 27, 2024

Fixed the title. Predictable names are based on path, device, etc. (i.e. en...), and persistent names are the "legacy" names such as eth0, eth1, etc.

@jaawasth you say OS upgrade, can you be more specific please? Is this an upgrade between service packs, i.e. SLES 15 to SLES 15 SP1, or is this a distribution upgrade, i.e. SLES 12 SP5 to SLES 15 SP1?

Names are set by udev rules and udev is part of systemd, so we need to know the starting and end point. Also please file a bug in Bugzilla so we can involve other teams at SUSE if this turns out to be a udev/systemd issue.
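(For reference, a quick way to see which .link configuration udev applied to a given interface; eth0 here is only an example:)

udevadm test-builtin net_setup_link /sys/class/net/eth0
# the output should show which .link config file was applied to the device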

Thanks

from azure-li-services.

jaiawasthi avatar jaiawasthi commented on September 27, 2024

@rjschwei thanks for the update !!
I still don't have complete clarity; I'm getting it from the customer engagement team.
From what they have described, the customer patched the system [I believe they did a kernel upgrade/refresh]; I'll update with the exact process that was done.
So the upgrade was for a SLES 15 SP1 VLI image [I think it's just OS patching].

Names are set by udev rules and udev is part of systemd.

But do the udev rules always write out a 70-persistent-net.rules file?
If there is some other way we are adding the "ID_NET_NAME_PATH", is it still safe to add this rule in the persistent-net.rules file ?
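(For reference, the path/slot/MAC name candidates udev has computed for a device can be inspected without writing any rules; eth4 is just an example:)

udevadm info -q property /sys/class/net/eth4 | grep ID_NET_NAME
# typically lists the ID_NET_NAME_PATH, ID_NET_NAME_SLOT and ID_NET_NAME_MAC candidates for the device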

from azure-li-services.

jaiawasthi avatar jaiawasthi commented on September 27, 2024

So it does look like the customer just did a kernel upgrade.
After the kernel upgrade, the interface names switched to:

From Dmesg:
[ 162.098760] mlx5_core 0000:41:00.0 ens2370f0: renamed from eth4
[ 162.153688] mlx5_core 0000:41:00.1 ens2370f1: renamed from eth5
[ 162.364046] mlx5_core 0001:c1:00.0 enP1s2372f0: renamed from eth6
[ 162.434419] mlx5_core 0001:c1:00.1 enP1s2372f1: renamed from eth7

Please note we expect the names to be predictable, so the interface names should be:

enp65s0f0 - eth4
enp65s0f1 - eth5
enP1p193s0f0 - eth6
enP1p193s0f1 - eth7

The kernel has been upgraded from 4.12.14-197.45-default ----> 4.12.14-197.56-default

As a workaround to allow the customer to continue having access to the servers, the following rules were added to the 70-persistent-net.rules file:

SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="mac-addresses1", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth4"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="mac-addresses2", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth5"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="mac-addresses3", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth6"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="mac-addresses4", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth7"

Please note mac-addresses1..4 stand for the MAC addresses of the network devices.
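(A small shell sketch to collect the MAC addresses that go into the mac-addresses1..4 placeholders above:)

for i in eth4 eth5 eth6 eth7; do
    printf '%s %s\n' "$i" "$(cat /sys/class/net/$i/address)"
done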

from azure-li-services.

jaiawasthi avatar jaiawasthi commented on September 27, 2024

The issue is reproducible locally in my environment now: if I do a zypper ref & zypper up, the interface names change.
What they were before:

sdflex01:~ # ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: enp195s0f0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 08:00:69:18:13:00 brd ff:ff:ff:ff:ff:ff
3: enp195s0f1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 08:00:69:18:13:01 brd ff:ff:ff:ff:ff:ff
4: enp195s0f2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 08:00:69:18:13:02 brd ff:ff:ff:ff:ff:ff
5: enp195s0f3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 08:00:69:18:13:03 brd ff:ff:ff:ff:ff:ff
6: enp65s0f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond0 state UP group default qlen 1000
link/ether b8:83:03:94:a7:74 brd ff:ff:ff:ff:ff:ff
7: enp65s0f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond1 state UP group default qlen 1000

link/ether b8:83:03:94:a7:75 brd ff:ff:ff:ff:ff:ff
8: enP1p193s0f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond0 state UP group default qlen 1000
link/ether b8:83:03:94:a7:74 brd ff:ff:ff:ff:ff:ff
9: enP1p193s0f1: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 9000 qdisc mq master bond1 state DOWN group default qlen 1000
link/ether b8:83:03:94:a7:75 brd ff:ff:ff:ff:ff:ff
10: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
link/ether b8:83:03:94:a7:74 brd ff:ff:ff:ff:ff:ff
11: bond1: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
link/ether b8:83:03:94:a7:75 brd ff:ff:ff:ff:ff:ff
12: vlan210@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether b8:83:03:94:a7:74 brd ff:ff:ff:ff:ff:ff
inet 10.100.0.179/24 brd 10.100.0.255 scope global vlan210
valid_lft forever preferred_lft forever
13: vlan211@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
link/ether b8:83:03:94:a7:74 brd ff:ff:ff:ff:ff:ff
inet 10.20.211.179/24 brd 10.20.211.255 scope global vlan211
valid_lft forever preferred_lft forever
14: vlan213@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
link/ether b8:83:03:94:a7:74 brd ff:ff:ff:ff:ff:ff
inet 10.20.213.179/24 brd 10.20.213.255 scope global vlan213
valid_lft forever preferred_lft forever
15: vlan212@bond1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
link/ether b8:83:03:94:a7:75 brd ff:ff:ff:ff:ff:ff
inet 10.20.212.179/24 brd 10.20.212.255 scope global vlan212
valid_lft forever preferred_lft forever

What they change to:

sdflex01:~ # ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: enp195s0f0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 08:00:69:18:13:00 brd ff:ff:ff:ff:ff:ff
3: enp195s0f1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 08:00:69:18:13:01 brd ff:ff:ff:ff:ff:ff
4: enp195s0f2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 08:00:69:18:13:02 brd ff:ff:ff:ff:ff:ff
5: enp195s0f3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 08:00:69:18:13:03 brd ff:ff:ff:ff:ff:ff
6: ens2498f0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether b8:83:03:94:a7:74 brd ff:ff:ff:ff:ff:ff
7: ens2498f1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether b8:83:03:94:a7:75 brd ff:ff:ff:ff:ff:ff

8: enP1s2500f0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether b8:83:03:94:97:9c brd ff:ff:ff:ff:ff:ff
9: enP1s2500f1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether b8:83:03:94:97:9d brd ff:ff:ff:ff:ff:ff

Please note only the "UP" interfaces are impacted and not the down ones; the "UP" interfaces are changing names, which is why the network is going down.
Interestingly, the "DOWN" interfaces are still named based on the PCI bus path rather than the PCI slots.

from azure-li-services.

schaefi avatar schaefi commented on September 27, 2024

Hmm, maybe an update of udev is causing this. Our images come with the following setting:

/usr/lib/systemd/network/99-default.link

[Link]
NamePolicy=path
MACAddressPolicy=persistent

Is that file changed after you ran zypper up ?
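(One way to check is to look at which package owns the file and compare its current content with the setting above; the owning package name can differ depending on how systemd/udev is split:)

rpm -qf /usr/lib/systemd/network/99-default.link   # shows the package that ships the file
cat /usr/lib/systemd/network/99-default.link       # compare against the NamePolicy=path content above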

If yes, please open a bug against [email protected] and ask how to configure the system permanently for the setting above such that it will survive an update process.

Thanks

from azure-li-services.

jaiawasthi avatar jaiawasthi commented on September 27, 2024

Thanks @schaefi , I did see both systemd & udev being upgraded but I didn't look at this file afterwards. Let me check again; I'll update you soon, but will need to set up the system again.

from azure-li-services.

jaiawasthi avatar jaiawasthi commented on September 27, 2024

@schaefi, I have created a bug against systemd-maintainers.

The name policy is getting changed to kernel:
[Link]
NamePolicy=kernel database onboard slot path
MACAddressPolicy=persistent
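(For context, and purely as an annotated copy of the file above: per systemd.link(5) the NamePolicy entries are tried in order and the first one that yields a usable name wins, which is why the slot-based ens.../enP1s... names now win over the path-based enp.../enP1p... ones.)

[Link]
# policies are tried left to right; the first that produces a usable name is applied,
# so with this list a slot-based name (ens...) is chosen before a path-based one (enp...)
NamePolicy=kernel database onboard slot path
MACAddressPolicy=persistent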

https://bugzilla.suse.com/show_bug.cgi?id=1176738

from azure-li-services.

schaefi avatar schaefi commented on September 27, 2024

ok, so it's clear why the issue happened. Now we only need a solution. If you get further information from the bug, please post a short notice in this report; I'm not looking at Bugzilla as often as I look at / get notified here. Thanks

from azure-li-services.

jaiawasthi avatar jaiawasthi commented on September 27, 2024

@schaefi
I think changing the naming policy to "path" would be fine for fixing this issue temporarily; I have tried it and it works fine, but I just want to ensure that's the only thing we need to do.

NamePolicy=path

Thanks !!

from azure-li-services.

schaefi avatar schaefi commented on September 27, 2024

want to ensure that's the only thing we need to do.

Yes, that's the fix you can apply manually to make the system work again. A real fix, however, must be done differently. I saw there has been no response to the bug you created so far. I guess we have to wait a little longer.

from azure-li-services.

jaiawasthi avatar jaiawasthi commented on September 27, 2024

@schaefi , thanks. Also, I'm assuming the fixes will be in 2 parts.

  1. a patch fix by systemd, which will be available to the customers who are patching their current OSes
  2. these patches will be included in all our VLI images.

from azure-li-services.

rjschwei avatar rjschwei commented on September 27, 2024

@schaefi I'm afraid this is ours:

https://www.freedesktop.org/software/systemd/man/systemd.link.html

from azure-li-services.

rjschwei avatar rjschwei commented on September 27, 2024

@jaiawasthi

There is nothing that can be done to prevent this on update. Before updating, users should run, as root:

mkdir -p /etc/systemd/network
cp /usr/lib/systemd/network/99-default.link /etc/systemd/network
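(This works because a file of the same name under /etc/systemd/network/ masks the one under /usr/lib/systemd/network/, so a later udev/systemd update that replaces the /usr/lib copy no longer changes the effective policy. A quick check after the copy:)

ls -l /etc/systemd/network/99-default.link
cat /etc/systemd/network/99-default.link   # should still show the policy the image shipped with (path on VLI, mac on LI)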

from azure-li-services.

jaiawasthi avatar jaiawasthi commented on September 27, 2024

@schaefi @rjschwei
just to be sure,

  1. Users who already have their systems upgraded need to run the 2 steps below:
    mkdir -p /etc/systemd/network
    cp /usr/lib/systemd/network/99-default.link /etc/systemd/network

Also, will this prevent changed interface names across future updates ?

  2. For all new images you will provide, the rules file will be added in the correct location, which will persist across updates, and we will always have the same interface names ?

from azure-li-services.

schaefi avatar schaefi commented on September 27, 2024

@jaiawasthi yes, your thinking is correct. I will merge the changes from #244 today. This will result in new testing images which I can test in our environment, as it's independent of your data center. If there is confirmation that the change really fixed it, I will submit the images to the production (SUSE) namespace and create another production image release.

Do you want a full set of new testing images to test in your space as well ?

Thanks

from azure-li-services.

jaiawasthi avatar jaiawasthi commented on September 27, 2024

@schaefi since the changes involve essentially just running the 2 commands below:
mkdir -p /etc/systemd/network
cp /usr/lib/systemd/network/99-default.link /etc/systemd/network

I'll add these manually in our current setup and see if it's working.
Meanwhile, please test in your setup & provide us the prod image directly.
Thanks !!

from azure-li-services.

jaiawasthi avatar jaiawasthi commented on September 27, 2024

@schaefi @rjschwei
I tried applying the steps mentioned before an upgrade:
mkdir -p /etc/systemd/network
cp /usr/lib/systemd/network/99-default.link /etc/systemd/network

sdflexOptanePart1:~ # cat /etc/systemd/network/99-default.link
[Link]
NamePolicy=path
MACAddressPolicy=persistent

But the interfaces are not coming back up.
Also, the interfaces are named differently now:
sdflexOptanePart1:~ # ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 08:00:69:18:1b:ac brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 08:00:69:18:1b:ad brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 08:00:69:18:1b:ae brd ff:ff:ff:ff:ff:ff
5: eth3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 08:00:69:18:1b:af brd ff:ff:ff:ff:ff:ff
6: eth4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether b8:83:03:92:32:84 brd ff:ff:ff:ff:ff:ff
7: eth5: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether b8:83:03:92:32:85 brd ff:ff:ff:ff:ff:ff
8: eth6: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether b8:83:03:92:22:d4 brd ff:ff:ff:ff:ff:ff
9: eth7: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether b8:83:03:92:22:d5 brd ff:ff:ff:ff:ff:ff

from azure-li-services.

jaawasth avatar jaawasth commented on September 27, 2024

@schaefi , since this is not fixed, can we please reopen this issue? I will not be able to do so.

from azure-li-services.

schaefi avatar schaefi commented on September 27, 2024

yes of course, sorry, I should have done this already

from azure-li-services.

schaefi avatar schaefi commented on September 27, 2024

Since there was an OBS build service outage yesterday, I will test the image with the change today. I expect the suggested fix not to work, but wanted to double-check.

from azure-li-services.

schaefi avatar schaefi commented on September 27, 2024

I can confirm the fix did not work. The interface name in my test was 'ens3' where it should be based on the MAC address. The same applies to the images that use the path policy.

I'm going to revert everything done here.

from azure-li-services.

schaefi avatar schaefi commented on September 27, 2024

Sure, that's what I'm currently doing. But your change broke everything, because the NamePolicy we need is now not taken into account at all. I don't want to keep broken images around.

from azure-li-services.

jaawasth avatar jaawasth commented on September 27, 2024

@schaefi @rjschwei did we find any solutions ?

from azure-li-services.

schaefi avatar schaefi commented on September 27, 2024

A solution was found. The bug is in dracut. For details see

I have added the changes from our side in PR #246.
But now we have to wait until dracut gets fixed and a new dracut package gets released. That's why I set the labels accordingly on the open pull request. A merge can only be done after a dracut update; otherwise the change in the open PR has no effect.

from azure-li-services.

jaawasth avatar jaawasth commented on September 27, 2024

Thanks @schaefi !!

from azure-li-services.

rjschwei avatar rjschwei commented on September 27, 2024

Note that users cannot update until dracut has been fixed (this previously read grub, which was incorrect), and then they need to follow the cp instructions posted earlier.

from azure-li-services.

schaefi avatar schaefi commented on September 27, 2024

Note that users cannot update until the grub has been fixed and then they need to follow the cp instructions posted earlier.

@rjschwei I'm confused ?? how is grub related to this report ??

from azure-li-services.

schaefi avatar schaefi commented on September 27, 2024

@rjschwei you probably meant dracut, not grub; that's noted on PR #246 and is the reason why the blocked label is set. I hope Thomas will provide a testing package so that we can target the branch build for the testing images as long as the release is not done.

from azure-li-services.

rjschwei avatar rjschwei commented on September 27, 2024

@schaefi yes, comment fixed

from azure-li-services.

schaefi avatar schaefi commented on September 27, 2024

We will fix this on the image description level without a dracut update.

from azure-li-services.

jaawasth avatar jaawasth commented on September 27, 2024

Thanks Marcus, but do we need the settings for LI as well ?

from azure-li-services.

schaefi avatar schaefi commented on September 27, 2024

yes, everywhere. In LI we have NamePolicy set to mac, in VLI we have NamePolicy set to path. For all descriptions we used /usr/lib/systemd/network/99-default.link. This means all images have the potential to be broken on an update of udev when this file gets overwritten :/

So I will update all images for all SLE versions.

from azure-li-services.

jaawasth avatar jaawasth commented on September 27, 2024

@schaefi , is there any interim solution which the customer can apply before upgrading their OSes so that they don't run into this issue [without actually upgrading to the new SLES image which you would be releasing] ?

from azure-li-services.

schaefi avatar schaefi commented on September 27, 2024

yes. The current interim solution is:

cp /usr/lib/systemd/network/99-default.link /etc/systemd/network
echo 'install_items+=" /etc/systemd/network/99-default.link "' > /etc/dracut.conf.d/03-systemd.conf
dracut -f

This makes the setting permanent and update-safe.
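(A possible way to confirm the file really ended up in the regenerated initrd:)

lsinitrd | grep systemd/network/99-default.link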

But please hold off on communicating this, because our systemd maintainers don't like it. We are currently discussing other options.

from azure-li-services.

jaawasth avatar jaawasth commented on September 27, 2024

@schaefi , sure. I thought we had reached some agreement and you raised a PR for that ?

from azure-li-services.

schaefi avatar schaefi commented on September 27, 2024

Yes, and all of this is working. The PR is based on what I wrote in the interim solution; it has been tested and also submitted to the SUSE namespace for a production release. I just need to click the button and send it to you.

But now people from the systemd maintainer team claimed that using /etc/systemd/network is not the right place to keep the config, because it should be used for local modifications only. They suggested using a file with a higher-priority name and putting it in /usr/lib/systemd/network. I prepared a test appliance and demonstrated that this does not work. This is where we are right now.

I'd like to give them another day or two to elaborate on this, and once it's clear that I didn't do something completely stupid, I will go ahead and start the production release process with the solution from here.

If this is blocking you in some way please let me know

Thanks

from azure-li-services.

schaefi avatar schaefi commented on September 27, 2024

ok, here is the result of my conversation with the systemd people. @jaawasth you can use the following as an interim solution until we roll out new production images:

cp /usr/lib/systemd/network/99-default.link /usr/lib/systemd/network/80-azure-li-net.link
dracut -f

NOTE: It's absolutely mandatory that you stick with the name 80-azure-li-net.link, because that makes the sort order correct for systemd.
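(The reason is that .link files are processed in lexical order and the first file whose [Match] section matches the device wins, so the new file has to sort before the updated 99-default.link. Assuming only these two files are present, a listing would look like:)

ls /usr/lib/systemd/network/
# 80-azure-li-net.link  99-default.link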

from azure-li-services.

jaawasth avatar jaawasth commented on September 27, 2024

Thanks Marcus, I'll test it out and let you know as well. Will then recommend these to the customers.

from azure-li-services.

jaawasth avatar jaawasth commented on September 27, 2024

@schaefi I tried testing today; sorry, all the servers were busy with some testing.
I tried it with image SLES15-SP1-SAP-Azure-VLI-BYOS.x86_64-1.0.5-Production-Build1.127.raw.xz.
Is the behavior not consistent across images ?

The behavior is interesting: we don't see the name policy as path in the image, but rather kernel.
I still created the file /usr/lib/systemd/network/80-azure-li-net.link with the name policy set to path and updated the server; it still lost the interfaces.

  • interface: enp65s0f0
    mtu: 9000
  • interface: enp65s0f1
    mtu: 9000
  • interface: enP1p193s0f0
    mtu: 9000
  • interface: enP1p193s0f1

sdflex02:~ # ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: enp195s0f0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 08:00:69:18:13:00 brd ff:ff:ff:ff:ff:ff
3: enp195s0f1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 08:00:69:18:13:01 brd ff:ff:ff:ff:ff:ff
4: enp195s0f2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 08:00:69:18:13:02 brd ff:ff:ff:ff:ff:ff
5: enp195s0f3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 08:00:69:18:13:03 brd ff:ff:ff:ff:ff:ff
sdflex02:~ # cat /usr/lib/systemd/network/80-azure-li-net.link
[Link]
NamePolicy=path database onboard slot path
MACAddressPolicy=persistent
sdflex02:~ # cat /usr/lib/systemd/network/99-default.link
[Link]
NamePolicy=kernel database onboard slot path
MACAddressPolicy=persistent
sdflex02:~ # uname -a
Linux sdflex02 4.12.14-197.61-default #1 SMP Thu Oct 8 11:04:16 UTC 2020 (b98c600) x86_64 x86_64 x86_64 GNU/Linux

from azure-li-services.

schaefi avatar schaefi commented on September 27, 2024

As a side note I have released new production images today that will fix the issue, you should have an e-mail with further details

Back to the issue here, I think there is a misunderstanding. Your file 80-azure-li-net.link looks wrong to me. The suggested interim solution is based on a system that has not yet run "zypper up", meaning a system that has not yet replaced 99-default.link with a udev update. If your system has already installed the udev update, you can't copy 99-default.link, as it has already been replaced with a version that is not sufficient for our use case.

So in case your system already has a udev update installed, the following needs to be done to fix up the settings:

1a) On LI systems create /usr/lib/systemd/network/80-azure-li-net.link with the following content:

[Link]
NamePolicy=mac
MACAddressPolicy=persistent
[Match]
OriginalName=* 

1b) On VLI systems create /usr/lib/systemd/network/80-azure-vli-net.link with the following content:

[Link]
NamePolicy=path
MACAddressPolicy=persistent
[Match]
OriginalName=* 
  2. Call dracut:

    dracut -f

  3. Reboot the system (a verification sketch follows below).
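(A minimal verification sketch after the reboot; the expected names assume a VLI system with the path policy, enp65s0f0 is just an example device, and the ID_NET_LINK_FILE property may not be exposed by every systemd version:)

ip -o link show | awk -F': ' '{print $2}'                                   # expect the path-based enp.../enP1p... names again
udevadm info -q property /sys/class/net/enp65s0f0 | grep ID_NET_LINK_FILE   # if present, should point at the 80-azure-*-net.link file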

The easiest solution is to deploy the images that got released today. But if you need to fix up running servers, it should be done in the above way.

Hope this helps

from azure-li-services.

jaawasth avatar jaawasth commented on September 27, 2024

@schaefi this is the system state after the update.

What I had done:

  1. cp /usr/lib/systemd/network/99-default.link /usr/lib/systemd/network/80-azure-vli-net.link
  2. modify the policy to path [this is VLI] in the 80-azure-vli-net.link file
  3. dracut -f
  4. reboot
  5. update the system
  6. share info with you about the updated system [in the final state]

I have updated the command sequence from the host itself below.

1 2020-06-29 13:11:44 uname -a
2 2020-06-29 13:12:52 /usr/lib/systemd/network/99-default.link
3 2020-06-29 13:12:56 cat /usr/lib/systemd/network/99-default.link
4 2020-06-29 13:13:15 ip a
5 2020-06-29 13:18:00 cat /usr/lib/systemd/network/99-default.link
6 2020-06-29 13:19:12 cp /usr/lib/systemd/network/99-default.link /usr/lib/systemd/network/80-azure-li-net.link
7 2020-06-29 13:19:16 vi /usr/lib/systemd/network/80-azure-li-net.link
8 2020-06-29 13:19:43 dracut -f
9 2020-06-29 13:20:35 reboot
10 2020-06-29 13:33:40 exit
11 2020-10-14 01:37:19 zypper ref -s && zypper up
12 2020-10-14 01:48:49 reboot

The other thing is: why is the default policy kernel for VLIs instead of path, and even if it is, how are the interfaces getting correct names in that scenario?

from azure-li-services.

jaawasth avatar jaawasth commented on September 27, 2024

@schaefi , any updates ?

from azure-li-services.

jaawasth avatar jaawasth commented on September 27, 2024

@schaefi , were you able to reproduce this at your end as well ?
can we reopen this bug ?

from azure-li-services.

schaefi avatar schaefi commented on September 27, 2024

Sorry I'm completely confused.

  • I sent out new production images a week or two ago which fixed all the interface name policy setup.
    Did you get those, and did you have a chance to test them ? I haven't received any feedback to my mail with the SAS URLs
    for those.

  • All images have changed according to their policy setup. You saw the PRs here and you reviewed them. I have no idea
    why the behavior should be different between images.

  • All LI images use NamePolicy=mac, all VLI images use NamePolicy=path. Nothing has changed in this regard.
    Do you request anything different ?

  • The provided procedure from my last comment here in this report worked on all systems I have tested

I'm sorry, all this is more than confusing to me and I don't understand how reopening this could help.

Can we please be more specific about what exactly is not working as you expect? At best, provide me ssh access to a machine where you think something is not correct.

Thanks

from azure-li-services.

jaawasth avatar jaawasth commented on September 27, 2024

@schaefi , there were 2 parts to the problem:

  1. fixing it in new images
  2. fixing it in existing systems

The 2nd point is the one I was testing, with the build version of the image I mentioned, on an existing system.
I hope it's clearer now ? Else let's sync up over a call.

from azure-li-services.

schaefi avatar schaefi commented on September 27, 2024

fixing in existing systems

ok, thanks. You said the procedure to fix existing systems is not consistent or does not work at all ? I've tested the procedure here again in a VM and could not find a problem. Do you have a system I can ssh to for further checking ?

from azure-li-services.

jaawasth avatar jaawasth commented on September 27, 2024

@schaefi

  1. I was trying to test on the latest image, which does have a path policy defined. But just for testing, I picked an older version of the image; as mentioned earlier, that version is SLES15-SP1-SAP-Azure-VLI-BYOS.x86_64-1.0.5-Production-Build1.127.raw.xz. This image interestingly has no path policy defined [still the network is configured properly].
  2. It's on this image that I tried testing the workaround. Is it possible for you to test on the same image and see if you are able to reproduce this issue?
  3. Yes, I have a system. Please let me know a time [preferably next week] when you would want to test it out; I can prepare a system beforehand.

from azure-li-services.

schaefi avatar schaefi commented on September 27, 2024

Sorry for the late response, I had to make some progress with work in the kiwi area.

I have tested the image you mentioned and I can explain the difference. The image you tested is from 24.4.2020, but in this image the rule rewrite to the PATH policy was still done in a different way. For details see commit #43e5664c946a20e99ef8c6c4ab953fd2125a44b9. So the setup procedure using the systemd config file as described here came later.

In the image you have tested, the policy is applied using a udev rule. See the following file on your system:

/usr/lib/udev/rules.d/81-net-setup-link.rules

This file rewrites the interfaces on the udev level, not on the systemd level. This rewrite is however not the best solution, as it should be done once by a correct setup of the link policy through systemd, which is the reason why it was changed in the images.
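(I have not pasted the file here, but as a rough, hypothetical sketch, a rule of this kind would force the path-based name from udev's net_id properties instead of going through the .link policy; the actual rule shipped in the image may look different:)

SUBSYSTEM=="net", ACTION=="add", ENV{ID_NET_NAME_PATH}!="", NAME="$env{ID_NET_NAME_PATH}"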

The good news is that on systems which rewrite the interfaces through this extra 81-net-setup-link.rules file, you should not
see the issue we have with the systemd config file.

Everything should just be ok on this system, before and after the update of udev.

I'm sorry, I didn't think about production images still being out there that use the 81-net-setup-link.rules file.

Does this make sense to you or did I confuse you ?

Thanks

from azure-li-services.

schaefi avatar schaefi commented on September 27, 2024

In short, customers running a production system that has /usr/lib/udev/rules.d/81-net-setup-link.rules should not see an issue.

from azure-li-services.
