Giter Club home page Giter Club logo

Comments (22)

triha2work avatar triha2work commented on August 12, 2024

There is no automatic HSR frame generation. The path field is always A. Hardware duplicate discard only works on sequence id and source MAC address. So if the the source MAC address is different as if sent from 2 independent network devices then the feature will not work. Also using just even number nodes like 2 will not trigger the effect as the two frames likely arrive at the same time.

from evb-ksz9477.

lmajewski avatar lmajewski commented on August 12, 2024

@triha2work - Thanks for your reply.

Yes - it also looks to me that there is no automatic HSR field generation in the KSZ9477 - this must be done in SoC SW.

My question is - how the frame duplication works? In the 4.9 driver each HSR frame is sent from SoC with tail tag assigned to separate port. In other words - the frame duplication offloading in KSZ9477 is not utilized. Am I correct?

If I would like to use this offloading (i.e. frame duplication) - then I would need to enable ports in "HSR Port Map" register and modify tail tag to has bits set for those HSR aware ports? Is this enough?

I also would like to ask about supported HSR version. From mine observation - the "Supervision Frames" with EtherType=0x88FB (PRP) are send by the 4.9 Linux driver. However, when I dump skb data in Linux 4.9 HSR driver just before sending it by MAC there is no frames with EtherType=0x892F. Why such frames are not observed (the HSR seems to work as I can see ongoing transmission with unplugged one cable)?

To sum up - could you clarify which HSR version (v0 or v1) is supported by the KSZ9477 IC and the Linux 4.9 driver?

Last but not least - for current mainline (v6.5-rc3) I can observe, that when I do use pure SW based HSR (CONFIG_HSR=y) with DSA driver - the feature is not working and I notice a lot of "error" frames with "Link Down/Up" observed all the time.

Do you have any idea why this happens?

Thanks in advance for your help.

from evb-ksz9477.

triha2work avatar triha2work commented on August 12, 2024

The frame duplication is just a normal feature in KSZ switches where the host can send a frame to any port. In this case the same frame is sent to ports 1 and 2 as defined as HSR ports. There is no specific switch hardware register to enable this feature except the tail tagging, which is always required to implement the DSA driver.
The HSR driver code used by the non-DSA driver is very old. It came from 3.18. At that time the V1 implementation was not formalized. I am able to use the HSR code in 6.1 to replace the old one, but the performance is worse than before so I am hesitated to do that.
The standard HSR implementation sits on top of network device. A new HSR device is created using 2 network devices like lan1 and lan2. So I do not know why there is issue. I can try that out.
The HSR implementation in the non-DSA switch driver was changed from setting on the very top of network devices to the very bottom of the network device. When a normal frame is sent to the MAC driver it is passed to the switch driver to transform it into a HSR frame. The port destination is always updated to send to both HSR ports. When the HSR frame is received from the MAC it is passed to the switch driver to check for duplicate and remove the HSR tag.
It was implemented that way because of concern of using PTP in HSR environment.

from evb-ksz9477.

lmajewski avatar lmajewski commented on August 12, 2024

The frame duplication is just a normal feature in KSZ switches where the host can send a frame to any port. In this case the same frame is sent to ports 1 and 2 as defined as HSR ports.

Ok, this is clear - the KSZ9477 is copying 1 to 1 the HSR header when it duplicates it.

There is no specific switch hardware register to enable this feature except the tail tagging, which is always required to implement the DSA driver.

I see - the duplication is enabled when both ports (previously enabled in "HSR PORT MAP" register) are marked in KSZ9477 TAG

The HSR driver code used by the non-DSA driver is very old. It came from 3.18. At that time the V1 implementation was not formalized.

So the 4.9 (non DSA driver) supports only HSR v0 ?

I am able to use the HSR code in 6.1 to replace the old one, but the performance is worse than before so I am hesitated to do that.

Which code is that? Is the 4.9 ported driver? Or is it the only-SW-based HSR implementation already present in the Linux kernel (enabled with CONFIG_HSR)?

The standard HSR implementation sits on top of network device. A new HSR device is created using 2 network devices like lan1 and lan2.

I'm using with v6.5-rc3:
ifconfig lan1 up;ifconfig lan2 up
ip link add name hsr0 type hsr slave1 lan1 slave2 lan2 supervision 45
ethtool -K hsr0 hsr-dup-offload on --> Not yet implemented (work in progress)
ifconfig hsr0 192.168.0.1 up

So I do not know why there is issue. I can try that out.

I would expect to have the in-SW HSR working with KSZ9477 (on EVB-KSZ9477) with only using CONFIG_HSR and CONFIG_DSA

The HSR implementation in the non-DSA switch driver was changed from setting on the very top of network devices to the very bottom of the network device. When a normal frame is sent to the MAC driver it is passed to the switch driver to transform it into a HSR frame.

Yes - I've noticed it. The 4.9 Linux driver is modifying frames to insert by itself the HSR header. Moreover, it is not using the DSA framework.

The port destination is always updated to send to both HSR ports.

Are you sure? Maybe, I've misunderstood the Linux 4.9 driver code, but it looks like the HSR driver is sending two separate frames (with only one egress port set in tag). It looks like this driver is NOT relying on in-KSZ9477 frame duplication feature.

When the HSR frame is received from the MAC it is passed to the switch driver to check for duplicate and remove the HSR tag.

This must be supported anyway, as there is a chance to have two frames with the same sequence number passed to MAC, as in some circumstances the KSZ9477 will not drop the duplicate frame.

It was implemented that way because of concern of using PTP in HSR environment.

The PTP is not utilized in my use case. I do only need HSR support in the newest mainline.

from evb-ksz9477.

triha2work avatar triha2work commented on August 12, 2024

The non-DSA switch driver uses its built-in HSR driver, not the one from the kernel. That was updated to use v1. If you want to use the kernel HSR driver then you need to set up the non-DSA driver to create independent network devices, just like the DSA driver.
As explained before the kernel HSR support uses a virtual device on top of 2 network devices. The non-DSA driver HSR support is located at the bottom and uses 1 regular eth0 network device. The HSR support is set up in U-Boot with these parameters: "setenv eth1_ports 3; setenv eth1_proto hsr."

from evb-ksz9477.

lmajewski avatar lmajewski commented on August 12, 2024

The non-DSA switch driver uses its built-in HSR driver, not the one from the kernel. That was updated to use v1

Ok. That is for Linux 4.9 HSR driver from Microchip?

If you want to use the kernel HSR driver then you need to set up the non-DSA driver to create independent network devices, just like the DSA driver.

Is this sentence regarding 4.9 Microchip driver? Or is it regarding newest Linux 6.5-rc3? I guess that the former? Am I correct?

The non-DSA driver HSR support is located at the bottom and uses 1 regular eth0 network device.

Yes, I've noticed that on 4.9 driver only eth0 is present.

The HSR support is set up in U-Boot with these parameters: "setenv eth1_ports 3; setenv eth1_proto hsr."

Yes, correct. I'm able to setup KSZ9477 EVB board to work in this mode with 4.9 Linux kernel (from Microchip).

from evb-ksz9477.

triha2work avatar triha2work commented on August 12, 2024

The drivers in the GitHub repository are non-DSA drivers. There are DSA drivers in the 4.9 kernel but they are for testing only. The official DSA drivers should come from the mainline kernels, like 5.4 and up.
I just tried the kernel HSR driver with non-DSA switch driver in kernel 5.10 and did not see any problem. I can try the DSA driver later, but that should not cause any problem as the DSA operation is much simpler.
The non-DSA switch driver always run HSR in v1, as Wireshark only displays HSR frame in that version correctly, so the switch driver never tried to operate in v0. I noted the HSR driver gets the protocol version from the "ip link" command but the one I used does not support that parameter, so I just forced the version to v1 from default v0 in the HSR driver.
The performance of kernel HSR driver is very bad, especially the newer version like in 5.10. The TCP transmit performance is only about 65 Mbps while the non-DSA HSR driver gets about 145 Mbps.
There are some bugs in 6.1 version, but they were probably fixed in 6.5.

from evb-ksz9477.

lmajewski avatar lmajewski commented on August 12, 2024

The drivers in the GitHub repository are non-DSA drivers. There are DSA drivers in the 4.9 kernel but they are for testing only. The official DSA drivers should come from the mainline kernels, like 5.4 and up.

The kernel DSA driver (without HSR) for KSZ9477-EVB works as expected without issues on 6.5-rcX.

I just tried the kernel HSR driver with non-DSA switch driver in kernel 5.10 and did not see any problem. I can try the DSA driver later, but that should not cause any problem as the DSA operation is much simpler.

The issue here is the permanent link status change (Link UP/DOWN) in the logs for HSR enabled interfaces. Only a fraction of HSR frames are delivered. It looks like the network subsystem seems to not be working reliable.

The non-DSA switch driver always run HSR in v1, as Wireshark only displays HSR frame in that version correctly, so the switch driver never tried to operate in v0.

Yes. I can confirm that.

I noted the HSR driver gets the protocol version from the "ip link" command but the one I used does not support that parameter,

Yes. The 'ip" command is outdated in KSZ9477-EVB BSP setup (I'm using the newest one built from buildbot).

so I just forced the version to v1 from default v0 in the HSR driver.

I also did the same.

The performance of kernel HSR driver is very bad, especially the newer version like in 5.10.

Could you share more details about your's setup to test the performance of mainline kernel's HSR and DSA drivers (enabled together)? With 6.5-rcX the SW based HSR driver (with DSA) is not usable (as mentioned above).

The TCP transmit performance is only about 65 Mbps while the non-DSA HSR driver gets about 145 Mbps. There are some bugs in 6.1 version,

Could you be more specific here ?

but they were probably fixed in 6.5.

Can you point me to some bug reports? Are those for the non-DSA HSR driver only?

  • Just to explain mine approach:
  1. I expect that the KSZ9477-EVB shall work with in kernel's DSA and HSR driver (even with very poor performance).
  2. The drivers/net/dsa/microchip/ksz* shall have the ksz9477_hsr_join added to it to enable in KSZ9477 IC HW features (like frame duplication and filtering)

from evb-ksz9477.

triha2work avatar triha2work commented on August 12, 2024

Are you using the KSZ9477 evaluation board with SAMA5D3? I do not see the link issues you mentioned. And they should not be related to the HSR driver. Are you connecting the switch to some other switch and somehow the PHY settings are not compatible and there are link issues?
The HSR driver bug is still there in linux 6.5.0-rc6. It cannot decode the supervisor frame correctly and so every time a unicast frame is sent an error message is displayed from hsr_addr_subst_dest(). I do not know if the code is supposed to handle only v0 supervision frame. But the original code in 5.4 is correct. Maybe during refactoring later the developer just used the wrong structure. After the fix I was able to have nuttcp transmit throughput of 57 Mbps and receive throughput of 130 Mbps. The transmit throughput can be improved by tweaking the Cadence MAC driver, as it implemented a workaround that drastically reduces the TCP transmit performance.
The KSZ9477 HSR hardware duplicate drop function can be enabled all the time. If no HSR frame is used then that function is not activated. I do not know how the DSA driver handles the frame duplication though, as the tail tag processing does not know anything about the hardware configuration. The new PTP implementation may add a flag to signal HSR use. The hardware duplicate drop function only works for the same source MAC address. If lan1 and lan2 have different MAC address then the hardware cannot do anything.
Another hardware feature that needs to be enabled is self address filtering so that a broadcast frame will not be put in a loop. Before the DSA API has a set_address function so the DSA driver can program that in the switch, but that function was removed. There is a new WoL implementation which uses the device tree to specify a MAC address so that address can be programmed to the switch. That allows enabling the self address filtering function. The original set_address API is better, but the DSA maintainers may not want to add that function back.

from evb-ksz9477.

lmajewski avatar lmajewski commented on August 12, 2024

Are you using the KSZ9477 evaluation board with SAMA5D3? I do not see the link issues you mentioned. And they should not be related to the HSR driver. Are you connecting the switch to some other switch and somehow the PHY settings are not compatible and there are link issues?

I'm using the KSZ9477-EVB board with SAMA5D3 (KSZ9477-EVB)

I connect the cables to Port1 and Port2 between two KSZ9477-EVB boards. The driver with HSR from 4.9 (and proper configuration passed when kernel boots: console=ttyS0,115200 earlyprintk root=/dev/mmcblk0p2 rw rootwait spi-ksz9897.multi_dev=1 spi-ksz9897.eth1_ports=3 spi-ksz9897.eth1_vlan=0x7e spi-ksz9897.eth1_proto=hsr spi-ksz9897.eth2_vlan=0x7f spi-ksz9897.eth2_proto=redbox) works as expected.

I've found that the HSR frames have tag with 0x3 - so yes, KSZ9477 duplicates frames in HW. In struct ksz_hsr_info the info->member is set to 3.

When I use the same HW setup with 6.5-rcX kernels on both KSZ9477-EVB boards the very frequent link up/down are shown - hence I guess that this is issue with the kernel.

On the other hand - when 4.9 (with non-DSA HSR driver) is used on one KSZ9477-EVB board and 6.5-rc7 on the other - there are no issues with HSR.

The HSR driver bug is still there in linux 6.5.0-rc6. It cannot decode the supervisor frame correctly and so every time a unicast frame is sent an error message is displayed from hsr_addr_subst_dest().

When I connect two KSZ9477-EVB with running v6.5-rc7 the link up/down issue is present also with:
ksz-switch spi1.0 lan1: hsr_addr_subst_dest: Unknown node

I do not know if the code is supposed to handle only v0 supervision frame. But the original code in 5.4 is correct.

I will test is 5.4 is working correctly with the KSZ9477-EVB board.

After the fix I was able to have nuttcp transmit throughput of 57 Mbps and receive throughput of 130 Mbps. The transmit throughput can be improved by tweaking the Cadence MAC driver, as it implemented a workaround that drastically reduces the TCP transmit performance.

Do you plan to publish those fixes ? (to kernel mailing list or some web available Microchip repository)

The KSZ9477 HSR hardware duplicate drop function can be enabled all the time. If no HSR frame is used then that function is not activated.

It is set by default in Global HSR AME Control Register 0 (0x0644) - bit 7 (HSR_DUPLICATE_DISCARD_ENB).

The hardware duplicate drop function only works for the same source MAC address. If lan1 and lan2 have different MAC address then the hardware cannot do anything.

Thanks for clarification - then I must set the same MAC address for lan1 and lan2.

Another hardware feature that needs to be enabled is self address filtering so that a broadcast frame will not be put in a loop.

I will look on this more closely - thanks for pointing this out.

from evb-ksz9477.

triha2work avatar triha2work commented on August 12, 2024

I finally saw the link problem you mentioned. It is so bad the board is unusable. You should use 6.1 until the problem is fixed. It is caused by EEE and a special DSP PHY register setting. The latest kernel code enables EEE after the main PHY driver thinks EEE support is there. The workaround is to clear the EEE register first before the main PHY driver reads it. An alternative is to the use the non-DSA driver in another board which has EEE disabled so that the link problem does not happen.
For the HSR driver the wrong code works for v0. Maybe that is why the problem was not caught before. The updated code works for both.

from evb-ksz9477.

lmajewski avatar lmajewski commented on August 12, 2024

I finally saw the link problem you mentioned. It is so bad the board is unusable. You should use 6.1 until the problem is fixed.

I would prefer to use 6.5-rc7 and then manually disable EEE via ethtool:
ethtool --show-eee lan1
ethtool --show-eee lan2
ethtool --set-eee lan1 eee off
ethtool --set-eee lan2 eee off

This fixes the problem with Link Stability (without the need to modify driver).

Exactly this issue is mentioned in errata Module 4.

I've prepared proper patches for mainline kernel.

from evb-ksz9477.

lmajewski avatar lmajewski commented on August 12, 2024

Another hardware feature that needs to be enabled is self address filtering so that a broadcast frame will not be put in a loop.

With this one I'm a bit puzzled. The AN3474 note in PREVENTING PACKET LOOP IN THE RING BY SELF-ADDRESS FILTERING paragraph mentions about Port Control 2 register's bit 3.

Unfortunately, in the data sheet I cannot find it. It looks like it is not documented.

In the sources I've found:
./drivers/net/dsa/microchip/ksz9477_reg.h:1233:#define REG_PORT_MAC_CTRL_2 0x0402
but it is not referenced in the 4.9 kernel (and newest main line) at all.
Moreover, in the 'data sheet' only Port MAC Control 1 Register is described in point 5.2.5.2.

Is this the correct register to set bit 3 in it to enable per port self-filtering ?
If yes - could you point me where I can find description of this register?

I'm asking as (according to AN3474) to have this feature enabled, I do need to enable the self MAC filtering globally (this I've found) AND per HSR port (this I'm looking for).

from evb-ksz9477.

triha2work avatar triha2work commented on August 12, 2024

Maybe you got tired after reading all the document. The AN3474 note states correctly the names that are in the datasheet. Switch Lookup Engine Control 1 Register is 0x0311. Port Control 2 Register is 0xNB00. The datasheet has links in the descriptions of these registers so you can jump between them.
This feature is generic switch feature and is not a part of HSR design.

from evb-ksz9477.

lmajewski avatar lmajewski commented on August 12, 2024

Ach..... Now I do see.

The Port Control 2 Register is defined as REG_PORT_LUE_CTRL in the 4.9 and newest main line.

Thanks for pointing this out.

from evb-ksz9477.

lmajewski avatar lmajewski commented on August 12, 2024

The patches now has been posted to ML (@triha2work you were added to receive them :-) )

The performance measurements:

  • HW - two KSZ9477-EVB boards (port1 and port2 used as HSR) with the same kernel v6.5-rc7
  • SW used:
    nuttcp -S --nofork
    nuttcp -vv -T 60 -r 192.168.0.2
    nuttcp -vv -T 60 -t 192.168.0.2
  • Code: KSZ9477-EVB Linux repository
  • Tested HSR v0 and v1
  • Results:
    With KSZ9477 offloading support added: RX: 100 Mbps TX: 98 Mbps
    With no offloading (last 4 patches removed) RX: 63 Mbps TX: 63 Mbps

from evb-ksz9477.

lmajewski avatar lmajewski commented on August 12, 2024

@triha2work - I'm wondering why KSZ9477 doesn't support in HW in any way forwarding HSR frames between switch ports? I've read somewhere that this may contribute to some slow down when many HSR aware devices work in a "ring".

Do you have any thoughts about this issue?

from evb-ksz9477.

triha2work avatar triha2work commented on August 12, 2024

Are you referring to the DSA driver behavior? The hardware and non-DSA driver do work correctly in forwarding HSR frames. The behavior of DSA driver is at initialization all ports act independently and requires a software bridge device to forward frames. After the ports are joined to the bridge they are linked to forward frames inside the switch. There is an indication to the bridge device to stop forwarding the frames itself.
I think the HSR join may need further implementation to link the ports just like joining a bridge. The HSR driver does have an indication to stop this forwarding itself.
There is a limitation of the hardware duplicate drop feature. It only works for same source MAC address and sequence id. When the number of nodes in the HSR network is even it will fail to activate for the node furthest from the node sending the frames. This happens when there are 2 nodes. For this reason it is better to always add a third switch, which can be a dump switch, to activate this duplicate drop feature when doing HSR demo.

from evb-ksz9477.

lmajewski avatar lmajewski commented on August 12, 2024

Are you referring to the DSA driver behavior? The hardware and non-DSA driver do work correctly in forwarding HSR frames. The behavior of DSA driver is at initialization all ports act independently and requires a software bridge device to forward frames. After the ports are joined to the bridge they are linked to forward frames inside the switch. There is an indication to the bridge device to stop forwarding the frames itself.

Yes, the frame forwarding can be setup in HW and then when bridging is used the DSA driver would offload it.

I think the HSR join may need further implementation to link the ports just like joining a bridge. The HSR driver does have an indication to stop this forwarding itself.

This is really interesting, as HSR driver has NETIF_F_HW_HSR_FWD flag, which can be enabled in the HSR core. I just need to dig where in 4.9 the implicit "bridging" is enabled for HSR ports.

There is a limitation of the hardware duplicate drop feature. It only works for same source MAC address and sequence id. When the number of nodes in the HSR network is even it will fail to activate for the node furthest from the node sending the frames. This happens when there are 2 nodes.

Why the further node? The App Note says that it shall work for more than 2 nodes. According to the note - the problem is when two packets arrive together at almost the same time.

Or is there any other problem?

For this reason it is better to always add a third switch, which can be a dump switch, to activate this duplicate drop feature when doing HSR demo.

Yes, the HSR works better with more than 2 HSR nodes.

from evb-ksz9477.

triha2work avatar triha2work commented on August 12, 2024

The switch forwarding is simply manipulating the port membership bits. The default value is 0x7f for 7-port switch. It is set to 0x21 for port 1; 0x22, port 2; 0x24, port 3; and so on at the beginning. When port 1 and 2 are joined their membership values becomes 0x23. When all 6 ports are joined the value goes back to 0x7f. In HSR the HSR port 1 and 2 have this membership value: 0x23.

from evb-ksz9477.

lmajewski avatar lmajewski commented on August 12, 2024

The switch forwarding is simply manipulating the port membership bits. The default value is 0x7f for 7-port switch. It is set to 0x21 for port 1; 0x22, port 2; 0x24, port 3; and so on at the beginning. When port 1 and 2 are joined their membership values becomes 0x23. When all 6 ports are joined the value goes back to 0x7f. In HSR the HSR port 1 and 2 have this membership value: 0x23.

When I call ksz9477_hsr_join then the REG_PORT_VLAN_MEMBERSHIP__4 value for both HSR ports is 0x20.
(the REG_SW_QM_CTRL__4 has correct value of 0x82).

I guess that both HSR ports (1,2) shall have the value of REG_PORT_VLAN_MEMBERSHIP__4 equal to 0x23 as you suggested (i.e. frames are passed between CPU port and HSR ports in KSZ9477 internal matrix).

Then the NETIF_F_HW_HSR_FWD net device capability flag could be set as frames from HSR port 1 are passed to CPU port and HSR port 2. In the current v6.5-rc7 Linux this flag disables promiscuity mode for HSR ports (1,2).

I will have more info regarding the performance impact (about frames passing) when I form the HSR mesh (with 3 devices).

from evb-ksz9477.

lmajewski avatar lmajewski commented on August 12, 2024

@triha2work - If I may ask about the RedBox setup for the HSR driver [*].

Why in the drivers/net/ethernet/micrel/ksz_hsr.c in prep_hsr() function the vlan (in KSZ9477 switch IC) is setup for RedBox access?

Please also correct me, but it looks like in your driver there is some kind of "bridge" HSR driver (L2), which decides (based on configuration) if the HSR frame shall be forwarded to port marked as RedBox one (e.g. sw_ins_hsr() at ksz_sw_9897.c).

Have you tried to upstream the HSR RedBox functionality to Linux mainline?

I've poked around mainline Linux sources and I did not spot any attempt of such work done yet.
The current status is to have hsr0 (for HSR ring) and lan3 (for "normal" ETH). Communication between them can be performed on L3 level without issues (with proper routing).

However, "bridging" those two interfaces on L2 level is not working out of the box.

Note:
[*] - "EVB KSZ9477 Software Setup Guide" (point 7.4)
setenv eth1_vlan 0x7e
setenv eth2_proto redbox
setenv eth2_vlan 0x7f

from evb-ksz9477.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.