
Comments (8)

bskinn commented on September 27, 2024

Hm, this is interesting.

What sort of outcome are you hoping for here? Is this a blocking problem, and you need a v2.12.1 that will pass on this RHEL/arch?

Is an update to 3.x not feasible for RHEL8?


pghmcfc commented on September 27, 2024

@bskinn, I'm not looking for any 2.x release. If the test as it stands doesn't make sense and gets changed in the 3.x series, I can make the same/equivalent change in the RHEL-8 version. If the test does make sense then I'll have to think again about what to do with it.

An update to 3.x isn't viable because (a) some of the dependencies on that platform are too old, and (b) we prefer to minimize the chances of breaking things with incompatible updates on RHEL platforms, and a major version update would just be too risky I think.


bskinn commented on September 27, 2024

Got it, @pghmcfc -- so you're looking for a policy decision on the strictness of the assertEqual, and whether a comparison instead of an equality check would suffice.

That'll be @bitprophet's call. We'll get back to you.


bitprophet commented on September 27, 2024

Thank you for the detailed report! 🙏🏻 High level: non-COTS platform support is always a weak spot for us, as with most unpaid OSS, and yeah, I lack my own mainframe. But if we can solve this easily, that'd be great!

FWIW any change I make for this would likely get committed to the 2.12 branch and merged upwards through 3.x, which I assume would work okay for you re: cherry-picking to the RHEL8 package.


FTR: the last time this line was changed was in the SHA2 support addition (#596), specifically in 26b11d0, which looks like it's trying to be slightly less hardcoded than before; but as you've found, it still makes too strict an assumption to hold on all platforms.

Also FTR, the primary low-level tomfoolery inside the test message is add_string, which in turn encapsulates string.encode() plus (via Message.add_int) struct.pack.
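
For reference, the wire format add_string produces is just a 4-byte big-endian length prefix followed by the raw bytes; a minimal sketch of the equivalent, outside of the Message class:

import struct

def add_string_equivalent(s: str) -> bytes:
    # Roughly what Message.add_string does: a big-endian uint32 length
    # (struct.pack, as in Message.add_int) followed by the encoded bytes.
    data = s.encode()
    return struct.pack(">I", len(data)) + data

# 1024 "x" characters become a 1028-byte string field before compression.
assert len(add_string_equivalent("x" * 1024)) == 1028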

This low-level byte juggling isn't my comfort zone, but having scanned this so far, my off-the-cuff thoughts are:

  • Is the struct.pack output differing between platforms, making these packets larger on s390x (which would explain the larger byte counts)? See the quick check after this list.
  • What about the string encoding? I'd hope not, but it's worth ruling out.
  • The other obvious cause could be that on s390x only, some other protocol messages are flying by in the background while the test is capturing byte counts; lots of possible culprits there.
    • You might want to take a closer look at the overall debug-level logs during this test and see if anything extra is being logged (e.g. re-keying, or whatever).
    • There is also a "please dump all packets in hex" setting buried in Transport that you could temporarily enable, if you were feeling especially daring.
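
Re: the first bullet, struct.pack with an explicit ">" (network byte order) format character is defined to produce identical output on every platform, so host endianness shouldn't be able to enter into it. A quick check:

import struct

# ">" forces big-endian byte order regardless of host architecture,
# so this holds on amd64 and s390x alike.
assert struct.pack(">I", 1024) == b"\x00\x00\x04\x00"

(And the hex-dump toggle mentioned above is Transport.set_hexdump(True), if memory serves.)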

If tugging on the above doesn't unravel anything we can use, then, as noted, we already have a passing assert with a looser bound, so perhaps we should just band-aid it:

  • Delete the stricter assertion outright. I don't love this, since it works everywhere else.
  • Wrap it in a platform check, e.g. if sys.platform != "S390X": or equivalent (see the sketch after this list). This seems... fine? If this is the only change needed for s390x support, it's not the worst sin.
    • Or wrap the entire test in @mark.skipIf, but again, why throw the baby out with the bathwater.
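
One wrinkle with the platform check: on Linux, sys.platform is just "linux" regardless of CPU architecture, so detecting s390x actually needs platform.machine(). A rough sketch of both flavors (names are illustrative, and I'm assuming pytest-style marks):

import platform

import pytest

# sys.platform reports the OS ("linux"), not the CPU, so use
# platform.machine() to detect s390x specifically.
ON_S390X = platform.machine() == "s390x"

@pytest.mark.skipif(ON_S390X, reason="strict packet-size math doesn't hold on s390x")
def test_compression_packet_size():
    ...  # the existing assertions, unchanged

# Or guard just the strict assertion inside the existing test:
#     if not ON_S390X:
#         self.assertEqual(16 + block_size + mac_size, bytes2 - bytes)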


pghmcfc commented on September 27, 2024

I can try some of the above things, but first I'd like to understand the test as it stands. The assertion is this:

self.assertEqual(16 + block_size + mac_size, bytes2 - bytes)

As I understand it, bytes is the byte count prior to sending the 1024 "x" characters, and bytes2 is the byte count after sending those characters. So the difference (bytes2 - bytes) should be the compressed size of those 1024 characters. This is already checked to be less than 1024, as a basic compression-functionality test, in the assertion prior to the failing one.

This compressed data size is then being compared with 16+block_size+mac_size, and I don't really understand how this is supposed to relate to the compressed data size. Does this actually work if different cipher/mac algorithms are used and their block/mac sizes are different?
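
For concreteness, plugging in what I believe this test negotiates (aes128-ctr, so a 16-byte block, and hmac-sha2-256, so a 32-byte MAC -- an assumption on my part, I haven't verified the negotiation):

# Assuming aes128-ctr (16-byte block) and hmac-sha2-256 (32-byte MAC):
block_size = 16
mac_size = 32
expected = 16 + block_size + mac_size  # == 64 bytes

In other words, the test effectively hardcodes an exact wire size for this one packet.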

One more data point: the zlib implementation on s390x does a lot in hardware rather than software, which might make a difference, though this problem does not manifest on Fedora/s390x.


bitprophet commented on September 27, 2024

We'd have to pick the brain of the original committer to see what they were doing, but I just did a lot of low-level snooping and RFC reading (pro tip: don't do this if it's not your forte and you're also low on blood sugar 😵), and yes, the math in question is tied somewhat to the default cipher and can break if you choose one with a different block size or other properties.

(Granted, the only non-16-byte blocksize available to us right now is 3DES, but hey. I did do that, and the test did break, for math reasons I won't replicate here.)

Having dug into this, it's no clearer to me why s390x would arrive at an exactly-twice-as-large packet size, unless its zlib implementation is somehow worse despite being partly hardware-offloaded. On amd64, the result of compressing the protocol message (which is the 1024 x's plus another ~5 bytes of headers) is merely 20 bytes, and the rest of the 2×16-byte wire packet (sans MAC) is more headers and then padding up to the block size. So if your 128 bytes is still the same single message, it implies a payload in the neighborhood of 120-ish bytes. I still suspect it's something else.
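
For anyone wanting to reproduce the amd64 measurement, here's a rough approximation of the compression step: streaming zlib with a full flush per message, which is roughly what paramiko's ZlibCompressor does (the 5-byte prefix standing in for message headers is my own approximation):

import zlib

# Streaming compressor, flushed per message, approximating paramiko's
# ZlibCompressor; the 5 leading bytes stand in for message headers.
comp = zlib.compressobj()
payload = b"\x00" * 5 + b"x" * 1024
out = comp.compress(payload) + comp.flush(zlib.Z_FULL_FLUSH)
print(len(out))  # ~20 bytes with stock software zlib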


Right now my inclination is to trim this test back to just "look, if compression was broken, the packet would be >1024 bytes because the payload alone would be 1024 bytes and then we'd have a bunch of other bytes on top of that".
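
In code, that means keeping only the looser assertion the test already has, and dropping the strict one:

self.assertTrue(bytes2 - bytes < 1024)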

Adapting the test to be truly accurate regardless of cipher blocksize would essentially require porting the entire Packetizer._build_packet body into the test (fragile, slightly complicated) and that doesn't seem like it gains us much - especially as it would still break for S390X.

I could do that, document it more clearly, and also wrap it in the sys.platform check; but that seems like a handful of extra lines for, again, no real gain. The likelihood of the compression functionality somehow breaking in a manner that "only partly compresses", resulting in a falsely-passing test because it's still <1024 bytes, seems vanishingly small, especially given how dead simple that functionality is (it's literally just two tiny zlib method calls).


bitprophet commented on September 27, 2024

@pghmcfc Please see the commit I just pushed in a branch for this. (It's failing CI because of codecov; ignore that. It's based off the 2.12 branch, which is now very old.)

I'm still really curious about what's going on in your case; when I stop and think about it, platform architecture should not matter for the exact single packet in question. Otherwise, you couldn't ssh to a 64-bit system from a 32-bit one, or whatever. Even if it's somehow zlib's fault, again, arriving at a different pile of bytes on s390x and sending it over the wire to an amd64 system would result in a failure to cleanly decompress.

I would still place a small bit of money on "the packet is being sent twice for some reason, or rekeying is happening, or whatever", because nothing else makes much sense.

Or it's a bug in the test harness that is only being triggered on this specific platform? 🤔


pghmcfc commented on September 27, 2024

@bitprophet Your commit does make the test pass -- not unexpectedly, since it's equivalent to the first part of the original test.

It looks like the compression level makes a difference. I tweaked the zlib compression code to use different compression levels, and this is what I got:

Compression level  EL8/x86_64 start_bytes  EL8/x86_64 sent_bytes  EL8/s390x start_bytes  EL8/s390x sent_bytes
-1                 336                     64                     528                    128
 0                 352                     1088                   352                    1088
 1                 336                     64                     528                    128
...
 6                 336                     64                     528                    128
 7                 336                     64                     336                    64
...
 9                 336                     64                     336                    64

The default compression level (-1) is equivalent to level 6. If it had been equivalent to level 7, the test would have passed as-is.

Looking at this, I don't think there's anything particularly bad going on; it's just that the zlib implementation on that platform produces bigger output at lower compression levels than other implementations do. That doesn't mean there's an interoperability issue, since different implementations can produce different, but equally valid, streams that decompress back to the original data correctly.
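
A quick way to convince yourself of that last point (a generic illustration, not s390x-specific):

import zlib

data = bytes(range(256)) * 64  # 16 KiB of mixed data

for level in (1, 6, 9):
    stream = zlib.compress(data, level)
    # Stream sizes may differ by level and by implementation, but
    # every valid stream round-trips to the original bytes.
    assert zlib.decompress(stream) == data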

Another reference: nodejs/node/pull/44117

