
Comments (8)

bskinn commented on September 27, 2024

Hm, this is interesting.

What sort of outcome are you hoping for here? Is this a blocking problem, and you need a v2.12.1 that will pass on this RHEL/arch?

Is an update to 3.x not feasible for RHEL8?


pghmcfc commented on September 27, 2024

@bskinn, I'm not looking for any 2.x release. If the test as it stands doesn't make sense and gets changed in the 3.x series, I can make the same/equivalent change in the RHEL-8 version. If the test does make sense then I'll have to think again about what to do with it.

An update to 3.x isn't viable because (a) some of the dependencies on that platform are too old, and (b) we prefer to minimize the chances of breaking things with incompatible updates on RHEL platforms, and a major version update would just be too risky I think.


bskinn commented on September 27, 2024

Got it, @pghmcfc -- so you're looking for a policy decision on the strictness of the assertEqual, and whether a comparison instead of an equality check would suffice.

That'll be @bitprophet's call. We'll get back to you.


bitprophet commented on September 27, 2024

Thank you for the detailed report! 🙏🏻 High level: non-COTS platform support is always a weak spot for us, as with most unpaid OSS, and yeah, I lack my own mainframe. But if we can solve this easily, that'd be great!

FWIW any change I make for this would likely get committed to the 2.12 branch and merged upwards through 3.x, which I assume would work okay for you re: cherry-picking to the RHEL8 package.


FTR: the last time this line was changed was in the SHA2 support addition (#596), specifically in 26b11d0, which looks like it's trying to be slightly less hardcoded than before; but as you've found, it still makes too strict an assumption to hold on all platforms.

Also FTR, the primary low-level tomfoolery inside the test message is add_string, which in turn encapsulates string.encode() plus (via Message.add_int) struct.pack.
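
For reference, the wire format add_string produces is just a 4-byte big-endian length prefix followed by the raw bytes; a minimal sketch of the equivalent, outside of the Message class:

import struct

def add_string_equivalent(s: str) -> bytes:
    # Roughly what Message.add_string does: a big-endian uint32 length
    # (struct.pack, as in Message.add_int) followed by the encoded bytes.
    data = s.encode()
    return struct.pack(">I", len(data)) + data

# 1024 "x" characters become a 1028-byte string field before compression.
assert len(add_string_equivalent("x" * 1024)) == 1028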

This low-level byte juggling isn't my comfort zone, but having scanned this so far, my off-the-cuff thoughts are:

  • Is the struct.pack output differing between platforms, making these packets larger on s390x (which would explain the larger byte counts)? See the quick check after this list.
  • What about the string encoding? I'd hope not, but it's worth ruling out.
  • The other obvious cause could be that on s390x only, some other protocol messages are flying by in the background while the test is capturing byte counts; lots of possible culprits there.
    • You might want to take a closer look at the overall debug-level logs during this test and see if anything extra is being logged (e.g. re-keying, or whatever).
    • There is also a "please dump all packets in hex" setting buried in Transport that you could temporarily enable, if you were feeling especially daring.
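
Re: the first bullet, struct.pack with an explicit ">" (network byte order) format character is defined to produce identical output on every platform, so host endianness shouldn't be able to enter into it. A quick check:

import struct

# ">" forces big-endian byte order regardless of host architecture,
# so this holds on amd64 and s390x alike.
assert struct.pack(">I", 1024) == b"\x00\x00\x04\x00"

(And the hex-dump toggle mentioned above is Transport.set_hexdump(True), if memory serves.)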

If tugging on the above doesn't unravel anything we can use, then, as noted, we already have a passing assert with a looser bound, so perhaps we should just band-aid it:

  • Delete the stricter assertion outright. I don't love this, since it works everywhere else.
  • Wrap it in a platform check, e.g. if sys.platform != "S390X": or equivalent (see the sketch after this list). This seems... fine? If this is the only change needed for s390x support, it's not the worst sin.
    • Or wrap the entire test in @mark.skipIf, but again, why throw the baby out with the bathwater.
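
One wrinkle with the platform check: on Linux, sys.platform is just "linux" regardless of CPU architecture, so detecting s390x actually needs platform.machine(). A rough sketch of both flavors (names are illustrative, and I'm assuming pytest-style marks):

import platform

import pytest

# sys.platform reports the OS ("linux"), not the CPU, so use
# platform.machine() to detect s390x specifically.
ON_S390X = platform.machine() == "s390x"

@pytest.mark.skipif(ON_S390X, reason="strict packet-size math doesn't hold on s390x")
def test_compression_packet_size():
    ...  # the existing assertions, unchanged

# Or guard just the strict assertion inside the existing test:
#     if not ON_S390X:
#         self.assertEqual(16 + block_size + mac_size, bytes2 - bytes)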


pghmcfc commented on September 27, 2024

I can try some of the above things, but first I'd like to understand the test as it stands. The assertion is this:

self.assertEqual(16 + block_size + mac_size, bytes2 - bytes)

As I understand it, bytes is the byte count prior to sending the 1024 "x" characters, and bytes2 is the byte count after sending those characters. So the difference (bytes2 - bytes) should be the compressed size of those 1024 characters. This is already checked to be less than 1024, as a basic compression-functionality test, in the assertion prior to the failing one.

This compressed data size is then being compared with 16+block_size+mac_size, and I don't really understand how this is supposed to relate to the compressed data size. Does this actually work if different cipher/mac algorithms are used and their block/mac sizes are different?
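
For concreteness, plugging in what I believe this test negotiates (aes128-ctr, so a 16-byte block, and hmac-sha2-256, so a 32-byte MAC -- an assumption on my part, I haven't verified the negotiation):

# Assuming aes128-ctr (16-byte block) and hmac-sha2-256 (32-byte MAC):
block_size = 16
mac_size = 32
expected = 16 + block_size + mac_size  # == 64 bytes

In other words, the test effectively hardcodes an exact wire size for this one packet.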

One more data point: the zlib implementation on s390x does a lot in hardware rather than software, which might make a difference, though this problem does not manifest on Fedora/s390x.


bitprophet commented on September 27, 2024

We'd have to pick the brain of the original committer to see what they were doing, but I just did a lot of low-level snooping and RFC reading (pro tip: don't do this if it's not your forte and you're also low on blood sugar 😵), and yes, the math in question is tied somewhat to the default cipher and can break if you choose one with a different block size or other properties.

(Granted, the only non-16-byte blocksize available to us right now is 3DES, but hey. I did do that, and the test did break, for math reasons I won't replicate here.)

Having dug into this, it's no clearer to me why s390x would arrive at an exactly-twice-as-large packet size, unless its zlib implementation is somehow worse despite being partly hardware-offloaded. On amd64, the result of compressing the protocol message (which is the 1024 x's plus another ~5 bytes of headers) is merely 20 bytes, and the rest of the 2×16-byte wire packet (sans MAC) is more headers and then padding up to the block size. So if your 128 bytes is still the same single message, it implies a payload in the neighborhood of 120-ish bytes. I still suspect it's something else.
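
For anyone wanting to reproduce the amd64 measurement, here's a rough approximation of the compression step: streaming zlib with a full flush per message, which is roughly what paramiko's ZlibCompressor does (the 5-byte prefix standing in for message headers is my own approximation):

import zlib

# Streaming compressor, flushed per message, approximating paramiko's
# ZlibCompressor; the 5 leading bytes stand in for message headers.
comp = zlib.compressobj()
payload = b"\x00" * 5 + b"x" * 1024
out = comp.compress(payload) + comp.flush(zlib.Z_FULL_FLUSH)
print(len(out))  # ~20 bytes with stock software zlib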


Right now my inclination is to trim this test back to just "look, if compression was broken, the packet would be >1024 bytes because the payload alone would be 1024 bytes and then we'd have a bunch of other bytes on top of that".
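
In code, that means keeping only the looser assertion the test already has, and dropping the strict one:

self.assertTrue(bytes2 - bytes < 1024)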

Adapting the test to be truly accurate regardless of cipher blocksize would essentially require porting the entire Packetizer._build_packet body into the test (fragile, slightly complicated) and that doesn't seem like it gains us much - especially as it would still break for S390X.

I could do that, document it more clearly, and also wrap it in the sys.platform check; but that seems like a handful of extra lines for, again, no real gain. The likelihood of the compression functionality somehow breaking in a manner that "only partly compresses", resulting in a falsely-passing test because it's still <1024 bytes, seems vanishingly small, especially given how dead simple that functionality is (it's literally just two tiny zlib method calls).


bitprophet commented on September 27, 2024

@pghmcfc Please see the commit I just pushed in a branch for this. (It's failing CI because of codecov; ignore that. It's based off the 2.12 branch, which is now very old.)

I'm still really curious about what's going on in your case; when I stop and think about it, platform architecture should not matter for the exact single packet in question. Otherwise, you couldn't ssh to a 64-bit system from a 32-bit one, or whatever. Even if it's somehow zlib's fault, again, arriving at a different pile of bytes on s390x and sending it over the wire to an amd64 system would result in a failure to cleanly decompress.

I would still place a small bit of money on "the packet is being sent twice for some reason, or rekeying is happening, or whatever", because nothing else makes much sense.

Or it's a bug in the test harness that is only being triggered on this specific platform? 🤔


pghmcfc commented on September 27, 2024

@bitprophet Your commit does make the test pass -- not unexpectedly, since it's equivalent to the first part of the original test.

It looks like the compression level makes a difference. I tweaked the zlib compression code to use different compression levels, and this is what I got:

Compression level  EL8/x86_64 start_bytes  EL8/x86_64 sent_bytes  EL8/s390x start_bytes  EL8/s390x sent_bytes
-1                 336                     64                     528                    128
 0                 352                     1088                   352                    1088
 1                 336                     64                     528                    128
...
 6                 336                     64                     528                    128
 7                 336                     64                     336                    64
...
 9                 336                     64                     336                    64

The default compression level (-1) is equivalent to level 6. If it had been equivalent to level 7, the test would have passed as-is.

Looking at this, I don't think there's anything particularly bad going on; it's just that the zlib implementation on that platform produces bigger output at lower compression levels than other implementations do. That doesn't mean there's an interoperability issue, since different implementations can produce different, but equally valid, streams that decompress back to the original data correctly.
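
A quick way to convince yourself of that last point (a generic illustration, not s390x-specific):

import zlib

data = bytes(range(256)) * 64  # 16 KiB of mixed data

for level in (1, 6, 9):
    stream = zlib.compress(data, level)
    # Stream sizes may differ by level and by implementation, but
    # every valid stream round-trips to the original bytes.
    assert zlib.decompress(stream) == data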

Another reference: nodejs/node/pull/44117

