Comments (10)

bitprophet commented on July 1, 2024

Think the problem is not nested invokes, but the following:

  • inv test -m blah
  • This runs one Invoke Python process
  • Which loads a task function
  • Which calls invoke.run.run(command)
  • invoke.run.run uses invoke.monkey.Popen to run the subprocess
  • invoke.monkey.Popen contains a copy of Python's Popen, patched to intercept all data from the subprocess
  • And writes that data to sys.stdout
  • The problem is the output in question is two different print-esque statements from the subprocess; one is print statements inside test code, the other is the output from the test runner (test names etc).
  • Why does running the subprocess by itself correctly separate the two into distinct lines, while Invoke's byte-by-byte printing interleaves them? Invoke shouldn't be using any threading; is Nose? EDIT: threading.active_count() is always 1, implying no threading. So WTF?

from invoke.

bitprophet commented on July 1, 2024

Double-checked threading, pid, etc., but the answer was of course simpler: Nose prints its info via stderr for some oddball reason (ugh, though apparently there is a semi-good reason for it), whereas I was using print and thus stdout, and that's why the interleaving was showing up.

So the underlying reason ends up being ye olde "byte vs line buffering". Will need to solve this for interactive solutions in some way (i.e. same as Fabric, re: prompts and other less-than-line outputs) to say nothing of stdout vs stderr mixing like this.

I'm still curious how "real" shell programs deal with this (printing stdout & stderr independently and yet being able to display prompts and similar). Has to be something I'm missing.

EDIT: The Linux manpages for the standard streams mention that stderr (which is supposed to be used for prompts and such) is typically unbuffered, while stdout is line buffered. This makes sense: as long as one of them is line buffered there will be no ugly conflicts. (Also useful: this exhaustive page on stdio buffering which says roughly the same thing.)

EDIT 2: Well, was partly wrong, can still have ugly conflicts, just only with stderr -- stdout always prints its whole lines, but it can still interrupt stderr's bytewise printing. So we're still doing something differently/wrong.

EDIT 3: Thought perhaps I'm totally overdoing things based on earlier experience; tried simply <stream>.write() with no .flush() to see if Python would buffer things correctly. Sadly, not the case, though that's still not surprising: Python should be doing what my latest implementation did anyway: line-buffering stdout but not stderr.
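
For reference, the line-buffering half of that can be sketched in a few lines. This is only an illustration, not Invoke's actual code: the LineBuffer name and its feed/flush methods are made up here. The idea is that stdout bytes go through feed() and only whole lines come back out, while stderr bytes would be written straight through.

```python
class LineBuffer:
    """Accumulate bytes and release only complete lines (hypothetical helper)."""

    def __init__(self):
        self._pending = b""

    def feed(self, data):
        """Add newly read bytes; return the list of lines completed so far."""
        self._pending += data
        # Everything before the last newline is complete; the tail stays pending.
        *lines, self._pending = self._pending.split(b"\n")
        return [line + b"\n" for line in lines]

    def flush(self):
        """Return whatever partial line is left, e.g. when the child exits."""
        rest, self._pending = self._pending, b""
        return rest
```

A relay would call feed() on each chunk read from the child's stdout and write the returned lines, then call flush() once at EOF so a trailing partial line (a prompt, say) isn't lost.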

bitprophet commented on July 1, 2024

Trying to organize thoughts before working on something else:

  • My sample case for this has been Invoke's own test suite (dogfooding), which runs on spec, therefore nose, therefore unittest.
    • Specifically, the real invocation I've been using is spec --tests=tests/some_module.py, which is wrapped up as inv -m some_module.
    • And I stick some basic print statements into some_module.py; that output is what gets mingled with the test-label/test-result stderr from spec/nose/unittest.
  • When run by itself, i.e. interacting only with my shell & terminal app/tty/etc, spec appears to line-buffer both streams -- they are never mixed.
  • When run via Invoke and its patched Popen, the output is always mixed; how badly depends on what buffering is being applied.
    • E.g. in the latest setup with stdout line buffered and stderr unbuffered, stdout lines are whole but appear "inside" stderr output;
    • With no buffering on either, they're wholly intermingled -- output for both streams is arriving from Popen's select.select effectively simultaneously.
  • So the big question is: what's causing stderr to (apparently) buffer linewise when spec runs by itself?
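
The unbuffered case in the list above can be reproduced with a small stand-alone relay. This is a rough sketch and not the patched Popen itself: relay() and the CHILD script are hypothetical names, it assumes Python 3 on a POSIX system (select on pipes), and it records chunks instead of printing them so the arrival order can be inspected.

```python
import os
import select
import subprocess
import sys

# Hypothetical child: alternates between stdout and stderr, flushing each write.
CHILD = r"""
import sys
for i in range(3):
    sys.stdout.write("out%d\n" % i); sys.stdout.flush()
    sys.stderr.write("err%d\n" % i); sys.stderr.flush()
"""


def relay(argv):
    """Run argv with both streams piped; return [(stream_name, byte), ...] in arrival order."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    names = {proc.stdout.fileno(): "stdout", proc.stderr.fileno(): "stderr"}
    open_fds = set(names)
    chunks = []
    while open_fds:
        ready, _, _ = select.select(list(open_fds), [], [])
        for fd in ready:
            data = os.read(fd, 1)  # one byte at a time, i.e. no buffering in the relay
            if data:
                chunks.append((names[fd], data))
            else:
                open_fds.discard(fd)  # EOF on this stream
    proc.stdout.close()
    proc.stderr.close()
    proc.wait()
    return chunks
```

Calling relay([sys.executable, "-c", CHILD]) and printing the stream names in order shows how the two streams arrive relative to each other; the exact interleaving depends on timing, but each stream's own bytes always stay in order.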

So far the only thing I haven't poked much at is the fact that libc and friends (i.e. what Python is typically the client of when it asks to write to these streams) apparently buffer differently depending on target: if targeting an interactive terminal they will sometimes line buffer when they would unbuffer (or vice versa -- not 100% sure) vs how they behave against a noninteractive program.


EDIT: Found this ML thread clarifying that my description above is backwards: Python/libc fully (block) buffer against pipes/noninteractive programs, and line buffer against ptys.
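
A quick way to check that claim from Python itself is to ask a child interpreter what it sees on its stdout, once through a pipe and once through a pty. This is a sketch under assumptions: Python 3 on POSIX, and the helper names (report_via_pipe, report_via_pty) are invented for the example. The child prints sys.stdout.isatty() and sys.stdout.line_buffering.

```python
import os
import pty
import subprocess
import sys

# The child reports whether stdout is a terminal and whether Python line-buffers it.
CHILD = "import sys; print(sys.stdout.isatty(), sys.stdout.line_buffering)"


def _env():
    # Drop PYTHONUNBUFFERED so the child makes its default buffering choice.
    return {k: v for k, v in os.environ.items() if k != "PYTHONUNBUFFERED"}


def report_via_pipe():
    done = subprocess.run([sys.executable, "-c", CHILD],
                          stdout=subprocess.PIPE, env=_env(), check=True)
    return done.stdout.decode().strip()


def report_via_pty():
    parent_fd, child_fd = pty.openpty()
    proc = subprocess.Popen([sys.executable, "-c", CHILD], stdout=child_fd, env=_env())
    os.close(child_fd)  # the parent only needs its own end
    data = b""
    try:
        while True:
            try:
                chunk = os.read(parent_fd, 1024)
            except OSError:
                break  # Linux raises EIO once the child side is closed
            if not chunk:
                break  # other platforms signal EOF instead
            data += chunk
    finally:
        os.close(parent_fd)
    proc.wait()
    return data.decode().strip()
```

Comparing the two return values should show the pipe case reporting no terminal and the pty case reporting a terminal, which is the switch that flips the buffering mode.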

A quick sanity test switching creation of my Popen instance to pass None for the stdout/stderr arguments instead of PIPE (again still with my Popen subclass doing straightforward writethroughs, no buffering or flushing) behaves exactly like spec in the shell and seems to prove the above. It's probably skipping my patched Popen's read/write junk too.


Question now becomes:

  • Is it still possible to capture-and-print while preserving the called program's buffering behavior re: ptys? E.g. what do other make-likes do? (Guessing many of them don't capture?) What does that sh/pbs lib do? etc.
  • How is "stderr is unbuffered so prompts work" squaring with this apparent "stderr is line buffering when hitting a pty" behavior? Is this a Python specific quirk (which I assume raw_input() and friends explicitly work around by flushing)?

bitprophet commented on July 1, 2024

  • Found another useful discussion which is very on-topic (though TBH most of the info is repeating bits and snatches I've seen elsewhere)
  • Also looked at tools like e.g. sarge, pexpect (still reviewing the latter)
  • Feels impossible to force useful "interactive style" buffering without using a real pty
    • Real ptys only work well on POSIX (Mac/Linux/Cygwin) -- not too heartbroken about this, esp if we can easily make the code path optional
    • As we've seen in Fabric, use of a pty means you lose the ability to tell stdout/stderr apart, full stop
  • Sounds like best route is to do similar to Fabric: make use of a pty an option, and present it as an interactive-noninteractive binary choice:
    • Do you care more about seeing (and possibly interacting with) the child process? Use a pty, and lose distinct stderr.
    • Do you care more about programmatically introspecting stderr by itself? Disable pty and you're good to go, but output may look garbled, if displayed.
    • Default should probably be to use a pty if we're truly aiming to be Make-like?

bitprophet commented on July 1, 2024

Also, re: using a pty, may be nontrivial, can't use subprocess with them so have to do some more low level os bullshit. EDIT: perhaps not?

Pexpect has that all tied up pretty neatly so I'll see whether it's possible to just use that as a client (it's MIT licensed), though not sure if it's worth the dependency.

May be worth ripping out just what we "really" need instead -- tho counterpoint is that folks frequently want expect-like behavior anyway, so why bother not using what's already there? Ends up only being worth borrowing chunkwise if their public API isn't usable for our needs.

Other downsides of pexpect: no public release since 2008, despite SVN having commits 2 weeks old; also, SVN...

bitprophet commented on July 1, 2024

Using the EDIT link above, actually got something working as intended; the trick is:

  • Create PTY pair
  • Hand child to a Popen object's stdout & stderr args
  • Poll Popen object to determine exit code / termination
  • select and os.read on parent PTY to pull out the text stream.
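
The four steps above can be sketched roughly as follows. This is a minimal illustration assuming Python 3 on POSIX, not the code that went into run: the run_with_pty name and its return shape are made up here. Note that stdout and stderr share the one pty, so they come back as a single merged stream.

```python
import os
import pty
import select
import subprocess
import sys


def run_with_pty(argv):
    """Run argv attached to a pty; return (exit_code, combined_output_bytes)."""
    parent_fd, child_fd = pty.openpty()           # create the PTY pair
    proc = subprocess.Popen(argv, stdout=child_fd, stderr=child_fd)
    os.close(child_fd)                            # parent keeps only its own end
    captured = []
    try:
        while True:
            # Check for exit *before* select, so an empty select after exit
            # really does mean there is nothing left to read.
            exited = proc.poll() is not None
            ready, _, _ = select.select([parent_fd], [], [], 0.05)
            if ready:
                try:
                    data = os.read(parent_fd, 1024)
                except OSError:
                    break                         # Linux: EIO once the child side closes
                if not data:
                    break                         # other platforms: plain EOF
                captured.append(data)
                # Real-time printing (or hiding) of `data` would happen here.
            elif exited:
                break
    finally:
        os.close(parent_fd)
    return proc.wait(), b"".join(captured)
```

Because the child sees a terminal, it line-buffers as it would in a shell; the pty also translates newlines to "\r\n", so captured output may need normalizing before comparison.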

Still need my subclassed Popen to handle the more programmatic use case (separate streams available, but w/ noninteractive style buffering), and in fact can hopefully find a way to move the above code (currently inside run) into that class (this way the core "run a process w/ hide-able output & realtime printing" behavior is in one spot w/ one API).

Never did inspect the details about using Pexpect; should do it at least quickly though I suspect I'd need to hack it for output hiding, and it would be a semi-optional dependency given it would not be used in the non-pty case (AFAIK it only uses a pty so it won't work for the split-streams use case.)

bitprophet commented on July 1, 2024

Pexpect has a moderate amount of useful stuff, re: expectations & controlling a tty (re: handling signals, EOFs, echo state, etc) but also has a lot of stuff we don't need (much of the file-like behavior, for example) and lacks things we do need (easy ability to turn output on/off during interaction -- it treats interaction as basically the stdout|stderr=None case for subprocess; naturally, no "I do want stderr, don't use a pty" case).

It's also still basically in the doldrums maintenance-wise:

  • half dozen commits in 2012
  • none in 2011
  • half dozen commits in 2010
  • a pittance in 2009
  • last release in 2008

Clearly, relying on it as an external dependency is a non-starter. So our options are either write our own stuff from the ground up, cribbing small bits/ideas as necessary, or keep an internal fork that we modify to suit our needs.

bitprophet commented on July 1, 2024

Turns out pexpect does work pretty well: interact has a filter option that we can use OOB to implement hiding & capturing (albeit via some slight scope abuse for the latter). So now I am leaning back towards using pexpect for the pty side of things, possibly vendorizing it given the state of upstream (I've been testing against master, should try the 2008 release).

Currently adding the pty option to run and really filling all this out for reals.

bitprophet commented on July 1, 2024

With some of the pty=True tests, realized that the way Nose/Spec capture stdout/err (@trap) doesn't work for subprocesses hooking up directly to the testing Python process' file descriptors.

So I need to choose:

  • Follow lead from some of the 'cli' tests and push these one level deeper, into a tasks module within _support that is invoked as run("inv testname") -- then introspect that result's stdout/err.
    • This makes things a bit harder to follow and requires opening even more subprocesses during testing than we do now.
    • But it's quick, known to work, requires no other changes.
  • Update Spec to optionally capture at the FD level (e.g. @trap(fds=True)).
    • Cleaner, tests remain as they are now, etc.
    • Requires code additions to Spec (I don't see any existing Nose plugins that provide this.)
      • Specifically, something along the lines of r, w = os.pipe(); os.dup2(w, sys.stdout.fileno()) + a thread to read out of r or similar (os.pipe() returns the read end first, so it's the write end that has to replace stdout). May also need to os.fdopen(sys.stdout.fileno(), 'w', 0) to obtain an OS level file handle on sys.stdout / change how it's currently opened?
  • Switch to py.test which has this functionality OOB.
    • Don't want to give up Spec functionality
    • Porting Spec to py.test probably a huge chunk of work
    • Not a fan of py.test's implementation (even harder to follow than nose internals)
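
For the second option (capturing at the FD level), the pipe-plus-thread idea could look something like the sketch below. It is an assumption-laden illustration in Python 3, not anything Spec provides: trap_fd is a hypothetical name. os.pipe() returns (read_end, write_end), the write end is dup2'd over the target descriptor, and a thread drains the read end so a chatty child can't fill the pipe and block.

```python
import os
import subprocess
import sys
import threading
from contextlib import contextmanager


@contextmanager
def trap_fd(fd=1):
    """Capture everything written to OS-level descriptor `fd`, including by subprocesses."""
    sys.stdout.flush()
    sys.stderr.flush()
    saved = os.dup(fd)                  # remember where fd pointed
    read_end, write_end = os.pipe()
    chunks = []

    def drain():
        while True:
            data = os.read(read_end, 4096)
            if not data:                # all write ends closed
                break
            chunks.append(data)

    os.dup2(write_end, fd)              # fd now feeds the pipe
    os.close(write_end)
    reader = threading.Thread(target=drain)
    reader.start()
    try:
        yield chunks                    # complete only after the block exits
    finally:
        sys.stdout.flush()              # push Python-level buffers through fd
        sys.stderr.flush()
        os.dup2(saved, fd)              # restore; this closes the last write end
        os.close(saved)
        reader.join()
        os.close(read_end)
```

Usage would be along the lines of `with trap_fd(1) as chunks: subprocess.call([...])`, then joining chunks after the block to get what the child wrote to the inherited stdout.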

bitprophet commented on July 1, 2024

Went with the first option for expediency; see #25 for improving the subprocess-y bits somewhat. We should end up with only 1-2 of these "call nested subprocesses" tests at which point there's definitely no point in doing more work to get FD capturing working.


And with that, all tests pass again, and more importantly, I gave pty=True to our internal tasks file and see "correct" test printing output again, at least in my trivial manual test case. Calling this done, jeez.
