Comments (6)

zakukai commented on August 19, 2024

I seem to be getting some rather odd behavior here.
I wanted to turn this example into a test I could use on different shells. Bash posed a problem because the "lastpipe" option (shopt -s lastpipe) was not set, so a "read" at the end of a pipeline runs in a subshell. So I rearranged the test to take input from a "here string":

file read-n_bytes_or_characters.sh

for ((i=1;i<=6;i++)); do
	IFS= read -rn "$i" a <<<'€€€€€€€€'
	printf '%s %q\n' "$i" "$a"
done

In this case, ksh seems to get it right:

$ ksh read-n_bytes_or_characters.sh
1 $'\u[20ac]'
2 $'\u[20ac]\u[20ac]'
3 $'\u[20ac]\u[20ac]\u[20ac]'
4 $'\u[20ac]\u[20ac]\u[20ac]\u[20ac]'
5 $'\u[20ac]\u[20ac]\u[20ac]\u[20ac]\u[20ac]'
6 $'\u[20ac]\u[20ac]\u[20ac]\u[20ac]\u[20ac]\u[20ac]'

But if I switch it back to "echo '€€€€€€€€' | read ..." then it fails again.
The "here string" feature is implemented using an unlinked temporary file, while the pipeline is (of course) a socket pair in ksh. It seems to go wrong when the input is a socket, tty, or pipe, but it works when the input is a file.

from ast.

stephane-chazelas commented on August 19, 2024

zakukai commented on August 19, 2024

I haven't quite nailed it down yet, but it looks like it comes down to this line:

ast/src/cmd/ksh93/bltins/read.c : 649
if((size -= x) > 0 && (up >= cur || z < 0) && ((flags & NN_FLAG) || z < 0 || m > c))
    continue;   // Otherwise, end the read loop

Basically, when using -n (that's N_FLAG, as opposed to -N, which is NN_FLAG), the code loops. On each iteration there are (x) characters remaining, so the implementation reads (x) bytes (since each character must take at least one byte). Then the newly completed multi-byte characters are counted (starting from the pointer (up) and extending to the pointer (cur)) to determine how many characters will be left on the next iteration. Usually some extra bytes have been read in, and those are carried over to the next iteration. But when the end of the last read coincides with a character boundary, it hits an edge case:
(up == cur): the end of the last complete character read coincides with the last byte read from the stream.
(z > 0): the call to mbsize(up) on line 644 returned nonzero, because *up == 0 (an end-of-string marker set at (cur) to prevent reading beyond the buffered data), and 0 is a single-byte character in UTF-8.
So the back half of the "if" condition, ((flags & NN_FLAG) || z < 0 || m > c), fails on this edge case, because the NN flag isn't set and z > 0. (I don't really understand (m > c) yet.)
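
To make the loop described above concrete, here is a minimal standalone sketch in C. It is not the ksh code: read_n_chars and utf8_len are hypothetical names, the input is assumed to be UTF-8, and character lengths are judged from the lead byte only. The point it illustrates is that when the bytes read so far end exactly on a character boundary, the loop keeps reading as long as characters are still owed, without inspecting the byte past the end of the data.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Byte length of a UTF-8 sequence, judged from its lead byte alone.
 * Invalid lead bytes are treated as one-byte characters (a simplification). */
static size_t utf8_len(unsigned char lead)
{
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 1;
}

/* Hypothetical helper: read up to nchars UTF-8 characters from fd into buf.
 * Returns the number of bytes that form complete characters, or -1 on a
 * read error. *got receives the number of complete characters read. */
ssize_t read_n_chars(int fd, size_t nchars, char *buf, size_t bufsize, size_t *got)
{
    size_t len = 0;             /* bytes stored in buf so far */
    size_t up = 0;              /* offset just past the last complete character */
    size_t remaining = nchars;  /* characters still owed to the caller */

    assert(buf != NULL && got != NULL);

    /* Every character takes at least one byte, so asking for `remaining`
     * bytes can never consume more than `remaining` characters.
     * The loop also stops if the next request would not fit in buf. */
    while (remaining > 0 && len + remaining <= bufsize) {
        ssize_t r = read(fd, buf + len, remaining);
        if (r < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        if (r == 0)
            break;              /* end of input */
        len += (size_t)r;

        /* Count the characters completed by the new bytes. */
        while (up < len) {
            size_t sz = utf8_len((unsigned char)buf[up]);
            if (up + sz > len)
                break;          /* partial character: carried to next read */
            up += sz;
            remaining--;
        }
        /* If up == len here, the data ended on a character boundary.
         * Nothing past buf[len - 1] is examined; the outer condition
         * alone decides whether to read again. */
    }
    *got = nchars - remaining;
    return (ssize_t)up;
}
```

Called on the read end of a pipe holding nine bytes (three '€' characters) with nchars set to 2, this sketch issues four short reads and returns 6 bytes, leaving the third character unread. Trailing bytes of an incomplete character at end of input are not counted in the return value.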

I'll try to work out the parts I'm not quite understanding yet and put a patch together.

siteshwar commented on August 19, 2024

ksh uses socket pairs instead of real pipes to implement shell pipelines. This has caused issues on multiple occasions. For example:

$ cat /etc/passwd | head -1 /dev/stdin
head: cannot open '/dev/stdin' for reading: No such device or address

I tried removing this code. It fixes the above issue, but it caused some of the io tests to fail. A number of the failing tests seem to be incorrect. For example, consider this test. It expects different outputs in different locales. This command:

$ (print -n foo; sleep 1; print -n bar) |read -n6 temp; echo $temp

should always output foobar. With the current development version, it gives foo, and the output may change with the locale. Switching to real pipes fixes this issue, but there are going to be other regressions if we switch to real pipes. We should do a deeper analysis of how this is going to affect scripts.

As for a fix for this bug, you can see my experiments in this branch.

DavidMorano commented on August 19, 2024

Of course some of you remember why KSH had to use sockets on some platforms: KSH occasionally needs to read only up to a new-line character, and so it needs to PEEK into the input byte stream. Although older SysV-type UNIX systems allow PEEKs on pipes, many newer systems (Linux?) do not. So KSH has to resort to sockets to get the PEEK capability on platforms that do not support PEEK for pipes. What KSH does nowadays, exactly, on each platform, I do not know. I would like to think that it only resorts to sockets for shell "pipes" when needed, but I do not know if this is the case any longer (it might be using sockets on all platforms now, for all I know).
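
As an illustration of the PEEK idea described above, here is a small sketch in C using the standard recv() call with MSG_PEEK on a stream socket. The helper name read_line_peek is hypothetical and this is not how ksh itself is written; it only shows how a reader can look ahead for a new-line and then consume exactly one line, leaving the rest of the stream for whoever reads next.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: consume one line (through the '\n') from a stream
 * socket. Returns the number of bytes consumed, 0 at end of input, or -1
 * on error. If no '\n' is visible yet, it consumes what was peeked. */
ssize_t read_line_peek(int sock, char *buf, size_t bufsize)
{
    /* Look at the queued data without removing it from the socket. */
    ssize_t avail = recv(sock, buf, bufsize, MSG_PEEK);
    if (avail <= 0)
        return avail;

    /* Find the end of the first line within the peeked bytes. */
    char *nl = memchr(buf, '\n', (size_t)avail);
    size_t want = nl ? (size_t)(nl - buf) + 1 : (size_t)avail;

    /* Now consume only that much; later bytes stay queued. */
    return read(sock, buf, want);
}
```

With a descriptor from socketpair(AF_UNIX, SOCK_STREAM, 0, sv) that has "one\ntwo\n" queued, two successive calls each return 4 bytes. Calling the same helper on an ordinary pipe descriptor fails with -1, because recv() requires a socket.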

krader1961 commented on August 19, 2024

See also issue #1186 which is almost certainly the same problem as this issue.
