ast's Introduction

AST

This is the AT&T Software Technology (AST) toolkit from AT&T Research. It includes many tools and libraries, like KSH, NMAKE, SFIO, VMALLOC, VCODEX, etc. It also includes more efficient replacements for a lot of the POSIX tools. It was designed to be portable across many UNIX systems and also works under UWIN on Microsoft Windows (see UWIN repo on GitHub under att/uwin).

ksh93u+ and v-

This repo contains the ksh93u+ and ksh93v- versions of KSH.

  • ksh93u+, the master branch, was the last version released by the main AST authors in 2012, while they were at AT&T. It also has some later build fixes but it is not actively maintained.
  • ksh93v-, the ksh93v tag, contains contributions from the main authors through 2014 (after they left AT&T) and is considered less stable.

Please search the web for forks of this repo (or check the Network graph on GitHub) if you are looking for an actively maintained version of ksh.

Build

This software is used to build itself, using NMAKE. After cloning this repo, cd to its top directory and run:

./bin/package make

Almost all the tools in this package (including the bin/package script) are self-documenting; run a tool with --man (or --html) to see its man page.

(If you were used to the old AST packaging mechanism, on www.research.att.com, this repo is equivalent to downloading the INIT and ast-open packages and running: ./bin/package read on them).

ast's People

Contributors

gordonwoodhull, lkoutsofios, siteshwar

ast's Issues

Wrong syntax for the "suspend" alias in ksh93

In ksh93 (and mksh and ksh88), suspend is defined as:

suspend='kill -s STOP $$'

Leaving a parameter expansion unquoted asks the shell to perform word splitting and globbing on its result.

$ echo "$$"
24725
$ IFS=7
$ suspend
kill: 24: permission denied
kill: 25: permission denied

It should be:

alias suspend='kill -s STOP "$$"'
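The splitting can be reproduced without touching the real alias. The following is a minimal sketch in portable shell; the variable pid is a stand-in for $$ (using the value from the transcript above) and count_args is a hypothetical helper introduced only for this illustration.

```shell
# Hypothetical helper: print how many arguments it received.
count_args() { echo "$#"; }

pid=24725          # stand-in for "$$", value taken from the report above
old_ifs=$IFS
IFS=7              # the digit 7 now acts as a field separator

unquoted=$(count_args $pid)      # split at the "7" into 24 and 25
quoted=$(count_args "$pid")      # quotes suppress splitting: one word

IFS=$old_ifs
echo "unquoted: $unquoted, quoted: $quoted"
```

With the expansion quoted, kill receives the process ID as a single argument whatever IFS contains.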

nmake needs to be built with -D_FORTIFY_SOURCE=0 on MacOSX or buffer overlap protection kills it

While trying to compile ksh on OS X El Capitan using the AST build environment, I get "Abort trap: 6" messages as soon as nmake starts processing the ksh sources:

$ bin/package make ksh93 SHELL=sh

package: initialize the /Users/maus/Documents/projekte/ksh-beta/arch/darwin.i386 view
package: update /Users/maus/Documents/projekte/ksh-beta/arch/darwin.i386/bin/cc
package: update /Users/maus/Documents/projekte/ksh-beta/arch/darwin.i386/bin/ldd
package: update /Users/maus/Documents/projekte/ksh-beta/arch/darwin.i386/lib/probe/C/make/probe
package: update /Users/maus/Documents/projekte/ksh-beta/arch/darwin.i386/bin/mamake
[...]
probing C language processor /Users/maus/Documents/projekte/ksh-beta/arch/darwin.i386/bin/cc for make information
++ set -
cmd/INIT:
sh: line 114: 68354 Abort trap: 6           /Users/maus/Documents/projekte/ksh-beta/arch/darwin.i386/bin/nmake --ignorelock --keepgoing --errorid=cmd/INIT .RWD.=cmd/INIT RECURSEROOT=.. believe
make: *** termination code 6 making cmd/INIT

The MacOSX system log reports a crash within nmake:

Process:               nmake [68354]
Path:                  /Users/USER/Documents/*/nmake
Identifier:            nmake
Version:               0
Code Type:             X86 (Native)
Parent Process:        ??? [68353]
Responsible:           nmake [68354]
User ID:               501

Date/Time:             2016-03-09 18:24:05.117 +0100
OS Version:            Mac OS X 10.11.4 (15E49a)
Report Version:        11
Anonymous UUID:        EDEE8ECF-E07E-787D-E6DF-2B5B6B158D92


Time Awake Since Boot: 1200000 seconds

System Integrity Protection: disabled

Crashed Thread:        0  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_CRASH (SIGABRT)
Exception Codes:       0x0000000000000000, 0x0000000000000000
Exception Note:        EXC_CORPSE_NOTIFY

Application Specific Information:
detected source and destination buffer overlap

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib          0x9e3dd572 __pthread_kill + 10
1   libsystem_pthread.dylib         0x92438654 pthread_kill + 101
2   libsystem_c.dylib               0x9a5dbd00 __abort + 187
3   libsystem_c.dylib               0x9a5dbc45 abort + 173
4   libsystem_c.dylib               0x9a5dbd7f abort_report_np + 82
5   libsystem_c.dylib               0x9a60aad1 __chk_fail + 54
6   libsystem_c.dylib               0x9a60aae8 __chk_fail_overlap + 23
7   libsystem_c.dylib               0x9a60ab23 __chk_overlap + 59
8   libsystem_c.dylib               0x9a60ad29 __strcpy_chk + 72
9   nmake                           0x000d5515 resetvar + 341
10  nmake                           0x000d4ca8 setvar + 1576
11  nmake                           0x000b16cf assignment + 1295
12  nmake                           0x000ab8e9 parse + 1433
13  nmake                           0x00064432 apply + 706
14  nmake                           0x000ade10 assertion + 224
15  nmake                           0x000ab8cb parse + 1403
16  nmake                           0x000baf2e readfp + 6590
17  nmake                           0x000b9197 readfile + 1335
18  nmake                           0x000884b2 main + 7938
19  libdyld.dylib                   0x95e2c6ad start + 1

This crash log suggests that the source and destination buffers of the strcpy() call in resetvar() overlap, so MacOSX terminates the nmake process, which produces the "Abort trap: 6" messages above.

Passing -D_FORTIFY_SOURCE=0 when calling bin/package make seems to work around this problem. However, the build then fails at a later point (probably because the buffer overlap still exists and is simply no longer reported):

$ bin/package make ksh93 SHELL=sh CCFLAGS=-D_FORTIFY_SOURCE=0

[...]
cpp: "/Users/maus/Documents/projekte/ksh-beta/src/cmd/ksh93/include/shell.h", line 172: cmd.h: cannot find include file
cpp: "FEATURE/dynamic", line 10: dlldefs.h: cannot find include file
cpp: "/Users/maus/Documents/projekte/ksh-beta/src/cmd/ksh93/include/shell.h", line 172: cmd.h: cannot find include file
cpp: "FEATURE/dynamic", line 10: dlldefs.h: cannot find include file
cpp: "/Users/maus/Documents/projekte/ksh-beta/src/cmd/ksh93/include/shell.h", line 172: cmd.h: cannot find include file
make [cmd/ksh93]: *** exit code 2 making cd_pwd.o
make [cmd/ksh93]: *** exit code 2 making cflow.o
make [cmd/ksh93]: *** exit code 1 making deparse.o
cpp: "/Users/maus/Documents/projekte/ksh-beta/src/cmd/ksh93/include/shell.h", line 172: cmd.h: cannot find include file
cpp: "/Users/maus/Documents/projekte/ksh-beta/src/cmd/ksh93/include/shell.h", line 172: cmd.h: cannot find include file
cpp: "FEATURE/dynamic", line 10: dlldefs.h: cannot find include file
[...]

I am using MacOSX 10.11.4 with Xcode 7.2.1 (7C1002) ending up in Apple LLVM version 7.0.2 (clang-700.1.81) being used for compilation.

variable exported in function in a subshell is also visible in a different subshell

I've observed this issue on a Solaris 11 system with all the ksh93 versions (the alpha, beta and master versions) as well as on Ubuntu 14 (ksh 2012).

Here's a testcase which reproduces the issue.

# cat test1.ksh
#!/usr/bin/ksh

function proxy { 
         export MYVAR="bla" 
        child 
        unset MYVAR 
} 

function child { 
        echo "MYVAR=$MYVAR" >> /var/tmp/debug.log 
} 

function test { 
        $(child) 
        $(proxy) 
        $(child) 
} 

rm /var/tmp/debug.log 
test 
cat /var/tmp/debug.log 
# ./test1.ksh

MYVAR= 
MYVAR=bla 
MYVAR=bla <------------------------------ this should not happen 
# 

The following patch which removes an optimization fixes the issue. If there is any other patch, please let me know.

--- INIT.2012-08-01.old/src/cmd/ksh93/sh/subshell.c     2016-03-01 04:01:06.513890578 -0800
+++ INIT.2012-08-01/src/cmd/ksh93/sh/subshell.c  2016-03-01 04:02:43.617872391 -0800
@@ -260,9 +260,6 @@
        shp = sp->shp;
        dp = shp->var_tree;
-       /* don't bother to save if in newer scope */
-       if(sp->var!=shp->var_tree && sp->var!=shp->var_base && shp->last_root==shp->var_tree)
-               return(np);
      if((ap=nv_arrayptr(np)) && (mp=nv_opensub(np)))
      {
              shp->last_root = ap->table;

The issue seems to be caused by the following optimization in the sh_assignok function in sh/subshell.c.


 /* don't bother to save if in newer scope */
 if(sp->var!=shp->var_tree && shp->last_root==shp->var_tree) 
        return(np); 

This optimization prevents saving the variables in situations where they don't need to be restored. If we remove the optimization, the issue will go away.
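The expected isolation can also be written as a self-checking script. This is only a sketch: it uses POSIX-style function definitions and captures output in variables instead of appending to /var/tmp/debug.log, so it is a variation on the test case above rather than the reporter's exact reproducer.

```shell
# POSIX-style definitions so the sketch runs in any sh-compatible shell.
child() { echo "MYVAR=${MYVAR-}"; }
proxy() { export MYVAR=bla; child; unset MYVAR; }

unset MYVAR
first=$(child)     # variable not set yet
second=$(proxy)    # set and exported inside this subshell only
third=$(child)     # a fresh subshell: the export must not be visible here

printf '%s\n' "$first" "$second" "$third"
```

A shell that isolates the subshells correctly prints an empty MYVAR on the first and third lines and bla only on the second.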

ksh93 echoing wrong output due to missing EIO handling during logout

Here's a reproducible testcase on a Solaris11 host running ksh93u+(2012-08-01).
$ cat a.sh
#!/bin/sh

AAA="aaa"
echo 'insert character'
BBB=`echo ${AAA} | sed "s/aaa/bbb/g"`
logger "variable BBB = ${BBB}"

$ cat t.sh
#!/bin/ksh

sleep 10
/bin/ksh ./a.sh
exit 0

$

$ ./t.sh

The expected result is:

Apr 9 12:43:34 lab user: [ID 702911 user.notice] variable BBB = bbb

because variable "BBB" is supposed to be set to 'bbb' in a.sh.

But if the parent shell is terminated, the variable is wrongly set.

user@xxxxx$ telnet lab
...
$ ./t.sh & <--- Run t.sh in background.
[1] 2067
$ logout <--- CTRL + D to exit while t.sh is running.
Connection to lab closed by foreign host.

Again, access the system and check the output:

user@xxxxx$ telnet lab
...
$ tail -f /var/adm/messages
:
Apr 9 12:47:47 lab user: [ID 702911 user.notice] variable BBB = insert
character <--- !!!
Apr 9 12:47:47 lab bbb
<--- !!!

Thus the variable is wrongly set. (The previous echo string was not cleared.)

The issue happens because the EIO error raised during the logout is not handled properly.
The following patch fixes the issue:

--- INIT.2012-08-01.old/src/cmd/ksh93/sh/io.c 2017-01-04 14:41:25.199402375 +0000
+++ INIT.2012-08-01/src/cmd/ksh93/sh/io.c 2017-01-04 14:32:20.279449987 +0000
@@ -64,9 +64,9 @@

#ifndef ERROR_PIPE
#ifdef ECONNRESET
-#define ERROR_PIPE(e) ((e)==EPIPE||(e)==ECONNRESET)
+#define ERROR_PIPE(e) ((e)==EPIPE||(e)==ECONNRESET||(e)==EIO)
#else
-#define ERROR_PIPE(e) ((e)==EPIPE)
+#define ERROR_PIPE(e) ((e)==EPIPE||(e)==EIO)
#endif
#endif

ksh93 dumps core in emacs mode while entering characters in different locale.

I observed this issue on a Solaris 11 system with ksh 2012-08-01, i.e. the master version. I guess the issue is present in later versions too, as the relevant code has not changed.

The issue can be reproduced if we add Asian locales to ibus (such as Korean).
In the ksh93 shell prompt, input some Asian character. ksh promptly dumps core with the following stacktrace.

bash-4.2$ pstack core
core 'core' of 1134: ksh
00000000004f1cf4 ed_emacsread () + 404
00000000004a5096 slowread () + 116
0000000000592d12 sfrd () + 482
000000000058b707 _sffilbuf () + 297
00000000005936ac sfreserve () + 2ac
0000000000477ae2 exfile () + 6e2
0000000000477393 sh_main () + af3
00000000004767dd main () + 4d
0000000000476614 ???????? ()

The core dump happens at line 320 of src/cmd/ksh93/edit/emacs.c,
i.e. if(c!='\t' && c!=ESC && !isdigit(c)).

Following the vi.c code, I added the digit(c) macro, i.e.
((c&~STRIP)==0 && isdigit(c)), and replaced the isdigit(c) calls with digit(c). Here's the patch which fixes the issue for me.

--- INIT.2012-08-01.old/src/cmd/ksh93/edit/emacs.c 2016-01-18 03:52:58.380801240 -0800
+++ INIT.2012-08-01/src/cmd/ksh93/edit/emacs.c 2016-02-05 01:39:08.350312914 -0800
@@ -90,6 +90,7 @@
 static int print(int);
 static int _isword(int);
 # define isword(c) _isword(out[c])
+# define digit(c) ((c&~STRIP)==0 && isdigit(c))
 
 #else
 # define gencpy(a,b) strcpy((char*)(a),(char*)(b))
@@ -97,6 +98,7 @@
 # define genlen(str) strlen(str)
 # define print(c) isprint(c)
 # define isword(c) (isalnum(out[c]) || (out[c]=='_'))
+# define digit(c) isdigit(c)
 #endif /*SHOPT_MULTIBYTE */
 
 typedef struct emacs
@@ -317,7 +319,7 @@
 count = 1;
 adjust = -1;
 i = cur;
-           if(c!='\t' && c!=ESC && !isdigit(c))
+           if(c!='\t' && c!=ESC && !digit(c))
                     ep->ed->e_tabcount = 0;
             switch(c)
             {
@@ -775,7 +777,7 @@
 int digit,ch;
 digit = 0;
 value = 0;
-   while ((i=ed_getchar(ep->ed,0)),isdigit(i))
+   while ((i=ed_getchar(ep->ed,0)),digit(i))
 {
         value *= 10;
         value += (i - '0');
@@ -1013,7 +1015,7 @@
 {
 i=ed_getchar(ep->ed,0);
 ed_ungetchar(ep->ed,i);
-                                   if(isdigit(i))
+                                   if(digit(i))
                                         ed_ungetchar(ep->ed,ESC);
                         }
                 }


Please create tag(s)

In order to adopt the new GitHub repository as its upstream, the Homebrew formula will need a release tag that can be assigned to the stable spec. This allows a tarball of the tag to be downloaded from GitHub and the sha256 of the tarball to be recorded in the formula to ensure integrity.

Please see Homebrew/legacy-homebrew#49653 (comment)

ksh regression: with FIGNORE, . and .. are no longer automatically excluded from glob expansions

From the manual:

If FIGNORE is set, then each file name component that matches the pattern defined by the value of FIGNORE is ignored when generating the matching filenames. The names . and .. are also ignored.

That used to be true: it works as documented in ksh93k+, for instance. In modern versions (and also in ksh93m; I couldn't find ksh93l to test) it no longer does:

$ FIGNORE=x ksh93v- -c 'echo *'
. .. a .a
$ FIGNORE=x ksh93k+ -c 'echo *'
a .a

zrep keeps 3 additional snapshots on remote

After replicating a local ZFS filesystem with zrep, I found that the remote (destination) filesystem still has snapshots that were already removed on the source (zfs-auto-snapshot purged the 3 oldest ones there), although I set savecount to "1".

I would like zrep to make an identical copy of the ZFS filesystem, including all snapshots.

Any clue where to look, or why this is happening?

[root@backupvm1 ~]# zfs list -r -H -t snapshot -o name -s name zfspool/backup/adminstation.local|grep -v zrep|wc -l
45
[root@backupvm1 ~]# zfs list -r -H -t snapshot -o name -s name zfsiscsipool/backup-repl/adminstation.local|grep -v zrep|wc -l
48

VAR=value followed by nested function call fails to put VAR into environment

The following code should print V=1, but doesn't:

function f2 { env | grep '^V='; }
function f1 { f2; }
V=1 f1

However, V=1 f2 behaves correctly.

Every ksh93 I tried has the same behavior, including the following:

  • Version M-12/28/93e (AIX 6.1 /bin/ksh93)
  • Version AJM 93u+ 2012-08-01 (Red Hat Enterprise Linux Server release 7.2 /bin/ksh93)
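One possible workaround, offered as an assumption and not verified against the ksh93 releases listed above, is to export the variable inside a subshell before calling the outer function. The sketch below uses POSIX-style function definitions so it runs in any sh-compatible shell; the reported case uses the function keyword.

```shell
f2() { env | grep '^V='; }
f1() { f2; }

unset V
# Hypothetical workaround: the export happens inside the command
# substitution's subshell, so V reaches f2's environment without
# leaking into the calling shell.
out=$( export V=1; f1 )
echo "$out"
echo "caller still sees V as: ${V-unset}"
```

The cost is a subshell, so any other variable changes made by f1 are discarded as well.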

ksh93: When "read -m json" is used to read a single-line JSON object, text fields following the first numeric field are re-interpreted as numeric variable names

I realize the JSON functionality in Korn Shell isn't fully mature yet, but I built myself a shell from the beta branch on git (Version ABIJM 93v- 2014-12-24) to try it out. I found that when I feed "read -m json" a JSON object with no newlines in it, a numeric field causes all following fields to appear wrong:

$ foo=9
$ read -m json json_test <<<$'{ "squanchy" : "cromulent", "num" : 1, "text": "foo", "notbool": "true", "bool": true }'
$ print -j json_test
{
	"bool": 0,
	"notbool": 0,
	"num": 1,
	"squanchy": "cromulent",
	"text": 9
}
# "squanchy" comes through just fine, because it precedes "num".
# All variables following "num" : 1 were replaced with numeric variable lookup, as if they'd appeared in $(())
$ echo "$json_test"
(
	typeset -l -E bool=0
	typeset -l -E notbool=0
	typeset -l -E num=1
	squanchy=cromulent
	typeset -l -E text=9
)
# typeset -E is a floating-point numeric type. "bool", "notbool", and "text" have all inherited this type from "num"

However, if I insert newlines into the JSON string, this doesn't happen:

$ read -m json json_test <<<$'{ "squanchy" : "cromulent", "num" : 1,\n "text": "bar", "notbool": "true", "bool": true }'
$ print -j json_test
{
	"bool": true,
	"notbool": "true",
	"num": 1,
	"squanchy": "cromulent",
	"text": "bar"
}

I haven't looked at the source code for the JSON support yet. It does appear to be in pretty rough shape overall... When I get some time I'll see what I can do with it.

build process -- Food for thought

Should we change/adapt the build process?

(with the objective to make it easily maintainable and accessible to a greater number)

Small knowledgeable community

None of my fellow developers has any inside knowledge of the build process beyond bin/package make, whether to maintain it or simply to customise it. Neither do I.

That said, they know how to maintain and tweak GNU Autotools and CMake toolchains.

Note: Do not jump to conclusions here, I am not trying to promote Autotools or CMake; quite the contrary. I simply want to emphasise the lack of openness of the build process' logic and toolchain.

Scarce documentation

Embedded usage documentation in any C or Korn shell script is a fantastic asset of the AST developments, and certainly immensely underused among users of AST packages.... except for the AST developers who have consistently added usage information to all their utilities.

Nonetheless that documentation does not suffice for a newbie to get his head round the build toolchain and gain sufficient insight information to act alone without calling out for help.

Today I am confronted with a failing build on a platform, which is certainly not exotic (MacOS), and I find myself spending hours trying to understand where the errors occur.

Unless told otherwise, I have no supporting information to help me get through my build failures. And calling out for help won't be of great help because (I presume) only a few have invested significant time in understanding the guts of the toolchain. Questions will take time to be answered, if they are answered at all.

Build tool

When all goes well, the AST build toolchain seems to beat flat out the other tools mentioned above. It has (apparently) no dependencies, allows for all the GNU Autoconfigure probing without the M4 hell, and nicely lays out its build products.

Opinion: GNU Autotools are a fantastic suite. But they have a major inconvenience: M4. It is opaque and, to a certain extent, clumsy. It was probably a good compromise for portability 30 to 40 years ago, but it is no longer the right tool for today; pre-processing could be done the AST way :-)

Could the AST build toolchain be system agnostic and a possible replacement for GNU Autotools or CMake on other projects? A toolchain written in portable POSIX shell targeting any raw (POSIX) UNIX or Linux.

Whilst this was probably a driver in its conception, going through the source files shows that it depends on bash here, lynx or wget there, etc. So it is not agnostic and doesn't build on a raw system; it requires GNUish capabilities. Hence it targets UNIX/GNU or Linux/GNU platforms.

Note: for the reasoning let us ignore for now that we probably need gcc to avoid proprietary compilers (where such compilers still exist).

Logically one can ask, why then maintain a distinct build toolchain? Why not use GNU Autotools or CMake?

Preliminary thoughts

The breakup of the AST development team has (luckily) brought the AST developments to the open source community. But the community is small (and probably fragile).

If the AST packages and the Korn shell are here to stay, the community needs to be enlarged.

Enlarging the community means, making the build process accessible to many.

Migrating to GNU Autotools or CMake is an enormous effort which would require such time investment that it is almost guaranteed to stall.

Documentation and HOWTOs seem to be the only realistic approach. This also requires time, and reverse engineering.

Request for comments

In the 90s, shell portability was a big concern, and scripting had to focus on POSIX shells only (Korn shell wasn't a POSIX shell at the time, it now is).

Today, thanks to AT&T opening up the source code, a Korn shell exists on (almost) every platform. Not PDKSH or old versions, but a ksh93 executable (whatever its release).

Consequently, in 2017 onwards, we can assume that we have a Korn shell executable that supports the 93 syntax and features.

Converting the AST build toolchain scripts from universal shell syntax to Korn shell 93 syntax can:
a) greatly reduce the LOC (e.g. iffe could be reduced by 50%)
b) allow for clean environments with the function keyword, limiting globals
c) break down the code into smaller and more maintainable chunks using FPATH
d) allow usage information to be added to all functions

This doesn't require a full reverse engineering effort, nor does it require a full rewrite of the code.

At the same time this allows for a learning curve whose results can be captured in HOWTOs and central documentation.

By doing this we can (re)gain knowledge of the AST build toolchain, document it properly for the community to get involved, and lead the way for a ksh2023 rather than a ksh93+z2023 :-)

ksh93: random behaviour of `read -n <nchar>` for multi-byte characters.

Reproduced with version sh (AT&T Research) 93u+ 2012-08-01 and version sh (AT&T Research) 93v- 2014-12-24 on Debian GNU/Linux amd64:

According to the man page read -n reads a number of bytes, while read --help says characters.

Tests are inconsistent: here testing in a UTF-8 locale with the 3-byte € (EURO U+20AC) character:

$ ksh -c 'for ((i=1;i<=6;i++)); do echo €€€€€€€€ | IFS= read -rn "$i" a; printf "$i %q\n" "$a"; done'
1 $'\u[20ac]'
2 $'\u[20ac]\u[20ac]'
3 $'\u[20ac]'
4 $'\u[20ac]\u[20ac]\u[20ac]'
5 $'\u[20ac]\u[20ac]\u[20ac]'
6 $'\u[20ac]\u[20ac]'

The 1 case suggests a number of characters, the 3 case a number of bytes, the rest doesn't seem to make any sense.
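The gap between the two readings comes from the UTF-8 encoding of the euro sign. The sketch below uses only portable printf and wc, not the ksh93 read builtin, to show how many bytes one and three euro characters occupy.

```shell
# U+20AC (EURO SIGN) encoded in UTF-8 is the three bytes e2 82 ac,
# written here as octal escapes so the script itself stays ASCII.
euro=$(printf '\342\202\254')

# wc -c counts bytes regardless of locale; tr drops any padding that
# some wc implementations add around the number.
one=$(printf '%s' "$euro" | wc -c | tr -dc '0-9')
three=$(printf '%s%s%s' "$euro" "$euro" "$euro" | wc -c | tr -dc '0-9')

echo "1 character = $one bytes, 3 characters = $three bytes"
```

So a limit of 3 yields one euro sign under the byte interpretation and three under the character interpretation; the transcript above shows both readings depending on the count.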

read -N doesn't have the issue (and seems to take a number of characters):

$ ksh -c 'for ((i=1;i<=6;i++)); do echo €€€€€€€€ | IFS= read -rN "$i" a; printf "$i %q\n" "$a"; done'
1 $'\u[20ac]'
2 $'\u[20ac]\u[20ac]'
3 $'\u[20ac]\u[20ac]\u[20ac]'
4 $'\u[20ac]\u[20ac]\u[20ac]\u[20ac]'
5 $'\u[20ac]\u[20ac]\u[20ac]\u[20ac]\u[20ac]'
6 $'\u[20ac]\u[20ac]\u[20ac]\u[20ac]\u[20ac]\u[20ac]'

'times' should be a special builtin

In ksh93, times is implemented as an alias to "{ { time;} 2>&1;}" and "command" as an alias to "command " (which means aliases are expanded after it).

That means that things like:

$ ksh93 -c 'LC_ALL=C times'
ksh93: syntax error at line 1: `{' unexpected
$ ksh93 -c 'command times'
ksh93: syntax error at line 1: `{' unexpected

don't work, so the times utility is not POSIX compliant.

"times" should be implemented as a special builtin.

More generally, even though allowed by POSIX, implementing builtin utilities as aliases is not a good idea IMO if only for the reasons detailed at:
http://thread.gmane.org/gmane.comp.standards.posix.austin.general/12485/focus=12568

ksh does not detect invalid array declarations

ksh -n does not detect bad array declarations in the following code:

Bad ksh array:

#!/usr/bin/ksh

typeset -A fn
fn=([foo_key]=foo_val [bar_key])

printf %s\\n ${fn[foo_key]}

zsh arrays:

#!/usr/bin/zsh

typeset -A fn
fn=(foo_key foo_val bar_key bar_val)

printf %s\\n ${fn[foo_key]} ${fn[bar_key]}

ksh93 tests failing

Several ksh93 tests currently fail. Fixing (or excluding) them is a good starting point, so that the tests can become part of the automated Travis CI build and guard the integrity of future changes.

Then we can add this to the .travis.yml file:

bin/package test ksh93
bin/package results test | grep '\*\*\*' && false
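One caveat about the second line: "grep ... && false" exits non-zero in both cases. When grep finds a flagged line, false runs; when it finds none, grep's own exit status of 1 is what the CI step sees. A minimal sketch of an inverted check follows; no_failures is a hypothetical helper name, and it takes the results text as an argument so it can be tried without running the test suite.

```shell
# Hypothetical helper: succeed (exit 0) only when the results text
# contains no line flagged with "***".
no_failures() {
    if printf '%s\n' "$1" | grep -F '***' >/dev/null; then
        return 1    # at least one failing test was reported
    fi
    return 0        # nothing flagged
}

# In the CI script this might be called as:
#   no_failures "$(bin/package results test)"
```

Equivalently, prefixing the original pipeline with ! inverts its status in shells that support that syntax.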

This is the output of bin/package results test after testing just ksh93:

dannyw@dannyw-ubuntu:~/wrk/att/ast$ bin/package results test | grep '\*\*\*' && false
INIT iffe ...................................  162 tests    1 error  ***
ksh93 io(shcomp) ............................   99 tests  141 errors ***
ksh93 namespace(shcomp) .....................   26 tests    1 error  ***
ksh93 treemove(shcomp) ......................   22 tests    1 error  ***
ksh93 wchar .................................    4 tests    4 errors ***
ksh93 wchar(C.UTF-8) ........................    4 tests    4 errors ***
ksh93 wchar(shcomp) .........................    4 tests    4 errors ***

Ideally, once all ast tests are working we can widen the tests using just:

bin/package test

However, there are lots of other failures and there seems to be a bug with bin/package test that causes tee to sit waiting for input.

Here is the output of the ksh93 test failures I am getting on ubuntu 16.04:

test io(shcomp) begins at 2017-07-21+20:14:53
test io(shcomp) failed at 2017-07-21+20:14:53 with exit code 141 [ 99 tests 141 errors ]
test namespace(shcomp) begins at 2017-07-21+20:15:00
/tmp/tmpjogoX8m.B0o/shcomp-namespace.ksh[127]: .a.b.x_t: not found [No such file or directory]
/tmp/tmpjogoX8m.B0o/shcomp-namespace.ksh[128]: var.pi: not found [No such file or directory]
discipline functions for types in namespace not working
/tmp/tmpjogoX8m.B0o/shcomp-namespace.ksh[138]: .com.foo.test1.y_t: not found [No such file or directory]
/tmp/tmpjogoX8m.B0o/shcomp-namespace.ksh[139]: v.x.pr: not found [No such file or directory]
	shcomp-namespace.ksh[139]: _.__ not working with nested types in a namespace
test namespace(shcomp) failed at 2017-07-21+20:15:00 with exit code 1 [ 26 tests 1 error ]
test treemove(shcomp) begins at 2017-07-21+20:22:18
/home/dannyw/wrk/att/ast/arch/linux.i386-64/bin/ksh: line 2: syntax error at line 6: `function' unexpected
	[78]: typeset -C c=(objstack_t ost=(typeset -l -i st_n=1;st[0]=(obj=(typeset -l -i val=5)))) is not idempotent
test treemove(shcomp) failed at 2017-07-21+20:22:18 with exit code 1 [ 22 tests 1 error ]
test wchar begins at 2017-07-21+20:22:35
	wchar.sh[60]: en_US.ISO-8859-15 nounicodeliterals FAILED -- expected '0000000 24 27 e2 82 ac 27 0a', got '0000000 27 5c 75 5b 32 30 61 63 5d 27 0a'
	wchar.sh[63]: en_US.ISO-8859-15 (nounicodeliterals) FAILED -- expected '0000000 24 27 e2 82 ac 27 0a', got '0000000 27 5c 75 5b 32 30 61 63 5d 27 0a'
	wchar.sh[60]: zh_CN.GB18030 nounicodeliterals FAILED -- expected '0000000 24 27 e2 82 ac 27 0a', got '0000000 27 5c 75 5b 32 30 61 63 5d 27 0a'
	wchar.sh[63]: zh_CN.GB18030 (nounicodeliterals) FAILED -- expected '0000000 24 27 e2 82 ac 27 0a', got '0000000 27 5c 75 5b 32 30 61 63 5d 27 0a'
test wchar failed at 2017-07-21+20:22:35 with exit code 4 [ 4 tests 4 errors ]
test wchar(C.UTF-8) begins at 2017-07-21+20:22:35
	wchar.sh[60]: en_US.ISO-8859-15 nounicodeliterals FAILED -- expected '0000000 24 27 e2 82 ac 27 0a', got '0000000 27 5c 75 5b 32 30 61 63 5d 27 0a'
	wchar.sh[63]: en_US.ISO-8859-15 (nounicodeliterals) FAILED -- expected '0000000 24 27 e2 82 ac 27 0a', got '0000000 27 5c 75 5b 32 30 61 63 5d 27 0a'
	wchar.sh[60]: zh_CN.GB18030 nounicodeliterals FAILED -- expected '0000000 24 27 e2 82 ac 27 0a', got '0000000 27 5c 75 5b 32 30 61 63 5d 27 0a'
	wchar.sh[63]: zh_CN.GB18030 (nounicodeliterals) FAILED -- expected '0000000 24 27 e2 82 ac 27 0a', got '0000000 27 5c 75 5b 32 30 61 63 5d 27 0a'
test wchar(C.UTF-8) failed at 2017-07-21+20:22:35 with exit code 4 [ 4 tests 4 errors ]
test wchar(shcomp) begins at 2017-07-21+20:22:35
	shcomp-wchar.ksh[60]: en_US.ISO-8859-15 nounicodeliterals FAILED -- expected '0000000 24 27 e2 82 ac 27 0a', got '0000000 27 5c 75 5b 32 30 61 63 5d 27 0a'
	shcomp-wchar.ksh[63]: en_US.ISO-8859-15 (nounicodeliterals) FAILED -- expected '0000000 24 27 e2 82 ac 27 0a', got '0000000 27 5c 75 5b 32 30 61 63 5d 27 0a'
	shcomp-wchar.ksh[60]: zh_CN.GB18030 nounicodeliterals FAILED -- expected '0000000 24 27 e2 82 ac 27 0a', got '0000000 27 5c 75 5b 32 30 61 63 5d 27 0a'
	shcomp-wchar.ksh[63]: zh_CN.GB18030 (nounicodeliterals) FAILED -- expected '0000000 24 27 e2 82 ac 27 0a', got '0000000 27 5c 75 5b 32 30 61 63 5d 27 0a'
test wchar(shcomp) failed at 2017-07-21+20:22:35 with exit code 4 [ 4 tests 4 errors ]

ksh93: random behaviour of += on multi-dimensional arrays

$ ksh93 -c 'a=((a b c) (1 2 3)); a+=( (X Y Z)); typeset -p a'
typeset -a a=((X Y Z) )
$ ksh93u+ -c 'a[3]=(1 2 3); a+=( (x y)); typeset -p a'
typeset -a a=([3]=(1 2 3) [4]=$'\xf8\u[784]D\x7f')
$ ksh93v- -c 'a[3]=(1 2 3); a+=( (x y)); typeset -p a'
typeset -a a=([1]=$'\n-\xbc\xb8%' [3]=(1 2 3) )

(On Debian GNU/Linux amd64)

ksh93: Local variables are passed on to called functions when they share their name with an invocation-level binding

When a function defines a local variable, normally that variable is not accessible to or modifiable by other functions called by the function:

$ function f1 { typeset var VF19; VF19=excalibur; echo "f1: VF1=$VF1, VF19=$VF19"; f2; echo "f1: VF1=$VF1, VF19=$VF19"; }
$ function f2 { echo "f2: VF1=$VF1, VF19=$VF19"; VF1=vf1a; VF19=VF19a; echo "f2: VF1=$VF1, VF19=$VF19"; }      
$ VF1=valkyrie f1
f1: VF1=valkyrie, VF19=excalibur
f2: VF1=valkyrie, VF19=
f2: VF1=vf1a, VF19=VF19a
f1: VF1=valkyrie, VF19=excalibur
# VF19 is local to f1, so it isn't visible to or modified by f2.
# VF1 is an invocation-level binding on f1, so it is visible to f2 but not modified by it.

However, if there's an "invocation-level" binding of the same name as a local variable, the local variable takes on the characteristics of the "invocation-level" binding:

$ unset VF1
$ unset VF19
$ VF19=YF19 f1      # Binding VF19 on the invocation of f1 prevents f1 from using it as a local variable
f1: VF1=, VF19=excalibur
f2: VF1=, VF19=excalibur
f2: VF1=vf1a, VF19=VF19a
f1: VF1=vf1a, VF19=excalibur
$ echo ${VF19-'{unset}'}   # f2's assignment of VF19 no longer reaches global scope
{unset}

$VF19 is no longer local to f1. The value that's set to it in f1 becomes visible to f2 (but f2 still can't modify it in a way that's visible to f1)

I think the proper result in this case would be like this:
$ unset VF1
$ unset VF19
$ VF19=YF19 f1 # Binding VF19 on the invocation of f1 prevents f1 from using it as a local variable
f1: VF1=, VF19=excalibur
f2: VF1=, VF19=YF19
f2: VF1=vf1a, VF19=VF19a
f1: VF1=vf1a, VF19=excalibur
$ echo ${VF19-'{unset}'} # f2's assignment of VF19 no longer reaches global scope
{unset}


That is, the invocation-level binding of VF19 should be shadowed by the local-variable definition of VF19, and then on the call to f2, the local-variable definition should be discarded and the invocation-level binding should be visible again.

(My tests are on version 93u+ 2012-08-01, RHEL 7)

man sh.1 does not include json print and printf; the test pattern also does not include full json testing

sh.1 does not document the print -j and printf %(json)B extensions.

RELEASE:14-07-15 Fixed a bug in which json format output with 'print -j' had a comma
RELEASE:13-05-28 +Added -j option to print (and %(json)B format specifier to printf)
RELEASE: which will print a compound shell variable in JSON format.

My example reads and prints JSON.

I think the comvar.sh test is not enough for print -j. Arrays should be part of the compound variable. Also, printf should be included in the JSON testing.

KSH hang in |spawnvex(3ast)|

Just a reminder that if you see a hang from KSH (or another AST program) while it is in the
|spawnvex(3ast)| subroutine, it is most likely caused by LIBAST having been compiled to use
|vfork(2)|. When using VFORK, a program should pretty much just |exec(2)| after |vfork(2)|'ing.
This is just safe UNIX practice. But |spawnvex(3ast)| plays with the program signal mask (or some
such) after it has |vfork(2)|'ed but before it |exec(2)|'s. This can occasionally cause a hang on
some UNIX systems. The fix is to undefine some defines as follows in the |spawnvex(3ast)|
subroutine somewhere after the headers are included, with something like:

#undef _lib_vfork
#undef _real_vfork

Enjoy.

ksh93: read -r doesn't work if -d is also specified

(This is using ksh 93u+ 2012-08-01 on RHEL7)

Hi, I'm really hoping Korn Shell development will continue. Right now it kind of looks like a dead project - which is a shame because it's probably still the best of the Unix shells.

Anyway, I'm writing this shell library called "shell-pepper" and in the course of thinking about how to write a version of "read" that would read a single JSON value from the input (and stop at the end - and without writing it as a loadable "built-in") it led to this line of experimentation with the built-in "read":

$ a='foo\ bar }'
$ echo "'$a'"      # sanity check that $a contains what I expect
'foo\ bar }'
$ IFS='' read -r -d '}' x <<<"$a"     # Read to the next '}' or EOF.  $? tells us whether it was a delimiter or EOF.
$ echo "'$x'"                       # Thanks to IFS the space at the end is preserved, but we lose the backslash.
'foo bar '
$ IFS='' read -r y <<<"$a"    # If I remove -d, then the backslash is retained, but I lose the ability to stop the read at the next curly brace
$ echo "'$y'"
'foo\ bar }'

Basically the idea here is that if I've started reading a JSON object, reading to the next '}' may not get me to the end of the object, but it certainly won't take me past the end of the object. I need the backslashes intact (hence the -r), but when I use "-d" as well, backslashes in the input are lost (as if -r weren't specified).

BASH gets this one right:

$ a='foo\ bar }'
$ echo "'$a'"
'foo\ bar }'
$ IFS='' read -r -d '}' x <<<"$a"
$ echo "'$x'"
'foo\ bar '

As a side note - I had heard that JSON read and write were to be added to Korn Shell in upcoming versions. I am mostly using 93u but with a couple 93v builds kicking around. At the time I wrote this I was unaware that the JSON functionality is already present in 93v. Nice!

string match on ERE quantifiers fails

Brace quantifiers in an extended regular expression string match test cause a syntax error:

% ksh -c '[[ abc =~ a{2,} ]] && echo z'
ksh: syntax error at line 1: `~(E)a{2,} ]] && echo z' unexpected
% ksh --version
  version         sh (AT&T Research) 93u+ 2012-08-01
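
For reference, the brace quantifier is ordinary ERE syntax, so the expected result of the test above is a quiet non-match ("abc" has only one "a"), not a syntax error. A minimal sketch of the intended semantics, assuming grep -E as a stand-in for the shell's own matcher; ere_match is a hypothetical helper name, not part of ksh:

```shell
# ere_match STRING ERE: succeed if STRING contains a match for ERE.
# Hypothetical helper; grep -E stands in for [[ STRING =~ ERE ]].
ere_match() {
    printf '%s\n' "$1" | grep -Eq -- "$2"
}

if ere_match abc 'a{2,}'; then echo "abc matches"; else echo "abc does not match"; fi
if ere_match aabc 'a{2,}'; then echo "aabc matches"; else echo "aabc does not match"; fi
```

Only the second string contains two or more consecutive "a" characters, so only it should match.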

ksh93: Feature Request: Redirection syntax that allows built-in commands to provide file descriptors for redirection

On the GNU Bash patches page there is a feature request/code patch which expands upon the /dev/tcp special redirection syntax to add listening on a socket:

$ cmd <>/dev/tcp-listen/localhost/$port
$ # Or, to make the file descriptor persist, use "exec {fd_var}<>/dev/tcp-listen/localhost/$port"

I bring this up not to advocate for this feature (on the contrary, I think having the shell spoof one thing in /dev/ is one thing too many), but rather because it got me thinking of how to add similar features without creating "special" filenames.

In the case of TCP connections, for instance, one could (almost) replace /dev/tcp with a built-in. (There's not much point replacing /dev/tcp now, of course; this is more an example of how similar features might be implemented without similar "magic".) For the sake of this discussion assume this built-in is called open_tcp and takes the destination host address and port number as its arguments, opens the connection, and writes the number of the newly-opened file descriptor to its stdout:

$ open_tcp localhost 80
10
$ cmd <>&10
$ fd=10; exec {fd}<&-    # close the file descriptor

This exposes one of the disadvantages of creating such a built-in: Unlike redirecting to /dev/tcp, the lifetime of this open_tcp file descriptor can not be automatically managed by the shell. A redirection can be limited to the lifetime of a command or group of commands, but open_tcp cannot.

One could imagine trying to get around the issue like this:

$ cmd >&$(open_tcp localhost 49152)-   # Evaluates as "cmd >&10-" for instance: Run "open_tcp" to open a file descriptor, redirect it to the output of "cmd", and close the original FD afterward

This wouldn't work as things stand for a couple reasons:

  • The command substitution happens in a subshell, so the file descriptor opened by open_tcp is not accessible to the parent shell
  • The effect of moving the file descriptor (with >&fdnum-), rather than just duplicating it (as with >&fdnum) is localized to the command redirection, and doesn't affect the shell's state once the command has ended.

So I propose introducing a syntax that would create the necessary sequence of operations to make this work:

  1. Run a set of commands that is embedded in the redirection in the environment of the current shell and capture its output.
  2. Attempt to interpret the output of those commands as a numeric file descriptor that will be duplicated by the redirection
  3. Once the file descriptor is cloned by the redirection, close the original. (Do not retain it or restore it when the command being redirected ends.)

One option would be to provide this behavior with the syntax I described above:
$ cmd1 >&$(open_fd_cmd)

That is, recognize that $() is being used to provide the numeric argument to >&, and treat any file descriptors opened in the shell process by open_fd_cmd as local to the redirection rather than local to the command substitution. (Though this means that command substitution in this context can't be forked - it must be evaluated as part of the main shell process. But that's the case with ksh anyway, right?)
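
For comparison, the three-step sequence proposed above can be approximated by hand in today's shells, at the cost of a temporary file and eval. This is only a sketch under stated assumptions: open_log is a hypothetical stand-in for open_tcp (it opens a plain file rather than a socket), and descriptor 9 is an arbitrary choice.

```shell
# open_log FILE: hypothetical stand-in for the proposed open_tcp builtin.
# It opens FILE for appending on descriptor 9 in the *current* shell and
# prints the descriptor number on stdout.
open_log() {
    exec 9>>"$1"
    echo 9
}

target=$(mktemp)
numfile=$(mktemp)

# Step 1: run the opener in the current shell (no command substitution,
# hence no subshell) and capture its output through a temporary file.
open_log "$target" > "$numfile"

# Step 2: interpret the captured output as a file descriptor number.
read -r fd < "$numfile"
rm -f "$numfile"

# Redirect a command to that descriptor ...
eval "echo 'hello from cmd' >&$fd"

# Step 3: ... and close the original afterwards.
eval "exec $fd>&-"

cat "$target"
```

The temporary file and the two eval calls are exactly the boilerplate the proposed syntax would remove.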

Missing `alarm` man page

Unable to locate any documentation on the alarm builtin.
According to this thread, circa 2006, David Korn was not ready to publicise -- he had concerns about possible conflicts with new functionality. What's the status today?

In summary:
a) Is the alarm builtin usable?
b) Can we trace sufficient usage information to build a man page?

Cheers, don

Incorrect exit message when exiting

ksh prints an incorrect message on exit if the exit code is greater than 256. ksh uses 256 + signal number to represent termination by a signal when choosing the message to show. However, the command below should not produce a signal message:

$ exit 257
Hangup
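
To spell out the arithmetic: 257 is 256 + 1, and signal 1 is SIGHUP, which is where the "Hangup" text comes from. A small sketch of that reading; classify_status is a hypothetical helper that only mirrors the convention described above and is not ksh code:

```shell
# classify_status STATUS: show how STATUS is read under the 256+signal
# convention, and which low 8 bits a parent process would receive.
classify_status() {
    if [ "$1" -gt 256 ]; then
        echo "$1: read as signal $(( $1 - 256 )); low 8 bits are $(( $1 & 255 ))"
    else
        echo "$1: ordinary exit status"
    fi
}

classify_status 257
classify_status 1
```

Since only the low 8 bits of an exit status reach the parent, a plain "exit 257" would be expected to behave like "exit 1".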

ksh93 stdout not proper if EXIT/ERR traphandler defined in commandline mode

The following is the reproducible testcase.

$ ksh -ec 'rm -f /tmp/$USER.test.log; function log { echo $* to stdout; echo $* to file >> /tmp/$USER.test.log; }; function test_exit { log trap; }; trap test_exit EXIT; log exit'
exit to stdout

$ cat /tmp/$USER.test.log
exit to file
trap to stdout
trap to file

There is another case which is failing:

$ ksh -ec 'rm -f /tmp/$USER.test.log; function log { echo $* to stdout; echo $* to file >> /tmp/$USER.test.log; }; function test_exit { trap test_exittrap EXIT; log trap; }; function test_exittrap { log exit; }; test_exit'
trap to stdout

$ cat /tmp/root.test.log
trap to file
exit to stdout
exit to file

This works if this is run as a script though. This seems to be a side effect of the ksh optimization of running the last command without forking.
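
A self-checking variant of the first test case may be handy when comparing shells or builds. This is a sketch with assumptions: check_exit_trap is a hypothetical helper, it uses POSIX function syntax and a temporary log file passed in the LOG variable instead of /tmp/$USER.test.log, and it captures stdout through a pipe. I have not verified that ksh93 shows the problem in this exact form.

```shell
# check_exit_trap SHELL: run a POSIX-syntax variant of the test case under
# SHELL and report whether the EXIT trap's output ended up where expected.
check_exit_trap() {
    log=$(mktemp)
    out=$(LOG="$log" "$1" -ec '
        log() { echo "$* to stdout"; echo "$* to file" >> "$LOG"; }
        test_exit() { log trap; }
        trap test_exit EXIT
        log exit')
    expected_out='exit to stdout
trap to stdout'
    expected_log='exit to file
trap to file'
    if [ "$out" = "$expected_out" ] && [ "$(cat "$log")" = "$expected_log" ]; then
        echo "ok"
    else
        echo "misdirected"
    fi
    rm -f "$log"
}

check_exit_trap sh
```

Passing the path of a ksh binary instead of sh runs the same check against that build.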

Here are some details of why this is happening.

I traced the code flow for both cases (cmdline and script), and it looks like the script case works because it is not subject to the ksh93 optimization of running the last command without forking.

Here are the details.

After the "log exit" function is run, the script case restores the file descriptors by calling sh_iorestore() at src/cmd/ksh93/sh/xec.c#1471:

1471 if((shp->topfd>topfd) && !(shp->subshell && np==SYSEXEC))
1472     sh_iorestore(shp,topfd,jmpval);

whereas the cmdline case does not restore the fds.

Here, for the script case, shp->topfd=1, whereas for the cmdline case it is 0.

In src/cmd/ksh93/sh/xec.c#1347:

1347 else
1348     type = (execflg && !shp->subshell && !shp->st.trapcom[0]);
1349 shp->redir0 = 1;
1350 sh_redirect(shp,io,type);

Here the "type" parameter for the sh_redirect fn is 0 for the script, whereas it is 1 for the cmdline.
(The execflg is being set as part of the optimization of running the last command without a fork in the following code:

src/cmd/ksh93/sh/xec.c#984

984 int execflg = (type&sh_state(SH_NOFORK));

which gets the type input from

src/cmd/ksh93/sh/main.c#581

581 if(!sh_isstate(SH_PROFILE) && sh_isoption(SH_CFLAG) &&
582     (fno<0 || !(shp->fdstatus[fno]&(IOTTY|IONOSEEK)))
583     && !sfreserve(iop,0,0))
584 {
585     execflags |= sh_state(SH_NOFORK);
586 }
587 shp->st.execbrk = 0;
588 sh_exec(t,execflags);
)

The following code in sh_iosave() sets shp->topfd to 1:

src/cmd/ksh93/sh/io.c#1728

1728 filemap[shp->topfd++].save_fd = savefd;

which is called from sh_redirect as given below.
(Here flag = 1, as it corresponds to the "type" parameter of sh_redirect we saw earlier.)

src/cmd/ksh93/sh/io.c#1501

1503 if(flag==0 || tname || (flag==1 && fn==1 && (shp->fdstatus[fn]&IONOSEEK) && shp->outpipepid && shp->outpipepid==getpid()))
1504 {
1505     if(fd==fn)
1506     {
1507         if((r=sh_fcntl(fd,F_DUPFD,10)) > 0)
1508         {
1509             fd = r;
1510             sh_close(fn);
1511         }
1512     }
1513     sh_iosave(shp,fn,indx,tname?fname:(trunc?Empty:0));
1514 }
1515 else if(sh_subsavefd(fn))
1516     sh_iosave(shp,fn,indx|IOSUBSHELL,tname?fname:0);
1517 }

I've created a patch which fixes this issue, but I'm not sure if this is really the best solution. It basically flags the specific cases and disables the optimization.
Here are some test results:

$ ./ksh -ec 'rm -f /tmp/$USER.test.log; function log { echo $* to stdout; echo $* to file >> /tmp/$USER.test.log; }; function test_exit { log trap; }; trap test_exit EXIT;log exit'
exit to stdout
trap to stdout

$ cat /tmp/root.test.log
exit to file
trap to file

Last statement is a function with own EXIT handler

$ ./ksh -ec 'rm -f /tmp/$USER.test.log; function log { echo $* to stdout; echo $* to file >> /tmp/$USER.test.log; }; function test_exit { trap test_exittrap EXIT; log trap; };function test_exittrap { log exit; };test_exit'
trap to stdout
exit to stdout

$ cat /tmp/root.test.log
trap to file
exit to file

ERROR trap only

$ ./ksh -ec 'function error { echo "ERROR trap to stdout"; return 1; }; trap error ERR; false > /tmp/test.log'
ERROR trap to stdout

$ cat /tmp/test.log

Non builtin echo function

$ ./ksh -c '
rm -f /tmp/test.log
function my_echo
{
    echo $* to file
}
function log
{
    echo $* to stdout
    my_echo $* >> /tmp/test.log
}
function test_exit
{
    log trap
}
trap test_exit EXIT
log exit'
exit to stdout
trap to stdout

$ cat /tmp/test.log
exit to file
trap to file

Original testcase but run as a script

$ cat test-script.sh
rm -f /tmp/$USER.test.log
function log {
    echo $* to stdout
    echo $* to file >> /tmp/$USER.test.log
}
function test_exit {
    log trap
}
trap test_exit EXIT
log exit

$ ./ksh test-script.sh
exit to stdout
trap to stdout

$ cat /tmp/root.test.log
exit to file
trap to file

optimisation done by ksh93 file reading builtins break functionality and cause memory leak

(tested with ksh93u from package on Debian amd64)

$ seq 5 > a; ksh93 -c 'read a; echo test > a; read b; echo "$a $b"' < a
1 2

The second "read" could not possibly have read "2" because we have replaced the content of the "a" file with "test\n". What happens (from examining strace output) is that on the first "read", ksh93 has read up to 64kB worth of data, put the first line in $a, lseek()ed back to the end of that first line and remembered the data that was after that. Upon the second read, it optimizes out the read() system call and uses the remembered data instead.

It does check that the position of the stdin cursor within the file has not changed, to assert that the optimisation is valid. Here stdin has not moved, yet the optimisation is still invalid for a different reason: the content of the file has changed.

That optimisation will probably not gain you much in most cases on most systems because the OS will keep the read data in cache already (and knows better how and when to invalidate the cache), so that ksh93 behaviour could be seen as wasting resources by keeping another copy of that data in memory.

It also seems like there's a memory leak, in that the "remembered" data seems never to be freed, even after a file descriptor has been reused on a different file:

$ ksh93 -c 'ps -o rss,comm -p "$$"; for f in /usr/*/*; do read -n1 a < $f; done; ps -o rss,comm -p "$$"; :'
  RSS COMMAND
  1472 ksh93
  RSS COMMAND
134860 ksh93

It affects "read" and other builtin utilities that read data (which seem to share that same "remembered" data). I've verified it with cat and head.
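
To check a given shell for the stale-data behaviour without touching the current directory, something like the following can be used. It is a sketch: check_stale_read is a hypothetical helper that reruns the test case above in a scratch directory and only looks for the tell-tale "1 2" result.

```shell
# check_stale_read SHELL: print "stale" if SHELL's read builtin returns data
# remembered from before the file was rewritten, "fresh" otherwise.
check_stale_read() {
    dir=$(mktemp -d) || return 1
    printf '%s\n' 1 2 3 4 5 > "$dir/a"
    result=$(cd "$dir" && "$1" -c 'read a; echo test > a; read b; echo "$a $b"' < a)
    rm -rf "$dir"
    if [ "$result" = "1 2" ]; then
        echo stale
    else
        echo fresh
    fi
}

check_stale_read sh
```

Any result other than "1 2" is reported as "fresh"; what the second read actually returns depends on the offset left by the first one.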

Active locale/character set is not properly applied when parsing C-style strings

This issue affects the use of Korn Shell with variable-width character encodings that are not as well-behaved as UTF-8. In this case I am using GB-18030, an extended version of the Chinese national standard character encoding that covers all Unicode code points as well. When I say it is "not as well-behaved as UTF-8", specifically I mean that it is not self-synchronizing, and bytes from multi-byte characters, if taken out of context, can appear identical to other characters.

Take, for example, U+4E57, a Chinese character which is encoded in GB18030 and GBK as 0x81 0x5C:

$ LANG=zh_CN.GBK printf "echo \$'\\u4e57'" | LANG=zh_CN.GBK ksh | od -t x1z
0000000 81 0a                                            >..<

$ LANG=zh_CN.GBK printf "echo \$'\\u4e57n'" | LANG=zh_CN.GBK ksh | od -t x1z      
0000000 81 0a 0a                                         >...<

Basically, the second byte of the character, 0x5C is apparently interpreted as a backslash. This also occurs with "printf":

$ # 0x5C is interpreted as backslash and combined with "n"
$ LANG=zh_CN.GBK printf 'printf "\u4e57n"' | LANG=zh_CN.GBK ksh | od -t x1z       
0000000 81 0a                                            >..<

As far as I am aware this doesn't occur in other syntax:

$ # This turns out like "echo �\  x" - If the 0x5C byte is interpreted as "backslash" then it'd combine with a space - but it doesn't.
$ LANG=zh_CN.GBK printf 'echo \u4e57  x' | LANG=zh_CN.GBK ksh | od -t x1z         
0000000 81 5c 20 78 0a                                   >.\ x.<

With the locale set to zh_CN.GBK, the shell should interpret its input according to the GBK character encoding. As far as this encoding is concerned, there is no backslash in these examples.

Regression appending to an indexed array overwrites arr[-1] if arr[0] is unset

This appears to be a regression since the last release.

 $ ksh /dev/fd/9 9<<\EOF
set -x
typeset -a a=(w x) b=(a b c)
a+=("${b[@]}")           # Correct behavior with a[0] set
typeset -p a
typeset -a a=([1]=w [2]=x)
a+=("${b[@]}")           # Incorrectly overwrites a[-1] when a[0] is unset
typeset -p a
a[${#__[@]}+1].__+=(y z) # Hack to get a reference to the correct element.
typeset -p a .sh.version
EOF

+ a=( w x )
+ b=( a b c )
+ typeset -a a b
+ a+=( a b c )
+ typeset -p a
typeset -a a=(w x a b c)
+ a[1]=w
+ a[2]=x
+ typeset -a a
+ a+=( a b c )
+ typeset -p a
typeset -a a=([1]=w [2]=a [3]=b [4]=c)
+ a+=( y z )
+ typeset -p a .sh.version
typeset -a a=([1]=w [2]=a [3]=b [4]=c [5]=y [6]=z)
.sh.version='Version ABIJM 93v- 2014-12-24'

nmake fails to build on FreeBSD 11.0 and 11.1

This is probably the wrong place to get help. I'm trying to get a recent version of ksh going so I can install CDE. When I run ./bin/package make, it gives me this error and fails compiling ast.

mamake [cmd/nmake]: *** exit code 1 making expand.o
+ nmake --base --compile '--file=/home/wfisher/ast/src/cmd/nmake/Makerules.mk'
/bin/sh: nmake: not found
mamake [cmd/nmake]: *** exit code 127 making Makerules.mo
mamake: *** exit code 1 making cmd/nmake
package: make: errors making /home/wfisher/ast/arch/freebsd11.amd64/bin/nmake

I don't know if I'm compiling things wrong or missing a dependency but the Googles hasn't turned up any help.

Thanks!!

date: wrong timezone for 1970 -> 1971 in British timezone

https://en.wikipedia.org/wiki/British_Summer_Time#Periods_of_deviation

Between 27 October 1968 and 31 October 1971, there was no daylight saving time in mainland Britain. It was GMT+1 all year round. On both GNU and Solaris systems, the system's strftime/localtime is correct:

$ TZ=Europe/London perl -MPOSIX -le 'print strftime "%F %T %Z %z", localtime 0'
1970-01-01 01:00:00 BST +0100

(BST being then British Standard Time, not Summer this time).

But ksh93's "printf %T" or the ast date utility seem to get it wrong:

$ ksh93 -c 'printf "%(%F %T %Z %z)T\n" "#0"'
1970-01-01 00:00:00 GMT -0000

(ksh93u on Solaris 10)

$ arch/linux.i386-64/bin/date -d "#0"
Thu Jan  1 00:00:00 GMT 1970

(from the beta branch on Debian).

typeset -f output truncated for functions within functions

With ksh93u+ and v- on Debian amd64:

$ ksh -c 'function f { g() uname; g; }; typeset -f f'
function f { g() uname;

Or using POSIX function declaration syntax only:

$ ksh -c 'f() { g() { uname; }; g; }; typeset -f f'
f() { g() { uname; };

See how the definition of the f function is truncated just after the end of the g definition. Note that the output doesn't even include a newline.

If I pipe that output to ksh, I also get a SEGV (with ksh93u+, not ksh93v-):

$ ksh -c 'f() { g() { uname; }; g; }; typeset -f f' | ksh
ksh: syntax error at line 1: `{' unmatched
zsh: done                ksh -c 'f() { g() { uname; }; g; }; typeset -f f' |
zsh: segmentation fault  ksh

(not when passed as ksh -c or ksh file-that-contains-that-output)

ksh `..` vs $(..) differences when results are massive

If you assign a variable from a command substitution that produces a large amount of output, the backtick form cuts the result off, while the $(...) form does not. For example:

a=`find . -type f`
b=$(find . -type f)

[ "$a" == "$b" ]; echo $?
> 1

echo "$a" | wc -c
> 1388545
echo "$b" | wc -c
> 1881923

It appears the backticks method has a hard character limit of 1388545 (at least on the system I am testing this on).
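
A reproducer that does not depend on the contents of the local file system may make this easier to compare across machines. The sketch below assumes that about 2 MB of generated text is enough to cross whatever limit is involved; gen_lines is a hypothetical helper and the line count is arbitrary.

```shell
# gen_lines COUNT: emit COUNT numbered lines of fixed width.
gen_lines() {
    awk -v n="$1" 'BEGIN { for (i = 1; i <= n; i++) printf "line %08d of generated test output\n", i }'
}

a=`gen_lines 50000`
b=$(gen_lines 50000)

if [ "$a" = "$b" ]; then
    echo "identical (${#a} characters)"
else
    echo "different: backticks ${#a}, \$() ${#b} characters"
fi
```

A shell that handles both forms consistently should report "identical".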

ksh -u or 'set -o nounset' behaviour for undefined positional parameter like $1.

This is an issue which was reported earlier in the ast-developers forum, details can be found at
https://www.mail-archive.com/[email protected]/msg01906.html

I've observed the issue in the beta, alpha and master versions.
I've applied the following patch (for the alpha and master versions) to fix the issue.

--- INIT.2013-10-10/src/cmd/ksh93/sh/macro.c 2015-11-12 03:05:54.008417740 -0800
+++ INIT.2013-10-10/src/cmd/ksh93/sh/macro.c 2016-03-14 11:15:32.158386840 -0700
@@ -1220,7 +1220,7 @@
 {
 d=fcget();
 fcseek(-1);
-                   if(!strchr(":+-?=",d))
+                   if(d=='\0'  || !strchr(":+-?=",d))
                             errormsg(SH_DICT,ERROR_exit(1),e_notset,ltos(c));
             }
             break;
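
My reading of why the extra d=='\0' test matters (an interpretation, not something stated in the original report): in C, strchr() treats the terminating NUL as part of the string, so strchr(":+-?=", '\0') returns a non-null pointer and the unpatched condition can never fire when $1 is the last thing in the word. The behaviour the patch aims for can be checked from the outside with a sketch like this, where sh stands for whichever shell binary is under test:

```shell
# Under -u, a bare $1 with no arguments should be a fatal "parameter not set"
# error, while ${1-default} should still expand normally.
if sh -u -c 'echo "$1"' >/dev/null 2>&1; then
    echo 'bare $1: accepted (the behaviour reported above)'
else
    echo 'bare $1: rejected'
fi

sh -u -c 'echo "${1-default}"'
```

With the patch applied, the first check should report "rejected" while the second command should still print "default".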
    

ksh93: <>; combined with <#pattern with some builtins or no command fails to truncate

(tested with ksh93u+ and ksh93v- 2014-12-24 on Ubuntu 16.04 amd64)

In:

$ seq 10 > a; strace -e read,write,lseek,ftruncate ksh -c 'printf "" <>; a >#5; cat a'
lseek(1, 0, SEEK_CUR)                   = 0
read(1, "1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n", 65536) = 21
lseek(1, 0, SEEK_CUR)                   = 21
read(1, "", 65515)                      = 0
lseek(1, 8, SEEK_SET)                   = 8
lseek(1, 0, SEEK_CUR)                   = 8
ftruncate(1, 8)                         = 0
1
2
3
4

The file is properly truncated to the start of the line matching the pattern (5).
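
The offset of 8 in the trace can be cross-checked independently: lines "1" to "4" take two bytes each including the newline. A small sketch, assuming awk and single-byte characters; match_offset is a hypothetical helper and takes an awk regular expression rather than a shell pattern:

```shell
# match_offset PATTERN FILE: print the byte offset of the start of the first
# line in FILE that matches PATTERN (an awk regex, not a shell pattern).
match_offset() {
    awk -v pat="$1" '$0 ~ pat { print off + 0; exit }
                     { off += length($0) + 1 }' "$2"
}

tmp=$(mktemp)
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 > "$tmp"
match_offset 5 "$tmp"     # prints 8, matching ftruncate(1, 8) above
```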

But if we remove printf '' or replace it with many other builtins (I tried :, true, eval, alias x=x...), then we see:

$ seq 10 > a; strace -e read,write,lseek,ftruncate ksh -c '<>; a >#5; cat a'
lseek(1, 0, SEEK_CUR)                   = 0
read(1, "1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n", 65536) = 21
lseek(1, 0, SEEK_CUR)                   = 21
read(1, "", 65515)                      = 0
lseek(1, 0, SEEK_CUR)                   = 21
ftruncate(1, 21)                        = 0
lseek(1, 8, SEEK_SET)                   = 8
1
2
3
4
5
6
7
8
9
10

We see a truncation attempt but at the end of the file as if the <#5 failed to find a match.

It's the same if we use a fd other than 1 with printf "" like:

$ seq 10 > a; strace -e read,write,lseek,ftruncate ksh -c 'printf "" 3<>; a 3<#5; cat a'
lseek(3, 0, SEEK_CUR)                   = 0
lseek(3, 0, SEEK_CUR)                   = 0
read(3, "1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n", 65536) = 21
read(3, "", 65515)                      = 0
lseek(3, 0, SEEK_CUR)                   = 21
ftruncate(3, 21)                        = 0
lseek(3, 8, SEEK_SET)                   = 8
1
2
3
4
5
6
7
8
9
10

That does smell like an incorrect optimisation.

Note that the <#((expr)) operator doesn't seem to have a similar issue:

$ seq 10 > a; strace -e read,write,lseek,ftruncate ksh -c '<>; a >#((10)); cat a'
lseek(1, 0, SEEK_SET)                   = 0
lseek(1, 0, SEEK_CUR)                   = 0
read(1, "1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n", 65536) = 21
lseek(1, 10, SEEK_SET)                  = 10
lseek(1, 0, SEEK_CUR)                   = 10
ftruncate(1, 10)                        = 0
1
2
3
4
5

beta branch: Building ksh93 fails at libast/comp/tmpnam.c

When trying to build the beta branch under Ubuntu/Linux 15.10 using bin/package make the build fails with the following error message at tmpnam.c:

+ cc -D_BLD_DLL -fPIC -D_BLD_ast -O -I. -I/home/maus/projekte/ksh/src/lib/libast -Icomp -I/home/maus/projekte/ksh/src/lib/libast/comp -Iinclude -I/home/maus/projekte/ksh/src/lib/libast/include -Istd -I/home/maus/projekte/ksh/src/lib/libast/std -D_PACKAGE_ast -c /home/maus/projekte/ksh/src/lib/libast/comp/tmpnam.c
/home/maus/projekte/ksh/src/lib/libast/comp/tmpnam.c: In function '_ast_tmpnam':
/home/maus/projekte/ksh/src/lib/libast/comp/tmpnam.c:48:14: error: storage size of 'buf' isn't known
  static char buf[L_tmpnam];
              ^
/home/maus/projekte/ksh/src/lib/libast/comp/tmpnam.c:50:39: error: expected expression before ',' token
  return pathtemp(s ? s : buf, L_tmpnam, NiL, "tn", NiL);
                                       ^
mamake [lib/libast]: *** exit code 1 making tmpnam.o

After some analysis, L_tmpnam, P_tmpdir and L_ctermid seem to be defined empty, as can be seen by looking at arch/linux.i386-64/src/lib/libast/FEATURE/stdio:

#if defined(__STDPP__directive) && defined(__STDPP__initial)
__STDPP__directive pragma pp:initial
#endif
#ifndef P_tmpdir
#define P_tmpdir
#endif
#ifndef L_ctermid
#define L_ctermid
#endif
#ifndef L_tmpnam
#define L_tmpnam
#endif
#if defined(__STDPP__directive) && defined(__STDPP__initial)
__STDPP__directive pragma pp:noinitial
#endif

After manually patching these values to sensible values like the following the build succeeds:

#if defined(__STDPP__directive) && defined(__STDPP__initial)
__STDPP__directive pragma pp:initial
#endif
#ifndef P_tmpdir
#define P_tmpdir "/tmp"
#endif
#ifndef L_ctermid
#define L_ctermid 1024
#endif
#ifndef L_tmpnam
#define L_tmpnam 1024
#endif
#if defined(__STDPP__directive) && defined(__STDPP__initial)
__STDPP__directive pragma pp:noinitial
#endif

After applying these changes the build succeeds and ksh93 is built correctly.

Issue with printf %Lb "\0200" in UTF-8 locales

printf %Lb "\0200"

In UTF-8 locales, this seems to print random areas of memory.

In ksh93u on Debian amd64 (from package):

$ ksh -c 'printf %Lb "\0200"' | wc -c
18564
$ ksh -c 'printf %Lb "\0200"' | wc -c
18972

With ksh93v- (built from beta git branch), it seems to enter some infinite loop in:

#0  ast_mbrchar (w=0x7ffc6404c664 L"", s=0x25b781d23d41 "", n=16, q=0x7ffc6404c7d0) at src/lib/libast/comp/setlocale.c:2188
#1  0x0000000000574583 in sfvprintf (f=0x841ec0 <_Sfstdout>, form=0x25b781d23d33 "", args=0x7ffc64051838) at src/lib/libast/sfio/sfvprintf.c:744
#2  0x0000000000566b67 in sfprintf (f=0x841ec0 <_Sfstdout>, form=0x5bddef "%!") at src/lib/libast/sfio/sfprintf.c:48
#3  0x0000000000492ec3 in b_print (argc=-1, argv=0x25b781d23c10, context=0x7ffc64051ab0) at src/cmd/ksh93/bltins/print.c:350
#4  0x00000000004925ea in b_printf (argc=3, argv=0x25b781d23c00, context=0x8433f0 <sh+1392>) at src/cmd/ksh93/bltins/print.c:150
#5  0x0000000000472692 in sh_exec (shp=0x7ffc6404c664, t=0x25b781d23d41, flags=5) at src/cmd/ksh93/sh/xec.c:1387
#6  0x0000000000416cad in exfile (shp=0x7ffc6404c664, iop=0x25b781d23d41, fno=16) at src/cmd/ksh93/sh/main.c:610
#7  0x0000000000416065 in sh_main (ac=3, av=0x7ffc640522e8, userinit=0x0) at src/cmd/ksh93/sh/main.c:382
#8  0x0000000000415192 in main (argc=3, argv=0x7ffc640522e8) at src/cmd/ksh93/sh/pmain.c:45

typeset -S within functions

Weird behaviour when using static function variables (typeset -S)

Consider the following script:

function alpha {
    integer -S count=0
    (( ++ count ))
    print -n " $count"
}

function beta {
    integer count=0
    (( ++ count ))
    print -n " $count"
}

print -n "Alpha:"; alpha; alpha; alpha; alpha; alpha; print
print -n "Beta: "; beta;  beta;  beta;  beta;  beta;  print

My understanding is that since the static variable is declared within a non-POSIX function its scope is that function's scope. Consequently the expected output should be:

Alpha: 1 2 3 4 5
Beta:  1 1 1 1 1

While the output of the alpha() function is consistent across tests, the behaviour of beta() is not. I get various outputs, sometimes correct, but mostly incorrect. This has been tested with 93u+ on Linux and macOS.

Sample outputs:

Beta:  0 0 0 0 0
Beta:  0 0 0 1 0
Beta:  0 0 1 0 0
Beta:  1 1 0 1 1

I have not detected the pattern. Consequently I can reproduce the error (almost systematically), but not the exact output.

I tried the following without success:

  • Change the loading order of functions
  • Declare a variable with same name in the global scope (to eventually force a local scoping)
  • Replace the integer alias by its typeset equivalent

iffe has trouble detecting dynamic linking on FreeBSD 11

On FreeBSD 10 it works:

+ mamake -C lib/libdll -k install
+ set -
+ iffe -v -c 'cc -D_BLD_DLL -fPIC -Wno-unused-value -Wno-parentheses -Wno-logical-op-parentheses -O2 -pipe  -fstack-protector -fno-strict-aliasing    -lm -fstack-protector ' ref -L/wrkdirs/usr/ports/shells/ksh93/work/ksh93-20160716/arch/freebsd11.amd64/lib -I/wrkdirs/usr/ports/shells/ksh93/work/ksh93-20160716/arch/freebsd11.amd64/include/ast -I/wrkdirs/usr/ports/shells/ksh93/work/ksh93-20160716/arch/freebsd11.amd64/include /wrkdirs/usr/ports/shells/ksh93/work/ksh93-20160716/arch/freebsd11.amd64/lib/libast.a -lm : run /wrkdirs/usr/ports/shells/ksh93/work/ksh93-20160716/src/lib/libdll/features/dll
iffe: test: is sys/types.h a header ... yes
iffe: test: is -lm a library ... yes
iffe: test: is /wrkdirs/usr/ports/shells/ksh93/work/ksh93-20160716/arch/freebsd11.amd64/lib/libast.a a library ... yes
iffe: test: is dl.h a header ... no
iffe: test: is dlfcn.h a header ... yes
iffe: test: is dll.h a header ... no
iffe: test: is rld_interface.h a header ... no
iffe: test: is mach-o/dyld.h a header ... no
iffe: test: is sys/ldr.h a header ... no
iffe: test: is -ldl a library ... no
iffe: test: is dlopen a library function ... yes
iffe: test: is dllload a library function ... no
iffe: test: is loadbind a library function ... no
iffe: test: is shl_load a library function ... no
iffe: test: link{ ... }end ... no
iffe: test: run{ ... }end ... yes
iffe: test: output{ ... }end ... yes

On FreeBSD 11, however, it does not:

+ iffe -v -c 'cc -D_BLD_DLL -fPIC -Wno-unused-value -Wno-parentheses -Wno-logical-op-parentheses -O2 -pipe  -fstack-protector -fno-strict-aliasing    -lm -fstack-protector ' ref -L/wrkdirs/usr/ports/shells/ksh93/work/ksh93-20160716/arch/freebsd11.amd64/lib -I/wrkdirs/usr/ports/shells/ksh93/work/ksh93-20160716/arch/freebsd11.amd64/include/ast -I/wrkdirs/usr/ports/shells/ksh93/work/ksh93-20160716/arch/freebsd11.amd64/include -last -lm : run /wrkdirs/usr/ports/shells/ksh93/work/ksh93-20160716/src/lib/libdll/features/dll
iffe: test: is sys/types.h a header ... yes
iffe: test: is -lm a library ... yes
iffe: test: is -last a library ... no
iffe: test: is dl.h a header ... no
iffe: test: is dlfcn.h a header ... yes
iffe: test: is dll.h a header ... no
iffe: test: is rld_interface.h a header ... no
iffe: test: is mach-o/dyld.h a header ... no
iffe: test: is sys/ldr.h a header ... no
iffe: test: is -ldl a library ... no
iffe: test: is dlopen a library function ... yes
iffe: test: is dllload a library function ... no
iffe: test: is loadbind a library function ... no
iffe: test: is shl_load a library function ... no
iffe: test: link{ ... }end ... no
iffe: test: run{ ... }end ... yes
iffe: test: output{ ... }end ... no

I wonder why FreeBSD 11 is getting the -last parameter while the working FreeBSD 10 build uses /wrkdirs/usr/ports/shells/ksh93/work/ksh93-20160716/arch/freebsd11.amd64/lib/libast.a. Where does this difference come from?

Additional info:

FreeBSD 10 uses clang 3.4.1
FreeBSD 11 uses clang 4.0.0
I have 477c024 already applied to iffe

Since the result is "iffe: test: output{ ... }end ... no", it all leads to failures in recognizing dlopen(), and therefore -last does not get built.

nv_open("tcl_library", 0, 0) will crash tksh on startup

tksh crashes on startup with the following backtrace:

#0  0x0000000000613d44 in dtuserdata (dt=0x0, data=0x0, set=0)
    at /home/saper/sw/ast/src/lib/libast/cdt/dtuser.c:45
#1  0x0000000000519d26 in nv_open (name=0x676c84 "tcl_library", root=0x0, flags=0)
    at /home/saper/sw/ast/src/cmd/ksh93/sh/name.c:1427
#2  0x00000000004a54e5 in TkshOpenVar (interp=0x1e06422c41c0, name1=0x7fffffffde10, name2=0x7fffffffde08, 
    flags=65537, options=8, msg=0x676e18 "set") at /home/saper/sw/ast/src/lib/libtksh/src/var.c:137
#3  0x00000000004a5cd8 in Tcl_SetVar2 (interp=0x1e06422c41c0, part1=0x676c84 "tcl_library", part2=0x0, 
    newValue=0x1e0642273860 "lib/tksh7.6", flags=65537) at /home/saper/sw/ast/src/lib/libtksh/src/var.c:338
#4  0x00000000004a7167 in Tcl_SetVar (interp=0x1e06422c41c0, varName=0x676c84 "tcl_library", 
    newValue=0x1e0642273860 "lib/tksh7.6", flags=1) at /home/saper/sw/ast/src/lib/libtksh/src/var.c:865
#5  0x00000000004a4f27 in TkshCreateInterp (interp=0x1e06422c41c0, data=0xa19880 <builtInCmds>)
    at /home/saper/sw/ast/src/lib/libtksh/src/init.c:108
#6  0x00000000004acfd1 in Tcl_CreateInterp () at /home/saper/sw/ast/src/lib/libtksh/src/basic.c:210
#7  0x000000000040a03a in Tksh_TkMain (argc=1, argv=0x7fffffffe140, appInitProc=0x40a788 <Tksh_AppInit>)
    at /home/saper/sw/ast/src/cmd/tksh/tkMain.c:108
#8  0x000000000040ab09 in b_tkinit (argc=1, argv=0x7fffffffe140, context=0x0)
    at /home/saper/sw/ast/src/cmd/tksh/tkMain.c:676
#9  0x0000000000409f58 in tksh_userinit (shp=0xa34d80 <sh>, subshell=0)
    at /home/saper/sw/ast/src/cmd/tksh/uinit.c:57
#10 0x00000000004f48f4 in sh_init (argc=1, argv=0xa1d840 <_error_info_>, userinit=0x409e36 <tksh_userinit>)
    at /home/saper/sw/ast/src/cmd/ksh93/sh/init.c:1787
#11 0x00000000004d16e6 in sh_main (ac=1, av=0x7fffffffe778, userinit=0x409e36 <tksh_userinit>)
    at /home/saper/sw/ast/src/cmd/ksh93/sh/main.c:146
#12 0x000000000040a002 in main (argc=1, argv=0x7fffffffe778) at /home/saper/sw/ast/src/cmd/tksh/uinit.c:77

According to the nval(3) man page:

SYNOPSIS

       Namval_t        *nv_open(const char *name, Dt_t *dict, int flags);

DESCRIPTION

       The function nv_open() returns a pointer to a  name-value  pair  corre‐
       sponding  to  the  given  name.   It  can  also assign a value and give
       attributes to a name-value pair.  The argument dict defines the dictio‐
       nary  to search.  A NULL value causes the shell global variable dictio‐
       nary to be searched.

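TkshOpenVar() may intentionally call nv_open() with dict set to NULL, which the man page allows. That call now crashes: in the backtrace above, nv_open() receives root=0x0 and passes it to dtuserdata() as dt=0x0.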

nv_open() now simply calls dtuserdata(root, 0, 0). This changed in the 2013-10-10 alpha release; previously, a call to sh_getinterp() was used to determine the default value of root when none was specified.
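
The fallback that the man page describes can be sketched as below. This is only an illustration of the pattern, not the ksh93 source: Dict, global_dict, default_dict(), userdata_unchecked(), resolve_root() and open_name() are all hypothetical stand-ins, and the real Dt_t, dtuserdata() and sh_getinterp() are deliberately not used so that the example compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a dictionary handle; NOT the real Dt_t. */
typedef struct Dict {
    const char *label;    /* only used to tell dictionaries apart */
    void       *userdata; /* per-dictionary data */
} Dict;

/* Hypothetical stand-in for the shell's global variable dictionary. */
static Dict global_dict = { "global", NULL };

/* Hypothetical helper that supplies the default dictionary. */
Dict *default_dict(void)
{
    return &global_dict;
}

/* Reads a field through the pointer without checking it, so a NULL
 * argument would be dereferenced (compare frame #0, where dt=0x0). */
void *userdata_unchecked(Dict *dict)
{
    return dict->userdata;
}

/* Guard: map a NULL root to the default dictionary before any use. */
Dict *resolve_root(Dict *root)
{
    return root ? root : default_dict();
}

/* Hypothetical lookup entry point that tolerates root == NULL.
 * Returns the label of the dictionary that was actually searched. */
const char *open_name(const char *name, Dict *root)
{
    Dict *dict = resolve_root(root);

    (void)name;                     /* the lookup itself is omitted */
    (void)userdata_unchecked(dict); /* safe: dict is never NULL here */
    return dict->label;
}
```

With a guard of this kind, a call such as open_name("tcl_library", NULL) searches the global dictionary instead of dereferencing a null pointer. Whether the right fix belongs in nv_open() or in the callers is a separate question that this sketch does not answer.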
