1
0
Fork 0
mirror of git://git.code.sf.net/p/cdesktopenv/code synced 2025-02-15 04:32:24 +00:00
Commit graph

402 commits

Author SHA1 Message Date
Martijn Dekker
b16c91f012 Release 1.0.3
This point release mainly fixes the following:
- A bug in history expansion (set -H) where any use of the history
  comment character caused processing to be aborted as if it were
  an invalid history expansion. Affected e.g. 'echo ${#v}'.
- A bug in command line options processing that caused short-form
  option equivalents on some built-in commands to be ignored after
  one use, e.g., the new read -a equivalent of read -A.
- Ksh freezing or using excessive memory if HISTSIZE is assigned a
  pathologically large value.
- A bug that caused ksh in the vi editor mode to crash or produce
  invalid completions if ESC = was used at the beginning of a line.
2022-08-25 20:30:58 +01:00
Martijn Dekker
9e7ecb1b17 Fix freeze on HISTSIZE value exceeding INT_MAX
@GabrielRavier writes:
> When int is the same as int32_t, starting ksh with
> HISTSIZE=2000000000 in the environment (or any number between
> 1073741824 and 2147483647) results in an infinite loop, because
> of this code:
>
>	if(cp=nv_getval(HISTSIZE))
>		maxlines = (unsigned)strtol(cp, (char**)0, 10);
>	else
>		maxlines = HIST_DFLT;
>	for(histmask=16;histmask <= maxlines; histmask <<=1 );
>
> which gets stuck infinitely because histmask <= maxlines will
> always be true (after 26 iterations, histmask == 1073741824,
> after 27 iterations, histmask == -2147483648 and after more than
> 28 iterations it's stuck at 0).

src/cmd/ksh93/edit/history.c: hist_init():
- Add bounds check for maxlines. HIST_MAX is the number of bytes at
  which the history file gets cleaned up, so the theoretical
  maximum number of lines (i.e.: each line being empty, consisting
  only of a terminating 0 byte) should also be HIST_MAX. That value
  is 16384 on a system where sizeof(int)==4.

Resolves: https://github.com/ksh93/ksh/issues/522
2022-08-24 04:16:23 +01:00
pghvlaans
b6c8bb7b3c vi mode: Disable = and tab at the start of the line (#521)
In vi mode, due to a buffer overflow, <ESC>= as the first input
either produces an invalid completion or crashes the shell. The
easiest fix is to disable <ESC>= as well as <TAB> completion at
the start of the command line, as is already done in emacs mode.

src/cmd/ksh93/edit/vi.c:
- getline(): If cur_virt <= 0 in case '\t', beep and refuse tab
  completion.
- textmod(): Move an 'if' statement checking for INVALID cur_virt
  to include = mode along with * and \ mode.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
Resolves: https://github.com/ksh93/ksh/issues/520
2022-08-24 04:16:08 +01:00
Johnothan King
bea4fd56e8 Fix 'read -a' failure to create array (re: d55e9686) (#516)
The commit that backported read -a did not add a case label for it
to read.c. This was under the assumption that AST optget(3) would
always convert -a to -A. However, that was only done for first use.

The cause is the short-form options caching mechanism in optget(3).
On the first run, the pre-caching result would be returned, but the
equivalent option (-a) would be cached as if it is its own option,
so on the second and subsequent runs, optget returned 'a' instead
of 'A'. This only happens if no long-form equivalent is present.

Reproducer:

  $ read -A foo <<< 'foo bar baz'
  $ unset foo
  $ read -a foo <<< 'foo bar baz'
  $ echo ${foo[0]}
  foo bar baz

Expected: foo

src/lib/libast/misc/optget.c,
src/lib/libast/misc/optlib.h:
- [by Martijn Dekker] Implement caching for short-option
  equivalents. If a short-form equivalent is found, instead of
  caching it as a separate option, cache the equivalent in a new
  equiv[] array. Check this when reading the cache and substitute
  the main option for the equivalent if one is cached.

src/lib/libcmd/cp.c:
- Fix cp -r/cp -R symlink handling. The -r and -R options sometimes
  ignored -P, -L and -H.
- The -r and -R options no longer follow symlinks by default.

src/cmd/ksh93/bltins/whence.c,
src/lib/libcmd/*.c:
- Remove case labels that are redundant now that the optget(3)
  caching bug is fixed.

src/cmd/ksh93/tests/libcmd.sh:
- Added. This is the new script for the /opt/ast/bin path-bound
  built-ins from libcmd. Other relevant tests are moved into here.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2022-08-20 17:11:52 +01:00
Johnothan King
4e18d65894 Fix regression in handling of head builtin's -s flag (re: 4ca578bd) (#514)
Commit 4ca578bd accidentally left in half of an if statement that was
originally just a use of the SFMTXBEGIN2 macro (which is a no-op without
vt_threaded). As a result, the head builtin's -s flag broke in one of
the ksh2020 regression tests. Reproducer:
   # Note that head must be enabled in src/cmd/ksh93/data/builtins.c
   builtin head || exit 1
   cat > "/tmp/file1" <<EOF
This is line 1 in file1
This is line 2 in file1
This is line 3 in file1
This is line 4 in file1
This is line 5 in file1
This is line 6 in file1
This is line 7 in file1
This is line 8 in file1
This is line 9 in file1
This is line 10 in file1
This is line 11 in file1
This is line 12 in file1
EOF
   got=$(head -s 5 -c 18 "/tmp/file1")
   exp=$'is line 1 in file1'
   [[ "$got" = "$exp" ]] || print "'head -s' failed" "(expected $(printf %q "$exp"), got $(printf %q "$got"))"

Code change that caused this bug (note that since the if statement
wasn't completely removed, it now guards the for loop despite the
newline):
        if(fw)
-               SFMTXBEGIN2(fw, (Sfoff_t)0);

        for(n_move = 0; n != 0; )

I noticed this since I'm making a separate commit to backport more of
the ksh2020 regression tests.

src/lib/libast/sfio/sfmove.c:
- Remove the incomplete if statement to fix the -s flag.

src/cmd/ksh93/tests/b_head.sh:
- Backport the ksh2020 regression tests for the head builtin.
2022-08-16 22:48:47 +01:00
Martijn Dekker
dde8da6769 Fix history comment character in history expansion (re: 41ee12a5)
Reproducer: on an interactive shell with the -H option on,

  $ v=foo
  $ print ${#v}

does not print anything (but should print "3").
The 'print' line also is not added to the history.

This bug was exposed by commit 41ee12a5 which enabled the history
comment character by default, setting it to '#' as on bash. When it
was disabled by default, this bug was rarely exposed.

The problem happens here:

src/cmd/ksh93/edit/hexpand.c
203: if(hc[2] && *cp == hc[2]) /* history comment designator, skip rest of line */
204: {
205: 	stakputc(*cp++);
206: 	stakputs(cp);
207: 	DONE();
208: }

The DONE() macro sets the HIST_ERROR bit flag:

src/cmd/ksh93/edit/hexpand.c
45: #define	DONE()	{flag |= HIST_ERROR; cp = 0; stakseek(0); goto done;}

For the history comment character that is clearly wrong, as no
error has occurred.

There is another problem. The documentation I added for history
expansion states this bit, which is based on bash's behaviour:

    If a word on a command line begins with the history comment
    character #, history expansion is ignored for the rest of that
    line.

With an expansion like ${#v}, the word does not begin with # so
history expansion should not have parsed that as a comment
character. The intention was to act like bash.

src/cmd/ksh93/edit/hexpand.c:
- Split the DONE() macro into DONE and ERROROUT of which only the
  latter sets the HIST_ERROR bit flag. Change usage accordingly.
  Thix fixes the first problem.
- Don't make these new macros function-style macros (with ()) as
  they end in a goto, so that's a bit misleading.
- Add is_wordboundary() which makes a best-effort attempt to
  determine if the character following the specified character is
  considered to start a new word by shell grammar rules. Word
  boundary characters are whitespace and: |&;()`<>
- Only recognise the history comment character if is_wordbounary()
  returns true for the previous character. This fixes the second
  problem.

Thanks to @jghub for the bug report.
Resolves: https://github.com/ksh93/ksh/issues/513
2022-08-16 22:00:20 +01:00
Martijn Dekker
ebdcfcbb62 Release 1.0.2 (re: cd063869)
This release fixes the interactive shell crashing when one of the
predefined aliases (currently 'history' and 'r') is redefined,
whether from a profile/kshrc script or manually. This crash
occurred in two scenarios:

1. when redefining and then unsetting a predefined alias;

2. when redefining a predefined alias and then executing a shell
   script that does not begin with a #! path.

Both are fixed now.
2022-08-08 10:49:58 +01:00
Martijn Dekker
cd0638690c Fix crashes on redefining/unsetting predefined alias (re: 7e7f1372)
Reproducer 1:

  $ alias r
  r='hist -s'
  $ alias r=foo
  $ unalias r
  ksh(10127,0x10d6c35c0) malloc: *** error for object 0x7ffdcd404468: pointer being freed was not allocated
  ksh(10127,0x10d6c35c0) malloc: *** set a breakpoint in malloc_error_break to debug
  Abort

The crash happens as unall() (typeset.c) calls nv_delete() (name.c)
which tries to free a node pointer that was not directly allocated.

Reproducer 2:

  $ ENV=/./dev/null ksh
  $ echo : >script
  $ chmod +x script
  $ alias r=foo
  $ ./script
  ksh(10193,0x10c8505c0) malloc: *** error for object 0x7fa136c04468: pointer being freed was not allocated
  ksh(10193,0x10c8505c0) malloc: *** set a breakpoint in malloc_error_break to debug
  Abort

This crash happens for the same reason, but in another location,
namely in sh_reinit() (init.c) as it is freeing up the alias table
before executing a script that does not start with a #! path.

This is a serious bug because it is not uncommon for .kshrc or
.profile scripts to (re)define an alias called 'r'.

Analysis:

These crashes happen because the incorrectly freed node pointer is
part of a larger block of nodes initialised by sh_inittree() in
init.c. That function allocates all the nodes for a table (see
data/{aliases,builtins,variables}.c) in a contiguous block
addressable by numeric index (see builtins.h and variables.h for
how that is used).

So, while the value of the alias is correctly marked NV_NOFREE and
is not freed, that attribute does not apply to the node pointer
itself, which also is not freeable. Thus, if the value is replaced
by a freeable one, the node pointer is incorrectly freed upon
unaliasing it, and the shell crashes.

The simplest fix is to allocate each predefined alias node
individually, like any other alias -- because, in fact, we do not
need the predefined alias nodes to be in a contiguous addressable
block; there is nothing that specifically addresses these aliases.

src/cmd/ksh93/sh/main.c: sh_main():
- Instead of calling sh_inittree(), use a dedicated loop to
  allocate each predefined alias node individually, making each
  node invidually freeable. The value does remain non-freeable,
  but the NV_NOFREE attribute correctly takes care of that.

src/cmd/ksh93/bltins/typeset.c:
- Get rid of the incomplete and now unnecessary workarounds in
  unall() and sh_reinit().

Thanks to @jghub and @JohnoKing for finding and reporting the bug.
Discussion: https://github.com/ksh93/ksh/discussions/503#discussioncomment-3337172
2022-08-06 11:54:33 +01:00
Martijn Dekker
9c9743998f Revert buggy exec optimisation (re: d6c9821c)
As of 2022-06-18, ksh 93u+m is not capable of being used as /bin/sh
while building GNU binutils. The execution of some of its build
system's dot scripts is incorrectly aborted as an external 'sed'
command is execve(2)'d without forking. This means that incorrect
exec optimization was happening.

Unfortunately I have not been able to derive a minimal reproducer
of the problem yet because the GNU binutils build scripts are very
complex. Pending further research, the optimisation is reverted.
Even if a way to make it work is found, it will not be reintroduced
to the 1.0 branch.

Thanks to @atheik for finding the problem and identifying the
commit that introduced it.

Resolves: https://github.com/ksh93/ksh/issues/507
2022-08-05 12:52:49 +02:00
Martijn Dekker
3a25aa0d93 Release 93u+m/1.0.0
_        _        ___ _____                          ___   ___   ___
| | _____| |__    / _ \___ / _   _   _   _ __ ___    / / | / _ \ / _ \
| |/ / __| '_ \  | (_) ||_ \| | | |_| |_| '_ ` _ \  / /| || | | | | | |
|   <\__ \ | | |  \__, |__) | |_| |_   _| | | | | |/ / | || |_| | |_| |
|_|\_\___/_| |_|    /_/____/ \__,_| |_| |_| |_| |_/_/  |_(_)___(_)___/

It may have taken exactly a decade, but here we are... a proper new
ksh release. :) Many thanks to all contributors for their hard work!
Compared to an unpatched 93u+, this release has roughly a thousand
bugs fixed. It incorporates a fair number of enhancements as well.

Not all known bugs have been worked out yet; see the TODO file. Let's
hope this release will rekindle interest and attract more bug hunters.

This commit also makes some very minor fixes in comments. Notable:
src/cmd/ksh93/sh/arith.c: sh_strnum():
- Update a security-related comment. As of b48e5b33, evaluating
  untrusted arithmetic expressions from the environment should no
  longer cause CVE-2019-14868. But let's keep disallowing it anyway.

Resolves: https://github.com/ksh93/ksh/issues/491
2022-08-01 21:44:02 +02:00
Johnothan King
3389cab2f5 Fix building on Haiku and fix some typos (#498)
- Fixed a few minor typos and updated the list of tested systems in
  src/cmd/ksh93/README.
- vmmopen.c: Include the ast.h header to fix a build error on Haiku
  caused by the otherwise undefined NoN macro (re: 05f0c1b1).
2022-07-31 00:12:07 +02:00
Martijn Dekker
9f6841c37e Fix "$*" doing pattern matching if $IFS is wildcard (BUG_IFSGLOBS)
The bug is that "$*", and related expansions such as "${arr[*]}",
etc., do pattern matching if the first character of $IFS is a
wildcard. For example, the following:

    IFS=*
    set -- F ''
    case BUGFREE in
    BUG"$*")        echo bug ;;
    esac

outputs 'bug'. This bug can be reproduced in every other glob
pattern matching context as well, but not in pathname expansion.

src/cmd/ksh93/sh/macro.c: varsub():
- When joining fields into one for a "$*"-type expansion, check if
  a glob pattern matching operation follows (mp->pattern is set).
  If so, write a preceding backslash to escape the separator.

Resolves: https://github.com/ksh93/ksh/issues/489
Resolves: https://github.com/att/ast/issues/12
2022-07-28 23:36:39 +02:00
pghvlaans
3be94a7fd4 vi mode completion: use APPEND when e_nlist is 0 (#493)
Attempting to tab-complete the only member of a directory while in
vi mode switches to "control" under the following conditions:

 1. pwd is outside of the directory

 2. No part of the basename has been typed in (e.g., starting with
    vim TEST/t will autocomplete to vim TEST/test, but starting
    with vim TEST/ will not).

Also, trying to type with "input" or "replace" fails afterwards
until a new line has been generated (either with ctrl + c or the
j/k keys in control mode).

src/cmd/ksh93/edit/vi.c: textmod():
- The strange completion behavior in vi mode for single-member
  directories appears to be at least partially due to completion
  failing to use \ or * mode from outside of pwd. Checking for
  e_nlist directly from vi.c — 0 unless in = mode — allows for
  entering the else loop at line 2575 to reach APPEND mode.

Resolves: https://github.com/ksh93/ksh/issues/485
2022-07-27 23:37:30 +02:00
Martijn Dekker
48f78227a9 Fix $PPID for hashbangless script (re: 232b7bff)
I made a mistake in sh.reinit() which caused $PPID to be set to the
new process ID, not the parent process ID. This commit fixes it by
introducing, updating and using sh.current_ppid, so we continue to
minimise context switches due to getpid(2)/getppid(2) system calls.

Thanks to Geoff Clare for the report.
2022-07-27 22:48:57 +02:00
Martijn Dekker
592ce7415a test/[/[[: Fix incorrect 0-skipping, float precision (re: c734568b)
The arguments to the binary numerical comparison operators (-eq,
-gt, etc.) in the [[ and test/[ commands are treated as arithmetic
expressions, even if $((...)) is not used. But there is some
seriously incorrect behaviour:

Reproducers (all should output 0/true):

    $ [[ 0x0A -eq 10 ]]; echo $?
    1
    $ [[ 1+0x0A -eq 11 ]]; echo $?
    0
    $ (set --posix; [[ 010 -eq 8 ]]); echo $?
    1
    $ (set --posix; [[ +010 -eq 8 ]]); echo $?
    0
    $ [[ 0xA -eq 10 ]]; echo $?
    1
    $ xA=10; [[ 0xA -eq 10 ]]; echo $?
    0
    $ xA=WTF; [[ 0xA -eq 10 ]]; echo $?
    ksh: WTF: parameter not set

(POSIX mode enables the recognition of the initial 0 as a prefix
indicating an octal number in arithmetic expressions.)

The cause is the two 'while' loops in this section in test_binop()
in src/cmd/ksh93/bltins/test.c:

502:	if(op&TEST_ARITH)
503:	{
504:		while(*left=='0')
505:			left++;
506:		while(*right=='0')
507:			right++;
508:		lnum = sh_arith(left);
509:		rnum = sh_arith(right);
510:	}

So initial zeros are unconditionally skipped. Ostensibly this is to
disable the recognition of the initial zero as an octal prefix as
well as 0x as a hexadecimal prefix. This would be okay for
enforcing a decimal-only limitation for simple numbers, but to do
this for arithmetic expressions is flagrantly incorrect, to say the
least. (insert standard rant about AT&T quality standards here)

The fix for '[[' is simply to delete the two 'while' loops. But
that creates a problem for the deprecated-but-standard 'test'/'['
command, which also uses the test_binop() function. According to
POSIX, test/[ only parses simple decimal numbers, so octal, etc. is
not a thing. But, after that fix, 'test 08 -eq 10' in POSIX mode
yields true, which is unlike every other shell. (Note that this is
unlike [[ 08 -eq 10 ]], which yields true on bash because '[['
parses operands as arithmetic expressions.)

For test/[ in non-POSIX mode, we don't need to change anything. For
POSIX mode, we should only parse literal decimal numbers for these
operators in test/[, disallowing unexpanded arithmetic expressions.
This makes ksh's POSIX-mode test/[ act like every other shell and
like external .../bin/test commands shipped by OSs.

src/cmd/ksh93/bltins/text.c: test_binop():
- Correct a type mismatch. The left and right hand numbers should
  be Sfdouble_t, the maximum double length on the current system,
  and the type that sh_arith() returns. Instead, long double
  (typeset -lF) values were reduced to double (typeset -F) before
  comparing!
- For test/[ in POSIX, only accept simple decimal numbers: use
  strtold() instead of sh_arith(). Do skip initial zeros here as
  this disables the recognition of the hexadecimal prefix. Throw an
  error on an invalid decimal number. Floating point constants are
  still parsed, but that's fine as this does not cause any
  incompatibility with POSIX.

- For code legibility, move handling of TEST_EQ, etc. to within the
  if(op&TEST_ARITH) block. This also allows lnum and rnum to be
  local to that block.
2022-07-26 21:33:00 +02:00
Martijn Dekker
b7817c3750 Release 93u+m/1.0.0-rc.1
We're nearly there!

I intend to release ksh 93u+m/1.0.0 on 2022-08-01, precisely ten
years after the last canonical 93u+ release.

We have a week until then, so here's a release candidate. Please
try as hard as you can to break it, and to help fix known bugs.

As of this commit, the 1.0 branch is feature-frozen and will only
get bugfixes.

src/cmd/ksh93/fun/man:
- Last-minute fix: .man.try_os_man(): do not look for arguments
  with / in section 1 and 8; this can cause false positives.
2022-07-25 04:03:16 +02:00
Martijn Dekker
b72992f684 ksh93/fun: Add 'autocd'
This autoloadable function activates a feature similar to 'shopt -s
autocd' in bash: type only a directory name to change directory.
It uses the DEBUG trap to work. See the file for details.
2022-07-24 14:52:30 +02:00
Martijn Dekker
663606866e ksh93/fun: add 'man' function, making builtins help easy to use
This adds a new 'man' function in src/cmd/ksh93/fun/man, meant for
autoload. This integrates the --man self-documentation of ksh
built-in and external AST commands with your system's 'man' command
so you can conveniently use 'man' for all commands, whether
built-in or external. See the file for details.
2022-07-24 10:36:30 +02:00
Martijn Dekker
3e79027cd1 'command -x': also bypass path-bound built-ins (re: 66e1d446)
Since 'command -x' provides xargs-like functionality to repeatedly
run external commands if the argument list is too long for one
invocation, it makes little sense to use with built-ins. So I
decided that 'command -x' should double as a way to guarantee
running an external command. However, it was only bypassing plain
built-ins and not path-bound builtins (the ones that show up with a
virtual directory path name in the output of 'builtin').

src/cmd/ksh93/sh/path.c:
- Before looking for a path-bound built-in, check that the SH_XARG
  state bit is not on. This needs to be done in path_absolute() as
  well as in path_spawn().
2022-07-24 07:33:56 +02:00
Martijn Dekker
bb4f23e63f set -b/--notify: do not mess up the command line
This commit makes three interrelated changes.

First, the code for erasing the command line before redrawing it
upon a window size change is simplified and modernised. Instead of
erasing the line with lots of spaces, it now uses the sequence
obtained from 'tput ed' (usually ESC, '[', 'J') to "erase to the
end of screen". This avoids messing up the detection and automatic
redrawing of wrapped lines on terminals such as Apple Terminal.app.

Second, the -b/--notify option is made more usable. When it is on
and either the vi or emacs/gmacs line editor is in use, 'Done' and
similar notifications are now buffered and trigger a command line
redraw as if the window size changed, and the redraw routine prints
that notify buffer between erasing and redrawing the commmnd line.
The effect is that the notification appears to magically insert
itself directly above the line you're typing. (The notification
behaviour when not in the shell line editor, e.g. while running
commands such as external editors, is unchanged.)

Third, a bug is fixed that caused -b/--notify to only report on one
job when more than one terminated at the same time. The rest was
reported on the next command line as if -b were not on. Reproducer:
$ set -b; sleep 1 & sleep 1 & sleep 1 &

This commit also includes a fair number of other window size and
$COLUMNS/$LINES handling tweaks that made all this easier, not all
of which are mentioned below.

src/cmd/ksh93/include/fault.h,
src/cmd/ksh93/sh/fault.c:
- Replace sh_update_columns_lines with a new sh_winsize() function.
  It calls astwinsize() and is to be used instead of it, and
  instead of nv_getval(LINES) and nv_getval(COLUMNS) calls. It:
  - Allows passing one or neither of lines or cols pointers.
  - Updates COLUMNS and LINES, but only if they actually changed
    from the last values. This makes .set discipline functions
    defined for these variables more useful.
  - Sets the sh.winch flag, but only if COLUMNS changes. If only
    the height changes, the command line does not need redrawing.

src/cmd/ksh93/include/io.h:
- Add sh_editor_active() that allows checking whether one of vi,
  emacs or gmacs is active without onerous #if SHOPT_* directives.

src/cmd/ksh93/sh/jobs.c: job_reap():
- Remove the fix backported in fc655f1a, which was really just a
  workaround that papered over the real bug.
- Move a check for errno==ECHILD so it is only done when waitpid()
  returns an error (pid < 0); the check was not correct because C
  library functions that do not error out also do not change errno.
- Move the SH_NOTIFY && SH_TTYWAIT code section to within the
  while(1) loop so it is run for each job, not only the
  last-processed one. This fixes the bug where only one job was
  notified when more than one ended at the same time.
- In that section, check if an editor is active; if so, set the
  output file for job_list() to sh.notifybuffer instead of standard
  error, list the jobs without the initial newline (JOB_NLFLAG),
  and trigger a command line redraw by setting sh.winch.

src/cmd/ksh93/edit/edit.c:
- Obtain not just CURSOR_UP but also ERASE_EOS (renamed from
  KILL_LINE) using 'tput'. The latter had the ANSI sequence
  hardcoded. Define a couple of TPUT_* macros to make it easier to
  deal with terminfo/termcap codes.
- Add get_tput() to make it easier to get several tput values
  robustly (with SIGINT blocked, trace disabled, etc.)
- ed_crlf(): Removed. Going by those ancient #ifdefs, nothing that
  93u+m will ever run on requires a '\r' before a '\n' to start a
  new line on the terminal. Plus, as of 93u+, there were already
  several spots in emacs.c and vi.c where it printed a sole '\n'.
- ed_read():
  - Simplify/modernise command line erase using ERASE_EOS.
  - Between erasing and redrawing, print the contents of the notify
    buffer. This has the effect of inserting notifications above
    the command line while the user is typing.

src/cmd/ksh93/features/cmds:
- To detect terminfo vs termcap codes, use all three codes we're
  currently using. This matters on at least on my system (macOS
  10.14.6) in which /usr/bin/tput has incomplete terminfo support
  (no 'ed') but complete termcap support!
2022-07-24 04:11:12 +02:00
Martijn Dekker
6afe659690 Report POSIX function or dot script name in error messages
An old annoyance of mine: when an error occurs in a ksh function,
the function name is reported in the error message, but the same is
not done for POSIX functions.

Since POSIX functions are treated as glorified dot scripts,
b_dot_cmd() needs an errorpush() call and an assignment to
error_info.id to make this happen for both POSIX functions and dot
scripts. This is the same thing that sh_funscope() does for ksh
functions. There is now no difference from ksh functions in how
these report errors.

Note that the sh_popcontext() macro includes an errorpop() call, so
that does not need to be added. See fault.h and error.h.
2022-07-21 07:03:31 +02:00
Martijn Dekker
948fab26aa Fix: running external command influences $RANDOM sequence
The pseudorandom generator generates a reproducible sequnece of
values after seeding it with a known value. But:

    $ (RANDOM=1; echo $RANDOM; echo $RANDOM)
    2100
    18270
    $ (RANDOM=1; echo $RANDOM; ls >/dev/null; echo $RANDOM)
    2100
    30107

Since RANDOM was seeded with a specific value, the two command
lines should output the same pair of numbers. Running 'ls' in the
middle should make no difference.

The cause is a nv_getval(RANDNOD) call in xec.c, case TFORK, that
is run for all TFORK cases, in the parent process -- including
background jobs and external commands. What should happen instead
is that $RANDOM is reseeded in the child process.

This bug is in every version of ksh93 since before 1995.

There is also an opportunity for optimisation. As of 396b388e, the
RANDOM seed may be invalidated by setting rand_last to -2,
triggering a reseed the next time a $RANDOM value is obtained. This
was done to optimise the virtual subshell mechanism. But that can
also be used to eliminate unconditional reseeding elsewhere. So as
of this commit, RANDOM is never (re)seeded until it's used.

src/cmd/ksh93/include/variables.h,
src/cmd/ksh93/sh/subshell.c:
- Add RAND_SEED_INVALIDATED macro, a single source of truth for the
  value that triggers a reseeding in sh_save_rand_seed().
- Add convenient sh_invalidate_rand_seed() function macro.

src/cmd/ksh93/sh/init.c,
src/cmd/ksh93/sh/xec.c:
- Optimisation: invalidate seed instead of reseeding directly.
- sh_exec(): case TFORK: Delete the nv_getval(RANDNOD) call. Add a
  sh_invalidate_rand_seed() to the child part. This fixes the bug.
2022-07-21 01:01:32 +02:00
Martijn Dekker
adc6a64b82 Fix bad job control msg on trapped SIGINT and set -b (re: fc655f1)
When running an external command while trapping Ctrl+C via SIGINT,
and set -b is on, then a spurious Done job control message is
printed. No background job was executed.

    $ trap 'ls' INT
    $ set -b
    $ <Ctrl+C>[file listing follows]

    [1] +  Done                    set -b

In jobs.c (487-493), job_reap() calls job_list() to list a running
or completed background job, passing the JOB_NFLAG bit to only
print jobs with the P_NOTIFY flag. But the 'ls' in the trap is not
a background job. So it is getting the P_NOTIFY flag by mistake.

In fact all processes get the P_NOTIFY flag by default when they
terminate. Somehow the shell normally does not follow a code path
that calls job_list() for foreground processes, but does when
running one from a trap. I have not yet figured out how that works.

What I do know is that there is no reason why non-background
processes should ever have the P_NOTIFY flag set on termination,
because those should never print any 'Done' messages. And we seem
to have a handy P_BG flag that is set for background processes; we
can check for this before setting P_NOTIFY. The only thing is that
flag is only compiled in if SHOPT_BGX is enabled, as it was added
to support that functionality.

For some reason I am unable to reproduce the bug in a pty session,
so there is no pty.sh regression test.

src/cmd/ksh93/sh/jobs.c:
- Rename misleadingly named P_FG flag to P_MOVED2FG; this flag is
  not set for all foreground processes but only for processes moved
  to the foreground by job_switch(), called by the fg command.
- Compile in the P_BG flag even when SHOPT_BGX is not enabled. We
  need to set this flag to check for a background job.
- job_reap(): Do not set the P_NOTIFY flag for all terminated
  processes, but only for those with P_BG set.

src/cmd/ksh93/sh/xec.c: sh_fork():
- Also pass special argument 1 for background job to job_post() if
  SHOPT_BGX is not enabled. This is what gets it to set P_BG.
- Simplify 5 lines of convoluted code into 1.

Resolves: https://github.com/ksh93/ksh/issues/481
2022-07-14 07:02:30 +02:00
Martijn Dekker
ffee9100d5 Robustify ${.sh.level} scope switching (re: 69d37d5e, e1c41bb2)
Switching the function scope to a parent scope by assigning to
.sh.level (SH_LEVELNOD) leaves the shell in an inconsistent state,
causing invalid-free and/or use-after-free bugs. The intention of
.sh.level was always to temporarily switch scopes inside a DEBUG
trap, so this commit minimises the pitfalls and instability by
imposing some sensible limitations:
1. .sh.level is now a read-only variable except while executing a
   DEBUG trap;
2. while it's writeable, attempts to unset .sh.level or to change
   its attributes are ignored;
3. attempts to set a discipline function for .sh.level are ignored;
4. it is an error to set a level < 0 or > the current scope.

Even more crashing bugs are fixed by simplifiying the handling and
initialisation of .sh.level and by exempting it completely from
virtual subshell scoping (to which it's irrelevant).

TODO: one thing remains: scope corruption and use-after-free happen
when using the '.' command inside a DEBUG trap with ${.sh.level}
changed. Behaviour same as before this commit. To be investigated.

All changed files:
- Consistently use the int16_t type for level values as that is the
  type of its non-pointer storage in SH_LEVELNOD.
- Update .sh.level by using an update_sh_level() macro that assigns
  directly to the node value, then restores the scope if needed.
- To eliminate implicit typecasts, use the same int16_t type (the
  type used by short ints such as SH_LEVELNOD) for all variables
  containing a function and/or dot script level.

src/cmd/ksh93/include/variables.h:
- Add update_sh_level() macro.

src/cmd/ksh93/include/name.h,
src/cmd/ksh93/sh/macro.c:
- Add a nv_nonptr() macro that checks attributes for a non-pointer
  value -- currently only signed or unsigned short integer value,
  accessed via the 's' member of 'union Value' (e.g. np->nvalue.s).
- nv_isnull(): To avoid undefined behaviour, check for attributes
  indicating a non-pointer value before accessing the nvalue.cp
  pointer (re: 5aba0c72).
- varsub(): In the set/unset check, remove the now-redundant
  exception for SH_LEVELNOD.

src/cmd/ksh93/data/variables.c,
src/cmd/ksh93/sh/init.c:
- shtab_variables[]: Make .sh.level a read-only short integer.
- sh_inittree(): To avoid undefined behaviour, do not assign to the
  'union Value' char pointer if the attribute indicates a non-
  pointer short integer value. Instead, the table value is ignored.

src/cmd/ksh93/sh/subshell.c: sh_assignok():
- Never create a subshell scope for SH_LEVELNOD.

src/cmd/ksh93/sh/xec.c:
- Get rid of 'struct Level' and its maxlevel member. This was only
  used in put_level() to check for an out of range assignment, but
  this can be trivially done by checking sh.fn_depth+sh.dot_depth.
- This in turn allows further simplification that reduces init for
  .sh.level to a single nv_disc() call in sh_debug(), so get rid of
  init_level().
- put_level(): Throw a "level out of range" error if assigned a
  wrong level.
- sh_debug():
  - Turn off the NV_RDONLY (read-only) attribute for SH_LEVELNOD
    while executing the DEBUG trap.
  - Restore the current scope when trap execution is finished.
- sh_funct(): Remove all .sh.level handling. POSIX functions (and
  dot scripts) already handle it in b_dot_cmd(), so sh_funct(),
  which is used by both, is the wrong place to do it.
- sh_funscope(): Update .sh.level for ksh syntax functions here
  instead. Also, do not bother to initialise its discipline here,
  as it can now only be changed in a DEBUG trap.

src/cmd/ksh93/bltins/typeset.c: setall():
- When it's not read-only, ignore all attribute changes for
  .sh.level, as changing the attributes would crash the shell.

src/cmd/ksh93/sh/nvdisc.c: nv_setdisc():
- Ignore all attempts to set a discipline function for .sh.level,
  as doing this would crash the shell.

src/cmd/ksh93/bltins/misc.c: b_dot_cmd():
- Bug fix: also update .sh.level when quitting a dot script.

src/cmd/ksh93/sh/name.c:
- _nv_unset():
  - To avoid an inconsistent state, ignore all attempts to unset
    .sh.level.
  - To avoid undefined behaviour, do not zero np->nvalue.cp if
    attributes for np indicate a non-pointer value (the actual bit
    value of a null pointer is not defined by the standard, so
    there is no guarantee that zeroing .cp will zero .s).
- sh_setscope(): For consistency, always set error_info.id (the
  command name for error messages) to the new scope's cmdname.
  Previously this was only done for two calls of this function.
- nv_name(): Fix a crashing bug by checking that np->nvname is a
  non-null pointer before dereferencing it.

src/cmd/ksh93/include/nval.h:
- The NV_UINT16P macro (which is unsigned NV_INT16P) had a typo in
  it, which went unnoticed for many years because it's not directly
  used (though its bit flags are set and used indirectly). Let's
  fix it anyway and keep it for completeness' sake.
2022-07-13 23:11:18 +02:00
Martijn Dekker
3ce064bbba lex.c: endword(): fix out-of-bounds index to state table
The lexer use 256-byte state tables (see data/lexstates.c), one
byte per possible value for the (unsigned) char type. But the sp
variable used as an index to a state table in loops like this...
                while((n = state[*sp++]) == 0)
                        ;
...is a char*, a pointer to a char. The C standard does not define
if the char type is signed or not (!). On clang and gcc, it is
signed. That means that, whenever a single-byte, high-bit (> 127)
character is encountered, the value wraps around to negative, and a
read occurs outside of the actual state table, causing potentially
incorrect behaviour or a crash.

src/cmd/ksh93/sh/lex.c:
- endword(): Make sp and three related variables explicitly
  unsigned char pointers. This requires a bunch of annoying
  typecasts to stop compilers complaining; so be it.
- To avoid even more typecasts, make stack_shift() follow suit.
- Reorder variable declarations for legibility.
2022-07-11 02:20:27 +02:00
Martijn Dekker
7a01d6df47 history: fix out-of-bounds read on retrieving empty line
Reproducer: Compile a ksh with AddressSanitizer. In that ksh, edit
the last command line with 'fc', insert an empty line at the start,
and save. Now use the up-arrow to retrieve the empty line. Ksh
aborts on history.c line 1011 as hist_copy() tries to read before
the beginning of the buffer pointed to by s1.

src/cmd/ksh93/edit/history.c: hist_copy():
- Verify that the s1 pointer was increased from the original s1
  before trying to read the character *(s1-1).
2022-07-10 21:18:49 +02:00
Martijn Dekker
1934686de3 Fix oddly specific syntax error corrupting subsequent [[ ... ]]
Reproducer:

    $ x=([x]=1 [y)
    -ksh: syntax error: `)' unexpected
    $ [[ -z $x ]]
    -ksh: [[ -z  ]]: not found

Any '[[' command following that syntax error will fail similarly;
the whole of it (after variable expansion) is incorrectly looked up
as a command name. The syntax error must be generated by an
associative array assignment (with or without an explicit typeset
-A) with at least one valid assignment element followed by an
invalid assignment element starting with '[' but not containing
']='.

This seems to be another bug that is in every ksh93 version ever.
I've confirmed that ksh 1993-12-28 s+ and ksh2020 fail identically.
Presumably, so does everything in between.

Analysis:

The syntax error function, sh_syntax(), calls lexopen() in mode 0
to reset the lexer state. There is a variable that isn't getting
reset there though it should be. Using systematic elimination I
found that the variable that needs to be reset is lp->assignok (set
"when name=value is legal"). If it is set, '[[' is not processed.

src/cmd/ksh93/sh/lex.c: lexopen():
- Reset 'assignok' in the lexer state (regardless of mode).
- In the mode 0 total lexer state reinit, several members of lexd
  (struct _shlex_pvt_lexdata_) were not getting reset; just memset
  the whole thing to zero.
     Note for backporters: this change requires commit da97587e to
  be correct. That commit took the stack size and pointer (lex_max
  and *lex_match) out of this struct; those should not be reset!

Resolves: https://github.com/ksh93/ksh/issues/486
2022-07-09 23:00:11 +02:00
Martijn Dekker
fbfd4d3ab8 Fix syntax error detection in associative array assignments
Reproducer:

$ fn=([foo_key]=foo_val [bar_key])
-ksh: [bar_key]: not found

Expected output:
-ksh: syntax error: `[bar_key]' unexpected

As soon as one correct associative array assignment element has
been processed, a subsequent one, starting with '[' but not
containing ']=', is incorrectly seen as a command to execute.
If a command '[bar_key]' existed on $PATH, it would have been run.

src/cmd/ksh93/sh/parse.c: simple():
- In the syntax check for associative array assignments, don't just
  check for an initial '[' but also verify the presence of ']='.

Thanks to @JohnoKing for finding this bug.

Resolves: https://github.com/ksh93/ksh/issues/427
2022-07-05 22:16:55 +02:00
Martijn Dekker
06e56251b9 Fix wrong syntax error upon process substitution after redirection
Grammatically, redirections may occur anywhere within a command
line and are removed after processing them, whereas a process
substitution (<(commandlist) or >(commandlist)) is replaced by a
file name which should be treated as just another simple word.
So the following should not be a syntax error:

$ cat </dev/null <(true)
-ksh: syntax error: `)' unexpected
$ cat </dev/null >(true)
-ksh: syntax error: `)' unexpected
$ cat >/dev/null <(true)
-ksh: syntax error: `)' unexpected
$ cat >/dev/null >(true)
-ksh: syntax error: `)' unexpected

This bug is in every ksh93 version.

The problem is in the parser (parse.c). The process substitution is
misparsed as a redirection due to inout() recursively parsing
multiple redirections without recognising process substitutions.
inout() is mistaking '<(' for '<' and '>(' for '>', which explains
the incorrect syntax error.

This also causes the following to fail to detect a syntax error:
$ cat >&1 <(README.md
[the contents of README.md are shown]

...and other syntax errors detected in the wrong spot, for example:
$ { true; } <(echo wrong)
-ksh: syntax error: `wrong' unexpected
which should be:
-ksh: syntax error: `<(' unexpected

src/cmd/ksh93/sh/parse.c:
- Add global inout_found_procsub flag.
- inout(): On encountering a process substitution, set this flag
  and return, otherwise clear the flag.
- simple(): After calling inout(), check this flag and, if set,
  jump back to where process substitutions are parsed.

Resolves: https://github.com/ksh93/ksh/issues/418
2022-07-05 13:20:28 +02:00
Martijn Dekker
400806afa6 Do not avoid creating subshell for last command if there are traps
Reproducer:

$ ksh -c 'trap "echo OK" TERM; (kill -s TERM $$)'

Actual output: none
Expected output: OK

The bug is only triggered if 'kill' is executed from a subshell
that is optimised out due to being the last command in the script.

src/cmd/ksh93/sh/xec.c: sh_exec(): case TPAR:
- Instead of only checking for EXIT and ERR traps, do not avoid
  creating a virtual subshell if there are any traps (except DEBUG,
  SIGKILL, SIGSTOP); for this, use the sh.st.trapdontexec flag
  introduced in 40245e08.
2022-07-03 12:52:34 +02:00
Martijn Dekker
4df6d674a0 Fix signal exit status of last command in subshell (re: b3050769)
Reproducer (on macOS/*BSD where SIGUSR1 has signal number 30):

  $ ksh -c '(sh -c '\''kill -s USR1 $$'\''); echo $?'
  ksh: 54220: User signal 1
  30

Expected output for $?: 286, not 30. The signal is not reflected in
the 9th bit of the exit status.

This bug was introduced for virtual subshells in b3050769 but
exists in every ksh93 version for real (forked) subshells:

  $ ksh -c '(ulimit -t unlimited; trap : EXIT; \
	sh -c '\''kill -s USR1 $$'\''); echo $?'
  ksh: 54267: User signal 1
  30

(As of d6c9821c, a dummy trap is needed to trigger the bug, or it
will be masked by the exec optimization for the sh invocation.)

This is caused by the exit status being masked to 8 bits when a
subshell terminates. For a real subshell, this is inevitable as the
kernel does this. As of b3050769, virtual subshells behave in a
manner consistent with real subshells in this regard.

However, for both virtual and real subshells, if its last command
was terminated by a signal, then that should still be reflected in
the 9th bit of ksh's exit stauts.

The root of the problem is that ksh simply cannot rely internally
on the 9th bit of the exit status to determine if a command exited
due to a signal. The 9th bit may be trimmed by a subshell or may be
set by 'return' without a signal being involved. This commit fixes
it by introducing a separate flag which will be a reliable
indicator of this.

src/cmd/ksh93/include/shell.h:
- Add sh.chldexitsig flag (set if the last command was a child
  process that exited due to a signal).

src/cmd/ksh93/sh/jobs.c: job_wait():
- When the last child process exited due to a signal, not only set
  the 9th (SH_EXITSIG) bit of sh.exitval but also sh.chldexitsig.

src/cmd/ksh93/sh/subshell.c: sh_subshell():
- Fix the virtual subshell reproducer above. After trimming the
  exit status to 8 bit, set the 9th bit if sh.chldexitsig is set.
  This needs to be done in two places: one that runs in the parent
  process after sh_subfork() and one for the regular virtual
  subshell exit.

src/cmd/ksh93/sh/fault.c:
- sh_trap(): Save and restore sh.chldexitsig so that this fix does
  not get deactivated if a trap is set.
- sh_done():
  - Fix the real subshell reproducer above. When the last command
    of a real subshell is a child process that exited due to a
    signal (i.e., if (sh.chldexitsig && sh.realsubshell)), then
    activate the code to pass down the signal to the parent
    process. Since there is no way to pass a 9-bit exit status to a
    parent process, this is the only way to ensure a correct exit
    status in the parent shell environment.
  - When exiting the main shell, use sh.chldexitsig and not the
    unreliable SH_EXITSIG bit to determine if the 8th bit needs to
    be set for a portable exit status indicating its last command
    exited due to a signal.
2022-07-03 12:49:36 +02:00
Martijn Dekker
372d704bfb {edit,fault}.c: improve SIGWINCH and SIGINT handling
The fault.c and edit.c changes in this commit were inspired by
changes in the 93v- beta but take a slightly different approach:
mainly, the code to update $COLUMNS and $LINES is put in its own
function instead of duplicated in sh_chktrap() and ed_setup().

src/cmd/ksh93/sh/fault.c:
- Move code to update $COLUMNS and $LINES to a new
  sh_update_columns_lines() function so it can be reused.
- Fix compile error on systems without SIGWINCH.

src/cmd/ksh93/edit/edit.c:
- ed_setup(): Call sh_update_columns_lines() instead of issuing
  SIGWINCH to self.
- Change two sh_fault(SIGINT) calls to issuing the signal to the
  current process using kill(2). sh_fault() is now never called
  directly (as in the 93v- beta).

src/cmd/ksh93/sh/main.c: sh_main():
- On non-interactive, call sh_update_columns_lines() and set the
  signal handler for SIGWINCH to sh_fault(), causing $COLUMNS and
  $LINES to be kept up to date when the terminal window is resized
  (this is handled elsewhere for interactive shells). This change
  makes ksh act like mksh, bash and zsh. (Previously, ksh required
  setting a dummy SIGWINCH trap to get auto-updated $COLUMNS and
  $LINES in scripts, as this set the SIGWINCH signal handler to
  sh_fault(). This persisted even after unsetting the trap again,
  so that was inconsistent behaviour.)

src/cmd/ksh93/include/shell.h:
- Don't define sh.winch on systems without SIGWINCH.

src/cmd/ksh93/sh.1:
- Update and tweak the COLUMNS and LINES variable documentation.
- Move them up to the section of variables that are set by the
  shell (which AT&T should have done before).
2022-07-01 22:49:40 +02:00
Martijn Dekker
416a412d71 Fix seeking on block devices with FD 0, 1 or 2
@stephane-chazelas reports:
> A very weird issue:
>
> To reproduce on GNU/Linux (here as superuser)
>
> # truncate -s10M file
> # export DEV="$(losetup -f --show file)"
> # ksh -c 'exec 3<> "$DEV" 3>#((0))'   # fine
> # ksh -c 'exec 1<> file 1>#((0))'     # fine
> # ksh -c 'exec 1<> "$DEV" 1>#((0))'
> ksh: 0: invalid seek offset
>
> Any seek offset is considered "invalid" as long as the file is a
> block device and the fd is 0, 1 or 2. It's fine for fds above 2
> and it's fine with any fd for regular files.

Apparently, block devices are not seekable with sfio. In io.c there
is specific code to avoid using sfio's sfseek(3) if there is no
sfio stream in sh.sftable[] for the file descriptor in question:

1398:	Sfio_t *sp = sh.sftable[fn];
[...]
1420:		if(sp)
1421:		{
1422:			off=sfseek(sp, off, SEEK_SET);
1423:			sfsync(sp);
1424:		}
1425:		else
1426:			off=lseek(fn, off, SEEK_SET);

For file descriptors 0, 1 or 2 (stdin/stdout/stderr), there is a
sh.sftable[] stream by default, and it is marked as not seekable.
This makes it return -1 in these lines in sfseek.c, even if the
system call called via SFSK() succeeds:

89:	if(f->extent < 0)
90:	{	/* let system call set errno */
91:		(void)SFSK(f,(Sfoff_t)0,SEEK_CUR,f->disc);
92:		return (Sfoff_t)(-1);
93:	}

...which explains the strange behaviour.

src/lib/libast/sfio/sfseek.c: sfseek():
- Allow for the possibility that the fallback system call might
  succeed: let it handle both errno and the return value.

Resolves: https://github.com/ksh93/ksh/issues/318
2022-06-28 22:18:46 +02:00
Martijn Dekker
da97587e9e lex.c: prevent restoring outdated stack pointer
Lexical levels are stored in a dynamically grown array of int values
grown by the stack_grow function. The pointer lex_match and the
maximum index lex_max are part of the lexer state struct that is now
saved and restored in various places -- see e.g. 37044047, a2bc49be.
If the stack needs to be grown, it is reallocated in stack_grow()
using sh_realloc(). If that happens between saving and restoring the
lexer state, then an outdated pointer is restored, and crash.

src/cmd/ksh93/include/shlex.h,
src/cmd/ksh93/sh/lex.c:
- Take lex_match and lex_max out of the lexer state struct and make
  them separate static variables.

src/cmd/ksh93/edit/edit.c:
- While we're at it, save and restore the lexer state in a way that
  is saner than the 93v- beta approach (re: 37044047) as well as
  more readable. Instead of permanently allocating memory, use a
  local variable to save the struct. Save/restore directly around
  the sh_trap() call that actually needs this done.

Resolves: https://github.com/ksh93/ksh/issues/482
2022-06-23 03:35:48 +01:00
Martijn Dekker
d8dc2a1d81 sh_setenviron(): deactivate compound assignment prefix
Reproducers:

$ ksh -c 'typeset -a arr=( ( (a $(($(echo 1) + 1)) c)1))'
ksh: echo: arr[0]._AST_FEATURES=CONFORMANCE - ast UNIVERSE - ucb: cannot be an array
ksh: [1]=1: invalid variable name
$ ksh -c 'typeset -a arr=( (a $(($(echo 1) + 1)) c)1)'
ksh: echo: arr._AST_FEATURES=CONFORMANCE - ast UNIVERSE - ucb: is not an identifier
ksh: [1]=1: invalid variable name

src/cmd/ksh93/sh/name.c: sh_setenviron():
- Save and clear the current compound assignment prefix (sh.prefix)
  while assigning to the _AST_FEATURES variable.
2022-06-23 03:34:16 +01:00
Martijn Dekker
225323f138 Fix more "/dev/null: cannot create" (re: 411481eb)
Reproducer:

  trap : USR1
  while :; do kill -s USR1 $$ || exit; done &
  while :; do : >/dev/null; done

It can take between a fraction of a second and a few minutes, but
eventually it will fail like this:

  $ ksh foo
  foo[3]: /dev/null: cannot create
  kill: 77946: no such process

It fails similarly with "cannot open" if </dev/null is used instead
of >/dev/null.

This is the same problem as in the referenced commit, except when
handling traps -- so the same fix is required in sh_fault().
2022-06-20 19:39:00 +01:00
Martijn Dekker
40245e088d Fix the exec optimisation mess (re: 17ebfbf6, 6701bb30, d6c9821c)
This commit supersedes @lijog's Solaris patch 280-23332860 (see
17ebfbf6) as this is a more general fix that makes the patch
redundant. Of course its associated regression tests stay.

Reproducer script:

	trap 'echo SIGUSR1 received' USR1
	sh -c 'kill -s USR1 $PPID'

Run as a normal script.
Expected behaviour: prints "SIGUSR1 received"
Actual behaviour: the shell invoking the script terminates. Oops.

As of 6701bb30, ksh again allows an exec-without-fork optimisation
for the last command in a script. So the 'sh' command gets the same
PID as the script, therefore its parent PID ($PPID) is the invoking
script and not the script itself, which has been overwritten in
working memory. This shows that, if there are traps set, the exec
optimisation is incorrect as the expected process is not signalled.

While 6701bb30 reintroduced this problem for scripts, this has
always been an issue for certain other situations: forked command
substitutions, background subshells, and -c option argument
scripts. This commit fixes it in all those cases.

In sh_exec(), case TFORK, the optimisation (flagged in no_fork) was
only blocked for SIGINT and for the EXIT and ERR pseudosignals.
That is wrong. It should be blocked for all signal and pseudosignal
traps, except DEBUG (which is run before the command) and SIGKILL
and SIGSTOP (which cannot be trapped).

(I've also tested the behaviour of other shells. One shell, mksh,
never does an exec optimisation, even if no traps are set. I don't
know if that is intentional or not. I suppose it is possible that a
script might expect to receive a signal without trapping it first,
and they could conceivably be affected the same way by this exec
optimisation. But the ash variants (e.g. Busybox ash, dash, FreeBSD
sh), as well as bash, yash and zsh, all do act like this, so the
behaviour is very widespread. This commit makes ksh act like them.)

Multiple files:
- Remove the sh.errtrap, sh.exittrap and sh.end_fn flags and their
  associated code from the superseded Solaris patch.

src/cmd/ksh93/include/shell.h:
- Add a scoped sh.st.trapdontexec flag for sh_exec() to disable
  exec-without-fork optimisations. It should be in the sh.st scope
  struct because it needs to be reset in subshell scopes.

src/cmd/ksh93/bltins/trap.c: b_trap():
- Set sh.st.trapdontexec if any trap is set and non-empty (an empty
  trap means ignore the signal, which is inherited by an exec'd
  process, so the optimisation is fine in that case).
- Only clear sh.st.trapdontexec if we're not in a ksh function
  scope; unlike subshells, ksh functions fall back to parent traps
  if they don't trap a signal themselves, so a ksh function's
  parent traps also need to disable the exec optimisation.

src/cmd/ksh93/sh/fault.c: sh_sigreset():
- Introduce a new -1 mode for sh_funscope() to use, which acts like
  mode 0 except it does not clear sh.st.trapdontexec. This avoids
  clearing sh.st.trapdontexec for ksh functions scopes (see above).
- Otherwise, clear sh.st.trapdontexec whenever traps are reset.

src/cmd/ksh93/sh/xec.c: check_exec_optimization():
- Consolidate all the exec optimisation logic into this function,
  including the logic from the no_fork flag in sh_exec()/TFORK.
- In the former no_fork flag logic, replace the three checks for
  SIGINT, ERR and EXIT traps with a single check for the
  sh.st.trapdontexec flag.
2022-06-18 23:27:10 +01:00
Martijn Dekker
16b3802148 Fix incorrect exec optimisation with monitor/pipefail on
Reproducer script:
    tempfile=/tmp/out2.$$.$RANDOM
    bintrue=$(whence -p true)
    for opt in monitor pipefail
    do
            (
                    set +x -o "$opt"
                    (
                            sleep .05
                            echo "ok $opt" >&2
                    ) 2>$tempfile | "$bintrue"
            ) &
            wait
            cat "$tempfile"
            rm -f "$tempfile"
    done

Expected output:
    ok monitor
    ok pipefail

Actual output:
    (none)

The 'monitor' and 'pipefail' options are supposed to make the shell
wait for the all commands in the pipeline to terminate and not only
the last component, regardless of whether the pipe between the
component commands is still open. In the failing reproducer, the
dummy external true command is subject to an exec optimization, so
it replaces the subshell instead of forking a new process. This is
incorrect, as the shell is no longer around to wait for the
left-hand part of the pipeline, so it continues in the background
without being waited for. Since it writes to standard error after
.05 seconds (after the pipe is closed), the 'cat' command reliably
finds the temp file empty. Without the sleep this would be a race
condition with unpredictable results.

Interestingly, this bug is only triggered for a (background
subshell)& and not for a forked (regular subshell). Which means the
exec optimization is not done for a forked regular subshell, though
there is no reason not to. That will be fixed in the next commit.

src/cmd/ksh93/sh/xec.c: sh_exec():
- case TFORK: Never allow an exec optimization if we're running a
  command in a multi-command pipeline (pipejob is set) and the
  shell needs to wait for all pipeline commands, i.e.: either the
  time keyword is in use, the SH_MONITOR state is active, or the
  SH_PIPEFAIL option is on.
- case TFIL: Fix the logic for setting job.waitall for the
  non-SH_PIPEFAIL case. Do not 'or' in the boolean value but assign
  it, and include the SH_TIMING (time keyword in use) state too.
- case TTIME: After that fix in case TFIL, we don't need to bother
  setting job.waitall explicitly here.

src/cmd/ksh93/sh.1:
- Add missing documentation for the conditions where the shell
  waits for all pipeline components (time, -o monitor/pipefail).

Resolves: https://github.com/ksh93/ksh/issues/449
2022-06-18 23:25:30 +01:00
Martijn Dekker
6016fb64ce Forking workaround for converting to associative array in subshell
$ arch/*/bin/ksh -xc 'typeset -a a=(1 2 3); \
  (typeset -A a; typeset -p a); typeset -p a'
typeset -A a=()
typeset -a a=(1 2 3)

The associative array in the subshell is empty, so the conversion
failed. So far, I have been unsuccessful at fixing this in the
array and/or virtual subshell code (a patch that fixes it there
would still be more than welcome).

As usual, real subshells work correctly, so this commit adds
another forking workaround. The use case is rare and specific
enough that I have no performance concerns.

src/cmd/ksh93/bltins/typeset.c: setall():
- Fork a virtual subshell if we're actually converting a variable
  to an associative array, i.e.: the NV_ARRAY (-A, associative
  array) attribute was passed, there are no assignments (sh.envlist
  is NULL), and the variable is not unset.

src/cmd/ksh93/tests/arith.sh:
- Fix the "Array subscript quoting test" tests that should not have
  been passing and that correctly failed after this fix; they used
  'typeset -A' without an assignment in a subshell, assuming it was
  unset in the parent shell, which it wasn't.

Resolves: https://github.com/ksh93/ksh/issues/409
2022-06-15 04:58:14 +01:00
Martijn Dekker
50db00e136 Fix subshell trap integrity, e.g. re-trapping a signal in subshell
Ksh handles local traps in virtual subshells the same way as local
traps in ksh-style shell functions, which can cause incorrect
operation.

Reproducer script:

	trap 'echo "parent shell trap"; exit 0' USR1
	(trap 'echo "subshell trap"' USR1; kill -USR1 $$)
	echo wrong

Output on every ksh93 version: 'wrong'
Output on every other shell: 'parent shell trap'

The ksh93 output is wrong because $$ is the PID of the main shell,
therefore 'kill -USR1 $$' from a subshell needs to issue SIGUSR1 to
the main shell and trigger the 'echo SIGUSR1' trap.

This is an inevitable consequence of processing signals in a
virtual subshell. Signals are a process-wide property, but a
virtual subshell and the parent shell share the same process.
Therefore it is not possible to distinguish between the parent
shell and subshell trap.

This means virtual subshells are fundamentally incompatible with
receiving signals. No workaround can make this work properly.

Ksh could either assume the signal needs to be caught by the
subshell trap (wrong in this case, but right in others) or by the
parent shell trap. But it does neither and just gives up and does
nothing, which I suppose is the least bad way of doing it wrong.

As another example, consider a subshell that traps a signal, then
passes its own process ID (as of 9de65210, that's ${.sh.pid}) to
another process to say "here is where to signal me". A virtual
subshell will send it the PID that it shares with the the parent
shell. Even if a virtual subshell receives the signal correctly, it
may fork mid-execution afterwards, depending on the commands that
it runs (and this varies by implementation as we fix or work around
bugs). So its PID could be invalidated at any time.

Forking a virtual subshell at the time of trapping a signal is the
only way to ensure a persistent PID and correct operation.

src/cmd/ksh93/bltins/trap.c: b_trap():
- Fork when trapping (or ignoring) a signal in a virtual subshell.
  (There's no need to fork when trapping a pseudosignal.)

src/cmd/ksh93/tests/signal.sh:
- Add tests. These are simplified versions of tests already there,
  which issued 'kill' as a background job. Currently, running a
  background job causes a virtual subshell to fork before forking
  the 'kill' background job (or *any* background job, see e3d7bf1d)
  -- an ugly partial workaround that I believe just became
  redundant and which I will remove in the next commit.
2022-06-14 01:33:24 +01:00
Martijn Dekker
9b893992a3 [v1.0] posix: don't zero-pad 2nds (re: 5c677a4c, 70fc1da7, b1a41311)
The POSIX mode now disables left-hand zero-padding of seconds in
'time'/'times' output. The standard specifies the output format quite
exactly and zero-padding is not in it.
2022-06-12 16:16:11 +01:00
Martijn Dekker
3030197b89 Add error message for ambiguous long-form option abbreviation
Before:

  $ set -o hist
  -ksh: set: hist: bad option(s)

After:

  $ set --hist
  -ksh: set: hist: ambiguous option

In ksh 93u+m, there are three options starting with 'hist', so the
abbreviation does not represent a single option. It is useful to
communicate this in the error message.

In addition, "bad option(s)" was changed to "unknown option", the
same message as that given by AST optget(3), for consistency.

src/cmd/ksh93/sh/string.c:
- Make sh_lookopt() return -1 for an ambiguous option. This is
  trivial as there is already an 'amb' variable flagging that up.

src/cmd/ksh93/sh/args.c:
- Use the negative sh_lookopt() return status to decide between
  "unknown option" and "ambiguous option".

src/cmd/ksh93/data/builtins.c: sh_set[]:
- Explain the --option and --nooption forms in addition to the -o
  option and +o option forms.
- Document the long options without their 'no' prefixes (e.g. glob
  instead of noglob) as this simplifies documentation and the
  relation with the short options makes more sense. Those names are
  also how they show up in 'set -o' output and it is better for
  those to match.
- Tweaks.
2022-06-10 01:11:46 +01:00
Martijn Dekker
4f9456d69f posix: re-allow preset aliases on interactive (re: ddaa145b)
The POSIX standard in fact contains a sentence that specifically
allows preset aliases: "Implementations also may provide predefined
valid aliases that are in effect when the shell is invoked."
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_03_01

I had missed that back then. It's still a terrible idea for scripts
(particularly the way 93u+ implemented them), but for interactive
POSIX shells it makes a lot more sense and does not violate POSIX.

src/cmd/ksh93/sh/main.c: sh_main():
- Preset aliases for interactive shell regardless of SH_POSIX.
2022-06-09 21:56:57 +01:00
Martijn Dekker
b14e79e9d0 posix: use real pipe(2) instead of socketpair(2)
The POSIX standard requires real UNIX pipes as in pipe(2). But on
systems supporting it (all modern ones), ksh uses socketpair(2)
instead to make it possible for certain commands to peek ahead
without consuming input from the pipe, which is not possible with
real pipes. See features/poll and sh/io.c.

But this can create undesired side effects: applications connected
to a pipe may test if they are connected to a pipe, which will fail
if they are connected to a socket. Also, on Linux:

  $ cat /etc/passwd | head -1 /dev/stdin
  head: cannot open '/dev/stdin' for reading: No such device or address

...which happens because, unlike most systems, Linux cannot open(2)
or openat(2) a socket (a limitation that is allowed by POSIX).

Unfortunately at least two things depend on the peekahead
capability of the _pipe_socketpair feature. One is the non-blocking
behaviour of the -n option of the 'read' built-in:

  -n      Causes at most n bytes to be  read  instead  of  a  full
          line, but will return when reading from a slow device as
          soon as any characters have been read.

The other thing that breaks is the <#pattern and <##pattern
redirection operators that basically grep standard input, which
inherently requires peekahead.

Standard UNIX pipes always block on read and it is not possible to
peek ahead, so these features inevitably break. Which means we
cannot simply use standard pipes without breaking compatibility.

But we can at least fix it in the POSIX mode so that cross-platform
scripts work more correctly.

src/cmd/ksh93/sh/io.c: sh_pipe():
- If _pipe_socketpair is detected at compile time, then use a real
  pipe via sh_rpipe() if the POSIX mode is active. (If
  _pipe_socketpair is not detected, a real pipe is always used.)

src/cmd/ksh93/data/builtins.c:
- sh.1 documents the slow-device behaviour of -n but 'read --man'
  did not. Add that, making it conditional on _pipe_socketpair.

Resolves: https://github.com/ksh93/ksh/issues/327
2022-06-09 16:16:16 +01:00
Martijn Dekker
0602177646 posix: block brace expansion of unquoted expansions (re: a14d17c0)
Historically, ksh (including ksh88 and mksh) allow brace expansion
not just on literal patterns but also on patterns resulting from
unquoted variable expansions or command substitutions:

  $ a='{a,b}' ksh -c 'echo X{a,b} Y$a'
  Xa Xb Ya Yb

Most people expect only the first (literal) pattern to be expanded,
as in bash and zsh:

  $ a='{a,b}' bash -c 'echo X{a,b} Y$a'
  Xa Xb Y{a,b}

The historic ksh behaviour is poorly documented and nearly unknown,
violates the principle of least astonishment, and makes unquoted
variable expansions even more unsafe. See discussion at:
https://www.austingroupbugs.net/view.php?id=1193
https://github.com/ksh93/ksh/issues/140

Unfortunately, we cannot change it in default ksh without breaking
backward compatibility. But we can at least fix it for the POSIX
mode (which disables brace expansion by default but allows turning
it back on), particularly as it looks like POSIX, if it decides to
specify brace expansion in a future version of the standard, will
disallow brace expansion on unquoted variable expansions.

src/cmd/ksh93/sh/macro.c: endfield():
- When deciding whether to do brace expansion + globbing or only
  globbing, also check that we do not have POSIX mode and an
  unquoted variable expansion (mp->pattern==1).
2022-06-08 22:21:53 +01:00
Martijn Dekker
9da0887e54 Fix spurious export attribute when printing compound variables
Reproducer script:

    typeset -Ttyp1 typ1=(
            function get {
                    .sh.value="'Sample'";
            }
    )
    typ1 var11
    typeset -p .sh.type
    typeset -p .sh.type

Buggy output:

    namespace sh.type
    {
            typeset -r typ1='Sample'
    }
    namespace sh.type
    {
            typeset -x -r typ1='Sample'
    }

An -x (export) attribute is magically pulled out of a hat.

Analysis: The walk_tree() function in nvdisc.c repurposes (!) the
NV_EXPORT attribute as an instruction to turn *off* indenting when
pretty-printing the values of compound variables. The
print_namval() function in typeset.c, implementing 'typeset -p',
turns on NV_EXPORT for compound variables to inhibit indentation.
But it then does not bother to turn it off, which causes this bug.

src/cmd/ksh93/bltins/typeset.c: print_namval():
- When printing compound variables, only turn on NV_EXPORT
  temporarily.

Resolves: https://github.com/ksh93/ksh/issues/456
2022-06-07 04:27:54 +01:00
Martijn Dekker
80f8cc497f Fix completion following $'foo\'bar'
On an interactive shell in emacs or vi, type a command with a $'…'
quoted string that contains a backslash-escaped single quote, like:

    $ true $'foo\'bar' ▁

Then begin to type the name of a file present in the current
working directory and press tab. Nothing happens as completion
fails to work.

The completion code does not recognise $'…' strings. Instead, it
parses them as '…' strings in which there are no backslash escapes,
so it considers the last ' to open a second quoted string which is
not terminated.

Plus, when replacing a $'…' string with a (backslash-escaped)
completed string, the initial '$' is not replaced:

    $ $'/etc/hosts<Tab>
    $ $/etc/hosts

src/cmd/ksh93/edit/completion.c:
- find_begin():
  - Learn how to recognise $'…' strings. A new local dollarquote
    flag variable is used to distinguish them from regular '…'
    strings. The difference is that backslash escapes (and only
    those) should be recognised as in "…".
  - Set a special type -1 for $'…' as the caller will need a way
    to distinguish those from '…'.
- ed_expand(): When replacing a quoted string, remove an extra
  initial character (being the $ in $') if the type set by
  find_begin() is -1.

Resolves: https://github.com/ksh93/ksh/issues/462
2022-06-06 03:13:13 +01:00
Martijn Dekker
7a5423dfb6 Fix more spurious comsub execution in tab completion (re: 7a2d3564)
Comsubs were either executed or caused a syntax error when attempting
to complete them within single quotes. Since single quotes do not
expand anything, no such completion should take place.

$ '`/de<TAB>-ksh: /dev/: cannot execute [Is a directory]

$ '$(/de<TAB>-ksh: syntax error: `end of file' unexpected

src/cmd/ksh93/edit/completion.c:
- find_begin():
  - Remove recursive handling for '`' comsubs from 7a2d3564; it is
    sufficient to set the return pointer to the current location cp
    (the character following '`') if we're not in single quotes.
  - For '$' and '`', if we are within single quotes, set type '\''
    and set the return pointer bp to the location of the '$' or
    '`'.
- ed_expand(): If find_begin() sets type '\'' and the current begin
  character is $ or `, refuse to attempt completion; return -1 to
  cause a terminal beep.

Related:
https://github.com/ksh93/ksh/issues/268
https://github.com/ksh93/ksh/issues/462#issuecomment-1038482307
2022-06-06 03:12:57 +01:00
Martijn Dekker
e3aa32a129 Add --functrace shell option (re: 2a835a2d)
A side effect of the bug fixed in 2a835a2d caused the DEBUG trap
action to appear to be inherited by subshells, but in a buggy way
that could crash the shell. After the fix, the trap is reset in
subshells along with all the others, as it should be. Nonetheless,
as that bug was present for years, some have come to rely on it.

This commit implements that functionality properly. When the new
--functrace option is turned on, DEBUG trap actions are now
inherited by subshells as well as ksh function scopes. In addition,
since it makes logical sense, the new option also causes the
-x/--xtrace option's state to be inherited by ksh function scopes.
Note that changes made within the scope do not propagate upwards;
this is different from bash.

(I've opted against adding a -T short-form equivalent as on bash,
because -T was formerly a different option on 93u+ (see 63c55ad7)
and on mksh it has yet anohter a different meaning. To minimise
confusion, I think it's best to have the long-form name only.)

src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/data/options.c:
- Add new "functrace" (SH_FUNCTRACE) long-form shell option.

src/cmd/ksh93/sh/subshell.c: sh_subshell():
- When functrace is on, copy the parent's DEBUG trap action into
  the virtual subshell scope after resetting the trap actions.

src/cmd/ksh93/sh/xec.c: sh_funscope():
- When functrace is on and xtrace is on in the parent scope, turn
  it on in the child scope.
- Same DEBUG trap action handling as in sh_subshell().

Resolves: https://github.com/ksh93/ksh/issues/162
2022-06-04 17:27:27 +01:00
Martijn Dekker
1184b2ade9 Honour attribs for assignments preceding sp. builtins, POSIX functs
After the previous commit, one inconsistency was left. Assignments
preceding special built-ins and POSIX functions (which persist past
the command :-/) caused pre-existing attributes of the respective
variables to be cleared.

$ (f() { typeset -p n; }; typeset -i n; n=3+4 f)
n=3+4

(expected output: 'typeset -i n=7')

This problem was introduced shortly before the release of ksh 93u+,
in version 2012-05-04, by adding these lines of code to the code
for processing preceding assignments in sh_exec():

src/cmd/ksh93/sh/xec.c:
1055:	if(np)
1056:		flgs |= NV_UNJUST;

So, for special and declaration commands and POSIX functions, the
NV_UNJUST flag is passed to nv_open(). In older ksh versions, this
flag cleared justify attributes only, but in early 2012 it was
repurposed to clear *all* attributes -- without changing the name
or the relevant comment in name.h, which are now both misleading.

The reason for setting this flag in xec.c was to deal with some
bugs in 'typeset'. Removing those two lines causes regressions:

  attributes.sh[316]: FAIL: typeset -L should not preserve old attributes
  attributes.sh[322]: FAIL: typeset -R should not preserve old attributes
  attributes.sh[483]: FAIL: typeset -F after typeset -L fails
  attributes.sh[488]: FAIL: integer attribute not cleared for subsequent typeset

Those are all typeset regressions, which suggests this fix was
relevant to typeset only. This is corroborated by the relevant
AT&T ksh93/RELEASE entry:

12-04-27  A bug in which old attributes were not cleared when
	  assigning a value using typeset has been fixed.

So, to fix this 2012 regression without reintroducing the typeset
regressions, we need to set the NV_UNJUST flag for invocations of
the typeset family of commands only. This is changed in xec.c.

While we're at it, we might as well rename that little-used flag to
something that reflects its current purpose: NV_UNATTR.
2022-06-03 23:28:28 +01:00