mirror of git://git.code.sf.net/p/cdesktopenv/code synced 2025-02-13 11:42:21 +00:00
Commit graph

362 commits

Author SHA1 Message Date
Martijn Dekker
6cc2f6a0af Build system: make SHOPT_* editable again; allow indenting Mamfiles
The build system is adapted to make SHOPT_* compile-time options
editable without nmake. We can now easily change ksh's compile-time
options by editing src/cmd/ksh93/SHOPT.sh. The bin/package script
is adapted to turn these into compile flags. This resolves the most
important drawback of not using nmake.

Also, mamake now has support for indented Mam (Make Abstract
Machine) code. Only one type of block (make...done) is supported in
Mamfiles, so they are easy to indent automatically. A script to
(re)do this is included.

Since nmake is not going to be restored (it has too many problems
that no one is interested in fixing), this at least makes mamake
significantly easier to work with.

The Makefiles are deleted. They may still be handy for reference to
understand the Mamfiles, but they haven't actually matched the
Mamfiles for a while -- and you can still look in the git history.

Deleting them requires some adaptations to bin/package and mamake.c
because, even though they do not use those files, they still looked
for them to decide whether to build code in a directory.

Finally, this commit incorporates some #pragmas for clang to
suppress annoying warnings about the coding style used in this
historic code base. (gcc does not complain so much.)

src/cmd/ksh93/SHOPT.sh:
- Added.

bin/package, src/cmd/INIT/package.sh:
- cd into our own directory in case we were run from another dir.
- $makefiles: only look for Mamfiles.
- Add ksh compile-options via KSH_SHOPTFLAGS. Include SHOPT.sh.
- make_recurse(): Do not write a missing Makefile.
- finalize environment: Look for Mamfiles instead of Makefiles.

src/cmd/INIT/mamake.c:
- Tell clang to suppress annoying warnings about coding style.
- Update version string and self-documentation.
- input(): Add support for indented Mam code by skipping initial
  whitespace on each input line.
- files[]: Instead of looking for various kinds of Makefiles to decide
  where to build, only look for Mamfiles.
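
As a rough illustration of the indentation support described for
input(), here is a minimal sketch of skipping leading blanks before a
Mam line is parsed. The name skip_indent is hypothetical; this is not
the actual mamake.c code.

```c
/* Hypothetical helper (not from mamake.c): return a pointer to the
 * first character of a line that is not a space or a tab, so that an
 * indented Mam line parses the same as an unindented one. */
const char *skip_indent(const char *line)
{
	while (*line == ' ' || *line == '\t')
		line++;
	return line;
}
```

With this, "\t\tmake foo" and "make foo" would be handed to the parser
as the same text.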

src/Makefile, src/cmd/INIT/Makefile, src/cmd/Makefile,
src/cmd/builtin/Makefile, src/cmd/ksh93/Makefile, src/lib/Makefile,
src/lib/libast/Makefile, src/lib/libcmd/Makefile,
src/lib/libdll/Makefile, src/lib/libsum/Makefile:
- Removed.

src/Mamfile, src/cmd/INIT/Mamfile, src/cmd/Mamfile,
src/cmd/builtin/Mamfile, src/cmd/ksh93/Mamfile, src/lib/Mamfile,
src/lib/libast/Mamfile, src/lib/libcmd/Mamfile,
src/lib/libdll/Mamfile, src/lib/libsum/Mamfile:
- Indent the code with tabs.
- In ksh93/Mamfile, add ${KSH_SHOPTFLAGS} to every $CC command.
- In ksh93/Mamfile, add "prev SHOPT.sh" for every *.o file
  so they are rebuilt whenever SHOPT.sh changes.

bin/Mamfile_indent:
- Added, in case someone wants to re-indent a Mamfile.

src/cmd/INIT/proto.c, src/cmd/INIT/ratz.c, src/cmd/INIT/release.c,
src/lib/libast/features/common, src/lib/libast/include/ast.h:
- Tell clang to suppress annoying warnings about coding style that
  it disapproves of (mainly concerning the use of parentheses).

src/cmd/INIT/cc.darwin, src/cmd/INIT/cc.freebsd,
src/cmd/INIT/cc.openbsd:
- Remove now-redundant clang warning suppression flags.

Resolves: https://github.com/ksh93/ksh/issues/60
2021-01-22 23:39:59 +00:00
Martijn Dekker
0a10e76ccc typeset: add error msgs for incompatible options; improve usage msg
This adds informative error messages if incompatible options are
given. It also documents the exclusive -m, -n and -T options on
separate usage lines, as was already done with -f. The usage
message for incompatible options now looks something like this:

| $ ksh -c 'typeset -L10 -F -f -i foo'
| ksh: typeset: -i/-F/-E/-X cannot be used with -L/-R/-Z
| ksh: typeset: -f cannot be used with other options
| Usage: typeset [-bflmnprstuxACHS] [-a[type]] [-i[base]] [-E[n]]
|                [-F[n]] [-L[n]] [-M[mapping]] [-R[n]] [-X[n]]
|                [-h string] [-T[tname]] [-Z[n]] [name[=value]...]
|    Or: typeset -f [name...]
|    Or: typeset -m [name=name...]
|    Or: typeset -n [name=name...]
|    Or: typeset -T [tname[=(type definition)]...]
|  Help: typeset [ --help | --man ] 2>&1

(see also the previous commit, e21a053e)

Unfortunately the first "Usage" line has some redundancies with the
"Or:" lines showing separate usages. It doesn't seem to be possible
to avoid this; it's a flaw in how libast generates everything
(usage, help, manual) from one huge getopt(3) string. I still think
the three added "Or:" lines are an improvement as it wasn't
previously shown that these options need to be used on their own.

src/cmd/ksh93/bltins/typeset.c: b_typeset():
- Instead of only showing a generic usage message, add an
  informative error message if incompatible options were given.
- Conflicting options detection was failing because NV_LJUST and
  NV_EXPNOTE have the same bitmask value. Use a new 'isadjust'
  flag for -L/-R/-Z to remember if one of these was set.
- Detect conflict between -L/-R/-Z and a float option, not just -i.
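
To show why a shared bitmask value defeats flag-based conflict
detection, here is a small sketch. The DEMO_* constants and the struct
are made up for illustration (the real NV_* values are in nval.h); only
the idea of a separate 'isadjust' flag comes from this commit.

```c
#include <stdbool.h>

/* Illustrative values only; not the real NV_* constants. */
#define DEMO_INTEGER 0x0002u  /* numeric attribute */
#define DEMO_LJUST   0x0020u  /* left-justify attribute (-L) */
#define DEMO_EXPNOTE 0x0020u  /* exponent notation (-E): same bit as DEMO_LJUST */

struct demo_opts {
	unsigned flags;   /* accumulated attribute bits */
	bool isadjust;    /* set only when -L, -R or -Z was given */
};

/* Naive check: cannot tell -E apart from -L, as both set the same bit. */
bool naive_conflict(const struct demo_opts *o)
{
	return (o->flags & DEMO_LJUST) && (o->flags & DEMO_INTEGER);
}

/* Check using the separate flag: true only if a numeric option was
 * really combined with -L/-R/-Z. */
bool demo_conflict(const struct demo_opts *o)
{
	return o->isadjust && (o->flags & DEMO_INTEGER);
}
```

In this sketch, a plain -E reports a false conflict under the naive
check but not under the check that consults isadjust.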

src/cmd/ksh93/include/name.h, src/cmd/ksh93/data/msg.c:
- Add the two new error messages for incompatible options.

src/cmd/ksh93/data/builtins.c: sh_opttypeset[]:
- Add a space after 'float' in "[+float?\btypeset -lE\b]" as
  this makes 'float' appear on its own line, improving formatting.
- Show -m, -n, -T on separate usage lines like -f, as none of these
  can be combined with other options.
- Remove "cannot be combined with other options" from -m and -n
  descriptions, as that should now be clear from the separate usage
  lines -- and even if not, the error message is now informative.

src/cmd/ksh93/sh.1, src/cmd/ksh93/COMPATIBILITY:
- Update.

src/cmd/ksh93/tests/types.sh:
- Remove obsolete test: 'typeset -RF' is no longer accepted.
  (It crashed in 93u+, so this is not an incompatibility...)

Resolves: https://github.com/ksh93/ksh/issues/48
2021-01-21 09:36:10 +00:00
Martijn Dekker
d00b4b39f6 Fix side effect to exit status of DEBUG trap in comsub
This fixes the following:

trap ':' DEBUG
r=$(exit 123)
echo $? # Expected 123, but actually 0.

Thanks to Koichi Nakashima for the report and reproducer.

src/cmd/ksh93/sh/fault.c: sh_trap():
- Restore the saved current exit status (exitval) for all traps.
  Do not except the DEBUG trap from doing that. I've no idea why
  this exception was made, but it's not correct.

src/cmd/ksh93/tests/basic.sh:
- Add tests.

Makes progress on: https://github.com/ksh93/ksh/issues/155
2021-01-20 17:48:09 +00:00
Martijn Dekker
7bab9508aa Fix crash on subshell exit if PWD is inaccessible (re: dd9bc229)
This commit also further mitigates the problems with restoring an
inaccessible or nonexistent PWD on exiting a virtual subshell.

Harald van Dijk writes:
> On a build of ksh with -fsanitize=undefined to help diagnose
> problems:
>
> $ mkdir deleted
> $ cd deleted
> $ rmdir ../deleted
> $ ksh -c '(cd /; (cd /)); :'
> /home/harald/ksh/src/cmd/ksh93/sh/subshell.c:561:22: runtime
> error: null pointer passed as argument 1, which is declared to
> never be null
> Segmentation fault (core dumped)
>
> Note that it segfaults the same with default compilation flags,
> but it does not print out the useful extra message. The code
> assumes that pwd is non-null and passes it to strcmp without
> checking, but it will be null if the current directory cannot be
> determined, for instance because it has been deleted.

src/cmd/ksh93/sh/subshell.c: sh_subshell():
- Avoid the null pointer dereference reported above.
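
A minimal sketch of the kind of guard involved, assuming a hypothetical
helper named same_dir (the actual change in sh_subshell() may look
different): a directory name that could not be determined is NULL and
must never reach strcmp(3).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: compare two directory names, treating an
 * undeterminable (NULL) name as different from any real name instead
 * of passing it to strcmp(). Returns 1 if equal, 0 otherwise. */
int same_dir(const char *saved_pwd, const char *current_pwd)
{
	if (saved_pwd == NULL || current_pwd == NULL)
		return 0;
	return strcmp(saved_pwd, current_pwd) == 0;
}
```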

src/cmd/ksh93/bltins/cd_pwd.c: b_cd():
- Fork a virtual subshell even on systems with fchdir(2) if the
  present working directory tests as inaccessible on invoking 'cd';
  it may no longer exist and fchdir would fail to get a handle.
  (For the test we have to opendir(3) the full path to the PWD and
  not ".", as the latter may succeed even if the PWD is gone.)

src/cmd/ksh93/data/builtins.c:
- Update 'cd' version string.

Fixes:   https://github.com/ksh93/ksh/issues/153
Related: https://github.com/ksh93/ksh/issues/141
2021-01-19 18:47:41 +00:00
Martijn Dekker
1de20d65a8 Fix crash on long PS1 prompt (Solaris patch 195-17824699)
Original report and info:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01677.html
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01679.html

Patch pulled in from:
https://raw.githubusercontent.com/oracle/solaris-userland/master/components/ksh93/patches/195-17824699.patch

src/cmd/ksh93/edit/edit.c: ed_setup():
- Prevent the ed_setup() function from writing past ep->e_prompt,
  which is set to the local char prompt[PRSIZE] variable in
  ed_emacsread().

src/cmd/ksh93/include/edit.h:
- Increase maximum prompt size, PRSIZE, to 256.
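
For illustration, a bounded copy into a fixed-size prompt buffer could
look like the sketch below. DEMO_PRSIZE mirrors the new PRSIZE value of
256; copy_prompt is a hypothetical name and this is not the code from
the Solaris patch.

```c
#include <stddef.h>

#define DEMO_PRSIZE 256  /* mirrors the PRSIZE value set by this commit */

/* Copy src into a buffer of DEMO_PRSIZE bytes, truncating if needed so
 * that nothing is written past the end of the buffer. Returns the
 * number of bytes copied, not counting the terminating null byte. */
size_t copy_prompt(char dst[DEMO_PRSIZE], const char *src)
{
	size_t n = 0;
	while (src[n] != '\0' && n < DEMO_PRSIZE - 1) {
		dst[n] = src[n];
		n++;
	}
	dst[n] = '\0';
	return n;
}
```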
2021-01-08 22:22:47 +00:00
Martijn Dekker
222515bf08 Implement hash tables for virtual subshells (re: 102868f8, 9d428f8f)
The forking fix implemented in 102868f8 and 9d428f8f, which stops
the main shell's hash table from being cleared if PATH is changed
in a subshell, can cause a significant performance penalty for
certain scripts that do something like

    ( PATH=... command foo )

in a subshell, especially if done repeatedly. This is because the
hash table is cleared (and hence a subshell forks) even for
temporary PATH assignments preceding commands.

It also just plain doesn't work. For instance:

    $ hash -r; (ls) >/dev/null; hash
    ls=/bin/ls

Simply running an external command in a subshell caches the path in
the hash table that is shared with a main shell. To remedy this, we
would have to fork the subshell before forking any external
command. And that would be an unacceptable performance regression.

Virtual subshells do not need to fork when changing PATH if they
get their own hash tables. This commit adds these. The code for
alias subshell trees (which was removed in ec888867 because they
were broken and unneeded) provided the beginning of a template for
their implementation.

src/cmd/ksh93/sh/subshell.c:
- struct subshell: Add strack pointer to subshell hash table.
- Add sh_subtracktree(): return pointer to subshell hash table.
- sh_subfuntree(): Refactor a bit for legibility.
- sh_subshell(): Add code for cleaning up subshell hash table.

src/cmd/ksh93/sh/name.c:
- nv_putval(): Remove code to fork a subshell upon resetting PATH.
- nv_rehash(): When in a subshell, invalidate a hash table entry
  for a subshell by creating the subshell scope if needed, then
  giving that entry the NV_NOALIAS attribute to invalidate it.

src/cmd/ksh93/sh/path.c: path_search():
- To set a tracked alias/hash table entry, use sh_subtracktree()
  and pass the HASH_NOSCOPE flag to nv_search() so that any new
  entries are added to the current subshell table (if any) and do
  not influence any parent scopes.

src/cmd/ksh93/bltins/typeset.c: b_alias():
- b_alias(): For hash table entries, use sh_subtracktree() instead
  of forking a subshell. Keep forking for normal aliases.
- setall(): To set a tracked alias/hash table entry, pass the
  HASH_NOSCOPE flag to nv_search() so that any new entries are
  added to the current subshell table (if any) and do not influence
  any parent scopes.

src/cmd/ksh93/sh/init.c: put_restricted():
- Update code for clearing the hash table (when changing $PATH) to
  use sh_subtracktree().

src/cmd/ksh93/bltins/cd_pwd.c:
- When invalidating path name bindings to relative paths, use the
  subshell hash tree if applicable by calling sh_subtracktree().
- rehash(): Call nv_rehash() instead of _nv_unset()ting the hash
  table entry; this is needed to work correctly in subshells.

src/cmd/ksh93/tests/leaks.sh:
- Add leak tests for various PATH-related operations in the main
  shell and in a virtual subshell.
- Several pre-existing memory leaks are exposed by the new tests
  (I've confirmed these in 93u+). The tests are disabled and marked
  TODO for now, as these bugs have not yet been fixed.

src/cmd/ksh93/tests/subshell.sh:
- Update.

Resolves: https://github.com/ksh93/ksh/issues/66
2021-01-07 22:18:25 +00:00
Martijn Dekker
a95d107ee5 Fix segfault while updating ${.sh.match}
The SHOPT_2DMATCH code block in sh_setmatch() modifies the 'ap'
pointer, which is initialised as nv_arrayptr(SH_MATCHNOD). This
caused a (rarely occurring) segfault in the following line near the
end of the function:
	ap->nelem -= x;
as this line assumed that 'ap' still had the initial value.

src/cmd/ksh93/sh/init.c: sh_setmatch():
- On init, save ap in ap_save and use ap_save instead of ap where
  it should be pointing to SH_MATCHNOD. This also allows removing
  two redundant nv_arrayptr(SH_MATCHNOD) calls, slightly increasing
  the efficiency of this function.
2021-01-07 17:34:47 +00:00
Martijn Dekker
d1483150ab 'cd': properly ignore $CDPATH if initial component is '.' or '..'
@stephane-chazelas writes:
> Per POSIX[*], cd should skip the $CDPATH processing if the first
> component of the directory given to cd is . or ...
>
> Yet, with ksh93u+m 2021-01-03 at least, while that's OK with ..,
> it's not with . with or without the posix option:
>
> $ CDPATH=/ ./ksh -o posix -c 'cd -P ./etc && pwd'
> /etc
> /etc
>
> It seems to be a regression introduced with ksh93u+ as I can't
> reproduce it with ksh93u or any version prior to that. I can also
> reproduce in u+, v- and the ksh2020 from the Ubuntu 20.04
> package.

src/cmd/ksh93/bltins/cd_pwd.c: b_cd():
- Skip $CDPATH processing not only if the path is absolute, but
  also if the initial path component is '.' or '..' (in the latter
  case the $CDPATH processing was done but appeared to be a no-op).
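
The rule can be sketched as a small predicate. The name use_cdpath is
hypothetical and this is not the actual b_cd() code; it only encodes
the condition described above.

```c
#include <stdbool.h>

/* Hypothetical predicate: should $CDPATH be searched for 'dir'?
 * False for absolute paths and for paths whose first component is
 * exactly "." or "..". */
bool use_cdpath(const char *dir)
{
	if (dir[0] == '/')
		return false;           /* absolute path */
	if (dir[0] == '.') {
		const char *p = dir + 1;
		if (*p == '.')
			p++;            /* allow a second dot for ".." */
		if (*p == '\0' || *p == '/')
			return false;   /* ".", "..", "./x" or "../x" */
	}
	return true;
}
```

Names that merely begin with a dot, such as ".hidden" or "...", still
go through $CDPATH in this sketch.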

src/cmd/ksh93/tests/builtins.sh:
- Add regression test.

[*] https://pubs.opengroup.org/onlinepubs/9699919799.2018edition/utilities/cd.html

Fixes: https://github.com/ksh93/ksh/issues/151
2021-01-05 05:04:24 +00:00
Harald van Dijk
41ef7f76cf Invocation: fix infinite loop on 'ksh +s'
When starting ksh +s, it gets stuck in an infinite loop continually
trying to parse its own binary as a shell script and rejecting it:

$ arch/linux.i386-64/bin/ksh +s
arch/linux.i386-64/bin/ksh: arch/linux.i386-64/bin/ksh: cannot execute [Exec format error]
arch/linux.i386-64/bin/ksh: arch/linux.i386-64/bin/ksh: cannot execute [Exec format error]
arch/linux.i386-64/bin/ksh: arch/linux.i386-64/bin/ksh: cannot execute [Exec format error]
arch/linux.i386-64/bin/ksh: arch/linux.i386-64/bin/ksh: cannot execute [Exec format error]
arch/linux.i386-64/bin/ksh: arch/linux.i386-64/bin/ksh: cannot execute [Exec format error]
[...]
$ echo 'echo "this is stdin"' | arch/linux.i386-64/bin/ksh +s
arch/linux.i386-64/bin/ksh: arch/linux.i386-64/bin/ksh: cannot execute [Exec format error]
(no loop, but still ksh trying to parse itself)

src/cmd/ksh93/sh/init.c: sh_init():
- When forcing on the '-s' option upon finding no command
  arguments, also update sh.offoptions, a.k.a. shp->offoptions.
  This avoids the inconsistent state causing this problem.

  In main.c, there is:

  if(sh_isoption(SH_SFLAG))
      fdin = 0;
  else
      (code to open $0 as a file)

  This was entering the else block because sh_isoption(SH_SFLAG)
  was returning 0, and $0 is set to the ksh binary as it is
  supposed to when no other script is provided. When I looked for
  why sh_isoption was returning 0, I found main.c's

  for(i=0; i<elementsof(shp->offoptions.v); i++)
      shp->options.v[i] &= ~shp->offoptions.v[i];

  Before this loop, shp->offoptions tracks which options were
  explicitly disabled by the user on the command line. The effect
  of this loop is to make "explicitly disabled" take precedence
  over "implicitly enabled". My patch removes the registration of
  the +s option.
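
  A self-contained sketch of that masking step, using a made-up
  demo_optset type and bit values rather than the shell's real option
  structures, shows how a recorded "off" bit cancels an option that was
  switched on implicitly:

```c
#define DEMO_WORDS 4

/* Made-up stand-in for the shell's option bit vectors. */
struct demo_optset {
	unsigned long v[DEMO_WORDS];
};

/* Clear every option that is recorded as explicitly disabled, so that
 * "explicitly disabled" wins over "implicitly enabled". */
void apply_offoptions(struct demo_optset *options,
                      const struct demo_optset *off)
{
	for (int i = 0; i < DEMO_WORDS; i++)
		options->v[i] &= ~off->v[i];
}
```

  If the bit for an option is left set in the "off" set, the option is
  cleared again even though it was just forced on; once that
  registration is removed, the option survives the loop.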

Fixes: https://github.com/ksh93/ksh/issues/150
Co-authored-by: Martijn Dekker <martijn@inlv.org>
2021-01-03 23:54:36 +00:00
hyenias
88a6baa1a7 Fix floating point numerics having precision of 0 with assignments (#149)
Using typeset to declare a floating point variable with a precision
of 0 failed, as the precision/size was overwritten with the string
length of the value, e.g. 'typeset -F0 x=5.67' would result in
'typeset -F 4 x=5.6700' as len('5.67') is 4.

src/cmd/ksh93/include/nval.h:

- Created a symbolic name of NV_FLTSIZEZERO to represent a float
  having a precision/size of 0. NV_FLTSIZEZERO needs to be a
  negative value.

src/cmd/ksh93/bltins/typeset.c:

- In b_typeset(), added code to set tdata.argnum to NV_FLTSIZEZERO
  for E, F, X options.

- In setall(), adjusted code to allow for tp->argnum to be negative.

src/cmd/ksh93/sh/name.c: nv_newattr():

- Adjusted option value only change code to handle NV_FLTSIZEZERO as
  well as changed to directly setting np->nvsize instead of using
  nv_setsize(np,size) as nv_setsize might contain conflicting and/or
  redundant code.

- Added missing conditional check of '!(newatts&NV_INTEGER)' to
  constrain the size==0 code block to justified strings as
  NV_LJUST, NV_RJUST, or NV_ZFILL are only valid for strings if
  NV_INTEGER is not set. This code block was mistakenly setting
  the precision/size value to the length of the value of an
  assignment for floats whereas it should only be performing
  auto assignment length for justified strings.
2020-11-26 13:50:30 +00:00
hyenias
95fe07d869 Improved 'typeset -xu'/'typeset -xl' fix (re: fdb9781e) (#147)
'typeset -xu' and 'typeset -xl' would export the variable but fail
to change case in the value as the check between old and new
attributes did not provide the necessary insight for lower or
upper case transcoding due to the lower or upper case attribute
being set within typeset.c prior to calling name.c nv_newattr
function.

The previous rhbz#1188377 patch added a conditional check for size==-1
which in effect caused the nv_newattr export code block return
optimization to never be executed as one cannot set any attributes
using the readonly builtin. By altering the size==-1 check to !trans
the export only optimization can run.

Also, the rhbz#1188377 patch altered the nv_newattr function by setting
the new size to oldsize if run by the readonly builtin. The result
of setting size==oldsize allowed the succeeding if statement to
run more frequently and if size was a non-zero value resulted in
nv_setsize resetting the value to what it already was. Investigation
yielded that size was always 0 coming from the readonly builtin.

src/cmd/ksh93/bltins/typeset.c:
- Remove the setting of tdata.argnum to -1 as it is not needed due to
  existing name.c nv_newattr() logic.

src/cmd/ksh93/sh/name.c: nv_newattr():
- Corrected the export only check optimization by using !trans instead
  of using size==-1.
- Removed previous condition check to set size=oldsize if coming from
  the readonly builtin. nv_newattr already had existing logic to
  prevent changing the size via nv_setsize as size is always 0 when
  coming from readonly builtin.
2020-11-26 13:30:24 +00:00
Martijn Dekker
dd9bc22928 Mitigate PWD race condition in non-forking subshells
Virtual/non-forking subshells that change the present working
directory (PWD) with 'cd' suffer from a serious race condition. The
PWD is changed within the same process. This means it may not be
possible to change back to the original PWD when exiting the
subshell, as some other process may destroy the PWD or modify its
permissions in the meantime. ksh did not handle this error
condition at all, so, after exiting a subshell that invoked 'cd',
it could silently end up running the script's following command(s)
in the wrong directory. Which might be 'rm -rf *'. So, ouch.

The proper and obvious fix is never to allow a virtual subshell to
change the PWD, as it can never be guaranteed you can return to a
previous directory. If the PWD is changed in a child process, there
is no need to restore it in the parent process, and this whole
problem is avoided. So subshells really should always fork on
encountering a 'cd' command.

But forking is slow. It is not uncommon for scripts to 'cd' in a
subshell that is run repeatedly in a loop.

There is also the issue of custom builtins that can be added to ksh
via shared libraries. In the standard shell language, 'cd' is the
only command that changes the PWD, so we could just make that
command fork the subshell it is run from. But there's no telling
what a custom builtin might do.

So this commit implements a compromise that will not affect
performance unless there is the pathological condition of a PWD
that has been rendered inaccessible in some way:

1. When entering a virtual subshell, if the parent shell's PWD
proves inaccessible upon saving it, the subshell will now fork into
a separate process, avoiding the unrestorable PWD problem.

2. If some attack renders the parent shell's PWD unrestorable
*after* ksh enters a virtual subshell, ksh will now error out when
exiting it. There is nothing else left to do then. Continuing would
mean running arbitrary commands in the wrong PWD.
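
The two cases can be sketched with a pair of hypothetical helpers,
save_pwd and restore_pwd. This is a simplified illustration, not the
sh_subshell() code; in particular it uses plain O_RDONLY, whereas the
commit prefers O_SEARCH or O_PATH where available.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Case 1: try to get a handle on the current directory when entering
 * a virtual subshell. A result of -1 means the PWD is inaccessible;
 * a caller modelled on this commit would then fork instead of
 * continuing in the same process. */
int save_pwd(void)
{
	return open(".", O_RDONLY);
}

/* Case 2: return to the saved directory when leaving the subshell and
 * release the handle. Returns 0 on success, -1 on failure. A failure
 * is meant to be fatal: there is deliberately no chdir() fallback. */
int restore_pwd(int fd)
{
	int result = fchdir(fd);
	close(fd);
	return result;
}
```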

src/cmd/ksh93/sh/subshell.c:

- Put all the code/variables only needed for fchdir() behind '#if
  _lib_fchdir'. This makes it clearer what's what.
  (I don't know if there is still any system out there without
  fchdir(3); I haven't found any. The chdir(3) fallback version may
  be removed later as there is no way to make it remotely secure.)

- Fix the attempt to use the O_PATH mode for open(2) as a fallback
  for nonexistent O_SEARCH on Linux. Define _GNU_SOURCE on Linux,
  or <fcntl.h> (which is included indirectly) won't define O_PATH.

- Fix use of O_SEARCH. The code was simply wrong, repeating an
  open(".",O_RDONLY) instead. Since a nonexistent O_SEARCH is now
  redefined as either O_PATH or O_RDONLY, we can simply
  open(".",O_SEARCH) and be done with it.

- Fix fatal error handling. Introduce fatal error condition for
  failure to fchdir(3) back to the parent's PWD; rename 'duped' to
  'fatalerror' and use it for error numbers; save and restore errno
  on fatal error so the message will report the cause. (We must
  call errormsg() near the end of sh_subshell() to avoid crashes.)

- If open(".",O_SEARCH) was not able to get a file descriptor to our
  PWD on entry, then call sh_subfork() immediately before running
  the subshell commands. (Forking earlier causes a crash.)

- When restoring the PWD, if fchdir(3) fails, do *not* fall back to
  chdir(3). We already know the PWD is inaccessible, so if chdir(3)
  "succeeds" then, it's very likely to be a substitute injected by
  an attacker.

src/cmd/ksh93/bltins/cd_pwd.c:

- If we don't have fchdir(3), then sh_subshell() must fall back to
  chdir(2) to restore the PWD. That is highly vulnerable, as a
  well-timed rename would allow an attacker to usurp the PWD. We
  can't do anything about that if some custom builtin changes the
  PWD, but we can at least make 'cd' always fork a subshell, which
  slows down ksh but removes the need for the parent shell ever to
  restore the PWD. (There is certainly no popular system where this
  is relevant and there might not be any such current system.)

This commit adds no regression test because a portable regression
test is not really doable. Different kernels, external /bin/pwd
utilities, etc. all have quite different behaviour under the
pathological condition of an inaccessible PWD, so both the
before-fix and the after-fix behaviour differ. See link below.

Resolves: https://github.com/ksh93/ksh/issues/141
Thanks to Stéphane Chazelas for the bug report.
2020-10-07 00:52:11 +02:00
Martijn Dekker
d89ef0fafa Fix $LINENO corruption when autoloading functions
Autoloading a function caused the calling script's $LINENO to be
off by the number of lines in the function definition file. In
addition, while running autoloaded functions, errors/warnings were
reported with wrong line numbers.

src/cmd/ksh93/sh/path.c:
- Save $LINENO (shp->inlineno) before autoloading a function, reset
  it to 1 so that the correct line number offset is remembered for
  the function definition, and restore it after.
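
The save/reset/restore pattern can be sketched as follows. The counter
demo_lineno and both functions are hypothetical stand-ins for
shp->inlineno and the autoload code in path.c.

```c
/* Hypothetical stand-in for the shell's input line counter. */
int demo_lineno = 1;

/* Pretend to read a function definition file of 'nlines' lines; each
 * line read advances the counter. */
void demo_read_file(int nlines)
{
	demo_lineno += nlines;
}

/* Autoload a function without disturbing the caller's line number:
 * save the counter, restart it at 1 for the definition file, then
 * restore the saved value afterwards. */
void demo_autoload(int nlines)
{
	int saved = demo_lineno;
	demo_lineno = 1;
	demo_read_file(nlines);
	demo_lineno = saved;
}
```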

src/cmd/ksh93/tests/variables.sh:
- Add regression test for $LINENO, directly and in error messages,
  within and outside a non-autoloaded and an autoloaded function.

Fixes: https://github.com/ksh93/ksh/issues/116
2020-10-01 06:13:00 +02:00
Martijn Dekker
c049eec854 Fix pipefail with (errexit or ERR trap) regression
ksh 93u+ introduced a regression in the combination of the
'set -o pipefail' and 'set -e'/'set -o errexit' options:

$ ksh93 -o errexit -o pipefail -c \
	'(exit 3) | true; echo "still here despite $? status"'
still here despite 3 status

The bug is in how the huge sh_exec() function in xec.c handles
the 'echeck' flag. Near the end of sh_exec(), this flag triggers a
sh_chktrap() call to check whether to trigger any traps, including
the ERR trap -- and that same function also handles the errexit
option, which is basically the same as 'trap "exit" ERR'.

We can learn more easily how sh_exec() works by inserting debug
warnings in all its 'switch(type&COMMSK)' cases, like:

    case TCOM:
	errormsg(SH_DICT,ERROR_warn(0),"[DEBUG] TCOM");

... and same for all the others. With that done, the output
of a very simple dummy pipeline looks as follows:

$ arch/*/bin/ksh -c 'true | true | true'
arch/darwin.i386-64/bin/ksh: warning: [DEBUG] TFIL
arch/darwin.i386-64/bin/ksh: warning: [DEBUG] TFORK
arch/darwin.i386-64/bin/ksh: warning: [DEBUG] TFORK
arch/darwin.i386-64/bin/ksh: warning: [DEBUG] TSETIO
arch/darwin.i386-64/bin/ksh: warning: [DEBUG] TCOM
arch/darwin.i386-64/bin/ksh: warning: [DEBUG] TCOM
arch/darwin.i386-64/bin/ksh: warning: [DEBUG] TCOM

So, it looks like sh_exec() handles this pipeline as follows:

	TFIL
	   |_____TFORK
	   |         |_____TCOM
	   |_____TFORK
	   |         |_____TCOM
	   |_____TSETIO
	             |_____TCOM

Each time a pipeline like command1 | command2 | ... is executed,
sh_exec() is invoked with type TFIL; this then recursively invokes
sh_exec() to handle the individual elements. The last element of
the pipe triggers a sh_exec() run with type TSETIO; since it is run
in the current shell environment, it is effectively treated as a
command with an input redirection. All the previous elements are of
type TFORK instead, because they are executed asynchronously in
separate, forked subshell processes. Finally, the TFORK or TSETIO
code then recursively calls sh_exec() again with type TCOM to
actually execute the commands.

When reading the code, we find that the 'echeck' flag is set as
part of the TSETIO code. This makes sense of why only an error in
the last element of the pipe triggers the errexit/ERR trap action.
So that's the bug: the flag is set in the wrong place.

This can be fixed by setting that flag in the TFIL handling code
instead, as this is what calls everything else and collects all the
exit statuses. So the sh_chktrap() call is now executed after
handling the entire pipeline, at the TFIL recursion level.

This also allows getting rid of the special-casing in the buggy
TSETIO version. The SH_ERREXIT state is restored at the end of each
sh_exec() call, so since we're now doing this at a lower recursion
level, it will already have been restored.
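
As a simplified model of why the check belongs at the pipeline level:
with pipefail, the pipeline's status is the last non-zero status of any
element, so a decision based only on the last element misses earlier
failures. The functions below are hypothetical and do not mirror the
structure of sh_exec().

```c
#include <stdbool.h>
#include <stddef.h>

/* Exit status of a whole pipeline, given the statuses of its n
 * elements from left to right. Without pipefail this is the status of
 * the last element; with pipefail it is the rightmost non-zero status,
 * or 0 if every element succeeded. */
int pipeline_status(const int *status, size_t n, bool pipefail)
{
	int result = n ? status[n - 1] : 0;
	if (pipefail) {
		for (size_t i = n; i-- > 0; ) {
			if (status[i] != 0) {
				result = status[i];
				break;
			}
		}
	}
	return result;
}

/* The errexit/ERR decision is made once, on the combined status. */
bool errexit_triggers(const int *status, size_t n, bool pipefail)
{
	return pipeline_status(status, n, pipefail) != 0;
}
```

For the reproducer above, '(exit 3) | true' corresponds to the statuses
{3, 0}: the combined status is 3 with pipefail, so errexit should act.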

src/cmd/ksh93/sh/xec.c: sh_exec():
- Fix the bug as per the above.

src/cmd/ksh93/tests/options.sh:
- Add tests for errexit and ERR trap combined with pipefail.

src/cmd/ksh93/tests/basic.sh:
- Tweak a couple of tests that reported a trap wasn't triggered
  even if it was actually triggered more than once.

Fixes: https://github.com/ksh93/ksh/issues/121
Thanks to Stéphane Chazelas for the bug report.
2020-09-30 17:49:46 +02:00
Martijn Dekker
fdb9781ebb Fix 'typeset -xu', 'typeset -xl' (rhbz#1188377)
'typeset -xu' and 'typeset -xl' would export the variable but fail
to change case in the value under certain conditions.

Original patch:
https://src.fedoraproject.org/rpms/ksh/blob/642af4d6/f/ksh-20120801-xufix.patch

This applies the patch essentially without change and adds a
regression test based on the reproducer provided in the RH bug.

Unfortunately there is no description of how the patch works and
it's a little obscure to me. As far as I can figure out, the cause
of the problem was that nv_newattr() erroneously processed a
nonexistent size option-argument such as what can be given to
options like typeset -F, e.g. typeset -F3 for 3 digits after the
dot. A nonexistent size argument is represented by the value of -1.
2020-09-30 03:06:54 +02:00
Martijn Dekker
30aee65113 Fix signal/trap behaviour in ksh functions (rhbz#1454804)
Prior discussion:
https://bugzilla.redhat.com/1454804

On 2017-05-23 13:33:25 UTC, Paulo Andrade wrote:
> In previous ksh versions, when exiting the scope of a ksh
> (not posix) function, it would restore the trap table of
> the "calling context" and if the reason the function exited
> was a signal, it would call sh_fault() passing as argument
> the signal value.
>   Newer ksh checks it, but calls kill(getpid(), signal_number)
> after restoring the trap table, but only calls for SIGINT and
> SIGQUIT.
[...]
>   The old way appears to have been more appropriate, but there
> must be a reason to only pass SIGINT and SIGQUIT as it is an
> explicit patch.

The last paragraph is where I differ. This would not be the first
example of outright breakage that appeared to be added deliberately
and that 93u+m has fixed or removed, see e.g. 8477d2ce ('printf %H'
had code that deleted all multibyte characters), cefe087d, or
781f0a39. Sometimes it seems the developers added a little
experiment and then forgot all about it, so it became a misfeature.

In this instance, the correct pre-2012 ksh behaviour is still
explicitly documented in (k)sh.1: "A trap condition that is not
caught or ignored by the function causes the function to terminate
and the condition to be passed on to the caller". Meaning, if there
is no function-local trap, the signal defaults to the parent scope.
There is no language that limits this to SIGINT and SIGQUIT only.
It also makes no sense at all to do so -- signals such as SIGPIPE,
SIGTERM, or SIGSEGV need to be caught by default and to do
otherwise results in misbehaviour by default.

src/cmd/ksh93/sh/xec.c: sh_funscope():
- When resending a signal after restoring the global traps state,
  remove the spurious check that limits this to SIGINT and SIGQUIT.
- Replace it with a check for nsig!=0, as that means there were
  parent trap states to restore. Otherwise 'kill' may be called
  with an invalid signal argument, causing a crash on macOS.

src/cmd/ksh93/tests/signal.sh:
- Update a test to check that a function-local SIGTERM trap is
  triggered correctly when signalled from another process.
- Complete the tests for 3aee10d7; this bug needed fixing before
  we could test that previous fix in a ksh function scope.
- Add a test for triggering global traps from ksh functions,
  testing multiple POSIX-standard signals.
2020-09-29 03:16:39 +02:00
Martijn Dekker
ddcef2137e NEWS: fix typo (re: bd283959)
2020-09-28 04:47:53 +02:00
Martijn Dekker
bd283959be Fix lexing of 'case' in do...done in a $(comsub) (rhbz#1241013)
The following caused a spurious syntax error:

$ x=$(for i in 1; do case $i in word) true;; esac; done)
-ksh: syntax error: `;;' unexpected

Prior discussion:
https://bugzilla.redhat.com/1241013

Original patch, backported from 93v- beta, applied without change:
https://src.fedoraproject.org/rpms/ksh/blob/642af4d6/f/ksh-20120801-parserfix.patch
2020-09-27 21:26:09 +02:00
Martijn Dekker
960a1a99cd Avoid importing env vars with invalid names (rhbz#1147645)
This imports a new version of the code to import environment
variable values that was sent to Red Hat from upstream in 2014.
It avoids importing environment variables whose names are not valid
in the shell language, as it would be impossible to change or unset
them. However, they stay in the environment to be passed to child
processes.

Prior discussion: https://bugzilla.redhat.com/1147645
Original patch: https://src.fedoraproject.org/rpms/ksh/blob/642af4d6/f/ksh-20120801-oldenvinit.patch

src/cmd/ksh93/sh/init.c:

- env_init(): Import new, simplified code to import environment
  variable name/value pairs. Instead of doing the heavy lifting
  itself, this version uses nv_open(), passing the NV_IDENT flag to
  reject and skip invalid names.

- Get rid of gotos and a static var by splitting off the code to
  import attributes into a new env_import_attributes() function.
  This is a better way to avoid importing attributes when
  initialising the shell in POSIX mode (re: 00d43960).

- Remove an nv_mapchar() call that was based on some unclear
  flaggery which was also removed by upstream as sent to Red Hat.
  I don't know what that did, if anything; looks like it might have
  had something to do with typeset -u/-l, but those particular
  attributes have never been successfully inherited through the
  environment.
    (Maybe that's another bug, or maybe I just don't care as
    inheriting attributes is a misfeature anyway; we have to put up
    with it because legacy scripts might use it. Maybe someone can
    prove it's an unacceptable security risk to import attributes
    like readonly from an environment variable that is inherently
    vulnerable to manipulation. That would be nice, as a CVE ID
    would give us a solid reason to get rid of this nonsense.)

- Remove an 'else cp += 2;' that was very clearly a no-op; 'cp' is
  immediately overwritten on the next loop iteration and not used
  past the loop.

src/cmd/ksh93/tests/variables.sh:

- Test.
2020-09-26 20:57:39 +02:00
Johnothan King
8a34fc40e6 whence -f: ignore functions (#137)
According to 'whence --man', 'whence -f' should ignore functions:
  -f              Do not check for functions.

Right now this is only accomplished partially. As of commit
a329c22d 'whence -f' avoids any output when encountering a
function (in ksh93u+ 'whence -f' has incorrect output). The
return value is still wrong though:

$ foo() { true; }
$ whence -f foo; echo $?
0

This commit fixes the return value and makes 'type -f' error out
when given a function (like in Bash).

src/cmd/ksh93/bltins/whence.c:
- If -f was passed, set 'cp' to NULL since functions should be
  ignored (as documented).
- Simplify return value by avoiding bitwise logic.

src/cmd/ksh93/tests/builtins.sh:
- Add regression tests for 'whence -f' and 'type -f'.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2020-09-26 19:26:18 +01:00
Martijn Dekker
3050bf28bc whence -v/-a: report path to autoloadable functions
Since at least 1999, whence -v on pdksh (and its successor mksh)
reports the path where an autoloadable function may be found:

  $ mkdir ~/fun; FPATH=~/fun
  $ echo 'myfn() { echo hi; }' >~/fun/myfn
  $ whence -v myfn
  myfn is a undefined (autoload from /home/user/fun/myfn) function

Whereas ksh93 only reports, rather uselessly:

  myfn is an undefined function

As of this commit, whence -v/-a on ksh 93u+m does the same as
pdksh, but with correct grammar:

  myfn is an undefined function (autoload from /home/user/fun/myfn)

This may be a small violation of my own "no new features" policy
for 93u+m, but I couldn't resist. This omission has been annoying
me, and it's just embarrassing to lack a pdksh feature :)

src/cmd/ksh93/include/path.h,
src/cmd/ksh93/data/msg.c:
- Add e_autoloadfrom[] = " (autoload from %s)" message.

src/cmd/ksh93/bltins/whence.c: whence():
- Report the path (if any) when reporting an undefined function.
  This needs to be done in two places:
  1. When a function has been explicitly marked undefined with
     'autoload', we need to do a quick path_search() loop to find
     the path. (These undefined functions take precedence over
     regular commands, so are reported first.)
  2. When a function is not explicitly autoloaded but merely
     available in $FPATH, that path search was already done, so all
     we need to do is report it. (These are reported last.)
  Note that the output remains as on 93u+ if no function definition
  file is found on $FPATH. This is also like pdksh/mksh.

src/cmd/ksh93/data/builtins.c:
- Bump 'whence' version date. The inline docs never detailed very
  exactly what 'whence -v' reports, so no need for further edits.

src/cmd/ksh93/tests/path.sh:
- Regress-test the new whence behaviour plus actual autoloading,
  including the command override behaviour of autoloaded functions.
2020-09-25 17:45:40 +02:00
Martijn Dekker
cefe087d23 Fix argv rewrite on invoking hashbangless script (rhbz#1047506)
The fixargs() function is invoked when ksh needs to run a script
without a #!/hashbang/path. Instead of letting the kernel invoke a
shell, ksh exfile()s the script itself from sh_main(). In the
forked child, it calls fixargs() to set the argument list in the
environment to the args of the new script, so that 'ps' and
/proc/PID/cmdline show the expected output.

But fixargs() is broken because, on systems other than HP-UX (on
which ksh uses pstat(2)), ksh simply inserts a terminating zero.
The arguments list is not a zero-terminated C string. Unix systems
expect the entire arguments buffer to be zeroed out, otherwise 'ps'
and /proc/*/cmdline will have fragments of previous command lines
in the output.

The Red Hat patch for this bug is:
https://src.fedoraproject.org/rpms/ksh/blob/642af4d6/f/ksh-20120801-argvfix.patch

However, that fix is incomplete because 'command_len' was also
hardcoded to be limited to 64 characters (!), which still gave
invalid 'ps' output if the erased command line was longer.

src/cmd/ksh93/sh/main.c: fixargs():

- Remove CMD_LENGTH macro which was defined as 64.

- Remove code that limited the erasure of the arguments buffer to
  CMD_LENGTH characters. That code also had quite a dodgy strdup()
  call -- it copies arguments to the heap, but they are never freed
  (or even used), so it's a memory leak. Also, none of this is
  ever done if the length is calculated using pstat(2) on HP-UX,
  which is a clear indication that it's unnecessary.
  (I think this code block must have been some experiment they
  forgot to remove. One reason why I think so is that a 64 byte
  arguments limit never made sense, even in the 1980s when they
  wrote ksh on 80-column CRT displays. Another indication of this
  is that fixing it didn't require adding anything; the code to do
  the right thing was already there, it was just being overridden.)

- Zero out the full arguments length as in the Red Hat patch.

src/cmd/ksh93/tests/basic.sh:

- Add test. It's sort of involved because 'ps' is one of the least
  portable commands in practice, in spite of standardisation.
2020-09-25 15:02:51 +02:00
Martijn Dekker
a14d17c0f4 Allow turning off brace expansion in comsubs (rhbz#1078698)
There was no check for the -B/braceexpand option before calling
path_expand() to process brace expansion, making it impossible to
turn off brace expansion within command substitutions. Normally the
lexer flags brace expansion so that this code is not reached, but
shell code within command substitutions is handled differently.

Red Hat patches this by adding this check to the function itself:
https://src.fedoraproject.org/rpms/ksh/blob/642af4d6/f/ksh-20140301-fikspand.patch
But I think it's more logical to patch it at the point of decision.

src/cmd/ksh93/sh/macro.c: endfield():
- Decide to call either path_generate() or path_expand() based on
  the state of the SH_BRACEEXPAND shell option.
- Fix '#if SHOPT_BRACEPAT' preprocessor check that previously
  hardcoded this decision at compile time.

src/cmd/ksh93/tests/options.sh:
- Add tests.
2020-09-24 08:21:37 +02:00
Martijn Dekker
3654ee73c0 Fix typeset -l/-u crash on special vars (rhbz#1083713)
When using typeset -l or -u on a variable that cannot be changed
when the shell is in restricted mode, ksh crashed.

This fix is inspired by this Red Hat fix, which is incomplete:
https://src.fedoraproject.org/rpms/ksh/blob/642af4d6/f/ksh-20120801-tpstl.patch

The crash was caused by the nv_shell() function. It walks through a
discipline function tree to get the pointer to the interpreter
associated with it. Evidently, the problem is that some pointer in
that walk is not set correctly for all special variables.

Thing is, ksh only has one shell language interpreter, and only one
global data structure (called 'sh') to keep its main state[*]. Yet,
the code is full of 'shp' pointers to that structure. Most (not
all) functions pass that pointer around to each other, accessing
that struct indirectly, ostensibly to account for the non-existent
possibility that there might be more than one interpreter state.
The "why" of that is an interesting cause for speculation that I
may get to sometime. For now, it is enough to know that, in the
code as it is, it matters not one iota what pointer to the shell
interpreter state is used; they all point to the same thing (unless
it's broken, as in this bug).

So, rather than fixing nv_shell() and/or associated pointer
assignments, this commit simply removes it, and replaces it with
calls to sh_getinterp(), which always returns a pointer to sh (see
init.c, where that function is defined as literally 'return &sh').

[*] Defined in shell.h, with the _SH_PRIVATE part in defs.h

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/name.c:
- Remove nv_shell().

src/cmd/ksh93/sh/init.c:
- In all the discipline functions for special variables, initialise
  shp using sh_getinterp() instead of nv_shell().

src/cmd/ksh93/tests/variables.sh:
- Add regression test for typeset -l/-u on all special variables.
2020-09-24 03:03:29 +02:00
Martijn Dekker
ce68e1be37 Fix crash in backtick comsubs with job control on (rhbz#825520)
This imports another fix from Red Hat/Fedora. Original patch:
https://src.fedoraproject.org/rpms/ksh/blob/642af4d6/f/ksh-20120801-crash.patch

src/cmd/ksh93/include/jobs.h,
src/cmd/ksh93/sh/jobs.c,
src/cmd/ksh93/sh/subshell.c,
src/cmd/ksh93/sh/xec.c:
- Import the Red Hat fix with these differences:
  - Rename the 'hack1_waitall' variable to 'bktick_waitall' and add
    a comment describing what it's for.
  - Remove unused 'pipefail' variable.

src/cmd/ksh93/tests/basic.sh:
- Regression test from reproducer given in the Red Hat bug report.
- Add special handling to SIGKILL it, as it might freeze hard.
2020-09-23 01:56:09 +02:00
Martijn Dekker
fe6d0903dc Fix v=$(<file) for closed FD 0,1,2 (rhbz#1066589)
var=$(< file) now reads the file even if the standard input,
standard output and/or standard error file descriptors are closed.

Original patch:
https://src.fedoraproject.org/rpms/ksh/blob/642af4d6/f/ksh-20120801-filecomsubst.patch

src/cmd/ksh93/sh/io.c: sh_redirect():
- When processing the '<' redirector as part of $(< ...), i.e. if
  flag==3, make sure the FD of the file to read is > 2 by calling
  sh_iomovefd(). Unlike the Red Hat patch, this checks for flag==3
  to avoid unnecessary sh_iomovefd() calls for normal redirections,
  as there was no bug with those.

src/cmd/ksh93/tests/io.sh:
- Add test.
2020-09-22 03:02:06 +02:00
Martijn Dekker
5683155cb5 update NEWS, SH_RELEASE (re: 970069a6) 2020-09-22 01:45:01 +02:00
Martijn Dekker
a329c22dba Multiple 'whence' and path search fixes
Hopefully this doesn't introduce new bugs, but it does fix at
least the following:

1. When whence -v/-a found an "undefined" (i.e. autoloadable)
   function in $FPATH, it actually loaded the function as a side
   effect of reporting on its existence (!). Now it only reports.

2. 'whence' will now canonicalise paths properly. Examples:
	$ whence ///usr/lib/../bin//./env
	/usr/bin/env
	$ (cd /; whence -v dev/../usr/bin//./env)
	dev/../usr/bin//./env is /usr/bin/env

3. 'whence' no longer prefixes a spurious double slash when doing
   something like 'cd / && whence bin/echo'. On Cygwin, an initial
   double slash denotes a network server, so this was not just a
   cosmetic problem.

4. 'whence -a' now reports a "tracked alias" (a.k.a. hash table
   entry, i.e. cached $PATH search) even if an actual alias by the
   same name exists. This needed fixing because in fact the hash
   table entry continues to be used when bypassing the alias.
   Aliases and "tracked aliases" are not remotely the same thing;
   confusing nomenclature is not a reason to report wrong results.

5. When using 'hash' or 'alias -t' on a command that is also a
   builtin to force caching a $PATH search for the external
   command, 'whence -a' double-reported the path:
	$ hash printf; whence -a printf
	printf is a shell builtin
	printf is /usr/bin/printf
	printf is a tracked alias for /usr/bin/printf
   This is now fixed so that the second output line is gone.
   Plus, if there were multiple versions of the command on $PATH,
   the tracked alias was reported at the end, which is the wrong
   order. This is also fixed.
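
The canonicalisation in item 2 can be sketched in portable shell.
This is a purely lexical illustration under stated assumptions: it
takes an absolute path, does not resolve symlinks, and canon_path is
a hypothetical name. It is not how path_fullname() is implemented.

```sh
# Hypothetical helper: lexically canonicalise an absolute path.
# Collapses repeated slashes, drops '.' and resolves '..' components.
canon_path() {
    _rest=$1 _out=
    while [ -n "$_rest" ]; do
        _comp=${_rest%%/*}              # first component (may be empty)
        case $_rest in
        */*) _rest=${_rest#*/} ;;       # drop it and the slash after it
        *)   _rest= ;;                  # that was the last component
        esac
        case $_comp in
        '' | .) ;;                      # empty or '.': nothing to add
        ..) _out=${_out%/*} ;;          # '..': remove the last component
        *)  _out=$_out/$_comp ;;
        esac
    done
    printf '%s\n' "${_out:-/}"
}

canon_path ///usr/lib/../bin//./env     # prints /usr/bin/env
```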

src/cmd/ksh93/bltins/whence.c: whence():
- Refactor the do...while loop that handles whence -v/-a for path
  searches in such a way that the code actually makes sense and
  stops looking like higher esotericism. Just doing this fixed #2,
  #4 and #5 above (the latter two before I even noticed them). For
  instance, the path_fullname() call to canonicalise paths was
  already there; it was just never used.
- Remove broken 'notrack' flaggery for deciding whether to report a
  hash table entry a.k.a. "tracked alias"; instead, check the hash
  table (shp->track_tree).

src/cmd/ksh93/sh/path.c:
- path_search(): Re #3: When prefixing the PWD, first check if
  we're in '/' and if so, don't prefix it; otherwise, adding the
  next slash causes an initial double slash. (Since '/' is the only
  valid single-character absolute path, all we need to do is check
  if the second character pwd[1] is non-null.)
- path_search(): Re #1: Stop autoloading when called by 'whence':
  * The 'flag==2' check to avoid autoloading a function was
    broken. The flag value is 2 on the first whence() loop
    iteration, but 3 on subsequent ones. Change to 'flag >= 2'.
  * However, this only fixes it if the function file does not have
    the x permission bit, as executable files are handled by
    path_absolute() which unconditionally autoloads functions!
    So, pass on our flag parameter when calling path_absolute().
- path_absolute(): Re #1: Add flag parameter. Do not autoload
  functions if flag >= 2.
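
The root directory check described for path_search() (re #3) can be
illustrated like this. It is a shell sketch of the idea only;
join_pwd is a hypothetical helper, and the C code tests pwd[1]
rather than comparing strings.

```sh
# Hypothetical helper: prefix a working directory to a relative name.
join_pwd() {
    case $1 in
    /) printf '/%s\n' "$2" ;;           # in '/': do not add another slash
    *) printf '%s/%s\n' "$1" "$2" ;;
    esac
}

join_pwd / bin/echo                     # prints /bin/echo, not //bin/echo
join_pwd /usr bin/env                   # prints /usr/bin/env
```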

src/cmd/ksh93/include/path.h,
src/cmd/ksh93/bltins/typeset.c,
src/cmd/ksh93/sh/main.c,
src/cmd/ksh93/sh/xec.c:
- Re #1: Update path_absolute() calls, adding a 0 flag parameter.

src/cmd/ksh93/include/name.h:
- Remove now-unused pathcomp member from union Value. It was
  introduced in 99065353 to allow examining the value of a tracked
  alias. This commit uses nv_getval() instead.

src/cmd/ksh93/tests/builtins.sh,
src/cmd/ksh93/tests/path.sh:
- Add and tweak various related tests.

Fixes: https://github.com/ksh93/ksh/issues/84
2020-09-20 07:56:09 +02:00
Martijn Dekker
f45a0f1650 -o posix: inverse-sync braceexpand; properly sync letoctal
{Brace,expansion} is potentially incompatible with POSIX scripts,
because in POSIX those are simple literal strings with no special
meaning. So the POSIX option should really turn that off.

As of b301d417, the 'posix' option was also forcing 'letoctal'
behaviour on, without actually setting that option. I've since
found that to be a botch; 'let' may recognise octals without that
option being set, and that looks like a bug.

So as of this commit, the '-o posix' option actually toggles both
of these options off/on and on/off, respectively. 'set +o posix'
toggles them inversely. However, it is now possible to control both
options (and their associated behaviour) independently in between
'set -o posix' and 'set +o posix'. Much better.
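
The toggling rule can be modelled with three plain variables. This
is a sketch of the described behaviour, not the sh_applyopts() code;
set_posix and the variable names are hypothetical, and the initial
values are assumed defaults.

```sh
# Assumed defaults for this model.
posix=off braceexpand=on letoctal=off

# Hypothetical model of 'set -o posix' (on) and 'set +o posix' (off).
set_posix() {
    [ "$1" = "$posix" ] && return 0     # no transition: touch nothing
    posix=$1
    case $1 in
    on)  braceexpand=off letoctal=on ;;
    off) braceexpand=on  letoctal=off ;;
    esac
}

show() { echo "posix=$posix braceexpand=$braceexpand letoctal=$letoctal"; }

set_posix on;  show
braceexpand=on                          # independent control while posix is on
set_posix on;  show                     # already on: braceexpand stays on
set_posix off; show
```

The early return is what allows either option to be changed by hand
between the two posix transitions.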

src/cmd/ksh93/sh/main.c: sh_main():
- If SH_POSIX was set on init, turn on SH_LETOCTAL by default
  instead of SH_BRACEEXPAND.

src/cmd/ksh93/sh/args.c: sh_applyopts():
- Turn off SH_BRACEEXPAND and turn on SH_LETOCTAL when SH_POSIX is
  turned on (but not if it was already on).
- Turn on SH_BRACEEXPAND and turn off SH_LETOCTAL when SH_POSIX is
  turned off (but not if it was already off).

src/cmd/ksh93/sh/arith.c: arith():
- Revert to pre-b301d417 and only check SH_LETOCTAL option when
  deciding whether 'let' should skip initial zeros.

src/cmd/ksh93/tests/options.sh:
- Update $- test to allow '-o posix' to switch B = braceexpand.

src/cmd/ksh93/sh.1:
- Update.
- Edit for clarity.
2020-09-18 22:07:44 +02:00
Martijn Dekker
7e5fd3e98d A few job control (-m, -o monitor) fixes (rhbz#960034)
This patch from Red Hat fixes the following:

1. ksh was ignoring the -m (-o monitor) option when specified on
   the invocation command line.

2. Scripts did not properly terminate their background processes
   on Ctrl+C if the -m option was turned off. Reproducer:
	xterm &
	read junk
   When run as a script without turning on -m, pressing Ctrl+C
   should terminate the xterm, and now does.

3. Scripts no longer attempt to set the terminal foreground process
   group ID, as only interactive shells should be doing that.

This makes some progress on https://github.com/ksh93/ksh/issues/119
but we're a long way from fixing all of that.

src/cmd/ksh93/sh/main.c: exfile():
- On non-interactive shells, do not turn off the monitor option.
  Instead, if it was turned on, turn on the SH_MONITOR state flag.

src/cmd/ksh93/edit/edit.c: ed_getchar():
- On Ctrl+C, issue SIGINT to the current process group using
  killpg(2) instead of going via sh_fault(), which handles a
  signal only for the current shell process.

src/cmd/ksh93/sh/jobs.c: job_reap(), job_reset(),
src/cmd/ksh93/sh/xec.c: sh_exec():
- Only attempt to set the terminal foreground process group ID
  using tcsetpgrp(3) if the shell is interactive.

Original patch: https://src.fedoraproject.org/rpms/ksh/blob/642af4d6/f/ksh-20120801-kshmfix.patch
This was applied to Red Hat's ksh 93u+ on 8 July 2013.
2020-09-18 04:42:27 +02:00
Martijn Dekker
06e721c313 data/signals.c: fix empty SIGINT/SIGPIPE messages
src/cmd/ksh93/data/signals.c includes two checks for the JOBS
identifier; if it is not defined then the interactive shell's
background job signal messages for SIGINT and SIGPIPE are empty.
The cause was that the "jobs.h" header, which defines that ID, was
not #included in signals.c. This commit adds that #include.
(ksh 93u+, ksh 93v- and ksh2020 all have this bug as well.)

Before:

$ sleep 30 &
[1]	86430
$ kill -s INT "$!"
[1] +                          sleep 30 &
$

After:

$ sleep 30 &
[1]	86445
$ kill -s INT "$!"
[1] + Interrupt                sleep 30 &
$
2020-09-18 03:22:26 +02:00
Martijn Dekker
13c3fb21e9 emacs, vi: Support repeat parameters to VT220 keys (re: f2a3f4e3)
In the vi and emacs line editors, repeat count parameters can now
also be used for the arrow keys and the forward-delete key. E.g.,
in emacs mode, <ESC> 7 <left-arrow> will now move the cursor seven
positions to the left. In vi control mode, this would be entered
as: 7 <left-arrow>.

src/cmd/ksh93/edit/emacs.c:
- ed_emacsread(): Upon getting ^[ (ESC), save current repeat count
  in a new variable; restore and reset it upon the next character.
- escape(): Minor bugfix: when processing a ^[[x sequence where 'x'
  is a character other than '~' (which would be DEL), also reinsert
  the final character into the buffer so scripts can detect them.

src/cmd/ksh93/edit/vi.c:
- cntlmode(): Do not reset the repeat count if the command is '[',
  the character following ESC in VT220 escape sequences.
- mvcursor():
  * Do not use getcount() to get the character following '[', as
    that was parsing repetition parameters in the wrong place.
    There wouldn't be any, so this would reset the repeat count.
  * After that, no more need for the special-casing of ^[[3~ (DEL)
    introduced in f2a3f4e3. Move it to within the 'switch' block.
  * When handling left and right arrows and Home and End keys, do
    not modify cursor directly but ed_ungetchar() the corresponding
    traditional command keys as with the rest. Otherwise a repeat
    count parameter would now wrongly survive those keys.

src/cmd/ksh93/sh.1:
- Document control character notation used for vi mode docs.
- Since vi control mode beeps and aborts on ESC except if a
  subsequent [ is already in the input buffer upon receiving ESC,
  document that VT220 escape sequences only preserve repeat counts
  when entered into the input buffer all at once.
- Don't skip the initial ESC in the documentation of the VT220
  escape sequences. In control mode, skipping the initial ESC still
  works as before, but that is now undocumented, as it's really
  nothing more than an artefact of VT220 escape processing.
- Move the two long paragraphs on '-o viraw' and canonical (i.e.
  line-based) input processing from the vi editor introduction to
  the options section under 'viraw'. It is much too arcane for the
  intro, and besides, ksh 93u+ (and hence also 93u+m) has
  SHOPT_VIRAW enabled by default, so the shell is compiled to force
  this option on at all times, making it even less relevant for
  most users.
2020-09-17 19:14:39 +02:00
Martijn Dekker
f2a3f4e36b Handle forward-delete key in emacs and vi editors
On every modern system, the forward-delete key on PC/Mac keyboards
generates the VT220 sequence ESC [ 3 ~. Every other shell with an
editor handles this now; ksh93 seems to be the last not to.

src/cmd/ksh93/edit/emacs.c: escape():
- Handle the ^[[3 as part of normal escape processing, then read an
  extra character to check for the final '~'. If detected, insert
  an ERASECHAR key event.

src/cmd/ksh93/edit/vi.c: mvcursor():
- Replace the ^[[3~ sequence by an 'x' command. We have to
  special-case its processing, because vi mode parses numbers as
  repetition operators. The escape sequence contains a number,
  making it incompatible with normal command handling. This means
  number repetitions don't work with the forward-delete key. If
  that annoys anyone enough to fix it, a patch would be welcome.
  For now, it will do to make the forward-delete key stop
  exhibiting bizarre behaviour (beep + change case + move forward).

src/cmd/ksh93/sh.1:
- Copy-edit emacs documentation for VT220-style sequences; map them
  to their actual key, otherwise it's meaningless to the reader.
- Document the new forward-delete key behaviour for emacs mode.
- Leave the forward-delete key for vi mode undocumented for now, as
  repetitions don't work, so it doesn't really match the vi canon.
  (OTOH, it doesn't work in vim, either...)
2020-09-15 03:43:53 +02:00
hyenias
d7c90eadc3 sfio: correct floating decimal point scaling of fractions (#131)
_sfcvt(), "convert a floating point value to ASCII", did not adjust
for the negative decimal place movement that occurs with leading
zeroes. This caused ksh's 'printf %f' formatter to fail to round
floating point values correctly.

src/lib/libast/sfio/sfcvt.c:
- Removed constraint of <1e-8 for doubles by matching what was done
  for long doubles having <.1.
- Corrected a condition when the next power of 10 occurred and that
  new 1 digit was being overwritten by a 0.

src/cmd/ksh93/tests/math.sh:
- Validate that typeset -E/F formatting matches that of their
  equivalent printf formatting options as well as checking for
  correct float scaling of the fractional parts.
2020-09-14 13:46:40 +02:00
Martijn Dekker
ddaa145b3d Reinstate 'r' and 'history' as preset aliases for interactive ksh
Following a community discussion, it became clear that 'r' is
particularly problematic as a regular builtin, as the name can and
does conflict with at least one legit external command by that
name. There was a consensus against removing it altogether and
letting users set the alias in their login scripts. However,
aliases are easier to bypass, remove or rename than builtins are.
My compromise is to reinstate 'r' as a preset alias on interactive
shells only, along with 'history', as was done in 17f81ebe before
they were converted to builtins in 03224ae3. So this reintroduces
the notion of predefined aliases to ksh 93u+m, but only for
interactive shells that are not initialised in POSIX mode.

src/cmd/ksh93/Makefile,
src/cmd/ksh93/Mamfile,
src/cmd/ksh93/include/shtable.h,
src/cmd/ksh93/data/aliases.c:
- Restore aliases.c containing shtab_aliases[], a table specifying
  the preset aliases.

src/cmd/ksh93/include/shtable.h,
src/cmd/ksh93/sh/init.c:
- Rename inittree() to sh_inittree() and make it extern, because we
  need to use it in main.c (sh_main()).

src/cmd/ksh93/sh/main.c: sh_main():
- Init preset aliases from shtab_aliases[] only if the shell is
  interactive and not in POSIX mode.

src/cmd/ksh93/bltins/typeset.c,
src/cmd/ksh93/tests/alias.sh:
- unall(): When unsetting an alias, pass on the NV_NOFREE attribute
  to nv_delete() to avoid an erroneous attempt to free a preset
  alias from read-only memory. See: 5d50f825

src/cmd/ksh93/data/builtins.c:
- Remove "history" and "r" entries from shtab_builtins[].
- Revert changes to inline fc/hist docs in sh_opthist[].

src/cmd/ksh93/bltins/hist.c: b_hist():
- Remove handling for 'history' and 'r' as builtins.

src/cmd/ksh93/sh.1:
- Update accordingly.

Resolves: https://github.com/ksh93/ksh/issues/125
2020-09-11 21:35:45 +02:00
Martijn Dekker
b9d10c5a9c Fix 'command' expansion bug and POSIX compliance
The 'command' name can now result from an expansion, e.g.:
	c=command; "$c" ls
	set -- command ls; "$@"
both work now. This fixes BUG_CMDEXPAN.

If -o posix is on, 'command' now disables not only the "special"
but also the "declaration" properties of builtin commands that it
invokes. This is because POSIX specifies 'command' as a simple
regular builtin, and any command name following 'command' is just
an argument to the 'command' command, so there is nothing that
allows any further arguments (such as assignment-arguments) to be
treated specially by the parser. So, if and only if -o posix is on:
a. Arguments that start with a variable name followed by '=' are
   always treated as regular words subject to normal shell syntax.
b. Since assignment-arguments are not processed as assignments
   before the command itself, 'command' can now stop the shell from
   exiting (as required by the standard) if a command that it
   invokes (such as 'export') tries to modify a readonly variable.
   This fixes BUG_CMDSPEXIT.

Most of 'command' is integrated in the parser and parse tree
executer, so that is where it needed fixing.

src/cmd/ksh93/sh/parse.c: simple():
- If the posix option is on, do not skip past SYSCOMMAND so that
  any declaration builtin commands that are arguments to 'command'
  are not detected and thus not treated specially at parsetime.

src/cmd/ksh93/sh/xec.c: sh_exec():
- When detecting SYSCOMMAND in order to skip past it, not only
  compare the Namval_t pointer 'np' to SYSCOMMAND, but also handle
  the case where that pointer is NULL, as when the command name
  results from an expansion. In that case, search the function tree
  shp->fun_tree for the name and see if that yields the SYSCOMMAND
  pointer. fun_tree is initialised with a dtview to bltin_tree, so
  searching fun_tree instead allows for overriding 'command' with a
  shell function (which the POSIX standard requires us to allow).

src/cmd/ksh93/sh.1,
src/cmd/ksh93/data/builtins.c:
- Update documentation to match these changes.
- Various related edits and improvements.

src/cmd/ksh93/tests/builtins.sh:
- Check that 'command' works if resulting from an expansion.
- Check that 'command' can be overridden by a shell function.
2020-09-11 10:06:43 +02:00
Martijn Dekker
092b90da81 Fix BUG_LOOPRET2 and related return/exit misbehaviour
The 'exit' and 'return' commands without an argument failed to pass
down the exit status of the last-run command when incorporated in a
block with redirection, &&/|| list, 'case' statement, or 'while',
'until' or 'for' loop.
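
A few reproducers for the contexts listed above, as a sketch. The
function names are made up for illustration. In each one, a bare
'return' should pass down the status set by the preceding
'(exit N)', so a conforming shell is expected to print each N next
to the matching function name.

```sh
# Each function sets a distinct status, then uses 'return' with no argument.
in_redir_block() { (exit 3); { return; } 2>/dev/null; }
in_or_list()     { (exit 4) || return; }
in_case()        { (exit 5); case x in x) return ;; esac; }
in_for()         { (exit 6); for i in 1; do return; done; }
in_while_cond()  { while (exit 7); return; do :; done; }

for fn in in_redir_block in_or_list in_case in_for in_while_cond; do
    "$fn" && st=0 || st=$?
    printf '%s: %s\n' "$fn" "$st"
done
```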

src/cmd/ksh93/bltins/cflow.c:
- Use $?, which is sh.savexit a.k.a. shp->savexit, as the default
  exit status value if there is no argument, instead of
  shp->oldexit. This fixes the default exit status behaviour to
  match POSIX and other shells.

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/include/shell.h:
- Remove now-unused sh.oldexit (a.k.a. shp->oldexit) private struct
  member. It appeared to fulfill the same function as sh.savexit,
  but in a slightly broken way.
- Move the savexit/$? declaration from the _SH_PRIVATE part of the
  struct definition to the public API part. Since $? uses this,
  it's clearly a publicly exposed value already, and this is
  generally the one to use. (If anything, it's exitval that should
  have been private.) This declares savexit right next to exitval,
  rewriting the comments to clarify the difference between them.

src/cmd/ksh93/sh/fault.c,
src/cmd/ksh93/sh/subshell.c,
src/cmd/ksh93/sh/xec.c:
- Remove assignments to shp->oldexit.

src/cmd/ksh93/tests/basic.sh:
- Add thorough regression tests for the default exit status
  behaviour of 'return' and 'exit' in various lexical contexts.
- Verify that 'for' and 'case' without any command, as well as a
  lone redirection, still correctly reset the exit status to 0.

Fixes: #117
2020-09-09 20:02:20 +02:00
Martijn Dekker
5ed9ffd6c4 This fixes erroneous syntax errors in parameter expansions such as
${var:-wor)d} or ${var+w(ord}. The parentheses now correctly lose
their normal grammatical meaning within the braces. Fix by Eric
Scrivner (@etscrivner) from July 2018 backported from ksh2020.
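
A minimal check of the two expansions mentioned above, written as a
sketch that any POSIX shell should accept:

```sh
unset -v var
a=${var:-wor)d}         # var is unset, so the alternative word is used
var=set
b=${var+w(ord}          # var is set, so the alternative word is used
printf '%s\n' "$a" "$b"
```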

This fix complies with POSIX:
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_02

src/cmd/ksh93/sh/lex.c: sh_lex():
- Set the ST_QUOTE state when analysing a modifier with parameter
  expansions using operators ':', '-', '+', '='. This state causes
  subsequent characters (including parentheses) to be considered
  quoted, suppressing their normal grammatical meaning.

src/cmd/ksh93/sh/macro.c: varsub():
- Same for skipping the expansion.

Fixes: https://github.com/ksh93/ksh/issues/126
Prior discussion: https://github.com/att/ast/issues/475
2020-09-05 16:20:22 +02:00
Martijn Dekker
00d439605f -o posix: don't import/export variable attributes thru environment
When exporting variables, ksh exports their attributes (such as
'integer' or 'readonly') in a magic environment variable called
"A__z" (string defined in e_envmarker[] in data/msg.c). Child
shells recognise that variable and restore the attributes.

This little-known feature is risky; the environment cannot
necessarily be trusted and that A__z variable is easy to manipulate
before or between ksh invocations, so you can cause a script's
variables to be of the wrong type, or readonly. Backwards
compatibility requires keeping it, at least for now. But it should
be disabled in the posix mode, as it violates POSIX.

To do this, we have to solve a catch-22 in init.c. We must parse
options to know whether to turn on posix mode; it may be specified
as '-o posix' on the command line. The option parsing loop depends
on an initialised environment[*], while environment initialisation
(i.e., importing attributes) should depend on the posix option.

The catch-22 can be solved because initialising just the values
before option parsing is enough to avoid regressions. Importing the
attributes can be delayed until after option parsing. That involves
basically splitting env_init() into two parts while keeping a local
static state variable between them.

src/cmd/ksh93/sh/init.c:
- env_init():
  * Split the function in two stages based on a new
    'import_attributes' parameter. Import values in the first
    stage; import attributes from A__z in the second (if ever).
    Make the 'next' variable static as it keeps a state needed for
    the attributes import stage.
  * Single point of truth, greppability: don't hardcode "A__z" in
    separate character comparisons, but use e_envmarker[].
  * Fix an indentation error.
- sh_init(): When initialising the environment (env_init), don't
  import the attributes from A__z yet; parse options first, then
  import attributes only if posix option is not set.

src/cmd/ksh93/sh/name.c:
- sh_envgen(): Don't export variable attributes to A__z if the
  posix option is set.

src/cmd/ksh93/tests/attributes.sh:
- Check that variable attributes aren't imported or exported
  if the POSIX option is set.

src/cmd/ksh93/sh.1:
- Update.

This was the last item on the TODO list for -o posix for now.
Closes: #20

[*] If environment initialisation is delayed until after option
    parsing, bin/shtests shows various regressions, including:
    restricted mode breaks; the locale is not initialised properly
    so that multibyte variable names break; $SHLVL breaks.
2020-09-05 11:41:02 +02:00
Martijn Dekker
bec6556236 update NEWS, SH_RELEASE (re: 6575903d) 2020-09-04 05:29:52 +02:00
Martijn Dekker
55f0f8ce52 -o posix: disable '[ -t ]' == '[ -t 1 ]' hack
On ksh93, 'test -t' is equivalent to 'test -t 1' (and of course
"[ -t ]" is equivalent to "[ -t 1 ]").

This is purely for compatibility with ancient Bourne shell
breakage. No other shell supports this. ksh93 should probably keep
it for backwards compatibility, but it should definitely be
disabled in POSIX mode as it is a violation of the standard; 'test
-t' is an instance of 'test "$string"', which tests if the string
is empty, so it should test if the string '-t' is empty (quod non).

This also replaces the fix for 'test -t 1' in a command
substitution with a better one that avoids forking (re: cafe33f0).

src/cmd/ksh93/sh/parse.c:
- qscan(): If the posix option is active, disable the parser-based
  hack that converts a simple "[ -t ]" to "[ -t 1 ]".

src/cmd/ksh93/bltins/test.c:
- e3(): If the posix option is active, disable the part of the
  compatibility hack that was used for compound expressions
  that end in '-t', e.g. "[ -t 2 -o -t ]".
- test_unop(): Remove the forking fix for "[ -t 1 ]".

src/cmd/ksh93/edit/edit.c:
- tty_check(): This function is used by "[ -t 1 ]" and in other
  contexts as well, so a fix here is more comprehensive. Forking
  here would cause a segfault, but we don't actually need to. This
  adds a fix that simply returns false if we're in a virtual
  subshell that is also a command substitution. Since command
  substitutions always fork upon redirecting standard output within
  them (making them no longer virtual), it is safe to do this.

src/cmd/ksh93/tests/bracket.sh:
- Add comprehensive regression tests for test/[/[[ -t variants in
  command substitutions, in simple and compound expressions, with
  and without redirecting stdout to /dev/tty within the comsub.
- Add tests verifying that -o posix disables the old hack.
- Tweak other tests, including one that globally disabled xtrace.
2020-09-01 20:24:44 +01:00
Martijn Dekker
c607c48c84 Revert <> redir FD except in posix mode (re: eeee77ed, 60516872)
eeee77ed implemented a POSIX compliance fix that caused a potential
incompatibility with existing ksh scripts; it made the (rarely
used) read/write redirection operator, <>, default to file
descriptor 0 (standard input) as POSIX specified, instead of 1
(standard output) which is traditional ksh93 behaviour. So ksh
scripts needed to change all <> to 1<> to override the new default.

This commit reverts that change, except in the new posix mode.

src/cmd/ksh93/sh/lex.c:
- Make FD for <> default to 0 in POSIX mode, 1 otherwise.

src/cmd/ksh93/tests/io.sh:
- Revert <> regression test changes from 60516872; we no longer
  need 1<> instead of <> in ksh code.
2020-09-01 08:48:18 +01:00
Martijn Dekker
fd977388a2 -o posix: allow invoked programs to inherit FDs > 2
If there are file descriptors > 2 opened with 'exec' or 'redirect',
ksh93 has always closed them when invoking another program. This is
contrary to POSIX which states:
    Utilities other than the special built-ins […] shall be invoked
    in a separate environment that consists of the following. The
    initial value of these objects shall be the same as that for
    the parent shell, except as noted below.
    * Open files inherited on invocation of the shell, open files
      controlled by the exec special built-in plus any
      modifications, and additions specified by any redirections to
      the utility
    * […]
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_12

src/cmd/ksh93/sh/io.c: sh_redirect():
- When flag==2, do not close FDs > 2 if POSIX mode is active.

src/cmd/ksh93/tests/io.sh:
- Regress-test inheriting FD 7 with and without POSIX mode.

src/cmd/ksh93/sh.1:
- Update.
2020-09-01 08:11:27 +01:00
Martijn Dekker
b301d41731 -o posix: always recognise octals in "let" builtin
Though the "let" builtin is not itself a POSIX standard command, it
processes standard shell arithmetic, so it should recognise octals
by leading zeros as POSIX requires if the 'posix' option is on.
This overrides the setting of the 'letoctal' option.

Note that none of this applies to the ((...)) arithmetic command,
which has always recognised leading-octal zeros and does not listen
to 'letoctal'. So setting the posix mode makes this consistent.

src/cmd/ksh93/sh/arith.c:
- When running the 'let' builtin, test that both SH_LETOCTAL and
  SH_POSIX are off before stripping leading zeros to disable octal
  number recognition.
- Cosmetic: fix spurious newline.
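
The condition described above can be sketched in C as follows. This is a
minimal illustration, not the code in arith.c: let_number_sketch and its
two flag parameters are hypothetical stand-ins for the SH_LETOCTAL and
SH_POSIX option tests.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: evaluate a numeric literal the way the commit
 * describes for 'let'. Leading zeros are stripped (forcing decimal)
 * only when both the letoctal and posix options are off.
 */
static long let_number_sketch(const char *s, int letoctal, int posix)
{
    if (!letoctal && !posix) {
        while (s[0] == '0' && s[1] >= '0' && s[1] <= '9')
            s++;                      /* "010" becomes "10" */
        return strtol(s, NULL, 10);
    }
    return strtol(s, NULL, 0);        /* base 0: a leading 0 means octal */
}
```

With both options off, "010" evaluates to 10; with either option on, it
evaluates to 8.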

src/cmd/ksh93/sh.1:
- Document the change.

src/cmd/ksh93/tests/shtests:
- Make sure to disable posix mode by default for regression tests.
2020-09-01 07:17:22 +01:00
Martijn Dekker
921bbcaeb7 Remove SHOPT_BASH; keep &> redir operator, '-o posix' option
On 16 June there was a call for volunteers to fix the bash
compatibility mode; it has never successfully compiled in 93u+.
Since no one showed up, it is now removed due to lack of interest.

A couple of things are kept, which are now globally enabled:

1. The &>file redirection shorthand (for >file 2>&1). As a matter
   of fact, ksh93 already supported this natively, but only while
   running rc/profile/login scripts, and it issued a warning. This
   makes it globally available and removes the warning, bringing
   ksh93 in line with mksh, bash and zsh.

2. The '-o posix' standard compliance option. It is now enabled on
   startup if ksh is invoked as 'sh' or if the POSIXLY_CORRECT
   variable exists in the environment. To begin with, it disables
   the aforementioned &> redirection shorthand. Further compliance
   tweaks will be added in subsequent commits. The differences will
   be fairly minimal as ksh93 is mostly compliant already.

In all changed files, code was removed that was compiled (more
precisely, failed to compile/link) if the SHOPT_BASH preprocessor
identifier was defined. Below are other changes worth mentioning:

src/cmd/ksh93/sh/bash.c,
src/cmd/ksh93/data/bash_pre_rc.sh:
- Removed.

src/cmd/ksh93/data/lexstates.c,
src/cmd/ksh93/include/shlex.h,
src/cmd/ksh93/sh/lex.c:
- Globally enable &> redirection operator if SH_POSIX not active.
- Remove warning that was issued when &> was used in rc scripts.

src/cmd/ksh93/data/options.c,
src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/args.c:
- Keep SH_POSIX option (-o posix).
- Replace SH_TYPE_BASH shell type by SH_TYPE_POSIX.

src/cmd/ksh93/sh/init.c:
- sh_type(): Return SH_TYPE_POSIX shell type if ksh was invoked
  as sh (or rsh, restricted sh).
- sh_init(): Enable posix option if the SH_TYPE_POSIX shell type
  was detected, or if the CONFORMANCE ast config variable was set
  to "standard" (which libast sets on init if POSIXLY_CORRECT
  exists in the environment).

src/cmd/ksh93/tests/options.sh,
src/cmd/ksh93/tests/io.sh:
- Replace regression tests for &> and move to io.sh. Since &> is
  now for general use, no longer test in an rc script, and don't
  check that a warning is issued.

Closes: #9
Progresses: #20
2020-09-01 06:19:19 +01:00
Martijn Dekker
9ba2c2e0df Speed up 'read', fixing macOS hang (take 2)
This fixes a hanging bug that could occur on macOS when using the
'read' command to read from a FIFO and encountering end-of-file
without a final newline character. It also makes the 'read' command
perform 15-25% faster on macOS and Linux.

The previous version (ff385e5a) failed on SunOS/Solaris/Illumos
because those systems apparently don't (fully) support the POSIX
standard recv(2) syscall with MSG_PEEK[*], which is the feature
that iffe detects under the 'socket_peek' identifier. On Illumos,
using that method causes a compilation failure (unknown identifier
MSG_PEEK); on Solaris 11.4, that method causes multiple regressions
in tests/io.sh, suggesting the method compiles but doesn't work at
all. Instead, SunOS/Solaris/Illumos requires the method using
ioctl(2)+I_PEEK and select(2). No other system that ksh currently
builds on requires this method, so it is now only used on
SunOS/Solaris/Illumos.
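
For readers unfamiliar with the recv(2)+MSG_PEEK method, here is a minimal
C sketch of peeking at socket data without consuming it. The function name
peek_sketch is hypothetical; this is not the code in sfpkrd(), only an
illustration of the standard call it relies on.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Peek at up to n bytes waiting on socket fd without consuming them.
 * A later read(2) or recv(2) on fd returns the same bytes again.
 */
static ssize_t peek_sketch(int fd, void *buf, size_t n)
{
    return recv(fd, buf, n, MSG_PEEK);
}
```

The call only works on sockets, so trying it out requires something like a
socketpair(2) rather than a regular file.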

So far, this version of sfpkrd() has been tested to work correctly
on Linux, macOS, FreeBSD, NetBSD, OpenBSD, HP-UX, Solaris, and
OmniOS (an Illumos distribution).

It still fails to peek on Cygwin, but in the exact same way it
failed before, so that's no loss.

To test, run the 'io' test set:  bin/shtests -p io

src/lib/libast/sfio/sfpkrd.c: sfpkrd():
- Remove long-obsolete Mac OS X and Solaris bug workarounds.
- Remove methods that are no longer needed.
     On systems with a POSIX compliant recv(2), the only thing that
  is required to avoid regressions is the code that was conditional
  upon the socket_peek feature test, which tests for the correct
  functioning of the recv(2) syscall. This has now been made
  mandatory for non-SunOS/Solaris/Illumos systems (using an #error
  directive if it is not detected), with the other methods removed.
  The result performs 15-25% faster on macOS and Linux while
  passing all the regression tests.
     On macOS, avoiding the select(2) method fixes the hanging bug.
     On SunOS/Solaris/Illumos (the '__sun' identifier), the method
  using ioctl(2)+I_PEEK and select(2) (iffe feature IDs:
  stream_peek and lib_select) is preserved.

Resolves: https://github.com/ksh93/ksh/issues/118 (again)

[*] https://pubs.opengroup.org/onlinepubs/9699919799/functions/recv.html
2020-08-19 23:54:55 +01:00
Martijn Dekker
569c1bb9c1 Revert "Speed up 'read', fixing macOS hang"
This reverts commit ff385e5a89.
It broke Solaris and illumos. More testing is needed.
2020-08-19 04:10:55 +01:00
Martijn Dekker
ff385e5a89 Speed up 'read', fixing macOS hang
This fixes a hanging bug that could occur on macOS when using the
'read' command to read from a FIFO and encountering end-of-file
without a final newline character. It also makes the 'read' command
perform 15-25% faster on macOS and Linux and maybe other systems.

src/lib/libast/sfio/sfpkrd.c: sfpkrd():
- Get rid of the optional stuff that uses the poll(2) or select(2)
  syscalls. The only thing that is required to avoid regressions is
  the code that was conditional upon the socket_peek feature test,
  which tests for the correct functioning of the recv(2) syscall.
  This has now been made mandatory. The rest now uses what was
  previously a fallback in plain C, resulting in a function that is
  not only more readable, but actually faster than the syscalls.

Resolves: https://github.com/ksh93/ksh/issues/118
2020-08-19 01:36:01 +01:00
Martijn Dekker
d03e948bcd Fix 'command -p' lookup if hash table entry exists (re: c9ccee86)
If a command's path was previously added to the hash table as a
'tracked alias', then the hash table entry was used, bypassing
the default utility path search activated by 'command -p'.

'command -p' activates a SH_DEFPATH shell state. The bug was caused
by a failure to check for this state before using the hash table.
This check needs to be added in four places.

src/cmd/ksh93/sh/path.c,
src/cmd/ksh93/sh/xec.c:
- path_search(), path_spawn(), sh_exec(), sh_ntfork(): Only consult
  the hash table, which is shp->track_tree, if the SH_DEFPATH shell
  state is not active.

src/cmd/ksh93/tests/path.sh:
- Add regress tests checking that 'command -p' and 'command -p -v'
  still search in the default path if a hash table entry exists for
  the command searched.
2020-08-17 20:23:39 +01:00
Martijn Dekker
acf84e9633 Fix 'command -x' on macOS, Linux, Solaris
'command -x' (basically builtin xargs for 'command') worked for
long argument lists on *BSD and HP-UX, but not on macOS and Linux,
where it reliably entered into an infinite loop.

The problem was that it assumed that every byte of the environment
space can be used for arguments, without accounting for alignment
that some OSs do. MacOS seems to be the most wasteful one: it
aligns on 16-byte boundaries and requires some extra bytes per
argument as well.

src/cmd/ksh93/sh/path.c:
- path_xargs(): When calculating how much space to subtract per
  argument, add 16 extra bytes to the length of each argument, then
  align the result on 16-byte boundaries. The extra 16 bytes is
  more than even macOS needs, but hopefully it is future-proof.
- path_spawn(): If path_xargs() does fail, do not enter a retry
  loop (which always becomes an infinite loop if the argument list
  exceeds OS limitations), but abort with an error message.
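
The per-argument space estimate described for path_xargs() can be sketched
in C like this. The function name arg_cost_sketch is hypothetical, and
counting the terminating NUL byte is an assumption of this sketch, not
something the commit message states.

```c
#include <stddef.h>
#include <string.h>

/*
 * Conservative estimate of the space one argument consumes:
 * its length plus 16 spare bytes, rounded up to a multiple of 16.
 * Counting the terminating NUL is an assumption of this sketch.
 */
static size_t arg_cost_sketch(const char *arg)
{
    size_t n = strlen(arg) + 1 + 16;
    return (n + 15) & ~(size_t)15;
}
```

Under these assumptions a 3-byte argument costs 32 bytes and a 16-byte
argument costs 48, so the estimate deliberately overshoots.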
2020-08-16 09:31:43 +01:00
Martijn Dekker
56805b25af Fix leak and crash upon defining functions in subshells
A memory leak occurred upon leaving a virtual subshell if a
function was defined within it. If this was done more than 32766
(= 2^15-2 = the 'short' max value - 1) times, the shell crashed.
Discussion and reproducer: https://github.com/ksh93/ksh/issues/114

src/cmd/ksh93/sh/subshell.c: table_unset():
- A subshell-defined function was never freed because of a broken
  check for autoloaded functions (which must not be freed[*]). It
  looked for an initial '/' in the canonical path of the script
  file that defined the function, but that path is also stored for
  regular functions. Now use a check that executes nv_search() in
  fpathdict, the same method used in _nv_unset() in name.c for a
  regular function unset.

src/cmd/ksh93/bltins/misc.c: b_dot_cmd():
- Fix an additional memory leak introduced in bd88cc7f, that caused
  POSIX functions (which are run with b_dot_cmd() like dot scripts)
  to leak extra. This fix avoids both the crash fixed there and the
  memory leak by introducing a 'tofree' variable remembering the
  filename to free. Thanks to Johnothan King for the patch.

src/lib/libast/include/stk.h,
src/lib/libast/misc/stk.c,
src/lib/libast/man/stk.3,
src/lib/libast/man/stak.3:
- Make the stack more resilient by extending the stack reference
  counter 'stkref' from (signed) short to unsigned int. On modern
  systems with 32-bit ints, this extends the maximum number of
  elements on a stack from 2^15-1==32767 to 2^32-1==4294967295.
  The ref counter can never be negative, so there is no reason for
  signedness. sizeof(int) is defined as the size of a single CPU
  word, so this should not affect performance at all.
     On a 16-bit system (not that ksh still compiles there), this
  doubles the max number of entries to 2^16-1=65535.

src/cmd/ksh93/tests/leaks.sh:
- Add leak regression tests for ksh functions, POSIX functions, dot
  scripts run with '.', and dot scripts run with 'source'.

src/cmd/ksh93/tests/path.sh:
- Add an output builtin with a redirect to an autoloaded function
  so that a crash[*] is triggered if the check for an autoloaded
  function is ever removed from table_unset(), as was done in ksh
  93v- (which crashed).

[*] Freeing autoloaded functions after leaving a virtual subshell
    causes a crashing bug: https://github.com/att/ast/issues/803

Co-authored-by: Johnothan King <johnothanking@protonmail.com>
Fixes: https://github.com/ksh93/ksh/issues/114
2020-08-14 00:25:31 +01:00
Johnothan King
05ac1dbb41
Fix crash upon running many subshells (#113)
Co-authored-by: Martijn Dekker <martijn@inlv.org>

An intermittent crash occurred after running many thousands of
virtual/non-forked subshells. One reproducer is a crash in the
shbench fibonacci.ksh test, as documented here:
https://github.com/ksh-community/shbench/blob/f3d9e134/bench/fibonacci.ksh#L4-L10

The apparent cause was the signed and insufficiently large 'short'
data type of 'curenv' and related variables which wrapped around to
a negative number when overflowing. These IDs are necessary for the
'wait' builtin to obtain the exit status from a background job.

This fix is inspired by a patch based on ksh 93v-:
https://build.opensuse.org/package/view_file/shells/ksh/ksh93-longenv.dif?expand=1
https://src.fedoraproject.org/rpms/ksh/blob/f24/f/ksh-20130628-longer.patch

However, we change the type to 'unsigned int' instead of 'long'. On
all remotely modern systems, ints are 32-bit values, and using this
type avoids a performance degradation on 32-bit systems. Making them
unsigned prevents an overflow to negative values.
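
The difference between the two counter types can be shown with a short C
sketch. The function names are hypothetical; they only illustrate the
arithmetic, not the ksh code that maintains 'curenv'.

```c
#include <limits.h>

/*
 * Old style: a signed short ID. id + 1 is computed as int; converting an
 * out-of-range result back to short is implementation-defined and on
 * common compilers wraps to a negative value.
 */
static short next_id_short(short id)
{
    return (short)(id + 1);
}

/*
 * New style: unsigned int arithmetic is well defined and never negative;
 * it wraps to 0 only after UINT_MAX.
 */
static unsigned int next_id_uint(unsigned int id)
{
    return id + 1u;
}
```

With 16-bit shorts, the first function runs out of positive values after
32767, while the second continues counting up to UINT_MAX.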

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/include/jobs.h,
src/cmd/ksh93/include/nval.h,
src/cmd/ksh93/include/shell.h:
- Change the types of the static global 'subenv' and the subshell
  structure members 'curenv', 'jobenv', 'subenv', 'p_env' and
  'subshell' to one consistent type, unsigned int.

src/cmd/ksh93/sh/jobs.c,
src/cmd/ksh93/sh/macro.c,
src/cmd/ksh93/sh/name.c,
src/cmd/ksh93/sh/nvtype.c,
src/cmd/ksh93/sh/subshell.c:
- Updates to match new variable types.

src/cmd/ksh93/tests/subshell.sh:
- Show wrong exit status in message on failure of 'wait' builtin.
2020-08-12 18:50:59 +01:00
Martijn Dekker
61437b2728 Fix crash, take three (re: e805c7d9, 33858689)
The current fix appears to be only partially successful in
eliminating the intermittent crash, and also breaks '-o notify'
during the 60-second $TMOUT grace period. This replaces it.

The root cause appears to be that the state of job control becomes
somehow inconsistent when running external commands in a command
substitution expanded from the $PS1 prompt. The job_unpost() or
(sometimes) the job_list() function intermittently crash. These are
called if the SH_TTYWAIT state is active:
https://github.com/ksh93/ksh/blob/88e8fa67/src/cmd/ksh93/sh/jobs.c#L463-L469
Temporarily deactivating the SH_TTYWAIT state while expanding
PS{1..4} prompts appears to fix the problem reliably.

It is quite possible that this fix merely masks a bug in the job
control system, but testing has shown that it stops ksh crashing
without side effects, so I'm calling it good for now.

Thanks to Marc Wilson for many hours of persistent testing.

src/cmd/ksh93/sh/jobs.c:
- Revert changes made in 33858689 and e805c7d9.

src/cmd/ksh93/sh/io.c: io_prompt():
- Save SH_TTYWAIT state and turn it off while expanding prompts.
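
The save/clear/restore pattern can be sketched in C as follows. All names
here (ST_TTYWAIT_SKETCH, shell_state_sketch, and so on) are hypothetical;
ksh's real state handling in io_prompt() uses its own macros and is not
reproduced here.

```c
/* Hypothetical state bit; ksh's real state handling differs. */
#define ST_TTYWAIT_SKETCH 0x1u

static unsigned int shell_state_sketch;

/* Run expand() with the TTYWAIT bit cleared, then restore the saved bit. */
static void with_ttywait_off_sketch(void (*expand)(void))
{
    unsigned int saved = shell_state_sketch & ST_TTYWAIT_SKETCH;

    shell_state_sketch &= ~ST_TTYWAIT_SKETCH;
    expand();
    shell_state_sketch |= saved;
}

/* Example callback: records the state it observes while it runs. */
static unsigned int seen_during_expand_sketch;

static void record_state_sketch(void)
{
    seen_during_expand_sketch = shell_state_sketch;
}
```

Only the one bit is saved and restored, so any other state changes made
during the expansion are left alone.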

Resolves: https://github.com/ksh93/ksh/issues/103
Resolves: https://github.com/ksh93/ksh/issues/112
2020-08-11 01:51:31 +01:00
Martijn Dekker
8477d2ce22 printf: Fix HTML and URI encoding (%H, %#H)
This applies a number of fixes to the printf formatting directives
%H and %#H (as well as their equivalents %(html)q and %(url)q):
1. Both formatters have been made multibyte/UTF-8 aware, and no
   longer delete multibyte characters. Invalid UTF-8 byte sequences
   are rendered as ASCII question marks.
2. %H no longer wrongly encodes spaces as non-breaking spaces
   (&nbsp;) and instead correctly encodes the UTF-8 non-breaking
   space as such.
3. %H now converts the single quote (') to '&#39;' instead of
   '&apos;' which is not a valid entity in all HTML versions.
4. %#H failed to encode some reserved characters (e.g. '?') while
   encoding some unreserved ones (e.g. '~'). It now percent-encodes
   all characters except those 'unreserved' as per RFC3986 (ASCII
   alphanumeric plus -._~).
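
The URI rule in item 4 can be sketched in C as below. This is a byte-wise
illustration with hypothetical names (is_unreserved_sketch,
uri_encode_sketch); it is not the fmthtml() rewrite and does not include
the invalid-UTF-8 handling described in item 1.

```c
#include <stddef.h>
#include <string.h>

/* RFC 3986 "unreserved" set: ASCII alphanumerics plus - . _ ~ */
static int is_unreserved_sketch(unsigned char c)
{
    return c != '\0' && strchr(
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~",
        c) != NULL;
}

/*
 * Percent-encode src into dst (capacity dstsize, including the NUL).
 * Returns 0 on success, -1 if dst is too small.
 */
static int uri_encode_sketch(const char *src, char *dst, size_t dstsize)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t o = 0;

    for (; *src; src++) {
        unsigned char c = (unsigned char)*src;
        size_t need = is_unreserved_sketch(c) ? 1 : 3;

        if (o + need + 1 > dstsize)
            return -1;
        if (need == 1) {
            dst[o++] = (char)c;
        } else {
            dst[o++] = '%';
            dst[o++] = hex[c >> 4];
            dst[o++] = hex[c & 15];
        }
    }
    if (o + 1 > dstsize)
        return -1;
    dst[o] = '\0';
    return 0;
}
```

In this sketch '?' is encoded as %3F and '~' is passed through unchanged,
matching the reserved/unreserved distinction the fix introduces.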

Prior discussion:
https://groups.google.com/d/msgid/korn-shell/ce8d1467-4a6d-883b-45ad-fc3c7b90e681%40inlv.org

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/string.c:
- defs.h: If compiling without SHOPT_MULTIBYTE, redefine the
  mbwide() macro (which tests if we're in a multibyte locale) as 0.
  This lets the compiler optimiser do the work that would otherwise
  require a lot of tedious '#if SHOPT_MULTIBYTE' directives.
- string.c: Remove some now-unneeded '#if SHOPT_MULTIBYTE' stuff.
- defs.h, string.c: Rename is_invisible() to sh_isprint(), invert
  the boolean return value, and make it an extern for use in
  fmthtml() -- see below. If compiling without SHOPT_MULTIBYTE,
  simply #define sh_isprint() as equivalent to isprint(3).
- defs.h: Add URI_RFC3986_UNRESERVED macro for fmthtml() containing
  the characters "unreserved" for purposes of URI percent-encoding.

src/cmd/ksh93/bltins/print.c: fmthtml():
- Remove kludge that skipped all multibyte characters (!).
- Complete rewrite to implement fixes described above.
- Don't bother with '#if SHOPT_MULTIBYTE' directives (see above).

src/cmd/ksh93/data/builtins.c:
- sh_optprintf[]: %H: Add single quote to encoded chars doc.
- Edit credits and bump version date.

src/cmd/ksh93/tests/builtins.sh:
- Update and tweak old regression tests.
- Add a number of new tests for UTF-8 HTML and URI encoding, which
  are only run when running tests in a UTF-8 locale (shtests -u).
2020-08-10 22:51:55 +01:00
Martijn Dekker
5312a59d5a Skip '.' and '..' when globbing patterns like .*
There are convincing arguments why including '.' and '..' in the
result of pathname expansion is actively harmful. See:
https://www.austingroupbugs.net/view.php?id=1228
https://github.com/ksh93/ksh/issues/58#issuecomment-653716846

pdksh, mksh and zsh already skip these special traversal names
in all cases. This commit makes ksh act like these shells.

Since passing '.' and especially '..' as arguments to commands like
'chmod -R' and 'cp -r' may cause harm, this change seems likely to
fix more legacy scripts than it breaks. I'm unaware of anyone ever
having come up with a concrete use case for the old behaviour.

This change also fixes the bug that '.' and '..' failed to be
ignored as documented if FIGNORE is set.

src/lib/libast/misc/glob.c: glob_dir():
- Explicitly skip any matching '.' and '..' in all cases.
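
The kind of check this implies can be sketched in C as follows. The helper
name is_dot_or_dotdot_sketch is hypothetical; the actual test in
glob_dir() may be written differently.

```c
/* Return nonzero if name is exactly "." or "..". */
static int is_dot_or_dotdot_sketch(const char *name)
{
    return name[0] == '.' &&
           (name[1] == '\0' || (name[1] == '.' && name[2] == '\0'));
}
```

Names such as ".profile" or "..." are ordinary directory entries and are
not affected by such a check.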

src/cmd/ksh93/tests/glob.sh:
- Add test_glob() tests for '*' and '.*'.

src/cmd/ksh93/sh.1: File Name Generation:
- Update to match new behaviour.

Resolves: https://github.com/ksh93/ksh/issues/58
2020-08-10 00:35:53 +01:00
Martijn Dekker
be5ea8bbb2 redirect: check args before executing redirections (re: 7b82c338)
The 'redirect' builtin command did not error out before executing
any valid redirections. For example, 'redirect ls >foo.txt' issued
an "incorrect syntax" error, but still created 'foo.txt' and left
standard output permanently redirected to it.

src/cmd/ksh93/sh/xec.c: sh_exec():
- If we have redirections (io != NULL), and the command is
  SYSREDIR, then check for arguments and error out if there are
  any, before calling sh_redirect() to execute redirections.
  (Note, the other check for arguments in b_exec() in bltins/misc.c
  must be kept, as that applies if there are no redirections.)

src/cmd/ksh93/sh/io.c: sh_redirect():
- Edit comments to better explain what the flag values do.

src/cmd/ksh93/bltins/misc.c:
- Add a dummy b_redirect() function declaration "for the dictionary
  generator" as has historically been done for other builtins that
  share one C function. I'm not sure what that dictionary generator
  is supposed to be, but this also improves greppability.

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/sh.1:
- Fix misleading "I/O redirection arguments" term. I/O redirections
  are not arguments at all; no argument parser ever sees them.

src/cmd/ksh93/tests/io.sh:
- Test both conditions that should make 'redirect' produce an
  "incorrect syntax" error.
- Test that any redirections are not executed if erroneous
  non-redirection arguments exist.

src/cmd/ksh93/tests/builtins.sh:
- "... should show usage info on unrecognized options" test:
  Because 'redirect' now refuses to process redirections on error,
  the error message was not captured. The fix is to run the builtin
  in a braces block and add the redirection to the block.
2020-08-09 00:47:22 +01:00
Martijn Dekker
e805c7d9b1 Fix crash: do not list job if in 60 sec grace period (re: 33858689)
The crash in job_list() or job_unpost() could still occur after the
previous patch if a signal was being handled after $TMOUT was
exceeded and the 60-second grace period was entered.

It *should* work to add a general check for !sh_isstate(SH_GRACE).
We know that the SH_GRACE state is set immediately after printing
the 60 second grace period warning message:
https://github.com/ksh93/ksh/blob/9de65210/src/cmd/ksh93/sh/io.c#L1869-L1870
(and that the crashes occur upon re-evaluating the $PS1 prompt
after setting the SH_GRACE state). We know that the SH_GRACE state
is not turned off again until either the user enters a line:
https://github.com/ksh93/ksh/blob/9de65210/src/cmd/ksh93/sh/main.c#L474
or the shell times out after the grace period:
https://github.com/ksh93/ksh/blob/9de65210/src/cmd/ksh93/sh/io.c#L1861
The SH_GRACE state flag is not used or changed in any other context
(verified with grep -rn SH_GRACE src/cmd/ksh93). So, logically,
this should suffice to make sure the crash stays gone.

src/cmd/ksh93/sh/jobs.c: job_reap():
- Do not list jobs when the SH_GRACE state (the 60 second timeout
  grace period after TMOUT was exceeded) is active.
- Keep the previous check for job control just to be sure, and
  because it makes sense.

Fixes: https://github.com/ksh93/ksh/issues/103 (again)
2020-08-07 21:09:01 +01:00
Johnothan King
9de65210c6
Add ${.sh.pid} as an alternative to $BASHPID (#109)
This variable is like Bash's $BASHPID, but in virtual subshells
it will retain its previous value as virtual subshells don't fork.
Both $BASHPID and ${.sh.pid} are different from $$ as the latter
is only set to the parent shell's process ID (i.e. it isn't set
to the process ID of the current subshell).

src/cmd/ksh93/include/defs.h:
- Add 'current_pid' for storing the current process ID at a valid
  memory address.
- Change 'ppid' from 'int32_t' to 'pid_t', as the return value from
  'getppid' is of the 'pid_t' data type.

src/cmd/ksh93/data/variables.c,
src/cmd/ksh93/include/variables.h,
src/cmd/ksh93/sh/init.c,
src/cmd/ksh93/sh/xec.c:
- Add the ${.sh.pid} variable as an alternative to $BASHPID.
  The process ID is stored in a struct before ${.sh.pid} is set
  as environment variables are pointers that must point to a
  valid memory address. ${.sh.pid} is updated by the _sh_fork()
  function, which is called when ksh forks a new process with
  sh_fork() or sh_ntfork().

src/cmd/ksh93/tests/variables.sh:
- Add ${.sh.pid} to the list of special variables and add three
  regression tests for ${.sh.pid}.

src/cmd/ksh93/tests/subshell.sh:
- Update the PATH forking regression test to use ${.sh.pid} and
  remove the TODO note.
2020-08-07 02:53:25 +01:00
Johnothan King
f9fdbfc9e9
Fix a large number of typos and other problems (#110)
Most of these fixes are for typos and extra whitespace at the
end of lines. These are the notable changes:
- Fixed a compatibility issue with how asterisks are displayed
  using certain fonts. Bug report: https://github.com/att/ast/issues/764
- Fixed a bug in the man page that caused searches for the '|'
  character to fail. Bug report: https://github.com/att/ast/issues/871
- Removed a duplicate description of 'set -B' from the man
  page. Bug report: https://github.com/att/ast/issues/789
- Added documentation for options missing from the ksh man
  page (applies to 'hist -N', 'sleep -s', 'whence -q' and
  many of ulimit's options). Bug reports:
  https://github.com/att/ast/issues/948
  https://github.com/att/ast/issues/503#issuecomment-386649715
  https://github.com/att/ast/issues/507#issuecomment-507924608
- Applied the following ksh2020 documentation fixes:
  https://github.com/att/ast/pull/351
  https://github.com/att/ast/pull/352
- Fixed a minor GCC -Wformat warning in procopen.c by changing
  a sentinel to NULL.
2020-08-07 00:50:11 +01:00
Martijn Dekker
338586896d Fix crash: do not list jobs if there is no job control
This bug caused an undefined state, which sometimes crashed the
shell in job_list() or job_unpost(), if $PS1 contains a command
substitution running an external command and the '-b'/'-o notify'
shell option is active. So far the only known way to trigger the
crash is by letting $TMOUT time out the interactive shell. See
https://github.com/ksh93/ksh/issues/103 for details.

src/cmd/ksh93/sh/jobs.c: job_reap():
- The check for the SH_NOTIFY option and the SH_TTYWAIT state
  before listing jobs was insufficient. Job control is disabled in
  command substitutions, so also check that job control is active
  before listing jobs.

src/cmd/ksh93/sh.1:
- Fix TMOUT documentation. The 'read' command in fact only times
  out when reading from a terminal, just like 'select'. Also
  document the extra 60 second grace period when an interactive
  shell prompt reads from a terminal.

Fixes: https://github.com/ksh93/ksh/issues/103
2020-08-06 22:46:02 +01:00
Martijn Dekker
ac8991e525 Fix shellquoting of invalid multibyte char (re: f9d28935, 8c7c60ec)
This commit fixes two bugs in the generation of $'...' shellquoted
strings:
1. A bug introduced in f9d28935. In UTF-8 locales, a byte that is
   invalid in UTF-8, e.g. hex byte 86, would be shellquoted as
   \u[86], which is not the same as the correct quoting, \x86.
2. A bug inherited from 93u+. Single bytes (e.g. hex 11) were
   always quoted as \x11 and not \x[11], even if a subsequent
   character was a hexadecimal digit. However, the parser reads
   past two hexadecimal digits, so we got:
	$ printf '%q\n' $'\x[11]1'
	$'\x111'
	$ printf $'\x111' | od -t x1
	0000000    c4  91
	0000002
   After the bug fix, this works correctly:
	$ printf '%q\n' $'\x[11]1'
	$'\x[11]1'
	$ printf $'\x[11]1' | od -t x1
	0000000    11  31
	0000002

src/cmd/ksh93/sh/string.c: sh_fmtq():
- Make the multibyte code for $'...' more readable, eliminating the
  'isbyte' flag.
- When in a multibyte locale, make sure to shellquote both invalid
  multibyte characters and unprintable ASCII characters as
  hexadecimal bytes (\xNN). This reinstates 93u+ behaviour.
- When quoting bytes, use isxdigit(3) to determine if the next
  character is a hex digit, and if so, protect the quoted byte with
  square brackets.
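
The bracket rule can be sketched in C like this. The function name
quote_byte_sketch and its buffer interface are hypothetical; sh_fmtq()
itself writes to the shell's own output stream and handles many more
cases.

```c
#include <ctype.h>
#include <stdio.h>

/*
 * Write the $'...' escape for byte c into buf. If the character that
 * follows (next) is a hex digit, use the bracketed form so the parser
 * cannot absorb it into the escape.
 */
static void quote_byte_sketch(unsigned char c, unsigned char next,
                              char *buf, size_t bufsize)
{
    if (isxdigit(next))
        snprintf(buf, bufsize, "\\x[%02x]", (unsigned)c);
    else
        snprintf(buf, bufsize, "\\x%02x", (unsigned)c);
}
```

For byte 0x11 followed by '1' this yields \x[11], and for the same byte
followed by 'z' it yields the shorter \x11.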

src/cmd/ksh93/tests/quoting2.sh:
- Move the 'printf %q' shellquoting regression tests here from
  builtins.sh; they test the shellquoting algorithm, not so much
  the printf builtin itself.
- Add regression tests for these bugs.
2020-08-05 18:22:22 +01:00
Johnothan King
e53177abca
Fix unset method in multidimensional arrays (#105)
A segfault happens when an array with an unset method
is turned into a multidimensional array. Reproducer:
function foo {
    typeset -a a
    a.unset() {
        print unset
    }
    a[3][6][11][20]=7
}
foo

src/cmd/ksh93/sh/nvdisc:
- Fix the multidimensional array unset method crash by
  checking if np->nvenv is an array, since multidimensional
  arrays need to be handled as arrays. This bugfix was
  backported from ksh93v- 2013-10-10-alpha.

src/cmd/ksh93/tests/arrays2.sh:
- Add the reproducer as a regression test for the crash
  with multidimensional arrays.

Bug report on the old mailing list:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01195.html
2020-08-05 18:14:30 +01:00
Johnothan King
23f2e23385
Over-shifting in a POSIX function should cause scripts to exit (#106)
The required longjmp used to terminate scripts was not being run
when over-shifting in a POSIX function with a redirection. This
caused scripts to continue after an error in the shift builtin,
which is incorrect since shift is a special builtin. The
interpreter is sent into an indeterminate state that causes
undefined behavior as well:
$ cat reproducer.ksh
some_func() {
   shift 10
}

for i in a b c d e f; do
  echo "read $i"
  [ "$i" != "c" ] && continue
  some_func 2>&1
  echo "$i = c"
done
$ ksh ./reproducer.ksh
read a
read b
read c
/tmp/k[2]: shift: 10: bad number
c = c
read d
/tmp/k[2]: shift: 10: bad number
d = c
read e
/tmp/k[2]: shift: 10: bad number
e = c
read f
/tmp/k[2]: shift: 10: bad number
f = c

src/cmd/ksh93/sh/xec.c: sh_exec():
- Do the necessary longjmp needed to terminate the script after
  over-shifting in a POSIX function when the function call has a
  redirection.

src/cmd/ksh93/tests/functions.sh:
- Add the over-shifting regression test from ksh93v- 2013-10-10-alpha.

Bug report and fix on the old mailing list:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg00732.html
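
Scripts that still have to run on an unpatched ksh can avoid the
problem by comparing the shift count against $# before shifting.
A minimal sketch; drop_args is a hypothetical helper and not part
of ksh:

```sh
# drop_args N ARG...: discard the first N arguments and print how many remain.
# Returns 1 instead of over-shifting when fewer than N arguments were given.
drop_args() {
    n=$1
    shift
    if [ "$n" -gt "$#" ]; then
        echo "drop_args: cannot shift $n, only $# argument(s)" >&2
        return 1
    fi
    shift "$n"
    echo "$#"
}
```

Because the check happens before 'shift' runs, the special builtin
never sees an out-of-range count.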
2020-08-05 18:06:16 +01:00
Marc Wilson
4144f404ae
Fix expansion of multibyte character after $1 - $9, $?, etc (#102)
A multibyte character immediately following an expansion of a
single-character name, e.g. $1 through $9, $?, $-, etc. was
corrupted when in a UTF-8 locale, e.g.:

    $ set -- foo; echo "$1テスト"
    foo?スト

Prior discussion:
https://www.mail-archive.com/ast-users@lists.research.att.com/msg01060.html
https://bugzilla.redhat.com/show_bug.cgi?id=1256495

src/cmd/ksh93/sh/macro.c:
- Apply a Red Hat patch by Paulo Andrade that avoids calling
  fcmbget() if backtracking more than one byte might be required.

src/cmd/ksh93/tests/basic.sh:
- Test "テスト" following expansion of "$1", "$?" and "$#".

Co-authored-by: Martijn Dekker <martijn@inlv.org>
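
A check along the lines of the reproducer could look like the
sketch below. The function name is hypothetical and this is not the
test that was actually added to the test suite:

```sh
# Expand single-character parameters directly before a multibyte
# character and compare the result with the expected literal string.
check_multibyte_after_param() {
    set -- foo
    [ "$1テスト" = "fooテスト" ] || return 1
    [ "$#テスト" = "1テスト" ] || return 1
}

if check_multibyte_after_param; then
    echo "expansion intact"
else
    echo "expansion corrupted"
fi
```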
2020-08-01 01:12:45 +01:00
Johnothan King
02a14ff9b7
Fix creation of extra associative array element '0' (#101)
Multidimensional associative arrays are created with an extra array
member named '0', which is set to no value. Reproducer:

$ typeset -A foo
$ typeset -A foo[bar]
$ typeset -p foo
typeset -A foo=([bar]=([0]='') )

The bugfix prevents nv_setarray from creating the extra '[0]' member
when an associative array is empty. This bug was discussed on the old
mailing list:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01574.html

src/cmd/ksh93/sh/array.c:
- Do not allow the creation of an extra array member when an array
  is empty.

src/cmd/ksh93/tests/arrays.sh:
- Add a regression test for creating multidimensional associative
  arrays, but use the output from 'typeset -p' instead of fgrep.
2020-07-31 17:32:09 +01:00
Martijn Dekker
70f6d758c0 Fix blocked signals after fork(2)ing external command in subshell
When the classic fork/exec mechanism was used (via sh_fork()) to
run an external command from within a non-forking subshell, SIGINT
was blocked until that subshell was exited. If a subsequent loop
was run in the subshell, it became uninterruptible, e.g.:

   $ arch/*/bin/ksh -c '(/usr/bin/true; while :; do :; done); exit'
   ^C^C^C^C^C

src/cmd/ksh93/sh/xec.c:
- sh_fork() did not reset the savesig variable in the parent part
  of the fork when running in a virtual subshell. This had the
  effect of delaying signal handling until exiting the subshell.
  There is no reason for that subshell check that I can discern, so
  this removes it.
      I've verified that this causes no regression test failures
  even when ksh is compiled with -DSHOPT_SPAWN=0 which means the
  classic fork/exec mechanism is always used.

Fixes: https://github.com/ksh93/ksh/issues/86
2020-07-30 01:46:00 +01:00
Martijn Dekker
a2f13c19f2 Fix typeset attributes -a, -A, -l, -u leaking out of subshells
If an array or upper/lowercase variable was declared with a null
initial value within a virtual/non-forked subshell, like:
	( typeset -a foo; ... )
	( typeset -A foo; ... )
	( typeset -l foo; ... )
	( typeset -u foo; ... )
then the type declaration leaked out of the subshell into the
parent shell environment, though without any values that may
subsequently have been assigned.

src/cmd/ksh93/bltins/typeset.c: setall():
- When deciding whether to create a virtual subshell scope for a
  variable, use sh_assignok(), which was actually designed for the
  purpose, instead of _nv_unset(). This allows getting rid of a
  tangled mess of special-casing that never worked quite right.

src/cmd/ksh93/tests/arrays.sh:
- Add regression tests checking that array declarations don't leak
  out of virtual subshells.

src/cmd/ksh93/tests/attributes.sh:
- Add regression tests for combining the 'export' and 'readonly'
  attributes with every other possible typeset attribute on unset
  variables. This also includes a subshell leak test for each one.

Fixes: https://github.com/ksh93/ksh/issues/88
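
The leak can be probed by declaring an attribute in a subshell and
then asking the parent shell about the variable. A minimal sketch,
assuming a shell that provides 'typeset'; leak_report and
leak_probe are hypothetical names:

```sh
# Declare an attribute on an unset variable inside a subshell, then
# print whatever the parent shell knows about that variable.
# With the fix in place, nothing is printed.
leak_report() {
    ( typeset -l leak_probe ) 2>/dev/null
    typeset -p leak_probe 2>/dev/null
}

if [ -z "$(leak_report)" ]; then
    echo "no leak"
else
    echo "attribute leaked out of the subshell"
fi
```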
2020-07-26 02:41:12 +01:00
Johnothan King
1bc2c74c74
Fix how unrecognized options are handled in 'sleep' and 'suspend' (#93)
When a builtin is given an unrecognized option, the usage information
for that builtin should be shown as 'Usage: builtin-name options'. The
sleep and suspend builtins were an exception to this. 'suspend' would
not show usage information and sleep wouldn't exit on error:

$ suspend -e
/usr/bin/ksh: suspend: -e: unknown option
$ time sleep -e 1
sleep: -e: unknown option

real	0m1.00s
user	0m0.00s
sys	0m0.00s

src/cmd/ksh93/bltins/sleep.c:
- Show usage information and exit when sleep is given an unknown
  option. This bugfix was backported from ksh2020: https://github.com/att/ast/pull/1024

src/cmd/ksh93/bltins/trap.c:
- Use the normal method of parsing options with optget to fix the
  suspend builtin's test failure.

src/cmd/ksh93/tests/builtins.sh:
- Add the ksh2020 regression test for getting the usage information
  of each builtin. Enable all /opt/ast/bin builtins in a subshell
  since those should be tested as well (aside from getconf and uname
  because those builtins fallback to the real commands on error).
2020-07-26 02:18:49 +01:00
Johnothan King
8b5f11dcd7
Add support for multibyte characters to $IFS (#92)

This commit fixes BUG_MULTIBIFS, which had two bug reports in the ksh2020 branch.

src/cmd/ksh93/sh/macro.c:
- Backport Eric Scrivner's fix for multibyte IFS characters (slightly modified
  for compatibility with C89). Explanation from https://github.com/att/ast/pull/737:

  Previously, the varsub method used for the macro expansion of $param, ${param},
  and ${param op word} would incorrectly expand the internal field separator (IFS)
  if it was a multibyte character. This was due to truncation based on the
  incorrect assumption that the IFS would never be larger than a single byte.

  This change fixes this issue by carefully tracking the number of bytes that
  should be persisted in the IFS case and ensuring that all bytes are written
  during expansion and substitution.

  Bug report: https://github.com/att/ast/issues/13

- Fixed another bug that caused multibyte characters with the same initial byte
  to be treated as the same character by the IFS. This bug was occurring because
  the first byte of a multibyte character wasn't being written to the stack when
  the IFS delimiter had the same initial byte:

  $ IFS=£
  $ v='§'
  $ set -- $v
  $ v="${1-}"
  $ echo "$v" | hd # The first byte should be c2, but it isn't due to the bug
  00000000  a7 0a                                             |..|
  00000002

  Bug report: https://github.com/att/ast/issues/1372

src/cmd/ksh93/tests/variables.sh:
- Add (reworked) regression tests from ksh2020 for the multibyte IFS bugs.
- Add a regression test for att/ast#1372 based on the reproducer.
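
The second bug depends on '£' and '§' sharing their first byte in
UTF-8. The sketch below prints the bytes of each character so the
shared byte is visible; hex_bytes is a hypothetical helper and the
script is assumed to be saved as UTF-8:

```sh
# Print the bytes of a string as lowercase hex, without separators.
hex_bytes() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \t\n'
}

hex_bytes '£'; echo    # c2a3
hex_bytes '§'; echo    # c2a7
```

Any comparison that stops after the first byte cannot tell these
two characters apart.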
2020-07-25 19:46:11 +01:00
Johnothan King
8c16f38a88
Fix an infinite loop related to $_ if ksh is /bin/sh (#90)
The following explanation is mostly taken from Tomas Klacko's report on
the old mailing list (which also contains a C program reproducer) [*]:

1. When ksh starts a binary, it sets its environment variable "_"
   to "*number*/path/to/binary", where "number" is the pid of the
   ksh process.

2. The binary forks and the child executes a suid root shell script
   which begins with #!/bin/sh. For this bug to occur, ksh must be /bin/sh.

3. The ksh process interpreting the suid shell script leaves the "_"
   variable as not set (nv_getval(L_ARGNOD) returns NULL) because
   the "number" from step 1 is not the pid of its parent process.

4-5. Because "_" is not set and the script is suid root, an infinite
   loop occurs: when the SHELL environment variable contains
   "/bin/sh", pathshell() returns "/bin/sh", so /bin/sh /dev/fd/3
   keeps executing /bin/sh /dev/fd/3.

src/cmd/ksh93/sh/init.c: get_lastarg():
- Disable the check for whether the "number" refers to the process
  id of the parent process.

src/cmd/ksh93/sh/main.c: sh_main():
- Prevent an infinite loop when '$_' is not passed in from the environment.

Solaris applies this bugfix to their version of ksh:
https://github.com/oracle/solaris-userland/blob/master/components/ksh93/patches/190-17432413.patch

[*]: https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01680.html
2020-07-24 01:20:26 +01:00
Johnothan King
6e515f1d45
Fix command substitutions run on the same line as a here-doc (#91)
When a command substitution is run on the same line as a here-document,
a syntax error occurs due to a regression introduced in ksh93u+ 2011-04-15:

true << EOF; true $(true)
EOF
syntax error at line 1: `<<EOF' here-document not contained within command substitution

The regression is caused by an error check that was added to make
the following script cause a syntax error (because the here-document
isn't completed inside of the command substitution):

$(true << EOF)
EOF

src/cmd/ksh93/sh/lex.c:
- Only throw an error when a here-document in a command substitution
  isn't completed inside of the command substitution.

src/cmd/ksh93/tests/heredoc.sh:
- Add a regression test for running a command substitution on the
  same line as a here-document.
- Add a missed regression test for using here-documents in command
  substitutions. This is the original bug that was fixed in ksh93u+
  2011-04-15 (it is why the error message was added), but a regression
  test for here-documents in command substitutions wasn't added in
  that version.

This bugfix was backported from ksh93v- 2013-10-10-alpha.
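
The valid case can be exercised with a sketch like the one below.
It is not the regression test from heredoc.sh, and the variable
names are hypothetical:

```sh
# A here-document and a command substitution share the first line.
# The here-document body follows on the next lines, outside the
# command substitution, so no syntax error should be reported.
heredoc_script='cat << EOF; echo "$(echo two)"
one
EOF
'
heredoc_result=$(eval "$heredoc_script")
printf '%s\n' "$heredoc_result"
```

On a shell without the regression this prints "one" followed by
"two" on separate lines.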
2020-07-24 00:03:57 +01:00
Martijn Dekker
f207cd5787 Fix race conditions running external commands with job control on
When ksh is compiled with SHOPT_SPAWN (the default), which uses
posix_spawn(3) or vfork(2) (via sh_ntfork()) to launch external
commands, at least two race conditions occur when launching
external commands while job control is active. See:
https://bugs.launchpad.net/ubuntu/+source/ksh/+bug/1887863/comments/3
https://www.mail-archive.com/ast-developers@research.att.com/msg00717.html

The basic issue is that this performance optimisation is
incompatible with job control, because it uses a spawning mechanism
that doesn't copy the parent process' memory pages into the child
process, therefore no state that involves memory can be set before
exec-ing the external program. This makes it impossible to
correctly set the terminal's process group ID in the child process,
something that is essential for job control to work.

src/cmd/ksh93/sh/xec.c:
- Use sh_fork() instead of sh_ntfork() if job control is active.
  This uses fork(2), which is 30%-ish slower on most systems, but
  allows for correctly setting the terminal process group.

src/cmd/ksh93/tests/basic.sh:
- Add regression test for the race condition reported in #79.

src/cmd/INIT/cc.darwin:
- Remove hardcoded flag to disable SHOPT_SPAWN on the Mac.
  It should be safe to use now.

Fixes https://github.com/ksh93/ksh/issues/79
2020-07-22 13:45:33 +01:00
Martijn Dekker
db72f41f4b Fix subshell file descriptor leak
A file descriptor (at least 3, can't reproduce for 4 and up) opened
with 'exec' or 'redirect' in a virtual/non-forked subshell survived
that subshell after exiting it:

    $ ksh -c '(redirect 3>&1); echo bug >&3'
    bug

src/cmd/ksh93/sh/io.c:
- Apply a patch from OpenSUSE (ksh93-redirectleak.dif). Source:
  https://build.opensuse.org/package/show/openSUSE:Leap:42.3:Update/ksh

src/cmd/ksh93/tests/io.sh:
- Add regression test.

Thanks to Marc Wilson for flagging this up.
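
A portable variant of the reproducer, using 'exec' instead of the
ksh-specific 'redirect', can be sketched as follows. The function
name is hypothetical and this is not the test added to io.sh:

```sh
fd_leak_check() {
    exec 3>&-          # make sure descriptor 3 starts out closed
    ( exec 3>&1 )      # open it inside a subshell only
    # Writing to descriptor 3 must fail in the parent shell.
    if { echo bug >&3; } 2>/dev/null; then
        echo leaked
    else
        echo closed
    fi
}

fd_leak_check
```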
2020-07-21 04:12:40 +01:00
Martijn Dekker
bc8b36faba whence -a/type -a: report both function and built-in by same name
'whence -a' is documented to list all possible interpretations of a
command, but failed to list a built-in command if a shell function
by the same name exists or is marked undefined using 'autoload'.

src/cmd/ksh93/bltins/whence.c: whence():
- Refactor and separate the code for reporting functions and
  built-in commands so that both can be reported for one name.

src/cmd/ksh93/data/builtins.c: sh_optwhence[]:
- Correct 'whence --man' to document that:
  * 'type' is equivalent to 'whence -v'
  * '-a' output is like '-v'

src/cmd/ksh93/tests/builtins.sh:
- Test 'whence -a' with these combinations:
  * a function, built-in and external command
  * an undefined/autoload function, built-in and external command

Fixes https://github.com/ksh93/ksh/issues/83
2020-07-20 21:16:24 +01:00
Johnothan King
bd88cc7f4f
Fix two crashes related to kshdb (#82)
This commit fixes two different crashes related to kshdb:
- When redirect is given an invalid file descriptor, a segfault
  no longer occurs. Reproducer:
  $ ksh -c 'redirect 9>&200000000000'

- Fix a crash due to free(3) being used on an invalid pointer.
  This can be reproduced with kshdb (commands from att/ast#582):
  $ git clone https://github.com/rocky/kshdb.git
  $ cd kshdb
  $ ksh autogen.sh
  $ echo "print hi there" > $HOME/.kshdbrc
  $ ./kshdb -L . test/example/dbg-test1.sh

src/cmd/ksh93/bltins/misc.c: b_dot_cmd():
- The string pointed to by shp->st.filename must be able to be
  freed from memory with free(3), so duplicate the string with
  strdup(3).

src/cmd/ksh93/sh/io.c: sh_redirect():
- Show an error message when a file descriptor is invalid to
  fix a memory fault.
2020-07-19 23:42:12 +01:00
Johnothan King
2db9953ae0
Fix three bugs in the sleep builtin (#77)
This commit backports the main changes to sh_delay from ksh93v-
and ksh2020, which fixes the following bugs:

- Microsecond amounts of less than one millisecond are no longer
  ignored. The following loop will now take a minimum of one
  second to complete:
  for ((i = 0; i != 10000; i++)) do
    sleep PT100U
  done

- 'sleep 30' no longer adds an extra 30 milliseconds to the total
  amount of time to sleep. This bug is hard to notice since 30
  milliseconds can be considered within the margin of error. The
  only reason longer delays weren't affected is that the old code
  masked the bug when the interval was greater than 30 seconds:
  else if(n > 30)
  {
      sleep(n);
      t -= n;
  }
  This workaround caused 'sleep -s' to break with intervals greater
  than 30 seconds, so an actual fix is used instead.

- 'sleep -s' now functions correctly with intervals of more than
  30 seconds as the new code doesn't need the old workaround. This
  is done by handling '-s' in sh_delay.
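
The one-second minimum for the loop in the first item follows from
simple arithmetic, assuming PT100U means 100 microseconds as the
item describes. A worked sketch with hypothetical variable names:

```sh
iterations=10000
sleep_us=100                           # one 'sleep PT100U' call
total_us=$((iterations * sleep_us))    # total requested sleep time
echo "$total_us microseconds = $((total_us / 1000000)) second(s)"
```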

src/cmd/ksh93/bltins/sleep.c:
- Remove the replacement for sleep(3) from the sleep builtin.
- Replace the old sh_delay function with the newer one from ksh2020.
  The new function uses tvsleep, which uses nanosleep(3) internally.

src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/edit/edit.c,
src/cmd/ksh93/sh/jobs.c,
src/cmd/ksh93/sh/xec.c,
src/cmd/ksh93/shell.3:
- Update sh_delay documentation and usage since the function now
  requires two arguments.

src/cmd/ksh93/tests/builtins.sh:
- Add a regression test for 'sleep -s' when the interval is greater
  than 30 seconds. The other bugs can't be tested for in a feasible
  manner across all systems:
  https://github.com/ksh93/ksh/pull/72#issuecomment-657215616
2020-07-17 05:00:28 +01:00
Johnothan King
ea5b25b93a
Fix some formatting errors, typos and other problems (#78)
Some notes:
- Removed a TODO note that was fixed in commit 43d9fbac.
- Removed a duplicate note about the '%l' time format in the changelog.
- Applied the following documentation fixes from Terrence J. Doyle:
  - https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01852.html
  - https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01856.html
- Fixed strange grammar in one of the error messages.
- Added missing options for rksh to the synopsis section.
- Applied a formatting fix from ksh93v- to the man page.
- Replaced a C99 line comment in src/lib/libast/comp/realpath.c with a
  proper comment that is valid in C89.
- Prioritized UTC over GMT in the documentation (missed by commit c9634e90).
- Added some extra information for 'ksh -R file' to the man page. This patch
  is from Red Hat: https://git.centos.org/rpms/ksh/blob/c8/f/SOURCES/ksh-20080202-manfix.patch
2020-07-16 22:27:00 +01:00
Johnothan King
03224ae3af
Make the 'history' and 'r' commands builtins (#76)
With this change no more preset aliases exist, so the preset alias
tables can be safely removed. All ksh commands can now be used
without 'unalias -a' removing them, even in interactive shells.
Additionally, the history and r commands are no longer limited to
being used in interactive shells.

src/cmd/ksh93/bltins/hist.c:
- Implement the history and r commands as builtins. Also guarantee
  lflag is set to one by avoiding 'lflag++'.

src/cmd/ksh93/Makefile,
src/cmd/ksh93/Mamfile,
src/cmd/ksh93/sh/main.c,
src/cmd/ksh93/sh/init.c,
src/cmd/ksh93/data/aliases.c:
- Remove the table of predefined aliases because the last few have
  been removed. During init the alias tree is now initialized the
  same way as the function tree.

src/cmd/ksh93/bltins/typeset.c:
- Remove the bugfix for unsetting predefined aliases because it is
  now a no-op. Aliases are no longer able to have the NV_NOFREE
  attribute.

src/cmd/ksh93/tests/alias.sh:
- Remove the regression test for unsetting predefined aliases since
  those no longer exist.

src/cmd/ksh93/data/builtins.c:
- Update sh_opthist[] for 'hist --man', etc.

src/cmd/ksh93/sh.1:
- Remove the list of preset aliases since those no longer exist.
- Document history and r as builtins instead of preset aliases.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2020-07-16 18:56:49 +01:00
Martijn Dekker
17f81ebedb Load 'r' and 'history' default aliases on interactive only
These two default aliases are useful on interactive shells. In
scripts, they interfere with possible function or command names.

As of this commit, these final two default aliases are only loaded
for interactive shells, leaving zero default aliases for scripts.
This completes the project to get rid of misguided default aliases.

src/cmd/ksh93/include/shtable.h,
src/cmd/ksh93/data/aliases.c,
src/cmd/ksh93/sh/init.c:
- Add empty alias table shtab_noaliases[] for scripts.
- Rename inittree() to sh_inittree() and make it external.
- nv_init(), sh_reinit(): Initialise empty alias tree for scripts.

src/cmd/ksh93/sh/main.c: sh_main():
- If interactive, reinitialise alias tree for interactive shells.

src/cmd/ksh93/tests/alias.sh:
- To test default alias removal, launch shell with -i.
2020-07-16 06:44:05 +01:00
Johnothan King
01145a48dd
Handle the escape sequence for the End key (#75)
Many terminals (xterm being one example) give the Home and End keys
the escape sequences '^[[H' and '^[[F'. The first sequence is
handled in both editing modes by moving the cursor to start of
line, but ksh ignored the second sequence.

src/cmd/ksh93/edit/emacs.c,
src/cmd/ksh93/edit/vi.c:
- Add case labels for '^[[F' so that in both editing modes the End
  key moves the cursor to the end of the line.
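
The editors handle these sequences in C; the shell sketch below
only illustrates the mapping from escape sequence to action.
key_action and the action names are hypothetical:

```sh
ESC=$(printf '\033')

# Map a terminal escape sequence to an editing action.
key_action() {
    case $1 in
    "${ESC}[H") echo "beginning-of-line" ;;   # Home
    "${ESC}[F") echo "end-of-line" ;;         # End (the newly added case)
    *)          echo "unhandled" ;;
    esac
}

key_action "${ESC}[F"
```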
2020-07-15 23:38:44 +01:00
Martijn Dekker
1fbbeaa19d Convert default typeset aliases to regular builtins
This converts the 'autoload', 'compound', 'float', 'functions',
'integer' and 'nameref' default aliases into regular built-in
commands, so that 'unalias -a' does not remove them. Shell
functions can now use these names, which improves compatibility
with POSIX shell scripts.

src/cmd/ksh93/data/aliases.c:
- Remove default typeset aliases.

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/include/builtins.h:
- Add corresponding built-in command declarations. Typeset-style
  commands are now defined by a pointer range, SYSTYPESET ..
  SYSTYPESET_END. A couple need their own IDs (SYSCOMPOUND,
  SYSNAMEREF) for special-casing in sh/xec.c.
- Update 'typeset --man'.

src/cmd/ksh93/bltins/typeset.c: b_typeset():
- Recognise the new builtin commands by argv[0]. Implement them by
  inserting the corresponding 'typeset' options into the argument
  list before parsing options. This may seem like a bit of a hack,
  but it is simpler, shorter, more future-proof and less
  error-prone than manually copying and adapting all the complex
  flaggery from the option parsing loop.

src/cmd/ksh93/sh/parse.c,
src/cmd/ksh93/sh/xec.c:
- Recognise typeset-style commands by SYSTYPESET .. SYSTYPESET_END
  pointer range.
- Special-case 'compound' (SYSCOMPOUND) and 'nameref' (SYSNAMEREF)
  along with recognising the corresponding 'typeset' options.

src/cmd/ksh93/sh.1:
- Update to document the new built-ins.
- Since not all declaration commands are special built-ins now,
  identify declaration commands using a double-dagger "\(dd"
  character (which renders as '=' in ASCII) and disassociate their
  definition from that of special built-ins.

src/cmd/ksh93/tests/variables.sh:
- Adapt a regression test as there is no more 'integer' alias.
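
The argv[0] approach in b_typeset() amounts to a lookup from
command name to 'typeset' options. A shell sketch of that lookup
follows. The option strings are assumed to match the definitions
of the former default aliases; the commit message does not list
them, and typeset_options_for is a hypothetical name:

```sh
# Print the 'typeset' options implied by a typeset-style command name.
# printf is used instead of echo so that "-n" is printed literally.
typeset_options_for() {
    case $1 in
    autoload)  printf '%s\n' "-fu" ;;
    compound)  printf '%s\n' "-C" ;;
    float)     printf '%s\n' "-lE" ;;
    functions) printf '%s\n' "-f" ;;
    integer)   printf '%s\n' "-li" ;;
    nameref)   printf '%s\n' "-n" ;;
    *)         return 1 ;;
    esac
}

typeset_options_for integer
```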
2020-07-15 20:54:06 +01:00
Martijn Dekker
b1a4131123 Millisecond precision for 'times' builtin (re: 65d363fd, 5c677a4c)
Now that we have an iffe feature test for getrusage(3), introduced
in 70fc1da7, the millisecond-precision 'times' command from the
last version of ksh2020 can easily be backported.

src/cmd/ksh93/bltins/misc.c:
- Incorporate ksh2020 'times' command, with a couple of tweaks:
  * Use locale's radix point instead of '.'.
  * Pad seconds with initial zero if < 10.

src/cmd/ksh93/data/builtins.c:
- Update version date for 'times --man'.

src/cmd/ksh93/tests/builtins.sh:
- Update 'times' test for 3 digits after radix point.
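
The two tweaks concern formatting only. The sketch below formats a
millisecond count with zero-padded seconds and three digits after
the radix point. The exact output layout is an assumption,
format_ms is a hypothetical name, and the radix point is hardcoded
as '.' rather than taken from the locale:

```sh
# format_ms MILLISECONDS: print as <minutes>m<seconds>.<milliseconds>s
format_ms() {
    ms=$1
    printf '%dm%02d.%03ds\n' \
        "$((ms / 60000))" "$((ms / 1000 % 60))" "$((ms % 1000))"
}

format_ms 1234     # 0m01.234s
format_ms 65005    # 1m05.005s
```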
2020-07-15 04:22:45 +01:00
Johnothan King
70fc1da73e
Fix the max precision of the 'time' keyword (#72)
This commit backports the required fixes from ksh2020 for using
millisecond precision with the 'time' keyword. The bugfix refactors
a decent amount of code to rely on the BSD 'timeradd' and
'timersub' macros for calculating the total amount of time elapsed
(as these aren't standard, they are selectively implemented in an
iffe feature test for platforms without them). getrusage(3) is now
preferred since it usually has higher precision than times(3) (the
latter is used as a fallback).

There are three other fixes as well:

src/lib/libast/features/time:
- Test for getrusage with an iffe feature test rather than
  assume _sys_times == _lib_getrusage.

src/cmd/ksh93/sh/xec.c:
- A single percent at the end of a format specifier is now
  treated as a literal '%' (like in Bash).
- Zero-pad seconds if seconds < 10. This was already done for
  the times builtin in commit 5c677a4c, although it wasn't
  applied to the time keyword.
- Backport the ksh2020 bugfix for the time keyword by using
  timeradd and timersub with gettimeofday (which is used with
  a timeofday macro). Prefer getrusage when it is available.
- Allow compiling without the 'timeofday' ifdef for better
  portability.
  This is the order of priority for getting the elapsed time:
  1) getrusage (most precise)
  2) times + gettimeofday (best fallback)
  3) only times (doesn't support millisecond precision)
  This was tested by using debug '#undef' statements in xec.c.

src/cmd/ksh93/features/time:
- Implement feature tests for the 'timeradd' and 'timersub'
  macros.
- Do a feature test for getrusage like in the libast time test.

src/cmd/ksh93/tests/basic.sh:
- Add test for millisecond precision.
- Add test for handling of '%' at the end of a format specifier.
- Add test for locale-specific radix point.
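
The 'timersub' macro subtracts two (seconds, microseconds) pairs
and borrows from the seconds when the microsecond difference is
negative. A shell arithmetic sketch of that calculation; tv_sub is
a hypothetical name, not the macro itself:

```sh
# tv_sub A_SEC A_USEC B_SEC B_USEC: print "SEC USEC" for A - B.
tv_sub() {
    sec=$(($1 - $3))
    usec=$(($2 - $4))
    if [ "$usec" -lt 0 ]; then
        sec=$((sec - 1))              # borrow one second
        usec=$((usec + 1000000))
    fi
    echo "$sec $usec"
}

tv_sub 5 100000 3 900000    # 1 200000
```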
2020-07-14 22:48:04 +01:00
Johnothan King
fc655f1a26
Restore 'set -b'/'set -o notify' functionality (#74)
'set -b' had no effect; it should cause the shell to notify job
state changes immediately instead of waiting for the next prompt.

This fixes a regression that was introduced in ksh93t 2008-07-25.
The bugfix is from: https://github.com/att/ast/pull/1089

src/cmd/ksh93/sh/jobs.c:
- Save the tty wait state and avoid changing it if TTYWAIT was
  already on to avoid breaking 'set -b'.
  The last 'sh_offstate' is inside of an '#if' directive because it
  is only required when ksh is compiled with SHOPT_COSHELL enabled.

src/cmd/ksh93/tests/pty.sh:
- Add a regression test for 'set -b' in interactive shells.
2020-07-14 22:00:28 +01:00
Johnothan King
66c955bc8f
Fix a fork bomb when vi is run from a script and sent Ctrl-Z (#73)
This bug was reported on the old mailing list:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg00207.html

A fork bomb can occur when SIGTSTP is sent to the vi editor. Vi
must be launched from a script run with exec (tested with
BusyBox vi, nvi and vim):
$ cat /tmp/foo
vi /tmp/bar
echo end
$ ksh
$ chmod +x /tmp/foo
$ exec /tmp/foo
While in vi, send SIGTSTP using Ctrl-Z

src/cmd/ksh93/sh/fault.c:
- Only fork after Ctrl-Z if job control is available. The patch
  used checks 'job.jobcontrol' instead of 'SH_MONITOR':
  https://git.centos.org/rpms/ksh/blob/c8/f/SOURCES/ksh-20120801-forkbomb.patch
2020-07-13 19:10:23 +01:00
Martijn Dekker
778fd6ca2d Fix possible crash due to failure to update shell FD state
This applies ksh-20100621-fdstatus.patch from Red Hat. Not very
much information is available, so this one is more or less taken
on faith. But it seems to make sense on the face of it: calling
sh_fcntl() instead of fcntl(2) directly makes the shell update its
internal file descriptor state more frequently.

It claims to fix Red Hat bug 924440. The report is currently closed
to the public: https://bugzilla.redhat.com/show_bug.cgi?id=924440

However, Kamil Dudka at Red Hat writes:
https://github.com/ksh93/ksh/issues/67#issuecomment-656379993
| Yes, the summary of RHBZ#924440 is "crash in bestreclaim() after
| traversing a memory block with a very large size". We did not have
| any in house reproducer for the bug. The mentioned patch was
| provided and verified by a customer.

...and Marc Wilson dug up a Red Hat erratum containing this info:
https://download.rhn.redhat.com/errata/RHBA-2013-1599.html
| Previously, the ksh shell did not resize the file descriptor list
| every time it was necessary. This could lead to memory corruption
| when several file descriptors were used. As a consequence, ksh
| terminated unexpectedly. This updated version resizes the file
| descriptor list every time it is needed, and ksh no longer
| crashes in the described scenario. (BZ#924440)

No reproducer means no regression test can be added now.

src/cmd/ksh93/sh/io.c,
src/cmd/ksh93/sh/subshell.c,
src/cmd/ksh93/sh/xec.c:
- Change several fcntl(2) calls to sh_fcntl(). This function calls
  fcntl(2) and then updates the shell's file descriptor state.
2020-07-10 20:04:31 +01:00
Johnothan King
c4236cc295 Fix type names starting with lowercase 'a' (#69)
Type names that start with a lowercase 'a' cause an error when used:

$ typeset -T al=(typeset bar)
$ al foo=(bar=testset)
/usr/bin/ksh: al: : invalid variable name

The error occurs because when the parser checks for the alias
builtin (to set 'assignment' to two instead of one), only the first
letter of 'argp->argval' is checked (rather than the entire
string). This was fixed in ksh93v- by comparing argp->argval
against "alias", but in ksh93u+m the check can simply be removed
because it is only run when a builtin has the BLT_DCL flag. As of
04b9171, the alias builtin does not have that flag.

src/cmd/ksh93/sh/parse.c:
- Remove the bugged check for the alias builtin.

src/cmd/ksh93/tests/types.sh:
- Add a regression test for type names starting with a lowercase 'a'.
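
The difference between the first-letter check and a full string
comparison can be sketched in shell. Both function names are
hypothetical; the real check was C code in parse.c:

```sh
# Buggy variant: only the first letter is examined.
looks_like_alias_buggy() {
    case $1 in
    a*) return 0 ;;
    esac
    return 1
}

# Variant that compares the whole name, as the ksh93v- fix did.
looks_like_alias_fixed() {
    [ "$1" = alias ]
}

looks_like_alias_buggy al && echo "buggy check: 'al' is mistaken for alias"
looks_like_alias_fixed al || echo "full comparison: 'al' is not alias"
```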
2020-07-10 17:54:51 +01:00
Martijn Dekker
f9d28935bb Fix UTF-8 shellquoting for xtrace, printf %q, etc.
This fixes an annoying issue in the shell's quoting algorithm
(used for xtrace (set -x), printf %q, and other things) for UTF-8
locales, that caused it to encode perfectly printable UTF-8
characters unnecessarily and inconsistently. For example:

$ (set -x; : 'aeu aéu')
+ : $'aeu a\u[e9]u'
$ (set -x; : 'aéu aeu')
+ : 'aéu aeu'
$ (set -x; : '正常終了 aeu')
+ : '正常終了 aeu'
$ (set -x; : 'aeu 正常終了')
+ : $'aeu \u[6b63]\u[5e38]\u[7d42]\u[4e86]'

This issue was originally reported by lijo george in May 2017:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01958.html

src/cmd/ksh93/sh/string.c:
- Add is_invisible() function that returns true if a character is a
  Unicode invisible (non-graph) character, excluding ASCII space.
  Ref.: https://unicode.org/charts/PDF/U2000.pdf
- Use a fallback in is_invisible() if we cannot use the system's
  iswprint(3); this is the case for the ksh C.UTF-8 locale if the
  OS doesn't support that. Fall back to a hardcoded blacklist of
  invisible and control characters and put up with not encoding
  nonexistent characters into \u[xxxx] escapes.
  Ref.: https://unicode.org/charts/PDF/U2000.pdf
- When deciding whether to switch to $'...' quoting mode (state=2),
  use is_invisible() instead of testing for ASCII 0-127 range.
- In $'...' quoting mode, use is_invisible() to decide whether to
  encode wide characters into \u[xxxx] escapes.

src/cmd/ksh93/tests/builtins.sh:
- Add regression tests for shellquoting Arabic, Japanese and Latin
  UTF-8 characters, to be run only in a UTF-8 locale. The Arabic
  sample text[*] contains a couple of direction markers that are
  expected to be encoded into \u[xxxx] escapes.

[*] source: https://r12a.github.io/scripts/tutorial/summaries/arabic
2020-07-10 05:55:11 +01:00
Martijn Dekker
588a1ff7ca Fix spurious warning output in KIA (-R) database file
The ksh -R option creates a cross-reference database that can be
parsed with a "C Query Language" (CQL) tool.
See cql-1994.pdf at: http://gsf.cococlyde.org/files

The -R option puts ksh in noexec mode as it parses the script, and
this can produce warnings as the syntax is parsed. The bug is that
these warnings can end up in the database file, corrupting it.

This applies a fix from Paulo Andrade, via Siteshwar Vashisht:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01952.html

src/cmd/ksh93/sh/parse.c:
- Terminate names with a zero character when writing database
  output.

A regression test is not very feasible because the majority of the
database output consists of cryptic IDs/hashes that vary depending
on the session and/or system and possibly other things.
2020-07-09 23:18:41 +01:00
Johnothan King
6930666234
Fix a syntax error when ((...)) is combined with redirections (#68)
This bugfix was backported from ksh93v- 2013-10-10-alpha.

src/cmd/ksh93/sh/parse.c: item():
- The done label is placed after the 'inout' call for handling I/O
  redirections. This causes the command below to produce a syntax
  error because the '>' is not handled as a redirection operator
  after 'goto done':
  $ ((1+2)) > /dev/null
  /usr/bin/ksh: syntax error: `>' unexpected
  Moving the done label fixes the syntax error as 'inout' is now
  called to handle the redirection operator.

src/cmd/ksh93/tests/arith.sh:
- Add a simple regression test.
2020-07-09 22:12:04 +01:00
Martijn Dekker
361fe1fcc3 Fix hash table memory leak when restoring PATH
There is a bug in path_alias() that may cause a memory leak when
clearing the hash table while setting/restoring PATH.

This applies a fix from Siteshwar Vashisht:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01945.html

Note that, contrary to Siteshwar's analysis linked above, this bug
has nothing directly to do with subshells, forked or otherwise; it
can also be reproduced by temporarily setting PATH for a command,
for example, 'PATH=/dev/null true', and then doing a PATH search.

Modified analysis:
ksh maintains the value of PATH as a linked list. When a local
scope for PATH is created (e.g. in a virtual subshell or when doing
something like PATH=/foo/bar command ...), ksh duplicates PATH by
increasing the refcount for every element in the linked list by
calling the path_dup() and path_alias() functions. However, when
the state of PATH is restored, this refcount is not decreased. Next
time when PATH is reset to a new value, ksh calls the path_delete()
function to delete the linked list that stored the older path. But
the path_delete() function does not free elements whose refcount is
greater than 1, causing a memory leak.
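
The reference counting described above can be sketched in isolation.
This is a minimal illustration, not the code in path.c: path_node,
list_ref(), list_unref() and live_node_count() are hypothetical names
invented for the example. It shows that every reference taken for a
local scope needs a matching release, or the nodes are never freed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for one element of the PATH linked list. */
struct path_node {
    struct path_node *next;
    int refcount;
};

static int live_nodes; /* allocated nodes, so that leaks are visible */

/* Allocate a node with one reference and link it in front of next. */
struct path_node *node_new(struct path_node *next)
{
    struct path_node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->next = next;
    n->refcount = 1;
    live_nodes++;
    return n;
}

/* Entering a local scope: share the list by referencing each node. */
void list_ref(struct path_node *head)
{
    for (; head; head = head->next)
        head->refcount++;
}

/* Drop one reference per node; free only nodes that reach zero. */
void list_unref(struct path_node *head)
{
    while (head) {
        struct path_node *next = head->next;
        if (--head->refcount == 0) {
            free(head);
            live_nodes--;
        }
        head = next;
    }
}

int live_node_count(void)
{
    return live_nodes;
}
```

If list_ref() is called for a local scope and only one list_unref()
follows, live_node_count() stays above zero; the second release is the
step that was missing when the state of PATH was restored.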

src/cmd/ksh93/sh/path.c: path_alias():
- Decrease refcount and free old item if needed.
  (The 'old' variable was already introduced in 99065353, but
  its value was never used there; this fixes that as well.)

src/cmd/ksh93/tests/leaks.sh:
- Add regression test. With the bug, setting/restoring PATH
  (which clears the hash table) and doing a PATH search 16 times
  causes about 1.5 KiB of memory to be leaked.
2020-07-09 18:34:15 +01:00
Martijn Dekker
5e7d335f2f Fix crash when listing indexed arrays with 'typeset -a'
There is a bug in the print_scan() function that may cause ksh to
crash while listing indexed arrays. The crash happens in nv_search()
when called from print_scan().

This applies a fix from Siteshwar Vashist:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01944.html

src/cmd/ksh93/bltins/typeset.c:
- Call nv_scan() without the NV_IARRAY flag, even for a null scan.

src/cmd/ksh93/tests/arrays.sh:
- Add regression test for 'typeset -a' crash and check output.
2020-07-09 16:42:16 +01:00
Martijn Dekker
a8f6d6b842 Fix crash due to double free() when sourcing multiple files
There is a bug in sh_eval() that may cause ksh to crash due to a
double free() after sourcing multiple files with '.' or 'source'
if a longjmp is triggered, e.g. by a syntax error.

This applies a fix from Siteshwar Vashist:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01943.html

src/cmd/ksh93/sh/xec.c: sh_eval():
- Zero file descriptor io_save after closing it. This prevents a
  double free() after returning from a longjmp.
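
The "clear after release" pattern behind this fix can be sketched as
follows. The struct and function names are hypothetical and a plain
malloc'd buffer stands in for io_save; the point is only that a second
cleanup pass, such as one reached after a longjmp, becomes a no-op.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context; saved_input stands in for io_save. */
struct eval_ctx {
    char *saved_input;
    int releases;      /* how many times the resource was really freed */
};

/* Release the saved resource at most once. */
void ctx_release(struct eval_ctx *ctx)
{
    assert(ctx != NULL);
    if (ctx->saved_input == NULL)
        return;                  /* already released: nothing to do */
    free(ctx->saved_input);
    ctx->saved_input = NULL;     /* clear it so a repeat call is harmless */
    ctx->releases++;
}
```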

src/cmd/ksh93/tests/basic.sh:
- Add reproducer as regression test.
2020-07-09 15:35:07 +01:00
Johnothan King
9526b3fa08
Fix unexpected output from 'printf %T' with certain formats (#65)
This commit changes the behavior of four date formats accepted
by 'printf %()T' because the old behavior is not compatible with
modern implementations of date(1):
- %k and %l now return a blank-padded hour, the former based on a
  24-hour clock and the latter a 12-hour clock (these are common
  extensions present on Linux and *BSD).
- %f now returns a date with the format '%Y.%m.%d-%H:%M:%S'
  (BusyBox extension).
- %q now returns the quarter of the current year (GNU extension).
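
As a rough illustration of the described results for %k, %l and %q,
here is a sketch using only standard C. It is not the tmxfmt.c
implementation, and the function names are made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* %q as described: quarter of the year, 1 through 4. */
int quarter_of(const struct tm *t)
{
    assert(t->tm_mon >= 0 && t->tm_mon <= 11);
    return t->tm_mon / 3 + 1;
}

/* %k as described: 24-hour clock, blank-padded to two columns. */
int format_hour24(char *buf, size_t size, const struct tm *t)
{
    return snprintf(buf, size, "%2d", t->tm_hour);
}

/* %l as described: 12-hour clock, blank-padded to two columns. */
int format_hour12(char *buf, size_t size, const struct tm *t)
{
    int h = t->tm_hour % 12;
    return snprintf(buf, size, "%2d", h ? h : 12);
}
```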

src/cmd/ksh93/data/builtins.c:
- Copy the date format documentation from date in libcmd to
  the printf man page (for documenting 'printf %T').

src/cmd/ksh93/tests/builtins.sh:
- Add four regression tests for the changed date formats.

src/cmd/ksh93/sh.1:
- Remove inaccurate information about the date formats accepted by
  'printf %T'. The KornShell uses a custom version of strftime(3)
  that isn't guaranteed to accept the same formats as the native
  strftime function.

src/lib/libast/tm/tmxfmt.c:
- Change the behavior of %f, %k, %l and %q to the common behavior.
  %k and %l are implemented as aliases to %_H and %_I to avoid
  duplicating code.

src/lib/libcmd/date.c:
- Update the documentation for the AST date command since it is
  also affected by the changes to 'printf %T'.

Fixes #62
2020-07-09 05:08:28 +01:00
Johnothan King
e70925ce10
Fix memory leak on unset of associative array (#64)
Associative arrays weren't being properly freed from memory, which
was causing a memory leak.

This commit incorporates a patch and reproducer/regress test from:
https://www.mail-archive.com/ast-users@lists.research.att.com/msg01016.html

src/cmd/ksh93/sh/name.c:
- Properly free associative arrays from memory in nv_delete().

src/cmd/ksh93/tests/leaks.sh:
- Add regression test.
2020-07-09 01:09:40 +01:00
Johnothan King
9a9da2c299
Fix use of strdup on a NULL pointer (#63)
The following set of commands can rarely cause a memory fault
when auditing[*] is enabled, although most of the time it will
simply cause ksh to write '(null)' to the auditing file in place
of a tty name:

$ [ -e /etc/ksh_audit ] || echo "/tmp/ksh_auditfile;$(id -u)" | sudo tee /etc/ksh_audit;
$ v=$(ksh  2> /dev/null +o rc -ic $'getopts a:bc: opt --man\nprint $?')
$ cat /tmp/ksh_auditfile
1000;1593599493;(null); getopts a:bc: opt --man

This happens because strdup is used unconditionally on the pointer
returned by 'ttyname', which can be NULL if stderr is closed. This
then causes 'hp->tty' to be set to null, as strdup returns NULL.
See https://github.com/att/ast/issues/1028

src/cmd/ksh93/edit/history.c:
- Make strdup duplicate 'notty' instead of NULL to prevent
  crashes.
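
A minimal sketch of the guard, assuming a helper named dup_tty_name()
that does not exist in history.c: the possibly-NULL result of
ttyname(3) is replaced by a placeholder before it reaches strdup(3).

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Duplicate a tty name, using a placeholder when none is available. */
char *dup_tty_name(const char *name)
{
    return strdup(name ? name : "notty");
}
```

A caller would pass the ttyname() result straight in, for example
dup_tty_name(ttyname(2)), and free() the returned string when done.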

[*] https://blog.fpmurphy.com/2008/12/ksh93-auditing-and-accounting.html
2020-07-06 21:51:44 +01:00
Martijn Dekker
300cd19987 Fix corrupt UTF-8 char processing & shellquoting after aborted read
If the processing of a multibyte character was interrupted in UTF-8
locales, e.g. by reading just one byte of a two-byte character 'ü'
(\303\274) with a command like:
	print -nr $'\303\274' | read -n1 g
then the shellquoting algorithm was corrupted in such a way that
the final quote in a simple single-quoted string was missing. This
bug may have had other, as yet undiscovered, effects as well. The
problem was with corrupted multibyte character processing and not
with the shell-quoting routine sh_fmtq() itself.

Full trace and discussion at: https://github.com/ksh93/ksh/issues/5
(which is also an attempt to begin to understand the esoteric
workings of the libast mb* macros that process UTF-8 characters).

src/lib/libast/comp/setlocale.c: utf8_mbtowc():
- If called from the mbinit() macro (i.e. if both pointer
  parameters are null), reset the global multibyte character
  synchronisation state variable. This fixes the problem with
  interrupted processing leaving an inconsistent state, provided
  that mbinit() is called before processing multibyte characters
  (which it is, in most (?) places that do this). Before this fix,
  calling mbinit() in UTF-8 locales was a no-op.
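
To show why a reset matters, here is a toy decoder with global state.
It is not the libast code: decoder_feed() and decoder_reset() are
hypothetical, and only one- and two-byte sequences are handled. After
an interrupted two-byte character, the next byte is misread unless the
state is reset first.

```c
#include <assert.h>

/* Hypothetical global decoder state. */
static int pending;            /* continuation bytes still expected */
static unsigned long partial;  /* code point bits gathered so far */

/* What an init call should do: forget any half-read character. */
void decoder_reset(void)
{
    pending = 0;
    partial = 0;
}

/* Feed one byte. Returns 1 and stores a code point in *out when a
 * character completes, 0 when more bytes are needed, -1 on a byte
 * that does not fit the current state. */
int decoder_feed(unsigned char byte, unsigned long *out)
{
    assert(out != 0);
    if (pending > 0) {
        if ((byte & 0xC0) != 0x80) {  /* expected a continuation byte */
            decoder_reset();
            return -1;
        }
        partial = (partial << 6) | (byte & 0x3Fu);
        if (--pending > 0)
            return 0;
        *out = partial;
        return 1;
    }
    if (byte < 0x80) {                /* plain ASCII */
        *out = byte;
        return 1;
    }
    if ((byte & 0xE0) == 0xC0) {      /* lead byte of a two-byte sequence */
        partial = byte & 0x1Fu;
        pending = 1;
        return 0;
    }
    return -1;
}
```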

src/cmd/ksh93/sh/string.c: sh_fmtq():
- Call mbinit() before potentially processing multibyte characters.
  Testing suggests that this could be superfluous, but at worst,
  it's harmless; better be sure.

src/cmd/ksh93/tests/builtins.sh:
- Add regression test for shellquoting with 'printf %q' after
  interrupting the processing of a multibyte character with
  'read -n1'. This test only fails in a UTF-8 locale, e.g. when
  running: bin/shtests -u builtins SHELL=/buggy/ksh-2012-08-01

Fixes #5.
2020-07-05 19:24:41 +02:00
Johnothan King
658bba748e
Fix 'kill -INFO' on systems that support SIGINFO (#59)
src/cmd/ksh93/data/signals.c:
- SIGINFO was absent from the table of signals, which caused
  commands like 'kill -INFO $$' to fail even on platforms with
  SIGINFO (such as macOS and FreeBSD). Fix that by adding
  it to the signal table.
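
  A name-to-number table with a platform-dependent entry could look
  like the sketch below. The table layout and sig_lookup() are
  assumptions for illustration and do not mirror signals.c; the only
  point is the #ifdef guard around a signal that not every system
  defines.

```c
#include <assert.h>
#include <signal.h>
#include <string.h>

struct sig_entry {
    const char *name;
    int number;
};

/* Hypothetical miniature signal table. */
static const struct sig_entry sig_table[] = {
    { "ABRT", SIGABRT },
    { "INT",  SIGINT  },
#ifdef SIGINFO
    { "INFO", SIGINFO },  /* only on platforms that define it */
#endif
    { "TERM", SIGTERM },
};

/* Returns the signal number for a name such as "INT", or -1. */
int sig_lookup(const char *name)
{
    size_t i;
    assert(name != NULL);
    for (i = 0; i < sizeof sig_table / sizeof sig_table[0]; i++)
        if (strcmp(sig_table[i].name, name) == 0)
            return sig_table[i].number;
    return -1;
}
```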

src/cmd/ksh93/tests/signal.sh:
- Add a regression test for using SIGINFO with the kill builtin.
  The test will only be run if the external kill command supports
  SIGINFO.
2020-07-04 15:57:47 +01:00
Johnothan King
a0dcdeeade Fix bugs with backslash escaping in interactive vi mode (#57)
This commit fixes the following bugs in the 'vi' editing mode
backslash escape feature. Ref.: Bolsky & Korn (1995), p. 113, which
states for \: "Similar to Control+V [...] except that it escapes
only the next Erase or Kill character".

1. The vi mode now only escapes the next character if the last
   character input was a backslash, fixing the bug demonstrated at:
   https://asciinema.org/a/E3Rq3et07MMQG5BaF7vkXQTg0
2. Escaping backslashes are now disabled in vi.c if the vi mode is
   disabled (note that vi.c handles raw editing mode in UTF-8
   locales). This makes the behavior of the raw editing mode
   consistent in C/POSIX and UTF-8 locales.
3. An odd interaction with Backspace when the character prior to a
   separate buffer entered with Shift-C was a backslash has been
   fixed. Demonstration at: https://asciinema.org/a/314833
   ^? will no longer be output repeatedly when attempting to erase
   a separate buffer with a Backspace, although, to be consistent
   with vi(1), you still cannot backspace past it before escaping
   out of it. Ref.:
   https://github.com/ksh93/ksh/issues/56#issuecomment-653586994

src/cmd/ksh93/edit/vi.c:
- Prevent a backslash from escaping the next input if the previous
  input wasn't a backslash. This is done by unsetting a variable
  named backslash if a backslash escaped a character. backslash is
  set to the result of c == '\\' when the user enters a new
  character.
- Disable escaping backslashes in the raw editing mode because
  it should not be enabled there.

src/cmd/ksh93/tests/pty.sh:
- Add some tests for how ksh handles backslashes in each
  editing mode to test for the bugs fixed by this commit.

Fixes #56.
2020-07-03 21:15:21 +02:00
Anuradha Weeraman
035a4cb3f4
Fix segfault if $PATH contains a .paths directory (#55)
ksh crashed if it encountered a .paths directory in any of the
directories in $PATH.

Ref: https://bugs.launchpad.net/ubuntu/+source/ksh/+bug/1534855

src/cmd/ksh93/sh/path.c: path_chkpaths():
- Refuse to read .paths if it's not a regular file
  or a symlink to a regular file.
2020-07-02 23:29:07 +01:00
Johnothan King
db1d539d49
Fix ERE repetition expressions in [[ ... =~ ERE{x,y} ]] (#54)
Regular expressions that combine a repetition expression with
a parenthesized sub-expression throw a garbled syntax error:

$ [[ AATAAT =~ (AAT){2} ]]
ksh: syntax error: `~(E)(AAT){2} ]]
:'%Cred%h%Creseksh: syntax error: `~(E)(AAT){2} ]]
:'%Cred%h%Creseksh: syntax' unexpected

The syntax error occurs because ksh is not fully
accounting for '=~' when it runs into a curly bracket.
This fix disables the syntax error when the operator
is '=~' and adds handling for '(str){x}' (to allow for
more than one sub-expression). This bugfix and the
regression tests for it were backported from ksh93v-
2014-12-24-beta.

src/cmd/ksh93/sh/lex.c:
- Do not trigger a syntax error for '{x}' when the operator
  is '=~' and add handling for multiple parentheses when
  combined with '{x}'.

src/cmd/ksh93/tests/bracket.sh:
- Add two tests from ksh93v- to test sub-expressions
  combined with the '{x}' quantifier.
2020-07-02 18:40:15 +01:00
Johnothan King
120aec25ba
Fix a crash when 'read -u' is given an invalid fd (#53)
File descriptors that are too far out of range will cause the
read builtin to crash. The following example will generate
two crashes:

$ ksh -c 'read -u 2000000' || ksh -c 'read -u-2000000'

The fix is to error out when the given file descriptor is out
of range. This bugfix is from Tomas Klacko, although it was
modified to use 'sh_iovalidfd' and reject numbers greater
than 'INT_MAX':
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01912.html
The question about 'shp->fdstatus[-1]' only applies to ksh93v-
(ksh93u+ doesn't have any references to 'shp->fdstatus[-1]').

src/cmd/ksh93/bltins/read.c:
- File descriptors that are out of range should be rejected
  with an error message (like invalid file descriptors that
  are in range). The seemingly redundant check for negative
  numbers is there because out of range negative numbers also
  cause memory faults despite the later 'fd<0' check.
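
  The range check can be sketched in plain C as below. This does not
  use sh_iovalidfd; parse_fd() and its max_fd parameter are
  hypothetical. Both the negative check and the upper bound are
  applied before the value is narrowed to int.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a descriptor number from text. Returns the descriptor, or -1
 * if the text is not a number or lies outside 0..max_fd. */
int parse_fd(const char *text, int max_fd)
{
    char *end;
    long n;
    assert(text != NULL);
    errno = 0;
    n = strtol(text, &end, 10);
    if (end == text || *end != '\0' || errno == ERANGE)
        return -1;                    /* not a clean number */
    if (n < 0 || n > INT_MAX || n > max_fd)
        return -1;                    /* out of range on either side */
    return (int)n;
}
```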

src/cmd/ksh93/tests/io.sh:
- Add three tests for attempting 'read -u' on various invalid
  file descriptor numbers.
2020-07-01 18:14:10 +01:00
Johnothan King
1b5bc1802a
Fix the readonly builtin's scope in functions (#51)
This bug was first reported at https://github.com/att/ast/issues/881

'tdata.sh->prefix' is only set to the correct value when
'b_readonly' is called as 'export', which breaks 'readonly' in
functions because the correct scope isn't set. As a result, the
following example will only print a newline:

$ function show_bar { readonly foo=bar; echo $foo; }; show_bar

The fix is to move the required code out of the if statement for
'export', as it needs to be run for 'readonly' as well. This bugfix
is from https://github.com/att/ast/pull/906

src/cmd/ksh93/bltins/typeset.c:
- Set 'tdata.sh->prefix' to the correct value, otherwise 'readonly'
  uses the wrong scope.

src/cmd/ksh93/tests/builtins.sh:
- Add the regression test from ksh2020, modified to run in a
  subshell.

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/sh.1:
- Add documentation of 'readonly' vs. 'typeset -r' difference:
  'readonly' does not create a function-local scope.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2020-06-29 19:09:20 +01:00
Johnothan King
10b6ba801d
Fix memory corruption when a compound variable is unset (#49)
The following set of commands ends with a memory fault under
certain circumstances because ksh attempts to free memory
twice, causing memory corruption:

$ testarray=(1 2)
$ compound testarray
$ unset testarray
$ eval testarray=

The fix is to make sure 'np->nvfun' is a valid pointer before
attempting to free memory in 'put_tree'. This patch is from
OpenSUSE: https://build.opensuse.org/package/view_file/shells/ksh/ksh93-nvtree-free.dif?expand=1

src/cmd/ksh93/sh/nvtree.c:
- Do not try to free memory when 'np->nvfun' and 'val'
  are false.

src/cmd/ksh93/tests/comvar.sh:
- Add a regression test for the double free problem. The
  reproducer must be run from an executable script
  with 'ksh -c'.
2020-06-29 18:08:28 +01:00
Johnothan King
5135cf651c
Fix crashes caused by 'typeset -RF' (#47)
Variables created with 'typeset -RF' were being treated as
short integers, even though they are actually floating point
values. As a result the following example will cause a crash:

$ typeset -RF foo=1
$ test "$foo"

This is fixed by checking for 'NV_DOUBLE' with 'nv_isattr',
which prevents ksh from treating floating point values as
short integers due to '== NV_INT16P' excluding 'NV_DOUBLE'.
This bugfix was backported from ksh93v- 2013-10-10-alpha.

src/cmd/ksh93/sh/array.c,
src/cmd/ksh93/sh/name.c,
src/cmd/ksh93/sh/nvdisc.c:
- Avoid treating floating point values as short integers by
  checking for 'NV_DOUBLE' with 'nv_isattr'.
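
  How an equality test on a bit mask can misclassify a value is
  sketched below. The ATTR_* bit values are invented for the example
  and are not the real NV_* definitions; the sketch only assumes that
  a right-justified float can carry all of the short-integer bits plus
  the float bits.

```c
#include <assert.h>

/* Hypothetical attribute bits; the real NV_* values differ. */
#define ATTR_INTEGER 0x1u
#define ATTR_SHORT   0x2u
#define ATTR_PTR     0x4u
#define ATTR_FLOAT   0x8u
#define ATTR_SHORTP  (ATTR_INTEGER | ATTR_SHORT | ATTR_PTR)
#define ATTR_DOUBLE  (ATTR_INTEGER | ATTR_FLOAT)

/* Loose test: true whenever all short-integer bits are present. */
int looks_short_loose(unsigned flags)
{
    return (flags & ATTR_SHORTP) == ATTR_SHORTP;
}

/* Strict test: the float bits must be absent as well. */
int looks_short_strict(unsigned flags)
{
    assert(ATTR_SHORTP != ATTR_DOUBLE);
    return (flags & (ATTR_SHORTP | ATTR_DOUBLE)) == ATTR_SHORTP;
}
```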

src/cmd/ksh93/tests/types.sh:
- Add a regression test for the 'typeset -RF' crash. The
  crash cannot be replicated if 'typeset -RF' sets 'foo'
  to zero.
2020-06-28 23:30:27 +01:00
Johnothan King
bb4745e897
Fix incorrect behavior of 'cd ../.foo' (#46)
The cd builtin was removing '.' from directory names when combined
with a preceding '../', which caused commands like 'cd ../.local'
to become 'cd ../local'. This patch fixes the problem by limiting
the extra handling to leading '..'. The bugfix comes from ksh93v-
2013-10-10-alpha, although this version is a shortened patch from
Solaris (as ksh93v- refactored a decent amount of the code for the
cd builtin).

src/cmd/ksh93/bltins/cd_pwd.c:
- cd should only check for leading '..', as trying to handle a lone
  '.' only causes problems.

src/cmd/ksh93/tests/builtins.sh:
- Add a regression test for this problem based on the test present in
  ksh93v- 2013-10-10-alpha.

Patch from Solaris:
https://github.com/oracle/solaris-userland/blob/860d27f/components/ksh93/patches/270-23319761.patch
2020-06-26 23:36:29 +01:00
Martijn Dekker
8c705bf3b7 Fix behaviour of tabs in raw Bourne Shell-like editing mode
When neither '-o emacs' nor '-o vi' is active, there were a couple
of bugs with entering tab characters:
1. Tab completion was erroneously left active. The cause of this
   was that raw Bourne edit mode is handled by ed_viread() in vi.c
   on shells with wide character support, instead of the default
   ed_read() in edit.c, and the former failed to check if vi mode
   is active when processing tab characters.
2. When entering literal tab characters, the cursor was moved to
   the right only one character, instead of the number of
   characters corresponding to the tab.
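
The cursor arithmetic for a literal tab can be sketched like this,
assuming tab stops every 8 columns (an assumption for the example, not
something stated above) and a hypothetical helper tab_advance():

```c
#include <assert.h>

#define TAB_WIDTH 8  /* assumed tab stop spacing */

/* Column the cursor lands on after a tab typed at column col (0-based). */
int tab_advance(int col)
{
    assert(col >= 0);
    return col + (TAB_WIDTH - col % TAB_WIDTH);
}
```

With this rule a tab typed at column 3 moves the cursor five columns,
to column 8, rather than one.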

src/cmd/ksh93/edit/vi.c: getline():
- Before processing '\t' (tab) for command completion, check that
  the 'vi' shell option (SH_VI) is active.

src/cmd/ksh93/edit/edit.c: ed_virt_to_phys():
- When translating literal tabs to on-terminal spaces and when
  recalculating the cursor position, remove erroneous checks for
  SH_VI; this is also needed in raw Bourne mode. According to my
  own testing, this has no effect on emacs mode (knock on wood).

src/cmd/ksh93/tests/pty.sh:
- Add two regression tests. An odd race condition reveals itself in
  either pty or in ksh's raw/Bourne edit mode; see comment in test.
  Effect is we have to expect either literal tabs or tabs expanded
  to spaces, until that is tracked down and fixed.

Fixes #43.
2020-06-26 11:34:02 +02:00
Johnothan King
4cecde1dd3 Fix buggy completion of ~/some in vi mode (#41)
This commit fixes the bug reported in:
https://github.com/att/ast/issues/682
The following sequence fails in vi mode because ksh looks in the
wrong part of the 'virtual' buffer:

$ touch ~/testfile
$ ls ~/test<tab>

The fix is to change 'virtual[i]' to 'virtual[last_virt]' in the
bugged section of code. The other changes are to make sure listing
files in a directory with something like 'ls /etc/<tab>' calls the
code for Ctrl+L to preserve 'ls /etc/' rather than try (and fail)
to complete the directory name, producing 'ls /etc\n/'. This bugfix
was backported from ksh93v- 2013-10-10-alpha.

src/cmd/ksh93/edit/vi.c:
 - Backport the bugfix from ksh93v- 2013-10-10-alpha for this
   problem.

src/cmd/ksh93/tests/pty.sh:
 - Add a regression test for this issue using pty, adjusted slightly
   for a fake home directory in /tmp.
2020-06-25 23:13:45 +02:00
Johnothan King
d41ec674c7 Fix some errors in the documentation and other minor issues (#42)
Somewhat notable changes in this commit:
- The 'set +r' bugfix (re: 74b41621) is now documented in the
  changelog.
- Missing options have been added to the synopsis section of the
  ksh man page.
- The minor formatting fix from https://github.com/ksh-community/ksh/pull/5
  has been applied to the ksh man page.
- A few fixes from https://github.com/att/ast/commit/5e747cfb
  have been applied to the ksh man page.
- The man page fixes from https://github.com/att/ast/pull/353
  have been applied, being:
  - An addition to document the behavior of 'set -H'.
  - A fix for the cd section appending rksh93.
  - A fix for some options being indented too far.
  - Removal of a duplicate section documenting '-D'.
  - Reordering the options for 'set' in alphabetical order.
  - A minor fix for the documentation of 'ksh -i'.
2020-06-25 19:31:51 +02:00
Martijn Dekker
2315f6687a Add regress test for fixed BUG_KUNSETIFS (re: 6f0e008c, 7b994b6a)
Modernish is no longer detecting BUG_KUNSETIFS, as I've just
discovered. Always nice when bugs spontaneously vanish...

A 'git reset HEAD~1'/recompile/retest loop reveals this bug was
fixed by 6f0e008c, as later modified by 7b994b6a.

So, let's make sure it stays fixed.

src/cmd/ksh93/tests/variables.sh:
- Add a couple of regression tests for BUG_KUNSETIFS presence,
  detection and known workaround, based on the same in modernish.
  Ref.: https://github.com/modernish/modernish/blob/3ddcbd13/lib/modernish/cap/BUG_KUNSETIFS.t
	https://github.com/modernish/modernish/blob/3ddcbd13/lib/modernish/tst/isset.t#L204-L222
2020-06-24 20:00:01 +02:00
Johnothan King
0aa9e03f55
Fix process substitution combined with redirection (#40)
The code for handling process substitution with redirection was
never being run because IORAW is usually set when IOPROCSUB is
set. This commit fixes the problem by moving the required code
out of the !IORAW if statement. The following command now prints
'good' instead of writing 'ok' to a bizarre file:

$ ksh -c 'echo ok > >(sed s/ok/good/); wait'
good

This commit also fixes a bug that caused the process ID of the
asynchronous process to print when the shell was in interactive
mode. The following command no longer prints a process ID,
behaving like in Bash and zsh:

$ echo >(true)
/dev/fd/5

src/cmd/ksh93/sh/args.c:
 - Temporarily turn off the interactive state while in a process
   substitution to prevent the shell from printing the PID of
   the asynchronous process.

src/cmd/ksh93/sh/io.c:
 - Move the code for process substitution with redirection into
   a separate if statement.

src/cmd/ksh93/tests/io.sh:
 - Add two tests for both process substitution bugs fixed by this
   commit.

src/cmd/ksh93/tests/shtests:
 - Update shtests with a patch from Martijn Dekker to use
   pretty-printing for the output from the times builtin (if it
   is available).

Fixes #2
2020-06-23 23:02:16 +01:00
Johnothan King
c1994b87f1
Fix nested functions ignoring prefixed variable assignments (#37)
This commit fixes the bug described in att/ast#32. The fix and
following explanation are from att/ast#467:

While copying variables from function's local scope to a new scope,
variable attributes were not copied. Such variables were not marked
to be exported in the new function. For example:

function f2 { env | grep -i "^foo"; }
function f1 { env | grep -i "^foo"; f2; }
foo=bar f1

prints 'foo=bar' only once, but it should be printed twice.

src/cmd/ksh93/sh/xec.c:
 - When variables from the local scope of a function are copied into
   the scope of a nested function, the attributes of the variables
   need to be copied as well.
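
   In miniature, the difference is one assignment. The struct and the
   VAR_EXPORT bit below are hypothetical and much simpler than ksh's
   name-value nodes; the sketch only shows that a scope copy which
   skips the flags loses the export mark.

```c
#include <assert.h>
#include <string.h>

#define VAR_EXPORT 0x1u  /* hypothetical attribute bit */

struct var {
    char name[32];
    char value[32];
    unsigned flags;
};

/* Copy a variable into a new scope, attributes included. */
void var_copy(struct var *dst, const struct var *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst->name, src->name, sizeof dst->name);
    memcpy(dst->value, src->value, sizeof dst->value);
    dst->flags = src->flags;  /* without this the export mark is lost */
}
```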

src/cmd/ksh93/tests/functions.sh:
 - Add regression tests from ksh2020 to check environment variables
   passed to functions.
2020-06-23 00:27:05 +01:00
Johnothan King
ff358f3464 Fix a crash when 'kill %%' and 'kill %+' are run (#35)
Ksh was trying to use the 'pw' variable as a valid pointer even
when it was NULL. This is fixed by doing the error check for
'pw' before doing anything else in 'job_kill'.

This bugfix is from Red Hat:
44e0a643a9/f/SOURCES/ksh-20130214-fixkill.patch

Fixes #34
2020-06-22 19:11:49 +02:00
Martijn Dekker
3ba4900e9c Make 'stop' and 'suspend' regular built-ins
The 'stop' and 'suspend' default aliases are now converted into
regular built-in commands so that 'unalias -a' does not remove
them, 'suspend' can do some sanity checks, and something like
	cmd=stop; $cmd $!
will now work.

src/cmd/ksh93/bltins/trap.c:
- b_kill(): Incorporate 'stop' functionality, which is simply
  setting the same flag and variable as '-s STOP' would have done.
- b_suspend(): Add simple builtin function that sends SIGSTOP to
  the main shell. Check for no operands, and refuse to suspend a
  login shell (which would leave the user stuck with no way out).
  Also check that 'kill' succeeds; if we're in an asynchronous
  subshell, it is possible the main shell no longer exists.

src/cmd/ksh93/data/aliases.c:
- Remove "stop" and "suspend" default aliases. (Why were these
  conditional upon SIGTSTP when they actually issued SIGSTOP?)

src/cmd/ksh93/include/builtins.h,
src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/data/msg.c:
- Add declarations of "stop" and "suspend" regular built-ins.
- Add option strings (AST manual/--man pages) for them.
- Add e_toomanyops ("too many operands") reusable error message for
  b_suspend(). Other new commands may want this at some point.

src/cmd/ksh93/sh.1:
- Remove "stop" and "suspend" default aliases.
- Document "stop" and "suspend" regular built-in commands.
2020-06-22 15:36:29 +02:00
Johnothan King
bd3e2a8001
Fix unreliable behavior when special vars are readonly or unset (#27)
src/cmd/ksh93/data/variables.c:
 - Running 'unset .sh.lineno' creates a memory fault, so fix that
   by giving it the NV_NOFREE attribute. This crash was happening
   because ${.sh.lineno} is an integer that cannot be freed from
   memory with free(3).

src/cmd/ksh93/sh/init.c:
 - Tell _nv_unset to ignore NV_RDONLY when $RANDOM and $LINENO are
   restored from the subshell scope. This is required to fully
   restore the original state of these variables after a virtual
   subshell finishes.

src/cmd/ksh93/bltins/typeset.c,
src/cmd/ksh93/sh/subshell.c:
 - Disabled some optimizations for two instances of 'sh_assignok' to
   fix 'readonly' in virtual subshells and '(unset .sh.level)' in
   nested functions. This fixes the following variables when
   '(readonly $varname); enum varname=' is run:

   $_
   ${.sh.name}
   ${.sh.subscript}
   ${.sh.level}

   The optimization in question prevents sh_assignok from saving the
   original state of these variables by making the sh_assignok call
   a no-op. Ksh needs the original state of a variable for it to be
   properly restored after a virtual subshell has run, otherwise ksh
   will simply carry over any new flags (being NV_RDONLY in this case)
   from the subshell into the main shell.

src/cmd/ksh93/tests/variables.sh:
 - Add regression tests from Martijn Dekker for setting special
   variables as readonly in virtual subshells and for unsetting
   special variables in general.

Fixes #4
2020-06-20 18:08:41 +01:00
Johnothan King
99065353b3 Fix 'whence -a' to print correct path for tracked alias (#25)
'whence -a' bases the path for tracked aliases on the user's
current working directory if an enabled ksh builtin of the same
name is also available. The following example will claim 'cat'
is in the user's current working directory:

$ whence -a cat
cat is a tracked alias for /usr/bin/cat
$ builtin cat
$ whence -a cat
cat is a shell builtin
cat is /usr/bin/cat
cat is a tracked alias for /current/working/directory/cat

This patch from ksh2020 fixes this problem by properly saving the
path of the tracked alias for use with 'whence -a', since
'path_pwd' (as implied by the function's name) only gets the user's
current working directory, not the location of tracked aliases.
Ref.: https://github.com/att/ast/issues/1049

This bug was originally reported by David Morano about two decades
ago to the AST team: https://github.com/att/ast/issues/954

src/cmd/ksh93/bltins/whence.c:
 - Print the actual path of a tracked alias; path_pwd doesn't
   have this functionality.

src/cmd/ksh93/include/name.h:
 - Add 'pathcomp' for saving the value of tracked aliases.

src/cmd/ksh93/sh/path.c:
 - Save the value of tracked aliases for use by whence.

src/cmd/ksh93/tests/builtins.sh:
 - Add a regression test for using 'whence -a' on tracked
   aliases with a builtin equivalent.
2020-06-19 14:03:58 +02:00
Martijn Dekker
3e3f6b0f12 Restore #22 'unset -f' fix minus segfault (re: b7932e87, 97511748)
Applying the fix for 'unset -f' exposed a crashing bug in lookup()
in sh/nvdisc.c, which is the function for looking up discipline
functions. This is what caused tests/variables.sh to crash.
Ref.: https://github.com/ksh93/ksh/issues/23#issuecomment-645699614

src/cmd/ksh93/sh/nvdisc.c: lookup():
- To avoid segfault, check that the function pointer nq->nvalue.rp
  is actually set before checking if nq->nvalue.rp->running==1.

src/cmd/ksh93/sh/xec.c,
src/cmd/ksh93/tests/functions.sh:
- Uncomment the 'unset -f' fix from b7932e87.

Resolves #21 (again).
2020-06-18 02:48:51 +02:00
Martijn Dekker
975117485c Part revert #22 to undo memory fault (re: b7932e87)
The fix in sh/xec.c, which was backported from the ksh 93v- beta to
delay the actual removal of a running function that unsets itself,
caused a segfault in the variables.sh regression tests (see #23).

src/cmd/ksh93/sh/xec.c:
- Comment out the backported code pending a correct fix for #21.
  Now both types of functions silently fail to unset themselves
  (unless they're discipline functions).

src/cmd/ksh93/tests/functions.sh:
- Disable regression tests checking that the function was actually
  unset, pending a correct fix for #21.

Resolves: #23
Reopens: #21
2020-06-17 21:01:55 +02:00
Johnothan King
b7932e87b6
Fix two problems with 'unset -f' behavior (#22)
src/cmd/ksh93/sh/name.c:
 - Correct the check for when a function is currently running
   to fix a segmentation fault that occurred when a POSIX
   function tries to unset itself while it is running.
   This bug fix was backported from ksh93v-.

src/cmd/ksh93/sh/xec.c:
 - If a function tries to unset itself, unset the function
   with '_nv_unset(np, NV_RDONLY)' to fix a silent failure.
   This fix was also backported from ksh93v-.

src/cmd/ksh93/tests/functions.sh:
 - Add four regression tests for when a function unsets itself.

Resolves #21
2020-06-17 18:26:43 +01:00
Johnothan King
fae8862c53
Fix assignments preceding 'command <special builtin>' (#19)
Ksh was not checking for `command` when running a special builtin,
which caused preceding invocation-local variable assignments to
become global. This is the reproducer from att/ast#72:

$ foo=BUG command eval ':'
$ echo "$foo"

This no longer prints 'BUG', as ksh now makes sure the command builtin
is not running a special builtin before making invocation-local
variable assignments global.

src/cmd/ksh93/sh/xec.c:
 - Backport the bugfix for BUG_CMDSPASGN from ksh93v- 2013-10-10-alpha.

src/cmd/ksh93/tests/builtins.sh:
 - Add a regression test based on the reproducer in att/ast#72.
2020-06-16 22:58:05 +01:00
Johnothan King
764acefaf1 read -r -d should not ignore -r
This bug was previously reported in att/ast#37.
Ksh ignores `-r` when `read -r -d` is run because when
the bit for `D_FLAG` is set, the bit for `R_FLAG` is unset
as a side effect of setting `D_FLAG`. The following set
of commands fails to print a backslash:

$ printf '\\\000' | read -r -d ''
$ echo $REPLY

The fix for this bug is to set `D_FLAG` with `D_FLAG + 1`,
which prevents `R_FLAG` from being unset. This bugfix
has been backported from ksh93v- 2013-10-10-alpha.

src/cmd/ksh93/bltins/read.c:
 - Set `D_FLAG` with `D_FLAG + 1` to prevent the bit for
   `R_FLAG` from being unset.
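
   One way such a side effect can arise is sketched below. This is not
   the read.c code: the layout (option bits in the low byte, the
   delimiter stored above them) and all names are assumptions made for
   the example. A mask that is off in the wrong direction wipes the
   option bits while storing the delimiter.

```c
#include <assert.h>

#define RAW_FLAG    1u  /* hypothetical bit for -r */
#define DELIM_SHIFT 8   /* hypothetical: delimiter lives above this bit */

/* Wrong mask: clears the low option bits, including RAW_FLAG. */
unsigned set_delim_buggy(unsigned flags, unsigned char delim)
{
    flags &= ~((1u << DELIM_SHIFT) - 1);
    return flags | ((unsigned)delim << DELIM_SHIFT);
}

/* Right mask: keeps the option bits and drops only an old delimiter. */
unsigned set_delim_fixed(unsigned flags, unsigned char delim)
{
    assert(DELIM_SHIFT > 0);
    flags &= (1u << DELIM_SHIFT) - 1;
    return flags | ((unsigned)delim << DELIM_SHIFT);
}
```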

src/cmd/ksh93/tests/builtins.sh:
 - Add the regression test for `read -r -d` from ksh93v-.
2020-06-16 13:49:23 +01:00
Martijn Dekker
6916a873c2 optget: display --help and --man in terse usage messages
The fact that every ksh builtin command self-documents with the
options --help and --man (and others, see 'getopts --man'; but
these are the essential ones) is poorly known; the information is
buried somewhere deep in the sh.1 manual page, and is incomplete at
that. None of the terse usage messages displayed on error point the
user to these options. So let's fix that.

src/lib/libast/misc/optget.c:
- Change generic 'options' placeholder, used in all terse usage
  messages, to 'options | --help | --man'.

src/cmd/ksh93/sh.1:
- Edit documentation of --man/-?, adding documentation on --help
  which was completely undocumented. Refer to 'getopts --man' for
  more advanced info.
- Separate these from the (important) documentation on special
  builtins using a paragraph break.
2020-06-15 16:56:11 +02:00
Johnothan King
3d38270b32 Remove a buggy optimization for variables in subshells
This bug was originally reported by @lijog in att/ast#7 and has been
reported again in #15. KSH does not save the state of a variable if it
is in a newer scope. This is because of an optimization in sh_assignok
first introduced in ksh93t+ 2010-05-24. Here is the code change in that
version:

                return(np);
        /* don't bother to save if in newer scope */
-       if(!(rp=shp->st.real_fun)  || !(dp=rp->sdict))
-               dp = sp->var;
-       if(np->nvenv && !nv_isattr(np,NV_MINIMAL|NV_EXPORT) && shp->last_root)
-               dp = shp->last_root;
-       if((mp=nv_search((char*)np,dp,HASH_BUCKET))!=np)
-       {
-               if(mp || !np->nvfun || np->nvfun->subshell>=sh.subshell)
-                       return(np);
-       }
+       if(sp->var!=shp->var_tree && shp->last_root==shp->var_tree)
+               return(np);
        if((ap=nv_arrayptr(np)) && (mp=nv_opensub(np)))
        {

This change was originally made to replace a buggier optimization.
However, the current optimization causes variables set in subshells
to wrongly affect the environment outside of the subshell, as the
variable does not get set back to its original value. This patch
simply removes the buggy optimization to fix this problem.

src/cmd/ksh93/sh/subshell.c:
 - Remove a buggy optimization that caused variables set in subshells
   to affect the environment outside of the subshell.

src/cmd/ksh93/tests/subshell.sh:
 - Add a regression test for setting variables in subshells. This
   test has to be run from the disk after being created with a here
   document because it always returns the expected result when run
   directly in the regression test script.
2020-06-15 07:13:38 -07:00
Martijn Dekker
ef1621c18f Make 'source' a regular built-in
The 'source' alias is now converted into a regular built-in command
so that 'unalias -a' does not remove it, and something like
	cmd=source; $cmd name args
will now work.

This is part of the project to replace default aliases that define
essential commands by proper builtins that act identically (except
you now get the actual command's name in any error/usage messages).

src/cmd/ksh93/data/aliases.c:
- Remove 'source' default alias.

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/include/builtins.h:
- Define 'source' regular builtin with extra parser ID "SYSSOURCE".
  Same definition as '.', minus the BLT_SPC flag indicating a
  special builtin. This preserves the behaviour of 'command .'.
- Update sh_optdot[] to include info for 'source --man'.
  (Note that \f?\f expands to the current command name.
  This allows several commands to share a single --man page.)

src/cmd/ksh93/sh/parse.c:
- In the two places that SYSDOT is checked for, also check for
  SYSSOURCE, making sure the two commands are parsed identically.

src/cmd/ksh93/sh.1:
- Remove 'source' default alias.
- Document 'source' regular builtin.
2020-06-15 11:33:44 +02:00
Johnothan King
af0bd6ad70 read -S now correctly handles nested double quotes
Prior to this bugfix, the following set of commands would
fail to print two double quotes:

IFS=',' read -S a b c <<<'foo,"""title"" data",bar'
echo $b

This fix is from ksh93v- 2013-10-10-alpha, although it has
been revised to use stakputc to put the required double quote
into the buffer for consistency with the ksh93u+ codebase.

src/cmd/ksh93/bltins/read.c:
 - When handling nested double quotes, put the required double
   quote in read's buffer with stakputc.

src/cmd/ksh93/tests/builtins.sh:
 - Add the regression test for `read -S` from ksh93v-.

src/cmd/ksh93/sh.1:
 - Fix a minor formatting error to highlight '-S' in the ksh(1)
   man page.
2020-06-14 10:40:30 -07:00
Martijn Dekker
1a6f37dc86 NEWS: update repo URI 2020-06-14 06:28:38 +02:00
Johnothan King
d7c9470704 Backport the ksh2020 fix for timezone name determination
This fix for `printf '%T' now` on FreeBSD was written by
@krader1961. This is from https://github.com/att/ast/pull/591:

On FreeBSD calling tzset() does not guarantee the tzname array will
be correctly populated. On most systems that works but on FreeBSD you
have to call localtime() or a related function (e.g., ctime()).

This change also eliminates a potential, very small, memory leak due to
the strdup()'ed tznames not being freed.

src/lib/libast/tm/tminit.c:
 - Fix timezone name determination on FreeBSD and a memory leak.
2020-06-13 14:28:10 -07:00
Martijn Dekker
e500479ede
Merge pull request #1 from JohnoKing/fix-builtin-delete
`builtin -d` should not delete special builtins
2020-06-12 12:36:42 +01:00
Johnothan King
017d088c39 builtin -d should not delete special builtins
The man page for the builtin command says special builtins cannot
be deleted. This wasn't the case, though: running `builtin -d` on
a special builtin deleted it. As an example, the following
set of commands was ending with 'export: not found':

$ builtin -d export
$ export foo=bar

This commit backports the bugfix from ksh93v- (2014-12-24-beta),
which added an error check to prevent special builtins from being
deleted.

src/cmd/ksh93/sh/nvdisc.c:
 - Add an error check to prevent special builtins from being deleted.

src/cmd/ksh93/tests/builtins.sh:
 - Add a regression test for using `builtin -d` on special builtins.
2020-06-12 04:26:40 -07:00
Martijn Dekker
d8eba9d112 Remove 'login' and 'newgrp' builtins: not sane default behaviour
This commit removes the undocumented 'login' and 'newgrp' builtin
commands. They already stopped blocking shell functions by that
name by changing from special to regular builtins in 04b91718 (a
change I forgot to mention in that commit message), but there is
another obnoxious aspect to these: being glorified hooks into
'exec', they replaced your shell session with the external commands
by the same name. This makes argument and error checking
impossible, so if you made so much as a typo, you would be
immediately logged out.

Even if that behaviour is wanted by a few, having it as the default
is user-hostile enough to be called a bug. It also violates the
POSIX definition of the 'newgrp' utility which explicitly says that
it "shall create a new shell execution environment", not replace
the existing one.
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/newgrp.html

Users who do want this behaviour can easily restore it by setting:
	alias login='exec login'
	alias newgrp='exec newgrp'

src/cmd/ksh93/bltins/misc.c:
- As there is no more 'login' builtin, combine b_exec() and
  B_login() functions, which allows eliminating a few variables.
  Note that most of 'exec' was actually implemented in B_login()!

src/cmd/ksh93/data/builtins.c:
- Remove "login" and "newgrp" table entries.

src/cmd/ksh93/include/builtins.h:
- Remove SYSLOGIN parser ID. As this was the first, all the others
  needed renumbering.

src/cmd/ksh93/sh/xec.c:
- Remove SYSLOGIN parser check that made 'login' and 'newgrp' act
  like 'exec' and replace the shell.
2020-06-12 06:57:57 +02:00
Martijn Dekker
7b82c338da Make 'redirect' a regular builtin instead of an alias of 'exec'
This commit converts the redirect='command exec' alias to a regular
'redirect' builtin command that only accepts I/O redirections, which
persist as in 'exec'. This means that:
* 'unalias -a' no longer removes the 'redirect' command;
* users no longer accidentally get logged out of their shells if
  they type something intuitive but wrong, like 'redirect ls >file'.

This should not introduce any legitimate change in behaviour. If
someone did accidentally pass non-redirection arguments to
'redirect', unexpected behaviour would occur; this now produces
an 'incorrect syntax' error.

src/cmd/ksh93/bltins/misc.c: b_exec():
- Recognise 'redirect' when parsing options.
- If invoked as 'redirect', produce error if there are arguments.

src/cmd/ksh93/data/aliases.c:
- Remove redirect='command exec' alias.

src/cmd/ksh93/data/builtins.c:
- Update/improve comments re ordering.
- Add 'redirect' builtin entry.
- sh_optexec[]: Abbreviate redirection-related documentation;
  refer to redirect(1) instead.
- sh_optredirect[]: Add documentation.

src/cmd/ksh93/include/builtins.h:
- Add SYSREDIR parser ID, renumbering those following it.
- Improve comments.
- Add extern sh_optredirect[].

src/cmd/ksh93/sh.1:
- exec: Abbreviate redirection-related documentation; refer to
  'redirect' instead.
- redirect: Add documentation.

src/cmd/ksh93/sh/xec.c:
- Recognise SYSREDIR parser ID in addition to SYSEXEC when
  determining whether to make redirections persistent.

src/cmd/ksh93/tests/io.sh:
- To regress-test the new builtin, change most 'command exec' uses
  to 'redirect'.
- Add tests verifying the exit behaviour of 'exec', 'command exec',
  'redirect' on redirections.
2020-06-12 04:54:33 +02:00
Martijn Dekker
04b9171858 POSIX compliance fix: make 'unalias' a regular builtin
Both 'alias' and 'unalias' are specified as regular builtins. Among
a few other things, that means it ought to be portable to use these
names for shell functions. But ksh93 disallowed that until now.

src/cmd/ksh93/data/builtins.c:
- Make 'unalias' a regular builtin by removing the BLT_SPC flag.
- (same fix for 'alias' was already done in afa68dca)
- Add the BLT_ENV flag to the 'alias' and 'hash' commands. In
  include/name.h, this flag is commented: "non-stoppable, can
  modify environment". The "non-stoppable" bit seems like a good
  idea: these operations should not be interruptable as that would
  cause an inconsistent state.

src/cmd/ksh93/sh.1:
- Remove the '-', indicating special builtin, from 'alias' entry.
- Minor cosmetic fix: space after the '-' for 'unset'.

(cherry picked from commit a4315d7672204acb543010b4d4916b22dcb9cb08)
2020-06-12 01:45:18 +02:00
Johnothan King
102868f850 Replace the hash alias with a proper builtin
This commit replaces the old hash alias with a proper builtin.
I based this builtin off of the code alias uses for handling
`alias -t --`, but with the hack for `--` removed as it has
no use in the new builtin. `alias -t --` will no longer work, as
that hack is now gone.

While I was testing this builtin, I found a bug with hash tables
in non-forking subshells. If the hash table of a non-forking
subshell is changed, the parent shell's hash table is also changed.
As an example, running `(hash -r)` was resetting the parent shell's
hash table. The workaround is to force the subshell to fork if the
hash table will be changed.

src/cmd/ksh93/bltins/typeset.c:
 - Move the code for hash out of the alias builtin into a dedicated
   hash builtin. `alias -t --` is no longer supported.

src/cmd/ksh93/data/aliases.c:
 - Remove the old alias for hash from the table of predefined aliases.

src/cmd/ksh93/data/builtins.c:
 - Fix the broken entry for the hash builtin and add a man page for
   the new builtin.

src/cmd/ksh93/sh.1:
 - Replace the entry for the hash alias with a more detailed entry
   for the hash builtin.

src/cmd/ksh93/sh/name.c:
 - Force non-forking subshells to fork if the PATH is being reset
   to work around a bug with the hash tree.

src/cmd/ksh93/tests/alias.sh:
 - Add a regression test for resetting a hash table, then adding
   a utility to the refreshed hash table.

src/cmd/ksh93/tests/subshell.sh:
 - Add regression tests for changing the hash table in subshells.

(cherry picked from commit d8428a833afe9270b61745ba3d6df355fe1d5499)
2020-06-12 01:45:18 +02:00
Johnothan King
e92faddbf9 Fix 39 spelling errors and a formatting issue
A column of whitespace in the NEWS file was removed for consistent
formatting. Most of the spelling errors were found with this
codespell dictionary:
https://github.com/orbitcowboy/codespell_dictionary

(cherry picked from commit 0e36b17abe5609c461a3e4da7041eb0fdf9991b7)
2020-06-12 01:45:18 +02:00
Martijn Dekker
a1f46d785f rm "I/O error" error msg; just keep >0 exit status (re: 9011fa93)
The bug was really that I/O errors in output builtins were
undetectable by any means. Having a >0 exit status is sufficient.
Adding an error message risks making existing ksh scripts noisier,
or even breaking them if they redirect stderr to stdout.

Note to self: in future, implement the minimum change necessary to
fix bugs, nothing more. The fact that I needed to add four extra
2>/dev/null to the regression tests should have been a hint.

src/cmd/ksh93/bltins/print.c,
src/cmd/ksh93/data/msg.c,
src/cmd/ksh93/include/io.h:
- Remove "I/O error" message.

src/cmd/ksh93/tests/builtins.sh:
- Update to check for exit status only.

src/cmd/ksh93/tests/basic.sh,
src/cmd/ksh93/tests/coprocess.sh:
- Revert four new '2>/dev/null' to suppress the error message.

(cherry picked from commit 5e17be24d18455b575b6e98bc631c6935ffc795a)
2020-06-12 01:45:18 +02:00
Johnothan King
5d50f825e4 The unalias builtin should return an error for non-existent aliases
This commit fixes a bug that caused unalias to return a zero status
when it tries to remove an alias twice. The following set of commands
will no longer print 'Error':

$ alias foo=bar
$ unalias foo
$ unalias foo && echo 'Error'

This commit is based on the fix present in ksh2020, but it has been
extended with another bugfix. The initial fix for this problem tried to
remove aliases from the alias tree without accounting for NV_NOFREE. This
caused any attempt to remove a predefined alias (e.g. `unalias float`)
to trigger an error with free, as all predefined aliases are in read-only
memory. The fix for this problem is to set NV_NOFREE when removing aliases
from the alias tree, but only if the alias is in read-only memory. All
other aliases must be freed from memory to prevent memory leaks.

I'll also note that I am using an `isalias` variable rather than the `type`
enum from ksh2020, as the `VARIABLE` value is never used and was replaced
with a bool called `aliases` in the ksh2020 release. The `isalias` variable
is an int as the ksh93u+ codebase does not use C99 bools.

Previous discussion: https://github.com/att/ast/issues/909

- src/cmd/ksh93/bltins/typeset.c:
  Remove aliases from the alias tree by using nv_delete. NV_NOFREE
  is only used when it is necessary.

- src/cmd/ksh93/tests/alias.sh:
  Add two regression tests for the bugs fixed by this commit.

(cherry picked from commit 16d5ea9b52ba51f9d1bca115ce8f4f18e97abbc4)
2020-06-12 01:45:18 +02:00
Johnothan King
b5e52703e9 Fix an issue with the up arrow key in Emacs editing mode.
Emacs editing mode is bugged in ksh93u+ and ksh2020. Let's
say you were to run the following commands after starting
a fresh instance of ksh:

$ alias foo='true'
$ unalias foo

If you type 'a' and then press the up arrow on your keyboard,
ksh will complete 'a' to `alias foo='true'` by doing a reverse
search for the last command that starts with 'a'.
Run the alias command again, then type 'u' and press the up
arrow key again. If ksh is in Vi mode, you will get `unalias foo`,
but in Emacs mode you will get `alias foo='true'` again.

This bug was occurring because ksh was only doing a reverse search
based on the first command that was completed using the up arrow.
All subsequent commands were ignored as ksh was saving the first
command and only based later searches off of it.

NEWS:
 - Add instructions for reproducing this bug with the up arrow
   key and information about why this bug was happening in the
   first place.

src/cmd/ksh93/edit/emacs.c:
 - Remove a bad check that was preventing ksh from using the
   latest input for reverse search.

(cherry picked from commit 745b065487ad6bac693ec6f44752f96e87f9a63b)
2020-06-12 01:45:17 +02:00
Martijn Dekker
d024d4c895 Fix signal handling due to exit status > 256
This fixes two bugs: issuing the 'exit' command with a value > 256
would cause ksh 93u+ to kill itself with the corresponding signal
(try 'exit 265' to SIGKILL your interactive shell), and, if the
last command of a script exits due to a signal, the shell would
repeat that signal to itself, causing any parent ksh to also be
killed.

Discussion:
https://bugzilla.redhat.com/show_bug.cgi?id=1469624
https://rainbow.chard.org/2017/03/21/ksh-deliberately-segfaults-if-the-last-command-in-a-script-crashes/

This commit is loosely based on a patch applied to the 93v- beta
and the abandoned ksh2020, but that patch was incomplete & broken:
  $ ksh-2020.0.0 -c 'exit 265'; echo $?
  137
Expected: 9. Since the exit was *not* due to a signal, the value
should simply be cropped to the 8 bits supported by the OS.
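
The arithmetic behind those two numbers can be sketched in plain
shell. The helper names below are hypothetical, and the mask of 255
is an assumption standing in for "the 8 bits supported by the OS":

```shell
# Crop an 'exit' argument to the low 8 bits (0-255).
crop_exit_status() {
    printf '%d\n' "$(( $1 & 255 ))"
}

# Conventional status reported for a process that was terminated by
# signal number $1 (128 + signal number).
signal_exit_status() {
    printf '%d\n' "$(( 128 + $1 ))"
}
```

With these, 'crop_exit_status 265' gives the expected 9, whereas
'signal_exit_status 9' gives 137, which is the value that is only
appropriate when the exit really was caused by signal 9.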

src/cmd/ksh93/bltins/cflow.c: b_exit():
- For the 'exit' builtin command, bitwise-AND the argument to
  'exit' with SH_EXITMASK (8 bits, crop to 0-255) before passing it
  on to sh_exit(). This restores the behaviour of <=2011 ksh93
  versions and is in line with all other POSIX shells.
  It also fixes this bogosity:
    $ (exit 265); echo $?                   # non-forked subshell
    265
    $ (ulimit -t unlimited; exit 265); echo $?  # forked subshell
    9
  Forked or non-forked should make no difference at all
  (see commit message a0e0e29e for why).

src/cmd/ksh93/sh/fault.c: sh_done():
- If the current exit status is equal to the status for the last
  signal that was received from a child process, remove the
  SH_EXITSIG (9th) bit, so that the shell doesn't kill itself.
- If the shell's last child process exits due to a signal, exit
  with a portable 8-bit exit status (128 + signal number). This
  avoids the exit status being < 128 by being cropped to 8 bits.

src/cmd/ksh93/tests/signal.sh:
- Add regression test for exit with status > 256.
- Add regression test verifying the shell no longer kills itself.

(cherry picked from commit 98e0fc94393e175ce6adfee390327c320795bf12)
2020-06-12 01:45:17 +02:00
Martijn Dekker
f88f302c38 Remove code related to long-dead 3DFS research project
This commit gets rid of dead weight related to an obscure early
1990s Bell Labs versioning file system research project called
3DFS, which has not existed for decades and for which I have not
managed to find any evidence that it was ever used outside the lab.

This removes:
- the SHOPT_FS_3D compile option (which was forced on even when 0)
- the obnoxious default alias 2d='set -f;_2d' that turned off your
  globbing and then tried to run a nonexistent _2d command
- undocumented builtins 'vmap' and 'vpath' that only errored out
- a non-functional -V unary operator for the test and [[ commands
- some specific code for Apollo workstations (last made in 1997),
  which was inseparably intertwined with the 3DFS code

(cherry picked from commit 20cdf3709f4fb4e468057b534dcee819b1961fb6)
2020-06-12 01:45:17 +02:00
Martijn Dekker
5f8b0512f0 POSIX compliance fix: apply 'set -u' to $!
POSIX requires[*] that expanding any unset parameter other than $@
and $* is an error when 'set -u'/'set -o nounset' is active.
However, on ksh93, $! was exempt as well. That is a bug.
[*] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_25

src/cmd/ksh93/sh/macro.c:
- special(): Handle 'set -u' for special parameters if/when it is
  about to return NIL. That code path is currently only possible to
  reach for "$!", but this is future-proof and will do the right
  thing if any other special parameter can ever have no value.

src/cmd/ksh93/tests/options.sh:
- Add and tweak 'set -u' regression tests.

(cherry picked from commit 75cc7a38cafe3a9929e1ed17d8b952babda22a09)
2020-06-12 01:45:17 +02:00
Martijn Dekker
36da314c9e POSIX compliance fix: apply 'set -u' to $1, $2, ...
POSIX requires[*] that expanding any unset parameter other than $@
and $* is an error when 'set -u'/'set -o nounset' is active.
However, on ksh93, $1, $2, ... were exempt as well. That is a bug.
[*] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_25
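
A rough way to check this rule against whatever 'sh' is installed is
sketched below. The function names are hypothetical and the check
assumes a reasonably current POSIX shell; it is not the regression
test from this commit.

```shell
# Succeeds if expanding an unset $1 under 'set -u' is an error.
nounset_rejects_unset_positional() {
    ! sh -c 'set -u; : "$1"' 2>/dev/null
}

# Succeeds if $@ and $* stay usable under 'set -u' with no arguments.
nounset_allows_at_and_star() {
    sh -c 'set -u; : "$@" "$*"' 2>/dev/null
}
```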

src/cmd/ksh93/sh/macro.c:
- varsub(): Backport code for handling 'set -u' for positional
  parameters from the ast 2016-10-01-beta branch.

src/cmd/ksh93/tests/options.sh:
- Add relevant regression tests.

src/cmd/ksh93/sh.1:
- Document that $@ and $* are exempt from 'set -u'.

(cherry picked from commit f954c6be0748c4c38a680a75f27564965fbd328e)
2020-06-12 01:45:17 +02:00
Martijn Dekker
61d9bca581 POSIX compliance: rm harmful default aliases 'command '/'nohup '
Continuing alias substitution after 'command' (due to the final
space in the alias) is inherently broken and doing so by default is
incompatible with the POSIX standard, as aliases may contain
arbitrary shell grammar.

For instance, until the previous commit, the POSIX standard 'times'
command was an alias: times='{ { time;} 2>&1;}' -- and so, of
course, 'command times' gave a syntax error, although this is
a perfectly valid POSIX idiom that must be supported.

'command' is specified by POSIX as a regular builtin, not an alias.
Therefore it should always bypass aliases just as it bypasses
functions to expose standard builtin and external commands.

I can only imagine that the reason for this command='command '
alias was that some standard commands themselves were implemented
as aliases, and POSIX requires that standard commands are
accessible with the 'command' prefix. But implementing standard
commands as aliases is itself inherently broken. It never worked
for 'command times', as shown; and in any case, removing all
aliases with 'unalias -a' should not get rid of standard commands.

Similarly, the default alias nohup='nohup ' is also harmful.

Anyone who really wants to keep this behaviour can just define
these aliases themselves in their script or ~/.kshrc file.

src/cmd/ksh93/data/aliases.c:
- Remove default alias command='command '.
- Remove default alias nohup='nohup '.

src/cmd/ksh93/sh.1:
- Remove the above default aliases from the list.
- Mention that the 'command' builtin does not search for aliases.

(cherry picked from commit 5cfe7c4e2015b7445da24983af5008035c4b6e1e)
2020-06-12 01:45:16 +02:00
Martijn Dekker
65d363fd34 POSIX compliance fix: make 'times' a proper builtin
As of this commit, the 'times' command is a POSIX-compliant special
builtin command instead of an alias that doesn't produce the
required output. It displays the accumulated user and system CPU
times, one line with the times used by the shell and another with
those used by all of the shell's child processes.
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_27
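
As a quick sanity check of the two-line output format described
above, something like the following sketch could be used. The
function name is hypothetical and this is not one of the regression
tests added here.

```shell
# Count the lines written by the 'times' command: one line for the
# shell itself and one for its child processes.
times_line_count() {
    times | wc -l | tr -d ' '
}
```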

This was originally written by Kurtis Rader and is now backported
and tweaked from the abandoned ksh2020 branch. I chose an earlier
and simpler version[*1] that uses times(3), with a precision of
hundredths of seconds, so it outputs the same precision as mksh and
zsh. Rader later wrote another version[*2] that uses getrusage(2),
giving it the same millisecond precision as bash. But that required
adding a feature test and a fallback to the old version, which is
non-trivial in the old INIT/iffe system. This simpler version is
enough to gain POSIX compliance and I think it will do very nicely
in this stable bugfix branch.

[*1] https://github.com/att/ast/pull/1332
[*2] https://github.com/att/ast/commit/038045f6

src/cmd/ksh93/bltins/misc.c:
- Add b_times() function for 'times' builtin.
- Note we include <times.h>, not <sys/times.h>, so that we use the
  AST feature-tested version with fallback on systems that need it.

src/cmd/ksh93/data/aliases.c:
- Remove times='{ { time;} 2>&1;}' builtin alias.

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/include/builtins.h:
- Add entry for 'times' special builtin.
- Add --help/--man info for same.

src/cmd/ksh93/sh.1:
- Update manual page.

src/cmd/ksh93/tests/builtins.sh:
- Add a couple of simple regression tests.

(cherry picked from commit ebf71e619eb298ec1cf6b81d1828fa7cdf6e9203)
2020-06-12 01:45:16 +02:00
Martijn Dekker
8e97419b0b Fix ${.sh.subshell} counter to actually count level of subshells
This counter is documented as follows:
"The current depth for subshells and command substitution."

But before this commit, the actual behaviour was that the counter
was reset to zero whenever a subshell forked for any reason: a
pipe, background job, running 'ulimit', redirecting stdout in a
command substitution, and more. This behaviour was:

1. Not consistent with the documentation. Non-forked (a.k.a.
   virtual) subshells are an internal implementation detail which
   scripts should not have to be concerned with. The manual page
   doesn't mention them at all.

2. Inherently broken. Since a subshell may fork for any number of
   reasons, even mid-run, and those reasons may change with
   bugfixes and further development, scripts have never actually
   been able to rely on the value of ${.sh.subshell}.

So, this commit fixes the counter to count the levels of all
subshells, both virtual and forked.

src/cmd/ksh93/sh/xec.c: _sh_fork():
- Increase ${.sh.subshell} whenever we fork.

src/cmd/ksh93/sh/subshell.c:
- sh_subfork():
  * Fix comment to properly explain what it does. It doesn't
    "create" a subshell, it forks off an existing virtual subshell.
  * Don't zero ${.sh.subshell}. Instead, since sh_fork() increases
    it but we're forking an existing subshell, undo the increase.
- sh_subshell():
  * Remove 'int16_t subshell' variable. It was unnecessary and
    mostly unused. It was also the wrong type: it was assigned the
    value from shp->subshell which is of type short.
  * Increase and decrease the level of virtual subshells and
    ${.sh.subshell} independently.

src/cmd/ksh93/tests/variables.sh:
- Add regression tests for ${.sh.subshell} in virtual and forked
  subshells of several kinds: comsub, parentheses, pipe, bg job.
- Undo wrong error test count fix from 04b4aef0.

(cherry picked from commit a0e0e29e7e0dbf21e4b3958ae02bde6665fb2696)
2020-06-12 01:45:16 +02:00
Martijn Dekker
8b07d2a011 Fix various crashes by removing invalid memccpy() use
The sfputr() function (put out a null-terminated string) contained
a use of memccpy() that was invalid and could cause crashes,
because the buffer it was copying into could overlap or even be
identical with the buffer being copied from.

Among (probably) other things, this commit fixes a crash in 'print
-v' (print a compound variable structure) on macOS, that caused the
comvar.sh and comvario.sh regression tests to fail spectacularly.
Now they pass.

Issue discovered and fixed by Kurtis Rader in the abandoned ksh2020
branch; this commit backports the fix. He wrote:

| #if _lib_memccpy && !__ia64 /* these guys may never get it right */
|
| The problem is that assertion is wrong. It implies that the libc
| implementation of memccpy() on IA64 is broken. Which is
| incorrect. The problem is the AST sfputr() function is depending
| on what is explicitly undefined behavior in the face of
| overlapping source and destination buffers.
| [...] Using memccpy() simply complicates the code and is unlikely
| to be measurably, let alone noticeably, faster.

Further discussion/analysis: https://github.com/att/ast/issues/78

src/lib/libast/sfio/sfputr.c:
- Remove memccpy use. Always use the manual copying loop.

(cherry picked from commit fbe3c83335256cb714a4aa21f555083c9f1d71d8)
2020-06-12 01:45:16 +02:00
Martijn Dekker
6f0e008cf7 Fix unsetting special vars in subshells (re: efa31503, 8b7f8f9b)
This fixes (or at least works around) a bug that caused special
variables such as PATH, LANG, LC_ALL, LINENO, etc. to lose their
effect after being unset in a subshell.

For example:
(unset PATH; PATH=/dev/null; ls); : wrongly ran 'ls'
(unset LC_ALL; LC_ALL=badlocale); : failed to print a diagnostic
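
The first example can be turned into a self-checking sketch. The
function name is hypothetical, and the check assumes that a command
lookup failure is reported with exit status 127:

```shell
# Succeeds if a PATH that is unset and then reassigned inside a
# subshell is actually honoured, so that 'ls' is not found.
path_reassignment_takes_effect() {
    status=0
    ( unset PATH; PATH=/dev/null; ls ) >/dev/null 2>&1 || status=$?
    [ "$status" -eq 127 ]
}
```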

This is yet another problem with non-forking/virtual subshells. If
you forced the subshell to fork (one way of doing this is using the
'ulimit' builtin, e.g. ulimit -t unlimited) before unsetting the
special variable, the problem vanished.

I've tried to localise the problem. I suspect the sh_assignok()
function, which is called from unall(), is to blame. This function
is supposed to make a copy of a variable node in the virtual
subshell's variable tree. Apparently, it fails to copy the
associated permanent discipline function settings (stored in the
np->nvfun->disc pointer) that gave these variables their special
effect, and which survive unset. However, my attempts to fix that
have been unsuccessful. If anyone can figure out a fix, please send
a patch/pull request!
Data point: This bug existed in 93u 2011-02-08, but did not yet
exist in M-1993-12-28-s+. So it is a regression.

Meanwhile, pending a proper fix, this commit adds a safe
workaround: it forces a non-forked subshell to fork before
unsetting such a special variable.

src/cmd/ksh93/bltins/typeset.c: unall():
- If we're in a non-forked, non-${ ...; } subshell, then before
  unsetting any variable, check for variables with internal
  trap/discipline functions, and call sh_subfork() if any are
  found. To avoid crashing, this must be done before calling
  sh_pushcontext(), so we need to loop through the args separately.

src/cmd/ksh93/tests/variables.sh:
- Remove the 'ulimit' that forced the fork; we do this in C now.

(cherry picked from commit 21b1a67156582e3cbd36936f4af908bb45211a4b)
2020-06-12 01:45:16 +02:00
Martijn Dekker
1026006db3 Fix BUG_KBGPID: $! was not updated under certain conditions
The $! special parameter was not set if a background job
(somecommand &) or co-process (somecommand |&) was launched as the
only command within a braces block with an attached redirection,
for example:
	{
		somecommand &
	} >&2
With the bug, $! was unchanged; now it contains the PID of
somecommand.
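
A minimal check for this behaviour might look like the sketch below.
The function name is hypothetical, and 'true' merely stands in for
somecommand:

```shell
# Succeeds if $! changes after launching a background job as the only
# command in a braces block with an attached redirection.
bgpid_updated_in_redirected_block() {
    true &
    first=$!
    {
        true &
    } >/dev/null
    second=$!
    wait
    [ "$second" != "$first" ]
}
```

Both jobs are reaped only at the final 'wait', so the first PID
cannot be recycled for the second job in the meantime.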

Ref.: https://github.com/att/ast/issues/1357

src/cmd/ksh93/sh/parse.c: item():
- When processing redirections following a compound command, always
  create a parent node with the TSETIO (I/O redirection) token.
     Before this commit, if the last command was of type TFORK (and
  the last command only tested as TFORK if the bg job or coprocess
  was the only command in a braces block, because the ksh parser
  optimises away the braces in that case), then the parent node was
  created with the TFORK token instead.
     I have no idea what David Korn's intention was with that, but
  this is clearly very wrong. Creating another TFORK node when
  parsing the redirection caused sh_exec() in sh/xec.c to execute
  the redirection in an extra forked, non-background subshell.
  Since redirections are executed before anything else, this
  subshell is what then launched the background job between the
  braces, so $! (a.k.a. shp->bckpid) was updated in that subshell
  only, and never in the main shell. The extra subshell also
  prevented the background job from being noticed by job control
  on interactive shells.
     So, the fix is simply to remove the broken test for TFORK.

src/cmd/ksh93/tests/variables.sh:
- Add regression tests for a bg job and a co-process as the only
  command within a braces block with attached redirection.

(cherry picked from commit ffe5df30e69f7b596941a98498014d8e838861f2)
2020-06-12 01:45:15 +02:00
Martijn Dekker
eee47df423 Fix handling of skipped directories when autoloading functions
Fix a bug in autoloading functions. Directories in the path search
list which should be skipped (e.g. because they don't exist) did
not interact correctly with autoloaded functions, so that a
function to autoload was not always found.

Details:
https://github.com/att/ast/issues/1454

Fix backported (and cleaned up) from:
https://github.com/att/ast/commit/3bc58164

src/cmd/ksh93/sh/path.c:
- path_opentype(): Fix the path search loop so that entries marked
  with PATH_SKIP are handled correctly.

src/cmd/ksh93/tests/functions.sh:
- Add regression test verifying an autoloaded function with a PATH
  that triggered the bug.
  The bug in path_opentype() fixed by this commit may affect other
  scenarios but we know it affects autoloaded functions. Hence the
  test for that scenario.

(cherry picked from commit a27903165775309f4f032de5d42ec1785f14cfbc)
2020-06-12 01:45:15 +02:00
Martijn Dekker
482d1c3dd6 fix 24 more typos found with the help of codespell
(cherry picked from commit a92198bc5f196ec1b4a34dc042ff3a594e316ad7)
2020-06-12 01:45:15 +02:00
Martijn Dekker
7003aba487 Fix 'test'/'[' exit status >1 on error in arithmetic expression
Fix BUG_TESTERR1A: POSIX non-compliance of 'test'/'[' exit status
on error. The command now returns status 2 instead of 1 when given
an invalid number or arithmetic expression, e.g.: [ 123 -eq 123x ]
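
The expected status can be observed with a sketch like this one
(hypothetical function name; the diagnostic on stderr is discarded):

```shell
# Print the exit status of '[' when given an invalid number.
test_status_for_bad_number() {
    status=0
    [ 123 -eq 123x ] 2>/dev/null || status=$?
    printf '%s\n' "$status"
}
```

Any printed value greater than 1 distinguishes "error" from the
ordinary "false" result of status 1.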

The problem was that the test builtin (b_test()) calls the generic
arithmetic evaluation subsystem (sh/arith.c, sh/streval.c) which
has no awareness of the test builtin. A simple solution would be to
always make the arithmetic subsystem use an exit status > 1 for
arithmetic errors, but globally changing this may cause backwards
compatibility issues. So it's best to change the behaviour of the
'test' builtin only. This requires the arithmetic subsystem to be
aware of whether it was called from the 'test' builtin or not. To
that end, this commit adds a global flag and overrides the
ERROR_exit macro where needed.

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/defs.c:
- Declare and initialise a global sh_in_test_builtin flag.
- Declare internal function for ERROR_exit override in test.c.

src/cmd/ksh93/bltins/test.c:
- Add override for ERROR_exit macro using a function that checks if
  the exit status is at least 2 if the error occurred while running
  the test builtin.
- b_test(): Set sh_in_test_builtin flag while running test builtin.

src/cmd/ksh93/sh/arith.c,
src/cmd/ksh93/sh/streval.c:
- Override ERROR_exit macro using function from test.c.

src/cmd/ksh93/tests/bracket.sh:
- Add regression test verifying status > 1 on arith error in test.

(cherry picked from commit 5eeae5eb9fd5ed961a5096764ad11ab870a223a9)
2020-06-12 01:45:15 +02:00
Martijn Dekker
ec888867fd Fix unsetting aliases in subshells
Aliases can now be correctly unset within subshell environments
(such as ( ... ), $(command substitutions), etc), as well as
non-subshell "shared" command substitutions (${ ...; }). Before,
attempts to unset aliases within these were silently ignored.

Prior discussion: https://github.com/att/ast/issues/108

Subshell alias trees are only referenced in a few places in the
code, *and* have always been broken, so this commit gets rid of the
whole notion of a subshell alias tree. Instead, there is now just
one flat alias tree, and subshells fork into a separate process
when aliases are set or unset within them. It is not really
conceivable that this could be a performance-sensitive operation,
or even a common one, so this is a clean fix with no downside.
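
The intended behaviour can be sketched as follows. The alias name
demo_greet and the helper alias_state are assumptions made for this
example only.

```shell
# alias_state is a hypothetical helper: report whether alias $1 is defined.
alias_state() {
	if alias "$1" >/dev/null 2>&1; then echo set; else echo unset; fi
}

alias demo_greet='echo hello'

# Remove the alias inside a command substitution (a subshell) only.
in_subshell=$(unalias demo_greet; alias_state demo_greet)

# The parent shell's alias must be unaffected.
in_parent=$(alias_state demo_greet)

echo "in subshell: $in_subshell"
echo "in parent: $in_parent"
```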

src/cmd/ksh93/include/defs.h:
- Remove sh_subaliastree() definition.

src/cmd/ksh93/sh/subshell.c:
- Remove salias element (pointer to subshell alias tree) from
  subshell struct.
- Remove sh_subaliastree() function.
- sh_subshell(): Remove alias subshell tree cleanup.

src/cmd/ksh93/bltins/typeset.c:
- b_alias(): If in subshell, fork before setting alias.
- b_unalias(): If in subshell, fork before unsetting alias.
- unall(): Remove sh_subaliastree() call.

src/cmd/ksh93/sh/name.c:
- nv_open(): Remove sh_subaliastree() call.

src/cmd/ksh93/tests/subshell.sh:
- Add regression tests for unsetting or redefining aliases within
  subshells.

(cherry picked from commit 12a15605b9521a2564a6e657905705a060e89095)
2020-06-12 01:45:14 +02:00
Martijn Dekker
047cb3303c Fix redefining & unsetting functions in subshells (BUG_FNSUBSH)
Functions can now be correctly redefined and unset in subshell
environments (such as ( ... ), $(command substitutions), etc).
Before this fix, attempts to do this were silently ignored (!!!),
causing the wrong code (i.e.: the function by the same name from
the parent shell environment) to be executed.

Redefining and unsetting functions within "shared" command
substitutions of the form '${ ...; }' is also fixed.
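
A short sketch of the expected behaviour for ordinary subshells,
assuming a fixed shell. The function name fnsubsh_demo is hypothetical
and only chosen to avoid clashing with a real command.

```shell
# fnsubsh_demo is a hypothetical function name used for illustration.
fnsubsh_demo() { echo parent; }

# Redefine the function inside a subshell and call it there.
redefined=$(fnsubsh_demo() { echo child; }; fnsubsh_demo)

# Unset the function inside a subshell and check that it is gone there.
unset_state=$(
	unset -f fnsubsh_demo
	if command -v fnsubsh_demo >/dev/null 2>&1; then echo defined; else echo gone; fi
)

# The parent shell's definition must survive both subshells.
after=$(fnsubsh_demo)

echo "$redefined $unset_state $after"
```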

Prior discussion: https://github.com/att/ast/issues/73

src/cmd/ksh93/sh/parse.c:
- A fix from George Koehler (URL above). He writes:
  | The parser can set t->comnamp to the wrong function.
  | Suppose that the shell has executed
  |     foo() { echo WRONG; }
  | and is now parsing
  |     (foo() { echo ok; } && foo)
  | The parser was setting t->comnamp to the wrong foo. [This
  | fix] doesn't set t->comnamp unless it was a builtin. Now the
  | subshell can't call t->comnamp, so it looks for foo and finds
  | the ok foo in the subshell's function tree.

src/cmd/ksh93/bltins/typeset.c:
- Unsetting functions in a virtual/non-forked subshell still
  doesn't work: nv_open() fails to find the function. To work
  around this problem, make 'unset -f' fork the subshell into its
  own process with sh_subfork().
- The workaround exposed another bug: if we unset a function in a
  subshell tree that overrode a function by the same name in the
  main shell, then nv_delete() exposes the function from the main
  shell scope. Since 'unset -f' now always forks a subshell, the
  fix is to simply walk through troot's parent views and delete any
  such zombie functions as well. (Without this, the 4 'more fun'
  tests in tests/subshell.sh fail.)

src/cmd/ksh93/sh/subshell.c: sh_subfuntree():
- Fix function (re)definitions and unsetting in "shared" command
  substitutions of the form '${ commandlist; }' (i.e.: if
  sp->shp->subshare is true). Though internally this is a weird
  form of virtual subshell, the manual page says it does not
  execute in a subshell (meaning, all changes must survive it), so
  a subshell function tree must not be created for these.

src/cmd/ksh93/tests/subshell.sh:
- Add regression tests related to these bugfixes. Test unsetting
  and redefining a function in all three forms of virtual subshell.

(cherry picked from commit dde387825ab1bbd9f2eafc5dc38d5fd0bf9c3652)
2020-06-12 01:45:14 +02:00
Martijn Dekker
593a5a8b7f Patch vulnerability CVE-2019-14868
Certain environment variables were interpreted as arithmetic
expressions on startup, leading to code injection.

Ref.:
https://bugzilla.redhat.com/show_bug.cgi?id=1757324
c7de8b6412

(cherry picked from commit ee6b001d0611ad2e00b6da2c2b42051995c0a678)
2020-06-12 01:45:14 +02:00
Martijn Dekker
e999f6b169 Fix truncating of files with <>;file combined with <#pattern
The issue with truncating files was caused by out-of-sync streams.
Details and discussion: https://github.com/att/ast/issues/61

src/cmd/ksh93/sh/io.c: sh_iorestore():
- To be safe, sync all streams before restoring file descriptors.

src/cmd/ksh93/tests/io.sh:
- Add two regression tests for truncating files with this
  combination of redirections.
- The second test, which invokes a -c script, is disabled for now
  as this triggers another corner case bug involving the SH_NOFORK
  optimisation for -c scripts. That fix is for another commit.

(cherry picked from commit 18fb64840365c2ff4608188e5487bd79d08f67d1)
2020-06-12 01:45:14 +02:00
Martijn Dekker
952944197f Fix bugs in testing if a parameter is set
This fixes three related bugs:
1. Expansions like ${var+set} remained static when used within a
   'for', 'while' or 'until' loop; the expansions didn't change
   along with the state of the variable, so they could not be used
   to check whether a variable is set within a loop if the state of
   that variable changed in the course of the loop. (BUG_ISSETLOOP)
2. ${IFS+s} always yielded 's', and [[ -v IFS ]] always yielded
   true, even if IFS is unset. (BUG_IFSISSET)
3. IFS was incorrectly exempt from '-u' ('-o nounset') checks.
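
The first two cases can be sketched like this. The variable names v,
trace and ifs_state are assumptions for this example; it is a rough
check, not the regression test added by this commit.

```shell
# Toggle a variable inside a loop and record what ${v+set} reports each time.
unset -v v
trace=
for i in 1 2 3; do
	if [ -n "${v+set}" ]; then
		trace="$trace$i:set "
		unset -v v	# flip the state for the next iteration
	else
		trace="$trace$i:unset "
		v=x		# flip the state for the next iteration
	fi
done
echo "$trace"

# After unsetting IFS, ${IFS+s} should expand to nothing.
# This runs in a subshell so the parent's IFS is untouched.
ifs_state=$(unset -v IFS; echo "[${IFS+s}]")
echo "$ifs_state"
```

With BUG_ISSETLOOP present, the expansion would keep its first result
for every iteration instead of following the variable's state.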

src/cmd/ksh93/sh/macro.c: varsub():
- When getting a node pointer (np) to the parameter to test,
  special-case IFS by checking if it has a value and not setting
  the pointer if not. The node to IFS always exists, even after
  'unset -v IFS', so before this fix it always followed the code
  path for a parameter that is set. This fixes BUG_IFSISSET for
  ${IFS+s} and also fixes set -u (-o nounset) with IFS.
- Before using the 'nv_isnull' macro to check if a regular variable
  is set, call nv_optimize() if needed. This fixes BUG_ISSETLOOP.
  Idea from Kurtis Rader: https://github.com/att/ast/issues/1090
  Of course this only works if SHOPT_OPTIMIZE==1 (the default),
  but if not, then this bug is not triggered in the first place.
- Add some comments for future reference.

src/cmd/ksh93/bltins/test.c: test_unop():
- Fix BUG_IFSISSET for [[ -v IFS ]]. The nv_optimize() method
  doesn't seem to have any effect here, so the only way that I can
  figure out is to special-case IFS, nv_getval()'ing it to check if
  IFS has a value in the current scope.

src/cmd/ksh93/tests/variables.sh:
- Add regression tests for checking if a variable is set within a
  loop, within and outside a function with that variable made local
  (to check if the scope is honoured). Repeat these tests for a
  regular variable and for IFS, for ${foo+set} and [[ -v foo ]].

(cherry picked from commit a2cf79cb98fa3e47eca85d9049d1d831636c9b16)
2020-06-12 01:45:14 +02:00
Martijn Dekker
c9ccee86bb Fix 'command -p' by fixing initialisation of default PATH variable
'command -p' was broken for non-interactive shells as the variable
used to store the default system PATH, std_path, was not
initialised correctly. For instance:
	$ ksh -c 'command -p ls'
	ksh: ls: not found
This fix by Siteshwar Vashisht is backported from ksh2020.
Ref.:
https://github.com/att/ast/issues/426
https://github.com/att/ast/pull/448
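
A sketch of the typical script use: 'command -p' should still find a
standard utility when PATH itself is unusable. The directory
/nonexistent is an assumption for this example, and the check presumes
ls lives in the system's default path.

```shell
# Run ls through 'command -p' while PATH points nowhere useful.
# The command substitution keeps the broken PATH from leaking into
# the rest of the script.
result=$(
	PATH=/nonexistent
	if command -p ls / >/dev/null 2>&1; then echo found; else echo "not found"; fi
)
echo "$result"
```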

src/cmd/ksh93/sh/path.c:
- Correctly initialise std_path (the default PATH) when ksh is
  started as a non-interactive shell.

src/cmd/ksh93/sh.1:
- Fix vague explanation of 'command -p'.

src/cmd/ksh93/tests/path.sh:
- Add regression test.

(cherry picked from commit a76439d60b70c18cf44d84c1962fcd8df84c947c)
2020-06-12 01:45:14 +02:00
Martijn Dekker
cafe33f048 Fix 'test -t 1' in $(command substitutions)
Standard output (FD 1) tested as being on a terminal within a
command substitution, which makes no sense as the command
substitution is supposed to be catching standard output.
    ksh -c 'v=$(echo begincomsub
                [ -t 1 ] && echo oops
                echo endcomsub)
            echo "$v"'
This should not output "oops".
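
The same check, written as a small function that a script might use to
decide whether to emit terminal-only output. The name stdout_kind is
hypothetical.

```shell
# stdout_kind is a hypothetical helper: say where standard output points.
stdout_kind() {
	if [ -t 1 ]; then echo terminal; else echo "not a terminal"; fi
}

# Inside $(...), standard output is the substitution's capture,
# so it should never test as a terminal.
captured=$(stdout_kind)
echo "$captured"
```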

This is one of the many bugs with ksh93 virtual (non-forked)
subshells. On the abandoned Vashisht/Rader ksh2020 branch, this bug
was fixed by changing quite a lot of code, which introduced and/or
exposed another bug:
	https://github.com/att/ast/issues/1079
	https://github.com/att/ast/commit/8e1e405e
	https://github.com/att/ast/issues/1088
That issue was unresolved when the ksh2020 branch was abandoned.

The safer and more conservative fix is simply forcing the subshell
to fork if we're in a non-forked command substitution and testing
'-t 1'. It is hard to imagine a situation where this would cause a
noticeable performance hit.

Note that this fix does not affect ksh93-specific "shared"
non-subshell ${ command substitutions; } which are executed in the
main shell environment, so that variables survive, etcetera.
'test -t 1' continues to wrongly return true there, but command
substitutions of that form cannot be forked because that would
defeat their purpose.

src/cmd/ksh93/bltins/test.c:
- Fix 'test -t 1', '[ -t 1 ]' and '[[ -t 1 ]]' by forking the
  current subshell if it is a virtual/non-forked subshell
  (shp->subshell), and a command substitution (shp->comsub), but
  NOT a "shared" ${ command substitution; } (!shp->subshare).

src/cmd/ksh93/tests/bracket.sh:
- Add two regression tests for this issue, which were adapted from
  the Vashisht/Rader ksh2020 branch.

NEWS, src/cmd/ksh93/include/version.h:
- Update.

(cherry picked from commit b8ef05e457ead65b83417699b8dd8632f855e2fa)
2020-06-12 01:45:13 +02:00
Martijn Dekker
2e7602da2a Add bin/shtests, convenient wrapper for regression tests
This new wrapper script sets up the correct environment for running
the ksh93 regression test suite. It allows running the tests
without AST nmake, which is not maintained in this repository.
An alternative ksh to test may be passed in the $KSH env var.

bin/shtests:
- Added. Sets up environment before passing control to
  'src/cmd/ksh93/tests/shtest'. Passes on any options given.

NEWS:
- Updated.

(cherry picked from commit 14ced94ed83991687c645a09bd2e45a5c2ffe8dc)
2020-06-12 01:45:13 +02:00
Martijn Dekker
93e15a3035 Fix BUG_PUTIOERR: Check for and report I/O error in output builtins
This allows scripts to check for a nonzero exit status on the
'print', 'printf' and 'echo' builtins and prevent possible infinite
loops if SIGPIPE is ignored.

sfsync() was already returning a negative value on I/O error, so
all we need to do is add a check. The stream buffer will need to be
filled before an I/O error can be detected, but this is the same on
other shells. See manual page: src/lib/libast/man/sfio.3
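
A minimal sketch of what a script can now rely on. Closing standard
output with '>&-' is an assumption used here to provoke a write error
portably; the regression test in this commit may trigger the error
differently. The helper name try_write is made up.

```shell
# try_write is a hypothetical helper. It writes with standard output
# closed (>&-), which forces a write error, and reports whether the
# output builtin noticed.
try_write() {
	if echo "some output" 2>/dev/null >&-; then
		echo "write ok"
	else
		echo "write failed"
	fi
}

try_write
```

A loop such as 'while echo y; do :; done' depends on the same nonzero
status to terminate once its output can no longer be written.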

Ref.: https://github.com/att/ast/issues/1093
      https://github.com/att/ast/pull/1363

src/cmd/ksh93/bltins/print.c: b_print():
- Make sure an error result from sfsync() is reflected in the
  output builtin's exit status (exitval).
- Write an I/O error message (e_io) if the exit status is nonzero.

src/cmd/ksh93/data/msg.c, src/cmd/ksh93/include/io.h:
- Add the e_io[] error message.

src/cmd/ksh93/tests/builtins.sh:
- Add I/O error regression test, checking for the correct error
  message and exit status. All three output builtins use the same
  b_print() function so we only need to test one.

src/cmd/ksh93/tests/basic.sh,
src/cmd/ksh93/tests/coprocess.sh:
- Redirect stderr on a few 'print' commands to /dev/null; these
  now issue an expected I/O error. This does not cause failures.

NEWS, TODO:
- Update.

(cherry picked from commit 9011fa933552e483dab460f7dd1593d64e059d94)
2020-06-12 01:45:13 +02:00
Martijn Dekker
846ad93272 Fix 'print -s -f'
This fix is backported from the Vashisht/Rader ksh2020 branch.

src/cmd/ksh93/bltins/print.c:
- Fix syncing history when print -s -f is used. For example, the
  following now correctly adds a 'cd' command to the history:
	print -s -f 'cd -- %q\n' "$PWD"
  Ref.:	https://github.com/att/ast/issues/425
	https://github.com/att/ast/pull/442

src/cmd/ksh93/include/version.h:
- Version date bump.

(cherry picked from commit 46ba7ecdc5c14cd73f6cb24b16c50bdc331a000e)
2020-06-12 01:45:13 +02:00
Martijn Dekker
eeee77edd1 Fix BUG_REDIRIO
ksh used to redirect standard output by default when no file
descriptor was specified with the rarely used '<>' reading/writing
redirection operator. It now redirects standard input by default,
as POSIX specifies and as all other POSIX shells do. To redirect
standard output for reading and writing, you now need '1<>'.
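
The difference between the two forms can be sketched as below, assuming
a POSIX-compliant shell and that mktemp is available. The variable
names are chosen for this example only.

```shell
tmp=$(mktemp)
printf 'AAAA\nBBBB\n' > "$tmp"

# Plain '<>' opens the file read/write on standard input (FD 0),
# so 'read' takes its line from the file.
read -r first <> "$tmp"

# '1<>' opens it read/write on standard output without truncating:
# only the first two bytes are overwritten.
printf 'xx' 1<> "$tmp"
contents=$(cat "$tmp")

rm -f "$tmp"
echo "$first"
echo "$contents"
```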

Ref.: https://github.com/att/ast/issues/75
      http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_07_07

(cherry picked from commit 29afc16c47824fc79ed092ae7704c525b1db6a0a)
2020-06-12 01:45:13 +02:00
Martijn Dekker
6a4972069f Fix BUG_CASELIT: pattern matching as literal string in 'case'
This fixes an undocumented 'case' pattern matching misbehaviour
(labelled BUG_CASELIT in modernish) that goes back to the original
Bourne shell, but wasn't discovered until 2018.

If a pattern doesn't match as a pattern, it's tried again as a
literal string. This breaks common validation use cases, such as:

n='[0-9]'
case $n in
( [0-9] )  echo "$n is a number" ;;
esac

would output "[0-9] is a number" as the literal string fallback
matches the pattern. As this misbehaviour was never documented
anywhere (not for Bourne, ksh88, or ksh93), and it was never
replicated in other shells (not even in ksh88 clones pdksh and
mksh), it is unlikely any scripts rely on it.
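
As a sketch of the validation pattern that now behaves as expected,
here is the same check wrapped in a function. The name is_digit is
hypothetical.

```shell
# is_digit is a hypothetical helper: succeed only for a single decimal digit.
is_digit() {
	case $1 in
	( [0-9] )	return 0 ;;
	esac
	return 1
}

for candidate in 7 x '[0-9]'; do
	if is_digit "$candidate"; then
		echo "$candidate: digit"
	else
		echo "$candidate: not a digit"
	fi
done
```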

Of course, a literal string fallback, should it be needed, is
trivial to implement correctly without this breakage:

case $n in
( [0-9] | "[0-9]")  echo "$n is a number or the number pattern" ;;
esac

src/cmd/ksh93/sh/xec.c:
- Remove trim_eq() function responsible for implementing the
  misbehaviour described above.

NEWS:
- Added. Document this bugfix.

Ref.:
- The problem: thread starting at
  https://www.mail-archive.com/austin-group-l@opengroup.org/msg02127.html
- The solution, thanks to George Koehler: comments/commits in
  https://github.com/att/ast/issues/476
- Modernish BUG_CASELIT bug test & documentation:
  https://github.com/modernish/modernish/commit/b2024ae3

(cherry picked from commit 8d6c8ce69884767a160c1e20049e77bdd849c248
with some extra edits to NEWS to update the info for this reboot)
2020-06-12 01:45:13 +02:00