1
0
Fork 0
mirror of git://git.code.sf.net/p/cdesktopenv/code synced 2025-02-13 11:42:21 +00:00
Commit graph

363 commits

Author SHA1 Message Date
Martijn Dekker
2c35a53964 Fix resuming external command run from POSIX function or dot script
This fixes yet another whopper of a bug in job control. And it's
been in every version of ksh93 back to 1995, the beginning of
ast-open-archive. <sigh>

This is also bug number 23 that is fixed by simply removing code.

Reproducer:

1. Run vi, or another suspendable program, from a dot script or
   POSIX function (ksh handles them the same way). So either:
   $ echo vi >v
   $ . ./v
   or:
   $ v() { vi; }
   $ v

2. Suspend vi by typing Ctrl+Z.

3. Not one but two jobs are registered:
   $ jobs -l
   [2] + 85513	Stopped                  . ./v
   [1] - 85512	Stopped                  . ./v

4. 'fg' does not work on either of them, just printing the job
   command name but not resuming the editor. The second job
   disappears from the table after 'fg'.
   $ fg %2
   . ./v
   $ fg %2
   ksh: fg: no such job
   $ fg %1
   . ./v
   $ fg %1
   . ./v

Either way, you're stuck with an unresumable vi that you have to
'kill -9' manually.

src/cmd/ksh93/sh/xec.c: sh_exec(): TFORK:
- Do *not* turn off the SH_MONITOR state flag (which tells ksh to
  keep track of jobs) when running an external command from a
  profile script or dot script/POSIX function. It's obvious that
  this results in an inconsistent job control state as ksh will not
  update it when the external command is suspended. The purpose of
  this nonsense is surely lost to history as it's been there since
  1995 or before. My testing says that removing it doesn't break
  anything. If that turns out to be wrong, then the breakage will
  have to be fixed in a correct way instead.
2022-01-28 05:15:42 +00:00
Johnothan King
c0567c5e1d Fix spurious syntax error when using ${foo[${bar}..${baz}]} (#436)
Attempting to use array subscript expansion with variables that
aren't set currently causes a spurious syntax error (in ksh93u+ and
older commits the reproducer crashes):
   $ ksh -c 'echo ${foo[${bar}..${baz}]}'  # Shouldn't print anything
   ksh: : arithmetic syntax error

src/cmd/ksh93/sh/macro.c:
- Backport a parser bugfix from ksh93v- 2012-08-24 that avoids
  setting mp->dotdot until the copyto() function's loop is
  finished.

src/cmd/ksh93/tests/arrays.sh:
- Add regression tests for this bug.
2022-01-27 16:08:54 +00:00
Martijn Dekker
fe268fcc91 Cygwin: workaround for ksh to execute #!-less scripts with itself
On Cygwin, ksh does not execute scripts without a #! path in a fork
of the ksh process as it does on other systems. Reproducer (run
from ksh):

  $ cat test.sh
  echo "${BASH_VERSION:-not bash}"
  echo "${.sh.version}"
  $ chmod +x test.sh
  $ ./test.sh
  4.4.12(3)-release
  ./test.sh: line 2: ${.sh.version}: bad substitution

The script was executed in bash instead of ksh.

After this fix, the output on Cygwin is like ksh on other systems:

  not bash
  Version AJM 93u+m/1.1.0-alpha+dev 2022-01-26

This also fixes a number of regression test failures, as quite a
few tests create and execute temp scripts without a hashbang path.

Analysis: On Cygwin, execve(2) happily executes shell scripts
without a #! path with /bin/sh (which is bash --posix). However,
ksh relies on execve(2) executing binaries or #! only, as it uses
an ENOEXEC failure to decide whether to fork and execute a #!-less
shell script with a reinitialized copy of itself via exscript().

src/cmd/ksh93/sh/path.c: path_spawn():
- Look at the magic first two bytes of the file; if it is "MZ"
  (Mark Zbikowski, originator of the .exe format) or "#!", continue
  as normal, otherwise simulate an ENOEXEC failure from execve(2)
  which will cause ksh to fall back on #!-less script execution.
2022-01-27 16:08:44 +00:00
Martijn Dekker
41ee12a527 Document history expansion and fix a few loose ends
src/cmd/ksh93/sh.1:
- Add a new section on history expansion mostly adapted from the
  "History substitution" section from the tcsh(1) man page. This
  has the standard BSD license which is already in the COPYRIGHT
  file. Inapplicable stuff was removed, some new stuff added.

src/cmd/ksh93/edit/hexpand.c,
src/cmd/ksh93/sh/io.c:
- Set '#' as the default history comment character as on bash;
  no longer disable it by default.
- Add the 'a' modifier as a synonym for 'g', as on bash.
- Remove pointless static keyword from np variable; since the
  value from previous calls is never used it can just be local.
- Use NV_NOADD flag with nv_open() to avoid pointlessly creating
  the node if the variable doesn't exist yet.
- Fix a bug in history expansion where the 'p' modifier had no
  effect if the 'histverify' option is on.
  Reproducer:
    $ set -H -o histv
    $ true a b c
    $ !!:p
    $ true a b c▁  <= instead of printed, the line is re-edited
  Expected:
    $ set -H -o histv
    $ true a b c
    $ !!:p
    true a b c
    $ ▁
  This is fixed by making 'p' remove the HIST_EVENT bit from the
  returned flag bits in hist_expand(), leaving only the HIST_PRINT
  flag, then adding special handling for this case to slowread()
  in io.c (print the line, then instead of executing it, continue
  and read the next line).
2022-01-25 03:45:47 +00:00
Martijn Dekker
cda8fc916f Fix a crashing bug in history expansion
Reproducer:

$ set -o histexpand
$ echo foo !#^:h !#^:&
/usr/local/bin/ksh: :&: no previous substitution
ksh(80822,0x10bc2a5c0) malloc: *** error for object 0x10a13bae3: pointer being freed was not allocated
ksh(80822,0x10bc2a5c0) malloc: *** set a breakpoint in malloc_error_break to debug
Abort

Analysis: In hist_expand(), the 'cc' variable has two functions:
it holds a pointer to a malloc'ed copy of the current line, and is
also used as a temporary pointer with functions like strchr().
After that temporary use, it is set to NULL again, because the
'done:' routine checks if it non-NULL to decide whether to free the
pointer. But if an error occurs, the function may jump straight to
'done' without first setting cc to NULL if it had been used as a
temporary pointer. It then tries to free an unallocated pointer.

src/cmd/ksh93/edit/hexpand.c: hist_expand():
- Eliminate this bad practice by using a separate variable for
  temporary pointer purposes.

(I was unable to reproduce the crash in a pty regression test,
though it is consistently reproducible in a real interactive
session. So I haven't added that test.)
2022-01-24 08:41:27 +00:00
Martijn Dekker
8afc4756e8 history expansion: add missing bounds check
So far all ksh versions accept event numbers referring to
nonexistent history events in history expansion (-H/-o histexpand),
e.g. !9999 is accepted even if the history file has no item 9999.
These expansions seem to show random content from the history file,
sometimes including binary data. Of course an "event not found"
error should have been thrown instead.

hist_expand() in hexpand.c calls hist_seek() (from history.c)
without any bounds checking except verifying the history event
number is greater than zero. This commit adds a bounds check
to hist_seek() itself as it's called from three other places
in history.c, so perhaps this fixes a few other bugs as well.

src/cmd/ksh93/edit/history.c: hist_seek():
- Use the hist_min() and hist_max() macros provided in history.h
  to check bounds. Note that hist_max() yields the number of the
  command line currently being entered, so the maximum for seeking
  purposes is actually its result minus 1.
2022-01-21 02:13:53 +00:00
Johnothan King
eaf7662daa Fix history expansion buffer overflow (#434)
History expansion currently crashes under ASan due to a buffer
overflow. Reproducer:

   $ set -H
   $ !!:s/old/new/

Explanation from <https://github.com/att/ast/issues/1369>:
> The problem is the code assumes the buffer allocated for a string
> stream is zero initialized. But the SFIO code uses malloc() to
> allocate the buffer and does not explicitly initialize it with
> memset(). That it works at all, even without ASAN enabled, is
> purely accidental. It will fail if that malloc() returns a block
> that had been previously allocated, used, and freed. Under ASAN
> the buffer is initialized (at least on my system) to a sequence
> of 0xBE bytes. So the strdup() happily tries to duplicate a
> string that is the size of that buffer and fails when it reads
> past the end of the buffer looking for the terminating zero byte.

src/cmd/ksh93/edit/hexpand.c:
- Backport ksh2020 bugfix that avoids assuming the string stream
  has been initialized to zeros:
  https://github.com/att/ast/commit/cf16bcca
  (minus the incorrect change to the static wm variable).
2022-01-21 02:13:08 +00:00
Martijn Dekker
a5700d3937 Expose --histreedit and --histverify options (re: 921bbcae)
This adds two long-form shell options that modify history expansion
(-H/--histexpand). If --histreedit is on and a history expansion
fails, the command line is reloaded into the next prompt's edit
buffer, allowing corrections. If --histverify is on, the results of
a history expansion are not immediately executed but instead loaded
into the next prompt's edit buffer, allowing further changes.

SH_HISTREEDIT and SH_HISTVERIFY were already handled all along in
slowread() in io.c and via 'reedit' arguments to functions called
there, but could not be turned on as they were only ever exposed as
shopt options in the removed bash compatibility mode (in theory
only, as it failed to compile). I had overlooked them until now.

So, since the code is there, let's expose these options through the
normal long options interface. They're working fine, and activating
them actually makes history expansion tolerable to use.

src/cmd/ksh93/data/options.c:
- Make these options available as "histreedit" and "histverify".

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/sh.1:
- Document the "new" options.

src/cmd/ksh93/include/defs.h:
- Remove unused SH_HISTAPPEND and SH_HISTORY2 macros which were
  part of the removed bash compatibility code. Note that ksh does
  not need a histappend option as it never overwrites the history
  file (in the bash mode, this shopt option was a no-op).
2022-01-12 20:38:30 +00:00
Johnothan King
1a9af9db40 Fix vi mode tab completion with spaces (#413)
Attempting to complete file names in vi mode using tab completion can
fail if the last character on the command line is a space. Reproducer
(note that this bug doesn't occur in emacs mode):
   $ set -o vi
   $ mkdir '/tmp/foo bar'
   $ test -d /tmp/foo\ <Tab>

src/cmd/ksh93/edit/vi.c:
- Don't disable tab completion or reset the tab count just because the
  last character on the command line is a space. This bugfix was
  backported from ksh93v- 2014-06-06.

src/cmd/ksh93/tests/pty.sh:
- Add a regression test for the tab completion bug.
2022-01-07 16:18:28 +00:00
Johnothan King
d347ec0fc9 Allow ksh to compile on Haiku; implement SIGKILLTHR support (#408)
This commit implements the build fixes required to get ksh running on
Haiku. Note that while ksh does compile, it has a ton of regression test
failures on Haiku.

src/cmd/ksh93/data/signals.c,
src/lib/libast/features/signal.c:
- Add support for the SIGKILLTHR signal, which is supported by BeOS and
  Haiku.
- SIGINFO was missing an entry in the libast feature test, so add one
  (re: 658bba74).

src/cmd/ksh93/RELEASE:
- Add an entry noting that ksh now compiles on Haiku, albeit with many
  regression test failures.

src/cmd/ksh93/{include/terminal.h,sh/path.c}:
- Silence compiler warnings on Haiku.

src/lib/libast/features/mmap:
- The mmap feature test freezes on Haiku, so modify the test to fail
  immediately on that OS.

src/lib/libast/misc/signal.c:
- Avoid redefining the signal definition on Haiku to fix a compiler
  error.

src/lib/libast/features/nl_types:
- For some reason the nl_item typedef on Haiku doesn't work correctly.
  Work around that by creating the nl_item type in the libast nl_types
  feature test.
2022-01-07 16:16:42 +00:00
Martijn Dekker
57ed1efc2c Actually deactivate CDPATH when unsetting it
After 'unset CDPATH', CDPATH continued to work as if nothing
happened. Unsetting it should be a valid way to deactivate it.

This bug is in every ksh93 version.

src/cmd/ksh93/bltins/cd_pwd.c: b_cd():
- Fix a manifest logic error: first check if CDPATH (CDPNOD) is
  unset before assigning to 'cdpath', not the other way around.
  Setting the 'cdpath' pointer is what activates the CDPATH search.
2021-12-29 01:48:55 +00:00
Johnothan King
4032050249 Port cksum builtin performance improvements from illumos (#391)
This commit ports performance optimizations from illumos for the libsum
code (used by the cksum and sum builtins):
98bea71f0d
The new codepath in libsum uses prefetching and loop unrolling to
improve performance (prefetching is done with __builtin_prefetch()
or sun_prefetch_read_many() if either is available).

Script for testing (note that cksum must be enabled in
src/cmd/ksh93/data/builtins.c):
   #!/bin/ksh
   builtin cksum || exit 1
   for ((i=0; i!=50000; i++)) do
       cksum -x att /etc/hosts
   done >/dev/null

Results on Linux x86_64 (using CCFLAGS=-O2):
$ echo 'UNPATCHED:'; time arch/linux.i386-64/bin/ksh /tmp/foo; echo 'PATCHED'; time /tmp/ksh /tmp/foo
UNPATCHED:

real    0m09.989s
user    0m07.582s
sys     0m02.406s
PATCHED:

real    0m06.536s
user    0m04.331s
sys     0m02.204s

src/lib/libsum/{sum-att.c,sum-crc.c,Mamfile}:
- Port the performance optimizations from illumos to 93u+m libsum. To
  prevent problems with older versions of GCC, avoid the new codepath
  if GCC is older than the 3.1 release series. Additionally, the ast.h
  header must be included to handle tcc defining __GNUC__ on FreeBSD.
- Apply some build fixes to allow the new codepath to build with Clang
  3.6 and newer (my own testing indicates an even better performance
  improvement with Clang than with GCC).
2021-12-28 22:22:52 +00:00
Johnothan King
8f9d1bec97 Add three options to 'ulimit' (#406)
This patch adds a few extra options to the ulimit command (if the OS
supports them). These options are also present in Bash, although in ksh
additional long forms of each option are available:
  ulimit -k/--kqueues   This is the maximum number of kqueues.
  ulimit -P/--npts      This is the maximum number of pseudo-terminals.
  ulimit -R/--rttime    This is the time a real-time process can run
                        before blocking, in microseconds. When the
                        limit is exceeded, the process is sent SIGXCPU.

Other changes:
- bltins/ulimit.c: Change the formatting from sfprintf and increase the
  size of the tmp buffer to prevent text from being cut off in ulimit
  -a (this was required to add ulimit -R).
- data/limits.c: Add support for using microseconds as a unit.
2021-12-28 22:02:20 +00:00
Johnothan King
0e197eee57 Fix mkservice compile errors and add SHOPT_MKSERVICE (#401)
The unused mkservice and eloop builtins are currently not built, and if
an attempt to compile them is made the build ends in failure. This
commit backports a few build fixes from ksh93v- 2012-08-24 that allow
mkservice and eloop to build (plus an additional compiler warning fix
not in ksh93v-). I've also added a new SHOPT_MKSERVICE setting (turned
off by default) so that mkservice and eloop can be built if the user
chooses to include them in their build of ksh.
2021-12-28 17:51:11 +00:00
Johnothan King
24174f0fb7 Backport -P and -t flags for 'type'/'whence' from ksh93v- (#392)
This commit backports the whence '-t' option from ksh93v-. The '-t'
option is useful when one needs to identify the type of a command.
The '-t' flag was added by ksh93v- for compatibility with Bash.

It should be noted the ksh93v- patch had one bug, which this commit
fixes. Path-bound builtins from /opt/ast/bin were classified as
files if loaded from /opt/ast/bin in the PATH. Reproducer:
   $ PATH=/opt/ast/bin whence -t cat
   file

src/cmd/ksh93/bltins/whence.c:
- Simplify the bitmask values for the command and whence builtin
  flags.
- Add the -t flag to the whence and type builtins. To prevent bugs,
  -t will always override -v if both of those flags were passed.

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/sh.1:
- Add documentation for the new -t option.
2021-12-27 06:40:02 +00:00
Martijn Dekker
e072e7c170 Fix crash in xtrace while processing here-document (re: d7cada7b)
Depending on the OS, the heredoc.sh regression tests, and possibly
others, still crashed with the -x option (xtrace) on.

Analysis: The lexer crashes in lex_advance(). Something has caused
an inconsistent lexer state, and it happened earlier on, so the
backtrace is useless for figuring out where that happened.

But I think I've found it. It's the sh_mactry() call here:

src/cmd/ksh93/sh/xec.c, lines 2800 to 2807 in f7213f03
2800:   if(!(cp=nv_getval(sh_scoped(shp,PS4NOD))))
2801:           cp = "+ ";
2802:   else
2803:   {
2804:           sh_offoption(SH_XTRACE);
2805:           cp = sh_mactry(shp,cp);
2806:           sh_onoption(SH_XTRACE);
2807:   }

sh_mactry() needs to parse the contents of $PS4 to perform
expansions and command substitutions in it, which involves the
lexer. If that happens in a here-document, the lexer is in the C
function call stack, in the middle of parsing the here-document.
Result: inconsistent lexer state. Solution: save and restore lexer
state in sh_mactry().

After this commit, all regression tests should pass with the
'-x'/'--xtrace' option in use, with no errors or crashes.

Note for backporters: this fix depends both on on d7cada7b and on
the consistency fix for the Lex_t type's size applied in a7ed5d9f.

src/cmd/ksh93/include/shlex.h:
- Cosmetic fix: remove a copied & pasted backslash. (re: a7ed5d9f)

src/cmd/ksh93/sh/macro.c: sh_mactry():
- Save and restore the lexer state before letting sh_mactrim()
  indirectly parse and execute code.

src/cmd/ksh93/tests/*.sh:
- Turn off xtrace in various command substitutions that contain
  2>&1 redirections, so that the xtrace output is not caught by
  the command substitutions, causing tests to fail incorrectly.
- Turn off xtrace for a few code blocks with 2>&1 redirections,
  stopping xtrace output from being written to standard output.

Resolves: https://github.com/ksh93/ksh/issues/306 (again)
2021-12-27 04:02:25 +00:00
Martijn Dekker
91a7c2e3e9 Fix crash/freeze upon interrupting command substitution with pipe
On some systems (at least Linux and macOS):

1. Run on a command line: t=$(sleep 10|while :; do :; done)
2. Press Ctrl+C in the first 10 seconds.
3. Execute any other command substitution. The shell crashes.

Analysis: Something in the job_wait() call in the sh_subshell()
restore routine may be interrupted by a signal such as SIGINT on
Linux and macOS. Exactly what that interruptible thing is remains
to be determined. In any case, since job_wait() was invoked after
sh_popcontext(), interrupting it caused the sh_subshell() restore
routine to be aborted, resulting in an inconsistent state of the
shell. The fix is to sh_popcontext() at a later stage instead.

src/cmd/ksh93/sh/subshell.c: sh_subshell():
- Rename struct checkpt buff to checkpoint because it's clearer.
- Move the sh_popcontext() call to near the end, just after
  decreasing the subshell level counters and restoring the global
  subshell data struct to its parent. This seems like a logical
  place for it and could allow other things to be interrupted, too.
- Get rid of the if(shp->subshell) because it is known that the
  value is > 0 at this point.
- The short exit routine run if the subshell forked now needs a new
  sh_popcontext() call, because this is handled before restoring
  the virtual subshell state.
- While we're here, do a little more detransitioning from all those
  pointless shp pointers.

Fixes: https://github.com/ksh93/ksh/issues/397
2021-12-27 03:49:41 +00:00
Johnothan King
f7213f03a2 Fix multiple bugs when using 'alias -p' to print aliases (#398)
This commit was originally intended to fix just one bug with shcomp's
handling of 'alias -p', but while fixing that I found a large number
of related issues in the alias command's -p, -t and -x options. The
current patch provides bugfixes for all of the bugs listed below:

1) Listing aliases in a script with 'alias -p' or 'alias' broke
   shcomp's bytecode output:
   https://github.com/ksh93/ksh/issues/87#issuecomment-813819122

2) Listing individual aliases with the -p option doesn't work:
      $ alias foo=bar bar=foo
      $ alias foo
      foo=bar
      $ alias -p foo  # No output

3) Listing specific tracked aliases with -pt does not display them
   in a reusable format, but rather adds another tracked alias:
      $ hash -r cat vi
      $ alias -pt vi  # No output
      $ alias -pt rm
      $ alias -t
      cat=/usr/bin/cat
      rm=/usr/bin/rm
      vi=/usr/bin/vi

4) Listing all tracked aliases with -pt does not output them in a
   reusable format (the resulting command printed only creates a
   normal alias, which is different from a tracked alias):
      $ hash -r cat
      $ alias -pt
      alias cat=/usr/bin/cat  # Expected 'alias -t cat'

5) Listing a non-existent alias with -p doesn't cause an error:
      $ unalias -a
      $ alias -p notanalias  # No output
      $ echo $?
      0
      $ alias notanalias
      notanalias: alias not found
      $ echo $?
      1
      $ hash -r
      $ alias -pt notacommand  # No output
      $ echo $?
      0

6) Attempting to list 256 non-existent aliases results in exit
   status zero:
      $ unalias -a
      $ alias $(awk -v ORS= 'BEGIN { for(i=0;i<256;i++) print "x "; }')
      x: alias not found
      --cut error message--
      $ echo $?
      0

Changes:

- typeset.c: Avoid printing anything while shcomp is compiling a
  script. This is needed because the alias command is run by shcomp
  to prevent parsing issues.

- b_alias(): To avoid adding tracked aliases with -pt, set
  tdata.aflag to '+' so that setall() and other related functions
  only list tracked aliases.

- b_alias(): Set tdata.pflag to 1 so that setall() and other
  functions recognize -p was passed.

- print_value(): Add support for listing specific aliases with
  'alias -p'.

- setall(): To avoid any issues with zombie tracked aliases (see also
  the regression tests) ignore tracked alias nodes marked with the
  NV_NOALIAS attribute. This bit is set for tracked alias nodes by
  the nv_rehash() function.

- setall(): For backward compatibility, continue incrementing the
  exit status for each invalid alias and tracked alias passed. This
  was already how alias behaved when listing aliases without -p, so
  using -p shouldn't cause a change in behavior:
      $ unalias -a
      $ alias foo bar
      foo: alias not found
      bar: alias not found
      $ echo $?
      2
  To fix bug 6, the exit status is set to one if an enforced 8-bit
  exit status would be zero.

- print_namval(): Set the prefix to 'alias -t' so that listing
  tracked aliases with 'alias -pt' works correctly.

- data/msg.c and include/name.h: Add an error message for when
  'alias -pt' doesn't find a tracked alias.

- tests/alias.sh: Add a ton of regression tests for the bugs fixed in
  this commit.
2021-12-27 03:49:06 +00:00
Johnothan King
3785a0685c Fix process substitutions printing PIDs in profile scripts (#395)
- sh/args.c: A process substitution run in a profile script may print
  its PID as if it was a command spawned with '&'. Reproducer:
     $ cat /tmp/env
     true >(false)
     $ ENV=/tmp/env ksh
     [1]	730227
     $
  This bug is fixed by turning off the SH_PROFILE state while running
  a process substitution.

- sh/subshell.c: The SH_INTERACTIVE fix in 3525535e renders the extra
  check for SH_PROFILE redundant, so it has been removed.

- tests/io.sh: Update the procsub PIDs test to also check the result
  after using process substitution in a profile script.
2021-12-22 13:27:00 +00:00
Martijn Dekker
fcd9efce7f Interactive: Avoid losing the job after suspending a subshell
Reproducer: run vi in a subshell:

	$ (vi)

vi opens; now press Ctrl+Z to suspend. The output is as expected:

	[2] + Stopped                  (vi)

…but the exit status is 18 (SIGTSTP's signal number) instead of 0.

Now do:

	$ fg
	(vi)
	$

The exit status is 18 again, vi is not resumed, and the job is
lost. You have to find vi's pid manually using ps and kill it.

Forking all non-command substitution subshells invoked from the
interactive main shell is the only reliable and effective fix I've
found. I've tried to fork the subshell conditionally in every other
remotely plausible place I can think of in fault.c and xec.c, but I
can't get anything to work properly. If anyone can get this to work
without forking as much (or at all), please do submit a patch or PR
that supersedes this fix.

At least subshells of subshells don't need to fork, so the
performance impact can be limited. Plus, it's not as if most people
need maximum speed on the interactive command line. Scripts
(including login/profile scripts) are not affected at all.

Command substitutions can be handled differently. My testing shows
that all shells except ksh93 simply block SIGTSTP (the ^Z signal)
while they run. We should do the same, so they don't need to fork.

NOTE for any backporters: the subshell.c and fault.c changes depend
on commits 35b02626 and 48ba6964 to work correctly.

src/cmd/ksh93/sh/subshell.c: sh_subshell():
- If the interactive shell state bit is on, then before executing
  the subshell's code:
  - for command substitutions, block SIGTSTP;
  - for other subshells, fork.
- For command substitutions, release SIGTSTP if the interactive
  shell state bit was on upon invoking the subshell.

src/cmd/ksh93/sh/fault.c:
- Instead of checking for a virtual subshell, check the shell's
  interactive state bit to decide whether to handle SIGTSTP, as
  that is only turned on in the interactive main shell.

src/cmd/ksh93/sh/main.c: sh_main():
- To avoid bugs, ignore SIGTSTP while running profile scripts.
  Blocking it doesn't work because delaying it until after
  sigrelease() will cause a crash. Thanks to @JohnoKing for this.
- While we're here, prevent a possible overflow of the 'beenhere'
  static char variable by only incrementing it once.

Co-authored-by: Johnothan King <johnothanking@protonmail.com>
Resolves: https://github.com/ksh93/ksh/issues/390
2021-12-22 05:09:17 +00:00
Martijn Dekker
3525535e1f sh_parse(): don't turn on interactive state (re: 48ba6964)
Reproducer:

	$ (sleep 1& echo done)
	done
	$ (eval "echo hi"; sleep 1& echo done)
	hi
	[1]	30587
	done

No job control output should be printed for a background process
invoked from a subshell, not even after 'eval'.

The cause: sh_parse() turns on the shell's interactive state bit
(sh_state(SH_INTERACTIVE)) if the interactive shell option is on.

This is incorrect. The parser should have no involvement with shell
interactivity in principle because that's not its domain.

Not only that, the parser may need to run in a subshell, e.g. when
executing traps or 'eval' commands (as above). By definition, a
subshell can never be interactive.

We already fixed many bugs related to job control and the shell's
interactive state. Even if these two lines previously papered over
some breakage, I can't find any now after simply removing them. If
any is found later, then it'll need to be fixed properly instead.

Related: https://github.com/ksh93/ksh/issues/390
2021-12-22 05:06:12 +00:00
Martijn Dekker
ce3e080c0e Release 1.0.0-beta.2
Announcing: KornShell 93u+m 1.0.0-beta.2
https://github.com/ksh93/ksh

In May 2020, when every KornShell (ksh93) development project was
abandoned, development was rebooted in a new fork based on the last
stable AT&T version: ksh 93u+. This new fork is called ksh 93u+m as a
permanent nod to its origin. We're restarting it at version 1.0. Seven
months after the first beta, the second one is ready. Please test this
second beta and report any bugs you find, or help us fix known bugs.

We're now the default ksh93 in some OS distributions, at least Debian
and Slackware! Even though we don't think it's stable release quality
yet, the consensus seems to be that 93u+m is already much better than
the last AT&T release.

Main developers: Martijn Dekker, Johnothan King, hyenias

Contributors: Andy Fiddaman, Anuradha Weeraman, Chase, Gordon Woodhull,
Govind Kamat, Harald van Dijk, Lev Kujawski, Marc Wilson, Ryan Schmidt,
Sterling Jensen

HOW TO GET IT

Please download the source code tarball from our GitHub releases page:

	https://github.com/ksh93/ksh/releases

To build, follow the instructions in README.md or src/cmd/ksh93/README.

HOW TO GET INVOLVED

To report a bug, please open an issue at our GitHub page (see above).
Alternatively, email me at martijn@inlv.org with your report.
To get involved in development, read the brief policy information in
README.md and then jump right in with a pull request or email a patch.
See the TODO file in the top-level directory for a to-do list.

*** MAIN CHANGES between 1.0.0-beta.1 and 1.0.0-beta.2 ***

New features in built-in commands:

- 'cd' now supports an -e option that, when combined with -P, verifies
  that $PWD is correct after changing directories; this helps detect
  access permission problems. See:
  https://www.austingroupbugs.net/view.php?id=253

- 'printf' now supports a -v option as in bash. This assigns formatted
  output directly to variables, which is very fast and will not strip
  final newline (\n) characters.

- The 'return' command, when used to return from a function, can now
  return any status value in the 32-bit signed integer range, like on
  zsh. However, due to a traditional Unix kernel limitation, $? is
  still trimmed to its least significant 8 bits whenever leaving a
  (sub)shell environment.

- 'test'/'[' now supports all the same operators as [[ (including =~,
  \<, \>) except for the different 'and'/'or' operators. Note that
  'test'/'[' remains deprecated due to its unfixable pitfalls;
  [[ ... ]] is recommended instead.

Shell language changes:

- Several improvements were made to the --noexec shell code linter.

- Arithmetic expressions in native ksh mode no longer interpret a
  number with a leading zero as octal in any context. Use 8#octalnumber
  instead (e.g. 8#400 == 256). Arithmetic expressions now also behave
  identically within and outside ((...)) and $((...)).

- POSIX compatibility mode fixes (only applicable with the --posix shell
  option on):
  - A leading zero is now consistently recognised as introducing an octal
    number in all arithmetic contexts.
  - $((inf)) and $((nan)) are now interpreted as regular variables.
  - The '.' built-in no longer runs ksh functions and now only runs
    files.

Bugs fixed:

- '.' and '..' are now once again completed by tab completion.

- If SIGINT is set to ignore, the interactive shell no longer exits on
  Ctrl+C.

- ksh now builds and runs on Apple's new M1 hardware.

- The 'return' and 'exit' commands no longer risk triggering actual
  signals by returning or exiting with a status > 256.

- Ksh no longer behaves badly when parsing a type definition command
  ('typeset -T' or 'enum') without executing it or when executing it in
  a subshell. Types can now safely be defined in subshells and defined
  conditionally as in 'if condition; then enum ...; fi'.

- Discipline functions, especially those applied to PS2 or .sh.tilde,
  will no longer crash your shell upon being interrupted or throwing an
  error.

- Fixed a bug that could corrupt output if standard output is closed
  upon initialising the shell.

- Fixed a bug in the [[ ... ]] compound command: the '!' logical
  negation operator now correctly negates another '!', e.g.,
  [[ ! ! 1 -eq 1 ]] now returns 0/true. Note that this has always been
  the case for 'test'/'['.

- Fixed SHLVL so that replacing ksh by itself (exec ksh) will not
  increase it.

- Arithmetic expressions are no longer allowed to assign out-of-range
  values to variables of types declared with enum.

- The 'time' keyword no longer makes the --errexit shell option
  ineffective.

- Various bugs in libcmd built-in commands (those bound to the
  /opt/ast/bin path by default) have been fixed.

- Various other crashing bugs have been fixed.

Fixes for the shcomp byte code compiler:

- shcomp is now able to compile scripts that define types using enum.

- shcomp now refuses to mess up your terminal by writing bytecode
  to it.

*** MAIN CHANGES between ksh 93u+ 2012-08-01 and 93u+m 1.0.0-beta.1 ***

Hundreds of bugs have been fixed, including many serious/critical bugs.
This includes upstreamed patches from OpenSUSE, Red Hat, and Solaris, fixes
backported from the abandoned 93v- beta and ksh2020 fork, as well as many
new fixes from the community. See the NEWS file for more information, and
the git commit log for complete documentation of every fix. Incompatible
changes have been minimised, but not at the expense of fixing bugs. For a
list of potentially incompatible changes, see src/cmd/ksh93/COMPATIBILITY.

Though there was a "no new features, bugfixes only" policy, some new
features were found necessary, either to fix serious design flaws or to
complete functionality that was evidently intended, but not finished.
Below is a summary of these new features.

New command line editor features:

- The forward-delete and End keys are now handled as expected in the
  emacs and vi built-in line editors.

- In the vi and emacs line editors, repeat count parameters can now also
  be used for the arrow keys and the forward-delete key. E.g., in emacs
  mode, <ESC> 7 <left-arrow> will now move the cursor seven positions to
  the left. In vi control mode, this would be entered as: 7 <left-arrow>.

New shell language features:

- The &>file redirection shorthand (for >file 2>&1) is now available for
  all scripts and interactive sessions and not only for profile/login
  scripts, bringing ksh 93u+m in line with mksh, bash, and zsh.

- File name generation (a.k.a. pathname expansion, a.k.a. globbing) now
  never matches the special navigational names '.' (current directory)
  and '..' (parent directory). This change makes a pattern like .*
  useful; it now matches all hidden files (dotfiles) in the current
  directory, without the harmful inclusion of '.' and '..'.

- Tilde expansion can now be extended or modified by defining a
  .sh.tilde.get or .sh.tilde.set discipline function. This replaces a
  2004 undocumented attempt to add this functionality via a .sh.tilde
  command, which never worked and crashed the shell. See the manual for
  details on the new method.

New features in built-in commands:

- Usage error messages now show the --help/--man self-documentation options.

- Path-bound built-ins (such as /opt/ast/bin/cat) can now be executed by
  invoking the canonical path, so the following will now work as expected:
	$ /opt/ast/bin/cat --version
	  version         cat (AT&T Research) 2012-05-31

- 'command -x' now looks for external commands only, skipping built-ins.
  In addition, its xargs-like functionality no longer freezes the shell on
  Linux and macOS, making it effectively a new feature on these systems.

- 'redirect' now checks if all arguments are valid redirections before
  performing them. If an error occurs, it issues an error message instead
  of terminating the shell.

- 'suspend' now refuses to suspend a login shell, as there is probably no
  parent shell to return to and the login session would freeze.

- 'times' now gives high precision output in a POSIX compliant format.

- 'typeset' now gives an informative error message if an incompatible
  combination of options is given.

- 'whence -v/-a' now reports the location of autoloadable functions.

New features in shell options:

- A new --globcasedetect shell option is added on OSs where we can
  check for a case-insensitive file system (currently Windows/Cygwin,
  macOS, Linux and QNX 7.0+). When this option is turned on, file name
  generation (globbing), as well as file name tab completion on
  interactive shells, automatically become case-insensitive on file
  systems where the difference between upper and lower case is ignored
  for file names. This is transparently determined for each directory, so
  a path pattern that spans multiple file systems can be part
  case-sensitive and part case-insensitive.

- A new --nobackslashctrl shell option disables the special escaping
  behaviour of the backslash character in the emacs and vi built-in
  editors. Particularly in the emacs editor, this makes it much easier to
  go backward, insert a forgotten backslash into a command, and then
  continue editing without having your next cursor key replace your
  backslash with garbage. Note that Ctrl+V (or whatever other character
  was set using 'stty lnext') always escapes all control characters in
  either editing mode.

- A new --posix shell option has been added to ksh 93u+m that makes the
  ksh language more compatible with other shells by following the POSIX
  standard more closely. See the manual page for details. It is enabled by
  default if ksh is invoked as sh, otherwise it is disabled by default.

- Enhancement to -G/--globstar: symbolic links to directories are now
  followed if they match a normal (non-**) glob pattern. For example, if
  '/lnk' is a symlink to a directory, '/lnk/**' and '/l?k/**' now work as
  you would expect.
2021-12-17 04:20:04 +01:00
Johnothan King
85199ab351 Backport ksh93v- bugfix for [[ 1<2 ]] (#380)
Strings compared in [[ with the > and < operators should be compared
lexically. This does not work when the strings are single digits, as
the parser interprets it as a syntax error:
   $ [[ 10<2 ]]   # 10 lexically sorts before 2
   $ echo $?
   0
   $ [[ 1<2 ]]
   /usr/bin/ksh: syntax error: `<' unexpected
   $ echo $?
   3

src/cmd/ksh93/sh/lex.c:
- Don't interpret numbers next to > and < as a redirection while
  inside of [[. This bugfix was backported from ksh93v- 2014-06-25.

src/cmd/ksh93/tests/bracket.sh:
- Add regression tests for the > and < operators.
2021-12-17 03:26:41 +01:00
Martijn Dekker
e67df29c07 Re-fix defining types conditionally or in subshells (re: f508660d)
New version. I'm pretty sure the problems that forced me to revert
it earlier are fixed.

This commit mitigates the effects of the hack explained in the
referenced commit so that dummy built-in command nodes added by the
parser for declaration/assignment purposes do not leak out into the
execution level, except in a relatively harmless corner case.

Something like

    if false; then
        typeset -T Foo_t=(integer -i bar)
    fi

will no longer leave a broken dummy Foo_t declaration command. The
same applies to declaration commands created with enum.

The corner case remaining is:

    $ ksh -c 'false && enum E_t=(a b c); E_t -a x=(b b a c)'
    ksh: E_t: not found

Since the 'enum' command is not executed, this should have thrown
a syntax error on the 'E_t -a' declaration:
ksh: syntax error at line 1: `(' unexpected

This is because the -c script is parsed entirely before being
executed, so E_t is recognised as a declaration built-in at parse
time. However, the 'not found' error shows that it was successfully
eliminated at execution time, so the inconsistent state will no
longer persist.

This fix now allows another fix to be effective as well: since
built-ins do not know about virtual subshells, fork a virtual
subshell into a real subshell before adding any built-ins.

src/cmd/ksh93/sh/parse.c:

- Add a pair of functions, dcl_hactivate() and dcl_dehacktivate(),
  that (de)activate an internal declaration built-ins tree into
  which check_typedef() can pre-add dummy type declaration command
  nodes. A viewpath from the main built-ins tree to this internal
  tree is added, unifying the two for search purposes and causing
  new nodes to be added to the internal tree. When parsing is done,
  we close that viewpath. This hides those pre-added nodes at
  execution time. Since the parser is sometimes called recursively
  (e.g. for command substitutions), keep track of this and only
  activate and deactivate at the first level.
     (Fixed compared to previous version of this commit: calling
  dcl_dehacktivate() when the recursion level is already zero is
  now a harmless no-op. Since this only occurs in error handling
  conditions, who cares.)

- We also need to catch errors. This is done by setting libast's
  error_info.exit variable to a dcl_exit() function that tidies up
  and then passes control to the original (usually sh_exit()).
     (Fixed compared to previous version of this commit: dcl_exit()
  immediately deactivates the hack, no matter the recursion level,
  and restores the regular sh_exit(). This is the right thing to
  do when we're in the process of erroring out.)

- sh_cmd(): This is the most central function in the parser. You'd
  think it was sh_parse(), but $(modern)-form command substitutions
  use sh_dolparen() instead. Both call sh_cmd(). So let's simply
  add a dcl_hacktivate() call at the beginning and a
  dcl_deactivate() call at the end.

- assign(): This function calls path_search(), which among many
  other things executes an FPATH search, which may execute
  arbitrary code at parse time (!!!). So, regardless of recursion
  level, forcibly dehacktivate() to avoid those ugly parser side
  effects returning in that context.

src/cmd/ksh93/bltins/enum.c: b_enum():

- Fork a virtual subshell before adding a built-in.

src/cmd/ksh93/sh/xec.c: sh_exec():

- Fork a virtual subshell when detecting typeset's -T option.

Improves fix to https://github.com/ksh93/ksh/issues/256
2021-12-17 01:28:28 +01:00
Martijn Dekker
2bc1d814c9 Do not exit shell on Ctrl+C with SIGINT ignored (re: 7e5fd3e9)
The killpg(getpgrp(),SIGINT) call added to ed_getchar() in that
commit caused the interactive shell to exit on ^C even if SIGINT is
being ignored. We cannot revert or remove that call without
breaking job control. This commit applies a new fix instead.

Reproducers fixed by this commit:

  SIGINT ignored by child:

    $ PS1='childshell$ ' ksh
    childshell$ trap '' INT
    childshell$ (press Ctrl+C)
    $

  SIGINT ignored by parent:

    $ (trap '' INT; ENV=/./dev/null PS1='childshell$ ' ksh)
    childshell$ (press Ctrl+C)
    $

  SIGINT ignored by parent, trapped in child:

    $ (trap '' INT; ENV=/./dev/null PS1='childshell$ ' ksh)
    childshell$ trap 'echo test' INT
    childshell$ (press Ctrl+C)
    $

I've experimentally determined that, under these conditions, the
SFIO stream error state is set to 256 == 0400 == SH_EXITSIG.

src/cmd/ksh93/sh/main.c: exfile():
- On EOF or error, do not return (exiting the shell) if the shell
  state is interactive and if sferror(iop)==SH_EXITSIG.
- Refactor that block a little to make the new check fit in nicely.

src/cmd/ksh93/tests/pty.sh:
- Test the above three reproducers.

Fixes: https://github.com/ksh93/ksh/issues/343
2021-12-16 19:56:46 +01:00
Martijn Dekker
7cdb01f625 New selection of default libcmd /opt/ast/bin built-ins
Note that this is only about the /opt/ast/bin built-in commands,
not about the regular pathless builtins such as printf.
To use these, either add /opt/ast/bin to your $PATH or use a
command like 'builtin cp'. As usual, --man provides info.

Removed as defaults for lack of convincing advantages over the OS's
external commands:
- chmod, cmp, head, logname, mkdir, sync, uname, wc

Remain as useful defaults:
- basename, cat, cut, dirname. These are commonly used in
  performance-sensitive code paths in scripts and having them as
  built-ins can be good for performance.
- getconf: This is the only interface to some libast internals that
  is available to ksh. It's also has better functionality than most
  OS-shipped 'getconf' commands, e.g., it can list and query all
  the configuration values.

Added as defaults:
- cp, ln, mv: Having these built in can speed up scripts that
  manage files. Also the AST versions have extended functionality
  (see cp --man, etc.).
- mktemp: External mktemp commands vary too widely and are
  incompatible, but it's important that scripts can securely make
  temporary files, so it's good to ship a known interface to this
  functionality.

As a result, the statically linked ksh binary is very slightly
smaller than before.

Resolves: https://github.com/ksh93/ksh/issues/349
2021-12-16 09:09:37 +01:00
Johnothan King
61fa1b68bf The chown builtin should fail with the same error consistently (#378)
This bug was first reported at <https://www.illumos.org/issues/3782>.
The chown builtin when used on illumos can fail with different error
messages after running the same command twice:

  $ touch /tmp/x
  $ /opt/ast/bin/chown -h 433:434 /tmp/px
  chown: /tmp/x: cannot change owner and group [Not owner]
  $ /opt/ast/bin/chown -h 433:434 /tmp/px
  chown: /tmp/x: cannot change owner and group [Invalid argument]

The error messages differ because the libast struid and strgid
functions will return -2 if the same nonexistent ID is used twice.

The fix for this bug has been ported from here:
https://github.com/illumos/illumos-gate/commit/4162633a7c5961f388fd

src/lib/libcmd/chgrp.c:
- Remove NOID macro and check for a < 0 error status instead.
  This is different from the Illumos fix at
  <https://github.com/illumos/illumos-gate/commit/4162633a7c59>
  which added another macro.

src/lib/libast/man/{strgid,struid}.3:
- Correct errors in the strgid and struid documentation.
- Document that the strgid and struid functions will return -2 if
  the same invalid name is used twice.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2021-12-14 10:37:44 +01:00
Martijn Dekker
fc752b574a Re-match '.' and '..' in tab completion (re: 5312a59d, aad74597)
Turns out there is a bona fide, honest-to-goodness use case for
matching '.' and '..' in globbing after all. It's when globbing is
used as the backend mechanism for file name completion in
interactive shell editors. A tab invisibly adds a * at the end of
the word to the left of your cursor and the resulting pattern is
expanded. In 5312a59d, this broke for '.' and '..'.

Typing '.' followed by two tabs should result in a menu that
includes './' and '../'. Typing '..' followed by a tab should
result in '../', (or a menu that includes it if there are files
with names starting with '..'). This is the behaviour in 93u+ and
we should maintain this.

To restore this functionality without reintroducing the harmful
behaviour fixed in the referenced commits, we should special-case
this, allowing '.' and '..' to match only for file name completion.

src/lib/libast/include/glob.h:
- Fix an inaccurate comment: the GLOB_COMPLETE flag is used for
  command completion, not file name completion. This is very clear
  from reading the path_expand() function in sh/expand.c.
- Add new GLOB_FCOMPLETE flag for file name completion.

src/lib/libast/misc/glob.c:
- Adapt flags mask to fit the new flag.
- glob_dir(): If GLOB_FCOMPLETE is passed, allow '.' and '..' to
  match even if expanded from a pattern.
- Clarify the fix from aad74597 with an extended comment based on
  <https://github.com/ksh93/ksh/issues/146#issuecomment-790991990>.

src/cmd/ksh93/sh/expand.c: path_expand():
- If we're in the SH_FCOMPLETE (file name completion) state, then
  pass the new GLOB_FCOMPLETE flag to AST glob(3).

Fixes: https://github.com/ksh93/ksh/issues/372
Thanks to @fbrau for the bug report.
2021-12-13 01:50:50 +01:00
Johnothan King
e54001d58b Various minor capitalization and typo fixes (#371)
This commit fixes various minor typos, punctuation errors and
corrects the capitalization of many names.
2021-12-13 01:49:42 +01:00
Johnothan King
cd562b16e2 Port more shell lint improvements from illumos and ksh93v- (#374)
This commit adds onto <https://github.com/ksh93/ksh/pull/353> by porting
over two additional improvements to the shell linter:

1) The changes in the aforementioned pull request were merged into
   illumos-gate with an additional change.[*] The illumos revision of
   the patch improved the warning for (( $foo = $? )) to specify '$foo'
   causes the warning.[**] Example:
      $ ksh -n -c '(( $? != $bar ))'
      ksh: warning: line 1: in '(( $? != $bar ))', using '$' as in '$bar' is slower and can introduce rounding errors
   While I was porting the illumos patch I did notice one problem. The
   string it uses from paramsub() skips over the initial '{' in
   '${var}', resulting in the warning printing '$var}' instead:
      $ ksh -n -c '(( ${.sh.pid} != $$ ))'
      ...  in '(( ${.sh.pid} != $$ ))', using '$' as in '$.sh.pid}' is slower  ...
   This was fixed by including the missing '{' in the string returned by
   paramsub for ${var} variables.

2) In ksh93v-, parsing x=$((expr)) with the shell linter will cause ksh
   to warn the user x=$((expr)) is slower than ((x=expr)). This
   improvement has been backported with a modified warning:
      # Result from this commit
      $ ksh -n -c 'x=$((1 + 2))'
      ksh: warning: line 1: x=$((1 + 2)) is slower than ((x=1 + 2))
      # Result from ksh93v-
      $ ksh93v -n -c 'x=$((1 + 2))'
      ksh93v: warning: line 1: ((x=1 + 2)) is more efficient than x=$((1 + 2))
   Minor note: the ksh93v- patch had an invalid use of memcmp; this
   version of the patch uses strncmp instead.

References:
be548e87bc
https://code.illumos.org/c/illumos-gate/+/1834/comment/65722363_22fdf8e7/
2021-12-13 01:49:37 +01:00
Martijn Dekker
65feb9641a Fix two more PS2/SIGINT crashing bugs (re: 3023d53b)
*** Crash 1: ***

ksh crashed if the PS1 prompt contains one or more command
substitutions and you enter a multi-line command substitution
on the command line, then interrupt while on the PS2 prompt.

    $ ENV=/./dev/null /usr/local/bin/ksh -o emacs
    $ PS1='$(echo foo) $(echo bar) $(echo baz) ! % '
    foo bar baz 16999 % echo $(
    > true		<-- here, press Ctrl+C instead of Return

    Memory fault

The crash occurred due to a corrupted lexer state while trying to
display the PS1 prompt.

Analysis: My fix for the crashing bug with Ctrl+C in commit
3023d53b is incorrect and only worked accidentally. sh_fault() is
not the right place to reset the lexer state because, when we press
Ctrl+C on a PS2 prompt, ksh had been waiting for input to finish
lexing a multi-line command, so sh_lex() and other lexer functions
are on the function call stack and will be returned to.

src/cmd/ksh93/sh/fault.c: sh_fault():
- Remove incorrect SIGINT fix.

src/cmd/ksh93/sh/io.c: io_prompt():
- Reset the lexer state immediately before printing every PS1
  prompt. Even in situations where this is redundant it should be
  perfectly safe, the overhead is negligible, and it resolves this
  crash. It may pre-empt other problems as well.

*** Crash 2: ***

If an INT trap is set, and you start entering a multi-line command
substitution, then press Ctrl+C on the PS2 prompt to trigger the
crash, the lexer state is corrupted because the lexer is invoked to
eval the trap action. A crash then occurs on entering the final ')'
of the command substitution.

    $ trap 'echo TRAPPED' INT
    $ echo $(
    > trueTRAPPED	<-- press Ctrl+C to output "TRAPPED"
    > )
    Memory fault

Technically, as SIGINT is trapped, it should not interrupt, so ksh
should execute the trap, then continue with the PS2 prompt to let
the user finish inputting the command. But I have been unsuccessful
in many different attempts to make this work properly. I managed to
get multi-line command substitutions to lex correctly by saving and
restoring the lexer state, but command substitutions were still
corrupted at the parser and/or execution level and I have not
managed to trace the cause of that.

My testing showed that all other shells interrupt the PS2 prompt
and return to PS1 when the user presses Ctrl+C, even if SIGINT is
trapped. I think that is a reasonable alternative, and it is
something I managed to make work.

src/cmd/ksh93/sh/fault.c: sh_chktrap():
- Immediately after invoking sh_trap() to run a trap action, check
  if we're in a PS2 prompt (sh.nextprompt == 2). If so, assume the
  lexer state is now overwritten. Closing the fcin stream with
  fcclose() seems to reliably force the lexer to stop doing
  anything else. Then we can just reset the prompt to PS1 and
  invoke sh_exit() to start new command line, which will now reset
  the lexer state as per above.
2021-12-11 04:29:53 +01:00
Martijn Dekker
feedc05037 Reset lexer state on syntax error and on SIGINT (Ctrl+C)
ksh crashed if you pressed Ctrl+C or Ctrl+D on a PS2 prompt while
you haven't finished entering a $(command substitution). It
corrupts subsequent command substitutions. Sometimes the situation
recovers, sometimes the shell crashes.

Simple crash reproducer:

  $ PS1="\$(echo foo) \$(echo bar) \$(echo baz) > "
  foo bar baz > echo $(                        <-- now press Ctrl+D
  > ksh: syntax error: `(' unmatched
  Memory fault

The same happens with Ctrl+C, minus the syntax error message.

The problem is that the lexer state becomes inconsistent when the
lexer is interrupted in the middle of reading a command
substitution of the form $( ... ). This is tracked in the
'lexd.dolparen' variable in the lexer state struct.

Resetting that variable is sufficient to fix this issue. However,
in this commit I prefer to just reinitialise the lexer state
completely to pre-empt any other possible issues. Whether there was
a syntax error or the user pressed Ctrl+C, we just interrupted all
lexing and parsing, so the lexer *should* restart from scratch.

src/cmd/ksh93/sh/fault.c: sh_fault():
- If the shell is in an interactive state (e.g. not a subshell) and
  SIGINT was received, reinitialise the lexer state. This fixes the
  crash with Ctrl+C.

src/cmd/ksh93/sh/lex.c: sh_syntax():
- When handling a syntax error, reset the lexer state. This fixes
  the crash with Ctrl+D.

NEWS:
- Also add the forgotten item for the previous fix (re: 2322f939).
2021-12-09 07:35:12 +01:00
Martijn Dekker
350e52877b Revert "[1.0 release prep] Remove tilde expansion discipline"
This reverts c0334e32, thereby restoring 936a1939.

After the fixes in 0a343244 and a2bc49be, the tilde expansion
disciplines work nicely, so they can come back to the 1.0 branch.
2021-12-09 07:31:37 +01:00
Johnothan King
2b8eaa6609 Fix handling of files without newlines in the head and tail builtins (#365)
The head and tail builtins don't correctly handle files that lack
newlines[*]:
    $ print -n foo > /tmp/bar
    $ /opt/ast/bin/head -1 /tmp/bar  # No output
    $ print -n 'foo\nbar' > /tmp/bar
    $ /opt/ast/bin/tail -1 /tmp/bar
    foo
    bar$
This commit backports the required changes from ksh93v- to handle files
without a newline in the head and tail builtins. (Also note that the
required fix to sfmove was already backported in commit 1bd06207.)

src/lib/libcmd/{head,tail}.c:
- Backport the relevant ksh93v- code for handling files
  without newlines.

src/cmd/ksh93/tests/builtins.sh:
- Add a few regression tests for using 'head -1' and 'tail -1' on a file
  missing and ending newline.

[*]: https://www.illumos.org/issues/4149
2021-12-09 06:43:10 +01:00
Martijn Dekker
b3050769ea Fix 'return' emitting signals; allow arbitrary return values
When a global EXIT trap is set, and a ksh-style function exits with
a status > 256 that could have been the result of a signal, then
the shell incorrectly issues that signal to itself. Depending on
the signal, this causes ksh to terminate itself ungracefully:

    $ cat /tmp/exit267
    trap 'echo OK' EXIT  # This trap triggers the crash
    function foo { return 267; }
    foo
    $ bash /tmp/exit267
    OK
    $ ksh-3aee10d7 /tmp/exit267
    OK
    $ ksh /tmp/exit267
    Memory fault(coredump)

On most systems, status 267 corresponds to SIGSEGV. The reported
memory fault is not real; it results from ksh incorrectly killing
itself with that signal.

The problem is caused by two factors:

1. As of 93u+ 2012-08-01, ksh explicitly allows 'return' to use an
   exit status corresponding to a signal (from 257 to end of signal
   range). The rest of the integer range is trunctated to 8 bits.
   This is contrary to both 'man ksh' and 'return --man' which both
   say it's always truncated to 8 bits. Plus, combined with point 2
   below, this new behaviour is nonsensical, as 'return' has no
   business actually generating signals. However, a couple of
   regression tests now depend on this, as may some scripts.

2. When a ksh-style function does not handle a signal, the signal
   is passed down to the parent environment and ksh does this by
   reissuing the signal to its own process after leaving the
   function scope. However, it does this by checking the exit
   status, which is very bad practice as there is no guarantee
   that an exit status corresponding to a signal was in fact
   produced by a signal, particularly after they changed the
   behaviour of 'return' per 1 above.

This commit fixes both issues. It also takes a proper decision on
allowable 'return' exit status arguments. Since 93u+ was released
nearly a decade ago and some scripts may now rely on being able to
pass certain exit statuses out of the 8-bit range, we should not
disallow this now. But neither should we be half-hearted in
allowing only some arbitrary selection of 9-bit statuses; 'return'
values categorically should have nothing to do with signals, so
this is no basis for limiting them. We're now allowing the full
unsigned integer range, which is usually 32 bits. This is like zsh,
and may create some interesting possibilities for scripts.
Just don't forget that $? will still lose all but its 8 least
significant bits when leaving the current (sub)shell environment.

src/cmd/ksh93/sh/xec.c: sh_funscope():
- Fix passing down unhandled signals from interrupted ksh functions
  (jumpval==SH_JMPFUN) to the parent environment. Do not pay any
  attention to the exit status. Instead, use sh.lastsig (a.k.a.
  shp->lastsig). It is set by sh_fault() in fault.c for just this
  purpose and contains the last signal handled for the current
  command. It is reset in sh_exec() before running any new command.
  So if it contains a signal, that is the one that interrupted the
  ksh function, so it's the correct one to pass down. (Further
  evidence: sh_subshell() was already using this in the same way.)

src/cmd/ksh93/bltins/cflow.c: b_return():
- Allow any signed int return value when invoked as and behaving
  like 'return'.
- Add warning if a passed value is out of int range. Set the exit
  status to 128 in that case; int overflow is undefined behaviour
  in C and we want consistent behaviour across platforms. It should
  be safe enough to check if the long and int values are equal.
- Refactor for clarity.

src/cmd/ksh93/sh/subshell.c: sh_subshell():
- If a function returns with a status out of the 8 bit range in a
  virtual subshell, this status could be passed down to the parent
  shell in full. However, if the subshell forks, then the kernel
  will enforce an 8-bit exit status. That is inconsistent. Scripts
  should not be able to tell the difference between forked and
  non-forked subshells, so artificially enforce that limit here.

Other changed files:
- Documentation updates and copy-edits.
- Update an AT&T functions.sh regress test to allow arbitrary
  integer return values for functions.
- Add regression tests based in part on @JohnoKing's reproducers.
- Rework some vaguely related regression tests to fail gracefully.

Thanks to Johnothan King for the report and the testing.

Fixes: https://github.com/ksh93/ksh/issues/364
2021-12-09 06:41:39 +01:00
Johnothan King
cd8c48cc5a Add the '-e' flag to the 'cd' builtin (#358)
This change adds the -e flag to the cd builtin, as specified in
<https://www.austingroupbugs.net/view.php?id=253>. The -e flag is
used to verify if the the current working directory after 'cd -P'
successfully changes the directory, and returns with exit status 1
if the cwd couldn't be determined. Additionally, it causes all
other errors to return with exit status >1 (i.e., status 2 unless
ENOMEM occurs) if -e and -P are both active.

src/cmd/ksh93/bltins/cd_pwd.c:
- Add -e option to the cd builtin command. It verifies $PWD by
  using test_inode() to execute the equivalent of [[ . -ef $PWD ]].
- The check for restricted mode has been moved after optget to
  allow 'cd -eP' to return with exit status 2 when in restricted
  mode. To avoid changing the previous behavior of cd when -e isn't
  passed, extra checks have been added to prevent cd from printing
  usage information in restricted mode.

src/cmd/ksh93/tests/builtins.sh:
- Add regression tests for the exit status when using the cd -P
  flag with and without -e.

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/sh.1:
- Document the addition of -e to the cd builtin.
2021-12-06 06:58:11 +01:00
Johnothan King
f6a6d236f1 Port over illumos fixes for conf.sh (#361)
This commit ports over two of Andy Fiddaman's bugfixes to conf.sh
on illumos:

- The compiler isn't passed on to an invocation of iffe. The bugfix is
  from this commit: <https://github.com/citrus-it/ast/commit/63563232>
- The getconf builtin is missing several parameters on illumos.
  Reproducer:
     $ /opt/ast/bin/getconf ADDRESS_WIDTH
     getconf: Invalid argument (ADDRESS_WIDTH)  # Should output '64'
  This bug occurs because conf.sh expects GNU sed and fails to work
  properly with other sed implementations. The bugfix and original bug
  report can be found here:
  https://www.illumos.org/issues/14044
  https://github.com/citrus-it/ast/commit/ba443cfd
2021-12-05 19:57:15 +01:00
Johnothan King
0a343244c1 nvdisc.c: Fix crash after an error or signal in discipline function (#356)
This patch fixes the crashes experienced when a discipline function
exited because of a signal or an error from a special builtin. The
crashes were caused by ksh entering an inconsistent state after
performing a longjmp away from the assign() and lookup() functions in
nvdisc.c. Fixing the crash requires entering a new context, then setting
a nonlocal goto with sigsetjmp(3). Any longjmps that happen while
running the discipline function will go back to assign/lookup, allowing
ksh to do a proper cleanup afterwards.

Resolves: https://github.com/ksh93/ksh/issues/346
2021-12-05 19:28:09 +01:00
Johnothan King
6904585f49 Port illumos' shell linter improvements (#353)
This commit ports over two improvements to the shell linter from
illumos (original patch written by Andy Fiddaman). Links to the
relevant bug reports and the original patch:
https://www.illumos.org/issues/13601
https://www.illumos.org/issues/13631
c7b656fc71

The first improvement is to the lint warning for arithmetic
operators in [[ ... ]]. The ksh linter now suggests the correct
equivalent operator to use in ((...)). Example:
  $ ksh -nc '[[ 30 -gt 25 ]]'
  # Original warning
  warning: line 1: -gt within [[ ... ]] obsolete, use ((...))
  # New warning
  warning: line 1: [[ ... -gt ... ]] obsolete, use ((... > ...))

The second improvement pertains to variable expansion in arithmetic
expressions. The ksh linter now suggests referencing variable names
directly:
  $ ksh -nc 'integer foo=40; (($foo < 50 ))'
  # Old warning
  warning: line 1: variable expansion makes arithmetic evaluation less efficient
  # New warning
  warning: line 1: in '(($foo < 50))', using '$' is slower and can introduce rounding errors

src/cmd/ksh93/{data/lexstates,sh/lex,sh/parse}.c:
- Port the improved shell lint warnings from illumos to ksh93u+m.
- The original checks for arithmetic operators involved a bunch of
  if statements with inefficient calls to strcmp(3). These were
  replaced with a more efficient switch statement that avoids
  strcmp.
2021-12-05 19:27:16 +01:00
Johnothan King
370440473e Fix KEYBD trap crash when inputting a command substitution (#355)
This change fixes a crash that can occur after setting a KEYBD trap
then inputting a multi-line command substitution. The crash is
similar to issue #347, but it's easier to reproduce since it
doesn't require you to setup a kshrc file. Reproducer for the
crash:

  $ ENV=/./dev/null ksh
  $ trap : KEYBD
  $ : $(
  > true)
  Memory fault(coredump)

The bugfix was backported (with considerable changes) from ksh93v-
2013-10-08. The crash was first reported on the old mailing list:
https://www.mail-archive.com/ast-users@lists.research.att.com/msg00313.html

src/cmd/ksh93/{include/shlex.h,sh/lex.c}:
- To fix this properly, we need sizeof(Lex_t) to work as expected
  in edit.c, but that is thwarted by the _SHLEX_PRIVATE macro in
  lex.c which shlex.h uses to add private structs to the Lex_t type
  in lex.c only. So get rid of that _SHLEX_PRIVATE macro and make
  those members part of the centrally defined struct, renaming them
  to make it clear they're considered private to lex.c.

src/cmd/ksh93/edit/edit.c:
- Now that we can get its size, save and restore the shell lexing
  context when a KEYBD trap is present.

src/cmd/ksh93/tests/pty.sh:
- Add a regression test for the KEYBD trap crash.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2021-11-30 04:27:31 +01:00
Johnothan King
bfad44e56d Fix memory fault on ARM-based Macs (#354)
Ksh segfaults on M1 Macs, apparently because Apple's compiler
optimizer can't cope with overriding bzero(3) with libast's
implementation (though it's nothing more than "memset(b, 0, n);").

Apple has disabled libast's bzero function since 2018-12-04:
https://opensource.apple.com/source/ksh/ksh-27/patches/src__lib__libast__comp__omitted.c.diff.auto.html

src/lib/libast/comp/omitted.c:
- Only fall back to the libast bzero function if the OS lacks an
  implementation of bzero.
- Remove the bzero undef since this fallback is only reached if the
  OS doesn't have bzero.

NEWS:
- Add compat info for macOS on ARM64. This notes that macOS
  Monterey can now compile and run ksh, although there is one
  regression test failure:
        builtins.sh[345]: printf %H: invalid UTF-8 characters
        (expected %3F%C2%86%3F%3F%3F; got %3F%C2%86%3F%3Fv%3F%3F)

Thanks to @DesantBucie for the report and the testing.

Resolves: https://github.com/ksh93/ksh/issues/329
2021-11-29 20:12:57 +01:00
Johnothan King
a0eeb14787 Stop the time keyword overriding errexit (#351)
This bug was first reported in <https://www.illumos.org/issues/7694>.
The time keyword currently overrides the errexit shell option,
allowing failing scripts to continue after an error:

  $ cat 1.sh
  #!/bin/sh
  time false   # This should cause the script to exit
  echo FAILURE
  true
  $ ksh -o errexit 1.sh

  real    0m0.00s
  user    0m0.00s
  sys     0m0.00s
  FAILURE

src/cmd/ksh93/sh/xec.c:
- When the time keyword runs a command, pass the errexit state flag
  to the sh_exec call. This state flag is required for ksh to exit
  when a command fails while the errexit option is on.

src/cmd/ksh93/tests/basic.sh:
- Add a regression test based on the reproducer.
2021-11-29 20:12:15 +01:00
Martijn Dekker
f508660ddf Revert "Fix defining types conditionally and/or in subshells (re: 8ced1daa)"
This reverts commit 2b9cbbbc8e.

This is not ready for prime time. Crashses when running a $PS2
discipline function. This needs fixing and more testing in
development before making it into the 1.0 branch. In the meantime,
that terrible problem with types is back, sorry about that.
2021-11-29 20:08:53 +01:00
Martijn Dekker
2b9cbbbc8e Fix defining types conditionally and/or in subshells (re: 8ced1daa)
This commit mitigates the effects of the hack explained in the
referenced commit so that dummy built-in command nodes added by the
parser for declaration/assignment purposes do not leak out into the
execution level, except in a relatively harmless corner case.

Something like

	if false; then
		typeset -T Foo_t=(integer -i bar)
	fi

will no longer leave a broken dummy Foo_t declaration command. The
same applies to declaration commands created with enum.

The corner case remaining is:

$ ksh -c 'false && enum E_t=(a b c); E_t -a x=(b b a c)'
ksh: E_t: not found

Since the 'enum' command is not executed, this should have thrown
a syntax error on the 'E_t -a' declaration:
ksh: syntax error at line 1: `(' unexpected

This is because the -c script is parsed entirely before being
executed, so E_t is recognised as a declaration built-in at parse
time. However, the 'not found' error shows that it was successfully
eliminated at execution time, so the inconsistent state will no
longer persist.

This fix now allows another fix to be effective as well: since
built-ins do not know about virtual subshells, fork a virtual
subshell into a real subshell before adding any built-ins.

src/cmd/ksh93/sh/parse.c:

- Add a pair of functions, dcl_hactivate() and dcl_dehacktivate(),
  that (de)activate an internal declaration built-ins tree into
  which check_typedef() can pre-add dummy type declaration command
  nodes. A viewpath from the main built-ins tree to this internal
  tree is added, unifying the two for search purposes and causing
  new nodes to be added to the internal tree. When parsing is done,
  we close that viewpath. This hides those pre-added nodes at
  execution time. Since the parser is sometimes called recursively
  (e.g. for command substitutions), keep track of this and only
  activate and deactivate at the first level.

- We also need to catch errors. This is done by setting libast's
  error_info.exit variable to a dcl_exit() function that tidies up
  and then passes control to the original (usually sh_exit()).

- sh_cmd(): This is the most central function in the parser. You'd
  think it was sh_parse(), but $(modern)-form command substitutions
  use sh_dolparen() instead. Both call sh_cmd(). So let's simply
  add a dcl_hacktivate() call at the beginning and a
  dcl_deactivate() call at the end.

- assign(): This function calls path_search(), which among many
  other things executes an FATH search, which may execute arbitrary
  code at parse time (!!!). So, regardless of recursion level,
  forcibly dehacktivate() to avoid those ugly parser side effects
  returning in that context.

src/cmd/ksh93/bltins/enum.c: b_enum():

- Fork a virtual subshell before adding a built-in.

src/cmd/ksh93/sh/xec.c: sh_exec():

- Fork a virtual subshell when detecting typeset's -T option.

Improves fix to https://github.com/ksh93/ksh/issues/256
2021-11-29 09:02:07 +01:00
Johnothan King
84ded2d0c4 Backport the ksh93v- rm builtin to fix 'rm -d' (#348)
The -d flag implemented in the rm builtin is completely broken. No
matter what you do it refuses to remove directories, even if -r is
also passed. Reproducer:

  $ mkdir /tmp/empty
  $ PATH=/opt/ast/bin rm -d /tmp/empty
  rm: /tmp/empty: directory
  $ PATH=/opt/ast/bin rm -dr /tmp/empty
  rm: /tmp/empty: directory not removed [Is a directory]

Additionally, the description of 'rm -d' in the man page contradicts
how it's specified in <https://www.austingroupbugs.net/view.php?id=802>.

The ksh93v- rm builtin fixed nearly all of these issues, so I've
backported it to 93u+m and applied one additional fix for 'rm -rd'.

src/lib/libcmd/rm.c:
- Backported the fixes from the ksh93v- rm builtin's -d flag when
  used on empty directories.
- Backported the man page update for rm(1) from ksh93v-.
- The ksh93v- rm builtin had one additional bug that caused the -r
  option to fail when combined with -d. This was fixed by
  overriding -d if -r is also passed.

src/cmd/ksh93/tests/builtins.sh:
- Add regression tests for the rm builtin's -d option.
2021-11-25 03:52:05 +01:00
Martijn Dekker
214308f81e '.': disable ksh function lookup in POSIX mode
POSIXly, '.' loads only files, not functions.

This only applies to '.', not 'source' (which is not in POSIX).

src/cmd/ksh93/bltins/misc.c: b_source():
- For ksh function lookup, add an additional check that we're not
  in POSIX mode and running the '.' (SYSDOT) builtin.
2021-11-24 09:12:39 +01:00
Martijn Dekker
c0334e32a1 [1.0 release prep] Remove tilde expansion discipline
Defining a .sh.tilde.get or .sh.tilde.set discipline function to
extend tilde expansion works well as long as the discipline
function doesn't get interrupted (e.g. with Crtl+C) or produce an
error message. Either of those will cause the shell to become
unstable and crash.

This feature is now removed from the 1.0 branch as it is not ready
for prime time. It can return to a release branch if/when we manage
to fix it on the master branch.

Related: https://github.com/ksh93/ksh/issues/346
2021-11-24 07:46:58 +01:00
Martijn Dekker
a66cd72f7d arith: implement range checking for enum types
Within arithmetic expressions, enumeration values of variables of a
type created with the 'enum' command translate to index numbers
from 0 to the number of elements minus 1. However, there was no
range checking on this in the arithmetic subsystem, allowing the
assignment of out-of-range values that did not correspond to any
enumeration value.

Variables of an enum type are internally unsigned short integers
(NV_UINT16), like those created with 'integer -su', except with an
additional discipline function (ENUM_disc).

src/cmd/ksh93/bltins/enum.c,
src/cmd/ksh93/include/builtins.h:
- To implement range checking, the arithmetic system needs access
  to the 'nelem' (number of elements) member of 'struct Enum'. This
  is only defined locally in enum.c. We could move that to name.h
  so arith.c can access it, but enum.c has code that supports
  compiling as standalone. So, instead, define a quick extern
  function, b_enum_elem(), that does the necessary type conversion
  and returns a type's number of elements.
- Add --man documentation for the arithmetic subsystem behaviour
  for enum types. Tell the enuminfo() function, which dynamically
  inserts values into the documentation, how to process new \f tags
  'lastv' (the last-defined value) and 'lastn' (the number of the
  last element).

src/cmd/ksh93/sh/arith.c: arith():
- For NV_UINT16 variables with an ENUM_disc discipline, check the
  range using b_enum_elem() and error out if necessary.

Resolves: https://github.com/ksh93/ksh/issues/335
2021-11-23 22:10:40 +01:00
Johnothan King
e26937b36a Add support for 'stty size' to the libcmd 'stty' builtin (#342)
This commit adds support for 'stty size' to the stty builtin, as
defined in <https://austingroupbugs.net/view.php?id=1053>. The size
mode is used to display the terminal's number of rows and columns.
Note that stty isn't included in the default list of builtin
commands; testing this addition requires adding CMDLIST(stty) to
the table of builtins in src/cmd/ksh93/data/builtins.c.

src/lib/libcmd/stty.c:
- Add support for the size mode to the stty builtin. This mode is
  only used to display the terminal's number of rows and columns,
  so error out if any arguments are given that attempt to set the
  terminal size.
2021-11-23 15:38:14 +01:00
Martijn Dekker
8ced1daadf Fix enum type definition pre-parsing for shcomp and dot/source
Parser limitations prevent shcomp or source from handling enum
types correctly:

    $ cat /tmp/colors.sh
    enum Color_t=(red green blue orange yellow)
    Color_t -A Colors=([foo]=red)

    $ shcomp /tmp/colors.sh > /dev/null
    /tmp/colors.sh: syntax error at line 2: `(' unexpected
    $ source /tmp/colors.sh
    /bin/ksh: source: syntax error: `(' unexpected

Yet, for types created using 'typeset -T', this works. This is done
via a check_typedef() function that preliminarily adds the special
declaration builtin at parse time, with details to be filled in
later at execution time.

This hack will produce ugly undefined behaviour if the definition
command creating that type built-in is then not actually run at
execution time before the type built-in is accessed.

But the hack is necessary because we're dealing with a fundamental
design flaw in the ksh language. Dynamically addable built-ins that
change the syntactic parsing of the shell language on the fly are
an absurdity that violates the separation between parsing and
execution, which muddies the waters and creates the need for some
kind of ugly hack to keep things like shcomp more or less working.

This commit extends that hack to support enum.

src/cmd/ksh93/sh/parse.c:
- check_typedef():
  - Add 'intypeset' parameter that should be set to 1 for typeset
    and friends, 2 for enum.
  - When processing enum arguments, use AST getopt(3) to skip over
    enum's options to find the name of the type to be defined.
    (getopt failed if we were running a -c script; deal with this
    by zeroing opt_info.index first.)
- item(): Update check_typedef() call, passing lexp->intypeset.
- simple(): Set lexp->intypeset to 2 when processing enum.

The rest of the changes are all to support the above and should be
fairly obvious, except:

src/cmd/ksh93/bltins/enum.c:
- enuminfo(): Return on null pointer, avoiding a crash upon
  executing 'Type_t --man' if Type_t has not been fully defined due
  to the definition being pre-added at parse time but not executed.
  It's all still wrong, but a crash is worse.

Resolves: https://github.com/ksh93/ksh/issues/256
2021-11-21 17:43:55 +01:00
Johnothan King
e554a07c56 typeset -T shouldn't list types created with enum (#340)
Listing types with 'typeset -T' will list not only types created with
typeset, but also types created with enum. However, the types created
by enum are not displayed correctly in the resulting output:

$ enum Foo_t=(foo bar)
$ typeset -T
typeset -T Foo_t
typeset -T Foo_t=fo)

The fix for this bug was backported from ksh93v- 2013-10-08.

src/cmd/ksh93/sh/nvtype.c:
- sh_outtype(): Skip over enums when listing types with 'typeset -T'.
2021-11-20 09:48:48 +01:00
Johnothan King
396b388e1f Fix a few issues with $RANDOM seeding in subshells (#339)
This commit fixes an issue I found in the subshell $RANDOM
reseeding code.

The main issue is a performance regression in the shbench fibonacci
benchmark, introduced in commit af6a32d1. Performance dropped in
this benchmark because $RANDOM is always reseeded and restored,
even when it's never used in a subshell. Performance results from
before and after this performance fix (results are on Linux with
CC=gcc and CCFLAGS='-O2 -D_std_malloc'):

  $ ./shbench -b bench/fibonacci.ksh -l 100 ./ksh-0f06a2e ./ksh-af6a32d ./ksh-f31e368 ./ksh-randfix

  benchmarking ./ksh-0f06a2e, ./ksh-af6a32d, ./ksh-f31e368, ./ksh-randfix ...
  *** fibonacci.ksh ***
  # ./ksh-0f06a2e  # Recent version of ksh93u+m
  # ./ksh-af6a32d  # Commit that introduced the regression
  # ./ksh-f31e368  # Commit without the regression
  # ./ksh-randfix  # Ksh93u+m with this patch applied

  -------------------------------------------------------------------------------------------------
  name           ./ksh-0f06a2e        ./ksh-af6a32d        ./ksh-f31e368        ./ksh-randfix
  -------------------------------------------------------------------------------------------------
  fibonacci.ksh  0.481 [0.459-0.515]  0.472 [0.455-0.504]  0.396 [0.380-0.442]  0.407 [0.385-0.439]
  -------------------------------------------------------------------------------------------------

src/cmd/ksh93/include/variables.h,
src/cmd/ksh93/sh/{init,subshell}.c:
- Rather than reseed $RANDOM every time a subshell is created, add
  a sh_save_rand_seed() function that does this only when the
  $RANDOM variable is used in a subshell. This function is called
  by the $RANDOM discipline functions nget_rand() and put_rand().
  As a minor optimization, sh_save_rand_seed doesn't reseed if it's
  called from put_rand().
- Because $RANDOM may have a seed of zero (i.e., RANDOM=0),
  sp->rand_seed isn't enough to tell if $RANDOM has been reseeded.
  Add sp->rand_state for this purpose.
- sh_subshell(): Only restore the former $RANDOM seed and state if
  it is necessary to prevent a subshell leak.

src/cmd/ksh93/tests/variables.sh:
- Add two regression tests for bugs I ran into while making this
  patch.
2021-11-19 08:18:44 +01:00
Martijn Dekker
bd9752e43c Backport 'printf -v' from ksh 93v-
'printf' on bash and zsh has a popular -v option that allows
assigning formatted output directly to variables without using a
command substitution. This is much faster and avoids snags with
stripping final linefeeds. AT&T had replicated this feature in the
abandoned 93v- beta version. This backports it with a few tweaks
and one user-visible improvement.

The 93v- version prohibited specifying a variable name with an
array subscript, such as printf -v var\[3\] foo. This works fine on
bash and zsh, so I see no reason why this should not work on ksh,
as nv_putval() deals with array subscripts just fine.

src/cmd/ksh93/bltins/print.c: b_print():
- While processing the -v option when called as printf, get a
  pointer to the variable, creating it if necessary. Pass only the
  NV_VARNAME flag to enforce a valid variable name, and not (as
  93v- does) the NV_NOARRAY flag to prohibit array subscripts.
- If a variable was given, set the output file to an internal
  string buffer and jump straight to processing the format.
- After processing the format, assign the contents to the string
  buffer to the variable.

src/cmd/ksh93/data/builtins.c:
- Document the new option, adding a warning that unquoted square
  brackets may trigger pathname expansion.
2021-11-19 03:54:33 +01:00
Martijn Dekker
c734568b02 arithmetic: Fix the octal leading zero mess (#337)
In C/POSIX arithmetic, a leading 0 denotes an octal number, e.g.
010 == 8. But this is not a desirable feature as it can cause
problems with processing things like dates with a leading zero.
In ksh, you should use 8#10 instead ("10" with base 8).

It would be tolerable if ksh at least implemented it consistently.
But AT&T made an incredible mess of it. For anyone who is not
intimately familiar with ksh internals, it is inscrutable where
arithmetic evaluation special-cases a leading 0 and where it
doesn't. Here are just some of the surprises/inconsistencies:

1. The AT&T maintainers tried to honour a leading 0 inside of
   ((...)) and $((...)) and not for arithmetic contexts outside it,
   but even that inconsistency was never quite consistent.

2. Since 2010-12-12, $((x)) and $(($x)) are different:
      $ /bin/ksh -c 'x=010; echo $((x)) $(($x))'
      10 8
   That's a clear violation of both POSIX and the principle of
   least astonishment. $((x)) and $(($x)) should be the same in
   all cases.

3. 'let' with '-o letoctal' acts in this bizarre way:
      $ set -o letoctal; x=010; let "y1=$x" "y2=010"; echo $y1 $y2
      10 8
   That's right, 'let y=$x' is different from 'let y=010' even
   when $x contains the same string value '010'! This violates
   established shell grammar on the most basic level.

This commit introduces consistency. By default, ksh now acts like
mksh and zsh: the octal leading zero is disabled in all arithmetic
contexts equally. In POSIX mode, it is enabled equally.

The one exception is the 'let' built-in, where this can still be
controlled independently with the letoctal option as before (but,
because letoctal is synched with posix when switching that on/off,
it's consistent by default).

We're also removing the hackery that causes variable expansions for
the 'let' builtin to be quietly altered, so that 'x=010; let y=$x'
now does the same as 'let y=010' even with letoctal on.

Various files:
- Get rid of now-redundant sh.inarith (shp->inarith) flag, as we're
  no longer distinguishing between being inside or outside ((...)).

src/cmd/ksh93/sh/arith.c:
- arith(): Let disabling POSIX octal constants by skipping leading
  zeros depend on either the letoctal option being off (if we're
  running the "let" built-in") or the posix option being off.
- sh_strnum(): Preset a base of 10 for strtonll(3) depending on the
  posix or letoctal option being off, not on the sh.inarith flag.

src/cmd/ksh93/include/argnod.h,
src/cmd/ksh93/sh/args.c,
src/cmd/ksh93/sh/macro.c:
- Remove astonishing hackery that violated shell grammar for 'let'.

src/cmd/ksh93/sh/name.c (nv_getnum()),
src/cmd/ksh93/sh/nvdisc.c (nv_getn()):
- Remove loops for skipping leading zeroes that included a broken
  check for justify/zerofill attributes, thereby fixing this bug:
	$ typeset -Z x=0x15; echo $((x))
	-ksh: x15: parameter not set
  Even if this code wasn't redundant before, it is now: sh_arith()
  is called immediately after the removed code and it ignores
  leading zeroes via sh_strnum() and strtonll(3).

Resolves: https://github.com/ksh93/ksh/issues/334
2021-11-17 04:28:08 +01:00
Johnothan King
b40155fae8 Fix file descriptor leaks in the hist builtin (#336)
This commit fixes two file descriptor leaks in the hist built-in.
The bugfix for the first file descriptor leak was backported from
ksh2020. See:
https://github.com/att/ast/issues/872
https://github.com/att/ast/commit/73bd61b5

Reproducer:
  $ echo no
  $ hist -s no=yes

The second file descriptor leak occurs after a substitution error
in the hist built-in (this leak wasn't fixed in ksh2020).
Reproducer:
  $ echo no
  $ ls /proc/$$/fd
  $ hist -s no=yes
  $ hist -s no=yes
  $ ls /proc/$$/fd

src/cmd/ksh93/bltins/hist.c:
- Close leftover file descriptors when an error occurs and after
  'hist -s' runs a command.

src/cmd/ksh93/tests/builtins.sh:
- Add two regression tests for both of the file descriptor leaks.
2021-11-16 23:34:46 +01:00
Martijn Dekker
56c2e13e92 arith: Fix variables 'nan' and 'inf' in arithmetic for POSIX mode
The --posix compliance option now disables the case-insensitive
special floating point constants Inf and NaN so that all case
variants of $((inf)) and $((nan)) refer to the variables by those
names as the standard requires. (BUG_ARITHNAN)

src/cmd/ksh93/sh/arith.c: arith():
- Only do case-insensitive checks for "Inf" and "NaN" if the POSIX
  option is off.
2021-11-15 21:16:23 +01:00
Martijn Dekker
a4375f3090 Fix crash on unsetting .sh.match
ksh crashed after unsetting .sh.match and then matching a pattern:

$ unset .sh.match
$ [[ bar == ba* ]]
Memory fault

src/cmd/ksh93/sh/init.c: sh_setmatch():
- Do nothing if we cannot get an array pointer to SH_MATCHNOD.
2021-11-15 21:15:08 +01:00
Martijn Dekker
d9f1fdaa41 Fix [ \( str -a str \) ], [ \( str -o str \) ]
Symptoms:

$ test \( string1 -a string2 \)
/usr/local/bin/ksh: test: argument expected
$ test \( string1 -o string2 \)
/usr/local/bin/ksh: test: argument expected

The parentheses should be irrelevant and this should be a test for
the non-emptiness of string1 and/or string2.

src/cmd/ksh93/bltins/test.c:

- b_test(): There is a block where the case of 'test' with five or
  less arguments, the first and last one being parentheses, is
  special-cased. The parentheses are removed as a workaround: argv
  is increased to skip the opening parenthesis and argc is
  decreased by 2. However, there is no corresponding increase of
  tdata.av which is a copy of this function's argv. This renders
  the workaround ineffective. The fix is to add that increase.

- e3(): Do not handle '!' as a negator if not followed by an
  argument. This allows a right-hand expression that is equal to
  '!' (i.e. a test for the non-emptiness of the string '!').
2021-11-15 02:44:56 +01:00
Martijn Dekker
c81473061a test/[: binary operators: fix '<' and add '=~'; some more cleanups
In ksh88, the test/[ built-in supported both the '<' and '>'
lexical sorting comparison operators, same as in [[. However, in
every version of ksh93, '<' does not work though '>' still does!

Still, the code for both is present in test_binop():

src/cmd/ksh93/bltins/test.c
548:		case TEST_SGT:
549:			return(strcoll(left, right)>0);
550:		case TEST_SLT:
551:			return(strcoll(left, right)<0);

Analysis: The binary operators are looked up in shtab_testops[] in
data/testops.c using a macro called sh_lookup, which expands to a
sh_locate() call. If we examine that function in sh/string.c, it's
easy to see that on systems using ASCII (i.e. all except IBM
mainframes), it assumes the table is sorted in ASCII order.

src/cmd/ksh93/sh/string.c
64:	while((c= *tp->sh_name) && (CC_NATIVE!=CC_ASCII || c <= first))

The problem was that the '<' operator was not correctly sorted in
shtab_testops[]; it was sorted immediately before '>', but after
'='. The ASCII order is: < (60), = (61), > (62). This caused '<' to
never be found in the table.

The test_binop() function is also used by [[, yet '<' always worked
in that. This is because the parser has code that directly checks
for '<' and '>' within [[ (in sh/parse.c, lines 1949-1952).

This commit also adds '=~' to 'test', which took three lines of
code and allowed eliminating error handling in test_binop() as
test/[ and [[ now support the same binary ops. (re: fc2d5a60)

src/cmd/ksh93/*/*.[ch]:
- Rename a couple of very misleadingly named macros in test.h:
  . For == and !=, the TEST_PATTERN bit is off for pattern compares
    and on for literal string compares! Rename to TEST_STRCMP.
  . The TEST_BINOP bit does not denote all binary operators, but
    only the logical -a/-o ops in test/[. Rename to TEST_ANDOR.

src/cmd/ksh93/bltins/test.c: test_binop():
- Add support for =~. This is only used by test/[. The method is
  implemented in two lines that convert the ERE to a shell pattern
  by prefixing it with ~(E), then call test_strmatch with that
  temporary string to match the ERE and update ${.sh.match}.
- Since all binary ops from shtab_testops[] are now accounted for,
  remove unknown op error handling from this function.

src/cmd/ksh93/data/testops.c:
- shtab_testops[]:
  . Correctly sort the '<' (TEST_SLT) entry.
  . Remove ']]' (TEST_END). It's not an op and doesn't belong here.
- Update sh_opttest[] documentation with =~, \<, \>.
- Remove now-unused e_unsupported_op[] error message.

src/cmd/ksh93/sh/lex.c: sh_lex():
- Check for ']]' directly instead of relying on the removed
  TEST_END entry from shtab_testops[].

src/cmd/ksh93/tests/bracket.sh:
- Add relevant tests.

src/cmd/ksh93/tests/builtins.sh:
- Fix an old test that globally deleted the 'test' builtin. Delete
  it within the command substitution subshell only.
- Remove the test for non-support of =~ in test/[.
- Update the test for invalid test/[ op to use test directly.
2021-11-14 02:46:34 +01:00
Martijn Dekker
6f5c9fea93 test/[: Fix binary -a/-o operators in POSIX mode
POSIX requires
	test "$a" -a "$b"
to return true if both $a and $b are non-empty, and
	test "$a" -o "$b"
to return true if either $a or $b is non-empty.

In ksh, this fails if "$a" is '!' or '(' as this causes ksh to
interpret the -a and -o as unary operators (-a being a file
existence test like -e, and -o being a shell option test).

$ test ! -a ""; echo "$?"
0		(expected: 1/false)
$ set -o trackall; test ! -o trackall; echo "$?"
1		(expected: 0/true)
$ test \( -a \); echo "$?"
ksh: test: argument expected
2		(expected: 0/true)
$ test \( -o \)
ksh: test: argument expected
2		(expected: 0/true)

Unfortunately this problem cannot be fixed without risking breakage
in legacy scripts. For instance, a script may well use
	test ! -a filename
to check that a filename is nonexistent. POSIX specifies that this
always return true as it is a test for the non-emptiness of both
strings '!' and 'filename'.

So this commit fixes it for POSIX mode only.

src/cmd/ksh93/bltins/test.c: e3():
- If the posix option is active, specially handle the case of
  having at least three arguments with the second being -a or -o,
  overriding their handling as unary operators.

src/cmd/ksh93/data/testops.c:
- Update 'test --man --' date and say that unary -a is deprecated.

src/cmd/ksh93/sh.1:
- Document the fix under the -o posix option.
- For test/[, explain that binary -a/-o are deprecated.

src/cmd/ksh93/tests/bracket.sh:
- Add tests based on reproducers in bug report.

Resolves: https://github.com/ksh93/ksh/issues/330
2021-11-13 03:43:29 +01:00
Martijn Dekker
09a8a279f2 Fix bug on closed stdout; improve BUG_PUTIOERR fix (re: 93e15a30)
Stéphane Chazelas reported:

> As noted in this austin-group-l discussion[*] (relevant to this
> issue):
>
>   $ ksh93u+m -c 'pwd; echo "$?" >&2; echo test; echo "$?" >&2' >&-
>   0
>   1
>   /home/chazelas
>
> when stdout is closed, pwd does claim it succeeds (by returning a
> 0 exit status), while echo doesn't (not really relevant to the
> problem here, only to show it doesn't affect all builtins), and
> the output that pwd failed to write earlier ends up being written
> on stderr here instead of stdout upon exit (presumably) because
> of that >&2 redirection.
>
> strace shows ksh93 attempting write(1, "/home/chazelas\n", 15) 6
> times (1, the last one, successful).
>
> It gets even weirder when redirecting to a file:
>
>   $ ksh93u+m -c 'pwd; echo "$?" >&2; echo test; echo "$?" > file' >&-
>   0
>   $ cat file
>   1
>   1
>   ome/chazelas

In my testing, the problem does not occur when closing stdout at
the start of the -c script itself (using redirect >&- or exec >&-);
it only occurs if stdout was closed before initialising the shell.

That made me suspect that the problem had to do with an
inconsistent file descriptor state in the shell. ksh uses internal
sh_open() and sh_close() functions, among others, to maintain that
state.

src/cmd/ksh93/sh/main.c: sh_main():
- If the shell is initialised with stdin, stdout or stderr closed,
  then make the shell's file descriptor state tables reflect that
  fact by calling sh_close() for the closed file descriptors.

This commit also improves the BUG_PUTIOERR fix from 93e15a30. Error
checking after sfsync() is not sufficient. For instance, on
FreeBSD, the following did not produce a non-zero exit status:
  ksh -c 'echo hi' >/dev/full
even though this did:
  ksh -c 'echo hi >/dev/full'
Reliable error checking requires not only checking the result of
every SFIO command that writes output, but also synching the buffer
at the end of the operation and checking the result of that.

src/cmd/ksh93/bltins/print.c:
- Make exitval variable global to allow functions called by
  b_print() to set a nonzero exit status.
- Check the result of all SFIO output commands that write output.
- b_print(): Always sfsync() at the end, except if the s (history)
  flag was given. This allows getting rid of the sfsync() call that
  required the workaround introduced in 846ad932.

[*] https://www.mail-archive.com/austin-group-l@opengroup.org/msg08056.html

Resolves: https://github.com/ksh93/ksh/issues/314
2021-11-07 15:44:06 +00:00
Martijn Dekker
7b5b0a5d54 Fix octal number arguments in printf integer arithmetic
Bug 1: POSIX requires numbers used as arguments for all the %d,
%u... in printf to be interpreted as in the C language, so
	printf '%d\n' 010
should output 8 when the posix option is on. However, it outputs 10.

This bug was introduced as a side effect of a change introduced in
the 2012-02-07 version of ksh 93u+m, which caused the recognition
of leading-zero numbers as octal in arithmetic expressions to be
disabled outside ((...)) and $((...)). However, POSIX requires
leading-zero octal numbers to be recognised for printf, too.

The change in question introduced a sh.arith flag that is set while
we're processing a POSIX arithmetic expression, i.e., one that
recognises leading-zero octal numbers.
Bug 2: Said flag is not reset in a command substitution used within
an arithmetic expression. A command substitution should be a
completely new context, so the following should both output 10:

$ ksh -c 'integer x; x=010; echo $x'
10            # ok; it's outside ((…)) so octals are not recognised
$ ksh -c 'echo $(( $(integer x; x=010; echo $x) ))'
8             # bad; $(comsub) should create new non-((…)) context

src/cmd/ksh93/bltins/print.c: extend():
- For the u, d, i, o, x, and X conversion modifiers, set the POSIX
  arithmetic context flag before calling sh_strnum() to convert the
  argument. This fixes bug 1.

src/cmd/ksh93/sh/subshell.c: sh_subshell():
- When invoking a command substitution, save and unset the POSIX
  arithmetic context flag. Restore it at the end. This fixes bug 2.

Reported-by: @stephane-chazelas
Resolves: https://github.com/ksh93/ksh/issues/326
2021-09-13 04:57:37 +02:00
Martijn Dekker
bdc3069bfd Fix 'ps' output for hashbangless scripts on Linux/macOS
When invoking a script without an interpreter (#!hashbang) path,
ksh forks, but there is no exec syscall in the child. The existing
command line is overwritten in fixargs() with the name of the new
script and associated arguments. In the generic/fallback version of
fixargs() which is used on Linux and macOS, if the new command line
is longer than the existing one, it is truncated. This works well
when calling a script with a shorter name.

However, it generates a misleading name in the common scenario
where a script is invoked from an interactive shell, which
typically has a short command line. For instance, if "/tmp/script"
is invoked, "ksh" gets replaced with "/tm" in "ps" output.

A solution is found in the fact that, on these systems, the
environment is stored immediately after the command line arguments.
This space can be made available for use by a longer command line
by moving the environment strings out of the way.

src/cmd/ksh93/sh/main.c: fixargs():
- Refactor BSD setproctitle(3) version to be more self-contained.
- In the generic (Linux/macOS) version, on init (i.e. mode==0), if
  the command line is smaller than 128 bytes and the environment
  strings have not yet been moved (i.e. if they still immediately
  follow the command line arguments in memory), then strdup the
  environment strings, pointing the *environment[] members to the
  new strings and adding the length of the strings to the maximum
  command line buffer size.

Reported-by: @gkamat
Resolves: https://github.com/ksh93/ksh/pull/300
2021-09-12 05:34:52 +02:00
Martijn Dekker
a2196f9434 Fix backtick comsubs by making them act like $(modern) ones
ksh93 currently has three command substitution mechanisms:
- type 1: old-style backtick comsubs that use a pipe;
- type 3: $(modern) comsubs that use a temp file, currently with
  fallback to a pipe if a temp file cannot be created;
- type 2: ${ shared-state; } comsubs; same as type 3, but shares
  state with parent environment.

Type 1 is buggy. There are at least two reproducers that make it
hang. The Red Hat patch applied in 4ce486a7 fixed a hang in
backtick comsubs but reintroduced another hang that was fixed in
ksh 93v-. So far, no one has succeeded in making pipe-based comsubs
work properly.

But, modern (type 3) comsubs use temp files. How does it make any
sense to have two different command substitution mechanisms at the
execution level? The specified functionality between backtick and
modern command substitutions is exactly the same; the difference
*should* be purely syntactic.

So this commit removes the type 1 comsub code at the execution
level, treating them all like type 3 (or 2). As a result, the
related bugs vanish while the regression tests all pass.

The only side effect that I can find is that the behaviour of bug
https://github.com/ksh93/ksh/issues/124 changes for backtick
comsubs. But it's broken either way, so that's neutral.

So this commit can now be added to my growing list of ksh93 issues
fixed by simply removing code.

src/cmd/ksh93/sh/xec.c:
- Remove special code for type 1 comsubs from iousepipe(),
  sh_iounpipe(), sh_exec() and _sh_fork().

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/subshell.c:
- Remove pipe support from sh_subtmpfile(). This also removes the
  use of a pipe as a fallback for $(modern) comsubs. Instead, panic
  and error out if temp file creation fails. If the shell cannot
  create a temporary file, there are fatal system problems anyway
  and a script should not continue.
- No longer pass comsub type to sh_subtmpfile().

All other changes:
- Update sh_subtmpfile() calls.

src/cmd/ksh93/tests/subshell.sh:
- Add two regression tests based on reproducers from bug reports.

Resolves: https://github.com/ksh93/ksh/issues/305
Resolves: https://github.com/ksh93/ksh/issues/316
2021-08-13 09:14:11 +02:00
Martijn Dekker
d25dbcc1ef [[ ... ]]: fix '!' to negate another '!'
Bug: [[ ! ! 1 -eq 1 ]] returns false, but should return true.

This bug was reported for bash, but ksh has it too:
https://lists.gnu.org/archive/html/bug-bash/2021-06/msg00006.html

Op 24-05-21 om 17:47 schreef Chet Ramey:
> On 5/22/21 2:45 PM, Vincent Menegaux wrote:
>> Previously, these commands:
>>
>>    [[ ! 1 -eq 1 ]]; echo $?
>>    [[ ! ! 1 -eq 1 ]]; echo $?
>>
>> would both result in `1', since parsing `!' set CMD_INVERT_RETURN
>> instead of toggling it.
>
> Interestingly, ksh93 produces the same result as bash. I agree
> that it's more intuitive to toggle it.

Also interesting is that '!' as an argument to the simple
'test'/'[' command does work as expected (on both bash and ksh93):
'test ! ! 1 -eq 1' and '[ ! ! 1 -eq 1 ]' return 0/true.

Even the man page for [[ is identical for bash and ksh93:

|               ! expression
|                      True if expression is false.

This suggests it's supposed to be a logical negation operator, i.e.
'!' is implicitly documented to negate another '!'. Bolsky & Korn's
1995 ksh book, p. 167, is slightly more explicit about it:
"! test-expression. Logical negation of test-expression."

I also note that multiple '!' negators in '[[' work as expected on
mksh, yash and zsh.

src/cmd/ksh93/sh/parse.c: test_primary():
- Fix bitwise logic for '!': xor the TNEGATE bit into tretyp
  instead of or'ing it, which has the effect of toggling it.
2021-06-03 15:57:16 +02:00
Martijn Dekker
0dd115e4b4 Fix shell exit on function call redirection error (re: 23f2e23)
This regression also exists on ksh 93v- and ksh2020, from which it
was backported.

Reproducer:

$ (fn() { true; }; fn >/dev/null/ne; true) 2>/dev/null; echo $?
1

Expected output: 0 (as on ksh 93u+).

FreeBSD sh and NetBSD sh are the only other known shells that share
this behaviour. POSIX currently allows both behaviours, but may
require the ksh 93u+ behaviour in future. In any case, this causes
an incompatibility with established ksh behaviour that could easily
break existing ksh scripts.

src/cmd/ksh93/sh/xec.c: sh_exec():
- Commit 23f2e23 introduced a check for jmpval > SH_JMPIO (5).
  When a function call pushes context for a redirection, this is
  done with the jmpval exit value of SH_JMPCMD (6). Change that to
  SH_JMPIO to avoid triggering that check.

src/cmd/ksh93/tests/exit.sh:
- Add regression tests for exit behaviour on various kinds of
  shell errors as listed in the POSIX standard, including an error
  in a redirection of a function call.

Fixes: https://github.com/ksh93/ksh/issues/310
2021-05-19 06:59:18 +02:00
Martijn Dekker
e5e1d4b53e Decrease SHLVL before doing 'exec' from main shell
Problem:

$ exec ksh
$ echo $SHLVL
2
$ exec ksh
$ echo $SHLVL
3
$ exec ksh
$ echo $SHLVL
4

...etc. SHLVL is supposed to acount the number of shell processes
that you need to exit before you get logged out. Since ksh was
replacing itself with a new shell in the same process using 'exec',
SHLVL should not increase.

src/cmd/ksh93/bltins/misc.c: b_exec():
- When about to replace the shell and we're not in a subshell,
  decrease SHLVL to cancel out a subsequent increase by the
  replacing shell. Bash and zsh also do this.
2021-05-19 00:08:12 +02:00
Martijn Dekker
53f4bc6a53 Re-fix 'test -t 1' in command substitutions (re: 090b65e7)
Since a command substitution no longer forks on non-permanently
redirecting standard output within it for a specific command,
test -t 1, [ -t 1 ], and [[ -t 1 ]] broke as follows:
v=$(test -t 1 >/dev/tty && echo ok) did not assign 'ok' to v.
This is because the assumption in tty_check() that standard output
is never on a terminal in a non-forked command substitution, added
in 55f0f8ce, was made invalid by 090b65e7.

src/cmd/ksh93/edit/edit.c: tty_check():
- Implement a new method. Return false if the file descriptor
  stream is of type SF_STRING, which is the case for non-forked
  command substitutions -- it means the sfio stream writes directly
  into a memory area. This can be checked with the sfset(3)
  function (see src/lib/libast/man/sfio.3). To avoid a segfault
  when accessing sh.sftable, we need to validate the FD first.

src/cmd/ksh93/tests/pty.sh:
- Add the above reproducer.
2021-05-14 04:52:18 +02:00
Martijn Dekker
246062ff0b Release 1.0.0-beta.1
In May 2020, when every KornShell (ksh93) development project was
abandoned, development was rebooted in a new fork based on the last
stable AT&T version: ksh 93u+. Now, one year and hundreds of bug
fixes later, the first beta version is ready, and KornShell lives
again. This new fork is called ksh 93u+m as a permanent nod to its
origin; a standard semantic version number is added starting at
1.0.0-beta.1. Please test the beta and report any bugs you find,
or help us fix known bugs.
2021-05-10 18:42:42 +02:00
hyenias
92f7ca5423
Back port ksh93v- float, int, and exp10 changes from math.tab (#299)
src/cmd/ksh93/data/math.tab:
- Added exp10().
- Remove int() as being an alias to floor().
- Created entries for local float() and local int() which are
  defined in features/math.sh.

src/cmd/ksh93/features/math.sh:
- Backport floor() and int() related code from ksh93v-.

src/cmd/ksh93/sh.1:
- Sync man page to math.tab's potential functions.
2021-05-08 04:43:37 +01:00
Martijn Dekker
a197b0427a Fix two more 'command' bugs
BUG 1: Though 'command' is specified/documented as a regular
builtin, preceding assignments survive the invocation (as with
special or declaration builtins) if 'command' has no command
arguments in these cases:

$ foo=wrong1 command; echo $foo
wrong1
$ foo=wrong2 command -p; echo $foo
wrong2
$ foo=wrong3 command -x; echo $foo
wrong3

Analysis: sh_exec(), case TCOM (simple command), contains the
following loop that skips over 'command' prefixes, preparsing any
options and remembering the offset in the 'command' variable:

src/cmd/ksh93/sh/xec.c
1059 while(np==SYSCOMMAND || !np && com0
     && nv_search(com0,shp->fun_tree,0)==SYSCOMMAND)
1060 {
1061         register int n = b_command(0,com,&shp->bltindata);
1062         if(n==0)
1063                 break;
1064         command += n;
1065         np = 0;
1066         if(!(com0= *(com+=n)))
1067                 break;
1068         np = nv_bfsearch(com0, shp->bltin_tree, &nq, &cp);
1069 }

This skipping is not done if the preliminary b_command() call on
line 1061 (with argc==0) returns zero. This is currently the case
for command -v/-V, so that 'command' is treated as a plain and
regular builtin for those options.

The cause of the bug is that this skipping is even done if
'command' has no arguments. So something like 'foo=bar command' is
treated as simply 'foo=bar', which of course survives.

So the fix is for b_command() to return zero if there are no
arguments. Then b_command() itself needs changing to not error out
on the second/main b_command() call if there are no arguments.

src/cmd/ksh93/bltins/whence.c: b_command():
- When called with argc==0, return a zero offset not just for -v
  (X_FLAG) or -V (V_FLAG), but also if there are no arguments left
  (!*argv) after parsing options.
- When called with argc>0, do not issue a usage error if there are
  no arguments, but instead return status 0 (or, if -v/-V was given,
  status 2 which was the status of the previous usage message).
  This way, 'command -v $emptyvar' now also works as you'd expect.

BUG 2: 'command -p' sometimes failed after executing certain loops.

src/cmd/ksh93/sh/path.c: defpath_init():
- astconf() returns a pointer to memory that may be overwritten
  later, so duplicate the string returned. Backported from ksh2020.
  (re: f485fe0f, aa4669ad, <https://github.com/att/ast/issues/959>)

src/cmd/ksh93/tests/builtins.sh:
- Update the test for BUG_CMDSPASGN to check every variant of
  'command' (all options and none; invoking/querying all kinds of
  command and none) with a preceding assignment. (re: fae8862c)
  This also covers bug 2 as 'command -p' was failing on macOS prior
  to the fix due to a loop executed earlier in another test.
2021-05-05 02:43:18 +01:00
hyenias
642a105351
Fix arithmetic assignment operations for multidimensional indexed arrays (#296)
This PR corrects #168 for indexed arrays having more than one
level. Turns out ksh was only keeping track of the subscript number
for assignment in lvalue's nosub variable. By saving the actual
subscript reference, the result can be assigned to its proper
destination instead of putting the result into the last looked
value or subscript location.

src/cmd/ksh93/include/streval.h: struct lval:
- Create a new pointer named sub to hold the reference that nosub
  describes.

src/cmd/ksh93/sh/arith.c: arith():
- Adjust LOOKUP: for lvalue ARITH_ASSIGNOP operations on indexed
  arrays to save the np of the destination subscript for later use.
- Adjust ASSIGN: to act when lvalue's nosub > 0 which happens as
  the last step in the arithmetic parsing loop for assignment
  operations. Only indexed arrays will have a nosub value > 0. All
  others have a nosub of 0 unless they are involved in a unary
  operation (++, --) which sets nosub to -1. All said in the
  context of assignment operations like (( arr[0][1] += 1 )).

src/cmd/ksh93/sh/streval.c:
- Initialize the new sub pointer to 0.

src/cmd/ksh93/tests/arrays2.sh:
- Created a few multidimensional indexed array tests for assignment
  operations like += as an example.

Resolves: https://github.com/ksh93/ksh/issues/168
2021-05-04 03:13:14 +01:00
Martijn Dekker
d309d604e7 POSIX: 'command': don't disable declaration proprts (re: b9d10c5a)
Following the resolution of Austin Group bug 1393[*] that is set to
be included in the next version of the POSIX standard, the
'command' prefix in POSIX mode (set -o posix) no longer disables
the declaration properties of declaration built-ins.
[*] https://austingroupbugs.net/view.php?id=1393

src/cmd/ksh93/sh/parse.c: lex():
- Skip the 'command' prefix even in POSIX mode so that any
  declaration commands prefixed by it are treated as such in xec.c
  (sh_exec()).

src/cmd/ksh93/sh/xec.c: sh_exec():
- The foregoing change reintroduced a variant of BUG_CMDSPEXIT: the
  shell exits on something like 'command export readonlyvar=foo'.
  This now fixes that bug for both POSIX and non-POSIX mode. When
  calling nv_setlist() to process true shell assignments, and there
  is a 'command' prefix, push a shell context and use sigsetjmp to
  intercept any errors in assignments and stop the shell exiting.

src/cmd/ksh93/tests/builtins.sh:
- Borrow the BUG_CMDSPEXIT regression test from modernish and adapt
  it for ksh. (I'm the author so yes, I can do this.) Original:
  https://github.com/modernish/modernish/blob/ae8fe9c3/lib/modernish/tst/builtin.t#L80-L109
2021-05-04 00:52:10 +01:00
Martijn Dekker
af6a32d14f
Fix $RANDOM to act consistently in subshells (#294)
This fixes the following:
1. Using $RANDOM in a virtual/non-forked subshell no longer
   influences the reproducible $RANDOM sequence in the parent
   environment.
2. When invoking a subshell $RANDOM is now re-seeded (as mksh and
   bash do) so that invocations in repeated subshells (including
   forked subshells) longer produce identical sequences by default.
3. Program flow corruption that occurred in scripts on executing
   ( ( simple_command & ) ).

src/cmd/ksh93/include/variables.h:
- Move 'struct rand' here as it will be needed in subshell.c. Add
  rand_seed member to save the pseudorandom generator seed. Remove
  the pointer to the shell state as it's redundant.

src/cmd/ksh93/sh/init.c:
- put_rand(): Store given seed in rand_seed while calling srand().
  No longer pointlessly limit the number of possible seeds with the
  RANDMASK bitmask (that mask is to limit the values to 0-32767,
  it should not limit the number of possible sequences to 32768).
- nget_rand(): Instead of using rand(), use rand_r() to update the
  random_seed value. This makes it possible to save/restore the
  current seed of the pseudorandom generator.
- Add sh_reseed_rand() function that reseeds the pseudorandom
  generator by calling srand() with a bitwise-xor combination of
  the current PID, the current time with a granularity of 1/10000
  seconds, and a sequence number that is increased on each
  invocation.
- nv_init(): Set the initial seed using sh_reseed_rand() here
  instead of in sh_main(), as this is where the other struct rand
  members are initialised.

src/cmd/ksh93/sh/main.c: sh_main():
- Remove the srand() call that was replaced by the sh_reseed_rand()
  call in init.c.

src/cmd/ksh93/sh/subshell.c: sh_subshell():
- Upon entering a virtual subshell, save the current $RANDOM seed
  and state, then reseed $RANDOM for the subshell.
- Upon exiting a virtual subshell, restore $RANDOM seed and state
  and reseed the generator using srand() with the restored seed.

src/cmd/ksh93/sh/xec.c: sh_exec():
- When optimizing out a subshell that is the last command, still
  act like a subshell: reseed $RANDOM and increase ${.sh.subshell}.
- Fix a separate bug discovered while implementing this. Do not
  optimize '( simple_command & )' when in a virtual subshell; doing
  this causes program flow corruption.
- When optimizing '( simple_command & )', also reseed $RANDOM and
  increment ${.sh.subshell}.

src/cmd/ksh93/tests/subshell.sh,
src/cmd/ksh93/tests/variables.sh:
- Add various tests for all of the above.

Co-authored-by: Johnothan King <johnothanking@protonmail.com>
Resolves: https://github.com/ksh93/ksh/issues/285
2021-05-03 04:03:46 +01:00
Martijn Dekker
f31e368795 Fix remaining bug in ${var:-'{}'} (re: d087b031)
The following problems remained:

$ var=x; echo ${var:-'{}'}
x}
$ var=; echo ${var:+'{}'}
}

src/cmd/ksh93/sh/macro.c: varsub():
- Use the new ST_MOD1 state table to skip over ${var-'foo'}, etc.
  instead of ST_QUOTE. In ST_MOD1 the ' is categorised as S_LIT
  which causes the single quotes to be skipped over correctly.
  See d087b031 for more info.

src/cmd/ksh93/tests/quoting2.sh:
- Add tests for this remaining bug.
- Make the new test xtrace-proof.

Resolves: https://github.com/ksh93/ksh/issues/290 (again)
2021-05-03 03:14:30 +01:00
Martijn Dekker
88a1f3d661 Fork before entering shared-state command substitution
The code contains various checks to see if a subshell needs to
fork, like this one in the ulimit builtin:

	if(shp->subshell && !shp->subshare)
		sh_subfork();

All checks of this form are fatally broken, as each one of them
causes shared-state command substitutions to ignore parent virtual
subshells.

Currently the only feasible way to fix this is to fork a virtual
subshell before executing a shared-state command substitution in
it. In the long term I think shared-state command substitutions
should probably be redesigned to disassociate them completely from
the virtual subshell mechanism.

src/cmd/ksh93/sh/macro.c: comsubst():
- If we're in a non-subshare virtual subshell, fork it before
  entering a type 2 (subshare) command substitution.

src/cmd/ksh93/sh/subshell.c:
- sh_assignok(): Remove subshare fix from 911d6b06 as it's
  redundant now that the parent of a subshare is never a virtual
  subshell. Go back to not doing anything if the current "subshell"
  is a subshare.
- sh_subtracktree(), sh_subfuntree(): Similarly, remove the
  now-redundant subshare fixes from 13c57e4b.

src/cmd/ksh93/sh/xec.c: sh_exec():
- Fix a separate bug: only fork a virtual subshell before running a
  background job if that "subshell" is not a subshare.

src/cmd/ksh93/tests/subshell.sh:
- Add test for bug fixed in xec.c.
- Add tests for 'ulimit', 'builtin' and 'exec' run in subshare
  within subshell -- all commands that use checks of the form
  'if(sh.subshell && !sh.subshare) sh_subfork();'.

Resolves: https://github.com/ksh93/ksh/issues/289
2021-05-01 00:47:39 +01:00
Govind Kamat
7439e3dffe Parse quotes when extracting words from command history (#291)
This avoids splitting on quoted whitespace when extracting words
from the command history using the emacs M-. or vi _ command.

Example: if the prior command is

$ ls Stairway\ To\ Heaven.mp3

then, M-. in Emacs editing mode (and _ in vi mode) now inserts
Stairway\ To\ Heaven.mp3 instead of Heaven.mp3. The behavior is
similar for 'Stairway To Heaven.mp3' and "Stairway To Heaven.mp3".

src/cmd/ksh93/edit/history.c: hist_word():
- Skip over single-quoted and double-quoted strings and
  backslash-escaped characters.

src/cmd/ksh93/tests/pty.sh:
- Add regression test for this feature in vi mode. Since emacs and
  vi both use the same code for this, that should be good enough.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2021-04-30 20:18:07 +01:00
Martijn Dekker
d087b031f0 Fix single quotes in expansion operator string (re: 5ed9ffd6)
The referenced commit introduced the following bug:

> The closing quote does not appear to be registering during the
> parse of the following:
>
>	echo ${var:+'{}'}
>
> Within a script, this will result in:
>
>	syntax error at line 1: `'' unmatched

src/cmd/ksh93/data/lexstates.c,
src/cmd/ksh93/include/lexstates.h:
- Add new ST_MOD1 state table that is a copy of ST_QUOTE, but adds
  a special meaning (ST_LIT) for the single quote (position 39).

src/cmd/ksh93/sh/lex.c: sh_lex():
- For parameter expansion operators with old-style quoting
  (S_MOD1), use the new ST_MOD1 state table instead of ST_QUOTE.
  This causes single quotes within them to be processed properly.

src/cmd/ksh93/tests/quoting2.sh:
- Add tests.

Thanks to @gkamat for the bug report.
Resolves: https://github.com/ksh93/ksh/issues/290
2021-04-30 05:28:21 +01:00
Martijn Dekker
090b65e79b Fix fork after redirecting stdout in subshare (re: 500757d7)
Previously, command substitutions executed as virtual subshells
were always forked if any command was run within them that
redireceted standard output, even if the redirection was local to
that command.

Commit 500757d7 removed the check for a shared-state command
substitution (subshare), so introduced a bug where even that would
fork, causing it to stop sharing its state.

We can further improve on that fix by only forking if the
redirection is permanent as with `exec` or `redirect`. There should
be no need to do that if the redirection is local to a command run
within the command substitution, as the file descriptor is restored
when that command finishes, which is still within the command
substitution.

src/cmd/ksh93/sh/io.c: sh_redirect():
- Only fork upon redirecting stdout if the virtual subshell is a
  command substitution, and if the redirection is permanent
  (flag==1 or flag==2).
2021-04-26 18:22:17 +01:00
Johnothan King
086d504393
Lots of man page fixes and some other minor fixes (#284)
Noteworthy changes:
- The man pages have been updated to fix a ton of instances of
  runaway underlining (this was done with `sed -i 's/\\f5/\\f3/g'`
  commands). This commit dramatically increased in size because
  of this change.
- The documentation for spawnveg(3) has been extended with
  information about its usage of posix_spawn(3) and vfork(2).
- The documentation for tmfmt(3) has been updated with the changes
  previously made to the man pages for the printf and date builtins
  (though the latter builtin is disabled by default).
- The shell's tracked alias tree (hash table) is now documented in
  the shell(3) man page.
- Removed the commented out regression test for an ERRNO variable
  as the COMPATIBILITY file states it was removed in ksh93.
2021-04-23 22:02:30 +01:00
Johnothan King
2c22ace1e6
Fix LINENO after unsetting it a virtual subshell (#283)
There is a TODO note in variables.sh that notes the value of LINENO
is wrong after a virtual subshell. The following script should
print '6', but the bug causes it to print '1' instead:
  $ cat /tmp/lineno
  #!/bin/ksh
  (
      unset LINENO
      :
  )
  echo $LINENO

This bug started to occur after the bugfix applied in 7b994b6a.
However, that commit is not where the cause of bug was (when that
bugfix is applied to ksh versions 2008-07-25 through 2012-01-01,
$LINENO works fine). Rather, the cause of this bug was introduced
in 93u+ 2012-02-29. In that version, the mp->nvfun pointer was only
copied from np->nvfun if the variable can be freed from memory.
This is what caused 7b994b6a to break $LINENO in subshells, so to
fix this bug the mp->nvfun and np->nvfun must point to the same
object, even when the variable isn't freed from memory.

src/cmd/ksh93/sh/subshell.c: nv_restore():
- Always copy the np->nvfun pointer to mp->nvfun. To prevent
  crashes, the value of np->nvfun->nofree is set to the value given
  by the nofree variable, which is set before _nv_unset. See also
  commit 7e7f1372, which fixed a crash that happened because
  _nv_unset discards the NV_NOFREE flag.

src/cmd/ksh93/tests/variables.sh:
- Remove the workaround for LINENO after a virtual subshell.
- Add a regression test for the value of LINENO when unset in a
  virtual subshell, then used after the subshell. Note that before
  commit 997ad43b LINENO's value was corrupted after being unset in
  a subshell, so the test checks for corruption of the LINENO
  variable (in prior commits LINENO was set to '49' because of the
  previous bug).
2021-04-22 19:16:25 +01:00
Martijn Dekker
32d1abb1ba shcomp: fix redirection with process substitution
The commands within a process substitution used as an argument to a
redirection (e.g. < <(...) or > >(...)) are simply not included in
parse trees dumped by shcomp. This can be verified with a command
like hexdump -C. As a result, these process substitutions do not
work when running a bytecode-compiled shell script.

The fix is surprisingly simple. A process substitution is encoded
as a complete parse tree. When used with a redirection, that parse
tree is used as the file name for the redirection. All we need to
do is treat the "file name" as a parse tree instead of a string if
flags indicate a process substitution.

A process substitution is detected by the struct ionod field
'iofile'. Checking the IOPROCSUB bit flag is not enough. We also
need to exclude the IOLSEEK flag as that form of redirection may
use the IOARITH flag which has the same bit value as IOPROCSUB (see
include/shnodes.h).

src/cmd/ksh93/sh/tdump.c: p_redirect():
- Call p_tree() instead of p_string() for a process substitution.

src/cmd/ksh93/sh/trestore.c: r_redirect():
- Call r_tree() instead of r_string() for a process substitution.

src/cmd/ksh93/include/version.h:
- Bump the shcomp binary header version as this change is not
  backwards compatible; previous trestore.c versions don't know how
  to read the newly compiled process substitutions and would crash.

src/cmd/ksh93/tests/io.sh:
- Add test.

src/cmd/ksh93/tests/builtins.sh,
src/cmd/ksh93/tests/options.sh:
- Revert shcomp workarounds. (re: 6701bb30)

Resolves: https://github.com/ksh93/ksh/issues/165
2021-04-22 03:25:24 +01:00
Martijn Dekker
b7dde4e747 Fix ksh exit on syntax error in profile (re: cb67a01b, ceb77b13)
Johnothan King writes:
> There are two regressions related to how ksh handles syntax
> errors in the .kshrc file. If ~/.kshrc or the file pointed to by
> $ENV have a syntax error, ksh exits during startup. Additionally,
> the error message printed is incorrect:
>
> $ cat /tmp/synerror
> ((
> echo foo
>
> # ksh93u+m
> $ ENV=/tmp/synerror arch/*/bin/ksh -ic 'echo ${.sh.version}'
> /tmp/synerror: syntax error: `/t/tmp/synerror' unmatched
>
> # ksh93u+
> $ ENV=/tmp/synerror ksh93u -ic 'echo ${.sh.version}'
> /tmp/synerror: syntax error: `(' unmatched
> Version AJM 93u+ 2012-08-01
>
> The regression that causes the incorrect error message was
> introduced by commit cb67a01. The other bug that causes ksh to
> exit on startup was introduced by commit ceb77b1.

src/cmd/ksh93/sh/lex.c: fmttoken():
- Call stakfreeze(0) to terminate a possible unterminated previous
  stack item before writing the token string onto the stack. This
  fixes the bug with garbage in a syntax error message.

src/cmd/ksh93/sh/main.c: exfile():
- Revert Red Hat's ksh-20140801-diskfull.patch applied in ceb77b13.
  This fixes the bug with interactive ksh exiting on syntax error
  in a profile script. Testing by @JohnoKing showed the patch is no
  longer necessary to fix a login crash on disk full, as commit
  970069a6 (which applied Red Hat patches ksh-20120801-macro.patch
  and ksh-20120801-fd2lost.patch) also fixes that crash.

src/cmd/ksh93/README:
- Fix typos. (re: fdc08b23)

Co-authored-by: Johnothan King <johnothanking@protonmail.com>
Resolves: https://github.com/ksh93/ksh/issues/281
2021-04-21 19:42:24 +01:00
Martijn Dekker
7954855f21 Don't import/export readonly attribute via magic A__z env var
While automagically importing/exporting ksh variable attributes via
the environment is probably a misfeature in general (now disabled
for POSIX standard mode), doing so with the readonly attribute is
particularly problematic. Scripts can take into account the
possibility of importing unwanted attributes by unsetting or
typesetting variables before using them. But there is no way for a
script to get rid of an unwanted imported readonly variable. This
is a possible attack vector with no possible mitigation.

This commit blocks both the import and the export of the readonly
attribute through the environment. I consider it a security fix.

src/cmd/ksh93/sh/init.c: env_import_attributes():
- Clear NV_RDONLY from imported attributes before applying them.

src/cmd/ksh93/sh/name.c: sh_envgen():
- Remove NV_RDONLY from bitmask defining attributes to export.
2021-04-21 04:11:55 +01:00
Johnothan King
f28bce61a7
Fix multiple problems with the getconf builtin (#280)
This commit fixes three problems with getconf pathbound builtin:
1. The -l/--lowercase option did not change all variable names to
   lower case.
2. The -q/--quote option now quotes all string values. Previously,
   it only quoted string values that had a space or other
   non-shellsafe character.
3. The -c/--call, -n/--name and -s/--standard options matched all
   variable names provided by 'getconf -a', even if none were
   actual matches.

Additionally, references to the confstr and sysconf functions have
been updated to reference section 3 of the man pages instead of
section 2.

src/lib/libast/port/astconf.c:
- Previously, only values that had spaces in them were quoted. Change
  that behavior to quote all string values by using the FMT_ALWAYS
  flag. Bug report: https://github.com/att/ast/issues/1173
- Not all variable names were printed in lowercase by 'getconf -l'.
  Fix it by adding a few missing instances of fmtlower.
  Bug report: https://github.com/att/ast/issues/1171
- Add the missing code to the '#if _pth_getconf_a' block to handle
  -c/-n/-s while parsing the OS's native 'getconf -a' output. This
  approach reuses code for name matching from other parts of
  astconflist(). Resolves: https://github.com/ksh93/ksh/issues/279

src/lib/libcmd/getconf.c:
- Update the documentation to note the -q flag only quotes strings.

src/cmd/ksh93/tests/bulitins.sh:
- Add regression tests for the getconf bugs fixed in this commit.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2021-04-21 03:34:54 +01:00
Martijn Dekker
b0a6c1bde5 Further fix '<>;' and fix crash on 32-bit systems (re: 6701bb30)
Accessing t->tre.treio for every sh_exec() run is invalid because
't' is of type Shnode_t, which is a union that can contain many
different kinds of structs. As all members of a union occupy the
same address space, only one can be used at a time. Which member is
valid to access depends on the node type sh_exec() was called with.
The invalid access triggered a crash on 32-bit systems when
executing an arithmetic command like ((x=1)).

The t->tre.treio union member should be accessed for a simple
command (case TCOM in sh_exec()). The fix is also needed for
redirections attached to blocks (case TSETIO) in which case the
union member to use is t->fork.forkio.

src/cmd/ksh93/sh/xec.c:
- Add check_exec_optimization() function that checks for all the
  conditions where the exec optimisation should not be done. For
  redirections we need to loop through the whole list to check for
  an IOREWRITE (<>;) one.
- sh_exec(): case TCOM (simple command): Only bother to call
  check_exec_optimization() if there are either command arguments
  or redirections (IOW: don't bother for bare variable
  assignments), so move it to within the if(io||argn) block.
- sh_exec(): case TSETIO: This needs a similar fix. To avoid the
  optimization breaking again if the last command is a subshell
  with a <>; redirection attached, we need to not only set execflg
  to 0 but also clear the SH_NOFORK state bit from the 'flags'
  variable which is passed on to the recursive sh_exec() call.

src/cmd/ksh93/tests/io.sh:
- Update and expand tests. Add tests for redirections attached to
  simple commands (TCOM) and various kinds of code block (TSETIO).

Co-authored-by: Johnothan King <johnothanking@protonmail.com>
Resolves: https://github.com/ksh93/ksh/issues/278
2021-04-17 21:56:39 +01:00
Martijn Dekker
ba43436f10 emacs: Fix digits input after completion (re: 16e4824c, e8b3274a)
Immediately after tab-completing the name of a directory, it is
not possible to type digits after the slash; ksh eats them as it
parses them as a menu selection for a nonexistent menu.

Reproducer:
$ mkdir -p emacstest/123abc
$ cd emacste[tab]123abc

Actual results:
$ cd emacstest/abc

Expected results:
$ cd emacstest/123abc

Workarounds are to press a non-numeric key followed by backspace,
or hit [tab] again to get a list of options.

Originally reported by Arnon Weinberg, 2012-12-23 07:15:19 UTC, at:
https://bugzilla.redhat.com/889745

The fix had been partially backported from ksh 93v- by AT&T
(16e4824c), which made things worse, so it was reverted (e8b3274a).
This commit backports a slightly edited version of the complete
fix. Thanks to @JohnoKing for finding the correct code. Discussion:
https://github.com/ksh93/ksh/issues/198#issuecomment-820178514

src/cmd/ksh93/edit/emacs.c: escape():
- Backport the fix for this bug that was implemented in ksh 93v-
  alpha 2013-10-10. Immediately after a slash, do not stay in "\"
  mode (file name completion) and reset the tab count.

src/cmd/ksh93/tests/pty.sh:
- Test the fix.

Resolves: https://github.com/ksh93/ksh/issues/198
2021-04-16 14:46:07 +01:00
Johnothan King
6701bb30de
Fix <>; redirection for final command exec optimization (#277)
The <>; operator doesn't work correctly if it's used as the last
command of a -c script. Reproducer:
  $ echo test > a; ksh -c 'echo x 1<>; a'; cat a
  x
  st
This bug is caused by ksh running the last command of -c scripts
with execve(2) instead of posix_spawn(3) or fork(2). The <>;
operator is noted by the man page as being incompatible with the
exec builtin (see also the ksh93u+ man page), so it's not
surprising this bug occurs when ksh runs a command using execve:

> <>;word cannot be used with the exec and redirect built-ins.

The ksh2020 fix simply removed the code required for ksh to use
this optimization at all. It's not a performance friendly fix and
only papers over the bug, so this commit provides a better fix.

This bug was first reported at:
https://github.com/att/ast/issues/9

In addition, this commit re-enables the execve(2) optimization for
the last command for scripts loaded from a file. It was enabled in
in older ksh versions, and was only disabled in interactive shells:
https://github.com/ksh93/ast-open-history/blob/2011-06-30/src/cmd/ksh93/sh/main.c#L593-L599
It was changed on 2011-12-24 to only be used for -c scripts:
https://github.com/ksh93/ast-open-history/blob/2011-12-24/src/cmd/ksh93/sh/main.c#L593-L599

We think there is no good reason why scripts loaded from a file
should be optimised less than scripts loaded from a -c argument.
They're both scripts; there's no essential difference between them.
So this commit reverts that change. If there is a bug left in the
optimization after this fix, this revert increases the chance of
exposing it so that it can be fixed.

src/cmd/ksh93/sh/xec.c:
- The IOREWRITE flag is set when handling the <>; operator, so to
  fix this bug, avoid exec'ing the last command if it uses <>;. See
  also commit 17ebfbf6, which fixed another issue related to the
  execve optimization.

src/cmd/ksh93/tests/io.sh:
- Enable a regression test that was failing because of this bug.
- Add the reproducer from https://github.com/att/ast/issues/9 as a
  regression test.

src/cmd/ksh93/sh/main.c:
- Only avoid the non-forking optimization in interactive shells.

src/cmd/ksh93/tests/signal.sh:
- Add an extra comment to avoid the non-forking optimization in the
  regression test for rhbz#1469624.
- If the regression test for rhbz#1469624 fails, show the incorrect
  exit status in the error message.

src/cmd/ksh93/tests/builtins.sh,
src/cmd/ksh93/tests/options.sh:
- This bugfix was causing the options regression test to segfault
  when run under shcomp. The cause is the same as
  <https://github.com/ksh93/ksh/issues/165>, so as a workaround,
  avoid parsing process substitutions with shcomp until that is
  fixed. This workaround should also avoid the other problem
  detailed in <https://github.com/ksh93/ksh/issues/274>.

Resolves: https://github.com/ksh93/ksh/issues/274
2021-04-15 18:29:50 +01:00
Martijn Dekker
519bb08265
Allow invoking path-bound built-in commands by direct path or preceding PATH assignment (#275)
Path-bound builtins on ksh (such as /opt/ast/bin/cat) break some
basic assumptions about paths in the shell that should hold true,
e.g., that a path output by whence -p or command -v should actually
point to an executable command. This commit should fix the
following:

1. Path-bound built-ins (such as /opt/ast/bin/cat) can now be
   executed by invoking the canonical path (independently of the
   value of $PATH), so the following will now work as expected:

        $ /opt/ast/bin/cat --version
          version         cat (AT&T Research) 2012-05-31
        $ (PATH=/opt/ast/bin:$PATH; "$(whence -p cat)" --version)
          version         cat (AT&T Research) 2012-05-31

   In the event an external command by that path exists, the
   path-bound builtin will now override it when invoked using the
   canonical path. To invoke a possible external command at that
   path, you can still use a non-canonical path, e.g.:
   /opt//ast/bin/cat or /opt/ast/./bin/cat

2. Path-bound built-ins will now also be found on a PATH set
   locally using an assignment preceding the command, so something
   like the following will now work as expected:

        $ PATH=/opt/ast/bin cat --version
          version         cat (AT&T Research) 2012-05-31

   The builtin is not found by sh_exec() because the search for
   builtins happens long before invocation-local preceding
   assignments are processsed. This only happens in sh_ntfork(),
   before forking, or in sh_fork(), after forking. Both sh_ntfork()
   and sh_fork() call path_spawn() to do the actual path search, so
   a check there will cover both cases.

   This does mean the builtin will be run in the forked child if
   sh_fork() is used (which is the case on interactive shells with
   job.jobcontrol set, or always after compiling with SHOPT_SPAWN
   disabled). Searching for it before forking would mean
   fundamentally redesigning that function to be basically like
   sh_ntfork(), so this is hard to avoid.

src/cmd/ksh93/sh/path.c: path_spawn():
- Before doing anything else, check if the passed path appears in
  the builtins tree as a pathbound builtin. If so, run it. Since a
  builtin will only be found if a preceding PATH assignment
  temporarily changed the PATH, and that assignment is currently in
  effect, we can just sh_run() the builtin so a nested sh_exec()
  invocation will find and run it.
- If 'spawn' is not set (i.e. we must return), set errno to 0 and
  return -2. See the change to sh_ntfork() below.

src/cmd/ksh93/sh/xec.c:
- sh_exec(): When searching for built-ins and the restricted option
  isn't active, also search bltin_tree for names beginning with a
  slash.
- sh_ntfork(): Only throw an error if the PID value returned is
  exactly -1. This allows path_spawn() to return -2 after running a
  built-in to tell sh_ntfork() to do the right things to restore
  state.

src/cmd/ksh93/sh/parse.c: simple():
- When searching for built-ins at parse time, only exclude names
  containing a slash if the restricted option is active. This
  allows finding pointers to built-ins invoked by literal path like
  /opt/ast/bin/cat, as long as that does not result from an
  expansion. This is not actually necessary as sh_exec() will also
  cover this case, but it is an optimisation.

src/lib/libcmd/getconf.c:
- Replace convoluted deferral to external command by a simple
  invocation of the path to the native getconf command determined
  at compile time (by src/lib/libast/comp/conf.sh). Based on:
  https://github.com/ksh93/ksh/issues/138#issuecomment-816384871
  If there is ever a system that has /opt/ast/bin/getconf as its
  default native external 'getconf', then there would still be an
  infinite recursion crash, but this seems extremely unlikely.

Resolves: https://github.com/ksh93/ksh/issues/138
2021-04-15 04:08:12 +01:00
Johnothan King
2c38fb93fd
Fix the exit status returned when a command isn't executable (#273)
Previous discussion: https://github.com/att/ast/issues/485

If ksh attempts to execute a non-executable command found in the
PATH, in some instances the error message and return status are
incorrect. In the example below, ksh returns with exit status 126
when using the -c execve(2) optimization or when using fork(2) in
an interactive shell. However, using posix_spawn(3) causes the exit
status to change:
  $ echo 'print cannot execute' > /tmp/x
  # Runs command with spawnveg (i.e., posix_spawn or vfork)
  $ ksh -c 'PATH=/tmp; x; echo $?'
  ksh: x: not found
  127
  # Runs command with execve
  $ ksh -c 'PATH=/tmp; x'; echo $?
  ksh: x: cannot execute [Permission denied]
  126
  # Runs command with fork
  $ ksh -ic 'PATH=/tmp; x; echo $?'
  ksh: x: cannot execute [Permission denied]
  126

Since 'x' is in the PATH but can't be executed, the correct exit
status is 126, not 127. It's worth noting this bug doesn't cause
the regression tests to fail with ksh93u+m, but it does cause one
test to fail when run under dtksh:

    path.sh[706]: Long nonexistent command name: got status 126, ''

This commit backports various fixes for this bug from ksh2020, with
additional fixes applied (since there were still some additional
issues the ksh2020 patch didn't fix). The lacking regression test
for exit status 126 in path.sh has been rewritten to test for more
scenarios where ksh failed to return the correct error message
and/or exit status. I can also confirm with this patch applied the
path.sh regression tests now pass when run under dtksh.

src/cmd/ksh93/sh/path.c:
- Add a comment to path_absolute() describing 'oldpp' is the
  current pointer in the while loop and 'pp' is the next pointer.
  Backported from:
  https://github.com/att/ast/commit/a6cad450

- The patch from ksh2020 didn't fix this bug in the SHOPT_SPAWN
  code (because ksh2020 prefers fork(2)), so issues with the exit
  status could still occur when using spawnveg. To fix this, always
  set 'noexec' to the value of errno if can_execute fails. Before
  this fix, errno was discarded if 'pp' was a null pointer and
  can_execute failed.

- If a command couldn't be executed and the error wasn't ENOENT,
  save errno in a 'not_executable' variable. If an executable
  command couldn't be found in the PATH, exit with status 126 and
  set errno to the saved value. This was based on a ksh2020 bugfix,
  but it has been reworked a little bit to fix a bug that caused a
  mismatch between the error message shown and errno. Example with
  a non-executable file in PATH:
  $ nonexec
  ksh2020: nonexec: cannot execute [No such file or directory]
  The ksh2020 patch: <https://github.com/att/ast/pull/493>

- Backport a ksh2020 bugfix for directories in the PATH when
  running one of the added regression tests on OpenBSD:
  https://github.com/att/ast/pull/767

src/cmd/ksh93/data/msg.c,
src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/sh/{path,xec}.c:
- If a command name is too long (ENAMETOOLONG), then it wasn't
  found in the PATH. For that case return exit status 127, like
  for ENOENT.

src/cmd/ksh93/tests/path.sh:
- Replace the old test with a new set of more extensive tests.
  These tests check the error message and exit status when ksh
  attempts to run a command using any of the following:
   - execve(2), used with the last command run with -c       (*A tests).
   - posix_spawn(3)/vfork(2), used in noninteractive scripts (*B tests).
   - fork(2), used in interactive shells with job control    (*C tests).
   - command -x                                              (*D tests).
   - exec(1)                                                 (*E tests).
- Add a regression test from ksh2020 for attempting to execute a
  directory:
  https://github.com/att/ast/pull/758

src/lib/libast/include/ast.h,
src/lib/libast/include/wait.h:
- Avoid bitshifts in macros for static error codes. The return
  values of command not found and exec related errors are static
  values and should not require any macro magic for calculation.
  Backported from: https://github.com/att/ast/commit/c073b102
- Simplify EXIT_* and W* macros to use 8 bits.
2021-04-15 03:37:57 +01:00
hyenias
d6ddd89053
Correct memory fault when removing default nameref KSH_VERSION (#271)
This commit fixes a segmentation fault when an attempt was made to
unset the default KSH_VERSION variable prior any other nameref
activity such as creating another nameref or even reassigning the
nameref KSH_VERSION to something else.

(new shell without prior nameref activity)
$ nameref
KSH_VERSION=.sh.version
$ unset -n KSH_VERSION
Memory fault

src/cmd/ksh93/sh/name.c: _nv_unset():
- Add a 'Refdict' check before attempting to remove a value from it
  as apparently one does not exist until some sort of nameref
  activity occurs after shell startup as the default nameref of
  'KSH_VERSION=.sh.version' does not create one.
2021-04-13 03:15:34 +01:00
Johnothan King
75796a9c75
Fix += operator regressions (re: fae8862c) (#270)
The bugfix for BUG_CMDSPASGN backported in commit fae8862c caused
two regressions with the += operator:

1. The += operator did not append to variables. Reproducer:
     $ integer foo=3
     $ foo+=2 command eval 'echo $foo'
     2

2. The += operator ignored the readonly attribute, modifying readonly
   variables in the same manner as above. Reproducer
     $ readonly bar=str
     $ bar+=ing command eval 'echo $bar'
     ing

Both of the regressions above were caused by nv_putval() failing to
clone the variable from the previous scope into the invocation-local
scope. As a result, 'foo+=2' was effectively 0 + 2 (since ksh didn't
clone 3). The first regression was noticed during the development of
ksh93v-, so to fix both bugs I've backported the bugfix for the
regression from the ksh93v- 2013-10-10 alpha version:
https://www.mail-archive.com/ast-users@lists.research.att.com/msg00369.html

src/cmd/ksh93/sh/name.c:
- To fix both of the bugs above, find the variable to modify with
  nv_search(), then clone it into the invocation local scope. To
  fix the readonly bug as well, this is done before the NV_RDONLY
  check (otherwise np will be missing that attribute and be
  incorrectly modified in the invocation-local scope).
- Update a nearby comment describing what sh_assignok() does (per this
  comment: https://github.com/ksh93/ksh/pull/249#issuecomment-811381759)

src/cmd/ksh93/tests/builtins.sh:
- Add regression tests for both of the now fixed regressions,
  loosely based on the regression tests in ksh93v-.
2021-04-12 01:24:33 +01:00
Martijn Dekker
d50d3d7c4c Reset arithmetic recursion level on all errors (re: 264ba48b)
The recursion level for arithmetic expressions is kept track of in
a static 'level' variable in streval.c. It is reset when arithmetic
expressions throw an error.

But an error for an arithmetic expression may also occur elsewhere
-- at least in one case: when an arithmetic expression attempts to
change a read-only variable. In that case, the recursion level is
never reset because that code does not have access to the static
'level' variable.

If many such conditions occur (as in the new readonly.sh regression
tests), an arithmetic command like 'i++' may eventually fail with a
'recursion too deep' error.

To mitigate the problem, MAXLEVEL in streval.c was changed from 9
to 1024 in 264ba48b (as in the ksh 93v- beta). This commit leaves
that increase, but adds a proper fix.

src/cmd/ksh93/include/defs.h:
- Add global sh.arithrecursion (a.k.a. shp->arithrecursion)
  variable to keep track of the arithmetic recursion level,
  replacing the static 'level' variable in streval.c.

src/cmd/ksh93/sh/xec.c: sh_exec():
- Reset sh.arithrecursion before starting a new simple command
  (TCOM), a new subshell with parentheses (TPAR), a new pipe
  (TFIL), or a new [[ ... ]] command (TTST). These are the same
  places where 'echeck' is set to 1 for --errexit and ERR trap
  checks, so it should cover everything.

src/cmd/ksh93/sh/streval.c:
- Change all uses of 'level' to sh.arithrecursion.
- _seterror, aritherror(): No longer bother to reset the level
  to zero here; xec.c should have this covered for all cases now.

src/cmd/ksh93/tests/arith.sh:
- Add tests for main shell and subshell.
2021-04-11 01:25:19 +01:00
Johnothan King
5461f11968
Fix handling of '--posix' and '--default' (#265)
src/cmd/ksh93/sh/args.c: sh_argopts():
- Remove special-casing for --posix (see also data/builtins.c) and
  move the case -5: to the case ':' instead, so this option is
  handled like all other long options. This change fixes two bugs:
  1. 'set --posix' had no effect on the letoctal or braceexpand
     options. Reproducer:
       $ set --posix
       $ [[ -o braceexpand ]]; echo $?
       0
       $ [[ -o letoctal ]]; echo $?
       1
  2. 'ksh --posix' could not run scripts correctly because it
     wrongly enabled '-c'. Reproducer:
       $ ksh --posix < <(echo 'exit 0')
       ksh: -c requires argument
       Usage: ksh [--posix] [arg ...]
       Help: ksh [ --help | --man ] 2>&1
- Don't allow 'set --default' to unset the restricted option.

src/cmd/ksh93/tests/options.sh:
- Add regression tests for the bugs described above, using -o posix
  and --posix.

src/cmd/ksh93/tests/restricted.sh:
- Add a regression test for 'set --default' in rksh.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2021-04-09 23:26:07 +01:00
Johnothan King
504cbda269
Fix 'printf %T' ignoring the current locale in LC_TIME (#263)
src/lib/libast/tm/tmlocale.c:
- Load the locale set by LC_TIME or LC_ALL if it hasn't been loaded
  before or if it was loaded previously but isn't the current locale.

src/cmd/ksh93/tests/locale.sh:
- Add a regression test using the nl_NL.UTF-8 and ja_JP.UTF-8 locales.

Fixes: https://github.com/ksh93/ksh/issues/261
2021-04-09 03:49:48 +01:00
Johnothan King
a065558291
Fix more compiler warnings, typos and other minor issues (#260)
Many of these changes are minor typo fixes. The other changes
(which are mostly compiler warning fixes) are:

NEWS:
- The --globcasedetect shell option works on older Linux kernels
  when used with FAT32/VFAT file systems, so remove the note about
  it only working with 5.2+ kernels.

src/cmd/ksh93/COMPATIBILITY:
- Update the documentation on function scoping with an addition
  from ksh93v- (this does apply to ksh93u+).

src/cmd/ksh93/edit/emacs.c:
- Check for '_AST_ksh_release', not 'AST_ksh_release'.

src/cmd/INIT/mamake.c,
src/cmd/INIT/ratz.c,
src/cmd/INIT/release.c,
src/cmd/builtin/pty.c:
- Add more uses of UNREACHABLE() and noreturn, this time for the
  build system and pty.

src/cmd/builtin/pty.c,
src/cmd/builtin/array.c,
src/cmd/ksh93/sh/name.c,
src/cmd/ksh93/sh/nvtype.c,
src/cmd/ksh93/sh/suid_exec.c:
- Fix six -Wunused-variable warnings (the name.c nv_arrayptr()
  fixes are also in ksh93v-).
- Remove the unused 'tableval' function to fix a -Wunused-function
  warning.

src/cmd/ksh93/sh/lex.c:
- Remove unused 'SHOPT_DOS' code, which isn't enabled anywhere.
  https://github.com/att/ast/issues/272#issuecomment-354363112

src/cmd/ksh93/bltins/misc.c,
src/cmd/ksh93/bltins/trap.c,
src/cmd/ksh93/bltins/typeset.c:
- Add dictionary generator function declarations for former
  aliases that are now builtins (re: 1fbbeaa1, ef1621c1, 3ba4900e).
- For consistency with the rest of the codebase, use '(void)'
  instead of '()' for print_cpu_times.

src/cmd/ksh93/sh/init.c,
src/lib/libast/path/pathshell.c:
- Move the otherwise unused EXE macro to pathshell() and only
  search for 'sh.exe' on Windows.

src/cmd/ksh93/sh/xec.c,
src/lib/libast/include/ast.h:
- Add an empty definition for inline when compiling with C89.
  This allows the timeval_to_double() function to be inlined.

src/cmd/ksh93/include/shlex.h:
- Remove the unused 'PIPESYM2' macro.

src/cmd/ksh93/tests/pty.sh:
- Add '# err_exit #' to count the regression test added in
  commit 113a9392.

src/lib/libast/disc/sfdcdio.c:
- Move diordwr, dioread, diowrite and dioexcept behind
  '#ifdef F_DIOINFO' to fix one -Wunused-variable warning and
  multiple -Wunused-function warnings (sfdcdio() only uses these
  functions when F_DIOINFO is defined).

src/lib/libast/string/fmtdev.c:
- Fix two -Wimplicit-function-declaration warnings on Linux by
  including sys/sysmacros.h in fmtdev().
2021-04-08 19:58:07 +01:00
Martijn Dekker
2e5b625915 Allow path-bound builtins on restricted shells
If a system administrator prefixes /opt/ast/bin to the path and
then invokes the shell in restricted mode, they clearly intend for
the user to run those AST utilities.

Similarly, if a system administrator sets a PATH for a restricted
shell that includes libraries listed in the .paths file, they must
have intended for the user to use those loadable built-ins, as they
will be associated with the pathnames of their respective
libraries. Since the user cannot change PATH or use the builtin
command, they still cannot load just any built-in they choose.

src/cmd/ksh93/sh/path.c:
- Remove SH_RESTRICTED check when handling path-bound builtins
  or dynamic libaries containining builtins in $PATH.

src/cmd/ksh93/tests/builtins.sh:
- Add test verifying a restricted user can use /opt/ast/bin/cat
  via a PATH search.

Progresses: https://github.com/ksh93/ksh/issues/138
2021-04-08 14:48:29 +01:00
Johnothan King
0cd8646361
Backport bugfix for BUG_CSUBSTDO from ksh93v- 2012-08-24 (#259)
This commit fixes BUG_CSUBSTDO, which could break stdout inside of
non-forking command substitutions. The breakage only occurred when
stdout was closed outside of the command substitution and a file
descriptor other than stdout was redirected in the command substitution
(such as stderr). Thanks to the ast-open-history repo, I was able to
identify and backport the bugfix from ksh93v- 2012-08-24.

This backport may fix other bugs as well. On 93v- 2012-08-24 it
fixed the regression below, though it was not triggered on 93u+(m).
  src/cmd/ksh93/tests/heredoc.sh
  487 print foo > $tmp/foofile 
  488 x=$( $SHELL 2> /dev/null 'read <<< $(<'"$tmp"'/foofile) 2> /dev/null;print -r "$REPLY"') 
  489 [[ $x == foo ]] || err_exit '<<< $(<file) not working' 

src/cmd/ksh93/sh/io.c: sh_open():
- If the just-opened file descriptor exists in sftable and is
  flagged with SF_STRING (as in non-forking command substitutions,
  among other situations), then move the file descriptor to a
  number >= 10.

src/cmd/ksh93/tests/io.sh:
- Add a regression test for BUG_CSUBSTDO, adapted from the one in
  modernish.
2021-04-08 13:24:17 +01:00
Johnothan King
b2a7ec032f
Add LC_TIME to the supported locale variables (#257)
The current version of 93u+m does not have proper support for the
LC_TIME variable. Setting LC_TIME has no effect on printf %T, and
if the locale is invalid no error message is shown:
    $ LC_TIME=ja_JP.UTF-8
    $ printf '%T\n' now
    Wed Apr  7 15:18:13 PDT 2021
    $ LC_TIME=invalid.locale
    $ # No error message

src/cmd/ksh93/data/variables.c,
src/cmd/ksh93/include/variables.h,
src/cmd/ksh93/sh/init.c:
- Add support for the $LC_TIME variable. ksh93v- attempted to add
  support for LC_TIME, but the patch from that version was extended
  because the variable still didn't function correctly.

src/cmd/ksh93/tests/variables.sh:
- Add LC_TIME to the regression tests for LC_* variables.
2021-04-08 13:06:22 +01:00
Martijn Dekker
6b9703ffdd Backport bugfixes for arrays of 'enum' types from ksh 93v- beta
These fixes are applied rather blindly as no one has yet managed to
understand the almost entirely uncommented arrays and variables
handling code (arrays.c, name.c, nvdisc.c, nvtree.c, nvtype.c).
Hopefully we'll figure all that out at some point. In the meantime
these backported fixes appear to work fine, and these bugs impact
the usability of 'enum', so I'm just going to have to violate my
own policy and backport these fixes without understanding them.
Thanks to @JohnoKing for putting in a lot of work tracing these.

Further discussion at: https://github.com/ksh93/ksh/issues/87

src/cmd/ksh93/sh/array.c:
- nv_arraysettype():
  * Further simplify the function. After my initial simplification
    of it (re: 5491fe97), I don't believe there's actually a need
    to save a duplicate copy of the value. Use the pointer returned
    by nv_getval() directly to restore the value.
  * Cope with a null value (nv_getval() returning a NULL pointer).
    This is needed for compatibility with the backported fix in
    nvtype.c (below).
- array_putval(): If the array's value pointer (up->cp) is a
  pointer to the empty string, it is set to NULL before calling
  nv_putv() to prevent an empty string from being deleted. Backport
  a fix from 93v- that restores the pointer to the empty string if
  the NV_NOFREE attribute is set. Removing it somehow causes these
  regressions:
	enum.sh[86]: ${array[@]} doesn't yield all values for
	associative enum arrays (expected 'green blue blue red
	yellow green red orange'; got 'green blue blue  yellow
	green  orange')
	enum.sh[94]: unsetting associative enum array does not work
	(got 'Color_t -A Colors=([foo]=red [rood]=red)')
	enum.sh[116]: assigning first enum element to indexed array
	failed (expected 'red red'; got 'BUG BUG')
- nv_associative(): Do not increase the 'nelem' (number of
  elements) value of the array's 'header' struct if the array is
  associative and of an enum type. The original 93v- fix only
  checked for the NV_INTEGER attribute, but backporting that caused
  several regressions. Using a debug output command I've determined
  that the exact value of 'type' is somehow consistently set to
  0x26 if the array is associative and of an enum type, which is
  NV_INTEGER | NV_LTOU | NV_RJUST as defined in include/nval.h. I
  cannot find where/how that value is determined. In any case this
  fix, based on but more specific than the 93v- one, appears to
  work fine. Removing it somehow causes this regression:
	enum.sh[94]: unsetting associative enum array does not work
	(got 'Color_t -A Colors=()')

src/cmd/ksh93/sh/nvtype.c: nv_settype():
- Another fix backported from 93v-. If the variable is an array,
  also set the type of element 0 of that array using a call to
  nv_arraysettype(). The value may be null. Removing this somehow
  causes this regression:
	enum.sh[94]: unsetting associative enum array does not work
	(got 'Color_t -A Colors=()')

src/cmd/ksh93/tests/enum.sh:
- Add tests for all the bugs fixed here, plus some hypothetical
  bugs (e.g., do the same tests for indexed enum type arrays as for
  associative enum type arrays, even though indexed enum type
  arrays didn't have all the same problems).

Co-authored-by: Johnothan King <johnothanking@protonmail.com>
Resolves: https://github.com/ksh93/ksh/issues/87
2021-04-06 06:33:32 +01:00