1
0
Fork 0
mirror of git://git.code.sf.net/p/cdesktopenv/code synced 2025-02-13 11:42:21 +00:00
Commit graph

344 commits

Author SHA1 Message Date
atheik
86b94d9feb libast: optget(3): Fix memory leak in --help/--man info
src/lib/libast/misc/optget.c: textout(): case ']':
- Before returning, call pop() to free any \f...\f info items that
  are left. Note that this is a safe no-op if the pointer is null.

Resolves: https://github.com/ksh93/ksh/issues/407
Co-authored-by: Martijn Dekker <martijn@inlv.org>
2022-03-11 21:24:08 +01:00
Martijn Dekker
fd28da31da Fix another test/[ corner case bug; add --posix test script
This fixes another corner case bug in the horror show that is the
test/[ comand.

Reproducer:

   $ ksh --posix -c 'test X -a -n'
   ksh: test: argument expected

Every other shell returns 0 (success) as, POSIXly, this is a test
for the strings 'X' and '-n' both being non-empty, combined with
the binary -a (logical and) operator. Instead, '-n' was taken as a
unary primary operator with a missing argument, which is incorrect.

POSIX reference:
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/test.html
> 3 arguments:
> * If $2 is a binary primary, perform the binary test of $1 and $3.

src/cmd/ksh93/bltins/test.c:
- e3(): If the final argument begins with a dash, always treat it
  as a test for a non-empty string, therefore return true. Do not
  limit this to "new flags" only.

src/cmd/ksh93/tests/posix.sh:
- Added. These are tests for every aspect of the POSIX mode.
2022-03-11 21:23:45 +01:00
Martijn Dekker
9e2a8c6925 posix mode: disable effect of repeating whitespace char in $IFS
ksh has a little-known field splitting feature that conflicts with
POSIX: if a single-byte whitespace character (cf. isspace(3)) is
repated in $IFS, then field splitting is done as if that character
wasn't a whitespace character. An exmaple with the tab character:

  $ (IFS=$'\t'; val=$'\tone\t\ttwo\t'; set -- $val; echo $#)
  2
  $ (IFS=$'\t\t'; val=$'\tone\t\ttwo\t'; set -- $val; echo $#)
  4
The latter being the same as, for example
  $ (IFS=':'; val='1️⃣2️⃣'; set -- $val; echo $#)
  4

However, this is incompatible with the POSIX spec and with every
other shell except zsh, in which repeating a character in IFS does
not have any effect. So the POSIX mode must disable this.

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/init.c:
- Add sh_invalidate_ifs() function that invalidates the IFS state
  table by setting the ifsnp discipline struct member to NULL,
  which will cause the next get_ifs() call to regenerate it.
- get_ifs(): Treat a repeated char as S_DELIM even if whitespace,
  unless --posix is on.

src/cmd/ksh93/sh/args.c:
- sh_argopts(): Call sh_invalidate_ifs() when enabling or disabling
  the POSIX option. This is needed to make the change in field
  splitting behaviour take immediate effect instead of taking
  effect at the next assignment to IFS.
2022-03-11 21:22:22 +01:00
Martijn Dekker
fae1932e62 enum: remove arbitrary one-argument limitation
b_enum() contains a check that exactly one argument is given:

237:	if (error_info.errors || !*argv || *(argv + 1))

But the subsequent argument handling loop will happily deal with
multiple arguments:

246:	while(cp = *argv++)

Every other declaration command supports multiple arguments and I
see no reason why enum shouldn't. Simply removing the '*(argv + 1)'
check allows 'enum' to create more than one type per invocation.

src/cmd/ksh93/bltins/enum.c:
- b_enum(): Remove check for >1 args as described above.
- Update documentation to describe the behaviour of enumeration
  types in arithmetic expressions and to add an example: a bool
  type with two enumeration values 'false' (0) and 'true' (1).
  That type is predefined in ksh 93v- and 2020. We're not going
  to do that in 93u+m but it's good to document the possibility.

src/cmd/ksh93/sh.1:
- Make changes parallel to the enum.c self-doc update.
2022-03-11 21:21:23 +01:00
Johnothan King
8fc8c2f51c Fix a few minor issues (#473)
Changes:
- Fixed two xtrace test failures introduced in commit cfc8744c.
- The definition of _use_ntfork_tcpgrp in xec.c is now dependent on
  SHOPT_SPAWN being defined (re: 8e9ed5be).
- Removed many unnecessary newlines and fixed various typos.
2022-03-11 21:18:42 +01:00
Johnothan King
bb3527aea5 Fix infinite loop when posix_spawn fails (re: 0863a8eb) (#468)
This commit fixes an infinite loop introduced in commit 0863a8eb
that caused ksh to enter an infinite loop if posix_spawn failed
to start a new process after setting the terminal process group.
Reproducer (warning: it will cause ksh to crash Wayland sessions
and drives up CPU usage by a ton):

   $ /tmp/this/file/does/not/exist
   /usr/bin/ksh: /tmp/this/file/does/not/exist: not found
   $ <Press enter>
   (ksh now prints $PS1 in a loop until killed with SIGKILL)

The first bug fixed is the infinite loop that occurs when
posix_spawn fails to execute a command. This was fixed by setting
the terminal process group to the main interactive shell.

The second bug fixed is related to the signal handling of the
SIGTTIN, SIGTTOU and SIGTSTP signals. In sh_ntfork() these signals
are set to their default signal handlers (SIG_DFL) before running
a command. The signal handlers were only restored to SIG_IGN
(ignore signal) when sh_ntfork() successfully ran a command.
This could cause a SIGTTOU lockup under strace when a command
failed to execute in an interactive shell, while also being one
cause of the infinite loop.

src/cmd/ksh93/sh/xec.c: sh_ntfork():
- Restore the terminal process group if posix_spawn failed to
  launch a new process. This is necessary because posix_spawn will
  set the terminal process group before it attempts to run a
  command and doesn't restore it on failure.
2022-03-11 21:14:20 +01:00
atheik
2e5fd4d4c1 slowread(): Turn off O_NONBLOCK for stdin if it is on (#471)
This change turns off O_NONBLOCK for stdin if a previously ran
program left it on so that interactive programs that expect it
to be off work properly.

src/cmd/ksh93/sh/io.c: slowread():
- Turn off O_NONBLOCK for stdin if it is on.

Fixes: https://github.com/ksh93/ksh/issues/469
2022-03-11 21:10:59 +01:00
Johnothan King
e87dbebebd Fix use after free bug when using += (re: 75796a9c) (#466)
The previous fix for the += operator introduced a use-after-free
bug that could result in a variable pointing to random garbage:
   $ foo=bar
   $ foo+=_foo true
   $ typeset -p foo
   foo=V V
The use after free issue occurs because when nv_clone creates a
copy of $foo in the true command's invocation-local scope, it does
not duplicate the string $foo points to. As a result, the $foo
variable in the parent scope points to the same string as $foo in
the invocation-local scope, which causes the use after free bug
when cloned $foo variable is freed from memory.

src/cmd/ksh93/sh/nvdisc.c:
- To fix the use after free bug, allow nv_clone to duplicate the
  string with memdup or strdup when no flags are passed.

src/cmd/ksh93/tests/variables.sh:
- Add a regression test for using the += operator with regular
  commands.

src/cmd/ksh93/tests/leaks.sh:
- Add a regression test to ensure the bugfix doesn't introduce any
  memory leaks.
2022-03-11 21:08:57 +01:00
Martijn Dekker
b09ce2fa02 Fix crash when suspending a blocked write to a FIFO
Reproducer (symptoms on at least macOS and FreeBSD):

$ mkfifo f
$ echo foo > f
(press Ctrl+Z)
^Zksh: f: cannot create [Interrupted system call]
Abort

The shell either aborts (dev builds) or crashes with 'Illegal
instruction' (release builds). This is consistent with
UNREACHABLE() being reached.

Backtrace:

0   libsystem_kernel.dylib    __kill + 10
1   ksh                       sh_done + 836 (fault.c:678)
2   ksh                       sh_fault + 1324
3   libsystem_platform.dylib  _sigtramp + 29
4   dyld                      ImageLoaderMachOCompressed::resolve(ImageLoader::LinkContext const&, char const*, unsigned ch
5   libsystem_c.dylib         abort + 127
6   ksh                       sh_redirect + 3576 (io.c:1356)
7   ksh                       sh_exec + 7231 (xec.c:1308)
8   ksh                       exfile + 3247 (main.c:607)
9   ksh                       sh_main + 3551 (main.c:368)
10  ksh                       main + 38 (pmain.c:45)
11  libdyld.dylib             start + 1

This means that UNREACHABLE() is actually reached here:

ksh/src/cmd/ksh93/sh/io.c
1351: if((fd=sh_open(tname?tname:fname,o_mode,RW_ALL)) <0)
1352: {
1353: 	errormsg(SH_DICT,ERROR_system(1),((o_mode&O_CREAT)?e_create:e_open),fname);
1354: 	UNREACHABLE();
1355: }

The cause is that, in the following section of code in sh_fault():

ksh/src/cmd/ksh93/sh/fault.c
183: #ifdef SIGTSTP
184: 		if(sig==SIGTSTP)
185: 		{
186: 			sh.trapnote |= SH_SIGTSTP;
187: 			if(pp->mode==SH_JMPCMD && sh_isstate(SH_STOPOK))
188: 			{
189: 				sigrelease(sig);
190: 				sh_exit(SH_EXITSIG);
191: 				return;
192: 			}
193: 		}
194: #endif /* SIGTSTP */

...sh_exit() is not getting called and the function will not return
because the SH_STOPOK bit is not set while the shell is blocked
waiting to write to a FIFO.

Even if sh_exit() did get called, that would not fix it, because
that function also checks for the SH_STOPOK bit and returns without
doing a longjmp if the signal is SIGTSTP and the SH_STOPOK bit is
not set. That is direct the reason why UNREACHABLE() was raeched:
errormsg() does call sh_exit() but sh_exit() then does not longjmp.

src/cmd/ksh93/sh/fault.c: sh_fault():
- To avoid the crash, we simply need to return from sh_fault() if
  SH_STOPOK is off, so that the code path does not continue, no
  error message is given on Ctrl+Z, UNREACHABLE() is not reached,
  and the shell resumes waiting on trying to write to the FIFO.
  The sh.trapnote flag should not be set if we're not going to
  process the signal. This makes ksh behave like all other shells.

Resolves: https://github.com/ksh93/ksh/issues/464
2022-02-17 20:21:23 +00:00
Martijn Dekker
11177d448d Fix crash on cd in subshell with PWD unset (re: 5ee290c)
Reproducer:

$ ksh -c 'unset PWD; (cd /); :'
Memory fault

The shell crashes because b_cd() is testing the value of the PWD
variable without checking if there is one.

src/cmd/ksh93/sh/path.c: path_pwd():
- Never return an unfreeable pointer to e_dot; always return a
  freeable pointer. This fixes another corner-case crashing bug.
- Make sure the PWD variable gets assigned a value if it doesn't
  have one, even if it's the "." fallback. However, if the PWD is
  inaccessible but we did inherit a $PWD value that starts with a
  /, then use the existing $PWD value as this will help the shell
  fail gracefully.

src/cmd/ksh93/bltins/cd_pwd.c:
- b_cd(): When checking if the PWD is valid, use the sh.pwd copy
  instead of the PWD variable. This fixes the crash above.
- b_cd(): Since path_pwd() now always returns a freeable value,
  free sh.pwd unconditionally before setting the new value.
- b_pwd(): Not only check that path_pwd() returns a value starting
  with a slash, but also verify it with test_inode() and error out
  if it's wrong. This makes the 'pwd' command useful for checking
  that the PWD is currently accessible.

src/cmd/ksh93/data/msg.c:
- Change e_pwd error message for accuracy and clarity.
2022-02-17 19:45:37 +00:00
Martijn Dekker
d55e9686d7 Backport 'read -a' and 'read -p' from ksh 93v-/2020
This backports two minor additions to the 'read' built-in from ksh
93v-: '-a' is now the same as '-A' and '-u p' is the same as '-p'.
This is for compatibility with some 93v- or ksh2020 scripts.

Note that their change to the '-p' option to support both prompts
and reading from the coprocess was *not* backported because we
found it to be broken and unfixable. Discussoin at:
https://github.com/ksh93/ksh/issues/463

src/cmd/ksh93/bltins/read.c: b_read():
- Backport as described above.
- Rename the misleadingly named 'name' variable to 'prompt'.
  It points to the prompt string, not to a variable name.

src/cmd/ksh93/data/builtins.c: sh_optpwd[]:
- Add -a as an alterative to -A. All that is needed is adding '|a'
  and optget(3) will automatically convert it to 'A'.
- Change -u from a '#' (numeric) to ':' option to support 'p'. Note
  that b_read() now needs a corresponding strtol() to convert file
  descriptor strings to numbers where applicable.
- Tweaks.

src/cmd/ksh93/sh.1:
- Update accordingly.
- Tidy up the unreadable mess that was the 'read' documentation.
  The options are now shown in a list.
2022-02-17 19:44:54 +00:00
Martijn Dekker
95d695cb5a Improve and document fast filescan loops (SHOPT_FILESCAN)
From README:

FILESCAN on  Experimental option that allows fast reading of files
             using while < file;do ...; done and allowing fields in
             each line to be accessed as positional parameters.

As SHOPT_FILESCAN has been enabled by default since ksh 93l
2001-06-01, the filescan loop is now documented in the manual page
and the compile-time option is no longer considered experimental.

We must disable this at runtime if --posix is active because it
breaks a portable use case: POSIXly, 'while <file; do stuff; done'
repeatedly excutes 'stuff' while 'file' can successfully be opened
for reading, without actually reading from 'file'.

This also backports a bugfix from the 93v- beta. Reproducer:

$ echo 'one two three' >foo
$ while <foo; do printf '[%s] ' "$@"; echo; done
[one two three]

Expected output:
[one] [two] [three]

The bug is that "$@" acts like "$*", joining all the positional
parameters into one word though it should be generating one word
for each.

src/cmd/ksh93/sh/macro.c: varsub():
- Backport fix for the bug described above. I do not understand the
  opaque macro.c code well enough yet to usefully describe the fix.

src/cmd/ksh93/sh/xec.c: sh_exec():
- Improved sanity check for filescan loop: do not recognise it if
  the simple command includes variable assignments, more than one
  redirection, or an output or append redirection.
- Disable filescan loops if --posix is active.
- Another 93v- fix: handle interrupts (errno==EINTR) when closing
  the input file.
2022-02-17 19:43:36 +00:00
Martijn Dekker
4886463bb6 Disable broken KEYBD trap for multibyte characters
In UTF-8 locales, ksh breaks when a KEYBD trap is active, even a
dummy no-op one like 'trap : KEYBD'. Entering multi-byte characters
fails (the input is interrupted and a new prompt is displayed) and
pasting content with multi-byte characters produces corrupted
results.

The cause is that the KEYBD trap code is not multibyte-ready.
Unfortunately nobody yet understands the edit.c code well enough
to implement a proper fix. Pending that, this commit implements
a workaround that at least avoids breaking the shell.

src/cmd/ksh93/edit/edit.c: ed_getchar():
- When a multi-byte locale is active, do not trigger the the KEYBD
  trap except for ASCII characters (1-127).

Resolves: https://github.com/ksh93/ksh/issues/307
2022-02-17 19:39:42 +00:00
Martijn Dekker
56c0c24b55 Do not disable completion along with pathname expansion
The -f/--noglob shell option is documented simply as: "Disables
pathname expansion." But after 'set -f' on an interactive shell,
command completion and file name completion also stop working. This
is because they internally use the pathname expansion mechanism.
But it is not documented anywhere that 'set -f' disables
completion; it's just a side effect of an implementation detail.

Though ksh has always acted like this, I think it should change
because it's not useful or expected behaviour. Other shells like
bash, yash or zsh don't act like this.

src/cmd/ksh93/sh/expand.c,
src/cmd/ksh93/sh/macro.c:
- Allow the SH_COMPLETE (command completion) or SH_FCOMPLETE
  (file name completion) state bit to override SH_NOGLOB in
  path_generate() and in sh_macexpand().
2022-02-17 19:38:15 +00:00
Martijn Dekker
6304dfce41 Fix corner-case >&- redirection leak out of subshell
Reproducer:

    exec 9>&1
    ( { exec 9>&1; } 9>&- )
    echo "test" >&9 # => 9: cannot open [Bad file descriptor]

The 9>&- incorrectly persists beyond the { } block that it
was attached to *and* beyond the ( ) subshell. This is yet another
bug with non-forking subshells; forking it with something like
'ulimit -t unlimited' works around the bug.

In over a year we have not been able to find a real fix, but I came
up with a workaround that forks a virtual subshell whenever it
executes a code block with a >&- or <&- redirection attached. That
use case is obscure enough that it should not cause any performance
regression except in very rare corner cases.

src/cmd/ksh93/sh/xec.c: sh_exec(): TSETIO:
- This is where redirections attached to code blocks are handled.
  Check for a >&- or <&- redirection using bit flaggery from
  shnodes.h and fork if we're executing such in a virtual subshell.

Resolves: https://github.com/ksh93/ksh/issues/161
Thanks to @ko1nksm for the bug report.
2022-02-17 19:35:47 +00:00
Johnothan King
f38494ea1d Fix multiple bugs in .sh.match (#455)
This commit backports all of the relevant .sh.match bugfixes from
ksh93v-. Most of the .sh.match rewrite is from versions 2012-08-24
and 2012-10-04, with patches from later releases of 93v- and
ksh2020 also applied. Note that there are still some remaining bugs
in .sh.match, although now the total count of .sh.match bugs should
be less that before.

These are the relevant changes in the ksh93v- changelog that were
backported:
12-08-07  .sh.match no longer gets set for patterns in PS4 during
          set -x.
12-08-10  Rewrote .sh.match expansions fixing several bugs and
          improving performance.
12-08-22  .sh.match now handles subpatterns that had no matches with
          ${var//pattern} correctly.
12-08-21  A bug in setting .sh.match after ${var//pattern/string}
          when string is empty has been fixed.
12-08-21  A bug in setting .sh.match after [[ string == pattern ]]
          has been fixed.
12-08-31  A bug that could cause a core dump after
          typeset -m var=.sh.match has been fixed.
12-09-10  Fixed a bug in typeset -m the .sh.match is being renamed.
12-09-07  Fixed a bug in .sh.match code that coud cause the shell
          to quitely
13-02-21  The 12-01-16 bug fix prevented .sh.match from being used
          in the replacement string. The previous code was restored
          and a different fix which prevented .sh.match from being
          computed for nested replacement has been used instead.
13-05-28  Fixed two bug for typeset -c and typeset -m for variable
          .sh.match.

Changes:
- The SHOPT_2DMATCH option has been removed. This was already the
  default behavior previously, and now it's documented in the man
  page.
- init.c: Backported the sh_setmatch() rewrite from 93v- 2012-08-24
  and 2012-10-04.
- Backported the libast 93v- strngrpmatch() function, as the
  .sh.match rewrite requires this API.
- Backported the sh_match regression tests from ksh93v-, with many
  other sh_match tests backported from ksh2020. Much of the sh_match
  script is based on code from Roland Mainz:
  https://marc.info/?l=ast-developers&m=134606574109162&w=2
  https://marc.info/?l=ast-developers&m=134490505607093
- tests/{substring,treemove}.sh: Backported other relevant .sh.match
  fixes, with tests added to the substring and treemove test scripts.
- tests/types.sh: One of the (now reverted) memory leak bugfixes
  introduced a CI test failure in this script, so for that test the
  error message has been improved.
- string/strmatch.c: The original ksh93v- code for the strngrpmatch()
  changes introduced a crash that could occur because strlen would
  be used on a null pointer. This has been fixed by avoiding strlen
  if the string is null.

One nice side effect of these changes is a considerable performance
improvement in the shbench[1] gsub benchmark (results from 20
iterations with CCFLAGS=-Os):
--------------------------------------------------
name      /tmp/ksh-current     /tmp/ksh-matchfixes
--------------------------------------------------
gsub.ksh  0.883 [0.822-0.959]  0.457 [0.442-0.505]
--------------------------------------------------

Despite all of the many fixes and improvements in the backported
93v- .sh.match code, there are a few remaining bugs:

- .sh.match is printed with a default [0] subscript (see also
  https://github.com/ksh93/ksh/issues/308#issuecomment-1025016088):
     $ arch/*/bin/ksh -c 'echo ${!.sh.match}'
       .sh.match[0]
  This bug appears to have been introduced by the changes from
  ksh93v- 2012-08-24.
- The wrong variable name is given for 'parameter not set' errors
  (from https://marc.info/?l=ast-developers&m=134489094602596):
     $ arch/*/bin/ksh -u
     $ x=1234
     $ true "${x//~(X)([012])|([345])/}"
     $ compound co
     $ typeset -m co.array=.sh.match
     $ printf "%q\n" "${co.array[2][0]}"
     arch/linux.i386-64/bin/ksh: co.array[2][(null)]: parameter not set
- .sh.match leaks out of subshells. Further information and a
  reproducer can be found here:
  https://marc.info/?l=ast-developers&m=136292897330187

[1]: https://github.com/ksh-community/shbench
2022-02-10 21:04:23 +00:00
Martijn Dekker
232b7bff30 Fix multiple bugs in executing scripts without a #! path
When executing a script without a hashbang path like #!/bin/ksh,
ksh forks itself, longjmps back to sh_main(), and then (among other
things) calling sh_reinit() which is the function that tries to
reinitialise as much of the shell as it can. This is its way of
ensuring the child script is run in ksh and not some other shell.

However, this appraoch is incredibly buggy. Among other things,
changes in built-in commands and custom type definitions survived
the reinitialisation, "exporting" variables didn't work properly,
and the hash table and ${.sh.stats} weren't reset. As a result,
depending on what the invoking script did, the invoked script could
easily fail or malfunction.

It is not actually possible to reinitialise the shell correctly,
because some of the shell state is in locally scoped static
variables that cannot simply be reinitialised. There are probably
huge memory leaks with this approach as well. At some point, all
this is going to need a total redesign. Clearly, the only reliable
way involves execve(2) and a start from scratch.

For now though, this seems to fix the known bugs at least. I'm sure
there are more to be discovered.

This commit makes another change: instead of the -h/trackall option
(which has been a no-op for decades), the posix option is now
inherited by the child script. Since there is no hashbang path from
which to decide whether the shell should run in POSIX mode nor not,
the best guess is probably the invoking script's setting.

src/cmd/ksh93/sh/init.c: sh_reinit():
- Keep the SH_INIT state on during the entire procedure.
- Delete remaining non-exported, non-default variables.
- Remove attributes from exported variables. In POSIX mode, remove
  all attributes; otherwise, only remove readonly.
- Unset discipline function pointers for variables.
- Delete all custom types.
- Delete all functions and built-ins, then reinitialise the built-ins
  table from scatch.
- Free the alias values before clearing the alias table.
- Same with hash table entries (tracked aliases).
- Reset statistics.
- Inherit SH_POSIX instead of SH_TRACKALL.
- Call user init function last, not somewhere in the middle.

src/cmd/ksh93/sh/name.c: sh_envnolocal():
- Be sure to preserve the export attribute of kept variables.

Resolves: https://github.com/ksh93/ksh/issues/350
2022-02-10 21:03:43 +00:00
Johnothan King
7d4c7d9156 Fix 'typeset -p' output of compound array types (#453)
This bugfix was backported from ksh93v- 2012-10-04. The bug fixed
by this change is one that causes 'typeset -p' to omit the -C flag
when listing compound arrays belonging to a type:

   $ typeset -T Foo_t=(compound -a bar)
   $ Foo_t baz
   $ typeset -p baz.bar
   typeset -a baz.bar=''  # This should be 'typeset -C -a'

src/cmd/ksh93/sh/nvtype.c:
- Backport change from 93v- 2012-10-04 that sets the array nvalue to
  a pointer named Null (which is "") in nv_mktype(), then to Empty
  in fixnode().
- Change the Null name from the 93v- code to AltEmpty to avoid
  misleading code readers into thinking that it's a null pointer.

src/cmd/ksh93/tests/types.sh:
- Backport the relevant 93v- changes to the types regression tests.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2022-02-10 21:03:24 +00:00
Johnothan King
787058bdbf Fix the output of typeset -p for two dimensional indexed arrays (#454)
In ksh93v- 2012-10-04 the following bugfix is noted in the changelog
(this fix was most likely part of ksh93v- 2012-09-27, although that
version is not archived anywhere):
12-09-21  A bug in which the output of a two dimensional sparse
          indexed array would cause the second subscript be treated
          as an associative array when read back in has been fixed.
          Elements that are sparse indexed arrays now are prefixed
          type "typeset -a".

Below is a before and after of this change:

   # Before
   $ typeset -a foo[1][2]=bar
   $ typeset -p foo
   typeset -a foo=([1]=([2]=bar) )

   # After
   $ typeset -a foo[1][2]=bar
   $ typeset -p foo
   typeset -a foo=(typeset -a [1]=([2]=bar) )

src/cmd/ksh93/sh/*.c:
- Backport changes from ksh93v- to print 'typeset -a' before sparse
  indexed arrays and properly handle 'typeset -a' in reinput
  commands from 'typeset -p'.

src/cmd/ksh93/tests:
- Add two regression tests to arrays.sh for this change.
- Update the existing regression tests for compatibility with the
  new printed typeset output.
2022-02-10 21:01:40 +00:00
Martijn Dekker
e6d0187dd8 Don't allow 'enum' and 'typeset -T' to override special built-ins
Special builtins are undeleteable for a reason. But 'enum' and
'typeset -T' allow overriding them, causing an inconsistent state.

@JohnoKing writes:
| The behavior is rather buggy, as it appears to successfully
| override normal builtins but fails to delete the special
| builtins, leading to scenarios where both the original builtin
| and type are run:
|
| $ typeset -T eval=(typeset BAD; typeset TYPE)  # This should have failed
| $ eval foo=BAD
| /usr/bin/ksh: eval: line 1: foo: not found
| $ enum trap=(BAD TYPE)   # This also should have failed
| $ trap foo=BAD
| /usr/bin/ksh: trap: condition(s) required
| $ enum umask=(BAD TYPE)
| $ umask foo=BAD
| $ echo $foo
| BAD
|
| # Examples of general bugginess
| $ trap bar=TYPE
| /usr/bin/ksh: trap: condition(s) required
| $ echo $bar
| TYPE
| $ eval var=TYPE
| /usr/bin/ksh: eval: line 1: var: not found
| $ echo $var
| TYPE

This commit fixes the following:

The 'enum' and 'typeset -T' commands are no longer allowed to
override and replace special built-in commands, except for type
definition commands previously created by these commands; these
are already (dis)allowed elsewhere.

A command like 'typeset -T foo_t' without any assignments no longer
creates an incompletely defined 'foo_t' built-in comamnd. Instead,
it is now silently ignored for backwards compatibility. This did
have a regression test checking for it, but I'm changing it because
that's just not a valid use case. An incomplete type definition
command does nothing useful and only crashes the shell when run.

src/cmd/ksh93/bltins/enum.c: b_enum():
- Do not allow overriding non-type special built-ins.

src/cmd/ksh93/sh/name.c: nv_setlist():
- Do not allow 'typeset -T' to override non-type special built-ins.
  To avoid an inconsistent state, this must be checked for while
  processing the assignments list before typeset is really invoked.

src/cmd/ksh93/bltins_typeset.c: b_typeset():
- Only create a type command if sh.envlist is set, i.e., if some
  shell assignment(s) were passed to the 'typeset -T' command.

Progresses: https://github.com/ksh93/ksh/issues/350
2022-02-10 21:01:00 +00:00
Martijn Dekker
65aff0befb Fix conditional expansions ${array[i]=value}, ${array[i]?error}
$ unset foo
    $ echo ${foo[42]=bar}
    (empty line)

Instead of the empty line, 'bar' was expected. As foo[42] was
unset, the conditional assignment should have worked.

    $ unset foo
    $ : ${foo[42]?error: unset}
    (no output)

The expansion should have thrown an error with the given message.

This bug was introduced in ksh 93t 2008-10-01. Thanks to @JohnoKing
for finding the breaking change.

Analysis: The problem was experimenally determined to be in in the
following lines of nv_putsub(). If the array member is unset (i.e.
null), the value is set to the empty string instead:

src/cmd/ksh93/sh/array.c
1250: 		else
1251:			ap->val[size].cp = Empty;

It makes some sense: if there is a value (even an empty one), the
variable is set and these expansions should behave accordingly.
Sure enough, deleting these lines fixes the bug, but at the expense
of introducing a lot of other array-related regressions. So we need
a way to special-case the affected expansions.

Where to do this? If we replace line 1251 with an abort(3) call, we
get this stack trace:

0   libsystem_kernel.dylib    __pthread_kill + 10
1   libsystem_pthread.dylib   pthread_kill + 284
2   libsystem_c.dylib         abort + 127
3   ksh                       nv_putsub + 1411 (array.c:1255)
4   ksh                       nv_endsubscript + 940 (array.c:1547)
5   ksh                       nv_create + 4732 (name.c:1066)
6   ksh                       nv_open + 1951 (name.c:1425)
7   ksh                       varsub + 4934 (macro.c:1322)
[rest omitted]

The special-casing needs to be done on line 1250 of array.c, but
flagged in varsub() which processes these expansions. So, varsub()
calls nv_open() calls nv_create() calls nv_endsubscript() calls
nv_putsub(). That's a fairly deep call stack, so passing an extra
flag argument does not seem doable. I did try an approach using a
couple of new bit flags passed via these functions' flags and mode
parameters, but the way this code base uses bit flags is so
intricate, it does not seem to be possible to add or change
anything without unwanted side effects in all sorts of places.

So the only fix I can think of adds yet another global flag
variable for a very special case. It's ugly, but it works.
An elegant fix would probably involve a fairly comprehensive
redesign, which is simply not going to happen.

src/cmd/ksh93/include/shell.h:
- Add global sh.cond_expan flag.

src/cmd/ksh93/sh/array.c: nv_putsub():
- Do not set value to empty string if sh.cond_expan is set.

src/cmd/ksh93/sh/macro.c: varsub():
- Set sh.cond_expan flag while calling nv_open() for one of the
  affected expansions.
- Minor refactoring for legibility and to make the fix fit better.
- SSOT: Instead of repeating string "REPLY", use the node's nvname.
- Do not pointlessly add an extra 0 byte when saving id for error
  message; sfstruse() already adds this.

Thanks to @oguz-ismail for the bug report.

Resolves: https://github.com/ksh93/ksh/issues/383
2022-02-05 23:39:16 +00:00
Martijn Dekker
493a31053e Do not export variables with dot names (re: 8e72608c)
Variables with a dot in their name, such as those declared in
namespace { ... } blocks, are usually stored in a separate tree
with their actual names not containing any dots. But under some
circumstances, including at least direct assignment of a
non-preexisting dot variable, dot variables are stored in the main
sh.var_tree with names actually containing dots. With allexport
active, those could end up exported to the environment. This bug
was also present in previous release versions of ksh.

src/cmd/ksh93/sh/name.c: pushnam():
- Check for a dot in the name before pushing a variable to export.
2022-02-05 15:08:50 +00:00
Johnothan King
a8dd1bbd9d typeset -p: fix output of nonexistent [0]= array element (#451)
This fix was backported from ksh 93v- 2012-10-04.

src/cmd/ksh93/sh/nvtree.c: nv_outnode():
- If the array is supposed to be empty, do not continue. This
  avoids outputting a nonexistent [0]= element for empty arrays.

Resolves: https://github.com/ksh93/ksh/issues/420
Co-authored-by: Martijn Dekker <martijn@inlv.org>
2022-02-05 13:53:51 +00:00
Johnothan King
fb696ecfae trap: fix use after free (#446)
This commit adds a fix for the trap command, backported from a fork
of ksh2020: https://github.com/l0stman/ksh/commit/2033375f

src/cmd/ksh93/sh/jobs.c: job_chldtrap():
- Fixed a use after free bug in the for loop. The string pointed to
  by sh.st.trapcom[SIGCHLD] may be freed from memory after
  sh_trap(), so it must be reobtained each time sh_trap() is called
  from within the for loop.
2022-02-05 13:53:11 +00:00
Johnothan King
8e72608c1c Export all variables assigned to while allexport is on (#431)
All variables that are assigned a value should be exported while
the allexport shell option is enabled. This works in most cases,
but variables assigned to with ${var:=foo} or $((var=123)) aren't
exported while allexport is on.

src/cmd/ksh93/sh/name.c:
- nv_putval(): This is the central assignment function; all forms
  of variable assignment end up here. So this is the best place
  to check for SH_ALLEXPORT and turn on the export attribute.
- nv_setlist(): Remove allexport handling, now redundant.

src/cmd/ksh93/bltins/read.c: sh_readline():
- Remove allexport handling, now redundant.

src/cmd/ksh93/sh/main.c: sh_main():
- nv_putval() is used to initialize PS4 and IFS using nv_putval();
  this is after an -a/--allexport specified on the ksh command
  line has been processed, so temporarily turn that off.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2022-02-05 13:52:28 +00:00
Martijn Dekker
1375cda934 Default to emacs upon invoking interactive shell
If the VISUAL or EDITOR environment variable is not set to a value
matching *[Vv][Ii]* or *macs* at initialisation time, then ksh does
not turn on any line editor.

This is user-hostile. New users on Unix-like systems typically have
a simple editor like nano preconfigured as their default, or may
not have the VISUAL or EDITOR variable set at all. So if they try
ksh, they find themselves without basic functionality such as arrow
keys and probably go straight back to bash.

The emacs line editor is by far the most widely used, especially
among new users, so ksh should default to that. Most other shells
already do this.

src/cmd/ksh93/sh/main.c: sh_main():
- On an interactive shell, if on editor was turned on based on
  $VISUAL or $EDITOR, turn on emacs before reading input.
2022-02-01 23:47:56 +00:00
Martijn Dekker
d650c73e55 Fix SIGINT handling for external commands run from scripts
Reproducer:

  $ ksh -c 'bash -c '\''kill -s INT $$'\''; echo "$?, continuing"'

Expected result: output "258, continuing"; exit status 0.

Actual result: no output; exit status 258. The child process sent
SIGINT only to itself and not to the process group, so the parent
script was wrongly interrupted.

Every shell except ksh93 produces the expected result. ksh93 also
gave the expected result before version 2008-01-31 93s+, which
introduced the code below.

Analysis: The problem is in these lines of code in xec.c,
sh_exec(), TFORK case, parent branch of fork:

1649:	if(!sh_isstate(SH_MONITOR))
1650:	{
1651:		if(!(sh.sigflag[SIGINT]&(SH_SIGFAULT|SH_SIGOFF)))
1652:			sh_sigtrap(SIGINT);
1653:		sh.trapnote |= SH_SIGIGNORE;
1654:	}
[...pipe and I/O handling, wait for command to finish...]
1667:	if(!sh_isstate(SH_MONITOR))
1668:	{
1669:		sh.trapnote &= ~SH_SIGIGNORE;
1700:		if(sh.exitval == (SH_EXITSIG|SIGINT))
1701:			kill(sh.current_pid,SIGINT);
1702:	}

When a user presses Ctrl+C, SIGINT is sent to the entire process
group. If job control is fully off (i.e., !sh_isstate(SH_MONITOR)),
then the process group includes the parent script. Therefore, in a
script such as

  $ ksh -c 'bash -c '\''read x'\''; echo "$?, continuing"'

when the user presses Ctrl+C while bash waits for 'read x' input,
the parent ksh script should be interrupted as well.

Now, the code above ignores SIGINT while bash is running. (This is
done using special-casing in sh_fault() to handle that SH_SIGIGNORE
flag for SIGINT.) So, when Ctrl+C interrupts the process group, the
parent script is not getting interrupted as it should.

To compensate for that, the code then detects, using sh.exitval
(the child process' exit status), whether the child process was
killed by SIGINT. If so, it simply assumes that the signal was
meant for the process group including the parent script, so it
reissues SIGINT to itself after unignoring it.

But, as we can see from the broken reproducer above, that
assumption is not valid. Scripts are perfectly free to send SIGINT
to themselves only, and that must work as expected.

src/cmd/ksh93/sh/xec.c: sh_exec(): TFORK: parent branch:
- Instead of ignoring SIGINT, sigblock() it, which delays handling
  the signal until sigrelease(). (Note that these are macros
  defined in src/cmd/ksh93/features/sigfeatures according to OS
  capabilities.)
- This makes reissuing SIGINT redundant, so delete that, which
  fixes the bug.

src/cmd/ksh93/sh/fault.c:
- Nothing now sets the SH_SIGIGNORE flag in sh.trapnote, so remove
  special-casing added in 2008-01-31 93s+.
2022-02-01 05:50:10 +00:00
Johnothan King
9a5af738ef Add support for more keyboard shortcuts (#410)
Add extra key bindings to the emacs and vi modes

This patch adds the following key bindings to the emacs and vi
editing modes:
- Support for Home key sequences ^[[1~ and ^[[7~ as well as End key
  sequences ^[[4~ and ^[[8~.
- Support for arrow key sequences ^[OA, ^[OB, ^[OC and ^[OD.
- Support for the following keyboard shortcuts (if the platform
  supports the expected escape sequence):
  - Ctrl-Left Arrow:  Go back one word
  - Alt-Left Arrow:   Go back one word     (Not supported on Haiku)
  - Ctrl-Right Arrow: Go forward one word
  - Alt-Right Arrow:  Go forward one word  (Not supported on Haiku)
  - Ctrl-G:           Cancel reverse search
  - Ctrl-Delete:      Delete next word     (Not supported on Haiku)
- Added a key binding for the Insert key, which differs in the
  emacs and vi editing modes:
  - In emacs mode, Insert escapes the next character.
  - In vi mode, Insert will switch the editor to insert mode (like
    in vim).

src/cmd/ksh93/edit/{emacs,vi}.c:
- Add support for the <M-Left> and <M-Right> sequences. Like in
  bash and mksh, these shortcuts move the cursor one word backward
  or forward (like the <Ctrl-Left> and <Ctrl-Right> shortcuts).
- Only attempt to process these shortcuts if the escape sequence
  begins with $'\E[1;'.

src/cmd/ksh93/edit/vi.c:
- If the shell isn't doing a reverse search, insert the bell
  character when Ctrl+G is input.
- Add the Ctrl-Delete shortcut as an alias of 'dw'. Calling
  ed_ungetchar twice does not work for 'dw', so Ctrl-Delete was
  implemented by using a vp->del_word variable.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2022-01-31 21:51:50 +00:00
Johnothan King
ad9f9ff13e Accumulated fixes for minor issues (#442)
This commit applies various accumulated bugfixes:

- Applied some fixes for compiler warnings based off of the
  following pull requests (with whitespace changes excluded to
  avoid inflating the size of the diff):

  https://github.com/att/ast/pull/281
  https://github.com/att/ast/pull/283
  https://github.com/att/ast/pull/303
  https://github.com/att/ast/pull/304

- clone_type(): Two separate variables in this function share the
  same name. A bugfix from ksh93v- 2013-05-24 was backported to
  avoid conflict issues.

- Backported a minor error message improvement from ksh2020 for
  when the user attempts to run too many coprocesses.

- Backported a ksh2020 bugfix for a file descriptor leak:
  https://github.com/att/ast/commit/58bc8b56

- Backported ksh2020 bugfixes for unused variable and pointless
  assignment lint warnings:
  https://github.com/att/ast/commit/47650fe0
  https://github.com/att/ast/commit/df209c0d
  https://github.com/att/ast/commit/5e417b00

- Applied a few minor improvements to libast from graphviz:
  https://gitlab.com/graphviz/graphviz/-/commit/e7c03541
  https://gitlab.com/graphviz/graphviz/-/commit/969a7cde

- dtmemory(): Mark unused parameters with NOT_USED(). Based on:
  https://gitlab.com/graphviz/graphviz/-/commit/6ac3ad99

- Applied a few documentation/comment tweaks for the NEWS file,
  printf -v and spawnveg.

- Added a missing regression test for using the rm builtin's -f
  option without additional arguments (this was fixed in
  ksh93u+ 2012-02-14).
2022-01-30 20:42:59 +00:00
Johnothan King
e3a1dda93a Fix memory leak in defpathinit() (#441)
Currently, running ksh under ASan without the ASAN_OPTIONS variable
set to 'detect_leaks=0' usually ends with ASan complaining about a
memory leak in defpathinit() (this leak doesn't grow in a loop, so
no regression test was added to leaks.sh). Reproducer:

 $ ENV=/dev/null arch/*/bin/ksh
 $ cp -?
 cp: invalid option -- '?'
 Try 'cp --help' for more information.
 $ exit

 =================================================================
 ==225132==ERROR: LeakSanitizer: detected memory leaks

 Direct leak of 85 byte(s) in 1 object(s) allocated from:
     #0 0x7f6dab42d459 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
     #1 0x5647b77fe144 in sh_calloc /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/init.c:265
     #2 0x5647b788fea9 in path_addcomp /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/path.c:1567
     #3 0x5647b78911ed in path_addpath /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/path.c:1705
     #4 0x5647b7888e82 in defpathinit /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/path.c:442
     #5 0x5647b78869f3 in ondefpath /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/path.c:67
 --- cut ---
 SUMMARY: AddressSanitizer: 174 byte(s) leaked in 2 allocation(s).

Analysis: The previous code was leaking memory because
defpathinit() returns a pointer from path_addpath(), which has
memory allocated to it in path_addcomp(). This is the code ASan
traced as having allocated memory:

442:	return(path_addpath((Pathcomp_t*)0,(defpath),PATH_PATH));

In path_addpath():
1705:	first = path_addcomp(first,old,cp,type);
[...]
1729:	return(first);

In path_addcomp():
1567:	pp = sh_newof((Pathcomp_t*)0,Pathcomp_t,1,len+1);

The ondefpath() function doesn't save a reference to the pointer
returned by defpathinit(), which causes the memory leak:

66:	if(!defpath)
67:		defpathinit();

The changes in this commit avoid this problem by setting the
defpath variable without also calling path_addpath().

src/cmd/ksh93/sh/path.c:
- Move the code for allocating defpath from defpathinit() into its
  own dedicated function called std_path(). This function is called
  by defpathinit() and ondefpath() to obtain the current string
  stored in the defpath variable. This bugfix is adapted from a
  fork of ksh2020: https://github.com/l0stman/ksh/commit/db5c83a6
2022-01-30 20:42:15 +00:00
Martijn Dekker
7259153f1a Fix resuming external command run from 'eval' (re: 2c35a539)
'eval' suffers from the same bug. Reproducer:
    $ eval vi
then suspend vi, then try to resume it -- the same as in the
reproducer shown in the previous commit.

src/cmd/ksh93/bltins/misc.c: b_eval():
- Same fix. Do *not* turn off SH_MONITOR.
2022-01-28 05:59:54 +00:00
Martijn Dekker
2c35a53964 Fix resuming external command run from POSIX function or dot script
This fixes yet another whopper of a bug in job control. And it's
been in every version of ksh93 back to 1995, the beginning of
ast-open-archive. <sigh>

This is also bug number 23 that is fixed by simply removing code.

Reproducer:

1. Run vi, or another suspendable program, from a dot script or
   POSIX function (ksh handles them the same way). So either:
   $ echo vi >v
   $ . ./v
   or:
   $ v() { vi; }
   $ v

2. Suspend vi by typing Ctrl+Z.

3. Not one but two jobs are registered:
   $ jobs -l
   [2] + 85513	Stopped                  . ./v
   [1] - 85512	Stopped                  . ./v

4. 'fg' does not work on either of them, just printing the job
   command name but not resuming the editor. The second job
   disappears from the table after 'fg'.
   $ fg %2
   . ./v
   $ fg %2
   ksh: fg: no such job
   $ fg %1
   . ./v
   $ fg %1
   . ./v

Either way, you're stuck with an unresumable vi that you have to
'kill -9' manually.

src/cmd/ksh93/sh/xec.c: sh_exec(): TFORK:
- Do *not* turn off the SH_MONITOR state flag (which tells ksh to
  keep track of jobs) when running an external command from a
  profile script or dot script/POSIX function. It's obvious that
  this results in an inconsistent job control state as ksh will not
  update it when the external command is suspended. The purpose of
  this nonsense is surely lost to history as it's been there since
  1995 or before. My testing says that removing it doesn't break
  anything. If that turns out to be wrong, then the breakage will
  have to be fixed in a correct way instead.
2022-01-28 05:15:42 +00:00
Johnothan King
c0567c5e1d Fix spurious syntax error when using ${foo[${bar}..${baz}]} (#436)
Attempting to use array subscript expansion with variables that
aren't set currently causes a spurious syntax error (in ksh93u+ and
older commits the reproducer crashes):
   $ ksh -c 'echo ${foo[${bar}..${baz}]}'  # Shouldn't print anything
   ksh: : arithmetic syntax error

src/cmd/ksh93/sh/macro.c:
- Backport a parser bugfix from ksh93v- 2012-08-24 that avoids
  setting mp->dotdot until the copyto() function's loop is
  finished.

src/cmd/ksh93/tests/arrays.sh:
- Add regression tests for this bug.
2022-01-27 16:08:54 +00:00
Martijn Dekker
fe268fcc91 Cygwin: workaround for ksh to execute #!-less scripts with itself
On Cygwin, ksh does not execute scripts without a #! path in a fork
of the ksh process as it does on other systems. Reproducer (run
from ksh):

  $ cat test.sh
  echo "${BASH_VERSION:-not bash}"
  echo "${.sh.version}"
  $ chmod +x test.sh
  $ ./test.sh
  4.4.12(3)-release
  ./test.sh: line 2: ${.sh.version}: bad substitution

The script was executed in bash instead of ksh.

After this fix, the output on Cygwin is like ksh on other systems:

  not bash
  Version AJM 93u+m/1.1.0-alpha+dev 2022-01-26

This also fixes a number of regression test failures, as quite a
few tests create and execute temp scripts without a hashbang path.

Analysis: On Cygwin, execve(2) happily executes shell scripts
without a #! path with /bin/sh (which is bash --posix). However,
ksh relies on execve(2) executing binaries or #! only, as it uses
an ENOEXEC failure to decide whether to fork and execute a #!-less
shell script with a reinitialized copy of itself via exscript().

src/cmd/ksh93/sh/path.c: path_spawn():
- Look at the magic first two bytes of the file; if it is "MZ"
  (Mark Zbikowski, originator of the .exe format) or "#!", continue
  as normal, otherwise simulate an ENOEXEC failure from execve(2)
  which will cause ksh to fall back on #!-less script execution.
2022-01-27 16:08:44 +00:00
Martijn Dekker
41ee12a527 Document history expansion and fix a few loose ends
src/cmd/ksh93/sh.1:
- Add a new section on history expansion mostly adapted from the
  "History substitution" section from the tcsh(1) man page. This
  has the standard BSD license which is already in the COPYRIGHT
  file. Inapplicable stuff was removed, some new stuff added.

src/cmd/ksh93/edit/hexpand.c,
src/cmd/ksh93/sh/io.c:
- Set '#' as the default history comment character as on bash;
  no longer disable it by default.
- Add the 'a' modifier as a synonym for 'g', as on bash.
- Remove pointless static keyword from np variable; since the
  value from previous calls is never used it can just be local.
- Use NV_NOADD flag with nv_open() to avoid pointlessly creating
  the node if the variable doesn't exist yet.
- Fix a bug in history expansion where the 'p' modifier had no
  effect if the 'histverify' option is on.
  Reproducer:
    $ set -H -o histv
    $ true a b c
    $ !!:p
    $ true a b c▁  <= instead of printed, the line is re-edited
  Expected:
    $ set -H -o histv
    $ true a b c
    $ !!:p
    true a b c
    $ ▁
  This is fixed by making 'p' remove the HIST_EVENT bit from the
  returned flag bits in hist_expand(), leaving only the HIST_PRINT
  flag, then adding special handling for this case to slowread()
  in io.c (print the line, then instead of executing it, continue
  and read the next line).
2022-01-25 03:45:47 +00:00
Martijn Dekker
cda8fc916f Fix a crashing bug in history expansion
Reproducer:

$ set -o histexpand
$ echo foo !#^:h !#^:&
/usr/local/bin/ksh: :&: no previous substitution
ksh(80822,0x10bc2a5c0) malloc: *** error for object 0x10a13bae3: pointer being freed was not allocated
ksh(80822,0x10bc2a5c0) malloc: *** set a breakpoint in malloc_error_break to debug
Abort

Analysis: In hist_expand(), the 'cc' variable has two functions:
it holds a pointer to a malloc'ed copy of the current line, and is
also used as a temporary pointer with functions like strchr().
After that temporary use, it is set to NULL again, because the
'done:' routine checks if it non-NULL to decide whether to free the
pointer. But if an error occurs, the function may jump straight to
'done' without first setting cc to NULL if it had been used as a
temporary pointer. It then tries to free an unallocated pointer.

src/cmd/ksh93/edit/hexpand.c: hist_expand():
- Eliminate this bad practice by using a separate variable for
  temporary pointer purposes.

(I was unable to reproduce the crash in a pty regression test,
though it is consistently reproducible in a real interactive
session. So I haven't added that test.)
2022-01-24 08:41:27 +00:00
Martijn Dekker
8afc4756e8 history expansion: add missing bounds check
So far all ksh versions accept event numbers referring to
nonexistent history events in history expansion (-H/-o histexpand),
e.g. !9999 is accepted even if the history file has no item 9999.
These expansions seem to show random content from the history file,
sometimes including binary data. Of course an "event not found"
error should have been thrown instead.

hist_expand() in hexpand.c calls hist_seek() (from history.c)
without any bounds checking except verifying the history event
number is greater than zero. This commit adds a bounds check
to hist_seek() itself as it's called from three other places
in history.c, so perhaps this fixes a few other bugs as well.

src/cmd/ksh93/edit/history.c: hist_seek():
- Use the hist_min() and hist_max() macros provided in history.h
  to check bounds. Note that hist_max() yields the number of the
  command line currently being entered, so the maximum for seeking
  purposes is actually its result minus 1.
2022-01-21 02:13:53 +00:00
Johnothan King
eaf7662daa Fix history expansion buffer overflow (#434)
History expansion currently crashes under ASan due to a buffer
overflow. Reproducer:

   $ set -H
   $ !!:s/old/new/

Explanation from <https://github.com/att/ast/issues/1369>:
> The problem is the code assumes the buffer allocated for a string
> stream is zero initialized. But the SFIO code uses malloc() to
> allocate the buffer and does not explicitly initialize it with
> memset(). That it works at all, even without ASAN enabled, is
> purely accidental. It will fail if that malloc() returns a block
> that had been previously allocated, used, and freed. Under ASAN
> the buffer is initialized (at least on my system) to a sequence
> of 0xBE bytes. So the strdup() happily tries to duplicate a
> string that is the size of that buffer and fails when it reads
> past the end of the buffer looking for the terminating zero byte.

src/cmd/ksh93/edit/hexpand.c:
- Backport ksh2020 bugfix that avoids assuming the string stream
  has been initialized to zeros:
  https://github.com/att/ast/commit/cf16bcca
  (minus the incorrect change to the static wm variable).
2022-01-21 02:13:08 +00:00
Martijn Dekker
a5700d3937 Expose --histreedit and --histverify options (re: 921bbcae)
This adds two long-form shell options that modify history expansion
(-H/--histexpand). If --histreedit is on and a history expansion
fails, the command line is reloaded into the next prompt's edit
buffer, allowing corrections. If --histverify is on, the results of
a history expansion are not immediately executed but instead loaded
into the next prompt's edit buffer, allowing further changes.

SH_HISTREEDIT and SH_HISTVERIFY were already handled all along in
slowread() in io.c and via 'reedit' arguments to functions called
there, but could not be turned on as they were only ever exposed as
shopt options in the removed bash compatibility mode (in theory
only, as it failed to compile). I had overlooked them until now.

So, since the code is there, let's expose these options through the
normal long options interface. They're working fine, and activating
them actually makes history expansion tolerable to use.

src/cmd/ksh93/data/options.c:
- Make these options available as "histreedit" and "histverify".

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/sh.1:
- Document the "new" options.

src/cmd/ksh93/include/defs.h:
- Remove unused SH_HISTAPPEND and SH_HISTORY2 macros which were
  part of the removed bash compatibility code. Note that ksh does
  not need a histappend option as it never overwrites the history
  file (in the bash mode, this shopt option was a no-op).
2022-01-12 20:38:30 +00:00
Johnothan King
1a9af9db40 Fix vi mode tab completion with spaces (#413)
Attempting to complete file names in vi mode using tab completion can
fail if the last character on the command line is a space. Reproducer
(note that this bug doesn't occur in emacs mode):
   $ set -o vi
   $ mkdir '/tmp/foo bar'
   $ test -d /tmp/foo\ <Tab>

src/cmd/ksh93/edit/vi.c:
- Don't disable tab completion or reset the tab count just because the
  last character on the command line is a space. This bugfix was
  backported from ksh93v- 2014-06-06.

src/cmd/ksh93/tests/pty.sh:
- Add a regression test for the tab completion bug.
2022-01-07 16:18:28 +00:00
Johnothan King
d347ec0fc9 Allow ksh to compile on Haiku; implement SIGKILLTHR support (#408)
This commit implements the build fixes required to get ksh running on
Haiku. Note that while ksh does compile, it has a ton of regression test
failures on Haiku.

src/cmd/ksh93/data/signals.c,
src/lib/libast/features/signal.c:
- Add support for the SIGKILLTHR signal, which is supported by BeOS and
  Haiku.
- SIGINFO was missing an entry in the libast feature test, so add one
  (re: 658bba74).

src/cmd/ksh93/RELEASE:
- Add an entry noting that ksh now compiles on Haiku, albeit with many
  regression test failures.

src/cmd/ksh93/{include/terminal.h,sh/path.c}:
- Silence compiler warnings on Haiku.

src/lib/libast/features/mmap:
- The mmap feature test freezes on Haiku, so modify the test to fail
  immediately on that OS.

src/lib/libast/misc/signal.c:
- Avoid redefining the signal definition on Haiku to fix a compiler
  error.

src/lib/libast/features/nl_types:
- For some reason the nl_item typedef on Haiku doesn't work correctly.
  Work around that by creating the nl_item type in the libast nl_types
  feature test.
2022-01-07 16:16:42 +00:00
Martijn Dekker
57ed1efc2c Actually deactivate CDPATH when unsetting it
After 'unset CDPATH', CDPATH continued to work as if nothing
happened. Unsetting it should be a valid way to deactivate it.

This bug is in every ksh93 version.

src/cmd/ksh93/bltins/cd_pwd.c: b_cd():
- Fix a manifest logic error: first check if CDPATH (CDPNOD) is
  unset before assigning to 'cdpath', not the other way around.
  Setting the 'cdpath' pointer is what activates the CDPATH search.
2021-12-29 01:48:55 +00:00
Johnothan King
4032050249 Port cksum builtin performance improvements from illumos (#391)
This commit ports performance optimizations from illumos for the libsum
code (used by the cksum and sum builtins):
98bea71f0d
The new codepath in libsum uses prefetching and loop unrolling to
improve performance (prefetching is done with __builtin_prefetch()
or sun_prefetch_read_many() if either is available).

Script for testing (note that cksum must be enabled in
src/cmd/ksh93/data/builtins.c):
   #!/bin/ksh
   builtin cksum || exit 1
   for ((i=0; i!=50000; i++)) do
       cksum -x att /etc/hosts
   done >/dev/null

Results on Linux x86_64 (using CCFLAGS=-O2):
$ echo 'UNPATCHED:'; time arch/linux.i386-64/bin/ksh /tmp/foo; echo 'PATCHED'; time /tmp/ksh /tmp/foo
UNPATCHED:

real    0m09.989s
user    0m07.582s
sys     0m02.406s
PATCHED:

real    0m06.536s
user    0m04.331s
sys     0m02.204s

src/lib/libsum/{sum-att.c,sum-crc.c,Mamfile}:
- Port the performance optimizations from illumos to 93u+m libsum. To
  prevent problems with older versions of GCC, avoid the new codepath
  if GCC is older than the 3.1 release series. Additionally, the ast.h
  header must be included to handle tcc defining __GNUC__ on FreeBSD.
- Apply some build fixes to allow the new codepath to build with Clang
  3.6 and newer (my own testing indicates an even better performance
  improvement with Clang than with GCC).
2021-12-28 22:22:52 +00:00
Johnothan King
8f9d1bec97 Add three options to 'ulimit' (#406)
This patch adds a few extra options to the ulimit command (if the OS
supports them). These options are also present in Bash, although in ksh
additional long forms of each option are available:
  ulimit -k/--kqueues   This is the maximum number of kqueues.
  ulimit -P/--npts      This is the maximum number of pseudo-terminals.
  ulimit -R/--rttime    This is the time a real-time process can run
                        before blocking, in microseconds. When the
                        limit is exceeded, the process is sent SIGXCPU.

Other changes:
- bltins/ulimit.c: Change the formatting from sfprintf and increase the
  size of the tmp buffer to prevent text from being cut off in ulimit
  -a (this was required to add ulimit -R).
- data/limits.c: Add support for using microseconds as a unit.
2021-12-28 22:02:20 +00:00
Johnothan King
0e197eee57 Fix mkservice compile errors and add SHOPT_MKSERVICE (#401)
The unused mkservice and eloop builtins are currently not built, and if
an attempt to compile them is made the build ends in failure. This
commit backports a few build fixes from ksh93v- 2012-08-24 that allow
mkservice and eloop to build (plus an additional compiler warning fix
not in ksh93v-). I've also added a new SHOPT_MKSERVICE setting (turned
off by default) so that mkservice and eloop can be built if the user
chooses to include them in their build of ksh.
2021-12-28 17:51:11 +00:00
Johnothan King
24174f0fb7 Backport -P and -t flags for 'type'/'whence' from ksh93v- (#392)
This commit backports the whence '-t' option from ksh93v-. The '-t'
option is useful when one needs to identify the type of a command.
The '-t' flag was added by ksh93v- for compatibility with Bash.

It should be noted the ksh93v- patch had one bug, which this commit
fixes. Path-bound builtins from /opt/ast/bin were classified as
files if loaded from /opt/ast/bin in the PATH. Reproducer:
   $ PATH=/opt/ast/bin whence -t cat
   file

src/cmd/ksh93/bltins/whence.c:
- Simplify the bitmask values for the command and whence builtin
  flags.
- Add the -t flag to the whence and type builtins. To prevent bugs,
  -t will always override -v if both of those flags were passed.

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/sh.1:
- Add documentation for the new -t option.
2021-12-27 06:40:02 +00:00
Martijn Dekker
e072e7c170 Fix crash in xtrace while processing here-document (re: d7cada7b)
Depending on the OS, the heredoc.sh regression tests, and possibly
others, still crashed with the -x option (xtrace) on.

Analysis: The lexer crashes in lex_advance(). Something has caused
an inconsistent lexer state, and it happened earlier on, so the
backtrace is useless for figuring out where that happened.

But I think I've found it. It's the sh_mactry() call here:

src/cmd/ksh93/sh/xec.c, lines 2800 to 2807 in f7213f03
2800:   if(!(cp=nv_getval(sh_scoped(shp,PS4NOD))))
2801:           cp = "+ ";
2802:   else
2803:   {
2804:           sh_offoption(SH_XTRACE);
2805:           cp = sh_mactry(shp,cp);
2806:           sh_onoption(SH_XTRACE);
2807:   }

sh_mactry() needs to parse the contents of $PS4 to perform
expansions and command substitutions in it, which involves the
lexer. If that happens in a here-document, the lexer is in the C
function call stack, in the middle of parsing the here-document.
Result: inconsistent lexer state. Solution: save and restore lexer
state in sh_mactry().

After this commit, all regression tests should pass with the
'-x'/'--xtrace' option in use, with no errors or crashes.

Note for backporters: this fix depends both on on d7cada7b and on
the consistency fix for the Lex_t type's size applied in a7ed5d9f.

src/cmd/ksh93/include/shlex.h:
- Cosmetic fix: remove a copied & pasted backslash. (re: a7ed5d9f)

src/cmd/ksh93/sh/macro.c: sh_mactry():
- Save and restore the lexer state before letting sh_mactrim()
  indirectly parse and execute code.

src/cmd/ksh93/tests/*.sh:
- Turn off xtrace in various command substitutions that contain
  2>&1 redirections, so that the xtrace output is not caught by
  the command substitutions, causing tests to fail incorrectly.
- Turn off xtrace for a few code blocks with 2>&1 redirections,
  stopping xtrace output from being written to standard output.

Resolves: https://github.com/ksh93/ksh/issues/306 (again)
2021-12-27 04:02:25 +00:00
Martijn Dekker
91a7c2e3e9 Fix crash/freeze upon interrupting command substitution with pipe
On some systems (at least Linux and macOS):

1. Run on a command line: t=$(sleep 10|while :; do :; done)
2. Press Ctrl+C in the first 10 seconds.
3. Execute any other command substitution. The shell crashes.

Analysis: Something in the job_wait() call in the sh_subshell()
restore routine may be interrupted by a signal such as SIGINT on
Linux and macOS. Exactly what that interruptible thing is remains
to be determined. In any case, since job_wait() was invoked after
sh_popcontext(), interrupting it caused the sh_subshell() restore
routine to be aborted, resulting in an inconsistent state of the
shell. The fix is to sh_popcontext() at a later stage instead.

src/cmd/ksh93/sh/subshell.c: sh_subshell():
- Rename struct checkpt buff to checkpoint because it's clearer.
- Move the sh_popcontext() call to near the end, just after
  decreasing the subshell level counters and restoring the global
  subshell data struct to its parent. This seems like a logical
  place for it and could allow other things to be interrupted, too.
- Get rid of the if(shp->subshell) because it is known that the
  value is > 0 at this point.
- The short exit routine run if the subshell forked now needs a new
  sh_popcontext() call, because this is handled before restoring
  the virtual subshell state.
- While we're here, do a little more detransitioning from all those
  pointless shp pointers.

Fixes: https://github.com/ksh93/ksh/issues/397
2021-12-27 03:49:41 +00:00
Johnothan King
f7213f03a2 Fix multiple bugs when using 'alias -p' to print aliases (#398)
This commit was originally intended to fix just one bug with shcomp's
handling of 'alias -p', but while fixing that I found a large number
of related issues in the alias command's -p, -t and -x options. The
current patch provides bugfixes for all of the bugs listed below:

1) Listing aliases in a script with 'alias -p' or 'alias' broke
   shcomp's bytecode output:
   https://github.com/ksh93/ksh/issues/87#issuecomment-813819122

2) Listing individual aliases with the -p option doesn't work:
      $ alias foo=bar bar=foo
      $ alias foo
      foo=bar
      $ alias -p foo  # No output

3) Listing specific tracked aliases with -pt does not display them
   in a reusable format, but rather adds another tracked alias:
      $ hash -r cat vi
      $ alias -pt vi  # No output
      $ alias -pt rm
      $ alias -t
      cat=/usr/bin/cat
      rm=/usr/bin/rm
      vi=/usr/bin/vi

4) Listing all tracked aliases with -pt does not output them in a
   reusable format (the resulting command printed only creates a
   normal alias, which is different from a tracked alias):
      $ hash -r cat
      $ alias -pt
      alias cat=/usr/bin/cat  # Expected 'alias -t cat'

5) Listing a non-existent alias with -p doesn't cause an error:
      $ unalias -a
      $ alias -p notanalias  # No output
      $ echo $?
      0
      $ alias notanalias
      notanalias: alias not found
      $ echo $?
      1
      $ hash -r
      $ alias -pt notacommand  # No output
      $ echo $?
      0

6) Attempting to list 256 non-existent aliases results in exit
   status zero:
      $ unalias -a
      $ alias $(awk -v ORS= 'BEGIN { for(i=0;i<256;i++) print "x "; }')
      x: alias not found
      --cut error message--
      $ echo $?
      0

Changes:

- typeset.c: Avoid printing anything while shcomp is compiling a
  script. This is needed because the alias command is run by shcomp
  to prevent parsing issues.

- b_alias(): To avoid adding tracked aliases with -pt, set
  tdata.aflag to '+' so that setall() and other related functions
  only list tracked aliases.

- b_alias(): Set tdata.pflag to 1 so that setall() and other
  functions recognize -p was passed.

- print_value(): Add support for listing specific aliases with
  'alias -p'.

- setall(): To avoid any issues with zombie tracked aliases (see also
  the regression tests) ignore tracked alias nodes marked with the
  NV_NOALIAS attribute. This bit is set for tracked alias nodes by
  the nv_rehash() function.

- setall(): For backward compatibility, continue incrementing the
  exit status for each invalid alias and tracked alias passed. This
  was already how alias behaved when listing aliases without -p, so
  using -p shouldn't cause a change in behavior:
      $ unalias -a
      $ alias foo bar
      foo: alias not found
      bar: alias not found
      $ echo $?
      2
  To fix bug 6, the exit status is set to one if an enforced 8-bit
  exit status would be zero.

- print_namval(): Set the prefix to 'alias -t' so that listing
  tracked aliases with 'alias -pt' works correctly.

- data/msg.c and include/name.h: Add an error message for when
  'alias -pt' doesn't find a tracked alias.

- tests/alias.sh: Add a ton of regression tests for the bugs fixed in
  this commit.
2021-12-27 03:49:06 +00:00
Johnothan King
3785a0685c Fix process substitutions printing PIDs in profile scripts (#395)
- sh/args.c: A process substitution run in a profile script may print
  its PID as if it was a command spawned with '&'. Reproducer:
     $ cat /tmp/env
     true >(false)
     $ ENV=/tmp/env ksh
     [1]	730227
     $
  This bug is fixed by turning off the SH_PROFILE state while running
  a process substitution.

- sh/subshell.c: The SH_INTERACTIVE fix in 3525535e renders the extra
  check for SH_PROFILE redundant, so it has been removed.

- tests/io.sh: Update the procsub PIDs test to also check the result
  after using process substitution in a profile script.
2021-12-22 13:27:00 +00:00