1
0
Fork 0
mirror of git://git.code.sf.net/p/cdesktopenv/code synced 2025-03-09 15:50:02 +00:00
Commit graph

584 commits

Author SHA1 Message Date
Martijn Dekker
064baa372e More misc. tweaks and cleanups
Notable changes:

.github/workflows/ci.yml:
- Run 'bin/package test' on the github runner so we test iffe too.

src/cmd/ksh93/sh/subshell.c:
- sh_assignok was usually called like 'np = sh_assignok(np,0)'. But
  the function never changes np, it just returns the np value
  passed to it, so the assignment is pointless and that function
  can be changed to a void.

src/cmd/ksh93/sh/fault.c: sh_fault():
- Remove check for sh.subshell after sh_isstate(SH_INTERACTIVE). As
  of 48ba6964, it is never set in subshells.
2022-07-14 17:34:08 +02:00
Martijn Dekker
ffee9100d5 Robustify ${.sh.level} scope switching (re: 69d37d5e, e1c41bb2)
Switching the function scope to a parent scope by assigning to
.sh.level (SH_LEVELNOD) leaves the shell in an inconsistent state,
causing invalid-free and/or use-after-free bugs. The intention of
.sh.level was always to temporarily switch scopes inside a DEBUG
trap, so this commit minimises the pitfalls and instability by
imposing some sensible limitations:
1. .sh.level is now a read-only variable except while executing a
   DEBUG trap;
2. while it's writeable, attempts to unset .sh.level or to change
   its attributes are ignored;
3. attempts to set a discipline function for .sh.level are ignored;
4. it is an error to set a level < 0 or > the current scope.

Even more crashing bugs are fixed by simplifiying the handling and
initialisation of .sh.level and by exempting it completely from
virtual subshell scoping (to which it's irrelevant).

TODO: one thing remains: scope corruption and use-after-free happen
when using the '.' command inside a DEBUG trap with ${.sh.level}
changed. Behaviour same as before this commit. To be investigated.

All changed files:
- Consistently use the int16_t type for level values as that is the
  type of its non-pointer storage in SH_LEVELNOD.
- Update .sh.level by using an update_sh_level() macro that assigns
  directly to the node value, then restores the scope if needed.
- To eliminate implicit typecasts, use the same int16_t type (the
  type used by short ints such as SH_LEVELNOD) for all variables
  containing a function and/or dot script level.

src/cmd/ksh93/include/variables.h:
- Add update_sh_level() macro.

src/cmd/ksh93/include/name.h,
src/cmd/ksh93/sh/macro.c:
- Add a nv_nonptr() macro that checks attributes for a non-pointer
  value -- currently only signed or unsigned short integer value,
  accessed via the 's' member of 'union Value' (e.g. np->nvalue.s).
- nv_isnull(): To avoid undefined behaviour, check for attributes
  indicating a non-pointer value before accessing the nvalue.cp
  pointer (re: 5aba0c72).
- varsub(): In the set/unset check, remove the now-redundant
  exception for SH_LEVELNOD.

src/cmd/ksh93/data/variables.c,
src/cmd/ksh93/sh/init.c:
- shtab_variables[]: Make .sh.level a read-only short integer.
- sh_inittree(): To avoid undefined behaviour, do not assign to the
  'union Value' char pointer if the attribute indicates a non-
  pointer short integer value. Instead, the table value is ignored.

src/cmd/ksh93/sh/subshell.c: sh_assignok():
- Never create a subshell scope for SH_LEVELNOD.

src/cmd/ksh93/sh/xec.c:
- Get rid of 'struct Level' and its maxlevel member. This was only
  used in put_level() to check for an out of range assignment, but
  this can be trivially done by checking sh.fn_depth+sh.dot_depth.
- This in turn allows further simplification that reduces init for
  .sh.level to a single nv_disc() call in sh_debug(), so get rid of
  init_level().
- put_level(): Throw a "level out of range" error if assigned a
  wrong level.
- sh_debug():
  - Turn off the NV_RDONLY (read-only) attribute for SH_LEVELNOD
    while executing the DEBUG trap.
  - Restore the current scope when trap execution is finished.
- sh_funct(): Remove all .sh.level handling. POSIX functions (and
  dot scripts) already handle it in b_dot_cmd(), so sh_funct(),
  which is used by both, is the wrong place to do it.
- sh_funscope(): Update .sh.level for ksh syntax functions here
  instead. Also, do not bother to initialise its discipline here,
  as it can now only be changed in a DEBUG trap.

src/cmd/ksh93/bltins/typeset.c: setall():
- When it's not read-only, ignore all attribute changes for
  .sh.level, as changing the attributes would crash the shell.

src/cmd/ksh93/sh/nvdisc.c: nv_setdisc():
- Ignore all attempts to set a discipline function for .sh.level,
  as doing this would crash the shell.

src/cmd/ksh93/bltins/misc.c: b_dot_cmd():
- Bug fix: also update .sh.level when quitting a dot script.

src/cmd/ksh93/sh/name.c:
- _nv_unset():
  - To avoid an inconsistent state, ignore all attempts to unset
    .sh.level.
  - To avoid undefined behaviour, do not zero np->nvalue.cp if
    attributes for np indicate a non-pointer value (the actual bit
    value of a null pointer is not defined by the standard, so
    there is no guarantee that zeroing .cp will zero .s).
- sh_setscope(): For consistency, always set error_info.id (the
  command name for error messages) to the new scope's cmdname.
  Previously this was only done for two calls of this function.
- nv_name(): Fix a crashing bug by checking that np->nvname is a
  non-null pointer before dereferencing it.

src/cmd/ksh93/include/nval.h:
- The NV_UINT16P macro (which is unsigned NV_INT16P) had a typo in
  it, which went unnoticed for many years because it's not directly
  used (though its bit flags are set and used indirectly). Let's
  fix it anyway and keep it for completeness' sake.
2022-07-13 23:11:18 +02:00
Martijn Dekker
893ea066f7 Fix race condition in coprocess test with external 'cat'
The race is between '$cat |&' and 'kill $pid'. In between, there
are only a variable assignment and two buffered writes, so there
is nothing that waits for the external 'cat' to finish forking,
execve'ing and initialising -- meaning there is no guarantee it is
ready to catch SIGTERM. This explains the hang; 'cat' misses the
signal, continues to initialise, and simply waits for more input.

src/cmd/ksh93/tests/coprocess.sh:
- Actually read from the /bin/cat coprocess and verify that it
  works. This has the beneficial side effect of ensuring it is
  fully loaded and initialised before SIGTERMing it.

Resolves: https://github.com/ksh93/ksh/issues/132
2022-07-10 06:30:00 +02:00
Martijn Dekker
1934686de3 Fix oddly specific syntax error corrupting subsequent [[ ... ]]
Reproducer:

    $ x=([x]=1 [y)
    -ksh: syntax error: `)' unexpected
    $ [[ -z $x ]]
    -ksh: [[ -z  ]]: not found

Any '[[' command following that syntax error will fail similarly;
the whole of it (after variable expansion) is incorrectly looked up
as a command name. The syntax error must be generated by an
associative array assignment (with or without an explicit typeset
-A) with at least one valid assignment element followed by an
invalid assignment element starting with '[' but not containing
']='.

This seems to be another bug that is in every ksh93 version ever.
I've confirmed that ksh 1993-12-28 s+ and ksh2020 fail identically.
Presumably, so does everything in between.

Analysis:

The syntax error function, sh_syntax(), calls lexopen() in mode 0
to reset the lexer state. There is a variable that isn't getting
reset there though it should be. Using systematic elimination I
found that the variable that needs to be reset is lp->assignok (set
"when name=value is legal"). If it is set, '[[' is not processed.

src/cmd/ksh93/sh/lex.c: lexopen():
- Reset 'assignok' in the lexer state (regardless of mode).
- In the mode 0 total lexer state reinit, several members of lexd
  (struct _shlex_pvt_lexdata_) were not getting reset; just memset
  the whole thing to zero.
     Note for backporters: this change requires commit da97587e to
  be correct. That commit took the stack size and pointer (lex_max
  and *lex_match) out of this struct; those should not be reset!

Resolves: https://github.com/ksh93/ksh/issues/486
2022-07-09 23:00:11 +02:00
Martijn Dekker
7c4418ccdc Multibyte character handling overhaul; allow global disable
The SHOPT_MULTIBYTE compile-time option did not make much sense as
disabling it only disabled multibyte support for ksh/libshell, not
libast or libcmd built-in commands. This commit allows disabling
multibyte support for the entire codebase by defining the macro
AST_NOMULTIBYTE (e.g. via CCFLAGS). This slightly speeds up the
code and makes an optimised binary about 5% smaller.

src/lib/libast/include/ast.h:
- Add non-multibyte fallback versions of the multibyte macros that
  are used if AST_NOMULTIBYTE is defined. This should cause most
  multibyte handling to be automatically optimised out everywhere.
- Reformat the multibyte macros for legibility.
- Similify mbchar() and and mbsize() macros by defining them in
  terms of mbnchar() and mbnsize(), eliminating code duplication.
- Correct non-multibyte fallback of mbwidth(). For consistent
  behaviour, control characters and out-of-range values should
  return -1 as they do for UTF-8. The fallback is now the same as
  default_wcwidth() in src/lib/libast/comp/setlocale.c.

src/lib/libast/comp/setlocale.c:
- If AST_NOMULTIBYTE is defined, do not compile in the debug and
  UTF-8 locale conversion functions, including several large
  conversion tables. Define their fallback macros as 0 as these are
  used as function pointers.

src/cmd/ksh93/SHOPT.sh,
src/cmd/ksh93/Mamfile:
- Change the SHOPT_MULTIBYTE default to empty, indicating "probe".
- Synchronise SHOPT_MULTIBYTE with !AST_NOMULTIBYTE by default.

src/cmd/ksh93/include/defs.h:
- When SHOPT_MULTIBYTE is zero but AST_NOMULTIBYTE is not non-zero,
  then enable AST_NOMULTIBYTE here to use the ast.h non-multibyte
  fallbacks for ksh. When this is done, the effect is that
  multibyte is optimized out for ksh only, as before.
- Remove previous fallback for disabling multibyte (re: c2cb0eae).

src/cmd/ksh93/include/lexstates.h,
src/cmd/ksh93/sh/lex.c:
- Define SETLEN() macro to assign to LEN (i.e. _Fcin.fclen) for
  multibyte only and do not assign to it directly. With no
  SHOPT_MULTIBYTE, define that macro as empty. This allows removing
  multiple '#if SHOPT_MULTIBYTE' directives from lex.c, as that
  code will all be optimised out automatically if it's disabled.

src/cmd/ksh93/include/national.h,
src/cmd/ksh93/sh/string.c:
- Fix flagrantly incorrect non-multibyte fallback for sh_strchr().
  The latter returns an integer offset (-1 if not found), whereas
  strchr(3) returns a char pointer (NULL if not found). Incorporate
  the fallback into the function for correct handling instead of
  falling back to strchr(3) directly.

src/cmd/ksh93/sh/macro.c:
- lastchar() optimisation: avoid function call if SHOPT_MULTIBYTE
  is enabled but we're not actually in a multibyte locale.

src/cmd/ksh93/sh/name.c:
- Use ja_size() even with SHOPT_MULTIBYTE disabled (re: 2182ecfa).
  Though no regression tests failed, the non-multibyte fallback for
  typeset -L/-R/-Z length calculation was probably not quite
  correct as ja_size() does more. The ast.h change to mbwidth()
  ensures correct behaviour for non-multibyte locales.

src/cmd/ksh93/tests/shtests:
- Since its value in SHOPT.sh is now empty by default, add a quick
  feature test (for the length of the UTF-8 character 'é') to check
  if SHOPT_MULTIBYTE needs to be enabled for the regression tests.
2022-07-09 00:32:27 +02:00
Martijn Dekker
fbfd4d3ab8 Fix syntax error detection in associative array assignments
Reproducer:

$ fn=([foo_key]=foo_val [bar_key])
-ksh: [bar_key]: not found

Expected output:
-ksh: syntax error: `[bar_key]' unexpected

As soon as one correct associative array assignment element has
been processed, a subsequent one, starting with '[' but not
containing ']=', is incorrectly seen as a command to execute.
If a command '[bar_key]' existed on $PATH, it would have been run.

src/cmd/ksh93/sh/parse.c: simple():
- In the syntax check for associative array assignments, don't just
  check for an initial '[' but also verify the presence of ']='.

Thanks to @JohnoKing for finding this bug.

Resolves: https://github.com/ksh93/ksh/issues/427
2022-07-05 22:16:55 +02:00
Martijn Dekker
06e56251b9 Fix wrong syntax error upon process substitution after redirection
Grammatically, redirections may occur anywhere within a command
line and are removed after processing them, whereas a process
substitution (<(commandlist) or >(commandlist)) is replaced by a
file name which should be treated as just another simple word.
So the following should not be a syntax error:

$ cat </dev/null <(true)
-ksh: syntax error: `)' unexpected
$ cat </dev/null >(true)
-ksh: syntax error: `)' unexpected
$ cat >/dev/null <(true)
-ksh: syntax error: `)' unexpected
$ cat >/dev/null >(true)
-ksh: syntax error: `)' unexpected

This bug is in every ksh93 version.

The problem is in the parser (parse.c). The process substitution is
misparsed as a redirection due to inout() recursively parsing
multiple redirections without recognising process substitutions.
inout() is mistaking '<(' for '<' and '>(' for '>', which explains
the incorrect syntax error.

This also causes the following to fail to detect a syntax error:
$ cat >&1 <(README.md
[the contents of README.md are shown]

...and other syntax errors detected in the wrong spot, for example:
$ { true; } <(echo wrong)
-ksh: syntax error: `wrong' unexpected
which should be:
-ksh: syntax error: `<(' unexpected

src/cmd/ksh93/sh/parse.c:
- Add global inout_found_procsub flag.
- inout(): On encountering a process substitution, set this flag
  and return, otherwise clear the flag.
- simple(): After calling inout(), check this flag and, if set,
  jump back to where process substitutions are parsed.

Resolves: https://github.com/ksh93/ksh/issues/418
2022-07-05 13:20:28 +02:00
Martijn Dekker
400806afa6 Do not avoid creating subshell for last command if there are traps
Reproducer:

$ ksh -c 'trap "echo OK" TERM; (kill -s TERM $$)'

Actual output: none
Expected output: OK

The bug is only triggered if 'kill' is executed from a subshell
that is optimised out due to being the last command in the script.

src/cmd/ksh93/sh/xec.c: sh_exec(): case TPAR:
- Instead of only checking for EXIT and ERR traps, do not avoid
  creating a virtual subshell if there are any traps (except DEBUG,
  SIGKILL, SIGSTOP); for this, use the sh.st.trapdontexec flag
  introduced in 40245e08.
2022-07-03 12:52:34 +02:00
Martijn Dekker
4df6d674a0 Fix signal exit status of last command in subshell (re: b3050769)
Reproducer (on macOS/*BSD where SIGUSR1 has signal number 30):

  $ ksh -c '(sh -c '\''kill -s USR1 $$'\''); echo $?'
  ksh: 54220: User signal 1
  30

Expected output for $?: 286, not 30. The signal is not reflected in
the 9th bit of the exit status.

This bug was introduced for virtual subshells in b3050769 but
exists in every ksh93 version for real (forked) subshells:

  $ ksh -c '(ulimit -t unlimited; trap : EXIT; \
	sh -c '\''kill -s USR1 $$'\''); echo $?'
  ksh: 54267: User signal 1
  30

(As of d6c9821c, a dummy trap is needed to trigger the bug, or it
will be masked by the exec optimization for the sh invocation.)

This is caused by the exit status being masked to 8 bits when a
subshell terminates. For a real subshell, this is inevitable as the
kernel does this. As of b3050769, virtual subshells behave in a
manner consistent with real subshells in this regard.

However, for both virtual and real subshells, if its last command
was terminated by a signal, then that should still be reflected in
the 9th bit of ksh's exit stauts.

The root of the problem is that ksh simply cannot rely internally
on the 9th bit of the exit status to determine if a command exited
due to a signal. The 9th bit may be trimmed by a subshell or may be
set by 'return' without a signal being involved. This commit fixes
it by introducing a separate flag which will be a reliable
indicator of this.

src/cmd/ksh93/include/shell.h:
- Add sh.chldexitsig flag (set if the last command was a child
  process that exited due to a signal).

src/cmd/ksh93/sh/jobs.c: job_wait():
- When the last child process exited due to a signal, not only set
  the 9th (SH_EXITSIG) bit of sh.exitval but also sh.chldexitsig.

src/cmd/ksh93/sh/subshell.c: sh_subshell():
- Fix the virtual subshell reproducer above. After trimming the
  exit status to 8 bit, set the 9th bit if sh.chldexitsig is set.
  This needs to be done in two places: one that runs in the parent
  process after sh_subfork() and one for the regular virtual
  subshell exit.

src/cmd/ksh93/sh/fault.c:
- sh_trap(): Save and restore sh.chldexitsig so that this fix does
  not get deactivated if a trap is set.
- sh_done():
  - Fix the real subshell reproducer above. When the last command
    of a real subshell is a child process that exited due to a
    signal (i.e., if (sh.chldexitsig && sh.realsubshell)), then
    activate the code to pass down the signal to the parent
    process. Since there is no way to pass a 9-bit exit status to a
    parent process, this is the only way to ensure a correct exit
    status in the parent shell environment.
  - When exiting the main shell, use sh.chldexitsig and not the
    unreliable SH_EXITSIG bit to determine if the 8th bit needs to
    be set for a portable exit status indicating its last command
    exited due to a signal.
2022-07-03 12:49:36 +02:00
Martijn Dekker
d8dc2a1d81 sh_setenviron(): deactivate compound assignment prefix
Reproducers:

$ ksh -c 'typeset -a arr=( ( (a $(($(echo 1) + 1)) c)1))'
ksh: echo: arr[0]._AST_FEATURES=CONFORMANCE - ast UNIVERSE - ucb: cannot be an array
ksh: [1]=1: invalid variable name
$ ksh -c 'typeset -a arr=( (a $(($(echo 1) + 1)) c)1)'
ksh: echo: arr._AST_FEATURES=CONFORMANCE - ast UNIVERSE - ucb: is not an identifier
ksh: [1]=1: invalid variable name

src/cmd/ksh93/sh/name.c: sh_setenviron():
- Save and clear the current compound assignment prefix (sh.prefix)
  while assigning to the _AST_FEATURES variable.
2022-06-23 03:34:16 +01:00
Martijn Dekker
40245e088d Fix the exec optimisation mess (re: 17ebfbf6, 6701bb30, d6c9821c)
This commit supersedes @lijog's Solaris patch 280-23332860 (see
17ebfbf6) as this is a more general fix that makes the patch
redundant. Of course its associated regression tests stay.

Reproducer script:

	trap 'echo SIGUSR1 received' USR1
	sh -c 'kill -s USR1 $PPID'

Run as a normal script.
Expected behaviour: prints "SIGUSR1 received"
Actual behaviour: the shell invoking the script terminates. Oops.

As of 6701bb30, ksh again allows an exec-without-fork optimisation
for the last command in a script. So the 'sh' command gets the same
PID as the script, therefore its parent PID ($PPID) is the invoking
script and not the script itself, which has been overwritten in
working memory. This shows that, if there are traps set, the exec
optimisation is incorrect as the expected process is not signalled.

While 6701bb30 reintroduced this problem for scripts, this has
always been an issue for certain other situations: forked command
substitutions, background subshells, and -c option argument
scripts. This commit fixes it in all those cases.

In sh_exec(), case TFORK, the optimisation (flagged in no_fork) was
only blocked for SIGINT and for the EXIT and ERR pseudosignals.
That is wrong. It should be blocked for all signal and pseudosignal
traps, except DEBUG (which is run before the command) and SIGKILL
and SIGSTOP (which cannot be trapped).

(I've also tested the behaviour of other shells. One shell, mksh,
never does an exec optimisation, even if no traps are set. I don't
know if that is intentional or not. I suppose it is possible that a
script might expect to receive a signal without trapping it first,
and they could conceivably be affected the same way by this exec
optimisation. But the ash variants (e.g. Busybox ash, dash, FreeBSD
sh), as well as bash, yash and zsh, all do act like this, so the
behaviour is very widespread. This commit makes ksh act like them.)

Multiple files:
- Remove the sh.errtrap, sh.exittrap and sh.end_fn flags and their
  associated code from the superseded Solaris patch.

src/cmd/ksh93/include/shell.h:
- Add a scoped sh.st.trapdontexec flag for sh_exec() to disable
  exec-without-fork optimisations. It should be in the sh.st scope
  struct because it needs to be reset in subshell scopes.

src/cmd/ksh93/bltins/trap.c: b_trap():
- Set sh.st.trapdontexec if any trap is set and non-empty (an empty
  trap means ignore the signal, which is inherited by an exec'd
  process, so the optimisation is fine in that case).
- Only clear sh.st.trapdontexec if we're not in a ksh function
  scope; unlike subshells, ksh functions fall back to parent traps
  if they don't trap a signal themselves, so a ksh function's
  parent traps also need to disable the exec optimisation.

src/cmd/ksh93/sh/fault.c: sh_sigreset():
- Introduce a new -1 mode for sh_funscope() to use, which acts like
  mode 0 except it does not clear sh.st.trapdontexec. This avoids
  clearing sh.st.trapdontexec for ksh functions scopes (see above).
- Otherwise, clear sh.st.trapdontexec whenever traps are reset.

src/cmd/ksh93/sh/xec.c: check_exec_optimization():
- Consolidate all the exec optimisation logic into this function,
  including the logic from the no_fork flag in sh_exec()/TFORK.
- In the former no_fork flag logic, replace the three checks for
  SIGINT, ERR and EXIT traps with a single check for the
  sh.st.trapdontexec flag.
2022-06-18 23:27:10 +01:00
Martijn Dekker
d6c9821c5b Allow exec of last command in forked non-bg subshell (re: 16b38021)
The exec optimization only happened in background subshells and not
in regular subshells when they had forked via sh_subfork(), which
makes little sense.

src/cmd/ksh93/sh/xec.c: sh_exec: case TLST:
- A subshell is executed as a list of commands which is TLST. If
  the shell had not forked at the beginning of the subshell, the
  sh_state(SH_FORKED) flag was not passed on to recursive sh_exec()
  invocations, and a sh_subfork() event did not change this. To fix
  this, re-check for the SH_FORKED state and pass that bit on to
  the recursive sh_exec() invocation if set (note that sh_isstate()
  returns a bitmask and not a boolean value).

src/cmd/ksh93/sh/subshell.c: sh_subfork():
- Remove redundant sh_onstate(SH_FORKED); this is already done in
  sh_fork() which this function calls.
2022-06-18 23:25:58 +01:00
Martijn Dekker
16b3802148 Fix incorrect exec optimisation with monitor/pipefail on
Reproducer script:
    tempfile=/tmp/out2.$$.$RANDOM
    bintrue=$(whence -p true)
    for opt in monitor pipefail
    do
            (
                    set +x -o "$opt"
                    (
                            sleep .05
                            echo "ok $opt" >&2
                    ) 2>$tempfile | "$bintrue"
            ) &
            wait
            cat "$tempfile"
            rm -f "$tempfile"
    done

Expected output:
    ok monitor
    ok pipefail

Actual output:
    (none)

The 'monitor' and 'pipefail' options are supposed to make the shell
wait for the all commands in the pipeline to terminate and not only
the last component, regardless of whether the pipe between the
component commands is still open. In the failing reproducer, the
dummy external true command is subject to an exec optimization, so
it replaces the subshell instead of forking a new process. This is
incorrect, as the shell is no longer around to wait for the
left-hand part of the pipeline, so it continues in the background
without being waited for. Since it writes to standard error after
.05 seconds (after the pipe is closed), the 'cat' command reliably
finds the temp file empty. Without the sleep this would be a race
condition with unpredictable results.

Interestingly, this bug is only triggered for a (background
subshell)& and not for a forked (regular subshell). Which means the
exec optimization is not done for a forked regular subshell, though
there is no reason not to. That will be fixed in the next commit.

src/cmd/ksh93/sh/xec.c: sh_exec():
- case TFORK: Never allow an exec optimization if we're running a
  command in a multi-command pipeline (pipejob is set) and the
  shell needs to wait for all pipeline commands, i.e.: either the
  time keyword is in use, the SH_MONITOR state is active, or the
  SH_PIPEFAIL option is on.
- case TFIL: Fix the logic for setting job.waitall for the
  non-SH_PIPEFAIL case. Do not 'or' in the boolean value but assign
  it, and include the SH_TIMING (time keyword in use) state too.
- case TTIME: After that fix in case TFIL, we don't need to bother
  setting job.waitall explicitly here.

src/cmd/ksh93/sh.1:
- Add missing documentation for the conditions where the shell
  waits for all pipeline components (time, -o monitor/pipefail).

Resolves: https://github.com/ksh93/ksh/issues/449
2022-06-18 23:25:30 +01:00
Martijn Dekker
6016fb64ce Forking workaround for converting to associative array in subshell
$ arch/*/bin/ksh -xc 'typeset -a a=(1 2 3); \
  (typeset -A a; typeset -p a); typeset -p a'
typeset -A a=()
typeset -a a=(1 2 3)

The associative array in the subshell is empty, so the conversion
failed. So far, I have been unsuccessful at fixing this in the
array and/or virtual subshell code (a patch that fixes it there
would still be more than welcome).

As usual, real subshells work correctly, so this commit adds
another forking workaround. The use case is rare and specific
enough that I have no performance concerns.

src/cmd/ksh93/bltins/typeset.c: setall():
- Fork a virtual subshell if we're actually converting a variable
  to an associative array, i.e.: the NV_ARRAY (-A, associative
  array) attribute was passed, there are no assignments (sh.envlist
  is NULL), and the variable is not unset.

src/cmd/ksh93/tests/arith.sh:
- Fix the "Array subscript quoting test" tests that should not have
  been passing and that correctly failed after this fix; they used
  'typeset -A' without an assignment in a subshell, assuming it was
  unset in the parent shell, which it wasn't.

Resolves: https://github.com/ksh93/ksh/issues/409
2022-06-15 04:58:14 +01:00
Martijn Dekker
69d37d5eae parse.c: fix use-after-free probs related to funstaks()
When the funstaks() function deletes a stack, other code could
still reference that stack's pointer, at least if a script's DEBUG
trap changed the function context by assigning to ${.sh.level}.
This crashed @ormaaj's funcname.ksh script in certain contexts, at
least when run as a dot script or in a virtual subshell.

This allows that script to run in all contexts by making
funstaks(s) set the stack pointer to NULL after deleting the stack
and making various other points in the code check for a null
pointer before dereferencing it.

This may not be the most elegant fix but (in my testing) it does
work, even when compiling ksh with AddressSanitiser.

Thanks to @JohnoKing for help researching this problem.

Resolves: https://github.com/ksh93/ksh/issues/212
2022-06-14 13:47:00 +01:00
Martijn Dekker
50db00e136 Fix subshell trap integrity, e.g. re-trapping a signal in subshell
Ksh handles local traps in virtual subshells the same way as local
traps in ksh-style shell functions, which can cause incorrect
operation.

Reproducer script:

	trap 'echo "parent shell trap"; exit 0' USR1
	(trap 'echo "subshell trap"' USR1; kill -USR1 $$)
	echo wrong

Output on every ksh93 version: 'wrong'
Output on every other shell: 'parent shell trap'

The ksh93 output is wrong because $$ is the PID of the main shell,
therefore 'kill -USR1 $$' from a subshell needs to issue SIGUSR1 to
the main shell and trigger the 'echo SIGUSR1' trap.

This is an inevitable consequence of processing signals in a
virtual subshell. Signals are a process-wide property, but a
virtual subshell and the parent shell share the same process.
Therefore it is not possible to distinguish between the parent
shell and subshell trap.

This means virtual subshells are fundamentally incompatible with
receiving signals. No workaround can make this work properly.

Ksh could either assume the signal needs to be caught by the
subshell trap (wrong in this case, but right in others) or by the
parent shell trap. But it does neither and just gives up and does
nothing, which I suppose is the least bad way of doing it wrong.

As another example, consider a subshell that traps a signal, then
passes its own process ID (as of 9de65210, that's ${.sh.pid}) to
another process to say "here is where to signal me". A virtual
subshell will send it the PID that it shares with the the parent
shell. Even if a virtual subshell receives the signal correctly, it
may fork mid-execution afterwards, depending on the commands that
it runs (and this varies by implementation as we fix or work around
bugs). So its PID could be invalidated at any time.

Forking a virtual subshell at the time of trapping a signal is the
only way to ensure a persistent PID and correct operation.

src/cmd/ksh93/bltins/trap.c: b_trap():
- Fork when trapping (or ignoring) a signal in a virtual subshell.
  (There's no need to fork when trapping a pseudosignal.)

src/cmd/ksh93/tests/signal.sh:
- Add tests. These are simplified versions of tests already there,
  which issued 'kill' as a background job. Currently, running a
  background job causes a virtual subshell to fork before forking
  the 'kill' background job (or *any* background job, see e3d7bf1d)
  -- an ugly partial workaround that I believe just became
  redundant and which I will remove in the next commit.
2022-06-14 01:33:24 +01:00
Martijn Dekker
ed9053ecfb remove ( simple_command & ) optimisation (re: e3d7bf1d)
I didn't trust this back in e3d7bf1d (which disabled it for
interactive shells) and I trust it less now. In af6a32d1/6b380572,
this was also disabled for virtual subshells as it caused program
flow corruption there. Now, on macOS 10.14.6, a crash occurs when
repeatedly running a command with this optimisation:

$ ksh -c 'for((i=0;i<100;i++));do print -n "$i ";(sleep 1&);done'
0 1 2 3 4 5 6 7 Illegal instruction

Oddly enough it seems that I can only reproduce this crash on macOS
-- not on Linux, OpenBSD, or Solaris. It could be a macOS bug,
particularly given the odd message in the stack trace below.

I've had enough, though. Out it comes. Things now work fine, the
reproducer is fixed on macOS, and it didn't optimise much anyway.

The double-fork issue discussed in e3d7bf1d remains.
________
For future reference, here's an lldb debugger session with a stack
trace. It crashes on calling calloc() (via sh_calloc(), via
sh_newof()) in jobsave_create(). This is not an invalid pointer
problem as we're allocating new memory, so it does look like an OS
bug. The "BUG IN CLIENT OF LIBPLATFORM" message is interesting.

$ lldb -- arch/*/bin/ksh -c 'for((i=0;i<100;i++));do print -n "$i ";(sleep 1&);done'
(lldb) target create "arch/darwin.i386-64/bin/ksh"
Current executable set to 'arch/darwin.i386-64/bin/ksh' (x86_64).
(lldb) settings set -- target.run-args  "-c" "for((i=0;i<100;i++));do print -n \"$i \";(sleep 1&);done"
(lldb) run
error: shell expansion failed (reason: lldb-argdumper exited with error 2). consider launching with 'process launch'.
(lldb) process launch
Process 35038 launched: '/usr/local/src/ksh93/ksh/arch/darwin.i386-64/bin/ksh' (x86_64)
0 1 2 3 4 5 6 7 8 9 Process 35038 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
    frame #0: 0x00007fff70deb1c2 libsystem_platform.dylib`_os_unfair_lock_recursive_abort + 23
libsystem_platform.dylib`_os_unfair_lock_recursive_abort:
->  0x7fff70deb1c2 <+23>: ud2

libsystem_platform.dylib`_os_unfair_lock_unowned_abort:
    0x7fff70deb1c4 <+0>:  movl   %edi, %eax
    0x7fff70deb1c6 <+2>:  leaq   0x1a8a(%rip), %rcx        ; "BUG IN CLIENT OF LIBPLATFORM: Unlock of an os_unfair_lock not owned by current thread"
    0x7fff70deb1cd <+9>:  movq   %rcx, 0x361cb16c(%rip)    ; gCRAnnotations + 8
Target 0: (ksh) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
  * frame #0: 0x00007fff70deb1c2 libsystem_platform.dylib`_os_unfair_lock_recursive_abort + 23
    frame #1: 0x00007fff70de7c9a libsystem_platform.dylib`_os_unfair_lock_lock_slow + 239
    frame #2: 0x00007fff70daa3bd libsystem_malloc.dylib`tiny_malloc_should_clear + 188
    frame #3: 0x00007fff70daa20f libsystem_malloc.dylib`szone_malloc_should_clear + 66
    frame #4: 0x00007fff70dab444 libsystem_malloc.dylib`malloc_zone_calloc + 99
    frame #5: 0x00007fff70dab3c4 libsystem_malloc.dylib`calloc + 30
    frame #6: 0x000000010003fa5d ksh`sh_calloc(nmemb=1, size=16) at init.c:264:13
    frame #7: 0x000000010004f8a6 ksh`jobsave_create(pid=35055) at jobs.c:272:8
    frame #8: 0x000000010004ed42 ksh`job_reap(sig=20) at jobs.c:363:9
    frame #9: 0x000000010004ff6f ksh`job_waitsafe(sig=20) at jobs.c:511:3
    frame #10: 0x00007fff70de9b5d libsystem_platform.dylib`_sigtramp + 29
    frame #11: 0x00007fff70d39ac4 libsystem_kernel.dylib`__fork + 12
    frame #12: 0x00007fff70c57d80 libsystem_c.dylib`fork + 17
    frame #13: 0x000000010009590d ksh`sh_exec(t=0x0000000101005d30, flags=4) at xec.c:1883:16
    frame #14: 0x0000000100096013 ksh`sh_exec(t=0x0000000101005d30, flags=4) at xec.c:2019:4
    frame #15: 0x0000000100096c4f ksh`sh_exec(t=0x0000000101005a40, flags=5) at xec.c:2213:9
    frame #16: 0x0000000100096013 ksh`sh_exec(t=0x0000000101005a40, flags=5) at xec.c:2019:4
    frame #17: 0x000000010001c23f ksh`exfile(iop=0x0000000100405750, fno=-1) at main.c:603:4
    frame #18: 0x000000010001b23c ksh`sh_main(ac=3, av=0x00007ffeefbff4f0, userinit=0x0000000000000000) at main.c:365:2
    frame #19: 0x0000000100000776 ksh`main(argc=3, argv=0x00007ffeefbff4f0) at pmain.c:45:9
    frame #20: 0x00007fff70bfe3d5 libdyld.dylib`start + 1
2022-06-14 01:32:33 +01:00
Martijn Dekker
9b893992a3 [v1.0] posix: don't zero-pad 2nds (re: 5c677a4c, 70fc1da7, b1a41311)
The POSIX mode now disables left-hand zero-padding of seconds in
'time'/'times' output. The standard specifies the output format quite
exactly and zero-padding is not in it.
2022-06-12 16:16:11 +01:00
Martijn Dekker
148a8a3f46 Another build system overhaul (re: 35672208, 580ff616, 6cc2f6a0)
So far we've been handling AST release build and git commit flags
and ksh SHOPT_* compile time options in the generic package build
script. That was a hack that was necessary before I had sufficient
understanding of the build system. Some of it did not work very
well, e.g. the correct git commit did not show up in ${.sh.version}
when compiling from a git repo.

As of this commit, this is properly included in the mamake
dependency tree by handling it from the libast and ksh93 Mamfiles,
guaranteeing they are properly up to date.

For a release build, the _AST_ksh_release macro is renamed to
_AST_release, because some aspects of libast also use this.

This commit also adds my first attempt at documenting the (very
simple, six-command) mamake language as it is currently implemented
-- which is significantly different from Glenn Fowler's original
paper. This is mostly based on reading the mamake.c source code.

src/cmd/INIT/README-mamake.md:
- Added.

bin/package, src/cmd/INIT/package.sh:
- Delete the hack.

**/Mamfile:
- Remove KSH_RELFLAGS and KSH_SHOPTFLAGS, which supported the hack.
- Delete 'meta' commands. They were no-ops; mamake.c ignores them.
  They also did not add any informative value.

src/lib/libast/Mamfile:
- Add a 'virtual' target that obtains the current git commit,
  examines the git branch, and decides whether to auto-set an
  _AST_git_commit and/or or _AST_release #define to a new
  releaseflags.h header file. This is overwritten on each run.
- Add code to the install target that copies limit.h to
  include/ast, but only if it doesn't exist or the content of the
  original changed. This allows '#include <releaseflags.h>' from
  any program using libast while avoiding needless recompiles.
- When there are uncommitted changes, add /MOD (modified) to the
  commit hash instead of not defining it at all.

src/cmd/ksh93/**:
- Mamfile: Add a shopt.h target that reads SHOPT.sh and converts it
  into a new shopt.h header file in the object code directory. The
  shopt.h header only contains SHOPT_* directives that have a value
  in SHOPT.sh (not the empty/probe ones). They also do not redefine
  the macros if they already exist, so overriding with something
  like CCFLAGS+=' -DSHOPT_FOO=1' remains possible.
- **.c: Every c file now #includes "shopt.h" first. So SHOPT_*
  macros are no longer passed via environment/MAM variables.
* SHOPT.sh: The AUDITFILE and CMDLIB_DIR macros no longer need an
  extra backslash-escape for the double quotes in their values.
  (The old way required this because mamake inserts MAM variables
  directly into shell scripts as literals without quoting.  :-/ )

src/cmd/INIT/mamake.c:
- Import the two minor changes between from 93u+ and 93v-: bind()
  is renamed to bindfile() and there is a tweak to detecting an
  "improper done statement".
- Allow arbitrary whitespace (isspace()) everywhere, instead of
  spaces only. This obsoletes my earlier indentation workaround
  from 6cc2f6a0; turns out mamake always supported indentation, but
  with spaces only.
- Do not skip line numbers at the beginning of each line. This
  undocumented feature is not and (AFAICT) never has been used.
- Throw an error on unknown command or rule attribute. Quite an
  important feature for manual maintenance: catches typos, etc.
2022-06-12 05:47:02 +01:00
Martijn Dekker
b93216ce88 typeset -T: fix spurious 2nd error on redef (re: 1fbbeaa1, d309d604)
Reproducers:

    $ ksh -c 'typeset -T foo=(x=1); typeset -T foo=(integer x)'
    ksh: typeset: foo: type cannot be redefined
    ksh: foo: type definition requires compound assignment

    $ ksh -c 'typeset -T foo=(x=1); typeset -T foo=(integer x=2)'
    ksh: foo: type cannot be redefined
    ksh: foo: type definition requires compound assignment

In both cases, the second error message is spurious, as there is in
fact a compound assignment.

The first case was introduced in 1fbbeaa1. The 'integer' command no
longer resolves to a special built-in but is now a regular built-in
so it does not cause the shell to exit on error. The code path
continues and nv_mktype() is called which issues the bad error as
the compound assignment was not fully processed.

The second case was introduced in d309d604. This commit inhibits
the exit-on-error for the assignment list of a regular built-in.
For type definitions, nv_mktype() is called via this code path too.

Since all declaration commands were either special built-ins or
aliases to special built-ins in 93u+, they all exited on error and
this problem never happened. Now that some regular built-ins are
allowed inside 'typeset -T', we need some special-casing to make
them exit on error as before. Arguably this is the correct thing to
do anyway, as their errors are also an error in the enveloping
'typeset' which is a special built-in.

This regression is cosmetic and the case is rare, so this is
probably not worth a NEWS item.

src/cmd/ksh93/sh/xec.c: sh_exec():
- Fix the second case by adding a test for non-NULL 'sh.mktype' to
  the conditions where the assignments list is processed with exit
  on error. This pointer is set while executing a type definition.
- Fix the first case by saving the sh.mktype state before executing
  a built-in and then checking the saved value if a built-in exits
  on error. Saving is needed as sh.mktype has been reset by then.
2022-06-10 03:23:38 +01:00
Martijn Dekker
3030197b89 Add error message for ambiguous long-form option abbreviation
Before:

  $ set -o hist
  -ksh: set: hist: bad option(s)

After:

  $ set --hist
  -ksh: set: hist: ambiguous option

In ksh 93u+m, there are three options starting with 'hist', so the
abbreviation does not represent a single option. It is useful to
communicate this in the error message.

In addition, "bad option(s)" was changed to "unknown option", the
same message as that given by AST optget(3), for consistency.

src/cmd/ksh93/sh/string.c:
- Make sh_lookopt() return -1 for an ambiguous option. This is
  trivial as there is already an 'amb' variable flagging that up.

src/cmd/ksh93/sh/args.c:
- Use the negative sh_lookopt() return status to decide between
  "unknown option" and "ambiguous option".

src/cmd/ksh93/data/builtins.c: sh_set[]:
- Explain the --option and --nooption forms in addition to the -o
  option and +o option forms.
- Document the long options without their 'no' prefixes (e.g. glob
  instead of noglob) as this simplifies documentation and the
  relation with the short options makes more sense. Those names are
  also how they show up in 'set -o' output and it is better for
  those to match.
- Tweaks.
2022-06-10 01:11:46 +01:00
Martijn Dekker
89cec81b32 Another round of minor tweaks and cleanups
Notable changes:
- The typeset builtin's usage and error messages for incompatible
  options used with -f has been corrected to show that -t and -u
  can be used with -f.
- In name.c, get rid of misleaadingly named Null static which is
  actually the empty string, not the null value. Replace with a new
  AltEmpty macro that is defined similarly to Empty. This is now
  also used in nvtype.c (re: de037b6e).
2022-06-09 03:02:06 +01:00
Martijn Dekker
0602177646 posix: block brace expansion of unquoted expansions (re: a14d17c0)
Historically, ksh (including ksh88 and mksh) allow brace expansion
not just on literal patterns but also on patterns resulting from
unquoted variable expansions or command substitutions:

  $ a='{a,b}' ksh -c 'echo X{a,b} Y$a'
  Xa Xb Ya Yb

Most people expect only the first (literal) pattern to be expanded,
as in bash and zsh:

  $ a='{a,b}' bash -c 'echo X{a,b} Y$a'
  Xa Xb Y{a,b}

The historic ksh behaviour is poorly documented and nearly unknown,
violates the principle of least astonishment, and makes unquoted
variable expansions even more unsafe. See discussion at:
https://www.austingroupbugs.net/view.php?id=1193
https://github.com/ksh93/ksh/issues/140

Unfortunately, we cannot change it in default ksh without breaking
backward compatibility. But we can at least fix it for the POSIX
mode (which disables brace expansion by default but allows turning
it back on), particularly as it looks like POSIX, if it decides to
specify brace expansion in a future version of the standard, will
disallow brace expansion on unquoted variable expansions.

src/cmd/ksh93/sh/macro.c: endfield():
- When deciding whether to do brace expansion + globbing or only
  globbing, also check that we do not have POSIX mode and an
  unquoted variable expansion (mp->pattern==1).
2022-06-08 22:21:53 +01:00
Martijn Dekker
7b8e7fbb49 Error on defining disc for undeclared type member (re: 87088361)
$ typeset -T _bad_disc_t=(typeset dummy; function foo.get { :; })
  Abort

One of the abort() calls that replaced a debug message in the
referenced commit was triggered when trying to define a discipline
function for a nonexistent type member inside a 'typeset -T' type
definition.

src/cmd/ksh93/sh/nvtype.c: std_disc():
- Issue a proper error message for that condition.
2022-06-08 01:12:23 +01:00
Martijn Dekker
2a43cbc3f6 Fix crash on setting attribute to variable with getn discipline
The 'getn' discipline is experimental and undocumented, the only
mention of it being an old mailing list post from David Korn:
https://www.mail-archive.com/ast-users@research.att.com/msg00601.html

But it still should not crash.

$ LC_NUMERIC=C ENV=/./dev/null arch/*/bin/ksh
$ foo.getn() { .sh.value=2.3*4.5; }
$ typeset -F foo
Memory fault

src/cmd/ksh93/sh/nvdisc.c: assign():
- Check that the nvalue union has a non-NULL pointer before
  using it.

Progresses: https://github.com/ksh93/ksh/issues/435
2022-06-07 19:20:43 +01:00
Martijn Dekker
9da0887e54 Fix spurious export attribute when printing compound variables
Reproducer script:

    typeset -Ttyp1 typ1=(
            function get {
                    .sh.value="'Sample'";
            }
    )
    typ1 var11
    typeset -p .sh.type
    typeset -p .sh.type

Buggy output:

    namespace sh.type
    {
            typeset -r typ1='Sample'
    }
    namespace sh.type
    {
            typeset -x -r typ1='Sample'
    }

An -x (export) attribute is magically pulled out of a hat.

Analysis: The walk_tree() function in nvdisc.c repurposes (!) the
NV_EXPORT attribute as an instruction to turn *off* indenting when
pretty-printing the values of compound variables. The
print_namval() function in typeset.c, implementing 'typeset -p',
turns on NV_EXPORT for compound variables to inhibit indentation.
But it then does not bother to turn it off, which causes this bug.

src/cmd/ksh93/bltins/typeset.c: print_namval():
- When printing compound variables, only turn on NV_EXPORT
  temporarily.

Resolves: https://github.com/ksh93/ksh/issues/456
2022-06-07 04:27:54 +01:00
Martijn Dekker
80f8cc497f Fix completion following $'foo\'bar'
On an interactive shell in emacs or vi, type a command with a $'…'
quoted string that contains a backslash-escaped single quote, like:

    $ true $'foo\'bar' ▁

Then begin to type the name of a file present in the current
working directory and press tab. Nothing happens as completion
fails to work.

The completion code does not recognise $'…' strings. Instead, it
parses them as '…' strings in which there are no backslash escapes,
so it considers the last ' to open a second quoted string which is
not terminated.

Plus, when replacing a $'…' string with a (backslash-escaped)
completed string, the initial '$' is not replaced:

    $ $'/etc/hosts<Tab>
    $ $/etc/hosts

src/cmd/ksh93/edit/completion.c:
- find_begin():
  - Learn how to recognise $'…' strings. A new local dollarquote
    flag variable is used to distinguish them from regular '…'
    strings. The difference is that backslash escapes (and only
    those) should be recognised as in "…".
  - Set a special type -1 for $'…' as the caller will need a way
    to distinguish those from '…'.
- ed_expand(): When replacing a quoted string, remove an extra
  initial character (being the $ in $') if the type set by
  find_begin() is -1.

Resolves: https://github.com/ksh93/ksh/issues/462
2022-06-06 03:13:13 +01:00
Martijn Dekker
7a5423dfb6 Fix more spurious comsub execution in tab completion (re: 7a2d3564)
Comsubs were either executed or caused a syntax error when attempting
to complete them within single quotes. Since single quotes do not
expand anything, no such completion should take place.

$ '`/de<TAB>-ksh: /dev/: cannot execute [Is a directory]

$ '$(/de<TAB>-ksh: syntax error: `end of file' unexpected

src/cmd/ksh93/edit/completion.c:
- find_begin():
  - Remove recursive handling for '`' comsubs from 7a2d3564; it is
    sufficient to set the return pointer to the current location cp
    (the character following '`') if we're not in single quotes.
  - For '$' and '`', if we are within single quotes, set type '\''
    and set the return pointer bp to the location of the '$' or
    '`'.
- ed_expand(): If find_begin() sets type '\'' and the current begin
  character is $ or `, refuse to attempt completion; return -1 to
  cause a terminal beep.

Related:
https://github.com/ksh93/ksh/issues/268
https://github.com/ksh93/ksh/issues/462#issuecomment-1038482307
2022-06-06 03:12:57 +01:00
Martijn Dekker
d70793d3c0 fix radix point regress test bug (re: a0effeb0)
A previous test left LC_ALL set, causing a spurious failure in one
of the newly added tests. The bug was masked because another test
conditional upon SHOPT_MULTIBYTE unset LC_ALL again. The fix is to
unset all locale variables before testing, just to be sure.
2022-06-05 17:53:53 +01:00
Martijn Dekker
a0effeb0dc Fix omission: update radix point on LANG change (re: f0386a87)
Of course we need to check LC_LANG (for the LANG variable) as well.
2022-06-05 14:29:35 +01:00
Martijn Dekker
e3aa32a129 Add --functrace shell option (re: 2a835a2d)
A side effect of the bug fixed in 2a835a2d caused the DEBUG trap
action to appear to be inherited by subshells, but in a buggy way
that could crash the shell. After the fix, the trap is reset in
subshells along with all the others, as it should be. Nonetheless,
as that bug was present for years, some have come to rely on it.

This commit implements that functionality properly. When the new
--functrace option is turned on, DEBUG trap actions are now
inherited by subshells as well as ksh function scopes. In addition,
since it makes logical sense, the new option also causes the
-x/--xtrace option's state to be inherited by ksh function scopes.
Note that changes made within the scope do not propagate upwards;
this is different from bash.

(I've opted against adding a -T short-form equivalent as on bash,
because -T was formerly a different option on 93u+ (see 63c55ad7)
and on mksh it has yet anohter a different meaning. To minimise
confusion, I think it's best to have the long-form name only.)

src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/data/options.c:
- Add new "functrace" (SH_FUNCTRACE) long-form shell option.

src/cmd/ksh93/sh/subshell.c: sh_subshell():
- When functrace is on, copy the parent's DEBUG trap action into
  the virtual subshell scope after resetting the trap actions.

src/cmd/ksh93/sh/xec.c: sh_funscope():
- When functrace is on and xtrace is on in the parent scope, turn
  it on in the child scope.
- Same DEBUG trap action handling as in sh_subshell().

Resolves: https://github.com/ksh93/ksh/issues/162
2022-06-04 17:27:27 +01:00
Martijn Dekker
73cd1a9ee0 tests/coprocess.sh: activate known intermittent fail as warning
https://github.com/ksh93/ksh/issues/132#issuecomment-997432781
2022-06-04 11:53:46 +01:00
Martijn Dekker
1184b2ade9 Honour attribs for assignments preceding sp. builtins, POSIX functs
After the previous commit, one inconsistency was left. Assignments
preceding special built-ins and POSIX functions (which persist past
the command :-/) caused pre-existing attributes of the respective
variables to be cleared.

$ (f() { typeset -p n; }; typeset -i n; n=3+4 f)
n=3+4

(expected output: 'typeset -i n=7')

This problem was introduced shortly before the release of ksh 93u+,
in version 2012-05-04, by adding these lines of code to the code
for processing preceding assignments in sh_exec():

src/cmd/ksh93/sh/xec.c:
1055:	if(np)
1056:		flgs |= NV_UNJUST;

So, for special and declaration commands and POSIX functions, the
NV_UNJUST flag is passed to nv_open(). In older ksh versions, this
flag cleared justify attributes only, but in early 2012 it was
repurposed to clear *all* attributes -- without changing the name
or the relevant comment in name.h, which are now both misleading.

The reason for setting this flag in xec.c was to deal with some
bugs in 'typeset'. Removing those two lines causes regressions:

  attributes.sh[316]: FAIL: typeset -L should not preserve old attributes
  attributes.sh[322]: FAIL: typeset -R should not preserve old attributes
  attributes.sh[483]: FAIL: typeset -F after typeset -L fails
  attributes.sh[488]: FAIL: integer attribute not cleared for subsequent typeset

Those are all typeset regressions, which suggests this fix was
relevant to typeset only. This is corroborated by the relevant
AT&T ksh93/RELEASE entry:

12-04-27  A bug in which old attributes were not cleared when
	  assigning a value using typeset has been fixed.

So, to fix this 2012 regression without reintroducing the typeset
regressions, we need to set the NV_UNJUST flag for invocations of
the typeset family of commands only. This is changed in xec.c.

While we're at it, we might as well rename that little-used flag to
something that reflects its current purpose: NV_UNATTR.
2022-06-03 23:28:28 +01:00
Martijn Dekker
75b247cce2 Honour attributes for local assignments preceding certain commands
Reproducer:

    $ typeset -i NUM=123
    $ (NUM=3+4 env; :)|grep ^NUM=
    NUM=3+4
    $ (NUM=3+4 env)|grep ^NUM=
    NUM=7

The correct output is NUM=7. This is also the output if ksh is
compiled with SHOPT_SPAWN disabled, suggesting that the problem is
here in sh_ntfork(), where the invocation-local environment list is
set using a sh_scope() call:

src/cmd/ksh93/sh/xec.c:
3496:	if(t->com.comset)
3497:	{
3498:		scope++;
3499:		sh_scope(t->com.comset,0);
3500:	}

Analysis:

When ksh is compiled with SHOPT_SPAWN disabled, external commands
are always run using the regular forking mechanism. First the shell
forks, then in the fork, the preceding assignments list (if any)
are executed and exported in the main scope. Replacing global
variables is not a problem as the variables are exported and the
forked shell is then replaced by the external command using
execve(2).

But when using SHOPT_SPAWN/sh_ntfork(), this cannot be done as the
fork(2) use is replaced by posix_spawn(2) which does not copy the
parent process into the child, therefore it's not possible to
execute anything in the child before calling execve(2). Which means
the preceding assignments list must be processed in the parent, not
the child. Which makes overwriting global variables a no-no.

To avoid overwriting global variables, sh_ntfork() treats preceding
assignments like local variables in functions, which means they do
not inherit any attributes from the parent scope. That is why the
integer attribute is not honoured in the buggy reproducers.

And this is not just an issue for external commands. sh_scope() is
also used for assignments preceding a built-in command. Which is
logical, as those don't create a process at all.

src/cmd/ksh93/sh/xec.c:
1325:	if(argp)
1326:	{
1327:		scope++;
1328:		sh_scope(argp,0);
1329:	}

Which means this bug exists for them as well, regardless of whether
SHOPT_SPAWN is compiled in.

$ /bin/ksh -c 'typeset -i NUM; NUM=3+4 command eval '\''echo $NUM'\'
3+4
(expected: 7, as on mksh and zsh)

So, the attributes from the parent scope will need to be copied
into the child scope. This should be done in nv_setlist() which is
called from sh_scope() with both the NV_EXPORT and NV_NOSCOPE flags
passed. Those flag bits are how we can recognise the need to copy
attributes.

Commit f6bc5c0 fixed a similar inconsistency with the check for the
read-only attribute. In fact, the bug fixed there was simply a
specific instance of this bug. The extra check for readonly was
because the readonly attribute was not copied to the temporary
local scope. So that fix is now replaced by the more general fix
for this bug.

src/cmd/ksh93/sh/name.c: nv_setlist():
- Introduce a 'vartree' local variable to avoid repetitive
  'sh.prefix_root ? sh.prefix_root : sh.var_tree' expressions.
- Use the NV_EXPORT|NV_NOSCOPE flag combination to check if parent
  scope attributes need to be copied to the temporary local scope
  of an assignment preceding a command.
- If so, copy everything but the value itself: attributes (nvflag),
  size parameter (nvsize), discipline function pointer (nvfun) and
  the name pointer (nvname). The latter is needed because some
  code, at least put_lang() in init.c, compares names by comparing
  the pointers instead of the strings; copying the nvname pointer
  avoids a regression in tests/locale.sh.

src/cmd/ksh93/sh/xec.c: local_exports():
- Fix a separate bug exporting attributes to a new ksh function
  scope, which was previously masked by the other bug. The
  attributes (nvflag) were copied *after* nv_putval()ing the value,
  which is incorrect as the behaviour of nv_putval() is influenced
  by the attributes. But here, we're copying the value too, so we
  can simplify the whole function by using nv_clone() instead. This
  may also fix other corner cases. (re: c1994b87)

Resolves: https://github.com/ksh93/ksh/issues/465
2022-06-03 23:28:16 +01:00
Martijn Dekker
2ecc2575d5 Fix import of float attribute/value from environment (re: 960a1a99)
Bug 1: as of 960a1a99, floating point literals were no longer
recognised when importing variables from the environment. The
attribute was still imported but the value reverted to zero:

$ (export LC_NUMERIC=C; typeset -xF5 num=7.75; \
   ksh -c 'typeset -p num')
typeset -x -F 5 num=0.00000

Bug 2 (inherited from 93u+): The code that imported variable
attributes from the environment only checked '.' to determine
whether the float attribute should be set. It should check the
current radix point instead.

$ (export LC_NUMERIC=debug; typeset -xF5 num=7,75; \
   ksh -c 'typeset -p num')
typeset -x -i num=0
...or, after fixing bug 1 only, the output is:
typeset -x -i num=75000

src/cmd/ksh93/sh/arith.c: sh_strnum():
- When importing untrusted env vars at init time, handle not only
  "base#value" literals using strtonll, but also floating point
  literals using strtold. This fixes the bug without reallowing
  arbitary expressions. (re: 960a1a99)
- When not initialising, use sh.radixpoint (see f0386a87) instead
  of '.' to help decide whether to evaluate an arith expression.

src/cmd/ksh93/sh/init.c: env_import_attributes():
- Use sh.radixpoint instead of '.' to check for a decimal fraction.
  (This code is needed because doubles are exported as integers for
  ksh88 compatibility; see attstore() in name.c.)
2022-06-03 12:18:54 +01:00
Martijn Dekker
6dbde5ec7c Correctly exit from namespace on error (re: f73b8617)
The referenced commit introduced at least one bug: EXIT traps were
not triggered if a special builtin threw an error in a namespace
that is within a virtual subshell.

src/cmd/ksh93/sh/xec.c: sh_exec(): TNSPACE:
- When an error occurs, siglongjmp to the previous saved
  environment; it is not correct to sh_exit directly.

src/cmd/ksh93/tests/namespace.sh:
- Remove the forkign workaround and the TODO where I incorrectly
  blamed this problem on the virtual subshell mechanism.
2022-06-02 03:28:41 +01:00
Martijn Dekker
ea300089a1 New feature: 'typeset -g' as in bash 4.2+
typeset -g allows directly manipulating the attributes of variables
at the global level from any context. This feature already exists
on bash 4.2 and later.

mksh (R50+), yash and zsh have this flag as well, but it's slightly
different: it ignores the current local scope, but a parent local
scope from a calling function may still be used -- whereas on bash,
'-g' always refers to the global scope. Since ksh93 uses static
scoping (see III.Q28 at <http://kornshell.com/doc/faq.html>), only
the bash behaviour makes sense here.

Note that the implementation needs to be done both in nv_setlist()
(name.c) and in b_typeset() (typeset.c) because assignments are
executed before the typeset built-in itself. Hence also the
pre-parsing of typeset options in sh_exec().

src/cmd/ksh93/include/nval.h:
- Add new NV_GLOBAL bit flag, using a previously unused bit that
  still falls within the 32-bit integer range.

src/cmd/ksh93/sh/xec.c: sh_exec():
- When pre-parsing typeset flags, make -g pass the NV_GLOBAL flag
  to the nv_setlist() call that processes shell assignments prior
  to running the command.

src/cmd/ksh93/sh/name.c: nv_setlist():
- When the NV_GLOBAL bit flag is passed, save the current variable
  tree pointer (sh.var_tree) as well as the current namespace
  (sh.namespace) and temporarily set the former to the global
  variable tree (sh.var_base) and the latter to NULL. This makes
  assignments global and ignores namesapces.

src/cmd/ksh93/bltins/typeset.c:
- b_typeset():
  - Use NV_GLOBAL bit flag for -g.
  - Allow combining -n with -g, permitting 'typeset -gn var' or
    'nameref -g var' to create a global nameref from a function.
  - Do not allow a nonsensical use of -g when using nested typeset
    to define member variables of 'typeset -T' types. (Type method
    functions can still use it as normal.)
- setall():
  - If NV_GLOBAL is passed, use sh.var_base and deactivate
    sh.namespace as in nv_setlist(). This allows attributes
    to be set correctly for global variables.

src/cmd/ksh93/tests/{functions,namespace}.sh:
- Add regression tests based on reproducers for problems found
  by @hyenias in preliminary versions of this feature.

Resolves: https://github.com/ksh93/ksh/issues/479
2022-06-01 21:07:01 +01:00
Martijn Dekker
f73b8617dd Restore namespace's parent scope when exiting due to error
Reproducer:

    $ namespace test { x=123; typeset -g x=456; }
    $ echo $x ${.test.x}
    456 123
    $ namespace test { typeset -Q; }
    arch/darwin.i386-64/bin/ksh: typeset: -Q: unknown option
    [usage message snipped for brevity]
    $ echo $x ${.test.x}
    123 123	            <== expected: 123 456
    $ x=789
    $ echo $x ${.test.x}
    789 789                 <== expected: 789 456
    $ # look at that, we never left the namespace...

When prefixing the erroneous 'typeset' with 'command', the problem
does not occur. 'command' disables the properties of special
built-ins such as exit on error. So, when a special built-in exits
on error, the parent scope is not properly resotred.

This bug exists in every ksh93 version with SHOPT_NAMESPACE so far.

src/cmd/ksh93/sh/xec.c: sh_exec():
- Before entering a namespace, use sh_pushcontext and sigsetjmp to
  make sure we return here if sh_exit() is called, e.g. when a
  special builtin throws an error, to ensure the parent scope
  (oldnspace) is restored.

Thanks to @hyenias for making me aware of this bug.
Discussion: https://github.com/ksh93/ksh/issues/479#issuecomment-1140468965
2022-05-29 23:05:03 +01:00
Martijn Dekker
8f14514661 set --default: properly restore ksh IFS behaviour (re: 9e2a8c69)
Reproducer:

$ (IFS=$'\t\t'; val=$'\tone\t\ttwo\t'; set --posix; \
   set -- $val; echo $#; set --noposix; set -- $val; echo $#)
2
4   <== OK

$ (IFS=$'\t\t'; val=$'\tone\t\ttwo\t'; set --posix; \
   set -- $val; echo $#; set --default; set -- $val; echo $#)
2
2   <== bug

The output of the seconnd command line should be like the first.

When POSIX mode is turned off using 'set --noposix' (or 'set +o
posix'), sh.ifstable is invalidated as it needs to be repopulated
on the next field split to restore ksh-specific special handling of
a repeated $IFS whitespace character as non-whitespace. However,
when 'set --default' is used, this does not happen, which is a bug.

src/cmd/ksh93/sh/args.c: sh_argopts():
- While processing --default, when turning off SH_POSIX, call
  sh_invalidate_ifs() to invalidate sh.ifstable.
2022-05-28 00:13:46 +01:00
Martijn Dekker
83baa27ef9 Fix incorrect typeset -L/-R/-Z on input with spaces (re: bdb99741)
The typeset output for -L/-R/-Z seems to be wrong when the input
has leading/trailing spaces. This started occurring after the
dynamic buffer size changes introduced in name.c as part of the
fix for <https://github.com/ksh93/ksh/issues/142>.

Test script:
  typeset -L8  s_date1=" 22/02/09 08:25:01"; echo "$s_date1"
  typeset -R10 s_date1="22/02/09 08:25:01 "; echo "$s_date1"
  typeset -Z10 s_date1="22/02/09 08:25:01 "; echo "$s_date1"

Actual output:
22/02/0
  08:25:01
0008:25:01

Expected output:
22/02/09
9 08:25:01
9 08:25:01

src/cmd/ksh93/sh/name.c: nv_newattr():
- Simplify allocation code, replacing the earlier dynamic buffer
  size calculation with just the greater of the strlen and size.

Resolves: https://github.com/ksh93/ksh/issues/476
Co-authored-by: George Lijo <george.lijo@gmail.com>
2022-05-26 00:08:45 +01:00
Martijn Dekker
c2fad38bf8 Fix 'cd -e' regress fails on some UNIXen (re: e6989853, b398f33c)
On some systems, including at least Solaris 10.1 and QNX 6.5.0, the
regression tests below occurred. This is because, on these systems,
'cd .' always fails with 'no such file or directory', regardless of
flags, if the present working directory no longer exists. This is a
legitimate variation in UNIX-like systems so the tests should be
compatible.

test builtins begins at 2022-05-22+13:08:28
/usr/local/src/ksh/src/cmd/ksh93/tests/builtins.sh[1499]: cd: .: [No such file or directory]
        builtins.sh[1501]: FAIL: cd -P without -e exits with error status if $PWD doesn't exist (expected 0, got 1)
/usr/local/src/ksh/src/cmd/ksh93/tests/builtins.sh[1504]: cd: .: [No such file or directory]
        builtins.sh[1506]: FAIL: cd -eP doesn't fail if $PWD doesn't exist (expected 1, got 2)
test builtins failed at 2022-05-22+13:08:37 with exit code 2 [ 287 tests 2 errors ]

src/cmd/ksh93/tests/builtins.sh:
- Delete the 'cd -P .' test for a nonexistent PWD.
- For the 'cd -eP .' test for a nonexistent PWD, redirect standard
  error to /dev/null and also accept exit status 2, which we would
  expect with the '-e' flag if a 'no such file or directory' error
  is thrown.
2022-05-22 12:49:42 +01:00
atheik
9bed28c3f9 Fix line continuation within command substitutions
In command substitutions of the $(standard) and ${ shared state; }
form, backslash line continuation is broken.

Reproducer:

	echo $(
	echo one two\
	three
	)

Actual output (ksh93, all versions):

	one two\ three

Expected output (every other shell, POSIX spec):

	one twothree

src/cmd/ksh93/sh/lex.c: sh_lex(): case S_REG:
- Do not skip new-line joining if we're currently processing a
  command substitution of one of these forms (i.e., if the
  lp->lexd.dolparen level is > 0).

Background info/analysis:

comsub() is called from sh_lex() when S_PAR is the current state.
In src/cmd/ksh93/data/lexstates.c, we see that S_PAR is reached in
the ST_DOL state table at index 40. Decimal 40 is ( in ASCII. So,
the previous skipping of characters was done according to the
ST_DOL state table, and the character that stopped it was (. This
means we have $(.

Alternatively, comsub() may be called from sh_lex() by jumping to
the do_comsub label. In brief, that is the case when we have ${.

Regardless of which it is from the two, comsub() is now called from
sh_lex(). In comsub(), lp->lexd.dolparen is incremented at the
beginning and decremented at the end. Between them, we see that
sh_lex() is called. So, lp->lexd.dolparen in sh_lex() indicates the
depth of nesting $( or ${ statements we're in. Thus, it is also the
number of comsub() invocations seen in a backtrace taken in
sh_lex().

The codepath for `...` is different (and never had this bug).

Co-authored by: Martijn Dekker <martijn@inlv.org>
Resolves: https://github.com/ksh93/ksh/issues/367
2022-05-22 00:23:54 +01:00
atheik
40a5c45b48 Allow double quotes within backtick comsub within double quotes
The following reproducer causes a spurious syntax error:

    foo="`: "("`"

The nested double quotes are not recognised correctly, causing a
syntax error at the '('. Removing the outer double quotes (which
are unnecessary) is a workaround, but it's still a bug as every
other shell accepts this. This bug has been present since the
original Bourne shell.

src/cmd/ksh93/sh/lex.c: sh_lex(): case S_QUOTE:
- If the current character is '"' and we're in a `...` command
  substitution (ingrave is true), then do not switch to the old
  mode but keep using the ST_QUOTE state table.

Thanks to @JohnoKing for the report and to @atheik for the fix.

Co-authored by: Martijn Dekker <martijn@inlv.org>
Resolves: https://github.com/ksh93/ksh/issues/352
2022-05-20 22:48:47 +01:00
atheik
86b94d9feb libast: optget(3): Fix memory leak in --help/--man info
src/lib/libast/misc/optget.c: textout(): case ']':
- Before returning, call pop() to free any \f...\f info items that
  are left. Note that this is a safe no-op if the pointer is null.

Resolves: https://github.com/ksh93/ksh/issues/407
Co-authored-by: Martijn Dekker <martijn@inlv.org>
2022-03-11 21:24:08 +01:00
Martijn Dekker
fd28da31da Fix another test/[ corner case bug; add --posix test script
This fixes another corner case bug in the horror show that is the
test/[ comand.

Reproducer:

   $ ksh --posix -c 'test X -a -n'
   ksh: test: argument expected

Every other shell returns 0 (success) as, POSIXly, this is a test
for the strings 'X' and '-n' both being non-empty, combined with
the binary -a (logical and) operator. Instead, '-n' was taken as a
unary primary operator with a missing argument, which is incorrect.

POSIX reference:
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/test.html
> 3 arguments:
> * If $2 is a binary primary, perform the binary test of $1 and $3.

src/cmd/ksh93/bltins/test.c:
- e3(): If the final argument begins with a dash, always treat it
  as a test for a non-empty string, therefore return true. Do not
  limit this to "new flags" only.

src/cmd/ksh93/tests/posix.sh:
- Added. These are tests for every aspect of the POSIX mode.
2022-03-11 21:23:45 +01:00
Martijn Dekker
b398f33c49 Another round of accumulated minor fixes and cleanups
Only notable changes listed below.

**/Mamfile:
- Do not bother redirecting standard error for 'cmp -s' to
  /dev/null. Normally, 'cmp -s' on Linux, macOS, *BSD, or Solaris
  do not not print any message. If it does, something unusual is
  going on and I would want to see the message.
- Since we now require a POSIX shell, we can use '!'.

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/init.c:
- Remove SH_TYPE_PROFILE symbol, unused after the removal of the
  SHOPT_PFSH code. (re: eabd6453)

src/cmd/ksh93/sh/io.c:
- piperead(), slowread(): Replace redundant sffileno() calls by
  the variables already containing their results. (re: 50d342e4)

src/cmd/ksh93/bltins/mkservice.c,
rc/lib/libcmd/vmstate.c:
- If these aren't compiled, define a stub function to silence the
  ranlib(1) warning that the .o file does not contain symbols.
2022-03-11 21:20:32 +01:00
Johnothan King
8fc8c2f51c Fix a few minor issues (#473)
Changes:
- Fixed two xtrace test failures introduced in commit cfc8744c.
- The definition of _use_ntfork_tcpgrp in xec.c is now dependent on
  SHOPT_SPAWN being defined (re: 8e9ed5be).
- Removed many unnecessary newlines and fixed various typos.
2022-03-11 21:18:42 +01:00
Johnothan King
dccf6b5ea8 Backport ksh93v- regression tests and fix various regression test bugs (#472)
- tests/*.sh: Backported many additional regression tests and test
  fixes from the alpha and beta releases of ksh93v-.

- tests/alias.sh: Avoid trying to add vi to the hash table, as some
  platforms do not provide a vi(1) implementation installed as part
  of the default system. This fixes a regression test failure I was
  getting in one of my Linux virtual machines.

- tests/builtins.sh: Fixed a bug in one of the regression tests that
  caused an incorrect total error count if any of the tests failed.

- tests/sh_match.sh: Fixed a regression test failure on DragonFly
  BSD caused by the diff command printing an extra 'No differences
  encountered' line.
2022-03-11 21:15:55 +01:00
Johnothan King
e87dbebebd Fix use after free bug when using += (re: 75796a9c) (#466)
The previous fix for the += operator introduced a use-after-free
bug that could result in a variable pointing to random garbage:
   $ foo=bar
   $ foo+=_foo true
   $ typeset -p foo
   foo=V V
The use after free issue occurs because when nv_clone creates a
copy of $foo in the true command's invocation-local scope, it does
not duplicate the string $foo points to. As a result, the $foo
variable in the parent scope points to the same string as $foo in
the invocation-local scope, which causes the use after free bug
when cloned $foo variable is freed from memory.

src/cmd/ksh93/sh/nvdisc.c:
- To fix the use after free bug, allow nv_clone to duplicate the
  string with memdup or strdup when no flags are passed.

src/cmd/ksh93/tests/variables.sh:
- Add a regression test for using the += operator with regular
  commands.

src/cmd/ksh93/tests/leaks.sh:
- Add a regression test to ensure the bugfix doesn't introduce any
  memory leaks.
2022-03-11 21:08:57 +01:00
Martijn Dekker
bc6c5dbdd9 path_pwd(): Fix use after free (re: 11177d44)
Of course, we should not free the 'cp' pointer when we still need
to use it.

Resolves: https://github.com/ksh93/ksh/issues/467
Thanks to @atheik for the report.
2022-02-19 21:50:20 +01:00