The following set of commands caused ksh to crash:
$ unalias history; unalias r
Memory fault
When ksh is compiled with -D_std_malloc, the crash always
occurs when the 'r' alias is removed with 'unalias r',
although with vmalloc 'unalias history' must be run first
for the crash to occur. With the native malloc, the crash
message is also different:
$ unalias history; unalias r
free(): invalid pointer
Abort
This crash happens because when an alias is unset, _nv_unset
removes the NV_NOFREE flag which results in an invalid use
of free(3) as nv_isattr no longer detects NV_NOFREE afterward.
The history and r aliases shouldn't be freed from memory by
nv_delete because those aliases are given the NV_NOFREE attribute.
src/cmd/ksh93/bltins/typeset.c:
- Save the state of NV_NOFREE for aliases to fix the crash
caused by 'unalias r'.
src/cmd/ksh93/tests/alias.sh:
- Use unalias on both history and r to check for the crash.
'unalias -a' can't be used to replicate the crash.
Co-authored-by: Martijn Dekker <martijn@inlv.org>
This patch from Red Hat fixes the following:
1. ksh was ignoring the -m (-o monitor) option when specified on
the invocation command line.
2. Scripts did not properly terminate their background processes
on Ctrl+C if the -m option was turned off. Reproducer:
xterm &
read junk
When run as a script without turning on -m, pressing Ctrl+C
should terminate the xterm, and now does.
3. Scripts no longer attempt to set the terminal foreground process
group ID, as only interactive shells should be doing that.
This makes some progress on https://github.com/ksh93/ksh/issues/119
but we're a long way from fixing all of that.
src/cmd/ksh93/sh/main.c: exfile():
- On non-interactive shells, do not turn off the monitor option.
Instead, if it was turned on, turn on the SH_MONITOR state flag.
src/cmd/ksh93/edit/edit.c: ed_getchar():
- On Ctrl+C, issue SIGINT to the current process group using
killpg(2) instead of going via sh_fault(), which handles a
signal only for the current shell process.
src/cmd/ksh93/sh/jobs.c: job_reap(), job_reset(),
src/cmd/ksh93/sh/xec.c: sh_exec():
- Only attempt to set the terminal foreground process group ID
using tcsetpgrp(3) if the shell is interactive.
Original patch: https://src.fedoraproject.org/rpms/ksh/blob/642af4d6/f/ksh-20120801-kshmfix.patch
This was applied to Red Hat's ksh 93u+ on 8 July 2013.
A memory leak occurred when typeset was used in a function called
from within a command substitution. This fix was backported from
the 93v- beta by Red Hat on 22 Jan 2014. Source:
https://src.fedoraproject.org/rpms/ksh/blob/642af4d6/f/ksh-20120801-memlik3.patch
src/cmd/ksh93/include/name.h,
src/cmd/ksh93/sh/subshell.c:
- Replace the nv_subsaved() function by the version from ksh 93v-.
This version frees a table from memory if the NV_TABLE flag is
passed in the new second parameter, a bitmask for flags (which
was oddly named 'table'; I've renamed it to 'flags').
src/cmd/ksh93/sh/name.c:
- nv_delete(): When calling nv_subsaved(), pass on the NV_TABLE
flag if given.
- table_unset(): Call nv_delete() with the NV_TABLE flag.
src/cmd/ksh93/tests/leaks.sh:
- Add test based on the reproducer provided in Red Hat bug 1036470.
I now have access to some of the private bugs on the Red Hat bug
tracker. This one doesn't have a lot of information on the patch,
but it contains a good reproducer, so we can at least verify that
it works.
src/cmd/ksh93/sh/array.c,
src/cmd/ksh93/sh/name.c:
- Apply the patch associated with Red Hat bug #921455. Source:
https://src.fedoraproject.org/rpms/ksh/blob/642af4d6/f/ksh-20120801-memlik.patch
This was applied to Red Hat's ksh on 04 Jul 2013.
src/cmd/ksh93/tests/leaks.sh:
- Add leak tests for associative and indexed arrays in functions
based on the reproducer from rhbz#921455.
- Both tests still leak (though much less) when run in a locale
other than C. For now, temporarily set the locale to C and add
a TODO note. Perhaps another Red Hat patch is yet to fix this.
_sfcvt(), "convert a floating point value to ASCII", did not adjust
for negative decimal place movement as what happens with leading
zeroes. This caused ksh's 'printf %f' formatter to fail to round
floating point values correctly.
src/lib/libast/sfio/sfcvt.c:
- Removed constraint of <1e-8 for doubles by matching what was done
for long doubles having <.1.
- Corrected a condition when the next power of 10 occurred and that
new 1 digit was being overwritten by a 0.
src/cmd/ksh93/tests/math.sh:
- Validate that typeset -E/F formatting matches that of their
equivalent printf formatting options as well as checking for
correct float scaling of the fractional parts.
The fix was incomplete: expansions using '?' (${var?w(ord},
${var:?wo)rd}) still did not tolerate parentheses in the word
as regular characters.
It was also possible to simplify the fix by making use of the
ST_BRACE (sh_lexstate7[]) state table. See data/lexstates.c and
include/lexstates.h.
src/cmd/ksh93/sh/lex.c: sh_lex(): case S_MOD1:
- The previous fix tested for modifier operator characters : - + =
as part of the S_MOD2 case, though they are defined as S_MOD1 in
the ST_BRACE state table. It only worked because of the
fallthrough. And it turns out the S_MOD1 case already had a
similar fix, though incomplete. The new fix effectively cancelled
the old one out as any S_MOD1 character eventually led to
'continue'. So it can be simplified by removing most of that
code, without causing any change in behaviour. Only the mode
change to the ST_QUOTE state table followed by 'continue' is
necessary. This also fixes it for the '?' operator as that is
also defined as S_MOD1 in the ST_BRACE state table.
src/cmd/ksh93/sh/macro.c:
- When skipping a ${...} expansion using sh_lexskip(), use the
ST_QUOTE state table if the character c is an S_MOD1 modifier
operator character. This makes it consistent with the S_MOD1
handling in sh_lex().
src/cmd/ksh93/tests/variables.sh:
- Update regression tests to include ? and :? operators.
Following a community discussion, it became clear that 'r' is
particularly problematic as a regular builtin, as the name can and
does conflict with at least one legit external command by that
name. There was a consensus against removing it altogether and
letting users set the alias in their login scripts. However,
aliases are easier to bypass, remove or rename than builtins are.
My compromise is to reinstate 'r' as a preset alias on interactive
shells only, along with 'history', as was done in 17f81ebe before
they were converted to builtins in 03224ae3. So this reintroduces
the notion of predefined aliases to ksh 93u+m, but only for
interactive shells that are not initialised in POSIX mode.
src/cmd/ksh93/Makefile,
src/cmd/ksh93/Mamfile,
src/cmd/ksh93/include/shtable.h,
src/cmd/ksh93/data/aliases.c:
- Restore aliases.c containing shtab_aliases[], a table specifying
the preset aliases.
src/cmd/ksh93/include/shtable.h,
src/cmd/ksh93/sh/init.c:
- Rename inittree() to sh_inittree() and make it extern, because we
need to use it in main.c (sh_main()).
src/cmd/ksh93/sh/main.c: sh_main():
- Init preset aliases from shtab_aliases[] only if the shell is
interactive and not in POSIX mode.
src/cmd/ksh93/bltins/typeset.c,
src/cmd/ksh93/tests/alias.sh:
- unall(): When unsetting an alias, pass on the NV_NOFREE attribute
to nv_delete() to avoid an erroneous attempt to free a preset
alias from read-only memory. See: 5d50f825
src/cmd/ksh93/data/builtins.c:
- Remove "history" and "r" entries from shtab_builtins[].
- Revert changes to inline fc/hist docs in sh_opthist[].
src/cmd/ksh93/bltins/hist.c: b_hist():
- Remove handling for 'history' and 'r' as builtins.
src/cmd/ksh93/sh.1:
- Update accordingly.
Resolves: https://github.com/ksh93/ksh/issues/125
The 'command' name can now result from an expansion, e.g.:
c=command; "$c" ls
set -- command ls; "$@"
both work now. This fixes BUG_CMDEXPAN.
If -o posix is on, 'command' now disables not only the "special"
but also the "declaration" properties of builtin commands that it
invokes. This is because POSIX specifies 'command' as a simple
regular builtin, and any command name following 'command' is just
an argument to the 'command' command, so there is nothing that
allows any further arguments (such as assignment-arguments) to be
treated specially by the parser. So, if and only if -o posix is on:
a. Arguments that start with a variable name followed by '=' are
always treated as regular words subject to normal shell syntax.
b. Since assignment-arguments are not processed as assignments
before the command itself, 'command' can now stop the shell from
exiting (as required by the standard) if a command that it
invokes (such as 'export') tries to modify a readonly variable.
This fixes BUG_CMDSPEXIT.
Most of 'command' is integrated in the parser and parse tree
executer, so that is where it needed fixing.
src/cmd/ksh93/sh/parse.c: simple():
- If the posix option is on, do not skip past SYSCOMMAND so that
any declaration builtin commands that are arguments to 'command'
are not detected and thus not treated specially at parsetime.
src/cmd/ksh93/sh/xec.c: sh_exec():
- When detecting SYSCOMMAND in order to skip past it, not only
compare the Namval_t pointer 'np' to SYSCOMMAND, but also handle
the case where that pointer is NULL, as when the command name
results from an expansion. In that case, search the function tree
shp->fun_tree for the name and see if that yields the SYSCOMMAND
pointer. fun_tree is initialised with a dtview to bltin_tree, so
searching fun_tree instead allows for overriding 'command' with a
shell function (which the POSIX standard requires us to allow).
src/cmd/ksh93/sh.1,
src/cmd/ksh93/data/builtins.c:
- Update documentation to match these changes.
- Various related edits and improvements.
src/cmd/ksh93/tests/builtins.sh:
- Check that 'command' works if resulting from an expansion.
- Check that 'command' can be overridden by a shell function.
The 'exit' and 'return' commands without an argument failed to pass
down the exit status of the last-run command when incorporated in a
block with redirection, &&/|| list, 'case' statement, or 'while',
'until' or 'for' loop.
src/cmd/ksh93/bltins/cflow.c:
- Use $?, which is sh.savexit a.k.a. shp->savexit, as the default
exit status value if there is no argument, instead of
shp->oldexit. This fixes the default exit status behaviour to
match POSIX and other shells.
src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/include/shell.h:
- Remove now-unused sh.oldexit (a.k.a. shp->oldexit) private struct
member. It appeared to fulfill the same function as sh.savexit,
but in a slightly broken way.
- Move the savexit/$? declaration from the _SH_PRIVATE part of the
struct definition to the public API part. Since $? uses this,
it's clearly a publicly exposed value already, and this is
generally the one to use. (If anything, it's exitval that should
have been private.) This declares savexit right next to exitval,
rewriting the comments to clarify the difference between them.
src/cmd/ksh93/sh/fault.c,
src/cmd/ksh93/sh/subshell.c,
src/cmd/ksh93/sh/xec.c:
- Remove assignments to shp->oldexit.
src/cmd/ksh93/tests/basic.sh:
- Add thorough regression tests for the default exit status
behaviour of 'return' and 'exit' in various lexical contexts.
- Verify that 'for' and 'case' without any command, as well as a
lone redirection, still correctly reset the exit status to 0.
Fixes: #117
A coprocess cleanup test could fail on rare occasions because I had
lowered the 'sleep 1' between two test coprocesses to 'sleep .1'.
This increases the sleep to prevent future spurious fails.
Fixes: https://github.com/ksh93/ksh/issues/129
Using a process of elimination I've identified ${.sh.level}
(SH_LEVELNOD) as the cause of the crash. This node apparently
cannot be copied or moved without destabilising the shell. It
contains the current depth of function calls and it cannot be
changed by assignment, so this is not actually a problem.
Meanwhile, this commit re-fixes it for the other three.
src/cmd/ksh93/sh/subshell.c:
- Simplify sh_assignok() by removing special-casing for L_ARGNOD,
SH_SUBSCRNOD and SH_NAMENOD. 'add' now has 3 modes (0, 1, 2).
- The test for a ${ subshare; } was actually wrong. sp->subshare is
a saved backup value. We must test shp->subshare. (re: a9de50bf)
src/cmd/ksh93/bltins/typeset.c:
- setall(): Update the mode 3 sh_assignok() call.
src/cmd/ksh93/tests/variables.sh:
- Regress-test subshell leaks for all special variables except
${.sh.level}.
This reverts commit b3d37b00b0.
While ksh's own regression test suite passed just fine, when
running the modernish[*] regression tests uite, ksh either froze
hard (needing SIGKILL) or threw a spurious syntax error.
Cause unknown, but I'm certainly reverting until I find out.
This reintroduces a subshell leak for four special variables.
[*] https://github.com/modernish/modernish
${var:-wor)d} or ${var+w(ord}. The parentheses now correctly lose
their normal grammatical meaning within the braces. Fix by Eric
Scrivner (@etscrivner) from July 2018 backported from ksh2020.
This fix complies with POSIX:
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_02
src/cmd/ksh93/sh/lex.c: sh_lex():
- Set the ST_QUOTE state when analysing a modifier with parameter
expansions using operators ':', '-', '+', '='. This state causes
subsequent characters (including parentheses) to be considered
quoted, suppressing their normal grammatical meaning.
src/cmd/ksh93/sh/macro.c: varsub():
- Same for skipping the expansion.
Fixes: https://github.com/ksh93/ksh/issues/126
Prior discussion: https://github.com/att/ast/issues/475
The following special variables leaked out of a subshell:
$_, ${.sh.name}, ${.sh.level}, ${.sh.subscript}.
This was due to a faulty optimisation in sh_assignok().
bd3e2a80 fixed that in part, this fixes the rest.
src/cmd/ksh93/sh/subshell.c:
- Simplify sh_assignok() by removing special-casing for these four
special variables. The 'add' param reverts to a simple boolean.
- The test for a ${ subshare; } was actually wrong. sp->subshare is
a saved backup value. We must test shp->subshare. (re: a9de50bf)
src/cmd/ksh93/bltins/typeset.c:
- setall(), unall(): Update sh_assignok() calls.
src/cmd/ksh93/tests/variables.sh:
- Regress-test subshell leaks for all special variables.
Closes: #122
When exporting variables, ksh exports their attributes (such as
'integer' or 'readonly') in a magic environment variable called
"A__z" (string defined in e_envmarker[] in data/msg.c). Child
shells recognise that variable and restore the attributes.
This little-known feature is risky; the environment cannot
necessarily be trusted and that A__z variable is easy to manipulate
before or between ksh invocations, so you can cause a script's
variables to be of the wrong type, or readonly. Backwards
compatibility requires keeping it, at least for now. But it should
be disabled in the posix mode, as it violates POSIX.
To do this, we have to solve a catch-22 in init.c. We must parse
options to know whether to turn on posix mode; it may be specified
as '-o posix' on the command line. The option parsing loop depends
on an initialised environment[*], while environment initialisation
(i.e., importing attributes) should depend on the posix option.
The catch-22 can be solved because initialising just the values
before option parsing is enough to avoid regressions. Importing the
attributes can be delayed until after option parsing. That involves
basically splitting env_init() into two parts while keeping a local
static state variable between them.
src/cmd/ksh93/sh/init.c:
- env_init():
* Split the function in two stages based on a new
'import_attributes' parameter. Import values in the first
stage; import attributes from A__z in the second (if ever).
Make the 'next' variable static as it keeps a state needed for
the attributes import stage.
* Single point of truth, greppability: don't hardcode "A__z" in
separate character comparisons, but use e_envmarker[].
* Fix an indentation error.
- sh_init(): When initialising the environment (env_init), don't
import the attributes from A__z yet; parse options first, then
import attributes only if posix option is not set.
src/cmd/ksh93/sh/name.c:
- sh_envgen(): Don't export variable attributes to A__z if the
posix option is set.
src/cmd/ksh93/tests/attributes.sh:
- Check that variable attributes aren't imported or exported
if the POSIX option is set.
src/cmd/ksh93/sh.1:
- Update.
This was the last item on the TODO list for -o posix for now.
Closes: #20
[*] If environment initialisation is delayed until after option
parsing, bin/shtests shows various regressions, including:
restricted mode breaks; the locale is not initialised properly
so that multibyte variable names break; $SHLVL breaks.
In the SHOPT_BASH code, the -o posix option was given a '\374'
(0xFC, 252) single-letter option character. Reasons unclear. The
'set' builtin doesn't accept it. It can be omitted and the option
still works. And it caused the "$-" expansion (listing active
short-form options) to include that invalid high-bit character if
the -o posix option is active, which is clearly wrong.
src/cmd/ksh93/sh/args.c: optksh[], flagval[]:
- Remove '\374' one-letter option equivalent for SH_POSIX.
src/cmd/ksh93/tests/options.sh:
- Add test verifying that '-o posix' does not affect "$-".
The inclusion of the special parameter expansions ${!} and ${$}
(including the braces) in a here-document caused a syntax error.
Bug reported by @Saikiran-m on Github.
src/cmd/ksh93/data/lexstates.c: sh_lexstate7[]:
- Change the state for ! (33) and $ (36) from S_ERR to 0. State
table 7 is for skipping over ${...}, so this avoids the S_ERR
state being invoked in sh_lex() (lex.c) for these characters
while skipping ${...} in a here-doc.
src/cmd/ksh93/tests/heredoc.sh:
- Test evaluating the braces expansion form for all special
parameters (@ * # ! $ - ? 0) in a here-document.
Fixes: https://github.com/ksh93/ksh/issues/127
An oops in tests/io.sh (re: c607c48c) wrote temporary files outside
$tmp and into src/cmd/ksh93/tests. Let's fix this properly so it
doesn't happen again.
src/cmd/ksh93/tests/shtests:
- Start each test set in its own temporary directory by default.
src/cmd/ksh93/tests/*.sh:
- Refuse to run if $tmp != $PWD.
- Related cleanups.
On ksh93, 'test -t' is equivalent to 'test -t 1' (and of course
"[ -t ]" is equivalent to "[ -t 1 ]").
This is purely for compatibility with ancient Bourne shell
breakage. No other shell supports this. ksh93 should probably keep
it for backwards compatibility, but it should definitely be
disabled in POSIX mode as it is a violation of the standard; 'test
-t' is an instance of 'test "$string"', which tests if the string
is empty, so it should test if the string '-t' is empty (quod non).
This also replaces the fix for 'test -t 1' in a command
substitution with a better one that avoids forking (re: cafe33f0).
src/cmd/ksh93/sh/parse.c:
- qscan(): If the posix option is active, disable the parser-based
hack that converts a simple "[ -t ]" to "[ -t 1 ]".
src/cmd/ksh93/bltins/test.c:
- e3(): If the posix option is active, disable the part of the
compatibility hack that was used for compound expressions
that end in '-t', e.g. "[ -t 2 -o -t ]".
- test_unop(): Remove the forking fix for "[ -t 1 ]".
src/cmd/ksh93/edit/edit.c:
- tty_check(): This function is used by "[ -t 1 ]" and in other
contexts as well, so a fix here is more comprehensive. Forking
here would cause a segfault, but we don't actually need to. This
adds a fix that simply returns false if we're in a virtual
subshell that is also a command substitution. Since command
substitutions always fork upon redirecting standard output within
them (making them no longer virtual), it is safe to do this.
src/cmd/ksh93/tests/bracket.sh
- Add comprehensive regression tests for test/[/[[ -t variants in
command substitutions, in simple and compound expressions, with
and without redirecting stdout to /dev/tty within the comsub.
- Add tests verifying that -o posix disables the old hack.
- Tweak other tests, including one that globally disabled xtrace.
eeee77ed implemented a POSIX compliance fix that caused a potential
incompatibility with existing ksh scripts; it made the (rarely
used) read/write redirection operator, <>, default to file
descriptor 0 (standard input) as POSIX specified, instead of 1
(standard output) which is traditional ksh93 behaviour. So ksh
scripts needed to change all <> to 1<> to override the new default.
This commit reverts that change, except in the new posix mode.
src/cmd/ksh93/sh/lex.c:
- Make FD for <> default to 0 in POSIX mode, 1 otherwise.
src/cmd/ksh93/tests/io.sh:
- Revert <> regression test changes from 60516872; we no longer
need 1<> instead of <> in ksh code.
If there are file descriptors > 2 opened with 'exec' or 'redirect',
ksh93 has always closed them when invoking another pogram. This is
contrary to POSIX which states:
Utilities other than the special built-ins […] shall be invoked
in a separate environment that consists of the following. The
initial value of these objects shall be the same as that for
the parent shell, except as noted below.
* Open files inherited on invocation of the shell, open files
controlled by the exec special built-in plus any
modifications, and additions specified by any redirections to
the utility
* […]
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_12
src/cmd/ksh93/sh/io.c: sh_redirect():
- When flag==2, do not close FDs > 2 if POSIX mode is active.
src/cmd/ksh93/tests/io.sh:
- Regress-test inheriting FD 7 with and without POSIX mode.
src/cmd/ksh93/sh.1:
- Update.
Though the "let" builtin is not itself a POSIX standard command, it
processes standard shell arithmetic, so it should recognise octals
by leading zeros as POSIX requires if the 'posix' option is on.
This overrides the setting of the 'letoctal' option.
Note that none of this applies to the ((...)) arithmetic command,
which has always recognised leading-octal zeros and does not listen
to 'letoctal'. So setting the posix mode makes this consistent.
src/cmd/ksh93/sh/arith.c:
- When running the 'let' builtin, test that both SH_LETOCTAL and
SH_POSIX are off before stripping leading zeros to disable octal
number recognition.
- Cosmetic: fix spurious newline.
src/cmd/ksh93/sh.1:
- Document the change.
src/cmd/ksh93/tests/shtests:
- Make sure to disable posix mode by default for regression tests.
On 16 June there was a call for volunteers to fix the bash
compatibility mode; it has never successfully compiled in 93u+.
Since no one showed up, it is now removed due to lack of interest.
A couple of things are kept, which are now globally enabled:
1. The &>file redirection shorthand (for >file 2>&1). As a matter
of fact, ksh93 already supported this natively, but only while
running rc/profile/login scripts, and it issued a warning. This
makse it globally available and removes the warning, bringing
ksh93 in line with mksh, bash and zsh.
2. The '-o posix' standard compliance option. It is now enabled on
startup if ksh is invoked as 'sh' or if the POSIXLY_CORRECT
variable exists in the environment. To begin with, it disables
the aforementioned &> redirection shorthand. Further compliance
tweaks will be added in subsequent commits. The differences will
be fairly minimal as ksh93 is mostly compliant already.
In all changed files, code was removed that was compiled (more
precisely, failed to compile/link) if the SHOPT_BASH preprocessor
identifier was defined. Below are other changes worth mentioning:
src/cmd/ksh93/sh/bash.c,
src/cmd/ksh93/data/bash_pre_rc.sh:
- Removed.
src/cmd/ksh93/data/lexstates.c,
src/cmd/ksh93/include/shlex.h,
src/cmd/ksh93/sh/lex.c:
- Globally enable &> redirection operator if SH_POSIX not active.
- Remove warning that was issued when &> was used in rc scripts.
src/cmd/ksh93/data/options.c,
src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/args.c:
- Keep SH_POSIX option (-o posix).
- Replace SH_TYPE_BASH shell type by SH_TYPE_POSIX.
src/cmd/ksh93/sh/init.c:
- sh_type(): Return SH_TYPE_POSIX shell type if ksh was invoked
as sh (or rsh, restricted sh).
- sh_init(): Enable posix option if the SH_TYPE_POSIX shell type
was detected, or if the CONFORMANCE ast config variable was set
to "standard" (which libast sets on init if POSIXLY_CORRECT
exists in the environment).
src/cmd/ksh93/tests/options.sh,
src/cmd/ksh93/tests/io.sh:
- Replace regression tests for &> and move to io.sh. Since &> is
now for general use, no longer test in an rc script, and don't
check that a warning is issued.
Closes: #9
Progresses: #20
The first of the two multibyte fixes from 8b5f11dc (which was for
using the first character of IFS as an output field separator when
expanding "$*" and similar) had a minor backwards compatibility
problem: if $IFS started with a byte sequence that is not a valid
UTF-8 character, then it treated IFS as empty in UTF-8 locales, so
the fields would be joined without any separator. The expected
behaviour would be for it to fall back to using the first byte of
IFS as it used to (and as bash and zsh do).
The new code handling this was also a bit kludgy and inefficient,
repeating the mbsize() calculation for every byte of the separator
character and for every field joined by the expansion.
src/cmd/ksh93/sh/macro.c: varsub():
- Rewrite code for joining fields for $* in a quoted or scalar
context and $@ in a scalar context, eliminating a confusing 'd'
variable and concentrating the routine in one block.
- When expanding $* with a multibyte separator (first character
of $IFS), only calculate the size in bytes once per expansion.
- If $IFS starts with a byte sequence that represents an invalid
multibyte character, fall back to using the first byte.
src/cmd/ksh93/tests/variables.sh:
- Tweak some regression tests, including one that overwrote $LANG.
- Add test for invalid multibyte character behaviour as per above.
This fixes a hanging bug that could occur on macOS when using the
'read' command to read from a FIFO and encountering end-of-file
without a final newline character. It also makes the 'read' command
perform 15-25% faster on macOS and Linux.
The previous version (ff385e5a) failed on SunOS/Solaris/Illumos
because those systems apparently don't (fully) support the POSIX
standard recv(2) syscall with MSG_PEEK[*], which is the feature
that iffe detects under the 'socket_peek' identifier. On Illumos,
using that methods causes a compilation failure (unknown identifier
MSG_PEEK); on Solaris 11.4, that method causes multiple regressions
in tests/io.sh, suggesting the method compiles but doesn't work at
all. Instead, SunOS/Solaris/Illumos requires the method using
ioctl(2)+I_PEEK and select(2). No other system that ksh currently
builds on requires this method, so it is now only used on
SunOS/Solaris/Illumos.
So far, this version of sfpkrd() has been tested to work correctly
on Linux, macOS, FreeBSD, NetBSD, OpenBSD, HP-UX, Solaris, and
OmniOS (an Illumos distribution).
It still fails to peek on Cygwin, but in the exact same way it
failed before, so that's no loss.
To test, run the 'io' test set: bin/shtests -p io
src/lib/libast/sfio/sfpkrd.c: sfpkrd():
- Remove long-obsolete Mac OS X and Solaris bug workarounds.
- Remove methods that are no longer needed.
On systems with a POSIX compliant recv(2), the only thing that
is required to avoid regressions is the code that was conditional
upon the socket_peek feature test, which tests for the correct
functioning of the recv(2) syscall. This has now been made
mandatory for non-SunOS/Solaris/Illumos systems (using an #error
directive if it is not detected), with the other methods removed.
The result performs 15-25% faster on macOS and Linux while
passing all the regression tests.
On macOS, avoiding the select(2) method fixes the hanging bug.
On SunOS/Solaris/Illumos (the '__sun' identifier), the method
using ioctl(2)+I_PEEK and select(2) (iffe feature IDs:
stream_peek and lib_select) is preserved.
Resolves: https://github.com/ksh93/ksh/issues/118 (again)
[*] https://pubs.opengroup.org/onlinepubs/9699919799/functions/recv.html
This fixes a hanging bug that could occur on macOS when using the
'read' command to read from a FIFO and encountering end-of-file
without a final newline character. It also makes the 'read' command
perform 15-25% faster on macOS and Linux and maybe other systems.
src/lib/libast/sfio/sfpkrd.c: sfpkrd():
- Get rid of the optional stuff that uses the poll(2) or select(2)
syscalls. The only thing that is required to avoid regressions is
the code that was conditional upon the socket_peek feature test,
which tests for the correct functioning of the recv(2) syscall.
This has now been made mandatory. The rest now uses what was
previously a fallback in plain C, resulting in a function that is
not only more readable, but actually faster than the syscalls.
Resolves: https://github.com/ksh93/ksh/issues/118
If a command's path was previously added to the hash table as a
'tracked alias', then the hash table entry was used, bypassing
the default utility path search activated by 'command -p'.
'command -p' activates a SH_DEFPATH shell state. The bug was caused
by a failure to check for this state before using the hash table.
This check needs to be added in four places.
src/cmd/ksh93/sh/path.c,
src/cmd/ksh93/sh/xec.c:
- path_search(), path_spawn(), sh_exec(), sh_ntfork(): Only consult
the hash table, which is shp->track_tree, if the SH_DEFPATH shell
state is not active.
src/cmd/ksh93/tests/path.sh:
- Add regress tests checking that 'command -p' and 'command -p -v'
still search in the default path if a hash table entry exists for
the command searched.
A memory leak occurred upon leaving a virtual subshell if a
function was defined within it. If this was done more than 32766
(= 2^15-2 = the 'short' max value - 1) times, the shell crashed.
Discussion and reproducer: https://github.com/ksh93/ksh/issues/114
src/cmd/ksh93/sh/subshell.c: table_unset():
- A subshell-defined function was never freed because a broken
check for autoloaded functions (which must not be freed[*]). It
looked for an initial '/' in the canonical path of the script
file that defined the function, but that path is also stored for
regular functions. Now use a check that executes nv_search() in
fpathdict, the same method used in _nv_unset() in name.c for a
regular function unset.
src/cmd/ksh93/bltins/misc.c: b_dot_cmd():
- Fix an additional memory leak introduced in bd88cc7f, that caused
POSIX functions (which are run with b_dot_cmd() like dot scripts)
to leak extra. This fix avoids both the crash fixed there and the
memory leak by introducing a 'tofree' variable remembering the
filename to free. Thanks to Johnothan King for the patch.
src/lib/libast/include/stk.h,
src/lib/libast/misc/stk.c,
src/lib/libast/man/stk.3,
src/lib/libast/man/stak.3:
- Make the stack more resilient by extending the stack reference
counter 'stkref' from (signed) short to unsigned int. On modern
systems with 32-bit ints, this extends the maximum number of
elements on a stack from 2^15-1==32767 to 2^32-1==4294967295.
The ref counter can never be negative, so there is no reason for
signedness. sizeof(int) is defined as the size of a single CPU
word, so this should not affect performance at all.
On a 16-bit system (not that ksh still compiles there), this
doubles the max number of entries to 2^16-1=65535.
src/cmd/ksh93/tests/leaks.sh:
- Add leak regression tests for ksh functions, POSIX functions, dot
scripts run with '.', and dot scripts run with 'source'.
src/cmd/ksh93/tests/path.sh:
- Add an output builtin with a redirect to an autoloaded function
so that a crash[*] is triggered if the check for an autoloaded
function is ever removed from table_unset(), as was done in ksh
93v- (which crashed).
[*] Freeing autoloaded functions after leaving a virtual subshell
causes a crashing bug: https://github.com/att/ast/issues/803
Co-authored-by: Johnothan King <johnothanking@protonmail.com>
Fixes: https://github.com/ksh93/ksh/issues/114
The fix was incomplete because some tests have to unset HISTFILE,
which reverted them to using ~/.sh_history by default.
src/cmd/ksh93/tests/shtests:
- Instead of setting HISTFILE, set HOME to the temporary directory
$tmp, so nothing will write to the real user directory and the
default history file is $tmp/.sh_history.
src/cmd/ksh93/tests/attributes.sh:
- Restore HISTFILE after a test that requires setting HISTFILE=foo.
Co-authored-by: Martijn Dekker <martijn@inlv.org>
An intermittent crash occurred after running many thousands of
virtual/non-forked subshells. One reproducer is a crash in the
shbench fibonacci.ksh test, as documented here:
https://github.com/ksh-community/shbench/blob/f3d9e134/bench/fibonacci.ksh#L4-L10
The apparent cause was the signed and insufficiently large 'short'
data type of 'curenv' and related variables which wrapped around to
a negative number when overflowing. These IDs are necessary for the
'wait' builtin to obtain the exit status from a background job.
This fix is inspired by a patch based on ksh 93v-:
https://build.opensuse.org/package/view_file/shells/ksh/ksh93-longenv.dif?expand=1https://src.fedoraproject.org/rpms/ksh/blob/f24/f/ksh-20130628-longer.patch
However, we change the type to 'unsigned int' instead of 'long'. On
all remotely modern systems, ints are 32-bit values, and using this
type avoids a performance degradation on 32-bit sytems. Making them
unsigned prevents an overflow to negative values.
src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/include/jobs.h,
src/cmd/ksh93/include/nval.h,
src/cmd/ksh93/include/shell.h:
- Change the types of the static global 'subenv' and the subshell
structure members 'curenv', 'jobenv', 'subenv', 'p_env' and
'subshell' to one consistent type, unsigned int.
src/cmd/ksh93/sh/jobs.c,
src/cmd/ksh93/sh/macro.c:
src/cmd/ksh93/sh/name.c:
src/cmd/ksh93/sh/nvtype.c,
src/cmd/ksh93/sh/subshell.c:
- Updates to match new variable types.
src/cmd/ksh93/tests/subshell.sh:
- Show wrong exit status in message on failure of 'wait' builtin.
Using the bin/shtests -l/--locale option to run the regression
tests in your own locale broke the tests if you're in a locale that
uses ',' as the radix point, like my nl_NL.UTF-8, unless
LC_NUMERIC=C was exported manually. Let's automate that fix.
src/cmd/ksh93/tests/shtests: --locale:
- If LC_ALL was set, copy it to LANG and unset all LC_* vars.
This allows overriding the radix point with LC_NUMERIC if needed.
- If '1.0' is not a valid shell arithmetic expression, export
LC_NUMERIC=C to fix it.
The entity is not valid in XML, only in HTML. Since we must
be compatible with both, it can't be used. Thanks to Andras Farkas
for the bug report.
In addition, the generation of numeric entities for unprintable
characters was only valid while processing UTF-8 text while in a
UTF-8 locale. In all other conditions it produced invalid results.
This is not worth trying to fix.
Discussion:
https://groups.google.com/d/msgid/korn-shell/CAA0nTRta%3DPbOYduyBv%3DXCzumTcUCU8Lki%3DQQf2O8Erk2BFvO1g%40mail.gmail.com
src/cmd/ksh93/bltins/print.c:
- Remove conversion to entity.
- Remove conversion of non-graph characters to numeric entities.
Convert only the 5 semantically meaningful characters: < > & " '
src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/string.c:
- We don't need sh_isprint() in print.c anymore, so turn it back
into a static function.
src/cmd/ksh93/tests/builtins.sh:
- Update and trim regression tests.
This applies a number of fixes to the printf formatting directives
%H and %#H (as well as their equivalents %(html)q and %(url)q):
1. Both formatters have been made multibyte/UTF-8 aware, and no
longer delete multibyte characters. Invalid UTF-8 byte sequences
are rendered as ASCII question marks.
2. %H no longer wrongly encodes spaces as non-breaking spaces
( ) and instead correctly encodes the UTF-8 non-breaking
space as such.
3. %H now converts the single quote (') to '%#39;' instead of
''' which is not a valid entity in all HTML versions.
4. %#H failed to encode some reserved characters (e.g. '?') while
encoding some unreserved ones (e.g. '~'). It now percent-encodes
all characters except those 'unreserved' as per RFC3986 (ASCII
alphanumeric plus -._~).
Prior discussion:
https://groups.google.com/d/msgid/korn-shell/ce8d1467-4a6d-883b-45ad-fc3c7b90e681%40inlv.org
src/cmd/ksh93/include/defs.h:
src/cmd/ksh93/sh/string.c:
- defs.h: If compiling without SHOPT_MULTIBYTE, redefine the
mbwide() macro (which tests if we're in a multibyte locale) as 0.
This lets the compiler optimiser do the work that would otherwise
require a lot of tedious '#if SHOPT_MULTIBYTE' directives.
- string.c: Remove some now-unneeded '#if SHOPT_MULTIBYTE' stuff.
- defs.h, string.c: Rename is_invisible() to sh_isprint(), invert
the boolean return value, and make it an extern for use in
fmthtml() -- see below. If compiling without SHOPT_MULTIBYTE,
simply #define sh_isprint() as equivalent to isprint(3).
- defs.h: Add URI_RFC3986_UNRESERVED macro for fmthtml() containing
the characters "unreserved" for purposes of URI percent-encoding.
src/cmd/ksh93/bltins/print.c: fmthtml():
- Remove kludge that skipped all multibyte characters (!).
- Complete rewrite to implement fixes described above.
- Don't bother with '#if SHOPT_MULTIBYTE' directives (see above).
src/cmd/ksh93/data/builtins.c:
- sh_optprintf[]: %H: Add single quote to encoded chars doc.
- Edit credits and bump version date.
src/cmd/ksh93/tests/builtins.sh:
- Update and tweak old regression tests.
- Add a number of new tests for UTF-8 HTML and URI encoding, which
are only run when running tests in a UTF-8 locale (shtests -u).
Several regression tests invoke an "interactive" shell using 'ksh
-i'. This records all the commands tested in the shell's history
file. By default, that is the user's history file, ~/.sh_history.
As ksh continuously synchronises history among instances, a ksh
user who ran the regression tests ended up with a number of
mysterious extra commands in their command history.
src/cmd/ksh93/tests/shtests:
- Before running any tests, set and export HISTFILE to a new
history file in the temporary files directory.
There are convincing arguments why including '.' and '..' in the
result of pathname expansion is actively harmful. See:
https://www.austingroupbugs.net/view.php?id=1228https://github.com/ksh93/ksh/issues/58#issuecomment-653716846
pdksh, mksh and zsh already skip these special traversal names
in all cases. This commit makes ksh act like these shells.
Since passing '.' and especially '..' as arguments to commands like
'chmod -R' and 'cp -r' may cause harm, this change seems likely to
fix more legacy scripts than it breaks. I'm unaware of anyone ever
having come up with a concrete use case for the old behaviour.
This change also fixes the bug that '.' and '..' failed to be
ignored as documented if FIGNORE is set.
src/lib/libast/misc/glob.c: glob_dir():
- Explicitly skip any matching '.' and '..' in all cases.
src/cmd/ksh93/tests/glob.sh:
- Add test_glob() tests for '*' and '.*'.
src/cmd/ksh93/sh.1: File Name Generation:
- Update to match new behaviour.
Resolves: https://github.com/ksh93/ksh/issues/58
The 'redirect' builtin command did not error out before executing
any valid redirections. For example, 'redirect ls >foo.txt' issued
an "incorrect syntax" error, but still created 'foo.txt' and left
standard output permanently redirected to it.
src/cmd/ksh93/sh/xec.c: sh_exec():
- If we have redirections (io != NULL), and the command is
SYSREDIR, then check for arguments and error out if there are
any, before calling sh_redirect() to execute redirections.
(Note, the other check for arguments in b_exec() in bltins/misc.c
must be kept, as that applies if there are no redirections.)
src/cmd/ksh93/sh/io.c: sh_redirect():
- Edit comments to better explain what the flag values do.
src/cmd/ksh93/bltins/misc.c:
- Add a dummy b_redirect() function declaration "for the dictionary
generator" as has historically been done for other builtins that
share one C function. I'm not sure what that dictionary generator
is supposed to be, but this also improves greppability.
src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/sh.1:
- Fix misleading "I/O redirection arguments" term. I/O redirections
are not arguments at all; no argument parser ever sees them.
src/cmd/ksh93/tests/io.sh:
- Test both conditions that should make 'redirect' produce an
"incorrect syntax" error.
- Test that any redirections are not executed if erroneous
non-redirection arguments exist.
src/cmd/ksh93/tests/builtins.sh:
- "... should show usage info on unrecognized options" test:
Because 'redirect' now refuses to process redirections on error,
the error message was not captured. The fix is to run the builtin
in a braces block and add the redirection to the block.
This variable is like Bash's $BASHPID, but in virtual subshells
it will retain its previous value as virtual subshells don't fork.
Both $BASHPID and ${.sh.pid} are different from $$ as the latter
is only set to the parent shell's process ID (i.e. it isn't set
to the process ID of the current subshell).
src/cmd/ksh93/include/defs.h:
- Add 'current_pid' for storing the current process ID at a valid
memory address.
- Change 'ppid' from 'int32_t' to 'pid_t', as the return value from
'getppid' is of the 'pid_t' data type.
src/cmd/ksh93/data/variables.c,
src/cmd/ksh93/include/variables.h,
src/cmd/ksh93/sh/init.c,
src/cmd/ksh93/sh/xec.c:
- Add the ${.sh.pid} variable as an alternative to $BASHPID.
The process ID is stored in a struct before ${.sh.pid} is set
as environment variables are pointers that must point to a
valid memory address. ${.sh.pid} is updated by the _sh_fork()
function, which is called when ksh forks a new process with
sh_fork() or sh_ntfork().
src/cmd/ksh93/tests/variables.sh:
- Add ${.sh.pid} to the list of special variables and add three
regression tests for ${.sh.pid}.
src/cmd/ksh93/tests/subshell.sh:
- Update the PATH forking regression test to use ${.sh.pid} and
remove the TODO note.
This commit fixes two bugs in the generation of $'...' shellquoted
strings:
1. A bug introduced in f9d28935. In UTF-8 locales, a byte that is
invalid in UTF-8, e.g. hex byte 86, would be shellquoted as
\u[86], which is not the same as the correct quoting, \x86.
2. A bug inherited from 93u+. Single bytes (e.g. hex 11) were
always quoted as \x11 and not \x[11], even if a subsequent
character was a hexadecimal digit. However, the parser reads
past two hexadecimal digits, so we got:
$ printf '%q\n' $'\x[11]1'
$'\x111'
$ printf $'\x111' | od -t x1
0000000 c4 91
0000002
After the bug fix, this works correctly:
$ printf '%q\n' $'\x[11]1'
$'\x[11]1'
$ printf $'\x[11]1' | od -t x1
0000000 11 31
0000002
src/cmd/ksh93/sh/string.c: sh_fmtq():
- Make the multibyte code for $'...' more readable, eliminating the
'isbyte' flag.
- When in a multibyte locale, make sure to shellquote both invalid
multibyte characters and unprintable ASCII characters as
hexadecimal bytes (\xNN). This reinstates 93u+ behaviour.
- When quoting bytes, use isxdigit(3) to determine if the next
character is a hex digit, and if so, protect the quoted byte with
square brackets.
src/cmd/ksh93/tests/quoting2.sh:
- Move the 'printf %q' shellquoting regression tests here from
builtins.sh; they test the shellquoting algorithm, not so much
the printf builtin itself.
- Add regression tests for these bugs.
A segfault happens when an array with an unset method
is turned into a multidimensional array. Reproducer:
function foo {
typeset -a a
a.unset() {
print unset
}
a[3][6][11][20]=7
}
foo
src/cmd/ksh93/sh/nvdisc:
- Fix the multidimensional array unset method crash by
checking if np->nvenv is an array, since multidimensional
arrays need to be handled as arrays. This bugfix was
backported from ksh93v- 2013-10-10-alpha.
src/cmd/ksh93/tests/arrays2.sh:
- Add the reproducer as a regression test for the crash
with multidimensional arrays.
Bug report on the old mailing list:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01195.html
The required longjmp used to terminate scripts was not being run
when over-shifting in a POSIX function with a redirection. This
caused scripts to continue after an error in the shift builtin,
which is incorrect since shift is a special builtin. The
interpreter is sent into an indeterminate state that causes
undefined behavior as well:
$ cat reproducer.ksh
some_func() {
shift 10
}
for i in a b c d e f; do
echo "read $i"
[ "$i" != "c" ] && continue
some_func 2>&1
echo "$i = c"
done
$ ksh ./reproducer.ksh
read a
read b
read c
/tmp/k[2]: shift: 10: bad number
c = c
read d
/tmp/k[2]: shift: 10: bad number
d = c
read e
/tmp/k[2]: shift: 10: bad number
e = c
read f
/tmp/k[2]: shift: 10: bad number
f = c
src/cmd/ksh93/sh/xec.c: sh_exec():
- Do the necessary longjmp needed to terminate the script after
over-shifting in a POSIX function when the function call has a
redirection.
src/cmd/ksh93/tests/functions.sh:
- Add the over-shifting regression test from ksh93v- 2013-10-10-alpha.
Bug report and fix on the old mailing list:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg00732.html