The POSIX standard requires real UNIX pipes as in pipe(2). But on
systems supporting it (all modern ones), ksh uses socketpair(2)
instead to make it possible for certain commands to peek ahead
without consuming input from the pipe, which is not possible with
real pipes. See features/poll and sh/io.c.
But this can create undesired side effects: applications connected
to a pipe may test if they are connected to a pipe, which will fail
if they are connected to a socket. Also, on Linux:
$ cat /etc/passwd | head -1 /dev/stdin
head: cannot open '/dev/stdin' for reading: No such device or address
...which happens because, unlike most systems, Linux cannot open(2)
or openat(2) a socket (a limitation that is allowed by POSIX).
Unfortunately at least two things depend on the peekahead
capability of the _pipe_socketpair feature. One is the non-blocking
behaviour of the -n option of the 'read' built-in:
-n Causes at most n bytes to be read instead of a full
line, but will return when reading from a slow device as
soon as any characters have been read.
The other thing that breaks is the <#pattern and <##pattern
redirection operators that basically grep standard input, which
inherently requires peekahead.
Standard UNIX pipes always block on read and it is not possible to
peek ahead, so these features inevitably break. Which means we
cannot simply use standard pipes without breaking compatibility.
But we can at least fix it in the POSIX mode so that cross-platform
scripts work more correctly.
src/cmd/ksh93/sh/io.c: sh_pipe():
- If _pipe_socketpair is detected at compile time, then use a real
pipe via sh_rpipe() if the POSIX mode is active. (If
_pipe_socketpair is not detected, a real pipe is always used.)
src/cmd/ksh93/data/builtins.c:
- sh.1 documents the slow-device behaviour of -n but 'read --man'
did not. Add that, making it conditional on _pipe_socketpair.
Resolves: https://github.com/ksh93/ksh/issues/327
Historically, ksh (including ksh88 and mksh) allow brace expansion
not just on literal patterns but also on patterns resulting from
unquoted variable expansions or command substitutions:
$ a='{a,b}' ksh -c 'echo X{a,b} Y$a'
Xa Xb Ya Yb
Most people expect only the first (literal) pattern to be expanded,
as in bash and zsh:
$ a='{a,b}' bash -c 'echo X{a,b} Y$a'
Xa Xb Y{a,b}
The historic ksh behaviour is poorly documented and nearly unknown,
violates the principle of least astonishment, and makes unquoted
variable expansions even more unsafe. See discussion at:
https://www.austingroupbugs.net/view.php?id=1193https://github.com/ksh93/ksh/issues/140
Unfortunately, we cannot change it in default ksh without breaking
backward compatibility. But we can at least fix it for the POSIX
mode (which disables brace expansion by default but allows turning
it back on), particularly as it looks like POSIX, if it decides to
specify brace expansion in a future version of the standard, will
disallow brace expansion on unquoted variable expansions.
src/cmd/ksh93/sh/macro.c: endfield():
- When deciding whether to do brace expansion + globbing or only
globbing, also check that we do not have POSIX mode and an
unquoted variable expansion (mp->pattern==1).
Reproducer script:
typeset -Ttyp1 typ1=(
function get {
.sh.value="'Sample'";
}
)
typ1 var11
typeset -p .sh.type
typeset -p .sh.type
Buggy output:
namespace sh.type
{
typeset -r typ1='Sample'
}
namespace sh.type
{
typeset -x -r typ1='Sample'
}
An -x (export) attribute is magically pulled out of a hat.
Analysis: The walk_tree() function in nvdisc.c repurposes (!) the
NV_EXPORT attribute as an instruction to turn *off* indenting when
pretty-printing the values of compound variables. The
print_namval() function in typeset.c, implementing 'typeset -p',
turns on NV_EXPORT for compound variables to inhibit indentation.
But it then does not bother to turn it off, which causes this bug.
src/cmd/ksh93/bltins/typeset.c: print_namval():
- When printing compound variables, only turn on NV_EXPORT
temporarily.
Resolves: https://github.com/ksh93/ksh/issues/456
On an interactive shell in emacs or vi, type a command with a $'…'
quoted string that contains a backslash-escaped single quote, like:
$ true $'foo\'bar' ▁
Then begin to type the name of a file present in the current
working directory and press tab. Nothing happens as completion
fails to work.
The completion code does not recognise $'…' strings. Instead, it
parses them as '…' strings in which there are no backslash escapes,
so it considers the last ' to open a second quoted string which is
not terminated.
Plus, when replacing a $'…' string with a (backslash-escaped)
completed string, the initial '$' is not replaced:
$ $'/etc/hosts<Tab>
$ $/etc/hosts
src/cmd/ksh93/edit/completion.c:
- find_begin():
- Learn how to recognise $'…' strings. A new local dollarquote
flag variable is used to distinguish them from regular '…'
strings. The difference is that backslash escapes (and only
those) should be recognised as in "…".
- Set a special type -1 for $'…' as the caller will need a way
to distinguish those from '…'.
- ed_expand(): When replacing a quoted string, remove an extra
initial character (being the $ in $') if the type set by
find_begin() is -1.
Resolves: https://github.com/ksh93/ksh/issues/462
Comsubs were either executed or caused a syntax error when attempting
to complete them within single quotes. Since single quotes do not
expand anything, no such completion should take place.
$ '`/de<TAB>-ksh: /dev/: cannot execute [Is a directory]
$ '$(/de<TAB>-ksh: syntax error: `end of file' unexpected
src/cmd/ksh93/edit/completion.c:
- find_begin():
- Remove recursive handling for '`' comsubs from 7a2d3564; it is
sufficient to set the return pointer to the current location cp
(the character following '`') if we're not in single quotes.
- For '$' and '`', if we are within single quotes, set type '\''
and set the return pointer bp to the location of the '$' or
'`'.
- ed_expand(): If find_begin() sets type '\'' and the current begin
character is $ or `, refuse to attempt completion; return -1 to
cause a terminal beep.
Related:
https://github.com/ksh93/ksh/issues/268https://github.com/ksh93/ksh/issues/462#issuecomment-1038482307
A side effect of the bug fixed in 2a835a2d caused the DEBUG trap
action to appear to be inherited by subshells, but in a buggy way
that could crash the shell. After the fix, the trap is reset in
subshells along with all the others, as it should be. Nonetheless,
as that bug was present for years, some have come to rely on it.
This commit implements that functionality properly. When the new
--functrace option is turned on, DEBUG trap actions are now
inherited by subshells as well as ksh function scopes. In addition,
since it makes logical sense, the new option also causes the
-x/--xtrace option's state to be inherited by ksh function scopes.
Note that changes made within the scope do not propagate upwards;
this is different from bash.
(I've opted against adding a -T short-form equivalent as on bash,
because -T was formerly a different option on 93u+ (see 63c55ad7)
and on mksh it has yet anohter a different meaning. To minimise
confusion, I think it's best to have the long-form name only.)
src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/data/options.c:
- Add new "functrace" (SH_FUNCTRACE) long-form shell option.
src/cmd/ksh93/sh/subshell.c: sh_subshell():
- When functrace is on, copy the parent's DEBUG trap action into
the virtual subshell scope after resetting the trap actions.
src/cmd/ksh93/sh/xec.c: sh_funscope():
- When functrace is on and xtrace is on in the parent scope, turn
it on in the child scope.
- Same DEBUG trap action handling as in sh_subshell().
Resolves: https://github.com/ksh93/ksh/issues/162
After the previous commit, one inconsistency was left. Assignments
preceding special built-ins and POSIX functions (which persist past
the command :-/) caused pre-existing attributes of the respective
variables to be cleared.
$ (f() { typeset -p n; }; typeset -i n; n=3+4 f)
n=3+4
(expected output: 'typeset -i n=7')
This problem was introduced shortly before the release of ksh 93u+,
in version 2012-05-04, by adding these lines of code to the code
for processing preceding assignments in sh_exec():
src/cmd/ksh93/sh/xec.c:
1055: if(np)
1056: flgs |= NV_UNJUST;
So, for special and declaration commands and POSIX functions, the
NV_UNJUST flag is passed to nv_open(). In older ksh versions, this
flag cleared justify attributes only, but in early 2012 it was
repurposed to clear *all* attributes -- without changing the name
or the relevant comment in name.h, which are now both misleading.
The reason for setting this flag in xec.c was to deal with some
bugs in 'typeset'. Removing those two lines causes regressions:
attributes.sh[316]: FAIL: typeset -L should not preserve old attributes
attributes.sh[322]: FAIL: typeset -R should not preserve old attributes
attributes.sh[483]: FAIL: typeset -F after typeset -L fails
attributes.sh[488]: FAIL: integer attribute not cleared for subsequent typeset
Those are all typeset regressions, which suggests this fix was
relevant to typeset only. This is corroborated by the relevant
AT&T ksh93/RELEASE entry:
12-04-27 A bug in which old attributes were not cleared when
assigning a value using typeset has been fixed.
So, to fix this 2012 regression without reintroducing the typeset
regressions, we need to set the NV_UNJUST flag for invocations of
the typeset family of commands only. This is changed in xec.c.
While we're at it, we might as well rename that little-used flag to
something that reflects its current purpose: NV_UNATTR.
Reproducer:
$ typeset -i NUM=123
$ (NUM=3+4 env; :)|grep ^NUM=
NUM=3+4
$ (NUM=3+4 env)|grep ^NUM=
NUM=7
The correct output is NUM=7. This is also the output if ksh is
compiled with SHOPT_SPAWN disabled, suggesting that the problem is
here in sh_ntfork(), where the invocation-local environment list is
set using a sh_scope() call:
src/cmd/ksh93/sh/xec.c:
3496: if(t->com.comset)
3497: {
3498: scope++;
3499: sh_scope(t->com.comset,0);
3500: }
Analysis:
When ksh is compiled with SHOPT_SPAWN disabled, external commands
are always run using the regular forking mechanism. First the shell
forks, then in the fork, the preceding assignments list (if any)
are executed and exported in the main scope. Replacing global
variables is not a problem as the variables are exported and the
forked shell is then replaced by the external command using
execve(2).
But when using SHOPT_SPAWN/sh_ntfork(), this cannot be done as the
fork(2) use is replaced by posix_spawn(2) which does not copy the
parent process into the child, therefore it's not possible to
execute anything in the child before calling execve(2). Which means
the preceding assignments list must be processed in the parent, not
the child. Which makes overwriting global variables a no-no.
To avoid overwriting global variables, sh_ntfork() treats preceding
assignments like local variables in functions, which means they do
not inherit any attributes from the parent scope. That is why the
integer attribute is not honoured in the buggy reproducers.
And this is not just an issue for external commands. sh_scope() is
also used for assignments preceding a built-in command. Which is
logical, as those don't create a process at all.
src/cmd/ksh93/sh/xec.c:
1325: if(argp)
1326: {
1327: scope++;
1328: sh_scope(argp,0);
1329: }
Which means this bug exists for them as well, regardless of whether
SHOPT_SPAWN is compiled in.
$ /bin/ksh -c 'typeset -i NUM; NUM=3+4 command eval '\''echo $NUM'\'
3+4
(expected: 7, as on mksh and zsh)
So, the attributes from the parent scope will need to be copied
into the child scope. This should be done in nv_setlist() which is
called from sh_scope() with both the NV_EXPORT and NV_NOSCOPE flags
passed. Those flag bits are how we can recognise the need to copy
attributes.
Commit f6bc5c0 fixed a similar inconsistency with the check for the
read-only attribute. In fact, the bug fixed there was simply a
specific instance of this bug. The extra check for readonly was
because the readonly attribute was not copied to the temporary
local scope. So that fix is now replaced by the more general fix
for this bug.
src/cmd/ksh93/sh/name.c: nv_setlist():
- Introduce a 'vartree' local variable to avoid repetitive
'sh.prefix_root ? sh.prefix_root : sh.var_tree' expressions.
- Use the NV_EXPORT|NV_NOSCOPE flag combination to check if parent
scope attributes need to be copied to the temporary local scope
of an assignment preceding a command.
- If so, copy everything but the value itself: attributes (nvflag),
size parameter (nvsize), discipline function pointer (nvfun) and
the name pointer (nvname). The latter is needed because some
code, at least put_lang() in init.c, compares names by comparing
the pointers instead of the strings; copying the nvname pointer
avoids a regression in tests/locale.sh.
src/cmd/ksh93/sh/xec.c: local_exports():
- Fix a separate bug exporting attributes to a new ksh function
scope, which was previously masked by the other bug. The
attributes (nvflag) were copied *after* nv_putval()ing the value,
which is incorrect as the behaviour of nv_putval() is influenced
by the attributes. But here, we're copying the value too, so we
can simplify the whole function by using nv_clone() instead. This
may also fix other corner cases. (re: c1994b87)
Resolves: https://github.com/ksh93/ksh/issues/465
Bug 1: as of 960a1a99, floating point literals were no longer
recognised when importing variables from the environment. The
attribute was still imported but the value reverted to zero:
$ (export LC_NUMERIC=C; typeset -xF5 num=7.75; \
ksh -c 'typeset -p num')
typeset -x -F 5 num=0.00000
Bug 2 (inherited from 93u+): The code that imported variable
attributes from the environment only checked '.' to determine
whether the float attribute should be set. It should check the
current radix point instead.
$ (export LC_NUMERIC=debug; typeset -xF5 num=7,75; \
ksh -c 'typeset -p num')
typeset -x -i num=0
...or, after fixing bug 1 only, the output is:
typeset -x -i num=75000
src/cmd/ksh93/sh/arith.c: sh_strnum():
- When importing untrusted env vars at init time, handle not only
"base#value" literals using strtonll, but also floating point
literals using strtold. This fixes the bug without reallowing
arbitary expressions. (re: 960a1a99)
- When not initialising, use sh.radixpoint (see f0386a87) instead
of '.' to help decide whether to evaluate an arith expression.
src/cmd/ksh93/sh/init.c: env_import_attributes():
- Use sh.radixpoint instead of '.' to check for a decimal fraction.
(This code is needed because doubles are exported as integers for
ksh88 compatibility; see attstore() in name.c.)
typeset -g allows directly manipulating the attributes of variables
at the global level from any context. This feature already exists
on bash 4.2 and later.
mksh (R50+), yash and zsh have this flag as well, but it's slightly
different: it ignores the current local scope, but a parent local
scope from a calling function may still be used -- whereas on bash,
'-g' always refers to the global scope. Since ksh93 uses static
scoping (see III.Q28 at <http://kornshell.com/doc/faq.html>), only
the bash behaviour makes sense here.
Note that the implementation needs to be done both in nv_setlist()
(name.c) and in b_typeset() (typeset.c) because assignments are
executed before the typeset built-in itself. Hence also the
pre-parsing of typeset options in sh_exec().
src/cmd/ksh93/include/nval.h:
- Add new NV_GLOBAL bit flag, using a previously unused bit that
still falls within the 32-bit integer range.
src/cmd/ksh93/sh/xec.c: sh_exec():
- When pre-parsing typeset flags, make -g pass the NV_GLOBAL flag
to the nv_setlist() call that processes shell assignments prior
to running the command.
src/cmd/ksh93/sh/name.c: nv_setlist():
- When the NV_GLOBAL bit flag is passed, save the current variable
tree pointer (sh.var_tree) as well as the current namespace
(sh.namespace) and temporarily set the former to the global
variable tree (sh.var_base) and the latter to NULL. This makes
assignments global and ignores namesapces.
src/cmd/ksh93/bltins/typeset.c:
- b_typeset():
- Use NV_GLOBAL bit flag for -g.
- Allow combining -n with -g, permitting 'typeset -gn var' or
'nameref -g var' to create a global nameref from a function.
- Do not allow a nonsensical use of -g when using nested typeset
to define member variables of 'typeset -T' types. (Type method
functions can still use it as normal.)
- setall():
- If NV_GLOBAL is passed, use sh.var_base and deactivate
sh.namespace as in nv_setlist(). This allows attributes
to be set correctly for global variables.
src/cmd/ksh93/tests/{functions,namespace}.sh:
- Add regression tests based on reproducers for problems found
by @hyenias in preliminary versions of this feature.
Resolves: https://github.com/ksh93/ksh/issues/479
Reproducer:
$ namespace test { x=123; typeset -g x=456; }
$ echo $x ${.test.x}
456 123
$ namespace test { typeset -Q; }
arch/darwin.i386-64/bin/ksh: typeset: -Q: unknown option
[usage message snipped for brevity]
$ echo $x ${.test.x}
123 123 <== expected: 123 456
$ x=789
$ echo $x ${.test.x}
789 789 <== expected: 789 456
$ # look at that, we never left the namespace...
When prefixing the erroneous 'typeset' with 'command', the problem
does not occur. 'command' disables the properties of special
built-ins such as exit on error. So, when a special built-in exits
on error, the parent scope is not properly resotred.
This bug exists in every ksh93 version with SHOPT_NAMESPACE so far.
src/cmd/ksh93/sh/xec.c: sh_exec():
- Before entering a namespace, use sh_pushcontext and sigsetjmp to
make sure we return here if sh_exit() is called, e.g. when a
special builtin throws an error, to ensure the parent scope
(oldnspace) is restored.
Thanks to @hyenias for making me aware of this bug.
Discussion: https://github.com/ksh93/ksh/issues/479#issuecomment-1140468965
Reproducer:
$ (IFS=$'\t\t'; val=$'\tone\t\ttwo\t'; set --posix; \
set -- $val; echo $#; set --noposix; set -- $val; echo $#)
2
4 <== OK
$ (IFS=$'\t\t'; val=$'\tone\t\ttwo\t'; set --posix; \
set -- $val; echo $#; set --default; set -- $val; echo $#)
2
2 <== bug
The output of the seconnd command line should be like the first.
When POSIX mode is turned off using 'set --noposix' (or 'set +o
posix'), sh.ifstable is invalidated as it needs to be repopulated
on the next field split to restore ksh-specific special handling of
a repeated $IFS whitespace character as non-whitespace. However,
when 'set --default' is used, this does not happen, which is a bug.
src/cmd/ksh93/sh/args.c: sh_argopts():
- While processing --default, when turning off SH_POSIX, call
sh_invalidate_ifs() to invalidate sh.ifstable.
The typeset output for -L/-R/-Z seems to be wrong when the input
has leading/trailing spaces. This started occurring after the
dynamic buffer size changes introduced in name.c as part of the
fix for <https://github.com/ksh93/ksh/issues/142>.
Test script:
typeset -L8 s_date1=" 22/02/09 08:25:01"; echo "$s_date1"
typeset -R10 s_date1="22/02/09 08:25:01 "; echo "$s_date1"
typeset -Z10 s_date1="22/02/09 08:25:01 "; echo "$s_date1"
Actual output:
22/02/0
08:25:01
0008:25:01
Expected output:
22/02/09
9 08:25:01
9 08:25:01
src/cmd/ksh93/sh/name.c: nv_newattr():
- Simplify allocation code, replacing the earlier dynamic buffer
size calculation with just the greater of the strlen and size.
Resolves: https://github.com/ksh93/ksh/issues/476
Co-authored-by: George Lijo <george.lijo@gmail.com>
In command substitutions of the $(standard) and ${ shared state; }
form, backslash line continuation is broken.
Reproducer:
echo $(
echo one two\
three
)
Actual output (ksh93, all versions):
one two\ three
Expected output (every other shell, POSIX spec):
one twothree
src/cmd/ksh93/sh/lex.c: sh_lex(): case S_REG:
- Do not skip new-line joining if we're currently processing a
command substitution of one of these forms (i.e., if the
lp->lexd.dolparen level is > 0).
Background info/analysis:
comsub() is called from sh_lex() when S_PAR is the current state.
In src/cmd/ksh93/data/lexstates.c, we see that S_PAR is reached in
the ST_DOL state table at index 40. Decimal 40 is ( in ASCII. So,
the previous skipping of characters was done according to the
ST_DOL state table, and the character that stopped it was (. This
means we have $(.
Alternatively, comsub() may be called from sh_lex() by jumping to
the do_comsub label. In brief, that is the case when we have ${.
Regardless of which it is from the two, comsub() is now called from
sh_lex(). In comsub(), lp->lexd.dolparen is incremented at the
beginning and decremented at the end. Between them, we see that
sh_lex() is called. So, lp->lexd.dolparen in sh_lex() indicates the
depth of nesting $( or ${ statements we're in. Thus, it is also the
number of comsub() invocations seen in a backtrace taken in
sh_lex().
The codepath for `...` is different (and never had this bug).
Co-authored by: Martijn Dekker <martijn@inlv.org>
Resolves: https://github.com/ksh93/ksh/issues/367
The following reproducer causes a spurious syntax error:
foo="`: "("`"
The nested double quotes are not recognised correctly, causing a
syntax error at the '('. Removing the outer double quotes (which
are unnecessary) is a workaround, but it's still a bug as every
other shell accepts this. This bug has been present since the
original Bourne shell.
src/cmd/ksh93/sh/lex.c: sh_lex(): case S_QUOTE:
- If the current character is '"' and we're in a `...` command
substitution (ingrave is true), then do not switch to the old
mode but keep using the ST_QUOTE state table.
Thanks to @JohnoKing for the report and to @atheik for the fix.
Co-authored by: Martijn Dekker <martijn@inlv.org>
Resolves: https://github.com/ksh93/ksh/issues/352
src/lib/libast/misc/optget.c: textout(): case ']':
- Before returning, call pop() to free any \f...\f info items that
are left. Note that this is a safe no-op if the pointer is null.
Resolves: https://github.com/ksh93/ksh/issues/407
Co-authored-by: Martijn Dekker <martijn@inlv.org>
This fixes another corner case bug in the horror show that is the
test/[ comand.
Reproducer:
$ ksh --posix -c 'test X -a -n'
ksh: test: argument expected
Every other shell returns 0 (success) as, POSIXly, this is a test
for the strings 'X' and '-n' both being non-empty, combined with
the binary -a (logical and) operator. Instead, '-n' was taken as a
unary primary operator with a missing argument, which is incorrect.
POSIX reference:
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/test.html
> 3 arguments:
> * If $2 is a binary primary, perform the binary test of $1 and $3.
src/cmd/ksh93/bltins/test.c:
- e3(): If the final argument begins with a dash, always treat it
as a test for a non-empty string, therefore return true. Do not
limit this to "new flags" only.
src/cmd/ksh93/tests/posix.sh:
- Added. These are tests for every aspect of the POSIX mode.
ksh has a little-known field splitting feature that conflicts with
POSIX: if a single-byte whitespace character (cf. isspace(3)) is
repated in $IFS, then field splitting is done as if that character
wasn't a whitespace character. An exmaple with the tab character:
$ (IFS=$'\t'; val=$'\tone\t\ttwo\t'; set -- $val; echo $#)
2
$ (IFS=$'\t\t'; val=$'\tone\t\ttwo\t'; set -- $val; echo $#)
4
The latter being the same as, for example
$ (IFS=':'; val='1️⃣2️⃣'; set -- $val; echo $#)
4
However, this is incompatible with the POSIX spec and with every
other shell except zsh, in which repeating a character in IFS does
not have any effect. So the POSIX mode must disable this.
src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/init.c:
- Add sh_invalidate_ifs() function that invalidates the IFS state
table by setting the ifsnp discipline struct member to NULL,
which will cause the next get_ifs() call to regenerate it.
- get_ifs(): Treat a repeated char as S_DELIM even if whitespace,
unless --posix is on.
src/cmd/ksh93/sh/args.c:
- sh_argopts(): Call sh_invalidate_ifs() when enabling or disabling
the POSIX option. This is needed to make the change in field
splitting behaviour take immediate effect instead of taking
effect at the next assignment to IFS.
b_enum() contains a check that exactly one argument is given:
237: if (error_info.errors || !*argv || *(argv + 1))
But the subsequent argument handling loop will happily deal with
multiple arguments:
246: while(cp = *argv++)
Every other declaration command supports multiple arguments and I
see no reason why enum shouldn't. Simply removing the '*(argv + 1)'
check allows 'enum' to create more than one type per invocation.
src/cmd/ksh93/bltins/enum.c:
- b_enum(): Remove check for >1 args as described above.
- Update documentation to describe the behaviour of enumeration
types in arithmetic expressions and to add an example: a bool
type with two enumeration values 'false' (0) and 'true' (1).
That type is predefined in ksh 93v- and 2020. We're not going
to do that in 93u+m but it's good to document the possibility.
src/cmd/ksh93/sh.1:
- Make changes parallel to the enum.c self-doc update.
Changes:
- Fixed two xtrace test failures introduced in commit cfc8744c.
- The definition of _use_ntfork_tcpgrp in xec.c is now dependent on
SHOPT_SPAWN being defined (re: 8e9ed5be).
- Removed many unnecessary newlines and fixed various typos.
This commit fixes an infinite loop introduced in commit 0863a8eb
that caused ksh to enter an infinite loop if posix_spawn failed
to start a new process after setting the terminal process group.
Reproducer (warning: it will cause ksh to crash Wayland sessions
and drives up CPU usage by a ton):
$ /tmp/this/file/does/not/exist
/usr/bin/ksh: /tmp/this/file/does/not/exist: not found
$ <Press enter>
(ksh now prints $PS1 in a loop until killed with SIGKILL)
The first bug fixed is the infinite loop that occurs when
posix_spawn fails to execute a command. This was fixed by setting
the terminal process group to the main interactive shell.
The second bug fixed is related to the signal handling of the
SIGTTIN, SIGTTOU and SIGTSTP signals. In sh_ntfork() these signals
are set to their default signal handlers (SIG_DFL) before running
a command. The signal handlers were only restored to SIG_IGN
(ignore signal) when sh_ntfork() successfully ran a command.
This could cause a SIGTTOU lockup under strace when a command
failed to execute in an interactive shell, while also being one
cause of the infinite loop.
src/cmd/ksh93/sh/xec.c: sh_ntfork():
- Restore the terminal process group if posix_spawn failed to
launch a new process. This is necessary because posix_spawn will
set the terminal process group before it attempts to run a
command and doesn't restore it on failure.
This change turns off O_NONBLOCK for stdin if a previously ran
program left it on so that interactive programs that expect it
to be off work properly.
src/cmd/ksh93/sh/io.c: slowread():
- Turn off O_NONBLOCK for stdin if it is on.
Fixes: https://github.com/ksh93/ksh/issues/469
The previous fix for the += operator introduced a use-after-free
bug that could result in a variable pointing to random garbage:
$ foo=bar
$ foo+=_foo true
$ typeset -p foo
foo=V V
The use after free issue occurs because when nv_clone creates a
copy of $foo in the true command's invocation-local scope, it does
not duplicate the string $foo points to. As a result, the $foo
variable in the parent scope points to the same string as $foo in
the invocation-local scope, which causes the use after free bug
when cloned $foo variable is freed from memory.
src/cmd/ksh93/sh/nvdisc.c:
- To fix the use after free bug, allow nv_clone to duplicate the
string with memdup or strdup when no flags are passed.
src/cmd/ksh93/tests/variables.sh:
- Add a regression test for using the += operator with regular
commands.
src/cmd/ksh93/tests/leaks.sh:
- Add a regression test to ensure the bugfix doesn't introduce any
memory leaks.
Reproducer (symptoms on at least macOS and FreeBSD):
$ mkfifo f
$ echo foo > f
(press Ctrl+Z)
^Zksh: f: cannot create [Interrupted system call]
Abort
The shell either aborts (dev builds) or crashes with 'Illegal
instruction' (release builds). This is consistent with
UNREACHABLE() being reached.
Backtrace:
0 libsystem_kernel.dylib __kill + 10
1 ksh sh_done + 836 (fault.c:678)
2 ksh sh_fault + 1324
3 libsystem_platform.dylib _sigtramp + 29
4 dyld ImageLoaderMachOCompressed::resolve(ImageLoader::LinkContext const&, char const*, unsigned ch
5 libsystem_c.dylib abort + 127
6 ksh sh_redirect + 3576 (io.c:1356)
7 ksh sh_exec + 7231 (xec.c:1308)
8 ksh exfile + 3247 (main.c:607)
9 ksh sh_main + 3551 (main.c:368)
10 ksh main + 38 (pmain.c:45)
11 libdyld.dylib start + 1
This means that UNREACHABLE() is actually reached here:
ksh/src/cmd/ksh93/sh/io.c
1351: if((fd=sh_open(tname?tname:fname,o_mode,RW_ALL)) <0)
1352: {
1353: errormsg(SH_DICT,ERROR_system(1),((o_mode&O_CREAT)?e_create:e_open),fname);
1354: UNREACHABLE();
1355: }
The cause is that, in the following section of code in sh_fault():
ksh/src/cmd/ksh93/sh/fault.c
183: #ifdef SIGTSTP
184: if(sig==SIGTSTP)
185: {
186: sh.trapnote |= SH_SIGTSTP;
187: if(pp->mode==SH_JMPCMD && sh_isstate(SH_STOPOK))
188: {
189: sigrelease(sig);
190: sh_exit(SH_EXITSIG);
191: return;
192: }
193: }
194: #endif /* SIGTSTP */
...sh_exit() is not getting called and the function will not return
because the SH_STOPOK bit is not set while the shell is blocked
waiting to write to a FIFO.
Even if sh_exit() did get called, that would not fix it, because
that function also checks for the SH_STOPOK bit and returns without
doing a longjmp if the signal is SIGTSTP and the SH_STOPOK bit is
not set. That is direct the reason why UNREACHABLE() was raeched:
errormsg() does call sh_exit() but sh_exit() then does not longjmp.
src/cmd/ksh93/sh/fault.c: sh_fault():
- To avoid the crash, we simply need to return from sh_fault() if
SH_STOPOK is off, so that the code path does not continue, no
error message is given on Ctrl+Z, UNREACHABLE() is not reached,
and the shell resumes waiting on trying to write to the FIFO.
The sh.trapnote flag should not be set if we're not going to
process the signal. This makes ksh behave like all other shells.
Resolves: https://github.com/ksh93/ksh/issues/464
Reproducer:
$ ksh -c 'unset PWD; (cd /); :'
Memory fault
The shell crashes because b_cd() is testing the value of the PWD
variable without checking if there is one.
src/cmd/ksh93/sh/path.c: path_pwd():
- Never return an unfreeable pointer to e_dot; always return a
freeable pointer. This fixes another corner-case crashing bug.
- Make sure the PWD variable gets assigned a value if it doesn't
have one, even if it's the "." fallback. However, if the PWD is
inaccessible but we did inherit a $PWD value that starts with a
/, then use the existing $PWD value as this will help the shell
fail gracefully.
src/cmd/ksh93/bltins/cd_pwd.c:
- b_cd(): When checking if the PWD is valid, use the sh.pwd copy
instead of the PWD variable. This fixes the crash above.
- b_cd(): Since path_pwd() now always returns a freeable value,
free sh.pwd unconditionally before setting the new value.
- b_pwd(): Not only check that path_pwd() returns a value starting
with a slash, but also verify it with test_inode() and error out
if it's wrong. This makes the 'pwd' command useful for checking
that the PWD is currently accessible.
src/cmd/ksh93/data/msg.c:
- Change e_pwd error message for accuracy and clarity.
This backports two minor additions to the 'read' built-in from ksh
93v-: '-a' is now the same as '-A' and '-u p' is the same as '-p'.
This is for compatibility with some 93v- or ksh2020 scripts.
Note that their change to the '-p' option to support both prompts
and reading from the coprocess was *not* backported because we
found it to be broken and unfixable. Discussoin at:
https://github.com/ksh93/ksh/issues/463
src/cmd/ksh93/bltins/read.c: b_read():
- Backport as described above.
- Rename the misleadingly named 'name' variable to 'prompt'.
It points to the prompt string, not to a variable name.
src/cmd/ksh93/data/builtins.c: sh_optpwd[]:
- Add -a as an alterative to -A. All that is needed is adding '|a'
and optget(3) will automatically convert it to 'A'.
- Change -u from a '#' (numeric) to ':' option to support 'p'. Note
that b_read() now needs a corresponding strtol() to convert file
descriptor strings to numbers where applicable.
- Tweaks.
src/cmd/ksh93/sh.1:
- Update accordingly.
- Tidy up the unreadable mess that was the 'read' documentation.
The options are now shown in a list.
From README:
FILESCAN on Experimental option that allows fast reading of files
using while < file;do ...; done and allowing fields in
each line to be accessed as positional parameters.
As SHOPT_FILESCAN has been enabled by default since ksh 93l
2001-06-01, the filescan loop is now documented in the manual page
and the compile-time option is no longer considered experimental.
We must disable this at runtime if --posix is active because it
breaks a portable use case: POSIXly, 'while <file; do stuff; done'
repeatedly excutes 'stuff' while 'file' can successfully be opened
for reading, without actually reading from 'file'.
This also backports a bugfix from the 93v- beta. Reproducer:
$ echo 'one two three' >foo
$ while <foo; do printf '[%s] ' "$@"; echo; done
[one two three]
Expected output:
[one] [two] [three]
The bug is that "$@" acts like "$*", joining all the positional
parameters into one word though it should be generating one word
for each.
src/cmd/ksh93/sh/macro.c: varsub():
- Backport fix for the bug described above. I do not understand the
opaque macro.c code well enough yet to usefully describe the fix.
src/cmd/ksh93/sh/xec.c: sh_exec():
- Improved sanity check for filescan loop: do not recognise it if
the simple command includes variable assignments, more than one
redirection, or an output or append redirection.
- Disable filescan loops if --posix is active.
- Another 93v- fix: handle interrupts (errno==EINTR) when closing
the input file.
In UTF-8 locales, ksh breaks when a KEYBD trap is active, even a
dummy no-op one like 'trap : KEYBD'. Entering multi-byte characters
fails (the input is interrupted and a new prompt is displayed) and
pasting content with multi-byte characters produces corrupted
results.
The cause is that the KEYBD trap code is not multibyte-ready.
Unfortunately nobody yet understands the edit.c code well enough
to implement a proper fix. Pending that, this commit implements
a workaround that at least avoids breaking the shell.
src/cmd/ksh93/edit/edit.c: ed_getchar():
- When a multi-byte locale is active, do not trigger the the KEYBD
trap except for ASCII characters (1-127).
Resolves: https://github.com/ksh93/ksh/issues/307
The -f/--noglob shell option is documented simply as: "Disables
pathname expansion." But after 'set -f' on an interactive shell,
command completion and file name completion also stop working. This
is because they internally use the pathname expansion mechanism.
But it is not documented anywhere that 'set -f' disables
completion; it's just a side effect of an implementation detail.
Though ksh has always acted like this, I think it should change
because it's not useful or expected behaviour. Other shells like
bash, yash or zsh don't act like this.
src/cmd/ksh93/sh/expand.c,
src/cmd/ksh93/sh/macro.c:
- Allow the SH_COMPLETE (command completion) or SH_FCOMPLETE
(file name completion) state bit to override SH_NOGLOB in
path_generate() and in sh_macexpand().
Reproducer:
exec 9>&1
( { exec 9>&1; } 9>&- )
echo "test" >&9 # => 9: cannot open [Bad file descriptor]
The 9>&- incorrectly persists beyond the { } block that it
was attached to *and* beyond the ( ) subshell. This is yet another
bug with non-forking subshells; forking it with something like
'ulimit -t unlimited' works around the bug.
In over a year we have not been able to find a real fix, but I came
up with a workaround that forks a virtual subshell whenever it
executes a code block with a >&- or <&- redirection attached. That
use case is obscure enough that it should not cause any performance
regression except in very rare corner cases.
src/cmd/ksh93/sh/xec.c: sh_exec(): TSETIO:
- This is where redirections attached to code blocks are handled.
Check for a >&- or <&- redirection using bit flaggery from
shnodes.h and fork if we're executing such in a virtual subshell.
Resolves: https://github.com/ksh93/ksh/issues/161
Thanks to @ko1nksm for the bug report.
This commit backports all of the relevant .sh.match bugfixes from
ksh93v-. Most of the .sh.match rewrite is from versions 2012-08-24
and 2012-10-04, with patches from later releases of 93v- and
ksh2020 also applied. Note that there are still some remaining bugs
in .sh.match, although now the total count of .sh.match bugs should
be less that before.
These are the relevant changes in the ksh93v- changelog that were
backported:
12-08-07 .sh.match no longer gets set for patterns in PS4 during
set -x.
12-08-10 Rewrote .sh.match expansions fixing several bugs and
improving performance.
12-08-22 .sh.match now handles subpatterns that had no matches with
${var//pattern} correctly.
12-08-21 A bug in setting .sh.match after ${var//pattern/string}
when string is empty has been fixed.
12-08-21 A bug in setting .sh.match after [[ string == pattern ]]
has been fixed.
12-08-31 A bug that could cause a core dump after
typeset -m var=.sh.match has been fixed.
12-09-10 Fixed a bug in typeset -m the .sh.match is being renamed.
12-09-07 Fixed a bug in .sh.match code that coud cause the shell
to quitely
13-02-21 The 12-01-16 bug fix prevented .sh.match from being used
in the replacement string. The previous code was restored
and a different fix which prevented .sh.match from being
computed for nested replacement has been used instead.
13-05-28 Fixed two bug for typeset -c and typeset -m for variable
.sh.match.
Changes:
- The SHOPT_2DMATCH option has been removed. This was already the
default behavior previously, and now it's documented in the man
page.
- init.c: Backported the sh_setmatch() rewrite from 93v- 2012-08-24
and 2012-10-04.
- Backported the libast 93v- strngrpmatch() function, as the
.sh.match rewrite requires this API.
- Backported the sh_match regression tests from ksh93v-, with many
other sh_match tests backported from ksh2020. Much of the sh_match
script is based on code from Roland Mainz:
https://marc.info/?l=ast-developers&m=134606574109162&w=2https://marc.info/?l=ast-developers&m=134490505607093
- tests/{substring,treemove}.sh: Backported other relevant .sh.match
fixes, with tests added to the substring and treemove test scripts.
- tests/types.sh: One of the (now reverted) memory leak bugfixes
introduced a CI test failure in this script, so for that test the
error message has been improved.
- string/strmatch.c: The original ksh93v- code for the strngrpmatch()
changes introduced a crash that could occur because strlen would
be used on a null pointer. This has been fixed by avoiding strlen
if the string is null.
One nice side effect of these changes is a considerable performance
improvement in the shbench[1] gsub benchmark (results from 20
iterations with CCFLAGS=-Os):
--------------------------------------------------
name /tmp/ksh-current /tmp/ksh-matchfixes
--------------------------------------------------
gsub.ksh 0.883 [0.822-0.959] 0.457 [0.442-0.505]
--------------------------------------------------
Despite all of the many fixes and improvements in the backported
93v- .sh.match code, there are a few remaining bugs:
- .sh.match is printed with a default [0] subscript (see also
https://github.com/ksh93/ksh/issues/308#issuecomment-1025016088):
$ arch/*/bin/ksh -c 'echo ${!.sh.match}'
.sh.match[0]
This bug appears to have been introduced by the changes from
ksh93v- 2012-08-24.
- The wrong variable name is given for 'parameter not set' errors
(from https://marc.info/?l=ast-developers&m=134489094602596):
$ arch/*/bin/ksh -u
$ x=1234
$ true "${x//~(X)([012])|([345])/}"
$ compound co
$ typeset -m co.array=.sh.match
$ printf "%q\n" "${co.array[2][0]}"
arch/linux.i386-64/bin/ksh: co.array[2][(null)]: parameter not set
- .sh.match leaks out of subshells. Further information and a
reproducer can be found here:
https://marc.info/?l=ast-developers&m=136292897330187
[1]: https://github.com/ksh-community/shbench
When executing a script without a hashbang path like #!/bin/ksh,
ksh forks itself, longjmps back to sh_main(), and then (among other
things) calling sh_reinit() which is the function that tries to
reinitialise as much of the shell as it can. This is its way of
ensuring the child script is run in ksh and not some other shell.
However, this appraoch is incredibly buggy. Among other things,
changes in built-in commands and custom type definitions survived
the reinitialisation, "exporting" variables didn't work properly,
and the hash table and ${.sh.stats} weren't reset. As a result,
depending on what the invoking script did, the invoked script could
easily fail or malfunction.
It is not actually possible to reinitialise the shell correctly,
because some of the shell state is in locally scoped static
variables that cannot simply be reinitialised. There are probably
huge memory leaks with this approach as well. At some point, all
this is going to need a total redesign. Clearly, the only reliable
way involves execve(2) and a start from scratch.
For now though, this seems to fix the known bugs at least. I'm sure
there are more to be discovered.
This commit makes another change: instead of the -h/trackall option
(which has been a no-op for decades), the posix option is now
inherited by the child script. Since there is no hashbang path from
which to decide whether the shell should run in POSIX mode nor not,
the best guess is probably the invoking script's setting.
src/cmd/ksh93/sh/init.c: sh_reinit():
- Keep the SH_INIT state on during the entire procedure.
- Delete remaining non-exported, non-default variables.
- Remove attributes from exported variables. In POSIX mode, remove
all attributes; otherwise, only remove readonly.
- Unset discipline function pointers for variables.
- Delete all custom types.
- Delete all functions and built-ins, then reinitialise the built-ins
table from scatch.
- Free the alias values before clearing the alias table.
- Same with hash table entries (tracked aliases).
- Reset statistics.
- Inherit SH_POSIX instead of SH_TRACKALL.
- Call user init function last, not somewhere in the middle.
src/cmd/ksh93/sh/name.c: sh_envnolocal():
- Be sure to preserve the export attribute of kept variables.
Resolves: https://github.com/ksh93/ksh/issues/350
This bugfix was backported from ksh93v- 2012-10-04. The bug fixed
by this change is one that causes 'typeset -p' to omit the -C flag
when listing compound arrays belonging to a type:
$ typeset -T Foo_t=(compound -a bar)
$ Foo_t baz
$ typeset -p baz.bar
typeset -a baz.bar='' # This should be 'typeset -C -a'
src/cmd/ksh93/sh/nvtype.c:
- Backport change from 93v- 2012-10-04 that sets the array nvalue to
a pointer named Null (which is "") in nv_mktype(), then to Empty
in fixnode().
- Change the Null name from the 93v- code to AltEmpty to avoid
misleading code readers into thinking that it's a null pointer.
src/cmd/ksh93/tests/types.sh:
- Backport the relevant 93v- changes to the types regression tests.
Co-authored-by: Martijn Dekker <martijn@inlv.org>
In ksh93v- 2012-10-04 the following bugfix is noted in the changelog
(this fix was most likely part of ksh93v- 2012-09-27, although that
version is not archived anywhere):
12-09-21 A bug in which the output of a two dimensional sparse
indexed array would cause the second subscript be treated
as an associative array when read back in has been fixed.
Elements that are sparse indexed arrays now are prefixed
type "typeset -a".
Below is a before and after of this change:
# Before
$ typeset -a foo[1][2]=bar
$ typeset -p foo
typeset -a foo=([1]=([2]=bar) )
# After
$ typeset -a foo[1][2]=bar
$ typeset -p foo
typeset -a foo=(typeset -a [1]=([2]=bar) )
src/cmd/ksh93/sh/*.c:
- Backport changes from ksh93v- to print 'typeset -a' before sparse
indexed arrays and properly handle 'typeset -a' in reinput
commands from 'typeset -p'.
src/cmd/ksh93/tests:
- Add two regression tests to arrays.sh for this change.
- Update the existing regression tests for compatibility with the
new printed typeset output.
Special builtins are undeleteable for a reason. But 'enum' and
'typeset -T' allow overriding them, causing an inconsistent state.
@JohnoKing writes:
| The behavior is rather buggy, as it appears to successfully
| override normal builtins but fails to delete the special
| builtins, leading to scenarios where both the original builtin
| and type are run:
|
| $ typeset -T eval=(typeset BAD; typeset TYPE) # This should have failed
| $ eval foo=BAD
| /usr/bin/ksh: eval: line 1: foo: not found
| $ enum trap=(BAD TYPE) # This also should have failed
| $ trap foo=BAD
| /usr/bin/ksh: trap: condition(s) required
| $ enum umask=(BAD TYPE)
| $ umask foo=BAD
| $ echo $foo
| BAD
|
| # Examples of general bugginess
| $ trap bar=TYPE
| /usr/bin/ksh: trap: condition(s) required
| $ echo $bar
| TYPE
| $ eval var=TYPE
| /usr/bin/ksh: eval: line 1: var: not found
| $ echo $var
| TYPE
This commit fixes the following:
The 'enum' and 'typeset -T' commands are no longer allowed to
override and replace special built-in commands, except for type
definition commands previously created by these commands; these
are already (dis)allowed elsewhere.
A command like 'typeset -T foo_t' without any assignments no longer
creates an incompletely defined 'foo_t' built-in comamnd. Instead,
it is now silently ignored for backwards compatibility. This did
have a regression test checking for it, but I'm changing it because
that's just not a valid use case. An incomplete type definition
command does nothing useful and only crashes the shell when run.
src/cmd/ksh93/bltins/enum.c: b_enum():
- Do not allow overriding non-type special built-ins.
src/cmd/ksh93/sh/name.c: nv_setlist():
- Do not allow 'typeset -T' to override non-type special built-ins.
To avoid an inconsistent state, this must be checked for while
processing the assignments list before typeset is really invoked.
src/cmd/ksh93/bltins_typeset.c: b_typeset():
- Only create a type command if sh.envlist is set, i.e., if some
shell assignment(s) were passed to the 'typeset -T' command.
Progresses: https://github.com/ksh93/ksh/issues/350
$ unset foo
$ echo ${foo[42]=bar}
(empty line)
Instead of the empty line, 'bar' was expected. As foo[42] was
unset, the conditional assignment should have worked.
$ unset foo
$ : ${foo[42]?error: unset}
(no output)
The expansion should have thrown an error with the given message.
This bug was introduced in ksh 93t 2008-10-01. Thanks to @JohnoKing
for finding the breaking change.
Analysis: The problem was experimenally determined to be in in the
following lines of nv_putsub(). If the array member is unset (i.e.
null), the value is set to the empty string instead:
src/cmd/ksh93/sh/array.c
1250: else
1251: ap->val[size].cp = Empty;
It makes some sense: if there is a value (even an empty one), the
variable is set and these expansions should behave accordingly.
Sure enough, deleting these lines fixes the bug, but at the expense
of introducing a lot of other array-related regressions. So we need
a way to special-case the affected expansions.
Where to do this? If we replace line 1251 with an abort(3) call, we
get this stack trace:
0 libsystem_kernel.dylib __pthread_kill + 10
1 libsystem_pthread.dylib pthread_kill + 284
2 libsystem_c.dylib abort + 127
3 ksh nv_putsub + 1411 (array.c:1255)
4 ksh nv_endsubscript + 940 (array.c:1547)
5 ksh nv_create + 4732 (name.c:1066)
6 ksh nv_open + 1951 (name.c:1425)
7 ksh varsub + 4934 (macro.c:1322)
[rest omitted]
The special-casing needs to be done on line 1250 of array.c, but
flagged in varsub() which processes these expansions. So, varsub()
calls nv_open() calls nv_create() calls nv_endsubscript() calls
nv_putsub(). That's a fairly deep call stack, so passing an extra
flag argument does not seem doable. I did try an approach using a
couple of new bit flags passed via these functions' flags and mode
parameters, but the way this code base uses bit flags is so
intricate, it does not seem to be possible to add or change
anything without unwanted side effects in all sorts of places.
So the only fix I can think of adds yet another global flag
variable for a very special case. It's ugly, but it works.
An elegant fix would probably involve a fairly comprehensive
redesign, which is simply not going to happen.
src/cmd/ksh93/include/shell.h:
- Add global sh.cond_expan flag.
src/cmd/ksh93/sh/array.c: nv_putsub():
- Do not set value to empty string if sh.cond_expan is set.
src/cmd/ksh93/sh/macro.c: varsub():
- Set sh.cond_expan flag while calling nv_open() for one of the
affected expansions.
- Minor refactoring for legibility and to make the fix fit better.
- SSOT: Instead of repeating string "REPLY", use the node's nvname.
- Do not pointlessly add an extra 0 byte when saving id for error
message; sfstruse() already adds this.
Thanks to @oguz-ismail for the bug report.
Resolves: https://github.com/ksh93/ksh/issues/383
Variables with a dot in their name, such as those declared in
namespace { ... } blocks, are usually stored in a separate tree
with their actual names not containing any dots. But under some
circumstances, including at least direct assignment of a
non-preexisting dot variable, dot variables are stored in the main
sh.var_tree with names actually containing dots. With allexport
active, those could end up exported to the environment. This bug
was also present in previous release versions of ksh.
src/cmd/ksh93/sh/name.c: pushnam():
- Check for a dot in the name before pushing a variable to export.
This fix was backported from ksh 93v- 2012-10-04.
src/cmd/ksh93/sh/nvtree.c: nv_outnode():
- If the array is supposed to be empty, do not continue. This
avoids outputting a nonexistent [0]= element for empty arrays.
Resolves: https://github.com/ksh93/ksh/issues/420
Co-authored-by: Martijn Dekker <martijn@inlv.org>
This commit adds a fix for the trap command, backported from a fork
of ksh2020: https://github.com/l0stman/ksh/commit/2033375f
src/cmd/ksh93/sh/jobs.c: job_chldtrap():
- Fixed a use after free bug in the for loop. The string pointed to
by sh.st.trapcom[SIGCHLD] may be freed from memory after
sh_trap(), so it must be reobtained each time sh_trap() is called
from within the for loop.
All variables that are assigned a value should be exported while
the allexport shell option is enabled. This works in most cases,
but variables assigned to with ${var:=foo} or $((var=123)) aren't
exported while allexport is on.
src/cmd/ksh93/sh/name.c:
- nv_putval(): This is the central assignment function; all forms
of variable assignment end up here. So this is the best place
to check for SH_ALLEXPORT and turn on the export attribute.
- nv_setlist(): Remove allexport handling, now redundant.
src/cmd/ksh93/bltins/read.c: sh_readline():
- Remove allexport handling, now redundant.
src/cmd/ksh93/sh/main.c: sh_main():
- nv_putval() is used to initialize PS4 and IFS using nv_putval();
this is after an -a/--allexport specified on the ksh command
line has been processed, so temporarily turn that off.
Co-authored-by: Martijn Dekker <martijn@inlv.org>
If the VISUAL or EDITOR environment variable is not set to a value
matching *[Vv][Ii]* or *macs* at initialisation time, then ksh does
not turn on any line editor.
This is user-hostile. New users on Unix-like systems typically have
a simple editor like nano preconfigured as their default, or may
not have the VISUAL or EDITOR variable set at all. So if they try
ksh, they find themselves without basic functionality such as arrow
keys and probably go straight back to bash.
The emacs line editor is by far the most widely used, especially
among new users, so ksh should default to that. Most other shells
already do this.
src/cmd/ksh93/sh/main.c: sh_main():
- On an interactive shell, if on editor was turned on based on
$VISUAL or $EDITOR, turn on emacs before reading input.
Reproducer:
$ ksh -c 'bash -c '\''kill -s INT $$'\''; echo "$?, continuing"'
Expected result: output "258, continuing"; exit status 0.
Actual result: no output; exit status 258. The child process sent
SIGINT only to itself and not to the process group, so the parent
script was wrongly interrupted.
Every shell except ksh93 produces the expected result. ksh93 also
gave the expected result before version 2008-01-31 93s+, which
introduced the code below.
Analysis: The problem is in these lines of code in xec.c,
sh_exec(), TFORK case, parent branch of fork:
1649: if(!sh_isstate(SH_MONITOR))
1650: {
1651: if(!(sh.sigflag[SIGINT]&(SH_SIGFAULT|SH_SIGOFF)))
1652: sh_sigtrap(SIGINT);
1653: sh.trapnote |= SH_SIGIGNORE;
1654: }
[...pipe and I/O handling, wait for command to finish...]
1667: if(!sh_isstate(SH_MONITOR))
1668: {
1669: sh.trapnote &= ~SH_SIGIGNORE;
1700: if(sh.exitval == (SH_EXITSIG|SIGINT))
1701: kill(sh.current_pid,SIGINT);
1702: }
When a user presses Ctrl+C, SIGINT is sent to the entire process
group. If job control is fully off (i.e., !sh_isstate(SH_MONITOR)),
then the process group includes the parent script. Therefore, in a
script such as
$ ksh -c 'bash -c '\''read x'\''; echo "$?, continuing"'
when the user presses Ctrl+C while bash waits for 'read x' input,
the parent ksh script should be interrupted as well.
Now, the code above ignores SIGINT while bash is running. (This is
done using special-casing in sh_fault() to handle that SH_SIGIGNORE
flag for SIGINT.) So, when Ctrl+C interrupts the process group, the
parent script is not getting interrupted as it should.
To compensate for that, the code then detects, using sh.exitval
(the child process' exit status), whether the child process was
killed by SIGINT. If so, it simply assumes that the signal was
meant for the process group including the parent script, so it
reissues SIGINT to itself after unignoring it.
But, as we can see from the broken reproducer above, that
assumption is not valid. Scripts are perfectly free to send SIGINT
to themselves only, and that must work as expected.
src/cmd/ksh93/sh/xec.c: sh_exec(): TFORK: parent branch:
- Instead of ignoring SIGINT, sigblock() it, which delays handling
the signal until sigrelease(). (Note that these are macros
defined in src/cmd/ksh93/features/sigfeatures according to OS
capabilities.)
- This makes reissuing SIGINT redundant, so delete that, which
fixes the bug.
src/cmd/ksh93/sh/fault.c:
- Nothing now sets the SH_SIGIGNORE flag in sh.trapnote, so remove
special-casing added in 2008-01-31 93s+.
Add extra key bindings to the emacs and vi modes
This patch adds the following key bindings to the emacs and vi
editing modes:
- Support for Home key sequences ^[[1~ and ^[[7~ as well as End key
sequences ^[[4~ and ^[[8~.
- Support for arrow key sequences ^[OA, ^[OB, ^[OC and ^[OD.
- Support for the following keyboard shortcuts (if the platform
supports the expected escape sequence):
- Ctrl-Left Arrow: Go back one word
- Alt-Left Arrow: Go back one word (Not supported on Haiku)
- Ctrl-Right Arrow: Go forward one word
- Alt-Right Arrow: Go forward one word (Not supported on Haiku)
- Ctrl-G: Cancel reverse search
- Ctrl-Delete: Delete next word (Not supported on Haiku)
- Added a key binding for the Insert key, which differs in the
emacs and vi editing modes:
- In emacs mode, Insert escapes the next character.
- In vi mode, Insert will switch the editor to insert mode (like
in vim).
src/cmd/ksh93/edit/{emacs,vi}.c:
- Add support for the <M-Left> and <M-Right> sequences. Like in
bash and mksh, these shortcuts move the cursor one word backward
or forward (like the <Ctrl-Left> and <Ctrl-Right> shortcuts).
- Only attempt to process these shortcuts if the escape sequence
begins with $'\E[1;'.
src/cmd/ksh93/edit/vi.c:
- If the shell isn't doing a reverse search, insert the bell
character when Ctrl+G is input.
- Add the Ctrl-Delete shortcut as an alias of 'dw'. Calling
ed_ungetchar twice does not work for 'dw', so Ctrl-Delete was
implemented by using a vp->del_word variable.
Co-authored-by: Martijn Dekker <martijn@inlv.org>
Currently, running ksh under ASan without the ASAN_OPTIONS variable
set to 'detect_leaks=0' usually ends with ASan complaining about a
memory leak in defpathinit() (this leak doesn't grow in a loop, so
no regression test was added to leaks.sh). Reproducer:
$ ENV=/dev/null arch/*/bin/ksh
$ cp -?
cp: invalid option -- '?'
Try 'cp --help' for more information.
$ exit
=================================================================
==225132==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 85 byte(s) in 1 object(s) allocated from:
#0 0x7f6dab42d459 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
#1 0x5647b77fe144 in sh_calloc /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/init.c:265
#2 0x5647b788fea9 in path_addcomp /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/path.c:1567
#3 0x5647b78911ed in path_addpath /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/path.c:1705
#4 0x5647b7888e82 in defpathinit /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/path.c:442
#5 0x5647b78869f3 in ondefpath /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/path.c:67
--- cut ---
SUMMARY: AddressSanitizer: 174 byte(s) leaked in 2 allocation(s).
Analysis: The previous code was leaking memory because
defpathinit() returns a pointer from path_addpath(), which has
memory allocated to it in path_addcomp(). This is the code ASan
traced as having allocated memory:
442: return(path_addpath((Pathcomp_t*)0,(defpath),PATH_PATH));
In path_addpath():
1705: first = path_addcomp(first,old,cp,type);
[...]
1729: return(first);
In path_addcomp():
1567: pp = sh_newof((Pathcomp_t*)0,Pathcomp_t,1,len+1);
The ondefpath() function doesn't save a reference to the pointer
returned by defpathinit(), which causes the memory leak:
66: if(!defpath)
67: defpathinit();
The changes in this commit avoid this problem by setting the
defpath variable without also calling path_addpath().
src/cmd/ksh93/sh/path.c:
- Move the code for allocating defpath from defpathinit() into its
own dedicated function called std_path(). This function is called
by defpathinit() and ondefpath() to obtain the current string
stored in the defpath variable. This bugfix is adapted from a
fork of ksh2020: https://github.com/l0stman/ksh/commit/db5c83a6
'eval' suffers from the same bug. Reproducer:
$ eval vi
then suspend vi, then try to resume it -- the same as in the
reproducer shown in the previous commit.
src/cmd/ksh93/bltins/misc.c: b_eval():
- Same fix. Do *not* turn off SH_MONITOR.
This fixes yet another whopper of a bug in job control. And it's
been in every version of ksh93 back to 1995, the beginning of
ast-open-archive. <sigh>
This is also bug number 23 that is fixed by simply removing code.
Reproducer:
1. Run vi, or another suspendable program, from a dot script or
POSIX function (ksh handles them the same way). So either:
$ echo vi >v
$ . ./v
or:
$ v() { vi; }
$ v
2. Suspend vi by typing Ctrl+Z.
3. Not one but two jobs are registered:
$ jobs -l
[2] + 85513 Stopped . ./v
[1] - 85512 Stopped . ./v
4. 'fg' does not work on either of them, just printing the job
command name but not resuming the editor. The second job
disappears from the table after 'fg'.
$ fg %2
. ./v
$ fg %2
ksh: fg: no such job
$ fg %1
. ./v
$ fg %1
. ./v
Either way, you're stuck with an unresumable vi that you have to
'kill -9' manually.
src/cmd/ksh93/sh/xec.c: sh_exec(): TFORK:
- Do *not* turn off the SH_MONITOR state flag (which tells ksh to
keep track of jobs) when running an external command from a
profile script or dot script/POSIX function. It's obvious that
this results in an inconsistent job control state as ksh will not
update it when the external command is suspended. The purpose of
this nonsense is surely lost to history as it's been there since
1995 or before. My testing says that removing it doesn't break
anything. If that turns out to be wrong, then the breakage will
have to be fixed in a correct way instead.
Attempting to use array subscript expansion with variables that
aren't set currently causes a spurious syntax error (in ksh93u+ and
older commits the reproducer crashes):
$ ksh -c 'echo ${foo[${bar}..${baz}]}' # Shouldn't print anything
ksh: : arithmetic syntax error
src/cmd/ksh93/sh/macro.c:
- Backport a parser bugfix from ksh93v- 2012-08-24 that avoids
setting mp->dotdot until the copyto() function's loop is
finished.
src/cmd/ksh93/tests/arrays.sh:
- Add regression tests for this bug.
On Cygwin, ksh does not execute scripts without a #! path in a fork
of the ksh process as it does on other systems. Reproducer (run
from ksh):
$ cat test.sh
echo "${BASH_VERSION:-not bash}"
echo "${.sh.version}"
$ chmod +x test.sh
$ ./test.sh
4.4.12(3)-release
./test.sh: line 2: ${.sh.version}: bad substitution
The script was executed in bash instead of ksh.
After this fix, the output on Cygwin is like ksh on other systems:
not bash
Version AJM 93u+m/1.1.0-alpha+dev 2022-01-26
This also fixes a number of regression test failures, as quite a
few tests create and execute temp scripts without a hashbang path.
Analysis: On Cygwin, execve(2) happily executes shell scripts
without a #! path with /bin/sh (which is bash --posix). However,
ksh relies on execve(2) executing binaries or #! only, as it uses
an ENOEXEC failure to decide whether to fork and execute a #!-less
shell script with a reinitialized copy of itself via exscript().
src/cmd/ksh93/sh/path.c: path_spawn():
- Look at the magic first two bytes of the file; if it is "MZ"
(Mark Zbikowski, originator of the .exe format) or "#!", continue
as normal, otherwise simulate an ENOEXEC failure from execve(2)
which will cause ksh to fall back on #!-less script execution.
src/cmd/ksh93/sh.1:
- Add a new section on history expansion mostly adapted from the
"History substitution" section from the tcsh(1) man page. This
has the standard BSD license which is already in the COPYRIGHT
file. Inapplicable stuff was removed, some new stuff added.
src/cmd/ksh93/edit/hexpand.c,
src/cmd/ksh93/sh/io.c:
- Set '#' as the default history comment character as on bash;
no longer disable it by default.
- Add the 'a' modifier as a synonym for 'g', as on bash.
- Remove pointless static keyword from np variable; since the
value from previous calls is never used it can just be local.
- Use NV_NOADD flag with nv_open() to avoid pointlessly creating
the node if the variable doesn't exist yet.
- Fix a bug in history expansion where the 'p' modifier had no
effect if the 'histverify' option is on.
Reproducer:
$ set -H -o histv
$ true a b c
$ !!:p
$ true a b c▁ <= instead of printed, the line is re-edited
Expected:
$ set -H -o histv
$ true a b c
$ !!:p
true a b c
$ ▁
This is fixed by making 'p' remove the HIST_EVENT bit from the
returned flag bits in hist_expand(), leaving only the HIST_PRINT
flag, then adding special handling for this case to slowread()
in io.c (print the line, then instead of executing it, continue
and read the next line).