Notable changes:
- Change a bunch of uses of memcmp(3) to compare strings (which can
cause buffer read overflows) to strncmp(3).
- src/cmd/ksh93/include/name.h: Eliminate redundant Nambfp_f type.
Replace with Shbltin_f type from libast's shcmd.h. Since that is
not guaranteed to be included here, repeat the type definition
here without fully defining the struct (which is valid in C).
Lexical levels are stored in a dynamically grown array of int values
grown by the stack_grow function. The pointer lex_match and the
maximum index lex_max are part of the lexer state struct that is now
saved and restored in various places -- see e.g. 37044047, a2bc49be.
If the stack needs to be grown, it is reallocated in stack_grow()
using sh_realloc(). If that happens between saving and restoring the
lexer state, then an outdated pointer is restored, and crash.
src/cmd/ksh93/include/shlex.h,
src/cmd/ksh93/sh/lex.c:
- Take lex_match and lex_max out of the lexer state struct and make
them separate static variables.
src/cmd/ksh93/edit/edit.c:
- While we're at it, save and restore the lexer state in a way that
is saner than the 93v- beta approach (re: 37044047) as well as
more readable. Instead of permanently allocating memory, use a
local variable to save the struct. Save/restore directly around
the sh_trap() call that actually needs this done.
Resolves: https://github.com/ksh93/ksh/issues/482
Reproducers:
$ ksh -c 'typeset -a arr=( ( (a $(($(echo 1) + 1)) c)1))'
ksh: echo: arr[0]._AST_FEATURES=CONFORMANCE - ast UNIVERSE - ucb: cannot be an array
ksh: [1]=1: invalid variable name
$ ksh -c 'typeset -a arr=( (a $(($(echo 1) + 1)) c)1)'
ksh: echo: arr._AST_FEATURES=CONFORMANCE - ast UNIVERSE - ucb: is not an identifier
ksh: [1]=1: invalid variable name
src/cmd/ksh93/sh/name.c: sh_setenviron():
- Save and clear the current compound assignment prefix (sh.prefix)
while assigning to the _AST_FEATURES variable.
It's undocumented, it's broken and can crash the shell, and it's
unclear if it can ever be fixed. So with a 1.0 release (hopefully)
not very far off, it's time to remove it from the 1.0 branch.
Related: https://github.com/ksh93/ksh/issues/422
src/cmd/ksh93/include/name.h:
- Include the ususally-wanted (Shbltin_f) typecast in funptr().
Various files:
- Change several direct foo->nvalue.bfp usages to funptr(np).
- Reduce explicit typecasts after the name.h change.
- To determine if we are (or just were) running a certain built-in
command, instead of comparing sh.bltindata.bnode with a builtin
table node, use sh.bltinfun to directly compare the builtin's
function; this is more readable and efficient.
The xargs-like functionality of 'command -x' was still failing with
E2BIG in cases or on systems where the environment variables list
is very large. For instance, on a default NixOS installation it's
about 50k by default (absurd; *each* process carries this weight).
This commit tweaks the feature test and introduces a runtime
fallback if it still fails.
POSIX: "The number of bytes available for the new process' combined
argument and environment lists is {ARG_MAX}. It is implementation-
defined whether null terminators, pointers, and/or any alignment
bytes are included in this total."
https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html
More recommended reading:
https://mina86.com/2021/the-real-arg-max-part-1/https://mina86.com/2021/the-real-arg-max-part-2/
So, operating systems are free to consume ARG_MAX space in whatever
bizarre way they want, and may even come up with more innovative
ways to waste buffer space in future. <sigh>
command_xargs() allows for the possibility of adding a certain
number of extra bytes per argument to account for pointers and
whatnot. As of this commit, we still start off from the value that
was determined by the _arg_extrabytes test in features/externs, but
path_spawn() will now increase that number at runtime and retry if
E2BIG still occurs. Hopefully this makes it future-proof.
src/cmd/ksh93/features/externs:
- Rename generated ARG_EXTRA_BYTES macro to _arg_extrabytes for
better naming consistency with other iffe feature tests.
- Tweaks to avoid detecting 9 extra bytes instead of 8 on some
versions of 64-bit Linux (it needs the size of a 64 bit pointer).
- Show the result in the iffe output.
src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/sh/init.c:
- Do not store getconf(CONF_ARG_MAX) at init time; on Linux, this
value may be changed dynamically (via ulimit -s), so it must be
re-obtained on every use.
src/cmd/ksh93/sh/path.c:
- command_xargs():
- Use a static global variable for the number of extra bytes per
argument. Initialise it with the results of the feature test.
This allows increasing it at runtime if an OS does something
weird causing an E2BIG failure.
- Abort instead of return if command_xargs() is called with
sh.xargmin < 0; this should never happen.
- To allow retrying without crashing, restore saved args before
returning -1.
- Leave more generous space for the environment -- half the size
of the existing environment. This was experimentally determined
to be needed to keep Linux and macOS happy.
- Instead of crashing, return with E2BIG if there is too little
space to run.
- Get rid of unnecessary (void*) typecasts; we no longer pretend
to be compatible with C++ (re: a34e8319).
- Remove a couple of dead 'if(saveargs) free(saveargs);'
statements; at those points, saveargs is known to be NULL.
- Return -2 instead of -1 when retrying would be pointless.
- path_spawn():
- When command_xargs() returns -1 and the error is E2BIG,
increase the number of extra bytes by the size of a char*
pointer and try again. Give up if adding bytes the size of 8
char* pointers fails.
src/cmd/ksh93/sh/xec.c: check_exec_optimization():
- Do not use this optimization if we are running 'command -x';
I noticed some instances of the PATH search yielding incorrect
results if we do. TODO: work this out at some point.
Reproducer:
trap : USR1
while :; do kill -s USR1 $$ || exit; done &
while :; do : >/dev/null; done
It can take between a fraction of a second and a few minutes, but
eventually it will fail like this:
$ ksh foo
foo[3]: /dev/null: cannot create
kill: 77946: no such process
It fails similarly with "cannot open" if </dev/null is used instead
of >/dev/null.
This is the same problem as in the referenced commit, except when
handling traps -- so the same fix is required in sh_fault().
Researching #483 revealed several instances in coprocess cleanup
where sh_close() is called with a file descriptor of -1 (which
flags that the pipe is already closed). As of feeb62d1, this sets
errno to EBADF. While the race condition triggered by this was
fixed in 411481eb, it's still better to avoid it.
This re-applies most of the changes reverted in 80767b1f.
Reproducer for a race condition:
typeset -i i
cat=${ command -pv cat; }
for((i=0;i<20000;i++))
do printf '\r%d ' "$i"
ksh |&
cop=$!
print -p $'print hello | '$cat$'\nprint '$exp
read -t 1 -p
read -t 1 -p
redirect 5<&p 6>&p 5<&- 6>&-
{ sleep .4; kill $cop; } &
spy=$!
if wait $cop 2>/dev/null # "/dev/null: cannot create"?
then kill $spy
else kill $spy; exit # boom
fi
wait
done
Result on my system. The time it takes to fail varies a lot:
$ arch/*/bin/ksh foo
717 foo[13]: /dev/null: cannot create [Bad file descriptor]
$
Analysis:
errno may get a value during SIGCHLD processing. This may cause the
functions called by the signal interrupt to call sh_close() with a
negative file descriptor, giving errno the EBADF value as this loop
in sh_open() is interrupted:
ksh/src/cmd/ksh93/sh/io.c in 94fc983
851: while((fd = open(path, flags, mode)) < 0)
852: if(errno!=EINTR || sh.trapnote)
853: return(-1);
Commit feeb62d1 caused sh_close() to set errno to EBADF if it gets
called with a negative file descriptor. In the cleanup of
coprocesses during the SIGCHLD interrupt handling, there are
several locations where sh_close() may be called with a negative
file descriptor, which now sets errno to EBADF. If SIGCHLD is
handled between open() being interrupted and the check for
errno!=EINTR, sh_open() will return -1 instead of retrying, causing
sh_redirect() to issue the "cannot create" error message.
SIGCHLD is issued whenever a process (such as the coprocess in the
reproducer) terminates. job_reap() is called (via job_waitsafe())
to handle this. That function does not always restore errno, which
allowed this race condition to happen.
src/cmd/ksh93/sh/jobs.c: job_reap():
- Always save/restore errno.
Resolves: https://github.com/ksh93/ksh/issues/483
This reverts commit a72ba2cf59.
It's broken:
test coprocess begins at 2022-06-20+02:34:59
coprocess.sh[198]: FAIL: traps when reading from cat coprocess not working
coprocess.sh[198]: FAIL: traps when reading from /bin/cat coprocess not working
test coprocess failed at 2022-06-20+02:35:01 with exit code 2 [ 32 tests 2 errors ]
Reopens: https://github.com/ksh93/ksh/issues/483
Intermittent regression test failure:
test coprocess begins at 2022-02-08+23:41:52
/usr/local/src/ksh93/ksh/src/cmd/ksh93/tests/coprocess.sh[116]: /dev/null: cannot create [Bad file descriptor]
coprocess.sh[118]: FAIL: /bin/cat coprocess hung after 'exec 5<&p 6>&p; exec 5<&- 6>&-'
test coprocess failed at 2022-02-08+23:41:56 with exit code 1 [ 33 tests 1 error ]
The coprocess didn't hang; the 2>/dev/null redirection in the
regression test somehow failed.
Running the regression test in a loop 10000 times is a fairly
reliable way of reproducing the failure on my system. This has
allowed me to find the commit that introduced it. It is: feeb62d1
How odd. That commit merely sets errno to EBADF if sh_close() is
called with a file descriptor < 0. I don't understand how this
causes a race condition.
But calling abort() in the location of the feeb62d1 patch did allow
me to trace all the sh_close() calls with a negative file
descriptor. Those shouldn't be happening in any case.
All those sh_close() calls were related to co-processes, which that
regression test also tests. A co-process pipe that is not in use is
set to -1 so we need to avoid calling sh_close() if that happened.
This commit does that (as well as making some consistency tweaks)
and it reliably makes the race condition go away on my system. The
"why" of this remains a mystery as the only difference is that
errno is not set to EBADF in these situations, and both
sh_redirect() and sh_open() explicitly set errno to 0.
Resolves: https://github.com/ksh93/ksh/issues/483
Notable changes:
src/cmd/ksh93/bltins/trap.c: b_trap():
- Disable the unadvertised 'trap + SIG' feature in POSIX mode; it's
not compatible as '+' is a legitimate command name.
src/cmd/ksh93/data/builtins.c:
- Give "pwd", "alarm" and "times" the BLT_ENV flag for better
performance. There is no need for those to be stoppable.
src/cmd/ksh93/sh/xec.c:
- sh_eval(): Remove a "temporary tksh hack" and associated
sh.fn_reset flag.
- sh_exec():
- Remove now-used 'unpipe' flag and one remaining dead check for
it (re: a2196f94, 4b22fd5d).
- Fix unnecessary and confusing reuse of the 'type' variable.
src/lib/libast/comp/conf.sh:
- trap: Use 'rm -rf' instead of 'rm -f' to remove temp executables;
on macOS, they may include *.dSYM directories.
Part of the *subshell_data/*sp struct:
92: int nofork;
In subshell.c, sh_subshell(), before entering a virtual subshell:
628: if(!(sp->nofork = sh_state(SH_NOFORK)))
629: sh_onstate(SH_NOFORK);
...and upon restoring when leaving the subshell:
696: if(!sp->nofork)
697: sh_offstate(SH_NOFORK);
This code is clearly nonsense, because 'sh_state(SH_NOFORK)' is
a static expression that always has the value 1. This can be seen
by looking at the definitions in defs.h:
194: #define sh_state(x) ( 1<<(x))
...and in shell.h:
68: #define SH_NOFORK 0 /* set when fork not necessary */
So, sh_state(SH_NOFORK) == 1<<(0) == 1, meaning sp->nofork is
always 0, meaning the statements conditional on it are never
executed. We can get rid of all of this; it's dead code.
In subshell.c on line 628, perhaps sh_isstate(SH_NOFORK) was meant
instead of sh_state(SH_NOFORK) -- but that state flag is supposed
to signal that forking is not necessary when running an external
command, that we can potentially 'exec' it directly if it's the
last command in the subshell -- which makes no sense at all when
being in a virtual subshell which does not have its own process.
This dead code was added in version 2008-09-21 ksh93t.
The subsequent 'flags |= sh_state(SH_NOFORK)' (subshell.c line 630)
probably makes more sense, since this sets execflg in sh_exec(),
which is also used for optimising redirections (avoids bothering to
store previous state if it's the last command in the subshell).
This commit supersedes @lijog's Solaris patch 280-23332860 (see
17ebfbf6) as this is a more general fix that makes the patch
redundant. Of course its associated regression tests stay.
Reproducer script:
trap 'echo SIGUSR1 received' USR1
sh -c 'kill -s USR1 $PPID'
Run as a normal script.
Expected behaviour: prints "SIGUSR1 received"
Actual behaviour: the shell invoking the script terminates. Oops.
As of 6701bb30, ksh again allows an exec-without-fork optimisation
for the last command in a script. So the 'sh' command gets the same
PID as the script, therefore its parent PID ($PPID) is the invoking
script and not the script itself, which has been overwritten in
working memory. This shows that, if there are traps set, the exec
optimisation is incorrect as the expected process is not signalled.
While 6701bb30 reintroduced this problem for scripts, this has
always been an issue for certain other situations: forked command
substitutions, background subshells, and -c option argument
scripts. This commit fixes it in all those cases.
In sh_exec(), case TFORK, the optimisation (flagged in no_fork) was
only blocked for SIGINT and for the EXIT and ERR pseudosignals.
That is wrong. It should be blocked for all signal and pseudosignal
traps, except DEBUG (which is run before the command) and SIGKILL
and SIGSTOP (which cannot be trapped).
(I've also tested the behaviour of other shells. One shell, mksh,
never does an exec optimisation, even if no traps are set. I don't
know if that is intentional or not. I suppose it is possible that a
script might expect to receive a signal without trapping it first,
and they could conceivably be affected the same way by this exec
optimisation. But the ash variants (e.g. Busybox ash, dash, FreeBSD
sh), as well as bash, yash and zsh, all do act like this, so the
behaviour is very widespread. This commit makes ksh act like them.)
Multiple files:
- Remove the sh.errtrap, sh.exittrap and sh.end_fn flags and their
associated code from the superseded Solaris patch.
src/cmd/ksh93/include/shell.h:
- Add a scoped sh.st.trapdontexec flag for sh_exec() to disable
exec-without-fork optimisations. It should be in the sh.st scope
struct because it needs to be reset in subshell scopes.
src/cmd/ksh93/bltins/trap.c: b_trap():
- Set sh.st.trapdontexec if any trap is set and non-empty (an empty
trap means ignore the signal, which is inherited by an exec'd
process, so the optimisation is fine in that case).
- Only clear sh.st.trapdontexec if we're not in a ksh function
scope; unlike subshells, ksh functions fall back to parent traps
if they don't trap a signal themselves, so a ksh function's
parent traps also need to disable the exec optimisation.
src/cmd/ksh93/sh/fault.c: sh_sigreset():
- Introduce a new -1 mode for sh_funscope() to use, which acts like
mode 0 except it does not clear sh.st.trapdontexec. This avoids
clearing sh.st.trapdontexec for ksh functions scopes (see above).
- Otherwise, clear sh.st.trapdontexec whenever traps are reset.
src/cmd/ksh93/sh/xec.c: check_exec_optimization():
- Consolidate all the exec optimisation logic into this function,
including the logic from the no_fork flag in sh_exec()/TFORK.
- In the former no_fork flag logic, replace the three checks for
SIGINT, ERR and EXIT traps with a single check for the
sh.st.trapdontexec flag.
The exec optimization only happened in background subshells and not
in regular subshells when they had forked via sh_subfork(), which
makes little sense.
src/cmd/ksh93/sh/xec.c: sh_exec: case TLST:
- A subshell is executed as a list of commands which is TLST. If
the shell had not forked at the beginning of the subshell, the
sh_state(SH_FORKED) flag was not passed on to recursive sh_exec()
invocations, and a sh_subfork() event did not change this. To fix
this, re-check for the SH_FORKED state and pass that bit on to
the recursive sh_exec() invocation if set (note that sh_isstate()
returns a bitmask and not a boolean value).
src/cmd/ksh93/sh/subshell.c: sh_subfork():
- Remove redundant sh_onstate(SH_FORKED); this is already done in
sh_fork() which this function calls.
Reproducer script:
tempfile=/tmp/out2.$$.$RANDOM
bintrue=$(whence -p true)
for opt in monitor pipefail
do
(
set +x -o "$opt"
(
sleep .05
echo "ok $opt" >&2
) 2>$tempfile | "$bintrue"
) &
wait
cat "$tempfile"
rm -f "$tempfile"
done
Expected output:
ok monitor
ok pipefail
Actual output:
(none)
The 'monitor' and 'pipefail' options are supposed to make the shell
wait for the all commands in the pipeline to terminate and not only
the last component, regardless of whether the pipe between the
component commands is still open. In the failing reproducer, the
dummy external true command is subject to an exec optimization, so
it replaces the subshell instead of forking a new process. This is
incorrect, as the shell is no longer around to wait for the
left-hand part of the pipeline, so it continues in the background
without being waited for. Since it writes to standard error after
.05 seconds (after the pipe is closed), the 'cat' command reliably
finds the temp file empty. Without the sleep this would be a race
condition with unpredictable results.
Interestingly, this bug is only triggered for a (background
subshell)& and not for a forked (regular subshell). Which means the
exec optimization is not done for a forked regular subshell, though
there is no reason not to. That will be fixed in the next commit.
src/cmd/ksh93/sh/xec.c: sh_exec():
- case TFORK: Never allow an exec optimization if we're running a
command in a multi-command pipeline (pipejob is set) and the
shell needs to wait for all pipeline commands, i.e.: either the
time keyword is in use, the SH_MONITOR state is active, or the
SH_PIPEFAIL option is on.
- case TFIL: Fix the logic for setting job.waitall for the
non-SH_PIPEFAIL case. Do not 'or' in the boolean value but assign
it, and include the SH_TIMING (time keyword in use) state too.
- case TTIME: After that fix in case TFIL, we don't need to bother
setting job.waitall explicitly here.
src/cmd/ksh93/sh.1:
- Add missing documentation for the conditions where the shell
waits for all pipeline components (time, -o monitor/pipefail).
Resolves: https://github.com/ksh93/ksh/issues/449
$ arch/*/bin/ksh -xc 'typeset -a a=(1 2 3); \
(typeset -A a; typeset -p a); typeset -p a'
typeset -A a=()
typeset -a a=(1 2 3)
The associative array in the subshell is empty, so the conversion
failed. So far, I have been unsuccessful at fixing this in the
array and/or virtual subshell code (a patch that fixes it there
would still be more than welcome).
As usual, real subshells work correctly, so this commit adds
another forking workaround. The use case is rare and specific
enough that I have no performance concerns.
src/cmd/ksh93/bltins/typeset.c: setall():
- Fork a virtual subshell if we're actually converting a variable
to an associative array, i.e.: the NV_ARRAY (-A, associative
array) attribute was passed, there are no assignments (sh.envlist
is NULL), and the variable is not unset.
src/cmd/ksh93/tests/arith.sh:
- Fix the "Array subscript quoting test" tests that should not have
been passing and that correctly failed after this fix; they used
'typeset -A' without an assignment in a subshell, assuming it was
unset in the parent shell, which it wasn't.
Resolves: https://github.com/ksh93/ksh/issues/409
When the funstaks() function deletes a stack, other code could
still reference that stack's pointer, at least if a script's DEBUG
trap changed the function context by assigning to ${.sh.level}.
This crashed @ormaaj's funcname.ksh script in certain contexts, at
least when run as a dot script or in a virtual subshell.
This allows that script to run in all contexts by making
funstaks(s) set the stack pointer to NULL after deleting the stack
and making various other points in the code check for a null
pointer before dereferencing it.
This may not be the most elegant fix but (in my testing) it does
work, even when compiling ksh with AddressSanitiser.
Thanks to @JohnoKing for help researching this problem.
Resolves: https://github.com/ksh93/ksh/issues/212
$ ksh -c '(sleep 1 & echo ${.sh.stats.forks}); :'
2
The shell forks twice. A virtual subshell that launches a
background job forks first before forking the background job.
As discussed in the P.S. in e3d7bf1d, this is caused by the
following code in sh_exec() in xec.c (as since changed in 4ce486a7,
e40aaa8a, 88a1f3d6, and a2196f94):
1505: if((type&(FAMP|TFORK))==(FAMP|TFORK))
1506: {
1507: if(sh.comsub && !(sh.fdstatus[1]&IONOSEEK))
1508: unpipe = iousepipe();
1509: if(!sh.subshare)
1510: sh_subfork();
1511: }
This smells like an ugly workaround. It forks a virtual subshell
whenever a background job is launched within it, whether there is
an actual reason to do this or not. Researching the ksh93-history
repo reveals that this workaround had already gone through many
different versions before we restarted ksh development.
In the e3d7bf1d P.S. discussion, four regression test failures were
caused by removing this workaround. I recently tried it again and
two of the failures were already gone. (Testing which commits fix
it would require recompiling every 93u+m commit with the workaround
patched out and I can't be bothered.) Two were left: the two
signal.sh failures. The last commit (50db00e1) fixed those as well.
So I believe we no longer need this workaround, which never did
make very much sense. Removing it significantly improves the
performance of ksh when launching background jobs from subshells.
Ksh handles local traps in virtual subshells the same way as local
traps in ksh-style shell functions, which can cause incorrect
operation.
Reproducer script:
trap 'echo "parent shell trap"; exit 0' USR1
(trap 'echo "subshell trap"' USR1; kill -USR1 $$)
echo wrong
Output on every ksh93 version: 'wrong'
Output on every other shell: 'parent shell trap'
The ksh93 output is wrong because $$ is the PID of the main shell,
therefore 'kill -USR1 $$' from a subshell needs to issue SIGUSR1 to
the main shell and trigger the 'echo SIGUSR1' trap.
This is an inevitable consequence of processing signals in a
virtual subshell. Signals are a process-wide property, but a
virtual subshell and the parent shell share the same process.
Therefore it is not possible to distinguish between the parent
shell and subshell trap.
This means virtual subshells are fundamentally incompatible with
receiving signals. No workaround can make this work properly.
Ksh could either assume the signal needs to be caught by the
subshell trap (wrong in this case, but right in others) or by the
parent shell trap. But it does neither and just gives up and does
nothing, which I suppose is the least bad way of doing it wrong.
As another example, consider a subshell that traps a signal, then
passes its own process ID (as of 9de65210, that's ${.sh.pid}) to
another process to say "here is where to signal me". A virtual
subshell will send it the PID that it shares with the the parent
shell. Even if a virtual subshell receives the signal correctly, it
may fork mid-execution afterwards, depending on the commands that
it runs (and this varies by implementation as we fix or work around
bugs). So its PID could be invalidated at any time.
Forking a virtual subshell at the time of trapping a signal is the
only way to ensure a persistent PID and correct operation.
src/cmd/ksh93/bltins/trap.c: b_trap():
- Fork when trapping (or ignoring) a signal in a virtual subshell.
(There's no need to fork when trapping a pseudosignal.)
src/cmd/ksh93/tests/signal.sh:
- Add tests. These are simplified versions of tests already there,
which issued 'kill' as a background job. Currently, running a
background job causes a virtual subshell to fork before forking
the 'kill' background job (or *any* background job, see e3d7bf1d)
-- an ugly partial workaround that I believe just became
redundant and which I will remove in the next commit.
I didn't trust this back in e3d7bf1d (which disabled it for
interactive shells) and I trust it less now. In af6a32d1/6b380572,
this was also disabled for virtual subshells as it caused program
flow corruption there. Now, on macOS 10.14.6, a crash occurs when
repeatedly running a command with this optimisation:
$ ksh -c 'for((i=0;i<100;i++));do print -n "$i ";(sleep 1&);done'
0 1 2 3 4 5 6 7 Illegal instruction
Oddly enough it seems that I can only reproduce this crash on macOS
-- not on Linux, OpenBSD, or Solaris. It could be a macOS bug,
particularly given the odd message in the stack trace below.
I've had enough, though. Out it comes. Things now work fine, the
reproducer is fixed on macOS, and it didn't optimise much anyway.
The double-fork issue discussed in e3d7bf1d remains.
________
For future reference, here's an lldb debugger session with a stack
trace. It crashes on calling calloc() (via sh_calloc(), via
sh_newof()) in jobsave_create(). This is not an invalid pointer
problem as we're allocating new memory, so it does look like an OS
bug. The "BUG IN CLIENT OF LIBPLATFORM" message is interesting.
$ lldb -- arch/*/bin/ksh -c 'for((i=0;i<100;i++));do print -n "$i ";(sleep 1&);done'
(lldb) target create "arch/darwin.i386-64/bin/ksh"
Current executable set to 'arch/darwin.i386-64/bin/ksh' (x86_64).
(lldb) settings set -- target.run-args "-c" "for((i=0;i<100;i++));do print -n \"$i \";(sleep 1&);done"
(lldb) run
error: shell expansion failed (reason: lldb-argdumper exited with error 2). consider launching with 'process launch'.
(lldb) process launch
Process 35038 launched: '/usr/local/src/ksh93/ksh/arch/darwin.i386-64/bin/ksh' (x86_64)
0 1 2 3 4 5 6 7 8 9 Process 35038 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
frame #0: 0x00007fff70deb1c2 libsystem_platform.dylib`_os_unfair_lock_recursive_abort + 23
libsystem_platform.dylib`_os_unfair_lock_recursive_abort:
-> 0x7fff70deb1c2 <+23>: ud2
libsystem_platform.dylib`_os_unfair_lock_unowned_abort:
0x7fff70deb1c4 <+0>: movl %edi, %eax
0x7fff70deb1c6 <+2>: leaq 0x1a8a(%rip), %rcx ; "BUG IN CLIENT OF LIBPLATFORM: Unlock of an os_unfair_lock not owned by current thread"
0x7fff70deb1cd <+9>: movq %rcx, 0x361cb16c(%rip) ; gCRAnnotations + 8
Target 0: (ksh) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
* frame #0: 0x00007fff70deb1c2 libsystem_platform.dylib`_os_unfair_lock_recursive_abort + 23
frame #1: 0x00007fff70de7c9a libsystem_platform.dylib`_os_unfair_lock_lock_slow + 239
frame #2: 0x00007fff70daa3bd libsystem_malloc.dylib`tiny_malloc_should_clear + 188
frame #3: 0x00007fff70daa20f libsystem_malloc.dylib`szone_malloc_should_clear + 66
frame #4: 0x00007fff70dab444 libsystem_malloc.dylib`malloc_zone_calloc + 99
frame #5: 0x00007fff70dab3c4 libsystem_malloc.dylib`calloc + 30
frame #6: 0x000000010003fa5d ksh`sh_calloc(nmemb=1, size=16) at init.c:264:13
frame #7: 0x000000010004f8a6 ksh`jobsave_create(pid=35055) at jobs.c:272:8
frame #8: 0x000000010004ed42 ksh`job_reap(sig=20) at jobs.c:363:9
frame #9: 0x000000010004ff6f ksh`job_waitsafe(sig=20) at jobs.c:511:3
frame #10: 0x00007fff70de9b5d libsystem_platform.dylib`_sigtramp + 29
frame #11: 0x00007fff70d39ac4 libsystem_kernel.dylib`__fork + 12
frame #12: 0x00007fff70c57d80 libsystem_c.dylib`fork + 17
frame #13: 0x000000010009590d ksh`sh_exec(t=0x0000000101005d30, flags=4) at xec.c:1883:16
frame #14: 0x0000000100096013 ksh`sh_exec(t=0x0000000101005d30, flags=4) at xec.c:2019:4
frame #15: 0x0000000100096c4f ksh`sh_exec(t=0x0000000101005a40, flags=5) at xec.c:2213:9
frame #16: 0x0000000100096013 ksh`sh_exec(t=0x0000000101005a40, flags=5) at xec.c:2019:4
frame #17: 0x000000010001c23f ksh`exfile(iop=0x0000000100405750, fno=-1) at main.c:603:4
frame #18: 0x000000010001b23c ksh`sh_main(ac=3, av=0x00007ffeefbff4f0, userinit=0x0000000000000000) at main.c:365:2
frame #19: 0x0000000100000776 ksh`main(argc=3, argv=0x00007ffeefbff4f0) at pmain.c:45:9
frame #20: 0x00007fff70bfe3d5 libdyld.dylib`start + 1
I had missed something while removing the legacy code for ancient
systems without fork(2). The SH_NTFORK bit flag was only ever
passed to sh_exec() from the version of sh_ntfork() that was
compiled on systems without fork(2). With that call removed, this
flag is now completely unused. This commit removes the flag and the
code that is never executed since that flag is never passed.
The POSIX mode now disables left-hand zero-padding of seconds in
'time'/'times' output. The standard specifies the output format quite
exactly and zero-padding is not in it.
The question-everything department reporting for duty once again.
Version 2005-05-22 ksh93q+ introduced code that saves the inode and
path of the present working directory before invoking a built-in
command without the BLT_ENV attribute (see data/builtins.c). When
the built-in completes, it gets the PWD's inode again and compares.
If they're different, it uses chdir(2) to change back to the old
working directory.
The corresponding commit in the ksh93-history repo contains no
relevant entries in src/cmd/ksh93/RELEASE so no form of rationale
for this addition is available.
Changing back to a previous directory by path name is unsafe,
because the directory may have been removed and even replaced by a
completely different one by then. This is a common vector for
symlink attacks.
In the 93v- beta, this code was improved to use fstat(2) and
fchdir(2) to guarantee changing back to the same directory.
But is this worth doing at all? Why should a built-in not be able
to change the current working directory? If it's not intended to do
this, it simply should not do it. Even in the case of dynamically
loadable third-party built-ins, we're running built-in code in the
same process as ksh, so we've already decided the code is trusted.
If it's not, there are far worse things than $PWD to worry about.
Not only that, this code comes at a performance hit as the file
system is accessed before and after running a non-BLT_ENV builtin.
Before removing this:
$ arch/*/bin/ksh.old -c 'typeset -i i; \
time for((i=0;i<=100000;i++)); do sleep 0; done >/dev/null'
real 0m00.83s
user 0m00.40s
sys 0m00.42s
After removing this:
$ arch/*/bin/ksh -c 'typeset -i i; \
time for((i=0;i<=100000;i++)); do sleep 0; done >/dev/null'
real 0m00.25s
user 0m00.25s
sys 0m00.00s
Removing this nonsense causes no regressions -- which is obvious.
because none of our built-ins except 'cd' change the PWD.
So far we've been handling AST release build and git commit flags
and ksh SHOPT_* compile time options in the generic package build
script. That was a hack that was necessary before I had sufficient
understanding of the build system. Some of it did not work very
well, e.g. the correct git commit did not show up in ${.sh.version}
when compiling from a git repo.
As of this commit, this is properly included in the mamake
dependency tree by handling it from the libast and ksh93 Mamfiles,
guaranteeing they are properly up to date.
For a release build, the _AST_ksh_release macro is renamed to
_AST_release, because some aspects of libast also use this.
This commit also adds my first attempt at documenting the (very
simple, six-command) mamake language as it is currently implemented
-- which is significantly different from Glenn Fowler's original
paper. This is mostly based on reading the mamake.c source code.
src/cmd/INIT/README-mamake.md:
- Added.
bin/package, src/cmd/INIT/package.sh:
- Delete the hack.
**/Mamfile:
- Remove KSH_RELFLAGS and KSH_SHOPTFLAGS, which supported the hack.
- Delete 'meta' commands. They were no-ops; mamake.c ignores them.
They also did not add any informative value.
src/lib/libast/Mamfile:
- Add a 'virtual' target that obtains the current git commit,
examines the git branch, and decides whether to auto-set an
_AST_git_commit and/or or _AST_release #define to a new
releaseflags.h header file. This is overwritten on each run.
- Add code to the install target that copies limit.h to
include/ast, but only if it doesn't exist or the content of the
original changed. This allows '#include <releaseflags.h>' from
any program using libast while avoiding needless recompiles.
- When there are uncommitted changes, add /MOD (modified) to the
commit hash instead of not defining it at all.
src/cmd/ksh93/**:
- Mamfile: Add a shopt.h target that reads SHOPT.sh and converts it
into a new shopt.h header file in the object code directory. The
shopt.h header only contains SHOPT_* directives that have a value
in SHOPT.sh (not the empty/probe ones). They also do not redefine
the macros if they already exist, so overriding with something
like CCFLAGS+=' -DSHOPT_FOO=1' remains possible.
- **.c: Every c file now #includes "shopt.h" first. So SHOPT_*
macros are no longer passed via environment/MAM variables.
* SHOPT.sh: The AUDITFILE and CMDLIB_DIR macros no longer need an
extra backslash-escape for the double quotes in their values.
(The old way required this because mamake inserts MAM variables
directly into shell scripts as literals without quoting. :-/ )
src/cmd/INIT/mamake.c:
- Import the two minor changes between from 93u+ and 93v-: bind()
is renamed to bindfile() and there is a tweak to detecting an
"improper done statement".
- Allow arbitrary whitespace (isspace()) everywhere, instead of
spaces only. This obsoletes my earlier indentation workaround
from 6cc2f6a0; turns out mamake always supported indentation, but
with spaces only.
- Do not skip line numbers at the beginning of each line. This
undocumented feature is not and (AFAICT) never has been used.
- Throw an error on unknown command or rule attribute. Quite an
important feature for manual maintenance: catches typos, etc.
The question-everything department is productive today.
When executing a built-in, ksh checks execflg, which is set if the
command is run in a location where an exec optimisation would take
place if the command were an external command (i.e. if it's the
last command in the script). In that case, and if there are no
traps, and we're not running a function (which we wouldn't anyway
if execflg is set), and the built-in does not have the BLT_ENV
attribute, ksh simulates close-on-exec by manually closing the file
descriptors with the IOCLEX attribute before running the built-in.
This is yet another thing that makes no sense:
- The command is a built-in. There is no exec, so close-on-exec
should not happen.
- Why only for the last command in the script? For external
commands, close-on-exec happens regardless of the command's
location, because external commands fork, then exec. So if this
was meant to introduce behavioural consistency between externals
and built-ins: FAIL
- Why only for built-ins without the BLT_ENV attribute? There are
only a few of them (see data/builtins.c).
Researching ksh93-history shows that this code was introduced in
2005-03-19 ksh93q+. The changes in ksh93/RELEASE in that commit
have a couple of entries that could be superficially related, but
nothing directly relevant. It also introduced several regression
tests that are still present, and none of them fail after we delete
this. So, this looks like yet more undocumented nonsense.
Here's another one from the "question everything" department.
Every time ksh executes a built-in, it checks whether it is bound
to a path (the name contains a '/') and if so, it turns off the vi,
emacs and gmacs shell options before executing it. After executing
the built-ins, those shell options are restored again. The comment
is: "turn off editors for built-in versions of commands on PATH".
But why? Commands do not and cannot ever use the command line
editors. They are only ever executed on an interactive shell
*between* running commands. So turning them off during the
execution of a command makes no difference.
The only possible effect is that these commands cannot check if a
user has the vi, emacs or gmacs option on, or change these options
itself. But what purpose could that limitation possibly serve? And
what sense does it make to do this for path-bound built-ins only
and not for other commands? And what sense does it make to have
this limitation only for the editor shell options and not others?
This waste of CPU cycles has been there since the beginning of the
ksh93-history repo (1995) so if there was ever any good reason for
it, it has been lost in the mist of history. I'm 99.999% sure that
nothing will happen after deleting this.
Reproducers:
$ ksh -c 'typeset -T foo=(x=1); typeset -T foo=(integer x)'
ksh: typeset: foo: type cannot be redefined
ksh: foo: type definition requires compound assignment
$ ksh -c 'typeset -T foo=(x=1); typeset -T foo=(integer x=2)'
ksh: foo: type cannot be redefined
ksh: foo: type definition requires compound assignment
In both cases, the second error message is spurious, as there is in
fact a compound assignment.
The first case was introduced in 1fbbeaa1. The 'integer' command no
longer resolves to a special built-in but is now a regular built-in
so it does not cause the shell to exit on error. The code path
continues and nv_mktype() is called which issues the bad error as
the compound assignment was not fully processed.
The second case was introduced in d309d604. This commit inhibits
the exit-on-error for the assignment list of a regular built-in.
For type definitions, nv_mktype() is called via this code path too.
Since all declaration commands were either special built-ins or
aliases to special built-ins in 93u+, they all exited on error and
this problem never happened. Now that some regular built-ins are
allowed inside 'typeset -T', we need some special-casing to make
them exit on error as before. Arguably this is the correct thing to
do anyway, as their errors are also an error in the enveloping
'typeset' which is a special built-in.
This regression is cosmetic and the case is rare, so this is
probably not worth a NEWS item.
src/cmd/ksh93/sh/xec.c: sh_exec():
- Fix the second case by adding a test for non-NULL 'sh.mktype' to
the conditions where the assignments list is processed with exit
on error. This pointer is set while executing a type definition.
- Fix the first case by saving the sh.mktype state before executing
a built-in and then checking the saved value if a built-in exits
on error. Saving is needed as sh.mktype has been reset by then.
Before:
$ set -o hist
-ksh: set: hist: bad option(s)
After:
$ set --hist
-ksh: set: hist: ambiguous option
In ksh 93u+m, there are three options starting with 'hist', so the
abbreviation does not represent a single option. It is useful to
communicate this in the error message.
In addition, "bad option(s)" was changed to "unknown option", the
same message as that given by AST optget(3), for consistency.
src/cmd/ksh93/sh/string.c:
- Make sh_lookopt() return -1 for an ambiguous option. This is
trivial as there is already an 'amb' variable flagging that up.
src/cmd/ksh93/sh/args.c:
- Use the negative sh_lookopt() return status to decide between
"unknown option" and "ambiguous option".
src/cmd/ksh93/data/builtins.c: sh_set[]:
- Explain the --option and --nooption forms in addition to the -o
option and +o option forms.
- Document the long options without their 'no' prefixes (e.g. glob
instead of noglob) as this simplifies documentation and the
relation with the short options makes more sense. Those names are
also how they show up in 'set -o' output and it is better for
those to match.
- Tweaks.
The POSIX standard in fact contains a sentence that specifically
allows preset aliases: "Implementations also may provide predefined
valid aliases that are in effect when the shell is invoked."
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_03_01
I had missed that back then. It's still a terrible idea for scripts
(particularly the way 93u+ implemented them), but for interactive
POSIX shells it makes a lot more sense and does not violate POSIX.
src/cmd/ksh93/sh/main.c: sh_main():
- Preset aliases for interactive shell regardless of SH_POSIX.
The POSIX standard requires real UNIX pipes as in pipe(2). But on
systems supporting it (all modern ones), ksh uses socketpair(2)
instead to make it possible for certain commands to peek ahead
without consuming input from the pipe, which is not possible with
real pipes. See features/poll and sh/io.c.
But this can create undesired side effects: applications connected
to a pipe may test if they are connected to a pipe, which will fail
if they are connected to a socket. Also, on Linux:
$ cat /etc/passwd | head -1 /dev/stdin
head: cannot open '/dev/stdin' for reading: No such device or address
...which happens because, unlike most systems, Linux cannot open(2)
or openat(2) a socket (a limitation that is allowed by POSIX).
Unfortunately at least two things depend on the peekahead
capability of the _pipe_socketpair feature. One is the non-blocking
behaviour of the -n option of the 'read' built-in:
-n Causes at most n bytes to be read instead of a full
line, but will return when reading from a slow device as
soon as any characters have been read.
The other thing that breaks is the <#pattern and <##pattern
redirection operators that basically grep standard input, which
inherently requires peekahead.
Standard UNIX pipes always block on read and it is not possible to
peek ahead, so these features inevitably break. Which means we
cannot simply use standard pipes without breaking compatibility.
But we can at least fix it in the POSIX mode so that cross-platform
scripts work more correctly.
src/cmd/ksh93/sh/io.c: sh_pipe():
- If _pipe_socketpair is detected at compile time, then use a real
pipe via sh_rpipe() if the POSIX mode is active. (If
_pipe_socketpair is not detected, a real pipe is always used.)
src/cmd/ksh93/data/builtins.c:
- sh.1 documents the slow-device behaviour of -n but 'read --man'
did not. Add that, making it conditional on _pipe_socketpair.
Resolves: https://github.com/ksh93/ksh/issues/327
The shell code de-parser, which converts byte code back to shell
source code, is unused. It was used in SHOPT_COSHELL, which was
removed in 3613da42, and in ancient code for systems without
fork(2), which was removed in 7b0e0776. Testing reveals it to be
quite broken as it has not kept up with more recent changes in ksh.
It is kept in the dev branch, as we intend to fix it up and use it
for 'typeset -f FUNCTIONNAME' to output function definitions in a
future release. (The current design, which outputs original source
code direct from the source file, is fundamentally broken because
the output of a function definition that was loaded from a file is
corrupted if you edit the file after loading the function.)
Notable changes:
- The typeset builtin's usage and error messages for incompatible
options used with -f has been corrected to show that -t and -u
can be used with -f.
- In name.c, get rid of misleaadingly named Null static which is
actually the empty string, not the null value. Replace with a new
AltEmpty macro that is defined similarly to Empty. This is now
also used in nvtype.c (re: de037b6e).
Historically, ksh (including ksh88 and mksh) allow brace expansion
not just on literal patterns but also on patterns resulting from
unquoted variable expansions or command substitutions:
$ a='{a,b}' ksh -c 'echo X{a,b} Y$a'
Xa Xb Ya Yb
Most people expect only the first (literal) pattern to be expanded,
as in bash and zsh:
$ a='{a,b}' bash -c 'echo X{a,b} Y$a'
Xa Xb Y{a,b}
The historic ksh behaviour is poorly documented and nearly unknown,
violates the principle of least astonishment, and makes unquoted
variable expansions even more unsafe. See discussion at:
https://www.austingroupbugs.net/view.php?id=1193https://github.com/ksh93/ksh/issues/140
Unfortunately, we cannot change it in default ksh without breaking
backward compatibility. But we can at least fix it for the POSIX
mode (which disables brace expansion by default but allows turning
it back on), particularly as it looks like POSIX, if it decides to
specify brace expansion in a future version of the standard, will
disallow brace expansion on unquoted variable expansions.
src/cmd/ksh93/sh/macro.c: endfield():
- When deciding whether to do brace expansion + globbing or only
globbing, also check that we do not have POSIX mode and an
unquoted variable expansion (mp->pattern==1).
$ typeset -T _bad_disc_t=(typeset dummy; function foo.get { :; })
Abort
One of the abort() calls that replaced a debug message in the
referenced commit was triggered when trying to define a discipline
function for a nonexistent type member inside a 'typeset -T' type
definition.
src/cmd/ksh93/sh/nvtype.c: std_disc():
- Issue a proper error message for that condition.
The 'getn' discipline is experimental and undocumented, the only
mention of it being an old mailing list post from David Korn:
https://www.mail-archive.com/ast-users@research.att.com/msg00601.html
But it still should not crash.
$ LC_NUMERIC=C ENV=/./dev/null arch/*/bin/ksh
$ foo.getn() { .sh.value=2.3*4.5; }
$ typeset -F foo
Memory fault
src/cmd/ksh93/sh/nvdisc.c: assign():
- Check that the nvalue union has a non-NULL pointer before
using it.
Progresses: https://github.com/ksh93/ksh/issues/435
Reproducer script:
typeset -Ttyp1 typ1=(
function get {
.sh.value="'Sample'";
}
)
typ1 var11
typeset -p .sh.type
typeset -p .sh.type
Buggy output:
namespace sh.type
{
typeset -r typ1='Sample'
}
namespace sh.type
{
typeset -x -r typ1='Sample'
}
An -x (export) attribute is magically pulled out of a hat.
Analysis: The walk_tree() function in nvdisc.c repurposes (!) the
NV_EXPORT attribute as an instruction to turn *off* indenting when
pretty-printing the values of compound variables. The
print_namval() function in typeset.c, implementing 'typeset -p',
turns on NV_EXPORT for compound variables to inhibit indentation.
But it then does not bother to turn it off, which causes this bug.
src/cmd/ksh93/bltins/typeset.c: print_namval():
- When printing compound variables, only turn on NV_EXPORT
temporarily.
Resolves: https://github.com/ksh93/ksh/issues/456
On an interactive shell in emacs or vi, type a command with a $'…'
quoted string that contains a backslash-escaped single quote, like:
$ true $'foo\'bar' ▁
Then begin to type the name of a file present in the current
working directory and press tab. Nothing happens as completion
fails to work.
The completion code does not recognise $'…' strings. Instead, it
parses them as '…' strings in which there are no backslash escapes,
so it considers the last ' to open a second quoted string which is
not terminated.
Plus, when replacing a $'…' string with a (backslash-escaped)
completed string, the initial '$' is not replaced:
$ $'/etc/hosts<Tab>
$ $/etc/hosts
src/cmd/ksh93/edit/completion.c:
- find_begin():
- Learn how to recognise $'…' strings. A new local dollarquote
flag variable is used to distinguish them from regular '…'
strings. The difference is that backslash escapes (and only
those) should be recognised as in "…".
- Set a special type -1 for $'…' as the caller will need a way
to distinguish those from '…'.
- ed_expand(): When replacing a quoted string, remove an extra
initial character (being the $ in $') if the type set by
find_begin() is -1.
Resolves: https://github.com/ksh93/ksh/issues/462
Comsubs were either executed or caused a syntax error when attempting
to complete them within single quotes. Since single quotes do not
expand anything, no such completion should take place.
$ '`/de<TAB>-ksh: /dev/: cannot execute [Is a directory]
$ '$(/de<TAB>-ksh: syntax error: `end of file' unexpected
src/cmd/ksh93/edit/completion.c:
- find_begin():
- Remove recursive handling for '`' comsubs from 7a2d3564; it is
sufficient to set the return pointer to the current location cp
(the character following '`') if we're not in single quotes.
- For '$' and '`', if we are within single quotes, set type '\''
and set the return pointer bp to the location of the '$' or
'`'.
- ed_expand(): If find_begin() sets type '\'' and the current begin
character is $ or `, refuse to attempt completion; return -1 to
cause a terminal beep.
Related:
https://github.com/ksh93/ksh/issues/268https://github.com/ksh93/ksh/issues/462#issuecomment-1038482307
A previous test left LC_ALL set, causing a spurious failure in one
of the newly added tests. The bug was masked because another test
conditional upon SHOPT_MULTIBYTE unset LC_ALL again. The fix is to
unset all locale variables before testing, just to be sure.
A side effect of the bug fixed in 2a835a2d caused the DEBUG trap
action to appear to be inherited by subshells, but in a buggy way
that could crash the shell. After the fix, the trap is reset in
subshells along with all the others, as it should be. Nonetheless,
as that bug was present for years, some have come to rely on it.
This commit implements that functionality properly. When the new
--functrace option is turned on, DEBUG trap actions are now
inherited by subshells as well as ksh function scopes. In addition,
since it makes logical sense, the new option also causes the
-x/--xtrace option's state to be inherited by ksh function scopes.
Note that changes made within the scope do not propagate upwards;
this is different from bash.
(I've opted against adding a -T short-form equivalent as on bash,
because -T was formerly a different option on 93u+ (see 63c55ad7)
and on mksh it has yet anohter a different meaning. To minimise
confusion, I think it's best to have the long-form name only.)
src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/data/options.c:
- Add new "functrace" (SH_FUNCTRACE) long-form shell option.
src/cmd/ksh93/sh/subshell.c: sh_subshell():
- When functrace is on, copy the parent's DEBUG trap action into
the virtual subshell scope after resetting the trap actions.
src/cmd/ksh93/sh/xec.c: sh_funscope():
- When functrace is on and xtrace is on in the parent scope, turn
it on in the child scope.
- Same DEBUG trap action handling as in sh_subshell().
Resolves: https://github.com/ksh93/ksh/issues/162
After the previous commit, one inconsistency was left. Assignments
preceding special built-ins and POSIX functions (which persist past
the command :-/) caused pre-existing attributes of the respective
variables to be cleared.
$ (f() { typeset -p n; }; typeset -i n; n=3+4 f)
n=3+4
(expected output: 'typeset -i n=7')
This problem was introduced shortly before the release of ksh 93u+,
in version 2012-05-04, by adding these lines of code to the code
for processing preceding assignments in sh_exec():
src/cmd/ksh93/sh/xec.c:
1055: if(np)
1056: flgs |= NV_UNJUST;
So, for special and declaration commands and POSIX functions, the
NV_UNJUST flag is passed to nv_open(). In older ksh versions, this
flag cleared justify attributes only, but in early 2012 it was
repurposed to clear *all* attributes -- without changing the name
or the relevant comment in name.h, which are now both misleading.
The reason for setting this flag in xec.c was to deal with some
bugs in 'typeset'. Removing those two lines causes regressions:
attributes.sh[316]: FAIL: typeset -L should not preserve old attributes
attributes.sh[322]: FAIL: typeset -R should not preserve old attributes
attributes.sh[483]: FAIL: typeset -F after typeset -L fails
attributes.sh[488]: FAIL: integer attribute not cleared for subsequent typeset
Those are all typeset regressions, which suggests this fix was
relevant to typeset only. This is corroborated by the relevant
AT&T ksh93/RELEASE entry:
12-04-27 A bug in which old attributes were not cleared when
assigning a value using typeset has been fixed.
So, to fix this 2012 regression without reintroducing the typeset
regressions, we need to set the NV_UNJUST flag for invocations of
the typeset family of commands only. This is changed in xec.c.
While we're at it, we might as well rename that little-used flag to
something that reflects its current purpose: NV_UNATTR.
Reproducer:
$ typeset -i NUM=123
$ (NUM=3+4 env; :)|grep ^NUM=
NUM=3+4
$ (NUM=3+4 env)|grep ^NUM=
NUM=7
The correct output is NUM=7. This is also the output if ksh is
compiled with SHOPT_SPAWN disabled, suggesting that the problem is
here in sh_ntfork(), where the invocation-local environment list is
set using a sh_scope() call:
src/cmd/ksh93/sh/xec.c:
3496: if(t->com.comset)
3497: {
3498: scope++;
3499: sh_scope(t->com.comset,0);
3500: }
Analysis:
When ksh is compiled with SHOPT_SPAWN disabled, external commands
are always run using the regular forking mechanism. First the shell
forks, then in the fork, the preceding assignments list (if any)
are executed and exported in the main scope. Replacing global
variables is not a problem as the variables are exported and the
forked shell is then replaced by the external command using
execve(2).
But when using SHOPT_SPAWN/sh_ntfork(), this cannot be done as the
fork(2) use is replaced by posix_spawn(2) which does not copy the
parent process into the child, therefore it's not possible to
execute anything in the child before calling execve(2). Which means
the preceding assignments list must be processed in the parent, not
the child. Which makes overwriting global variables a no-no.
To avoid overwriting global variables, sh_ntfork() treats preceding
assignments like local variables in functions, which means they do
not inherit any attributes from the parent scope. That is why the
integer attribute is not honoured in the buggy reproducers.
And this is not just an issue for external commands. sh_scope() is
also used for assignments preceding a built-in command. Which is
logical, as those don't create a process at all.
src/cmd/ksh93/sh/xec.c:
1325: if(argp)
1326: {
1327: scope++;
1328: sh_scope(argp,0);
1329: }
Which means this bug exists for them as well, regardless of whether
SHOPT_SPAWN is compiled in.
$ /bin/ksh -c 'typeset -i NUM; NUM=3+4 command eval '\''echo $NUM'\'
3+4
(expected: 7, as on mksh and zsh)
So, the attributes from the parent scope will need to be copied
into the child scope. This should be done in nv_setlist() which is
called from sh_scope() with both the NV_EXPORT and NV_NOSCOPE flags
passed. Those flag bits are how we can recognise the need to copy
attributes.
Commit f6bc5c0 fixed a similar inconsistency with the check for the
read-only attribute. In fact, the bug fixed there was simply a
specific instance of this bug. The extra check for readonly was
because the readonly attribute was not copied to the temporary
local scope. So that fix is now replaced by the more general fix
for this bug.
src/cmd/ksh93/sh/name.c: nv_setlist():
- Introduce a 'vartree' local variable to avoid repetitive
'sh.prefix_root ? sh.prefix_root : sh.var_tree' expressions.
- Use the NV_EXPORT|NV_NOSCOPE flag combination to check if parent
scope attributes need to be copied to the temporary local scope
of an assignment preceding a command.
- If so, copy everything but the value itself: attributes (nvflag),
size parameter (nvsize), discipline function pointer (nvfun) and
the name pointer (nvname). The latter is needed because some
code, at least put_lang() in init.c, compares names by comparing
the pointers instead of the strings; copying the nvname pointer
avoids a regression in tests/locale.sh.
src/cmd/ksh93/sh/xec.c: local_exports():
- Fix a separate bug exporting attributes to a new ksh function
scope, which was previously masked by the other bug. The
attributes (nvflag) were copied *after* nv_putval()ing the value,
which is incorrect as the behaviour of nv_putval() is influenced
by the attributes. But here, we're copying the value too, so we
can simplify the whole function by using nv_clone() instead. This
may also fix other corner cases. (re: c1994b87)
Resolves: https://github.com/ksh93/ksh/issues/465
Those debug messages were written straight to standard error
without going through the shell's error/warning message mechanism.
They look like debug messages that aren't really supposed to appear
(no one ever sees them AFAIK).
This commit removes them, in some cases replacing them with abort()
calls so that a stack trace is generated if the unexpected thing
happens, instead of the shell continuing in an inconsistent state.
A systematic grepping of the extern function definitions in
src/cmd/ksh93/include/*.h revealed more functions that either don't
exist or are not used anywhere. Some of them have never seen any
use in the entire ksh93-history repo (i.e. since 1995). They were
also all undocumented, so it's unlikely third-party custom
built-ins rely on them.
Bug 1: as of 960a1a99, floating point literals were no longer
recognised when importing variables from the environment. The
attribute was still imported but the value reverted to zero:
$ (export LC_NUMERIC=C; typeset -xF5 num=7.75; \
ksh -c 'typeset -p num')
typeset -x -F 5 num=0.00000
Bug 2 (inherited from 93u+): The code that imported variable
attributes from the environment only checked '.' to determine
whether the float attribute should be set. It should check the
current radix point instead.
$ (export LC_NUMERIC=debug; typeset -xF5 num=7,75; \
ksh -c 'typeset -p num')
typeset -x -i num=0
...or, after fixing bug 1 only, the output is:
typeset -x -i num=75000
src/cmd/ksh93/sh/arith.c: sh_strnum():
- When importing untrusted env vars at init time, handle not only
"base#value" literals using strtonll, but also floating point
literals using strtold. This fixes the bug without reallowing
arbitary expressions. (re: 960a1a99)
- When not initialising, use sh.radixpoint (see f0386a87) instead
of '.' to help decide whether to evaluate an arith expression.
src/cmd/ksh93/sh/init.c: env_import_attributes():
- Use sh.radixpoint instead of '.' to check for a decimal fraction.
(This code is needed because doubles are exported as integers for
ksh88 compatibility; see attstore() in name.c.)