This commit fixes an issue I found in the subshell $RANDOM
reseeding code.
The main issue is a performance regression in the shbench fibonacci
benchmark, introduced in commit af6a32d1. Performance dropped in
this benchmark because $RANDOM is always reseeded and restored,
even when it's never used in a subshell. Performance results from
before and after this performance fix (results are on Linux with
CC=gcc and CCFLAGS='-O2 -D_std_malloc'):
$ ./shbench -b bench/fibonacci.ksh -l 100 ./ksh-0f06a2e ./ksh-af6a32d ./ksh-f31e368 ./ksh-randfix
benchmarking ./ksh-0f06a2e, ./ksh-af6a32d, ./ksh-f31e368, ./ksh-randfix ...
*** fibonacci.ksh ***
# ./ksh-0f06a2e # Recent version of ksh93u+m
# ./ksh-af6a32d # Commit that introduced the regression
# ./ksh-f31e368 # Commit without the regression
# ./ksh-randfix # Ksh93u+m with this patch applied
-------------------------------------------------------------------------------------------------
name           ./ksh-0f06a2e        ./ksh-af6a32d        ./ksh-f31e368        ./ksh-randfix
-------------------------------------------------------------------------------------------------
fibonacci.ksh  0.481 [0.459-0.515]  0.472 [0.455-0.504]  0.396 [0.380-0.442]  0.407 [0.385-0.439]
-------------------------------------------------------------------------------------------------
src/cmd/ksh93/include/variables.h,
src/cmd/ksh93/sh/{init,subshell}.c:
- Rather than reseed $RANDOM every time a subshell is created, add
a sh_save_rand_seed() function that does this only when the
$RANDOM variable is used in a subshell. This function is called
by the $RANDOM discipline functions nget_rand() and put_rand().
As a minor optimization, sh_save_rand_seed doesn't reseed if it's
called from put_rand().
- Because $RANDOM may have a seed of zero (i.e., RANDOM=0),
sp->rand_seed isn't enough to tell if $RANDOM has been reseeded.
Add sp->rand_state for this purpose.
- sh_subshell(): Only restore the former $RANDOM seed and state if
it is necessary to prevent a subshell leak.
src/cmd/ksh93/tests/variables.sh:
- Add two regression tests for bugs I ran into while making this
patch.
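The lazy save described above can be sketched roughly as follows.
This is a simplified illustration, not the actual ksh code: only the
rand_seed and rand_state member names come from this message; the
struct layouts and the demo_* function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the generator state. */
struct demo_rand
{
	unsigned int	seed;
};

/* Hypothetical, simplified stand-in for the per-subshell data. */
struct demo_subshell
{
	unsigned int	rand_seed;	/* parent's seed; valid only if rand_state is set */
	bool		rand_state;	/* set once the parent's seed has been saved */
};

/* Called on first use of the variable in a subshell: save once, lazily. */
static void demo_save_rand_seed(struct demo_subshell *sp, struct demo_rand *rp,
				bool reseed, unsigned int new_seed)
{
	if (sp->rand_state)
		return;			/* already saved for this subshell */
	sp->rand_seed = rp->seed;	/* may legitimately be 0, hence the flag */
	sp->rand_state = true;
	if (reseed)
		rp->seed = new_seed;
}

/* Called when leaving the subshell: restore only if something was saved. */
static void demo_restore_rand_seed(struct demo_subshell *sp, struct demo_rand *rp)
{
	if (!sp->rand_state)
		return;
	rp->seed = sp->rand_seed;
	sp->rand_state = false;
}
```

A subshell that never touches the variable pays only for the flag
check on exit, which is the point of the optimization.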
ksh crashed after unsetting .sh.match and then matching a pattern:
$ unset .sh.match
$ [[ bar == ba* ]]
Memory fault
src/cmd/ksh93/sh/init.c: sh_setmatch():
- Do nothing if we cannot get an array pointer to SH_MATCHNOD.
On Linux, the 'su' program sets $0 to '-su' when doing 'su -' or
'su - username'. When ksh is the target account's default shell,
this caused ksh to consider itself to be launched as a standard
POSIX sh, which (among other things) disables the default aliases
on interactive shells. This caused confusion for at least one user
as they lost their 'history' alias after 'su -':
https://www.linuxquestions.org/questions/slackware-14/in-current-with-downgrade-to-ksh93-lost-the-alias-history-4175703408/
bash does not consider itself to be sh when invoked as su, so ksh
probably shouldn't, either. The behaviour was also undocumented,
making it even more surprising.
src/cmd/ksh93/sh/init.c: sh_type():
- Only set the SH_TYPE_POSIX bit if we're invoked as 'sh' (or, on
Windows, as 'sh.exe').
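As a rough sketch of the intended check (not the actual sh_type()
code): the helper name demo_invoked_as_sh() is hypothetical, the
Windows 'sh.exe' case is left out, and it is an assumption here that
a leading '-' (login shell marker) is skipped before comparing.

```c
#include <string.h>

/* Hypothetical helper: returns 1 only if the invocation name is 'sh'. */
static int demo_invoked_as_sh(const char *argv0)
{
	const char *base = strrchr(argv0, '/');

	base = base ? base + 1 : argv0;	/* strip any directory part */
	if (*base == '-')		/* assumed: login shells get a leading '-' */
		base++;
	return strcmp(base, "sh") == 0;
}
```

With this check, '-su' no longer matches, while '/bin/sh' still does.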
On NetBSD, for some reason, the wctrans(3) and towctrans(3) C
library functions exist, but have no effect; the "toupper" and
"tolower" maps don't even translate case for ASCII, never mind wide
characters. This kills 'typeset -u' and 'typeset -l' on ksh, which
was the cause of most of the regression test failures on NetBSD.
Fallback versions for these functions are provided in init.c, but
were not being used on NetBSD because the feature test detected the
presence of these functions in the C library.
src/cmd/ksh93/features/locale:
- Replace the simple test for the presence of wctrans(3),
towctrans(3), and the wctrans_t type by an actual feature test
that checks that these functions not only compile, but are also
capable of changing an ASCII 'q' to upper case and back again.
src/cmd/ksh93/sh/init.c: towctrans():
- Add wide character support to the fallback function, for whatever
good that may do; on NetBSD, the wide-character towupper(3) and
towlower(3) functions only change case for ASCII.
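The kind of run-time check described above might look roughly like
this. It is a minimal sketch, not the contents of the features/locale
test; the function name demo_towctrans_works() is hypothetical.

```c
#include <wchar.h>
#include <wctype.h>

/* Returns 1 if wctrans(3)/towctrans(3) can round-trip an ASCII 'q'. */
static int demo_towctrans_works(void)
{
	wctrans_t upper = wctrans("toupper");
	wctrans_t lower = wctrans("tolower");

	if (!upper || !lower)
		return 0;	/* the maps are not even known */
	return towctrans(L'q', upper) == L'Q'
		&& towctrans(L'Q', lower) == L'q';
}
```

A result of 0 would mean the functions exist but cannot be relied
on, in which case the fallback versions should be used.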
This fixes the following:
1. Using $RANDOM in a virtual/non-forked subshell no longer
influences the reproducible $RANDOM sequence in the parent
environment.
2. When invoking a subshell, $RANDOM is now re-seeded (as mksh and
bash do) so that invocations in repeated subshells (including
forked subshells) no longer produce identical sequences by default.
3. Program flow corruption that occurred in scripts on executing
( ( simple_command & ) ).
src/cmd/ksh93/include/variables.h:
- Move 'struct rand' here as it will be needed in subshell.c. Add
rand_seed member to save the pseudorandom generator seed. Remove
the pointer to the shell state as it's redundant.
src/cmd/ksh93/sh/init.c:
- put_rand(): Store given seed in rand_seed while calling srand().
No longer pointlessly limit the number of possible seeds with the
RANDMASK bitmask (that mask is to limit the values to 0-32767,
it should not limit the number of possible sequences to 32768).
- nget_rand(): Instead of using rand(), use rand_r() to update the
rand_seed value. This makes it possible to save/restore the
current seed of the pseudorandom generator.
- Add sh_reseed_rand() function that reseeds the pseudorandom
generator by calling srand() with a bitwise-xor combination of
the current PID, the current time with a granularity of 1/10000
seconds, and a sequence number that is increased on each
invocation.
- nv_init(): Set the initial seed using sh_reseed_rand() here
instead of in sh_main(), as this is where the other struct rand
members are initialised.
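To illustrate the two ideas above (an explicit, saveable seed via
rand_r(3), and a reseed value mixed from PID, time and a sequence
number), here is a minimal sketch. The demo_* names are hypothetical
and the real code differs in detail; in particular, the sketch
returns the new seed instead of calling srand().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for rand_r(3) */
#endif
#include <stdlib.h>
#include <sys/time.h>
#include <unistd.h>

#define DEMO_RANDMASK	0x7fff	/* limits values to 0-32767 */

/* Hypothetical seed mixer: PID xor time (1/10000 s units) xor a counter. */
static unsigned int demo_new_seed(void)
{
	static unsigned int seq;
	struct timeval tv;
	unsigned int t;

	gettimeofday(&tv, NULL);
	t = (unsigned int)tv.tv_sec * 10000u + (unsigned int)(tv.tv_usec / 100);
	return (unsigned int)getpid() ^ t ^ ++seq;
}

/* Draw the next value. The generator state lives in *seed, so the caller
 * can copy it away and put it back later to resume the same sequence. */
static int demo_next_random(unsigned int *seed)
{
	return rand_r(seed) & DEMO_RANDMASK;
}
```

Because the mask is applied only to the returned value, the seed
itself keeps its full range.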
src/cmd/ksh93/sh/main.c: sh_main():
- Remove the srand() call that was replaced by the sh_reseed_rand()
call in init.c.
src/cmd/ksh93/sh/subshell.c: sh_subshell():
- Upon entering a virtual subshell, save the current $RANDOM seed
and state, then reseed $RANDOM for the subshell.
- Upon exiting a virtual subshell, restore $RANDOM seed and state
and reseed the generator using srand() with the restored seed.
src/cmd/ksh93/sh/xec.c: sh_exec():
- When optimizing out a subshell that is the last command, still
act like a subshell: reseed $RANDOM and increase ${.sh.subshell}.
- Fix a separate bug discovered while implementing this. Do not
optimize '( simple_command & )' when in a virtual subshell; doing
this causes program flow corruption.
- When optimizing '( simple_command & )', also reseed $RANDOM and
increment ${.sh.subshell}.
src/cmd/ksh93/tests/subshell.sh,
src/cmd/ksh93/tests/variables.sh:
- Add various tests for all of the above.
Co-authored-by: Johnothan King <johnothanking@protonmail.com>
Resolves: https://github.com/ksh93/ksh/issues/285
This commit implements unsetting functions in virtual subshells,
removing the need for the forking workaround. This is done by
either invalidating the function found in the current subshell
function tree by unsetting its NV_FUNCTION attribute bits (which
will cause sh_exec() to skip it) or, if the function exists in a
parent shell, by creating an empty dummy subshell node in the
current function tree without that attribute.
As a beneficial side effect, it seems that bug 228 (unset -f fails
in forked subshells if a function is defined before forking) is now
also fixed.
src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/init.c:
- Add sh.fun_base for a saved pointer to the main shell's function
tree for checking when in a subshell, analogous to sh.var_base.
src/cmd/ksh93/bltins/typeset.c: unall():
- Remove the fork workaround.
- When unsetting a function found in the current function tree
(troot) and that tree is not sh.var_base (which checks if we're
in a virtual subshell in a way that handles shared-state command
substitutions correctly), then do not delete the function but
invalidate it by unsetting its NV_FUNCTION attribute bits.
- When unsetting a function not found in the current function tree,
search for it in sh.fun_base and if found, add an empty dummy
node to mask the parent shell environment's function. The dummy
node will not have NV_FUNCTION set, so sh_exec() will skip it.
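The invalidate-or-mask logic can be modelled with a toy function
tree. This is only a sketch under simplifying assumptions: fixed-size
arrays instead of real trees, and hypothetical demo_* names, with
DEMO_FUNCTION standing in for the NV_FUNCTION attribute bits.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAXNODES	8
#define DEMO_FUNCTION	1	/* stands in for the NV_FUNCTION bits */

struct demo_node
{
	const char	*name;
	int		flags;
};

struct demo_tree
{
	struct demo_tree	*parent;	/* NULL for the main shell's tree */
	struct demo_node	nodes[DEMO_MAXNODES];
	size_t			count;
};

static struct demo_node *demo_find(struct demo_tree *t, const char *name)
{
	size_t i;

	for (i = 0; i < t->count; i++)
		if (strcmp(t->nodes[i].name, name) == 0)
			return &t->nodes[i];
	return NULL;
}

static int demo_define(struct demo_tree *t, const char *name)
{
	struct demo_node *np = demo_find(t, name);

	if (!np)
	{
		if (t->count == DEMO_MAXNODES)
			return -1;
		np = &t->nodes[t->count++];
		np->name = name;
	}
	np->flags = DEMO_FUNCTION;
	return 0;
}

/* The nearest node wins; a node without the flag means "no such function". */
static int demo_is_callable(struct demo_tree *t, const char *name)
{
	for (; t; t = t->parent)
	{
		struct demo_node *np = demo_find(t, name);
		if (np)
			return (np->flags & DEMO_FUNCTION) != 0;
	}
	return 0;
}

/* Unset in the current tree only; the parent's nodes are never modified. */
static int demo_unset(struct demo_tree *cur, const char *name)
{
	struct demo_node *np = demo_find(cur, name);
	struct demo_tree *t;

	if (np)
	{
		np->flags &= ~DEMO_FUNCTION;	/* invalidate in place */
		return 0;
	}
	for (t = cur->parent; t; t = t->parent)
		if (demo_find(t, name))
			break;
	if (!t)
		return 0;			/* nothing to mask */
	if (cur->count == DEMO_MAXNODES)
		return -1;
	np = &cur->nodes[cur->count++];		/* dummy node masks the parent's */
	np->name = name;
	np->flags = 0;
	return 0;
}
```

Discarding the subshell's tree on exit makes the parent's function
visible again, since the parent's own node was left intact.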
src/cmd/ksh93/sh/subshell.c:
- sh_subfuntree(): For 'unset -f' to work correctly with
shared-state command substitutions (subshares), this function
needs a fix similar to the one applied to sh_assignok() for
variables in commit 911d6b06. Walk up on the subshells tree until
we find a non-subshare.
- sh_subtracktree(): Apply the same fix for the hash table.
- Remove table_unset() and incorporate an updated version of its
code in sh_subshell(). As of ec888867, this function was only
used to clean up the subshell function table as the alias table
no longer exists.
- sh_subshell():
* Simplify the loop to free the subshell hash table.
* Add table_unset() code, slightly refactored for readability.
Treat dummy nodes now created by unall() separately to avoid a
memory leak; they must be nv_delete()d without passing the
NV_FUNCTION bits. For non-dummy nodes, turn on the NV_FUNCTION
attribute in case they were invalidated by unall(); this is
needed for _nv_unset() to free the function definition.
src/cmd/ksh93/tests/subshell.sh:
- Update the test for multiple levels of subshell functions to test
a subshare as well. While we're at it, add a very similar test
for multiple levels of subshell variables that was missing.
- Add @JohnoKing's reproducer from #228.
src/cmd/ksh93/tests/leaks.sh:
- Add leak tests for unsetting functions in a virtual subshell.
Test both the simple unset case (unall() creates a dummy node)
and the define/unset case (unall() invalidates existing node).
Resolves: https://github.com/ksh93/ksh/issues/228
Noteworthy changes:
- The man pages have been updated to fix a ton of instances of
runaway underlining (this was done with `sed -i 's/\\f5/\\f3/g'`
commands). This commit dramatically increased in size because
of this change.
- The documentation for spawnveg(3) has been extended with
information about its usage of posix_spawn(3) and vfork(2).
- The documentation for tmfmt(3) has been updated with the changes
previously made to the man pages for the printf and date builtins
(though the latter builtin is disabled by default).
- The shell's tracked alias tree (hash table) is now documented in
the shell(3) man page.
- Removed the commented out regression test for an ERRNO variable
as the COMPATIBILITY file states it was removed in ksh93.
The changes in this commit allow ksh to be built and run with
ASan[*], although for now it only works under vmalloc. Example
command to build ksh with ASan:
$ bin/package make CCFLAGS='-O0 -g -fsanitize=address'
[*] https://en.wikipedia.org/wiki/AddressSanitizer
src/cmd/INIT/mamake.c:
- Fix a few memory leaks in mamake. This doesn't fix all of the
memory leaks ASan complains about (there is one remaining in the
view() function), but it's enough to get ksh to build under ASan.
src/lib/libast/features/map.c,
src/lib/libast/misc/glob.c:
- Rename the ast globbing functions to _ast_glob() and
_ast_globfree(). Without this change the globbing tests fail
under ASan. See: 2c49eb6e
src/cmd/ksh93/sh/{init,io,nvtree,subshell}.c:
- Fix buffer overflows by using strncmp(3) instead of memcmp(3).
src/cmd/ksh93/sh/name.c:
- Fix another invalid usage of memcmp by using strncmp instead.
This change is also in one of Red Hat's patches:
https://git.centos.org/rpms/ksh/blob/c8s/f/SOURCES/ksh-20120801-nv_open-memcmp.patch
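The reason for preferring strncmp(3) can be shown with a small prefix
check. memcmp(3) is entitled to read all n bytes of both buffers, so
comparing a short string against a longer prefix can read past the
string's terminating NUL, which is what ASan reports; strncmp(3)
stops at the first NUL. The helper name demo_has_prefix() is
hypothetical and this is not code from the patch.

```c
#include <string.h>

/* Prefix check that never reads past the terminating NUL of 's'. */
static int demo_has_prefix(const char *s, const char *prefix)
{
	return strncmp(s, prefix, strlen(prefix)) == 0;
}
```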
Resolves: https://github.com/ksh93/ksh/issues/230
While automagically importing/exporting ksh variable attributes via
the environment is probably a misfeature in general (now disabled
for POSIX standard mode), doing so with the readonly attribute is
particularly problematic. Scripts can take into account the
possibility of importing unwanted attributes by unsetting or
typesetting variables before using them. But there is no way for a
script to get rid of an unwanted imported readonly variable. This
is a possible attack vector with no possible mitigation.
This commit blocks both the import and the export of the readonly
attribute through the environment. I consider it a security fix.
src/cmd/ksh93/sh/init.c: env_import_attributes():
- Clear NV_RDONLY from imported attributes before applying them.
src/cmd/ksh93/sh/name.c: sh_envgen():
- Remove NV_RDONLY from bitmask defining attributes to export.
Many of these changes are minor typo fixes. The other changes
(which are mostly compiler warning fixes) are:
NEWS:
- The --globcasedetect shell option works on older Linux kernels
when used with FAT32/VFAT file systems, so remove the note about
it only working with 5.2+ kernels.
src/cmd/ksh93/COMPATIBILITY:
- Update the documentation on function scoping with an addition
from ksh93v- (this does apply to ksh93u+).
src/cmd/ksh93/edit/emacs.c:
- Check for '_AST_ksh_release', not 'AST_ksh_release'.
src/cmd/INIT/mamake.c,
src/cmd/INIT/ratz.c,
src/cmd/INIT/release.c,
src/cmd/builtin/pty.c:
- Add more uses of UNREACHABLE() and noreturn, this time for the
build system and pty.
src/cmd/builtin/pty.c,
src/cmd/builtin/array.c,
src/cmd/ksh93/sh/name.c,
src/cmd/ksh93/sh/nvtype.c,
src/cmd/ksh93/sh/suid_exec.c:
- Fix six -Wunused-variable warnings (the name.c nv_arrayptr()
fixes are also in ksh93v-).
- Remove the unused 'tableval' function to fix a -Wunused-function
warning.
src/cmd/ksh93/sh/lex.c:
- Remove unused 'SHOPT_DOS' code, which isn't enabled anywhere.
https://github.com/att/ast/issues/272#issuecomment-354363112
src/cmd/ksh93/bltins/misc.c,
src/cmd/ksh93/bltins/trap.c,
src/cmd/ksh93/bltins/typeset.c:
- Add dictionary generator function declarations for former
aliases that are now builtins (re: 1fbbeaa1, ef1621c1, 3ba4900e).
- For consistency with the rest of the codebase, use '(void)'
instead of '()' for print_cpu_times.
src/cmd/ksh93/sh/init.c,
src/lib/libast/path/pathshell.c:
- Move the otherwise unused EXE macro to pathshell() and only
search for 'sh.exe' on Windows.
src/cmd/ksh93/sh/xec.c,
src/lib/libast/include/ast.h:
- Add an empty definition for inline when compiling with C89.
This allows the timeval_to_double() function to be inlined.
src/cmd/ksh93/include/shlex.h:
- Remove the unused 'PIPESYM2' macro.
src/cmd/ksh93/tests/pty.sh:
- Add '# err_exit #' to count the regression test added in
commit 113a9392.
src/lib/libast/disc/sfdcdio.c:
- Move diordwr, dioread, diowrite and dioexcept behind
'#ifdef F_DIOINFO' to fix one -Wunused-variable warning and
multiple -Wunused-function warnings (sfdcdio() only uses these
functions when F_DIOINFO is defined).
src/lib/libast/string/fmtdev.c:
- Fix two -Wimplicit-function-declaration warnings on Linux by
including sys/sysmacros.h in fmtdev().
The current version of 93u+m does not have proper support for the
LC_TIME variable. Setting LC_TIME has no effect on printf %T, and
if the locale is invalid no error message is shown:
$ LC_TIME=ja_JP.UTF-8
$ printf '%T\n' now
Wed Apr 7 15:18:13 PDT 2021
$ LC_TIME=invalid.locale
$ # No error message
src/cmd/ksh93/data/variables.c,
src/cmd/ksh93/include/variables.h,
src/cmd/ksh93/sh/init.c:
- Add support for the $LC_TIME variable. ksh93v- attempted to add
support for LC_TIME, but the patch from that version was extended
because the variable still didn't function correctly.
src/cmd/ksh93/tests/variables.sh:
- Add LC_TIME to the regression tests for LC_* variables.
This experiment, the initialisation of which was disabled with '#if
0', defines a bunch of integer type commands as special builtins.
Most are boring as they define variables just like normal integers:
pid_t, size_t, etc.
One is interesting: mode_t is a type that automatically converts
from octal permission bits (e.g. 755) to a mode string like
u+rwx,g+rw,o+rw. That's not a compelling enough use case to
permanently define a special and immutable builtin though.
stat_t is odd: it takes a file name as an argument and fills the
variable with stat information, but it is base64 encoded binary
data and there doesn't seem to be anything that can parse it.
Anyway, none of this is going to be enabled, so we should get rid
of it.
On Ubuntu arm7, two variables.sh regression tests crashed with a
bus error (SIGBUS) in init.c on line 720 while testing $LINENO:
707 static void put_lineno(Namval_t* np,const char *val,int flags,Namfun_t *fp)
708 {
709 register long n;
710 Shell_t *shp = sh_getinterp();
711 if(!val)
712 {
713 fp = nv_stack(np, NIL(Namfun_t*));
714 if(fp && !fp->nofree)
715 free((void*)fp);
716 _nv_unset(np,NV_RDONLY);
717 return;
718 }
719 if(flags&NV_INTEGER)
720 n = *(double*)val;
721 else
722 n = sh_arith(shp,val);
723 shp->st.firstline += nget_lineno(np,fp)+1-n;
724 }
Apparently, gcc on arm7 doesn't like the implicit typecast from
double to long.
Those three $LINENO discipline functions are generally a mess of
implicit typecasts between Sfdouble_t, double, long and int.
Line numbers are internally stored as int. The discipline functions
need to use Sfdouble_t for API compatibility.
src/cmd/ksh93/sh/init.c: nget_lineno(), put_lineno(), get_lineno():
- Get rid of unnecessary implicit typecasts by adjusting the types
of local variables.
- Make the typecasts that are done explicit.
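As a small illustration of what "explicit" means here (not the
actual discipline functions): the demo_* names are hypothetical, and
plain double is used instead of Sfdouble_t to keep the sketch
self-contained.

```c
/* Numeric assignments arrive as a pointer to double; line numbers are int. */
static int demo_lineno_from_value(const void *val)
{
	double d = *(const double *)val;	/* caller guarantees a double */

	return (int)d;				/* explicit narrowing conversion */
}

/* The API side wants a floating-point value back. */
static double demo_lineno_to_value(int lineno)
{
	return (double)lineno;			/* explicit widening conversion */
}
```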
Progresses: https://github.com/ksh93/ksh/issues/253
Simple reproducer:
set -A arr a b c d; : ${arr[1..2]}; unset arr[1]; echo ${arr[@]}
Output:
a
Expected output:
a c d
The ${arr[1..2]} expansion broke the subsequent 'unset' command
so that it unsets element 1 and on, instead of only 1.
This regression was introduced in nv_endsubscript() on 2009-07-31:
c47896b4/src/cmd/ksh93/sh/array.c
That change checks for the ARRAY_SCAN attribute which enables
processing ranges of array elements instead of single array
elements, and restores it after. That restore is evidently not
correct as it causes the subsequent unset command to malfunction.
If we revert that change, the bug disappears and the regression
tests show no failures. I don't know what this was meant to
accomplish or what other bug we might introduce by reverting it.
However, no corresponding regression test was added along with the
2009-07-31 change, nor is there any corresponding message in the
changelog. So this looks to be one of those mystery changes that
we'll never know the reason for.
Since we currently have proof that this change causes breakage and
no evidence that it fixes anything, I'll go ahead and revert it
(and add a regression test, of course). If that causes another
regression, hopefully someone will find it at some point.
src/cmd/ksh93/sh/array.c: nv_endsubscript():
- Revert the 2009-07-31 change that saves/restores the ARRAY_SCAN
attribute.
- Keep the 'ap' pointer as it is now used by newer code. Move the
declaration up to the beginning of the block, as is customary.
src/cmd/ksh93/sh/init.c:
- Cosmetic change: remove an unused array_scan() macro that I found
when grepping the code for ARRAY_SCAN. The macro was introduced
in version 2001-06-01 but the code that used it was replaced in
version 2001-07-04, without removing the macro itself.
Resolves: https://github.com/ksh93/ksh/issues/254
This commit adds an UNREACHABLE() macro that expands to either the
__builtin_unreachable() compiler builtin (for release builds) or
abort(3) (for development builds). This is used to mark code paths
that are never to be reached.
It also adds the 'noreturn' attribute to functions that never
return: path_exec(), sh_done() and sh_syntax(). The UNREACHABLE()
macro is not added after calling these.
The purpose of these is:
* to slightly improve GCC/Clang compiler optimizations;
* to fix a few compiler warnings;
* to add code clarity.
Changes of note:
src/cmd/ksh93/sh/io.c: outexcept():
- Avoid using __builtin_unreachable() here since errormsg can
return despite using ERROR_system(1), as shp->jmplist->mode is
temporarily set to 0. See: https://github.com/att/ast/issues/1336
src/cmd/ksh93/tests/io.sh:
- Add a regression test for the ksh2020 bug referenced above.
src/lib/libast/features/common:
- Detect the existence of either the C11 stdnoreturn.h header or
the GCC noreturn attribute, preferring the former when available.
- Test for the existence of __builtin_unreachable(). Use it for
release builds. On development builds, use abort() instead, which
crashes reliably for debugging when unreachable code is reached.
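A minimal sketch of the idea, under stated assumptions: DEMO_RELEASE
is a hypothetical switch standing in for the real release/development
distinction, the compiler check is reduced to __GNUC__ rather than a
proper feature test, and all demo_* names are invented for the
example.

```c
#include <stdlib.h>

#if defined(DEMO_RELEASE) && defined(__GNUC__)
#define DEMO_UNREACHABLE()	__builtin_unreachable()
#else
#define DEMO_UNREACHABLE()	abort()	/* crash loudly while developing */
#endif

#if defined(__GNUC__)
#define DEMO_NORETURN	__attribute__((noreturn))
#else
#define DEMO_NORETURN
#endif

enum demo_color { DEMO_RED, DEMO_GREEN };

/* A function that never returns to its caller. */
static DEMO_NORETURN void demo_fatal(int status)
{
	exit(status);
}

static const char *demo_name(enum demo_color c)
{
	switch (c)
	{
	case DEMO_RED:
		return "red";
	case DEMO_GREEN:
		return "green";
	}
	DEMO_UNREACHABLE();	/* all enumerators are handled above */
}

static int demo_parse_digit(char c)
{
	if (c < '0' || c > '9')
		demo_fatal(2);	/* noreturn call: no UNREACHABLE() needed after it */
	return c - '0';
}
```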
Co-authored-by: Martijn Dekker <martijn@inlv.org>
These are minor fixes I've accumulated over time. The following
changes are somewhat notable:
- Added a missing entry for 'typeset -s' to the man page.
- Add strftime(3) to the 'see also' section. This and the date(1)
addition are meant to add onto the documentation for 'printf %T'.
- Removed the man page entry for ksh reading $PWD/.profile on
login. That feature was removed in commit aa7713c2.
- Added date(1) to the 'see also' section of the man page.
- Note that the 'hash' command can be used instead of 'alias -t' to
work around one of the caveats listed in the man page.
- Use an 'out of memory' error message rather than 'out of space'
when memory allocation fails.
- Replaced backticks with quotes in some places for consistency.
- Added missing documentation for the %P date format.
- Added missing documentation for the printf %Q and %p formats
(backported from ksh2020: https://github.com/att/ast/pull/1032).
- The comments that show each builtin's options have been updated.
This commit fixes at least three bugs:
1. When issuing 'typeset -p' for unset variables typeset as short
integer, a value of 0 was incorrectly displayed.
2. ${x=y} and ${x:=y} were still broken for short integer types
(re: 9f2389ed). ${x+set} and ${x:+nonempty} were also broken.
3. A memory fault could occur if typeset -l followed a -s option
with integers. Additionally, the last -s/-l option given now wins
instead of the result always being short.
src/cmd/ksh93/include/name.h:
- Fix the nv_isnull() macro by removing the direct exclusion of
short integers from this set/unset test. This breaks few things
(only ${.sh.subshell} and ${.sh.level}, as far as we can tell)
while potentially correcting many aspects of short integer use
(at least bugs 1 and 2 above), as this macro is widely used.
- union Value: add new pid_t *pidp pointer member for PID values
(see further below).
src/cmd/ksh93/bltins/typeset.c: b_typeset():
- To fix bug 3 above, unset the 'shortint' flag and NV_SHORT
attribute bit upon encountering the -l option.
*** To fix ${.sh.subshell} to work with the new nv_isnull():
src/cmd/ksh93/include/defs.h:
- Add new 'realsubshell' member to the shgd (aka shp->gd) struct
which will be the integer value for ${.sh.subshell}.
src/cmd/ksh93/sh/init.c,
src/cmd/ksh93/data/variables.c:
- Initialize SH_SUBSHELLNOD as a pointer to shgd->realsubshell
instead of using a short value (.s) directly. Using a pointer
allows nv_isnull() to return a positive for ${.sh.subshell} as
a non-null pointer is what it checks for.
- While we're at it, initialize PPIDNOD ($PPID) and SH_PIDNOD
(${.sh.pid}) using the new pidp union member, which is more
correct as they are values of type pid_t.
src/cmd/ksh93/sh/subshell.c,
src/cmd/ksh93/sh/xec.c:
- Update the ${.sh.subshell} increases/decreases to refer to
shgd->realsubshell (a.k.a. shp->gd->realsubshell).
*** To fix ${.sh.level} after changing nv_isnull():
src/cmd/ksh93/sh/macro.c: varsub():
- Add a specific exception for SH_LEVLNOD to the nv_isnull() test,
so that ${.sh.level} is always considered to be set. Its handling
throughout the code is too complex/special for a simple fix, so
we have to special-case it, at least for now.
*** Regression test additions:
src/cmd/ksh93/tests/attributes.sh:
- Add in missing short integer tests and correct the one that
existed. The -si test now yields 'typeset -x -r -s -i foo'
instead of 'typeset -x -r -s -i foo=0' which brings it in line
with all the others.
- Add in some other -l attribute tests for floats. Note, -lX test
was not added as the size of long double is platform dependent.
src/cmd/ksh93/tests/variables.sh:
- Add tests for ${x=y} and ${x:=y} used on short int variables.
Co-authored-by: Martijn Dekker <martijn@inlv.org>
src/cmd/ksh93/Mamfile:
- regress.c: add missing SH_DICT define for getopt self-doc string,
needed after USAGE_LICENSE macros were removed. (re: ede47996)
src/cmd/ksh93/sh/init.c: sh_init():
- Do not set error_info.exit early in init. This is the function
that is called when an error exits the shell. It defaults to
exit(3). Setting it to sh_exit() early on can cause a crash if an
error is thrown before shell initialisation is fully finished.
So set it at the end of sh_init() instead.
- __regress__: Remove error_info.exit workaround. (re: 506bd2b2)
- Fix SHOPT_P_SUID directive. This is not actually a 0/1 value, so
we should use #ifdef and not #if. If SHOPT_REGRESS is on, it is
set to a function call. (re: 2182ecfa)
src/cmd/ksh93/SHOPT.sh:
- Document that SHOPT_P_SUID cannot be set to 0 to be turned off.
The referenced commit neglected to add checks for strdup() calls.
That function calls malloc() as well, and is used a lot.
This commit switches to another strategy: it adds wrapper functions
for all the allocation macros that check if the allocation
succeeded, so those checks don't need to be done manually.
src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/init.c:
- Add sh_malloc(), sh_realloc(), sh_calloc(), sh_strdup(),
sh_memdup() wrapper functions with success checks. Call nospace()
to error out if allocation fails.
- Update new_of() macro to use sh_malloc().
- Define new sh_newof() macro to replace newof(); it uses
sh_realloc().
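The wrapper strategy can be sketched as follows. This is a simplified
illustration with hypothetical demo_* names; the real wrappers call
nospace(), whereas the handler here just prints a message and exits
with a generic failure status.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the out-of-memory handler; never returns. */
static void demo_nospace(void)
{
	fputs("out of memory\n", stderr);
	exit(EXIT_FAILURE);
}

/* malloc() with a built-in success check. */
static void *demo_malloc(size_t size)
{
	void *p = malloc(size ? size : 1);	/* avoid a NULL result for size 0 */

	if (!p)
		demo_nospace();
	return p;
}

/* strdup() equivalent built on the checked allocator. */
static char *demo_strdup(const char *s)
{
	size_t n = strlen(s) + 1;

	return memcpy(demo_malloc(n), s, n);
}
```

Callers can then use the result directly, without a NULL check at
every call site.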
All other changed files:
- Replace the relevant calls with the wrappers.
- Remove now-redundant success checks from 18529b88.
- The ERROR_PANIC error message calls are updated to inclusive-or
ERROR_SYSTEM into the exit code argument, so libast's error()
appends the human-readable version of errno in square brackets.
See src/lib/libast/man/error.3
src/cmd/ksh93/edit/history.c:
- Include "defs.h" to get access to the wrappers even if KSHELL is
not defined.
- Since we're here, fix a compile error that occurred with KSHELL
undefined by updating the type definition of hist_fname[] to
match that of history.h.
src/cmd/ksh93/bltins/enum.c:
- To get access to sh_newof(), include "defs.h" instead of
<shell.h> (note that "defs.h" includes <shell.h> itself).
src/cmd/ksh93/Mamfile:
- enum.c: depend on defs.h instead of shell.h.
- enum.o: add an -I. flag in the compiler invocation so that defs.h
can find its subsequent includes.
src/cmd/builtin/pty.c:
- Define one outofmemory() function and call that instead of
repeating the error message call.
- outofmemory() never returns, so remove superfluous exit handling.
Co-authored-by: Martijn Dekker <martijn@inlv.org>
Most of these changes remove unused variables, functions and labels
to fix -Wunused compiler warnings. Somewhat notable changes:
src/cmd/ksh93/bltins/print.c:
- Removed the unused 'neg' variable.
Patch from ksh2020: https://github.com/att/ast/pull/725
src/cmd/ksh93/bltins/sleep.c:
- Initialized ns to fix three -Wsometimes-uninitialized warnings.
src/cmd/ksh93/edit/{emacs,vi}.c:
- Adjust strncpy size to fix two -Wstringop-truncation warnings.
src/cmd/ksh93/include/shell.h:
- The NOT_USED macro caused many -Wunused-value warnings,
so it has been replaced with ksh2020's macro:
19d0620a
src/cmd/ksh93/sh/expand.c:
- Removed an unnecessary 'ap = ' since 'ap' is never read
between stakseek and stakfreeze.
src/cmd/ksh93/edit/vi.c: refresh():
- Undef this function's 'w' macro at the end of it to stop it
potentially interfering with future code changes.
src/cmd/ksh93/sh/nvdisc.c,
src/lib/libast/misc/magic.c,
src/lib/libast/regex/regsubexec.c,
src/lib/libast/sfio/sfpool.c,
src/lib/libast/vmalloc/vmbest.c:
- Fixed some indentation to silence -Wmisleading-indentation
warnings.
src/lib/libast/include/ast.h:
- For clang, now only suppress hundreds of -Wparentheses warnings
as well as a few -Wstring-plus-int warnings.
Clang's -Wparentheses warns about things like
if(foo = bar())
which assigns to foo and checks the assigned value.
Clang wants us to change this into
if((foo = bar()))
Clang's -Wstring-plus-int warns about things like
"string"+x
where x is an integer, e.g. "string"+3 represents the string
"ing". Clang wants us to change that to
"string"[3]
The original versions represent a perfectly valid coding style
that was common in the 1980s and 1990s and is not going to change
in this historic code base. (gcc does not complain about these.)
Co-authored-by: Martijn Dekker <martijn@inlv.org>
Huge typeset -L/-R adjustment length values were still causing
crashes on systems with not enough memory. They should error out
gracefully instead of crashing.
This commit adds out of memory checks to all malloc/calloc/realloc
calls that didn't have them (which is all but two or three).
The stkalloc/stakalloc calls don't need the checks; they have
automatic checking, which is done by passing a pointer to the
outofspace() function to the stakinstall() call in init.c.
src/lib/libast/include/error.h:
- Change the ERROR_PANIC exit status value from ERROR_LEVEL (255)
to 77, which is what it is supposed to be according to the libast
error.3 manual page. Exit statuses > 128 for anything other than
signals are not POSIX compliant and may cause misbehaviour.
src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/init.c:
- To facilitate consistency, add a simple extern sh_outofmemory()
function that throws an ERROR_PANIC "out of memory".
src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/data/builtins.c:
- Remove now-redundant e_nospace[] extern message; it is now only
used in one place so it might as well be a string literal in
sh_outofmemory().
All other changed files:
- Verify the result of all malloc/calloc/realloc calls and call
sh_outofmemory() if they fail.
This fixes the following:
1. 'set --posix' now works as an equivalent of 'set -o posix'.
2. The posix option turns off braceexpand and turns on letoctal.
Any attempt to override that in a single command such as 'set -o
posix +o letoctal' was quietly ignored. This now works as long
as the overriding option follows the posix option in the command.
3. The --default option to 'set' now stops the 'posix' option, if
set or unset in the same 'set' command, from changing other
options. This allows the command output by 'set +o' to correctly
restore the current options.
src/cmd/ksh93/data/builtins.c:
- To make 'set --posix' work, we must explicitly list it in
sh_set[] as a supported option so that AST optget(3) recognises
it and won't override it with its own default --posix option,
which converts the optget(3) string to a POSIX getopt(3) string.
This means it will appear as a separate entry in --man output,
whether we want it to or not. So we might as well use it as an
example to document how --optionname == -o optionname, replacing
the original documentation that was part of the '-o' description.
src/cmd/ksh93/sh/args.c: sh_argopts():
- Add handling for explicit --posix option in data/builtins.c.
- Move SH_POSIX syncing SH_BRACEEXPAND and SH_LETOCTAL from
sh_applyopts() into the option parsing loop here. This fixes
the bug that letoctal was ignored in 'set -o posix +o letoctal'.
- Remember if --default was used in a flag, and do not sync options
with SH_POSIX if the flag is set. This makes 'set +o' work.
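The effect of syncing inside the parsing loop can be modelled with a
toy option parser. This is a sketch only: the demo_* names and bit
values are hypothetical, the reset that --default itself performs is
not modelled, and it is an assumption that turning posix off applies
the reverse settings.

```c
#include <stddef.h>

enum
{
	DEMO_POSIX = 1 << 0,
	DEMO_LETOCTAL = 1 << 1,
	DEMO_BRACEEXPAND = 1 << 2
};

enum demo_kind { DEMO_SET_OPTION, DEMO_DEFAULT };

struct demo_arg
{
	enum demo_kind	kind;
	unsigned int	option;	/* option bit (ignored for DEMO_DEFAULT) */
	int		on;	/* 1 for -o, 0 for +o */
};

/* Apply the arguments in order and return the resulting option set. */
static unsigned int demo_apply(unsigned int opts, const struct demo_arg *args, size_t n)
{
	int defaulted = 0;	/* remembers whether --default was seen */
	size_t i;

	for (i = 0; i < n; i++)
	{
		if (args[i].kind == DEMO_DEFAULT)
		{
			defaulted = 1;
			continue;
		}
		if (args[i].on)
			opts |= args[i].option;
		else
			opts &= ~args[i].option;
		/* Sync here, in the loop, so later arguments can override it. */
		if (args[i].option == DEMO_POSIX && !defaulted)
		{
			if (args[i].on)
				opts = (opts | DEMO_LETOCTAL) & ~(unsigned int)DEMO_BRACEEXPAND;
			else	/* assumed reverse behaviour */
				opts = (opts | DEMO_BRACEEXPAND) & ~(unsigned int)DEMO_LETOCTAL;
		}
	}
	return opts;
}
```

If the sync ran only after the loop, it would overwrite whatever a
later '+o letoctal' had set, which is the bug described above.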
src/cmd/ksh93/include/argnod.h,
src/cmd/ksh93/data/msg.c,
src/cmd/ksh93/sh/args.c: sh_printopts():
- Do not potentially translate the 'on' and 'off' labels in 'set
-o' output. No other shell does, and some scripts parse these.
src/cmd/ksh93/sh/init.c: sh_init():
- Turn on SH_LETOCTAL early along with SH_POSIX if the shell was
invoked as sh; this makes 'sh -o' and 'sh +o' show expected
options (not that anyone does this, but correctness is good).
src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/include/shell.h:
- The state flags were in defs.h and most (but not all) of the
shell options were in shell.h. Gather all the shell state and
option flag definitions into one place in shell.h for clarity.
- Remove unused SH_NOPROFILE and SH_XARGS option flags.
src/cmd/ksh93/tests/options.sh:
- Add tests for these bugs.
src/lib/libast/misc/optget.c: styles[]:
- Edit default optget(3) option self-documentation for clarity.
Several changed files:
- Some SHOPT_PFSH fixes to avoid compiling dead code.
Many compile-time options were broken so that they could not be
turned off without causing compile errors and/or regression test
failures. This commit now allows the following to be disabled:
SHOPT_2DMATCH # two dimensional ${.sh.match} for ${var//pat/str}
SHOPT_BGX # one SIGCHLD trap per completed job
SHOPT_BRACEPAT # C-shell {...,...} expansions (, required)
SHOPT_ESH # emacs/gmacs edit mode
SHOPT_HISTEXPAND # csh-style history file expansions
SHOPT_MULTIBYTE # multibyte character handling
SHOPT_NAMESPACE # allow namespaces
SHOPT_STATS # add .sh.stats variable
SHOPT_VSH # vi edit mode
The following still break ksh when disabled:
SHOPT_FIXEDARRAY # fixed dimension indexed array
SHOPT_RAWONLY # make viraw the only vi mode
SHOPT_TYPEDEF # enable typeset type definitions
Compiling without SHOPT_RAWONLY just gives four regression test
failures in pty.sh, but turning off SHOPT_FIXEDARRAY and
SHOPT_TYPEDEF causes compilation to fail. I've managed to tweak the
code to make it compile without those two options, but then dozens
of regression test failures occur, often in things that have nothing
directly to do with those options. It looks like the separation between the
code for these options and the rest was never properly maintained.
Making it possible to disable SHOPT_FIXEDARRAY and SHOPT_TYPEDEF
may involve major refactoring and testing and may not be worth it.
This commit has far too many tweaks to list. Notable fixes are:
src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/data/options.c:
- Do not compile in the shell options and documentation for
disabled features (braceexpand, emacs/gmacs, vi/viraw), so the
shell is not left with no-op options and inaccurate self-doc.
src/cmd/ksh93/data/lexstates.c:
- Comment the state tables to associate them with their IDs.
- In the ST_MACRO table (sh_lexstate9[]), do not make the S_BRACE
state for position 123 (ASCII for '{') conditional upon
SHOPT_BRACEPAT (brace expansion), otherwise disabling this causes
glob patterns of the form {3}(x) (matching 3 x'es) to stop
working as well -- and that is ksh globbing, not brace expansion.
src/cmd/ksh93/edit/edit.c: ed_read():
- Fixed a bug: SIGWINCH was not handled by the gmacs edit mode.
src/cmd/ksh93/sh/name.c: nv_putval():
- The -L/-R left/right adjustment options to typeset do not count
zero-width characters. This is the behaviour with SHOPT_MULTIBYTE
enabled, regardless of locale. Of course, what a zero-width
character is depends on the locale, but control characters are
always considered zero-width. So, to avoid a regression, add some
fallback code for non-SHOPT_MULTIBYTE builds that skips ASCII
control characters (as per iscntrl(3)) so they are still
considered to have zero width.
src/cmd/ksh93/tests/shtests:
- Export the SHOPT_* macros from SHOPT.sh to the tests as
environment variables, so the tests can check for them and decide
whether or how to run tests based on the compile-time options
that the tested binary was presumably compiled with.
- Do not run the C.UTF-8 tests if SHOPT_MULTIBYTE is not enabled.
src/cmd/ksh93/tests/*.sh:
- Add a bunch of checks for SHOPT_* env vars. Since most should
have a value 0 (off) or 1 (on), the form ((SHOPT_FOO)) is a
convenient way to use them as arithmetic booleans.
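A rough sketch of such a check follows. The values assigned here are
stand-ins for whatever shtests would export, and shopt_enabled is a
hypothetical helper, not something in the test suite. In ksh itself the
check can be written directly as ((SHOPT_MULTIBYTE)); the POSIX
arithmetic expansion used below is a portable equivalent.

```shell
# Stand-in values; the real ones would be exported by shtests from SHOPT.sh.
SHOPT_MULTIBYTE=1
SHOPT_BRACEPAT=0

# Succeed if the given 0/1 value is non-zero when read as arithmetic.
shopt_enabled() {
    [ "$(($1))" -ne 0 ]
}

ran=
if shopt_enabled "$SHOPT_MULTIBYTE"; then ran="$ran multibyte"; fi
if shopt_enabled "$SHOPT_BRACEPAT"; then ran="$ran bracepat"; fi

printf '%s\n' "test groups to run:$ran"
```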
.github/workflows/ci.yml:
- Make GitHub do more testing: run two locale tests (Dutch and
Japanese UTF-8 locales), then disable all the SHOPTs that we can
currently disable, recompile ksh, and run the tests again.
This fixes the function that sets ${.sh.match}. Patch from OpenSUSE:
https://build.opensuse.org/package/view_file/shells/ksh/ksh93-limit-name-len.dif
src/cmd/ksh93/sh/init.c: sh_setmatch():
- Fix node size calculation, possibly preventing data corruption.
src/cmd/ksh93/include/ulimit.h: Limit_t:
- Defining the 'name' struct member as 'char name[16]' makes
no sense as the name is being initialised statically in
data/limits.c; just make it a 'char *name' pointer.
This backports most of the Cdt (container data types) mechanism
from the ksh 93v- beta, based on ground work done by OpenSUSE:
https://build.opensuse.org/package/view_file/shells/ksh/ksh93-dttree-crash.dif
plus adaptations to match ksh 93u+m and an updated manual page
(src/lib/libast/man/cdt.3) added directly from the 93v- sources.
| Thu Dec 20 12:48:02 UTC 2012 - werner@suse.de
|
| - Add ksh93-dttree-crash.dif - Allow empty strings in (dt)trees
| (bnc#795324)
|
| Fri Oct 25 14:07:57 UTC 2013 - werner@suse.de
|
| - Rework patch ksh93-dttree-crash.dif
As usual, precious little information is available because the
OpenSUSE bug report is currently closed to the public:
https://bugzilla.opensuse.org/show_bug.cgi?id=795324
However, a cursory inspection suggests that this code contains
improvements to do with concurrent processing and related
robustness. The new cdt.3 manual page adds a lot about that.
This has been in production use on OpenSUSE for a long time,
so hopefully this will make ksh a little more stable again.
Only one way to find out: let's commit and test this...
BTW, to get a nice manual, use groff and ghostscript's ps2pdf:
$ groff -tman src/lib/libast/man/cdt.3 | ps2pdf - cdt.3.pdf
This applies a patch from Solaris:
https://github.com/oracle/solaris-userland/blob/master/components/ksh93/patches/160-CR7175995.patch
There is no public information on why it's needed, but it seems
sensible on the face of it. Using a file called '.profile' in the
PWD on login, without a directory path, is redundant at best, since
"$HOME/.profile" (e_profile, see data/msg.c) is already used. And
if the PWD is not $HOME at login time, it seems to me there are
serious problems and the last thing you want is to read some
random and probably dodgy '.profile' from the PWD.
src/cmd/ksh93/sh/init.c: sh_init(): login_files[]:
- Remove redundant/problematic ".profile" entry.
$KSH_VERSION is initialised as a nameref to ${.sh.version}, but it
was not reliable as it could be overridden from the environment.
Some scripts do version checking so this would allow influencing
their execution.
This fix is inspired by the following Solaris patch:
https://github.com/oracle/solaris-userland/blob/master/components/ksh93/patches/200-17435456.patch
but a different approach was needed, because the code has changed
(see 960a1a99).
src/cmd/ksh93/sh/init.c: env_init():
- Refuse to import $KSH_VERSION. Using strncmp(3) might be crude,
but it's effective and I can't figure out another way.
This change is pulled from here:
https://github.com/oracle/solaris-userland/blob/master/components/ksh93/patches/280-23332860.patch
Info and reproducers:
https://github.com/att/ast/issues/36
In a -c script (like ksh -c 'commands'), the last command
misredirects standard output if an EXIT or ERR trap is set.
This appears to be a side effect of the optimisation that
runs the last command without forking.
This applies a patch by George Lijo that flags these specific
cases and disables the optimisation.
src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/bltins/trap.c,
src/cmd/ksh93/sh/init.c,
src/cmd/ksh93/sh/main.c,
src/cmd/ksh93/sh/xec.c:
- Apply patch as above.
src/cmd/ksh93/tests/io.sh:
- Add the reproducers from the bug report as regression tests.
The forking fix implemented in 102868f8 and 9d428f8f, which stops
the main shell's hash table from being cleared if PATH is changed
in a subshell, can cause a significant performance penalty for
certain scripts that do something like
( PATH=... command foo )
in a subshell, especially if done repeatedly. This is because the
hash table is cleared (and hence a subshell forks) even for
temporary PATH assignments preceding commands.
It also just plain doesn't work. For instance:
$ hash -r; (ls) >/dev/null; hash
ls=/bin/ls
Simply running an external command in a subshell caches the path in
the hash table that is shared with a main shell. To remedy this, we
would have to fork the subshell before forking any external
command. And that would be an unacceptable performance regression.
Virtual subshells do not need to fork when changing PATH if they
get their own hash tables. This commit adds these. The code for
alias subshell trees (which was removed in ec888867 because they
were broken and unneeded) provided the beginning of a template for
their implementation.
src/cmd/ksh93/sh/subshell.c:
- struct subshell: Add strack pointer to subshell hash table.
- Add sh_subtracktree(): return pointer to subshell hash table.
- sh_subfuntree(): Refactor a bit for legibility.
- sh_subshell(): Add code for cleaning up subshell hash table.
src/cmd/ksh93/sh/name.c:
- nv_putval(): Remove code to fork a subshell upon resetting PATH.
- nv_rehash(): When in a subshell, invalidate a hash table entry
for a subshell by creating the subshell scope if needed, then
giving that entry the NV_NOALIAS attribute to invalidate it.
src/cmd/ksh93/sh/path.c: path_search():
- To set a tracked alias/hash table entry, use sh_subtracktree()
and pass the HASH_NOSCOPE flag to nv_search() so that any new
entries are added to the current subshell table (if any) and do
not influence any parent scopes.
src/cmd/ksh93/bltins/typeset.c: b_alias():
- b_alias(): For hash table entries, use sh_subtracktree() instead
of forking a subshell. Keep forking for normal aliases.
- setall(): To set a tracked alias/hash table entry, pass the
HASH_NOSCOPE flag to nv_search() so that any new entries are
added to the current subshell table (if any) and do not influence
any parent scopes.
src/cmd/ksh93/sh/init.c: put_restricted():
- Update code for clearing the hash table (when changing $PATH) to
use sh_subtracktree().
src/cmd/ksh93/bltins/cd_pwd.c:
- When invalidating path name bindings to relative paths, use the
subshell hash tree if applicable by calling sh_subtracktree().
- rehash(): Call nv_rehash() instead of _nv_unset()ting the hash
table entry; this is needed to work correctly in subshells.
src/cmd/ksh93/tests/leaks.sh:
- Add leak tests for various PATH-related operations in the main
shell and in a virtual subshell.
- Several pre-existing memory leaks are exposed by the new tests
(I've confirmed these in 93u+). The tests are disabled and marked
TODO for now, as these bugs have not yet been fixed.
src/cmd/ksh93/tests/subshell.sh:
- Update.
Resolves: https://github.com/ksh93/ksh/issues/66
The SHOPT_2DMATCH code block in sh_setmatch() modifies the 'ap'
pointer, which is initialised as nv_arrayptr(SH_MATCHNOD). This
caused a (rarely occurring) segfault in the following line near the
end of the function:
ap->nelem -= x;
as this line assumed that 'ap' still had the initial value.
src/cmd/ksh93/sh/init.c: sh_setmatch():
- On init, save ap in ap_save and use ap_save instead of ap where
it should be pointing to SH_MATCHNOD. This also allows removing
two redundant nv_arrayptr(SH_MATCHNOD) calls, slightly increasing
the efficiency of this function.
As of this commit, ksh 93u+m has a standard semantic version number
<https://semver.org/>, beginning with 1.0.0-alpha. This is added to
the version string in a way that should be compatible with scripts
parsing ${.sh.version} or $(ksh --version). This addition does not
replace the release date and does not affect $((.sh.version)).
For non-release builds, the version string will be:
FORK/VERSION+HASH YYYY-MM-DD
e.g.: 93u+m/1.0.0-alpha+41ef7f76 2021-01-03
For release builds, it will be:
FORK/VERSION YYYY-MM-DD
e.g.: 93u+m/1.0.0 2021-01-03
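For scripts that want to take such a string apart, the following is a
minimal sketch using only POSIX parameter expansion. The function and
variable names are hypothetical, and it handles only the
"FORK/VERSION[+HASH] YYYY-MM-DD" portion shown above, not any text that
may precede it in the full ${.sh.version} string. Note that the fork
name itself contains a '+', so the slash has to be split off first.

```shell
# Split "FORK/VERSION[+HASH] YYYY-MM-DD" into its parts.
# Sets the (hypothetical) variables v_fork, v_semver, v_hash, v_date.
parse_version() {
    v_date=${1##* }            # text after the last space
    v_id=${1% *}               # FORK/VERSION[+HASH]
    v_fork=${v_id%%/*}         # before the first slash; may contain '+'
    v_semver=${v_id#*/}        # after the first slash
    case $v_semver in
    *+*) v_hash=${v_semver##*+}    # non-release build: commit hash
         v_semver=${v_semver%+*} ;;
    *)   v_hash= ;;                # release build: no hash
    esac
}

parse_version '93u+m/1.0.0-alpha+41ef7f76 2021-01-03'
printf '%s\n' "$v_fork | $v_semver | $v_hash | $v_date"
```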
It is now automatically decided by bin/package whether to build a
release or development build. When building from a directory that
is not a git repository, or if the current git branch name starts
with a number (e.g. '1.0'), the release build is enabled; otherwise
a development build is the default. This is arranged by adding -D
flags to $CCFLAGS as described below. These flags are prepended to
$CCFLAGS, so they can be overridden by adding your own -D or -U
flags via the environment.
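The decision rule can be sketched roughly as follows. This is not the
code in bin/package; build_type and add_build_flag are hypothetical
helpers, an empty branch name stands for "not in a git repository", and
the exact spelling and quoting of the -D flags is an assumption made
for illustration.

```shell
# Decide the build type from a branch name ('' = no git repository).
build_type() {
    case $1 in
    '' | [0-9]*) printf '%s\n' release ;;
    *)           printf '%s\n' development ;;
    esac
}

# Prepend the build flag so that flags already in CCFLAGS come later
# on the command line and can therefore override it.
add_build_flag() {   # usage: add_build_flag BRANCH COMMIT_HASH CCFLAGS
    if [ "$(build_type "$1")" = release ]; then
        printf '%s\n' "-D_AST_ksh_release $3"
    else
        # Keep only the first 8 characters of the commit hash.
        printf '%s\n' "-D_AST_git_commit=$(printf '%.8s' "$2") $3"
    fi
}

add_build_flag 1.0 0123456789abcdef '-O2'
add_build_flag master 0123456789abcdef '-O2'
```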
In addition, AST vmalloc is disabled for release builds as of this
commit, forcing the use of the OS's standard malloc(3). In 2021,
this is generally more reliable, faster, and more economical with
memory than AST vmalloc. Several memory leaks and crashing bugs are
avoided, e.g.: <https://github.com/ksh93/ksh/issues/95>.
For development builds, vmalloc stays enabled (along with its known
bugs) because this allows the use of the vmstat builtin, making it
much more efficient to test for memory leaks. For more info, see
the regression test script: src/cmd/ksh93/tests/leaks.sh
bin/package, src/cmd/INIT/package.sh:
- Add flags for build type. In $CCFLAGS, define _AST_ksh_release if
we're not on any git branch or on a git branch whose name starts
with a number. Otherwise, define _AST_git_commit as the first 8
characters of the current git commit hash.
src/lib/libast/features/vmalloc:
- If _AST_ksh_release is defined, disable vmalloc and force use of
the operating system's malloc. Discussion:
https://github.com/ksh93/ksh/issues/95
src/cmd/ksh93/include/version.h:
- Define new format for version string, adding a semantic version
number as well as (for non-release builds) the git commit hash.
src/cmd/ksh93/sh/init.c: e_version[]:
- Add a 'v' to the ${.sh.version} feature string if ksh was
compiled with vmalloc enabled. This allows scripts, such as
regression tests, to detect this.
src/cmd/ksh93/data/builtins.c: sh_optksh[]:
- Add a copyright line crediting the contributors to ksh 93u+m.
Resolves: https://github.com/ksh93/ksh/issues/95
When starting ksh +s, it gets stuck in an infinite loop continually
trying to parse its own binary as a shell script and rejecting it:
$ arch/linux.i386-64/bin/ksh +s
arch/linux.i386-64/bin/ksh: arch/linux.i386-64/bin/ksh: cannot execute [Exec format error]
arch/linux.i386-64/bin/ksh: arch/linux.i386-64/bin/ksh: cannot execute [Exec format error]
arch/linux.i386-64/bin/ksh: arch/linux.i386-64/bin/ksh: cannot execute [Exec format error]
arch/linux.i386-64/bin/ksh: arch/linux.i386-64/bin/ksh: cannot execute [Exec format error]
arch/linux.i386-64/bin/ksh: arch/linux.i386-64/bin/ksh: cannot execute [Exec format error]
[...]
$ echo 'echo "this is stdin"' | arch/linux.i386-64/bin/ksh +s
arch/linux.i386-64/bin/ksh: arch/linux.i386-64/bin/ksh: cannot execute [Exec format error]
(no loop, but still ksh trying to parse itself)
src/cmd/ksh93/sh/init.c: sh_init():
- When forcing on the '-s' option upon finding no command
arguments, also update sh.offoptions, a.k.a. shp->offoptions.
This avoids the inconsistent state causing this problem.
In main.c, there is:
if(sh_isoption(SH_SFLAG))
fdin = 0;
else
(code to open $0 as a file)
This was entering the else block because sh_isoption(SH_SFLAG)
was returning 0, and $0 is set to the ksh binary as it is
supposed to be when no other script is provided. When I looked for
why sh_isoption was returning 0, I found main.c's
for(i=0; i<elementsof(shp->offoptions.v); i++)
shp->options.v[i] &= ~shp->offoptions.v[i];
Before this loop, shp->offoptions tracks which options were
explicitly disabled by the user on the command line. The effect
of this loop is to make "explicitly disabled" take precedence
over "implicitly enabled". My patch removes the registration of
the +s option.
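The interaction between the two bit sets can be modelled with shell
arithmetic. This is only a sketch: SH_SFLAG is given an arbitrary bit
value, a single integer stands in for the option arrays, and
apply_offoptions is a hypothetical stand-in for the loop in main.c.

```shell
SH_SFLAG=4          # arbitrary bit value standing in for the -s option

# Single-word equivalent of the loop in main.c: every explicitly
# disabled option is cleared from the active options.
apply_offoptions() {   # usage: apply_offoptions OPTIONS OFFOPTIONS
    printf '%s\n' "$(( $1 & ~$2 ))"
}

options=0
offoptions=$SH_SFLAG                      # the user passed +s

# No command arguments: -s is forced on, but offoptions is left alone.
options=$(( options | SH_SFLAG ))
before_fix=$(apply_offoptions "$options" "$offoptions")

# With the fix, forcing -s on also removes it from offoptions.
offoptions=$(( offoptions & ~SH_SFLAG ))
after_fix=$(apply_offoptions "$options" "$offoptions")

printf '%s\n' "before fix: $before_fix, after fix: $after_fix"
```

Before the fix the masking step clears the forced -s bit again, which
is why sh_isoption(SH_SFLAG) returned 0; after it, the bit survives.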
Fixes: https://github.com/ksh93/ksh/issues/150
Co-authored-by: Martijn Dekker <martijn@inlv.org>
The undocumented alarm builtin executes actions unsafely so that
'read' with an IFS assignment crashed when an alarm was triggered.
This applies an edited version of a Red Hat patch:
642af4d6/f/ksh-20120801-alarmifs.patch
Prior discussion:
https://bugzilla.redhat.com/1176670
src/cmd/ksh93/bltins/alarm.c:
- Add a TODO note based on dgk's 2014 email cited in the RH bug.
- When executing the trap function, save and restore the IFS table.
src/cmd/ksh93/sh/init.c: get_ifs():
- Remove now-unnecessary SHOPT_MULTIBYTE preprocessor directives as
8477d2ce lets the compiler optimise out multibyte code if needed.
- Initialise the 0 position of the IFS table to S_EOF. This
corresponds with the static state tables in data/lexstates.c.
src/cmd/ksh93/tests/builtins.sh:
- Crash test.
This imports a new version of the code to import environment
variable values that was sent to Red Hat from upstream in 2014.
It avoids importing environment variables whose names are not valid
in the shell language, as it would be impossible to change or unset
them. However, they stay in the environment to be passed to child
processes.
Prior discussion: https://bugzilla.redhat.com/1147645
Original patch: 642af4d6/f/ksh-20120801-oldenvinit.patch
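To make the rule concrete, here is a small sketch of the name check in
portable shell. It is not the nv_open() logic; is_valid_name is a
hypothetical helper that accepts only plain identifiers (a letter or
underscore followed by letters, digits, or underscores), which is an
assumption about what counts as valid here.

```shell
# Return success if $1 is a plain identifier.
is_valid_name() {
    case $1 in
    '' | [0-9]* | *[!A-Za-z0-9_]*) return 1 ;;
    esac
    return 0
}

# Decide, for each NAME=value entry, whether the shell would import it.
# Skipped entries would still be passed on to child processes.
for entry in 'HOME=/root' 'foo-bar=1' '1abc=2' '_ok=3'; do
    name=${entry%%=*}
    if is_valid_name "$name"; then
        printf '%s\n' "import: $name"
    else
        printf '%s\n' "skip:   $name"
    fi
done
```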
src/cmd/ksh93/sh/init.c:
- env_init(): Import new, simplified code to import environment
variable name/value pairs. Instead of doing the heavy lifting
itself, this version uses nv_open(), passing the NV_IDENT flag to
reject and skip invalid names.
- Get rid of gotos and a static var by splitting off the code to
import attributes into a new env_import_attributes() function.
This is a better way to avoid importing attributes when
initialising the shell in POSIX mode (re: 00d43960).
- Remove an nv_mapchar() call that was based on some unclear
flaggery which was also removed by upstream as sent to Red Hat.
I don't know what that did, if anything; looks like it might have
had something to do with typeset -u/-l, but those particular
attributes have never been successfully inherited through the
environment.
(Maybe that's another bug, or maybe I just don't care as
inheriting attributes is a misfeature anyway; we have to put up
with it because legacy scripts might use it. Maybe someone can
prove it's an unacceptable security risk to import attributes
like readonly from an environment variable that is inherently
vulnerable to manipulation. That would be nice, as a CVE ID
would give us a solid reason to get rid of this nonsense.)
- Remove an 'else cp += 2;' that was very clearly a no-op; 'cp' is
immediately overwritten on the next loop iteration and not used
past the loop.
src/cmd/ksh93/tests/variables.sh:
- Test.
When using typeset -l or -u on a variable that cannot be changed
when the shell is in restricted mode, ksh crashed.
This fix is inspired by this Red Hat fix, which is incomplete:
642af4d6/f/ksh-20120801-tpstl.patch
The crash was caused by the nv_shell() function. It walks through a
discipline function tree to get the pointer to the interpreter
associated with it. Evidently, the problem is that some pointer in
that walk is not set correctly for all special variables.
Thing is, ksh only has one shell language interpreter, and only one
global data structure (called 'sh') to keep its main state[*]. Yet,
the code is full of 'shp' pointers to that structure. Most (not
all) functions pass that pointer around to each other, accessing
that struct indirectly, ostensibly to account for the non-existent
possibility that there might be more than one interpreter state.
The "why" of that is an interesting cause for speculation that I
may get to sometime. For now, it is enough to know that, in the
code as it is, it matters not one iota what pointer to the shell
interpreter state is used; they all point to the same thing (unless
it's broken, as in this bug).
So, rather than fixing nv_shell() and/or associated pointer
assignments, this commit simply removes it, and replaces it with
calls to sh_getinterp(), which always returns a pointer to sh (see
init.c, where that function is defined as literally 'return &sh').
[*] Defined in shell.h, with the _SH_PRIVATE part in defs.h
src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/name.c:
- Remove nv_shell().
src/cmd/ksh93/sh/init.c:
- In all the discipline functions for special variables, initialise
shp using sh_getinterp() instead of nv_shell().
src/cmd/ksh93/tests/variables.sh:
- Add regression test for typeset -l/-u on all special variables.
Now that we have ${.sh.pid} a.k.a. shgd->current_pid, which is
updated using getpid() whenever forking a new process, there is no
need for anything else to ever call getpid(); we can use the stored
value instead. There were a lot of these syscalls kicking around,
some of them in performance-sensitive places.
The following lists only changes *other* than changing getpid() to
shgd->current_pid.
src/cmd/ksh93/include/defs.h:
- Comments: clarify what shgd->{pid,ppid,current_pid} are for.
src/cmd/ksh93/sh/main.c,
src/cmd/ksh93/sh/init.c:
- On reinit for a new script, update shgd->{pid,ppid,current_pid}
in the sh_reinit() function itself instead of calling sh_reinit()
from sh_main() and then updating those immediately after that
call. It just makes more sense this way. Nothing else ever calls
sh_reinit() so there are no side effects.
src/cmd/ksh93/sh/xec.c: _sh_fork():
- Update shgd->current_pid in the child early, so that the rest of
the function can use it instead of calling getpid() again.
- Remove reassignment of SH_PIDNOD->nvalue.lp value pointer to
shgd->current_pid (which makes ${.sh.pid} work in the shell).
It's constant and was already set on init.
Following a community discussion, it became clear that 'r' is
particularly problematic as a regular builtin, as the name can and
does conflict with at least one legit external command by that
name. There was a consensus against removing it altogether and
letting users set the alias in their login scripts. However,
aliases are easier to bypass, remove or rename than builtins are.
My compromise is to reinstate 'r' as a preset alias on interactive
shells only, along with 'history', as was done in 17f81ebe before
they were converted to builtins in 03224ae3. So this reintroduces
the notion of predefined aliases to ksh 93u+m, but only for
interactive shells that are not initialised in POSIX mode.
src/cmd/ksh93/Makefile,
src/cmd/ksh93/Mamfile,
src/cmd/ksh93/include/shtable.h,
src/cmd/ksh93/data/aliases.c:
- Restore aliases.c containing shtab_aliases[], a table specifying
the preset aliases.
src/cmd/ksh93/include/shtable.h,
src/cmd/ksh93/sh/init.c:
- Rename inittree() to sh_inittree() and make it extern, because we
need to use it in main.c (sh_main()).
src/cmd/ksh93/sh/main.c: sh_main():
- Init preset aliases from shtab_aliases[] only if the shell is
interactive and not in POSIX mode.
src/cmd/ksh93/bltins/typeset.c,
src/cmd/ksh93/tests/alias.sh:
- unall(): When unsetting an alias, pass on the NV_NOFREE attribute
to nv_delete() to avoid an erroneous attempt to free a preset
alias from read-only memory. See: 5d50f825
src/cmd/ksh93/data/builtins.c:
- Remove "history" and "r" entries from shtab_builtins[].
- Revert changes to inline fc/hist docs in sh_opthist[].
src/cmd/ksh93/bltins/hist.c: b_hist():
- Remove handling for 'history' and 'r' as builtins.
src/cmd/ksh93/sh.1:
- Update accordingly.
Resolves: https://github.com/ksh93/ksh/issues/125
When exporting variables, ksh exports their attributes (such as
'integer' or 'readonly') in a magic environment variable called
"A__z" (string defined in e_envmarker[] in data/msg.c). Child
shells recognise that variable and restore the attributes.
This little-known feature is risky; the environment cannot
necessarily be trusted and that A__z variable is easy to manipulate
before or between ksh invocations, so you can cause a script's
variables to be of the wrong type, or readonly. Backwards
compatibility requires keeping it, at least for now. But it should
be disabled in the posix mode, as it violates POSIX.
To do this, we have to solve a catch-22 in init.c. We must parse
options to know whether to turn on posix mode; it may be specified
as '-o posix' on the command line. The option parsing loop depends
on an initialised environment[*], while environment initialisation
(i.e., importing attributes) should depend on the posix option.
The catch-22 can be solved because initialising just the values
before option parsing is enough to avoid regressions. Importing the
attributes can be delayed until after option parsing. That involves
basically splitting env_init() into two parts while keeping a local
static state variable between them.
src/cmd/ksh93/sh/init.c:
- env_init():
* Split the function in two stages based on a new
'import_attributes' parameter. Import values in the first
stage; import attributes from A__z in the second (if ever).
Make the 'next' variable static as it keeps a state needed for
the attributes import stage.
* Single point of truth, greppability: don't hardcode "A__z" in
separate character comparisons, but use e_envmarker[].
* Fix an indentation error.
- sh_init(): When initialising the environment (env_init), don't
import the attributes from A__z yet; parse options first, then
import attributes only if posix option is not set.
src/cmd/ksh93/sh/name.c:
- sh_envgen(): Don't export variable attributes to A__z if the
posix option is set.
src/cmd/ksh93/tests/attributes.sh:
- Check that variable attributes aren't imported or exported
if the POSIX option is set.
src/cmd/ksh93/sh.1:
- Update.
This was the last item on the TODO list for -o posix for now.
Closes: #20
[*] If environment initialisation is delayed until after option
parsing, bin/shtests shows various regressions, including:
restricted mode breaks; the locale is not initialised properly
so that multibyte variable names break; $SHLVL breaks.
This commit removes the following standards check on init:
strcmp(astconf("CONFORMANCE",0,0),"standard")==0
This also checks for the POSIXLY_CORRECT variable; the libast
configuration system uses it to set "CONFORMANCE" to "standard",
*but*, only if that parameter wasn't already initialised from the
_AST_FEATURES environment variable (see 'getconf --man').
Problem is, there is a harmful interaction between POSIXLY_CORRECT
and _AST_FEATURES. If the latter exists, it overrides the former.
Not only that, merely querying CONFORMANCE makes astconf create and
export the _AST_FEATURES variable, propagating the current setting
to child ksh processes, which will then ignore POSIXLY_CORRECT.
We could get around this by simply using getenv("POSIXLY_CORRECT").
But then the results may be inconsistent with the AST config state.
The whole thing may not be the best idea anyway. Honouring
POSIXLY_CORRECT at startup introduces a backwards compatibility
issue. Existing scripts or setups may export POSIXLY_CORRECT=y to
put external GNU utilities in standards mode, while still expecting
traditional ksh behaviour from newly initialised shells.
So it's probably better to just get rid of the check. This is not
bash, after all. If ksh is invoked as sh (the POSIX standard
command name), or with '-o posix' on the command line, you get the
standards mode; that ought to be good enough.
src/cmd/ksh93/sh/init.c: sh_init():
- Remove astconf call as per above.
Since ksh 93u+m comes bundled with libast 20111111, there's no need
to support older versions, so this is another cleanup opportunity.
src/cmd/ksh93/include/defs.h:
- Throw an #error if AST_VERSION is undefined or < 20111111.
(Note that _AST_VERSION is the same as AST_VERSION, but the
latter is newer and preferred; see src/lib/libast/features/api)
All other changed files:
- Remove legacy code for versions older than the currently used
versions, which are:
_AST_VERSION 20111111
ERROR_VERSION 20100309
GLOB_VERSION 20060717
OPT_VERSION 20070319
SFIO_VERSION 20090915
VMALLOC_VERSION 20110808
SHOPT_ENV is an undocumented compile-time option implementing an
experimental method for handling environment variables, which is
implemented in env.h and env.c. There is no mention in the docs or
Makefile, and no mention in the mailing list archives. It adds no
new functionality, but at first glance it's a clean-looking
interface.
However, unfortunately, it's broken. Compiling with -DSHOPT_ENV
added to CCFLAGS causes bin/shtests to show these regressions:
functions.sh[341]: export not restored name=value function call -- expected 'base', got ''
functions.sh[1274]: Environment variable is not passed to a function
substring.sh[236]: export not restored name=value function call
variables.sh[782]: SHLVL should be 3 not 2
In addition, 'export' stops working on unset variables.
In the 93v- beta this code is still present, unchanged, though 93v-
made lots of incompatible changes. By the time ksh2020 noticed it,
it was no longer compiling, so it probably wasn't compiling in the
93v- beta either. Discussion: https://github.com/att/ast/issues/504
So the experiment was already abandoned by D. Korn and his team.
Meanwhile it was leaving sh/name.c with two versions of several
environment-related functions, and it's not clear which one is
actually compiled without doing detective work tracing header files
(most of the code was made conditional on _ENV_H, which is defined
in env.h, which is included by defs.h if SHOPT_ENV is defined).
This actively hinders understanding of the codebase. And any
changes to these functions would need to be implemented twice.
src/cmd/ksh93/include/env.h,
src/cmd/ksh93/sh/env.c:
- Removed.
src/cmd/ksh93/DESIGN,
src/cmd/ksh93/Makefile,
src/cmd/ksh93/Mamfile:
- Update accordingly.
All other changed files:
- Remove deactivated code behind SHOPT_ENV and _ENV_H.
ksh was enabling POSIX mode on init if it was invoked as any name
that merely started with 'sh' (after parsing initial 'r'). This
included shcomp, which was bad news.
src/cmd/ksh93/sh/init.c: sh_type():
- Check that the 'sh' is at the end of the string by checking
for a final zero byte.
- On Windows (_WINIX, see src/lib/libast/features/common), allow
for a file name extension (sh.exe) by checking for a dot as well.
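The difference between the old and new matching rules can be sketched
in shell as below. Both functions are hypothetical simplifications:
they model only the optional leading 'r' and the "sh" test described
above, and leave out the Windows file name extension case and anything
else sh_type() looks at.

```shell
# Old rule: anything that starts with "sh" after an optional leading 'r'.
old_is_posix_name() {
    name=${1##*/}          # strip any directory part
    name=${name#r}         # skip one leading 'r' (restricted shell)
    case $name in
    sh*) return 0 ;;
    esac
    return 1
}

# New rule: "sh" must be the whole remainder of the name.
new_is_posix_name() {
    name=${1##*/}
    name=${name#r}
    [ "$name" = sh ]
}

for prog in /bin/sh /bin/rsh /usr/bin/shcomp /bin/ksh; do
    if old_is_posix_name "$prog"; then old=posix; else old=ksh; fi
    if new_is_posix_name "$prog"; then new=posix; else new=ksh; fi
    printf '%s\n' "$prog: old=$old new=$new"
done
```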
On 16 June there was a call for volunteers to fix the bash
compatibility mode; it has never successfully compiled in 93u+.
Since no one showed up, it is now removed due to lack of interest.
A couple of things are kept, which are now globally enabled:
1. The &>file redirection shorthand (for >file 2>&1). As a matter
of fact, ksh93 already supported this natively, but only while
running rc/profile/login scripts, and it issued a warning. This
makes it globally available and removes the warning, bringing
ksh93 in line with mksh, bash and zsh.
2. The '-o posix' standard compliance option. It is now enabled on
startup if ksh is invoked as 'sh' or if the POSIXLY_CORRECT
variable exists in the environment. To begin with, it disables
the aforementioned &> redirection shorthand. Further compliance
tweaks will be added in subsequent commits. The differences will
be fairly minimal as ksh93 is mostly compliant already.
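For reference, this sketch shows the long form that the &> shorthand
from item 1 stands for. It uses only the portable spelling so that it
runs in any POSIX shell; the temporary file name is arbitrary.

```shell
tmp=${TMPDIR:-/tmp}/redir-demo.$$

# Long form: send standard output to the file, then make standard
# error a copy of it, so both streams end up in the same file.
{ echo 'to stdout'; echo 'to stderr' >&2; } >"$tmp" 2>&1

# Per item 1, ksh outside POSIX mode accepts the shorthand
#     { echo 'to stdout'; echo 'to stderr' >&2; } &>"$tmp"
# for the same redirection.

captured=$(cat "$tmp")
rm -f "$tmp"
printf '%s\n' "$captured"
```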
In all changed files, code was removed that was compiled (more
precisely, failed to compile/link) if the SHOPT_BASH preprocessor
identifier was defined. Below are other changes worth mentioning:
src/cmd/ksh93/sh/bash.c,
src/cmd/ksh93/data/bash_pre_rc.sh:
- Removed.
src/cmd/ksh93/data/lexstates.c,
src/cmd/ksh93/include/shlex.h,
src/cmd/ksh93/sh/lex.c:
- Globally enable &> redirection operator if SH_POSIX not active.
- Remove warning that was issued when &> was used in rc scripts.
src/cmd/ksh93/data/options.c,
src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/args.c:
- Keep SH_POSIX option (-o posix).
- Replace SH_TYPE_BASH shell type by SH_TYPE_POSIX.
src/cmd/ksh93/sh/init.c:
- sh_type(): Return SH_TYPE_POSIX shell type if ksh was invoked
as sh (or rsh, restricted sh).
- sh_init(): Enable posix option if the SH_TYPE_POSIX shell type
was detected, or if the CONFORMANCE ast config variable was set
to "standard" (which libast sets on init if POSIXLY_CORRECT
exists in the environment).
src/cmd/ksh93/tests/options.sh,
src/cmd/ksh93/tests/io.sh:
- Replace regression tests for &> and move to io.sh. Since &> is
now for general use, no longer test in an rc script, and don't
check that a warning is issued.
Closes: #9
Progresses: #20
This removes various blocks of uncommented experimental code that
was disabled using '#if 0' or '#if 1 ... #else' directives. It's
hard or impossible to figure out what the thoughts behind them
might have been, and we can really do without those distractions.
If ksh was compiled with -DSHOPT_REGRESS=1, it would immediately
segfault on init. After fixing that, another segfault remained that
occurred when using the --regress= command line option with an
invalid option-argument.
The __regress__ builtin allows tracing a few things (see
'__regress__ --man' after compiling with -DSHOPT_REGRESS=1, or
usage[] in src/cmd/ksh93/bltins/regress.c). It seems of limited
use, but at least it can be used/tested now.
src/cmd/ksh93/sh/init.c: sh_init():
- Move the call to sh_regress_init() up. The crash on init was
caused by geteuid() being intercepted by regress.c before the
shp->regress (== sh.regress) pointer was initialised.
- The builtin can also be called using a --regress= option-argument
on the ksh command line. Before calling b___regress__() to parse
that, temporarily change error_info.exit so any usage error calls
exit(3) instead of sh_exit(), as the latter assumes a fully
defined shell state and this call is done before the shell is
fully initialised.
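The save/swap/restore pattern used around the early option parse can
be sketched as below. All names here are hypothetical stand-ins:
exit_hook plays the role of error_info.exit, and the two handlers
only count their calls so the sketch can be checked, whereas the real
exit(3) and sh_exit() do not return.

```c
#include <string.h>

typedef void (*exit_fn)(int);

static int full_exit_calls;   /* calls to the "fully initialised" handler */
static int early_exit_calls;  /* calls to the early-startup handler */

static void full_exit(int status)  { (void)status; full_exit_calls++; }
static void early_exit(int status) { (void)status; early_exit_calls++; }

/* Stand-in for error_info.exit; normally points at the full handler. */
static exit_fn exit_hook = full_exit;

/* Hypothetical parser: reports a usage error through the current hook. */
static int parse_option(const char *arg)
{
    if (strcmp(arg, "valid") != 0) {
        exit_hook(2);
        return -1;
    }
    return 0;
}

/* Parse an option before initialisation is complete. */
int parse_option_early(const char *arg)
{
    exit_fn saved = exit_hook;  /* remember the normal handler */
    int rc;

    exit_hook = early_exit;     /* safe to call in a half-built state */
    rc = parse_option(arg);
    exit_hook = saved;          /* restore for the rest of startup */
    return rc;
}
```

The point of the swap is that a usage error raised during the early
parse never reaches the handler that assumes a complete shell state.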
Co-authored-by: Martijn Dekker <martijn@inlv.org>
An intermittent crash occurred after running many thousands of
virtual/non-forked subshells. One reproducer is a crash in the
shbench fibonacci.ksh test, as documented here:
f3d9e134/bench/fibonacci.ksh (L4-L10)
The apparent cause was the signed and insufficiently large 'short'
data type of 'curenv' and related variables which wrapped around to
a negative number when overflowing. These IDs are necessary for the
'wait' builtin to obtain the exit status from a background job.
This fix is inspired by a patch based on ksh 93v-:
https://build.opensuse.org/package/view_file/shells/ksh/ksh93-longenv.dif?expand=1
https://src.fedoraproject.org/rpms/ksh/blob/f24/f/ksh-20130628-longer.patch
However, we change the type to 'unsigned int' instead of 'long'. On
all remotely modern systems, ints are 32-bit values, and using this
type avoids a performance degradation on 32-bit systems. Making them
unsigned prevents an overflow to negative values.
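The difference between the two types can be illustrated with a
minimal sketch. The function names are hypothetical and do not appear
in ksh; they only model incrementing an ID counter. Converting an
out-of-range int back to 'short' is implementation-defined in C; the
sketch assumes the wrap-around behaviour of common compilers such as
gcc and clang.

```c
#include <limits.h>

/*
 * Next ID with a 'short' counter. id + 1 is computed as an int and
 * then converted back to short; past SHRT_MAX that conversion is
 * implementation-defined and wraps to a negative value on common
 * compilers.
 */
short next_id_short(short id)
{
    return (short)(id + 1);
}

/*
 * Next ID with an 'unsigned int' counter. Unsigned arithmetic is
 * defined to wrap modulo UINT_MAX + 1, so the result is never
 * negative.
 */
unsigned int next_id_uint(unsigned int id)
{
    return id + 1u;
}
```

With a 16-bit short the counter turns negative once SHRT_MAX is
passed, while the unsigned counter keeps counting upwards and only
wraps to zero after UINT_MAX.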
src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/include/jobs.h,
src/cmd/ksh93/include/nval.h,
src/cmd/ksh93/include/shell.h:
- Change the types of the static global 'subenv' and the subshell
structure members 'curenv', 'jobenv', 'subenv', 'p_env' and
'subshell' to one consistent type, unsigned int.
src/cmd/ksh93/sh/jobs.c,
src/cmd/ksh93/sh/macro.c,
src/cmd/ksh93/sh/name.c:
src/cmd/ksh93/sh/nvtype.c,
src/cmd/ksh93/sh/subshell.c:
- Updates to match new variable types.
src/cmd/ksh93/tests/subshell.sh:
- Show wrong exit status in message on failure of 'wait' builtin.
The new ${.sh.pid} variable is like Bash's $BASHPID, but in virtual
subshells it will retain its previous value as virtual subshells
don't fork.
Both $BASHPID and ${.sh.pid} are different from $$ as the latter
is only set to the parent shell's process ID (i.e. it isn't set
to the process ID of the current subshell).
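The distinction can be modelled at the C level with a minimal sketch,
assuming a POSIX system. The function name is hypothetical: a PID
saved before fork() plays the role of $$, and getpid() called in the
child plays the role of ${.sh.pid} in a forked subshell.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if a forked child sees a getpid() value that differs
 * from the PID saved before the fork, 0 if not, -1 on error.
 */
int child_pid_differs(void)
{
    pid_t saved = getpid();  /* like $$: fixed at the parent's PID */
    pid_t child = fork();
    int status;

    if (child < 0)
        return -1;
    if (child == 0) {
        /* like ${.sh.pid}: reflects the current (child) process */
        _exit(getpid() != saved ? 0 : 1);
    }
    if (waitpid(child, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

A virtual subshell corresponds to the case where no fork() happens at
all, so both values stay equal to the saved PID.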
src/cmd/ksh93/include/defs.h:
- Add 'current_pid' for storing the current process ID at a valid
memory address.
- Change 'ppid' from 'int32_t' to 'pid_t', as the return value from
'getppid' is of the 'pid_t' data type.
src/cmd/ksh93/data/variables.c,
src/cmd/ksh93/include/variables.h,
src/cmd/ksh93/sh/init.c,
src/cmd/ksh93/sh/xec.c:
- Add the ${.sh.pid} variable as an alternative to $BASHPID.
The process ID is stored in a struct before ${.sh.pid} is set
as environment variables are pointers that must point to a
valid memory address. ${.sh.pid} is updated by the _sh_fork()
function, which is called when ksh forks a new process with
sh_fork() or sh_ntfork().
src/cmd/ksh93/tests/variables.sh:
- Add ${.sh.pid} to the list of special variables and add three
regression tests for ${.sh.pid}.
src/cmd/ksh93/tests/subshell.sh:
- Update the PATH forking regression test to use ${.sh.pid} and
remove the TODO note.