The unused mkservice and eloop builtins are currently not built, and if
an attempt to compile them is made the build ends in failure. This
commit backports a few build fixes from ksh93v- 2012-08-24 that allow
mkservice and eloop to build (plus an additional compiler warning fix
not in ksh93v-). I've also added a new SHOPT_MKSERVICE setting (turned
off by default) so that mkservice and eloop can be built if the user
chooses to include them in their build of ksh.
This commit backports the whence '-t' option from ksh93v-. The '-t'
option is useful when one needs to identify the type of a command.
The '-t' flag was added by ksh93v- for compatibility with Bash.
It should be noted the ksh93v- patch had one bug, which this commit
fixes. Path-bound builtins from /opt/ast/bin were classified as
files if loaded from /opt/ast/bin in the PATH. Reproducer:
$ PATH=/opt/ast/bin whence -t cat
file
src/cmd/ksh93/bltins/whence.c:
- Simplify the bitmask values for the command and whence builtin
flags.
- Add the -t flag to the whence and type builtins. To prevent bugs,
-t will always override -v if both of those flags were passed.
src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/sh.1:
- Add documentation for the new -t option.
Depending on the OS, the heredoc.sh regression tests, and possibly
others, still crashed with the -x option (xtrace) on.
Analysis: The lexer crashes in lex_advance(). Something has caused
an inconsistent lexer state, and it happened earlier on, so the
backtrace is useless for figuring out where that happened.
But I think I've found it. It's the sh_mactry() call here:
src/cmd/ksh93/sh/xec.c, lines 2800 to 2807 in f7213f03
2800: if(!(cp=nv_getval(sh_scoped(shp,PS4NOD))))
2801: cp = "+ ";
2802: else
2803: {
2804: sh_offoption(SH_XTRACE);
2805: cp = sh_mactry(shp,cp);
2806: sh_onoption(SH_XTRACE);
2807: }
sh_mactry() needs to parse the contents of $PS4 to perform
expansions and command substitutions in it, which involves the
lexer. If that happens in a here-document, the lexer is in the C
function call stack, in the middle of parsing the here-document.
Result: inconsistent lexer state. Solution: save and restore lexer
state in sh_mactry().
After this commit, all regression tests should pass with the
'-x'/'--xtrace' option in use, with no errors or crashes.
Note for backporters: this fix depends both on on d7cada7b and on
the consistency fix for the Lex_t type's size applied in a7ed5d9f.
src/cmd/ksh93/include/shlex.h:
- Cosmetic fix: remove a copied & pasted backslash. (re: a7ed5d9f)
src/cmd/ksh93/sh/macro.c: sh_mactry():
- Save and restore the lexer state before letting sh_mactrim()
indirectly parse and execute code.
src/cmd/ksh93/tests/*.sh:
- Turn off xtrace in various command substitutions that contain
2>&1 redirections, so that the xtrace output is not caught by
the command substitutions, causing tests to fail incorrectly.
- Turn off xtrace for a few code blocks with 2>&1 redirections,
stopping xtrace output from being written to standard output.
Resolves: https://github.com/ksh93/ksh/issues/306 (again)
On some systems (at least Linux and macOS):
1. Run on a command line: t=$(sleep 10|while :; do :; done)
2. Press Ctrl+C in the first 10 seconds.
3. Execute any other command substitution. The shell crashes.
Analysis: Something in the job_wait() call in the sh_subshell()
restore routine may be interrupted by a signal such as SIGINT on
Linux and macOS. Exactly what that interruptible thing is remains
to be determined. In any case, since job_wait() was invoked after
sh_popcontext(), interrupting it caused the sh_subshell() restore
routine to be aborted, resulting in an inconsistent state of the
shell. The fix is to sh_popcontext() at a later stage instead.
src/cmd/ksh93/sh/subshell.c: sh_subshell():
- Rename struct checkpt buff to checkpoint because it's clearer.
- Move the sh_popcontext() call to near the end, just after
decreasing the subshell level counters and restoring the global
subshell data struct to its parent. This seems like a logical
place for it and could allow other things to be interrupted, too.
- Get rid of the if(shp->subshell) because it is known that the
value is > 0 at this point.
- The short exit routine run if the subshell forked now needs a new
sh_popcontext() call, because this is handled before restoring
the virtual subshell state.
- While we're here, do a little more detransitioning from all those
pointless shp pointers.
Fixes: https://github.com/ksh93/ksh/issues/397
This commit was originally intended to fix just one bug with shcomp's
handling of 'alias -p', but while fixing that I found a large number
of related issues in the alias command's -p, -t and -x options. The
current patch provides bugfixes for all of the bugs listed below:
1) Listing aliases in a script with 'alias -p' or 'alias' broke
shcomp's bytecode output:
https://github.com/ksh93/ksh/issues/87#issuecomment-813819122
2) Listing individual aliases with the -p option doesn't work:
$ alias foo=bar bar=foo
$ alias foo
foo=bar
$ alias -p foo # No output
3) Listing specific tracked aliases with -pt does not display them
in a reusable format, but rather adds another tracked alias:
$ hash -r cat vi
$ alias -pt vi # No output
$ alias -pt rm
$ alias -t
cat=/usr/bin/cat
rm=/usr/bin/rm
vi=/usr/bin/vi
4) Listing all tracked aliases with -pt does not output them in a
reusable format (the resulting command printed only creates a
normal alias, which is different from a tracked alias):
$ hash -r cat
$ alias -pt
alias cat=/usr/bin/cat # Expected 'alias -t cat'
5) Listing a non-existent alias with -p doesn't cause an error:
$ unalias -a
$ alias -p notanalias # No output
$ echo $?
0
$ alias notanalias
notanalias: alias not found
$ echo $?
1
$ hash -r
$ alias -pt notacommand # No output
$ echo $?
0
6) Attempting to list 256 non-existent aliases results in exit
status zero:
$ unalias -a
$ alias $(awk -v ORS= 'BEGIN { for(i=0;i<256;i++) print "x "; }')
x: alias not found
--cut error message--
$ echo $?
0
Changes:
- typeset.c: Avoid printing anything while shcomp is compiling a
script. This is needed because the alias command is run by shcomp
to prevent parsing issues.
- b_alias(): To avoid adding tracked aliases with -pt, set
tdata.aflag to '+' so that setall() and other related functions
only list tracked aliases.
- b_alias(): Set tdata.pflag to 1 so that setall() and other
functions recognize -p was passed.
- print_value(): Add support for listing specific aliases with
'alias -p'.
- setall(): To avoid any issues with zombie tracked aliases (see also
the regression tests) ignore tracked alias nodes marked with the
NV_NOALIAS attribute. This bit is set for tracked alias nodes by
the nv_rehash() function.
- setall(): For backward compatibility, continue incrementing the
exit status for each invalid alias and tracked alias passed. This
was already how alias behaved when listing aliases without -p, so
using -p shouldn't cause a change in behavior:
$ unalias -a
$ alias foo bar
foo: alias not found
bar: alias not found
$ echo $?
2
To fix bug 6, the exit status is set to one if an enforced 8-bit
exit status would be zero.
- print_namval(): Set the prefix to 'alias -t' so that listing
tracked aliases with 'alias -pt' works correctly.
- data/msg.c and include/name.h: Add an error message for when
'alias -pt' doesn't find a tracked alias.
- tests/alias.sh: Add a ton of regression tests for the bugs fixed in
this commit.
- sh/args.c: A process substitution run in a profile script may print
its PID as if it was a command spawned with '&'. Reproducer:
$ cat /tmp/env
true >(false)
$ ENV=/tmp/env ksh
[1] 730227
$
This bug is fixed by turning off the SH_PROFILE state while running
a process substitution.
- sh/subshell.c: The SH_INTERACTIVE fix in 3525535e renders the extra
check for SH_PROFILE redundant, so it has been removed.
- tests/io.sh: Update the procsub PIDs test to also check the result
after using process substitution in a profile script.
Reproducer: run vi in a subshell:
$ (vi)
vi opens; now press Ctrl+Z to suspend. The output is as expected:
[2] + Stopped (vi)
…but the exit status is 18 (SIGTSTP's signal number) instead of 0.
Now do:
$ fg
(vi)
$
The exit status is 18 again, vi is not resumed, and the job is
lost. You have to find vi's pid manually using ps and kill it.
Forking all non-command substitution subshells invoked from the
interactive main shell is the only reliable and effective fix I've
found. I've tried to fork the subshell conditionally in every other
remotely plausible place I can think of in fault.c and xec.c, but I
can't get anything to work properly. If anyone can get this to work
without forking as much (or at all), please do submit a patch or PR
that supersedes this fix.
At least subshells of subshells don't need to fork, so the
performance impact can be limited. Plus, it's not as if most people
need maximum speed on the interactive command line. Scripts
(including login/profile scripts) are not affected at all.
Command substitutions can be handled differently. My testing shows
that all shells except ksh93 simply block SIGTSTP (the ^Z signal)
while they run. We should do the same, so they don't need to fork.
NOTE for any backporters: the subshell.c and fault.c changes depend
on commits 35b02626 and 48ba6964 to work correctly.
src/cmd/ksh93/sh/subshell.c: sh_subshell():
- If the interactive shell state bit is on, then before executing
the subshell's code:
- for command substitutions, block SIGTSTP;
- for other subshells, fork.
- For command substitutions, release SIGTSTP if the interactive
shell state bit was on upon invoking the subshell.
src/cmd/ksh93/sh/fault.c:
- Instead of checking for a virtual subshell, check the shell's
interactive state bit to decide whether to handle SIGTSTP, as
that is only turned on in the interactive main shell.
src/cmd/ksh93/sh/main.c: sh_main():
- To avoid bugs, ignore SIGTSTP while running profile scripts.
Blocking it doesn't work because delaying it until after
sigrelease() will cause a crash. Thanks to @JohnoKing for this.
- While we're here, prevent a possible overflow of the 'beenhere'
static char variable by only incrementing it once.
Co-authored-by: Johnothan King <johnothanking@protonmail.com>
Resolves: https://github.com/ksh93/ksh/issues/390
Reproducer:
$ (sleep 1& echo done)
done
$ (eval "echo hi"; sleep 1& echo done)
hi
[1] 30587
done
No job control output should be printed for a background process
invoked from a subshell, not even after 'eval'.
The cause: sh_parse() turns on the shell's interactive state bit
(sh_state(SH_INTERACTIVE)) if the interactive shell option is on.
This is incorrect. The parser should have no involvement with shell
interactivity in principle because that's not its domain.
Not only that, the parser may need to run in a subshell, e.g. when
executing traps or 'eval' commands (as above). By definition, a
subshell can never be interactive.
We already fixed many bugs related to job control and the shell's
interactive state. Even if these two lines previously papered over
some breakage, I can't find any now after simply removing them. If
any is found later, then it'll need to be fixed properly instead.
Related: https://github.com/ksh93/ksh/issues/390
Announcing: KornShell 93u+m 1.0.0-beta.2
https://github.com/ksh93/ksh
In May 2020, when every KornShell (ksh93) development project was
abandoned, development was rebooted in a new fork based on the last
stable AT&T version: ksh 93u+. This new fork is called ksh 93u+m as a
permanent nod to its origin. We're restarting it at version 1.0. Seven
months after the first beta, the second one is ready. Please test this
second beta and report any bugs you find, or help us fix known bugs.
We're now the default ksh93 in some OS distributions, at least Debian
and Slackware! Even though we don't think it's stable release quality
yet, the consensus seems to be that 93u+m is already much better than
the last AT&T release.
Main developers: Martijn Dekker, Johnothan King, hyenias
Contributors: Andy Fiddaman, Anuradha Weeraman, Chase, Gordon Woodhull,
Govind Kamat, Harald van Dijk, Lev Kujawski, Marc Wilson, Ryan Schmidt,
Sterling Jensen
HOW TO GET IT
Please download the source code tarball from our GitHub releases page:
https://github.com/ksh93/ksh/releases
To build, follow the instructions in README.md or src/cmd/ksh93/README.
HOW TO GET INVOLVED
To report a bug, please open an issue at our GitHub page (see above).
Alternatively, email me at martijn@inlv.org with your report.
To get involved in development, read the brief policy information in
README.md and then jump right in with a pull request or email a patch.
See the TODO file in the top-level directory for a to-do list.
*** MAIN CHANGES between 1.0.0-beta.1 and 1.0.0-beta.2 ***
New features in built-in commands:
- 'cd' now supports an -e option that, when combined with -P, verifies
that $PWD is correct after changing directories; this helps detect
access permission problems. See:
https://www.austingroupbugs.net/view.php?id=253
- 'printf' now supports a -v option as in bash. This assigns formatted
output directly to variables, which is very fast and will not strip
final newline (\n) characters.
- The 'return' command, when used to return from a function, can now
return any status value in the 32-bit signed integer range, like on
zsh. However, due to a traditional Unix kernel limitation, $? is
still trimmed to its least significant 8 bits whenever leaving a
(sub)shell environment.
- 'test'/'[' now supports all the same operators as [[ (including =~,
\<, \>) except for the different 'and'/'or' operators. Note that
'test'/'[' remains deprecated due to its unfixable pitfalls;
[[ ... ]] is recommended instead.
Shell language changes:
- Several improvements were made to the --noexec shell code linter.
- Arithmetic expressions in native ksh mode no longer interpret a
number with a leading zero as octal in any context. Use 8#octalnumber
instead (e.g. 8#400 == 256). Arithmetic expressions now also behave
identically within and outside ((...)) and $((...)).
- POSIX compatibility mode fixes (only applicable with the --posix shell
option on):
- A leading zero is now consistently recognised as introducing an octal
number in all arithmetic contexts.
- $((inf)) and $((nan)) are now interpreted as regular variables.
- The '.' built-in no longer runs ksh functions and now only runs
files.
Bugs fixed:
- '.' and '..' are now once again completed by tab completion.
- If SIGINT is set to ignore, the interactive shell no longer exits on
Ctrl+C.
- ksh now builds and runs on Apple's new M1 hardware.
- The 'return' and 'exit' commands no longer risk triggering actual
signals by returning or exiting with a status > 256.
- Ksh no longer behaves badly when parsing a type definition command
('typeset -T' or 'enum') without executing it or when executing it in
a subshell. Types can now safely be defined in subshells and defined
conditionally as in 'if condition; then enum ...; fi'.
- Discipline functions, especially those applied to PS2 or .sh.tilde,
will no longer crash your shell upon being interrupted or throwing an
error.
- Fixed a bug that could corrupt output if standard output is closed
upon initialising the shell.
- Fixed a bug in the [[ ... ]] compound command: the '!' logical
negation operator now correctly negates another '!', e.g.,
[[ ! ! 1 -eq 1 ]] now returns 0/true. Note that this has always been
the case for 'test'/'['.
- Fixed SHLVL so that replacing ksh by itself (exec ksh) will not
increase it.
- Arithmetic expressions are no longer allowed to assign out-of-range
values to variables of types declared with enum.
- The 'time' keyword no longer makes the --errexit shell option
ineffective.
- Various bugs in libcmd built-in commands (those bound to the
/opt/ast/bin path by default) have been fixed.
- Various other crashing bugs have been fixed.
Fixes for the shcomp byte code compiler:
- shcomp is now able to compile scripts that define types using enum.
- shcomp now refuses to mess up your terminal by writing bytecode
to it.
*** MAIN CHANGES between ksh 93u+ 2012-08-01 and 93u+m 1.0.0-beta.1 ***
Hundreds of bugs have been fixed, including many serious/critical bugs.
This includes upstreamed patches from OpenSUSE, Red Hat, and Solaris, fixes
backported from the abandoned 93v- beta and ksh2020 fork, as well as many
new fixes from the community. See the NEWS file for more information, and
the git commit log for complete documentation of every fix. Incompatible
changes have been minimised, but not at the expense of fixing bugs. For a
list of potentially incompatible changes, see src/cmd/ksh93/COMPATIBILITY.
Though there was a "no new features, bugfixes only" policy, some new
features were found necessary, either to fix serious design flaws or to
complete functionality that was evidently intended, but not finished.
Below is a summary of these new features.
New command line editor features:
- The forward-delete and End keys are now handled as expected in the
emacs and vi built-in line editors.
- In the vi and emacs line editors, repeat count parameters can now also
be used for the arrow keys and the forward-delete key. E.g., in emacs
mode, <ESC> 7 <left-arrow> will now move the cursor seven positions to
the left. In vi control mode, this would be entered as: 7 <left-arrow>.
New shell language features:
- The &>file redirection shorthand (for >file 2>&1) is now available for
all scripts and interactive sessions and not only for profile/login
scripts, bringing ksh 93u+m in line with mksh, bash, and zsh.
- File name generation (a.k.a. pathname expansion, a.k.a. globbing) now
never matches the special navigational names '.' (current directory)
and '..' (parent directory). This change makes a pattern like .*
useful; it now matches all hidden files (dotfiles) in the current
directory, without the harmful inclusion of '.' and '..'.
- Tilde expansion can now be extended or modified by defining a
.sh.tilde.get or .sh.tilde.set discipline function. This replaces a
2004 undocumented attempt to add this functionality via a .sh.tilde
command, which never worked and crashed the shell. See the manual for
details on the new method.
New features in built-in commands:
- Usage error messages now show the --help/--man self-documentation options.
- Path-bound built-ins (such as /opt/ast/bin/cat) can now be executed by
invoking the canonical path, so the following will now work as expected:
$ /opt/ast/bin/cat --version
version cat (AT&T Research) 2012-05-31
- 'command -x' now looks for external commands only, skipping built-ins.
In addition, its xargs-like functionality no longer freezes the shell on
Linux and macOS, making it effectively a new feature on these systems.
- 'redirect' now checks if all arguments are valid redirections before
performing them. If an error occurs, it issues an error message instead
of terminating the shell.
- 'suspend' now refuses to suspend a login shell, as there is probably no
parent shell to return to and the login session would freeze.
- 'times' now gives high precision output in a POSIX compliant format.
- 'typeset' now gives an informative error message if an incompatible
combination of options is given.
- 'whence -v/-a' now reports the location of autoloadable functions.
New features in shell options:
- A new --globcasedetect shell option is added on OSs where we can
check for a case-insensitive file system (currently Windows/Cygwin,
macOS, Linux and QNX 7.0+). When this option is turned on, file name
generation (globbing), as well as file name tab completion on
interactive shells, automatically become case-insensitive on file
systems where the difference between upper and lower case is ignored
for file names. This is transparently determined for each directory, so
a path pattern that spans multiple file systems can be part
case-sensitive and part case-insensitive.
- A new --nobackslashctrl shell option disables the special escaping
behaviour of the backslash character in the emacs and vi built-in
editors. Particularly in the emacs editor, this makes it much easier to
go backward, insert a forgotten backslash into a command, and then
continue editing without having your next cursor key replace your
backslash with garbage. Note that Ctrl+V (or whatever other character
was set using 'stty lnext') always escapes all control characters in
either editing mode.
- A new --posix shell option has been added to ksh 93u+m that makes the
ksh language more compatible with other shells by following the POSIX
standard more closely. See the manual page for details. It is enabled by
default if ksh is invoked as sh, otherwise it is disabled by default.
- Enhancement to -G/--globstar: symbolic links to directories are now
followed if they match a normal (non-**) glob pattern. For example, if
'/lnk' is a symlink to a directory, '/lnk/**' and '/l?k/**' now work as
you would expect.
Strings compared in [[ with the > and < operators should be compared
lexically. This does not work when the strings are single digits, as
the parser interprets it as a syntax error:
$ [[ 10<2 ]] # 10 lexically sorts before 2
$ echo $?
0
$ [[ 1<2 ]]
/usr/bin/ksh: syntax error: `<' unexpected
$ echo $?
3
src/cmd/ksh93/sh/lex.c:
- Don't interpret numbers next to > and < as a redirection while
inside of [[. This bugfix was backported from ksh93v- 2014-06-25.
src/cmd/ksh93/tests/bracket.sh:
- Add regression tests for the > and < operators.
New version. I'm pretty sure the problems that forced me to revert
it earlier are fixed.
This commit mitigates the effects of the hack explained in the
referenced commit so that dummy built-in command nodes added by the
parser for declaration/assignment purposes do not leak out into the
execution level, except in a relatively harmless corner case.
Something like
if false; then
typeset -T Foo_t=(integer -i bar)
fi
will no longer leave a broken dummy Foo_t declaration command. The
same applies to declaration commands created with enum.
The corner case remaining is:
$ ksh -c 'false && enum E_t=(a b c); E_t -a x=(b b a c)'
ksh: E_t: not found
Since the 'enum' command is not executed, this should have thrown
a syntax error on the 'E_t -a' declaration:
ksh: syntax error at line 1: `(' unexpected
This is because the -c script is parsed entirely before being
executed, so E_t is recognised as a declaration built-in at parse
time. However, the 'not found' error shows that it was successfully
eliminated at execution time, so the inconsistent state will no
longer persist.
This fix now allows another fix to be effective as well: since
built-ins do not know about virtual subshells, fork a virtual
subshell into a real subshell before adding any built-ins.
src/cmd/ksh93/sh/parse.c:
- Add a pair of functions, dcl_hactivate() and dcl_dehacktivate(),
that (de)activate an internal declaration built-ins tree into
which check_typedef() can pre-add dummy type declaration command
nodes. A viewpath from the main built-ins tree to this internal
tree is added, unifying the two for search purposes and causing
new nodes to be added to the internal tree. When parsing is done,
we close that viewpath. This hides those pre-added nodes at
execution time. Since the parser is sometimes called recursively
(e.g. for command substitutions), keep track of this and only
activate and deactivate at the first level.
(Fixed compared to previous version of this commit: calling
dcl_dehacktivate() when the recursion level is already zero is
now a harmless no-op. Since this only occurs in error handling
conditions, who cares.)
- We also need to catch errors. This is done by setting libast's
error_info.exit variable to a dcl_exit() function that tidies up
and then passes control to the original (usually sh_exit()).
(Fixed compared to previous version of this commit: dcl_exit()
immediately deactivates the hack, no matter the recursion level,
and restores the regular sh_exit(). This is the right thing to
do when we're in the process of erroring out.)
- sh_cmd(): This is the most central function in the parser. You'd
think it was sh_parse(), but $(modern)-form command substitutions
use sh_dolparen() instead. Both call sh_cmd(). So let's simply
add a dcl_hacktivate() call at the beginning and a
dcl_deactivate() call at the end.
- assign(): This function calls path_search(), which among many
other things executes an FPATH search, which may execute
arbitrary code at parse time (!!!). So, regardless of recursion
level, forcibly dehacktivate() to avoid those ugly parser side
effects returning in that context.
src/cmd/ksh93/bltins/enum.c: b_enum():
- Fork a virtual subshell before adding a built-in.
src/cmd/ksh93/sh/xec.c: sh_exec():
- Fork a virtual subshell when detecting typeset's -T option.
Improves fix to https://github.com/ksh93/ksh/issues/256
The killpg(getpgrp(),SIGINT) call added to ed_getchar() in that
commit caused the interactive shell to exit on ^C even if SIGINT is
being ignored. We cannot revert or remove that call without
breaking job control. This commit applies a new fix instead.
Reproducers fixed by this commit:
SIGINT ignored by child:
$ PS1='childshell$ ' ksh
childshell$ trap '' INT
childshell$ (press Ctrl+C)
$
SIGINT ignored by parent:
$ (trap '' INT; ENV=/./dev/null PS1='childshell$ ' ksh)
childshell$ (press Ctrl+C)
$
SIGINT ignored by parent, trapped in child:
$ (trap '' INT; ENV=/./dev/null PS1='childshell$ ' ksh)
childshell$ trap 'echo test' INT
childshell$ (press Ctrl+C)
$
I've experimentally determined that, under these conditions, the
SFIO stream error state is set to 256 == 0400 == SH_EXITSIG.
src/cmd/ksh93/sh/main.c: exfile():
- On EOF or error, do not return (exiting the shell) if the shell
state is interactive and if sferror(iop)==SH_EXITSIG.
- Refactor that block a little to make the new check fit in nicely.
src/cmd/ksh93/tests/pty.sh:
- Test the above three reproducers.
Fixes: https://github.com/ksh93/ksh/issues/343
Note that this is only about the /opt/ast/bin built-in commands,
not about the regular pathless builtins such as printf.
To use these, either add /opt/ast/bin to your $PATH or use a
command like 'builtin cp'. As usual, --man provides info.
Removed as defaults for lack of convincing advantages over the OS's
external commands:
- chmod, cmp, head, logname, mkdir, sync, uname, wc
Remain as useful defaults:
- basename, cat, cut, dirname. These are commonly used in
performance-sensitive code paths in scripts and having them as
built-ins can be good for performance.
- getconf: This is the only interface to some libast internals that
is available to ksh. It's also has better functionality than most
OS-shipped 'getconf' commands, e.g., it can list and query all
the configuration values.
Added as defaults:
- cp, ln, mv: Having these built in can speed up scripts that
manage files. Also the AST versions have extended functionality
(see cp --man, etc.).
- mktemp: External mktemp commands vary too widely and are
incompatible, but it's important that scripts can securely make
temporary files, so it's good to ship a known interface to this
functionality.
As a result, the statically linked ksh binary is very slightly
smaller than before.
Resolves: https://github.com/ksh93/ksh/issues/349
This bug was first reported at <https://www.illumos.org/issues/3782>.
The chown builtin when used on illumos can fail with different error
messages after running the same command twice:
$ touch /tmp/x
$ /opt/ast/bin/chown -h 433:434 /tmp/px
chown: /tmp/x: cannot change owner and group [Not owner]
$ /opt/ast/bin/chown -h 433:434 /tmp/px
chown: /tmp/x: cannot change owner and group [Invalid argument]
The error messages differ because the libast struid and strgid
functions will return -2 if the same nonexistent ID is used twice.
The fix for this bug has been ported from here:
https://github.com/illumos/illumos-gate/commit/4162633a7c5961f388fd
src/lib/libcmd/chgrp.c:
- Remove NOID macro and check for a < 0 error status instead.
This is different from the Illumos fix at
<https://github.com/illumos/illumos-gate/commit/4162633a7c59>
which added another macro.
src/lib/libast/man/{strgid,struid}.3:
- Correct errors in the strgid and struid documentation.
- Document that the strgid and struid functions will return -2 if
the same invalid name is used twice.
Co-authored-by: Martijn Dekker <martijn@inlv.org>
Turns out there is a bona fide, honest-to-goodness use case for
matching '.' and '..' in globbing after all. It's when globbing is
used as the backend mechanism for file name completion in
interactive shell editors. A tab invisibly adds a * at the end of
the word to the left of your cursor and the resulting pattern is
expanded. In 5312a59d, this broke for '.' and '..'.
Typing '.' followed by two tabs should result in a menu that
includes './' and '../'. Typing '..' followed by a tab should
result in '../', (or a menu that includes it if there are files
with names starting with '..'). This is the behaviour in 93u+ and
we should maintain this.
To restore this functionality without reintroducing the harmful
behaviour fixed in the referenced commits, we should special-case
this, allowing '.' and '..' to match only for file name completion.
src/lib/libast/include/glob.h:
- Fix an inaccurate comment: the GLOB_COMPLETE flag is used for
command completion, not file name completion. This is very clear
from reading the path_expand() function in sh/expand.c.
- Add new GLOB_FCOMPLETE flag for file name completion.
src/lib/libast/misc/glob.c:
- Adapt flags mask to fit the new flag.
- glob_dir(): If GLOB_FCOMPLETE is passed, allow '.' and '..' to
match even if expanded from a pattern.
- Clarify the fix from aad74597 with an extended comment based on
<https://github.com/ksh93/ksh/issues/146#issuecomment-790991990>.
src/cmd/ksh93/sh/expand.c: path_expand():
- If we're in the SH_FCOMPLETE (file name completion) state, then
pass the new GLOB_FCOMPLETE flag to AST glob(3).
Fixes: https://github.com/ksh93/ksh/issues/372
Thanks to @fbrau for the bug report.
This commit adds onto <https://github.com/ksh93/ksh/pull/353> by porting
over two additional improvements to the shell linter:
1) The changes in the aforementioned pull request were merged into
illumos-gate with an additional change.[*] The illumos revision of
the patch improved the warning for (( $foo = $? )) to specify '$foo'
causes the warning.[**] Example:
$ ksh -n -c '(( $? != $bar ))'
ksh: warning: line 1: in '(( $? != $bar ))', using '$' as in '$bar' is slower and can introduce rounding errors
While I was porting the illumos patch I did notice one problem. The
string it uses from paramsub() skips over the initial '{' in
'${var}', resulting in the warning printing '$var}' instead:
$ ksh -n -c '(( ${.sh.pid} != $$ ))'
... in '(( ${.sh.pid} != $$ ))', using '$' as in '$.sh.pid}' is slower ...
This was fixed by including the missing '{' in the string returned by
paramsub for ${var} variables.
2) In ksh93v-, parsing x=$((expr)) with the shell linter will cause ksh
to warn the user x=$((expr)) is slower than ((x=expr)). This
improvement has been backported with a modified warning:
# Result from this commit
$ ksh -n -c 'x=$((1 + 2))'
ksh: warning: line 1: x=$((1 + 2)) is slower than ((x=1 + 2))
# Result from ksh93v-
$ ksh93v -n -c 'x=$((1 + 2))'
ksh93v: warning: line 1: ((x=1 + 2)) is more efficient than x=$((1 + 2))
Minor note: the ksh93v- patch had an invalid use of memcmp; this
version of the patch uses strncmp instead.
References:
be548e87bchttps://code.illumos.org/c/illumos-gate/+/1834/comment/65722363_22fdf8e7/
*** Crash 1: ***
ksh crashed if the PS1 prompt contains one or more command
substitutions and you enter a multi-line command substitution
on the command line, then interrupt while on the PS2 prompt.
$ ENV=/./dev/null /usr/local/bin/ksh -o emacs
$ PS1='$(echo foo) $(echo bar) $(echo baz) ! % '
foo bar baz 16999 % echo $(
> true <-- here, press Ctrl+C instead of Return
Memory fault
The crash occurred due to a corrupted lexer state while trying to
display the PS1 prompt.
Analysis: My fix for the crashing bug with Ctrl+C in commit
3023d53b is incorrect and only worked accidentally. sh_fault() is
not the right place to reset the lexer state because, when we press
Ctrl+C on a PS2 prompt, ksh had been waiting for input to finish
lexing a multi-line command, so sh_lex() and other lexer functions
are on the function call stack and will be returned to.
src/cmd/ksh93/sh/fault.c: sh_fault():
- Remove incorrect SIGINT fix.
src/cmd/ksh93/sh/io.c: io_prompt():
- Reset the lexer state immediately before printing every PS1
prompt. Even in situations where this is redundant it should be
perfectly safe, the overhead is negligible, and it resolves this
crash. It may pre-empt other problems as well.
*** Crash 2: ***
If an INT trap is set, and you start entering a multi-line command
substitution, then press Ctrl+C on the PS2 prompt to trigger the
crash, the lexer state is corrupted because the lexer is invoked to
eval the trap action. A crash then occurs on entering the final ')'
of the command substitution.
$ trap 'echo TRAPPED' INT
$ echo $(
> trueTRAPPED <-- press Ctrl+C to output "TRAPPED"
> )
Memory fault
Technically, as SIGINT is trapped, it should not interrupt, so ksh
should execute the trap, then continue with the PS2 prompt to let
the user finish inputting the command. But I have been unsuccessful
in many different attempts to make this work properly. I managed to
get multi-line command substitutions to lex correctly by saving and
restoring the lexer state, but command substitutions were still
corrupted at the parser and/or execution level and I have not
managed to trace the cause of that.
My testing showed that all other shells interrupt the PS2 prompt
and return to PS1 when the user presses Ctrl+C, even if SIGINT is
trapped. I think that is a reasonable alternative, and it is
something I managed to make work.
src/cmd/ksh93/sh/fault.c: sh_chktrap():
- Immediately after invoking sh_trap() to run a trap action, check
if we're in a PS2 prompt (sh.nextprompt == 2). If so, assume the
lexer state is now overwritten. Closing the fcin stream with
fcclose() seems to reliably force the lexer to stop doing
anything else. Then we can just reset the prompt to PS1 and
invoke sh_exit() to start new command line, which will now reset
the lexer state as per above.
ksh crashed if you pressed Ctrl+C or Ctrl+D on a PS2 prompt while
you haven't finished entering a $(command substitution). It
corrupts subsequent command substitutions. Sometimes the situation
recovers, sometimes the shell crashes.
Simple crash reproducer:
$ PS1="\$(echo foo) \$(echo bar) \$(echo baz) > "
foo bar baz > echo $( <-- now press Ctrl+D
> ksh: syntax error: `(' unmatched
Memory fault
The same happens with Ctrl+C, minus the syntax error message.
The problem is that the lexer state becomes inconsistent when the
lexer is interrupted in the middle of reading a command
substitution of the form $( ... ). This is tracked in the
'lexd.dolparen' variable in the lexer state struct.
Resetting that variable is sufficient to fix this issue. However,
in this commit I prefer to just reinitialise the lexer state
completely to pre-empt any other possible issues. Whether there was
a syntax error or the user pressed Ctrl+C, we just interrupted all
lexing and parsing, so the lexer *should* restart from scratch.
src/cmd/ksh93/sh/fault.c: sh_fault():
- If the shell is in an interactive state (e.g. not a subshell) and
SIGINT was received, reinitialise the lexer state. This fixes the
crash with Ctrl+C.
src/cmd/ksh93/sh/lex.c: sh_syntax():
- When handling a syntax error, reset the lexer state. This fixes
the crash with Ctrl+D.
NEWS:
- Also add the forgotten item for the previous fix (re: 2322f939).
This reverts c0334e32, thereby restoring 936a1939.
After the fixes in 0a343244 and a2bc49be, the tilde expansion
disciplines work nicely, so they can come back to the 1.0 branch.
The head and tail builtins don't correctly handle files that lack
newlines[*]:
$ print -n foo > /tmp/bar
$ /opt/ast/bin/head -1 /tmp/bar # No output
$ print -n 'foo\nbar' > /tmp/bar
$ /opt/ast/bin/tail -1 /tmp/bar
foo
bar$
This commit backports the required changes from ksh93v- to handle files
without a newline in the head and tail builtins. (Also note that the
required fix to sfmove was already backported in commit 1bd06207.)
src/lib/libcmd/{head,tail}.c:
- Backport the relevant ksh93v- code for handling files
without newlines.
src/cmd/ksh93/tests/builtins.sh:
- Add a few regression tests for using 'head -1' and 'tail -1' on a file
missing and ending newline.
[*]: https://www.illumos.org/issues/4149
When a global EXIT trap is set, and a ksh-style function exits with
a status > 256 that could have been the result of a signal, then
the shell incorrectly issues that signal to itself. Depending on
the signal, this causes ksh to terminate itself ungracefully:
$ cat /tmp/exit267
trap 'echo OK' EXIT # This trap triggers the crash
function foo { return 267; }
foo
$ bash /tmp/exit267
OK
$ ksh-3aee10d7 /tmp/exit267
OK
$ ksh /tmp/exit267
Memory fault(coredump)
On most systems, status 267 corresponds to SIGSEGV. The reported
memory fault is not real; it results from ksh incorrectly killing
itself with that signal.
The problem is caused by two factors:
1. As of 93u+ 2012-08-01, ksh explicitly allows 'return' to use an
exit status corresponding to a signal (from 257 to end of signal
range). The rest of the integer range is trunctated to 8 bits.
This is contrary to both 'man ksh' and 'return --man' which both
say it's always truncated to 8 bits. Plus, combined with point 2
below, this new behaviour is nonsensical, as 'return' has no
business actually generating signals. However, a couple of
regression tests now depend on this, as may some scripts.
2. When a ksh-style function does not handle a signal, the signal
is passed down to the parent environment and ksh does this by
reissuing the signal to its own process after leaving the
function scope. However, it does this by checking the exit
status, which is very bad practice as there is no guarantee
that an exit status corresponding to a signal was in fact
produced by a signal, particularly after they changed the
behaviour of 'return' per 1 above.
This commit fixes both issues. It also takes a proper decision on
allowable 'return' exit status arguments. Since 93u+ was released
nearly a decade ago and some scripts may now rely on being able to
pass certain exit statuses out of the 8-bit range, we should not
disallow this now. But neither should we be half-hearted in
allowing only some arbitrary selection of 9-bit statuses; 'return'
values categorically should have nothing to do with signals, so
this is no basis for limiting them. We're now allowing the full
unsigned integer range, which is usually 32 bits. This is like zsh,
and may create some interesting possibilities for scripts.
Just don't forget that $? will still lose all but its 8 least
significant bits when leaving the current (sub)shell environment.
src/cmd/ksh93/sh/xec.c: sh_funscope():
- Fix passing down unhandled signals from interrupted ksh functions
(jumpval==SH_JMPFUN) to the parent environment. Do not pay any
attention to the exit status. Instead, use sh.lastsig (a.k.a.
shp->lastsig). It is set by sh_fault() in fault.c for just this
purpose and contains the last signal handled for the current
command. It is reset in sh_exec() before running any new command.
So if it contains a signal, that is the one that interrupted the
ksh function, so it's the correct one to pass down. (Further
evidence: sh_subshell() was already using this in the same way.)
src/cmd/ksh93/bltins/cflow.c: b_return():
- Allow any signed int return value when invoked as and behaving
like 'return'.
- Add warning if a passed value is out of int range. Set the exit
status to 128 in that case; int overflow is undefined behaviour
in C and we want consistent behaviour across platforms. It should
be safe enough to check if the long and int values are equal.
- Refactor for clarity.
src/cmd/ksh93/sh/subshell.c: sh_subshell():
- If a function returns with a status out of the 8 bit range in a
virtual subshell, this status could be passed down to the parent
shell in full. However, if the subshell forks, then the kernel
will enforce an 8-bit exit status. That is inconsistent. Scripts
should not be able to tell the difference between forked and
non-forked subshells, so artificially enforce that limit here.
Other changed files:
- Documentation updates and copy-edits.
- Update an AT&T functions.sh regress test to allow arbitrary
integer return values for functions.
- Add regression tests based in part on @JohnoKing's reproducers.
- Rework some vaguely related regression tests to fail gracefully.
Thanks to Johnothan King for the report and the testing.
Fixes: https://github.com/ksh93/ksh/issues/364
This change adds the -e flag to the cd builtin, as specified in
<https://www.austingroupbugs.net/view.php?id=253>. The -e flag is
used to verify if the the current working directory after 'cd -P'
successfully changes the directory, and returns with exit status 1
if the cwd couldn't be determined. Additionally, it causes all
other errors to return with exit status >1 (i.e., status 2 unless
ENOMEM occurs) if -e and -P are both active.
src/cmd/ksh93/bltins/cd_pwd.c:
- Add -e option to the cd builtin command. It verifies $PWD by
using test_inode() to execute the equivalent of [[ . -ef $PWD ]].
- The check for restricted mode has been moved after optget to
allow 'cd -eP' to return with exit status 2 when in restricted
mode. To avoid changing the previous behavior of cd when -e isn't
passed, extra checks have been added to prevent cd from printing
usage information in restricted mode.
src/cmd/ksh93/tests/builtins.sh:
- Add regression tests for the exit status when using the cd -P
flag with and without -e.
src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/sh.1:
- Document the addition of -e to the cd builtin.
This commit ports over two of Andy Fiddaman's bugfixes to conf.sh
on illumos:
- The compiler isn't passed on to an invocation of iffe. The bugfix is
from this commit: <https://github.com/citrus-it/ast/commit/63563232>
- The getconf builtin is missing several parameters on illumos.
Reproducer:
$ /opt/ast/bin/getconf ADDRESS_WIDTH
getconf: Invalid argument (ADDRESS_WIDTH) # Should output '64'
This bug occurs because conf.sh expects GNU sed and fails to work
properly with other sed implementations. The bugfix and original bug
report can be found here:
https://www.illumos.org/issues/14044https://github.com/citrus-it/ast/commit/ba443cfd
This patch fixes the crashes experienced when a discipline function
exited because of a signal or an error from a special builtin. The
crashes were caused by ksh entering an inconsistent state after
performing a longjmp away from the assign() and lookup() functions in
nvdisc.c. Fixing the crash requires entering a new context, then setting
a nonlocal goto with sigsetjmp(3). Any longjmps that happen while
running the discipline function will go back to assign/lookup, allowing
ksh to do a proper cleanup afterwards.
Resolves: https://github.com/ksh93/ksh/issues/346
This commit ports over two improvements to the shell linter from
illumos (original patch written by Andy Fiddaman). Links to the
relevant bug reports and the original patch:
https://www.illumos.org/issues/13601https://www.illumos.org/issues/13631c7b656fc71
The first improvement is to the lint warning for arithmetic
operators in [[ ... ]]. The ksh linter now suggests the correct
equivalent operator to use in ((...)). Example:
$ ksh -nc '[[ 30 -gt 25 ]]'
# Original warning
warning: line 1: -gt within [[ ... ]] obsolete, use ((...))
# New warning
warning: line 1: [[ ... -gt ... ]] obsolete, use ((... > ...))
The second improvement pertains to variable expansion in arithmetic
expressions. The ksh linter now suggests referencing variable names
directly:
$ ksh -nc 'integer foo=40; (($foo < 50 ))'
# Old warning
warning: line 1: variable expansion makes arithmetic evaluation less efficient
# New warning
warning: line 1: in '(($foo < 50))', using '$' is slower and can introduce rounding errors
src/cmd/ksh93/{data/lexstates,sh/lex,sh/parse}.c:
- Port the improved shell lint warnings from illumos to ksh93u+m.
- The original checks for arithmetic operators involved a bunch of
if statements with inefficient calls to strcmp(3). These were
replaced with a more efficient switch statement that avoids
strcmp.
This change fixes a crash that can occur after setting a KEYBD trap
then inputting a multi-line command substitution. The crash is
similar to issue #347, but it's easier to reproduce since it
doesn't require you to setup a kshrc file. Reproducer for the
crash:
$ ENV=/./dev/null ksh
$ trap : KEYBD
$ : $(
> true)
Memory fault(coredump)
The bugfix was backported (with considerable changes) from ksh93v-
2013-10-08. The crash was first reported on the old mailing list:
https://www.mail-archive.com/ast-users@lists.research.att.com/msg00313.html
src/cmd/ksh93/{include/shlex.h,sh/lex.c}:
- To fix this properly, we need sizeof(Lex_t) to work as expected
in edit.c, but that is thwarted by the _SHLEX_PRIVATE macro in
lex.c which shlex.h uses to add private structs to the Lex_t type
in lex.c only. So get rid of that _SHLEX_PRIVATE macro and make
those members part of the centrally defined struct, renaming them
to make it clear they're considered private to lex.c.
src/cmd/ksh93/edit/edit.c:
- Now that we can get its size, save and restore the shell lexing
context when a KEYBD trap is present.
src/cmd/ksh93/tests/pty.sh:
- Add a regression test for the KEYBD trap crash.
Co-authored-by: Martijn Dekker <martijn@inlv.org>
Ksh segfaults on M1 Macs, apparently because Apple's compiler
optimizer can't cope with overriding bzero(3) with libast's
implementation (though it's nothing more than "memset(b, 0, n);").
Apple has disabled libast's bzero function since 2018-12-04:
https://opensource.apple.com/source/ksh/ksh-27/patches/src__lib__libast__comp__omitted.c.diff.auto.html
src/lib/libast/comp/omitted.c:
- Only fall back to the libast bzero function if the OS lacks an
implementation of bzero.
- Remove the bzero undef since this fallback is only reached if the
OS doesn't have bzero.
NEWS:
- Add compat info for macOS on ARM64. This notes that macOS
Monterey can now compile and run ksh, although there is one
regression test failure:
builtins.sh[345]: printf %H: invalid UTF-8 characters
(expected %3F%C2%86%3F%3F%3F; got %3F%C2%86%3F%3Fv%3F%3F)
Thanks to @DesantBucie for the report and the testing.
Resolves: https://github.com/ksh93/ksh/issues/329
This bug was first reported in <https://www.illumos.org/issues/7694>.
The time keyword currently overrides the errexit shell option,
allowing failing scripts to continue after an error:
$ cat 1.sh
#!/bin/sh
time false # This should cause the script to exit
echo FAILURE
true
$ ksh -o errexit 1.sh
real 0m0.00s
user 0m0.00s
sys 0m0.00s
FAILURE
src/cmd/ksh93/sh/xec.c:
- When the time keyword runs a command, pass the errexit state flag
to the sh_exec call. This state flag is required for ksh to exit
when a command fails while the errexit option is on.
src/cmd/ksh93/tests/basic.sh:
- Add a regression test based on the reproducer.
This reverts commit 2b9cbbbc8e.
This is not ready for prime time. Crashses when running a $PS2
discipline function. This needs fixing and more testing in
development before making it into the 1.0 branch. In the meantime,
that terrible problem with types is back, sorry about that.
This commit mitigates the effects of the hack explained in the
referenced commit so that dummy built-in command nodes added by the
parser for declaration/assignment purposes do not leak out into the
execution level, except in a relatively harmless corner case.
Something like
if false; then
typeset -T Foo_t=(integer -i bar)
fi
will no longer leave a broken dummy Foo_t declaration command. The
same applies to declaration commands created with enum.
The corner case remaining is:
$ ksh -c 'false && enum E_t=(a b c); E_t -a x=(b b a c)'
ksh: E_t: not found
Since the 'enum' command is not executed, this should have thrown
a syntax error on the 'E_t -a' declaration:
ksh: syntax error at line 1: `(' unexpected
This is because the -c script is parsed entirely before being
executed, so E_t is recognised as a declaration built-in at parse
time. However, the 'not found' error shows that it was successfully
eliminated at execution time, so the inconsistent state will no
longer persist.
This fix now allows another fix to be effective as well: since
built-ins do not know about virtual subshells, fork a virtual
subshell into a real subshell before adding any built-ins.
src/cmd/ksh93/sh/parse.c:
- Add a pair of functions, dcl_hactivate() and dcl_dehacktivate(),
that (de)activate an internal declaration built-ins tree into
which check_typedef() can pre-add dummy type declaration command
nodes. A viewpath from the main built-ins tree to this internal
tree is added, unifying the two for search purposes and causing
new nodes to be added to the internal tree. When parsing is done,
we close that viewpath. This hides those pre-added nodes at
execution time. Since the parser is sometimes called recursively
(e.g. for command substitutions), keep track of this and only
activate and deactivate at the first level.
- We also need to catch errors. This is done by setting libast's
error_info.exit variable to a dcl_exit() function that tidies up
and then passes control to the original (usually sh_exit()).
- sh_cmd(): This is the most central function in the parser. You'd
think it was sh_parse(), but $(modern)-form command substitutions
use sh_dolparen() instead. Both call sh_cmd(). So let's simply
add a dcl_hacktivate() call at the beginning and a
dcl_deactivate() call at the end.
- assign(): This function calls path_search(), which among many
other things executes an FATH search, which may execute arbitrary
code at parse time (!!!). So, regardless of recursion level,
forcibly dehacktivate() to avoid those ugly parser side effects
returning in that context.
src/cmd/ksh93/bltins/enum.c: b_enum():
- Fork a virtual subshell before adding a built-in.
src/cmd/ksh93/sh/xec.c: sh_exec():
- Fork a virtual subshell when detecting typeset's -T option.
Improves fix to https://github.com/ksh93/ksh/issues/256
The -d flag implemented in the rm builtin is completely broken. No
matter what you do it refuses to remove directories, even if -r is
also passed. Reproducer:
$ mkdir /tmp/empty
$ PATH=/opt/ast/bin rm -d /tmp/empty
rm: /tmp/empty: directory
$ PATH=/opt/ast/bin rm -dr /tmp/empty
rm: /tmp/empty: directory not removed [Is a directory]
Additionally, the description of 'rm -d' in the man page contradicts
how it's specified in <https://www.austingroupbugs.net/view.php?id=802>.
The ksh93v- rm builtin fixed nearly all of these issues, so I've
backported it to 93u+m and applied one additional fix for 'rm -rd'.
src/lib/libcmd/rm.c:
- Backported the fixes from the ksh93v- rm builtin's -d flag when
used on empty directories.
- Backported the man page update for rm(1) from ksh93v-.
- The ksh93v- rm builtin had one additional bug that caused the -r
option to fail when combined with -d. This was fixed by
overriding -d if -r is also passed.
src/cmd/ksh93/tests/builtins.sh:
- Add regression tests for the rm builtin's -d option.
POSIXly, '.' loads only files, not functions.
This only applies to '.', not 'source' (which is not in POSIX).
src/cmd/ksh93/bltins/misc.c: b_source():
- For ksh function lookup, add an additional check that we're not
in POSIX mode and running the '.' (SYSDOT) builtin.
Defining a .sh.tilde.get or .sh.tilde.set discipline function to
extend tilde expansion works well as long as the discipline
function doesn't get interrupted (e.g. with Crtl+C) or produce an
error message. Either of those will cause the shell to become
unstable and crash.
This feature is now removed from the 1.0 branch as it is not ready
for prime time. It can return to a release branch if/when we manage
to fix it on the master branch.
Related: https://github.com/ksh93/ksh/issues/346
Within arithmetic expressions, enumeration values of variables of a
type created with the 'enum' command translate to index numbers
from 0 to the number of elements minus 1. However, there was no
range checking on this in the arithmetic subsystem, allowing the
assignment of out-of-range values that did not correspond to any
enumeration value.
Variables of an enum type are internally unsigned short integers
(NV_UINT16), like those created with 'integer -su', except with an
additional discipline function (ENUM_disc).
src/cmd/ksh93/bltins/enum.c,
src/cmd/ksh93/include/builtins.h:
- To implement range checking, the arithmetic system needs access
to the 'nelem' (number of elements) member of 'struct Enum'. This
is only defined locally in enum.c. We could move that to name.h
so arith.c can access it, but enum.c has code that supports
compiling as standalone. So, instead, define a quick extern
function, b_enum_elem(), that does the necessary type conversion
and returns a type's number of elements.
- Add --man documentation for the arithmetic subsystem behaviour
for enum types. Tell the enuminfo() function, which dynamically
inserts values into the documentation, how to process new \f tags
'lastv' (the last-defined value) and 'lastn' (the number of the
last element).
src/cmd/ksh93/sh/arith.c: arith():
- For NV_UINT16 variables with an ENUM_disc discipline, check the
range using b_enum_elem() and error out if necessary.
Resolves: https://github.com/ksh93/ksh/issues/335
This commit adds support for 'stty size' to the stty builtin, as
defined in <https://austingroupbugs.net/view.php?id=1053>. The size
mode is used to display the terminal's number of rows and columns.
Note that stty isn't included in the default list of builtin
commands; testing this addition requires adding CMDLIST(stty) to
the table of builtins in src/cmd/ksh93/data/builtins.c.
src/lib/libcmd/stty.c:
- Add support for the size mode to the stty builtin. This mode is
only used to display the terminal's number of rows and columns,
so error out if any arguments are given that attempt to set the
terminal size.
Parser limitations prevent shcomp or source from handling enum
types correctly:
$ cat /tmp/colors.sh
enum Color_t=(red green blue orange yellow)
Color_t -A Colors=([foo]=red)
$ shcomp /tmp/colors.sh > /dev/null
/tmp/colors.sh: syntax error at line 2: `(' unexpected
$ source /tmp/colors.sh
/bin/ksh: source: syntax error: `(' unexpected
Yet, for types created using 'typeset -T', this works. This is done
via a check_typedef() function that preliminarily adds the special
declaration builtin at parse time, with details to be filled in
later at execution time.
This hack will produce ugly undefined behaviour if the definition
command creating that type built-in is then not actually run at
execution time before the type built-in is accessed.
But the hack is necessary because we're dealing with a fundamental
design flaw in the ksh language. Dynamically addable built-ins that
change the syntactic parsing of the shell language on the fly are
an absurdity that violates the separation between parsing and
execution, which muddies the waters and creates the need for some
kind of ugly hack to keep things like shcomp more or less working.
This commit extends that hack to support enum.
src/cmd/ksh93/sh/parse.c:
- check_typedef():
- Add 'intypeset' parameter that should be set to 1 for typeset
and friends, 2 for enum.
- When processing enum arguments, use AST getopt(3) to skip over
enum's options to find the name of the type to be defined.
(getopt failed if we were running a -c script; deal with this
by zeroing opt_info.index first.)
- item(): Update check_typedef() call, passing lexp->intypeset.
- simple(): Set lexp->intypeset to 2 when processing enum.
The rest of the changes are all to support the above and should be
fairly obvious, except:
src/cmd/ksh93/bltins/enum.c:
- enuminfo(): Return on null pointer, avoiding a crash upon
executing 'Type_t --man' if Type_t has not been fully defined due
to the definition being pre-added at parse time but not executed.
It's all still wrong, but a crash is worse.
Resolves: https://github.com/ksh93/ksh/issues/256
Listing types with 'typeset -T' will list not only types created with
typeset, but also types created with enum. However, the types created
by enum are not displayed correctly in the resulting output:
$ enum Foo_t=(foo bar)
$ typeset -T
typeset -T Foo_t
typeset -T Foo_t=fo)
The fix for this bug was backported from ksh93v- 2013-10-08.
src/cmd/ksh93/sh/nvtype.c:
- sh_outtype(): Skip over enums when listing types with 'typeset -T'.
This commit fixes an issue I found in the subshell $RANDOM
reseeding code.
The main issue is a performance regression in the shbench fibonacci
benchmark, introduced in commit af6a32d1. Performance dropped in
this benchmark because $RANDOM is always reseeded and restored,
even when it's never used in a subshell. Performance results from
before and after this performance fix (results are on Linux with
CC=gcc and CCFLAGS='-O2 -D_std_malloc'):
$ ./shbench -b bench/fibonacci.ksh -l 100 ./ksh-0f06a2e ./ksh-af6a32d ./ksh-f31e368 ./ksh-randfix
benchmarking ./ksh-0f06a2e, ./ksh-af6a32d, ./ksh-f31e368, ./ksh-randfix ...
*** fibonacci.ksh ***
# ./ksh-0f06a2e # Recent version of ksh93u+m
# ./ksh-af6a32d # Commit that introduced the regression
# ./ksh-f31e368 # Commit without the regression
# ./ksh-randfix # Ksh93u+m with this patch applied
-------------------------------------------------------------------------------------------------
name ./ksh-0f06a2e ./ksh-af6a32d ./ksh-f31e368 ./ksh-randfix
-------------------------------------------------------------------------------------------------
fibonacci.ksh 0.481 [0.459-0.515] 0.472 [0.455-0.504] 0.396 [0.380-0.442] 0.407 [0.385-0.439]
-------------------------------------------------------------------------------------------------
src/cmd/ksh93/include/variables.h,
src/cmd/ksh93/sh/{init,subshell}.c:
- Rather than reseed $RANDOM every time a subshell is created, add
a sh_save_rand_seed() function that does this only when the
$RANDOM variable is used in a subshell. This function is called
by the $RANDOM discipline functions nget_rand() and put_rand().
As a minor optimization, sh_save_rand_seed doesn't reseed if it's
called from put_rand().
- Because $RANDOM may have a seed of zero (i.e., RANDOM=0),
sp->rand_seed isn't enough to tell if $RANDOM has been reseeded.
Add sp->rand_state for this purpose.
- sh_subshell(): Only restore the former $RANDOM seed and state if
it is necessary to prevent a subshell leak.
src/cmd/ksh93/tests/variables.sh:
- Add two regression tests for bugs I ran into while making this
patch.
'printf' on bash and zsh has a popular -v option that allows
assigning formatted output directly to variables without using a
command substitution. This is much faster and avoids snags with
stripping final linefeeds. AT&T had replicated this feature in the
abandoned 93v- beta version. This backports it with a few tweaks
and one user-visible improvement.
The 93v- version prohibited specifying a variable name with an
array subscript, such as printf -v var\[3\] foo. This works fine on
bash and zsh, so I see no reason why this should not work on ksh,
as nv_putval() deals with array subscripts just fine.
src/cmd/ksh93/bltins/print.c: b_print():
- While processing the -v option when called as printf, get a
pointer to the variable, creating it if necessary. Pass only the
NV_VARNAME flag to enforce a valid variable name, and not (as
93v- does) the NV_NOARRAY flag to prohibit array subscripts.
- If a variable was given, set the output file to an internal
string buffer and jump straight to processing the format.
- After processing the format, assign the contents to the string
buffer to the variable.
src/cmd/ksh93/data/builtins.c:
- Document the new option, adding a warning that unquoted square
brackets may trigger pathname expansion.
In C/POSIX arithmetic, a leading 0 denotes an octal number, e.g.
010 == 8. But this is not a desirable feature as it can cause
problems with processing things like dates with a leading zero.
In ksh, you should use 8#10 instead ("10" with base 8).
It would be tolerable if ksh at least implemented it consistently.
But AT&T made an incredible mess of it. For anyone who is not
intimately familiar with ksh internals, it is inscrutable where
arithmetic evaluation special-cases a leading 0 and where it
doesn't. Here are just some of the surprises/inconsistencies:
1. The AT&T maintainers tried to honour a leading 0 inside of
((...)) and $((...)) and not for arithmetic contexts outside it,
but even that inconsistency was never quite consistent.
2. Since 2010-12-12, $((x)) and $(($x)) are different:
$ /bin/ksh -c 'x=010; echo $((x)) $(($x))'
10 8
That's a clear violation of both POSIX and the principle of
least astonishment. $((x)) and $(($x)) should be the same in
all cases.
3. 'let' with '-o letoctal' acts in this bizarre way:
$ set -o letoctal; x=010; let "y1=$x" "y2=010"; echo $y1 $y2
10 8
That's right, 'let y=$x' is different from 'let y=010' even
when $x contains the same string value '010'! This violates
established shell grammar on the most basic level.
This commit introduces consistency. By default, ksh now acts like
mksh and zsh: the octal leading zero is disabled in all arithmetic
contexts equally. In POSIX mode, it is enabled equally.
The one exception is the 'let' built-in, where this can still be
controlled independently with the letoctal option as before (but,
because letoctal is synched with posix when switching that on/off,
it's consistent by default).
We're also removing the hackery that causes variable expansions for
the 'let' builtin to be quietly altered, so that 'x=010; let y=$x'
now does the same as 'let y=010' even with letoctal on.
Various files:
- Get rid of now-redundant sh.inarith (shp->inarith) flag, as we're
no longer distinguishing between being inside or outside ((...)).
src/cmd/ksh93/sh/arith.c:
- arith(): Let disabling POSIX octal constants by skipping leading
zeros depend on either the letoctal option being off (if we're
running the "let" built-in") or the posix option being off.
- sh_strnum(): Preset a base of 10 for strtonll(3) depending on the
posix or letoctal option being off, not on the sh.inarith flag.
src/cmd/ksh93/include/argnod.h,
src/cmd/ksh93/sh/args.c,
src/cmd/ksh93/sh/macro.c:
- Remove astonishing hackery that violated shell grammar for 'let'.
src/cmd/ksh93/sh/name.c (nv_getnum()),
src/cmd/ksh93/sh/nvdisc.c (nv_getn()):
- Remove loops for skipping leading zeroes that included a broken
check for justify/zerofill attributes, thereby fixing this bug:
$ typeset -Z x=0x15; echo $((x))
-ksh: x15: parameter not set
Even if this code wasn't redundant before, it is now: sh_arith()
is called immediately after the removed code and it ignores
leading zeroes via sh_strnum() and strtonll(3).
Resolves: https://github.com/ksh93/ksh/issues/334
This commit fixes two file descriptor leaks in the hist built-in.
The bugfix for the first file descriptor leak was backported from
ksh2020. See:
https://github.com/att/ast/issues/872https://github.com/att/ast/commit/73bd61b5
Reproducer:
$ echo no
$ hist -s no=yes
The second file descriptor leak occurs after a substitution error
in the hist built-in (this leak wasn't fixed in ksh2020).
Reproducer:
$ echo no
$ ls /proc/$$/fd
$ hist -s no=yes
$ hist -s no=yes
$ ls /proc/$$/fd
src/cmd/ksh93/bltins/hist.c:
- Close leftover file descriptors when an error occurs and after
'hist -s' runs a command.
src/cmd/ksh93/tests/builtins.sh:
- Add two regression tests for both of the file descriptor leaks.
The --posix compliance option now disables the case-insensitive
special floating point constants Inf and NaN so that all case
variants of $((inf)) and $((nan)) refer to the variables by those
names as the standard requires. (BUG_ARITHNAN)
src/cmd/ksh93/sh/arith.c: arith():
- Only do case-insensitive checks for "Inf" and "NaN" if the POSIX
option is off.
ksh crashed after unsetting .sh.match and then matching a pattern:
$ unset .sh.match
$ [[ bar == ba* ]]
Memory fault
src/cmd/ksh93/sh/init.c: sh_setmatch():
- Do nothing if we cannot get an array pointer to SH_MATCHNOD.
Symptoms:
$ test \( string1 -a string2 \)
/usr/local/bin/ksh: test: argument expected
$ test \( string1 -o string2 \)
/usr/local/bin/ksh: test: argument expected
The parentheses should be irrelevant and this should be a test for
the non-emptiness of string1 and/or string2.
src/cmd/ksh93/bltins/test.c:
- b_test(): There is a block where the case of 'test' with five or
less arguments, the first and last one being parentheses, is
special-cased. The parentheses are removed as a workaround: argv
is increased to skip the opening parenthesis and argc is
decreased by 2. However, there is no corresponding increase of
tdata.av which is a copy of this function's argv. This renders
the workaround ineffective. The fix is to add that increase.
- e3(): Do not handle '!' as a negator if not followed by an
argument. This allows a right-hand expression that is equal to
'!' (i.e. a test for the non-emptiness of the string '!').
In ksh88, the test/[ built-in supported both the '<' and '>'
lexical sorting comparison operators, same as in [[. However, in
every version of ksh93, '<' does not work though '>' still does!
Still, the code for both is present in test_binop():
src/cmd/ksh93/bltins/test.c
548: case TEST_SGT:
549: return(strcoll(left, right)>0);
550: case TEST_SLT:
551: return(strcoll(left, right)<0);
Analysis: The binary operators are looked up in shtab_testops[] in
data/testops.c using a macro called sh_lookup, which expands to a
sh_locate() call. If we examine that function in sh/string.c, it's
easy to see that on systems using ASCII (i.e. all except IBM
mainframes), it assumes the table is sorted in ASCII order.
src/cmd/ksh93/sh/string.c
64: while((c= *tp->sh_name) && (CC_NATIVE!=CC_ASCII || c <= first))
The problem was that the '<' operator was not correctly sorted in
shtab_testops[]; it was sorted immediately before '>', but after
'='. The ASCII order is: < (60), = (61), > (62). This caused '<' to
never be found in the table.
The test_binop() function is also used by [[, yet '<' always worked
in that. This is because the parser has code that directly checks
for '<' and '>' within [[ (in sh/parse.c, lines 1949-1952).
This commit also adds '=~' to 'test', which took three lines of
code and allowed eliminating error handling in test_binop() as
test/[ and [[ now support the same binary ops. (re: fc2d5a60)
src/cmd/ksh93/*/*.[ch]:
- Rename a couple of very misleadingly named macros in test.h:
. For == and !=, the TEST_PATTERN bit is off for pattern compares
and on for literal string compares! Rename to TEST_STRCMP.
. The TEST_BINOP bit does not denote all binary operators, but
only the logical -a/-o ops in test/[. Rename to TEST_ANDOR.
src/cmd/ksh93/bltins/test.c: test_binop():
- Add support for =~. This is only used by test/[. The method is
implemented in two lines that convert the ERE to a shell pattern
by prefixing it with ~(E), then call test_strmatch with that
temporary string to match the ERE and update ${.sh.match}.
- Since all binary ops from shtab_testops[] are now accounted for,
remove unknown op error handling from this function.
src/cmd/ksh93/data/testops.c:
- shtab_testops[]:
. Correctly sort the '<' (TEST_SLT) entry.
. Remove ']]' (TEST_END). It's not an op and doesn't belong here.
- Update sh_opttest[] documentation with =~, \<, \>.
- Remove now-unused e_unsupported_op[] error message.
src/cmd/ksh93/sh/lex.c: sh_lex():
- Check for ']]' directly instead of relying on the removed
TEST_END entry from shtab_testops[].
src/cmd/ksh93/tests/bracket.sh:
- Add relevant tests.
src/cmd/ksh93/tests/builtins.sh:
- Fix an old test that globally deleted the 'test' builtin. Delete
it within the command substitution subshell only.
- Remove the test for non-support of =~ in test/[.
- Update the test for invalid test/[ op to use test directly.
POSIX requires
test "$a" -a "$b"
to return true if both $a and $b are non-empty, and
test "$a" -o "$b"
to return true if either $a or $b is non-empty.
In ksh, this fails if "$a" is '!' or '(' as this causes ksh to
interpret the -a and -o as unary operators (-a being a file
existence test like -e, and -o being a shell option test).
$ test ! -a ""; echo "$?"
0 (expected: 1/false)
$ set -o trackall; test ! -o trackall; echo "$?"
1 (expected: 0/true)
$ test \( -a \); echo "$?"
ksh: test: argument expected
2 (expected: 0/true)
$ test \( -o \)
ksh: test: argument expected
2 (expected: 0/true)
Unfortunately this problem cannot be fixed without risking breakage
in legacy scripts. For instance, a script may well use
test ! -a filename
to check that a filename is nonexistent. POSIX specifies that this
always return true as it is a test for the non-emptiness of both
strings '!' and 'filename'.
So this commit fixes it for POSIX mode only.
src/cmd/ksh93/bltins/test.c: e3():
- If the posix option is active, specially handle the case of
having at least three arguments with the second being -a or -o,
overriding their handling as unary operators.
src/cmd/ksh93/data/testops.c:
- Update 'test --man --' date and say that unary -a is deprecated.
src/cmd/ksh93/sh.1:
- Document the fix under the -o posix option.
- For test/[, explain that binary -a/-o are deprecated.
src/cmd/ksh93/tests/bracket.sh:
- Add tests based on reproducers in bug report.
Resolves: https://github.com/ksh93/ksh/issues/330
Stéphane Chazelas reported:
> As noted in this austin-group-l discussion[*] (relevant to this
> issue):
>
> $ ksh93u+m -c 'pwd; echo "$?" >&2; echo test; echo "$?" >&2' >&-
> 0
> 1
> /home/chazelas
>
> when stdout is closed, pwd does claim it succeeds (by returning a
> 0 exit status), while echo doesn't (not really relevant to the
> problem here, only to show it doesn't affect all builtins), and
> the output that pwd failed to write earlier ends up being written
> on stderr here instead of stdout upon exit (presumably) because
> of that >&2 redirection.
>
> strace shows ksh93 attempting write(1, "/home/chazelas\n", 15) 6
> times (1, the last one, successful).
>
> It gets even weirder when redirecting to a file:
>
> $ ksh93u+m -c 'pwd; echo "$?" >&2; echo test; echo "$?" > file' >&-
> 0
> $ cat file
> 1
> 1
> ome/chazelas
In my testing, the problem does not occur when closing stdout at
the start of the -c script itself (using redirect >&- or exec >&-);
it only occurs if stdout was closed before initialising the shell.
That made me suspect that the problem had to do with an
inconsistent file descriptor state in the shell. ksh uses internal
sh_open() and sh_close() functions, among others, to maintain that
state.
src/cmd/ksh93/sh/main.c: sh_main():
- If the shell is initialised with stdin, stdout or stderr closed,
then make the shell's file descriptor state tables reflect that
fact by calling sh_close() for the closed file descriptors.
This commit also improves the BUG_PUTIOERR fix from 93e15a30. Error
checking after sfsync() is not sufficient. For instance, on
FreeBSD, the following did not produce a non-zero exit status:
ksh -c 'echo hi' >/dev/full
even though this did:
ksh -c 'echo hi >/dev/full'
Reliable error checking requires not only checking the result of
every SFIO command that writes output, but also synching the buffer
at the end of the operation and checking the result of that.
src/cmd/ksh93/bltins/print.c:
- Make exitval variable global to allow functions called by
b_print() to set a nonzero exit status.
- Check the result of all SFIO output commands that write output.
- b_print(): Always sfsync() at the end, except if the s (history)
flag was given. This allows getting rid of the sfsync() call that
required the workaround introduced in 846ad932.
[*] https://www.mail-archive.com/austin-group-l@opengroup.org/msg08056.html
Resolves: https://github.com/ksh93/ksh/issues/314
Bug 1: POSIX requires numbers used as arguments for all the %d,
%u... in printf to be interpreted as in the C language, so
printf '%d\n' 010
should output 8 when the posix option is on. However, it outputs 10.
This bug was introduced as a side effect of a change introduced in
the 2012-02-07 version of ksh 93u+m, which caused the recognition
of leading-zero numbers as octal in arithmetic expressions to be
disabled outside ((...)) and $((...)). However, POSIX requires
leading-zero octal numbers to be recognised for printf, too.
The change in question introduced a sh.arith flag that is set while
we're processing a POSIX arithmetic expression, i.e., one that
recognises leading-zero octal numbers.
Bug 2: Said flag is not reset in a command substitution used within
an arithmetic expression. A command substitution should be a
completely new context, so the following should both output 10:
$ ksh -c 'integer x; x=010; echo $x'
10 # ok; it's outside ((…)) so octals are not recognised
$ ksh -c 'echo $(( $(integer x; x=010; echo $x) ))'
8 # bad; $(comsub) should create new non-((…)) context
src/cmd/ksh93/bltins/print.c: extend():
- For the u, d, i, o, x, and X conversion modifiers, set the POSIX
arithmetic context flag before calling sh_strnum() to convert the
argument. This fixes bug 1.
src/cmd/ksh93/sh/subshell.c: sh_subshell():
- When invoking a command substitution, save and unset the POSIX
arithmetic context flag. Restore it at the end. This fixes bug 2.
Reported-by: @stephane-chazelas
Resolves: https://github.com/ksh93/ksh/issues/326
When invoking a script without an interpreter (#!hashbang) path,
ksh forks, but there is no exec syscall in the child. The existing
command line is overwritten in fixargs() with the name of the new
script and associated arguments. In the generic/fallback version of
fixargs() which is used on Linux and macOS, if the new command line
is longer than the existing one, it is truncated. This works well
when calling a script with a shorter name.
However, it generates a misleading name in the common scenario
where a script is invoked from an interactive shell, which
typically has a short command line. For instance, if "/tmp/script"
is invoked, "ksh" gets replaced with "/tm" in "ps" output.
A solution is found in the fact that, on these systems, the
environment is stored immediately after the command line arguments.
This space can be made available for use by a longer command line
by moving the environment strings out of the way.
src/cmd/ksh93/sh/main.c: fixargs():
- Refactor BSD setproctitle(3) version to be more self-contained.
- In the generic (Linux/macOS) version, on init (i.e. mode==0), if
the command line is smaller than 128 bytes and the environment
strings have not yet been moved (i.e. if they still immediately
follow the command line arguments in memory), then strdup the
environment strings, pointing the *environment[] members to the
new strings and adding the length of the strings to the maximum
command line buffer size.
Reported-by: @gkamat
Resolves: https://github.com/ksh93/ksh/pull/300