1
0
Fork 0
mirror of git://git.code.sf.net/p/cdesktopenv/code synced 2025-02-13 19:52:20 +00:00
Commit graph

190 commits

Author SHA1 Message Date
Martijn Dekker
a14d17c0f4 Allow turning off brace expansion in comsubs (rhbz#1078698)
There was no check for the -B/braceexpand option before calling
path_expand() to process brace expansion, making it impossible to
turn off brace expansion within command substitutions. Normally the
lexer flags brace expansion so that this code is not reached, but
shell code within command substitutions is handled differently.

Red Hat patches this by adding this check to the function itself:
https://src.fedoraproject.org/rpms/ksh/blob/642af4d6/f/ksh-20140301-fikspand.patch
But I think it's more logical to patch it at the point of decision.

src/cmd/ksh93/sh/macro.c: endfield():
- Decide to call either path_generate() or path_expand() based on
  the state of the SH_BRACEEXPAND shell option.
- Fix '#if SHOPT_BRACEPAT' preprocessor check that previously
  hardcoded this decision at compile time.

src/cmd/ksh93/tests/options.sh:
- Add tests.
2020-09-24 08:21:37 +02:00
Martijn Dekker
3654ee73c0 Fix typeset -l/-u crash on special vars (rhbz#1083713)
When using typeset -l or -u on a variable that cannot be changed
when the shell is in restricted mode, ksh crashed.

This fixed is inspired by this Red Hat fix, which is incomplete:
https://src.fedoraproject.org/rpms/ksh/blob/642af4d6/f/ksh-20120801-tpstl.patch

The crash was caused by the nv_shell() function. It walks though a
discipline function tree to get the pointer to the interpreter
associated with it. Evidently, the problem is that some pointer in
that walk is not set correctly for all special variables.

Thing is, ksh only has one shell language interpreter, and only one
global data structure (called 'sh') to keep its main state[*]. Yet,
the code is full of 'shp' pointers to that structure. Most (not
all) functions pass that pointer around to each other, accessing
that struct indirectly, ostensibly to account for the non-existent
possibility that there might be more than one interpreter state.
The "why" of that is an interesting cause for speculation that I
may get to sometime. For now, it is enough to know that, in the
code as it is, it matters not one iota what pointer to the shell
interpreter state is used; they all point to the same thing (unless
it's broken, as in this bug).

So, rather than fixing nv_shell() and/or associated pointer
assignments, this commit simply removes it, and replaces it with
calls to sh_getinterp(), which always returns a pointer to sh (see
init.c, where that function is defined as literally 'return &sh').

[*] Defined in shell.h, with the _SH_PRIVATE part in defs.h

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/name.c:
- Remove nv_shell().

src/cmd/ksh93/sh/init.c:
- In all the discipline functions for special variables, initialise
  shp using sh_getinterp() instead of nv_shell().

src/cmd/ksh93/tests/variables.sh:
- Add regression test for typeset -l/-u on all special variables.
2020-09-24 03:03:29 +02:00
Martijn Dekker
ce68e1be37 Fix crash in backtick comsubs with job control on (rhbz#825520)
This imports another fix from Red Hat/Fedora. Original patch:
https://src.fedoraproject.org/rpms/ksh/blob/642af4d6/f/ksh-20120801-crash.patch

src/cmd/ksh93/include/jobs.h,
src/cmd/ksh93/sh/jobs.c,
src/cmd/ksh93/sh/subshell.c,
src/cmd/ksh93/sh/xec.c:
- Import the Red Hat fix with these differences:
  - Rename the 'hack1_waitall' variable to 'bktick_waitall' and add
    a comment describing what it's for.
  - Remove unused 'pipefail' variable.

src/cmd/ksh93/tests/basic.sh:
- Regression test from reproducer given in the Red Hat bug report.
- Add special handling to SIGKILL it, as it might freeze hard.
2020-09-23 01:56:09 +02:00
Martijn Dekker
fe6d0903dc Fix v=$(<file) for closed FD 0,1,2 (rhbz#1066589)
var=$(< file) now reads the file even if the standard inout,
standard output and/or standard error file descriptors are closed.

Original patch:
https://src.fedoraproject.org/rpms/ksh/blob/642af4d6/f/ksh-20120801-filecomsubst.patch

src/cmd/ksh93/sh/io.c: sh_redirect():
- When processing the '<' redirector as part of $(< ...), i.e. if
  flag==3, make sure the FD of the file to read is > 2 by calling
  sh_iomovefd(). Unlike the RedHat patch, this checks for flag==3
  to avoid unnecessary sh_iomovefd() calls for normal redirections,
  as there was no bug with those.

src/cmd/ksh93/tests/io.sh:
- Add test.
2020-09-22 03:02:06 +02:00
Martijn Dekker
5683155cb5 update NEWS, SH_RELEASE (re: 970069a6) 2020-09-22 01:45:01 +02:00
Martijn Dekker
a329c22dba Multiple 'whence' and path search fixes
Hopefully this doesn't introduce new bugs, but it does fix at
least the following:

1. When whence -v/-a found an "undefined" (i.e. autoloadable)
   function in $FPATH, it actually loaded the function as a side
   effect of reporting on its existence (!). Now it only reports.

2. 'whence' will now canonicalise paths properly. Examples:
	$ whence ///usr/lib/../bin//./env
	/usr/bin/env
	$ (cd /; whence -v dev/../usr/bin//./env)
	dev/../usr/bin//./env is /usr/bin/env

3. 'whence' no longer prefixes a spurious double slash when doing
   something like 'cd / && whence bin/echo'. On Cygwin, an initial
   double slash denotes a network server, so this was not just a
   cosmetic problem.

4. 'whence -a' now reports a "tracked alias" (a.k.a. hash table
   entry, i.e. cached $PATH search) even if an actual alias by the
   same name exists. This needed fixing because in fact the hash
   table entry continues to be used when bypassing the alias.
   Aliases and "tracked aliases" are not remotely the same thing;
   confusing nomenclature is not a reason to report wrong results.

5. When using 'hash' or 'alias -t' on a command that is also a
   builtin to force caching a $PATH search for the external
   command, 'whence -a' double-reported the path:
	$ hash printf; whence -a printf
	printf is a shell builtin
	printf is /usr/bin/printf
	printf is a tracked alias for /usr/bin/printf
   This is now fixed so that the second output line is gone.
   Plus, if there were multiple versions of the command on $PATH,
   the tracked alias was reported at the end, which is the wrong
   order. This is also fixed.

src/cmd/ksh93/bltins/whence.c: whence():
- Refactor the do...while loop that handles whence -v/-a for path
  searches in such a way that the code actually makes sense and
  stops looking like higher esotericism. Just doing this fixed #2,
  #4 and #5 above (the latter two before I even noticed them). For
  instance, the path_fullname() call to canonicalise paths was
  already there; it was just never used.
- Remove broken 'notrack' flaggery for deciding whether to report a
  hash table entry a.k.a. "tracked alias"; instead, check the hash
  table (shp->track_tree).

src/cmd/ksh93/sh/path.c:
- path_search(): Re #3: When prefixing the PWD, first check if
  we're in '/' and if so, don't prefix it; otherwise, adding the
  next slash causes an initial double slash. (Since '/' is the only
  valid single-character absolute path, all we need to do is check
  if the second character pwd[1] is non-null.)
- path_search(): Re #1: Stop autoloading when called by 'whence':
  * The 'flag==2' check to avoid autoloading a function was
    broken. The flag value is 2 on the first whence() loop
    iteration, but 3 on subsequent ones. Change to 'flag >= 2'.
  * However, this only fixes it if the function file does not have
    the x permission bit, as executable files are handled by
    path_absolute() which unconditionally autoloads functions!
    So, pass on our flag parameter when callling path_absolute().
- path_absolute(): Re #1: Add flag parameter. Do not autoload
  functions if flag >= 2.

src/cmd/ksh93/include/path.h,
src/cmd/ksh93/bltins/typeset.c,
src/cmd/ksh93/sh/main.c,
src/cmd/ksh93/sh/xec.c:
- Re #1: Update path_absolute() calls, adding a 0 flag parameter.

src/cmd/ksh93/include/name.h:
- Remove now-unused pathcomp member from union Value. It was
  introduced in 99065353 to allow examining the value of a tracked
  alias. This commit uses nv_getval() instead.

src/cmd/ksh93/tests/builtins.sh,
src/cmd/ksh93/tests/path.sh:
- Add and tweak various related tests.

Fixes: https://github.com/ksh93/ksh/issues/84
2020-09-20 07:56:09 +02:00
Martijn Dekker
f45a0f1650 -o posix: inverse-sync braceexpand; properly sync letoctal
{Brace,expansion} is potentially incompatible with POSIX scripts,
because in POSIX those are simple literal strings with no special
meaning. So the POSIX option should really turn that off.

As of b301d417, the 'posix' option was also forcing 'letoctal'
behaviour on, without actually setting that option. I've since
found that to be a botch; 'let' may recognise octals without that
option being set, and that looks like a bug.

So as of this commit, the '-o posix' option actually toggles both
of these options off/on and on/of, respectively. 'set +o posix'
toggles them inversely. However, it is now possible to control both
options (and their associated behaviour) independently in between
'set -o posix' and 'set +o posix'. Much better.

src/cmd/ksh93/sh/main.c: sh_main():
- If SH_POSIX was set on init, turn on SH_LETOCTAL by default
  instead of SH_BRACEEXPAND.

src/cmd/ksh93/sh/args.c: sh_applyopts():
- Turn off SH_BRACEEXPAND and turn on SH_LETOCTAL when SH_POSIX is
  turned on (but not if it was already on).
- Turn on SH_BRACEEXPAND and turn off SH_LETOCTAL when SH_POSIX is
  turned off (but not if it was already off).

src/cmd/ksh93/sh/arith.c: arith():
- Revert to pre-b301d417 and only check SH_LETOCTAL option when
  deciding whether 'let' should skip initial zeros.

src/cmd/ksh93/tests/options.sh:
- Update $- test to allow '-o posix' to switch B = braceexpand.

src/cmd/ksh93/sh.1:
- Update.
- Edit for clarity.
2020-09-18 22:07:44 +02:00
Martijn Dekker
7e5fd3e98d A few job control (-m, -o monitor) fixes (rhbz#960034)
This patch from Red Hat fixes the following:

1. ksh was ignoring the -m (-o monitor) option when specified on
   the invocation command line.

2. Scripts did not properly terminate their background processes
   on Ctrl+C if the -m option was turned off. Reproducer:
	xterm &
	read junk
   When run as a script without turning on -m, pressing Ctrl+C
   should terminate the xterm, and now does.

3. Scripts no longer attempt to set the terminal foreground process
   group ID, as only interactive shells should be doing that.

This makes some progress on https://github.com/ksh93/ksh/issues/119
but we're a long way from fixing all of that.

src/cmd/ksh93/sh/main.c: exfile():
- On non-interactive shells, do not turn off the monitor option.
  Instead, if it was turned on, turn on the SH_MONITOR state flag.

src/cmd/ksh93/edit/edit.c: ed_getchar():
- On Ctrl+C, issue SIGINT to the current process group using
  killpg(2) instead of going via sh_fault(), which handles a
  signal only for the current shell process.

src/cmd/ksh93/sh/jobs.c: job_reap(), job_reset(),
src/cmd/ksh93/sh/xec.c: sh_exec():
- Only attempt to set the terminal foreground process group ID
  using tcsetpgrp(3) if the shell is interactive.

Original patch: https://src.fedoraproject.org/rpms/ksh/blob/642af4d6/f/ksh-20120801-kshmfix.patch
This was applied to Red Hat's ksh 93u+ on 8 July 2013.
2020-09-18 04:42:27 +02:00
Martijn Dekker
06e721c313 data/signals.c: fix empty SIGINT/SIGPIPE messages
src/cmd/ksh93/data/signals.c includes two checks for the JOBS
identifier; if it is not defined then the interactive shell's
background job signal messages for SIGINT and SIGPIPE are empty.
The cause was that the "jobs.h" header, which defines that ID, was
not #included in signals.c. This commit adds that #include.
(ksh 93u+, ksh 93v- and ksh2020 all have this bug as well.)

Before:

$ sleep 30 &
[1]	86430
$ kill -s INT "$!"
[1] +                          sleep 30 &
$

After:

$ sleep 30 &
[1]	86445
$ kill -s INT "$!"
[1] + Interrupt                sleep 30 &
$
2020-09-18 03:22:26 +02:00
Martijn Dekker
13c3fb21e9 emacs, vi: Support repeat parameters to VT220 keys (re: f2a3f4e3)
In the vi and emacs line editors, repeat count parameters can now
also be used for the arrow keys and the forward-delete key. E.g.,
in emacs mode, <ESC> 7 <left-arrow> will now move the cursor seven
positions to the left. In vi control mode, this would be entered
as: 7 <left-arrow>.

src/cmd/ksh93/edit/emacs.c:
- ed_emacsread(): Upon getting ^[ (ESC), save current repeat count
  in a new variable; restore and reset it upon the next character.
- escape(): Minor bugfix: when processing a ^[[x sequence where 'x'
  is a character other than '~' (which would be DEL), also reinsert
  the final character into the buffer so scripts can detect them.

src/cmd/ksh93/edit/vi.c:
- cntlmode(): Do not reset the repeat count if the command is '[',
  the character following ESC in VT220 escape sequences.
- mvcursor():
  * Do not use getcount() to get the character following '[', as
    that was parsing repetition parameters in the wrong place.
    There wouldn't be any, so this would reset the repeat count.
  * After that, no more need for the special-casing of ^[[3~ (DEL)
    introduced in f2a3f4e3. Move it to within the 'switch' block.
  * When handling left and right arrows and Home and End keys, do
    not modify cursor directly but ed_ungetchar() the corresponding
    traditional command keys as with the rest. Otherwise a repeat
    count parameter would now wrongly survive those keys.

src/cmd/ksh93/sh.1:
- Document control character notation used for vi mode docs.
- Since vi control mode beeps and aborts on ESC except if a
  subsequent [ is already in the input buffer upon receiving ESC,
  document that VT220 escape sequences only preserve repeat counts
  when entered into the input buffer all at once.
- Don't skip the initial ESC in the documentation of the VT220
  escape sequences. In control mode, skipping the initial ESC still
  works as before, but that is now undocumented, as it's really
  nothing more than an artefact of VT220 escape processing.
- Move the two long paragraphs on '-o viraw' and canonical (i.e.
  line-based) input processing from the vi editor introduction to
  the options section under 'viraw'. It is much too arcane for the
  intro, and besides, ksh 93u+ (and hence also 93u+m) has
  SHOPT_VIRAW enabled by default, so the shell is compiled to force
  this option on at all times, making it even less relevant for
  most users.
2020-09-17 19:14:39 +02:00
Martijn Dekker
f2a3f4e36b Handle forward-delete key in emacs and vi editors
On every modern system, the forward-delete key on PC/Mac keyboards
generates the VT220 sequence ESC [ 3 ~. Every other shell with an
editor handles this now, ksh93 seems to be the last not to.

src/cmd/ksh93/edit/emacs.c: escape():
- Handle the ^[[3 as part of normal escape processing, then read an
  extra character to check for the final '~'. If detected, insert
  an ERASECHAR key event.

src/cmd/ksh93/edit/vi.c: mvcursor():
- Replace the ^[[3~ sequence by an 'x' command. We have to
  special-case its processing, because vi mode parses numbers as
  repetition operators. The escape sequence contains a number,
  making it incompatible with normal command handling. This means
  number repetitions don't work with the forward-delete key. If
  that annoys anyone enough to fix it, a patch would be welcome.
  For now, it will do to make the forward-delete key stop
  exhibiting bizarre behaviour (beep + change case + move forward).

src/cmd/ksh93/sh.1
- Copy-edit emacs documentation for VT220-style sequences; map them
  to their actual key, otherwise it's meaningless to the reader.
- Document the new forward-delete key behaviour for emacs mode.
- Leave the forward-delete key for vi mode undocumented for now, as
  repetitions don't work, so it doesn't really match the vi canon.
  (OTOH, it doesn't work in vim, either...)
2020-09-15 03:43:53 +02:00
hyenias
d7c90eadc3 sfio: correct floating decimal point scaling of fractions (#131)
_sfcvt(), "convert a floating point value to ASCII", did not adjust
for negative decimal place movement as what happens with leading
zeroes. This caused ksh's 'printf %f' formatter to fail to round
floating point values correctly.

src/lib/libast/sfio/sfcvt.c:
- Removed constraint of <1e-8 for doubles by matching what was done
  for long doubles having <.1.
- Corrected a condition when the next power of 10 occurred and that
  new 1 digit was being overwritten by a 0.

 src/cmd/ksh93/tests/math.sh:
- Validate that typeset -E/F formatting matches that of their
  equivalent printf formatting options as well as checking for
  correct float scaling of the fractional parts.
2020-09-14 13:46:40 +02:00
Martijn Dekker
ddaa145b3d Reinstate 'r' and 'history' as preset aliases for interactive ksh
Following a community discussion, it became clear that 'r' is
particularly problematic as a regular builtin, as the name can and
does conflict with at least one legit external command by that
name. There was a consensus against removing it altogether and
letting users set the alias in their login scripts. However,
aliases are easier to bypass, remove or rename than builtins are.
My compromise is to reinstate 'r' as a preset alias on interactive
shells only, along with 'history', as was done in 17f81ebe before
they were converted to builtins in 03224ae3. So this reintroduces
the notion of predefined aliases to ksh 93u+m, but only for
interactive shells that are not initialised in POSIX mode.

src/cmd/ksh93/Makefile,
src/cmd/ksh93/Mamfile,
src/cmd/ksh93/include/shtable.h,
src/cmd/ksh93/data/aliases.c:
- Restore aliases.c containing shtab_aliases[], a table specifying
  the preset aliases.

src/cmd/ksh93/include/shtable.h,
src/cmd/ksh93/sh/init.c:
- Rename inittree() to sh_inittree() and make it extern, because we
  need to use it in main.c (sh_main()).

src/cmd/ksh93/sh/main.c: sh_main():
- Init preset aliases from shtab_aliases[] only if the shell is
  interactive and not in POSIX mode.

src/cmd/ksh93/bltins/typeset.c,
src/cmd/ksh93/tests/alias.sh:
- unall(): When unsetting an alias, pass on the NV_NOFREE attribute
  to nv_delete() to avoid an erroneous attempt to free a preset
  alias from read-only memory. See: 5d50f825

src/cmd/ksh93/data/builtins.c:
- Remove "history" and "r" entries from shtab_builtins[].
- Revert changes to inline fc/hist docs in sh_opthist[].

src/cmd/ksh93/bltins/hist.c: b_hist():
- Remove handling for 'history' and 'r' as builtins.

src/cmd/ksh93/sh.1:
- Update accordingly.

Resolves: https://github.com/ksh93/ksh/issues/125
2020-09-11 21:35:45 +02:00
Martijn Dekker
b9d10c5a9c Fix 'command' expansion bug and POSIX compliance
The 'command' name can now result from an expansion, e.g.:
	c=command; "$c" ls
	set -- command ls; "$@"
both work now. This fixes BUG_CMDEXPAN.

If -o posix is on, 'command' now disables not only the "special"
but also the "declaration" properties of builtin commands that it
invokes. This is because POSIX specifies 'command' as a simple
regular builtin, and any command name following 'command' is just
an argument to the 'command' command, so there is nothing that
allows any further arguments (such as assignment-arguments) to be
treated specially by the parser. So, if and only if -o posix is on:
a. Arguments that start with a variable name followed by '=' are
   always treated as regular words subject to normal shell syntax.
b. Since assignment-arguments are not processed as assignments
   before the command itself, 'command' can now stop the shell from
   exiting (as required by the standard) if a command that it
   invokes (such as 'export') tries to modify a readonly variable.
   This fixes BUG_CMDSPEXIT.

Most of 'command' is integrated in the parser and parse tree
executer, so that is where it needed fixing.

src/cmd/ksh93/sh/parse.c: simple():
- If the posix option is on, do not skip past SYSCOMMAND so that
  any declaration builtin commands that are arguments to 'command'
  are not detected and thus not treated specially at parsetime.

src/cmd/ksh93/sh/xec.c: sh_exec():
- When detecting SYSCOMMAND in order to skip past it, not only
  compare the Namval_t pointer 'np' to SYSCOMMAND, but also handle
  the case where that pointer is NULL, as when the command name
  results from an expansion. In that case, search the function tree
  shp->fun_tree for the name and see if that yields the SYSCOMMAND
  pointer. fun_tree is initialised with a dtview to bltin_tree, so
  searching fun_tree instead allows for overriding 'command' with a
  shell function (which the POSIX standard requires us to allow).

src/cmd/ksh93/sh.1,
src/cmd/ksh93/data/builtins.c:
- Update documentation to match these changes.
- Various related edits and improvements.

src/cmd/ksh93/tests/builtins.sh:
- Check that 'command' works if resulting from an expansion.
- Check that 'command' can be overridden by a shell function.
2020-09-11 10:06:43 +02:00
Martijn Dekker
092b90da81 Fix BUG_LOOPRET2 and related return/exit misbehaviour
The 'exit' and 'return' commands without an argument failed to pass
down the exit status of the last-run command when incorporated in a
block with redirection, &&/|| list, 'case' statement, or 'while',
'until' or 'for' loop.

src/cmd/ksh93/bltins/cflow.c:
- Use $?, which is sh.savexit a.k.a. shp->savexit, as the default
  exit status value if there is no argument, instead of
  shp->oldexit. This fixes the default exit status behaviour to
  match POSIX and other shells.

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/include/shell.h:
- Remove now-unused sh.oldexit (a.k.a. shp->oldexit) private struct
  member. It appeared to fulfill the same function as sh.savexit,
  but in a slightly broken way.
- Move the savexit/$? declaration from the _SH_PRIVATE part of the
  struct definition to the public API part. Since $? uses this,
  it's clearly a publicly exposed value already, and this is
  generally the one to use. (If anything, it's exitval that should
  have been private.) This declares savexit right next to exitval,
  rewriting the comments to clarify the difference between them.

src/cmd/ksh93/sh/fault.c,
src/cmd/ksh93/sh/subshell.c,
src/cmd/ksh93/sh/xec.c:
- Remove assignments to shp->oldexit.

src/cmd/ksh93/tests/basic.sh:
- Add thorough regression tests for the default exit status
  behaviour of 'return' and 'exit' in various lexical contexts.
- Verify that 'for' and 'case' without any command, as well as a
  lone redirection, still correctly reset the exit status to 0.

Fixes: #117
2020-09-09 20:02:20 +02:00
Martijn Dekker
5ed9ffd6c4 This fixes erroneous syntax errors in parameter expansions such as
${var:-wor)d} or ${var+w(ord}. The parentheses now correctly lose
their normal grammatical meaning within the braces. Fix by Eric
Scrivner (@etscrivner) from July 2018 backported from ksh2020.

This fix complies with POSIX:
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_02

src/cmd/ksh93/sh/lex.c: sh_lex():
- Set the ST_QUOTE state when analysing a modifier with parameter
  expansions using operators ':', '-', '+', '='. This state causes
  subsequent characters (including parentheses) to be considered
  quoted, suppressing their normal grammatical meaning.

src/cmd/ksh93/sh/macro.c: varsub():
- Same for skipping the expansion.

Fixes: https://github.com/ksh93/ksh/issues/126
Prior discussion: https://github.com/att/ast/issues/475
2020-09-05 16:20:22 +02:00
Martijn Dekker
00d439605f -o posix: don't import/export variable attributes thru environment
When exporting variables, ksh exports their attributes (such as
'integer' or 'readonly') in a magic environment variable called
"A__z" (string defined in e_envmarker[] in data/msg.c). Child
shells recognise that variable and restore the attributes.

This little-known feature is risky; the environment cannot
necessarily be trusted and that A__z variable is easy to manipulate
before or between ksh invocations, so you can cause a script's
variables to be of the wrong type, or readonly. Backwards
compatibility requires keeping it, at least for now. But it should
be disabled in the posix mode, as it violates POSIX.

To do this, we have to solve a catch-22 in init.c. We must parse
options to know whether to turn on posix mode; it may be specified
as '-o posix' on the command line. The option parsing loop depends
on an initialised environment[*], while environment initialisation
(i.e., importing attributes) should depend on the posix option.

The catch-22 can be solved because initialising just the values
before option parsing is enough to avoid regressions. Importing the
attributes can be delayed until after option parsing. That involves
basically splitting env_init() into two parts while keeping a local
static state variable between them.

src/cmd/ksh93/sh/init.c:
- env_init():
  * Split the function in two stages based on a new
    'import_attributes' parameter. Import values in the first
    stage; import attributes from A__z in the second (if ever).
    Make the 'next' variable static as it keeps a state needed for
    the attributes import stage.
  * Single point of truth, greppability: don't hardcode "A__z" in
    separate character comparisons, but use e_envmarker[].
  * Fix an indentation error.
- sh_init(): When initialising the environment (env_init), don't
  import the attributes from A__z yet; parse options first, then
  import attributes only if posix option is not set.

src/cmd/ksh93/sh/name.c:
- sh_envgen(): Don't export variable attributes to A__z if the
  posix option is set.

src/cmd/ksh93/tests/attributes.sh:
- Check that variable attributes aren't imported or exported
  if the POSIX option is set.

src/cmd/ksh93/sh.1:
- Update.

This was the last item on the TODO list for -o posix for now.
Closes: #20

[*] If environment initialisation is delayed until after option
    parsing, bin/shtests shows various regressions, including:
    restricted mode breaks; the locale is not initialised properly
    so that multibyte variable names break; $SHLVL breaks.
2020-09-05 11:41:02 +02:00
Martijn Dekker
bec6556236 update NEWS, SH_RELEASE (re: 6575903d) 2020-09-04 05:29:52 +02:00
Martijn Dekker
55f0f8ce52 -o posix: disable '[ -t ]' == '[ -t 1 ]' hack
On ksh93, 'test -t' is equivalent to 'test -t 1' (and of course
"[ -t ]" is equivalent to "[ -t 1 ]").

This is purely for compatibility with ancient Bourne shell
breakage. No other shell supports this. ksh93 should probably keep
it for backwards compatibility, but it should definitely be
disabled in POSIX mode as it is a violation of the standard; 'test
-t' is an instance of 'test "$string"', which tests if the string
is empty, so it should test if the string '-t' is empty (quod non).

This also replaces the fix for 'test -t 1' in a command
substitution with a better one that avoids forking (re: cafe33f0).

src/cmd/ksh93/sh/parse.c:
- qscan(): If the posix option is active, disable the parser-based
  hack that converts a simple "[ -t ]" to "[ -t 1 ]".

src/cmd/ksh93/bltins/test.c:
- e3(): If the posix option is active, disable the part of the
  compatibility hack that was used for compound expressions
  that end in '-t', e.g. "[ -t 2 -o -t ]".
- test_unop(): Remove the forking fix for "[ -t 1 ]".

src/cmd/ksh93/edit/edit.c:
- tty_check(): This function is used by "[ -t 1 ]" and in other
  contexts as well, so a fix here is more comprehensive. Forking
  here would cause a segfault, but we don't actually need to. This
  adds a fix that simply returns false if we're in a virtual
  subshell that is also a command substitution. Since command
  substitutions always fork upon redirecting standard output within
  them (making them no longer virtual), it is safe to do this.

src/cmd/ksh93/tests/bracket.sh
- Add comprehensive regression tests for test/[/[[ -t variants in
  command substitutions, in simple and compound expressions, with
  and without redirecting stdout to /dev/tty within the comsub.
- Add tests verifying that -o posix disables the old hack.
- Tweak other tests, including one that globally disabled xtrace.
2020-09-01 20:24:44 +01:00
Martijn Dekker
c607c48c84 Revert <> redir FD except in posix mode (re: eeee77ed, 60516872)
eeee77ed implemented a POSIX compliance fix that caused a potential
incompatibility with existing ksh scripts; it made the (rarely
used) read/write redirection operator, <>, default to file
descriptor 0 (standard input) as POSIX specified, instead of 1
(standard output) which is traditional ksh93 behaviour. So ksh
scripts needed to change all <> to 1<> to override the new default.

This commit reverts that change, except in the new posix mode.

src/cmd/ksh93/sh/lex.c:
- Make FD for <> default to 0 in POSIX mode, 1 otherwise.

src/cmd/ksh93/tests/io.sh:
- Revert <> regression test changes from 60516872; we no longer
  need 1<> instead of <> in ksh code.
2020-09-01 08:48:18 +01:00
Martijn Dekker
fd977388a2 -o posix: allow invoked programs to inherit FDs > 2
If there are file descriptors > 2 opened with 'exec' or 'redirect',
ksh93 has always closed them when invoking another pogram. This is
contrary to POSIX which states:
    Utilities other than the special built-ins […] shall be invoked
    in a separate environment that consists of the following. The
    initial value of these objects shall be the same as that for
    the parent shell, except as noted below.
    * Open files inherited on invocation of the shell, open files
      controlled by the exec special built-in plus any
      modifications, and additions specified by any redirections to
      the utility
    * […]
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_12

src/cmd/ksh93/sh/io.c: sh_redirect():
- When flag==2, do not close FDs > 2 if POSIX mode is active.

src/cmd/ksh93/tests/io.sh:
- Regress-test inheriting FD 7 with and without POSIX mode.

src/cmd/ksh93/sh.1:
- Update.
2020-09-01 08:11:27 +01:00
Martijn Dekker
b301d41731 -o posix: always recognise octals in "let" builtin
Though the "let" builtin is not itself a POSIX standard command, it
processes standard shell arithmetic, so it should recognise octals
by leading zeros as POSIX requires if the 'posix' option is on.
This overrides the setting of the 'letoctal' option.

Note that none of this applies to the ((...)) arithmetic command,
which has always recognised leading-octal zeros and does not listen
to 'letoctal'. So setting the posix mode makes this consistent.

src/cmd/ksh93/sh/arith.c:
- When running the 'let' builtin, test that both SH_LETOCTAL and
  SH_POSIX are off before stripping leading zeros to disable octal
  number recognition.
- Cosmetic: fix spurious newline.

src/cmd/ksh93/sh.1:
- Document the change.

src/cmd/ksh93/tests/shtests:
- Make sure to disable posix mode by default for regression tests.
2020-09-01 07:17:22 +01:00
Martijn Dekker
921bbcaeb7 Remove SHOPT_BASH; keep &> redir operator, '-o posix' option
On 16 June there was a call for volunteers to fix the bash
compatibility mode; it has never successfully compiled in 93u+.
Since no one showed up, it is now removed due to lack of interest.

A couple of things are kept, which are now globally enabled:

1. The &>file redirection shorthand (for >file 2>&1). As a matter
   of fact, ksh93 already supported this natively, but only while
   running rc/profile/login scripts, and it issued a warning. This
   makse it globally available and removes the warning, bringing
   ksh93 in line with mksh, bash and zsh.

2. The '-o posix' standard compliance option. It is now enabled on
   startup if ksh is invoked as 'sh' or if the POSIXLY_CORRECT
   variable exists in the environment. To begin with, it disables
   the aforementioned &> redirection shorthand. Further compliance
   tweaks will be added in subsequent commits. The differences will
   be fairly minimal as ksh93 is mostly compliant already.

In all changed files, code was removed that was compiled (more
precisely, failed to compile/link) if the SHOPT_BASH preprocessor
identifier was defined. Below are other changes worth mentioning:

src/cmd/ksh93/sh/bash.c,
src/cmd/ksh93/data/bash_pre_rc.sh:
- Removed.

src/cmd/ksh93/data/lexstates.c,
src/cmd/ksh93/include/shlex.h,
src/cmd/ksh93/sh/lex.c:
- Globally enable &> redirection operator if SH_POSIX not active.
- Remove warning that was issued when &> was used in rc scripts.

src/cmd/ksh93/data/options.c,
src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/args.c:
- Keep SH_POSIX option (-o posix).
- Replace SH_TYPE_BASH shell type by SH_TYPE_POSIX.

src/cmd/ksh93/sh/init.c:
- sh_type(): Return SH_TYPE_POSIX shell type if ksh was invoked
  as sh (or rsh, restricted sh).
- sh_init(): Enable posix option if the SH_TYPE_POSIX shell type
  was detected, or if the CONFORMANCE ast config variable was set
  to "standard" (which libast sets on init if POSIXLY_CORRECT
  exists in the environment).

src/cmd/ksh93/tests/options.sh,
src/cmd/ksh93/tests/io.sh:
- Replace regression tests for &> and move to io.sh. Since &> is
  now for general use, no longer test in an rc script, and don't
  check that a warning is issued.

Closes: #9
Progresses: #20
2020-09-01 06:19:19 +01:00
Martijn Dekker
9ba2c2e0df Speed up 'read', fixing macOS hang (take 2)
This fixes a hanging bug that could occur on macOS when using the
'read' command to read from a FIFO and encountering end-of-file
without a final newline character. It also makes the 'read' command
perform 15-25% faster on macOS and Linux.

The previous version (ff385e5a) failed on SunOS/Solaris/Illumos
because those systems apparently don't (fully) support the POSIX
standard recv(2) syscall with MSG_PEEK[*], which is the feature
that iffe detects under the 'socket_peek' identifier. On Illumos,
using that methods causes a compilation failure (unknown identifier
MSG_PEEK); on Solaris 11.4, that method causes multiple regressions
in tests/io.sh, suggesting the method compiles but doesn't work at
all. Instead, SunOS/Solaris/Illumos requires the method using
ioctl(2)+I_PEEK and select(2). No other system that ksh currently
builds on requires this method, so it is now only used on
SunOS/Solaris/Illumos.

So far, this version of sfpkrd() has been tested to work correctly
on Linux, macOS, FreeBSD, NetBSD, OpenBSD, HP-UX, Solaris, and
OmniOS (an Illumos distribution).

It still fails to peek on Cygwin, but in the exact same way it
failed before, so that's no loss.

To test, run the 'io' test set:  bin/shtests -p io

src/lib/libast/sfio/sfpkrd.c: sfpkrd():
- Remove long-obsolete Mac OS X and Solaris bug workarounds.
- Remove methods that are no longer needed.
     On systems with a POSIX compliant recv(2), the only thing that
  is required to avoid regressions is the code that was conditional
  upon the socket_peek feature test, which tests for the correct
  functioning of the recv(2) syscall. This has now been made
  mandatory for non-SunOS/Solaris/Illumos systems (using an #error
  directive if it is not detected), with the other methods removed.
  The result performs 15-25% faster on macOS and Linux while
  passing all the regression tests.
     On macOS, avoiding the select(2) method fixes the hanging bug.
     On SunOS/Solaris/Illumos (the '__sun' identifier), the method
  using ioctl(2)+I_PEEK and select(2) (iffe feature IDs:
  stream_peek and lib_select) is preserved.

Resolves: https://github.com/ksh93/ksh/issues/118 (again)

[*] https://pubs.opengroup.org/onlinepubs/9699919799/functions/recv.html
2020-08-19 23:54:55 +01:00
Martijn Dekker
569c1bb9c1 Revert "Speed up 'read', fixing macOS hang"
This reverts commit ff385e5a89.
It broke Solaris and illumos. More testing is needed.
2020-08-19 04:10:55 +01:00
Martijn Dekker
ff385e5a89 Speed up 'read', fixing macOS hang
This fixes a hanging bug that could occur on macOS when using the
'read' command to read from a FIFO and encountering end-of-file
without a final newline character. It also makes the 'read' command
perform 15-25% faster on macOS and Linux and maybe other systems.

src/lib/libast/sfio/sfpkrd.c: sfpkrd():
- Get rid of the optional stuff that uses the poll(2) or select(2)
  syscalls. The only thing that is required to avoid regressions is
  the code that was conditional upon the socket_peek feature test,
  which tests for the correct functioning of the recv(2) syscall.
  This has now been made mandatory. The rest now uses what was
  previously a fallback in plain C, resulting in a function that is
  not only more readable, but actually faster than the syscalls.

Resolves: https://github.com/ksh93/ksh/issues/118
2020-08-19 01:36:01 +01:00
Martijn Dekker
d03e948bcd Fix 'command -p' lookup if hash table entry exists (re: c9ccee86)
If a command's path was previously added to the hash table as a
'tracked alias', then the hash table entry was used, bypassing
the default utility path search activated by 'command -p'.

'command -p' activates a SH_DEFPATH shell state. The bug was caused
by a failure to check for this state before using the hash table.
This check needs to be added in four places.

src/cmd/ksh93/sh/path.c,
src/cmd/ksh93/sh/xec.c:
- path_search(), path_spawn(), sh_exec(), sh_ntfork(): Only consult
  the hash table, which is shp->track_tree, if the SH_DEFPATH shell
  state is not active.

src/cmd/ksh93/tests/path.sh:
- Add regress tests checking that 'command -p' and 'command -p -v'
  still search in the default path if a hash table entry exists for
  the command searched.
2020-08-17 20:23:39 +01:00
Martijn Dekker
acf84e9633 Fix 'command -x' on macOS, Linux, Solaris
'command -x' (basically builtin xargs for 'command') worked for
long argument lists on *BSD and HP-UX, but not on macOS and Linux,
where it reliably entered into an infinite loop.

The problem was that it assumed that every byte of the environment
space can be used for arguments, without accounting for alignment
that some OSs do. MacOS seems to be the most wasteful one: it
aligns on 16-byte boundaries and requires some extra bytes per
argument as well.

src/cmd/ksh93/sh/path.c:
- path_xargs(): When calculating how much space to subtract per
  argument, add 16 extra bytes to the length of each argument, then
  align the result on 16-byte boundaries. The extra 16 bytes is
  more than even macOS needs, but hopefully it is future-proof.
- path_spawn(): If path_xargs() does fail, do not enter a retry
  loop (which always becomes an infinite loop if the argument list
  exceeds OS limitations), but abort with an error message.
2020-08-16 09:31:43 +01:00
Martijn Dekker
56805b25af Fix leak and crash upon defining functions in subshells
A memory leak occurred upon leaving a virtual subshell if a
function was defined within it. If this was done more than 32766
(= 2^15-2 = the 'short' max value - 1) times, the shell crashed.
Discussion and reproducer: https://github.com/ksh93/ksh/issues/114

src/cmd/ksh93/sh/subshell.c: table_unset():
- A subshell-defined function was never freed because a broken
  check for autoloaded functions (which must not be freed[*]). It
  looked for an initial '/' in the canonical path of the script
  file that defined the function, but that path is also stored for
  regular functions. Now use a check that executes nv_search() in
  fpathdict, the same method used in _nv_unset() in name.c for a
  regular function unset.

src/cmd/ksh93/bltins/misc.c: b_dot_cmd():
- Fix an additional memory leak introduced in bd88cc7f, that caused
  POSIX functions (which are run with b_dot_cmd() like dot scripts)
  to leak extra. This fix avoids both the crash fixed there and the
  memory leak by introducing a 'tofree' variable remembering the
  filename to free. Thanks to Johnothan King for the patch.

src/lib/libast/include/stk.h,
src/lib/libast/misc/stk.c,
src/lib/libast/man/stk.3,
src/lib/libast/man/stak.3:
- Make the stack more resilient by extending the stack reference
  counter 'stkref' from (signed) short to unsigned int. On modern
  systems with 32-bit ints, this extends the maximum number of
  elements on a stack from 2^15-1==32767 to 2^32-1==4294967295.
  The ref counter can never be negative, so there is no reason for
  signedness. sizeof(int) is defined as the size of a single CPU
  word, so this should not affect performance at all.
     On a 16-bit system (not that ksh still compiles there), this
  doubles the max number of entries to 2^16-1=65535.

src/cmd/ksh93/tests/leaks.sh:
- Add leak regression tests for ksh functions, POSIX functions, dot
  scripts run with '.', and dot scripts run with 'source'.

src/cmd/ksh93/tests/path.sh:
- Add an output builtin with a redirect to an autoloaded function
  so that a crash[*] is triggered if the check for an autoloaded
  function is ever removed from table_unset(), as was done in ksh
  93v- (which crashed).

[*] Freeing autoloaded functions after leaving a virtual subshell
    causes a crashing bug: https://github.com/att/ast/issues/803

Co-authored-by: Johnothan King <johnothanking@protonmail.com>
Fixes: https://github.com/ksh93/ksh/issues/114
2020-08-14 00:25:31 +01:00
Johnothan King
05ac1dbb41
Fix crash upon running many subshells (#113)
Co-authored-by: Martijn Dekker <martijn@inlv.org>

An intermittent crash occurred after running many thousands of
virtual/non-forked subshells. One reproducer is a crash in the
shbench fibonacci.ksh test, as documented here:
https://github.com/ksh-community/shbench/blob/f3d9e134/bench/fibonacci.ksh#L4-L10

The apparent cause was the signed and insufficiently large 'short'
data type of 'curenv' and related variables which wrapped around to
a negative number when overflowing. These IDs are necessary for the
'wait' builtin to obtain the exit status from a background job.

This fix is inspired by a patch based on ksh 93v-:
https://build.opensuse.org/package/view_file/shells/ksh/ksh93-longenv.dif?expand=1
https://src.fedoraproject.org/rpms/ksh/blob/f24/f/ksh-20130628-longer.patch

However, we change the type to 'unsigned int' instead of 'long'. On
all remotely modern systems, ints are 32-bit values, and using this
type avoids a performance degradation on 32-bit sytems. Making them
unsigned prevents an overflow to negative values.

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/include/jobs.h,
src/cmd/ksh93/include/nval.h,
src/cmd/ksh93/include/shell.h:
- Change the types of the static global 'subenv' and the subshell
  structure members 'curenv', 'jobenv', 'subenv', 'p_env' and
  'subshell' to one consistent type, unsigned int.

src/cmd/ksh93/sh/jobs.c,
src/cmd/ksh93/sh/macro.c:
src/cmd/ksh93/sh/name.c:
src/cmd/ksh93/sh/nvtype.c,
src/cmd/ksh93/sh/subshell.c:
- Updates to match new variable types.

src/cmd/ksh93/tests/subshell.sh:
- Show wrong exit status in message on failure of 'wait' builtin.
2020-08-12 18:50:59 +01:00
Martijn Dekker
61437b2728 Fix crash, take three (re: e805c7d9, 33858689)
The current fix appears to be only partially successful in
eliminating the intermittent crash, and also breaks '-o notify'
during the 60-second $TMOUT grace period. This replaces it.

The root cause appears to be that the state of job control becomes
somehow inconsistent when running external commands in a command
substitution expanded from the $PS1 prompt. The job_unpost() or
(sometimes) the job_list() function intermittently crash. These are
called if the SH_TTYWAIT state is active:
https://github.com/ksh93/ksh/blob/88e8fa67/src/cmd/ksh93/sh/jobs.c#L463-L469
Temporarily deactivating the SSH_TTYWAIT state while expanding
PS{1..4} prompts appears to fix the problem reliably.

It is quite possible that this fix merely masks a bug in the job
control system, but testing has shown that it stops ksh crashing
without side effects, so I'm calling it good for now.

Thanks to Marc Wilson for many hours of persistent testing.

src/cmd/ksh93/sh/jobs.c:
- Revert changes made in 33858689 and e805c7d9.

src/cmd/ksh93/sh/io.c: io_prompt():
- Save SH_TTYWAIT state and turn it off while expanding prompts.

Resolves: https://github.com/ksh93/ksh/issues/103
Resolves: https://github.com/ksh93/ksh/issues/112
2020-08-11 01:51:31 +01:00
Martijn Dekker
8477d2ce22 printf: Fix HTML and URI encoding (%H, %#H)
This applies a number of fixes to the printf formatting directives
%H and %#H (as well as their equivalents %(html)q and %(url)q):
1. Both formatters have been made multibyte/UTF-8 aware, and no
   longer delete multibyte characters. Invalid UTF-8 byte sequences
   are rendered as ASCII question marks.
2. %H no longer wrongly encodes spaces as non-breaking spaces
   (&nbsp;) and instead correctly encodes the UTF-8 non-breaking
   space as such.
3. %H now converts the single quote (') to '%#39;' instead of
   '&apos;' which is not a valid entity in all HTML versions.
4. %#H failed to encode some reserved characters (e.g. '?') while
   encoding some unreserved ones (e.g. '~'). It now percent-encodes
   all characters except those 'unreserved' as per RFC3986 (ASCII
   alphanumeric plus -._~).

Prior discussion:
https://groups.google.com/d/msgid/korn-shell/ce8d1467-4a6d-883b-45ad-fc3c7b90e681%40inlv.org

src/cmd/ksh93/include/defs.h:
src/cmd/ksh93/sh/string.c:
- defs.h: If compiling without SHOPT_MULTIBYTE, redefine the
  mbwide() macro (which tests if we're in a multibyte locale) as 0.
  This lets the compiler optimiser do the work that would otherwise
  require a lot of tedious '#if SHOPT_MULTIBYTE' directives.
- string.c: Remove some now-unneeded '#if SHOPT_MULTIBYTE' stuff.
- defs.h, string.c: Rename is_invisible() to sh_isprint(), invert
  the boolean return value, and make it an extern for use in
  fmthtml() -- see below. If compiling without SHOPT_MULTIBYTE,
  simply #define sh_isprint() as equivalent to isprint(3).
- defs.h: Add URI_RFC3986_UNRESERVED macro for fmthtml() containing
  the characters "unreserved" for purposes of URI percent-encoding.

src/cmd/ksh93/bltins/print.c: fmthtml():
- Remove kludge that skipped all multibyte characters (!).
- Complete rewrite to implement fixes described above.
- Don't bother with '#if SHOPT_MULTIBYTE' directives (see above).

src/cmd/ksh93/data/builtins.c:
- sh_optprintf[]: %H: Add single quote to encoded chars doc.
- Edit credits and bump version date.

src/cmd/ksh93/tests/builtins.sh:
- Update and tweak old regression tests.
- Add a number of new tests for UTF-8 HTML and URI encoding, which
  are only run when running tests in a UTF-8 locale (shtests -u).
2020-08-10 22:51:55 +01:00
Martijn Dekker
5312a59d5a Skip '.' and '..' when globbing patterns like .*
There are convincing arguments why including '.' and '..' in the
result of pathname expansion is actively harmful. See:
https://www.austingroupbugs.net/view.php?id=1228
https://github.com/ksh93/ksh/issues/58#issuecomment-653716846

pdksh, mksh and zsh already skip these special traversal names
in all cases. This commit makes ksh act like these shells.

Since passing '.' and especially '..' as arguments to commands like
'chmod -R' and 'cp -r' may cause harm, this change seems likely to
fix more legacy scripts than it breaks. I'm unaware of anyone ever
having come up with a concrete use case for the old behaviour.

This change also fixes the bug that '.' and '..' failed to be
ignored as documented if FIGNORE is set.

src/lib/libast/misc/glob.c: glob_dir():
- Explicitly skip any matching '.' and '..' in all cases.

src/cmd/ksh93/tests/glob.sh:
- Add test_glob() tests for '*' and '.*'.

src/cmd/ksh93/sh.1: File Name Generation:
- Update to match new behaviour.

Resolves: https://github.com/ksh93/ksh/issues/58
2020-08-10 00:35:53 +01:00
Martijn Dekker
be5ea8bbb2 redirect: check args before executing redirections (re: 7b82c338)
The 'redirect' builtin command did not error out before executing
any valid redirections. For example, 'redirect ls >foo.txt' issued
an "incorrect syntax" error, but still created 'foo.txt' and left
standard output permanently redirected to it.

src/cmd/ksh93/sh/xec.c: sh_exec():
- If we have redirections (io != NULL), and the command is
  SYSREDIR, then check for arguments and error out if there are
  any, before calling sh_redirect() to execute redirections.
  (Note, the other check for arguments in b_exec() in bltins/misc.c
  must be kept, as that applies if there are no redirections.)

src/cmd/ksh93/sh/io.c: sh_redirect():
- Edit comments to better explain what the flag values do.

src/cmd/ksh93/bltins/misc.c:
- Add a dummy b_redirect() function declaration "for the dictionary
  generator" as has historically been done for other builtins that
  share one C function. I'm not sure what that dictionary generator
  is supposed to be, but this also improves greppability.

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/sh.1:
- Fix misleading "I/O redirection arguments" term. I/O redirections
  are not arguments at all; no argument parser ever sees them.

src/cmd/ksh93/tests/io.sh:
- Test both conditions that should make 'redirect' produce an
  "incorrect syntax" error.
- Test that any redirections are not executed if erroneous
  non-redirection arguments exist.

src/cmd/ksh93/tests/builtins.sh:
- "... should show usage info on unrecognized options" test:
  Because 'redirect' now refuses to process redirections on error,
  the error message was not captured. The fix is to run the builtin
  in a braces block and add the redirection to the block.
2020-08-09 00:47:22 +01:00
Martijn Dekker
e805c7d9b1 Fix crash: do not list job if in 60 sec grace period (re: 33858689)
The crash in job_list() or job_unpost() could still occur after the
previous patch if a signal was being handled after $TMOUT was
exceeded and the 60-second grace period was entered.

It *should* work to add a general check for !sh_isstate(SH_GRACE).
We know that the SH_GRACE state is set immediately after printing
the 60 second grace period warning message:
https://github.com/ksh93/ksh/blob/9de65210/src/cmd/ksh93/sh/io.c#L1869-L1870
(and that the crashes occur upon re-evaluating the $PS1 prompt
after setting the SH_GRACE state). We know that the SH_GRACE state
is not turned off again until either the user enters a line:
https://github.com/ksh93/ksh/blob/9de65210/src/cmd/ksh93/sh/main.c#L474
or the shell times out after the grace period:
https://github.com/ksh93/ksh/blob/9de65210/src/cmd/ksh93/sh/io.c#L1861
The SH_GRACE state flag is not used or changed in any other context
(verified with grep -rn SH_GRACE src/cmd/ksh93). So, logically,
this should suffice to make sure the crash stays gone.

src/cmd/ksh93/sh/jobs.c: job_reap():
- Do not list jobs when the SH_GRACE state (the 60 second timeout
  grace period after TMOUT was exceeded) is active.
- Keep the previous check for job control just to be sure, and
  because it makes sense.

Fixes: https://github.com/ksh93/ksh/issues/103 (again)
2020-08-07 21:09:01 +01:00
Johnothan King
9de65210c6
Add ${.sh.pid} as an alternative to $BASHPID (#109)
This variable is like Bash's $BASHPID, but in virtual subshells
it will retain its previous value as virtual subshells don't fork.
Both $BASHPID and ${.sh.pid} are different from $$ as the latter
is only set to the parent shell's process ID (i.e. it isn't set
to the process ID of the current subshell).

src/cmd/ksh93/include/defs.h:
- Add 'current_pid' for storing the current process ID at a valid
  memory address.
- Change 'ppid' from 'int32_t' to 'pid_t', as the return value from
  'getppid' is of the 'pid_t' data type.

src/cmd/ksh93/data/variables.c,
src/cmd/ksh93/include/variables.h,
src/cmd/ksh93/sh/init.c,
src/cmd/ksh93/sh/xec.c:
 - Add the ${.sh.pid} variable as an alternative to $BASHPID.
   The process ID is stored in a struct before ${.sh.pid} is set
   as environment variables are pointers that must point to a
   valid memory address. ${.sh.pid} is updated by the _sh_fork()
   function, which is called when ksh forks a new process with
   sh_fork() or sh_ntfork().

src/cmd/ksh93/tests/variables.sh:
- Add ${.sh.pid} to the list of special variables and add three
  regression tests for ${.sh.pid}.

src/cmd/ksh93/tests/subshell.sh:
- Update the PATH forking regression test to use ${.sh.pid} and
  remove the TODO note.
2020-08-07 02:53:25 +01:00
Johnothan King
f9fdbfc9e9
Fix a large number of typos and other problems (#110)
Most of these fixes are for typos and extra whitespace at the
end of lines. These are the notable changes:
- Fixed a compatibility issue with how asterisks are displayed
  using certain fonts. Bug report: https://github.com/att/ast/issues/764
- Fixed a bug in the man page that caused searches for the '|'
  character to fail. Bug report: https://github.com/att/ast/issues/871
- Removed a duplicate description of 'set -B' from the man
  page. Bug report: https://github.com/att/ast/issues/789
- Added documentation for options missing from the ksh man
  page (applies to 'hist -N', 'sleep -s', 'whence -q' and
  many of ulimit's options). Bug reports:
  https://github.com/att/ast/issues/948
  https://github.com/att/ast/issues/503#issuecomment-386649715
  https://github.com/att/ast/issues/507#issuecomment-507924608
- Applied the following ksh2020 documentation fixes:
  https://github.com/att/ast/pull/351
  https://github.com/att/ast/pull/352
- Fixed a minor GCC -Wformat warning in procopen.c by changing
  a sentinel to NULL.
2020-08-07 00:50:11 +01:00
Martijn Dekker
338586896d Fix crash: do not list jobs if there is no job control
This bug caused an undefined state, which sometimes crashed the
shell in job_list() or job_unpost(), if $PS1 contains a command
substitution running an external command and the '-b'/'-o notify'
shell option is active. So far the only known way to trigger the
crash is by letting $TMOUT time out the interactive shell. See
https://github.com/ksh93/ksh/issues/103 for details.

src/cmd/ksh93/sh/jobs.c: job_reap():
- The check for the SH_NOTIFY option and the SH_TTYWAIT state
  before listing jobs was insufficient. Job control is disabled in
  command substitutions, so also check that job control is active
  before listing jobs.

src/cmd/ksh93/sh.1:
- Fix TMOUT documentation. The 'read' command in fact only times
  out when reading from a terminal, just like 'select'. Also
  document the extra 60 second grace period when an interactive
  shell prompt reads from a terminal.

Fixes: https://github.com/ksh93/ksh/issues/103
2020-08-06 22:46:02 +01:00
Martijn Dekker
ac8991e525 Fix shellquoting of invalid multibyte char (re: f9d28935, 8c7c60ec)
This commit fixes two bugs in the generation of $'...' shellquoted
strings:
1. A bug introduced in f9d28935. In UTF-8 locales, a byte that is
   invalid in UTF-8, e.g. hex byte 86, would be shellquoted as
   \u[86], which is not the same as the correct quoting, \x86.
2. A bug inherited from 93u+. Single bytes (e.g. hex 11) were
   always quoted as \x11 and not \x[11], even if a subsequent
   character was a hexadecimal digit. However, the parser reads
   past two hexadecimal digits, so we got:
	$ printf '%q\n' $'\x[11]1'
	$'\x111'
	$ printf $'\x111' | od -t x1
	0000000    c4  91
	0000002
   After the bug fix, this works correctly:
	$ printf '%q\n' $'\x[11]1'
	$'\x[11]1'
	$ printf $'\x[11]1' | od -t x1
	0000000    11  31
	0000002

src/cmd/ksh93/sh/string.c: sh_fmtq():
- Make the multibyte code for $'...' more readable, eliminating the
  'isbyte' flag.
- When in a multibyte locale, make sure to shellquote both invalid
  multibyte characters and unprintable ASCII characters as
  hexadecimal bytes (\xNN). This reinstates 93u+ behaviour.
- When quoting bytes, use isxdigit(3) to determine if the next
  character is a hex digit, and if so, protect the quoted byte with
  square brackets.

src/cmd/ksh93/tests/quoting2.sh:
- Move the 'printf %q' shellquoting regression tests here from
  builtins.sh; they test the shellquoting algorithm, not so much
  the printf builtin itself.
- Add regression tests for these bugs.
2020-08-05 18:22:22 +01:00
Johnothan King
e53177abca
Fix unset method in multidimensional arrays (#105)
A segfault happens when an array with an unset method
is turned into a multidimensional array. Reproducer:
function foo {
    typeset -a a
    a.unset() {
        print unset
    }
    a[3][6][11][20]=7
}
foo

src/cmd/ksh93/sh/nvdisc:
- Fix the multidimensional array unset method crash by
  checking if np->nvenv is an array, since multidimensional
  arrays need to be handled as arrays. This bugfix was
  backported from ksh93v- 2013-10-10-alpha.

src/cmd/ksh93/tests/arrays2.sh:
- Add the reproducer as a regression test for the crash
  with multidimensional arrays.

Bug report on the old mailing list:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01195.html
2020-08-05 18:14:30 +01:00
Johnothan King
23f2e23385
Over-shifting in a POSIX function should cause scripts to exit (#106)
The required longjmp used to terminate scripts was not being run
when over-shifting in a POSIX function with a redirection. This
caused scripts to continue after an error in the shift builtin,
which is incorrect since shift is a special builtin. The
interpreter is sent into an indeterminate state that causes
undefined behavior as well:
$ cat reproducer.ksh
some_func() {
   shift 10
}

for i in a b c d e f; do
  echo "read $i"
  [ "$i" != "c" ] && continue
  some_func 2>&1
  echo "$i = c"
done
$ ksh ./reproducer.ksh
read a
read b
read c
/tmp/k[2]: shift: 10: bad number
c = c
read d
/tmp/k[2]: shift: 10: bad number
d = c
read e
/tmp/k[2]: shift: 10: bad number
e = c
read f
/tmp/k[2]: shift: 10: bad number
f = c

src/cmd/ksh93/sh/xec.c: sh_exec():
- Do the necessary longjmp needed to terminate the script after
  over-shifting in a POSIX function when the function call has a
  redirection.

src/cmd/ksh93/tests/functions.sh:
- Add the over-shifting regression test from ksh93v- 2013-10-10-alpha.

Bug report and fix on the old mailing list:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg00732.html
2020-08-05 18:06:16 +01:00
Marc Wilson
4144f404ae
Fix expansion of multibyte character after $1 - $9, $?, etc (#102)
A multibyte character immediately following an expansion of a
single-character name, e.g. $1 through $9, $?, $-, etc. was
corrupted when in a UTF-8 locale, e.g.:

    $ set -- foo; echo "$1テスト"
    foo?スト

Prior discussion:
https://www.mail-archive.com/ast-users@lists.research.att.com/msg01060.html
https://bugzilla.redhat.com/show_bug.cgi?id=1256495

src/cmd/ksh93/sh/macro.c:
- Apply a Red Hat patch by Paulo Andrade that avoids calling
  fcmbget() if backtracking more than one byte might be required.

src/cmd/ksh93/tests/basic.c:
- Test "テスト" following expansion of "$1", "$?" and "$#".

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2020-08-01 01:12:45 +01:00
Johnothan King
02a14ff9b7
Fix creation of extra associative array element '0' (#101)
Multidimensional associative arrays are created with an extra array
member named '0', which is set to no value. Reproducer:

$ typeset -A foo
$ typeset -A foo[bar]
$ typeset -p foo
typeset -A foo=([bar]=([0]='') )

The bugfix prevents nv_setarray from creating the extra '[0]' member
when an associative array is empty. This bug was discussed on the old
mailing list:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01574.html

src/cmd/ksh93/sh/array.c:
- Do not allow the creation of an extra array member when an array
  is empty.

src/cmd/ksh93/tests/arrays.sh:
- Add a regression test for creating multidimensional associative
  arrays, but use the output from 'typeset -p' instead of fgrep.
2020-07-31 17:32:09 +01:00
Martijn Dekker
70f6d758c0 Fix blocked signals after fork(2)ing external command in subshell
When the classic fork/exec mechanism was used (via sh_fork()) to
run an external command from within a non-forking subshell, SIGINT
was blocked until that subshell was exited. If a subsequent loop
was run in the subshell, it became uninterruptible, e.g.:

   $ arch/*/bin/ksh -c '(/usr/bin/true; while :; do :; done); exit'
   ^C^C^C^C^C

src/cmd/ksh93/sh/xec.c:
- sh_fork() did not reset the savesig variable in the parent part
  of the fork when running in a virtual subshell. This had the
  effect of delaying signal handling until exiting the subshell.
  There is no reason for that subshell check that I can discern, so
  this removes it.
      I've verified that this causes no regression test failures
  even when ksh is compiled with -DSHOPT_SPAWN=0 which means the
  classic fork/exec mechanism is always used.

Fixes: https://github.com/ksh93/ksh/issues/86
2020-07-30 01:46:00 +01:00
Martijn Dekker
a2f13c19f2 Fix typeset attributes -a, -A, -l, -u leaking out of subshells
If an array or upper/lowercase variable was declared with a null
initial value within a virtual/non-forked subshell, like:
	( typeset -a foo; ... )
	( typeset -A foo; ... )
	( typeset -l foo; ... )
	( typeset -u foo; ... )
then the type declaration leaked out of the subshell into the
parent shell environment, though without any values that may
subsequently have been assigned.

src/cmd/ksh93/bltins/typeset.c: setall():
- When deciding whether to create a virtual subshell scope for a
  variable, use sh_assignok(), which was actually designed for the
  purpose, instead of _nv_unset(). This allows getting rid of a
  tangled mess of special-casing that never worked quite right.

src/cmd/ksh93/tests/arrays.sh:
- Add regression tests checking that array declarations don't leak
  out of virtual subshells.

src/cmd/ksh93/tests/attributes.sh:
- Add regression tests for combining the 'export' and 'readonly'
  attributes with every other possible typeset attribute on unset
  variables. This also includes a subshell leak test for each one.

Fixes: https://github.com/ksh93/ksh/issues/88
2020-07-26 02:41:12 +01:00
Johnothan King
1bc2c74c74
Fix how unrecognized options are handled in 'sleep' and 'suspend' (#93)
When a builtin is given an unrecognized option, the usage information
for that builtin should be shown as 'Usage: builtin-name options'. The
sleep and suspend builtins were an exception to this. 'suspend' would
not show usage information and sleep wouldn't exit on error:

$ suspend -e
/usr/bin/ksh: suspend: -e: unknown option
$ time sleep -e 1
sleep: -e: unknown option

real	0m1.00s
user	0m0.00s
sys	0m0.00s

src/cmd/ksh93/bltins/sleep.c:
- Show usage information and exit when sleep is given an unknown
  option. This bugfix was backported from ksh2020: https://github.com/att/ast/pull/1024

src/cmd/ksh93/bltins/trap.c:
- Use the normal method of parsing options with optget to fix the
  suspend builtin's test failure.

src/cmd/ksh93/tests/builtins.sh:
- Add the ksh2020 regression test for getting the usage information
  of each builtin. Enable all /opt/ast/bin builtins in a subshell
  since those should be tested as well (aside from getconf and uname
  because those builtins fallback to the real commands on error).
2020-07-26 02:18:49 +01:00
Johnothan King
8b5f11dcd7
Add support for multibyte characters to $IFS (#92)
Add support for multibyte characters to $IFS

This commit fixes BUG_MULTIBIFS, which had two bug reports in the ksh2020 branch.

src/cmd/ksh93/sh/macro.c:
- Backport Eric Scrivner's fix for multibyte IFS characters (slightly modified
  for compatibility with C89). Explanation from https://github.com/att/ast/pull/737:

  Previously, the varsub method used for the macro expansion of $param, ${param},
  and ${param op word} would incorrectly expand the internal field separator (IFS)
  if it was a multibyte character. This was due to truncation based on the
  incorrect assumption that the IFS would never be larger than a single byte.

  This change fixes this issue by carefully tracking the number of bytes that
  should be persisted in the IFS case and ensuring that all bytes are written
  during expansion and substitution.

  Bug report: https://github.com/att/ast/issues/13

- Fixed another bug that caused multibyte characters with the same initial byte
  to be treated as the same character by the IFS. This bug was occurring because
  the first byte of a multibyte character wasn't being written to the stack when
  the IFS delimiter had the same initial byte:

  $ IFS=£
  $ v='§'
  $ set -- $v
  $ v="${1-}"
  $ echo "$v" | hd # The first byte should be c2, but it isn't due to the bug
  00000000  a7 0a                                             |..|
  00000002

  Bug report: https://github.com/att/ast/issues/1372

src/cmd/ksh93/tests/variables.sh:
- Add (reworked) regression tests from ksh2020 for the multibyte IFS bugs.
- Add a regression test for att/ast#1372 based on the reproducer.
2020-07-25 19:46:11 +01:00
Johnothan King
8c16f38a88
Fix an infinite loop related to $_ if ksh is /bin/sh (#90)
The following explanation is mostly taken from Tomas Klacko's report on
the old mailing list (which also contains a C program reproducer) [*]:

1. When ksh starts a binary, it sets its environment variable "_"
   to "*number*/path/to/binary". Where "number" is the pid of the
   ksh process.

2. The binary forks and the child executes a suid root shell script
   which begins with #!/bin/sh. For this bug to occur, ksh must be /bin/sh.

3. The ksh process interpreting the suid shell script leaves the "_"
   variable as not set (nv_getval(L_ARGNOD) returns NULL) because
   the "number" from step 1 is not the pid of its parent process.

4-5. Because "_" is not set and the script is suid root, an infinite
   loop occurs because when the SHELL environment variable contains
   "/bin/sh" pathshell() returns "/bin/sh". This becomes an infinite
   loop of /bin/sh /dev/fd/3 executing /bin/sh /dev/fd/3.

src/cmd/ksh93/sh/init.c: get_lastarg():
- Disable the check for if the "number" refers to the process id of
  the parent process.

src/cmd/ksh93/sh/main.c: sh_main():
- Prevent an infinite loop when '$_' is not passed in from the environment.

Solaris applies this bugfix to their version of ksh:
https://github.com/oracle/solaris-userland/blob/master/components/ksh93/patches/190-17432413.patch

[*]: https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01680.html
2020-07-24 01:20:26 +01:00
Johnothan King
6e515f1d45
Fix command substitutions run on the same line as a here-doc (#91)
When a command substitution is run on the same line as a here-document,
a syntax error occurs due to a regression introduced in ksh93u+ 2011-04-15:

true << EOF; true $(true)
EOF
syntax error at line 1: `<<EOF' here-document not contained within command substitution

The regression is caused by an error check that was added to make
the following script causes a syntax error (because the here-document
isn't completed inside of the command substitution):

$(true << EOF)
EOF

src/cmd/ksh93/sh/lex.c:
- Only throw an error when a here-document in a command substitution
  isn't completed inside of the command substitution.

src/cmd/ksh93/tests/heredoc.sh:
- Add a regression test for running a command substitution on the
  same line as a here-document.
- Add a missed regression test for using here-documents in command
  substitutions. This is the original bug that was fixed in ksh93u+
  2011-04-15 (it is why the error message was added), but a regression
  test for here-documents in command substitutions wasn't added in
  that version.

This bugfix was backported from ksh93v- 2013-10-10-alpha.
2020-07-24 00:03:57 +01:00
Martijn Dekker
f207cd5787 Fix race conditions running external commands with job control on
When ksh is compiled with SHOPT_SPAWN (the default), which uses
posix_spawn(3) or vfork(2) (via sh_ntfork()) to launch external
commands, at least two race conditions occur when launching
external commands while job control is active. See:
https://bugs.launchpad.net/ubuntu/+source/ksh/+bug/1887863/comments/3
https://www.mail-archive.com/ast-developers@research.att.com/msg00717.html

The basic issue is that this performance optimisation is
incompatible with job control, because it uses a spawning mechanism
that doesn't copy the parent process' memory pages into the child
process, therefore no state that involves memory can be set before
exec-ing the external program. This makes it impossible to
correctly set the terminal's process group ID in the child process,
something that is essential for job control to work.

src/cmd/ksh93/sh/xec.c:
- Use sh_fork() instead of sh_ntfork() if job control is active.
  This uses fork(2), which is 30%-ish slower on most sytems, but
  allows for correctly setting the terminal process group.

src/cmd/ksh93/tests/basic.sh:
- Add regression test for the race condition reported in #79.

src/cmd/INIT/cc.darwin:
- Remove hardcoded flag to disable SHOPT_SPAWN on the Mac.
  It should be safe to use now.

Fixes https://github.com/ksh93/ksh/issues/79
2020-07-22 13:45:33 +01:00
Martijn Dekker
db72f41f4b Fix subshell file descriptor leak
A file descriptor (at least 3, can't reproduce for 4 and up) opened
with 'exec' or 'redirect' in a virtual/non-forked subshell survived
that subshell after exiting it:

    $ ksh -c '(redirect 3>&1); echo bug >&3'
    bug

src/cmd/ksh93/sh/io.c:
- Apply a patch from OpenSUSE (ksh93-redirectleak.dif). Source:
  https://build.opensuse.org/package/show/openSUSE:Leap:42.3:Update/ksh

src/cmd/ksh93/tests/io.sh:
- Add regression test.

Thanks to Marc Wilson for flagging this up.
2020-07-21 04:12:40 +01:00
Martijn Dekker
bc8b36faba whence -a/type -a: report both function and built-in by same name
'whence -a' is documented to list all possible interpretations of a
command, but failed to list a built-in command if a shell function
by the same name exists or is marked undefined using 'autoload'.

src/cmd/ksh93/bltins/whence.c: whence():
- Refactor and separate the code for reporting functions and
  built-in commands so that both can be reported for one name.

src/cmd/ksh93/data/builtins.c: sh_optwhence[]:
- Correct 'whence --man' to document that:
  * 'type' is equivalent to 'whence -v'
  * '-a' output is like '-v'

src/cmd/ksh93/tests/builtins.sh:
- Test 'whence -a' with these combinations:
  * a function, built-in and external command
  * an undefined/autoload function, built-in and external command

Fixes https://github.com/ksh93/ksh/issues/83
2020-07-20 21:16:24 +01:00
Johnothan King
bd88cc7f4f
Fix two crashes related to kshdb (#82)
This commit fixes two different crashes related to kshdb:
- When redirect is given an invalid file descriptor, a segfault
  no longer occurs. Reproducer:
  $ ksh -c 'redirect 9>&200000000000'

- Fix a crash due to free(3) being used on an invalid pointer.
  This can be reproduced with kshdb (commands from att/ast#582):
  $ git clone https://github.com/rocky/kshdb.git
  $ cd kshdb
  $ ksh autogen.sh
  $ echo "print hi there" > $HOME/.kshdbrc
  $ ./kshdb -L . test/example/dbg-test1.sh

src/cmd/ksh93/bltins/misc.c: b_dot_cmd():
- The string pointed to by shp->st.filename must be able to be
  freed from memory with free(3), so duplicate the string with
  strdup(3).

src/cmd/ksh93/sh/io.c: sh_redirect():
- Show an error message when a file descriptor is invalid to
  fix a memory fault.
2020-07-19 23:42:12 +01:00
Johnothan King
2db9953ae0
Fix three bugs in the sleep builtin (#77)
This commit backports the main changes to sh_delay from ksh93v-
and ksh2020, which fixes the following bugs:

- Microsecond amounts of less than one millisecond are no longer
  ignored. The following loop will now take a minimum of one
  second to complete:
  for ((i = 0; i != 10000; i++)) do
    sleep PT100U
  done

- 'sleep 30' no longer adds an extra 30 milliseconds to the total
  amount of time to sleep. This bug is hard to notice since 30
  milliseconds can be considered within the margin of error. The
  only reason why longer delays weren't affected is because the old
  code masked the bug when the interval is greater than 30 seconds:
  else if(n > 30)
  {
      sleep(n);
      t -= n;
  }
  This caused 'sleep -s' to break with intervals greater than 30
  seconds, so an actual fix is used instead of a workaround.

- 'sleep -s' now functions correctly with intervals of more than
  30 seconds as the new code doesn't need the old workaround. This
  is done by handling '-s' in sh_delay.

src/cmd/ksh93/bltins/sleep.c:
- Remove the replacement for sleep(3) from the sleep builtin.
- Replace the old sh_delay function with the newer one from ksh2020.
  The new function uses tvsleep, which uses nanosleep(3) internally.

src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/edit/edit.c,
src/cmd/ksh93/sh/jobs.c,
src/cmd/ksh93/sh/xec.c,
src/cmd/ksh93/shell.3:
- Update sh_delay documentation and usage since the function now
  requires two arguments.

src/cmd/ksh93/tests/builtins.sh:
- Add a regression test for 'sleep -s' when the interval is greater
  than 30 seconds. The other bugs can't be tested for in a feasible
  manner across all systems:
  https://github.com/ksh93/ksh/pull/72#issuecomment-657215616
2020-07-17 05:00:28 +01:00
Johnothan King
ea5b25b93a
Fix some formatting errors, typos and other problems (#78)
Some notes:
- Removed a TODO note that was fixed in commit 43d9fbac.
- Removed a duplicate note about the '%l' time format in the changelog.
- Applied the following documentation fixes from Terrence J. Doyle:
  - https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01852.html
  - https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01856.html
- Fixed strange grammar in one of the error messages.
- Added missing options for rksh to the synopsis section.
- Applied a formatting fix from ksh93v- to the man page.
- Replaced a C99 line comment in src/lib/libast/comp/realpath.c with a
  proper comment that is valid in C89.
- Prioritize UTC over GMT in the documentation (missed by commit c9634e90).
- Add some extra information for 'ksh -R file' to the man page. This patch
  is from Red Hat: https://git.centos.org/rpms/ksh/blob/c8/f/SOURCES/ksh-20080202-manfix.patch
2020-07-16 22:27:00 +01:00
Johnothan King
03224ae3af
Make the 'history' and 'r' commands builtins (#76)
With this change no more preset aliases exist, so the preset alias
tables can be safely removed. All ksh commands can now be used
without 'unalias -a' removing them, even in interactive shells.
Additionally, the history and r commands are no longer limited to
being used in interactive shells.

src/cmd/ksh93/bltins/hist.c:
- Implement the history and r commands as builtins. Also guarantee
  lflag is set to one by avoiding 'lflag++'.

src/cmd/ksh93/Makefile,
src/cmd/ksh93/Mamfile,
src/cmd/ksh93/sh/main.c,
src/cmd/ksh93/sh/init.c,
src/cmd/ksh93/data/aliases.c:
- Remove the table of predefined aliases because the last few have
  been removed. During init the alias tree is now initialized the
  same way as the function tree.

src/cmd/ksh93/bltins/typeset.c:
- Remove the bugfix for unsetting predefined aliases because it is
  now a no-op. Aliases are no longer able to have the NV_NOFREE
  attribute.

src/cmd/ksh93/tests/alias.sh:
- Remove the regression test for unsetting predefined aliases since
  those no longer exist.

src/cmd/ksh93/data/builtins.c:
- Update sh_opthist[] for 'hist --man', etc.

src/cmd/ksh93/sh.1:
- Remove the list of preset aliases since those no longer exist.
- Document history and r as builtins instead of preset aliases.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2020-07-16 18:56:49 +01:00
Martijn Dekker
17f81ebedb Load 'r' and 'history' default aliases on interactive only
These two default aliases are useful on interactive shells. In
scripts, they interfere with possible function or command names.

As of this commit, these final two default aliases are only loaded
for interactive shells, leaving zero default aliases for scripts.
This completes the project to get rid of misguided default aliases.

src/cmd/ksh93/include/shtable.h,
src/cmd/ksh93/data/aliases.c:
src/cmd/ksh93/sh/init.c:
- Add empty alias table shtab_noaliases[] for scripts.
- Rename inittree() to sh_inittree() and make it external.
- nv_init(), sh_reinit(): Initialise empty alias tree for scripts.

src/cmd/ksh93/sh/main.c: sh_main():
- If interactive, reinitialise alias tree for interactive shells.

src/cmd/ksh93/tests/alias.sh:
- To test default alias removal, launch shell with -i.
2020-07-16 06:44:05 +01:00
Johnothan King
01145a48dd
Handle the escape sequence for the End key (#75)
Many terminals (xterm being one example) give the Home and End keys
the escape sequences '^[[H' and '^[[F'. The first sequence is
handled in both editing modes by moving the cursor to start of
line, but ksh ignored the second sequence.

src/cmd/ksh93/edit/emacs.c,
src/cmd/ksh93/edit/vi.c:
- Add case labels for '^[[F' so that in both editing modes the End
  key moves the cursor to the end of the line.
2020-07-15 23:38:44 +01:00
Martijn Dekker
1fbbeaa19d Convert default typeset aliases to regular builtins
This converts the 'autoload', 'compound', 'float', 'functions',
'integer' and 'nameref' default aliases into regular built-in
commands, so that 'unalias -a' does not remove them. Shell
functions can now use these names, which improves compatibility
with POSIX shell scripts.

src/cmd/ksh93/data/aliases.c:
- Remove default typeset aliases.

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/include/builtins.h:
- Add corresponding built-in command declarations. Typeset-style
  commands are now defined by a pointer range, SYSTYPESET ..
  SYSTYPESET_END. A couple need their own IDs (SYSCOMPOUND,
  SYSNAMEREF) for special-casing in sh/xec.c.
- Update 'typeset --man'.

src/cmd/ksh93/bltins/typeset.c: b_typeset():
- Recognise the new builtin commands by argv[0]. Implement them by
  inserting the corresponding 'typeset' options into the argument
  list before parsing options. This may seem like a bit of a hack,
  but it is simpler, shorter, more future-proof and less
  error-prone than manually copying and adapting all the complex
  flaggery from the option parsing loop.

src/cmd/ksh93/sh/parse.c,
src/cmd/ksh93/sh/xec.c:
- Recognise typeset-style commands by SYSTYPESET .. SYSTYPESET_END
  pointer range.
- Special-case 'compound' (SYSCOMPOUND) and 'nameref' (SYSNAMEREF)
  along with recognising the corresponding 'typeset' options.

src/cmd/ksh93/sh.1:
- Update to document the new built-ins.
- Since not all declaration commands are special built-ins now,
  identify declaration commands using a double-dagger "\(dd"
  character (which renders as '=' in ASCII) and disassociate their
  definition from that of special built-ins.

src/cmd/ksh93/tests/variables.sh:
- Adapt a regression test as there is no more 'integer' alias.
2020-07-15 20:54:06 +01:00
Martijn Dekker
b1a4131123 Millisecond precision for 'times' builtin (re: 65d363fd, 5c677a4c)
Now that we have an iffe feature test for getrusage(3), introduced
in 70fc1da7, the millisecond-precision 'times' command from the
last version of ksh2020 can easily be backported.

src/cmd/ksh93/bltins/misc.c:
- Incorporate ksh2020 'times' command, with a couple of tweaks:
  * Use locale's radix point instead of '.'.
  * Pad seconds with initial zero if < 10.

src/cmd/ksh93/data/builtins.c:
- Update version date for 'times --man'.

src/cmd/ksh93/tests/builtins.sh:
- Update 'times' test for 3 digits after radix point.
2020-07-15 04:22:45 +01:00
Johnothan King
70fc1da73e
Fix the max precision of the 'time' keyword (#72)
This commit backports the required fixes from ksh2020 for using
millisecond precision with the 'time' keyword. The bugfix refactors
a decent amount of code to rely on the BSD 'timeradd' and
'timersub' macros for calculating the total amount of time elapsed
(as these aren't standard, they are selectively implemented in an
iffe feature test for platforms without them). getrusage(3) is now
preferred since it usually has higher precision than times(3) (the
latter is used as a fallback).

There are three other fixes as well:

src/lib/libast/features/time:
- Test for getrusage with an iffe feature test rather than
  assume _sys_times == _lib_getrusage.

src/cmd/ksh93/sh/xec.c:
- A single percent at the end of a format specifier is now
  treated as a literal '%' (like in Bash).
- Zero-pad seconds if seconds < 10. This was already done for
  the times builtin in commit 5c677a4c, although it wasn't
  applied to the time keyword.
- Backport the ksh2020 bugfix for the time keyword by using
  timeradd and timersub with gettimeofday (which is used with
  a timeofday macro). Prefer getrusage when it is available.
- Allow compiling without the 'timeofday' ifdef for better
  portability.
  This is the order of priority for getting the elapsed time:
  1) getrusage (most precise)
  2) times + gettimeofday (best fallback)
  3) only times (doesn't support millisecond precision)
  This was tested by using debug '#undef' statements in xec.c.

src/cmd/ksh93/features/time:
- Implement feature tests for the 'timeradd' and 'timersub'
  macros.
- Do a feature test for getrusage like in the libast time test.

src/cmd/ksh93/tests/basic.sh:
- Add test for millisecond precision.
- Add test for handling of '%' at the end of a format specifier.
- Add test for locale-specific radix point.
2020-07-14 22:48:04 +01:00
Johnothan King
fc655f1a26
Restore 'set -b'/'set -o notify' functionality (#74)
'set -b' had no effect; it should cause the shell to notify job
state changes immediately instead of waiting for the next prompt.

This fixes a regression that was introduced in ksh93t 2008-07-25.
The bugfix is from: https://github.com/att/ast/pull/1089

src/cmd/ksh93/sh/jobs.c:
- Save the tty wait state and avoid changing it if TTYWAIT was
  already on to avoid breaking 'set -b'.
  The last 'sh_offstate' is inside of an '#if' directive because it
  is only required when ksh is compiled with SHOPT_COSHELL enabled.

src/cmd/ksh93/tests/pty.sh:
- Add a regression test for 'set -b' in interactive shells.
2020-07-14 22:00:28 +01:00
Johnothan King
66c955bc8f
Fix a fork bomb when vi is run from a script and sent Ctrl-Z (#73)
This bug was reported on the old mailing list:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg00207.html

A fork bomb can occur when SIGTSTP is sent to the vi editor. Vi
must be launched from a script run with exec (tested with
BusyBox vi, nvi and vim):
$ cat /tmp/foo
vi /tmp/bar
echo end
$ ksh
$ chmod +x /tmp/foo
$ exec /tmp/foo
While in vi, send SIGTSTP using Ctrl-Z

src/cmd/ksh93/sh/fault.c:
- Only fork after Ctrl-Z if job control is available. The patch
  used checks 'job.jobcontrol' instead of 'SH_MONITOR':
  https://git.centos.org/rpms/ksh/blob/c8/f/SOURCES/ksh-20120801-forkbomb.patch
2020-07-13 19:10:23 +01:00
Martijn Dekker
778fd6ca2d Fix possible crash due to failure to update shell FD state
This applies ksh-20100621-fdstatus.patch from Red Hat. Not very
much information is available, so this one is more or less taken
on faith. But it seems to make sense on the face of it: calling
sh_fcntl() instead of fcntl(2) directly makes the shell update its
internal file descriptor state more frequently.

It claims to fix Red Hat bug 924440. The report is currently closed
to the public: https://bugzilla.redhat.com/show_bug.cgi?id=924440

However, Kamil Dudka at Red Hat writes:
https://github.com/ksh93/ksh/issues/67#issuecomment-656379993
| Yes, the summary of RHBZ#924440 is "crash in bestreclaim() after
| traversing a memory block with a very large size". We did not have
| any in house reproducer for the bug. The mentioned patch was
| provided and verified by a customer.

...and Marc Wilson dug up a Red Hat erratum containing this info:
https://download.rhn.redhat.com/errata/RHBA-2013-1599.html
| Previously, the ksh shell did not resize the file descriptor list
| every time it was necessary. This could lead to memory corruption
| when several file descriptors were used. As a consequence, ksh
| terminated unexpectedly. This updated version resizes the file
| descriptor list every time it is needed, and ksh no longer
| crashes in the described scenario. (BZ#924440)

No reproducer means no regression test can be added now.

src/cmd/ksh93/sh/io.c,
src/cmd/ksh93/sh/subshell.c,
src/cmd/ksh93/sh/xec.c:
- Change several fcntl(2) calls to sh_fcntl(). This function calls
  fcntl(2) and then updates the shell's file descriptor state.
2020-07-10 20:04:31 +01:00
Johnothan King
c4236cc295 Fix type names starting with lowercase 'a' (#69)
Type names that start with a lowercase 'a' cause an error when used:

$ typeset -T al=(typeset bar)
$ al foo=(bar=testset)
/usr/bin/ksh: al: : invalid variable name

The error occurs because when the parser checks for the alias
builtin (to set 'assignment' to two instead of one), only the first
letter of 'argp->argval' is checked (rather than the entire
string). This was fixed in ksh93v- by comparing argp->argval
against "alias", but in ksh93u+m the check can simply be removed
because it is only run when a builtin has the BLT_DCL flag. As of
04b9171, the alias builtin does not have that flag.

src/cmd/ksh93/sh/parse.c:
- Remove the bugged check for the alias builtin.

src/cmd/ksh93/tests/types.sh:
- Add a regression test for type names starting with a lowercase 'a'.
2020-07-10 17:54:51 +01:00
Martijn Dekker
f9d28935bb Fix UTF-8 shellquoting for xtrace, printf %q, etc.
This fixes an annoying issue in the shell's quoting algorithm
(used for xtrace (set -x), printf %q, and other things) for UTF-8
locales, that caused it to encode perfectly printable UTF-8
characters unnecessarily and inconsistently. For example:

$ (set -x; : 'aeu aéu')
+ : $'aeu a\u[e9]u'
$ (set -x; : 'aéu aeu')
+ : 'aéu aeu'
$ (set -x; : '正常終了 aeu')
+ : '正常終了 aeu'
$ (set -x; : 'aeu 正常終了')
+ : $'aeu \u[6b63]\u[5e38]\u[7d42]\u[4e86]'

This issue was originally reported by lijo george in May 2017:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01958.html

src/cmd/ksh93/sh/string.c:
- Add is_invisible() function that returns true if a character is a
  Unicode invisible (non-graph) character, excluding ASCII space.
  Ref.: https://unicode.org/charts/PDF/U2000.pdf
- Use a fallback in is_invisible() if we cannot use the system's
  iswprint(3); this is the case for the ksh C.UTF-8 locale if the
  OS doesn't support that. Fall back to a hardcoded blacklist of
  invisible and control characters and put up with not encoding
  nonexistent characters into \u[xxxx] escapes.
  Ref.: https://unicode.org/charts/PDF/U2000.pdf
- When deciding whether to switch to $'...' quoting mode (state=2),
  use is_invisible() instead of testing for ASCII 0-127 range.
- In $'...' quoting mode, use is_invisible() to decide whether to
  encode wide characters into \u[xxxx] escapes.

src/cmd/ksh93/tests/builtins.sh:
- Add regression tests for shellquoting Arabic, Japanese and Latin
  UTF-8 characters, to be run only in a UTF-8 locale. The Arabic
  sample text[*] contains a couple of direction markers that are
  expected to be encoded into \u[xxxx] escapes.

[*] source: https://r12a.github.io/scripts/tutorial/summaries/arabic
2020-07-10 05:55:11 +01:00
Martijn Dekker
588a1ff7ca Fix spurious warning output in KIA (-R) database file
The ksh -R option creates a cross-reference database that can be
parsed with a "C Query Language" (CQL) tool.
See cql-1994.pdf at: http://gsf.cococlyde.org/files

The -R option puts ksh in noexec mode as it parses the script, and
this can produce warnings as the syntax is parsed. The bug is that
these warnings can end up in the database file, corrupting it.

This applies a fix from Paulo Andrade, via Siteshwar Vashisht:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01952.html

src/cmd/ksh93/sh/parse.c:
- Terminate names with a zero character when writing database
  output.

A regression test is not very feasible because the majority of the
database output consists of cryptic IDs/hashes that vary depending
on the session and/or system and possibly other things.
2020-07-09 23:18:41 +01:00
Johnothan King
6930666234
Fix a syntax error when ((...)) is combined with redirections (#68)
This bugfix was backported from ksh93v- 2013-10-10-alpha.

src/cmd/ksh93/sh/parse: item():
- The done label is placed after the 'inout' call for handling I/O
  redirections. This causes the command below to produce a syntax
  error because the '>' is not handled as a redirection operator
  after 'goto done':
  $ ((1+2)) > /dev/null
  /usr/bin/ksh: syntax error: `>' unexpected
  Moving the done label fixes the syntax error as 'inout' is now
  called to handle the redirection operator.

src/cmd/ksh93/tests/arith.sh:
- Add a simple regression test.
2020-07-09 22:12:04 +01:00
Martijn Dekker
361fe1fcc3 Fix hash table memory leak when restoring PATH
There is a bug in path_alias() that may cause a memory leak when
clearing the hash table while setting/restoring PATH.

This applies a fix from Siteshwar Vashist:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01945.html

Note that, contrary to Siteshwar's analysis linked above, this bug
has nothing directly to do with subshells, forked or otherwise; it
can also be reproduced by temporarily setting PATH for a command,
for example, 'PATH=/dev/null true', and then doing a PATH search.

Modified analysis:
ksh maintains the value of PATH as a linked list. When a local
scope for PATH is created (e.g. in a virtual subshell or when doing
something like PATH=/foo/bar command ...), ksh duplicates PATH by
increasing the refcount for every element in the linked list by
calling the path_dup() and path_alias() functions. However, when
the state of PATH is restored, this refcount is not decreased. Next
time when PATH is reset to a new value, ksh calls the path_delete()
function to delete the linked list that stored the older path. But
the path_delete() function does not free elements whose refcount is
greater than 1, causing a memory leak.

src/cmd/ksh93/sh/path.c: path_alias():
- Decrease refcount and free old item if needed.
  (The 'old' variable was already introduced in 99065353, but
  its value was never used there; this fixes that as well.)

src/cmd/ksh93/tests/leaks.sh:
- Add regression test. With the bug, setting/restoring PATH
  (which clears the hash table) and doing a PATH search 16 times
  causes about 1.5 KiB of memory to be leaked.
2020-07-09 18:34:15 +01:00
Martijn Dekker
5e7d335f2f Fix crash when listing indexed arrays with 'typeset -a'
There is a bug in print_scan() function that may cause ksh to crash
while listing indexed arrays. The crash happens in nv_search() when
called from print_scan().

This applies a fix from Siteshwar Vashist:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01944.html

src/cmd/ksh93/bltins/typeset.c:
- Call nv_scan() without the NV_IARRAY flag, even for a null scan.

src/cmd/ksh93/tests/arrays.sh:
- Add regression test for 'typeset -a' crash and check output.
2020-07-09 16:42:16 +01:00
Martijn Dekker
a8f6d6b842 Fix crash due to double free() when sourcing multiple files
There is a bug in sh_eval() that may cause ksh to crash due to a
double free() after sourcing multiple files with '.' or 'source'
if a longjmp is triggered, e.g. by a syntax error.

This applies a fix from Siteshwar Vashist:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01943.html

src/cmd/ksh93/sh/xec.c: sh_eval():
- Zero file descriptor io_save after closing it. This prevents a
  double free() after returning from a longjmp.

src/cmd/ksh93/tests/basic.sh:
- Add reproducer as regression test.
2020-07-09 15:35:07 +01:00
Johnothan King
9526b3fa08
Fix unexpected output from 'printf %T' with certain formats (#65)
This commit changes the behavior of four date formats accepted
by 'printf %()T' because the old behavior is not compatible with
modern implementations of date(1):
- %k and %l now return a blank-padded hour, the former based on a
  24-hour clock and the latter a 12-hour clock (these are common
  extensions present on Linux and *BSD).
- %f now returns a date with the format '%Y.%m.%d-%H:%M:%S'
  (BusyBox extension).
- %q now returns the quarter of the current year (GNU extension).

src/cmd/ksh93/data/builtins.c:
- Copy the date format documentation from date in libcmd to
  the printf man page (for documenting 'printf %T').

src/cmd/ksh93/tests/builtins.sh:
- Add four regression tests for the changed date formats.

src/cmd/ksh93/sh.1:
- Remove inaccurate information about the date formats accepted by
  printf %T'. The KornShell uses a custom version of strftime(3)
  that isn't guaranteed to accepts the same formats as the native
  strftime function.

src/lib/libast/tm/tmxfmt.c:
- Change the behavior of %f, %k, %l and %q to the common behavior.
  %k and %l are implemented as aliases to %_H and %_I to avoid
  duplicating code.

src/lib/libcmd/date.c:
- Update the documentation for the AST date command since it is
  also affected by the changes to 'printf %T'.

Fixes #62
2020-07-09 05:08:28 +01:00
Johnothan King
e70925ce10
Fix memory leak on unset of associative array (#64)
Associative arrays weren't being properly freed from memory, which
was causing a memory leak.

This commit incorporates a patch and reproducer/regress test from:
https://www.mail-archive.com/ast-users@lists.research.att.com/msg01016.html

src/cmd/ksh93/sh/name.c:
- Properly free associative arrays from memory in nv_delete().

src/cmd/ksh93/tests/leaks.sh:
- Add regression test.
2020-07-09 01:09:40 +01:00
Johnothan King
9a9da2c299
Fix use of strdup on a NULL pointer (#63)
The following set of commands can rarely cause a memory fault
when auditing[*] is enabled, although most of the time it will
simply cause ksh to write '(null)' to the auditing file in place
of a tty name:

$ [ -e /etc/ksh_audit ] || echo "/tmp/ksh_auditfile;$(id -u)" | sudo tee /etc/ksh_audit;
$ v=$(ksh  2> /dev/null +o rc -ic $'getopts a:bc: opt --man\nprint $?')
$ cat /tmp/ksh_auditfile
1000;1593599493;(null); getopts a:bc: opt --man

This happens because strdup is used unconditionally on the pointer
returned by 'ttyname', which can be NULL if stderr is closed. This
then causes 'hp->tty' to be set to null, as strdup returns NULL.
See https://github.com/att/ast/issues/1028

src/cmd/ksh93/edit/history.c:
- Make strdup duplicate 'notty' instead of NULL to prevent
  crashes.

[*] https://blog.fpmurphy.com/2008/12/ksh93-auditing-and-accounting.html
2020-07-06 21:51:44 +01:00
Martijn Dekker
300cd19987 Fix corrupt UTF-8 char processing & shellquoting after aborted read
If the processing of a multibyte character was interrupted in UTF-8
locales, e.g. by reading just one byte of a two-byte character 'ü'
(\303\274) with a command like:
	print -nr $'\303\274' | read -n1 g
then the shellquoting algorithm was corrupted in such a way that
the final quote in simple single-quoted string was missing. This
bug may have had other, as yet undiscovered, effects as well. The
problem was with corrupted multibyte character processing and not
with the shell-quoting routine sh_fmtq() itself.

Full trace and discussion at: https://github.com/ksh93/ksh/issues/5
(which is also an attempt to begin to understand the esoteric
workings of the libast mb* macros that process UTF-8 characters).

src/lib/libast/comp/setlocale.c: utf8_mbtowc():
- If called from the mbinit() macro (i.e. if both pointer
  parameters are null), reset the global multibyte character
  synchronisation state variable. This fixes the problem with
  interrupted processing leaving an inconsistent state, provided
  that mbinit() is called before processing multibyte characters
  (which it is, in most (?) places that do this). Before this fix,
  calling mbinit() in UTF-8 locales was a no-op.

src/cmd/ksh93/sh/string.c: sh_fmtq():
- Call mbinit() before potentially processing multibyte characters.
  Testing suggests that this could be superfluous, but at worst,
  it's harmless; better be sure.

src/cmd/ksh93/tests/builtins.sh:
- Add regression test for shellquoting with 'printf %q' after
  interrupting the processing of a multibyte characeter with
  'read -n1'. This test only fails in a UTF-8 locale, e.g. when
  running: bin/shtests -u builtins SHELL=/buggy/ksh-2012-08-01

Fixes #5.
2020-07-05 19:24:41 +02:00
Johnothan King
658bba748e
Fix 'kill -INFO' on systems that support SIGINFO (#59)
src/cmd/ksh93/data/signals.c:
- SIGINFO was absent from the table of signals, which caused
  commands like 'kill -INFO $$' to fail even on platforms with
  SIGINFO (such as macOS and FreeBSD). Fix that by adding
  it to the signal table.

src/cmd/ksh93/tests/signal.sh:
- Add a regression tests for using SIGINFO with the kill builtin.
  The test will only be run if the external kill command supports
  SIGINFO.
2020-07-04 15:57:47 +01:00
Johnothan King
a0dcdeeade Fix bugs with backslash escaping in interactive vi mode (#57)
This commit fixes the following bugs in the 'vi' editing mode
backslash escape feature. Ref.: Bolsky & Korn (1995), p. 113, which
states for \: "Similar to Control+V [...] except that it escapes
only the next Erase or Kill charactrer".

1. The vi mode now only escapes the next character if the last
   character input was a backslash, fixing the bug demonstrated at:
   https://asciinema.org/a/E3Rq3et07MMQG5BaF7vkXQTg0
2. Escaping backslashes are now disabled in vi.c if the vi mode is
   disabled (note that vi.c handles raw editing mode in UTF-8
   locales). This makes the behavior of the raw editing mode
   consistent in C/POSIX and UTF-8 locales.
3. An odd interaction with Backspace when the character prior to a
   separate buffer entered with Shift-C was a backslash has been
   fixed. Demonstration at: https://asciinema.org/a/314833
   ^? will no longer be output repeatedly when attempting to erase
   a separate buffer with a Backspace, although, to be consistent
   with vi(1), you still cannot backspace past it before escaping
   out of it. Ref.:
   https://github.com/ksh93/ksh/issues/56#issuecomment-653586994

src/cmd/ksh93/edit/vi.c:
- Prevent a backslash from escaping the next input if the previous
  input wasn't a backslash. This is done by unsetting a variable
  named backslash if a backslash escaped a character. backslash is
  set to the result of c == '\\' when the user enters a new
  character.
- Disable escaping backslashes in the raw editing mode because
  it should not be enabled there.

src/cmd/ksh93/tests/pty.sh:
- Add some tests for how ksh handles backslashes in each
  editing mode to test for the bugs fixed by this commit.

Fixes #56.
2020-07-03 21:15:21 +02:00
Anuradha Weeraman
035a4cb3f4
Fix segfault if $PATH contains a .paths directory (#55)
ksh crashed if it encountered a .paths directory in any of the
directories in $PATH.

Ref: https://bugs.launchpad.net/ubuntu/+source/ksh/+bug/1534855

src/cmd/ksh93/sh/path.c: path_chkpaths():
- Refuse to read .paths if it's not a regular file
  or a symlink to a regular file.
2020-07-02 23:29:07 +01:00
Johnothan King
db1d539d49
Fix ERE repetition expressions in [[ ... =~ ERE{x,y} ]] (#54)
Regular expressions that combine a repetition expression with
a parenthesized sub-expression throw a garbled syntax error:

$ [[ AATAAT =~ (AAT){2} ]]
ksh: syntax error: `~(E)(AAT){2} ]]
:'%Cred%h%Creseksh: syntax error: `~(E)(AAT){2} ]]
:'%Cred%h%Creseksh: syntax' unexpected

The syntax error occurs because ksh is not fully
accounting for '=~' when it runs into a curly bracket.
This fix disables the syntax error when the operator
is '=~' and adds handling for '(str){x}' (to allow for
more than one sub-expression). This bugfix and the
regression tests for it were backported from ksh93v-
2014-12-24-beta.

src/cmd/ksh93/sh/lex.c:
- Do not trigger a syntax error for '{x}' when the operator
  is '=~' and add handling for multiple parentheses when
  combined with '{x}'.

src/cmd/ksh93/tests/bracket.sh:
- Add two tests from ksh93v- to test sub-expressions
  combined with the '{x}' quantifier.
2020-07-02 18:40:15 +01:00
Johnothan King
120aec25ba
Fix a crash when 'read -u' is given an invalid fd (#53)
File descriptors that are too far out of range will cause the
read builtin to crash. The following example will generate
two crashes:

$ ksh -c 'read -u 2000000' || ksh -c 'read -u-2000000'

The fix is to error out when the given file descriptor is out
of range. This bugfix is from Tomas Klacko, although it was
modified to use 'sh_iovalidfd' and reject numbers greater
than 'INT_MAX':
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01912.html
The question about 'shp->fdstatus[-1]' only applies to ksh93v-
(ksh93u+ doesn't have any references to 'shp->fdstatus[-1]').

src/cmd/ksh93/bltins/read.c:
- File descriptors that are out of range should be rejected
  with an error message (like invalid file descriptors that
  are in range). The seemingly redundant check for negative
  numbers is there because out of range negative numbers also
  cause memory faults despite the later 'fd<0' check.

src/cmd/ksh93/tests/io.sh:
- Add three tests for attempting 'read -u' on various invalid
  file descriptor numbers.
2020-07-01 18:14:10 +01:00
Johnothan King
1b5bc1802a
Fix the readonly builtin's scope in functions (#51)
* Fix the readonly builtin's scope in functions

This bug was first reported at https://github.com/att/ast/issues/881

'tdata.sh->prefix' is only set to the correct value when
'b_readonly' is called as 'export', which breaks 'readonly' in
functions because the correct scope isn't set. As a result, the
following example will only print a newline:

$ function show_bar { readonly foo=bar; echo $foo; }; show_bar

The fix is to move the required code out of the if statement for
'export', as it needs to be run for 'readonly' as well. This bugfix
is from https://github.com/att/ast/pull/906

src/cmd/ksh93/bltins/typeset.c:
- Set 'tdata.sh->prefix' to the correct value, otherwise 'readonly'
  uses the wrong scope.

src/cmd/ksh93/tests/builtins.sh:
- Add the regression test from ksh2020, modified to run in a
  subshell.

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/sh.1:
- Add documentation of 'readonly' vs. 'typeset -r' difference:
  'readonly' does not create a function-local scope.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2020-06-29 19:09:20 +01:00
Johnothan King
10b6ba801d
Fix memory corruption when a compound variable is unset (#49)
The following set of commands ends with a memory fault under
certain circumstances because ksh attempts to free memory
twice, causing memory corruption:

$ testarray=(1 2)
$ compound testarray
$ unset testarray
$ eval testarray=

The fix is to make sure 'np->nvfun' is a valid pointer before
attempting to free memory in 'put_tree'. This patch is from
OpenSUSE: https://build.opensuse.org/package/view_file/shells/ksh/ksh93-nvtree-free.dif?expand=1

src/cmd/ksh93/sh/nvtree.c:
- Do not try to free memory when 'np->nvfun' and 'val'
  are false.

src/cmd/ksh93/tests/comvar.sh:
- Add a regression test for the double free problem. The
  reproducer must be run from an executable script
  with 'ksh -c'.
2020-06-29 18:08:28 +01:00
Johnothan King
5135cf651c
Fix crashes caused by 'typeset -RF' (#47)
Variables created with 'typeset -RF' were being treated as
short integers, even though they are actually floating point
values. As a result the following example will cause a crash:

$ typeset -RF foo=1
$ test "$foo"

This is fixed by checking for 'NV_DOUBLE' with 'nv_isattr',
which prevents ksh from treating floating point values as
short integers due to '== NV_INT16P' excluding 'NV_DOUBLE'.
This bugfix was backported from ksh93v- 2013-10-10-alpha.

src/cmd/ksh93/sh/array.c,
src/cmd/ksh93/sh/name.c,
src/cmd/ksh93/sh/nvdisc:
- Avoid treating floating point values as short integers by
  checking for 'NV_DOUBLE' with 'nv_isattr'.

src/cmd/ksh93/tests/types.sh:
- Add a regression test for the 'typeset -RF' crash. The
  crash cannot be replicated if 'typeset -RF' sets 'foo'
  to zero.
2020-06-28 23:30:27 +01:00
Johnothan King
bb4745e897
Fix incorrect behavior of 'cd ../.foo' (#46)
The cd builtin was removing '.' from directory names when combined
with a preceding '../', which caused commands like 'cd ../.local'
to become 'cd ../local'. This patch fixes the problem by limiting
the extra handling to leading '..'. The bugfix comes from ksh93v-
2013-10-10-alpha, although this version is a shortened patch from
Solaris (as ksh93v- refactored a decent amount of the code for the
cd builtin).

src/cmd/ksh93/bltins/cd_pwd.c:
- cd should only check for leading '..', as trying to handle a lone
  '.' only causes problems.

src/cmd/ksh93/tests/builtins.sh:
- Add a regression test for this problem based on the test present in
  ksh93v- 2013-10-10-alpha.

Patch from Solaris:
https://github.com/oracle/solaris-userland/blob/860d27f/components/ksh93/patches/270-23319761.patch
2020-06-26 23:36:29 +01:00
Martijn Dekker
8c705bf3b7 Fix behaviour of tabs in raw Bourne Shell-like editing mode
When neither '-o emacs' nor '-o vi' is active, there were a couple
of bugs with entering tab characters:
1. Tab completion was erroneously left active. The cause of this
   was that raw Bourne edit mode is handled by ed_viread() in vi.c
   on shells with wide character support, instead of the default
   ed_read() in edit.c, and the former failed to check if vi mode
   is active when processing tab characters.
2. When entering literal tab characters, the cursor was moved to
   the right only one character, instead of the amount of
   characters corresponding to the tab.

src/cmd/ksh93/edit/vi.c: getline():
- Before processing '\t' (tab) for command completion, check that
  the 'vi' shell option (SH_VI) is active.

src/cmd/ksh93/edit/edit.c: ed_virt_to_phys():
- When translating literal tabs to on-terminal spaces and when
  recalculating the cursor position, remove erroneous checks for
  SH_VI; this is also needed in raw Bourne mode. According to my
  own testing, this has no effect on emacs mode (knock on wood).

src/cmd/ksh93/tests/pty.sh:
- Add two regression tests. An odd race condition reveals itself in
  either pty or in ksh's raw/Bourne edit mode; see comment in test.
  Effect is we have to expect either literal tabs or tabs expanded
  to spaces, until that is tracked down and fixed.

Fixes #43.
2020-06-26 11:34:02 +02:00
Johnothan King
4cecde1dd3 Fix buggy completion of ~/some in vi mode (#41)
This commit fixes the bug reported in:
https://github.com/att/ast/issues/682
The following sequence fails in vi mode because ksh looks in the
wrong part of the 'virtual' buffer:

$ touch ~/testfile
$ ls ~/test<tab>

The fix is to change 'virtual[i]' to 'virtual[last_virt]' in the
bugged section of code. The other changes are to make sure listing
files in a directory with something like 'ls /etc/<tab>' calls the
code for Ctrl+L to preserve 'ls /etc/' rather than try (and fail)
to complete the directory name, producing 'ls /etc\n/'. This bugfix
was backported from ksh93v- 2013-10-10-alpha.

src/cmd/ksh93/edit/vi.c
 - Backport the bugfix from ksh93v- 2013-10-10-alpha for this
   problem.

src/cmd/ksh93/tests/pty.sh
 - Add a regression test for this issue using pty, adjusted slightly
   for a fake home directory in /tmp.
2020-06-25 23:13:45 +02:00
Johnothan King
d41ec674c7 Fix some errors in the documentation and other minor issues (#42)
Somewhat notable changes in this commit:
- The 'set +r' bugfix (re: 74b41621) is now documented in the
  changelog.
- Missing options have been added to the synopsis section of the
  ksh man page.
- The minor formatting fix from https://github.com/ksh-community/ksh/pull/5
  has been applied to the ksh man page.
- A few fixes from https://github.com/att/ast/commit/5e747cfb
  have been applied to the ksh man page.
- The man page fixes from https://github.com/att/ast/pull/353
  have been applied, being:
  - An addition to document the behavior of 'set -H'.
  - A fix for the cd section appending rksh93.
  - A fix for some options being indented too far.
  - Removal of a duplicate section documenting '-D'.
  - Reordering the options for 'set' in alphabetical order.
  - A minor fix for the documentation of 'ksh -i'.
2020-06-25 19:31:51 +02:00
Martijn Dekker
2315f6687a Add regress test for fixed BUG_KUNSETIFS (re: 6f0e008c, 7b994b6a)
Modernish is no longer detecting BUG_KUNSETIFS, as I've just
discovered. Always nice when bugs spontaneously vanish...

A 'git reset HEAD~1'/recompile/retest loop reveals this bug was
fixed by 6f0e008c, as later modified by 7b994b6a.

So, let's make sure it stays fixed.

src/cmd/ksh93/tests/variables.sh:
- Add a couple of regression tests for BUG_KUNSETIFS presence,
  detection and known workaround, based on the same in modernish.
  Ref.: https://github.com/modernish/modernish/blob/3ddcbd13/lib/modernish/cap/BUG_KUNSETIFS.t
	https://github.com/modernish/modernish/blob/3ddcbd13/lib/modernish/tst/isset.t#L204-L222
2020-06-24 20:00:01 +02:00
Johnothan King
0aa9e03f55
Fix process substitution combined with redirection (#40)
The code for handling process substitution with redirection was
never being run because IORAW is usually set when IOPROCSUB is
set. This commit fixes the problem by moving the required code
out of the !IORAW if statement. The following command now prints
'good' instead of writing 'ok' to a bizzare file:

$ ksh -c 'echo ok > >(sed s/ok/good/); wait'
good

This commit also fixes a bug that caused the process ID of the
asynchronous process to print when the shell was in interactive
mode. The following command no longer prints a process ID,
behaving like in Bash and zsh:

$ echo >(true)
/dev/fd/5

src/cmd/ksh93/sh/args.c:
 - Temporarily turn off the interactive state while in a process
   substitution to prevent the shell from printing the PID of
   the asynchronous process.

src/cmd/ksh93/sh/io.c:
 - Move the code for process substitution with redirection into
   a separate if statement.

src/cmd/ksh93/tests/io.sh:
 - Add two tests for both process substitution bugs fixed by this
   commit.

src/cmd/ksh93/tests/shtests:
 - Update shtests with a patch from Martijn Dekker to use
   pretty-printing for the output from the times builtin (if it
   is available).

Fixes #2
2020-06-23 23:02:16 +01:00
Johnothan King
c1994b87f1
Fix nested functions ignoring prefixed variable assignments (#37)
This commit fixes the bug described in att/ast#32. The fix and
following explanation is from att/ast#467:

While copying variables from function's local scope to a new scope,
variable attributes were not copied. Such variables were not marked
to be exported in the new function. For e.g.

function f2 { env | grep -i "^foo"; }
function f1 { env | grep -i "^foo"; f2; }
foo=bar f1

prints 'foo=bar' only once, but it should print be twice.

src/cmd/ksh93/sh/xec.c:
 - When variables from the local scope of a function are copied into
   the scope of a nested function, the attributes of the variables
   need to be copied as well.

src/cmd/ksh93/tests/functions.sh:
 - Add regression tests from ksh2020 to check environment variables
   passed to functions.
2020-06-23 00:27:05 +01:00
Johnothan King
ff358f3464 Fix a crash when 'kill %%' and 'kill %+' are run (#35)
Ksh was trying to use the 'pw' variable as a valid pointer even
when it was NULL. This is fixed by doing the error check for
'pw' before doing anything else in 'job_kill'.

This bugfix is from Red Hat:
44e0a643a9/f/SOURCES/ksh-20130214-fixkill.patch

Fixes #34
2020-06-22 19:11:49 +02:00
Martijn Dekker
3ba4900e9c Make 'stop' and 'suspend' regular built-ins
The 'stop' and 'suspend' default aliases are now converted into
regular built-in commands so that 'unalias -a' does not remove
them, 'suspend' can do some sanity checks, and something like
	cmd=stop; $cmd $!
will now work.

src/cmd/ksh93/bltins/trap.c:
- b_kill(): Incorporate 'stop' functionality, which is simply
  setting the same flag and variable as '-s STOP' would have done.
- b_suspend(): Add simple builtin function that sends SIGSTOP to
  the main shell. Check for no operands, and refuse to suspend a
  login shell (which would leave the user stuck with no way out).
  Also check that 'kill' succeeds; if we're in an asynchronous
  subshell, it is possible the main shell no longer exists.

src/cmd/ksh93/data/aliases.c:
- Remove "stop" and "suspend" default aliases. (Why were these
  conditional upon SIGTSTP when they actually issued SIGSTOP?)

src/cmd/ksh93/include/builtins.h,
src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/data/msg.c:
- Add declarations of "stop" and "suspend" regular built-ins.
- Add option strings (AST manual/--man pages) for them.
- Add e_toomanyops ("too many operands") reusable error message for
  b_suspend(). Other new commands may want this at some point.

src/cmd/ksh93/sh.1:
- Remove "stop" and "suspend" default aliases.
- Document "stop" and "suspend" regular built-in commands.
2020-06-22 15:36:29 +02:00
Johnothan King
bd3e2a8001
Fix unreliable behavior when special vars are readonly or unset (#27)
src/cmd/ksh93/data/variables.c:
 - Running 'unset .sh.lineno' creates a memory fault, so fix that
   by giving it the NV_NOFREE attribute. This crash was happening
   because ${.sh.lineno} is an integer that cannot be freed from
   memory with free(3).

src/cmd/ksh93/sh/init.c:
 - Tell _nv_unset to ignore NV_RDONLY when $RANDOM and $LINENO are
   restored from the subshell scope. This is required to fully
   restore the original state of these variables after a virtual
   subshell finishes.

src/cmd/ksh93/bltins/typeset.c,
src/cmd/ksh93/sh/subshell.c:
 - Disabled some optimizations for two instances of 'sh_assignok' to
   fix 'readonly' in virtual subshells and '(unset .sh.level)' in
   nested functions. This fixes the following variables when
   '(readonly $varname); enum varname=' is run:

   $_
   ${.sh.name}
   ${.sh.subscript}
   ${.sh.level}

   The optimization in question prevents sh_assignok from saving the
   original state of these variables by making the sh_assignok call
   a no-op. Ksh needs the original state of a variable for it to be
   properly restored after a virtual subshell has run, otherwise ksh
   will simply carry over any new flags (being NV_RDONLY in this case)
   from the subshell into the main shell.

src/cmd/ksh93/tests/variables.sh:
 - Add regression tests from Martijn Dekker for setting special
   variables as readonly in virtual subshells and for unsetting
   special variables in general.

Fixes #4
2020-06-20 18:08:41 +01:00
Johnothan King
99065353b3 Fix 'whence -a' to print correct path for tracked alias (#25)
'whence -a' bases the path for tracked aliases on the user's
current working directory if an enabled ksh builtin of the same
name is also available. The following example will claim 'cat'
is in the user's current working directory:

$ whence -a cat
cat is a tracked alias for /usr/bin/cat
$ builtin cat
$ whence -a cat
cat is a shell builtin
cat is /usr/bin/cat
cat is a tracked alias for /current/working/directory/cat

This patch from ksh2020 fixes this problem by properly saving the
path of the tracked alias for use with 'whence -a', since
'path_pwd' (as implied by the function's name) only gets the users
current working directory, not the location of tracked aliases.
Ref.: https://github.com/att/ast/issues/1049

This bug was originally reported by David Morano about two decades
ago to the AST team: https://github.com/att/ast/issues/954

src/cmd/ksh93/bltins/whence.c:
 - Print the actual path of a tracked alias, path_pwd doesn't
   have this functionality.

src/cmd/ksh93/include/name.h:
 - Add 'pathcomp' for saving the value of tracked aliases.

src/cmd/ksh93/sh/path.c:
 - Save the value of tracked aliases for use by whence.

src/cmd/ksh93/tests/builtins.sh:
 - Add a regression test for using 'whence -a' on tracked
   aliases with a builtin equivalent.
2020-06-19 14:03:58 +02:00
Martijn Dekker
3e3f6b0f12 Restore #22 'unset -f' fix minus segfault (re: b7932e87, 97511748)
Applying the fix for 'unset -f' exposed a crashing bug in lookup()
in sh/nvdisc.c, which is the function for looking up discipline
functions. This is what caused tests/variables.sh to crash.
Ref.: https://github.com/ksh93/ksh/issues/23#issuecomment-645699614

src/cmd/ksh93/sh/nvdisc.c: lookup():
- To avoid segfault, check that the function pointer nq->nvalue.rp
  is actually set before checking if nq->nvalue.rp->running==1.

src/cmd/ksh93/sh/xec.c,
src/cmd/ksh93/tests/functions.sh:
- Uncomment the 'unset -f' fix from b7932e87.

Resolves #21 (again).
2020-06-18 02:48:51 +02:00
Martijn Dekker
975117485c Part revert #22 to undo memory fault (re: b7932e87)
The fix in sh/xec.c, which was backported from the ksh 93v- beta to
delay the actual removal of a running function that unsets itself,
caused a segfault in the variables.sh regression tests (see #23).

src/cmd/ksh93/sh/xec.c:
- Comment out the backported code pending a correct fix for #21.
  Now both types of functions silently fail to unset themselves
  (unless they're discipline functions).

src/cmd/ksh93/tests/functions.sh:
- Disable regression tests checking that the function was actually
  unset, pending a correct fix for #21.

Resolves: #23
Reopens: #21
2020-06-17 21:01:55 +02:00
Johnothan King
b7932e87b6
Fix two problems with 'unset -f' behavior (#22)
src/cmd/ksh93/sh/name.c:
 - Correct the check for when a function is currently running
   to fix a segmentation fault that occurred when a POSIX
   function tries to unset itself while it is running.
   This bug fix was backported from ksh93v-.

src/cmd/ksh93/sh/xec.c:
 - If a function tries to unset itself, unset the function
   with '_nv_unset(np, NV_RDONLY)' to fix a silent failure.
   This fix was also backported from ksh93v-.

src/cmd/ksh93/tests/functions.sh:
 - Add four regression tests for when a function unsets itself.

Resolves #21
2020-06-17 18:26:43 +01:00
Johnothan King
fae8862c53
Fix assignments preceding 'command <special builtin>' (#19)
Ksh was not checking for `command` when running a special builtin,
which caused preceding invocation-local variable assignments to
become global. This is the reproducer from the att/ast#72:

$ foo=BUG command eval ':'
$ echo "$foo"

This no longer prints 'BUG', as ksh now makes sure the command builtin
is not running a special builtin before making invocation-local
variable assignments global.

src/cmd/ksh93/sh/xec.c:
 - Backport the bugfix for BUG_CMDSPASGN from ksh93v- 2013-10-10-alpha.

src/cmd/ksh93/tests/builtins.sh:
 - Add a regression test based on the reproducer in att/ast#72.
2020-06-16 22:58:05 +01:00
Johnothan King
764acefaf1 read -r -d should not ignore -r
This bug was previously reported in att/ast#37.
Ksh ignores `-r` when `read -r -d` is run because when
the bit for `D_FLAG` is set, the bit for `R_FLAG` is unset
as a side effect of setting `D_FLAG`. The following set
of commands fails to print a backslash:

$ printf '\\\000' | read -r -d ''
$ echo $REPLY

The fix for this bug is to set `D_FLAG` with `D_FLAG + 1`,
which prevents `R_FLAG` from being unset. This bugfix
has been backported from ksh93v- 2013-10-10-alpha.

src/cmd/ksh93/bltins/read.c:
 - Set `D_FLAG` with `D_FLAG + 1` to prevent the bit for
   `R_FLAG` from being unset.

src/cmd/ksh93/tests/builtins.sh:
 - Add the regression test for `read -r -d` from ksh93v-.
2020-06-16 13:49:23 +01:00
Martijn Dekker
6916a873c2 optget: display --help and --man in terse usage messages
The fact that every ksh builtin command self-documents with the
options --help and --man (and others, see 'getopts --man'; but
these are the essential ones) is poorly known; the information is
buried somewhere deep in the sh.1 manual page, and is incomplete at
that. None of the terse usage messages displayed on error point the
user to these options. So let's fix that.

src/lib/libast/misc/optget.c:
- Change generic 'options' placeholder, used in all terse usage
  messages, to 'options | --help | --man'.

src/cmd/ksh93/sh.1:
- Edit documentation of --man/-?, adding documentation on --help
  which was completely undocumented. Refer to 'getopts --man' for
  more advanced info.
- Separate these from the (important) documentation on special
  builtins using a paragraph break.
2020-06-15 16:56:11 +02:00