1
0
Fork 0
mirror of git://git.code.sf.net/p/cdesktopenv/code synced 2025-03-09 15:50:02 +00:00
Commit graph

490 commits

Author SHA1 Message Date
Martijn Dekker
e072e7c170 Fix crash in xtrace while processing here-document (re: d7cada7b)
Depending on the OS, the heredoc.sh regression tests, and possibly
others, still crashed with the -x option (xtrace) on.

Analysis: The lexer crashes in lex_advance(). Something has caused
an inconsistent lexer state, and it happened earlier on, so the
backtrace is useless for figuring out where that happened.

But I think I've found it. It's the sh_mactry() call here:

src/cmd/ksh93/sh/xec.c, lines 2800 to 2807 in f7213f03
2800:   if(!(cp=nv_getval(sh_scoped(shp,PS4NOD))))
2801:           cp = "+ ";
2802:   else
2803:   {
2804:           sh_offoption(SH_XTRACE);
2805:           cp = sh_mactry(shp,cp);
2806:           sh_onoption(SH_XTRACE);
2807:   }

sh_mactry() needs to parse the contents of $PS4 to perform
expansions and command substitutions in it, which involves the
lexer. If that happens in a here-document, the lexer is in the C
function call stack, in the middle of parsing the here-document.
Result: inconsistent lexer state. Solution: save and restore lexer
state in sh_mactry().

After this commit, all regression tests should pass with the
'-x'/'--xtrace' option in use, with no errors or crashes.

Note for backporters: this fix depends both on on d7cada7b and on
the consistency fix for the Lex_t type's size applied in a7ed5d9f.

src/cmd/ksh93/include/shlex.h:
- Cosmetic fix: remove a copied & pasted backslash. (re: a7ed5d9f)

src/cmd/ksh93/sh/macro.c: sh_mactry():
- Save and restore the lexer state before letting sh_mactrim()
  indirectly parse and execute code.

src/cmd/ksh93/tests/*.sh:
- Turn off xtrace in various command substitutions that contain
  2>&1 redirections, so that the xtrace output is not caught by
  the command substitutions, causing tests to fail incorrectly.
- Turn off xtrace for a few code blocks with 2>&1 redirections,
  stopping xtrace output from being written to standard output.

Resolves: https://github.com/ksh93/ksh/issues/306 (again)
2021-12-27 04:02:25 +00:00
Johnothan King
f7213f03a2 Fix multiple bugs when using 'alias -p' to print aliases (#398)
This commit was originally intended to fix just one bug with shcomp's
handling of 'alias -p', but while fixing that I found a large number
of related issues in the alias command's -p, -t and -x options. The
current patch provides bugfixes for all of the bugs listed below:

1) Listing aliases in a script with 'alias -p' or 'alias' broke
   shcomp's bytecode output:
   https://github.com/ksh93/ksh/issues/87#issuecomment-813819122

2) Listing individual aliases with the -p option doesn't work:
      $ alias foo=bar bar=foo
      $ alias foo
      foo=bar
      $ alias -p foo  # No output

3) Listing specific tracked aliases with -pt does not display them
   in a reusable format, but rather adds another tracked alias:
      $ hash -r cat vi
      $ alias -pt vi  # No output
      $ alias -pt rm
      $ alias -t
      cat=/usr/bin/cat
      rm=/usr/bin/rm
      vi=/usr/bin/vi

4) Listing all tracked aliases with -pt does not output them in a
   reusable format (the resulting command printed only creates a
   normal alias, which is different from a tracked alias):
      $ hash -r cat
      $ alias -pt
      alias cat=/usr/bin/cat  # Expected 'alias -t cat'

5) Listing a non-existent alias with -p doesn't cause an error:
      $ unalias -a
      $ alias -p notanalias  # No output
      $ echo $?
      0
      $ alias notanalias
      notanalias: alias not found
      $ echo $?
      1
      $ hash -r
      $ alias -pt notacommand  # No output
      $ echo $?
      0

6) Attempting to list 256 non-existent aliases results in exit
   status zero:
      $ unalias -a
      $ alias $(awk -v ORS= 'BEGIN { for(i=0;i<256;i++) print "x "; }')
      x: alias not found
      --cut error message--
      $ echo $?
      0

Changes:

- typeset.c: Avoid printing anything while shcomp is compiling a
  script. This is needed because the alias command is run by shcomp
  to prevent parsing issues.

- b_alias(): To avoid adding tracked aliases with -pt, set
  tdata.aflag to '+' so that setall() and other related functions
  only list tracked aliases.

- b_alias(): Set tdata.pflag to 1 so that setall() and other
  functions recognize -p was passed.

- print_value(): Add support for listing specific aliases with
  'alias -p'.

- setall(): To avoid any issues with zombie tracked aliases (see also
  the regression tests) ignore tracked alias nodes marked with the
  NV_NOALIAS attribute. This bit is set for tracked alias nodes by
  the nv_rehash() function.

- setall(): For backward compatibility, continue incrementing the
  exit status for each invalid alias and tracked alias passed. This
  was already how alias behaved when listing aliases without -p, so
  using -p shouldn't cause a change in behavior:
      $ unalias -a
      $ alias foo bar
      foo: alias not found
      bar: alias not found
      $ echo $?
      2
  To fix bug 6, the exit status is set to one if an enforced 8-bit
  exit status would be zero.

- print_namval(): Set the prefix to 'alias -t' so that listing
  tracked aliases with 'alias -pt' works correctly.

- data/msg.c and include/name.h: Add an error message for when
  'alias -pt' doesn't find a tracked alias.

- tests/alias.sh: Add a ton of regression tests for the bugs fixed in
  this commit.
2021-12-27 03:49:06 +00:00
Martijn Dekker
9403d326f4 shtests --posix: ensure consistent locale: unset all LC_* vars
When running tests in --posix/-p mode, not all LC_* variables were
unset, so that certain aspects of the locale could be non-POSIX.
2021-12-27 03:48:27 +00:00
Johnothan King
3785a0685c Fix process substitutions printing PIDs in profile scripts (#395)
- sh/args.c: A process substitution run in a profile script may print
  its PID as if it was a command spawned with '&'. Reproducer:
     $ cat /tmp/env
     true >(false)
     $ ENV=/tmp/env ksh
     [1]	730227
     $
  This bug is fixed by turning off the SH_PROFILE state while running
  a process substitution.

- sh/subshell.c: The SH_INTERACTIVE fix in 3525535e renders the extra
  check for SH_PROFILE redundant, so it has been removed.

- tests/io.sh: Update the procsub PIDs test to also check the result
  after using process substitution in a profile script.
2021-12-22 13:27:00 +00:00
Johnothan King
e6989853bc Fix yet more minor bugs related to the regression tests (#389)
- Redirect error output from the ulimit builtin (re: 3e58851f).
- Fix the test failure for 'cd -eP' on illumos by making a directory
  symlink first, then removing the symlink after cd.
- Fix the test failure for 'getconf -l' on illumos by quoting
  strings with the -q option.
- astconf.c: Only quote strings if the -q option was passed.
- Improve error messages from intermittently failing types.sh tests
2021-12-21 08:01:00 +00:00
Martijn Dekker
d3f553ca76 shtests: report correct line numbers for shcomp fails (re: bd38c804)
Inserting the _common script instead of sourcing it caused all test
failures in shcomp runs to be reported with the number of lines in
_common added.

src/cmd/ksh93/shtests:
- Only incorporate the aliases from _common; dot/source the rest of
  the code as normal. Replace the first few lines with the aliases
  to avoid affecting $LINENO; they are comments anyway.
2021-12-21 06:48:21 +00:00
Martijn Dekker
ce3e080c0e Release 1.0.0-beta.2
Announcing: KornShell 93u+m 1.0.0-beta.2
https://github.com/ksh93/ksh

In May 2020, when every KornShell (ksh93) development project was
abandoned, development was rebooted in a new fork based on the last
stable AT&T version: ksh 93u+. This new fork is called ksh 93u+m as a
permanent nod to its origin. We're restarting it at version 1.0. Seven
months after the first beta, the second one is ready. Please test this
second beta and report any bugs you find, or help us fix known bugs.

We're now the default ksh93 in some OS distributions, at least Debian
and Slackware! Even though we don't think it's stable release quality
yet, the consensus seems to be that 93u+m is already much better than
the last AT&T release.

Main developers: Martijn Dekker, Johnothan King, hyenias

Contributors: Andy Fiddaman, Anuradha Weeraman, Chase, Gordon Woodhull,
Govind Kamat, Harald van Dijk, Lev Kujawski, Marc Wilson, Ryan Schmidt,
Sterling Jensen

HOW TO GET IT

Please download the source code tarball from our GitHub releases page:

	https://github.com/ksh93/ksh/releases

To build, follow the instructions in README.md or src/cmd/ksh93/README.

HOW TO GET INVOLVED

To report a bug, please open an issue at our GitHub page (see above).
Alternatively, email me at martijn@inlv.org with your report.
To get involved in development, read the brief policy information in
README.md and then jump right in with a pull request or email a patch.
See the TODO file in the top-level directory for a to-do list.

*** MAIN CHANGES between 1.0.0-beta.1 and 1.0.0-beta.2 ***

New features in built-in commands:

- 'cd' now supports an -e option that, when combined with -P, verifies
  that $PWD is correct after changing directories; this helps detect
  access permission problems. See:
  https://www.austingroupbugs.net/view.php?id=253

- 'printf' now supports a -v option as in bash. This assigns formatted
  output directly to variables, which is very fast and will not strip
  final newline (\n) characters.

- The 'return' command, when used to return from a function, can now
  return any status value in the 32-bit signed integer range, like on
  zsh. However, due to a traditional Unix kernel limitation, $? is
  still trimmed to its least significant 8 bits whenever leaving a
  (sub)shell environment.

- 'test'/'[' now supports all the same operators as [[ (including =~,
  \<, \>) except for the different 'and'/'or' operators. Note that
  'test'/'[' remains deprecated due to its unfixable pitfalls;
  [[ ... ]] is recommended instead.

Shell language changes:

- Several improvements were made to the --noexec shell code linter.

- Arithmetic expressions in native ksh mode no longer interpret a
  number with a leading zero as octal in any context. Use 8#octalnumber
  instead (e.g. 8#400 == 256). Arithmetic expressions now also behave
  identically within and outside ((...)) and $((...)).

- POSIX compatibility mode fixes (only applicable with the --posix shell
  option on):
  - A leading zero is now consistently recognised as introducing an octal
    number in all arithmetic contexts.
  - $((inf)) and $((nan)) are now interpreted as regular variables.
  - The '.' built-in no longer runs ksh functions and now only runs
    files.

Bugs fixed:

- '.' and '..' are now once again completed by tab completion.

- If SIGINT is set to ignore, the interactive shell no longer exits on
  Ctrl+C.

- ksh now builds and runs on Apple's new M1 hardware.

- The 'return' and 'exit' commands no longer risk triggering actual
  signals by returning or exiting with a status > 256.

- Ksh no longer behaves badly when parsing a type definition command
  ('typeset -T' or 'enum') without executing it or when executing it in
  a subshell. Types can now safely be defined in subshells and defined
  conditionally as in 'if condition; then enum ...; fi'.

- Discipline functions, especially those applied to PS2 or .sh.tilde,
  will no longer crash your shell upon being interrupted or throwing an
  error.

- Fixed a bug that could corrupt output if standard output is closed
  upon initialising the shell.

- Fixed a bug in the [[ ... ]] compound command: the '!' logical
  negation operator now correctly negates another '!', e.g.,
  [[ ! ! 1 -eq 1 ]] now returns 0/true. Note that this has always been
  the case for 'test'/'['.

- Fixed SHLVL so that replacing ksh by itself (exec ksh) will not
  increase it.

- Arithmetic expressions are no longer allowed to assign out-of-range
  values to variables of types declared with enum.

- The 'time' keyword no longer makes the --errexit shell option
  ineffective.

- Various bugs in libcmd built-in commands (those bound to the
  /opt/ast/bin path by default) have been fixed.

- Various other crashing bugs have been fixed.

Fixes for the shcomp byte code compiler:

- shcomp is now able to compile scripts that define types using enum.

- shcomp now refuses to mess up your terminal by writing bytecode
  to it.

*** MAIN CHANGES between ksh 93u+ 2012-08-01 and 93u+m 1.0.0-beta.1 ***

Hundreds of bugs have been fixed, including many serious/critical bugs.
This includes upstreamed patches from OpenSUSE, Red Hat, and Solaris, fixes
backported from the abandoned 93v- beta and ksh2020 fork, as well as many
new fixes from the community. See the NEWS file for more information, and
the git commit log for complete documentation of every fix. Incompatible
changes have been minimised, but not at the expense of fixing bugs. For a
list of potentially incompatible changes, see src/cmd/ksh93/COMPATIBILITY.

Though there was a "no new features, bugfixes only" policy, some new
features were found necessary, either to fix serious design flaws or to
complete functionality that was evidently intended, but not finished.
Below is a summary of these new features.

New command line editor features:

- The forward-delete and End keys are now handled as expected in the
  emacs and vi built-in line editors.

- In the vi and emacs line editors, repeat count parameters can now also
  be used for the arrow keys and the forward-delete key. E.g., in emacs
  mode, <ESC> 7 <left-arrow> will now move the cursor seven positions to
  the left. In vi control mode, this would be entered as: 7 <left-arrow>.

New shell language features:

- The &>file redirection shorthand (for >file 2>&1) is now available for
  all scripts and interactive sessions and not only for profile/login
  scripts, bringing ksh 93u+m in line with mksh, bash, and zsh.

- File name generation (a.k.a. pathname expansion, a.k.a. globbing) now
  never matches the special navigational names '.' (current directory)
  and '..' (parent directory). This change makes a pattern like .*
  useful; it now matches all hidden files (dotfiles) in the current
  directory, without the harmful inclusion of '.' and '..'.

- Tilde expansion can now be extended or modified by defining a
  .sh.tilde.get or .sh.tilde.set discipline function. This replaces a
  2004 undocumented attempt to add this functionality via a .sh.tilde
  command, which never worked and crashed the shell. See the manual for
  details on the new method.

New features in built-in commands:

- Usage error messages now show the --help/--man self-documentation options.

- Path-bound built-ins (such as /opt/ast/bin/cat) can now be executed by
  invoking the canonical path, so the following will now work as expected:
	$ /opt/ast/bin/cat --version
	  version         cat (AT&T Research) 2012-05-31

- 'command -x' now looks for external commands only, skipping built-ins.
  In addition, its xargs-like functionality no longer freezes the shell on
  Linux and macOS, making it effectively a new feature on these systems.

- 'redirect' now checks if all arguments are valid redirections before
  performing them. If an error occurs, it issues an error message instead
  of terminating the shell.

- 'suspend' now refuses to suspend a login shell, as there is probably no
  parent shell to return to and the login session would freeze.

- 'times' now gives high precision output in a POSIX compliant format.

- 'typeset' now gives an informative error message if an incompatible
  combination of options is given.

- 'whence -v/-a' now reports the location of autoloadable functions.

New features in shell options:

- A new --globcasedetect shell option is added on OSs where we can
  check for a case-insensitive file system (currently Windows/Cygwin,
  macOS, Linux and QNX 7.0+). When this option is turned on, file name
  generation (globbing), as well as file name tab completion on
  interactive shells, automatically become case-insensitive on file
  systems where the difference between upper and lower case is ignored
  for file names. This is transparently determined for each directory, so
  a path pattern that spans multiple file systems can be part
  case-sensitive and part case-insensitive.

- A new --nobackslashctrl shell option disables the special escaping
  behaviour of the backslash character in the emacs and vi built-in
  editors. Particularly in the emacs editor, this makes it much easier to
  go backward, insert a forgotten backslash into a command, and then
  continue editing without having your next cursor key replace your
  backslash with garbage. Note that Ctrl+V (or whatever other character
  was set using 'stty lnext') always escapes all control characters in
  either editing mode.

- A new --posix shell option has been added to ksh 93u+m that makes the
  ksh language more compatible with other shells by following the POSIX
  standard more closely. See the manual page for details. It is enabled by
  default if ksh is invoked as sh, otherwise it is disabled by default.

- Enhancement to -G/--globstar: symbolic links to directories are now
  followed if they match a normal (non-**) glob pattern. For example, if
  '/lnk' is a symlink to a directory, '/lnk/**' and '/l?k/**' now work as
  you would expect.
2021-12-17 04:20:04 +01:00
Johnothan King
85199ab351 Backport ksh93v- bugfix for [[ 1<2 ]] (#380)
Strings compared in [[ with the > and < operators should be compared
lexically. This does not work when the strings are single digits, as
the parser interprets it as a syntax error:
   $ [[ 10<2 ]]   # 10 lexically sorts before 2
   $ echo $?
   0
   $ [[ 1<2 ]]
   /usr/bin/ksh: syntax error: `<' unexpected
   $ echo $?
   3

src/cmd/ksh93/sh/lex.c:
- Don't interpret numbers next to > and < as a redirection while
  inside of [[. This bugfix was backported from ksh93v- 2014-06-25.

src/cmd/ksh93/tests/bracket.sh:
- Add regression tests for the > and < operators.
2021-12-17 03:26:41 +01:00
Martijn Dekker
e67df29c07 Re-fix defining types conditionally or in subshells (re: f508660d)
New version. I'm pretty sure the problems that forced me to revert
it earlier are fixed.

This commit mitigates the effects of the hack explained in the
referenced commit so that dummy built-in command nodes added by the
parser for declaration/assignment purposes do not leak out into the
execution level, except in a relatively harmless corner case.

Something like

    if false; then
        typeset -T Foo_t=(integer -i bar)
    fi

will no longer leave a broken dummy Foo_t declaration command. The
same applies to declaration commands created with enum.

The corner case remaining is:

    $ ksh -c 'false && enum E_t=(a b c); E_t -a x=(b b a c)'
    ksh: E_t: not found

Since the 'enum' command is not executed, this should have thrown
a syntax error on the 'E_t -a' declaration:
ksh: syntax error at line 1: `(' unexpected

This is because the -c script is parsed entirely before being
executed, so E_t is recognised as a declaration built-in at parse
time. However, the 'not found' error shows that it was successfully
eliminated at execution time, so the inconsistent state will no
longer persist.

This fix now allows another fix to be effective as well: since
built-ins do not know about virtual subshells, fork a virtual
subshell into a real subshell before adding any built-ins.

src/cmd/ksh93/sh/parse.c:

- Add a pair of functions, dcl_hactivate() and dcl_dehacktivate(),
  that (de)activate an internal declaration built-ins tree into
  which check_typedef() can pre-add dummy type declaration command
  nodes. A viewpath from the main built-ins tree to this internal
  tree is added, unifying the two for search purposes and causing
  new nodes to be added to the internal tree. When parsing is done,
  we close that viewpath. This hides those pre-added nodes at
  execution time. Since the parser is sometimes called recursively
  (e.g. for command substitutions), keep track of this and only
  activate and deactivate at the first level.
     (Fixed compared to previous version of this commit: calling
  dcl_dehacktivate() when the recursion level is already zero is
  now a harmless no-op. Since this only occurs in error handling
  conditions, who cares.)

- We also need to catch errors. This is done by setting libast's
  error_info.exit variable to a dcl_exit() function that tidies up
  and then passes control to the original (usually sh_exit()).
     (Fixed compared to previous version of this commit: dcl_exit()
  immediately deactivates the hack, no matter the recursion level,
  and restores the regular sh_exit(). This is the right thing to
  do when we're in the process of erroring out.)

- sh_cmd(): This is the most central function in the parser. You'd
  think it was sh_parse(), but $(modern)-form command substitutions
  use sh_dolparen() instead. Both call sh_cmd(). So let's simply
  add a dcl_hacktivate() call at the beginning and a
  dcl_deactivate() call at the end.

- assign(): This function calls path_search(), which among many
  other things executes an FPATH search, which may execute
  arbitrary code at parse time (!!!). So, regardless of recursion
  level, forcibly dehacktivate() to avoid those ugly parser side
  effects returning in that context.

src/cmd/ksh93/bltins/enum.c: b_enum():

- Fork a virtual subshell before adding a built-in.

src/cmd/ksh93/sh/xec.c: sh_exec():

- Fork a virtual subshell when detecting typeset's -T option.

Improves fix to https://github.com/ksh93/ksh/issues/256
2021-12-17 01:28:28 +01:00
Martijn Dekker
2bc1d814c9 Do not exit shell on Ctrl+C with SIGINT ignored (re: 7e5fd3e9)
The killpg(getpgrp(),SIGINT) call added to ed_getchar() in that
commit caused the interactive shell to exit on ^C even if SIGINT is
being ignored. We cannot revert or remove that call without
breaking job control. This commit applies a new fix instead.

Reproducers fixed by this commit:

  SIGINT ignored by child:

    $ PS1='childshell$ ' ksh
    childshell$ trap '' INT
    childshell$ (press Ctrl+C)
    $

  SIGINT ignored by parent:

    $ (trap '' INT; ENV=/./dev/null PS1='childshell$ ' ksh)
    childshell$ (press Ctrl+C)
    $

  SIGINT ignored by parent, trapped in child:

    $ (trap '' INT; ENV=/./dev/null PS1='childshell$ ' ksh)
    childshell$ trap 'echo test' INT
    childshell$ (press Ctrl+C)
    $

I've experimentally determined that, under these conditions, the
SFIO stream error state is set to 256 == 0400 == SH_EXITSIG.

src/cmd/ksh93/sh/main.c: exfile():
- On EOF or error, do not return (exiting the shell) if the shell
  state is interactive and if sferror(iop)==SH_EXITSIG.
- Refactor that block a little to make the new check fit in nicely.

src/cmd/ksh93/tests/pty.sh:
- Test the above three reproducers.

Fixes: https://github.com/ksh93/ksh/issues/343
2021-12-16 19:56:46 +01:00
Martijn Dekker
3779d84220 more regression test updates (re: 7cdb01f6) 2021-12-16 11:46:17 +01:00
Martijn Dekker
fad16a0c63 tests/path.sh: update regression test (re: 7cdb01f6) 2021-12-16 09:20:00 +01:00
Martijn Dekker
60e0687cec shtests: don't run pty tests with shcomp by default
The pty tests tests the interactive shell. Therefore, running them
through the script compiler is a waste of time.

Not only that, it is reported that the pty tests intermittently
fail with shcomp on some systems. This is not worth trying to fix.

src/cmd/ksh93/tests/shtests:
- Only run pty.sh with shcomp if -c/--compile was explicitly
  specified.
- Document the change.
2021-12-16 09:09:05 +01:00
Martijn Dekker
6137b99a6b package: remove a lot more unused AT&T cruft
This reduces the bin/package script by more than half!

bin/package, src/cmd/INIT/package.sh:
- Remove obsolete and unused package commands: admin, contents,
  license, list, remote, regress, setup, update, verify, write.
- Remove associated documentation.
- Replace install command with a dummy. It'll come back when we
  reintroduce the building of dynamic libaries.
- Update the test command to run the regression tests properly
  and capture the output in arch/*/lib/package/gen/tests.out, as
  documented. Arguments are simply passed to bin/shtests.

src/cmd/INIT/{ditto.sh,hurl.sh,release.c}:
- Removed. These were support scripts for some of the removed
  package commands.

src/cmd/ksh93/tests/pty.sh:
- Avoid failure when capturing output via 'bin/package test' by
  redirecting standard error to /dev/tty when running the tests.
2021-12-15 00:50:45 +01:00
Martijn Dekker
fc752b574a Re-match '.' and '..' in tab completion (re: 5312a59d, aad74597)
Turns out there is a bona fide, honest-to-goodness use case for
matching '.' and '..' in globbing after all. It's when globbing is
used as the backend mechanism for file name completion in
interactive shell editors. A tab invisibly adds a * at the end of
the word to the left of your cursor and the resulting pattern is
expanded. In 5312a59d, this broke for '.' and '..'.

Typing '.' followed by two tabs should result in a menu that
includes './' and '../'. Typing '..' followed by a tab should
result in '../', (or a menu that includes it if there are files
with names starting with '..'). This is the behaviour in 93u+ and
we should maintain this.

To restore this functionality without reintroducing the harmful
behaviour fixed in the referenced commits, we should special-case
this, allowing '.' and '..' to match only for file name completion.

src/lib/libast/include/glob.h:
- Fix an inaccurate comment: the GLOB_COMPLETE flag is used for
  command completion, not file name completion. This is very clear
  from reading the path_expand() function in sh/expand.c.
- Add new GLOB_FCOMPLETE flag for file name completion.

src/lib/libast/misc/glob.c:
- Adapt flags mask to fit the new flag.
- glob_dir(): If GLOB_FCOMPLETE is passed, allow '.' and '..' to
  match even if expanded from a pattern.
- Clarify the fix from aad74597 with an extended comment based on
  <https://github.com/ksh93/ksh/issues/146#issuecomment-790991990>.

src/cmd/ksh93/sh/expand.c: path_expand():
- If we're in the SH_FCOMPLETE (file name completion) state, then
  pass the new GLOB_FCOMPLETE flag to AST glob(3).

Fixes: https://github.com/ksh93/ksh/issues/372
Thanks to @fbrau for the bug report.
2021-12-13 01:50:50 +01:00
Johnothan King
e54001d58b Various minor capitalization and typo fixes (#371)
This commit fixes various minor typos, punctuation errors and
corrects the capitalization of many names.
2021-12-13 01:49:42 +01:00
Martijn Dekker
350e52877b Revert "[1.0 release prep] Remove tilde expansion discipline"
This reverts c0334e32, thereby restoring 936a1939.

After the fixes in 0a343244 and a2bc49be, the tilde expansion
disciplines work nicely, so they can come back to the 1.0 branch.
2021-12-09 07:31:37 +01:00
Martijn Dekker
a2bc49bed1 Further robustify .get and .set discipline functions (re: 0a343244) (#368)
This should fix various crashes that remain, at least:
* when running a PS2 discipline at parse time
* when pressing Ctrl+C on a PS2 prompt
* when a special builtin within a discipline throws an error within
  a virtual subshell

src/cmd/ksh93/sh/nvdisc.c:
- In both assign() which handles .set disciplines and lookup()
  which handles .get disciplines, to stop errors in discipline
  functions from wreaking havoc:
  - Save, reinitialise and restore the lexer state in case the
    discipline is run at parse time. This happens with PS2; I'm not
    currently aware of other contexts but that doesn't mean there
    aren't any or that there won't be any. Plus, I determined by
    experimenting that doing this here seems to be the only way to
    make it work reliably. Thankfully the overhead is low.
  - Check the topfd redirection state and run sh_iorestore() if
    needed. Without this, if a special builtin with a redirection
    throws an error in a discipline function, its redirection(s)
    remain permanent. For example, 'trap --bad-option 2>/dev/null'
    in a PS2.get() discipline would kill standard error, including
    all your prompts.

src/cmd/ksh93/sh/io.c: io_prompt():
- Before getting the value of the PS2 prompt, save the stack state
  and restore it after. This stops a PS2.get discipline function
  from corrupting a command substitution that the user is typing.
  Doing this in assign()/lookup() is ineffective, so do it here.

Fixes: https://github.com/ksh93/ksh/issues/347
2021-12-09 07:31:30 +01:00
Johnothan King
1f5ad85cd4 Minor improvements to the regression tests (#366)
- tests/basic.sh: Fix a regression test that was failing under dtksh by
  allowing the error message to name ksh 'lt-dtksh'. Additionally, fix
  the test's inaccurate failure message (a version string is not what
  the regression test expects).

- test/builtins.sh: Exclude the expr builtin from the unrecognized
  options test because it's incompatible. Additionally, put the
  unrecognized options test inside of a function to ensure that it works
  with the future local builtin (https://github.com/ksh93/ksh/issues/123).

- tests/io.sh: The long seek test may fail to seek past the 2 GiB
  boundary on 32-bit systems, so only allow it to run on 64-bit.
  References:
  3222ac2b59/f/ksh-1.0.0-beta.1-regre-tests.patch
  a5c692e1bd

- tests/substring.sh: Add a regression test for the ${.sh.match}
  crashing bug fixed in commit 1bf1d2f8.
2021-12-09 06:43:39 +01:00
Johnothan King
2b8eaa6609 Fix handling of files without newlines in the head and tail builtins (#365)
The head and tail builtins don't correctly handle files that lack
newlines[*]:
    $ print -n foo > /tmp/bar
    $ /opt/ast/bin/head -1 /tmp/bar  # No output
    $ print -n 'foo\nbar' > /tmp/bar
    $ /opt/ast/bin/tail -1 /tmp/bar
    foo
    bar$
This commit backports the required changes from ksh93v- to handle files
without a newline in the head and tail builtins. (Also note that the
required fix to sfmove was already backported in commit 1bd06207.)

src/lib/libcmd/{head,tail}.c:
- Backport the relevant ksh93v- code for handling files
  without newlines.

src/cmd/ksh93/tests/builtins.sh:
- Add a few regression tests for using 'head -1' and 'tail -1' on a file
  missing and ending newline.

[*]: https://www.illumos.org/issues/4149
2021-12-09 06:43:10 +01:00
Johnothan King
beccb93fd4 Fix various compiler warnings and minor issues (#362)
List of changes:
- Fixed some -Wuninitialized warnings and removed some unused variables.

- Removed the unused extern for B_login (re: d8eba9d1).

- The libcmd builtins and the vmalloc memfatal function now handle
  memory errors with 'ERROR_SYSTEM|ERROR_PANIC' for consistency with how
  ksh itself handles out of memory errors.

- Added usage of UNREACHABLE() where it was missing from error handling.

- Extend many variables from short to int to prevent overflows (most
  variables involve file descriptors).

- Backported a ksh2020 patch to fix unused value Coverity issues
  (https://github.com/att/ast/pull/740).

- Note in src/cmd/ksh93/README that ksh compiles with Cygwin on
  Windows 10 and Windows 11, albeit with many test failures.

- Add comments to detail some sections of code. Extensive list of
  commits related to this change:
  ca2443b5, 7e7f1372, 2db9953a, 7003aba4, 6f50ff64, b1a41311,
  222515bf, a0dcdeea, 0aa9e03f, 61437b27, 352e68da, 88e8fa67,
  bc8b36fa, 6e515f1d, 017d088c, 035a4cb3, 588a1ff7, 6d63b57d,
  a2f13c19, 794d1c86, ab98ec65, 1026006d

- Removed a lot of dead ifdef code.

- edit/emacs.c: Hide an assignment to avoid a -Wunused warning. (See
  also https://github.com/att/ast/pull/753, which removed the assignment
  because ksh2020 removed the !SHOPT_MULTIBYTE code.)

- sh/nvdisc.c: The sh_newof macro cannot return a null pointer because
  it will instead cause the shell to exit if memory cannot be allocated.
  That makes the if statement here a no-op, so remove it.

- sh/xec.c: Fixed one unused variable warning in sh_funscope().

- sh/xec.c: Remove a fallthrough comment added in commit ed478ab7
  because the TFORK code doesn't fall through (GCC also produces no
  -Wimplicit-fallthrough warning here).

- data/builtins.c: The cd and pwd man pages state that these builtins
  default to -P if PATH_RESOLVE is 'physical', which isn't accurate:
     $ /opt/ast/bin/getconf PATH_RESOLVE
     physical
     $ mkdir /tmp/dir; ln -s /tmp/dir /tmp/sym
     $ cd /tmp/sym
     $ pwd
     /tmp/sym
     $ cd -P /tmp/sym
     $ pwd
     /tmp/dir
  The behavior described by these man pages isn't specified in the ksh
  man page or by POSIX, so to avoid changing these builtin's behavior
  the inaccurate PATH_RESOLVE information has been removed.

- Mamfiles: Preserve multi-line errors by quoting the $x variable.
  This fix was backported from 93v-.
  (See also <a7e9cc82>.)

- sh/subshell.c: Remove set but not used sp->errcontext variable.
2021-12-09 06:42:59 +01:00
Martijn Dekker
b3050769ea Fix 'return' emitting signals; allow arbitrary return values
When a global EXIT trap is set, and a ksh-style function exits with
a status > 256 that could have been the result of a signal, then
the shell incorrectly issues that signal to itself. Depending on
the signal, this causes ksh to terminate itself ungracefully:

    $ cat /tmp/exit267
    trap 'echo OK' EXIT  # This trap triggers the crash
    function foo { return 267; }
    foo
    $ bash /tmp/exit267
    OK
    $ ksh-3aee10d7 /tmp/exit267
    OK
    $ ksh /tmp/exit267
    Memory fault(coredump)

On most systems, status 267 corresponds to SIGSEGV. The reported
memory fault is not real; it results from ksh incorrectly killing
itself with that signal.

The problem is caused by two factors:

1. As of 93u+ 2012-08-01, ksh explicitly allows 'return' to use an
   exit status corresponding to a signal (from 257 to end of signal
   range). The rest of the integer range is trunctated to 8 bits.
   This is contrary to both 'man ksh' and 'return --man' which both
   say it's always truncated to 8 bits. Plus, combined with point 2
   below, this new behaviour is nonsensical, as 'return' has no
   business actually generating signals. However, a couple of
   regression tests now depend on this, as may some scripts.

2. When a ksh-style function does not handle a signal, the signal
   is passed down to the parent environment and ksh does this by
   reissuing the signal to its own process after leaving the
   function scope. However, it does this by checking the exit
   status, which is very bad practice as there is no guarantee
   that an exit status corresponding to a signal was in fact
   produced by a signal, particularly after they changed the
   behaviour of 'return' per 1 above.

This commit fixes both issues. It also takes a proper decision on
allowable 'return' exit status arguments. Since 93u+ was released
nearly a decade ago and some scripts may now rely on being able to
pass certain exit statuses out of the 8-bit range, we should not
disallow this now. But neither should we be half-hearted in
allowing only some arbitrary selection of 9-bit statuses; 'return'
values categorically should have nothing to do with signals, so
this is no basis for limiting them. We're now allowing the full
unsigned integer range, which is usually 32 bits. This is like zsh,
and may create some interesting possibilities for scripts.
Just don't forget that $? will still lose all but its 8 least
significant bits when leaving the current (sub)shell environment.

src/cmd/ksh93/sh/xec.c: sh_funscope():
- Fix passing down unhandled signals from interrupted ksh functions
  (jumpval==SH_JMPFUN) to the parent environment. Do not pay any
  attention to the exit status. Instead, use sh.lastsig (a.k.a.
  shp->lastsig). It is set by sh_fault() in fault.c for just this
  purpose and contains the last signal handled for the current
  command. It is reset in sh_exec() before running any new command.
  So if it contains a signal, that is the one that interrupted the
  ksh function, so it's the correct one to pass down. (Further
  evidence: sh_subshell() was already using this in the same way.)

src/cmd/ksh93/bltins/cflow.c: b_return():
- Allow any signed int return value when invoked as and behaving
  like 'return'.
- Add warning if a passed value is out of int range. Set the exit
  status to 128 in that case; int overflow is undefined behaviour
  in C and we want consistent behaviour across platforms. It should
  be safe enough to check if the long and int values are equal.
- Refactor for clarity.

src/cmd/ksh93/sh/subshell.c: sh_subshell():
- If a function returns with a status out of the 8 bit range in a
  virtual subshell, this status could be passed down to the parent
  shell in full. However, if the subshell forks, then the kernel
  will enforce an 8-bit exit status. That is inconsistent. Scripts
  should not be able to tell the difference between forked and
  non-forked subshells, so artificially enforce that limit here.

Other changed files:
- Documentation updates and copy-edits.
- Update an AT&T functions.sh regress test to allow arbitrary
  integer return values for functions.
- Add regression tests based in part on @JohnoKing's reproducers.
- Rework some vaguely related regression tests to fail gracefully.

Thanks to Johnothan King for the report and the testing.

Fixes: https://github.com/ksh93/ksh/issues/364
2021-12-09 06:41:39 +01:00
Martijn Dekker
c842f46bbd tests/leaks.sh: increase tolerances again
Another day, another attempt to quash regress test fails caused by
implausibly small memory "leaks".

Apparently, on later macOS versions, 'ps' got more unreliable, so
increase tolerance from 8 to 12 bytes. Maximum reported leaks.sh
failure was 188 KiB after 16384 iterations (11 bytes/iteration).

And on some Linux versions, /proc sometimes acts weird. The
following is apparently consistent on Arch Linux:
	shcomp-leaks.ksh[177]: memory leak with read -C when using
	<<< (leaked approx 65536 bytes after 4096 iterations)
...which is 16 bytes per iteration, still not large enough to make
a real leak plausible.

Resolves: https://github.com/ksh93/ksh/issues/363
2021-12-06 10:05:36 +01:00
Johnothan King
cd8c48cc5a Add the '-e' flag to the 'cd' builtin (#358)
This change adds the -e flag to the cd builtin, as specified in
<https://www.austingroupbugs.net/view.php?id=253>. The -e flag is
used to verify if the the current working directory after 'cd -P'
successfully changes the directory, and returns with exit status 1
if the cwd couldn't be determined. Additionally, it causes all
other errors to return with exit status >1 (i.e., status 2 unless
ENOMEM occurs) if -e and -P are both active.

src/cmd/ksh93/bltins/cd_pwd.c:
- Add -e option to the cd builtin command. It verifies $PWD by
  using test_inode() to execute the equivalent of [[ . -ef $PWD ]].
- The check for restricted mode has been moved after optget to
  allow 'cd -eP' to return with exit status 2 when in restricted
  mode. To avoid changing the previous behavior of cd when -e isn't
  passed, extra checks have been added to prevent cd from printing
  usage information in restricted mode.

src/cmd/ksh93/tests/builtins.sh:
- Add regression tests for the exit status when using the cd -P
  flag with and without -e.

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/sh.1:
- Document the addition of -e to the cd builtin.
2021-12-06 06:58:11 +01:00
Martijn Dekker
6faf43791b Disable vmalloc by default; fix build on Cygwin
Vmalloc is incompatible with Cygwin, but the code to disable it on
Cygwin did not work properly, somehow causing the build to freeze
at a seemingly unrelated point (i.e., when iffe feature tests
attempt to write to sfstdout).

Vmalloc has wasted my time for the last time, so now it's getting
disabled by default even on development builds and you'll have to
pass -D_AST_vmalloc in CCFLAGS to enable it for testing purposes.

This commit has a few other build tweaks as well.

src/lib/libast/features/vmalloc:
- tst map_malloc: Remove no-op #if __CYGWIN__ block which was in
  the #else clause of another #if __CYGWIN__ block.
- Output block ('cat{'):
  - Instead of disabling vmalloc for certain systems, disable it
    unless _AST_vmalloc is defined.
  - To disable it, set _AST_std_malloc as well as _std_malloc, just
    to be sure.

src/lib/libast/vmalloc/malloc.c:
- Remove ineffective Cygwin special-casing.

src/lib/libcmd/vmstate.c:
- This is only useful for vmalloc, so do not pointlessly compile it
  if vmalloc is disabled.

src/lib/libast/man/vmalloc.3:
- Add deprecation notice.

Resolves: https://github.com/ksh93/ksh/issues/360
2021-12-05 19:28:45 +01:00
Johnothan King
370440473e Fix KEYBD trap crash when inputting a command substitution (#355)
This change fixes a crash that can occur after setting a KEYBD trap
then inputting a multi-line command substitution. The crash is
similar to issue #347, but it's easier to reproduce since it
doesn't require you to setup a kshrc file. Reproducer for the
crash:

  $ ENV=/./dev/null ksh
  $ trap : KEYBD
  $ : $(
  > true)
  Memory fault(coredump)

The bugfix was backported (with considerable changes) from ksh93v-
2013-10-08. The crash was first reported on the old mailing list:
https://www.mail-archive.com/ast-users@lists.research.att.com/msg00313.html

src/cmd/ksh93/{include/shlex.h,sh/lex.c}:
- To fix this properly, we need sizeof(Lex_t) to work as expected
  in edit.c, but that is thwarted by the _SHLEX_PRIVATE macro in
  lex.c which shlex.h uses to add private structs to the Lex_t type
  in lex.c only. So get rid of that _SHLEX_PRIVATE macro and make
  those members part of the centrally defined struct, renaming them
  to make it clear they're considered private to lex.c.

src/cmd/ksh93/edit/edit.c:
- Now that we can get its size, save and restore the shell lexing
  context when a KEYBD trap is present.

src/cmd/ksh93/tests/pty.sh:
- Add a regression test for the KEYBD trap crash.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2021-11-30 04:27:31 +01:00
Johnothan King
a0eeb14787 Stop the time keyword overriding errexit (#351)
This bug was first reported in <https://www.illumos.org/issues/7694>.
The time keyword currently overrides the errexit shell option,
allowing failing scripts to continue after an error:

  $ cat 1.sh
  #!/bin/sh
  time false   # This should cause the script to exit
  echo FAILURE
  true
  $ ksh -o errexit 1.sh

  real    0m0.00s
  user    0m0.00s
  sys     0m0.00s
  FAILURE

src/cmd/ksh93/sh/xec.c:
- When the time keyword runs a command, pass the errexit state flag
  to the sh_exec call. This state flag is required for ksh to exit
  when a command fails while the errexit option is on.

src/cmd/ksh93/tests/basic.sh:
- Add a regression test based on the reproducer.
2021-11-29 20:12:15 +01:00
Martijn Dekker
f508660ddf Revert "Fix defining types conditionally and/or in subshells (re: 8ced1daa)"
This reverts commit 2b9cbbbc8e.

This is not ready for prime time. Crashses when running a $PS2
discipline function. This needs fixing and more testing in
development before making it into the 1.0 branch. In the meantime,
that terrible problem with types is back, sorry about that.
2021-11-29 20:08:53 +01:00
Martijn Dekker
2b9cbbbc8e Fix defining types conditionally and/or in subshells (re: 8ced1daa)
This commit mitigates the effects of the hack explained in the
referenced commit so that dummy built-in command nodes added by the
parser for declaration/assignment purposes do not leak out into the
execution level, except in a relatively harmless corner case.

Something like

	if false; then
		typeset -T Foo_t=(integer -i bar)
	fi

will no longer leave a broken dummy Foo_t declaration command. The
same applies to declaration commands created with enum.

The corner case remaining is:

$ ksh -c 'false && enum E_t=(a b c); E_t -a x=(b b a c)'
ksh: E_t: not found

Since the 'enum' command is not executed, this should have thrown
a syntax error on the 'E_t -a' declaration:
ksh: syntax error at line 1: `(' unexpected

This is because the -c script is parsed entirely before being
executed, so E_t is recognised as a declaration built-in at parse
time. However, the 'not found' error shows that it was successfully
eliminated at execution time, so the inconsistent state will no
longer persist.

This fix now allows another fix to be effective as well: since
built-ins do not know about virtual subshells, fork a virtual
subshell into a real subshell before adding any built-ins.

src/cmd/ksh93/sh/parse.c:

- Add a pair of functions, dcl_hactivate() and dcl_dehacktivate(),
  that (de)activate an internal declaration built-ins tree into
  which check_typedef() can pre-add dummy type declaration command
  nodes. A viewpath from the main built-ins tree to this internal
  tree is added, unifying the two for search purposes and causing
  new nodes to be added to the internal tree. When parsing is done,
  we close that viewpath. This hides those pre-added nodes at
  execution time. Since the parser is sometimes called recursively
  (e.g. for command substitutions), keep track of this and only
  activate and deactivate at the first level.

- We also need to catch errors. This is done by setting libast's
  error_info.exit variable to a dcl_exit() function that tidies up
  and then passes control to the original (usually sh_exit()).

- sh_cmd(): This is the most central function in the parser. You'd
  think it was sh_parse(), but $(modern)-form command substitutions
  use sh_dolparen() instead. Both call sh_cmd(). So let's simply
  add a dcl_hacktivate() call at the beginning and a
  dcl_deactivate() call at the end.

- assign(): This function calls path_search(), which among many
  other things executes an FATH search, which may execute arbitrary
  code at parse time (!!!). So, regardless of recursion level,
  forcibly dehacktivate() to avoid those ugly parser side effects
  returning in that context.

src/cmd/ksh93/bltins/enum.c: b_enum():

- Fork a virtual subshell before adding a built-in.

src/cmd/ksh93/sh/xec.c: sh_exec():

- Fork a virtual subshell when detecting typeset's -T option.

Improves fix to https://github.com/ksh93/ksh/issues/256
2021-11-29 09:02:07 +01:00
Martijn Dekker
43cd8da2fe Fix 'command' prefix in enum type def pre-parsing (re: 1dc18346)
Symptom:

$ ksh -c 'command enum -i P_t=(a b); P_t -A v=([f]=b); typeset -p v'
ksh: syntax error at line 1: `(' unexpected

Expected: no syntax error, and output of 'P_t -A v=([f]=b)'.

src/cmd/ksh93/sh/parse.c: check_typedef():
- For enum, skip over any possible 'command' prefixes before
  pre-parsing options with optget (or, technically, skip anything
  else that might come before 'enum', though I don't think anything
  else is possible).
- The sh_addbuiltin() call at the end to pre-add the builtin
  obtained the node pointer to the built-in and the node flags from
  the parser tree. This did not work if a 'command' prefix was
  present. However, we don't actually need this. For parsing
  purposes, the BLT_DCL flag for a declaration built-in is
  sufficient; this is what gets the parser to accept
  assignment-arguments including parentheses. So just apply that.
  In addition, let's point it to an actual dummy built-in, 'true'
  (SYSTRUE), so that if a user does run something like 'if false;
  then enum Foo_t=(...); fi', the leaked Foo_t dummy at least won't
  do anything (not even crash).
2021-11-28 21:16:19 +01:00
Martijn Dekker
c9ca0ff531 typeset equivalents: use 'typeset' in error messages (re: 1fbbeaa1)
When giving an invalid or incompatible option to a typeset option
equivalent command (former default alias) such as 'compound' or
'integer', the resulting usage messages are incorrect. Example:

$ ksh -c 'compound -T foo=(typeset -a bar[1]=23)'
ksh: compound: -T cannot be used with other options
Usage: compound [-bflmnprstuxACHS] [-a[[type]]] [-i[base]] [-E[n]]
                [-F[n]] [-L[n]] [-M[mapping]] [-R[n]] [-X[n]]
                [-h string] [-T[tname]] [-Z[n]] [name[=value]...]
   Or: compound -f [name...]
   Or: compound -m [name=name...]
   Or: compound -n [name=name...]
   Or: compound -T [tname[=(type definition)]...]
 Help: compound [ --help | --man ] 2>&1

The error message is wrong (there were no other options) and some
of the listed usages are invalid, like 'compound -f'.

Typeset option equivalent commands should just use 'typeset' in all
their error messages to avoid confusion. This is done by setting
error_info.id to the name of the typeset builtin.
2021-11-28 21:16:17 +01:00
Johnothan King
84ded2d0c4 Backport the ksh93v- rm builtin to fix 'rm -d' (#348)
The -d flag implemented in the rm builtin is completely broken. No
matter what you do it refuses to remove directories, even if -r is
also passed. Reproducer:

  $ mkdir /tmp/empty
  $ PATH=/opt/ast/bin rm -d /tmp/empty
  rm: /tmp/empty: directory
  $ PATH=/opt/ast/bin rm -dr /tmp/empty
  rm: /tmp/empty: directory not removed [Is a directory]

Additionally, the description of 'rm -d' in the man page contradicts
how it's specified in <https://www.austingroupbugs.net/view.php?id=802>.

The ksh93v- rm builtin fixed nearly all of these issues, so I've
backported it to 93u+m and applied one additional fix for 'rm -rd'.

src/lib/libcmd/rm.c:
- Backported the fixes from the ksh93v- rm builtin's -d flag when
  used on empty directories.
- Backported the man page update for rm(1) from ksh93v-.
- The ksh93v- rm builtin had one additional bug that caused the -r
  option to fail when combined with -d. This was fixed by
  overriding -d if -r is also passed.

src/cmd/ksh93/tests/builtins.sh:
- Add regression tests for the rm builtin's -d option.
2021-11-25 03:52:05 +01:00
Martijn Dekker
214308f81e '.': disable ksh function lookup in POSIX mode
POSIXly, '.' loads only files, not functions.

This only applies to '.', not 'source' (which is not in POSIX).

src/cmd/ksh93/bltins/misc.c: b_source():
- For ksh function lookup, add an additional check that we're not
  in POSIX mode and running the '.' (SYSDOT) builtin.
2021-11-24 09:12:39 +01:00
Martijn Dekker
c0334e32a1 [1.0 release prep] Remove tilde expansion discipline
Defining a .sh.tilde.get or .sh.tilde.set discipline function to
extend tilde expansion works well as long as the discipline
function doesn't get interrupted (e.g. with Crtl+C) or produce an
error message. Either of those will cause the shell to become
unstable and crash.

This feature is now removed from the 1.0 branch as it is not ready
for prime time. It can return to a release branch if/when we manage
to fix it on the master branch.

Related: https://github.com/ksh93/ksh/issues/346
2021-11-24 07:46:58 +01:00
Martijn Dekker
e3d91ffa90 nv_associative(): finally use proper check for enum (re: b98e32fc)
As of the previous commit, I finally know how to properly check for
a variable of a type created by 'enum'. We need to check for both
the NV_UINT16 attribute and the ENUM_disc discipline.

Also:
- regression test tweaks
- add missing tests for previous commit (f600a5ea)
2021-11-24 02:06:08 +01:00
Martijn Dekker
10ef74e1a2 shtests: unignore SIGCONT
For some reason, Void Linux (with musl libc) sets SIGCONT to
ignored on the Linux console, causing the 'sleep -s' test in
builtins.sh to fail spuriously as it relies on SIGCONT to work.

src/cmd/ksh93/tests/shtests:
- Reset SIGCONT using the unadvertised 'trap + SIGCONT' feature.

Resolves: https://github.com/ksh93/ksh/issues/301
2021-11-22 16:55:51 +01:00
Martijn Dekker
8ced1daadf Fix enum type definition pre-parsing for shcomp and dot/source
Parser limitations prevent shcomp or source from handling enum
types correctly:

    $ cat /tmp/colors.sh
    enum Color_t=(red green blue orange yellow)
    Color_t -A Colors=([foo]=red)

    $ shcomp /tmp/colors.sh > /dev/null
    /tmp/colors.sh: syntax error at line 2: `(' unexpected
    $ source /tmp/colors.sh
    /bin/ksh: source: syntax error: `(' unexpected

Yet, for types created using 'typeset -T', this works. This is done
via a check_typedef() function that preliminarily adds the special
declaration builtin at parse time, with details to be filled in
later at execution time.

This hack will produce ugly undefined behaviour if the definition
command creating that type built-in is then not actually run at
execution time before the type built-in is accessed.

But the hack is necessary because we're dealing with a fundamental
design flaw in the ksh language. Dynamically addable built-ins that
change the syntactic parsing of the shell language on the fly are
an absurdity that violates the separation between parsing and
execution, which muddies the waters and creates the need for some
kind of ugly hack to keep things like shcomp more or less working.

This commit extends that hack to support enum.

src/cmd/ksh93/sh/parse.c:
- check_typedef():
  - Add 'intypeset' parameter that should be set to 1 for typeset
    and friends, 2 for enum.
  - When processing enum arguments, use AST getopt(3) to skip over
    enum's options to find the name of the type to be defined.
    (getopt failed if we were running a -c script; deal with this
    by zeroing opt_info.index first.)
- item(): Update check_typedef() call, passing lexp->intypeset.
- simple(): Set lexp->intypeset to 2 when processing enum.

The rest of the changes are all to support the above and should be
fairly obvious, except:

src/cmd/ksh93/bltins/enum.c:
- enuminfo(): Return on null pointer, avoiding a crash upon
  executing 'Type_t --man' if Type_t has not been fully defined due
  to the definition being pre-added at parse time but not executed.
  It's all still wrong, but a crash is worse.

Resolves: https://github.com/ksh93/ksh/issues/256
2021-11-21 17:43:55 +01:00
Johnothan King
e554a07c56 typeset -T shouldn't list types created with enum (#340)
Listing types with 'typeset -T' will list not only types created with
typeset, but also types created with enum. However, the types created
by enum are not displayed correctly in the resulting output:

$ enum Foo_t=(foo bar)
$ typeset -T
typeset -T Foo_t
typeset -T Foo_t=fo)

The fix for this bug was backported from ksh93v- 2013-10-08.

src/cmd/ksh93/sh/nvtype.c:
- sh_outtype(): Skip over enums when listing types with 'typeset -T'.
2021-11-20 09:48:48 +01:00
Martijn Dekker
98ea0c2dbb tests/signal.sh: fix AT&T's err_exit bogosity (re: 712261c8) 2021-11-20 03:31:10 +01:00
Martijn Dekker
6829fc9a29 tests/leaks.sh: tweak Linux tolerance again (re: 31fe1c28)
The referenced commit did not fix the symptoms on the 1.0 branch
(no vmalloc) on the GitHub CI runners.

The failures are intermittent and are not reproduced with vmalloc
or on other operating systems.

Though the failures occur on a different test each time, the total
amount of "leaked" bytes is always 36864, e.g.:

    leaks.sh[388]: run command with preceding PATH assignment in
    main shell (leaked approx 36864 bytes after 4096 iterations)

36864/4096 equals exactly 9. An odd number, literally and
figuratively, but I suppose that's the tolerance Linux needs.

src/cmd/ksh93/tests/leaks.sh
- Increase tolerance of bytes per iteration from 8 to 9.
2021-11-19 20:21:25 +01:00
Johnothan King
396b388e1f Fix a few issues with $RANDOM seeding in subshells (#339)
This commit fixes an issue I found in the subshell $RANDOM
reseeding code.

The main issue is a performance regression in the shbench fibonacci
benchmark, introduced in commit af6a32d1. Performance dropped in
this benchmark because $RANDOM is always reseeded and restored,
even when it's never used in a subshell. Performance results from
before and after this performance fix (results are on Linux with
CC=gcc and CCFLAGS='-O2 -D_std_malloc'):

  $ ./shbench -b bench/fibonacci.ksh -l 100 ./ksh-0f06a2e ./ksh-af6a32d ./ksh-f31e368 ./ksh-randfix

  benchmarking ./ksh-0f06a2e, ./ksh-af6a32d, ./ksh-f31e368, ./ksh-randfix ...
  *** fibonacci.ksh ***
  # ./ksh-0f06a2e  # Recent version of ksh93u+m
  # ./ksh-af6a32d  # Commit that introduced the regression
  # ./ksh-f31e368  # Commit without the regression
  # ./ksh-randfix  # Ksh93u+m with this patch applied

  -------------------------------------------------------------------------------------------------
  name           ./ksh-0f06a2e        ./ksh-af6a32d        ./ksh-f31e368        ./ksh-randfix
  -------------------------------------------------------------------------------------------------
  fibonacci.ksh  0.481 [0.459-0.515]  0.472 [0.455-0.504]  0.396 [0.380-0.442]  0.407 [0.385-0.439]
  -------------------------------------------------------------------------------------------------

src/cmd/ksh93/include/variables.h,
src/cmd/ksh93/sh/{init,subshell}.c:
- Rather than reseed $RANDOM every time a subshell is created, add
  a sh_save_rand_seed() function that does this only when the
  $RANDOM variable is used in a subshell. This function is called
  by the $RANDOM discipline functions nget_rand() and put_rand().
  As a minor optimization, sh_save_rand_seed doesn't reseed if it's
  called from put_rand().
- Because $RANDOM may have a seed of zero (i.e., RANDOM=0),
  sp->rand_seed isn't enough to tell if $RANDOM has been reseeded.
  Add sp->rand_state for this purpose.
- sh_subshell(): Only restore the former $RANDOM seed and state if
  it is necessary to prevent a subshell leak.

src/cmd/ksh93/tests/variables.sh:
- Add two regression tests for bugs I ran into while making this
  patch.
2021-11-19 08:18:44 +01:00
Martijn Dekker
bd9752e43c Backport 'printf -v' from ksh 93v-
'printf' on bash and zsh has a popular -v option that allows
assigning formatted output directly to variables without using a
command substitution. This is much faster and avoids snags with
stripping final linefeeds. AT&T had replicated this feature in the
abandoned 93v- beta version. This backports it with a few tweaks
and one user-visible improvement.

The 93v- version prohibited specifying a variable name with an
array subscript, such as printf -v var\[3\] foo. This works fine on
bash and zsh, so I see no reason why this should not work on ksh,
as nv_putval() deals with array subscripts just fine.

src/cmd/ksh93/bltins/print.c: b_print():
- While processing the -v option when called as printf, get a
  pointer to the variable, creating it if necessary. Pass only the
  NV_VARNAME flag to enforce a valid variable name, and not (as
  93v- does) the NV_NOARRAY flag to prohibit array subscripts.
- If a variable was given, set the output file to an internal
  string buffer and jump straight to processing the format.
- After processing the format, assign the contents to the string
  buffer to the variable.

src/cmd/ksh93/data/builtins.c:
- Document the new option, adding a warning that unquoted square
  brackets may trigger pathname expansion.
2021-11-19 03:54:33 +01:00
Martijn Dekker
1e96013367 tests/pty.sh: fix two failures due to typeahead on Debian Bullseye
As the (original AT&T) comment at the top says, "the trickiest part
of the tests is avoiding typeahead in the pty dialogue".

Two tests failed to [p]eek at the prompt before they started
'typing'. This causes unpredictable results. On Debian Bullseye
this triggers typeahead, which produces unwanted echo to the
terminal, killing the tests.

src/cmd/ksh93/tests/pty.sh:
- Add missing 'p' commands for the first prompt to the tests
  'nobackslashctrl in emacs' and 'emacs backslash escaping'.

Resolves: https://github.com/ksh93/ksh/issues/332
2021-11-17 23:05:05 +01:00
Martijn Dekker
c734568b02 arithmetic: Fix the octal leading zero mess (#337)
In C/POSIX arithmetic, a leading 0 denotes an octal number, e.g.
010 == 8. But this is not a desirable feature as it can cause
problems with processing things like dates with a leading zero.
In ksh, you should use 8#10 instead ("10" with base 8).

It would be tolerable if ksh at least implemented it consistently.
But AT&T made an incredible mess of it. For anyone who is not
intimately familiar with ksh internals, it is inscrutable where
arithmetic evaluation special-cases a leading 0 and where it
doesn't. Here are just some of the surprises/inconsistencies:

1. The AT&T maintainers tried to honour a leading 0 inside of
   ((...)) and $((...)) and not for arithmetic contexts outside it,
   but even that inconsistency was never quite consistent.

2. Since 2010-12-12, $((x)) and $(($x)) are different:
      $ /bin/ksh -c 'x=010; echo $((x)) $(($x))'
      10 8
   That's a clear violation of both POSIX and the principle of
   least astonishment. $((x)) and $(($x)) should be the same in
   all cases.

3. 'let' with '-o letoctal' acts in this bizarre way:
      $ set -o letoctal; x=010; let "y1=$x" "y2=010"; echo $y1 $y2
      10 8
   That's right, 'let y=$x' is different from 'let y=010' even
   when $x contains the same string value '010'! This violates
   established shell grammar on the most basic level.

This commit introduces consistency. By default, ksh now acts like
mksh and zsh: the octal leading zero is disabled in all arithmetic
contexts equally. In POSIX mode, it is enabled equally.

The one exception is the 'let' built-in, where this can still be
controlled independently with the letoctal option as before (but,
because letoctal is synched with posix when switching that on/off,
it's consistent by default).

We're also removing the hackery that causes variable expansions for
the 'let' builtin to be quietly altered, so that 'x=010; let y=$x'
now does the same as 'let y=010' even with letoctal on.

Various files:
- Get rid of now-redundant sh.inarith (shp->inarith) flag, as we're
  no longer distinguishing between being inside or outside ((...)).

src/cmd/ksh93/sh/arith.c:
- arith(): Let disabling POSIX octal constants by skipping leading
  zeros depend on either the letoctal option being off (if we're
  running the "let" built-in") or the posix option being off.
- sh_strnum(): Preset a base of 10 for strtonll(3) depending on the
  posix or letoctal option being off, not on the sh.inarith flag.

src/cmd/ksh93/include/argnod.h,
src/cmd/ksh93/sh/args.c,
src/cmd/ksh93/sh/macro.c:
- Remove astonishing hackery that violated shell grammar for 'let'.

src/cmd/ksh93/sh/name.c (nv_getnum()),
src/cmd/ksh93/sh/nvdisc.c (nv_getn()):
- Remove loops for skipping leading zeroes that included a broken
  check for justify/zerofill attributes, thereby fixing this bug:
	$ typeset -Z x=0x15; echo $((x))
	-ksh: x15: parameter not set
  Even if this code wasn't redundant before, it is now: sh_arith()
  is called immediately after the removed code and it ignores
  leading zeroes via sh_strnum() and strtonll(3).

Resolves: https://github.com/ksh93/ksh/issues/334
2021-11-17 04:28:08 +01:00
Johnothan King
b40155fae8 Fix file descriptor leaks in the hist builtin (#336)
This commit fixes two file descriptor leaks in the hist built-in.
The bugfix for the first file descriptor leak was backported from
ksh2020. See:
https://github.com/att/ast/issues/872
73bd61b5

Reproducer:
  $ echo no
  $ hist -s no=yes

The second file descriptor leak occurs after a substitution error
in the hist built-in (this leak wasn't fixed in ksh2020).
Reproducer:
  $ echo no
  $ ls /proc/$$/fd
  $ hist -s no=yes
  $ hist -s no=yes
  $ ls /proc/$$/fd

src/cmd/ksh93/bltins/hist.c:
- Close leftover file descriptors when an error occurs and after
  'hist -s' runs a command.

src/cmd/ksh93/tests/builtins.sh:
- Add two regression tests for both of the file descriptor leaks.
2021-11-16 23:34:46 +01:00
Martijn Dekker
7ea95b7df3 tests: fix intermittent $RANDOM reseeding fails (re: af6a32d1)
When testing whether subshell $RANDOM reseeding worked, checking
for non-identical numbers is not sufficient. There is no check for
randomly occurring duplicate numbers, nor can there be, because
subshells cannot (or, in the case of virtual subshells, should not)
influence each other or the parent shell.

src/cmd/ksh93/tests/variables.sh:
- Try up to three times, tolerating identical numbers twice.
2021-11-15 21:16:39 +01:00
Martijn Dekker
56c2e13e92 arith: Fix variables 'nan' and 'inf' in arithmetic for POSIX mode
The --posix compliance option now disables the case-insensitive
special floating point constants Inf and NaN so that all case
variants of $((inf)) and $((nan)) refer to the variables by those
names as the standard requires. (BUG_ARITHNAN)

src/cmd/ksh93/sh/arith.c: arith():
- Only do case-insensitive checks for "Inf" and "NaN" if the POSIX
  option is off.
2021-11-15 21:16:23 +01:00
Martijn Dekker
31fe1c2890 tests/leaks.sh: increase iterations on Linux
There are one or two leaks that show up intermittently on the
Github runners for the 1.0 branch (which is compiled as a release,
i.e. no vmalloc). If they're intermittent, they must be false
positives due to malloc artefacts. Let's double the number of
iterations for the /proc/$$/stat method and see what happens.
2021-11-15 03:00:40 +01:00
Martijn Dekker
d9f1fdaa41 Fix [ \( str -a str \) ], [ \( str -o str \) ]
Symptoms:

$ test \( string1 -a string2 \)
/usr/local/bin/ksh: test: argument expected
$ test \( string1 -o string2 \)
/usr/local/bin/ksh: test: argument expected

The parentheses should be irrelevant and this should be a test for
the non-emptiness of string1 and/or string2.

src/cmd/ksh93/bltins/test.c:

- b_test(): There is a block where the case of 'test' with five or
  less arguments, the first and last one being parentheses, is
  special-cased. The parentheses are removed as a workaround: argv
  is increased to skip the opening parenthesis and argc is
  decreased by 2. However, there is no corresponding increase of
  tdata.av which is a copy of this function's argv. This renders
  the workaround ineffective. The fix is to add that increase.

- e3(): Do not handle '!' as a negator if not followed by an
  argument. This allows a right-hand expression that is equal to
  '!' (i.e. a test for the non-emptiness of the string '!').
2021-11-15 02:44:56 +01:00
Martijn Dekker
802136a6ad Fix goof in regression test (re: c8147306) 2021-11-14 12:30:49 +01:00