mirror of git://git.code.sf.net/p/cdesktopenv/code synced 2025-03-09 15:50:02 +00:00
Commit graph

67 commits

Author SHA1 Message Date
Martijn Dekker
064baa372e More misc. tweaks and cleanups
Notable changes:

.github/workflows/ci.yml:
- Run 'bin/package test' on the github runner so we test iffe too.

src/cmd/ksh93/sh/subshell.c:
- sh_assignok was usually called like 'np = sh_assignok(np,0)'. But
  the function never changes np, it just returns the np value
  passed to it, so the assignment is pointless and that function
  can be changed to a void.

src/cmd/ksh93/sh/fault.c: sh_fault():
- Remove check for sh.subshell after sh_isstate(SH_INTERACTIVE). As
  of 48ba6964, it is never set in subshells.
2022-07-14 17:34:08 +02:00
Martijn Dekker
7c4418ccdc Multibyte character handling overhaul; allow global disable
The SHOPT_MULTIBYTE compile-time option did not make much sense as
disabling it only disabled multibyte support for ksh/libshell, not
libast or libcmd built-in commands. This commit allows disabling
multibyte support for the entire codebase by defining the macro
AST_NOMULTIBYTE (e.g. via CCFLAGS). This slightly speeds up the
code and makes an optimised binary about 5% smaller.

src/lib/libast/include/ast.h:
- Add non-multibyte fallback versions of the multibyte macros that
  are used if AST_NOMULTIBYTE is defined. This should cause most
  multibyte handling to be automatically optimised out everywhere.
- Reformat the multibyte macros for legibility.
- Simplify mbchar() and mbsize() macros by defining them in
  terms of mbnchar() and mbnsize(), eliminating code duplication.
- Correct non-multibyte fallback of mbwidth(). For consistent
  behaviour, control characters and out-of-range values should
  return -1 as they do for UTF-8. The fallback is now the same as
  default_wcwidth() in src/lib/libast/comp/setlocale.c.
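
To illustrate the corrected mbwidth() fallback described above, here is
a minimal sketch of a single-byte width function. The name
fallback_width is hypothetical and this is not the actual ast.h macro;
it only assumes the stated behaviour: -1 for control characters and
out-of-range values, and (as an assumption) 1 for everything else.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical stand-in for a non-multibyte mbwidth() fallback. */
int fallback_width(int c)
{
	/* Range check first: iscntrl() is only defined for unsigned char values. */
	if (c < 0 || c > 255)
		return -1;
	return iscntrl(c) ? -1 : 1;
}

/* Usage example: printable, control and out-of-range inputs. */
void fallback_width_demo(void)
{
	assert(fallback_width('A') == 1);
	assert(fallback_width('\t') == -1);
	assert(fallback_width(300) == -1);
}
```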

src/lib/libast/comp/setlocale.c:
- If AST_NOMULTIBYTE is defined, do not compile in the debug and
  UTF-8 locale conversion functions, including several large
  conversion tables. Define their fallback macros as 0 as these are
  used as function pointers.

src/cmd/ksh93/SHOPT.sh,
src/cmd/ksh93/Mamfile:
- Change the SHOPT_MULTIBYTE default to empty, indicating "probe".
- Synchronise SHOPT_MULTIBYTE with !AST_NOMULTIBYTE by default.

src/cmd/ksh93/include/defs.h:
- When SHOPT_MULTIBYTE is zero but AST_NOMULTIBYTE is not non-zero,
  then enable AST_NOMULTIBYTE here to use the ast.h non-multibyte
  fallbacks for ksh. When this is done, the effect is that
  multibyte is optimized out for ksh only, as before.
- Remove previous fallback for disabling multibyte (re: c2cb0eae).

src/cmd/ksh93/include/lexstates.h,
src/cmd/ksh93/sh/lex.c:
- Define SETLEN() macro to assign to LEN (i.e. _Fcin.fclen) for
  multibyte only and do not assign to it directly. With no
  SHOPT_MULTIBYTE, define that macro as empty. This allows removing
  multiple '#if SHOPT_MULTIBYTE' directives from lex.c, as that
  code will all be optimised out automatically if it's disabled.
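
The pattern behind SETLEN() can be sketched as follows. This is only an
illustration: DEMO_MULTIBYTE, demo_in and demo_record() are
hypothetical names, not the definitions in lexstates.h.

```c
#include <assert.h>
#include <stddef.h>

#ifndef DEMO_MULTIBYTE
#define DEMO_MULTIBYTE 0	/* hypothetical switch, off by default here */
#endif

struct demo_input { size_t fclen; };
struct demo_input demo_in;

#if DEMO_MULTIBYTE
#define SETLEN(n)	(demo_in.fclen = (n))
#else
#define SETLEN(n)	/* expands to nothing */
#endif

/* The call site needs no #if: with the option off, only ';' remains. */
size_t demo_record(void)
{
	SETLEN(3);
	return demo_in.fclen;
}

void demo_record_demo(void)
{
	assert(demo_record() == (DEMO_MULTIBYTE ? 3 : 0));
}
```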

src/cmd/ksh93/include/national.h,
src/cmd/ksh93/sh/string.c:
- Fix flagrantly incorrect non-multibyte fallback for sh_strchr().
  The latter returns an integer offset (-1 if not found), whereas
  strchr(3) returns a char pointer (NULL if not found). Incorporate
  the fallback into the function for correct handling instead of
  falling back to strchr(3) directly.
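
The mismatch is easy to see in a small sketch. The helper below is
hypothetical (it is not sh_strchr() itself, which also deals with
multibyte characters); it only shows how a strchr(3) pointer result has
to be converted to get the "offset, or -1 if not found" convention.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: offset of c in s, or -1 if c does not occur. */
int offset_of_char(const char *s, int c)
{
	const char *p = strchr(s, c);	/* pointer, or NULL if not found */
	return p ? (int)(p - s) : -1;
}

void offset_of_char_demo(void)
{
	assert(offset_of_char("hello", 'l') == 2);
	assert(offset_of_char("hello", 'z') == -1);
}
```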

src/cmd/ksh93/sh/macro.c:
- lastchar() optimisation: avoid function call if SHOPT_MULTIBYTE
  is enabled but we're not actually in a multibyte locale.

src/cmd/ksh93/sh/name.c:
- Use ja_size() even with SHOPT_MULTIBYTE disabled (re: 2182ecfa).
  Though no regression tests failed, the non-multibyte fallback for
  typeset -L/-R/-Z length calculation was probably not quite
  correct as ja_size() does more. The ast.h change to mbwidth()
  ensures correct behaviour for non-multibyte locales.

src/cmd/ksh93/tests/shtests:
- Since its value in SHOPT.sh is now empty by default, add a quick
  feature test (for the length of the UTF-8 character 'é') to check
  if SHOPT_MULTIBYTE needs to be enabled for the regression tests.
2022-07-09 00:32:27 +02:00
Martijn Dekker
4d50b69cbd [v1.0] remove alarm builtin
It's undocumented, it's broken and can crash the shell, and it's
unclear if it can ever be fixed. So with a 1.0 release (hopefully)
not very far off, it's time to remove it from the 1.0 branch.

Related: https://github.com/ksh93/ksh/issues/422
2022-06-21 05:45:53 +01:00
Martijn Dekker
148a8a3f46 Another build system overhaul (re: 35672208, 580ff616, 6cc2f6a0)
So far we've been handling AST release build and git commit flags
and ksh SHOPT_* compile time options in the generic package build
script. That was a hack that was necessary before I had sufficient
understanding of the build system. Some of it did not work very
well, e.g. the correct git commit did not show up in ${.sh.version}
when compiling from a git repo.

As of this commit, this is properly included in the mamake
dependency tree by handling it from the libast and ksh93 Mamfiles,
guaranteeing they are properly up to date.

For a release build, the _AST_ksh_release macro is renamed to
_AST_release, because some aspects of libast also use this.

This commit also adds my first attempt at documenting the (very
simple, six-command) mamake language as it is currently implemented
-- which is significantly different from Glenn Fowler's original
paper. This is mostly based on reading the mamake.c source code.

src/cmd/INIT/README-mamake.md:
- Added.

bin/package, src/cmd/INIT/package.sh:
- Delete the hack.

**/Mamfile:
- Remove KSH_RELFLAGS and KSH_SHOPTFLAGS, which supported the hack.
- Delete 'meta' commands. They were no-ops; mamake.c ignores them.
  They also did not add any informative value.

src/lib/libast/Mamfile:
- Add a 'virtual' target that obtains the current git commit,
  examines the git branch, and decides whether to auto-set an
  _AST_git_commit and/or _AST_release #define to a new
  releaseflags.h header file. This is overwritten on each run.
- Add code to the install target that copies limit.h to
  include/ast, but only if it doesn't exist or the content of the
  original changed. This allows '#include <releaseflags.h>' from
  any program using libast while avoiding needless recompiles.
- When there are uncommitted changes, add /MOD (modified) to the
  commit hash instead of not defining it at all.

src/cmd/ksh93/**:
- Mamfile: Add a shopt.h target that reads SHOPT.sh and converts it
  into a new shopt.h header file in the object code directory. The
  shopt.h header only contains SHOPT_* directives that have a value
  in SHOPT.sh (not the empty/probe ones). They also do not redefine
  the macros if they already exist, so overriding with something
  like CCFLAGS+=' -DSHOPT_FOO=1' remains possible.
- **.c: Every c file now #includes "shopt.h" first. So SHOPT_*
  macros are no longer passed via environment/MAM variables.
- SHOPT.sh: The AUDITFILE and CMDLIB_DIR macros no longer need an
  extra backslash-escape for the double quotes in their values.
  (The old way required this because mamake inserts MAM variables
  directly into shell scripts as literals without quoting.  :-/ )
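
A generated shopt.h entry that does not redefine an existing macro
could look like the sketch below. The exact layout of the real
generated header is an assumption here; SHOPT_FOO is the example name
used above and shopt_foo_value() is a hypothetical helper.

```c
#include <assert.h>

/* Sketch of one generated entry: only define if not already set. */
#ifndef SHOPT_FOO
#define SHOPT_FOO 0	/* value assumed to come from SHOPT.sh */
#endif

/* Compiling with -DSHOPT_FOO=1 makes this return 1 instead. */
int shopt_foo_value(void)
{
	return SHOPT_FOO;
}

void shopt_foo_demo(void)
{
	assert(shopt_foo_value() == SHOPT_FOO);
}
```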

src/cmd/INIT/mamake.c:
- Import the two minor changes between 93u+ and 93v-: bind()
  is renamed to bindfile() and there is a tweak to detecting an
  "improper done statement".
- Allow arbitrary whitespace (isspace()) everywhere, instead of
  spaces only. This obsoletes my earlier indentation workaround
  from 6cc2f6a0; turns out mamake always supported indentation, but
  with spaces only.
- Do not skip line numbers at the beginning of each line. This
  undocumented feature is not and (AFAICT) never has been used.
- Throw an error on unknown command or rule attribute. Quite an
  important feature for manual maintenance: catches typos, etc.
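
The whitespace change amounts to something like the following sketch.
The function name skip_blank is hypothetical and this is not the
mamake.c code; it only shows skipping by isspace() rather than by
comparing against ' ' alone.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: skip any leading whitespace, not just spaces. */
const char *skip_blank(const char *s)
{
	/* The cast avoids undefined behaviour for negative char values. */
	while (isspace((unsigned char)*s))
		s++;
	return s;
}

void skip_blank_demo(void)
{
	assert(strcmp(skip_blank(" \t  make foo"), "make foo") == 0);
	assert(strcmp(skip_blank("done"), "done") == 0);
}
```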
2022-06-12 05:47:02 +01:00
Martijn Dekker
a46c8e74f7 [v1.0] remove deparse.c
The shell code de-parser, which converts byte code back to shell
source code, is unused. It was used in SHOPT_COSHELL, which was
removed in 3613da42, and in ancient code for systems without
fork(2), which was removed in 7b0e0776. Testing reveals it to be
quite broken as it has not kept up with more recent changes in ksh.

It is kept in the dev branch, as we intend to fix it up and use it
for 'typeset -f FUNCTIONNAME' to output function definitions in a
future release. (The current design, which outputs original source
code direct from the source file, is fundamentally broken because
the output of a function definition that was loaded from a file is
corrupted if you edit the file after loading the function.)
2022-06-09 03:07:34 +01:00
Martijn Dekker
89cec81b32 Another round of minor tweaks and cleanups
Notable changes:
- The typeset builtin's usage and error messages for incompatible
  options used with -f have been corrected to show that -t and -u
  can be used with -f.
- In name.c, get rid of the misleadingly named Null static which is
  actually the empty string, not the null value. Replace with a new
  AltEmpty macro that is defined similarly to Empty. This is now
  also used in nvtype.c (re: de037b6e).
2022-06-09 03:02:06 +01:00
Martijn Dekker
fb8719fe1d Remove more unused stuff
A systematic grepping of the extern function definitions in
src/cmd/ksh93/include/*.h revealed more functions that either don't
exist or are not used anywhere. Some of them have never seen any
use in the entire ksh93-history repo (i.e. since 1995). They were
also all undocumented, so it's unlikely third-party custom
built-ins rely on them.
2022-06-03 18:47:15 +01:00
Martijn Dekker
9e2a8c6925 posix mode: disable effect of repeating whitespace char in $IFS
ksh has a little-known field splitting feature that conflicts with
POSIX: if a single-byte whitespace character (cf. isspace(3)) is
repated in $IFS, then field splitting is done as if that character
wasn't a whitespace character. An exmaple with the tab character:

  $ (IFS=$'\t'; val=$'\tone\t\ttwo\t'; set -- $val; echo $#)
  2
  $ (IFS=$'\t\t'; val=$'\tone\t\ttwo\t'; set -- $val; echo $#)
  4
The latter being the same as, for example
  $ (IFS=':'; val=':one::two:'; set -- $val; echo $#)
  4

However, this is incompatible with the POSIX spec and with every
other shell except zsh, in which repeating a character in IFS does
not have any effect. So the POSIX mode must disable this.

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/init.c:
- Add sh_invalidate_ifs() function that invalidates the IFS state
  table by setting the ifsnp discipline struct member to NULL,
  which will cause the next get_ifs() call to regenerate it.
- get_ifs(): Treat a repeated char as S_DELIM even if whitespace,
  unless --posix is on.
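
The classification rule can be sketched like this. It is not the
get_ifs() code: the enum, the function name and the table layout are
hypothetical, and the sketch assumes that "repeated" means the same
whitespace character occurring twice in a row in the IFS value.

```c
#include <assert.h>
#include <ctype.h>

enum ifs_class { IFS_NONE, IFS_SPACE, IFS_DELIM };

/* Hypothetical sketch: classify each character of an IFS value. */
void build_ifs_table(const char *ifs, int posix, enum ifs_class table[256])
{
	int i;
	for (i = 0; i < 256; i++)
		table[i] = IFS_NONE;
	while (*ifs)
	{
		unsigned char c = (unsigned char)*ifs++;
		if (!isspace(c))
			table[c] = IFS_DELIM;
		else if (!posix && (unsigned char)*ifs == c)
		{
			/* doubled whitespace char: split like ':' would */
			table[c] = IFS_DELIM;
			ifs++;
		}
		else
			table[c] = IFS_SPACE;
	}
}

void build_ifs_table_demo(void)
{
	enum ifs_class t[256];
	build_ifs_table("\t\t", 0, t);
	assert(t['\t'] == IFS_DELIM);	/* ksh behaviour */
	build_ifs_table("\t\t", 1, t);
	assert(t['\t'] == IFS_SPACE);	/* posix mode */
}
```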

src/cmd/ksh93/sh/args.c:
- sh_argopts(): Call sh_invalidate_ifs() when enabling or disabling
  the POSIX option. This is needed to make the change in field
  splitting behaviour take immediate effect instead of taking
  effect at the next assignment to IFS.
2022-03-11 21:22:22 +01:00
Martijn Dekker
b398f33c49 Another round of accumulated minor fixes and cleanups
Only notable changes listed below.

**/Mamfile:
- Do not bother redirecting standard error for 'cmp -s' to
  /dev/null. Normally, 'cmp -s' on Linux, macOS, *BSD, or Solaris
  does not print any message. If it does, something unusual is
  going on and I would want to see the message.
- Since we now require a POSIX shell, we can use '!'.

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/init.c:
- Remove SH_TYPE_PROFILE symbol, unused after the removal of the
  SHOPT_PFSH code. (re: eabd6453)

src/cmd/ksh93/sh/io.c:
- piperead(), slowread(): Replace redundant sffileno() calls by
  the variables already containing their results. (re: 50d342e4)

src/cmd/ksh93/bltins/mkservice.c,
src/lib/libcmd/vmstate.c:
- If these aren't compiled, define a stub function to silence the
  ranlib(1) warning that the .o file does not contain symbols.
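
A stub of this kind can be sketched as below. DEMO_FEATURE and both
function names are hypothetical; the point is only that the object file
always exports at least one symbol, whichever branch is compiled.

```c
#ifndef DEMO_FEATURE
#define DEMO_FEATURE 0	/* hypothetical compile-time switch */
#endif

#if DEMO_FEATURE
/* The real implementation would go here. */
int demo_feature_main(void)
{
	return 0;
}
#else
/* Empty stub so the .o file is not symbol-free. */
void demo_feature_stub(void)
{
}
#endif
```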
2022-03-11 21:20:32 +01:00
Martijn Dekker
82ff91e9d9 Remove SHOPT_ENV dead code and sh.env variable (re: 8d7f616e)
As so often with SHOPT_* compile-time options, not all the relevant
code was actually conditional on the option macro. After removing
SHOPT_ENV, the arguments to sh_putenv() and env_delete() macro
calls are dead code and, after removing that, the sh.env variable
is unused.
2022-02-17 19:42:27 +00:00
Martijn Dekker
14a43a0a88 Yet more misc. cleanups; rm SHOPT_PFSH, SHOPT_TYPEDEF
Notable changes:
- Remove SHOPT_PFSH compile-time option and associated code.
  This was meant to work with Solaris rights profiles, see:
  https://docs.oracle.com/cd/E23824_01/html/821-1461/profiles-1.html#REFMAN1profiles-1
  But it has been obsolete for years as Solaris stopped using
  it in its shipped ksh several OS versions ago, preferring a
  library-based wrapper around ksh and other shells.
  Nonetheless I experimented with the option on Solaris 11.4.
  Result: no external command will run; output of uninitialised
  memory in error message. So it's already fallen victim to bit
  rot. There's nothing interesting here, so just get rid.
- Remove SHOPT_TYPEDEF compile-time option (but keep the code!).
  Turning it off caused the build to fail. It may be possible to
  fix it, but the type definition code is integral to ksh now (e.g.
  'enum' depends on much of it) so it makes no sense to disable it.
  This was removed in the ksh 93v- beta version as well.
- Remove nv_close() calls and remove nv_close() documentation from
  the nval.3 man page. This function is a dummy, present without
  any changes since the beginning of the ast-open-archive repo in
  1995. The comment was: "Currently this is a dummy, but someday
  will be needed for reference counting". 27 or more years later,
  it's time to admit it's never going to happen. (And of course,
  nv_close() calls were not being used with anything resembling
  consistency.)
- Add a null nv_close() macro to nval.h for compatibility with
  third party code that follows the old documentation.
- Add a few missing regression tests.
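
For the null nv_close() macro mentioned above, a compatibility macro of
that sort might look like this sketch. The name demo_close and the
((void)0) expansion are assumptions, not the actual nval.h definition.

```c
#include <assert.h>

/* Hypothetical compat macro: old call sites still compile, nothing runs. */
#define demo_close(np)	((void)0)

int demo_use(int *node)
{
	int value = *node;
	demo_close(node);	/* expands to a no-op expression statement */
	return value;
}

void demo_use_demo(void)
{
	int n = 42;
	assert(demo_use(&n) == 42);
}
```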
2022-02-10 21:04:56 +00:00
Johnothan King
f38494ea1d Fix multiple bugs in .sh.match (#455)
This commit backports all of the relevant .sh.match bugfixes from
ksh93v-. Most of the .sh.match rewrite is from versions 2012-08-24
and 2012-10-04, with patches from later releases of 93v- and
ksh2020 also applied. Note that there are still some remaining bugs
in .sh.match, although now the total count of .sh.match bugs should
be less than before.

These are the relevant changes in the ksh93v- changelog that were
backported:
12-08-07  .sh.match no longer gets set for patterns in PS4 during
          set -x.
12-08-10  Rewrote .sh.match expansions fixing several bugs and
          improving performance.
12-08-22  .sh.match now handles subpatterns that had no matches with
          ${var//pattern} correctly.
12-08-21  A bug in setting .sh.match after ${var//pattern/string}
          when string is empty has been fixed.
12-08-21  A bug in setting .sh.match after [[ string == pattern ]]
          has been fixed.
12-08-31  A bug that could cause a core dump after
          typeset -m var=.sh.match has been fixed.
12-09-10  Fixed a bug in typeset -m when .sh.match is being renamed.
12-09-07  Fixed a bug in .sh.match code that could cause the shell
          to quietly
13-02-21  The 12-01-16 bug fix prevented .sh.match from being used
          in the replacement string. The previous code was restored
          and a different fix which prevented .sh.match from being
          computed for nested replacement has been used instead.
13-05-28  Fixed two bugs for typeset -c and typeset -m for variable
          .sh.match.

Changes:
- The SHOPT_2DMATCH option has been removed. This was already the
  default behavior previously, and now it's documented in the man
  page.
- init.c: Backported the sh_setmatch() rewrite from 93v- 2012-08-24
  and 2012-10-04.
- Backported the libast 93v- strngrpmatch() function, as the
  .sh.match rewrite requires this API.
- Backported the sh_match regression tests from ksh93v-, with many
  other sh_match tests backported from ksh2020. Much of the sh_match
  script is based on code from Roland Mainz:
  https://marc.info/?l=ast-developers&m=134606574109162&w=2
  https://marc.info/?l=ast-developers&m=134490505607093
- tests/{substring,treemove}.sh: Backported other relevant .sh.match
  fixes, with tests added to the substring and treemove test scripts.
- tests/types.sh: One of the (now reverted) memory leak bugfixes
  introduced a CI test failure in this script, so for that test the
  error message has been improved.
- string/strmatch.c: The original ksh93v- code for the strngrpmatch()
  changes introduced a crash that could occur because strlen would
  be used on a null pointer. This has been fixed by avoiding strlen
  if the string is null.
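
The strlen guard described in the last item can be sketched as follows.
The helper name len_or_zero is hypothetical; the actual fix in
strmatch.c may be structured differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: never pass a null pointer to strlen(3). */
size_t len_or_zero(const char *s)
{
	return s ? strlen(s) : 0;
}

void len_or_zero_demo(void)
{
	assert(len_or_zero(NULL) == 0);
	assert(len_or_zero("abc") == 3);
}
```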

One nice side effect of these changes is a considerable performance
improvement in the shbench[1] gsub benchmark (results from 20
iterations with CCFLAGS=-Os):
--------------------------------------------------
name      /tmp/ksh-current     /tmp/ksh-matchfixes
--------------------------------------------------
gsub.ksh  0.883 [0.822-0.959]  0.457 [0.442-0.505]
--------------------------------------------------

Despite all of the many fixes and improvements in the backported
93v- .sh.match code, there are a few remaining bugs:

- .sh.match is printed with a default [0] subscript (see also
  https://github.com/ksh93/ksh/issues/308#issuecomment-1025016088):
     $ arch/*/bin/ksh -c 'echo ${!.sh.match}'
       .sh.match[0]
  This bug appears to have been introduced by the changes from
  ksh93v- 2012-08-24.
- The wrong variable name is given for 'parameter not set' errors
  (from https://marc.info/?l=ast-developers&m=134489094602596):
     $ arch/*/bin/ksh -u
     $ x=1234
     $ true "${x//~(X)([012])|([345])/}"
     $ compound co
     $ typeset -m co.array=.sh.match
     $ printf "%q\n" "${co.array[2][0]}"
     arch/linux.i386-64/bin/ksh: co.array[2][(null)]: parameter not set
- .sh.match leaks out of subshells. Further information and a
  reproducer can be found here:
  https://marc.info/?l=ast-developers&m=136292897330187

[1]: https://github.com/ksh-community/shbench
2022-02-10 21:04:23 +00:00
Johnothan King
0863a8eb29 Support glibc 2.35's posix_spawn_file_actions_addtcsetpgrp_np(3)
This commit implements support for the glibc 2.35
posix_spawn_file_actions_addtcsetpgrp_np(3) extension[2][3],
updating spawnveg(3) to use the new function for setting the
terminal group. This was done with the intention of improving
performance in interactive shells without reintroducing previous
race conditions[4][5].

[1]: https://sourceware.org/pipermail/libc-alpha/2022-February/136040.html
[2]: https://sourceware.org/git/?p=glibc.git;a=commit;h=342cc934
[3]: https://sourceware.org/git/?p=glibc.git;a=commit;h=6289d28d
[4]: https://github.com/ksh93/ksh/issues/79
[5]: https://www.mail-archive.com/ast-developers@research.att.com/msg00717.html

src/cmd/ksh93/sh/path.c:
- Tell spawnveg(3) to set the terminal process group when launching
  a child process in an interactive shell.

src/cmd/ksh93/sh/xec.c:
- If posix_spawn_file_actions_addtcsetpgrp_np(3) is available,
  allow use of spawnveg(3) (via sh_ntfork()) even with job control
  active.
- sh_ntfork(): Reimplement most of the SIGTSTP handling code
  removed in commit 66c37202.

src/lib/libast/comp/spawnveg.c,
src/lib/libast/misc/procopen.c,
src/lib/libast/features/sys:
- Add support for posix_spawn_file_actions_addtcsetpgrp_np(3).
- Allow spawnveg to set the terminal process group when pgid == 0.
  This was necessary to avoid race conditions when using the new
  function.

src/lib/libast/features/lib:
- Detect posix_spawn_file_actions_addtcsetpgrp_np(3).
- Do not detect an OS spawnveg(3). With the API changes to spawnveg
  in this pull request ksh probably can't use the OS's spawnveg
  function anymore. (That's assuming anything else even provides a
  spawnveg function to begin with, which is unlikely.)

src/lib/libast/features/api,
src/cmd/ksh93/include/defs.h:
- Bump libast version (20220101 => 20220201) due to the spawnveg(3)
  API change.

src/lib/libast/man/spawnveg.3:
- Document the changes to spawnveg(3) in the corresponding man
  page. Currently, it will only use the new tcfd argument if
  posix_spawn_file_actions_addtcsetpgrp_np(3) is supported. This
  could also be implemented for the fork(2) fallback, but for now
  I've avoided changing that since actually using it in the fork
  code would likely require a lot of hackery to avoid attempting
  tcsetpgrp with vfork (the behavior of tcsetpgrp after vfork is
  not portable) and would only benefit systems that don't have
  posix_spawn and vfork (I can't recall any off the top of my head
  that would fall under that category).
- Updated the man page to account for spawnveg's change in
  behavior.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2022-02-05 13:31:31 +00:00
Martijn Dekker
e569f23ef9 bump internal libast version; various minor cleanups
These are minor things I accumulated over the last month or so.

Notable changes:

src/lib/libast/features/api,
src/lib/libast/misc/state.c,
src/lib/libast/comp/conf.tab,
src/cmd/ksh93/include/defs.h:
- Bump internal libast version to 20220101L. We've made a few
  additions to the API, at least pathicase (see 71934570, ca3ec200)
  and astconf_long (see c2ac69b2), so this should have been done
  already. This also updates '/opt/ast/bin/getconf _AST_VERSION'.
- Use AST_VERSION instead of outdated _AST_VERSION.
- In state.c, use AST_VERSION instead of hardcoding the version.

src/cmd/ksh93/sh/xec.c:
- Remove 'restorefd' variable, unused as of 42becab6.
- Remove 'cmdrecurse' function and SH_RUNPROG macro; this was once
  used by a few libcmd commands, but ast-open-archive reveals it's
  unused as of ast 1999-12-25.

src/cmd/ksh93/sh/*.c:
- Where available, use e_dot instead of "." for consistency; it is
  defined as an extern so we might as well use it.

src/cmd/ksh93/tests/*.sh:
- When reporting signal names in fails, include the SIG prefix.
- Fix a broken process hang test in subshell.sh.

src/lib/libast/man/sfdisc.3:
- Removed. The interfaces described here never made it out of AT&T;
  they do not exist in any libast version in ast-open-archive.
  Resolves: https://github.com/ksh93/ksh/issues/426
2022-01-14 19:55:35 +00:00
Martijn Dekker
b509e92241 edit: do not enable multiline mode with no editor active
If neither gmacs/emacs nor vi are active, the multiline mode should
not be enabled even if the multiline option is on. Doing so can
cause inconsistent behaviour, particularly in multibyte locales
where, if the shell is compiled with SHOPT_RAWONLY (as is default),
the no-editor mode is actually handled by vi.c.

Also, the new --histreedit and --histverify options only work in
the emacs or vi editors, or in no-editor mode when handled by vi.
Which means they cannot ever work if neither emacs nor vi were
compiled in (i.e. SHOPT_ESH and SHOPT_VSH were both disabled).
In that case, there's no point in compiling in those options.
Come to think of it, the same applies to the multiline option.

All changed files:
- Update SHOPT_ESH/SHOPT_VSH preprocessor directives as per above.

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/include/shell.h:
- Move definitions of history expansion-related options to shell.h,
  which is where all the other shell options are defined.
2022-01-12 20:39:05 +00:00
Martijn Dekker
a5700d3937 Expose --histreedit and --histverify options (re: 921bbcae)
This adds two long-form shell options that modify history expansion
(-H/--histexpand). If --histreedit is on and a history expansion
fails, the command line is reloaded into the next prompt's edit
buffer, allowing corrections. If --histverify is on, the results of
a history expansion are not immediately executed but instead loaded
into the next prompt's edit buffer, allowing further changes.

SH_HISTREEDIT and SH_HISTVERIFY were already handled all along in
slowread() in io.c and via 'reedit' arguments to functions called
there, but could not be turned on as they were only ever exposed as
shopt options in the removed bash compatibility mode (in theory
only, as it failed to compile). I had overlooked them until now.

So, since the code is there, let's expose these options through the
normal long options interface. They're working fine, and activating
them actually makes history expansion tolerable to use.

src/cmd/ksh93/data/options.c:
- Make these options available as "histreedit" and "histverify".

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/sh.1:
- Document the "new" options.

src/cmd/ksh93/include/defs.h:
- Remove unused SH_HISTAPPEND and SH_HISTORY2 macros which were
  part of the removed bash compatibility code. Note that ksh does
  not need a histappend option as it never overwrites the history
  file (in the bash mode, this shopt option was a no-op).
2022-01-12 20:38:30 +00:00
Johnothan King
ca5803419b Fix various typos, man page issues and improve the documentation (#415)
This commit makes various different improvements to the documentation:
- sh.1: Backported (with changes) mandoc warning fixes from ksh2020
  for the ksh93(1) man page: <https://github.com/att/ast/pull/1406>
- Removed unnecessary spaces at the end of lines to fix a few other
  mandoc warnings.
- Fixed various typos and capitalization errors in the documentation.
- ANNOUNCE: Document the addition of the ${.sh.pid} variable
  (re: 9de65210).
- libast/man/str*: Update the man pages for the libast str* functions
  to improve how accurately each function is described.
- ksh93/README: Update regression test/compatibility notes to include
  OpenBSD 7.0, FreeBSD 13.0 and WSL running Ubuntu 20.04.
- Change a few places to store the return value from strlen in a
  size_t variable rather than signed int.
- comp/setlocale.c: To avoid confusion of two separate variables named
  lang, the function local variable has been renamed to langidx.
2022-01-07 16:17:55 +00:00
Martijn Dekker
b590a9f155 [shp cleanup 01..20] all the rest (re: 2d3ec8b6)
This combines 20 cleanup commits from the dev branch.

All changed files:
- Clean up pointer dereferences to sh.
- Remove shp arguments from functions.

Other notable changes:

src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/sh/init.c:
- On second thought, get rid of the function version of
  sh_getinterp() as libshell ABI compatibility is moot. We've
  already been breaking that by reordering the sh struct, so there
  is no way it's going to work without recompiling.

src/cmd/ksh93/sh/name.c:
- De-obfuscate the relationship between nv_scan() and scanfilter().
  The former just calls the latter as a static function, there's no
  need to do that via a function pointer and void* type conversions.

src/cmd/ksh93/bltins/typeset.c,
src/cmd/ksh93/sh/name.c,
src/cmd/ksh93/sh/nvdisc.c:
- 'struct adata' and 'struct tdata', defined as local struct types
  in these files, need to have their first three fields in common,
  the first being a pointer to sh. This is because scanfilter() in
  name.c accesses these fields via a type conversion. So the sh
  field needed to be removed in all three at the same time.
  TODO: de-obfuscate: good practice definition via a header file.
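
One way the TODO could be approached is sketched below: put the shared
leading fields in a single struct defined in a header and embed it as
the first member. All names here are hypothetical and the field list is
invented for illustration; it is not the layout of 'struct adata' or
'struct tdata'.

```c
#include <assert.h>

/* Shared leading fields, defined once (hypothetical). */
struct scan_head
{
	int	flags;
	int	count;
};

struct adata_like { struct scan_head head; const char *attr; };
struct tdata_like { struct scan_head head; int indent; };

/* Works on either struct through a pointer to its first member. */
void scan_visit(struct scan_head *h)
{
	h->count++;
}

void scan_visit_demo(void)
{
	struct adata_like a = { { 0, 0 }, "x" };
	struct tdata_like t = { { 0, 0 }, 2 };
	scan_visit(&a.head);
	scan_visit(&t.head);
	/* A pointer to a struct converts to a pointer to its first member. */
	scan_visit((struct scan_head *)&t);
	assert(a.head.count == 1 && t.head.count == 2);
}
```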

src/cmd/ksh93/sh/path.c:
- Naming consistency: reserve the path_ function name prefix for
  externs and rename statics with that prefix.
- The default path was sometimes referred to as the standard path.
  To use one term, rename std_path to defpath and onstdpath() to
  ondefpath().
- De-obfuscate SHOPT_PFSH conditional code by only calling
  pf_execve() (was path_pfexecve()) if that is compiled in.

src/cmd/ksh93/include/streval.h,
src/cmd/ksh93/sh/streval.c:
- Rename extern strval() to arith_strval() for consistency.

src/cmd/ksh93/sh/string.c:
- Remove outdated/incorrect isxdigit() fallback; '#ifndef isxdigit'
  is not a correct test as isxdigit() is specified as a function.
  Plus, it's part of C89/C90 which we now require. (re: ac8991e5)

src/cmd/ksh93/sh/suid_exec.c:
- Replace an incorrect reference to shgd->current_pid with
  getpid(); it cannot work as (contrary to its misleading directory
  placement) suid_exec is an independent libast program with no
  link to ksh or libshell at all. However, no one noticed because
  this was in fallback code for ancient systems without
  setreuid(2). Since that standard function was specified in POSIX
  Issue 4 Version 2 from 1994, we should remove that fallback code
  sometime as part of another obsolete code cleanup operation to
  avoid further bit rot. (re: 843b546c)

src/cmd/ksh93/bltins/print.c: genformat():
- Remove preformat[] which was always empty and had no effect.

src/cmd/ksh93/shell.3:
- Minor copy-edit.
- Remove documentation for nonexistent sh.infile_name. A search
  through ast-open-archive[*] reveals this never existed at all.
- Document sh.savexit (== $?).

src/cmd/ksh93/shell.3,
src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/sh/init.c:
- Remove sh.gd/shgd; this is now unused and was never documented
  or exposed in the shell.h public interface.
- sh_sigcheck() was documented in shell.3 as taking no arguments
  whereas in the actual code it took a shp argument. I decided to
  go with the documentation.
- That leaves sh_parse() as the only documented function that still
  takes an shp argument. I'm just going to go ahead and remove it
  for consistency, reverting sh_parse() to its pre-2003 spec.
- Remove undocumented/unused sh_bltin_tree() function which simply
  returned sh.bltin_tree.
- Bump SH_VERSION to 20220106.
2022-01-07 16:16:31 +00:00
Martijn Dekker
2d3ec8b67a [shp cleanup 00] Reunify the original sh state struct
As observed previously (see 3654ee73, 7e6bbf85, 79d19458), the ksh
93u+ codebase on which we rebased development was in a transition:
AT&T evidently wanted to make it possible to have several shell
interpreter states in the same process, which in theory would have
made it possible to start a complete new shell (not just a
subshell) without forking a new process.

This required transitioning from accessing the 'sh' state struct
directly to accessing it via pointers (usually but not always
called 'shp'), introducing a lot of bug-prone passing around of
those pointers via function arguments and other state structs.

Some of the original 'sh' struct was separated into a 'struct
shared' called 'shgd' a.k.a. 'sh.gd' (global data) instead; these
were global state variables that were going to be shared between
the different main shell environments sharing a process. Yet, for
some reason, that struct was allocated dynamically once at init
time, requiring yet another pointer to access it. <shrug>

None of this ever worked, because that transition was incomplete.
It was much further along in the ksh 93v- beta, but I don't think
it actually worked there either (not very much really did). So,
starting a new shell has always required starting a new process.

So, now that it's clear what they were trying to do, should we try
to make it work? I'm going to go with a firm "no" on that question.

Even non-forking (virtual) subshells, something quite a bit less
ambitious, were already an unmitigated nightmare of bugs. In 93u+m
we fixed a load of bugs related to those, but I'm sure there are
still many left. At the very least there are multiple memory leaks.

I think the ambition to go even further and have complete shells
running separate programs share a process, particularly given the
brittle and buggy state of the existing codebase, is evidence that
the AT&T team, in the final years, had well and truly lost the
ability to think "wait a minute, aren't we in over our heads here,
and why are we doing this again? Is this *actually* a feasible and
useful idea?"

In my view, having entirely separate programs share a process is a
*terrible*, horrible, no-good idea that takes us back to the bad
old days before Unix, when kernels and CPUs were unable to enforce
any memory access restrictions. Programmers are imperfect. If
you're going to run a new program, you need the kernel to enforce
the separation between programs, or you're just asking for memory
corruption and security holes. And that separation is enforced by
starting a new program in a new process. That's what processes are
*for*. And if you need *that* to be radically performance-optimised
then you're probably doing it wrong anyway.

(By the way, I would still argue the same for subshells, even after
we fixed many bugs in virtual subshells. But forking all subshells
would in fact cause many scripts to slow down, and the community
would surely revolt. <sigh>  Maybe I should make it a shell option
instead, so scripts can 'set -o subfork' for reliability.)

It is also unclear how they were going to make something like
'ulimit' work, which can only work in a separate process. There
was no sign of a mechanism to fork a separate program's shell
mid-execution like there is for subshells (sh_subfork()).

Anyway... I had already changed some code here and there to access
the sh state struct directly, but as of this commit I'm beginning
to properly undo this exercise in pointlessness. From now on, we're
exercising pointerlessness instead.

I'll do this in stages to make any problems introduced more
traceable. Stage 0 restores the full 'sh' state struct to its
former static glory and reverts 'shgd' as a separate entity.

src/cmd/ksh93/sh/defs.c,
src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/Mamfile:
- Move 'struct sh_scoped' and 'struct limits' from defs.h to
  shell.h as the sh struct will need their complete definitions.
- Get rid of 'struct shared' (shgd) in defs.h; its members are
  folded back into their original place, the main Shell_t struct
  (sh) in shell.h. There are no name conflicts.
- Get rid of the _SH_PRIVATE macro in defs.h. The members it
  defines are now defined normally in the main Shell_t struct (sh)
  in shell.h.
- To make this possible, move <history.h> and "fault.h" includes
  from defs.h to shell.h and update the Mamfile accordingly.
- Turn sh_getinterp() and shgd into macros that resolve to (&sh).
  This will allow the compiler to optimise out many pointer
  dereferences already.
- Keep extern sh_getinterp() for libshell ABI compatibility.

src/cmd/ksh93/sh/init.c:
- sh_init(): Do not calloc (sh_newof) the sh or shgd structs.
- sh_getinterp(): Keep function for libshell ABI compat.
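
The difference between the two access styles can be sketched as
follows. This is a minimal illustration only: the struct members and
the demo_* function names are hypothetical, and the real Shell_t is
far larger. Only the idea of sh_getinterp() and shgd becoming macros
that resolve to (&sh) is taken from the description above.

```c
/* Hypothetical, heavily reduced stand-in for the real Shell_t. */
typedef struct
{
    int subshell;   /* illustrative member only */
    int exitval;    /* illustrative member only */
} Shell_t;

/* One statically allocated shell state struct; nothing is calloc'ed. */
Shell_t sh;

/* Pointer-style accessors resolve to the address of the static struct. */
#define sh_getinterp()  (&sh)
#define shgd            (&sh)

/* Old style: the state is reached through a pointer argument. */
int demo_exitval_via_pointer(Shell_t *shp)
{
    return shp->exitval;
}

/* New style: the state is accessed directly, with no pointer to pass. */
int demo_exitval_direct(void)
{
    return sh.exitval;
}
```

Because the address of a static object is a constant, a compiler can
typically resolve sh_getinterp()->exitval to the same direct access
as sh.exitval.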
2022-01-01 02:28:06 +00:00
Martijn Dekker
a1f5c99204 INIT: remove proto, ratz (re: 46593a89, 6137b99a); major cleanup
This takes another step towards cleaning up the build system. We
now do not even pretend to be theoretically compatible with
pre-1989 K&R C compilers or with C++ compilers. In practice, this
had already been broken for many years due to bit rot.

Commit 46593a89 already removed the license handling enormity that
depended on proto, so now we can cleanly remove it altogether. But
we do need to leave some backwards compatibility stubs to keep the
build system compatible with older AST code; it should remain
possible to build older ksh versions with the current build system
(the bin/ and src/cmd/INIT/ directories) for testing purposes.

So as of now there is no more __MANGLE__d rubbish in your generated
header files. This is only about a quarter of a century overdue...

This commit also includes a huge amount of code cleanup to remove
thousands of unused K&R C fallbacks and other cruft, particularly
in libast. This code base should now be a little easier to
understand for people who are familiar with a modern(ish) C
standard.

ratz is now also removed; this was a standalone and simplified 2005
version of gunzip. As of 6137b99a, none of our code uses it, even
theoretically. And the real g(un)zip is now everywhere.

src/cmd/INIT/proto.c, src/cmd/INIT/ratz.c:
- Removed.

COPYRIGHT:
- Remove zlib license; this only applied to ratz.

bin/package, src/cmd/INIT/package.sh:
- Related cleanups.
- Unset LC_ALL before invoking a new shell, respecting the user's
  locale again and avoiding multibyte character corruption on the
  command line.

src/cmd/INIT/proto.sh:
- Add stub for backwards compatibility with Mamfiles that depend on
  proto. It does nothing but pass input without modification and is
  now installed as the new arch/*/bin/proto by src/cmd/INIT/Mamfile.

src/cmd/INIT/iffe.sh:
- Ignore the proto-related -e (--package) and -p (--prototyped)
  options; keep parsing them for backwards compatibility.
- Trim the macros passed to every test to their standard C
  versions, removing K&R C and C++ versions. These are now
  considered to be for backwards compatibility only.

src/cmd/INIT/iffe.tst:
- Remove proto(1) mangling code.
  By the way, iffe can be regression-tested as follows:
        $ bin/package use   # set up environment in a child shell
        $ regress src/cmd/INIT/iffe.tst
        $ exit              # leave package environment

src/cmd/INIT/make.probe, src/cmd/INIT/probe.win32:
- Remove code to handle C++.

src/lib/libast/features/common:
- As in iffe.sh above, trim macros designed for compatibility with
  C++ and ancient C compilers to their standard C versions and
  comment that they are for backwards compatibility with AST code.
  This is needed to keep all the old ast and ksh code compiling.

src/cmd/ksh93/sh/init.c,
src/cmd/ksh93/sh/name.c:
- Clarify libshell ABI compatibility function versions of macros.
  A "proto workaround" comment in the original code misled me into
  thinking this had something to do with the removed proto(1), but
  it's unrelated. Call the workaround macro BYPASS_MACRO instead.
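
  The general technique behind such a workaround can be sketched as
  below. The demo_double name is hypothetical and the exact form used
  in init.c and name.c may differ; the sketch only assumes that
  BYPASS_MACRO expands to nothing. A function-like macro is expanded
  only when its name is directly followed by '(', so placing an empty
  macro in between lets a real function share the macro's name.

```c
/* Expands to nothing; its only job is to separate a name from '('. */
#define BYPASS_MACRO

/* Hypothetical function-like macro used for normal calls. */
#define demo_double(x)  ((x) * 2)

/*
 * Real function with the same name, e.g. for ABI compatibility.
 * The token after the name is BYPASS_MACRO, not '(', so the
 * preprocessor does not treat this as a macro invocation.
 */
int demo_double BYPASS_MACRO (int x)
{
    return x * 2;
}
```

  A call written as demo_double(4) uses the macro, while
  (demo_double)(4) or a call through a function pointer reaches the
  real function.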

src/cmd/ksh93/include/defs.h:
- sh_sigcheck() macro: allow &sh as an argument: parenthesise shp.

src/cmd/ksh93/sh/nvtype.c:
- Remove unused nv_mkstruct() function. (re: d0a5cab1)

**/features/*:
- Remove obsolete iffe 'set prototyped' option.

**/Mamfile:
- Remove all references to the ast/prototyped.h header.
- Remove all use of the proto command. Simply copy instead.

*** 850-ish source files: ***
- Remove all '#pragma prototyped' directives.
- Remove all C++ compat code conditional upon defined(__cplusplus).
- Remove all use of the _ARG_ macro, which on standard C expands to
  its argument:
        #define _ARG_(x)        x
  (on K&R C, it expanded to nothing)
- Remove all use of _BEGIN_EXTERNS_ and _END_EXTERNS_ macros (empty
  on standard C; this was for C++ compatibility)
- Reduce all #if __STD_C (standard code) #else (K&R code) #endif
  blocks to the standard code only, without use of the macro.
- Same for _STD_ macro which seems to have had the same function.
- Change all instances of 'Void_t' to standard 'void'.
2021-12-24 07:05:22 +00:00
Johnothan King
beccb93fd4 Fix various compiler warnings and minor issues (#362)
List of changes:
- Fixed some -Wuninitialized warnings and removed some unused variables.

- Removed the unused extern for B_login (re: d8eba9d1).

- The libcmd builtins and the vmalloc memfatal function now handle
  memory errors with 'ERROR_SYSTEM|ERROR_PANIC' for consistency with how
  ksh itself handles out of memory errors.

- Added usage of UNREACHABLE() where it was missing from error handling.

- Extend many variables from short to int to prevent overflows (most
  variables involve file descriptors).

- Backported a ksh2020 patch to fix unused value Coverity issues
  (https://github.com/att/ast/pull/740).

- Note in src/cmd/ksh93/README that ksh compiles with Cygwin on
  Windows 10 and Windows 11, albeit with many test failures.

- Add comments to detail some sections of code. Extensive list of
  commits related to this change:
  ca2443b5, 7e7f1372, 2db9953a, 7003aba4, 6f50ff64, b1a41311,
  222515bf, a0dcdeea, 0aa9e03f, 61437b27, 352e68da, 88e8fa67,
  bc8b36fa, 6e515f1d, 017d088c, 035a4cb3, 588a1ff7, 6d63b57d,
  a2f13c19, 794d1c86, ab98ec65, 1026006d

- Removed a lot of dead ifdef code.

- edit/emacs.c: Hide an assignment to avoid a -Wunused warning. (See
  also https://github.com/att/ast/pull/753, which removed the assignment
  because ksh2020 removed the !SHOPT_MULTIBYTE code.)

- sh/nvdisc.c: The sh_newof macro cannot return a null pointer because
  it will instead cause the shell to exit if memory cannot be allocated.
  That makes the if statement here a no-op, so remove it.

- sh/xec.c: Fixed one unused variable warning in sh_funscope().

- sh/xec.c: Remove a fallthrough comment added in commit ed478ab7
  because the TFORK code doesn't fall through (GCC also produces no
  -Wimplicit-fallthrough warning here).

- data/builtins.c: The cd and pwd man pages state that these builtins
  default to -P if PATH_RESOLVE is 'physical', which isn't accurate:
     $ /opt/ast/bin/getconf PATH_RESOLVE
     physical
     $ mkdir /tmp/dir; ln -s /tmp/dir /tmp/sym
     $ cd /tmp/sym
     $ pwd
     /tmp/sym
     $ cd -P /tmp/sym
     $ pwd
     /tmp/dir
  The behavior described by these man pages isn't specified in the ksh
  man page or by POSIX, so to avoid changing these builtins' behavior
  the inaccurate PATH_RESOLVE information has been removed.

- Mamfiles: Preserve multi-line errors by quoting the $x variable.
  This fix was backported from 93v-.
  (See also <a7e9cc82>.)

- sh/subshell.c: Remove set but not used sp->errcontext variable.
2021-12-09 06:42:59 +01:00
Martijn Dekker
a3f4b5efd1 out of memory checks: add missing sh_getcwd() wrapper (re: 7ad274f8)
getcwd() with 0/NULL arguments also mallocs, so needs a check.
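
A checked wrapper of this kind might look roughly like the sketch
below. It is not the actual sh_getcwd(); the demo_getcwd name and the
abort() on failure are assumptions made for illustration, and it
relies on a libc where getcwd(NULL, 0) allocates the result.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: treat an allocation failure inside getcwd()
 * as fatal, the same way as any other out-of-memory condition.
 * Other failures (e.g. the directory was removed) are passed on
 * to the caller as a null pointer.
 */
char *demo_getcwd(void)
{
    char *cwd = getcwd(NULL, 0);    /* allocates the buffer on common libcs */

    if (!cwd && errno == ENOMEM)
    {
        fputs("out of memory\n", stderr);
        abort();
    }
    return cwd;                     /* caller must free() a non-null result */
}
```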
2021-12-05 22:02:41 +01:00
Martijn Dekker
74730c8ac7 test/[: Improve error status > 1 (re: 7003aba4, cd2cf236, ef1f53b5)
As I got to know the code better, it now seems painfully obvious
that getting test/[ to issue an exit status >= 2 on error only
requires a simple check in sh_exit() in fault.c, which is called
whenever the shell issues an error message.
2021-11-22 15:37:04 +01:00
Martijn Dekker
c734568b02 arithmetic: Fix the octal leading zero mess (#337)
In C/POSIX arithmetic, a leading 0 denotes an octal number, e.g.
010 == 8. But this is not a desirable feature as it can cause
problems with processing things like dates with a leading zero.
In ksh, you should use 8#10 instead ("10" with base 8).

It would be tolerable if ksh at least implemented it consistently.
But AT&T made an incredible mess of it. For anyone who is not
intimately familiar with ksh internals, it is inscrutable where
arithmetic evaluation special-cases a leading 0 and where it
doesn't. Here are just some of the surprises/inconsistencies:

1. The AT&T maintainers tried to honour a leading 0 inside of
   ((...)) and $((...)) and not for arithmetic contexts outside it,
   but even that inconsistency was never quite consistent.

2. Since 2010-12-12, $((x)) and $(($x)) are different:
      $ /bin/ksh -c 'x=010; echo $((x)) $(($x))'
      10 8
   That's a clear violation of both POSIX and the principle of
   least astonishment. $((x)) and $(($x)) should be the same in
   all cases.

3. 'let' with '-o letoctal' acts in this bizarre way:
      $ set -o letoctal; x=010; let "y1=$x" "y2=010"; echo $y1 $y2
      10 8
   That's right, 'let y=$x' is different from 'let y=010' even
   when $x contains the same string value '010'! This violates
   established shell grammar on the most basic level.

This commit introduces consistency. By default, ksh now acts like
mksh and zsh: the octal leading zero is disabled in all arithmetic
contexts equally. In POSIX mode, it is enabled equally.

The one exception is the 'let' built-in, where this can still be
controlled independently with the letoctal option as before (but,
because letoctal is synched with posix when switching that on/off,
it's consistent by default).

We're also removing the hackery that causes variable expansions for
the 'let' builtin to be quietly altered, so that 'x=010; let y=$x'
now does the same as 'let y=010' even with letoctal on.

Various files:
- Get rid of now-redundant sh.inarith (shp->inarith) flag, as we're
  no longer distinguishing between being inside or outside ((...)).

src/cmd/ksh93/sh/arith.c:
- arith(): Let disabling POSIX octal constants by skipping leading
  zeros depend on either the letoctal option being off (if we're
  running the "let" built-in) or the posix option being off.
- sh_strnum(): Preset a base of 10 for strtonll(3) depending on the
  posix or letoctal option being off, not on the sh.inarith flag.
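
The effect of skipping leading zeros can be sketched with standard
strtoll(3). This is a simplified illustration, not the arith() or
sh_strnum() code: the demo_strnum name and its posix_octal parameter
are hypothetical, and signs and ksh's base#value notation are not
handled.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: convert a numeric string. If posix_octal is
 * zero, leading zeros are skipped so that "010" means ten; otherwise
 * strtoll() with base 0 treats a leading 0 as an octal prefix.
 */
long long demo_strnum(const char *s, int posix_octal)
{
    if (!posix_octal)
    {
        /* Keep a lone "0" and keep the zero of a 0x/0X hex prefix. */
        while (s[0] == '0' && s[1] != '\0' && s[1] != 'x' && s[1] != 'X')
            s++;
    }
    return strtoll(s, NULL, 0);
}
```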

src/cmd/ksh93/include/argnod.h,
src/cmd/ksh93/sh/args.c,
src/cmd/ksh93/sh/macro.c:
- Remove astonishing hackery that violated shell grammar for 'let'.

src/cmd/ksh93/sh/name.c (nv_getnum()),
src/cmd/ksh93/sh/nvdisc.c (nv_getn()):
- Remove loops for skipping leading zeroes that included a broken
  check for justify/zerofill attributes, thereby fixing this bug:
	$ typeset -Z x=0x15; echo $((x))
	-ksh: x15: parameter not set
  Even if this code wasn't redundant before, it is now: sh_arith()
  is called immediately after the removed code and it ignores
  leading zeroes via sh_strnum() and strtonll(3).

Resolves: https://github.com/ksh93/ksh/issues/334
2021-11-17 04:28:08 +01:00
Martijn Dekker
7b5b0a5d54 Fix octal number arguments in printf integer arithmetic
Bug 1: POSIX requires numbers used as arguments for all the %d,
%u... in printf to be interpreted as in the C language, so
	printf '%d\n' 010
should output 8 when the posix option is on. However, it outputs 10.

This bug was introduced as a side effect of a change introduced in
the 2012-02-07 version of ksh 93u+, which caused the recognition
of leading-zero numbers as octal in arithmetic expressions to be
disabled outside ((...)) and $((...)). However, POSIX requires
leading-zero octal numbers to be recognised for printf, too.

The change in question introduced a sh.inarith flag that is set while
we're processing a POSIX arithmetic expression, i.e., one that
recognises leading-zero octal numbers.

Bug 2: Said flag is not reset in a command substitution used within
an arithmetic expression. A command substitution should be a
completely new context, so the following should both output 10:

$ ksh -c 'integer x; x=010; echo $x'
10            # ok; it's outside ((…)) so octals are not recognised
$ ksh -c 'echo $(( $(integer x; x=010; echo $x) ))'
8             # bad; $(comsub) should create new non-((…)) context

src/cmd/ksh93/bltins/print.c: extend():
- For the u, d, i, o, x, and X conversion modifiers, set the POSIX
  arithmetic context flag before calling sh_strnum() to convert the
  argument. This fixes bug 1.

src/cmd/ksh93/sh/subshell.c: sh_subshell():
- When invoking a command substitution, save and unset the POSIX
  arithmetic context flag. Restore it at the end. This fixes bug 2.
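
The save/unset/restore pattern for bug 2 can be sketched as follows.
All demo_* names are hypothetical stand-ins; the real flag lives in
the shell state struct and the real logic is in sh_subshell().

```c
/* Hypothetical stand-in for the POSIX arithmetic context flag. */
int demo_inarith;

/* Records the flag value observed inside the nested context. */
int demo_seen_in_comsub = -1;

/* Stand-in for the commands run inside the command substitution. */
void demo_comsub_body(void)
{
    demo_seen_in_comsub = demo_inarith;
}

/* Run a nested context with the flag cleared, then restore it. */
void demo_run_comsub(void (*body)(void))
{
    int saved = demo_inarith;   /* save the caller's context */

    demo_inarith = 0;           /* the comsub starts as a non-arithmetic context */
    body();
    demo_inarith = saved;       /* restore on the way out */
}
```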

Reported-by: @stephane-chazelas
Resolves: https://github.com/ksh93/ksh/issues/326
2021-09-13 04:57:37 +02:00
Martijn Dekker
a2196f9434 Fix backtick comsubs by making them act like $(modern) ones
ksh93 currently has three command substitution mechanisms:
- type 1: old-style backtick comsubs that use a pipe;
- type 3: $(modern) comsubs that use a temp file, currently with
  fallback to a pipe if a temp file cannot be created;
- type 2: ${ shared-state; } comsubs; same as type 3, but shares
  state with parent environment.

Type 1 is buggy. There are at least two reproducers that make it
hang. The Red Hat patch applied in 4ce486a7 fixed a hang in
backtick comsubs but reintroduced another hang that was fixed in
ksh 93v-. So far, no one has succeeded in making pipe-based comsubs
work properly.

But, modern (type 3) comsubs use temp files. How does it make any
sense to have two different command substitution mechanisms at the
execution level? The specified functionality between backtick and
modern command substitutions is exactly the same; the difference
*should* be purely syntactic.

So this commit removes the type 1 comsub code at the execution
level, treating them all like type 3 (or 2). As a result, the
related bugs vanish while the regression tests all pass.

The only side effect that I can find is that the behaviour of bug
https://github.com/ksh93/ksh/issues/124 changes for backtick
comsubs. But it's broken either way, so that's neutral.

So this commit can now be added to my growing list of ksh93 issues
fixed by simply removing code.

src/cmd/ksh93/sh/xec.c:
- Remove special code for type 1 comsubs from iousepipe(),
  sh_iounpipe(), sh_exec() and _sh_fork().

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/subshell.c:
- Remove pipe support from sh_subtmpfile(). This also removes the
  use of a pipe as a fallback for $(modern) comsubs. Instead, panic
  and error out if temp file creation fails. If the shell cannot
  create a temporary file, there are fatal system problems anyway
  and a script should not continue.
- No longer pass comsub type to sh_subtmpfile().
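
  The temp-file approach can be illustrated with a small POSIX
  sketch that captures a function's standard output. It is not the
  sh_subtmpfile() code: the demo_* names are hypothetical, and this
  sketch returns -1 where the shell would panic and error out if the
  temp file cannot be created.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical stand-in for the command run in the substitution. */
void demo_hello(void)
{
    printf("hello\n");
}

/*
 * Run fn() with standard output redirected to a temporary file, then
 * read back at most size-1 bytes into buf. Returns the number of
 * bytes captured, or -1 on failure. No pipe is involved.
 */
long demo_capture(void (*fn)(void), char *buf, size_t size)
{
    FILE *tmp;
    int saved;
    size_t n;

    if (size == 0 || !(tmp = tmpfile()))
        return -1;
    fflush(stdout);
    saved = dup(STDOUT_FILENO);
    if (saved < 0 || dup2(fileno(tmp), STDOUT_FILENO) < 0)
    {
        if (saved >= 0)
            close(saved);
        fclose(tmp);
        return -1;
    }
    fn();
    fflush(stdout);                 /* push buffered output into the temp file */
    dup2(saved, STDOUT_FILENO);     /* restore the original standard output */
    close(saved);
    rewind(tmp);                    /* the file offset is shared with fd 1 */
    n = fread(buf, 1, size - 1, tmp);
    buf[n] = '\0';
    fclose(tmp);
    return (long)n;
}
```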

All other changes:
- Update sh_subtmpfile() calls.

src/cmd/ksh93/tests/subshell.sh:
- Add two regression tests based on reproducers from bug reports.

Resolves: https://github.com/ksh93/ksh/issues/305
Resolves: https://github.com/ksh93/ksh/issues/316
2021-08-13 09:14:11 +02:00
Martijn Dekker
2aad3cab06 Add ksh 93u+m contributors notice to 964 copyright headers
2021-04-26 00:19:31 +01:00
Martijn Dekker
13c57e4b58 Fix 'unset -f' to work in subshells without forking (re: 047cb330)
This commit implements unsetting functions in virtual subshells,
removing the need for the forking workaround. This is done by
either invalidating the function found in the current subshell
function tree by unsetting its NV_FUNCTION attribute bits (which
will cause sh_exec() to skip it) or, if the function exists in a
parent shell, by creating an empty dummy subshell node in the
current function tree without that attribute.

As a beneficial side effect, it seems that bug 228 (unset -f fails
in forked subshells if a function is defined before forking) is now
also fixed.

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/init.c:
- Add sh.fun_base for a saved pointer to the main shell's function
  tree for checking when in a subshell, analogous to sh.var_base.

src/cmd/ksh93/bltins/typeset.c: unall():
- Remove the fork workaround.
- When unsetting a function found in the current function tree
  (troot) and that tree is not sh.fun_base (which checks if we're
  in a virtual subshell in a way that handles shared-state command
  substitutions correctly), then do not delete the function but
  invalidate it by unsetting its NV_FUNCTION attribute bits.
- When unsetting a function not found in the current function tree,
  search for it in sh.fun_base and if found, add an empty dummy
  node to mask the parent shell environment's function. The dummy
  node will not have NV_FUNCTION set, so sh_exec() will skip it.
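
  The masking idea can be sketched with a toy scoped lookup. This is
  not how ksh's name-value trees are implemented; DemoNode, DemoTree
  and demo_find_function() are hypothetical, and a plain is_function
  field stands in for the NV_FUNCTION attribute bits.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical node: is_function == 0 marks an invalidated or dummy node. */
typedef struct
{
    const char *name;
    int is_function;
} DemoNode;

/* Hypothetical function tree with a link to the parent environment. */
typedef struct DemoTree
{
    const DemoNode *nodes;
    size_t count;
    const struct DemoTree *parent;
} DemoTree;

/*
 * Search the current tree first, then the parents. The first node
 * with a matching name decides: a dummy node in the subshell hides
 * the parent's function without touching the parent's tree.
 */
const DemoNode *demo_find_function(const DemoTree *tree, const char *name)
{
    for (; tree; tree = tree->parent)
    {
        size_t i;

        for (i = 0; i < tree->count; i++)
        {
            if (strcmp(tree->nodes[i].name, name) == 0)
                return tree->nodes[i].is_function ? &tree->nodes[i] : NULL;
        }
    }
    return NULL;
}
```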

src/cmd/ksh93/sh/subshell.c:
- sh_subfuntree(): For 'unset -f' to work correctly with
  shared-state command substitutions (subshares), this function
  needs a fix similar to the one applied to sh_assignok() for
  variables in commit 911d6b06. Walk up on the subshells tree until
  we find a non-subshare.
- sh_subtracktree(): Apply the same fix for the hash table.
- Remove table_unset() and incorporate an updated version of its
  code in sh_subshell(). As of ec888867, this function was only
  used to clean up the subshell function table as the alias table
  no longer exists.
- sh_subshell():
  * Simplify the loop to free the subshell hash table.
  * Add table_unset() code, slightly refactored for readability.
    Treat dummy nodes now created by unall() separately to avoid a
    memory leak; they must be nv_delete()d without passing the
    NV_FUNCTION bits. For non-dummy nodes, turn on the NV_FUNCTION
    attribute in case they were invalidated by unall(); this is
    needed for _nv_unset() to free the function definition.

src/cmd/ksh93/tests/subshell.sh:
- Update the test for multiple levels of subshell functions to test
  a subshare as well. While we're at it, add a very similar test
  for multiple levels of subshell variables that was missing.
- Add @JohnoKing's reproducer from #228.

src/cmd/ksh93/tests/leaks.sh:
- Add leak tests for unsetting functions in a virtual subshell.
  Test both the simple unset case (unall() creates a dummy node)
  and the define/unset case (unall() invalidates existing node).

Resolves: https://github.com/ksh93/ksh/issues/228
2021-04-24 06:57:49 +01:00
Martijn Dekker
d50d3d7c4c Reset arithmetic recursion level on all errors (re: 264ba48b)
The recursion level for arithmetic expressions is kept track of in
a static 'level' variable in streval.c. It is reset when arithmetic
expressions throw an error.

But an error for an arithmetic expression may also occur elsewhere
-- at least in one case: when an arithmetic expression attempts to
change a read-only variable. In that case, the recursion level is
never reset because that code does not have access to the static
'level' variable.

If many such conditions occur (as in the new readonly.sh regression
tests), an arithmetic command like 'i++' may eventually fail with a
'recursion too deep' error.

To mitigate the problem, MAXLEVEL in streval.c was changed from 9
to 1024 in 264ba48b (as in the ksh 93v- beta). This commit leaves
that increase, but adds a proper fix.

src/cmd/ksh93/include/defs.h:
- Add global sh.arithrecursion (a.k.a. shp->arithrecursion)
  variable to keep track of the arithmetic recursion level,
  replacing the static 'level' variable in streval.c.

src/cmd/ksh93/sh/xec.c: sh_exec():
- Reset sh.arithrecursion before starting a new simple command
  (TCOM), a new subshell with parentheses (TPAR), a new pipe
  (TFIL), or a new [[ ... ]] command (TTST). These are the same
  places where 'echeck' is set to 1 for --errexit and ERR trap
  checks, so it should cover everything.

src/cmd/ksh93/sh/streval.c:
- Change all uses of 'level' to sh.arithrecursion.
- _seterror, aritherror(): No longer bother to reset the level
  to zero here; xec.c should have this covered for all cases now.
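
  The failure mode and the fix can be sketched as follows. The
  demo_* names, the small DEMO_MAXLEVEL and the return codes are
  hypothetical; in the shell, the error path leaves the evaluator
  abruptly instead of returning a code.

```c
#define DEMO_MAXLEVEL 4     /* deliberately small for demonstration */

/* Hypothetical stand-in for the global arithmetic recursion level. */
int demo_arithrecursion;

/*
 * Evaluate one expression. Returns 0 on success, -1 for "recursion
 * too deep", -2 for an error raised outside the evaluator. The error
 * path never decrements the level, which is how the level leaks.
 */
int demo_arith_eval(int fail)
{
    if (demo_arithrecursion >= DEMO_MAXLEVEL)
        return -1;
    demo_arithrecursion++;
    if (fail)
        return -2;          /* level is left incremented */
    demo_arithrecursion--;
    return 0;
}

/* The fix: reset the level whenever a new command starts. */
void demo_start_command(void)
{
    demo_arithrecursion = 0;
}
```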

src/cmd/ksh93/tests/arith.sh:
- Add tests for main shell and subshell.
2021-04-11 01:25:19 +01:00
Johnothan King
a065558291 Fix more compiler warnings, typos and other minor issues (#260)
Many of these changes are minor typo fixes. The other changes
(which are mostly compiler warning fixes) are:

NEWS:
- The --globcasedetect shell option works on older Linux kernels
  when used with FAT32/VFAT file systems, so remove the note about
  it only working with 5.2+ kernels.

src/cmd/ksh93/COMPATIBILITY:
- Update the documentation on function scoping with an addition
  from ksh93v- (this does apply to ksh93u+).

src/cmd/ksh93/edit/emacs.c:
- Check for '_AST_ksh_release', not 'AST_ksh_release'.

src/cmd/INIT/mamake.c,
src/cmd/INIT/ratz.c,
src/cmd/INIT/release.c,
src/cmd/builtin/pty.c:
- Add more uses of UNREACHABLE() and noreturn, this time for the
  build system and pty.

src/cmd/builtin/pty.c,
src/cmd/builtin/array.c,
src/cmd/ksh93/sh/name.c,
src/cmd/ksh93/sh/nvtype.c,
src/cmd/ksh93/sh/suid_exec.c:
- Fix six -Wunused-variable warnings (the name.c nv_arrayptr()
  fixes are also in ksh93v-).
- Remove the unused 'tableval' function to fix a -Wunused-function
  warning.

src/cmd/ksh93/sh/lex.c:
- Remove unused 'SHOPT_DOS' code, which isn't enabled anywhere.
  https://github.com/att/ast/issues/272#issuecomment-354363112

src/cmd/ksh93/bltins/misc.c,
src/cmd/ksh93/bltins/trap.c,
src/cmd/ksh93/bltins/typeset.c:
- Add dictionary generator function declarations for former
  aliases that are now builtins (re: 1fbbeaa1, ef1621c1, 3ba4900e).
- For consistency with the rest of the codebase, use '(void)'
  instead of '()' for print_cpu_times.

src/cmd/ksh93/sh/init.c,
src/lib/libast/path/pathshell.c:
- Move the otherwise unused EXE macro to pathshell() and only
  search for 'sh.exe' on Windows.

src/cmd/ksh93/sh/xec.c,
src/lib/libast/include/ast.h:
- Add an empty definition for inline when compiling with C89.
  This allows the timeval_to_double() function to be inlined.

src/cmd/ksh93/include/shlex.h:
- Remove the unused 'PIPESYM2' macro.

src/cmd/ksh93/tests/pty.sh:
- Add '# err_exit #' to count the regression test added in
  commit 113a9392.

src/lib/libast/disc/sfdcdio.c:
- Move diordwr, dioread, diowrite and dioexcept behind
  '#ifdef F_DIOINFO' to fix one -Wunused-variable warning and
  multiple -Wunused-function warnings (sfdcdio() only uses these
  functions when F_DIOINFO is defined).

src/lib/libast/string/fmtdev.c:
- Fix two -Wimplicit-function-declaration warnings on Linux by
  including sys/sysmacros.h in fmtdev().
2021-04-08 19:58:07 +01:00
Johnothan King
814b5c6890 Fix various minor problems and update the documentation (#237)
These are minor fixes I've accumulated over time. The following
changes are somewhat notable:

- Added a missing entry for 'typeset -s' to the man page.
- Add strftime(3) to the 'see also' section. This and the date(1)
  addition are meant to add onto the documentation for 'printf %T'.
- Removed the man page entry for ksh reading $PWD/.profile on
  login. That feature was removed in commit aa7713c2.
- Added date(1) to the 'see also' section of the man page.
- Note that the 'hash' command can be used instead of 'alias -t' to
  work around one of the caveats listed in the man page.
- Use an 'out of memory' error message rather than 'out of space'
  when memory allocation fails.
- Replaced backticks with quotes in some places for consistency.
- Added missing documentation for the %P date format.
- Added missing documentation for the printf %Q and %p formats
  (backported from ksh2020: https://github.com/att/ast/pull/1032).
- The comments that show each builtin's options have been updated.
2021-03-21 14:39:03 +00:00
Martijn Dekker
7b0e0776e2 cleanup: remove legacy code for systems without fork(2)
In 2021, it seems like it's about time to join the 21st century
and officially require fork(2). In practice this was already the
case as the legacy code was unmaintained and didn't compile.
2021-03-21 06:39:32 +00:00
Johnothan King
6d63b57dd3 Re-enable SHOPT_DEVFD, fixing process substitution fd leaks (#218)
This commit fixes a long-standing bug (present since at least
ksh93r) that caused a file descriptor leak when passing a process
substitution to a function, or (if compiled with SHOPT_SPAWN) to a
nonexistent command.

The leaks only occurred when ksh was compiled with SHOPT_DEVFD; the
FIFO method was unaffected.

src/cmd/ksh93/sh/xec.c: sh_exec():
- When a process substitution is passed to a built-in, the
  remaining file descriptor is closed with sh_iorestore. Do the
  same thing when passing a process substitution to a function.
  This is done by delaying the sh_iorestore() call to 'setexit:'
  where both built-ins and functions terminate and set the exit
  status ($?).
  This means that call now will not be executed if a longjmp is
  done, e.g. due to an error in a special built-in. However, there
  is already another sh_iorestore() call in main.c, exfile(), line
  418, that handles that scenario.
- sh_ntfork() can fail, so rather than assume it will succeed,
  handle a failure by closing extra file descriptors with
  sh_iorestore(). This fixes the leak on command not found with
  SHOPT_SPAWN.

src/cmd/ksh93/include/defs.h:
- Since the file descriptor leaks are now fixed, remove the
  workaround that forced ksh to use the FIFO method.

src/cmd/ksh93/SHOPT.sh:
- Add SHOPT_DEVFD as a configurable option (default: probe).

src/cmd/ksh93/tests/io.sh:
- Add a regression test for the 'not found' file descriptor leak.
- Add a test to ensure it keeps working with 'command'.

Fixes: https://github.com/ksh93/ksh/issues/67
2021-03-13 13:46:42 +00:00
Johnothan King
c3eac977ea Fix unused process substitutions hanging (#214)
On systems where ksh needs to use the older and less secure FIFO
method for process substitutions (which is currently all of them as
the more modern and solid /dev/fd method is still broken, see #67),
process substitutions could leave background processes hanging in
these two scenarios:

1. If the parent process exits without opening a pipe to the child
   process forked by the process substitution. The fifo_check()
   function in xec.c, which is periodically called to check if the
   parent process still exists while waiting for it to open the
   FIFO, verified the parent process's existence by checking if the
   PPID had reverted to 1, the traditional PID of init. However,
   POSIX specifies that the PPID can revert to any implementation-
   defined system process in that case. So this breaks on certain
   systems, causing unused process substitutions to hang around
   forever as they never detect that the parent disappeared.
   The fix is to save the current PID before forking and having the
   child check if the PPID has changed from that saved PID.

2. If a command invoked from the main shell is passed a process
   substitution, but terminates without opening the pipe to the
   process substitution. In that case, the parent process never
   disappears in the first place, because the parent process is the
   main shell. So the same infinite wait occurs in unused process
   substitutions, even after correcting problem 1.
   The fix is to remember all FIFOs created for any number of
   process substitutions passed to a single command, and unlink any
   remaining FIFOs as they represent unused process substitutions.
   Unlinking the FIFOs causes sh_open() in the child to fail with
   ENOENT on the next periodic check, which can easily be handled.

Fixing these problems causes the FIFO method to act identically to
the /dev/fd method, which is good for compatibility. Even when #67
is fixed this will still be important, as ksh also runs on systems
that do not have /dev/fd (such as AIX, HP-UX, and QNX), so will
fall back to using FIFOs.

--- Fix problem 1 ---

src/cmd/ksh93/sh/xec.c:
- Add new static fifo_save_ppid variable.
- sh_exec(): If a FIFO is defined, save the current PID in
  fifo_save_ppid for the forked child to use.
- fifo_check(): Compare PPID against the saved value instead of 1.
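
The corrected check can be sketched as below. The demo_parent_gone
name is hypothetical; the sketch only shows the comparison against a
PID saved before forking instead of against the constant 1.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical check for use in a forked child. saved_ppid is the
 * parent's own PID, recorded with getpid() before fork(). If the
 * parent exits, the child is reparented to some system process,
 * which is not necessarily PID 1, so getppid() stops matching.
 */
int demo_parent_gone(pid_t saved_ppid)
{
    return getppid() != saved_ppid;
}
```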

--- Fix problem 2 ---

To keep things simple I'm abusing the name-value pair routines used
for variables for this purpose. The overhead is negligible. A more
elegant solution is possible but would involve adding more code.

src/cmd/ksh93/include/defs.h: _SH_PRIVATE:
- Define new sh.fifo_tree pointer to a new FIFO cleanup tree.

src/cmd/ksh93/sh/args.c: sh_argprocsubs():
- After launching a process substitution in the background,
  add the FIFO to the cleanup list before freeing it.

src/cmd/ksh93/sh/xec.c:
- Add fifo_cleanup() that unlinks all FIFOs in the cleanup list and
  clears/closes the list. They should only still exist if the
  command never used them, however, just run 'unlink' and don't
  check for existence first as that would only add overhead.
- sh_exec():
  * Call fifo_cleanup() on finishing all simple commands (when
    setting $?) or when a special builtin fails.
  * When forking, clear/close the cleanup list; we do not want
    children doing duplicate cleanup, particularly as this can
    interfere when using multiple process substitutions in one
    command.
  * Process substitution handling:
    > Change FIFO check frequency from 500ms to 50ms.
      Note that each check sends a signal that interrupts open(2),
      causing sh_open() to reinvoke it. This causes sh_open() to
      fail with ENOENT on the next check when the FIFO no longer
      exists, so we do not need to add an additional check for
      existence to fifo_check(). Unused process substitutions now
      linger for a maximum of 50ms.
    > Do not issue an error message if errno == ENOENT.
- sh_funct(): Process substitutions can be passed to functions as
  well, and we do not want commands within the function to clean up
  the FIFOs for the process substitutions passed to it from the
  outside. The problem is solved by simply saving fifo_tree in a
  local variable, setting it to null before running the function,
  and cleaning it up before restoring the parent one at the end.
  Since sh_funct() is called recursively for multiple-level
  function calls, this correctly gives each function a locally
  scoped fifo_tree.
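
The save/clear/restore scheme described for sh_funct() can be sketched
roughly as follows. This is a minimal illustration, not the ksh code: a
plain linked list stands in for the cleanup tree, and all names
(fifo_list, fifo_remember, run_scoped, sample_body) and the path names
are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for the FIFO cleanup tree: a list of path names. */
struct fifo_node {
    struct fifo_node *next;
    char *path;
};

struct fifo_node *fifo_list;    /* current cleanup scope; NULL when empty */

/* Remember a FIFO for later cleanup. Returns 0, or -1 if out of memory. */
int fifo_remember(const char *path)
{
    struct fifo_node *n;
    assert(path != NULL);
    n = malloc(sizeof *n);
    if (!n)
        return -1;
    n->path = malloc(strlen(path) + 1);
    if (!n->path) {
        free(n);
        return -1;
    }
    strcpy(n->path, path);
    n->next = fifo_list;
    fifo_list = n;
    return 0;
}

/* Unlink every remembered FIFO and empty the list; returns the entry count. */
int fifo_cleanup(void)
{
    int count = 0;
    while (fifo_list) {
        struct fifo_node *n = fifo_list;
        fifo_list = n->next;
        (void)unlink(n->path);  /* no existence check; a failure is ignored */
        free(n->path);
        free(n);
        count++;
    }
    return count;
}

/* Example function body: registers a FIFO of its own. */
int sample_body(void)
{
    return fifo_remember("/nonexistent/inner.fifo");
}

/* Run 'body' with a fresh cleanup scope, then restore the caller's scope. */
int run_scoped(int (*body)(void))
{
    struct fifo_node *saved = fifo_list;
    int status;
    fifo_list = NULL;       /* the body must not see the caller's FIFOs */
    status = body();
    fifo_cleanup();         /* clean up only what the body registered */
    fifo_list = saved;
    return status;
}
```

Because run_scoped() keeps the saved pointer in a local variable, nested
calls each get their own scope without any extra bookkeeping.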

--- Tests ---

src/cmd/ksh93/tests/io.sh:
- Add tests covering the failing scenarios.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2021-03-12 11:43:23 +00:00
hyenias
5aba0c7251
Fix set/unset state for short integer (typeset -si) (#211)
This commit fixes at least three bugs:
1. When issuing 'typeset -p' for unset variables typeset as short
   integer, a value of 0 was incorrectly displayed.
2. ${x=y} and ${x:=y} were still broken for short integer types
   (re: 9f2389ed). ${x+set} and ${x:+nonempty} were also broken.
3. A memory fault could occur if typeset -l followed a -s option
   with integers. Additionally, now the last -s/-l wins out as the
   option to utilize instead of it always being short.

src/cmd/ksh93/include/name.h:
- Fix the nv_isnull() macro by removing the direct exclusion of
  short integers from this set/unset test. This breaks few things
  (only ${.sh.subshell} and ${.sh.level}, as far as we can tell)
  while potentially correcting many aspects of short integer use
  (at least bugs 1 and 2 above), as this macro is widely used.
- union Value: add new pid_t *pidp pointer member for PID values
  (see further below).

src/cmd/ksh93/bltins/typeset.c: b_typeset():
- To fix bug 3 above, unset the 'shortint' flag and NV_SHORT
  attribute bit upon encountering the -l option.

*** To fix ${.sh.subshell} to work with the new nv_isnull():

src/cmd/ksh93/include/defs.h:
- Add new 'realsubshell' member to the shgd (aka shp->gd) struct
  which will be the integer value for ${.sh.subshell}.

src/cmd/ksh93/sh/init.c,
src/cmd/ksh93/data/variables.c:
- Initialize SH_SUBSHELLNOD as a pointer to shgd->realsubshell
  instead of using a short value (.s) directly. Using a pointer
  allows nv_isnull() to return a positive for ${.sh.subshell} as
  a non-null pointer is what it checks for.
- While we're at it, initialize PPIDNOD ($PPID) and SH_PIDNOD
  (${.sh.pid}) using the new pidp union member, which is more
  correct as they are values of type pid_t.
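
Why a pointer helps here can be shown with a small sketch. It assumes,
as stated above, that the set/unset test only looks for a non-null
pointer; the type and function names below are hypothetical and much
simpler than the real name-value node.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified value union: pointer first, inline short second. */
union value {
    int  *ip;   /* points at storage owned elsewhere */
    short s;    /* value stored inline in the node */
};

struct node {
    union value val;
};

/* A node counts as unset when its value pointer is null. */
int node_is_null(const struct node *np)
{
    assert(np != NULL);
    return np->val.ip == NULL;
}

/* Bind the node to external integer storage. */
void node_bind_int(struct node *np, int *storage)
{
    np->val.ip = storage;
}

/* Read the integer value; an unset node reads as 0. */
int node_get_int(const struct node *np)
{
    return node_is_null(np) ? 0 : *np->val.ip;
}
```

With the pointer form, a node whose value is 0 is still distinguishable
from an unset node, which an inline short cannot guarantee.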

src/cmd/ksh93/sh/subshell.c,
src/cmd/ksh93/sh/xec.c:
- Update the ${.sh.subshell} increases/decreases to refer to
  shgd->realsubshell (a.k.a. shp->gd->realsubshell).

*** To fix ${.sh.level} after changing nv_isnull():

src/cmd/ksh93/sh/macro.c: varsub():
- Add a specific exception for SH_LEVLNOD to the nv_isnull() test,
  so that ${.sh.level} is always considered to be set. Its handling
  throughout the code is too complex/special for a simple fix, so
  we have to special-case it, at least for now.

*** Regression test additions:

src/cmd/ksh93/tests/attributes.sh:
- Add in missing short integer tests and correct the one that
  existed. The -si test now yields 'typeset -x -r -s -i foo'
  instead of 'typeset -x -r -s -i foo=0' which brings it in line
  with all the others.
- Add in some other -l attribute tests for floats. Note, -lX test
  was not added as the size of long double is platform dependent.

src/cmd/ksh93/tests/variables.sh:
- Add tests for ${x=y} and ${x:=y} used on short int variables.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2021-03-08 04:19:36 +00:00
Johnothan King
7ad274f8b6
Add more out of memory checks (re: 18529b88) (#192)
The referenced commit neglected to add checks for strdup() calls.
strdup() calls malloc() as well, and is used a lot.

This commit switches to another strategy: it adds wrapper functions
for all the allocation macros that check if the allocation
succeeded, so those checks don't need to be done manually.

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/init.c:
- Add sh_malloc(), sh_realloc(), sh_calloc(), sh_strdup(),
  sh_memdup() wrapper functions with success checks. Call nospace()
  to error out if allocation fails.
- Update new_of() macro to use sh_malloc().
- Define new sh_newof() macro to replace newof(); it uses
  sh_realloc().
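
The wrapper strategy can be sketched as below. This is an illustration
under assumptions, not the code from this commit: the names
(checked_malloc etc., die_nomem) are hypothetical, and die_nomem() is a
simple stand-in for the shell's own fatal error routine.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fatal error routine; never returns. */
void die_nomem(void)
{
    fputs("out of memory\n", stderr);
    exit(EXIT_FAILURE);
}

/* malloc() that either succeeds or terminates the program. */
void *checked_malloc(size_t size)
{
    void *p = malloc(size ? size : 1);  /* avoid a legal NULL for size 0 */
    if (!p)
        die_nomem();
    return p;
}

/* realloc() that either succeeds or terminates the program. */
void *checked_realloc(void *ptr, size_t size)
{
    void *p = realloc(ptr, size ? size : 1);
    if (!p)
        die_nomem();
    return p;
}

/* strdup() equivalent built on the checked allocator. */
char *checked_strdup(const char *s)
{
    size_t len;
    assert(s != NULL);
    len = strlen(s) + 1;                /* include the terminating NUL */
    return memcpy(checked_malloc(len), s, len);
}
```

Callers can then use the result directly, since a failed allocation
never returns to them.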

All other changed files:
- Replace the relevant calls with the wrappers.
- Remove now-redundant success checks from 18529b88.
- The ERROR_PANIC error message calls are updated to inclusive-or
  ERROR_SYSTEM into the exit code argument, so libast's error()
  appends the human-readable version of errno in square brackets.
  See src/lib/libast/man/error.3

src/cmd/ksh93/edit/history.c:
- Include "defs.h" to get access to the wrappers even if KSHELL is
  not defined.
- Since we're here, fix a compile error that occurred with KSHELL
  undefined by updating the type definition of hist_fname[] to
  match that of history.h.

src/cmd/ksh93/bltins/enum.c:
- To get access to sh_newof(), include "defs.h" instead of
  <shell.h> (note that "defs.h" includes <shell.h> itself).

src/cmd/ksh93/Mamfile:
- enum.c: depend on defs.h instead of shell.h.
- enum.o: add an -I. flag in the compiler invocation so that defs.h
  can find its subsequent includes.

src/cmd/builtin/pty.c:
- Define one outofmemory() function and call that instead of
  repeating the error message call.
- outofmemory() never returns, so remove superfluous exit handling.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2021-02-27 21:21:58 +00:00
Martijn Dekker
18529b88c6 Add lots of checks for out of memory (re: 0ce0b671)
Huge typeset -L/-R adjustment length values were still causing
crashes on systems with not enough memory. They should error out
gracefully instead of crashing.

This commit adds out of memory checks to all malloc/calloc/realloc
calls that didn't have them (which is all but two or three).

The stkalloc/stakalloc calls don't need the checks; they have
automatic checking, which is done by passing a pointer to the
outofspace() function to the stakinstall() call in init.c.

src/lib/libast/include/error.h:
- Change the ERROR_PANIC exit status value from ERROR_LEVEL (255)
  to 77, which is what it is supposed to be according to the libast
  error.3 manual page. Exit statuses > 128 for anything other than
  signals are not POSIX compliant and may cause misbehaviour.

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/init.c:
- To facilitate consistency, add a simple extern sh_outofmemory()
  function that throws an ERROR_PANIC "out of memory".

src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/data/builtins.c:
- Remove now-redundant e_nospace[] extern message; it is now only
  used in one place so it might as well be a string literal in
  sh_outofmemory().

All other changed files:
- Verify the result of all malloc/calloc/realloc calls and call
  sh_outofmemory() if they fail.
2021-02-21 22:27:28 +00:00
Martijn Dekker
c2cb0eae19 Make 'read' compatible with Shift-JIS
This commit fixes a bug in the 'read' built-in: it did not properly
skip over multibyte characters. The bug never affects UTF-8 locales
because all bytes of multibyte UTF-8 characters have the high-order
bit set. But Shift-JIS characters may include a byte corresponding
to the ASCII backslash character, which caused buggy behaviour when
using 'read' without the '-r' option that disables backslash escape
processing.
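
The difference between byte-wise and character-wise scanning can be
sketched as follows. This is only an illustration: it hard-codes the
Shift-JIS lead byte ranges (0x81-0x9F and 0xE0-0xFC) instead of using
the locale-aware multibyte macros, and all function names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Lead byte of a two-byte Shift-JIS character (simplified assumption). */
int sjis_is_lead(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Naive scan: counts every 0x5C byte, even inside a two-byte character. */
size_t count_backslashes_bytewise(const unsigned char *s, size_t len)
{
    size_t i, n = 0;
    for (i = 0; i < len; i++)
        if (s[i] == '\\')
            n++;
    return n;
}

/* Character-aware scan: skips both bytes of a two-byte character. */
size_t count_backslashes_sjis(const unsigned char *s, size_t len)
{
    size_t i = 0, n = 0;
    assert(s != NULL || len == 0);
    while (i < len) {
        if (sjis_is_lead(s[i]) && i + 1 < len) {
            i += 2;     /* the trail byte is part of the character */
            continue;
        }
        if (s[i] == '\\')
            n++;
        i++;
    }
    return n;
}
```

For the byte sequence 0x83 0x5C (a two-byte Shift-JIS character whose
trail byte equals the ASCII backslash), the naive scan reports a
backslash while the character-aware scan does not.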

It also makes the regression tests compatible with Shift-JIS
locales. They failed with syntax errors.

src/cmd/ksh93/bltins/read.c:
- Use the multibyte macros when skipping over word characters.
  Based on a patch from the old ast-developers mailing list:
  https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01848.html

src/cmd/ksh93/include/defs.h:
- Be a bit smarter about causing the compiler to optimise out
  multibyte code when SHOPT_MULTIBYTE is disabled. See the updated
  comment for details.

src/cmd/ksh93/tests/locale.sh:
- Put all the supported locales in an array for future tests.
- Add test for the 'read' bug. Include it in a loop that tests
  64 SHIFT-JIS character combinations. Only one fails on old ksh:
  the one where the final byte corresponds to the ASCII backslash.
  It doesn't hurt to test all the others anyway.

src/cmd/ksh93/tests/basic.sh,
src/cmd/ksh93/tests/builtins.sh,
src/cmd/ksh93/tests/quoting2.sh:
- Fix syntax errors that occurred in SHIFT-JIS locales as the
  parser was processing literal UTF-8 characters. Not executing
  that code is not enough; we need to make sure it never gets
  parsed as well. This is done by wrapping the commands containing
  literal UTF-8 strings in an 'eval' command as a single-quoted
  operand.

.github/workflows/ci.yml:
- Run the tests in the ja_JP.SJIS locale instead of ja_JP.UTF-8.
  UTF-8 is already covered by the nl_NL.UTF-8 test run; that should
  be good enough.
2021-02-18 16:07:12 +00:00
Martijn Dekker
af5f7acf99 Fix bugs related to --posix shell option (re: 921bbcae, f45a0f16)
This fixes the following:
1. 'set --posix' now works as an equivalent of 'set -o posix'.
2. The posix option turns off braceexpand and turns on letoctal.
   Any attempt to override that in a single command such as 'set -o
   posix +o letoctal' was quietly ignored. This now works as long
   as the overriding option follows the posix option in the command.
3. The --default option to 'set' now stops the 'posix' option, if
   set or unset in the same 'set' command, from changing other
   options. This allows the command output by 'set +o' to correctly
   restore the current options.

src/cmd/ksh93/data/builtins.c:
- To make 'set --posix' work, we must explicitly list it in
  sh_set[] as a supported option so that AST optget(3) recognises
  it and won't override it with its own default --posix option,
  which converts the optget(3) string to a POSIX getopt(3) string.
    This means it will appear as a separate entry in --man output,
  whether we want it to or not. So we might as well use it as an
  example to document how --optionname == -o optionname, replacing
  the original documentation that was part of the '-o' description.

src/cmd/ksh93/sh/args.c: sh_argopts():
- Add handling for explicit --posix option in data/builtins.c.
- Move SH_POSIX syncing SH_BRACEEXPAND and SH_LETOCTAL from
  sh_applyopts() into the option parsing loop here. This fixes
  the bug that letoctal was ignored in 'set -o posix +o letoctal'.
- Remember if --default was used in a flag, and do not sync options
  with SH_POSIX if the flag is set. This makes 'set +o' work.
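
The effect of syncing inside the parsing loop can be sketched like
this. It is a simplified model, not the sh_argopts() code: the option
bits, the opt_change struct and apply_changes() are hypothetical, and
the behaviour on turning posix off (braceexpand on, letoctal off) is an
assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical option bits. */
enum {
    OPT_POSIX       = 1u << 0,
    OPT_BRACEEXPAND = 1u << 1,
    OPT_LETOCTAL    = 1u << 2
};

/* One option change as it appears on the command line. */
struct opt_change {
    unsigned opt;   /* which option */
    int      on;    /* nonzero: turn on; zero: turn off */
};

/*
 * Apply the changes in order. Because the posix side effects are applied
 * at the moment the posix option is seen, any later option overrides them.
 * If no_sync is nonzero (modelling --default), posix changes nothing else.
 */
unsigned apply_changes(unsigned opts, const struct opt_change *ch,
                       size_t n, int no_sync)
{
    size_t i;
    assert(ch != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (ch[i].on)
            opts |= ch[i].opt;
        else
            opts &= ~ch[i].opt;
        if (ch[i].opt == OPT_POSIX && !no_sync) {
            if (ch[i].on) {
                opts &= ~OPT_BRACEEXPAND;
                opts |= OPT_LETOCTAL;
            } else {
                opts |= OPT_BRACEEXPAND;
                opts &= ~OPT_LETOCTAL;
            }
        }
    }
    return opts;
}
```

Applying the side effects after the whole loop instead would overwrite
an explicit '+o letoctal' that followed '-o posix', which matches the
bug described above.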

src/cmd/ksh93/include/argnod.h,
src/cmd/ksh93/data/msg.c,
src/cmd/ksh93/sh/args.c: sh_printopts():
- Do not potentially translate the 'on' and 'off' labels in 'set
  -o' output. No other shell does, and some scripts parse these.

src/cmd/ksh93/sh/init.c: sh_init():
- Turn on SH_LETOCTAL early along with SH_POSIX if the shell was
  invoked as sh; this makes 'sh -o' and 'sh +o' show expected
  options (not that anyone does this, but correctness is good).

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/include/shell.h:
- The state flags were in defs.h and most (but not all) of the
  shell options were in shell.h. Gather all the shell state and
  option flag definitions into one place in shell.h for clarity.
- Remove unused SH_NOPROFILE and SH_XARGS option flags.

src/cmd/ksh93/tests/options.sh:
- Add tests for these bugs.

src/lib/libast/misc/optget.c: styles[]:
- Edit default optget(3) option self-documentation for clarity.

Several changed files:
- Some SHOPT_PFSH fixes to avoid compiling dead code.
2021-02-14 23:51:19 +00:00
Martijn Dekker
2182ecfa08 Fix compile/regress fails on compiling without SHOPT_* options
Many compile-time options were broken so that they could not be
turned off without causing compile errors and/or regression test
failures. This commit now allows the following to be disabled:

SHOPT_2DMATCH    # two dimensional ${.sh.match} for ${var//pat/str}
SHOPT_BGX        # one SIGCHLD trap per completed job
SHOPT_BRACEPAT   # C-shell {...,...} expansions (, required)
SHOPT_ESH        # emacs/gmacs edit mode
SHOPT_HISTEXPAND # csh-style history file expansions
SHOPT_MULTIBYTE  # multibyte character handling
SHOPT_NAMESPACE  # allow namespaces
SHOPT_STATS      # add .sh.stats variable
SHOPT_VSH        # vi edit mode

The following still break ksh when disabled:

SHOPT_FIXEDARRAY # fixed dimension indexed array
SHOPT_RAWONLY    # make viraw the only vi mode
SHOPT_TYPEDEF    # enable typeset type definitions

Compiling without SHOPT_RAWONLY just gives four regression test
failures in pty.sh, but turning off SHOPT_FIXEDARRAY and
SHOPT_TYPEDEF causes compilation to fail. I've managed to tweak the
code to make it compile without those two options, but then dozens
of regression test failures occur, often in things that have nothing
directly to do with those options. It looks like the separation between the
code for these options and the rest was never properly maintained.
Making it possible to disable SHOPT_FIXEDARRAY and SHOPT_TYPEDEF
may involve major refactoring and testing and may not be worth it.

This commit has far too many tweaks to list. Notable fixes are:

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/data/options.c:
- Do not compile in the shell options and documentation for
  disabled features (braceexpand, emacs/gmacs, vi/viraw), so the
  shell is not left with no-op options and inaccurate self-doc.

src/cmd/ksh93/data/lexstates.c:
- Comment the state tables to associate them with their IDs.
- In the ST_MACRO table (sh_lexstate9[]), do not make the S_BRACE
  state for position 123 (ASCII for '{') conditional upon
  SHOPT_BRACEPAT (brace expansion), otherwise disabling this causes
  glob patterns of the form {3}(x) (matching 3 x'es) to stop
  working as well -- and that is ksh globbing, not brace expansion.

src/cmd/ksh93/edit/edit.c: ed_read():
- Fixed a bug: SIGWINCH was not handled by the gmacs edit mode.

src/cmd/ksh93/sh/name.c: nv_putval():
- The -L/-R left/right adjustment options to typeset do not count
  zero-width characters. This is the behaviour with SHOPT_MULTIBYTE
  enabled, regardless of locale. Of course, what a zero-width
  character is depends on the locale, but control characters are
  always considered zero-width. So, to avoid a regression, add some
  fallback code for non-SHOPT_MULTIBYTE builds that skips ASCII
  control characters (as per iscntrl(3)) so they are still
  considered to have zero width.

src/cmd/ksh93/tests/shtests:
- Export the SHOPT_* macros from SHOPT.sh to the tests as
  environment variables, so the tests can check for them and decide
  whether or how to run tests based on the compile-time options
  that the tested binary was presumably compiled with.
- Do not run the C.UTF-8 tests if SHOPT_MULTIBYTE is not enabled.

src/cmd/ksh93/tests/*.sh:
- Add a bunch of checks for SHOPT_* env vars. Since most should
  have a value 0 (off) or 1 (on), the form ((SHOPT_FOO)) is a
  convenient way to use them as arithmetic booleans.

.github/workflows/ci.yml:
- Make GitHub do more testing: run two locale tests (Dutch and
  Japanese UTF-8 locales), then disable all the SHOPTs that we can
  currently disable, recompile ksh, and run the tests again.
2021-02-08 22:02:45 +00:00
Martijn Dekker
66e1d44642 command -x: fix efficiency; always run external cmd (re: acf84e96)
This commit fixes 'command -x' to adapt to OS limitations with
regards to data alignment in the arguments list. A feature test is
added that detects if the OS aligns the argument on 32-bit or
64-bit boundaries or not at all, allowing 'command -x' to avoid
E2BIG errors while maximising efficiency.

Also, as of now, 'command -x' is a way to bypass built-ins and
run/query an external command. Built-ins do not limit the length of
their argument list, so '-x' never made sense to use for them. And
because '-x' hangs on Linux and macOS on every ksh93 release
version to date (see acf84e96), few use it, so there is little
reason not to make this change.

Finally, this fixes a longstanding bug that caused the minimum exit
status of 'command -x' to be 1 if a command with many arguments was
divided into several command invocations. This is done by replacing
broken flaggery with a new SH_XARG state flag bit.

src/cmd/ksh93/features/externs:
- Add new C feature test detecting byte alignment in args list.
  The test writes a #define ARG_ALIGN_BYTES with the amount of
  bytes the OS aligns arguments to, or zero for no alignment.

src/cmd/ksh93/include/defs.h:
- Add new SH_XARG state bit indicating 'command -x' is active.

src/cmd/ksh93/sh/path.c: path_xargs():
- Leave extra 2k in the args buffer instead of 1k, just to be sure;
  some commands add large environment variables these days.
- Fix a bug in subtracting the length of existing arguments and
  environment variables. 'size -= strlen(cp)-1;' subtracts one less
  than the size of cp, which makes no sense; what is necessary is
  to subtract the length plus one to account for the terminating
  zero byte, i.e.: 'size -= strlen(cp)+1'.
- Use the ARG_ALIGN_BYTES feature test result to match the OS's
  data alignment requirements.
- path_spawn(): E2BIG: Change to checking SH_XARG state bit.

src/cmd/ksh93/bltins/whence.c: b_command():
- Allow combining -x with -p, -v and -V with the expected results
  by setting P_FLAG to act like 'whence -p'. E.g., as of now,
	command -xv printf
  is equivalent to
	whence -p printf
  but note that 'whence' has no equivalent of 'command -pvx printf'
  which searches $(getconf PATH) for a command.
- When -x will run a command, now set the new SH_XARG state flag.

src/cmd/ksh93/sh/xec.c: sh_exec():
- Change to using the new SH_XARG state bit.
- Skip the check for built-ins if SH_XARG is active, so that
  'command -x' now always runs an external command.

src/lib/libcmd/date.c, src/lib/libcmd/uname.c:
- These path-bound builtins sometimes need to run the external
  system command by the same name, but they did that by hardcoding
  an unportable direct path. Now that 'command -x' runs an external
  command, change this to using 'command -px' to guarantee using
  the known-good external system utility in the default PATH.
- In date.c, fix the format string passed to 'command -px date'
  when setting the date; it was only compatible with BSD systems.
  Use the POSIX variant on non-BSD systems.
2021-01-30 06:53:19 +00:00
Martijn Dekker
17ebfbf6a3 Fix I/O redirection in -c script (Solaris patch 280-23332860)
This change is pulled from here:
https://github.com/oracle/solaris-userland/blob/master/components/ksh93/patches/280-23332860.patch

Info and reproducers:
https://github.com/att/ast/issues/36

In a -c script (like ksh -c 'commands'), the last command
misredirects standard output if an EXIT or ERR trap is set.
This appears to be a side effect of the optimisation that
runs the last command without forking.

This applies a patch by George Lijo that flags these specific
cases and disables the optimisation.

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/bltins/trap.c,
src/cmd/ksh93/sh/init.c,
src/cmd/ksh93/sh/main.c,
src/cmd/ksh93/sh/xec.c:
- Apply patch as above.

src/cmd/ksh93/tests/io.sh:
- Add the reproducers from the bug report as regression tests.
2021-01-08 15:15:53 +00:00
Martijn Dekker
bae02c39b6 Fix for argv for setuid scripts (Solaris patch 115-CR6934836)
This change is pulled from here:
https://github.com/oracle/solaris-userland/blob/master/components/ksh93/patches/115-CR6934836.patch

Unfortunately there is no publicly available documentation on what
this does or why it was needed. We just have to assume the Solaris
people knew what they were doing. ksh2020 upstreamed this too (as
well as all the other Solaris patches applied here).
2021-01-08 11:28:33 +00:00
Martijn Dekker
222515bf08 Implement hash tables for virtual subshells (re: 102868f8, 9d428f8f)
The forking fix implemented in 102868f8 and 9d428f8f, which stops
the main shell's hash table from being cleared if PATH is changed
in a subshell, can cause a significant performance penalty for
certain scripts that do something like

    ( PATH=... command foo )

in a subshell, especially if done repeatedly. This is because the
hash table is cleared (and hence a subshell forks) even for
temporary PATH assignments preceding commands.

It also just plain doesn't work. For instance:

    $ hash -r; (ls) >/dev/null; hash
    ls=/bin/ls

Simply running an external command in a subshell caches the path in
the hash table that is shared with a main shell. To remedy this, we
would have to fork the subshell before forking any external
command. And that would be an unacceptable performance regression.

Virtual subshells do not need to fork when changing PATH if they
get their own hash tables. This commit adds these. The code for
alias subshell trees (which was removed in ec888867 because they
were broken and unneeded) provided the beginning of a template for
their implementation.

src/cmd/ksh93/sh/subshell.c:
- struct subshell: Add strack pointer to subshell hash table.
- Add sh_subtracktree(): return pointer to subshell hash table.
- sh_subfuntree(): Refactor a bit for legibility.
- sh_subshell(): Add code for cleaning up subshell hash table.

src/cmd/ksh93/sh/name.c:
- nv_putval(): Remove code to fork a subshell upon resetting PATH.
- nv_rehash(): When in a subshell, invalidate a hash table entry
  for a subshell by creating the subshell scope if needed, then
  giving that entry the NV_NOALIAS attribute to invalidate it.

src/cmd/ksh93/sh/path.c: path_search():
- To set a tracked alias/hash table entry, use sh_subtracktree()
  and pass the HASH_NOSCOPE flag to nv_search() so that any new
  entries are added to the current subshell table (if any) and do
  not influence any parent scopes.

src/cmd/ksh93/bltins/typeset.c: b_alias():
- b_alias(): For hash table entries, use sh_subtracktree() instead
  of forking a subshell. Keep forking for normal aliases.
- setall(): To set a tracked alias/hash table entry, pass the
  HASH_NOSCOPE flag to nv_search() so that any new entries are
  added to the current subshell table (if any) and do not influence
  any parent scopes.

src/cmd/ksh93/sh/init.c: put_restricted():
- Update code for clearing the hash table (when changing $PATH) to
  use sh_subtracktree().

src/cmd/ksh93/bltins/cd_pwd.c:
- When invalidating path name bindings to relative paths, use the
  subshell hash tree if applicable by calling sh_subtracktree().
- rehash(): Call nv_rehash() instead of _nv_unset()ting the hash
  table entry; this is needed to work correctly in subshells.

src/cmd/ksh93/tests/leaks.sh:
- Add leak tests for various PATH-related operations in the main
  shell and in a virtual subshell.
- Several pre-existing memory leaks are exposed by the new tests
  (I've confirmed these in 93u+). The tests are disabled and marked
  TODO for now, as these bugs have not yet been fixed.

src/cmd/ksh93/tests/subshell.sh:
- Update.

Resolves: https://github.com/ksh93/ksh/issues/66
2021-01-07 22:18:25 +00:00
Harald van Dijk
41ef7f76cf Invocation: fix infinite loop on 'ksh +s'
When starting ksh +s, it gets stuck in an infinite loop continually
trying to parse its own binary as a shell script and rejecting it:

$ arch/linux.i386-64/bin/ksh +s
arch/linux.i386-64/bin/ksh: arch/linux.i386-64/bin/ksh: cannot execute [Exec format error]
arch/linux.i386-64/bin/ksh: arch/linux.i386-64/bin/ksh: cannot execute [Exec format error]
arch/linux.i386-64/bin/ksh: arch/linux.i386-64/bin/ksh: cannot execute [Exec format error]
arch/linux.i386-64/bin/ksh: arch/linux.i386-64/bin/ksh: cannot execute [Exec format error]
arch/linux.i386-64/bin/ksh: arch/linux.i386-64/bin/ksh: cannot execute [Exec format error]
[...]
$ echo 'echo "this is stdin"' | arch/linux.i386-64/bin/ksh +s
arch/linux.i386-64/bin/ksh: arch/linux.i386-64/bin/ksh: cannot execute [Exec format error]
(no loop, but still ksh trying to parse itself)

src/cmd/ksh93/sh/init.c: sh_init():
- When forcing on the '-s' option upon finding no command
  arguments, also update sh.offoptions, a.k.a. shp->offoptions.
  This avoids the inconsistent state causing this problem.

  In main.c, there is:

  if(sh_isoption(SH_SFLAG))
      fdin = 0;
  else
      (code to open $0 as a file)

  This was entering the else block because sh_isoption(SH_SFLAG)
  was returning 0, and $0 is set to the ksh binary as it is
  supposed to when no other script is provided. When I looked for
  why sh_isoption was returning 0, I found main.c's

  for(i=0; i<elementsof(shp->offoptions.v); i++)
      shp->options.v[i] &= ~shp->offoptions.v[i];

  Before this loop, shp->offoptions tracks which options were
  explicitly disabled by the user on the command line. The effect
  of this loop is to make "explicitly disabled" take precedence
  over "implicitly enabled". My patch removes the registration of
  the +s option.
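
The interaction between the two option sets can be sketched as
follows. This is a simplified model with hypothetical names (optset,
opt_set, apply_off) and sizes; only the masking loop mirrors the code
quoted above.

```c
#include <assert.h>
#include <stddef.h>

#define OPT_WORDS 4u    /* hypothetical number of words in an option set */
#define OPT_BITS  32u   /* bits used per word */

typedef struct {
    unsigned long v[OPT_WORDS];
} optset;

/* Turn one option bit on or off. */
void opt_set(optset *o, unsigned bit, int on)
{
    assert(bit < OPT_WORDS * OPT_BITS);
    if (on)
        o->v[bit / OPT_BITS] |= 1UL << (bit % OPT_BITS);
    else
        o->v[bit / OPT_BITS] &= ~(1UL << (bit % OPT_BITS));
}

/* Return 1 if the option bit is on, 0 otherwise. */
int opt_is_on(const optset *o, unsigned bit)
{
    assert(bit < OPT_WORDS * OPT_BITS);
    return (int)((o->v[bit / OPT_BITS] >> (bit % OPT_BITS)) & 1UL);
}

/* "Explicitly disabled" takes precedence over "implicitly enabled". */
void apply_off(optset *options, const optset *off)
{
    size_t i;
    for (i = 0; i < OPT_WORDS; i++)
        options->v[i] &= ~off->v[i];
}
```

If an option is forced on but still registered in the "off" set, the
masking loop silently clears it again; removing the registration first
keeps it on.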

Fixes: https://github.com/ksh93/ksh/issues/150
Co-authored-by: Martijn Dekker <martijn@inlv.org>
2021-01-03 23:54:36 +00:00
Martijn Dekker
79d1945813 Streamline some shell state flaggery
src/cmd/ksh93/sh/args.c: sh_argprocsub():
- Save and restore state more efficiently by just saving and
  restoring all the state bits in one go using the
  sh_{get,set}state() macros, which are defined in defs.h as:
    #define	sh_getstate()	(sh.st.states)
    #define	sh_setstate(x)	(sh.st.states = (x))
  (and there is yet more evidence that it doesn't matter whether
  we use a 'shp->' pointer or 'sh.' direct access).

src/cmd/ksh93/sh/main.c: exfile():
- Remove a no-op 'sh_offstate(SH_INTERACTIVE);'. It was in the
  'else' clause of 'if(sh_isstate(SH_INTERACTIVE))' so if we get
  there, it is known that this flag is already off.
- To properly disable job control, we also have to save and restore
  the job.jobcontrol variable.

src/cmd/ksh93/sh/xec.c: sh_exec():
- Remove some no-op flaggery from this highly performance-sensitive
  point in the code. Given the immediately preceding:
	volatile int	was_errexit = sh_isstate(SH_ERREXIT);
	volatile int	was_monitor = sh_isstate(SH_MONITOR);
  the following:
	sh_offstate(SH_ERREXIT);
	if(was_errexit&flags)
		sh_onstate(SH_ERREXIT);
  can be reformulated as:
	if(!(flags & sh_state(SH_ERREXIT)))
		sh_offstate(SH_ERREXIT);
  (IOW, if it was already on, don't turn it off and then on again)
  ...and the following:
	if(was_monitor&flags)
		sh_onstate(SH_MONITOR);
  can be removed; it's a no-op because it wasn't preceded by an
  sh_offstate() and if 'was_monitor' is true, this option is known
  to be on. (I considered they may have forgotten an 'sh_offstate'
  there like in the SH_ERREXIT case, but adding that causes several
  regressions in a shtests run.)
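
That the SH_ERREXIT reformulation is behaviour-preserving can be
checked exhaustively on a small model. The sketch below assumes the
state word and the flags argument use the same bit for errexit; the
mask value and function names are hypothetical.

```c
#include <assert.h>

#define ST_ERREXIT 0x04u    /* hypothetical bit mask for the errexit state */

/* Old form: always turn the bit off, then turn it back on if needed. */
unsigned old_form(unsigned states, unsigned flags)
{
    unsigned was_errexit = states & ST_ERREXIT;
    states &= ~ST_ERREXIT;
    if (was_errexit & flags)
        states |= ST_ERREXIT;
    return states;
}

/* New form: only turn the bit off if the flags do not ask for it. */
unsigned new_form(unsigned states, unsigned flags)
{
    if (!(flags & ST_ERREXIT))
        states &= ~ST_ERREXIT;
    return states;
}

/* Compare both forms for every combination of four state and flag bits. */
void check_forms_agree(void)
{
    unsigned states, flags;
    for (states = 0; states < 16; states++)
        for (flags = 0; flags < 16; flags++)
            assert(old_form(states, flags) == new_form(states, flags));
}
```

Neither form ever turns the bit on when it was off before, which is why
the off-then-on sequence can be collapsed into a single conditional.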

src/cmd/ksh93/include/defs.h:
- Remove comment that is evidently long outdated; there is not (or
  no longer) a Shscoped_t type defined anywhere, nor are these
  struct fields replicated in any other type definition.
- Add comment to clarify what the 'states' int in 'struct
  sh_scoped' is for.
2020-10-02 23:58:21 +02:00
Martijn Dekker
7424844df5 Remove SH_SUBSHARE option vestiges
Mildly interesting: apparently there was once an idea to implement
shared-state command substitutions as a shell option like 'set -o
subshare'. They were implemented using a new ${ syntax; } instead,
but there is a vestigial SH_SUBSHARE option ID in shell.h plus a
check for it in subshell.c that would cause backtick-style command
substitutions (comsub==1) to share their state. That option isn't
defined in data/options.c so it's impossible for a user to set it.

src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/sh/subshell.c:
- Remove SH_SUBSHARE option vestiges.

src/cmd/ksh93/include/defs.h:
- Correct my comment on 'comsub' flag; I was wrong about what the
  values meant. 2 is for a shared-state comsub. (re: 4ce486a7)
2020-10-01 16:58:03 +02:00
Martijn Dekker
4ce486a7a4 Fix hang in comsubs (rhbz#1062296) (re: 970069a6)
The new command substitution mechanism imported in 970069a6 from
Red Hat patches introduced this bug: backtick-style command
substitutions hang when processing about 117KiB of data or more.

It is fixed by another Red Hat patch:
642af4d6/f/ksh-20140415-hokaido.patch

It saves the value of the shp->comsub flag so that it is set to 2
(usually meaning new-style $(comsubs)) in two specific cases even
when processing backtick comsubs. This stops the sh_subtmpfile()
function in subshell.c from creating a /tmp file. However, I think
that approach is quite ugly, so I'm taking a slightly different one
that has the same effect.

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/subshell.c:
- Redefine sh_subtmpfile() to pass the comsub flag as an argument.
  (Remove the shp pointer argument, which is redundant; a pointer
  to the shell state can easily be obtained in the function.)

src/cmd/ksh93/sh/xec.c: sh_exec():
- Apply the Red Hat fix by passing flag 2 to sh_subtmpfile().

src/cmd/ksh93/tests/subshell.sh:
- Move regress test from ce68e1be from basic.sh to here; this is
  the place for command substitution tests as they are subshells.
- Add regress test for this bug.

All other changed files:
- Update sh_subtmpfile() calls to pass on the shp->comsub flag.
2020-09-24 06:07:12 +02:00
Martijn Dekker
3654ee73c0 Fix typeset -l/-u crash on special vars (rhbz#1083713)
When using typeset -l or -u on a variable that cannot be changed
when the shell is in restricted mode, ksh crashed.

This fix is inspired by this Red Hat fix, which is incomplete:
642af4d6/f/ksh-20120801-tpstl.patch

The crash was caused by the nv_shell() function. It walks through a
discipline function tree to get the pointer to the interpreter
associated with it. Evidently, the problem is that some pointer in
that walk is not set correctly for all special variables.

Thing is, ksh only has one shell language interpreter, and only one
global data structure (called 'sh') to keep its main state[*]. Yet,
the code is full of 'shp' pointers to that structure. Most (not
all) functions pass that pointer around to each other, accessing
that struct indirectly, ostensibly to account for the non-existent
possibility that there might be more than one interpreter state.
The "why" of that is an interesting cause for speculation that I
may get to sometime. For now, it is enough to know that, in the
code as it is, it matters not one iota what pointer to the shell
interpreter state is used; they all point to the same thing (unless
it's broken, as in this bug).

So, rather than fixing nv_shell() and/or associated pointer
assignments, this commit simply removes it, and replaces it with
calls to sh_getinterp(), which always returns a pointer to sh (see
init.c, where that function is defined as literally 'return &sh').

[*] Defined in shell.h, with the _SH_PRIVATE part in defs.h

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/name.c:
- Remove nv_shell().

src/cmd/ksh93/sh/init.c:
- In all the discipline functions for special variables, initialise
  shp using sh_getinterp() instead of nv_shell().

src/cmd/ksh93/tests/variables.sh:
- Add regression test for typeset -l/-u on all special variables.
2020-09-24 03:03:29 +02:00
Martijn Dekker
843b546c1a rm redundant getpid(2) syscalls (re: 9de65210)
Now that we have ${.sh.pid} a.k.a. shgd->current_pid, which is
updated using getpid() whenever forking a new process, there is no
need for anything else to ever call getpid(); we can use the stored
value instead. There were a lot of these syscalls kicking around,
some of them in performance-sensitive places.
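
The caching idea can be sketched as below. This is a minimal model, not
the shell's fork code: current_pid here is a plain global, and the
function names are hypothetical.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

pid_t current_pid;      /* cached process ID; hypothetical global */

/* Fill the cache once at startup. */
void pid_cache_init(void)
{
    current_pid = getpid();
}

/* fork() wrapper that refreshes the cache in the child. */
pid_t fork_and_update(void)
{
    pid_t child;
    assert(current_pid != 0);       /* cache must be initialised first */
    child = fork();
    if (child == 0)
        current_pid = getpid();     /* the only other getpid() call needed */
    return child;
}

/*
 * Fork a child that compares its cache with getpid().
 * Returns 1 if they match, 0 if not, -1 if fork or wait failed.
 */
int child_cache_is_fresh(void)
{
    int status;
    pid_t child = fork_and_update();
    if (child < 0)
        return -1;
    if (child == 0)
        _exit(current_pid == getpid() ? 0 : 1);
    if (waitpid(child, &status, 0) != child)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

As long as every fork goes through the wrapper, the cached value stays
correct in both parent and child, so no other code needs the syscall.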

The following lists only changes *other* than changing getpid() to
shgd->current_pid.

src/cmd/ksh93/include/defs.h:
- Comments: clarify what shgd->{pid,ppid,current_pid} are for.

src/cmd/ksh93/sh/main.c,
src/cmd/ksh93/sh/init.c:
- On reinit for a new script, update shgd->{pid,ppid,current_pid}
  in the sh_reinit() function itself instead of calling sh_reinit()
  from sh_main() and then updating those immediately after that
  call. It just makes more sense this way. Nothing else ever calls
  sh_reinit() so there are no side effects.

src/cmd/ksh93/sh/xec.c: _sh_fork():
- Update shgd->current_pid in the child early, so that the rest of
  the function can use it instead of calling getpid() again.
- Remove reassignment of SH_PIDNOD->nvalue.lp value pointer to
  shgd->current_pid (which makes ${.sh.pid} work in the shell).
  It's constant and was already set on init.
2020-09-23 04:19:02 +02:00