1
0
Fork 0
mirror of git://git.code.sf.net/p/cdesktopenv/code synced 2025-02-15 04:32:24 +00:00
Commit graph

1083 commits

Author SHA1 Message Date
Martijn Dekker
5a1ec3c9ff sfio: fix "SF_SYNC redefined" warning on FreeBSD
FreeBSD defines an SF_SYNC macro in sys/socket.h that conflicts
with sfio's SF_SYNC discipline, at best rendering it ineffective.

src/lib/libast/sfio/sfhdr.h:
- Temporarily undef __BSD_VISIBLE while including <sys/socket.h>
  to hide the BSD extension with the conflicting definition.
2022-01-20 05:54:45 +00:00
Martijn Dekker
2c4b05b4f8 tie up standards macros loose ends (re: 289dd46c)
src/lib/libast/features/standards:
- Do not emit #defines for the typ u_long test which is only used
  as a heuristic in subsequent tests in this file. (Note that 'set'
  can set and unset any iffe command-line --option at runtime.)
- Remove definition of _ISOC99_SOURCE macro. This is another old
  GNU thing; feature_test_macros(7) says invoking the compiler with
  the option -std=c99 has the same effect. But modern GCC has C11
  with GNU extensions as the default, which is fine. If a
  particular standard is desired, pass a -std=... flag in $CC.

src/cmd/ksh93/features/rlimits:
- Remove overlooked Linux *64* types/functions hackery.
  After defining standards macros it caused a build failure
  on at least one version of Void Linux (but not 5.15.14_1).
  Thanks to @JohnoKing for the report.

src/cmd/ksh93/sh/subshell.c,
src/lib/libdll/dllnext.c:
- Remove now-redundant local definitions of _GNU_SOURCE and
  __EXTENSIONS__ macros.

src/cmd/ksh93/tests/builtins.sh:
- Fix broken sed invocation (re: 41829efa).
2022-01-20 05:50:00 +00:00
Martijn Dekker
41829efa06 Various minor cleanups and fixes
The more notable ones are:

src/lib/libast/features/standards:
- Do not redefine _GNU_SOURCE and _FILE_OFFSET_BITS if already
  defined from $CCFLAGS. Thanks to @hyanias for the heads-up.
  (re: 289dd46c)

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/sh/args.c,
src/cmd/ksh93/sh/name.c:
- Remove -T test code activation option. It was basically unused.
  The only thing it did was intentionally introduce a memory leak
  in table_unset() if the 4th bit in the option argument was set.
  A search in ast-open-history reveals a few more trivial test uses
  that were later deleted, but nothing interesting.

src/cmd/ksh93/tests/{basic,path}.sh:
- Skip a couple of tests on AIX avoid hangs, at least one of which
  is not ksh's fault. Thanks to @HansH111 for the report.

src/cmd/ksh93/tests/builtins.sh:
- Change one awk use to a more portable sed invocation to placate
  systems with ancient awk commands, such as AIX. (re: de795e1f)
2022-01-20 00:54:42 +00:00
Martijn Dekker
289dd46c05 build: include standards macros for all AST code (re: 7fb814e1)
Turns out that the standards macros set by features/standards (such
as _GNU_SOURCE for Linux or _DARWIN_SOURCE for macOS) were still
*not* included for most C source files! Instead, they were
selectively included for some files only, sometimes via
FEATURE/standards (the output of features/standards), sometimes
via ast_standards.h which is copied from FEATURE/standards.

Consequently, there were still inconsistencies in the system header
interfaces exposed on Linux, macOS, Solaris, et al. It's no wonder
it sometimes took so much hackery to keep everything building.

Of course, making this consistent had to break things somewhere.
Breakage occurred on 32-bit Linux due to a lot of ugly hackery
involving direct use of internal GNU types like off64_t and
functions like fseek64(). This is now all removed and they are
activated by setting the appropriate feature macro instead, so
these types and functions can be used with their standard names
(off_t, fseek, etc.)

Before committing I've tested these changes on the following
i386/x86_64 systems: Linux (glibc 32 and 64 bit, musl libc 64 bit),
Solaris (32 and 64 bit), illumos (32 and 64 bit), FreeBSD (64 bit),
macOS (64 bit), Cygwin (32 bit), and Haiku (64 bit).

(Note: ast_standards.h is copied from FEATURE/standards, whereas
ast_common.h is copied from FEATURE/common.)

src/lib/libast/include/ast_std.h,
src/lib/libast/stdio/stdhdr.h:
- Include <ast_standards.h> first. This should cause all the AST
  and dependent code (such as ksh) to get the standards macros.

src/lib/libast/features/standards:
- For GNU (glibc), #define _FILE_OFFSET_BITS 64 to get large file
  support with 64-bit offsets.
- Stop GNU and Cygwin <string.h> form defining the GNU version of
  basename(3); on Cygwin, that declaration conflicts with the AST
  version (and with POSIX) by using a const char* argument instead
  of char*. It is deactivated by defining the macro 'basename' (as
  'basename'); this causes GNU string.h to consider it to be
  already defined by the standard libgen.h header.

All other changed files:
- Remove direct use of *64* types and functions and a lot of
  related hackery.
2022-01-20 00:53:22 +00:00
Johnothan King
fb8e239cb4 Fix build error on AIX 64-bit (#428)
This commit adds a wrapper for the AIX ar command that uses the
-X64 flag to avoid build errors on that platform.

Resolves: https://github.com/ksh93/ksh/issues/385
2022-01-17 20:25:06 +00:00
Martijn Dekker
0310ce08ea Fix build on Cygwin (again) (re: 24fc1bbc, 6faf4379)
Commit 24fc1bbc broke the build on Cygwin in comp/setlocale.c by no
longer defining _GNU_SOURCE on that system in features/standards.
This caused wcwidth() to be hidden by wchar.h though it was
detected in the libraries.

src/lib/libast/features/standards:
- Detect Cygwin along with GNU as a system on which to define
  _GNU_SOURCE.
- Add wcwidth() compilation as an extra heuristic to the BSD,
  SunOS, Darwin and GNU/Cygwin tests. (Since it's specified as an
  optional (X/Open) feature, it should not be tested for in the
  generic fallbacks.)
2022-01-17 20:24:05 +00:00
Martijn Dekker
e569f23ef9 bump internal libast version; various minor cleanups
These are minor things I accumulated over the last month or so.

Notable changes:

src/lib/libast/features/api,
src/lib/libast/misc/state.c,
src/lib/libast/comp/conf.tab,
src/cmd/ksh93/include/defs.h:
- Bump internal libast version to 20220101L. We've made a few
  additions to the API, at least pathicase (see 71934570, ca3ec200)
  and astconf_long (see c2ac69b2), so this should have been done
  already. This also updates '/opt/ast/bin/getconf _AST_VERSION'.
- Use AST_VERSION instead of outdated _AST_VERSION.
- In state.c, use AST_VERSION instead of hardcoding the version.

src/cmd/ksh93/sh/xec.c:
- Remove 'restorefd' variable, unused as of 42becab6.
- Remove 'cmdrecurse' function and SH_RUNPROG macro; this was once
  used by a few libcmd commands, but ast-open-archive reveals it's
  unused as of ast 1999-12-25.

src/cmd/ksh93/sh/*.c:
- Where available, use e_dot instead of "." for consistency; it is
  defined as an extern so we might as well use it.

src/cmd/ksh93/tests/*.sh:
- When reporting signal names in fails, include the SIG prefix.
- Fix a broken process hang test in subshell.sh.

src/lib/libast/man/sfdisc.3:
- Removed. The interfaces described here never made it out of AT&T;
  they do not exist in any libast version in ast-open-archive.
  Resolves: https://github.com/ksh93/ksh/issues/426
2022-01-14 19:55:35 +00:00
Johnothan King
07fc64f52b Fix use after free bug in discipline functions (#424)
This fixes one of the ASan failures in the variables.sh regression
tests. Explanation from <https://github.com/att/ast/issues/1268>:

> The problem is caused by this block of code freeing the Namfun_t*
> (via the call to chktfree()):
> https://github.com/ksh93/ksh/blob/307bc3ed/src/cmd/ksh93/sh/nvdisc.c#L570-L577
>> 570  else
>> 571  {
>> 572          struct blocked *bp;
>> 573          action = vp->disc[type];
>> 574          vp->disc[type] = 0;
>> 575          if(!(bp=block_info(np,(struct blocked*)0)) || !isblocked(bp,UNASSIGN))
>> 576                  chktfree(np,vp);
>> 577  }
> That invalidates the value stored in vp which is dereferenced here:
> https://github.com/ksh93/ksh/blob/307bc3ed/src/cmd/ksh93/sh/nvdisc.c#L411-L421
>> 419          unblock(bp,type);
>> 420          if(!vp->disc[type])
>> 421                  chktfree(np,vp);

ksh2020 commit:
https://github.com/att/ast/commit/df1e8165

src/cmd/ksh93/sh/nvdisc.c:
- Block nv_setdisc from freeing the memory associated with the vp pointer.
2022-01-14 19:51:24 +00:00
Johnothan King
307bc3edce time: Fix precision bug in times(3) fallback (#425)
In the times(3) fallback for the time keyword (which can be enabled
in xec.c by undefining _lib_getrusage and timeofday), ksh will
print the obtained time incorrectly if TIMEFORMAT is set to use a
precision level of three:
   $ TIMEFORMAT=$'\nreal\t%3lR'
   $ time sleep .080
   real 0m00.008s  # Should be '00.080s'
This commit corrects that issue by using 10^precision to get the
correct fractional scaling. Note that the fallback still doesn't
support a true precision level of three (times(3) alone doesn't
support it), so this in effect pads a zero to the end of the output
when the precision level is three.

Additional change to tests/builtins.sh:
- While fixing the above issue I found out that ksh93v- broke
  support for passing microseconds to the sleep builtin in the form
  of <num>U. I've added a regression test for that bug to ensure it
  isn't backported to ksh93u+m by accident.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2022-01-13 12:25:22 +00:00
Johnothan King
40b2c3c3d4 alias: Avoid unnecessary forks (re: ec888867) (#417)
The code used to fork subshells when creating/changing aliases will
always fork, even when the alias tree isn't changed:
   $ echo $(unalias --man 2> /dev/null; echo $$ ${.sh.pid})
   375097 375107
   $ alias foo=bar; echo $(alias -p foo; echo $$ ${.sh.pid})
   alias foo=bar 375097 375110
This is a bit inefficient, so this commit avoids forking a subshell
unless at least one change is made to the alias table.

src/cmd/ksh93/bltins/typeset.c:
- b_alias(), b_unalias(): Remove sh_subfork() calls.
- setall(): When creating an alias (name contains '='), fork a
  virtual subshell before calling nv_open() to add the node.
- unall():
  - When unsetting all aliases (-a), fork subshell before dtclear().
  - When unsetting one alias, fork subshell before nv_delete().
  - Move sh_pushcontext() and sh_popcontext() expansions so that
    sh_subfork() is not in between them, as that would cause
    program flow corruption or a crash.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2022-01-13 01:56:19 +00:00
Martijn Dekker
b509e92241 edit: do not enable multiline mode with no editor active
If neither gmacs/emacs nor vi are active, the multiline mode should
not be enabled even if the multiline option is on. Doing so can
cause inconsistent behaviour, particularly in multibyte locales
where, if the shell is compiled with SHOPT_RAWONLY (as is default),
the no-editor mode is actually handled by vi.c.

Also, the new --histreedit and --histverify options only work in
the emacs or vi editors, or in no-editor mode when handled by vi.
Which means they cannot ever work if neither emacs or vi were
compiled in (i.e. SHOPT_ESH and SHOPT_VSH were both disabled).
In that case, there's no point in compiling in those options.
Come to think of it, the same applies to the multiline option.

All changed files:
- Update SHOPT_ESH/SHOPT_VSH preprocessor directives as per above.

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/include/shell.h:
- Move definitions of history expansion-related options to shell.h,
  which is where all the other shell options are defined.
2022-01-12 20:39:05 +00:00
Martijn Dekker
a5700d3937 Expose --histreedit and --histverify options (re: 921bbcae)
This adds two long-form shell options that modify history expansion
(-H/--histexpand). If --histreedit is on and a history expansion
fails, the command line is reloaded into the next prompt's edit
buffer, allowing corrections. If --histverify is on, the results of
a history expansion are not immediately executed but instead loaded
into the next prompt's edit buffer, allowing further changes.

SH_HISTREEDIT and SH_HISTVERIFY were already handled all along in
slowread() in io.c and via 'reedit' arguments to functions called
there, but could not be turned on as they were only ever exposed as
shopt options in the removed bash compatibility mode (in theory
only, as it failed to compile). I had overlooked them until now.

So, since the code is there, let's expose these options through the
normal long options interface. They're working fine, and activating
them actually makes history expansion tolerable to use.

src/cmd/ksh93/data/options.c:
- Make these options available as "histreedit" and "histverify".

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/sh.1:
- Document the "new" options.

src/cmd/ksh93/include/defs.h:
- Remove unused SH_HISTAPPEND and SH_HISTORY2 macros which were
  part of the removed bash compatibility code. Note that ksh does
  not need a histappend option as it never overwrites the history
  file (in the bash mode, this shopt option was a no-op).
2022-01-12 20:38:30 +00:00
Martijn Dekker
2d4a787564 Fix comsub hang on subshell fork (re: 090b65e7)
The referenced commit introduced a bug that caused command
substitutions to hang, writing infinite zero bytes, when
redirecting standard output on a built-in comand that forks the
command substitution subshell.

The bug was caused by removing the fork when redirecting standard
output in a non-permanent manner. However, simply reintroducing the
fork causes multiple regressions that we had fixed in the meantime.

Thankfully, it looks like this forking workaround is only necessary
when redirecting the output of built-ins. It appears that moving
workaround from io.c to the built-ins handling code in sh_exec() in
xec.c, right before calling sh_redirect(), allows reintroducing the
forking workaround for non-permanent redirections without causing
other regressions.

It would be better if the underlying cause of the hang were fixed
so the workaround becomes unnecessary, but I don't think that is
going to happen any time soon (AT&T didn't manage, either).

src/cmd/ksh93/sh/io.c: sh_redirect():
- Remove forking workaround for redirecting stdout in a comsub.

src/cmd/ksh93/sh/xec.c: sh_exec(): TCOM: built-ins handling code:
- Reimplement the workaround here.

Resolves: https://github.com/ksh93/ksh/issues/416
2022-01-12 20:30:20 +00:00
Martijn Dekker
f711da9081 Make process substitutions work on Haiku
On Haiku:

    # /bin/cat <(echo hi)	   # no redirection
    cat: /tmp/ksh.f29pd8f: No such file or directory

Whereas this works fine:

    # /bin/cat < <(echo hi)	   # with redirection
    hi
    # /opt/ast/bin/cat <(echo hi)  # no redirection; use built-in
    hi

Haiku does not have /dev/fd, so uses the FIFO (named pipe) fallback
mechanism. See also: c3eac977

Analysis: In the TFORK part of sh_exec(), forked branch (child),
the FIFO (sh.fifo) is unlinked immediately after opening it. This
is not a problem if the process substitution is used in combination
with a redirection, but if not, then the FIFO is passed on to the
command as a file name argument. This creates a race condition: ksh
was counting on the external 'cat' command opening the FIFO before
the child could unlink it. Whether that race is won depends on
operating system implementation details. When invoking an external
command on Haiku, the race is lost.

src/cmd/ksh93/sh/xec.c: sh_exec(): TFORK: child branch:
- Delay unlinking the FIFO until after executing the process
  substitution, when we're about to exit from the child process.
2022-01-12 20:30:02 +00:00
Martijn Dekker
de5bdd12a4 iffe: some more tweaks
src/cmd/INIT/iffe.sh:
- Remove obsolete $posix_noglob/$posix_glob code to disable and
  re-enable pathname expansion. Instead, globally disable it using
  'set -o noglob'. Except a couple of 'rm' commands, no aspect of
  this code uses globbing (note: 'case' patterns do not count), nor
  do any of the shell code feature tests that are eval'ed by iffe.
  Field splitting is heavily used via unquoted variable expansions;
  globbing should not interfere. (Concrete example: one of the test
  notes in src/cmd/ksh93/features/options is "SHOPT_* option probe"
  and that SHOPT_* is taken as a file name pattern, so could be
  expanded.) For the rm commands, globbing is turned back on in a
  subshell. If this breaks something that I've missed, hopefully
  that will turn up soon.
- Remove $show/$SHOW code to massage different versions of 'echo'
  into acting consistently; instead, use a show_test() function
  that uses the POSIX-specified printf command.
- For debugging level 2 and 3, use much more extensive PS4 prompts
  taken from modernish (which I wrote), depending on the shell.
- Replace the obsolete 'set -' command. In the ancient Bourne
  shell, this disables the v and x options, so is equivalent to
  'set +v +x'. This still works on almost all modern shells, but
  not yash, and POSIX considers it "unspecified".
- Remove a few echo|sed fallbacks I'd missed (re: 215efa15).

src/cmd/ksh93/features/math.sh:
- Disable pathname expansion here as well.
- Pass down an inherited $IFFEFLAGS to recursive iffe invocations.
2022-01-12 20:29:51 +00:00
Johnothan King
1a9af9db40 Fix vi mode tab completion with spaces (#413)
Attempting to complete file names in vi mode using tab completion can
fail if the last character on the command line is a space. Reproducer
(note that this bug doesn't occur in emacs mode):
   $ set -o vi
   $ mkdir '/tmp/foo bar'
   $ test -d /tmp/foo\ <Tab>

src/cmd/ksh93/edit/vi.c:
- Don't disable tab completion or reset the tab count just because the
  last character on the command line is a space. This bugfix was
  backported from ksh93v- 2014-06-06.

src/cmd/ksh93/tests/pty.sh:
- Add a regression test for the tab completion bug.
2022-01-07 16:18:28 +00:00
Johnothan King
ca5803419b Fix various typos, man page issues and improve the documentation (#415)
This commit makes various different improvements to the documentation:
- sh.1: Backported (with changes) mandoc warning fixes from ksh2020
  for the ksh93(1) man page: <https://github.com/att/ast/pull/1406>
- Removed unnecessary spaces at the end of lines to fix a few other
  mandoc warnings.
- Fixed various typos and capitalization errors in the documentation.
- ANNOUNCE: Document the addition of the ${.sh.pid} variable
  (re: 9de65210).
- libast/man/str*: Update the man pages for the libast str* functions
  to improve how accurately each function is described.
- ksh93/README: Update regression test/compatibility notes to include
  OpenBSD 7.0, FreeBSD 13.0 and WSL running Ubuntu 20.04.
- Change a few places to store the return value from strlen in a
  size_t variable rather than signed int.
- comp/setlocale.c: To avoid confusion of two separate variables named
  lang, the function local variable has been renamed to langidx.
2022-01-07 16:17:55 +00:00
Johnothan King
d347ec0fc9 Allow ksh to compile on Haiku; implement SIGKILLTHR support (#408)
This commit implements the build fixes required to get ksh running on
Haiku. Note that while ksh does compile, it has a ton of regression test
failures on Haiku.

src/cmd/ksh93/data/signals.c,
src/lib/libast/features/signal.c:
- Add support for the SIGKILLTHR signal, which is supported by BeOS and
  Haiku.
- SIGINFO was missing an entry in the libast feature test, so add one
  (re: 658bba74).

src/cmd/ksh93/RELEASE:
- Add an entry noting that ksh now compiles on Haiku, albeit with many
  regression test failures.

src/cmd/ksh93/{include/terminal.h,sh/path.c}:
- Silence compiler warnings on Haiku.

src/lib/libast/features/mmap:
- The mmap feature test freezes on Haiku, so modify the test to fail
  immediately on that OS.

src/lib/libast/misc/signal.c:
- Avoid redefining the signal definition on Haiku to fix a compiler
  error.

src/lib/libast/features/nl_types:
- For some reason the nl_item typedef on Haiku doesn't work correctly.
  Work around that by creating the nl_item type in the libast nl_types
  feature test.
2022-01-07 16:16:42 +00:00
Martijn Dekker
b590a9f155 [shp cleanup 01..20] all the rest (re: 2d3ec8b6)
This combines 20 cleanup commits from the dev branch.

All changed files:
- Clean up pointer defererences to sh.
- Remove shp arguments from functions.

Other notable changes:

src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/sh/init.c:
- On second thought, get rid of the function version of
  sh_getinterp() as libshell ABI compatibility is moot. We've
  already been breaking that by reordering the sh struct, so there
  is no way it's going to work without recompiling.

src/cmd/ksh93/sh/name.c:
- De-obfuscate the relationship between nv_scan() and scanfilter().
  The former just calls the latter as a static function, there's no
  need to do that via a function pointer and void* type conversions.

src/cmd/ksh93/bltins/typeset.c,
src/cmd/ksh93/sh/name.c,
src/cmd/ksh93/sh/nvdisc.c:
- 'struct adata' and 'struct tdata', defined as local struct types
  in these files, need to have their first three fields in common,
  the first being a pointer to sh. This is because scanfilter() in
  name.c accesses these fields via a type conversion. So the sh
  field needed to be removed in all three at the same time.
  TODO: de-obfuscate: good practice definition via a header file.

src/cmd/ksh93/sh/path.c:
- Naming consistency: reserve the path_ function name prefix for
  externs and rename statics with that prefix.
- The default path was sometimes referred to as the standard path.
  To use one term, rename std_path to defpath and onstdpath() to
  ondefpath().
- De-obfuscate SHOPT_PFSH conditional code by only calling
  pf_execve() (was path_pfexecve()) if that is compiled in.

src/cmd/ksh93/include/streval.h,
src/cmd/ksh93/sh/streval.c:
- Rename extern strval() to arith_strval() for consistency.

src/cmd/ksh93/sh/string.c:
- Remove outdated/incorrect isxdigit() fallback; '#ifnded isxdigit'
  is not a correct test as isxdigit() is specified as a function.
  Plus, it's part of C89/C90 which we now require. (re: ac8991e5)

src/cmd/ksh93/sh/suid_exec.c:
- Replace an incorrect reference to shgd->current_pid with
  getpid(); it cannot work as (contrary to its misleading directory
  placement) suid_exec is an independent libast program with no
  link to ksh or libshell at all. However, no one noticed because
  this was in fallback code for ancient systems without
  setreuid(2). Since that standard function was specified in POSIX
  Issue 4 Version 2 from 1994, we should remove that fallback code
  sometime as part of another obsolete code cleanup operation to
  avoid further bit rot. (re: 843b546c)

src/cmd/ksh93/bltins/print.c: genformat():
- Remove preformat[] which was always empty and had no effect.

src/cmd/ksh93/shell.3:
- Minor copy-edit.
- Remove documentation for nonexistent sh.infile_name. A search
  through ast-open-archive[*] reveals this never existed at all.
- Document sh.savexit (== $?).

src/cmd/ksh93/shell.3,
src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/sh/init.c:
- Remove sh.gd/shgd; this is now unused and was never documented
  or exposed in the shell.h public interface.
- sh_sigcheck() was documented in shell.3 as taking no arguments
  whereas in the actual code it took a shp argument. I decided to
  go with the documentation.
- That leaves sh_parse() as the only documented function that still
  takes an shp argument. I'm just going to go ahead and remove it
  for consistency, reverting sh_parse() to its pre-2003 spec.
- Remove undocumented/unused sh_bltin_tree() function which simply
  returned sh.bltin_tree.
- Bump SH_VERSION to 20220106.
2022-01-07 16:16:31 +00:00
Martijn Dekker
01da863154 In the original ast code base, src/{cmd/nmake,lib/libast}/Makefile
(nmake makefiles) defined this macro:

	__OBSOLETE__ == $("6 months ago":@F=%(%Y0101)T)

This was used to automatically disable code after a period between
6 and 18 months, on 1st Jan of each year, in preprocessor
directives like:

	#if __OBSOLETE__ < 20080101
	// obsolete code here
	#endif

However, when compiling without nmake (as we do), this __OBSOLETE__
macro is not defined at all. And undefined macros evaluate to zero
in arithmetic comparisons, so all that obsolete code has been
getting compiled. Thankfully it doesn't seem to have done any harm,
but all that code was supposed to expire between 2008 and 2014.

src/lib/libast/disc/sfstrtmp.c:
- Removed. Was supposed to be a stub #if __OBSOLETE__ >= 20070101.

src/lib/libast/include/ast.h:
- Remove unused fmtbasell() macro (/* until 2014-01-01 */).

Other changed files:
- Remove __OBSOLETE__d code.
2022-01-07 15:57:46 +00:00
Martijn Dekker
aee917f666 tests/builtins.sh: skip cd permission test if root (re: 59bacfd4) 2022-01-07 15:56:50 +00:00
Martijn Dekker
a1c613c48d package: more flat view fixes (re: 336e82f9, aeda3502)
bin/package, src/cmd/INIT/package.sh:
- Automatically update an existing flat view even if 'flat' wasn't
  given for a make action. This stops a flat view becoming
  inaccurate if you forget to use 'bin/package flat make'
  consistently. If the $PACKAGEROOT/lib/package/gen directory
  exists, an existing flat view is assumed.
- Correct symlink fallbacks. We need an absolute path for the
  symlink target or it's going to be broken.

.gitignore:
- Update.
2022-01-07 15:56:15 +00:00
Johnothan King
f1627e2a8c Fix typeset -m crash under ASan and on OpenBSD (#412)
This fixes the use after free issue that caused typeset -m to crash on
older versions of OpenBSD and under ASan. The problem that was causing
the failure was that the ap pointer wasn't set to null after the memory
associated with it was freed. This commit backports a bugfix from
ksh93v- 2013-06-28 that sets ap to null before freeing the associated
memory and adds a check that makes sure ap is still a valid pointer
before calling array_unscope().

tests/types.sh changes:
- Avoid redirecting stderr to /dev/null, as this test shouldn't print
  anything to stderr.
- Apply error message improvement from
  https://github.com/ksh93/ksh/issues/231#issue-834252084.

tests/arrays.sh change:
- Apply error message improvement from
  https://github.com/ksh93/ksh/issues/229#issue-834240645 (re: 7c7fde75).

Resolves: https://github.com/ksh93/ksh/issues/231
2022-01-07 15:54:46 +00:00
Johnothan King
7c7fde75c8 Fix arrays.sh test failure under ASan (#411)
This backports a ksh2020 fix for an ASan heap-use-after-free error in
arrays.sh. The arrays regression tests were failing under ASan because
the ap pointer was used after the memory allocated to it was freed by
_nv_unset(). ksh2020 commit:
f1e5119e31
2022-01-07 15:53:10 +00:00
Johnothan King
997f0e9dd8 iffe: Correct backslashes in POSIX command substitutions (re: aeda3502) (#414)
In backtick command substitutions '\\' will only print one backslash,
while placing a backslash in front of a different character will
print a backslash followed by that character:
   $ echo `echo '\\\\ \3'`
   \\ \3
   $ echo $(echo '\\\\ \3')
   \\\\ \3
This difference in behavior was causing iffe to fail on Haiku after
the changes in aeda3502. This commit applies fixes for backslashes
in quoted strings to fix the Haiku build failure and other possible
bugs introduced by the referenced commit.
2022-01-07 05:20:42 +00:00
Martijn Dekker
ecce82c3ca package: report 'failed' if build failed
bin/package, src/cmd/INIT/package.sh:
- Make the EXIT (0) trap report failure based on $error_status
  (see d18469d6), replacing 'done' with 'failed' if it is nonzero.
- Remove extra space before 'at' in that report line.
- If we get a signal, we have to set error_status to nonzero
  manually so that the correct exit message is printed.
2022-01-01 02:43:04 +00:00
Martijn Dekker
aeda350298 more package and iffe tweaks/cleanups
I've reconsidered excluding build system internals (and also '*.o'
files) from the flat view. Really the only thing that should be
excluded are *.old files.

bin/package, src/cmd/INIT/package.sh:
- Do not exclude anything except *.old files from the flat view.
- This would delete bin/package when cleaning up the flat view,
  so explicitly keep bin/package in the clean routine.
- Be much faster about updating an existing flat view by checking
  if a link already exists before creating one.
- Add silent cleanup for dozens of orphaned macOS *.dSYM bundles
  belonging to deleted temporary feature test executables.

src/cmd/INIT/{iffe,ignore,silent}.sh, bin/{ignore,silent}:
- Remove obsolete Bourne shell fallbacks.
- Modernise command substitutions.
- Remove unused literal() function.
- Update copy() function to use printf.
- Distinguish just two shell types now: ksh and POSIX.
- compile(): Remove obsolete/incorrect grepping for signal messages.
  Add a POSIX-compiliant check with 'kill -l' to see if the
  compiler was terminated by a signal.
2022-01-01 02:28:54 +00:00
Martijn Dekker
d9fc61c022 init.c: upstream init.c.patch for CDE's dtksh
This upstreams the patch to init.c that is necessary to build dtksh
(graphical extensions for CDE, see <https://cdesktopenv.sf.net/>).
It has no effect when building regular ksh. Upstreaming it avoids
the need to keep updating it when changes to init.c are made.
2022-01-01 02:28:45 +00:00
Martijn Dekker
336e82f942 package: reintroduce/rewrite flat view (re: 20f8557c)
I called the flat view featuritis, but it turns out the dtksh build
system depends on it. But it was broken, so let's make a proper
version instead. This new approach does not symlink directories
before make, but hardlinks files after make. To make performance
tolerable, it requires a modern POSIX 'find' utility with '{} +'.

bin/package, src/cmd/INIT/package.sh:

- We're going to depend on 'test X -ef Y' to check if X is the same
  file as Y, so add that to the $min_posix checks even though it is
  not technically POSIX. But it's so close to universally available
  it doesn't really make a difference.

- After 'make', create a flat view by hardlinking arch/$HOSTTYPE
  files, minus build system internals, onto their corresponding
  paths in $PACKAGEROOT. Fall back to symlinking if hardlinking
  fails (this would happen when crossing device boundaries).

- Clean up flat view when doing clean/clobber. This is done by
  using 'find' to loop through the files in arch/ again and
  removing any root paths that are the same file as their
  corresponding arch/ path (this is where test X -ef Y comes in).
  Then delete arch/$HOSTTYPE, then delete empty directories left.

- Check for the 'flat' qualifier when doing clean/clobber. If
  given, do not delete arch/$HOSTTYPE but only clean up the flat
  view and remove empty directories.

src/cmd/INIT/Mamfile:
- Do not create the lib/package directory. (re: beb3c64a)

.gitignore:
- Update.
2022-01-01 02:28:38 +00:00
Martijn Dekker
3fc6cf0e2f iffe: really abort on output{ fail (re: e20c0c6b, e72543a, b6bd981)
After further examining the iffe code I found that fail{ ... }end
as well as pass{ ... }end blocks are executed in iffe's current
environment, using a simple 'eval', with no safeguards whatsoever!

This does of course afford maximum flexibility... for example, a
block can 'exit 1' to abort the iffe run and the whole build with
it. We can use this to abort properly on fatal compilation errors.

src/cmd/INIT/iffe.sh:
- Complete the fail{ and pass{ documentation; it should really be
  known that they run in iffe's current environment.
- Make one change: for just the 'eval' command that runs these
  blocks, redirect standard error back to the saved $stderr file
  descriptor so these blocks can write error messages using the
  standard >&2 instead of the undocumented >&$stderr.

src/lib/**/features/*:
- Write error message to standard error and error out properly when
  an output{ ... }end block fails to compile, instead of writing an
  #error directive to error out later.
2022-01-01 02:28:27 +00:00
Martijn Dekker
2d3ec8b67a [shp cleanup 00] Reunify the original sh state struct
As observed previously (see 3654ee73, 7e6bbf85, 79d19458), the ksh
93u+ codebase on which we rebased development was in a transition:
AT&T evidently wanted to make it possible to have several shell
interpreter states in the same process, which in theory would have
made it possible to start a complete new shell (not just a
subshell) without forking a new process.

This required transitioning from accessing the 'sh' state struct
directly to accessing it via pointers (usually but not always
called 'shp'), introducing a lot of bug-prone passing around of
those pointers via function arguments and other state structs.

Some of the original 'sh' struct was separated into a 'struct
shared' called 'shgd' a.k.a. 'sh.gd' (global data) instead; these
were global state variables that were going to be shared between
the different main shell environments sharing a process. Yet, for
some reason, that struct was allocated dynamically once at init
time, requiring yet another pointer to access it. <shrug>

None of this ever worked, because that transition was incomplete.
It was much further along in the ksh 93v- beta, but I don't think
it actually worked there either (not very much really did). So,
starting a new shell has always required starting a new process.

So, now that it's clear what they were trying to do, should we try
to make it work? I'm going to go with a firm "no" on that question.

Even non-forking (virtual) subshells, something quite a bit less
ambitious, were already an unmitigated nightmare of bugs. In 93u+m
we fixed a load of bugs related to those, but I'm sure there are
still many left. At the very least there are multiple memory leaks.

I think the ambition to go even further and have complete shells
running separate programs share a process, particularly given the
brittle and buggy state of the existing codebase, is evidence that
the AT&T team, in the final years, had well and truly lost the
ability to think "wait a minute, aren't we in over our heads here,
and why are we doing this again? Is this *actually* a feasible and
useful idea?"

In my view, having entirely separate programs share a process is a
*terrible*, horrible, no-good idea that takes us back to the bad
old days before Unix, when kernels and CPUs were unable to enforce
any memory access restrictions. Programmers are imperfect. If
you're going to run a new program, you need the kernel to enforce
the separation between programs, or you're just asking for memory
corruption and security holes. And that separation is enforced by
starting a new program in a new process. That's what processes are
*for*. And if you need *that* to be radically performance-optimised
then you're probably doing it wrong anyway.

(By the way, I would still argue the same for subshells, even after
we fixed many bugs in virtual subshells. But forking all subshells
would in fact cause many scripts to slow down, and the community
would surely revolt. <sigh>  Maybe I should make it a shell option
instead, so scripts can 'set -o subfork' for reliability.)

It is also unclear how they were going to make something like
'ulimit' work, which can only work in a separate process. There
was no sign of a mechanism to fork a separate program's shell
mid-execution like there is for subshells (sh_subfork()).

Anyway... I had already changed some code here and there to access
the sh state struct directly, but as of this commit I'm beginning
to properly undo this exercise in pointlessness. From now on, we're
exercising pointerlessness instead.

I'll do this in stages to make any problems introduced more
traceable. Stage 0 restores the full 'sh' state struct to its
former static glory and reverts 'shgd' as a separate entity.

src/cmd/ksh93/sh/defs.c,
src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/include/shell.h
src/cmd/ksh93/Mamfile::
- Move 'struct sh_scoped' and 'struct limits' from defs.h to
  shell.h as the sh struct will need their complete definitions.
- Get rid of 'struct shared' (shgd) in defs.h; its members are
  folded back into their original place, the main Shell_t struct
  (sh) in shell.h. There are no name conflicts.
- Get rid of the _SH_PRIVATE macro in defs.h. The members it
  defines are now defined normally in the main Shell_t struct (sh)
  in shell.h.
- To make this possible, move <history.h> and "fault.h" includes
  from defs.h to shell.h and update the Mamfile accordingly.
- Turn sh_getinterp() and shgd into macros that resolve to (&sh).
  This will allow the compiler to optimise out many pointer
  dereferences already.
- Keep extern sh_getinterp() for libshell ABI compatibility.

src/cmd/ksh93/sh/init.c:
- sh_init(): Do not calloc (sh_newof) the sh or shgd structs.
- sh_getinterp(): Keep function for libshell ABI compat.
2022-01-01 02:28:06 +00:00
Martijn Dekker
57ed1efc2c Actually deactivate CDPATH when unsetting it
After 'unset CDPATH', CDPATH continued to work as if nothing
happened. Unsetting it should be a valid way to deactivate it.

This bug is in every ksh93 version.

src/cmd/ksh93/bltins/cd_pwd.c: b_cd():
- Fix a manifest logic error: first check if CDPATH (CDPNOD) is
  unset before assigning to 'cdpath', not the other way around.
  Setting the 'cdpath' pointer is what activates the CDPATH search.
2021-12-29 01:48:55 +00:00
Martijn Dekker
de511cfbc2 libast: regex: re-backport robustness improvements from 93v- beta
That intermittent regression test failure in types.sh seems to be
gone. So let's reimport the regex changes into the 1.0 branch to
subject them to wider testing and make sure any failures stay gone.
(re: 48568476, 38aab428, 1aa8f771)

[Original commit message from 1aa8f771 follows]

There are two main changes:

1. The regex code now creates and uses its own stack (env->mst)
   instead of using the shared standard stack (stkstd). That seems
   likely to be a good thing.

2. Missing mbinit() calls were inserted. The 93v- code uses a
   completely different multibyte characters API, so these needed
   to be translated back to the older API. But, as mbinit() is no
   longer a no-op as of 300cd199, these calls do stop things from
   breaking if a previous operation is interrupted mid-character.

I think there might be a couple of off-by-one errors fixed as well,
as there are two instances of this change:

-               while ((index += skip[buf[index]]) < mid);
+               while (index < mid)
+                       index += skip[buf[index]];
2021-12-28 22:24:41 +00:00
Johnothan King
4032050249 Port cksum builtin performance improvements from illumos (#391)
This commit ports performance optimizations from illumos for the libsum
code (used by the cksum and sum builtins):
98bea71f0d
The new codepath in libsum uses prefetching and loop unrolling to
improve performance (prefetching is done with __builtin_prefetch()
or sun_prefetch_read_many() if either is available).

Script for testing (note that cksum must be enabled in
src/cmd/ksh93/data/builtins.c):
   #!/bin/ksh
   builtin cksum || exit 1
   for ((i=0; i!=50000; i++)) do
       cksum -x att /etc/hosts
   done >/dev/null

Results on Linux x86_64 (using CCFLAGS=-O2):
$ echo 'UNPATCHED:'; time arch/linux.i386-64/bin/ksh /tmp/foo; echo 'PATCHED'; time /tmp/ksh /tmp/foo
UNPATCHED:

real    0m09.989s
user    0m07.582s
sys     0m02.406s
PATCHED:

real    0m06.536s
user    0m04.331s
sys     0m02.204s

src/lib/libsum/{sum-att.c,sum-crc.c,Mamfile}:
- Port the performance optimizations from illumos to 93u+m libsum. To
  prevent problems with older versions of GCC, avoid the new codepath
  if GCC is older than the 3.1 release series. Additionally, the ast.h
  header must be included to handle tcc defining __GNUC__ on FreeBSD.
- Apply some build fixes to allow the new codepath to build with Clang
  3.6 and newer (my own testing indicates an even better performance
  improvement with Clang than with GCC).
2021-12-28 22:22:52 +00:00
Johnothan King
8f9d1bec97 Add three options to 'ulimit' (#406)
This patch adds a few extra options to the ulimit command (if the OS
supports them). These options are also present in Bash, although in ksh
additional long forms of each option are available:
  ulimit -k/--kqueues   This is the maximum number of kqueues.
  ulimit -P/--npts      This is the maximum number of pseudo-terminals.
  ulimit -R/--rttime    This is the time a real-time process can run
                        before blocking, in microseconds. When the
                        limit is exceeded, the process is sent SIGXCPU.

Other changes:
- bltins/ulimit.c: Change the formatting from sfprintf and increase the
  size of the tmp buffer to prevent text from being cut off in ulimit
  -a (this was required to add ulimit -R).
- data/limits.c: Add support for using microseconds as a unit.
2021-12-28 22:02:20 +00:00
Martijn Dekker
8f24d4dc56 tests/leaks.sh: add a newly discovered leak as known
Help us fix it at: https://github.com/ksh93/ksh/issues/407
2021-12-28 21:59:50 +00:00
Johnothan King
b425196958 Fix ASan heap-buffer-overflow when handling syntax errors (#402)
This commit backports a bugfix from ksh2020 to fix an ASan
heap-buffer-overflow error in one of the regression tests. See:
https://github.com/att/ast/commit/c57f7398
https://github.com/att/ast/issues/1261

This explanation comes from the linked issue:
> The poplevel() in this block of code is called when lp->lexd.lex_max
> is zero:
> https://github.com/att/ast/blob/bd94eb56/src/cmd/ksh93/sh/lex.c#L921-L925
> Since poplevel() first decrements lp->lexd.lex_max then uses it as
> an index into lp->lexd.lex_match this causes the word before the
> start of that buffer to be accessed. The buffer is allocated here:
> https://github.com/att/ast/blob/bd94eb56/src/cmd/ksh93/sh/lex.c#L2210-L2218

src/cmd/ksh93/sh/lex.c:
- Avoid calling poplevel() twice when handling syntax errors.
2021-12-28 17:53:35 +00:00
Johnothan King
de795e1f9d Allow regression tests to pass without any /opt/ast/bin builtins (#403)
This commit primarily makes changes that allow the regression tests to
run without any of the /opt/ast/bin builtins compiled into ksh. It also
makes a minor improvement to one of the tests in locale.sh by
shellquoting an error message.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2021-12-28 17:52:57 +00:00
Johnothan King
0e197eee57 Fix mkservice compile errors and add SHOPT_MKSERVICE (#401)
The unused mkservice and eloop builtins are currently not built, and if
an attempt to compile them is made the build ends in failure. This
commit backports a few build fixes from ksh93v- 2012-08-24 that allow
mkservice and eloop to build (plus an additional compiler warning fix
not in ksh93v-). I've also added a new SHOPT_MKSERVICE setting (turned
off by default) so that mkservice and eloop can be built if the user
chooses to include them in their build of ksh.
2021-12-28 17:51:11 +00:00
Johnothan King
a3ed4c368b Fix implicit warnings in the iffe feature tests (#396)
This commit fixes some implicit function warnings in the iffe feature
tests by adding missing include directives.
2021-12-28 17:50:39 +00:00
Martijn Dekker
db3a3d8fc0 tests/leaks.sh: redesign with a more robust testing algorithm
On modern operating systems, memory management is non-deterministic
(i.e. random, unpredictable) to varying degrees. This makes testing
for memory leaks a nightmare as the OS may decide to randomly grow
a process's memory allocation at any time for no apparent reason,
causing intermittent test failures that do not represent real
memory leaks. So far, the leaks test tried to cope with this by
using a large number of iterations plus a certain amount of bytes
of tolerance per iteration. This was inefficient and on some
systems still did not fully eliminate intermittent test failures.

This commit introduces a new testing algorithm that is designed to
cope with a large degree of unpredictability. Instead of a fixed
number of test iterations, it defines a maximum (16384), dividing
them in blocks of 128 iterations. It also defines a minimum number
of sequential "good" iteration blocks, counted if memory usage did
not increase from one block to the next. That minimum number is set
to 16. The theory is that if we can get 16 "good" iteration blocks
in a row, we can safely assume it's not a real memory leak, break
the loop, and consider the test succeeded. That "good" sequence is
allowed to occur at any point in the loop, creating a high built-in
tolerance for non-deterministic shenanigans. It also speeds up the
tests, as successful tests can bow out at 16 * 128 == 2048
iterations if they're lucky. If the OS decides to randomly grow the
memory heap, it may take more tries, but almost (?) certainly not
more than the maximum 16384 (128 blocks). If the counter reaches
that, then we assume a memory leak and throw a test failure.

We're also no longer testing with byte granularity in any case; the
randomness of memory management makes that pointless. All getmem()
function versions now return kibibytes (1024 bytes).

This should eliminate the need for workarounds such as initial
iterations to "steady the state" or a tolerance of a certain number
of bytes. I've experimentally determined the exact values
(max_iter, block_iter, min_good_blocks) that seem to work reliably
on all systems I've tested. They are easy to tweak if necessary.

To make all this manageable, this commit hides all the supporting
code in a triplet of aliases (TEST, DO, DONE) that, when used
correctly, create a grammatically robust shell code block: you can
add redirections, pipe into it, etc. as expected. This makes the
actual tests a great deal easier to read as well.

src/cmd/ksh93/tests/pty.sh:
- Implement new leaks testing framework as described and convert
  all the tests to it.
- Mark known leaks with a 'known' variable. Print non-fail warnings
  for all known leaks, but skip the tests by default. Test them
  only if DEBUG is exported. This is better than commenting them
  out as we will no longer be tempted to forget about these.
- Move the test for large command substitutions to subshell.sh --
  it's not in fact a leak test; instead, it checks that command
  substitutions don't lose data.

src/cmd/ksh93/tests/_common: err_exit():
- Since we're printing more warnings, clearly mark all test
  failures with 'FAIL:' to make them stand out.

src/cmd/ksh93/tests/shtests.
src/cmd/ksh93/tests/pty.sh:
- Special-case leaks.sh for counting tests; grep ^TEST.
- Special-case pty.sh as well while we're at it by grepping tst()
  calls. Remove all the dummy '# err_exit #' comments from pty.sh
  as they are now no longer used for counting the tests.
2021-12-28 17:47:29 +00:00
Martijn Dekker
a9c6f77c3e tests/alias.sh: fix regression test (re: f7213f03)
The variable 'i' had already been used for a non-numeric purpose,
so when declaring it integer in a subshell it's necessary to
initialise it with value 0 or an arithmetic error is shown (which
does not interrupt the test or make it fail).

For the record, the errpr was mine, not Johnothan's.
2021-12-28 15:08:20 +00:00
Johnothan King
24174f0fb7 Backport -P and -t flags for 'type'/'whence' from ksh93v- (#392)
This commit backports the whence '-t' option from ksh93v-. The '-t'
option is useful when one needs to identify the type of a command.
The '-t' flag was added by ksh93v- for compatibility with Bash.

It should be noted the ksh93v- patch had one bug, which this commit
fixes. Path-bound builtins from /opt/ast/bin were classified as
files if loaded from /opt/ast/bin in the PATH. Reproducer:
   $ PATH=/opt/ast/bin whence -t cat
   file

src/cmd/ksh93/bltins/whence.c:
- Simplify the bitmask values for the command and whence builtin
  flags.
- Add the -t flag to the whence and type builtins. To prevent bugs,
  -t will always override -v if both of those flags were passed.

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/sh.1:
- Add documentation for the new -t option.
2021-12-27 06:40:02 +00:00
Martijn Dekker
2027648f1a Remove leftover pre-C89 code (re: a1f5c992)
I'd forgotten to check for uses of the __STDC__ macro. This is
defined on all C compilers that support C89/C90 or later standards.
So we can remove all fallback code disabled by that macro.
2021-12-27 05:46:23 +00:00
Martijn Dekker
e072e7c170 Fix crash in xtrace while processing here-document (re: d7cada7b)
Depending on the OS, the heredoc.sh regression tests, and possibly
others, still crashed with the -x option (xtrace) on.

Analysis: The lexer crashes in lex_advance(). Something has caused
an inconsistent lexer state, and it happened earlier on, so the
backtrace is useless for figuring out where that happened.

But I think I've found it. It's the sh_mactry() call here:

src/cmd/ksh93/sh/xec.c, lines 2800 to 2807 in f7213f03
2800:   if(!(cp=nv_getval(sh_scoped(shp,PS4NOD))))
2801:           cp = "+ ";
2802:   else
2803:   {
2804:           sh_offoption(SH_XTRACE);
2805:           cp = sh_mactry(shp,cp);
2806:           sh_onoption(SH_XTRACE);
2807:   }

sh_mactry() needs to parse the contents of $PS4 to perform
expansions and command substitutions in it, which involves the
lexer. If that happens in a here-document, the lexer is in the C
function call stack, in the middle of parsing the here-document.
Result: inconsistent lexer state. Solution: save and restore lexer
state in sh_mactry().

After this commit, all regression tests should pass with the
'-x'/'--xtrace' option in use, with no errors or crashes.

Note for backporters: this fix depends both on on d7cada7b and on
the consistency fix for the Lex_t type's size applied in a7ed5d9f.

src/cmd/ksh93/include/shlex.h:
- Cosmetic fix: remove a copied & pasted backslash. (re: a7ed5d9f)

src/cmd/ksh93/sh/macro.c: sh_mactry():
- Save and restore the lexer state before letting sh_mactrim()
  indirectly parse and execute code.

src/cmd/ksh93/tests/*.sh:
- Turn off xtrace in various command substitutions that contain
  2>&1 redirections, so that the xtrace output is not caught by
  the command substitutions, causing tests to fail incorrectly.
- Turn off xtrace for a few code blocks with 2>&1 redirections,
  stopping xtrace output from being written to standard output.

Resolves: https://github.com/ksh93/ksh/issues/306 (again)
2021-12-27 04:02:25 +00:00
Martijn Dekker
91a7c2e3e9 Fix crash/freeze upon interrupting command substitution with pipe
On some systems (at least Linux and macOS):

1. Run on a command line: t=$(sleep 10|while :; do :; done)
2. Press Ctrl+C in the first 10 seconds.
3. Execute any other command substitution. The shell crashes.

Analysis: Something in the job_wait() call in the sh_subshell()
restore routine may be interrupted by a signal such as SIGINT on
Linux and macOS. Exactly what that interruptible thing is remains
to be determined. In any case, since job_wait() was invoked after
sh_popcontext(), interrupting it caused the sh_subshell() restore
routine to be aborted, resulting in an inconsistent state of the
shell. The fix is to sh_popcontext() at a later stage instead.

src/cmd/ksh93/sh/subshell.c: sh_subshell():
- Rename struct checkpt buff to checkpoint because it's clearer.
- Move the sh_popcontext() call to near the end, just after
  decreasing the subshell level counters and restoring the global
  subshell data struct to its parent. This seems like a logical
  place for it and could allow other things to be interrupted, too.
- Get rid of the if(shp->subshell) because it is known that the
  value is > 0 at this point.
- The short exit routine run if the subshell forked now needs a new
  sh_popcontext() call, because this is handled before restoring
  the virtual subshell state.
- While we're here, do a little more detransitioning from all those
  pointless shp pointers.

Fixes: https://github.com/ksh93/ksh/issues/397
2021-12-27 03:49:41 +00:00
Johnothan King
f7213f03a2 Fix multiple bugs when using 'alias -p' to print aliases (#398)
This commit was originally intended to fix just one bug with shcomp's
handling of 'alias -p', but while fixing that I found a large number
of related issues in the alias command's -p, -t and -x options. The
current patch provides bugfixes for all of the bugs listed below:

1) Listing aliases in a script with 'alias -p' or 'alias' broke
   shcomp's bytecode output:
   https://github.com/ksh93/ksh/issues/87#issuecomment-813819122

2) Listing individual aliases with the -p option doesn't work:
      $ alias foo=bar bar=foo
      $ alias foo
      foo=bar
      $ alias -p foo  # No output

3) Listing specific tracked aliases with -pt does not display them
   in a reusable format, but rather adds another tracked alias:
      $ hash -r cat vi
      $ alias -pt vi  # No output
      $ alias -pt rm
      $ alias -t
      cat=/usr/bin/cat
      rm=/usr/bin/rm
      vi=/usr/bin/vi

4) Listing all tracked aliases with -pt does not output them in a
   reusable format (the resulting command printed only creates a
   normal alias, which is different from a tracked alias):
      $ hash -r cat
      $ alias -pt
      alias cat=/usr/bin/cat  # Expected 'alias -t cat'

5) Listing a non-existent alias with -p doesn't cause an error:
      $ unalias -a
      $ alias -p notanalias  # No output
      $ echo $?
      0
      $ alias notanalias
      notanalias: alias not found
      $ echo $?
      1
      $ hash -r
      $ alias -pt notacommand  # No output
      $ echo $?
      0

6) Attempting to list 256 non-existent aliases results in exit
   status zero:
      $ unalias -a
      $ alias $(awk -v ORS= 'BEGIN { for(i=0;i<256;i++) print "x "; }')
      x: alias not found
      --cut error message--
      $ echo $?
      0

Changes:

- typeset.c: Avoid printing anything while shcomp is compiling a
  script. This is needed because the alias command is run by shcomp
  to prevent parsing issues.

- b_alias(): To avoid adding tracked aliases with -pt, set
  tdata.aflag to '+' so that setall() and other related functions
  only list tracked aliases.

- b_alias(): Set tdata.pflag to 1 so that setall() and other
  functions recognize -p was passed.

- print_value(): Add support for listing specific aliases with
  'alias -p'.

- setall(): To avoid any issues with zombie tracked aliases (see also
  the regression tests) ignore tracked alias nodes marked with the
  NV_NOALIAS attribute. This bit is set for tracked alias nodes by
  the nv_rehash() function.

- setall(): For backward compatibility, continue incrementing the
  exit status for each invalid alias and tracked alias passed. This
  was already how alias behaved when listing aliases without -p, so
  using -p shouldn't cause a change in behavior:
      $ unalias -a
      $ alias foo bar
      foo: alias not found
      bar: alias not found
      $ echo $?
      2
  To fix bug 6, the exit status is set to one if an enforced 8-bit
  exit status would be zero.

- print_namval(): Set the prefix to 'alias -t' so that listing
  tracked aliases with 'alias -pt' works correctly.

- data/msg.c and include/name.h: Add an error message for when
  'alias -pt' doesn't find a tracked alias.

- tests/alias.sh: Add a ton of regression tests for the bugs fixed in
  this commit.
2021-12-27 03:49:06 +00:00
Johnothan King
feeb62d15f sh_close(): Set errno to EBADF for invalid file descriptors (#399)
The sh_close() function fails to set errno to EBADF when passed a
negative (invalid) file descriptor. This commit fixes the issue
by setting errno if the file descriptor is a negative value
(backported from ksh93v- 2012-08-24).
2021-12-27 03:48:38 +00:00
Martijn Dekker
9403d326f4 shtests --posix: ensure consistent locale: unset all LC_* vars
When running tests in --posix/-p mode, not all LC_* variables were
unset, so that certain aspects of the locale could be non-POSIX.
2021-12-27 03:48:27 +00:00
Martijn Dekker
a1f5c99204 INIT: remove proto, ratz (re: 46593a89, 6137b99a); major cleanup
This takes another step towards cleaning up the build system. We
now do not even pretend to be theoretically compatible with
pre-1989 K&R C compilers or with C++ compilers. In practice, this
had already been broken for many years due to bit rot.

Commit 46593a89 already removed the license handling enormity that
depended on proto, so now we can cleanly remove it altogether. But
we do need to leave some backwards compatibility stubs to keep the
build system compatible with older AST code; it should remain
possible to build older ksh versions with the current build system
(the bin/ and src/cmd/INIT/ directories) for testing purposes.

So as of now there is no more __MANGLE__d rubbish in your generated
header files. This is only about a quarter of a century overdue...

This commit also includes a huge amount of code cleanup to remove
thousands of unused K&R C fallbacks and other cruft, particularly
in libast. This code base should now be a little easier to
understand for people who are familiar with a modern(ish) C
standard.

ratz is now also removed; this was a standalone and simplified 2005
version of gunzip. As of 6137b99a, none of our code uses it, even
theoretically. And the real g(un)zip is now everywhere.

src/cmd/INIT/proto.c, src/cmd/INIT/ratz.c:
- Removed.

COPYRIGHT:
- Remove zlib license; this only applied to ratz.

bin/package, src/cmd/INIT/package.sh:
- Related cleanups.
- Unset LC_ALL before invoking a new shell, respecting the user's
  locale again and avoiding multibyte character corruption on the
  command line.

src/cmd/INIT/proto.sh:
- Add stub for backwards compatibility with Mamfiles that depend on
  proto. It does nothing but pass input without modification and is
  now installed as the new arch/*/bin/proto by src/cmd/INIT/Mamfile.

src/cmd/INIT/iffe.sh:
- Ignore the proto-related -e (--package) and -p (--prototyped)
  options; keep parsing them for backwards compatibility.
- Trim the macros passed to every test to their standard C
  versions, removing K&R C and C++ versions. These are now
  considered to be for backwards compatibility only.

src/cmd/INIT/iffe.tst:
- Remove proto(1) mangling code.
  By the way, iffe can be regression-tested as follows:
        $ bin/package use   # set up environment in a child shell
        $ regress src/cmd/INIT/iffe.tst
        $ exit              # leave package environment

src/cmd/INIT/make.probe, src/cmd/INIT/probe.win32:
- Remove code to handle C++.

src/lib/libast/features/common:
- As in iffe.sh above, trim macros designed for compatibility with
  C++ and ancient C compilers to their standard C versions and
  comment that they are for backwards compatibility with AST code.
  This is needed to keep all the old ast and ksh code compiling.

src/cmd/ksh93/sh/init.c,
src/cmd/ksh93/sh/name.c:
- Clarify libshell ABI compatibility function versions of macros.
  A "proto workaround" comment in the original code mislead me into
  thinking this had something to do with the removed proto(1), but
  it's unrelated. Call the workaround macro BYPASS_MACRO instead.

src/cmd/ksh93/include/defs.h:
- sh_sigcheck() macro: allow &sh as an argument: parenthesise shp.

src/cmd/ksh93/sh/nvtype.c:
- Remove unused nv_mkstruct() function. (re: d0a5cab1)

**/features/*:
- Remove obsolete iffe 'set prototyped' option.

**/Mamfile:
- Remove all references to the ast/prototyped.h header.
- Remove all use of the proto command. Simply copy instead.

*** 850-ish source files: ***
- Remove all '#pragma prototyped' directives.
- Remove all C++ compat code conditional upon defined(__cplusplus).
- Remove all use of the _ARG_ macro, which on standard C expands to
  its argument:
        #define _ARG_(x)        x
  (on K&R C, it expanded to nothing)
- Remove all use of _BEGIN_EXTERNS_ and _END_EXTERNS_ macros (empty
  on standard C; this was for C++ compatibility)
- Reduce all #if __STD_C (standard code) #else (K&R code) #endif
  blocks to the standard code only, without use of the macro.
- Same for _STD_ macro which seems to have had the same function.
- Change all instances of 'Void_t' to standard 'void'.
2021-12-24 07:05:22 +00:00
Johnothan King
3785a0685c Fix process substitutions printing PIDs in profile scripts (#395)
- sh/args.c: A process substitution run in a profile script may print
  its PID as if it was a command spawned with '&'. Reproducer:
     $ cat /tmp/env
     true >(false)
     $ ENV=/tmp/env ksh
     [1]	730227
     $
  This bug is fixed by turning off the SH_PROFILE state while running
  a process substitution.

- sh/subshell.c: The SH_INTERACTIVE fix in 3525535e renders the extra
  check for SH_PROFILE redundant, so it has been removed.

- tests/io.sh: Update the procsub PIDs test to also check the result
  after using process substitution in a profile script.
2021-12-22 13:27:00 +00:00
Johnothan King
740a24a456 Fix ASan buffer overflow errors caused by memcmp (#393)
This commit replaces more instances of memcmp with strncmp to fix some
more heap-buffer-overflow errors in ASan, some of which can occur when
running the regression tests with xtrace enabled. It combines two
existing patches plus another fix in name.c for xtrace:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg00877.html
https://github.com/oracle/solaris-userland/blob/master/components/ksh93/patches/035-CR7036535.patch
2021-12-22 06:37:58 +00:00
Martijn Dekker
d95700c348 print.c: resolve whitespace diff with master (re: fb8308243)
I ended up committing versions of the fix to the master and 1.0
branches that differed only in whitespace in a few lines (no code
differences). This commit makes the whitespace identical so this
does not keep annoying me when I look at 'git diff 1.0 master'.
2021-12-22 05:15:32 +00:00
Martijn Dekker
fcd9efce7f Interactive: Avoid losing the job after suspending a subshell
Reproducer: run vi in a subshell:

	$ (vi)

vi opens; now press Ctrl+Z to suspend. The output is as expected:

	[2] + Stopped                  (vi)

…but the exit status is 18 (SIGTSTP's signal number) instead of 0.

Now do:

	$ fg
	(vi)
	$

The exit status is 18 again, vi is not resumed, and the job is
lost. You have to find vi's pid manually using ps and kill it.

Forking all non-command substitution subshells invoked from the
interactive main shell is the only reliable and effective fix I've
found. I've tried to fork the subshell conditionally in every other
remotely plausible place I can think of in fault.c and xec.c, but I
can't get anything to work properly. If anyone can get this to work
without forking as much (or at all), please do submit a patch or PR
that supersedes this fix.

At least subshells of subshells don't need to fork, so the
performance impact can be limited. Plus, it's not as if most people
need maximum speed on the interactive command line. Scripts
(including login/profile scripts) are not affected at all.

Command substitutions can be handled differently. My testing shows
that all shells except ksh93 simply block SIGTSTP (the ^Z signal)
while they run. We should do the same, so they don't need to fork.

NOTE for any backporters: the subshell.c and fault.c changes depend
on commits 35b02626 and 48ba6964 to work correctly.

src/cmd/ksh93/sh/subshell.c: sh_subshell():
- If the interactive shell state bit is on, then before executing
  the subshell's code:
  - for command substitutions, block SIGTSTP;
  - for other subshells, fork.
- For command substitutions, release SIGTSTP if the interactive
  shell state bit was on upon invoking the subshell.

src/cmd/ksh93/sh/fault.c:
- Instead of checking for a virtual subshell, check the shell's
  interactive state bit to decide whether to handle SIGTSTP, as
  that is only turned on in the interactive main shell.

src/cmd/ksh93/sh/main.c: sh_main():
- To avoid bugs, ignore SIGTSTP while running profile scripts.
  Blocking it doesn't work because delaying it until after
  sigrelease() will cause a crash. Thanks to @JohnoKing for this.
- While we're here, prevent a possible overflow of the 'beenhere'
  static char variable by only incrementing it once.

Co-authored-by: Johnothan King <johnothanking@protonmail.com>
Resolves: https://github.com/ksh93/ksh/issues/390
2021-12-22 05:09:17 +00:00
Martijn Dekker
3525535e1f sh_parse(): don't turn on interactive state (re: 48ba6964)
Reproducer:

	$ (sleep 1& echo done)
	done
	$ (eval "echo hi"; sleep 1& echo done)
	hi
	[1]	30587
	done

No job control output should be printed for a background process
invoked from a subshell, not even after 'eval'.

The cause: sh_parse() turns on the shell's interactive state bit
(sh_state(SH_INTERACTIVE)) if the interactive shell option is on.

This is incorrect. The parser should have no involvement with shell
interactivity in principle because that's not its domain.

Not only that, the parser may need to run in a subshell, e.g. when
executing traps or 'eval' commands (as above). By definition, a
subshell can never be interactive.

We already fixed many bugs related to job control and the shell's
interactive state. Even if these two lines previously papered over
some breakage, I can't find any now after simply removing them. If
any is found later, then it'll need to be fixed properly instead.

Related: https://github.com/ksh93/ksh/issues/390
2021-12-22 05:06:12 +00:00
Johnothan King
e6989853bc Fix yet more minor bugs related to the regression tests (#389)
- Redirect error output from the ulimit builtin (re: 3e58851f).
- Fix the test failure for 'cd -eP' on illumos by making a directory
  symlink first, then removing the symlink after cd.
- Fix the test failure for 'getconf -l' on illumos by quoting
  strings with the -q option.
- astconf.c: Only quote strings if the -q option was passed.
- Improve error messages from intermittently failing types.sh tests
2021-12-21 08:01:00 +00:00
Martijn Dekker
0ead68b704 bin/package: fix 64-bit detection on systems without 'cc'
If the compiler is called gcc but not cc, the 64-bit detection
didn't work and $HOSTTYPE (and the arch/ subdirectory) did not get
the -64 suffix.

bin/package, src/cmd/INIT/package.sh:
- Run checkcc() before attempting to compile the program. This will
  set $cc to the path to gcc if there is no 'cc' command.
- trap: use 'rm -rf' to also delete .dSYM directories (macOS).
- checkcc(): Since we're here, find clang as well.
2021-12-21 07:11:44 +00:00
Martijn Dekker
24fc1bbca9 Sanitise standards/feature macros, remove compiler/linker wrappers
The goal is to get rid of all compiler/linker wrapper scripts as
they are overridden by passing CC/LD and it should be possible to
select your compiler or linker without breaking the build. The
probing and feature testing system should set the appropriate flags
and macros. This makes some progress towards that.

src/lib/libast/features/standards:
- Eliminate the shotgun approach to standards macros on popular
  systems where the macros we we need to set are known and
  documented. The following will enable standards compliance plus
  all the available extensions:
  - Set no macros at all for any BSD system (excluding macOS).
  - Set _DARWIN_C_SOURCE on Darwin/macOS.
  - Set everything and the kitchen sink for Solaris/illumos in
    a way that enables backwards compatibility with older Solaris.
    This is unofficial, but following the standards(5) manual
    disables a lot of basic functionality that we depend on.
  - Set _GNU_SOURCE for GNU (glibc).
  - Remove the covered macros from the shotgun approach fallback.
- Add a new heuristic. _POSIX_PATH_MAX and _SC_PAGESIZE are among
  the basic macros disabled when you pass recommended standards
  macros, killing the build, so it's good to check if they compile.

src/cmd/INIT/ar.freebsd12.amd64,
src/cmd/INIT/ar.linux.i386-64:
- Removed. May cause build failures on some systems as not all 'ar'
  implementations support the U option. Plus, I can think of no
  good reason to disable deterministic mode (which always creates
  identical output) on 'ar' implementations that support it. See:
  https://groups.google.com/g/comp.unix.shell/c/LdOD1Ya0C9E/m/U6DhgHVICwAJ

src/cmd/INIT/cc.linux.*-icc,
  Removed icc wrappers. These manually source /etc/profile.d/icc.sh
  but I don't think that is the build system's job. Profile scripts
  should be run at login time and export variables we inherit
  through the environment.

src/cmd/INIT.cc.{freebsd,linux,openbsd}*:
- Removed. Should be entirely superfluous now that the standards
  feature test sets the appropriate macros.

src/cmd/INIT.cc.sol11.*:
- Removed as the standards feature test now sets the approopriate
  macros. Note the Solaris build system should now simply pass CC
  as normal instead of passing CC_EXPLICIT.
2021-12-21 06:52:16 +00:00
Martijn Dekker
d3f553ca76 shtests: report correct line numbers for shcomp fails (re: bd38c804)
Inserting the _common script instead of sourcing it caused all test
failures in shcomp runs to be reported with the number of lines in
_common added.

src/cmd/ksh93/shtests:
- Only incorporate the aliases from _common; dot/source the rest of
  the code as normal. Replace the first few lines with the aliases
  to avoid affecting $LINENO; they are comments anyway.
2021-12-21 06:48:21 +00:00
Martijn Dekker
a381a1b049 Better fix for BUG_IFSISSET (re: 95294419)
With a better understanding of the code 1.5 years later, the
special-casing for IFS introduced in that commit seems like a hack.

The problem was not that the IFS node always exists but that it is
always considered to have a 'get' discipline function. Variables
with a 'get' discipline are considered set. This makes sense for
all variables except IFS.

The nv_isnull() macro is used to check if a variable is set. It
calls nv_hasget() to determine if the variable has a 'get'
discipline. So a better fix is for nv_hasget() always to return
false for IFS.

src/cmd/ksh93/bltins/test.c, src/cmd/ksh93/sh/macro.c:
- Remove special-casing for IFS.

src/cmd/ksh93/sh/nvdisc.c: nv_hasget():
- Always return false for IFS, taking local scope into account.
2021-12-21 06:29:30 +00:00
Martijn Dekker
69e0de9274 bin/package: more cleanups (re: 9166545a)
src/cmd/INIT/package.sh, bin/package:
- Derive the command name from $0 instead of hardcoding it.
- Remove NPROC and related code to support parallel building. This
  is not supported with mamake, is unlikely to be reintroduced any
  time soon, and if it ever is it will need to be done in a
  different way anwyay.
- Invoke 'sed' and 'tr' directly instead of via $SED and $TR
  variables. We're not building our own dynamically linked 'sed'
  and 'tr' in this distribution so LD_LIBRARY_PATH is irrelevant.
  If we ever do again, there are better ways to make sure the OS
  standard 'sed' and 'tr' are invoked than this kludge.
- Use note() consistently to print warnings to standard error.
  note() is changed to print each argument on a new line prefixed
  by the command name, so arguments need to be quoted now if they
  are to be shown on a single line.
- Use a new err_out() function to error out, avoiding code
  repetition.
2021-12-18 14:47:24 +00:00
Martijn Dekker
ce3e080c0e Release 1.0.0-beta.2
Announcing: KornShell 93u+m 1.0.0-beta.2
https://github.com/ksh93/ksh

In May 2020, when every KornShell (ksh93) development project was
abandoned, development was rebooted in a new fork based on the last
stable AT&T version: ksh 93u+. This new fork is called ksh 93u+m as a
permanent nod to its origin. We're restarting it at version 1.0. Seven
months after the first beta, the second one is ready. Please test this
second beta and report any bugs you find, or help us fix known bugs.

We're now the default ksh93 in some OS distributions, at least Debian
and Slackware! Even though we don't think it's stable release quality
yet, the consensus seems to be that 93u+m is already much better than
the last AT&T release.

Main developers: Martijn Dekker, Johnothan King, hyenias

Contributors: Andy Fiddaman, Anuradha Weeraman, Chase, Gordon Woodhull,
Govind Kamat, Harald van Dijk, Lev Kujawski, Marc Wilson, Ryan Schmidt,
Sterling Jensen

HOW TO GET IT

Please download the source code tarball from our GitHub releases page:

	https://github.com/ksh93/ksh/releases

To build, follow the instructions in README.md or src/cmd/ksh93/README.

HOW TO GET INVOLVED

To report a bug, please open an issue at our GitHub page (see above).
Alternatively, email me at martijn@inlv.org with your report.
To get involved in development, read the brief policy information in
README.md and then jump right in with a pull request or email a patch.
See the TODO file in the top-level directory for a to-do list.

*** MAIN CHANGES between 1.0.0-beta.1 and 1.0.0-beta.2 ***

New features in built-in commands:

- 'cd' now supports an -e option that, when combined with -P, verifies
  that $PWD is correct after changing directories; this helps detect
  access permission problems. See:
  https://www.austingroupbugs.net/view.php?id=253

- 'printf' now supports a -v option as in bash. This assigns formatted
  output directly to variables, which is very fast and will not strip
  final newline (\n) characters.

- The 'return' command, when used to return from a function, can now
  return any status value in the 32-bit signed integer range, like on
  zsh. However, due to a traditional Unix kernel limitation, $? is
  still trimmed to its least significant 8 bits whenever leaving a
  (sub)shell environment.

- 'test'/'[' now supports all the same operators as [[ (including =~,
  \<, \>) except for the different 'and'/'or' operators. Note that
  'test'/'[' remains deprecated due to its unfixable pitfalls;
  [[ ... ]] is recommended instead.

Shell language changes:

- Several improvements were made to the --noexec shell code linter.

- Arithmetic expressions in native ksh mode no longer interpret a
  number with a leading zero as octal in any context. Use 8#octalnumber
  instead (e.g. 8#400 == 256). Arithmetic expressions now also behave
  identically within and outside ((...)) and $((...)).

- POSIX compatibility mode fixes (only applicable with the --posix shell
  option on):
  - A leading zero is now consistently recognised as introducing an octal
    number in all arithmetic contexts.
  - $((inf)) and $((nan)) are now interpreted as regular variables.
  - The '.' built-in no longer runs ksh functions and now only runs
    files.

Bugs fixed:

- '.' and '..' are now once again completed by tab completion.

- If SIGINT is set to ignore, the interactive shell no longer exits on
  Ctrl+C.

- ksh now builds and runs on Apple's new M1 hardware.

- The 'return' and 'exit' commands no longer risk triggering actual
  signals by returning or exiting with a status > 256.

- Ksh no longer behaves badly when parsing a type definition command
  ('typeset -T' or 'enum') without executing it or when executing it in
  a subshell. Types can now safely be defined in subshells and defined
  conditionally as in 'if condition; then enum ...; fi'.

- Discipline functions, especially those applied to PS2 or .sh.tilde,
  will no longer crash your shell upon being interrupted or throwing an
  error.

- Fixed a bug that could corrupt output if standard output is closed
  upon initialising the shell.

- Fixed a bug in the [[ ... ]] compound command: the '!' logical
  negation operator now correctly negates another '!', e.g.,
  [[ ! ! 1 -eq 1 ]] now returns 0/true. Note that this has always been
  the case for 'test'/'['.

- Fixed SHLVL so that replacing ksh by itself (exec ksh) will not
  increase it.

- Arithmetic expressions are no longer allowed to assign out-of-range
  values to variables of types declared with enum.

- The 'time' keyword no longer makes the --errexit shell option
  ineffective.

- Various bugs in libcmd built-in commands (those bound to the
  /opt/ast/bin path by default) have been fixed.

- Various other crashing bugs have been fixed.

Fixes for the shcomp byte code compiler:

- shcomp is now able to compile scripts that define types using enum.

- shcomp now refuses to mess up your terminal by writing bytecode
  to it.

*** MAIN CHANGES between ksh 93u+ 2012-08-01 and 93u+m 1.0.0-beta.1 ***

Hundreds of bugs have been fixed, including many serious/critical bugs.
This includes upstreamed patches from OpenSUSE, Red Hat, and Solaris, fixes
backported from the abandoned 93v- beta and ksh2020 fork, as well as many
new fixes from the community. See the NEWS file for more information, and
the git commit log for complete documentation of every fix. Incompatible
changes have been minimised, but not at the expense of fixing bugs. For a
list of potentially incompatible changes, see src/cmd/ksh93/COMPATIBILITY.

Though there was a "no new features, bugfixes only" policy, some new
features were found necessary, either to fix serious design flaws or to
complete functionality that was evidently intended, but not finished.
Below is a summary of these new features.

New command line editor features:

- The forward-delete and End keys are now handled as expected in the
  emacs and vi built-in line editors.

- In the vi and emacs line editors, repeat count parameters can now also
  be used for the arrow keys and the forward-delete key. E.g., in emacs
  mode, <ESC> 7 <left-arrow> will now move the cursor seven positions to
  the left. In vi control mode, this would be entered as: 7 <left-arrow>.

New shell language features:

- The &>file redirection shorthand (for >file 2>&1) is now available for
  all scripts and interactive sessions and not only for profile/login
  scripts, bringing ksh 93u+m in line with mksh, bash, and zsh.

- File name generation (a.k.a. pathname expansion, a.k.a. globbing) now
  never matches the special navigational names '.' (current directory)
  and '..' (parent directory). This change makes a pattern like .*
  useful; it now matches all hidden files (dotfiles) in the current
  directory, without the harmful inclusion of '.' and '..'.

- Tilde expansion can now be extended or modified by defining a
  .sh.tilde.get or .sh.tilde.set discipline function. This replaces a
  2004 undocumented attempt to add this functionality via a .sh.tilde
  command, which never worked and crashed the shell. See the manual for
  details on the new method.

New features in built-in commands:

- Usage error messages now show the --help/--man self-documentation options.

- Path-bound built-ins (such as /opt/ast/bin/cat) can now be executed by
  invoking the canonical path, so the following will now work as expected:
	$ /opt/ast/bin/cat --version
	  version         cat (AT&T Research) 2012-05-31

- 'command -x' now looks for external commands only, skipping built-ins.
  In addition, its xargs-like functionality no longer freezes the shell on
  Linux and macOS, making it effectively a new feature on these systems.

- 'redirect' now checks if all arguments are valid redirections before
  performing them. If an error occurs, it issues an error message instead
  of terminating the shell.

- 'suspend' now refuses to suspend a login shell, as there is probably no
  parent shell to return to and the login session would freeze.

- 'times' now gives high precision output in a POSIX compliant format.

- 'typeset' now gives an informative error message if an incompatible
  combination of options is given.

- 'whence -v/-a' now reports the location of autoloadable functions.

New features in shell options:

- A new --globcasedetect shell option is added on OSs where we can
  check for a case-insensitive file system (currently Windows/Cygwin,
  macOS, Linux and QNX 7.0+). When this option is turned on, file name
  generation (globbing), as well as file name tab completion on
  interactive shells, automatically become case-insensitive on file
  systems where the difference between upper and lower case is ignored
  for file names. This is transparently determined for each directory, so
  a path pattern that spans multiple file systems can be part
  case-sensitive and part case-insensitive.

- A new --nobackslashctrl shell option disables the special escaping
  behaviour of the backslash character in the emacs and vi built-in
  editors. Particularly in the emacs editor, this makes it much easier to
  go backward, insert a forgotten backslash into a command, and then
  continue editing without having your next cursor key replace your
  backslash with garbage. Note that Ctrl+V (or whatever other character
  was set using 'stty lnext') always escapes all control characters in
  either editing mode.

- A new --posix shell option has been added to ksh 93u+m that makes the
  ksh language more compatible with other shells by following the POSIX
  standard more closely. See the manual page for details. It is enabled by
  default if ksh is invoked as sh, otherwise it is disabled by default.

- Enhancement to -G/--globstar: symbolic links to directories are now
  followed if they match a normal (non-**) glob pattern. For example, if
  '/lnk' is a symlink to a directory, '/lnk/**' and '/l?k/**' now work as
  you would expect.
2021-12-17 04:20:04 +01:00
Johnothan King
85199ab351 Backport ksh93v- bugfix for [[ 1<2 ]] (#380)
Strings compared in [[ with the > and < operators should be compared
lexically. This does not work when the strings are single digits, as
the parser interprets it as a syntax error:
   $ [[ 10<2 ]]   # 10 lexically sorts before 2
   $ echo $?
   0
   $ [[ 1<2 ]]
   /usr/bin/ksh: syntax error: `<' unexpected
   $ echo $?
   3

src/cmd/ksh93/sh/lex.c:
- Don't interpret numbers next to > and < as a redirection while
  inside of [[. This bugfix was backported from ksh93v- 2014-06-25.

src/cmd/ksh93/tests/bracket.sh:
- Add regression tests for the > and < operators.
2021-12-17 03:26:41 +01:00
Martijn Dekker
e67df29c07 Re-fix defining types conditionally or in subshells (re: f508660d)
New version. I'm pretty sure the problems that forced me to revert
it earlier are fixed.

This commit mitigates the effects of the hack explained in the
referenced commit so that dummy built-in command nodes added by the
parser for declaration/assignment purposes do not leak out into the
execution level, except in a relatively harmless corner case.

Something like

    if false; then
        typeset -T Foo_t=(integer -i bar)
    fi

will no longer leave a broken dummy Foo_t declaration command. The
same applies to declaration commands created with enum.

The corner case remaining is:

    $ ksh -c 'false && enum E_t=(a b c); E_t -a x=(b b a c)'
    ksh: E_t: not found

Since the 'enum' command is not executed, this should have thrown
a syntax error on the 'E_t -a' declaration:
ksh: syntax error at line 1: `(' unexpected

This is because the -c script is parsed entirely before being
executed, so E_t is recognised as a declaration built-in at parse
time. However, the 'not found' error shows that it was successfully
eliminated at execution time, so the inconsistent state will no
longer persist.

This fix now allows another fix to be effective as well: since
built-ins do not know about virtual subshells, fork a virtual
subshell into a real subshell before adding any built-ins.

src/cmd/ksh93/sh/parse.c:

- Add a pair of functions, dcl_hactivate() and dcl_dehacktivate(),
  that (de)activate an internal declaration built-ins tree into
  which check_typedef() can pre-add dummy type declaration command
  nodes. A viewpath from the main built-ins tree to this internal
  tree is added, unifying the two for search purposes and causing
  new nodes to be added to the internal tree. When parsing is done,
  we close that viewpath. This hides those pre-added nodes at
  execution time. Since the parser is sometimes called recursively
  (e.g. for command substitutions), keep track of this and only
  activate and deactivate at the first level.
     (Fixed compared to previous version of this commit: calling
  dcl_dehacktivate() when the recursion level is already zero is
  now a harmless no-op. Since this only occurs in error handling
  conditions, who cares.)

- We also need to catch errors. This is done by setting libast's
  error_info.exit variable to a dcl_exit() function that tidies up
  and then passes control to the original (usually sh_exit()).
     (Fixed compared to previous version of this commit: dcl_exit()
  immediately deactivates the hack, no matter the recursion level,
  and restores the regular sh_exit(). This is the right thing to
  do when we're in the process of erroring out.)

- sh_cmd(): This is the most central function in the parser. You'd
  think it was sh_parse(), but $(modern)-form command substitutions
  use sh_dolparen() instead. Both call sh_cmd(). So let's simply
  add a dcl_hacktivate() call at the beginning and a
  dcl_deactivate() call at the end.

- assign(): This function calls path_search(), which among many
  other things executes an FPATH search, which may execute
  arbitrary code at parse time (!!!). So, regardless of recursion
  level, forcibly dehacktivate() to avoid those ugly parser side
  effects returning in that context.

src/cmd/ksh93/bltins/enum.c: b_enum():

- Fork a virtual subshell before adding a built-in.

src/cmd/ksh93/sh/xec.c: sh_exec():

- Fork a virtual subshell when detecting typeset's -T option.

Improves fix to https://github.com/ksh93/ksh/issues/256
2021-12-17 01:28:28 +01:00
Martijn Dekker
2bc1d814c9 Do not exit shell on Ctrl+C with SIGINT ignored (re: 7e5fd3e9)
The killpg(getpgrp(),SIGINT) call added to ed_getchar() in that
commit caused the interactive shell to exit on ^C even if SIGINT is
being ignored. We cannot revert or remove that call without
breaking job control. This commit applies a new fix instead.

Reproducers fixed by this commit:

  SIGINT ignored by child:

    $ PS1='childshell$ ' ksh
    childshell$ trap '' INT
    childshell$ (press Ctrl+C)
    $

  SIGINT ignored by parent:

    $ (trap '' INT; ENV=/./dev/null PS1='childshell$ ' ksh)
    childshell$ (press Ctrl+C)
    $

  SIGINT ignored by parent, trapped in child:

    $ (trap '' INT; ENV=/./dev/null PS1='childshell$ ' ksh)
    childshell$ trap 'echo test' INT
    childshell$ (press Ctrl+C)
    $

I've experimentally determined that, under these conditions, the
SFIO stream error state is set to 256 == 0400 == SH_EXITSIG.

src/cmd/ksh93/sh/main.c: exfile():
- On EOF or error, do not return (exiting the shell) if the shell
  state is interactive and if sferror(iop)==SH_EXITSIG.
- Refactor that block a little to make the new check fit in nicely.

src/cmd/ksh93/tests/pty.sh:
- Test the above three reproducers.

Fixes: https://github.com/ksh93/ksh/issues/343
2021-12-16 19:56:46 +01:00
Martijn Dekker
4856847631 libast/regex: full revert (re: 38aab428, 1aa8f771)
In another branch, this is causing a mysterious test failure in
types.sh. So this is not ready for 1.0.
2021-12-16 19:06:33 +01:00
Martijn Dekker
3779d84220 more regression test updates (re: 7cdb01f6) 2021-12-16 11:46:17 +01:00
Martijn Dekker
fad16a0c63 tests/path.sh: update regression test (re: 7cdb01f6) 2021-12-16 09:20:00 +01:00
Martijn Dekker
7cdb01f625 New selection of default libcmd /opt/ast/bin built-ins
Note that this is only about the /opt/ast/bin built-in commands,
not about the regular pathless builtins such as printf.
To use these, either add /opt/ast/bin to your $PATH or use a
command like 'builtin cp'. As usual, --man provides info.

Removed as defaults for lack of convincing advantages over the OS's
external commands:
- chmod, cmp, head, logname, mkdir, sync, uname, wc

Remain as useful defaults:
- basename, cat, cut, dirname. These are commonly used in
  performance-sensitive code paths in scripts and having them as
  built-ins can be good for performance.
- getconf: This is the only interface to some libast internals that
  is available to ksh. It's also has better functionality than most
  OS-shipped 'getconf' commands, e.g., it can list and query all
  the configuration values.

Added as defaults:
- cp, ln, mv: Having these built in can speed up scripts that
  manage files. Also the AST versions have extended functionality
  (see cp --man, etc.).
- mktemp: External mktemp commands vary too widely and are
  incompatible, but it's important that scripts can securely make
  temporary files, so it's good to ship a known interface to this
  functionality.

As a result, the statically linked ksh binary is very slightly
smaller than before.

Resolves: https://github.com/ksh93/ksh/issues/349
2021-12-16 09:09:37 +01:00
Martijn Dekker
60e0687cec shtests: don't run pty tests with shcomp by default
The pty tests tests the interactive shell. Therefore, running them
through the script compiler is a waste of time.

Not only that, it is reported that the pty tests intermittently
fail with shcomp on some systems. This is not worth trying to fix.

src/cmd/ksh93/tests/shtests:
- Only run pty.sh with shcomp if -c/--compile was explicitly
  specified.
- Document the change.
2021-12-16 09:09:05 +01:00
Martijn Dekker
38aab428bb Temporarily revert separate stack for libast/regex (re: 1aa8f771)
Welcome to AT&T engineering practices in action: a fix in one thing
breaks a completely unreleated thing, but only in very specific
and inscrutable circumstances.

Commit ffe84ee7 introduced a regression test failure in types.sh:

test types begins at 2021-12-14+23:57:35
	types.sh[130]: z.r.s should be z.r.x
test types failed at 2021-12-14+23:57:35 with exit code 1 [ 86 tests 1 error ]
test types(C.UTF-8) begins at 2021-12-14+23:57:35
test types(C.UTF-8) passed at 2021-12-14+23:57:35 [ 86 tests 0 errors ]
test types(shcomp) begins at 2021-12-14+23:57:35
test types(shcomp) passed at 2021-12-14+23:57:35 [ 86 tests 0 errors ]

Only enough, I've *only* found this regression on the GitHub CI
runner. I've tried it on three different regular Linux systems and
it occurs on none of them, nor on macOS.

Another odd thing: it only fails on the first of those three test
runs. But my experiments show it fails very consistently.

Through a process of systematic elimination in a test branch, I've
found that the failure is triggered by the change to using a
separate stack in the regex code. All the other changes are fine.

Using a separate stack improves the robustness of the regex code,
but it apparently exposes some breakage in how the very dodgy
'typeset -T' code is handling the stack, which was being masked by
sharing a stack with it. Or at least that seems like the most
plausible explanation to me right now.

So, until that breakage can be traced and fixed, the regex code now
shares the main stack with everything else again for the time being.

_____
Just to record this: by adding a couple of debug lines:

  typeset -p z | sed 's/^/[DEBUG] /'
  printf '[DEBUG] %s\n' "${z.r.s}" "${z.r.x}"

the symptom reveals itself more clearly on the GitHub runner:

  test types begins at 2021-12-15+17:25:57
  [DEBUG] Y_t z=(X_t r=(x=foo;y=bam;s=''))
  [DEBUG]
  [DEBUG] foo
          types.sh[132]: z.r.s should be z.r.x
  test types failed at 2021-12-15+17:25:57 with exit code 1 [ 86 tests 1 error ]
  test types(C.UTF-8) begins at 2021-12-15+17:25:57
  [DEBUG] Y_t z=(X_t r=(x=foo;y=bam;s=foo))
  [DEBUG] foo
  [DEBUG] foo
  test types(C.UTF-8) passed at 2021-12-15+17:25:57 [ 86 tests 0 errors ]
  test types(shcomp) begins at 2021-12-15+17:25:57
  [DEBUG] Y_t z=(X_t r=(x=foo;y=bam;s=foo))
  [DEBUG] foo
  [DEBUG] foo
  test types(shcomp) passed at 2021-12-15+17:25:57 [ 86 tests 0 errors ]
2021-12-16 09:08:44 +01:00
Martijn Dekker
9166545aa3 bin/package: go POSIX
Yes, we're finally abandoning the old Bourne shell so we can
use sane $(command substitutions) and the like. POSIX sh (with
tolerance for shell bugs) is very highly portable these days.
Even Solaris 10 has a POSIX shell, though not as /bin/sh.

bin/package, src/cmd/INIT/package.sh:
- Be nice: if we're on an obsolete or broken shell, try hard to
  escape to a good one. This should preserve the possibility to
  just run 'bin/package make' via ancient /bin/sh on Solaris 10.
  - Note: zsh without sh emulation is considered broken because
    $path, which we use, is linked to $PATH. You have to run it via
    a symlink named sh or via 'zsh --emulate sh' to disable this.
    Enabling emulation mode after initialisation will not work.
- More self-documentation cleanups and updates.
- Regenerate the text-only fallback version of the self-doc.
- Remove flat view functionality (no arch directory); it may have
  been broken for some time, but quite frankly I could not care
  less. It's yet more featuritis. Building in arch/ is fine.
2021-12-16 09:08:20 +01:00
Martijn Dekker
1aa8f771d8 libast: regex: backport robustness improvements from 93v- beta
There are two main changes:

1. The regex code now creates and uses its own stack (env->mst)
   instead of using the shared standard stack (stkstd). That seems
   likely to be a good thing.

2. Missing mbinit() calls were inserted. The 93v- code uses a
   completely different multibyte characters API, so these needed
   to be translated back to the older API. But, as mbinit() is no
   longer a no-op as of 300cd199, these calls do stop things from
   breaking if a previous operation is interrupted mid-character.

I think there might be a couple of off-by-one errors fixed as well,
as there are two instances of this change:

-		while ((index += skip[buf[index]]) < mid);
+		while (index < mid)
+			index += skip[buf[index]];
2021-12-15 00:50:59 +01:00
Martijn Dekker
7c30a59e25 package: allow 'tee' to catch up before returning (re: beb3c64a)
In the referenced commit message I neglected to mention that, when
doing bin/package make, we're now running 'tee' in the background
again and the building job in the foreground, as opposed to the
other way around. Foreground jobs are more reliably interruptable.
But that reintroduced the problem fixed in 5b8d29d3. Now I don't
know what I was thinking then -- the obvious fix is to add a 'wait'
command, allowing 'tee' to catch up before returning to the prompt.
2021-12-15 00:50:56 +01:00
Martijn Dekker
6137b99a6b package: remove a lot more unused AT&T cruft
This reduces the bin/package script by more than half!

bin/package, src/cmd/INIT/package.sh:
- Remove obsolete and unused package commands: admin, contents,
  license, list, remote, regress, setup, update, verify, write.
- Remove associated documentation.
- Replace install command with a dummy. It'll come back when we
  reintroduce the building of dynamic libaries.
- Update the test command to run the regression tests properly
  and capture the output in arch/*/lib/package/gen/tests.out, as
  documented. Arguments are simply passed to bin/shtests.

src/cmd/INIT/{ditto.sh,hurl.sh,release.c}:
- Removed. These were support scripts for some of the removed
  package commands.

src/cmd/ksh93/tests/pty.sh:
- Avoid failure when capturing output via 'bin/package test' by
  redirecting standard error to /dev/tty when running the tests.
2021-12-15 00:50:45 +01:00
Johnothan King
61fa1b68bf The chown builtin should fail with the same error consistently (#378)
This bug was first reported at <https://www.illumos.org/issues/3782>.
The chown builtin when used on illumos can fail with different error
messages after running the same command twice:

  $ touch /tmp/x
  $ /opt/ast/bin/chown -h 433:434 /tmp/px
  chown: /tmp/x: cannot change owner and group [Not owner]
  $ /opt/ast/bin/chown -h 433:434 /tmp/px
  chown: /tmp/x: cannot change owner and group [Invalid argument]

The error messages differ because the libast struid and strgid
functions will return -2 if the same nonexistent ID is used twice.

The fix for this bug has been ported from here:
https://github.com/illumos/illumos-gate/commit/4162633a7c5961f388fd

src/lib/libcmd/chgrp.c:
- Remove NOID macro and check for a < 0 error status instead.
  This is different from the Illumos fix at
  <https://github.com/illumos/illumos-gate/commit/4162633a7c59>
  which added another macro.

src/lib/libast/man/{strgid,struid}.3:
- Correct errors in the strgid and struid documentation.
- Document that the strgid and struid functions will return -2 if
  the same invalid name is used twice.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2021-12-14 10:37:44 +01:00
Martijn Dekker
c0354a869f Fix build on illumos (re: 7fb814e1, 580ff616)
The standards macros consistency fix for iffe exposed breakage on
illumos: the standards flags aren't set properly. Back in 580ff616,
I set _XPG6 from features/common, which is the wrong place; the
correct place is features/standards -- especially now that iffe
uses its results.

In addition, to get header declarations that aren't somehow in
conflict with themselves on illumos, don't result in "implicit
function declaration" warnings, and expose all the functionality,
we need to define *all* the _XPG[4-7] macros *and* __EXTENSIONS__
*and* _XOPEN_SOURCE. Welp. Thankfully, that's just fine with
Solaris, too.

Thanks to @JohnoKing for the heads-up.
2021-12-14 10:13:41 +01:00
Johnothan King
5084bef41f Fix buggy bin/package environment behavior (#377)
Running 'bin/package environment' should only show what will happen
during the build process, but in it's current state the feature has
some bugs:

1. Errors can occur relating to the failed creation of files and a
   failed attempt to change the directory. Various errors from
   'bin/package environment make' and 'bin/package environment make
   CC=tcc':
   bin/package[5632]: cd: /home/johno/GitRepos/KornShell/ksh/arch/linux.i386-64/lib/package/gen: [No such file or directory]
   bin/package: line 5869: /home/johno/GitRepos/KornShell/ksh/arch/linux.i386-64/lib/package/gen/CC: cannot create [No such file or directory]
   bin/package: line 5869: /home/johno/GitRepos/KornShell/ksh/arch/linux.i386-64/lib/package/gen/CCFLAGS: cannot create [No such file or directory]
   bin/package: line 5869: /home/johno/GitRepos/KornShell/ksh/arch/linux.i386-64/lib/package/gen/CCLDFLAGS: cannot create [No such file or directory]
   bin/package: line 5869: /home/johno/GitRepos/KornShell/ksh/arch/linux.i386-64/lib/package/gen/LDFLAGS: cannot create [No such file or directory]
   bin/package: line 5869: /home/johno/GitRepos/KornShell/ksh/arch/linux.i386-64/lib/package/gen/KSH_RELFLAGS: cannot create [No such file or directory]
   bin/package[5888]: /home/johno/GitRepos/KornShell/ksh/arch/linux.i386-64/lib/package/gen/host: cannot create [No such file or directory]

2. The package script may in some scenarios create a temporary file
   at the root of the repository, such as 'pkg77213.c'.

bin/package, src/cmd/INIT/package.sh:
- Avoid creating files or changing the directory while the
  environment qualifier is on (this also affects the debug
  qualifier). Part of this fix is based on a patch from Marcin
  Cieślak[*], with other fixes applied for similar problems the
  environment qualifier had.
2021-12-14 03:54:50 +01:00
Martijn Dekker
46593a89b7 Get rid of overcomplicated AT&T copyright/license maintenance code
I'm now taking another small step towards extricating this build
system from the long-dead AT&T AST universe.

This commit modifies/reduces the tool called proto. AT&T used proto
for two purposes:

  1. To convert ANSI C code to a form compatible with ancient
     (pre-ANSI) K&R C compilers using extremely complex macro
     voodo. It was similarly capable of translating to C++.
     Theoretically, this entire code base should compile on
     anything from a 1980s K&R C compiler to a modern C++ compiler.
     In practice, given the massive amount of bit rot we inherited,
     I am 99.9% sure that this has been broken for many years.

  2. To automagically insert license comments into source files
     based on an extremely complicated license database system.
     (In all-too-typical AT&T fashion, this second function of
     proto is completely unrelated to the first.)

Function 2 has now been removed because, unlike the AT&T legal
department, I don't think it's worth going to unspeakably extreme
lengths to avoid maintaining license information in source code
files by hand.

In the process, proto.c was cleaned up to look halfway like actual
C code, but it's still processed code: most macros have been
expanded to their numeric value, all comments were stripped, etc.
So don't expect to understand this code. The actual source code is
in these two directories in the ast-open-history repo:

https://github.com/ksh93/ast-open-history/tree/master/src/cmd/proto
https://github.com/ksh93/ast-open-history/tree/master/src/lib/libpp

Meanwhile, nobody wants to compile ksh with a pre-ANSI K&R C
compiler in 2021 -- and there's no good reason to be compatible
with C++ because standard C compilers are universally available.
So, proto will go away when I manage to figure out how to pry it
loose from the innards of this build system.

src/lib/libast/port/astlicense.c:
- Removed. This is al the license handling code that was
  incorporated in proto.c in stripped form. It was not used
  anywhere else, and the environment where it was useful is gone.

src/cmd/INIT/proto.c:
- Cleanup to make this halfway maintainable: indentation, huge
  blocks of empty lines, #line directives, etc.
- Delete all the code corresponding to astlicense.c. This was
  actually easy as it was in a discrete block.
- proto(), pppopen(): Remove 'license'/'notice' and 'options'
  arguments.
- main(): Remove processing of -l (license) and -o (license
  options) flags.

**/Mamfile:
- Update all the proto invocations to remove the -l and -o flags.

bin/package, src/cmd/INIT/package.sh:
- Delete the 'copyright' command, which used the -l and -o
  options to tell proto to extract copyright information from
  *.lic/*.def files in lib/package.

COPYRIGHT:
- Added. This has the information from 'bin/package copyright', with
  the copyright years corrected to plausible values as the AST code
  used the current year (2021) for all of them. It adds ksh 93u+m
  copyright and contributor information at the top as well.
     (Yes, some of the lines in the old non-AT&T copyright notices
  are clipped. This is the actual output of the 'bin/package
  copyright' command as generated by 'proto' in the AST
  distribution. For all that extreme complexity, they couldn't even
  reproduce the notices correctly. But it's officially sanctioned
  by AT&T in exactly this form, so there you have it.)

lib/package/**:
- Removed. All these files are now obsolete and redundant.
2021-12-14 03:15:16 +01:00
Johnothan King
c2ac69b2d5 Use dynamic maximum configuration values when necessary (#370)
This commit fixes an issue with how ksh was obtaining the value of
NGROUPS_MAX. On some systems this setting can be changed (e.g., on
illumos adding 'set ngroups_max=32' to /etc/system then rebooting
changes NGROUPS_MAX from 16 to 32). Ksh was using NGROUPS_MAX with
the assumption it's a static value, which could cause issues on
systems where it isn't static. This bugfix is inspired by the one
from <https://github.com/lkujaw/ast/commit/b1362c3a5>, although it
has been expanded a bit to account for OPEN_MAX as well.

src/cmd/ksh93/sh/init.c, src/lib/libcmd/fds.c:
- Rename the getconf() macro to astconf_long() and move it to ast.h
  to prevent redundancy. Other sections of the code have been
  modified to use this macro for astconf() to account for
  dynamic settings.
- An equivalent macro for unsigned long values (astconf_ulong) has
  been added.
- Prefer sysconf(3) where available. It has better performance as it
  returns a numeric value directly instead of via string
  conversion.
- The astconf_long and astconf_ulong macros have been documented in
  the ast(3) man page.
2021-12-13 07:53:14 +01:00
Martijn Dekker
fc752b574a Re-match '.' and '..' in tab completion (re: 5312a59d, aad74597)
Turns out there is a bona fide, honest-to-goodness use case for
matching '.' and '..' in globbing after all. It's when globbing is
used as the backend mechanism for file name completion in
interactive shell editors. A tab invisibly adds a * at the end of
the word to the left of your cursor and the resulting pattern is
expanded. In 5312a59d, this broke for '.' and '..'.

Typing '.' followed by two tabs should result in a menu that
includes './' and '../'. Typing '..' followed by a tab should
result in '../', (or a menu that includes it if there are files
with names starting with '..'). This is the behaviour in 93u+ and
we should maintain this.

To restore this functionality without reintroducing the harmful
behaviour fixed in the referenced commits, we should special-case
this, allowing '.' and '..' to match only for file name completion.

src/lib/libast/include/glob.h:
- Fix an inaccurate comment: the GLOB_COMPLETE flag is used for
  command completion, not file name completion. This is very clear
  from reading the path_expand() function in sh/expand.c.
- Add new GLOB_FCOMPLETE flag for file name completion.

src/lib/libast/misc/glob.c:
- Adapt flags mask to fit the new flag.
- glob_dir(): If GLOB_FCOMPLETE is passed, allow '.' and '..' to
  match even if expanded from a pattern.
- Clarify the fix from aad74597 with an extended comment based on
  <https://github.com/ksh93/ksh/issues/146#issuecomment-790991990>.

src/cmd/ksh93/sh/expand.c: path_expand():
- If we're in the SH_FCOMPLETE (file name completion) state, then
  pass the new GLOB_FCOMPLETE flag to AST glob(3).

Fixes: https://github.com/ksh93/ksh/issues/372
Thanks to @fbrau for the bug report.
2021-12-13 01:50:50 +01:00
Johnothan King
e54001d58b Various minor capitalization and typo fixes (#371)
This commit fixes various minor typos, punctuation errors and
corrects the capitalization of many names.
2021-12-13 01:49:42 +01:00
Johnothan King
cd562b16e2 Port more shell lint improvements from illumos and ksh93v- (#374)
This commit adds onto <https://github.com/ksh93/ksh/pull/353> by porting
over two additional improvements to the shell linter:

1) The changes in the aforementioned pull request were merged into
   illumos-gate with an additional change.[*] The illumos revision of
   the patch improved the warning for (( $foo = $? )) to specify '$foo'
   causes the warning.[**] Example:
      $ ksh -n -c '(( $? != $bar ))'
      ksh: warning: line 1: in '(( $? != $bar ))', using '$' as in '$bar' is slower and can introduce rounding errors
   While I was porting the illumos patch I did notice one problem. The
   string it uses from paramsub() skips over the initial '{' in
   '${var}', resulting in the warning printing '$var}' instead:
      $ ksh -n -c '(( ${.sh.pid} != $$ ))'
      ...  in '(( ${.sh.pid} != $$ ))', using '$' as in '$.sh.pid}' is slower  ...
   This was fixed by including the missing '{' in the string returned by
   paramsub for ${var} variables.

2) In ksh93v-, parsing x=$((expr)) with the shell linter will cause ksh
   to warn the user x=$((expr)) is slower than ((x=expr)). This
   improvement has been backported with a modified warning:
      # Result from this commit
      $ ksh -n -c 'x=$((1 + 2))'
      ksh: warning: line 1: x=$((1 + 2)) is slower than ((x=1 + 2))
      # Result from ksh93v-
      $ ksh93v -n -c 'x=$((1 + 2))'
      ksh93v: warning: line 1: ((x=1 + 2)) is more efficient than x=$((1 + 2))
   Minor note: the ksh93v- patch had an invalid use of memcmp; this
   version of the patch uses strncmp instead.

References:
be548e87bc
https://code.illumos.org/c/illumos-gate/+/1834/comment/65722363_22fdf8e7/
2021-12-13 01:49:37 +01:00
Ryan Schmidt
66a50ece82 Include <unistd.h> in "fd is * arg to poll" tests (#376)
The "fd is first arg to poll()" and "fd is second arg to poll()" tests
use write() but don't include the system header in which that function
is declared, leading to "error: implicit declaration of function 'write'
is invalid in C99'" when trying to compile the test. By including the
header, the test can now compile and run as intended.
2021-12-13 01:49:29 +01:00
Ryan Schmidt
b31fe887b8 Include <string.h> in "mmap is worth using" test (#375)
The "mmap is worth using" test uses memcpy but doesn't include the
system header in which that function is declared, leading to
"error: implicitly declaring library function 'memcpy'" when trying
to compile the test. By including the header, the test can now
compile and run as intended.

On macOS, the result is still negative. The test seems to time
using mmap and another method and only picks mmap if it's faster.
Editing the test to print out timing information on a few runs, I
see that mmap is 2–3✕ slower than the other method:

$ for i in $(seq 1 5); do ./mmap_worth_using; done
/* mmtm=11 rdtm=3 */
/* 4*mmtm=44 3*rdtm=9 */
/* 4*mmtm=44 5*rdtm=15 */
/* mmtm=9 rdtm=5 */
/* 4*mmtm=36 3*rdtm=15 */
/* 4*mmtm=36 5*rdtm=25 */
/* mmtm=10 rdtm=4 */
/* 4*mmtm=40 3*rdtm=12 */
/* 4*mmtm=40 5*rdtm=20 */
/* mmtm=12 rdtm=4 */
/* 4*mmtm=48 3*rdtm=12 */
/* 4*mmtm=48 5*rdtm=20 */
/* mmtm=12 rdtm=4 */
/* 4*mmtm=48 3*rdtm=12 */
/* 4*mmtm=48 5*rdtm=20 */
2021-12-13 01:49:24 +01:00
Martijn Dekker
0eb857041d libast: fix memccpy(3) feature test
If you passed CC=/some/compiler, the build broke on macOS because the
cc.darwin compiler wrapper wasn't used. Among other things, this
wrapper adds a -D_lib_memccpy flag, defining _lib_memccpy as 1
during the build. That was used to override a false negative result
of the lib_memccpy feature test. This commit fixes that feature
test instead, so it correctly returns positive on macOS.

Thanks to Ryan Smith (@ryandesign) for the bug report and for the
fix to the lib_memccpy test.

src/lib/libast/features/lib:
- Fix the lib_memccpy feature test. It was checking the result of
  mmap(2) incorrectly, resulting in the test crashing on macOS.
  Failure does not return NULL, it returns MAP_FAILED which is
  usually -1.

src/cmd/INIT/cc.darwin*:
- Removed. Any other flags in these wrappers are either related to
  building dynamic libraries, which is not currently supported, or
  were determined to be unnecessary. See the GitHub issue for
  discussion. This now makes it possible to pass `CC` to use any
  compiler you like on the Mac. Notes:
  - Apple's -D_ast_int8_t=int64_t is a no-op; another AST feature
    test already defines _ast_int8_t a 64-bit integer type, even on
    32-bit systems (on which it is defined as 'long long').
  - The -search_paths_first linker flag is the default since 2010.
    But even on my museum-grade Power Mac G5 with Mac OS X 10.3
    (from 2004), it builds and runs just fine without.
  - DCLK_TCK=100 is a no-op as even that ancient Mac system already
    defines it as 100. Plus, it's not even actually used.
  If a need is found for any of these, please report this in a new
  issue so I can special-case it elsewhere in the code.

Resolves: https://github.com/ksh93/ksh/issues/373
2021-12-12 03:12:15 +01:00
Martijn Dekker
65feb9641a Fix two more PS2/SIGINT crashing bugs (re: 3023d53b)
*** Crash 1: ***

ksh crashed if the PS1 prompt contains one or more command
substitutions and you enter a multi-line command substitution
on the command line, then interrupt while on the PS2 prompt.

    $ ENV=/./dev/null /usr/local/bin/ksh -o emacs
    $ PS1='$(echo foo) $(echo bar) $(echo baz) ! % '
    foo bar baz 16999 % echo $(
    > true		<-- here, press Ctrl+C instead of Return

    Memory fault

The crash occurred due to a corrupted lexer state while trying to
display the PS1 prompt.

Analysis: My fix for the crashing bug with Ctrl+C in commit
3023d53b is incorrect and only worked accidentally. sh_fault() is
not the right place to reset the lexer state because, when we press
Ctrl+C on a PS2 prompt, ksh had been waiting for input to finish
lexing a multi-line command, so sh_lex() and other lexer functions
are on the function call stack and will be returned to.

src/cmd/ksh93/sh/fault.c: sh_fault():
- Remove incorrect SIGINT fix.

src/cmd/ksh93/sh/io.c: io_prompt():
- Reset the lexer state immediately before printing every PS1
  prompt. Even in situations where this is redundant it should be
  perfectly safe, the overhead is negligible, and it resolves this
  crash. It may pre-empt other problems as well.

*** Crash 2: ***

If an INT trap is set, and you start entering a multi-line command
substitution, then press Ctrl+C on the PS2 prompt to trigger the
crash, the lexer state is corrupted because the lexer is invoked to
eval the trap action. A crash then occurs on entering the final ')'
of the command substitution.

    $ trap 'echo TRAPPED' INT
    $ echo $(
    > trueTRAPPED	<-- press Ctrl+C to output "TRAPPED"
    > )
    Memory fault

Technically, as SIGINT is trapped, it should not interrupt, so ksh
should execute the trap, then continue with the PS2 prompt to let
the user finish inputting the command. But I have been unsuccessful
in many different attempts to make this work properly. I managed to
get multi-line command substitutions to lex correctly by saving and
restoring the lexer state, but command substitutions were still
corrupted at the parser and/or execution level and I have not
managed to trace the cause of that.

My testing showed that all other shells interrupt the PS2 prompt
and return to PS1 when the user presses Ctrl+C, even if SIGINT is
trapped. I think that is a reasonable alternative, and it is
something I managed to make work.

src/cmd/ksh93/sh/fault.c: sh_chktrap():
- Immediately after invoking sh_trap() to run a trap action, check
  if we're in a PS2 prompt (sh.nextprompt == 2). If so, assume the
  lexer state is now overwritten. Closing the fcin stream with
  fcclose() seems to reliably force the lexer to stop doing
  anything else. Then we can just reset the prompt to PS1 and
  invoke sh_exit() to start new command line, which will now reset
  the lexer state as per above.
2021-12-11 04:29:53 +01:00
Martijn Dekker
0180a65bbf emacs.c: fix compiler warning (re: aa304888) 2021-12-11 04:29:12 +01:00
Martijn Dekker
feedc05037 Reset lexer state on syntax error and on SIGINT (Ctrl+C)
ksh crashed if you pressed Ctrl+C or Ctrl+D on a PS2 prompt while
you haven't finished entering a $(command substitution). It
corrupts subsequent command substitutions. Sometimes the situation
recovers, sometimes the shell crashes.

Simple crash reproducer:

  $ PS1="\$(echo foo) \$(echo bar) \$(echo baz) > "
  foo bar baz > echo $(                        <-- now press Ctrl+D
  > ksh: syntax error: `(' unmatched
  Memory fault

The same happens with Ctrl+C, minus the syntax error message.

The problem is that the lexer state becomes inconsistent when the
lexer is interrupted in the middle of reading a command
substitution of the form $( ... ). This is tracked in the
'lexd.dolparen' variable in the lexer state struct.

Resetting that variable is sufficient to fix this issue. However,
in this commit I prefer to just reinitialise the lexer state
completely to pre-empt any other possible issues. Whether there was
a syntax error or the user pressed Ctrl+C, we just interrupted all
lexing and parsing, so the lexer *should* restart from scratch.

src/cmd/ksh93/sh/fault.c: sh_fault():
- If the shell is in an interactive state (e.g. not a subshell) and
  SIGINT was received, reinitialise the lexer state. This fixes the
  crash with Ctrl+C.

src/cmd/ksh93/sh/lex.c: sh_syntax():
- When handling a syntax error, reset the lexer state. This fixes
  the crash with Ctrl+D.

NEWS:
- Also add the forgotten item for the previous fix (re: 2322f939).
2021-12-09 07:35:12 +01:00
Martijn Dekker
350e52877b Revert "[1.0 release prep] Remove tilde expansion discipline"
This reverts c0334e32, thereby restoring 936a1939.

After the fixes in 0a343244 and a2bc49be, the tilde expansion
disciplines work nicely, so they can come back to the 1.0 branch.
2021-12-09 07:31:37 +01:00
Martijn Dekker
a2bc49bed1 Further robustify .get and .set discipline functions (re: 0a343244) (#368)
This should fix various crashes that remain, at least:
* when running a PS2 discipline at parse time
* when pressing Ctrl+C on a PS2 prompt
* when a special builtin within a discipline throws an error within
  a virtual subshell

src/cmd/ksh93/sh/nvdisc.c:
- In both assign() which handles .set disciplines and lookup()
  which handles .get disciplines, to stop errors in discipline
  functions from wreaking havoc:
  - Save, reinitialise and restore the lexer state in case the
    discipline is run at parse time. This happens with PS2; I'm not
    currently aware of other contexts but that doesn't mean there
    aren't any or that there won't be any. Plus, I determined by
    experimenting that doing this here seems to be the only way to
    make it work reliably. Thankfully the overhead is low.
  - Check the topfd redirection state and run sh_iorestore() if
    needed. Without this, if a special builtin with a redirection
    throws an error in a discipline function, its redirection(s)
    remain permanent. For example, 'trap --bad-option 2>/dev/null'
    in a PS2.get() discipline would kill standard error, including
    all your prompts.

src/cmd/ksh93/sh/io.c: io_prompt():
- Before getting the value of the PS2 prompt, save the stack state
  and restore it after. This stops a PS2.get discipline function
  from corrupting a command substitution that the user is typing.
  Doing this in assign()/lookup() is ineffective, so do it here.

Fixes: https://github.com/ksh93/ksh/issues/347
2021-12-09 07:31:30 +01:00
Johnothan King
1f5ad85cd4 Minor improvements to the regression tests (#366)
- tests/basic.sh: Fix a regression test that was failing under dtksh by
  allowing the error message to name ksh 'lt-dtksh'. Additionally, fix
  the test's inaccurate failure message (a version string is not what
  the regression test expects).

- test/builtins.sh: Exclude the expr builtin from the unrecognized
  options test because it's incompatible. Additionally, put the
  unrecognized options test inside of a function to ensure that it works
  with the future local builtin (https://github.com/ksh93/ksh/issues/123).

- tests/io.sh: The long seek test may fail to seek past the 2 GiB
  boundary on 32-bit systems, so only allow it to run on 64-bit.
  References:
  3222ac2b59/f/ksh-1.0.0-beta.1-regre-tests.patch
  a5c692e1bd

- tests/substring.sh: Add a regression test for the ${.sh.match}
  crashing bug fixed in commit 1bf1d2f8.
2021-12-09 06:43:39 +01:00
Martijn Dekker
caae7dd7c9 ksh93/Mamfile: add missing header dependencies
src/cmd/ksh93/sh/Mamfile:
- For edit/edit.c, add include/shlex.h (re: f99ce517).
- For sh/shcomp.c, add include/{path,terminal}.h (re: 141fa68e).

Thanks to @JohnoKing for flagging these up.
2021-12-09 06:43:30 +01:00
Martijn Dekker
aa3048880b cleanup: get rid of KSHELL and _BLD_shell preprocessor macros
Once upon a time it might have been possible to build certain parts
of ksh, such as the emacs and vi editors and possibly even the
name/value library (nval(3)) as independent libraries. But given
the depressing amount of bit rot in the code that we inherited, I
am certain that disabling either of these macros had been resulting
in a broken build for many years before AT&T abandoned this code
base. These are certainly not going to be useful now.

Meanwhile the KSHELL macro got in the way of me today, because the
Mamfile did not define it for all the .c files, but some headers
declared some functionality conditionally upon that macro. So
including <io.h> in, e.g., nvdisc.c did not declare the same
functions as including that header in files with KSHELL defined.
This inconsistency is now gone as well, for various files.

I'm currently working on making it possible once again to build
libshell as a dynamic library; that should be good enough. And that
never involved disabling either of these macros.
2021-12-09 06:43:22 +01:00
Johnothan King
2b8eaa6609 Fix handling of files without newlines in the head and tail builtins (#365)
The head and tail builtins don't correctly handle files that lack
newlines[*]:
    $ print -n foo > /tmp/bar
    $ /opt/ast/bin/head -1 /tmp/bar  # No output
    $ print -n 'foo\nbar' > /tmp/bar
    $ /opt/ast/bin/tail -1 /tmp/bar
    foo
    bar$
This commit backports the required changes from ksh93v- to handle files
without a newline in the head and tail builtins. (Also note that the
required fix to sfmove was already backported in commit 1bd06207.)

src/lib/libcmd/{head,tail}.c:
- Backport the relevant ksh93v- code for handling files
  without newlines.

src/cmd/ksh93/tests/builtins.sh:
- Add a few regression tests for using 'head -1' and 'tail -1' on a file
  missing and ending newline.

[*]: https://www.illumos.org/issues/4149
2021-12-09 06:43:10 +01:00
Johnothan King
beccb93fd4 Fix various compiler warnings and minor issues (#362)
List of changes:
- Fixed some -Wuninitialized warnings and removed some unused variables.

- Removed the unused extern for B_login (re: d8eba9d1).

- The libcmd builtins and the vmalloc memfatal function now handle
  memory errors with 'ERROR_SYSTEM|ERROR_PANIC' for consistency with how
  ksh itself handles out of memory errors.

- Added usage of UNREACHABLE() where it was missing from error handling.

- Extend many variables from short to int to prevent overflows (most
  variables involve file descriptors).

- Backported a ksh2020 patch to fix unused value Coverity issues
  (https://github.com/att/ast/pull/740).

- Note in src/cmd/ksh93/README that ksh compiles with Cygwin on
  Windows 10 and Windows 11, albeit with many test failures.

- Add comments to detail some sections of code. Extensive list of
  commits related to this change:
  ca2443b5, 7e7f1372, 2db9953a, 7003aba4, 6f50ff64, b1a41311,
  222515bf, a0dcdeea, 0aa9e03f, 61437b27, 352e68da, 88e8fa67,
  bc8b36fa, 6e515f1d, 017d088c, 035a4cb3, 588a1ff7, 6d63b57d,
  a2f13c19, 794d1c86, ab98ec65, 1026006d

- Removed a lot of dead ifdef code.

- edit/emacs.c: Hide an assignment to avoid a -Wunused warning. (See
  also https://github.com/att/ast/pull/753, which removed the assignment
  because ksh2020 removed the !SHOPT_MULTIBYTE code.)

- sh/nvdisc.c: The sh_newof macro cannot return a null pointer because
  it will instead cause the shell to exit if memory cannot be allocated.
  That makes the if statement here a no-op, so remove it.

- sh/xec.c: Fixed one unused variable warning in sh_funscope().

- sh/xec.c: Remove a fallthrough comment added in commit ed478ab7
  because the TFORK code doesn't fall through (GCC also produces no
  -Wimplicit-fallthrough warning here).

- data/builtins.c: The cd and pwd man pages state that these builtins
  default to -P if PATH_RESOLVE is 'physical', which isn't accurate:
     $ /opt/ast/bin/getconf PATH_RESOLVE
     physical
     $ mkdir /tmp/dir; ln -s /tmp/dir /tmp/sym
     $ cd /tmp/sym
     $ pwd
     /tmp/sym
     $ cd -P /tmp/sym
     $ pwd
     /tmp/dir
  The behavior described by these man pages isn't specified in the ksh
  man page or by POSIX, so to avoid changing these builtin's behavior
  the inaccurate PATH_RESOLVE information has been removed.

- Mamfiles: Preserve multi-line errors by quoting the $x variable.
  This fix was backported from 93v-.
  (See also <https://github.com/lkujaw/ast/commit/a7e9cc82>.)

- sh/subshell.c: Remove set but not used sp->errcontext variable.
2021-12-09 06:42:59 +01:00
Martijn Dekker
b3050769ea Fix 'return' emitting signals; allow arbitrary return values
When a global EXIT trap is set, and a ksh-style function exits with
a status > 256 that could have been the result of a signal, then
the shell incorrectly issues that signal to itself. Depending on
the signal, this causes ksh to terminate itself ungracefully:

    $ cat /tmp/exit267
    trap 'echo OK' EXIT  # This trap triggers the crash
    function foo { return 267; }
    foo
    $ bash /tmp/exit267
    OK
    $ ksh-3aee10d7 /tmp/exit267
    OK
    $ ksh /tmp/exit267
    Memory fault(coredump)

On most systems, status 267 corresponds to SIGSEGV. The reported
memory fault is not real; it results from ksh incorrectly killing
itself with that signal.

The problem is caused by two factors:

1. As of 93u+ 2012-08-01, ksh explicitly allows 'return' to use an
   exit status corresponding to a signal (from 257 to end of signal
   range). The rest of the integer range is trunctated to 8 bits.
   This is contrary to both 'man ksh' and 'return --man' which both
   say it's always truncated to 8 bits. Plus, combined with point 2
   below, this new behaviour is nonsensical, as 'return' has no
   business actually generating signals. However, a couple of
   regression tests now depend on this, as may some scripts.

2. When a ksh-style function does not handle a signal, the signal
   is passed down to the parent environment and ksh does this by
   reissuing the signal to its own process after leaving the
   function scope. However, it does this by checking the exit
   status, which is very bad practice as there is no guarantee
   that an exit status corresponding to a signal was in fact
   produced by a signal, particularly after they changed the
   behaviour of 'return' per 1 above.

This commit fixes both issues. It also takes a proper decision on
allowable 'return' exit status arguments. Since 93u+ was released
nearly a decade ago and some scripts may now rely on being able to
pass certain exit statuses out of the 8-bit range, we should not
disallow this now. But neither should we be half-hearted in
allowing only some arbitrary selection of 9-bit statuses; 'return'
values categorically should have nothing to do with signals, so
this is no basis for limiting them. We're now allowing the full
unsigned integer range, which is usually 32 bits. This is like zsh,
and may create some interesting possibilities for scripts.
Just don't forget that $? will still lose all but its 8 least
significant bits when leaving the current (sub)shell environment.

src/cmd/ksh93/sh/xec.c: sh_funscope():
- Fix passing down unhandled signals from interrupted ksh functions
  (jumpval==SH_JMPFUN) to the parent environment. Do not pay any
  attention to the exit status. Instead, use sh.lastsig (a.k.a.
  shp->lastsig). It is set by sh_fault() in fault.c for just this
  purpose and contains the last signal handled for the current
  command. It is reset in sh_exec() before running any new command.
  So if it contains a signal, that is the one that interrupted the
  ksh function, so it's the correct one to pass down. (Further
  evidence: sh_subshell() was already using this in the same way.)

src/cmd/ksh93/bltins/cflow.c: b_return():
- Allow any signed int return value when invoked as and behaving
  like 'return'.
- Add warning if a passed value is out of int range. Set the exit
  status to 128 in that case; int overflow is undefined behaviour
  in C and we want consistent behaviour across platforms. It should
  be safe enough to check if the long and int values are equal.
- Refactor for clarity.

src/cmd/ksh93/sh/subshell.c: sh_subshell():
- If a function returns with a status out of the 8 bit range in a
  virtual subshell, this status could be passed down to the parent
  shell in full. However, if the subshell forks, then the kernel
  will enforce an 8-bit exit status. That is inconsistent. Scripts
  should not be able to tell the difference between forked and
  non-forked subshells, so artificially enforce that limit here.

Other changed files:
- Documentation updates and copy-edits.
- Update an AT&T functions.sh regress test to allow arbitrary
  integer return values for functions.
- Add regression tests based in part on @JohnoKing's reproducers.
- Rework some vaguely related regression tests to fail gracefully.

Thanks to Johnothan King for the report and the testing.

Fixes: https://github.com/ksh93/ksh/issues/364
2021-12-09 06:41:39 +01:00
Martijn Dekker
7fb814e186 iffe: include OS standards macros in iffe tests by default
In iffe tests, some C functions are found in system libraries, but
then are not declared by the system headers on some systems because
the expected standards macros aren't defined, causing the system
headers to hide the function declarations. This may cause warnings
about invalid implicit function declarations in some tests (which
only show up if you export IFFEFLAGS=-d1), but may also cause false
negative test results. The iffe tests should be given the same
environment that their test results are going to be used in.

libast's first-run and most central feature test, 'standards',
figures out what standards macros need to be used on the current
system to get the system headers to declare the expected
functionality. All code that links to libast depends on the header
generated by this feature test. So iffe should use this result for
the tests as well, as soon as it's available (which is early in
libast's compilation cycle).

Concrete example: on Cygwin, in src/cmd/builtin/features/pty, the
'lib ptsname' test detects ptsname(3) in the system library, but
the output{...}end block that uses the _lib_ptsname feature test
result throws an 'implicit function definition' warning because
Cygwin's stdlib.h does not define this function without the
appropriate standards macros being defined first.

src/cmd/INIT/iffe.sh:
- If ${INSTALLROOT}/src/lib/libast/FEATURE/standards is available,
  incorporate it directly into iffe's own block of compiler
  massaging macros. Do not use #include as the necessary -I flags
  are not added for every test.
- Centrally define the iffe version once, not twice.
- Update and tweak getopts self-documentation (iffe --man).
- While we're here, add macOS/Darwin's DYLD_LIBRARY_PATH to the
  supported dynamic library search variables. On macOS, this is
  normally disabled by System Integrity Protection, plus this
  distribution uses static libraries at build time, so this is for
  completeness' sake and not much else. This was ported from
  @lkujaw's fork: https://github.com/lkujaw/ast/commit/48ff05442994
  (thanks to @JohnoKing for pointing it out).

src/lib/libast/features/standards:
- Add a comment. This is to update the file's timestamp, ensuring
  that everything will be recompiled after this commit.
2021-12-07 05:25:06 +01:00
Martijn Dekker
c842f46bbd tests/leaks.sh: increase tolerances again
Another day, another attempt to quash regress test fails caused by
implausibly small memory "leaks".

Apparently, on later macOS versions, 'ps' got more unreliable, so
increase tolerance from 8 to 12 bytes. Maximum reported leaks.sh
failure was 188 KiB after 16384 iterations (11 bytes/iteration).

And on some Linux versions, /proc sometimes acts weird. The
following is apparently consistent on Arch Linux:
	shcomp-leaks.ksh[177]: memory leak with read -C when using
	<<< (leaked approx 65536 bytes after 4096 iterations)
...which is 16 bytes per iteration, still not large enough to make
a real leak plausible.

Resolves: https://github.com/ksh93/ksh/issues/363
2021-12-06 10:05:36 +01:00
Martijn Dekker
17f13345dc Build & getconf fixes for macOS (and UnixWare)
src/lib/libast/features/standards:
- Add heuristic (u_long availability) for systems that hide rather
  than reveal functionality in the presence of _POSIX_SOURCE, etc.
- Define _DARWIN_C_SOURCE, like _GNU_SOURCE, to enable the full
  range of definitions on macOS systems.
- Due to the above, remove MACH (macOS)-specific hack.
- These changes ported from https://github.com/att/ast/pull/1492 -
  thanks to Lev Kujawski (@lkujaw). His PR indicates that this
  fixes the standards macros on UnixWare, too. Therefore, no longer
  exclude UnixWare from standards macros (re: ff70c27f).

src/lib/libast/comp/conf.sh:
- Promote the 'op' member in Conf_t (struct Conf_s) from short to
  int. This allows some Darwin/macOS values, now exposed, to fit
  that would otherwise be truncated, namely:
  _CS_DARWIN_USER_CACHE_DIR               65538
  _CS_DARWIN_USER_DIR                     65536
  _CS_DARWIN_USER_TEMP_DIR                65537
  Thus, the following AST getconf values are now correct on macOS:
  $ /opt/ast/bin/getconf | grep ^DARWIN
  DARWIN_USER_CACHE_DIR=/var/folders/nx/(REDACTED)/C/
  DARWIN_USER_DIR=/var/folders/nx/(REDACTED)/0/
  DARWIN_USER_TEMP_DIR=/var/folders/nx/(REDACTED)/T/

src/lib/libast/features/tty:
- Include <sys/ioctl.h> if available. This silences a compiler
  warning in src/lib/libast/misc/procopen.c about an invalid
  implicit declaration of ioctl(2).
2021-12-06 10:05:30 +01:00
Martijn Dekker
0d2d6bb47b Documentation tweaks (re: cd8c48cc)
The information about an out of memory error does not apply
to the version of the PR that was eventually committed since
it does not use getcwd() which might cause such an error.
See https://github.com/ksh93/ksh/pull/358#issuecomment-986294934
2021-12-06 06:58:34 +01:00
Johnothan King
cd8c48cc5a Add the '-e' flag to the 'cd' builtin (#358)
This change adds the -e flag to the cd builtin, as specified in
<https://www.austingroupbugs.net/view.php?id=253>. The -e flag is
used to verify if the the current working directory after 'cd -P'
successfully changes the directory, and returns with exit status 1
if the cwd couldn't be determined. Additionally, it causes all
other errors to return with exit status >1 (i.e., status 2 unless
ENOMEM occurs) if -e and -P are both active.

src/cmd/ksh93/bltins/cd_pwd.c:
- Add -e option to the cd builtin command. It verifies $PWD by
  using test_inode() to execute the equivalent of [[ . -ef $PWD ]].
- The check for restricted mode has been moved after optget to
  allow 'cd -eP' to return with exit status 2 when in restricted
  mode. To avoid changing the previous behavior of cd when -e isn't
  passed, extra checks have been added to prevent cd from printing
  usage information in restricted mode.

src/cmd/ksh93/tests/builtins.sh:
- Add regression tests for the exit status when using the cd -P
  flag with and without -e.

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/sh.1:
- Document the addition of -e to the cd builtin.
2021-12-06 06:58:11 +01:00
Martijn Dekker
a3f4b5efd1 out of memory checks: add missing sh_getcwd() wrapper (re: 7ad274f8)
getcwd() with 0/NULL arguments also mallocs, so needs a check.
2021-12-05 22:02:41 +01:00
Johnothan King
f6a6d236f1 Port over illumos fixes for conf.sh (#361)
This commit ports over two of Andy Fiddaman's bugfixes to conf.sh
on illumos:

- The compiler isn't passed on to an invocation of iffe. The bugfix is
  from this commit: <https://github.com/citrus-it/ast/commit/63563232>
- The getconf builtin is missing several parameters on illumos.
  Reproducer:
     $ /opt/ast/bin/getconf ADDRESS_WIDTH
     getconf: Invalid argument (ADDRESS_WIDTH)  # Should output '64'
  This bug occurs because conf.sh expects GNU sed and fails to work
  properly with other sed implementations. The bugfix and original bug
  report can be found here:
  https://www.illumos.org/issues/14044
  https://github.com/citrus-it/ast/commit/ba443cfd
2021-12-05 19:57:15 +01:00
Martijn Dekker
6faf43791b Disable vmalloc by default; fix build on Cygwin
Vmalloc is incompatible with Cygwin, but the code to disable it on
Cygwin did not work properly, somehow causing the build to freeze
at a seemingly unrelated point (i.e., when iffe feature tests
attempt to write to sfstdout).

Vmalloc has wasted my time for the last time, so now it's getting
disabled by default even on development builds and you'll have to
pass -D_AST_vmalloc in CCFLAGS to enable it for testing purposes.

This commit has a few other build tweaks as well.

src/lib/libast/features/vmalloc:
- tst map_malloc: Remove no-op #if __CYGWIN__ block which was in
  the #else clause of another #if __CYGWIN__ block.
- Output block ('cat{'):
  - Instead of disabling vmalloc for certain systems, disable it
    unless _AST_vmalloc is defined.
  - To disable it, set _AST_std_malloc as well as _std_malloc, just
    to be sure.

src/lib/libast/vmalloc/malloc.c:
- Remove ineffective Cygwin special-casing.

src/lib/libcmd/vmstate.c:
- This is only useful for vmalloc, so do not pointlessly compile it
  if vmalloc is disabled.

src/lib/libast/man/vmalloc.3:
- Add deprecation notice.

Resolves: https://github.com/ksh93/ksh/issues/360
2021-12-05 19:28:45 +01:00
Martijn Dekker
6e1e9b738b sh.1: correct doc on format used by ${.sh.command} (re: a959a352)
It actually uses the same format as xtrace, which is *not* always
shellquoted and eval-safe.
2021-12-05 19:28:31 +01:00
Johnothan King
0a343244c1 nvdisc.c: Fix crash after an error or signal in discipline function (#356)
This patch fixes the crashes experienced when a discipline function
exited because of a signal or an error from a special builtin. The
crashes were caused by ksh entering an inconsistent state after
performing a longjmp away from the assign() and lookup() functions in
nvdisc.c. Fixing the crash requires entering a new context, then setting
a nonlocal goto with sigsetjmp(3). Any longjmps that happen while
running the discipline function will go back to assign/lookup, allowing
ksh to do a proper cleanup afterwards.

Resolves: https://github.com/ksh93/ksh/issues/346
2021-12-05 19:28:09 +01:00
Martijn Dekker
520f530198 lex.c: add default, though it should never happen (re: b0282f26)
If a bug is ever introduced that causes a [[ ... ]] operator to be
unhandled by the linter, we should at least avoid writing random
memory contents to standard error. In non-release builds, let's
abort() so the problem can be more easily backtraced.
2021-12-05 19:27:38 +01:00
Johnothan King
6904585f49 Port illumos' shell linter improvements (#353)
This commit ports over two improvements to the shell linter from
illumos (original patch written by Andy Fiddaman). Links to the
relevant bug reports and the original patch:
https://www.illumos.org/issues/13601
https://www.illumos.org/issues/13631
c7b656fc71

The first improvement is to the lint warning for arithmetic
operators in [[ ... ]]. The ksh linter now suggests the correct
equivalent operator to use in ((...)). Example:
  $ ksh -nc '[[ 30 -gt 25 ]]'
  # Original warning
  warning: line 1: -gt within [[ ... ]] obsolete, use ((...))
  # New warning
  warning: line 1: [[ ... -gt ... ]] obsolete, use ((... > ...))

The second improvement pertains to variable expansion in arithmetic
expressions. The ksh linter now suggests referencing variable names
directly:
  $ ksh -nc 'integer foo=40; (($foo < 50 ))'
  # Old warning
  warning: line 1: variable expansion makes arithmetic evaluation less efficient
  # New warning
  warning: line 1: in '(($foo < 50))', using '$' is slower and can introduce rounding errors

src/cmd/ksh93/{data/lexstates,sh/lex,sh/parse}.c:
- Port the improved shell lint warnings from illumos to ksh93u+m.
- The original checks for arithmetic operators involved a bunch of
  if statements with inefficient calls to strcmp(3). These were
  replaced with a more efficient switch statement that avoids
  strcmp.
2021-12-05 19:27:16 +01:00
Johnothan King
370440473e Fix KEYBD trap crash when inputting a command substitution (#355)
This change fixes a crash that can occur after setting a KEYBD trap
then inputting a multi-line command substitution. The crash is
similar to issue #347, but it's easier to reproduce since it
doesn't require you to setup a kshrc file. Reproducer for the
crash:

  $ ENV=/./dev/null ksh
  $ trap : KEYBD
  $ : $(
  > true)
  Memory fault(coredump)

The bugfix was backported (with considerable changes) from ksh93v-
2013-10-08. The crash was first reported on the old mailing list:
https://www.mail-archive.com/ast-users@lists.research.att.com/msg00313.html

src/cmd/ksh93/{include/shlex.h,sh/lex.c}:
- To fix this properly, we need sizeof(Lex_t) to work as expected
  in edit.c, but that is thwarted by the _SHLEX_PRIVATE macro in
  lex.c which shlex.h uses to add private structs to the Lex_t type
  in lex.c only. So get rid of that _SHLEX_PRIVATE macro and make
  those members part of the centrally defined struct, renaming them
  to make it clear they're considered private to lex.c.

src/cmd/ksh93/edit/edit.c:
- Now that we can get its size, save and restore the shell lexing
  context when a KEYBD trap is present.

src/cmd/ksh93/tests/pty.sh:
- Add a regression test for the KEYBD trap crash.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2021-11-30 04:27:31 +01:00
Johnothan King
bfad44e56d Fix memory fault on ARM-based Macs (#354)
Ksh segfaults on M1 Macs, apparently because Apple's compiler
optimizer can't cope with overriding bzero(3) with libast's
implementation (though it's nothing more than "memset(b, 0, n);").

Apple has disabled libast's bzero function since 2018-12-04:
https://opensource.apple.com/source/ksh/ksh-27/patches/src__lib__libast__comp__omitted.c.diff.auto.html

src/lib/libast/comp/omitted.c:
- Only fall back to the libast bzero function if the OS lacks an
  implementation of bzero.
- Remove the bzero undef since this fallback is only reached if the
  OS doesn't have bzero.

NEWS:
- Add compat info for macOS on ARM64. This notes that macOS
  Monterey can now compile and run ksh, although there is one
  regression test failure:
        builtins.sh[345]: printf %H: invalid UTF-8 characters
        (expected %3F%C2%86%3F%3F%3F; got %3F%C2%86%3F%3Fv%3F%3F)

Thanks to @DesantBucie for the report and the testing.

Resolves: https://github.com/ksh93/ksh/issues/329
2021-11-29 20:12:57 +01:00
Johnothan King
a0eeb14787 Stop the time keyword overriding errexit (#351)
This bug was first reported in <https://www.illumos.org/issues/7694>.
The time keyword currently overrides the errexit shell option,
allowing failing scripts to continue after an error:

  $ cat 1.sh
  #!/bin/sh
  time false   # This should cause the script to exit
  echo FAILURE
  true
  $ ksh -o errexit 1.sh

  real    0m0.00s
  user    0m0.00s
  sys     0m0.00s
  FAILURE

src/cmd/ksh93/sh/xec.c:
- When the time keyword runs a command, pass the errexit state flag
  to the sh_exec call. This state flag is required for ksh to exit
  when a command fails while the errexit option is on.

src/cmd/ksh93/tests/basic.sh:
- Add a regression test based on the reproducer.
2021-11-29 20:12:15 +01:00
Martijn Dekker
f508660ddf Revert "Fix defining types conditionally and/or in subshells (re: 8ced1daa)"
This reverts commit 2b9cbbbc8e.

This is not ready for prime time. Crashses when running a $PS2
discipline function. This needs fixing and more testing in
development before making it into the 1.0 branch. In the meantime,
that terrible problem with types is back, sorry about that.
2021-11-29 20:08:53 +01:00
Martijn Dekker
2b9cbbbc8e Fix defining types conditionally and/or in subshells (re: 8ced1daa)
This commit mitigates the effects of the hack explained in the
referenced commit so that dummy built-in command nodes added by the
parser for declaration/assignment purposes do not leak out into the
execution level, except in a relatively harmless corner case.

Something like

	if false; then
		typeset -T Foo_t=(integer -i bar)
	fi

will no longer leave a broken dummy Foo_t declaration command. The
same applies to declaration commands created with enum.

The corner case remaining is:

$ ksh -c 'false && enum E_t=(a b c); E_t -a x=(b b a c)'
ksh: E_t: not found

Since the 'enum' command is not executed, this should have thrown
a syntax error on the 'E_t -a' declaration:
ksh: syntax error at line 1: `(' unexpected

This is because the -c script is parsed entirely before being
executed, so E_t is recognised as a declaration built-in at parse
time. However, the 'not found' error shows that it was successfully
eliminated at execution time, so the inconsistent state will no
longer persist.

This fix now allows another fix to be effective as well: since
built-ins do not know about virtual subshells, fork a virtual
subshell into a real subshell before adding any built-ins.

src/cmd/ksh93/sh/parse.c:

- Add a pair of functions, dcl_hactivate() and dcl_dehacktivate(),
  that (de)activate an internal declaration built-ins tree into
  which check_typedef() can pre-add dummy type declaration command
  nodes. A viewpath from the main built-ins tree to this internal
  tree is added, unifying the two for search purposes and causing
  new nodes to be added to the internal tree. When parsing is done,
  we close that viewpath. This hides those pre-added nodes at
  execution time. Since the parser is sometimes called recursively
  (e.g. for command substitutions), keep track of this and only
  activate and deactivate at the first level.

- We also need to catch errors. This is done by setting libast's
  error_info.exit variable to a dcl_exit() function that tidies up
  and then passes control to the original (usually sh_exit()).

- sh_cmd(): This is the most central function in the parser. You'd
  think it was sh_parse(), but $(modern)-form command substitutions
  use sh_dolparen() instead. Both call sh_cmd(). So let's simply
  add a dcl_hacktivate() call at the beginning and a
  dcl_deactivate() call at the end.

- assign(): This function calls path_search(), which among many
  other things executes an FATH search, which may execute arbitrary
  code at parse time (!!!). So, regardless of recursion level,
  forcibly dehacktivate() to avoid those ugly parser side effects
  returning in that context.

src/cmd/ksh93/bltins/enum.c: b_enum():

- Fork a virtual subshell before adding a built-in.

src/cmd/ksh93/sh/xec.c: sh_exec():

- Fork a virtual subshell when detecting typeset's -T option.

Improves fix to https://github.com/ksh93/ksh/issues/256
2021-11-29 09:02:07 +01:00
Martijn Dekker
43cd8da2fe Fix 'command' prefix in enum type def pre-parsing (re: 1dc18346)
Symptom:

$ ksh -c 'command enum -i P_t=(a b); P_t -A v=([f]=b); typeset -p v'
ksh: syntax error at line 1: `(' unexpected

Expected: no syntax error, and output of 'P_t -A v=([f]=b)'.

src/cmd/ksh93/sh/parse.c: check_typedef():
- For enum, skip over any possible 'command' prefixes before
  pre-parsing options with optget (or, technically, skip anything
  else that might come before 'enum', though I don't think anything
  else is possible).
- The sh_addbuiltin() call at the end to pre-add the builtin
  obtained the node pointer to the built-in and the node flags from
  the parser tree. This did not work if a 'command' prefix was
  present. However, we don't actually need this. For parsing
  purposes, the BLT_DCL flag for a declaration built-in is
  sufficient; this is what gets the parser to accept
  assignment-arguments including parentheses. So just apply that.
  In addition, let's point it to an actual dummy built-in, 'true'
  (SYSTRUE), so that if a user does run something like 'if false;
  then enum Foo_t=(...); fi', the leaked Foo_t dummy at least won't
  do anything (not even crash).
2021-11-28 21:16:19 +01:00
Martijn Dekker
c9ca0ff531 typeset equivalents: use 'typeset' in error messages (re: 1fbbeaa1)
When giving an invalid or incompatible option to a typeset option
equivalent command (former default alias) such as 'compound' or
'integer', the resulting usage messages are incorrect. Example:

$ ksh -c 'compound -T foo=(typeset -a bar[1]=23)'
ksh: compound: -T cannot be used with other options
Usage: compound [-bflmnprstuxACHS] [-a[[type]]] [-i[base]] [-E[n]]
                [-F[n]] [-L[n]] [-M[mapping]] [-R[n]] [-X[n]]
                [-h string] [-T[tname]] [-Z[n]] [name[=value]...]
   Or: compound -f [name...]
   Or: compound -m [name=name...]
   Or: compound -n [name=name...]
   Or: compound -T [tname[=(type definition)]...]
 Help: compound [ --help | --man ] 2>&1

The error message is wrong (there were no other options) and some
of the listed usages are invalid, like 'compound -f'.

Typeset option equivalent commands should just use 'typeset' in all
their error messages to avoid confusion. This is done by setting
error_info.id to the name of the typeset builtin.
2021-11-28 21:16:17 +01:00
Martijn Dekker
7318afc278 jobs.c: refactor SIGHUP handling; document bug fixed (re: 62cf88d0)
There is quite a bit of no-op code in the job_hup() function due
to conditions that always test false. This commit removes that code
and clarifies the rest, making the purpose of this function clear.

job_hup() (before 62cf88d0: job_terminate()) is called via
job_walk() by sh_done() in fault.c to issue SIGHUP, the "hang up"
signal, to every background job's process group when the current
session is ungracefully disconnected. (One way to trigger such a
disconnection is to forcibly terminate a ssh session by typing '~.'
on a new prompt.)

The bug that Solaris patch 260-22964338 fixed is that ksh then
killed all non-disowned jobs' process groups without considering
that ksh still remembers a job even when all its processes are
finished (have the P_DONE flag). In that condition, the process
group ID may well be reused by another process by now, so it is
dangerous to killpg() it; we risk killing unrelated processes!

This is *not* a hypothetical problem; the Solaris patch exists
because this happened to a Solaris customer. However, the bug
exists on all operating systems. It's rarely triggered but serious,
and it's more likely to occur on heavy workloads that re-use
process/group IDs a lot. And it's on every currently released
non-Solaris version of ksh93. Eesh.

src/cmd/ksh93/sh/jobs.c:
src/cmd/ksh93/include/jobs.h:
- Remove job_terminate() which was unused as of 62cf88d0.
  It could have been fixed instead of replaced. Oh well.
- Refactor job_hup():
  - Remove code that will never be executed because, at
    those points, it is known that pw->p_pgrp != 0.
  - Simplify the loop that checks that there is at least
    one non-P_DONE process so it doesn't need a flag.

For documentation purposes, below is a reproducer for the bug
before the Solaris patch. It is rather involved.

1. Compile the C program below (cpid).
2. In one terminal, 'ssh localhost'.
3. Within the ssh session:
   - 'exec -a-ksh /path/to/buggy/ksh' to get a ksh login shell.
   - 'sleep 1 &' and let it finish. Note down the reported PID.
     That is the one we will reuse. Let's say 26650.
4. In another terminal, run: ./cpid 26650 (the PID from the
   previous step). Now wait until it says "PID 26650 is ready"; it
   has now succeeded at re-using that PID, and will just sit there.
   This process will never voluntarily terminate. If we have the
   bug, the termination of this process will be the symptom.
5. In the first terminal, forcibly terminate the ssh session by
   typing, on a new prompt: ~. (tilde, dot). This triggers the
   buggy routine to issue SIGHUP to all of ksh's background jobs.
6. In the second terminal, the bug is reproduced if cpid has been
   terminated, reporting 'waitpid return 26650, status 0x0001', so
   ksh just killed this process that it had nothing to do with.
   (Note that status 0x0001 refers to being killed by signal 1
   which is SIGHUP.)

cpid.c follows (written by George Lijo, tweaked by me):

 #include <stdio.h>
 #include <stdlib.h>
 #include <unistd.h>
 #include <signal.h>
 #include <sys/wait.h>

 int main(int argc, char *argv[])
 {
	pid_t	pid, rpid, opid;
	int	i, status, npid;
	if (argc != 2)
	{
		fprintf(stderr, "Usage: cpid <PID to re-use>\n");
		exit(1);
	}
	rpid = atoi(argv[1]);
	opid = getpid();
	for (;;)
	{
		if ((pid = fork()) == 0)
		{
			setpgrp();
			pause();
			_exit(0);
		}
		if (pid == rpid)
			break;
		kill(pid, SIGKILL);
		waitpid(pid, NULL, 0);
		if (opid < rpid && pid > rpid)
			printf("Cannot create PID %d\n", rpid);
		opid = pid;
	}
	printf("PID %d is ready\n", pid);
	i = waitpid(pid, &status, 0);
	printf("waitpid return %d, status 0x%4.4x\n", i, status);
	return status;
 }
2021-11-25 19:29:17 +01:00
Martijn Dekker
f3433a696a Reset sh.arithrecursion in sh_exit() instead (re: d50d3d7c)
Since the arithmetic recursion level only becomes incorrect when
an error interrupts the arithmetic subsystem, and all such error
messages call sh_exit(), it should be good enough to reset it
there, so we don't need to do that for nearly every sh_exec() run.
2021-11-25 10:26:09 +01:00
Martijn Dekker
27ccdd2517 Fix parentheses in sh_{push,pop}context macros
The lack of parentheses around the shp parameter expansion made
it impossible to pass something like &sh as the first parameter.
2021-11-25 04:11:41 +01:00
Johnothan King
84ded2d0c4 Backport the ksh93v- rm builtin to fix 'rm -d' (#348)
The -d flag implemented in the rm builtin is completely broken. No
matter what you do it refuses to remove directories, even if -r is
also passed. Reproducer:

  $ mkdir /tmp/empty
  $ PATH=/opt/ast/bin rm -d /tmp/empty
  rm: /tmp/empty: directory
  $ PATH=/opt/ast/bin rm -dr /tmp/empty
  rm: /tmp/empty: directory not removed [Is a directory]

Additionally, the description of 'rm -d' in the man page contradicts
how it's specified in <https://www.austingroupbugs.net/view.php?id=802>.

The ksh93v- rm builtin fixed nearly all of these issues, so I've
backported it to 93u+m and applied one additional fix for 'rm -rd'.

src/lib/libcmd/rm.c:
- Backported the fixes from the ksh93v- rm builtin's -d flag when
  used on empty directories.
- Backported the man page update for rm(1) from ksh93v-.
- The ksh93v- rm builtin had one additional bug that caused the -r
  option to fail when combined with -d. This was fixed by
  overriding -d if -r is also passed.

src/cmd/ksh93/tests/builtins.sh:
- Add regression tests for the rm builtin's -d option.
2021-11-25 03:52:05 +01:00
Martijn Dekker
2d65148fad arith.c: scope(): de-obfuscate some code
This function adds the NV_ADD flag to its 'flags' variable for
nv_serach() calls subject to some checks. However, every call that
uses that variable explicitly turns off the NV_ADD bit again.

A search in the ast-open-history repo reveals that this check
briefly made a difference between versions 2010-06-25 and
2010-08-11, but it's been a complete no-op ever since.

src/cmd/ksh93/sh/arith.c: scope():
- Remove no-op code.
- Resolve the constant expressions involving the 'flags' variable,
  get rid of the variable, and just indicate the flag bitmasks
  directly in the nv_search() calls.
- Detangle and split up the excessively long 'if' construct.

No change in behaviour.

Previously noticed by Kurtis Rader for ksh2020:
https://github.com/att/ast/commit/d5ce3b05
2021-11-25 03:25:39 +01:00
Martijn Dekker
214308f81e '.': disable ksh function lookup in POSIX mode
POSIXly, '.' loads only files, not functions.

This only applies to '.', not 'source' (which is not in POSIX).

src/cmd/ksh93/bltins/misc.c: b_source():
- For ksh function lookup, add an additional check that we're not
  in POSIX mode and running the '.' (SYSDOT) builtin.
2021-11-24 09:12:39 +01:00
Martijn Dekker
c0334e32a1 [1.0 release prep] Remove tilde expansion discipline
Defining a .sh.tilde.get or .sh.tilde.set discipline function to
extend tilde expansion works well as long as the discipline
function doesn't get interrupted (e.g. with Crtl+C) or produce an
error message. Either of those will cause the shell to become
unstable and crash.

This feature is now removed from the 1.0 branch as it is not ready
for prime time. It can return to a release branch if/when we manage
to fix it on the master branch.

Related: https://github.com/ksh93/ksh/issues/346
2021-11-24 07:46:58 +01:00
Martijn Dekker
40de1e92b0 [1.0 release prep] Block namespace defs in ksh functions
In 93v-/ksh2020, namespace defs in any function are a syntax error.
This commit blocks namespace defs for ksh functions only, at the
execution level. This follows some of AT&T original intention while
working around some of the known bugs with namespaces.

Related: https://github.com/ksh93/ksh/issues/325
2021-11-24 07:31:22 +01:00
Martijn Dekker
e3d91ffa90 nv_associative(): finally use proper check for enum (re: b98e32fc)
As of the previous commit, I finally know how to properly check for
a variable of a type created by 'enum'. We need to check for both
the NV_UINT16 attribute and the ENUM_disc discipline.

Also:
- regression test tweaks
- add missing tests for previous commit (f600a5ea)
2021-11-24 02:06:08 +01:00
Martijn Dekker
a66cd72f7d arith: implement range checking for enum types
Within arithmetic expressions, enumeration values of variables of a
type created with the 'enum' command translate to index numbers
from 0 to the number of elements minus 1. However, there was no
range checking on this in the arithmetic subsystem, allowing the
assignment of out-of-range values that did not correspond to any
enumeration value.

Variables of an enum type are internally unsigned short integers
(NV_UINT16), like those created with 'integer -su', except with an
additional discipline function (ENUM_disc).

src/cmd/ksh93/bltins/enum.c,
src/cmd/ksh93/include/builtins.h:
- To implement range checking, the arithmetic system needs access
  to the 'nelem' (number of elements) member of 'struct Enum'. This
  is only defined locally in enum.c. We could move that to name.h
  so arith.c can access it, but enum.c has code that supports
  compiling as standalone. So, instead, define a quick extern
  function, b_enum_elem(), that does the necessary type conversion
  and returns a type's number of elements.
- Add --man documentation for the arithmetic subsystem behaviour
  for enum types. Tell the enuminfo() function, which dynamically
  inserts values into the documentation, how to process new \f tags
  'lastv' (the last-defined value) and 'lastn' (the number of the
  last element).

src/cmd/ksh93/sh/arith.c: arith():
- For NV_UINT16 variables with an ENUM_disc discipline, check the
  range using b_enum_elem() and error out if necessary.

Resolves: https://github.com/ksh93/ksh/issues/335
2021-11-23 22:10:40 +01:00
Johnothan King
e26937b36a Add support for 'stty size' to the libcmd 'stty' builtin (#342)
This commit adds support for 'stty size' to the stty builtin, as
defined in <https://austingroupbugs.net/view.php?id=1053>. The size
mode is used to display the terminal's number of rows and columns.
Note that stty isn't included in the default list of builtin
commands; testing this addition requires adding CMDLIST(stty) to
the table of builtins in src/cmd/ksh93/data/builtins.c.

src/lib/libcmd/stty.c:
- Add support for the size mode to the stty builtin. This mode is
  only used to display the terminal's number of rows and columns,
  so error out if any arguments are given that attempt to set the
  terminal size.
2021-11-23 15:38:14 +01:00
Martijn Dekker
10ef74e1a2 shtests: unignore SIGCONT
For some reason, Void Linux (with musl libc) sets SIGCONT to
ignored on the Linux console, causing the 'sleep -s' test in
builtins.sh to fail spuriously as it relies on SIGCONT to work.

src/cmd/ksh93/tests/shtests:
- Reset SIGCONT using the unadvertised 'trap + SIGCONT' feature.

Resolves: https://github.com/ksh93/ksh/issues/301
2021-11-22 16:55:51 +01:00
Martijn Dekker
74730c8ac7 test/[: Improve error status > 1 (re: 7003aba4, cd2cf236, ef1f53b5)
As I got to know the code better, it now seems painfully obvious
that getting test/[ to issue an exit status >= 2 on error only
requires a simple check in sh_exit() in fault.c, which is called
whenever the shell issues an error message.
2021-11-22 15:37:04 +01:00
Martijn Dekker
8ced1daadf Fix enum type definition pre-parsing for shcomp and dot/source
Parser limitations prevent shcomp or source from handling enum
types correctly:

    $ cat /tmp/colors.sh
    enum Color_t=(red green blue orange yellow)
    Color_t -A Colors=([foo]=red)

    $ shcomp /tmp/colors.sh > /dev/null
    /tmp/colors.sh: syntax error at line 2: `(' unexpected
    $ source /tmp/colors.sh
    /bin/ksh: source: syntax error: `(' unexpected

Yet, for types created using 'typeset -T', this works. This is done
via a check_typedef() function that preliminarily adds the special
declaration builtin at parse time, with details to be filled in
later at execution time.

This hack will produce ugly undefined behaviour if the definition
command creating that type built-in is then not actually run at
execution time before the type built-in is accessed.

But the hack is necessary because we're dealing with a fundamental
design flaw in the ksh language. Dynamically addable built-ins that
change the syntactic parsing of the shell language on the fly are
an absurdity that violates the separation between parsing and
execution, which muddies the waters and creates the need for some
kind of ugly hack to keep things like shcomp more or less working.

This commit extends that hack to support enum.

src/cmd/ksh93/sh/parse.c:
- check_typedef():
  - Add 'intypeset' parameter that should be set to 1 for typeset
    and friends, 2 for enum.
  - When processing enum arguments, use AST getopt(3) to skip over
    enum's options to find the name of the type to be defined.
    (getopt failed if we were running a -c script; deal with this
    by zeroing opt_info.index first.)
- item(): Update check_typedef() call, passing lexp->intypeset.
- simple(): Set lexp->intypeset to 2 when processing enum.

The rest of the changes are all to support the above and should be
fairly obvious, except:

src/cmd/ksh93/bltins/enum.c:
- enuminfo(): Return on null pointer, avoiding a crash upon
  executing 'Type_t --man' if Type_t has not been fully defined due
  to the definition being pre-added at parse time but not executed.
  It's all still wrong, but a crash is worse.

Resolves: https://github.com/ksh93/ksh/issues/256
2021-11-21 17:43:55 +01:00
Martijn Dekker
996def3141 builtins.h: rm broken check for removed SYSDECLARE (re: 921bbcae) 2021-11-21 17:43:23 +01:00
Martijn Dekker
893c6a9068 nv_associative(): clarify value indicating enum (re: 6b9703ff) 2021-11-21 17:43:14 +01:00
Johnothan King
e554a07c56 typeset -T shouldn't list types created with enum (#340)
Listing types with 'typeset -T' will list not only types created with
typeset, but also types created with enum. However, the types created
by enum are not displayed correctly in the resulting output:

$ enum Foo_t=(foo bar)
$ typeset -T
typeset -T Foo_t
typeset -T Foo_t=fo)

The fix for this bug was backported from ksh93v- 2013-10-08.

src/cmd/ksh93/sh/nvtype.c:
- sh_outtype(): Skip over enums when listing types with 'typeset -T'.
2021-11-20 09:48:48 +01:00
Martijn Dekker
cb961788a8 shell.3: fix formatting for sh_{g,s}etscope 2021-11-20 04:53:42 +01:00
Martijn Dekker
98ea0c2dbb tests/signal.sh: fix AT&T's err_exit bogosity (re: 712261c8) 2021-11-20 03:31:10 +01:00
Martijn Dekker
6829fc9a29 tests/leaks.sh: tweak Linux tolerance again (re: 31fe1c28)
The referenced commit did not fix the symptoms on the 1.0 branch
(no vmalloc) on the GitHub CI runners.

The failures are intermittent and are not reproduced with vmalloc
or on other operating systems.

Though the failures occur on a different test each time, the total
amount of "leaked" bytes is always 36864, e.g.:

    leaks.sh[388]: run command with preceding PATH assignment in
    main shell (leaked approx 36864 bytes after 4096 iterations)

36864/4096 equals exactly 9. An odd number, literally and
figuratively, but I suppose that's the tolerance Linux needs.

src/cmd/ksh93/tests/leaks.sh
- Increase tolerance of bytes per iteration from 8 to 9.
2021-11-19 20:21:25 +01:00
Johnothan King
396b388e1f Fix a few issues with $RANDOM seeding in subshells (#339)
This commit fixes an issue I found in the subshell $RANDOM
reseeding code.

The main issue is a performance regression in the shbench fibonacci
benchmark, introduced in commit af6a32d1. Performance dropped in
this benchmark because $RANDOM is always reseeded and restored,
even when it's never used in a subshell. Performance results from
before and after this performance fix (results are on Linux with
CC=gcc and CCFLAGS='-O2 -D_std_malloc'):

  $ ./shbench -b bench/fibonacci.ksh -l 100 ./ksh-0f06a2e ./ksh-af6a32d ./ksh-f31e368 ./ksh-randfix

  benchmarking ./ksh-0f06a2e, ./ksh-af6a32d, ./ksh-f31e368, ./ksh-randfix ...
  *** fibonacci.ksh ***
  # ./ksh-0f06a2e  # Recent version of ksh93u+m
  # ./ksh-af6a32d  # Commit that introduced the regression
  # ./ksh-f31e368  # Commit without the regression
  # ./ksh-randfix  # Ksh93u+m with this patch applied

  -------------------------------------------------------------------------------------------------
  name           ./ksh-0f06a2e        ./ksh-af6a32d        ./ksh-f31e368        ./ksh-randfix
  -------------------------------------------------------------------------------------------------
  fibonacci.ksh  0.481 [0.459-0.515]  0.472 [0.455-0.504]  0.396 [0.380-0.442]  0.407 [0.385-0.439]
  -------------------------------------------------------------------------------------------------

src/cmd/ksh93/include/variables.h,
src/cmd/ksh93/sh/{init,subshell}.c:
- Rather than reseed $RANDOM every time a subshell is created, add
  a sh_save_rand_seed() function that does this only when the
  $RANDOM variable is used in a subshell. This function is called
  by the $RANDOM discipline functions nget_rand() and put_rand().
  As a minor optimization, sh_save_rand_seed doesn't reseed if it's
  called from put_rand().
- Because $RANDOM may have a seed of zero (i.e., RANDOM=0),
  sp->rand_seed isn't enough to tell if $RANDOM has been reseeded.
  Add sp->rand_state for this purpose.
- sh_subshell(): Only restore the former $RANDOM seed and state if
  it is necessary to prevent a subshell leak.

src/cmd/ksh93/tests/variables.sh:
- Add two regression tests for bugs I ran into while making this
  patch.
2021-11-19 08:18:44 +01:00
Martijn Dekker
745ffd366d sh.1: Add missing printf -v doc (re: eb760a62); more tweaks
Also add a missing 'Shell Variables' heading that is referred to
elsewhere, and capitalise the ASCII acronym.
2021-11-19 05:32:09 +01:00
Martijn Dekker
15bbc2f632 manual: use consistent terminology
The ksh manual page is one of the few places that calls globbing
"file name generation". The mksh and zsh manuals use the same term.
But every other shell's manual calls it "pathname expansion": bash,
dash, yash, FreeBSD sh. So does ksh's built-in documentation (alias
--man, export --man, readonly --man, set --man, typeset --man).
What's more, the authoritative ksh reference, Bolsky & Korn's 1995
"The New Kornshell" book, also calls it "pathname expansion", and
so does the POSIX standard.

Similarly, "arithmetic substitution" should be called "arithmetic
expansion" per Bolsky & Korn as well as POSIX.

This commit has several other miscellaneous documentation tweaks as
well.
2021-11-19 03:54:42 +01:00
Martijn Dekker
bd9752e43c Backport 'printf -v' from ksh 93v-
'printf' on bash and zsh has a popular -v option that allows
assigning formatted output directly to variables without using a
command substitution. This is much faster and avoids snags with
stripping final linefeeds. AT&T had replicated this feature in the
abandoned 93v- beta version. This backports it with a few tweaks
and one user-visible improvement.

The 93v- version prohibited specifying a variable name with an
array subscript, such as printf -v var\[3\] foo. This works fine on
bash and zsh, so I see no reason why this should not work on ksh,
as nv_putval() deals with array subscripts just fine.

src/cmd/ksh93/bltins/print.c: b_print():
- While processing the -v option when called as printf, get a
  pointer to the variable, creating it if necessary. Pass only the
  NV_VARNAME flag to enforce a valid variable name, and not (as
  93v- does) the NV_NOARRAY flag to prohibit array subscripts.
- If a variable was given, set the output file to an internal
  string buffer and jump straight to processing the format.
- After processing the format, assign the contents to the string
  buffer to the variable.

src/cmd/ksh93/data/builtins.c:
- Document the new option, adding a warning that unquoted square
  brackets may trigger pathname expansion.
2021-11-19 03:54:33 +01:00
Martijn Dekker
fb8308243c printf: fix %(pattern)q documentation in 'printf --man'
%(pattern)q is equivalent to %P. It's also equivalent to %#P, but
since the alternative format specifier '#' does nothing for %P,
%P and %#P are the same and documenting #%P is just confusing.

Thanks to @stephane-chazelas for the report.

src/cmd/ksh93/bltins/print.c:
- In the printmap struct, document %P as equivalent of %(pattern)q.
- Sort it alphabetically.
- Do not pointlessly repeat the string "Equivalent to". Instead,
  let the discipline function infof() insert it for each entry.
  (This is the function used to dynamically insert the equivalents
  documentation into the --man output at the \fextra\f tag in
  sh_optprintf[] in data/builtins.c.)

Resolves: https://github.com/ksh93/ksh/issues/338
2021-11-18 17:46:38 +01:00
Martijn Dekker
0b0d0094b9 bltins/misc.c: exec: finish cleanup (re: d8eba9d1)
An obsolete struct was left that passed some variables on between
b_exec() and the deleted B_login(). We can simply make those local
variables now. Let's get rid of the redundant sh pointer, too.
2021-11-18 04:38:46 +01:00
Martijn Dekker
1e96013367 tests/pty.sh: fix two failures due to typeahead on Debian Bullseye
As the (original AT&T) comment at the top says, "the trickiest part
of the tests is avoiding typeahead in the pty dialogue".

Two tests failed to [p]eek at the prompt before they started
'typing'. This causes unpredictable results. On Debian Bullseye
this triggers typeahead, which produces unwanted echo to the
terminal, killing the tests.

src/cmd/ksh93/tests/pty.sh:
- Add missing 'p' commands for the first prompt to the tests
  'nobackslashctrl in emacs' and 'emacs backslash escaping'.

Resolves: https://github.com/ksh93/ksh/issues/332
2021-11-17 23:05:05 +01:00
Martijn Dekker
77c7de7cc7 package: fix Bourne compat (re: 48e6dd98)
Tried to compile on Solaris 10.1 for the first time in a while.
Turns out the obsolete Bourne /bin/sh does not support 'test -e'.

bin/package, src/cmd/INIT/package.sh:
- Use 'test -f' instead.
2021-11-17 06:09:35 +01:00
Martijn Dekker
c734568b02 arithmetic: Fix the octal leading zero mess (#337)
In C/POSIX arithmetic, a leading 0 denotes an octal number, e.g.
010 == 8. But this is not a desirable feature as it can cause
problems with processing things like dates with a leading zero.
In ksh, you should use 8#10 instead ("10" with base 8).

It would be tolerable if ksh at least implemented it consistently.
But AT&T made an incredible mess of it. For anyone who is not
intimately familiar with ksh internals, it is inscrutable where
arithmetic evaluation special-cases a leading 0 and where it
doesn't. Here are just some of the surprises/inconsistencies:

1. The AT&T maintainers tried to honour a leading 0 inside of
   ((...)) and $((...)) and not for arithmetic contexts outside it,
   but even that inconsistency was never quite consistent.

2. Since 2010-12-12, $((x)) and $(($x)) are different:
      $ /bin/ksh -c 'x=010; echo $((x)) $(($x))'
      10 8
   That's a clear violation of both POSIX and the principle of
   least astonishment. $((x)) and $(($x)) should be the same in
   all cases.

3. 'let' with '-o letoctal' acts in this bizarre way:
      $ set -o letoctal; x=010; let "y1=$x" "y2=010"; echo $y1 $y2
      10 8
   That's right, 'let y=$x' is different from 'let y=010' even
   when $x contains the same string value '010'! This violates
   established shell grammar on the most basic level.

This commit introduces consistency. By default, ksh now acts like
mksh and zsh: the octal leading zero is disabled in all arithmetic
contexts equally. In POSIX mode, it is enabled equally.

The one exception is the 'let' built-in, where this can still be
controlled independently with the letoctal option as before (but,
because letoctal is synched with posix when switching that on/off,
it's consistent by default).

We're also removing the hackery that causes variable expansions for
the 'let' builtin to be quietly altered, so that 'x=010; let y=$x'
now does the same as 'let y=010' even with letoctal on.

Various files:
- Get rid of now-redundant sh.inarith (shp->inarith) flag, as we're
  no longer distinguishing between being inside or outside ((...)).

src/cmd/ksh93/sh/arith.c:
- arith(): Let disabling POSIX octal constants by skipping leading
  zeros depend on either the letoctal option being off (if we're
  running the "let" built-in") or the posix option being off.
- sh_strnum(): Preset a base of 10 for strtonll(3) depending on the
  posix or letoctal option being off, not on the sh.inarith flag.

src/cmd/ksh93/include/argnod.h,
src/cmd/ksh93/sh/args.c,
src/cmd/ksh93/sh/macro.c:
- Remove astonishing hackery that violated shell grammar for 'let'.

src/cmd/ksh93/sh/name.c (nv_getnum()),
src/cmd/ksh93/sh/nvdisc.c (nv_getn()):
- Remove loops for skipping leading zeroes that included a broken
  check for justify/zerofill attributes, thereby fixing this bug:
	$ typeset -Z x=0x15; echo $((x))
	-ksh: x15: parameter not set
  Even if this code wasn't redundant before, it is now: sh_arith()
  is called immediately after the removed code and it ignores
  leading zeroes via sh_strnum() and strtonll(3).

Resolves: https://github.com/ksh93/ksh/issues/334
2021-11-17 04:28:08 +01:00
Martijn Dekker
257eea612a edit.c: don't trace tput command on init (re: ef8b80cf)
When starting a new interactive ksh with the -v or -x option, an
annoying symptom occurs: the 'tput' command that ed_setup() issues
to get the escape sequence for cursor-up is xtraced or echoed,
corrupting prompt display, for example ('▂' is the cursor):
	$ ksh -x
	$ + /usr/bin/tput cuu1
	+ 2> /dev/null
	+ .sh.subscript=$'\E[A'
	▂
or
	$ ksh -v
	$ .sh.subscript=$(/usr/bin/tput cuu1 2>/dev/null)▂

src/cmd/ksh93/edit/edit.c: ed_setup():
- Turn off xtrace and verbose while sh_trap()ing tput.
2021-11-17 04:27:20 +01:00
Martijn Dekker
54674cb325 shcomp: refuse to write binary data to terminal
So, shcomp has messed up my terminal once too often by writing
compiled binary data to it. While fixing that I've done some other
tweaks as well.

src/cmd/ksh93/sh/shcomp.c: main():
- Fix error/warning message id (the "name:" prefix before messages)
  so it makes sense to the user. Save shcomp's argv[0] id for error
  messages that are directly from shcomp's main(), and use the
  argv[1] script id (set by sh_init()) for warnings produced by the
  compilation process. If there is no script id because we're
  reading from stdin, set it to "(stdin)".
- If no arguments are given, refuse to read from standard input if
  it's on a tty. Instead, write a brief usage message (with pointer
  to --help and --man, see e21a053e) and exit. This is far more
  helpful; people will rarely want to compile a script by manually
  typing it in. If you really want to do that, use /dev/stdin as
  the input filename. :)
- Error out if we're about to write binary data to a tty (even if
  /dev/stdout was given as the output filename).
- Turn off SH_MULTILINE to avoid some pointless editor init in case
  we're reading from stdin on a terminal.
- Do not attempt to copy remaining data if we're already at EOF.
  This fixes a bug that required the user to press Ctrl+D twice
  when manually entering a script on the terminal. Pressing Ctrl+D
  once and then entering more data would corrupt the bytecode.
2021-11-16 23:34:52 +01:00
Johnothan King
b40155fae8 Fix file descriptor leaks in the hist builtin (#336)
This commit fixes two file descriptor leaks in the hist built-in.
The bugfix for the first file descriptor leak was backported from
ksh2020. See:
https://github.com/att/ast/issues/872
https://github.com/att/ast/commit/73bd61b5

Reproducer:
  $ echo no
  $ hist -s no=yes

The second file descriptor leak occurs after a substitution error
in the hist built-in (this leak wasn't fixed in ksh2020).
Reproducer:
  $ echo no
  $ ls /proc/$$/fd
  $ hist -s no=yes
  $ hist -s no=yes
  $ ls /proc/$$/fd

src/cmd/ksh93/bltins/hist.c:
- Close leftover file descriptors when an error occurs and after
  'hist -s' runs a command.

src/cmd/ksh93/tests/builtins.sh:
- Add two regression tests for both of the file descriptor leaks.
2021-11-16 23:34:46 +01:00
Martijn Dekker
7ea95b7df3 tests: fix intermittent $RANDOM reseeding fails (re: af6a32d1)
When testing whether subshell $RANDOM reseeding worked, checking
for non-identical numbers is not sufficient. There is no check for
randomly occurring duplicate numbers, nor can there be, because
subshells cannot (or, in the case of virtual subshells, should not)
influence each other or the parent shell.

src/cmd/ksh93/tests/variables.sh:
- Try up to three times, tolerating identical numbers twice.
2021-11-15 21:16:39 +01:00
Martijn Dekker
56c2e13e92 arith: Fix variables 'nan' and 'inf' in arithmetic for POSIX mode
The --posix compliance option now disables the case-insensitive
special floating point constants Inf and NaN so that all case
variants of $((inf)) and $((nan)) refer to the variables by those
names as the standard requires. (BUG_ARITHNAN)

src/cmd/ksh93/sh/arith.c: arith():
- Only do case-insensitive checks for "Inf" and "NaN" if the POSIX
  option is off.
2021-11-15 21:16:23 +01:00