1
0
Fork 0
mirror of git://git.code.sf.net/p/cdesktopenv/code synced 2025-02-13 19:52:20 +00:00
Commit graph

918 commits

Author SHA1 Message Date
Martijn Dekker
7fb814e186 iffe: include OS standards macros in iffe tests by default
In iffe tests, some C functions are found in system libraries, but
then are not declared by the system headers on some systems because
the expected standards macros aren't defined, causing the system
headers to hide the function declarations. This may cause warnings
about invalid implicit function declarations in some tests (which
only show up if you export IFFEFLAGS=-d1), but may also cause false
negative test results. The iffe tests should be given the same
environment that their test results are going to be used in.

libast's first-run and most central feature test, 'standards',
figures out what standards macros need to be used on the current
system to get the system headers to declare the expected
functionality. All code that links to libast depends on the header
generated by this feature test. So iffe should use this result for
the tests as well, as soon as it's available (which is early in
libast's compilation cycle).

Concrete example: on Cygwin, in src/cmd/builtin/features/pty, the
'lib ptsname' test detects ptsname(3) in the system library, but
the output{...}end block that uses the _lib_ptsname feature test
result throws an 'implicit function definition' warning because
Cygwin's stdlib.h does not define this function without the
appropriate standards macros being defined first.

src/cmd/INIT/iffe.sh:
- If ${INSTALLROOT}/src/lib/libast/FEATURE/standards is available,
  incorporate it directly into iffe's own block of compiler
  massaging macros. Do not use #include as the necessary -I flags
  are not added for every test.
- Centrally define the iffe version once, not twice.
- Update and tweak getopts self-documentation (iffe --man).
- While we're here, add macOS/Darwin's DYLD_LIBRARY_PATH to the
  supported dynamic library search variables. On macOS, this is
  normally disabled by System Integrity Protection, plus this
  distribution uses static libraries at build time, so this is for
  completeness' sake and not much else. This was ported from
  @lkujaw's fork: https://github.com/lkujaw/ast/commit/48ff05442994
  (thanks to @JohnoKing for pointing it out).

src/lib/libast/features/standards:
- Add a comment. This is to update the file's timestamp, ensuring
  that everything will be recompiled after this commit.
2021-12-07 05:25:06 +01:00
Martijn Dekker
c842f46bbd tests/leaks.sh: increase tolerances again
Another day, another attempt to quash regress test fails caused by
implausibly small memory "leaks".

Apparently, on later macOS versions, 'ps' got more unreliable, so
increase tolerance from 8 to 12 bytes. Maximum reported leaks.sh
failure was 188 KiB after 16384 iterations (11 bytes/iteration).

And on some Linux versions, /proc sometimes acts weird. The
following is apparently consistent on Arch Linux:
	shcomp-leaks.ksh[177]: memory leak with read -C when using
	<<< (leaked approx 65536 bytes after 4096 iterations)
...which is 16 bytes per iteration, still not large enough to make
a real leak plausible.

Resolves: https://github.com/ksh93/ksh/issues/363
2021-12-06 10:05:36 +01:00
Martijn Dekker
17f13345dc Build & getconf fixes for macOS (and UnixWare)
src/lib/libast/features/standards:
- Add heuristic (u_long availability) for systems that hide rather
  than reveal functionality in the presence of _POSIX_SOURCE, etc.
- Define _DARWIN_C_SOURCE, like _GNU_SOURCE, to enable the full
  range of definitions on macOS systems.
- Due to the above, remove MACH (macOS)-specific hack.
- These changes ported from https://github.com/att/ast/pull/1492 -
  thanks to Lev Kujawski (@lkujaw). His PR indicates that this
  fixes the standards macros on UnixWare, too. Therefore, no longer
  exclude UnixWare from standards macros (re: ff70c27f).

src/lib/libast/comp/conf.sh:
- Promote the 'op' member in Conf_t (struct Conf_s) from short to
  int. This allows some Darwin/macOS values, now exposed, to fit
  that would otherwise be truncated, namely:
  _CS_DARWIN_USER_CACHE_DIR               65538
  _CS_DARWIN_USER_DIR                     65536
  _CS_DARWIN_USER_TEMP_DIR                65537
  Thus, the following AST getconf values are now correct on macOS:
  $ /opt/ast/bin/getconf | grep ^DARWIN
  DARWIN_USER_CACHE_DIR=/var/folders/nx/(REDACTED)/C/
  DARWIN_USER_DIR=/var/folders/nx/(REDACTED)/0/
  DARWIN_USER_TEMP_DIR=/var/folders/nx/(REDACTED)/T/

src/lib/libast/features/tty:
- Include <sys/ioctl.h> if available. This silences a compiler
  warning in src/lib/libast/misc/procopen.c about an invalid
  implicit declaration of ioctl(2).
2021-12-06 10:05:30 +01:00
Martijn Dekker
0d2d6bb47b Documentation tweaks (re: cd8c48cc)
The information about an out of memory error does not apply
to the version of the PR that was eventually committed since
it does not use getcwd() which might cause such an error.
See https://github.com/ksh93/ksh/pull/358#issuecomment-986294934
2021-12-06 06:58:34 +01:00
Johnothan King
cd8c48cc5a Add the '-e' flag to the 'cd' builtin (#358)
This change adds the -e flag to the cd builtin, as specified in
<https://www.austingroupbugs.net/view.php?id=253>. The -e flag is
used to verify if the the current working directory after 'cd -P'
successfully changes the directory, and returns with exit status 1
if the cwd couldn't be determined. Additionally, it causes all
other errors to return with exit status >1 (i.e., status 2 unless
ENOMEM occurs) if -e and -P are both active.

src/cmd/ksh93/bltins/cd_pwd.c:
- Add -e option to the cd builtin command. It verifies $PWD by
  using test_inode() to execute the equivalent of [[ . -ef $PWD ]].
- The check for restricted mode has been moved after optget to
  allow 'cd -eP' to return with exit status 2 when in restricted
  mode. To avoid changing the previous behavior of cd when -e isn't
  passed, extra checks have been added to prevent cd from printing
  usage information in restricted mode.

src/cmd/ksh93/tests/builtins.sh:
- Add regression tests for the exit status when using the cd -P
  flag with and without -e.

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/sh.1:
- Document the addition of -e to the cd builtin.
2021-12-06 06:58:11 +01:00
Martijn Dekker
a3f4b5efd1 out of memory checks: add missing sh_getcwd() wrapper (re: 7ad274f8)
getcwd() with 0/NULL arguments also mallocs, so needs a check.
2021-12-05 22:02:41 +01:00
Johnothan King
f6a6d236f1 Port over illumos fixes for conf.sh (#361)
This commit ports over two of Andy Fiddaman's bugfixes to conf.sh
on illumos:

- The compiler isn't passed on to an invocation of iffe. The bugfix is
  from this commit: <https://github.com/citrus-it/ast/commit/63563232>
- The getconf builtin is missing several parameters on illumos.
  Reproducer:
     $ /opt/ast/bin/getconf ADDRESS_WIDTH
     getconf: Invalid argument (ADDRESS_WIDTH)  # Should output '64'
  This bug occurs because conf.sh expects GNU sed and fails to work
  properly with other sed implementations. The bugfix and original bug
  report can be found here:
  https://www.illumos.org/issues/14044
  https://github.com/citrus-it/ast/commit/ba443cfd
2021-12-05 19:57:15 +01:00
Martijn Dekker
6faf43791b Disable vmalloc by default; fix build on Cygwin
Vmalloc is incompatible with Cygwin, but the code to disable it on
Cygwin did not work properly, somehow causing the build to freeze
at a seemingly unrelated point (i.e., when iffe feature tests
attempt to write to sfstdout).

Vmalloc has wasted my time for the last time, so now it's getting
disabled by default even on development builds and you'll have to
pass -D_AST_vmalloc in CCFLAGS to enable it for testing purposes.

This commit has a few other build tweaks as well.

src/lib/libast/features/vmalloc:
- tst map_malloc: Remove no-op #if __CYGWIN__ block which was in
  the #else clause of another #if __CYGWIN__ block.
- Output block ('cat{'):
  - Instead of disabling vmalloc for certain systems, disable it
    unless _AST_vmalloc is defined.
  - To disable it, set _AST_std_malloc as well as _std_malloc, just
    to be sure.

src/lib/libast/vmalloc/malloc.c:
- Remove ineffective Cygwin special-casing.

src/lib/libcmd/vmstate.c:
- This is only useful for vmalloc, so do not pointlessly compile it
  if vmalloc is disabled.

src/lib/libast/man/vmalloc.3:
- Add deprecation notice.

Resolves: https://github.com/ksh93/ksh/issues/360
2021-12-05 19:28:45 +01:00
Martijn Dekker
6e1e9b738b sh.1: correct doc on format used by ${.sh.command} (re: a959a352)
It actually uses the same format as xtrace, which is *not* always
shellquoted and eval-safe.
2021-12-05 19:28:31 +01:00
Johnothan King
0a343244c1 nvdisc.c: Fix crash after an error or signal in discipline function (#356)
This patch fixes the crashes experienced when a discipline function
exited because of a signal or an error from a special builtin. The
crashes were caused by ksh entering an inconsistent state after
performing a longjmp away from the assign() and lookup() functions in
nvdisc.c. Fixing the crash requires entering a new context, then setting
a nonlocal goto with sigsetjmp(3). Any longjmps that happen while
running the discipline function will go back to assign/lookup, allowing
ksh to do a proper cleanup afterwards.

Resolves: https://github.com/ksh93/ksh/issues/346
2021-12-05 19:28:09 +01:00
Martijn Dekker
520f530198 lex.c: add default, though it should never happen (re: b0282f26)
If a bug is ever introduced that causes a [[ ... ]] operator to be
unhandled by the linter, we should at least avoid writing random
memory contents to standard error. In non-release builds, let's
abort() so the problem can be more easily backtraced.
2021-12-05 19:27:38 +01:00
Johnothan King
6904585f49 Port illumos' shell linter improvements (#353)
This commit ports over two improvements to the shell linter from
illumos (original patch written by Andy Fiddaman). Links to the
relevant bug reports and the original patch:
https://www.illumos.org/issues/13601
https://www.illumos.org/issues/13631
c7b656fc71

The first improvement is to the lint warning for arithmetic
operators in [[ ... ]]. The ksh linter now suggests the correct
equivalent operator to use in ((...)). Example:
  $ ksh -nc '[[ 30 -gt 25 ]]'
  # Original warning
  warning: line 1: -gt within [[ ... ]] obsolete, use ((...))
  # New warning
  warning: line 1: [[ ... -gt ... ]] obsolete, use ((... > ...))

The second improvement pertains to variable expansion in arithmetic
expressions. The ksh linter now suggests referencing variable names
directly:
  $ ksh -nc 'integer foo=40; (($foo < 50 ))'
  # Old warning
  warning: line 1: variable expansion makes arithmetic evaluation less efficient
  # New warning
  warning: line 1: in '(($foo < 50))', using '$' is slower and can introduce rounding errors

src/cmd/ksh93/{data/lexstates,sh/lex,sh/parse}.c:
- Port the improved shell lint warnings from illumos to ksh93u+m.
- The original checks for arithmetic operators involved a bunch of
  if statements with inefficient calls to strcmp(3). These were
  replaced with a more efficient switch statement that avoids
  strcmp.
2021-12-05 19:27:16 +01:00
Johnothan King
370440473e Fix KEYBD trap crash when inputting a command substitution (#355)
This change fixes a crash that can occur after setting a KEYBD trap
then inputting a multi-line command substitution. The crash is
similar to issue #347, but it's easier to reproduce since it
doesn't require you to setup a kshrc file. Reproducer for the
crash:

  $ ENV=/./dev/null ksh
  $ trap : KEYBD
  $ : $(
  > true)
  Memory fault(coredump)

The bugfix was backported (with considerable changes) from ksh93v-
2013-10-08. The crash was first reported on the old mailing list:
https://www.mail-archive.com/ast-users@lists.research.att.com/msg00313.html

src/cmd/ksh93/{include/shlex.h,sh/lex.c}:
- To fix this properly, we need sizeof(Lex_t) to work as expected
  in edit.c, but that is thwarted by the _SHLEX_PRIVATE macro in
  lex.c which shlex.h uses to add private structs to the Lex_t type
  in lex.c only. So get rid of that _SHLEX_PRIVATE macro and make
  those members part of the centrally defined struct, renaming them
  to make it clear they're considered private to lex.c.

src/cmd/ksh93/edit/edit.c:
- Now that we can get its size, save and restore the shell lexing
  context when a KEYBD trap is present.

src/cmd/ksh93/tests/pty.sh:
- Add a regression test for the KEYBD trap crash.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2021-11-30 04:27:31 +01:00
Johnothan King
bfad44e56d Fix memory fault on ARM-based Macs (#354)
Ksh segfaults on M1 Macs, apparently because Apple's compiler
optimizer can't cope with overriding bzero(3) with libast's
implementation (though it's nothing more than "memset(b, 0, n);").

Apple has disabled libast's bzero function since 2018-12-04:
https://opensource.apple.com/source/ksh/ksh-27/patches/src__lib__libast__comp__omitted.c.diff.auto.html

src/lib/libast/comp/omitted.c:
- Only fall back to the libast bzero function if the OS lacks an
  implementation of bzero.
- Remove the bzero undef since this fallback is only reached if the
  OS doesn't have bzero.

NEWS:
- Add compat info for macOS on ARM64. This notes that macOS
  Monterey can now compile and run ksh, although there is one
  regression test failure:
        builtins.sh[345]: printf %H: invalid UTF-8 characters
        (expected %3F%C2%86%3F%3F%3F; got %3F%C2%86%3F%3Fv%3F%3F)

Thanks to @DesantBucie for the report and the testing.

Resolves: https://github.com/ksh93/ksh/issues/329
2021-11-29 20:12:57 +01:00
Johnothan King
a0eeb14787 Stop the time keyword overriding errexit (#351)
This bug was first reported in <https://www.illumos.org/issues/7694>.
The time keyword currently overrides the errexit shell option,
allowing failing scripts to continue after an error:

  $ cat 1.sh
  #!/bin/sh
  time false   # This should cause the script to exit
  echo FAILURE
  true
  $ ksh -o errexit 1.sh

  real    0m0.00s
  user    0m0.00s
  sys     0m0.00s
  FAILURE

src/cmd/ksh93/sh/xec.c:
- When the time keyword runs a command, pass the errexit state flag
  to the sh_exec call. This state flag is required for ksh to exit
  when a command fails while the errexit option is on.

src/cmd/ksh93/tests/basic.sh:
- Add a regression test based on the reproducer.
2021-11-29 20:12:15 +01:00
Martijn Dekker
f508660ddf Revert "Fix defining types conditionally and/or in subshells (re: 8ced1daa)"
This reverts commit 2b9cbbbc8e.

This is not ready for prime time. Crashses when running a $PS2
discipline function. This needs fixing and more testing in
development before making it into the 1.0 branch. In the meantime,
that terrible problem with types is back, sorry about that.
2021-11-29 20:08:53 +01:00
Martijn Dekker
2b9cbbbc8e Fix defining types conditionally and/or in subshells (re: 8ced1daa)
This commit mitigates the effects of the hack explained in the
referenced commit so that dummy built-in command nodes added by the
parser for declaration/assignment purposes do not leak out into the
execution level, except in a relatively harmless corner case.

Something like

	if false; then
		typeset -T Foo_t=(integer -i bar)
	fi

will no longer leave a broken dummy Foo_t declaration command. The
same applies to declaration commands created with enum.

The corner case remaining is:

$ ksh -c 'false && enum E_t=(a b c); E_t -a x=(b b a c)'
ksh: E_t: not found

Since the 'enum' command is not executed, this should have thrown
a syntax error on the 'E_t -a' declaration:
ksh: syntax error at line 1: `(' unexpected

This is because the -c script is parsed entirely before being
executed, so E_t is recognised as a declaration built-in at parse
time. However, the 'not found' error shows that it was successfully
eliminated at execution time, so the inconsistent state will no
longer persist.

This fix now allows another fix to be effective as well: since
built-ins do not know about virtual subshells, fork a virtual
subshell into a real subshell before adding any built-ins.

src/cmd/ksh93/sh/parse.c:

- Add a pair of functions, dcl_hactivate() and dcl_dehacktivate(),
  that (de)activate an internal declaration built-ins tree into
  which check_typedef() can pre-add dummy type declaration command
  nodes. A viewpath from the main built-ins tree to this internal
  tree is added, unifying the two for search purposes and causing
  new nodes to be added to the internal tree. When parsing is done,
  we close that viewpath. This hides those pre-added nodes at
  execution time. Since the parser is sometimes called recursively
  (e.g. for command substitutions), keep track of this and only
  activate and deactivate at the first level.

- We also need to catch errors. This is done by setting libast's
  error_info.exit variable to a dcl_exit() function that tidies up
  and then passes control to the original (usually sh_exit()).

- sh_cmd(): This is the most central function in the parser. You'd
  think it was sh_parse(), but $(modern)-form command substitutions
  use sh_dolparen() instead. Both call sh_cmd(). So let's simply
  add a dcl_hacktivate() call at the beginning and a
  dcl_deactivate() call at the end.

- assign(): This function calls path_search(), which among many
  other things executes an FATH search, which may execute arbitrary
  code at parse time (!!!). So, regardless of recursion level,
  forcibly dehacktivate() to avoid those ugly parser side effects
  returning in that context.

src/cmd/ksh93/bltins/enum.c: b_enum():

- Fork a virtual subshell before adding a built-in.

src/cmd/ksh93/sh/xec.c: sh_exec():

- Fork a virtual subshell when detecting typeset's -T option.

Improves fix to https://github.com/ksh93/ksh/issues/256
2021-11-29 09:02:07 +01:00
Martijn Dekker
43cd8da2fe Fix 'command' prefix in enum type def pre-parsing (re: 1dc18346)
Symptom:

$ ksh -c 'command enum -i P_t=(a b); P_t -A v=([f]=b); typeset -p v'
ksh: syntax error at line 1: `(' unexpected

Expected: no syntax error, and output of 'P_t -A v=([f]=b)'.

src/cmd/ksh93/sh/parse.c: check_typedef():
- For enum, skip over any possible 'command' prefixes before
  pre-parsing options with optget (or, technically, skip anything
  else that might come before 'enum', though I don't think anything
  else is possible).
- The sh_addbuiltin() call at the end to pre-add the builtin
  obtained the node pointer to the built-in and the node flags from
  the parser tree. This did not work if a 'command' prefix was
  present. However, we don't actually need this. For parsing
  purposes, the BLT_DCL flag for a declaration built-in is
  sufficient; this is what gets the parser to accept
  assignment-arguments including parentheses. So just apply that.
  In addition, let's point it to an actual dummy built-in, 'true'
  (SYSTRUE), so that if a user does run something like 'if false;
  then enum Foo_t=(...); fi', the leaked Foo_t dummy at least won't
  do anything (not even crash).
2021-11-28 21:16:19 +01:00
Martijn Dekker
c9ca0ff531 typeset equivalents: use 'typeset' in error messages (re: 1fbbeaa1)
When giving an invalid or incompatible option to a typeset option
equivalent command (former default alias) such as 'compound' or
'integer', the resulting usage messages are incorrect. Example:

$ ksh -c 'compound -T foo=(typeset -a bar[1]=23)'
ksh: compound: -T cannot be used with other options
Usage: compound [-bflmnprstuxACHS] [-a[[type]]] [-i[base]] [-E[n]]
                [-F[n]] [-L[n]] [-M[mapping]] [-R[n]] [-X[n]]
                [-h string] [-T[tname]] [-Z[n]] [name[=value]...]
   Or: compound -f [name...]
   Or: compound -m [name=name...]
   Or: compound -n [name=name...]
   Or: compound -T [tname[=(type definition)]...]
 Help: compound [ --help | --man ] 2>&1

The error message is wrong (there were no other options) and some
of the listed usages are invalid, like 'compound -f'.

Typeset option equivalent commands should just use 'typeset' in all
their error messages to avoid confusion. This is done by setting
error_info.id to the name of the typeset builtin.
2021-11-28 21:16:17 +01:00
Martijn Dekker
7318afc278 jobs.c: refactor SIGHUP handling; document bug fixed (re: 62cf88d0)
There is quite a bit of no-op code in the job_hup() function due
to conditions that always test false. This commit removes that code
and clarifies the rest, making the purpose of this function clear.

job_hup() (before 62cf88d0: job_terminate()) is called via
job_walk() by sh_done() in fault.c to issue SIGHUP, the "hang up"
signal, to every background job's process group when the current
session is ungracefully disconnected. (One way to trigger such a
disconnection is to forcibly terminate a ssh session by typing '~.'
on a new prompt.)

The bug that Solaris patch 260-22964338 fixed is that ksh then
killed all non-disowned jobs' process groups without considering
that ksh still remembers a job even when all its processes are
finished (have the P_DONE flag). In that condition, the process
group ID may well be reused by another process by now, so it is
dangerous to killpg() it; we risk killing unrelated processes!

This is *not* a hypothetical problem; the Solaris patch exists
because this happened to a Solaris customer. However, the bug
exists on all operating systems. It's rarely triggered but serious,
and it's more likely to occur on heavy workloads that re-use
process/group IDs a lot. And it's on every currently released
non-Solaris version of ksh93. Eesh.

src/cmd/ksh93/sh/jobs.c:
src/cmd/ksh93/include/jobs.h:
- Remove job_terminate() which was unused as of 62cf88d0.
  It could have been fixed instead of replaced. Oh well.
- Refactor job_hup():
  - Remove code that will never be executed because, at
    those points, it is known that pw->p_pgrp != 0.
  - Simplify the loop that checks that there is at least
    one non-P_DONE process so it doesn't need a flag.

For documentation purposes, below is a reproducer for the bug
before the Solaris patch. It is rather involved.

1. Compile the C program below (cpid).
2. In one terminal, 'ssh localhost'.
3. Within the ssh session:
   - 'exec -a-ksh /path/to/buggy/ksh' to get a ksh login shell.
   - 'sleep 1 &' and let it finish. Note down the reported PID.
     That is the one we will reuse. Let's say 26650.
4. In another terminal, run: ./cpid 26650 (the PID from the
   previous step). Now wait until it says "PID 26650 is ready"; it
   has now succeeded at re-using that PID, and will just sit there.
   This process will never voluntarily terminate. If we have the
   bug, the termination of this process will be the symptom.
5. In the first terminal, forcibly terminate the ssh session by
   typing, on a new prompt: ~. (tilde, dot). This triggers the
   buggy routine to issue SIGHUP to all of ksh's background jobs.
6. In the second terminal, the bug is reproduced if cpid has been
   terminated, reporting 'waitpid return 26650, status 0x0001', so
   ksh just killed this process that it had nothing to do with.
   (Note that status 0x0001 refers to being killed by signal 1
   which is SIGHUP.)

cpid.c follows (written by George Lijo, tweaked by me):

 #include <stdio.h>
 #include <stdlib.h>
 #include <unistd.h>
 #include <signal.h>
 #include <sys/wait.h>

 int main(int argc, char *argv[])
 {
	pid_t	pid, rpid, opid;
	int	i, status, npid;
	if (argc != 2)
	{
		fprintf(stderr, "Usage: cpid <PID to re-use>\n");
		exit(1);
	}
	rpid = atoi(argv[1]);
	opid = getpid();
	for (;;)
	{
		if ((pid = fork()) == 0)
		{
			setpgrp();
			pause();
			_exit(0);
		}
		if (pid == rpid)
			break;
		kill(pid, SIGKILL);
		waitpid(pid, NULL, 0);
		if (opid < rpid && pid > rpid)
			printf("Cannot create PID %d\n", rpid);
		opid = pid;
	}
	printf("PID %d is ready\n", pid);
	i = waitpid(pid, &status, 0);
	printf("waitpid return %d, status 0x%4.4x\n", i, status);
	return status;
 }
2021-11-25 19:29:17 +01:00
Martijn Dekker
f3433a696a Reset sh.arithrecursion in sh_exit() instead (re: d50d3d7c)
Since the arithmetic recursion level only becomes incorrect when
an error interrupts the arithmetic subsystem, and all such error
messages call sh_exit(), it should be good enough to reset it
there, so we don't need to do that for nearly every sh_exec() run.
2021-11-25 10:26:09 +01:00
Martijn Dekker
27ccdd2517 Fix parentheses in sh_{push,pop}context macros
The lack of parentheses around the shp parameter expansion made
it impossible to pass something like &sh as the first parameter.
2021-11-25 04:11:41 +01:00
Johnothan King
84ded2d0c4 Backport the ksh93v- rm builtin to fix 'rm -d' (#348)
The -d flag implemented in the rm builtin is completely broken. No
matter what you do it refuses to remove directories, even if -r is
also passed. Reproducer:

  $ mkdir /tmp/empty
  $ PATH=/opt/ast/bin rm -d /tmp/empty
  rm: /tmp/empty: directory
  $ PATH=/opt/ast/bin rm -dr /tmp/empty
  rm: /tmp/empty: directory not removed [Is a directory]

Additionally, the description of 'rm -d' in the man page contradicts
how it's specified in <https://www.austingroupbugs.net/view.php?id=802>.

The ksh93v- rm builtin fixed nearly all of these issues, so I've
backported it to 93u+m and applied one additional fix for 'rm -rd'.

src/lib/libcmd/rm.c:
- Backported the fixes from the ksh93v- rm builtin's -d flag when
  used on empty directories.
- Backported the man page update for rm(1) from ksh93v-.
- The ksh93v- rm builtin had one additional bug that caused the -r
  option to fail when combined with -d. This was fixed by
  overriding -d if -r is also passed.

src/cmd/ksh93/tests/builtins.sh:
- Add regression tests for the rm builtin's -d option.
2021-11-25 03:52:05 +01:00
Martijn Dekker
2d65148fad arith.c: scope(): de-obfuscate some code
This function adds the NV_ADD flag to its 'flags' variable for
nv_serach() calls subject to some checks. However, every call that
uses that variable explicitly turns off the NV_ADD bit again.

A search in the ast-open-history repo reveals that this check
briefly made a difference between versions 2010-06-25 and
2010-08-11, but it's been a complete no-op ever since.

src/cmd/ksh93/sh/arith.c: scope():
- Remove no-op code.
- Resolve the constant expressions involving the 'flags' variable,
  get rid of the variable, and just indicate the flag bitmasks
  directly in the nv_search() calls.
- Detangle and split up the excessively long 'if' construct.

No change in behaviour.

Previously noticed by Kurtis Rader for ksh2020:
https://github.com/att/ast/commit/d5ce3b05
2021-11-25 03:25:39 +01:00
Martijn Dekker
214308f81e '.': disable ksh function lookup in POSIX mode
POSIXly, '.' loads only files, not functions.

This only applies to '.', not 'source' (which is not in POSIX).

src/cmd/ksh93/bltins/misc.c: b_source():
- For ksh function lookup, add an additional check that we're not
  in POSIX mode and running the '.' (SYSDOT) builtin.
2021-11-24 09:12:39 +01:00
Martijn Dekker
c0334e32a1 [1.0 release prep] Remove tilde expansion discipline
Defining a .sh.tilde.get or .sh.tilde.set discipline function to
extend tilde expansion works well as long as the discipline
function doesn't get interrupted (e.g. with Crtl+C) or produce an
error message. Either of those will cause the shell to become
unstable and crash.

This feature is now removed from the 1.0 branch as it is not ready
for prime time. It can return to a release branch if/when we manage
to fix it on the master branch.

Related: https://github.com/ksh93/ksh/issues/346
2021-11-24 07:46:58 +01:00
Martijn Dekker
40de1e92b0 [1.0 release prep] Block namespace defs in ksh functions
In 93v-/ksh2020, namespace defs in any function are a syntax error.
This commit blocks namespace defs for ksh functions only, at the
execution level. This follows some of AT&T original intention while
working around some of the known bugs with namespaces.

Related: https://github.com/ksh93/ksh/issues/325
2021-11-24 07:31:22 +01:00
Martijn Dekker
e3d91ffa90 nv_associative(): finally use proper check for enum (re: b98e32fc)
As of the previous commit, I finally know how to properly check for
a variable of a type created by 'enum'. We need to check for both
the NV_UINT16 attribute and the ENUM_disc discipline.

Also:
- regression test tweaks
- add missing tests for previous commit (f600a5ea)
2021-11-24 02:06:08 +01:00
Martijn Dekker
a66cd72f7d arith: implement range checking for enum types
Within arithmetic expressions, enumeration values of variables of a
type created with the 'enum' command translate to index numbers
from 0 to the number of elements minus 1. However, there was no
range checking on this in the arithmetic subsystem, allowing the
assignment of out-of-range values that did not correspond to any
enumeration value.

Variables of an enum type are internally unsigned short integers
(NV_UINT16), like those created with 'integer -su', except with an
additional discipline function (ENUM_disc).

src/cmd/ksh93/bltins/enum.c,
src/cmd/ksh93/include/builtins.h:
- To implement range checking, the arithmetic system needs access
  to the 'nelem' (number of elements) member of 'struct Enum'. This
  is only defined locally in enum.c. We could move that to name.h
  so arith.c can access it, but enum.c has code that supports
  compiling as standalone. So, instead, define a quick extern
  function, b_enum_elem(), that does the necessary type conversion
  and returns a type's number of elements.
- Add --man documentation for the arithmetic subsystem behaviour
  for enum types. Tell the enuminfo() function, which dynamically
  inserts values into the documentation, how to process new \f tags
  'lastv' (the last-defined value) and 'lastn' (the number of the
  last element).

src/cmd/ksh93/sh/arith.c: arith():
- For NV_UINT16 variables with an ENUM_disc discipline, check the
  range using b_enum_elem() and error out if necessary.

Resolves: https://github.com/ksh93/ksh/issues/335
2021-11-23 22:10:40 +01:00
Johnothan King
e26937b36a Add support for 'stty size' to the libcmd 'stty' builtin (#342)
This commit adds support for 'stty size' to the stty builtin, as
defined in <https://austingroupbugs.net/view.php?id=1053>. The size
mode is used to display the terminal's number of rows and columns.
Note that stty isn't included in the default list of builtin
commands; testing this addition requires adding CMDLIST(stty) to
the table of builtins in src/cmd/ksh93/data/builtins.c.

src/lib/libcmd/stty.c:
- Add support for the size mode to the stty builtin. This mode is
  only used to display the terminal's number of rows and columns,
  so error out if any arguments are given that attempt to set the
  terminal size.
2021-11-23 15:38:14 +01:00
Martijn Dekker
10ef74e1a2 shtests: unignore SIGCONT
For some reason, Void Linux (with musl libc) sets SIGCONT to
ignored on the Linux console, causing the 'sleep -s' test in
builtins.sh to fail spuriously as it relies on SIGCONT to work.

src/cmd/ksh93/tests/shtests:
- Reset SIGCONT using the unadvertised 'trap + SIGCONT' feature.

Resolves: https://github.com/ksh93/ksh/issues/301
2021-11-22 16:55:51 +01:00
Martijn Dekker
74730c8ac7 test/[: Improve error status > 1 (re: 7003aba4, cd2cf236, ef1f53b5)
As I got to know the code better, it now seems painfully obvious
that getting test/[ to issue an exit status >= 2 on error only
requires a simple check in sh_exit() in fault.c, which is called
whenever the shell issues an error message.
2021-11-22 15:37:04 +01:00
Martijn Dekker
8ced1daadf Fix enum type definition pre-parsing for shcomp and dot/source
Parser limitations prevent shcomp or source from handling enum
types correctly:

    $ cat /tmp/colors.sh
    enum Color_t=(red green blue orange yellow)
    Color_t -A Colors=([foo]=red)

    $ shcomp /tmp/colors.sh > /dev/null
    /tmp/colors.sh: syntax error at line 2: `(' unexpected
    $ source /tmp/colors.sh
    /bin/ksh: source: syntax error: `(' unexpected

Yet, for types created using 'typeset -T', this works. This is done
via a check_typedef() function that preliminarily adds the special
declaration builtin at parse time, with details to be filled in
later at execution time.

This hack will produce ugly undefined behaviour if the definition
command creating that type built-in is then not actually run at
execution time before the type built-in is accessed.

But the hack is necessary because we're dealing with a fundamental
design flaw in the ksh language. Dynamically addable built-ins that
change the syntactic parsing of the shell language on the fly are
an absurdity that violates the separation between parsing and
execution, which muddies the waters and creates the need for some
kind of ugly hack to keep things like shcomp more or less working.

This commit extends that hack to support enum.

src/cmd/ksh93/sh/parse.c:
- check_typedef():
  - Add 'intypeset' parameter that should be set to 1 for typeset
    and friends, 2 for enum.
  - When processing enum arguments, use AST getopt(3) to skip over
    enum's options to find the name of the type to be defined.
    (getopt failed if we were running a -c script; deal with this
    by zeroing opt_info.index first.)
- item(): Update check_typedef() call, passing lexp->intypeset.
- simple(): Set lexp->intypeset to 2 when processing enum.

The rest of the changes are all to support the above and should be
fairly obvious, except:

src/cmd/ksh93/bltins/enum.c:
- enuminfo(): Return on null pointer, avoiding a crash upon
  executing 'Type_t --man' if Type_t has not been fully defined due
  to the definition being pre-added at parse time but not executed.
  It's all still wrong, but a crash is worse.

Resolves: https://github.com/ksh93/ksh/issues/256
2021-11-21 17:43:55 +01:00
Martijn Dekker
996def3141 builtins.h: rm broken check for removed SYSDECLARE (re: 921bbcae) 2021-11-21 17:43:23 +01:00
Martijn Dekker
893c6a9068 nv_associative(): clarify value indicating enum (re: 6b9703ff) 2021-11-21 17:43:14 +01:00
Johnothan King
e554a07c56 typeset -T shouldn't list types created with enum (#340)
Listing types with 'typeset -T' will list not only types created with
typeset, but also types created with enum. However, the types created
by enum are not displayed correctly in the resulting output:

$ enum Foo_t=(foo bar)
$ typeset -T
typeset -T Foo_t
typeset -T Foo_t=fo)

The fix for this bug was backported from ksh93v- 2013-10-08.

src/cmd/ksh93/sh/nvtype.c:
- sh_outtype(): Skip over enums when listing types with 'typeset -T'.
2021-11-20 09:48:48 +01:00
Martijn Dekker
cb961788a8 shell.3: fix formatting for sh_{g,s}etscope 2021-11-20 04:53:42 +01:00
Martijn Dekker
98ea0c2dbb tests/signal.sh: fix AT&T's err_exit bogosity (re: 712261c8) 2021-11-20 03:31:10 +01:00
Martijn Dekker
6829fc9a29 tests/leaks.sh: tweak Linux tolerance again (re: 31fe1c28)
The referenced commit did not fix the symptoms on the 1.0 branch
(no vmalloc) on the GitHub CI runners.

The failures are intermittent and are not reproduced with vmalloc
or on other operating systems.

Though the failures occur on a different test each time, the total
amount of "leaked" bytes is always 36864, e.g.:

    leaks.sh[388]: run command with preceding PATH assignment in
    main shell (leaked approx 36864 bytes after 4096 iterations)

36864/4096 equals exactly 9. An odd number, literally and
figuratively, but I suppose that's the tolerance Linux needs.

src/cmd/ksh93/tests/leaks.sh
- Increase tolerance of bytes per iteration from 8 to 9.
2021-11-19 20:21:25 +01:00
Johnothan King
396b388e1f Fix a few issues with $RANDOM seeding in subshells (#339)
This commit fixes an issue I found in the subshell $RANDOM
reseeding code.

The main issue is a performance regression in the shbench fibonacci
benchmark, introduced in commit af6a32d1. Performance dropped in
this benchmark because $RANDOM is always reseeded and restored,
even when it's never used in a subshell. Performance results from
before and after this performance fix (results are on Linux with
CC=gcc and CCFLAGS='-O2 -D_std_malloc'):

  $ ./shbench -b bench/fibonacci.ksh -l 100 ./ksh-0f06a2e ./ksh-af6a32d ./ksh-f31e368 ./ksh-randfix

  benchmarking ./ksh-0f06a2e, ./ksh-af6a32d, ./ksh-f31e368, ./ksh-randfix ...
  *** fibonacci.ksh ***
  # ./ksh-0f06a2e  # Recent version of ksh93u+m
  # ./ksh-af6a32d  # Commit that introduced the regression
  # ./ksh-f31e368  # Commit without the regression
  # ./ksh-randfix  # Ksh93u+m with this patch applied

  -------------------------------------------------------------------------------------------------
  name           ./ksh-0f06a2e        ./ksh-af6a32d        ./ksh-f31e368        ./ksh-randfix
  -------------------------------------------------------------------------------------------------
  fibonacci.ksh  0.481 [0.459-0.515]  0.472 [0.455-0.504]  0.396 [0.380-0.442]  0.407 [0.385-0.439]
  -------------------------------------------------------------------------------------------------

src/cmd/ksh93/include/variables.h,
src/cmd/ksh93/sh/{init,subshell}.c:
- Rather than reseed $RANDOM every time a subshell is created, add
  a sh_save_rand_seed() function that does this only when the
  $RANDOM variable is used in a subshell. This function is called
  by the $RANDOM discipline functions nget_rand() and put_rand().
  As a minor optimization, sh_save_rand_seed doesn't reseed if it's
  called from put_rand().
- Because $RANDOM may have a seed of zero (i.e., RANDOM=0),
  sp->rand_seed isn't enough to tell if $RANDOM has been reseeded.
  Add sp->rand_state for this purpose.
- sh_subshell(): Only restore the former $RANDOM seed and state if
  it is necessary to prevent a subshell leak.

src/cmd/ksh93/tests/variables.sh:
- Add two regression tests for bugs I ran into while making this
  patch.
2021-11-19 08:18:44 +01:00
Martijn Dekker
745ffd366d sh.1: Add missing printf -v doc (re: eb760a62); more tweaks
Also add a missing 'Shell Variables' heading that is referred to
elsewhere, and capitalise the ASCII acronym.
2021-11-19 05:32:09 +01:00
Martijn Dekker
15bbc2f632 manual: use consistent terminology
The ksh manual page is one of the few places that calls globbing
"file name generation". The mksh and zsh manuals use the same term.
But every other shell's manual calls it "pathname expansion": bash,
dash, yash, FreeBSD sh. So does ksh's built-in documentation (alias
--man, export --man, readonly --man, set --man, typeset --man).
What's more, the authoritative ksh reference, Bolsky & Korn's 1995
"The New Kornshell" book, also calls it "pathname expansion", and
so does the POSIX standard.

Similarly, "arithmetic substitution" should be called "arithmetic
expansion" per Bolsky & Korn as well as POSIX.

This commit has several other miscellaneous documentation tweaks as
well.
2021-11-19 03:54:42 +01:00
Martijn Dekker
bd9752e43c Backport 'printf -v' from ksh 93v-
'printf' on bash and zsh has a popular -v option that allows
assigning formatted output directly to variables without using a
command substitution. This is much faster and avoids snags with
stripping final linefeeds. AT&T had replicated this feature in the
abandoned 93v- beta version. This backports it with a few tweaks
and one user-visible improvement.

The 93v- version prohibited specifying a variable name with an
array subscript, such as printf -v var\[3\] foo. This works fine on
bash and zsh, so I see no reason why this should not work on ksh,
as nv_putval() deals with array subscripts just fine.

src/cmd/ksh93/bltins/print.c: b_print():
- While processing the -v option when called as printf, get a
  pointer to the variable, creating it if necessary. Pass only the
  NV_VARNAME flag to enforce a valid variable name, and not (as
  93v- does) the NV_NOARRAY flag to prohibit array subscripts.
- If a variable was given, set the output file to an internal
  string buffer and jump straight to processing the format.
- After processing the format, assign the contents to the string
  buffer to the variable.

src/cmd/ksh93/data/builtins.c:
- Document the new option, adding a warning that unquoted square
  brackets may trigger pathname expansion.
2021-11-19 03:54:33 +01:00
Martijn Dekker
fb8308243c printf: fix %(pattern)q documentation in 'printf --man'
%(pattern)q is equivalent to %P. It's also equivalent to %#P, but
since the alternative format specifier '#' does nothing for %P,
%P and %#P are the same and documenting #%P is just confusing.

Thanks to @stephane-chazelas for the report.

src/cmd/ksh93/bltins/print.c:
- In the printmap struct, document %P as equivalent of %(pattern)q.
- Sort it alphabetically.
- Do not pointlessly repeat the string "Equivalent to". Instead,
  let the discipline function infof() insert it for each entry.
  (This is the function used to dynamically insert the equivalents
  documentation into the --man output at the \fextra\f tag in
  sh_optprintf[] in data/builtins.c.)

Resolves: https://github.com/ksh93/ksh/issues/338
2021-11-18 17:46:38 +01:00
Martijn Dekker
0b0d0094b9 bltins/misc.c: exec: finish cleanup (re: d8eba9d1)
An obsolete struct was left that passed some variables on between
b_exec() and the deleted B_login(). We can simply make those local
variables now. Let's get rid of the redundant sh pointer, too.
2021-11-18 04:38:46 +01:00
Martijn Dekker
1e96013367 tests/pty.sh: fix two failures due to typeahead on Debian Bullseye
As the (original AT&T) comment at the top says, "the trickiest part
of the tests is avoiding typeahead in the pty dialogue".

Two tests failed to [p]eek at the prompt before they started
'typing'. This causes unpredictable results. On Debian Bullseye
this triggers typeahead, which produces unwanted echo to the
terminal, killing the tests.

src/cmd/ksh93/tests/pty.sh:
- Add missing 'p' commands for the first prompt to the tests
  'nobackslashctrl in emacs' and 'emacs backslash escaping'.

Resolves: https://github.com/ksh93/ksh/issues/332
2021-11-17 23:05:05 +01:00
Martijn Dekker
77c7de7cc7 package: fix Bourne compat (re: 48e6dd98)
Tried to compile on Solaris 10.1 for the first time in a while.
Turns out the obsolete Bourne /bin/sh does not support 'test -e'.

bin/package, src/cmd/INIT/package.sh:
- Use 'test -f' instead.
2021-11-17 06:09:35 +01:00
Martijn Dekker
c734568b02 arithmetic: Fix the octal leading zero mess (#337)
In C/POSIX arithmetic, a leading 0 denotes an octal number, e.g.
010 == 8. But this is not a desirable feature as it can cause
problems with processing things like dates with a leading zero.
In ksh, you should use 8#10 instead ("10" with base 8).

It would be tolerable if ksh at least implemented it consistently.
But AT&T made an incredible mess of it. For anyone who is not
intimately familiar with ksh internals, it is inscrutable where
arithmetic evaluation special-cases a leading 0 and where it
doesn't. Here are just some of the surprises/inconsistencies:

1. The AT&T maintainers tried to honour a leading 0 inside of
   ((...)) and $((...)) and not for arithmetic contexts outside it,
   but even that inconsistency was never quite consistent.

2. Since 2010-12-12, $((x)) and $(($x)) are different:
      $ /bin/ksh -c 'x=010; echo $((x)) $(($x))'
      10 8
   That's a clear violation of both POSIX and the principle of
   least astonishment. $((x)) and $(($x)) should be the same in
   all cases.

3. 'let' with '-o letoctal' acts in this bizarre way:
      $ set -o letoctal; x=010; let "y1=$x" "y2=010"; echo $y1 $y2
      10 8
   That's right, 'let y=$x' is different from 'let y=010' even
   when $x contains the same string value '010'! This violates
   established shell grammar on the most basic level.

This commit introduces consistency. By default, ksh now acts like
mksh and zsh: the octal leading zero is disabled in all arithmetic
contexts equally. In POSIX mode, it is enabled equally.

The one exception is the 'let' built-in, where this can still be
controlled independently with the letoctal option as before (but,
because letoctal is synched with posix when switching that on/off,
it's consistent by default).

We're also removing the hackery that causes variable expansions for
the 'let' builtin to be quietly altered, so that 'x=010; let y=$x'
now does the same as 'let y=010' even with letoctal on.

Various files:
- Get rid of now-redundant sh.inarith (shp->inarith) flag, as we're
  no longer distinguishing between being inside or outside ((...)).

src/cmd/ksh93/sh/arith.c:
- arith(): Let disabling POSIX octal constants by skipping leading
  zeros depend on either the letoctal option being off (if we're
  running the "let" built-in") or the posix option being off.
- sh_strnum(): Preset a base of 10 for strtonll(3) depending on the
  posix or letoctal option being off, not on the sh.inarith flag.

src/cmd/ksh93/include/argnod.h,
src/cmd/ksh93/sh/args.c,
src/cmd/ksh93/sh/macro.c:
- Remove astonishing hackery that violated shell grammar for 'let'.

src/cmd/ksh93/sh/name.c (nv_getnum()),
src/cmd/ksh93/sh/nvdisc.c (nv_getn()):
- Remove loops for skipping leading zeroes that included a broken
  check for justify/zerofill attributes, thereby fixing this bug:
	$ typeset -Z x=0x15; echo $((x))
	-ksh: x15: parameter not set
  Even if this code wasn't redundant before, it is now: sh_arith()
  is called immediately after the removed code and it ignores
  leading zeroes via sh_strnum() and strtonll(3).

Resolves: https://github.com/ksh93/ksh/issues/334
2021-11-17 04:28:08 +01:00
Martijn Dekker
257eea612a edit.c: don't trace tput command on init (re: ef8b80cf)
When starting a new interactive ksh with the -v or -x option, an
annoying symptom occurs: the 'tput' command that ed_setup() issues
to get the escape sequence for cursor-up is xtraced or echoed,
corrupting prompt display, for example ('▂' is the cursor):
	$ ksh -x
	$ + /usr/bin/tput cuu1
	+ 2> /dev/null
	+ .sh.subscript=$'\E[A'
	▂
or
	$ ksh -v
	$ .sh.subscript=$(/usr/bin/tput cuu1 2>/dev/null)▂

src/cmd/ksh93/edit/edit.c: ed_setup():
- Turn off xtrace and verbose while sh_trap()ing tput.
2021-11-17 04:27:20 +01:00
Martijn Dekker
54674cb325 shcomp: refuse to write binary data to terminal
So, shcomp has messed up my terminal once too often by writing
compiled binary data to it. While fixing that I've done some other
tweaks as well.

src/cmd/ksh93/sh/shcomp.c: main():
- Fix error/warning message id (the "name:" prefix before messages)
  so it makes sense to the user. Save shcomp's argv[0] id for error
  messages that are directly from shcomp's main(), and use the
  argv[1] script id (set by sh_init()) for warnings produced by the
  compilation process. If there is no script id because we're
  reading from stdin, set it to "(stdin)".
- If no arguments are given, refuse to read from standard input if
  it's on a tty. Instead, write a brief usage message (with pointer
  to --help and --man, see e21a053e) and exit. This is far more
  helpful; people will rarely want to compile a script by manually
  typing it in. If you really want to do that, use /dev/stdin as
  the input filename. :)
- Error out if we're about to write binary data to a tty (even if
  /dev/stdout was given as the output filename).
- Turn off SH_MULTILINE to avoid some pointless editor init in case
  we're reading from stdin on a terminal.
- Do not attempt to copy remaining data if we're already at EOF.
  This fixes a bug that required the user to press Ctrl+D twice
  when manually entering a script on the terminal. Pressing Ctrl+D
  once and then entering more data would corrupt the bytecode.
2021-11-16 23:34:52 +01:00