1
0
Fork 0
mirror of git://git.code.sf.net/p/cdesktopenv/code synced 2025-03-09 15:50:02 +00:00
Commit graph

109 commits

Author SHA1 Message Date
Martijn Dekker
b3050769ea Fix 'return' emitting signals; allow arbitrary return values
When a global EXIT trap is set, and a ksh-style function exits with
a status > 256 that could have been the result of a signal, then
the shell incorrectly issues that signal to itself. Depending on
the signal, this causes ksh to terminate itself ungracefully:

    $ cat /tmp/exit267
    trap 'echo OK' EXIT  # This trap triggers the crash
    function foo { return 267; }
    foo
    $ bash /tmp/exit267
    OK
    $ ksh-3aee10d7 /tmp/exit267
    OK
    $ ksh /tmp/exit267
    Memory fault(coredump)

On most systems, status 267 corresponds to SIGSEGV. The reported
memory fault is not real; it results from ksh incorrectly killing
itself with that signal.

The problem is caused by two factors:

1. As of 93u+ 2012-08-01, ksh explicitly allows 'return' to use an
   exit status corresponding to a signal (from 257 to end of signal
   range). The rest of the integer range is trunctated to 8 bits.
   This is contrary to both 'man ksh' and 'return --man' which both
   say it's always truncated to 8 bits. Plus, combined with point 2
   below, this new behaviour is nonsensical, as 'return' has no
   business actually generating signals. However, a couple of
   regression tests now depend on this, as may some scripts.

2. When a ksh-style function does not handle a signal, the signal
   is passed down to the parent environment and ksh does this by
   reissuing the signal to its own process after leaving the
   function scope. However, it does this by checking the exit
   status, which is very bad practice as there is no guarantee
   that an exit status corresponding to a signal was in fact
   produced by a signal, particularly after they changed the
   behaviour of 'return' per 1 above.

This commit fixes both issues. It also takes a proper decision on
allowable 'return' exit status arguments. Since 93u+ was released
nearly a decade ago and some scripts may now rely on being able to
pass certain exit statuses out of the 8-bit range, we should not
disallow this now. But neither should we be half-hearted in
allowing only some arbitrary selection of 9-bit statuses; 'return'
values categorically should have nothing to do with signals, so
this is no basis for limiting them. We're now allowing the full
unsigned integer range, which is usually 32 bits. This is like zsh,
and may create some interesting possibilities for scripts.
Just don't forget that $? will still lose all but its 8 least
significant bits when leaving the current (sub)shell environment.

src/cmd/ksh93/sh/xec.c: sh_funscope():
- Fix passing down unhandled signals from interrupted ksh functions
  (jumpval==SH_JMPFUN) to the parent environment. Do not pay any
  attention to the exit status. Instead, use sh.lastsig (a.k.a.
  shp->lastsig). It is set by sh_fault() in fault.c for just this
  purpose and contains the last signal handled for the current
  command. It is reset in sh_exec() before running any new command.
  So if it contains a signal, that is the one that interrupted the
  ksh function, so it's the correct one to pass down. (Further
  evidence: sh_subshell() was already using this in the same way.)

src/cmd/ksh93/bltins/cflow.c: b_return():
- Allow any signed int return value when invoked as and behaving
  like 'return'.
- Add warning if a passed value is out of int range. Set the exit
  status to 128 in that case; int overflow is undefined behaviour
  in C and we want consistent behaviour across platforms. It should
  be safe enough to check if the long and int values are equal.
- Refactor for clarity.

src/cmd/ksh93/sh/subshell.c: sh_subshell():
- If a function returns with a status out of the 8 bit range in a
  virtual subshell, this status could be passed down to the parent
  shell in full. However, if the subshell forks, then the kernel
  will enforce an 8-bit exit status. That is inconsistent. Scripts
  should not be able to tell the difference between forked and
  non-forked subshells, so artificially enforce that limit here.

Other changed files:
- Documentation updates and copy-edits.
- Update an AT&T functions.sh regress test to allow arbitrary
  integer return values for functions.
- Add regression tests based in part on @JohnoKing's reproducers.
- Rework some vaguely related regression tests to fail gracefully.

Thanks to Johnothan King for the report and the testing.

Fixes: https://github.com/ksh93/ksh/issues/364
2021-12-09 06:41:39 +01:00
Martijn Dekker
0d2d6bb47b Documentation tweaks (re: cd8c48cc)
The information about an out of memory error does not apply
to the version of the PR that was eventually committed since
it does not use getcwd() which might cause such an error.
See https://github.com/ksh93/ksh/pull/358#issuecomment-986294934
2021-12-06 06:58:34 +01:00
Johnothan King
cd8c48cc5a Add the '-e' flag to the 'cd' builtin (#358)
This change adds the -e flag to the cd builtin, as specified in
<https://www.austingroupbugs.net/view.php?id=253>. The -e flag is
used to verify if the the current working directory after 'cd -P'
successfully changes the directory, and returns with exit status 1
if the cwd couldn't be determined. Additionally, it causes all
other errors to return with exit status >1 (i.e., status 2 unless
ENOMEM occurs) if -e and -P are both active.

src/cmd/ksh93/bltins/cd_pwd.c:
- Add -e option to the cd builtin command. It verifies $PWD by
  using test_inode() to execute the equivalent of [[ . -ef $PWD ]].
- The check for restricted mode has been moved after optget to
  allow 'cd -eP' to return with exit status 2 when in restricted
  mode. To avoid changing the previous behavior of cd when -e isn't
  passed, extra checks have been added to prevent cd from printing
  usage information in restricted mode.

src/cmd/ksh93/tests/builtins.sh:
- Add regression tests for the exit status when using the cd -P
  flag with and without -e.

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/sh.1:
- Document the addition of -e to the cd builtin.
2021-12-06 06:58:11 +01:00
Johnothan King
6904585f49 Port illumos' shell linter improvements (#353)
This commit ports over two improvements to the shell linter from
illumos (original patch written by Andy Fiddaman). Links to the
relevant bug reports and the original patch:
https://www.illumos.org/issues/13601
https://www.illumos.org/issues/13631
c7b656fc71

The first improvement is to the lint warning for arithmetic
operators in [[ ... ]]. The ksh linter now suggests the correct
equivalent operator to use in ((...)). Example:
  $ ksh -nc '[[ 30 -gt 25 ]]'
  # Original warning
  warning: line 1: -gt within [[ ... ]] obsolete, use ((...))
  # New warning
  warning: line 1: [[ ... -gt ... ]] obsolete, use ((... > ...))

The second improvement pertains to variable expansion in arithmetic
expressions. The ksh linter now suggests referencing variable names
directly:
  $ ksh -nc 'integer foo=40; (($foo < 50 ))'
  # Old warning
  warning: line 1: variable expansion makes arithmetic evaluation less efficient
  # New warning
  warning: line 1: in '(($foo < 50))', using '$' is slower and can introduce rounding errors

src/cmd/ksh93/{data/lexstates,sh/lex,sh/parse}.c:
- Port the improved shell lint warnings from illumos to ksh93u+m.
- The original checks for arithmetic operators involved a bunch of
  if statements with inefficient calls to strcmp(3). These were
  replaced with a more efficient switch statement that avoids
  strcmp.
2021-12-05 19:27:16 +01:00
Martijn Dekker
f508660ddf Revert "Fix defining types conditionally and/or in subshells (re: 8ced1daa)"
This reverts commit 2b9cbbbc8e.

This is not ready for prime time. Crashses when running a $PS2
discipline function. This needs fixing and more testing in
development before making it into the 1.0 branch. In the meantime,
that terrible problem with types is back, sorry about that.
2021-11-29 20:08:53 +01:00
Martijn Dekker
2b9cbbbc8e Fix defining types conditionally and/or in subshells (re: 8ced1daa)
This commit mitigates the effects of the hack explained in the
referenced commit so that dummy built-in command nodes added by the
parser for declaration/assignment purposes do not leak out into the
execution level, except in a relatively harmless corner case.

Something like

	if false; then
		typeset -T Foo_t=(integer -i bar)
	fi

will no longer leave a broken dummy Foo_t declaration command. The
same applies to declaration commands created with enum.

The corner case remaining is:

$ ksh -c 'false && enum E_t=(a b c); E_t -a x=(b b a c)'
ksh: E_t: not found

Since the 'enum' command is not executed, this should have thrown
a syntax error on the 'E_t -a' declaration:
ksh: syntax error at line 1: `(' unexpected

This is because the -c script is parsed entirely before being
executed, so E_t is recognised as a declaration built-in at parse
time. However, the 'not found' error shows that it was successfully
eliminated at execution time, so the inconsistent state will no
longer persist.

This fix now allows another fix to be effective as well: since
built-ins do not know about virtual subshells, fork a virtual
subshell into a real subshell before adding any built-ins.

src/cmd/ksh93/sh/parse.c:

- Add a pair of functions, dcl_hactivate() and dcl_dehacktivate(),
  that (de)activate an internal declaration built-ins tree into
  which check_typedef() can pre-add dummy type declaration command
  nodes. A viewpath from the main built-ins tree to this internal
  tree is added, unifying the two for search purposes and causing
  new nodes to be added to the internal tree. When parsing is done,
  we close that viewpath. This hides those pre-added nodes at
  execution time. Since the parser is sometimes called recursively
  (e.g. for command substitutions), keep track of this and only
  activate and deactivate at the first level.

- We also need to catch errors. This is done by setting libast's
  error_info.exit variable to a dcl_exit() function that tidies up
  and then passes control to the original (usually sh_exit()).

- sh_cmd(): This is the most central function in the parser. You'd
  think it was sh_parse(), but $(modern)-form command substitutions
  use sh_dolparen() instead. Both call sh_cmd(). So let's simply
  add a dcl_hacktivate() call at the beginning and a
  dcl_deactivate() call at the end.

- assign(): This function calls path_search(), which among many
  other things executes an FATH search, which may execute arbitrary
  code at parse time (!!!). So, regardless of recursion level,
  forcibly dehacktivate() to avoid those ugly parser side effects
  returning in that context.

src/cmd/ksh93/bltins/enum.c: b_enum():

- Fork a virtual subshell before adding a built-in.

src/cmd/ksh93/sh/xec.c: sh_exec():

- Fork a virtual subshell when detecting typeset's -T option.

Improves fix to https://github.com/ksh93/ksh/issues/256
2021-11-29 09:02:07 +01:00
Martijn Dekker
c0334e32a1 [1.0 release prep] Remove tilde expansion discipline
Defining a .sh.tilde.get or .sh.tilde.set discipline function to
extend tilde expansion works well as long as the discipline
function doesn't get interrupted (e.g. with Crtl+C) or produce an
error message. Either of those will cause the shell to become
unstable and crash.

This feature is now removed from the 1.0 branch as it is not ready
for prime time. It can return to a release branch if/when we manage
to fix it on the master branch.

Related: https://github.com/ksh93/ksh/issues/346
2021-11-24 07:46:58 +01:00
Martijn Dekker
8ced1daadf Fix enum type definition pre-parsing for shcomp and dot/source
Parser limitations prevent shcomp or source from handling enum
types correctly:

    $ cat /tmp/colors.sh
    enum Color_t=(red green blue orange yellow)
    Color_t -A Colors=([foo]=red)

    $ shcomp /tmp/colors.sh > /dev/null
    /tmp/colors.sh: syntax error at line 2: `(' unexpected
    $ source /tmp/colors.sh
    /bin/ksh: source: syntax error: `(' unexpected

Yet, for types created using 'typeset -T', this works. This is done
via a check_typedef() function that preliminarily adds the special
declaration builtin at parse time, with details to be filled in
later at execution time.

This hack will produce ugly undefined behaviour if the definition
command creating that type built-in is then not actually run at
execution time before the type built-in is accessed.

But the hack is necessary because we're dealing with a fundamental
design flaw in the ksh language. Dynamically addable built-ins that
change the syntactic parsing of the shell language on the fly are
an absurdity that violates the separation between parsing and
execution, which muddies the waters and creates the need for some
kind of ugly hack to keep things like shcomp more or less working.

This commit extends that hack to support enum.

src/cmd/ksh93/sh/parse.c:
- check_typedef():
  - Add 'intypeset' parameter that should be set to 1 for typeset
    and friends, 2 for enum.
  - When processing enum arguments, use AST getopt(3) to skip over
    enum's options to find the name of the type to be defined.
    (getopt failed if we were running a -c script; deal with this
    by zeroing opt_info.index first.)
- item(): Update check_typedef() call, passing lexp->intypeset.
- simple(): Set lexp->intypeset to 2 when processing enum.

The rest of the changes are all to support the above and should be
fairly obvious, except:

src/cmd/ksh93/bltins/enum.c:
- enuminfo(): Return on null pointer, avoiding a crash upon
  executing 'Type_t --man' if Type_t has not been fully defined due
  to the definition being pre-added at parse time but not executed.
  It's all still wrong, but a crash is worse.

Resolves: https://github.com/ksh93/ksh/issues/256
2021-11-21 17:43:55 +01:00
Martijn Dekker
15bbc2f632 manual: use consistent terminology
The ksh manual page is one of the few places that calls globbing
"file name generation". The mksh and zsh manuals use the same term.
But every other shell's manual calls it "pathname expansion": bash,
dash, yash, FreeBSD sh. So does ksh's built-in documentation (alias
--man, export --man, readonly --man, set --man, typeset --man).
What's more, the authoritative ksh reference, Bolsky & Korn's 1995
"The New Kornshell" book, also calls it "pathname expansion", and
so does the POSIX standard.

Similarly, "arithmetic substitution" should be called "arithmetic
expansion" per Bolsky & Korn as well as POSIX.

This commit has several other miscellaneous documentation tweaks as
well.
2021-11-19 03:54:42 +01:00
Martijn Dekker
bd9752e43c Backport 'printf -v' from ksh 93v-
'printf' on bash and zsh has a popular -v option that allows
assigning formatted output directly to variables without using a
command substitution. This is much faster and avoids snags with
stripping final linefeeds. AT&T had replicated this feature in the
abandoned 93v- beta version. This backports it with a few tweaks
and one user-visible improvement.

The 93v- version prohibited specifying a variable name with an
array subscript, such as printf -v var\[3\] foo. This works fine on
bash and zsh, so I see no reason why this should not work on ksh,
as nv_putval() deals with array subscripts just fine.

src/cmd/ksh93/bltins/print.c: b_print():
- While processing the -v option when called as printf, get a
  pointer to the variable, creating it if necessary. Pass only the
  NV_VARNAME flag to enforce a valid variable name, and not (as
  93v- does) the NV_NOARRAY flag to prohibit array subscripts.
- If a variable was given, set the output file to an internal
  string buffer and jump straight to processing the format.
- After processing the format, assign the contents to the string
  buffer to the variable.

src/cmd/ksh93/data/builtins.c:
- Document the new option, adding a warning that unquoted square
  brackets may trigger pathname expansion.
2021-11-19 03:54:33 +01:00
Martijn Dekker
d9cd49c6d7 Remove duplicate error message
e_badnum from streval.h and e_number from shell.h are both defined
as "%s: bad number". We only need one. Remove the one that is used
only once: e_badnum.
2021-11-15 21:15:41 +01:00
Martijn Dekker
d9f1fdaa41 Fix [ \( str -a str \) ], [ \( str -o str \) ]
Symptoms:

$ test \( string1 -a string2 \)
/usr/local/bin/ksh: test: argument expected
$ test \( string1 -o string2 \)
/usr/local/bin/ksh: test: argument expected

The parentheses should be irrelevant and this should be a test for
the non-emptiness of string1 and/or string2.

src/cmd/ksh93/bltins/test.c:

- b_test(): There is a block where the case of 'test' with five or
  less arguments, the first and last one being parentheses, is
  special-cased. The parentheses are removed as a workaround: argv
  is increased to skip the opening parenthesis and argc is
  decreased by 2. However, there is no corresponding increase of
  tdata.av which is a copy of this function's argv. This renders
  the workaround ineffective. The fix is to add that increase.

- e3(): Do not handle '!' as a negator if not followed by an
  argument. This allows a right-hand expression that is equal to
  '!' (i.e. a test for the non-emptiness of the string '!').
2021-11-15 02:44:56 +01:00
Martijn Dekker
c81473061a test/[: binary operators: fix '<' and add '=~'; some more cleanups
In ksh88, the test/[ built-in supported both the '<' and '>'
lexical sorting comparison operators, same as in [[. However, in
every version of ksh93, '<' does not work though '>' still does!

Still, the code for both is present in test_binop():

src/cmd/ksh93/bltins/test.c
548:		case TEST_SGT:
549:			return(strcoll(left, right)>0);
550:		case TEST_SLT:
551:			return(strcoll(left, right)<0);

Analysis: The binary operators are looked up in shtab_testops[] in
data/testops.c using a macro called sh_lookup, which expands to a
sh_locate() call. If we examine that function in sh/string.c, it's
easy to see that on systems using ASCII (i.e. all except IBM
mainframes), it assumes the table is sorted in ASCII order.

src/cmd/ksh93/sh/string.c
64:	while((c= *tp->sh_name) && (CC_NATIVE!=CC_ASCII || c <= first))

The problem was that the '<' operator was not correctly sorted in
shtab_testops[]; it was sorted immediately before '>', but after
'='. The ASCII order is: < (60), = (61), > (62). This caused '<' to
never be found in the table.

The test_binop() function is also used by [[, yet '<' always worked
in that. This is because the parser has code that directly checks
for '<' and '>' within [[ (in sh/parse.c, lines 1949-1952).

This commit also adds '=~' to 'test', which took three lines of
code and allowed eliminating error handling in test_binop() as
test/[ and [[ now support the same binary ops. (re: fc2d5a60)

src/cmd/ksh93/*/*.[ch]:
- Rename a couple of very misleadingly named macros in test.h:
  . For == and !=, the TEST_PATTERN bit is off for pattern compares
    and on for literal string compares! Rename to TEST_STRCMP.
  . The TEST_BINOP bit does not denote all binary operators, but
    only the logical -a/-o ops in test/[. Rename to TEST_ANDOR.

src/cmd/ksh93/bltins/test.c: test_binop():
- Add support for =~. This is only used by test/[. The method is
  implemented in two lines that convert the ERE to a shell pattern
  by prefixing it with ~(E), then call test_strmatch with that
  temporary string to match the ERE and update ${.sh.match}.
- Since all binary ops from shtab_testops[] are now accounted for,
  remove unknown op error handling from this function.

src/cmd/ksh93/data/testops.c:
- shtab_testops[]:
  . Correctly sort the '<' (TEST_SLT) entry.
  . Remove ']]' (TEST_END). It's not an op and doesn't belong here.
- Update sh_opttest[] documentation with =~, \<, \>.
- Remove now-unused e_unsupported_op[] error message.

src/cmd/ksh93/sh/lex.c: sh_lex():
- Check for ']]' directly instead of relying on the removed
  TEST_END entry from shtab_testops[].

src/cmd/ksh93/tests/bracket.sh:
- Add relevant tests.

src/cmd/ksh93/tests/builtins.sh:
- Fix an old test that globally deleted the 'test' builtin. Delete
  it within the command substitution subshell only.
- Remove the test for non-support of =~ in test/[.
- Update the test for invalid test/[ op to use test directly.
2021-11-14 02:46:34 +01:00
Martijn Dekker
6f5c9fea93 test/[: Fix binary -a/-o operators in POSIX mode
POSIX requires
	test "$a" -a "$b"
to return true if both $a and $b are non-empty, and
	test "$a" -o "$b"
to return true if either $a or $b is non-empty.

In ksh, this fails if "$a" is '!' or '(' as this causes ksh to
interpret the -a and -o as unary operators (-a being a file
existence test like -e, and -o being a shell option test).

$ test ! -a ""; echo "$?"
0		(expected: 1/false)
$ set -o trackall; test ! -o trackall; echo "$?"
1		(expected: 0/true)
$ test \( -a \); echo "$?"
ksh: test: argument expected
2		(expected: 0/true)
$ test \( -o \)
ksh: test: argument expected
2		(expected: 0/true)

Unfortunately this problem cannot be fixed without risking breakage
in legacy scripts. For instance, a script may well use
	test ! -a filename
to check that a filename is nonexistent. POSIX specifies that this
always return true as it is a test for the non-emptiness of both
strings '!' and 'filename'.

So this commit fixes it for POSIX mode only.

src/cmd/ksh93/bltins/test.c: e3():
- If the posix option is active, specially handle the case of
  having at least three arguments with the second being -a or -o,
  overriding their handling as unary operators.

src/cmd/ksh93/data/testops.c:
- Update 'test --man --' date and say that unary -a is deprecated.

src/cmd/ksh93/sh.1:
- Document the fix under the -o posix option.
- For test/[, explain that binary -a/-o are deprecated.

src/cmd/ksh93/tests/bracket.sh:
- Add tests based on reproducers in bug report.

Resolves: https://github.com/ksh93/ksh/issues/330
2021-11-13 03:43:29 +01:00
Martijn Dekker
7b5b0a5d54 Fix octal number arguments in printf integer arithmetic
Bug 1: POSIX requires numbers used as arguments for all the %d,
%u... in printf to be interpreted as in the C language, so
	printf '%d\n' 010
should output 8 when the posix option is on. However, it outputs 10.

This bug was introduced as a side effect of a change introduced in
the 2012-02-07 version of ksh 93u+m, which caused the recognition
of leading-zero numbers as octal in arithmetic expressions to be
disabled outside ((...)) and $((...)). However, POSIX requires
leading-zero octal numbers to be recognised for printf, too.

The change in question introduced a sh.arith flag that is set while
we're processing a POSIX arithmetic expression, i.e., one that
recognises leading-zero octal numbers.
Bug 2: Said flag is not reset in a command substitution used within
an arithmetic expression. A command substitution should be a
completely new context, so the following should both output 10:

$ ksh -c 'integer x; x=010; echo $x'
10            # ok; it's outside ((…)) so octals are not recognised
$ ksh -c 'echo $(( $(integer x; x=010; echo $x) ))'
8             # bad; $(comsub) should create new non-((…)) context

src/cmd/ksh93/bltins/print.c: extend():
- For the u, d, i, o, x, and X conversion modifiers, set the POSIX
  arithmetic context flag before calling sh_strnum() to convert the
  argument. This fixes bug 1.

src/cmd/ksh93/sh/subshell.c: sh_subshell():
- When invoking a command substitution, save and unset the POSIX
  arithmetic context flag. Restore it at the end. This fixes bug 2.

Reported-by: @stephane-chazelas
Resolves: https://github.com/ksh93/ksh/issues/326
2021-09-13 04:57:37 +02:00
hyenias
92f7ca5423
Back port ksh93v- float, int, and exp10 changes from math.tab (#299)
src/cmd/ksh93/data/math.tab:
- Added exp10().
- Remove int() as being an alias to floor().
- Created entries for local float() and local int() which are
  defined in features/math.sh.

src/cmd/ksh93/features/math.sh:
- Backport floor() and int() related code from ksh93v-.

src/cmd/ksh93/sh.1:
- Sync man page to math.tab's potential functions.
2021-05-08 04:43:37 +01:00
Martijn Dekker
d087b031f0 Fix single quotes in expansion operator string (re: 5ed9ffd6)
The referenced commit introduced the following bug:

> The closing quote does not appear to be registering during the
> parse of the following:
>
>	echo ${var:+'{}'}
>
> Within a script, this will result in:
>
>	syntax error at line 1: `'' unmatched

src/cmd/ksh93/data/lexstates.c,
src/cmd/ksh93/include/lexstates.h:
- Add new ST_MOD1 state table that is a copy of ST_QUOTE, but adds
  a special meaning (ST_LIT) for the single quote (position 39).

src/cmd/ksh93/sh/lex.c: sh_lex():
- For parameter expansion operators with old-style quoting
  (S_MOD1), use the new ST_MOD1 state table instead of ST_QUOTE.
  This causes single quotes within them to be processed properly.

src/cmd/ksh93/tests/quoting2.sh:
- Add tests.

Thanks to @gkamat for the bug report.
Resolves: https://github.com/ksh93/ksh/issues/290
2021-04-30 05:28:21 +01:00
Martijn Dekker
2aad3cab06 Add ksh 93u+m contributors notice to 964 copyright headers 2021-04-26 00:19:31 +01:00
Johnothan King
086d504393
Lots of man page fixes and some other minor fixes (#284)
Noteworthy changes:
- The man pages have been updated to fix a ton of instances of
  runaway underlining (this was done with `sed -i 's/\\f5/\\f3/g'`
  commands). This commit dramatically increased in size because
  of this change.
- The documentation for spawnveg(3) has been extended with
  information about its usage of posix_spawn(3) and vfork(2).
- The documentation for tmfmt(3) has been updated with the changes
  previously made to the man pages for the printf and date builtins
  (though the latter builtin is disabled by default).
- The shell's tracked alias tree (hash table) is now documented in
  the shell(3) man page.
- Removed the commented out regression test for an ERRNO variable
  as the COMPATIBILITY file states it was removed in ksh93.
2021-04-23 22:02:30 +01:00
Johnothan King
2c38fb93fd
Fix the exit status returned when a command isn't executable (#273)
Previous discussion: https://github.com/att/ast/issues/485

If ksh attempts to execute a non-executable command found in the
PATH, in some instances the error message and return status are
incorrect. In the example below, ksh returns with exit status 126
when using the -c execve(2) optimization or when using fork(2) in
an interactive shell. However, using posix_spawn(3) causes the exit
status to change:
  $ echo 'print cannot execute' > /tmp/x
  # Runs command with spawnveg (i.e., posix_spawn or vfork)
  $ ksh -c 'PATH=/tmp; x; echo $?'
  ksh: x: not found
  127
  # Runs command with execve
  $ ksh -c 'PATH=/tmp; x'; echo $?
  ksh: x: cannot execute [Permission denied]
  126
  # Runs command with fork
  $ ksh -ic 'PATH=/tmp; x; echo $?'
  ksh: x: cannot execute [Permission denied]
  126

Since 'x' is in the PATH but can't be executed, the correct exit
status is 126, not 127. It's worth noting this bug doesn't cause
the regression tests to fail with ksh93u+m, but it does cause one
test to fail when run under dtksh:

    path.sh[706]: Long nonexistent command name: got status 126, ''

This commit backports various fixes for this bug from ksh2020, with
additional fixes applied (since there were still some additional
issues the ksh2020 patch didn't fix). The lacking regression test
for exit status 126 in path.sh has been rewritten to test for more
scenarios where ksh failed to return the correct error message
and/or exit status. I can also confirm with this patch applied the
path.sh regression tests now pass when run under dtksh.

src/cmd/ksh93/sh/path.c:
- Add a comment to path_absolute() describing 'oldpp' is the
  current pointer in the while loop and 'pp' is the next pointer.
  Backported from:
  a6cad450

- The patch from ksh2020 didn't fix this bug in the SHOPT_SPAWN
  code (because ksh2020 prefers fork(2)), so issues with the exit
  status could still occur when using spawnveg. To fix this, always
  set 'noexec' to the value of errno if can_execute fails. Before
  this fix, errno was discarded if 'pp' was a null pointer and
  can_execute failed.

- If a command couldn't be executed and the error wasn't ENOENT,
  save errno in a 'not_executable' variable. If an executable
  command couldn't be found in the PATH, exit with status 126 and
  set errno to the saved value. This was based on a ksh2020 bugfix,
  but it has been reworked a little bit to fix a bug that caused a
  mismatch between the error message shown and errno. Example with
  a non-executable file in PATH:
  $ nonexec
  ksh2020: nonexec: cannot execute [No such file or directory]
  The ksh2020 patch: <https://github.com/att/ast/pull/493>

- Backport a ksh2020 bugfix for directories in the PATH when
  running one of the added regression tests on OpenBSD:
  https://github.com/att/ast/pull/767

src/cmd/ksh93/data/msg.c,
src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/sh/{path,xec}.c:
- If a command name is too long (ENAMETOOLONG), then it wasn't
  found in the PATH. For that case return exit status 127, like
  for ENOENT.

src/cmd/ksh93/tests/path.sh:
- Replace the old test with a new set of more extensive tests.
  These tests check the error message and exit status when ksh
  attempts to run a command using any of the following:
   - execve(2), used with the last command run with -c       (*A tests).
   - posix_spawn(3)/vfork(2), used in noninteractive scripts (*B tests).
   - fork(2), used in interactive shells with job control    (*C tests).
   - command -x                                              (*D tests).
   - exec(1)                                                 (*E tests).
- Add a regression test from ksh2020 for attempting to execute a
  directory:
  https://github.com/att/ast/pull/758

src/lib/libast/include/ast.h,
src/lib/libast/include/wait.h:
- Avoid bitshifts in macros for static error codes. The return
  values of command not found and exec related errors are static
  values and should not require any macro magic for calculation.
  Backported from: c073b102
- Simplify EXIT_* and W* macros to use 8 bits.
2021-04-15 03:37:57 +01:00
Johnothan King
a065558291
Fix more compiler warnings, typos and other minor issues (#260)
Many of these changes are minor typo fixes. The other changes
(which are mostly compiler warning fixes) are:

NEWS:
- The --globcasedetect shell option works on older Linux kernels
  when used with FAT32/VFAT file systems, so remove the note about
  it only working with 5.2+ kernels.

src/cmd/ksh93/COMPATIBILITY:
- Update the documentation on function scoping with an addition
  from ksh93v- (this does apply to ksh93u+).

src/cmd/ksh93/edit/emacs.c:
- Check for '_AST_ksh_release', not 'AST_ksh_release'.

src/cmd/INIT/mamake.c,
src/cmd/INIT/ratz.c,
src/cmd/INIT/release.c,
src/cmd/builtin/pty.c:
- Add more uses of UNREACHABLE() and noreturn, this time for the
  build system and pty.

src/cmd/builtin/pty.c,
src/cmd/builtin/array.c,
src/cmd/ksh93/sh/name.c,
src/cmd/ksh93/sh/nvtype.c,
src/cmd/ksh93/sh/suid_exec.c:
- Fix six -Wunused-variable warnings (the name.c nv_arrayptr()
  fixes are also in ksh93v-).
- Remove the unused 'tableval' function to fix a -Wunused-function
  warning.

src/cmd/ksh93/sh/lex.c:
- Remove unused 'SHOPT_DOS' code, which isn't enabled anywhere.
  https://github.com/att/ast/issues/272#issuecomment-354363112

src/cmd/ksh93/bltins/misc.c,
src/cmd/ksh93/bltins/trap.c,
src/cmd/ksh93/bltins/typeset.c:
- Add dictionary generator function declarations for former
  aliases that are now builtins (re: 1fbbeaa1, ef1621c1, 3ba4900e).
- For consistency with the rest of the codebase, use '(void)'
  instead of '()' for print_cpu_times.

src/cmd/ksh93/sh/init.c,
src/lib/libast/path/pathshell.c:
- Move the otherwise unused EXE macro to pathshell() and only
  search for 'sh.exe' on Windows.

src/cmd/ksh93/sh/xec.c,
src/lib/libast/include/ast.h:
- Add an empty definition for inline when compiling with C89.
  This allows the timeval_to_double() function to be inlined.

src/cmd/ksh93/include/shlex.h:
- Remove the unused 'PIPESYM2' macro.

src/cmd/ksh93/tests/pty.sh:
- Add '# err_exit #' to count the regression test added in
  commit 113a9392.

src/lib/libast/disc/sfdcdio.c:
- Move diordwr, dioread, diowrite and dioexcept behind
  '#ifdef F_DIOINFO' to fix one -Wunused-variable warning and
  multiple -Wunused-function warnings (sfdcdio() only uses these
  functions when F_DIOINFO is defined).

src/lib/libast/string/fmtdev.c:
- Fix two -Wimplicit-function-declaration warnings on Linux by
  including sys/sysmacros.h in fmtdev().
2021-04-08 19:58:07 +01:00
Johnothan King
b2a7ec032f
Add LC_TIME to the supported locale variables (#257)
The current version of 93u+m does not have proper support for the
LC_TIME variable. Setting LC_TIME has no effect on printf %T, and
if the locale is invalid no error message is shown:
    $ LC_TIME=ja_JP.UTF-8
    $ printf '%T\n' now
    Wed Apr  7 15:18:13 PDT 2021
    $ LC_TIME=invalid.locale
    $ # No error message

src/cmd/ksh93/data/variables.c,
src/cmd/ksh93/include/variables.h,
src/cmd/ksh93/sh/init.c:
- Add support for the $LC_TIME variable. ksh93v- attempted to add
  support for LC_TIME, but the patch from that version was extended
  because the variable still didn't function correctly.

src/cmd/ksh93/tests/variables.sh:
- Add LC_TIME to the regression tests for LC_* variables.
2021-04-08 13:06:22 +01:00
Johnothan King
fc2d5a6019
test foo =~ foo should fail with exit status 2 (#245)
When test is passed the '=~' operator, it will silently fail with
exit status 1:
    $ test foo =~ foo; echo $?
    1
This bug is caused by test_binop reaching the 'NOTREACHED' area of
code. The bugfix was adapted from ksh2020:
https://github.com/att/ast/issues/1152

src/cmd/ksh93/bltins/test.c: test_binop():
- Error out with a message suggesting usage of '[[ ... ]]' if '=~'
  is passed to the test builtin.
- Special-case TEST_END (']]') as that is not really an operator.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2021-03-27 21:51:16 +00:00
Martijn Dekker
af07bb6aa3 globcasedetect: add 'set --man' self-doc (re: 71934570) 2021-03-22 19:42:08 +00:00
Martijn Dekker
71934570bf Add --globcasedetect shell option for globbing and completion
One of the best-kept secrets of libast/ksh93 is that the code
includes support for case-insensitive file name generation (a.k.a.
pathname expansion, a.k.a. globbing) as well as case-insensitive
file name completion on interactive shells, depending on whether
the file system is case-insensitive or not. This is transparently
determined for each directory, so a path pattern that spans
multiple file systems can be part case-sensitive and part case-
insensitive. In more precise terms, each slash-separated path name
component pattern P is treated as ~(i:P) if its parent directory
exists on a case-insensitive file system. I recently discovered
this while dealing with <https://github.com/ksh93/ksh/issues/223>.

However, that support is dead code on almost all current systems.
It depends on pathconf(2) having a _PC_PATH_ATTRIBUTES selector.
The 'c' attribute is supposedly returned if the given directory is
on a case insensitive file system. There are other attributes as
well (at least 'l', see src/lib/libcmd/rm.c). However, I have been
unable to find any system, current or otherwise, that has
_PC_PATH_ATTRIBUTES. Google and mailing list searches yield no
relevant results at all. If anyone knows of such a system, please
add a comment to this commit on GitHub, or email me.

An exception is Cygwin/Windows, on which the "c" attribute was
simply hardcoded, so globbing/completion is always case-
insensitive. As of Windows 10, that is wrong, as it added the
possibility to mount case-sensitive file systems.

On the other hand, this was never activated on the Mac, even
though macOS has always used a case-insensitive file like Windows.
But, being UNIX, it can also mount case-sensitive file systems.

Finally, Linux added the possibility to create individual case-
insensitive ext4 directories fairly recently, in version 5.2.
https://www.collabora.com/news-and-blog/blog/2020/08/27/using-the-linux-kernel-case-insensitive-feature-in-ext4/

So, since this functionality latently exists in the code base, and
three popular OSs now have relevant file system support, we might
as well make it usable on those systems. It's a nice idea, as it
intuitively makes sense for globbing and completion behaviour to
auto-adapt to file system case insensitivity on a per-directory
basis. No other shell does this, so it's a nice selling point, too.

However, the way it is coded, this is activated unconditionally on
supported systems. That is not a good idea. It will surprise users.
Since globbing is used with commands like 'rm', we do not want
surprises. So this commit makes it conditional upon a new shell
option called 'globcasedetect'. This option is only compiled into
ksh on systems where we can actually detect FS case insensitivity.

To implement this, libast needs some public API additions first.

*** libast changes ***

src/lib/libast/features/lib:
- Add probes for the linux/fs.h and sys/ioctl.h headers.
  Linux needs these to use ioctl(2) in pathicase(3) (see below).

src/lib/libast/path/pathicase.c,
src/lib/libast/include/ast.h,
src/lib/libast/man/path.3,
src/lib/libast/Mamfile:
- Add new pathicase(3) public API function. This uses whatever
  OS-specific method it can detect at compile time to determine if
  a particular path is on a case-insensitive file system. If no
  method is available, it only sets errno to ENOSYS and returns -1.
  Currently known to work on: macOS, Cygwin, Linux 5.2+, QNX 7.0+.
- On systems (if any) that have the mysterious _PC_PATH_ATTRIBUTES
  selector for pathconf(2), call astconf(3) and check for the 'c'
  attribute to determine case insensitivity. This should preserve
  compatibility with any such system.

src/lib/libast/port/astconf.c:
- dynamic[]: As case-insensitive globbing is now optional on all
  systems, do not set the 'c' attribute by default on _WINIX
  (Cygwin/Windows) systems.
- format(): On systems that do not have _PC_PATH_ATTRIBUTES, call
  pathicase(3) to determine the value for the "c" (case
  insensitive) attribute only. This is for compatibility as it is
  more efficient to call pathicase(3) directly.

src/lib/libast/misc/glob.c,
src/lib/libast/include/glob.h:
- Add new GLOB_DCASE public API flag to glob(3). This is like
  GLOB_ICASE (case-insensitive matching) except it only makes the
  match case-insensitive if the file system for the current
  pathname component is determined to be case-insensitive.
- gl_attr(): For efficiency, call pathicase(3) directly instead of
  via astconf(3).
- glob_dir(): Only call gl_attr() to determine file system case
  insensitivity if the GLOB_DCASE flag was passed. This makes case
  insensitive globbing optional on all systems.
- glob(): The options bitmask needs to be widened to fit the new
  GLOB_DCASE option. Define this centrally in a new GLOB_FLAGMASK
  macro so it is easy to change it along with GLOB_MAGIC (which
  uses the remaining bits for a sanity check bit pattern).

src/lib/libast/path/pathexists.c:
- For efficiency, call pathicase(3) directly instead of via
  astconf(3).

*** ksh changes ***

src/cmd/ksh93/features/options,
src/cmd/ksh93/SHOPT.sh:
- Add new SHOPT_GLOBCASEDET compile-time option. Set it to probe
  (empty) by default so that the shell option is compiled in on
  supported systems only, which is determined by new iffe feature
  test that checks if pathicase(3) returns an ENOSYS error.

src/cmd/ksh93/data/options.c,
src/cmd/ksh93/include/shell.h:
- Add -o globcasedetect shell option if compiling with
  SHOPT_GLOBCASEDET.

src/cmd/ksh93/sh/expand.c: path_expand():
- Pass the new GLOB_DCASE flag to glob(3) if the
  globcasedetect/SH_GLOBCASEDET shell option is set.

src/cmd/ksh93/edit/completion.c:
- While file listing/completion is based on globbing and
  automatically becomes case-insensitive when globbing does, it
  needs some additional handling to make a string comparison
  case-insensitive in corresponding cases. Otherwise, partial
  completions may be deleted from the command line upon pressing
  tab. This code was already in ksh 93u+ and just needs to be
  made conditional upon SHOPT_GLOBCASEDET and globcasedetect.
- For efficiency, call pathicase(3) directly instead of via
  astconf(3).

src/cmd/ksh93/sh.1:
- Document the new globcasedetect shell option.
2021-03-22 18:45:19 +00:00
Johnothan King
814b5c6890
Fix various minor problems and update the documentation (#237)
These are minor fixes I've accumulated over time. The following
changes are somewhat notable:

- Added a missing entry for 'typeset -s' to the man page.
- Add strftime(3) to the 'see also' section. This and the date(1)
  addition are meant to add onto the documentation for 'printf %T'.
- Removed the man page the entry for ksh reading $PWD/.profile on
  login. That feature was removed in commit aa7713c2.
- Added date(1) to the 'see also' section of the man page.
- Note that the 'hash' command can be used instead of 'alias -t' to
  workaround one of the caveats listed in the man page.
- Use an 'out of memory' error message rather than 'out of space'
  when memory allocation fails.
- Replaced backticks with quotes in some places for consistency.
- Added missing documentation for the %P date format.
- Added missing documentation for the printf %Q and %p formats
  (backported from ksh2020: https://github.com/att/ast/pull/1032).
- The comments that show each builtin's options have been updated.
2021-03-21 14:39:03 +00:00
Martijn Dekker
936a1939a8
Allow proper tilde expansion overrides (#225)
Until now, when performing any tilde expansion like ~/foo or
~user/foo, ksh added a placeholder built-in command called
'.sh.tilde', ostensibly with the intention to allow users to
override it with a shell function or custom builtin. The multishell
ksh93 repo <https://github.com/multishell/ksh93/> shows this was
added sometime between 2002-06-28 and 2004-02-29. However, it has
never worked and crashed the shell.

This commit replaces that with something that works. Specific tilde
expansions can now be overridden using .set or .get discipline
functions associated with the .sh.tilde variable (see manual,
Discipline Functions).

For example, you can use either of:

.sh.tilde.set()
{
        case ${.sh.value} in
        '~tmp') .sh.value=${XDG_RUNTIME_DIR:-${TMPDIR:-/tmp}} ;;
        '~doc') .sh.value=~/Documents ;;
        '~ksh') .sh.value=/usr/local/src/ksh93/ksh ;;
        esac
}

.sh.tilde.get()
{
        case ${.sh.tilde} in
        '~tmp') .sh.value=${XDG_RUNTIME_DIR:-${TMPDIR:-/tmp}} ;;
        '~doc') .sh.value=~/Documents ;;
        '~ksh') .sh.value=/usr/local/src/ksh93/ksh ;;
        esac
}

src/cmd/ksh93/include/variables.h,
src/cmd/ksh93/data/variables.c:
- Add SH_TILDENOD for a new ${.sh.tilde} predefined variable.
  It is initially unset.

src/cmd/ksh93/sh/macro.c:
- sh_btilde(): Removed.
- tilde_expand2(): Rewritten. I started out with the tiny version
  of this function from the 2002-06-28 version of ksh. It uses the
  stack instead of sfio, which is more efficient. A bugfix for
  $HOME == '/' was retrofitted so that ~/foo does not become
  //foo instead of /foo. The rest is entirely new code.
     To implement the override functionality, it now checks if
  ${.sh.tilde} has any discipline function associated with it.
  If it does, it assigns the tilde expression to ${.sh.tilde} using
  nv_putval(), triggering the .set discipline, and then reads it
  back using nv_getval(), triggering the .get discipline. The
  resulting value is used if it is nonempty and does not still
  start with a tilde.

src/cmd/ksh93/bltins/typeset.c,
src/cmd/ksh93/tests/builtins.sh:
- Since ksh no longer adds a dummy '.sh.tilde' builtin, remove the
  ad-hoc hack that suppressed it from the output of 'builtin'.

src/cmd/ksh93/tests/tilde.sh:
- Add tests verifying everything I can think of, as well as tests
  for bugs found and fixed during this rewrite.

src/cmd/ksh93/tests/pty.sh:
- Add test verifying that the .sh.tilde.set() discipline does not
  modify the exit status value ($?) when performing tilde expansion
  as part of tab completion.

src/cmd/ksh93/sh.1:
- Instead of "tilde substitution", call the basic mechanism "tilde
  expansion", which is the term used everywhere else (including the
  1995 Bolsky/Korn ksh book).
- Document the new override feature.

Resolves: https://github.com/ksh93/ksh/issues/217
2021-03-17 21:07:14 +00:00
Martijn Dekker
d4adc8fcf9 Fix test -v for numeric types & set/unset state for short int
This commit fixes two interrelated problems.

1. The -v unary test/[/[[ operator is documented to test if a
   variable is set. However, it always returns true for variable
   names with a numeric attribute, even if the variable has not
   been given a value. Reproducer:
	$ ksh -o nounset -c 'typeset -i n; [[ -v n ]] && echo $n'
	ksh: n: parameter not set
   That is clearly wrong; 'echo $n' should never be reached and the
   error should not occur, and does not occur on mksh or bash.

2. Fixing the previous problem revealed serious breakage in short
   integer type variables that was being masked. After applying
   that fix and then executing 'typeset -si var=0':
   - The conditional assignment expansions ${var=123} and
     ${var:=123} assigned 123 to var, even though it was set to 0.
   - The expansions ${var+s} and ${var:+n} incorrectly acted as if
     the variable was unset and empty, respectively.
   - '[[ -v var ]]' and 'test -v var' incorrectly returned false.
   The problems were caused by a different storage method for short
   ints. Their values were stored directly in the 'union Value'
   member of the Namval_t struct, instead of allocated on the stack
   and referred to by a pointer, as regular integers and all other
   types do. This inherently broke nv_isnull() as this leaves no
   way to distinguish between a zero value and no value at all.
   (I'm also pretty sure it's undefined behaviour in C to check for
   a null pointer at the address where a short int is stored.)
   The fix is to store short ints like other variables and refer
   to them by pointers. The NV_INT16P combined bit mask already
   existed for this, but nv_putval() did not yet support it.

src/cmd/ksh93/bltins/test.c: test_unop():
- Fix problem 1. For -v, only check nv_isnull() and do not check
  for the NV_INTEGER attribute (which, by the way, is also used
  for float variables by combining it with other bits).
  See also 5aba0c72 where we recently fixed nv_isnull() to
  work properly for all variable types including short ints.

src/cmd/ksh93/sh/name.c: nv_putval():
- Fix problem 2, part 1. Add support for NV_INT16P. The code is
  simply copied and adapted from the code for regular integers, a
  few lines further on. The regular NV_SHORT code is kept as this
  is still used for some special variables like ${.sh.level}.

src/cmd/ksh93/bltins/typeset.c: b_typeset():
- Fix problem 2, part 2. Use NV_INT16P instead of NV_SHORT.

src/cmd/ksh93/tests/attributes.sh:
- Add set/unset/empty/nonempty tests for all numeric types.

src/cmd/ksh93/tests/bracket.sh,
src/cmd/ksh93/tests/comvar.sh:
- Update a couple of existing tests.
- Add test for [[ -v var ]] and [[ -n ${var+s} ]] on unset
  and empty variables with many attributes.

src/cmd/ksh93/COMPATIBILITY:
- Add a note detailing the change to test -v.

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/sh.1:
- Correct 'typeset -C' documentation. Variables declared as
  compound are *not* initially unset, but initially have the empty
  compound value. 'typeset' outputs them as:
	typeset -C foo=()
  and not:
	typeset -C foo
  and nv_isnull() is never true for them. This may or may not
  technically be a bug. I don't think it's worth changing, but
  it should at least be documented correctly.
2021-03-10 00:38:41 +00:00
hyenias
5aba0c7251
Fix set/unset state for short integer (typeset -si) (#211)
This commit fixes at least three bugs:
1. When issuing 'typeset -p' for unset variables typeset as short
   integer, a value of 0 was incorrectly diplayed.
2. ${x=y} and ${x:=y} were still broken for short integer types
   (re: 9f2389ed). ${x+set} and ${x:+nonempty} were also broken.
3. A memory fault could occur if typeset -l followed a -s option
   with integers. Additonally, now the last -s/-l wins out as the
   option to utilize instead of it always being short.

src/cmd/ksh93/include/name.h:
- Fix the nv_isnull() macro by removing the direct exclusion of
  short integers from this set/unset test. This breaks few things
  (only ${.sh.subshell} and ${.sh.level}, as far as we can tell)
  while potentially correcting many aspects of short integer use
  (at least bugs 1 and 2 above), as this macro is widely used.
- union Value: add new pid_t *pidp pointer member for PID values
  (see further below).

src/cmd/ksh93/bltins/typeset.c: b_typeset():
- To fix bug 3 above, unset the 'shortint' flag and NV_SHORT
  attribute bit upon encountering the -l optiobn.

*** To fix ${.sh.subshell} to work with the new nv_isnull():

src/cmd/ksh93/sh/defs.h:
- Add new 'realsubshell' member to the shgd (aka shp->gd) struct
  which will be the integer value for ${.sh.subshell}.

src/cmd/ksh93/sh/init.c,
src/cmd/ksh93/data/variables.c:
- Initialize SH_SUBSHELLNOD as a pointer to shgd->realsubshell
  instead of using a short value (.s) directly. Using a pointer
  allows nv_isnull() to return a positive for ${.sh.subshell} as
  a non-null pointer is what it checks for.
- While we're at it, initialize PPIDNOD ($PPID) and SH_PIDNOD
  (${.sh.pid}) using the new pdip union member, which is more
  correct as they are values of type pid_t.

src/cmd/ksh93/sh/subshell.c,
src/cmd/ksh93/sh/xec.c:
- Update the ${.sh.subshell} increases/decreases to refer to
  shgd->realsubshell (a.k.a. shp->gd->realsubshell).

*** To fix ${.sh.level} after changing nv_isnull():

src/cmd/ksh93/sh/macro.c: varsub():
- Add a specific exception for SH_LEVLNOD to the nv_isnull() test,
  so that ${.sh.level} is always considered to be set. Its handling
  throughout the code is too complex/special for a simple fix, so
  we have to special-case it, at least for now.

*** Regression test additions:

src/cmd/ksh93/tests/attributes.sh:
- Add in missing short integer tests and correct the one that
  existed. The -si test now yields 'typeset -x -r -s -i foo'
  instead of 'typeset -x -r -s -i foo=0' which brings it in line
  with all the others.
- Add in some other -l attribute tests for floats. Note, -lX test
  was not added as the size of long double is platform dependent.

src/cmd/ksh93/tests/variables.sh:
- Add tests for ${x=y} and ${x:=y} used on short int variables.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2021-03-08 04:19:36 +00:00
Martijn Dekker
82d6272733 manual: invocation options: edits for clarity
src/cmd/ksh93/data/builtins.c:
- sh_optksh[]: Edit descriptions of -c and -s options for clarity.
- sh_set[]: The --rc long name equivalent for -E was documented
  wrong, but in any case it does not belong in sh_set[], because
  that also shows up in 'set --man' and this invocation-only option
  cannot be used with 'set'. Remove it. (Note that all other
  invocation options already don't have inline documentation of
  their long equivalents. This may or may not be fixed at some
  point. It is problematic because they should not be documented in
  sh_set[] but there is no other good place for them.)

src/cmd/ksh93/sh.1:
- Generally edit the Invocation section for clarity.
- Document the long invocation option equivalents.
- Remove some nonsense from the -s description: "Shell output,
  except for the output of the Special Commands listed above, is
  written to file descriptor 2" (which is standard error).
  In fact, this option has no influence at all on what is written
  to standard error or standard output.
2021-02-25 17:22:53 +00:00
Martijn Dekker
18529b88c6 Add lots of checks for out of memory (re: 0ce0b671)
Huge typeset -L/-R adjustment length values were still causing
crashses on sytems with not enough memory. They should error out
gracefully instead of crashing.

This commit adds out of memory checks to all malloc/calloc/realloc
calls that didn't have them (which is all but two or three).

The stkalloc/stakalloc calls don't need the checks; it has
automatic checking, which is done by passing a pointer to the
outofspace() function to the stakinstall() call in init.c.

src/lib/libast/include/error.h:
- Change the ERROR_PANIC exit status value from ERROR_LEVEL (255)
  to 77, which is what it is supposed to be according to the libast
  error.3 manual page. Exit statuses > 128 for anything else than
  signals are not POSIX compliant and may cause misbehaviour.

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/init.c:
- To facilitate consistency, add a simple extern sh_outofmemory()
  function that throws an ERROR_PANIC "out of memory".

src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/data/builtins.c:
- Remove now-redundant e_nospace[] extern message; it is now only
  used in one place so it might as well be a string literal in
  sh_outofmemory().

All other changed files:
- Verify the result of all malloc/calloc/realloc calls and call
  sh_outofmemory() if they fail.
2021-02-21 22:27:28 +00:00
Johnothan King
2b805f7f1c
Fix many spelling errors and word repetitions (#188)
Many of the errors fixed in this commit are word repetitions
such as 'the the' and minor spelling errors. One formatting
error in the ksh man page has also been fixed.
2021-02-20 03:22:24 +00:00
Martijn Dekker
fe74702766 Fix miscellaneous typos 2021-02-16 16:45:06 +00:00
Martijn Dekker
24598fed7c Add new 'nobackslashctrl' shell option; other minor editor tweaks
The following emacs editor 'feature' kept making me want to go back
to bash. I forget a backslash escape in a command somewhere. So I
go back to insert it. I type the \, then want to go forward. My
right arrow key, instead of moving the cursor, then replaces my
backslash with garbage. Why? The backslash escapes the following
control character at the editor level and inserts it literally.

The vi editor has a variant of this which is much less harmful. It
only works in insert mode and the backslash only escapes the next
kill or erase character.

In both editors, this feature is completely redundant with the
'stty lnext' character which is ^V by default -- and works better
as well because it also escapes ^C, ^J (linefeed) and ^M (Return).

 [In fact, you could even issue 'stty lnext \\' and get a much more
 consistent version of this feature on any shell. You have to type
 two backslashes to enter one, but it won't kill your cursor keys.]

If it were up to me alone, I'd simply remove this misfeature from
both editors. However, it is long-standing documented behaviour.
It's in the 1995 book. Plus, POSIX specifies the vi variant of it.

So, this adds a shell option instead. It was quite trivial to do.
Now I can 'set --nobackslashctrl' in my ~/.kshrc. What a relief!

Note: To keep .kshrc compatibile with older ksh versions, use:
    command set --nobackslashctrl 2>/dev/null

src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/data/options.c:
- Add new SH_NOBACKSLCTRL/"nobackslashctrl" long-form option. The
  "no" prefix shows it to the user as "backslashctrl" which is on
  by default. This avoids unexpectedly changing historic behaviour.

src/cmd/ksh93/edit/emacs.c: ed_emacsread(),
src/cmd/ksh93/edit/vi.c: getline():
- Only set the flag for special backslash handling if
  SH_NOBACKSLCTRL is off.

src/cmd/ksh93/sh.1,
src/cmd/ksh93/data/builtins.c:
- Document the new option (as "backslashctrl", on by default).

Other minor tweaks:

src/cmd/ksh93/edit/edit.c:
- ed_setup(): Add fallback #error if no tput method is set. This
  should never be triggered; it's to catch future editing mistakes.
- escape(): cntl('\t') is nonsense as '\t' is already a control
  character, so change this to just '\t'.
- xcommands(): Let's enable the ^X^D command for debugging
  information on non-release builds.

src/cmd/ksh93/features/cmds:
- The tput feature tests assumed a functioning terminal in $TERM.
  However, for all we know we might be compiling with no tty and
  TERM=dumb. The tput commands won't work then. So set TERM=ansi
  to use a standard default.
2021-02-16 01:29:00 +00:00
Martijn Dekker
af5f7acf99 Fix bugs related to --posix shell option (re: 921bbcae, f45a0f16)
This fixes the following:
1. 'set --posix' now works as an equivalent of 'set -o posix'.
2. The posix option turns off braceexpand and turns on letoctal.
   Any attempt to override that in a single command such as 'set -o
   posix +o letoctal' was quietly ignored. This now works as long
   as the overriding option follows the posix option in the command.
3. The --default option to 'set' now stops the 'posix' option, if
   set or unset in the same 'set' command, from changing other
   options. This allows the command output by 'set +o' to correctly
   restore the current options.

src/cmd/ksh93/data/builtins.c:
- To make 'set --posix' work, we must explicitly list it in
  sh_set[] as a supported option so that AST optget(3) recognises
  it and won't override it with its own default --posix option,
  which converts the optget(3) string to at POSIX getopt(3) string.
    This means it will appear as a separate entry in --man output,
  whether we want it to or not. So we might as well use it as an
  example to document how --optionname == -o optionname, replacing
  the original documentation that was part of the '-o' description.

src/cmd/ksh93/sh/args.c: sh_argopts():
- Add handling for explitit --posix option in data/builtins.c.
- Move SH_POSIX syncing SH_BRACEEXPAND and SH_LETOCTAL from
  sh_applyopts() into the option parsing loop here. This fixes
  the bug that letoctal was ignored in 'set -o posix +o letoctal'.
- Remember if --default was used in a flag, and do not sync options
  with SH_POSIX if the flag is set. This makes 'set +o' work.

src/cmd/ksh93/include/argnod.h,
src/cmd/ksh93/data/msg.c,
src/cmd/ksh93/sh/args.c: sh_printopts():
- Do not potentially translate the 'on' and 'off' labels in 'set
  -o' output. No other shell does, and some scripts parse these.

src/cmd/ksh93/sh/init.c: sh_init():
- Turn on SH_LETOCTAL early along with SH_POSIX if the shell was
  invoked as sh; this makes 'sh -o' and 'sh +o' show expected
  options (not that anyone does this, but correctness is good).

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/include/shell.h:
- The state flags were in defs.h and most (but not all) of the
  shell options were in shell.h. Gather all the shell state and
  option flag definitions into one place in shell.h for clarity.
- Remove unused SH_NOPROFILE and SH_XARGS option flags.

src/cmd/ksh93/tests/options.sh:
- Add tests for these bugs.

src/lib/libast/misc/optget.c: styles[]:
- Edit default optget(3) option self-documentation for clarity.

Several changed files:
- Some SHOPT_PFSH fixes to avoid compiling dead code.
2021-02-14 23:51:19 +00:00
Martijn Dekker
d790731a80 github: Revert failed experiment (re: 2e6c56df, 9ad9a1de, 5c389035)
.github/workflows/ci.yml:
- Go back to wrapping the regression tests in script(1).

src/cmd/ksh93/data/builtins.c:
- Never mind about the stty builtin.

src/cmd/ksh93/tests/pty.sh:
- Refuse to run if there isn't a functioning tty.
- Make sure stty(1) works on /dev/tty by redirecting stdin.
2021-02-13 13:52:14 +00:00
Martijn Dekker
2e6c56df82 github: Try to make the tests pass (re: 9ad9a1de, 5c389035)
So, the pty regression tests on the Linux GitHub runner all failed.

Let's test an assumption: the reason is that we need the stty
builtin to properly set the pty state, because the OS-provided stty
command does not work if there is no real tty.

src/cmd/ksh93/data/builtins.c:
- Compile in the stty built-in. This adds about 20k to the binary
  for a command that most users rarely need and even more rarely
  need to be built in, so only compile it in on non-release builds.

src/cmd/ksh93/tests/pty.sh:
- Skip the tests if we cannot either use the stty builtin or change
  the state of the real terminal to be compatible with the tests.
2021-02-13 13:15:46 +00:00
Martijn Dekker
76ea18dcbd Fix disabling SHOPT_FIXEDARRAY (re: 2182ecfa)
It was easier than expected to fix this one. The many regression
test failures caused by disabling it were all due to one bug:
'typeset -p' output broke when building without this option.

src/cmd/ksh93/sh/nvtree.c: nv_attribute():
- In this function to print the attributes of a name-value pair,
  move four lines of code out of #if SHOPT_FIXEDARRAY...#endif that
  were inadvertently moved into the #if block in ksh93 2012-05-18.
  See the changes to nvtree.c in this multishell repo commit:
  aabab56a

src/cmd/ksh93/data/builtins.c:
- Update/rewrite 'typeset -a' documentation.
- Make it adapt to SHOPT_FIXEDARRAY.
- Fix a few typos.

src/cmd/ksh93/tests/arrays2.sh:
- Only one regression test needs a SHOPT_FIXEDARRAY check.

.github/workflows/ci.yml:
- Disable SHOPT_FIXEDARRAY when regression-testing without SHOPTs.
- Enable xtrace, add ':' commands for traced comments. This should
  make the CI runner output logs a little more readable.
2021-02-10 04:48:56 +00:00
Martijn Dekker
2182ecfa08 Fix compile/regress fails on compiling without SHOPT_* options
Many compile-time options were broken so that they could not be
turned off without causing compile errors and/or regression test
failures. This commit now allows the following to be disabled:

SHOPT_2DMATCH    # two dimensional ${.sh.match} for ${var//pat/str}
SHOPT_BGX        # one SIGCHLD trap per completed job
SHOPT_BRACEPAT   # C-shell {...,...} expansions (, required)
SHOPT_ESH        # emacs/gmacs edit mode
SHOPT_HISTEXPAND # csh-style history file expansions
SHOPT_MULTIBYTE  # multibyte character handling
SHOPT_NAMESPACE  # allow namespaces
SHOPT_STATS      # add .sh.stats variable
SHOPT_VSH        # vi edit mode

The following still break ksh when disabled:

SHOPT_FIXEDARRAY # fixed dimension indexed array
SHOPT_RAWONLY    # make viraw the only vi mode
SHOPT_TYPEDEF    # enable typeset type definitions

Compiling without SHOPT_RAWONLY just gives four regression test
failures in pty.sh, but turning off SHOPT_FIXEDARRAY and
SHOPT_TYPEDEF causes compilation to fail. I've managed to tweak the
code to make it compile without those two options, but then dozens
of regression test failures occur, often in things nothing directly
to do with those options. It looks like the separation between the
code for these options and the rest was never properly maintained.
Making it possible to disable SHOPT_FIXEDARRAY and SHOPT_TYPEDEF
may involve major refactoring and testing and may not be worth it.

This commit has far too many tweaks to list. Notables fixes are:

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/data/options.c:
- Do not compile in the shell options and documentation for
  disabled features (braceexpand, emacs/gmacs, vi/viraw), so the
  shell is not left with no-op options and inaccurate self-doc.

src/cmd/ksh93/data/lexstates.c:
- Comment the state tables to associte them with their IDs.
- In the ST_MACRO table (sh_lexstate9[]), do not make the S_BRACE
  state for position 123 (ASCII for '{') conditional upon
  SHOPT_BRACEPAT (brace expansion), otherwise disabling this causes
  glob patterns of the form {3}(x) (matching 3 x'es) to stop
  working as well -- and that is ksh globbing, not brace expansion.

src/cmd/ksh93/edit/edit.c: ed_read():
- Fixed a bug: SIGWINCH was not handled by the gmacs edit mode.

src/cmd/ksh93/sh/name.c: nv_putval():
- The -L/-R left/right adjustment options to typeset do not count
  zero-width characters. This is the behaviour with SHOPT_MULTIBYTE
  enabled, regardless of locale. Of course, what a zero-width
  character is depends on the locale, but control characters are
  always considered zero-width. So, to avoid a regression, add some
  fallback code for non-SHOPT_MULTIBYTE builds that skips ASCII
  control characters (as per iscntrl(3)) so they are still
  considered to have zero width.

src/cmd/ksh93/tests/shtests:
- Export the SHOPT_* macros from SHOPT.sh to the tests as
  environment variables, so the tests can check for them and decide
  whether or how to run tests based on the compile-time options
  that the tested binary was presumably compiled with.
- Do not run the C.UTF-8 tests if SHOPT_MULTIBYTE is not enabled.

src/cmd/ksh93/tests/*.sh:
- Add a bunch of checks for SHOPT_* env vars. Since most should
  have a value 0 (off) or 1 (on), the form ((SHOPT_FOO)) is a
  convenient way to use them as arithmetic booleans.

.github/workflows/ci.yml:
- Make GitHub do more testing: run two locale tests (Dutch and
  Japanese UTF-8 locales), then disable all the SHOPTs that we can
  currently disable, recompile ksh, and run the tests again.
2021-02-08 22:02:45 +00:00
Martijn Dekker
403e864ac8 Disallow >;word and <>;word for 'redirect' (re: 7b59fb80, 7b82c338)
The >;word and <>;word redirection operators cannot be used with
the 'exec' builtin, but the 'redirect' builtin (which used to be
an alias of 'command exec') permitted them. However, they do not
have the documented effect of the added ';'. So this commit blocks
those operators for 'redirect' as they are blocked for 'exec'.

It also tweaks redirect's error message if a non-redirection
argument is encountered.

src/cmd/ksh93/sh/parse.c: simple():
- Set the lexp->inexec flag for SYSREDIR (redirect) as well as
  SYSEXEC (exec). This flag is checked for in sh_lex() (lex.c) to
  throw a syntax error if one of these two operators is used.

src/cmd/ksh93/sh.1, src/cmd/ksh93/data/builtins.c:
- Documentation tweaks.

src/cmd/ksh93/sh/xec.c, src/cmd/ksh93/bltins/misc.c:
- When 'redirect' gives an 'incorrect syntax' (e_badsyntax) error
  message, include the first word that was found not to be a valid
  redirection. This is simply the first argument, as redirections
  are removed from the arguments list.

src/cmd/ksh93/tests/io.sh:
- Update test to reflect new error message format.
2021-02-07 03:23:56 +00:00
Martijn Dekker
6a0e9a1a75 Tweak and regress-test 'command -x' (re: 66e1d446)
Turns out the assumption I was operating on, that Linux and macOS
align arguments on 32 or 64 bit boundaries, is incorrect -- they
just need some extra bytes per argument. So we can use a bit more
of the arguments buffer on these systems than I thought.

src/cmd/ksh93/features/externs:
- Change the feature test to simply detect the # of extra bytes per
  argument needed. On *BSD and commercial Unices, ARG_EXTRA_BYTES
  shows as zero; on Linux and macOS (64-bit), this yields 8. On
  Linux (32-bit), this yields 4.

src/cmd/ksh93/sh/path.c: path_xargs():
- Do not try to calculate alignment, just add ARG_EXTRA_BYTES to
  each argument.
- Also add this when substracting the length of environment
  variables and leading and trailing static command arguments.

src/cmd/ksh93/tests/path.sh:
- Test command -v/-V with -x.
- Add a robust regression test for command -x.

src/cmd/ksh93/data/builtins.c, src/cmd/ksh93/sh.1:
- Tweak docs. Glob patterns also expand to multiple words.
2021-02-01 02:19:02 +00:00
Martijn Dekker
ede479967f resolve/remove USAGE_LICENSE macros; remove repetitive (c) strings
This takes another small step towards disentangling the build
system from the old AT&T environment. The USAGE_LICENSE macros with
author and copyright information, which was formerly generated
dynamically for each file from a database, are eliminated and the
copyright/author information is instead inserted into the AST
getopt usage strings directly.

Repetitive license/copyright information is also removed from the
getopt strings in the builtin commands (src/lib/libcmd/*.c and
src/cmd/ksh93/data/builtins.c). There's no need to include 55
identical license/copyright strings in the ksh binary; one (in the
main ksh getopt string, shown by ksh --man) ought to be enough!
This makes the ksh binary about 10k smaller.

It does mean that something like 'enum --author', 'typeset
--license' or 'shift --copyright' will now not show those notices
for those builtins, but I doubt anyone will care.
2021-01-31 11:00:49 +00:00
Martijn Dekker
66e1d44642 command -x: fix efficiency; always run external cmd (re: acf84e96)
This commit fixes 'command -x' to adapt to OS limitations with
regards to data alignment in the arguments list. A feature test is
added that detects if the OS aligns the argument on 32-bit or
64-bit boundaries or not at all, allowing 'command -x' to avoid
E2BIG errors while maximising efficiency.

Also, as of now, 'command -x' is a way to bypass built-ins and
run/query an external command. Built-ins do not limit the length of
their argument list, so '-x' never made sense to use for them. And
because '-x' hangs on Linux and macOS on every ksh93 release
version to date (see acf84e96), few use it, so there is little
reason not to make this change.

Finally, this fixes a longstanding bug that caused the minimum exit
status of 'command -x' to be 1 if a command with many arguments was
divided into several command invocations. This is done by replacing
broken flaggery with a new SH_XARG state flag bit.

src/cmd/ksh93/features/externs:
- Add new C feature test detecting byte alignment in args list.
  The test writes a #define ARG_ALIGN_BYTES with the amount of
  bytes the OS aligns arguments to, or zero for no alignment.

src/cmd/ksh93/include/defs.h:
- Add new SH_XARG state bit indicating 'command -x' is active.

src/cmd/ksh93/sh/path.c: path_xargs():
- Leave extra 2k in the args buffer instead of 1k, just to be sure;
  some commands add large environment variables these days.
- Fix a bug in subtracting the length of existing arguments and
  environment variables. 'size -= strlen(cp)-1;' subtracts one less
  than the size of cp, which makes no sense; what is necessary is
  to substract the length plus one to account for the terminating
  zero byte, i.e.: 'size -= strlen(cp)+1'.
- Use the ARG_ALIGN_BYTES feature test result to match the OS's
  data alignment requirements.
- path_spawn(): E2BIG: Change to checking SH_XARG state bit.

src/cmd/ksh93/bltins/whence.c: b_command():
- Allow combining -x with -p, -v and -V with the expected results
  by setting P_FLAG to act like 'whence -p'. E.g., as of now,
	command -xv printf
  is equivalent to
	whence -p printf
  but note that 'whence' has no equivalent of 'command -pvx printf'
  which searches $(getconf PATH) for a command.
- When -x will run a command, now set the new SH_XARG state flag.

src/cmd/ksh93/sh/xec.c: sh_exec():
- Change to using the new SH_XARG state bit.
- Skip the check for built-ins if SH_XARG is active, so that
  'command -x' now always runs an external command.

src/lib/libcmd/date.c, src/lib/libcmd/uname.c:
- These path-bound builtins sometimes need to run the external
  system command by the same name, but they did that by hardcoding
  an unportable direct path. Now that 'command -x' runs an external
  command, change this to using 'command -px' to guarantee using
  the known-good external system utility in the default PATH.
- In date.c, fix the format string passed to 'command -px date'
  when setting the date; it was only compatible with BSD systems.
  Use the POSIX variant on non-BSD systems.
2021-01-30 06:53:19 +00:00
Martijn Dekker
d2cc520883 Disable SHOPT_KIA (ksh -R) by default
SHOPT_KIA enables the -R option that generates a cross-reference
database from a script. However, no tool to analyse this database
is shipped or seems to be available anywhere (in spite of multiple
people looking for one), and the format is very opaque. No usage
examples are known or findable on the internet. This seems like it
should not be compiled in by default, although we'll keep the code
in case some way to use it is found.

src/cmd/ksh93/SHOPT.sh:
- Disable SHOPT_KIA by default by removing the default 1 value.

src/cmd/ksh93/sh/args.c, src/cmd/ksh93/sh/parse.c:
- Fix a couple of preprocessor logic bugs that made it impossible
  to compile ksh without SHOPT_KIA.

src/cmd/ksh93/data/builtins.c:
- Fix typo in -R doc in ksh --man (in case SHOPT_KIA is enabled).

src/cmd/ksh93/sh.1:
- Since sh.1 is not generated dynamically, remove the -R doc.
2021-01-23 18:26:38 +00:00
Martijn Dekker
7fdeadd4f1 Increase shcomp bytecode header version; doc updates
src/cmd/ksh93/include/version.h:
- Centrally define the 93u+m copyright (SH_RELEASE_CPYR) for adding
  to the original AT&T copyright in 'ksh --man' and 'shcomp --man'.
- Centrally define the binary header version number for bytecode
  generated by shcomp: SHCOMP_HDR_VERSION.
- Bump SHCOMP_HDR_VERSION from 3 to 4. Converting all the preset
  aliases to builtin commands has caused new bytecode to be
  incompatible with old ksh. (However, old bytecode runs fine on
  93u+m, because shcomp pre-expands the preset aliases.)

src/cmd/ksh93/sh/shcomp.c:
- Instead of keeping its own version date (not changed since 2003),
  use the same version string as ksh itself (SH_RELEASE).
- Use SH_RELEASE_CPYR for the extra 93u+m copyright string.
- Use SHCOMP_HDR_VERSION for the bytecode header.

src/cmd/ksh93/sh/parse.c: sh_parse():
- Use SHCOMP_HDR_VERSION for the bytecode version check.

src/cmd/ksh93/data/builtins.c: opt_ksh[]:
- Use SH_RELEASE_CPYR for the extra 93u+m copyright string.

src/cmd/ksh93/COMPATIBILITY:
- Mention that 93u+m shcomp bytecode won't run on older ksh.
- Document changes in printf %T (re: 9526b3fa).

src/cmd/ksh93/README:
- Mention that we run on UnixWare (with major regressions).
  https://github.com/ksh93/ksh/pull/159#issuecomment-764667929
2021-01-21 17:04:14 +00:00
Martijn Dekker
0a10e76ccc typeset: add error msgs for incompatible options; improve usage msg
This adds informative error messages if incompatible options are
given. It also documents the exclusive -m, -n and -T options on
separate usage lines, as was already done with -f. The usage
message for incompatible options now looks something like this:

| $ ksh -c 'typeset -L10 -F -f -i foo'
| ksh: typeset: -i/-F/-E/-X cannot be used with -L/-R/-Z
| ksh: typeset: -f cannot be used with other options
| Usage: typeset [-bflmnprstuxACHS] [-a[type]] [-i[base]] [-E[n]]
|                [-F[n]] [-L[n]] [-M[mapping]] [-R[n]] [-X[n]]
|                [-h string] [-T[tname]] [-Z[n]] [name[=value]...]
|    Or: typeset -f [name...]
|    Or: typeset -m [name=name...]
|    Or: typeset -n [name=name...]
|    Or: typeset -T [tname[=(type definition)]...]
|  Help: typeset [ --help | --man ] 2>&1

(see also the previous commit, e21a053e)

Unfortunately the first "Usage" line has some redundancies with the
"Or:" lines showing separate usages. It doesn't seem to be possible
to avoid this; it's a flaw in how libast generates everything
(usage, help, manual) from one huge getopt(3) string. I still think
the three added "Or:" lines are an improvement as it wasn't
previously shown that these options need to be used on their own.

src/cmd/ksh93/bltins/typeset.c: b_typeset():
- Instead of only showing a generic usage message, add an
  informative error message if incompatible options were given.
- Conflicting options detection was failing because NV_LJUST and
  NV_EXPNOTE have the same bitmask value. Use a new 'isadjust'
  flag for -L/-R/-Z to remember if one of these was set.
- Detect conflict between -L/-R/-Z and a float option, not just -i.

src/cmd/ksh93/include/name.h, src/cmd/ksh93/data/msg.c:
- Add the two new error messages for incompatible options.

src/cmd/ksh93/data/builtins.c: sh_opttypeset[]:
- Add a space after 'float' in in "[+float?\btypeset -lE\b]" as
  this makes 'float' appear on its own line, improving formatting.
- Show -m, -n, -T on separate usage lines like -f, as none of these
  can be combined with other options.
- Remove "cannot be combined with other options" from -m and -n
  descriptions, as that should now be clear from the separate usage
  lines -- and even if not, the error message is now informative.

src/cmd/ksh93/sh.1, src/cmd/ksh93/COMPATIBILITY:
- Update.

src/cmd/ksh93/tests/types.sh:
- Remove obsolete test: 'typeset -RF' is no longer accepted.
  (It crashed in 93u+, so this is not an incompatibility...)

Resolves: https://github.com/ksh93/ksh/issues/48
2021-01-21 09:36:10 +00:00
Martijn Dekker
7bab9508aa Fix crash on subshell exit if PWD is inaccessible (re: dd9bc229)
This commit also further mitigates the problems with restoring an
inaccessible or nonexistent PWD on exiting a virtual subshell.

Harald van Dijk writes:
> On a build of ksh with -fsanitize=undefined to help diagnose
> problems:
>
> $ mkdir deleted
> $ cd deleted
> $ rmdir ../deleted
> $ ksh -c '(cd /; (cd /)); :'
> /home/harald/ksh/src/cmd/ksh93/sh/subshell.c:561:22: runtime
> error: null pointer passed as argument 1, which is declared to
> never be null
> Segmentation fault (core dumped)
>
> Note that it segfaults the same with default compilation flags,
> but it does not print out the useful extra message. The code
> assumes that pwd is non-null and passes it to strcmp without
> checking, but it will be null if the current directory cannot be
> determined, for instance because it has been deleted.

src/cmd/ksh93/sh/subshell.c: sh_subshell():
- Avoid the null pointer dereference reported above.

src/cmd/ksh93/bltins/cd_pwd.c: b_cd():
- Fork a virtual subshell even on systems with fchdir(2) if the
  present working directory tests as inaccessible on invoking 'cd';
  it may no longer exist and fchdir would fail to get a handle.
  (For the test we have to opendir(3) the full path to the PWD and
  not ".", as the latter may succeed even if the PWD is gone.)

src/cmd/ksh93/data/builtins.c:
- Update 'cd' version string.

Fixes:   https://github.com/ksh93/ksh/issues/153
Related: https://github.com/ksh93/ksh/issues/141
2021-01-19 18:47:41 +00:00
Martijn Dekker
222515bf08 Implement hash tables for virtual subshells (re: 102868f8, 9d428f8f)
The forking fix implemented in 102868f8 and 9d428f8f, which stops
the main shell's hash table from being cleared if PATH is changed
in a subshell, can cause a significant performance penalty for
certain scripts that do something like

    ( PATH=... command foo )

in a subshell, especially if done repeatedly. This is because the
hash table is cleared (and hence a subshell forks) even for
temporary PATH assignments preceding commands.

It also just plain doesn't work. For instance:

    $ hash -r; (ls) >/dev/null; hash
    ls=/bin/ls

Simply running an external command in a subshell caches the path in
the hash table that is shared with a main shell. To remedy this, we
would have to fork the subshell before forking any external
command. And that would be an unacceptable performance regression.

Virtual subshells do not need to fork when changing PATH if they
get their own hash tables. This commit adds these. The code for
alias subshell trees (which was removed in ec888867 because they
were broken and unneeded) provided the beginning of a template for
their implementation.

src/cmd/ksh93/sh/subshell.c:
- struct subshell: Add strack pointer to subshell hash table.
- Add sh_subtracktree(): return pointer to subshell hash table.
- sh_subfuntree(): Refactor a bit for legibility.
- sh_subshell(): Add code for cleaning up subshell hash table.

src/cmd/ksh93/sh/name.c:
- nv_putval(): Remove code to fork a subshell upon resetting PATH.
- nv_rehash(): When in a subshell, invalidate a hash table entry
  for a subshell by creating the subshell scope if needed, then
  giving that entry the NV_NOALIAS attribute to invalidate it.

src/cmd/ksh93/sh/path.c: path_search():
- To set a tracked alias/hash table entry, use sh_subtracktree()
  and pass the HASH_NOSCOPE flag to nv_search() so that any new
  entries are added to the current subshell table (if any) and do
  not influence any parent scopes.

src/cmd/ksh93/bltins/typeset.c: b_alias():
- b_alias(): For hash table entries, use sh_subtracktree() instead
  of forking a subshell. Keep forking for normal aliases.
- setall(): To set a tracked alias/hash table entry, pass the
  HASH_NOSCOPE flag to nv_search() so that any new entries are
  added to the current subshell table (if any) and do not influence
  any parent scopes.

src/cmd/ksh93/sh/init.c: put_restricted():
- Update code for clearing the hash table (when changing $PATH) to
  use sh_subtracktree().

src/cmd/ksh93/bltins/cd_pwd.c:
- When invalidating path name bindings to relative paths, use the
  subshell hash tree if applicable by calling sh_subtracktree().
- rehash(): Call nv_rehash() instead of _nv_unset()ting the hash
  table entry; this is needed to work correctly in subshells.

src/cmd/ksh93/tests/leaks.sh:
- Add leak tests for various PATH-related operations in the main
  shell and in a virtual subshell.
- Several pre-existing memory leaks are exposed by the new tests
  (I've confirmed these in 93u+). The tests are disabled and marked
  TODO for now, as these bugs have not yet been fixed.

src/cmd/ksh93/tests/subshell.sh:
- Update.

Resolves: https://github.com/ksh93/ksh/issues/66
2021-01-07 22:18:25 +00:00
Martijn Dekker
3567220898 New semantic versioning scheme; disable vmalloc in release builds
As of this commit, ksh 93u+m has a standard semantic version number
<https://semver.org/>, beginning with 1.0.0-alpha. This is added to
the version string in a way that should be compatible with scripts
parsing ${.sh.version} or $(ksh --version). This addition does not
replace the release date and does not affect $((.sh.version)).

For non-release builds, the version string will be:
	FORK/VERSION+HASH YYYY-MM-DD
e.g.:	93u+m/1.0.0-alpha+41ef7f76 2021-01-03

For release builds, it will be:
	FORK/VERSION YYYY-MM-DD
e.g.:	93u+m/1.0.0 2021-01-03

It is now automatically decided by bin/package whether to build a
release or development build. When building from a directory that
is not a git repository, or if the current git branch name starts
with a number (e.g. '1.0'), the release build is enabled; otherwise
a development build is the default. This is arranged by adding -D
flags to $CCFLAGS as described below. These flags are prepended to
$CCFLAGS, so they can be overridden by adding your own -D or -U
flags via the environment.

In addition, AST vmalloc is disabled for release builds as of this
commit, forcing the use of the OS's standard malloc(3). In 2021,
this is generally more reliable, faster, and more economical with
memory than AST vmalloc. Several memory leaks and crashing bugs are
avoided, e.g.: <https://github.com/ksh93/ksh/issues/95>.

For development builds, vmalloc stays enabled (along with its known
bugs) because this allows the use of the vmstat builtin, making it
much more efficient to test for memory leaks. For more info, see
the regression test script: src/cmd/ksh93/tests/leaks.sh

bin/package, src/cmd/INIT/package.sh:
- Add flags for build type. In $CCFLAGS, define _AST_ksh_release if
  we're not on any git branch or on a git branch whose name starts
  with a number. Otherwise, define _AST_git_commit as the first 8
  characters of the current git commit hash.

src/lib/libast/features/vmalloc:
- If _AST_ksh_release is defined, disable vmalloc and force use of
  the operating system's malloc. Discussion:
  https://github.com/ksh93/ksh/issues/95

src/cmd/ksh93/include/version.h:
- Define new format for version string, adding a semantic version
  number as well as (for non-release builds) the git commit hash.

src/cmd/ksh93/sh/init.c: e_version[]:
- Add a 'v' to the ${.sh.version} feature string if ksh was
  compiled with vmalloc enabled. This allows scripts, such as
  regression tests, to detect this.

src/cmd/ksh93/data/builtins.c: sh_optksh[]:
- Add a copyright line crediting the contributors to ksh 93u+m.

Resolves: https://github.com/ksh93/ksh/issues/95
2021-01-05 04:52:42 +00:00
Harald van Dijk
41ef7f76cf Invocation: fix infinite loop on 'ksh +s'
When starting ksh +s, it gets stuck in an infinite loop continually
trying to parse its own binary as a shell script and rejecting it:

$ arch/linux.i386-64/bin/ksh +s
arch/linux.i386-64/bin/ksh: arch/linux.i386-64/bin/ksh: cannot execute [Exec format error]
arch/linux.i386-64/bin/ksh: arch/linux.i386-64/bin/ksh: cannot execute [Exec format error]
arch/linux.i386-64/bin/ksh: arch/linux.i386-64/bin/ksh: cannot execute [Exec format error]
arch/linux.i386-64/bin/ksh: arch/linux.i386-64/bin/ksh: cannot execute [Exec format error]
arch/linux.i386-64/bin/ksh: arch/linux.i386-64/bin/ksh: cannot execute [Exec format error]
[...]
$ echo 'echo "this is stdin"' | arch/linux.i386-64/bin/ksh +s
arch/linux.i386-64/bin/ksh: arch/linux.i386-64/bin/ksh: cannot execute [Exec format error]
(no loop, but still ksh trying to parse itself)

src/cmd/ksh93/sh/init.c: sh_init():
- When forcing on the '-s' option upon finding no command
  arguments, also update sh.offoptions, a.k.a. shp->offoptions.
  This avoids the inconsistent state causing this problem.

  In main.c, there is:

  if(sh_isoption(SH_SFLAG))
      fdin = 0;
  else
      (code to open $0 as a file)

  This was entering the else block because sh_isoption(SH_SFLAG)
  was returning 0, and $0 is set to the ksh binary as it is
  supposed to when no other script is provided. When I looked for
  why sh_isoption was returning 0, I found main.c's

  for(i=0; i<elementsof(shp->offoptions.v); i++)
      shp->options.v[i] &= ~shp->offoptions.v[i];

  Before this loop, shp->offoptions tracks which options were
  explicitly disabled by the user on the command line. The effect
  of this loop is to make "explicitly disabled" take precedence
  over "implicitly enabled". My patch removes the registration of
  the +s option.

Fixes: https://github.com/ksh93/ksh/issues/150
Co-authored-by: Martijn Dekker <martijn@inlv.org>
2021-01-03 23:54:36 +00:00