mirror of git://git.code.sf.net/p/cdesktopenv/code synced 2025-02-13 11:42:21 +00:00
Commit graph

983 commits

Author SHA1 Message Date
Martijn Dekker
de511cfbc2 libast: regex: re-backport robustness improvements from 93v- beta
That intermittent regression test failure in types.sh seems to be
gone. So let's reimport the regex changes into the 1.0 branch to
subject them to wider testing and make sure any failures stay gone.
(re: 48568476, 38aab428, 1aa8f771)

[Original commit message from 1aa8f771 follows]

There are two main changes:

1. The regex code now creates and uses its own stack (env->mst)
   instead of using the shared standard stack (stkstd). That seems
   likely to be a good thing.

2. Missing mbinit() calls were inserted. The 93v- code uses a
   completely different multibyte characters API, so these needed
   to be translated back to the older API. But, as mbinit() is no
   longer a no-op as of 300cd199, these calls do stop things from
   breaking if a previous operation is interrupted mid-character.

I think there might be a couple of off-by-one errors fixed as well,
as there are two instances of this change:

-               while ((index += skip[buf[index]]) < mid);
+               while (index < mid)
+                       index += skip[buf[index]];
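
A minimal sketch of why that rewrite can matter, using hypothetical
names (advance_old, advance_new and their argument layout are
assumptions for illustration, not the libast code). The old form adds
the skip value before it tests the bound, so it reads buf[index] even
when index has already reached mid; the new form tests first.

```c
#include <stddef.h>

/* Old form: advance first, test afterwards. buf[index] is read even
 * when index >= mid on entry. Assumes every skip value is nonzero. */
size_t advance_old(const unsigned char *buf, const size_t *skip,
                   size_t index, size_t mid)
{
    while ((index += skip[buf[index]]) < mid)
        ;
    return index;
}

/* New form: test first, so nothing is read once index >= mid. */
size_t advance_new(const unsigned char *buf, const size_t *skip,
                   size_t index, size_t mid)
{
    while (index < mid)
        index += skip[buf[index]];
    return index;
}
```

When index starts below mid and each skip is 1, both forms stop at the
same place. They differ only when index equals or exceeds mid on entry,
which is where an off-by-one read would show up.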
2021-12-28 22:24:41 +00:00
Johnothan King
4032050249 Port cksum builtin performance improvements from illumos (#391)
This commit ports performance optimizations from illumos for the libsum
code (used by the cksum and sum builtins):
98bea71f0d
The new codepath in libsum uses prefetching and loop unrolling to
improve performance (prefetching is done with __builtin_prefetch()
or sun_prefetch_read_many() if either is available).
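
As a rough illustration of the technique (not the illumos or libsum
code), the sketch below sums bytes with an eight-way unrolled loop and
a read prefetch hint. The function names, the PREFETCH_READ macro and
the 64-byte prefetch distance are assumptions made for this example;
the compiler guard mirrors the GCC 3.1 cutoff mentioned further down.

```c
#include <stddef.h>
#include <stdint.h>

/* __builtin_prefetch() is assumed available on GCC 3.1+ and on Clang. */
#if defined(__GNUC__) && (__GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 1))
#define PREFETCH_READ(p) __builtin_prefetch((p), 0, 0)
#else
#define PREFETCH_READ(p) ((void)0)
#endif

/* Reference version: one byte per loop iteration. */
uint32_t byte_sum_simple(const unsigned char *p, size_t n)
{
    uint32_t sum = 0;
    while (n--)
        sum += *p++;
    return sum;
}

/* Unrolled version: eight bytes per iteration plus a prefetch hint. */
uint32_t byte_sum_unrolled(const unsigned char *p, size_t n)
{
    uint32_t sum = 0;
    while (n >= 8) {
        if (n > 64)
            PREFETCH_READ(p + 64);  /* p + 64 is still inside the buffer */
        sum += p[0]; sum += p[1]; sum += p[2]; sum += p[3];
        sum += p[4]; sum += p[5]; sum += p[6]; sum += p[7];
        p += 8;
        n -= 8;
    }
    while (n--)  /* handle the remaining 0 to 7 bytes */
        sum += *p++;
    return sum;
}
```

Both functions return the same value for any input; whether the
unrolled one is faster depends on the compiler and the hardware.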

Script for testing (note that cksum must be enabled in
src/cmd/ksh93/data/builtins.c):
   #!/bin/ksh
   builtin cksum || exit 1
   for ((i=0; i!=50000; i++)) do
       cksum -x att /etc/hosts
   done >/dev/null

Results on Linux x86_64 (using CCFLAGS=-O2):
$ echo 'UNPATCHED:'; time arch/linux.i386-64/bin/ksh /tmp/foo; echo 'PATCHED'; time /tmp/ksh /tmp/foo
UNPATCHED:

real    0m09.989s
user    0m07.582s
sys     0m02.406s
PATCHED:

real    0m06.536s
user    0m04.331s
sys     0m02.204s

src/lib/libsum/{sum-att.c,sum-crc.c,Mamfile}:
- Port the performance optimizations from illumos to 93u+m libsum. To
  prevent problems with older versions of GCC, avoid the new codepath
  if GCC is older than the 3.1 release series. Additionally, the ast.h
  header must be included to handle tcc defining __GNUC__ on FreeBSD.
- Apply some build fixes to allow the new codepath to build with Clang
  3.6 and newer (my own testing indicates an even better performance
  improvement with Clang than with GCC).
2021-12-28 22:22:52 +00:00
Johnothan King
8f9d1bec97 Add three options to 'ulimit' (#406)
This patch adds a few extra options to the ulimit command (if the OS
supports them). These options are also present in Bash, although in ksh
additional long forms of each option are available:
  ulimit -k/--kqueues   This is the maximum number of kqueues.
  ulimit -P/--npts      This is the maximum number of pseudo-terminals.
  ulimit -R/--rttime    This is the time a real-time process can run
                        before blocking, in microseconds. When the
                        limit is exceeded, the process is sent SIGXCPU.

Other changes:
- bltins/ulimit.c: Change the formatting from sfprintf and increase the
  size of the tmp buffer to prevent text from being cut off in ulimit
  -a (this was required to add ulimit -R).
- data/limits.c: Add support for using microseconds as a unit.
2021-12-28 22:02:20 +00:00
Martijn Dekker
8f24d4dc56 tests/leaks.sh: add a newly discovered leak as known
Help us fix it at: https://github.com/ksh93/ksh/issues/407
2021-12-28 21:59:50 +00:00
Johnothan King
b425196958 Fix ASan heap-buffer-overflow when handling syntax errors (#402)
This commit backports a bugfix from ksh2020 to fix an ASan
heap-buffer-overflow error in one of the regression tests. See:
https://github.com/att/ast/commit/c57f7398
https://github.com/att/ast/issues/1261

This explanation comes from the linked issue:
> The poplevel() in this block of code is called when lp->lexd.lex_max
> is zero:
> https://github.com/att/ast/blob/bd94eb56/src/cmd/ksh93/sh/lex.c#L921-L925
> Since poplevel() first decrements lp->lexd.lex_max then uses it as
> an index into lp->lexd.lex_match this causes the word before the
> start of that buffer to be accessed. The buffer is allocated here:
> https://github.com/att/ast/blob/bd94eb56/src/cmd/ksh93/sh/lex.c#L2210-L2218

src/cmd/ksh93/sh/lex.c:
- Avoid calling poplevel() twice when handling syntax errors.
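
The failure mode quoted above (decrement, then index) can be shown
with a small stand-in. This is a defensive sketch using a hypothetical
struct and function name, not the lex.c code; the actual fix in this
commit is to avoid the second poplevel() call rather than to add a
bounds check.

```c
/* Hypothetical stand-in for the lexer's match stack; not the ksh types. */
struct match_stack {
    int *items;   /* storage for pushed levels */
    int count;    /* number of levels currently on the stack */
};

/* Pops the top level, or returns -1 without touching memory if the
 * stack is empty. Without the check, --count would become -1 and
 * items[-1] would read the word before the start of the buffer. */
int pop_level_checked(struct match_stack *st)
{
    if (st->count <= 0)
        return -1;
    return st->items[--st->count];
}
```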
2021-12-28 17:53:35 +00:00
Johnothan King
de795e1f9d Allow regression tests to pass without any /opt/ast/bin builtins (#403)
This commit primarily makes changes that allow the regression tests to
run without any of the /opt/ast/bin builtins compiled into ksh. It also
makes a minor improvement to one of the tests in locale.sh by
shellquoting an error message.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2021-12-28 17:52:57 +00:00
Johnothan King
0e197eee57 Fix mkservice compile errors and add SHOPT_MKSERVICE (#401)
The unused mkservice and eloop builtins are currently not built, and if
an attempt to compile them is made the build ends in failure. This
commit backports a few build fixes from ksh93v- 2012-08-24 that allow
mkservice and eloop to build (plus an additional compiler warning fix
not in ksh93v-). I've also added a new SHOPT_MKSERVICE setting (turned
off by default) so that mkservice and eloop can be built if the user
chooses to include them in their build of ksh.
2021-12-28 17:51:11 +00:00
Johnothan King
a3ed4c368b Fix implicit warnings in the iffe feature tests (#396)
This commit fixes some implicit function warnings in the iffe feature
tests by adding missing include directives.
2021-12-28 17:50:39 +00:00
Martijn Dekker
db3a3d8fc0 tests/leaks.sh: redesign with a more robust testing algorithm
On modern operating systems, memory management is non-deterministic
(i.e. random, unpredictable) to varying degrees. This makes testing
for memory leaks a nightmare as the OS may decide to randomly grow
a process's memory allocation at any time for no apparent reason,
causing intermittent test failures that do not represent real
memory leaks. So far, the leaks test tried to cope with this by
using a large number of iterations plus a certain amount of bytes
of tolerance per iteration. This was inefficient and on some
systems still did not fully eliminate intermittent test failures.

This commit introduces a new testing algorithm that is designed to
cope with a large degree of unpredictability. Instead of a fixed
number of test iterations, it defines a maximum (16384), dividing
them into blocks of 128 iterations. It also defines a minimum number
of sequential "good" iteration blocks, counted if memory usage did
not increase from one block to the next. That minimum number is set
to 16. The theory is that if we can get 16 "good" iteration blocks
in a row, we can safely assume it's not a real memory leak, break
the loop, and consider the test succeeded. That "good" sequence is
allowed to occur at any point in the loop, creating a high built-in
tolerance for non-deterministic shenanigans. It also speeds up the
tests, as successful tests can bow out at 16 * 128 == 2048
iterations if they're lucky. If the OS decides to randomly grow the
memory heap, it may take more tries, but almost (?) certainly not
more than the maximum 16384 (128 blocks). If the counter reaches
that, then we assume a memory leak and throw a test failure.
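
The algorithm described above can be modelled outside the shell. The
following is a sketch in C under stated assumptions: the real test is
written in ksh with getmem() functions, whereas here memory usage is
simulated by a hypothetical struct sim whose workload grows by one KiB
per call for a configurable number of calls.

```c
enum { MAX_ITER = 16384, BLOCK_ITER = 128, MIN_GOOD_BLOCKS = 16 };

/* Simulated process: 'kib' is the memory reading; the workload grows
 * it by one KiB per call for the first 'grow_calls' calls. */
struct sim {
    long kib;
    long calls;
    long grow_calls;
};

void sim_work(struct sim *s)
{
    if (s->calls++ < s->grow_calls)
        s->kib++;
}

/* Returns 0 if MIN_GOOD_BLOCKS consecutive blocks showed no growth,
 * or 1 if MAX_ITER was reached first (a leak is assumed). The number
 * of iterations actually run is stored in *iters. */
int leak_test(struct sim *s, int *iters)
{
    long before = s->kib;
    int good = 0;
    for (int i = 1; i <= MAX_ITER; i++) {
        sim_work(s);
        if (i % BLOCK_ITER != 0)
            continue;               /* only measure at block boundaries */
        long after = s->kib;
        good = (after > before) ? 0 : good + 1;
        before = after;
        if (good >= MIN_GOOD_BLOCKS) {
            *iters = i;
            return 0;
        }
    }
    *iters = MAX_ITER;
    return 1;
}
```

A workload that never grows passes after 2048 iterations. One that
grows during its first three blocks and then settles passes after
3 * 128 + 2048 = 2432 iterations. One that grows on every call runs
to 16384 and is reported as a leak.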

We're also no longer testing with byte granularity in any case; the
randomness of memory management makes that pointless. All getmem()
function versions now return kibibytes (1024 bytes).

This should eliminate the need for workarounds such as initial
iterations to "steady the state" or a tolerance of a certain number
of bytes. I've experimentally determined the exact values
(max_iter, block_iter, min_good_blocks) that seem to work reliably
on all systems I've tested. They are easy to tweak if necessary.

To make all this manageable, this commit hides all the supporting
code in a triplet of aliases (TEST, DO, DONE) that, when used
correctly, create a grammatically robust shell code block: you can
add redirections, pipe into it, etc. as expected. This makes the
actual tests a great deal easier to read as well.

src/cmd/ksh93/tests/leaks.sh:
- Implement new leaks testing framework as described and convert
  all the tests to it.
- Mark known leaks with a 'known' variable. Print non-fail warnings
  for all known leaks, but skip the tests by default. Test them
  only if DEBUG is exported. This is better than commenting them
  out as we will no longer be tempted to forget about these.
- Move the test for large command substitutions to subshell.sh --
  it's not in fact a leak test; instead, it checks that command
  substitutions don't lose data.

src/cmd/ksh93/tests/_common: err_exit():
- Since we're printing more warnings, clearly mark all test
  failures with 'FAIL:' to make them stand out.

src/cmd/ksh93/tests/shtests,
src/cmd/ksh93/tests/pty.sh:
- Special-case leaks.sh for counting tests; grep ^TEST.
- Special-case pty.sh as well while we're at it by grepping tst()
  calls. Remove all the dummy '# err_exit #' comments from pty.sh
  as they are now no longer used for counting the tests.
2021-12-28 17:47:29 +00:00
Martijn Dekker
a9c6f77c3e tests/alias.sh: fix regression test (re: f7213f03)
The variable 'i' had already been used for a non-numeric purpose,
so when declaring it integer in a subshell it's necessary to
initialise it with value 0 or an arithmetic error is shown (which
does not interrupt the test or make it fail).

For the record, the error was mine, not Johnothan's.
2021-12-28 15:08:20 +00:00
Johnothan King
24174f0fb7 Backport -P and -t flags for 'type'/'whence' from ksh93v- (#392)
This commit backports the whence '-t' option from ksh93v-. The '-t'
option is useful when one needs to identify the type of a command.
The '-t' flag was added by ksh93v- for compatibility with Bash.

It should be noted the ksh93v- patch had one bug, which this commit
fixes. Path-bound builtins from /opt/ast/bin were classified as
files if loaded from /opt/ast/bin in the PATH. Reproducer:
   $ PATH=/opt/ast/bin whence -t cat
   file

src/cmd/ksh93/bltins/whence.c:
- Simplify the bitmask values for the command and whence builtin
  flags.
- Add the -t flag to the whence and type builtins. To prevent bugs,
  -t will always override -v if both of those flags were passed.

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/sh.1:
- Add documentation for the new -t option.
2021-12-27 06:40:02 +00:00
Martijn Dekker
2027648f1a Remove leftover pre-C89 code (re: a1f5c992)
I'd forgotten to check for uses of the __STDC__ macro. This is
defined on all C compilers that support C89/C90 or later standards.
So we can remove all fallback code disabled by that macro.
2021-12-27 05:46:23 +00:00
Martijn Dekker
e072e7c170 Fix crash in xtrace while processing here-document (re: d7cada7b)
Depending on the OS, the heredoc.sh regression tests, and possibly
others, still crashed with the -x option (xtrace) on.

Analysis: The lexer crashes in lex_advance(). Something has caused
an inconsistent lexer state, and it happened earlier on, so the
backtrace is useless for figuring out where that happened.

But I think I've found it. It's the sh_mactry() call here:

src/cmd/ksh93/sh/xec.c, lines 2800 to 2807 in f7213f03
2800:   if(!(cp=nv_getval(sh_scoped(shp,PS4NOD))))
2801:           cp = "+ ";
2802:   else
2803:   {
2804:           sh_offoption(SH_XTRACE);
2805:           cp = sh_mactry(shp,cp);
2806:           sh_onoption(SH_XTRACE);
2807:   }

sh_mactry() needs to parse the contents of $PS4 to perform
expansions and command substitutions in it, which involves the
lexer. If that happens in a here-document, the lexer is in the C
function call stack, in the middle of parsing the here-document.
Result: inconsistent lexer state. Solution: save and restore lexer
state in sh_mactry().
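
The save-and-restore idea can be sketched as follows. The struct, the
global and both function names are hypothetical stand-ins chosen for
this example; the real Lex_t type and sh_mactry() are considerably
more involved.

```c
/* Hypothetical, much-reduced lexer state. */
struct lexer_state {
    int nesting;      /* current nesting level */
    int in_heredoc;   /* nonzero while a here-document is being read */
};

/* One shared lexer, as in a non-reentrant parser. */
struct lexer_state lexer;

/* Stand-in for a nested parse (such as expanding $PS4) that clobbers
 * the shared state. */
int nested_parse(void)
{
    lexer.nesting = 0;
    lexer.in_heredoc = 0;
    return 42;
}

/* Runs fn() with the lexer state saved beforehand and restored
 * afterwards, so the outer parse continues where it left off. */
int with_saved_lexer(int (*fn)(void))
{
    struct lexer_state saved = lexer;
    int result = fn();
    lexer = saved;
    return result;
}
```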

After this commit, all regression tests should pass with the
'-x'/'--xtrace' option in use, with no errors or crashes.

Note for backporters: this fix depends both on d7cada7b and on
the consistency fix for the Lex_t type's size applied in a7ed5d9f.

src/cmd/ksh93/include/shlex.h:
- Cosmetic fix: remove a copied & pasted backslash. (re: a7ed5d9f)

src/cmd/ksh93/sh/macro.c: sh_mactry():
- Save and restore the lexer state before letting sh_mactrim()
  indirectly parse and execute code.

src/cmd/ksh93/tests/*.sh:
- Turn off xtrace in various command substitutions that contain
  2>&1 redirections, so that the xtrace output is not caught by
  the command substitutions, causing tests to fail incorrectly.
- Turn off xtrace for a few code blocks with 2>&1 redirections,
  stopping xtrace output from being written to standard output.

Resolves: https://github.com/ksh93/ksh/issues/306 (again)
2021-12-27 04:02:25 +00:00
Martijn Dekker
91a7c2e3e9 Fix crash/freeze upon interrupting command substitution with pipe
On some systems (at least Linux and macOS):

1. Run on a command line: t=$(sleep 10|while :; do :; done)
2. Press Ctrl+C in the first 10 seconds.
3. Execute any other command substitution. The shell crashes.

Analysis: Something in the job_wait() call in the sh_subshell()
restore routine may be interrupted by a signal such as SIGINT on
Linux and macOS. Exactly what that interruptible thing is remains
to be determined. In any case, since job_wait() was invoked after
sh_popcontext(), interrupting it caused the sh_subshell() restore
routine to be aborted, resulting in an inconsistent state of the
shell. The fix is to sh_popcontext() at a later stage instead.

src/cmd/ksh93/sh/subshell.c: sh_subshell():
- Rename struct checkpt buff to checkpoint because it's clearer.
- Move the sh_popcontext() call to near the end, just after
  decreasing the subshell level counters and restoring the global
  subshell data struct to its parent. This seems like a logical
  place for it and could allow other things to be interrupted, too.
- Get rid of the if(shp->subshell) because it is known that the
  value is > 0 at this point.
- The short exit routine run if the subshell forked now needs a new
  sh_popcontext() call, because this is handled before restoring
  the virtual subshell state.
- While we're here, do a little more detransitioning from all those
  pointless shp pointers.

Fixes: https://github.com/ksh93/ksh/issues/397
2021-12-27 03:49:41 +00:00
Johnothan King
f7213f03a2 Fix multiple bugs when using 'alias -p' to print aliases (#398)
This commit was originally intended to fix just one bug with shcomp's
handling of 'alias -p', but while fixing that I found a large number
of related issues in the alias command's -p, -t and -x options. The
current patch provides bugfixes for all of the bugs listed below:

1) Listing aliases in a script with 'alias -p' or 'alias' broke
   shcomp's bytecode output:
   https://github.com/ksh93/ksh/issues/87#issuecomment-813819122

2) Listing individual aliases with the -p option doesn't work:
      $ alias foo=bar bar=foo
      $ alias foo
      foo=bar
      $ alias -p foo  # No output

3) Listing specific tracked aliases with -pt does not display them
   in a reusable format, but rather adds another tracked alias:
      $ hash -r cat vi
      $ alias -pt vi  # No output
      $ alias -pt rm
      $ alias -t
      cat=/usr/bin/cat
      rm=/usr/bin/rm
      vi=/usr/bin/vi

4) Listing all tracked aliases with -pt does not output them in a
   reusable format (the resulting command printed only creates a
   normal alias, which is different from a tracked alias):
      $ hash -r cat
      $ alias -pt
      alias cat=/usr/bin/cat  # Expected 'alias -t cat'

5) Listing a non-existent alias with -p doesn't cause an error:
      $ unalias -a
      $ alias -p notanalias  # No output
      $ echo $?
      0
      $ alias notanalias
      notanalias: alias not found
      $ echo $?
      1
      $ hash -r
      $ alias -pt notacommand  # No output
      $ echo $?
      0

6) Attempting to list 256 non-existent aliases results in exit
   status zero:
      $ unalias -a
      $ alias $(awk -v ORS= 'BEGIN { for(i=0;i<256;i++) print "x "; }')
      x: alias not found
      --cut error message--
      $ echo $?
      0

Changes:

- typeset.c: Avoid printing anything while shcomp is compiling a
  script. This is needed because the alias command is run by shcomp
  to prevent parsing issues.

- b_alias(): To avoid adding tracked aliases with -pt, set
  tdata.aflag to '+' so that setall() and other related functions
  only list tracked aliases.

- b_alias(): Set tdata.pflag to 1 so that setall() and other
  functions recognize -p was passed.

- print_value(): Add support for listing specific aliases with
  'alias -p'.

- setall(): To avoid any issues with zombie tracked aliases (see also
  the regression tests) ignore tracked alias nodes marked with the
  NV_NOALIAS attribute. This bit is set for tracked alias nodes by
  the nv_rehash() function.

- setall(): For backward compatibility, continue incrementing the
  exit status for each invalid alias and tracked alias passed. This
  was already how alias behaved when listing aliases without -p, so
  using -p shouldn't cause a change in behavior:
      $ unalias -a
      $ alias foo bar
      foo: alias not found
      bar: alias not found
      $ echo $?
      2
  To fix bug 6, the exit status is set to one if an enforced 8-bit
  exit status would be zero.

- print_namval(): Set the prefix to 'alias -t' so that listing
  tracked aliases with 'alias -pt' works correctly.

- data/msg.c and include/name.h: Add an error message for when
  'alias -pt' doesn't find a tracked alias.

- tests/alias.sh: Add a ton of regression tests for the bugs fixed in
  this commit.
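
The exit status rule described for setall() above (bug 6) comes down
to a small piece of arithmetic. This is a sketch with a hypothetical
function name, assuming the count of missing aliases is truncated to
8 bits when it becomes the exit status; it is not the typeset.c code.

```c
/* Maps the number of aliases that were not found to an exit status.
 * Only the low 8 bits survive, so 256 failures would otherwise look
 * like success; in that case the status is forced to 1. */
int alias_list_status(int not_found)
{
    int status = not_found & 0xFF;
    if (not_found > 0 && status == 0)
        status = 1;
    return status;
}
```

With this rule, two missing aliases still give status 2 as before,
while 256 or 512 missing aliases give 1 instead of 0.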
2021-12-27 03:49:06 +00:00
Johnothan King
feeb62d15f sh_close(): Set errno to EBADF for invalid file descriptors (#399)
The sh_close() function fails to set errno to EBADF when passed a
negative (invalid) file descriptor. This commit fixes the issue
by setting errno if the file descriptor is a negative value
(backported from ksh93v- 2012-08-24).
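
A minimal sketch of the behaviour described, using a hypothetical
wrapper name around POSIX close(); sh_close() itself does more work
than this.

```c
#include <errno.h>
#include <unistd.h>

/* Rejects negative descriptors up front and reports EBADF, the same
 * error close() gives for a descriptor that is not open. */
int checked_close(int fd)
{
    if (fd < 0) {
        errno = EBADF;
        return -1;
    }
    return close(fd);
}
```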
2021-12-27 03:48:38 +00:00
Martijn Dekker
9403d326f4 shtests --posix: ensure consistent locale: unset all LC_* vars
When running tests in --posix/-p mode, not all LC_* variables were
unset, so that certain aspects of the locale could be non-POSIX.
2021-12-27 03:48:27 +00:00
Martijn Dekker
a1f5c99204 INIT: remove proto, ratz (re: 46593a89, 6137b99a); major cleanup
This takes another step towards cleaning up the build system. We
now do not even pretend to be theoretically compatible with
pre-1989 K&R C compilers or with C++ compilers. In practice, this
had already been broken for many years due to bit rot.

Commit 46593a89 already removed the license handling enormity that
depended on proto, so now we can cleanly remove it altogether. But
we do need to leave some backwards compatibility stubs to keep the
build system compatible with older AST code; it should remain
possible to build older ksh versions with the current build system
(the bin/ and src/cmd/INIT/ directories) for testing purposes.

So as of now there is no more __MANGLE__d rubbish in your generated
header files. This is only about a quarter of a century overdue...

This commit also includes a huge amount of code cleanup to remove
thousands of unused K&R C fallbacks and other cruft, particularly
in libast. This code base should now be a little easier to
understand for people who are familiar with a modern(ish) C
standard.

ratz is now also removed; this was a standalone and simplified 2005
version of gunzip. As of 6137b99a, none of our code uses it, even
theoretically. And the real g(un)zip is now everywhere.

src/cmd/INIT/proto.c, src/cmd/INIT/ratz.c:
- Removed.

COPYRIGHT:
- Remove zlib license; this only applied to ratz.

bin/package, src/cmd/INIT/package.sh:
- Related cleanups.
- Unset LC_ALL before invoking a new shell, respecting the user's
  locale again and avoiding multibyte character corruption on the
  command line.

src/cmd/INIT/proto.sh:
- Add stub for backwards compatibility with Mamfiles that depend on
  proto. It does nothing but pass input without modification and is
  now installed as the new arch/*/bin/proto by src/cmd/INIT/Mamfile.

src/cmd/INIT/iffe.sh:
- Ignore the proto-related -e (--package) and -p (--prototyped)
  options; keep parsing them for backwards compatibility.
- Trim the macros passed to every test to their standard C
  versions, removing K&R C and C++ versions. These are now
  considered to be for backwards compatibility only.

src/cmd/INIT/iffe.tst:
- Remove proto(1) mangling code.
  By the way, iffe can be regression-tested as follows:
        $ bin/package use   # set up environment in a child shell
        $ regress src/cmd/INIT/iffe.tst
        $ exit              # leave package environment

src/cmd/INIT/make.probe, src/cmd/INIT/probe.win32:
- Remove code to handle C++.

src/lib/libast/features/common:
- As in iffe.sh above, trim macros designed for compatibility with
  C++ and ancient C compilers to their standard C versions and
  comment that they are for backwards compatibility with AST code.
  This is needed to keep all the old ast and ksh code compiling.

src/cmd/ksh93/sh/init.c,
src/cmd/ksh93/sh/name.c:
- Clarify libshell ABI compatibility function versions of macros.
  A "proto workaround" comment in the original code misled me into
  thinking this had something to do with the removed proto(1), but
  it's unrelated. Call the workaround macro BYPASS_MACRO instead.

src/cmd/ksh93/include/defs.h:
- sh_sigcheck() macro: allow &sh as an argument: parenthesise shp.
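
Why the parentheses matter can be seen in a reduced example. The
struct, its member and both macro names are hypothetical and chosen
for illustration; they are not the defs.h definitions.

```c
struct shell {
    int trapnote;   /* hypothetical pending-signal flag */
};

/* Unparenthesised form: PENDING_RAW(&sh) expands to (&sh->trapnote),
 * which parses as &(sh->trapnote) and does not compile when sh is a
 * struct object rather than a pointer. */
#define PENDING_RAW(shp) (shp->trapnote)

/* Parenthesised form: works for a pointer variable and for &object. */
#define PENDING(shp) ((shp)->trapnote)
```

Given `struct shell sh;` and `struct shell *p = &sh;`, both
PENDING(p) and PENDING(&sh) are valid, whereas only PENDING_RAW(p) is.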

src/cmd/ksh93/sh/nvtype.c:
- Remove unused nv_mkstruct() function. (re: d0a5cab1)

**/features/*:
- Remove obsolete iffe 'set prototyped' option.

**/Mamfile:
- Remove all references to the ast/prototyped.h header.
- Remove all use of the proto command. Simply copy instead.

*** 850-ish source files: ***
- Remove all '#pragma prototyped' directives.
- Remove all C++ compat code conditional upon defined(__cplusplus).
- Remove all use of the _ARG_ macro, which on standard C expands to
  its argument:
        #define _ARG_(x)        x
  (on K&R C, it expanded to nothing)
- Remove all use of _BEGIN_EXTERNS_ and _END_EXTERNS_ macros (empty
  on standard C; this was for C++ compatibility)
- Reduce all #if __STD_C (standard code) #else (K&R code) #endif
  blocks to the standard code only, without use of the macro.
- Same for _STD_ macro which seems to have had the same function.
- Change all instances of 'Void_t' to standard 'void'.
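
For readers who have not met the _ARG_ idiom, here is a small sketch
of how such a declaration looked and what it reduces to. The function
add_pair is a hypothetical example, not a function from this code
base.

```c
/* On a standard C compiler the macro expanded to its argument. */
#define _ARG_(x) x

/* Old-style declaration: the doubled parentheses pass the whole
 * parameter list to the macro as a single argument. */
extern int add_pair _ARG_((int, int));

/* Equivalent plain declaration, as left behind by the cleanup. */
extern int add_pair(int, int);

int add_pair(int a, int b)
{
    return a + b;
}
```

Both declarations are compatible, which is why the macro can be
removed mechanically without changing behaviour on a standard C
compiler.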
2021-12-24 07:05:22 +00:00
Johnothan King
3785a0685c Fix process substitutions printing PIDs in profile scripts (#395)
- sh/args.c: A process substitution run in a profile script may print
  its PID as if it was a command spawned with '&'. Reproducer:
     $ cat /tmp/env
     true >(false)
     $ ENV=/tmp/env ksh
     [1]	730227
     $
  This bug is fixed by turning off the SH_PROFILE state while running
  a process substitution.

- sh/subshell.c: The SH_INTERACTIVE fix in 3525535e renders the extra
  check for SH_PROFILE redundant, so it has been removed.

- tests/io.sh: Update the procsub PIDs test to also check the result
  after using process substitution in a profile script.
2021-12-22 13:27:00 +00:00
Johnothan King
740a24a456 Fix ASan buffer overflow errors caused by memcmp (#393)
This commit replaces more instances of memcmp with strncmp to fix some
more heap-buffer-overflow errors in ASan, some of which can occur when
running the regression tests with xtrace enabled. It combines two
existing patches plus another fix in name.c for xtrace:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg00877.html
https://github.com/oracle/solaris-userland/blob/master/components/ksh93/patches/035-CR7036535.patch
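
The general reason for preferring strncmp here: memcmp is allowed to
read the full requested length from both operands, so comparing a
short string against a longer keyword may read past the short
string's terminator, while strncmp stops at the first NUL byte. A
minimal sketch with a hypothetical helper name:

```c
#include <string.h>

/* Returns nonzero if s begins with word. strncmp() never reads past
 * the terminating NUL of s, even when s is shorter than word. */
int starts_with(const char *s, const char *word)
{
    return strncmp(s, word, strlen(word)) == 0;
}
```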
2021-12-22 06:37:58 +00:00
Martijn Dekker
d95700c348 print.c: resolve whitespace diff with master (re: fb8308243)
I ended up committing versions of the fix to the master and 1.0
branches that differed only in whitespace in a few lines (no code
differences). This commit makes the whitespace identical so this
does not keep annoying me when I look at 'git diff 1.0 master'.
2021-12-22 05:15:32 +00:00
Martijn Dekker
fcd9efce7f Interactive: Avoid losing the job after suspending a subshell
Reproducer: run vi in a subshell:

	$ (vi)

vi opens; now press Ctrl+Z to suspend. The output is as expected:

	[2] + Stopped                  (vi)

…but the exit status is 18 (SIGTSTP's signal number) instead of 0.

Now do:

	$ fg
	(vi)
	$

The exit status is 18 again, vi is not resumed, and the job is
lost. You have to find vi's pid manually using ps and kill it.

Forking all non-command substitution subshells invoked from the
interactive main shell is the only reliable and effective fix I've
found. I've tried to fork the subshell conditionally in every other
remotely plausible place I can think of in fault.c and xec.c, but I
can't get anything to work properly. If anyone can get this to work
without forking as much (or at all), please do submit a patch or PR
that supersedes this fix.

At least subshells of subshells don't need to fork, so the
performance impact can be limited. Plus, it's not as if most people
need maximum speed on the interactive command line. Scripts
(including login/profile scripts) are not affected at all.

Command substitutions can be handled differently. My testing shows
that all shells except ksh93 simply block SIGTSTP (the ^Z signal)
while they run. We should do the same, so they don't need to fork.
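
Blocking a signal around a piece of work can be sketched with POSIX
sigprocmask(). Note that this is a swapped-in illustration: the
function names are hypothetical, and ksh uses its own signal blocking
helpers rather than this exact code.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Returns 1 if SIGTSTP is currently blocked in this thread, else 0. */
int sigtstp_is_blocked(void *unused)
{
    sigset_t current;
    (void)unused;
    sigprocmask(SIG_BLOCK, NULL, &current);  /* query only, no change */
    return sigismember(&current, SIGTSTP);
}

/* Runs fn(arg) with SIGTSTP blocked, then restores the previous mask.
 * A Ctrl+Z arriving in between stays pending until the restore. */
int run_with_sigtstp_blocked(int (*fn)(void *), void *arg)
{
    sigset_t block, saved;
    int result;

    sigemptyset(&block);
    sigaddset(&block, SIGTSTP);
    if (sigprocmask(SIG_BLOCK, &block, &saved) < 0)
        return -1;
    result = fn(arg);
    sigprocmask(SIG_SETMASK, &saved, NULL);
    return result;
}
```

Calling run_with_sigtstp_blocked(sigtstp_is_blocked, NULL) reports
that the signal is blocked inside the call; afterwards the original
mask is back in place.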

NOTE for any backporters: the subshell.c and fault.c changes depend
on commits 35b02626 and 48ba6964 to work correctly.

src/cmd/ksh93/sh/subshell.c: sh_subshell():
- If the interactive shell state bit is on, then before executing
  the subshell's code:
  - for command substitutions, block SIGTSTP;
  - for other subshells, fork.
- For command substitutions, release SIGTSTP if the interactive
  shell state bit was on upon invoking the subshell.

src/cmd/ksh93/sh/fault.c:
- Instead of checking for a virtual subshell, check the shell's
  interactive state bit to decide whether to handle SIGTSTP, as
  that is only turned on in the interactive main shell.

src/cmd/ksh93/sh/main.c: sh_main():
- To avoid bugs, ignore SIGTSTP while running profile scripts.
  Blocking it doesn't work because delaying it until after
  sigrelease() will cause a crash. Thanks to @JohnoKing for this.
- While we're here, prevent a possible overflow of the 'beenhere'
  static char variable by only incrementing it once.

Co-authored-by: Johnothan King <johnothanking@protonmail.com>
Resolves: https://github.com/ksh93/ksh/issues/390
2021-12-22 05:09:17 +00:00
Martijn Dekker
3525535e1f sh_parse(): don't turn on interactive state (re: 48ba6964)
Reproducer:

	$ (sleep 1& echo done)
	done
	$ (eval "echo hi"; sleep 1& echo done)
	hi
	[1]	30587
	done

No job control output should be printed for a background process
invoked from a subshell, not even after 'eval'.

The cause: sh_parse() turns on the shell's interactive state bit
(sh_state(SH_INTERACTIVE)) if the interactive shell option is on.

This is incorrect. The parser should have no involvement with shell
interactivity in principle because that's not its domain.

Not only that, the parser may need to run in a subshell, e.g. when
executing traps or 'eval' commands (as above). By definition, a
subshell can never be interactive.

We already fixed many bugs related to job control and the shell's
interactive state. Even if these two lines previously papered over
some breakage, I can't find any now after simply removing them. If
any is found later, then it'll need to be fixed properly instead.

Related: https://github.com/ksh93/ksh/issues/390
2021-12-22 05:06:12 +00:00
Johnothan King
e6989853bc Fix yet more minor bugs related to the regression tests (#389)
- Redirect error output from the ulimit builtin (re: 3e58851f).
- Fix the test failure for 'cd -eP' on illumos by making a directory
  symlink first, then removing the symlink after cd.
- Fix the test failure for 'getconf -l' on illumos by quoting
  strings with the -q option.
- astconf.c: Only quote strings if the -q option was passed.
- Improve error messages from intermittently failing types.sh tests.
2021-12-21 08:01:00 +00:00
Martijn Dekker
0ead68b704 bin/package: fix 64-bit detection on systems without 'cc'
If the compiler is called gcc but not cc, the 64-bit detection
didn't work and $HOSTTYPE (and the arch/ subdirectory) did not get
the -64 suffix.

bin/package, src/cmd/INIT/package.sh:
- Run checkcc() before attempting to compile the program. This will
  set $cc to the path to gcc if there is no 'cc' command.
- trap: use 'rm -rf' to also delete .dSYM directories (macOS).
- checkcc(): Since we're here, find clang as well.
2021-12-21 07:11:44 +00:00
Martijn Dekker
24fc1bbca9 Sanitise standards/feature macros, remove compiler/linker wrappers
The goal is to get rid of all compiler/linker wrapper scripts as
they are overridden by passing CC/LD and it should be possible to
select your compiler or linker without breaking the build. The
probing and feature testing system should set the appropriate flags
and macros. This makes some progress towards that.

src/lib/libast/features/standards:
- Eliminate the shotgun approach to standards macros on popular
  systems where the macros we need to set are known and
  documented. The following will enable standards compliance plus
  all the available extensions:
  - Set no macros at all for any BSD system (excluding macOS).
  - Set _DARWIN_C_SOURCE on Darwin/macOS.
  - Set everything and the kitchen sink for Solaris/illumos in
    a way that enables backwards compatibility with older Solaris.
    This is unofficial, but following the standards(5) manual
    disables a lot of basic functionality that we depend on.
  - Set _GNU_SOURCE for GNU (glibc).
  - Remove the covered macros from the shotgun approach fallback.
- Add a new heuristic. _POSIX_PATH_MAX and _SC_PAGESIZE are among
  the basic macros disabled when you pass recommended standards
  macros, killing the build, so it's good to check if they compile.

src/cmd/INIT/ar.freebsd12.amd64,
src/cmd/INIT/ar.linux.i386-64:
- Removed. May cause build failures on some systems as not all 'ar'
  implementations support the U option. Plus, I can think of no
  good reason to disable deterministic mode (which always creates
  identical output) on 'ar' implementations that support it. See:
  https://groups.google.com/g/comp.unix.shell/c/LdOD1Ya0C9E/m/U6DhgHVICwAJ

src/cmd/INIT/cc.linux.*-icc:
- Removed icc wrappers. These manually source /etc/profile.d/icc.sh
  but I don't think that is the build system's job. Profile scripts
  should be run at login time and export variables we inherit
  through the environment.

src/cmd/INIT/cc.{freebsd,linux,openbsd}*:
- Removed. Should be entirely superfluous now that the standards
  feature test sets the appropriate macros.

src/cmd/INIT/cc.sol11.*:
- Removed as the standards feature test now sets the appropriate
  macros. Note the Solaris build system should now simply pass CC
  as normal instead of passing CC_EXPLICIT.
2021-12-21 06:52:16 +00:00
Martijn Dekker
d3f553ca76 shtests: report correct line numbers for shcomp fails (re: bd38c804)
Inserting the _common script instead of sourcing it caused all test
failures in shcomp runs to be reported with the number of lines in
_common added.

src/cmd/ksh93/tests/shtests:
- Only incorporate the aliases from _common; dot/source the rest of
  the code as normal. Replace the first few lines with the aliases
  to avoid affecting $LINENO; they are comments anyway.
2021-12-21 06:48:21 +00:00
Martijn Dekker
a381a1b049 Better fix for BUG_IFSISSET (re: 95294419)
With a better understanding of the code 1.5 years later, the
special-casing for IFS introduced in that commit seems like a hack.

The problem was not that the IFS node always exists but that it is
always considered to have a 'get' discipline function. Variables
with a 'get' discipline are considered set. This makes sense for
all variables except IFS.

The nv_isnull() macro is used to check if a variable is set. It
calls nv_hasget() to determine if the variable has a 'get'
discipline. So a better fix is for nv_hasget() always to return
false for IFS.

src/cmd/ksh93/bltins/test.c, src/cmd/ksh93/sh/macro.c:
- Remove special-casing for IFS.

src/cmd/ksh93/sh/nvdisc.c: nv_hasget():
- Always return false for IFS, taking local scope into account.
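
The effect can be checked from a script with a parameter expansion
test. A minimal sketch, assuming a shell without BUG_IFSISSET (such
as ksh 93u+m after this commit); ifs_state is a hypothetical helper
name, and the unset is done in a subshell so the caller's IFS is
left untouched:

```shell
# Report whether IFS is set, using the ${var+word} expansion.
ifs_state() {
    case ${IFS+set} in
    set) echo 'IFS is set' ;;
    *)   echo 'IFS is unset' ;;
    esac
}

ifs_state                  # IFS normally has its default value here
( unset IFS; ifs_state )   # the unset is confined to the subshell
```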
2021-12-21 06:29:30 +00:00
Martijn Dekker
69e0de9274 bin/package: more cleanups (re: 9166545a)
src/cmd/INIT/package.sh, bin/package:
- Derive the command name from $0 instead of hardcoding it.
- Remove NPROC and related code to support parallel building. This
  is not supported with mamake, is unlikely to be reintroduced any
  time soon, and if it ever is it will need to be done in a
  different way anyway.
- Invoke 'sed' and 'tr' directly instead of via $SED and $TR
  variables. We're not building our own dynamically linked 'sed'
  and 'tr' in this distribution so LD_LIBRARY_PATH is irrelevant.
  If we ever do again, there are better ways to make sure the OS
  standard 'sed' and 'tr' are invoked than this kludge.
- Use note() consistently to print warnings to standard error.
  note() is changed to print each argument on a new line prefixed
  by the command name, so arguments need to be quoted now if they
  are to be shown on a single line.
- Use a new err_out() function to error out, avoiding code
  repetition.
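
The note() and err_out() behaviour described above could look
roughly like this. This is a hypothetical sketch written from the
description, not the actual package.sh code; the variable names are
assumptions:

```shell
# Hypothetical sketch; not the actual package.sh code.
command=${0##*/}          # derive the command name from $0

# Print each argument on its own line on standard error,
# prefixed by the command name.
note() {
    for _line in "$@"; do
        printf '%s: %s\n' "$command" "$_line" >&2
    done
}

# Print the message(s) via note() and exit with a failure status.
err_out() {
    note "$@"
    exit 1
}

# Two arguments give two lines; quote to keep text on one line.
note 'warning: something looks odd' 'continuing anyway'
```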
2021-12-18 14:47:24 +00:00
Martijn Dekker
ce3e080c0e Release 1.0.0-beta.2
Announcing: KornShell 93u+m 1.0.0-beta.2
https://github.com/ksh93/ksh

In May 2020, when every KornShell (ksh93) development project was
abandoned, development was rebooted in a new fork based on the last
stable AT&T version: ksh 93u+. This new fork is called ksh 93u+m as a
permanent nod to its origin. We're restarting it at version 1.0. Seven
months after the first beta, the second one is ready. Please test this
second beta and report any bugs you find, or help us fix known bugs.

We're now the default ksh93 in some OS distributions, at least Debian
and Slackware! Even though we don't think it's stable release quality
yet, the consensus seems to be that 93u+m is already much better than
the last AT&T release.

Main developers: Martijn Dekker, Johnothan King, hyenias

Contributors: Andy Fiddaman, Anuradha Weeraman, Chase, Gordon Woodhull,
Govind Kamat, Harald van Dijk, Lev Kujawski, Marc Wilson, Ryan Schmidt,
Sterling Jensen

HOW TO GET IT

Please download the source code tarball from our GitHub releases page:

	https://github.com/ksh93/ksh/releases

To build, follow the instructions in README.md or src/cmd/ksh93/README.

HOW TO GET INVOLVED

To report a bug, please open an issue at our GitHub page (see above).
Alternatively, email me at martijn@inlv.org with your report.
To get involved in development, read the brief policy information in
README.md and then jump right in with a pull request or email a patch.
See the TODO file in the top-level directory for a to-do list.

*** MAIN CHANGES between 1.0.0-beta.1 and 1.0.0-beta.2 ***

New features in built-in commands:

- 'cd' now supports an -e option that, when combined with -P, verifies
  that $PWD is correct after changing directories; this helps detect
  access permission problems. See:
  https://www.austingroupbugs.net/view.php?id=253

- 'printf' now supports a -v option as in bash. This assigns formatted
  output directly to variables, which is very fast and will not strip
  final newline (\n) characters.

- The 'return' command, when used to return from a function, can now
  return any status value in the 32-bit signed integer range, like on
  zsh. However, due to a traditional Unix kernel limitation, $? is
  still trimmed to its least significant 8 bits whenever leaving a
  (sub)shell environment.

- 'test'/'[' now supports all the same operators as [[ (including =~,
  \<, \>) except for the different 'and'/'or' operators. Note that
  'test'/'[' remains deprecated due to its unfixable pitfalls;
  [[ ... ]] is recommended instead.
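
As an illustration of the 'printf -v' item above, the following
sketch compares it with a command substitution. It assumes a shell
that has 'printf -v' (this beta or later, or bash); the variable
names are made up for the example:

```shell
# Requires a shell with 'printf -v'.
printf -v with_newline '%s\n' 'hello'      # assigned directly, newline kept
stripped=$(printf '%s\n' 'hello')          # command substitution strips it

# Compare the lengths of the two results.
printf 'printf -v: %d characters\n' "${#with_newline}"
printf '$(...):    %d characters\n' "${#stripped}"
```

The first value is one character longer because the final newline
survives the assignment.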

Shell language changes:

- Several improvements were made to the --noexec shell code linter.

- Arithmetic expressions in native ksh mode no longer interpret a
  number with a leading zero as octal in any context. Use 8#octalnumber
  instead (e.g. 8#400 == 256). Arithmetic expressions now also behave
  identically within and outside ((...)) and $((...)).

- POSIX compatibility mode fixes (only applicable with the --posix shell
  option on):
  - A leading zero is now consistently recognised as introducing an octal
    number in all arithmetic contexts.
  - $((inf)) and $((nan)) are now interpreted as regular variables.
  - The '.' built-in no longer runs ksh functions and now only runs
    files.
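
Since the meaning of a leading zero now depends on whether --posix
is on, scripts that need octal can use the explicit base#number
form, which does not depend on the mode. A short sketch, assuming a
shell that supports base#number arithmetic (ksh or bash):

```shell
# The base#number form states the base explicitly.
echo "$((8#400))"       # octal 400
echo "$((8#10 + 1))"    # octal 10 plus one
```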

Bugs fixed:

- '.' and '..' are now once again completed by tab completion.

- If SIGINT is set to ignore, the interactive shell no longer exits on
  Ctrl+C.

- ksh now builds and runs on Apple's new M1 hardware.

- The 'return' and 'exit' commands no longer risk triggering actual
  signals by returning or exiting with a status > 256.

- Ksh no longer behaves badly when parsing a type definition command
  ('typeset -T' or 'enum') without executing it or when executing it in
  a subshell. Types can now safely be defined in subshells and defined
  conditionally as in 'if condition; then enum ...; fi'.

- Discipline functions, especially those applied to PS2 or .sh.tilde,
  will no longer crash your shell upon being interrupted or throwing an
  error.

- Fixed a bug that could corrupt output if standard output is closed
  upon initialising the shell.

- Fixed a bug in the [[ ... ]] compound command: the '!' logical
  negation operator now correctly negates another '!', e.g.,
  [[ ! ! 1 -eq 1 ]] now returns 0/true. Note that this has always been
  the case for 'test'/'['.

- Fixed SHLVL so that replacing ksh by itself (exec ksh) will not
  increase it.

- Arithmetic expressions are no longer allowed to assign out-of-range
  values to variables of types declared with enum.

- The 'time' keyword no longer makes the --errexit shell option
  ineffective.

- Various bugs in libcmd built-in commands (those bound to the
  /opt/ast/bin path by default) have been fixed.

- Various other crashing bugs have been fixed.

Fixes for the shcomp byte code compiler:

- shcomp is now able to compile scripts that define types using enum.

- shcomp now refuses to mess up your terminal by writing bytecode
  to it.

*** MAIN CHANGES between ksh 93u+ 2012-08-01 and 93u+m 1.0.0-beta.1 ***

Hundreds of bugs have been fixed, including many serious/critical bugs.
This includes upstreamed patches from OpenSUSE, Red Hat, and Solaris, fixes
backported from the abandoned 93v- beta and ksh2020 fork, as well as many
new fixes from the community. See the NEWS file for more information, and
the git commit log for complete documentation of every fix. Incompatible
changes have been minimised, but not at the expense of fixing bugs. For a
list of potentially incompatible changes, see src/cmd/ksh93/COMPATIBILITY.

Though there was a "no new features, bugfixes only" policy, some new
features were found necessary, either to fix serious design flaws or to
complete functionality that was evidently intended, but not finished.
Below is a summary of these new features.

New command line editor features:

- The forward-delete and End keys are now handled as expected in the
  emacs and vi built-in line editors.

- In the vi and emacs line editors, repeat count parameters can now also
  be used for the arrow keys and the forward-delete key. E.g., in emacs
  mode, <ESC> 7 <left-arrow> will now move the cursor seven positions to
  the left. In vi control mode, this would be entered as: 7 <left-arrow>.

New shell language features:

- The &>file redirection shorthand (for >file 2>&1) is now available for
  all scripts and interactive sessions and not only for profile/login
  scripts, bringing ksh 93u+m in line with mksh, bash, and zsh.

- File name generation (a.k.a. pathname expansion, a.k.a. globbing) now
  never matches the special navigational names '.' (current directory)
  and '..' (parent directory). This change makes a pattern like .*
  useful; it now matches all hidden files (dotfiles) in the current
  directory, without the harmful inclusion of '.' and '..'.

- Tilde expansion can now be extended or modified by defining a
  .sh.tilde.get or .sh.tilde.set discipline function. This replaces a
  2004 undocumented attempt to add this functionality via a .sh.tilde
  command, which never worked and crashed the shell. See the manual for
  details on the new method.
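
Scripts that must also run on shells where .* still matches '.' and
'..' can filter those names explicitly. A minimal sketch;
list_dotfiles is a hypothetical helper name, and on ksh 93u+m the
filter is simply never triggered:

```shell
# Print the names of the hidden files in directory $1, one per line.
list_dotfiles() {
    for _f in "$1"/.*; do
        case ${_f##*/} in
        . | ..) continue ;;    # skip navigational names on other shells
        esac
        # If the pattern matched nothing, it is left as a literal string.
        [ -e "$_f" ] || [ -L "$_f" ] || continue
        printf '%s\n' "${_f##*/}"
    done
}
```

It would be called as, for example, list_dotfiles "$HOME".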

New features in built-in commands:

- Usage error messages now show the --help/--man self-documentation options.

- Path-bound built-ins (such as /opt/ast/bin/cat) can now be executed by
  invoking the canonical path, so the following will now work as expected:
	$ /opt/ast/bin/cat --version
	  version         cat (AT&T Research) 2012-05-31

- 'command -x' now looks for external commands only, skipping built-ins.
  In addition, its xargs-like functionality no longer freezes the shell on
  Linux and macOS, making it effectively a new feature on these systems.

- 'redirect' now checks if all arguments are valid redirections before
  performing them. If an error occurs, it issues an error message instead
  of terminating the shell.

- 'suspend' now refuses to suspend a login shell, as there is probably no
  parent shell to return to and the login session would freeze.

- 'times' now gives high precision output in a POSIX compliant format.

- 'typeset' now gives an informative error message if an incompatible
  combination of options is given.

- 'whence -v/-a' now reports the location of autoloadable functions.

New features in shell options:

- A new --globcasedetect shell option is added on OSs where we can
  check for a case-insensitive file system (currently Windows/Cygwin,
  macOS, Linux and QNX 7.0+). When this option is turned on, file name
  generation (globbing), as well as file name tab completion on
  interactive shells, automatically become case-insensitive on file
  systems where the difference between upper and lower case is ignored
  for file names. This is transparently determined for each directory, so
  a path pattern that spans multiple file systems can be part
  case-sensitive and part case-insensitive.

- A new --nobackslashctrl shell option disables the special escaping
  behaviour of the backslash character in the emacs and vi built-in
  editors. Particularly in the emacs editor, this makes it much easier to
  go backward, insert a forgotten backslash into a command, and then
  continue editing without having your next cursor key replace your
  backslash with garbage. Note that Ctrl+V (or whatever other character
  was set using 'stty lnext') always escapes all control characters in
  either editing mode.

- A new --posix shell option has been added to ksh 93u+m that makes the
  ksh language more compatible with other shells by following the POSIX
  standard more closely. See the manual page for details. It is enabled by
  default if ksh is invoked as sh, otherwise it is disabled by default.

- Enhancement to -G/--globstar: symbolic links to directories are now
  followed if they match a normal (non-**) glob pattern. For example, if
  '/lnk' is a symlink to a directory, '/lnk/**' and '/l?k/**' now work as
  you would expect.
2021-12-17 04:20:04 +01:00
Johnothan King
85199ab351 Backport ksh93v- bugfix for [[ 1<2 ]] (#380)
Strings compared in [[ with the > and < operators should be compared
lexically. This does not work when a string is a single digit, as
the parser takes a digit directly followed by < or > to be the start
of a redirection and throws a syntax error:
   $ [[ 10<2 ]]   # 10 lexically sorts before 2
   $ echo $?
   0
   $ [[ 1<2 ]]
   /usr/bin/ksh: syntax error: `<' unexpected
   $ echo $?
   3

src/cmd/ksh93/sh/lex.c:
- Don't interpret numbers next to > and < as a redirection while
  inside of [[. This bugfix was backported from ksh93v- 2014-06-25.

src/cmd/ksh93/tests/bracket.sh:
- Add regression tests for the > and < operators.
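
For contrast, a short sketch of the difference between the lexical
comparison in [[ and the numeric comparison in ((. The operators are
written with spaces here so that the example does not depend on
this fix; it assumes ksh or bash:

```shell
# Inside [[ ... ]], < and > compare strings.
if [[ 10 < 2 ]]; then
    echo 'string: 10 sorts before 2'
fi

# Inside (( ... )), < and > compare numbers.
if (( 10 < 2 )); then
    echo 'number: 10 is less than 2'
else
    echo 'number: 10 is not less than 2'
fi
```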
2021-12-17 03:26:41 +01:00
Martijn Dekker
e67df29c07 Re-fix defining types conditionally or in subshells (re: f508660d)
New version. I'm pretty sure the problems that forced me to revert
it earlier are fixed.

This commit mitigates the effects of the hack explained in the
referenced commit so that dummy built-in command nodes added by the
parser for declaration/assignment purposes do not leak out into the
execution level, except in a relatively harmless corner case.

Something like

    if false; then
        typeset -T Foo_t=(integer -i bar)
    fi

will no longer leave a broken dummy Foo_t declaration command. The
same applies to declaration commands created with enum.

The corner case remaining is:

    $ ksh -c 'false && enum E_t=(a b c); E_t -a x=(b b a c)'
    ksh: E_t: not found

Since the 'enum' command is not executed, this should have thrown
a syntax error on the 'E_t -a' declaration:
ksh: syntax error at line 1: `(' unexpected

This is because the -c script is parsed entirely before being
executed, so E_t is recognised as a declaration built-in at parse
time. However, the 'not found' error shows that it was successfully
eliminated at execution time, so the inconsistent state will no
longer persist.

This fix now allows another fix to be effective as well: since
built-ins do not know about virtual subshells, fork a virtual
subshell into a real subshell before adding any built-ins.

src/cmd/ksh93/sh/parse.c:

- Add a pair of functions, dcl_hacktivate() and dcl_dehacktivate(),
  that (de)activate an internal declaration built-ins tree into
  which check_typedef() can pre-add dummy type declaration command
  nodes. A viewpath from the main built-ins tree to this internal
  tree is added, unifying the two for search purposes and causing
  new nodes to be added to the internal tree. When parsing is done,
  we close that viewpath. This hides those pre-added nodes at
  execution time. Since the parser is sometimes called recursively
  (e.g. for command substitutions), keep track of this and only
  activate and deactivate at the first level.
     (Fixed compared to previous version of this commit: calling
  dcl_dehacktivate() when the recursion level is already zero is
  now a harmless no-op. Since this only occurs in error handling
  conditions, who cares.)

- We also need to catch errors. This is done by setting libast's
  error_info.exit variable to a dcl_exit() function that tidies up
  and then passes control to the original (usually sh_exit()).
     (Fixed compared to previous version of this commit: dcl_exit()
  immediately deactivates the hack, no matter the recursion level,
  and restores the regular sh_exit(). This is the right thing to
  do when we're in the process of erroring out.)

- sh_cmd(): This is the most central function in the parser. You'd
  think it was sh_parse(), but $(modern)-form command substitutions
  use sh_dolparen() instead. Both call sh_cmd(). So let's simply
  add a dcl_hacktivate() call at the beginning and a
  dcl_dehacktivate() call at the end.

- assign(): This function calls path_search(), which among many
  other things executes an FPATH search, which may execute
  arbitrary code at parse time (!!!). So, regardless of recursion
  level, forcibly dehacktivate() to avoid those ugly parser side
  effects returning in that context.

src/cmd/ksh93/bltins/enum.c: b_enum():

- Fork a virtual subshell before adding a built-in.

src/cmd/ksh93/sh/xec.c: sh_exec():

- Fork a virtual subshell when detecting typeset's -T option.

Improves fix to https://github.com/ksh93/ksh/issues/256
2021-12-17 01:28:28 +01:00
Martijn Dekker
2bc1d814c9 Do not exit shell on Ctrl+C with SIGINT ignored (re: 7e5fd3e9)
The killpg(getpgrp(),SIGINT) call added to ed_getchar() in that
commit caused the interactive shell to exit on ^C even if SIGINT is
being ignored. We cannot revert or remove that call without
breaking job control. This commit applies a new fix instead.

Reproducers fixed by this commit:

  SIGINT ignored by child:

    $ PS1='childshell$ ' ksh
    childshell$ trap '' INT
    childshell$ (press Ctrl+C)
    $

  SIGINT ignored by parent:

    $ (trap '' INT; ENV=/./dev/null PS1='childshell$ ' ksh)
    childshell$ (press Ctrl+C)
    $

  SIGINT ignored by parent, trapped in child:

    $ (trap '' INT; ENV=/./dev/null PS1='childshell$ ' ksh)
    childshell$ trap 'echo test' INT
    childshell$ (press Ctrl+C)
    $

I've experimentally determined that, under these conditions, the
SFIO stream error state is set to 256 == 0400 == SH_EXITSIG.

src/cmd/ksh93/sh/main.c: exfile():
- On EOF or error, do not return (exiting the shell) if the shell
  state is interactive and if sferror(iop)==SH_EXITSIG.
- Refactor that block a little to make the new check fit in nicely.

src/cmd/ksh93/tests/pty.sh:
- Test the above three reproducers.

Fixes: https://github.com/ksh93/ksh/issues/343
2021-12-16 19:56:46 +01:00
Martijn Dekker
4856847631 libast/regex: full revert (re: 38aab428, 1aa8f771)
In another branch, this is causing a mysterious test failure in
types.sh. So this is not ready for 1.0.
2021-12-16 19:06:33 +01:00
Martijn Dekker
3779d84220 more regression test updates (re: 7cdb01f6)
2021-12-16 11:46:17 +01:00
Martijn Dekker
fad16a0c63 tests/path.sh: update regression test (re: 7cdb01f6)
2021-12-16 09:20:00 +01:00
Martijn Dekker
7cdb01f625 New selection of default libcmd /opt/ast/bin built-ins
Note that this is only about the /opt/ast/bin built-in commands,
not about the regular pathless builtins such as printf.
To use these, either add /opt/ast/bin to your $PATH or use a
command like 'builtin cp'. As usual, --man provides info.

Removed as defaults for lack of convincing advantages over the OS's
external commands:
- chmod, cmp, head, logname, mkdir, sync, uname, wc

Remain as useful defaults:
- basename, cat, cut, dirname. These are commonly used in
  performance-sensitive code paths in scripts and having them as
  built-ins can be good for performance.
- getconf: This is the only interface to some libast internals that
  is available to ksh. It also has better functionality than most
  OS-shipped 'getconf' commands, e.g., it can list and query all
  the configuration values.

Added as defaults:
- cp, ln, mv: Having these built in can speed up scripts that
  manage files. Also the AST versions have extended functionality
  (see cp --man, etc.).
- mktemp: External mktemp commands vary too widely and are
  incompatible, but it's important that scripts can securely make
  temporary files, so it's good to ship a known interface to this
  functionality.

As a result, the statically linked ksh binary is very slightly
smaller than before.

Resolves: https://github.com/ksh93/ksh/issues/349
2021-12-16 09:09:37 +01:00
Martijn Dekker
60e0687cec shtests: don't run pty tests with shcomp by default
The pty tests test the interactive shell. Therefore, running them
through the script compiler is a waste of time.

Not only that, it is reported that the pty tests intermittently
fail with shcomp on some systems. This is not worth trying to fix.

src/cmd/ksh93/tests/shtests:
- Only run pty.sh with shcomp if -c/--compile was explicitly
  specified.
- Document the change.
2021-12-16 09:09:05 +01:00
Martijn Dekker
38aab428bb Temporarily revert separate stack for libast/regex (re: 1aa8f771)
Welcome to AT&T engineering practices in action: a fix in one thing
breaks a completely unrelated thing, but only in very specific
and inscrutable circumstances.

Commit ffe84ee7 introduced a regression test failure in types.sh:

test types begins at 2021-12-14+23:57:35
	types.sh[130]: z.r.s should be z.r.x
test types failed at 2021-12-14+23:57:35 with exit code 1 [ 86 tests 1 error ]
test types(C.UTF-8) begins at 2021-12-14+23:57:35
test types(C.UTF-8) passed at 2021-12-14+23:57:35 [ 86 tests 0 errors ]
test types(shcomp) begins at 2021-12-14+23:57:35
test types(shcomp) passed at 2021-12-14+23:57:35 [ 86 tests 0 errors ]

Oddly enough, I've *only* found this regression on the GitHub CI
runner. I've tried it on three different regular Linux systems and
it occurs on none of them, nor on macOS.

Another odd thing: it only fails on the first of those three test
runs. But my experiments show it fails very consistently.

Through a process of systematic elimination in a test branch, I've
found that the failure is triggered by the change to using a
separate stack in the regex code. All the other changes are fine.

Using a separate stack improves the robustness of the regex code,
but it apparently exposes some breakage in how the very dodgy
'typeset -T' code is handling the stack, which was being masked by
sharing a stack with it. Or at least that seems like the most
plausible explanation to me right now.

So, until that breakage can be traced and fixed, the regex code now
shares the main stack with everything else again for the time being.

_____
Just to record this: by adding a couple of debug lines:

  typeset -p z | sed 's/^/[DEBUG] /'
  printf '[DEBUG] %s\n' "${z.r.s}" "${z.r.x}"

the symptom reveals itself more clearly on the GitHub runner:

  test types begins at 2021-12-15+17:25:57
  [DEBUG] Y_t z=(X_t r=(x=foo;y=bam;s=''))
  [DEBUG]
  [DEBUG] foo
          types.sh[132]: z.r.s should be z.r.x
  test types failed at 2021-12-15+17:25:57 with exit code 1 [ 86 tests 1 error ]
  test types(C.UTF-8) begins at 2021-12-15+17:25:57
  [DEBUG] Y_t z=(X_t r=(x=foo;y=bam;s=foo))
  [DEBUG] foo
  [DEBUG] foo
  test types(C.UTF-8) passed at 2021-12-15+17:25:57 [ 86 tests 0 errors ]
  test types(shcomp) begins at 2021-12-15+17:25:57
  [DEBUG] Y_t z=(X_t r=(x=foo;y=bam;s=foo))
  [DEBUG] foo
  [DEBUG] foo
  test types(shcomp) passed at 2021-12-15+17:25:57 [ 86 tests 0 errors ]
2021-12-16 09:08:44 +01:00
Martijn Dekker
9166545aa3 bin/package: go POSIX
Yes, we're finally abandoning the old Bourne shell so we can
use sane $(command substitutions) and the like. POSIX sh (with
tolerance for shell bugs) is very highly portable these days.
Even Solaris 10 has a POSIX shell, though not as /bin/sh.

bin/package, src/cmd/INIT/package.sh:
- Be nice: if we're on an obsolete or broken shell, try hard to
  escape to a good one. This should preserve the possibility to
  just run 'bin/package make' via ancient /bin/sh on Solaris 10.
  - Note: zsh without sh emulation is considered broken because
    $path, which we use, is linked to $PATH. You have to run it via
    a symlink named sh or via 'zsh --emulate sh' to disable this.
    Enabling emulation mode after initialisation will not work.
- More self-documentation cleanups and updates.
- Regenerate the text-only fallback version of the self-doc.
- Remove flat view functionality (no arch directory); it may have
  been broken for some time, but quite frankly I could not care
  less. It's yet more featuritis. Building in arch/ is fine.
2021-12-16 09:08:20 +01:00
Martijn Dekker
1aa8f771d8 libast: regex: backport robustness improvements from 93v- beta
There are two main changes:

1. The regex code now creates and uses its own stack (env->mst)
   instead of using the shared standard stack (stkstd). That seems
   likely to be a good thing.

2. Missing mbinit() calls were inserted. The 93v- code uses a
   completely different multibyte characters API, so these needed
   to be translated back to the older API. But, as mbinit() is no
   longer a no-op as of 300cd199, these calls do stop things from
   breaking if a previous operation is interrupted mid-character.

I think there might be a couple of off-by-one errors fixed as well,
as there are two instances of this change:

-		while ((index += skip[buf[index]]) < mid);
+		while (index < mid)
+			index += skip[buf[index]];
2021-12-15 00:50:59 +01:00
Martijn Dekker
7c30a59e25 package: allow 'tee' to catch up before returning (re: beb3c64a)
In the referenced commit message I neglected to mention that, when
doing bin/package make, we're now running 'tee' in the background
again and the building job in the foreground, as opposed to the
other way around. Foreground jobs are more reliably interruptible.
But that reintroduced the problem fixed in 5b8d29d3. Now I don't
know what I was thinking then -- the obvious fix is to add a 'wait'
command, allowing 'tee' to catch up before returning to the prompt.
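
The pattern can be sketched as follows. This is a hypothetical
illustration, not the actual bin/package code: the named pipe, the
file names and the stand-in build job are all assumptions made for
the example.

```shell
# Hypothetical sketch; not the actual bin/package code.
workdir=$(mktemp -d)
mkfifo "$workdir/pipe"

# The logger runs in the background and copies its input to a log file.
tee "$workdir/build.log" < "$workdir/pipe" &

# The build job (a stand-in here) runs in the foreground.
{
    echo 'compiling...'
    echo 'done'
} > "$workdir/pipe" 2>&1

# Allow 'tee' to catch up before returning to the prompt.
wait
```

Without the final 'wait', the script could return while 'tee' is
still writing its last lines.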
2021-12-15 00:50:56 +01:00
Martijn Dekker
6137b99a6b package: remove a lot more unused AT&T cruft
This reduces the bin/package script by more than half!

bin/package, src/cmd/INIT/package.sh:
- Remove obsolete and unused package commands: admin, contents,
  license, list, remote, regress, setup, update, verify, write.
- Remove associated documentation.
- Replace install command with a dummy. It'll come back when we
  reintroduce the building of dynamic libraries.
- Update the test command to run the regression tests properly
  and capture the output in arch/*/lib/package/gen/tests.out, as
  documented. Arguments are simply passed to bin/shtests.

src/cmd/INIT/{ditto.sh,hurl.sh,release.c}:
- Removed. These were support scripts for some of the removed
  package commands.

src/cmd/ksh93/tests/pty.sh:
- Avoid failure when capturing output via 'bin/package test' by
  redirecting standard error to /dev/tty when running the tests.
2021-12-15 00:50:45 +01:00
Johnothan King
61fa1b68bf The chown builtin should fail with the same error consistently (#378)
This bug was first reported at <https://www.illumos.org/issues/3782>.
The chown builtin when used on illumos can fail with different error
messages after running the same command twice:

  $ touch /tmp/x
  $ /opt/ast/bin/chown -h 433:434 /tmp/x
  chown: /tmp/x: cannot change owner and group [Not owner]
  $ /opt/ast/bin/chown -h 433:434 /tmp/x
  chown: /tmp/x: cannot change owner and group [Invalid argument]

The error messages differ because the libast struid and strgid
functions will return -2 if the same nonexistent ID is used twice.

The fix for this bug has been ported from here:
https://github.com/illumos/illumos-gate/commit/4162633a7c5961f388fd

src/lib/libcmd/chgrp.c:
- Remove NOID macro and check for a < 0 error status instead.
  This is different from the Illumos fix at
  <https://github.com/illumos/illumos-gate/commit/4162633a7c59>
  which added another macro.

src/lib/libast/man/{strgid,struid}.3:
- Correct errors in the strgid and struid documentation.
- Document that the strgid and struid functions will return -2 if
  the same invalid name is used twice.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2021-12-14 10:37:44 +01:00
Martijn Dekker
c0354a869f Fix build on illumos (re: 7fb814e1, 580ff616)
The standards macros consistency fix for iffe exposed breakage on
illumos: the standards flags aren't set properly. Back in 580ff616,
I set _XPG6 from features/common, which is the wrong place; the
correct place is features/standards -- especially now that iffe
uses its results.

In addition, to get header declarations that aren't somehow in
conflict with themselves on illumos, don't result in "implicit
function declaration" warnings, and expose all the functionality,
we need to define *all* the _XPG[4-7] macros *and* __EXTENSIONS__
*and* _XOPEN_SOURCE. Welp. Thankfully, that's just fine with
Solaris, too.

Thanks to @JohnoKing for the heads-up.
2021-12-14 10:13:41 +01:00
Johnothan King
5084bef41f Fix buggy bin/package environment behavior (#377)
Running 'bin/package environment' should only show what will happen
during the build process, but in its current state the feature has
some bugs:

1. Errors can occur relating to the failed creation of files and a
   failed attempt to change the directory. Various errors from
   'bin/package environment make' and 'bin/package environment make
   CC=tcc':
   bin/package[5632]: cd: /home/johno/GitRepos/KornShell/ksh/arch/linux.i386-64/lib/package/gen: [No such file or directory]
   bin/package: line 5869: /home/johno/GitRepos/KornShell/ksh/arch/linux.i386-64/lib/package/gen/CC: cannot create [No such file or directory]
   bin/package: line 5869: /home/johno/GitRepos/KornShell/ksh/arch/linux.i386-64/lib/package/gen/CCFLAGS: cannot create [No such file or directory]
   bin/package: line 5869: /home/johno/GitRepos/KornShell/ksh/arch/linux.i386-64/lib/package/gen/CCLDFLAGS: cannot create [No such file or directory]
   bin/package: line 5869: /home/johno/GitRepos/KornShell/ksh/arch/linux.i386-64/lib/package/gen/LDFLAGS: cannot create [No such file or directory]
   bin/package: line 5869: /home/johno/GitRepos/KornShell/ksh/arch/linux.i386-64/lib/package/gen/KSH_RELFLAGS: cannot create [No such file or directory]
   bin/package[5888]: /home/johno/GitRepos/KornShell/ksh/arch/linux.i386-64/lib/package/gen/host: cannot create [No such file or directory]

2. The package script may in some scenarios create a temporary file
   at the root of the repository, such as 'pkg77213.c'.

bin/package, src/cmd/INIT/package.sh:
- Avoid creating files or changing the directory while the
  environment qualifier is on (this also affects the debug
  qualifier). Part of this fix is based on a patch from Marcin
  Cieślak[*], with other fixes applied for similar problems the
  environment qualifier had.
2021-12-14 03:54:50 +01:00
Martijn Dekker
46593a89b7 Get rid of overcomplicated AT&T copyright/license maintenance code
I'm now taking another small step towards extricating this build
system from the long-dead AT&T AST universe.

This commit modifies/reduces the tool called proto. AT&T used proto
for two purposes:

  1. To convert ANSI C code to a form compatible with ancient
     (pre-ANSI) K&R C compilers using extremely complex macro
     voodoo. It was similarly capable of translating to C++.
     Theoretically, this entire code base should compile on
     anything from a 1980s K&R C compiler to a modern C++ compiler.
     In practice, given the massive amount of bit rot we inherited,
     I am 99.9% sure that this has been broken for many years.

  2. To automagically insert license comments into source files
     based on an extremely complicated license database system.
     (In all-too-typical AT&T fashion, this second function of
     proto is completely unrelated to the first.)

Function 2 has now been removed because, unlike the AT&T legal
department, I don't think it's worth going to unspeakably extreme
lengths to avoid maintaining license information in source code
files by hand.

In the process, proto.c was cleaned up to look halfway like actual
C code, but it's still processed code: most macros have been
expanded to their numeric value, all comments were stripped, etc.
So don't expect to understand this code. The actual source code is
in these two directories in the ast-open-history repo:

https://github.com/ksh93/ast-open-history/tree/master/src/cmd/proto
https://github.com/ksh93/ast-open-history/tree/master/src/lib/libpp

Meanwhile, nobody wants to compile ksh with a pre-ANSI K&R C
compiler in 2021 -- and there's no good reason to be compatible
with C++ because standard C compilers are universally available.
So, proto will go away when I manage to figure out how to pry it
loose from the innards of this build system.

src/lib/libast/port/astlicense.c:
- Removed. This is all the license handling code that was
  incorporated in proto.c in stripped form. It was not used
  anywhere else, and the environment where it was useful is gone.

src/cmd/INIT/proto.c:
- Cleanup to make this halfway maintainable: indentation, huge
  blocks of empty lines, #line directives, etc.
- Delete all the code corresponding to astlicense.c. This was
  actually easy as it was in a discrete block.
- proto(), pppopen(): Remove 'license'/'notice' and 'options'
  arguments.
- main(): Remove processing of -l (license) and -o (license
  options) flags.

**/Mamfile:
- Update all the proto invocations to remove the -l and -o flags.

bin/package, src/cmd/INIT/package.sh:
- Delete the 'copyright' command, which used the -l and -o
  options to tell proto to extract copyright information from
  *.lic/*.def files in lib/package.

COPYRIGHT:
- Added. This has the information from 'bin/package copyright', with
  the copyright years corrected to plausible values as the AST code
  used the current year (2021) for all of them. It adds ksh 93u+m
  copyright and contributor information at the top as well.
     (Yes, some of the lines in the old non-AT&T copyright notices
  are clipped. This is the actual output of the 'bin/package
  copyright' command as generated by 'proto' in the AST
  distribution. For all that extreme complexity, they couldn't even
  reproduce the notices correctly. But it's officially sanctioned
  by AT&T in exactly this form, so there you have it.)

lib/package/**:
- Removed. All these files are now obsolete and redundant.
2021-12-14 03:15:16 +01:00
Johnothan King
c2ac69b2d5 Use dynamic maximum configuration values when necessary (#370)
This commit fixes an issue with how ksh was obtaining the value of
NGROUPS_MAX. On some systems this setting can be changed (e.g., on
illumos adding 'set ngroups_max=32' to /etc/system then rebooting
changes NGROUPS_MAX from 16 to 32). Ksh was using NGROUPS_MAX with
the assumption it's a static value, which could cause issues on
systems where it isn't static. This bugfix is inspired by the one
from <https://github.com/lkujaw/ast/commit/b1362c3a5>, although it
has been expanded a bit to account for OPEN_MAX as well.

src/cmd/ksh93/sh/init.c, src/lib/libcmd/fds.c:
- Rename the getconf() macro to astconf_long() and move it to ast.h
  to prevent redundancy. Other sections of the code have been
  modified to use this macro for astconf() to account for
  dynamic settings.
- An equivalent macro for unsigned long values (astconf_ulong) has
  been added.
- Prefer sysconf(3) where available. It has better performance as it
  returns a numeric value directly instead of via string
  conversion.
- The astconf_long and astconf_ulong macros have been documented in
  the ast(3) man page.
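
For illustration, the idea of querying a limit at run time and only
falling back to a compile-time value can be sketched as below. This is
a minimal sketch, not the astconf_long() macro from ast.h: the helper
name dyn_limit() and its fallback parameter are hypothetical, and only
the standard sysconf(3) call is assumed.

```c
#include <assert.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of libast): ask the system for a limit
 * at run time and use the caller's fallback if sysconf(3) cannot give
 * a usable answer (it returns -1 for unknown names or indeterminate
 * limits).
 */
static long dyn_limit(int name, long fallback)
{
	long	n;

	assert(fallback > 0);
	n = sysconf(name);
	return n > 0 ? n : fallback;
}
```

A caller would size its group list with something like
dyn_limit(_SC_NGROUPS_MAX, 16) on each use, rather than trusting a
compiled-in NGROUPS_MAX that may no longer match the running system.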
2021-12-13 07:53:14 +01:00
Martijn Dekker
fc752b574a Re-match '.' and '..' in tab completion (re: 5312a59d, aad74597)
Turns out there is a bona fide, honest-to-goodness use case for
matching '.' and '..' in globbing after all. It's when globbing is
used as the backend mechanism for file name completion in
interactive shell editors. A tab invisibly adds a * at the end of
the word to the left of your cursor and the resulting pattern is
expanded. In 5312a59d, this broke for '.' and '..'.

Typing '.' followed by two tabs should result in a menu that
includes './' and '../'. Typing '..' followed by a tab should
result in '../' (or a menu that includes it if there are files
with names starting with '..'). This is the behaviour in 93u+ and
we should maintain this.

To restore this functionality without reintroducing the harmful
behaviour fixed in the referenced commits, we should special-case
this, allowing '.' and '..' to match only for file name completion.

src/lib/libast/include/glob.h:
- Fix an inaccurate comment: the GLOB_COMPLETE flag is used for
  command completion, not file name completion. This is very clear
  from reading the path_expand() function in sh/expand.c.
- Add new GLOB_FCOMPLETE flag for file name completion.

src/lib/libast/misc/glob.c:
- Adapt flags mask to fit the new flag.
- glob_dir(): If GLOB_FCOMPLETE is passed, allow '.' and '..' to
  match even if expanded from a pattern.
- Clarify the fix from aad74597 with an extended comment based on
  <https://github.com/ksh93/ksh/issues/146#issuecomment-790991990>.

src/cmd/ksh93/sh/expand.c: path_expand():
- If we're in the SH_FCOMPLETE (file name completion) state, then
  pass the new GLOB_FCOMPLETE flag to AST glob(3).
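
The special case can be sketched roughly as follows. This is only an
illustration of the rule for entries expanded from a pattern, not the
actual glob_dir() code: SKETCH_FCOMPLETE stands in for GLOB_FCOMPLETE,
and keep_entry() and is_dot_or_dotdot() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_FCOMPLETE	0x1	/* hypothetical stand-in for GLOB_FCOMPLETE */

/* Return 1 if name is exactly "." or "..", 0 otherwise. */
static int is_dot_or_dotdot(const char *name)
{
	return name[0] == '.' && (name[1] == '\0' || (name[1] == '.' && name[2] == '\0'));
}

/*
 * Decide whether a directory entry that was produced by expanding a
 * pattern should be kept: '.' and '..' are dropped unless the caller
 * asked for file name completion; every other name is kept.
 */
static int keep_entry(const char *name, int flags)
{
	assert(name != NULL);
	if (is_dot_or_dotdot(name))
		return (flags & SKETCH_FCOMPLETE) != 0;
	return 1;
}
```

With this rule, ordinary globbing still never yields '.' or '..' from a
pattern, while the completion path gets them back by passing the flag.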

Fixes: https://github.com/ksh93/ksh/issues/372
Thanks to @fbrau for the bug report.
2021-12-13 01:50:50 +01:00
Johnothan King
e54001d58b Various minor capitalization and typo fixes (#371)
This commit fixes various minor typos and punctuation errors, and
corrects the capitalization of many names.
2021-12-13 01:49:42 +01:00