1
0
Fork 0
mirror of git://git.code.sf.net/p/cdesktopenv/code synced 2025-03-09 15:50:02 +00:00
Commit graph

51 commits

Author SHA1 Message Date
atheik
9bed28c3f9 Fix line continuation within command substitutions
In command substitutions of the $(standard) and ${ shared state; }
form, backslash line continuation is broken.

Reproducer:

	echo $(
	echo one two\
	three
	)

Actual output (ksh93, all versions):

	one two\ three

Expected output (every other shell, POSIX spec):

	one twothree

src/cmd/ksh93/sh/lex.c: sh_lex(): case S_REG:
- Do not skip new-line joining if we're currently processing a
  command substitution of one of these forms (i.e., if the
  lp->lexd.dolparen level is > 0).

Background info/analysis:

comsub() is called from sh_lex() when S_PAR is the current state.
In src/cmd/ksh93/data/lexstates.c, we see that S_PAR is reached in
the ST_DOL state table at index 40. Decimal 40 is ( in ASCII. So,
the previous skipping of characters was done according to the
ST_DOL state table, and the character that stopped it was (. This
means we have $(.

Alternatively, comsub() may be called from sh_lex() by jumping to
the do_comsub label. In brief, that is the case when we have ${.

Regardless of which it is from the two, comsub() is now called from
sh_lex(). In comsub(), lp->lexd.dolparen is incremented at the
beginning and decremented at the end. Between them, we see that
sh_lex() is called. So, lp->lexd.dolparen in sh_lex() indicates the
depth of nesting $( or ${ statements we're in. Thus, it is also the
number of comsub() invocations seen in a backtrace taken in
sh_lex().

The codepath for `...` is different (and never had this bug).

Co-authored by: Martijn Dekker <martijn@inlv.org>
Resolves: https://github.com/ksh93/ksh/issues/367
2022-05-22 00:23:54 +01:00
atheik
40a5c45b48 Allow double quotes within backtick comsub within double quotes
The following reproducer causes a spurious syntax error:

    foo="`: "("`"

The nested double quotes are not recognised correctly, causing a
syntax error at the '('. Removing the outer double quotes (which
are unnecessary) is a workaround, but it's still a bug as every
other shell accepts this. This bug has been present since the
original Bourne shell.

src/cmd/ksh93/sh/lex.c: sh_lex(): case S_QUOTE:
- If the current character is '"' and we're in a `...` command
  substitution (ingrave is true), then do not switch to the old
  mode but keep using the ST_QUOTE state table.

Thanks to @JohnoKing for the report and to @atheik for the fix.

Co-authored by: Martijn Dekker <martijn@inlv.org>
Resolves: https://github.com/ksh93/ksh/issues/352
2022-05-20 22:48:47 +01:00
Martijn Dekker
b398f33c49 Another round of accumulated minor fixes and cleanups
Only notable changes listed below.

**/Mamfile:
- Do not bother redirecting standard error for 'cmp -s' to
  /dev/null. Normally, 'cmp -s' on Linux, macOS, *BSD, or Solaris
  do not not print any message. If it does, something unusual is
  going on and I would want to see the message.
- Since we now require a POSIX shell, we can use '!'.

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/init.c:
- Remove SH_TYPE_PROFILE symbol, unused after the removal of the
  SHOPT_PFSH code. (re: eabd6453)

src/cmd/ksh93/sh/io.c:
- piperead(), slowread(): Replace redundant sffileno() calls by
  the variables already containing their results. (re: 50d342e4)

src/cmd/ksh93/bltins/mkservice.c,
rc/lib/libcmd/vmstate.c:
- If these aren't compiled, define a stub function to silence the
  ranlib(1) warning that the .o file does not contain symbols.
2022-03-11 21:20:32 +01:00
Johnothan King
8fc8c2f51c Fix a few minor issues (#473)
Changes:
- Fixed two xtrace test failures introduced in commit cfc8744c.
- The definition of _use_ntfork_tcpgrp in xec.c is now dependent on
  SHOPT_SPAWN being defined (re: 8e9ed5be).
- Removed many unnecessary newlines and fixed various typos.
2022-03-11 21:18:42 +01:00
Martijn Dekker
14a43a0a88 Yet more misc. cleanups; rm SHOPT_PFSH, SHOPT_TYPEDEF
Notable changes:
- Remove SHOPT_PFSH compile-time option and associated code.
  This was meant to work with Solaris rights profiles, see:
  https://docs.oracle.com/cd/E23824_01/html/821-1461/profiles-1.html#REFMAN1profiles-1
  But it has been obsolete for years as Solaris stopped using
  it in its shipped ksh several OS versions ago, preferring a
  library-based wrapper around ksh and other shells.
  Nonetheless I experimented with the option on Solaris 11.4.
  Result: no external command will run; output of unitialised
  memory in error message. So it's already fallen victim to bit
  rot. There's nothing interesting here, so just get rid.
- Remove SHOPT_TYPEDEF compile-time option (but keep the code!).
  Turning it off caused the build to fail. It may be possible to
  fix it, but the type definition code is integral to ksh now (e.g.
  'enum' depends on much of it) so it makes no sense to disable it.
  This was removed in the ksh 93v- beta version as well.
- Remove nv_close() calls and remove nv_close() documentation from
  the nval.3 man page. This function is a dummy, present without
  any changes since the beginning of the ast-open-archive repo in
  1995. The comment was: "Currently this is a dummy, but someday
  will be needed for reference counting". 27 or more years later,
  it's time to admit it's never going to happen. (And of course,
  nv_close() calls were not being used with anything resembling
  consistency.)
- Add a null nv_close() macro to nval.h for compatibility with
  third party code that follows the old documentation.
- Add a few missing regression tests.
2022-02-10 21:04:56 +00:00
Martijn Dekker
9c313c7fe3 Update copyright years in files changed since 1st Jan 2022 2022-01-30 20:49:04 +00:00
Martijn Dekker
172becffea Some more accumulated minor tweaks and cleanups
Notable changes:

src/cmd/ksh93/include/fault.h:
- Get rid of the superflous sh pointer argument in the
  sh_pushcontext() and sh_popcontext() macros. (re: 2d3ec8b6)

src/cmd/ksh93/tests/io.sh:
- Tweak a process substitution test to allow up to a second for
  unused process substitution processes to disappear. On the Alpine
  Linux console (at least the musl libc version), this is needed to
  avoid a test failure as long as no GUI is active. As soon as you
  start X11, this phenomenon disappears, even on the console. Very
  strange, but also very probably not ksh's fault.

src/cmd/ksh93/tests/shtests:
- Instead of just SIGCONT and SIGPIPE, set all signals to default,
  just to be sure to avoid spurious test failures due to signals
  that were ignored on entry. (It made no difference to the
  aforementioned Alpine Linux test failure, so ignored signals had
  nothing to do with that -- but still a good idea.)

.github/workflows/ci.yml:
- On the GitHub CI runs, when testing with SHOPTs disabled, disable
  SHOPT_SPAWN as well, which tests that everything still works
  correctly with the regular fork(2) method.

COPYRIGHT:
- Remove duplicate of BSD license.
2022-01-25 16:13:15 +00:00
Martijn Dekker
b590a9f155 [shp cleanup 01..20] all the rest (re: 2d3ec8b6)
This combines 20 cleanup commits from the dev branch.

All changed files:
- Clean up pointer defererences to sh.
- Remove shp arguments from functions.

Other notable changes:

src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/sh/init.c:
- On second thought, get rid of the function version of
  sh_getinterp() as libshell ABI compatibility is moot. We've
  already been breaking that by reordering the sh struct, so there
  is no way it's going to work without recompiling.

src/cmd/ksh93/sh/name.c:
- De-obfuscate the relationship between nv_scan() and scanfilter().
  The former just calls the latter as a static function, there's no
  need to do that via a function pointer and void* type conversions.

src/cmd/ksh93/bltins/typeset.c,
src/cmd/ksh93/sh/name.c,
src/cmd/ksh93/sh/nvdisc.c:
- 'struct adata' and 'struct tdata', defined as local struct types
  in these files, need to have their first three fields in common,
  the first being a pointer to sh. This is because scanfilter() in
  name.c accesses these fields via a type conversion. So the sh
  field needed to be removed in all three at the same time.
  TODO: de-obfuscate: good practice definition via a header file.

src/cmd/ksh93/sh/path.c:
- Naming consistency: reserve the path_ function name prefix for
  externs and rename statics with that prefix.
- The default path was sometimes referred to as the standard path.
  To use one term, rename std_path to defpath and onstdpath() to
  ondefpath().
- De-obfuscate SHOPT_PFSH conditional code by only calling
  pf_execve() (was path_pfexecve()) if that is compiled in.

src/cmd/ksh93/include/streval.h,
src/cmd/ksh93/sh/streval.c:
- Rename extern strval() to arith_strval() for consistency.

src/cmd/ksh93/sh/string.c:
- Remove outdated/incorrect isxdigit() fallback; '#ifnded isxdigit'
  is not a correct test as isxdigit() is specified as a function.
  Plus, it's part of C89/C90 which we now require. (re: ac8991e5)

src/cmd/ksh93/sh/suid_exec.c:
- Replace an incorrect reference to shgd->current_pid with
  getpid(); it cannot work as (contrary to its misleading directory
  placement) suid_exec is an independent libast program with no
  link to ksh or libshell at all. However, no one noticed because
  this was in fallback code for ancient systems without
  setreuid(2). Since that standard function was specified in POSIX
  Issue 4 Version 2 from 1994, we should remove that fallback code
  sometime as part of another obsolete code cleanup operation to
  avoid further bit rot. (re: 843b546c)

src/cmd/ksh93/bltins/print.c: genformat():
- Remove preformat[] which was always empty and had no effect.

src/cmd/ksh93/shell.3:
- Minor copy-edit.
- Remove documentation for nonexistent sh.infile_name. A search
  through ast-open-archive[*] reveals this never existed at all.
- Document sh.savexit (== $?).

src/cmd/ksh93/shell.3,
src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/sh/init.c:
- Remove sh.gd/shgd; this is now unused and was never documented
  or exposed in the shell.h public interface.
- sh_sigcheck() was documented in shell.3 as taking no arguments
  whereas in the actual code it took a shp argument. I decided to
  go with the documentation.
- That leaves sh_parse() as the only documented function that still
  takes an shp argument. I'm just going to go ahead and remove it
  for consistency, reverting sh_parse() to its pre-2003 spec.
- Remove undocumented/unused sh_bltin_tree() function which simply
  returned sh.bltin_tree.
- Bump SH_VERSION to 20220106.
2022-01-07 16:16:31 +00:00
Johnothan King
b425196958 Fix ASan heap-buffer-overflow when handling syntax errors (#402)
This commit backports a bugfix from ksh2020 to fix an ASan
heap-buffer-overflow error in one of the regression tests. See:
c57f7398
https://github.com/att/ast/issues/1261

This explanation comes from the linked issue:
> The poplevel() in this block of code is called when lp->lexd.lex_max
> is zero:
> bd94eb56/src/cmd/ksh93/sh/lex.c (L921-L925)
> Since poplevel() first decrements lp->lexd.lex_max then uses it as
> an index into lp->lexd.lex_match this causes the word before the
> start of that buffer to be accessed. The buffer is allocated here:
> bd94eb56/src/cmd/ksh93/sh/lex.c (L2210-L2218)

src/cmd/ksh93/sh/lex.c:
- Avoid calling poplevel() twice when handling syntax errors.
2021-12-28 17:53:35 +00:00
Martijn Dekker
a1f5c99204 INIT: remove proto, ratz (re: 46593a89, 6137b99a); major cleanup
This takes another step towards cleaning up the build system. We
now do not even pretend to be theoretically compatible with
pre-1989 K&R C compilers or with C++ compilers. In practice, this
had already been broken for many years due to bit rot.

Commit 46593a89 already removed the license handling enormity that
depended on proto, so now we can cleanly remove it altogether. But
we do need to leave some backwards compatibility stubs to keep the
build system compatible with older AST code; it should remain
possible to build older ksh versions with the current build system
(the bin/ and src/cmd/INIT/ directories) for testing purposes.

So as of now there is no more __MANGLE__d rubbish in your generated
header files. This is only about a quarter of a century overdue...

This commit also includes a huge amount of code cleanup to remove
thousands of unused K&R C fallbacks and other cruft, particularly
in libast. This code base should now be a little easier to
understand for people who are familiar with a modern(ish) C
standard.

ratz is now also removed; this was a standalone and simplified 2005
version of gunzip. As of 6137b99a, none of our code uses it, even
theoretically. And the real g(un)zip is now everywhere.

src/cmd/INIT/proto.c, src/cmd/INIT/ratz.c:
- Removed.

COPYRIGHT:
- Remove zlib license; this only applied to ratz.

bin/package, src/cmd/INIT/package.sh:
- Related cleanups.
- Unset LC_ALL before invoking a new shell, respecting the user's
  locale again and avoiding multibyte character corruption on the
  command line.

src/cmd/INIT/proto.sh:
- Add stub for backwards compatibility with Mamfiles that depend on
  proto. It does nothing but pass input without modification and is
  now installed as the new arch/*/bin/proto by src/cmd/INIT/Mamfile.

src/cmd/INIT/iffe.sh:
- Ignore the proto-related -e (--package) and -p (--prototyped)
  options; keep parsing them for backwards compatibility.
- Trim the macros passed to every test to their standard C
  versions, removing K&R C and C++ versions. These are now
  considered to be for backwards compatibility only.

src/cmd/INIT/iffe.tst:
- Remove proto(1) mangling code.
  By the way, iffe can be regression-tested as follows:
        $ bin/package use   # set up environment in a child shell
        $ regress src/cmd/INIT/iffe.tst
        $ exit              # leave package environment

src/cmd/INIT/make.probe, src/cmd/INIT/probe.win32:
- Remove code to handle C++.

src/lib/libast/features/common:
- As in iffe.sh above, trim macros designed for compatibility with
  C++ and ancient C compilers to their standard C versions and
  comment that they are for backwards compatibility with AST code.
  This is needed to keep all the old ast and ksh code compiling.

src/cmd/ksh93/sh/init.c,
src/cmd/ksh93/sh/name.c:
- Clarify libshell ABI compatibility function versions of macros.
  A "proto workaround" comment in the original code mislead me into
  thinking this had something to do with the removed proto(1), but
  it's unrelated. Call the workaround macro BYPASS_MACRO instead.

src/cmd/ksh93/include/defs.h:
- sh_sigcheck() macro: allow &sh as an argument: parenthesise shp.

src/cmd/ksh93/sh/nvtype.c:
- Remove unused nv_mkstruct() function. (re: d0a5cab1)

**/features/*:
- Remove obsolete iffe 'set prototyped' option.

**/Mamfile:
- Remove all references to the ast/prototyped.h header.
- Remove all use of the proto command. Simply copy instead.

*** 850-ish source files: ***
- Remove all '#pragma prototyped' directives.
- Remove all C++ compat code conditional upon defined(__cplusplus).
- Remove all use of the _ARG_ macro, which on standard C expands to
  its argument:
        #define _ARG_(x)        x
  (on K&R C, it expanded to nothing)
- Remove all use of _BEGIN_EXTERNS_ and _END_EXTERNS_ macros (empty
  on standard C; this was for C++ compatibility)
- Reduce all #if __STD_C (standard code) #else (K&R code) #endif
  blocks to the standard code only, without use of the macro.
- Same for _STD_ macro which seems to have had the same function.
- Change all instances of 'Void_t' to standard 'void'.
2021-12-24 07:05:22 +00:00
Johnothan King
85199ab351 Backport ksh93v- bugfix for [[ 1<2 ]] (#380)
Strings compared in [[ with the > and < operators should be compared
lexically. This does not work when the strings are single digits, as
the parser interprets it as a syntax error:
   $ [[ 10<2 ]]   # 10 lexically sorts before 2
   $ echo $?
   0
   $ [[ 1<2 ]]
   /usr/bin/ksh: syntax error: `<' unexpected
   $ echo $?
   3

src/cmd/ksh93/sh/lex.c:
- Don't interpret numbers next to > and < as a redirection while
  inside of [[. This bugfix was backported from ksh93v- 2014-06-25.

src/cmd/ksh93/tests/bracket.sh:
- Add regression tests for the > and < operators.
2021-12-17 03:26:41 +01:00
Johnothan King
e54001d58b Various minor capitalization and typo fixes (#371)
This commit fixes various minor typos, punctuation errors and
corrects the capitalization of many names.
2021-12-13 01:49:42 +01:00
Johnothan King
cd562b16e2 Port more shell lint improvements from illumos and ksh93v- (#374)
This commit adds onto <https://github.com/ksh93/ksh/pull/353> by porting
over two additional improvements to the shell linter:

1) The changes in the aforementioned pull request were merged into
   illumos-gate with an additional change.[*] The illumos revision of
   the patch improved the warning for (( $foo = $? )) to specify '$foo'
   causes the warning.[**] Example:
      $ ksh -n -c '(( $? != $bar ))'
      ksh: warning: line 1: in '(( $? != $bar ))', using '$' as in '$bar' is slower and can introduce rounding errors
   While I was porting the illumos patch I did notice one problem. The
   string it uses from paramsub() skips over the initial '{' in
   '${var}', resulting in the warning printing '$var}' instead:
      $ ksh -n -c '(( ${.sh.pid} != $$ ))'
      ...  in '(( ${.sh.pid} != $$ ))', using '$' as in '$.sh.pid}' is slower  ...
   This was fixed by including the missing '{' in the string returned by
   paramsub for ${var} variables.

2) In ksh93v-, parsing x=$((expr)) with the shell linter will cause ksh
   to warn the user x=$((expr)) is slower than ((x=expr)). This
   improvement has been backported with a modified warning:
      # Result from this commit
      $ ksh -n -c 'x=$((1 + 2))'
      ksh: warning: line 1: x=$((1 + 2)) is slower than ((x=1 + 2))
      # Result from ksh93v-
      $ ksh93v -n -c 'x=$((1 + 2))'
      ksh93v: warning: line 1: ((x=1 + 2)) is more efficient than x=$((1 + 2))
   Minor note: the ksh93v- patch had an invalid use of memcmp; this
   version of the patch uses strncmp instead.

References:
be548e87bc
65722363_22fdf8e7/
2021-12-13 01:49:37 +01:00
Martijn Dekker
feedc05037 Reset lexer state on syntax error and on SIGINT (Ctrl+C)
ksh crashed if you pressed Ctrl+C or Ctrl+D on a PS2 prompt while
you haven't finished entering a $(command substitution). It
corrupts subsequent command substitutions. Sometimes the situation
recovers, sometimes the shell crashes.

Simple crash reproducer:

  $ PS1="\$(echo foo) \$(echo bar) \$(echo baz) > "
  foo bar baz > echo $(                        <-- now press Ctrl+D
  > ksh: syntax error: `(' unmatched
  Memory fault

The same happens with Ctrl+C, minus the syntax error message.

The problem is that the lexer state becomes inconsistent when the
lexer is interrupted in the middle of reading a command
substitution of the form $( ... ). This is tracked in the
'lexd.dolparen' variable in the lexer state struct.

Resetting that variable is sufficient to fix this issue. However,
in this commit I prefer to just reinitialise the lexer state
completely to pre-empt any other possible issues. Whether there was
a syntax error or the user pressed Ctrl+C, we just interrupted all
lexing and parsing, so the lexer *should* restart from scratch.

src/cmd/ksh93/sh/fault.c: sh_fault():
- If the shell is in an interactive state (e.g. not a subshell) and
  SIGINT was received, reinitialise the lexer state. This fixes the
  crash with Ctrl+C.

src/cmd/ksh93/sh/lex.c: sh_syntax():
- When handling a syntax error, reset the lexer state. This fixes
  the crash with Ctrl+D.

NEWS:
- Also add the forgotten item for the previous fix (re: 2322f939).
2021-12-09 07:35:12 +01:00
Martijn Dekker
aa3048880b cleanup: get rid of KSHELL and _BLD_shell preprocessor macros
Once upon a time it might have been possible to build certain parts
of ksh, such as the emacs and vi editors and possibly even the
name/value library (nval(3)) as independent libraries. But given
the depressing amount of bit rot in the code that we inherited, I
am certain that disabling either of these macros had been resulting
in a broken build for many years before AT&T abandoned this code
base. These are certainly not going to be useful now.

Meanwhile the KSHELL macro got in the way of me today, because the
Mamfile did not define it for all the .c files, but some headers
declared some functionality conditionally upon that macro. So
including <io.h> in, e.g., nvdisc.c did not declare the same
functions as including that header in files with KSHELL defined.
This inconsistency is now gone as well, for various files.

I'm currently working on making it possible once again to build
libshell as a dynamic library; that should be good enough. And that
never involved disabling either of these macros.
2021-12-09 06:43:22 +01:00
Johnothan King
beccb93fd4 Fix various compiler warnings and minor issues (#362)
List of changes:
- Fixed some -Wuninitialized warnings and removed some unused variables.

- Removed the unused extern for B_login (re: d8eba9d1).

- The libcmd builtins and the vmalloc memfatal function now handle
  memory errors with 'ERROR_SYSTEM|ERROR_PANIC' for consistency with how
  ksh itself handles out of memory errors.

- Added usage of UNREACHABLE() where it was missing from error handling.

- Extend many variables from short to int to prevent overflows (most
  variables involve file descriptors).

- Backported a ksh2020 patch to fix unused value Coverity issues
  (https://github.com/att/ast/pull/740).

- Note in src/cmd/ksh93/README that ksh compiles with Cygwin on
  Windows 10 and Windows 11, albeit with many test failures.

- Add comments to detail some sections of code. Extensive list of
  commits related to this change:
  ca2443b5, 7e7f1372, 2db9953a, 7003aba4, 6f50ff64, b1a41311,
  222515bf, a0dcdeea, 0aa9e03f, 61437b27, 352e68da, 88e8fa67,
  bc8b36fa, 6e515f1d, 017d088c, 035a4cb3, 588a1ff7, 6d63b57d,
  a2f13c19, 794d1c86, ab98ec65, 1026006d

- Removed a lot of dead ifdef code.

- edit/emacs.c: Hide an assignment to avoid a -Wunused warning. (See
  also https://github.com/att/ast/pull/753, which removed the assignment
  because ksh2020 removed the !SHOPT_MULTIBYTE code.)

- sh/nvdisc.c: The sh_newof macro cannot return a null pointer because
  it will instead cause the shell to exit if memory cannot be allocated.
  That makes the if statement here a no-op, so remove it.

- sh/xec.c: Fixed one unused variable warning in sh_funscope().

- sh/xec.c: Remove a fallthrough comment added in commit ed478ab7
  because the TFORK code doesn't fall through (GCC also produces no
  -Wimplicit-fallthrough warning here).

- data/builtins.c: The cd and pwd man pages state that these builtins
  default to -P if PATH_RESOLVE is 'physical', which isn't accurate:
     $ /opt/ast/bin/getconf PATH_RESOLVE
     physical
     $ mkdir /tmp/dir; ln -s /tmp/dir /tmp/sym
     $ cd /tmp/sym
     $ pwd
     /tmp/sym
     $ cd -P /tmp/sym
     $ pwd
     /tmp/dir
  The behavior described by these man pages isn't specified in the ksh
  man page or by POSIX, so to avoid changing these builtin's behavior
  the inaccurate PATH_RESOLVE information has been removed.

- Mamfiles: Preserve multi-line errors by quoting the $x variable.
  This fix was backported from 93v-.
  (See also <a7e9cc82>.)

- sh/subshell.c: Remove set but not used sp->errcontext variable.
2021-12-09 06:42:59 +01:00
Martijn Dekker
520f530198 lex.c: add default, though it should never happen (re: b0282f26)
If a bug is ever introduced that causes a [[ ... ]] operator to be
unhandled by the linter, we should at least avoid writing random
memory contents to standard error. In non-release builds, let's
abort() so the problem can be more easily backtraced.
2021-12-05 19:27:38 +01:00
Johnothan King
6904585f49 Port illumos' shell linter improvements (#353)
This commit ports over two improvements to the shell linter from
illumos (original patch written by Andy Fiddaman). Links to the
relevant bug reports and the original patch:
https://www.illumos.org/issues/13601
https://www.illumos.org/issues/13631
c7b656fc71

The first improvement is to the lint warning for arithmetic
operators in [[ ... ]]. The ksh linter now suggests the correct
equivalent operator to use in ((...)). Example:
  $ ksh -nc '[[ 30 -gt 25 ]]'
  # Original warning
  warning: line 1: -gt within [[ ... ]] obsolete, use ((...))
  # New warning
  warning: line 1: [[ ... -gt ... ]] obsolete, use ((... > ...))

The second improvement pertains to variable expansion in arithmetic
expressions. The ksh linter now suggests referencing variable names
directly:
  $ ksh -nc 'integer foo=40; (($foo < 50 ))'
  # Old warning
  warning: line 1: variable expansion makes arithmetic evaluation less efficient
  # New warning
  warning: line 1: in '(($foo < 50))', using '$' is slower and can introduce rounding errors

src/cmd/ksh93/{data/lexstates,sh/lex,sh/parse}.c:
- Port the improved shell lint warnings from illumos to ksh93u+m.
- The original checks for arithmetic operators involved a bunch of
  if statements with inefficient calls to strcmp(3). These were
  replaced with a more efficient switch statement that avoids
  strcmp.
2021-12-05 19:27:16 +01:00
Johnothan King
370440473e Fix KEYBD trap crash when inputting a command substitution (#355)
This change fixes a crash that can occur after setting a KEYBD trap
then inputting a multi-line command substitution. The crash is
similar to issue #347, but it's easier to reproduce since it
doesn't require you to setup a kshrc file. Reproducer for the
crash:

  $ ENV=/./dev/null ksh
  $ trap : KEYBD
  $ : $(
  > true)
  Memory fault(coredump)

The bugfix was backported (with considerable changes) from ksh93v-
2013-10-08. The crash was first reported on the old mailing list:
https://www.mail-archive.com/ast-users@lists.research.att.com/msg00313.html

src/cmd/ksh93/{include/shlex.h,sh/lex.c}:
- To fix this properly, we need sizeof(Lex_t) to work as expected
  in edit.c, but that is thwarted by the _SHLEX_PRIVATE macro in
  lex.c which shlex.h uses to add private structs to the Lex_t type
  in lex.c only. So get rid of that _SHLEX_PRIVATE macro and make
  those members part of the centrally defined struct, renaming them
  to make it clear they're considered private to lex.c.

src/cmd/ksh93/edit/edit.c:
- Now that we can get its size, save and restore the shell lexing
  context when a KEYBD trap is present.

src/cmd/ksh93/tests/pty.sh:
- Add a regression test for the KEYBD trap crash.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2021-11-30 04:27:31 +01:00
Martijn Dekker
c81473061a test/[: binary operators: fix '<' and add '=~'; some more cleanups
In ksh88, the test/[ built-in supported both the '<' and '>'
lexical sorting comparison operators, same as in [[. However, in
every version of ksh93, '<' does not work though '>' still does!

Still, the code for both is present in test_binop():

src/cmd/ksh93/bltins/test.c
548:		case TEST_SGT:
549:			return(strcoll(left, right)>0);
550:		case TEST_SLT:
551:			return(strcoll(left, right)<0);

Analysis: The binary operators are looked up in shtab_testops[] in
data/testops.c using a macro called sh_lookup, which expands to a
sh_locate() call. If we examine that function in sh/string.c, it's
easy to see that on systems using ASCII (i.e. all except IBM
mainframes), it assumes the table is sorted in ASCII order.

src/cmd/ksh93/sh/string.c
64:	while((c= *tp->sh_name) && (CC_NATIVE!=CC_ASCII || c <= first))

The problem was that the '<' operator was not correctly sorted in
shtab_testops[]; it was sorted immediately before '>', but after
'='. The ASCII order is: < (60), = (61), > (62). This caused '<' to
never be found in the table.

The test_binop() function is also used by [[, yet '<' always worked
in that. This is because the parser has code that directly checks
for '<' and '>' within [[ (in sh/parse.c, lines 1949-1952).

This commit also adds '=~' to 'test', which took three lines of
code and allowed eliminating error handling in test_binop() as
test/[ and [[ now support the same binary ops. (re: fc2d5a60)

src/cmd/ksh93/*/*.[ch]:
- Rename a couple of very misleadingly named macros in test.h:
  . For == and !=, the TEST_PATTERN bit is off for pattern compares
    and on for literal string compares! Rename to TEST_STRCMP.
  . The TEST_BINOP bit does not denote all binary operators, but
    only the logical -a/-o ops in test/[. Rename to TEST_ANDOR.

src/cmd/ksh93/bltins/test.c: test_binop():
- Add support for =~. This is only used by test/[. The method is
  implemented in two lines that convert the ERE to a shell pattern
  by prefixing it with ~(E), then call test_strmatch with that
  temporary string to match the ERE and update ${.sh.match}.
- Since all binary ops from shtab_testops[] are now accounted for,
  remove unknown op error handling from this function.

src/cmd/ksh93/data/testops.c:
- shtab_testops[]:
  . Correctly sort the '<' (TEST_SLT) entry.
  . Remove ']]' (TEST_END). It's not an op and doesn't belong here.
- Update sh_opttest[] documentation with =~, \<, \>.
- Remove now-unused e_unsupported_op[] error message.

src/cmd/ksh93/sh/lex.c: sh_lex():
- Check for ']]' directly instead of relying on the removed
  TEST_END entry from shtab_testops[].

src/cmd/ksh93/tests/bracket.sh:
- Add relevant tests.

src/cmd/ksh93/tests/builtins.sh:
- Fix an old test that globally deleted the 'test' builtin. Delete
  it within the command substitution subshell only.
- Remove the test for non-support of =~ in test/[.
- Update the test for invalid test/[ op to use test directly.
2021-11-14 02:46:34 +01:00
Martijn Dekker
d087b031f0 Fix single quotes in expansion operator string (re: 5ed9ffd6)
The referenced commit introduced the following bug:

> The closing quote does not appear to be registering during the
> parse of the following:
>
>	echo ${var:+'{}'}
>
> Within a script, this will result in:
>
>	syntax error at line 1: `'' unmatched

src/cmd/ksh93/data/lexstates.c,
src/cmd/ksh93/include/lexstates.h:
- Add new ST_MOD1 state table that is a copy of ST_QUOTE, but adds
  a special meaning (ST_LIT) for the single quote (position 39).

src/cmd/ksh93/sh/lex.c: sh_lex():
- For parameter expansion operators with old-style quoting
  (S_MOD1), use the new ST_MOD1 state table instead of ST_QUOTE.
  This causes single quotes within them to be processed properly.

src/cmd/ksh93/tests/quoting2.sh:
- Add tests.

Thanks to @gkamat for the bug report.
Resolves: https://github.com/ksh93/ksh/issues/290
2021-04-30 05:28:21 +01:00
Martijn Dekker
2aad3cab06 Add ksh 93u+m contributors notice to 964 copyright headers 2021-04-26 00:19:31 +01:00
Martijn Dekker
b7dde4e747 Fix ksh exit on syntax error in profile (re: cb67a01b, ceb77b13)
Johnothan King writes:
> There are two regressions related to how ksh handles syntax
> errors in the .kshrc file. If ~/.kshrc or the file pointed to by
> $ENV have a syntax error, ksh exits during startup. Additionally,
> the error message printed is incorrect:
>
> $ cat /tmp/synerror
> ((
> echo foo
>
> # ksh93u+m
> $ ENV=/tmp/synerror arch/*/bin/ksh -ic 'echo ${.sh.version}'
> /tmp/synerror: syntax error: `/t/tmp/synerror' unmatched
>
> # ksh93u+
> $ ENV=/tmp/synerror ksh93u -ic 'echo ${.sh.version}'
> /tmp/synerror: syntax error: `(' unmatched
> Version AJM 93u+ 2012-08-01
>
> The regression that causes the incorrect error message was
> introduced by commit cb67a01. The other bug that causes ksh to
> exit on startup was introduced by commit ceb77b1.

src/cmd/ksh93/sh/lex.c: fmttoken():
- Call stakfreeze(0) to terminate a possible unterminated previous
  stack item before writing the token string onto the stack. This
  fixes the bug with garbage in a syntax error message.

src/cmd/ksh93/sh/main.c: exfile():
- Revert Red Hat's ksh-20140801-diskfull.patch applied in ceb77b13.
  This fixes the bug with interactive ksh exiting on syntax error
  in a profile script. Testing by @JohnoKing showed the patch is no
  longer necessary to fix a login crash on disk full, as commit
  970069a6 (which applied Red Hat patches ksh-20120801-macro.patch
  and ksh-20120801-fd2lost.patch) also fixes that crash.

src/cmd/ksh93/README:
- Fix typos. (re: fdc08b23)

Co-authored-by: Johnothan King <johnothanking@protonmail.com>
Resolves: https://github.com/ksh93/ksh/issues/281
2021-04-21 19:42:24 +01:00
Martijn Dekker
cb67a01b45 lex.c: simplify fmttoken() by using the stack (re: 3255aed2)
Using the stack makes it impossible for future buffer overflows to
occur. It also simplifies fmttoken() by eliminating the need to
declare a local buffer and pass a pointer to that as an argument.

For info: man src/lib/libast/man/stak.3
2021-04-09 17:36:29 +01:00
hyenias
3255aed2c4
lex.c: Fix buffer overflow in debug sh_lex and sh_syntax (#262)
fmttoken() needs a minimal char[4] token buffer passed to it.

Originally reported by: Jakub Wilk <jwilk@jwilk.net>
Original bug report: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=879464

The following code lines from fmttoken() yield a n=3 for SYMSEMI as
n=1 from the start, e.g. 'for <>;'.

        case SYMSEMI:
                if(tok[0]=='<')
                        tok[n++] = '>';
                sym = ';';
                break;
        default:
                sym = 0;
        }
        tok[n++] = sym;
}
tok[n] = 0;

n[0]='<'
n[1]='>'
n[2]=';'
n[3]=0 # <-- BUFFER overflow as the passed character buffers have a size of 3

src/cmd/ksh93/sh/lex.c:
- DBUG: sh_lex(): Adjust char tokstr[3] to char tokstr[4]
- sh_syntax(): Adjust char tokbuf[3] to char tokbuf[4]
2021-04-09 02:47:21 +01:00
Johnothan King
a065558291
Fix more compiler warnings, typos and other minor issues (#260)
Many of these changes are minor typo fixes. The other changes
(which are mostly compiler warning fixes) are:

NEWS:
- The --globcasedetect shell option works on older Linux kernels
  when used with FAT32/VFAT file systems, so remove the note about
  it only working with 5.2+ kernels.

src/cmd/ksh93/COMPATIBILITY:
- Update the documentation on function scoping with an addition
  from ksh93v- (this does apply to ksh93u+).

src/cmd/ksh93/edit/emacs.c:
- Check for '_AST_ksh_release', not 'AST_ksh_release'.

src/cmd/INIT/mamake.c,
src/cmd/INIT/ratz.c,
src/cmd/INIT/release.c,
src/cmd/builtin/pty.c:
- Add more uses of UNREACHABLE() and noreturn, this time for the
  build system and pty.

src/cmd/builtin/pty.c,
src/cmd/builtin/array.c,
src/cmd/ksh93/sh/name.c,
src/cmd/ksh93/sh/nvtype.c,
src/cmd/ksh93/sh/suid_exec.c:
- Fix six -Wunused-variable warnings (the name.c nv_arrayptr()
  fixes are also in ksh93v-).
- Remove the unused 'tableval' function to fix a -Wunused-function
  warning.

src/cmd/ksh93/sh/lex.c:
- Remove unused 'SHOPT_DOS' code, which isn't enabled anywhere.
  https://github.com/att/ast/issues/272#issuecomment-354363112

src/cmd/ksh93/bltins/misc.c,
src/cmd/ksh93/bltins/trap.c,
src/cmd/ksh93/bltins/typeset.c:
- Add dictionary generator function declarations for former
  aliases that are now builtins (re: 1fbbeaa1, ef1621c1, 3ba4900e).
- For consistency with the rest of the codebase, use '(void)'
  instead of '()' for print_cpu_times.

src/cmd/ksh93/sh/init.c,
src/lib/libast/path/pathshell.c:
- Move the otherwise unused EXE macro to pathshell() and only
  search for 'sh.exe' on Windows.

src/cmd/ksh93/sh/xec.c,
src/lib/libast/include/ast.h:
- Add an empty definition for inline when compiling with C89.
  This allows the timeval_to_double() function to be inlined.

src/cmd/ksh93/include/shlex.h:
- Remove the unused 'PIPESYM2' macro.

src/cmd/ksh93/tests/pty.sh:
- Add '# err_exit #' to count the regression test added in
  commit 113a9392.

src/lib/libast/disc/sfdcdio.c:
- Move diordwr, dioread, diowrite and dioexcept behind
  '#ifdef F_DIOINFO' to fix one -Wunused-variable warning and
  multiple -Wunused-function warnings (sfdcdio() only uses these
  functions when F_DIOINFO is defined).

src/lib/libast/string/fmtdev.c:
- Fix two -Wimplicit-function-declaration warnings on Linux by
  including sys/sysmacros.h in fmtdev().
2021-04-08 19:58:07 +01:00
Johnothan King
c4f980eb29
Introduce usage of __builtin_unreachable() and noreturn (#248)
This commit adds an UNREACHABLE() macro that expands to either the
__builtin_unreachable() compiler builtin (for release builds) or
abort(3) (for development builds). This is used to mark code paths
that are never to be reached.

It also adds the 'noreturn' attribute to functions that never
return: path_exec(), sh_done() and sh_syntax(). The UNREACHABLE()
macro is not added after calling these.

The purpose of these is:
* to slightly improve GCC/Clang compiler optimizations;
* to fix a few compiler warnings;
* to add code clarity.

Changes of note:

src/cmd/ksh93/sh/io.c: outexcept():
- Avoid using __builtin_unreachable() here since errormsg can
  return despite using ERROR_system(1), as shp->jmplist->mode is
  temporarily set to 0. See: https://github.com/att/ast/issues/1336

src/cmd/ksh93/tests/io.sh:
- Add a regression test for the ksh2020 bug referenced above.

src/lib/libast/features/common:
- Detect the existence of either the C11 stdnoreturn.h header or
  the GCC noreturn attribute, preferring the former when available.
- Test for the existence of __builtin_unreachable(). Use it for
  release builds. On development builds, use abort() instead, which
  crahses reliably for debugging when unreachable code is reached.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2021-04-05 00:28:24 +01:00
Johnothan King
ed478ab7e3
Fix many GCC -Wimplicit-fallthrough warnings (#243)
This commit adds '/* FALLTHROUGH */' comments to fix many
GCC warnings when compiling with -Wimplicit-fallthrough.
Additionally, the existing fallthrough comments have been
changed for consistency.
2021-03-30 21:49:20 +01:00
Johnothan King
814b5c6890
Fix various minor problems and update the documentation (#237)
These are minor fixes I've accumulated over time. The following
changes are somewhat notable:

- Added a missing entry for 'typeset -s' to the man page.
- Add strftime(3) to the 'see also' section. This and the date(1)
  addition are meant to add onto the documentation for 'printf %T'.
- Removed the man page the entry for ksh reading $PWD/.profile on
  login. That feature was removed in commit aa7713c2.
- Added date(1) to the 'see also' section of the man page.
- Note that the 'hash' command can be used instead of 'alias -t' to
  workaround one of the caveats listed in the man page.
- Use an 'out of memory' error message rather than 'out of space'
  when memory allocation fails.
- Replaced backticks with quotes in some places for consistency.
- Added missing documentation for the %P date format.
- Added missing documentation for the printf %Q and %p formats
  (backported from ksh2020: https://github.com/att/ast/pull/1032).
- The comments that show each builtin's options have been updated.
2021-03-21 14:39:03 +00:00
Martijn Dekker
0b814b53bd Remove more legacy libast code (re: f9c127e3, 651bbd56)
This removes #ifdefs checking for the existence of
SH_PLUGIN_VERSION (version check for dynamically loaded builtins)
and the SFIO identifiers SF_BUFCONST, SF_CLOSING, SF_APPENDWR,
SF_ATEXIT, all of which are defined by the bundled libast.
2021-03-21 06:39:32 +00:00
Martijn Dekker
f8f2c4b608 Remove obsolete quote balancing hack
The old Bourne shell failed to check for closing quotes and command
substitution backticks when encountering end-of-file in a parser
context (such as a script). ksh93 implemented a hack for partial
compatibility with this bug, tolerating unbalanced quotes and
backticks in backtick command subsitutions, 'eval', and command
line invocation '-c' scripts only.

This hack became broken for backtick command substitutions in
fe20311f/350b52ea as a memory leak was fixed by adding a newline to
the stack at the end of the command substitution. That extra
newline becomes part of any string whose quotes are not properly
terminated, causing problems such as the one detailed here:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01889.html

    $ touch abc
    $ echo `ls "abc`
    ls: abc
    : not found

No other fix for the memory leak is known that doesn't cause other
problems. (The alternative fix detailed in the referenced mailing
list post causes a different corner-case regression.)

Besides, the hack has always caused other corner case bugs as well:

	$ ksh -c '((i++'
Actual:	ksh: i++(: not found
	(If an external command 'i++(' existed, it would be run)
Expect:	ksh: syntax error at line 1: `(' unmatched

	$ ksh -c 'i=0; echo $((++i'
Actual:	(empty line; the arithmetic expansion is ignored)
Expect:	ksh: syntax error at line 1: `(' unmatched

	$ ksh -c 'echo $(echo "hi)'
Actual:	ksh: syntax error at line 1: `(' unmatched
Expect: ksh: syntax error at line 1: `"' unmatched

So, it's time to get rid of this hack. The old Bourne shell is
dead and buried. No other shell tries to support this breakage.
Tolerating syntax errors is just asking for strange side effects,
inconsistent states, and corner case bugs. We should not want to do
that. Old scripts that rely on this will just need to be fixed.

src/cmd/ksh93/sh/lex.c:
- struct lexdata: Remove 'char balance' member for remembering an
  unbalanced quote or backtick.
- sh_lex(): Remove the back to remember and compensate for
  unbalanced quotes/backticks that was executed only if we were
  executing a script from a string, as opposed to a file.

src/cmd/ksh93/COMPATIBILITY:
- Note the change.

Resolves: https://github.com/ksh93/ksh/issues/199
2021-03-05 22:17:14 +00:00
Johnothan King
7ad274f8b6
Add more out of memory checks (re: 18529b88) (#192)
The referenced commit neglected to add checks for strdup() calls.
That calls malloc() as well, and is used a lot.

This commit switches to another strategy: it adds wrapper functions
for all the allocation macros that check if the allocation
succeeded, so those checks don't need to be done manually.

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/init.c:
- Add sh_malloc(), sh_realloc(), sh_calloc(), sh_strdup(),
  sh_memdup() wrapper functions with success checks. Call nospace()
  to error out if allocation fails.
- Update new_of() macro to use sh_malloc().
- Define new sh_newof() macro to replace newof(); it uses
  sh_realloc().

All other changed files:
- Replace the relevant calls with the wrappers.
- Remove now-redundant success checks from 18529b88.
- The ERROR_PANIC error message calls are updated to inclusive-or
  ERROR_SYSTEM into the exit code argument, so libast's error()
  appends the human-readable version of errno in square brackets.
  See src/lib/libast/man/error.3

src/cmd/ksh93/edit/history.c:
- Include "defs.h" to get access to the wrappers even if KSHELL is
  not defined.
- Since we're here, fix a compile error that occurred with KSHELL
  undefined by updating the type definition of hist_fname[] to
  match that of history.h.

src/cmd/ksh93/bltins/enum.c:
- To get access to sh_newof(), include "defs.h" instead of
  <shell.h> (note that "defs.h" includes <shell.h> itself).

src/cmd/ksh93/Mamfile:
- enum.c: depend on defs.h instead of shell.h.
- enum.o: add an -I. flag in the compiler invocation so that defs.h
  can find its subsequent includes.

src/cmd/builtin/pty.c:
- Define one outofmemory() function and call that instead of
  repeating the error message call.
- outofmemory() never returns, so remove superfluous exit handling.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2021-02-27 21:21:58 +00:00
Martijn Dekker
18529b88c6 Add lots of checks for out of memory (re: 0ce0b671)
Huge typeset -L/-R adjustment length values were still causing
crashses on sytems with not enough memory. They should error out
gracefully instead of crashing.

This commit adds out of memory checks to all malloc/calloc/realloc
calls that didn't have them (which is all but two or three).

The stkalloc/stakalloc calls don't need the checks; it has
automatic checking, which is done by passing a pointer to the
outofspace() function to the stakinstall() call in init.c.

src/lib/libast/include/error.h:
- Change the ERROR_PANIC exit status value from ERROR_LEVEL (255)
  to 77, which is what it is supposed to be according to the libast
  error.3 manual page. Exit statuses > 128 for anything else than
  signals are not POSIX compliant and may cause misbehaviour.

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/init.c:
- To facilitate consistency, add a simple extern sh_outofmemory()
  function that throws an ERROR_PANIC "out of memory".

src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/data/builtins.c:
- Remove now-redundant e_nospace[] extern message; it is now only
  used in one place so it might as well be a string literal in
  sh_outofmemory().

All other changed files:
- Verify the result of all malloc/calloc/realloc calls and call
  sh_outofmemory() if they fail.
2021-02-21 22:27:28 +00:00
Martijn Dekker
e37aa358bf Fix BUG_CASEEMPT: empty 'case' list was syntax error
'case x in esac' should be syntactically correct, but was an error:

	$ ksh -c 'case x in esac'
	ksh: syntax error at line 1: `case' unmatched

Inserting a newline was a workaround:

	$ ksh -c $'case x in\nesac'
	(no output)

The problem was that the 'esac' reserved word was not being
recognised if it immediately followed the 'in' reserved word.

src/cmd/ksh93/sh/lex.c: sh_lex():
- Do not turn off recognition of reserved words after 'in' if we're
  in a 'case' construct; only do this for 'for' and 'select'.

src/cmd/ksh93/tests/case.sh:
- Add seven regression test for correct recognition of 'esac'.
  Only two failed on ksh93. The rest is to catch future bugs.

Fixes: https://github.com/ksh93/ksh/issues/177
2021-02-16 06:50:12 +00:00
Martijn Dekker
2182ecfa08 Fix compile/regress fails on compiling without SHOPT_* options
Many compile-time options were broken so that they could not be
turned off without causing compile errors and/or regression test
failures. This commit now allows the following to be disabled:

SHOPT_2DMATCH    # two dimensional ${.sh.match} for ${var//pat/str}
SHOPT_BGX        # one SIGCHLD trap per completed job
SHOPT_BRACEPAT   # C-shell {...,...} expansions (, required)
SHOPT_ESH        # emacs/gmacs edit mode
SHOPT_HISTEXPAND # csh-style history file expansions
SHOPT_MULTIBYTE  # multibyte character handling
SHOPT_NAMESPACE  # allow namespaces
SHOPT_STATS      # add .sh.stats variable
SHOPT_VSH        # vi edit mode

The following still break ksh when disabled:

SHOPT_FIXEDARRAY # fixed dimension indexed array
SHOPT_RAWONLY    # make viraw the only vi mode
SHOPT_TYPEDEF    # enable typeset type definitions

Compiling without SHOPT_RAWONLY just gives four regression test
failures in pty.sh, but turning off SHOPT_FIXEDARRAY and
SHOPT_TYPEDEF causes compilation to fail. I've managed to tweak the
code to make it compile without those two options, but then dozens
of regression test failures occur, often in things nothing directly
to do with those options. It looks like the separation between the
code for these options and the rest was never properly maintained.
Making it possible to disable SHOPT_FIXEDARRAY and SHOPT_TYPEDEF
may involve major refactoring and testing and may not be worth it.

This commit has far too many tweaks to list. Notables fixes are:

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/data/options.c:
- Do not compile in the shell options and documentation for
  disabled features (braceexpand, emacs/gmacs, vi/viraw), so the
  shell is not left with no-op options and inaccurate self-doc.

src/cmd/ksh93/data/lexstates.c:
- Comment the state tables to associte them with their IDs.
- In the ST_MACRO table (sh_lexstate9[]), do not make the S_BRACE
  state for position 123 (ASCII for '{') conditional upon
  SHOPT_BRACEPAT (brace expansion), otherwise disabling this causes
  glob patterns of the form {3}(x) (matching 3 x'es) to stop
  working as well -- and that is ksh globbing, not brace expansion.

src/cmd/ksh93/edit/edit.c: ed_read():
- Fixed a bug: SIGWINCH was not handled by the gmacs edit mode.

src/cmd/ksh93/sh/name.c: nv_putval():
- The -L/-R left/right adjustment options to typeset do not count
  zero-width characters. This is the behaviour with SHOPT_MULTIBYTE
  enabled, regardless of locale. Of course, what a zero-width
  character is depends on the locale, but control characters are
  always considered zero-width. So, to avoid a regression, add some
  fallback code for non-SHOPT_MULTIBYTE builds that skips ASCII
  control characters (as per iscntrl(3)) so they are still
  considered to have zero width.

src/cmd/ksh93/tests/shtests:
- Export the SHOPT_* macros from SHOPT.sh to the tests as
  environment variables, so the tests can check for them and decide
  whether or how to run tests based on the compile-time options
  that the tested binary was presumably compiled with.
- Do not run the C.UTF-8 tests if SHOPT_MULTIBYTE is not enabled.

src/cmd/ksh93/tests/*.sh:
- Add a bunch of checks for SHOPT_* env vars. Since most should
  have a value 0 (off) or 1 (on), the form ((SHOPT_FOO)) is a
  convenient way to use them as arithmetic booleans.

.github/workflows/ci.yml:
- Make GitHub do more testing: run two locale tests (Dutch and
  Japanese UTF-8 locales), then disable all the SHOPTs that we can
  currently disable, recompile ksh, and run the tests again.
2021-02-08 22:02:45 +00:00
Martijn Dekker
f9427909dc Make redirections like {varname}>file work with brace expansion off
This is some nonsense: redirections that store a file descriptor
greater than 9 in a variable, like {var}<&2 and the like, stopped
working if brace expansion was turned off. '{var}' is not a brace
expansion as it doesn't contain ',' or '..'; something like 'echo
{var}' is always output unexpanded. And redirections and brace
expansion are two completely unrelated things. It wasn't documented
that these redirections require the -B/braceexpand option, either.

src/cmd/ksh93/sh/lex.c: sh_lex():
- Remove incorrect check for braceexpand option before processing
  redirections of this form.

src/cmd/ksh93/COMPATIBILITY:
- Insert a brief item mentioning this.

src/cmd/ksh93/sh.1:
- Correction: these redirections do not yield a file descriptor >
  10, but > 9, a.k.a. >= 10.
- Add a brief example showing how these redirections can be used.

src/cmd/ksh93/tests/io.sh:
- Add a quick regression test.
2021-02-05 05:08:39 +00:00
Martijn Dekker
674a0c3559 Another lexical fix for here-documents (re: 6e515f1d)
OpenSUSE patch from:
https://build.opensuse.org/package/view_file/shells/ksh/ksh93-heredoc.dif

src/cmd/ksh93/sh/lex.c: here_copy():
- Do not potentially seek back a zero or negative length.
2021-01-28 06:32:42 +00:00
Martijn Dekker
bd283959be Fix lexing of 'case' in do...done in a $(comsub) (rhbz#1241013)
The following caused a spurious syntax error:

$ x=$(for i in 1; do case $i in word) true;; esac; done)
-ksh: syntax error: `;;' unexpected

Prior discussion:
https://bugzilla.redhat.com/1241013

Original patch, backported from 93v- beta, applied without change:
642af4d6/f/ksh-20120801-parserfix.patch
2020-09-27 21:26:09 +02:00
Martijn Dekker
843b546c1a rm redundant getpid(2) syscalls (re: 9de65210)
Now that we have ${.sh.pid} a.k.a. shgd->current_pid, which is
updated using getpid() whenever forking a new process, there is no
need for anything else to ever call getpid(); we can use the stored
value instead. There were a lot of these syscalls kicking around,
some of them in performance-sensitive places.

The following lists only changes *other* than changing getpid() to
shgd->currentpid.

src/cmd/ksh93/include/defs.h:
- Comments: clarify what shgd->{pid,ppid,current_pid} are for.

src/cmd/ksh93/sh/main.c,
src/cmd/ksh93/sh/init.c:
- On reinit for a new script, update shgd->{pid,ppid,current_pid}
  in the sh_reinit() function itself instead of calling sh_reinit()
  from sh_main() and then updating those immediately after that
  call. It just makes more sense this way. Nothing else ever calls
  sh_reinit() so there are no side effects.

src/cmd/ksh93/sh/xec.c: _sh_fork():
- Update shgd->current_pid in the child early, so that the rest of
  the function can use it instead of calling getpid() again.
- Remove reassignment of SH_PIDNOD->nvalue.lp value pointer to
  shgd->current_pid (which makes ${.sh.pid} work in the shell).
  It's constant and was already set on init.
2020-09-23 04:19:02 +02:00
Martijn Dekker
fe20311fe9 Fix command substitution memory leaks (rhbz#982142)
This fixes two memory leaks in old-style command substitutions
(one when invoking an alias, one when invoking an autoloaded
function), as well as a possible third leak with an unknown
reproducer, by applying this Red Hat patch:
642af4d6/f/ksh-20120801-mlikfiks.patch

src/cmd/ksh93/sh/macro.c: comsubst():
- For as-yet unknown reasons, the alias leak did not occur when
  adding a space at the end of the command substitution, as in
  a=`some_alias `. This fix is a workaround that simply writes
  an extra space to the stack. TODO: a real fix.

src/cmd/ksh93/sh/path.c: funload():
- Add missing free() before return. This fixes the leak with
  autoloaded functions.

src/cmd/ksh93/sh/lex.c: alias_exceptf():
- This function is called "whenever an end of string is found with
  alias". This adds a check for an SF_FINAL stream status flag when
  deciding whether to call free(). In sfio.h this is commented as:
      #define SF_FINAL 11 /* closing is done except stream free */
  When I revert this change, none of the regression tests fail, so
  I don't know how to trigger this supposed leak. But it makes some
  sense given the sfio.h comment, so I'll keep it.

src/cmd/ksh93/tests/leaks.sh:
- Add the reproducers from rhbz#982142 as regression tests
  (including an extra one for nested command substitutions that was
  already fixed as of 93u+, but testing is good).
     I replaced the external 'expr' and 'ls' commands by uses of
  the 'true' builtin, otherwise the tests take far too long to run
  with 16384 iterations. At least the alias leak was still behaving
  identically after replacing 'ls' by 'true'.
2020-09-21 00:36:36 +02:00
Martijn Dekker
9f2066f146 Improve fix for parentheses in param expansions (re: 5ed9ffd6)
The fix was incomplete: expansions using '?' (${var?w(ord},
${var:?wo)rd}) still did not tolerate parentheses in the word
as regular characters.

It was also possible to simplify the fix by making use of the
ST_BRACE (sh_lexstate7[]) state table. See data/lexstates.c and
include/lexstates.h.

src/cmd/ksh93/sh/lex.c: sh_lex(): case S_MOD1:
- The previous fix tested for modifier operator characters : - + =
  as part of the S_MOD2 case, though they are defined as S_MOD1 in
  the ST_BRACE state table. It only worked because of the
  fallthrough. And it turns out the S_MOD1 case already had a
  similar fix, though incomplete. The new fix effectively cancelled
  the old one out as any S_MOD1 character eventually led to
  'continue'. So it can be simplified by removing most of that
  code, without causing any change in behaviour. Only the mode
  change to the ST_QUOTE state table followed by 'continue' is
  necessary. This also fixes it for the '?' operator as that is
  also defined as S_MOD1 in the ST_BRACE state table.

src/cmd/ksh93/sh/macro.c:
- When skipping a ${...} expansion using sh_lexskip(), use the
  ST_QUOTE state table if the character c is an S_MOD1 modifier
  operator character. This makes it consistent with the S_MOD1
  handling in sh_lex().

src/cmd/ksh93/tests/variables.sh:
- Update regression tests to include ? and :? operators.
2020-09-13 10:15:26 +02:00
Martijn Dekker
5ed9ffd6c4 This fixes erroneous syntax errors in parameter expansions such as
${var:-wor)d} or ${var+w(ord}. The parentheses now correctly lose
their normal grammatical meaning within the braces. Fix by Eric
Scrivner (@etscrivner) from July 2018 backported from ksh2020.

This fix complies with POSIX:
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_02

src/cmd/ksh93/sh/lex.c: sh_lex():
- Set the ST_QUOTE state when analysing a modifier with parameter
  expansions using operators ':', '-', '+', '='. This state causes
  subsequent characters (including parentheses) to be considered
  quoted, suppressing their normal grammatical meaning.

src/cmd/ksh93/sh/macro.c: varsub():
- Same for skipping the expansion.

Fixes: https://github.com/ksh93/ksh/issues/126
Prior discussion: https://github.com/att/ast/issues/475
2020-09-05 16:20:22 +02:00
Martijn Dekker
f9c127e39e Remove legacy code for older libast versions
Since ksh 93u+m comes bundled with libast 20111111, there's no need
to support older versions, so this is another cleanup opportunity.

src/cmd/ksh93/include/defs.h:
- Throw an #error if AST_VERSION is undefined or < 20111111.
  (Note that _AST_VERSION is the same as AST_VERSION, but the
  latter is newer and preferred; see src/lib/libast/features/api)

All other changed files:
- Remove legacy code for versions older than the currently used
  versions, which are:
  _AST_VERSION    20111111
  ERROR_VERSION   20100309
  GLOB_VERSION    20060717
  OPT_VERSION     20070319
  SFIO_VERSION    20090915
  VMALLOC_VERSION 20110808
2020-09-04 02:31:39 +02:00
Martijn Dekker
c607c48c84 Revert <> redir FD except in posix mode (re: eeee77ed, 60516872)
eeee77ed implemented a POSIX compliance fix that caused a potential
incompatibility with existing ksh scripts; it made the (rarely
used) read/write redirection operator, <>, default to file
descriptor 0 (standard input) as POSIX specified, instead of 1
(standard output) which is traditional ksh93 behaviour. So ksh
scripts needed to change all <> to 1<> to override the new default.

This commit reverts that change, except in the new posix mode.

src/cmd/ksh93/sh/lex.c:
- Make FD for <> default to 0 in POSIX mode, 1 otherwise.

src/cmd/ksh93/tests/io.sh:
- Revert <> regression test changes from 60516872; we no longer
  need 1<> instead of <> in ksh code.
2020-09-01 08:48:18 +01:00
Martijn Dekker
921bbcaeb7 Remove SHOPT_BASH; keep &> redir operator, '-o posix' option
On 16 June there was a call for volunteers to fix the bash
compatibility mode; it has never successfully compiled in 93u+.
Since no one showed up, it is now removed due to lack of interest.

A couple of things are kept, which are now globally enabled:

1. The &>file redirection shorthand (for >file 2>&1). As a matter
   of fact, ksh93 already supported this natively, but only while
   running rc/profile/login scripts, and it issued a warning. This
   makse it globally available and removes the warning, bringing
   ksh93 in line with mksh, bash and zsh.

2. The '-o posix' standard compliance option. It is now enabled on
   startup if ksh is invoked as 'sh' or if the POSIXLY_CORRECT
   variable exists in the environment. To begin with, it disables
   the aforementioned &> redirection shorthand. Further compliance
   tweaks will be added in subsequent commits. The differences will
   be fairly minimal as ksh93 is mostly compliant already.

In all changed files, code was removed that was compiled (more
precisely, failed to compile/link) if the SHOPT_BASH preprocessor
identifier was defined. Below are other changes worth mentioning:

src/cmd/ksh93/sh/bash.c,
src/cmd/ksh93/data/bash_pre_rc.sh:
- Removed.

src/cmd/ksh93/data/lexstates.c,
src/cmd/ksh93/include/shlex.h,
src/cmd/ksh93/sh/lex.c:
- Globally enable &> redirection operator if SH_POSIX not active.
- Remove warning that was issued when &> was used in rc scripts.

src/cmd/ksh93/data/options.c,
src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/args.c:
- Keep SH_POSIX option (-o posix).
- Replace SH_TYPE_BASH shell type by SH_TYPE_POSIX.

src/cmd/ksh93/sh/init.c:
- sh_type(): Return SH_TYPE_POSIX shell type if ksh was invoked
  as sh (or rsh, restricted sh).
- sh_init(): Enable posix option if the SH_TYPE_POSIX shell type
  was detected, or if the CONFORMANCE ast config variable was set
  to "standard" (which libast sets on init if POSIXLY_CORRECT
  exists in the environment).

src/cmd/ksh93/tests/options.sh,
src/cmd/ksh93/tests/io.sh:
- Replace regression tests for &> and move to io.sh. Since &> is
  now for general use, no longer test in an rc script, and don't
  check that a warning is issued.

Closes: #9
Progresses: #20
2020-09-01 06:19:19 +01:00
Martijn Dekker
f8feed1bd2 SHOPT_MULTIBYTE-related cleanup (re: 8477d2ce)
As of 8477d2ce, the mbwide() macro (which tests if we're in a
multibyte locale, i.e. UTF-8) is redefined as a constant 0 if we're
compiling without SHOPT_MULTIBYTE. See src/cmd/ksh93/include/defs.h

The other multibyte macros use mbwide() as well, so they all revert
to the single-byte fallbacks in that case, and the multibyte code
in them is never compiled. See src/lib/libast/include/ast.h

Consequently we can now do a bit of cleanup and get rid of many of
the '#if SHOPT_MULTIBYTE' directives, as the compiler optimiser
will happily remove the multibyte-specific code. This increases the
legibility of the ksh code.

I'm taking the opportunity to fix a few typos and whitespace
formatting glitches as well.
2020-08-30 04:50:57 +01:00
Johnothan King
6e515f1d45
Fix command substitutions run on the same line as a here-doc (#91)
When a command substitution is run on the same line as a here-document,
a syntax error occurs due to a regression introduced in ksh93u+ 2011-04-15:

true << EOF; true $(true)
EOF
syntax error at line 1: `<<EOF' here-document not contained within command substitution

The regression is caused by an error check that was added to make
the following script causes a syntax error (because the here-document
isn't completed inside of the command substitution):

$(true << EOF)
EOF

src/cmd/ksh93/sh/lex.c:
- Only throw an error when a here-document in a command substitution
  isn't completed inside of the command substitution.

src/cmd/ksh93/tests/heredoc.sh:
- Add a regression test for running a command substitution on the
  same line as a here-document.
- Add a missed regression test for using here-documents in command
  substitutions. This is the original bug that was fixed in ksh93u+
  2011-04-15 (it is why the error message was added), but a regression
  test for here-documents in command substitutions wasn't added in
  that version.

This bugfix was backported from ksh93v- 2013-10-10-alpha.
2020-07-24 00:03:57 +01:00
Johnothan King
db1d539d49
Fix ERE repetition expressions in [[ ... =~ ERE{x,y} ]] (#54)
Regular expressions that combine a repetition expression with
a parenthesized sub-expression throw a garbled syntax error:

$ [[ AATAAT =~ (AAT){2} ]]
ksh: syntax error: `~(E)(AAT){2} ]]
:'%Cred%h%Creseksh: syntax error: `~(E)(AAT){2} ]]
:'%Cred%h%Creseksh: syntax' unexpected

The syntax error occurs because ksh is not fully
accounting for '=~' when it runs into a curly bracket.
This fix disables the syntax error when the operator
is '=~' and adds handling for '(str){x}' (to allow for
more than one sub-expression). This bugfix and the
regression tests for it were backported from ksh93v-
2014-12-24-beta.

src/cmd/ksh93/sh/lex.c:
- Do not trigger a syntax error for '{x}' when the operator
  is '=~' and add handling for multiple parentheses when
  combined with '{x}'.

src/cmd/ksh93/tests/bracket.sh:
- Add two tests from ksh93v- to test sub-expressions
  combined with the '{x}' quantifier.
2020-07-02 18:40:15 +01:00
Martijn Dekker
eeee77edd1 Fix BUG_REDIRIO
ksh used to redirect standard output by default when no file
descriptor was specified with the rarely used '<>' reading/writing
redirection operator. It now redirects standard input by default,
as POSIX specifies and as all other POSIX shells do. To redirect
standard output for reading and writing, you now need '1<>'.

Ref.: https://github.com/att/ast/issues/75
      http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_07_07

(cherry picked from commit 29afc16c47824fc79ed092ae7704c525b1db6a0a)
2020-06-12 01:45:13 +02:00
Martijn Dekker
39a14c1000 Fix 80 typos in comments
(cherry picked from commit 7dca902a85dc02e5df66f0f45a00d1575e7a0220)
2020-06-12 01:45:12 +02:00