1
0
Fork 0
mirror of git://git.code.sf.net/p/cdesktopenv/code synced 2025-03-09 15:50:02 +00:00
Commit graph

57 commits

Author SHA1 Message Date
Martijn Dekker
3ce064bbba lex.c: endword(): fix out-of-bounds index to state table
The lexer use 256-byte state tables (see data/lexstates.c), one
byte per possible value for the (unsigned) char type. But the sp
variable used as an index to a state table in loops like this...
                while((n = state[*sp++]) == 0)
                        ;
...is a char*, a pointer to a char. The C standard does not define
if the char type is signed or not (!). On clang and gcc, it is
signed. That means that, whenever a single-byte, high-bit (> 127)
character is encountered, the value wraps around to negative, and a
read occurs outside of the actual state table, causing potentially
incorrect behaviour or a crash.

src/cmd/ksh93/sh/lex.c:
- endword(): Make sp and three related variables explicitly
  unsigned char pointers. This requires a bunch of annoying
  typecasts to stop compilers complaining; so be it.
- To avoid even more typecasts, make stack_shift() follow suit.
- Reorder variable declarations for legibility.
2022-07-11 02:20:27 +02:00
Martijn Dekker
1934686de3 Fix oddly specific syntax error corrupting subsequent [[ ... ]]
Reproducer:

    $ x=([x]=1 [y)
    -ksh: syntax error: `)' unexpected
    $ [[ -z $x ]]
    -ksh: [[ -z  ]]: not found

Any '[[' command following that syntax error will fail similarly;
the whole of it (after variable expansion) is incorrectly looked up
as a command name. The syntax error must be generated by an
associative array assignment (with or without an explicit typeset
-A) with at least one valid assignment element followed by an
invalid assignment element starting with '[' but not containing
']='.

This seems to be another bug that is in every ksh93 version ever.
I've confirmed that ksh 1993-12-28 s+ and ksh2020 fail identically.
Presumably, so does everything in between.

Analysis:

The syntax error function, sh_syntax(), calls lexopen() in mode 0
to reset the lexer state. There is a variable that isn't getting
reset there though it should be. Using systematic elimination I
found that the variable that needs to be reset is lp->assignok (set
"when name=value is legal"). If it is set, '[[' is not processed.

src/cmd/ksh93/sh/lex.c: lexopen():
- Reset 'assignok' in the lexer state (regardless of mode).
- In the mode 0 total lexer state reinit, several members of lexd
  (struct _shlex_pvt_lexdata_) were not getting reset; just memset
  the whole thing to zero.
     Note for backporters: this change requires commit da97587e to
  be correct. That commit took the stack size and pointer (lex_max
  and *lex_match) out of this struct; those should not be reset!

Resolves: https://github.com/ksh93/ksh/issues/486
2022-07-09 23:00:11 +02:00
Martijn Dekker
7c4418ccdc Multibyte character handling overhaul; allow global disable
The SHOPT_MULTIBYTE compile-time option did not make much sense as
disabling it only disabled multibyte support for ksh/libshell, not
libast or libcmd built-in commands. This commit allows disabling
multibyte support for the entire codebase by defining the macro
AST_NOMULTIBYTE (e.g. via CCFLAGS). This slightly speeds up the
code and makes an optimised binary about 5% smaller.

src/lib/libast/include/ast.h:
- Add non-multibyte fallback versions of the multibyte macros that
  are used if AST_NOMULTIBYTE is defined. This should cause most
  multibyte handling to be automatically optimised out everywhere.
- Reformat the multibyte macros for legibility.
- Similify mbchar() and and mbsize() macros by defining them in
  terms of mbnchar() and mbnsize(), eliminating code duplication.
- Correct non-multibyte fallback of mbwidth(). For consistent
  behaviour, control characters and out-of-range values should
  return -1 as they do for UTF-8. The fallback is now the same as
  default_wcwidth() in src/lib/libast/comp/setlocale.c.

src/lib/libast/comp/setlocale.c:
- If AST_NOMULTIBYTE is defined, do not compile in the debug and
  UTF-8 locale conversion functions, including several large
  conversion tables. Define their fallback macros as 0 as these are
  used as function pointers.

src/cmd/ksh93/SHOPT.sh,
src/cmd/ksh93/Mamfile:
- Change the SHOPT_MULTIBYTE default to empty, indicating "probe".
- Synchronise SHOPT_MULTIBYTE with !AST_NOMULTIBYTE by default.

src/cmd/ksh93/include/defs.h:
- When SHOPT_MULTIBYTE is zero but AST_NOMULTIBYTE is not non-zero,
  then enable AST_NOMULTIBYTE here to use the ast.h non-multibyte
  fallbacks for ksh. When this is done, the effect is that
  multibyte is optimized out for ksh only, as before.
- Remove previous fallback for disabling multibyte (re: c2cb0eae).

src/cmd/ksh93/include/lexstates.h,
src/cmd/ksh93/sh/lex.c:
- Define SETLEN() macro to assign to LEN (i.e. _Fcin.fclen) for
  multibyte only and do not assign to it directly. With no
  SHOPT_MULTIBYTE, define that macro as empty. This allows removing
  multiple '#if SHOPT_MULTIBYTE' directives from lex.c, as that
  code will all be optimised out automatically if it's disabled.

src/cmd/ksh93/include/national.h,
src/cmd/ksh93/sh/string.c:
- Fix flagrantly incorrect non-multibyte fallback for sh_strchr().
  The latter returns an integer offset (-1 if not found), whereas
  strchr(3) returns a char pointer (NULL if not found). Incorporate
  the fallback into the function for correct handling instead of
  falling back to strchr(3) directly.

src/cmd/ksh93/sh/macro.c:
- lastchar() optimisation: avoid function call if SHOPT_MULTIBYTE
  is enabled but we're not actually in a multibyte locale.

src/cmd/ksh93/sh/name.c:
- Use ja_size() even with SHOPT_MULTIBYTE disabled (re: 2182ecfa).
  Though no regression tests failed, the non-multibyte fallback for
  typeset -L/-R/-Z length calculation was probably not quite
  correct as ja_size() does more. The ast.h change to mbwidth()
  ensures correct behaviour for non-multibyte locales.

src/cmd/ksh93/tests/shtests:
- Since its value in SHOPT.sh is now empty by default, add a quick
  feature test (for the length of the UTF-8 character 'é') to check
  if SHOPT_MULTIBYTE needs to be enabled for the regression tests.
2022-07-09 00:32:27 +02:00
Martijn Dekker
cc1689849e Another round of tweaks
Notable changes:
- Change a bunch of uses of memcmp(3) to compare strings (which can
  cause buffer read overflows) to strncmp(3).
- src/cmd/ksh93/include/name.h: Eliminate redundant Nambfp_f type.
  Replace with Shbltin_f type from libast's shcmd.h. Since that is
  not guaranteed to be included here, repeat the type definition
  here without fully defining the struct (which is valid in C).
2022-06-24 15:34:55 +01:00
Martijn Dekker
da97587e9e lex.c: prevent restoring outdated stack pointer
Lexical levels are stored in a dynamically grown array of int values
grown by the stack_grow function. The pointer lex_match and the
maximum index lex_max are part of the lexer state struct that is now
saved and restored in various places -- see e.g. 37044047, a2bc49be.
If the stack needs to be grown, it is reallocated in stack_grow()
using sh_realloc(). If that happens between saving and restoring the
lexer state, then an outdated pointer is restored, and crash.

src/cmd/ksh93/include/shlex.h,
src/cmd/ksh93/sh/lex.c:
- Take lex_match and lex_max out of the lexer state struct and make
  them separate static variables.

src/cmd/ksh93/edit/edit.c:
- While we're at it, save and restore the lexer state in a way that
  is saner than the 93v- beta approach (re: 37044047) as well as
  more readable. Instead of permanently allocating memory, use a
  local variable to save the struct. Save/restore directly around
  the sh_trap() call that actually needs this done.

Resolves: https://github.com/ksh93/ksh/issues/482
2022-06-23 03:35:48 +01:00
Martijn Dekker
148a8a3f46 Another build system overhaul (re: 35672208, 580ff616, 6cc2f6a0)
So far we've been handling AST release build and git commit flags
and ksh SHOPT_* compile time options in the generic package build
script. That was a hack that was necessary before I had sufficient
understanding of the build system. Some of it did not work very
well, e.g. the correct git commit did not show up in ${.sh.version}
when compiling from a git repo.

As of this commit, this is properly included in the mamake
dependency tree by handling it from the libast and ksh93 Mamfiles,
guaranteeing they are properly up to date.

For a release build, the _AST_ksh_release macro is renamed to
_AST_release, because some aspects of libast also use this.

This commit also adds my first attempt at documenting the (very
simple, six-command) mamake language as it is currently implemented
-- which is significantly different from Glenn Fowler's original
paper. This is mostly based on reading the mamake.c source code.

src/cmd/INIT/README-mamake.md:
- Added.

bin/package, src/cmd/INIT/package.sh:
- Delete the hack.

**/Mamfile:
- Remove KSH_RELFLAGS and KSH_SHOPTFLAGS, which supported the hack.
- Delete 'meta' commands. They were no-ops; mamake.c ignores them.
  They also did not add any informative value.

src/lib/libast/Mamfile:
- Add a 'virtual' target that obtains the current git commit,
  examines the git branch, and decides whether to auto-set an
  _AST_git_commit and/or or _AST_release #define to a new
  releaseflags.h header file. This is overwritten on each run.
- Add code to the install target that copies limit.h to
  include/ast, but only if it doesn't exist or the content of the
  original changed. This allows '#include <releaseflags.h>' from
  any program using libast while avoiding needless recompiles.
- When there are uncommitted changes, add /MOD (modified) to the
  commit hash instead of not defining it at all.

src/cmd/ksh93/**:
- Mamfile: Add a shopt.h target that reads SHOPT.sh and converts it
  into a new shopt.h header file in the object code directory. The
  shopt.h header only contains SHOPT_* directives that have a value
  in SHOPT.sh (not the empty/probe ones). They also do not redefine
  the macros if they already exist, so overriding with something
  like CCFLAGS+=' -DSHOPT_FOO=1' remains possible.
- **.c: Every c file now #includes "shopt.h" first. So SHOPT_*
  macros are no longer passed via environment/MAM variables.
* SHOPT.sh: The AUDITFILE and CMDLIB_DIR macros no longer need an
  extra backslash-escape for the double quotes in their values.
  (The old way required this because mamake inserts MAM variables
  directly into shell scripts as literals without quoting.  :-/ )

src/cmd/INIT/mamake.c:
- Import the two minor changes between from 93u+ and 93v-: bind()
  is renamed to bindfile() and there is a tweak to detecting an
  "improper done statement".
- Allow arbitrary whitespace (isspace()) everywhere, instead of
  spaces only. This obsoletes my earlier indentation workaround
  from 6cc2f6a0; turns out mamake always supported indentation, but
  with spaces only.
- Do not skip line numbers at the beginning of each line. This
  undocumented feature is not and (AFAICT) never has been used.
- Throw an error on unknown command or rule attribute. Quite an
  important feature for manual maintenance: catches typos, etc.
2022-06-12 05:47:02 +01:00
atheik
9bed28c3f9 Fix line continuation within command substitutions
In command substitutions of the $(standard) and ${ shared state; }
form, backslash line continuation is broken.

Reproducer:

	echo $(
	echo one two\
	three
	)

Actual output (ksh93, all versions):

	one two\ three

Expected output (every other shell, POSIX spec):

	one twothree

src/cmd/ksh93/sh/lex.c: sh_lex(): case S_REG:
- Do not skip new-line joining if we're currently processing a
  command substitution of one of these forms (i.e., if the
  lp->lexd.dolparen level is > 0).

Background info/analysis:

comsub() is called from sh_lex() when S_PAR is the current state.
In src/cmd/ksh93/data/lexstates.c, we see that S_PAR is reached in
the ST_DOL state table at index 40. Decimal 40 is ( in ASCII. So,
the previous skipping of characters was done according to the
ST_DOL state table, and the character that stopped it was (. This
means we have $(.

Alternatively, comsub() may be called from sh_lex() by jumping to
the do_comsub label. In brief, that is the case when we have ${.

Regardless of which it is from the two, comsub() is now called from
sh_lex(). In comsub(), lp->lexd.dolparen is incremented at the
beginning and decremented at the end. Between them, we see that
sh_lex() is called. So, lp->lexd.dolparen in sh_lex() indicates the
depth of nesting $( or ${ statements we're in. Thus, it is also the
number of comsub() invocations seen in a backtrace taken in
sh_lex().

The codepath for `...` is different (and never had this bug).

Co-authored by: Martijn Dekker <martijn@inlv.org>
Resolves: https://github.com/ksh93/ksh/issues/367
2022-05-22 00:23:54 +01:00
atheik
40a5c45b48 Allow double quotes within backtick comsub within double quotes
The following reproducer causes a spurious syntax error:

    foo="`: "("`"

The nested double quotes are not recognised correctly, causing a
syntax error at the '('. Removing the outer double quotes (which
are unnecessary) is a workaround, but it's still a bug as every
other shell accepts this. This bug has been present since the
original Bourne shell.

src/cmd/ksh93/sh/lex.c: sh_lex(): case S_QUOTE:
- If the current character is '"' and we're in a `...` command
  substitution (ingrave is true), then do not switch to the old
  mode but keep using the ST_QUOTE state table.

Thanks to @JohnoKing for the report and to @atheik for the fix.

Co-authored by: Martijn Dekker <martijn@inlv.org>
Resolves: https://github.com/ksh93/ksh/issues/352
2022-05-20 22:48:47 +01:00
Martijn Dekker
b398f33c49 Another round of accumulated minor fixes and cleanups
Only notable changes listed below.

**/Mamfile:
- Do not bother redirecting standard error for 'cmp -s' to
  /dev/null. Normally, 'cmp -s' on Linux, macOS, *BSD, or Solaris
  do not not print any message. If it does, something unusual is
  going on and I would want to see the message.
- Since we now require a POSIX shell, we can use '!'.

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/init.c:
- Remove SH_TYPE_PROFILE symbol, unused after the removal of the
  SHOPT_PFSH code. (re: eabd6453)

src/cmd/ksh93/sh/io.c:
- piperead(), slowread(): Replace redundant sffileno() calls by
  the variables already containing their results. (re: 50d342e4)

src/cmd/ksh93/bltins/mkservice.c,
rc/lib/libcmd/vmstate.c:
- If these aren't compiled, define a stub function to silence the
  ranlib(1) warning that the .o file does not contain symbols.
2022-03-11 21:20:32 +01:00
Johnothan King
8fc8c2f51c Fix a few minor issues (#473)
Changes:
- Fixed two xtrace test failures introduced in commit cfc8744c.
- The definition of _use_ntfork_tcpgrp in xec.c is now dependent on
  SHOPT_SPAWN being defined (re: 8e9ed5be).
- Removed many unnecessary newlines and fixed various typos.
2022-03-11 21:18:42 +01:00
Martijn Dekker
14a43a0a88 Yet more misc. cleanups; rm SHOPT_PFSH, SHOPT_TYPEDEF
Notable changes:
- Remove SHOPT_PFSH compile-time option and associated code.
  This was meant to work with Solaris rights profiles, see:
  https://docs.oracle.com/cd/E23824_01/html/821-1461/profiles-1.html#REFMAN1profiles-1
  But it has been obsolete for years as Solaris stopped using
  it in its shipped ksh several OS versions ago, preferring a
  library-based wrapper around ksh and other shells.
  Nonetheless I experimented with the option on Solaris 11.4.
  Result: no external command will run; output of unitialised
  memory in error message. So it's already fallen victim to bit
  rot. There's nothing interesting here, so just get rid.
- Remove SHOPT_TYPEDEF compile-time option (but keep the code!).
  Turning it off caused the build to fail. It may be possible to
  fix it, but the type definition code is integral to ksh now (e.g.
  'enum' depends on much of it) so it makes no sense to disable it.
  This was removed in the ksh 93v- beta version as well.
- Remove nv_close() calls and remove nv_close() documentation from
  the nval.3 man page. This function is a dummy, present without
  any changes since the beginning of the ast-open-archive repo in
  1995. The comment was: "Currently this is a dummy, but someday
  will be needed for reference counting". 27 or more years later,
  it's time to admit it's never going to happen. (And of course,
  nv_close() calls were not being used with anything resembling
  consistency.)
- Add a null nv_close() macro to nval.h for compatibility with
  third party code that follows the old documentation.
- Add a few missing regression tests.
2022-02-10 21:04:56 +00:00
Martijn Dekker
9c313c7fe3 Update copyright years in files changed since 1st Jan 2022 2022-01-30 20:49:04 +00:00
Martijn Dekker
172becffea Some more accumulated minor tweaks and cleanups
Notable changes:

src/cmd/ksh93/include/fault.h:
- Get rid of the superflous sh pointer argument in the
  sh_pushcontext() and sh_popcontext() macros. (re: 2d3ec8b6)

src/cmd/ksh93/tests/io.sh:
- Tweak a process substitution test to allow up to a second for
  unused process substitution processes to disappear. On the Alpine
  Linux console (at least the musl libc version), this is needed to
  avoid a test failure as long as no GUI is active. As soon as you
  start X11, this phenomenon disappears, even on the console. Very
  strange, but also very probably not ksh's fault.

src/cmd/ksh93/tests/shtests:
- Instead of just SIGCONT and SIGPIPE, set all signals to default,
  just to be sure to avoid spurious test failures due to signals
  that were ignored on entry. (It made no difference to the
  aforementioned Alpine Linux test failure, so ignored signals had
  nothing to do with that -- but still a good idea.)

.github/workflows/ci.yml:
- On the GitHub CI runs, when testing with SHOPTs disabled, disable
  SHOPT_SPAWN as well, which tests that everything still works
  correctly with the regular fork(2) method.

COPYRIGHT:
- Remove duplicate of BSD license.
2022-01-25 16:13:15 +00:00
Martijn Dekker
b590a9f155 [shp cleanup 01..20] all the rest (re: 2d3ec8b6)
This combines 20 cleanup commits from the dev branch.

All changed files:
- Clean up pointer defererences to sh.
- Remove shp arguments from functions.

Other notable changes:

src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/sh/init.c:
- On second thought, get rid of the function version of
  sh_getinterp() as libshell ABI compatibility is moot. We've
  already been breaking that by reordering the sh struct, so there
  is no way it's going to work without recompiling.

src/cmd/ksh93/sh/name.c:
- De-obfuscate the relationship between nv_scan() and scanfilter().
  The former just calls the latter as a static function, there's no
  need to do that via a function pointer and void* type conversions.

src/cmd/ksh93/bltins/typeset.c,
src/cmd/ksh93/sh/name.c,
src/cmd/ksh93/sh/nvdisc.c:
- 'struct adata' and 'struct tdata', defined as local struct types
  in these files, need to have their first three fields in common,
  the first being a pointer to sh. This is because scanfilter() in
  name.c accesses these fields via a type conversion. So the sh
  field needed to be removed in all three at the same time.
  TODO: de-obfuscate: good practice definition via a header file.

src/cmd/ksh93/sh/path.c:
- Naming consistency: reserve the path_ function name prefix for
  externs and rename statics with that prefix.
- The default path was sometimes referred to as the standard path.
  To use one term, rename std_path to defpath and onstdpath() to
  ondefpath().
- De-obfuscate SHOPT_PFSH conditional code by only calling
  pf_execve() (was path_pfexecve()) if that is compiled in.

src/cmd/ksh93/include/streval.h,
src/cmd/ksh93/sh/streval.c:
- Rename extern strval() to arith_strval() for consistency.

src/cmd/ksh93/sh/string.c:
- Remove outdated/incorrect isxdigit() fallback; '#ifnded isxdigit'
  is not a correct test as isxdigit() is specified as a function.
  Plus, it's part of C89/C90 which we now require. (re: ac8991e5)

src/cmd/ksh93/sh/suid_exec.c:
- Replace an incorrect reference to shgd->current_pid with
  getpid(); it cannot work as (contrary to its misleading directory
  placement) suid_exec is an independent libast program with no
  link to ksh or libshell at all. However, no one noticed because
  this was in fallback code for ancient systems without
  setreuid(2). Since that standard function was specified in POSIX
  Issue 4 Version 2 from 1994, we should remove that fallback code
  sometime as part of another obsolete code cleanup operation to
  avoid further bit rot. (re: 843b546c)

src/cmd/ksh93/bltins/print.c: genformat():
- Remove preformat[] which was always empty and had no effect.

src/cmd/ksh93/shell.3:
- Minor copy-edit.
- Remove documentation for nonexistent sh.infile_name. A search
  through ast-open-archive[*] reveals this never existed at all.
- Document sh.savexit (== $?).

src/cmd/ksh93/shell.3,
src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/sh/init.c:
- Remove sh.gd/shgd; this is now unused and was never documented
  or exposed in the shell.h public interface.
- sh_sigcheck() was documented in shell.3 as taking no arguments
  whereas in the actual code it took a shp argument. I decided to
  go with the documentation.
- That leaves sh_parse() as the only documented function that still
  takes an shp argument. I'm just going to go ahead and remove it
  for consistency, reverting sh_parse() to its pre-2003 spec.
- Remove undocumented/unused sh_bltin_tree() function which simply
  returned sh.bltin_tree.
- Bump SH_VERSION to 20220106.
2022-01-07 16:16:31 +00:00
Johnothan King
b425196958 Fix ASan heap-buffer-overflow when handling syntax errors (#402)
This commit backports a bugfix from ksh2020 to fix an ASan
heap-buffer-overflow error in one of the regression tests. See:
c57f7398
https://github.com/att/ast/issues/1261

This explanation comes from the linked issue:
> The poplevel() in this block of code is called when lp->lexd.lex_max
> is zero:
> bd94eb56/src/cmd/ksh93/sh/lex.c (L921-L925)
> Since poplevel() first decrements lp->lexd.lex_max then uses it as
> an index into lp->lexd.lex_match this causes the word before the
> start of that buffer to be accessed. The buffer is allocated here:
> bd94eb56/src/cmd/ksh93/sh/lex.c (L2210-L2218)

src/cmd/ksh93/sh/lex.c:
- Avoid calling poplevel() twice when handling syntax errors.
2021-12-28 17:53:35 +00:00
Martijn Dekker
a1f5c99204 INIT: remove proto, ratz (re: 46593a89, 6137b99a); major cleanup
This takes another step towards cleaning up the build system. We
now do not even pretend to be theoretically compatible with
pre-1989 K&R C compilers or with C++ compilers. In practice, this
had already been broken for many years due to bit rot.

Commit 46593a89 already removed the license handling enormity that
depended on proto, so now we can cleanly remove it altogether. But
we do need to leave some backwards compatibility stubs to keep the
build system compatible with older AST code; it should remain
possible to build older ksh versions with the current build system
(the bin/ and src/cmd/INIT/ directories) for testing purposes.

So as of now there is no more __MANGLE__d rubbish in your generated
header files. This is only about a quarter of a century overdue...

This commit also includes a huge amount of code cleanup to remove
thousands of unused K&R C fallbacks and other cruft, particularly
in libast. This code base should now be a little easier to
understand for people who are familiar with a modern(ish) C
standard.

ratz is now also removed; this was a standalone and simplified 2005
version of gunzip. As of 6137b99a, none of our code uses it, even
theoretically. And the real g(un)zip is now everywhere.

src/cmd/INIT/proto.c, src/cmd/INIT/ratz.c:
- Removed.

COPYRIGHT:
- Remove zlib license; this only applied to ratz.

bin/package, src/cmd/INIT/package.sh:
- Related cleanups.
- Unset LC_ALL before invoking a new shell, respecting the user's
  locale again and avoiding multibyte character corruption on the
  command line.

src/cmd/INIT/proto.sh:
- Add stub for backwards compatibility with Mamfiles that depend on
  proto. It does nothing but pass input without modification and is
  now installed as the new arch/*/bin/proto by src/cmd/INIT/Mamfile.

src/cmd/INIT/iffe.sh:
- Ignore the proto-related -e (--package) and -p (--prototyped)
  options; keep parsing them for backwards compatibility.
- Trim the macros passed to every test to their standard C
  versions, removing K&R C and C++ versions. These are now
  considered to be for backwards compatibility only.

src/cmd/INIT/iffe.tst:
- Remove proto(1) mangling code.
  By the way, iffe can be regression-tested as follows:
        $ bin/package use   # set up environment in a child shell
        $ regress src/cmd/INIT/iffe.tst
        $ exit              # leave package environment

src/cmd/INIT/make.probe, src/cmd/INIT/probe.win32:
- Remove code to handle C++.

src/lib/libast/features/common:
- As in iffe.sh above, trim macros designed for compatibility with
  C++ and ancient C compilers to their standard C versions and
  comment that they are for backwards compatibility with AST code.
  This is needed to keep all the old ast and ksh code compiling.

src/cmd/ksh93/sh/init.c,
src/cmd/ksh93/sh/name.c:
- Clarify libshell ABI compatibility function versions of macros.
  A "proto workaround" comment in the original code mislead me into
  thinking this had something to do with the removed proto(1), but
  it's unrelated. Call the workaround macro BYPASS_MACRO instead.

src/cmd/ksh93/include/defs.h:
- sh_sigcheck() macro: allow &sh as an argument: parenthesise shp.

src/cmd/ksh93/sh/nvtype.c:
- Remove unused nv_mkstruct() function. (re: d0a5cab1)

**/features/*:
- Remove obsolete iffe 'set prototyped' option.

**/Mamfile:
- Remove all references to the ast/prototyped.h header.
- Remove all use of the proto command. Simply copy instead.

*** 850-ish source files: ***
- Remove all '#pragma prototyped' directives.
- Remove all C++ compat code conditional upon defined(__cplusplus).
- Remove all use of the _ARG_ macro, which on standard C expands to
  its argument:
        #define _ARG_(x)        x
  (on K&R C, it expanded to nothing)
- Remove all use of _BEGIN_EXTERNS_ and _END_EXTERNS_ macros (empty
  on standard C; this was for C++ compatibility)
- Reduce all #if __STD_C (standard code) #else (K&R code) #endif
  blocks to the standard code only, without use of the macro.
- Same for _STD_ macro which seems to have had the same function.
- Change all instances of 'Void_t' to standard 'void'.
2021-12-24 07:05:22 +00:00
Johnothan King
85199ab351 Backport ksh93v- bugfix for [[ 1<2 ]] (#380)
Strings compared in [[ with the > and < operators should be compared
lexically. This does not work when the strings are single digits, as
the parser interprets it as a syntax error:
   $ [[ 10<2 ]]   # 10 lexically sorts before 2
   $ echo $?
   0
   $ [[ 1<2 ]]
   /usr/bin/ksh: syntax error: `<' unexpected
   $ echo $?
   3

src/cmd/ksh93/sh/lex.c:
- Don't interpret numbers next to > and < as a redirection while
  inside of [[. This bugfix was backported from ksh93v- 2014-06-25.

src/cmd/ksh93/tests/bracket.sh:
- Add regression tests for the > and < operators.
2021-12-17 03:26:41 +01:00
Johnothan King
e54001d58b Various minor capitalization and typo fixes (#371)
This commit fixes various minor typos, punctuation errors and
corrects the capitalization of many names.
2021-12-13 01:49:42 +01:00
Johnothan King
cd562b16e2 Port more shell lint improvements from illumos and ksh93v- (#374)
This commit adds onto <https://github.com/ksh93/ksh/pull/353> by porting
over two additional improvements to the shell linter:

1) The changes in the aforementioned pull request were merged into
   illumos-gate with an additional change.[*] The illumos revision of
   the patch improved the warning for (( $foo = $? )) to specify '$foo'
   causes the warning.[**] Example:
      $ ksh -n -c '(( $? != $bar ))'
      ksh: warning: line 1: in '(( $? != $bar ))', using '$' as in '$bar' is slower and can introduce rounding errors
   While I was porting the illumos patch I did notice one problem. The
   string it uses from paramsub() skips over the initial '{' in
   '${var}', resulting in the warning printing '$var}' instead:
      $ ksh -n -c '(( ${.sh.pid} != $$ ))'
      ...  in '(( ${.sh.pid} != $$ ))', using '$' as in '$.sh.pid}' is slower  ...
   This was fixed by including the missing '{' in the string returned by
   paramsub for ${var} variables.

2) In ksh93v-, parsing x=$((expr)) with the shell linter will cause ksh
   to warn the user x=$((expr)) is slower than ((x=expr)). This
   improvement has been backported with a modified warning:
      # Result from this commit
      $ ksh -n -c 'x=$((1 + 2))'
      ksh: warning: line 1: x=$((1 + 2)) is slower than ((x=1 + 2))
      # Result from ksh93v-
      $ ksh93v -n -c 'x=$((1 + 2))'
      ksh93v: warning: line 1: ((x=1 + 2)) is more efficient than x=$((1 + 2))
   Minor note: the ksh93v- patch had an invalid use of memcmp; this
   version of the patch uses strncmp instead.

References:
be548e87bc
65722363_22fdf8e7/
2021-12-13 01:49:37 +01:00
Martijn Dekker
feedc05037 Reset lexer state on syntax error and on SIGINT (Ctrl+C)
ksh crashed if you pressed Ctrl+C or Ctrl+D on a PS2 prompt while
you haven't finished entering a $(command substitution). It
corrupts subsequent command substitutions. Sometimes the situation
recovers, sometimes the shell crashes.

Simple crash reproducer:

  $ PS1="\$(echo foo) \$(echo bar) \$(echo baz) > "
  foo bar baz > echo $(                        <-- now press Ctrl+D
  > ksh: syntax error: `(' unmatched
  Memory fault

The same happens with Ctrl+C, minus the syntax error message.

The problem is that the lexer state becomes inconsistent when the
lexer is interrupted in the middle of reading a command
substitution of the form $( ... ). This is tracked in the
'lexd.dolparen' variable in the lexer state struct.

Resetting that variable is sufficient to fix this issue. However,
in this commit I prefer to just reinitialise the lexer state
completely to pre-empt any other possible issues. Whether there was
a syntax error or the user pressed Ctrl+C, we just interrupted all
lexing and parsing, so the lexer *should* restart from scratch.

src/cmd/ksh93/sh/fault.c: sh_fault():
- If the shell is in an interactive state (e.g. not a subshell) and
  SIGINT was received, reinitialise the lexer state. This fixes the
  crash with Ctrl+C.

src/cmd/ksh93/sh/lex.c: sh_syntax():
- When handling a syntax error, reset the lexer state. This fixes
  the crash with Ctrl+D.

NEWS:
- Also add the forgotten item for the previous fix (re: 2322f939).
2021-12-09 07:35:12 +01:00
Martijn Dekker
aa3048880b cleanup: get rid of KSHELL and _BLD_shell preprocessor macros
Once upon a time it might have been possible to build certain parts
of ksh, such as the emacs and vi editors and possibly even the
name/value library (nval(3)) as independent libraries. But given
the depressing amount of bit rot in the code that we inherited, I
am certain that disabling either of these macros had been resulting
in a broken build for many years before AT&T abandoned this code
base. These are certainly not going to be useful now.

Meanwhile the KSHELL macro got in the way of me today, because the
Mamfile did not define it for all the .c files, but some headers
declared some functionality conditionally upon that macro. So
including <io.h> in, e.g., nvdisc.c did not declare the same
functions as including that header in files with KSHELL defined.
This inconsistency is now gone as well, for various files.

I'm currently working on making it possible once again to build
libshell as a dynamic library; that should be good enough. And that
never involved disabling either of these macros.
2021-12-09 06:43:22 +01:00
Johnothan King
beccb93fd4 Fix various compiler warnings and minor issues (#362)
List of changes:
- Fixed some -Wuninitialized warnings and removed some unused variables.

- Removed the unused extern for B_login (re: d8eba9d1).

- The libcmd builtins and the vmalloc memfatal function now handle
  memory errors with 'ERROR_SYSTEM|ERROR_PANIC' for consistency with how
  ksh itself handles out of memory errors.

- Added usage of UNREACHABLE() where it was missing from error handling.

- Extend many variables from short to int to prevent overflows (most
  variables involve file descriptors).

- Backported a ksh2020 patch to fix unused value Coverity issues
  (https://github.com/att/ast/pull/740).

- Note in src/cmd/ksh93/README that ksh compiles with Cygwin on
  Windows 10 and Windows 11, albeit with many test failures.

- Add comments to detail some sections of code. Extensive list of
  commits related to this change:
  ca2443b5, 7e7f1372, 2db9953a, 7003aba4, 6f50ff64, b1a41311,
  222515bf, a0dcdeea, 0aa9e03f, 61437b27, 352e68da, 88e8fa67,
  bc8b36fa, 6e515f1d, 017d088c, 035a4cb3, 588a1ff7, 6d63b57d,
  a2f13c19, 794d1c86, ab98ec65, 1026006d

- Removed a lot of dead ifdef code.

- edit/emacs.c: Hide an assignment to avoid a -Wunused warning. (See
  also https://github.com/att/ast/pull/753, which removed the assignment
  because ksh2020 removed the !SHOPT_MULTIBYTE code.)

- sh/nvdisc.c: The sh_newof macro cannot return a null pointer because
  it will instead cause the shell to exit if memory cannot be allocated.
  That makes the if statement here a no-op, so remove it.

- sh/xec.c: Fixed one unused variable warning in sh_funscope().

- sh/xec.c: Remove a fallthrough comment added in commit ed478ab7
  because the TFORK code doesn't fall through (GCC also produces no
  -Wimplicit-fallthrough warning here).

- data/builtins.c: The cd and pwd man pages state that these builtins
  default to -P if PATH_RESOLVE is 'physical', which isn't accurate:
     $ /opt/ast/bin/getconf PATH_RESOLVE
     physical
     $ mkdir /tmp/dir; ln -s /tmp/dir /tmp/sym
     $ cd /tmp/sym
     $ pwd
     /tmp/sym
     $ cd -P /tmp/sym
     $ pwd
     /tmp/dir
  The behavior described by these man pages isn't specified in the ksh
  man page or by POSIX, so to avoid changing these builtin's behavior
  the inaccurate PATH_RESOLVE information has been removed.

- Mamfiles: Preserve multi-line errors by quoting the $x variable.
  This fix was backported from 93v-.
  (See also <a7e9cc82>.)

- sh/subshell.c: Remove set but not used sp->errcontext variable.
2021-12-09 06:42:59 +01:00
Martijn Dekker
520f530198 lex.c: add default, though it should never happen (re: b0282f26)
If a bug is ever introduced that causes a [[ ... ]] operator to be
unhandled by the linter, we should at least avoid writing random
memory contents to standard error. In non-release builds, let's
abort() so the problem can be more easily backtraced.
2021-12-05 19:27:38 +01:00
Johnothan King
6904585f49 Port illumos' shell linter improvements (#353)
This commit ports over two improvements to the shell linter from
illumos (original patch written by Andy Fiddaman). Links to the
relevant bug reports and the original patch:
https://www.illumos.org/issues/13601
https://www.illumos.org/issues/13631
c7b656fc71

The first improvement is to the lint warning for arithmetic
operators in [[ ... ]]. The ksh linter now suggests the correct
equivalent operator to use in ((...)). Example:
  $ ksh -nc '[[ 30 -gt 25 ]]'
  # Original warning
  warning: line 1: -gt within [[ ... ]] obsolete, use ((...))
  # New warning
  warning: line 1: [[ ... -gt ... ]] obsolete, use ((... > ...))

The second improvement pertains to variable expansion in arithmetic
expressions. The ksh linter now suggests referencing variable names
directly:
  $ ksh -nc 'integer foo=40; (($foo < 50 ))'
  # Old warning
  warning: line 1: variable expansion makes arithmetic evaluation less efficient
  # New warning
  warning: line 1: in '(($foo < 50))', using '$' is slower and can introduce rounding errors

src/cmd/ksh93/{data/lexstates,sh/lex,sh/parse}.c:
- Port the improved shell lint warnings from illumos to ksh93u+m.
- The original checks for arithmetic operators involved a bunch of
  if statements with inefficient calls to strcmp(3). These were
  replaced with a more efficient switch statement that avoids
  strcmp.
2021-12-05 19:27:16 +01:00
Johnothan King
370440473e Fix KEYBD trap crash when inputting a command substitution (#355)
This change fixes a crash that can occur after setting a KEYBD trap
then inputting a multi-line command substitution. The crash is
similar to issue #347, but it's easier to reproduce since it
doesn't require you to setup a kshrc file. Reproducer for the
crash:

  $ ENV=/./dev/null ksh
  $ trap : KEYBD
  $ : $(
  > true)
  Memory fault(coredump)

The bugfix was backported (with considerable changes) from ksh93v-
2013-10-08. The crash was first reported on the old mailing list:
https://www.mail-archive.com/ast-users@lists.research.att.com/msg00313.html

src/cmd/ksh93/{include/shlex.h,sh/lex.c}:
- To fix this properly, we need sizeof(Lex_t) to work as expected
  in edit.c, but that is thwarted by the _SHLEX_PRIVATE macro in
  lex.c which shlex.h uses to add private structs to the Lex_t type
  in lex.c only. So get rid of that _SHLEX_PRIVATE macro and make
  those members part of the centrally defined struct, renaming them
  to make it clear they're considered private to lex.c.

src/cmd/ksh93/edit/edit.c:
- Now that we can get its size, save and restore the shell lexing
  context when a KEYBD trap is present.

src/cmd/ksh93/tests/pty.sh:
- Add a regression test for the KEYBD trap crash.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2021-11-30 04:27:31 +01:00
Martijn Dekker
c81473061a test/[: binary operators: fix '<' and add '=~'; some more cleanups
In ksh88, the test/[ built-in supported both the '<' and '>'
lexical sorting comparison operators, same as in [[. However, in
every version of ksh93, '<' does not work though '>' still does!

Still, the code for both is present in test_binop():

src/cmd/ksh93/bltins/test.c
548:		case TEST_SGT:
549:			return(strcoll(left, right)>0);
550:		case TEST_SLT:
551:			return(strcoll(left, right)<0);

Analysis: The binary operators are looked up in shtab_testops[] in
data/testops.c using a macro called sh_lookup, which expands to a
sh_locate() call. If we examine that function in sh/string.c, it's
easy to see that on systems using ASCII (i.e. all except IBM
mainframes), it assumes the table is sorted in ASCII order.

src/cmd/ksh93/sh/string.c
64:	while((c= *tp->sh_name) && (CC_NATIVE!=CC_ASCII || c <= first))

The problem was that the '<' operator was not correctly sorted in
shtab_testops[]; it was sorted immediately before '>', but after
'='. The ASCII order is: < (60), = (61), > (62). This caused '<' to
never be found in the table.

The test_binop() function is also used by [[, yet '<' always worked
in that. This is because the parser has code that directly checks
for '<' and '>' within [[ (in sh/parse.c, lines 1949-1952).

This commit also adds '=~' to 'test', which took three lines of
code and allowed eliminating error handling in test_binop() as
test/[ and [[ now support the same binary ops. (re: fc2d5a60)

src/cmd/ksh93/*/*.[ch]:
- Rename a couple of very misleadingly named macros in test.h:
  . For == and !=, the TEST_PATTERN bit is off for pattern compares
    and on for literal string compares! Rename to TEST_STRCMP.
  . The TEST_BINOP bit does not denote all binary operators, but
    only the logical -a/-o ops in test/[. Rename to TEST_ANDOR.

src/cmd/ksh93/bltins/test.c: test_binop():
- Add support for =~. This is only used by test/[. The method is
  implemented in two lines that convert the ERE to a shell pattern
  by prefixing it with ~(E), then call test_strmatch with that
  temporary string to match the ERE and update ${.sh.match}.
- Since all binary ops from shtab_testops[] are now accounted for,
  remove unknown op error handling from this function.

src/cmd/ksh93/data/testops.c:
- shtab_testops[]:
  . Correctly sort the '<' (TEST_SLT) entry.
  . Remove ']]' (TEST_END). It's not an op and doesn't belong here.
- Update sh_opttest[] documentation with =~, \<, \>.
- Remove now-unused e_unsupported_op[] error message.

src/cmd/ksh93/sh/lex.c: sh_lex():
- Check for ']]' directly instead of relying on the removed
  TEST_END entry from shtab_testops[].

src/cmd/ksh93/tests/bracket.sh:
- Add relevant tests.

src/cmd/ksh93/tests/builtins.sh:
- Fix an old test that globally deleted the 'test' builtin. Delete
  it within the command substitution subshell only.
- Remove the test for non-support of =~ in test/[.
- Update the test for invalid test/[ op to use test directly.
2021-11-14 02:46:34 +01:00
Martijn Dekker
d087b031f0 Fix single quotes in expansion operator string (re: 5ed9ffd6)
The referenced commit introduced the following bug:

> The closing quote does not appear to be registering during the
> parse of the following:
>
>	echo ${var:+'{}'}
>
> Within a script, this will result in:
>
>	syntax error at line 1: `'' unmatched

src/cmd/ksh93/data/lexstates.c,
src/cmd/ksh93/include/lexstates.h:
- Add new ST_MOD1 state table that is a copy of ST_QUOTE, but adds
  a special meaning (ST_LIT) for the single quote (position 39).

src/cmd/ksh93/sh/lex.c: sh_lex():
- For parameter expansion operators with old-style quoting
  (S_MOD1), use the new ST_MOD1 state table instead of ST_QUOTE.
  This causes single quotes within them to be processed properly.

src/cmd/ksh93/tests/quoting2.sh:
- Add tests.

Thanks to @gkamat for the bug report.
Resolves: https://github.com/ksh93/ksh/issues/290
2021-04-30 05:28:21 +01:00
Martijn Dekker
2aad3cab06 Add ksh 93u+m contributors notice to 964 copyright headers 2021-04-26 00:19:31 +01:00
Martijn Dekker
b7dde4e747 Fix ksh exit on syntax error in profile (re: cb67a01b, ceb77b13)
Johnothan King writes:
> There are two regressions related to how ksh handles syntax
> errors in the .kshrc file. If ~/.kshrc or the file pointed to by
> $ENV have a syntax error, ksh exits during startup. Additionally,
> the error message printed is incorrect:
>
> $ cat /tmp/synerror
> ((
> echo foo
>
> # ksh93u+m
> $ ENV=/tmp/synerror arch/*/bin/ksh -ic 'echo ${.sh.version}'
> /tmp/synerror: syntax error: `/t/tmp/synerror' unmatched
>
> # ksh93u+
> $ ENV=/tmp/synerror ksh93u -ic 'echo ${.sh.version}'
> /tmp/synerror: syntax error: `(' unmatched
> Version AJM 93u+ 2012-08-01
>
> The regression that causes the incorrect error message was
> introduced by commit cb67a01. The other bug that causes ksh to
> exit on startup was introduced by commit ceb77b1.

src/cmd/ksh93/sh/lex.c: fmttoken():
- Call stakfreeze(0) to terminate a possible unterminated previous
  stack item before writing the token string onto the stack. This
  fixes the bug with garbage in a syntax error message.

src/cmd/ksh93/sh/main.c: exfile():
- Revert Red Hat's ksh-20140801-diskfull.patch applied in ceb77b13.
  This fixes the bug with interactive ksh exiting on syntax error
  in a profile script. Testing by @JohnoKing showed the patch is no
  longer necessary to fix a login crash on disk full, as commit
  970069a6 (which applied Red Hat patches ksh-20120801-macro.patch
  and ksh-20120801-fd2lost.patch) also fixes that crash.

src/cmd/ksh93/README:
- Fix typos. (re: fdc08b23)

Co-authored-by: Johnothan King <johnothanking@protonmail.com>
Resolves: https://github.com/ksh93/ksh/issues/281
2021-04-21 19:42:24 +01:00
Martijn Dekker
cb67a01b45 lex.c: simplify fmttoken() by using the stack (re: 3255aed2)
Using the stack makes it impossible for future buffer overflows to
occur. It also simplifies fmttoken() by eliminating the need to
declare a local buffer and pass a pointer to that as an argument.

For info: man src/lib/libast/man/stak.3
2021-04-09 17:36:29 +01:00
hyenias
3255aed2c4
lex.c: Fix buffer overflow in debug sh_lex and sh_syntax (#262)
fmttoken() needs a minimal char[4] token buffer passed to it.

Originally reported by: Jakub Wilk <jwilk@jwilk.net>
Original bug report: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=879464

The following code lines from fmttoken() yield a n=3 for SYMSEMI as
n=1 from the start, e.g. 'for <>;'.

        case SYMSEMI:
                if(tok[0]=='<')
                        tok[n++] = '>';
                sym = ';';
                break;
        default:
                sym = 0;
        }
        tok[n++] = sym;
}
tok[n] = 0;

n[0]='<'
n[1]='>'
n[2]=';'
n[3]=0 # <-- BUFFER overflow as the passed character buffers have a size of 3

src/cmd/ksh93/sh/lex.c:
- DBUG: sh_lex(): Adjust char tokstr[3] to char tokstr[4]
- sh_syntax(): Adjust char tokbuf[3] to char tokbuf[4]
2021-04-09 02:47:21 +01:00
Johnothan King
a065558291
Fix more compiler warnings, typos and other minor issues (#260)
Many of these changes are minor typo fixes. The other changes
(which are mostly compiler warning fixes) are:

NEWS:
- The --globcasedetect shell option works on older Linux kernels
  when used with FAT32/VFAT file systems, so remove the note about
  it only working with 5.2+ kernels.

src/cmd/ksh93/COMPATIBILITY:
- Update the documentation on function scoping with an addition
  from ksh93v- (this does apply to ksh93u+).

src/cmd/ksh93/edit/emacs.c:
- Check for '_AST_ksh_release', not 'AST_ksh_release'.

src/cmd/INIT/mamake.c,
src/cmd/INIT/ratz.c,
src/cmd/INIT/release.c,
src/cmd/builtin/pty.c:
- Add more uses of UNREACHABLE() and noreturn, this time for the
  build system and pty.

src/cmd/builtin/pty.c,
src/cmd/builtin/array.c,
src/cmd/ksh93/sh/name.c,
src/cmd/ksh93/sh/nvtype.c,
src/cmd/ksh93/sh/suid_exec.c:
- Fix six -Wunused-variable warnings (the name.c nv_arrayptr()
  fixes are also in ksh93v-).
- Remove the unused 'tableval' function to fix a -Wunused-function
  warning.

src/cmd/ksh93/sh/lex.c:
- Remove unused 'SHOPT_DOS' code, which isn't enabled anywhere.
  https://github.com/att/ast/issues/272#issuecomment-354363112

src/cmd/ksh93/bltins/misc.c,
src/cmd/ksh93/bltins/trap.c,
src/cmd/ksh93/bltins/typeset.c:
- Add dictionary generator function declarations for former
  aliases that are now builtins (re: 1fbbeaa1, ef1621c1, 3ba4900e).
- For consistency with the rest of the codebase, use '(void)'
  instead of '()' for print_cpu_times.

src/cmd/ksh93/sh/init.c,
src/lib/libast/path/pathshell.c:
- Move the otherwise unused EXE macro to pathshell() and only
  search for 'sh.exe' on Windows.

src/cmd/ksh93/sh/xec.c,
src/lib/libast/include/ast.h:
- Add an empty definition for inline when compiling with C89.
  This allows the timeval_to_double() function to be inlined.

src/cmd/ksh93/include/shlex.h:
- Remove the unused 'PIPESYM2' macro.

src/cmd/ksh93/tests/pty.sh:
- Add '# err_exit #' to count the regression test added in
  commit 113a9392.

src/lib/libast/disc/sfdcdio.c:
- Move diordwr, dioread, diowrite and dioexcept behind
  '#ifdef F_DIOINFO' to fix one -Wunused-variable warning and
  multiple -Wunused-function warnings (sfdcdio() only uses these
  functions when F_DIOINFO is defined).

src/lib/libast/string/fmtdev.c:
- Fix two -Wimplicit-function-declaration warnings on Linux by
  including sys/sysmacros.h in fmtdev().
2021-04-08 19:58:07 +01:00
Johnothan King
c4f980eb29
Introduce usage of __builtin_unreachable() and noreturn (#248)
This commit adds an UNREACHABLE() macro that expands to either the
__builtin_unreachable() compiler builtin (for release builds) or
abort(3) (for development builds). This is used to mark code paths
that are never to be reached.

It also adds the 'noreturn' attribute to functions that never
return: path_exec(), sh_done() and sh_syntax(). The UNREACHABLE()
macro is not added after calling these.

The purpose of these is:
* to slightly improve GCC/Clang compiler optimizations;
* to fix a few compiler warnings;
* to add code clarity.

Changes of note:

src/cmd/ksh93/sh/io.c: outexcept():
- Avoid using __builtin_unreachable() here since errormsg can
  return despite using ERROR_system(1), as shp->jmplist->mode is
  temporarily set to 0. See: https://github.com/att/ast/issues/1336

src/cmd/ksh93/tests/io.sh:
- Add a regression test for the ksh2020 bug referenced above.

src/lib/libast/features/common:
- Detect the existence of either the C11 stdnoreturn.h header or
  the GCC noreturn attribute, preferring the former when available.
- Test for the existence of __builtin_unreachable(). Use it for
  release builds. On development builds, use abort() instead, which
  crahses reliably for debugging when unreachable code is reached.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2021-04-05 00:28:24 +01:00
Johnothan King
ed478ab7e3
Fix many GCC -Wimplicit-fallthrough warnings (#243)
This commit adds '/* FALLTHROUGH */' comments to fix many
GCC warnings when compiling with -Wimplicit-fallthrough.
Additionally, the existing fallthrough comments have been
changed for consistency.
2021-03-30 21:49:20 +01:00
Johnothan King
814b5c6890
Fix various minor problems and update the documentation (#237)
These are minor fixes I've accumulated over time. The following
changes are somewhat notable:

- Added a missing entry for 'typeset -s' to the man page.
- Add strftime(3) to the 'see also' section. This and the date(1)
  addition are meant to add onto the documentation for 'printf %T'.
- Removed the man page the entry for ksh reading $PWD/.profile on
  login. That feature was removed in commit aa7713c2.
- Added date(1) to the 'see also' section of the man page.
- Note that the 'hash' command can be used instead of 'alias -t' to
  workaround one of the caveats listed in the man page.
- Use an 'out of memory' error message rather than 'out of space'
  when memory allocation fails.
- Replaced backticks with quotes in some places for consistency.
- Added missing documentation for the %P date format.
- Added missing documentation for the printf %Q and %p formats
  (backported from ksh2020: https://github.com/att/ast/pull/1032).
- The comments that show each builtin's options have been updated.
2021-03-21 14:39:03 +00:00
Martijn Dekker
0b814b53bd Remove more legacy libast code (re: f9c127e3, 651bbd56)
This removes #ifdefs checking for the existence of
SH_PLUGIN_VERSION (version check for dynamically loaded builtins)
and the SFIO identifiers SF_BUFCONST, SF_CLOSING, SF_APPENDWR,
SF_ATEXIT, all of which are defined by the bundled libast.
2021-03-21 06:39:32 +00:00
Martijn Dekker
f8f2c4b608 Remove obsolete quote balancing hack
The old Bourne shell failed to check for closing quotes and command
substitution backticks when encountering end-of-file in a parser
context (such as a script). ksh93 implemented a hack for partial
compatibility with this bug, tolerating unbalanced quotes and
backticks in backtick command subsitutions, 'eval', and command
line invocation '-c' scripts only.

This hack became broken for backtick command substitutions in
fe20311f/350b52ea as a memory leak was fixed by adding a newline to
the stack at the end of the command substitution. That extra
newline becomes part of any string whose quotes are not properly
terminated, causing problems such as the one detailed here:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01889.html

    $ touch abc
    $ echo `ls "abc`
    ls: abc
    : not found

No other fix for the memory leak is known that doesn't cause other
problems. (The alternative fix detailed in the referenced mailing
list post causes a different corner-case regression.)

Besides, the hack has always caused other corner case bugs as well:

	$ ksh -c '((i++'
Actual:	ksh: i++(: not found
	(If an external command 'i++(' existed, it would be run)
Expect:	ksh: syntax error at line 1: `(' unmatched

	$ ksh -c 'i=0; echo $((++i'
Actual:	(empty line; the arithmetic expansion is ignored)
Expect:	ksh: syntax error at line 1: `(' unmatched

	$ ksh -c 'echo $(echo "hi)'
Actual:	ksh: syntax error at line 1: `(' unmatched
Expect: ksh: syntax error at line 1: `"' unmatched

So, it's time to get rid of this hack. The old Bourne shell is
dead and buried. No other shell tries to support this breakage.
Tolerating syntax errors is just asking for strange side effects,
inconsistent states, and corner case bugs. We should not want to do
that. Old scripts that rely on this will just need to be fixed.

src/cmd/ksh93/sh/lex.c:
- struct lexdata: Remove 'char balance' member for remembering an
  unbalanced quote or backtick.
- sh_lex(): Remove the back to remember and compensate for
  unbalanced quotes/backticks that was executed only if we were
  executing a script from a string, as opposed to a file.

src/cmd/ksh93/COMPATIBILITY:
- Note the change.

Resolves: https://github.com/ksh93/ksh/issues/199
2021-03-05 22:17:14 +00:00
Johnothan King
7ad274f8b6
Add more out of memory checks (re: 18529b88) (#192)
The referenced commit neglected to add checks for strdup() calls.
That calls malloc() as well, and is used a lot.

This commit switches to another strategy: it adds wrapper functions
for all the allocation macros that check if the allocation
succeeded, so those checks don't need to be done manually.

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/init.c:
- Add sh_malloc(), sh_realloc(), sh_calloc(), sh_strdup(),
  sh_memdup() wrapper functions with success checks. Call nospace()
  to error out if allocation fails.
- Update new_of() macro to use sh_malloc().
- Define new sh_newof() macro to replace newof(); it uses
  sh_realloc().

All other changed files:
- Replace the relevant calls with the wrappers.
- Remove now-redundant success checks from 18529b88.
- The ERROR_PANIC error message calls are updated to inclusive-or
  ERROR_SYSTEM into the exit code argument, so libast's error()
  appends the human-readable version of errno in square brackets.
  See src/lib/libast/man/error.3

src/cmd/ksh93/edit/history.c:
- Include "defs.h" to get access to the wrappers even if KSHELL is
  not defined.
- Since we're here, fix a compile error that occurred with KSHELL
  undefined by updating the type definition of hist_fname[] to
  match that of history.h.

src/cmd/ksh93/bltins/enum.c:
- To get access to sh_newof(), include "defs.h" instead of
  <shell.h> (note that "defs.h" includes <shell.h> itself).

src/cmd/ksh93/Mamfile:
- enum.c: depend on defs.h instead of shell.h.
- enum.o: add an -I. flag in the compiler invocation so that defs.h
  can find its subsequent includes.

src/cmd/builtin/pty.c:
- Define one outofmemory() function and call that instead of
  repeating the error message call.
- outofmemory() never returns, so remove superfluous exit handling.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2021-02-27 21:21:58 +00:00
Martijn Dekker
18529b88c6 Add lots of checks for out of memory (re: 0ce0b671)
Huge typeset -L/-R adjustment length values were still causing
crashses on sytems with not enough memory. They should error out
gracefully instead of crashing.

This commit adds out of memory checks to all malloc/calloc/realloc
calls that didn't have them (which is all but two or three).

The stkalloc/stakalloc calls don't need the checks; it has
automatic checking, which is done by passing a pointer to the
outofspace() function to the stakinstall() call in init.c.

src/lib/libast/include/error.h:
- Change the ERROR_PANIC exit status value from ERROR_LEVEL (255)
  to 77, which is what it is supposed to be according to the libast
  error.3 manual page. Exit statuses > 128 for anything else than
  signals are not POSIX compliant and may cause misbehaviour.

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/init.c:
- To facilitate consistency, add a simple extern sh_outofmemory()
  function that throws an ERROR_PANIC "out of memory".

src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/data/builtins.c:
- Remove now-redundant e_nospace[] extern message; it is now only
  used in one place so it might as well be a string literal in
  sh_outofmemory().

All other changed files:
- Verify the result of all malloc/calloc/realloc calls and call
  sh_outofmemory() if they fail.
2021-02-21 22:27:28 +00:00
Martijn Dekker
e37aa358bf Fix BUG_CASEEMPT: empty 'case' list was syntax error
'case x in esac' should be syntactically correct, but was an error:

	$ ksh -c 'case x in esac'
	ksh: syntax error at line 1: `case' unmatched

Inserting a newline was a workaround:

	$ ksh -c $'case x in\nesac'
	(no output)

The problem was that the 'esac' reserved word was not being
recognised if it immediately followed the 'in' reserved word.

src/cmd/ksh93/sh/lex.c: sh_lex():
- Do not turn off recognition of reserved words after 'in' if we're
  in a 'case' construct; only do this for 'for' and 'select'.

src/cmd/ksh93/tests/case.sh:
- Add seven regression test for correct recognition of 'esac'.
  Only two failed on ksh93. The rest is to catch future bugs.

Fixes: https://github.com/ksh93/ksh/issues/177
2021-02-16 06:50:12 +00:00
Martijn Dekker
2182ecfa08 Fix compile/regress fails on compiling without SHOPT_* options
Many compile-time options were broken so that they could not be
turned off without causing compile errors and/or regression test
failures. This commit now allows the following to be disabled:

SHOPT_2DMATCH    # two dimensional ${.sh.match} for ${var//pat/str}
SHOPT_BGX        # one SIGCHLD trap per completed job
SHOPT_BRACEPAT   # C-shell {...,...} expansions (, required)
SHOPT_ESH        # emacs/gmacs edit mode
SHOPT_HISTEXPAND # csh-style history file expansions
SHOPT_MULTIBYTE  # multibyte character handling
SHOPT_NAMESPACE  # allow namespaces
SHOPT_STATS      # add .sh.stats variable
SHOPT_VSH        # vi edit mode

The following still break ksh when disabled:

SHOPT_FIXEDARRAY # fixed dimension indexed array
SHOPT_RAWONLY    # make viraw the only vi mode
SHOPT_TYPEDEF    # enable typeset type definitions

Compiling without SHOPT_RAWONLY just gives four regression test
failures in pty.sh, but turning off SHOPT_FIXEDARRAY and
SHOPT_TYPEDEF causes compilation to fail. I've managed to tweak the
code to make it compile without those two options, but then dozens
of regression test failures occur, often in things nothing directly
to do with those options. It looks like the separation between the
code for these options and the rest was never properly maintained.
Making it possible to disable SHOPT_FIXEDARRAY and SHOPT_TYPEDEF
may involve major refactoring and testing and may not be worth it.

This commit has far too many tweaks to list. Notables fixes are:

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/data/options.c:
- Do not compile in the shell options and documentation for
  disabled features (braceexpand, emacs/gmacs, vi/viraw), so the
  shell is not left with no-op options and inaccurate self-doc.

src/cmd/ksh93/data/lexstates.c:
- Comment the state tables to associte them with their IDs.
- In the ST_MACRO table (sh_lexstate9[]), do not make the S_BRACE
  state for position 123 (ASCII for '{') conditional upon
  SHOPT_BRACEPAT (brace expansion), otherwise disabling this causes
  glob patterns of the form {3}(x) (matching 3 x'es) to stop
  working as well -- and that is ksh globbing, not brace expansion.

src/cmd/ksh93/edit/edit.c: ed_read():
- Fixed a bug: SIGWINCH was not handled by the gmacs edit mode.

src/cmd/ksh93/sh/name.c: nv_putval():
- The -L/-R left/right adjustment options to typeset do not count
  zero-width characters. This is the behaviour with SHOPT_MULTIBYTE
  enabled, regardless of locale. Of course, what a zero-width
  character is depends on the locale, but control characters are
  always considered zero-width. So, to avoid a regression, add some
  fallback code for non-SHOPT_MULTIBYTE builds that skips ASCII
  control characters (as per iscntrl(3)) so they are still
  considered to have zero width.

src/cmd/ksh93/tests/shtests:
- Export the SHOPT_* macros from SHOPT.sh to the tests as
  environment variables, so the tests can check for them and decide
  whether or how to run tests based on the compile-time options
  that the tested binary was presumably compiled with.
- Do not run the C.UTF-8 tests if SHOPT_MULTIBYTE is not enabled.

src/cmd/ksh93/tests/*.sh:
- Add a bunch of checks for SHOPT_* env vars. Since most should
  have a value 0 (off) or 1 (on), the form ((SHOPT_FOO)) is a
  convenient way to use them as arithmetic booleans.

.github/workflows/ci.yml:
- Make GitHub do more testing: run two locale tests (Dutch and
  Japanese UTF-8 locales), then disable all the SHOPTs that we can
  currently disable, recompile ksh, and run the tests again.
2021-02-08 22:02:45 +00:00
Martijn Dekker
f9427909dc Make redirections like {varname}>file work with brace expansion off
This is some nonsense: redirections that store a file descriptor
greater than 9 in a variable, like {var}<&2 and the like, stopped
working if brace expansion was turned off. '{var}' is not a brace
expansion as it doesn't contain ',' or '..'; something like 'echo
{var}' is always output unexpanded. And redirections and brace
expansion are two completely unrelated things. It wasn't documented
that these redirections require the -B/braceexpand option, either.

src/cmd/ksh93/sh/lex.c: sh_lex():
- Remove incorrect check for braceexpand option before processing
  redirections of this form.

src/cmd/ksh93/COMPATIBILITY:
- Insert a brief item mentioning this.

src/cmd/ksh93/sh.1:
- Correction: these redirections do not yield a file descriptor >
  10, but > 9, a.k.a. >= 10.
- Add a brief example showing how these redirections can be used.

src/cmd/ksh93/tests/io.sh:
- Add a quick regression test.
2021-02-05 05:08:39 +00:00
Martijn Dekker
674a0c3559 Another lexical fix for here-documents (re: 6e515f1d)
OpenSUSE patch from:
https://build.opensuse.org/package/view_file/shells/ksh/ksh93-heredoc.dif

src/cmd/ksh93/sh/lex.c: here_copy():
- Do not potentially seek back a zero or negative length.
2021-01-28 06:32:42 +00:00
Martijn Dekker
bd283959be Fix lexing of 'case' in do...done in a $(comsub) (rhbz#1241013)
The following caused a spurious syntax error:

$ x=$(for i in 1; do case $i in word) true;; esac; done)
-ksh: syntax error: `;;' unexpected

Prior discussion:
https://bugzilla.redhat.com/1241013

Original patch, backported from 93v- beta, applied without change:
642af4d6/f/ksh-20120801-parserfix.patch
2020-09-27 21:26:09 +02:00
Martijn Dekker
843b546c1a rm redundant getpid(2) syscalls (re: 9de65210)
Now that we have ${.sh.pid} a.k.a. shgd->current_pid, which is
updated using getpid() whenever forking a new process, there is no
need for anything else to ever call getpid(); we can use the stored
value instead. There were a lot of these syscalls kicking around,
some of them in performance-sensitive places.

The following lists only changes *other* than changing getpid() to
shgd->currentpid.

src/cmd/ksh93/include/defs.h:
- Comments: clarify what shgd->{pid,ppid,current_pid} are for.

src/cmd/ksh93/sh/main.c,
src/cmd/ksh93/sh/init.c:
- On reinit for a new script, update shgd->{pid,ppid,current_pid}
  in the sh_reinit() function itself instead of calling sh_reinit()
  from sh_main() and then updating those immediately after that
  call. It just makes more sense this way. Nothing else ever calls
  sh_reinit() so there are no side effects.

src/cmd/ksh93/sh/xec.c: _sh_fork():
- Update shgd->current_pid in the child early, so that the rest of
  the function can use it instead of calling getpid() again.
- Remove reassignment of SH_PIDNOD->nvalue.lp value pointer to
  shgd->current_pid (which makes ${.sh.pid} work in the shell).
  It's constant and was already set on init.
2020-09-23 04:19:02 +02:00
Martijn Dekker
fe20311fe9 Fix command substitution memory leaks (rhbz#982142)
This fixes two memory leaks in old-style command substitutions
(one when invoking an alias, one when invoking an autoloaded
function), as well as a possible third leak with an unknown
reproducer, by applying this Red Hat patch:
642af4d6/f/ksh-20120801-mlikfiks.patch

src/cmd/ksh93/sh/macro.c: comsubst():
- For as-yet unknown reasons, the alias leak did not occur when
  adding a space at the end of the command substitution, as in
  a=`some_alias `. This fix is a workaround that simply writes
  an extra space to the stack. TODO: a real fix.

src/cmd/ksh93/sh/path.c: funload():
- Add missing free() before return. This fixes the leak with
  autoloaded functions.

src/cmd/ksh93/sh/lex.c: alias_exceptf():
- This function is called "whenever an end of string is found with
  alias". This adds a check for an SF_FINAL stream status flag when
  deciding whether to call free(). In sfio.h this is commented as:
      #define SF_FINAL 11 /* closing is done except stream free */
  When I revert this change, none of the regression tests fail, so
  I don't know how to trigger this supposed leak. But it makes some
  sense given the sfio.h comment, so I'll keep it.

src/cmd/ksh93/tests/leaks.sh:
- Add the reproducers from rhbz#982142 as regression tests
  (including an extra one for nested command substitutions that was
  already fixed as of 93u+, but testing is good).
     I replaced the external 'expr' and 'ls' commands by uses of
  the 'true' builtin, otherwise the tests take far too long to run
  with 16384 iterations. At least the alias leak was still behaving
  identically after replacing 'ls' by 'true'.
2020-09-21 00:36:36 +02:00
Martijn Dekker
9f2066f146 Improve fix for parentheses in param expansions (re: 5ed9ffd6)
The fix was incomplete: expansions using '?' (${var?w(ord},
${var:?wo)rd}) still did not tolerate parentheses in the word
as regular characters.

It was also possible to simplify the fix by making use of the
ST_BRACE (sh_lexstate7[]) state table. See data/lexstates.c and
include/lexstates.h.

src/cmd/ksh93/sh/lex.c: sh_lex(): case S_MOD1:
- The previous fix tested for modifier operator characters : - + =
  as part of the S_MOD2 case, though they are defined as S_MOD1 in
  the ST_BRACE state table. It only worked because of the
  fallthrough. And it turns out the S_MOD1 case already had a
  similar fix, though incomplete. The new fix effectively cancelled
  the old one out as any S_MOD1 character eventually led to
  'continue'. So it can be simplified by removing most of that
  code, without causing any change in behaviour. Only the mode
  change to the ST_QUOTE state table followed by 'continue' is
  necessary. This also fixes it for the '?' operator as that is
  also defined as S_MOD1 in the ST_BRACE state table.

src/cmd/ksh93/sh/macro.c:
- When skipping a ${...} expansion using sh_lexskip(), use the
  ST_QUOTE state table if the character c is an S_MOD1 modifier
  operator character. This makes it consistent with the S_MOD1
  handling in sh_lex().

src/cmd/ksh93/tests/variables.sh:
- Update regression tests to include ? and :? operators.
2020-09-13 10:15:26 +02:00
Martijn Dekker
5ed9ffd6c4 This fixes erroneous syntax errors in parameter expansions such as
${var:-wor)d} or ${var+w(ord}. The parentheses now correctly lose
their normal grammatical meaning within the braces. Fix by Eric
Scrivner (@etscrivner) from July 2018 backported from ksh2020.

This fix complies with POSIX:
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_02

src/cmd/ksh93/sh/lex.c: sh_lex():
- Set the ST_QUOTE state when analysing a modifier with parameter
  expansions using operators ':', '-', '+', '='. This state causes
  subsequent characters (including parentheses) to be considered
  quoted, suppressing their normal grammatical meaning.

src/cmd/ksh93/sh/macro.c: varsub():
- Same for skipping the expansion.

Fixes: https://github.com/ksh93/ksh/issues/126
Prior discussion: https://github.com/att/ast/issues/475
2020-09-05 16:20:22 +02:00
Martijn Dekker
f9c127e39e Remove legacy code for older libast versions
Since ksh 93u+m comes bundled with libast 20111111, there's no need
to support older versions, so this is another cleanup opportunity.

src/cmd/ksh93/include/defs.h:
- Throw an #error if AST_VERSION is undefined or < 20111111.
  (Note that _AST_VERSION is the same as AST_VERSION, but the
  latter is newer and preferred; see src/lib/libast/features/api)

All other changed files:
- Remove legacy code for versions older than the currently used
  versions, which are:
  _AST_VERSION    20111111
  ERROR_VERSION   20100309
  GLOB_VERSION    20060717
  OPT_VERSION     20070319
  SFIO_VERSION    20090915
  VMALLOC_VERSION 20110808
2020-09-04 02:31:39 +02:00
Martijn Dekker
c607c48c84 Revert <> redir FD except in posix mode (re: eeee77ed, 60516872)
eeee77ed implemented a POSIX compliance fix that caused a potential
incompatibility with existing ksh scripts; it made the (rarely
used) read/write redirection operator, <>, default to file
descriptor 0 (standard input) as POSIX specified, instead of 1
(standard output) which is traditional ksh93 behaviour. So ksh
scripts needed to change all <> to 1<> to override the new default.

This commit reverts that change, except in the new posix mode.

src/cmd/ksh93/sh/lex.c:
- Make FD for <> default to 0 in POSIX mode, 1 otherwise.

src/cmd/ksh93/tests/io.sh:
- Revert <> regression test changes from 60516872; we no longer
  need 1<> instead of <> in ksh code.
2020-09-01 08:48:18 +01:00