1
0
Fork 0
mirror of git://git.code.sf.net/p/cdesktopenv/code synced 2025-03-09 15:50:02 +00:00
Commit graph

281 commits

Author SHA1 Message Date
Johnothan King
8c16f38a88
Fix an infinite loop related to $_ if ksh is /bin/sh (#90)
The following explanation is mostly taken from Tomas Klacko's report on
the old mailing list (which also contains a C program reproducer) [*]:

1. When ksh starts a binary, it sets its environment variable "_"
   to "*number*/path/to/binary". Where "number" is the pid of the
   ksh process.

2. The binary forks and the child executes a suid root shell script
   which begins with #!/bin/sh. For this bug to occur, ksh must be /bin/sh.

3. The ksh process interpreting the suid shell script leaves the "_"
   variable as not set (nv_getval(L_ARGNOD) returns NULL) because
   the "number" from step 1 is not the pid of its parent process.

4-5. Because "_" is not set and the script is suid root, an infinite
   loop occurs because when the SHELL environment variable contains
   "/bin/sh" pathshell() returns "/bin/sh". This becomes an infinite
   loop of /bin/sh /dev/fd/3 executing /bin/sh /dev/fd/3.

src/cmd/ksh93/sh/init.c: get_lastarg():
- Disable the check for if the "number" refers to the process id of
  the parent process.

src/cmd/ksh93/sh/main.c: sh_main():
- Prevent an infinite loop when '$_' is not passed in from the environment.

Solaris applies this bugfix to their version of ksh:
https://github.com/oracle/solaris-userland/blob/master/components/ksh93/patches/190-17432413.patch

[*]: https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01680.html
2020-07-24 01:20:26 +01:00
Johnothan King
6e515f1d45
Fix command substitutions run on the same line as a here-doc (#91)
When a command substitution is run on the same line as a here-document,
a syntax error occurs due to a regression introduced in ksh93u+ 2011-04-15:

true << EOF; true $(true)
EOF
syntax error at line 1: `<<EOF' here-document not contained within command substitution

The regression is caused by an error check that was added to make
the following script causes a syntax error (because the here-document
isn't completed inside of the command substitution):

$(true << EOF)
EOF

src/cmd/ksh93/sh/lex.c:
- Only throw an error when a here-document in a command substitution
  isn't completed inside of the command substitution.

src/cmd/ksh93/tests/heredoc.sh:
- Add a regression test for running a command substitution on the
  same line as a here-document.
- Add a missed regression test for using here-documents in command
  substitutions. This is the original bug that was fixed in ksh93u+
  2011-04-15 (it is why the error message was added), but a regression
  test for here-documents in command substitutions wasn't added in
  that version.

This bugfix was backported from ksh93v- 2013-10-10-alpha.
2020-07-24 00:03:57 +01:00
Martijn Dekker
f207cd5787 Fix race conditions running external commands with job control on
When ksh is compiled with SHOPT_SPAWN (the default), which uses
posix_spawn(3) or vfork(2) (via sh_ntfork()) to launch external
commands, at least two race conditions occur when launching
external commands while job control is active. See:
1887863/comments/3
https://www.mail-archive.com/ast-developers@research.att.com/msg00717.html

The basic issue is that this performance optimisation is
incompatible with job control, because it uses a spawning mechanism
that doesn't copy the parent process' memory pages into the child
process, therefore no state that involves memory can be set before
exec-ing the external program. This makes it impossible to
correctly set the terminal's process group ID in the child process,
something that is essential for job control to work.

src/cmd/ksh93/sh/xec.c:
- Use sh_fork() instead of sh_ntfork() if job control is active.
  This uses fork(2), which is 30%-ish slower on most sytems, but
  allows for correctly setting the terminal process group.

src/cmd/ksh93/tests/basic.sh:
- Add regression test for the race condition reported in #79.

src/cmd/INIT/cc.darwin:
- Remove hardcoded flag to disable SHOPT_SPAWN on the Mac.
  It should be safe to use now.

Fixes https://github.com/ksh93/ksh/issues/79
2020-07-22 13:45:33 +01:00
Martijn Dekker
4e5f24e38c sh/xec.c: remove inactive and broken SHOPT_AMP code
This code has always been completely undocumented since it was
added sometime between 2002 and 2004[*]. No one (including Google)
knows what it's for and no one is likely to find out.

Not only that, it doesn't compile. If SHOPT_AMP is defined, then it
errors out on an undefined function `print_fun` and an undefined
member `shpath` of 'struct Shell_s'. So it's clear that the code
had been abandoned by its authors for some time as of 2012.

src/cmd/ksh93/sh/xec.c:
- Remove vestigial SHOPT_AMP stuff, whatever that was.

[*] Found out by searching multishell ksh93 repo:
    https://github.com/multishell/ksh93/
2020-07-22 13:38:34 +01:00
Johnothan King
e2d1b593ac
Merge dtksh patches from one of the CDE developers (#85)
This merges some fixes to support building dtksh with -DBUILD_DTKSH.
These patches were sent through private email from the CDE developer
Chase. The reason these patches were submitted is because Chase wishes
to include ksh in CDE as an up-to-date git submodule. Quote from Chase:
"... my priority is to get your new version into our code as a git
 submodule, and do it quickly before our code bases differ too widely."

Link to CDE project for anyone interested:
https://sourceforge.net/projects/cdesktopenv/

Although the patches were privately discussed, there are some public
emails on the CDE mailing list (links shortened due to long URLs):
ksh-chaos thread:   https://bit.ly/3hjJ83b
dtksh alias thread: https://bit.ly/3hkzKfJ

The main fix is for suid_exec, which is now told that /usr/dt is a
valid directory to run from via preprocessor flags. A patch for
Shift-JIS was also submitted, but it isn't in this commit because it
isn't an effective fix for the existing Shift-JIS bugs. I will be
giving that patch some more testing.

From: Chase <nicetrynsa@protonmail.ch>
Co-authored by: Johnothan King <johnothanking@protonmail.com>
2020-07-22 06:44:24 +01:00
Martijn Dekker
88e8fa67c6 Avoid crash due to broken optimisation in job locking [OpenSUSE]
This applies ksh93-jobs.dif from OpenSUSE. Source:
https://build.opensuse.org/package/show/openSUSE:Leap:42.3:Update/ksh

src/cmd/ksh93/sh/jobs.c:
- jog_init(): Save errno in case close(JOBTTY) fails. If cause of
  failure was interruption by a signal (EINTR), repeat close.
- job_kill(): Replace Red Hat fix for #35 with nicer OpenSUSE fix
  that doesn't add a goto before declaring variables. Re: ff358f34
2020-07-22 05:01:21 +01:00
Martijn Dekker
db72f41f4b Fix subshell file descriptor leak
A file descriptor (at least 3, can't reproduce for 4 and up) opened
with 'exec' or 'redirect' in a virtual/non-forked subshell survived
that subshell after exiting it:

    $ ksh -c '(redirect 3>&1); echo bug >&3'
    bug

src/cmd/ksh93/sh/io.c:
- Apply a patch from OpenSUSE (ksh93-redirectleak.dif). Source:
  https://build.opensuse.org/package/show/openSUSE:Leap:42.3:Update/ksh

src/cmd/ksh93/tests/io.sh:
- Add regression test.

Thanks to Marc Wilson for flagging this up.
2020-07-21 04:12:40 +01:00
Martijn Dekker
b2bdbef561 ksh -i: only print newline on EOF if really interactive
Some regression tests have to be run with the -i option, making the
shell behave (mostly) as if it is interactive. This causes ksh to
print a final newline upon EOF (Ctrl+D). This is functional if the
shell is really interactive, i.e. if standard input is on a
terminal and we're not running a shell script: it ensures that a
parent shell's prompt appears on a new line. But for tests like
   ksh -i -c 'testcommands'
or
   ksh -i <<EOF
   testcommands
   EOF
it's a minor annoyance. Adding an explicit 'exit' is an effective
workaround, but we might as well fix it.

src/cmd/ksh93/sh/main.c: exfile(): done:
- If shell is "interactive", only print final newline if standard
  input is on a terminal and we're not running a -c script.
2020-07-20 16:29:43 +01:00
Johnothan King
bd88cc7f4f
Fix two crashes related to kshdb (#82)
This commit fixes two different crashes related to kshdb:
- When redirect is given an invalid file descriptor, a segfault
  no longer occurs. Reproducer:
  $ ksh -c 'redirect 9>&200000000000'

- Fix a crash due to free(3) being used on an invalid pointer.
  This can be reproduced with kshdb (commands from att/ast#582):
  $ git clone https://github.com/rocky/kshdb.git
  $ cd kshdb
  $ ksh autogen.sh
  $ echo "print hi there" > $HOME/.kshdbrc
  $ ./kshdb -L . test/example/dbg-test1.sh

src/cmd/ksh93/bltins/misc.c: b_dot_cmd():
- The string pointed to by shp->st.filename must be able to be
  freed from memory with free(3), so duplicate the string with
  strdup(3).

src/cmd/ksh93/sh/io.c: sh_redirect():
- Show an error message when a file descriptor is invalid to
  fix a memory fault.
2020-07-19 23:42:12 +01:00
Martijn Dekker
3613da4240 Remove unused libcoshell
The coshell(1) command, which is required for libcoshell to be
useful, is not known to be shipped by any distribution. It was
removed by the ksh-community fork and hence also by 93u+m (in
2940b3f5). The coshell facility as a whole is obsolete and
insecure. For a long time now, the statically linked libcoshell
library has been 40+ kilobytes of dead weight in the ksh binary.

Prior discussion (ksh2020): https://github.com/att/ast/issues/619

src/lib/libcoshell/*:
- Removed.

src/cmd/ksh93/*:
- Remove the SHOPT_COSHELL compiler option (which was enabled) and
  a lot of code that was conditional upon #ifdef SHOPT_COSHELL.

- init.c: e_version[]: Removing SHOPT_COSHELL changed the "J"
  feature identifier in ${.sh.version} to a lowercase "j", which
  was conditional upon SHOPT_BGX (background job extensions).
  But src/cmd/ksh93/RELEASE documents (at 08-12-04, on line 1188):
    | +SHOPT_BGX enables background job extensions. Noted by "J" in
    |  the version string when enabled. [...]
  That is the only available documentation. So change that "j" back
  to a "J", leaving the version string unchanged after this commit.

- jobs.c: job_walk(): We need to keep one 'job_waitsafe(SIGCHLD);'
  call that was conditional upon SHOPT_COSHELL; removing it caused
  a regression test failure in tests/sigchld.sh, 'SIGCHLD blocked
  for script at end of pipeline' (which means that until now, a ksh
  compiled without libcoshell had broken SIGCHLD handling.)

bin/package, src/cmd/INIT/package.sh:
- Don't export COSHELL variable.
2020-07-17 19:28:52 +01:00
Johnothan King
2db9953ae0
Fix three bugs in the sleep builtin (#77)
This commit backports the main changes to sh_delay from ksh93v-
and ksh2020, which fixes the following bugs:

- Microsecond amounts of less than one millisecond are no longer
  ignored. The following loop will now take a minimum of one
  second to complete:
  for ((i = 0; i != 10000; i++)) do
    sleep PT100U
  done

- 'sleep 30' no longer adds an extra 30 milliseconds to the total
  amount of time to sleep. This bug is hard to notice since 30
  milliseconds can be considered within the margin of error. The
  only reason why longer delays weren't affected is because the old
  code masked the bug when the interval is greater than 30 seconds:
  else if(n > 30)
  {
      sleep(n);
      t -= n;
  }
  This caused 'sleep -s' to break with intervals greater than 30
  seconds, so an actual fix is used instead of a workaround.

- 'sleep -s' now functions correctly with intervals of more than
  30 seconds as the new code doesn't need the old workaround. This
  is done by handling '-s' in sh_delay.

src/cmd/ksh93/bltins/sleep.c:
- Remove the replacement for sleep(3) from the sleep builtin.
- Replace the old sh_delay function with the newer one from ksh2020.
  The new function uses tvsleep, which uses nanosleep(3) internally.

src/cmd/ksh93/include/shell.h,
src/cmd/ksh93/edit/edit.c,
src/cmd/ksh93/sh/jobs.c,
src/cmd/ksh93/sh/xec.c,
src/cmd/ksh93/shell.3:
- Update sh_delay documentation and usage since the function now
  requires two arguments.

src/cmd/ksh93/tests/builtins.sh:
- Add a regression test for 'sleep -s' when the interval is greater
  than 30 seconds. The other bugs can't be tested for in a feasible
  manner across all systems:
  https://github.com/ksh93/ksh/pull/72#issuecomment-657215616
2020-07-17 05:00:28 +01:00
Johnothan King
03224ae3af
Make the 'history' and 'r' commands builtins (#76)
With this change no more preset aliases exist, so the preset alias
tables can be safely removed. All ksh commands can now be used
without 'unalias -a' removing them, even in interactive shells.
Additionally, the history and r commands are no longer limited to
being used in interactive shells.

src/cmd/ksh93/bltins/hist.c:
- Implement the history and r commands as builtins. Also guarantee
  lflag is set to one by avoiding 'lflag++'.

src/cmd/ksh93/Makefile,
src/cmd/ksh93/Mamfile,
src/cmd/ksh93/sh/main.c,
src/cmd/ksh93/sh/init.c,
src/cmd/ksh93/data/aliases.c:
- Remove the table of predefined aliases because the last few have
  been removed. During init the alias tree is now initialized the
  same way as the function tree.

src/cmd/ksh93/bltins/typeset.c:
- Remove the bugfix for unsetting predefined aliases because it is
  now a no-op. Aliases are no longer able to have the NV_NOFREE
  attribute.

src/cmd/ksh93/tests/alias.sh:
- Remove the regression test for unsetting predefined aliases since
  those no longer exist.

src/cmd/ksh93/data/builtins.c:
- Update sh_opthist[] for 'hist --man', etc.

src/cmd/ksh93/sh.1:
- Remove the list of preset aliases since those no longer exist.
- Document history and r as builtins instead of preset aliases.

Co-authored-by: Martijn Dekker <martijn@inlv.org>
2020-07-16 18:56:49 +01:00
Martijn Dekker
17f81ebedb Load 'r' and 'history' default aliases on interactive only
These two default aliases are useful on interactive shells. In
scripts, they interfere with possible function or command names.

As of this commit, these final two default aliases are only loaded
for interactive shells, leaving zero default aliases for scripts.
This completes the project to get rid of misguided default aliases.

src/cmd/ksh93/include/shtable.h,
src/cmd/ksh93/data/aliases.c:
src/cmd/ksh93/sh/init.c:
- Add empty alias table shtab_noaliases[] for scripts.
- Rename inittree() to sh_inittree() and make it external.
- nv_init(), sh_reinit(): Initialise empty alias tree for scripts.

src/cmd/ksh93/sh/main.c: sh_main():
- If interactive, reinitialise alias tree for interactive shells.

src/cmd/ksh93/tests/alias.sh:
- To test default alias removal, launch shell with -i.
2020-07-16 06:44:05 +01:00
Martijn Dekker
8c7c60ec19 shellquoting: rm redundant iswprint() call (re: f9d28935)
A regression test failure was occurring on FreeBSD for
  bin/shtests -u builtins
because UTF-8 characters were wrongly encoded as bytes in the
C.UTF-8 locale. The cause is that iswprint() always returns false
on FreeBSD if the ksh-specific C.UTF-8 locale is active, as the OS
doesn't support it.

That iswprint() call is redundant anyway; the new is_invisible()
function now handles this.

src/cmd/ksh93/sh/string.c: sh_fmtq():
- Remove redundant iswprint() test.
2020-07-16 01:13:59 +01:00
Martijn Dekker
1fbbeaa19d Convert default typeset aliases to regular builtins
This converts the 'autoload', 'compound', 'float', 'functions',
'integer' and 'nameref' default aliases into regular built-in
commands, so that 'unalias -a' does not remove them. Shell
functions can now use these names, which improves compatibility
with POSIX shell scripts.

src/cmd/ksh93/data/aliases.c:
- Remove default typeset aliases.

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/include/builtins.h:
- Add corresponding built-in command declarations. Typeset-style
  commands are now defined by a pointer range, SYSTYPESET ..
  SYSTYPESET_END. A couple need their own IDs (SYSCOMPOUND,
  SYSNAMEREF) for special-casing in sh/xec.c.
- Update 'typeset --man'.

src/cmd/ksh93/bltins/typeset.c: b_typeset():
- Recognise the new builtin commands by argv[0]. Implement them by
  inserting the corresponding 'typeset' options into the argument
  list before parsing options. This may seem like a bit of a hack,
  but it is simpler, shorter, more future-proof and less
  error-prone than manually copying and adapting all the complex
  flaggery from the option parsing loop.

src/cmd/ksh93/sh/parse.c,
src/cmd/ksh93/sh/xec.c:
- Recognise typeset-style commands by SYSTYPESET .. SYSTYPESET_END
  pointer range.
- Special-case 'compound' (SYSCOMPOUND) and 'nameref' (SYSNAMEREF)
  along with recognising the corresponding 'typeset' options.

src/cmd/ksh93/sh.1:
- Update to document the new built-ins.
- Since not all declaration commands are special built-ins now,
  identify declaration commands using a double-dagger "\(dd"
  character (which renders as '=' in ASCII) and disassociate their
  definition from that of special built-ins.

src/cmd/ksh93/tests/variables.sh:
- Adapt a regression test as there is no more 'integer' alias.
2020-07-15 20:54:06 +01:00
Martijn Dekker
c5820aabc9 Fix $TIMEFORMAT zero-decimal and error behaviour (re: 70fc1da7)
The backported 'time' keyword code introduced a bug (shared by
ksh2020): the $TIMEFORMAT format sequences %0R, %0U and %0S output
a decimal fraction, acting as %1R, %1U and %1S.

A minor ksh2020 behaviour change that was also backported was that
the $TIMEFORMAT formatting no longer errored out on encountering an
invalid identifier, but continued. That behaviour is now reverted.

Neither of these two regressions occurred on older systems that
have to use times(3) instead of getrusage(2) or gettimeofday(2).

This commit also tweaks a regression test so that it doesn't fail
if the old times(3) interface is used.

src/cmd/ksh93/sh/xec.c: p_time():
- (Fix indentation of a for loop.)
- On modern systems, when outputting the result of $TIMEFORMAT
  format sequences, only print fraction if precision is nonzero.
- On modern systems, when encountering an invalid format sequence,
  abort formatting in the same way as done for old systems.
- On old systems, initialise 'n' in a more readable way when used
  as the index for tm[].

src/cmd/ksh93/tests/basic.sh:
- Don't fail, but issue warning on old systems that use times(3).
  Otherwise, check milliseconds: with the ksh 'sleep' builtin,
  'TIMEFORMAT=%3R; time sleep .002' should always output '0.002'.
- Change regression test for TIMEFORMAT='%0S%' to check for the
  correct output, '0%', instead of checking for an error message.
2020-07-15 02:43:35 +01:00
Johnothan King
70fc1da73e
Fix the max precision of the 'time' keyword (#72)
This commit backports the required fixes from ksh2020 for using
millisecond precision with the 'time' keyword. The bugfix refactors
a decent amount of code to rely on the BSD 'timeradd' and
'timersub' macros for calculating the total amount of time elapsed
(as these aren't standard, they are selectively implemented in an
iffe feature test for platforms without them). getrusage(3) is now
preferred since it usually has higher precision than times(3) (the
latter is used as a fallback).

There are three other fixes as well:

src/lib/libast/features/time:
- Test for getrusage with an iffe feature test rather than
  assume _sys_times == _lib_getrusage.

src/cmd/ksh93/sh/xec.c:
- A single percent at the end of a format specifier is now
  treated as a literal '%' (like in Bash).
- Zero-pad seconds if seconds < 10. This was already done for
  the times builtin in commit 5c677a4c, although it wasn't
  applied to the time keyword.
- Backport the ksh2020 bugfix for the time keyword by using
  timeradd and timersub with gettimeofday (which is used with
  a timeofday macro). Prefer getrusage when it is available.
- Allow compiling without the 'timeofday' ifdef for better
  portability.
  This is the order of priority for getting the elapsed time:
  1) getrusage (most precise)
  2) times + gettimeofday (best fallback)
  3) only times (doesn't support millisecond precision)
  This was tested by using debug '#undef' statements in xec.c.

src/cmd/ksh93/features/time:
- Implement feature tests for the 'timeradd' and 'timersub'
  macros.
- Do a feature test for getrusage like in the libast time test.

src/cmd/ksh93/tests/basic.sh:
- Add test for millisecond precision.
- Add test for handling of '%' at the end of a format specifier.
- Add test for locale-specific radix point.
2020-07-14 22:48:04 +01:00
Johnothan King
fc655f1a26
Restore 'set -b'/'set -o notify' functionality (#74)
'set -b' had no effect; it should cause the shell to notify job
state changes immediately instead of waiting for the next prompt.

This fixes a regression that was introduced in ksh93t 2008-07-25.
The bugfix is from: https://github.com/att/ast/pull/1089

src/cmd/ksh93/sh/jobs.c:
- Save the tty wait state and avoid changing it if TTYWAIT was
  already on to avoid breaking 'set -b'.
  The last 'sh_offstate' is inside of an '#if' directive because it
  is only required when ksh is compiled with SHOPT_COSHELL enabled.

src/cmd/ksh93/tests/pty.sh:
- Add a regression test for 'set -b' in interactive shells.
2020-07-14 22:00:28 +01:00
Johnothan King
66c955bc8f
Fix a fork bomb when vi is run from a script and sent Ctrl-Z (#73)
This bug was reported on the old mailing list:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg00207.html

A fork bomb can occur when SIGTSTP is sent to the vi editor. Vi
must be launched from a script run with exec (tested with
BusyBox vi, nvi and vim):
$ cat /tmp/foo
vi /tmp/bar
echo end
$ ksh
$ chmod +x /tmp/foo
$ exec /tmp/foo
While in vi, send SIGTSTP using Ctrl-Z

src/cmd/ksh93/sh/fault.c:
- Only fork after Ctrl-Z if job control is available. The patch
  used checks 'job.jobcontrol' instead of 'SH_MONITOR':
  https://git.centos.org/rpms/ksh/blob/c8/f/SOURCES/ksh-20120801-forkbomb.patch
2020-07-13 19:10:23 +01:00
Martijn Dekker
778fd6ca2d Fix possible crash due to failure to update shell FD state
This applies ksh-20100621-fdstatus.patch from Red Hat. Not very
much information is available, so this one is more or less taken
on faith. But it seems to make sense on the face of it: calling
sh_fcntl() instead of fcntl(2) directly makes the shell update its
internal file descriptor state more frequently.

It claims to fix Red Hat bug 924440. The report is currently closed
to the public: https://bugzilla.redhat.com/show_bug.cgi?id=924440

However, Kamil Dudka at Red Hat writes:
https://github.com/ksh93/ksh/issues/67#issuecomment-656379993
| Yes, the summary of RHBZ#924440 is "crash in bestreclaim() after
| traversing a memory block with a very large size". We did not have
| any in house reproducer for the bug. The mentioned patch was
| provided and verified by a customer.

...and Marc Wilson dug up a Red Hat erratum containing this info:
https://download.rhn.redhat.com/errata/RHBA-2013-1599.html
| Previously, the ksh shell did not resize the file descriptor list
| every time it was necessary. This could lead to memory corruption
| when several file descriptors were used. As a consequence, ksh
| terminated unexpectedly. This updated version resizes the file
| descriptor list every time it is needed, and ksh no longer
| crashes in the described scenario. (BZ#924440)

No reproducer means no regression test can be added now.

src/cmd/ksh93/sh/io.c,
src/cmd/ksh93/sh/subshell.c,
src/cmd/ksh93/sh/xec.c:
- Change several fcntl(2) calls to sh_fcntl(). This function calls
  fcntl(2) and then updates the shell's file descriptor state.
2020-07-10 20:04:31 +01:00
Johnothan King
c4236cc295 Fix type names starting with lowercase 'a' (#69)
Type names that start with a lowercase 'a' cause an error when used:

$ typeset -T al=(typeset bar)
$ al foo=(bar=testset)
/usr/bin/ksh: al: : invalid variable name

The error occurs because when the parser checks for the alias
builtin (to set 'assignment' to two instead of one), only the first
letter of 'argp->argval' is checked (rather than the entire
string). This was fixed in ksh93v- by comparing argp->argval
against "alias", but in ksh93u+m the check can simply be removed
because it is only run when a builtin has the BLT_DCL flag. As of
04b9171, the alias builtin does not have that flag.

src/cmd/ksh93/sh/parse.c:
- Remove the bugged check for the alias builtin.

src/cmd/ksh93/tests/types.sh:
- Add a regression test for type names starting with a lowercase 'a'.
2020-07-10 17:54:51 +01:00
Martijn Dekker
f9d28935bb Fix UTF-8 shellquoting for xtrace, printf %q, etc.
This fixes an annoying issue in the shell's quoting algorithm
(used for xtrace (set -x), printf %q, and other things) for UTF-8
locales, that caused it to encode perfectly printable UTF-8
characters unnecessarily and inconsistently. For example:

$ (set -x; : 'aeu aéu')
+ : $'aeu a\u[e9]u'
$ (set -x; : 'aéu aeu')
+ : 'aéu aeu'
$ (set -x; : '正常終了 aeu')
+ : '正常終了 aeu'
$ (set -x; : 'aeu 正常終了')
+ : $'aeu \u[6b63]\u[5e38]\u[7d42]\u[4e86]'

This issue was originally reported by lijo george in May 2017:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01958.html

src/cmd/ksh93/sh/string.c:
- Add is_invisible() function that returns true if a character is a
  Unicode invisible (non-graph) character, excluding ASCII space.
  Ref.: https://unicode.org/charts/PDF/U2000.pdf
- Use a fallback in is_invisible() if we cannot use the system's
  iswprint(3); this is the case for the ksh C.UTF-8 locale if the
  OS doesn't support that. Fall back to a hardcoded blacklist of
  invisible and control characters and put up with not encoding
  nonexistent characters into \u[xxxx] escapes.
  Ref.: https://unicode.org/charts/PDF/U2000.pdf
- When deciding whether to switch to $'...' quoting mode (state=2),
  use is_invisible() instead of testing for ASCII 0-127 range.
- In $'...' quoting mode, use is_invisible() to decide whether to
  encode wide characters into \u[xxxx] escapes.

src/cmd/ksh93/tests/builtins.sh:
- Add regression tests for shellquoting Arabic, Japanese and Latin
  UTF-8 characters, to be run only in a UTF-8 locale. The Arabic
  sample text[*] contains a couple of direction markers that are
  expected to be encoded into \u[xxxx] escapes.

[*] source: https://r12a.github.io/scripts/tutorial/summaries/arabic
2020-07-10 05:55:11 +01:00
Martijn Dekker
588a1ff7ca Fix spurious warning output in KIA (-R) database file
The ksh -R option creates a cross-reference database that can be
parsed with a "C Query Language" (CQL) tool.
See cql-1994.pdf at: http://gsf.cococlyde.org/files

The -R option puts ksh in noexec mode as it parses the script, and
this can produce warnings as the syntax is parsed. The bug is that
these warnings can end up in the database file, corrupting it.

This applies a fix from Paulo Andrade, via Siteshwar Vashisht:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01952.html

src/cmd/ksh93/sh/parse.c:
- Terminate names with a zero character when writing database
  output.

A regression test is not very feasible because the majority of the
database output consists of cryptic IDs/hashes that vary depending
on the session and/or system and possibly other things.
2020-07-09 23:18:41 +01:00
Johnothan King
6930666234
Fix a syntax error when ((...)) is combined with redirections (#68)
This bugfix was backported from ksh93v- 2013-10-10-alpha.

src/cmd/ksh93/sh/parse: item():
- The done label is placed after the 'inout' call for handling I/O
  redirections. This causes the command below to produce a syntax
  error because the '>' is not handled as a redirection operator
  after 'goto done':
  $ ((1+2)) > /dev/null
  /usr/bin/ksh: syntax error: `>' unexpected
  Moving the done label fixes the syntax error as 'inout' is now
  called to handle the redirection operator.

src/cmd/ksh93/tests/arith.sh:
- Add a simple regression test.
2020-07-09 22:12:04 +01:00
Martijn Dekker
361fe1fcc3 Fix hash table memory leak when restoring PATH
There is a bug in path_alias() that may cause a memory leak when
clearing the hash table while setting/restoring PATH.

This applies a fix from Siteshwar Vashist:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01945.html

Note that, contrary to Siteshwar's analysis linked above, this bug
has nothing directly to do with subshells, forked or otherwise; it
can also be reproduced by temporarily setting PATH for a command,
for example, 'PATH=/dev/null true', and then doing a PATH search.

Modified analysis:
ksh maintains the value of PATH as a linked list. When a local
scope for PATH is created (e.g. in a virtual subshell or when doing
something like PATH=/foo/bar command ...), ksh duplicates PATH by
increasing the refcount for every element in the linked list by
calling the path_dup() and path_alias() functions. However, when
the state of PATH is restored, this refcount is not decreased. Next
time when PATH is reset to a new value, ksh calls the path_delete()
function to delete the linked list that stored the older path. But
the path_delete() function does not free elements whose refcount is
greater than 1, causing a memory leak.

src/cmd/ksh93/sh/path.c: path_alias():
- Decrease refcount and free old item if needed.
  (The 'old' variable was already introduced in 99065353, but
  its value was never used there; this fixes that as well.)

src/cmd/ksh93/tests/leaks.sh:
- Add regression test. With the bug, setting/restoring PATH
  (which clears the hash table) and doing a PATH search 16 times
  causes about 1.5 KiB of memory to be leaked.
2020-07-09 18:34:15 +01:00
Martijn Dekker
a8f6d6b842 Fix crash due to double free() when sourcing multiple files
There is a bug in sh_eval() that may cause ksh to crash due to a
double free() after sourcing multiple files with '.' or 'source'
if a longjmp is triggered, e.g. by a syntax error.

This applies a fix from Siteshwar Vashist:
https://www.mail-archive.com/ast-developers@lists.research.att.com/msg01943.html

src/cmd/ksh93/sh/xec.c: sh_eval():
- Zero file descriptor io_save after closing it. This prevents a
  double free() after returning from a longjmp.

src/cmd/ksh93/tests/basic.sh:
- Add reproducer as regression test.
2020-07-09 15:35:07 +01:00
Johnothan King
e70925ce10
Fix memory leak on unset of associative array (#64)
Associative arrays weren't being properly freed from memory, which
was causing a memory leak.

This commit incorporates a patch and reproducer/regress test from:
https://www.mail-archive.com/ast-users@lists.research.att.com/msg01016.html

src/cmd/ksh93/sh/name.c:
- Properly free associative arrays from memory in nv_delete().

src/cmd/ksh93/tests/leaks.sh:
- Add regression test.
2020-07-09 01:09:40 +01:00
Martijn Dekker
300cd19987 Fix corrupt UTF-8 char processing & shellquoting after aborted read
If the processing of a multibyte character was interrupted in UTF-8
locales, e.g. by reading just one byte of a two-byte character 'ü'
(\303\274) with a command like:
	print -nr $'\303\274' | read -n1 g
then the shellquoting algorithm was corrupted in such a way that
the final quote in simple single-quoted string was missing. This
bug may have had other, as yet undiscovered, effects as well. The
problem was with corrupted multibyte character processing and not
with the shell-quoting routine sh_fmtq() itself.

Full trace and discussion at: https://github.com/ksh93/ksh/issues/5
(which is also an attempt to begin to understand the esoteric
workings of the libast mb* macros that process UTF-8 characters).

src/lib/libast/comp/setlocale.c: utf8_mbtowc():
- If called from the mbinit() macro (i.e. if both pointer
  parameters are null), reset the global multibyte character
  synchronisation state variable. This fixes the problem with
  interrupted processing leaving an inconsistent state, provided
  that mbinit() is called before processing multibyte characters
  (which it is, in most (?) places that do this). Before this fix,
  calling mbinit() in UTF-8 locales was a no-op.

src/cmd/ksh93/sh/string.c: sh_fmtq():
- Call mbinit() before potentially processing multibyte characters.
  Testing suggests that this could be superfluous, but at worst,
  it's harmless; better be sure.

src/cmd/ksh93/tests/builtins.sh:
- Add regression test for shellquoting with 'printf %q' after
  interrupting the processing of a multibyte characeter with
  'read -n1'. This test only fails in a UTF-8 locale, e.g. when
  running: bin/shtests -u builtins SHELL=/buggy/ksh-2012-08-01

Fixes #5.
2020-07-05 19:24:41 +02:00
Anuradha Weeraman
035a4cb3f4
Fix segfault if $PATH contains a .paths directory (#55)
ksh crashed if it encountered a .paths directory in any of the
directories in $PATH.

Ref: 1534855

src/cmd/ksh93/sh/path.c: path_chkpaths():
- Refuse to read .paths if it's not a regular file
  or a symlink to a regular file.
2020-07-02 23:29:07 +01:00
Johnothan King
db1d539d49
Fix ERE repetition expressions in [[ ... =~ ERE{x,y} ]] (#54)
Regular expressions that combine a repetition expression with
a parenthesized sub-expression throw a garbled syntax error:

$ [[ AATAAT =~ (AAT){2} ]]
ksh: syntax error: `~(E)(AAT){2} ]]
:'%Cred%h%Creseksh: syntax error: `~(E)(AAT){2} ]]
:'%Cred%h%Creseksh: syntax' unexpected

The syntax error occurs because ksh is not fully
accounting for '=~' when it runs into a curly bracket.
This fix disables the syntax error when the operator
is '=~' and adds handling for '(str){x}' (to allow for
more than one sub-expression). This bugfix and the
regression tests for it were backported from ksh93v-
2014-12-24-beta.

src/cmd/ksh93/sh/lex.c:
- Do not trigger a syntax error for '{x}' when the operator
  is '=~' and add handling for multiple parentheses when
  combined with '{x}'.

src/cmd/ksh93/tests/bracket.sh:
- Add two tests from ksh93v- to test sub-expressions
  combined with the '{x}' quantifier.
2020-07-02 18:40:15 +01:00
Johnothan King
10b6ba801d
Fix memory corruption when a compound variable is unset (#49)
The following set of commands ends with a memory fault under
certain circumstances because ksh attempts to free memory
twice, causing memory corruption:

$ testarray=(1 2)
$ compound testarray
$ unset testarray
$ eval testarray=

The fix is to make sure 'np->nvfun' is a valid pointer before
attempting to free memory in 'put_tree'. This patch is from
OpenSUSE: https://build.opensuse.org/package/view_file/shells/ksh/ksh93-nvtree-free.dif?expand=1

src/cmd/ksh93/sh/nvtree.c:
- Do not try to free memory when 'np->nvfun' and 'val'
  are false.

src/cmd/ksh93/tests/comvar.sh:
- Add a regression test for the double free problem. The
  reproducer must be run from an executable script
  with 'ksh -c'.
2020-06-29 18:08:28 +01:00
Johnothan King
5135cf651c
Fix crashes caused by 'typeset -RF' (#47)
Variables created with 'typeset -RF' were being treated as
short integers, even though they are actually floating point
values. As a result the following example will cause a crash:

$ typeset -RF foo=1
$ test "$foo"

This is fixed by checking for 'NV_DOUBLE' with 'nv_isattr',
which prevents ksh from treating floating point values as
short integers due to '== NV_INT16P' excluding 'NV_DOUBLE'.
This bugfix was backported from ksh93v- 2013-10-10-alpha.

src/cmd/ksh93/sh/array.c,
src/cmd/ksh93/sh/name.c,
src/cmd/ksh93/sh/nvdisc:
- Avoid treating floating point values as short integers by
  checking for 'NV_DOUBLE' with 'nv_isattr'.

src/cmd/ksh93/tests/types.sh:
- Add a regression test for the 'typeset -RF' crash. The
  crash cannot be replicated if 'typeset -RF' sets 'foo'
  to zero.
2020-06-28 23:30:27 +01:00
Johnothan King
0aa9e03f55
Fix process substitution combined with redirection (#40)
The code for handling process substitution with redirection was
never being run because IORAW is usually set when IOPROCSUB is
set. This commit fixes the problem by moving the required code
out of the !IORAW if statement. The following command now prints
'good' instead of writing 'ok' to a bizzare file:

$ ksh -c 'echo ok > >(sed s/ok/good/); wait'
good

This commit also fixes a bug that caused the process ID of the
asynchronous process to print when the shell was in interactive
mode. The following command no longer prints a process ID,
behaving like in Bash and zsh:

$ echo >(true)
/dev/fd/5

src/cmd/ksh93/sh/args.c:
 - Temporarily turn off the interactive state while in a process
   substitution to prevent the shell from printing the PID of
   the asynchronous process.

src/cmd/ksh93/sh/io.c:
 - Move the code for process substitution with redirection into
   a separate if statement.

src/cmd/ksh93/tests/io.sh:
 - Add two tests for both process substitution bugs fixed by this
   commit.

src/cmd/ksh93/tests/shtests:
 - Update shtests with a patch from Martijn Dekker to use
   pretty-printing for the output from the times builtin (if it
   is available).

Fixes #2
2020-06-23 23:02:16 +01:00
Johnothan King
c1994b87f1
Fix nested functions ignoring prefixed variable assignments (#37)
This commit fixes the bug described in att/ast#32. The fix and
following explanation is from att/ast#467:

While copying variables from function's local scope to a new scope,
variable attributes were not copied. Such variables were not marked
to be exported in the new function. For e.g.

function f2 { env | grep -i "^foo"; }
function f1 { env | grep -i "^foo"; f2; }
foo=bar f1

prints 'foo=bar' only once, but it should print be twice.

src/cmd/ksh93/sh/xec.c:
 - When variables from the local scope of a function are copied into
   the scope of a nested function, the attributes of the variables
   need to be copied as well.

src/cmd/ksh93/tests/functions.sh:
 - Add regression tests from ksh2020 to check environment variables
   passed to functions.
2020-06-23 00:27:05 +01:00
Johnothan King
ff358f3464 Fix a crash when 'kill %%' and 'kill %+' are run (#35)
Ksh was trying to use the 'pw' variable as a valid pointer even
when it was NULL. This is fixed by doing the error check for
'pw' before doing anything else in 'job_kill'.

This bugfix is from Red Hat:
44e0a643a9/f/SOURCES/ksh-20130214-fixkill.patch

Fixes #34
2020-06-22 19:11:49 +02:00
Martijn Dekker
9d428f8f5e Fix erroneous fork after 'readonly PATH' in subshell (re: 102868f8)
After making PATH readonly in a virtual subshell (without otherwise
changing it, so the subshell is never forked), then the main shell
would erroneously fork into a background process immediately after
leaving the virtual subshell. This was caused by a bug in the
forking workaround that prevents changes in PATH in a virtual
subshell from clearing the parent shell's hash table.

src/cmd/ksh93/sh/name.c: nv_putval():
- If we're either setting or restoring PATH, do an additional check
  for the NV_RDONLY flag, which means the function was told to
  ignore the variable's readonly state. It is told to ignore that
  when restoring the parent shell state after exiting a virtual
  subshell. If we don't fork then, we don't fork the parent shell.

src/cmd/ksh93/tests/subshell.sh:
- Add regression test verifying that no forking happens when making
  PATH readonly in a subshell.

Fixes #30.
2020-06-20 23:47:42 +02:00
Johnothan King
bd3e2a8001
Fix unreliable behavior when special vars are readonly or unset (#27)
src/cmd/ksh93/data/variables.c:
 - Running 'unset .sh.lineno' creates a memory fault, so fix that
   by giving it the NV_NOFREE attribute. This crash was happening
   because ${.sh.lineno} is an integer that cannot be freed from
   memory with free(3).

src/cmd/ksh93/sh/init.c:
 - Tell _nv_unset to ignore NV_RDONLY when $RANDOM and $LINENO are
   restored from the subshell scope. This is required to fully
   restore the original state of these variables after a virtual
   subshell finishes.

src/cmd/ksh93/bltins/typeset.c,
src/cmd/ksh93/sh/subshell.c:
 - Disabled some optimizations for two instances of 'sh_assignok' to
   fix 'readonly' in virtual subshells and '(unset .sh.level)' in
   nested functions. This fixes the following variables when
   '(readonly $varname); enum varname=' is run:

   $_
   ${.sh.name}
   ${.sh.subscript}
   ${.sh.level}

   The optimization in question prevents sh_assignok from saving the
   original state of these variables by making the sh_assignok call
   a no-op. Ksh needs the original state of a variable for it to be
   properly restored after a virtual subshell has run, otherwise ksh
   will simply carry over any new flags (being NV_RDONLY in this case)
   from the subshell into the main shell.

src/cmd/ksh93/tests/variables.sh:
 - Add regression tests from Martijn Dekker for setting special
   variables as readonly in virtual subshells and for unsetting
   special variables in general.

Fixes #4
2020-06-20 18:08:41 +01:00
Johnothan King
99065353b3 Fix 'whence -a' to print correct path for tracked alias (#25)
'whence -a' bases the path for tracked aliases on the user's
current working directory if an enabled ksh builtin of the same
name is also available. The following example will claim 'cat'
is in the user's current working directory:

$ whence -a cat
cat is a tracked alias for /usr/bin/cat
$ builtin cat
$ whence -a cat
cat is a shell builtin
cat is /usr/bin/cat
cat is a tracked alias for /current/working/directory/cat

This patch from ksh2020 fixes this problem by properly saving the
path of the tracked alias for use with 'whence -a', since
'path_pwd' (as implied by the function's name) only gets the users
current working directory, not the location of tracked aliases.
Ref.: https://github.com/att/ast/issues/1049

This bug was originally reported by David Morano about two decades
ago to the AST team: https://github.com/att/ast/issues/954

src/cmd/ksh93/bltins/whence.c:
 - Print the actual path of a tracked alias, path_pwd doesn't
   have this functionality.

src/cmd/ksh93/include/name.h:
 - Add 'pathcomp' for saving the value of tracked aliases.

src/cmd/ksh93/sh/path.c:
 - Save the value of tracked aliases for use by whence.

src/cmd/ksh93/tests/builtins.sh:
 - Add a regression test for using 'whence -a' on tracked
   aliases with a builtin equivalent.
2020-06-19 14:03:58 +02:00
Martijn Dekker
3e3f6b0f12 Restore #22 'unset -f' fix minus segfault (re: b7932e87, 97511748)
Applying the fix for 'unset -f' exposed a crashing bug in lookup()
in sh/nvdisc.c, which is the function for looking up discipline
functions. This is what caused tests/variables.sh to crash.
Ref.: https://github.com/ksh93/ksh/issues/23#issuecomment-645699614

src/cmd/ksh93/sh/nvdisc.c: lookup():
- To avoid segfault, check that the function pointer nq->nvalue.rp
  is actually set before checking if nq->nvalue.rp->running==1.

src/cmd/ksh93/sh/xec.c,
src/cmd/ksh93/tests/functions.sh:
- Uncomment the 'unset -f' fix from b7932e87.

Resolves #21 (again).
2020-06-18 02:48:51 +02:00
Martijn Dekker
975117485c Part revert #22 to undo memory fault (re: b7932e87)
The fix in sh/xec.c, which was backported from the ksh 93v- beta to
delay the actual removal of a running function that unsets itself,
caused a segfault in the variables.sh regression tests (see #23).

src/cmd/ksh93/sh/xec.c:
- Comment out the backported code pending a correct fix for #21.
  Now both types of functions silently fail to unset themselves
  (unless they're discipline functions).

src/cmd/ksh93/tests/functions.sh:
- Disable regression tests checking that the function was actually
  unset, pending a correct fix for #21.

Resolves: #23
Reopens: #21
2020-06-17 21:01:55 +02:00
Johnothan King
b7932e87b6
Fix two problems with 'unset -f' behavior (#22)
src/cmd/ksh93/sh/name.c:
 - Correct the check for when a function is currently running
   to fix a segmentation fault that occurred when a POSIX
   function tries to unset itself while it is running.
   This bug fix was backported from ksh93v-.

src/cmd/ksh93/sh/xec.c:
 - If a function tries to unset itself, unset the function
   with '_nv_unset(np, NV_RDONLY)' to fix a silent failure.
   This fix was also backported from ksh93v-.

src/cmd/ksh93/tests/functions.sh:
 - Add four regression tests for when a function unsets itself.

Resolves #21
2020-06-17 18:26:43 +01:00
Johnothan King
fae8862c53
Fix assignments preceding 'command <special builtin>' (#19)
Ksh was not checking for `command` when running a special builtin,
which caused preceding invocation-local variable assignments to
become global. This is the reproducer from the att/ast#72:

$ foo=BUG command eval ':'
$ echo "$foo"

This no longer prints 'BUG', as ksh now makes sure the command builtin
is not running a special builtin before making invocation-local
variable assignments global.

src/cmd/ksh93/sh/xec.c:
 - Backport the bugfix for BUG_CMDSPASGN from ksh93v- 2013-10-10-alpha.

src/cmd/ksh93/tests/builtins.sh:
 - Add a regression test based on the reproducer in att/ast#72.
2020-06-16 22:58:05 +01:00
Johnothan King
c258a04f7a
Implement a portable fix for SIGCHLD crashes (#18)
As previously reported in rhbz#1112306 (https://bugzilla.redhat.com/show_bug.cgi?id=1112306),
ksh may crash when receiving SIGCHLD because GCC's optimizer will fail to generate
`addl` and `sub` instructions to increment and decrement `job.in_critical` in the
`job_subsave` function. This bug *does* occur in GCC 10 with `-O2`, but not `-O1`;
it doesn't appear this bug has been fixed. As a reference, here is the relevant
debug assembly output of `job_subsave` when KSH is compiled with `CCFLAGS` set to
`-g -O1`:

0000000000034c97 <job_subsave>:

void *job_subsave(void)
{
   34c97:       53                      push   %rbx
        struct back_save *bp = new_of(struct back_save,0);
   34c98:       bf 18 00 00 00          mov    $0x18,%edi
   34c9d:       e8 34 4a 0a 00          callq  d96d6 <_ast_malloc>
   34ca2:       48 89 c3                mov    %rax,%rbx
        job_lock();
   34ca5:       83 05 3c 50 13 00 01    addl   $0x1,0x13503c(%rip)        # 169ce8 <job+0x28>
        *bp = bck;
   34cac:       66 0f 6f 05 4c 5a 13    movdqa 0x135a4c(%rip),%xmm0        # 16a700 <bck>
   34cb3:       00
   34cb4:       0f 11 00                movups %xmm0,(%rax)
   34cb7:       48 8b 05 52 5a 13 00    mov    0x135a52(%rip),%rax        # 16a710 <bck+0x10>
   34cbe:       48 89 43 10             mov    %rax,0x10(%rbx)
        bp->prev = bck.prev;
   34cc2:       48 8b 05 47 5a 13 00    mov    0x135a47(%rip),%rax        # 16a710 <bck+0x10>
   34cc9:       48 89 43 10             mov    %rax,0x10(%rbx)
        bck.count = 0;
   34ccd:       c7 05 29 5a 13 00 00    movl   $0x0,0x135a29(%rip)        # 16a700 <bck>
   34cd4:       00 00 00
        bck.list = 0;
   34cd7:       48 c7 05 26 5a 13 00    movq   $0x0,0x135a26(%rip)        # 16a708 <bck+0x8>
   34cde:       00 00 00 00
        bck.prev = bp;
   34ce2:       48 89 1d 27 5a 13 00    mov    %rbx,0x135a27(%rip)        # 16a710 <bck+0x10>
        job_unlock();
   34ce9:       8b 05 f9 4f 13 00       mov    0x134ff9(%rip),%eax        # 169ce8 <job+0x28>
   34cef:       83 e8 01                sub    $0x1,%eax
   34cf2:       89 05 f0 4f 13 00       mov    %eax,0x134ff0(%rip)        # 169ce8 <job+0x28>
   34cf8:       75 2b                   jne    34d25 <job_subsave+0x8e>
   34cfa:       8b 3d ec 4f 13 00       mov    0x134fec(%rip),%edi        # 169cec <job+0x2c>
   34d00:       85 ff                   test   %edi,%edi
   34d02:       74 21                   je     34d25 <job_subsave+0x8e>
   34d04:       c7 05 da 4f 13 00 01    movl   $0x1,0x134fda(%rip)        # 169ce8 <job+0x28>

When `-O2` is used instead of `-O1`, the `addl` and `sub` instructions for
incrementing and decrementing the lock are removed. GCC instead generates a
broken `mov` instruction for `job_lock` and removes the initial `sub` instruction
in job_unlock (this is also seen in Red Hat's bug report):

       job_lock();
       *bp = bck;
  37d7c:       66 0f 6f 05 7c 79 14    movdqa 0x14797c(%rip),%xmm0        # 17f700 <bck>
  37d83:       00
       struct back_save *bp = new_of(struct back_save,0);
  37d84:       49 89 c4                mov    %rax,%r12
       job_lock();
  37d87:       8b 05 5b 6f 14 00       mov    0x146f5b(%rip),%eax        # 17ece8 <job+0x28>
...
        job_unlock();
  37dc6:       89 05 1c 6f 14 00       mov    %eax,0x146f1c(%rip)        # 17ece8 <job+0x28>
  37dcc:       85 c0                   test   %eax,%eax
  37dce:       75 2b                   jne    37dfb <job_subsave+0x8b>

The original patch works around this bug by using the legacy `__sync_fetch_and_add/sub`
GCC builtins. This forces GCC to generate instructions that change the lock with
`lock addl`, `lock xadd` and `lock subl`:

       job_lock();
  37d9f:       f0 83 05 41 6f 14 00    lock addl $0x1,0x146f41(%rip)        # 17ece8 <job+0x28>
  37da6:       01
...
       job_unlock();
  37deb:       f0 0f c1 05 f5 6e 14    lock xadd %eax,0x146ef5(%rip)        # 17ece8 <job+0x28>
  37df2:       00
  37df3:       83 f8 01                cmp    $0x1,%eax
  37df6:       74 08                   je     37e00 <job_subsave+0x70>
...
  37e25:       74 11                   je     37e38 <job_subsave+0xa8>
  37e27:       f0 83 2d b9 6e 14 00    lock subl $0x1,0x146eb9(%rip)        # 17ece8 <job+0x28>

While this does work, it isn't portable. This patch implements a different
workaround for this compiler bug. If `job_lock` is put at the beginning of
`job_subsave`, GCC will generate the required `addl` and `sub` instructions:

       job_lock();
  37d67:       83 05 7a 5f 14 00 01    addl   $0x1,0x145f7a(%rip)        # 17dce8 <job+0x28>
...
        job_unlock();
  37dbb:       83 e8 01                sub    $0x1,%eax
  37dbe:       89 05 24 5f 14 00       mov    %eax,0x145f24(%rip)        # 17dce8 <job+0x28>

It is odd that moving a single line of code fixes this problem, although
GCC _should_ have generated these instructions from the original code anyway.
I'll note that this isn't the only way to get these instructions to generate.
The problem also seems to go away when inserting almost anything else inside
of the code for `job_subsave`. This is just a simple workaround for a strange
compiler bug.
2020-06-16 22:44:02 +01:00
Martijn Dekker
ad349c7668 silence macro redefinition warnings (re: 7003aba4)
src/cmd/ksh93/bltins/test.c,
src/cmd/ksh93/sh/arith.c,
src/cmd/ksh93/sh/streval.c:
- #undef ERROR_exit before redefining it, so clang stops nagging.
2020-06-16 04:51:21 +02:00
Martijn Dekker
a9de50bf79 Apply simple optimisation for ${ subshare; } (re: 3d38270b)
Running shbench after undoing the incorrect subshell optimisation
showed that the performance of ${ subshare; }-type command
substitutions went down very slightly, but consistently.

The main purpose of using this ksh-specific type of command
substitution vs. a normal one is performance. Thus, it *is*
appropriate to eke every last bit of performance out of it that
we can, provided correctness is completely preserved.

It is also a type of command substitution where every change is
supposed to be shared with the main shell environment; only command
output is captured in a subshell-like fashion.

Thus, on the face of it, it would be a logical optimisation for
sh_assignok() to avoid bothering with saving a subshell context for
variables if we're in a subshare.

Lo and behold, applying it does not introduce any regress fails.

Here are my average results of the braces.ksh benchmark from
shbench <http://fossil.0branch.com/csb/tktnew> against stock
/bin/ksh 93u+ vs. current 93u+m (same compiler flags),
100 runs pre-optimisation and 100 runs post-optimisation:

Stock /bin/ksh:		Pre-optimisation (at 3d38270b):
93u+: 0.743 secs	93u+m: 0.739 secs

Stock /bin/ksh:		Post-optimisation (now):
93u+: 0.744 secs	93u+m: 0.726 secs

The left column shows only a small margin of error with 100 runs;
the right one shows a very small, but not insignificant difference.

However, these tests were not very rigorous with 100 runs each.
If anyone wants to do it properly, please report results to
korn-shell@googlegroups.com. I'm happy enough with this, though.

Thanks to Joerg van den Hoff for providing shbench, without
which it would not have occurred to me to try this.

src/cmd/ksh93/sh/subshell.c: sh_assignok():
- Don't bother if we're in a ${ subshare; }.
2020-06-15 20:27:32 +02:00
Johnothan King
3d38270b32 Remove a buggy optimization for variables in subshells
This bug was originally reported by @lijog in att/ast#7 and has been
reported again in #15. KSH does not save the state of a variable if it
is in a newer scope. This is because of an optimization in sh_assignok
first introduced in ksh93t+ 2010-05-24. Here is the code change in that
version:

                return(np);
        /* don't bother to save if in newer scope */
-       if(!(rp=shp->st.real_fun)  || !(dp=rp->sdict))
-               dp = sp->var;
-       if(np->nvenv && !nv_isattr(np,NV_MINIMAL|NV_EXPORT) && shp->last_root)
-               dp = shp->last_root;
-       if((mp=nv_search((char*)np,dp,HASH_BUCKET))!=np)
-       {
-               if(mp || !np->nvfun || np->nvfun->subshell>=sh.subshell)
-                       return(np);
-       }
+       if(sp->var!=shp->var_tree && shp->last_root==shp->var_tree)
+               return(np);
        if((ap=nv_arrayptr(np)) && (mp=nv_opensub(np)))
        {

This change was originally made to replace a buggier optimization.
However, the current optimization causes variables set in subshells
to wrongly affect the environment outside of the subshell, as the
variable does not get set back to its original value. This patch
simply removes the buggy optimization to fix this problem.

src/cmd/ksh93/sh/subshell.c:
 - Remove a buggy optimization that caused variables set in subshells
   to affect the environment outside of the subshell.

src/cmd/ksh93/tests/subshell.sh:
 - Add a regression test for setting variables in subshells. This
   test has to be run from the disk after being created with a here
   document because it always returns the expected result when run
   directly in the regression test script.
2020-06-15 07:13:38 -07:00
Martijn Dekker
ef1621c18f Make 'source' a regular built-in
The 'source' alias is now converted into a regular built-in command
so that 'unalias -a' does not remove it, and something like
	cmd=source; $cmd name args
will now work.

This is part of the project to replace default aliases that define
essential commands by proper builtins that act identically (except
you now get the actual command's name in any error/usage messages).

src/cmd/ksh93/data/aliases.c:
- Remove 'source' default alias.

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/include/builtins.h:
- Define 'source' regular builtin with extra parser ID "SYSSOURCE".
  Same definition as '.', minus the BLT_SPC flag indicating a
  special builtin. This preserves the behaviour of 'command .'.
- Update sh_optdot[] to include info for 'source --man'.
  (Note that \f?\f expands to the current command name.
  This allows several commands to share a single --man page.)

src/cmd/ksh93/sh/parse.c:
- In the two places that SYSDOT is checked for, also check for
  SYSSOURCE, making sure the two commands are parsed identically.

src/cmd/ksh93/sh.1:
- Remove 'source' default alias.
- Document 'source' regular builtin.
2020-06-15 11:33:44 +02:00
Johnothan King
7b994b6a7e Implement a better fix for unsetting special env vars
The regression this commit fixes was first introduced in ksh93t
2008-07-25. It was previously worked around in 6f0e008c by forking
subshells if any special environment variable is unset.

The reason why this problem doesn't occur in ksh93s+ is because in
that version of ksh sh_assignok never moves nodes, it only clones
them. The second argument doesn't set NV_MOVE, which makes
`sh_assignok(np,0)` is similar to `sh_assignok(np,1)`. In ksh93t and
higher, setting the second argument to zero causes the node to be moved
with NV_MOVE, which causes the discipline function associated with
the variable node to be removed when `np->nvfun` is set to zero (i.e.
NULL). This is why a command like `(unset LC_NUMERIC; LC_NUMERIC=invalid)`
doesn't print a diagnostic, as it looses its discipline function.

This patch fixes the problem by cloning the node with sh_assignok
if it is a special variable with a discipline function. This allows
special variables to work as expected in virtual subshells. The
original workaround has been kept for the $PATH variable only, as
hash tables are still broken in virtual subshells. It has been updated
accordingly to only fork subshells if it detects the variable node
for PATH. I have added two more regression tests for changing the
PATH in subshells to make sure hash tables continue working as
expected with this fix.

src/cmd/ksh93/bltins/typeset.c:
 - Only fork virtual subshells if the PATH will be changed. If a
   variable is a special variable with a discipline function, it
   should be just be cloned, not moved.

src/cmd/ksh93/sh/nvdisc.c:
 - Add a comment to clarify that NV_MOVE will delete the discipline
   function associated with the node.

src/cmd/ksh93/tests/subshells.sh:
 - Add two more regression tests for unsetting the PATH in subshells,
   one for if PATH is being pointed to by a nameref. Condense the
   hash table tests by moving the main test into a single function.
2020-06-13 12:55:48 -07:00
Martijn Dekker
e500479ede
Merge pull request #1 from JohnoKing/fix-builtin-delete
`builtin -d` should not delete special builtins
2020-06-12 12:36:42 +01:00
Johnothan King
017d088c39 builtin -d should not delete special builtins
The man page for the builtin command says special builtins cannot
be deleted. This wasn't the case though, running `builtin -d` on
a special builtin was deleting it. As an example, the following
set of commands was ending with 'export: not found':

$ builtin -d export
$ export foo=bar

This commit backports the bugfix from ksh93v- (2014-12-24-beta),
which added an error check to prevent special builtins from being
deleted.

src/cmd/ksh93/sh/nvdisc.c:
 - Add an error check to prevent special builtins from being deleted.

src/cmd/ksh93/tests/builtins.sh
 - Add a regression test for using `builtin -d` on special builtins.
2020-06-12 04:26:40 -07:00