mirror of git://git.code.sf.net/p/cdesktopenv/code synced 2025-02-13 03:32:24 +00:00

Re-fix defining types conditionally or in subshells (re: f508660d)

New version. I'm pretty sure the problems that forced me to revert
it earlier are fixed.

This commit mitigates the effects of the hack explained in the
referenced commit so that dummy built-in command nodes added by the
parser for declaration/assignment purposes do not leak out into the
execution level, except in a relatively harmless corner case.

Something like

    if false; then
        typeset -T Foo_t=(integer -i bar)
    fi

will no longer leave a broken dummy Foo_t declaration command. The
same applies to declaration commands created with enum.

The corner case remaining is:

    $ ksh -c 'false && enum E_t=(a b c); E_t -a x=(b b a c)'
    ksh: E_t: not found

Since the 'enum' command is not executed, this should have thrown
a syntax error on the 'E_t -a' declaration:

    ksh: syntax error at line 1: `(' unexpected

This is because the -c script is parsed entirely before being
executed, so E_t is recognised as a declaration built-in at parse
time. However, the 'not found' error shows that it was successfully
eliminated at execution time, so the inconsistent state will no
longer persist.
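
As a rough illustration of that corner case, here is a simplified C
model. It is not ksh code: the table type and the names main_tree,
parse_tree, view_open and corner_case() are hypothetical. It only shows
how a name registered during parsing can count as a declaration command
while parsing and still be "not found" once execution starts.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified model. This is NOT the ksh or libcdt API. */
#define MAX_NAMES 8

struct table { const char *names[MAX_NAMES]; int count; };

static struct table main_tree;   /* built-ins that really exist */
static struct table parse_tree;  /* dummy declaration commands seen by the parser */
static bool view_open;           /* true only while parsing */

static bool table_has(const struct table *t, const char *name)
{
	for (int i = 0; i < t->count; i++)
		if (strcmp(t->names[i], name) == 0)
			return true;
	return false;
}

static void table_add(struct table *t, const char *name)
{
	if (t->count < MAX_NAMES && !table_has(t, name))
		t->names[t->count++] = name;
}

/* The parse-only table is consulted only while the view is open. */
static bool is_declaration_command(const char *name)
{
	return table_has(&main_tree, name)
		|| (view_open && table_has(&parse_tree, name));
}

/*
 * Models: ksh -c 'false && enum E_t=(a b c); E_t -a x=(b b a c)'
 * Returns 1 if E_t was a declaration command at parse time
 * but could not be found at execution time.
 */
int corner_case(void)
{
	bool parsed_as_decl, found_at_exec;

	view_open = true;               /* parsing of the whole -c script begins */
	table_add(&parse_tree, "E_t");  /* parser sees 'enum E_t=(...)' */
	parsed_as_decl = is_declaration_command("E_t");
	view_open = false;              /* parsing is done */

	/* 'false && enum ...' never runs enum, so main_tree stays empty. */
	found_at_exec = is_declaration_command("E_t");
	return parsed_as_decl && !found_at_exec;
}
```

In this model corner_case() returns 1, matching the behaviour described
above: accepted by the parser, then "not found" when executed.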

This fix now allows another fix to be effective as well: since
built-ins do not know about virtual subshells, fork a virtual
subshell into a real subshell before adding any built-ins.
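
To illustrate why forking helps, here is a minimal POSIX sketch. It is
not ksh code; registered[] and child_registration_is_isolated() are
made-up names standing in for the built-ins table and for a type
definition command. A change made in a forked child never reaches the
parent process, whereas code that is unaware of virtual subshells would
modify state that the parent shell still uses.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the built-ins table. */
static char registered[32];

/*
 * Returns 1 if a name registered in a forked child is invisible to the
 * parent, 0 if it leaked into the parent, and -1 on error.
 */
int child_registration_is_isolated(void)
{
	pid_t pid;
	int status;

	registered[0] = '\0';
	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0)
	{
		/* Child ("real subshell"): add the built-in in its own copy. */
		strcpy(registered, "Foo_t");
		_exit(registered[0] == 'F' ? 0 : 1);
	}
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
		return -1;
	/* Parent: its copy of the table was never touched. */
	return registered[0] == '\0';
}
```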

src/cmd/ksh93/sh/parse.c:

- Add a pair of functions, dcl_hacktivate() and dcl_dehacktivate(),
  that (de)activate an internal declaration built-ins tree into
  which check_typedef() can pre-add dummy type declaration command
  nodes. A viewpath from the main built-ins tree to this internal
  tree is added, unifying the two for search purposes and causing
  new nodes to be added to the internal tree. When parsing is done,
  we close that viewpath. This hides those pre-added nodes at
  execution time. Since the parser is sometimes called recursively
  (e.g. for command substitutions), keep track of this and only
  activate and deactivate at the first level.
     (Fixed compared to previous version of this commit: calling
  dcl_dehacktivate() when the recursion level is already zero is
  now a harmless no-op. Since this only occurs in error handling
  conditions, who cares.)

- We also need to catch errors. This is done by setting libast's
  error_info.exit variable to a dcl_exit() function that tidies up
  and then passes control to the original (usually sh_exit()).
     (Fixed compared to previous version of this commit: dcl_exit()
  immediately deactivates the hack, no matter the recursion level,
  and restores the regular sh_exit(). This is the right thing to
  do when we're in the process of erroring out.)

- sh_cmd(): This is the most central function in the parser. You'd
  think it was sh_parse(), but $(modern)-form command substitutions
  use sh_dolparen() instead. Both call sh_cmd(). So let's simply
  add a dcl_hacktivate() call at the beginning and a
  dcl_dehacktivate() call at the end.

- assign(): This function calls path_search(), which among many
  other things executes an FPATH search, which may execute
  arbitrary code at parse time (!!!). So, regardless of recursion
  level, forcibly call dcl_dehacktivate() to avoid those ugly parser
  side effects returning in that context.
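
The recursion handling described in the list above can be sketched in
isolation as follows. This is a stripped-down model, not the code from
parse.c: hack_enter(), hack_leave() and hack_abort() are hypothetical
stand-ins for dcl_hacktivate(), dcl_dehacktivate() and the cleanup part
of dcl_exit(), and a plain flag replaces the viewpath.

```c
#include <assert.h>

static unsigned depth;       /* parser recursion level */
static int active;           /* 1 while the (modelled) viewpath is open */
static int activations;      /* how often the hack was really switched on */

/* Only the outermost call actually activates. */
void hack_enter(void)
{
	if (depth++)
		return;
	active = 1;
	activations++;
}

/* Only the outermost call deactivates; at level zero this is a no-op. */
void hack_leave(void)
{
	if (!depth || --depth)
		return;
	active = 0;
}

/* Error path: deactivate immediately, whatever the recursion level. */
void hack_abort(void)
{
	depth = 1;
	hack_leave();
}

/* Nested parse, e.g. a command substitution inside a script. */
int nested_parse_demo(void)
{
	hack_enter();                 /* outer parse */
	hack_enter();                 /* nested parse */
	hack_leave();                 /* nested parse ends */
	int still_active = active;    /* outer parse is still running */
	hack_leave();                 /* outer parse ends */
	int after = active;
	hack_leave();                 /* stray extra call: harmless */
	return still_active == 1 && after == 0 && depth == 0 && activations == 1;
}

/* Erroring out three levels deep leaves everything switched off. */
int abort_demo(void)
{
	hack_enter();
	hack_enter();
	hack_enter();
	hack_abort();
	return active == 0 && depth == 0;
}
```

Both demo functions return 1 in this model. The counter keeps nested
parses from closing the view early, and the abort path resets it so
that an error cannot leave the view open.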

src/cmd/ksh93/bltins/enum.c: b_enum():

- Fork a virtual subshell before adding a built-in.

src/cmd/ksh93/sh/xec.c: sh_exec():

- Fork a virtual subshell when detecting typeset's -T option.

Improves fix to https://github.com/ksh93/ksh/issues/256
Martijn Dekker 2021-12-17 01:02:01 +01:00
parent 2bc1d814c9
commit e67df29c07
10 changed files with 98 additions and 18 deletions

NEWS

@@ -3,6 +3,13 @@ For full details, see the git log at: https://github.com/ksh93/ksh
 Any uppercase BUG_* names are modernish shell bug IDs.
+2021-12-17:
+- Ksh no longer behaves badly when parsing a type definition command
+('typeset -T' or 'enum') without executing it or when executing it in
+a subshell. Types can now safely be defined in subshells and defined
+conditionally as in 'if condition; then enum ...; fi'.
 2021-12-16:
 - Changed the default selection of compiled-in /opt/ast/bin built-in libcmd


@@ -21,7 +21,7 @@
 #pragma prototyped
 #include "defs.h"
-#define ENUM_ID "enum (ksh 93u+m) 2021-11-23"
+#define ENUM_ID "enum (ksh 93u+m) 2021-12-17"
 const char sh_optenum[] =
 "[-?@(#)$Id: " ENUM_ID " $\n]"
@@ -239,6 +239,10 @@ int b_enum(int argc, char** argv, Shbltin_t *context)
 		error(ERROR_USAGE|2, "%s", optusage(NiL));
 		return 1;
 	}
+#ifndef STANDALONE
+	if(sh.subshell && !sh.subshare)
+		sh_subfork();
+#endif
 	while(cp = *argv++)
 	{
 		if(!(np = nv_open(cp, (void*)0, NV_VARNAME|NV_NOADD)) || !(ap=nv_arrayptr(np)) || ap->fun || (sz=ap->nelem&(((1L<<ARRAY_BITS)-1))) < 2)


@@ -1766,7 +1766,7 @@ const char sh_opttrap[] =
 ;
 const char sh_opttypeset[] =
-"+[-1c?\n@(#)$Id: typeset (ksh 93u+m) 2021-02-10 $\n]"
+"+[-1c?\n@(#)$Id: typeset (ksh 93u+m) 2021-12-17 $\n]"
 "[--catalog?" SH_DICT "]"
 "[+NAME?typeset - declare or display variables with attributes]"
 "[+DESCRIPTION?Without the \b-f\b option, \btypeset\b sets, unsets, "


@@ -21,7 +21,7 @@
 #define SH_RELEASE_FORK "93u+m" /* only change if you develop a new ksh93 fork */
 #define SH_RELEASE_SVER "1.0.0-beta.2" /* semantic version number: https://semver.org */
-#define SH_RELEASE_DATE "2021-12-16" /* must be in this format for $((.sh.version)) */
+#define SH_RELEASE_DATE "2021-12-17" /* must be in this format for $((.sh.version)) */
 #define SH_RELEASE_CPYR "(c) 2020-2021 Contributors to ksh " SH_RELEASE_FORK
 /* Scripts sometimes field-split ${.sh.version}, so don't change amount of whitespace. */


@@ -8715,7 +8715,3 @@ won't be executed until the foreground job terminates.
 It is a good idea to leave a space after the comma operator in
 arithmetic expressions to prevent the comma from being interpreted
 as the decimal point character in certain locales.
-.PP
-Commands that add type definitions (\f3enum\fP, \f3typeset -T\fP)
-must be run unconditionally and in the main shell environment.
-Defining types conditionally or in a subshell will cause undefined behavior.


@@ -29,7 +29,7 @@
 #include "variables.h"
 static const char sh_opttype[] =
-"[-1c?\n@(#)$Id: type (AT&T Labs Research) 2008-07-01 $\n]"
+"[-1c?\n@(#)$Id: type (ksh 93u+m) 2021-12-17 $\n]"
 "[--catalog?" SH_DICT "]"
 "[+NAME?\f?\f - set the type of variables to \b\f?\f\b]"
 "[+DESCRIPTION?\b\f?\f\b sets the type on each of the variables specified "


@@ -60,6 +60,10 @@ static Shnode_t *test_and(Lex_t*);
 static Shnode_t *test_or(Lex_t*);
 static Shnode_t *test_primary(Lex_t*);
+static void dcl_hacktivate(void), dcl_dehacktivate(void), (*orig_exit)(int), dcl_exit(int);
+static Dt_t *dcl_tree;
+static unsigned dcl_recursion;
 #define sh_getlineno(lp) (lp->lastline)
 #ifndef NIL
@@ -169,13 +173,10 @@ static void typeset_order(const char *str,int line)
 }
 /*
- * Pre-add type declaration built-ins at parse time to avoid
- * syntax errors when using -c, shcomp, '.'/source or eval.
+ * This function handles linting for 'typeset' options via typeset_order().
  *
- * This hack has a bad side effect: defining a type with 'typeset -T' or 'enum'
- * in a subshell or an 'if false' block will cause an inconsistent state. But
- * as these built-ins alter the syntax of the shell, it's necessary for making
- * them work if we're parsing an entire script before or without executing it.
+ * Also, upon parsing typeset -T or enum, it pre-adds the type declaration built-ins that these would create to
+ * an internal tree to avoid syntax errors upon pre-execution parsing of assignment-arguments with parentheses.
  *
  * intypeset==1 for typeset & friends; intypeset==2 for enum
  */
@@ -236,6 +237,40 @@ static void check_typedef(struct comnod *tp, char intypeset)
 	if(cp)
 		nv_onattr(sh_addbuiltin(cp, (Shbltin_f)SYSTRUE->nvalue.bfp, NIL(void*)), NV_BLTIN|BLT_DCL);
 }
+/*
+ * (De)activate an internal declaration built-ins tree into which check_typedef() can pre-add dummy type
+ * declaration command nodes, allowing correct parsing of assignment-arguments with parentheses for custom
+ * type declaration commands before actually executing the commands that create those commands.
+ *
+ * A viewpath from the main built-ins tree to this internal tree is added, unifying the two for search
+ * purposes and causing new nodes to be added to the internal tree. When parsing is done, we close that
+ * viewpath. This hides those pre-added nodes at execution time, avoiding an inconsistent state if a type
+ * creation command is parsed but not executed.
+ */
+static void dcl_hacktivate(void)
+{
+	if(dcl_recursion++)
+		return;
+	if(!dcl_tree)
+		dcl_tree = dtopen(&_Nvdisc, Dtoset);
+	dtview(sh.bltin_tree, dcl_tree);
+	orig_exit = error_info.exit;
+	error_info.exit = dcl_exit;
+}
+static void dcl_dehacktivate(void)
+{
+	if(!dcl_recursion || --dcl_recursion)
+		return;
+	error_info.exit = orig_exit;
+	dtview(sh.bltin_tree, NIL(Dt_t*));
+}
+static noreturn void dcl_exit(int e)
+{
+	dcl_recursion = 1;
+	dcl_dehacktivate();
+	(*error_info.exit)(e);
+	UNREACHABLE();
+}
 /*
  * Make a parent node for fork() or io-redirection
@@ -503,6 +538,7 @@ static Shnode_t *sh_cmd(Lex_t *lexp, register int sym, int flag)
 {
 	register Shnode_t *left, *right;
 	register int type = FINT|FAMP;
+	dcl_hacktivate();
 	if(sym==NL)
 		lexp->lasttok = 0;
 	left = list(lexp,flag);
@@ -544,6 +580,7 @@ static Shnode_t *sh_cmd(Lex_t *lexp, register int sym, int flag)
 			sh_syntax(lexp);
 		}
 	}
+	dcl_dehacktivate();
 	return(left);
 }
@@ -1045,8 +1082,19 @@ static struct argnod *assign(Lex_t *lexp, register struct argnod *ap, int type)
 	if(array && type==NV_TYPE)
 	{
 		struct argnod *arg = lexp->arg;
+		unsigned save_recursion = dcl_recursion;
+		int p;
+		/*
+		 * Forcibly deactivate the dummy declaration built-ins tree as path_search()
+		 * does an FPATH search, which may execute arbitrary code at parse time.
+		 */
 		n = lexp->token;
-		if(path_search(lexp->sh,lexp->arg->argval,NIL(Pathcomp_t**),1) && (np=nv_search(lexp->arg->argval,lexp->sh->fun_tree,0)) && nv_isattr(np,BLT_DCL))
+		dcl_recursion = 1;
+		dcl_dehacktivate();
+		p = path_search(lexp->sh,lexp->arg->argval,NIL(Pathcomp_t**),1);
+		dcl_hacktivate();
+		dcl_recursion = save_recursion;
+		if(p && (np=nv_search(lexp->arg->argval,lexp->sh->fun_tree,0)) && nv_isattr(np,BLT_DCL))
 		{
 			lexp->token = n;
 			lexp->arg = arg;


@@ -1088,6 +1088,8 @@ int sh_exec(register const Shnode_t *t, int flags)
 #if SHOPT_TYPEDEF
 	else if(argn>=3 && checkopt(com,'T'))
 	{
+		if(sh.subshell && !sh.subshare)
+			sh_subfork();
 # if SHOPT_NAMESPACE
 		if(shp->namespace)
 		{


@@ -155,6 +155,11 @@ got=$(eval 2>&1 'command command command enum -i -i -iii --igno -ii PARSER_t=(r
 exp='PARSER_t -r -A hack=([C]=g)'
 [[ $got == "$exp" ]] || err_exit "incorrect typeset output for enum with command prefix and options" \
 "(expected $(printf %q "$exp"); got $(printf %q "$got"))"
+PATH=/dev/null command -v PARSER_t >/dev/null && err_exit "PARSER_t leaked out of subshell"
+if false
+then enum PARSER2_t=(a b)
+fi
+PATH=/dev/null command -v PARSER2_t >/dev/null && err_exit "PARSER2_t incompletely defined though definition was never executed"
 # ======
 exit $((Errors<125?Errors:125))


@@ -452,20 +452,24 @@ cd "$tmp"
 FPATH=$PWD
 PATH=$PWD:$PATH
 cat > A_t <<- \EOF
+if false
+then typeset -T Parser_shenanigans=(typeset -i foo)
+fi
 typeset -T A_t=(
 B_t b
 )
 EOF
 cat > B_t <<- \EOF
+PATH=/dev/null command -v Parser_shenanigans
 typeset -T B_t=(
 integer n=5
 )
 EOF
 unset n
-if n=$(FPATH=$PWD PATH=$PWD:$PATH $SHELL 2> /dev/null -c 'A_t a; print ${a.b.n}')
-then (( n==5 )) || err_exit 'dynamic loading of types gives wrong result'
-else err_exit 'unable to load types dynamically'
+if n=$(FPATH=$PWD PATH=$PWD:$PATH "$SHELL" -c 'A_t a; print ${a.b.n}' 2>&1)
+then [[ $n == '5' ]] || err_exit "dynamic loading of types gives wrong result (got $(printf %q "$n"))"
+else err_exit "unable to load types dynamically (got $(printf %q "$n"))"
 fi
 # check that typeset -T reproduces a type.
@@ -639,5 +643,19 @@ got=$($SHELL -c 'enum Foo_t=(foo bar); typeset -T')
 [[ -z $got ]] || err_exit "Types created by enum are listed with 'typeset -T'" \
 "(got $(printf %q "$got"))"
+# ======
+# Parser shenanigans.
+if false
+then typeset -T PARSER_t=(typeset name=foobar)
+fi
+PATH=/dev/null command -v PARSER_t >/dev/null && err_exit "PARSER_t incompletely defined though definition was never executed"
+unset v
+got=$( set +x; redirect 2>&1; typeset -T Subsh_t=(typeset -i x); Subsh_t -a v=( (x=1) (x=2) (x=3) ); typeset -p v )
+exp='Subsh_t -a v=((typeset -i x=1) (typeset -i x=2) (typeset -i x=3))'
+[[ $got == "$exp" ]] || err_exit "bad typeset output for Subsh_t" \
+"(expected $(printf %q "$exp"), got $(printf %q "$got"))"
+PATH=/dev/null command -v Subsh_t >/dev/null && err_exit "Subsh_t leaked out of subshell"
 # ======
 exit $((Errors<125?Errors:125))