mirror of git://git.code.sf.net/p/cdesktopenv/code synced 2025-02-13 03:32:24 +00:00

arithmetic: Fix the octal leading zero mess (#337)

In C/POSIX arithmetic, a leading 0 denotes an octal number, e.g.
010 == 8. But this is not a desirable feature as it can cause
problems with processing things like dates with a leading zero.
In ksh, you should use 8#10 instead ("10" with base 8).
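
As a minimal illustration of the C/POSIX rule described above (not ksh
code), the sketch below uses the standard strtol(3) rather than ksh's
own strtonll(3). The helper names parse_posix and parse_decimal are
hypothetical. Base 0 makes strtol() infer the base from the prefix, so
a leading 0 selects octal; base 10 simply ignores leading zeros.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: parse s the C/POSIX way. Base 0 lets strtol()
 * infer the base from the prefix, so a leading 0 means octal. */
long parse_posix(const char *s, char **end)
{
	assert(s != NULL);
	return strtol(s, end, 0);
}

/* Hypothetical helper: parse s as plain decimal; leading zeros are ignored. */
long parse_decimal(const char *s, char **end)
{
	assert(s != NULL);
	return strtol(s, end, 10);
}
```

With these, "010" parses as 8 in the POSIX style and as 10 in decimal.
A date field such as "08" shows the problem: the base-0 parse stops at
the '8' (not an octal digit) and returns 0, while the decimal parse
returns 8.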

It would be tolerable if ksh at least implemented it consistently.
But AT&T made an incredible mess of it. For anyone who is not
intimately familiar with ksh internals, it is inscrutable where
arithmetic evaluation special-cases a leading 0 and where it
doesn't. Here are just some of the surprises/inconsistencies:

1. The AT&T maintainers tried to honour a leading 0 inside of
   ((...)) and $((...)) but not in arithmetic contexts outside them,
   and even that distinction was never applied consistently.

2. Since 2010-12-12, $((x)) and $(($x)) are different:
      $ /bin/ksh -c 'x=010; echo $((x)) $(($x))'
      10 8
   That's a clear violation of both POSIX and the principle of
   least astonishment. $((x)) and $(($x)) should be the same in
   all cases.

3. 'let' with '-o letoctal' acts in this bizarre way:
      $ set -o letoctal; x=010; let "y1=$x" "y2=010"; echo $y1 $y2
      10 8
   That's right, 'let y=$x' is different from 'let y=010' even
   when $x contains the same string value '010'! This violates
   established shell grammar on the most basic level.

This commit introduces consistency. By default, ksh now acts like
mksh and zsh: the octal leading zero is disabled in all arithmetic
contexts equally. In POSIX mode, it is enabled equally.

The one exception is the 'let' built-in, where this can still be
controlled independently with the letoctal option as before (but,
because letoctal is synched with posix when switching that on/off,
it's consistent by default).

We're also removing the hackery that causes variable expansions for
the 'let' builtin to be quietly altered, so that 'x=010; let y=$x'
now does the same as 'let y=010' even with letoctal on.

Various files:
- Get rid of now-redundant sh.inarith (shp->inarith) flag, as we're
  no longer distinguishing between being inside or outside ((...)).

src/cmd/ksh93/sh/arith.c:
- arith(): Disable POSIX octal constants by skipping leading zeros
  only when the governing option is off: the letoctal option if
  we're running the "let" built-in, the posix option otherwise.
- sh_strnum(): Preset a base of 10 for strtonll(3) depending on the
  posix or letoctal option being off, not on the sh.inarith flag.
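
The option logic in the two arith.c items above can be sketched in
isolation as follows. This is an assumption-laden illustration, not
the ksh implementation: struct opts, octal_enabled and parse_int are
hypothetical names standing in for the shell's option state, and the
standard strtoll(3) replaces ksh's strtonll(3).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the shell state consulted by the commit. */
struct opts
{
	bool in_let;   /* currently running the 'let' built-in */
	bool letoctal; /* 'set -o letoctal' is on */
	bool posix;    /* 'set -o posix' is on */
};

/* Octal is honoured if the option governing the current context is on:
 * letoctal inside 'let', posix everywhere else. */
bool octal_enabled(const struct opts *o)
{
	assert(o != NULL);
	return o->in_let ? o->letoctal : o->posix;
}

/* Base 0 lets strtoll() infer octal from a leading 0; base 10 forces decimal. */
long long parse_int(const struct opts *o, const char *s)
{
	assert(s != NULL);
	return strtoll(s, NULL, octal_enabled(o) ? 0 : 10);
}
```

Under this sketch, "010" evaluates to 10 with all options off, to 8
with posix on, and to 10 inside 'let' with letoctal off even when
posix is on.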

src/cmd/ksh93/include/argnod.h,
src/cmd/ksh93/sh/args.c,
src/cmd/ksh93/sh/macro.c:
- Remove astonishing hackery that violated shell grammar for 'let'.

src/cmd/ksh93/sh/name.c (nv_getnum()),
src/cmd/ksh93/sh/nvdisc.c (nv_getn()):
- Remove loops for skipping leading zeroes that included a broken
  check for justify/zerofill attributes, thereby fixing this bug:
	$ typeset -Z x=0x15; echo $((x))
	-ksh: x15: parameter not set
  Even if this code wasn't redundant before, it is now: sh_arith()
  is called immediately after the removed code and it ignores
  leading zeroes via sh_strnum() and strtonll(3).

Resolves: https://github.com/ksh93/ksh/issues/334
Martijn Dekker 2021-11-17 03:24:18 +00:00
parent 257eea612a
commit c734568b02
14 changed files with 86 additions and 93 deletions

NEWS
@ -3,6 +3,19 @@ For full details, see the git log at: https://github.com/ksh93/ksh
Any uppercase BUG_* names are modernish shell bug IDs.
2021-11-16:
- By default, arithmetic expressions in ksh no longer interpret a number
with a leading zero as octal in any context. Use 8#octalnumber instead.
Before, ksh would arbitrarily recognize the leading octal zero in some
contexts but not others, e.g., both of:
$ x=010; echo "$((x)), $(($x))"
$ set -o letoctal; x=010; let y=$x z=010; echo "$y, $z"
would output '10, 8'. These now output '10, 10' and '8, 8', respectively.
Arithmetic expressions now also behave identically within and outside
((...)) and $((...)). Setting the --posix compliance option turns on the
recognition of the leading octal zero for all arithmetic contexts.
2021-11-15:
- In arithmetic evaluation, the --posix compliance option now disables the
@ -42,11 +55,6 @@ Any uppercase BUG_* names are modernish shell bug IDs.
2021-09-13:
- Fixed a bug introduced in 93u+ 2012-02-07 that caused the 'printf' builtin
(and its 'print -f' equivalent) to fail to recognise integer arguments with a
leading zero as octal numbers. For example, 'printf "%d\n" 010' now once
again outputs '8' instead of '10'.
- Disable the POSIX arithmetic context while running a command substitution
invoked from within an arithmetic expression. This fixes a bug that caused
integer arguments with a leading zero to be incorrectly interpreted as octal

@ -151,10 +151,15 @@ For more details, see the NEWS file and for complete details, see the git log.
correctly negates another '!', e.g., [[ ! ! 1 -eq 1 ]] now returns
0/true. Note that this has always been the case for 'test'/'['.
28. In the 'printf' builtin (and the 'print -f' equivalent), numeric
arguments with a leading zero are now once again recognized as octal
numbers as in ksh93 versions before 2012-02-07, and as POSIX requires.
For example, 'printf "%d\n" 010' now once again outputs '8'.
28. By default, arithmetic expressions in ksh no longer interpret a number
with a leading zero as octal in any context. Use 8#octalnumber instead.
Before, ksh would arbitrarily recognize the leading octal zero in some
contexts but not others. One of several examples is:
x=010; echo "$((x)), $(($x))"
would output '10, 8'. This now outputs '10, 10'. Arithmetic
expressions now also behave identically within and outside ((...))
and $((...)). Setting the --posix compliance option turns on the
recognition of the leading octal zero for all arithmetic contexts.
____________________________________________________________________________

@ -884,9 +884,7 @@ static int extend(Sfio_t* sp, void* v, Sffmt_t* fe)
}
break;
default:
shp->inarith = 1; /* POSIX compliance: recognize octal constants, e.g. printf '%d\n' 010 */
d = sh_strnum(argp,&lastchar,0);
shp->inarith = 0;
if(d<longmin)
{
errormsg(SH_DICT,ERROR_warn(0),e_overflow,argp);

@ -125,7 +125,6 @@ struct argnod
#define ARG_ARITH 0x100 /* arithmetic expansion */
#define ARG_OPTIMIZE 0x200 /* try to optimize */
#define ARG_NOGLOB 0x400 /* no file name expansion */
#define ARG_LET 0x800 /* processing let command arguments */
#define ARG_ARRAYOK 0x1000 /* $x[sub] ==> ${x[sub]} */
extern struct dolnod *sh_argcreate(char*[]);

@ -204,7 +204,6 @@ struct shared
char used_pos; /* used positional parameter */\
char universe; \
char winch; \
char inarith; /* set when in POSIX arith context, i.e. leading zero = octal, e.g. in ((...)) */ \
short arithrecursion; /* current arithmetic recursion level */ \
char indebug; /* set when in debug trap */ \
unsigned char ignsig; /* ignored signal in subshell */ \

@ -21,7 +21,7 @@
#define SH_RELEASE_FORK "93u+m" /* only change if you develop a new ksh93 fork */
#define SH_RELEASE_SVER "1.0.0-beta.2" /* semantic version number: https://semver.org */
#define SH_RELEASE_DATE "2021-11-15" /* must be in this format for $((.sh.version)) */
#define SH_RELEASE_DATE "2021-11-16" /* must be in this format for $((.sh.version)) */
#define SH_RELEASE_CPYR "(c) 2020-2021 Contributors to ksh " SH_RELEASE_FORK
/* Scripts sometimes field-split ${.sh.version}, so don't change amount of whitespace. */

@ -332,7 +332,7 @@ The arithmetic expression
.I expr1
is evaluated first
(see
.I "Arithmetic evaluation"
.I "Arithmetic Evaluation"
below).
The arithmetic expression
.I expr2
@ -1029,7 +1029,7 @@ for an indexed array is denoted by
an
.I arithmetic expression\^
(see
.I "Arithmetic evaluation"
.I "Arithmetic Evaluation"
below)
between a
.B [
@ -1693,7 +1693,7 @@ is assigned a new value.
.B .sh.math
Used for defining arithmetic functions
(see
.I "Arithmetic evaluation"
.I "Arithmetic Evaluation"
below)
and stores the list of user defined arithmetic functions.
.TP
@ -6447,7 +6447,7 @@ is a separate
.I "arithmetic expression"
to be evaluated.
.B let
only recognizes octal constants starting with
only recognizes octal numbers starting with
.B 0
when the
.B set
@ -6456,7 +6456,7 @@ option
is on.
See
.I "Arithmetic Evaluation"
above, for a description of arithmetic expression evaluation.
above for a description of arithmetic expression evaluation.
.sp .5
The exit status is
0 if the value of the last expression
@ -7174,7 +7174,7 @@ Same as
.B letoctal
The
.B let
command allows octal constants starting with
command allows octal numbers starting with
.BR 0 .
On by default if ksh is invoked as \fBsh\fR or \fBrsh\fR.
.TP 8
@ -7250,6 +7250,10 @@ disables the special floating point constants \fBInf\fR and \fBNaN\fR in
arithmetic evaluation so that, e.g., \fB$((inf))\fR and \fB$((nan))\fR refer
to the variables by those names;
.IP \[bu]
enables the recognition of a leading zero as introducing an octal number in
all arithmetic evaluation contexts, except in the \fBlet\fR built-in while
\fBletoctal\fR is off;
.IP \[bu]
changes the \fBtest\fR/\fB[\fR built-in command to make its deprecated
\fIexpr1\fR \fB-a\fR \fIexpr2\fR and \fIexpr1\fR \fB-o\fR \fIexpr2\fR operators
work even if \fIexpr1\fR equals "\fB!\fR" or "\fb(\fR" (which means the

@ -664,8 +664,6 @@ char **sh_argbuild(Shell_t *shp,int *nargs, const struct comnod *comptr,int flag
*nargs = 0;
if(ac)
{
if(ac->comnamp == SYSLET)
flag |= ARG_LET;
argp = ac->comarg;
while(argp)
{

@ -405,11 +405,9 @@ static Sfdouble_t arith(const char **ptr, struct lval *lvalue, int type, Sfdoubl
const char radix = GETDECIMAL(0);
lvalue->eflag = 0;
errno = 0;
if(shp->bltindata.bnode==SYSLET && !sh_isoption(SH_LETOCTAL))
{ /*
* Since we're running the "let" builtin, disable octal number processing by
* skipping all initial zeros, unless the 'letoctal' option is on.
*/
if(!sh_isoption(shp->bltindata.bnode==SYSLET ? SH_LETOCTAL : SH_POSIX))
{
/* Skip leading zeros to avoid parsing as octal */
while(*val=='0' && isdigit(val[1]))
val++;
}
@ -534,7 +532,7 @@ Sfdouble_t sh_strnum(register const char *str, char** ptr, int mode)
{
Shell_t *shp = sh_getinterp();
register Sfdouble_t d;
char base=(shp->inarith?0:10), *last;
char base = (sh_isoption(shp->bltindata.bnode==SYSLET ? SH_LETOCTAL : SH_POSIX) ? 0 : 10), *last;
if(*str==0)
{
d = 0.0;
@ -544,7 +542,7 @@ Sfdouble_t sh_strnum(register const char *str, char** ptr, int mode)
{
errno = 0;
d = strtonll(str,&last,&base,-1);
if(*last && !shp->inarith && sh_isstate(SH_INIT))
if(*last && sh_isstate(SH_INIT))
{
/* This call is to handle "base#value" literals if we're importing untrusted env vars. */
errno = 0;
@ -580,6 +578,7 @@ Sfdouble_t sh_strnum(register const char *str, char** ptr, int mode)
Sfdouble_t sh_arith(Shell_t *shp,register const char *str)
{
NOT_USED(shp);
return(sh_strnum(str, (char**)0, 1));
}

@ -74,8 +74,6 @@ typedef struct _mac_
char patfound; /* set if pattern character found */
char assign; /* set for assignments */
char arith; /* set for ((...)) */
char let; /* set when expanding let arguments */
char zeros; /* strip leading zeros when set */
char arrayok; /* $x[] ok for arrays */
char subcopy; /* set when copying subscript */
int dotdot; /* set for .. in subscript */
@ -161,7 +159,6 @@ char *sh_mactrim(Shell_t *shp, char *str, register int mode)
savemac = *mp;
stkseek(stkp,0);
mp->arith = (mode==3);
mp->let = 0;
shp->argaddr = 0;
mp->pattern = (mode==1||mode==2);
mp->patfound = 0;
@ -219,7 +216,6 @@ int sh_macexpand(Shell_t* shp, register struct argnod *argp, struct argnod **arg
mp->arghead = arghead;
mp->quoted = mp->lit = mp->quote = 0;
mp->arith = ((flag&ARG_ARITH)!=0);
mp->let = ((flag&ARG_LET)!=0);
mp->split = !(flag&ARG_ASSIGN);
mp->assign = !mp->split;
mp->pattern = mp->split && !(flag&ARG_NOGLOB) && !sh_isoption(SH_NOGLOB);
@ -279,7 +275,7 @@ void sh_machere(Shell_t *shp,Sfio_t *infile, Sfio_t *outfile, char *string)
stkseek(stkp,0);
shp->argaddr = 0;
mp->sp = outfile;
mp->split = mp->assign = mp->pattern = mp->patfound = mp->lit = mp->arith = mp->let = 0;
mp->split = mp->assign = mp->pattern = mp->patfound = mp->lit = mp->arith = 0;
mp->quote = 1;
mp->ifsp = nv_getval(sh_scoped(shp,IFSNOD));
mp->ifs = ' ';
@ -1101,7 +1097,6 @@ static int varsub(Mac_t *mp)
int var=1,addsub=0,oldpat=mp->pattern,idnum=0,flag=0,d;
Stk_t *stkp = mp->shp->stk;
retry1:
mp->zeros = 0;
idbuff[0] = 0;
idbuff[1] = 0;
c = fcmbget(&LEN);
@ -1466,9 +1461,6 @@ retry1:
else
v = nv_getval(np);
mp->atmode = (v && mp->quoted && mode=='@');
/* special case --- ignore leading zeros */
if((mp->let || (mp->arith&&nv_isattr(np,(NV_LJUST|NV_RJUST|NV_ZFILL)))) && !nv_isattr(np,NV_INTEGER) && (offset==0 || isspace(c) || strchr(",.+-*/=%&|^?!<>",c)))
mp->zeros = 1;
}
if(savptr==stakptr(0))
stkseek(stkp,offset);
@ -1621,7 +1613,6 @@ retry1:
int split = mp->split;
int quoted = mp->quoted;
int arith = mp->arith;
int zeros = mp->zeros;
int assign = mp->assign;
if(newops)
{
@ -1638,7 +1629,7 @@ retry1:
mp->split = 0;
mp->quoted = 0;
mp->assign &= ~1;
mp->arith = mp->zeros = 0;
mp->arith = 0;
newquote = 0;
}
else if(c=='?' || c=='=')
@ -1650,7 +1641,6 @@ retry1:
mp->split = split;
mp->quoted = quoted;
mp->arith = arith;
mp->zeros = zeros;
mp->assign = assign;
/* add null byte */
sfputc(stkp,0);
@ -2111,7 +2101,6 @@ static void comsubst(Mac_t *mp,register Shnode_t* t, int type)
t = sh_dolparen((Lex_t*)mp->shp->lex_context);
if(t && t->tre.tretyp==TARITH)
{
mp->shp->inarith = 1;
fcsave(&save);
if(t->ar.arcomp)
num = arith_exec(t->ar.arcomp);
@ -2119,7 +2108,6 @@ static void comsubst(Mac_t *mp,register Shnode_t* t, int type)
num = sh_arith(mp->shp,t->ar.arexpr->argval);
else
num = sh_arith(mp->shp,sh_mactrim(mp->shp,t->ar.arexpr->argval,3));
mp->shp->inarith = 0;
out_offset:
stkset(stkp,savptr,savtop);
*mp = savemac;
@ -2330,18 +2318,6 @@ static void mac_copy(register Mac_t *mp,register const char *str, register int s
Stk_t *stkp=mp->shp->stk;
int oldpat = mp->pattern;
nopat = (mp->quote||(mp->assign==1)||mp->arith);
if(mp->zeros)
{
/* prevent leading 0's from becoming octal constants */
while(size>1 && *str=='0')
{
if(str[1]=='x' || str[1]=='X')
break;
str++,size--;
}
mp->zeros = 0;
cp = str;
}
if(mp->sp)
sfwrite(mp->sp,str,size);
else if(mp->pattern>=2 || (mp->pattern && nopat) || mp->assign==3)

@ -2954,14 +2954,7 @@ Sfdouble_t nv_getnum(register Namval_t *np)
}
}
else if((str=nv_getval(np)) && *str!=0)
{
if(nv_isattr(np,NV_LJUST|NV_RJUST) || (*str=='0' && !(str[1]=='x'||str[1]=='X')))
{
while(*str=='0')
str++;
}
r = sh_arith(shp,str);
}
return(r);
}

@ -105,15 +105,8 @@ Sfdouble_t nv_getn(Namval_t *np, register Namfun_t *nfp)
else
str = nv_getv(np,fp?fp:nfp);
if(str && *str)
{
if(nv_isattr(np,NV_LJUST|NV_RJUST) || (*str=='0' && !(str[1]=='x'||str[1]=='X')))
{
while(*str=='0')
str++;
}
d = sh_arith(shp,str);
}
}
return(d);
}

@ -470,7 +470,6 @@ Sfio_t *sh_subshell(Shell_t *shp,Shnode_t *t, volatile int flags, int comsub)
struct rand *rp; /* current $RANDOM discipline function data */
unsigned int save_rand_seed; /* parent shell $RANDOM seed */
int save_rand_last; /* last random number from $RANDOM in parent shell */
char save_inarith; /* flag indicating POSIX arithmetic context */
memset((char*)sp, 0, sizeof(*sp));
sfsync(shp->outpool);
sh_sigcheck(shp);
@ -592,9 +591,6 @@ Sfio_t *sh_subshell(Shell_t *shp,Shnode_t *t, volatile int flags, int comsub)
{
if(comsub)
{
/* a comsub within an arithmetic expression must not itself be in an arithmetic context */
save_inarith = shp->inarith;
shp->inarith = 0;
/* disable job control */
shp->spid = 0;
sp->jobcontrol = job.jobcontrol;
@ -738,8 +734,6 @@ Sfio_t *sh_subshell(Shell_t *shp,Shnode_t *t, volatile int flags, int comsub)
sh_close(sp->tmpfd);
}
shp->fdstatus[1] = sp->fdstatus;
/* restore POSIX arithmetic context flag */
shp->inarith = save_inarith;
}
if(!shp->subshare)
{

@ -21,6 +21,9 @@
. "${SHTESTS_COMMON:-${0%/*}/_common}"
integer hasposix=0
(set -o posix) 2>/dev/null && ((hasposix++)) # not using [[ -o ?posix ]] as it's broken on 93v-
trap '' FPE # NOTE: osf.alpha requires this (no ieee math)
integer x=1 y=2 z=3
@ -346,8 +349,14 @@ do (( ipx = ip % 256 ))
done
unset x
x=010
(( x == 10 )) || err_exit 'leading zeros in x treated as octal arithmetic with $((x))'
(( $x == 8 )) || err_exit 'leading zeros not treated as octal arithmetic with $x'
(( x == 10 )) || err_exit 'leading zeros in x treated as octal arithmetic with ((x))'
(( $x == 10 )) || err_exit 'leading zeros in x treated as octal arithmetic with (($x))'
if ((hasposix))
then set --posix
((x == 8)) || err_exit 'posix: leading zeros in x not treated as octal arithmetic with ((x))'
(($x == 8)) || err_exit 'posix: leading zeros in x not treated as octal arithmetic with (($x))'
set --noposix
fi
unset x
typeset -Z x=010
(( x == 10 )) || err_exit 'leading zeros not ignored for arithmetic'
@ -728,15 +737,26 @@ unset A
unset r x
integer x
r=020
(($r == 16)) || err_exit 'leading 0 not treated as octal inside ((...))'
(($r == 20)) || err_exit 'leading 0 treated as octal inside ((...))'
x=$(($r))
(( x == 16 )) || err_exit 'leading 0 not treated as octal inside $((...))'
((x == 20)) || err_exit 'leading 0 treated as octal inside $((...))'
x=$r
((x == 20 )) || err_exit 'leading 0 should not be treated as octal outside ((...))'
((x == 20)) || err_exit 'leading 0 treated as octal outside ((...))'
print -- -020 | read x
((x == -20)) || err_exit 'numbers with leading -0 should not be treated as octal outside ((...))'
((x == -20)) || err_exit 'numbers with leading -0 treated as octal outside ((...))'
print -- -8#20 | read x
((x == -16)) || err_exit 'numbers with leading -8# should be treated as octal'
if ((hasposix))
then set --posix
(($r == 16)) || err_exit 'posix: leading 0 not treated as octal inside ((...))'
x=$(($r))
(( x == 16 )) || err_exit 'posix: leading 0 not treated as octal inside $((...))'
x=$r
((x == 16)) || err_exit 'posix: leading 0 not as octal outside ((...))'
print -- -020 | read x
((x == -16)) || err_exit 'posix: numbers with leading -0 should be treated as octal outside ((...))'
set --noposix
fi
unset x
x=0x1
@ -750,8 +770,13 @@ let "$x==10" || err_exit 'arithmetic with $x where $x is 010 should be decimal i
(( 9.$x == 9.01 )) || err_exit 'arithmetic with 9.$x where x=010 should be 9.01'
(( 9$x == 9010 )) || err_exit 'arithmetic with 9$x where x=010 should be 9010'
x010=99
((x$x == 99 )) || err_exit 'arithtmetic with x$x where x=010 should be $x010'
(( 3+$x == 11 )) || err_exit '3+$x where x=010 should be 11 in ((...))'
((x$x == 99 )) || err_exit 'arithmetic with x$x where x=010 should be $x010'
(( 3+$x == 13 )) || err_exit '3+$x where x=010 should be 13 in ((...))'
if ((hasposix))
then set --posix
(( 3+$x == 11 )) || err_exit 'posix: 3+$x where x=010 should be 11 in ((...))'
set --noposix
fi
let "(3+$x)==13" || err_exit 'let should not recognize leading 0 as octal'
unset x
typeset -RZ3 x=10
@ -878,8 +903,9 @@ unset got
# ======
# https://github.com/ksh93/ksh/issues/326
for m in u d i o x X
((hasposix)) && for m in u d i o x X
do
set --posix
case $m in
o) exp="10;21;32;" ;;
x) exp="8;11;1a;" ;;
@ -887,22 +913,16 @@ do
*) exp="8;17;26;" ;;
esac
got=${ printf "%$m;" 010 021 032; }
[[ $got == "$exp" ]] || err_exit "printf %$m does not recognize octal arguments" \
[[ $got == "$exp" ]] || err_exit "posix: printf %$m does not recognize octal arguments" \
"(expected $(printf %q "$exp"), got $(printf %q "$got"))"
set --noposix
done
# https://github.com/ksh93/ksh/issues/326#issuecomment-917707463
exp=18
got=$(( $(integer x; x=010; echo $x) + 010 ))
# ^^^ decimal ^^^ octal
[[ $got == "$exp" ]] || err_exit 'Integer with leading zero incorrectly interpreted as octal in non-POSIX arith context' \
"(expected $(printf %q "$exp"), got $(printf %q "$got"))"
# ======
# BUG_ARITHNAN: In ksh <= 93u+m 2021-11-15 and zsh 5.6 - 5.8, the case-insensitive
# floating point constants Inf and NaN are recognised in arithmetic evaluation,
# overriding any variables with the names Inf, NaN, INF, nan, etc.
if (set --posix) 2>/dev/null
if ((hasposix))
then set --posix
Inf=42 NaN=13
inf=421 nan=137
@ -920,5 +940,12 @@ then set --posix
set --noposix
fi
# ======
# https://github.com/ksh93/ksh/issues/334#issuecomment-968603087
exp=21
got=$(typeset -Z x=0x15; { echo $((x)); } 2>&1)
[[ $got == "$exp" ]] || err_exit "typeset -Z corrupts hexadecimal number in arithmetic context" \
"(expected $(printf %q "$exp"), got $(printf %q "$got"))"
# ======
exit $((Errors<125?Errors:125))