external/cde - Personal Git space

mirror of git://git.code.sf.net/p/cdesktopenv/code synced 2025-03-09 15:50:02 +00:00

Author	SHA1	Message	Date
Martijn Dekker	416a412d71	Fix seeking on block devices with FD 0, 1 or 2 @stephane-chazelas reports: > A very weird issue: > > To reproduce on GNU/Linux (here as superuser) > > # truncate -s10M file > # export DEV="$(losetup -f --show file)" > # ksh -c 'exec 3<> "$DEV" 3>#((0))' # fine > # ksh -c 'exec 1<> file 1>#((0))' # fine > # ksh -c 'exec 1<> "$DEV" 1>#((0))' > ksh: 0: invalid seek offset > > Any seek offset is considered "invalid" as long as the file is a > block device and the fd is 0, 1 or 2. It's fine for fds above 2 > and it's fine with any fd for regular files. Apparently, block devices are not seekable with sfio. In io.c there is specific code to avoid using sfio's sfseek(3) if there is no sfio stream in sh.sftable[] for the file descriptor in question: 1398: Sfio_t *sp = sh.sftable[fn]; [...] 1420: if(sp) 1421: { 1422: off=sfseek(sp, off, SEEK_SET); 1423: sfsync(sp); 1424: } 1425: else 1426: off=lseek(fn, off, SEEK_SET); For file descriptors 0, 1 or 2 (stdin/stdout/stderr), there is a sh.sftable[] stream by default, and it is marked as not seekable. This makes it return -1 in these lines in sfseek.c, even if the system call called via SFSK() succeeds: 89: if(f->extent < 0) 90: { /* let system call set errno */ 91: (void)SFSK(f,(Sfoff_t)0,SEEK_CUR,f->disc); 92: return (Sfoff_t)(-1); 93: } ...which explains the strange behaviour. src/lib/libast/sfio/sfseek.c: sfseek(): - Allow for the possibility that the fallback system call might succeed: let it handle both errno and the return value. Resolves: https://github.com/ksh93/ksh/issues/318	2022-06-28 22:18:46 +02:00
Martijn Dekker	da97587e9e	lex.c: prevent restoring outdated stack pointer Lexical levels are stored in a dynamically grown array of int values grown by the stack_grow function. The pointer lex_match and the maximum index lex_max are part of the lexer state struct that is now saved and restored in various places -- see e.g. `37044047`, `a2bc49be`. If the stack needs to be grown, it is reallocated in stack_grow() using sh_realloc(). If that happens between saving and restoring the lexer state, then an outdated pointer is restored and the shell crashes. src/cmd/ksh93/include/shlex.h, src/cmd/ksh93/sh/lex.c: - Take lex_match and lex_max out of the lexer state struct and make them separate static variables. src/cmd/ksh93/edit/edit.c: - While we're at it, save and restore the lexer state in a way that is saner than the 93v- beta approach (re: `37044047`) as well as more readable. Instead of permanently allocating memory, use a local variable to save the struct. Save/restore directly around the sh_trap() call that actually needs this done. Resolves: https://github.com/ksh93/ksh/issues/482	2022-06-23 03:35:48 +01:00
Martijn Dekker	d8dc2a1d81	sh_setenviron(): deactivate compound assignment prefix Reproducers: $ ksh -c 'typeset -a arr=( ( (a $(($(echo 1) + 1)) c)1))' ksh: echo: arr[0]._AST_FEATURES=CONFORMANCE - ast UNIVERSE - ucb: cannot be an array ksh: [1]=1: invalid variable name $ ksh -c 'typeset -a arr=( (a $(($(echo 1) + 1)) c)1)' ksh: echo: arr._AST_FEATURES=CONFORMANCE - ast UNIVERSE - ucb: is not an identifier ksh: [1]=1: invalid variable name src/cmd/ksh93/sh/name.c: sh_setenviron(): - Save and clear the current compound assignment prefix (sh.prefix) while assigning to the _AST_FEATURES variable.	2022-06-23 03:34:16 +01:00
Martijn Dekker	225323f138	Fix more "/dev/null: cannot create" (re: `411481eb`) Reproducer: trap : USR1 while :; do kill -s USR1 $$ || exit; done & while :; do : >/dev/null; done It can take between a fraction of a second and a few minutes, but eventually it will fail like this: $ ksh foo foo[3]: /dev/null: cannot create kill: 77946: no such process It fails similarly with "cannot open" if </dev/null is used instead of >/dev/null. This is the same problem as in the referenced commit, except when handling traps -- so the same fix is required in sh_fault().	2022-06-20 19:39:00 +01:00
Martijn Dekker	40245e088d	Fix the exec optimisation mess (re: `17ebfbf6`, `6701bb30`, `d6c9821c`) This commit supersedes @lijog's Solaris patch 280-23332860 (see `17ebfbf6`) as this is a more general fix that makes the patch redundant. Of course its associated regression tests stay. Reproducer script: trap 'echo SIGUSR1 received' USR1 sh -c 'kill -s USR1 $PPID' Run as a normal script. Expected behaviour: prints "SIGUSR1 received" Actual behaviour: the shell invoking the script terminates. Oops. As of `6701bb30`, ksh again allows an exec-without-fork optimisation for the last command in a script. So the 'sh' command gets the same PID as the script, therefore its parent PID ($PPID) is the invoking script and not the script itself, which has been overwritten in working memory. This shows that, if there are traps set, the exec optimisation is incorrect as the expected process is not signalled. While `6701bb30` reintroduced this problem for scripts, this has always been an issue for certain other situations: forked command substitutions, background subshells, and -c option argument scripts. This commit fixes it in all those cases. In sh_exec(), case TFORK, the optimisation (flagged in no_fork) was only blocked for SIGINT and for the EXIT and ERR pseudosignals. That is wrong. It should be blocked for all signal and pseudosignal traps, except DEBUG (which is run before the command) and SIGKILL and SIGSTOP (which cannot be trapped). (I've also tested the behaviour of other shells. One shell, mksh, never does an exec optimisation, even if no traps are set. I don't know if that is intentional or not. I suppose it is possible that a script might expect to receive a signal without trapping it first, and they could conceivably be affected the same way by this exec optimisation. But the ash variants (e.g. Busybox ash, dash, FreeBSD sh), as well as bash, yash and zsh, all do act like this, so the behaviour is very widespread. This commit makes ksh act like them.) 
Multiple files: - Remove the sh.errtrap, sh.exittrap and sh.end_fn flags and their associated code from the superseded Solaris patch. src/cmd/ksh93/include/shell.h: - Add a scoped sh.st.trapdontexec flag for sh_exec() to disable exec-without-fork optimisations. It should be in the sh.st scope struct because it needs to be reset in subshell scopes. src/cmd/ksh93/bltins/trap.c: b_trap(): - Set sh.st.trapdontexec if any trap is set and non-empty (an empty trap means ignore the signal, which is inherited by an exec'd process, so the optimisation is fine in that case). - Only clear sh.st.trapdontexec if we're not in a ksh function scope; unlike subshells, ksh functions fall back to parent traps if they don't trap a signal themselves, so a ksh function's parent traps also need to disable the exec optimisation. src/cmd/ksh93/sh/fault.c: sh_sigreset(): - Introduce a new -1 mode for sh_funscope() to use, which acts like mode 0 except it does not clear sh.st.trapdontexec. This avoids clearing sh.st.trapdontexec for ksh functions scopes (see above). - Otherwise, clear sh.st.trapdontexec whenever traps are reset. src/cmd/ksh93/sh/xec.c: check_exec_optimization(): - Consolidate all the exec optimisation logic into this function, including the logic from the no_fork flag in sh_exec()/TFORK. - In the former no_fork flag logic, replace the three checks for SIGINT, ERR and EXIT traps with a single check for the sh.st.trapdontexec flag.	2022-06-18 23:27:10 +01:00
Martijn Dekker	16b3802148	Fix incorrect exec optimisation with monitor/pipefail on Reproducer script: tempfile=/tmp/out2.$$.$RANDOM bintrue=$(whence -p true) for opt in monitor pipefail do ( set +x -o "$opt" ( sleep .05 echo "ok $opt" >&2 ) 2>$tempfile | "$bintrue" ) & wait cat "$tempfile" rm -f "$tempfile" done Expected output: ok monitor ok pipefail Actual output: (none) The 'monitor' and 'pipefail' options are supposed to make the shell wait for all the commands in the pipeline to terminate and not only the last component, regardless of whether the pipe between the component commands is still open. In the failing reproducer, the dummy external true command is subject to an exec optimization, so it replaces the subshell instead of forking a new process. This is incorrect, as the shell is no longer around to wait for the left-hand part of the pipeline, so it continues in the background without being waited for. Since it writes to standard error after .05 seconds (after the pipe is closed), the 'cat' command reliably finds the temp file empty. Without the sleep this would be a race condition with unpredictable results. Interestingly, this bug is only triggered for a (background subshell)& and not for a forked (regular subshell). Which means the exec optimization is not done for a forked regular subshell, though there is no reason not to. That will be fixed in the next commit. src/cmd/ksh93/sh/xec.c: sh_exec(): - case TFORK: Never allow an exec optimization if we're running a command in a multi-command pipeline (pipejob is set) and the shell needs to wait for all pipeline commands, i.e.: either the time keyword is in use, the SH_MONITOR state is active, or the SH_PIPEFAIL option is on. - case TFIL: Fix the logic for setting job.waitall for the non-SH_PIPEFAIL case. Do not 'or' in the boolean value but assign it, and include the SH_TIMING (time keyword in use) state too. 
- case TTIME: After that fix in case TFIL, we don't need to bother setting job.waitall explicitly here. src/cmd/ksh93/sh.1: - Add missing documentation for the conditions where the shell waits for all pipeline components (time, -o monitor/pipefail). Resolves: https://github.com/ksh93/ksh/issues/449	2022-06-18 23:25:30 +01:00
Martijn Dekker	6016fb64ce	Forking workaround for converting to associative array in subshell $ arch/*/bin/ksh -xc 'typeset -a a=(1 2 3); \ (typeset -A a; typeset -p a); typeset -p a' typeset -A a=() typeset -a a=(1 2 3) The associative array in the subshell is empty, so the conversion failed. So far, I have been unsuccessful at fixing this in the array and/or virtual subshell code (a patch that fixes it there would still be more than welcome). As usual, real subshells work correctly, so this commit adds another forking workaround. The use case is rare and specific enough that I have no performance concerns. src/cmd/ksh93/bltins/typeset.c: setall(): - Fork a virtual subshell if we're actually converting a variable to an associative array, i.e.: the NV_ARRAY (-A, associative array) attribute was passed, there are no assignments (sh.envlist is NULL), and the variable is not unset. src/cmd/ksh93/tests/arith.sh: - Fix the "Array subscript quoting test" tests that should not have been passing and that correctly failed after this fix; they used 'typeset -A' without an assignment in a subshell, assuming it was unset in the parent shell, which it wasn't. Resolves: https://github.com/ksh93/ksh/issues/409	2022-06-15 04:58:14 +01:00
Martijn Dekker	50db00e136	Fix subshell trap integrity, e.g. re-trapping a signal in subshell Ksh handles local traps in virtual subshells the same way as local traps in ksh-style shell functions, which can cause incorrect operation. Reproducer script: trap 'echo "parent shell trap"; exit 0' USR1 (trap 'echo "subshell trap"' USR1; kill -USR1 $$) echo wrong Output on every ksh93 version: 'wrong' Output on every other shell: 'parent shell trap' The ksh93 output is wrong because $$ is the PID of the main shell, therefore 'kill -USR1 $$' from a subshell needs to issue SIGUSR1 to the main shell and trigger the 'echo SIGUSR1' trap. This is an inevitable consequence of processing signals in a virtual subshell. Signals are a process-wide property, but a virtual subshell and the parent shell share the same process. Therefore it is not possible to distinguish between the parent shell and subshell trap. This means virtual subshells are fundamentally incompatible with receiving signals. No workaround can make this work properly. Ksh could either assume the signal needs to be caught by the subshell trap (wrong in this case, but right in others) or by the parent shell trap. But it does neither and just gives up and does nothing, which I suppose is the least bad way of doing it wrong. As another example, consider a subshell that traps a signal, then passes its own process ID (as of `9de65210`, that's ${.sh.pid}) to another process to say "here is where to signal me". A virtual subshell will send it the PID that it shares with the parent shell. Even if a virtual subshell receives the signal correctly, it may fork mid-execution afterwards, depending on the commands that it runs (and this varies by implementation as we fix or work around bugs). So its PID could be invalidated at any time. Forking a virtual subshell at the time of trapping a signal is the only way to ensure a persistent PID and correct operation. 
src/cmd/ksh93/bltins/trap.c: b_trap(): - Fork when trapping (or ignoring) a signal in a virtual subshell. (There's no need to fork when trapping a pseudosignal.) src/cmd/ksh93/tests/signal.sh: - Add tests. These are simplified versions of tests already there, which issued 'kill' as a background job. Currently, running a background job causes a virtual subshell to fork before forking the 'kill' background job (or any background job, see `e3d7bf1d`) -- an ugly partial workaround that I believe just became redundant and which I will remove in the next commit.	2022-06-14 01:33:24 +01:00
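The reproducer from this commit can be wrapped in a small harness to compare shells. This is only a sketch: `subshell_trap_check` is a hypothetical helper name, not part of the ksh test suite, and it assumes the shell named in its argument is installed and accepts `-c`.

```sh
# Hypothetical helper: run the subshell-trap reproducer in the shell named
# by $1 (default: sh) and print whatever that shell outputs.
subshell_trap_check() {
    "${1:-sh}" -c '
        trap "echo \"parent shell trap\"; exit 0" USR1
        (trap "echo \"subshell trap\"" USR1; kill -USR1 $$)
        echo wrong
    '
}
```

Calling `subshell_trap_check ksh` against builds from before and after this commit should show the difference described above; per the commit message, other shells report the parent shell's trap.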
Martijn Dekker	9b893992a3	[v1.0] posix: don't zero-pad 2nds (re: `5c677a4c`, `70fc1da7`, `b1a41311`) The POSIX mode now disables left-hand zero-padding of seconds in 'time'/'times' output. The standard specifies the output format quite exactly and zero-padding is not in it.	2022-06-12 16:16:11 +01:00
Martijn Dekker	3030197b89	Add error message for ambiguous long-form option abbreviation Before: $ set -o hist -ksh: set: hist: bad option(s) After: $ set --hist -ksh: set: hist: ambiguous option In ksh 93u+m, there are three options starting with 'hist', so the abbreviation does not represent a single option. It is useful to communicate this in the error message. In addition, "bad option(s)" was changed to "unknown option", the same message as that given by AST optget(3), for consistency. src/cmd/ksh93/sh/string.c: - Make sh_lookopt() return -1 for an ambiguous option. This is trivial as there is already an 'amb' variable flagging that up. src/cmd/ksh93/sh/args.c: - Use the negative sh_lookopt() return status to decide between "unknown option" and "ambiguous option". src/cmd/ksh93/data/builtins.c: sh_set[]: - Explain the --option and --nooption forms in addition to the -o option and +o option forms. - Document the long options without their 'no' prefixes (e.g. glob instead of noglob) as this simplifies documentation and the relation with the short options makes more sense. Those names are also how they show up in 'set -o' output and it is better for those to match. - Tweaks.	2022-06-10 01:11:46 +01:00
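The distinction between an unknown and an ambiguous abbreviation can be sketched in plain shell. This is not the sh_lookopt() implementation: `lookopt`, its exit statuses (the C function returns -1 for ambiguity, which a shell function cannot do) and the rule that an exact match wins are all assumptions made for illustration, and the option names in the usage note are placeholders.

```sh
# lookopt ABBREV NAME...
# Prints the unique NAME that ABBREV abbreviates and returns 0.
# Returns 1 if no NAME matches and 2 if the abbreviation is ambiguous.
lookopt() {
    abbrev=$1; shift
    match= count=0
    for name in "$@"; do
        case $name in
        "$abbrev")                      # assumed rule: an exact match wins
            printf '%s\n' "$name"; return 0 ;;
        "$abbrev"*)                     # prefix match: remember and count it
            match=$name; count=$((count + 1)) ;;
        esac
    done
    case $count in
    0) return 1 ;;                      # unknown option
    1) printf '%s\n' "$match"; return 0 ;;
    *) return 2 ;;                      # ambiguous option
    esac
}
```

For example, `lookopt hist glob hist_one hist_two hist_three` returns status 2, which a caller could map to an "ambiguous option" message, while `lookopt gl glob hist_one` prints `glob`.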
Martijn Dekker	4f9456d69f	posix: re-allow preset aliases on interactive (re: `ddaa145b`) The POSIX standard in fact contains a sentence that specifically allows preset aliases: "Implementations also may provide predefined valid aliases that are in effect when the shell is invoked." https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_03_01 I had missed that back then. It's still a terrible idea for scripts (particularly the way 93u+ implemented them), but for interactive POSIX shells it makes a lot more sense and does not violate POSIX. src/cmd/ksh93/sh/main.c: sh_main(): - Preset aliases for interactive shell regardless of SH_POSIX.	2022-06-09 21:56:57 +01:00
Martijn Dekker	b14e79e9d0	posix: use real pipe(2) instead of socketpair(2) The POSIX standard requires real UNIX pipes as in pipe(2). But on systems supporting it (all modern ones), ksh uses socketpair(2) instead to make it possible for certain commands to peek ahead without consuming input from the pipe, which is not possible with real pipes. See features/poll and sh/io.c. But this can create undesired side effects: applications connected to a pipe may test if they are connected to a pipe, which will fail if they are connected to a socket. Also, on Linux: $ cat /etc/passwd | head -1 /dev/stdin head: cannot open '/dev/stdin' for reading: No such device or address ...which happens because, unlike most systems, Linux cannot open(2) or openat(2) a socket (a limitation that is allowed by POSIX). Unfortunately at least two things depend on the peekahead capability of the _pipe_socketpair feature. One is the non-blocking behaviour of the -n option of the 'read' built-in: -n Causes at most n bytes to be read instead of a full line, but will return when reading from a slow device as soon as any characters have been read. The other thing that breaks is the <#pattern and <##pattern redirection operators that basically grep standard input, which inherently requires peekahead. Standard UNIX pipes always block on read and it is not possible to peek ahead, so these features inevitably break. Which means we cannot simply use standard pipes without breaking compatibility. But we can at least fix it in the POSIX mode so that cross-platform scripts work more correctly. src/cmd/ksh93/sh/io.c: sh_pipe(): - If _pipe_socketpair is detected at compile time, then use a real pipe via sh_rpipe() if the POSIX mode is active. (If _pipe_socketpair is not detected, a real pipe is always used.) src/cmd/ksh93/data/builtins.c: - sh.1 documents the slow-device behaviour of -n but 'read --man' did not. Add that, making it conditional on _pipe_socketpair. 
Resolves: https://github.com/ksh93/ksh/issues/327	2022-06-09 16:16:16 +01:00
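The kind of check an application might perform on its standard input can be sketched as follows. `stdin_kind` is a hypothetical helper, and the sketch assumes the system provides /dev/stdin and that the shell's test builtin stats it like any other path.

```sh
# Hypothetical helper: report what standard input is connected to.
stdin_kind() {
    if [ -p /dev/stdin ]; then
        echo pipe        # a real pipe(2) FIFO
    elif [ -S /dev/stdin ]; then
        echo socket      # e.g. one end of a socketpair(2)
    else
        echo other       # terminal, regular file, etc.
    fi
}
```

Running `printf 'x\n' | stdin_kind` in a shell that builds pipelines with pipe(2) reports `pipe`; in a ksh built with _pipe_socketpair and running outside POSIX mode, one would expect it to report `socket` instead.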
Martijn Dekker	0602177646	posix: block brace expansion of unquoted expansions (re: `a14d17c0`) Historically, ksh (including ksh88 and mksh) allow brace expansion not just on literal patterns but also on patterns resulting from unquoted variable expansions or command substitutions: $ a='{a,b}' ksh -c 'echo X{a,b} Y$a' Xa Xb Ya Yb Most people expect only the first (literal) pattern to be expanded, as in bash and zsh: $ a='{a,b}' bash -c 'echo X{a,b} Y$a' Xa Xb Y{a,b} The historic ksh behaviour is poorly documented and nearly unknown, violates the principle of least astonishment, and makes unquoted variable expansions even more unsafe. See discussion at: https://www.austingroupbugs.net/view.php?id=1193 https://github.com/ksh93/ksh/issues/140 Unfortunately, we cannot change it in default ksh without breaking backward compatibility. But we can at least fix it for the POSIX mode (which disables brace expansion by default but allows turning it back on), particularly as it looks like POSIX, if it decides to specify brace expansion in a future version of the standard, will disallow brace expansion on unquoted variable expansions. src/cmd/ksh93/sh/macro.c: endfield(): - When deciding whether to do brace expansion + globbing or only globbing, also check that we do not have POSIX mode and an unquoted variable expansion (mp->pattern==1).	2022-06-08 22:21:53 +01:00
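A script can probe which behaviour the running shell has. The function below is a sketch with a made-up name; it relies only on counting the fields produced by an unquoted expansion.

```sh
# Hypothetical probe: succeeds if the running shell performs brace
# expansion on the result of an unquoted variable expansion.
expands_braces_from_variables() {
    pattern='{a,b}'
    # Deliberately unquoted: historic ksh turns this into "Ya Yb",
    # other shells keep the single word "Y{a,b}".
    set -- Y$pattern
    [ "$#" -eq 2 ]
}
```

In bash the probe fails (one field), matching the bash output shown above; in ksh outside POSIX mode it should succeed.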
Martijn Dekker	9da0887e54	Fix spurious export attribute when printing compound variables Reproducer script: typeset -Ttyp1 typ1=( function get { .sh.value="'Sample'"; } ) typ1 var11 typeset -p .sh.type typeset -p .sh.type Buggy output: namespace sh.type { typeset -r typ1='Sample' } namespace sh.type { typeset -x -r typ1='Sample' } An -x (export) attribute is magically pulled out of a hat. Analysis: The walk_tree() function in nvdisc.c repurposes (!) the NV_EXPORT attribute as an instruction to turn off indenting when pretty-printing the values of compound variables. The print_namval() function in typeset.c, implementing 'typeset -p', turns on NV_EXPORT for compound variables to inhibit indentation. But it then does not bother to turn it off, which causes this bug. src/cmd/ksh93/bltins/typeset.c: print_namval(): - When printing compound variables, only turn on NV_EXPORT temporarily. Resolves: https://github.com/ksh93/ksh/issues/456	2022-06-07 04:27:54 +01:00
Martijn Dekker	80f8cc497f	Fix completion following $'foo\'bar' On an interactive shell in emacs or vi, type a command with a $'…' quoted string that contains a backslash-escaped single quote, like: $ true $'foo\'bar' ▁ Then begin to type the name of a file present in the current working directory and press tab. Nothing happens as completion fails to work. The completion code does not recognise $'…' strings. Instead, it parses them as '…' strings in which there are no backslash escapes, so it considers the last ' to open a second quoted string which is not terminated. Plus, when replacing a $'…' string with a (backslash-escaped) completed string, the initial '$' is not replaced: $ $'/etc/hosts<Tab> $ $/etc/hosts src/cmd/ksh93/edit/completion.c: - find_begin(): - Learn how to recognise $'…' strings. A new local dollarquote flag variable is used to distinguish them from regular '…' strings. The difference is that backslash escapes (and only those) should be recognised as in "…". - Set a special type -1 for $'…' as the caller will need a way to distinguish those from '…'. - ed_expand(): When replacing a quoted string, remove an extra initial character (being the $ in $') if the type set by find_begin() is -1. Resolves: https://github.com/ksh93/ksh/issues/462	2022-06-06 03:13:13 +01:00
Martijn Dekker	7a5423dfb6	Fix more spurious comsub execution in tab completion (re: `7a2d3564`) Comsubs were either executed or caused a syntax error when attempting to complete them within single quotes. Since single quotes do not expand anything, no such completion should take place. $ '`/de<TAB>-ksh: /dev/: cannot execute [Is a directory] $ '$(/de<TAB>-ksh: syntax error: `end of file' unexpected src/cmd/ksh93/edit/completion.c: - find_begin(): - Remove recursive handling for '`' comsubs from `7a2d3564`; it is sufficient to set the return pointer to the current location cp (the character following '`') if we're not in single quotes. - For '$' and '`', if we are within single quotes, set type '\'' and set the return pointer bp to the location of the '$' or '`'. - ed_expand(): If find_begin() sets type '\'' and the current begin character is $ or `, refuse to attempt completion; return -1 to cause a terminal beep. Related: https://github.com/ksh93/ksh/issues/268 https://github.com/ksh93/ksh/issues/462#issuecomment-1038482307	2022-06-06 03:12:57 +01:00
Martijn Dekker	e3aa32a129	Add --functrace shell option (re: `2a835a2d`) A side effect of the bug fixed in `2a835a2d` caused the DEBUG trap action to appear to be inherited by subshells, but in a buggy way that could crash the shell. After the fix, the trap is reset in subshells along with all the others, as it should be. Nonetheless, as that bug was present for years, some have come to rely on it. This commit implements that functionality properly. When the new --functrace option is turned on, DEBUG trap actions are now inherited by subshells as well as ksh function scopes. In addition, since it makes logical sense, the new option also causes the -x/--xtrace option's state to be inherited by ksh function scopes. Note that changes made within the scope do not propagate upwards; this is different from bash. (I've opted against adding a -T short-form equivalent as on bash, because -T was formerly a different option on 93u+ (see 63c55ad7) and on mksh it has yet another different meaning. To minimise confusion, I think it's best to have the long-form name only.) src/cmd/ksh93/include/shell.h, src/cmd/ksh93/data/options.c: - Add new "functrace" (SH_FUNCTRACE) long-form shell option. src/cmd/ksh93/sh/subshell.c: sh_subshell(): - When functrace is on, copy the parent's DEBUG trap action into the virtual subshell scope after resetting the trap actions. src/cmd/ksh93/sh/xec.c: sh_funscope(): - When functrace is on and xtrace is on in the parent scope, turn it on in the child scope. - Same DEBUG trap action handling as in sh_subshell(). Resolves: https://github.com/ksh93/ksh/issues/162	2022-06-04 17:27:27 +01:00
Martijn Dekker	1184b2ade9	Honour attribs for assignments preceding sp. builtins, POSIX functs After the previous commit, one inconsistency was left. Assignments preceding special built-ins and POSIX functions (which persist past the command :-/) caused pre-existing attributes of the respective variables to be cleared. $ (f() { typeset -p n; }; typeset -i n; n=3+4 f) n=3+4 (expected output: 'typeset -i n=7') This problem was introduced shortly before the release of ksh 93u+, in version 2012-05-04, by adding these lines of code to the code for processing preceding assignments in sh_exec(): src/cmd/ksh93/sh/xec.c: 1055: if(np) 1056: flgs |= NV_UNJUST; So, for special and declaration commands and POSIX functions, the NV_UNJUST flag is passed to nv_open(). In older ksh versions, this flag cleared justify attributes only, but in early 2012 it was repurposed to clear all attributes -- without changing the name or the relevant comment in name.h, which are now both misleading. The reason for setting this flag in xec.c was to deal with some bugs in 'typeset'. Removing those two lines causes regressions: attributes.sh[316]: FAIL: typeset -L should not preserve old attributes attributes.sh[322]: FAIL: typeset -R should not preserve old attributes attributes.sh[483]: FAIL: typeset -F after typeset -L fails attributes.sh[488]: FAIL: integer attribute not cleared for subsequent typeset Those are all typeset regressions, which suggests this fix was relevant to typeset only. This is corroborated by the relevant AT&T ksh93/RELEASE entry: 12-04-27 A bug in which old attributes were not cleared when assigning a value using typeset has been fixed. So, to fix this 2012 regression without reintroducing the typeset regressions, we need to set the NV_UNJUST flag for invocations of the typeset family of commands only. This is changed in xec.c. While we're at it, we might as well rename that little-used flag to something that reflects its current purpose: NV_UNATTR.	2022-06-03 23:28:28 +01:00
Martijn Dekker	75b247cce2	Honour attributes for local assignments preceding certain commands Reproducer: $ typeset -i NUM=123 $ (NUM=3+4 env; :)|grep ^NUM= NUM=3+4 $ (NUM=3+4 env)|grep ^NUM= NUM=7 The correct output is NUM=7. This is also the output if ksh is compiled with SHOPT_SPAWN disabled, suggesting that the problem is here in sh_ntfork(), where the invocation-local environment list is set using a sh_scope() call: src/cmd/ksh93/sh/xec.c: 3496: if(t->com.comset) 3497: { 3498: scope++; 3499: sh_scope(t->com.comset,0); 3500: } Analysis: When ksh is compiled with SHOPT_SPAWN disabled, external commands are always run using the regular forking mechanism. First the shell forks, then in the fork, the preceding assignments list (if any) are executed and exported in the main scope. Replacing global variables is not a problem as the variables are exported and the forked shell is then replaced by the external command using execve(2). But when using SHOPT_SPAWN/sh_ntfork(), this cannot be done as the fork(2) use is replaced by posix_spawn(2) which does not copy the parent process into the child, therefore it's not possible to execute anything in the child before calling execve(2). Which means the preceding assignments list must be processed in the parent, not the child. Which makes overwriting global variables a no-no. To avoid overwriting global variables, sh_ntfork() treats preceding assignments like local variables in functions, which means they do not inherit any attributes from the parent scope. That is why the integer attribute is not honoured in the buggy reproducers. And this is not just an issue for external commands. sh_scope() is also used for assignments preceding a built-in command. Which is logical, as those don't create a process at all. src/cmd/ksh93/sh/xec.c: 1325: if(argp) 1326: { 1327: scope++; 1328: sh_scope(argp,0); 1329: } Which means this bug exists for them as well, regardless of whether SHOPT_SPAWN is compiled in. 
$ /bin/ksh -c 'typeset -i NUM; NUM=3+4 command eval '\''echo $NUM'\' 3+4 (expected: 7, as on mksh and zsh) So, the attributes from the parent scope will need to be copied into the child scope. This should be done in nv_setlist() which is called from sh_scope() with both the NV_EXPORT and NV_NOSCOPE flags passed. Those flag bits are how we can recognise the need to copy attributes. Commit `f6bc5c0` fixed a similar inconsistency with the check for the read-only attribute. In fact, the bug fixed there was simply a specific instance of this bug. The extra check for readonly was because the readonly attribute was not copied to the temporary local scope. So that fix is now replaced by the more general fix for this bug. src/cmd/ksh93/sh/name.c: nv_setlist(): - Introduce a 'vartree' local variable to avoid repetitive 'sh.prefix_root ? sh.prefix_root : sh.var_tree' expressions. - Use the NV_EXPORT|NV_NOSCOPE flag combination to check if parent scope attributes need to be copied to the temporary local scope of an assignment preceding a command. - If so, copy everything but the value itself: attributes (nvflag), size parameter (nvsize), discipline function pointer (nvfun) and the name pointer (nvname). The latter is needed because some code, at least put_lang() in init.c, compares names by comparing the pointers instead of the strings; copying the nvname pointer avoids a regression in tests/locale.sh. src/cmd/ksh93/sh/xec.c: local_exports(): - Fix a separate bug exporting attributes to a new ksh function scope, which was previously masked by the other bug. The attributes (nvflag) were copied after nv_putval()ing the value, which is incorrect as the behaviour of nv_putval() is influenced by the attributes. But here, we're copying the value too, so we can simplify the whole function by using nv_clone() instead. This may also fix other corner cases. (re: `c1994b87`) Resolves: https://github.com/ksh93/ksh/issues/465	2022-06-03 23:28:16 +01:00
Martijn Dekker	2ecc2575d5	Fix import of float attribute/value from environment (re: `960a1a99`) Bug 1: as of `960a1a99`, floating point literals were no longer recognised when importing variables from the environment. The attribute was still imported but the value reverted to zero: $ (export LC_NUMERIC=C; typeset -xF5 num=7.75; \ ksh -c 'typeset -p num') typeset -x -F 5 num=0.00000 Bug 2 (inherited from 93u+): The code that imported variable attributes from the environment only checked '.' to determine whether the float attribute should be set. It should check the current radix point instead. $ (export LC_NUMERIC=debug; typeset -xF5 num=7,75; \ ksh -c 'typeset -p num') typeset -x -i num=0 ...or, after fixing bug 1 only, the output is: typeset -x -i num=75000 src/cmd/ksh93/sh/arith.c: sh_strnum(): - When importing untrusted env vars at init time, handle not only "base#value" literals using strtonll, but also floating point literals using strtold. This fixes the bug without reallowing arbitrary expressions. (re: `960a1a99`) - When not initialising, use sh.radixpoint (see `f0386a87`) instead of '.' to help decide whether to evaluate an arith expression. src/cmd/ksh93/sh/init.c: env_import_attributes(): - Use sh.radixpoint instead of '.' to check for a decimal fraction. (This code is needed because doubles are exported as integers for ksh88 compatibility; see attstore() in name.c.)	2022-06-03 12:18:54 +01:00
Martijn Dekker	ea300089a1	New feature: 'typeset -g' as in bash 4.2+ typeset -g allows directly manipulating the attributes of variables at the global level from any context. This feature already exists on bash 4.2 and later. mksh (R50+), yash and zsh have this flag as well, but it's slightly different: it ignores the current local scope, but a parent local scope from a calling function may still be used -- whereas on bash, '-g' always refers to the global scope. Since ksh93 uses static scoping (see III.Q28 at <http://kornshell.com/doc/faq.html>), only the bash behaviour makes sense here. Note that the implementation needs to be done both in nv_setlist() (name.c) and in b_typeset() (typeset.c) because assignments are executed before the typeset built-in itself. Hence also the pre-parsing of typeset options in sh_exec(). src/cmd/ksh93/include/nval.h: - Add new NV_GLOBAL bit flag, using a previously unused bit that still falls within the 32-bit integer range. src/cmd/ksh93/sh/xec.c: sh_exec(): - When pre-parsing typeset flags, make -g pass the NV_GLOBAL flag to the nv_setlist() call that processes shell assignments prior to running the command. src/cmd/ksh93/sh/name.c: nv_setlist(): - When the NV_GLOBAL bit flag is passed, save the current variable tree pointer (sh.var_tree) as well as the current namespace (sh.namespace) and temporarily set the former to the global variable tree (sh.var_base) and the latter to NULL. This makes assignments global and ignores namespaces. src/cmd/ksh93/bltins/typeset.c: - b_typeset(): - Use NV_GLOBAL bit flag for -g. - Allow combining -n with -g, permitting 'typeset -gn var' or 'nameref -g var' to create a global nameref from a function. - Do not allow a nonsensical use of -g when using nested typeset to define member variables of 'typeset -T' types. (Type method functions can still use it as normal.) - setall(): - If NV_GLOBAL is passed, use sh.var_base and deactivate sh.namespace as in nv_setlist(). 
This allows attributes to be set correctly for global variables. src/cmd/ksh93/tests/{functions,namespace}.sh: - Add regression tests based on reproducers for problems found by @hyenias in preliminary versions of this feature. Resolves: https://github.com/ksh93/ksh/issues/479	2022-06-01 21:07:01 +01:00
Martijn Dekker	f73b8617dd	Restore namespace's parent scope when exiting due to error Reproducer: $ namespace test { x=123; typeset -g x=456; } $ echo $x ${.test.x} 456 123 $ namespace test { typeset -Q; } arch/darwin.i386-64/bin/ksh: typeset: -Q: unknown option [usage message snipped for brevity] $ echo $x ${.test.x} 123 123 <== expected: 123 456 $ x=789 $ echo $x ${.test.x} 789 789 <== expected: 789 456 $ # look at that, we never left the namespace... When prefixing the erroneous 'typeset' with 'command', the problem does not occur. 'command' disables the properties of special built-ins such as exit on error. So, when a special built-in exits on error, the parent scope is not properly restored. This bug exists in every ksh93 version with SHOPT_NAMESPACE so far. src/cmd/ksh93/sh/xec.c: sh_exec(): - Before entering a namespace, use sh_pushcontext and sigsetjmp to make sure we return here if sh_exit() is called, e.g. when a special builtin throws an error, to ensure the parent scope (oldnspace) is restored. Thanks to @hyenias for making me aware of this bug. Discussion: https://github.com/ksh93/ksh/issues/479#issuecomment-1140468965	2022-05-29 23:05:03 +01:00
Martijn Dekker	8f14514661	set --default: properly restore ksh IFS behaviour (re: `9e2a8c69`) Reproducer: $ (IFS=$'\t\t'; val=$'\tone\t\ttwo\t'; set --posix; \ set -- $val; echo $#; set --noposix; set -- $val; echo $#) 2 4 <== OK $ (IFS=$'\t\t'; val=$'\tone\t\ttwo\t'; set --posix; \ set -- $val; echo $#; set --default; set -- $val; echo $#) 2 2 <== bug The output of the second command line should be like the first. When POSIX mode is turned off using 'set --noposix' (or 'set +o posix'), sh.ifstable is invalidated as it needs to be repopulated on the next field split to restore ksh-specific special handling of a repeated $IFS whitespace character as non-whitespace. However, when 'set --default' is used, this does not happen, which is a bug. src/cmd/ksh93/sh/args.c: sh_argopts(): - While processing --default, when turning off SH_POSIX, call sh_invalidate_ifs() to invalidate sh.ifstable.	2022-05-28 00:13:46 +01:00
Martijn Dekker	83baa27ef9	Fix incorrect typeset -L/-R/-Z on input with spaces (re: `bdb99741`) The typeset output for -L/-R/-Z seems to be wrong when the input has leading/trailing spaces. This started occurring after the dynamic buffer size changes introduced in name.c as part of the fix for <https://github.com/ksh93/ksh/issues/142>. Test script: typeset -L8 s_date1=" 22/02/09 08:25:01"; echo "$s_date1" typeset -R10 s_date1="22/02/09 08:25:01 "; echo "$s_date1" typeset -Z10 s_date1="22/02/09 08:25:01 "; echo "$s_date1" Actual output: 22/02/0 08:25:01 0008:25:01 Expected output: 22/02/09 9 08:25:01 9 08:25:01 src/cmd/ksh93/sh/name.c: nv_newattr(): - Simplify allocation code, replacing the earlier dynamic buffer size calculation with just the greater of the strlen and size. Resolves: https://github.com/ksh93/ksh/issues/476 Co-authored-by: George Lijo <george.lijo@gmail.com>	2022-05-26 00:08:45 +01:00
atheik	9bed28c3f9	Fix line continuation within command substitutions In command substitutions of the $(standard) and ${ shared state; } form, backslash line continuation is broken. Reproducer: echo $( echo one two\ three ) Actual output (ksh93, all versions): one two\ three Expected output (every other shell, POSIX spec): one twothree src/cmd/ksh93/sh/lex.c: sh_lex(): case S_REG: - Do not skip new-line joining if we're currently processing a command substitution of one of these forms (i.e., if the lp->lexd.dolparen level is > 0). Background info/analysis: comsub() is called from sh_lex() when S_PAR is the current state. In src/cmd/ksh93/data/lexstates.c, we see that S_PAR is reached in the ST_DOL state table at index 40. Decimal 40 is ( in ASCII. So, the previous skipping of characters was done according to the ST_DOL state table, and the character that stopped it was (. This means we have $(. Alternatively, comsub() may be called from sh_lex() by jumping to the do_comsub label. In brief, that is the case when we have ${. Regardless of which it is from the two, comsub() is now called from sh_lex(). In comsub(), lp->lexd.dolparen is incremented at the beginning and decremented at the end. Between them, we see that sh_lex() is called. So, lp->lexd.dolparen in sh_lex() indicates the depth of nesting $( or ${ statements we're in. Thus, it is also the number of comsub() invocations seen in a backtrace taken in sh_lex(). The codepath for `...` is different (and never had this bug). Co-authored by: Martijn Dekker <martijn@inlv.org> Resolves: https://github.com/ksh93/ksh/issues/367	2022-05-22 00:23:54 +01:00
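The flattened reproducer in the entry above can be restored to its multi-line form as a small sketch. The variable name `joined` is only illustrative; the expected result is the one the commit message gives for POSIX-conforming shells.

```shell
# Backslash-newline inside $(...) is a line continuation: the backslash
# and the newline are both removed before the command is parsed, so the
# inner command is effectively "echo one twothree".
joined=$(echo one two\
three)

# Per the commit message: "one twothree" is expected; affected ksh93
# versions printed "one two\" and "three" on separate lines instead.
echo "$joined"
```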
atheik	40a5c45b48	Allow double quotes within backtick comsub within double quotes The following reproducer causes a spurious syntax error: foo="`: "("`" The nested double quotes are not recognised correctly, causing a syntax error at the '('. Removing the outer double quotes (which are unnecessary) is a workaround, but it's still a bug as every other shell accepts this. This bug has been present since the original Bourne shell. src/cmd/ksh93/sh/lex.c: sh_lex(): case S_QUOTE: - If the current character is '"' and we're in a `...` command substitution (ingrave is true), then do not switch to the old mode but keep using the ST_QUOTE state table. Thanks to @JohnoKing for the report and to @atheik for the fix. Co-authored by: Martijn Dekker <martijn@inlv.org> Resolves: https://github.com/ksh93/ksh/issues/352	2022-05-20 22:48:47 +01:00
atheik	86b94d9feb	libast: optget(3): Fix memory leak in --help/--man info src/lib/libast/misc/optget.c: textout(): case ']': - Before returning, call pop() to free any \f...\f info items that are left. Note that this is a safe no-op if the pointer is null. Resolves: https://github.com/ksh93/ksh/issues/407 Co-authored-by: Martijn Dekker <martijn@inlv.org>	2022-03-11 21:24:08 +01:00
Martijn Dekker	fd28da31da	Fix another test/[ corner case bug; add --posix test script This fixes another corner case bug in the horror show that is the test/[ command. Reproducer: $ ksh --posix -c 'test X -a -n' ksh: test: argument expected Every other shell returns 0 (success) as, POSIXly, this is a test for the strings 'X' and '-n' both being non-empty, combined with the binary -a (logical and) operator. Instead, '-n' was taken as a unary primary operator with a missing argument, which is incorrect. POSIX reference: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/test.html > 3 arguments: > * If $2 is a binary primary, perform the binary test of $1 and $3. src/cmd/ksh93/bltins/test.c: - e3(): If the final argument begins with a dash, always treat it as a test for a non-empty string, therefore return true. Do not limit this to "new flags" only. src/cmd/ksh93/tests/posix.sh: - Added. These are tests for every aspect of the POSIX mode.	2022-03-11 21:23:45 +01:00
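The three-argument rule quoted in the entry above can be exercised with a short sketch. The function name `both_nonempty` is hypothetical and exists only for this illustration; it is not part of ksh or of the new posix.sh test script.

```shell
# both_nonempty A B
# Calls test with exactly three arguments. Because $2 is the binary
# primary -a, POSIX says to perform the binary test of $1 and $3, i.e.
# "A is non-empty AND B is non-empty", even when B looks like an operator.
both_nonempty() {
    test "$1" -a "$2"
}

if both_nonempty X -n; then
    echo "'-n' was treated as a plain non-empty string"
else
    echo "'-n' was not treated as a plain string"
fi
```

An empty third argument makes the same call fail, which distinguishes a real string test from one that always succeeds.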
Martijn Dekker	9e2a8c6925	posix mode: disable effect of repeating whitespace char in $IFS ksh has a little-known field splitting feature that conflicts with POSIX: if a single-byte whitespace character (cf. isspace(3)) is repeated in $IFS, then field splitting is done as if that character wasn't a whitespace character. An example with the tab character: $ (IFS=$'\t'; val=$'\tone\t\ttwo\t'; set -- $val; echo $#) 2 $ (IFS=$'\t\t'; val=$'\tone\t\ttwo\t'; set -- $val; echo $#) 4 The latter being the same as, for example $ (IFS=':'; val=':one::two:'; set -- $val; echo $#) 4 However, this is incompatible with the POSIX spec and with every other shell except zsh, in which repeating a character in IFS does not have any effect. So the POSIX mode must disable this. src/cmd/ksh93/include/defs.h, src/cmd/ksh93/sh/init.c: - Add sh_invalidate_ifs() function that invalidates the IFS state table by setting the ifsnp discipline struct member to NULL, which will cause the next get_ifs() call to regenerate it. - get_ifs(): Treat a repeated char as S_DELIM even if whitespace, unless --posix is on. src/cmd/ksh93/sh/args.c: - sh_argopts(): Call sh_invalidate_ifs() when enabling or disabling the POSIX option. This is needed to make the change in field splitting behaviour take immediate effect instead of taking effect at the next assignment to IFS.	2022-03-11 21:22:22 +01:00
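The field counts discussed in the entry above can be reproduced with a portable sketch. The helper name `count_fields` is hypothetical, and a literal tab is built with printf so that the example does not depend on the $'...' quoting used in the reproducers.

```shell
# count_fields IFS_VALUE STRING
# Prints how many fields the unquoted expansion of STRING produces when
# IFS is set to IFS_VALUE. The body is a ( ) subshell so that the IFS
# change and 'set -f' do not leak into the caller.
count_fields() (
    IFS=$1
    set -f        # disable pathname expansion on the split result
    set -- $2     # unquoted on purpose: field splitting happens here
    echo "$#"
)

tab=$(printf '\t')
val="${tab}one${tab}${tab}two${tab}"

count_fields "$tab" "$val"        # IFS whitespace: runs of tabs collapse
count_fields "$tab$tab" "$val"    # repeated tab: the case this commit is about
count_fields ':' ':one::two:'     # non-whitespace delimiter: empty fields count
```

Going by the commit message, the first call prints 2 and the third prints 4 everywhere, while the second prints 2 in POSIX-conforming shells (and in ksh with --posix) but 4 in ksh outside POSIX mode.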
Martijn Dekker	fae1932e62	enum: remove arbitrary one-argument limitation b_enum() contains a check that exactly one argument is given: 237: if (error_info.errors || !*argv || *(argv + 1)) But the subsequent argument handling loop will happily deal with multiple arguments: 246: while(cp = *argv++) Every other declaration command supports multiple arguments and I see no reason why enum shouldn't. Simply removing the '*(argv + 1)' check allows 'enum' to create more than one type per invocation. src/cmd/ksh93/bltins/enum.c: - b_enum(): Remove check for >1 args as described above. - Update documentation to describe the behaviour of enumeration types in arithmetic expressions and to add an example: a bool type with two enumeration values 'false' (0) and 'true' (1). That type is predefined in ksh 93v- and 2020. We're not going to do that in 93u+m but it's good to document the possibility. src/cmd/ksh93/sh.1: - Make changes parallel to the enum.c self-doc update.	2022-03-11 21:21:23 +01:00
Johnothan King	8fc8c2f51c	Fix a few minor issues (#473 ) Changes: - Fixed two xtrace test failures introduced in commit cfc8744c. - The definition of _use_ntfork_tcpgrp in xec.c is now dependent on SHOPT_SPAWN being defined (re: 8e9ed5be). - Removed many unnecessary newlines and fixed various typos.	2022-03-11 21:18:42 +01:00
Johnothan King	bb3527aea5	Fix infinite loop when posix_spawn fails (re: `0863a8eb`) (#468 ) This commit fixes an infinite loop introduced in commit `0863a8eb` that caused ksh to enter an infinite loop if posix_spawn failed to start a new process after setting the terminal process group. Reproducer (warning: it will cause ksh to crash Wayland sessions and drives up CPU usage by a ton): $ /tmp/this/file/does/not/exist /usr/bin/ksh: /tmp/this/file/does/not/exist: not found $ <Press enter> (ksh now prints $PS1 in a loop until killed with SIGKILL) The first bug fixed is the infinite loop that occurs when posix_spawn fails to execute a command. This was fixed by setting the terminal process group to the main interactive shell. The second bug fixed is related to the signal handling of the SIGTTIN, SIGTTOU and SIGTSTP signals. In sh_ntfork() these signals are set to their default signal handlers (SIG_DFL) before running a command. The signal handlers were only restored to SIG_IGN (ignore signal) when sh_ntfork() successfully ran a command. This could cause a SIGTTOU lockup under strace when a command failed to execute in an interactive shell, while also being one cause of the infinite loop. src/cmd/ksh93/sh/xec.c: sh_ntfork(): - Restore the terminal process group if posix_spawn failed to launch a new process. This is necessary because posix_spawn will set the terminal process group before it attempts to run a command and doesn't restore it on failure.	2022-03-11 21:14:20 +01:00
atheik	2e5fd4d4c1	slowread(): Turn off O_NONBLOCK for stdin if it is on (#471 ) This change turns off O_NONBLOCK for stdin if a previously run program left it on so that interactive programs that expect it to be off work properly. src/cmd/ksh93/sh/io.c: slowread(): - Turn off O_NONBLOCK for stdin if it is on. Fixes: https://github.com/ksh93/ksh/issues/469	2022-03-11 21:10:59 +01:00
Johnothan King	e87dbebebd	Fix use after free bug when using += (re: `75796a9c`) (#466 ) The previous fix for the += operator introduced a use-after-free bug that could result in a variable pointing to random garbage: $ foo=bar $ foo+=_foo true $ typeset -p foo foo=V V The use after free issue occurs because when nv_clone creates a copy of $foo in the true command's invocation-local scope, it does not duplicate the string $foo points to. As a result, the $foo variable in the parent scope points to the same string as $foo in the invocation-local scope, which causes the use after free bug when the cloned $foo variable is freed from memory. src/cmd/ksh93/sh/nvdisc.c: - To fix the use after free bug, allow nv_clone to duplicate the string with memdup or strdup when no flags are passed. src/cmd/ksh93/tests/variables.sh: - Add a regression test for using the += operator with regular commands. src/cmd/ksh93/tests/leaks.sh: - Add a regression test to ensure the bugfix doesn't introduce any memory leaks.	2022-03-11 21:08:57 +01:00
Martijn Dekker	b09ce2fa02	Fix crash when suspending a blocked write to a FIFO Reproducer (symptoms on at least macOS and FreeBSD): $ mkfifo f $ echo foo > f (press Ctrl+Z) ^Zksh: f: cannot create [Interrupted system call] Abort The shell either aborts (dev builds) or crashes with 'Illegal instruction' (release builds). This is consistent with UNREACHABLE() being reached. Backtrace: 0 libsystem_kernel.dylib __kill + 10 1 ksh sh_done + 836 (fault.c:678) 2 ksh sh_fault + 1324 3 libsystem_platform.dylib _sigtramp + 29 4 dyld ImageLoaderMachOCompressed::resolve(ImageLoader::LinkContext const&, char const, unsigned ch 5 libsystem_c.dylib abort + 127 6 ksh sh_redirect + 3576 (io.c:1356) 7 ksh sh_exec + 7231 (xec.c:1308) 8 ksh exfile + 3247 (main.c:607) 9 ksh sh_main + 3551 (main.c:368) 10 ksh main + 38 (pmain.c:45) 11 libdyld.dylib start + 1 This means that UNREACHABLE() is actually reached here: ksh/src/cmd/ksh93/sh/io.c 1351: if((fd=sh_open(tname?tname:fname,o_mode,RW_ALL)) <0) 1352: { 1353: errormsg(SH_DICT,ERROR_system(1),((o_mode&O_CREAT)?e_create:e_open),fname); 1354: UNREACHABLE(); 1355: } The cause is that, in the following section of code in sh_fault(): ksh/src/cmd/ksh93/sh/fault.c 183: #ifdef SIGTSTP 184: if(sig==SIGTSTP) 185: { 186: sh.trapnote |= SH_SIGTSTP; 187: if(pp->mode==SH_JMPCMD && sh_isstate(SH_STOPOK)) 188: { 189: sigrelease(sig); 190: sh_exit(SH_EXITSIG); 191: return; 192: } 193: } 194: #endif /* SIGTSTP */ ...sh_exit() is not getting called and the function will not return because the SH_STOPOK bit is not set while the shell is blocked waiting to write to a FIFO. Even if sh_exit() did get called, that would not fix it, because that function also checks for the SH_STOPOK bit and returns without doing a longjmp if the signal is SIGTSTP and the SH_STOPOK bit is not set. That is the direct reason why UNREACHABLE() was reached: errormsg() does call sh_exit() but sh_exit() then does not longjmp. 
src/cmd/ksh93/sh/fault.c: sh_fault(): - To avoid the crash, we simply need to return from sh_fault() if SH_STOPOK is off, so that the code path does not continue, no error message is given on Ctrl+Z, UNREACHABLE() is not reached, and the shell resumes waiting on trying to write to the FIFO. The sh.trapnote flag should not be set if we're not going to process the signal. This makes ksh behave like all other shells. Resolves: https://github.com/ksh93/ksh/issues/464	2022-02-17 20:21:23 +00:00
Martijn Dekker	11177d448d	Fix crash on cd in subshell with PWD unset (re: `5ee290c`) Reproducer: $ ksh -c 'unset PWD; (cd /); :' Memory fault The shell crashes because b_cd() is testing the value of the PWD variable without checking if there is one. src/cmd/ksh93/sh/path.c: path_pwd(): - Never return an unfreeable pointer to e_dot; always return a freeable pointer. This fixes another corner-case crashing bug. - Make sure the PWD variable gets assigned a value if it doesn't have one, even if it's the "." fallback. However, if the PWD is inaccessible but we did inherit a $PWD value that starts with a /, then use the existing $PWD value as this will help the shell fail gracefully. src/cmd/ksh93/bltins/cd_pwd.c: - b_cd(): When checking if the PWD is valid, use the sh.pwd copy instead of the PWD variable. This fixes the crash above. - b_cd(): Since path_pwd() now always returns a freeable value, free sh.pwd unconditionally before setting the new value. - b_pwd(): Not only check that path_pwd() returns a value starting with a slash, but also verify it with test_inode() and error out if it's wrong. This makes the 'pwd' command useful for checking that the PWD is currently accessible. src/cmd/ksh93/data/msg.c: - Change e_pwd error message for accuracy and clarity.	2022-02-17 19:45:37 +00:00
Martijn Dekker	d55e9686d7	Backport 'read -a' and 'read -p' from ksh 93v-/2020 This backports two minor additions to the 'read' built-in from ksh 93v-: '-a' is now the same as '-A' and '-u p' is the same as '-p'. This is for compatibility with some 93v- or ksh2020 scripts. Note that their change to the '-p' option to support both prompts and reading from the coprocess was not backported because we found it to be broken and unfixable. Discussion at: https://github.com/ksh93/ksh/issues/463 src/cmd/ksh93/bltins/read.c: b_read(): - Backport as described above. - Rename the misleadingly named 'name' variable to 'prompt'. It points to the prompt string, not to a variable name. src/cmd/ksh93/data/builtins.c: sh_optpwd[]: - Add -a as an alternative to -A. All that is needed is adding '|a' and optget(3) will automatically convert it to 'A'. - Change -u from a '#' (numeric) to ':' option to support 'p'. Note that b_read() now needs a corresponding strtol() to convert file descriptor strings to numbers where applicable. - Tweaks. src/cmd/ksh93/sh.1: - Update accordingly. - Tidy up the unreadable mess that was the 'read' documentation. The options are now shown in a list.	2022-02-17 19:44:54 +00:00
Martijn Dekker	95d695cb5a	Improve and document fast filescan loops (SHOPT_FILESCAN) From README: FILESCAN on Experimental option that allows fast reading of files using while < file;do ...; done and allowing fields in each line to be accessed as positional parameters. As SHOPT_FILESCAN has been enabled by default since ksh 93l 2001-06-01, the filescan loop is now documented in the manual page and the compile-time option is no longer considered experimental. We must disable this at runtime if --posix is active because it breaks a portable use case: POSIXly, 'while <file; do stuff; done' repeatedly executes 'stuff' while 'file' can successfully be opened for reading, without actually reading from 'file'. This also backports a bugfix from the 93v- beta. Reproducer: $ echo 'one two three' >foo $ while <foo; do printf '[%s] ' "$@"; echo; done [one two three] Expected output: [one] [two] [three] The bug is that "$@" acts like "$*", joining all the positional parameters into one word though it should be generating one word for each. src/cmd/ksh93/sh/macro.c: varsub(): - Backport fix for the bug described above. I do not understand the opaque macro.c code well enough yet to usefully describe the fix. src/cmd/ksh93/sh/xec.c: sh_exec(): - Improved sanity check for filescan loop: do not recognise it if the simple command includes variable assignments, more than one redirection, or an output or append redirection. - Disable filescan loops if --posix is active. - Another 93v- fix: handle interrupts (errno==EINTR) when closing the input file.	2022-02-17 19:43:36 +00:00
Martijn Dekker	4886463bb6	Disable broken KEYBD trap for multibyte characters In UTF-8 locales, ksh breaks when a KEYBD trap is active, even a dummy no-op one like 'trap : KEYBD'. Entering multi-byte characters fails (the input is interrupted and a new prompt is displayed) and pasting content with multi-byte characters produces corrupted results. The cause is that the KEYBD trap code is not multibyte-ready. Unfortunately nobody yet understands the edit.c code well enough to implement a proper fix. Pending that, this commit implements a workaround that at least avoids breaking the shell. src/cmd/ksh93/edit/edit.c: ed_getchar(): - When a multi-byte locale is active, do not trigger the KEYBD trap except for ASCII characters (1-127). Resolves: https://github.com/ksh93/ksh/issues/307	2022-02-17 19:39:42 +00:00
Martijn Dekker	56c0c24b55	Do not disable completion along with pathname expansion The -f/--noglob shell option is documented simply as: "Disables pathname expansion." But after 'set -f' on an interactive shell, command completion and file name completion also stop working. This is because they internally use the pathname expansion mechanism. But it is not documented anywhere that 'set -f' disables completion; it's just a side effect of an implementation detail. Though ksh has always acted like this, I think it should change because it's not useful or expected behaviour. Other shells like bash, yash or zsh don't act like this. src/cmd/ksh93/sh/expand.c, src/cmd/ksh93/sh/macro.c: - Allow the SH_COMPLETE (command completion) or SH_FCOMPLETE (file name completion) state bit to override SH_NOGLOB in path_generate() and in sh_macexpand().	2022-02-17 19:38:15 +00:00
Martijn Dekker	6304dfce41	Fix corner-case >&- redirection leak out of subshell Reproducer: exec 9>&1 ( { exec 9>&1; } 9>&- ) echo "test" >&9 # => 9: cannot open [Bad file descriptor] The 9>&- incorrectly persists beyond the { } block that it was attached to and beyond the ( ) subshell. This is yet another bug with non-forking subshells; forking it with something like 'ulimit -t unlimited' works around the bug. In over a year we have not been able to find a real fix, but I came up with a workaround that forks a virtual subshell whenever it executes a code block with a >&- or <&- redirection attached. That use case is obscure enough that it should not cause any performance regression except in very rare corner cases. src/cmd/ksh93/sh/xec.c: sh_exec(): TSETIO: - This is where redirections attached to code blocks are handled. Check for a >&- or <&- redirection using bit flaggery from shnodes.h and fork if we're executing such in a virtual subshell. Resolves: https://github.com/ksh93/ksh/issues/161 Thanks to @ko1nksm for the bug report.	2022-02-17 19:35:47 +00:00
Johnothan King	f38494ea1d	Fix multiple bugs in .sh.match (#455 ) This commit backports all of the relevant .sh.match bugfixes from ksh93v-. Most of the .sh.match rewrite is from versions 2012-08-24 and 2012-10-04, with patches from later releases of 93v- and ksh2020 also applied. Note that there are still some remaining bugs in .sh.match, although now the total count of .sh.match bugs should be less than before. These are the relevant changes in the ksh93v- changelog that were backported: 12-08-07 .sh.match no longer gets set for patterns in PS4 during set -x. 12-08-10 Rewrote .sh.match expansions fixing several bugs and improving performance. 12-08-22 .sh.match now handles subpatterns that had no matches with ${var//pattern} correctly. 12-08-21 A bug in setting .sh.match after ${var//pattern/string} when string is empty has been fixed. 12-08-21 A bug in setting .sh.match after [[ string == pattern ]] has been fixed. 12-08-31 A bug that could cause a core dump after typeset -m var=.sh.match has been fixed. 12-09-10 Fixed a bug in typeset -m the .sh.match is being renamed. 12-09-07 Fixed a bug in .sh.match code that coud cause the shell to quitely 13-02-21 The 12-01-16 bug fix prevented .sh.match from being used in the replacement string. The previous code was restored and a different fix which prevented .sh.match from being computed for nested replacement has been used instead. 13-05-28 Fixed two bug for typeset -c and typeset -m for variable .sh.match. Changes: - The SHOPT_2DMATCH option has been removed. This was already the default behavior previously, and now it's documented in the man page. - init.c: Backported the sh_setmatch() rewrite from 93v- 2012-08-24 and 2012-10-04. - Backported the libast 93v- strngrpmatch() function, as the .sh.match rewrite requires this API. - Backported the sh_match regression tests from ksh93v-, with many other sh_match tests backported from ksh2020. 
Much of the sh_match script is based on code from Roland Mainz: https://marc.info/?l=ast-developers&m=134606574109162&w=2 https://marc.info/?l=ast-developers&m=134490505607093 - tests/{substring,treemove}.sh: Backported other relevant .sh.match fixes, with tests added to the substring and treemove test scripts. - tests/types.sh: One of the (now reverted) memory leak bugfixes introduced a CI test failure in this script, so for that test the error message has been improved. - string/strmatch.c: The original ksh93v- code for the strngrpmatch() changes introduced a crash that could occur because strlen would be used on a null pointer. This has been fixed by avoiding strlen if the string is null. One nice side effect of these changes is a considerable performance improvement in the shbench[1] gsub benchmark (results from 20 iterations with CCFLAGS=-Os): -------------------------------------------------- name /tmp/ksh-current /tmp/ksh-matchfixes -------------------------------------------------- gsub.ksh 0.883 [0.822-0.959] 0.457 [0.442-0.505] -------------------------------------------------- Despite all of the many fixes and improvements in the backported 93v- .sh.match code, there are a few remaining bugs: - .sh.match is printed with a default [0] subscript (see also https://github.com/ksh93/ksh/issues/308#issuecomment-1025016088): $ arch/*/bin/ksh -c 'echo ${!.sh.match}' .sh.match[0] This bug appears to have been introduced by the changes from ksh93v- 2012-08-24. - The wrong variable name is given for 'parameter not set' errors (from https://marc.info/?l=ast-developers&m=134489094602596): $ arch/*/bin/ksh -u $ x=1234 $ true "${x//~(X)([012])|([345])/}" $ compound co $ typeset -m co.array=.sh.match $ printf "%q\n" "${co.array[2][0]}" arch/linux.i386-64/bin/ksh: co.array[2][(null)]: parameter not set - .sh.match leaks out of subshells. 
Further information and a reproducer can be found here: https://marc.info/?l=ast-developers&m=136292897330187 [1]: https://github.com/ksh-community/shbench	2022-02-10 21:04:23 +00:00
Martijn Dekker	232b7bff30	Fix multiple bugs in executing scripts without a #! path When executing a script without a hashbang path like #!/bin/ksh, ksh forks itself, longjmps back to sh_main(), and then (among other things) calls sh_reinit() which is the function that tries to reinitialise as much of the shell as it can. This is its way of ensuring the child script is run in ksh and not some other shell. However, this approach is incredibly buggy. Among other things, changes in built-in commands and custom type definitions survived the reinitialisation, "exporting" variables didn't work properly, and the hash table and ${.sh.stats} weren't reset. As a result, depending on what the invoking script did, the invoked script could easily fail or malfunction. It is not actually possible to reinitialise the shell correctly, because some of the shell state is in locally scoped static variables that cannot simply be reinitialised. There are probably huge memory leaks with this approach as well. At some point, all this is going to need a total redesign. Clearly, the only reliable way involves execve(2) and a start from scratch. For now though, this seems to fix the known bugs at least. I'm sure there are more to be discovered. This commit makes another change: instead of the -h/trackall option (which has been a no-op for decades), the posix option is now inherited by the child script. Since there is no hashbang path from which to decide whether the shell should run in POSIX mode or not, the best guess is probably the invoking script's setting. src/cmd/ksh93/sh/init.c: sh_reinit(): - Keep the SH_INIT state on during the entire procedure. - Delete remaining non-exported, non-default variables. - Remove attributes from exported variables. In POSIX mode, remove all attributes; otherwise, only remove readonly. - Unset discipline function pointers for variables. - Delete all custom types. - Delete all functions and built-ins, then reinitialise the built-ins table from scratch. 
- Free the alias values before clearing the alias table. - Same with hash table entries (tracked aliases). - Reset statistics. - Inherit SH_POSIX instead of SH_TRACKALL. - Call user init function last, not somewhere in the middle. src/cmd/ksh93/sh/name.c: sh_envnolocal(): - Be sure to preserve the export attribute of kept variables. Resolves: https://github.com/ksh93/ksh/issues/350	2022-02-10 21:03:43 +00:00
Johnothan King	7d4c7d9156	Fix 'typeset -p' output of compound array types (#453 ) This bugfix was backported from ksh93v- 2012-10-04. The bug fixed by this change is one that causes 'typeset -p' to omit the -C flag when listing compound arrays belonging to a type: $ typeset -T Foo_t=(compound -a bar) $ Foo_t baz $ typeset -p baz.bar typeset -a baz.bar='' # This should be 'typeset -C -a' src/cmd/ksh93/sh/nvtype.c: - Backport change from 93v- 2012-10-04 that sets the array nvalue to a pointer named Null (which is "") in nv_mktype(), then to Empty in fixnode(). - Change the Null name from the 93v- code to AltEmpty to avoid misleading code readers into thinking that it's a null pointer. src/cmd/ksh93/tests/types.sh: - Backport the relevant 93v- changes to the types regression tests. Co-authored-by: Martijn Dekker <martijn@inlv.org>	2022-02-10 21:03:24 +00:00
Johnothan King	787058bdbf	Fix the output of `typeset -p` for two dimensional indexed arrays (#454 ) In ksh93v- 2012-10-04 the following bugfix is noted in the changelog (this fix was most likely part of ksh93v- 2012-09-27, although that version is not archived anywhere): 12-09-21 A bug in which the output of a two dimensional sparse indexed array would cause the second subscript be treated as an associative array when read back in has been fixed. Elements that are sparse indexed arrays now are prefixed type "typeset -a". Below is a before and after of this change: # Before $ typeset -a foo[1][2]=bar $ typeset -p foo typeset -a foo=([1]=([2]=bar) ) # After $ typeset -a foo[1][2]=bar $ typeset -p foo typeset -a foo=(typeset -a [1]=([2]=bar) ) src/cmd/ksh93/sh/*.c: - Backport changes from ksh93v- to print 'typeset -a' before sparse indexed arrays and properly handle 'typeset -a' in reinput commands from 'typeset -p'. src/cmd/ksh93/tests: - Add two regression tests to arrays.sh for this change. - Update the existing regression tests for compatibility with the new printed typeset output.	2022-02-10 21:01:40 +00:00
Martijn Dekker	e6d0187dd8	Don't allow 'enum' and 'typeset -T' to override special built-ins Special builtins are undeleteable for a reason. But 'enum' and 'typeset -T' allow overriding them, causing an inconsistent state. @JohnoKing writes: | The behavior is rather buggy, as it appears to successfully | override normal builtins but fails to delete the special | builtins, leading to scenarios where both the original builtin | and type are run: | | $ typeset -T eval=(typeset BAD; typeset TYPE) # This should have failed | $ eval foo=BAD | /usr/bin/ksh: eval: line 1: foo: not found | $ enum trap=(BAD TYPE) # This also should have failed | $ trap foo=BAD | /usr/bin/ksh: trap: condition(s) required | $ enum umask=(BAD TYPE) | $ umask foo=BAD | $ echo $foo | BAD | | # Examples of general bugginess | $ trap bar=TYPE | /usr/bin/ksh: trap: condition(s) required | $ echo $bar | TYPE | $ eval var=TYPE | /usr/bin/ksh: eval: line 1: var: not found | $ echo $var | TYPE This commit fixes the following: The 'enum' and 'typeset -T' commands are no longer allowed to override and replace special built-in commands, except for type definition commands previously created by these commands; these are already (dis)allowed elsewhere. A command like 'typeset -T foo_t' without any assignments no longer creates an incompletely defined 'foo_t' built-in command. Instead, it is now silently ignored for backwards compatibility. This did have a regression test checking for it, but I'm changing it because that's just not a valid use case. An incomplete type definition command does nothing useful and only crashes the shell when run. src/cmd/ksh93/bltins/enum.c: b_enum(): - Do not allow overriding non-type special built-ins. src/cmd/ksh93/sh/name.c: nv_setlist(): - Do not allow 'typeset -T' to override non-type special built-ins. To avoid an inconsistent state, this must be checked for while processing the assignments list before typeset is really invoked. 
src/cmd/ksh93/bltins/typeset.c: b_typeset(): - Only create a type command if sh.envlist is set, i.e., if some shell assignment(s) were passed to the 'typeset -T' command. Progresses: https://github.com/ksh93/ksh/issues/350	2022-02-10 21:01:00 +00:00
Martijn Dekker	65aff0befb	Fix conditional expansions ${array[i]=value}, ${array[i]?error} $ unset foo $ echo ${foo[42]=bar} (empty line) Instead of the empty line, 'bar' was expected. As foo[42] was unset, the conditional assignment should have worked. $ unset foo $ : ${foo[42]?error: unset} (no output) The expansion should have thrown an error with the given message. This bug was introduced in ksh 93t 2008-10-01. Thanks to @JohnoKing for finding the breaking change. Analysis: The problem was experimentally determined to be in the following lines of nv_putsub(). If the array member is unset (i.e. null), the value is set to the empty string instead: src/cmd/ksh93/sh/array.c 1250: else 1251: ap->val[size].cp = Empty; It makes some sense: if there is a value (even an empty one), the variable is set and these expansions should behave accordingly. Sure enough, deleting these lines fixes the bug, but at the expense of introducing a lot of other array-related regressions. So we need a way to special-case the affected expansions. Where to do this? If we replace line 1251 with an abort(3) call, we get this stack trace: 0 libsystem_kernel.dylib __pthread_kill + 10 1 libsystem_pthread.dylib pthread_kill + 284 2 libsystem_c.dylib abort + 127 3 ksh nv_putsub + 1411 (array.c:1255) 4 ksh nv_endsubscript + 940 (array.c:1547) 5 ksh nv_create + 4732 (name.c:1066) 6 ksh nv_open + 1951 (name.c:1425) 7 ksh varsub + 4934 (macro.c:1322) [rest omitted] The special-casing needs to be done on line 1250 of array.c, but flagged in varsub() which processes these expansions. So, varsub() calls nv_open() calls nv_create() calls nv_endsubscript() calls nv_putsub(). That's a fairly deep call stack, so passing an extra flag argument does not seem doable. 
I did try an approach using a couple of new bit flags passed via these functions' flags and mode parameters, but the way this code base uses bit flags is so intricate, it does not seem to be possible to add or change anything without unwanted side effects in all sorts of places. So the only fix I can think of adds yet another global flag variable for a very special case. It's ugly, but it works. An elegant fix would probably involve a fairly comprehensive redesign, which is simply not going to happen. src/cmd/ksh93/include/shell.h: - Add global sh.cond_expan flag. src/cmd/ksh93/sh/array.c: nv_putsub(): - Do not set value to empty string if sh.cond_expan is set. src/cmd/ksh93/sh/macro.c: varsub(): - Set sh.cond_expan flag while calling nv_open() for one of the affected expansions. - Minor refactoring for legibility and to make the fix fit better. - SSOT: Instead of repeating string "REPLY", use the node's nvname. - Do not pointlessly add an extra 0 byte when saving id for error message; sfstruse() already adds this. Thanks to @oguz-ismail for the bug report. Resolves: https://github.com/ksh93/ksh/issues/383	2022-02-05 23:39:16 +00:00
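The global-flag approach described in this commit can be sketched roughly as follows. This is a simplified illustration, not the ksh93 code: `cond_expan` is modelled on `sh.cond_expan`, while `put_subscript()`, `open_element()` and `expand()` are hypothetical stand-ins for `nv_putsub()`, the intermediate `nv_open()` call chain and `varsub()`. An unset array element is represented by a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Shared empty string, playing the role of ksh's Empty. */
const char Empty[] = "";

/* Hypothetical global flag, modelled on sh.cond_expan. */
int cond_expan = 0;

/* Stand-in for nv_putsub(): an unset (NULL) element normally becomes the
 * empty string, except while a conditional expansion is being processed. */
const char *put_subscript(const char *slot)
{
    if (slot == NULL && !cond_expan)
        return Empty;
    return slot;
}

/* Stand-in for the deep call chain between varsub() and nv_putsub().
 * Note that no extra flag argument is threaded through here. */
const char *open_element(const char *slot)
{
    return put_subscript(slot);
}

/* Stand-in for varsub(): op is '=' or '?' for the affected expansions,
 * or 0 for a plain ${array[i]} expansion. */
const char *expand(const char *slot, int op, const char *word)
{
    assert(word != NULL);
    int saved = cond_expan;

    /* Raise the flag only around the lookup, then restore it. */
    cond_expan = (op == '=' || op == '?');
    const char *val = open_element(slot);
    cond_expan = saved;

    if (val == NULL && op == '=')
        return word;    /* conditional assignment takes effect */
    return val;         /* NULL here means "still unset" (error for '?') */
}
```

With the flag raised, the unset element stays unset, so `expand(NULL, '=', "bar")` yields the default word, whereas a plain expansion still sees the empty string. Saving and restoring the previous flag value keeps the special case confined to the one lookup that needs it.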
Martijn Dekker	493a31053e	Do not export variables with dot names (re: `8e72608c`) Variables with a dot in their name, such as those declared in namespace { ... } blocks, are usually stored in a separate tree with their actual names not containing any dots. But under some circumstances, including at least direct assignment of a non-preexisting dot variable, dot variables are stored in the main sh.var_tree with names actually containing dots. With allexport active, those could end up exported to the environment. This bug was also present in previous release versions of ksh. src/cmd/ksh93/sh/name.c: pushnam(): - Check for a dot in the name before pushing a variable to export.	2022-02-05 15:08:50 +00:00
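The check described for pushnam() amounts to looking for a dot in the variable name before exporting it. A minimal sketch, assuming a hypothetical helper `name_is_exportable()` that is not part of the ksh93 sources:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: report whether a variable name may be pushed to
 * the environment. Any name containing a dot is rejected. */
bool name_is_exportable(const char *name)
{
    assert(name != NULL);
    return strchr(name, '.') == NULL;
}
```

A caller building the environment list would skip every variable for which this returns false, so names such as `.sh.version` or `ns.var` never reach the exported environment even when allexport is active.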
Johnothan King	a8dd1bbd9d	typeset -p: fix output of nonexistent [0]= array element (#451) This fix was backported from ksh 93v- 2012-10-04. src/cmd/ksh93/sh/nvtree.c: nv_outnode(): - If the array is supposed to be empty, do not continue. This avoids outputting a nonexistent [0]= element for empty arrays. Resolves: https://github.com/ksh93/ksh/issues/420 Co-authored-by: Martijn Dekker <martijn@inlv.org>	2022-02-05 13:53:51 +00:00
Johnothan King	fb696ecfae	trap: fix use after free (#446) This commit adds a fix for the trap command, backported from a fork of ksh2020: `2033375f` src/cmd/ksh93/sh/jobs.c: job_chldtrap(): - Fixed a use after free bug in the for loop. The string pointed to by sh.st.trapcom[SIGCHLD] may be freed from memory after sh_trap(), so it must be reobtained each time sh_trap() is called from within the for loop.	2022-02-05 13:53:11 +00:00
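The pattern behind this fix can be illustrated with a small sketch. It is not the job_chldtrap() code: `trap_action` is a hypothetical stand-in for the `sh.st.trapcom[SIGCHLD]` slot, and `run_trap()` is a hypothetical stand-in for `sh_trap()` that reinstalls the action and thereby frees the string it was given.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a slot such as sh.st.trapcom[SIGCHLD]. */
char *trap_action = NULL;

/* Install a new action string, freeing the previous one. The copy is
 * made before the free, so text may alias the current trap_action. */
void set_trap_action(const char *text)
{
    assert(text != NULL);
    size_t len = strlen(text) + 1;
    char *copy = malloc(len);
    if (copy == NULL)
        abort();
    memcpy(copy, text, len);
    free(trap_action);
    trap_action = copy;
}

/* Hypothetical stand-in for sh_trap(): running the action reinstalls it,
 * so the string passed in has been freed by the time this returns. */
void run_trap(const char *action)
{
    set_trap_action(action);
}

/* Run the action up to count times. The slot is re-read on every pass;
 * caching the pointer before the loop would leave it dangling after the
 * first run_trap() call. */
int run_pending_traps(int count)
{
    int done = 0;
    for (int i = 0; i < count; i++) {
        const char *action = trap_action;   /* reobtained each iteration */
        if (action == NULL)
            break;
        run_trap(action);
        done++;
    }
    return done;
}
```

Hoisting `const char *action = trap_action;` above the loop would reproduce the class of bug described in the commit: the second iteration would pass a freed pointer to `run_trap()`.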

370 commits