external/cde - Personal Git space

mirror of git://git.code.sf.net/p/cdesktopenv/code synced 2025-03-09 15:50:02 +00:00

Author	SHA1	Message	Date
Johnothan King	ca5803419b	Fix various typos, man page issues and improve the documentation (#415) This commit makes various different improvements to the documentation: - sh.1: Backported (with changes) mandoc warning fixes from ksh2020 for the ksh93(1) man page: <https://github.com/att/ast/pull/1406> - Removed unnecessary spaces at the end of lines to fix a few other mandoc warnings. - Fixed various typos and capitalization errors in the documentation. - ANNOUNCE: Document the addition of the ${.sh.pid} variable (re: `9de65210`). - libast/man/str: Update the man pages for the libast str functions to improve how accurately each function is described. - ksh93/README: Update regression test/compatibility notes to include OpenBSD 7.0, FreeBSD 13.0 and WSL running Ubuntu 20.04. - Change a few places to store the return value from strlen in a size_t variable rather than signed int. - comp/setlocale.c: To avoid confusion of two separate variables named lang, the function local variable has been renamed to langidx.	2022-01-07 16:17:55 +00:00
Martijn Dekker	b590a9f155	[shp cleanup 01..20] all the rest (re: `2d3ec8b6`) This combines 20 cleanup commits from the dev branch. All changed files: - Clean up pointer dereferences to sh. - Remove shp arguments from functions. Other notable changes: src/cmd/ksh93/include/shell.h, src/cmd/ksh93/sh/init.c: - On second thought, get rid of the function version of sh_getinterp() as libshell ABI compatibility is moot. We've already been breaking that by reordering the sh struct, so there is no way it's going to work without recompiling. src/cmd/ksh93/sh/name.c: - De-obfuscate the relationship between nv_scan() and scanfilter(). The former just calls the latter as a static function, there's no need to do that via a function pointer and void* type conversions. src/cmd/ksh93/bltins/typeset.c, src/cmd/ksh93/sh/name.c, src/cmd/ksh93/sh/nvdisc.c: - 'struct adata' and 'struct tdata', defined as local struct types in these files, need to have their first three fields in common, the first being a pointer to sh. This is because scanfilter() in name.c accesses these fields via a type conversion. So the sh field needed to be removed in all three at the same time. TODO: de-obfuscate: good practice definition via a header file. src/cmd/ksh93/sh/path.c: - Naming consistency: reserve the path_ function name prefix for externs and rename statics with that prefix. - The default path was sometimes referred to as the standard path. To use one term, rename std_path to defpath and onstdpath() to ondefpath(). - De-obfuscate SHOPT_PFSH conditional code by only calling pf_execve() (was path_pfexecve()) if that is compiled in. src/cmd/ksh93/include/streval.h, src/cmd/ksh93/sh/streval.c: - Rename extern strval() to arith_strval() for consistency. src/cmd/ksh93/sh/string.c: - Remove outdated/incorrect isxdigit() fallback; '#ifndef isxdigit' is not a correct test as isxdigit() is specified as a function. Plus, it's part of C89/C90 which we now require. 
(re: `ac8991e5`) src/cmd/ksh93/sh/suid_exec.c: - Replace an incorrect reference to shgd->current_pid with getpid(); it cannot work as (contrary to its misleading directory placement) suid_exec is an independent libast program with no link to ksh or libshell at all. However, no one noticed because this was in fallback code for ancient systems without setreuid(2). Since that standard function was specified in POSIX Issue 4 Version 2 from 1994, we should remove that fallback code sometime as part of another obsolete code cleanup operation to avoid further bit rot. (re: `843b546c`) src/cmd/ksh93/bltins/print.c: genformat(): - Remove preformat[] which was always empty and had no effect. src/cmd/ksh93/shell.3: - Minor copy-edit. - Remove documentation for nonexistent sh.infile_name. A search through ast-open-archive[*] reveals this never existed at all. - Document sh.savexit (== $?). src/cmd/ksh93/shell.3, src/cmd/ksh93/include/shell.h, src/cmd/ksh93/sh/init.c: - Remove sh.gd/shgd; this is now unused and was never documented or exposed in the shell.h public interface. - sh_sigcheck() was documented in shell.3 as taking no arguments whereas in the actual code it took a shp argument. I decided to go with the documentation. - That leaves sh_parse() as the only documented function that still takes an shp argument. I'm just going to go ahead and remove it for consistency, reverting sh_parse() to its pre-2003 spec. - Remove undocumented/unused sh_bltin_tree() function which simply returned sh.bltin_tree. - Bump SH_VERSION to 20220106.	2022-01-07 16:16:31 +00:00
Martijn Dekker	57ed1efc2c	Actually deactivate CDPATH when unsetting it After 'unset CDPATH', CDPATH continued to work as if nothing happened. Unsetting it should be a valid way to deactivate it. This bug is in every ksh93 version. src/cmd/ksh93/bltins/cd_pwd.c: b_cd(): - Fix a manifest logic error: first check if CDPATH (CDPNOD) is unset before assigning to 'cdpath', not the other way around. Setting the 'cdpath' pointer is what activates the CDPATH search.	2021-12-29 01:48:55 +00:00
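The behaviour restored by `57ed1efc2c` can be checked from the command line. The following is a minimal sketch, not part of the commit: the directory layout and the variable names `with_cdpath` and `after_unset` are made up for illustration, and `mktemp -d` is assumed to be available.

```sh
# Hypothetical layout: $tmp/base/proj is reachable only through CDPATH.
tmp=$(mktemp -d)
mkdir -p "$tmp/base/proj" "$tmp/work"

cd "$tmp/work" || exit 1
CDPATH=$tmp/base
# With CDPATH set, 'cd proj' is resolved via $tmp/base.
if cd proj >/dev/null 2>&1; then with_cdpath=found; else with_cdpath=missing; fi

cd "$tmp/work" || exit 1
unset CDPATH
# After 'unset CDPATH' the search must be off, so 'proj' is no longer found.
if cd proj >/dev/null 2>&1; then after_unset=found; else after_unset=missing; fi

cd / && rm -rf "$tmp"
echo "with CDPATH: $with_cdpath; after unset: $after_unset"
```

A shell affected by the bug would report `found` in both cases, because the CDPATH search stayed active after the variable was unset.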
Johnothan King	8f9d1bec97	Add three options to 'ulimit' (#406) This patch adds a few extra options to the ulimit command (if the OS supports them). These options are also present in Bash, although in ksh additional long forms of each option are available: ulimit -k/--kqueues This is the maximum number of kqueues. ulimit -P/--npts This is the maximum number of pseudo-terminals. ulimit -R/--rttime This is the time a real-time process can run before blocking, in microseconds. When the limit is exceeded, the process is sent SIGXCPU. Other changes: - bltins/ulimit.c: Change the formatting from sfprintf and increase the size of the tmp buffer to prevent text from being cut off in ulimit -a (this was required to add ulimit -R). - data/limits.c: Add support for using microseconds as a unit.	2021-12-28 22:02:20 +00:00
Johnothan King	0e197eee57	Fix `mkservice` compile errors and add `SHOPT_MKSERVICE` (#401) The unused mkservice and eloop builtins are currently not built, and if an attempt to compile them is made the build ends in failure. This commit backports a few build fixes from ksh93v- 2012-08-24 that allow mkservice and eloop to build (plus an additional compiler warning fix not in ksh93v-). I've also added a new SHOPT_MKSERVICE setting (turned off by default) so that mkservice and eloop can be built if the user chooses to include them in their build of ksh.	2021-12-28 17:51:11 +00:00
Johnothan King	24174f0fb7	Backport -P and -t flags for 'type'/'whence' from ksh93v- (#392) This commit backports the whence '-t' option from ksh93v-. The '-t' option is useful when one needs to identify the type of a command. The '-t' flag was added by ksh93v- for compatibility with Bash. It should be noted the ksh93v- patch had one bug, which this commit fixes. Path-bound builtins from /opt/ast/bin were classified as files if loaded from /opt/ast/bin in the PATH. Reproducer: $ PATH=/opt/ast/bin whence -t cat file src/cmd/ksh93/bltins/whence.c: - Simplify the bitmask values for the command and whence builtin flags. - Add the -t flag to the whence and type builtins. To prevent bugs, -t will always override -v if both of those flags were passed. src/cmd/ksh93/data/builtins.c, src/cmd/ksh93/sh.1: - Add documentation for the new -t option.	2021-12-27 06:40:02 +00:00
Johnothan King	f7213f03a2	Fix multiple bugs when using 'alias -p' to print aliases (#398) This commit was originally intended to fix just one bug with shcomp's handling of 'alias -p', but while fixing that I found a large number of related issues in the alias command's -p, -t and -x options. The current patch provides bugfixes for all of the bugs listed below: 1) Listing aliases in a script with 'alias -p' or 'alias' broke shcomp's bytecode output: https://github.com/ksh93/ksh/issues/87#issuecomment-813819122 2) Listing individual aliases with the -p option doesn't work: $ alias foo=bar bar=foo $ alias foo foo=bar $ alias -p foo # No output 3) Listing specific tracked aliases with -pt does not display them in a reusable format, but rather adds another tracked alias: $ hash -r cat vi $ alias -pt vi # No output $ alias -pt rm $ alias -t cat=/usr/bin/cat rm=/usr/bin/rm vi=/usr/bin/vi 4) Listing all tracked aliases with -pt does not output them in a reusable format (the resulting command printed only creates a normal alias, which is different from a tracked alias): $ hash -r cat $ alias -pt alias cat=/usr/bin/cat # Expected 'alias -t cat' 5) Listing a non-existent alias with -p doesn't cause an error: $ unalias -a $ alias -p notanalias # No output $ echo $? 0 $ alias notanalias notanalias: alias not found $ echo $? 1 $ hash -r $ alias -pt notacommand # No output $ echo $? 0 6) Attempting to list 256 non-existent aliases results in exit status zero: $ unalias -a $ alias $(awk -v ORS= 'BEGIN { for(i=0;i<256;i++) print "x "; }') x: alias not found --cut error message-- $ echo $? 0 Changes: - typeset.c: Avoid printing anything while shcomp is compiling a script. This is needed because the alias command is run by shcomp to prevent parsing issues. - b_alias(): To avoid adding tracked aliases with -pt, set tdata.aflag to '+' so that setall() and other related functions only list tracked aliases. 
- b_alias(): Set tdata.pflag to 1 so that setall() and other functions recognize -p was passed. - print_value(): Add support for listing specific aliases with 'alias -p'. - setall(): To avoid any issues with zombie tracked aliases (see also the regression tests) ignore tracked alias nodes marked with the NV_NOALIAS attribute. This bit is set for tracked alias nodes by the nv_rehash() function. - setall(): For backward compatibility, continue incrementing the exit status for each invalid alias and tracked alias passed. This was already how alias behaved when listing aliases without -p, so using -p shouldn't cause a change in behavior: $ unalias -a $ alias foo bar foo: alias not found bar: alias not found $ echo $? 2 To fix bug 6, the exit status is set to one if an enforced 8-bit exit status would be zero. - print_namval(): Set the prefix to 'alias -t' so that listing tracked aliases with 'alias -pt' works correctly. - data/msg.c and include/name.h: Add an error message for when 'alias -pt' doesn't find a tracked alias. - tests/alias.sh: Add a ton of regression tests for the bugs fixed in this commit.	2021-12-27 03:49:06 +00:00
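The exit-status rule described for setall() above (one count per missing alias, but never letting an 8-bit wrap-around turn failures into success) can be sketched in portable shell arithmetic. `alias_exit_status` is a hypothetical helper written for this illustration; it is not a function in ksh and only mirrors the rule as stated in the commit message.

```sh
# alias_exit_status N: status to report after N failed alias lookups.
# Hypothetical helper; follows the rule from the commit message only.
alias_exit_status() {
	failures=$1
	status=$(( failures & 255 ))   # what an enforced 8-bit exit status keeps
	# Bug 6: 256 failures would wrap to 0 and look like success; report 1.
	if [ "$failures" -gt 0 ] && [ "$status" -eq 0 ]; then
		status=1
	fi
	echo "$status"
}

alias_exit_status 2     # two missing aliases, as in 'alias foo bar'
alias_exit_status 256   # the case from bug 6
```

The first call prints 2, matching the backward-compatible counting shown above; the second prints 1 instead of wrapping to 0.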
Martijn Dekker	a1f5c99204	INIT: remove proto, ratz (re: `46593a89`, `6137b99a`); major cleanup This takes another step towards cleaning up the build system. We now do not even pretend to be theoretically compatible with pre-1989 K&R C compilers or with C++ compilers. In practice, this had already been broken for many years due to bit rot. Commit `46593a89` already removed the license handling enormity that depended on proto, so now we can cleanly remove it altogether. But we do need to leave some backwards compatibility stubs to keep the build system compatible with older AST code; it should remain possible to build older ksh versions with the current build system (the bin/ and src/cmd/INIT/ directories) for testing purposes. So as of now there is no more __MANGLE__d rubbish in your generated header files. This is only about a quarter of a century overdue... This commit also includes a huge amount of code cleanup to remove thousands of unused K&R C fallbacks and other cruft, particularly in libast. This code base should now be a little easier to understand for people who are familiar with a modern(ish) C standard. ratz is now also removed; this was a standalone and simplified 2005 version of gunzip. As of `6137b99a`, none of our code uses it, even theoretically. And the real g(un)zip is now everywhere. src/cmd/INIT/proto.c, src/cmd/INIT/ratz.c: - Removed. COPYRIGHT: - Remove zlib license; this only applied to ratz. bin/package, src/cmd/INIT/package.sh: - Related cleanups. - Unset LC_ALL before invoking a new shell, respecting the user's locale again and avoiding multibyte character corruption on the command line. src/cmd/INIT/proto.sh: - Add stub for backwards compatibility with Mamfiles that depend on proto. It does nothing but pass input without modification and is now installed as the new arch//bin/proto by src/cmd/INIT/Mamfile. 
src/cmd/INIT/iffe.sh: - Ignore the proto-related -e (--package) and -p (--prototyped) options; keep parsing them for backwards compatibility. - Trim the macros passed to every test to their standard C versions, removing K&R C and C++ versions. These are now considered to be for backwards compatibility only. src/cmd/INIT/iffe.tst: - Remove proto(1) mangling code. By the way, iffe can be regression-tested as follows: $ bin/package use # set up environment in a child shell $ regress src/cmd/INIT/iffe.tst $ exit # leave package environment src/cmd/INIT/make.probe, src/cmd/INIT/probe.win32: - Remove code to handle C++. src/lib/libast/features/common: - As in iffe.sh above, trim macros designed for compatibility with C++ and ancient C compilers to their standard C versions and comment that they are for backwards compatibility with AST code. This is needed to keep all the old ast and ksh code compiling. src/cmd/ksh93/sh/init.c, src/cmd/ksh93/sh/name.c: - Clarify libshell ABI compatibility function versions of macros. A "proto workaround" comment in the original code misled me into thinking this had something to do with the removed proto(1), but it's unrelated. Call the workaround macro BYPASS_MACRO instead. src/cmd/ksh93/include/defs.h: - sh_sigcheck() macro: allow &sh as an argument: parenthesise shp. src/cmd/ksh93/sh/nvtype.c: - Remove unused nv_mkstruct() function. (re: `d0a5cab1`) /features/: - Remove obsolete iffe 'set prototyped' option. /Mamfile: - Remove all references to the ast/prototyped.h header. - Remove all use of the proto command. Simply copy instead. * 850-ish source files: *** - Remove all '#pragma prototyped' directives. - Remove all C++ compat code conditional upon defined(__cplusplus). 
- Remove all use of the _ARG_ macro, which on standard C expands to its argument: #define _ARG_(x) x (on K&R C, it expanded to nothing) - Remove all use of _BEGIN_EXTERNS_ and _END_EXTERNS_ macros (empty on standard C; this was for C++ compatibility) - Reduce all #if __STD_C (standard code) #else (K&R code) #endif blocks to the standard code only, without use of the macro. - Same for _STD_ macro which seems to have had the same function. - Change all instances of 'Void_t' to standard 'void'.	2021-12-24 07:05:22 +00:00
Martijn Dekker	d95700c348	print.c: resolve whitespace diff with master (re: `fb8308243`) I ended up committing versions of the fix to the master and 1.0 branches that differed only in whitespace in a few lines (no code differences). This commit makes the whitespace identical so this does not keep annoying me when I look at 'git diff 1.0 master'.	2021-12-22 05:15:32 +00:00
Martijn Dekker	a381a1b049	Better fix for BUG_IFSISSET (re: `95294419`) With a better understanding of the code 1.5 years later, the special-casing for IFS introduced in that commit seems like a hack. The problem was not that the IFS node always exists but that it is always considered to have a 'get' discipline function. Variables with a 'get' discipline are considered set. This makes sense for all variables except IFS. The nv_isnull() macro is used to check if a variable is set. It calls nv_hasget() to determine if the variable has a 'get' discipline. So a better fix is for nv_hasget() always to return false for IFS. src/cmd/ksh93/bltins/test.c, src/cmd/ksh93/sh/macro.c: - Remove special-casing for IFS. src/cmd/ksh93/sh/nvdisc.c: nv_hasget(): - Always return false for IFS, taking local scope into account.	2021-12-21 06:29:30 +00:00
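At the script level, the effect of `a381a1b049` is that the usual set/unset test gives the right answer for IFS. A minimal sketch using only the standard `${var+word}` expansion; the variable names `saved_ifs`, `when_unset` and `when_set` are illustrative, and the snippet assumes IFS is set when it starts (the normal case).

```sh
saved_ifs=$IFS            # remember the current value so it can be restored

unset IFS
when_unset=${IFS+set}     # expands to nothing if IFS is really unset

IFS=' '
when_set=${IFS+set}       # expands to 'set' once IFS has a value

IFS=$saved_ifs
echo "unset: '${when_unset}' set: '${when_set}'"
```

A shell with BUG_IFSISSET would yield `set` in both variables, since it could not tell an unset IFS from a set one.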
Martijn Dekker	e67df29c07	Re-fix defining types conditionally or in subshells (re: `f508660d`) New version. I'm pretty sure the problems that forced me to revert it earlier are fixed. This commit mitigates the effects of the hack explained in the referenced commit so that dummy built-in command nodes added by the parser for declaration/assignment purposes do not leak out into the execution level, except in a relatively harmless corner case. Something like if false; then typeset -T Foo_t=(integer -i bar) fi will no longer leave a broken dummy Foo_t declaration command. The same applies to declaration commands created with enum. The corner case remaining is: $ ksh -c 'false && enum E_t=(a b c); E_t -a x=(b b a c)' ksh: E_t: not found Since the 'enum' command is not executed, this should have thrown a syntax error on the 'E_t -a' declaration: ksh: syntax error at line 1: `(' unexpected This is because the -c script is parsed entirely before being executed, so E_t is recognised as a declaration built-in at parse time. However, the 'not found' error shows that it was successfully eliminated at execution time, so the inconsistent state will no longer persist. This fix now allows another fix to be effective as well: since built-ins do not know about virtual subshells, fork a virtual subshell into a real subshell before adding any built-ins. src/cmd/ksh93/sh/parse.c: - Add a pair of functions, dcl_hacktivate() and dcl_dehacktivate(), that (de)activate an internal declaration built-ins tree into which check_typedef() can pre-add dummy type declaration command nodes. A viewpath from the main built-ins tree to this internal tree is added, unifying the two for search purposes and causing new nodes to be added to the internal tree. When parsing is done, we close that viewpath. This hides those pre-added nodes at execution time. Since the parser is sometimes called recursively (e.g. 
for command substitutions), keep track of this and only activate and deactivate at the first level. (Fixed compared to previous version of this commit: calling dcl_dehacktivate() when the recursion level is already zero is now a harmless no-op. Since this only occurs in error handling conditions, who cares.) - We also need to catch errors. This is done by setting libast's error_info.exit variable to a dcl_exit() function that tidies up and then passes control to the original (usually sh_exit()). (Fixed compared to previous version of this commit: dcl_exit() immediately deactivates the hack, no matter the recursion level, and restores the regular sh_exit(). This is the right thing to do when we're in the process of erroring out.) - sh_cmd(): This is the most central function in the parser. You'd think it was sh_parse(), but $(modern)-form command substitutions use sh_dolparen() instead. Both call sh_cmd(). So let's simply add a dcl_hacktivate() call at the beginning and a dcl_dehacktivate() call at the end. - assign(): This function calls path_search(), which among many other things executes an FPATH search, which may execute arbitrary code at parse time (!!!). So, regardless of recursion level, forcibly dehacktivate() to avoid those ugly parser side effects returning in that context. src/cmd/ksh93/bltins/enum.c: b_enum(): - Fork a virtual subshell before adding a built-in. src/cmd/ksh93/sh/xec.c: sh_exec(): - Fork a virtual subshell when detecting typeset's -T option. Improves fix to https://github.com/ksh93/ksh/issues/256	2021-12-17 01:28:28 +01:00
Johnothan King	c2ac69b2d5	Use dynamic maximum configuration values when necessary (#370) This commit fixes an issue with how ksh was obtaining the value of NGROUPS_MAX. On some systems this setting can be changed (e.g., on illumos adding 'set ngroups_max=32' to /etc/system then rebooting changes NGROUPS_MAX from 16 to 32). Ksh was using NGROUPS_MAX with the assumption it's a static value, which could cause issues on systems where it isn't static. This bugfix is inspired by the one from <`b1362c3a5`>, although it has been expanded a bit to account for OPEN_MAX as well. src/cmd/ksh93/sh/init.c, src/lib/libcmd/fds.c: - Rename the getconf() macro to astconf_long() and move it to ast.h to prevent redundancy. Other sections of the code have been modified to use this macro for astconf() to account for dynamic settings. - An equivalent macro for unsigned long values (astconf_ulong) has been added. - Prefer sysconf(3) where available. It has better performance as it returns a numeric value directly instead of via string conversion. - The astconf_long and astconf_ulong macros have been documented in the ast(3) man page.	2021-12-13 07:53:14 +01:00
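The point about run-time limits in `c2ac69b2d5` can be seen from the shell with the standard getconf utility, which asks the running system instead of relying on a compile-time constant. This is only an illustrative sketch and does not use the astconf_long macro from the commit; the variable names and the fallback values are arbitrary.

```sh
# Ask the running system for its limits; fall back to a placeholder if
# getconf cannot answer (the fallback values are arbitrary illustrations).
ngroups_max=$(getconf NGROUPS_MAX 2>/dev/null) || ngroups_max=16
open_max=$(getconf OPEN_MAX 2>/dev/null) || open_max=unknown

echo "NGROUPS_MAX=$ngroups_max OPEN_MAX=$open_max"
```

On a system where the administrator has raised the group limit, the first value changes without any program being recompiled, which is the situation the commit accounts for.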
Johnothan King	e54001d58b	Various minor capitalization and typo fixes (#371) This commit fixes various minor typos and punctuation errors and corrects the capitalization of many names.	2021-12-13 01:49:42 +01:00
Johnothan King	beccb93fd4	Fix various compiler warnings and minor issues (#362) List of changes: - Fixed some -Wuninitialized warnings and removed some unused variables. - Removed the unused extern for B_login (re: `d8eba9d1`). - The libcmd builtins and the vmalloc memfatal function now handle memory errors with 'ERROR_SYSTEM|ERROR_PANIC' for consistency with how ksh itself handles out of memory errors. - Added usage of UNREACHABLE() where it was missing from error handling. - Extend many variables from short to int to prevent overflows (most variables involve file descriptors). - Backported a ksh2020 patch to fix unused value Coverity issues (https://github.com/att/ast/pull/740). - Note in src/cmd/ksh93/README that ksh compiles with Cygwin on Windows 10 and Windows 11, albeit with many test failures. - Add comments to detail some sections of code. Extensive list of commits related to this change: `ca2443b5`, `7e7f1372`, `2db9953a`, `7003aba4`, `6f50ff64`, `b1a41311`, `222515bf`, `a0dcdeea`, `0aa9e03f`, `61437b27`, `352e68da`, `88e8fa67`, `bc8b36fa`, `6e515f1d`, `017d088c`, `035a4cb3`, `588a1ff7`, `6d63b57d`, `a2f13c19`, `794d1c86`, `ab98ec65`, `1026006d` - Removed a lot of dead ifdef code. - edit/emacs.c: Hide an assignment to avoid a -Wunused warning. (See also https://github.com/att/ast/pull/753, which removed the assignment because ksh2020 removed the !SHOPT_MULTIBYTE code.) - sh/nvdisc.c: The sh_newof macro cannot return a null pointer because it will instead cause the shell to exit if memory cannot be allocated. That makes the if statement here a no-op, so remove it. - sh/xec.c: Fixed one unused variable warning in sh_funscope(). - sh/xec.c: Remove a fallthrough comment added in commit `ed478ab7` because the TFORK code doesn't fall through (GCC also produces no -Wimplicit-fallthrough warning here). 
- data/builtins.c: The cd and pwd man pages state that these builtins default to -P if PATH_RESOLVE is 'physical', which isn't accurate: $ /opt/ast/bin/getconf PATH_RESOLVE physical $ mkdir /tmp/dir; ln -s /tmp/dir /tmp/sym $ cd /tmp/sym $ pwd /tmp/sym $ cd -P /tmp/sym $ pwd /tmp/dir The behavior described by these man pages isn't specified in the ksh man page or by POSIX, so to avoid changing these builtins' behavior the inaccurate PATH_RESOLVE information has been removed. - Mamfiles: Preserve multi-line errors by quoting the $x variable. This fix was backported from 93v-. (See also <`a7e9cc82`>.) - sh/subshell.c: Remove set but not used sp->errcontext variable.	2021-12-09 06:42:59 +01:00
Martijn Dekker	b3050769ea	Fix 'return' emitting signals; allow arbitrary return values When a global EXIT trap is set, and a ksh-style function exits with a status > 256 that could have been the result of a signal, then the shell incorrectly issues that signal to itself. Depending on the signal, this causes ksh to terminate itself ungracefully: $ cat /tmp/exit267 trap 'echo OK' EXIT # This trap triggers the crash function foo { return 267; } foo $ bash /tmp/exit267 OK $ ksh-3aee10d7 /tmp/exit267 OK $ ksh /tmp/exit267 Memory fault(coredump) On most systems, status 267 corresponds to SIGSEGV. The reported memory fault is not real; it results from ksh incorrectly killing itself with that signal. The problem is caused by two factors: 1. As of 93u+ 2012-08-01, ksh explicitly allows 'return' to use an exit status corresponding to a signal (from 257 to end of signal range). The rest of the integer range is truncated to 8 bits. This is contrary to both 'man ksh' and 'return --man' which both say it's always truncated to 8 bits. Plus, combined with point 2 below, this new behaviour is nonsensical, as 'return' has no business actually generating signals. However, a couple of regression tests now depend on this, as may some scripts. 2. When a ksh-style function does not handle a signal, the signal is passed down to the parent environment and ksh does this by reissuing the signal to its own process after leaving the function scope. However, it does this by checking the exit status, which is very bad practice as there is no guarantee that an exit status corresponding to a signal was in fact produced by a signal, particularly after they changed the behaviour of 'return' per 1 above. This commit fixes both issues. It also takes a proper decision on allowable 'return' exit status arguments. Since 93u+ was released nearly a decade ago and some scripts may now rely on being able to pass certain exit statuses out of the 8-bit range, we should not disallow this now. 
But neither should we be half-hearted in allowing only some arbitrary selection of 9-bit statuses; 'return' values categorically should have nothing to do with signals, so this is no basis for limiting them. We're now allowing the full unsigned integer range, which is usually 32 bits. This is like zsh, and may create some interesting possibilities for scripts. Just don't forget that $? will still lose all but its 8 least significant bits when leaving the current (sub)shell environment. src/cmd/ksh93/sh/xec.c: sh_funscope(): - Fix passing down unhandled signals from interrupted ksh functions (jumpval==SH_JMPFUN) to the parent environment. Do not pay any attention to the exit status. Instead, use sh.lastsig (a.k.a. shp->lastsig). It is set by sh_fault() in fault.c for just this purpose and contains the last signal handled for the current command. It is reset in sh_exec() before running any new command. So if it contains a signal, that is the one that interrupted the ksh function, so it's the correct one to pass down. (Further evidence: sh_subshell() was already using this in the same way.) src/cmd/ksh93/bltins/cflow.c: b_return(): - Allow any signed int return value when invoked as and behaving like 'return'. - Add warning if a passed value is out of int range. Set the exit status to 128 in that case; int overflow is undefined behaviour in C and we want consistent behaviour across platforms. It should be safe enough to check if the long and int values are equal. - Refactor for clarity. src/cmd/ksh93/sh/subshell.c: sh_subshell(): - If a function returns with a status out of the 8 bit range in a virtual subshell, this status could be passed down to the parent shell in full. However, if the subshell forks, then the kernel will enforce an 8-bit exit status. That is inconsistent. Scripts should not be able to tell the difference between forked and non-forked subshells, so artificially enforce that limit here. Other changed files: - Documentation updates and copy-edits. 
- Update an AT&T functions.sh regress test to allow arbitrary integer return values for functions. - Add regression tests based in part on @JohnoKing's reproducers. - Rework some vaguely related regression tests to fail gracefully. Thanks to Johnothan King for the report and the testing. Fixes: https://github.com/ksh93/ksh/issues/364	2021-12-09 06:41:39 +01:00
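The arithmetic behind the 267 example in `b3050769ea` can be reproduced in any POSIX shell: a status that leaves a process keeps only its 8 least significant bits, and 267 is 256 plus 11, the number SIGSEGV has on most systems. This is a portable sketch, not ksh-specific; it assumes an `sh` on the PATH, and the variable names are illustrative.

```sh
# A status leaving a process is cut to its 8 least significant bits.
truncated=0
sh -c 'exit 267' || truncated=$?   # what the parent sees: 267 & 255

masked=$(( 267 & 255 ))            # the same computation done by hand
signum=$(( 267 - 256 ))            # signal number that status 267 implies

echo "truncated=$truncated masked=$masked signum=$signum"
```

All three values come out as 11, which is why the old code, judging by the exit status alone, could mistake a plain `return 267` for a function killed by signal 11.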
Johnothan King	cd8c48cc5a	Add the '-e' flag to the 'cd' builtin (#358) This change adds the -e flag to the cd builtin, as specified in <https://www.austingroupbugs.net/view.php?id=253>. The -e flag verifies the current working directory after 'cd -P' successfully changes the directory, and returns with exit status 1 if the cwd couldn't be determined. Additionally, it causes all other errors to return with exit status >1 (i.e., status 2 unless ENOMEM occurs) if -e and -P are both active. src/cmd/ksh93/bltins/cd_pwd.c: - Add -e option to the cd builtin command. It verifies $PWD by using test_inode() to execute the equivalent of [[ . -ef $PWD ]]. - The check for restricted mode has been moved after optget to allow 'cd -eP' to return with exit status 2 when in restricted mode. To avoid changing the previous behavior of cd when -e isn't passed, extra checks have been added to prevent cd from printing usage information in restricted mode. src/cmd/ksh93/tests/builtins.sh: - Add regression tests for the exit status when using the cd -P flag with and without -e. src/cmd/ksh93/data/builtins.c, src/cmd/ksh93/sh.1: - Document the addition of -e to the cd builtin.	2021-12-06 06:58:11 +01:00
Martijn Dekker	f508660ddf	Revert "Fix defining types conditionally and/or in subshells (re: `8ced1daa`)" This reverts commit `2b9cbbbc8e`. This is not ready for prime time. Crashes when running a $PS2 discipline function. This needs fixing and more testing in development before making it into the 1.0 branch. In the meantime, that terrible problem with types is back, sorry about that.	2021-11-29 20:08:53 +01:00
Martijn Dekker	2b9cbbbc8e	Fix defining types conditionally and/or in subshells (re: `8ced1daa`) This commit mitigates the effects of the hack explained in the referenced commit so that dummy built-in command nodes added by the parser for declaration/assignment purposes do not leak out into the execution level, except in a relatively harmless corner case. Something like if false; then typeset -T Foo_t=(integer -i bar) fi will no longer leave a broken dummy Foo_t declaration command. The same applies to declaration commands created with enum. The corner case remaining is: $ ksh -c 'false && enum E_t=(a b c); E_t -a x=(b b a c)' ksh: E_t: not found Since the 'enum' command is not executed, this should have thrown a syntax error on the 'E_t -a' declaration: ksh: syntax error at line 1: `(' unexpected This is because the -c script is parsed entirely before being executed, so E_t is recognised as a declaration built-in at parse time. However, the 'not found' error shows that it was successfully eliminated at execution time, so the inconsistent state will no longer persist. This fix now allows another fix to be effective as well: since built-ins do not know about virtual subshells, fork a virtual subshell into a real subshell before adding any built-ins. src/cmd/ksh93/sh/parse.c: - Add a pair of functions, dcl_hacktivate() and dcl_dehacktivate(), that (de)activate an internal declaration built-ins tree into which check_typedef() can pre-add dummy type declaration command nodes. A viewpath from the main built-ins tree to this internal tree is added, unifying the two for search purposes and causing new nodes to be added to the internal tree. When parsing is done, we close that viewpath. This hides those pre-added nodes at execution time. Since the parser is sometimes called recursively (e.g. for command substitutions), keep track of this and only activate and deactivate at the first level. - We also need to catch errors. 
This is done by setting libast's error_info.exit variable to a dcl_exit() function that tidies up and then passes control to the original (usually sh_exit()). - sh_cmd(): This is the most central function in the parser. You'd think it was sh_parse(), but $(modern)-form command substitutions use sh_dolparen() instead. Both call sh_cmd(). So let's simply add a dcl_hacktivate() call at the beginning and a dcl_dehacktivate() call at the end. - assign(): This function calls path_search(), which among many other things executes an FPATH search, which may execute arbitrary code at parse time (!!!). So, regardless of recursion level, forcibly dehacktivate() to avoid those ugly parser side effects returning in that context. src/cmd/ksh93/bltins/enum.c: b_enum(): - Fork a virtual subshell before adding a built-in. src/cmd/ksh93/sh/xec.c: sh_exec(): - Fork a virtual subshell when detecting typeset's -T option. Improves fix to https://github.com/ksh93/ksh/issues/256	2021-11-29 09:02:07 +01:00
Martijn Dekker	c9ca0ff531	typeset equivalents: use 'typeset' in error messages (re: `1fbbeaa1`) When giving an invalid or incompatible option to a typeset option equivalent command (former default alias) such as 'compound' or 'integer', the resulting usage messages are incorrect. Example: $ ksh -c 'compound -T foo=(typeset -a bar[1]=23)' ksh: compound: -T cannot be used with other options Usage: compound [-bflmnprstuxACHS] [-a[[type]]] [-i[base]] [-E[n]] [-F[n]] [-L[n]] [-M[mapping]] [-R[n]] [-X[n]] [-h string] [-T[tname]] [-Z[n]] [name[=value]...] Or: compound -f [name...] Or: compound -m [name=name...] Or: compound -n [name=name...] Or: compound -T [tname[=(type definition)]...] Help: compound [ --help | --man ] 2>&1 The error message is wrong (there were no other options) and some of the listed usages are invalid, like 'compound -f'. Typeset option equivalent commands should just use 'typeset' in all their error messages to avoid confusion. This is done by setting error_info.id to the name of the typeset builtin.	2021-11-28 21:16:17 +01:00
Martijn Dekker	214308f81e	'.': disable ksh function lookup in POSIX mode POSIXly, '.' loads only files, not functions. This only applies to '.', not 'source' (which is not in POSIX). src/cmd/ksh93/bltins/misc.c: b_source(): - For ksh function lookup, add an additional check that we're not in POSIX mode and running the '.' (SYSDOT) builtin.	2021-11-24 09:12:39 +01:00
Martijn Dekker	a66cd72f7d	arith: implement range checking for enum types Within arithmetic expressions, enumeration values of variables of a type created with the 'enum' command translate to index numbers from 0 to the number of elements minus 1. However, there was no range checking on this in the arithmetic subsystem, allowing the assignment of out-of-range values that did not correspond to any enumeration value. Variables of an enum type are internally unsigned short integers (NV_UINT16), like those created with 'integer -su', except with an additional discipline function (ENUM_disc). src/cmd/ksh93/bltins/enum.c, src/cmd/ksh93/include/builtins.h: - To implement range checking, the arithmetic system needs access to the 'nelem' (number of elements) member of 'struct Enum'. This is only defined locally in enum.c. We could move that to name.h so arith.c can access it, but enum.c has code that supports compiling as standalone. So, instead, define a quick extern function, b_enum_elem(), that does the necessary type conversion and returns a type's number of elements. - Add --man documentation for the arithmetic subsystem behaviour for enum types. Tell the enuminfo() function, which dynamically inserts values into the documentation, how to process new \f tags 'lastv' (the last-defined value) and 'lastn' (the number of the last element). src/cmd/ksh93/sh/arith.c: arith(): - For NV_UINT16 variables with an ENUM_disc discipline, check the range using b_enum_elem() and error out if necessary. Resolves: https://github.com/ksh93/ksh/issues/335	2021-11-23 22:10:40 +01:00
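The range check described in the commit above can be sketched in a few lines of C. This is only an illustration under stated assumptions: `struct enum_type`, `enum_in_range()` and `enum_assign()` are hypothetical names, not the ksh internals (`struct Enum`, `b_enum_elem()`, `arith()`), and the sketch only models the idea that a value stored in an unsigned short must be an index from 0 to the number of elements minus 1.

```c
#include <stddef.h>

/* Hypothetical stand-in for an enum type; ksh's real 'struct Enum' differs. */
struct enum_type {
    size_t nelem;          /* number of enumeration values */
    const char **values;   /* value names, indexed from 0 */
};

/* Return 1 if n is a valid index for the type, i.e. 0 <= n <= nelem-1. */
int enum_in_range(const struct enum_type *t, long long n)
{
    return n >= 0 && (unsigned long long)n < t->nelem;
}

/*
 * Store n into an unsigned short slot only if it is in range.
 * Returns 0 on success, -1 on a range error (slot left unchanged).
 */
int enum_assign(const struct enum_type *t, unsigned short *slot, long long n)
{
    if (!enum_in_range(t, n))
        return -1;
    *slot = (unsigned short)n;
    return 0;
}
```

With a three-element type, `enum_assign()` accepts 0 through 2 and rejects 3 and -1; a real shell would report an error to the user at the point where this sketch returns -1.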
Martijn Dekker	74730c8ac7	test/[: Improve error status > 1 (re: `7003aba4`, `cd2cf236`, `ef1f53b5`) As I got to know the code better, it now seems painfully obvious that getting test/[ to issue an exit status >= 2 on error only requires a simple check in sh_exit() in fault.c, which is called whenever the shell issues an error message.	2021-11-22 15:37:04 +01:00
Martijn Dekker	8ced1daadf	Fix enum type definition pre-parsing for shcomp and dot/source Parser limitations prevent shcomp or source from handling enum types correctly: $ cat /tmp/colors.sh enum Color_t=(red green blue orange yellow) Color_t -A Colors=([foo]=red) $ shcomp /tmp/colors.sh > /dev/null /tmp/colors.sh: syntax error at line 2: `(' unexpected $ source /tmp/colors.sh /bin/ksh: source: syntax error: `(' unexpected Yet, for types created using 'typeset -T', this works. This is done via a check_typedef() function that preliminarily adds the special declaration builtin at parse time, with details to be filled in later at execution time. This hack will produce ugly undefined behaviour if the definition command creating that type built-in is then not actually run at execution time before the type built-in is accessed. But the hack is necessary because we're dealing with a fundamental design flaw in the ksh language. Dynamically addable built-ins that change the syntactic parsing of the shell language on the fly are an absurdity that violates the separation between parsing and execution, which muddies the waters and creates the need for some kind of ugly hack to keep things like shcomp more or less working. This commit extends that hack to support enum. src/cmd/ksh93/sh/parse.c: - check_typedef(): - Add 'intypeset' parameter that should be set to 1 for typeset and friends, 2 for enum. - When processing enum arguments, use AST getopt(3) to skip over enum's options to find the name of the type to be defined. (getopt failed if we were running a -c script; deal with this by zeroing opt_info.index first.) - item(): Update check_typedef() call, passing lexp->intypeset. - simple(): Set lexp->intypeset to 2 when processing enum. 
The rest of the changes are all to support the above and should be fairly obvious, except: src/cmd/ksh93/bltins/enum.c: - enuminfo(): Return on null pointer, avoiding a crash upon executing 'Type_t --man' if Type_t has not been fully defined due to the definition being pre-added at parse time but not executed. It's all still wrong, but a crash is worse. Resolves: https://github.com/ksh93/ksh/issues/256	2021-11-21 17:43:55 +01:00
Martijn Dekker	bd9752e43c	Backport 'printf -v' from ksh 93v- 'printf' on bash and zsh has a popular -v option that allows assigning formatted output directly to variables without using a command substitution. This is much faster and avoids snags with stripping final linefeeds. AT&T had replicated this feature in the abandoned 93v- beta version. This backports it with a few tweaks and one user-visible improvement. The 93v- version prohibited specifying a variable name with an array subscript, such as printf -v var\[3\] foo. This works fine on bash and zsh, so I see no reason why this should not work on ksh, as nv_putval() deals with array subscripts just fine. src/cmd/ksh93/bltins/print.c: b_print(): - While processing the -v option when called as printf, get a pointer to the variable, creating it if necessary. Pass only the NV_VARNAME flag to enforce a valid variable name, and not (as 93v- does) the NV_NOARRAY flag to prohibit array subscripts. - If a variable was given, set the output file to an internal string buffer and jump straight to processing the format. - After processing the format, assign the contents to the string buffer to the variable. src/cmd/ksh93/data/builtins.c: - Document the new option, adding a warning that unquoted square brackets may trigger pathname expansion.	2021-11-19 03:54:33 +01:00
Martijn Dekker	fb8308243c	printf: fix %(pattern)q documentation in 'printf --man' %(pattern)q is equivalent to %P. It's also equivalent to %#P, but since the alternative format specifier '#' does nothing for %P, %P and %#P are the same and documenting %#P is just confusing. Thanks to @stephane-chazelas for the report. src/cmd/ksh93/bltins/print.c: - In the printmap struct, document %P as equivalent of %(pattern)q. - Sort it alphabetically. - Do not pointlessly repeat the string "Equivalent to". Instead, let the discipline function infof() insert it for each entry. (This is the function used to dynamically insert the equivalents documentation into the --man output at the \fextra\f tag in sh_optprintf[] in data/builtins.c.) Resolves: https://github.com/ksh93/ksh/issues/338	2021-11-18 17:46:38 +01:00
Martijn Dekker	0b0d0094b9	bltins/misc.c: exec: finish cleanup (re: `d8eba9d1`) An obsolete struct was left that passed some variables on between b_exec() and the deleted B_login(). We can simply make those local variables now. Let's get rid of the redundant sh pointer, too.	2021-11-18 04:38:46 +01:00
Martijn Dekker	c734568b02	arithmetic: Fix the octal leading zero mess (#337 ) In C/POSIX arithmetic, a leading 0 denotes an octal number, e.g. 010 == 8. But this is not a desirable feature as it can cause problems with processing things like dates with a leading zero. In ksh, you should use 8#10 instead ("10" with base 8). It would be tolerable if ksh at least implemented it consistently. But AT&T made an incredible mess of it. For anyone who is not intimately familiar with ksh internals, it is inscrutable where arithmetic evaluation special-cases a leading 0 and where it doesn't. Here are just some of the surprises/inconsistencies: 1. The AT&T maintainers tried to honour a leading 0 inside of ((...)) and $((...)) and not for arithmetic contexts outside it, but even that inconsistency was never quite consistent. 2. Since 2010-12-12, $((x)) and $(($x)) are different: $ /bin/ksh -c 'x=010; echo $((x)) $(($x))' 10 8 That's a clear violation of both POSIX and the principle of least astonishment. $((x)) and $(($x)) should be the same in all cases. 3. 'let' with '-o letoctal' acts in this bizarre way: $ set -o letoctal; x=010; let "y1=$x" "y2=010"; echo $y1 $y2 10 8 That's right, 'let y=$x' is different from 'let y=010' even when $x contains the same string value '010'! This violates established shell grammar on the most basic level. This commit introduces consistency. By default, ksh now acts like mksh and zsh: the octal leading zero is disabled in all arithmetic contexts equally. In POSIX mode, it is enabled equally. The one exception is the 'let' built-in, where this can still be controlled independently with the letoctal option as before (but, because letoctal is synched with posix when switching that on/off, it's consistent by default). We're also removing the hackery that causes variable expansions for the 'let' builtin to be quietly altered, so that 'x=010; let y=$x' now does the same as 'let y=010' even with letoctal on. 
Various files: - Get rid of now-redundant sh.inarith (shp->inarith) flag, as we're no longer distinguishing between being inside or outside ((...)). src/cmd/ksh93/sh/arith.c: - arith(): Let disabling POSIX octal constants by skipping leading zeros depend on either the letoctal option being off (if we're running the "let" built-in) or the posix option being off. - sh_strnum(): Preset a base of 10 for strtonll(3) depending on the posix or letoctal option being off, not on the sh.inarith flag. src/cmd/ksh93/include/argnod.h, src/cmd/ksh93/sh/args.c, src/cmd/ksh93/sh/macro.c: - Remove astonishing hackery that violated shell grammar for 'let'. src/cmd/ksh93/sh/name.c (nv_getnum()), src/cmd/ksh93/sh/nvdisc.c (nv_getn()): - Remove loops for skipping leading zeroes that included a broken check for justify/zerofill attributes, thereby fixing this bug: $ typeset -Z x=0x15; echo $((x)) -ksh: x15: parameter not set Even if this code wasn't redundant before, it is now: sh_arith() is called immediately after the removed code and it ignores leading zeroes via sh_strnum() and strtonll(3). Resolves: https://github.com/ksh93/ksh/issues/334	2021-11-17 04:28:08 +01:00
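The effect of presetting a base of 10 can be illustrated with the standard C `strtoll()` function, used here as a stand-in for libast's strtonll(3). `parse_int()` and its `posix_octal` parameter are hypothetical names for this sketch; the real sh_strnum() does considerably more.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: parse an integer constant with or without
 * C-style prefix handling.
 *   posix_octal != 0: base 0, so "010" is octal (8) and "0x10" is hex.
 *   posix_octal == 0: base 10, so leading zeros are simply skipped.
 */
long long parse_int(const char *s, int posix_octal)
{
    return strtoll(s, NULL, posix_octal ? 0 : 10);
}
```

Under these assumptions `parse_int("010", 1)` yields 8 and `parse_int("010", 0)` yields 10, which mirrors the difference between POSIX mode and the new default described above.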
Johnothan King	b40155fae8	Fix file descriptor leaks in the `hist` builtin (#336 ) This commit fixes two file descriptor leaks in the hist built-in. The bugfix for the first file descriptor leak was backported from ksh2020. See: https://github.com/att/ast/issues/872 `73bd61b5` Reproducer: $ echo no $ hist -s no=yes The second file descriptor leak occurs after a substitution error in the hist built-in (this leak wasn't fixed in ksh2020). Reproducer: $ echo no $ ls /proc/$$/fd $ hist -s no=yes $ hist -s no=yes $ ls /proc/$$/fd src/cmd/ksh93/bltins/hist.c: - Close leftover file descriptors when an error occurs and after 'hist -s' runs a command. src/cmd/ksh93/tests/builtins.sh: - Add two regression tests for both of the file descriptor leaks.	2021-11-16 23:34:46 +01:00
Martijn Dekker	ef1f53b5b2	test/[: rm SH_INTESTCMD; test for 'test' directly (re: `cd2cf236`) Turns out there is a way to check what built-in we're running at any time. It is done for 'let' in arith.c: sh.bltindata.bnode==SYSLET For test/[, that would be (see include/builtins.h): sh.bltindata.bnode==SYSTEST || sh.bltindata.bnode==SYSBRACKET	2021-11-15 21:15:25 +01:00
Martijn Dekker	d9f1fdaa41	Fix [ \( str -a str \) ], [ \( str -o str \) ] Symptoms: $ test \( string1 -a string2 \) /usr/local/bin/ksh: test: argument expected $ test \( string1 -o string2 \) /usr/local/bin/ksh: test: argument expected The parentheses should be irrelevant and this should be a test for the non-emptiness of string1 and/or string2. src/cmd/ksh93/bltins/test.c: - b_test(): There is a block where the case of 'test' with five or fewer arguments, the first and last one being parentheses, is special-cased. The parentheses are removed as a workaround: argv is increased to skip the opening parenthesis and argc is decreased by 2. However, there is no corresponding increase of tdata.av which is a copy of this function's argv. This renders the workaround ineffective. The fix is to add that increase. - e3(): Do not handle '!' as a negator if not followed by an argument. This allows a right-hand expression that is equal to '!' (i.e. a test for the non-emptiness of the string '!').	2021-11-15 02:44:56 +01:00
Martijn Dekker	c81473061a	test/[: binary operators: fix '<' and add '=~'; some more cleanups In ksh88, the test/[ built-in supported both the '<' and '>' lexical sorting comparison operators, same as in [[. However, in every version of ksh93, '<' does not work though '>' still does! Still, the code for both is present in test_binop(): src/cmd/ksh93/bltins/test.c 548: case TEST_SGT: 549: return(strcoll(left, right)>0); 550: case TEST_SLT: 551: return(strcoll(left, right)<0); Analysis: The binary operators are looked up in shtab_testops[] in data/testops.c using a macro called sh_lookup, which expands to a sh_locate() call. If we examine that function in sh/string.c, it's easy to see that on systems using ASCII (i.e. all except IBM mainframes), it assumes the table is sorted in ASCII order. src/cmd/ksh93/sh/string.c 64: while((c= tp->sh_name) && (CC_NATIVE!=CC_ASCII || c <= first)) The problem was that the '<' operator was not correctly sorted in shtab_testops[]; it was sorted immediately before '>', but after '='. The ASCII order is: < (60), = (61), > (62). This caused '<' to never be found in the table. The test_binop() function is also used by [[, yet '<' always worked in that. This is because the parser has code that directly checks for '<' and '>' within [[ (in sh/parse.c, lines 1949-1952). This commit also adds '=~' to 'test', which took three lines of code and allowed eliminating error handling in test_binop() as test/[ and [[ now support the same binary ops. (re: `fc2d5a60`) src/cmd/ksh93//*.[ch]: - Rename a couple of very misleadingly named macros in test.h: . For == and !=, the TEST_PATTERN bit is off for pattern compares and on for literal string compares! Rename to TEST_STRCMP. . The TEST_BINOP bit does not denote all binary operators, but only the logical -a/-o ops in test/[. Rename to TEST_ANDOR. src/cmd/ksh93/bltins/test.c: test_binop(): - Add support for =~. This is only used by test/[. 
The method is implemented in two lines that convert the ERE to a shell pattern by prefixing it with ~(E), then call test_strmatch with that temporary string to match the ERE and update ${.sh.match}. - Since all binary ops from shtab_testops[] are now accounted for, remove unknown op error handling from this function. src/cmd/ksh93/data/testops.c: - shtab_testops[]: . Correctly sort the '<' (TEST_SLT) entry. . Remove ']]' (TEST_END). It's not an op and doesn't belong here. - Update sh_opttest[] documentation with =~, \<, \>. - Remove now-unused e_unsupported_op[] error message. src/cmd/ksh93/sh/lex.c: sh_lex(): - Check for ']]' directly instead of relying on the removed TEST_END entry from shtab_testops[]. src/cmd/ksh93/tests/bracket.sh: - Add relevant tests. src/cmd/ksh93/tests/builtins.sh: - Fix an old test that globally deleted the 'test' builtin. Delete it within the command substitution subshell only. - Remove the test for non-support of =~ in test/[. - Update the test for invalid test/[ op to use test directly.	2021-11-14 02:46:34 +01:00
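Why a missorted entry becomes unreachable can be shown with a simplified early-exit table scan. This is a sketch only: `struct op_entry`, `lookup_sorted()` and the two tables are hypothetical and much smaller than shtab_testops[], and the loop is a simplification of the idea behind sh_locate(), not a copy of it. The only assumption carried over from the commit message is that the scan stops once an entry's first character sorts after the first character of the name being searched.

```c
#include <string.h>

struct op_entry {
    const char *name;  /* operator text; NULL terminates the table */
    int id;            /* nonzero identifier */
};

/*
 * Early-exit scan that assumes the table is sorted by first character
 * in ASCII order. Returns the entry's id, or 0 if not found.
 */
int lookup_sorted(const struct op_entry *tab, const char *name)
{
    unsigned char first = (unsigned char)name[0];
    for (; tab->name; tab++) {
        unsigned char c = (unsigned char)tab->name[0];
        if (c > first)
            break;  /* a sorted table cannot contain the name past here */
        if (c == first && strcmp(tab->name, name) == 0)
            return tab->id;
    }
    return 0;
}

/* '<' (60) placed after '=' (61): the scan stops at '=' and never sees '<'. */
const struct op_entry missorted[] = {
    {"!=", 1}, {"=", 2}, {"<", 3}, {">", 4}, {NULL, 0}
};

/* Correct ASCII order: '!' (33), '<' (60), '=' (61), '>' (62). */
const struct op_entry sorted[] = {
    {"!=", 1}, {"<", 3}, {"=", 2}, {">", 4}, {NULL, 0}
};
```

Looking up `"<"` in `missorted` returns 0 while `">"` is still found, which matches the symptom described above; the same lookup in `sorted` returns 3.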
Martijn Dekker	6f5c9fea93	test/[: Fix binary -a/-o operators in POSIX mode POSIX requires test "$a" -a "$b" to return true if both $a and $b are non-empty, and test "$a" -o "$b" to return true if either $a or $b is non-empty. In ksh, this fails if "$a" is '!' or '(' as this causes ksh to interpret the -a and -o as unary operators (-a being a file existence test like -e, and -o being a shell option test). $ test ! -a ""; echo "$?" 0 (expected: 1/false) $ set -o trackall; test ! -o trackall; echo "$?" 1 (expected: 0/true) $ test \( -a \); echo "$?" ksh: test: argument expected 2 (expected: 0/true) $ test \( -o \) ksh: test: argument expected 2 (expected: 0/true) Unfortunately this problem cannot be fixed without risking breakage in legacy scripts. For instance, a script may well use test ! -a filename to check that a filename is nonexistent. POSIX specifies that this always return true as it is a test for the non-emptiness of both strings '!' and 'filename'. So this commit fixes it for POSIX mode only. src/cmd/ksh93/bltins/test.c: e3(): - If the posix option is active, specially handle the case of having at least three arguments with the second being -a or -o, overriding their handling as unary operators. src/cmd/ksh93/data/testops.c: - Update 'test --man --' date and say that unary -a is deprecated. src/cmd/ksh93/sh.1: - Document the fix under the -o posix option. - For test/[, explain that binary -a/-o are deprecated. src/cmd/ksh93/tests/bracket.sh: - Add tests based on reproducers in bug report. Resolves: https://github.com/ksh93/ksh/issues/330	2021-11-13 03:43:29 +01:00
Martijn Dekker	09a8a279f2	Fix bug on closed stdout; improve BUG_PUTIOERR fix (re: `93e15a30`) Stéphane Chazelas reported: > As noted in this austin-group-l discussion[] (relevant to this > issue): > > $ ksh93u+m -c 'pwd; echo "$?" >&2; echo test; echo "$?" >&2' >&- > 0 > 1 > /home/chazelas > > when stdout is closed, pwd does claim it succeeds (by returning a > 0 exit status), while echo doesn't (not really relevant to the > problem here, only to show it doesn't affect all builtins), and > the output that pwd failed to write earlier ends up being written > on stderr here instead of stdout upon exit (presumably) because > of that >&2 redirection. > > strace shows ksh93 attempting write(1, "/home/chazelas\n", 15) 6 > times (1, the last one, successful). > > It gets even weirder when redirecting to a file: > > $ ksh93u+m -c 'pwd; echo "$?" >&2; echo test; echo "$?" > file' >&- > 0 > $ cat file > 1 > 1 > ome/chazelas In my testing, the problem does not occur when closing stdout at the start of the -c script itself (using redirect >&- or exec >&-); it only occurs if stdout was closed before initialising the shell. That made me suspect that the problem had to do with an inconsistent file descriptor state in the shell. ksh uses internal sh_open() and sh_close() functions, among others, to maintain that state. src/cmd/ksh93/sh/main.c: sh_main(): - If the shell is initialised with stdin, stdout or stderr closed, then make the shell's file descriptor state tables reflect that fact by calling sh_close() for the closed file descriptors. This commit also improves the BUG_PUTIOERR fix from `93e15a30`. Error checking after sfsync() is not sufficient. 
For instance, on FreeBSD, the following did not produce a non-zero exit status: ksh -c 'echo hi' >/dev/full even though this did: ksh -c 'echo hi >/dev/full' Reliable error checking requires not only checking the result of every SFIO command that writes output, but also synching the buffer at the end of the operation and checking the result of that. src/cmd/ksh93/bltins/print.c: - Make exitval variable global to allow functions called by b_print() to set a nonzero exit status. - Check the result of all SFIO output commands that write output. - b_print(): Always sfsync() at the end, except if the s (history) flag was given. This allows getting rid of the sfsync() call that required the workaround introduced in `846ad932`. [] https://www.mail-archive.com/austin-group-l@opengroup.org/msg08056.html Resolves: https://github.com/ksh93/ksh/issues/314	2021-11-07 15:44:06 +00:00
Martijn Dekker	7b5b0a5d54	Fix octal number arguments in printf integer arithmetic Bug 1: POSIX requires numbers used as arguments for all the %d, %u... in printf to be interpreted as in the C language, so printf '%d\n' 010 should output 8 when the posix option is on. However, it outputs 10. This bug was introduced as a side effect of a change introduced in the 2012-02-07 version of ksh 93u+m, which caused the recognition of leading-zero numbers as octal in arithmetic expressions to be disabled outside ((...)) and $((...)). However, POSIX requires leading-zero octal numbers to be recognised for printf, too. The change in question introduced a sh.arith flag that is set while we're processing a POSIX arithmetic expression, i.e., one that recognises leading-zero octal numbers. Bug 2: Said flag is not reset in a command substitution used within an arithmetic expression. A command substitution should be a completely new context, so the following should both output 10: $ ksh -c 'integer x; x=010; echo $x' 10 # ok; it's outside ((…)) so octals are not recognised $ ksh -c 'echo $(( $(integer x; x=010; echo $x) ))' 8 # bad; $(comsub) should create new non-((…)) context src/cmd/ksh93/bltins/print.c: extend(): - For the u, d, i, o, x, and X conversion modifiers, set the POSIX arithmetic context flag before calling sh_strnum() to convert the argument. This fixes bug 1. src/cmd/ksh93/sh/subshell.c: sh_subshell(): - When invoking a command substitution, save and unset the POSIX arithmetic context flag. Restore it at the end. This fixes bug 2. Reported-by: @stephane-chazelas Resolves: https://github.com/ksh93/ksh/issues/326	2021-09-13 04:57:37 +02:00
Martijn Dekker	e5e1d4b53e	Decrease SHLVL before doing 'exec' from main shell Problem: $ exec ksh $ echo $SHLVL 2 $ exec ksh $ echo $SHLVL 3 $ exec ksh $ echo $SHLVL 4 ...etc. SHLVL is supposed to count the number of shell processes that you need to exit before you get logged out. Since ksh was replacing itself with a new shell in the same process using 'exec', SHLVL should not increase. src/cmd/ksh93/bltins/misc.c: b_exec(): - When about to replace the shell and we're not in a subshell, decrease SHLVL to cancel out a subsequent increase by the replacing shell. Bash and zsh also do this.	2021-05-19 00:08:12 +02:00
Martijn Dekker	a197b0427a	Fix two more 'command' bugs BUG 1: Though 'command' is specified/documented as a regular builtin, preceding assignments survive the invocation (as with special or declaration builtins) if 'command' has no command arguments in these cases: $ foo=wrong1 command; echo $foo wrong1 $ foo=wrong2 command -p; echo $foo wrong2 $ foo=wrong3 command -x; echo $foo wrong3 Analysis: sh_exec(), case TCOM (simple command), contains the following loop that skips over 'command' prefixes, preparsing any options and remembering the offset in the 'command' variable: src/cmd/ksh93/sh/xec.c 1059 while(np==SYSCOMMAND || !np && com0 && nv_search(com0,shp->fun_tree,0)==SYSCOMMAND) 1060 { 1061 register int n = b_command(0,com,&shp->bltindata); 1062 if(n==0) 1063 break; 1064 command += n; 1065 np = 0; 1066 if(!(com0= (com+=n))) 1067 break; 1068 np = nv_bfsearch(com0, shp->bltin_tree, &nq, &cp); 1069 } This skipping is not done if the preliminary b_command() call on line 1061 (with argc==0) returns zero. This is currently the case for command -v/-V, so that 'command' is treated as a plain and regular builtin for those options. The cause of the bug is that this skipping is even done if 'command' has no arguments. So something like 'foo=bar command' is treated as simply 'foo=bar', which of course survives. So the fix is for b_command() to return zero if there are no arguments. Then b_command() itself needs changing to not error out on the second/main b_command() call if there are no arguments. src/cmd/ksh93/bltins/whence.c: b_command(): - When called with argc==0, return a zero offset not just for -v (X_FLAG) or -V (V_FLAG), but also if there are no arguments left (!argv) after parsing options. - When called with argc>0, do not issue a usage error if there are no arguments, but instead return status 0 (or, if -v/-V was given, status 2 which was the status of the previous usage message). This way, 'command -v $emptyvar' now also works as you'd expect. 
BUG 2: 'command -p' sometimes failed after executing certain loops. src/cmd/ksh93/sh/path.c: defpath_init(): - astconf() returns a pointer to memory that may be overwritten later, so duplicate the string returned. Backported from ksh2020. (re: `f485fe0f`, `aa4669ad`, <https://github.com/att/ast/issues/959>) src/cmd/ksh93/tests/builtins.sh: - Update the test for BUG_CMDSPASGN to check every variant of 'command' (all options and none; invoking/querying all kinds of command and none) with a preceding assignment. (re: `fae8862c`) This also covers bug 2 as 'command -p' was failing on macOS prior to the fix due to a loop executed earlier in another test.	2021-05-05 02:43:18 +01:00
Johnothan King	1aec9b06dd	Fix more bugs that occur when enabling ASan (#293 ) src/cmd/ksh93/{bltins/typeset,sh/name,sh/nvtree,sh/nvtype}.c: - Replace more instances of memcmp with strncmp to fix heap-buffer-overflow errors when running the regression tests with ASan enabled. src/cmd/ksh93/edit/vi.c: - Fix an invalid dereference of the 'p' pointer to fix a crash in vi mode when entering a comment in the command history. This bugfix was backported from ksh2020: https://github.com/att/ast/issues/798 src/cmd/ksh93/tests/pty.sh: - Add a regression test for the vi mode crash.	2021-05-02 04:06:30 +01:00
Martijn Dekker	72fe631b2f	typeset: only run sh_assignok() for variables (re: `a2f13c19`) src/cmd/ksh93/bltins/typeset.c: - setall(): Only run sh_assignok() if troot points to the variable tree. For instance, it's pointless to run it for an alias. - Remove vestigial SHOPT_BSH code. The ast-open-history repo shows that earlier SHOPT_BSH code was removed on 2008-06-02 and 2005-05-22. This may have been experimental code for increased compatibility with the ancient Bourne shell. There was never any documentation.	2021-05-01 00:43:12 +01:00
Martijn Dekker	2aad3cab06	Add ksh 93u+m contributors notice to 964 copyright headers	2021-04-26 00:19:31 +01:00
Martijn Dekker	295cce2c6d	Improve fix for 'unset -f' in sub-subshell (re: `07faf384`, `13c57e4b`) src/cmd/ksh93/bltins/typeset.c: - Removing the nv_search() call altogether was actually not necessary, I was just searching the wrong tree: instead of sh.fun_base, simply search the current sh.fun_tree which has a view to all the layered parent subshell copies. It is not going to find it in the current subshell tree but will find it in one of the parent trees if it exists. The cost of an unnecessary dummy is negligible, but so is the cost of this search, and doing it is more correct.	2021-04-24 21:26:17 +01:00
Johnothan King	24b7fcb771	Don't detect unset functions with whence (re: `13c57e4b`) (#287 ) src/cmd/ksh93/bltins/whence.c: - The previous commit that fixed 'unset -f' in virtual subshells left one bug. The type builtin (or 'whence -v') could still find the unset function in virtual subshells: $ foo() { echo foo; } $ (unset -f foo; type foo) foo is an undefined function To fix this bug, avoid detecting functions in the whence builtin unless they have the NV_FUNCTION flag. src/cmd/ksh93/tests/subshell.sh: - Add a regression test for using 'type' on a function unset inside of a virtual subshell.	2021-04-24 21:25:45 +01:00
Martijn Dekker	07faf38425	Fix 'unset -f' in sub-subshell (re: `13c57e4b`) A bug introduced in the previous commit caused 'unset -f' in a subshell of a subshell to fail to unset a function created in a parent subshell. Reproducer: $ ( f2() { echo WRONG; }; ( unset -f f2; f2 ) ) WRONG src/cmd/ksh93/bltins/typeset.c: unall(): - Do not nv_search() in sh.fun_base before setting the dummy node that marks the function as unset in this subshell. That search only reaches the base tree and not any of its subtrees. Setting the dummy unconditionally is not harmful; the cost is negligible. src/cmd/ksh93/tests/subshell.sh: - Add test for the bug.	2021-04-24 12:41:40 +01:00
Martijn Dekker	13c57e4b58	Fix 'unset -f' to work in subshells without forking (re: `047cb330`) This commit implements unsetting functions in virtual subshells, removing the need for the forking workaround. This is done by either invalidating the function found in the current subshell function tree by unsetting its NV_FUNCTION attribute bits (which will cause sh_exec() to skip it) or, if the function exists in a parent shell, by creating an empty dummy subshell node in the current function tree without that attribute. As a beneficial side effect, it seems that bug 228 (unset -f fails in forked subshells if a function is defined before forking) is now also fixed. src/cmd/ksh93/include/defs.h, src/cmd/ksh93/sh/init.c: - Add sh.fun_base for a saved pointer to the main shell's function tree for checking when in a subshell, analogous to sh.var_base. src/cmd/ksh93/bltins/typeset.c: unall(): - Remove the fork workaround. - When unsetting a function found in the current function tree (troot) and that tree is not sh.var_base (which checks if we're in a virtual subshell in a way that handles shared-state command substitutions correctly), then do not delete the function but invalidate it by unsetting its NV_FUNCTION attribute bits. - When unsetting a function not found in the current function tree, search for it in sh.fun_base and if found, add an empty dummy node to mask the parent shell environment's function. The dummy node will not have NV_FUNCTION set, so sh_exec() will skip it. src/cmd/ksh93/sh/subshell.c: - sh_subfuntree(): For 'unset -f' to work correctly with shared-state command substitutions (subshares), this function needs a fix similar to the one applied to sh_assignok() for variables in commit `911d6b06`. Walk up on the subshells tree until we find a non-subshare. - sh_subtracktree(): Apply the same fix for the hash table. - Remove table_unset() and incorporate an updated version of its code in sh_subshell(). 
As of `ec888867`, this function was only used to clean up the subshell function table as the alias table no longer exists. - sh_subshell(): * Simplify the loop to free the subshell hash table. * Add table_unset() code, slightly refactored for readability. Treat dummy nodes now created by unall() separately to avoid a memory leak; they must be nv_delete()d without passing the NV_FUNCTION bits. For non-dummy nodes, turn on the NV_FUNCTION attribute in case they were invalidated by unall(); this is needed for _nv_unset() to free the function definition. src/cmd/ksh93/tests/subshell.sh: - Update the test for multiple levels of subshell functions to test a subshare as well. While we're at it, add a very similar test for multiple levels of subshell variables that was missing. - Add @JohnoKing's reproducer from #228. src/cmd/ksh93/tests/leaks.sh: - Add leak tests for unsetting functions in a virtual subshell. Test both the simple unset case (unall() creates a dummy node) and the define/unset case (unall() invalidates existing node). Resolves: https://github.com/ksh93/ksh/issues/228	2021-04-24 06:57:49 +01:00
Martijn Dekker	61e0f90460	Yet more fixes for subshell directory handling (re: `feaf718f`) There were still problems left after the previous commit. On at least one system (QNX i386), the following regression test crashed: src/cmd/ksh93/test/subshell.c 900 got=$( { "$SHELL" -c '(cd /; (cd /)); print -r -- "PWD=$PWD"'; } 2>&1 ) A backtrace done on the core dump pointed to the free() call here: src/cmd/ksh93/bltins/cd_pwd.c 90 if(oldpwd && oldpwd!=shp->pwd && oldpwd!=e_dot) 91 free(oldpwd); Analysis: The interaction between $PWD, sh.pwd aka shp->pwd, and the path_pwd() function is a mess. path_pwd() usually returns a freeable value, but not always. sh.pwd is sometimes a pointer to the value of $PWD, but not always (e.g. when you unset PWD or assign to it). Instead of debugging the exact cause of the crash, I think it is better to make this work in a more consistent way. As of this commit: 1. sh.pwd keeps its own copy of the PWD, independently of the PWD variable. The old value must always be freed immediately before assigning a new one. This is simple and consistent, reducing the chance of bugs at negligible cost. 2. The PWD variable is no longer given the NV_NOFREE attribute because its value no longer points to sh.pwd. It is now a variable like any other. src/cmd/ksh93/sh/path.c: path_pwd(): - Do not give PWDNOD the NV_NOFREE attribute. - Give sh.pwd its own copy of the PWD by strdup'ing PWDNOD's value. src/cmd/ksh93/bltins/cd_pwd.c: b_cd(): - Since sh.pwd is now consistently freed before giving it a new value and at no other time, oldpwd must not be freed any longer and can become a regular non-static variable. - If the PWD needs reinitialising, call path_pwd() to do it. src/cmd/ksh93/sh/subshell.c: sh_subshell(): - Systems with fchdir(2): Always restore the PWD upon exiting a non-subshare subshell. 
The check to decide whether or not to restore it was unsafe: it was not restored if the current PWD pointer and value was identical to the saved one, but a directory can be deleted and recreated under the same name. - Systems without fchdir(2) (if any exist): . Entry: Fork if the PWD is nonexistent or has no x permission. . Restore: Only chdir back if the subshell PWD was changed. That's probably the best we can do. It remains inherently unsafe. We should probably just require fchdir(2) at some point.	2021-04-20 05:39:10 +01:00
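The ownership rule described in commit 61e0f90460 (the shell state keeps its own copy of the PWD and frees the old copy whenever a new one is assigned) can be sketched roughly as follows. This is an illustration only, not the ksh source: `struct demo_shell`, `demo_copy()` and `demo_set_pwd()` are hypothetical names, and the copy is made before the old value is freed so that passing the current value back in stays safe, which is a choice of this sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the shell state; only the pwd field matters here. */
struct demo_shell
{
	char *pwd;	/* always NULL or a heap copy owned by this struct */
};

/* Portable equivalent of strdup(3), to avoid feature-test macros. */
static char *demo_copy(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);
	if (p)
		memcpy(p, s, n);
	return p;
}

/*
 * Give the shell its own copy of newpwd.
 * The old value is freed right before the assignment and at no other time.
 * Returns 0 on success, -1 if memory could not be allocated
 * (in which case the old value is left untouched).
 */
static int demo_set_pwd(struct demo_shell *sh, const char *newpwd)
{
	char *copy = demo_copy(newpwd);
	if (!copy)
		return -1;
	free(sh->pwd);	/* free(NULL) is a no-op, so the first call is fine */
	sh->pwd = copy;
	return 0;
}
```

Because every caller goes through one setter, no other code needs to decide whether a given pointer is freeable, which is the class of bug the commit message describes.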
Martijn Dekker	feaf718f16	More fixes for subshell directory handling (re: `7bab9508`, `5ee290c7`) This commit fixes what are hopefully the two final aspects of #153: 1. If the present working directory does not exist (was moved or deleted) upon entering a virtual subshell, no PWD directory path is saved. Since restoring the state after exiting a virtual subshell is contingent on a previous PWD path existing, this resulted in entire aspects of the virtual subshell, such as the subshell function tree, not being cleaned up. 2. A separate problem is that 'cd ..' does not update PWD or OLDPWD when run from a nonexistent directory. A reproducer exposing both problems is: $ mkdir test $ cd test $ ksh -c '(subfn() { BAD; }; cd ..; echo subPWD==$PWD); typeset -f subfn; echo mainPWD==$PWD' subPWD==/usr/local/src/ksh93/ksh/test subfn() { BAD; };mainPWD==/usr/local/src/ksh93/ksh/test Expected output: subPWD==/usr/local/src/ksh93/ksh mainPWD==/usr/local/src/ksh93/ksh/test src/cmd/ksh93/bltins/cd_pwd.c: - If path_pwd() fails to get the PWD (usually it no longer exists), don't set $OLDPWD to '.' as that is pointless; use $PWD instead. After cd'ing from a nonexistent directory, 'cd -' should fail and should not be equivalent to 'cd .'. - Remove a redundant check for (!oldpwd) where it is always set. - Do not prematurely return without setting PWD or OLDPWD if pathcanon() fails to canonicalise a nonexistent directory. Instead, fall back to setting PWD to the result of getcwd(3). src/cmd/ksh93/sh/subshell.c: - Minor stylistic adjustment. Some NULL macros sneaked in. This historic code base does not use them (yet); change to NIL(type*). - sh_subshell(): Fix logic for determining whether to save/restore subshell state. 1. When saving, 'if(!comsub \|\| !shp->subshare)' is redundant; 'if(!shp->subshare)' should be enough. If we're not in a subshare, state should be saved. 2. 
When restoring, 'if(sp->shpwd)' is just nonsense as there is no guarantee that the PWD exists upon entering a subshell. Simply use the same 'if(!shp->subshare)'. Add an extra check for sp->pwd to avoid a possible segfault. Always restore the PWD on subshell exit and not only if shp->pwd is set. - sh_subshell(): Issue fatal errors in libast's "panic" format. src/cmd/ksh93/tests/builtins.sh: - Adjust a relevant test to run err_exit() outside of the subshell so that any error is counted in the main shell. - Add test for problem 2 described at the top. src/cmd/ksh93/tests/subshell.sh: - Add test for problems 1 and 2 based on reproducer above. Resolves: https://github.com/ksh93/ksh/issues/153	2021-04-19 00:06:43 +01:00
Johnothan King	a065558291	Fix more compiler warnings, typos and other minor issues (#260) Many of these changes are minor typo fixes. The other changes (which are mostly compiler warning fixes) are: NEWS: - The --globcasedetect shell option works on older Linux kernels when used with FAT32/VFAT file systems, so remove the note about it only working with 5.2+ kernels. src/cmd/ksh93/COMPATIBILITY: - Update the documentation on function scoping with an addition from ksh93v- (this does apply to ksh93u+). src/cmd/ksh93/edit/emacs.c: - Check for '_AST_ksh_release', not 'AST_ksh_release'. src/cmd/INIT/mamake.c, src/cmd/INIT/ratz.c, src/cmd/INIT/release.c, src/cmd/builtin/pty.c: - Add more uses of UNREACHABLE() and noreturn, this time for the build system and pty. src/cmd/builtin/pty.c, src/cmd/builtin/array.c, src/cmd/ksh93/sh/name.c, src/cmd/ksh93/sh/nvtype.c, src/cmd/ksh93/sh/suid_exec.c: - Fix six -Wunused-variable warnings (the name.c nv_arrayptr() fixes are also in ksh93v-). - Remove the unused 'tableval' function to fix a -Wunused-function warning. src/cmd/ksh93/sh/lex.c: - Remove unused 'SHOPT_DOS' code, which isn't enabled anywhere. https://github.com/att/ast/issues/272#issuecomment-354363112 src/cmd/ksh93/bltins/misc.c, src/cmd/ksh93/bltins/trap.c, src/cmd/ksh93/bltins/typeset.c: - Add dictionary generator function declarations for former aliases that are now builtins (re: `1fbbeaa1`, `ef1621c1`, `3ba4900e`). - For consistency with the rest of the codebase, use '(void)' instead of '()' for print_cpu_times. src/cmd/ksh93/sh/init.c, src/lib/libast/path/pathshell.c: - Move the otherwise unused EXE macro to pathshell() and only search for 'sh.exe' on Windows. src/cmd/ksh93/sh/xec.c, src/lib/libast/include/ast.h: - Add an empty definition for inline when compiling with C89. This allows the timeval_to_double() function to be inlined. src/cmd/ksh93/include/shlex.h: - Remove the unused 'PIPESYM2' macro. 
src/cmd/ksh93/tests/pty.sh: - Add '# err_exit #' to count the regression test added in commit `113a9392`. src/lib/libast/disc/sfdcdio.c: - Move diordwr, dioread, diowrite and dioexcept behind '#ifdef F_DIOINFO' to fix one -Wunused-variable warning and multiple -Wunused-function warnings (sfdcdio() only uses these functions when F_DIOINFO is defined). src/lib/libast/string/fmtdev.c: - Fix two -Wimplicit-function-declaration warnings on Linux by including sys/sysmacros.h in fmtdev().	2021-04-08 19:58:07 +01:00
Martijn Dekker	3667aa4f71	Make readonly variables exportable again (re: `264ba48b`) $ /usr/local/bin/ksh -c 'readonly v=1; export v' /usr/local/bin/ksh: export: v: is read only Every POSIX shell (even zsh, as of 5.8) allows this. So did ksh, until the referenced commit. src/cmd/ksh93/bltins/typeset.c: setall(): - Allow setting attributes on a readonly variable if any of NV_ASSIGN (== NV_NOFREE), NV_EXPORT or NV_RDONLY are the only flag bits that are set. This allows readonly, export, typeset -r, typeset -x, and typeset -rx on variable arguments without an assignment. Note that NV_ASSIGN is set for the first variable argument even though it is not an assignment, so we must allow it. The logic (or lack thereof) of that is yet to be worked out. src/cmd/ksh93/tests/readonly.sh: - Tests. Resolves: https://github.com/ksh93/ksh/issues/258	2021-04-08 06:40:25 +01:00
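The rule from commit 3667aa4f71 (an attribute change on a readonly variable is allowed only when the requested flags are limited to assign, export and readonly) amounts to a bitmask test. A minimal sketch, assuming made-up flag values: the `DEMO_*` constants and `demo_attr_change_allowed()` are hypothetical and do not reproduce the real NV_* definitions or the actual code in setall().

```c
#include <stdbool.h>

/* Hypothetical flag bits for illustration; the real NV_* values differ. */
enum
{
	DEMO_RDONLY  = 1 << 0,
	DEMO_EXPORT  = 1 << 1,
	DEMO_ASSIGN  = 1 << 2,
	DEMO_INTEGER = 1 << 3	/* stands in for any value-altering attribute */
};

/*
 * cur: attributes the variable already has
 * req: attributes the typeset/export/readonly call wants to set
 * Returns true if the request may proceed.
 */
static bool demo_attr_change_allowed(unsigned cur, unsigned req)
{
	const unsigned harmless = DEMO_RDONLY | DEMO_EXPORT | DEMO_ASSIGN;

	if (!(cur & DEMO_RDONLY))
		return true;			/* not readonly: nothing to protect */
	return (req & ~harmless) == 0;	/* readonly: only harmless bits allowed */
}
```

Under these assumptions, `readonly v=1; export v` passes the check, while a request carrying a value-altering attribute on a readonly variable is rejected.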
hyenias	264ba48bdd	Hardening of readonly variables (#239) Ksh currently restricts readonly scalar variables from having their values directly changed via a value assignment. However, since ksh allows variable attributes to be altered, the variable's value can be indirectly altered. For instance, if TMOUT=900 (for a 15 minute idle timeout) was set to readonly, all that is needed to alter the value of TMOUT from 900 to 0 is to issue 'typeset -R1 TMOUT', perhaps followed by a 'typeset -i TMOUT' to turn off the shell's timeout value. In addition, there are problems with arrays. The following is incorrectly allowed: typeset -a arr=((a b c) 1) readonly arr arr[0][1]=d arr=(alphas=(a b c);name=x) readonly arr.alphas arr.alphas[1]=([b]=5) arr=(alphas=(a b c);name=x) readonly arr.alphas arr.alphas[1]=(b) typeset -C arr=(typeset -r -a alphas=(a b c);name=x) arr.alphas[1]=() src/cmd/ksh93/bltins/typeset.c: setall(): - Relocate readonly attribute check higher up the code and widen its application to issue an error message if the pre-existing name-pair has the readonly bit flag set. - To avoid compatibility problems, don't check for readonly if NV_RDONLY is the only attribute set (ignoring NV_NOFREE). This allows 'readonly foo; readonly foo' to keep working. src/cmd/ksh93/sh/array.c: nv_endsubscript(): - Apply a readonly flag check when an array subscript or append assignment occurs, but allow type variables (typeset -T) as they utilize '-r' for 'required' sub-variables. src/cmd/ksh93/tests/readonly.sh: - New file. Create readonly tests that validate the warning message and validate that the readonly variable did not change. src/cmd/ksh93/sh/streval.c: - Bump MAXLEVEL from 9 to 1024 as a workaround for arithmetic expansion, avoiding a spurious error about too much recursion when the readonly.sh tests are run. This change is backported from ksh 93v-. TODO: debug a spurious increase in arithmetic recursion level variable when readonly.sh tests with 'typeset -i' are run. 
That is a different bug for a different commit. Co-authored-by: Martijn Dekker <martijn@inlv.org>	2021-04-05 06:43:19 +01:00
Johnothan King	c4f980eb29	Introduce usage of __builtin_unreachable() and noreturn (#248) This commit adds an UNREACHABLE() macro that expands to either the __builtin_unreachable() compiler builtin (for release builds) or abort(3) (for development builds). This is used to mark code paths that are never to be reached. It also adds the 'noreturn' attribute to functions that never return: path_exec(), sh_done() and sh_syntax(). The UNREACHABLE() macro is not added after calling these. The purpose of these is: * to slightly improve GCC/Clang compiler optimizations; * to fix a few compiler warnings; * to add code clarity. Changes of note: src/cmd/ksh93/sh/io.c: outexcept(): - Avoid using __builtin_unreachable() here since errormsg can return despite using ERROR_system(1), as shp->jmplist->mode is temporarily set to 0. See: https://github.com/att/ast/issues/1336 src/cmd/ksh93/tests/io.sh: - Add a regression test for the ksh2020 bug referenced above. src/lib/libast/features/common: - Detect the existence of either the C11 stdnoreturn.h header or the GCC noreturn attribute, preferring the former when available. - Test for the existence of __builtin_unreachable(). Use it for release builds. On development builds, use abort() instead, which crashes reliably for debugging when unreachable code is reached. Co-authored-by: Martijn Dekker <martijn@inlv.org>	2021-04-05 00:28:24 +01:00
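The pattern described in commit c4f980eb29 can be approximated as below. This is a sketch under assumptions, not the libast feature test: the `DEMO_RELEASE` switch, the `DEMO_UNREACHABLE()` and `DEMO_NORETURN` macros, and the `demo_*` functions are all hypothetical names chosen for illustration, and the compiler detection is simplified to a `__GNUC__` check.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical build switch: define DEMO_RELEASE to get the release behaviour. */
#if defined(DEMO_RELEASE) && defined(__GNUC__)
#define DEMO_UNREACHABLE()	__builtin_unreachable()	/* optimizer hint only */
#else
#define DEMO_UNREACHABLE()	abort()	/* development: crash reliably if reached */
#endif

/* Prefer the C11 keyword, fall back to the GCC attribute, else nothing. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
#define DEMO_NORETURN	_Noreturn
#elif defined(__GNUC__)
#define DEMO_NORETURN	__attribute__((noreturn))
#else
#define DEMO_NORETURN
#endif

/* A function that never returns to its caller. */
DEMO_NORETURN static void demo_fatal(const char *msg)
{
	fprintf(stderr, "fatal: %s\n", msg);
	exit(1);
}

static const char *demo_mode_name(int mode)
{
	switch (mode)
	{
	case 0:
		return "read";
	case 1:
		return "write";
	default:
		/* demo_fatal() is noreturn, so no DEMO_UNREACHABLE() follows it */
		demo_fatal("bad mode");
	}
}

static int demo_sign(int n)
{
	if (n > 0)
		return 1;
	if (n < 0)
		return -1;
	if (n == 0)
		return 0;
	/* The three tests above are exhaustive; mark the fall-through as such. */
	DEMO_UNREACHABLE();
}
```

With the abort() variant, a logic error that does reach the marked path stops the program at that point instead of invoking undefined behaviour, which matches the debugging rationale given in the commit message.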
Johnothan King	ca2443b58c	`cd -` shouldn't ignore `$OLDPWD` when in a new scope (#249) This bug was first reported at <https://github.com/att/ast/issues/8>. The 'cd' command currently takes the value of $OLDPWD from the wrong scope. In the following example 'cd -' will change the directory to /bin instead of /tmp: $ OLDPWD=/bin ksh93 -c 'OLDPWD=/tmp cd -' /bin src/cmd/ksh93/bltins/cd_pwd.c: - Use sh_scoped() to obtain the correct value of $OLDPWD. - Fix a use-after-free bug. Make the 'oldpwd' variable a static char that points to freeable memory. Each time cd is used, this variable is freed if it points to a freeable memory address and isn't also a pointer to shp->pwd. src/cmd/ksh93/sh/path.c: path_pwd(): - Simplify and add comments. - Scope $PWD properly. src/cmd/ksh93/tests/builtins.sh, src/cmd/ksh93/tests/leaks.sh: - Backport the ksh2020 regression tests for 'cd -' when $OLDPWD is set. - Add test for $OLDPWD and $PWD after subshare. - Add test for $PWD after 'cd'. - Add test for possible memory leak. - Add testing for 'unset' on OLDPWD and PWD. src/cmd/ksh93/COMPATIBILITY: - Add compatibility note about changes to $PWD and $OLDPWD. Co-authored-by: Martijn Dekker <martijn@inlv.org>	2021-04-02 01:19:19 +01:00


175 commits