1
0
Fork 0
mirror of git://git.code.sf.net/p/cdesktopenv/code synced 2025-03-09 15:50:02 +00:00
No description
Find a file
Martijn Dekker 71934570bf Add --globcasedetect shell option for globbing and completion
One of the best-kept secrets of libast/ksh93 is that the code
includes support for case-insensitive file name generation (a.k.a.
pathname expansion, a.k.a. globbing) as well as case-insensitive
file name completion on interactive shells, depending on whether
the file system is case-insensitive or not. This is transparently
determined for each directory, so a path pattern that spans
multiple file systems can be part case-sensitive and part case-
insensitive. In more precise terms, each slash-separated path name
component pattern P is treated as ~(i:P) if its parent directory
exists on a case-insensitive file system. I recently discovered
this while dealing with <https://github.com/ksh93/ksh/issues/223>.

However, that support is dead code on almost all current systems.
It depends on pathconf(2) having a _PC_PATH_ATTRIBUTES selector.
The 'c' attribute is supposedly returned if the given directory is
on a case insensitive file system. There are other attributes as
well (at least 'l', see src/lib/libcmd/rm.c). However, I have been
unable to find any system, current or otherwise, that has
_PC_PATH_ATTRIBUTES. Google and mailing list searches yield no
relevant results at all. If anyone knows of such a system, please
add a comment to this commit on GitHub, or email me.

An exception is Cygwin/Windows, on which the "c" attribute was
simply hardcoded, so globbing/completion is always case-
insensitive. As of Windows 10, that is wrong, as it added the
possibility to mount case-sensitive file systems.

On the other hand, this was never activated on the Mac, even
though macOS has always used a case-insensitive file like Windows.
But, being UNIX, it can also mount case-sensitive file systems.

Finally, Linux added the possibility to create individual case-
insensitive ext4 directories fairly recently, in version 5.2.
https://www.collabora.com/news-and-blog/blog/2020/08/27/using-the-linux-kernel-case-insensitive-feature-in-ext4/

So, since this functionality latently exists in the code base, and
three popular OSs now have relevant file system support, we might
as well make it usable on those systems. It's a nice idea, as it
intuitively makes sense for globbing and completion behaviour to
auto-adapt to file system case insensitivity on a per-directory
basis. No other shell does this, so it's a nice selling point, too.

However, the way it is coded, this is activated unconditionally on
supported systems. That is not a good idea. It will surprise users.
Since globbing is used with commands like 'rm', we do not want
surprises. So this commit makes it conditional upon a new shell
option called 'globcasedetect'. This option is only compiled into
ksh on systems where we can actually detect FS case insensitivity.

To implement this, libast needs some public API additions first.

*** libast changes ***

src/lib/libast/features/lib:
- Add probes for the linux/fs.h and sys/ioctl.h headers.
  Linux needs these to use ioctl(2) in pathicase(3) (see below).

src/lib/libast/path/pathicase.c,
src/lib/libast/include/ast.h,
src/lib/libast/man/path.3,
src/lib/libast/Mamfile:
- Add new pathicase(3) public API function. This uses whatever
  OS-specific method it can detect at compile time to determine if
  a particular path is on a case-insensitive file system. If no
  method is available, it only sets errno to ENOSYS and returns -1.
  Currently known to work on: macOS, Cygwin, Linux 5.2+, QNX 7.0+.
- On systems (if any) that have the mysterious _PC_PATH_ATTRIBUTES
  selector for pathconf(2), call astconf(3) and check for the 'c'
  attribute to determine case insensitivity. This should preserve
  compatibility with any such system.

src/lib/libast/port/astconf.c:
- dynamic[]: As case-insensitive globbing is now optional on all
  systems, do not set the 'c' attribute by default on _WINIX
  (Cygwin/Windows) systems.
- format(): On systems that do not have _PC_PATH_ATTRIBUTES, call
  pathicase(3) to determine the value for the "c" (case
  insensitive) attribute only. This is for compatibility as it is
  more efficient to call pathicase(3) directly.

src/lib/libast/misc/glob.c,
src/lib/libast/include/glob.h:
- Add new GLOB_DCASE public API flag to glob(3). This is like
  GLOB_ICASE (case-insensitive matching) except it only makes the
  match case-insensitive if the file system for the current
  pathname component is determined to be case-insensitive.
- gl_attr(): For efficiency, call pathicase(3) directly instead of
  via astconf(3).
- glob_dir(): Only call gl_attr() to determine file system case
  insensitivity if the GLOB_DCASE flag was passed. This makes case
  insensitive globbing optional on all systems.
- glob(): The options bitmask needs to be widened to fit the new
  GLOB_DCASE option. Define this centrally in a new GLOB_FLAGMASK
  macro so it is easy to change it along with GLOB_MAGIC (which
  uses the remaining bits for a sanity check bit pattern).

src/lib/libast/path/pathexists.c:
- For efficiency, call pathicase(3) directly instead of via
  astconf(3).

*** ksh changes ***

src/cmd/ksh93/features/options,
src/cmd/ksh93/SHOPT.sh:
- Add new SHOPT_GLOBCASEDET compile-time option. Set it to probe
  (empty) by default so that the shell option is compiled in on
  supported systems only, which is determined by new iffe feature
  test that checks if pathicase(3) returns an ENOSYS error.

src/cmd/ksh93/data/options.c,
src/cmd/ksh93/include/shell.h:
- Add -o globcasedetect shell option if compiling with
  SHOPT_GLOBCASEDET.

src/cmd/ksh93/sh/expand.c: path_expand():
- Pass the new GLOB_DCASE flag to glob(3) if the
  globcasedetect/SH_GLOBCASEDET shell option is set.

src/cmd/ksh93/edit/completion.c:
- While file listing/completion is based on globbing and
  automatically becomes case-insensitive when globbing does, it
  needs some additional handling to make a string comparison
  case-insensitive in corresponding cases. Otherwise, partial
  completions may be deleted from the command line upon pressing
  tab. This code was already in ksh 93u+ and just needs to be
  made conditional upon SHOPT_GLOBCASEDET and globcasedetect.
- For efficiency, call pathicase(3) directly instead of via
  astconf(3).

src/cmd/ksh93/sh.1:
- Document the new globcasedetect shell option.
2021-03-22 18:45:19 +00:00
.github/workflows GitHub CI: also test with SHOPT_DEVFD off (re: 6d63b57d) 2021-03-13 16:31:35 +00:00
bin package: check for same compiler flags between build runs 2021-03-19 14:59:32 +00:00
docs Fix various minor problems and update the documentation (#237) 2021-03-21 14:39:03 +00:00
lib/package Fix a large number of typos and other problems (#110) 2020-08-07 00:50:11 +01:00
src Add --globcasedetect shell option for globbing and completion 2021-03-22 18:45:19 +00:00
.gitignore Build system tweaks; fix use of brk(2)/sbrk(2) feature test 2021-01-26 09:59:11 +00:00
LICENSE.md Top-level documentation tweaks 2021-02-20 23:19:50 +00:00
NEWS Add --globcasedetect shell option for globbing and completion 2021-03-22 18:45:19 +00:00
README.md Top-level documentation tweaks 2021-02-20 23:19:50 +00:00
TODO Top-level documentation tweaks 2021-02-20 23:19:50 +00:00

KornShell 93u+m

This repository is used to develop bugfixes to the last stable release (93u+ 2012-08-01) of ksh93, formerly developed by AT&T Software Technology (AST). The sources in this repository were forked from the GitHub AST repository which is no longer under active development.

For user-visible fixes, see NEWS and click on commit messages for full details. For all fixes, see the commit log. To see what's left to fix, see the issue tracker.

Policy

  1. No new features; bug fixes only (but see items 3 and 4). Feature development is for a future separate branch.
  2. No major rewrites. No refactoring code that is not fully understood.
  3. No changes in documented behaviour, except if required for compliance with the POSIX shell language standard which David Korn intended for ksh to follow.
  4. No 100% bug compatibility. Broken and undocumented behaviour gets fixed.
  5. No bureaucracy, no formalities. Just fix it, or report it: create issues, send pull requests. Every interested party is invited to contribute.
  6. To help increase everyone's understanding of this code base, fixes and significant changes should be fully documented in commit messages.
  7. Code style varies somewhat in this historic code base. Your changes should match the style of the code surrounding it. Indent with tabs, assuming an 8-space tab width. Opening braces are on a line of their own, at the same identation level as their corresponding closing brace. Comments always use /*...*/.
  8. Good judgment may override this policy.

Why?

Between 2017 and 2020 there was an ultimately unsuccessful attempt to breathe new life into the KornShell by extensively refactoring the last unstable AST beta version (93v-). While that ksh2020 branch is now abandoned and still has many critical bugs, it also had a lot of bugs fixed. More importantly, the AST issue tracker now contains a lot of documentation on how to fix those bugs, which made it possible to backport many of them to the last stable release instead. This ksh 93u+m reboot now incorporates many of these bugfixes, plus patches from OpenSUSE, Red Hat, and Solaris, as well as many new fixes from the community (1, 2). Though there are many bugs left to fix, we are confident at this point that 93u+m is already the least buggy branch of ksh93 ever released.

Build

To build ksh with a custom configuration of features, edit src/cmd/ksh93/SHOPT.sh.

Then cd to the top directory and run:

bin/package make

The compiled binaries are stored in the arch directory, in a subdirectory that corresponds to your architecture. The command bin/package host type outputs the name of this subdirectory.

If you have trouble or want to tune the binaries, you may pass additional compiler and linker flags. It is usually best to export these as environment variables before running bin/package as they could change the name of the build subdirectory of the arch directory, so exporting them is a convenient way to keep them consistent between build and test commands. Note that this system uses CCFLAGS instead of the usual CFLAGS. An example that makes Solaris Studio cc produce a 64-bit binary:

export CCFLAGS="-xc99 -m64 -O" LDFLAGS="-m64"
bin/package make

Alternatively you can append these to the command, and they will only be used for that command. You can also specify an alternative shell in which to run the build scripts this way. For example:

bin/package make SHELL=/bin/bash CCFLAGS="-O2 -I/opt/local/include" LDFLAGS="-L/opt/local/lib"

For more information run

bin/package help

Many other commands in this repo self-document via the --help, --man and --html options; those that do have no separate manual page.

Test

After compiling, you can run the regression tests. Start by reading the information printed by:

bin/shtests --man

Install

Automated installation is not supported. To install manually:

cp arch/$(bin/package host type)/bin/ksh /usr/local/bin/
cp src/cmd/ksh93/sh.1 /usr/local/share/man/man1/ksh.1

(adapting the destination directories as required).

What is ksh93?

The following is the official AT&T description from 1993 that came with the ast-open distribution. The text is original, but hyperlinks were added here.


KSH-93 is the most recent version of the KornShell Language described in "The KornShell Command and Programming Language," by Morris Bolsky and David Korn of AT&T Bell Laboratories, ISBN 0-13-182700-6. The KornShell is a shell programming language, which is upward compatible with "sh" (the Bourne Shell), and is intended to conform to the IEEE P1003.2/ISO 9945.2 Shell and Utilities standard. KSH-93 provides an enhanced programming environment in addition to the major command-entry features of the BSD shell "csh". With KSH-93, medium-sized programming tasks can be performed at shell-level without a significant loss in performance. In addition, "sh" scripts can be run on KSH-93 without modification.

The code should conform to the IEEE POSIX 1003.1 standard and to the proposed ANSI-C standard so that it should be portable to all such systems. Like the previous version, KSH-88, it is designed to accept eight bit character sets transparently, thereby making it internationally compatible. It can support multi-byte characters sets with some characteristics of the character set given at run time.

KSH-93 provides the following features, many of which were also inherent in KSH-88:

  • Enhanced Command Re-entry Capability: The KSH-93 history function records commands entered at any shell level and stores them, up to a user-specified limit, even after you log off. This allows you to re-enter long commands with a few keystrokes - even those commands you entered yesterday. The history file allows for eight bit characters in commands and supports essentially unlimited size histories.
  • In-line Editing: In "sh", the only way to fix mistyped commands is to backspace or retype the line. KSH-93 allows you to edit a command line using a choice of EMACS-TC or "vi" functions. You can use the in-line editors to complete filenames as you type them. You may also use this editing feature when entering command lines from your history file. A user can capture keystrokes and rebind keys to customize the editing interface.
  • Extended I/O Capabilities: KSH-93 provides several I/O capabilities not available in "sh", including the ability to:
    • specify a file descriptor for input and output
    • start up and run co-processes
    • produce a prompt at the terminal before a read
    • easily format and interpret responses to a menu
    • echo lines exactly as output without escape processing
    • format output using printf formats.
    • read and echo lines ending in "\".
  • Improved performance: KSH-93 executes many scripts faster than the System V Bourne shell. A major reason for this is that many of the standard utilities are built-in. To reduce the time to initiate a command, KSH-93 allows commands to be added as built-ins at run time on systems that support dynamic loading such as System V Release 4.
  • Arithmetic: KSH-93 allows you to do integer arithmetic in any base from two to sixty-four. You can also do double precision floating point arithmetic. Almost the complete set of C language operators are available with the same syntax and precedence. Arithmetic expressions can be used to as an argument expansion or as a separate command. In addition, there is an arithmetic for command that works like the for statement in C.
  • Arrays: KSH-93 supports both indexed and associative arrays. The subscript for an indexed array is an arithmetic expression, whereas, the subscript for an associative array is a string.
  • Shell Functions and Aliases: Two mechanisms - functions and aliases - can be used to assign a user-selected identifier to an existing command or shell script. Functions allow local variables and provide scoping for exception handling. Functions can be searched for and loaded on first reference the way scripts are.
  • Substring Capabilities: KSH-93 allows you to create a substring of any given string either by specifying the starting offset and length, or by stripping off leading or trailing substrings during parameter substitution. You can also specify attributes, such as upper and lower case, field width, and justification to shell variables.
  • More pattern matching capabilities: KSH-93 allows you to specify extended regular expressions for file and string matches.
  • KSH-93 uses a hierarchical name space for variables. Compound variables can be defined and variables can be passed by reference. In addition, each variable can have one or more disciplines associated with it to intercept assignments and references.
  • Improved debugging: KSH-93 can generate line numbers on execution traces. Also, I/O redirections are now traced. There is a DEBUG trap that gets evaluated before each command so that errors can be localized.
  • Job Control: On systems that support job control, including System V Release 4, KSH-93 provides a job-control mechanism almost identical to that of the BSD "csh", version 4.1. This feature allows you to stop and restart programs, and to move programs between the foreground and the background.
  • Added security: KSH-93 can execute scripts which do not have read permission and scripts which have the setuid and/or setgid set when invoked by name, rather than as an argument to the shell. It is possible to log or control the execution of setuid and/or setgid scripts. The noclobber option prevents you from accidentally erasing a file by redirecting to an existing file.
  • KSH-93 can be extended by adding built-in commands at run time. In addition, KSH-93 can be used as a library that can be embedded into an application to allow scripting.

Documentation for KSH-93 consists of an "Introduction to KSH-93", "Compatibility with the Bourne Shell" and a manual page and a README file. In addition, the "New KornShell Command and Programming Language" book is available from Prentice Hall.