mirror of git://git.code.sf.net/p/cdesktopenv/code synced 2025-03-09 15:50:02 +00:00

printf %H: fix/reduce encoding into entities (re: 8477d2ce)

The &nbsp; entity is not valid in XML, only in HTML. Since we must
be compatible with both, it can't be used. Thanks to Andras Farkas
for the bug report.

In addition, the generation of numeric entities for unprintable
characters was only valid while processing UTF-8 text while in a
UTF-8 locale. In all other conditions it produced invalid results.
This is not worth trying to fix.

Discussion:
https://groups.google.com/d/msgid/korn-shell/CAA0nTRta%3DPbOYduyBv%3DXCzumTcUCU8Lki%3DQQf2O8Erk2BFvO1g%40mail.gmail.com

src/cmd/ksh93/bltins/print.c:
- Remove conversion to &nbsp; entity.
- Remove conversion of non-graph characters to numeric entities.
  Convert only the 5 semantically meaningful characters: < > & " '
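
A minimal sketch of the reduced encoding described above, not the actual
code from print.c. The helper names entity_for() and escape_markup() are
hypothetical, and the entity spellings are assumed to be the five
predefined XML entities, which HTML5 also accepts. Every other byte is
copied through unchanged, so the result does not depend on the locale.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the entity for c, or NULL if c is passed through. */
static const char *entity_for(char c)
{
	switch(c)
	{
	    case '<':	return "&lt;";
	    case '>':	return "&gt;";
	    case '&':	return "&amp;";
	    case '"':	return "&quot;";
	    case '\'':	return "&apos;";
	}
	return NULL;
}

/*
 * Hypothetical helper: copy src to dst, replacing only the five characters
 * above. Returns 0 on success, or -1 if dst (capacity dstsize) is too small.
 */
static int escape_markup(const char *src, char *dst, size_t dstsize)
{
	size_t used = 0;
	assert(src && dst);
	if(dstsize == 0)
		return -1;
	for(; *src; src++)
	{
		const char *ent = entity_for(*src);
		size_t n = ent ? strlen(ent) : 1;
		if(used + n + 1 > dstsize)	/* keep room for the terminating NUL */
			return -1;
		if(ent)
			memcpy(dst + used, ent, n);
		else
			dst[used] = *src;	/* everything else is copied as-is */
		used += n;
	}
	dst[used] = '\0';
	return 0;
}
```

Under these assumptions, calling escape_markup() on the input
<b>Tom & Jerry's</b> fills the buffer with
&lt;b&gt;Tom &amp; Jerry&apos;s&lt;/b&gt;.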

src/cmd/ksh93/include/defs.h,
src/cmd/ksh93/sh/string.c:
- We don't need sh_isprint() in print.c anymore, so turn it back
  into a static function.

src/cmd/ksh93/tests/builtins.sh:
- Update and trim regression tests.
Martijn Dekker 2020-08-11 07:08:44 +01:00
parent 61437b2728
commit e01801572d
4 changed files with 6 additions and 32 deletions

@@ -327,16 +327,13 @@ static char *sh_fmtcsv(const char *string)
 	return(stakptr(offset));
 }
 
-#if SHOPT_MULTIBYTE
 /*
- * Note: without SHOPT_MULTIBYTE, defs.h makes this an alias of isprint(3).
- *
  * Returns false if c is an invisible Unicode character, excluding ASCII space.
  * Use iswgraph(3) if possible. In the ksh-specific C.UTF-8 locale, this is
  * generally not possible as the OS-provided iswgraph(3) doesn't support that
  * locale. So do a quick test and do our best with a fallback if necessary.
  */
-int sh_isprint(int c)
+static int sh_isprint(int c)
 {
 	if(!mbwide())			/* not in multibyte locale? */
 		return(isprint(c));	/* use plain isprint(3) */
@@ -355,7 +352,6 @@ int sh_isprint(int c)
 		c == 0x3000 ||		/* ideographic space */
 		c == 0xFEFF));		/* zero-width non-breaking space */
 }
-#endif /* SHOPT_MULTIBYTE */
 
 /*
  * print <str> quoting chars so that it can be read by the shell