N1534 Updates to Character Constants

Nick Stoughton
2010-11-04

Background

The Austin Group, responsible for the maintenance of ISO/IEC 9945 POSIX, are making updates to the POSIX shell to support strings that are very similar to C character string literals. However, they are proposing to add two additional backslash escape sequences, namely \e to mean the Escape character in the current locale, and \cx to mean "Control-x".

This paper proposes matching character constants for C, to align with the Austin Group proposal. Such characters have been shown to be useful in existing practice within shells, and are somewhat awkward to express portably with the existing mechanisms.

Description

The description of character display semantics in C1x section 5.2.2 includes a list of the backslash escape sequences currently supported. This proposal adds the two new escape sequences to that list.
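For comparison, the following is a minimal sketch of how the escape character is commonly written with the existing numeric escape sequences. It assumes an ASCII-compatible execution character set, in which ESC has the code 27; the names esc_octal, esc_hex and is_escape are hypothetical and chosen only for illustration.

```c
#include <assert.h>  /* static_assert (C11) */

/* Assumption: ASCII-compatible execution character set, where ESC is 27. */
const char esc_octal[] = "\033";  /* octal escape sequence */
const char esc_hex[]   = "\x1b";  /* hexadecimal escape sequence */

/* Each literal holds one character plus the terminating null. */
static_assert(sizeof esc_octal == 2, "expected a single character");
static_assert(sizeof esc_hex == 2, "expected a single character");

/* Hypothetical helper: nonzero if c is ESC under the ASCII assumption. */
int is_escape(int c)
{
    return c == esc_octal[0];
}
```

Both spellings hardwire the numeric code of the character, so they are only correct for character sets that place ESC at that code. The proposed \e names the character rather than its code.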

Specific Wording Changes

Change in 5.2.1 paragraph 3, from

    In the basic execution character set, there shall be control characters representing alert, backspace, carriage return, and new line.

to

    In the basic execution character set, there shall be control characters representing alert, backspace, carriage return, new line and escape.

Add the following to 5.2.2 paragraph 2 in the correct alphabetic order:

    \cx The character representing "CONTROL-x", where x is a single alphabetic character, as described in section 5.2.2.1.
    \e The escape character

Add the following new subsection, numbered 5.2.2.1 before 5.2.3:

5.2.2.1 Control Characters

The following table describes a mapping between CONTROL characters, as introduced by \cx, and characters in the current character set.

NOTE - Many of these characters are not required to be a part of the basic execution character set. If the current character set does not include any equivalent character for any of these, an implementation-defined alternative character may be used.

Control Character    Actual Character
A, a                 <SOH>
B, b                 <STX>
C, c                 <ETX>
D, d                 <EOT>
E, e                 <ENQ>
F, f                 <ACK>
G, g                 <BEL>
H, h                 <BS>
I, i                 <HT>
J, j                 <LF>
K, k                 <VT>
L, l                 <FF>
M, m                 <CR>
N, n                 <SO>
O, o                 <SI>
P, p                 <DLE>
Q, q                 <DC1>
R, r                 <DC2>
S, s                 <DC3>
T, t                 <DC4>
U, u                 <NAK>
V, v                 <SYN>
W, w                 <ETB>
X, x                 <CAN>
Y, y                 <EM>
Z, z                 <SUB>
[                    <ESC>
\                    <FS>
]                    <GS>
_                    <US>
?                    <DEL>
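
As an illustration only, the mapping in the table above can be sketched as a small C function. The sketch assumes an ASCII-compatible execution character set, where each listed control character differs from its table key by the single bit 0x40; the function name control_char is hypothetical and is not part of the proposed wording.

```c
#include <assert.h>  /* static_assert (C11) */
#include <string.h>  /* strchr */

/* Assumption: ASCII-compatible execution character set. */
static_assert('A' == 0x41 && '[' == 0x5B && '?' == 0x3F,
              "this sketch assumes an ASCII-compatible character set");

/*
 * Hypothetical helper: returns the character that \cx would denote
 * for the table key x, or -1 if x does not appear in the table.
 */
int control_char(int x)
{
    /* The table treats upper-case and lower-case letters alike. */
    if (x >= 'a' && x <= 'z')
        x = x - 'a' + 'A';

    /* Letters A-Z, plus the five punctuation keys listed in the table. */
    if ((x >= 'A' && x <= 'Z') ||
        (x > 0 && x < 128 && strchr("[\\]_?", x) != NULL))
        return x ^ 0x40;  /* in ASCII, key and control code differ in bit 0x40 */

    return -1;
}
```

Under the stated assumption, control_char('G') yields the alert character and control_char('[') yields the escape character. On a character set that is not ASCII-compatible the bit manipulation does not apply, and an explicit lookup table following the mapping above would be needed instead.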