ID | Category | Severity | Type | Date Submitted | Last Update
0001468 | [1003.1(2008)/Issue 7] Shell and Utilities | Editorial | Enhancement Request | 2021-04-24 15:20 | 2024-06-11 08:52
Reporter | mortoneccc | View Status | public
Assigned To | ajosey
Priority | normal | Resolution | Accepted
Status | Closed
Name | Ed Morton
Organization |
User Reference |
Section | awk
Page Number | 2493
Line Number | 80182-80184
Interp Status | ---
Final Accepted Text |
Summary | 0001468: awk FS definition not quite correct
Description |
(sorry, I don't see any page or line numbers in the online spec, hence the 1 and 1 used above).

In the definition of FS in the awk spec (https://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html) it says:

-----
The following describes FS behavior:

  If FS is a null string, the behavior is unspecified.

  If FS is a single character:

    If FS is <space>, skip leading and trailing <blank> and <newline> characters; fields shall be delimited by sets of one or more <blank> or <newline> characters.

    Otherwise, if FS is any other character c, fields shall be delimited by each single occurrence of c.

  Otherwise, the string value of FS shall be considered to be an extended regular expression. Each occurrence of a sequence matching the extended regular expression shall delimit fields.
-----

but that final case isn't exactly correct because an ERE can match a null string while a FS can't. Try for example splitting a record on all non-commas:

    $ echo 'x,y,z' | awk -F'[^,]*' '{for (i=1;i<=NF;i++) print i, "<"$i">"}'
    1 <>
    2 <,>
    3 <,>
    4 <>

which makes sense since there's a null string before the first non-comma (x), 2 commas around the 2nd non-comma (y) and a null string after the last non-comma (z). Now remove the "y" from the middle to get:

    $ echo 'x,,z' | awk -F'[^,]*' '{for (i=1;i<=NF;i++) print i, "<"$i">"}'
    1 <>
    2 <,,>
    3 <>

and note that the null string between the 2 commas which would match the regexp `[^,]*` isn't actually matched by the FS `[^,]*`.
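The contrast described above can be checked with a short sketch like the following. It is only an illustration, not part of the spec text: the variable names `ere_match` and `nf` are made up for this example, and the field count it prints depends on the awk implementation in use.

```shell
# As a plain ERE, [^,]* matches the null string: match() reports a match
# starting at position 1 with RLENGTH 0 when the input begins with a comma.
ere_match=$(awk 'BEGIN { print match(",,", /[^,]*/), RLENGTH }')
echo "match() RSTART and RLENGTH: $ere_match"

# The same ERE used as FS on the record from the report. If the null
# string between the two commas were treated as a separator, there would
# be more fields than the three shown in the report's output.
nf=$(echo 'x,,z' | awk -F'[^,]*' '{ print NF }')
echo "NF with FS=[^,]*: $nf"
```

If `match()` reports a zero-length match while NF still corresponds to the report's three fields, the regexp and the field separator are visibly being treated differently for the same pattern.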
Desired Action | Change the final paragraph of the FS definition mentioned above to say something like "Otherwise, the string value of FS shall be considered to be an extended regular expression such that each occurrence of a sequence **of one or more characters** matching the extended regular expression shall delimit fields."
Tags | tc3-2008
Attached Files |
There are no notes attached to this issue.