View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0001973 | 1003.1(2024)/Issue8 | Shell and Utilities | public | 2026-03-06 07:22 | 2026-03-06 10:25 |
| Reporter | stephane | Assigned To | |||
| Priority | normal | Severity | Objection | Type | Clarification Requested |
| Status | New | Resolution | Open | ||
| Name | Stephane Chazelas | ||||
| Organization | |||||
| User Reference | |||||
| Section | awk utility | ||||
| Page Number | (page or range of pages) | ||||
| Line Number | (Line or range of lines) | ||||
| Interp Status | |||||
| Final Accepted Text | |||||
| Summary | 0001973: awk "string variables" origin | ||||
| Description | The awk specification (https://pubs.opengroup.org/onlinepubs/9799919799.2024edition/utilities/awk.html#tag_20_06_13_02) has: <<< A string value shall be considered a numeric string if it comes from one of the following: 1. Field variables 2. Input from the getline() function 3. FILENAME 4. ARGV array elements 5. ENVIRON array elements 6. Array elements created by the split() function 7. A command line variable assignment 8. Variable assignment from another numeric string variable >>> This can be interpreted as meaning that awk 'BEGIN{$1 = "10"; print ($1 > 2)}' should output 1, for instance, but no implementation that I know of does so. Once a string is assigned to $1, the field loses the special property whereby a value that looks like a number shall be considered a number. The same applies to ARGV, FILENAME... Typo in the rationale section, by the way: > also shall have the numeric value of the numeric string" was removed from several sections of the ISO POSIX-2:1993 standard because *is* specifies an unnecessary implementation detail ("is" should be "it"). | ||||
| Desired Action | Make it clear that it's: 1. the values resulting from the splitting of $0 into $1, $2... (upon first dereferencing after reading a record (including via getline) or after assigning to $0) that are candidates for numeric strings, not the field variables per se; or change the item to "Field variables unless subsequently assigned a string value". 3. the current input file's name as initially assigned to FILENAME, or "FILENAME unless subsequently assigned a string value". And so on for ARGV and ENVIRON. Alternatively, add some verbiage below that list along the lines of: > And the corresponding variables have not been subsequently assigned a string value. That still leaves it ambiguous for things like: $1 = "10"; $0 = "11 12"; print ($1 > 2) where $1 becomes a numeric string again after the assignment to $0. | ||||
| Tags | No tags attached. | ||||
|
|
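The cases from the description can be tried with a minimal sketch like the one below. It assumes an `awk` on `PATH`; the shell variable names are made up for illustration, and the outcomes mentioned in the comments are the ones stated in this report, so they are not guaranteed for every implementation.

```shell
#!/bin/sh
# Field read from input: "10" looks like a number, so it is a numeric
# string and compares numerically with 2.
from_input=$(echo 10 | LC_ALL=C awk '{ print ($1 > 2) }')

# Field assigned a string constant: per the report, implementations
# compare it as a string ("10" sorts before "2"), even though the list
# in the specification can be read as requiring a numeric comparison.
from_assignment=$(LC_ALL=C awk 'BEGIN { $1 = "10"; print ($1 > 2) }')

# Assigning to $0 re-splits the record, so per the report $1 becomes a
# numeric string again.
after_resplit=$(LC_ALL=C awk 'BEGIN { $1 = "10"; $0 = "11 12"; print ($1 > 2) }')

printf 'field from input:         %s\n' "$from_input"
printf 'field assigned "10":      %s\n' "$from_assignment"
printf 'field after $0 re-split:  %s\n' "$after_resplit"
```

A value of 1 means the comparison was numeric and 0 means it was a string comparison.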
It may also be worth clarifying (in a separate ticket?) that in sub(ere, repl[, in ]) or gsub(ere, repl[, in ]), if "in" (or $0 if omitted) was a numeric string and at least one substitution has been made, then it becomes a non-numeric string even if it contains the valid representation of a number. For instance: printf '%s\n' 12 13 | awk '{gsub("2", "2")}; $0 > 2' should output only 13: 12 is successfully substituted with 12, making it a string, which is not greater than "2", while 13 remains a numeric string because no substitution was made. |
|
|
For context, that came up at https://unix.stackexchange.com/questions/804798/awk-comparing-to-constant-numbers |
|
|
> 1. the values resulting from the splitting of $0 into $1, $2... Sorry, that wording is insufficient, as it doesn't cover $0 itself: there, it is the assignment from input (the current record, or via getline) that makes the value a candidate numeric string. For the case where $0 is recomputed when individual fields are modified, I find that the behaviour varies between implementations. echo 10 | LC_ALL=C awk '{$1 = $1}; $0 > 2' outputs 10 in mawk, but not in busybox, GNU, or bwk's `awk`, while echo 10 | LC_ALL=C awk -v OFS=. '{$2 = 3}; $0 > 2' produces no output in any of them. |
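A small probe along these lines can show how the local `awk` treats a recomputed $0. It is only a sketch: the `probe` helper and the `numeric`/`string` labels are hypothetical names introduced here, and the result for the first probe is expected to differ between implementations, as described above.

```shell
#!/bin/sh
# probe ARGS...: feed the single line "10" to awk with the given
# arguments. The awk program is expected to end with the pattern
# `$0 > 2`, so any output means $0 was compared as a number.
probe() {
    if [ -n "$(echo 10 | LC_ALL=C awk "$@")" ]; then
        echo numeric
    else
        echo string
    fi
}

# $0 rebuilt by assigning a field to itself (varies, per the note above).
self_assign=$(probe '{$1 = $1}; $0 > 2')

# $0 rebuilt by adding a new field, giving "10.3" with OFS set to ".".
new_field=$(probe -v OFS=. '{$2 = 3}; $0 > 2')

printf '$1 = $1:  $0 compared as %s\n' "$self_assign"
printf '$2 = 3:   $0 compared as %s\n' "$new_field"
```

Running it against several implementations (for example by putting a different `awk` first on `PATH`) gives a quick comparison.
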
| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2026-03-06 07:22 | stephane | New Issue | |
| 2026-03-06 08:01 | stephane | Note Added: 0007389 | |
| 2026-03-06 09:59 | stephane | Note Added: 0007391 | |
| 2026-03-06 10:20 | stephane | Note Added: 0007392 | |
| 2026-03-06 10:21 | stephane | Note Edited: 0007392 | |
| 2026-03-06 10:25 | stephane | Note Edited: 0007389 |