Austin Group Defect Tracker

ID 0000519
Category [1003.1(2008)/Issue 7] Shell and Utilities
Severity Objection
Type Enhancement Request
Date Submitted 2011-11-27 23:11
Last Update 2020-03-18 15:29
Reporter dwheeler
View Status public
Assigned To ajosey
Priority normal
Resolution Accepted As Marked
Status Applied
Name David A. Wheeler
Organization
User Reference
Section make
Page Number 2915
Line Number 95808
Interp Status ---
Final Accepted Text Note: 0001206
Summary 0000519: Add to make macro variable pattern substitution (expansion), e.g., $(foo:%.o=%.c)
Description POSIX make supports $(string1[:subst1=[subst2]]) and ${string1[:subst1=[subst2]]}. However, many make implementations also support macro variable pattern substitution using '%', which is not specified by POSIX make.

GNU make, SunPro make, and NetBSD make (at *least*) implement this. Reasons to add this include:
* It adds more capabilities. In particular, this allows matching and replacement of prefixes, or both prefixes and postfixes; POSIX make's variable substitutions only handle postfix information.
* It's so widely supported that many people use it even when they could use the POSIX standard's capabilities, simply because they *expect* this to be supported. Given its widespread implementation, this is not irrational, which suggests that it's time to add it to POSIX's make.

Note that if macro function $(patsubst) is supported, and subst1 has exactly 1 percent, this is equivalent to "$(patsubst subst1,subst2,$(VAR))". However, these kinds of variable value substitutions are so common that it's worth having a simple abbreviation for the common case of referencing a variable but using a single substitution. $(patsubst ...) is more flexible, because it can accept arbitrary values in its parameters including other functions, but it's also more wordy. Thus, I advocate supporting both: $(patsubst) for the general case, and macro variable pattern substitution as an abbreviation of a common case.

NOTE: At *least* NetBSD make and GNU make implement this matching as a *case-sensitive* match, and '%' can match zero characters. I confirmed this using this makefile:

$ cat makefile.subst
DEMO1 = hello.c
RESULT1 = $(DEMO1:%.c=%.o)
DEMO2 = .c
RESULT2 = $(DEMO2:%.c=%.o)
DEMO3 = X.C
RESULT3 = $(DEMO3:%.c=%.o)

all:
        @echo RESULT1 = $(RESULT1)
        @echo RESULT2 = $(RESULT2)
        @echo RESULT3 = $(RESULT3)


$ make -f makefile.subst
RESULT1 = hello.o
RESULT2 = .o
RESULT3 = X.C


$ bmake -f makefile.subst # NetBSD make
RESULT1 = hello.o
RESULT2 = .o
RESULT3 = X.C
Desired Action At the end of the paragraph "Macro expansions using the forms $(string1[:subst1=[subst2]]) or ${string1[:subst1=[subst2]]} can be used to replace all occurrences of subst1 with subst2 when the macro substitution is performed."

append:
"If subst1 contains a <percent>, then a pattern substitution will occur. The substitution will only occur if the subst1 pattern matches the entire value of the variable, where the first <percent> shall match zero or more characters. If there is a match in a pattern substitution, the result is subst2, where the first <percent> in subst2 (if present) is replaced by the characters that matched the <percent> of subst1. Matching is case-sensitive."
Tags issue8
Attached Files makefile-patterns.tar.gz (2,159 bytes) 2011-11-29 01:14

- Relationships
related to 0000865 (Closed) 1003.1(2013)/Issue7+TC1: Use of % or $ in subst1 while expanding $(string1:subst1=[subst2]) should be avoided

-  Notes
(0001057)
joerg (reporter)
2011-11-28 14:15

This feature was introduced in 1986 by SunPro make. I cannot speak for NetBSD make, as I know that all BSD makes implemented pattern macro replacements
in an incorrect way about 10 years ago. Approx. 7 years ago, I worked with one of the FreeBSD developers to fix the implementation in FreeBSD.

Besides SunPro make, pattern macro replacement is implemented correctly in my smake, and I did not find a bug in the gmake implementation.

Pattern macro replacement is more widespread than the gmake function calls. This
is why I would give it a higher precedence than macro function calls.

Here is the related part in the SunPro make man page:

  Pattern Replacement Macro References
     Pattern matching replacements can also be applied to macros,
     with a reference of the form:

            $(name:op%os=np%ns)

     where op is the existing (old) prefix and os is the existing
     (old) suffix, np and ns are the new prefix and new suffix,
     respectively, and the pattern matched by % (a string of zero
     or more characters), is carried forward from the value being
     replaced. For example:

       PROGRAM=fabricate
       DEBUG= $(PROGRAM:%=tmp/%-g)

     sets the value of DEBUG to tmp/fabricate-g.

     Notice that pattern replacement macro references cannot be
     used in the dependency list of a pattern matching rule; the
     % characters are not evaluated independently. Also, any
     number of % metacharacters can appear after the equal-sign.
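
As a rough illustration of the op%os=np%ns rule quoted above, here is a small Python model of how one word is matched and rewritten. It is only a sketch, not SunPro make's code: the function names are made up for this example, the pattern is assumed to contain at least one '%', and only the first '%' on the right is substituted.

```python
def match_pattern(word, pattern):
    """Return the text matched by the first '%' in pattern, or None if no match.

    Assumption: pattern contains a '%'. Text before it is the old prefix (op),
    text after it is the old suffix (os).
    """
    op, _, os_ = pattern.partition("%")
    long_enough = len(word) >= len(op) + len(os_)
    if long_enough and word.startswith(op) and word.endswith(os_):
        return word[len(op):len(word) - len(os_)]
    return None


def replace_word(word, pattern, replacement):
    """Rewrite one word; a word that does not match is returned unchanged."""
    stem = match_pattern(word, pattern)
    if stem is None:
        return word
    # Only the first '%' of the replacement is treated as special here.
    return replacement.replace("%", stem, 1)


print(replace_word("fabricate", "%", "tmp/%-g"))  # tmp/fabricate-g
print(replace_word("hello.c", "%.c", "%.o"))      # hello.o
print(replace_word("X.C", "%.c", "%.o"))          # X.C (matching is case-sensitive)
```

The length check keeps the old prefix and old suffix from overlapping inside a short word.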
(0001058)
joerg (reporter)
2011-11-28 14:46

Sorry, I have to correct myself. Gmake does not properly implement pattern macro
expansion:

AAA= $(XXX:%g=a%b%c)

XXX=BLAg

all:
        echo $(AAA)


should print: "aBLAbBLAc"

but it prints: "aBLAb%c"
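
The two results differ only in how '%' characters after the first one on the right hand side are treated. A minimal Python sketch of both readings follows; the function name and the every_percent flag are invented for illustration and do not come from any make implementation.

```python
def expand_replacement(stem, replacement, every_percent):
    """Build the new word from the text that '%' matched on the left (stem).

    every_percent=True models the result this note expects;
    every_percent=False models the result reported for gmake.
    """
    if every_percent:
        return replacement.replace("%", stem)
    return replacement.replace("%", stem, 1)


stem = "BLA"  # what '%' matches when the pattern "%g" is applied to "BLAg"
print(expand_replacement(stem, "a%b%c", every_percent=True))   # aBLAbBLAc
print(expand_replacement(stem, "a%b%c", every_percent=False))  # aBLAb%c
```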
(0001063)
dwheeler (reporter)
2011-11-29 00:07

Regarding #1057:
"I cannot speak for NetBSD make as I know that all BSD makes implemented pattern macro replacements in an incorrect way about 10 years ago."

I'm not sure what you mean by the "incorrect way". I did anticipate issues if there were more than one '%' symbol. I did some additional checking and found some interesting additional oddities in GNU make and NetBSD make. (I'll upload some scripts that test various cases.)

I think the issues that I found, at least, are easily handled in the spec. So here's more info & a proposed modification to the proposal. Quick bottom line: Let's leave unspecified what happens if there isn't exactly one '%' in the match or replacement pattern.

First, I've found that NetBSD make, at least older versions, doesn't work as I expected when there is no '%' in the replacement pattern. E.g., if XFOOC="x_foo.c", then $(XFOOC:%.c=no_percent) = x_foono_percent on NetBSD make version 20080515 (granted, that's a little old). I can't believe anyone actually *expects* or *depends* on this weird behavior of the NetBSD make when the replacement doesn't have a '%' at all. I think that *is* a bug. But probably it's not an important one - why not just insert the constant?

I also checked what happens when there is more than one '%' in the replacement pattern. In this case GNU make and BSD make agree: the second '%' is considered non-special. I understand why comment #1058 suggests that gmake "does not properly implement" this, but in this case, I think it's arguable. GNU make and NetBSD make actually seem to agree in this case (though apparently they disagree with SunPro make).

Both the GNU and BSD implementations also agree that '%' symbols after the first one in the pattern are not special and just match themselves.

HOWEVER, I think there's a simple solution. Practically all macro variable replacements involve exactly 1 percent sign on each side of the "=". So while I think a standard *can* declare a particular semantic as correct, in this case, let's just say it's *unspecified* if there are other than exactly 1 percent characters on either side.

If somebody wants to be more specific, great, but let's cover the 99% case now, and work to gain agreement on the edge cases later.

So here's an updated proposal, changing the text to use "shall" and adding that anything other than 1 percent sign on each side would be unspecified.

"If subst1 contains a <percent>, then a macro variable pattern substitution shall occur. A macro variable pattern substitution shall only occur if the subst1 pattern matches the entire value of the variable, where the first <percent> shall match zero or more characters. If there is no match, the original value of the macro variable is returned. If there is a match in a macro variable pattern substitution, the result is subst2, where the first <percent> in subst2 shall be replaced by the characters that matched the <percent> of subst1. The results of a macro variable pattern substitution are unspecified if there is not exactly one <percent> in subst1 and exactly one <percent> in subst2. Matching is case-sensitive."
(0001066)
joerg (reporter)
2011-11-29 09:56

Please note that we are talking about a feature that was introduced by
SunPro make in 1986.

When we try to define something that was introduced by SunPro make, we
should look at the specification and the behavior of the original implementation
instead of looking at incomplete/imperfect reimplementations - except when there might be
an obvious design bug in the original implementation. I cannot see such a bug,
so please let us agree on the original definition, which is also implemented by "smake".

With regard to *BSD make, IIRC, a problem was that there was no support for recursive expansion ("name", "op", "os", "np", "ns" contain macro calls) in older versions.

Another important issue for an implementation may be the equal sign. It should
be mentioned that the equal sign between "os" and "np" is the first equal sign
and that further equal signs should be treated as normal text.

More than one equal sign is needed, e.g., in expansions like this:

    $(EMAIL:%=EMAIL=%)

More than one percent may be needed in expansions like:

    $(LPATH:%=-R% -L%)

that expands a library path to -Rpath -Lpath in order to both set search paths
for the static linker and the runtime linker.
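
The "first equal sign is the separator" rule can be sketched in a few lines of Python. The helper name split_modifier is hypothetical, and the input is assumed to be the text after the colon with any macros already expanded.

```python
def split_modifier(modifier):
    """Split 'left=right' at the first '=' only; later '=' stay in the right part."""
    left, sep, right = modifier.partition("=")
    if not sep:
        raise ValueError("modifier has no '=' separator")
    return left, right


print(split_modifier("%=EMAIL=%"))   # ('%', 'EMAIL=%')
print(split_modifier("%=-R% -L%"))   # ('%', '-R% -L%')
```

Splitting once from the left is what lets the replacement text carry further equal signs as ordinary characters.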

In general, making too many things "unspecified" results in unusable features.
The current "include" specification in the POSIX standard, e.g., calls important
details "unspecified", so I cannot use my makefiles with a make implementation
that just implements the current POSIX standard.
(0001067)
joerg (reporter)
2011-11-29 10:06

Another important note on pattern macro substitutions:
they only make sense when they operate on word lists as expected:

CFILES= a.c b.c c.c

OFILES= $(CFILES:%.c=%.o)

all:
     echo $(OFILES)

is expected to print a.o b.o c.o

This is how most modern makefiles work, especially when they deal
with auto-dependencies.

So the current proposal for the desired action in the standard does not match
the needs.
(0001068)
joerg (reporter)
2011-11-29 10:32

And another important behavior that needs to be defined in the standard:

A pattern macro substitution may only occur when the left side (the text between
the colon and the first equal sign) matches at least one non-blank character.

The current proposal is in conflict with this important detail.

Note that

   $(LPATH:%=-L%)

does not expand to anything if LPATH is empty.
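
Both requirements, per-word application (from the previous note) and no output for an empty macro, fall out naturally if the value is split into words before matching. The Python model below is only a sketch under that assumption; pattern_subst is a made-up name, and the pattern is assumed to contain a '%'.

```python
def pattern_subst(value, pattern, replacement):
    """Apply a pattern replacement to each whitespace-separated word of value."""
    op, _, os_ = pattern.partition("%")
    out = []
    for word in value.split():  # an empty or blank value yields no words at all
        long_enough = len(word) >= len(op) + len(os_)
        if long_enough and word.startswith(op) and word.endswith(os_):
            stem = word[len(op):len(word) - len(os_)]
            word = replacement.replace("%", stem, 1)
        out.append(word)
    return " ".join(out)


print(pattern_subst("a.c b.c c.c", "%.c", "%.o"))        # a.o b.o c.o
print(pattern_subst("/usr/lib /opt/lib", "%", "-L%"))    # -L/usr/lib -L/opt/lib
print(repr(pattern_subst("", "%", "-L%")))               # ''
```

Because the loop body never runs for an empty value, no stray "-L" is produced.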
(0001069)
dwheeler (reporter)
2011-11-30 04:58

Regarding comment #1066:

> When we try to define something that has been introduced by SunPro make, we should look at the specification and the behavior of the original implementation instead of looking at incomplete/imperfect reimplementations...

I would instead suggest that, "whenever something is proposed, we should carefully examine all extant implementations, including the initial one". You're welcome to believe that reimplementations are incomplete/imperfect - in this particular case, I might even agree with you :-). But I think it's important to consider everyone fairly. Perhaps the later variations are better, or perhaps their 'imperfections' are adequate for user needs. Presuming that the first implementation is *always* correct seems like asking for trouble :-).

> With regard to *BSD make, IIRC, a problem was that there was no support for recursive expansion ("name", "op", "os", "np", "ns" contain macro calls) in older versions.

Ah! *That* is what you mean. GNU make does allow expansion there, and obviously SunPro make does too. The NetBSD make I tried actually DOES support recursive expansion, though only in certain contexts... but it does allow some.

GNU make *and* NetBSD make both support macro substitutions that include an embedded "=" by expanding the macros first, THEN looking for the "=". That is, both of them produce:
XFOOC = x_foo.c
FULLSUB= %.c=%.o
...
 $(XFOOC:$(FULLSUB)) = x_foo.o

There is a difference when "=" isn't inside the expansion, though. Given:
DOTC = .c
DOTO = .o
.... and asked to evaluate "$(XFOOC:$(DOTC)=$(DOTO))",

GNU make produces:
  $(XFOOC:$(DOTC)=$(DOTO)) = x_foo.o

NetBSD make 20080515 has trouble in this case. It prints this message (on stderr):
  bmake: Unknown modifier '.'
and it produces:
  $(XFOOC:$(DOTC)=$(DOTO)) = .o)

In this case I agree with you, the substitutions *SHOULD* occur. There are many cases where filename extensions vary, so you WANT to let makefile variables be used, and handling filename extensions is a common use.
I think there's a good argument that this is just an implementation bug, not an intentional semantic.

I'm going to write the proposal with the presumption of macro expansion. Perhaps people will disagree and it will be taken back out, and if so, that's what happens. But maybe not... perhaps we can just agree that it should do that, and be done with it.

> Another important issue for an implementation may be the equal sign. It should be mentioned that the equal sign between "os" and "np" is the first equal sign and that further equal signs should be treated as normal text.

Excellent point! Yes, that definitely should be added.

> In general, making too many things "unspecified", results in unusable features. The current "include" specification in the POSIX standard e.g. calls important details "unspecified", so I cannot use my makefiles with a make implementation that just implements the current POSIX standard.

I completely agree. In fact, I believe that POSIX make in *particular* suffers from leaving too many things unspecified. But in the case of $(var:old=new), I think it's extremely rare to need multiple '%' in either old or new. If we add standard mechanisms to invoke the shell to set variables and such (e.g., $(system) and "!="), then unusual cases like that can be handled anyway. So let's focus on the common case, and try to add the ability to call out to the shell and do immediate evaluation; once those are added, the need to do that lessens.


Regarding comment #1067:

> Another important note on pattern macro substitutions, they only make sense, when they operate on wordlists as expected:
> CFILES= a.c b.c c.c
> OFILES= $(CFILES:%.c=%.o)
> all:
> echo $(OFILES)
> is expected to print a.o b.o c.o

Drat, that's exactly what I meant, but not what I wrote. I was concentrating so much on what happened to a word that I forgot to specify that it applies to every word.

Thanks so much for pointing that out, I'll definitely fix that.



Comment #1068:

> A pattern macro substitution may only occur when the left side (the text between the colon and the first equal sign) matches at least one non-blank character.
> The current proposal is in conflict with this important detail.
> Note that
> $(LPATH:%=-L%)
> does not expand to anything in case that LPATH is "empty"

That's an interesting special case, and the use case makes sense to me. GNU make also does this (as well as SunPro make). NetBSD make does not, sadly. We could make this unspecified, but given the use case and significant implementation, it's better not to.

By the way, thanks for all the detailed analysis, I appreciate it.


So here's a modified proposal, where I try to take these comments into account.

On line 95805, replace the sentences after the first one with:
"Any macros on the right-hand-side of the colon shall be recursively expanded before further examination.
If there is more than one equal sign after the <colon> after this,
the first one shall be considered the separator. If subst1 does not have a <percent> character,
the subst1 to be replaced shall be recognized when it is a suffix at the end of a word in string1
(where a word, in this context, is defined to be a string delimited by the beginning of
the line, a <blank>, a <tab>, or a <newline>).
If subst1 contains one <percent>, then
the subst1 to be replaced shall be recognized every time it matches an entire word in string1
where the <percent> in subst1 matches zero or more characters; the <percent> in subst2 that replaces
the original word shall use the text that matched the <percent> in subst1.
The results are unspecified if there is more than one <percent> in either subst1 or subst2.
If the original value of variable named by string1 is an empty string, the final result shall be an empty string. Matching is case-sensitive."

More comments?
(0001071)
joerg (reporter)
2011-11-30 16:56

Regarding your comment on my comment #1068:

I just try to remember what features are needed in order to write my build system
that I started in 1992. From my point of view, the POSIX standard for make is only
useful in case it permits the minimal set of features I need in order to
implement my portable build system.

I started to write smake in the early 1980s in order to understand how a make
program works. It later turned out that smake is the only make implementation
that allows me to support more than 30 OS platforms. The basic idea behind POSIX
is not to make implementations portable but to make the program specifications
portable. Once the make specification is featureful enough and all OS platforms
implement it, the basic OS make implementation could be used instead of a
portable make implementation.

Wait - once POSIX make implements all needed features, it needs to define a
macro to indicate it is compliant and to permit to include the POSIX set of
rule files instead of using implementation specific rules.

In order to allow portable makefiles, it is essential, that constructs like:

 $(LPATH:%=-L%)

do not match on empty macros, because this is the way to implement many of the
needed abstractions.

Regarding your new proposal:

We should not mix the definition for suffix macro replacements with the
definition for pattern macro replacements. A macro replacement that does not
include a percent in the text between the colon and the first equal sign is
not a pattern macro replacement.

It seems to be better to just add a new paragraph after line 95808.
(0001196)
joerg (reporter)
2012-04-12 14:43
edited on: 2012-04-12 16:26

Let me make a new proposal for the change.

After the paragraph with the text:

"Macro expansions in string1 of macro definition lines shall be evaluated when read. Macro expansions in string2 of macro definition lines shall be performed when the macro identified by string1 is expanded in a rule or command."

insert new paragraphs with the following text:

Macro expansions using the forms $(string1:[op]%[os]=[np][%][ns]) or
${string1:[op]%[os]=[np][%][ns]} are called pattern macro expansions,
where "op" is the old prefix, "os" is the old suffix, "np" is the new
prefix and "ns" is the new suffix. Any item inside square brackets
is optional. The string "op%os" must completely match "string1"
or a word in "string1", where the % character in "op%os" matches
zero or more characters. Any word that does not match is copied
unmodified.

If more than one <percent> character appears before the equal sign, the
second and further <percent> characters are seen as literal part of "os".

If no <percent> character appears on the right hand side after
the equal sign, the matched string is removed from the
replacement. If a single <percent> character appears on the
right hand side, the characters matched by the <percent> on the
left hand side replace the <percent> character on the right hand
side. If more than one <percent> character appears after the
<equal-sign>, it is unspecified whether the first <percent>
character is replaced by the matched text and all remaining
<percent> characters are left unchanged, or each <percent>
character is replaced by the matched text.
 
In both macro expansion forms, any macros on the right-hand-side of the colon
shall be recursively expanded before further examination. If there is more
than one equal sign after the <colon> after this, the first one shall be
considered the separator. If the macro "string1" expands to a list of
white space separated words made of one or more characters, the expansion
applies to each of the words separately.
 
If the original value of variable named by string1 is an empty string,
the final result shall be an empty string. Matching is case-sensitive.

(0001206)
geoffclare (manager)
2012-04-16 14:44
edited on: 2014-09-18 16:02

[Rewording of Note: 0001196]

Undo the changes made in TC2 by 0000865 and make the following changes instead.

After the paragraph:

Macro expansions in string1 of macro definition lines shall be
evaluated when read. Macro expansions in string2 of macro definition
lines shall be performed when the macro identified by string1 is
expanded in a rule or command.

add the following new paragraphs:

Macro expansions using the forms $(string1:[op]%[os]=[np][%][ns]) or
${string1:[op]%[os]=[np][%][ns]} are called pattern macro expansions,
where "op" is the old prefix, "os" is the old suffix, "np" is the new
prefix and "ns" is the new suffix. Any item inside square brackets
is optional. With this form, when the macro string1 is expanded each
whitespace separated word that completely matches the "[op]%[os]"
pattern on the left hand side of the <equals-sign> ('='), where the
<percent> ('%') character matches zero or more characters, shall be
replaced by the right hand side of the <equals-sign> and shall then
be further modified according to the use of <percent> characters as
described below. Any words that do not match shall be unmodified in
the expansion.

If more than one <percent> character appears on the left hand side
of the <equals-sign> ('='), the second and subsequent <percent>
characters shall be treated as literal characters in "os".

If no <percent> character appears on the right hand side of the
<equals-sign>, no further modification of the word shall be performed.
If a single <percent> character appears on the right hand side, the
<percent> character in the word shall be replaced with the characters
matched by the <percent> on the left hand side. If more than one
<percent> character appears on the right hand side, it is unspecified
whether the first <percent> character in the word is replaced with the
characters matched by the <percent> on the left hand side and all
remaining <percent> characters are left unchanged, or each <percent>
character is replaced with the characters matched by the <percent> on
the left hand side.

In both macro expansion forms, any macro expansions on the right hand
side of the colon shall be recursively expanded before further
examination. If this results in more than one <equals-sign> after the
<colon>, the first one shall be considered to be the separator.

If the original value of the variable named by string1 is an empty
string, the final result shall be an empty string. Matching shall be
case-sensitive.
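
For readers who want to check their reading of the text above, here is an unofficial Python model of it. It is a sketch under stated assumptions, not normative and not taken from any implementation: the name expand and the every_percent flag are invented, the value and modifier are assumed to be already macro-expanded, the modifier is assumed to contain an '=', and its left hand side is assumed to contain a '%'.

```python
def expand(value, modifier, every_percent=False):
    """Model $(string1:[op]%[os]=[np][%][ns]) on an already-expanded value.

    every_percent picks one of the two behaviours the text leaves unspecified
    when more than one '%' appears on the right hand side.
    """
    lhs, _, rhs = modifier.partition("=")   # the first '=' is the separator
    op, _, os_ = lhs.partition("%")         # later '%' stay literal inside os
    count = -1 if every_percent else 1      # -1 makes str.replace replace all
    words = []
    for word in value.split():              # empty value: no words, empty result
        fits = len(word) >= len(op) + len(os_)
        if fits and word.startswith(op) and word.endswith(os_):
            stem = word[len(op):len(word) - len(os_)]
            word = rhs.replace("%", stem, count)
        words.append(word)                  # unmatched words pass through as-is
    return " ".join(words)


print(expand("x_foo.c README", "%.c=%.o"))        # x_foo.o README
print(expand("x_foo.c", "%.c=no_percent"))        # no_percent
print(expand("50%off 50%", "%%off=%"))            # 50 50%
print(expand("me@example.org", "%=EMAIL=%"))      # EMAIL=me@example.org
```

The model rejoins words with single spaces, which is a simplification; the text above does not say how whitespace between words is preserved.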

(0001209)
joerg (reporter)
2012-04-19 15:09
edited on: 2012-04-19 15:21

Should we mention that if all fields of [np][%][ns] are omitted,
the word is translated into nothing?


- Issue History
Date Modified Username Field Change
2011-11-27 23:11 dwheeler New Issue
2011-11-27 23:11 dwheeler Status New => Under Review
2011-11-27 23:11 dwheeler Assigned To => ajosey
2011-11-27 23:11 dwheeler Name => Your Name Here
2011-11-27 23:11 dwheeler Section => make
2011-11-27 23:11 dwheeler Page Number => 2915
2011-11-27 23:11 dwheeler Line Number => 95808
2011-11-28 14:15 joerg Note Added: 0001057
2011-11-28 14:46 joerg Note Added: 0001058
2011-11-29 00:07 dwheeler Note Added: 0001063
2011-11-29 01:14 dwheeler File Added: makefile-patterns.tar.gz
2011-11-29 09:56 joerg Note Added: 0001066
2011-11-29 10:06 joerg Note Added: 0001067
2011-11-29 10:32 joerg Note Added: 0001068
2011-11-30 04:58 dwheeler Note Added: 0001069
2011-11-30 16:56 joerg Note Added: 0001071
2012-04-12 14:43 joerg Note Added: 0001196
2012-04-12 15:59 joerg Note Edited: 0001196
2012-04-12 16:16 joerg Note Edited: 0001196
2012-04-12 16:19 joerg Note Edited: 0001196
2012-04-12 16:21 joerg Note Edited: 0001196
2012-04-12 16:26 joerg Note Edited: 0001196
2012-04-16 14:44 geoffclare Note Added: 0001206
2012-04-16 14:47 geoffclare Name Your Name Here => David A. Wheeler
2012-04-16 14:47 geoffclare Interp Status => ---
2012-04-19 15:09 joerg Note Added: 0001209
2012-04-19 15:20 joerg Note Edited: 0001209
2012-04-19 15:21 joerg Note Edited: 0001209
2012-04-19 15:26 geoffclare Final Accepted Text => Note: 0001206
2012-04-19 15:26 geoffclare Status Under Review => Resolved
2012-04-19 15:26 geoffclare Resolution Open => Accepted As Marked
2012-04-19 15:26 geoffclare Tag Attached: issue8
2014-08-12 10:55 geoffclare Relationship added related to 0000865
2014-09-18 16:02 geoffclare Note Edited: 0001206
2020-03-18 15:29 geoffclare Status Resolved => Applied

