Austin Group Defect Tracker

ID 0001388
Category [1003.1(2016/18)/Issue7+TC2] Shell and Utilities
Severity Objection
Type Omission
Date Submitted 2020-08-11 14:17
Last Update 2021-02-12 16:47
Reporter geoffclare
View Status public
Assigned To
Priority normal
Resolution Accepted As Marked
Status Applied
Name Geoff Clare
Organization The Open Group
User Reference
Section yacc
Page Number 3456
Line Number 116732
Interp Status ---
Final Accepted Text Note: 0005220
Summary 0001388: yacc description does not say who declares yyerror() and yylex()
Description The description of yacc talks about the functions yyerror() and yylex() in various places, but nowhere does it state who is responsible for declaring them. This means that, in practice, a portable application has to declare them in the .y file (if it does not define them) in case yacc does not provide declarations of them in the code file and the compiler that will be used treats calls to undeclared functions as an error.

It doesn't take much thought about the reasons why prototypes were added to the C language to come to the conclusion that the greatly preferable solution is for yacc to be required to supply prototypes for yyerror() and yylex() that match the yyerror() definition in its library and the yylex() definition produced by lex, even (some might say especially) if the .y file includes definitions of those functions to be used instead of the library version of yyerror() and a lex-generated yylex().
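
A minimal sketch of the workaround described in the first paragraph above, written as plain C rather than as a grammar file so that it stands alone. The two leading declarations are what a portable grammar would place between %{ and %} when yacc itself writes none. demo_parse() is a hypothetical stand-in for the generated yyparse(), and the stub bodies and demo_ variables are illustrative assumptions. The int return type of yyerror() follows the yacc library definition in the standard at the time of this report.

```c
#include <stdio.h>

/* Would sit between %{ and %} in the declarations section of the .y file. */
int yylex(void);
int yyerror(const char *s);

/* Hypothetical stand-in for the parser that yacc generates: it calls
 * yylex() and yyerror(), so both must be declared before this point. */
static int demo_parse(void)
{
    int tokens = 0;
    while (yylex() != 0)    /* 0 signals end of input */
        tokens++;
    if (tokens == 0) {
        yyerror("syntax error");
        return 1;
    }
    return 0;
}

/* Stubs standing in for a lex-generated yylex() and the library yyerror(). */
static int demo_remaining = 3;  /* hypothetical token supply */
static int demo_errors = 0;     /* hypothetical error counter */

int yylex(void)
{
    return demo_remaining > 0 ? demo_remaining-- : 0;
}

int yyerror(const char *s)
{
    demo_errors++;
    fprintf(stderr, "%s\n", s);
    return 0;
}
```

Without the two leading declarations, a compiler that rejects calls to undeclared functions would stop at the first call inside demo_parse().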

There is also no statement about a declaration of main(), but the situation for main() is quite different from the above two functions. Although in theory an application could call yacc's library version of main() from code in a .y file, it is questionable why any application (other than a test suite) would do so, in particular because that version of main() does not accept any arguments and it calls exit() -- it does not return -- and therefore is of little use recursively. An application that provides its own main() could call it recursively, but can reasonably be expected to ensure it does not call main() without previously defining or declaring it. In addition, since main() has multiple different allowed prototypes, if yacc were to output a declaration it would have to be a non-prototype one:

int main();

so that there is no risk of a clash with a definition of main() that has a different prototype, but there does not seem much point in it producing such a declaration. (Or it could check whether the .y file contains a definition of main() and output a prototype declaration if it does not contain one, but that seems impractical.)

The simplest solution is just not to allow yacc to provide a declaration of main().
Desired Action On page 3456 line 116732 section yacc (Code File), change:
It also shall contain a copy of the #define statements in the header file.
to:
It also shall contain prototype declarations of the yyerror() and yylex() functions, and a copy of the #define statements in the header file, prior to any code copied from within <tt>%{</tt> and <tt>%}</tt> in the declarations section in grammar.

On page 3456 line 116734 section yacc (Code File), add a new paragraph:
The code file shall not contain a declaration of the main() function, unless one is present within <tt>%{</tt> and <tt>%}</tt> in the declarations section in grammar.

On page 3469 line 117335 section yacc (RATIONALE), add new paragraphs:
Earlier versions of this standard did not require the code file created by yacc to contain declarations of yyerror() and yylex(). This meant that portable applications that did not define them had to declare them in the grammar file, to ensure they would not be diagnosed by the compiler as being called without being declared, but this was not stated in those versions of the standard either. The standard developers decided it was preferable for yacc to include the declarations in the code file and this is now a requirement.

Earlier versions of this standard were also silent about a declaration of main(). However, the equivalent solution was not adopted since a declaration of main() would only be needed if it is called recursively by an application. Although in theory an application could call the yacc library version of main() from code in a grammar file, it is questionable why any application (other than a test suite) would do so, in particular because that version of main() does not accept any arguments and it calls exit() -- it does not return -- and therefore is of little use recursively. An application that includes its own definition of main() could call it recursively, but can reasonably be expected to ensure it does not call main() without previously defining or declaring it. An additional complication is that main() has multiple different allowed prototypes. The standard developers decided the simplest solution was not to allow yacc to provide a declaration of main() in the code file.

Tags issue8

-  Notes
(0004919)
shware_systems (reporter)
2020-08-11 16:53

A collateral issue is ensuring yyin is declared as a FILE *, if yyparse() is linked to a yylex() generated by lex, in a manner accessible to code calling yyparse(). This adds an implied requirement on the .c or .h file produced to do a #include <stdio.h> to get the defining declaration of FILE.

The example main() for lex assumes such a declaration of yyin is produced at the top or bottom of the %% section, but there is no requirement that this be an extern that the header produced by yacc could reference. The lex description says only that it shall be used, and nothing about how it is to be declared. The main() example for yacc doesn't declare it or assign stdin to it either, assuming that yylex() does this by default somehow on first call if generated by lex, or that a yylex() implemented in the programs section of the .y file accesses stdin directly.

Maybe this has been addressed as part of some other bug, but I don't remember any such discussion offhand.
(0004928)
Konrad_Schwarz (reporter)
2020-08-19 10:08

Aren't default implementations of these symbols defined in the -ly library?
(0004931)
shware_systems (reporter)
2020-08-19 16:42
edited on: 2020-08-19 16:44

Yes, but when a compiler requires that no main() be found in order to format an a.out as a dynamic library rather than a utility executable, liby and libl cannot be used, so that the compile doesn't see the main() in those libraries. As such, the output of yacc or lex needs to be compilable in a manner that doesn't require any library to provide those prototypes implicitly.

(0005205)
rhansen (manager)
2021-01-21 17:17

On page 3456 line 116732 section yacc (Code File), change:
It also shall contain a copy of the #define statements in the header file.
to:
It also shall contain function prototypes for the yyerror(), yylex(), and yyparse() functions, and a copy of the #define statements in the header file, prior to any code copied from within <tt>%{</tt> and <tt>%}</tt> in the declarations section in grammar.


On page 3456 line 116734 section yacc (Code File), add a new paragraph:
The code file shall not contain a declaration of the main() function, unless one is present within <tt>%{</tt> and <tt>%}</tt> in the declarations section in grammar.


On page 3469 line 117335 section yacc (RATIONALE), add new paragraphs:
Earlier versions of this standard did not require the code file created by yacc to contain declarations of yyerror(), yylex(), and yyparse(). This meant that portable applications that did not define them had to declare them in the grammar file, to ensure they would not be diagnosed by the compiler as being called without being declared, but this was not stated in those versions of the standard either. The standard developers decided it was preferable for yacc to include the declarations in the code file and this is now a requirement.

Earlier versions of this standard were also silent about a declaration of main(). However, the equivalent solution was not adopted because a declaration of main() would only be needed if it is called recursively by an application. Although in theory an application could call the yacc library version of main() from code in a grammar file, it is questionable why any application (other than a test suite) would do so, in particular because that version of main() does not accept any arguments and it calls exit()—it does not return—and therefore is of little use recursively. An application that includes its own definition of main() could call it recursively, but can reasonably be expected to ensure it does not call main() without previously defining or declaring it. An additional complication is that main() has multiple different allowed prototypes. The standard developers decided the simplest solution was to disallow yacc from providing a declaration of main() in the code file.
(0005206)
nick (manager)
2021-01-21 21:17

From Akim Demaille (maintainer of GNU Bison):


I agree that yyparse should be part of the declarations in the header
file (or in the implementation file if there is no header).

However I dislike that yylex and yyerror be prototyped. We don't do
it in Bison because:

- the user might decide that yylex be static, and it's actually not
  infrequent that it is. The user might want yylex to return an enum
  of the valid expected tokens. The parser itself simply does not
  care, and forcing a prototype onto the user will just be more
  constraints for her. She might even have played with #define
  yylex() to pass additional arguments, or re#defined yylex to foolex,
  etc.

  In my humble opinion the parser is merely a consumer of
  yylex, it is not the provider. It is up to the provider to provide
  the prototype. For instance *lex* should be in charge of declaring
  yylex, not Yacc.

- the user provides her yyerror and she is perfectly welcome to
  make it a variadic function that starts with a format string,
  instead of a plain string. The parser itself only calls this
  function with a single argument, but what if the user wanted
  something more generic for some other calls to yyerror that she
  wrote herself.

  Granted, there is something dangerous here if the parser were to
  pass a message with percent-signs in it. But Yacc does not
  issue such messages.

  Again, Yacc does not provide this function, it *requires* it.
  So I disagree that Yacc should provide the prototype.

Pushing prototypes onto the user restricts her freedom. Her
freedom to make things static, her freedom to use attributes
to defeat warnings about ignored arguments, her freedom to
simply comply with a contract the way she wants. Call it
duck-typing if you want, but my opinion is that the contract
between the user and yyparse is much weaker than what a
prototype would actually enforce.

(Not to mention things that are outside the scope of Yacc,
such as passing additional arguments to yylex to avoid, for
instance, the use of globals for semantic values. But yes,
this is out of the scope of Yacc.)


Note that since people are currently providing the prototypes
themselves, there's a good chance that the ones provided by
new Yaccs might be incompatible. There will be backward
compatibility issues. And portability issues when from one
machine to another you get–or don't get–the prototype,
because of varying versions of Yacc.



The routines provided in the library are merely an instance of
the set of possibles. And to be clear, I think -ly is vastly
useless. It is simply not worth the trouble to have to find a
library for main and yyerror. Someone who is ready to
spend time learning Yacc to generate a parser for a grammar is
certainly competent enough to write main and yyerror.


I also see references to yyin. I'm not sure I understand
exactly what is suggested, but yyin is out of scope. Yacc
does not care about yyin at all. yyparse knows nothing
about chars, it only knows about tokens. People routinely
make parsers of strings, not just FILE*. yyin is irrelevant
to Yacc and should not be referred to in Yacc's specifications.
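
The variadic yyerror() mentioned in the note above can be sketched as follows. This is an illustration only, not code from Bison or from any application: the buffer demo_last_error and its size are assumptions added so that the formatted result can be inspected.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical buffer holding the most recent formatted message. */
static char demo_last_error[128];

/* Application-defined yyerror() that takes a printf-style format string.
 * The parser calls it with one argument; application code may pass more. */
void yyerror(const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    vsnprintf(demo_last_error, sizeof demo_last_error, fmt, ap);
    va_end(ap);
    fprintf(stderr, "%s\n", demo_last_error);
}
```

A parser-style call such as yyerror("syntax error") and an application call such as yyerror("line %d: unexpected %s", 7, "x") both work with this definition. The hazard the note concedes is a single-argument call whose message contains a percent sign; passing the message through a "%s" format avoids it.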
(0005207)
dickey (reporter)
2021-01-21 22:17

Without a prototype, a C compiler cannot provide useful diagnostics when the number or type of parameters differs from what yacc requires.

Deciding whether to require that a prototype be static or extern is outside the scope of this question, because it's possible to provide prototyping information conditionally, e.g., by a macro which defines the return type and parameters. This is an example of what has been defined by Berkeley yacc starting in 2010:

/* Parameters sent to lex. */
#ifdef YYLEX_PARAM
# define YYLEX_DECL() yylex(void *YYLEX_PARAM)
# define YYLEX yylex(YYLEX_PARAM)
#else
# define YYLEX_DECL() yylex(void)
# define YYLEX yylex()
#endif

/* Parameters sent to yyerror. */
#ifndef YYERROR_DECL
#define YYERROR_DECL() yyerror(const char *s)
#endif
#ifndef YYERROR_CALL
#define YYERROR_CALL(msg) yyerror(msg)
#endif

Those are generated, and for some non-yacc features the macros differ, but provide an application developer with a way to refer to yylex and yyerror consistently with the parser's assumptions.
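
As a sketch of how an application might use these macros, the fragment below repeats the definitions quoted above and then declares both functions static. The storage class and return type are written by the application, while the name and parameter list come from the macros. demo_parse_step() and demo_errors are hypothetical names added for illustration, and YYLEX_PARAM is assumed to be undefined.

```c
/* Macros as quoted in the note above (YYLEX_PARAM is not defined here). */
#ifdef YYLEX_PARAM
# define YYLEX_DECL() yylex(void *YYLEX_PARAM)
# define YYLEX yylex(YYLEX_PARAM)
#else
# define YYLEX_DECL() yylex(void)
# define YYLEX yylex()
#endif

#ifndef YYERROR_DECL
#define YYERROR_DECL() yyerror(const char *s)
#endif
#ifndef YYERROR_CALL
#define YYERROR_CALL(msg) yyerror(msg)
#endif

/* Application side: forward declarations built from the macros. */
static int YYLEX_DECL();
static void YYERROR_DECL();

static int demo_errors = 0;     /* hypothetical error counter */

static int YYLEX_DECL()
{
    return 0;                   /* always reports end of input */
}

static void YYERROR_DECL()
{
    (void)s;                    /* the parameter name s comes from YYERROR_DECL */
    demo_errors++;
}

/* Hypothetical stand-in for the generated parser's use of the call macros. */
static int demo_parse_step(void)
{
    if (YYLEX == 0) {
        YYERROR_CALL("unexpected end of input");
        return 1;
    }
    return 0;
}
```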
(0005208)
eggert (reporter)
2021-01-22 23:04

Akim's points are persuasive. Bison formerly used Berkeley yacc-style YYLEX_PARAM etc. but this feature was awkward and was withdrawn in Bison 3.0 (2013) in favor of GNU extensions like %lex-param that are cleaner and more general. If POSIX were to standardize in this area, %lex-param would be a good thing to look at as part of a separate bug report.

In the meantime, the original proposal goes too far in that it prohibits users from declaring yylex and yyparse with APIs that work just fine even if they don't exactly match the yacc default. Instead, I suggest modifying the standard only to require a yyparse() prototype in the header and code files (which is fine as yacc generates the yyparse code, so it specifies yyparse's API), and to document existing practice for yylex and yyerror.

A proposed change follows. I don't have the PDF so page and line numbers are approximate.

-----

 On page 3456 line 116732 section yacc (Code File), change:

    It also shall contain a copy of the #define statements in the header file.

to:

    It also shall contain a copy of the #define statements and yyparse() declaration in the header file, prior to any code copied from within <tt>%{</tt> and <tt>%}</tt> in the declarations section in grammar.

 On page 345? line 116??? section yacc (Header File), add a new sentence:

    The header file shall also declare the yyparse() function, using a function prototype.

 On page 3456 line 116734 section yacc (Code File), add a new paragraph:

    The code file shall not contain a declaration of the main(), yyerror() or yylex() functions, unless such a declaration is present within the declarations or programs section.

On page 3469 line 117335 section yacc (RATIONALE), add new paragraphs:

    Because the code and header files do not declare the yyerror() or yylex() functions, a portable application should declare these functions in the declarations section, as these two functions are typically supplied by the application and it is the application's responsibility to declare them. The current standard follows historical practice in requiring the application to declare these functions even in the rare case where an application uses the yacc library.

    Earlier versions of this standard did not require the code or header file created by yacc to declare yyparse(), which meant that portable applications had to declare yyparse() themselves. The current standard lets applications declare yyparse() by including the header file.
(0005210)
geoffclare (manager)
2021-01-26 10:55

There are existing implementations that include declarations for yyerror() and yylex() in the code file. The ones we know about are Solaris and its derivatives (Illumos, etc.), but it may be all SVR4-derived implementations. This means that:

1. Forbidding these declarations in Issue 8, as suggested by Paul Eggert in Note: 0005208, is not a viable solution as it would make these implementations non-conforming.

2. Applications that include their own declaration of yyerror() or yylex() with a different prototype are already non-portable (unless they also define a macro of the same name - see below).

In yesterday's teleconference it was pointed out that Solaris protects these declarations with #ifndef to avoid a clash with any prior macro definition. This could provide the key to a compromise solution. If we require the declarations in Issue 8 but also require them to be protected so that they are not visible if a macro exists, then applications that include their own declaration of yyerror() or yylex() with a different prototype would just need to arrange for a macro of the same name to be defined in order to be portable to Issue 8 systems. This could be done by putting, e.g.
#define yylex yylex
in the declaration section or by adding -Dyylex=yylex to the compiler options used to compile y.tab.c. It could even be done just in the configuration step for such systems by adding the -D to the configured compiler options.
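
A minimal sketch of that compromise, assuming the code file writes its guarded declarations after the code copied from the declarations section. The application here keeps an older-style yyerror() that returns int, which would clash with a yacc-written void prototype; the counter demo_error_count and the stub bodies are hypothetical.

```c
#include <stdio.h>

/* As if copied from %{ ... %} in the declarations section. */
int yyerror(const char *msg);   /* application's own prototype */
#define yyerror yyerror         /* macro of the same name suppresses yacc's */

/* As if written by yacc after the copied code (assumed layout). */
#ifndef yyerror
void yyerror(const char *);     /* skipped: would clash with the int version */
#endif
#ifndef yylex
int yylex(void);                /* still visible: no macro named yylex exists */
#endif

/* Application definitions. */
static int demo_error_count = 0;    /* hypothetical counter */

int yyerror(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
    return ++demo_error_count;
}

int yylex(void)
{
    return 0;   /* always reports end of input */
}
```

Because the macro expands to its own name, every later use of yyerror is unchanged; its only effect is to make the #ifndef test fail.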

Of course the Bison maintainers could, if they wish, choose to have it write the yyerror() and yylex() declarations only if it is invoked by the yacc wrapper script (or with the option(s) passed by that script), or only if POSIXLY_CORRECT is set in the environment, in order to reduce the number of applications that need to do this.
(0005215)
eggert (reporter)
2021-01-28 03:33

I just now checked Solaris 10, and its yacc is a bit of a mess. First, for C it declares only yylex (its yyerror declaration is conditionalized on C++). Second, it declares neither the renamed yyerror nor the renamed yylex if yacc's -p option is used.

Berkeley yacc (2.0 20210109) is in a similar mess. It does not declare yyerror. It declares yylex, but not if you use -p or have #defined either yylex or YYSTATE.

We could use the "#define yylex yylex" trick as a compromise that will likely require minor changes to all three yacc implementations. The basic idea is as follows:

 * y.tab.c declares yyerror unless yyerror is already defined as a macro. (None of the three implementations declare yyerror now.)

 * y.tab.c declares yylex unless yylex is already defined as a macro. (Solaris yacc and byacc do this, Bison does not.)

 * y.tab.h declares yyparse (Bison does this, Solaris yacc and byacc do not.)

 * All of the "yy"s in the above are consistently changed to some other prefix P if the "-p P" option is used. (None of the three implementations do this now.)

Here’s a proposed change to implement this suggestion. I don't have the PDF so page and line numbers are approximate.
-----


 On page 3456 line 116732 section yacc (Code File), change:

    It also shall contain a copy of the #define statements in the header file. If a %union declaration is used, the declaration for YYSTYPE shall also be included in this file.

to:

    It also shall contain a copy of the declarations and #define directives of the header file, prior to any code copied from within <tt>%{</tt> and <tt>%}</tt> in the declarations section in grammar.

 On page 345? line 116??? section yacc (Header File), add a new sentence:

    The header file shall also contain a declaration that is compatible with ‘int yyparse (void);’.

 On page 3456 line 116734 section yacc (Code File), add a new paragraph:

    The code file shall also contain declaration that are compatible with ‘void yyerror(const char *);’ and ‘int yylex(void);’. These declarations shall be placed after any code copied from within <tt>%{</tt> and <tt>%}</tt> in the declarations section, and shall be omitted if that copied code defines the preprocessor macros yyerror and yylex, respectively, where the yy in the preprocessor macro names is replaced by sym_prefix if the -p sym_prefix option is used.

On page 3469 line 117335 section yacc (RATIONALE), add new paragraphs:

    If the yyerror() and yylex() functions are not defined within <tt>%{</tt> and <tt>%}</tt> in the declarations section or in the programs section, suggested practice is to declare these functions in a separate header file and include the file in the declarations section, followed by ‘#define yyerror yyerror’ and ‘#define yylex yylex’ if the identifiers are not already macros. This lets the separate header file be the definitive API for all code defining or using these functions.

    Earlier versions of this standard did not specify whether the code or header file created by yacc should declare yyerror(), yylex() and yyparse(). The current version specifies when and where these declarations should appear.
(0005216)
dickey (reporter)
2021-01-28 09:32
edited on: 2021-01-28 10:04

The comment about byacc is inaccurate. Here's a slice for the yyparse (the behavior with the -p option appears to be consistent with https://pubs.opengroup.org/onlinepubs/9699919799/utilities/yacc.html):

/* compatibility with bison */
#ifdef YYPARSE_PARAM
/* compatibility with FreeBSD */
# ifdef YYPARSE_PARAM_TYPE
# define YYPARSE_DECL() yyparse(YYPARSE_PARAM_TYPE YYPARSE_PARAM)
# else
# define YYPARSE_DECL() yyparse(void *YYPARSE_PARAM)
# endif
#else
# define YYPARSE_DECL() yyparse(void)
#endif

extern int YYPARSE_DECL();

which a compiler will see as "extern int yyparse(void)".

Regarding the on-topic details, yylex and yyerror are provided in byacc as macros (see my first comment), but not used directly as in that last line of the yyparse, because it would interfere with being able to make one or both of those static. Whether the macro names are suitable for standardization hasn't been brought up yet.

(0005217)
dickey (reporter)
2021-01-28 09:34

Short: the suggested change to "On page 3456 line 116734" does not take into account the possibility of declaring yylex/yyerror as "static".
(0005218)
dickey (reporter)
2021-01-28 10:02

typo to fix here: "The code file shall also contain declaration"
should be: "The code file shall also contain declarations" (missing plural)
(0005219)
dickey (reporter)
2021-01-28 10:03

The suggested change "On page 3469 line 117335 section" does not mention that those symbols may already be macros. Some expansion/revision is probably in order.
(0005220)
rhansen (manager)
2021-01-28 16:44
edited on: 2021-01-28 16:45

On page 3456 line 116732 section yacc (Code File), change:
It also shall contain a copy of the #define statements in the header file.
to:
It also shall contain a copy of the #define statements in the header file, prior to any code copied from semantic actions in grammar, and the following function prototypes for the yyerror(), yylex(), and yyparse() functions, after any code copied from within <tt>%{</tt> and <tt>%}</tt> in the declarations section in grammar and before any code copied from semantic actions in grammar:
void yyerror(const char *);
int yylex(void);
int yyparse(void);

The declarations of yyerror() and yylex() shall be protected by #ifndef or #if preprocessor statements such that each is only visible if a preprocessor macro with the name yyerror or yylex, respectively, is not already defined, where the yy in the macro names is replaced by sym_prefix if the -p sym_prefix option is used.
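
As an illustration of the sym_prefix wording, the following sketch shows one shape the declarations might take for a hypothetical invocation with -p calc. The prefix calc, the stub definitions, and the counter calc_errors are assumptions made for the example; an implementation may arrange the guards differently as long as the visibility rule holds.

```c
/* Assumed shape of the code-file declarations for a hypothetical
 * `yacc -p calc` run: yy becomes calc in function and guard names. */
#ifndef calcerror
void calcerror(const char *);
#endif
#ifndef calclex
int calclex(void);
#endif
int calcparse(void);    /* not guarded: only the other two are protected */

/* Illustrative stub definitions so that the fragment is complete. */
static int calc_errors = 0;     /* hypothetical error counter */

void calcerror(const char *msg)
{
    (void)msg;
    calc_errors++;
}

int calclex(void)
{
    return 0;           /* always reports end of input */
}

int calcparse(void)
{
    if (calclex() == 0) {
        calcerror("empty input");
        return 1;
    }
    return 0;
}
```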


On page 3456 line 116734 section yacc (Code File), add a new paragraph:
The code file shall not contain a declaration of the main() function, unless one is present within <tt>%{</tt> and <tt>%}</tt> in the declarations section in grammar.


On page 3456 line 116739 section yacc (Header File), add a new sentence:
The header file may also declare the yyparse() function, using a function prototype. It shall not declare the yyerror() and yylex() functions.


On page 3465 line 117139 section yacc (Yacc Library), change:
int yyerror(const char *s)
to:
void yyerror(const char *s)



On page 3467 line 117217 section yacc (APPLICATION USAGE), add new paragraph:
If yyerror() and yylex() are not defined within <tt>%{</tt> and <tt>%}</tt> in the declarations section as functions or macros, nor in the programs section as functions, recommended practice is to declare them as functions in a separate header file and include that file in the declarations section, followed by <tt>#define yyerror yyerror</tt> and <tt>#define yylex yylex</tt>. This lets the separate header file be the definitive API for all code defining or using these functions.


On page 3467 line 117239 section yacc (EXAMPLES), change:
int yyerror(const char *msg)
to:
void yyerror(const char *msg)


On page 3469 line 117335 section yacc (RATIONALE), add new paragraphs:
Earlier versions of this standard did not require the code file created by yacc to contain declarations of yyerror(), yylex(), and yyparse(). This meant that portable applications that did not define them had to declare them in the grammar file, to ensure they would not be diagnosed by the compiler as being called without being declared, but this was not stated in those versions of the standard either. The standard developers decided it was preferable for yacc to include the declarations in the code file and this is now a requirement. However, the declarations of yyerror() and yylex() are only visible if a macro of the same name is not defined, which provides application writers with a way to suppress the declaration if desired (for example, in order to provide their own declaration that would conflict with the one written by yacc()). These functions are not declared in the header file because a macro definition in the declaration section would not be able to suppress them there.

Earlier versions of this standard were also silent about a declaration of main(). However, the equivalent solution was not adopted because a declaration of main() would only be needed if it is called recursively by an application. Although in theory an application could call the yacc library version of main() from code in a grammar file, it is questionable why any application (other than a test suite) would do so, in particular because that version of main() does not accept any arguments and it calls exit()—it does not return—and therefore is of little use recursively. An application that includes its own definition of main() could call it recursively, but can reasonably be expected to ensure it does not call main() without previously defining or declaring it. An additional complication is that main() has multiple different allowed prototypes. The standard developers decided the simplest solution was to disallow yacc from providing a declaration of main() in the code file.


(0005236)
geoffclare (manager)
2021-02-12 16:47

When applying this bug, I removed the extraneous "()" from "written by yacc()".

- Issue History
Date Modified Username Field Change
2020-08-11 14:17 geoffclare New Issue
2020-08-11 14:17 geoffclare Name => Geoff Clare
2020-08-11 14:17 geoffclare Organization => The Open Group
2020-08-11 14:17 geoffclare Section => yacc
2020-08-11 14:17 geoffclare Page Number => 3456
2020-08-11 14:17 geoffclare Line Number => 116732
2020-08-11 14:17 geoffclare Interp Status => ---
2020-08-11 16:53 shware_systems Note Added: 0004919
2020-08-19 10:08 Konrad_Schwarz Note Added: 0004928
2020-08-19 16:42 shware_systems Note Added: 0004931
2020-08-19 16:44 shware_systems Note Edited: 0004931
2021-01-21 17:17 rhansen Note Added: 0005205
2021-01-21 21:17 nick Note Added: 0005206
2021-01-21 22:17 dickey Note Added: 0005207
2021-01-22 23:04 eggert Note Added: 0005208
2021-01-26 10:55 geoffclare Note Added: 0005210
2021-01-28 03:33 eggert Note Added: 0005215
2021-01-28 09:32 dickey Note Added: 0005216
2021-01-28 09:34 dickey Note Added: 0005217
2021-01-28 10:02 dickey Note Added: 0005218
2021-01-28 10:03 dickey Note Added: 0005219
2021-01-28 10:04 dickey Note Edited: 0005216
2021-01-28 16:44 rhansen Note Added: 0005220
2021-01-28 16:45 rhansen Note Edited: 0005220
2021-01-28 16:46 rhansen Final Accepted Text => Note: 0005220
2021-01-28 16:46 rhansen Status New => Resolved
2021-01-28 16:46 rhansen Resolution Open => Accepted As Marked
2021-01-28 16:47 rhansen Tag Attached: issue8
2021-02-12 16:47 geoffclare Note Added: 0005236
2021-02-12 16:47 geoffclare Status Resolved => Applied

