getline is specified to read “from the current input file” and to return 0 at end-of-file. Both the gawk and POSIX docs use this wording. It makes sense: data may be divided between files for a reason. The language is more expressive if getline can distinguish files. Information that is structured enough to warrant getline usually doesn’t cross file boundaries.
But both the GNU and macOS/BSD implementations hide the EOF and immediately open the next file. In doing so, they update FILENAME, which is not in the list of variables that either the GNU or the POSIX docs say getline affects.
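A minimal reproduction of what I'm seeing, as a sketch: the two sample files and their names (a.txt, b.txt) are made up for the demonstration, and the temporary directory is only there to keep it self-contained.

```shell
# Assumed sample files, created only for this demonstration.
tmp=$(mktemp -d)
printf 'a1\na2\n' > "$tmp/a.txt"
printf 'b1\nb2\n' > "$tmp/b.txt"

# On the very first record, drain the input with plain getline and count
# how many records it returns before reporting 0.
# If getline stopped at the end of a.txt, this would report a.txt and 1.
out=$(cd "$tmp" && awk 'NR == 1 { while ((getline) > 0) n++; print FILENAME, n }' a.txt b.txt)
printf '%s\n' "$out"

rm -rf "$tmp"
```

With the behavior described above, the loop runs straight through into b.txt, so I'd expect it to report b.txt and a count of 3 rather than a.txt and 1.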
The only workaround I see is to make sure that each file starts with a throwaway line, and to detect when FNR resets to 1. Yuck.
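Here is roughly what that workaround looks like, as a sketch only. It assumes every input file begins with a throwaway line (a lone # here, which is my own placeholder), and the file names and the per-file record count are just for illustration.

```shell
# Assumed sample files; the first line of each is the throwaway line.
tmp=$(mktemp -d)
printf '#\na1\na2\n' > "$tmp/a.txt"
printf '#\nb1\n' > "$tmp/b.txt"

out=$(cd "$tmp" && awk '
NR == 1 {
    # $0 is the throwaway line of the first file; read the rest via getline.
    file = FILENAME; count = 0
    while ((getline) > 0) {
        if (FNR == 1) {
            # getline silently moved to the next file and just consumed
            # its throwaway line, so close out the previous file here.
            print file, count
            file = FILENAME; count = 0
            continue
        }
        count++    # a real record belonging to the current file
    }
    print file, count
}' a.txt b.txt)
printf '%s\n' "$out"

rm -rf "$tmp"
```

The throwaway line matters because by the time FNR is seen to be 1, getline has already consumed that record; without it, the first real record of each file would have to be handled as a special case.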
It’s a strange coincidence for both implementations to have this bug. Looking at the source, neither behavior looks like an oversight. Both take specific steps to advance to the next file, in contrast to the code branch for getline from a named I/O handle. It’s especially weird for the verbose GNU docs to contradict the behavior.
Am I missing something? Have I stumbled across an uncommon case or is this known to Awk lore?