Let’s say you have specified the second argument of the open function like this:
open my $fh, ">:via(File::BOM):encoding(UTF-8)", $file or die "Cannot open $file: $!";
Here :via(File::BOM) is specified first and :encoding(UTF-8) last.
With this order, the output string is cut off in the middle.
The following script attempts to write a UTF-8 text file with a BOM.
The contents should be the concatenation of the strings from “xxx yyyy1” to “xxx yyyy100”, incrementing the trailing number and delimited with “ / ”.
#!/usr/bin/perl
# bomTest.pl
use strict;
use warnings;
use File::BOM;
use feature 'say';
my $file = '/home/cf/Desktop/foo.txt';
open my $fh, ">:via(File::BOM):encoding(UTF-8)", $file or die "Cannot open $file: $!";
say $fh 'xxx yyyy1 / xxx yyyy2 / xxx yyyy3 / xxx yyyy4 / xxx yyyy5 / xxx yyyy6 / xxx yyyy7 / xxx yyyy8 / xxx yyyy9 / xxx yyyy10 / ' .
'xxx yyyy11 / xxx yyyy12 / xxx yyyy13 / xxx yyyy14 / xxx yyyy15 / xxx yyyy16 / xxx yyyy17 / xxx yyyy18 / xxx yyyy19 / xxx yyyy20 / ' .
... snip ...
'xxx yyyy71 / xxx yyyy72 / xxx yyyy73 / xxx yyyy74 / xxx yyyy75 / xxx yyyy76 / xxx yyyy77 / xxx yyyy78 / xxx yyyy79 / xxx yyyy80 / ' .
'xxx yyyy81 / xxx yyyy82 / xxx yyyy83 / xxx yyyy84 / xxx yyyy85 / xxx yyyy86 / xxx yyyy87 / xxx yyyy88 / xxx yyyy89 / xxx yyyy90 / ' .
'xxx yyyy91 / xxx yyyy92 / xxx yyyy93 / xxx yyyy94 / xxx yyyy95 / xxx yyyy96 / xxx yyyy97 / xxx yyyy98 / xxx yyyy99 / xxx yyyy100';
close $fh;
The output file foo.txt is a UTF-8 file with the BOM (0xEF 0xBB 0xBF) at the top, but the string is cut off in the middle, as shown below:
xxx yyyy1 / xxx yyyy2 / xxx yyyy3 / xxx yyyy4 / xxx yyyy5 / xxx yyyy6 / xxx yyyy7 / xxx yyyy8 / xxx yyyy9 / xxx yyyy10 / xxx yyyy11 / xxx yyyy12 / xxx yyyy13 / xxx yyyy14 / xxx yyyy15 / xxx yyyy16 / xxx yyyy17 / xxx yyyy18 / xxx yyyy19 / xxx yyyy20 / …snip… xxx yyyy71 / xxx yyyy72 / xxx yyyy73 / xxx yyyy74 / xxx yyyy75 / xxx yyyy76 / xxx yyyy77 / xxx yyyy78 / xxx yyyy79 / xxx yy
The output stops in the middle of “xxx yyyy80”.
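To put a number on the cut point, here is a minimal sketch that rebuilds the intended string and counts how many characters precede the cut. It assumes the snipped lines follow the same “xxx yyyyN” pattern as the visible ones; the variable names are only illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Rebuild the string the script is meant to write, ASSUMING the snipped
# lines follow the same "xxx yyyyN" pattern as the visible ones.
my $expected = join ' / ', map { "xxx yyyy$_" } 1 .. 100;

# The visible tail of the truncated output: item 79, its delimiter,
# and the first six characters of item 80.
my $tail = 'xxx yyyy79 / xxx yy';

# Offset of that tail in the expected string, plus its own length,
# gives the number of characters that reached the file.
my $pos     = index $expected, $tail;
my $written = $pos + length $tail;

printf "expected length: %d characters\n", length $expected;
printf "written before the cut: %d characters\n", $written;
```

Under that assumption it reports 1289 and 1024. The text is pure ASCII, so these are also byte counts (not counting the 3-byte BOM or the newline added by say). A cut at exactly 1024 may point to a buffer boundary somewhere in the layer stack, though that is only a guess.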
Now if you change the script like this:
open my $fh, ">:encoding(UTF-8):via(File::BOM)", $file or die "Cannot open $file: $!";
The only change is the order of the I/O layers:
:encoding(UTF-8) is now specified first and :via(File::BOM) last.
With this order the script runs to completion and the output reaches “xxx yyyy100”.
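Since the two versions differ only in layer order, it may help to look at the stack that each mode string actually builds. As far as I understand it, layers named in the mode string are pushed left to right, so the last one named ends up on top, closest to the program. The following is a minimal sketch using the built-in PerlIO::get_layers; the scratch file and the %stack hash are assumptions for illustration, and the two File::BOM modes are only tried if the module is installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file, so the real foo.txt is left alone.
my ($tmp, $path) = tempfile(UNLINK => 1);
close $tmp;

# Always inspect the plain encoding layer; add the two orders from the
# post only when File::BOM can be loaded.
my @modes = ('>:encoding(UTF-8)');
if (eval { require File::BOM; 1 }) {
    push @modes, '>:via(File::BOM):encoding(UTF-8)',
                 '>:encoding(UTF-8):via(File::BOM)';
}

my %stack;
for my $mode (@modes) {
    open my $fh, $mode, $path or die "Cannot open $path: $!";

    # get_layers lists the stack from the bottom (nearest the file)
    # to the top (nearest the program); output => 1 asks for the
    # output side of the handle.
    $stack{$mode} = [ PerlIO::get_layers($fh, output => 1) ];
    print "$mode\n  ", join(' -> ', @{ $stack{$mode} }), "\n";

    close $fh;
}
```

The exact layer names printed depend on the platform and the perl build, so no expected output is shown here.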
What causes this behavior?
Is it a bug in Perl’s encoding layer?
Or is it a bug in the File::BOM module?
Or is it the expected, documented behavior of both?