Perl’s glob doesn’t find a file when I think it should

Below is the code that processes the lines of the file new.txt.

my $bankNameTest;
my $bankName;
my @lines;

my $document = do {
    local $/ = undef;
    open my $fh, "<", "new.txt"
        or die "could not open $file: $!";
    <$fh>;
};

chomp ($document);
print "$document is doc\n";
@lines = split (/\n/,$document);


foreach my $test (@lines) {
    $bankNameTest=glob ("${test}/wint_nightly_nfarm_*.main.20241216.4262507") ;
    chomp ($bankNameTest);
    print "$bankNameTest is testbankname\n";
    $bankName=`ls -d $bankNameTest | sed -r "s?${var}/??g" | sed -r "s/wint_nightly_nfarm_//g" | sed -r "s/.main.20241216.4262507//g"`;
    chomp ($bankName);
    print "$bankName is the bankName\n";
}

Content of new.txt is :

_qa/SERVFARM/CAPTURE_DB/CaptureViewer/2487964_brd_launch/cap_allegro_view
_qa/SERVFARM/CAPTURE_DB/CaptureViewer/FIND/enh_find/DRC_MARKERS/hierdsn_incomp_entry
_qa/SERVFARM/CAPTURE_DB/CaptureViewer/crossprobe/03_between_cap_allegroviewer/dsn_schm_page

Output coming is:

_qa/SERVFARM/CAPTURE_DB/CaptureViewer/2487964_brd_launch/cap_allegro_view/wint_nightly_nfarm_capture_viewer.main.20241216.4262507 is testbankname
capture_viewer is the bankName
 is testbankname
 is the bankName
_qa/SERVFARM/CAPTURE_DB/CaptureViewer/crossprobe/03_between_cap_allegroviewer/dsn_schm_page_and_part_name_with_special_chars/wint_nightly_nfarm_capture_viewer.main.20241216.4262507 is testbankname
capture_viewer is the bankName

Here, the second line of new.txt is not processed correctly.

Expected Output:

_qa/SERVFARM/CAPTURE_DB/CaptureViewer/2487964_brd_launch/cap_allegro_view/wint_nightly_nfarm_capture_viewer.main.20241216.4262507 is testbankname
capture_viewer is the bankName
_qa/SERVFARM/CAPTURE_DB/CaptureViewer/FIND/enh_find/DRC_MARKERS/hierdsn_incomp_entry/wint_nightly_nfarm_capture_viewer.main.20241216.4262507 is testbankname
capture_viewer is the bankName
_qa/SERVFARM/CAPTURE_DB/CaptureViewer/crossprobe/03_between_cap_allegroviewer/dsn_schm_page_and_part_name_with_special_chars/wint_nightly_nfarm_capture_viewer.main.20241216.4262507 is testbankname
capture_viewer is the bankName

I'm not sure what the issue is. It seems to be some character-related issue in new.txt. I need the output to match what is shown under “Expected Output”.


Update:

I just realized that glob does not update, even if the variable inside the pattern changes. This is a fairly subtle feature, which caused your program to behave in an unexpected way. So I thought I would add a more elaborate explanation.

To test this, I ran the following oneliner:

$ perl -lwe'for (qw(a b c)) { $x = glob "$_*"; print $x; }'

This loops over “a”, “b” and “c” and executes a glob with a string containing the loop variable. One might assume that this would use a different starting letter for each iteration, first “a” then “b” and then “c”. But in fact, this is printed:

a
a.csv
a.dat

The first pattern was used, but then the pattern was never updated.

When used in scalar context, glob creates an iterator, and each succeeding usage of the same pattern returns the next item on the list first created. The iterator is used until it is exhausted, after which it returns undef, and a new pattern is allowed.

That is why you have a failed second result. The pattern never changes, because the glob is in iteration mode. It doesn’t check the pattern for an updated variable, it checks the first pattern. Since there was only one file to be found, the second iteration returns undef, which prints as the empty string, but would give a warning under use warnings.

Since I loop over the glob in my code (list context), this bug is not present, even though I use glob. It also explains why your code started working when you switched to an array, @bankNameTest.
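The iterator behaviour described above can be reproduced in isolation. The following is a minimal sketch, not the asker's setup: the directory names `first` and `second`, the `data_*.txt` files, and the helper subs `scalar_glob` and `list_glob` are all made up for illustration, and it assumes the temporary directory path contains no whitespace or glob metacharacters.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical test layout: two directories, one matching file in each.
my $root = tempdir(CLEANUP => 1);
for my $name (qw(first second)) {
    mkdir "$root/$name" or die "mkdir failed: $!";
    open my $fh, '>', "$root/$name/data_$name.txt" or die "open failed: $!";
    close $fh;
}

# Scalar context: the single glob call site keeps one iterator,
# so the second pass continues the (already exhausted) first list.
sub scalar_glob {
    my @found;
    for my $name (qw(first second)) {
        my $hit = glob("$root/$name/data_*.txt");    # scalar context
        push @found, defined $hit ? 'found' : 'undef';
    }
    return @found;
}

# List context: the pattern is expanded afresh on every call.
sub list_glob {
    my @found;
    for my $name (qw(first second)) {
        my ($hit) = glob("$root/$name/data_*.txt");  # list context
        push @found, defined $hit ? 'found' : 'undef';
    }
    return @found;
}

print "scalar: @{[ scalar_glob() ]}\n";   # scalar: found undef
print "list:   @{[ list_glob() ]}\n";     # list:   found found
```

The only difference between the two subs is the parentheses around `$hit`, which turn the assignment into a list assignment and so put glob in list context.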


Original answer:

Inspired by brian’s answer I managed to simplify the code a bit more.

The exact mistakes were covered by brian, so I won’t repeat them. I felt that the suggested solution could be improved, which is why I am adding this. Credit to brian for doing the groundwork for this answer.

Firstly, I feel a glob might well be the better solution. We can sense that the number of files in the directory might be large, so it might be unnecessary to loop through them all. Though I am not certain that glob on the system side is faster than Perl’s readdir. It may well be a case of similar efficiency.

Secondly, the original code assumes one bank name. But there are issues with that. If we have more than one match, should we only report one? And in that case, which one? I opted for reporting all matches in the order they are found.

use strict;
use warnings;

my $file = 'new.txt';
open my $fh, '<', $file or die "Cannot open '$file': $!";

for my $dir (<$fh>) {
    chomp $dir;
    print "$dir\n-----------------\n";
    for my $match (glob ("$dir/wint_nightly_nfarm_*.main.20241216.4262507")) {
        my ($bank) = $match =~ /wint_nightly_nfarm_(.+)\.main\.20241216\.4262507/;
        print "$bank is the bank\n" if $bank;
    }
}

This will output something like this (on my system with one sample):

_qa/SERVFARM/CAPTURE_DB/CaptureViewer/2487964_brd_launch/cap_allegro_view
-----------------
BANKOFSTACKOVERFLOW is the bank
_qa/SERVFARM/CAPTURE_DB/CaptureViewer/FIND/enh_find/DRC_MARKERS/hierdsn_incomp_entry
-----------------
_qa/SERVFARM/CAPTURE_DB/CaptureViewer/crossprobe/03_between_cap_allegroviewer/dsn_schm_page
-----------------

But you do not need to hardcode the new.txt filename. You can simply use the diamond operator, like this:

for my $dir (<>) {   # diamond operator accepts file arg or stdin
    chomp $dir;
    print "$dir\n-----------------\n";
    for my $match (glob ("$dir/wint_nightly_nfarm_*.main.20241216.4262507")) {
        my ($bank) = $match =~ /wint_nightly_nfarm_(.+)\.main\.20241216\.4262507/;
        print "$bank is the bank\n" if $bank;
    }
}

Which can be run in many different ways, for example:

$ perl program.pl new.txt
$ someotherprogram | perl program.pl
$ ls -d file1 file2 file3* | perl program.pl
etc...

Either use a file name argument, or standard input.

First, let’s simplify some of the program structure.

You want to process a file’s lines, but you do a lot of work to slurp all of the contents into a single value only to split it again. All of that code is just this while loop:

my $file = 'new.txt';
open my $fh, '<', $file or die "could not open $file: $!";
while( <$fh> ) {
    chomp;
    ... 
    }   

Now, the stuff inside the loop is a bit more tricky because I think you might be doing everything the hard way.

It looks like you want to find a single file where you might not know what’s in the middle of the name. You have this glob pattern:

${test}/wint_nightly_nfarm_*.main.20241216.4262507

My first thought on seeing two lines in your output that don’t have any filename is that this pattern doesn’t match a file in one of the directories. Did you look inside that directory and see the file you were expecting?

With no match from the glob, $bankNameTest is empty and that pretty much ruins the rest of the loop.

Whenever you interact with something outside of your program, check that it worked before using the result:

my $file = glob(...);
unless( defined $file ) { ... }

Without a filename, you cannot continue the iteration. You might want to output a warning then skip everything else:

my $file = glob(...);
unless( defined $file ) {
    warn "Did not find file";
    next;
    }

And, this isn’t just about the context of the call to glob. If you don’t find the file in the list context case, you still have this problem:

my( $file ) = glob(...);  # file not there 
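Putting the two points together (list context plus a check on the result), a small sketch might look like this. The helper name `first_match` is made up for illustration, the directories are taken from the command line rather than from new.txt, and directory names are assumed to contain no whitespace or glob metacharacters.

```perl
use strict;
use warnings;

# Return the first file in $dir matching $pattern, or undef if none.
# The list assignment forces list context, so no iterator state
# is carried over from one call to the next.
sub first_match {
    my ($dir, $pattern) = @_;
    my ($file) = glob("$dir/$pattern");
    return $file;
}

for my $dir (@ARGV) {
    my $file = first_match($dir, 'wint_nightly_nfarm_*.main.20241216.4262507');
    unless( defined $file ) {
        warn "Did not find file in <$dir>\n";
        next;
    }
    print "$file\n";
}
```

Because the pattern contains a wildcard, glob returns an empty list when nothing matches, so `$file` is undef and the `defined` check catches it.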

After that, in the cases where the glob does work, you have some more trickery.

$bankName=`ls -d $bankNameTest | sed -r "s?${var}/??g" | sed -r "s/wint_nightly_nfarm_//g" | sed -r "s/.main.20241216.4262507//g"`;

I’m not sure what you want this to do. The ls -d ... just outputs the directory name, which I guess is another way of priming the input for the pipeline. After that it’s a bunch of sed. But, Perl does everything sed does:

$bankName = $bankNameTest;
$bankName =~ s/...//; # trying to get rid of the directory?
$bankName =~ s/wint_nightly_nfarm_//;
$bankName =~ s/\.main\.20241216\.4262507//;

You don’t need /g here since I don’t think you expect any of these to match in more than one place.

I think you are trying to get that bit from the wildcard part of the glob pattern. But you don’t need to destructively whittle away at a string:

my $bankname;
if( $bankNameTest =~ m/wint_nightly_nfarm_(.*)\.main\.20241216\.4262507/ ) {
    $bankname = $1;
    }
   

Realize that any time you want to do some sort of text manipulation, Perl can do it inside of Perl, and probably better and simpler than anything else.
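As a quick check of the capture approach, here is a sketch that runs the match against one of the paths from the question's output. The sub name `bank_from_path` is made up for illustration; the literal dots are escaped and the match is anchored at the end of the string, which is a slightly stricter pattern than the one shown above.

```perl
use strict;
use warnings;

# Capture the wildcard part of the file name; returns undef on no match.
sub bank_from_path {
    my ($path) = @_;
    my ($bank) = $path =~ m{wint_nightly_nfarm_(.+)\.main\.20241216\.4262507\z};
    return $bank;
}

my $sample = '_qa/SERVFARM/CAPTURE_DB/CaptureViewer/2487964_brd_launch/'
           . 'cap_allegro_view/wint_nightly_nfarm_capture_viewer.main.20241216.4262507';

print bank_from_path($sample), "\n";   # capture_viewer
```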

But, seeing these two things together, I might be inclined to use readdir, which doesn’t include the directory name, and to match the file name directly to capture the interesting part. Once I find that file, I stop looking:

my $file = 'new.txt';
open my $fh, '<', $file or die "could not open $file: $!";
while( <$fh> ) {
    chomp;
    print "$_ is testbankname\n";

    # skip this line if the directory cannot be opened
    opendir my $dh, $_ or do {
        warn "Could not open dir <$_>: $!";
        next;
        };

    my $bankName;
    while( my $f = readdir($dh) ) {
        next unless $f =~ /wint_nightly_nfarm_(.*)\.main\.20241216\.4262507/;
        $bankName = $1;
        last;   # stop at the first matching file
        }
    closedir $dh;
    print "$bankName is the bankName\n" if defined $bankName;
    }
