Below is the code that processes the lines in the file new.txt.
my $bankNameTest;
my $bankName;
my @lines;
my $document = do {
local $/ = undef;
open my $fh, "<", "new.txt"
or die "could not open $file: $!";
<$fh>;
};
chomp ($document);
print "$document is doc\n";
@lines = split (/\n/,$document);
foreach my $test (@lines) {
$bankNameTest=glob ("${test}/wint_nightly_nfarm_*.main.20241216.4262507") ;
chomp ($bankNameTest);
print "$bankNameTest is testbankname\n";
$bankName=`ls -d $bankNameTest | sed -r "s?${var}/??g" | sed -r "s/wint_nightly_nfarm_//g" | sed -r "s/.main.20241216.4262507//g"`;
chomp ($bankName);
print "$bankName is the bankName\n";
}
Content of new.txt:
_qa/SERVFARM/CAPTURE_DB/CaptureViewer/2487964_brd_launch/cap_allegro_view
_qa/SERVFARM/CAPTURE_DB/CaptureViewer/FIND/enh_find/DRC_MARKERS/hierdsn_incomp_entry
_qa/SERVFARM/CAPTURE_DB/CaptureViewer/crossprobe/03_between_cap_allegroviewer/dsn_schm_page
The output I get is:
_qa/SERVFARM/CAPTURE_DB/CaptureViewer/2487964_brd_launch/cap_allegro_view/wint_nightly_nfarm_capture_viewer.main.20241216.4262507 is testbankname
capture_viewer is the bankName
is testbankname
is the bankName
_qa/SERVFARM/CAPTURE_DB/CaptureViewer/crossprobe/03_between_cap_allegroviewer/dsn_schm_page_and_part_name_with_special_chars/wint_nightly_nfarm_capture_viewer.main.20241216.4262507 is testbankname
capture_viewer is the bankName
Here, the second line of new.txt is not processed correctly.
Expected Output:
_qa/SERVFARM/CAPTURE_DB/CaptureViewer/2487964_brd_launch/cap_allegro_view/wint_nightly_nfarm_capture_viewer.main.20241216.4262507 is testbankname
capture_viewer is the bankName
_qa/SERVFARM/CAPTURE_DB/CaptureViewer/FIND/enh_find/DRC_MARKERS/hierdsn_incomp_entry/wint_nightly_nfarm_capture_viewer.main.20241216.4262507 is testbankname
capture_viewer is the bankName
_qa/SERVFARM/CAPTURE_DB/CaptureViewer/crossprobe/03_between_cap_allegroviewer/dsn_schm_page_and_part_name_with_special_chars/wint_nightly_nfarm_capture_viewer.main.20241216.4262507 is testbankname
capture_viewer is the bankName
I am not sure what the issue is. It seems to be some character-related problem in new.txt. I need the output to match the “Expected Output” above.
Update:
I just realized that glob, when called in scalar context, does not pick up a new pattern until its current list is exhausted, even if the variable inside the pattern changes. This is a fairly subtle feature, and it is what caused your program to behave in an unexpected way. So I thought I would add a more elaborate explanation.
To test this, I ran the following oneliner:
$ perl -lwe'for (qw(a b c)) { $x = glob "$_*"; print $x; }'
This loops over “a”, “b” and “c” and executes a glob with a string containing the loop variable. One might assume that this would use a different starting letter for each iteration, first “a” then “b” and then “c”. But in fact, this is printed:
a
a.csv
a.dat
The first pattern was used, but then the pattern was never updated.
When used in scalar context, glob creates an iterator, and each succeeding usage of the same pattern returns the next item on the list first created. The iterator is used until it is exhausted, after which it returns undef, and a new pattern is allowed.
That is why you have a failed second result. The pattern never changes, because the glob is in iteration mode. It doesn’t check the pattern for an updated variable; it keeps using the first pattern. Since there was only one file to be found, the second iteration returns undef, which prints as the empty string, but would give a warning under use warnings.
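The iterator behaviour can be reproduced in isolation. The following is a minimal sketch, not the original program: the directory names a, b and c and the file name file.txt are made up for illustration, and it assumes the temporary directory path contains no whitespace or wildcard characters (core glob would treat those specially).

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical layout: three directories, each holding exactly one file.
my $root = tempdir(CLEANUP => 1);
for my $name (qw(a b c)) {
    mkdir "$root/$name" or die "mkdir failed: $!";
    open my $fh, '>', "$root/$name/file.txt" or die "open failed: $!";
    close $fh;
}

my (@scalar_results, @list_results);
for my $name (qw(a b c)) {
    # Scalar context: this call site keeps one iterator, and the pattern
    # is only re-read after the previous list has been exhausted.
    my $scalar = glob "$root/$name/*.txt";
    push @scalar_results, defined $scalar ? 'found' : 'undef';

    # List context: the pattern is expanded afresh on every call.
    my ($first) = glob "$root/$name/*.txt";
    push @list_results, defined $first ? 'found' : 'undef';
}

print "scalar: @scalar_results\n";   # scalar: found undef found
print "list:   @list_results\n";     # list:   found found found
```

The scalar-context column shows the same pattern as the question's output: first directory found, second one empty, third one found again.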
Since I loop over the glob in my code, which puts it in list context, this bug is not present, even though I use glob. It also explains why your code was fixed when you switched to using an array @bankNameTest.
Original answer:
Inspired by brian’s answer I managed to simplify the code a bit more.
The exact mistakes made were covered by brian, so I won’t repeat them. I felt that the suggested solution could be improved, which is why I add this. Credit to brian for doing the groundwork for this answer.
Firstly, I feel a glob might well be the better solution. The number of files in the directory might be large, so looping through them all may be unnecessary. I am not certain, though, that glob is faster than Perl’s readdir; the efficiency may well be similar.
Secondly, the original code assumes one bank name. But there are issues with that. If we have more than one match, should we only report one? And in that case, which one? I opted for reporting all matches in the order they are found.
use strict;
use warnings;
my $file = 'new.txt';
open my $fh, '<', $file or die "Cannot open '$file': $!";
for my $dir (<$fh>) {
chomp $dir;
print "$dir\n-----------------\n";
for my $match (glob ("$dir/wint_nightly_nfarm_*.main.20241216.4262507")) {
my ($bank) = $match =~ /wint_nightly_nfarm_(.+)\.main\.20241216\.4262507/;
print "$bank is the bank\n" if $bank;
}
}
This will output something like this (on my system with one sample):
_qa/SERVFARM/CAPTURE_DB/CaptureViewer/2487964_brd_launch/cap_allegro_view
-----------------
BANKOFSTACKOVERFLOW is the bank
_qa/SERVFARM/CAPTURE_DB/CaptureViewer/FIND/enh_find/DRC_MARKERS/hierdsn_incomp_entry
-----------------
_qa/SERVFARM/CAPTURE_DB/CaptureViewer/crossprobe/03_between_cap_allegroviewer/dsn_schm_page
-----------------
But you do not need to hardcode the new.txt filename. You can simply use the diamond operator, like this:
for my $dir (<>) { # diamond operator accepts file arg or stdin
chomp $dir;
print "$dir\n-----------------\n";
for my $match (glob ("$dir/wint_nightly_nfarm_*.main.20241216.4262507")) {
my ($bank) = $match =~ /wint_nightly_nfarm_(.+)\.main\.20241216\.4262507/;
print "$bank is the bank\n" if $bank;
}
}
Which can be run in many different ways, for example:
$ perl program.pl new.txt
$ someotherprogram | program.pl
$ ls -d file1 file2 file3* | program.pl
etc...
Either use a file name argument, or standard input.
First, let’s simplify some of the program structure.
You want to process a file’s lines, but you do a lot of work to slurp all of the contents into a single value only to split it again. All of that code is just this while loop:
my $file = 'new.txt';
open my $fh, '<', $file or die "could not open $file: $!";
while( <$fh> ) {
chomp;
...
}
Now, the stuff inside the loop is a bit more tricky because I think you might be doing everything the hard way.
It looks like you want to find a single file where you might not know what’s in the middle of the name. You have this glob pattern:
${test}/wint_nightly_nfarm_*.main.20241216.4262507
My first thought when I see two lines in your output that don’t have any filename is that this pattern doesn’t match a file in one of the directories. Did you look inside that directory and see the file you were expecting?
With no match from the glob, $bankNameTest is empty and that pretty much ruins the rest of the loop.
Whenever you interact with something outside of your program, check that it worked before using the result:
my $file = glob(...);
unless( defined $file ) { ... }
Without a filename, you cannot continue the iteration. You might want to output a warning then skip everything else:
my $file = glob(...);
unless( defined $file ) {
warn "Did not find file";
next;
}
And, this isn’t just about the context of the call to glob. If you don’t find the file in the list context case, you still have this problem:
my( $file ) = glob(...); # file not there
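As a small self-contained illustration of that list-context case, here is a sketch that assumes a directory name which does not exist (no_such_directory_for_demo is made up for this example). Because the pattern contains a wildcard and nothing matches, glob returns an empty list and $file stays undefined, so the same defined check applies:

```perl
use strict;
use warnings;

# Hypothetical directory, assumed not to exist on the system running this.
my $dir = 'no_such_directory_for_demo';

# List context: an empty result leaves $file undefined.
my ($file) = glob "$dir/wint_nightly_nfarm_*.main.20241216.4262507";

my $status = defined $file ? "found $file" : 'no match';
print "$status\n";
```

Inside the real loop, the 'no match' branch is where the warn and next shown earlier would go.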
After that, in the cases where the glob does work, you have some more trickery.
$bankName=`ls -d $bankNameTest | sed -r "s?${var}/??g" | sed -r "s/wint_nightly_nfarm_//g" | sed -r "s/.main.20241216.4262507//g"`;
I’m not sure what you want this to do. The ls -d ... just outputs the directory name, which I guess is another way of priming the input for the pipeline. After that it’s a bunch of sed. But, Perl does everything sed does:
$bankName = $bankNameTest;
$bankName =~ s/...//; # trying to get rid of the directory?
$bankName =~ s/wint_nightly_nfarm_//;
$bankName =~ s/\.main\.20241216\.4262507//;
You don’t need /g here, since I don’t think you expect any of these to match in more than one place.
I think you are trying to get that bit from the wildcard part of the glob pattern. But you don’t need to destructively whittle away at a string:
my $bankname;
if( $bankNameTest =~ m/wint_nightly_nfarm_(.*)\.main\.20241216\.4262507/ ) {
$bankname = $1
}
Realize that any time that you want to do some sort of text manipulation, that Perl can do it inside of Perl, and probably better and simpler than anything else.
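To make the capture approach concrete, here is a small sketch run against one of the paths from the question’s output. The $suffix variable and the use of \Q...\E are my own additions for illustration: they keep the dots in the suffix literal without escaping each one by hand.

```perl
use strict;
use warnings;

# Path taken from the sample output in the question.
my $path = '_qa/SERVFARM/CAPTURE_DB/CaptureViewer/2487964_brd_launch/'
         . 'cap_allegro_view/wint_nightly_nfarm_capture_viewer.main.20241216.4262507';

# Illustrative variable: the fixed tail of the file name.
my $suffix = '.main.20241216.4262507';

# \Q...\E quotes regex metacharacters, so each '.' matches only a literal dot.
my ($bank) = $path =~ m{wint_nightly_nfarm_(.+?)\Q$suffix\E\z};

print defined $bank ? "$bank is the bankName\n" : "no match\n";
# prints: capture_viewer is the bankName
```

The single match both validates the file name and extracts the wildcard part, so no intermediate string needs to be modified.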
But, seeing these two things together, I might be inclined to use readdir, which doesn’t include the directory name, and to match the file name directly to capture the interesting part. Once I find that file, I stop looking:
my $file = 'new.txt';
open my $fh, '<', $file or die "could not open $file: $!";
while( <$fh> ) {
chomp;
print "$_ is testbankname\n";
opendir my $dh, $_ or do {
warn "Could not open dir <$_>: $!";
next;
};
my $bankName;
while( my $f = readdir($dh) ) {
next unless $f =~ /wint_nightly_nfarm_(.*)\.main\.20241216\.4262507/;
$bankName = $1;
last;
}
closedir $dh;
# Only report a bank name when a matching file was actually found
print "$bankName is the bankName\n" if defined $bankName;
}