I’m in a new position where I need to process flat files on a regular basis. The last time I did this was 5 or 6 years ago, and as part of the file layout I received control totals. They gave me simple information about the file, such as the total number of records and sums of the important fields. This helped me during testing, and later in production, to verify that the file arrived and contained correct information.
I have asked for similar data for this new project and have hit a wall of no.
Is this no longer a standard practice?
Is there a better way?
It depends on what you are trying to test.
If you are trying to verify that a file arrived intact and wasn’t changed in transit, a checksum (or a digital signature) would be the gold standard.
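As a rough illustration of the checksum approach, here is a minimal Python sketch. It assumes the sender publishes a SHA-256 hex digest alongside the file; the choice of SHA-256 and the function names `sha256_of_file` and `file_matches_checksum` are my own assumptions, not something your data source necessarily provides.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at `path`.

    The file is read in chunks so large flat files do not have to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def file_matches_checksum(path, expected_hex):
    """Compare the computed digest with the one supplied by the sender (hypothetical input)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A checksum only tells you the bytes are unchanged since the sender computed it. It says nothing about whether the contents are sensible or whether your load handled every record.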
If you are trying to verify that you have processed the file successfully, the aggregate information that you’re looking for is exceptionally helpful. Unfortunately, a large fraction of flat-file-based ETL processes are built with exceedingly little thought given to handling and preventing errors. As a result, lots of data sources have never been asked to provide this sort of header information, don’t have the code to do so, and aren’t terribly interested in writing code on their side to generate the totals if few of their customers would use them.
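For what it's worth, checking control totals on the receiving side takes very little code. The sketch below assumes a pipe-delimited file with a header row and assumes the sender supplies a record count plus sums for selected numeric fields. The layout, the field names, and the functions `compute_control_totals` and `find_mismatches` are all hypothetical.

```python
import csv
from decimal import Decimal


def compute_control_totals(lines, sum_fields, delimiter="|"):
    """Count data records and sum the named fields.

    `lines` is any iterable of text lines (for example an open file) whose
    first line is a header row. Decimal avoids float rounding in the sums.
    """
    reader = csv.DictReader(lines, delimiter=delimiter)
    record_count = 0
    sums = {field: Decimal("0") for field in sum_fields}
    for row in reader:
        record_count += 1
        for field in sum_fields:
            sums[field] += Decimal(row[field])
    return {"record_count": record_count, "sums": sums}


def find_mismatches(actual, expected):
    """Return a list of human-readable differences; an empty list means the totals agree."""
    problems = []
    if actual["record_count"] != expected["record_count"]:
        problems.append(
            f"record_count: expected {expected['record_count']}, got {actual['record_count']}"
        )
    for field, expected_sum in expected["sums"].items():
        actual_sum = actual["sums"].get(field)
        if actual_sum != expected_sum:
            problems.append(f"sum({field}): expected {expected_sum}, got {actual_sum}")
    return problems
```

You could run this once against the raw file and again against the rows that actually landed in the target table, which catches records dropped or altered during the load as well as a bad file.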
What you’re asking for sounds entirely reasonable. If you are getting pushback from the data source, I would, within the realm of what is politically feasible in your environment, escalate the issue. Presumably, if the data source is internal, you have much more leverage than if you are asking for a data feed from an external vendor.