I have a .sqlx file of type “operations”, where I’m loading data from a CSV file into a table named “incidents”. However, this CSV file may or may not be present in the designated bucket. Since incidents don’t occur every day, the CSV file typically contains data from the previous date (e.g., a file named “test30042024”).
The issue arises when attempting to execute the process. If the CSV file is missing from the bucket, the execution fails. I wonder if there is a way to validate if it exists before trying to load it, so it does not break the script.
Here is what the sqlx script currently looks like:
config {
type: "operations",
hasOutput: true
}
js {
function getYesterday() {
const today = new Date();
const localOffset = -(today.getTimezoneOffset() / 60);
const peruOffset = -5; // Peru timezone offset (GMT-5)
const offset = peruOffset - localOffset;
const yesterday = new Date(today.getTime() - 24 * 3600 * 1000 + offset * 3600 * 1000);
const dd = String(yesterday.getDate()).padStart(2, '0');
const mm = String(yesterday.getMonth() + 1).padStart(2, '0');
const yyyy = yesterday.getFullYear();
return dd + mm + yyyy;
}
}
LOAD DATA INTO
project.dataset.incidents (incidentId Integer,
name String,
timestamp Integer)
FROM FILES (skip_leading_rows=1, format = 'CSV',
uris = ['gs://bucket/test${getYesterday()}.csv']);
and here is an example of how the CSV can look like:
incidentId,name,timestamp
1,SOMENAME,1714482000000
2,ANOTHERNAME,1714482000000
3,ANYNAME,1714485600000
I see some documentation about the LOAD DATA function but could not find anything that allows to do this kind of validation