I am attempting to measure the total execution time of a Bigtable query issued from Spark. However, when I wrap the Bigtable-related function in the following timing code, it consistently reports about 0.5 seconds, regardless of how much data I extract:
val t0 = System.nanoTime()
// Build the DataFrame from the Bigtable query function
val df = queryDataBigtable()
val t1 = System.nanoTime()
val elapsedTime = (t1 - t0) / 1e9d // Convert nanoseconds to seconds
println(s"Elapsed time: $elapsedTime seconds")
To ensure that Spark performs the actual work, I tried calling df.count. However, since this is production code, I'm hesitant to rely on .count just to monitor the Bigtable query time from Spark. Do you have any better suggestions for monitoring the Bigtable time?
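For context on why the wrapper reports a near-constant time: Spark DataFrame transformations are lazy, so the timed region only measures building the query plan, not executing it; the real work runs later, when an action is triggered. A minimal stdlib-only Scala sketch of the same pitfall (using a plain lazy Iterator as a stand-in for a DataFrame; `slowSource` and the 100 ms delay are made up for illustration):

```scala
object LazyTiming {
  // Simulates a slow data source: each element takes ~100 ms to produce.
  def slowSource: Iterator[Int] =
    (1 to 5).iterator.map { i => Thread.sleep(100); i }

  def measure(): (Double, Double, Int) = {
    val t0 = System.nanoTime()
    val it = slowSource                 // lazy: no element is produced here
    val t1 = System.nanoTime()
    val buildSecs = (t1 - t0) / 1e9d

    val t2 = System.nanoTime()
    val n = it.size                     // the "action": forces every element
    val t3 = System.nanoTime()
    val actionSecs = (t3 - t2) / 1e9d

    (buildSecs, actionSecs, n)
  }

  def main(args: Array[String]): Unit = {
    val (build, run, n) = measure()
    // Build time is near zero; almost all the cost lands on the action.
    println(f"Build: $build%.4f s, action: $run%.4f s over $n rows")
  }
}
```

The same principle applies to the Spark code above: whatever timing you use has to bracket an action that materializes the DataFrame, not just the call that defines it.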