val sql =
s"""
INSERT OVERWRITE TABLE dynamic_member
PARTITION(tenant_id = $tenantId, list_id = $listId)
SELECT customer_id FROM $view
"""
sparkSession.sql(sql)
Sometimes, when multiple jobs concurrently write data to different partitions of this table, I get this exception:
FileNotFoundException File hdfs://XXX/cdp/dynamic_member/_temporary/0 does not exist
I found that Spark 3 uses InsertIntoHadoopFsRelationCommand to write data into a Hive table partition, and all write jobs share the same temporary directory (_temporary/0) under the table location, so maybe there is a collision between concurrent jobs?
Spark 2, on the other hand, uses InsertIntoHiveTable to write into a Hive table partition, and each write job uses its own random staging directory under the table location (for example: hdfs://xxx/cdp/dynamic_member/.hive-staging_hive_2024-06-11_15-55-10_306_2966530999068165515-1/-ext-10000/_temporary/0/_temporary/attempt_20240611155610_0000_m_000000_0/).
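In case it helps, this is a quick way to see which write command Spark picks for this statement (just a sketch, reusing the same table, $tenantId, $listId and $view as in the code above; the command name shows up in the plan output):

val planDf = sparkSession.sql(
  s"""
  EXPLAIN
  INSERT OVERWRITE TABLE dynamic_member
  PARTITION(tenant_id = $tenantId, list_id = $listId)
  SELECT customer_id FROM $view
  """)
// The plan column mentions InsertIntoHadoopFsRelationCommand on Spark 3
// and InsertIntoHiveTable on Spark 2 for this table.
planDf.show(truncate = false)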
How can I avoid multiple jobs using the same temporary directory when using Spark 3?
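For example, would it be a reasonable workaround to turn off the metastore table conversion so that Spark 3 falls back to the InsertIntoHiveTable path with its per-job .hive-staging directory? Something like the sketch below (which configs apply depends on the table's storage format, and I have not verified that this is the recommended fix):

// Sketch of a possible workaround: with conversion disabled, Spark 3 should
// use the Hive SerDe write path (InsertIntoHiveTable), which stages output in
// a per-job .hive-staging_* directory instead of the shared _temporary
// directory under the table location.
sparkSession.conf.set("spark.sql.hive.convertMetastoreParquet", "false")
sparkSession.conf.set("spark.sql.hive.convertMetastoreOrc", "false")

sparkSession.sql(
  s"""
  INSERT OVERWRITE TABLE dynamic_member
  PARTITION(tenant_id = $tenantId, list_id = $listId)
  SELECT customer_id FROM $view
  """)

Or is there a better way to give each concurrent job its own temporary directory on Spark 3?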