I would like to get your opinion on an issue regarding our Data Lake.
Historically, we have stored files using the following path layout: YYYY/MM/DD/FILE_NAME.csv. However, a new hire suggests putting all files in a single directory and moving the date into the file name: YYYY_MM_DD_nom_FILE_NAME.csv.
The arguments raised against the single-directory layout are inefficiency, performance issues, and bandwidth problems when reading the data from Databricks.
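
To make the concern concrete, here is a minimal sketch of how I understand it. It runs on a local temporary directory, not on our actual storage, and the helper names (`build_layouts`, `files_for_day_nested`, `files_for_day_flat`) and the sample dates are made up for illustration. Object storage and Databricks may well behave differently, which is part of what I am asking.

```python
import tempfile
from pathlib import Path


def build_layouts(root: Path, dates, file_name="FILE_NAME.csv"):
    """Create the same empty files under both layouts (hypothetical helper)."""
    nested = root / "nested"
    flat = root / "flat"
    flat.mkdir(parents=True)
    for y, m, d in dates:
        # Current layout: YYYY/MM/DD/FILE_NAME.csv
        day_dir = nested / y / m / d
        day_dir.mkdir(parents=True)
        (day_dir / file_name).touch()
        # Proposed layout: YYYY_MM_DD_nom_FILE_NAME.csv in one directory
        (flat / f"{y}_{m}_{d}_nom_{file_name}").touch()
    return nested, flat


def files_for_day_nested(nested: Path, y: str, m: str, d: str):
    # Only the directory for the requested day is listed.
    return sorted(p.name for p in (nested / y / m / d).iterdir())


def files_for_day_flat(flat: Path, y: str, m: str, d: str):
    # Every entry in the single directory is listed, then filtered by name.
    prefix = f"{y}_{m}_{d}_"
    entries = list(flat.iterdir())
    matches = sorted(p.name for p in entries if p.name.startswith(prefix))
    return matches, len(entries)


with tempfile.TemporaryDirectory() as tmp:
    # Assumed sample data: one file per day for 30 days.
    dates = [("2024", "01", f"{day:02d}") for day in range(1, 31)]
    nested, flat = build_layouts(Path(tmp), dates)
    nested_files = files_for_day_nested(nested, "2024", "01", "15")
    flat_files, scanned = files_for_day_flat(flat, "2024", "01", "15")
    print(nested_files, flat_files, scanned)
```

In this sketch, the nested layout touches only one day's directory, while the flat layout lists every file before filtering by name. Whether that difference matters at our scale is the open question.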
Could I have your opinion on these two options, please?
Thank you