Spark ORC writer generates incorrect ORC meta file statistics (min, max)
We found that Hive cannot concatenate some ORC files that contain long strings and were generated by Spark 3.2.1 or later.
hive --orcfiledump While parsing a protocol message, the input ended unexpectedly in the middle of a field
I have a table backed by ORC files on HDFS that I want to read with Spark, but I got an error. There are 14 files in the HDFS directory, and I cannot read the last one.
I ran hive --orcfiledump {file path} but got the same error: