I have an S3 access_key_id, secret_access_key, and endpoint URL.
I opened spark2-shell and ran the following:
import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder().appName("Read ORC from S3").getOrCreate()
sc.hadoopConfiguration.set("fs.s3a.access.key", "ABC")
sc.hadoopConfiguration.set("fs.s3a.secret.key", "2ju0jzWo/ABC")
sc.hadoopConfiguration.set("fs.s3a.endpoint", "Https://abc")
val df = spark.read.orc("s3a://rcemqe-24-45ae3433-0511-459e-bdaf-7f1348f9d8d0/user/rcem1403/output/mapsig/combine/rcem_map_sccp_lean_min/usecasename=rcem_map_min/finalcubebintime=1650532150/gran=FifteenMinutes/")
I get the warnings below, then nothing happens, and it eventually reports that the path was not found, even though the path exists.
24/04/23 16:08:26 WARN lineage.LineageWriter: Lineage directory /var/log/spark2/lineage doesn't exist or is not writable. Lineage for this application will be disabled.
24/04/23 16:08:27 WARN lineage.LineageWriter: Lineage directory /var/log/spark2/lineage doesn't exist or is not writable. Lineage for this application will be disabled.
24/04/23 16:08:27 WARN fs.FileSystem: S3FileSystem is deprecated and will be removed in future releases. Use NativeS3FileSystem or S3AFileSystem instead.
24/04/23 16:16:29 WARN streaming.FileStreamSink: Error while looking for metadata directory.