Spark Sort Operation in Join Despite Pre-sorted Bucketed Tables

I’m working with Spark and encountering an unexpected sort operation during a join of two pre-sorted and bucketed tables. Both tables have been created with the same number of buckets and are sorted by the join key. However, when I perform the join operation, Spark still includes a sort step in the execution plan.
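
A minimal PySpark sketch of the setup described above may help to reproduce the behaviour. It is not the original poster's code: the table names (demo_left, demo_right), the column name key, the bucket count, and the helper functions are all hypothetical. It writes two tables with the same number of buckets, sorted by the join key, joins them, and lists any Sort nodes found in the printed physical plan.

```python
import contextlib
import io


def sort_nodes(plan_text):
    """Return the plan lines that contain an explicit Sort operator."""
    # "Sort [" matches a Sort node but not "SortMergeJoin [".
    return [line.strip() for line in plan_text.splitlines() if "Sort [" in line]


def explain_bucketed_join(spark, buckets=8):
    """Write two bucketed, sorted tables and return the join's plan as text."""
    for name in ("demo_left", "demo_right"):  # hypothetical table names
        (spark.range(1000).withColumnRenamed("id", "key")
            .write.mode("overwrite")
            .bucketBy(buckets, "key")   # same bucket count on both sides
            .sortBy("key")              # sorted by the join key within each file
            .saveAsTable(name))
    joined = spark.table("demo_left").join(spark.table("demo_right"), "key")
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        joined.explain()  # PySpark prints the physical plan to stdout
    return buf.getvalue()


def main():
    # Imported here so that sort_nodes() can be used without Spark installed.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder.master("local[2]")
             .appName("bucket-join-demo").getOrCreate())
    # Disable broadcast joins so the small demo tables use a sort-merge join.
    spark.conf.set("spark.sql.autoBroadcastJoinThreshold", "-1")
    plan = explain_bucketed_join(spark)
    print(plan)
    print("Sort nodes:", sort_nodes(plan))
    spark.stop()
```

Calling main() on a local Spark installation prints the plan and any Sort nodes in it. If the shuffle (Exchange) is gone but Sort lines remain under the SortMergeJoin, that matches the situation in the question.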