Hi, I am trying to take the transpose of a DenseMatrix and multiply it with a RowMatrix.
I have a DenseMatrix V and a RowMatrix U.
I tried to implement the function below to create the transpose:
from pyspark.mllib.linalg import DenseMatrix

def dense_T(dense_mat):
    if not isinstance(dense_mat, DenseMatrix):
        # the original print() left the return value unbound, so raise instead
        raise TypeError("input is not a DenseMatrix")
    return DenseMatrix(dense_mat.numRows, dense_mat.numCols,
                       dense_mat.values, isTransposed=True)
But when I do
V_trans = dense_T(V)
U.multiply(V_trans)
I still get dimension errors, and V and V_trans report the same dimensions. (From the documentation, isTransposed=True is not supposed to change the dimensions you pass in; it should only make Spark read the values as if the matrix were transposed, which multiply() does not seem to be doing…)
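One thing worth checking (this is my reading of the mllib docs, not a confirmed fix): DenseMatrix stores its values column-major, and isTransposed=True tells Spark to read them row-major instead. The column-major values of an m×n matrix are exactly the row-major values of its n×m transpose, so constructing the transpose should also require swapping the two dimension arguments, i.e. DenseMatrix(dense_mat.numCols, dense_mat.numRows, dense_mat.values, isTransposed=True). The layout argument can be checked without Spark at all:

```python
def col_major(rows, cols, get):
    # Flatten a matrix into column-major order, as DenseMatrix stores it.
    return [get(i, j) for j in range(cols) for i in range(rows)]

def row_major(rows, cols, get):
    # Flatten a matrix into row-major order (what isTransposed=True assumes).
    return [get(i, j) for i in range(rows) for j in range(cols)]

# A 2x3 matrix A with A[i][j] = 10*i + j, and its 3x2 transpose T.
A = lambda i, j: 10 * i + j
T = lambda i, j: A(j, i)

# Column-major storage of A equals row-major storage of its transpose,
# which is why swapping the dimensions and setting isTransposed=True
# can reuse the same flat values buffer with no copying.
assert col_major(2, 3, A) == row_major(3, 2, T)
```

If that holds up, passing numRows and numCols through unchanged (as in dense_T above) would explain why V_trans keeps V's shape and why multiply() complains.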
It seems there is a way to convert the matrix to NumPy, or to use a loop to build a new values list and index it back into a transposed matrix, like below:
transposed_values = [values[j*num_rows + i] for i in range(num_rows) for j in range(num_cols)]
But for scalability reasons I would like to avoid NumPy (from what I have read, collecting into NumPy defeats the purpose of distributed computing) and also avoid looping over every value.
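A note on the scalability worry: as far as I can tell, DenseMatrix is a *local* matrix that lives on the driver, so transposing it with a loop (or NumPy) does not undermine distribution; the distributed object here is the RowMatrix U, which is never collected. Any transpose-by-copy is O(numRows × numCols) regardless. Here is the loop approach from above as a standalone, Spark-free function:

```python
def transpose_values(values, num_rows, num_cols):
    """Reorder a column-major flat list into column-major order for the transpose.

    values[j * num_rows + i] is element (i, j) of the original matrix;
    iterating i in the outer loop and j in the inner loop emits elements
    in row-major order of the original, which is column-major order of
    the transpose.
    """
    return [values[j * num_rows + i]
            for i in range(num_rows)
            for j in range(num_cols)]

# 2x3 matrix [[1, 2, 3], [4, 5, 6]] stored column-major:
vals = [1, 4, 2, 5, 3, 6]
# Its 3x2 transpose [[1, 4], [2, 5], [3, 6]] stored column-major:
assert transpose_values(vals, 2, 3) == [1, 2, 3, 4, 5, 6]
```

Assuming this layout reading is right, DenseMatrix(num_cols, num_rows, transpose_values(vals, num_rows, num_cols)) should then construct the transposed matrix.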
What are my options?
Also, why does Spark not provide such a common operation out of the box? What is the reason?