I need to run an interpolation on a data set of more than 11 thousand rows and 87 columns, and it takes 1:40 minutes to complete. I've already tried using a Dask array and chunked lists, but it still takes a long time to finish. Maybe I'm using the Dask array incorrectly. Could anyone help me?
I have a function that applies IDW interpolation using my sample data (the known values). The points to be interpolated are in the unknown_points array (11 thousand rows), which is what I am trying to apply Dask to. I store the result of each interpolation in my teste_2 dataframe, in the column that corresponds to the interpolated column.
import dask.array as da
l = da.from_array(unknown_points, chunks=(100, 100))
columns = len(values[0])  # 87 columns
for i in range(columns):
    interpolated_values = idw_interpolation(sample_points, l, values[:, i], power=2)
    teste_2[i] = interpolated_values
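Since idw_interpolation itself is not shown, here is a rough sketch of what the loop amounts to, in case it helps to see where the time might go. It assumes that sample_points and unknown_points are plain NumPy arrays of coordinates, that values has one row per sample point, and that the weights depend only on distance. The name idw_all_columns and the eps parameter are hypothetical and not part of the code above. Under those assumptions the weights are identical for all 87 columns, so they can be computed once and applied with a single matrix product.

```python
import numpy as np

def idw_all_columns(sample_points, unknown_points, values, power=2, eps=1e-12):
    """Hypothetical helper: IDW-interpolate every column of `values` in one pass.

    Assumed shapes: sample_points (n_samples, n_dims),
    unknown_points (n_unknown, n_dims), values (n_samples, n_columns).
    """
    # Pairwise Euclidean distances, shape (n_unknown, n_samples).
    diff = unknown_points[:, None, :] - sample_points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Inverse-distance weights; eps guards against division by zero when an
    # unknown point coincides with a sample point.
    weights = 1.0 / np.maximum(dist, eps) ** power
    weights /= weights.sum(axis=1, keepdims=True)

    # One matrix product yields all columns: shape (n_unknown, n_columns).
    return weights @ values
```

With this shape convention, column i of the returned array would correspond to what the loop stores in teste_2[i]. The distance matrix holds n_unknown x n_samples floats, so for very large sample sets it may need to be processed in row blocks.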