Tag Archive for pythonpandasdask

Dask merge ordered

I have a fairly large dataset (about 100GB) of blockchain data. I want to merge two tables on transactionHash. A naive join would be O(n^2), which is infeasible at this size, but both tables are ordered by blockNumber, so the merge can be done in a single pass in O(|A|+|B|) = O(n).
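A minimal sketch of how the ordering could be exploited, assuming each transactionHash appears under the same blockNumber in both tables, so rows can only match within a block. It walks both streams in lockstep and does a small hash join per block. The function name `merge_by_block`, the dict-based rows, and the `value` and `gas` columns are hypothetical stand-ins for illustration, not a Dask API.

```python
from itertools import groupby
from operator import itemgetter


def merge_by_block(left_rows, right_rows,
                   block_key="blockNumber", join_key="transactionHash"):
    """Inner-join two iterables of dicts that are both sorted by block_key.

    Only the rows of one block per side are held in memory at a time.
    """
    left_groups = groupby(left_rows, key=itemgetter(block_key))
    right_groups = groupby(right_rows, key=itemgetter(block_key))
    left = next(left_groups, None)
    right = next(right_groups, None)

    while left is not None and right is not None:
        (left_block, left_group), (right_block, right_group) = left, right
        if left_block < right_block:
            # Block exists only on the left: skip it.
            left = next(left_groups, None)
        elif left_block > right_block:
            # Block exists only on the right: skip it.
            right = next(right_groups, None)
        else:
            # Same block on both sides: hash join on the transaction hash.
            index = {}
            for row in right_group:
                index.setdefault(row[join_key], []).append(row)
            for left_row in left_group:
                for right_row in index.get(left_row[join_key], []):
                    yield {**right_row, **left_row}
            left = next(left_groups, None)
            right = next(right_groups, None)


# Tiny made-up sample, sorted by blockNumber on both sides.
table_a = [
    {"blockNumber": 1, "transactionHash": "0xa", "value": 10},
    {"blockNumber": 1, "transactionHash": "0xb", "value": 20},
    {"blockNumber": 3, "transactionHash": "0xc", "value": 30},
]
table_b = [
    {"blockNumber": 1, "transactionHash": "0xb", "gas": 5},
    {"blockNumber": 2, "transactionHash": "0xd", "gas": 6},
    {"blockNumber": 3, "transactionHash": "0xc", "gas": 7},
]
joined = list(merge_by_block(table_a, table_b))
```

Each row is read once, so the work is linear in the combined table size as long as individual blocks stay small. The same idea should carry over to chunked DataFrames by processing aligned blockNumber ranges one pair at a time.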

Problems with data structure or type in Dask with pyarrow DataFrames: how do I correct it for pandas?

Traceback (most recent call last):
  File "C:UsersPycharmProjectspythonProject2processing 2.py", line 369, in process_data
    df = calculate_cycles(df, open_arr, high_arr, low_arr, close_arr)
  File "C:UsersPycharmProjectspythonProject2processing 2.py", line 55, in calculate_cycles
    cdl_inside_df = ta.cdl_inside(open_arr, high_arr, low_arr, close_arr)
  File "C:UsersAppDataLocalProgramsPythonPython310libsite-packagespandas_tacandlescdl_inside.py", line 16, in cdl_inside
    inside = (high.diff() < 0) & (low.diff() > 0)
AttributeError: 'NoneType' object has no attribute 'diff'
Error processing data: 'NoneType' object has no attribute 'diff'
Loading data from D:New folderdataNew folder – CopyNew folderoutputNew folderdata_1.parquet
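
The traceback shows that `high` was already None by the time `cdl_inside` called `.diff()`. One plausible cause, stated here as an assumption rather than a confirmed diagnosis, is that the inputs were plain arrays or Arrow-backed columns rather than pandas Series and were rejected by the library's input validation. A hedged sketch of coercing the inputs before the call follows; the helper name `as_float_series` and the sample numbers are hypothetical.

```python
import numpy as np
import pandas as pd


def as_float_series(values, index=None, name=None):
    """Return `values` as a float64 pandas Series, failing loudly on None."""
    if values is None:
        raise ValueError(f"column {name!r} is None; check how it was loaded")
    if isinstance(values, pd.Series):
        series = values
    else:
        # NumPy arrays, lists, and similar array-likes are wrapped here.
        series = pd.Series(values, index=index, name=name)
    # Non-numeric entries become NaN instead of raising.
    return pd.to_numeric(series, errors="coerce").astype("float64")


# Made-up prices standing in for high_arr and low_arr.
high = as_float_series(np.array([10.0, 9.0, 11.0]), name="high")
low = as_float_series(np.array([5.0, 6.0, 4.0]), name="low")

# Same expression as the failing line in the traceback.
inside = (high.diff() < 0) & (low.diff() > 0)
```

If the data comes from a Dask DataFrame, the columns would presumably need to be materialized as pandas objects first (for example with `.compute()`) before being passed to a function that expects pandas Series.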