I’m using Ray on my Mac to run some Python computations. The code writes large HDF5 files, each ranging from 25 to 50 GB in size. I received the following error message while executing my code:
(raylet) [2024-05-16 11:56:24,761 E 1443 16308] (raylet) file_system_monitor.cc:111: /tmp/ray is over 95% full, available space: 18296090624; capacity: 494384795648. Object creation will fail if spilling is required.
Error executing job with overrides: []
Traceback (most recent call last):
  File "/Users/***/1_Projects/***/t1_main.py", line 107, in main
    outputs = ray.get(futures)
              ^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/env_030/lib/python3.11/site-packages/ray/_private/auto_init_hook.py", line 21, in auto_init_wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/env_030/lib/python3.11/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/env_030/lib/python3.11/site-packages/ray/_private/worker.py", line 2623, in get
    values, debugger_breakpoint = worker.get_objects(object_refs, timeout=timeout)
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/env_030/lib/python3.11/site-packages/ray/_private/worker.py", line 861, in get_objects
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(OutOfDiskError): ray::process_hdf() (pid=1462, ip=127.0.0.1)
ray.exceptions.OutOfDiskError: Local disk is full
The object cannot be created because the local object store is full and the local disk's utilization is over capacity (95% by default).Tip: Use `df` on this node to check disk usage and `ray memory` to check object store memory usage.
This error occurs despite significant storage available on the Mac’s internal drive (~161 GB), space allocated for spilling on a separate drive (~538 GB), and space available for data storage (~1 TB). I’m trying to understand why this error is happening and how I can fix it.
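Since the error names /tmp/ray rather than my spill directory, one way to check which volume the raylet’s monitor is actually measuring is shutil.disk_usage. A minimal sketch (the path is taken straight from the error message):

import shutil

# /tmp/ray is Ray's default session directory and the path named in the error;
# this reports the totals for whichever volume that directory lives on.
usage = shutil.disk_usage("/tmp/ray")
print(f"total: {usage.total / 1e9:.1f} GB, "
      f"free: {usage.free / 1e9:.1f} GB, "
      f"used: {100 * usage.used / usage.total:.1f}%")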
Snippets from my code:
import json
import ray

ray.init(
    _system_config={
        "automatic_object_spilling_enabled": True,
        "object_spilling_config": json.dumps({
            "type": "filesystem",
            "params": {"directory_path": "/Volumes/T7/Ray_Spill"}
        })
    }
)
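As far as I can tell, object_spilling_config only relocates spilled objects; the error still points at the session directory under /tmp/ray. A variant I’m considering (a sketch, not something I’ve verified fixes this) moves the whole session directory onto the external drive via _temp_dir, and raises the monitor’s threshold via local_fs_capacity_threshold, which I assume is the “95% by default” knob mentioned in the error; "/Volumes/T7/ray_tmp" is just an example path:

import json
import ray

ray.init(
    # Move Ray's session/temp directory (normally /tmp/ray) to the external
    # drive so the disk monitor measures that volume instead.
    _temp_dir="/Volumes/T7/ray_tmp",
    _system_config={
        "automatic_object_spilling_enabled": True,
        # Assumption: this corresponds to the "95% by default" threshold
        # named in the OutOfDiskError message.
        "local_fs_capacity_threshold": 0.99,
        "object_spilling_config": json.dumps({
            "type": "filesystem",
            "params": {"directory_path": "/Volumes/T7/Ray_Spill"}
        })
    }
)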
Processing and writing the data:
from itertools import repeat

futures = [process_hdf.remote(x, y, z)
           for x, y, z in zip(batch, repeat(time_stamps), repeat(cell_centers))]
outputs = ray.get(futures)
log.info('Writing the files to disk...')
# Save the data with compression
file_path = f'{file_storage_path}/model_runs_chunk_{idx}.h5'
write_to_hdf5(file_path, outputs)
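Separately, since each batch of outputs can approach the 25–50 GB file sizes, I’m considering draining results with ray.wait instead of one big ray.get, so each object can be written and released before the next one is fetched. A sketch, under the assumption that write_to_hdf5 accepts a single-element list and that per-result chunk files are acceptable for my pipeline:

# Consume outputs one at a time so the object store can free each object
# as soon as it has been written, instead of holding the whole batch.
pending = list(futures)
chunk = 0
while pending:
    done, pending = ray.wait(pending, num_returns=1)
    output = ray.get(done[0])
    write_to_hdf5(f'{file_storage_path}/model_runs_chunk_{chunk}.h5', [output])
    del output  # drop the last Python reference before the next iteration
    chunk += 1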