While running the TrainingWrapper script from lucidrains/RETRO-pytorch in Google Colab, I get the exception: No embeddings found in folder .tmp/embeddings. I tried manually creating an embeddings folder inside .tmp in Colab, but that didn't solve the problem.
The script is given below:
wrapper = TrainingWrapper(
    retro = retro,                                 # retro instance
    knn = 2,                                       # knn (2 in paper was sufficient)
    chunk_size = 64,                               # chunk size (64 in paper)
    documents_path = './text_folder',              # path to folder of text
    glob = '**/*.txt',                             # text glob
    chunks_memmap_path = './train.chunks.dat',     # path to chunks
    seqs_memmap_path = './train.seq.dat',          # path to sequence data
    doc_ids_memmap_path = './train.doc_ids.dat',   # path to document ids per chunk (used for filtering neighbors belonging to same document)
    max_chunks = 1_000_000,                        # maximum cap to chunks
    max_seqs = 100_000,                            # maximum seqs
    knn_extra_neighbors = 100,                     # num extra neighbors to fetch
    max_index_memory_usage = '100m',
    current_memory_available = '1G'
)
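For context, the retro instance above is built along the lines of the README example. This is a sketch from memory, not my exact cell; the hyperparameter values are assumptions and shouldn't matter for the error:

import torch
from retro_pytorch import RETRO, TrainingWrapper

retro = RETRO(
    chunk_size = 64,                        # must match the wrapper's chunk_size
    max_seq_len = 2048,                     # max sequence length
    enc_dim = 896,                          # encoder model dim
    enc_depth = 2,                          # encoder depth
    dec_dim = 768,                          # decoder model dim (value assumed)
    dec_depth = 12,                         # decoder depth
    dec_cross_attn_layers = (3, 6, 9, 12),  # decoder cross-attention layers
    heads = 8,                              # attention heads
    dim_head = 64,                          # dimension per head
    dec_attn_dropout = 0.25,                # decoder attention dropout
    dec_ff_dropout = 0.25                   # decoder feedforward dropout
).cuda()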
The log shows the embeddings being saved, yet the reader fails to find them:
embedded 7 / 7
saved .tmp/embeddings/00000.npy
0it [00:00, ?it/s]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-6-8a66455104f9> in <cell line: 19>()
17 ).cuda()
18
---> 19 wrapper = TrainingWrapper(
20 retro = retro, # path to retro instance
21 knn = 2, # knn (2 in paper was sufficient)
6 frames
/usr/local/lib/python3.10/dist-packages/embedding_reader/numpy_reader.py in __init__(self, embeddings_folder)
75 self.count = self.headers["count"].sum()
76 if self.count == 0:
---> 77 raise ValueError(f"No embeddings found in folder {embeddings_folder}")
78 self.nb_files = len(self.headers["count"])
79 self.dimension = int(self.headers.iloc[0]["dimension"])
ValueError: No embeddings found in folder .tmp/embeddings
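To rule out a failed write, the saved file can be inspected directly (a quick sketch; the path is taken from the log above):

import os
import numpy as np

# list whatever TrainingWrapper wrote into the temp embeddings folder
print(os.listdir('.tmp/embeddings'))   # expect something like ['00000.npy']

# load the saved embeddings and inspect their shape
emb = np.load('.tmp/embeddings/00000.npy')
print(emb.shape, emb.dtype)            # a (num_chunks, dim) float array if the write succeeded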
I also have a GPU locally, so I tried running the script on my own machine, and it fails with the following error:
Traceback (most recent call last):
  File "D:\RETRO-pytorch\wrapper.py", line 17, in <module>
    ).cuda()
      ^^^^^^
  File "D:\RETRO-pytorch\venv\Lib\site-packages\torch\nn\modules\module.py", line 915, in cuda
    return self._apply(lambda t: t.cuda(device))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\RETRO-pytorch\venv\Lib\site-packages\torch\nn\modules\module.py", line 779, in _apply
    module._apply(fn)
  File "D:\RETRO-pytorch\venv\Lib\site-packages\torch\nn\modules\module.py", line 804, in _apply
    param_applied = fn(param)
                    ^^^^^^^^^
  File "D:\RETRO-pytorch\venv\Lib\site-packages\torch\nn\modules\module.py", line 915, in <lambda>
    return self._apply(lambda t: t.cuda(device))
                          ^^^^^^^^^^^^^^
  File "D:\RETRO-pytorch\venv\Lib\site-packages\torch\cuda\__init__.py", line 284, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
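The local failure looks like a CPU-only PyTorch build rather than a problem in the script itself. As I understand it, this can be confirmed (a minimal sketch):

import torch

print(torch.__version__)           # a '+cpu' suffix indicates a CPU-only wheel
print(torch.cuda.is_available())   # False if torch was installed without CUDA support

# guard the .cuda() call so the script can still run on CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
retro = retro.to(device)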
I'm running the script both in the Google Colab GPU environment and locally.