Here is my test code, which is the official test code provided by MTEB:
from mteb import MTEB
from sentence_transformers import SentenceTransformer
# Define the sentence-transformers model name
model_name = "average_word_embeddings_komninos"
model = SentenceTransformer(model_name)
# ArguAna task
evaluation = MTEB(tasks=["ArguAna"])
results = evaluation.run(model, output_folder=f"./results/{model_name}")
The specific error that occurred is:
Traceback (most recent call last):
  File "/data1/jiyifan/OpenMatch/mteb_test.py", line 172, in <module>
    results = evaluation.run(
  File "/data1/jiyifan/anaconda3/envs/om1/lib/python3.10/site-packages/mteb/evaluation/MTEB.py", line 422, in run
    raise e
  File "/data1/jiyifan/anaconda3/envs/om1/lib/python3.10/site-packages/mteb/evaluation/MTEB.py", line 352, in run
    task.load_data(eval_splits=task_eval_splits, **kwargs)
  File "/data1/jiyifan/anaconda3/envs/om1/lib/python3.10/site-packages/mteb/abstasks/AbsTaskRetrieval.py", line 231, in load_data
    corpus, queries, qrels = HFDataLoader(
  File "/data1/jiyifan/anaconda3/envs/om1/lib/python3.10/site-packages/mteb/abstasks/AbsTaskRetrieval.py", line 96, in load
    self._load_qrels(split)
  File "/data1/jiyifan/anaconda3/envs/om1/lib/python3.10/site-packages/mteb/abstasks/AbsTaskRetrieval.py", line 175, in _load_qrels
    qrels_ds = load_dataset(
  File "/data1/jiyifan/anaconda3/envs/om1/lib/python3.10/site-packages/datasets/load.py", line 2594, in load_dataset
    builder_instance = load_dataset_builder(
  File "/data1/jiyifan/anaconda3/envs/om1/lib/python3.10/site-packages/datasets/load.py", line 2303, in load_dataset_builder
    builder_instance: DatasetBuilder = builder_cls(
  File "/data1/jiyifan/anaconda3/envs/om1/lib/python3.10/site-packages/datasets/packaged_modules/cache/cache.py", line 140, in __init__
    config_name, version, hash = _find_hash_in_cache(
  File "/data1/jiyifan/anaconda3/envs/om1/lib/python3.10/site-packages/datasets/packaged_modules/cache/cache.py", line 85, in _find_hash_in_cache
    raise ValueError(
ValueError: There are multiple 'mteb/arguana' configurations in the cache: queries, default, corpus
**Please specify which configuration to reload from the cache, e.g.
load_dataset('mteb/arguana', 'queries')**
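Read literally, the message seems to ask for the configuration name to be passed explicitly when `load_dataset` is called directly. A minimal sketch of that is below, assuming the three names listed in the error are the available configurations and that the `datasets` library is installed. The helper `load_arguana_config` is a hypothetical name used only for illustration, and this does not by itself change how MTEB loads the data internally.

```python
# Configuration names copied from the ValueError message above.
ARGUANA_CONFIGS = ("queries", "default", "corpus")


def load_arguana_config(config_name):
    """Load one named configuration of 'mteb/arguana' (hypothetical helper)."""
    if config_name not in ARGUANA_CONFIGS:
        raise ValueError(f"unknown configuration: {config_name!r}")
    # Imported lazily so this module can be imported without `datasets`.
    from datasets import load_dataset

    # The second positional argument is the configuration name,
    # as in the example printed by the error message.
    return load_dataset("mteb/arguana", config_name)


if __name__ == "__main__":
    # Try each configuration in turn and print what comes back.
    for name in ARGUANA_CONFIGS:
        print(name, load_arguana_config(name))
```

Calling this for each configuration at least shows whether every one of them can be loaded on its own from my environment.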
I haven’t found a solution online. Does anyone have any insights?