Question

I'm (more or less) following the training_quora_duplicate_questions.py example, using my own data.

As I understand it from the Cross-Encoder documentation page, which I quote:

For binary tasks and tasks with continuous scores (like STS), we set num_labels=1. For classification tasks, we set it to the number of labels we have.

As such I’ve written the model thusly:

model = CrossEncoder("distilroberta-base", num_labels= 2)

and the evaluator thusly:

evaluator = CEBinaryClassificationEvaluator.from_input_examples(dev_examples, name= "Rooms-dev")
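
For context, here is a minimal sketch of how a hard 0/1 label could be read off a two-label head. It assumes, as the quoted documentation suggests, that `num_labels=2` yields one logit per class for each pair. The logits are made-up values and `argmax_labels` is a hypothetical helper, not part of sentence-transformers.

```python
from typing import List, Sequence


def argmax_labels(logits: Sequence[Sequence[float]]) -> List[int]:
    """Return, for each row of per-class logits, the index of the largest logit."""
    return [max(range(len(row)), key=lambda i: row[i]) for row in logits]


# Hypothetical output for three pairs from a model with two labels:
# each row is [logit for class 0, logit for class 1].
example_logits = [
    [2.1, -1.3],
    [-0.4, 0.9],
    [0.2, 0.1],
]

predicted = argmax_labels(example_logits)
print(predicted)  # [0, 1, 0]
```

Note that this produces one value per pair only after the argmax step; the raw output still has two values per pair.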

Yet when I run the code, it raises the following error as soon as it tries to evaluate for the first time:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

I tried the same code before with num_labels=1 and it worked, though that isn't (I think) quite what I'd like, since I want the model to predict either 0 or 1 and not a continuous value from 0 to 1.

Any idea what might be causing this?
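
If the single-score setup is kept, a 0/1 decision can still be derived by thresholding the score. Below is a minimal sketch, assuming the scores lie between 0 and 1. The 0.5 cut-off and the helper name `to_binary` are illustrative assumptions, and in practice the threshold would likely be tuned on the dev set.

```python
from typing import Iterable, List


def to_binary(scores: Iterable[float], threshold: float = 0.5) -> List[int]:
    """Map each continuous score to 1 if it reaches the threshold, else 0."""
    return [1 if score >= threshold else 0 for score in scores]


# Made-up scores standing in for the output of a model with num_labels=1.
example_scores = [0.03, 0.91, 0.5, 0.48]
print(to_binary(example_scores))  # [0, 1, 1, 0]
```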

Code and error message below:

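# imports assumed by the code below (rooms_df is a pandas DataFrame loaded earlier)
import logging
import math
import random
from datetime import datetime

from sentence_transformers import InputExample
from sentence_transformers.cross_encoder import CrossEncoder
from sentence_transformers.cross_encoder.evaluation import CEBinaryClassificationEvaluator
from torch.utils.data import DataLoader
from tqdm import trange

logger = logging.getLogger(__name__)

# extracting rooms & labels into lists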
rooms_1 = rooms_df["room_ner1"].tolist()
rooms_2 = rooms_df["room_ner2"].tolist()
labels = rooms_df["label"].tolist()

# creating array of room-pairs
dataset = [
    [original, candidate, label]
    for original, candidate, label
    in zip(rooms_1, rooms_2, labels)
]

random.shuffle(dataset)

dev_sample_size = int(len(dataset) * 0.10)

dev_data = dataset[: dev_sample_size]
train_data = dataset[dev_sample_size: ]

# preparing training dataset
train_examples = list()
n_examples = len(train_data)

# creating training dataset (each pair is added in both orders)
for i in trange(n_examples):
    example = train_data[i]
    train_examples.append(InputExample(texts= [example[0], example[1]], label= example[2]))
    train_examples.append(InputExample(texts= [example[1], example[0]], label= example[2]))

# preparing dev (evaluation) dataset
dev_examples = list()
d_examples = len(dev_data)

# creating dev (evaluation) dataset
for i in trange(d_examples):
    example = dev_data[i]
    dev_examples.append(InputExample(texts= [example[0], example[1]], label= example[2]))

train_batch_size = 16
num_epochs = 4
model_save_path = "output/training_rooms-" + datetime.now().strftime("%Y-%m-%d_%H-%M-%S")

model = CrossEncoder("distilroberta-base", num_labels= 2)#, device= "mps")

train_dataloader = DataLoader(
    train_examples, 
    shuffle= True,
    batch_size= train_batch_size
)

evaluator = CEBinaryClassificationEvaluator.from_input_examples(dev_examples, name= "Rooms-dev") 

warmup_steps = math.ceil(len(train_dataloader) * num_epochs * 0.1) # 10% of the training steps for warm-up
logger.info(f"Warmup-steps: {warmup_steps:_}")

model.fit(
    train_dataloader= train_dataloader,
    evaluator= evaluator,
    epochs= num_epochs,
    evaluation_steps= 5_000,
    warmup_steps= warmup_steps,
    output_path= model_save_path
)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[67], line 1
----> 1 model.fit(
      2     train_dataloader= train_dataloader,
      3     evaluator= evaluator,
      4     epochs= num_epochs,
      5     evaluation_steps= 5_000,
      6     warmup_steps= warmup_steps,
      7     output_path= model_save_path
      8 )

File ~/IronHack/GitHub/room-matching/.venv/lib/python3.11/site-packages/sentence_transformers/cross_encoder/CrossEncoder.py:275, in CrossEncoder.fit(self, train_dataloader, evaluator, epochs, loss_fct, activation_fct, scheduler, warmup_steps, optimizer_class, optimizer_params, weight_decay, evaluation_steps, output_path, save_best_model, max_grad_norm, use_amp, callback, show_progress_bar)
    272 training_steps += 1
    274 if evaluator is not None and evaluation_steps > 0 and training_steps % evaluation_steps == 0:
--> 275     self._eval_during_training(
    276         evaluator, output_path, save_best_model, epoch, training_steps, callback
    277     )
    279     self.model.zero_grad()
    280     self.model.train()

File ~/IronHack/GitHub/room-matching/.venv/lib/python3.11/site-packages/sentence_transformers/cross_encoder/CrossEncoder.py:450, in CrossEncoder._eval_during_training(self, evaluator, output_path, save_best_model, epoch, steps, callback)
    448 """Runs evaluation during the training"""
    449 if evaluator is not None:
--> 450     score = evaluator(self, output_path=output_path, epoch=epoch, steps=steps)
    451     if callback is not None:
    452         callback(score, epoch, steps)

File ~/IronHack/GitHub/room-matching/.venv/lib/python3.11/site-packages/sentence_transformers/cross_encoder/evaluation/CEBinaryClassificationEvaluator.py:81, in CEBinaryClassificationEvaluator.__call__(self, model, output_path, epoch, steps)
     76 logger.info("CEBinaryClassificationEvaluator: Evaluating the model on " + self.name + " dataset" + out_txt)
     77 pred_scores = model.predict(
     78     self.sentence_pairs, convert_to_numpy=True, show_progress_bar=self.show_progress_bar
     79 )
---> 81 acc, acc_threshold = BinaryClassificationEvaluator.find_best_acc_and_threshold(pred_scores, self.labels, True)
     82 f1, precision, recall, f1_threshold = BinaryClassificationEvaluator.find_best_f1_and_threshold(
     83     pred_scores, self.labels, True
     84 )
     85 ap = average_precision_score(self.labels, pred_scores)

File ~/IronHack/GitHub/room-matching/.venv/lib/python3.11/site-packages/sentence_transformers/evaluation/BinaryClassificationEvaluator.py:226, in BinaryClassificationEvaluator.find_best_acc_and_threshold(scores, labels, high_score_more_similar)
    223 assert len(scores) == len(labels)
    224 rows = list(zip(scores, labels))
--> 226 rows = sorted(rows, key=lambda x: x[0], reverse=high_score_more_similar)
    228 max_acc = 0
    229 best_threshold = -1

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
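
The last frame of the traceback can be reproduced without a model. This is a minimal sketch, assuming that with `num_labels=2` the call to `model.predict` returns an array of shape `(n_pairs, 2)` rather than `(n_pairs,)`. The arrays are made-up stand-ins and `sort_rows` is a hypothetical helper that mirrors the two lines shown in `find_best_acc_and_threshold`. Sorting the `(score, label)` rows then compares two-element arrays, which NumPy refuses to reduce to a single truth value.

```python
import numpy as np

labels = [1, 0, 1]

# Stand-in for predictions with one score per pair (num_labels=1).
scores_1d = np.array([0.9, 0.2, 0.7])
# Stand-in for predictions with two values per pair (num_labels=2).
scores_2d = np.array([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]])


def sort_rows(scores, labels):
    """Mirror the zip-and-sort step shown in the traceback."""
    rows = list(zip(scores, labels))
    return sorted(rows, key=lambda x: x[0], reverse=True)


# One score per pair: the sort keys are scalars, so sorting works.
print([label for _, label in sort_rows(scores_1d, labels)])  # [1, 1, 0]

# Two values per pair: the sort keys are arrays, so comparing them is ambiguous.
try:
    sort_rows(scores_2d, labels)
except ValueError as err:
    print(f"ValueError: {err}")
```

Under the same assumption, passing a single column such as `scores_2d[:, 1]` gives the sorting step scalar keys again.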

Thank you in advance for any help.

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() when training CrossEncoder