I am unable to reproduce my results in PyTorch after adding an nn.ReLU(). I am sure the problem is in this line rather than anywhere else, since I have run the ablation hundreds of times. It is strange that an activation function affects reproducibility at all. Even when I replace nn.ReLU() with an equivalent torch.max() call, the results are still not reproducible.
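For reference, the torch.max() variant I tried looks something like this sketch (an element-wise max against a zero tensor, which is mathematically identical to ReLU):

user_embeddings = torch.max(user_embeddings, torch.zeros_like(user_embeddings))  # same result as nn.ReLU()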
Code used to make the experiments deterministic:
import os
import random

import numpy as np
import torch

seed = 0
np.random.seed(seed)
random.seed(seed)
os.environ['PYTHONHASHSEED'] = str(seed)
os.environ['CUBLAS_WORKSPACE_CONFIG'] = ':4096:8'  # required by use_deterministic_algorithms on CUDA
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
torch.cuda.manual_seed_all(seed)
torch.backends.cudnn.enabled = False
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
torch.use_deterministic_algorithms(True)
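To find where two runs first diverge, I use a small cross-run comparison helper. This is a sketch with my own names (compare_with_previous_run and debug_tensor.pt are mine, not part of the DiffNet code): it saves a reference tensor on the first run and compares bitwise on later runs.

import os
import torch

def compare_with_previous_run(tensor, path='debug_tensor.pt'):
    # First run: save the tensor as a reference on disk.
    # Later runs: load the reference and compare bitwise.
    t = tensor.detach().cpu()
    if not os.path.exists(path):
        torch.save(t, path)
        print('saved reference tensor to', path)
    else:
        ref = torch.load(path)
        print('bitwise equal to previous run:', torch.equal(ref, t))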
Problematic code:
import torch
import torch.nn as nn
# TorchGraphInterface is provided by the surrounding DiffNet codebase

class DiffNet_Encoder(nn.Module):
    def __init__(self, data, social_data, emb_size, n_layers):
        super(DiffNet_Encoder, self).__init__()
        self.data = data
        self.social_data = social_data
        self.latent_size = emb_size
        self.layers = n_layers
        self.norm_inter = data.norm_inter
        self.embedding_dict = self._init_model()
        self.sparse_norm_inter = TorchGraphInterface.convert_sparse_mat_to_tensor(self.norm_inter).cuda()
        self.sparse_social_adj = TorchGraphInterface.convert_sparse_mat_to_tensor(self.social_data.tocsr()).cuda()

    def _init_model(self):
        initializer = nn.init.xavier_uniform_
        embedding_dict = {'weight%d' % k: nn.Parameter(initializer(torch.empty(2 * self.latent_size, self.latent_size))) for k in range(self.layers)}
        embedding_dict['user_emb'] = nn.Parameter(initializer(torch.empty(self.data.user_num, self.latent_size)))
        embedding_dict['item_emb'] = nn.Parameter(initializer(torch.empty(self.data.item_num, self.latent_size)))
        embedding_dict = nn.ParameterDict(embedding_dict)
        return embedding_dict

    def forward(self):
        user_embeddings = self.embedding_dict['user_emb']
        item_embeddings = self.embedding_dict['item_emb']  # null operation: item embeddings are never updated in the loop
        for k in range(self.layers):
            # diffuse user embeddings over the social graph
            new_user_embeddings = torch.sparse.mm(self.sparse_social_adj, user_embeddings)
            user_embeddings = torch.matmul(torch.cat([new_user_embeddings, user_embeddings], dim=1), self.embedding_dict['weight%d' % k])
            user_embeddings = nn.ReLU()(user_embeddings)  # HERE!!!!!! If I comment this line out, the results are reproducible
        # aggregate item embeddings from the interaction graph
        final_user_embeddings = user_embeddings + torch.sparse.mm(self.sparse_norm_inter, item_embeddings)
        return final_user_embeddings, item_embeddings.data
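For context, the encoder is used roughly as follows (a sketch only; data, social_data, and the hyperparameters are placeholders that come from the surrounding DiffNet pipeline):

# Hypothetical usage; data/social_data are built elsewhere in the pipeline.
encoder = DiffNet_Encoder(data, social_data, emb_size=64, n_layers=2).cuda()
user_emb, item_emb = encoder()
scores = torch.matmul(user_emb, item_emb.t())  # user-item scores for top-k ranking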
The line user_embeddings = nn.ReLU()(user_embeddings) is what makes the results non-reproducible.
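To pin down whether the divergence starts at the activation or upstream, the tensor just before the ReLU can be compared across two runs with the helper above (again, my own debug code, not part of DiffNet):

# inside forward(), just before the activation:
compare_with_previous_run(user_embeddings, path='pre_relu_%d.pt' % k)
user_embeddings = nn.ReLU()(user_embeddings)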
Result W/ ReLU():
2024-05-06 11:25:48,930 - DiffNet - INFO - ###Evaluation Results###
2024-05-06 11:25:48,930 - DiffNet - INFO - ['Top 10\n', 'Hit Ratio:0.15284\n', 'Precision:0.13263\n', 'Recall:0.15036\n', 'NDCG:0.16989\n']
2024-05-06 11:18:04,953 - DiffNet - INFO - ###Evaluation Results###
2024-05-06 11:18:04,953 - DiffNet - INFO - ['Top 10\n', 'Hit Ratio:0.15507\n', 'Precision:0.13457\n', 'Recall:0.15272\n', 'NDCG:0.17073\n']
Result W/O ReLU():
2024-05-06 12:34:17,763 - DiffNet - INFO - ###Evaluation Results###
2024-05-06 12:34:17,763 - DiffNet - INFO - ['Top 10\n', 'Hit Ratio:0.14373\n', 'Precision:0.12473\n', 'Recall:0.14033\n', 'NDCG:0.16247\n']
2024-05-06 12:33:55,630 - DiffNet - INFO - ###Evaluation Results###
2024-05-06 12:33:55,631 - DiffNet - INFO - ['Top 10\n', 'Hit Ratio:0.14373\n', 'Precision:0.12473\n', 'Recall:0.14033\n', 'NDCG:0.16247\n']
The code is a copy of DiffNet, a social recommendation model.