recbole.evaluator.evaluators

class recbole.evaluator.evaluators.LossEvaluator(config, metrics)[source]

Bases: recbole.evaluator.abstract_evaluator.IndividualEvaluator

Loss Evaluator is mainly used in rating prediction and click-through rate prediction. Now, we support four loss metrics: ‘AUC’, ‘RMSE’, ‘MAE’, and ‘LOGLOSS’.

Note

These metrics are not group-based, i.e. they do not average metric scores across users, and they are not limited to a cutoff k. Instead, they are calculated over the entire prediction results, regardless of the users.

collect(interaction, pred_scores)[source]

Collect the intermediate loss result of one batch. This function mainly concatenates preds and trues. It is called at the end of each batch.

Parameters
  • interaction (Interaction) – the Interaction object of the batch

  • pred_scores (tensor) – the tensor of model output with a size of (N, )

Returns

a batch of true and predicted scores with a size of (N, 2)

Return type

tensor
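
The (N, 2) layout is easiest to see with dummy tensors. Below is a minimal sketch of stacking trues and preds; the column order shown here is an assumption for illustration, not a guaranteed layout:

import torch

trues = torch.tensor([1., 0., 1., 0.])            # ground-truth labels of one batch, shape (N,)
pred_scores = torch.tensor([0.9, 0.2, 0.7, 0.4])  # model outputs, shape (N,)

# one row per sample: column 0 holds the label, column 1 the prediction
batch_matrix = torch.stack((trues, pred_scores), dim=1)  # shape (N, 2)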

evaluate(batch_matrix_list, *args)[source]

Calculate the metrics of all batches. It is called at the end of each epoch.

Parameters

batch_matrix_list (list) – the results of all batches

Returns

such as {'AUC': 0.83}

Return type

dict
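
A hedged sketch of the epoch-level step: concatenate the per-batch (N, 2) matrices and compute two of the supported metrics by hand. This illustrates the idea only and assumes the column order from the sketch above; it is not RecBole's implementation:

import torch

def evaluate_loss_sketch(batch_matrix_list):
    matrix = torch.cat(batch_matrix_list, dim=0)  # (total_N, 2) over the whole epoch
    trues, preds = matrix[:, 0], matrix[:, 1]
    return {
        'RMSE': torch.sqrt(torch.mean((trues - preds) ** 2)).item(),
        'MAE': torch.mean(torch.abs(trues - preds)).item(),
    }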

class recbole.evaluator.evaluators.RankEvaluator(config, metrics)[source]

Bases: recbole.evaluator.abstract_evaluator.GroupedEvaluator

Rank Evaluator is mainly used in ranking tasks other than top-k tasks. Now, we support one rank metric: ‘GAUC’.

Note

The metric used is group-based: the metric score is averaged across users. Unlike top-k metrics, it is not limited to a cutoff k.

average_rank(scores)[source]

Get the ranking of an ordered tensor, and take the average of the ranking for positions with equal values.

Parameters

scores (tensor) – an ordered tensor, with size of (N, )

Returns

the average rank of each position, with ties sharing the mean of their ranks

Return type

torch.Tensor

Example

>>> average_rank(tensor([[1,2,2,2,3,3,6],[2,2,2,2,4,5,5]]))
tensor([[1.0000, 3.0000, 3.0000, 3.0000, 5.5000, 5.5000, 7.0000],
[2.5000, 2.5000, 2.5000, 2.5000, 5.0000, 6.5000, 6.5000]])
Reference:

https://github.com/scipy/scipy/blob/v0.17.1/scipy/stats/stats.py#L5262-L5352
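
For intuition, here is a minimal PyTorch sketch of the same tie-averaging idea on a 1-D ordered tensor. It is a simplified re-implementation for illustration, not RecBole's batched code:

import torch

def average_rank_1d(sorted_scores):
    # group equal values that are adjacent in the ordered tensor
    _, counts = torch.unique_consecutive(sorted_scores, return_counts=True)
    ends = torch.cumsum(counts, dim=0).float()  # last 1-based rank of each tie group
    starts = ends - counts + 1                  # first 1-based rank of each tie group
    group_rank = (starts + ends) / 2            # tied positions share the mean of their ranks
    return torch.repeat_interleave(group_rank, counts)

>>> average_rank_1d(torch.tensor([1., 2., 2., 2., 3., 3., 6.]))
tensor([1.0000, 3.0000, 3.0000, 3.0000, 5.5000, 5.5000, 7.0000])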

collect(interaction, scores_tensor)[source]

Collect the intermediate rank result of one batch. This function mainly performs ranking and calculates the sum of ranks for positive items. It is called at the end of each batch.

Parameters
  • interaction (Interaction) – the Interaction object of the batch

  • scores_tensor (tensor) – the tensor of model output with size of (N, )

evaluate(batch_matrix_list, eval_data)[source]

Calculate the metrics of all batches. It is called at the end of each epoch.

Parameters
  • batch_matrix_list (list) – the results of all batches

  • eval_data (Dataset) – the evaluation dataset

Returns

such as {'GAUC': 0.9286}

Return type

dict
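
As a rough illustration of what a group-based AUC computes, the sketch below averages per-user AUC values weighted by each user's number of interactions, using scikit-learn's roc_auc_score. The weighting scheme and the handling of users with only one class are assumptions; RecBole's GAUC may differ in these details:

import numpy as np
from sklearn.metrics import roc_auc_score

def gauc_sketch(user_ids, trues, preds):
    user_ids, trues, preds = map(np.asarray, (user_ids, trues, preds))
    num, den = 0.0, 0.0
    for u in np.unique(user_ids):
        mask = user_ids == u
        if trues[mask].min() == trues[mask].max():
            continue  # AUC is undefined for a user with only one class
        w = mask.sum()  # assumed weight: the user's number of interactions
        num += w * roc_auc_score(trues[mask], preds[mask])
        den += w
    return num / den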

get_user_pos_len_list(interaction, scores_tensor)[source]

Get the number of positive items and the number of all items in the test set for each user.

Parameters
  • interaction (Interaction) – AbstractEvaluator of the batch

  • scores_tensor (tensor) – the tensor of model output with size of (N, )

Returns

two lists – the number of positive items of each user, and the number of all items of each user

Return type

list, list

class recbole.evaluator.evaluators.TopKEvaluator(config, metrics)[source]

Bases: recbole.evaluator.abstract_evaluator.GroupedEvaluator

TopK Evaluator is mainly used in ranking tasks. Now, we support six topk metrics: ‘Hit’, ‘Recall’, ‘MRR’, ‘Precision’, ‘NDCG’, and ‘MAP’.

Note

The metrics used are group-based: the metric scores are averaged across users. Some of them are also limited to a cutoff k.

collect(interaction, scores_tensor)[source]

Collect the intermediate top-k result of one batch. This function mainly performs padding and top-k finding. It is called at the end of each batch.

Parameters
  • interaction (Interaction) – the Interaction object of the batch

  • scores_tensor (tensor) – the tensor of model output with size of (N, )

Returns

a matrix containing the topk matrix and the shape matrix

Return type

torch.Tensor
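
The top-k finding step can be pictured with torch.topk on a user-item score matrix. A minimal sketch of the idea follows; the padding step and the packed layout of topk matrix plus shape matrix actually returned by collect() are omitted:

import torch

scores = torch.tensor([[0.1, 0.9, 0.3, 0.7],   # user 0's scores over 4 items
                       [0.8, 0.2, 0.6, 0.4]])  # user 1's scores
k = 2
_, topk_index = torch.topk(scores, k, dim=-1)  # each user's top-k item indices
# topk_index: tensor([[1, 3],
#                     [0, 2]])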

evaluate(batch_matrix_list, eval_data)[source]

Calculate the metrics of all batches. It is called at the end of each epoch.

Parameters
  • batch_matrix_list (list) – the results of all batches

  • eval_data (Dataset) – the evaluation dataset

Returns

such as {'Hit@20': 0.3824, 'Recall@20': 0.0527, 'Hit@10': 0.3153, 'Recall@10': 0.0329}

Return type

dict
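
To make the returned dictionary concrete, here is a hedged sketch of Hit@k and Recall@k computed from a binary matrix pos_index (1 where the item at that rank position is a positive test item) and per-user positive counts pos_len. Both names are illustrative; this shows the metric definitions, not RecBole's implementation:

import torch

def hit_and_recall_at_k(pos_index, pos_len, k):
    topk = pos_index[:, :k].float()
    hit = (topk.sum(dim=1) > 0).float().mean().item()           # Hit@k: any positive in the top k
    recall = (topk.sum(dim=1) / pos_len.float()).mean().item()  # Recall@k: share of positives retrieved
    return {'Hit@%d' % k: round(hit, 4), 'Recall@%d' % k: round(recall, 4)}

>>> hit_and_recall_at_k(torch.tensor([[1, 0, 1], [0, 0, 0]]), torch.tensor([2, 1]), k=2)
{'Hit@2': 0.5, 'Recall@2': 0.25}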