Training Settings

Training settings control the parameters used during model training.

  • epochs (int) : The number of training epochs. Defaults to 300.

  • train_batch_size (int) : The training batch size. Defaults to 2048.

  • learner (str) : The name of the optimizer to use. Defaults to 'adam'. Range in ['adam', 'sgd', 'adagrad', 'rmsprop', 'sparse_adam'].

  • learning_rate (float) : Learning rate. Defaults to 0.001.

  • neg_sampling (dict) : Controls negative sampling during model training. The key is one of ['uniform', 'popularity'], which decides the distribution negative items are drawn from: uniform selects negative items uniformly at random, while popularity selects them in proportion to their popularity (Counter(item) in the .inter file). Dynamic negative sampling is also supported via the 'dynamic' key. The value k (int) means that k negative items are sampled for each positive item. Note that if your data is labeled, this parameter must be set to None. Defaults to {'uniform': 1}, i.e. one uniformly sampled negative item per positive item. With dynamic negative sampling, the value of 'dynamic' decides the number of candidate negative items: for example, {'uniform': 1, 'dynamic': 2} uniformly samples two candidates for each positive item and dynamically chooses the candidate with the higher score as the negative item.

  • eval_step (int) : The number of training epochs between evaluations on the validation dataset. If it is less than 1, the model will not be evaluated on the validation dataset. Defaults to 1.

  • stopping_step (int) : The threshold for validation-based early stopping: training stops if the validation result has not improved for this many evaluation rounds. Defaults to 10.

  • clip_grad_norm (dict) : The arguments of torch.nn.utils.clip_grad_norm_, which clips the gradient norm of the model (e.g. {'max_norm': 5.0}). Defaults to None, meaning no gradient clipping.

  • loss_decimal_place (int) : The number of decimal places shown for the training loss. Defaults to 4.

  • weight_decay (float) : The weight decay (L2 penalty) applied by the optimizer. Defaults to 0.0.

  • require_pow (bool) : Whether the power operation is performed on the norm in EmbLoss. Defaults to False.
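The settings above can be collected into a configuration dictionary (or the equivalent YAML file). The following is a minimal sketch using only the parameters and default values listed on this page; it is a plain Python dict for illustration and does not assume any particular loading API:

```python
# Example training configuration built from the defaults described above.
# In practice these keys would typically live in a YAML config file or be
# passed as a config dictionary when constructing the model/trainer.
train_config = {
    'epochs': 300,                   # number of training epochs
    'train_batch_size': 2048,        # training batch size
    'learner': 'adam',               # one of: adam, sgd, adagrad, rmsprop, sparse_adam
    'learning_rate': 0.001,          # optimizer learning rate
    'neg_sampling': {'uniform': 1},  # one uniformly sampled negative per positive
    'eval_step': 1,                  # evaluate on the validation set every epoch
    'stopping_step': 10,             # early-stopping patience (evaluation rounds)
    'clip_grad_norm': None,          # e.g. {'max_norm': 5.0} to enable clipping
    'loss_decimal_place': 4,         # decimal places shown for the training loss
    'weight_decay': 0.0,             # L2 penalty applied by the optimizer
    'require_pow': False,            # power operation on the norm in EmbLoss
}

# Simple sanity checks on the configured values.
assert train_config['learner'] in ['adam', 'sgd', 'adagrad', 'rmsprop', 'sparse_adam']
assert train_config['epochs'] >= 1 and train_config['train_batch_size'] >= 1
```

For dynamic negative sampling, 'neg_sampling' would instead be set to something like {'uniform': 1, 'dynamic': 2}, and for labeled data it must be None.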