FEARec¶
- Reference:
Xinyu Du et al. “Frequency Enhanced Hybrid Attention Network for Sequential Recommendation.” In SIGIR 2023.
- Reference code:
- class recbole.model.sequential_recommender.fearec.FEABlock(n_heads, hidden_size, intermediate_size, hidden_dropout_prob, attn_dropout_prob, hidden_act, layer_norm_eps, n, config)[source]¶
Bases:
torch.nn.modules.module.Module
One transformer layer consists of a multi-head self-attention layer and a point-wise feed-forward layer.
- Parameters
hidden_states (torch.Tensor) – the input of the multi-head self-attention sublayer
attention_mask (torch.Tensor) – the attention mask for the multi-head self-attention sublayer
- Returns
- The output of the point-wise feed-forward sublayer, which is also the output of the transformer layer.
- Return type
feedforward_output (torch.Tensor)
- forward(hidden_states, attention_mask)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
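A minimal sketch of how such a block is typically wired, using standard PyTorch modules as stand-ins for the HybridAttention and FeedForward sublayers documented below (the real FEABlock uses those custom sublayers and config-driven hyperparameters):

import torch
import torch.nn as nn

class SketchBlock(nn.Module):
    """Illustrative stand-in for one FEABlock-style transformer layer."""

    def __init__(self, n_heads=2, hidden_size=64, intermediate_size=256, dropout=0.5):
        super().__init__()
        # stand-in for the hybrid (time + frequency domain) attention sublayer
        self.attention = nn.MultiheadAttention(hidden_size, n_heads,
                                               dropout=dropout, batch_first=True)
        # stand-in for the point-wise feed-forward sublayer
        self.ffn = nn.Sequential(
            nn.Linear(hidden_size, intermediate_size), nn.GELU(),
            nn.Linear(intermediate_size, hidden_size), nn.Dropout(dropout),
        )
        self.ln1 = nn.LayerNorm(hidden_size, eps=1e-12)
        self.ln2 = nn.LayerNorm(hidden_size, eps=1e-12)

    def forward(self, hidden_states, attention_mask=None):
        # attention sublayer with residual connection and layer norm
        attn_output, _ = self.attention(hidden_states, hidden_states, hidden_states,
                                        attn_mask=attention_mask)
        hidden_states = self.ln1(hidden_states + attn_output)
        # point-wise feed-forward sublayer; its output is the layer's output
        feedforward_output = self.ln2(hidden_states + self.ffn(hidden_states))
        return feedforward_output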
- class recbole.model.sequential_recommender.fearec.FEAEncoder(n_layers=2, n_heads=2, hidden_size=64, inner_size=256, hidden_dropout_prob=0.5, attn_dropout_prob=0.5, hidden_act='gelu', layer_norm_eps=1e-12, config=None)[source]¶
Bases:
torch.nn.modules.module.Module
One TransformerEncoder consists of several TransformerLayers.
- Parameters
n_layers (int) – number of transformer layers in the transformer encoder. Default: 2
n_heads (int) – number of attention heads for the multi-head attention layer. Default: 2
hidden_size (int) – the input and output hidden size. Default: 64
inner_size (int) – the dimensionality of the feed-forward layer. Default: 256
hidden_dropout_prob (float) – probability of an element to be zeroed. Default: 0.5
attn_dropout_prob (float) – probability of an attention score to be zeroed. Default: 0.5
hidden_act (str) – activation function in the feed-forward layer. Default: 'gelu'. Candidates: 'gelu', 'relu', 'swish', 'tanh', 'sigmoid'
layer_norm_eps (float) – a value added to the denominator for numerical stability. Default: 1e-12
- forward(hidden_states, attention_mask, output_all_encoded_layers=True)[source]¶
- Parameters
hidden_states (torch.Tensor) – the input of the TransformerEncoder
attention_mask (torch.Tensor) – the attention mask for the input hidden_states
output_all_encoded_layers (bool) – whether to output the hidden states of all transformer layers
- Returns
If output_all_encoded_layers is True, a list containing the outputs of all transformer layers; otherwise, a list containing only the output of the last transformer layer.
- Return type
all_encoder_layers (list)
- training: bool¶
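The stacking semantics of the encoder and the meaning of output_all_encoded_layers can be sketched as follows (an assumed, typical rendering of the loop, not a verbatim copy of FEAEncoder.forward):

def encode(layers, hidden_states, attention_mask, output_all_encoded_layers=True):
    # `layers` would be the n_layers FEABlock modules held by the encoder
    all_encoder_layers = []
    for layer in layers:
        hidden_states = layer(hidden_states, attention_mask)
        if output_all_encoded_layers:
            all_encoder_layers.append(hidden_states)
    if not output_all_encoded_layers:
        # keep only the last layer's output
        all_encoder_layers.append(hidden_states)
    return all_encoder_layers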
- class recbole.model.sequential_recommender.fearec.FEARec(config, dataset)[source]¶
Bases:
recbole.model.abstract_recommender.SequentialRecommender
- calculate_loss(interaction)[source]¶
Calculate the training loss for a batch of data.
- Parameters
interaction (Interaction) – Interaction class of the batch.
- Returns
Training loss, shape: []
- Return type
torch.Tensor
- decompose(z_i, z_j, origin_z, batch_size)[source]¶
We do not sample negative examples explicitly. Instead, given a positive pair, similar to (Chen et al., 2017), we treat the other 2(N − 1) augmented examples within a minibatch as negative examples.
- forward(item_seq, item_seq_len)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- full_sort_predict(interaction)[source]¶
Full sort prediction function. Given users, calculate the scores between those users and all candidate items.
- Parameters
interaction (Interaction) – Interaction class of the batch.
- Returns
Predicted scores for given users and all candidate items, shape: [n_batch_users * n_candidate_items]
- Return type
torch.Tensor
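Conceptually, full-sort prediction reduces to scoring one sequence representation against every item embedding. A hedged sketch of that computation (attribute names such as item_embedding follow the usual RecBole pattern and are assumptions here):

import torch

def full_sort_scores_sketch(model, item_seq, item_seq_len):
    # encode each sequence into a single hidden vector
    seq_output = model.forward(item_seq, item_seq_len)              # [batch_size, hidden_size]
    all_item_emb = model.item_embedding.weight                      # [n_items, hidden_size]
    # inner product against every candidate item
    return torch.matmul(seq_output, all_item_emb.transpose(0, 1))   # [batch_size, n_items]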
- get_attention_mask(item_seq)[source]¶
Generate a left-to-right uni-directional attention mask for multi-head attention.
- get_bi_attention_mask(item_seq)[source]¶
Generate a bidirectional attention mask for multi-head attention.
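A hedged sketch of how such masks are commonly built in RecBole-style sequential models (illustrative, not necessarily FEARec's exact code): padding positions are masked everywhere, and the uni-directional variant additionally blocks attention to future positions.

import torch

def build_attention_mask(item_seq, bidirectional=False):
    # item_seq: [batch_size, seq_len], with 0 assumed to be the padding id
    padding_mask = (item_seq > 0).unsqueeze(1).unsqueeze(2)          # [B, 1, 1, L]
    mask = padding_mask
    if not bidirectional:
        seq_len = item_seq.size(-1)
        causal = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool,
                                       device=item_seq.device))      # [L, L]
        mask = mask & causal                                          # [B, 1, L, L]
    # additive mask: 0 where attention is allowed, a large negative value elsewhere
    return (1.0 - mask.float()) * -10000.0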
- info_nce(z_i, z_j, temp, batch_size, sim='dot')[source]¶
We do not sample negative examples explicitly. Instead, given a positive pair, similar to (Chen et al., 2017), we treat the other 2(N − 1) augmented examples within a minibatch as negative examples.
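The in-batch negative-sampling scheme described above can be sketched as a standard InfoNCE (NT-Xent) loss. A minimal illustration under the 'dot' similarity; the actual info_nce implementation may differ in details:

import torch
import torch.nn.functional as F

def info_nce_sketch(z_i, z_j, temp=0.1):
    # z_i, z_j: two views of the same N sequences, each [N, hidden_size]
    N = z_i.size(0)
    z = torch.cat([z_i, z_j], dim=0)                 # [2N, hidden_size]
    sim = torch.matmul(z, z.t()) / temp              # [2N, 2N] dot-product similarity
    sim.fill_diagonal_(float('-inf'))                # exclude self-similarity
    # the positive for row k is its counterpart in the other view;
    # the remaining 2(N - 1) rows act as in-batch negatives
    targets = (torch.arange(2 * N, device=z.device) + N) % (2 * N)
    return F.cross_entropy(sim, targets)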
- predict(interaction)[source]¶
Predict the scores between users and items.
- Parameters
interaction (Interaction) – Interaction class of the batch.
- Returns
Predicted scores for given users and items, shape: [batch_size]
- Return type
torch.Tensor
- training: bool¶
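FEARec is trained and evaluated like any other RecBole sequential model, for example through the quick-start entry point (the dataset name here is only illustrative; FEARec-specific hyperparameters would normally come from a config file or config_dict):

from recbole.quick_start import run_recbole

run_recbole(model='FEARec', dataset='ml-100k')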
- class recbole.model.sequential_recommender.fearec.FeedForward(hidden_size, inner_size, hidden_dropout_prob, hidden_act, layer_norm_eps)[source]¶
Bases:
torch.nn.modules.module.Module
The point-wise feed-forward layer is implemented with two dense layers.
- Parameters
input_tensor (torch.Tensor) – the input of the point-wise feed-forward layer
- Returns
the output of the point-wise feed-forward layer
- Return type
hidden_states (torch.Tensor)
- forward(input_tensor)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- gelu(x)[source]¶
Implementation of the gelu activation function.
For reference, OpenAI GPT's gelu is slightly different (and gives slightly different results):
0.5 * x * (1 + torch.tanh(math.sqrt(2 / math.pi) * (x + 0.044715 * torch.pow(x, 3))))
Also see https://arxiv.org/abs/1606.08415
- training: bool¶
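For concreteness, the two gelu formulations referred to above can be written as follows (the erf form is the "exact" gelu commonly used in BERT-style code and is presumably what this module implements; the tanh form is the OpenAI GPT approximation quoted in the docstring):

import math
import torch

def gelu_erf(x):
    # "exact" gelu based on the Gaussian CDF
    return x * 0.5 * (1.0 + torch.erf(x / math.sqrt(2.0)))

def gelu_tanh(x):
    # OpenAI GPT's tanh approximation
    return 0.5 * x * (1.0 + torch.tanh(math.sqrt(2.0 / math.pi)
                                       * (x + 0.044715 * torch.pow(x, 3))))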
- class recbole.model.sequential_recommender.fearec.HybridAttention(n_heads, hidden_size, hidden_dropout_prob, attn_dropout_prob, layer_norm_eps, i, config)[source]¶
Bases:
torch.nn.modules.module.Module
Hybrid Attention layer: combines a time-domain self-attention layer with a frequency-domain attention layer.
- Parameters
input_tensor (torch.Tensor) – the input of the multi-head Hybrid Attention layer
attention_mask (torch.Tensor) – the attention mask for input tensor
- Returns
the output of the multi-head Hybrid Attention layer
- Return type
hidden_states (torch.Tensor)
- forward(input_tensor, attention_mask)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- time_delay_agg_inference(values, corr)[source]¶
Speed-up version of autocorrelation (a batch-normalization style design). This is for the inference phase.
- time_delay_agg_training(values, corr)[source]¶
Speed-up version of autocorrelation (a batch-normalization style design). This is for the training phase.
- training: bool¶
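The frequency-domain side of the hybrid attention and the time_delay_agg_* helpers follow an Autoformer-style autocorrelation idea. A hedged sketch of that idea (not the exact FEARec code): correlations between queries and keys are computed in the frequency domain via an FFT, and values are then aggregated at the most correlated time delays.

import torch

def autocorrelation(q, k):
    # q, k: [batch, length, channels]; correlation via FFT (Wiener–Khinchin)
    q_fft = torch.fft.rfft(q, dim=1)
    k_fft = torch.fft.rfft(k, dim=1)
    return torch.fft.irfft(q_fft * torch.conj(k_fft), n=q.size(1), dim=1)

def aggregate_top_delays(values, corr, top_k=2):
    # roll the values by each of the top-k delays and combine them,
    # weighted by the softmax of their correlation scores
    mean_corr = corr.mean(dim=-1)                          # [batch, length]
    weights, delays = torch.topk(mean_corr, top_k, dim=-1)
    weights = torch.softmax(weights, dim=-1)
    out = torch.zeros_like(values)
    for b in range(values.size(0)):
        for i in range(top_k):
            out[b] += torch.roll(values[b], -int(delays[b, i]), dims=0) * weights[b, i]
    return out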