FEARec¶
- Reference:
Xinyu Du et al. “Frequency Enhanced Hybrid Attention Network for Sequential Recommendation.” In SIGIR 2023.
- Reference code:
- class recbole.model.sequential_recommender.fearec.FEABlock(n_heads, hidden_size, intermediate_size, hidden_dropout_prob, attn_dropout_prob, hidden_act, layer_norm_eps, n, config)[source]¶
Bases:
Module
One transformer layer consists of a multi-head self-attention layer and a point-wise feed-forward layer.
- Parameters:
hidden_states (torch.Tensor) – the input of the multi-head self-attention sublayer
attention_mask (torch.Tensor) – the attention mask for the multi-head self-attention sublayer
- Returns:
The output of the point-wise feed-forward sublayer, which is the output of the transformer layer.
- Return type:
feedforward_output (torch.Tensor)
- forward(hidden_states, attention_mask)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
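As a rough sketch of the sublayer composition described above, the layer applies its attention sublayer and then the point-wise feed-forward sublayer. The class below is illustrative only (the attention and feed-forward modules are placeholders), not the FEABlock source:

```python
import torch.nn as nn

class TransformerLayerSketch(nn.Module):
    """Illustrative composition: attention sublayer followed by point-wise feed-forward."""

    def __init__(self, attention: nn.Module, feed_forward: nn.Module):
        super().__init__()
        self.attention = attention        # e.g. a HybridAttention-style sublayer
        self.feed_forward = feed_forward  # e.g. a FeedForward-style sublayer

    def forward(self, hidden_states, attention_mask):
        attention_output = self.attention(hidden_states, attention_mask)
        feedforward_output = self.feed_forward(attention_output)
        return feedforward_output
```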
- training: bool¶
- class recbole.model.sequential_recommender.fearec.FEAEncoder(n_layers=2, n_heads=2, hidden_size=64, inner_size=256, hidden_dropout_prob=0.5, attn_dropout_prob=0.5, hidden_act='gelu', layer_norm_eps=1e-12, config=None)[source]¶
Bases:
Module
One TransformerEncoder consists of several TransformerLayers.
- Parameters:
n_layers (num) – number of transformer layers in the transformer encoder. Default: 2
n_heads (num) – number of attention heads for the multi-head attention layer. Default: 2
hidden_size (num) – the input and output hidden size. Default: 64
inner_size (num) – the dimensionality of the feed-forward layer. Default: 256
hidden_dropout_prob (float) – probability of an element to be zeroed. Default: 0.5
attn_dropout_prob (float) – probability of an attention score to be zeroed. Default: 0.5
hidden_act (str) – activation function in the feed-forward layer. Default: ‘gelu’. Candidates: ‘gelu’, ‘relu’, ‘swish’, ‘tanh’, ‘sigmoid’
layer_norm_eps (float) – a value added to the denominator for numerical stability. Default: 1e-12
- forward(hidden_states, attention_mask, output_all_encoded_layers=True)[source]¶
- Parameters:
hidden_states (torch.Tensor) – the input of the TransformerEncoder
attention_mask (torch.Tensor) – the attention mask for the input hidden_states
output_all_encoded_layers (bool) – whether to output the outputs of all transformer layers
- Returns:
If output_all_encoded_layers is True, return a list containing the outputs of all transformer layers; otherwise, return a list containing only the output of the last transformer layer.
- Return type:
all_encoder_layers (list)
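A minimal sketch of the output_all_encoded_layers behavior (an assumption about the typical encoder loop, not the exact source):

```python
def encode(layers, hidden_states, attention_mask, output_all_encoded_layers=True):
    # Hypothetical helper: `layers` is an iterable of transformer layers.
    all_encoder_layers = []
    for layer in layers:
        hidden_states = layer(hidden_states, attention_mask)
        if output_all_encoded_layers:
            all_encoder_layers.append(hidden_states)
    # Keep only the final layer's output when intermediate outputs are not requested.
    if not output_all_encoded_layers:
        all_encoder_layers.append(hidden_states)
    return all_encoder_layers
```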
- training: bool¶
- class recbole.model.sequential_recommender.fearec.FEARec(config, dataset)[source]¶
Bases:
SequentialRecommender
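A minimal way to train and evaluate FEARec end to end is RecBole's quick-start entry point (a sketch; the dataset name and the default hyperparameters are only examples):

```python
from recbole.quick_start import run_recbole

# Trains and evaluates FEARec with the library's default configuration;
# ml-100k is used here purely as an example dataset.
run_recbole(model='FEARec', dataset='ml-100k')
```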
- calculate_loss(interaction)[source]¶
Calculate the training loss for a batch of data.
- Parameters:
interaction (Interaction) – Interaction class of the batch.
- Returns:
Training loss, shape: []
- Return type:
torch.Tensor
- decompose(z_i, z_j, origin_z, batch_size)[source]¶
We do not sample negative examples explicitly. Instead, given a positive pair, similar to (Chen et al., 2017), we treat the other 2(N − 1) augmented examples within a minibatch as negative examples.
- forward(item_seq, item_seq_len)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- full_sort_predict(interaction)[source]¶
Full sort prediction function. Given users, calculate the scores between the users and all candidate items.
- Parameters:
interaction (Interaction) – Interaction class of the batch.
- Returns:
Predicted scores for given users and all candidate items, shape: [n_batch_users * n_candidate_items]
- Return type:
torch.Tensor
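To make the documented return shape concrete, here is a toy sketch (all tensors are dummies; the real method derives the sequence representations and item embeddings from the model and the Interaction):

```python
import torch

n_batch_users, n_candidate_items, hidden_size = 4, 100, 64
seq_output = torch.randn(n_batch_users, hidden_size)           # final sequence representations
item_embeddings = torch.randn(n_candidate_items, hidden_size)

scores = seq_output @ item_embeddings.t()   # [n_batch_users, n_candidate_items]
scores = scores.view(-1)                    # [n_batch_users * n_candidate_items]
```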
- get_attention_mask(item_seq)[source]¶
Generate a left-to-right uni-directional attention mask for multi-head attention.
- get_bi_attention_mask(item_seq)[source]¶
Generate a bidirectional attention mask for multi-head attention.
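A minimal sketch of how such masks are commonly built from a padded item sequence, assuming padding id 0 and an additive mask convention; the exact shapes and mask value inside FEARec may differ:

```python
import torch

def extended_attention_mask(item_seq, bidirectional=False, pad_id=0):
    # Padding positions are masked out; the uni-directional case additionally
    # applies a lower-triangular (causal) constraint.
    mask = (item_seq != pad_id).long()                  # [B, L]
    extended = mask.unsqueeze(1).unsqueeze(2)           # [B, 1, 1, L]
    if not bidirectional:
        seq_len = item_seq.size(-1)
        causal = torch.tril(
            torch.ones(seq_len, seq_len, dtype=torch.long, device=item_seq.device)
        )
        extended = extended * causal                    # [B, 1, L, L]
    # Convert to an additive mask: 0 for visible positions, a large negative
    # value for masked positions (added to the attention logits).
    return (1.0 - extended.float()) * -10000.0
```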
- info_nce(z_i, z_j, temp, batch_size, sim='dot')[source]¶
We do not sample negative examples explicitly. Instead, given a positive pair, similar to (Chen et al., 2017), we treat the other 2(N − 1) augmented examples within a minibatch as negative examples.
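A self-contained sketch of this in-batch InfoNCE (NT-Xent) objective with ‘dot’ similarity, as described above; this is a hypothetical helper, not the exact FEARec code:

```python
import torch
import torch.nn.functional as F

def info_nce_sketch(z_i, z_j, temp):
    # z_i, z_j: [N, d] embeddings of the two views of the same N sequences.
    N = z_i.size(0)
    z = torch.cat([z_i, z_j], dim=0)                     # [2N, d]
    sim = torch.mm(z, z.t()) / temp                      # pairwise 'dot' similarities
    # Mask self-similarity so each row's negatives are the other 2(N - 1) samples.
    sim = sim.masked_fill(torch.eye(2 * N, dtype=torch.bool, device=z.device), float('-inf'))
    # The positive for row i is its paired view in the other half of the batch.
    positives = torch.cat([torch.arange(N, 2 * N), torch.arange(0, N)]).to(z.device)
    return F.cross_entropy(sim, positives)
```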
- predict(interaction)[source]¶
Predict the scores between users and items.
- Parameters:
interaction (Interaction) – Interaction class of the batch.
- Returns:
Predicted scores for given users and items, shape: [batch_size]
- Return type:
torch.Tensor
- training: bool¶
- class recbole.model.sequential_recommender.fearec.FeedForward(hidden_size, inner_size, hidden_dropout_prob, hidden_act, layer_norm_eps)[source]¶
Bases:
Module
The point-wise feed-forward layer is implemented by two dense layers.
- Parameters:
input_tensor (torch.Tensor) – the input of the point-wise feed-forward layer
- Returns:
the output of the point-wise feed-forward layer
- Return type:
hidden_states (torch.Tensor)
- forward(input_tensor)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- gelu(x)[source]¶
Implementation of the gelu activation function.
For information: OpenAI GPT’s gelu is slightly different (and gives slightly different results):
0.5 * x * (1 + torch.tanh(math.sqrt(2 / math.pi) * (x + 0.044715 * torch.pow(x, 3))))
Also see https://arxiv.org/abs/1606.08415
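For reference, the exact (erf-based) gelu that the docstring contrasts with the tanh approximation above is commonly written as follows (a sketch, assuming the standard formulation):

```python
import math
import torch

def gelu(x):
    # Exact gelu via the Gaussian error function; the tanh expression quoted
    # above is OpenAI GPT's approximation of this.
    return x * 0.5 * (1.0 + torch.erf(x / math.sqrt(2.0)))
```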
- training: bool¶
- class recbole.model.sequential_recommender.fearec.HybridAttention(n_heads, hidden_size, hidden_dropout_prob, attn_dropout_prob, layer_norm_eps, i, config)[source]¶
Bases:
Module
Hybrid Attention layer: combines a time-domain self-attention layer and a frequency-domain attention layer.
- Parameters:
input_tensor (torch.Tensor) – the input of the multi-head Hybrid Attention layer
attention_mask (torch.Tensor) – the attention mask for input tensor
- Returns:
the output of the multi-head Hybrid Attention layer
- Return type:
hidden_states (torch.Tensor)
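For intuition about the frequency-domain side, autocorrelation-style attention scores can be computed with FFTs (Wiener-Khinchin theorem); the shapes below are assumptions for illustration, not the exact FEARec layout:

```python
import torch

q = torch.randn(2, 4, 16, 50)   # [batch, heads, channels, length] (illustrative shapes)
k = torch.randn(2, 4, 16, 50)

q_fft = torch.fft.rfft(q, dim=-1)
k_fft = torch.fft.rfft(k, dim=-1)
# Cross-correlation over the time axis via frequency-domain multiplication.
corr = torch.fft.irfft(q_fft * torch.conj(k_fft), n=q.size(-1), dim=-1)  # [2, 4, 16, 50]
```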
- forward(input_tensor, attention_mask)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- time_delay_agg_inference(values, corr)[source]¶
Sped-up version of autocorrelation (a batch-normalization style design). This is for the inference phase.
- time_delay_agg_training(values, corr)[source]¶
Sped-up version of autocorrelation (a batch-normalization style design). This is for the training phase.
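A hedged sketch of this kind of autocorrelation aggregation (in the Autoformer style the docstrings refer to): select the top-k delays by average autocorrelation and combine time-rolled copies of the values with softmax weights. The shapes, the per-batch loop, and the log-based top-k are assumptions, not the exact FEARec code:

```python
import math
import torch

def time_delay_agg_sketch(values, corr, factor=1):
    # values, corr: [batch, heads, channels, length]
    B, H, C, L = values.shape
    top_k = max(1, int(factor * math.log(L)))
    mean_corr = corr.mean(dim=1).mean(dim=1)                  # [B, L]
    weights, delays = torch.topk(mean_corr, top_k, dim=-1)    # [B, top_k]
    tmp_corr = torch.softmax(weights, dim=-1)
    agg = torch.zeros_like(values)
    for i in range(top_k):
        # Roll each sequence by its selected delay and accumulate with its weight.
        rolled = torch.stack(
            [torch.roll(values[b], -int(delays[b, i]), dims=-1) for b in range(B)]
        )
        agg = agg + rolled * tmp_corr[:, i].view(B, 1, 1, 1)
    return agg
```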
- training: bool¶