# LightSANs¶

## Introduction¶

**Title:** Lighter and Better: Low-Rank Decomposed Self-Attention Networks for Next-Item Recommendation

**Authors:** Xinyan Fan, Zheng Liu, Jianxun Lian, Wayne Xin Zhao, Xing Xie, and Ji-Rong Wen

**Abstract:** Self-attention networks (SANs) have been intensively applied for sequential recommenders,
but they are limited due to: (1) the quadratic complexity and vulnerability to over-parameterization
in self-attention; (2) inaccurate modeling of sequential relations between items due
to the implicit position encoding. In this work, we propose the low-rank decomposed
self-attention networks (LightSANs) to overcome these problems. Particularly, we introduce
the low-rank decomposed self-attention, which projects user’s historical items into
a small constant number of latent interests and leverages item-to-interest interaction
to generate the context-aware representation. It scales linearly w.r.t. the user’s
historical sequence length in terms of time and space, and is more resilient to over-parameterization.
Besides, we design the decoupled position encoding, which models the sequential relations
between items more precisely. Extensive experimental studies are carried out on three
real-world datasets, where LightSANs outperform the existing SANs-based recommenders
in terms of both effectiveness and efficiency.

## Running with RecBole¶

**Model Hyper-Parameters:**

`num_layers (int)`

: The number of transformer layers in transformer encoder. Default to`2`

.`n_heads (int)`

: The number of attention heads for multi-head attention layer. Defaults to`2`

.`k_interests (int)`

: The number of latent interests dimension in item-to-interest aggregation. Defaults to`5`

. We recommend setting k_interests=0.1*MAX_ITEM_LIST_LENGTH to ensure the compression effect.`hidden_size (int)`

: The input and output hidden size. Defaults to`64`

.`inner_size (int)`

: The imensionality in feed-forward layer. Defaults to`256`

.`hidden_dropout_prob (float)`

: The probability of an element to be zeroed. Defaults to`0.5`

.`attn_dropout_prob (float)`

: The probability of an attention score to be zeroed. Defaults to`0.5`

.`hidden_act (str)`

: The activation function in feed-forward layer. Defaults to`'gelu'`

. Range in`['gelu', 'relu', 'swish', 'tanh', 'sigmoid']`

.`layer_norm_eps (float)`

: The value added to the denominator for numerical stability. Defaults to`1e-12`

.`initializer_range (float)`

: The range of weights initialization. Defaults to`0.02`

.`loss_type (str)`

: The type of loss function. If`loss_type='CE'`

, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If`loss_type='BPR'`

, the training task will be optimized in the pair-wise way, which maximizes the difference between positive item and negative item. In this way, negative sampling is necessary, such as setting`--neg_sampling="{'uniform': 1}"`

. Defaults to`'CE'`

. Range in`['BPR', 'CE']`

.

**A Running Example:**

Write the following code to a python file, such as run.py

```
from recbole.quick_start import run_recbole
parameter_dict = {
'neg_sampling': None,
}
run_recbole(model='LightSANs', dataset='ml-100k', config_dict=parameter_dict)
```

And then:

```
python run.py
```

## Tuning Hyper Parameters¶

If you want to use `HyperTuning`

to tune hyper parameters of this model, you can copy the following settings and name it as `hyper.test`

.

```
learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]
num_layers choice [1,2,3]
k_interests choice [10,15,20]
```

Note that we just provide these hyper parameter ranges for reference only, and we can not guarantee that they are the optimal range of this model.

Then, with the source code of RecBole (you can download it from GitHub), you can run the `run_hyper.py`

to tuning:

```
python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
```

For more details about Parameter Tuning, refer to Parameter Tuning.

If you want to change parameters, dataset or evaluation settings, take a look at