recbole.data.dataloader.sequential_dataloader

class recbole.data.dataloader.sequential_dataloader.SequentialDataLoader(config, dataset, batch_size=1, dl_format=<InputType.POINTWISE: 1>, shuffle=False)[source]

Bases: recbole.data.dataloader.abstract_dataloader.AbstractDataLoader

SequentialDataLoader is used for sequential models. It performs data augmentation on the original data, and the data it returns contains the following:

  • user id

  • history items list

  • history items’ interaction time list

  • item to be predicted

  • the interaction time of item to be predicted

  • history list length

  • other interaction information of item to be predicted
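The augmentation described above can be sketched as follows. This is a minimal illustration, not the library's implementation: it assumes interactions are already sorted by user and then by time, and the helper name `augment_sequences`, the `max_len` argument, and the dictionary keys are all hypothetical.

```python
import numpy as np


def augment_sequences(user_ids, max_len):
    """Turn sorted interactions into (history, target) samples.

    Hypothetical helper: every interaction except a user's first one
    becomes a target, and the interactions before it form its history.
    """
    uid_list, history_slices, target_index, history_length = [], [], [], []
    last_uid, seq_start = None, 0
    for i, uid in enumerate(np.asarray(user_ids).tolist()):
        if last_uid is None or uid != last_uid:
            # New user: the first interaction has no history, so skip it.
            last_uid, seq_start = uid, i
            continue
        if i - seq_start > max_len:
            # Slide the window so the history keeps at most max_len items.
            seq_start += 1
        uid_list.append(uid)
        history_slices.append(slice(seq_start, i))
        target_index.append(i)
        history_length.append(i - seq_start)
    return {
        "user_id": np.array(uid_list),
        "history_slices": history_slices,
        "target_index": np.array(target_index),
        "history_length": np.array(history_length),
    }


user_ids = np.array([1, 1, 1, 1, 2, 2])
item_ids = np.array([10, 11, 12, 13, 20, 21])
samples = augment_sequences(user_ids, max_len=2)
histories = [item_ids[s].tolist() for s in samples["history_slices"]]
targets = item_ids[samples["target_index"]].tolist()
```

Under these assumptions, a user with four interactions yields three samples, and the history window is capped at `max_len` items.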

Parameters
  • config (Config) – The config of dataloader.

  • dataset (Dataset) – The dataset of dataloader.

  • batch_size (int, optional) – The batch_size of dataloader. Defaults to 1.

  • dl_format (InputType, optional) – The input type of dataloader. Defaults to POINTWISE.

  • shuffle (bool, optional) – Whether the dataloader will be shuffled after a round. Defaults to False.

augmentation(item_list_index, target_index, item_list_length)[source]

Data augmentation.

Parameters
  • item_list_index (numpy.ndarray) – the index of history items list in interaction.

  • target_index (numpy.ndarray) – the index of items to be predicted in interaction.

  • item_list_length (numpy.ndarray) – history list length.

Returns

the augmented data.

Return type

dict
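To illustrate how index arrays like these could be turned into a dictionary of augmented data, here is a hedged sketch. It assumes the history index is a sequence of slices into a flat item array and that 0 is used for padding; the function name `gather_padded_history`, the `max_len` argument, and the returned keys are hypothetical and may differ from what the method actually produces.

```python
import numpy as np


def gather_padded_history(item_ids, item_list_index, target_index,
                          item_list_length, max_len):
    """Gather a zero-padded history matrix and the target items.

    Hypothetical helper; 0 is assumed to be a reserved padding id.
    """
    n_samples = len(target_index)
    history = np.zeros((n_samples, max_len), dtype=item_ids.dtype)
    for row, (index, length) in enumerate(zip(item_list_index, item_list_length)):
        # Copy each history into the left part of its row; the rest stays 0.
        history[row, :length] = item_ids[index]
    return {
        "item_id_list": history,
        "item_id": item_ids[np.asarray(target_index)],
        "item_length": np.asarray(item_list_length),
    }


item_ids = np.array([10, 11, 12, 13, 20, 21])
item_list_index = [slice(0, 1), slice(0, 2), slice(1, 3), slice(4, 5)]
target_index = np.array([1, 2, 3, 5])
item_list_length = np.array([1, 2, 2, 1])
augmented = gather_padded_history(
    item_ids, item_list_index, target_index, item_list_length, max_len=2
)
```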

data_preprocess()[source]

Do data augmentation before training/evaluation.

dl_type = 1

property pr_end

This property marks the end of dataloader.pr which is used in __next__().
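The role of an end marker in pointer-based iteration can be shown with a small stand-alone class. This is only a sketch of the general pattern, assuming a pointer `pr` that advances by `batch_size` on each call; `PointerLoader` is a hypothetical class and does not reproduce the library's dataloader.

```python
class PointerLoader:
    """Hypothetical loader that walks over a list with a pointer `pr`."""

    def __init__(self, data, batch_size):
        self.data = data
        self.batch_size = batch_size
        self.pr = 0

    @property
    def pr_end(self):
        # The pointer stops once it has reached the number of samples.
        return len(self.data)

    def __iter__(self):
        return self

    def __next__(self):
        if self.pr >= self.pr_end:
            # Reset the pointer so the loader can be iterated again.
            self.pr = 0
            raise StopIteration
        batch = self.data[self.pr:self.pr + self.batch_size]
        self.pr += self.batch_size
        return batch


loader = PointerLoader([1, 2, 3, 4, 5], batch_size=2)
batches = list(loader)
```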

class recbole.data.dataloader.sequential_dataloader.SequentialFullDataLoader(config, dataset, sampler, neg_sample_args, batch_size=1, dl_format=<InputType.POINTWISE: 1>, shuffle=False)[source]

Bases: recbole.data.dataloader.neg_sample_mixin.NegSampleMixin, recbole.data.dataloader.sequential_dataloader.SequentialDataLoader

SequentialFullDataLoader is a sequential dataloader with full sort. To speed up calculation, this dataloader only returns the user part of interactions, positive items, and used items. It does not return negative items.
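A rough sketch of why used items are useful in full-sort evaluation: a model scores every item for each user, and the items a user has already interacted with are masked out before ranking. The function `full_sort_topk` and its inputs are hypothetical and only illustrate the idea; they are not the library's evaluation code.

```python
import numpy as np


def full_sort_topk(scores, used_items, k):
    """Rank all items per user after masking already-used items.

    Hypothetical helper: `scores` has shape (n_users, n_items) and
    `used_items[u]` lists the item ids to exclude for user row u.
    """
    masked = scores.astype(float).copy()
    for row, items in enumerate(used_items):
        # Used items get -inf so they can never enter the top-k.
        masked[row, items] = -np.inf
    # Sort by descending score and keep the first k item ids.
    return np.argsort(-masked, axis=1, kind="stable")[:, :k]


scores = np.array([[0.9, 0.1, 0.8, 0.3],
                   [0.2, 0.7, 0.4, 0.6]])
used_items = [[0], [1, 3]]
topk = full_sort_topk(scores, used_items, k=2)
```

No negative items need to be sampled in this scheme, because every unused item acts as a candidate.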

Parameters
  • config (Config) – The config of dataloader.

  • dataset (Dataset) – The dataset of dataloader.

  • sampler (Sampler) – The sampler of dataloader.

  • neg_sample_args (dict) – The neg_sample_args of dataloader.

  • batch_size (int, optional) – The batch_size of dataloader. Defaults to 1.

  • dl_format (InputType, optional) – The input type of dataloader. Defaults to POINTWISE.

  • shuffle (bool, optional) – Whether the dataloader will be shuffled after a round. Defaults to False.

dl_type = 2

get_pos_len_list()[source]

Returns

The number of positive items for each user in a training/evaluating epoch.

Return type

numpy.ndarray or list

get_user_len_list()[source]

Returns

The number of all items for each user in a training/evaluating epoch.

Return type

numpy.ndarray

class recbole.data.dataloader.sequential_dataloader.SequentialNegSampleDataLoader(config, dataset, sampler, neg_sample_args, batch_size=1, dl_format=<InputType.POINTWISE: 1>, shuffle=False)[source]

Bases: recbole.data.dataloader.neg_sample_mixin.NegSampleByMixin, recbole.data.dataloader.sequential_dataloader.SequentialDataLoader

SequentialNegSampleDataLoader is a sequential dataloader with negative sampling. Like GeneralNegSampleDataLoader, it guarantees that, in every batch, each positive interaction and its negative interactions are in the same batch. In addition, in the evaluation stage, when the evaluator is a topk-like function, it also guarantees that all the interactions corresponding to each user are in the same batch and that positive interactions come before negative interactions.
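The batch layout described above (each user's interactions kept together, positives before negatives) can be sketched with plain NumPy. This is an illustrative assumption about the layout only: `build_eval_batch`, its arguments, and the uniform sampling strategy are hypothetical and do not reproduce the library's sampler.

```python
import numpy as np


def build_eval_batch(pos_items, n_items, neg_per_pos, rng):
    """Lay out one block per user: positives first, then sampled negatives.

    Hypothetical helper: `pos_items` maps a user id to its positive item ids.
    """
    users, items, labels = [], [], []
    for user, positives in pos_items.items():
        # Negatives are drawn only from items the user has not interacted with.
        candidates = np.setdiff1d(np.arange(n_items), positives)
        negatives = rng.choice(
            candidates, size=neg_per_pos * len(positives), replace=False
        )
        block = list(positives) + negatives.tolist()
        users += [user] * len(block)
        items += block
        labels += [1] * len(positives) + [0] * len(negatives)
    return np.array(users), np.array(items), np.array(labels)


rng = np.random.default_rng(0)
pos_items = {0: [1, 2], 1: [3]}
users, items, labels = build_eval_batch(pos_items, n_items=10, neg_per_pos=2, rng=rng)
```

Because each user's block is contiguous and ordered, a ranking metric can later recover the positives from their position alone.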

Parameters
  • config (Config) – The config of dataloader.

  • dataset (Dataset) – The dataset of dataloader.

  • sampler (Sampler) – The sampler of dataloader.

  • neg_sample_args (dict) – The neg_sample_args of dataloader.

  • batch_size (int, optional) – The batch_size of dataloader. Defaults to 1.

  • dl_format (InputType, optional) – The input type of dataloader. Defaults to POINTWISE.

  • shuffle (bool, optional) – Whether the dataloader will be shuffled after a round. Defaults to False.

get_pos_len_list()[source]

Returns

The number of positive items for each user in a training/evaluating epoch.

Return type

numpy.ndarray

get_user_len_list()[source]

Returns

The number of all items for each user in a training/evaluating epoch.

Return type

numpy.ndarray
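As an illustration of how the two length lists could be consumed, the sketch below splits a flat score vector into per-user blocks and checks whether any positive item lands in the top k. It assumes the positives-first layout within each user's block; `hit_at_k` is a hypothetical helper, not part of the documented API.

```python
import numpy as np


def hit_at_k(scores, pos_len_list, user_len_list, k):
    """Return, per user, whether a positive item is ranked in the top k.

    Hypothetical helper: within each user's block of `scores`, the first
    `pos_len` entries are assumed to belong to positive items.
    """
    hits, start = [], 0
    for pos_len, user_len in zip(pos_len_list, user_len_list):
        user_scores = scores[start:start + user_len]
        # Positions of the k highest scores inside this user's block.
        topk = np.argsort(-user_scores, kind="stable")[:k]
        hits.append(bool((topk < pos_len).any()))
        start += user_len
    return hits


scores = np.array([0.2, 0.9, 0.5, 0.1, 0.3, 0.8, 0.7])
pos_len_list = np.array([2, 1])
user_len_list = np.array([4, 3])
hits = hit_at_k(scores, pos_len_list, user_len_list, k=1)
```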