recbole.sampler¶
- class recbole.sampler.sampler.AbstractSampler(distribution, alpha)[source]¶
Bases:
object
AbstractSampler
is an abstract class; all samplers should inherit from it. Given input key_ids, it returns a specified number of random value_ids for each key_id, and it can also prohibit certain key-value pairs by setting used_ids.
- Parameters
distribution (str) – The name of the distribution, which is used by subclasses.
- used_ids¶
The result of get_used_ids().
- Type
numpy.ndarray
- get_used_ids()[source]¶
- Returns
Used ids. Index is key_id, and element is a set of value_ids.
- Return type
numpy.ndarray
- sample_by_key_ids(key_ids, num)[source]¶
Sampling by key_ids.
- Parameters
key_ids (numpy.ndarray or list) – Input key_ids.
num (int) – Number of sampled value_ids for each key_id.
- Returns
Sampled value_ids. value_ids[0], value_ids[len(key_ids)], value_ids[len(key_ids) * 2], …, value_ids[len(key_ids) * (num - 1)] are sampled for key_ids[0]; value_ids[1], value_ids[len(key_ids) + 1], value_ids[len(key_ids) * 2 + 1], …, value_ids[len(key_ids) * (num - 1) + 1] are sampled for key_ids[1]; and so on.
- Return type
torch.Tensor
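The layout described above means the flat result consists of `num` consecutive blocks, each holding one sample per key. The following is a minimal pure-Python sketch of that contract, not RecBole's implementation: `sample_flat`, `samples_for_key`, `n_values`, and the rejection loop are illustrative assumptions, and plain lists stand in for the numpy.ndarray and tensor types.

```python
import random


def sample_flat(key_ids, num, used_ids, n_values, seed=0):
    """Hypothetical stand-in for sample_by_key_ids (illustration only).

    used_ids[k] is the set of value_ids that must not be sampled for key k.
    Returns a flat list of length len(key_ids) * num, laid out block by block.
    """
    rng = random.Random(seed)
    tiled_keys = list(key_ids) * num  # key for flat position p is tiled_keys[p]
    values = []
    for key in tiled_keys:
        candidate = rng.randrange(n_values)
        while candidate in used_ids[key]:  # reject prohibited key-value pairs
            candidate = rng.randrange(n_values)
        values.append(candidate)
    return values


def samples_for_key(values, key_ids, i):
    """Collect the samples belonging to key_ids[i] from the flat result."""
    return values[i::len(key_ids)]


used_ids = [{0, 1}, {2}, set()]  # index is key_id, element is a set of value_ids
key_ids = [0, 1]
values = sample_flat(key_ids, num=3, used_ids=used_ids, n_values=5)
print(samples_for_key(values, key_ids, 0))  # the 3 samples drawn for key 0
```

Slicing with a stride of `len(key_ids)` recovers each key's samples; the sketch assumes every key has at least one value_id that is not prohibited, otherwise the rejection loop would not terminate.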
- class recbole.sampler.sampler.KGSampler(dataset, distribution='uniform', alpha=1.0)[source]¶
Bases:
recbole.sampler.sampler.AbstractSampler
KGSampler
is used to sample negative entities in a knowledge graph.
- Parameters
dataset (Dataset) – The knowledge graph dataset, which contains triplets in a knowledge graph.
distribution (str, optional) – Distribution of the negative entities. Defaults to ‘uniform’.
- get_used_ids()[source]¶
- Returns
Used entity_ids are the same as the tail_entity_ids in the knowledge graph. Index is head_entity_id, and element is a set of tail_entity_ids.
- Return type
numpy.ndarray
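As a rough illustration of the structure returned here, the sketch below builds an index-by-head collection of tail sets from a list of triplets. `build_used_ids`, `triplets`, and `n_entities` are hypothetical names, and a plain list of sets stands in for the numpy.ndarray; how RecBole reads triplets from the dataset is not shown.

```python
def build_used_ids(triplets, n_entities):
    """Map each head_entity_id to the set of tail_entity_ids it links to."""
    used_ids = [set() for _ in range(n_entities)]
    for head, _relation, tail in triplets:
        used_ids[head].add(tail)
    return used_ids


# (head, relation, tail) triplets; hypothetical toy data
triplets = [(0, 0, 1), (0, 1, 2), (3, 0, 0)]
used_ids = build_used_ids(triplets, n_entities=4)
print(used_ids[0])  # tails already linked to head 0
```

Heads that never appear in a triplet keep an empty set, so any entity may be sampled for them.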
- sample_by_entity_ids(head_entity_ids, num=1)[source]¶
Sampling by head_entity_ids.
- Parameters
head_entity_ids (numpy.ndarray or list) – Input head_entity_ids.
num (int, optional) – Number of sampled entity_ids for each head_entity_id. Defaults to 1.
- Returns
Sampled entity_ids. entity_ids[0], entity_ids[len(head_entity_ids)], entity_ids[len(head_entity_ids) * 2], …, entity_ids[len(head_entity_ids) * (num - 1)] are sampled for head_entity_ids[0]; entity_ids[1], entity_ids[len(head_entity_ids) + 1], entity_ids[len(head_entity_ids) * 2 + 1], …, entity_ids[len(head_entity_ids) * (num - 1) + 1] are sampled for head_entity_ids[1]; and so on.
- Return type
torch.Tensor
- class recbole.sampler.sampler.RepeatableSampler(phases, dataset, distribution='uniform', alpha=1.0)[source]¶
Bases:
recbole.sampler.sampler.AbstractSampler
RepeatableSampler
is used to sample negative items for each input user. The difference from Sampler is that it only samples items that have not appeared in any phase.
- Parameters
phases (str or list of str) – All the phases of input.
dataset (Dataset) – The union of all datasets for each phase.
distribution (str, optional) – Distribution of the negative items. Defaults to ‘uniform’.
- phase¶
The phase of the sampler. It is not set until set_phase() is called.
- Type
str
- get_used_ids()[source]¶
- Returns
Used item_ids are the same as the positive item_ids. Index is user_id, and element is a set of item_ids.
- Return type
numpy.ndarray
- sample_by_user_ids(user_ids, item_ids, num)[source]¶
Sampling by user_ids.
- Parameters
user_ids (numpy.ndarray or list) – Input user_ids.
item_ids (numpy.ndarray or list) – Input item_ids.
num (int) – Number of sampled item_ids for each user_id.
- Returns
Sampled item_ids. item_ids[0], item_ids[len(user_ids)], item_ids[len(user_ids) * 2], …, item_ids[len(user_ids) * (num - 1)] are sampled for user_ids[0]; item_ids[1], item_ids[len(user_ids) + 1], item_ids[len(user_ids) * 2 + 1], …, item_ids[len(user_ids) * (num - 1) + 1] are sampled for user_ids[1]; and so on.
- Return type
torch.Tensor
- class recbole.sampler.sampler.Sampler(phases, datasets, distribution='uniform', alpha=1.0)[source]¶
Bases:
recbole.sampler.sampler.AbstractSampler
Sampler
is used to sample negative items for each input user. To prevent positive items from the train phase from being sampled in the valid phase, and positive items from the train or valid phase from being sampled in the test phase, the datasets of all phases must be provided for pre-processing. Before using this sampler, call set_phase() to get the sampler for the corresponding phase.
- Parameters
phases (str or list of str) – All the phases of input.
datasets (Dataset or list of Dataset) – The datasets for each phase.
distribution (str, optional) – Distribution of the negative items. Defaults to ‘uniform’.
- phase¶
The phase of the sampler. It is not set until set_phase() is called.
- Type
str
- get_used_ids()[source]¶
- Returns
Used item_ids are the same as the positive item_ids. Key is the phase, and value is a numpy.ndarray whose index is user_id and whose elements are sets of item_ids.
- Return type
dict
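The class description implies that prohibited items accumulate phase by phase: train positives are excluded in validation, and train plus validation positives are excluded in test. A minimal sketch of one way to build such a dict follows; `build_phase_used_ids` and the lists of (user_id, item_id) pairs are assumptions for illustration, not RecBole's data structures.

```python
import copy


def build_phase_used_ids(phases, interactions_per_phase, n_users):
    """Return {phase: list of sets}, accumulating positives across phases in order."""
    running = [set() for _ in range(n_users)]
    used_by_phase = {}
    for phase, interactions in zip(phases, interactions_per_phase):
        for user_id, item_id in interactions:
            running[user_id].add(item_id)
        used_by_phase[phase] = copy.deepcopy(running)  # snapshot for this phase
    return used_by_phase


phases = ["train", "valid", "test"]
# one list of (user_id, item_id) pairs per phase; hypothetical toy data
interactions = [[(0, 1), (1, 2)], [(0, 3)], [(1, 4)]]
used = build_phase_used_ids(phases, interactions, n_users=2)
print(used["valid"][0])  # user 0's positives from train and valid
```

Taking a snapshot per phase keeps earlier phases unaffected by interactions that only appear later.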
- sample_by_user_ids(user_ids, item_ids, num)[source]¶
Sampling by user_ids.
- Parameters
user_ids (numpy.ndarray or list) – Input user_ids.
item_ids (numpy.ndarray or list) – Input item_ids.
num (int) – Number of sampled item_ids for each user_id.
- Returns
Sampled item_ids. item_ids[0], item_ids[len(user_ids)], item_ids[len(user_ids) * 2], …, item_ids[len(user_ids) * (num - 1)] are sampled for user_ids[0]; item_ids[1], item_ids[len(user_ids) + 1], item_ids[len(user_ids) * 2 + 1], …, item_ids[len(user_ids) * (num - 1) + 1] are sampled for user_ids[1]; and so on.
- Return type
torch.Tensor
- class recbole.sampler.sampler.SeqSampler(dataset, distribution='uniform', alpha=1.0)[source]¶
Bases:
recbole.sampler.sampler.AbstractSampler
SeqSampler
is used to sample negative item sequences.
- Parameters
dataset (Dataset) – The dataset to sample from.
distribution (str, optional) – Distribution of the negative items. Defaults to ‘uniform’.
- get_used_ids()[source]¶
- Returns
Used ids. Index is key_id, and element is a set of value_ids.
- Return type
numpy.ndarray
- sample_neg_sequence(pos_sequence)[source]¶
For each time step, sample one item from all items except the one the user clicked on at that step.
- Parameters
pos_sequence (torch.Tensor) – All users’ item history sequences, with shape (N, ).
- Returns
All users’ negative item history sequences.
- Return type
torch.Tensor
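The per-step rule above can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the library's code: `sample_neg_sequence_sketch` and `n_items` are hypothetical names, a list stands in for the tensor, and item id 0 is assumed to be reserved for padding.

```python
import random


def sample_neg_sequence_sketch(pos_sequence, n_items, seed=0):
    """For each position, draw an item id different from the positive one.

    Assumes valid item ids are 1 .. n_items - 1 (id 0 treated as padding)
    and that at least two valid ids exist, so the loop always terminates.
    """
    rng = random.Random(seed)
    neg_sequence = []
    for pos_item in pos_sequence:
        neg_item = rng.randrange(1, n_items)
        while neg_item == pos_item:  # resample on collision with the positive
            neg_item = rng.randrange(1, n_items)
        neg_sequence.append(neg_item)
    return neg_sequence


pos_sequence = [1, 2, 3, 1]
neg_sequence = sample_neg_sequence_sketch(pos_sequence, n_items=5)
print(neg_sequence)  # same length as pos_sequence, differing at every position
```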