recbole.data.kg_dataset

class recbole.data.dataset.kg_dataset.KnowledgeBasedDataset(config)
Bases: recbole.data.dataset.dataset.Dataset

KnowledgeBasedDataset is based on Dataset, and additionally loads the .kg and .link files.

Entities are remapped together with item_id. All entities are remapped into three consecutive ID sections:

- virtual entities that exist only in interaction data.
- entities that exist in both interaction data and kg triplets.
- entities that exist only in kg triplets.

It also provides several interfaces to convert .kg features into a coo sparse matrix, a csr sparse matrix, DGL.Graph or PyG.Data.
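
The three-section layout can be illustrated with a small, self-contained sketch. This is not RecBole's implementation: the function name remap_entities, its arguments and the toy tokens are hypothetical, and the sketch ignores any ID the library may reserve for padding. It only shows how the three groups can end up in consecutive ID ranges.

```python
def remap_entities(items, item2entity, kg_entities):
    """Assign consecutive IDs in three sections (illustrative only).

    items: item tokens seen in interaction data
    item2entity: hypothetical dict mapping item token -> entity token
    kg_entities: entity tokens that appear in kg triplets
    """
    kg_entities = set(kg_entities)
    unique_items = set(items)
    # Section 1: items with no counterpart in the kg ("virtual" entities).
    virtual = sorted(i for i in unique_items
                     if item2entity.get(i) not in kg_entities)
    # Section 2: items whose linked entity also appears in kg triplets.
    shared = sorted(i for i in unique_items
                    if item2entity.get(i) in kg_entities)
    # Section 3: entities that appear only in kg triplets.
    linked = {item2entity[i] for i in shared}
    kg_only = sorted(kg_entities - linked)

    # Shared entries are keyed by their item token in this toy mapping.
    ordered = virtual + shared + kg_only
    token2id = {token: idx for idx, token in enumerate(ordered)}
    sections = (len(virtual), len(shared), len(kg_only))
    return token2id, sections


items = ["i1", "i2", "i3"]
item2entity = {"i2": "e2", "i3": "e3"}
kg_entities = ["e2", "e3", "e4", "e5"]
token2id, sections = remap_entities(items, item2entity, kg_entities)
print(token2id)  # {'i1': 0, 'i2': 1, 'i3': 2, 'e4': 3, 'e5': 4}
print(sections)  # (1, 2, 2)
```

In this toy run, "i1" is a virtual entity, "i2" and "i3" exist on both sides, and "e4" and "e5" exist only in the kg, so the three sections occupy IDs 0, 1 to 2, and 3 to 4.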

head_entity_field
The same as config['HEAD_ENTITY_ID_FIELD'].
Type: str

tail_entity_field
The same as config['TAIL_ENTITY_ID_FIELD'].
Type: str

relation_field
The same as config['RELATION_ID_FIELD'].
Type: str

entity_field
The same as config['ENTITY_ID_FIELD'].
Type: str

kg_feat
Internal data structure that stores the kg triplets. It is loaded from the .kg file.
Type: pandas.DataFrame

item2entity
Dict that maps item_id to entity, loaded from the .link file.
Type: dict

entity2item
Dict that maps entity to item_id, loaded from the .link file.
Type: dict
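
As a rough illustration of the two mappings, the sketch below builds an item-to-entity dict and its inverse from a tiny in-memory table. The tab-separated layout, the header names, and the helper load_link are assumptions made for this example; they are not taken from RecBole's loader.

```python
import csv
import io

# Toy stand-in for a .link file; the layout and column names are assumptions.
LINK_TEXT = "item_id:token\tentity_id:token\n1\tm.01\n2\tm.02\n"


def load_link(text):
    """Build item->entity and entity->item dicts from tab-separated rows."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    next(reader)  # skip the header row
    item2entity, entity2item = {}, {}
    for item_id, entity_id in reader:
        item2entity[item_id] = entity_id
        entity2item[entity_id] = item_id
    return item2entity, entity2item


item2entity, entity2item = load_link(LINK_TEXT)
print(item2entity)  # {'1': 'm.01', '2': 'm.02'}
print(entity2item)  # {'m.01': '1', 'm.02': '2'}
```

The inverse dict is only well defined when each entity is linked to at most one item, which is what this sketch assumes.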

Note
entity_field does not correspond to an actual field in the data. It is only a symbol that represents entity features. For example, it can be written into config['fields_in_same_space'].
[UI-Relation] is a special relation token.

ckg_graph(form='coo', value_field=None)
Get the graph or sparse matrix that describes the relations of the CKG, which combines interactions and kg triplets into the same graph.

Item ids and entity ids are temporarily offset by user_num.

For an edge <src, tgt>, graph[src, tgt] = 1 if value_field is None. Otherwise, graph[src, tgt] = self.kg_feat[self.relation_field][src, tgt] for a kg triplet, or graph[src, tgt] = [UI-Relation] for a user-item interaction.

Currently, graphs in DGL and PyG are supported, as well as two types of sparse matrices, coo and csr.

Parameters:
form (str, optional) – Format of the sparse matrix, or library of the graph data structure. Defaults to coo.
value_field (str, optional) – self.relation_field or None. Defaults to None.

Returns: Graph / sparse matrix of kg triplets.
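
The ID offset and the edge values described for ckg_graph can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's code: build_ckg_coo and its arguments are hypothetical, and storing each interaction in both directions is a choice made for this sketch.

```python
def build_ckg_coo(user_num, interactions, triplets, ui_relation=None):
    """Return (rows, cols, data) lists in COO style for a toy CKG.

    interactions: (user_id, item_id) pairs
    triplets: (head_id, relation_id, tail_id) tuples whose entity IDs
        share the item ID space
    ui_relation: ID used for user-item edges; None yields all-ones values
    """
    rows, cols, data = [], [], []
    for user, item in interactions:
        node = item + user_num  # shift items past the user ID range
        # Assumption of this sketch: interactions are stored in both directions.
        rows += [user, node]
        cols += [node, user]
        data += [1 if ui_relation is None else ui_relation] * 2
    for head, rel, tail in triplets:
        rows.append(head + user_num)  # entities get the same offset
        cols.append(tail + user_num)
        data.append(1 if ui_relation is None else rel)
    return rows, cols, data


interactions = [(0, 0), (1, 1)]
triplets = [(0, 0, 2), (1, 1, 2)]
rows, cols, data = build_ckg_coo(2, interactions, triplets, ui_relation=2)
print(rows)  # [0, 2, 1, 3, 2, 3]
print(cols)  # [2, 0, 3, 1, 4, 4]
print(data)  # [2, 2, 2, 2, 0, 1]
```

With two users, item 0 becomes node 2 and entity 2 becomes node 4, so users, items and entities never collide. The three lists are in the shape that scipy.sparse.coo_matrix((data, (rows, cols)), shape=(n, n)) accepts, if SciPy is available.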

property ent_level_ent_fields
Get the entity fields that are remapped together with entity_id.

Returns: List of field names.
Return type: list

property entities
Returns: List of entity ids, including virtual entities.
Return type: numpy.ndarray

property entity_num
Get the number of distinct entity tokens, including virtual entities.

Returns: Number of distinct entity tokens, including virtual entities.
Return type: int

property head_entities
Returns: List of head entities of kg triplets.
Return type: numpy.ndarray

kg_graph(form='coo', value_field=None)
Get the graph or sparse matrix that describes the relations between entities.

For an edge <src, tgt>, graph[src, tgt] = 1 if value_field is None. Otherwise, graph[src, tgt] = self.kg_feat[value_field][src, tgt].

Currently, graphs in DGL and PyG are supported, as well as two types of sparse matrices, coo and csr.

Parameters:
form (str, optional) – Format of the sparse matrix, or library of the graph data structure. Defaults to coo.
value_field (str, optional) – Edge attributes of the graph, or data of the sparse matrix. Defaults to None.

Returns: Graph / sparse matrix of kg triplets.
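
To make the difference between the coo and csr forms concrete, the sketch below stores a few toy triplets as COO lists and converts them to CSR arrays by hand. The helper coo_to_csr and the triplet IDs are made up for illustration; in practice a sparse-matrix library would do this conversion.

```python
def coo_to_csr(num_nodes, rows, cols, data):
    """Convert COO lists into CSR arrays (indptr, indices, values)."""
    # Sort entries by row, then by column, as CSR stores rows contiguously.
    order = sorted(range(len(rows)), key=lambda k: (rows[k], cols[k]))
    indptr = [0] * (num_nodes + 1)
    for r in rows:
        indptr[r + 1] += 1          # count entries per row
    for i in range(num_nodes):
        indptr[i + 1] += indptr[i]  # prefix sum gives row offsets
    indices = [cols[k] for k in order]
    values = [data[k] for k in order]
    return indptr, indices, values


# Toy kg triplets as (head, relation, tail).
triplets = [(0, 1, 2), (2, 0, 1), (0, 0, 1)]
rows = [h for h, _, _ in triplets]
cols = [t for _, _, t in triplets]
ones = [1] * len(triplets)          # values when value_field is None
rels = [r for _, r, _ in triplets]  # values when the relation field is used

indptr, indices, values = coo_to_csr(3, rows, cols, rels)
print(indptr)   # [0, 2, 2, 3]
print(indices)  # [1, 2, 1]
print(values)   # [0, 1, 0]
```

Row i of the CSR form owns the slice indptr[i]:indptr[i + 1], so head entity 0 has two outgoing edges, entity 1 has none, and entity 2 has one.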

property rec_level_ent_fields
Get the entity fields that are remapped together with item_id.

Returns: List of field names.
Return type: list

property relation_num
Get the number of distinct tokens of self.relation_field.

Returns: Number of distinct tokens of self.relation_field.
Return type: int

property relations
Returns: List of relations of kg triplets.
Return type: numpy.ndarray

save(filepath)
Save this Dataset object to a local path.

Parameters:
filepath (str) – Path of the directory to save to.

property tail_entities
Returns: List of tail entities of kg triplets.
Return type: numpy.ndarray