recbole.data.kg_dataset
-
class recbole.data.dataset.kg_dataset.KnowledgeBasedDataset(config, saved_dataset=None)
Bases: recbole.data.dataset.dataset.Dataset
KnowledgeBasedDataset is based on Dataset, and additionally loads the .kg and .link files.
Entities are remapped together with item_id in a special way. All entities are remapped into three consecutive ID sections:
- virtual entities that only exist in interaction data.
- entities that exist both in interaction data and kg triplets.
- entities that only exist in kg triplets.
It also provides several interfaces to transfer .kg features into a coo sparse matrix, a csr sparse matrix, DGL.Graph or PyG.Data.
-
head_entity_field
The same as config['HEAD_ENTITY_ID_FIELD'].
- Type
str
-
tail_entity_field
The same as config['TAIL_ENTITY_ID_FIELD'].
- Type
str
-
relation_field
The same as config['RELATION_ID_FIELD'].
- Type
str
-
entity_field
The same as config['ENTITY_ID_FIELD'].
- Type
str
-
kg_feat
Internal data structure that stores the kg triplets. It is loaded from the .kg file.
- Type
pandas.DataFrame
-
item2entity
Dict that maps item_id to entity, loaded from the .link file.
- Type
dict
-
entity2item
Dict that maps entity to item_id, loaded from the .link file.
- Type
dict
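As a rough illustration of how item2entity and entity2item relate to each other, and of the three consecutive ID sections described in the class summary, the following minimal sketch builds both dicts from toy link pairs and assigns IDs section by section. All names and data here (link_pairs, interaction_items, kg_entities, remap_entities) are hypothetical. This is not RecBole's actual remapping code, and it ignores details such as padding tokens.

```python
# Toy data; every token below is made up for illustration.
link_pairs = [("i1", "e1"), ("i2", "e2")]   # (item_id, entity) pairs, as a .link file would provide
interaction_items = ["i1", "i2", "i3"]      # items seen in interaction data
kg_entities = ["e1", "e2", "e3", "e4"]      # entities seen in kg triplets

# The two mappings are inverses of each other.
item2entity = dict(link_pairs)
entity2item = {entity: item for item, entity in link_pairs}


def remap_entities(interaction_items, kg_entities, item2entity):
    """Assign consecutive IDs in three sections (hypothetical helper)."""
    kg_set = set(kg_entities)
    # Section 1: virtual entities, i.e. items with no linked kg entity.
    virtual = [i for i in interaction_items if item2entity.get(i) not in kg_set]
    # Section 2: entities present in both interaction data and kg triplets.
    both = [item2entity[i] for i in interaction_items if item2entity.get(i) in kg_set]
    # Section 3: entities that only appear in kg triplets.
    both_set = set(both)
    kg_only = [e for e in kg_entities if e not in both_set]
    ordered = virtual + both + kg_only
    return {token: idx for idx, token in enumerate(ordered)}


entity_ids = remap_entities(interaction_items, kg_entities, item2entity)
print(entity_ids)
```

In this toy run the item i3 has no linked entity, so it becomes a virtual entity and takes the first ID; the linked entities follow, and the kg-only entities come last.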
Note
entity_field doesn't exist exactly. It's only a symbol, representing entity features. E.g. it can be written into config['fields_in_same_space'].
[UI-Relation] is a special relation token.
-
ckg_graph(form='coo', value_field=None)
Get a graph or sparse matrix that describes the relations of the CKG, which combines interactions and kg triplets into the same graph.
Item ids and entity ids are temporarily offset by user_num.
For an edge <src, tgt>, graph[src, tgt] = 1 if value_field is None. Otherwise graph[src, tgt] = self.kg_feat[self.relation_field][src, tgt] for an edge from kg triplets, or graph[src, tgt] = [UI-Relation] for a user-item interaction edge.
Currently, we support graphs in DGL and PyG, and two types of sparse matrices, coo and csr.
- Parameters
form (str, optional) – Format of sparse matrix, or library of graph data structure. Defaults to coo.
value_field (str, optional) – self.relation_field or None. Defaults to None.
- Returns
Graph / Sparse matrix of kg triplets.
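To make the user_num offset and the [UI-Relation] value more concrete, here is a minimal sketch that builds a CKG-style sparse matrix with SciPy from toy arrays. The function toy_ckg, the toy ids and the numeric id chosen for [UI-Relation] are all assumptions made for illustration. The sketch stores each edge in one direction only and does not reproduce RecBole's implementation.

```python
import numpy as np
from scipy.sparse import coo_matrix

user_num, entity_num = 3, 4  # toy sizes (assumed)

# Interactions as (user id, item id); item ids live in the entity id space.
inter_users = np.array([0, 1, 2])
inter_items = np.array([1, 1, 2])

# KG triplets as (head, relation, tail), all in the entity id space.
heads = np.array([1, 2])
relations = np.array([1, 2])
tails = np.array([3, 3])

ui_relation = 3  # hypothetical numeric id standing in for [UI-Relation]


def toy_ckg(form="coo", use_relation=False):
    """Combine interactions and kg triplets into one matrix (hypothetical helper)."""
    # Users keep ids [0, user_num); items and entities are shifted by user_num.
    src = np.concatenate([inter_users, heads + user_num])
    tgt = np.concatenate([inter_items + user_num, tails + user_num])
    if use_relation:
        # Interaction edges carry the UI relation; kg edges carry their own relation.
        data = np.concatenate([np.full(len(inter_users), ui_relation), relations])
    else:
        data = np.ones(len(src), dtype=np.int64)
    n = user_num + entity_num
    mat = coo_matrix((data, (src, tgt)), shape=(n, n))
    return mat.tocsr() if form == "csr" else mat


binary_graph = toy_ckg()
relation_graph = toy_ckg(form="csr", use_relation=True)
```

With these toy arrays, user 0 interacting with item 1 becomes the edge <0, 4>, because the item id is shifted by user_num = 3.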
-
property ent_level_ent_fields
Get entity fields remapped together with entity_id.
- Returns
List of field names.
- Return type
list
-
property entities
- Returns
List of entity ids, including virtual entities.
- Return type
numpy.ndarray
-
property entity_num
Get the number of different tokens of entities, including virtual entities.
- Returns
Number of different tokens of entities, including virtual entities.
- Return type
int
-
property head_entities
- Returns
List of head entities of kg triplets.
- Return type
numpy.ndarray
-
kg_graph(form='coo', value_field=None)
Get a graph or sparse matrix that describes the relations between entities.
For an edge <src, tgt>, graph[src, tgt] = 1 if value_field is None, else graph[src, tgt] = self.kg_feat[value_field][src, tgt].
Currently, we support graphs in DGL and PyG, and two types of sparse matrices, coo and csr.
- Parameters
form (str, optional) – Format of sparse matrix, or library of graph data structure. Defaults to coo.
value_field (str, optional) – Edge attributes of the graph, or data of the sparse matrix. Defaults to None.
- Returns
Graph / Sparse matrix of kg triplets.
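The edge rule above can be sketched with pandas and SciPy as follows. The helper toy_kg_graph, the column names head_id, relation_id and tail_id, and the toy triplets are assumptions for illustration only. The sketch covers just the coo and csr forms and is not RecBole's own code.

```python
import numpy as np
import pandas as pd
from scipy.sparse import coo_matrix

# Toy kg triplets; column names are assumed, not taken from a real config.
kg_feat = pd.DataFrame({
    "head_id": [1, 2, 2],
    "relation_id": [1, 2, 1],
    "tail_id": [3, 3, 1],
})


def toy_kg_graph(kg_feat, entity_num, form="coo", value_field=None):
    """Build an entity-by-entity sparse matrix from triplets (hypothetical helper)."""
    src = kg_feat["head_id"].to_numpy()
    tgt = kg_feat["tail_id"].to_numpy()
    if value_field is None:
        # Unweighted case: every edge gets the value 1.
        data = np.ones(len(kg_feat), dtype=np.int64)
    else:
        # Weighted case: take the edge values from the requested column.
        data = kg_feat[value_field].to_numpy()
    mat = coo_matrix((data, (src, tgt)), shape=(entity_num, entity_num))
    if form == "coo":
        return mat
    if form == "csr":
        return mat.tocsr()
    raise NotImplementedError(f"form [{form}] is not covered by this sketch")


adjacency = toy_kg_graph(kg_feat, entity_num=4)
typed = toy_kg_graph(kg_feat, entity_num=4, form="csr", value_field="relation_id")
```

The csr form is convenient when rows are sliced or indexed directly, while coo is the natural format for building the matrix from edge lists.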
-
property rec_level_ent_fields
Get entity fields remapped together with item_id.
- Returns
List of field names.
- Return type
list
-
property relation_num
Get the number of different tokens of self.relation_field.
- Returns
Number of different tokens of self.relation_field.
- Return type
int
-
property relations
- Returns
List of relations of kg triplets.
- Return type
numpy.ndarray
-
save(filepath)
Save this Dataset object to a local path.
- Parameters
filepath (str) – Path of the directory to save to.
-
property tail_entities
- Returns
List of tail entities of kg triplets.
- Return type
numpy.ndarray