recbole.data.kg_dataset
- class recbole.data.dataset.kg_dataset.KnowledgeBasedDataset(config)
Bases: recbole.data.dataset.dataset.Dataset
KnowledgeBasedDataset is based on Dataset, and additionally loads the .kg and .link files.
Entities are remapped together with item_id. All entities are remapped into three consecutive ID sections:
- virtual entities that exist only in interaction data.
- entities that exist both in interaction data and in kg triplets.
- entities that exist only in kg triplets.
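The three-section layout can be illustrated with a small, self-contained sketch. It is not RecBole's implementation: the function name `remap_entities`, the sample tokens, and the choice to start IDs at 0 (ignoring any padding token) are assumptions made for illustration only.

```python
def remap_entities(item_tokens, item2entity, kg_entity_tokens):
    """Assign consecutive IDs in three sections (illustrative only)."""
    linked_entities = set(item2entity.values())
    # Section 1: items with no linked entity ("virtual entities").
    virtual = [t for t in item_tokens if t not in item2entity]
    # Section 2: items that are linked to an entity in the kg.
    shared = [t for t in item_tokens if t in item2entity]
    # Section 3: entities that appear only in kg triplets.
    kg_only = [e for e in kg_entity_tokens if e not in linked_entities]

    token2id = {}
    for token in virtual + shared + kg_only:
        token2id[token] = len(token2id)
    # A linked entity shares the ID of its item.
    for item, entity in item2entity.items():
        if item in token2id:
            token2id[entity] = token2id[item]
    return token2id


# Hypothetical tokens: i1 has no link, i2 and i3 are linked to kg entities.
items = ["i1", "i2", "i3"]
link = {"i2": "e_a", "i3": "e_b"}
kg_entities = ["e_a", "e_b", "e_c", "e_d"]
ids = remap_entities(items, link, kg_entities)
print(ids)
```

In this sketch, the unlinked item gets the lowest ID, linked items and their entities share the middle range, and kg-only entities take the highest IDs.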
It also provides several interfaces to convert .kg features into a COO sparse matrix, a CSR sparse matrix, a DGL.Graph, or a PyG.Data object.
- head_entity_field
The same as config['HEAD_ENTITY_ID_FIELD'].
- Type
str
- tail_entity_field
The same as config['TAIL_ENTITY_ID_FIELD'].
- Type
str
- relation_field
The same as config['RELATION_ID_FIELD'].
- Type
str
- entity_field
The same as config['ENTITY_ID_FIELD'].
- Type
str
- kg_feat
Internal data structure that stores the kg triplets. It is loaded from the .kg file.
- Type
pandas.DataFrame
- item2entity
Dict that maps item_id to entity, loaded from the .link file.
- Type
dict
- entity2item
Dict that maps entity to item_id, loaded from the .link file.
- Type
dict
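As an illustration of how the two dictionaries relate to each other, the sketch below builds both from tab-separated link rows. The sample rows, the helper name `build_link_maps`, and the assumption that each row holds one item token followed by one entity token are hypothetical; RecBole's own loader is not reproduced here.

```python
def build_link_maps(link_text):
    """Build item->entity and entity->item dicts from tab-separated rows."""
    item2entity, entity2item = {}, {}
    for line in link_text.strip().splitlines():
        item_id, entity_id = line.split("\t")
        item2entity[item_id] = entity_id
        entity2item[entity_id] = item_id
    return item2entity, entity2item


# Hypothetical link rows: item token, then entity token.
sample = "101\tm.0abc\n102\tm.0def\n"
item2entity, entity2item = build_link_maps(sample)
print(item2entity["101"])
print(entity2item["m.0def"])
```

The two dictionaries are inverses of each other as long as every item is linked to a distinct entity.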
Note
entity_field does not exist as an actual field. It is only a symbol that represents entity features.
[UI-Relation] is a special relation token.
- ckg_graph(form='coo', value_field=None)
Get a graph or sparse matrix that describes the relations of the CKG, which combines interactions and kg triplets in the same graph.
Item ids and entity ids are temporarily offset by user_num.
For an edge <src, tgt>, graph[src, tgt] = 1 if value_field is None; otherwise graph[src, tgt] = self.kg_feat[self.relation_field][src, tgt] or graph[src, tgt] = [UI-Relation].
Currently, graphs in DGL and PyG are supported, as well as two types of sparse matrices, coo and csr.
- Parameters
form (str, optional) – Format of the sparse matrix, or library of the graph data structure. Defaults to coo.
value_field (str, optional) – self.relation_field or None. Defaults to None.
- Returns
Graph or sparse matrix of the CKG.
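The node offsetting and edge values described for ckg_graph can be sketched with plain Python lists in coordinate (COO) form. This is a minimal illustration, not RecBole's code: the function `build_ckg_coo`, its arguments, the integer used for the user-item relation, and the choice to add each interaction in both directions are all assumptions.

```python
def build_ckg_coo(interactions, triplets, user_num, ui_relation, use_relation=False):
    """Return (rows, cols, data) for a combined user-item-entity graph.

    interactions: list of (user_id, item_id) pairs.
    triplets: list of (head_id, relation_id, tail_id) tuples.
    Item and entity ids are shifted by user_num so they do not clash with user ids.
    """
    rows, cols, data = [], [], []
    for user, item in interactions:
        node = item + user_num
        # Assumption: each interaction yields an edge in both directions.
        for src, tgt in ((user, node), (node, user)):
            rows.append(src)
            cols.append(tgt)
            data.append(ui_relation if use_relation else 1)
    for head, relation, tail in triplets:
        rows.append(head + user_num)
        cols.append(tail + user_num)
        data.append(relation if use_relation else 1)
    return rows, cols, data


# Hypothetical data: 2 users, items/entities 1..3, relation ids 0 and 1,
# and relation id 2 standing in for the special user-item relation.
inter = [(0, 1), (1, 2)]
kg = [(1, 0, 3), (2, 1, 3)]
rows, cols, data = build_ckg_coo(inter, kg, user_num=2, ui_relation=2, use_relation=True)
print(list(zip(rows, cols, data)))
```

The three lists are in the layout that `scipy.sparse.coo_matrix((data, (rows, cols)), shape=(n, n))` accepts, where `n` would be the total number of users plus entities.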
- property entities
- Returns
List of entity ids, including virtual entities.
- Return type
numpy.ndarray
- property entity_num
Get the number of different tokens of entities, including virtual entities.
- Returns
Number of different tokens of entities, including virtual entities.
- Return type
int
- property head_entities
- Returns
List of head entities of kg triplets.
- Return type
numpy.ndarray
- kg_graph(form='coo', value_field=None)
Get a graph or sparse matrix that describes the relations between entities.
For an edge <src, tgt>, graph[src, tgt] = 1 if value_field is None; otherwise graph[src, tgt] = self.kg_feat[value_field][src, tgt].
Currently, graphs in DGL and PyG are supported, as well as two types of sparse matrices, coo and csr.
- Parameters
form (str, optional) – Format of the sparse matrix, or library of the graph data structure. Defaults to coo.
value_field (str, optional) – Edge attributes of the graph, or data of the sparse matrix. Defaults to None.
- Returns
Graph / Sparse matrix of kg triplets.
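To make the difference between the coo and csr forms concrete, the following sketch stores the same kg triplets in both layouts using plain lists. The helper names `triplets_to_coo` and `coo_to_csr` and the sample triplets are illustrative assumptions, not part of RecBole.

```python
def triplets_to_coo(triplets, use_relation=False):
    """Return (rows, cols, data): one entry per (head, relation, tail)."""
    rows = [h for h, _, _ in triplets]
    cols = [t for _, _, t in triplets]
    data = [r if use_relation else 1 for _, r, _ in triplets]
    return rows, cols, data


def coo_to_csr(rows, cols, data, num_rows):
    """Convert coordinate lists to CSR arrays (indptr, indices, values)."""
    # Count entries per row, then prefix-sum the counts into row pointers.
    indptr = [0] * (num_rows + 1)
    for r in rows:
        indptr[r + 1] += 1
    for i in range(num_rows):
        indptr[i + 1] += indptr[i]
    # Place each entry into its row's slice, keeping input order within a row.
    indices = [0] * len(rows)
    values = [0] * len(rows)
    next_slot = indptr[:-1].copy()
    for r, c, v in zip(rows, cols, data):
        slot = next_slot[r]
        indices[slot] = c
        values[slot] = v
        next_slot[r] += 1
    return indptr, indices, values


# Hypothetical triplets as (head, relation, tail) over 4 entities.
kg = [(2, 5, 0), (0, 7, 1), (2, 6, 3)]
rows, cols, data = triplets_to_coo(kg, use_relation=True)
indptr, indices, values = coo_to_csr(rows, cols, data, num_rows=4)
print(indptr, indices, values)
```

COO keeps one (row, column, value) entry per triplet, which is convenient for construction, while CSR groups the entries by head entity so that all tails of a given head can be read from one contiguous slice.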
- property relation_num
Get the number of different tokens of self.relation_field.
- Returns
Number of different tokens of self.relation_field.
- Return type
int
- property relations
- Returns
List of relations of kg triplets.
- Return type
numpy.ndarray
- property tail_entities
- Returns
List of tail entities of kg triplets.
- Return type
numpy.ndarray