RippleNet


Background

CF (collaborative filtering) suffers from data sparsity and the cold-start problem.

Benefits of introducing a KG:

  1. KG introduces semantic relatedness among items, which can help find their latent connections and improve the precision of recommended items;
  2. KG consists of relations with various types, which is helpful for extending a user’s interests reasonably and increasing the diversity of recommended items;
  3. KG connects a user’s historical records and the recommended ones, thereby bringing explainability to recommender systems.

Existing KG-aware models:

  1. embedding-based methods (DKN, CKE, SHINE): learn entity embeddings via KGE, but these are more suitable for in-graph applications such as link prediction than for recommendation
  2. path-based methods: rely heavily on manually designed meta-paths, which are hard to tune in practice

So the authors propose RippleNet:

  1. it combines embedding-based and path-based methods
    1. RippleNet incorporates the KGE methods into recommendation naturally by preference propagation;
    2. RippleNet can automatically discover possible paths from an item in a user’s history to a candidate item.

Method

The focus is on mining the entities in the KG that the user is interested in!

Input

The inputs are the user-item interaction matrix Y and the knowledge graph G.
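
A minimal sketch of how these two inputs could be represented in Python (toy data and names are hypothetical, not from the paper):

```python
# Implicit-feedback interaction matrix Y: user -> set of items with y_uv = 1.
Y = {
    "user_1": {"item_3", "item_7"},
    "user_2": {"item_1"},
}

# Knowledge graph G as a list of (head, relation, tail) triples;
# items appear among the entities, which is what links Y to G.
G = [
    ("item_3", "directed_by", "person_42"),
    ("person_42", "directed", "item_9"),
]
```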

Some definitions

Relevant entity

the set of k-hop relevant entities for user u is defined as

\[ \varepsilon_u^k = \{t\,|\,(h,r,t)\in G,\ h\in \varepsilon_u^{k-1}\},\quad k=1,2,\dots,H \]

\(\varepsilon_u^0=V_u = \{v|y_{uv}=1\}\) is the set of items the user has interacted with; these items can be linked to entities in the knowledge graph.

It can be seen as the seed set of user u in the KG (i.e., this is how the user is anchored into the KG).

Ripple set

The k-hop ripple set of user u is the set of knowledge triples whose head entities lie in the (k-1)-hop relevant entities:

\[ S_u^k = \{(h,r,t)\,|\,(h,r,t)\in G,\ h\in \varepsilon_u^{k-1}\},\quad k=1,2,\dots,H \]
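
A minimal sketch of building the ripple sets \(S_u^1,\dots,S_u^H\) from the seed set, assuming the triple-list form of G above (function and variable names are mine, not the paper's):

```python
def build_ripple_sets(G, seed_items, H=2):
    """S_u^k = triples whose head lies in the previous hop's entity set."""
    # Index triples by head entity for fast expansion.
    by_head = {}
    for h, r, t in G:
        by_head.setdefault(h, []).append((h, r, t))

    ripple_sets = []
    frontier = set(seed_items)                 # E_u^0 = V_u
    for _ in range(H):
        hop = [tri for e in frontier for tri in by_head.get(e, [])]
        ripple_sets.append(hop)                # S_u^k
        frontier = {t for _, _, t in hop}      # E_u^k = tails of S_u^k
    return ripple_sets

# e.g. build_ripple_sets(G, Y["user_1"], H=2) -> [S_u^1, S_u^2]
```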

Model

First layer propagation

\[ p_i = softmax(v^TR_ih_i) = \frac{exp(v^TR_ih_i)}{\sum_{(h,r,t)\in S_u^1}exp(v^TRh)} \]

\[ o_u^1 = \sum_{(h_i,r_i,t_i)\in S_u^1}p_it_i \]

v: embedding of the candidate item. Depending on the application scenario, the item embedding can incorporate the item's one-hot ID or its attributes.

r: embedding of relation between head entity and tail entity.

h: embedding of head entity.

t: embedding of tail entity.

The attention weight \(p_i\) can be regarded as the similarity of item v and the entity \(h_i\), measured in the space of the relation \(r_i\).

\(r_i\) matters here, since an item-entity pair may have different similarities when measured by different relations.
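
A sketch of this first-hop computation in PyTorch; the per-triple relation matrices \(R_i\) follow the formulas above, but the tensor layout and names are my assumptions:

```python
import torch

def one_hop(v, h, R, t):
    """v: (d,) query embedding; h, t: (n, d) head/tail embeddings;
    R: (n, d, d) relation matrices for the n triples in the ripple set."""
    Rh = torch.einsum('nij,nj->ni', R, h)    # R_i h_i per triple, (n, d)
    logits = Rh @ v                          # v^T R_i h_i, (n,)
    p = torch.softmax(logits, dim=0)         # attention weights p_i
    return (p.unsqueeze(1) * t).sum(dim=0)   # o = sum_i p_i t_i, (d,)
```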

Multi-layer

The second layer simply replaces v with \(o_u^1\):

\[ p_i = softmax(o_u^{1T}R_ih_i) = \frac{exp(o_u^{1T}R_ih_i)}{\sum_{(h,r,t)\in S_u^2}exp(o_u^{1T}Rh)} \]

\[ o_u^2 = \sum_{(h_i,r_i,t_i)\in S_u^2}p_it_i \]

and the third layer replaces \(o_u^1\) with \(o_u^2\),

while the final user embedding combines the responses of all hops:

\[ u = o_u^1 + o_u^2 + \dots + o_u^H \]

Prediction

\[ \hat{y}_{uv} = \sigma(u^Tv) \]

where \(\sigma\) is the sigmoid function.
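
Putting it together, a sketch of the whole forward pass, reusing the hypothetical one_hop above: each hop attends with the previous hop's response instead of v, the responses are summed into u, and the click probability is \(\sigma(u^Tv)\):

```python
import torch

def forward(v, ripple_embs):
    """ripple_embs: list of (h, R, t) tensors, one per ripple set S_u^1..S_u^H."""
    query, responses = v, []
    for h, R, t in ripple_embs:
        o = one_hop(query, h, R, t)          # o_u^k
        responses.append(o)
        query = o                            # next hop uses o_u^k as the query
    u = torch.stack(responses).sum(dim=0)    # u = o_u^1 + ... + o_u^H
    return torch.sigmoid(u @ v)              # y_hat = sigma(u^T v)
```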

Whole process

Propagation is performed only within the KG.

(figure: the whole process of RippleNet)

The model keeps propagating outward, gathering information from neighbors at increasingly higher hops, and finally aggregates everything by summation.

So the information of KG entities related to the items the user has interacted with is aggregated into the user embedding, which is then matched against uninteracted items to estimate the interaction probability.

In other words, the user embedding can be understood as the user's aggregated KG information.

Loss Function (not fully worked out yet)

From someone else's notes:

The loss has three parts: the cross-entropy loss on the predicted scores, the loss on the knowledge-graph representation, and the regularization loss on the parameters.

The prediction loss is easy to understand: it is the loss between the predicted value and the ground-truth label for each user-item pair.

KG representation loss: as noted above, the weighted sum at each hop rests on the assumption that hR ≈ t. Since this is only an assumption, we add a loss term for the model to learn it: compute the similarity between hR and t and predict (0/1) whether they match.

L2 regularization loss: in each hop, h, r, and t are each multiplied by themselves, then summed and averaged to produce a value, which is this loss. (I don't understand this part deeply; if you do, feel free to discuss in the comments.)
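
A rough sketch of the three parts; the exact form of the KGE and regularization terms here is my reading of the notes above, and the weights lam1 and lam2 are hypothetical:

```python
import torch
import torch.nn.functional as F

def ripplenet_loss(y_hat, y, h, R, t, lam1=0.01, lam2=1e-6):
    # 1) cross-entropy between predicted and observed interactions
    rec_loss = F.binary_cross_entropy(y_hat, y)
    # 2) KGE term: push sigma(h^T R t) toward 1 for observed triples,
    #    so that hR stays close to t (the assumption used in propagation)
    hRt = torch.einsum('ni,nij,nj->n', h, R, t)
    kge_loss = -torch.sigmoid(hRt).mean()
    # 3) L2 regularization on embeddings and relation matrices
    reg = h.norm() ** 2 + t.norm() ** 2 + R.norm() ** 2
    return rec_loss + lam1 * kge_loss + lam2 * reg
```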

Experiment

Other

  1. The ripple set may be too large.

    In RippleNet, we can sample a fixed-size set of neighbors instead of using the full ripple set, further reducing the computational overhead; see the sketch below.
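
A sketch of that fixed-size sampling; padding by sampling with replacement when a ripple set is smaller than the budget is my assumption:

```python
import random

def sample_ripple_set(ripple_set, n_memory=32):
    """Keep at most n_memory triples per hop to bound the computation."""
    if len(ripple_set) >= n_memory:
        return random.sample(ripple_set, n_memory)
    # pad smaller ripple sets by sampling with replacement
    return random.choices(ripple_set, k=n_memory)
```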