kudos: General Idea of User-Based Recommendation

User-Based Recommendation

The idea of user-based recommendation is pretty straightforward: you will probably like the stuff that people similar to you like. Then the question comes: how do we measure similarity between people? By the ratings on the stuff both of you have rated!

Algorithm

Input: a dataset that includes userID, itemID, and rating.
Calculate similarity: the following measures are available.

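Here we have two users x and y who both rated the same k items; $x_i$ and $y_i$ are their ratings on item i, and $E(x)$, $E(y)$ are their mean ratings over those k items.
(1) Pearson Similarity:

$\frac{\sum_{i=1}^{k}(x_i-E(x))(y_i-E(y))}{\sqrt{\sum_{i=1}^{k}(x_i-E(x))^2}\sqrt{\sum_{i=1}^{k}(y_i-E(y))^2}}$

(2) Euclidean Distance:

$\sqrt{\sum_{i=1}^{k}(x_i-y_i)^2}$

(3) Manhattan Distance:

$\sum_{i=1}^{k}|x_i-y_i|$

(4) Cosine Similarity:

$\frac{\sum_{i=1}^{k}x_i y_i}{\sqrt{\sum_{i=1}^{k}x_i^2}\sqrt{\sum_{i=1}^{k}y_i^2}}$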
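
The four measures above can be sketched in plain Python. This is only a minimal illustration: the function names and the sample ratings are made up here, and returning 0.0 when a user's ratings have zero variance (or zero norm) is an assumption, not something the formulas prescribe.

```python
import math

def pearson(x, y):
    """Pearson similarity of two equal-length rating lists."""
    k = len(x)
    mean_x, mean_y = sum(x) / k, sum(y) / k
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # assumption: a constant rater is treated as uncorrelated
    return cov / (norm_x * norm_y)

def euclidean(x, y):
    """Euclidean distance (smaller means more similar)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    """Manhattan distance (smaller means more similar)."""
    return sum(abs(a - b) for a, b in zip(x, y))

def cosine(x, y):
    """Cosine similarity of two equal-length rating lists."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # assumption: an all-zero vector is similar to nothing
    return dot / (norm_x * norm_y)

# Hypothetical ratings of two users on the same k = 4 items.
x = [5, 3, 4, 4]
y = [3, 1, 2, 3]

print(pearson(x, y), euclidean(x, y), manhattan(x, y), cosine(x, y))
```

Note that Pearson and cosine are similarities (larger is closer), while the two distances go the other way, so they need to be inverted or sorted in ascending order before picking neighbours.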

The question comes again: which similarity should I use?

Generally speaking:

If the data is subject to grade inflation (different users rate on different scales), use Pearson similarity. If the data is dense and the magnitude of the ratings is important, both Manhattan distance and Euclidean distance will work. If your data is sparse, cosine similarity works fine.
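
A small sketch of the grade-inflation case, with two made-up users: one strict, one who rates every item exactly 2 points higher. The names `strict` and `generous` are hypothetical. Pearson sees identical taste, while the distance measures only see the offset.

```python
import math

def pearson(x, y):
    """Pearson similarity; assumes neither list is constant."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = (math.sqrt(sum((a - mean_x) ** 2 for a in x))
           * math.sqrt(sum((b - mean_y) ** 2 for b in y)))
    return num / den

def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

strict = [1, 2, 3, 2]               # hypothetical ratings on 4 items
generous = [r + 2 for r in strict]  # same preferences, shifted up by 2

r = pearson(strict, generous)       # approximately 1: same ordering of items
d1 = manhattan(strict, generous)    # 8: the shift counts on every item
d2 = euclidean(strict, generous)    # 4.0
print(r, d1, d2)
```
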
Suppose we have computed the N-by-N similarity matrix for all N users.

Get the K people most similar to A. You can choose K by experiment.

person	K neighbour	Score	Weight
A	B	0.7	0.35
	C	0.8	0.40
	D	0.5	0.25

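Each weight is the neighbour's score divided by the sum of the K scores (here 0.7 + 0.8 + 0.5 = 2.0). If B, C, and D all rated item X, with ratings 3, 4, and 4 respectively, the predicted rating of A on X is $3 \times 0.35 + 4 \times 0.40 + 4 \times 0.25 = 3.65$.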
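
The same arithmetic as a minimal Python sketch. The function name `predict` and the dictionary layout are assumptions for illustration; only neighbours who actually rated the item contribute, and their scores are renormalised to sum to 1.

```python
def predict(similarity, ratings):
    """Similarity-weighted average of the neighbours' ratings on one item.

    similarity: {neighbour: similarity score with the target user}
    ratings:    {neighbour: rating on the item}
    """
    raters = [u for u in similarity if u in ratings]
    total = sum(similarity[u] for u in raters)
    if total == 0:
        return None  # assumption: no usable neighbour means no prediction
    return sum(similarity[u] / total * ratings[u] for u in raters)

# Scores of A's K = 3 neighbours and their ratings on item X (from the table).
similarity = {"B": 0.7, "C": 0.8, "D": 0.5}
ratings_on_x = {"B": 3, "C": 4, "D": 4}

predicted = predict(similarity, ratings_on_x)
print(round(predicted, 2))  # 3.65
```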

Choose the top i items by predicted rating and recommend them to user A.
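
A rough sketch of this last step, assuming the K neighbours and their similarity scores are already known. The function name `recommend`, the item names, and all ratings other than those on X are made up; items A has already rated are skipped.

```python
def recommend(target_ratings, neighbours, top_n=2):
    """Rank unseen items by similarity-weighted average rating.

    target_ratings: {item: rating} for the target user
    neighbours:     {user: (similarity, {item: rating})}
    """
    weighted_sum, weight_total = {}, {}
    for sim, ratings in neighbours.values():
        for item, rating in ratings.items():
            if item in target_ratings:
                continue  # the target user already rated this item
            weighted_sum[item] = weighted_sum.get(item, 0.0) + sim * rating
            weight_total[item] = weight_total.get(item, 0.0) + sim
    predicted = {item: weighted_sum[item] / weight_total[item]
                 for item in weighted_sum if weight_total[item] > 0}
    ranked = sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_n]

# Hypothetical data: A has rated W only; B, C, D are A's neighbours.
a_ratings = {"W": 5}
neighbours = {
    "B": (0.7, {"W": 4, "X": 3, "Y": 5}),
    "C": (0.8, {"X": 4, "Y": 2, "Z": 4}),
    "D": (0.5, {"X": 4, "Z": 5}),
}

top = recommend(a_ratings, neighbours, top_n=2)
print(top)
```

Here each item's weights are normalised over the neighbours who rated that item, so an item rated by only one neighbour can still rank highly; requiring a minimum number of raters is a common refinement not covered in this post.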

Pros

1. Makes recommendations without knowing the details of the items.

Cons

1. Cold start: needs a large amount of data to make accurate recommendations.

2. Computation: O(n^2) pairwise comparisons are needed to compute the similarity matrix for n users.

3. Sparsity: few people rate items online, so two users often share few co-rated items.
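
To make the computation and sparsity points concrete, here is a minimal sketch that builds the pairwise similarities from (userID, itemID, rating) triples. The triples are made up, and using cosine similarity over co-rated items only is an assumption chosen for brevity.

```python
import math
from itertools import combinations

def cosine_on_common(r1, r2):
    """Cosine similarity restricted to the items both users rated."""
    common = set(r1) & set(r2)
    if not common:
        return 0.0  # assumption: no co-rated items means similarity 0
    dot = sum(r1[i] * r2[i] for i in common)
    norm1 = math.sqrt(sum(r1[i] ** 2 for i in common))
    norm2 = math.sqrt(sum(r2[i] ** 2 for i in common))
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)

def similarity_matrix(triples):
    """Return {(u, v): similarity} for every unordered pair of users."""
    users = {}
    for user, item, rating in triples:
        users.setdefault(user, {})[item] = rating
    # n users give n * (n - 1) / 2 pairs, hence the O(n^2) cost.
    return {(u, v): cosine_on_common(users[u], users[v])
            for u, v in combinations(sorted(users), 2)}

# Hypothetical (userID, itemID, rating) triples.
triples = [
    ("A", "W", 5), ("A", "X", 3),
    ("B", "W", 4), ("B", "X", 3),
    ("C", "X", 4), ("C", "Y", 2),
    ("D", "Z", 5),
]

sims = similarity_matrix(triples)
print(len(sims))         # 6 pairs for 4 users
print(sims[("A", "D")])  # 0.0: A and D share no rated item
```

With a single co-rated item, as for A and C here, the cosine is 1 regardless of the actual ratings, which is one way sparse data makes the similarity scores unreliable.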


Tuesday, November 25, 2014
