User-Based Recommendation
The idea behind user-based recommendation is straightforward: you will probably like the things that people similar to you like. The question then is how to measure similarity between people. The answer: compare the ratings on the items both of you have rated.
Algorithm
Input: a dataset that includes the userID, itemID, and rating.
Calculate similarity: the following measures are available.
Here we have two users x and y, who have both rated the same k items.
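As a rough sketch of the input step, the triples can be grouped per user so that the items two users have both rated are easy to look up. The sample data and the helper names `build_user_ratings` and `co_rated` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: (userID, itemID, rating) triples.
ratings = [
    ("A", "item1", 5.0),
    ("A", "item2", 3.0),
    ("B", "item1", 4.0),
    ("B", "item3", 2.0),
]


def build_user_ratings(triples):
    """Group the triples into {userID: {itemID: rating}}."""
    user_ratings = defaultdict(dict)
    for user_id, item_id, rating in triples:
        user_ratings[user_id][item_id] = rating
    return dict(user_ratings)


def co_rated(user_ratings, x, y):
    """Return the ratings of x and y restricted to the items both rated."""
    common = sorted(set(user_ratings[x]) & set(user_ratings[y]))
    xs = [user_ratings[x][item] for item in common]
    ys = [user_ratings[y][item] for item in common]
    return xs, ys


user_ratings = build_user_ratings(ratings)
xs, ys = co_rated(user_ratings, "A", "B")  # only item1 is shared
```

The two lists returned by `co_rated` play the role of x and y (each of length k) in the similarity formulas.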
(1) Pearson Similarity: $\frac{\sum_{i=1}^{k}(x_i-E(x))(y_i-E(y))}{\sqrt{\sum_{i=1}^{k}(x_i-E(x))^2}\sqrt{\sum_{i=1}^{k}(y_i-E(y))^2}}$
(2) Euclidean Distance:
$\sqrt{\sum_{i=1}^{k}(x_i-y_i)^2}$
(3) Manhattan Distance:
$\sum_{i=1}^{k}|x_i-y_i|$
(4) Cosine Similarity: $\frac{\sum_{i=1}^{k}x_i y_i}{\sqrt{\sum_{i=1}^{k}x_i^2}\sqrt{\sum_{i=1}^{k}y_i^2}}$
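The four measures can be sketched in plain Python as follows. This is a minimal illustration, assuming x and y are equal-length lists holding the ratings on the k co-rated items. Returning 0.0 when a denominator is zero is an assumption of this sketch, not part of the definitions.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length rating lists."""
    k = len(x)
    mean_x = sum(x) / k
    mean_y = sum(y) / k
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # assumption: a user with constant ratings is uncorrelated
    return cov / (norm_x * norm_y)


def euclidean(x, y):
    """Euclidean distance (smaller means more similar)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan(x, y):
    """Manhattan distance (smaller means more similar)."""
    return sum(abs(a - b) for a, b in zip(x, y))


def cosine(x, y):
    """Cosine similarity of two rating vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # assumption: an all-zero vector is similar to nothing
    return dot / (norm_x * norm_y)
```

Note that Pearson and cosine are similarities (larger is closer), while Euclidean and Manhattan are distances (smaller is closer), so they need to be ranked in opposite directions.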
The next question: which similarity measure should I use?
Generally speaking,
If the data is subject to grade inflation (different users rate on different scales), use Pearson similarity. If the data is dense and the magnitude of the ratings matters, both Manhattan distance and Euclidean distance will work. If the data is sparse, cosine similarity works fine.
Suppose we have an N by N similarity matrix covering all N users.
Get the K people most similar to A. You can choose K by experiment.
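One possible way to pick the K nearest neighbours is sketched below. It assumes the similarity matrix is stored as a nested dict and that a higher score means more similar (for a distance measure the ranking would be reversed). The scores and the name `nearest_neighbours` are hypothetical.

```python
import heapq

# Hypothetical row of the similarity matrix for user A.
similarity = {
    "A": {"B": 0.7, "C": 0.8, "D": 0.5, "E": 0.1},
}


def nearest_neighbours(similarity, user, k):
    """Return the k (neighbour, score) pairs with the highest scores."""
    others = similarity[user].items()
    return heapq.nlargest(k, others, key=lambda pair: pair[1])


neighbours = nearest_neighbours(similarity, "A", 3)
# neighbours is [("C", 0.8), ("B", 0.7), ("D", 0.5)]
```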
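| Person | K neighbour | Score | Weight |
|--------|-------------|-------|--------|
| A      | B           | 0.7   | 0.35   |
|        | C           | 0.8   | 0.40   |
|        | D           | 0.5   | 0.25   |

Each weight is the neighbour's score divided by the sum of the scores (0.7 + 0.8 + 0.5 = 2.0). If B, C, and D have all rated item X, with ratings 3, 4, and 4 respectively, the predicted rating for A is 3 * 0.35 + 4 * 0.40 + 4 * 0.25 = 3.65.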
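The weighted prediction can be written as a small function. This is a sketch with a hypothetical name, `predict_rating`. It normalises the weights over only the neighbours who actually rated the item, which is an assumption the example above does not need, because all three neighbours rated X.

```python
def predict_rating(neighbour_scores, neighbour_ratings):
    """Similarity-weighted average of the neighbours' ratings on one item."""
    # Assumption: neighbours who did not rate the item are left out.
    rated = {u: s for u, s in neighbour_scores.items() if u in neighbour_ratings}
    total = sum(rated.values())
    if total == 0:
        return None  # no usable neighbour, so no prediction
    return sum((s / total) * neighbour_ratings[u] for u, s in rated.items())


scores = {"B": 0.7, "C": 0.8, "D": 0.5}   # similarity of each neighbour to A
ratings_on_x = {"B": 3, "C": 4, "D": 4}   # their ratings on item X
prediction = predict_rating(scores, ratings_on_x)  # approximately 3.65
```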
Choose the top i items by predicted rating and recommend them to user A.
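The final step might look like the sketch below, assuming a predicted rating has already been computed for each candidate item. The item names, scores, and the function name `recommend` are hypothetical. Skipping items the user has already rated is an assumption of this sketch.

```python
import heapq


def recommend(predicted, already_rated, i):
    """Return the i unrated items with the highest predicted rating."""
    candidates = {
        item: score
        for item, score in predicted.items()
        if item not in already_rated
    }
    return heapq.nlargest(i, candidates, key=candidates.get)


predicted = {"X": 3.65, "Y": 4.2, "Z": 2.9, "W": 4.8}  # predictions for user A
already_rated = {"W"}                                   # A has already rated W
top_items = recommend(predicted, already_rated, 2)
# top_items is ["Y", "X"]
```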
Pros
1. Makes recommendations without knowing any details about the items.
Cons
1. Cold start: a large amount of data is needed to make accurate recommendations.
2. Computation: building the similarity matrix for n users takes O(n^2) pairwise comparisons.
3. Sparsity: few people rate items online, so two users often have few co-rated items to compare.