APP下载

Recommender Systems Based on Tensor Decomposition

2021-12-14ZhoubaoSunXiaodongZhangHaoyuanLiYanXiaoandHaifengGuo

Computers Materials&Continua 2021年1期

Zhoubao Sun,Xiaodong Zhang,Haoyuan Li,Yan Xiao and Haifeng Guo

1School of Information Engineering,Nanjing Audit University,Nanjing,211815,China

2School of Computing,National University of Singapore,117417,Singapore

3Jiangsu Key Laboratory of Data Science and Smart Software,Jinling Institute of Technology,Nanjing,211169,China

Abstract:Recommender system is an effective tool to solve the problems of information overload.The traditional recommender systems,especially the collaborative filtering ones,only consider the two factors of users and items.While social networks contain abundant social information,such as tags,places and times.Researches show that the social information has a great impact on recommendation results.Tags not only describe the characteristics of items,but also reflect the interests and characteristics of users.Since the traditional recommender systems cannot parse multi-dimensional information,in this paper,a tensor decomposition model based on tag regularization is proposed which incorporates social information to benefit recommender systems.The original Singular Value Decomposition(SVD)model is optimized by mining the co-occurrence and mutual exclusion of tags,and their features are constrained by the relationship between tags.Experiments on real dataset show that the proposed algorithm achieves superior performance to existing algorithms.

Keywords:Recommender system;social information;tensor decomposition;tag regularization

1 Introduction

In recent years,recommender systems have become popular and attracted much attention.A recommender system is an effective tool to solve the problem of information overload.It builds models according to the interest and historical behavior of the target users to predict the interest of them,generate recommendation lists,and meet the personalized needs of target users in specific situations[1].

The concept of recommendation system was first proposed by Resnick[2],from then on,many scholars have begun to research recommendation algorithms to improve the performance of recommender systems.The traditional ones[2],which only consider the relationships between users and items,ignoring the social interactions among users,are no longer suitable for the social tagging system with the ternary relationship of users,items and tags.In fact,the accompanying social information plays a very important role in recommendation process.Hence,in order to provide more accuracy and personalized recommendation results,the accompanying information should be taken into consideration.

In order to solve the problem mentioned above,we focus on the social-based recommender system and propose an algorithm named RSboTD(Recommender Systems based on Tensor Decomposition)that integrates social network information to improve the prediction accuracy of recommender systems.In the process of recommendation,we mainly propose a tensor decomposition model based on tag regularization.It mainly analyses the co-occurrence and mutual exclusion between tags,which are added as a regularization item of the tensor decomposition model to mine the potential association among users,items and tags,and improve the performance of tag recommendation by using tag information in the social tagging system.We have conducted experiments on real dataset to evaluate the performance of the proposed model on the predict accuracy.

The remainder of the paper is organized as follows.The related work is presented in Section 2.The problem and the details of the proposed approach are shown in Section 3.Section 4 shows the experiments results.Finally,we draw the conclusion in Section 5.

2 Related Work

In this section,we review the algorithms to traditional recommender systems and social-based recommender systems.

2.1 Traditional Recommender Systems

Collaborative filtering may be the most commonly-used recommender systems.The existing approaches include memory-based[3,4],model-based[5,6]and content-based approaches[7,8].Memory-based algorithms mine recommender items by finding the nearest neighbor from a rating matrix based on other similar users’ tastes.Model-based algorithms use the ratings to build a model to make rating predictions.The data sparsity is the key drawback of collaborative filtering algorithms.Therefore,the recommender system cannot recommend the item until it is rated by a substantial number of users.Content-based algorithms take the tags into consideration to recommend the user items similar to the ones the user preferred in the past.The limitation is that two different items are indistinguishable if they are represented by the same tags,another problem is that,if there is a new user,its’recommender accuracy is not credible.

Each kind of recommendation algorithms have its own defects.In order to alleviate this problem,hybrid recommendation algorithm is proposed.By combining the advantages of different kinds of algorithms,the recommendation results are better.Different ways of combining recommendation algorithms can produce different results.

2.2 Social-Based Recommender Systems

The traditional recommendation algorithms are mature,but assume the users are independent and rarely use social tag information.However,the fact is that,the relationships among users and the accompanying social information,are useful for improving the accuracy of recommender systems.In this paper,the concept “social-based recommender system” is defined as combining the social network information,such as tags.With the development of social network,how to utilize tags has become a hot issue and been studied in many applications.Hunag et al.[9]used an adaptive model to calculate the relationship between users and tag-related items.A hybrid model is established to recommend tags by combining the similarity between tags to improve the accuracy[10,11].Although these models are helpful to improve recommendation performance,they only consider the relationship between users and tags or items and tags,ignoring the relationship among users,items and tags.Therefore,Symeonidis et al.[12]proposed the model based on tensor to solve the problems;Rendle et al.[13]proposed a pairwise interaction tensor factorization model for tag recommendation;Ifada et al.[14]used social tagging information for tensor decomposition,and then Bayesian estimation is used to calculate user preferences for items.Ma et al.[15,16]proposed a socialized recommendation algorithm based on probability matrix decomposition,which considers user’s potential feature matrix is determined by user preferences and user's trust relationship.Ren et al.[17]proposed a random walk model to spread trust among users in order to solve the problem of data sparsity.Both of the above models are based on the trust relationships among users.In addition,a social regularization method which combines the social relationship of users with matrix decomposition was proposed in[18].Liu et al.[19]employed the influence of context on the basis of[18],and proposed a recommendation algorithm combining social network and context.Because context has a certain impact on social recommendation,tensor decomposition has become a hot issue in social recommendation[20].

However,the social recommendation problem is not well studied in the above work.They assumed that a user’s preference should be similar to that of him/her social network,and tags are not used effectively.Compared with the mentioned work,in this paper,the tags are taken into consideration and tensor decomposition is used to calculate the similarity between users,which can improve the accuracy of recommendation results.

3 The Proposed Social-Based Recommender Algorithm

The traditional recommender systems ignore the importance of tags.They just employ the user-item matrix to generate recommendation.In fact,the tagged ternary relation has higher accuracy in sparse dataset than binary metadata under the same conditions.So,more and more scholars begin to study tagbased recommender model.For the high-dimensional data with tag information,tensor decomposition becomes an efficient research method.

3.1 High Order Singular Value Decomposition(HOSVD)

HOSVD is the most widely used decomposition method in Tucker decomposition.HOSVD can also be regarded as a high-order extension of SVD.Consider a third-order tensorY∈RM×N×L,as Fig.1,HOSVD can decompose the third-order tensor into three low-rank matrices and one core tensor,represents the potential feature matrix of user,represents the potential feature matrix of item,represents the potential feature matrix of tag,is the core tensor,R1,R2andR3represents the potential feature dimensions of users,resources and tags respectively.

Figure 1:The HOSVD model

Assuming that the decomposed tensor is ^Y,

In Eq.(2),λ1,λ2>0,represent the penalty weight of regularization term for loss function.The optimization problem of each parameter can be solved by least square method or random gradient descent method.

3.2 Tensor Decomposition Model Based on Tag Regularization

As the most widely used method in tensor decomposition,HOSVD has been successfully applied in social tagging system.However,the decomposition model based on the original data model is defective.Firstly,the original tensor may be very sparse due to users only tag few items.Under these circumstances,the traditional HOSVD model is ineffective.Secondly,there are correlations among tags,for example,tags with similar semantics usually appear at the same time,so the accuracy of recommendation will be reduced if we ignore the intrinsic relationships between tags.In order to alleviate the problem,we use topic model to map the original sparse tensor to high-dimensional space,and propose a tensor decomposition based on the co-occurrence and mutual exclusion of tags.

The parameters in the tensor diversity model can be solved as follows.

Input:Tensor Y,threshold α,β,parameters λ,λ1,λ2 Output:C,U,Z,T 1.Initial U ∈RM×R1,Z ∈RK×R2,T ∈RL×R3,C ∈RR1×R2×R3 2.t =t0 3.for i,j,k()∈D do 4.η ←1■■t■ and t ←t+1 5.Initial ^Yi,j,k =C×UUi×ZZj×TTk 6. C ←C-η∂l∂C+λ1C7. Ui ←Ui-η∂l∂Ui+λ2Ui8. Zj ←Zj-η∂l∂Zj+λ2Zj9. Tk ←Tk -η∂l∂Tk+λ2Tk10.end for

According to Eqs.(4)–(7),the gradient method is used to calculate the parameter values,it can iteratively solve the values of each parameter in the process of tensor decomposition.The first to third actions are initialization operations,and the fifth to ninth lines are iterative execution steps.η is the learning ratio,λ1,λ2are the regularization parameters.The time complexity of the algorithm isThe scale of the algorithm depends on the number of observations in the tensor and the eigenvector dimensions of users,items and tags,R1,R2andR3is related to the number of iterations.With the updating of tag features,the co-occurrence and mutual exclusion of tags need to be recalculated,so the time complexity is also dominated by the number of tags.

4 Experimental Analysis

In this section,we conduct experiments on real dataset to validate the effectiveness of tag regularization algorithm.All the experiments are conducted on a Linux virtual machine with Intel processors(3.0 GHz)and 4 GB memory.

4.1 Dataset

With the rapid development of Web 2.0 technology,a lot of data has been produced on the internet every day.People influence each other through social network services.In this paper,Del.icio.us dataset,which is a well-known tool that is easy and free to save,organize and discover interesting links on the web,is used for the experiments.The users can share the interesting resource and get in touch with other users who have the same interests.The users in the same community can easily get the new tags of the other users without the need to access them.The dataset from Del.icio.us contains social network information,item and tag.Social network information contains 1867 users.The data density is 0.44%.There are 437593information entries,with 64305 tags,69225 items(URLs)and 1867 users.At last,we collect 99499 pieces of information.By random extraction,80%of the data set is selected as training set and 20%as test set.

We use the training data to learn model and predict Top-Ntags for each user’s tagging tendency of resources.Therefore,in this paper,we use the popular metrics,precision and recall,to measure the prediction quality of the proposed approach.The precision and recall are defined as follows,

whereR(u)denotes the tags that userumay label,andT(u)denotes the actual tags that userulabeled.The precision refers to the number of items whichulabeled takes the proportion of the entire recommendation items.It reflects the possibility thatuis interested in recommender item.The recall refers to the number of items whichulabeled takes the proportion of all the items.It reflects the possibility that an item whichulabeled may be recommended.As usual,with the increasing number of recommender items,the precision decreases.According to the calculation method above mentioned,we useprecision@Nindicates the accuracy of recommended labels andrecall@Nindicates the recall rate of the recommended label.

4.2 Comparisons

In order to discuss the advantages and disadvantages of the proposed algorithm and other algorithms in accuracy and validity,we compare our algorithm with the other two existing algorithms:

1.HOSVD[21]:The traditional decomposition model of HOSVD,which directly uses the original HOSVD to decompose tensors based on.

2.T-HOSVD[21]:A traditional tensor decomposition algorithm based on tag regularization.A tensor model based onis constructed by the traditional 0/1 model,and then decomposed by the tag regularization tensor decomposition method.

3.RSboTD:A tag regularization tensor decomposition algorithm,mapping items to latent semantic space by topic,builds a tensor model based on and decomposes it by tag regularization tensor decomposition method.

4.3 Experimental Results

The experiment is conducted on Del.icio.us dataset.HOSVD,T-HOSVD,RSboTD stands for the three kinds of algorithms mentioned above.The dimension of latent factor characteristic matrix for each order of HOSVD is set respectivelyR1=60,R2=105,R3=225.Setting learning rate η=0.5,the regularization parameters λ1=λ2=0.001,the tag regularization parameters λ=0.01,the tag co-occurrence threshold α=0.7,the tag exclusion threshold β=0.3,K=60.Fig.2 shows the experimental results of various algorithms in precision.

Figure 2:The precision of the compared algorithm

Fig.3 shows the experiment results of the compared algorithms in recall rate.It is obvious that RSboTD performs better than the other existing algorithms,the reason is that it takes advantage of the intrinsic relationships between tags.It can be seen that when recommending the number of tags between 3 and 5,the efficiency is better.

Figure 3:The recall rate of the comparison algorithms

When recommending two tags or even few,the recall rate of RSboTD is slightly lower than that of THOSVD.This is because the advantages of item content cannot be reflected when the number of recommendations is too small,but the recall rate of RSboTD is higher than that of TR-HOSVD when recommend more tags.Therefore,it can be inferred that recommendation based on item content can better discover the items that users are interested in.So,we can consider that the RSboTD algorithm proposed in this paper has better recommendation effect than the other two algorithms.

Tag regularization terms have certain impacts on the performance of the algorithm,but how to measure the constraints of the tag regularization term on the loss function is what we need to study.The threshold parameters α and β are used to measure the degree of co-occurrence and mutual exclusion between tags in tag regularization items,0 <β <0.5,0.5 ≤α ≤1.In order to express the relationship between these two thresholds more intuitively,we set β=1-α ,Fig.4 shows the trends of accuracy and recall rate with varying parameters.

Figure 4:The impacts of parameters

It can be seen from the Fig.4 that when the value of parameter α is between 0.65 and 0.8,which means the value of parameter β is between 0.2 and 0.35,the recommendation performance is better,if the value of parameter α is too small,there are too many co-occurrence and mutually exclusive relations of tags,which will reduce the recommendation performance.When the value of parameter α is too large,there are too few co-occurrence and mutually exclusive relations of tags which will also reduce recommendation performance.From the Fig.4,we can see that α=0.7,the accuracy and recall rate can achieve the optimal results in this dataset.

Fig.5 shows the effect of parameter λ.It can be seen that when the value of λ is too large,the recommendation accuracy and recall rate are almost zero.This is because the tag regularization item is only used to assist recommendation.If it is too large,the training result will be distorted.Therefore,setting the parameter λ in a relatively small range can improve the recommendation performance.In this paper,the accuracy and recall rate can achieve the best results when λ=0.01.If λ=0,the proposed RSboTD algorithm will degenerate into the ordinary HOSVD algorithm.

Figure 5:The impacts of parameter λ

5 Conclusion

In this paper,based on the observation that the tags can improve the prediction quality,we propose a social regularization algorithm which incorporates tags to benefit recommender systems.Tensor decomposition is employed in designing social regularization terms.The experiments on real dataset show that the proposed approach outperforms the other existing algorithms.

With the development of social networks,the social-based recommender systems become more important.In this paper,we fuse tags on the basis of tensor decomposition to improve recommendation quality,however,we need to investigate the following problems:the cold-start problem,the time-series information,the place information and so on.These are many interesting works to be explored in the future.

Acknowledgement:Many thanks are expressed to Jinjin Zhang for his kind help during the preparation of the manuscript and to Congdong Lv for assistant with the experiments.

Funding Statement:The work is supported by the following grants:The National Key Research and Development Program of China(No.2019YFB1404602,X.D.Zhang);The Natural Science Foundation of the Jiangsu Higher Education Institutions of China(No.17KJB520017,Z.B.Sun);The Young Teachers Training Project of Nanjing Audit University(No.19QNPY017,Z.B.Sun);The Opening Project of Jiangsu Key Laboratory of Data Science and Smart Software(No.2018DS301,H.F.Guo,Jinling Institute of Technology);and Funded by Government Audit Research Foundation of Nanjing Audit University.

Conflicts of Interest:The authors declare that they have no conflicts of interest to report regarding the present study.