Network Embedding-Based Anomalous Density Searching for Multi-Group Collaborative Fraudsters Detection in Social Media

2019-07-18ChengzhangZhuWentaoZhaoQianLiPanLiandQiaoboDa

Computers Materials&Continua 2019年7期

Chengzhang Zhu , Wentao Zhao, , Qian Li Pan Li and Qiaobo Da

Abstract: Detecting collaborative fraudsters who manipulate opinions in social media is becoming extremely important in order to provide reliable information, in which, however, the diversity in different groups of collaborative fraudsters presents a significant challenge to existing collaborative fraudsters detection methods.These methods often detect collaborative fraudsters as the largest group of users who have the strongest relation with each other in the social media, consequently overlooking the other groups of fraudsters that are with strong user relation yet small group size.This paper introduces a novel network embedding-based framework NEST and its instance BEST to address this issue.NEST detects multiple groups of collaborative fraudsters by two steps.In the first step, to disclose user collaboration, it represents users according to their social relations.Then, in the second step, to identify the collaborative fraudsters, it detects the user groups with anomalous large group density in its representation space.BEST instantiates NEST by using a bipartite network embedding method to represent users and adopting a fast density group detection method based on the k-dimensional tree.Our experiments show BEST (i) performs significantly better in detecting fraudsters on four real-word social media data sets, and (ii) effectively detects multiple groups of collaborative fraudsters, compared to three state-of-the-art competitors.

Keywords: Fraudster detection, network embedding, social media.

1 Introduction

The reliability of social media content is becoming increasingly significant because social media heavily affects people every day.Unfortunately, a large proportion of social media content is proposed by fraudsters who collaborate to manipulate social opinions driven by huge profit and incentives of reputation [Mukherjee, Venkataraman, Liu et al.(2013); Xiang, Li, Hao et al.(2018)].As a result, effectively detecting such collaborative fraudsters is critical and with great bossiness values [Akoglu, Chandy and Faloutsos (2013)].

Recent year has seen significant progress made in fraudsters detection.Current efforts mainly focused on extracting fraudster indicators and/or features from users’ behavior [Mukherjee, Liu and Glance (2012); Ye and Akoglu (2015); Hooi, Shin, Song et al.(2017)] or users’ proposed content [Mukherjee, Venkataraman, Liu et al.(2013); Wang, Liu and Zhao (2017); You, Qian and Liu (2018)].Because of the great distinguishing ability of anomalous behavior and content, these indicators and/or features have shown remarkable performance in detecting individual fraudsters [Rayana and Akoglu (2016)].However, identifying fraudsters with collaborative manipulation is a challenging task.Specifically, the collaborative manipulation poses the two major challenges below: (i) The content of collaborative fraudsters may not be anomalous because the collaborative manipulation may dominate social opinions.(ii) The professional fraudsters will imitate the behavior of honest users to evade inspection [Hooi, Song, Beutel et al.(2016)].These two challenges cause the failure of current behavior and content-based fraudsters detection methods in detecting collaborative fraudsters.

To detect collaborative fraudsters, the dense subgraph mining methods [Hooi, Song, Beutel et al.(2016); Hooi, Shin, Song et al.(2017); Wu, Hu, Morstatter et al.(2017); Liu, Hooi and Faloutsos (2017); Xiang, Shen, Qin et al.(2018); Xiang, Zhao, Li et al.(2018)] are the major solutions, which detect collaborative fraudsters according to the significant collaboration footprint.Specifically, the dense subgraph mining methods always detect collaborative fraudsters as the largest group of users who have the strongest relation with each other in the social media.However, in this way, they may overlook the other groups of fraudsters that are with strong user relation yet small group size.In reality, social media may contain multiple groups of collaborative fraudsters instead of only the largest group of collaborative fraudsters.

In this paper, we introduce a novel Network Embedding-based denSiTy subgraph mining (NEST for short) framework for multi-group collaborative fraudsters detection in social media.Specifically, NEST first represents users according to their social relations to dis- close user collaboration.In this process, users who have similar activities will be embed- ded near to each other in the representation space.NEST then detects the user groups with anomalous large group density in its representation space to identify the collaborative fraudsters.Accordingly, any group of collaborative fraudsters with large joint activities can be effectively detected.

Essentially, this detection procedure simultaneously tackles three challenges brought by collaborative fraudsters: content domination, behavior camouflage, and multiple fraudsters groups, resulting in a robust and comprehensive collaborative fraudsters detecting result.In the first step, NEST solves the content domination and behavior camouflage problems by distilling user social relations which are reflected in users’ joint activities.The rationale is that the cooperation of collaborative fraudsters to manipulate opinions cannot be avoided.In the second step, NEST discovers fraudsters groups by analyzing the outlier of group density in its representation space.The intuition is that the joint activities of collaborative fraudsters must be more frequent than honest users, but the number of fraudsters is much less than honest users.

We further implement NEST by proposing a Bipartitie networking Embedding-based fast denSiTy subgraph mining method based on the k-dimensional tree structure, termed BEST.Specifically, BEST first models the users and their activities as a bipartite network as demonstrated in Fig.1.In the bipartite network, the nodes on each side are users and activities, and a link refers to a user participates in an activity.Then, to comprehensively capture user collaborations, BEST represents users by embedding both the explicit and implicit relations in the bipartite network.Lastly, to fast detect the collaborative fraudsters, BEST builds a k-dimensional tree for the representation space and searches the anomalous density group based on the k-dimensional tree.

Accordingly, this paper makes two major contributions:

● We introduce a novel network embedding-based framework NEST for identifying collaborative fraudsters in social media.NEST represents users according to their social relations and detects fraudsters by analyzing the outlier of group density in the representation space.It results in a more reliable and comprehensive collaborative fraudsters detection, compared to existing dense subgraph mining-based solutions.

● We instantiate NEST to an effective and efficient multi-group collaborative fraudsters detection method, BEST, by introducing bipartite network embedding and k- dimensional tree-based anomalous density group searching.The bipartite network embedding captures both explicit and implicit user relations, and the k-dimensional tree-based method guarantees the efficiency of density groups searching.

Extensive empirical results show that (i) BEST performs significantly better in detecting fraudsters on four large real-world social media data sets; and (ii) BEST effectively detects multiple groups of collaborative fraudsters, compared to three state-of-the-art competitors.

2 Related work

2.1 Fraudster detection

Current efforts on fraudster detection can be roughly classified into two categories:

individual characteristics-based methods and relational characteristics-based methods.The individual characteristics-based methods use the user proposed content and/or user’s behavior to identify whether a user is a fraudster.The information used by these methods mainly include the statics and linguistic characteristics of a content [Li, Huang, Yang et al.(2011); Mukherjee, Kumar, Liu et al.(2013); Wang, Liu and Zhao (2017); You, Qian and Liu (2018)], and the historical actions of a user [Fei, Mukherjee, Liu et al.(2013); Mukherjee, Venkataraman, Liu et al.(2013)].These individual characteristics are designed as features for fraudster detection [Jindal and Liu (2008); Lim, Nguyen, Jindal et al.(2010); Zhao, Resnick and Mei (2015); Li, Fei, Wang et al.(2017)].However, as evidenced by Hooi et al.[Hooi, Song, Beutel et al.(2016)], the individual characteristics are not robust for collaborative fraudsters who jointly manipulate social opinions and fraudsters may imitate the behavior of honest users.

The relational characteristics-based methods capture user-activity, user-user, and activity-activity relations, typically via a graph [Pandit, Chau, Wang et al.(2007); Stringhini, Kruegel and Vigna (2010); Akoglu, Chandy and Faloutsos (2013); Junqué de Fortuny, Stankova, Moeyersoms et al.(2014); Akoglu, Tong and Koutra (2015); Shehnepoor, Salehi, Farahbakhsh et al.(2017)].They hold an assumption that fake reviews are manipulated by groups of fraudsters.With this assumption, they assume a group of fraudsters will have dense links to a group of manipulated activities (useractivity relation) [Akoglu, Chandy and Faloutsos (2013); Wang, Xie, Liu et al.(2011)], a group of fraudsters will co-occur in many activities (user-user relation) [Wu, Hu, Morstatter et al.(2017); Sun, Qu, Chakrabarti et al.(2005); Xu, Zhang, Chang et al.(2013)], and different manipulated activities will have overlapped linked fraudsters (activity-activity relation) [Hovy (2016)].

Although current methods show their strengths to disclose fraudsters, most of them fail to discover multiple groups of collaborative fraudsters in social network.In this paper, we propose a networking-embedding based framework NEST to fill the gaps of multi- group collaborative fraudsters detection.The proposed NEST achieves a more reliable and comprehensive detection by revealing users within density groups in its representation space, which delicately embeds the user’s social relationships.

2.2.Network embedding

Our proposed method is based on network embedding, which can be categorized into two types: matrix factorization (MF)-based and neural network-based methods.

MF-based methods involve linear [Cox and Cox (2000)] and nonlinear [Nedich and Ozdaglar (2008)] procedures in the embedding process.While the linear procedures adopt linear transformations, such as singular value decomposition (SVD) and multiple dimensional scaling (MDS), to generate low-dimensional embedding [Cox and Cox (2000)], the non- linear methods utilize nonlinear transformations, e.g.kernel PCA and manifold learning, to capture complicated data structures.However, both have high computational cost because of their eigen-decomposition operation on data matrix.Accordingly, these methods do not suit for large social network embedding.

Recently, neural network-based methods have shown the state-of-the-art performance.Followed by DeepWalk [Perozzi, Al-Rfou and Skiena (2014)] and Node2Vec [Grover and Leskovec (2016)], most of neural network-based methods reformulate a network embedding task as a word embedding task via performing truncated random walks in a network to convert the network to sentences.More recently, advanced work embeds both explicit and implicit relations in a network and shows its significance [Tang, Qu, Wang et al.(2015); Wang, Cui and Zhu (2016); Cao, Lu and Xu (2015); Xu, Wei, Cao et al.(2017)].However, the above methods are not designed for social network embedding.They treat the nodes in a network homogeneously, and thus, cannot capture the difference between user and activity in social media.In addition, the truncated random walks used in these methods do not consider the user-activity joint distribution in social network.

In this paper, we instantiate NEST as an effective and efficient method, BEST, via a bipartite network embedding method.This Bipartitie network embedding method is tailored for social media.Accordingly, it captures user-activity relations better in its user representation space, which provides a solid foundation for collaborative fraudsters detection.

3 NEST for collaborative fraudster detection

NEST framework adopts a two-steps procedure to detect collaborative fraudsters in social media.The workflow of NEST framework is shown in Fig.1.For a social media S with a set of usersand a set of activitiesin the first step, NEST extracts a bipartite network G from S aswhere U and A are the nodes on the two sides of G, respectively, and E U V⊆ × defines the inter-set edges.Here, each edge in E carries a non-negative weightijw , reflecting the strength between a useriu and an activityja , and theijw will be zero if the useriu does not join the activityja .Accordingly, the weights in the bipartite network can be represented by a n × m matrix W =[ wij].Then, NEST learns an embedding function f(⋅):U→ Rd, which maps a useriu to a d dimensional vector representationiu .The embedding function f(⋅) should capture and embed the social relations of users in the bipartite network into their representation space.In the second step, NEST finds the anomalous density groups in the user representation space and treats the users in the anomalous density groups as collaborative fraudsters.

Formally, NEST detects a set of collaborative fraudster groupsaccording to

NEST has a good generalizability since it can be instantiated by specifying any network embedding method and any anomalous density groups searching method.We introduce an instance of NEST in next section and then verify its performance by empirical analyses.

4 A NEST instance: BEST

BEST instantiates NEST by a bipartite network embedding method catering for social net- work, and a k-dimensional tree-based anomalous density group searching method for efficient fraudsters detection.

Figure 1: NEST Framework.In the first step, NEST extracts a bipartite network from social media data, and represents user into a vector space by embedding their social relation in the bipartite network.In the second step, NEST searches the anomalous density group of users in the representation space for collaborative fraudsters detection.The detected collaborative fraudsters are illustrated with a grey background, and their corresponding groups are highlighted by a dotted circle

4.1 Bipartite network embedding

The network embedding reveals and embeds social relations of a user into the user’s vector representation, which reflects the cooperation of users in social media.We introduce a bipartite network embedding method to jointly capture the explicit and implicate relations of users in social media.

4.1.1 Explicit relations embedding

The explicit relations refer to the direct links between users and activities, which reflect the activities a user jointed.If two users always joint similar activities, their similarity should be large in the representation space.

To preserve the explicit relations, we keep the preference of users in their representation space.Specifically, we measure the preference of a user in both social media and representation space, and make the preference of a user in representation space similar to that in social media.For the preference measurement in social media, we consider the probability of a user join in an activity.Given the bipartite network, this probability can be calculated as follows:

where wijis the weight of edge eij.The measurement reflects the preference distribution of users.We follow the setting of word2vec to use the sigmoid function to measure the interaction of a user and an activity in their representation space in a probability space:

where ui∈Rdand aj∈Rdare the embedding vectors of uiand aj, respectively.Then, we adopt KL-divergence to measure the difference between P andand optimize the user and activity representation to minimize the KL-divergence as follows:

Considering P(i,j) is a constant, minimizing the Eq.(4) equals to follows:

4.1.2 Implicit relations embedding

The implicit relations refer to the relations between users and activities that are not directly connected.For two users, if there exist a path between them in the bipartite network, they may have an implicit relation, and the weight of the path reflects the strength of this implicit relation.However, counting the paths between two nodes in a bipartite network has a great high complexity, which is impracticable in social media.

Inspired by DeepWalk [Perozzi, Al-Rfou and Skiena (2014)], we also perform a truncated random walks on the network to generate nodes corpus as random walk paths, which contain higher order implicit relations between nodes.We move a step further to reconstruct the bipartite network G as two networks where each network only contains users G(u)or activitiesG(a), and conduct random walks on these two transformed networks.It results in a stationary distribution of random walks on social media data [Gao, Chen, He et al.(2018)].InG(u), uiand ujwill have an edgeif exists a tkthat∈ E and∈ E where E is the edge set ofG.InG(a), aiand ajwill have an edgeif exists a ukthat∈ E and∈ E where E is the edge set ofG.

The random walk paths generation procedure is illustrated in Algorithm 1, which generates a set of random walk paths D(u)ofU, a set of random walk paths D(a)of A.The implicit relations embedding aims to maximize the conditional probability of the context of a node.For user corpus()u

D , it maximizes the conditional probability as follows:

where S refers to the sequence in the context,refers the context nodes of node uiin sequence s.Similarly, for activities corpusthe implicit embedding maximizes the conditional probability as:

BEST jointly considers the explicit and implicit relations embedding, forming a joint embedding objective function:

where α , β and γ are the hyper-parameters to trade-off the effects of the three components.This objective function can be effectively solved by stochastic optimization methods.By solving the objective function (8), BEST represents users into a vector space where user’s social relations have been embedded.

4.2 K-dimensional tree-based anomalous density group searching

To fast search the anomalous density group, BEST first builds a k-dimensional tree (kdtree for short) for the user representation space, and then estimates the density around each user in that space.Finally, it adopts the criteria Eq.(1) in NEST to identify the anomalous density groups.

Figure 2: Example of kd-tree.The illustrated kd-tree is built on the user representation space shown in Fig.1.Each level splits one dimension of the space into two parts

4.2.1 Building kd-tree

For user representation set u ={u1,u2⋅⋅⋅, un}, BEST builds a kd-tree, v, by Algorithm 2.As illustrated in Fig.2, the kd-tree v is a binary tree storing the user representation with their structure information, which enables the fast searching of anomalous density groups.

Algorithm 2: Building kd-tree Procedure name: BUILDKDTREE(u,depth) Input: A set of point u, and the current depth.Output: The root of the kd-tree, v, storing u 1 if u contains only one point then 2 return a leaf storing this point.3 else 4 l ←depth%d+1; 5 Split u into two subsets according to the median value q in the l th-dimension of the points inu.Let (1)u be the set of points which l th-dimension value is smaller or equal to the q, and let (2)u be the set of other points ; 6 leftv← BUILDKDTREE( (1)u , depth+1) ; 7 rightv← BUILDKDTREE( (2)u , depth+1) ; 8 Create a node v storing the q in the l th-dimension, make leftvleftv the left child of v, and make rightv the right child of v ; 9 return v.10 end

4.2.2 Density estimation

BEST estimates the density around each user in its representation space based on the kd- tree v according to the Algorithm 3, where the function SEARCHKDTREE(iu ,v,ρ) returns a set of users that around the useriu within the range η based on the kd-tree v.Essentially, BEST estimates the density around a user by the number of users close to the user within a certain distance in the representation space.If a user has a large density, the user should have a lot of collaborations with others.Accordingly, BEST uses the density as an important evidence to identify collaborative fraudsters.

Algorithm 3: Density estimation based on kd-tree Input : A set of point u, the kd-tree v and η.Output: A set of densities around each user ρ, a set of user sets S.1 {}ρ ← 2 foreach iu in u do 3 iS ←SEARCHKDTREE( iu , v, η) ; 4 iiS ρ ← ; 5 { }i ρρρ← ∪; 6 { }i S SS← ∪; 7 end 8 return ρ,S.

4.2.3 Collaborative fraudsters detection

BEST detects collaborative fraudsters after estimating density around users in the user representation space.Specifically, it treats the density larger than a thresholdε , e.g.five times of the averaged density, as anomalous, and assigns the users in the density areas as fraudsters.The procedure is summarized in the Algorithm 4.

Algorithm 4: Collaborative fraudsters detection Input : A set of densities around each user ρ, a set of user sets S, a threshold ε Output: A set of fraudster users F..1 {}ρ ← 2 foreach iS in S do 3 if iρ ε＞ then; 4 i FF S← ∪ 5 end 6 return F

5 Experiments

5.1 Data sets

The experiments are carried on two large scale real word social media data sets, including Yelp restaurant and Yelp hotel data sets used in Mukherjee et al.[Mukherjee, Venkataraman, Liu et al.(2013)].All the activities in these data sets have been assigned authenticity labels given by commercial filters.

5.2 Evaluation metrics

We evaluate their performance by three metrics - precision, recall, and F-score.While precision evaluates the fraction of true fraudsters among detected fraudsters, recall reflects the fraction of true fraudsters that have been detected over the total amount of true fraudsters.The precision and recall should be jointly considered since fraudsters detection is an imbalance problem [Luca and Zervas (2016)], i.e., fraudsters are much less than honest users.Thus, we use F-score, which balances the precision and recall, as an averaged indicator.Higher F-score indicates a better performance of a fraudsters detection method.We report these three metrics per ground-truth honest user and fraudster classes to illustrate the performance for different categories.We further average them to show overall performance.

We follow the literature [Wang, Liu and Zhao (2017)] to use the results of the Yelp commercial fraud filter to evaluate the performance.Because the Yelp commercial fraud filter only give the authenticity labels of activities, we transform the authenticity labels to the honest labels of users as the ground-truth.Considering the fraud activities distribution per each user assigned by the commercial filters, we assign the fraudster label to a user if more than 80% of the activities of the user have been labeled as fraud.The rationale is that we need to filter the false positive made by the commercial filters [Li, Chen, Liu et al.(2014)].In other words, we assume that a user with a higher proportion of the assigned fraud activities will be more likely a real fraudster.

5.3 Parameters settings

In the experiments, we set the parameters of BEST as follows.To balance the explicit and implicit social relations, we set the hyper-parameters α , β , and γ is the network embedding objective function Eq.(8) as 0.5, 0.25, and 0.25, respectively.We train the network embedding by Adam [Kingma and Ba (2014)] with embedding dimension 128 and batch size 32.For the density estimation, we set the distance range η as 1.For the anomalous density detection, we set the threshold s as the five times of the averaged density.For the parameters in the compared methods, we take their recommended settings.

5.4 Evaluation of BEST effectiveness on fraudster detection

5.4.1 Experimental settings

BEST is compared with two state-of-the-art competitors: Frauder [Hooi, Song, Beutel et al.(2016)] and HoloScope [Liu, Hooi and Faloutsos (2017)] in detecting collaborative fraudsters.These two competitors are both based on dense subgraph mining, but with different setting on the graph construction.

● Fixed weighting dense subgraph mining-based method - FRAUDER [Hooi, Song, Beutel et al.(2016)].FRAUDER is a fraudsters detection method by dense subgraph mining.To detect camouflage and hijacked accounts, it adopts a fixed weighting strategy.

● Dynamic weighting dense subgraph mining-based method-HoloScope [Liu, Hooi and Faloutsos (2017)].HoloScope uses information from graph topology and temporal spikes to detect groups of fraudsters, and employs a dynamic weighting approach to allow a more accurately fraud detection.

5.4.2 Findings-BEST significantly improving fraudsters detection performance, especially recall

The precision, recall and F-score of BEST, Frauder, and HoloScope are reported in Tab.1.Overall, BEST significantly outperforms the competitors.It improves 21.8% and 10.03% compared with the best-performing method in terms of F-score on two data sets.

Table 1: Collaborative fraudsters detection performance of different methods

5.5 Evaluation of BEST-generated user representation quality

5.5.1 Experimental settings

We visualize the user representation in a two-dimensional space trough TSNE [Maaten and Hinton (2008)].To evaluate the user representation quality, we plot the ground-truth labels of each user at their positions in the representation space.A high-quality user representation will enable a dense distribution for the collaborative fraudsters.The behavior representation generated by BEST is compared with that generated by JETB [Wang, Liu and Zhao (2017)], which is the state-of-the-art user representation method for fraudsters detection.

5.5.2 Findings-BEST generated user representation embeds fraudsters into groups with anomalous high density

The user representations generated by BEST and JETB are visualized in Fig.3.In the JETB generated representation space, the users with large density are not consistent to the ground-truth fraudster label.In contrast, the density of BEST generated representation is consistent with the ground-truth fraudsters distribution.This qualitative illustrates that BEST effectively captures the social relation of users in social media, which is essential for the collaborative fraudsters detection.

Figure 3: User representation with density of different methods on Yelp-hotel and Yelp- restaurant.The sub-figures (a), (b), (c), (d) contain the user representation information with the ground-truth labels, and the sub-figures (e), (f), (g), (h) show the density in the representation space

6 Conclusion

This paper introduces a network-embedding collaborative fraudsters detection framework NEST and its instance BEST.They perform an anomalous density searching procedure on a network embedding space which enables the detecting multiple groups of collaborative fraudsters.Two large real-world data sets demonstrate the performance of BEST is substantially better than the state-of-the-art competitors.

Acknowledgements:The work is supported by National Natural Science Foundation of China under Grant No.U1811462.

References

Akoglu, L.; Chandy, R.; Faloutsos, C.(2013): Opinion fraud detection in online reviews by network effects.Proceedings of the International AAAI Conference on Web and Social Media, vol.13, pp.2-11.

Akoglu, L.; Tong, H.; Koutra, D.(2015): Graph based anomaly detection and description: a survey.Data Mining and Knowledge Discovery, vol.29, no.3, pp.626-688.

Cao, S.; Lu, W.; Xu, Q.(2015): GraRep: learning graph representations with global structural information.Proceedings of the ACM International on Conference on Information and Knowledge Management, pp.891-900.

Cox, T.F.; Cox, M.A.(2000): Multidimensional scaling.Chapman and Hall/CRC.

Fei, G.; Mukherjee, A.; Liu, B.; Hsu, M.; Castellanos, M.et al.(2013): Exploiting burstiness in reviews for review spammer detection.Proceedings of the International AAAI Conference on Web and Social Media, vol.13, pp.175-184.

Gao, M.; Chen, L.; He, X.; Zhou, A.(2018): BiNE: bipartite network embedding.Proceedings of the International ACM SIGIR Conference on Research & Development in Information Retrieval, pp.715-724.

Grover, A.; Leskovec, J.(2016): node2vec: scalable feature learning for networks.Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp.855-864.

Hooi, B.; Shin, K.; Song, H.A.; Beutel, A.; Shah, N.et al.(2017): Graph-based fraud detection in the face of camouflage.ACM Transactions on Knowledge Discovery from Data, vol.11, no.4, pp.44:1-44:26.

Hooi, B.; Song, H.A.; Beutel, A.; Shah, N.; Shin, K.; Faloutsos, C.(2016): FRAUDAR: Bounding graph fraud in the face of camouflage.Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp.895-904.

Hovy, D.(2016): The enemy in your own camp: how well can we detect statistically- generated fake reviews-an adversarial study.Proceedings of the Annual Meeting of the Association for Computational Linguistics, vol.2, pp.351-356.

Jindal, N.; Liu, B.(2008): Opinion spam and analysis.Proceedings of the ACM International WSDM Conference, pp.219-230.

Junqué de Fortuny, E.; Stankova, M.; Moeyersoms, J.; Minnaert, B.; Provost, F.et al.(2014): Corporate residence fraud detection.Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp.1650-1659.

Kingma, D.P.; Ba, J.(2014): Adam: a method for stochastic optimization.arXiv preprint arXiv:1412.6980.

Li, F.; Huang, M.; Yang, Y.; Zhu, X.(2011): Learning to identify review spam.Proceedings of the International Joint Conference on Artificial Intelligence, pp.2488-2493.

Li, H.; Chen, Z.; Liu, B.; Wei, X.; Shao, J.(2014): Spotting fake reviews via collective positive-unlabeled learning.Proceedings of the IEEE International Conference on Data Mining, pp.899-904.

Li, H.; Fei, G.; Wang, S.; Liu, B.; Shao, W.et al.(2017): Bimodal distribution and cobursting in review spam detection.Proceedings of the International Conference on World Wide Web, pp.1063-1072.

Lim, E.P.; Nguyen, V.A.; Jindal, N.; Liu, B.; Lauw, H.W.(2010): Detecting product review spammers using rating behaviors.Proceedings of the ACM International Conference on Information and Knowledge Management, pp.939-948.

Liu, S.; Hooi, B.; Faloutsos, C.(2017): Holoscope: topology-and-spike aware fraud detection.Proceedings of the ACM International Conference on Information and Knowledge Management, pp.1539-1548.

Luca, M.; Zervas, G.(2016): Fake it till you make it: reputation, competition, and yelp review fraud.Management Science, vol.62, no.12, pp.3412-3427.

Maaten, L.v.d.; Hinton, G.(2008): Visualizing data using t-SNE.Journal of Machine Learning Research, vol.9, pp.2579-2605.

Mukherjee, A.; Kumar, A.; Liu, B.; Wang, J.; Hsu, M.et al.(2013): Spotting opinion spammers using behavioral footprints.Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp.632-640.

Mukherjee, A.; Liu, B.; Glance, N.(2012): Spotting fake reviewer groups in consumer reviews.Proceedings of the International Conference on World Wide Web, pp.191-200.

Mukherjee, A.; Venkataraman, V.; Liu, B.; Glance, N.S.(2013): What yelp fake review filter might be doing? Proceedings of the International AAAI Conference on Web and Social Media, pp.409-418.

Nedich, A.; Ozdaglar, A.(2008): A geometric framework for nonconvex optimization duality using augmented lagrangian functions.Journal of Global Optimization, vol.40, no.4, pp.545-573.

Pandit, S.; Chau, D.H.; Wang, S.; Faloutsos, C.(2007): Netprobe: a fast and scalable system for fraud detection in online auction networks.Proceedings of the International Conference on World Wide Web, pp.201-210.

Perozzi, B.; Al-Rfou, R.; Skiena, S.(2014): Deepwalk: Online learning of social representations.Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp.701-710.

Rayana, S.; Akoglu, L.(2016): Collective opinion spam detection using active inference.Proceedings of the IEEE International Conference on Data Mining, pp.630-638.

Shehnepoor, S.; Salehi, M.; Farahbakhsh, R.; Crespi, N.(2017): Netspam: a networkbased spam detection framework for reviews in online social media.IEEE Transactions on Information Forensics and Security, vol.12, no.7, pp.1585-1595.

Stringhini, G.; Kruegel, C.; Vigna, G.(2010): Detecting spammers on social networks.Proceedings of the Annual Computer Security Applications Conference, pp.1-9.

Sun, J.; Qu, H.; Chakrabarti, D.; Faloutsos, C.(2005): Neighborhood formation and anomaly detection in bipartite graphs.Proceedings of the IEEE International Conference on Data Mining, pp.1-8.

Tang, J.; Qu, M.; Wang, M.; Zhang, M.; Yan, J.et al.(2015): Line: large-scale information network embedding.Proceedings of the International Conference on World Wide Web, pp.1067-1077.

Wang, D.; Cui, P.; Zhu, W.(2016): Structural deep network embedding.Proceedings of the 22nd ACM SIGKDD international Conference on Knowledge Discovery and Data Mining, pp.1225-1234.

Wang, G.; Xie, S.; Liu, B.; Philip, S.Y.(2011): Review graph based online store review spammer detection.ICDM, pp.1242-1247.

Wang, X.; Liu, K.; Zhao, J.(2017): Handling cold-start problem in review spam detection by jointly embedding texts and behaviors.Proceedings of the Annual Meeting of the Association for Computational Linguistics, vol.1, pp.366-376.

Wu, L.; Hu, X.; Morstatter, F.; Liu, H.(2017): Adaptive spammer detection with sparse group modeling.Proceedings of the International AAAI Conference on Web and Social Media, pp.319-326.

Xiang, L.; Li, Y.; Hao, W.; Yang, P.; Shen, X.(2018): Reversible natural language watermarking using synonym substitution and arithmetic coding.Computers, Materials & Continua, vol.55, no.3, pp.541-559.

Xiang, L.; Shen, X.; Qin, J.; Hao, W.(2018): Discrete multi-graph hashing for largescale visual search.Neural Processing Letters.

Xiang, L.; Zhao, G.; Li, Q.; Hao, W.; Li, F.(2018): TUMK-ELM: A fast unsupervised heterogeneous data learning approach.IEEE Access, vol.6, pp.35305-35315.

Xu, C.; Zhang, J.; Chang, K.; Long, C.(2013): Uncovering collusive spammers in Chinese review websites.Proceedings of the ACM International on Conference on Information and Knowledge Management, pp.979-988.

Xu, L.; Wei, X.; Cao, J.; Yu, P.S.(2017): Embedding of embedding (EOE): Joint embedding for coupled heterogeneous networks.Proceedings of the ACM International Conference on Web Search and Data Mining, pp.741-749.

Ye, J.; Akoglu, L.(2015): Discovering opinion spammer groups by network footprints.Proceedings of the European Conference on Machine Learning, pp.267-282.

You, Z.; Qian, T.; Liu, B.(2018): An attribute enhanced domain adaptive model for cold- start spam review detection.Proceedings of the International Conference on Computational Linguistics, pp.1884-1895.

Zhao, Z.; Resnick, P.; Mei, Q.(2015): Enquiring minds: early detection of rumors in social media from enquiry posts.Proceedings of the International Conference on World Wide Web, pp.1395-1405.

Computers Materials&Continua

2019年7期