Bibliography
We propose an efficient recommendation algorithm that incorporates the side information of users' trust and distrust social relationships into the learning process of a Joint Non-negative Matrix Factorization technique based on Signed Graphs, namely JNMF-SG. The key idea of this study is to generate clusters based on signed graphs, with positive and negative edge weights for the trust and distrust relationships, respectively. Using a spectral clustering approach for signed graphs, clusters are extracted under the condition that users with positive connections should lie close together, while users with negative connections should lie far apart. We then propose a Joint Non-negative Matrix Factorization framework that generates the final recommendations from the user-item and user-cluster associations over the joint factorization. In experiments on a dataset from a real-world social media platform, we show that we significantly increase recommendation accuracy compared to state-of-the-art methods that also exploit trust and distrust side information in matrix factorization.
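As an illustration of the clustering step described above, here is a minimal Python sketch of spectral clustering with a signed Laplacian on a toy trust/distrust graph. The function name and toy data are our own assumptions, and the joint factorization step is not shown; this is a sketch of the general technique, not the paper's implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def signed_spectral_clusters(A, k):
    """Cluster nodes of a signed graph A (+1 trust, -1 distrust).

    Uses the signed Laplacian L = D_bar - A, where D_bar is the diagonal
    of absolute degrees sum_j |A_ij|; its smallest eigenvectors embed
    users so positive neighbors fall close and negative ones far apart.
    """
    D_bar = np.diag(np.abs(A).sum(axis=1))
    L = D_bar - A
    _, vecs = np.linalg.eigh(L.astype(float))  # eigenvalues ascending
    embedding = vecs[:, :k]                    # k smallest eigenvectors
    return KMeans(n_clusters=k, n_init=10).fit_predict(embedding)

# Toy graph: two trusting pairs that distrust each other
A = np.array([[ 0,  1, -1, -1],
              [ 1,  0, -1, -1],
              [-1, -1,  0,  1],
              [-1, -1,  1,  0]])
print(signed_spectral_clusters(A, k=2))  # e.g. [0 0 1 1]
```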
The cold start problem in recommender systems refers to the inability to make reliable recommendations if a critical mass of items has not yet been rated. To bypass this problem, existing research has focused on developing more reliable prediction models for situations in which only a few item ratings exist. However, most of these approaches depend on adjusting the algorithm that determines a recommendation. We present a complementary approach that does not require any adjustments to the recommendation algorithm. We draw on motivation theory and reward users for rating items. In particular, we instantiate different gamification patterns and examine their effect on the average user's number of provided report ratings. Our results confirm the positive effect of instantiating gamification patterns on the number of received report ratings.
This tutorial offers a rich blend of theory and practice regarding dimensionality reduction methods, to address the information overload problem in recommender systems. This problem affects our everyday experience while searching for knowledge on a topic. Naive Collaborative Filtering cannot deal with challenging issues such as scalability, noise, and sparsity. We can deal with all the aforementioned challenges by applying matrix and tensor decomposition methods. These methods have proven to be among the most accurate (e.g., in the Netflix Prize) and efficient for handling big data. For each method (SVD, SVD++, timeSVD++, HOSVD, CUR, etc.) we provide a detailed theoretical mathematical background and a step-by-step analysis, using an integrated toy example which runs throughout all parts of the tutorial, helping the audience to clearly understand the differences among factorization methods.
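To give a flavor of the toy-example style, here is a minimal pure-SVD rating predictor in Python with NumPy. The rating matrix and the mean-imputation choice are illustrative assumptions of ours, not the tutorial's own running example.

```python
import numpy as np

# Toy user-item matrix; 0 marks a missing rating (hypothetical data)
R = np.array([[5., 4., 0., 1.],
              [4., 0., 3., 1.],
              [1., 1., 0., 5.],
              [0., 1., 4., 4.]])

# Impute missing entries with each user's mean, then truncate the SVD
filled = np.where(R > 0, R, (R.sum(1) / (R > 0).sum(1))[:, None])
U, s, Vt = np.linalg.svd(filled, full_matrices=False)
k = 2                                  # keep the top-k latent factors
R_hat = U[:, :k] * s[:k] @ Vt[:k, :]   # low-rank reconstruction

print(R_hat.round(2))  # e.g. predicted score for user 0, item 2: R_hat[0, 2]
```

Keeping only the top-k singular values smooths out noise and fills the missing cells, which is the core intuition the more elaborate variants (SVD++, timeSVD++, etc.) build on.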
Modeling individual users on Social Web platforms such as Twitter plays a significant role in providing personalized recommendations and filtering interesting information from social streams. Recently, researchers have proposed the use of concepts (e.g., DBpedia entities) for representing user interests instead of word-based approaches, since Knowledge Bases such as DBpedia provide cross-domain background knowledge about concepts and can thus be used to extend user interest profiles. Even so, not all concepts are covered by a Knowledge Base, especially on microblogging platforms such as Twitter where new concepts and topics emerge every day. In this short paper, instead of using concepts alone, we propose using synsets from WordNet together with concepts from DBpedia for representing user interests. We evaluate our proposed user modeling strategies by comparing them with other bag-of-concepts approaches. The results show that using synsets and concepts together to represent user interests significantly improves the quality of user modeling in the context of link recommendations on Twitter.
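A rough Python sketch of the bag-of-concepts idea, assuming NLTK's WordNet interface for synsets; the DBpedia entity-linking step is mocked as an input list, since the paper's actual linker and weighting scheme are not specified here.

```python
from collections import Counter
from math import sqrt
from nltk.corpus import wordnet as wn  # requires nltk.download('wordnet')

def profile(tweets, dbpedia_concepts):
    """Bag-of-concepts profile mixing WordNet synsets and DBpedia entities.

    dbpedia_concepts stands in for an entity linker (e.g. DBpedia
    Spotlight), which is out of scope for this sketch.
    """
    bag = Counter(dbpedia_concepts)
    for tweet in tweets:
        for word in tweet.lower().split():
            for syn in wn.synsets(word)[:1]:   # take the most frequent sense
                bag[syn.name()] += 1
    return bag

def cosine(p, q):
    dot = sum(p[c] * q[c] for c in p if c in q)
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0

user = profile(["learning python for data science"], ["dbr:Python_(programming_language)"])
link = profile(["a python tutorial"], ["dbr:Python_(programming_language)"])
print(cosine(user, link))  # candidate links would be ranked by this score
```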
The Wikidata platform is a crowdsourced, structured knowledge base aiming to provide integrated, free and language-agnostic facts which are used, among others, by the Wikipedias. Users who actively enter, review and revise data on Wikidata are assisted by a property-suggesting system which presents properties that might also be applicable to a given item. We argue that evaluating and subsequently improving this recommendation mechanism, and hence assisting users, can directly contribute to an even more integrated, consistent and extensive knowledge base serving a huge variety of applications. However, the quality and usefulness of such recommendations have not been evaluated yet. In this work, we provide the first evaluation of different approaches aiming to provide users with property recommendations in the process of curating information on Wikidata. We compare the approach currently deployed on Wikidata with two state-of-the-art recommendation approaches stemming from the fields of RDF recommender systems and collaborative information systems. Further, we also evaluate hybrid recommender systems combining these approaches. Our evaluations show that the current recommendation algorithm works well with regard to recall and precision, reaching a recall@7 of 79.71% and a precision@7 of 27.97%. We also find that incorporating contextual as well as classifying information into the computation of property recommendations can further improve performance significantly.
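To illustrate the co-occurrence intuition behind property recommendation (a common baseline in this space, not necessarily the exact algorithm evaluated in the paper), here is a small Python sketch that ranks absent properties by their co-occurrence confidence with an item's present properties; the property IDs are arbitrary toy data.

```python
from collections import defaultdict
from itertools import combinations

def build_cooccurrence(items):
    """Count how often property pairs co-occur across items.

    items: a list of property sets, one set per Wikidata item.
    """
    pair_counts = defaultdict(int)
    prop_counts = defaultdict(int)
    for props in items:
        for p in props:
            prop_counts[p] += 1
        for a, b in combinations(sorted(props), 2):
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1
    return prop_counts, pair_counts

def suggest(current_props, prop_counts, pair_counts, top_n=7):
    """Rank absent properties by confidence P(candidate | present property)."""
    scores = defaultdict(float)
    for p in current_props:
        for (a, b), c in pair_counts.items():
            if a == p and b not in current_props:
                scores[b] += c / prop_counts[p]
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

items = [{"P31", "P21", "P569"}, {"P31", "P21", "P106"}, {"P31", "P569", "P106"}]
print(suggest({"P31", "P21"}, *build_cooccurrence(items)))
```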
The Netflix experience is driven by a number of recommendation algorithms: personalized ranking, page generation, similarity, ratings, search, etc. On January 6th, 2016 we simultaneously launched Netflix in 130 new countries around the world, which brought the total to over 190 countries. Preparing for such a rapid expansion while ensuring each algorithm was ready to work seamlessly created new challenges for our recommendation and search teams. In this talk, we will highlight the four most interesting challenges we encountered in making our algorithms operate globally and how overcoming them improved our ability to connect members worldwide with stories they'll love. In particular, we will dive into the problems of uneven availability across catalogs, balancing personal and cultural tastes, handling language, and tracking the quality of recommendations. Uneven catalog availability is a challenge because many recommendation algorithms assume that people could interact with any item and then use the absence of interaction, implicitly or explicitly, as negative information in the model. However, this assumption does not hold globally and across time where item availability differs. Running algorithms globally means needing a notion of location so that we can handle local variations in taste while also providing a good basis for personalization. Language is another challenge in recommending video content because people can typically only enjoy content that has assets (audio, subtitles) in languages they understand. Preferences for how people enjoy such content also vary between people and depend on their familiarity with a language. Also, while we would like our recommendations to work well for every one of our members, tracking quality becomes difficult: with so many members in so many countries speaking so many languages, it can be hard to determine when an algorithm or system is performing sub-optimally for some subset of them. Thus, to support this global launch, we examined each and every algorithm that is part of our service and began to address these challenges.
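As a sketch of the uneven-availability point, the following hypothetical Python sampler draws implicit negatives only from items actually available in a member's country, instead of treating the whole catalog as candidates. The data layout and names are assumptions for illustration, not Netflix's systems.

```python
import random

def sample_negatives(user, interactions, availability, n=5):
    """Sample implicit negatives only from items the user could have seen.

    interactions: dict user id -> set of items the user has played
    availability: dict country -> set of items licensed there
    A naive sampler over the full catalog would wrongly treat items that
    were never available to the user as rejected; they are excluded here.
    """
    seen = interactions[user["id"]]
    candidates = list(availability[user["country"]] - seen)
    return random.sample(candidates, min(n, len(candidates)))

interactions = {"u1": {"show_a"}}
availability = {"BR": {"show_a", "show_b", "show_c"}}
print(sample_negatives({"id": "u1", "country": "BR"}, interactions, availability, n=2))
```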
Recommender systems must find items that match the heterogeneous preferences of their users. Customizable recommenders allow users to directly manipulate the system's algorithm in order to help it match those preferences. However, customizing may demand a certain degree of skill, and new users in particular may struggle to customize the system effectively. In user studies of two different systems, I show that there is considerable heterogeneity in the way new users try to customize a recommender, even within groups of users with similar underlying preferences. Furthermore, I show that this heterogeneity persists beyond the first few interactions with the recommender. System designs should account for this heterogeneity so that new users can both receive good recommendations in their early interactions and learn how to customize the system effectively for their preferences.
Recommender systems have become quite popular recently. However, such systems are vulnerable to several types of attacks that target user ratings. One such attack is the Sybil attack, in which an entity masquerades as several identities with the intention of diverting user ratings. In this work, we propose evolutionary game theory as a possible solution to the Sybil attack in recommender systems. After modeling the attack, we use replicator dynamics to solve for evolutionarily stable strategies. Our results show that under certain conditions that are easily achievable by a system administrator, the probability of an attack strategy drops to zero, implying degraded fitness for Sybil nodes that eventually die out.
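A minimal Python simulation of the replicator-dynamics idea, with a hypothetical two-strategy payoff matrix in which a detection penalty c imposed by the administrator drives the attack strategy's share to zero. The game model and numbers are illustrative assumptions, not the paper's.

```python
import numpy as np

def replicator(payoff, x, steps=2000, dt=0.01):
    """Discrete-time replicator dynamics: x_i += x_i * (f_i - f_bar) * dt."""
    for _ in range(steps):
        f = payoff @ x               # fitness of each strategy
        x = x + x * (f - x @ f) * dt # grow strategies beating the average
        x = np.clip(x, 0, None)
        x /= x.sum()
    return x

# Strategies: 0 = Sybil attack, 1 = honest rating (hypothetical payoffs).
c = 3.0  # detection penalty applied to attackers
payoff = np.array([[1.0 - c, 2.0 - c],
                   [1.0,     1.5]])
x = replicator(payoff, np.array([0.5, 0.5]))
print(x.round(3))  # attack share -> 0 once the penalty is large enough
```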
Today's systems produce a rapidly exploding amount of data, and that data in turn derives more data, forming a complex data propagation network that we call the data's lineage. There are many reasons why users want systems to forget certain data, including its lineage. From a privacy perspective, users who become concerned with new privacy risks of a system often want the system to forget their data and lineage. From a security perspective, if an attacker pollutes an anomaly detector by injecting manually crafted data into the training data set, the detector must forget the injected data to regain security. From a usability perspective, a user can remove noise and incorrect entries so that a recommendation engine gives useful recommendations. Therefore, we envision forgetting systems, capable of forgetting certain data and their lineages completely and quickly. This paper focuses on making learning systems forget, a process we call machine unlearning, or simply unlearning. We present a general, efficient unlearning approach that transforms the learning algorithms used by a system into a summation form. To forget a training data sample, our approach simply updates a small number of summations, which is asymptotically faster than retraining from scratch. Our approach is general because the summation form derives from statistical query learning, in which many machine learning algorithms can be implemented. Our approach also applies to all stages of machine learning, including feature selection and modeling. Our evaluation on four diverse learning systems and real-world workloads shows that our approach is general, effective, fast, and easy to use.
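To make the summation idea concrete, here is a sketch (our own, not the paper's code) of least-squares regression kept in summation form, where forgetting a sample simply subtracts its contribution from the accumulated sums.

```python
import numpy as np

class SummationLinearModel:
    """Least squares kept as summations (X^T X and X^T y) so single
    samples can be 'unlearned' by subtraction, without full retraining."""

    def __init__(self, d):
        self.XtX = np.zeros((d, d))
        self.Xty = np.zeros(d)

    def learn(self, x, y):
        self.XtX += np.outer(x, x)
        self.Xty += x * y

    def unlearn(self, x, y):
        # Forgetting one sample costs O(d^2), vs. O(n d^2) to retrain.
        self.XtX -= np.outer(x, x)
        self.Xty -= x * y

    def weights(self, reg=1e-6):
        return np.linalg.solve(self.XtX + reg * np.eye(len(self.Xty)), self.Xty)

m = SummationLinearModel(d=2)
for x, y in [([1., 0.], 1.), ([0., 1.], 2.), ([9., 9.], 99.)]:  # last is "polluted"
    m.learn(np.array(x), y)
m.unlearn(np.array([9., 9.]), 99.)  # forget the injected sample
print(m.weights())                  # ~[1, 2], as if it were never seen
```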
We provide a generic framework that, with the help of a preprocessing phase that is independent of the inputs of the users, allows an arbitrary number of users to securely outsource a computation to two non-colluding external servers. Our approach is provably secure in an adversarial model where one of the servers may arbitrarily deviate from the protocol specification and employ an arbitrary number of dummy users. We use these techniques to implement a secure recommender system based on collaborative filtering that is more secure, and, when the preprocessing effort is excluded, significantly more efficient than previously known implementations of such systems. We suggest different alternatives for preprocessing and discuss their merits and demerits.
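The following toy Python sketch illustrates the general two-server idea with additive secret sharing, not the paper's actual protocol: each user splits a rating so that neither server alone learns it, yet the servers' partial sums combine to the true aggregate.

```python
import secrets

Q = 2**61 - 1  # a public modulus; all shares live in Z_Q

def share(value):
    """Additively secret-share a value between two non-colluding servers."""
    r = secrets.randbelow(Q)
    return r, (value - r) % Q          # share for server 1, share for server 2

ratings = [5, 3, 4, 1]                 # each user's private rating
s1 = s2 = 0
for v in ratings:
    a, b = share(v)
    s1 = (s1 + a) % Q                  # accumulated by server 1
    s2 = (s2 + b) % Q                  # accumulated by server 2

total = (s1 + s2) % Q                  # combining reveals only the aggregate
print(total, "== sum of ratings:", sum(ratings))
```

Each share on its own is uniformly random, so either server's view is independent of the actual ratings; only the combined output leaks the aggregate, which is the quantity a collaborative filter needs.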