Biblio
Most traditional recommendation algorithms only consider the binary relationship between users and projects, these can basically be converted into score prediction problems. But most of these algorithms ignore the users's interests, potential work factors or the other social factors of the recommending products. In this paper, based on the existing trustworthyness model and similarity measure, we puts forward the concept of trust similarity and design a joint interest-content recommendation framework to suggest users which videos to watch in the online video site. In this framework, we first analyze the user's viewing history records, tags and establish the user's interest characteristic vector. Then, based on the updated vector, users should be clustered by sparse subspace clust algorithm, which can improve the efficiency of the algorithm. We certainly improve the calculation of similarity to help users find better neighbors. Finally we conduct experiments using real traces from Tencent Weibo and Youku to verify our method and evaluate its performance. The results demonstrate the effectiveness of our approach and show that our approach can substantially improve the recommendation accuracy.
In order to improve the accuracy of similarity, an improved collaborative filtering algorithm based on trust and information entropy is proposed in this paper. Firstly, the direct trust between the users is determined by the user's rating to explore the potential trust relationship of the users. The time decay function is introduced to realize the dynamic portrayal of the user's interest decays over time. Secondly, the direct trust and the indirect trust are combined to obtain the overall trust which is weighted with the Pearson similarity to obtain the trust similarity. Then, the information entropy theory is introduced to calculate the similarity based on weighted information entropy. At last, the trust similarity and the similarity based on weighted information entropy are weighted to obtain the similarity combing trust and information entropy which is used to predicted the rating of the target user and create the recommendation. The simulation shows that the improved algorithm has a higher accuracy of recommendation and can provide more accurate and reliable recommendation service.
Short Message Service is now-days the most used way of communication in the electronic world. While many researches exist on the email spam detection, we haven't had the insight knowledge about the spam done within the SMS's. This might be because the frequency of spam in these short messages is quite low than the emails. This paper presents different ways of analyzing spam for SMS and a new pre-processing way to get the actual dataset of spam messages. This dataset was then used on different algorithm techniques to find the best working algorithm in terms of both accuracy and recall. Random Forest algorithm was then implemented in a real world application library written in C\# for cross platform .Net development. This library is capable of using a prebuild model for classifying a new dataset for spam and ham.
Machine learning (ML) algorithms provide a good solution for many security sensitive applications, they themselves, however, face the threats of adversary attacks. As a key problem in machine learning, how to design robust feature selection algorithms against these attacks becomes a hot issue. The current researches on defending evasion attacks mainly focus on wrapped adversarial feature selection algorithm, i.e., WAFS, which is dependent on the classification algorithms, and time cost is very high for large-scale data. Since mRMR (minimum Redundancy and Maximum Relevance) algorithm is one of the most popular filter algorithms for feature selection without considering any classifier during feature selection process. In this paper, we propose a novel adversary-aware feature selection algorithm under filter model based on mRMR, named FAFS. The algorithm, on the one hand, takes the correlation between a single feature and a label, and the redundancy between features into account; on the other hand, when selecting features, it not only considers the generalization ability in the absence of attack, but also the robustness under attack. The performance of four algorithms, i.e., mRMR, TWFS (Traditional Wrapped Feature Selection algorithm), WAFS, and FAFS is evaluated on spam filtering and PDF malicious detection in the Perfect Knowledge attack scenarios. The experiment results show that FAFS has a better performance under evasion attacks with less time complexity, and comparable classification accuracy.
Nowadays, Internet Service Providers (ISPs) have been depending on Deep Packet Inspection (DPI) approaches, which are the most precise techniques for traffic identification and classification. However, constructing high performance DPI approaches imposes a vigilant and an in-depth computing system design because the demands for the memory and processing power. Membership query data structures, specifically Bloom filter (BF), have been employed as a matching check tool in DPI approaches. It has been utilized to store signatures fingerprint in order to examine the presence of these signatures in the incoming network flow. The main issue that arise when employing Bloom filter in DPI approaches is the need to use k hash functions which, in turn, imposes more calculations overhead that degrade the performance. Consequently, in this paper, a new design and implementation for a DPI approach have been proposed. This DPI utilizes a membership query data structure called Cuckoo filter (CF) as a matching check tool. CF has many advantages over BF like: less memory consumption, less false positive rate, higher insert performance, higher lookup throughput, support delete operation. The achieved experiments show that the proposed approach offers better performance results than others that utilize Bloom filter.
With the rapid development of the information technology, more and more high-speed networks came out. The 4G LTE network as a recently emerging network has gradually entered the mainstream of the communication network. This paper proposed an effective content-based information filtering based on the 4G LTE high-speed network by combing the content-based filter and traditional simple filter. Firstly, raw information is pre-processed by five-tuple filter. Secondly, we determine the topics and character of the source data by key nearest neighbor text classification after minimum-risk Bayesian classification. Finally, the improved AdaBoost algorithm achieves the four-level content-based information filtering. The experiments reveal that the effective information filtering method can be applied to the network security, big data analysis and other fields. It has high research value and market value.
Spam Filtering is an adversary application in which data can be purposely employed by humans to attenuate their operation. Statistical spam filters are manifest to be vulnerable to adversarial attacks. To evaluate security issues related to spam filtering numerous machine learning systems are used. For adversary applications some Pattern classification systems are ordinarily used, since these systems are based on classical theory and design approaches do not take into account adversarial settings. Pattern classification system display vulnerabilities (i.e. a weakness that grants an attacker to reduce assurance on system's information) to several potential attacks, allowing adversaries to attenuate their effectiveness. In this paper, security evaluation of spam email using pattern classifier during an attack is addressed which degrade the performance of the system. Additionally a model of the adversary is used that allows defining spam attack scenario.
A robust appearance model is usually required in visual tracking, which can handle pose variation, illumination variation, occlusion and many other interferences occurring in video. So far, a number of tracking algorithms make use of image samples in previous frames to update appearance models. There are many limitations of that approach: 1) At the beginning of tracking, there exists no sufficient amount of data for online update because these adaptive models are data-dependent and 2) in many challenging situations, robustly updating the appearance models is difficult, which often results in drift problems. In this paper, we proposed a tracking algorithm based on compressive sensing theory and particle filter framework. Features are extracted by random projection with data-independent basis. Particle filter is employed to make a more accurate estimation of the target location and make much of the updated classifier. The robustness and the effectiveness of our tracker have been demonstrated in several experiments.
Efficient and secure search on encrypted data is an important problem in computer science. Users having large amount of data or information in multiple documents face problems with their storage and security. Cloud services have also become popular due to reduction in cost of storage and flexibility of use. But there is risk of data loss, misuse and theft. Reliability and security of data stored in the cloud is a matter of concern, specifically for critical applications and ones for which security and privacy of the data is important. Cryptographic techniques provide solutions for preserving the confidentiality of data but make the data unusable for many applications. In this paper we report a novel approach to securely store the data on a remote location and perform search in constant time without the need for decryption of documents. We use bloom filters to perform simple as well advanced search operations like case sensitive search, sentence search and approximate search.