Biblio
Nowadays, a typical household owns multiple digital devices that can be connected to the Internet. Advertising companies always want to seamlessly reach consumers behind devices instead of the device itself. However, the identity of consumers becomes fragmented as they switch from one device to another. A naive attempt is to use deterministic features such as user name, telephone number and email address. However consumers might refrain from giving away their personal information because of privacy and security reasons. The challenge in ICDM2015 contest is to develop an accurate probabilistic model for predicting cross-device consumer identity without using the deterministic user information. In this paper we present an accurate and scalable cross-device solution using an ensemble of Gradient Boosting Decision Trees (GBDT) and Random Forest. Our final solution ranks 9th both on the public and private LB with F0.5 score of 0.855.
Pervasive Computing is one of the latest and more advanced paradigms currently available in the computers arena. Its ability to provide the distribution of computational services within environments where people live, work or socialize leads to make issues such as privacy, trust and identity more challenging compared to traditional computing environments. In this work we review these general issues and propose a Pervasive Computing architecture based on a simple but effective trust model that is better able to cope with them. The proposed architecture combines some Artificial Intelligence techniques to achieve close resemblance with human-like decision making. Accordingly, Apriori algorithm is first used in order to extract the behavioral patterns adopted from the users during their network interactions. Naïve Bayes classifier is then used for final decision making expressed in term of probability of user trustworthiness. To validate our approach we applied it to some typical ubiquitous computing scenarios. The obtained results demonstrated the usefulness of such approach and the competitiveness against other existing ones.
Now a day's cloud computing is power station to run multiple businesses. It is cumulating more and more users every day. Database-as-a-service is service model provided by cloud computing to store, manage and process data on a cloud platform. Database-as-a-service has key characteristics such as availability, scalability, elasticity. A customer does not have to worry about database installation and management. As a replacement, the cloud database service provider takes responsibility for installing and maintaining the database. The real problem occurs when it comes to storing confidential or private information in the cloud database, we cannot rely on the cloud data vendor. A curious cloud database vendor may capture and leak the secret information. For that purpose, Protected Database-as-a-service is a novel solution to this problem that provides provable and pragmatic privacy in the face of a compromised cloud database service provider. Protected Database-as-a-service defines various encryption schemes to choose encryption algorithm and encryption key to encrypt and decrypt data. It also provides "Master key" to users, so that a metadata storage table can be decrypted only by using the master key of the users. As a result, a cloud service vendor never gets access to decrypted data, and even if all servers are jeopardized, in such inauspicious circumstances a cloud service vendor will not be able to decrypt the data. Proposed Protected Database-as-a-service system allows multiple geographically distributed clients to execute concurrent and independent operation on encrypted data and also conserve data confidentiality and consistency at cloud level, to eradicate any intermediate server between the client and the cloud database.
Small embedded devices such as microcontrollers have been widely used for identification, authentication, securing and storing confidential information. In all these applications, the security and privacy of the microcontrollers are of crucial importance. To provide strong security to protect data, these devices depend on cryptographic algorithms to ensure confidentiality and integrity of data. Moreover, many algorithms have been proposed, with each one having its strength and weaknesses. This paper presents a Differential Power Analysis(DPA) attack on hardware implementations of Advanced Encryption Standard(AES) running inside a PIC18F2420 microcontroller.
The data processing capabilities of MapReduce systems pioneered with the on-demand scalability of cloud computing have enabled the Big Data revolution. However, the data controllers/owners worried about the privacy and accountability impact of storing their data in the cloud infrastructures as the existing cloud computing solutions provide very limited control on the underlying systems. The intuitive approach - encrypting data before uploading to the cloud - is not applicable to MapReduce computation as the data analytics tasks are ad-hoc defined in the MapReduce environment using general programming languages (e.g, Java) and homomorphic encryption methods that can scale to big data do not exist. In this paper, we address the challenges of determining and detecting unauthorized access to data stored in MapReduce based cloud environments. To this end, we introduce alarm raising honeypots distributed over the data that are not accessed by the authorized MapReduce jobs, but only by the attackers and/or unauthorized users. Our analysis shows that unauthorized data accesses can be detected with reasonable performance in MapReduce based cloud environments.
Language vector space models (VSMs) have recently proven to be effective across a variety of tasks. In VSMs, each word in a corpus is represented as a real-valued vector. These vectors can be used as features in many applications in machine learning and natural language processing. In this paper, we study the effect of vector space representations in cyber security. In particular, we consider a passive traffic analysis attack (Website Fingerprinting) that threatens users' navigation privacy on the web. By using anonymous communication, Internet users (such as online activists) may wish to hide the destination of web pages they access for different reasons such as avoiding tyrant governments. Traditional website fingerprinting studies collect packets from the users' network and extract features that are used by machine learning techniques to reveal the destination of certain web pages. In this work, we propose the packet to vector (P2V) approach where we model website fingerprinting attack using word vector representations. We show how the suggested model outperforms previous website fingerprinting works.
Searchable encryption is a new developing information security technique and it enables users to search over encrypted data through keywords without having to decrypt it at first. In the last decade, many researchers are engaging in the field of searchable encryption and have proposed a series of efficient search schemes over encrypted cloud data. It is the time to survey this field to conclude a comprehensive framework by analyzing individual contributions. This paper focuses on the searchable encryption schemes in cloud. We firstly summarize the general model and threat model in searchable encryption schemes, and then present the privacy-preserving issues in these schemes. In addition, we compare the efficiency and security between semantic search and preferred search in detail. At last, some open issues and research challenges in the future are proposed.
So far, cloud storage has been accepted by an increasing number of people, which is not a fresh notion any more. It brings cloud users a lot of conveniences, such as the relief of local storage and location independent access. Nevertheless, the correctness and completeness as well as the privacy of outsourced data are what worry could users. As a result, most people are unwilling to store data in the cloud, in case that the sensitive information concerning something important is disclosed. Only when people feel worry-free, can they accept cloud storage more easily. Certainly, many experts have taken this problem into consideration, and tried to solve it. In this paper, we survey the solutions to the problems concerning auditing in cloud computing and give a comparison of them. The methods and performances as well as the pros and cons are discussed for the state-of-the-art auditing protocols.
Privacy analysis is essential in the society. Data privacy preservation for access control, guaranteed service in wireless sensor networks are important parts. In programs' verification, we not only consider about these kinds of safety and liveness properties but some security policies like noninterference, and observational determinism which have been proposed as hyper properties. Fairness is widely applied in verification for concurrent systems, wireless sensor networks and embedded systems. This paper studies verification and analysis for proving security-relevant properties and hyper properties by proposing deductive proof rules under fairness requirements (constraints).
Steganography is the art of the hidden data in such a way that it detection of hidden knowledge prevents. As the necessity of security and privacy increases, the need of the hiding secret data is ongoing. In this paper proposed an enhanced detection of the 1-2-4 LSB steganography and RSA cryptography in Gray Scale and Color images. For color images, we apply 1-2-4 LSB on component of the RGB, then encrypt information applying RSA technique. For Gray Images, we use LSB to then encrypt information and also detect edges of gray image. In the experimental outcomes, calculate PSNR and MSE. We calculate peak signal noise ratio for quality and brightness. This method makes sure that the information has been encrypted before hiding it into an input image. If in any case the cipher text got revealed from the input image, the middle person other than receiver can't access the information as it is in encrypted form.
The threats of smartphone security are mostly from the privacy disclosure and malicious chargeback software which deducting expenses abnormally. They exploit the vulnerabilities of previous permission mechanism to attack to mobile phones, and what's more, it might call hardware to spy privacy invisibly in the background. As the existing Android operating system doesn't support users the monitoring and auditing of system resources, a dynamic supervisory mechanism of process behavior based on Dalvik VM is proposed to solve this problem. The existing android system framework layer and application layer are modified and extended, and special underlying services of system are used to realize a dynamic supervisory on the process behavior of Dalvik VM. Via this mechanism, each process on the system resources and the behavior of each app process can be monitored and analyzed in real-time. It reduces the security threats in system level and positions that which process is using the system resource. It achieves the detection and interception before the occurrence or the moment of behavior so that it protects the private information, important data and sensitive behavior of system security. Extensive experiments have demonstrated the accuracy, effectiveness, and robustness of our approach.
Today's systems produce a rapidly exploding amount of data, and the data further derives more data, forming a complex data propagation network that we call the data's lineage. There are many reasons that users want systems to forget certain data including its lineage. From a privacy perspective, users who become concerned with new privacy risks of a system often want the system to forget their data and lineage. From a security perspective, if an attacker pollutes an anomaly detector by injecting manually crafted data into the training data set, the detector must forget the injected data to regain security. From a usability perspective, a user can remove noise and incorrect entries so that a recommendation engine gives useful recommendations. Therefore, we envision forgetting systems, capable of forgetting certain data and their lineages, completely and quickly. This paper focuses on making learning systems forget, the process of which we call machine unlearning, or simply unlearning. We present a general, efficient unlearning approach by transforming learning algorithms used by a system into a summation form. To forget a training data sample, our approach simply updates a small number of summations – asymptotically faster than retraining from scratch. Our approach is general, because the summation form is from the statistical query learning in which many machine learning algorithms can be implemented. Our approach also applies to all stages of machine learning, including feature selection and modeling. Our evaluation, on four diverse learning systems and real-world workloads, shows that our approach is general, effective, fast, and easy to use.
Hash based biometric template protection schemes (BTPS), such as fuzzy commitment, fuzzy vault, and secure sketch, address the privacy leakage concern on the plain biometric template storage in a database through using cryptographic hash calculation for template verification. However, cryptographic hashes have only computational security whose being cracked shall leak the biometric feature in these BTPS; and furthermore, existing BTPS are rarely able to detect during a verification process whether a probe template has been leaked from the database or not (i.e., being used by an imposter or a genuine user). In this paper we tailor the "honeywords" idea, which was proposed to detect the hashed password cracking, to enable the detectability of biometric template database leakage. However, unlike passwords, biometric features encoded in a template cannot be renewed after being cracked and thus not straightforwardly able to be protected by the honeyword idea. To enable the honeyword idea on biometrics, diversifiability (and thus renewability) is required on the biometric features. We propose to use BTPS for his purpose in this paper and present a machine learning based protected template generation protocol to ensure the best anonymity of the generated sugar template (from a user's genuine biometric feature) among other honey ones (from synthesized biometric features).
Spam Filtering is an adversary application in which data can be purposely employed by humans to attenuate their operation. Statistical spam filters are manifest to be vulnerable to adversarial attacks. To evaluate security issues related to spam filtering numerous machine learning systems are used. For adversary applications some Pattern classification systems are ordinarily used, since these systems are based on classical theory and design approaches do not take into account adversarial settings. Pattern classification system display vulnerabilities (i.e. a weakness that grants an attacker to reduce assurance on system's information) to several potential attacks, allowing adversaries to attenuate their effectiveness. In this paper, security evaluation of spam email using pattern classifier during an attack is addressed which degrade the performance of the system. Additionally a model of the adversary is used that allows defining spam attack scenario.
Nowadays, Online Social Networks (OSNs) are very popular and have become an integral part of our life. People are dependent on Online Social Networks for various purposes. The activities of most of the users are normal, but a few of the users exhibit unusual and suspicious behavior. We term this suspicious and unusual behavior as malicious behavior. Malicious behavior in Online Social Networks includes a wide range of unethical activities and actions performed by individuals or communities to manipulate thought process of OSN users to fulfill their vested interest. Such malicious behavior needs to be checked and its effects should be minimized. To minimize effects of such malicious activities, we require proper detection and containment strategy. Such strategy will protect millions of users across the OSNs from misinformation and security threats. In this paper, we discuss the different studies performed in the area of malicious behavior analysis and propose a framework for detection of malicious behavior in OSNs.
In data analysis, it is always a tough task to strike the balance between the privacy and the applicability of the data. Due to the demand for individual privacy, the data are being more or less obscured before being released or outsourced to avoid possible privacy leakage. This process is so called de-identification. To discuss a de-identification policy, the most important two aspects should be the re-identification risk and the information loss. In this paper, we introduce a novel policy searching method to efficiently find out proper de-identification policies according to acceptable re-identification risk while retaining the information resided in the data. With the UCI Machine Learning Repository as our real world dataset, the re-identification risk can therefore be able to reflect the true risk of the de-identified data under the de-identification policies. Moreover, using the proposed algorithm, one can then efficiently acquire policies with higher information entropy.
According to the advancement of mobile devices and wireless network technology, these portable devices became the potential devices that can be used for different types of payments. Recently, most of the people would rather to do their activities by their cellphones. On the other hand, there are some issues that hamper the widespread acceptance of mobile payment among people. The traditional ways of mobile payment are not secure enough, since they follow the traditional flow of data. This paper is going to suggest a new protocol named Golden Mobile Pay Center Protocol that is based on client centric model. The suggested protocol downgrade the computational operations and communications that are necessary between the engaging parties and achieves a completely privacy protection for the engaging parties. It avoids transaction repudiation among the engaging parties and will decrease replay attack s risk. The goal of the protocol is to help n users to have payments to each others'. Besides, it will utilize a new key agreement protocol named Golden Circle that is working by employing symmetric key operations. GMPCP uses GC for generating a shared session key between n users.
Online Social Networks exploit a lightweight process to identify their users so as to facilitate their fast adoption. However, such convenience comes at the price of making legitimate users subject to different threats created by fake accounts. Therefore, there is a crucial need to empower users with tools helping them in assigning a level of trust to whomever they interact with. To cope with this issue, in this paper we introduce a novel model, DIVa, that leverages on mining techniques to find correlations among user profile attributes. These correlations are discovered not from user population as a whole, but from individual communities, where the correlations are more pronounced. DIVa exploits a decentralized learning approach and ensures privacy preservation as each node in the OSN independently processes its local data and is required to know only its direct neighbors. Extensive experiments using real-world OSN datasets show that DIVa is able to extract fine-grained community-aware correlations among profile attributes with average improvements up to 50% than the global approach.
The new era of information communication and technology (ICT), everyone wants to store/share their Data or information in online media, like in cloud database, mobile database, grid database, drives etc. When the data is stored in online media the main problem is arises related to data is privacy because different types of hacker, attacker or crackers wants to disclose their private information as publically. Security is a continuous process of protecting the data or information from attacks. For securing that information from those kinds of unauthorized people we proposed and implement of one the technique based on the data modification concept with taking the iris database on weka tool. And this paper provides the high privacy in distributed clustered database environments.
With the growing observed success of big data use, many challenges appeared. Timeless, scalability and privacy are the main problems that researchers attempt to figure out. Privacy preserving is now a highly active domain of research, many works and concepts had seen the light within this theme. One of these concepts is the de-identification techniques. De-identification is a specific area that consists of finding and removing sensitive information either by replacing it, encrypting it or adding a noise to it using several techniques such as cryptography and data mining. In this report, we present a new model of de-identification of textual data using a specific Immune System algorithm known as CLONALG.
With the advancement of technology, the world has not only become a better place to live in but have also lost the privacy and security of shared data. Information in any form is never safe from the hands of unauthorized accessing individuals. Here, in our paper we propose an approach by which we can preserve data using visual cryptography. In this paper, two sixteen segment displayed text is broken into two shares that does not reveal any information about the original images. By this process we have obtained satisfactory results in statistical and structural testes.
User uses smartphones for web surfing and browsing data. Many smartphones are embedded with inbuilt location aware system called GPS [Global Positioning System]. Using GPS user have to register and share his all private information to the LBS server. LBS is nothing but Location Based Service. Simply user sends the query to the LBS server. Then what is happening the LBS server gives a private information regarding particular user location. There will be a possibility to misuse this information so using mobile crowd method hides user location from LBS server and avoid sharing of privacy information with server. Our solution does not required to change the LBS server architecture.