Artificial Intelligence and Privacy, 2014

John McCarthy coined the term "Artificial Intelligence" in 1955 and defined it as "the science and engineering of making intelligent machines" (as quoted in Poole, Mackworth & Goebel, 1998). AI research is highly technical and specialized, and has been characterized as "deeply divided into subfields that often fail to communicate with each other" (McCorduck, Pamela (2004), Machines Who Think (2nd ed.)). These divisions are attributed to both technical and social factors. The research cited here looks at the privacy implications of artificial intelligence, especially as applied to data mining, and was presented in 2014.


Jiajun Sun; Huadong Ma, "Privacy-Preserving Verifiable Incentive Mechanism For Online Crowdsourcing Markets," Computer Communication and Networks (ICCCN), 2014 23rd International Conference on, pp. 1, 8, 4-7 Aug. 2014. doi: 10.1109/ICCCN.2014.6911794
Abstract: Mobile crowdsourcing is a new paradigm which leverages pervasive smartphones to efficiently collect and upload data, enabling numerous novel applications. Recently, a class of new mechanisms have been proposed to determine near-optimal prices of sensing tasks for online crowdsourcing markets, where users arrive online and the crowdsourcer has budget constraints. In particular, the mechanisms can motivate extensive users to participate in online crowdsourcing markets. Although it is so promising in real-life environments, there still exist many security and privacy challenges. In this paper, we present a heterogeneous-user based privacy-preserving verifiable incentive mechanism for online crowdsourcing markets with the budget constraint, not only to explore how to protect the privacy of the bids, selection preferences, and identity from participants, but also to make the verifiable payment between the crowdsourcer (the crowdsourcing organizer) and online sequential arrival users. Results indicate that our privacy-preserving verifiable mechanisms achieve the same results as the generic one without privacy preservation.
Keywords: data privacy; mobile computing; outsourcing; security of data; smart phones; budget constraint; heterogeneous-user based privacy-preserving verifiable incentive mechanism; mobile crowdsourcing; near-optimal prices; online crowdsourcing markets; online sequential arrival users; pervasive smartphones; privacy challenges; security challenges; sensing tasks; Artificial intelligence; Crowdsourcing; Mobile communication; Privacy; Public key; incentive mechanism; online crowdsourcing markets; privacy preservation; security verification (ID#: 15-5711)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6911794&isnumber=6911704

 

YoungSae Kim; JinHee Han; YongSung Jeon, "Design And Requirements For Video Encryption In Intelligent Surveillance System," Information and Communication Technology Convergence (ICTC), 2014 International Conference on, pp. 763, 764, 22-24 Oct. 2014. doi: 10.1109/ICTC.2014.6983281
Abstract: This paper presents the design and requirements of effective video encryption for intelligent surveillance systems. For this purpose, we design a new video encryption system and derive requirements for it in order to protect privacy and harm in surveillance videos.
Keywords: cryptography; data privacy; video surveillance; derive requirements; intelligent surveillance system; privacy protection; surveillance videos; video encryption; Artificial intelligence; Encryption; Event detection; Object detection; Privacy; Surveillance; intelligent surveillance; video classification; video encryption (ID#: 15-5712)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6983281&isnumber=6983064

 

Zuxing Li; Oechtering, T.J.; Jaldén, J., "Parallel Distributed Neyman-Pearson Detection With Privacy Constraints," Communications Workshops (ICC), 2014 IEEE International Conference on, pp. 765, 770, 10-14 June 2014. doi: 10.1109/ICCW.2014.6881292
Abstract: In this paper, the privacy problem of a parallel distributed detection system vulnerable to an eavesdropper is proposed and studied in the Neyman-Pearson formulation. The privacy leakage is evaluated by a metric related to the Neyman-Pearson criterion. We will show that it is sufficient to consider a deterministic likelihood-ratio test for the optimal detection strategy at the eavesdropped sensor. This fundamental insight helps to simplify the problem to find the optimal privacy-constrained distributed detection system design. The trade-off between the detection performance and privacy leakage is illustrated in a numerical example.
Keywords: data privacy; maximum likelihood detection; parallel algorithms; telecommunication security; wireless sensor networks; deterministic likelihood ratio test; eavesdropped sensor; optimal privacy constrained distributed detection system design; parallel distributed Neyman-Pearson detection; privacy leakage evaluation; Artificial intelligence; Wireless communication (ID#: 15-5713)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6881292&isnumber=6881162

 

Yuangang Yao; Xiaoyu Ma; Hui Liu; Jin Yi; Xianghui Zhao; Lin Liu, "A Semantic Knowledge Base Construction Method for Information Security," Trust, Security and Privacy in Computing and Communications (TrustCom), 2014 IEEE 13th International Conference on, pp. 803, 808, 24-26 Sept. 2014. doi: 10.1109/TrustCom.2014.106
Abstract: Information security contains many concepts and knowledge entities. As the development of information technology, the complexity of increasing information security knowledge need an overview representation and organization for security analysis and risk evaluation. Ontology as a formal and shareable semantic model, which is often used to define domain knowledge schema, can also be applied for information security knowledge base construction. In this paper, we propose ontology knowledge base construction method for information security, discuss the ontology construction processes, and design the knowledge schema. The ontology contains main concepts in information security and related properties and relations about these concepts with semantics. It supplies related information, such as assets and weakness, to security management and analysis applications. We introduce each step of the proposed method, and valid it using a practical information security knowledge base development.
Keywords: knowledge based systems; ontologies (artificial intelligence); risk analysis; security of data; formal semantic model; information security analysis; information security knowledge base construction; information security knowledge base development; information technology; knowledge entities; ontology construction processes; ontology knowledge base construction method; risk evaluation; security analysis applications; security management applications; semantic knowledge base construction method; shareable semantic model; Data mining; Information security; Knowledge based systems; Ontologies; Semantics; information security; knowledge base; ontology construction; semantic web (ID#: 15-5714)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7011330&isnumber=7011202

 

Yuxuan Luo; Jianjiang Feng; Jie Zhou, "Fingerprint Matching Based On Global Minutia Cylinder Code," Biometrics (IJCB), 2014 IEEE International Joint Conference on, pp. 1, 8, Sept. 29 2014-Oct. 2 2014. doi: 10.1109/BTAS.2014.6996231
Abstract: Although minutia set based fingerprint matching algorithms have achieved good matching accuracy, developing a fingerprint recognition system that satisfies accuracy, efficiency and privacy requirements simultaneously remains a challenging problem. Fixed-length binary vector like IrisCode is considered to be an ideal representation to meet these requirements. However, existing fixed-length vector representations of fingerprints suffered from either low distinctiveness or misalignment problem. In this paper, we propose a discriminative fixed-length binary representation of fingerprints based on an extension of Minutia Cylinder Code. A machine learning based algorithm is proposed to mine reliable reference points to overcome the misalignment problem. Experimental results on public domain plain and rolled fingerprint databases demonstrate the effectiveness of the proposed approach.
Keywords: fingerprint identification; image matching; image representation; learning (artificial intelligence); vectors; IrisCode; fingerprint databases; fingerprint recognition system; fixed-length binary vector fingerprint representations; global minutia cylinder code; machine learning based algorithm; minutia set based fingerprint matching algorithms; privacy requirements; Abstracts; Filtering algorithms (ID#: 15-5715)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6996231&isnumber=6996217

 

Bassily, R.; Smith, A.; Thakurta, A., "Private Empirical Risk Minimization: Efficient Algorithms and Tight Error Bounds," Foundations of Computer Science (FOCS), 2014 IEEE 55th Annual Symposium on,  pp. 464, 473, 18-21 Oct. 2014. doi: 10.1109/FOCS.2014.56
Abstract: Convex empirical risk minimization is a basic tool in machine learning and statistics. We provide new algorithms and matching lower bounds for differentially private convex empirical risk minimization assuming only that each data point's contribution to the loss function is Lipschitz and that the domain of optimization is bounded. We provide a separate set of algorithms and matching lower bounds for the setting in which the loss functions are known to also be strongly convex. Our algorithms run in polynomial time, and in some cases even match the optimal nonprivate running time (as measured by oracle complexity). We give separate algorithms (and lower bounds) for (ε, 0)- and (ε, δ)-differential privacy; perhaps surprisingly, the techniques used for designing optimal algorithms in the two cases are completely different. Our lower bounds apply even to very simple, smooth function families, such as linear and quadratic functions. This implies that algorithms from previous work can be used to obtain optimal error rates, under the additional assumption that the contributions of each data point to the loss function is smooth. We show that simple approaches to smoothing arbitrary loss functions (in order to apply previous techniques) do not yield optimal error rates. In particular, optimal algorithms were not previously known for problems such as training support vector machines and the high-dimensional median.
Keywords: computational complexity; convex programming; learning (artificial intelligence); minimisation; (ε, δ)-differential privacy; (ε, 0)-differential privacy; Lipschitz loss function; arbitrary loss function smoothing; machine learning; optimal nonprivate running time; oracle complexity; polynomial time; private convex empirical risk minimization; smooth function families; statistics; Algorithm design and analysis; Convex functions; Noise measurement; Optimization; Privacy; Risk management; Support vector machines (ID#: 15-5716)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6979031&isnumber=6978973

 

Wei Wang; Qian Zhang, "A Stochastic Game For Privacy Preserving Context Sensing On Mobile Phone," INFOCOM, 2014 Proceedings IEEE, pp. 2328, 2336, April 27 2014-May 2 2014. doi: 10.1109/INFOCOM.2014.6848177
Abstract: The proliferation of sensor-equipped smartphones has enabled an increasing number of context-aware applications that provide personalized services based on users' contexts. However, most of these applications aggressively collect users sensing data without providing clear statements on the usage and disclosure strategies of such sensitive information, which raises severe privacy concerns and leads to some initial investigation on privacy preservation mechanisms design. While most prior studies have assumed static adversary models, we investigate the context dynamics and call attention to the existence of intelligent adversaries. In this paper, we first identify the context privacy problem with consideration of the context dynamics and malicious adversaries with capabilities of adjusting their attacking strategies, and then formulate the interactive competition between users and adversaries as a zero-sum stochastic game. In addition, we propose an efficient minimax learning algorithm to obtain the optimal defense strategy. Our evaluations on real smartphone context traces of 94 users validate the proposed algorithm.
Keywords: data privacy; learning (artificial intelligence); minimax techniques; smart phones; stochastic games; ubiquitous computing; attacking strategy; context dynamics; context privacy problem; context-aware application; disclosure strategy; intelligent adversary; interactive competition; minimax learning algorithm; mobile phone; optimal defense strategy; personalized services; privacy preservation mechanisms design; privacy preserving context sensing; sensor-equipped smartphones; static adversary model; user context; user sensing data; zero-sum stochastic game; Context; Context-aware services; Games; Privacy; Sensors; Smart phones; Stochastic processes (ID#: 15-5717)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6848177&isnumber=6847911

 

Wenjuan Li; Weizhi Meng; Zhiyuan Tan; Yang Xiang, "Towards Designing an Email Classification System Using Multi-view Based Semi-supervised Learning," Trust, Security and Privacy in Computing and Communications (TrustCom), 2014 IEEE 13th International Conference on, pp. 174, 181, 24-26 Sept. 2014.  doi: 10.1109/TrustCom.2014.26
Abstract: The goal of email classification is to classify user emails into spam and legitimate ones. Many supervised learning algorithms have been invented in this domain to accomplish the task, and these algorithms require a large number of labeled training data. However, data labeling is a labor intensive task and requires in-depth domain knowledge. Thus, only a very small proportion of the data can be labeled in practice. This bottleneck greatly degrades the effectiveness of supervised email classification systems. In order to address this problem, in this work, we first identify some critical issues regarding supervised machine learning-based email classification. Then we propose an effective classification model based on multi-view disagreement-based semi-supervised learning. The motivation behind the attempt of using multi-view and semi-supervised learning is that multi-view can provide richer information for classification, which is often ignored by literature, and semi-supervised learning supplies with the capability of coping with labeled and unlabeled data. In the evaluation, we demonstrate that the multi-view data can improve the email classification than using a single view data, and that the proposed model working with our algorithm can achieve better performance as compared to the existing similar algorithms.
Keywords: learning (artificial intelligence); pattern classification; unsolicited e-mail; classification model; email classification system; labeled data; multiview data; multiview disagreement-based semisupervised learning; single view data; spam; unlabeled data; Data models; Electronic mail; Feature extraction; Semisupervised learning; Supervised learning; Support vector machines; Training; Email Classification; Machine Learning Applications; Multi-View; Network Security; Semi-Supervised Learning (ID#: 15-5718)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7011248&isnumber=7011202

 

Peddinti, S.T.; Korolova, A.; Bursztein, E.; Sampemane, G., "Cloak and Swagger: Understanding Data Sensitivity through the Lens of User Anonymity," Security and Privacy (SP), 2014 IEEE Symposium on, pp. 493, 508, 18-21 May 2014. doi: 10.1109/SP.2014.38
Abstract: Most of what we understand about data sensitivity is through user self-report (e.g., surveys), this paper is the first to use behavioral data to determine content sensitivity, via the clues that users give as to what information they consider private or sensitive through their use of privacy enhancing product features. We perform a large-scale analysis of user anonymity choices during their activity on Quora, a popular question-and-answer site. We identify categories of questions for which users are more likely to exercise anonymity and explore several machine learning approaches towards predicting whether a particular answer will be written anonymously. Our findings validate the viability of the proposed approach towards an automatic assessment of data sensitivity, show that data sensitivity is a nuanced measure that should be viewed on a continuum rather than as a binary concept, and advance the idea that machine learning over behavioral data can be effectively used in order to develop product features that can help keep users safe.
Keywords: data privacy; learning (artificial intelligence); Quora; automatic assessment; behavioral data; cloak; content sensitivity; data sensitivity; machine learning; privacy enhancing product features; question-and-answer site; swagger; user anonymity; user self-report; Context; Crawlers; Data privacy; Facebook; Privacy; Search engines; Sensitivity (ID#: 15-5719)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6956583&isnumber=6956545

 

Sundaramoorthy, P.; Bhuvaneshwari, S.; Sreekrishna, M.; Selvam, M., "Ontology Based Classification Of User History In Obscured Web Search," Current Trends in Engineering and Technology (ICCTET), 2014 2nd International Conference on, pp. 258, 261, 8-8 July 2014. doi: 10.1109/ICCTET.2014.6966298
Abstract: User history includes user search and other web activities like searching, downloading and extracting information. Normally, the user history is public, which can be viewed by other users when the search history is not cleared. This can be avoided by performing searching by signing-into the search engine account. The search is found to be personalized. In user history, when we get on to the query it will depart to the corresponding web page. In this paper, we are proposing a model which relates the user searches with the history on the personalized location and retrieving related information if search is found. We present the algorithm, called Decision making algorithm in order to classify the content in the user history. The segregated results are located into the corresponding directory. Extensive experiment demonstrates the efficiency and effectiveness of our construction.
Keywords: Internet; data privacy; decision making; ontologies (artificial intelligence); pattern classification; search engines; user interfaces; Decision making algorithm; Web page; downloading; information extraction; information retrieval; obscured Web search; ontology; search engine account; user history classification; Conferences; Data mining; History; Ontologies; Privacy; Search engines; Web search; Personalized web search; Search Engine; User history; ontology; semantic (ID#: 15-5720)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6966298&isnumber=6966253

 

Gokcen, Y.; Foroushani, V.A.; Heywood, A.N.Z., "Can We Identify NAT Behavior by Analyzing Traffic Flows?," Security and Privacy Workshops (SPW), 2014 IEEE, pp. 132, 139, 17-18 May 2014. doi: 10.1109/SPW.2014.28
Abstract: It is shown in the literature that network address translation devices have become a convenient way to hide the source of malicious behaviors. In this research, we explore how far we can push a machine learning (ML) approach to identify such behaviors using only network flows. We evaluate our proposed approach on different traffic data sets against passive fingerprinting approaches and show that the performance of a machine learning approach is very promising even without using any payload (application layer) information.
Keywords: Internet; learning (artificial intelligence); telecommunication traffic; NAT behavior; machine learning; malicious behaviors; network address translation devices; passive fingerprinting approach; payload information; traffic flows; Browsers; Classification algorithms; Computers; Fingerprint recognition; IP networks; Internet; Payloads; Network address translation classification; machine learning; traffic analysis; traffic flows (ID#: 15-5721)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6957296&isnumber=6957265

 

Priyadarshini, R.; Selvan, L.T.; Kanimozhi, S., "Performance Analysis Of Static Images And Solution For Dynamic Tracking In Private Cloud," Information Communication and Embedded Systems (ICICES), 2014 International Conference on, pp. 1, 6, 27-28 Feb. 2014. doi: 10.1109/ICICES.2014.7033788
Abstract: Nowadays, the World Wide Web has grown tremendously and become more complex because of the growing number of users and the content being added in varied formats. Cloud computing is used as a platform where large amount of data or information can be stored (as it has more storage options) easily as a pay per use model. Hence the storage will not deal only with structured document; possibility of unstructured document storage will also be there. In order to compensate that there is a need to move on to unstructured database system which is MongoDB. MongoDB (from "humongous") is an open source, scalable, high-performance, schema-free, document-oriented database. Technologies like cloud computing and SaaS (Software as a Service) in cloud are growing rapidly which replaces the existing traditional applications. The disadvantage in maintaining, integrating and acquiring traditional softwares are overcome by cloud applications. In this paper initially a static application is deployed into the private cloud environment as an image and its performance is evaluated. In the proposed system a weblog will be created with the features of locating exact source which is retrieved dynamically from web. The dynamic software application is bundled in to image and then deployed in the private cloud environment. In existing system the search was done only based on annotated word but in proposed system the search will be done concept wise meaningfully using Machine Learning techniques. The dynamic retrieval will be done using meta heuristic techniques and performance will be evaluated under private cloud environment. Then comparison will be done between static and dynamic application as the future work.
Keywords: Web sites; cloud computing; data privacy; database management systems; information retrieval; learning (artificial intelligence); MongoDB; SaaS; Weblog; World Wide Web; cloud computing; database system; document-oriented database; dynamic retrieval; dynamic tracking; machine learning; metaheuristic technique; open source database; private cloud; schema-free database; software as a service; static images; Cloud computing; Educational institutions; Electronic publishing; Information services; Semantic Web; Semantics; Automatic Metadata Extraction; Content Search; Locate Source Content; Machine Learning Algorithm; Metaheuristic Technique (ID#: 15-5722)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7033788&isnumber=7033740

 

Divya, R.; Robin, C.R.R., "Onto-Search: An Ontology Based Personalized Mobile Search Engine," Green Computing Communication and Electrical Engineering (ICGCCEE), 2014 International Conference on,  pp. 1, 4, 6-8 March 2014. doi: 10.1109/ICGCCEE.2014.6921422
Abstract: Web search is a frequent activity on Internet connected devices, but it is still a nuisance when user is using a mobile device, due to their default keypad and small screen, and that search results could be mostly irrelevant for the user needs. User needs an efficient way to introduce query terms and receive more precise information. In this paper, we propose a new web search personalization approach that captures the user's interests and preferences in the form of concepts by mining search results and their click through. Onto-Search is based on a client server model. Heavy tasks such as training, re ranking are done on the server. To preserve privacy only the feature vectors is passed to the server. Location information is also taken into consideration. In addition GPS locations help in reinforcing search results. Finally, based on the derived ontology SVM is used for re-ranking of future search results.
Keywords: Global Positioning System; Internet; client-server systems; data privacy; mobile computing; mobile handsets; ontologies (artificial intelligence); query processing; search engines; support vector machines; GPS locations; Internet connected devices; Onto-Search; Web search; Web search personalization approach; client server model; feature vectors; location information; mobile device; ontology SVM; ontology based personalized mobile search engine; precise information; privacy preservation; query terms; search result mining; user interests; user needs; user preferences; Global Positioning System; Mobile communication; Ontologies; Search engines; Servers; Vectors; Web search; Personalization; Re ranking search results; click through data; content ontology; location ontology; mobile search engine (ID#: 15-5723)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6921422&isnumber=6920919

 

Peipei Sui; Tianyu Wo; Zhangle Wen; Xianxian Li, "Privacy Risks in Publication of Taxi GPS Data," High Performance Computing and Communications, 2014 IEEE 6th Intl Symp on Cyberspace Safety and Security, 2014 IEEE 11th Intl Conf on Embedded Software and Syst (HPCC,CSS,ICESS), 2014 IEEE Intl Conf on, pp. 1189, 1196, 20-22 Aug. 2014. doi: 10.1109/HPCC.2014.195
Abstract: Taxis equipped with location sensing devices are increasingly becoming popular. Such location traces can be used for traffic management, taxi dispatching, and improved city planning. However, trajectory data often contain detailed information about individuals, and disclosing such information may reveal their lifestyles, preferences, and sensitive personal information. We study the GPS data of taxis in Beijing with more than 12000 taxis, and find out there are significant privacy risks associated with publishing taxi GPS data sets. In this paper, we first analyze the dataset from spatial and temporal dimensions. Second, we show that parking point information can re-identify anonymized trajectories of taxi drivers. Third, we find taxi GPS data could also expose passengers' privacy based on origin and destination (OD) queries. As a result, more than 55% trajectories can be re-identified at a probability of 1. Meanwhile, experimental results show that it is possible, using simple algorithms, to learn the destination of target passenger based on the naïve anonymized GPS data.
Keywords: Global Positioning System; data privacy; learning (artificial intelligence); query processing; risk management; traffic engineering computing; Beijing; OD queries; city planning; learning; location sensing devices; location traces; naïve anonymized GPS data; origin and destination query; parking point information; privacy risk; spatial dimension; taxi GPS data publication; taxi dispatching; temporal dimensions; traffic management; Clustering algorithms; Data privacy; Global Positioning System; Privacy; Publishing; Trajectory; Vehicles; GPS data; origin and destination; parking point; privacy leakage (ID#: 15-5724)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7056894&isnumber=7056577

 

Saravanan, M.; Thoufeeq, A.M.; Akshaya, S.; Jayasre Manchari, V.L., "Exploring New Privacy Approaches In A Scalable Classification Framework," Data Science and Advanced Analytics (DSAA), 2014 International Conference on, pp. 209, 215, Oct. 30 2014-Nov. 1 2014. doi: 10.1109/DSAA.2014.7058075
Abstract: Recent advancements in Information and Communication Technologies (ICT) enable many organizations to collect, store and control massive amount of various types of details of individuals from their regular transactions (credit card, mobile phone, smart meter etc.). While using these wealth of information for Personalized Recommendations provides enormous opportunities for applying data mining (or machine learning) tasks, there is a need to address the challenge of preserving individuals privacy during the time of running predictive analytics on Big Data. Privacy Preserving Data Mining (PPDM) on these applications is particularly challenging, because it involves and process large volume of complex, heterogeneous, and dynamic details of individuals. Ensuring that privacy-protected data remains useful in intended applications, such as building accurate data mining models or enabling complex analytic tasks, is essential. Differential Privacy has been tried with few of the PPDM methods and is immune to attacks with auxiliary information. In this paper, we propose a distributed implementation based on Map Reduce computing model for C4.5 Decision Tree algorithm and run extensive experiments on three different datasets using Hadoop Cluster. The novelty of this work is to experiment two different privacy methods: First method is to use perturbed data on decision tree algorithm for prediction in privacy-preserving data sharing and the second method is based on applying raw data to the privacy-preserving decision tree algorithm for private data analysis. In addition to this, we propose the combination of the methods as hybrid technique to maintain accuracy (Utility) and privacy in an acceptable level. The proposed privacy approaches has two potential benefits in the context of data mining tasks: it allows the service providers to outsource data mining tasks without exposing the raw data, and it allows data providers to share data access to third parties while limiting privacy risks.
Keywords: data mining; data privacy; decision trees; learning (artificial intelligence); C4.5 decision tree algorithm; Hadoop Cluster; ICT; big data; differential privacy; information and communication technologies; machine learning; map reduce computing model; personalized recommendation; privacy preserving data mining; private data analysis; scalable classification; Big data; Classification algorithms; Data privacy; Decision trees; Noise; Privacy; Scalability; Hybrid data privacy; Map Reduce Framework; Privacy Approaches; Privacy Preserving data Mining; Scalability (ID#: 15-5725)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7058075&isnumber=7058031

 

Ford, V.; Siraj, A.; Eberle, W., "Smart Grid Energy Fraud Detection Using Artificial Neural Networks," Computational Intelligence Applications in Smart Grid (CIASG), 2014 IEEE Symposium on, pp. 1, 6, 9-12 Dec. 2014.  doi: 10.1109/CIASG.2014.7011557
Abstract: Energy fraud detection is a critical aspect of smart grid security and privacy preservation. Machine learning and data mining have been widely used by researchers for extensive intelligent analysis of data to recognize normal patterns of behavior such that deviations can be detected as anomalies. This paper discusses a novel application of a machine learning technique for examining the energy consumption data to report energy fraud using artificial neural networks and smart meter fine-grained data. Our approach achieves a higher energy fraud detection rate than similar works in this field. The proposed technique successfully identifies diverse forms of fraudulent activities resulting from unauthorized energy usage.
Keywords: data analysis; data mining; learning (artificial intelligence); neural nets; power system security; smart meters; smart power grids; artificial neural networks; data intelligent analysis; data mining; machine learning technique; smart grid energy fraud detection; smart grid privacy; smart grid security; smart meter fine-grained data; Data mining; Energy consumption; Energy measurement; Meteorology; Neural networks; Smart meters; Training; fraud detection; neural networks; smart meter data (ID#: 15-5726)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7011557&isnumber=7011539

 

Idrees, F.; Rajarajan, M., "Investigating The Android Intents And Permissions For Malware Detection," Wireless and Mobile Computing, Networking and Communications (WiMob), 2014 IEEE 10th International Conference on, pp. 354, 358, 8-10 Oct. 2014. doi: 10.1109/WiMOB.2014.6962194
Abstract: Mobile phones are mastering our day to day scheduling, entertainment, information and almost every aspect of life. With the increasing human dependence on smart phones, threats against these devices have also increased exponentially. Almost all the mobile apps are playing with the mobile user's privacy besides the targeted actions by the malicious apps. Android applications use permissions to use different features and resources of the mobile device, along with intents to launch different activities. Various aspects of the permission framework have been studied, but sufficient attention has not been given to the intent framework. This work is the first of its kind to investigate the combined effects of permissions and intent filters to distinguish between malware and benign apps. This paper proposes a novel approach to identify malicious apps by analyzing the permission and intent patterns of Android apps. This approach is supplemented with machine learning algorithms for further classification of apps. Performance of the proposed approach has been validated by applying the technique to the available malicious and benign samples collected from a number of sources.
Keywords: Android (operating system); data privacy; invasive software; learning (artificial intelligence); pattern classification; smart phones; Android applications; Android intents; Android permissions; benign apps; human dependence; machine learning algorithms; malicious apps; malware detection; mobile app classification; mobile device features; mobile device resources; mobile phones; mobile user privacy; permission framework; smart phones; Androids; Conferences; Humanoid robots; Malware; Mobile communication; Smart phones; classification; intents; malware detection; permission model (ID#: 15-5727)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6962194&isnumber=6962120

 

Shanny, J.A.; Sudharson, K., "User Preferred Data Enquiry System Using Mobile Communications," Information Communication and Embedded Systems (ICICES), 2014 International Conference on, pp. 1, 5, 27-28 Feb. 2014. doi: 10.1109/ICICES.2014.7033943
Abstract: Mobile devices provide users with services such as location, road map, and traffic information services, and connect them to various search engines. Search queries on mobile phones are limited to a few words, unlike those used when interacting with search engines through computers. This limitation hinders good communication between the user and the server through a mobile phone. Hence the proposed solution provides better and faster result retrieval when querying a search engine through a mobile phone by using the user's profile information in an authenticated way. An ontology ranked keyword search algorithm is utilized to analyze and filter search queries and rank results accordingly. The user's search history is stored only locally, and search results are provided by the server with partiality to existing search engine history information. The search history partialities are categorized. The system mines the content and location information along with the user's profile. Ranking the results gives the end user very easy access to the needed available source, thus proving to be very efficient. The proposed system provides an innovative approach to searching data on all input text, all patterns of the text, all spatial-information-relative searches by the user, user-type-specific search, and finally best ontology-based search.
Keywords: data mining; information filtering; mobile communication; mobile computing; ontologies (artificial intelligence); query processing; search engines; text analysis; content mining; location information; mobile communications; mobile phones; ontology ranked keyword search algorithm; search engine history information; search engines; search query; search query analysis; search query filtering; spatial information; user preferred data enquiry system; user profile information; Educational institutions; Mobile communication; Ontologies; Privacy; Search engines; Servers; Smart phones; Clickthrough data; concept; location search; mobile search engine; ontology; personalization; user profiling (ID#: 15-5728)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7033943&isnumber=7033740

 

Burke, M.-J.; Kayem, A.V.D.M., "K-Anonymity for Privacy Preserving Crime Data Publishing in Resource Constrained Environments," Advanced Information Networking and Applications Workshops (WAINA), 2014 28th International Conference on, pp. 833, 840, 13-16 May 2014. doi: 10.1109/WAINA.2014.131
Abstract: Mobile crime report services have become a pervasive approach to enabling community-based crime reporting (CBCR) in developing nations. These services hold the advantage of facilitating law enforcement when resource constraints make using standard crime investigation approaches challenging. However, CBCRs have failed to achieve widespread popularity in developing nations because of concerns for privacy. Users are hesitant to make crime reports without strong guarantees of privacy preservation. Furthermore, a lack of data mining expertise within law enforcement agencies often means that the reported data needs to be processed manually, which is a time-consuming process. In this paper we make two contributions to facilitate effective and efficient CBCR and crime data mining as well as to address the user privacy concern. The first is a practical framework for mobile CBCR and the second is a hybrid k-anonymity algorithm to guarantee privacy preservation of the reported crime data. We use a hierarchy-based generalization algorithm to classify the data to minimize information loss by optimizing the nodal degree of the classification tree. Results from our proof-of-concept implementation demonstrate that in addition to guaranteeing privacy, our proposed scheme offers a classification accuracy of about 38% and a drop in information loss of nearly 50% over previous schemes when compared on various sizes of datasets. Performance-wise we observe an average improvement of about 50ms proportionate to the size of the dataset.
Keywords: criminal law; data mining; data privacy; generalisation (artificial intelligence); mobile computing; pattern classification; CBCR; classification accuracy; classification tree; community-based crime reporting; crime data mining; crime investigation approach; hierarchy-based generalization algorithm; k-anonymity; law enforcement; mobile crime report services; pervasive approach; privacy preserving crime data publishing; resource constrained environment; user privacy concern; Cloud computing; Data privacy; Encryption; Law enforcement; Mobile communication; Privacy; Anonymity; Developing Countries; Encryption; Information Loss; Public/Private Key Cryptography; Resource Constrained Environments (ID#: 15-5729)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6844743&isnumber=6844560
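
Illustrative sketch (not taken from the paper): the k-anonymity property mentioned in the abstract above can be checked with a few lines of Python. The record fields, the ten-year age banding, and the helper names `generalize_age` and `is_k_anonymous` are assumptions made for this example only; the authors' hybrid algorithm and generalization hierarchy are not reproduced here.

```python
from collections import Counter

def generalize_age(age, width=10):
    """Replace an exact age with a band such as '20-29' (assumed hierarchy)."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def is_k_anonymous(records, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

# Hypothetical crime reports; 'age' and 'district' are treated as quasi-identifiers.
reports = [
    {"age": 23, "district": "North", "crime": "theft"},
    {"age": 27, "district": "North", "crime": "assault"},
    {"age": 31, "district": "South", "crime": "theft"},
    {"age": 38, "district": "South", "crime": "fraud"},
]

# Generalizing the age column merges records into larger equivalence classes.
generalized = [{**r, "age": generalize_age(r["age"])} for r in reports]

print(is_k_anonymous(reports, ["age", "district"], 2))      # exact ages
print(is_k_anonymous(generalized, ["age", "district"], 2))  # banded ages
```

Coarser bands make the property easier to satisfy but discard more detail, which is the information-loss trade-off the abstract refers to.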

 

Vidyalakshmi, B.S.; Wong, R.K.; Ghanavati, M.; Chi Hung Chi, "Privacy as a Service in Social Network Communications," Services Computing (SCC), 2014 IEEE International Conference on, pp. 456, 463, June 27 2014-July 2 2014. doi: 10.1109/SCC.2014.67
Abstract: With dispersing of information on social networks - both personally identifiable and general - comes the risk of this information falling into the wrong hands. Users are burdened with setting privacy of multiple social networks, each with a growing number of privacy settings. Exponential growth of applications (Apps) running on social networks has made privacy control increasingly difficult. This necessitates a Privacy as a service model, especially for social networks, to handle privacy across multiple applications and platforms. Privacy aware information dispersal involves knowing who is receiving what information of ours. Our proposed service employs a supervised learning model to assist the user in spotting an unintended audience for a post. Different from previous work, we combine both Tie-strength and Context of the information as features in learning. Our evaluation using several classification techniques shows that the proposed method is effective and better than methods using either only Tie-strength or only Context of the information for classification.
Keywords: Web services; data privacy; learning (artificial intelligence); pattern classification; social networking (online); classification techniques; information context; information tie-strength; privacy aware information dispersal; privacy control; privacy settings; privacy-as-a-service; social network communications; social network privacy; supervised learning model; Context; Context modeling; Education; Facebook; Feature extraction; Privacy; Privacy as a service; context; social networks; tie-strength (ID#: 15-5730)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6930567&isnumber=6930500

 

Shouwei Sun; Yizhang Jiang; Pengjiang Qian, "Transfer Learning Based Maximum Entropy Clustering," Information Science and Technology (ICIST), 2014 4th IEEE International Conference on, pp. 829, 832, 26-28 April 2014. doi: 10.1109/ICIST.2014.6920605
Abstract: The classical maximum entropy clustering (MEC) algorithm can only work on a single dataset, which might result in poor effectiveness in the condition that the capacity of the dataset is insufficient. To resolve this problem, using the strategy of transfer learning, this paper proposed the novel transfer learning based maximum entropy clustering (TL_MEC) algorithm. TL_MEC employs the historical cluster centers and membership of the past data as the references to guide the clustering on the current data, which promotes its performance distinctly from three aspects: clustering effectiveness, anti-noise, as well as privacy protection. Thus TL_MEC can work well on small datasets if enough historical data are available. The experimental studies verified and demonstrated the contributions of this study.
Keywords: data handling; learning (artificial intelligence); pattern clustering; TL_MEC algorithm; anti-noise; clustering effectiveness; historical data; novel transfer learning based maximum entropy clustering; privacy protection; Algorithm design and analysis; Clustering algorithms; Educational institutions; Entropy; Equations; Linear programming; Privacy; Knowledge Transfer; Maximum Entropy Clustering (MEC); Source domain privacy protection; Transfer Rules (ID#: 15-5731)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6920605&isnumber=6920317
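
Illustrative sketch (not taken from the paper): the classical MEC baseline named in the abstract above alternates between soft, softmax-style memberships and membership-weighted center updates. The one-dimensional data, the parameter name `gamma`, and the function name `mec` are assumptions for this example; the transfer terms that TL_MEC adds (historical centers and memberships) are not modeled here.

```python
import math

def mec(points, centers, gamma=1.0, iterations=50):
    """Classical maximum entropy clustering on 1-D data (illustrative only)."""
    centers = list(centers)
    memberships = []
    for _ in range(iterations):
        # Membership of each point in each cluster: normalized exp(-distance^2 / gamma).
        memberships = []
        for x in points:
            weights = [math.exp(-((x - c) ** 2) / gamma) for c in centers]
            total = sum(weights)
            memberships.append([w / total for w in weights])
        # Each center becomes the membership-weighted mean of all points.
        centers = [
            sum(u[j] * x for u, x in zip(memberships, points))
            / sum(u[j] for u in memberships)
            for j in range(len(centers))
        ]
    return centers, memberships

# Hypothetical data with two well-separated groups and rough initial centers.
data = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]
final_centers, final_memberships = mec(data, centers=[0.0, 6.0])
print(final_centers)
```

A smaller `gamma` gives crisper memberships, while a larger value spreads each point across clusters; very distant points with a very small `gamma` could underflow in this simple version.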

 

Boyang Wang; Ming Li; Chow, S.S.M.; Hui Li, "A Tale Of Two Clouds: Computing On Data Encrypted Under Multiple Keys," Communications and Network Security (CNS), 2014 IEEE Conference on, pp. 337, 345, 29-31 Oct. 2014. doi: 10.1109/CNS.2014.6997502
Abstract: Cloud computing provides a convenient platform for big data computation such as machine learning and data mining. However, privacy conscious users often encrypt their data with their own keys before uploading them to the cloud. Existing techniques for computation on encrypted data are either in the single key setting or far from practical. In this paper, we show how two non-colluding servers can leverage proxy re-encryption to jointly compute arithmetic functions over the ciphertexts of multiple users without learning the inputs, intermediate or final results. Moreover, the computation is non-interactive to users and only requires minimal server-to-server interactions. Experimental results demonstrate that our schemes significantly improve the efficiency of outsourced computation when compared to the existing approach.
Keywords: Big Data; cloud computing; cryptography; data mining; data privacy; learning (artificial intelligence); Big Data computation; arithmetic functions; ciphertexts; cloud computing; data encryption; data mining; machine learning; noncolluding servers; privacy; proxy reencryption; server-to-server interactions; Ash; Computational modeling; Encryption; Public key; Servers (ID#: 15-5732)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6997502&isnumber=6997445

 

Patil, A.; Singh, S., "Differential Private Random Forest," Advances in Computing, Communications and Informatics (ICACCI), 2014 International Conference on, pp. 2623, 2630, 24-27 Sept. 2014. doi: 10.1109/ICACCI.2014.6968348
Abstract: Organizations, be they private or public, often collect personal information about individuals who are their customers or clients. The personal information of an individual is private and sensitive and has to be secured from data mining algorithms that an adversary may apply to gain access to the private information. In this paper we consider the problem of securing this private and sensitive information when it is used in a random forest classifier in the framework of differential privacy. We have incorporated the concept of differential privacy into the classical random forest algorithm. Experimental results show that quality functions such as information gain, max operator and Gini index give almost equal accuracy regardless of their sensitivity towards the noise. Also, the accuracy of the classical random forest and the differential private random forest is almost equal for different sizes of datasets. The proposed algorithm works for datasets with categorical as well as continuous attributes.
Keywords: data mining; data privacy; learning (artificial intelligence); Gini index; data mining algorithm; differential privacy; differential private random forest; information gain; max operator; personal information; private information; sensitive information; Accuracy; Data privacy; Indexes; Noise; Privacy; Sensitivity; Vegetation (ID#: 15-5733)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6968348&isnumber=6968191
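
Illustrative sketch (not taken from the paper): one common way to bring differential privacy into a tree-based classifier is to add Laplace noise to the class counts at a leaf before picking the majority label. The sketch below assumes a histogram query with sensitivity 1 (adding or removing one record changes one count by at most 1). The function names and the sample labels are hypothetical, and the paper's actual mechanism and quality functions may differ.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_class_counts(labels, classes, epsilon, rng):
    """Class histogram with Laplace noise of scale sensitivity/epsilon (sensitivity assumed 1)."""
    scale = 1.0 / epsilon
    return {
        c: sum(1 for label in labels if label == c) + laplace_noise(scale, rng)
        for c in classes
    }

def private_majority(labels, classes, epsilon, rng):
    """Label a leaf with the class whose noisy count is largest."""
    counts = noisy_class_counts(labels, classes, epsilon, rng)
    return max(counts, key=counts.get)

# Hypothetical leaf containing 90 'yes' records and 10 'no' records.
rng = random.Random(0)
leaf_labels = ["yes"] * 90 + ["no"] * 10
print(private_majority(leaf_labels, ["yes", "no"], epsilon=1.0, rng=rng))
```

A smaller `epsilon` means stronger privacy and noisier counts, so leaves with few records are the most likely to be mislabeled.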

 

Minghui Zhu, "Distributed Demand Response Algorithms Against Semi-Honest Adversaries," PES General Meeting | Conference & Exposition, 2014 IEEE, pp. 1, 5, 27-31 July 2014. doi: 10.1109/PESGM.2014.6939191
Abstract: This paper investigates two problems for demand response: demand allocation market and demand shedding market. By utilizing reinforcement learning, stochastic approximation and secure multi-party computation, we propose two distributed algorithms to solve the induced games respectively. The proposed algorithms are able to protect the privacy of the market participants, including the system operator and end users. The algorithm convergence is formally ensured and the algorithm performance is verified via numerical simulations.
Keywords: demand side management; learning (artificial intelligence); numerical analysis; power markets; stochastic games; demand allocation market; demand shedding market; distributed demand response algorithms; multiparty computation security; numerical simulation; reinforcement learning; stochastic approximation; Approximation algorithms; Games; Load management; Nash equilibrium; Pricing; Privacy; Resource management (ID#: 15-5734)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6939191&isnumber=6938773

 

Sicuranza, M.; Ciampi, M., "A Semantic Access Control for Easy Management of the Privacy for EHR Systems," P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC), 2014 Ninth International Conference on, pp. 400, 405, 8-10 Nov. 2014. doi: 10.1109/3PGCIC.2014.84
Abstract: In the last years, the increasing use of ICT in Healthcare has led to the generation of several healthcare information systems, such as Electronic Health Record systems, which enable the management and sharing of digital clinical data. Since clinical data is generally characterized by very sensitive information, such information systems have to be able to limit their sharing of it, by enabling or denying access to the various healthcare users. In order to manage who can do what on such data, there is a need to use access control mechanisms, which have to be able to satisfy access policies defined by the patient in a dynamic manner. This paper presents a semantic access control designed for specifying flexible and fine-grained access policies in the HIS. The proposed model is based on an ontological approach able to increase the usability and feasibility of real information systems.
Keywords: authorisation; data privacy; electronic health records; health care; ontologies (artificial intelligence); semantic networks; EHR system; HIS; digital clinical data; electronic health record; healthcare information system; ontological formalization; privacy management; semantic access control; Access control; Context; Context modeling; Medical services; Organizations; Unified modeling language; Access Control Model; EHR; Ontology; Security (ID#: 15-5735)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7024618&isnumber=7024297

 

Sadikin, M.F.; Kyas, M., "Security And Privacy Protocol For Emerging Smart RFID Applications," Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD), 2014 15th IEEE/ACIS International Conference on, pp. 1, 7, June 30 2014-July 2 2014. doi: 10.1109/SNPD.2014.6888694
Abstract: The rise of smart RFID technology (i.e. sensor integration to RFID system) has introduced various advantages in the context of location awareness applications, ranging from low cost implementation and maintenance to its flexibility to support large-scale systems. Nevertheless, the use of such technology introduces tremendous security and privacy issues (e.g. unauthorized tracking, information leakage, cloning attack, data manipulation, collision attack, replay attack, Denial-of-Service, etc.). On the other hand, the constrained nature of RFID applications makes security enforcement more complicated. This paper presents IMAKA-Tate: Identity protection, Mutual Authentication and Key Agreement using Tate pairing of Identity-based Encryption method. It is designed to tackle various challenges in the constrained nature of RFID applications by applying a light-weight cryptographic method with advanced-level 128 bit security protection. Indeed, our proposed solution protects the RFID system from various threats, as well as preserves privacy by performing encryption early, including of the identity, even before the authentication is started.
Keywords: data privacy; protocols; radiofrequency identification; telecommunication security; Denial-of-Service; RFID system; cloning attack; collision attack; data manipulation; identity based encryption method; identity protection; information leakage; key agreement; large-scale system; lightweight cryptographic method; location awareness applications; mutual authentication; privacy protocol; replay attack; security protection; security protocol; sensor integration; smart RFID applications; unauthorized tracking; Authentication; Cryptography; Payloads; Privacy; Protocols; Radiofrequency identification; Mutual Authentication; Privacy Preserving; Smart RFID Security (ID#: 15-5736)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6888694&isnumber=6888665

 

Fu Zu-feng; Wang Hai-ying; Wu Yong-wu, "Application Of Secure Multi-Party Computation In Linear Programming," Information Technology and Artificial Intelligence Conference (ITAIC), 2014 IEEE 7th Joint International, pp. 244, 248, 20-21 Dec. 2014. doi: 10.1109/ITAIC.2014.7065043
Abstract: The existing solution to privacy-preserving linear programming can leak the user's private data when the amount of data is small. In this paper, secure multiparty computation is generalized to the problem of privacy-preserving linear programming, and we present a computing protocol for privacy-preserving linear programming. The protocol is applied to the problem of linear programming with small, vertically distributed data: not only can the maximum value of the original linear program be calculated when an optimal solution exists, but the private data of all participants is also protected during the calculation.
Keywords: data privacy; linear programming; security of data; computing protocol; multiparty computation security; privacy preserving linear programming; Complexity theory; Data privacy; Linear programming; Privacy; Protocols; Security; Vectors; cryptography; linear programming; privacy preserving; secure multiparty computation (ID#: 15-5737)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7065043&isnumber=7064993
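
Illustrative sketch (not taken from the paper): a basic building block of secure multiparty computation is additive secret sharing, in which parties learn the sum of their private inputs without revealing any individual input. The modulus, the function names, and the sample values are assumptions for this example, and the whole exchange is simulated in one process. The paper's linear programming protocol is considerably more involved and is not reproduced here.

```python
import secrets

MODULUS = 2**61 - 1  # assumed modulus; inputs must be non-negative and smaller than this

def share(value, n_parties):
    """Split a value into n random shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares

def reconstruct(shares):
    """Recombine shares into the original value."""
    return sum(shares) % MODULUS

def secure_sum(private_values):
    """Simulate the parties jointly computing the sum of their private inputs."""
    n = len(private_values)
    # Party i splits its input and sends the j-th share to party j.
    all_shares = [share(v, n) for v in private_values]
    # Party j adds the shares it received and publishes only that partial sum.
    partial_sums = [sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)]
    return reconstruct(partial_sums)

# Hypothetical private contributions held by three parties.
print(secure_sum([12, 30, 7]))
```

Each published partial sum is masked by random shares, so it reveals nothing about a single party's input on its own, provided the parties do not collude.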

 

Bhati, B.S.; Venkataram, P., "Data Privacy Preserving Scheme In MANETs," Internet Security (WorldCIS), 2014 World Congress on, pp. 22, 23, 8-10 Dec. 2014. doi: 10.1109/WorldCIS.2014.7028159
Abstract: Data privacy is one among the challenging issues in Mobile Adhoc NETworks (MANETs), which are deployed in hostile environments to transfer sensitive data through multi-hop routing. The undesired disclosure of data can result in breach of data privacy, and can be used in launching several attacks. Many of the works achieved data privacy by using approaches such as data transformation, data perturbation, etc. But, these approaches introduce high computational overheads and delays in a MANET. To minimize the computations in preserving data privacy, we have proposed a computational intelligence based data privacy scheme. In the scheme we use data anonymization approach, where rough set theory is used to determine the data attributes to be anonymized. Dynamically changing multiple routes are established between a sender and a receiver, by selecting more than one trusted 1-hop neighbor nodes for data transfer in each routing step. Anonymity of the receiver is also discussed. The work has been simulated in different network sizes with several data transfers. The results are quite encouraging.
Keywords: data privacy; mobile ad hoc networks; rough set theory; security of data; telecommunication network routing; telecommunication security; MANET; computation minimization; computational intelligence; computational overheads; data anonymization approach; data attributes; data perturbation; data privacy preserving scheme; data transfers; data transformation; delays; mobile adhoc networks; multihop routing; receiver anonymity; rough set theory; Artificial neural networks; Bandwidth; Batteries; Mobile ad hoc networks; Mobile computing; Anonymity; Data Attributes; Data Privacy; Mobile Adhoc Network; Rough Sets (ID#: 15-5738)
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7028159&isnumber=7027983


Note:

Articles listed on these pages have been found on publicly available internet pages and are cited with links to those pages. Some of the information included herein has been reprinted with permission from the authors or data repositories. Direct any requests via Email to news@scienceofsecurity.net for removal of the links or modifications to specific citations. Please include the ID# of the specific citation in your correspondence.