Visible to the public Biblio

Found 12044 results

Filters: Keyword is Resiliency  [Clear All Filters]
2018-02-06
Badii, A., Faulkner, R., Raval, R., Glackin, C., Chollet, G..  2017.  Accelerated Encryption Algorithms for Secure Storage and Processing in the Cloud. 2017 International Conference on Advanced Technologies for Signal and Image Processing (ATSIP). :1–6.

The objective of this paper is to outline the design specification, implementation and evaluation of a proposed accelerated encryption framework which deploys both homomorphic and symmetric-key encryptions to serve the privacy preserving processing; in particular, as a sub-system within the Privacy Preserving Speech Processing framework architecture as part of the PPSP-in-Cloud Platform. Following a preliminary study of GPU efficiency gains optimisations benchmarked for AES implementation we have addressed and resolved the Big Integer processing challenges in parallel implementation of bilinear pairing thus enabling the creation of partially homomorphic encryption schemes which facilitates applications such as speech processing in the encrypted domain on the cloud. This novel implementation has been validated in laboratory tests using a standard speech corpus and can be used for other application domains to support secure computation and privacy preserving big data storage/processing in the cloud.

Kebande, V. R., Karie, N. M., Venter, H. S..  2017.  Cloud-Centric Framework for Isolating Big Data as Forensic Evidence from IoT Infrastructures. 2017 1st International Conference on Next Generation Computing Applications (NextComp). :54–60.

Cloud computing paradigm continues to revolutionize the way business processes are being conducted through the provision of massive resources, reliability across networks and ability to offer parallel processing. However, miniaturization, proliferation and nanotechnology within devices has enabled digitization of almost every object which eventually has seen the rise of a new technological marvel dubbed Internet of Things (IoT). IoT enables self-configurable/smart devices to connect intelligently through Radio Frequency Identification (RFID), WI-FI, LAN, GPRS and other methods by further enabling timeously processing of information. Based on these developments, the integration of the cloud and IoT infrastructures has led to an explosion of the amount of data being exchanged between devices which have in turn enabled malicious actors to use this as a platform to launch various cybercrime activities. Consequently, digital forensics provides a significant approach that can be used to provide an effective post-event response mechanism to these malicious attacks in cloud-based IoT infrastructures. Therefore, the problem being addressed is that, at the time of writing this paper, there still exist no accepted standards or frameworks for conducting digital forensic investigation on cloud-based IoT infrastructures. As a result, the authors have proposed a cloud-centric framework that is able to isolate Big data as forensic evidence from IoT (CFIBD-IoT) infrastructures for proper analysis and examination. It is the authors' opinion that if the CFIBD-IoT framework is implemented fully it will support cloud-based IoT tool creation as well as support future investigative techniques in the cloud with a degree of certainty.

Dai, H., Zhu, X., Yang, G., Yi, X..  2017.  A Verifiable Single Keyword Top-k Search Scheme against Insider Attacks over Cloud Data. 2017 3rd International Conference on Big Data Computing and Communications (BIGCOM). :111–116.

With the development of cloud computing and its economic benefit, more and more companies and individuals outsource their data and computation to clouds. Meanwhile, the business way of resource outsourcing makes the data out of control from its owner and results in many security issues. The existing secure keyword search methods assume that cloud servers are curious-but-honest or partial honest, which makes them powerless to deal with the deliberately falsified or fabricated results of insider attacks. In this paper, we propose a verifiable single keyword top-k search scheme against insider attacks which can verify the integrity of search results. Data owners generate verification codes (VCs) for the corresponding files, which embed the ordered sequence information of the relevance scores between files and keywords. Then files and corresponding VCs are outsourced to cloud servers. When a data user performs a keyword search in cloud servers, the qualified result files are determined according to the relevance scores between the files and the interested keyword and then returned to the data user together with a VC. The integrity of the result files is verified by data users through reconstructing a new VC on the received files and comparing it with the received one. Performance evaluation have been conducted to demonstrate the efficiency and result redundancy of the proposed scheme.

Moustafa, N., Creech, G., Sitnikova, E., Keshk, M..  2017.  Collaborative Anomaly Detection Framework for Handling Big Data of Cloud Computing. 2017 Military Communications and Information Systems Conference (MilCIS). :1–6.

With the ubiquitous computing of providing services and applications at anywhere and anytime, cloud computing is the best option as it offers flexible and pay-per-use based services to its customers. Nevertheless, security and privacy are the main challenges to its success due to its dynamic and distributed architecture, resulting in generating big data that should be carefully analysed for detecting network's vulnerabilities. In this paper, we propose a Collaborative Anomaly Detection Framework (CADF) for detecting cyber attacks from cloud computing environments. We provide the technical functions and deployment of the framework to illustrate its methodology of implementation and installation. The framework is evaluated on the UNSW-NB15 dataset to check its credibility while deploying it in cloud computing environments. The experimental results showed that this framework can easily handle large-scale systems as its implementation requires only estimating statistical measures from network observations. Moreover, the evaluation performance of the framework outperforms three state-of-the-art techniques in terms of false positive rate and detection rate.

Uddin, M. N., Lie, H., Li, H..  2017.  Hybrid Cloud Computing and Integrated Transport System. 2017 International Conference on Green Informatics (ICGI). :111–116.

In the 21st century, integrated transport, service and mobility concepts for real-life situations enabled by automation system and smarter connectivity. These services and ideas can be blessed from cloud computing, and big data management techniques for the transport system. These methods could also include automation, security, and integration with other modes. Integrated transport system can offer new means of communication among vehicles. This paper presents how hybrid could computing influence to make urban transportation smarter besides considering issues like security and privacy. However, a simple structured framework based on a hybrid cloud computing system might prevent common existing issues.

Hong, Y. K., Nam, H. S., Lee, S. J., Kim, T., Jeong, Y. K..  2017.  Hierarchy Architecture Security Design for Energy Cloud. 2017 International Conference on Information and Communication Technology Convergence (ICTC). :1187–1189.

Recently, the researches utilizing environmentally friendly new and renewable energy and various methods have been actively pursued to solve environmental and energy problems. The trend of the technology is converged with the latest ICT technology and expanded to the cloud of share and two-way system. In the center of this tide of change, new technologies such as IoT, Big Data and AI are sustaining to energy technology. Now, the cloud concept which is a universal form in IT field will be converged with energy field to develop Energy Cloud, manage zero energy towns and develop into social infrastructure supporting smart city. With the development of social infrastructure, it is very important as a security facility. In this paper, it is discussed the concept and the configuration of the Energy Cloud, and present a basic design method of the Energy Cloud's security that can examine and respond to the risk factors of information security in the Energy Cloud.

Tiwari, T., Turk, A., Oprea, A., Olcoz, K., Coskun, A. K..  2017.  User-Profile-Based Analytics for Detecting Cloud Security Breaches. 2017 IEEE International Conference on Big Data (Big Data). :4529–4535.

While the growth of cloud-based technologies has benefited the society tremendously, it has also increased the surface area for cyber attacks. Given that cloud services are prevalent today, it is critical to devise systems that detect intrusions. One form of security breach in the cloud is when cyber-criminals compromise Virtual Machines (VMs) of unwitting users and, then, utilize user resources to run time-consuming, malicious, or illegal applications for their own benefit. This work proposes a method to detect unusual resource usage trends and alert the user and the administrator in real time. We experiment with three categories of methods: simple statistical techniques, unsupervised classification, and regression. So far, our approach successfully detects anomalous resource usage when experimenting with typical trends synthesized from published real-world web server logs and cluster traces. We observe the best results with unsupervised classification, which gives an average F1-score of 0.83 for web server logs and 0.95 for the cluster traces.

Yasumura, Y., Imabayashi, H., Yamana, H..  2017.  Attribute-Based Proxy Re-Encryption Method for Revocation in Cloud Data Storage. 2017 IEEE International Conference on Big Data (Big Data). :4858–4860.

In the big data era, many users upload data to cloud while security concerns are growing. By using attribute-based encryption (ABE), users can securely store data in cloud while exerting access control over it. Revocation is necessary for real-world applications of ABE so that revoked users can no longer decrypt data. In actual implementations, however, revocation requires re-encryption of data in client side through download, decrypt, encrypt, and upload, which results in huge communication cost between the client and the cloud depending on the data size. In this paper, we propose a new method where the data can be re-encrypted in cloud without downloading any data. The experimental result showed that our method reduces the communication cost by one quarter in comparison with the trivial solution where re-encryption is performed in client side.

Marciani, G., Porretta, M., Nardelli, M., Italiano, G. F..  2017.  A Data Streaming Approach to Link Mining in Criminal Networks. 2017 5th International Conference on Future Internet of Things and Cloud Workshops (FiCloudW). :138–143.

The ability to discover patterns of interest in criminal networks can support and ease the investigation tasks by security and law enforcement agencies. By considering criminal networks as a special case of social networks, we can properly reuse most of the state-of-the-art techniques to discover patterns of interests, i.e., hidden and potential links. Nevertheless, in time-sensible scenarios, like the one involving criminal actions, the ability to discover patterns in a (near) real-time manner can be of primary importance.In this paper, we investigate the identification of patterns for link detection and prediction on an evolving criminal network. To extract valuable information as soon as data is generated, we exploit a stream processing approach. To this end, we also propose three new similarity social network metrics, specifically tailored for criminal link detection and prediction. Then, we develop a flexible data stream processing application relying on the Apache Flink framework; this solution allows us to deploy and evaluate the newly proposed metrics as well as the ones existing in literature. The experimental results show that the new metrics we propose can reach up to 83% accuracy in detection and 82% accuracy in prediction, resulting competitive with the state of the art metrics.

Robinson, Joseph P., Shao, Ming, Zhao, Handong, Wu, Yue, Gillis, Timothy, Fu, Yun.  2017.  Recognizing Families In the Wild (RFIW): Data Challenge Workshop in Conjunction with ACM MM 2017. Proceedings of the 2017 Workshop on Recognizing Families In the Wild. :5–12.

Recognizing Families In the Wild (RFIW) is a large-scale, multi-track automatic kinship recognition evaluation, supporting both kinship verification and family classification on scales much larger than ever before. It was organized as a Data Challenge Workshop hosted in conjunction with ACM Multimedia 2017. This was achieved with the largest image collection that supports kin-based vision tasks. In the end, we use this manuscript to summarize evaluation protocols, progress made and some technical background and performance ratings of the algorithms used, and a discussion on promising directions for both research and engineers to be taken next in this line of work.

Zheng, J., Li, Y., Hou, Y., Gao, M., Zhou, A..  2017.  BMNR: Design and Implementation a Benchmark for Metrics of Network Robustness. 2017 IEEE International Conference on Big Knowledge (ICBK). :320–325.

The network robustness is defined by how well its vertices are connected to each other to keep the network strong and sustainable. The change of network robustness may reveal events as well as periodic trend patterns that affect the interactions among vertices in the network. The evaluation of network robustness may be helpful to many applications, such as event detection, disease transmission, and network security, etc. There are many existing metrics to evaluate the robustness of networks, for example, node connectivity, edge connectivity, algebraic connectivity, graph expansion, R-energy, and so on. It is a natural and urgent problem how to choose a reasonable metric to effectively measure and evaluate the network robustness in the real applications. In this paper, based on some general principles, we design and implement a benchmark, namely BMNR, for the metrics of network robustness. The benchmark consists of graph generator, graph attack and robustness metric evaluation. We find that R-energy can evaluate both connected and disconnected graphs, and can be computed more efficiently.

Vimalkumar, K., Radhika, N..  2017.  A Big Data Framework for Intrusion Detection in Smart Grids Using Apache Spark. 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI). :198–204.

Technological advancement enables the need of internet everywhere. The power industry is not an exception in the technological advancement which makes everything smarter. Smart grid is the advanced version of the traditional grid, which makes the system more efficient and self-healing. Synchrophasor is a device used in smart grids to measure the values of electric waves, voltages and current. The phasor measurement unit produces immense volume of current and voltage data that is used to monitor and control the performance of the grid. These data are huge in size and vulnerable to attacks. Intrusion Detection is a common technique for finding the intrusions in the system. In this paper, a big data framework is designed using various machine learning techniques, and intrusions are detected based on the classifications applied on the synchrophasor dataset. In this approach various machine learning techniques like deep neural networks, support vector machines, random forest, decision trees and naive bayes classifications are done for the synchrophasor dataset and the results are compared using metrics of accuracy, recall, false rate, specificity, and prediction time. Feature selection and dimensionality reduction algorithms are used to reduce the prediction time taken by the proposed approach. This paper uses apache spark as a platform which is suitable for the implementation of Intrusion Detection system in smart grids using big data analytics.

Aksu, M. U., Dilek, M. H., Tatlı, E. İ, Bicakci, K., Dirik, H. İ, Demirezen, M. U., Aykır, T..  2017.  A Quantitative CVSS-Based Cyber Security Risk Assessment Methodology for IT Systems. 2017 International Carnahan Conference on Security Technology (ICCST). :1–8.

IT system risk assessments are indispensable due to increasing cyber threats within our ever-growing IT systems. Moreover, laws and regulations urge organizations to conduct risk assessments regularly. Even though there exist several risk management frameworks and methodologies, they are in general high level, not defining the risk metrics, risk metrics values and the detailed risk assessment formulas for different risk views. To address this need, we define a novel risk assessment methodology specific to IT systems. Our model is quantitative, both asset and vulnerability centric and defines low and high level risk metrics. High level risk metrics are defined in two general categories; base and attack graph-based. In our paper, we provide a detailed explanation of formulations in each category and make our implemented software publicly available for those who are interested in applying the proposed methodology to their IT systems.

Verma, D. C., de Mel, G..  2017.  Measures of Network Centricity for Edge Deployment of IoT Applications. 2017 IEEE International Conference on Big Data (Big Data). :4612–4620.

Edge Computing is a scheme to improve the performance, latency and security guidelines for IoT applications. However, edge deployment of an application also comes with additional complexity in management, an increased attack surface for security vulnerability, and could potentially result in a more expensive solution. As a result, the conditions under which an edge deployment of IoT applications delivers a better solution is not always obvious. Metrics which would be able to predict whether or not an IoT application is suitable for edge deployment can provide useful insights to address this question. In this paper, we examine the key performance indicators for IoT applications, namely the responsiveness, scalability and cost models for different types of IoT applications. Our analysis identifies that network centrality of an IoT application is a key characteristic which determines whether or not an IoT application is a good candidate for edge deployment. We discuss the different measures of network centrality that can be used to characterize applications, and the relative performance of edge deployment compared to centralized deployment for various IoT applications.

Liu, X., Xia, C., Wang, T., Zhong, L..  2017.  CloudSec: A Novel Approach to Verifying Security Conformance at the Bottom of the Cloud. 2017 IEEE International Congress on Big Data (BigData Congress). :569–576.

In the process of big data analysis and processing, a key concern blocking users from storing and processing their data in the cloud is their misgivings about the security and performance of cloud services. There is an urgent need to develop an approach that can help each cloud service provider (CSP) to demonstrate that their infrastructure and service behavior can meet the users' expectations. However, most of the prior research work focused on validating the process compliance of cloud service without an accurate description of the basic service behaviors, and could not measure the security capability. In this paper, we propose a novel approach to verify cloud service security conformance called CloudSec, which reduces the description gap between the cloud provider and customer through modeling cloud service behaviors (CloudBeh Model) and security SLA (SecSLA Model). These models enable a systematic integration of security constraints and service behavior into cloud while using UPPAAL to check the conformance, which can not only check CloudBeh performance metrics conformance, but also verify whether the security constraints meet the SecSLA. The proposed approach is validated through case study and experiments with a cloud storage service based on OpenStack, which illustrates CloudSec approach effectiveness and can be applied in real cloud scenarios.

Mehrpouyan, H., Azpiazu, I. M., Pera, M. S..  2017.  Measuring Personality for Automatic Elicitation of Privacy Preferences. 2017 IEEE Symposium on Privacy-Aware Computing (PAC). :84–95.

The increasing complexity and ubiquity in user connectivity, computing environments, information content, and software, mobile, and web applications transfers the responsibility of privacy management to the individuals. Hence, making it extremely difficult for users to maintain the intelligent and targeted level of privacy protection that they need and desire, while simultaneously maintaining their ability to optimally function. Thus, there is a critical need to develop intelligent, automated, and adaptable privacy management systems that can assist users in managing and protecting their sensitive data in the increasingly complex situations and environments that they find themselves in. This work is a first step in exploring the development of such a system, specifically how user personality traits and other characteristics can be used to help automate determination of user sharing preferences for a variety of user data and situations. The Big-Five personality traits of openness, conscientiousness, extroversion, agreeableness, and neuroticism are examined and used as inputs into several popular machine learning algorithms in order to assess their ability to elicit and predict user privacy preferences. Our results show that the Big-Five personality traits can be used to significantly improve the prediction of user privacy preferences in a number of contexts and situations, and so using machine learning approaches to automate the setting of user privacy preferences has the potential to greatly reduce the burden on users while simultaneously improving the accuracy of their privacy preferences and security.

Wang, Y., Rawal, B., Duan, Q..  2017.  Securing Big Data in the Cloud with Integrated Auditing. 2017 IEEE International Conference on Smart Cloud (SmartCloud). :126–131.

In this paper, we review big data characteristics and security challenges in the cloud and visit different cloud domains and security regulations. We propose using integrated auditing for secure data storage and transaction logs, real-time compliance and security monitoring, regulatory compliance, data environment, identity and access management, infrastructure auditing, availability, privacy, legality, cyber threats, and granular auditing to achieve big data security. We apply a stochastic process model to conduct security analyses in availability and mean time to security failure. Potential future works are also discussed.

Heifetz, A., Mugunthan, V., Kagal, L..  2017.  Shade: A Differentially-Private Wrapper for Enterprise Big Data. 2017 IEEE International Conference on Big Data (Big Data). :1033–1042.

Enterprises usually provide strong controls to prevent cyberattacks and inadvertent leakage of data to external entities. However, in the case where employees and data scientists have legitimate access to analyze and derive insights from the data, there are insufficient controls and employees are usually permitted access to all information about the customers of the enterprise including sensitive and private information. Though it is important to be able to identify useful patterns of one's customers for better customization and service, customers' privacy must not be sacrificed to do so. We propose an alternative — a framework that will allow privacy preserving data analytics over big data. In this paper, we present an efficient and scalable framework for Apache Spark, a cluster computing framework, that provides strong privacy guarantees for users even in the presence of an informed adversary, while still providing high utility for analysts. The framework, titled Shade, includes two mechanisms — SparkLAP, which provides Laplacian perturbation based on a user's query and SparkSAM, which uses the contents of the database itself in order to calculate the perturbation. We show that the performance of Shade is substantially better than earlier differential privacy systems without loss of accuracy, particularly when run on datasets small enough to fit in memory, and find that SparkSAM can even exceed performance of an identical nonprivate Spark query.

Palanisamy, B., Li, C., Krishnamurthy, P..  2017.  Group Privacy-Aware Disclosure of Association Graph Data. 2017 IEEE International Conference on Big Data (Big Data). :1043–1052.

In the age of Big Data, we are witnessing a huge proliferation of digital data capturing our lives and our surroundings. Data privacy is a critical barrier to data analytics and privacy-preserving data disclosure becomes a key aspect to leveraging large-scale data analytics due to serious privacy risks. Traditional privacy-preserving data publishing solutions have focused on protecting individual's private information while considering all aggregate information about individuals as safe for disclosure. This paper presents a new privacy-aware data disclosure scheme that considers group privacy requirements of individuals in bipartite association graph datasets (e.g., graphs that represent associations between entities such as customers and products bought from a pharmacy store) where even aggregate information about groups of individuals may be sensitive and need protection. We propose the notion of $ε$g-Group Differential Privacy that protects sensitive information of groups of individuals at various defined group protection levels, enabling data users to obtain the level of information entitled to them. Based on the notion of group privacy, we develop a suite of differentially private mechanisms that protect group privacy in bipartite association graphs at different group privacy levels based on specialization hierarchies. We evaluate our proposed techniques through extensive experiments on three real-world association graph datasets and our results demonstrate that the proposed techniques are effective, efficient and provide the required guarantees on group privacy.

Shi, Y., Piao, C., Zheng, L..  2017.  Differential-Privacy-Based Correlation Analysis in Railway Freight Service Applications. 2017 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC). :35–39.

With the development of modern logistics industry railway freight enterprises as the main traditional logistics enterprises, the service mode is facing many problems. In the era of big data, for railway freight enterprises, coordinated development and sharing of information resources have become the requirements of the times, while how to protect the privacy of citizens has become one of the focus issues of the public. To prevent the disclosure or abuse of the citizens' privacy information, the citizens' privacy needs to be preserved in the process of information opening and sharing. However, most of the existing privacy preserving models cannot to be used to resist attacks with continuously growing background knowledge. This paper presents the method of applying differential privacy to protect associated data, which can be shared in railway freight service association information. First, the original service data need to slice by optimal shard length, then differential method and apriori algorithm is used to add Laplace noise in the Candidate sets. Thus the citizen's privacy information can be protected even if the attacker gets strong background knowledge. Last, sharing associated data to railway information resource partners. The steps and usefulness of the discussed privacy preservation method is illustrated by an example.

Guan, Z., Si, G., Du, X., Liu, P., Zhang, Z., Zhou, Z..  2017.  Protecting User Privacy Based on Secret Sharing with Fault Tolerance for Big Data in Smart Grid. 2017 IEEE International Conference on Communications (ICC). :1–6.

In smart grid, large quantities of data is collected from various applications, such as smart metering substation state monitoring, electric energy data acquisition, and smart home. Big data acquired in smart grid applications is usually sensitive. For instance, in order to dispatch accurately and support the dynamic price, lots of smart meters are installed at user's house to collect the real-time data, but all these collected data are related to user privacy. In this paper, we propose a data aggregation scheme based on secret sharing with fault tolerance in smart grid, which ensures that control center gets the integrated data without revealing user's privacy. Meanwhile, we also consider fault tolerance during the data aggregation. At last, we analyze the security of our scheme and carry out experiments to validate the results.

Zebboudj, S., Brahami, R., Mouzaia, C., Abbas, C., Boussaid, N., Omar, M..  2017.  Big Data Source Location Privacy and Access Control in the Framework of IoT. 2017 5th International Conference on Electrical Engineering - Boumerdes (ICEE-B). :1–5.

In the recent years, we have observed the development of several connected and mobile devices intended for daily use. This development has come with many risks that might not be perceived by the users. These threats are compromising when an unauthorized entity has access to private big data generated through the user objects in the Internet of Things. In the literature, many solutions have been proposed in order to protect the big data, but the security remains a challenging issue. This work is carried out with the aim to provide a solution to the access control to the big data and securing the localization of their generator objects. The proposed models are based on Attribute Based Encryption, CHORD protocol and $μ$TESLA. Through simulations, we compare our solutions to concurrent protocols and we show its efficiency in terms of relevant criteria.

Nosouhi, M. R., Pham, V. V. H., Yu, S., Xiang, Y., Warren, M..  2017.  A Hybrid Location Privacy Protection Scheme in Big Data Environment. GLOBECOM 2017 - 2017 IEEE Global Communications Conference. :1–6.

Location privacy has become a significant challenge of big data. Particularly, by the advantage of big data handling tools availability, huge location data can be managed and processed easily by an adversary to obtain user private information from Location-Based Services (LBS). So far, many methods have been proposed to preserve user location privacy for these services. Among them, dummy-based methods have various advantages in terms of implementation and low computation costs. However, they suffer from the spatiotemporal correlation issue when users submit consecutive requests. To solve this problem, a practical hybrid location privacy protection scheme is presented in this paper. The proposed method filters out the correlated fake location data (dummies) before submissions. Therefore, the adversary can not identify the user's real location. Evaluations and experiments show that our proposed filtering technique significantly improves the performance of existing dummy-based methods and enables them to effectively protect the user's location privacy in the environment of big data.

Chen, D., Irwin, D..  2017.  Weatherman: Exposing Weather-Based Privacy Threats in Big Energy Data. 2017 IEEE International Conference on Big Data (Big Data). :1079–1086.

Smart energy meters record electricity consumption and generation at fine-grained intervals, and are among the most widely deployed sensors in the world. Energy data embeds detailed information about a building's energy-efficiency, as well as the behavior of its occupants, which academia and industry are actively working to extract. In many cases, either inadvertently or by design, these third-parties only have access to anonymous energy data without an associated location. The location of energy data is highly useful and highly sensitive information: it can provide important contextual information to improve big data analytics or interpret their results, but it can also enable third-parties to link private behavior derived from energy data with a particular location. In this paper, we present Weatherman, which leverages a suite of analytics techniques to localize the source of anonymous energy data. Our key insight is that energy consumption data, as well as wind and solar generation data, largely correlates with weather, e.g., temperature, wind speed, and cloud cover, and that every location on Earth has a distinct weather signature that uniquely identifies it. Weatherman represents a serious privacy threat, but also a potentially useful tool for researchers working with anonymous smart meter data. We evaluate Weatherman's potential in both areas by localizing data from over one hundred smart meters using a weather database that includes data from over 35,000 locations. Our results show that Weatherman localizes coarse (one-hour resolution) energy consumption, wind, and solar data to within 16.68km, 9.84km, and 5.12km, respectively, on average, which is more accurate using much coarser resolution data than prior work on localizing only anonymous solar data using solar signatures.

Hassoon, I. A., Tapus, N., Jasim, A. C..  2017.  Enhance Privacy in Big Data and Cloud via Diff-Anonym Algorithm. 2017 16th RoEduNet Conference: Networking in Education and Research (RoEduNet). :1–5.

The main issue with big data in cloud is the processed or used always need to be by third party. It is very important for the owners of data or clients to trust and to have the guarantee of privacy for the information stored in cloud or analyzed as big data. The privacy models studied in previous research showed that privacy infringement for big data happened because of limitation, privacy guarantee rate or dissemination of accurate data which is obtainable in the data set. In addition, there are various privacy models. In order to determine the best and the most appropriate model to be applied in the future, which also guarantees big data privacy, it is necessary to invest in research and study. In the next part, we surfed some of the privacy models in order to determine the advantages and disadvantages of each model in privacy assurance for big data in cloud. The present study also proposes combined Diff-Anonym algorithm (K-anonymity and differential models) to provide data anonymity with guarantee to keep balance between ambiguity of private data and clarity of general data.