Visible to the public Biblio

Found 204 results

Filters: Keyword is Indexes  [Clear All Filters]
2021-04-27
Zhang, M., Chen, Y., Huang, J..  2020.  SE-PPFM: A Searchable Encryption Scheme Supporting Privacy-Preserving Fuzzy Multikeyword in Cloud Systems. IEEE Systems Journal. :1–9.
Cloud computing provides an appearing application for compelling vision in managing big-data files and responding queries over a distributed cloud platform. To overcome privacy revealing risks, sensitive documents and private data are usually stored in the clouds in a cipher-based manner. However, it is inefficient to search the data in traditional encryption systems. Searchable encryption is a useful cryptographic primitive to enable users to retrieve data in ciphertexts. However, the traditional searchable encryptions provide lower search efficiency and cannot carry out fuzzy multikeyword queries. To solve this issue, in this article, we propose a searchable encryption that supports privacy-preserving fuzzy multikeyword search (SE-PPFM) in cloud systems, which is built by asymmetric scalar-product-preserving encryptions and Hadamard product operations. In order to realize the functionality of efficient fuzzy searches, we employ Word2vec as the primitive of machine learning to obtain a fuzzy correlation score between encrypted data and queries predicates. We analyze and evaluate the performance in terms of token of multikeyword, retrieval and match time, file retrieval time and matching accuracy, etc. The experimental results show that our scheme can achieve a higher efficiency in fuzzy multikeyword ciphertext search and provide a higher accuracy in retrieving and matching procedure.
2021-04-08
Guo, T., Zhou, R., Tian, C..  2020.  On the Information Leakage in Private Information Retrieval Systems. IEEE Transactions on Information Forensics and Security. 15:2999—3012.
We consider information leakage to the user in private information retrieval (PIR) systems. Information leakage can be measured in terms of individual message leakage or total leakage. Individual message leakage, or simply individual leakage, is defined as the amount of information that the user can obtain on any individual message that is not being requested, and the total leakage is defined as the amount of information that the user can obtain about all the other messages except the one being requested. In this work, we characterize the tradeoff between the minimum download cost and the individual leakage, and that for the total leakage, respectively. Coding schemes are proposed to achieve these optimal tradeoffs, which are also shown to be optimal in terms of the message size. We further characterize the optimal tradeoff between the minimum amount of common randomness and the total leakage. Moreover, we show that under individual leakage, common randomness is in fact unnecessary when there are more than two messages.
2021-03-01
D’Alterio, P., Garibaldi, J. M., John, R. I..  2020.  Constrained Interval Type-2 Fuzzy Classification Systems for Explainable AI (XAI). 2020 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE). :1–8.
In recent year, there has been a growing need for intelligent systems that not only are able to provide reliable classifications but can also produce explanations for the decisions they make. The demand for increased explainability has led to the emergence of explainable artificial intelligence (XAI) as a specific research field. In this context, fuzzy logic systems represent a promising tool thanks to their inherently interpretable structure. The use of a rule-base and linguistic terms, in fact, have allowed researchers to create models that are able to produce explanations in natural language for each of the classifications they make. So far, however, designing systems that make use of interval type-2 (IT2) fuzzy logic and also give explanations for their outputs has been very challenging, partially due to the presence of the type-reduction step. In this paper, it will be shown how constrained interval type-2 (CIT2) fuzzy sets represent a valid alternative to conventional interval type-2 sets in order to address this issue. Through the analysis of two case studies from the medical domain, it is shown how explainable CIT2 classifiers are produced. These systems can explain which rules contributed to the creation of each of the endpoints of the output interval centroid, while showing (in these examples) the same level of accuracy as their IT2 counterpart.
2021-02-22
Lei, X., Tu, G.-H., Liu, A. X., Xie, T..  2020.  Fast and Secure kNN Query Processing in Cloud Computing. 2020 IEEE Conference on Communications and Network Security (CNS). :1–9.
Advances in sensing and tracking technology lead to the proliferation of location-based services. Location service providers (LSPs) often resort to commercial public clouds to store the tremendous geospatial data and process location-based queries from data users. To protect the privacy of LSP's geospatial data and data user's query location against the untrusted cloud, they are required to be encrypted before sending to the cloud. Nevertheless, it is not easy to design a fast and secure location-based query processing scheme over the encrypted data. In this paper, we propose a Fast and Secure kNN (FSkNN) scheme to support secure k nearest neighbor (k NN) search in cloud computing. We reveal the inherent connection between an Sk NN protocol and a secure range query protocol and further describe how to construct FSkNN based on a secure range query protocol. FSkNN leverages a customized accuracy-assured strategy to ensure the result accuracy and adopts a data structure named random Bloom filter (RBF) to build a secure index for efficiently searching. We formally prove the security of FSkNN under the random oracle model. Our evaluation results show that FSkNN is highly practical.
Oliver, J., Ali, M., Hagen, J..  2020.  HAC-T and Fast Search for Similarity in Security. 2020 International Conference on Omni-layer Intelligent Systems (COINS). :1–7.
Similarity digests have gained popularity for many security applications like blacklisting/whitelisting, and finding similar variants of malware. TLSH has been shown to be particularly good at hunting similar malware, and is resistant to evasion as compared to other similarity digests like ssdeep and sdhash. Searching and clustering are fundamental tools which help the security analysts and security operations center (SOC) operators in hunting and analyzing malware. Current approaches which aim to cluster malware are not scalable enough to keep up with the vast amount of malware and goodware available in the wild. In this paper, we present techniques which allow for fast search and clustering of TLSH hash digests which can aid analysts to inspect large amounts of malware/goodware. Our approach builds on fast nearest neighbor search techniques to build a tree-based index which performs fast search based on TLSH hash digests. The tree-based index is used in our threshold based Hierarchical Agglomerative Clustering (HAC-T) algorithm which is able to cluster digests in a scalable manner. Our clustering technique can cluster digests in O (n logn) time on average. We performed an empirical evaluation by comparing our approach with many standard and recent clustering techniques. We demonstrate that our approach is much more scalable and still is able to produce good cluster quality. We measured cluster quality using purity on 10 million samples obtained from VirusTotal. We obtained a high purity score in the range from 0.97 to 0.98 using labels from five major anti-virus vendors (Kaspersky, Microsoft, Symantec, Sophos, and McAfee) which demonstrates the effectiveness of the proposed method.
2021-02-16
Jin, Y., Tian, Z., Zhou, M., Wang, H..  2020.  MuTrack: Multiparameter Based Indoor Passive Tracking System Using Commodity WiFi. ICC 2020 - 2020 IEEE International Conference on Communications (ICC). :1—6.
Device-Free Localization and Tracking (DFLT) acts as a key component for the contactless awareness applications such as elderly care and home security. However, the random phase errors in WiFi signal and weak target echoes submerged in background clutter signals are mainly obstacles for current DFLT systems. In this paper, we propose the design and implementation of MuTrack, a multiparameter based DFLT system using commodity WiFi devices with a single link. Firstly, we select an antenna with maximum reliability index as the reference antenna for signal sanitization in which the conjugate operation removes the random phase errors. Secondly, we design a multi-dimensional parameters estimator and then refine path parameters by optimizing the complete data of path components. Finally, the Hungarian Kalman Filter based tracking method is proposed to derive accurate locations from low-resolution parameter estimates. We extensively validate the proposed system in typical indoor environment and these experimental results show that MuTrack can achieve high tracking accuracy with the mean error of 0.82 m using only a single link.
2021-02-15
Av, N., Kumar, N. A..  2020.  Image Encryption Using Genetic Algorithm and Bit-Slice Rotation. 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT). :1–6.
Cryptography is a powerful means of delivering information in a secure manner. Over the years, many image encryption algorithms have been proposed based on the chaotic system to protect the digital image against cryptography attacks. In chaotic encryption, it jumbles the image to vary the framework of the image. This makes it difficult for the attacker to retrieve the original image. This paper introduces an efficient image encryption algorithm incorporating the genetic algorithm, bit plane slicing and bit plane rotation of the digital image. The digital image is sliced into eight planes and each plane is well rotated to give a fully encrypted image after the application of the Genetic Algorithm on each pixel of the image. This makes it less prone to attacks. For decryption, we perform the operations in the reverse order. The performance of this algorithm is measured using various similarity measures like Structural Similarity Index Measure (SSIM). The results exhibit that the proposed scheme provides a stronger level of encryption and an enhanced security level.
Doğu, S., Alidoustaghdam, H., Dilman, İ, Akıncı, M. N..  2020.  The Capability of Truncated Singular Value Decomposition Method for Through the Wall Microwave Imaging. 2020 IEEE Microwave Theory and Techniques in Wireless Communications (MTTW). 1:76–81.
In this study, a truncated singular value decomposition (TSVD) based computationally efficient through the wall imaging (TWI) is addressed. Mainly, two different scenarios with identical and non-identical multiple scatterers behind the wall have been considered. The scattered data are processed with special scheme in order to improve quality of the results and measurements are performed at four different frequencies. Next, effects of selecting truncation threshold in TSVD methods are analyzed and a detailed quantitative comparison is provided to demonstrate capabilities of these TSVD methods over selection of truncation threshold.
2021-01-25
Gracy, S., Milošević, J., Sandberg, H..  2020.  Actuator Security Index for Structured Systems. 2020 American Control Conference (ACC). :2993–2998.
Given a network with a set of vulnerable actuators (and sensors), the security index of an actuator equals the minimum number of sensors and actuators that needs to be compromised so as to conduct a perfectly undetectable attack using the said actuator. This paper deals with the problem of computing actuator security indices for discrete-time LTI network systems, using a structured systems framework. We show that the actuator security index is generic, that is for almost all realizations the actuator security index remains the same. We refer to such an index as generic security index (generic index) of an actuator. Given that the security index quantifies the vulnerability of a network, the generic index is quite valuable for large scale energy systems. Our second contribution is to provide graph-theoretic conditions for computing the generic index. The said conditions are in terms of existence of linkings on appropriately-defined directed (sub)graphs. Based on these conditions, we present an algorithm for computing the generic index.
2021-01-18
Sun, J., Ma, J., Quan, J., Zhu, X., I, C..  2019.  A Fuzzy String Matching Scheme Resistant to Statistical Attack. 2019 International Conference on Networking and Network Applications (NaNA). :396–402.
The fuzzy query scheme based on vector index uses Bloom filter to construct vector index for key words. Then the statistical attack based on the deviation of frequency distribution of the vector index brings out the sensitive information disclosure. Using the noise vector, a fuzzy query scheme resistant to the statistical attack serving for encrypted database, i.e. S-BF, is introduced. With the noise vector to clear up the deviation of frequency distribution of vector index, the statistical attacks to the vector index are resolved. Demonstrated by lab experiment, S-BF scheme can achieve the secure fuzzy query with the powerful privation protection capability for encrypted cloud database without the loss of fuzzy query efficiency.
Yadav, M. K., Gugal, D., Matkar, S., Waghmare, S..  2019.  Encrypted Keyword Search in Cloud Computing using Fuzzy Logic. 2019 1st International Conference on Innovations in Information and Communication Technology (ICIICT). :1–4.
Research and Development, and information management professionals routinely employ simple keyword searches or more complex Boolean queries when using databases such as PubMed and Ovid and search engines like Google to find the information they need. While satisfying the basic needs of the researcher, basic search is limited which can adversely affect both precision and recall, decreasing productivity and damaging the researchers' ability to discover new insights. The cloud service providers who store user's data may access sensitive information without any proper authority. A basic approach to save the data confidentiality is to encrypt the data. Data encryption also demands the protection of keyword privacy since those usually contain very vital information related to the files. Encryption of keywords protects keyword safety. Fuzzy keyword search enhances system usability by matching the files perfectly or to the nearest possible files against the keywords entered by the user based on similar semantics. Encrypted keyword search in cloud using this logic provides the user, on entering keywords, to receive best possible files in a more secured manner, by protecting the user's documents.
2021-01-11
Huang, K., Yang, T..  2020.  Additive and Subtractive Cuckoo Filters. 2020 IEEE/ACM 28th International Symposium on Quality of Service (IWQoS). :1–10.
Bloom filters (BFs) are fast and space-efficient data structures used for set membership queries in many applications. BFs are required to satisfy three key requirements: low space cost, high-speed lookups, and fast updates. Prior works do not satisfy these requirements at the same time. The standard BF does not support deletions of items and the variants that support deletions need additional space or performance overhead. The state-of-the-art cuckoo filters (CF) has high performance with seemingly low space cost. However, the CF suffers a critical issue of varying space cost per item. This is because the exclusive-OR (XOR) operation used by the CF requires the total number of buckets to be a power of two, leading to the space inflation. To address the issue, in this paper we propose a scalable variant of the cuckoo filter called additive and subtractive cuckoo filter (ASCF). We aim to improve the space efficiency while sustaining comparably high performance. The ASCF uses the addition and subtraction (ADD/SUB) operations instead of the XOR operation to compute an item's two candidate bucket indexes based on its fingerprint. Experimental results show that the ASCF achieves both low space cost and high performance. Compared to the CF, the ASCF reduces up to 1.9x space cost per item while maintaining the same lookup and update throughput. In addition, the ASCF outperforms other filters in both space cost and performance.
2020-12-21
Figueiredo, N. M., Rodríguez, M. C..  2020.  Trustworthiness in Sensor Networks A Reputation-Based Method for Weather Stations. 2020 International Conference on Omni-layer Intelligent Systems (COINS). :1–6.
Trustworthiness is a soft-security feature that evaluates the correct behavior of nodes in a network. More specifically, this feature tries to answer the following question: how much should we trust in a certain node? To determine the trustworthiness of a node, our approach focuses on two reputation indicators: the self-data trust, which evaluates the data generated by the node itself taking into account its historical data; and the peer-data trust, which utilizes the nearest nodes' data. In this paper, we show how these two indicators can be calculated using the Gaussian Overlap and Pearson correlation. This paper includes a validation of our trustworthiness approach using real data from unofficial and official weather stations in Portugal. This is a representative scenario of the current situation in many other areas, with different entities providing different kinds of data using autonomous sensors in a continuous way over the networks.
Cheng, Z., Chow, M.-Y..  2020.  An Augmented Bayesian Reputation Metric for Trustworthiness Evaluation in Consensus-based Distributed Microgrid Energy Management Systems with Energy Storage. 2020 2nd IEEE International Conference on Industrial Electronics for Sustainable Energy Systems (IESES). 1:215–220.
Consensus-based distributed microgrid energy management system is one of the most used distributed control strategies in the microgrid area. To improve its cybersecurity, the system needs to evaluate the trustworthiness of the participating agents in addition to the conventional cryptography efforts. This paper proposes a novel augmented reputation metric to evaluate the agents' trustworthiness in a distributed fashion. The proposed metric adopts a novel augmentation method to substantially improve the trust evaluation and attack detection performance under three typical difficult-to-detect attack patterns. The proposed metric is implemented and validated on a real-time HIL microgrid testbed.
2020-12-11
Dabas, K., Madaan, N., Arya, V., Mehta, S., Chakraborty, T., Singh, G..  2019.  Fair Transfer of Multiple Style Attributes in Text. 2019 Grace Hopper Celebration India (GHCI). :1—5.

To preserve anonymity and obfuscate their identity on online platforms users may morph their text and portray themselves as a different gender or demographic. Similarly, a chatbot may need to customize its communication style to improve engagement with its audience. This manner of changing the style of written text has gained significant attention in recent years. Yet these past research works largely cater to the transfer of single style attributes. The disadvantage of focusing on a single style alone is that this often results in target text where other existing style attributes behave unpredictably or are unfairly dominated by the new style. To counteract this behavior, it would be nice to have a style transfer mechanism that can transfer or control multiple styles simultaneously and fairly. Through such an approach, one could obtain obfuscated or written text incorporated with a desired degree of multiple soft styles such as female-quality, politeness, or formalness. To the best of our knowledge this work is the first that shows and attempt to solve the issues related to multiple style transfer. We also demonstrate that the transfer of multiple styles cannot be achieved by sequentially performing multiple single-style transfers. This is because each single style-transfer step often reverses or dominates over the style incorporated by a previous transfer step. We then propose a neural network architecture for fairly transferring multiple style attributes in a given text. We test our architecture on the Yelp dataset to demonstrate our superior performance as compared to existing one-style transfer steps performed in a sequence.

2020-12-01
Robinette, P., Novitzky, M., Fitzgerald, C., Benjamin, M. R., Schmidt, H..  2019.  Exploring Human-Robot Trust During Teaming in a Real-World Testbed. 2019 14th ACM/IEEE International Conference on Human-Robot Interaction (HRI). :592—593.

Project Aquaticus is a human-robot teaming competition on the water involving autonomous surface vehicles and human operated motorized kayaks. Teams composed of both humans and robots share the same physical environment to play capture the flag. In this paper, we present results from seven competitions of our half-court (one participant versus one robot) game. We found that participants indicated more trust in more aggressive behaviors from robots.

2020-11-23
Sreekumari, P..  2018.  Privacy-Preserving Keyword Search Schemes over Encrypted Cloud Data: An Extensive Analysis. 2018 IEEE 4th International Conference on Big Data Security on Cloud (BigDataSecurity), IEEE International Conference on High Performance and Smart Computing, (HPSC) and IEEE International Conference on Intelligent Data and Security (IDS). :114–120.
Big Data has rapidly developed into a hot research topic in many areas that attracts attention from academia and industry around the world. Many organization demands efficient solution to store, process, analyze and search huge amount of information. With the rapid development of cloud computing, organization prefers cloud storage services to reduce the overhead of storing data locally. However, the security and privacy of big data in cloud computing is a major source of concern. One of the positive ways of protecting data is encrypting it before outsourcing to remote servers, but the encrypted significant amounts of cloud data brings difficulties for the remote servers to perform any keyword search functions without leaking information. Various privacy-preserving keyword search (PPKS) schemes have been proposed to mitigate the privacy issue of big data encrypted on cloud storage. This paper presents an extensive analysis of the existing PPKS techniques in terms of verifiability, efficiency and data privacy. Through this analysis, we present some valuable directions for future work.
2020-11-20
Moghaddam, F. F., Wieder, P., Yahyapour, R., Khodadadi, T..  2018.  A Reliable Ring Analysis Engine for Establishment of Multi-Level Security Management in Clouds. 2018 41st International Conference on Telecommunications and Signal Processing (TSP). :1—5.
Security and Privacy challenges are the most obstacles for the advancement of cloud computing and the erosion of trust boundaries already happening in organizations is amplified and accelerated by this emerging technology. Policy Management Frameworks are the most proper solutions to create dedicated security levels based on the sensitivity of resources and according to the mapping process between requirements cloud customers and capabilities of service providers. The most concerning issue in these frameworks is the rate of perfect matches between capabilities and requirements. In this paper, a reliable ring analysis engine has been introduced to efficiently map the security requirements of cloud customers to the capabilities of service provider and to enhance the rate of perfect matches between them for establishment of different security levels in clouds. In the suggested model a structural index has been introduced to receive the requirement and efficiently map them to the most proper security mechanism of the service provider. Our results show that this index-based engine enhances the rate of perfect matches considerably and decreases the detected conflicts in syntactic and semantic analysis.
Lu, X., Guan, Z., Zhou, X., Du, X., Wu, L., Guizani, M..  2019.  A Secure and Efficient Renewable Energy Trading Scheme Based on Blockchain in Smart Grid. 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS). :1839—1844.
Nowadays, with the diversification and decentralization of energy systems, the energy Internet makes it possible to interconnect distributed energy sources and consumers. In the energy trading market, the traditional centralized model relies entirely on trusted third parties. However, as the number of entities involved in the transactions grows and the forms of transactions diversify, the centralized model gradually exposes problems such as insufficient scalability, High energy consumption, and low processing efficiency. To address these challenges, we propose a secure and efficient energy renewable trading scheme based on blockchain. In our scheme, the electricity market trading model is divided into two levels, which can not only protect the privacy, but also achieve a green computing. In addition, in order to adapt to the relatively weak computing power of the underlying equipment in smart grid, we design a credibility-based equity proof mechanism to greatly improve the system availability. Compared with other similar distributed energy trading schemes, we prove the advantages of our scheme in terms of high operational efficiency and low computational overhead through experimental evaluations. Additionally, we conduct a detailed security analysis to demonstrate that our solution meets the security requirements.
2020-11-16
Ullah, S., Shetty, S., Hassanzadeh, A..  2018.  Towards Modeling Attacker’s Opportunity for Improving Cyber Resilience in Energy Delivery Systems. 2018 Resilience Week (RWS). :100–107.
Cyber resiliency of Energy Delivery Systems (EDS) is critical for secure and resilient cyber infrastructure. Defense-in-depth architecture forces attackers to conduct lateral propagation until the target is compromised. Researchers developed techniques based on graph spectral matrices to model lateral propagation. However, these techniques ignore host criticality which is critical in EDS. In this paper, we model attacker's opportunity by developing three criticality metrics for each host along the path to the target. The first metric refers the opportunity of attackers before they penetrate the infrastructure. The second metric measure the opportunity a host provides by allowing attackers to propagate through the network. Along with vulnerability we also take into account the attributes of hosts and links within each path. Then, we derive third criticality metric to reflect the information flow dependency from each host to target. Finally, we provide system design for instantiating the proposed metrics for real network scenarios in EDS. We present simulation results which illustrates the effectiveness of the metrics for efficient defense deployment in EDS cyber infrastructure.
Zhang, C., Xu, C., Xu, J., Tang, Y., Choi, B..  2019.  GEMˆ2-Tree: A Gas-Efficient Structure for Authenticated Range Queries in Blockchain. 2019 IEEE 35th International Conference on Data Engineering (ICDE). :842–853.
Blockchain technology has attracted much attention due to the great success of the cryptocurrencies. Owing to its immutability property and consensus protocol, blockchain offers a new solution for trusted storage and computation services. To scale up the services, prior research has suggested a hybrid storage architecture, where only small meta-data are stored onchain and the raw data are outsourced to off-chain storage. To protect data integrity, a cryptographic proof can be constructed online for queries over the data stored in the system. However, the previous schemes only support simple key-value queries. In this paper, we take the first step toward studying authenticated range queries in the hybrid-storage blockchain. The key challenge lies in how to design an authenticated data structure (ADS) that can be efficiently maintained by the blockchain, in which a unique gas cost model is employed. By analyzing the performance of the existing techniques, we propose a novel ADS, called GEM2-tree, which is not only gas-efficient but also effective in supporting authenticated queries. To further reduce the ADS maintenance cost without sacrificing much the query performance, we also propose an optimized structure, GEM2*-tree, by designing a two-level index structure. Theoretical analysis and empirical evaluation validate the performance of the proposed ADSs.
2020-10-16
Shayganmehr, Masoud, Montazer, Gholam Ali.  2019.  Identifying Indexes Affecting the Quality of E-Government Websites. 2019 5th International Conference on Web Research (ICWR). :167—171.

With the development of new technologies in the world, governments have tendency to make a communications with people and business with the help of such technologies. Electronic government (e-government) is defined as utilizing information technologies such as electronic networks, Internet and mobile phones by organizations and state institutions in order to making wide communication between citizens, business and different state institutions. Development of e-government starts with making website in order to share information with users and is considered as the main infrastructure for further development. Website assessment is considered as a way for improving service quality. Different international researches have introduced various indexes for website assessment, they only see some dimensions of website in their research. In this paper, the most important indexes for website quality assessment based on accurate review of previous studies are "Web design", "navigation", services", "maintenance and Support", "Citizens Participation", "Information Quality", "Privacy and Security", "Responsiveness", "Usability". Considering mentioned indexes in designing the website facilitates user interaction with the e-government websites.

2020-10-12
Okutan, Ahmet, Cheng, Fu-Yuan, Su, Shao-Hsuan, Yang, Shanchieh Jay.  2019.  Dynamic Generation of Empirical Cyberattack Models with Engineered Alert Features. MILCOM 2019 - 2019 IEEE Military Communications Conference (MILCOM). :1–6.
Due to the increased diversity and complexity of cyberattacks, innovative and effective analytics are needed in order to identify critical cyber incidents on a corporate network even if no ground truth data is available. This paper develops an automated system which processes a set of intrusion alerts to create behavior aggregates and then classifies these aggregates into empirical attack models through a dynamic Bayesian approach with innovative feature engineering methods. Each attack model represents a unique collective attack behavior that helps to identify critical activities on the network. Using 2017 National Collegiate Penetration Testing Competition data, it is demonstrated that the developed system is capable of generating and refining unique attack models that make sense to human, without a priori knowledge.
Chia, Pern Hui, Desfontaines, Damien, Perera, Irippuge Milinda, Simmons-Marengo, Daniel, Li, Chao, Day, Wei-Yen, Wang, Qiushi, Guevara, Miguel.  2019.  KHyperLogLog: Estimating Reidentifiability and Joinability of Large Data at Scale. 2019 IEEE Symposium on Security and Privacy (SP). :350–364.
Understanding the privacy relevant characteristics of data sets, such as reidentifiability and joinability, is crucial for data governance, yet can be difficult for large data sets. While computing the data characteristics by brute force is straightforward, the scale of systems and data collected by large organizations demands an efficient approach. We present KHyperLogLog (KHLL), an algorithm based on approximate counting techniques that can estimate the reidentifiability and joinability risks of very large databases using linear runtime and minimal memory. KHLL enables one to measure reidentifiability of data quantitatively, rather than based on expert judgement or manual reviews. Meanwhile, joinability analysis using KHLL helps ensure the separation of pseudonymous and identified data sets. We describe how organizations can use KHLL to improve protection of user privacy. The efficiency of KHLL allows one to schedule periodic analyses that detect any deviations from the expected risks over time as a regression test for privacy. We validate the performance and accuracy of KHLL through experiments using proprietary and publicly available data sets.
2020-10-06
Li, Zhiyi, Shahidehpour, Mohammad, Galvin, Robert W., Li, Yang.  2018.  Collaborative Cyber-Physical Restoration for Enhancing the Resilience of Power Distribution Systems. 2018 IEEE Power Energy Society General Meeting (PESGM). :1—5.

This paper sheds light on the collaborative efforts in restoring cyber and physical subsystems of a modern power distribution system after the occurrence of an extreme weather event. The extensive cyber-physical interdependencies in the operation of power distribution systems are first introduced for investigating the functionality loss of each subsystem when the dependent subsystem suffers disruptions. A resilience index is then proposed for measuring the effectiveness of restoration activities in terms of restoration rapidity. After modeling operators' decision making for economic dispatch as a second-order cone programming problem, this paper proposes a heuristic approach for prioritizing the activities for restoring both cyber and physical subsystems. In particular, the proposed heuristic approach takes into consideration of cyber-physical interdependencies for improving the operation performance. Case studies are also conducted to validate the collaborative restoration model in the 33-bus power distribution system.