Biblio

Filters: Keyword is data structures
2021-01-11
Huang, K., Yang, T..  2020.  Additive and Subtractive Cuckoo Filters. 2020 IEEE/ACM 28th International Symposium on Quality of Service (IWQoS). :1–10.
Bloom filters (BFs) are fast and space-efficient data structures used for set membership queries in many applications. BFs are required to satisfy three key requirements: low space cost, high-speed lookups, and fast updates. Prior works do not satisfy these requirements at the same time. The standard BF does not support deletions of items, and the variants that support deletions need additional space or incur performance overhead. The state-of-the-art cuckoo filter (CF) has high performance with seemingly low space cost. However, the CF suffers from a critical issue of varying space cost per item. This is because the exclusive-OR (XOR) operation used by the CF requires the total number of buckets to be a power of two, leading to space inflation. To address the issue, in this paper we propose a scalable variant of the cuckoo filter called the additive and subtractive cuckoo filter (ASCF). We aim to improve the space efficiency while sustaining comparably high performance. The ASCF uses addition and subtraction (ADD/SUB) operations instead of the XOR operation to compute an item's two candidate bucket indexes based on its fingerprint. Experimental results show that the ASCF achieves both low space cost and high performance. Compared to the CF, the ASCF reduces space cost per item by up to 1.9x while maintaining the same lookup and update throughput. In addition, the ASCF outperforms other filters in both space cost and performance.
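As a rough illustration of the ADD/SUB idea in the abstract above, the following Python sketch computes two candidate buckets for a table whose size is not a power of two. The two-block layout (so that the block number says whether to add or subtract), the hash helper `_h`, the 12-bit fingerprint, and all function names are assumptions made for this sketch; the paper's actual construction may differ.

```python
import hashlib
from typing import Tuple


def _h(data: bytes) -> int:
    """Hypothetical hash helper: first 8 bytes of SHA-256 as an integer."""
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def fingerprint(item: bytes, bits: int = 12) -> int:
    """Short non-zero fingerprint of an item (the width is an assumption)."""
    return _h(b"fp" + item) % ((1 << bits) - 1) + 1


def _offset(fp: int, m: int) -> int:
    """Offset derived from the fingerprint only, reduced modulo the block size."""
    return _h(fp.to_bytes(2, "big")) % m


def candidate_buckets(item: bytes, m: int) -> Tuple[int, int]:
    """Return (index in block 0, index in block 1); each block has m buckets."""
    fp = fingerprint(item)
    i1 = _h(item) % m
    i2 = (i1 + _offset(fp, m)) % m  # ADD to move from block 0 to block 1
    return i1, i2


def alternate(block: int, index: int, fp: int, m: int) -> Tuple[int, int]:
    """Given one bucket and the stored fingerprint, find the other bucket."""
    if block == 0:
        return 1, (index + _offset(fp, m)) % m  # ADD
    return 0, (index - _offset(fp, m)) % m      # SUB undoes the ADD
```

Because the alternate bucket is recovered with modular addition and subtraction rather than XOR, `m` can be any positive integer (for example 1000) and both indexes still stay in range.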
Zhang, H., Zhang, D., Chen, H., Xu, J..  2020.  Improving Efficiency of Pseudonym Revocation in VANET Using Cuckoo Filter. 2020 IEEE 20th International Conference on Communication Technology (ICCT). :763–769.
In VANETs, pseudonyms are often used to replace the identity of vehicles in communication. When vehicles drive out of the network or misbehave, their pseudonym certificates need to be revoked by the certificate authority (CA). The certificate revocation lists (CRLs) are usually used to store the revoked certificates before their expiration. However, using CRLs would incur additional storage, communication and computation overhead. Some existing schemes have proposed to use Bloom Filter to compress the original CRLs, but they are unable to delete the expired certificates and introduce the false positive problem. In this paper, we propose an improved pseudonym certificates revocation scheme, using Cuckoo Filter for compression to reduce the impact of these problems. In order to optimize deletion efficiency, we propose the concept of Certificate Expiration List (CEL) which can be implemented with priority queue. The experimental results show that our scheme can effectively reduce the storage and communication overhead of pseudonym certificates revocation, while retaining moderately low false positive rates. In addition, our scheme can also greatly improve the lookup performance on CRLs, and reduce the revocation operation costs by allowing deletion.
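The abstract above mentions a Certificate Expiration List (CEL) implemented with a priority queue. A minimal sketch of that idea follows, assuming a min-heap keyed by expiry time; the class and method names are hypothetical, and a plain Python `set` stands in for the cuckoo filter purely to keep the example self-contained.

```python
import heapq


class RevocationStore:
    """Revoked pseudonyms plus a min-heap of (expiry_time, pseudonym_id)."""

    def __init__(self):
        self._revoked = set()  # stand-in for the cuckoo filter (assumption)
        self._cel = []         # Certificate Expiration List as a min-heap

    def revoke(self, pseudonym_id: str, expiry_time: float) -> None:
        self._revoked.add(pseudonym_id)
        heapq.heappush(self._cel, (expiry_time, pseudonym_id))

    def is_revoked(self, pseudonym_id: str) -> bool:
        return pseudonym_id in self._revoked

    def purge_expired(self, now: float) -> int:
        """Delete every entry whose certificate has expired by `now`."""
        removed = 0
        while self._cel and self._cel[0][0] <= now:
            _, pseudonym_id = heapq.heappop(self._cel)
            self._revoked.discard(pseudonym_id)
            removed += 1
        return removed
```

The heap keeps the earliest-expiring certificate at the front, so a purge only touches entries that are actually due for deletion.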
Awad, M. A., Ashkiani, S., Porumbescu, S. D., Owens, J. D..  2020.  Dynamic Graphs on the GPU. 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS). :739–748.
We present a fast dynamic graph data structure for the GPU. Our dynamic graph structure uses one hash table per vertex to store adjacency lists and achieves 3.4-14.8x faster insertion rates over the state of the art across a diverse set of large datasets, as well as deletion speedups up to 7.8x. The data structure supports queries and dynamic updates through both edge and vertex insertion and deletion. In addition, we define a comprehensive evaluation strategy based on operations, workloads, and applications that we believe better characterize and evaluate dynamic graph data structures.
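The "one hash table per vertex" layout described above can be illustrated on the CPU with ordinary dictionaries. This is only a conceptual sketch: it says nothing about the GPU implementation, and the class and method names are assumptions.

```python
class DynamicGraph:
    """Directed graph that keeps one hash table (dict) per vertex."""

    def __init__(self):
        self._adj = {}  # vertex -> {neighbor: edge weight}

    def insert_vertex(self, v):
        self._adj.setdefault(v, {})

    def insert_edge(self, u, v, weight=1):
        self.insert_vertex(u)
        self.insert_vertex(v)
        self._adj[u][v] = weight

    def delete_edge(self, u, v):
        table = self._adj.get(u)
        if table is None or v not in table:
            return False
        del table[v]
        return True

    def delete_vertex(self, v):
        if self._adj.pop(v, None) is None:
            return False
        # Also drop edges that point at the deleted vertex.
        for table in self._adj.values():
            table.pop(v, None)
        return True

    def has_edge(self, u, v):
        return v in self._adj.get(u, {})

    def neighbors(self, u):
        return list(self._adj.get(u, {}))
```

Edge queries, insertions, and deletions each touch a single per-vertex table, which is what makes the layout attractive for dynamic updates.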
2020-12-28
Riaz, S., Khan, A. H., Haroon, M., Latif, S., Bhatti, S..  2020.  Big Data Security and Privacy: Current Challenges and Future Research perspective in Cloud Environment. 2020 International Conference on Information Management and Technology (ICIMTech). :977–982.

Cloud computing is an Internet-based technology that has emerged rapidly in the last few years due to popular, in-demand services required by various institutions, organizations, and individuals. Structured, unstructured, and semi-structured data is being transferred at a record pace onto cloud servers. These institutions, businesses, and organizations are shifting more and more workloads to cloud servers; because of the high cost, space, and maintenance issues of big data, cloud computing will become a potential choice for the storage of data. In a cloud environment, data is not yet completely secure from inside and outside attacks and intrusions, because cloud servers are under the control of a third party. The security of data becomes an important aspect due to the storage of sensitive data in a cloud environment. In this paper, we give an overview of the characteristics and state of the art of big data. We discuss the top data security and privacy threats, open issues, and current challenges, and their impact on business, from a future research perspective. We also review and analyze previous and recent frameworks and architectures for data security that are continuously being established against threats to improve how data is kept and stored in the cloud environment.

2020-12-11
Zhang, W., Byna, S., Niu, C., Chen, Y..  2019.  Exploring Metadata Search Essentials for Scientific Data Management. 2019 IEEE 26th International Conference on High Performance Computing, Data, and Analytics (HiPC). :83–92.

Scientific experiments and observations store massive amounts of data in various scientific file formats. Metadata, which describes the characteristics of the data, is commonly used to sift through massive datasets in order to locate data of interest to scientists. Several indexing data structures (such as hash tables, tries, self-balancing search trees, sparse arrays, etc.) have been developed as part of efforts to provide an efficient method for locating target data. However, how to choose an efficient indexing data structure remains unclear in the context of scientific data management, due to the lack of investigation into metadata, metadata queries, and corresponding data structures. In this study, we perform a systematic study of the metadata search essentials in the context of scientific data management. We study a real-world astronomy observation dataset and explore the characteristics of the metadata in the dataset. We also study possible metadata queries based on the discovery of the metadata characteristics and evaluate different data structures for various types of metadata attributes. Our evaluation on a real-world dataset suggests that a trie is a suitable data structure when prefix/suffix queries are required; otherwise, a hash table should be used. We conclude our study with a summary of our findings. These findings provide a guideline and offer insights for developing metadata indexing methodologies for scientific applications.
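To make the trie-versus-hash-table finding above concrete, here is a small Python sketch of a trie that answers prefix queries over metadata keys. The class name, method names, and record IDs are illustrative assumptions and are not taken from the study.

```python
class Trie:
    """Character trie mapping metadata keys to lists of record IDs."""

    def __init__(self):
        self.children = {}
        self.values = []  # record IDs whose key ends at this node

    def insert(self, key, record_id):
        node = self
        for ch in key:
            node = node.children.setdefault(ch, Trie())
        node.values.append(record_id)

    def with_prefix(self, prefix):
        """Return record IDs of all keys that start with `prefix`."""
        node = self
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        found, stack = [], [node]
        while stack:
            current = stack.pop()
            found.extend(current.values)
            stack.extend(current.children.values())
        return found
```

A suffix query can reuse the same structure by inserting and querying reversed keys, while an exact-match lookup needs nothing more than a plain `dict`, which fits the recommendation to use a hash table when prefix/suffix queries are not required.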

2020-12-01
Garbo, A., Quer, S..  2018.  A Fast MPEG’s CDVS Implementation for GPU Featured in Mobile Devices. IEEE Access. 6:52027–52046.
The Moving Picture Experts Group's Compact Descriptors for Visual Search (MPEG's CDVS) intends to standardize technologies in order to enable an interoperable, efficient, and cross-platform solution for internet-scale visual search applications and services. Among the key technologies within CDVS, we recall the format of visual descriptors, the descriptor extraction process, and the algorithms for indexing and matching. Unfortunately, these steps require precision and computation accuracy. Moreover, they are very time-consuming, as they need running times in the order of seconds when implemented on the central processing unit (CPU) of modern mobile devices. In this paper, to reduce computation times and maintain precision and accuracy, we re-design, for many-cores embedded graphical processor units (GPUs), all main local descriptor extraction pipeline phases of the MPEG's CDVS standard. To reach this goal, we introduce new techniques to adapt the standard algorithm to parallel processing. Furthermore, to reduce memory accesses and efficiently distribute the kernel workload, we use new approaches to store and retrieve CDVS information on proper GPU data structures. We present a complete experimental analysis on a large and standard test set. Our experiments show that our GPU-based approach is remarkably faster than the CPU-based reference implementation of the standard, and it maintains a comparable precision in terms of true and false positive rates.
2020-11-16
Zhang, C., Xu, C., Xu, J., Tang, Y., Choi, B..  2019.  GEM^2-Tree: A Gas-Efficient Structure for Authenticated Range Queries in Blockchain. 2019 IEEE 35th International Conference on Data Engineering (ICDE). :842–853.
Blockchain technology has attracted much attention due to the great success of cryptocurrencies. Owing to its immutability property and consensus protocol, blockchain offers a new solution for trusted storage and computation services. To scale up the services, prior research has suggested a hybrid storage architecture, where only small metadata are stored on-chain and the raw data are outsourced to off-chain storage. To protect data integrity, a cryptographic proof can be constructed online for queries over the data stored in the system. However, the previous schemes only support simple key-value queries. In this paper, we take the first step toward studying authenticated range queries in the hybrid-storage blockchain. The key challenge lies in how to design an authenticated data structure (ADS) that can be efficiently maintained by the blockchain, in which a unique gas cost model is employed. By analyzing the performance of the existing techniques, we propose a novel ADS, called GEM^2-tree, which is not only gas-efficient but also effective in supporting authenticated queries. To further reduce the ADS maintenance cost without sacrificing much query performance, we also propose an optimized structure, GEM^2*-tree, by designing a two-level index structure. Theoretical analysis and empirical evaluation validate the performance of the proposed ADSs.
2020-09-28
Liu, Qin, Pei, Shuyu, Xie, Kang, Wu, Jie, Peng, Tao, Wang, Guojun.  2018.  Achieving Secure and Effective Search Services in Cloud Computing. 2018 17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/ 12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :1386–1391.
One critical challenge of today's cloud services is how to provide an effective search service while preserving user privacy. In this paper, we propose a wildcard-based multi-keyword fuzzy search (WMFS) scheme over the encrypted data, which tolerates keyword misspellings by exploiting the indecomposable property of primes. Compared with existing secure fuzzy search schemes, our WMFS scheme has the following merits: 1) Efficiency. It eliminates the requirement of a predefined dictionary and thus supports updates efficiently. 2) High accuracy. It eliminates the false positive and false negative introduced by specific data structures and thus allows the user to retrieve files as accurate as possible. 3) Flexibility. It gives the user great flexibility to specify different search patterns including keyword and substring matching. Extensive experiments on a real data set demonstrate the effectiveness and efficiency of our scheme.
2020-08-13
Wang, Tianyi, Chow, Kam Pui.  2019.  Automatic Tagging of Cyber Threat Intelligence Unstructured Data using Semantics Extraction. 2019 IEEE International Conference on Intelligence and Security Informatics (ISI). :197–199.
Threat intelligence, information about potential or current attacks on an organization, is an important component of cyber security. As new threats occur continually, cyber security professionals keep an eye on the latest threat intelligence in order to continuously lower the security risks for their organizations. Cyber threat intelligence is usually conveyed by structured data like CVE entities and unstructured data like articles and reports. Structured data always follow certain patterns that can be easily analyzed, while it is harder to find fixed patterns to analyze in unstructured data. There exist plenty of methods and algorithms for information extraction from structured data, but no current work is complete or suitable for semantics extraction from unstructured cyber threat intelligence data. In this paper, we introduce an idea of automatic tagging that applies the JAPE feature within the GATE framework to perform semantics extraction on unstructured cyber threat intelligence data such as articles and reports. We extract token entities from each cyber threat intelligence article or report and evaluate their usefulness. A threat intelligence ontology can then be constructed from the useful entities extracted from related resources, making it convenient for professionals to find the latest useful threat intelligence they need.
2020-07-24
Wu, Zhijun, Xu, Enzhong, Liu, Liang, Yue, Meng.  2019.  CHTDS: A CP-ABE Access Control Scheme Based on Hash Table and Data Segmentation in NDN. 2019 18th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/13th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :843–848.

For future Internet, information-centric networking (ICN) is considered a potential solution to many of its current problems, such as content distribution, mobility, and security. Named Data Networking (NDN) is a more popular ICN project. However, concern regarding the protection of user data persists. Information caching in NDN decouples content and content publishers, which leads to content security threats due to lack of secure controls. Therefore, this paper presents a CP-ABE (ciphertext policy attribute based encryption) access control scheme based on hash table and data segmentation (CHTDS). Based on data segmentation, CHTDS uses a method of linearly splitting fixed data blocks, which effectively improves data management. CHTDS also introduces CP-ABE mechanism and hash table data structure to ensure secure access control and privilege revocation does not need to re-encrypt the published content. The analysis results show that CHTDS can effectively realize the security and fine-grained access control in the NDN environment, and reduce communication overhead for content access.

2020-07-13
Ahmad, Sahan, Zobaed, SM, Gottumukkala, Raju, Salehi, Mohsen Amini.  2019.  Edge Computing for User-Centric Secure Search on Cloud-Based Encrypted Big Data. 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS). :662–669.

Cloud service providers offer a low-cost and convenient solution to host unstructured data. However, cloud services act as third-party solutions and do not provide control of the data to users. This has raised security and privacy concerns for many organizations (users) with sensitive data to utilize cloud-based solutions. User-side encryption can potentially address these concerns by establishing user-centric cloud services and granting data control to the user. Nonetheless, user-side encryption limits the ability to process (e.g., search) encrypted data on the cloud. Accordingly, in this research, we provide a framework that enables processing (in particular, searching) of encrypted multiorganizational (i.e., multi-source) big data without revealing the data to cloud provider. Our framework leverages locality feature of edge computing to offer a user-centric search ability in a realtime manner. In particular, the edge system intelligently predicts the user's search pattern and prunes the multi-source big data search space to reduce the search time. The pruning system is based on efficient sampling from the clustered big dataset on the cloud. For each cluster, the pruning system dynamically samples appropriate number of terms based on the user's search tendency, so that the cluster is optimally represented. We developed a prototype of a user-centric search system and evaluated it against multiple datasets. Experimental results demonstrate 27% improvement in the pruning quality and search accuracy.

Oleshchuk, Vladimir.  2019.  Secure and Privacy Preserving Pattern Matching in Distributed Cloud-based Data Storage. 2019 10th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS). 2:820–823.
Given two strings: pattern p of length m and text t of length n. The string matching problem is to find all (or some) occurrences of the pattern p in the text t. We introduce a new simple data structure, called index arrays, and design fast privacy-preserving matching algorithm for string matching. The motivation behind introducing index arrays is determined by the need for pattern matching on distributed cloud-based datasets with semi-trusted cloud providers. It is intended to use encrypted index arrays both to improve performance and protect confidentiality and privacy of user data.
2020-07-03
Li, Feiyan, Li, Wei, Huo, Hongtao, Ran, Qiong.  2019.  Decision Fusion Based on Joint Low Rank and Sparse Component for Hyperspectral Image Classification. IGARSS 2019 - 2019 IEEE International Geoscience and Remote Sensing Symposium. :401–404.

Sparse and low rank matrix decomposition is a method that has recently been developed for estimating different components of hyperspectral data. The rank component is capable of preserving global data structures of data, while a sparse component can select the discriminative information by preserving details. In order to take advantage of both, we present a novel decision fusion based on joint low rank and sparse component (DFJLRS) method for hyperspectral imagery in this paper. First, we analyzed the effects of different components on classification results. Then a novel method adopts a decision fusion strategy which combines a SVM classifier with the information provided by joint sparse and low rank components. With combination of the advantages, the proposed method is both representative and discriminative. The proposed algorithm is evaluated using several hyperspectral images when compared with traditional counterparts.

2020-06-29
Tran, Thang M., Nguyen, Khanh-Van.  2019.  Fast Detection and Mitigation to DDoS Web Attack Based on Access Frequency. 2019 IEEE-RIVF International Conference on Computing and Communication Technologies (RIVF). :1–6.

We have been investigating methods for establishing an effective, immediate defense mechanism against the DDoS attacks on Web applications via hacker botnets, in which this defense mechanism can be immediately active without preparation time, e.g. for training data, usually asked for in existing proposals. In this study, we propose a new mechanism, including new data structures and algorithms, that allow the detection and filtering of large amounts of attack packets (Web request) based on monitoring and capturing the suspect groups of source IPs that can be sending packets at similar patterns, i.e. with very high and similar frequencies. The proposed algorithm places great emphasis on reducing storage space and processing time so it is promising to be effective in real-time attack response.

2020-06-03
Qawasmeh, Ethar, Al-Saleh, Mohammed I., Al-Sharif, Ziad A..  2019.  Towards a Generic Approach for Memory Forensics. 2019 Sixth HCT Information Technology Trends (ITT). :094–098.

The era of information technology has, unfortunately, contributed to a tremendous rise in the number of criminal activities. However, digital artifacts can be utilized in convicting cybercriminals and exposing their activities. Digital forensics science is concerned with all aspects related to cybercrimes. It seeks digital evidence by following standard methodologies so that the evidence can be admitted in courtrooms. This paper is concerned with memory forensics because of the unique artifacts memory holds. Memory contains information about the current state of systems and applications. Moreover, an application's data explains how a criminal has been interacting with the application just before the memory is acquired. Memory forensics at the application level is currently random and cumbersome. Targeting specific applications is what forensic researchers and practitioners are currently striving to provide. This paper suggests a general solution to investigate any application. Our solution aims to utilize information about an application's data structures and variables in the investigation process. This is because an application's data has to be stored and retrieved by means of variables. Information about data structures and variables can be generated by compilers for debugging purposes. We show that an application's information is a valuable resource to the investigator.

2020-05-26
Ostrovskaya, Svetlana, Surnin, Oleg, Hussain, Rasheed, Bouk, Safdar Hussain, Lee, JooYoung, Mehran, Narges, Ahmed, Syed Hassan, Benslimane, Abderrahim.  2018.  Towards Multi-metric Cache Replacement Policies in Vehicular Named Data Networks. 2018 IEEE 29th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC). :1–7.
Vehicular Named Data Network (VNDN) uses NDN as an underlying communication paradigm to realize intelligent transportation system applications. Content communication is the essence of NDN, which is primarily carried out through content naming, forwarding, intrinsic content security, and most importantly the in-network caching. In vehicular networks, vehicles on the road communicate with other vehicles and/or infrastructure network elements to provide passengers a reliable, efficient, and infotainment-rich commute experience. Recently, different aspects of NDN have been investigated in vehicular networks and in vehicular social networks (VSN); however, in this paper, we investigate the in-network caching, realized in NDN through the content store (CS) data structure. As the stale contents in CS do not just occupy cache space, but also decrease the overall performance of NDN-driven VANET and VSN applications, therefore the size of CS and the content lifetime in CS are primary issues in VNDN communications. To solve these issues, we propose a simple yet efficient multi-metric CS management mechanism through cache replacement (M2CRP). We consider the content popularity, relevance, freshness, and distance of a node to devise a set of algorithms for selection of the content to be replaced in CS in the case of replacement requirement. Simulation results show that our multi-metric strategy outperforms the existing cache replacement mechanisms in terms of Hit Ratio.
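The abstract above lists four metrics (popularity, relevance, freshness, and distance) but does not give the selection rule. The sketch below shows one plausible way to combine them: an equally weighted score over metrics normalized to [0, 1], with larger distance lowering the score. The weights, the normalization, the direction of the distance term, and all names are assumptions for illustration only, not the M2CRP algorithms.

```python
from dataclasses import dataclass


@dataclass
class CachedContent:
    name: str
    popularity: float  # all four metrics assumed normalized to [0, 1]
    relevance: float
    freshness: float
    distance: float


def retention_score(content, weights=(0.25, 0.25, 0.25, 0.25)):
    """Higher score means the content is more worth keeping (assumed rule)."""
    w_pop, w_rel, w_fresh, w_dist = weights
    return (w_pop * content.popularity
            + w_rel * content.relevance
            + w_fresh * content.freshness
            + w_dist * (1.0 - content.distance))


def choose_victim(content_store):
    """Pick the entry with the lowest retention score for replacement."""
    return min(content_store, key=retention_score)
```

With this rule, stale, unpopular content from a distant node is evicted first, which matches the abstract's concern that stale contents waste cache space.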
2020-05-22
Markchit, Sarawut, Chiu, Chih-Yi.  2019.  Hash Code Indexing in Cross-Modal Retrieval. 2019 International Conference on Content-Based Multimedia Indexing (CBMI). :1–4.

Cross-modal hashing, which searches nearest neighbors across different modalities in the Hamming space, has become a popular technique to overcome the storage and computation barrier in multimedia retrieval recently. Although dozens of cross-modal hashing algorithms are proposed to yield compact binary code representation, applying exhaustive search in a large-scale dataset is impractical for the real-time purpose, and the Hamming distance computation suffers inaccurate results. In this paper, we propose a novel index scheme over binary hash codes in cross-modal retrieval. The proposed indexing scheme exploits a few binary bits of the hash code as the index code. Based on the index code representation, we construct an inverted index structure to accelerate the retrieval efficiency and train a neural network to improve the indexing accuracy. Experiments are performed on two benchmark datasets for retrieval across image and text modalities, where hash codes are generated by three cross-modal hashing methods. Results show the proposed method effectively boosts the performance over the benchmark datasets and hash methods.
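The indexing scheme above uses a few bits of each hash code as an index code and builds an inverted index over them. A minimal Python sketch of that structure follows, assuming the index code is simply the leading bits of the hash code and that candidates in a bucket are ranked by Hamming distance. The neural network the paper trains to improve indexing accuracy is not modeled, and all names are hypothetical.

```python
from collections import defaultdict


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary codes."""
    return bin(a ^ b).count("1")


class InvertedIndex:
    def __init__(self, code_bits: int = 8, index_bits: int = 2):
        self.code_bits = code_bits
        self.index_bits = index_bits
        self.buckets = defaultdict(list)  # index code -> [(item_id, code)]

    def _index_code(self, code: int) -> int:
        # Assumption: the index code is the top `index_bits` bits of the code.
        return code >> (self.code_bits - self.index_bits)

    def add(self, item_id, code: int) -> None:
        self.buckets[self._index_code(code)].append((item_id, code))

    def search(self, query_code: int, top_k: int = 3):
        """Rank only the items that share the query's index code."""
        candidates = self.buckets.get(self._index_code(query_code), [])
        ranked = sorted(candidates, key=lambda e: hamming(e[1], query_code))
        return [item_id for item_id, _ in ranked[:top_k]]
```

The query inspects a single bucket instead of scanning every code, which is the source of the speedup; the cost is that near neighbors whose index bits differ are missed, the kind of error a learned index assignment is meant to reduce.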

Horzyk, Adrian, Starzyk, Janusz A..  2019.  Associative Data Model in Search for Nearest Neighbors and Similar Patterns. 2019 IEEE Symposium Series on Computational Intelligence (SSCI). :933–940.
This paper introduces a biologically inspired associative data model and structure for finding nearest neighbors and similar patterns. The method can be used as an alternative to the classical approaches to accelerate the search for such patterns using various priorities for attributes according to the Sebestyen measure. The presented structure, together with the algorithms developed in this paper, can be useful in various computational intelligence tasks like pattern matching, recognition, clustering, classification, multi-criterion search, etc. This approach is particularly useful for the on-line operation of associative neural network graphs that dynamically develop their structure during learning on training data. The results of experiments show that the associative approach can substantially accelerate the nearest neighbor search and that associative structures can also be used as a model for KNN tasks. Finally, this paper presents how the associative structures can be used to self-organize data and represent knowledge about them in an associative way, which yields the new search approaches described in this paper.
2020-03-23
Pewny, Jannik, Koppe, Philipp, Holz, Thorsten.  2019.  STEROIDS for DOPed Applications: A Compiler for Automated Data-Oriented Programming. 2019 IEEE European Symposium on Security and Privacy (EuroS&P). :111–126.
The wide-spread adoption of system defenses such as the randomization of code, stack, and heap raises the bar for code-reuse attacks. Thus, attackers utilize a scripting engine in target programs like a web browser to prepare the code-reuse chain, e.g., relocate gadget addresses or perform a just-in-time gadget search. However, many types of programs do not provide such an execution context that an attacker can use. Recent advances in data-oriented programming (DOP) explored an orthogonal way to abuse memory corruption vulnerabilities and demonstrated that an attacker can achieve Turing-complete computations without modifying code pointers in applications. As of now, constructing DOP exploits requires a lot of manual work, for every combination of application and payload anew. In this paper, we present novel techniques to automate the process of generating DOP exploits. We implemented a compiler called STEROIDS that leverages these techniques and compiles our high-level language SLANG into low-level DOP data structures driving malicious computations at run time. This enables an attacker to specify her intent in an application- and vulnerability-independent manner to maximize reusability. We demonstrate the effectiveness of our techniques and prototype implementation by specifying four programs of varying complexity in SLANG that calculate the Levenshtein distance, traverse a pointer chain to steal a private key, relocate a ROP chain, and perform a JIT-ROP attack. STEROIDS compiles each of those programs to low-level DOP data structures targeted at five different applications including GStreamer, Wireshark and ProFTPd, which have vastly different vulnerabilities and DOP instances. Ultimately, this shows that our compiler is versatile, can be used for both 32-bit and 64-bit applications, works across bug classes, and enables highly expressive attacks without conventional code-injection or code-reuse techniques in applications lacking a scripting engine.
2020-02-17
Ezick, James, Henretty, Tom, Baskaran, Muthu, Lethin, Richard, Feo, John, Tuan, Tai-Ching, Coley, Christopher, Leonard, Leslie, Agrawal, Rajeev, Parsons, Ben et al..  2019.  Combining Tensor Decompositions and Graph Analytics to Provide Cyber Situational Awareness at HPC Scale. 2019 IEEE High Performance Extreme Computing Conference (HPEC). :1–7.

This paper describes MADHAT (Multidimensional Anomaly Detection fusing HPC, Analytics, and Tensors), an integrated workflow that demonstrates the applicability of HPC resources to the problem of maintaining cyber situational awareness. MADHAT combines two high-performance packages: ENSIGN for large-scale sparse tensor decompositions and HAGGLE for graph analytics. Tensor decompositions isolate coherent patterns of network behavior in ways that common clustering methods based on distance metrics cannot. Parallelized graph analysis then uses directed queries on a representation that combines the elements of identified patterns with other available information (such as additional log fields, domain knowledge, network topology, whitelists and blacklists, prior feedback, and published alerts) to confirm or reject a threat hypothesis, collect context, and raise alerts. MADHAT was developed using the collaborative HPC Architecture for Cyber Situational Awareness (HACSAW) research environment and evaluated on structured network sensor logs collected from Defense Research and Engineering Network (DREN) sites using HPC resources at the U.S. Army Engineer Research and Development Center DoD Supercomputing Resource Center (ERDC DSRC). To date, MADHAT has analyzed logs with over 650 million entries.

2019-11-18
Dong, Yuhao, Kim, Woojung, Boutaba, Raouf.  2018.  Conifer: Centrally-Managed PKI with Blockchain-Rooted Trust. 2018 IEEE International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData). :1092–1099.
Secure naming systems, or more narrowly public key infrastructures (PKIs), form the basis of secure communications over insecure networks. All security guarantees against active attackers come from a trustworthy binding between user-facing names, such as domain names, to cryptographic identities, such as public keys. By offering a secure, distributed ledger with highly decentralized trust, blockchains such as Bitcoin show promise as the root of trust for naming systems with no central trusted parties. PKIs based upon blockchains, such as Namecoin and Blockstack, have greatly improved security and resilience compared to traditional centralized PKIs. Yet blockchain PKIs tend to significantly sacrifice scalability and flexibility in pursuit of decentralization, hindering large-scale deployability on the Internet. We propose Conifer, a novel PKI with an architecture based upon CONIKS, a centralized transparency-based PKI, and Catena, a blockchain-agnostic way of embedding a permissioned log, but with a different lookup strategy. In doing so, Conifer achieves decentralized trust with security at least as strong as existing blockchain-based naming systems, yet without sacrificing the flexibility and performance typically found in centralized PKIs. We also present our reference implementation of Conifer, demonstrating how it can easily be integrated into applications. Finally, we use experiments to evaluate the performance of Conifer compared with other naming systems, both centralized and blockchain-based, demonstrating that it incurs only a modest overhead compared to traditional centralized-trust systems while being far more scalable and performant than purely blockchain-based solutions.
2019-10-23
Chen, Jing, Yao, Shixiong, Yuan, Quan, He, Kun, Ji, Shouling, Du, Ruiying.  2018.  CertChain: Public and Efficient Certificate Audit Based on Blockchain for TLS Connections. IEEE INFOCOM 2018 - IEEE Conference on Computer Communications. :2060–2068.

In recent years, real-world attacks against PKI have taken place frequently. For example, certificates for malicious domains issued by compromised CAs are widespread, and revoked certificates are still trusted by clients. Despite much research on improving the security of SSL/TLS connections, some problems remain unsolved. On one hand, although log-based schemes provide a certificate audit service to quickly detect CAs' misbehavior, the security and data consistency of log servers are ignored. On the other hand, checking for revoked certificates is neglected due to incomplete, insecure and inefficient certificate revocation mechanisms. Further, existing schemes for checking revoked certificates are centralized, which introduces security bottlenecks. In this paper, we propose a blockchain-based public and efficient audit scheme for TLS connections, called CertChain. Specifically, we propose a dependability-rank based consensus protocol in our blockchain system and a new data structure to support certificate forward traceability. Furthermore, we present a method that utilizes a dual counting Bloom filter (DCBF), which eliminates false positives, to achieve economical space usage and efficient queries for certificate revocation checking. The security analysis and experimental results demonstrate that CertChain is suitable in practice with moderate overhead.
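
As a rough illustration of the kind of structure the abstract builds on, the following is a minimal sketch of a plain counting Bloom filter in Python. It is not the paper's DCBF and does not eliminate false positives; the class name, the parameters `num_counters` and `num_hashes`, and the salted SHA-256 hashing are all assumptions made for this example.

```python
import hashlib


class CountingBloomFilter:
    """Bloom filter with small counters instead of bits, so items can be removed.

    Illustrative only: a generic counting Bloom filter, not the DCBF of the paper.
    """

    def __init__(self, num_counters=1024, num_hashes=4):
        self.num_counters = num_counters
        self.num_hashes = num_hashes
        self.counters = [0] * num_counters

    def _indexes(self, item):
        # Derive num_hashes counter positions by salting SHA-256 with the hash number.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_counters

    def add(self, item):
        for idx in self._indexes(item):
            self.counters[idx] += 1

    def remove(self, item):
        # Only remove items that appear present, so counters never go negative.
        if item not in self:
            return False
        for idx in self._indexes(item):
            self.counters[idx] -= 1
        return True

    def __contains__(self, item):
        # May report false positives, but never false negatives for added items.
        return all(self.counters[idx] > 0 for idx in self._indexes(item))


# Hypothetical usage: track a revoked certificate serial, then drop it.
revoked = CountingBloomFilter()
revoked.add("serial-01")
print("serial-01" in revoked)
revoked.remove("serial-01")
print("serial-01" in revoked)
```

Counters rather than bits are what make deletion possible, at the cost of extra space per position; a membership test can still return a false positive, which is the problem the abstract says the DCBF addresses.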

2019-09-26
Khatchadourian, R., Tang, Y., Bagherzadeh, M., Ahmed, S..  2019.  Safe Automated Refactoring for Intelligent Parallelization of Java 8 Streams. 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE). :619–630.

Streaming APIs are becoming more pervasive in mainstream Object-Oriented programming languages. For example, the Stream API introduced in Java 8 allows for functional-like, MapReduce-style operations in processing both finite and infinite data structures. However, using this API efficiently involves subtle considerations like determining when it is best for stream operations to run in parallel, when running operations in parallel can be less efficient, and when it is safe to run in parallel due to possible lambda expression side-effects. In this paper, we present an automated refactoring approach that assists developers in writing efficient stream code in a semantics-preserving fashion. The approach, based on a novel data ordering and typestate analysis, consists of preconditions for automatically determining when it is safe and possibly advantageous to convert sequential streams to parallel and unorder or de-parallelize already parallel streams. The approach was implemented as a plug-in to the Eclipse IDE, uses the WALA and SAFE analysis frameworks, and was evaluated on 11 Java projects consisting of ≈642K lines of code. We found that 57 of 157 candidate streams (36.31%) were refactorable, and an average speedup of 3.49 on performance tests was observed. The results indicate that the approach is useful in optimizing stream code to its full potential.

2019-07-01
Ferreyra, N. E. Díaz, Meis, R., Heisel, M..  2018.  At Your Own Risk: Shaping Privacy Heuristics for Online Self-Disclosure. 2018 16th Annual Conference on Privacy, Security and Trust (PST). :1–10.

Revealing private and sensitive information on Social Network Sites (SNSs) like Facebook is a common practice which sometimes results in unwanted incidents for the users. One approach for helping users to avoid regrettable scenarios is through awareness mechanisms which inform a priori about the potential privacy risks of a self-disclosure act. Privacy heuristics are instruments which describe recurrent regrettable scenarios and can support the generation of privacy awareness. One important component of a heuristic is the group of people who should not access specific private information under a certain privacy risk. However, specifying an exhaustive list of unwanted recipients for a given regrettable scenario can be a tedious task which necessarily demands the user's intervention. In this paper, we introduce an approach based on decision trees to instantiate the audience component of privacy heuristics with minor intervention from the users. We introduce Disclosure-Acceptance Trees, a data structure representative of the audience component of a heuristic and describe a method for their generation out of user-centred privacy preferences.

2019-03-04
Han, C., Zhao, C., Zou, Z., Tang, H., You, J..  2018.  PATIP-TREE: An Efficient Method to Look up the Network Address Attribution Information. 2018 IEEE 20th International Conference on High Performance Computing and Communications; IEEE 16th International Conference on Smart City; IEEE 4th International Conference on Data Science and Systems (HPCC/SmartCity/DSS). :466–473.
IP address attribution information includes geographical information, network routing information, agency information, Internet Content Provider (ICP) information, etc. Nowadays, attribution information is important to network traffic engineering and needs to be obtained in real time in network traffic analysis systems. The existing methods for IP address attribution information lookup cannot be employed efficiently in actual systems due to their low scalability or poor performance. They cannot address the backbone network's requirements for real-time IP address attribution information lookup, and most lookup methods do not support custom IP address attribution lookup. In response to these challenges, we propose a novel high-speed approach for IP address attribution information lookup. We first devise a data structure, the IP address attribution information search tree (PATIP-TREE), to store custom IP address attribution information. Based on the PATIP-TREE, an effective algorithm for IP information lookup is proposed, which can support custom IP address attribution information lookup in real time. The experimental results show that our method outperforms the existing methods in terms of efficiency. Our approach also provides high scalability and is suitable for many kinds of network addresses, such as IPv4 addresses, IPv6 addresses, named data networking addresses, etc.
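
To make the idea of a tree-based attribution lookup concrete, here is a minimal sketch of a generic binary prefix trie with longest-prefix matching, written in Python. It is not the PATIP-TREE described in the abstract, whose internal layout is not given here; the class name `PrefixTrie`, its methods, and the sample prefixes and labels are assumptions made for this example.

```python
import ipaddress


class PrefixTrie:
    """Binary trie mapping CIDR prefixes to attribution info.

    Illustrative only: a generic longest-prefix-match trie, not the PATIP-TREE.
    """

    def __init__(self):
        # One root per IP version so IPv4 and IPv6 prefixes never collide.
        self.roots = {4: {}, 6: {}}

    def insert(self, cidr, info):
        network = ipaddress.ip_network(cidr)
        # Binary string of the network address, truncated to the prefix length.
        bits = format(int(network.network_address), f"0{network.max_prefixlen}b")
        node = self.roots[network.version]
        for bit in bits[: network.prefixlen]:
            node = node.setdefault(bit, {})
        node["info"] = info

    def lookup(self, address):
        addr = ipaddress.ip_address(address)
        bits = format(int(addr), f"0{addr.max_prefixlen}b")
        node = self.roots[addr.version]
        best = node.get("info")  # matches a /0 default entry, if one was inserted
        for bit in bits:
            node = node.get(bit)
            if node is None:
                break
            # Remember the most specific prefix seen so far along the path.
            best = node.get("info", best)
        return best


# Hypothetical custom attribution entries.
trie = PrefixTrie()
trie.insert("10.0.0.0/8", "org-a")
trie.insert("10.1.0.0/16", "org-b")
trie.insert("2001:db8::/32", "doc-v6")
print(trie.lookup("10.1.2.3"))
print(trie.lookup("10.2.0.1"))
print(trie.lookup("2001:db8::1"))
```

A lookup walks at most one node per address bit (32 for IPv4, 128 for IPv6) regardless of how many prefixes are stored, which is why tree-shaped indexes are a common starting point for real-time address lookup.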