Biblio

Filters: Keyword is metadata
2019-11-25
Rady, Mai, Abdelkader, Tamer, Ismail, Rasha.  2018.  SCIQ-CD: A Secure Scheme to Provide Confidentiality and Integrity of Query Results for Cloud Databases. 2018 14th International Computer Engineering Conference (ICENCO). :225–230.
Database outsourcing introduces a new paradigm, called Database as a Service (DBaaS). Database Service Providers (DSPs) have the ability to host outsourced databases and provide efficient facilities for their users. However, the data and the execution of database queries are under the control of the DSP, which is not always a trusted authority. Therefore, the problem is to ensure the security of the outsourced database. To address this problem, we propose a Secure scheme to provide Confidentiality and Integrity of Query results for Cloud Databases (SCIQ-CD). The performance analysis shows that our proposed scheme is secure and efficient for practical deployment.
2019-10-14
Koo, H., Chen, Y., Lu, L., Kemerlis, V. P., Polychronakis, M..  2018.  Compiler-Assisted Code Randomization. 2018 IEEE Symposium on Security and Privacy (SP). :461–477.

Despite decades of research on software diversification, only address space layout randomization has seen widespread adoption. Code randomization, an effective defense against return-oriented programming exploits, has remained an academic exercise mainly due to i) the lack of a transparent and streamlined deployment model that does not disrupt existing software distribution norms, and ii) the inherent incompatibility of program variants with error reporting, whitelisting, patching, and other operations that rely on code uniformity. In this work we present compiler-assisted code randomization (CCR), a hybrid approach that relies on compiler-rewriter cooperation to enable fast and robust fine-grained code randomization on end-user systems, while maintaining compatibility with existing software distribution models. The main concept behind CCR is to augment binaries with a minimal set of transformation-assisting metadata, which i) facilitate rapid fine-grained code transformation at installation or load time, and ii) form the basis for reversing any applied code transformation when needed, to maintain compatibility with existing mechanisms that rely on referencing the original code. We have implemented a prototype of this approach by extending the LLVM compiler toolchain, and developing a simple binary rewriter that leverages the embedded metadata to generate randomized variants using basic block reordering. The results of our experimental evaluation demonstrate the feasibility and practicality of CCR, as on average it incurs a modest file size increase of 11.46% and a negligible runtime overhead of 0.28%, while it is compatible with link-time optimization and control flow integrity.
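
To make the rewriter step more concrete, below is a minimal Python sketch of metadata-driven basic block reordering in the spirit of CCR: blocks carry compiler-recorded boundaries and fall-through successors, and the rewriter permutes them while inserting explicit jumps to preserve control flow. The block names and toy instruction strings are hypothetical, and this is not the authors' LLVM-based implementation.

```python
import random

# A toy model of the rewriter step: each basic block carries the metadata a
# compiler could emit (label, body, fall-through successor).  The labels and
# "instructions" below are made up for illustration.
BLOCKS = [
    {"label": "bb0", "body": ["cmp eax, 0", "je bb2"], "fallthrough": "bb1"},
    {"label": "bb1", "body": ["inc eax"],              "fallthrough": "bb2"},
    {"label": "bb2", "body": ["ret"],                  "fallthrough": None},
]

def randomize_layout(blocks, seed=None):
    """Reorder blocks (keeping the entry block first) and patch fall-throughs."""
    rng = random.Random(seed)
    entry, rest = blocks[0], blocks[1:]
    rng.shuffle(rest)
    layout = [entry] + rest
    out = []
    for i, blk in enumerate(layout):
        body = list(blk["body"])
        succ = blk["fallthrough"]
        next_label = layout[i + 1]["label"] if i + 1 < len(layout) else None
        # If the original fall-through successor is no longer adjacent,
        # make the control transfer explicit with a jump.
        if succ is not None and succ != next_label:
            body.append(f"jmp {succ}")
        out.append((blk["label"], body))
    return out

for label, body in randomize_layout(BLOCKS, seed=7):
    print(label + ":")
    for ins in body:
        print("    " + ins)
```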

2019-09-26
Nelmiawati, Arifandi, W..  2018.  A Seamless Secret Sharing Scheme Implementation for Securing Data in Public Cloud Storage Service. 2018 International Conference on Applied Engineering (ICAE). :1-5.

Public cloud data storage services are considered a potential alternative for storing low-cost digital data in the short term. They are offered by different providers on the Internet, and some providers offer limited free plans for users who are starting out with the service. However, data security concerns arise when the stored data is considered a valuable asset. This study explores the use of secret sharing schemes, Rabin's IDA and Shamir's SSA, to implement a tool called dCloud that protects files stored in public cloud storage in a seamless way. It addresses data security by hiding its complexities from ordinary non-technical users. The secret key for Rabin's IDA is automatically generated by dCloud in a secure random way. Shamir's SSA completes the process by dispersing the key into each of Rabin's IDA output files. Moreover, the hash value of the original file is added to each of those output files to confirm the integrity of the file during reconstruction. The authentication key used to communicate with each of the defined service providers during storage and reconstruction is stored in a local secure key-store. By holding a key to access the key-store, an ordinary non-technical user is able to use dCloud to store and retrieve the targeted file within the defined public cloud storage services securely.
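
As background for the key-dispersal step, the following is a minimal sketch of Shamir's secret sharing over a prime field, the kind of SSA construction dCloud could use to split the Rabin's IDA key across its output files. It is an illustration, not the authors' implementation; the parameters (5 shares, threshold 3) are arbitrary.

```python
import random

PRIME = 2**521 - 1  # a Mersenne prime, large enough for a 128-bit key

def split_secret(secret: int, n: int, k: int):
    """Split `secret` into n shares, any k of which reconstruct it (Shamir's SSA)."""
    coeffs = [secret] + [random.SystemRandom().randrange(PRIME) for _ in range(k - 1)]
    def f(x):
        return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
    return [(x, f(x)) for x in range(1, n + 1)]

def recover_secret(shares):
    """Lagrange interpolation at x = 0 over GF(PRIME)."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = (num * -xm) % PRIME
                den = (den * (xj - xm)) % PRIME
        secret = (secret + yj * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret

key = random.SystemRandom().randrange(2**128)   # e.g. the key used by Rabin's IDA
shares = split_secret(key, n=5, k=3)            # one share per IDA output file
assert recover_secret(shares[:3]) == key        # any 3 shares recover the key
```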

2019-09-23
Chen, W., Liang, X., Li, J., Qin, H., Mu, Y., Wang, J..  2018.  Blockchain Based Provenance Sharing of Scientific Workflows. 2018 IEEE International Conference on Big Data (Big Data). :3814–3820.
In a research community, the sharing of scientific workflow provenance can enhance distributed research cooperation, verification of experiment reproducibility, and the reuse of experiments. Considering that scientists in such a community are often loosely related and geographically distributed, traditional centralized provenance sharing architectures have shown disadvantages in trustworthiness, reliability and efficiency. They also make it difficult to protect the rights and interests of data providers. All of this has largely hindered the willingness of distributed scientists to share their workflow provenance. Considering the advantages of blockchain in decentralization, trustworthiness and high reliability, an approach to sharing scientific workflow provenance based on blockchain in a research community is proposed. To make the approach more practical, provenance is handled on-chain and original data is delivered off-chain. A block structure supporting efficient provenance storage and retrieval is designed, and an algorithm for scientists to search workflow segments in the provenance as well as an algorithm for experiment backtracking are provided to enhance the sharing of experiment results and to save computing resources and time by avoiding repeated experiments as far as possible. Analyses show that the approach is efficient and effective.
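
A toy sketch of the on-chain/off-chain split described above: blocks hold workflow provenance records plus a digest that points to the off-chain data, so retrieved data can later be verified against the chain. The field names and block layout are illustrative assumptions, not the paper's design.

```python
import hashlib, json, time

def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

class Block:
    """A toy block holding workflow-provenance records; raw data stays off-chain,
    only its digest is recorded so integrity can be checked on retrieval."""
    def __init__(self, prev_hash, records):
        self.prev_hash = prev_hash
        self.timestamp = time.time()
        self.records = records          # provenance entries (on-chain)
        self.hash = sha256(json.dumps(
            [prev_hash, self.timestamp, records], sort_keys=True).encode())

chain = [Block("0" * 64, [])]           # genesis block

def record_step(workflow_id, step, inputs, output_bytes):
    """Append one workflow step; the output itself would be stored off-chain."""
    rec = {
        "workflow": workflow_id,
        "step": step,
        "inputs": inputs,                       # digests of consumed artifacts
        "output_digest": sha256(output_bytes),  # pointer to off-chain data
    }
    chain.append(Block(chain[-1].hash, [rec]))
    return rec["output_digest"]

d1 = record_step("wf-42", "align_sequences", [], b"...alignment result...")
d2 = record_step("wf-42", "build_tree", [d1], b"...phylogenetic tree...")
print(len(chain), chain[-1].records[0]["step"])
```
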
Yazici, I. M., Karabulut, E., Aktas, M. S..  2018.  A Data Provenance Visualization Approach. 2018 14th International Conference on Semantics, Knowledge and Grids (SKG). :84–91.
In recent years, data provenance has created an emerging requirement for technologies that enable end users to access, evaluate, and act on the provenance of data. In the era of Big Data, the amount of data created by corporations around the world has grown each year. As an example, in both the Social Media and e-Science domains, data is growing at an unprecedented rate. As the data has grown rapidly, information on the origin and lifecycle of the data has also grown. In turn, this requires technologies that enable the clarification and interpretation of data through the use of data provenance. This study proposes methodologies for the visualization of provenance data compatible with the W3C PROV-O specification. The visualizations are done by summarization and comparison of the data provenance. We facilitated the testing of these methodologies by providing a prototype, extending an existing open source visualization tool. We discuss the usability of the proposed methodologies with an experimental study; our initial results show that the proposed approach is usable, and its processing overhead is negligible.
2019-09-04
Paiker, N., Ding, X., Curtmola, R., Borcea, C..  2018.  Context-Aware File Discovery System for Distributed Mobile-Cloud Apps. 2018 IEEE International Conference on Cloud Computing Technology and Science (CloudCom). :198–203.
Recent research has proposed middleware to enable efficient distributed apps over mobile-cloud platforms. This paper presents a Context-Aware File Discovery Service (CAFDS) that allows distributed mobile-cloud applications to find and access files of interest shared by collaborating users. CAFDS enables programmers to search for files defined by context and content features, such as location, creation time, or the presence of certain object types within an image file. CAFDS provides low latency through a cloud-based metadata server, which uses a decision tree to locate the nearest files that satisfy the context and content features requested by applications. We implemented CAFDS in Android and Linux. Experimental results show CAFDS achieves substantially lower latency than peer-to-peer solutions that cannot leverage context information.
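
For illustration, a simplified stand-in for the metadata server's lookup: records carry context and content features, and a query filters on those features and ranks the survivors by distance to the client. The record fields and the crude filter-then-sort logic are hypothetical assumptions; the paper itself uses a decision tree for this step.

```python
import math

# Illustrative metadata records a CAFDS-style server might keep; field names are
# hypothetical, not taken from the paper.
FILES = [
    {"path": "/alice/img_001.jpg", "lat": 40.74, "lon": -74.00,
     "created": "2018-05-01", "objects": {"car", "person"}},
    {"path": "/bob/img_017.jpg",   "lat": 40.71, "lon": -74.01,
     "created": "2018-05-02", "objects": {"dog"}},
    {"path": "/carol/note.txt",    "lat": 37.77, "lon": -122.42,
     "created": "2018-04-30", "objects": set()},
]

def _km(lat1, lon1, lat2, lon2):
    # rough equirectangular distance, good enough for ranking nearby files
    x = math.radians(lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = math.radians(lat2 - lat1)
    return 6371 * math.hypot(x, y)

def discover(lat, lon, required_objects=frozenset(), after=None):
    """Filter by content/context features, then rank by distance to the client."""
    hits = [f for f in FILES
            if required_objects <= f["objects"]
            and (after is None or f["created"] >= after)]
    return sorted(hits, key=lambda f: _km(lat, lon, f["lat"], f["lon"]))

for f in discover(40.73, -73.99, required_objects={"person"}):
    print(f["path"])
```
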
Lawson, M., Lofstead, J..  2018.  Using a Robust Metadata Management System to Accelerate Scientific Discovery at Extreme Scales. 2018 IEEE/ACM 3rd International Workshop on Parallel Data Storage Data Intensive Scalable Computing Systems (PDSW-DISCS). :13–23.
Our previous work, which can be referred to as EMPRESS 1.0, showed that rich metadata management provides a relatively low-overhead approach to facilitating insight from scale-up scientific applications. However, this system did not provide the functionality needed for a viable production system or address whether such a system could scale. Therefore, we have extended our previous work to create EMPRESS 2.0, which incorporates the features required for a useful production system. Through a discussion of EMPRESS 2.0, this paper explores how to incorporate rich query functionality, fault tolerance, and atomic operations into a scalable, storage-system-independent metadata management system that is easy to use. This paper demonstrates that such a system offers significant performance advantages over HDF5, providing metadata querying that is 150X to 650X faster, and can greatly accelerate post-processing. Finally, since the current implementation of EMPRESS 2.0 relies on an RDBMS, this paper demonstrates that an RDBMS is a viable technology for managing data-oriented metadata.
Maltitz, M. von, Smarzly, S., Kinkelin, H., Carle, G..  2018.  A management framework for secure multiparty computation in dynamic environments. NOMS 2018 - 2018 IEEE/IFIP Network Operations and Management Symposium. :1–7.
Secure multiparty computation (SMC) is a promising technology for privacy-preserving collaborative computation. In the last years several feasibility studies have shown its practical applicability in different fields. However, it is recognized that the administration and management overhead of SMC solutions is still a problem. A vital next step is the incorporation of SMC in the emerging fields of the Internet of Things and (smart) dynamic environments. The properties of these contexts make the utilization of SMC even more challenging, since some vital premises for its application regarding environmental stability and preliminary configuration are not initially fulfilled. We bridge this gap by providing FlexSMC, a management and orchestration framework for SMC which supports the discovery of nodes, establishes trust between them, and makes SMC sessions robust by handling node failures and communication interruptions. The practical evaluation of FlexSMC shows that it enables the application of SMC in dynamic environments with reasonable performance penalties and computation durations that allow soft real-time and interactive use cases.
Vanjari, M. S. P., Balsaraf, M. K. P..  2018.  Efficient Exploration of Algorithm in Scholarly Big Data Document. 2018 International Conference on Information, Communication, Engineering and Technology (ICICET). :1–5.
Algorithms are developed, analyzed, and applied throughout the computer field and are used for developing new applications. They are used for finding solutions to problems under different conditions, by transforming problems into algorithmic ones to which standard algorithms can be applied. The number of scholarly digital documents is increasing day by day. AlgorithmSeer is a search engine for algorithms; its main aim is to provide a large algorithm database. It automatically discovers and extracts algorithms from this big collection of documents, enabling algorithm indexing, searching, discovery, and analysis. An original set of scalable techniques used by AlgorithmSeer to identify and extract algorithm representations from a big collection of scholarly documents is proposed. In addition, the platform lets anyone, at different levels of knowledge, access and highlight particularly important and relevant textual content. In support of lectures and self-learning, the highlighted documents can be shared with others. However, learners at different levels cannot use the same highlighted text with the same level of understanding. We address the problem of predicting new highlights for partially highlighted documents.
Xiong, M., Li, A., Xie, Z., Jia, Y..  2018.  A Practical Approach to Answer Extraction for Constructing QA Solution. 2018 IEEE Third International Conference on Data Science in Cyberspace (DSC). :398–404.
Question answering (QA) systems play an increasingly important role in the Internet age. Internet users increasingly rely on QA to obtain knowledge and solve problems, especially in the modern agricultural field. However, the quality of answers in QA varies widely with the expertise of the answering agricultural expert, so answer quality assessment is important. Due to the lexical gap between questions and answers, the existing approaches are not quite satisfactory. A practical approach, RCAS, is proposed to rank the candidate answers; it utilizes support sets to reduce the impact of the lexical gap between questions and answers. First, similar questions are retrieved and support sets are produced from their high-quality answers. Based on the assumption that high-quality answers have intrinsic similarity, the quality of candidate answers is then evaluated through their distance from the support sets. Second, unlike existing approaches, prior knowledge from similar question-answer pairs is used to bridge the straight lexical and semantic gaps between questions and answers. Experiments are conducted on approximately 0.15 million question-answer pairs about agriculture, dietetics and food from Yahoo! Answers. The results show that our approach can rank the candidate answers more precisely.
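
A toy illustration of the support-set idea: high-quality answers to similar questions form the support set, and candidate answers are ranked by their average similarity to it. The bag-of-words cosine used here is only a stand-in for the paper's actual distance measure, and the example strings are invented.

```python
import math
from collections import Counter

def vectorize(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_candidates(candidates, support_set):
    """Score each candidate by its mean similarity to the support-set answers
    (high-quality answers to similar questions), then sort best-first."""
    support_vecs = [vectorize(s) for s in support_set]
    def score(ans):
        v = vectorize(ans)
        return sum(cosine(v, s) for s in support_vecs) / len(support_vecs)
    return sorted(candidates, key=score, reverse=True)

support = ["apply nitrogen fertilizer after the first irrigation",
           "use split doses of nitrogen during tillering"]
candidates = ["water the field twice a week",
              "apply nitrogen in split doses after irrigation"]
print(rank_candidates(candidates, support)[0])
```
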
2019-08-05
Hiremath, S., Kunte, S. R..  2018.  Ensuring Cloud Data Security Using Public Auditing with Privacy Preserving. 2018 3rd International Conference on Communication and Electronics Systems (ICCES). :1100-1104.

Cloud computing, in simple terms, is storing and accessing data through the Internet. The data stored in the cloud is managed by cloud service providers. Storing data in the cloud saves users time and storage. But once a user stores data in the cloud, he loses control over his data. Hence several security issues must be handled to keep users' data safe in the cloud. In this work, we propose a secure auditing system using a Third Party Auditor (TPA). We use the Advanced Encryption Standard (AES) algorithm for encrypting users' data and the Secure Hash Algorithm (SHA-2) to compute the message digest. The system is executed in the Amazon EC2 cloud by creating a Windows Server instance. The results obtained demonstrate that our proposed work is safe and audits the files in a reasonable time.
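
A minimal sketch of the client-side preparation this scheme implies: encrypt the file with AES and compute a SHA-2 digest that the third-party auditor can later re-check without seeing the plaintext. It uses AES-GCM from the Python cryptography package as one possible realization and is not the authors' implementation.

```python
import hashlib, os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM  # pip install cryptography

def prepare_for_upload(plaintext: bytes, key: bytes):
    """Encrypt with AES (GCM mode here) and compute a SHA-256 digest of the
    ciphertext that a third-party auditor could later re-check."""
    nonce = os.urandom(12)
    ciphertext = AESGCM(key).encrypt(nonce, plaintext, None)
    digest = hashlib.sha256(ciphertext).hexdigest()
    return nonce, ciphertext, digest

def audit(ciphertext: bytes, expected_digest: str) -> bool:
    """What the TPA does: recompute the digest without ever seeing the plaintext."""
    return hashlib.sha256(ciphertext).hexdigest() == expected_digest

key = AESGCM.generate_key(bit_length=256)
nonce, ct, digest = prepare_for_upload(b"quarterly-report.xlsx contents", key)
assert audit(ct, digest)                   # file is intact
assert not audit(ct + b"tamper", digest)   # modification is detected
```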

2019-07-01
Rasin, A., Wagner, J., Heart, K., Grier, J..  2018.  Establishing Independent Audit Mechanisms for Database Management Systems. 2018 IEEE International Symposium on Technologies for Homeland Security (HST). :1-7.

The pervasive use of databases for the storage of critical and sensitive information in many organizations has led to an increase in the rate at which databases are exploited in computer crimes. While there are several techniques and tools available for database forensic analysis, such tools usually assume an a priori database preparation, such as relying on tamper-detection software to already be in place and the use of detailed logging. Further, such tools are built-in and thus can be compromised or corrupted along with the database itself. In practice, investigators need forensic and security audit tools that work on poorly configured systems and make no assumptions about the extent of damage or malicious hacking in a database. In this paper, we present our database forensics methods, which are capable of examining database content from a storage (disk or RAM) image without using any log or file system metadata. We describe how these methods can be used to detect security breaches in an untrusted environment where the security threat arose from a privileged user (or someone who has obtained such privileges). Finally, we argue that a comprehensive and independent audit framework is necessary in order to detect and counteract threats in an environment where the security breach originates from an administrator (either at the database or operating system level).

2019-05-08
Ölvecký, M., Gabriška, D..  2018.  Wiping Techniques and Anti-Forensics Methods. 2018 IEEE 16th International Symposium on Intelligent Systems and Informatics (SISY). :000127–000132.

This paper presents the theoretical background of our main research activity, which focuses on the evaluation of wiping/erasure standards that are mostly implemented in specific software products developed and programmed for data wiping. The information saved on storage devices often consists of metadata and trace data. These kinds of data in particular are very important in the process of forensic analysis because they sometimes contain information about interconnections with other files. Most people save their sensitive information on their local storage devices and later want to securely erase these files, but there is usually a problem with this operation. Secure file destruction is one of many anti-forensic methods. The outcome of this paper is to define future research activities focused on the establishment of a suitable digital environment. This environment will be prepared for testing and evaluating selected wiping standards and appropriate eraser software.
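
For context, a naive single-stream sketch of what wiping software does at its core: overwrite the file contents in place before unlinking. Real wiping standards prescribe specific patterns, pass counts, and verification, and in-place overwriting is unreliable on SSDs and other wear-levelled media, so this is only an illustration.

```python
import os

def wipe_file(path, passes=3):
    """Overwrite a file's contents in place before unlinking it; a crude take on
    what wiping standards prescribe (specific patterns and verification passes
    differ between standards and are omitted here)."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            f.write(os.urandom(size))
            f.flush()
            os.fsync(f.fileno())   # force the overwrite out of the page cache
    os.remove(path)
```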

2019-03-06
Suwansrikham, P., She, K..  2018.  Asymmetric Secure Storage Scheme for Big Data on Multiple Cloud Providers. 2018 IEEE 4th International Conference on Big Data Security on Cloud (BigDataSecurity), IEEE International Conference on High Performance and Smart Computing, (HPSC) and IEEE International Conference on Intelligent Data and Security (IDS). :121-125.

Cloud computing is an emerging technology that goes hand in hand with big data. Due to the enormous size of big data, it is impossible to store it in local storage. Alternatively, even if we wanted to store it locally, we would have to spend a great deal of money to build a big data center. One way to save money is to store big data in a cloud storage service, which provides users with space and security to store their files. However, relying on a single cloud storage provider may cause trouble for the customer: a cloud storage provider (CSP) may stop its service at any time, so it is too risky for a data owner to host his file with only a single CSP. Also, the CSP is a third party that the user has to trust without verification. After deploying his file to the CSP, the user does not know who accesses his file. Even if the CSP provides a security mechanism to prevent outsider attacks, how can the user ensure that there is no insider attack that steals or corrupts the file? This research proposes a way to minimize the risk and ensure data privacy as well as access control. The big data file is split into chunks and distributed to multiple cloud storage providers. Even if there is an insider attack, the attacker gets only part of the file and cannot reconstruct the whole file. After splitting the file, metadata is generated. The metadata keeps the chunk information, including chunk locations, access paths, and the data owner's username and password for connecting to each CSP. An asymmetric security concept is applied in this research: the metadata is encrypted and transferred to the user who requests access to the file. File access, monitoring, and metadata transfer are functions of dew computing, an intermediate server between the users and the cloud service.
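
A minimal sketch of the described flow under stated assumptions: the file is split into chunks assigned to different providers, chunk metadata is built (locations and digests here; the paper also includes access credentials), and that metadata is encrypted for the requesting user with an asymmetric construction, realized below as hybrid RSA-OAEP plus a symmetric key, using the Python cryptography package. Provider names and metadata fields are invented.

```python
import hashlib, json, os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa
from cryptography.fernet import Fernet   # pip install cryptography

PROVIDERS = ["csp-a", "csp-b", "csp-c"]   # stand-ins for real cloud accounts

def split_and_describe(data: bytes, chunk_size: int = 32):
    """Split the file into chunks, assign each to a provider, and build the
    metadata the dew-computing server would keep (locations + digests)."""
    chunks, meta = [], []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        chunks.append(chunk)
        meta.append({"index": i // chunk_size,
                     "provider": PROVIDERS[(i // chunk_size) % len(PROVIDERS)],
                     "sha256": hashlib.sha256(chunk).hexdigest()})
    return chunks, meta

def encrypt_metadata_for(user_public_key, meta):
    """Hybrid encryption: a fresh symmetric key protects the metadata blob and
    is itself wrapped with the requesting user's RSA public key."""
    sym_key = Fernet.generate_key()
    blob = Fernet(sym_key).encrypt(json.dumps(meta).encode())
    wrapped = user_public_key.encrypt(
        sym_key,
        padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                     algorithm=hashes.SHA256(), label=None))
    return wrapped, blob

user_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
chunks, meta = split_and_describe(os.urandom(100))
wrapped_key, blob = encrypt_metadata_for(user_key.public_key(), meta)

# The user unwraps the symmetric key with his private key and reads the metadata.
sym = user_key.decrypt(wrapped_key,
      padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                   algorithm=hashes.SHA256(), label=None))
print(json.loads(Fernet(sym).decrypt(blob))[0])
```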

2019-03-04
Alsadhan, A. F., Alhussein, M. A..  2018.  Deleted Data Attribution in Cloud Computing Platforms. 2018 1st International Conference on Computer Applications Information Security (ICCAIS). :1–6.
The introduction of Cloud-based storage represents one of the most discussed challenges among digital forensic professionals. In a 2014 report, the National Institute of Standards and Technology (NIST) highlighted the various forensic challenges created as a consequence of sharing storage area among cloud users. One critical issue discussed in the report is how to recognize a file's owner after the file has been deleted. When a file is deleted, the cloud system also deletes the file metadata. After metadata has been deleted, no one can know who owned the file. This critical issue has introduced some difficulties in the deleted data acquisition process. For example, if a cloud user accidentally deletes a file, it is difficult to recover the file. More importantly, it is even more difficult to identify the actual cloud user that owned the file. In addition, forensic investigators encounter numerous obstacles if a deleted file was to be used as evidence against a crime suspect. Unfortunately, few studies have been conducted to solve this matter. As a result, this work presents our proposed solution to the challenge of attributing deleted files to their specific users. We call this the “user signature” approach. This approach aims to enhance the deleted data acquisition process in cloud computing environments by specifically attributing files to the corresponding user.
2019-02-25
Lekshmi, M. B., Deepthi, V. R..  2018.  Spam Detection Framework for Online Reviews Using Hadoop’s Computational Capability. 2018 International CET Conference on Control, Communication, and Computing (IC4). :436–440.
Nowadays, online reviews have become one of the vital elements for customers doing online shopping. Organizations and individuals use this information to buy the right products and make business decisions. This has influenced spammers and unethical business people to create false reviews and promote their products to beat the competition. Sophisticated systems are developed by spammers to create bulk spam reviews on any website within hours. To tackle this problem, studies have been conducted to formulate effective ways to detect spam reviews. Various spam detection methods have been introduced, most of which extract meaningful features from the text or use machine learning techniques. These approaches give little importance to the type of extracted features and the processing rate. NetSpam [1] defines a framework which can classify a review dataset based on spam features and map them to a spam detection procedure that performs better than previous works in predictive accuracy. In this work, a method is proposed that can improve the processing rate by applying a distributed approach to the review dataset using MapReduce. Parallel programming with MapReduce is used for processing big data in Hadoop. The solution involves parallelising the algorithm defined in NetSpam, and it defines a spam detection procedure with better predictive accuracy and processing rate.
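
A pure-Python map/reduce illustration of the parallelisation idea: mappers emit per-review spam-feature scores and a reducer aggregates them per reviewer. The multiprocessing pool merely simulates Hadoop mappers, and the feature weights are made up; this is not the NetSpam feature set.

```python
from collections import defaultdict
from multiprocessing import Pool

REVIEWS = [
    {"reviewer": "u1", "text": "best product ever buy now", "rating": 5, "verified": False},
    {"reviewer": "u1", "text": "best ever buy now cheap",   "rating": 5, "verified": False},
    {"reviewer": "u2", "text": "battery life is decent for daily use", "rating": 3, "verified": True},
]

def map_review(review):
    """Mapper: emit (reviewer, partial spam score) from simple, made-up features."""
    score = 0.0
    if review["rating"] in (1, 5):       score += 0.3   # extreme rating
    if not review["verified"]:           score += 0.4   # unverified purchase
    if len(review["text"].split()) < 6:  score += 0.3   # very short review
    return review["reviewer"], score

def reduce_scores(pairs):
    """Reducer: average the partial scores per reviewer."""
    acc = defaultdict(list)
    for reviewer, score in pairs:
        acc[reviewer].append(score)
    return {r: sum(v) / len(v) for r, v in acc.items()}

if __name__ == "__main__":
    with Pool(2) as pool:                       # stands in for Hadoop mappers
        pairs = pool.map(map_review, REVIEWS)
    print(reduce_scores(pairs))                 # e.g. {'u1': 1.0, 'u2': 0.0}
```
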
2018-11-14
Sommers, Joel, Durairajan, Ramakrishnan, Barford, Paul.  2017.  Automatic Metadata Generation for Active Measurement. Proceedings of the 2017 Internet Measurement Conference. :261–267.

Empirical research in the Internet is fraught with challenges. Among these is the possibility that local environmental conditions (e.g., CPU load or network load) introduce unexpected bias or artifacts in measurements that lead to erroneous conclusions. In this paper, we describe a framework for local environment monitoring that is designed to be used during Internet measurement experiments. The goals of our work are to provide a critical, expanded perspective on measurement results and to improve the opportunity for reproducibility of results. We instantiate our framework in a tool we call SoMeta, which monitors the local environment during active probe-based measurement experiments. We evaluate the runtime costs of SoMeta and conduct a series of experiments in which we intentionally perturb different aspects of the local environment during active probe-based measurements. Our experiments show how simple local monitoring can readily expose conditions that bias active probe-based measurement results. We conclude with a discussion of how our framework can be expanded to provide metadata for a broad range of Internet measurement experiments.
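
A small sketch of the kind of local-environment metadata such a framework collects around an active measurement, sampled here with the psutil package while a probe command runs. It is an illustration of the idea, not the SoMeta tool itself.

```python
import json, subprocess, threading, time
import psutil   # pip install psutil

def sample(stop, out, period=1.0):
    """Record CPU, memory and NIC counters once per period while a measurement runs."""
    while not stop.is_set():
        net = psutil.net_io_counters()
        out.append({"t": time.time(),
                    "cpu_pct": psutil.cpu_percent(interval=None),
                    "mem_pct": psutil.virtual_memory().percent,
                    "bytes_sent": net.bytes_sent,
                    "bytes_recv": net.bytes_recv})
        time.sleep(period)

def run_with_metadata(cmd):
    """Run an active measurement (e.g. ping) and attach host-condition metadata."""
    samples, stop = [], threading.Event()
    t = threading.Thread(target=sample, args=(stop, samples))
    t.start()
    result = subprocess.run(cmd, capture_output=True, text=True)
    stop.set(); t.join()
    return {"command": cmd, "returncode": result.returncode,
            "stdout": result.stdout, "environment_samples": samples}

if __name__ == "__main__":
    record = run_with_metadata(["ping", "-c", "3", "example.org"])
    print(json.dumps(record["environment_samples"], indent=2))
```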

2018-09-05
Gaikwad, V. S., Gandle, K. S..  2017.  Ideal complexity cryptosystem with high privacy data service for cloud databases. 2017 1st International Conference on Intelligent Systems and Information Management (ICISIM). :267–270.

Data storage in the cloud should come with high safety and confidentiality. It is the responsibility of the cloud service provider to guarantee the availability and security of client data. Various alternatives exist for storage services, but confidentiality and complexity solutions for database as a service are still not satisfactory. The proposed system gives an alternative solution for database as a service that integrates the benefits of different services along with advanced encryption techniques. It makes it possible to apply concurrent operations on encrypted data. This alternative also allows dispersed clients to connect directly, eliminating the intermediate proxy and thereby gaining simplicity. The performance of the proposed system is evaluated on the basis of theoretical analysis.

2018-03-19
Jemel, M., Msahli, M., Serhrouchni, A..  2017.  Towards an Efficient File Synchronization between Digital Safes. 2017 IEEE 31st International Conference on Advanced Information Networking and Applications (AINA). :136–143.
One of the main concerns of cloud storage solutions is to offer availability to the end user. Thus, addressing mobility needs and device variety has emerged as a major challenge. First, data should be synchronized automatically and continuously when the user moves from one device to another. Second, the cloud service should offer the owner the possibility to share data with specific users. The paper's goal is to develop a secure framework that ensures file synchronization with high quality and minimal resource consumption. As a first step towards this goal, we propose the SyncDS protocol with its associated architecture. The efficiency of the synchronization protocol comes from the choice of networking protocol as well as the strategy for detecting changes between two versions of a file system located on different devices. Our experimental results show that adopting a hierarchical hash tree to detect changes between two file systems and adopting the WebSocket protocol for data exchange improve the efficiency of the synchronization protocol.
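
A minimal sketch of the hierarchical-hash-tree change detection idea under stated assumptions: file digests are folded into directory digests, so two devices can compare trees and sync only the paths whose digests differ. This is not the SyncDS protocol itself, and the WebSocket transport is omitted.

```python
import hashlib, os

def build_tree(root):
    """Return {relative path: digest}; a directory's digest folds in its children,
    so equal directory digests mean the whole subtree is unchanged."""
    tree = {}
    def walk(path):
        rel = os.path.relpath(path, root)
        if os.path.isfile(path):
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
        else:
            h = hashlib.sha256()
            for name in sorted(os.listdir(path)):
                h.update(name.encode())
                h.update(walk(os.path.join(path, name)).encode())
            digest = h.hexdigest()
        tree[rel] = digest
        return digest
    walk(root)
    return tree

def changed_paths(local, remote):
    """Paths whose digests differ (or exist on one side only) and must be synced."""
    return sorted(p for p in set(local) | set(remote) if local.get(p) != remote.get(p))

# usage: compare two snapshots of the same directory taken on different devices
# print(changed_paths(build_tree("/tmp/device_a"), build_tree("/tmp/device_b")))
```
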
Ukwandu, E., Buchanan, W. J., Russell, G..  2017.  Performance Evaluation of a Fragmented Secret Share System. 2017 International Conference On Cyber Situational Awareness, Data Analytics And Assessment (Cyber SA). :1–6.
There are many risks in moving data into public storage environments, along with an increasing threat of large-scale data leakage. Secret sharing schemes have been proposed as a keyless and resilient mechanism to mitigate this, but scaling across large data infrastructures has remained the bane of using secret sharing in big data storage and retrieval. This work applies secret sharing methods as used in cryptography to create robust and secure data storage and retrieval in conjunction with data fragmentation. It outlines two different methods of distributing data equally to storage locations as well as recovering them in a manner that ensures consistent data availability irrespective of file size and type. Our experiments consist of two different methods: data shares and key shares. Using our experimental results, we were able to validate previous work on the effects of the threshold on file recovery. The results obtained also reveal the varying effects of writing shares to, and retrieving them from, storage locations other than computer memory. The implication is that increasing the fragment size at varying file and threshold sizes adds overhead to share creation rather than to file recovery, underscoring the importance of choosing a suitable fragment size as the file size increases.
2018-02-21
Lim, H., Ni, A., Kim, D., Ko, Y. B..  2017.  Named data networking testbed for scientific data. 2017 2nd International Conference on Computer and Communication Systems (ICCCS). :65–69.

Named Data Networking (NDN) is one of the future Internet architectures and a clean-slate approach. NDN provides intelligent data retrieval using the principles of name-based symmetrical forwarding of Interest/Data packets and in-network caching. The continually increasing demand for rapid dissemination of large-scale scientific data is driving the use of NDN in data-intensive science experiments. In this paper, we establish an intercontinental NDN testbed. On the testbed, an NDN-based application targeting climate science as an example data-intensive science application is designed and implemented, with differentiated features compared to those of previous studies. We verify the experimental justification of using NDN for climate science in the intercontinental network through performance comparisons between classical delivery techniques and NDN-based climate data delivery.

2018-02-15
Pan, J., Mao, X..  2017.  Detecting DOM-Sourced Cross-Site Scripting in Browser Extensions. 2017 IEEE International Conference on Software Maintenance and Evolution (ICSME). :24–34.

In recent years, with the advances in JavaScript engines and the adoption of HTML5 APIs, web applications begin to show a tendency to shift their functionality from the server side towards the client side, resulting in dense and complex interactions with HTML documents using the Document Object Model (DOM). As a consequence, client-side vulnerabilities become more and more prevalent. In this paper, we focus on DOM-sourced Cross-site Scripting (XSS), which is a kind of severe but not well-studied vulnerability appearing in browser extensions. Compared with conventional DOM-based XSS, a new attack surface is introduced by DOM-sourced XSS where the DOM could become a vulnerable source as well, besides common sources such as URLs and form inputs. To discover such vulnerability, we propose a detecting framework employing hybrid analysis with two phases. The first phase is the lightweight static analysis consisting of a text filter and an abstract syntax tree parser, which produces potential vulnerable candidates. The second phase is the dynamic symbolic execution with an additional component named shadow DOM, generating a document as a proof-of-concept exploit. In our large-scale real-world experiment, 58 previously unknown DOM-sourced XSS vulnerabilities were discovered in user scripts of the popular browser extension Greasemonkey.
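
To make the first (lightweight static) phase more concrete, a toy text filter is sketched below: it flags user scripts that both read from DOM-derived sources and write to code/markup sinks. The source and sink pattern lists are illustrative assumptions and far cruder than the paper's AST parsing and symbolic execution.

```python
import re

# Illustrative patterns only: DOM reads that can act as attacker-controlled
# sources in user scripts, and sinks where tainted strings become code/markup.
DOM_SOURCES = [r"\.textContent", r"\.innerText", r"\.getAttribute\(",
               r"document\.title", r"\.innerHTML\b(?!\s*=)"]
SINKS = [r"\.innerHTML\s*=", r"\beval\(", r"\.insertAdjacentHTML\(",
         r"document\.write\(", r"setTimeout\(\s*[\"']"]

def flag_candidates(script_text):
    """Phase-1 style text filter: report scripts that read from the DOM and also
    write to a code/markup sink.  Real analysis would parse the AST and track flows."""
    hits_src = [p for p in DOM_SOURCES if re.search(p, script_text)]
    hits_sink = [p for p in SINKS if re.search(p, script_text)]
    return {"sources": hits_src, "sinks": hits_sink,
            "candidate": bool(hits_src and hits_sink)}

userscript = """
var title = document.querySelector('h1').textContent;
document.getElementById('banner').innerHTML = '<b>' + title + '</b>';
"""
print(flag_candidates(userscript))
```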

2018-02-14
Yang, Y., Liu, X., Deng, R. H., Weng, J..  2017.  Flexible Wildcard Searchable Encryption System. IEEE Transactions on Services Computing. PP:1–1.

Searchable encryption is an important technique for public cloud storage services to provide user data confidentiality protection and at the same time allow users to perform keyword search over their encrypted data. Previous schemes only deal with exact or fuzzy keyword search to correct some spelling errors. In this paper, we propose a new wildcard searchable encryption system to support wildcard keyword queries which has several highly desirable features. First, our system allows multiple keywords search in which any queried keyword may contain zero, one or two wildcards, and a wildcard may appear in any position of a keyword and represent any number of symbols. Second, it supports simultaneous search on multiple data owners' data using only one trapdoor. Third, it provides flexible user authorization and revocation to effectively manage search and decryption privileges. Fourth, it is constructed based on homomorphic encryption rather than Bloom filter and hence completely eliminates the false probability caused by Bloom filter. Finally, it achieves a high level of privacy protection since matching results are unknown to the cloud server in the test phase. The proposed system is thoroughly analyzed and is proved secure. Extensive experimental results indicate that our system is efficient compared with other existing wildcard searchable encryption schemes in the public key setting.

2018-01-23
Kolosnjaji, B., Eraisha, G., Webster, G., Zarras, A., Eckert, C..  2017.  Empowering convolutional networks for malware classification and analysis. 2017 International Joint Conference on Neural Networks (IJCNN). :3838–3845.

Performing large-scale malware classification is increasingly becoming a critical step in malware analytics as the number and variety of malware samples are rapidly growing. Statistical machine learning constitutes an appealing method to cope with this increase as it can use mathematical tools to extract information out of large-scale datasets and produce interpretable models. This has motivated a surge of scientific work in developing machine learning methods for detection and classification of malicious executables. However, an optimal method for extracting the most informative features for different malware families, with the final goal of malware classification, is yet to be found. Fortunately, neural networks have evolved to the state that they can surpass the limitations of other methods in terms of hierarchical feature extraction. Consequently, neural networks can now offer superior classification accuracy in many domains such as computer vision and natural language processing. In this paper, we transfer the performance improvements achieved in the area of neural networks to model the execution sequences of disassembled malicious binaries. We implement a neural network that consists of convolutional and feedforward neural constructs. This architecture embodies a hierarchical feature extraction approach that combines convolution of n-grams of instructions with plain vectorization of features derived from the headers of the Portable Executable (PE) files. Our evaluation results demonstrate that our approach outperforms baseline methods, such as simple Feedforward Neural Networks and Support Vector Machines, as we achieve 93% precision and recall, even in the case of obfuscation in the data.
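
A compact PyTorch sketch of the described architecture: a convolution over embedded instruction n-grams combined with a feedforward branch over PE-header features, fused into a single classifier. Layer sizes and vocabulary are arbitrary choices for illustration, not the paper's configuration.

```python
import torch
import torch.nn as nn

class MalwareConvNet(nn.Module):
    """Convolution over instruction sequences + MLP over PE-header features,
    fused into one classifier (layer sizes are illustrative)."""
    def __init__(self, vocab_size=256, embed_dim=32, n_header_feats=16, n_classes=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # kernel_size=3 ~ convolving over 3-grams of instructions
        self.conv = nn.Sequential(
            nn.Conv1d(embed_dim, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveMaxPool1d(1))
        self.header_mlp = nn.Sequential(nn.Linear(n_header_feats, 32), nn.ReLU())
        self.classifier = nn.Sequential(nn.Linear(64 + 32, 64), nn.ReLU(),
                                        nn.Linear(64, n_classes))

    def forward(self, instr_ids, header_feats):
        x = self.embed(instr_ids).transpose(1, 2)   # (batch, embed_dim, seq_len)
        x = self.conv(x).squeeze(-1)                # (batch, 64)
        h = self.header_mlp(header_feats)           # (batch, 32)
        return self.classifier(torch.cat([x, h], dim=1))

model = MalwareConvNet()
instr = torch.randint(0, 256, (4, 500))     # 4 samples, 500 disassembled opcodes each
header = torch.randn(4, 16)                 # 4 samples of PE-header features
print(model(instr, header).shape)           # torch.Size([4, 10])
```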

2017-12-28
Luo, S., Wang, Y., Huang, W., Yu, H..  2016.  Backup and Disaster Recovery System for HDFS. 2016 International Conference on Information Science and Security (ICISS). :1–4.

HDFS has been widely used for storing massive scale data which is vulnerable to site disaster. File system backup is an important strategy for data retention. In this paper, we present an efficient, easy-to-use Backup and Disaster Recovery System for HDFS. The system includes a client based on HDFS with an additional feature of remote backup, and a remote server with an HDFS cluster to keep the backup data. It supports full backups and regular incremental backups to the server with very low cost and high throughput. In our experiment, the average speed of backup and recovery is up to 95 MB/s, approaching the theoretical maximum speed of gigabit Ethernet.
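
A minimal sketch of the full/incremental logic described: a full backup copies everything, while an incremental backup copies only files whose checksums changed since the previous snapshot. Local directories stand in for the HDFS client and the remote backup cluster; this is not the authors' system.

```python
import hashlib, os, shutil

def checksum(path):
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()

def snapshot(root):
    """Map relative file path -> checksum for every file under root."""
    out = {}
    for dirpath, _, names in os.walk(root):
        for name in names:
            full = os.path.join(dirpath, name)
            out[os.path.relpath(full, root)] = checksum(full)
    return out

def backup(src, dst, previous=None):
    """Full backup when `previous` is None, otherwise copy only changed/new files.
    Returns the new snapshot so it can seed the next incremental run."""
    current = snapshot(src)
    previous = previous or {}
    for rel, digest in current.items():
        if previous.get(rel) != digest:
            target = os.path.join(dst, rel)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            shutil.copy2(os.path.join(src, rel), target)
    return current

# usage: state = backup("/data/hdfs_export", "/backups/full")          # full
#        state = backup("/data/hdfs_export", "/backups/incr1", state)  # incremental
```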