Biblio
Partitional Clustering Algorithm (PCA) on the Hadoop Distributed File System is to perform big data securities using the Perturbation Technique is the main idea of the proposed work. There are numerous clustering methods available that are used to categorize the information from the big data. PCA discovers the cluster based on the initial partition of the data. In this approach, it is possible to develop a security safeguarding of data that is impoverished to allow the calculations and communication. The performances were analyzed on Health Care database under the studies of various parameters like precision, accuracy, and F-score measure. The outcome of the results is to demonstrate that this method is used to decrease the complication in preserving privacy and better accuracy than that of the existing techniques.
Big data processing systems are becoming increasingly more present in cloud workloads. Consequently, they are starting to incorporate more sophisticated mechanisms from traditional database and distributed systems. We focus in this work on the use of caching policies, which for big data raise important new challenges. Not only they must respond to new variants of the trade-off between hit rate, response time, and the space consumed by the cache, but they must do so at possibly higher volume and velocity than web and database workloads. Previous caching policies have not been tested experimentally with big data workloads. We address these challenges in this work. We propose the Read Density family of policies, which is a principled approach to quantify the utility of cached objects through a family of utility functions that depend on the frequency of reads of an object. We further design the Approximate Histogram, which is a policy-based technique based on an array of counters. This technique promises to achieve runtime-space efficient computation of the metric required by the cache policy. We evaluate through trace-based simulation the caching policies from the Read Density family, and compare them with over ten state-of-the-art alternatives. We use two workload traces representative for big data processing, collected from commercial Spark and MapReduce deployments. While we achieve comparable performance to the state-of-art with less parameters, meaningful performance improvement for big data workloads remain elusive.
A blockchain is a distributed ledger forming a distributed consensus on a history of transactions, and is the underlying technology for the Bitcoin cryptocurrency. However, its applications are far beyond the financial sector. The transaction verification process for cryptocurrencies is much slower than traditional digital transaction systems. One approach to increase transaction speed and scalability is to identify a solution that offers faster Proof of Work. In this paper, we propose a method for accelerating the process of Proof of Work based on parallel mining rather than solo mining. The goal is to ensure that no more than two or more miners put the same effort into solving a specific block. The proposed method includes a process for selection of a manager, distribution of work and a reward system. This method has been implemented in a test environment that contains all the characteristics needed to perform Proof of Work for Bitcoin and has been tested, using a variety of case scenarios, by varying the difficulty level and number of validators. Preliminary results show improvement in the scalability of Proof of Work up to 34% compared to the current system.
Functionally safe control logic design without full duplication is difficult due to the complexity of random control logic. The Reorder buffer (ROB) is a control logic function commonly used in high performance computing systems. In this study, we focus on a safe ROB design used in an industry quality Network-on-Chip (NoC) Advanced eXtensible Interface (AXI) Network Interface (NI) block. We developed and applied area efficient safe design techniques including partial duplication, Error Detection Code (EDC) and invariance checking with formal proofs and showed that we can achieve a desired safe Diagnostic Coverage (DC) requirement with small area and power overheads and no performance degradation.
Science gateways bring out the possibility of reproducible science as they are integrated into reusable techniques, data and workflow management systems, security mechanisms, and high performance computing (HPC). We introduce BioinfoPortal, a science gateway that integrates a suite of different bioinformatics applications using HPC and data management resources provided by the Brazilian National HPC System (SINAPAD). BioinfoPortal follows the Software as a Service (SaaS) model and the web server is freely available for academic use. The goal of this paper is to describe the science gateway and its usage, addressing challenges of designing a multiuser computational platform for parallel/distributed executions of large-scale bioinformatics applications using the Brazilian HPC resources. We also present a study of performance and scalability of some bioinformatics applications executed in the HPC environments and perform machine learning analyses for predicting features for the HPC allocation/usage that could better perform the bioinformatics applications via BioinfoPortal.
The current paper proposes a method to combine the theoretical concepts of the parallel processing created by the DNA computing and GA environments, with the effectiveness novel mechanism of the distinction and discover of the cryptosystem keys. Three-level contributions to the current work, the first is the adoption of a final key sequence mechanism by the principle of interconnected sequence parts, the second to exploit the principle of the parallel that provides GA in the search for the counter value of the sequences of the challenge to the mechanism of the discrimination, the third, the most important and broadening the breaking of the cipher, is the harmony of the principle of the parallelism that has found via the DNA computing to discover the basic encryption key. The proposed method constructs a combined set of files includes binary sequences produced from substitution of the guess attributes of the binary equations system of the cryptosystem, as well as generating files that include all the prospects of the DNA strands for all successive cipher characters, the way to process these files to be obtained from the first character file, where extract a key sequence of each sequence from mentioned file and processed with the binary sequences that mentioned the counter produced from GA. The aim of the paper is exploitation and implementation the theoretical principles of the parallelism that providing via biological environment with the new sequences recognition mechanism in the cryptanalysis.
The article analyzes the concept of "Resilience" in relation to the development of computing. The strategy for reacting to perturbations in this process can be based either on "harsh Resistance" or "smarter Elasticity." Our "Models" are descriptive in defining the path of evolutionary development as structuring under the perturbations of the natural order and enable the analysis of the relationship among models, structures and factors of evolution. Among those, two features are critical: parallelism and "fuzziness", which to a large extent determine the rate of change of computing development, especially in critical applications. Both reversible and irreversible development processes related to elastic and resistant methods of problem solving are discussed. The sources of perturbations are located in vicinity of the resource boundaries, related to growing problem size with progress combined with the lack of computational "checkability" of resources i.e. data with inadequate models, methodologies and means. As a case study, the problem of hidden faults caused by the growth, the deficit of resources, and the checkability of digital circuits in critical applications is analyzed.
The Rowhammer bug is a novel micro-architectural security threat, enabling powerful privilege-escalation attacks on various mainstream platforms. It works by actively flipping bits in Dynamic Random Access Memory (DRAM) cells with unprivileged instructions. In order to set up Rowhammer against binaries in the Linux page cache, the Waylaying algorithm has previously been proposed. The Waylaying method stealthily relocates binaries onto exploitable physical addresses without exhausting system memory. However, the proof-of-concept Waylaying algorithm can be easily detected during page cache eviction because of its high disk I/O overhead and long running time. This paper proposes the more advanced Memway algorithm, which improves on Waylaying in terms of both I/O overhead and speed. Running time and disk I/O overhead are reduced by 90% by utilizing Linux tmpfs and inmemory swapping to manage eviction files. Furthermore, by combining Memway with the unprivileged posix fadvise API, the binary relocation step is made 100 times faster. Equipped with our Memway+fadvise relocation scheme, we demonstrate practical Rowhammer attacks that take only 15-200 minutes to covertly relocate a victim binary, and less than 3 seconds to flip the target instruction bit.
This paper describes MADHAT (Multidimensional Anomaly Detection fusing HPC, Analytics, and Tensors), an integrated workflow that demonstrates the applicability of HPC resources to the problem of maintaining cyber situational awareness. MADHAT combines two high-performance packages: ENSIGN for large-scale sparse tensor decompositions and HAGGLE for graph analytics. Tensor decompositions isolate coherent patterns of network behavior in ways that common clustering methods based on distance metrics cannot. Parallelized graph analysis then uses directed queries on a representation that combines the elements of identified patterns with other available information (such as additional log fields, domain knowledge, network topology, whitelists and blacklists, prior feedback, and published alerts) to confirm or reject a threat hypothesis, collect context, and raise alerts. MADHAT was developed using the collaborative HPC Architecture for Cyber Situational Awareness (HACSAW) research environment and evaluated on structured network sensor logs collected from Defense Research and Engineering Network (DREN) sites using HPC resources at the U.S. Army Engineer Research and Development Center DoD Supercomputing Resource Center (ERDC DSRC). To date, MADHAT has analyzed logs with over 650 million entries.
Cloud computing is the major paradigm in today's IT world with the capabilities of security management, high performance, flexibility, scalability. Customers valuing these features can better benefit if they use a cloud environment built using HPC fabric architecture. However, security is still a major concern, not only on the software side but also on the hardware side. There are multiple studies showing that the malicious users can affect the regular customers through the hardware if they are co-located on the same physical system. Therefore, solving possible security concerns on the HPC fabric architecture will clearly make the fabric industries leader in this area. In this paper, we propose an autonomic HPC fabric architecture that leverages both resilient computing capabilities and adaptive anomaly analysis for further security.
This paper presents a computational platform for dynamic security assessment (DSA) of large electricity grids, developed as part of the iTesla project. It leverages High Performance Computing (HPC) to analyze large power systems, with many scenarios and possible contingencies, thus paving the way for pan-European operational stability analysis. The results of the DSA are summarized by decision trees of 11 stability indicators. The platform's workflow and parallel implementation architecture is described in detail, including the way commercial tools are integrated into a plug-in architecture. A case study of the French grid is presented, with over 8000 scenarios and 1980 contingencies. Performance data of the case study (using 10,000 parallel cores) is analyzed, including task timings and data flows. Finally, the generated decision trees are compared with test data to quantify the functional performance of the DSA platform.