Biblio
Public key cryptography plays an important role in secure communication over insecure channels. Elliptic curve cryptography, a variant of public key cryptography, has been used extensively for this purpose over the last decades. In this paper, we present a software tool for the parallel generation of cryptographic keys based on elliptic curves. The binary method for point multiplication and C++ threads were used in the parallel implementation, while the secp256k1 elliptic curve was used for testing. The obtained results show a speedup of 30% over the sequential solution with 8 threads. The results are briefly discussed in the paper.
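To make the binary method concrete, here is a minimal C++ sketch (not the authors' code): double-and-add scalar multiplication on a toy curve y^2 = x^3 + 7 over GF(97), with one std::thread deriving each public key. The tiny prime, the base point (1, 28), the sample private keys, and the one-key-per-thread strategy are illustrative assumptions; real secp256k1 work needs 256-bit field arithmetic and constant-time code.

    #include <cstdint>
    #include <cstdio>
    #include <optional>
    #include <thread>
    #include <utility>
    #include <vector>

    using u64 = std::uint64_t;
    constexpr u64 P = 97;                                  // toy prime field (secp256k1 uses a 256-bit prime)
    using Point = std::optional<std::pair<u64, u64>>;      // std::nullopt = point at infinity

    u64 modpow(u64 b, u64 e, u64 m) {                      // square-and-multiply, used for the modular inverse
        u64 r = 1; b %= m;
        for (; e; e >>= 1, b = b * b % m)
            if (e & 1) r = r * b % m;
        return r;
    }
    u64 inv(u64 a) { return modpow(a, P - 2, P); }         // Fermat inverse, P prime

    Point add(Point A, Point B) {                          // affine point addition / doubling (curve parameter a = 0)
        if (!A) return B;
        if (!B) return A;
        auto [x1, y1] = *A;
        auto [x2, y2] = *B;
        if (x1 == x2 && (y1 + y2) % P == 0) return std::nullopt;
        u64 s = (x1 == x2)
            ? 3 * x1 % P * x1 % P * inv(2 * y1 % P) % P        // tangent slope
            : (y2 + P - y1) % P * inv((x2 + P - x1) % P) % P;  // chord slope
        u64 x3 = (s * s % P + 2 * P - x1 - x2) % P;
        u64 y3 = (s * ((x1 + P - x3) % P) % P + P - y1) % P;
        return std::make_pair(x3, y3);
    }

    Point mul(u64 k, Point G) {                            // binary method: scan bits, double, conditionally add
        Point R = std::nullopt;
        for (int i = 63; i >= 0; --i) {
            R = add(R, R);                                 // double
            if ((k >> i) & 1) R = add(R, G);               // add
        }
        return R;
    }

    int main() {
        Point G = std::pair<u64, u64>{1, 28};              // 28^2 = 1^3 + 7 (mod 97), so G lies on the curve
        std::vector<u64> priv = {11, 23, 35, 47, 59, 71, 83, 95};  // one sample private key per thread
        std::vector<Point> pub(priv.size());
        std::vector<std::thread> pool;
        for (std::size_t i = 0; i < priv.size(); ++i)      // each thread derives its own public key
            pool.emplace_back([&, i] { pub[i] = mul(priv[i], G); });
        for (auto& t : pool) t.join();
        for (std::size_t i = 0; i < priv.size(); ++i)
            if (pub[i])
                std::printf("k=%llu -> (%llu, %llu)\n",
                            (unsigned long long)priv[i],
                            (unsigned long long)pub[i]->first,
                            (unsigned long long)pub[i]->second);
    }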
Information visualization applications have become ubiquitous, in no small part thanks to the ease of wide distribution and deployment to users enabled by the web browser. Scientific visualization applications, relying on native code libraries and parallel processing, have been less suited to such widespread distribution, as browsers do not provide the required libraries or compute capabilities. In this paper, we revisit this gap in visualization technologies and explore how new web technologies, WebAssembly and WebGPU, can be used to deploy powerful visualization solutions for large-scale scientific data in the browser. In particular, we evaluate the programming effort required to bring scientific visualization applications to the browser through these technologies and assess their competitiveness against classic native solutions. As a main example, we present a new GPU-driven isosurface extraction method for block-compressed data sets that is suitable for interactive isosurface computation on large volumes in resource-constrained environments, such as the browser. We conclude that web browsers are on the verge of becoming a competitive platform for even the most demanding scientific visualization tasks, such as interactive visualization of isosurfaces from a 1 TB DNS simulation. We call on researchers and developers to consider investing in a community software stack to ease the use of these upcoming browser features and bring accessible scientific visualization to the browser.
In recent years, almost all real-world operations have been transferred to the cyber world, and the computers involved connect with each other via the Internet. As a result, the number of security breaches is increasing, and network administrators cannot protect their networks from all types of attacks. Although most of these attacks can be prevented with firewalls, encryption mechanisms, access controls, and password protection mechanisms, the emergence of new types of attacks means that a dynamic intrusion detection mechanism is always needed in the information security market. To keep an Intrusion Detection System (IDS) dynamic, it should be updated using a modern learning mechanism. The neural network approach is one of the most widely preferred algorithms for training such a system. However, with the increasing power of parallel computing and the use of big data for training, deep learning has emerged as a new concept and has been applied to many modern real-world problems. Therefore, in this paper, we propose an IDS that uses GPU-powered deep learning algorithms. The experimental results were collected on the widely used KDD99 dataset and show that the GPU speeds up training by up to 6.48 times, depending on the number of hidden layers and the nodes within them. Additionally, we compare different optimizers to help researchers select the best one for their ongoing or future research.
The National Airspace System (NAS), as a portion of the US transportation system, has not yet begun to model or adopt the integration of Artificial Intelligence (AI) technology. However, users of the NAS, e.g., air transport operators, UAS operators, etc., are beginning to use this technology throughout their operations. At issue within the broader aviation marketplace is the continued search for a solution set to the persistent daily delays and schedule perturbations that occur within the NAS. Despite billions invested through the NAS Modernization Program, the delays persist in the face of reduced demand for commercial routings. Every delay represents an economic loss to commercial transport operators, passengers, freighters, and any business depending on transportation performance. Therefore, the FAA needs to begin to address, from an advanced-concepts perspective, what this wave of new technology will affect as it is brought to bear on various operational performance parameters, including safety, security, efficiency, and resiliency. This paper is the first in a series of papers we are developing to explore the application of AI in the National Airspace System (NAS). This first paper is meant to get everyone in the aviation community on the same page, a primer if you will, to start the technical discussions. It defines AI; the capabilities associated with AI; current use cases within the aviation ecosystem; and how to prepare for the insertion of AI in the NAS. The next papers in the series will look at NAS operations theory utilizing AI capabilities, eventually leading to a future intelligent NAS (iNAS) environment.
Image encryption is an essential part of visual cryptography. Existing traditional sequential encryption techniques are infeasible for real-time applications. High-performance reformulations of such methods have grown rapidly over the last decade and have demonstrated better performance than their sequential counterparts. A rotational encryption scheme encrypts images in such a way that decryption is possible with the rotated encrypted images. An existing parallel rotational encryption technique makes use of a high-performance device but under-utilizes the optimizations such devices offer. We propose a rotational image encryption technique that makes use of the memory coalescing provided by the Compute Unified Device Architecture (CUDA). The proposed scheme achieves improved global memory utilization and increased efficiency.
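To illustrate the memory-coalescing idea this abstract relies on, here is a hedged CUDA sketch (not the paper's kernel, and with a placeholder XOR-with-key-stream "encryption" step rather than the rotational scheme): consecutive threads read and write consecutive pixels, so each warp's global-memory accesses collapse into a few coalesced transactions.

    #include <cstdio>
    #include <cuda_runtime.h>

    // Placeholder cipher: XOR each pixel with a per-pixel key byte.
    // Thread i touches element i, so a warp's 32 accesses are contiguous (coalesced).
    __global__ void encryptCoalesced(const unsigned char* in, const unsigned char* key,
                                     unsigned char* out, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) out[i] = in[i] ^ key[i];
    }

    int main() {
        const int n = 1 << 20;                              // e.g. a 1024x1024 8-bit image
        unsigned char *in, *key, *out;
        cudaMallocManaged(&in, n);                          // unified memory keeps the sketch short
        cudaMallocManaged(&key, n);
        cudaMallocManaged(&out, n);
        for (int i = 0; i < n; ++i) { in[i] = i & 0xFF; key[i] = (i * 7) & 0xFF; }

        int block = 256, grid = (n + block - 1) / block;
        encryptCoalesced<<<grid, block>>>(in, key, out, n); // one thread per pixel
        cudaDeviceSynchronize();

        std::printf("out[0]=%d out[1]=%d\n", (int)out[0], (int)out[1]);
        cudaFree(in); cudaFree(key); cudaFree(out);
    }

An uncoalesced variant (e.g., thread i touching element (i * stride) % n) would perform the same arithmetic but issue many more memory transactions, which is the gap a coalesced formulation targets.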
The Internet of Things (IoT) and mobile systems nowadays are required to perform increasingly intensive computation, such as facial detection, image recognition, and even remote gaming. Due to their limited computation performance and power budget, it is sometimes impossible to perform these workloads locally. As high-performance GPUs become more common in the cloud, offloading the computation to the cloud becomes a possible choice. However, because offloaded workloads from different devices (belonging to different users) are computed in the same cloud, security concerns arise. Side channel attacks on GPU systems have been widely studied under a threat model in which the attacker and the victim run on the same operating system. Recently, major GPU vendors have provided hardware and library support to virtualize GPUs for better isolation among users. This work studies side channel attacks from one virtual machine to another where both share the same physical GPU. We show that it is possible to infer another user's activities in this setup and even to steal another user's deep learning model.
Nowadays, the security of data is becoming more and more important, as people store a great deal of personal information on their phones. This stored information requires security and privacy. Encryption algorithms have become the main means of maintaining data security, and thus algorithmic complexity and encryption efficiency have become the main measures of whether an encryption algorithm is safe or not. With the development of hardware, we now have many tools to improve such algorithms. Because the modular exponentiation in the RSA algorithm can be divided into several parts mathematically, this paper introduces an approach that divides the encryption process and offloads it to the graphics processing unit (GPU). By using the GPU's capacity for parallel computing, the core of RSA can be accelerated using the central processing unit (CPU) and GPU together. Compute Unified Device Architecture (CUDA) is a platform that combines the CPU and GPU to realize GPU parallel programming, and it is the tool we use in our experiments on accelerating the RSA algorithm. This paper also builds a mathematical model to help understand the mechanism of the RSA encryption algorithm.
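As a hedged illustration of the kind of decomposition described above (not the paper's actual model): writing the exponent d in 4-bit chunks, c^d mod n factors into independent partial powers c^(d_j * 16^j) mod n, so each GPU thread can compute one partial and the CPU multiplies the partials back together. The textbook parameters (n = 3233, e = 17, d = 2753) and the chunking scheme are illustrative assumptions; real RSA needs multi-precision (e.g., 2048-bit) arithmetic.

    #include <cstdio>
    #include <cuda_runtime.h>

    using u64 = unsigned long long;

    __host__ __device__ u64 modpow(u64 b, u64 e, u64 m) {    // square-and-multiply
        u64 r = 1; b %= m;
        for (; e; e >>= 1, b = b * b % m)
            if (e & 1) r = r * b % m;
        return r;
    }

    // Thread j computes c^(d_j * 16^j) mod n, where d_j is the j-th 4-bit chunk of d.
    __global__ void partialPowers(u64 c, u64 d, u64 n, u64* part, int chunks) {
        int j = blockIdx.x * blockDim.x + threadIdx.x;
        if (j < chunks) {
            u64 dj = (d >> (4 * j)) & 0xF;
            part[j] = modpow(c, dj << (4 * j), n);
        }
    }

    int main() {
        const u64 n = 3233, e = 17, d = 2753;                // toy textbook RSA key pair
        const u64 msg = 65;
        u64 c = modpow(msg, e, n);                           // encrypt on the host

        const int chunks = 3;                                // d = 2753 fits in 12 bits = three 4-bit chunks
        u64* part;
        cudaMallocManaged(&part, chunks * sizeof(u64));
        partialPowers<<<1, chunks>>>(c, d, n, part, chunks); // one thread per chunk
        cudaDeviceSynchronize();

        u64 m = 1;                                           // recombine: product of partial powers = c^d mod n
        for (int j = 0; j < chunks; ++j) m = m * part[j] % n;
        std::printf("cipher=%llu recovered=%llu expected=%llu\n", c, m, msg);
        cudaFree(part);
    }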
To solve the problems associated with real-time processing of large data volumes, heterogeneous systems using various computing devices are increasingly used. A characteristic of this class of problems is that there are two directions for improving methods of real-time data analysis: the first is the development of algorithms and approaches to analysis, and the second is the development of hardware and software. This article reviews the main approaches to the architecture of a hardware-software solution for traffic capture and deep packet inspection (DPI) in data transmission networks with a bandwidth of 80 Gbit/s and higher. At the moment, there are software and hardware tools that allow the architecture of a capture and deep packet inspection system to be designed in three ways: 1) using only the central processing unit (CPU); 2) using only the graphics processing unit (GPU); 3) using the central processing unit and graphics processing unit simultaneously (CPU + GPU). In this paper, we consider these key approaches. Attention is also paid to both the hardware and software requirements for the architecture of such solutions. Pain points and remedies are described.
Malicious applications have become increasingly numerous. This demands adaptive, learning-based techniques for constructing malware detection engines, instead of traditional manual-based strategies. Prior work on learning-based malware detection engines primarily focuses on dynamic trace analysis and byte-level n-grams. Our approach in this paper differs in that we use compiler intermediate representations, i.e., the call graph representation of binaries. Using graph-based program representations for learning provides the structure of the program, which can be used to learn more advanced patterns. We use the Shortest Path Graph Kernel (SPGK) to identify similarities between call graphs extracted from binaries. The output similarity matrix is fed into a Support Vector Machine (SVM) algorithm to construct highly accurate models that predict whether a binary is malicious or not. However, SPGK is computationally expensive due to the size of the input graphs. Therefore, we evaluate different parallelization methods for CPUs and GPUs to speed up this kernel, allowing us to continuously construct up-to-date models in a timely manner. Our hybrid implementation, which leverages both CPU and GPU, yields the best performance, achieving up to a 14.2x improvement over our already optimized OpenMP version. We compared our generated graph-based models to previous state-of-the-art feature-vector 2-gram and 3-gram models on a dataset consisting of over 22,000 binaries. We show that our classification accuracy using graphs is over 19% higher than either n-gram model and gives a false positive rate (FPR) of less than 0.1%. We are also able to consider large call graphs and dataset sizes because of the reduced execution time of our parallelized SPGK implementation.
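For intuition, here is a simplified, sequential C++ sketch of a shortest-path graph kernel (not the paper's SPGK, which also compares vertex labels and is parallelized for CPU/GPU): each graph is reduced to its all-pairs shortest-path matrix with Floyd-Warshall, and the kernel counts pairs of node pairs whose finite path lengths match; the resulting kernel (similarity) matrix over many graphs is what would be fed to the SVM. The toy "call graphs" in main are illustrative assumptions.

    #include <cstdio>
    #include <limits>
    #include <utility>
    #include <vector>

    using Matrix = std::vector<std::vector<int>>;
    constexpr int INF = std::numeric_limits<int>::max() / 2;

    // All-pairs shortest paths for an unweighted, undirected graph on n nodes.
    Matrix floydWarshall(const std::vector<std::pair<int, int>>& edges, int n) {
        Matrix d(n, std::vector<int>(n, INF));
        for (int i = 0; i < n; ++i) d[i][i] = 0;
        for (auto [u, v] : edges) d[u][v] = d[v][u] = 1;
        for (int k = 0; k < n; ++k)
            for (int i = 0; i < n; ++i)
                for (int j = 0; j < n; ++j)
                    if (d[i][k] + d[k][j] < d[i][j]) d[i][j] = d[i][k] + d[k][j];
        return d;
    }

    // Delta kernel on shortest-path lengths: count node pairs with equal finite distance.
    long long spKernel(const Matrix& d1, const Matrix& d2) {
        long long k = 0;
        for (std::size_t i = 0; i < d1.size(); ++i)
            for (std::size_t j = i + 1; j < d1.size(); ++j)
                for (std::size_t u = 0; u < d2.size(); ++u)
                    for (std::size_t v = u + 1; v < d2.size(); ++v)
                        if (d1[i][j] < INF && d1[i][j] == d2[u][v]) ++k;
        return k;
    }

    int main() {
        // Two toy "call graphs": a path 0-1-2-3 and a star centred at 0.
        Matrix a = floydWarshall({{0, 1}, {1, 2}, {2, 3}}, 4);
        Matrix b = floydWarshall({{0, 1}, {0, 2}, {0, 3}}, 4);
        std::printf("K(path,star)=%lld  K(path,path)=%lld\n", spKernel(a, b), spKernel(a, a));
    }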
The advancement of technology has changed how people work and what software and hardware they use. From the conventional personal computer to the GPU, hardware technology and capability have improved dramatically, and so have the operating systems that run on them. Unfortunately, current industry practice for comparing operating systems takes a single perspective: it either benchmarks hardware-level performance or performs penetration testing to check the security features of an OS. This rigid method of benchmarking does not really reflect the true performance of an OS, as the performance analysis is neither comprehensive nor conclusive. To illustrate this deficiency, the study performed hardware-level and operational-level benchmarking on Windows XP, Windows 7, and Windows 8, and the results indicate that there are instances where Windows XP excels over its newer counterparts. Overall, the research shows that Windows 8 is a superior OS in comparison to its predecessors running on the same hardware. Furthermore, the findings also show that automated benchmarking tools are less effective at benchmarking systems that run on Windows XP and older OSs, as these do not support DirectX 11 and other advanced features that the hardware supports. Hence the need for a unified benchmarking approach that compares other aspects of an OS, such as user-oriented tasks and security parameters, to provide a complete comparison. Therefore, this paper proposes a unified approach for Operating System (OS) comparisons with the help of a Windows OS case study. This unified approach compares an OS from three aspects: hardware-level performance, operational-level performance, and security tests.