Biblio
Hardware information flow analysis detects security vulnerabilities resulting from unintended design flaws, timing channels, and hardware Trojans. These information flow models are typically generated in a general way, which includes a significant amount of redundancy that is irrelevant to the specified security properties. In this work, we propose a property specific approach for information flow security. We create information flow models tailored to the properties to be verified by performing a property specific search to identify security critical paths. This helps find suspicious signals that require closer inspection and quickly eliminates portions of the design that are free of security violations. Our property specific trimming technique reduces the complexity of the security model; this accelerates security verification and restricts potential security violations to a smaller region which helps quickly pinpoint hardware security vulnerabilities.
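The property-specific search described above can be illustrated as a reachability problem on a signal dependency graph. The following is a minimal sketch under stated assumptions, not the authors' tool: the graph encoding, the function names, and the toy design are all hypothetical. It keeps only signals that lie on some path from a sensitive source to an observable sink and trims everything else.

```python
from collections import deque

def reachable(graph, start):
    """Return every node reachable from start by following graph edges."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def reverse(graph):
    """Build the graph with every edge direction flipped."""
    rev = {}
    for src, dsts in graph.items():
        for dst in dsts:
            rev.setdefault(dst, []).append(src)
    return rev

def critical_signals(graph, source, sink):
    """Signals on at least one source-to-sink path (hypothetical helper).

    A signal is kept only if the source can influence it AND it can
    influence the sink; all other signals are irrelevant to the property.
    """
    forward = reachable(graph, source)
    backward = reachable(reverse(graph), sink)
    return forward & backward

# Hypothetical toy design: edges point from a driver to the signals it drives.
DESIGN = {
    "key": ["round_reg"],
    "plaintext": ["round_reg"],
    "round_reg": ["cipher_out", "debug_bus"],
    "debug_bus": ["dbg_out"],
    "clk_div": ["led"],
}

if __name__ == "__main__":
    # Property under test: "key must not flow to dbg_out".
    print(sorted(critical_signals(DESIGN, "key", "dbg_out")))
```

An empty result means the property holds structurally for that source and sink pair, so that part of the design needs no further model generation.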
Support vector machines (SVMs) have been widely used for classification in machine learning and data mining. However, SVMs face a huge challenge in large-scale classification tasks. Recent progress has enabled additive-kernel versions of SVM to solve such large-scale problems nearly as fast as a linear classifier. This paper proposes a new accelerated mini-batch stochastic gradient descent algorithm for SVM classification with additive kernels (AK-ASGD). On the one hand, the gradient is approximated by the sum of a scalar polynomial function for each feature dimension; on the other hand, Nesterov's acceleration strategy is used. The experimental results on benchmark large-scale classification data sets show that the proposed algorithm achieves higher testing accuracy and a faster convergence rate.
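To make the acceleration step concrete, here is a minimal sketch of mini-batch SGD with Nesterov momentum applied to a plain linear SVM with hinge loss. It is an illustration only: the additive-kernel polynomial gradient approximation of AK-ASGD is not reproduced, and the function names, hyperparameters, and toy data are assumptions.

```python
import random

def hinge_subgradient(w, batch, lam):
    """Subgradient of lam/2 * ||w||^2 + mean hinge loss over the mini-batch."""
    grad = [lam * wi for wi in w]
    for x, y in batch:
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        if margin < 1.0:  # only margin violators contribute to the hinge term
            for j, xj in enumerate(x):
                grad[j] -= y * xj / len(batch)
    return grad

def nesterov_svm(data, dim, lam=0.01, lr=0.1, momentum=0.9,
                 epochs=50, batch_size=4, seed=0):
    """Train a linear SVM with Nesterov-accelerated mini-batch SGD."""
    rng = random.Random(seed)
    data = list(data)
    w = [0.0] * dim
    v = [0.0] * dim  # velocity (momentum buffer)
    for _ in range(epochs):
        rng.shuffle(data)
        for i in range(0, len(data), batch_size):
            batch = data[i:i + batch_size]
            # Nesterov: evaluate the gradient at the look-ahead point w + momentum * v.
            lookahead = [wi + momentum * vi for wi, vi in zip(w, v)]
            g = hinge_subgradient(lookahead, batch, lam)
            v = [momentum * vi - lr * gi for vi, gi in zip(v, g)]
            w = [wi + vi for wi, vi in zip(w, v)]
    return w

def predict(w, x):
    """Return the predicted label, +1 or -1."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1

# Hypothetical toy data: (features with a trailing bias term 1.0, label).
DATA = [
    ((2.0, 2.0, 1.0), 1), ((3.0, 1.0, 1.0), 1),
    ((2.0, 3.0, 1.0), 1), ((3.0, 3.0, 1.0), 1),
    ((-2.0, -2.0, 1.0), -1), ((-3.0, -1.0, 1.0), -1),
    ((-2.0, -3.0, 1.0), -1), ((-3.0, -3.0, 1.0), -1),
]

if __name__ == "__main__":
    w = nesterov_svm(DATA, dim=3)
    print([predict(w, x) == y for x, y in DATA])
```

The only difference from classical momentum is where the gradient is evaluated: at the look-ahead point rather than at the current weights.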
FPGAs have been used as accelerators in a wide variety of domains such as learning, search, genomics, signal processing, compression, analytics and so on. In recent years, the availability of tools and flows such as high-level synthesis has made it even easier to accelerate a variety of high-performance computing applications onto FPGAs. In this paper we propose a systematic methodology for optimizing the performance of an accelerated block using the notion of compute intensity to guide optimizations in high-level synthesis. We demonstrate the effectiveness of our methodology on an FPGA implementation of a non-uniform discrete Fourier transform (NUDFT), used to convert a wireless channel model from the time-domain to the frequency domain. The acceleration of this particular computation can be used to improve the performance and capacity of wireless channel simulation, which has wide applications in the system level design and performance evaluation of wireless networks. Our results show that our FPGA implementation outperforms the same code offloaded onto GPUs and CPUs by 1.6x and 10x respectively, in performance as measured by the throughput of the accelerated block. The gains in performance per watt versus GPUs and CPUs are 15.6x and 41.5x respectively.
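For reference, the computation being accelerated can be written as a direct non-uniform DFT over channel taps, H(f) = sum over taps of a * exp(-2j*pi*f*tau). The sketch below is a plain software reference under assumptions: the tap representation as (delay, gain) pairs and the function name are hypothetical, and it says nothing about how the FPGA implementation is structured.

```python
import cmath

def channel_frequency_response(taps, freqs_hz):
    """Direct NUDFT of a tapped-delay-line channel model.

    taps:     iterable of (delay_seconds, complex_gain) pairs; the delays
              need not lie on a uniform grid.
    freqs_hz: frequencies at which to evaluate the response.
    """
    taps = list(taps)
    return [
        sum(gain * cmath.exp(-2j * cmath.pi * f * delay) for delay, gain in taps)
        for f in freqs_hz
    ]

if __name__ == "__main__":
    # Hypothetical two-path channel: direct path plus an echo 0.5 us later.
    taps = [(0.0, 1.0), (0.5e-6, 1.0)]
    for f, h in zip([0.0, 1.0e6], channel_frequency_response(taps, [0.0, 1.0e6])):
        print(f, abs(h))
```

Each output sample costs one complex exponential and one complex multiply-accumulate per tap while reading only a delay and a gain, so the work grows as (number of taps) x (number of frequencies). That ratio of arithmetic to data movement is the kind of quantity a compute-intensity analysis reasons about.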
Software/hardware cooperation on hybrid architectures, such as the Xilinx Zynq SoC, which combines an ARM CPU with FPGA fabric, offers a high-performance, low-power platform for accelerating the RSA algorithm. This paper adopts the non-subtraction Montgomery algorithm and the Chinese Remainder Theorem (CRT) to implement high-speed RSA processors, and deploys a 48-node cluster infrastructure based on the Zynq SoC to achieve extremely high scalability and throughput for RSA computing. In this design, the ARM core implements node-to-node communication with the Message Passing Interface (MPI), while the FPGA handles the complex calculations. The experimental results show that overall performance scales linearly with the number of nodes. The cluster achieves a 6× to 9× speedup over a multi-core desktop (Intel i7-3770) and performance comparable to a many-core server (288 cores). In addition, we gain up to 2.5× energy efficiency compared to these two traditional platforms.
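The two building blocks named in this abstract can be sketched in software as follows. This is a minimal illustration under assumptions, not the paper's hardware design: it uses textbook Montgomery multiplication with the usual final conditional subtraction (the variant the paper adopts avoids that subtraction), Garner's form of CRT recombination, and hypothetical function names with a tiny textbook key.

```python
def montgomery_setup(n):
    """Precompute Montgomery constants for an odd modulus n (R = 2^k > n)."""
    k = n.bit_length()
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r  # n * n_prime == -1 (mod R); needs Python 3.8+
    return k, r, n_prime

def mont_mul(a, b, n, k, r, n_prime):
    """Return a * b * R^-1 mod n for 0 <= a, b < n, without dividing by n."""
    t = a * b
    m = (t * n_prime) & (r - 1)   # reduction mod R is a bit mask
    u = (t + m * n) >> k          # exact division by R is a shift
    return u - n if u >= n else u  # textbook final conditional subtraction

def mont_pow(base, exp, n):
    """Modular exponentiation base^exp mod n using Montgomery multiplication."""
    k, r, n_prime = montgomery_setup(n)
    r2 = (r * r) % n
    x = mont_mul(base % n, r2, n, k, r, n_prime)  # base in Montgomery form
    acc = r % n                                   # 1 in Montgomery form
    while exp:
        if exp & 1:
            acc = mont_mul(acc, x, n, k, r, n_prime)
        x = mont_mul(x, x, n, k, r, n_prime)
        exp >>= 1
    return mont_mul(acc, 1, n, k, r, n_prime)     # leave Montgomery form

def rsa_crt_decrypt(c, p, q, d):
    """RSA decryption via CRT: two half-size exponentiations, then recombine."""
    dp, dq = d % (p - 1), d % (q - 1)
    q_inv = pow(q, -1, p)
    m1 = mont_pow(c, dp, p)
    m2 = mont_pow(c, dq, q)
    h = (q_inv * (m1 - m2)) % p   # Garner's recombination
    return m2 + h * q

if __name__ == "__main__":
    # Tiny textbook key, for illustration only.
    p, q, e, d = 61, 53, 17, 2753
    c = pow(65, e, p * q)
    print(rsa_crt_decrypt(c, p, q, d))
```

CRT replaces one full-size exponentiation with two exponentiations on half-size moduli, and Montgomery form replaces division by the modulus with masks and shifts, which is why both map well onto FPGA logic.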
Graphics processing units (GPUs) have been applied successfully in many scientific computing realms thanks to their superior floating-point performance and memory bandwidth, and they have great potential in power system applications. N-1 static security analysis (SSA) is a candidate application in which a massive number of alternating current power flow (ACPF) problems must be solved. However, when existing GPU-accelerated algorithms are applied to the N-1 SSA problem, the degree of parallelism is limited because existing research has been devoted to accelerating the solution of a single ACPF. This paper therefore proposes a GPU-accelerated solution that creates an additional layer of parallelism among batch ACPFs and consequently achieves a much higher level of overall parallelism. First, this paper establishes two basic principles for determining well-designed GPU algorithms, through which the limitation of the GPU-accelerated sequential-ACPF solution is demonstrated. Next, being the first of its kind, this paper proposes a novel GPU-accelerated batch-QR solver, which packages a massive number of QR tasks into a new larger-scale problem and thereby achieves a higher level of parallelism and better coalesced memory accesses. To further improve the efficiency of solving SSA, GPU-accelerated batch Jacobian matrix generation and contingency screening are developed and carefully optimized. Lastly, the complete process of the proposed GPU-accelerated batch-ACPF solution for SSA is presented. Case studies on an 8503-bus system show that a dramatic reduction in computation time is achieved compared with all reported existing GPU-accelerated methods. In comparison to a UMFPACK-library-based single-CPU counterpart using an Intel Xeon E5-2620, the proposed GPU-accelerated SSA framework using an NVIDIA K20C achieves up to a 57.6 times speedup. It can even achieve a four times speedup when compared to one of the fastest multi-core CPU parallel computing solutions using the KLU library. The proposed batch-solving method is practically very promising and lays a critical foundation for many other power system applications that need to deal with massive subtasks, such as Monte-Carlo simulation and probabilistic power flow.
The objective of this paper is to outline the design specification, implementation and evaluation of a proposed accelerated encryption framework which deploys both homomorphic and symmetric-key encryption to serve privacy-preserving processing; in particular, as a sub-system within the Privacy Preserving Speech Processing framework architecture as part of the PPSP-in-Cloud Platform. Following a preliminary study of GPU efficiency optimisations benchmarked on an AES implementation, we have addressed and resolved the Big Integer processing challenges in the parallel implementation of bilinear pairing, thus enabling the creation of partially homomorphic encryption schemes which facilitate applications such as speech processing in the encrypted domain on the cloud. This novel implementation has been validated in laboratory tests using a standard speech corpus and can be used for other application domains to support secure computation and privacy-preserving big data storage and processing in the cloud.
The anonymizing network Tor is examined as one method of anonymizing port scanning tools and avoiding identification and retaliation. Performing anonymized port scans through Tor is possible using Nmap, but parallelization of the scanning processes is required to accelerate the scan rate.
In this paper we use car games as a simulator for real automobiles and generate driving logs that contain the vehicle data. This includes values for parameters such as gear used, speed, left turns taken, right turns taken, accelerator, braking and so on. From these parameters we have derived additional parameters and analyzed them. As the only input from the automobile driver is routine driving, no explicit feedback is required; hence there is a better chance of profiling the driver accurately. Experimentation and analysis on this logged data show that driver profiling from vehicle data is possible. Since the profiles are unique, they can be further used for a wide range of applications and can successfully exhibit the typical driving characteristics of each user.
This paper proposes an algorithm for multi-channel SAR ground moving target detection and estimation using the Fractional Fourier Transform (FrFT). To detect slow-moving targets, the clutter is first suppressed by the Displaced Phase Center Antenna (DPCA) technique, which enhances the signal-to-clutter ratio. With the clutter suppressed, the echo of the moving target remains and can be regarded as a chirp signal whose parameters can be estimated by the FrFT. The FrFT, one of the most widely used tools for time-frequency analysis, is utilized to estimate the Doppler parameters, from which the motion parameters, including the velocity and the acceleration, can be obtained. The effectiveness of the proposed method is validated by simulation.
In this work we design and develop Montage for real-time multi-user formation tracking and localization with off-the-shelf smartphones. Montage achieves submeter-level tracking accuracy by integrating temporal and spatial constraints from user movement vector estimation and distance measuring. In Montage we designed a suite of novel techniques to surmount a variety of challenges in real-time tracking, without infrastructure and fingerprints, and without any a priori user-specific (e.g., stride-length and phone-placement) or site-specific (e.g., digitalized map) knowledge. We implemented, deployed and evaluated Montage in both outdoor and indoor environments. Our experimental results (847 traces from 15 users) show that the stride-length estimated by Montage over all users has error within 9cm, and the moving-direction estimated by Montage is within 20°. For real-time tracking, Montage provides meter-second-level formation tracking accuracy with off-the-shelf mobile phones.
Novel Internet services are emerging around an increasing number of sensors and actuators in our surroundings, commonly referred to as smart devices. Smart devices, which form the backbone of the Internet of Things (IoT), enable alternative forms of user experience by means of automation, convenience, and efficiency. At the same time, new security and safety issues arise, given the Internet connectivity of smart devices and their ability to interact with humans' proximate living space. Hence, security is a fundamental requirement of IoT design. In order to remain interoperable with the existing infrastructure, we postulate a security framework compatible with standard IP-based security solutions, yet optimized to meet the constraints of the IoT ecosystem. In this ongoing work, we first identify the necessary components of interoperable, secure End-to-End communication that incorporates Public-key Cryptography (PKC). To this end, we tackle the computational and communication overheads involved. The required components are, on the hardware side, affordable hardware acceleration engines for cryptographic operations and, on the software side, header compression and long-lasting secure sessions. In future work, we focus on the integration of these components into a framework and the evaluation of an early prototype of this framework.