Biblio

Filters: Keyword is Acceleration

2019-05-20
Hu, W., Ardeshiricham, A., Gobulukoglu, M. S., Wang, X., Kastner, R..  2018.  Property Specific Information Flow Analysis for Hardware Security Verification. 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD). :1-8.

Hardware information flow analysis detects security vulnerabilities resulting from unintended design flaws, timing channels, and hardware Trojans. These information flow models are typically generated in a general way, which includes a significant amount of redundancy that is irrelevant to the specified security properties. In this work, we propose a property specific approach for information flow security. We create information flow models tailored to the properties to be verified by performing a property specific search to identify security critical paths. This helps find suspicious signals that require closer inspection and quickly eliminates portions of the design that are free of security violations. Our property specific trimming technique reduces the complexity of the security model; this accelerates security verification and restricts potential security violations to a smaller region which helps quickly pinpoint hardware security vulnerabilities.
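
One way to picture the property-specific trimming described in this abstract is as a reachability problem on a signal-dependency graph. The sketch below is an illustration only, not the authors' tool: it assumes the design is given as a dictionary mapping each signal to the signals it drives, and that the property asks whether `source` can flow to `sink`. The signal names and the helper `trim_for_property` are hypothetical.

```python
from collections import deque


def reachable(graph, start):
    """Return every node reachable from `start` by following edges in `graph`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def reverse(graph):
    """Build the graph with every edge direction flipped."""
    rev = {}
    for src, dsts in graph.items():
        for dst in dsts:
            rev.setdefault(dst, []).append(src)
    return rev


def trim_for_property(graph, source, sink):
    """Keep only signals that lie on some path from `source` to `sink`.

    A signal matters for the property "source flows to sink" only if it is
    reachable from the source AND can itself reach the sink.
    """
    forward = reachable(graph, source)
    backward = reachable(reverse(graph), sink)
    return forward & backward


# Hypothetical toy design: each signal maps to the signals it drives.
design = {
    "key": ["sbox"],
    "plaintext": ["sbox"],
    "sbox": ["cipher_out"],
    "clk_div": ["led"],
    "cipher_out": [],
    "led": [],
}

relevant = trim_for_property(design, "key", "cipher_out")
print(sorted(relevant))
```

Signals outside the returned set (here `plaintext`, `clk_div`, and `led`) cannot carry a flow from `key` to `cipher_out`, so a model built for that property would not need tracking logic for them.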

2019-01-16
Abdelwahed, N., Letaifa, A. Ben, Asmi, S. El.  2018.  Content Based Algorithm Aiming to Improve the WEB_QoE Over SDN Networks. 2018 32nd International Conference on Advanced Information Networking and Applications Workshops (WAINA). :153–158.

Since the 1990s, the concept of QoE has become increasingly present, and many researchers take it into account across different fields of application. For video streaming, for example, QoE has been well studied, whereas the study of web QoE has been relatively neglected. Quality of Experience (QoE) is the set of objective and subjective characteristics that satisfy, retain, or give confidence to a user throughout the life cycle of a service. Some research addresses the metrics used to measure QoE; other work pursues new ways to improve QoE in order to satisfy customers and gain their loyalty. In this paper, we focus on web QoE, which researchers have neglected despite its great importance, given the complexity of new web pages and their increasingly critical utility. The richness of new web pages in images, video, audio, etc., and their growing significance prompt us to write this paper, in which we discuss a new method that aims to improve web QoE in a software-defined network (SDN). Our proposed method automates the management of web-page QoE improvement and makes it more flexible: an algorithm chooses, depending on the case, the treatment needed to improve the web QoE of the page concerned, and uses both web prefetching and caching to accelerate data transfer when the user requests it. The first part of the paper discusses the advantages and disadvantages of existing work. In the second part, we propose an automatic algorithm that treats each case with the appropriate solution that guarantees its best performance. The last part is devoted to performance evaluation.

2018-05-01
Wang, X., Zhou, S..  2017.  Accelerated Stochastic Gradient Method for Support Vector Machines Classification with Additive Kernel. 2017 First International Conference on Electronics Instrumentation Information Systems (EIIS). :1–6.

Support vector machines (SVMs) have been widely used for classification in machine learning and data mining. However, SVMs face a huge challenge in large-scale classification tasks. Recent progress has enabled the additive-kernel version of SVM to solve such large-scale problems nearly as fast as a linear classifier. This paper proposes a new accelerated mini-batch stochastic gradient descent algorithm for SVM classification with additive kernel (AK-ASGD). On the one hand, the gradient is approximated by the sum of a scalar polynomial function for each feature dimension; on the other hand, Nesterov's acceleration strategy is used. The experimental results on benchmark large-scale classification data sets show that the proposed algorithm achieves higher testing accuracy and a faster convergence rate.
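
To make the "Nesterov acceleration plus mini-batches" idea concrete, here is a minimal sketch for a plain linear SVM with hinge loss. It is not AK-ASGD: the additive-kernel polynomial gradient approximation from the abstract is not reproduced, and the hyperparameters, the toy data, and the function names are assumptions chosen for illustration.

```python
import random


def hinge_subgradient(w, batch, lam):
    """Subgradient of lam/2 * ||w||^2 + mean hinge loss over the mini-batch."""
    grad = [lam * wi for wi in w]
    for x, y in batch:
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        if margin < 1.0:  # only margin violators contribute to the hinge term
            for j, xj in enumerate(x):
                grad[j] -= y * xj / len(batch)
    return grad


def nesterov_sgd_svm(data, dim, epochs=50, batch_size=4,
                     lr=0.1, momentum=0.9, lam=0.01, seed=0):
    """Mini-batch SGD with Nesterov momentum for a linear SVM (no bias term)."""
    rng = random.Random(seed)
    w = [0.0] * dim  # weights
    v = [0.0] * dim  # velocity
    for _ in range(epochs):
        shuffled = data[:]
        rng.shuffle(shuffled)
        for i in range(0, len(shuffled), batch_size):
            batch = shuffled[i:i + batch_size]
            # Nesterov step: evaluate the gradient at the look-ahead point.
            lookahead = [wi + momentum * vi for wi, vi in zip(w, v)]
            g = hinge_subgradient(lookahead, batch, lam)
            v = [momentum * vi - lr * gi for vi, gi in zip(v, g)]
            w = [wi + vi for wi, vi in zip(w, v)]
    return w


def predict(w, x):
    """Return the predicted label, +1 or -1."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1


# Hypothetical, linearly separable toy data: (features, label).
TOY_DATA = [
    ([2.0, 1.0], 1), ([3.0, 2.5], 1), ([1.5, 3.0], 1), ([2.5, 0.5], 1),
    ([-2.0, -1.0], -1), ([-3.0, -2.5], -1), ([-1.5, -3.0], -1), ([-2.5, -0.5], -1),
]

weights = nesterov_sgd_svm(TOY_DATA, dim=2)
print(all(predict(weights, x) == y for x, y in TOY_DATA))
```

The only difference from ordinary momentum SGD is where the gradient is evaluated: at `w + momentum * v` rather than at `w`.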

2018-03-19
Siripurapu, Srinivas, Gayasen, Aman, Gopalakrishnan, Padmini, Chandrachoodan, Nitin.  2017.  FPGA Implementation of Non-Uniform DFT for Accelerating Wireless Channel Simulations (Abstract Only). Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. :295–295.

FPGAs have been used as accelerators in a wide variety of domains such as learning, search, genomics, signal processing, compression, analytics and so on. In recent years, the availability of tools and flows such as high-level synthesis has made it even easier to accelerate a variety of high-performance computing applications onto FPGAs. In this paper we propose a systematic methodology for optimizing the performance of an accelerated block using the notion of compute intensity to guide optimizations in high-level synthesis. We demonstrate the effectiveness of our methodology on an FPGA implementation of a non-uniform discrete Fourier transform (NUDFT), used to convert a wireless channel model from the time-domain to the frequency domain. The acceleration of this particular computation can be used to improve the performance and capacity of wireless channel simulation, which has wide applications in the system level design and performance evaluation of wireless networks. Our results show that our FPGA implementation outperforms the same code offloaded onto GPUs and CPUs by 1.6x and 10x respectively, in performance as measured by the throughput of the accelerated block. The gains in performance per watt versus GPUs and CPUs are 15.6x and 41.5x respectively.
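
For reference, a non-uniform DFT can be written directly from its definition. The sketch below assumes the common form X[k] = Σ_n x[n]·exp(-2πi·f_k·t_n), with arbitrary sample times t_n and evaluation frequencies f_k. The abstract does not state which NUDFT variant the authors implement, and the tap delays and gains in the example are hypothetical. This O(N·K) version shows what is computed, not how an FPGA would compute it.

```python
import cmath
import math


def nudft(samples, times, freqs):
    """Direct non-uniform DFT.

    X[k] = sum_n samples[n] * exp(-2j * pi * freqs[k] * times[n])
    """
    return [
        sum(x * cmath.exp(-2j * math.pi * f * t) for x, t in zip(samples, times))
        for f in freqs
    ]


# Hypothetical channel: three taps with unevenly spaced delays (seconds).
tap_delays = [0.0, 0.3e-6, 1.1e-6]
tap_gains = [1.0, 0.5, 0.25]

# Frequency response at a few baseband frequencies (Hz).
frequencies = [0.0, 1.0e6, 2.5e6]
response = nudft(tap_gains, tap_delays, frequencies)
for f, h in zip(frequencies, response):
    print(f, abs(h))
```

When the times are `n / N` and the frequencies are the integers `0..N-1`, this reduces to the ordinary DFT, which gives a convenient sanity check.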

2018-02-21
Bai, Xu, Jiang, Lei, Dai, Qiong, Yang, Jiajia, Tan, Jianlong.  2017.  Acceleration of RSA processes based on hybrid ARM-FPGA cluster. 2017 IEEE Symposium on Computers and Communications (ISCC). :682–688.

Cooperation of software and hardware on hybrid architectures, such as the Xilinx Zynq SoC, which combines an ARM CPU with FPGA fabric, provides a high-performance and low-power platform for accelerating the RSA algorithm. This paper adopts the none-subtraction Montgomery algorithm and the Chinese Remainder Theorem (CRT) to implement high-speed RSA processors, and deploys a 48-node cluster infrastructure based on the Zynq SoC to achieve extremely high scalability and throughput of RSA computing. In this design, we use the ARM to implement node-to-node communication with the Message Passing Interface (MPI) while using the FPGA to handle complex calculation. Finally, the experimental results show that the overall performance scales linearly with the number of nodes. The cluster achieves a 6×–9× speedup against a multi-core desktop (Intel i7-3770) and comparable performance to a many-core server (288-core). In addition, we gain up to 2.5× energy efficiency compared to these two traditional platforms.
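
The CRT speedup mentioned in this abstract can be illustrated in a few lines. This is a textbook sketch (Garner's recombination), not the paper's hardware design: the Montgomery multiplier is replaced by Python's built-in `pow`, and the tiny primes are for demonstration only and offer no security.

```python
def rsa_crt_decrypt(c, p, q, d):
    """Decrypt c with RSA private exponent d using the Chinese Remainder Theorem.

    Two half-size exponentiations (mod p and mod q) replace one full-size
    exponentiation mod p*q.
    """
    dp = d % (p - 1)          # exponent reduced for the mod-p branch
    dq = d % (q - 1)          # exponent reduced for the mod-q branch
    q_inv = pow(q, -1, p)     # modular inverse of q mod p (Python 3.8+)
    m1 = pow(c, dp, p)
    m2 = pow(c, dq, q)
    h = (q_inv * (m1 - m2)) % p
    return m2 + h * q         # Garner's recombination


# Toy parameters (far too small for real use).
p, q = 61, 53
n = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))

message = 65
ciphertext = pow(message, e, n)
print(rsa_crt_decrypt(ciphertext, p, q, d) == message)
```

In a real implementation `dp`, `dq`, and `q_inv` are precomputed once and stored with the private key, so each decryption costs only the two smaller exponentiations plus the recombination.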

Zhou, G., Feng, Y., Bo, R., Chien, L., Zhang, X., Lang, Y., Jia, Y., Chen, Z..  2017.  GPU-Accelerated Batch-ACPF Solution for N-1 Static Security Analysis. IEEE Transactions on Smart Grid. 8:1406–1416.

Graphics processing units (GPUs) have been applied successfully in many scientific computing realms due to their superior performance in floating-point calculation and memory bandwidth, and have great potential in power system applications. The N-1 static security analysis (SSA) appears to be a candidate application in which massive alternating current power flow (ACPF) problems need to be solved. However, when applying existing GPU-accelerated algorithms to solve the N-1 SSA problem, the degree of parallelism is limited because existing research has been devoted to accelerating the solution of a single ACPF. This paper therefore proposes a GPU-accelerated solution that creates an additional layer of parallelism among batch ACPFs and consequently achieves a much higher level of overall parallelism. First, this paper establishes two basic principles for determining well-designed GPU algorithms, through which the limitation of the GPU-accelerated sequential-ACPF solution is demonstrated. Next, being the first of its kind, this paper proposes a novel GPU-accelerated batch-QR solver, which packages a massive number of QR tasks to formulate a new larger-scale problem and then achieves a higher level of parallelism and better coalesced memory accesses. To further improve the efficiency of solving SSA, GPU-accelerated batch-Jacobian-matrix generation and contingency screening are developed and carefully optimized. Lastly, the complete process of the proposed GPU-accelerated batch-ACPF solution for SSA is presented. Case studies on an 8503-bus system show that a dramatic computation time reduction is achieved compared with all reported existing GPU-accelerated methods. In comparison to a UMFPACK-library-based single-CPU counterpart using an Intel Xeon E5-2620, the proposed GPU-accelerated SSA framework using an NVIDIA K20C achieves up to 57.6 times speedup. It can even achieve four times speedup when compared to one of the fastest multi-core CPU parallel computing solutions using the KLU library. The proposed batch-solving method is practically very promising and lays a critical foundation for many other power system applications that need to deal with massive subtasks, such as Monte-Carlo simulation and probabilistic power flow.

2018-02-06
Badii, A., Faulkner, R., Raval, R., Glackin, C., Chollet, G..  2017.  Accelerated Encryption Algorithms for Secure Storage and Processing in the Cloud. 2017 International Conference on Advanced Technologies for Signal and Image Processing (ATSIP). :1–6.

The objective of this paper is to outline the design specification, implementation and evaluation of a proposed accelerated encryption framework which deploys both homomorphic and symmetric-key encryption to serve privacy-preserving processing; in particular, as a sub-system within the Privacy Preserving Speech Processing framework architecture as part of the PPSP-in-Cloud Platform. Following a preliminary study of GPU efficiency-gain optimisations benchmarked for an AES implementation, we have addressed and resolved the Big Integer processing challenges in the parallel implementation of bilinear pairing, thus enabling the creation of partially homomorphic encryption schemes which facilitate applications such as speech processing in the encrypted domain on the cloud. This novel implementation has been validated in laboratory tests using a standard speech corpus and can be used for other application domains to support secure computation and privacy-preserving big data storage/processing in the cloud.

2017-04-20
Rohrmann, R., Patton, M. W., Chen, H..  2016.  Anonymous port scanning: Performing network reconnaissance through Tor. 2016 IEEE Conference on Intelligence and Security Informatics (ISI). :217–217.

The anonymizing network Tor is examined as one method of anonymizing port scanning tools and avoiding identification and retaliation. Performing anonymized port scans through Tor is possible using Nmap, but parallelization of the scanning processes is required to accelerate the scan rate.

2017-03-08
Dangra, B. S., Rajput, D., Bedekar, M. V., Panicker, S. S..  2015.  Profiling of automobile drivers using car games. 2015 International Conference on Pervasive Computing (ICPC). :1–5.

In this paper we use car games as a simulator for real automobiles, and generate driving logs that contain the vehicle data. This includes values for parameters such as gear used, speed, left turns taken, right turns taken, accelerator, braking and so on. From these parameters we have derived additional parameters and analyzed them. As the input from the automobile driver is only routine driving, no explicit feedback is required; hence there is a better chance of accurately profiling the driver. Experimentation and analysis on this logged data show that driver profiling can be done from vehicle data. Since the profiles are unique, they can be further used for a wide range of applications and can successfully exhibit the typical driving characteristics of each user.

2015-05-05
Jian Wu, Yongmei Jiang, Gangyao Kuang, Jun Lu, Zhiyong Li.  2014.  Parameter estimation for SAR moving target detection using Fractional Fourier Transform. Geoscience and Remote Sensing Symposium (IGARSS), 2014 IEEE International. :596-599.

This paper proposes an algorithm for multi-channel SAR ground moving target detection and estimation using the Fractional Fourier Transform (FrFT). To detect slow-moving targets, the clutter is first suppressed by the Displaced Phase Center Antenna (DPCA) technique, which enhances the signal-to-clutter ratio. With the clutter suppressed, the echo of the moving target remains and can be regarded as a chirp signal whose parameters can be estimated by FrFT. FrFT, one of the most widely used tools for time-frequency analysis, is utilized to estimate the Doppler parameters, from which the motion parameters, including the velocity and the acceleration, can be obtained. The effectiveness of the proposed method is validated by simulation.

2015-05-04
Lan Zhang, Kebin Liu, Yonghang Jiang, Xiang-Yang Li, Yunhao Liu, Panlong Yang.  2014.  Montage: Combine frames with movement continuity for realtime multi-user tracking. INFOCOM, 2014 Proceedings IEEE. :799-807.

In this work we design and develop Montage for real-time multi-user formation tracking and localization by off-the-shelf smartphones. Montage achieves submeter-level tracking accuracy by integrating temporal and spatial constraints from user movement vector estimation and distance measuring. In Montage we designed a suite of novel techniques to surmount a variety of challenges in real-time tracking, without infrastructure and fingerprints, and without any a priori user-specific (e.g., stride-length and phone-placement) or site-specific (e.g., digitalized map) knowledge. We implemented, deployed and evaluated Montage in both outdoor and indoor environments. Our experimental results (847 traces from 15 users) show that the stride length estimated by Montage over all users has an error within 9 cm, and the moving direction estimated by Montage has an error within 20°. For real-time tracking, Montage provides meter-second-level formation tracking accuracy with off-the-shelf mobile phones.

2015-04-30
Shafagh, H., Hithnawi, A..  2014.  Poster Abstract: Security Comes First, a Public-key Cryptography Framework for the Internet of Things. Distributed Computing in Sensor Systems (DCOSS), 2014 IEEE International Conference on. :135-136.

Novel Internet services are emerging around an increasing number of sensors and actuators in our surroundings, commonly referred to as smart devices. Smart devices, which form the backbone of the Internet of Things (IoT), enable alternative forms of user experience by means of automation, convenience, and efficiency. At the same time, new security and safety issues arise, given the Internet connectivity of smart devices and their possible interaction with humans' proximate living space. Hence, security is a fundamental requirement of the IoT design. In order to remain interoperable with the existing infrastructure, we postulate a security framework compatible with standard IP-based security solutions, yet optimized to meet the constraints of the IoT ecosystem. In this ongoing work, we first identify the necessary components of an interoperable secure end-to-end communication while incorporating Public-key Cryptography (PKC). To this end, we tackle the computational and communication overheads involved. The required components are, on the hardware side, affordable hardware acceleration engines for cryptographic operations and, on the software side, header compression and long-lasting secure sessions. In future work, we focus on integrating these components into a framework and evaluating an early prototype of this framework.
