Acevedo, Javier, Ulbricht, Marian, Gabriel, Jennifer, Fitzek, Frank H. P..
2021.
Hardware Accelerated Cryptography for Tactile Internet. European Wireless 2021; 26th European Wireless Conference. :1—8.
Tactile Internet (TI) applications such as industry automation, connected autonomous cars, augmented reality and remote surgery are based on secure data transmissions at a very low end-to-end latency. In order to fulfill those requirements in real applications, it is necessary to implement traffic encryption when data flows at higher communication protocol layers. Nevertheless, the implementation of the aforementioned protocols is a computing-intensive task, in which many arithmetic operations are involved, leading to considerable delay. Therefore, hardware acceleration may be a solution to reduce the overall computing time, while delivering enough throughput during the execution of the network security functions. In this paper, we implement hardware accelerators for cryptographic algorithms on heterogeneous multicore dedicated hardware, using state-of-the-art embedded libraries, cryptographic cores and hardware extensions. By comparing our implementation to software-only solutions in terms of latency and throughput using variable data sets, we find reductions in computing time of around 80% as well as performance improvements of up to three orders of magnitude.
Tan, Qinyun, Xiao, Kun, He, Wen, Lei, Pinyuan, Chen, Lirong.
2021.
A Global Dynamic Load Balancing Mechanism with Low Latency for Micokernel Operating System. 2021 7th International Symposium on System and Software Reliability (ISSSR). :178—187.
As Internet of Things (IoT) devices become intelligent, more powerful computing capability is required. Multi-core processors are widely used in IoT devices because they provide more powerful computing capability while ensuring low power consumption. Therefore, the operating system on IoT devices is required to support and optimize the scheduling algorithm for multi-core processors. Nowadays, microkernel-based operating systems, such as QNX Neutrino RTOS and HUAWEI Harmony OS, are widely used in IoT devices because of their real-time and security features. However, research on multi-core scheduling for microkernel operating systems is relatively limited, especially for load balancing mechanisms. Related research is still mainly focused on traditional monolithic operating systems, such as Linux. Therefore, this paper proposes a low-latency, high-performance, real-time, centralized global dynamic multi-core load balancing method for the microkernel operating system. It has been implemented and tested on our own microkernel operating system named Mginkgo. The test results show that when there is load imbalance in the system, load balancing can be performed automatically so that all processors in the system can try to achieve the maximum throughput and resource utilization. The latency that load balancing adds to the system is very low: about 4882 cycles (about 6.164 µs) when triggered by new task creation and about 6596 cycles (about 8.328 µs) when triggered by timing. In addition, we also tested the improvement of system throughput and CPU utilization. The results show that load balancing can improve CPU utilization by 20% under the preset case, while the CPU utilization occupied by load balancing itself is negligibly low, about 0.0082%.
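The two latency figures in this abstract are each given in both cycles and microseconds, so they can be cross-checked with simple arithmetic. The sketch below is only an illustration: the abstract does not state the clock frequency, so the value computed here (about 792 MHz) is inferred from the quoted numbers, and the function and dictionary names are hypothetical.

```python
# Figures quoted in the abstract: (cycles, microseconds) for each trigger.
REPORTED = {
    "task_creation": (4882, 6.164),
    "timing": (6596, 8.328),
}


def implied_clock_mhz(cycles, microseconds):
    """Cycles per microsecond is numerically the clock frequency in MHz."""
    return cycles / microseconds


def cycles_to_us(cycles, clock_mhz):
    """Convert a cycle count to microseconds at the given clock frequency."""
    return cycles / clock_mhz


# Inferred (not stated in the source) clock frequency for each measurement.
clocks = {name: implied_clock_mhz(c, us) for name, (c, us) in REPORTED.items()}

if __name__ == "__main__":
    for name, mhz in clocks.items():
        print(f"{name}: implied clock of about {mhz:.1f} MHz")
```

Both measurements imply nearly the same frequency, which suggests the cycle and time figures are internally consistent.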
Sepulveda, Johanna, Winkler, Dominik, Sepúlveda, Daniel, Cupelli, Mario, Olexa, Radek.
2021.
Post-Quantum Cryptography in MPSoC Environments. 2021 IFIP/IEEE 29th International Conference on Very Large Scale Integration (VLSI-SoC). :1—6.
Multi-Processor Systems-on-Chip (MPSoCs) are a key enabling technology for different applications characterized by hyper-connectivity and multi-tenant requirements, where resources are shared and communication is ubiquitous. In such an environment, security plays a major role. To cope with these security needs, MPSoCs usually integrate cryptographic functionalities deployed as software and/or hardware solutions. Quantum computing represents a threat to current cryptography. To overcome this threat, post-quantum cryptography (PQC) can be used, thus ensuring the long-term security of different applications. Since 2017, NIST has been running a PQC standardization process. While the focus has been on the security analysis of the different PQC candidates and their software implementation, MPSoC PQC implementation has been neglected. To this end, this work presents two contributions. The first is an exploration of the multicore capabilities for developing optimized PQC implementations. As a use case, the NTRU lattice-based PQC scheme, a finalist in the NIST standardization process, is discussed. Second, NTRU was deployed on an AURIX microcontroller from Infineon Technologies AG with the real-time operating system PXROS-HR from HighTec EDV-Systeme GmbH. Results show that NTRU can be efficiently implemented and optimized on a multicore architecture, improving performance by up to 43% compared to single-core solutions.
Hoppe, Augusto, Becker, Jürgen, Kastensmidt, Fernanda Lima.
2021.
High-speed Hardware Accelerator for Trace Decoding in Real-Time Program Monitoring. 2021 IEEE 12th Latin America Symposium on Circuits and System (LASCAS). :1—4.
Multicore processors are currently the focus of new and future critical-system architectures. However, they introduce new problems with regard to safety and security requirements. Real-time control flow monitoring techniques were proposed as solutions to detect the most common types of program errors and security attacks. We propose a new way to use the latest debug and trace architectures to achieve full and isolated real-time control flow monitoring. We present an online trace decoder FPGA component as a solution in the search for scalable and portable monitoring architectures. Our FPGA accelerator achieves real-time CPU monitoring while using only 8% of the resources of a Zynq-7000 FPGA.
Ordouie, Navid, Soundararajan, Nirmala, Karne, Ramesh, Wijesinha, Alexander L..
2021.
Developing Computer Applications without any OS or Kernel in a Multi-core Architecture. 2021 International Symposium on Networks, Computers and Communications (ISNCC). :1—8.
Over the years, operating systems (OSs) have grown significantly in complexity and size, providing attackers with more avenues to compromise their security. By eliminating the OS, it becomes possible to develop general-purpose non-embedded applications that are free of typical OS-related vulnerabilities. Such applications are simpler and smaller in size, making it easier to secure the application code. Bare machine computing (BMC) applications run on ordinary desktops and laptops without the support of any operating system or centralized kernel. Many BMC applications have been developed previously for single-core systems. We show how to build BMC applications for multicore systems by presenting the design and implementation of a novel UDP-based bare machine prototype Web server for a multicore architecture. We also include preliminary experimental results from running the server on the Internet. This work provides a foundation for building secure computer applications that run on multicore systems without the need for intermediary software.
Kumar, S. Ratan, Kumari, V. Valli, Raju, K. V. S. V. N..
2021.
Multi-Core Parallel Processing Technique to Prepare the Time Series Data for the Early Detection of DDoS Flooding Attacks. 2021 8th International Conference on Computing for Sustainable Global Development (INDIACom). :540—545.
Distributed Denial of Service (DDoS) attacks pose a considerable threat to Cloud Computing, Internet of Things (IoT) and other services offered on the Internet. The victim server receives terabytes of data per second during the DDoS attack. It may take hours to examine them to detect a potential threat, leading to denial of service to legitimate users. Processing vast volumes of traffic to mitigate the attack is a challenging task for network administrators. High-performance techniques are more suited for processing DDoS attack traffic compared to Sequential Processing Techniques. This paper proposes a Multi-Core Parallel Processing Technique to prepare the time series data for the early detection of DDoS flooding attacks. Different time series analysis methods are suggested to detect the attack early on. Producing time series data using parallel processing saves time and further speeds up the detection of the attack. The proposed method is applied to the benchmark data set CICDDoS2019 for generating four different time series to detect TCP-based flooding attacks, namely TCP-SYN, TCP-SYN-ACK, TCP-ACK, and TCP-RST. The implementation results show that the proposed method can give a speedup of 2.3 times for processing attack traffic compared to sequential processing.
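The idea of preparing per-flag time series in parallel can be sketched as follows. This is a minimal illustration, not the authors' method: the record format (a timestamp in seconds plus a flag label), the one-second bin width, and all function names are assumptions, and the real CICDDoS2019 columns are not used here. Each worker counts packets per time bin for its own chunk, and the partial counts are then merged.

```python
from collections import Counter
from concurrent.futures import ProcessPoolExecutor

# The four TCP-based series named in the abstract.
SERIES = ("SYN", "SYN-ACK", "ACK", "RST")


def count_chunk(job):
    """Count packets per (time bin, flag kind) for one chunk of records."""
    chunk, bin_seconds = job
    counts = Counter()
    for timestamp, kind in chunk:
        if kind in SERIES:
            counts[(int(timestamp // bin_seconds), kind)] += 1
    return counts


def split(records, parts):
    """Split records into at most `parts` contiguous chunks of similar size."""
    size = max(1, -(-len(records) // parts))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def build_series(records, bin_seconds=1.0, workers=1):
    """Return {flag kind: [count per time bin]} covering the observed bins."""
    jobs = [(chunk, bin_seconds) for chunk in split(records, workers)]
    if workers > 1:
        # One process per chunk so the counting can use several cores.
        with ProcessPoolExecutor(max_workers=workers) as pool:
            partials = list(pool.map(count_chunk, jobs))
    else:
        partials = [count_chunk(job) for job in jobs]
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts together
    if not total:
        return {kind: [] for kind in SERIES}
    first = min(b for b, _ in total)
    last = max(b for b, _ in total)
    return {kind: [total[(b, kind)] for b in range(first, last + 1)]
            for kind in SERIES}
```

With `workers` greater than 1, the call should be made from under an `if __name__ == "__main__":` guard so that worker processes can start cleanly. Because counting is additive, the merged result is the same as the sequential one regardless of how the records are chunked.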
Olivier, Stephen L., Ellingwood, Nathan D., Berry, Jonathan, Dunlavy, Daniel M..
2021.
Performance Portability of an SpMV Kernel Across Scientific Computing and Data Science Applications. 2021 IEEE High Performance Extreme Computing Conference (HPEC). :1—8.
Both the data science and scientific computing communities are embracing GPU acceleration for their most demanding workloads. For scientific computing applications, the massive volume of code and diversity of hardware platforms at supercomputing centers has motivated a strong effort toward performance portability. This property of a program, denoting its ability to perform well on multiple architectures and varied datasets, is heavily dependent on the choice of parallel programming model and which features of the programming model are used. In this paper, we evaluate performance portability in the context of a data science workload in contrast to a scientific computing workload, evaluating the same sparse matrix kernel on both. Among our implementations of the kernel in different performance-portable programming models, we find that many struggle to consistently achieve performance improvements using the GPU compared to simple one-line OpenMP parallelization on high-end multicore CPUs. We show one that does, and its performance approaches and sometimes even matches that of vendor-provided GPU math libraries.
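For readers unfamiliar with the kernel being evaluated, a sparse matrix-vector product (SpMV) can be sketched as below. The abstract does not say which storage format the authors use, so the compressed sparse row (CSR) layout here is an assumption, and the function name is hypothetical. Each output row is computed independently, which is why the outer loop is the natural target for a parallel-for style directive.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    row_ptr[i]..row_ptr[i + 1] delimits the nonzeros of row i inside
    col_idx (column indices) and values (the nonzero entries).
    """
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):  # rows are independent of each other
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Example matrix:
# [[4, 0, 1],
#  [0, 2, 0],
#  [3, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
x = [1.0, 2.0, 3.0]

y = spmv_csr(row_ptr, col_idx, values, x)  # [7.0, 4.0, 18.0]
```

The irregular access `x[col_idx[k]]` depends on the sparsity pattern, which is one plausible reason the same kernel can behave differently on different datasets.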
Xuguang, Zhu.
2021.
A Certainty-guaranteed inter/intra-core communication method for multi-core embedded systems. 2021 IEEE International Conference on Power Electronics, Computer Applications (ICPECA). :1024—1027.
In order to meet the actual needs of operating system localization and high-security operating systems, this paper proposes an inter-core communication mechanism for a multi-core embedded high-security operating system. The mechanism is centered on core-private memory and is based on the cache mechanism of DSP processors such as the Feiteng design. To apply it to the multi-core embedded high-security operating system, this paper also uses the priority scheduling scheme from the design of our actual operating system to analyze the certainty of inter-core communication. The analysis shows that, under this communication mechanism, there is an upper bound on end-to-end delay, so the certainty of the communication mechanism is guaranteed and it can be applied to multi-core high-security embedded operating systems.
Zhou, Tianyang.
2021.
Performance comparison and optimization of mainstream NIDS systems in offline mode based on parallel processing technology. 2021 2nd International Conference on Computing and Data Science (CDS). :136—140.
For the network intrusion detection system (NIDS), improving the performance of the analysis process has always been one of the primary goals. An important method to improve performance is to use parallel processing technology to maximize the usage of multi-core CPU resources. In this paper, by splitting Pcap data packets, the NIDS software Snort3 can process Pcap packets in parallel mode. On this basis, this paper compares the performance of Snort2, Suricata, and Snort3 with different numbers of CPU cores when processing Pcap data of different sizes. At the same time, a parallel unpacking algorithm is proposed to further improve the parallel processing performance of Snort3.
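Splitting a capture so that several NIDS instances can each read their own file might look like the sketch below. The abstract does not describe how the authors split the data, so the round-robin distribution and all function names here are assumptions. Only the classic little-endian pcap layout (a 24-byte global header followed by 16-byte record headers) is handled. A real deployment would probably need to keep packets of the same flow together, which this sketch does not do.

```python
import struct

# Classic pcap layout, little-endian: magic, version major/minor,
# timezone offset, sigfigs, snaplen, link type (24 bytes in total).
PCAP_GLOBAL_HEADER = struct.Struct("<IHHiIII")
# Per-packet header: ts_sec, ts_usec, captured length, original length.
PCAP_RECORD_HEADER = struct.Struct("<IIII")
MAGIC_LE = 0xA1B2C3D4


def split_pcap(data, parts):
    """Split pcap bytes into `parts` pcap byte strings, round-robin."""
    magic = struct.unpack_from("<I", data, 0)[0]
    if magic != MAGIC_LE:
        raise ValueError("this sketch handles little-endian classic pcap only")
    header = data[:PCAP_GLOBAL_HEADER.size]
    buckets = [[] for _ in range(parts)]
    offset = PCAP_GLOBAL_HEADER.size
    index = 0
    while offset + PCAP_RECORD_HEADER.size <= len(data):
        _, _, incl_len, _ = PCAP_RECORD_HEADER.unpack_from(data, offset)
        end = offset + PCAP_RECORD_HEADER.size + incl_len
        if end > len(data):
            break  # ignore a truncated final record
        buckets[index % parts].append(data[offset:end])
        offset = end
        index += 1
    # Every part repeats the global header so it is a valid pcap on its own.
    return [header + b"".join(bucket) for bucket in buckets]


def make_pcap(payloads):
    """Build a small synthetic pcap, for demonstration only."""
    out = [PCAP_GLOBAL_HEADER.pack(MAGIC_LE, 2, 4, 0, 0, 65535, 1)]
    for i, payload in enumerate(payloads):
        out.append(PCAP_RECORD_HEADER.pack(i, 0, len(payload), len(payload)))
        out.append(payload)
    return b"".join(out)
```

Each returned byte string could be written to its own file and handed to a separate analysis process.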
Philomina, Josna.
2021.
A Study on the Effect of Hardware Trojans in the Performance of Network on Chip Architectures. 2021 8th International Conference on Smart Computing and Communications (ICSCC). :314—318.
Network on chip (NoC) is the communication infrastructure used in multicores, and it has been subject to a surfeit of security threats such as degrading system performance, changing system functionality or leaking sensitive information. Because of the globalization of the advanced semiconductor industry, many third-party vendors take part in the hardware design of a system. As a result, a malicious circuit, called a Hardware Trojan (HT), can be added anywhere in the NoC design, making the hardware untrusted. In this paper, a detailed study of the taxonomy of hardware Trojans and of their detection and prevention mechanisms is presented. Two case studies on HT-assisted denial-of-service attacks and an analysis of their effect on the performance of network-on-chip architectures are also presented.