Biblio

Filters: Keyword is Supercomputers
2023-02-03
Suzumura, Toyotaro, Sugiki, Akiyoshi, Takizawa, Hiroyuki, Imakura, Akira, Nakamura, Hiroshi, Taura, Kenjiro, Kudoh, Tomohiro, Hanawa, Toshihiro, Sekiya, Yuji, Kobayashi, Hiroki et al..  2022.  mdx: A Cloud Platform for Supporting Data Science and Cross-Disciplinary Research Collaborations. 2022 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech). :1–7.
The growing amount of data and advances in data science have created a need for a new kind of cloud platform that provides users with flexibility, strong security, and the ability to couple with supercomputers and edge devices through high-performance networks. We have built such a nationwide cloud platform, called "mdx", to meet this need. The mdx platform's virtualization service, jointly operated by 9 national universities and 2 national research institutes in Japan, launched in 2021, and more features are in development. Currently, mdx is used by researchers in a wide variety of domains, including materials informatics, geo-spatial information science, life science, astronomical science, economics, social science, and computer science. This paper provides an overview of the mdx platform, details the motivation for its development, reports its current status, and outlines its future plans.
2022-10-20
Wang, Jingyi, Chiang, Nai-Yuan, Petra, Cosmin G..  2021.  An asynchronous distributed-memory optimization solver for two-stage stochastic programming problems. 2021 20th International Symposium on Parallel and Distributed Computing (ISPDC). :33–40.
We present a scalable optimization algorithm and its parallel implementation for large-scale two-stage stochastic programming problems, in particular the security-constrained optimal power flow models routinely used in electrical power grid operations. Such problems can be prohibitively expensive to solve at industrial scale with traditional methods or in serial. The algorithm decomposes the problem into first-stage and second-stage optimization subproblems, which are then scheduled asynchronously for efficient evaluation in parallel. Asynchronous evaluation is crucial for achieving good load balancing and parallel efficiency because the second-stage optimization subproblems have highly varying execution times. The algorithm employs simple local second-order approximations of the second-stage optimal value functions, together with exact first- and second-order derivatives for the first-stage subproblems, to accelerate convergence. To reduce the number of evaluations of the computationally expensive second-stage subproblems required by the line search, we devised a flexible step-size control mechanism that can be tuned to improve performance for individual classes of problems. The algorithm is implemented in C++ using MPI non-blocking calls to overlap computation with communication and boost parallel efficiency. Numerical experiments were conducted on the Summit and Lassen supercomputers at Oak Ridge and Lawrence Livermore National Laboratories, and the scaling results show good parallel efficiency.
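As a sketch of the scheduling pattern the abstract describes (a master rank handing second-stage scenarios to whichever worker finishes first, using MPI non-blocking calls), the following minimal C++/MPI program may be useful. It is not the authors' solver: the scenario payload and solve_second_stage are hypothetical placeholders.

```cpp
// Minimal sketch (not the authors' code): asynchronous master-worker
// dispatch of second-stage subproblems with MPI non-blocking calls.
// Run with at least 2 ranks; tags and payloads are placeholders.
#include <mpi.h>
#include <cstdio>
#include <vector>

// Hypothetical stand-in for solving one second-stage subproblem:
// returns the optimal value for scenario s at first-stage point x.
static double solve_second_stage(int s, double x) { return (s + 1) * x * x; }

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    const int num_scenarios = 16;
    const double x = 0.5;                 // current first-stage iterate (toy)

    if (rank == 0) {                      // master: hand out scenarios, gather values
        int next = 0, done = 0;
        double total = 0.0;
        std::vector<double> buf(size);
        std::vector<MPI_Request> req(size, MPI_REQUEST_NULL);
        for (int w = 1; w < size && next < num_scenarios; ++w) {
            MPI_Send(&next, 1, MPI_INT, w, 0, MPI_COMM_WORLD); ++next;
            MPI_Irecv(&buf[w], 1, MPI_DOUBLE, w, 1, MPI_COMM_WORLD, &req[w]);
        }
        while (done < num_scenarios) {
            int w; MPI_Status st;
            MPI_Waitany(size, req.data(), &w, &st);   // first worker to finish
            total += buf[w]; ++done;
            if (next < num_scenarios) {               // keep that worker busy
                MPI_Send(&next, 1, MPI_INT, w, 0, MPI_COMM_WORLD); ++next;
                MPI_Irecv(&buf[w], 1, MPI_DOUBLE, w, 1, MPI_COMM_WORLD, &req[w]);
            }
        }
        int stop = -1;                    // tell all workers to shut down
        for (int w = 1; w < size; ++w) MPI_Send(&stop, 1, MPI_INT, w, 0, MPI_COMM_WORLD);
        std::printf("expected recourse value: %f\n", total / num_scenarios);
    } else {                              // worker: solve whatever arrives
        for (;;) {
            int s; MPI_Recv(&s, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            if (s < 0) break;
            double v = solve_second_stage(s, x);
            MPI_Send(&v, 1, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD);
        }
    }
    MPI_Finalize();
}
```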
2022-09-29
Tang, Houjun, Xie, Bing, Byna, Suren, Carns, Philip, Koziol, Quincey, Kannan, Sudarsun, Lofstead, Jay, Oral, Sarp.  2021.  SCTuner: An Autotuner Addressing Dynamic I/O Needs on Supercomputer I/O Subsystems. 2021 IEEE/ACM Sixth International Parallel Data Systems Workshop (PDSW). :29–34.
In high-performance computing (HPC), scientific applications often manage massive amounts of data using I/O libraries. These libraries provide convenient data model abstractions, help ensure data portability, and, most importantly, empower end users to improve I/O performance by tuning configurations across multiple layers of the HPC I/O stack. We propose SCTuner, an autotuner integrated within the I/O library itself to dynamically tune both the I/O library and the underlying I/O stack at application runtime. To this end, we introduce a statistical benchmarking method to profile the behaviors of individual supercomputer I/O subsystems with varied configurations across I/O layers. We use the benchmarking results as the built-in knowledge in SCTuner, implement an I/O pattern extractor, and plan to implement an online performance tuner as the SCTuner runtime. We conducted a benchmarking analysis on the Summit supercomputer and its GPFS file system Alpine. The preliminary results show that our method can effectively extract the consistent I/O behaviors of the target system under production load, laying the groundwork for I/O autotuning at application runtime.
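A minimal sketch of the autotuning idea, assuming (hypothetically) that offline benchmarking reduces to a rule mapping an extracted I/O pattern to stripe and buffer settings; SCTuner's actual knowledge base and tuning interface are not described in this abstract:

```cpp
// Minimal sketch of the idea (not SCTuner's actual code): map an I/O
// pattern extracted at runtime to tuning parameters derived from offline
// benchmarking of the target file system. All numbers are illustrative.
#include <cstdio>

struct IOPattern { long request_bytes; int writers; };      // extracted at runtime
struct IOConfig  { int stripe_count; long buffer_bytes; };  // applied to the stack

// Benchmark-derived rule of thumb (hypothetical): many small writers
// favor aggregation; large sequential writes favor wide striping.
IOConfig choose_config(const IOPattern& p) {
    if (p.request_bytes < (1L << 20) && p.writers > 64)
        return {4, 8L << 20};    // aggregate small writes, few stripes
    return {64, 16L << 20};      // stripe large writes widely
}

int main() {
    IOPattern observed{256 * 1024, 512};
    IOConfig c = choose_config(observed);
    std::printf("stripe_count=%d buffer=%ld\n", c.stripe_count, c.buffer_bytes);
}
```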
2022-07-01
Matri, Pierre, Ross, Robert.  2021.  Neon: Low-Latency Streaming Pipelines for HPC. 2021 IEEE 14th International Conference on Cloud Computing (CLOUD). :698–707.
Real-time data analysis, in contexts such as real-time monitoring or computational steering, is an important tool in many fields of science, allowing scientists to make the best use of limited resources such as sensors and HPC platforms. These tools typically rely on large amounts of continuously collected data that must be processed in near-real time to avoid wasting compute, storage, and networking resources. Streaming pipelines are a natural fit for this use case but are inconvenient to use on high-performance computing (HPC) systems because their system software environment diverges from that of big data frameworks, increasing both the cost and the complexity of the solution. In this paper we propose Neon, a clean-slate design of a streaming data processing framework for HPC systems that enables users to create arbitrarily large streaming pipelines. Experimental results on the Bebop supercomputer show significant performance improvements over Apache Storm, with up to 2x higher throughput and reduced latency.
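A minimal sketch of a streaming pipeline of the kind described, using a thread-per-stage design with a blocking channel; Neon's actual API is not given in the abstract, so all names here are illustrative:

```cpp
// Minimal sketch of a two-stage streaming pipeline (not Neon's API):
// a producer stage feeds a consumer stage through a blocking channel.
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <queue>
#include <thread>

template <typename T>
class Channel {                       // simple blocking channel between stages
    std::queue<T> q_;
    std::mutex m_;
    std::condition_variable cv_;
    bool closed_ = false;
public:
    void push(T v) { { std::lock_guard<std::mutex> l(m_); q_.push(std::move(v)); } cv_.notify_one(); }
    void close()   { { std::lock_guard<std::mutex> l(m_); closed_ = true; } cv_.notify_all(); }
    bool pop(T& out) {                // returns false once drained and closed
        std::unique_lock<std::mutex> l(m_);
        cv_.wait(l, [&] { return !q_.empty() || closed_; });
        if (q_.empty()) return false;
        out = std::move(q_.front()); q_.pop(); return true;
    }
};

int main() {
    Channel<int> ch;
    std::thread source([&] {          // stage 1: ingest raw samples
        for (int i = 0; i < 8; ++i) ch.push(i);
        ch.close();
    });
    std::thread sink([&] {            // stage 2: process in near-real time
        int v;
        while (ch.pop(v)) std::printf("processed sample %d\n", v * v);
    });
    source.join(); sink.join();
}
```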
2021-07-27
Loreti, Daniela, Artioli, Marcello, Ciampolini, Anna.  2020.  Solving Linear Systems on High Performance Hardware with Resilience to Multiple Hard Faults. 2020 International Symposium on Reliable Distributed Systems (SRDS). :266–275.
As large-scale linear equation systems are pervasive in many scientific fields, great efforts have been made over the last decade to develop efficient techniques for solving such systems, often relying on High Performance Computing (HPC) infrastructures to boost performance. In this context, the ever-growing scale of supercomputers inevitably increases the frequency of faults, making fault tolerance a crucial issue of HPC application development. A previous study [1] investigated the possibility of enhancing the Inhibition Method (IMe), a linear system solver for dense unstructured matrices, with fault tolerance to single hard errors, i.e. failures causing one computing processor to stop. This article extends [1] by proposing an efficient technique to obtain fault tolerance to multiple hard errors, which may occur concurrently on different processors belonging to the same or different machines. An improved parallel implementation is also proposed, which is particularly suitable for HPC environments and moves toward complete decentralization. The theoretical analysis suggests that the technique (which requires neither checkpointing nor rollback) provides fault tolerance to multiple faults at the price of a small overhead and a limited number of additional processors to store the checksums. Experimental results on an HPC architecture validate the theoretical study, showing promising performance improvements with respect to a popular fault-tolerant solving technique.
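For readers unfamiliar with checksum-based recovery, the classic idea the abstract alludes to can be sketched as follows (this is generic algorithm-based fault tolerance, not IMe's specific scheme):

```cpp
// Minimal sketch of checksum-based recovery (the classic ABFT idea, not
// IMe's actual scheme): a checksum row holding column sums lets one lost
// row be rebuilt from the survivors.
#include <cstdio>
#include <vector>

int main() {
    const int n = 3;
    std::vector<std::vector<double>> A = {{1, 2, 3}, {4, 5, 6}, {7, 8, 10}};

    // Encode: append a checksum row, checksum[j] = sum_i A[i][j].
    std::vector<double> checksum(n, 0.0);
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) checksum[j] += A[i][j];

    int lost = 1;                       // processor holding row 1 fails
    std::vector<double> recovered(n);
    for (int j = 0; j < n; ++j) {       // rebuild: checksum minus survivors
        recovered[j] = checksum[j];
        for (int i = 0; i < n; ++i)
            if (i != lost) recovered[j] -= A[i][j];
    }
    for (int j = 0; j < n; ++j) std::printf("%g ", recovered[j]);  // 4 5 6
    std::printf("\n");
    // Tolerating multiple concurrent failures requires multiple
    // independent checksums (e.g., weighted sums), one per fault.
}
```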
2021-06-02
Bychkov, Igor, Feoktistov, Alexander, Gorsky, Sergey, Edelev, Alexei, Sidorov, Ivan, Kostromin, Roman, Fereferov, Evgeniy, Fedorov, Roman.  2020.  Supercomputer Engineering for Supporting Decision-making on Energy Systems Resilience. 2020 IEEE 14th International Conference on Application of Information and Communication Technologies (AICT). :1–6.
We propose a new approach to creating a subject-oriented distributed computing environment. Such an environment is used to support decision-making in solving relevant problems of ensuring energy system resilience. The proposed approach is based on the idea of advancing and integrating the following important capabilities in supercomputer engineering: continuous integration, delivery, and deployment of system and applied software; high-performance computing in heterogeneous environments; multi-agent intelligent computation planning and resource allocation; big data processing and geo-information servicing for subject information, including weakly structured data; and decision-making support. This combination of capabilities, and the advances to them, is unique to the subject domain under consideration, which involves the combinatorial study of critical objects in energy systems. Decision-making alternatives are evaluated by applying combinatorial modeling and multi-criteria selection rules. The Orlando Tools framework is used as the basis for an integrated software environment. It implements a flexible modular approach to the development of scientific applications (distributed applied software packages).
2021-05-13
Sheptunov, Sergey A., Sukhanova, Natalia V..  2020.  The Problems of Design and Application of Switching Neural Networks in Creation of Artificial Intelligence. 2020 International Conference Quality Management, Transport and Information Security, Information Technologies (IT QM IS). :428–431.
A new switching architecture for neural networks is proposed. Switching neural networks consist of neurons and switchers, with the goal of reducing the cost of designing and training artificial neural networks. Realizing complex models, algorithms, and control methods requires neural networks of large size, and the number of all-to-all ("everyone with everyone") interconnection links grows quadratically with the number of neurons. Training big neural networks therefore requires supercomputer resources, and training time likewise grows with the number of neurons in the network. Switching neural networks are instead divided into fragments connected by switchers, and the network is trained fragment by fragment (see the sketch below). On the basis of switching neural networks, associative memory devices were designed with a number of neurons comparable to the human brain.
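A back-of-envelope sketch of the motivating link-count argument (illustrative assumptions, not figures from the paper):

```cpp
// Back-of-envelope sketch (illustrative, not from the paper): all-to-all
// wiring grows quadratically, while wiring k fragments internally and
// joining them through a switch grows far more slowly.
#include <cstdio>

int main() {
    const long long N = 100000;          // neurons
    const long long k = 100;             // fragments
    const long long m = N / k;           // neurons per fragment

    long long full = N * (N - 1) / 2;                // everyone-with-everyone
    long long fragmented = k * (m * (m - 1) / 2)     // links inside fragments
                         + k * m;                    // one switch port per neuron (assumed)

    std::printf("all-to-all links:  %lld\n", full);        // ~5.0e9
    std::printf("fragmented links:  %lld\n", fragmented);  // ~5.0e7
}
```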
2020-06-08
Chugunkov, Ilya V., Ivanov, Michael A., Kliuchnikova, Bogdana V..  2019.  Hash Functions are Based on Three-Dimensional Stochastic Transformations. 2019 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus). :202–205.
Methods based on injecting unpredictability into the means and objects of protection are called stochastic methods of information security. Against an active opponent, effective protection can be achieved only by using stochastic methods. The effectiveness of stochastic protection methods is determined by the quality of the underlying pseudo-random number generators and hash functions. The proposed hashing algorithm DOZENHASH is based on 3D stochastic transformations of the DOZEN family. The principal feature of the algorithm is that all input and output data blocks, as well as intermediate results of calculations, are represented as three-dimensional arrays of bytes with 4 bytes in each dimension. The resulting transformation thus has a high degree of parallelism at the level of elementary operations; in other words, it is well suited to implementation using heterogeneous supercomputer technologies.
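A toy sketch of the stated data layout only: a 4x4x4 byte state absorbed and mixed element-wise, where every element operation is independent and hence parallelizable. The DOZEN round function is not described in this abstract, so the diffusion step below is a made-up placeholder:

```cpp
// Toy sketch of the 4x4x4 data layout (the DOZEN round function is not
// given in the abstract; the mixing step here is a made-up placeholder).
#include <cstdint>
#include <cstdio>

using Cube = uint8_t[4][4][4];         // 64-byte three-dimensional block

// Hypothetical absorb step: XOR a message cube into the state, then apply
// a stand-in axis rotation with byte tweaks for diffusion. Every element
// update is independent, illustrating the fine-grained parallelism.
void absorb(Cube state, const Cube msg) {
    for (int x = 0; x < 4; ++x)
        for (int y = 0; y < 4; ++y)
            for (int z = 0; z < 4; ++z)
                state[x][y][z] ^= msg[x][y][z];
    Cube tmp;
    for (int x = 0; x < 4; ++x)        // placeholder diffusion: rotate axes
        for (int y = 0; y < 4; ++y)
            for (int z = 0; z < 4; ++z)
                tmp[y][z][x] = (uint8_t)(state[x][y][z] ^ (x + 4 * y + 16 * z));
    for (int x = 0; x < 4; ++x)
        for (int y = 0; y < 4; ++y)
            for (int z = 0; z < 4; ++z)
                state[x][y][z] = tmp[x][y][z];
}

int main() {
    Cube state = {}, msg = {};
    msg[0][0][0] = 0x42;
    absorb(state, msg);
    std::printf("state[0][0][0]=%02x\n", state[0][0][0]);
}
```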
2020-03-16
Molyakov, Andrey.  2019.  New security descriptor computing algorithm of Supercomputers. 2019 Third World Conference on Smart Trends in Systems Security and Sustainability (WorldS4). :349–350.
The author describes a computing algorithm based on a new scientific definition: "The resulting convolution, which takes into account changes in the significant bits of the variables of the Zhegalkin polynomial, is a superposition of hash function calculations for the i-th process."
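For context, the Zhegalkin polynomial of a Boolean function can be computed from its truth table with the standard Moebius (butterfly) transform over GF(2); the author's convolution over those coefficients is not specified in the abstract. A minimal sketch:

```cpp
// Sketch for context only: compute the Zhegalkin (ANF) coefficients of a
// Boolean function from its truth table via the GF(2) Moebius transform.
// The paper's convolution over these coefficients is not specified here.
#include <cstdio>
#include <vector>

int main() {
    const int n = 3;                        // number of variables
    // Truth table of f(x2,x1,x0) = x0 XOR (x1 AND x2), indexed by bits.
    std::vector<int> f(1 << n);
    for (int x = 0; x < (1 << n); ++x)
        f[x] = ((x & 1) ^ (((x >> 1) & 1) & ((x >> 2) & 1)));

    std::vector<int> a = f;                 // in-place butterfly over GF(2)
    for (int step = 1; step < (1 << n); step <<= 1)
        for (int x = 0; x < (1 << n); ++x)
            if (x & step) a[x] ^= a[x ^ step];

    for (int m = 0; m < (1 << n); ++m)      // nonzero a[m] => monomial m in ANF
        if (a[m]) std::printf("monomial mask %d present\n", m);  // prints 1 and 6
}
```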
2018-04-02
Innokentievich, T. P., Vasilevich, M. V..  2017.  The Evaluation of the Cryptographic Strength of Asymmetric Encryption Algorithms. 2017 Second Russia and Pacific Conference on Computer Technology and Applications (RPC). :180–183.
We propose a method for the comparative evaluation of the cryptographic strength of the asymmetric algorithm RSA and the existing GOST R 34.10-2001. We describe the fundamental design ratios; the method is based on the computing capacity required for decryption and on forecasts for the development of computer technology.
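For context, the textbook strength estimates on which such a comparison typically rests (standard results, not taken from the paper): the general number field sieve for factoring an RSA modulus, versus a generic attack on the elliptic-curve group used by GOST R 34.10-2001.

```latex
% Textbook complexity estimates (context only, not the paper's analysis).
% GNFS cost of factoring an RSA modulus n:
\[
  T_{\mathrm{GNFS}}(n)
    = \exp\!\Big(\big((64/9)^{1/3} + o(1)\big)\,(\ln n)^{1/3}(\ln\ln n)^{2/3}\Big)
\]
% Generic (Pollard rho) cost of the elliptic-curve discrete logarithm
% underlying GOST R 34.10-2001, for a group of m-bit order q:
\[
  T_{\rho}(q) = O(\sqrt{q}) \approx 2^{m/2}
\]
```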
2017-12-12
Zhu, G., Zeng, Y., Guo, M..  2017.  A Security Analysis Method for Supercomputing Users' Behavior. 2017 IEEE 4th International Conference on Cyber Security and Cloud Computing (CSCloud). :287–293.
Supercomputers are widely applied in various domains and have the advantages of high processing capability and mass storage. With a growing number of supercomputing users, system security is receiving comprehensive attention and becoming more and more important. In this paper, according to the characteristics of the supercomputing environment, we perform an in-depth analysis of existing security problems in the process of using resources. To solve these problems, we propose a security analysis method and a prototype system for supercomputing users' behavior. The basic idea is to reconstruct complete user behavior paths and operation records based on the supercomputing business process and to track the use of resources. Finally, the method is evaluated and the results show that this security analysis of users' behavior can help administrators detect security incidents in time and respond quickly. The ultimate purpose is to optimize and improve the security level of the whole system.
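A minimal sketch of the path-reconstruction idea, stitching per-user events from different logs into one time-ordered record; the paper's actual pipeline and log schema are not given in the abstract, so the fields below are illustrative:

```cpp
// Minimal sketch (illustrative, not the paper's system): merge per-user
// event records from different logs into a time-ordered behavior path
// that an administrator can audit.
#include <algorithm>
#include <cstdio>
#include <string>
#include <vector>

struct Event { std::string user, action; long ts; };  // ts: seconds since epoch

int main() {
    std::vector<Event> logs = {            // merged from login/scheduler/storage logs
        {"alice", "job_submit",    1005},
        {"alice", "login",         1000},
        {"bob",   "login",         1002},
        {"alice", "data_download", 1200},
    };
    std::stable_sort(logs.begin(), logs.end(), [](const Event& a, const Event& b) {
        return a.user != b.user ? a.user < b.user : a.ts < b.ts;
    });
    for (const auto& e : logs)             // one ordered path per user
        std::printf("%s @%ld %s\n", e.user.c_str(), e.ts, e.action.c_str());
}
```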
2017-12-04
Johnston, B., Lee, B., Angove, L., Rendell, A..  2017.  Embedded Accelerators for Scientific High-Performance Computing: An Energy Study of OpenCL Gaussian Elimination Workloads. 2017 46th International Conference on Parallel Processing Workshops (ICPPW). :59–68.
Energy-efficient High-Performance Computing (HPC) is becoming increasingly important. Recent ventures into this space have introduced an unlikely candidate for achieving exascale scientific computing hardware with a small energy footprint: ARM processors and embedded GPU accelerators, originally developed for energy efficiency in mobile devices where battery life is critical, are being repurposed and deployed in the next generation of supercomputers. Unfortunately, the performance of executing scientific workloads on many of these devices is largely unknown, yet the bulk of the computation required in high-performance supercomputers is scientific. We present an analysis of one such scientific code, in the form of Gaussian Elimination, and evaluate both execution time and energy used on a range of embedded accelerator SoCs, including three ARM CPUs and two mobile GPUs. Understanding how these low-power devices perform on scientific workloads will be critical in selecting appropriate hardware for these supercomputers: how can we estimate the performance of tens of thousands of these chips if the performance of one is largely unknown?
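A minimal sketch of the workload under study: plain Gaussian elimination with partial pivoting, timed and converted to an energy estimate. The authors' OpenCL kernels and measured power are not reproduced here; avg_watts is a hypothetical meter reading.

```cpp
// Minimal sketch (not the authors' OpenCL kernels): Gaussian elimination
// with partial pivoting, with energy estimated as measured average power
// multiplied by elapsed time.
#include <chrono>
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
    const int n = 256;
    std::vector<double> A(n * n);
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            A[i * n + j] = (i == j) ? n : 1.0;   // diagonally dominant test matrix

    auto t0 = std::chrono::steady_clock::now();
    for (int k = 0; k < n; ++k) {
        int p = k;                                // partial pivoting
        for (int i = k + 1; i < n; ++i)
            if (std::fabs(A[i * n + k]) > std::fabs(A[p * n + k])) p = i;
        for (int j = 0; j < n; ++j) std::swap(A[k * n + j], A[p * n + j]);
        for (int i = k + 1; i < n; ++i) {         // eliminate below the pivot
            double f = A[i * n + k] / A[k * n + k];
            for (int j = k; j < n; ++j) A[i * n + j] -= f * A[k * n + j];
        }
    }
    double secs = std::chrono::duration<double>(
        std::chrono::steady_clock::now() - t0).count();

    const double avg_watts = 5.0;                 // hypothetical meter reading
    std::printf("time %.3fs, est. energy %.2f J\n", secs, avg_watts * secs);
}
```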