Visible to the public Biblio

Filters: Keyword is distributed memory systems  [Clear All Filters]
2020-02-18
Fattahi, Saeideh, Yazdani, Reza, Vahidipour, Seyyed Mehdi.  2019.  Discovery of Society Structure in A Social Network Using Distributed Cache Memory. 2019 5th International Conference on Web Research (ICWR). :264–269.

Community structure detection in social networks has become a big challenge. Various methods in the literature have been presented to solve this challenge. Recently, several methods have also been proposed to solve this challenge based on a mapping-reduction model, in which data and algorithms are divided between different process nodes so that the complexity of time and memory of community detection in large social networks is reduced. In this paper, a mapping-reduction model is first proposed to detect the structure of communities. Then the proposed framework is rewritten according to a new mechanism called distributed cache memory; distributed cache memory can store different values associated with different keys and, if necessary, put them at different computational nodes. Finally, the proposed rewritten framework has been implemented using SPARK tools and its implementation results have been reported on several major social networks. The performed experiments show the effectiveness of the proposed framework by varying the values of various parameters.

2018-03-05
Baldi, M., Chiaraluce, F., Senigagliesi, L., Spalazzi, L., Spegni, F..  2017.  Security in Heterogeneous Distributed Storage Systems: A Practically Achievable Information-Theoretic Approach. 2017 IEEE Symposium on Computers and Communications (ISCC). :1021–1028.

Distributed storage systems and caching systems are becoming widespread, and this motivates the increasing interest on assessing their achievable performance in terms of reliability for legitimate users and security against malicious users. While the assessment of reliability takes benefit of the availability of well established metrics and tools, assessing security is more challenging. The classical cryptographic approach aims at estimating the computational effort for an attacker to break the system, and ensuring that it is far above any feasible amount. This has the limitation of depending on attack algorithms and advances in computing power. The information-theoretic approach instead exploits capacity measures to achieve unconditional security against attackers, but often does not provide practical recipes to reach such a condition. We propose a mixed cryptographic/information-theoretic approach with a twofold goal: estimating the levels of information-theoretic security and defining a practical scheme able to achieve them. In order to find optimal choices of the parameters of the proposed scheme, we exploit an effective probabilistic model checker, which allows us to overcome several limitations of more conventional methods.

2017-12-28
Rolinger, T. B., Simon, T. A., Krieger, C. D..  2017.  Performance challenges for heterogeneous distributed tensor decompositions. 2017 IEEE High Performance Extreme Computing Conference (HPEC). :1–7.

Tensor decompositions, which are factorizations of multi-dimensional arrays, are becoming increasingly important in large-scale data analytics. A popular tensor decomposition algorithm is Canonical Decomposition/Parallel Factorization using alternating least squares fitting (CP-ALS). Tensors that model real-world applications are often very large and sparse, driving the need for high performance implementations of decomposition algorithms, such as CP-ALS, that can take advantage of many types of compute resources. In this work we present ReFacTo, a heterogeneous distributed tensor decomposition implementation based on DeFacTo, an existing distributed memory approach to CP-ALS. DFacTo reduces the critical routine of CP-ALS to a series of sparse matrix-vector multiplications (SpMVs). ReFacTo leverages GPUs within a cluster via MPI to perform these SpMVs and uses OpenMP threads to parallelize other routines. We evaluate the performance of ReFacTo when using NVIDIA's GPU-based cuSPARSE library and compare it to an alternative implementation that uses Intel's CPU-based Math Kernel Library (MKL) for the SpMV. Furthermore, we provide a discussion of the performance challenges of heterogeneous distributed tensor decompositions based on the results we observed. We find that on up to 32 nodes, the SpMV of ReFacTo when using MKL is up to 6.8× faster than ReFacTo when using cuSPARSE.

2015-05-06
Bi Hong, Wan Choi.  2014.  Asymptotic analysis of failed recovery probability in a distributed wireless storage system with limited sum storage capacity. Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on. :6459-6463.

In distributed wireless storage systems, failed recovery probability depends on not only wireless channel conditions but also storage size of each distributed storage node. For efficient utilization of limited storage capacity, we asymptotically analyze the failed recovery probability of a distributed wireless storage system with a sum storage capacity constraint when signal-to-noise ratio goes to infinity, and find the optimal storage allocation strategy across distributed storage nodes in terms of the asymptotic failed recovery probability. It is also shown that when the number of storage nodes is sufficiently large the storage size required at each node is not so large for high exponential order of the failed recovery probability.

2015-05-04
Gimenez, A., Gamblin, T., Rountree, B., Bhatele, A., Jusufi, I., Bremer, P.-T., Hamann, B..  2014.  Dissecting On-Node Memory Access Performance: A Semantic Approach. High Performance Computing, Networking, Storage and Analysis, SC14: International Conference for. :166-176.

Optimizing memory access is critical for performance and power efficiency. CPU manufacturers have developed sampling-based performance measurement units (PMUs) that report precise costs of memory accesses at specific addresses. However, this data is too low-level to be meaningfully interpreted and contains an excessive amount of irrelevant or uninteresting information. We have developed a method to gather fine-grained memory access performance data for specific data objects and regions of code with low overhead and attribute semantic information to the sampled memory accesses. This information provides the context necessary to more effectively interpret the data. We have developed a tool that performs this sampling and attribution and used the tool to discover and diagnose performance problems in real-world applications. Our techniques provide useful insight into the memory behaviour of applications and allow programmers to understand the performance ramifications of key design decisions: domain decomposition, multi-threading, and data motion within distributed memory systems.