Biblio
The Named Data Network (NDN) is a promising network paradigm for content distribution based on caching. However, it may put consumer privacy at risk, as the adversary may identify the content, the name and the signature (namely a certificate) through side-channel timing responses from the cache of the routers. The adversary may identify the content name and the consumer node by distinguishing between cached and un- cached contents. In order to mitigate the timing attack, effective countermeasure methods have been proposed by other authors, such as random caching, random freshness, and probabilistic caching. In this work, we have implemented a timing attack scenario to evaluate the efficiency of these countermeasures and to demonstrate how the adversary can be detected. For this goal, a brute force timing attack scenario based on a real topology was developed, which is the first brute force attack model applied in NDN. Results show that the adversary nodes can be effectively distinguished from other legitimate consumers during the attack period. It is also proposed a multi-level mechanism to detect an adversary node. Through this approach, the content distribution performance can be mitigated against the attack.
Community structure detection in social networks has become a big challenge. Various methods in the literature have been presented to solve this challenge. Recently, several methods have also been proposed to solve this challenge based on a mapping-reduction model, in which data and algorithms are divided between different process nodes so that the complexity of time and memory of community detection in large social networks is reduced. In this paper, a mapping-reduction model is first proposed to detect the structure of communities. Then the proposed framework is rewritten according to a new mechanism called distributed cache memory; distributed cache memory can store different values associated with different keys and, if necessary, put them at different computational nodes. Finally, the proposed rewritten framework has been implemented using SPARK tools and its implementation results have been reported on several major social networks. The performed experiments show the effectiveness of the proposed framework by varying the values of various parameters.
Nowadays, video streaming over HTTP is one of the most dominant Internet applications, using adaptive video techniques. Network assisted approaches have been proposed and are being standardized in order to provide high QoE for the end-users of such applications. SAND is a recent MPEG standard where DASH Aware Network Elements (DANEs) are introduced for this purpose. As web-caches are one of the main components of the SAND architecture, the location and the connectivity of these web-caches plays an important role in the user's QoE. The nature of SAND and DANE provides a good foundation for software controlled virtualized DASH environments, and in this paper, we propose a cache location algorithm and a cache migration algorithm for virtualized SAND deployments. The optimal locations for the virtualized DANEs is determined by an SDN controller and migrates it based on gathered statistics. The performance of the resulting system shows that, when SDN and NFV technologies are leveraged in such systems, software controlled virtualized approaches can provide an increase in QoE.
Big data processing systems are becoming increasingly more present in cloud workloads. Consequently, they are starting to incorporate more sophisticated mechanisms from traditional database and distributed systems. We focus in this work on the use of caching policies, which for big data raise important new challenges. Not only they must respond to new variants of the trade-off between hit rate, response time, and the space consumed by the cache, but they must do so at possibly higher volume and velocity than web and database workloads. Previous caching policies have not been tested experimentally with big data workloads. We address these challenges in this work. We propose the Read Density family of policies, which is a principled approach to quantify the utility of cached objects through a family of utility functions that depend on the frequency of reads of an object. We further design the Approximate Histogram, which is a policy-based technique based on an array of counters. This technique promises to achieve runtime-space efficient computation of the metric required by the cache policy. We evaluate through trace-based simulation the caching policies from the Read Density family, and compare them with over ten state-of-the-art alternatives. We use two workload traces representative for big data processing, collected from commercial Spark and MapReduce deployments. While we achieve comparable performance to the state-of-art with less parameters, meaningful performance improvement for big data workloads remain elusive.
With the rapid increase in the use of mobile devices in people's daily lives, mobile data traffic is exploding in recent years. In the edge computing environment where edge servers are deployed around mobile users, caching popular data on edge servers can ensure mobile users' fast access to those data and reduce the data traffic between mobile users and the centralized cloud. Existing studies consider the data cache problem with a focus on the reduction of network delay and the improvement of mobile devices' energy efficiency. In this paper, we attack the data caching problem in the edge computing environment from the service providers' perspective, who would like to maximize their venues of caching their data. This problem is complicated because data caching produces benefits at a cost and there usually is a trade-off in-between. In this paper, we formulate the data caching problem as an integer programming problem, and maximizes the revenue of the service provider while satisfying a constraint for data access latency. Extensive experiments are conducted on a real-world dataset that contains the locations of edge servers and mobile users, and the results reveal that our approach significantly outperform the baseline approaches.
A systematic study of technologies and concepts used for the design and construction of distributed fail-safe web systems has been conducted. The general principles of the design of distributed web-systems and information technologies that are used in the design of web-systems are considered. As a result of scientific research, it became clear that data backup is a determining attribute of most web systems serving. Thus, the main role in building modern web systems is to scaling them. Scaling in distributed systems is used when performing a particular operation requires a large amount of computing resources. There are two scaling options, namely vertical and horizontal. Vertical scaling is to increase the performance of existing components in order to increase overall productivity. However, for the construction of distributed systems, use horizontal scaling. Horizontal scaling is that the system is split into small components and placed on various physical computers. This approach allows the addition of new nodes to increase the productivity of the web system as a whole.
Shortest path queries on road networks are widely used in location-based services (LBS), e.g., finding the shortest route from my home to the airport through Google Maps. However, when there are a large number of path queries arrived concurrently or in a short while, an LBS provider (e.g., Google Maps) has to endure a high workload and then may lead to a long response time to users. Therefore, path caching services are utilized to accelerate large-scale path query processing, which try to store the historical path results and reuse them to answer the coming queries directly. However, most of existing path caches are organized based on nodes of paths; hence, the underlying road network topology is still needed to answer a path query when its querying origin or destination lies on edges. To overcome this limitation, we propose an edge-based shortest path cache in this paper that can efficiently handle queries without needing any road information, which is much more practical in the real world. We achieve this by designing a totally new edge-based path cache structure, an efficient R-tree-based cache lookup algorithm, and a greedy-based cache construction algorithm. Extensive experiments on a real road network and real point-of-interest datasets are conducted, and the results show the efficiency, scalability, and applicability of our proposed caching techniques.
Caching methods are developed since 50 years for paging in CPU and database systems, and since 25 years for web caching as main application areas among others. Pages of unique size are usual in CPU caches, whereas web caches are storing data chunks of different size in a widely varying range. We study the impact of different object sizes on the performance and the overhead of web caching. This entails different caching goals, starting from the byte and object hit ratio to a generalized value hit ratio for optimized costs and benefits of caching regarding traffic engineering (TE), reduced delays and other QoS measures. The selection of the cache contents turns out to be crucial for the web cache efficiency with awareness of the size and other properties in a score for each object. We introduce a new class of rank exchange caching methods and show how their performance compares to other strategies with extensions needed to include the size and scores for QoS and TE caching goals. Finally, we derive bounds on the object, byte and value hit ratio for the independent request model (IRM) based on optimum knapsack solutions of the cache content.
As Web traffics is increasing on the Internet, caching solutions for Web systems are becoming more important since they can greatly expand system scalability. An important part of a caching solution is cache replacement policy, which is responsible for selecting victim items that should be removed in order to make space for new objects. Typical replacement policies used in practice only take advantage of temporal reference locality by removing the least recently/frequently requested items from the cache. Although those policies work well in memory or filesystem cache, they are inefficient for Web systems since they do not exploit semantic relationship between Web items. This paper presents a semantic-aware caching policy that can be used in Web systems to enhance scalability. The proposed caching mechanism defines semantic distance from a web page to a set of pivot pages and use the semantic distances as a metric for choosing victims. Also, it use a function-based metric that combines access frequency and cache item size for tie-breaking. Our simulations show that out enhancements outperform traditional methods in terms of hit rate, which can be useful for websites with many small and similar-in-size web objects.
When implemented on real systems, cryptographic algorithms are vulnerable to attacks observing their execution behavior, such as cache-timing attacks. Designing protected implementations must be done with knowledge and validation tools as early as possible in the development cycle. In this article we propose a methodology to assess the robustness of the candidates for the NIST post-quantum standardization project to cache-timing attacks. To this end we have developed a dedicated vulnerability research tool. It performs a static analysis with tainting propagation of sensitive variables across the source code and detects leakage patterns. We use it to assess the security of the NIST post-quantum cryptography project submissions. Our results show that more than 80% of the analyzed implementations have at least one potential flaw, and three submissions total more than 1000 reported flaws each. Finally, this comprehensive study of the competitors security allows us to identify the most frequent weaknesses amongst candidates and how they might be fixed.
In military operations, Commander's Intent describes the desired end state and purpose of the operation, expressed in a concise and clear manner. Command by intent is a paradigm that empowers subordinate units to exercise measured initiative to meet mission goals and accept prudent risk within commander's intent. It improves agility of military operations by allowing exploitation of local opportunities without an explicit directive from the commander to do so. This paper discusses what the paradigm entails in terms of architectural decisions for data fusion systems tasked with real-time information collection to satisfy operational mission goals. In our system, information needs of decisions are expressed at a high level, and shared among relevant nodes. The selected nodes, then, jointly operate to meet mission information needs by forwarding and caching relevant data without explicit directives regarding the objects to fetch and sources to contact. A preliminary evaluation of the system is presented using a target tracking application, set in the context of a NATO-based mission scenario, called Anglova. Evaluation results show that delegating some decision authority to the data fusion system (in terms of objects to fetch and sources to contact) allows it to save more network resources, while also increasing mission success rate. The system is therefore particularly well-suited to operation in partially denied or contested environments, where resource bottlenecks caused by adversarial activity impair one's ability to collect real-time information for mission-critical decision making.
Byte-addressable non-volatile memory technology is emerging as an alternative for DRAM for main memory. This new Non-Volatile Main Memory (NVMM) allows programmers to store important data in data structures in memory instead of serializing it to the file system, thereby providing a substantial performance boost. However, modern systems reorder memory operations and utilize volatile caches for better performance, making it difficult to ensure a consistent state in NVMM. Intel recently announced a new set of persistence instructions, clflushopt, clwb, and pcommit. These new instructions make it possible to implement fail-safe code on NVMM, but few workloads have been written or characterized using these new instructions. In this work, we describe how these instructions work and how they can be used to implement write-ahead logging based transactions. We implement several common data structures and kernels and evaluate the performance overhead incurred over traditional non-persistent implementations. In particular, we find that persistence instructions occur in clusters along with expensive fence operations, they have long latency, and they add a significant execution time overhead, on average by 20.3% over code with logging but without fence instructions to order persists. To deal with this overhead and alleviate the performance bottleneck, we propose to speculate past long latency persistency operations using checkpoint-based processing. Our speculative persistence architecture reduces the execution time overheads to only 3.6%.
Wikipedia is one of the most popular information platforms on the Internet. The user access pattern to Wikipedia pages depends on their relevance in the current worldwide social discourse. We use publically available statistics about the top-1000 most popular pages on each day to estimate the efficiency of caches for support of the platform. While the data volumes are moderate, the main goal of Wikipedia caches is to reduce access times for page views and edits. We study the impact of most popular pages on the achievable cache hit rate in comparison to Zipf request distributions and we include daily dynamics in popularity.
Distributed storage systems and caching systems are becoming widespread, and this motivates the increasing interest on assessing their achievable performance in terms of reliability for legitimate users and security against malicious users. While the assessment of reliability takes benefit of the availability of well established metrics and tools, assessing security is more challenging. The classical cryptographic approach aims at estimating the computational effort for an attacker to break the system, and ensuring that it is far above any feasible amount. This has the limitation of depending on attack algorithms and advances in computing power. The information-theoretic approach instead exploits capacity measures to achieve unconditional security against attackers, but often does not provide practical recipes to reach such a condition. We propose a mixed cryptographic/information-theoretic approach with a twofold goal: estimating the levels of information-theoretic security and defining a practical scheme able to achieve them. In order to find optimal choices of the parameters of the proposed scheme, we exploit an effective probabilistic model checker, which allows us to overcome several limitations of more conventional methods.
Security protection is a concern for the Internet of Things (IoT) which performs data exchange autonomously over the internet for remote monitoring, automation and other applications. IoT implementations has raised concerns over its security and various research has been conducted to find an effective solution for this. Thus, this work focus on the analysis of an asymmetric encryption scheme, AA-Beta (AAβ) on a platform constrained in terms of processor capability, storage and random access Memory (RAM). For this work, the platform focused is ARM Cortex-M7 microcontroller. The encryption and decryption's performance on the embedded microcontroller is realized and time executed is measured. By enabled the I-Cache (Instruction cache) and D-Cache (Data Cache), the performances are 50% faster compared to disabled the D-Cache and I-Cache. The performance is then compared to our previous work on System on Chip (SoC). This is to analyze the gap of the SoC that has utilized the full GNU Multiple Precision Arithmetic Library (GMP) package versus ARM Cortex-M7 that using the mini-gmp package in term of the footprint and the actual performance.
A database is a vast collection of data which helps us to collect, retrieve, organize and manage the data in an efficient and effective manner. Databases are critical assets. They store client details, financial information, personal files, company secrets and other data necessary for business. Today people are depending more on the corporate data for decision making, management of customer service and supply chain management etc. Any loss, corrupted data or unavailability of data may seriously affect its performance. The database security should provide protected access to the contents of a database and should preserve the integrity, availability, consistency, and quality of the data This paper describes the architecture based on placing the Elliptical curve cryptography module inside database management software (DBMS), just above the database cache. Using this method only selected part of the database can be encrypted instead of the whole database. This architecture allows us to achieve very strong data security using ECC and increase performance using cache.
Cache coherence protocol bugs can cause multicores to fail. Existing coherence verification approaches incur state explosion at small scales or require considerable human effort. As protocols' complexity and multicores' core counts increase, verification continues to be a challenge. Recently, researchers proposed fractal coherence which achieves scalable verification by enforcing observational equivalence between sub-systems in the coherence protocol. A larger sub-system is verified implicitly if a smaller sub-system has been verified. Unfortunately, fractal protocols suffer from two fundamental limitations: (1) indirect-communication: sub-systems cannot directly communicate and (2) partially-serial-invalidations: cores must be invalidated in a specific, serial order. These limitations disallow common performance optimizations used by conventional directory protocols: reply-forwarding where caches communicate directly and parallel invalidations. Therefore, fractal protocols lack performance scalability while directory protocols lack verification scalability. To enable both performance and verification scalability, we propose Fractal++ which employs a new class of protocol optimizations for verification-constrained architectures: decoupled-replies, contention-hints, and fully-parallel-fractal-invalidations. The first two optimizations allow reply-forwarding-like performance while the third optimization enables parallel invalidations in fractal protocols. Unlike conventional protocols, Fractal++ preserves observational equivalence and hence is scalably verifiable. In 32-core simulations of single- and four-socket systems, Fractal++ performs nearly as well as a directory protocol while providing scalable verifiability whereas the best-performing previous fractal protocol performs 8% on average and up to 26% worse with a single-socket and 12% on average and up to 34% worse with a longer-latency multi-socket system.