Biblio
Information visualization applications have become ubiquitous, in no small part thanks to the ease of wide distribution and deployment to users enabled by the web browser. Scientific visualization applications, relying on native code libraries and parallel processing, have been less suited to such widespread distribution, as browsers do not provide the required libraries or compute capabilities. In this paper, we revisit this gap in visualization technologies and explore how new web technologies, WebAssembly and WebGPU, can be used to deploy powerful visualization solutions for large-scale scientific data in the browser. In particular, we evaluate the programming effort required to bring scientific visualization applications to the browser through these technologies and assess their competitiveness against classic native solutions. As a main example, we present a new GPU-driven isosurface extraction method for block-compressed data sets, that is suitable for interactive isosurface computation on large volumes in resource-constrained environments, such as the browser. We conclude that web browsers are on the verge of becoming a competitive platform for even the most demanding scientific visualization tasks, such as interactive visualization of isosurfaces from a 1TB DNS simulation. We call on researchers and developers to consider investing in a community software stack to ease use of these upcoming browser features to bring accessible scientific visualization to the browser.
We present ClearTrack, a system that tracks meta-data for each primitive value in Java programs to detect and nullify a range of vulnerabilities such as integer overflow/underflow and SQL/command injection vulnerabilities. Contributions include new techniques for eliminating false positives associated with benign integer overflows and underflows, new metadata-aware techniques for detecting and nullifying SQL/command command injection attacks, and results from an independent evaluation team. These results show that 1) ClearTrack operates successfully on Java programs comprising hundreds of thousands of lines of code (including instrumented jar files and Java system libraries, the majority of the applications comprise over 3 million lines of code), 2) because of computations such as cryptography and hash table calculations, these applications perform millions of benign integer overflows and underflows, and 3) ClearTrack successfully detects and nullifies all tested integer overflow and underflow and SQL/command injection vulnerabilities in the benchmark applications.
Big data processing systems are becoming increasingly more present in cloud workloads. Consequently, they are starting to incorporate more sophisticated mechanisms from traditional database and distributed systems. We focus in this work on the use of caching policies, which for big data raise important new challenges. Not only they must respond to new variants of the trade-off between hit rate, response time, and the space consumed by the cache, but they must do so at possibly higher volume and velocity than web and database workloads. Previous caching policies have not been tested experimentally with big data workloads. We address these challenges in this work. We propose the Read Density family of policies, which is a principled approach to quantify the utility of cached objects through a family of utility functions that depend on the frequency of reads of an object. We further design the Approximate Histogram, which is a policy-based technique based on an array of counters. This technique promises to achieve runtime-space efficient computation of the metric required by the cache policy. We evaluate through trace-based simulation the caching policies from the Read Density family, and compare them with over ten state-of-the-art alternatives. We use two workload traces representative for big data processing, collected from commercial Spark and MapReduce deployments. While we achieve comparable performance to the state-of-art with less parameters, meaningful performance improvement for big data workloads remain elusive.
Research on keystroke dynamics has the good potential to offer continuous authentication that complements conventional authentication methods in combating insider threats and identity theft before more harm can be done to the genuine users. Unfortunately, the large amount of data required by free-text keystroke authentication often contain personally identifiable information, or PII, and personally sensitive information, such as a user's first name and last name, username and password for an account, bank card numbers, and social security numbers. As a result, there are privacy risks associated with keystroke data that must be mitigated before they are shared with other researchers. We conduct a systematic study to remove PII's from a recent large keystroke dataset. We find substantial amounts of PII's from the dataset, including names, usernames and passwords, social security numbers, and bank card numbers, which, if leaked, may lead to various harms to the user, including personal embarrassment, blackmails, financial loss, and identity theft. We thoroughly evaluate the effectiveness of our detection program for each kind of PII. We demonstrate that our PII detection program can achieve near perfect recall at the expense of losing some useful information (lower precision). Finally, we demonstrate that the removal of PII's from the original dataset has only negligible impact on the detection error tradeoff of the free-text authentication algorithm by Gunetti and Picardi. We hope that this experience report will be useful in informing the design of privacy removal in future keystroke dynamics based user authentication systems.
Modern processors use branch prediction and speculative execution to maximize performance. For example, if the destination of a branch depends on a memory value that is in the process of being read, CPUs will try to guess the destination and attempt to execute ahead. When the memory value finally arrives, the CPU either discards or commits the speculative computation. Speculative logic is unfaithful in how it executes, can access the victim's memory and registers, and can perform operations with measurable side effects. Spectre attacks involve inducing a victim to speculatively perform operations that would not occur during correct program execution and which leak the victim's confidential information via a side channel to the adversary. This paper describes practical attacks that combine methodology from side channel attacks, fault attacks, and return-oriented programming that can read arbitrary memory from the victim's process. More broadly, the paper shows that speculative execution implementations violate the security assumptions underpinning numerous software security mechanisms, including operating system process separation, containerization, just-in-time (JIT) compilation, and countermeasures to cache timing and side-channel attacks. These attacks represent a serious threat to actual systems since vulnerable speculative execution capabilities are found in microprocessors from Intel, AMD, and ARM that are used in billions of devices. While makeshift processor-specific countermeasures are possible in some cases, sound solutions will require fixes to processor designs as well as updates to instruction set architectures (ISAs) to give hardware architects and software developers a common understanding as to what computation state CPU implementations are (and are not) permitted to leak.
With the development of cyber threats on the Internet, the number of malware, especially unknown malware, is also dramatically increasing. Since all of malware cannot be analyzed by analysts, it is very important to find out new malware that should be analyzed by them. In order to cope with this issue, the existing approaches focused on malware classification using static or dynamic analysis results of malware. However, the static and the dynamic analyses themselves are also too costly and not easy to build the isolated, secure and Internet-like analysis environments such as sandbox. In this paper, we propose a lightweight malware classification method based on detection results of anti-virus software. Since the proposed method can reduce the volume of malware that should be analyzed by analysts, it can be used as a preprocess for in-depth analysis of malware. The experimental showed that the proposed method succeeded in classification of 1,000 malware samples into 187 unique groups. This means that 81% of the original malware samples do not need to analyze by analysts.
Stochastic Computing (SC) is an alternative design paradigm particularly useful for applications where cost is critical. SC has been applied to neural networks, as neural networks are known for their high computational complexity. However previous work in this area has critical limitations such as the fully-parallel architecture assumption, which prevent them from being applicable to recent ones such as convolutional neural networks, or ConvNets. This paper presents the first SC architecture for ConvNets, shows its feasibility, with detailed analyses of implementation overheads. Our SC-ConvNet is a hybrid between SC and conventional binary design, which is a marked difference from earlier SC-based neural networks. Though this might seem like a compromise, it is a novel feature driven by the need to support modern ConvNets at scale, which commonly have many, large layers. Our proposed architecture also features hybrid layer composition, which helps achieve very high recognition accuracy. Our detailed evaluation results involving functional simulation and RTL synthesis suggest that SC-ConvNets are indeed competitive with conventional binary designs, even without considering inherent error resilience of SC.
The start-up value of an SRAM cell is unique, random, and unclonable as it is determined by the inherent process mismatch between transistors. These properties make SRAM an attractive circuit for generating encryption keys. The primary challenge for SRAM based key generation, however, is the poor stability when the circuit is subject to random noise, temperature and voltage changes, and device aging. Temporal majority voting (TMV) and bit masking were used in previous works to identify and store the location of unstable or marginally stable SRAM cells. However, TMV requires a long test time and significant hardware resources. In addition, the number of repetitive power-ups required to find the most stable cells is prohibitively high. To overcome the shortcomings of TMV, we propose a novel data remanence based technique to detect SRAM cells with the highest stability for reliable key generation. This approach requires only two remanence tests: writing `1' (or `0') to the entire array and momentarily shutting down the power until a few cells flip. We exploit the fact that the cells that are easily flipped are the most robust cells when written with the opposite data. The proposed method is more effective in finding the most stable cells in a large SRAM array than a TMV scheme with 1,000 power-up tests. Experimental studies show that the 256-bit key generated from a 512 kbit SRAM using the proposed data remanence method is 100% stable under different temperatures, power ramp up times, and device aging.
Hierarchical Graph Neuron (HGN) is an extension of network-centric algorithm called Graph Neuron (GN), which is used to perform parallel distributed pattern recognition. In this research, HGN scheme is used to classify intrusion attacks in computer networks. Patterns of intrusion attacks are preprocessed in three steps: selecting attributes using information gain attribute evaluation, discretizing the selected attributes using entropy-based discretization supervised method, and selecting the training data using K-Means clustering algorithm. After the preprocessing stage, the HGN scheme is then deployed to classify intrusion attack using the KDD Cup 99 dataset. The results of the classification are measured in terms of accuracy rate, detection rate, false positive rate and true negative rate. The test result shows that the HGN scheme is promising and stable in classifying the intrusion attack patterns with accuracy rate reaches 96.27%, detection rate reaches 99.20%, true negative rate below 15.73%, and false positive rate as low as 0.80%.
Ensuring software security is essential for developing a reliable software. A software can suffer from security problems due to the weakness in code constructs during software development. Our goal is to relate software security with different code constructs so that developers can be aware very early of their coding weaknesses that might be related to a software vulnerability. In this study, we chose Java nano-patterns as code constructs that are method-level patterns defined on the attributes of Java methods. This study aims to find out the correlation between software vulnerability and method-level structural code constructs known as nano-patterns. We found the vulnerable methods from 39 versions of three major releases of Apache Tomcat for our first case study. We extracted nano-patterns from the affected methods of these releases. We also extracted nano-patterns from the non-vulnerable methods of Apache Tomcat, and for this, we selected the last version of three major releases (6.0.45 for release 6, 7.0.69 for release 7 and 8.0.33 for release 8) as the non-vulnerable versions. Then, we compared the nano-pattern distributions in vulnerable versus non-vulnerable methods. In our second case study, we extracted nano-patterns from the affected methods of three vulnerable J2EE web applications: Blueblog 1.0, Personalblog 1.2.6 and Roller 0.9.9, all of which were deliberately made vulnerable for testing purpose. We found that some nano-patterns such as objCreator, staticFieldReader, typeManipulator, looper, exceptions, localWriter, arrReader are more prevalent in affected methods whereas some such as straightLine are more vivid in non-affected methods. We conclude that nano-patterns can be used as the indicator of vulnerability-proneness of code.
This paper proposed a feedback shift register structure which can be split, it is based on a research of operating characteristics about 70 kinds of cryptographic algorithms and the research shows that the “different operations similar structure” reconfigurable design is feasible. Under the configuration information, the proposed structure can implement the multiplication in finite field GF(2n), the multiply/divide linear feedback shift register and other operations. Finally, this paper did a logic synthesis based on 55nm CMOS standard-cell library and the results show that the proposed structure gets a hardware resource saving of nearly 32%, the average power consumption saving of nearly 55% without the critical delay increasing significantly. Therefore, the “different operations similar structure” reconfigurable design is a new design method and the proposed feedback shift register structure can be an important processing unit for coarse-grained reconfigurable cryptologic array.
In our era, most of the communication between people is realized in the form of electronic messages and especially through smart mobile devices. As such, the written text exchanged suffers from bad use of punctuation, misspelling words, continuous chunk of several words without spaces, tables, internet addresses etc. which make traditional text analytics methods difficult or impossible to be applied without serious effort to clean the dataset. Our proposed method in this paper can work in massive noisy and scrambled texts with minimal preprocessing by removing special characters and spaces in order to create a continuous string and detect all the repeated patterns very efficiently using the Longest Expected Repeated Pattern Reduced Suffix Array (LERP-RSA) data structure and a variant of All Repeated Patterns Detection (ARPaD) algorithm. Meta-analyses of the results can further assist a digital forensics investigator to detect important information to the chunk of text analyzed.
In this paper, the correctness of the routing algorithm for the distributed key-value store based on order preserving linear hashing and Skip Graph is proved. In this system, data are divided by linear hashing and Skip Graph is used for overlay network. The routing table of this system is very uniform. Then, short detours can exist in the route of forwarding. By using these detours, the number of hops for the query forwarding is reduced.
As everyone knows vulnerability detection is a very difficult and time consuming work, so taking advantage of the unlabeled data sufficiently is needed and helpful. According the above reality, in this paper a method is proposed to predict buffer overflow based on semi-supervised learning. We first employ Antlr to extract AST from C/C++ source files, then according to the 22 buffer overflow attributes taxonomies, a 22-dimension vector is extracted from every function in AST, at last, the vector is leveraged to train a classifier to predict buffer overflow vulnerabilities. The experiment and evaluation indicate our method is correct and efficient.
Ransomware is a growing threat that encrypts auser's files and holds the decryption key until a ransom ispaid by the victim. This type of malware is responsible fortens of millions of dollars in extortion annually. Worse still, developing new variants is trivial, facilitating the evasion of manyantivirus and intrusion detection systems. In this work, we presentCryptoDrop, an early-warning detection system that alerts a userduring suspicious file activity. Using a set of behavior indicators, CryptoDrop can halt a process that appears to be tampering witha large amount of the user's data. Furthermore, by combininga set of indicators common to ransomware, the system can beparameterized for rapid detection with low false positives. Ourexperimental analysis of CryptoDrop stops ransomware fromexecuting with a median loss of only 10 files (out of nearly5,100 available files). Our results show that careful analysis ofransomware behavior can produce an effective detection systemthat significantly mitigates the amount of victim data loss.
To add multiple layers of security our present work proposes a method for integrating together cryptography and Steganography for secure communication using an image file. We have used here combination of cryptography and steganography that can hide a text in an image in such a way so as to prevent any possible suspicion of having a hidden text, after RSA cipher. It offers privacy and high security through the communication channel.
The transmission of data over a common transmission media revolute the world of information sharing from personal desktop to cloud computing. But the risk of the information theft has increased in the same ratio by the third party working on the same channel. The risk can be avoided using the suitable encryption algorithm. Using the best suited algorithm the transmitted data will be encrypted before placing it on the common channel. Using the public key or the private key the encrypted data can be decrypted by the authenticated user. It will avoid the risk of information theft by the unauthenticated user. In this work we have proposed an encryption algorithm which uses the ASCII code to encrypt the plain text. The common key will be used by sender or receiver to encrypt and decrypt the text for secure communication.
In modern parallel storage systems (e.g., cloud storage and data centers), it is important to provide data availability guarantees against disk (or storage node) failures via redundancy coding schemes. One coding scheme is X-code, which is double-fault tolerant while achieving the optimal update complexity. When a disk/node fails, recovery must be carried out to reduce the possibility of data unavailability. We propose an X-code-based optimal recovery scheme called minimum-disk-read-recovery (MDRR), which minimizes the number of disk reads for single-disk failure recovery. We make several contributions. First, we show that MDRR provides optimal single-disk failure recovery and reduces about 25 percent of disk reads compared to the conventional recovery approach. Second, we prove that any optimal recovery scheme for X-code cannot balance disk reads among different disks within a single stripe in general cases. Third, we propose an efficient logical encoding scheme that issues balanced disk read in a group of stripes for any recovery algorithm (including the MDRR scheme). Finally, we implement our proposed recovery schemes and conduct extensive testbed experiments in a networked storage system prototype. Experiments indicate that MDRR reduces around 20 percent of recovery time of the conventional approach, showing that our theoretical findings are applicable in practice.
This paper presents an efficient implementation of key-value store using Bloom filters on FPGA. Bloom filters are used to reduce the number of unnecessary accesses to the hash tables, thereby improving the performance. Additionally, for better hash table utilization, we use a modified cuckoo hashing algorithm for the implementation. They are implemented in FPGA to further improve the performance. Experimental results show significant performance improvement over existing approaches.