Biblio
The rapidly growing demand for Cyber-Physical and Social Computing (CPSC) has enabled a variety of novel channels for reaching services in the financial industry. Combining cloud systems with multimedia big data is a novel approach for Financial Service Institutions (FSIs) to diversify their service offerings efficiently. However, security remains a major concern: service availability often conflicts with security constraints when the service media channels vary. This paper focuses on this problem and proposes a novel approach that uses Semantic-Based Access Control (SBAC) techniques for acquiring secure financial services on multimedia big data in cloud computing. The proposed approach, entitled the IntercroSsed Secure Big Multimedia Model (2SBM), is designed to secure accesses between various media across multiple cloud platforms. The main algorithms supporting the proposed model are the Ontology-Based Access Recognition (OBAR) algorithm and the Semantic Information Matching (SIM) algorithm. We implement an experimental evaluation to prove the correctness and adoptability of the proposed scheme.
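As a rough illustration of the semantic-matching idea behind SBAC, the sketch below checks an access request against a toy concept hierarchy. The ontology, policy, and `is_permitted` helper are hypothetical stand-ins for the OBAR/SIM algorithms, which the paper defines formally.

```python
# Minimal sketch of semantic-based access control (SBAC) matching over
# a toy ontology; concept names and policy are hypothetical, not 2SBM's.

# Concept hierarchy: child concept -> parent concept.
ONTOLOGY = {
    "mobile_banking": "online_banking",
    "online_banking": "financial_service",
    "atm": "financial_service",
}

def subsumes(general, specific):
    """True if `general` is `specific` or an ancestor of it."""
    while specific is not None:
        if specific == general:
            return True
        specific = ONTOLOGY.get(specific)
    return False

# Policy: role -> most general service concept it may access.
POLICY = {"customer": "online_banking", "teller": "financial_service"}

def is_permitted(role, requested_service):
    allowed = POLICY.get(role)
    return allowed is not None and subsumes(allowed, requested_service)

print(is_permitted("customer", "mobile_banking"))  # True
print(is_permitted("customer", "atm"))             # False
```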
Due to the "curse of dimensionality," processing nearest neighbor (NN) queries in high-dimensional spaces is very expensive; hence, approximate approaches, such as Locality-Sensitive Hashing (LSH), are widely used for their theoretical guarantees and empirical performance. Current LSH-based approaches target the L1 and L2 spaces, whereas, as shown in previous work, fractional distance metrics (Lp metrics with 0 < p < 1) can provide more insightful results than the usual L1 and L2 metrics for data mining and multimedia applications. However, no existing work supports multiple fractional distance metrics using one index. In this paper, we propose LazyLSH, which answers approximate nearest neighbor queries for multiple Lp metrics with theoretical guarantees. Unlike previous LSH approaches, which need to build one dedicated index for every query space, LazyLSH uses a single base index to support computations in multiple Lp spaces, significantly reducing the maintenance overhead. Extensive experiments show that LazyLSH provides more accurate results for approximate kNN search under fractional distance metrics.
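For context, LSH families for Lp spaces are built from p-stable distributions; the sketch below shows the standard 2-stable (Gaussian) construction for L2 that such indexes build on. The parameters are illustrative, and this is not LazyLSH's actual index structure.

```python
# Minimal sketch of p-stable LSH for the L2 metric; w and the number
# of hash functions are illustrative, not the paper's settings.
import random

def make_l2_hash(dim, w=4.0, seed=None):
    rng = random.Random(seed)
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # 2-stable projections
    b = rng.uniform(0.0, w)
    def h(v):
        return int((sum(ai * vi for ai, vi in zip(a, v)) + b) // w)
    return h

# Nearby points land in the same bucket with higher probability.
h = make_l2_hash(dim=3, seed=42)
print(h([1.0, 2.0, 3.0]), h([1.1, 2.0, 2.9]), h([9.0, -4.0, 7.0]))
```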
Causality inference, such as dynamic taint analysis, has many applications (e.g., information-leak detection). It determines whether an event e is causally dependent on a preceding event c during execution. We develop a new causality inference engine, LDX. Given an execution, it spawns a slave execution in which it mutates c and observes whether any change is induced at e. To preclude non-determinism, LDX couples the executions by sharing syscall outcomes. To handle path differences induced by the perturbation, we develop a novel on-the-fly execution alignment scheme that maintains a counter to reflect the progress of execution. The scheme relies on program analysis and compiler transformation. LDX can effectively detect information leaks and security attacks with an average overhead of 6.08% while running the master and the slave concurrently on separate CPUs, much lower than existing systems that require instruction-level monitoring. Furthermore, it has much better accuracy in causality inference.
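The core dual-execution idea can be illustrated with a toy master/slave comparison. The program and events below are hypothetical, and real LDX additionally aligns the two runs at syscall granularity via its counter scheme.

```python
# Toy illustration of dual-execution causality inference in the spirit
# of LDX: mutate a candidate cause `c` in a slave run and observe
# whether the value at event `e` changes.

def program(secret, public):
    # e: the observable output event
    return public * 2 + (1 if secret > 10 else 0)

def causally_depends(run, inputs, cause_key, observe):
    master = observe(run(**inputs))
    mutated = dict(inputs)
    mutated[cause_key] += 1          # perturb the candidate cause c
    slave = observe(run(**mutated))
    return master != slave           # a change at e implies dependence

print(causally_depends(program, {"secret": 10, "public": 3},
                       "secret", observe=lambda out: out))  # True: leak
print(causally_depends(program, {"secret": 3, "public": 3},
                       "secret", observe=lambda out: out))  # False here
```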
This paper calls for attention to investigating real-world malware at large scale by examining the largest real malware repository, VirusTotal. As a first step, we analyzed two fundamental characteristics of Windows executable malware from VirusTotal. We designed offline and online tools for this analysis. Our results show that malware appears in bursts and that malware distributions are highly skewed.
Recent studies show that by the end of 2016 more than 60% of Internet traffic would be running over HTTPS. In the presence of secure tunnels such as HTTPS, transparent caching solutions become ineffective, as the application payload is encrypted by lower-level security protocols. This paper addresses this issue and provides an alternative approach for caching content without compromising its security. There are three parts to our proposal. First, we propose two new IP-layer primitives that allow routers to differentiate between IP and ICN flows. Second, we introduce DCAR (Dual-mode Content Aware Router), a traditional IP router extended to understand the proposed IP primitives. Third, we propose the design of the DISCS (DCAR-based Information centric Secure Content Sharing) framework, which leverages DCAR to allow content-object caching along with security services comparable to HTTPS. Finally, we share details on realizing such a system.
With the advantages of compact representation and efficient comparison, binary hashing has been extensively investigated for approximate nearest neighbor search. In this paper, we propose a novel and general hashing framework that simultaneously considers a new linear pair-wise distance-preserving objective and a point-wise constraint. The direct distance-preserving objective aims to keep the linear relationship between the Euclidean distances and the Hamming distances of data points. Based on different point-wise constraints, we propose two methods to instantiate this framework. The first is a pseudo-supervised hashing method, which uses existing unsupervised hashing methods to generate binary codes as pseudo-supervised information. The second is an unsupervised hashing method, in which quantization loss is considered. We validate our framework on two large-scale datasets. The experiments demonstrate that our pseudo-supervised method achieves consistent improvements over state-of-the-art unsupervised hashing methods, while our unsupervised method outperforms the state-of-the-art methods.
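To make the objective concrete, here is one plausible way to write a linear pair-wise distance-preserving objective with a generic point-wise constraint; this is illustrative, and the paper's exact formulation may differ.

```latex
% Find binary codes b_i whose Hamming distances d_H stay proportional
% to the Euclidean distances of the original points x_i, subject to a
% method-specific point-wise constraint C (illustrative form only):
\min_{\{b_i\} \subset \{-1,1\}^k}
  \sum_{i<j} \Bigl( d_H(b_i, b_j) - \lambda\,\lVert x_i - x_j \rVert_2 \Bigr)^2
  \quad \text{s.t.}\ C(b_i) = 0 .
```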
Augmented reality is poised to become a dominant computing paradigm over the next decade. With promises of three-dimensional graphics and interactive interfaces, augmented reality experiences will rival the very best science fiction novels. This breakthrough also brings unique challenges in how users can authenticate one another to share rich content between augmented reality headsets. Traditional authentication protocols fall short when there is no common central entity or when access to a central authentication server is unavailable or undesirable. Looks Good To Me (LGTM) is an authentication protocol that leverages the unique hardware and context provided by augmented reality headsets to bring innate human trust mechanisms into the digital world, solving authentication in a usable and secure way. LGTM works over point-to-point wireless communication, so users can authenticate one another in a variety of circumstances, and it is designed with usability at its core, requiring users to perform only two actions: one to initiate and one to confirm. Users intuitively authenticate one another using, seemingly, only each other's faces, but under the hood LGTM uses a combination of facial recognition and wireless localization to bootstrap trust from a wireless signal, to a location, to a face, for secure and usable authentication.
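A toy version of the final check might look as follows, with hypothetical helpers standing in for facial recognition and wireless localization; the real protocol's message exchange and cryptography are not shown.

```python
# Toy sketch of LGTM's core binding: trust flows from a wireless
# signal to a localized position to the face seen at that position.
# All data structures here are hypothetical simplifications.

def lgtm_verify(claimed_face, localized_position, faces_at_positions):
    """Accept only if the face recognized at the wirelessly localized
    position matches the face claimed by the authenticating party."""
    return faces_at_positions.get(localized_position) == claimed_face

# Faces recognized by the headset camera, keyed by estimated position.
faces_seen = {(1.2, 0.4): "alice_face_embedding"}

print(lgtm_verify("alice_face_embedding", (1.2, 0.4), faces_seen))    # True
print(lgtm_verify("mallory_face_embedding", (1.2, 0.4), faces_seen))  # False
```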
This paper focuses on the practically important problem of matching a real-world product photo to exactly the same item(s) on online shopping sites. The task is extremely challenging because the user photos (i.e., the queries in this scenario) are often captured in uncontrolled environments, while the product images in online shops are mostly taken by professionals with clean backgrounds and perfect lighting conditions. To tackle the problem, we study deep network architectures and training schemes, with the goal of learning a robust deep feature representation that can bridge the domain gap between the user photos and the online product images. Our contributions are twofold. First, we propose an alternative to the popular contrastive loss used in siamese deep networks, namely the robust contrastive loss, where we "relax" the penalty on positive pairs to alleviate over-fitting. Second, a multi-task fine-tuning approach is introduced to learn a better feature representation, which not only incorporates knowledge from the provided training photo pairs, but also exploits additional information from the large ImageNet dataset to regularize the fine-tuning procedure. Experiments on two challenging real-world datasets demonstrate that both the robust contrastive loss and the multi-task fine-tuning approach are effective, leading to very promising results with a time cost suitable for real-time retrieval.
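A minimal sketch of such a relaxed ("robust") contrastive loss is shown below; the positive-pair margin `m_pos` is an assumption used for illustration, not necessarily the paper's exact formulation.

```python
# Sketch of a contrastive loss with a "relaxed" positive-pair penalty.

def robust_contrastive(d, is_positive, m_pos=0.3, m_neg=1.0):
    """d: Euclidean distance between the pair's embeddings."""
    if is_positive:
        # A standard contrastive loss would use d**2 here; the margin
        # stops pulling matched pairs together once they are close,
        # which alleviates over-fitting to noisy user photos.
        return max(0.0, d - m_pos) ** 2
    return max(0.0, m_neg - d) ** 2

print(robust_contrastive(0.2, True))   # 0.0: close enough, no penalty
print(robust_contrastive(0.8, True))   # 0.25: penalized positive pair
print(robust_contrastive(0.4, False))  # 0.36: penalized negative pair
```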
Debugging large networks is always a challenging task. Software Defined Networking (SDN) offers a centralized control platform where operators can statically verify network policies, instead of checking configuration files device by device. While such static verification is useful, it is not enough: due to data plane faults, packets may not be forwarded according to control plane policies, resulting in network faults at runtime. To address this issue, we present VeriDP, a tool that continuously monitors what we call control-data plane consistency, defined as the consistency between control plane policies and data plane forwarding behaviors. We prototype VeriDP with small modifications of both hardware and software SDN switches, and show that it can achieve a verification speed of 3 μs per packet, with a false negative rate as low as 0.1%, for the Stanford backbone and Internet2 topologies. In addition, when verification fails, VeriDP can localize faulty switches with a probability as high as 96% for fat-tree topologies.
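The consistency check can be illustrated with a toy path-tag comparison; the tagging scheme below (recording every hop inside the packet) is a simplification of VeriDP's hardware-assisted design.

```python
# Toy sketch of control-data plane consistency checking.

# Control plane policy: (src, dst) -> expected switch-level path.
POLICY = {("h1", "h2"): ["s1", "s2", "s4"]}

def verify(src, dst, recorded_path):
    expected = POLICY.get((src, dst))
    if expected is None:
        return "no policy for this flow"
    if recorded_path == expected:
        return "consistent"
    # Localize the first switch whose forwarding deviates from policy.
    for hop, (got, want) in enumerate(zip(recorded_path, expected)):
        if got != want:
            return f"fault at hop {hop}: {got} (expected {want})"
    return "fault: path truncated or extended"

print(verify("h1", "h2", ["s1", "s2", "s4"]))  # consistent
print(verify("h1", "h2", ["s1", "s3", "s4"]))  # fault at hop 1
```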
Knowing which part of a program processes which parts of an input can reveal the structure of the input as well as the structure of the program. In a URL such as http://www.example.com/path/, for instance, the protocol http, the host www.example.com, and the path path would be handled by different functions and stored in different variables. Given a set of sample inputs, we use dynamic tainting to trace the data flow of each input character, and aggregate those input fragments that would be handled by the same function into lexical and syntactical entities. The result is a context-free grammar that reflects valid input structure. In its evaluation, our AUTOGRAM prototype automatically produced readable and structurally accurate grammars for inputs like URLs, spreadsheets, or configuration files. The resulting grammars not only allow simple reverse engineering of input formats, but can also directly serve as input for test generators.
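A toy version of the idea, with manual splitting standing in for dynamic taint tracking, is sketched below: fragments "handled" by the same helper become nonterminals, and observed fragments become their alternatives.

```python
# Toy illustration of grammar mining from input handling in the spirit
# of AUTOGRAM. Real AUTOGRAM uses dynamic taint tracking; the manual
# split in parse_url() is only a stand-in for that analysis.

def parse_url(url):
    """Each 'handler' consumes one fragment; we record what it saw."""
    handled = {}
    scheme, rest = url.split("://", 1)
    handled["<scheme>"] = scheme          # handled by parse_scheme()
    host, _, path = rest.partition("/")
    handled["<host>"] = host              # handled by parse_host()
    handled["<path>"] = "/" + path        # handled by parse_path()
    return handled

def derive_grammar(samples):
    rules = {"<url>": "<scheme> '://' <host> <path>"}
    alts = {}
    for s in samples:
        for nt, fragment in parse_url(s).items():
            alts.setdefault(nt, set()).add(fragment)
    rules.update({nt: " | ".join(sorted(v)) for nt, v in alts.items()})
    return rules

for lhs, rhs in derive_grammar(["http://www.example.com/path/",
                                "https://foo.org/a"]).items():
    print(lhs, "::=", rhs)
```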
We present sandbox mining, a technique to confine an application to resources accessed during automatic testing. Sandbox mining first explores software behavior by means of automatic test generation, and extracts the set of resources accessed during these tests. This set is then used as a sandbox, blocking access to resources not used during testing. The mined sandbox thus protects against behavior changes such as the activation of latent malware, infections, targeted attacks, or malicious updates. The use of test generation makes sandbox mining a fully automatic process that can be run by vendors and end users alike. Our BOXMATE prototype requires less than one hour to extract a sandbox from an Android app, with few to no confirmations required for frequently used functionality.
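The mine-then-enforce loop can be sketched in a few lines; the resource names and monitor hook below are hypothetical stand-ins for BOXMATE's Android instrumentation and test generator.

```python
# Minimal sketch of sandbox mining: record resources during testing,
# then block anything outside the recorded set.

accessed = set()

def monitor(resource):
    """Mining phase: record every resource the generated tests touch."""
    accessed.add(resource)

def run_tests():
    monitor("network:api.example.com")
    monitor("file:/sdcard/cache")

run_tests()
sandbox = frozenset(accessed)  # mined sandbox = resources seen in tests

def enforce(resource):
    if resource not in sandbox:
        raise PermissionError(f"blocked unexpected access: {resource}")
    return "allowed"

print(enforce("file:/sdcard/cache"))         # allowed
try:
    enforce("network:evil.example.net")      # latent behavior: blocked
except PermissionError as e:
    print(e)
```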
The security and typical attack behaviors of the Modbus/TCP industrial network communication protocol are analyzed. Traffic-flow features are extracted through in-depth analysis of abnormal operation behavior, and an intrusion detection method based on the support vector machine (SVM) is designed. The method analyzes the data characteristics of abnormal communication behavior, and constructs the feature input structure and detection system of the SVM algorithm by combining direct behavior-feature selection with abnormal-behavior pattern-feature construction. The experimental results show that the method can effectively improve the detection rate of abnormal behavior and enhance the safety protection of industrial networks.
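A minimal sketch of the SVM detection step is shown below, assuming scikit-learn is available; the two traffic features (function-code rate and mean inter-arrival time) are illustrative, not the paper's exact feature set.

```python
# Sketch of SVM-based anomaly detection on Modbus/TCP traffic features.
from sklearn.svm import SVC

# Feature vectors: [function_code_rate, mean_interarrival_ms]
X_train = [[0.9, 100], [0.8, 120], [0.85, 110],   # normal polling
           [0.1, 5],   [0.2, 8],   [0.15, 6]]     # flooding anomaly
y_train = [0, 0, 0, 1, 1, 1]                      # 0 normal, 1 abnormal

clf = SVC(kernel="rbf", gamma="scale").fit(X_train, y_train)
print(clf.predict([[0.82, 105], [0.12, 7]]))      # expected: [0 1]
```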
This paper presents an approach to the closed-class authorship attribution (AA) problem. It is based on language modeling for classification and is called modified language modeling. Modified language modeling aims to solve the AA problem by combining bigram word weighting and unigram word weighting. It makes the relation between unseen text and training documents clearer by giving extra reward to training documents that include the bigram words as well as the unigram words. Moreover, instead of removing the stop words provided by a stop-word list, each related word probability is multiplied by its IDF value. We evaluate experimental results for four approaches (unigram, bigram, trigram, and modified language modeling) using two Persian poem corpora, the WMPR-AA2016-A and WMPR-AA2016-B datasets. Results show that modified language modeling attributes authors better than the other approaches. The results on WMPR-AA2016-B, the bigger dataset, are much better than on the other dataset for all approaches. This may indicate that, given adequate data to train the language models, modified language modeling can be a good solution to the AA problem.
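One way to render the IDF-weighted unigram-plus-bigram scoring in code is sketched below; the smoothing constant, the combination weight `alpha`, and the toy IDF table are assumptions for illustration.

```python
# Sketch of IDF-weighted unigram + bigram author scoring: bigram hits
# act as the "extra reward", and IDF replaces stop-word removal.
import math
from collections import Counter

def ngrams(tokens, n):
    return list(zip(*(tokens[i:] for i in range(n))))

def score(text_tokens, author_tokens, idf, alpha=0.5, eps=1e-6):
    uni = Counter(ngrams(author_tokens, 1))
    bi = Counter(ngrams(author_tokens, 2))
    n_uni, n_bi = max(sum(uni.values()), 1), max(sum(bi.values()), 1)
    s = 0.0
    for g in ngrams(text_tokens, 1):
        s += alpha * idf.get(g[0], 1.0) * math.log(uni[g] / n_uni + eps)
    for g in ngrams(text_tokens, 2):   # extra reward for shared bigrams
        s += (1 - alpha) * math.log(bi[g] / n_bi + eps)
    return s

idf = {"rose": 2.0, "the": 0.1}       # stop words get low weight
a = "the red rose blooms in the garden".split()
b = "stock prices fell sharply in early trading".split()
q = "the rose blooms".split()
print(score(q, a, idf) > score(q, b, idf))  # True: q fits author A
```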
Cognitive radio (CR) has emerged as a promising technology to increase the utilization of spectrum resources. A pivotal challenge in CR lies in secondary users (SUs) finding each other on the frequency band, i.e., spectrum locating. In this demo, we implement two kinds of multi-channel rendezvous technology to solve the spectrum-locating problem: (i) the common control channel (CCC) based rendezvous scheme, which is simple and effective when a control channel is always available; and (ii) the channel-hopping (CH) based blind rendezvous, which obtains guaranteed rendezvous on all commonly available channels of pairwise SUs in a short time without a CCC. Furthermore, the cognitive nodes in the demonstration can adjust their communication channels autonomously according to the dynamic spectrum environment for continuous data transmission.
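A toy channel-hopping illustration is given below: two nodes with distinct hop patterns still meet on a common channel in some slots. This simple rotation scheme is for intuition only and, unlike the demo's CH algorithms, carries no formal rendezvous guarantee.

```python
# Toy channel-hopping sequences for blind rendezvous intuition.
N = 5  # number of channels

def hop(slot, rate, offset):
    """Channel visited at time `slot` by a node with its own
    rate/offset; distinct rates make the two sequences overlap."""
    return (slot * rate + offset) % N

a = [hop(t, rate=1, offset=0) for t in range(10)]
b = [hop(t, rate=2, offset=3) for t in range(10)]
meet = [t for t in range(10) if a[t] == b[t]]
print(a, b, meet, sep="\n")  # slots where both tune the same channel
```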
In the past three years, the Emotion Recognition in the Wild (EmotiW) Grand Challenge has drawn increasing attention due to its huge potential applications. In the fourth challenge, aimed at the task of video-based emotion recognition, we propose a multi-clue emotion fusion (MCEF) framework that models human emotion from three mutually complementary sources: facial appearance texture, facial action, and audio. To extract high-level emotion features from sequential face images, we employ a CNN-RNN architecture, where the face image from each frame is first fed into the fine-tuned VGG-Face network to extract face features, and the features of all frames are then sequentially traversed by a bidirectional RNN so as to capture the dynamic changes of facial textures. To attain more accurate facial actions, a facial landmark trajectory model is proposed to explicitly learn emotion variations of facial components. Further, audio signals are also modeled in a CNN framework by extracting low-level energy features from segmented audio clips and then stacking them as an image-like map. Finally, we fuse the results generated from the three clues to boost the performance of emotion recognition. Our proposed MCEF achieves an overall accuracy of 56.66%, a large improvement of 16.19% over the baseline.
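The final fusion step can be sketched as a weighted late fusion of per-model posterior scores; the weights, class set, and scores below are made up for illustration and are not the paper's learned values.

```python
# Sketch of late fusion over the three clues (face CNN-RNN, landmark
# trajectory model, audio CNN).
EMOTIONS = ["angry", "happy", "neutral", "sad"]

def fuse(face_cnn_rnn, landmark_traj, audio_cnn, w=(0.5, 0.3, 0.2)):
    """Weighted sum of per-model posterior scores; argmax wins."""
    fused = [w[0] * f + w[1] * l + w[2] * a
             for f, l, a in zip(face_cnn_rnn, landmark_traj, audio_cnn)]
    return EMOTIONS[max(range(len(fused)), key=fused.__getitem__)]

print(fuse(face_cnn_rnn=[0.1, 0.6, 0.2, 0.1],
           landmark_traj=[0.2, 0.5, 0.2, 0.1],
           audio_cnn=[0.3, 0.2, 0.4, 0.1]))  # -> happy
```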
Additive manufacturing, also known as 3D printing, has been increasingly applied to fabricate highly intellectual-property (IP) sensitive products. However, the related IP protection issues in 3D printers are still largely underexplored. On the other hand, smartphones are equipped with rich onboard sensors and have been applied to pervasive mobile surveillance in many applications. These facts raise one critical question: is it possible for smartphones to access the side-channel signals of a 3D printer and then steal the IP information? To answer this, we perform an end-to-end study exploring smartphone-based side-channel attacks against 3D printers. Specifically, we formulate the problem of the IP side-channel attack in 3D printing. Then, we investigate the possible acoustic and magnetic side-channel attacks using the smartphone's built-in sensors. Moreover, we explore a magnetic-enhanced side-channel attack model to accurately deduce the vital directional operations of the 3D printer. Experimental results show that by exploiting the side-channel signals collected by smartphones, we can successfully reconstruct the physical prints and their G-code with a Mean Tendency Error of 5.87% on regular designs and 9.67% on complex designs, respectively. Our study demonstrates a new and practical smartphone-based side-channel attack on compromising IP information during 3D printing.
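As a toy illustration of direction inference from magnetic side channels (a drastic simplification of the paper's attack model): the axis whose stepper motor is active tends to dominate the field variation, so per-axis signal variance can hint at the move direction. The samples and decision rule below are hypothetical.

```python
# Toy direction inference from smartphone magnetometer traces.

def infer_direction(mx, my):
    """mx, my: magnetometer samples (per axis) during one printer move;
    the axis with the larger variance is guessed as the moving one."""
    var = lambda s: sum((x - sum(s) / len(s)) ** 2 for x in s) / len(s)
    return "X-move" if var(mx) > var(my) else "Y-move"

print(infer_direction(mx=[3, 9, 2, 8, 3], my=[5, 5, 6, 5, 5]))  # X-move
print(infer_direction(mx=[4, 4, 5, 4, 4], my=[1, 9, 0, 8, 2]))  # Y-move
```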
This paper establishes a new framework for electrical cyber-physical systems (ECPSs). The communication network is designed according to the characteristics of the power grid. The interdependent relationship between communication networks and power grids is described by data-uploading channels and command-downloading channels. Control strategies (such as load shedding and relay protection) are extended to this new framework for analyzing the performance of ECPSs under several attack scenarios. The fragility of ECPSs under cyber attacks (DoS attacks and false data injection attacks) and the effectiveness of relay protection policies are verified by experimental results.
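A toy load-shedding policy of the kind such experiments exercise is sketched below; the loads and numbers are illustrative. Note how a false-data-injection attack that under-reports generation would trigger unnecessary shedding, which is exactly the fragility such studies probe.

```python
# Toy load-shedding rule: when reported generation falls short, shed
# loads in ascending priority order.

def shed_loads(generation, loads):
    """loads: list of (name, demand, priority); higher priority kept."""
    kept, total = [], 0.0
    for name, demand, _ in sorted(loads, key=lambda l: -l[2]):
        if total + demand <= generation:
            kept.append(name)
            total += demand
    return kept

loads = [("hospital", 40, 3), ("factory", 50, 2), ("mall", 30, 1)]
print(shed_loads(120, loads))  # true generation: all loads served
print(shed_loads(70, loads))   # attack under-reports: factory is shed
```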
With the increasing popularity of cloud storage services, many individuals and enterprises have started to move their local data to the cloud. To ensure their privacy and data security, some cloud service users may want to encrypt their data before outsourcing it. However, this impedes efficient data utilization based on plaintext search. In this paper, we study how to construct a secure index that supports both efficient index updating and similarity search. Using the secure index, users can efficiently perform similarity searches that tolerate input mistakes and update the index when new data become available. We formally prove the security of our proposal and also perform experiments on real-world data to show its efficiency.
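One common building block for typo-tolerant search over keyed tokens is sketched below: keywords are mapped to character-bigram sets, so close spellings share most tokens. This is a generic construction for illustration, not the paper's secure-index scheme.

```python
# Sketch of typo-tolerant matching over keyed tokens: the server sees
# only HMAC tokens, never the plaintext keywords.
import hashlib, hmac

KEY = b"index-key"  # held by the data owner

def bigrams(word):
    return {word[i:i + 2] for i in range(len(word) - 1)}

def secure_tokens(word):
    """One keyed token per character bigram."""
    return {hmac.new(KEY, g.encode(), hashlib.sha256).hexdigest()
            for g in bigrams(word)}

def similarity(w1, w2):  # Jaccard over token sets, typo-tolerant
    t1, t2 = secure_tokens(w1), secure_tokens(w2)
    return len(t1 & t2) / len(t1 | t2)

print(round(similarity("network", "netwrok"), 2))  # > 0 despite typo
print(round(similarity("network", "privacy"), 2))  # 0.0: unrelated
```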
Although it has existed for decades, the software tampering attack remains a major threat to systems such as Android and cyber-physical systems. Many approaches have been proposed to thwart specific tampering procedures, e.g., obfuscation and self-checksumming. However, none of them is theoretically tamper-proof without the protection of hardware circuits. Rather than proposing new tricks against tampering attacks, we focus on impeding the replication of software tampering via program diversification, and thus pose a scalability barrier against such attacks. Our idea, namely N-version obfuscation (NVO), is to automatically generate and deliver same-featured but functionally nonequivalent software copies to different machines or users. In this paper, we investigate this idea on the Android platform. We carefully design a candidate NVO solution for networked apps, which leverages a Message Authentication Code (MAC) mechanism to generate the functionally nonequivalent diversities. Our evaluation shows that the time required to break such a software system increases linearly with the number of software versions. Attackers would therefore suffer severe scalability issues, considering that an app can have millions of users. With minimal NVO costs, effective tamper-resistant security can therefore be established.
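The diversification idea can be sketched with per-copy MAC keys, using Python's standard `hmac` module; this is a minimal illustration of why one successful break does not replicate, not the paper's Android implementation.

```python
# Sketch of MAC-based diversification in the spirit of N-version
# obfuscation: each delivered copy embeds a distinct key, so a forged
# request replaying one copy's MAC fails against every other copy.
import hashlib, hmac, os

def make_app_version():
    key = os.urandom(16)  # unique per delivered copy
    def sign(message: bytes) -> str:
        return hmac.new(key, message, hashlib.sha256).hexdigest()
    return sign

copy_a, copy_b = make_app_version(), make_app_version()
msg = b"transfer:100"
# A tampered clone of copy A cannot produce valid MACs for copy B,
# so breaking one version does not break the rest.
print(copy_a(msg) == copy_b(msg))  # False: versions are nonequivalent
```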
While compilers offer a fair trade-off between productivity and executable performance for single-threaded execution, their optimizations remain fragile when addressing compute-intensive code for parallel architectures with deep memory hierarchies. Moreover, these optimizations operate as black boxes, impenetrable to the user, leaving no alternative to time-consuming and error-prone manual optimization when an imprecise cost model or a weak analysis results in a bad optimization decision. To address this issue, we propose a technique that automatically translates an arbitrary polyhedral optimization, as used internally by the loop-level optimization frameworks of several modern compilers, into a sequence of comprehensible syntactic transformations, as long as the optimization focuses on scheduling loop iterations. This approach opens the black box of polyhedral frameworks, enabling users to examine, refine, replay, and even design complex optimizations semi-automatically in partnership with the compiler.
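A toy rendering of the schedule-to-syntax idea is sketched below, covering only two cases: the 2x2 schedule matrix maps an iteration vector (i, j) to its execution order, and recognizing known matrices yields a user-readable primitive. Real polyhedral frameworks handle arbitrary affine schedules; the names here are hypothetical.

```python
# Toy mapping from a polyhedral schedule to a named syntactic
# transformation (here, loop interchange).

def apply_schedule(matrix, i, j):
    """Execution-order coordinates of iteration (i, j)."""
    return (matrix[0][0] * i + matrix[0][1] * j,
            matrix[1][0] * i + matrix[1][1] * j)

def name_transformation(matrix):
    if matrix == [[1, 0], [0, 1]]:
        return "identity (no change)"
    if matrix == [[0, 1], [1, 0]]:
        return "interchange(i, j)"   # j-loop becomes outermost
    return "(composite: decompose into a sequence of primitives)"

swap = [[0, 1], [1, 0]]              # schedule (i, j) -> (j, i)
print(name_transformation(swap))     # interchange(i, j)
print(apply_schedule(swap, 2, 5))    # (5, 2): j now varies outermost
```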