Bibliography
A honeypot is a deception tool for enticing attackers to make efforts to compromise the electronic information systems of an organization. A honeypot can serve as an advanced security surveillance tool for use in minimizing the risks of attacks on information technology systems and networks. Honeypots are useful for providing valuable insights into potential system security loopholes. The current research investigated the effectiveness of centralized system management technologies, namely Puppet and virtual machines, in the implementation of automated honeypots for intrusion detection, correction and prevention. A centralized logging system was used to collect the source address, country and timestamp of intrusions by attackers. The unique contributions of this research include: a demonstration of how open-source technologies can be used to dynamically add or modify hacking incidents in a high-interaction honeynet system; a presentation of strategies for making honeypots more attractive, so that hackers spend more time on them and provide more evidence of their activities; and an exhibition of algorithms for system and network intrusion prevention.
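To make the logging step concrete, the following is a minimal, hypothetical Python sketch of aggregating intrusion records (source address, country, timestamp) from a centralized honeypot log; the JSON-lines format and field names are assumptions for illustration, not the paper's actual log schema.

```python
# Minimal sketch: aggregate intrusion events from a centralized honeypot log.
# The JSON-lines format and the field names ("src_ip", "timestamp", "country")
# are hypothetical; real honeypot or syslog formats will differ.
import json
from collections import Counter

def summarize_intrusions(log_path):
    by_country = Counter()
    by_source = Counter()
    events = []
    with open(log_path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            rec = json.loads(line)
            events.append((rec["timestamp"], rec["src_ip"], rec.get("country", "unknown")))
            by_source[rec["src_ip"]] += 1
            by_country[rec.get("country", "unknown")] += 1
    return events, by_source.most_common(10), by_country.most_common(10)

if __name__ == "__main__":
    events, top_sources, top_countries = summarize_intrusions("intrusions.jsonl")
    print(len(events), "events")
    print("top sources:", top_sources)
    print("top countries:", top_countries)
```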
IP tracking and cloaking are practices for identifying users which are used legitimately by websites to provide services and content tailored to particular users. However, it is believed that these practices are also used by malicious websites to avoid detection by anti-virus companies crawling the web to find malware. In addition, malicious websites are also believed to use IP tracking in order to deliver targeted malware based upon a history of previous visits by users. In this paper we empirically investigate these beliefs and collect a large dataset of suspicious URLs in order to identify at what level IP tracking takes place, that is, at the level of an individual address or at the level of the network provider or organisation (network tracking). Our results illustrate that IP tracking is used in a small subset of domains within our dataset, while no strong indication of network tracking was observed.
Honeypot systems are an effective method for defending production systems against security breaches and for gaining detailed information about attackers' motivation, tactics, software and infrastructure. In this paper we present how different types of honeypots can be employed to gain valuable information about attacks and attackers, and also outline new and innovative possibilities for future research.
Defending information systems against advanced attacks is a challenging task; even if all the systems have been properly updated and all the known vulnerabilities have been patched, there is still the possibility of a previously unknown zero-day attack compromising the system. Honeypots offer a more proactive tool for detecting possible attacks. What is more, they can act as a tool for understanding attackers' intentions. In this paper, we propose a design for a diversified honeypot. By increasing the variability present in software, diversification decreases the number of assumptions an attacker can make about the target system.
When running large human computation tasks in the real world, honeypots play an important role in assessing the overall quality of the work produced. The generation of such honeypots can be a significant burden on the task owner, as they require specific characteristics in their design and implementation and continuous maintenance when operating data pipelines that include a human computation component. In this extended abstract we outline a novel approach for creating honeypots using automatically generated questions from a reference knowledge base, with the ability to control parameters such as topic and difficulty.
Active defense is a popular defense technique based on systems that hinder an attacker's progress by design, rather than reactively responding to an attack only after its detection. Well-known active defense systems are honeypots. Honeypots are fake systems, designed to look like real production systems, aimed at trapping an attacker and analyzing his attack strategy and goals. These types of systems suffer from a major weakness: it is extremely hard to design them in such a way that an attacker cannot distinguish them from a real production system. In this paper, we advocate that, instead of adding additional fake systems in the corporate network, the production systems themselves should be instrumented to provide active defense capabilities. This perspective on active defense contains costs and complexity, while at the same time providing the attacker with a more realistic-looking target and giving the Incident Response Team more time to identify the attacker. The proposed proof-of-concept prototype system can be used to implement active defense in any corporate production network, with little upfront work and little maintenance.
Multimedia security and copyright protection have been popular topics for research and application, due to the explosion of data exchange over the internet and the widespread use of digital media. Watermarking is a process of hiding digital information inside a digital medium. Information hiding as digital watermarks in multimedia enables a protection mechanism in decrypted contents. This paper presents a comparative study of an existing technique for digital image watermarking, based on Genetic Algorithm and Bacterial Foraging Algorithm (BFO) optimization, against a proposed technique that combines a Genetic Algorithm with Honey Bee based optimization. The experimental results show that the new method indeed outperforms the conventional technique. The implementation is done in MATLAB.
Password vaults are used to store login credentials, usually encrypted by a master password, relieving the user from memorizing a large number of complex passwords. To manage accounts on multiple devices, vaults are often stored at an online service, which substantially increases the risk of leaking the (encrypted) vault. To protect the master password against guessing attacks, previous work has introduced cracking-resistant password vaults based on Honey Encryption. If decryption is attempted with a wrong master password, they output plausible-looking decoy vaults, thus seemingly disabling offline guessing attacks. In this work, we propose attacks against cracking-resistant password vaults that are able to distinguish between real and decoy vaults with high accuracy and thus circumvent the offered protection. These attacks are based on differences in the generated distribution of passwords, which are measured using Kullback-Leibler divergence. Our attack is able to rank the correct vault into the 1.3% most likely vaults (on median), compared to 37.8% of the best-reported attack in previous work. (Note that smaller ranks are better, and 50% is achievable by random guessing.) We demonstrate that this attack is, to a certain extent, a fundamental problem with all static Natural Language Encoders (NLE), where the distribution of decoy vaults is fixed. We propose the notion of adaptive NLEs and demonstrate that they substantially limit the effectiveness of such attacks. We give one example of an adaptive NLE based on Markov models and show that the attack is only able to rank the decoy vaults with a median rank of 35.1%.
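As an illustration of the ranking idea, the sketch below scores candidate vaults by the Kullback-Leibler divergence between a smoothed character-level password distribution and a reference distribution estimated from leaked passwords; the unigram features and smoothing constants are deliberate simplifications, not the paper's attack model.

```python
# Minimal sketch: rank candidate vaults (one real, many decoys) by the
# KL divergence between each vault's character-level password distribution
# and a reference distribution from leaked passwords. The unigram model and
# smoothing are illustrative simplifications, not the paper's attack.
import math
from collections import Counter

def char_distribution(passwords, alphabet, eps=1e-6):
    counts = Counter(c for pw in passwords for c in pw)
    total = sum(counts.values()) + eps * len(alphabet)
    return {c: (counts[c] + eps) / total for c in alphabet}

def kl_divergence(p, q):
    return sum(p[c] * math.log(p[c] / q[c]) for c in p)

def rank_vaults(candidate_vaults, reference_passwords):
    alphabet = set(c for pw in reference_passwords for c in pw)
    for vault in candidate_vaults:
        alphabet |= set(c for pw in vault for c in pw)
    ref = char_distribution(reference_passwords, alphabet)
    scored = [(kl_divergence(char_distribution(v, alphabet), ref), i)
              for i, v in enumerate(candidate_vaults)]
    return sorted(scored)  # lower divergence = more "human-like" vault

if __name__ == "__main__":
    leaked = ["password1", "letmein", "qwerty", "dragon", "monkey123"]
    decoy = ["xk9#Qz", "p0!vRw", "zzqqy7"]
    real = ["password7", "letmein1", "qwerty99"]
    print(rank_vaults([decoy, real], leaked))
```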
The second wave of change presented by the age of mobility, wearables, and IoT focuses on how organizations and enterprises, from a wide variety of commercial areas and industries, will use and leverage the new technologies available. Businesses and industries that don't change with the times will simply cease to exist. Applications need to be powered by cognitive and contextual technologies to support real-time proactive decisions. These decisions will be based on the mobile context of a specific user or group of users, incorporating location, time of day, current user task, and more. Driven by the huge amounts of data produced by mobile and wearable devices, and influenced by privacy concerns, the next wave in computing will need to exploit data and computing at the edge of the network. Future mobile apps will have to be cognitive to 'understand' user intentions based on all the available interactions and unstructured data. Mobile applications are becoming increasingly ubiquitous, going beyond what end users can easily comprehend. Essentially, for both business-to-client (B2C) and business-to-business (B2B) apps, only about 30% of the development efforts appear in the interface of the mobile app. For example, areas such as the collaborative nature of the software or the shortened development cycle and time-to-market are not apparent to end users. The other 70% of the effort invested is dedicated to integrating the applications with back-office systems and developing those aspects of the application that operate behind the scenes. An important, yet often complex, part of the solution and the mobile app takes place far from the public eye: in the back-office environment. It is there that various aspects of customer relationship management must be addressed: tracking usage data, pushing out messaging as needed, distributing apps to employees within the enterprise, and handling the wide variety of operational and management tasks, often involving the collection and monitoring of data from sensors and wearable devices. All this must be carried out while addressing security concerns that range from verifying user identities, to data protection, to blocking attempted breaches of the organization and the activation of malicious code. Of course, these tasks must be augmented by a systematic approach and vigilant maintenance of user privacy. The first wave of the mobile revolution focused on development platforms, run-time platforms, deployment, activation, and management tools for multi-platform environments, including comprehensive mobile device management (MDM). To realize the full potential of this revolution, we must capitalize on information about the context within which mobile devices are used. With both employees and customers, this context could be a simple piece of information such as the user location or time of use, the hour of the day, or the day of the week. The context could also be represented by more complex data, such as the amount of time used, type of activity performed, or user preferences. Further insight could include the relationship history with the user and the user's behavior as part of that relationship, as well as a long list of variables to be considered in various scenarios. Today, with the new wave of wearables, the definition of context is being further extended to include environmental factors such as temperature, weather, or pollution, as well as personal factors such as heart rate, movement, or even clothing worn.
In both B2E and B2C situations, a context-dependent approach, based on the appropriate context for each specific user, offers a superior tool for working with both employees and clients alike. This mode of operation does not start and end with the individual user. Rather, it takes into account the people surrounding the user, the events taking place nearby, appliances or equipment activated, the user's daily schedule, as well as other, more general information, such as the environment and weather. Developing enterprise-wide, context-dependent, mobile solutions is still a complex challenge. A system of real added-value services must be developed, as well as a comprehensive architecture. These four-tier architectures comprise end-user devices like wearables and smartphones, connected to systems of engagement (SoEs), and systems of record (SoRs). All this is needed to enable data analytics and collection in the context where it is created. The data collected will allow further interaction with employees or customers, analytics, and follow-up actions based on the results of that analysis. We also need to ensure end-to-end (E2E) security across these four tiers, and to keep the data and application contexts in sync. These are just some of the challenges being addressed by IBM Research. As an example, these technologies could be deployed in the retail space, especially in brick-and-mortar stores. Identifying a customer entering a store, detecting her location among the aisles, and cross-referencing that data with the customer's transaction history, could lead to special offers tailor-made for that specific customer or suggestions relevant to her purchasing process. This technology enables real-world implementation of metrics, analytics, and other tools familiar to us from the online realm. We can now measure visits to physical stores in the same way we measure web page hits: analyze time spent in the store, the areas visited by the customer, and the results of those visits. In this way, we can also identify shoppers wandering around the store and understand when they are having trouble finding the product they want to purchase. We can also gain insight into the standard traffic patterns of shoppers and how they navigate a store's floors and departments. We might even consider redesigning the store layout to take advantage of this insight to enhance sales. In healthcare, the context can refer to insight extracted from data received from sensors on the patient, from either his mobile device or wearable technology, and information about the patient's environment and location at that moment in time. This data can help determine if any assistance is required. For example, if a patient is discharged from the hospital for continued at-home care, doctors can continue to remotely monitor his condition via a system of sensors and analytic tools that interpret the sensor readings. This approach can also be applied to the area of safety. Scientists at IBM Research are developing a platform that collects and analyzes data from wearable technology to protect the safety of employees working in construction, heavy industry, manufacturing, or out in the field. This solution can serve as a real-time warning system by analyzing information gathered from wearable sensors embedded in personal protective equipment, such as smart safety helmets and protective vests, and in the workers' individual smartphones. 
These sensors can continuously monitor a worker's pulse rate, movements, body temperature, and hydration level, as well as environmental factors such as noise level, and other parameters. The system can provide immediate alerts to the worker about any dangers in the work environment to prevent possible injury. It can also be used to prevent accidents before they happen or detect accidents once they occur. For example, with sophisticated algorithms, we can detect if a worker falls based on a sudden difference in elevations detected by an accelerometer, and then send an alert to notify her peers and supervisor or call for help. Monitoring can also help ensure safety in areas where continuous exposure to heat or dangerous materials must be limited based on regulated time periods. Mobile technologies can also help manage events with massive numbers of participants, such as professional soccer games, music festivals, and even large-scale public demonstrations, by sending alerts concerning long and growing lines or specific high-traffic areas. These technologies can be used to detect accidents typical of large-scale gatherings, send warnings about overcrowding, and alert the event organizers. In the same way, they can alleviate parking problems or guide public transportation operators, all via analysis and predictive analytics. IBM Research - Haifa is currently involved in multiple activities as part of IBM's MobileFirst initiative. Haifa researchers have a special expertise in time- and location-based intelligent applications, including visual maps that display activity contexts and predictive analytics systems for mobile data and users. In another area, IBM researchers in Haifa are developing new cognitive services driven from the unique data available on mobile and wearable devices. Looking to the future, the IBM Research team is further advancing the integration of wearable technology, augmented reality systems, and biometric tools for mobile user identity validation. Managing contextual data and analyzing the interaction between the different kinds of data presents fascinating challenges for the development of next-generation programming. For example, we need to rethink when and where data processing and computations should occur: Is it best to leave them at the user-device level, or perhaps they should be moved to the back-office systems, servers, and/or the cloud infrastructures with which the user device is connected? New-age applications are becoming more and more distributed. They operate on a wide range of devices, such as wearable technologies, use a variety of sensors, and depend on cloud-based systems. As a result, a new distributed programming paradigm is emerging to meet the needs of these use-cases and real-time scenarios. This paradigm needs to deal with massive amounts of devices, sensors, and data in business systems, and must be able to shift computation from the cloud to the edge, based on context, in close to real-time. By processing data at the edge of the network, close to where the interactions and processing are happening, we can help reduce latency and offer new opportunities for improved privacy and security. Despite all these interactions, data collection, and the analytic insights based upon them, we cannot forget the issues of privacy. Without a proper and reliable solution that offers more control over what personal data is shared and how it is used, people will refrain from sharing information.
Such sharing is necessary for developing and understanding the context in which people are carrying out various actions, and to offer them tools and services to enhance their actions. In the not-so-distant future, we anticipate the appearance of ad-hoc networks for wearable technology systems that will interact with one another to further expand the scope and value of available context-dependent data.
Information flow analyses traditionally use the Program Dependence Graph (PDG) as a supporting data structure. This graph relies on Ferrante et al.'s notion of control dependences to represent implicit flows of information. A limitation of this approach is that it may create O(|I| x |E|) implicit flow edges in the PDG, where I is the set of instructions in a program and E is the set of edges in its control flow graph. This paper shows that it is possible to compute information flow analyses using a different notion of implicit dependence, which yields a number of edges linear in the number of definitions plus uses of variables. Our algorithm computes these dependences in a single traversal of the program's dominance tree. This efficiency is possible due to a key property of programs in Static Single Assignment form: the definition of a variable dominates all its uses. Our algorithm correctly implements Hunt and Sands' system of security types. Contrary to their original formulation, which required O(|I| x |I|) space and time for structured programs, we require only O(|I|). We have used our ideas to build FlowTracker, a tool that uncovers side-channel vulnerabilities in cryptographic algorithms. FlowTracker handles programs with over one million assembly instructions in less than 200 seconds, and creates 24% fewer implicit flow edges than Ferrante et al.'s technique. FlowTracker has detected an issue in a constant-time implementation of Elliptic Curve Cryptography; it has found several time-variant constructions in OpenSSL, one issue in TrueCrypt, and it has validated the isochronous behavior of the NaCl library.
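For orientation only, the sketch below shows plain explicit def-use taint propagation over an SSA-like program; it does not implement the paper's dominance-tree construction of implicit dependences.

```python
# Minimal sketch: explicit def-use taint propagation over an SSA-like program.
# This is a simplified stand-in for information-flow tracking; it does not
# implement the paper's dominance-tree construction of implicit dependences.
def propagate_taint(instructions, tainted_sources):
    # instructions: list of (defined_var, [used_vars]); SSA: each var defined once.
    def_site = {d: uses for d, uses in instructions if d is not None}
    tainted = set(tainted_sources)
    changed = True
    while changed:
        changed = False
        for var, uses in def_site.items():
            if var not in tainted and any(u in tainted for u in uses):
                tainted.add(var)
                changed = True
    return tainted

if __name__ == "__main__":
    prog = [("k", []),            # secret key (source)
            ("t1", ["k"]),        # t1 = f(k)
            ("t2", ["t1", "x"]),  # t2 = g(t1, x)
            ("out", ["t2"])]      # out = t2
    print(propagate_taint(prog, {"k"}))  # {'k', 't1', 't2', 'out'}
```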
In the field of image processing, detecting human motion in video and recognizing actions from video sequences is a complex and challenging task. A novel approach is presented in this paper to detect human motion and recognize the corresponding actions. Different human actions are recognized by tracking the selected object over consecutive frames of a video or image sequence. Initially, the background motion is subtracted from the input video stream and its binary images are constructed. Using spatiotemporal interest points, the object to be monitored is selected by enclosing the required pixels within a bounding rectangle. The selected foreground pixels within the bounding rectangle are then tracked using an edge tracking algorithm. Features are extracted and used to detect human motion. Finally, the different human actions are recognized using a K-Nearest Neighbor classifier. This methodology applies wherever monitoring human actions is required, such as shop surveillance, city surveillance, airport surveillance and other places where security is the prime concern. The results obtained are quite significant and are analyzed on datasets such as KTH and Weizmann, which contain actions like bending, running, walking, skipping, and hand-waving.
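The following is a hedged sketch of a generic pipeline of this kind (background subtraction, bounding boxes, simple shape features, KNN) using OpenCV and scikit-learn; the feature set and file names are hypothetical, and it is not the paper's spatiotemporal-interest-point and edge-tracking implementation.

```python
# Generic sketch: background subtraction, bounding-box extraction, crude
# per-frame shape features, and a K-Nearest Neighbor classifier.
# Assumes the OpenCV 4 findContours return signature; video files and
# labels below are hypothetical placeholders.
import cv2
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def frame_features(video_path, max_frames=200):
    cap = cv2.VideoCapture(video_path)
    subtractor = cv2.createBackgroundSubtractorMOG2()
    feats = []
    while len(feats) < max_frames:
        ok, frame = cap.read()
        if not ok:
            break
        mask = subtractor.apply(frame)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        if not contours:
            continue
        largest = max(contours, key=cv2.contourArea)
        x, y, w, h = cv2.boundingRect(largest)
        feats.append([w, h, w / max(h, 1), cv2.contourArea(largest)])
    cap.release()
    return np.mean(feats, axis=0) if feats else np.zeros(4)

# Hypothetical training clips with action labels (e.g. from KTH).
train = [("walk1.avi", "walking"), ("run1.avi", "running"), ("wave1.avi", "hand-waving")]
X = np.array([frame_features(path) for path, _ in train])
y = [label for _, label in train]
clf = KNeighborsClassifier(n_neighbors=1).fit(X, y)
print(clf.predict([frame_features("test.avi")]))
```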
Online services are increasingly dependent on user participation. Whether it's online social networks or crowdsourcing services, understanding user behavior is important yet challenging. In this paper, we build an unsupervised system to capture dominating user behaviors from clickstream data (traces of users' click events), and visualize the detected behaviors in an intuitive manner. Our system identifies "clusters" of similar users by partitioning a similarity graph (nodes are users; edges are weighted by clickstream similarity). The partitioning process leverages iterative feature pruning to capture the natural hierarchy within user clusters and produce intuitive features for visualizing and understanding captured user behaviors. For evaluation, we present case studies on two large-scale clickstream traces (142 million events) from real social networks. Our system effectively identifies previously unknown behaviors, e.g., dormant users, hostile chatters. Also, our user study shows people can easily interpret identified behaviors using our visualization tool.
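A simplified sketch of the underlying idea is given below: users are represented as click-event histograms, a cosine-similarity graph is formed, and clusters are obtained with agglomerative clustering; the clickstream data and the clustering method are illustrative and do not reproduce the paper's iterative feature pruning.

```python
# Simplified sketch: users as click-event histograms, a cosine-similarity
# "user graph", and agglomerative clustering on the induced distances.
# The clickstreams are hypothetical; the paper's system uses graph
# partitioning with iterative feature pruning instead.
from sklearn.feature_extraction import DictVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.cluster import AgglomerativeClustering

# Hypothetical clickstreams: per-user counts of click-event types.
clickstreams = {
    "u1": {"view_profile": 40, "send_message": 30, "like": 5},
    "u2": {"view_profile": 35, "send_message": 25, "like": 8},
    "u3": {"like": 60, "share": 20},
    "u4": {"like": 55, "share": 25, "view_profile": 2},
}

users = list(clickstreams)
X = DictVectorizer(sparse=False).fit_transform([clickstreams[u] for u in users])
similarity = cosine_similarity(X)   # edge weights of the user similarity graph
distance = 1.0 - similarity         # turn similarity into a distance matrix
# Note: older scikit-learn versions use affinity="precomputed" instead of metric=.
labels = AgglomerativeClustering(n_clusters=2, metric="precomputed",
                                 linkage="average").fit_predict(distance)
for user, label in zip(users, labels):
    print(user, "-> cluster", label)
```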
The modern world has witnessed a dramatic increase in our ability to collect, transmit and distribute real-time monitoring and surveillance data from large-scale information systems and cyber-physical systems. Detecting system anomalies thus attracts a significant amount of interest in many fields such as security, fault management, and industrial optimization. Recently, the invariant network has been shown to be a powerful way of characterizing complex system behaviours. In the invariant network, a node represents a system component and an edge indicates a stable, significant interaction between two components. Structures and evolutions of the invariant network, in particular the vanishing correlations, can shed important light on locating causal anomalies and performing diagnosis. However, existing approaches to detecting causal anomalies with the invariant network often use the percentage of vanishing correlations to rank possible causal components, which has several limitations: 1) fault propagation in the network is ignored; 2) the root causal anomalies may not always be the nodes with a high percentage of vanishing correlations; 3) temporal patterns of vanishing correlations are not exploited for robust detection. To address these limitations, in this paper we propose a network diffusion based framework to identify significant causal anomalies and rank them. Our approach can effectively model fault propagation over the entire invariant network, and can perform joint inference on both the structural and the time-evolving broken invariance patterns. As a result, it can locate high-confidence anomalies that are truly responsible for the vanishing correlations, and can compensate for unstructured measurement noise in the system. Extensive experiments on synthetic datasets, bank information system datasets, and coal plant cyber-physical system datasets demonstrate the effectiveness of our approach.
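The sketch below illustrates only the diffusion idea: "broken invariant" evidence is propagated over the invariant network with a random walk with restart and nodes are ranked by the resulting scores; it is not the paper's joint structural and temporal inference.

```python
# Simplified sketch: rank candidate causal nodes by diffusing "broken
# invariant" evidence over the invariant network with a random walk with
# restart. This only illustrates modeling fault propagation; it is not the
# paper's joint structural/temporal inference.
import numpy as np

def diffusion_rank(adjacency, broken, restart=0.3, iters=100):
    A = np.asarray(adjacency, dtype=float)
    col_sums = A.sum(axis=0)
    col_sums[col_sums == 0] = 1.0
    W = A / col_sums                      # column-normalized transition matrix
    e = np.asarray(broken, dtype=float)
    e = e / max(e.sum(), 1e-12)           # restart distribution from evidence
    r = e.copy()
    for _ in range(iters):
        r = (1 - restart) * W @ r + restart * e
    return np.argsort(-r), r

if __name__ == "__main__":
    # Hypothetical 4-node invariant network.
    adjacency = [[0, 1, 1, 0],
                 [1, 0, 0, 1],
                 [1, 0, 0, 1],
                 [0, 1, 1, 0]]
    broken = [0, 1, 1, 0]   # fraction of vanished correlations per node
    order, scores = diffusion_rank(adjacency, broken)
    print("ranking:", order, "scores:", scores)
```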
Malware detection has been widely studied by analysing either file dropping relationships or characteristics of the file distribution network. This paper, for the first time, studies a global heterogeneous malware delivery graph fusing file dropping relationships and the topology of the file distribution network. The integration offers a unique ability to structure the end-to-end distribution relationship. However, it brings large heterogeneous graphs into the analysis. In our study, an average daily generated graph has more than 4 million edges and 2.7 million nodes that differ in type, such as IPs, URLs, and files. We propose a novel Bayesian label propagation model to unify the multi-source information, including content-agnostic features of different node types and topological information of the heterogeneous network. Our approach neither examines source code nor inspects the dynamic behaviours of a binary. Instead, it estimates the maliciousness of a given file through a semi-supervised label propagation procedure, which has a linear time complexity w.r.t. the number of nodes and edges. The evaluation on 567 million real-world download events validates that our proposed approach efficiently detects malware with a high accuracy.
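As a rough illustration of label propagation on such a graph, the sketch below scores unknown nodes from a few labeled seeds using generic semi-supervised propagation; it is not the paper's Bayesian formulation over heterogeneous node types.

```python
# Simplified sketch: semi-supervised label propagation over a file/URL/IP
# graph, scoring unknown nodes from a few labeled malicious/benign seeds.
# Generic propagation only, not the paper's Bayesian model.
import numpy as np

def propagate_labels(adjacency, seed_scores, alpha=0.8, iters=50):
    A = np.asarray(adjacency, dtype=float)
    d = A.sum(axis=1)
    d[d == 0] = 1.0
    S = A / np.sqrt(np.outer(d, d))           # symmetrically normalized adjacency
    y = np.asarray(seed_scores, dtype=float)  # +1 malicious, -1 benign, 0 unknown
    f = y.copy()
    for _ in range(iters):
        f = alpha * S @ f + (1 - alpha) * y
    return f                                  # higher score = more likely malicious

if __name__ == "__main__":
    # Hypothetical graph: nodes 0-1 are files, node 2 a URL, node 3 an IP.
    adjacency = [[0, 0, 1, 0],
                 [0, 0, 1, 1],
                 [1, 1, 0, 1],
                 [0, 1, 1, 0]]
    seeds = [1, 0, 0, -1]   # file 0 known malicious, IP 3 known benign
    print(propagate_labels(adjacency, seeds))
```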
Convolution serves as the basic computational primitive for various associative computing tasks ranging from edge detection to image matching. CMOS implementation of such computations entails significant bottlenecks in area and energy consumption due to the large number of multiplication and addition operations involved. In this paper, we propose an ultra-low power and compact hybrid spintronic-CMOS design for the convolution computing unit. Low-voltage operation of domain-wall motion based magneto-metallic "Spin-Memristors" interfaced with CMOS circuits is able to perform the convolution operation with reasonable accuracy. Simulation results of Gabor filtering for edge detection reveal ~2.5× lower energy consumption compared to a baseline 45nm-CMOS implementation.
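For reference, the convolution workload being accelerated can be expressed in plain software as Gabor filtering for edge detection; the kernel parameters below are illustrative, and the paper's contribution is the spintronic-CMOS hardware, not this code.

```python
# Software reference for the convolution workload: build a Gabor kernel and
# convolve it with an image for edge detection. Kernel parameters are
# illustrative; the paper realizes this in spintronic-CMOS hardware.
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(size=15, sigma=3.0, theta=0.0, lam=6.0, gamma=0.5):
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xr**2 + gamma**2 * yr**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / lam)

if __name__ == "__main__":
    image = np.random.rand(64, 64)                       # stand-in input image
    response = convolve2d(image, gabor_kernel(theta=np.pi / 4), mode="same")
    edges = np.abs(response) > np.abs(response).mean()   # crude thresholding
    print("edge pixels:", int(edges.sum()))
```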
After being widely studied in theory, physical layer security schemes are getting closer to enter the consumer market. Still, a thorough practical analysis of their resilience against attacks is missing. In this work, we use software-defined radios to implement such a physical layer security scheme, namely, orthogonal blinding. To this end, we use orthogonal frequency-division multiplexing (OFDM) as a physical layer, similarly to WiFi. In orthogonal blinding, a multi-antenna transmitter overlays the data it transmits with noise in such a way that every node except the intended receiver is disturbed by the noise. Still, our known-plaintext attack can extract the data signal at an eavesdropper by means of an adaptive filter trained using a few known data symbols. Our demonstrator illustrates the iterative training process at the symbol level, thus showing the practicability of the attack.
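A toy sketch of the adaptive-filter idea follows: a few known symbols are used to train a single-tap LMS equalizer so that the remaining symbols can be recovered; the scalar channel model is a deliberate simplification of orthogonal blinding over OFDM.

```python
# Toy sketch of the known-plaintext attack idea: an eavesdropper trains an
# LMS adaptive filter on a few known data symbols to undo the (unknown)
# effective channel and recover subsequent symbols. The single-tap complex
# channel model is a deliberate simplification of orthogonal blinding.
import numpy as np

rng = np.random.default_rng(0)
n_train, n_data = 64, 256
symbols = rng.choice([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j], size=n_train + n_data)

h = 0.7 - 0.4j                                   # unknown effective channel
noise = 0.05 * (rng.standard_normal(symbols.size) + 1j * rng.standard_normal(symbols.size))
received = h * symbols + noise                   # what the eavesdropper observes

w, mu = 0 + 0j, 0.05                             # single-tap LMS filter and step size
for k in range(n_train):                         # training on known symbols
    err = symbols[k] - w * received[k]
    w += mu * err * np.conj(received[k])

estimate = w * received[n_train:]                # equalize the unknown part
hard = np.sign(estimate.real) + 1j * np.sign(estimate.imag)
print("symbol error rate:", np.mean(hard != symbols[n_train:]))
```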
Whether it is for conditional statement, constant, opaque predicate or equation obfuscation, the Mixed Boolean-Arithmetic (MBA) technique is a powerful tool providing concrete ways to achieve obfuscation. Recent papers ([22,1]) presented ways to mix such tools with permutation polynomials modulo 2^n in order to make them more resilient to SMT solvers. However, because of limitations regarding the inversion of such permutations, the set of permutation polynomials presented suffers from some restrictions. Such restrictions enable several methods of arithmetic simplification, decreasing their effectiveness at hiding information. In this work, we present general methods for permutation polynomial inversion. Those methods allow us to remove some of the restrictions presented in the literature, making simplification methods less effective. We discuss the complexity and limits of these methods, and conclude that not only may current simplification methods not be as effective as we thought, but there are still many uses of polynomial permutations in obfuscation that are yet to be explored.
Security against hardware trojans is currently becoming an essential ingredient to ensure trust in information systems. A variety of solutions have been introduced to reach this goal, ranging from reactive (i.e., detection-based) to preventive (i.e., trying to make the insertion of a trojan more difficult for the adversary). In this paper, we show how testing (which is a typical detection tool) can be used to state concrete security guarantees for preventive approaches to trojan-resilience. For this purpose, we build on and formalize two important previous works which introduced "input scrambling" and "split manufacturing" as countermeasures to hardware trojans. Using these ingredients, we present a generic compiler that can transform any circuit into a trojan-resilient one, for which we can state quantitative security guarantees on the number of correct executions of the circuit thanks to a new tool denoted as "testing amplification". Compared to previous works, our threat model covers an extended range of hardware trojans while we stick with the goal of minimizing the number of honest elements in our transformed circuits. Since transformed circuits essentially correspond to redundant multiparty computations of the target functionality, they also allow reasonably efficient implementations, which can be further optimized if specialized to certain cryptographic primitives and security goals.
The premise of this year's SafeConfig Workshop is that existing tools and methods for security assessments are necessary but insufficient for scientifically rigorous testing and evaluation of resilient and active cyber systems. The objective of this workshop is the exploration and discussion of scientifically sound testing regimens that will continuously and dynamically probe, attack, and "test" the various resilient and active technologies. This adaptation and change in focus necessitates, at the very least, modification and potentially wholesale new developments to ensure that resilient- and agile-aware security testing is available to the research community. All testing, validation and experimentation must also be repeatable, reproducible, subject to scientific scrutiny, measurable and meaningful to both researchers and practitioners.
Resiliency is a relatively new topic in the context of access control. Informally, it refers to the extent to which a multi-user computer system, subject to an authorization policy, is able to continue functioning if a number of authorized users are unavailable. Several interesting problems connected to resiliency were introduced by Li, Wang and Tripunitara [13], many of which were found to be intractable. In this paper, we show that these resiliency problems have unexpected connections with the workflow satisfiability problem (WSP). In particular, we show that an instance of the resiliency checking problem (RCP) may be reduced to an instance of WSP. We then demonstrate that recent advances in our understanding of WSP enable us to develop fixed-parameter tractable algorithms for RCP. Moreover, these algorithms are likely to be useful in practice, given recent experimental work demonstrating the advantages of bespoke algorithms to solve WSP. We also generalize RCP in several different ways, showing in each case how to adapt the reduction to WSP. Li et al. also showed that the coexistence of resiliency policies and static separation-of-duty policies gives rise to further interesting questions. We show how our reduction of RCP to WSP may be extended to solve these problems as well and establish that they are also fixed-parameter tractable.
Complex simulations and numerical experiments typically rely on a number of parameters and have an associated score function, e.g. with the goal of maximizing accuracy or minimizing computation time. However, the influence of each individual parameter is often poorly understood a priori and the joint parameter space can be difficult to explore, visualize and optimize. We model this space as an N-dimensional black-box tensor and apply a cross approximation strategy to sample it. Upon learning and compactly expressing this space as a surrogate visualization model, informative subspaces are interactively reconstructed and navigated in the form of charts, images, surface plots, etc. By exploiting efficient operations in the tensor train format, we are able to produce diagrams such as parallel coordinates, bivariate projections and dimensional stacking out of highly-compressed parameter spaces. We demonstrate the proposed framework with several scientific simulations that contain up to 6 parameters and billions of tensor grid points.
Traditional vibration inspection systems, equipped with separated sensing and communication modules, are either very expensive (e.g., hundreds of dollars) or suffer from occlusion and a narrow field of view (e.g., laser). In this work, we present an RFID-based solution, Tagbeat, to inspect mechanical vibration using COTS RFID tags and readers. Making sense of micro and high-frequency vibration using random and low-frequency readings of a tag has been a daunting task, especially challenging for achieving sub-millisecond period accuracy. Our system achieves these goals by discerning the change pattern of the backscatter signal returned by the tag, which is attached to the vibrating surface and displaced by the vibration within a small range. This work introduces three main innovations. First, it shows how one can utilize COTS RFID to sense mechanical vibration and accurately discover its period with a few periods of short and noisy samples. Second, a new digital microscope is designed to amplify the micro-vibration-induced weak signals. Third, Tagbeat introduces compressive reading to inspect high-frequency vibration with a relatively low RFID read rate. We implement Tagbeat using a COTS RFID device and evaluate it with a commercial centrifugal machine. Empirical benchmarks with a prototype show that Tagbeat can inspect the vibration period with a mean accuracy of 0.36ms and a relative error rate of 0.03%. We also study three cases to demonstrate how to associate our inspection solution with the specific domain requirements.
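To illustrate period recovery from sparse, noisy readings, the sketch below interpolates irregularly timestamped samples onto a uniform grid and locates the dominant autocorrelation peak; it stands in for the general idea only and is not Tagbeat's amplification or compressive-reading pipeline.

```python
# Simplified sketch: estimate a vibration period from noisy, irregularly
# timestamped tag readings by interpolating onto a uniform grid and locating
# the dominant autocorrelation peak. Not Tagbeat's actual pipeline.
import numpy as np

def estimate_period(timestamps, values, grid_hz=1000.0):
    t = np.asarray(timestamps, dtype=float)
    v = np.asarray(values, dtype=float)
    grid = np.arange(t[0], t[-1], 1.0 / grid_hz)
    x = np.interp(grid, t, v)
    x -= x.mean()
    ac = np.correlate(x, x, mode="full")[x.size - 1:]
    # Skip the initial lobe around lag 0, then take the strongest peak.
    below = np.where(ac < 0)[0]
    start = below[0] if below.size else 1
    lag = start + np.argmax(ac[start:])
    return lag / grid_hz

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    true_period = 0.050                                   # 20 Hz vibration
    t = np.sort(rng.uniform(0, 1, 400))                   # irregular read times
    v = np.sin(2 * np.pi * t / true_period) + 0.2 * rng.standard_normal(t.size)
    print("estimated period (s):", estimate_period(t, v))
```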
In recent years, the analog-to-information converter (AIC), based on the compressed sensing (CS) paradigm, has emerged as a promising solution to overcome the performance and energy-efficiency limitations of traditional analog-to-digital converters (ADC). In particular, AIC can enable sub-Nyquist signal sampling proportional to the intrinsic information in biomedical applications. However, the legacy AIC structure is tailored toward specific applications, which lacks flexibility and prevents universal use. In this paper, we introduce a novel programmable AIC architecture, Pro-AIC, to enable effective configurability and reduce its energy overhead by integrating an efficient multiplexing hardware design. To improve the quality and time-efficiency of Pro-AIC configuration, we also develop a rapid configuration algorithm, called RapSpiral, to quickly find the near-optimal parameter configuration in the Pro-AIC architecture. Specifically, we present a design metric, trade-off penalty, to quantitatively evaluate the performance-energy trade-off. RapSpiral controls a penalty-driven shrinking triangle to progressively approximate the optimal trade-off. Our proposed RapSpiral has log(n) complexity yet high accuracy, without pretraining or a complex parameter tuning procedure. RapSpiral is also likely to avoid local-minimum pitfalls. Experimental results indicate that our RapSpiral algorithm can achieve more than 30x speedup compared with the brute force algorithm, with only about a 3% trade-off compromise relative to the optimum in Pro-AIC. Furthermore, the scalability is also verified on larger benchmarks.
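The sketch below shows a generic penalty-driven shrinking-triangle search over a two-parameter space; the penalty function is hypothetical, and the procedure only illustrates the idea of progressively shrinking toward a good trade-off point, not the published RapSpiral algorithm.

```python
# Generic sketch of a penalty-driven shrinking-triangle search over a
# two-parameter configuration space. The penalty function is hypothetical;
# this only illustrates the shrinking-triangle idea, not RapSpiral itself.
import numpy as np

def tradeoff_penalty(params):
    # Hypothetical penalty combining reconstruction error and energy cost.
    x, y = params
    return (x - 0.3) ** 2 + 2.0 * (y - 0.7) ** 2

def shrinking_triangle(penalty, triangle, iters=40, shrink=0.7):
    tri = np.asarray(triangle, dtype=float)
    for _ in range(iters):
        scores = [penalty(p) for p in tri]
        best = tri[int(np.argmin(scores))].copy()
        tri = best + shrink * (tri - best)   # pull vertices toward the best one
    scores = [penalty(p) for p in tri]
    best = tri[int(np.argmin(scores))]
    return best, penalty(best)

if __name__ == "__main__":
    start = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    best, score = shrinking_triangle(tradeoff_penalty, start)
    print("near-optimal configuration:", best, "penalty:", score)
```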
In this paper, we develop a new aggregation technique to reduce the cost of surveying. Our method aims to jointly estimate a vector of target quantities, such as public opinion or voter intent, across time and to maintain good estimates when using only a fraction of the data. Inspired by the James-Stein estimator, we resolve this challenge by shrinking the estimates toward a global mean which is assumed to have a sparse representation in some known basis. This assumption has led to two different methods for estimating the global mean: orthogonal matching pursuit and deep learning. Both significantly reduce the number of samples needed to achieve good estimates of the true means of the data and, in the case of presidential elections, can estimate the outcome of the 2012 United States elections while saving hundreds of thousands of samples and maintaining accuracy.
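As a worked illustration of the shrinkage idea, the sketch below estimates a sparse global mean in a DCT basis with orthogonal matching pursuit and shrinks noisy estimates toward it; the basis, sparsity level, and shrinkage weight are illustrative assumptions, not the paper's choices.

```python
# Simplified sketch of the shrinkage idea: estimate a sparse global mean in a
# DCT basis via Orthogonal Matching Pursuit, then shrink noisy per-time
# estimates toward it. Basis, sparsity level, and shrinkage weight are
# illustrative assumptions.
import numpy as np
from scipy.fftpack import idct
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(0)
T = 64                                            # number of time points
basis = idct(np.eye(T), norm="ortho", axis=0)     # columns = DCT atoms

true_mean = basis[:, [1, 3, 7]] @ np.array([2.0, -1.0, 0.5])   # sparse in DCT
noisy = true_mean + 0.5 * rng.standard_normal(T)               # raw estimates

omp = OrthogonalMatchingPursuit(n_nonzero_coefs=3).fit(basis, noisy)
global_mean = omp.predict(basis)                  # sparse reconstruction

weight = 0.6                                      # shrinkage strength (illustrative)
shrunk = weight * global_mean + (1 - weight) * noisy
print("raw RMSE:", np.sqrt(np.mean((noisy - true_mean) ** 2)))
print("shrunk RMSE:", np.sqrt(np.mean((shrunk - true_mean) ** 2)))
```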
In a wireless system, a signal map shows the signal strength at different locations termed reference points (RPs). As access points (APs) and their transmission power may change over time, keeping an updated signal map is important for applications such as Wi-Fi optimization and indoor localization. Traditionally, the signal map is obtained by a full site survey, which is time-consuming and costly. We address in this paper how to efficiently update a signal map given sparse samples randomly crowdsourced in the space (e.g., by signal monitors, explicit human input, or implicit user participation). We propose Compressive Signal Reconstruction (CSR), a novel learning system employing Bayesian compressive sensing (BCS) for online signal map update. CSR does not rely on any path loss model or line of sight, and is generic enough to serve as a plug-in of any wireless system. Besides signal map update, CSR also computes the estimation error of signals in terms of confidence interval. CSR models the signal correlation with a kernel function. Using it, CSR constructs a sensing matrix based on the newly sampled signals. The sensing matrix is then used to compute the signal change at all the RPs with any BCS algorithm. We have conducted extensive experiments on CSR in our university campus. Our results show that CSR outperforms other state-of-the-art algorithms by a wide margin (reducing signal error by about 30% and sampling points by 20%).
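A simplified sketch of the update step follows: spatial correlation is modeled with an RBF kernel, a sensing matrix is formed from the sparsely sampled reference points, and the signal change at all RPs is recovered; Lasso is used here as a stand-in for a Bayesian compressive sensing solver, and the kernel bandwidth, grid, and noise levels are assumptions.

```python
# Simplified sketch of the signal-map update idea: model spatial correlation
# with an RBF kernel, form a sensing matrix from the sparsely sampled RPs,
# and recover the signal change at all RPs. Lasso stands in for a Bayesian
# compressive sensing solver; grid, bandwidth, and noise are assumptions.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
grid = np.array([(x, y) for x in range(10) for y in range(10)], dtype=float)  # 100 RPs

def rbf_kernel(a, b, bandwidth=2.0):
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bandwidth ** 2))

# Ground-truth signal change (simulation only): a localized drop near one AP.
true_change = -10.0 * np.exp(-((grid - [3, 6]) ** 2).sum(1) / 8.0)

sampled = rng.choice(len(grid), size=20, replace=False)       # crowdsourced RPs
measurements = true_change[sampled] + 0.5 * rng.standard_normal(sampled.size)

K = rbf_kernel(grid, grid)                                    # correlation dictionary
A = K[sampled, :]                                             # sensing matrix
model = Lasso(alpha=0.05, max_iter=20000).fit(A, measurements)
estimate = model.predict(K)                                   # change at all 100 RPs
print("RMSE over all reference points:", np.sqrt(np.mean((estimate - true_change) ** 2)))
```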