Bibliography
In a world where traditional notions of privacy are increasingly challenged by the myriad companies that collect and analyze our data, it is important that decision-making entities are held accountable for unfair treatment arising from irresponsible data usage. Unfortunately, a lack of appropriate methodologies and tools means that even identifying unfair or discriminatory effects can be a challenge in practice. We introduce the unwarranted associations (UA) framework, a principled methodology for the discovery of unfair, discriminatory, or offensive user treatment in data-driven applications. The UA framework unifies and rationalizes a number of prior attempts at formalizing algorithmic fairness. It uniquely combines multiple investigative primitives and fairness metrics with broad applicability, granular exploration of unfair treatment in user subgroups, and incorporation of natural notions of utility that may account for observed disparities. We instantiate the UA framework in FairTest, the first comprehensive tool that helps developers check data-driven applications for unfair user treatment. It enables scalable and statistically rigorous investigation of associations between application outcomes (such as prices or premiums) and sensitive user attributes (such as race or gender). Furthermore, FairTest provides debugging capabilities that let programmers rule out potential confounders for observed unfair effects. We report on the use of FairTest to investigate, and in some cases address, disparate impact, offensive labeling, and uneven rates of algorithmic error in four data-driven applications. As examples, our results reveal subtle biases against older populations in the distribution of error in a predictive health application, and offensive racial labeling in an image tagger.
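As an illustration of the core investigative primitive described above, the following minimal sketch (not FairTest's actual API; the data, attribute names, and subgroup are invented) tests for an association between a sensitive attribute and an application outcome, both globally and within a user subgroup:

```python
# Illustrative sketch only: chi-squared association test between a
# sensitive attribute and an outcome, globally and in one subgroup.
from scipy.stats import chi2_contingency

def association_test(sensitive, outcome):
    """Return the chi-squared p-value for a contingency table of
    sensitive-attribute values versus outcomes."""
    values = sorted(set(sensitive))
    outcomes = sorted(set(outcome))
    table = [[sum(1 for s, o in zip(sensitive, outcome) if s == v and o == u)
              for u in outcomes] for v in values]
    _, p, _, _ = chi2_contingency(table)
    return p

# Hypothetical data: gender vs. price tier, with a location subgroup.
gender   = ["F", "M", "F", "M", "F", "M", "F", "M"]
price    = ["hi", "lo", "hi", "lo", "hi", "lo", "lo", "lo"]
location = ["NYC"] * 4 + ["SF"] * 4

print("global p-value:", association_test(gender, price))
nyc = [i for i, l in enumerate(location) if l == "NYC"]
print("NYC subgroup p-value:",
      association_test([gender[i] for i in nyc], [price[i] for i in nyc]))
```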
Customers need to know how reliable a new release is, and whether or not the new release has substantially different, either better or worse, reliability than the one currently in production. Customers are demanding quantitative evidence, based on pre-release metrics, to help them decide whether or not to upgrade (and thereby offer new features and capabilities to their customers). Finding ways to estimate future reliability performance is not easy - we have evaluated many pre-release development and test metrics in search of reliability predictors that are sufficiently accurate and also apply to a broad range of software products. This paper describes a successful model that has resulted from these efforts, and also presents both a functional extension and a further conceptual simplification of the extended model that enables us to better communicate key release information to internal stakeholders and customers, without sacrificing predictive accuracy or generalizability. Work remains to be done, but the results of the original model, the extended model, and the simplified version are encouraging and are currently being applied across a range of products and releases. To evaluate whether or not these early predictions are accurate, and also to compare releases that are available to customers, we use a field software reliability assessment mechanism that incorporates two types of customer experience metrics: field bug encounters normalized by usage, and field bug counts, also normalized by usage. Our 'release-over-release' strategy combines the 'maturity assessment' component (i.e., estimating reliability prior to release to the field) and the 'reliability assessment' component (i.e., gauging actual reliability after release to the field). This overall approach enables us to both predict reliability and compare reliability results for recent releases of a product.
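To make the field metrics concrete, here is a minimal sketch, with invented per-release numbers, of the two usage-normalized customer-experience metrics described above:

```python
# Hypothetical per-release field data: comparing releases on usage-normalized
# bug encounters and distinct bug counts, as in the paper's assessment.
releases = {
    # release: (bug_encounters, distinct_bugs, usage_in_system_months)
    "R1": (120, 45, 1500.0),
    "R2": (80, 30, 1400.0),
}

for name, (encounters, bugs, usage) in releases.items():
    print(f"{name}: {encounters / usage:.4f} encounters/sys-month, "
          f"{bugs / usage:.4f} bugs/sys-month")
```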
Efficient management and control of modern and next-gen networks is of paramount importance as networks have to maintain highly reliable service quality whilst supporting rapid growth in traffic demand and new application services. Rapid mitigation of network service degradations is a key factor in delivering high service quality. Automation is vital to achieving rapid mitigation of issues, particularly at the network edge where the scale and diversity are greatest. This automation involves the rapid detection, localization and (where possible) repair of service-impacting faults and performance impairments. However, the most significant challenge here is knowing what events to detect, how to correlate events to localize an issue, and what mitigation actions should be performed in response to the identified issues. These are defined as policies for systems such as ECOMP. In this paper, we present AESOP, a data-driven intelligent system to facilitate automatic learning of policies and rules for triggering remedial actions in networks. AESOP combines best operational practices (domain knowledge) with a variety of measurement data to learn and validate operational policies to mitigate service issues in networks. AESOP's design addresses the following key challenges: (i) learning from high-dimensional noisy data, (ii) capturing multiple fault models, (iii) modeling the high service-cost of false positives, and (iv) accounting for the evolving network infrastructure. We present the design of our system and report results from our ongoing experiments that show the effectiveness of our policy learning framework.
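As a loose illustration only (not AESOP itself), one way to learn a human-readable policy mapping event features to remedial actions is a shallow decision tree; the feature names, data, and actions below are all hypothetical:

```python
# Illustrative sketch: learning an interpretable event -> action policy
# with a shallow decision tree over made-up network-health features.
from sklearn.tree import DecisionTreeClassifier, export_text

# Columns: [packet_loss_pct, cpu_util_pct, link_flaps_per_hr]
X = [[0.1, 40, 0], [5.0, 45, 12], [0.2, 95, 0], [6.0, 90, 15]]
y = ["no_action", "reset_port", "reboot_card", "replace_card"]

policy = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(policy,
      feature_names=["packet_loss", "cpu_util", "link_flaps"]))
```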
When a person gets to a door and wants to get in, what do they do? They knock. In our system, the user's specific knock pattern authenticates their identity and opens the door for them. The system empowers people's intuitive actions and responses to affect the world around them in a new way. We leverage IoT and physical computing to make more technology feel like less. From there, a knock-based entrance creates affordances for social interaction in shared spaces, where fluidity of ownership and accessibility must be balanced with security.
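A minimal sketch of one plausible matching scheme (the abstract does not specify the algorithm used): capture knock timestamps, normalize inter-knock intervals so tempo does not matter, and compare against an enrolled template within a tolerance:

```python
# Hypothetical knock-pattern matcher over detected knock timestamps.
def intervals(timestamps):
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    total = sum(gaps)
    return [g / total for g in gaps]  # normalize so tempo doesn't matter

def knock_matches(template_ts, attempt_ts, tolerance=0.1):
    t, a = intervals(template_ts), intervals(attempt_ts)
    return len(t) == len(a) and all(abs(x - y) <= tolerance
                                    for x, y in zip(t, a))

enrolled = [0.00, 0.30, 0.60, 1.20]                       # enrolled rhythm
print(knock_matches(enrolled, [0.0, 0.28, 0.61, 1.18]))   # True: close enough
print(knock_matches(enrolled, [0.0, 0.30, 0.60, 0.90]))   # False: wrong rhythm
```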
A low-power three-position, four-way direct-drive control valve based on a hybrid excited linear actuator (HELA-DDCV) is presented to meet response-time and power-consumption requirements. A numerical model of the coupled system, built in Matlab/Simulink, was established and validated by experiments from four points of view: electric circuit, electromagnetic field, mechanics, and fluid mechanics. A dual-closed-loop PI control strategy for both spool displacement and coil current is adopted, and the displacement response was analyzed along with the power-consumption performance. The results show that the prototype valve's spool displacement response time is less than 9.6 ms. Furthermore, the holding current is less than 30% of the peak current during operation, which effectively reduces power consumption and improves system stability. Notably, the holding current can be eliminated when the spool rests at the ends of its stroke, in which case only 0.26 J of energy is needed per action, independent of the working time.
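The following toy simulation illustrates the dual-closed-loop PI structure: an outer loop on spool displacement produces a current reference, and an inner loop on coil current produces the drive voltage. The plant models and gains are illustrative stand-ins, not the paper's identified parameters:

```python
# Toy cascaded PI simulation: displacement loop commands a current reference,
# current loop commands coil voltage. All constants are invented.
dt = 1e-4
kp_x, ki_x = 4000.0, 100.0   # outer (displacement) PI gains
kp_i, ki_i = 80.0, 200.0     # inner (current) PI gains

x = i = 0.0                  # spool position [m], coil current [A]
int_x = int_i = 0.0          # integrator states
x_ref = 0.002                # 2 mm step command

for _ in range(2000):        # simulate 200 ms
    e_x = x_ref - x
    int_x += e_x * dt
    i_ref = kp_x * e_x + ki_x * int_x   # outer loop sets current reference
    e_i = i_ref - i
    int_i += e_i * dt
    v = kp_i * e_i + ki_i * int_i       # inner loop sets coil voltage
    i += dt * (v - 5.0 * i) / 0.01      # toy coil: L di/dt = v - R i
    x += dt * 0.004 * i                 # toy spool: velocity ~ current
print(f"x = {x * 1000:.3f} mm after 200 ms")
```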
Identity masking methods have been developed in recent years for use in multiple applications aimed at protecting privacy. There is only limited work, however, targeted at evaluating the effectiveness of these methods, with only a handful of studies testing identity masking effectiveness for human perceivers. Here, we employed human participants to evaluate identity masking algorithms on video data of drivers, which contains subtle movements of the face and head. We evaluated the effectiveness of the “personalized supervised bilinear regression method for Facial Action Transfer (FAT)” de-identification algorithm. We also evaluated an edge-detection filter as an alternate “fill-in” method for when face tracking failed due to abrupt or fast head motions. Our primary goal was to develop methods for human-based evaluation of the effectiveness of identity masking. To this end, we designed and conducted two experiments to address the effectiveness of masking in preventing recognition and in preserving action perception. (1) How effective is an identity masking algorithm? We conducted a face recognition experiment and employed Signal Detection Theory (SDT) to measure human accuracy and decision bias. The accuracy results show that both masks (FAT mask and edge detection) are effective, but that neither completely eliminated recognition. However, the decision bias data suggest that both masks altered the participants' response strategy and made them less likely to affirm identity. (2) How effectively does the algorithm preserve actions? We conducted two experiments on facial behavior annotation. Results showed that masking had a negative effect on annotation accuracy for the majority of actions, with differences across action types. Notably, the FAT mask preserved actions better than the edge-detection mask. To our knowledge, this is the first study to evaluate a de-identification method aimed at preserving facial actions with human evaluators in a laboratory setting.
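A worked sketch of the Signal Detection Theory measures used in the paper, sensitivity d' and decision criterion c, computed from hypothetical hit and false-alarm counts:

```python
# SDT measures from a face recognition experiment; the counts are invented.
from scipy.stats import norm

hits, misses = 30, 20                # "same identity" trials
false_alarms, correct_rej = 10, 40   # "different identity" trials

hit_rate = hits / (hits + misses)
fa_rate = false_alarms / (false_alarms + correct_rej)

d_prime = norm.ppf(hit_rate) - norm.ppf(fa_rate)
criterion = -0.5 * (norm.ppf(hit_rate) + norm.ppf(fa_rate))
print(f"d' = {d_prime:.2f}, c = {criterion:.2f}")
# A positive c indicates a conservative bias: observers become less willing
# to affirm identity, the pattern reported for masked faces.
```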
In the proposed method, traditional elevators are upgraded so that any alarming situation in the elevator can be detected and reported to a main center, where further action can be taken accordingly. Different emergency situations can be handled by implementing the system. The smart elevator system works by installing various modules inside the elevator, such as speed sensors that detect speed variations above or below a certain threshold. When triggered, the system sends a message to the emergency response center and places an automated call as well. The smart system also includes an emotion detection algorithm, which detects the emotions of individuals in the elevator based on their expressions, and a whisper detection system to determine whether someone stuck inside the elevator is alive during a hazardous situation. A broadcast signal is used as a check to verify that every part of the system is in a stable state. The proposed system can completely replace current elevator systems and become part of smart homes.
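A minimal sketch of the speed-monitoring module described above; the thresholds are hypothetical and the alert transport (SMS/automated call) is stubbed out:

```python
# Toy speed monitor: flag readings outside a safe band and dispatch an alert.
SAFE_MIN, SAFE_MAX = 0.0, 2.5   # m/s, hypothetical thresholds

def send_alert(message):        # stand-in for the SMS/call gateway
    print("ALERT ->", message)

def check_speed(samples):
    for t, v in samples:
        if not (SAFE_MIN <= v <= SAFE_MAX):
            send_alert(f"speed {v:.2f} m/s out of range at t={t}s")

check_speed([(0, 1.2), (1, 2.4), (2, 3.1), (3, 0.4)])
```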
We present a novel multimodal fusion model for affective content analysis, combining visual, audio, and deep visual-sentiment descriptors of the media content with automated facial action measurements from naturalistic responses to the media. We collected a dataset of 48,867 facial responses to 384 media clips and extracted a rich feature set from the facial responses and media content. The stimulus videos were validated to be informative, inspiring, persuasive, sentimental, or amusing. By combining the features, we obtained a classification accuracy of 63% (weighted F1-score: 0.62) on a five-class task, a significant improvement over using the media content features alone. By analyzing the feature sets independently, we found that the informed and persuaded states were difficult to differentiate from facial responses alone due to the presence of similar sets of action units in each state (AU 2 occurring frequently in both cases). Facial actions were beneficial in differentiating between amused and informed states, whereas media content features alone performed less well due to similarities in the visual and audio makeup of the content. We highlight examples of content and reactions from each class. This is the first affective content analysis based on the reactions of tens of thousands of people.
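As an illustration of feature-level fusion (not the paper's exact pipeline or feature set), the sketch below concatenates media-content and facial-action features, trains a single five-class classifier on synthetic data, and scores with the weighted F1 metric the paper reports:

```python
# Early (feature-level) multimodal fusion on synthetic stand-in features.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
media = rng.normal(size=(n, 20))     # visual/audio/visual-sentiment features
facial = rng.normal(size=(n, 17))    # e.g., facial action (AU) intensities
y = rng.integers(0, 5, size=n)       # informative..amusing, coded 0-4

X = np.hstack([media, facial])       # fuse by concatenation
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(Xtr, ytr)
print("weighted F1:", f1_score(yte, clf.predict(Xte), average="weighted"))
```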
Crowd management in urban settings has mostly relied on either classical, non-automated mechanisms or spontaneous notifications/alerts through social networks. Such management techniques are heavily marred by a lack of comprehensive control, especially in terms of averting risks in a manner that ensures crowd safety and enables prompt emergency response. In this paper, we propose a Markov Decision Process (MDP) scheme to realize a smart infrastructure directly aimed at crowd management. A key emphasis of the scheme is robust and reliable scalability that provides sufficient flexibility to manage a mixed crowd (i.e., pedestrians, cyclists, manned vehicles, and unmanned vehicles). The infrastructure also spans various population settings (e.g., roads, buildings, game arenas, etc.). To realize a reliable and scalable crowd management scheme, the classical MDP is decomposed into local MDPs (L-MDPs) with smaller state-action spaces. Preliminary results show that the MDP decomposition can reduce the global system cost and facilitate fast convergence to a local near-optimal solution for each L-MDP.
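A compact value-iteration sketch for a single local MDP (L-MDP) with a small state-action space; the transition probabilities and costs are randomly generated stand-ins:

```python
# Value iteration on one toy L-MDP, minimizing expected discounted cost.
import numpy as np

n_states, n_actions, gamma = 4, 2, 0.95
rng = np.random.default_rng(1)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s,a,s']
cost = rng.uniform(0, 1, size=(n_states, n_actions))

V = np.zeros(n_states)
for _ in range(500):
    Q = cost + gamma * P @ V          # Q[s,a]
    V_new = Q.min(axis=1)             # greedy cost-minimizing backup
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new
policy = Q.argmin(axis=1)
print("local policy:", policy, "V:", np.round(V, 3))
```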
This paper presents the results of research and simulation of the features of automated control of a hysteretic object, and of the difference between automated and automatic control. The distinguishing feature of automated control is that the control loop contains a human being as the regulator, with a limited response speed; the human reaction can be modeled as an integrating link. A hysteretic object switches from one state to another, and each switch is followed by a transient process from one characteristic to the other, which makes it very difficult to hold the object in a desired state. Automatic operation ensures fast switching of the feedback signal, producing a mode that is in many ways similar to sliding mode: the control signal abruptly switches from maximum to minimum and vice versa, and its average value provides the necessary action on the object. Theoretical analysis and simulation show that using the maximum value of the control signal is not required; it is sufficient that the switching oscillation amplitude lets the output traverse both branches of the hysteretic characteristic in the fastest cycle, so that the average output corresponds to the prescribed value of the control task. Under automated control, with the human response approximated by an integrating regulator, the oscillation amplitude can be excessively high and the frequency excessively low. The simulation showed that introducing an artificial additional fluctuation into the control signal makes it possible to reduce the amplitude and correspondingly increase the frequency of oscillation near the prescribed value. This should be evaluated as a way to improve the quality of automated control performed with a human in the loop. The paper presents some practical examples of the examined method.
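The following toy simulation illustrates the idea: a relay-with-hysteresis plant driven through an integrating regulator (a crude model of the human operator), with a small high-frequency dither added to the control signal to encourage earlier switching. All parameters are illustrative:

```python
# Toy hysteretic-object loop: integrating "human" regulator plus dither.
import numpy as np

dt, T = 1e-3, 10.0
setpoint = 1.0

def relay_with_hysteresis(u, state, lo=-0.2, hi=0.2):
    if u > hi:
        return 1.0
    if u < lo:
        return -1.0
    return state          # inside the band: keep the previous output

y, u, relay = 0.0, 0.0, -1.0
tail = []
for k in range(int(T / dt)):
    tk = k * dt
    err = setpoint - y
    u += 0.8 * err * dt                          # integrating regulator
    dither = 0.05 * np.sin(2 * np.pi * 40 * tk)  # artificial fluctuation
    relay = relay_with_hysteresis(u + dither, relay)
    y += dt * (relay - 0.5 * y) / 0.3            # first-order plant
    if tk > 8.0:
        tail.append(y)
print(f"mean = {np.mean(tail):.3f}, peak-to-peak = {np.ptp(tail):.3f}")
```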
Mission assurance requires effective, near-real-time defensive cyber operations to appropriately respond to cyber attacks, without having a significant impact on operations. The ability to rapidly compute, prioritize and execute network-based courses of action (CoAs) relies on accurate situational awareness and mission-context information. Although diverse solutions exist for automatically collecting and analysing infrastructure data, few deliver automated analysis and implementation of network-based CoAs in the context of the ongoing mission. In addition, such processes can be operator-intensive, and available tools tend to be specific to a set of common data sources and network responses. To address these issues, Defence Research and Development Canada (DRDC) is leading the development of the Automated Computer Network Defence (ARMOUR) technology demonstrator and cyber defence science and technology (S&T) platform. ARMOUR integrates new and existing off-the-shelf capabilities to provide enhanced decision support and to automate many of the tasks currently executed manually by network operators. This paper describes the cyber defence integration framework, situational awareness, and automated mission-oriented decision support that ARMOUR provides.
In a continually evolving cyber-threat landscape, the detection and prevention of cyber attacks has become a complex task. Technological developments have led organisations to digitise the majority of their operations. This practice, however, has its perils, since cyberspace offers a new attack surface. Institutions tasked with protecting organisations from these threats rely mainly on network data, and their incident response strategies remain oblivious to the organisation's needs when it comes to protecting operational aspects. This paper presents a system able to combine threat intelligence data, attack-trend data and organisational data (along with other available data sources) in order to achieve automated network-defence actions. Our approach combines machine learning, visual analytics and information from business processes to guide the decision-making process in a Security Operations Centre environment. We test our system on two synthetic scenarios and show that correlating network data with non-network data for automated network defences is possible and worth investigating further.
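As a loose illustration of the decision-support idea (not the paper's system), the sketch below combines a network-level threat score with business-process criticality to rank candidate automated defences; the assets, scores, and actions are hypothetical:

```python
# Toy risk ranking: network threat score x business criticality -> action.
alerts = [
    # (asset, threat_score 0-1, business_criticality 0-1)
    ("hr-portal", 0.9, 0.3),
    ("payment-gateway", 0.6, 1.0),
    ("test-vm", 0.95, 0.05),
]

def action_for(risk):
    if risk > 0.5:
        return "isolate host"
    if risk > 0.25:
        return "rate-limit + notify analyst"
    return "log only"

for asset, threat, crit in sorted(alerts, key=lambda a: -(a[1] * a[2])):
    risk = threat * crit
    print(f"{asset}: risk={risk:.2f} -> {action_for(risk)}")
```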
Performing large-scale malware classification is increasingly becoming a critical step in malware analytics as the number and variety of malware samples is rapidly growing. Statistical machine learning constitutes an appealing method to cope with this increase as it can use mathematical tools to extract information out of large-scale datasets and produce interpretable models. This has motivated a surge of scientific work in developing machine learning methods for detection and classification of malicious executables. However, an optimal method for extracting the most informative features for different malware families, with the final goal of malware classification, is yet to be found. Fortunately, neural networks have evolved to the state that they can surpass the limitations of other methods in terms of hierarchical feature extraction. Consequently, neural networks can now offer superior classification accuracy in many domains such as computer vision and natural language processing. In this paper, we transfer the performance improvements achieved in the area of neural networks to model the execution sequences of disassembled malicious binaries. We implement a neural network that consists of convolutional and feedforward neural constructs. This architecture embodies a hierarchical feature extraction approach that combines convolution of n-grams of instructions with plain vectorization of features derived from the headers of the Portable Executable (PE) files. Our evaluation results demonstrate that our approach outperforms baseline methods, such as simple Feedforward Neural Networks and Support Vector Machines, achieving 93% precision and recall even in the presence of obfuscation in the data.
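A sketch of the described architecture in PyTorch, with made-up sizes: convolutions act as n-gram detectors over instruction tokens and are concatenated with a feedforward embedding of PE-header features:

```python
# Illustrative hybrid model: instruction n-gram convolutions + header MLP.
import torch
import torch.nn as nn

class MalwareNet(nn.Module):
    def __init__(self, vocab=512, emb=32, header_dim=40):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        # kernel sizes 2-4 act as 2/3/4-gram detectors over instructions
        self.convs = nn.ModuleList(
            [nn.Conv1d(emb, 64, k) for k in (2, 3, 4)])
        self.header = nn.Sequential(nn.Linear(header_dim, 64), nn.ReLU())
        self.out = nn.Linear(64 * 3 + 64, 2)   # malware vs. benign

    def forward(self, instr_ids, header_feats):
        x = self.embed(instr_ids).transpose(1, 2)       # (B, emb, seq)
        grams = [torch.relu(c(x)).max(dim=2).values for c in self.convs]
        h = self.header(header_feats)
        return self.out(torch.cat(grams + [h], dim=1))

net = MalwareNet()
logits = net(torch.randint(0, 512, (8, 200)), torch.randn(8, 40))
print(logits.shape)  # torch.Size([8, 2])
```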
As the malware threat landscape is constantly evolving and over one million new malware strains are being generated every day [1], early automatic detection of threats is a top priority of cybersecurity research, amplifying the need for more advanced detection and classification methods that are both effective and efficient. In this paper, we present the application of machine learning algorithms to predict the length of time for which malware should be executed in a sandbox to reveal its malicious intent. We also introduce a novel hybrid approach to malware classification based on static binary analysis and dynamic analysis of malware. Static analysis extracts information from a binary file without executing it, while dynamic analysis captures the behavior of malware in a sandbox environment. Our experimental results show that by casting these problems as machine learning problems, it is possible to achieve an accuracy of up to 90% in predicting the malware analysis run time and up to 92% in classifying malware families.
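An illustrative sketch of the run-time prediction task, here cast as classification into duration buckets; the features and labels below are synthetic stand-ins for static features and sandbox ground truth:

```python
# Toy run-time-bucket predictor over synthetic static features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 25))    # e.g., section entropy, import counts
y = rng.integers(0, 3, size=300)  # 0: <1 min, 1: 1-5 min, 2: >5 min

clf = RandomForestClassifier(n_estimators=100, random_state=0)
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```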
We present AVAMAT: AntiVirus and Malware Analysis Tool, a tool for analysing the malware detection capabilities of AntiVirus (AV) products running on different operating system (OS) platforms. Even though similar tools are available, such as VirusTotal and MetaDefender, they have several limitations, which motivated the creation of our own tool. With AVAMAT we are able to analyse not only whether an AV detects a malware sample, but also at what stage of inspection it detects it and on which OS. AVAMAT enables experimental campaigns to answer various research questions, ranging from the detection capabilities of AVs on different OSs to optimal ways in which AVs could be combined to improve malware detection capabilities.
Malware analysis relies heavily on the use of virtual machines (VMs) for functionality and safety. There are subtle differences in operation between virtual and physical machines. Contemporary malware checks for these differences and changes its behavior when it detects a VM presence. These anti-VM techniques hinder malware analysis. Existing research approaches to uncover differences between VMs and physical machines use randomized testing, and thus cannot guarantee completeness. In this article, we propose a detect-and-hide approach, which systematically addresses anti-VM techniques in malware. First, we propose cardinal pill testing, a modification of red pill testing that aims to enumerate the differences between a given VM and a physical machine through carefully designed tests. Cardinal pill testing finds five times more pills by running 15 times fewer tests than red pill testing. We examine the causes of pills and find that, while the majority of them stem from the failure of VMs to follow CPU specifications, a small number stem from under-specification of certain instructions by the Intel manual, which leads to divergent implementations in different CPU and VM architectures. Cardinal pill testing successfully enumerates the differences that stem from the first cause. Finally, we propose VM Cloak, a WinDbg plug-in that hides the presence of VMs from malware. VM Cloak monitors each command executed by the malware, detects potential pills, and modifies the command's outcomes at runtime to match those that a physical machine would generate. We implemented VM Cloak and verified that it successfully hides VM presence from malware.
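Conceptually, pill testing reduces to running the same test cases on a physical machine and in a VM and flagging divergent outcomes. The sketch below illustrates only this comparison step; the test names and recorded outcomes are hypothetical, and the real work lies in deriving the test cases from the CPU specification:

```python
# Toy pill-testing comparison: any test whose outcome diverges is a "pill".
physical = {"cpuid_leaf_0x1": "0x000306c3", "fpu_ftz_flag": "0x1",
            "sse4_shuffle": "0xdeadbeef"}
vm =       {"cpuid_leaf_0x1": "0x000206c1", "fpu_ftz_flag": "0x1",
            "sse4_shuffle": "0xdeadbeef"}

pills = {t for t in physical if physical[t] != vm.get(t)}
print("pills found:", sorted(pills))  # what a tool like VM Cloak would mask
```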
Malware damages computers, and the threat it poses is a serious problem. Malware can be detected by pattern matching or dynamic heuristic methods; however, it is difficult for existing methods to detect all new malware subspecies perfectly. In this paper, we propose a new method that automatically detects new malware subspecies through static analysis of executable files and machine learning. The method can distinguish malware from benignware, and it can also classify malware subspecies into malware families. We combine static analysis of executable files with a machine learning classifier and machine-learning-based natural language processing. Information on DLL imports, assembly code, and hexdumps is acquired by static analysis of the executable files of malware and benignware to create feature vectors. Paragraph vectors of this statically extracted information are created with the PV-DBOW model for natural language processing. A support vector machine and a k-nearest neighbor classifier are used in our method, and the classifiers learn the paragraph vectors of the statically extracted information. Unknown executable files are classified as malware or benignware by the pre-trained SVM; moreover, malware subspecies are classified into malware families by the pre-trained k-nearest neighbor classifier. We evaluate the accuracy of the classification by experiments. We believe that new malware subspecies can be effectively detected by our method without resorting to existing malware analysis approaches such as generic detection and dynamic heuristics.
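A minimal sketch of the pipeline, assuming opcode sequences have already been extracted by static analysis: PV-DBOW paragraph vectors (gensim Doc2Vec with dm=0) fed to an SVM. The token sequences and labels are toy data:

```python
# PV-DBOW paragraph vectors over toy opcode sequences, classified by an SVM.
from gensim.models.doc2vec import Doc2Vec, TaggedDocument
from sklearn.svm import SVC

samples = [(["push", "mov", "call", "xor", "ret"], "malware"),
           (["mov", "add", "cmp", "jne", "ret"], "benign"),
           (["push", "xor", "call", "jmp", "ret"], "malware"),
           (["mov", "sub", "cmp", "je", "ret"], "benign")]

docs = [TaggedDocument(tokens, [i]) for i, (tokens, _) in enumerate(samples)]
d2v = Doc2Vec(docs, dm=0, vector_size=16, min_count=1, epochs=50)  # PV-DBOW

X = [d2v.dv[i] for i in range(len(samples))]
y = [label for _, label in samples]
svm = SVC(kernel="linear").fit(X, y)
print(svm.predict([d2v.infer_vector(["push", "xor", "call", "ret"])]))
```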
Code reuse detection is a key technique in reverse engineering. However, existing source code similarity comparison techniques are not applicable to binary code. Moreover, compilers make this problem even more difficult, since they can generate different assembly code and control flow structures even when implementing the same functionality. To address this problem, we present a fuzzy matching approach to compare two functions. We first obtain an initial mapping between basic blocks by leveraging the concept of the longest common subsequence at both the basic block level and the execution path level. We then extend the achieved mapping using neighborhood exploration. To make our approach applicable to large data sets, we designed an effective filtering process using Minhashing. Based on the proposed approach, we implemented a tool named BinSequence and conducted extensive experiments with it. Our results show that, given a large assembly code repository with millions of functions, BinSequence is efficient and can attain high-quality similarity ranking of assembly functions with an accuracy above 90%. We also present several practical use cases including patch analysis, malware analysis and bug search.
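A sketch of the two core ideas: longest-common-subsequence matching over basic-block sequences, preceded by a small MinHash filter to prune clearly dissimilar functions. The block "fingerprints" below are stand-ins for the real per-block features:

```python
# LCS over basic-block fingerprints, gated by a tiny MinHash similarity filter.
import hashlib

def lcs_len(a, b):
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else \
                max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1]

def minhash(items, n_hashes=32):
    # seed each hash function with its index; take the minimum digest
    return tuple(min(hashlib.sha1(f"{k}:{it}".encode()).digest()
                     for it in items) for k in range(n_hashes))

f1 = ["blk_push_mov", "blk_cmp_jne", "blk_call", "blk_ret"]
f2 = ["blk_push_mov", "blk_call", "blk_ret"]

s1, s2 = minhash(f1), minhash(f2)
est_jaccard = sum(a == b for a, b in zip(s1, s2)) / len(s1)
if est_jaccard > 0.3:                      # cheap filter first
    print("LCS length:", lcs_len(f1, f2))  # then the fuzzy matching
```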
The increasing growth of cybercrimes targeting mobile devices calls for an efficient malware analysis platform. With the emergence of evasive malware, which is capable of detecting that it is being analyzed in virtualized environments, bare-metal analysis has become the definitive resort. Existing works mainly focus on extracting the malicious behaviors exposed during bare-metal analysis. However, after malware analysis, it is equally important to quickly restore the system to a clean state to examine the next sample. Unfortunately, state-of-the-art solutions on mobile platforms can only restore the disk, and require a time-consuming system reboot. In addition, all of the existing works require some in-guest components to assist the restoration; therefore, kernel-level malware is still able to detect the presence of the in-guest components. We propose Bolt, a transparent restoration mechanism for bare-metal analysis on mobile platforms that requires no reboot. Bolt achieves a reboot-less restoration by simultaneously making a snapshot of both the physical memory and the disk. The memory snapshot is enabled by an isolated operating system (BoltOS) in the ARM TrustZone secure world, and the disk snapshot is accomplished by a piece of customized firmware (BoltFTL) for flash-based block devices. Because both BoltOS and BoltFTL are isolated from the guest system, even kernel-level malware cannot interfere with the restoration. More importantly, Bolt does not require any modifications to the guest system. As such, Bolt is the first system that simultaneously achieves efficiency, isolation, and stealthiness in recovering from infection due to malware execution. We have implemented a Bolt prototype working with the Android OS. Experimental results show that Bolt can restore the guest system to a clean state in only 2.80 seconds.
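A conceptual copy-on-write sketch of the snapshot-and-rollback idea (Bolt itself implements this in TrustZone and flash firmware, not in Python): writes after the snapshot land in a delta map, so restoration amounts to discarding that map:

```python
# Toy copy-on-write block device: restore() discards post-snapshot writes.
class SnapshotBlockDevice:
    def __init__(self, n_blocks):
        self.base = {i: b"\x00" for i in range(n_blocks)}  # pristine image
        self.delta = {}                 # blocks dirtied after the snapshot

    def write(self, block, data):
        self.delta[block] = data        # never touch the pristine base image

    def read(self, block):
        return self.delta.get(block, self.base[block])

    def restore(self):                  # "reboot-less" rollback
        self.delta.clear()

dev = SnapshotBlockDevice(4)
dev.write(2, b"malware artifacts")
print(dev.read(2))
dev.restore()
print(dev.read(2))                      # clean state again
```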
The blockchain emerges as an innovative tool with the potential to positively impact the way we design a number of online applications today. In many ways, however, blockchain technology is still not mature enough to cater to industrial standards. Namely, existing Byzantine-tolerant, permission-based blockchain deployments can only scale to a limited number of nodes. These systems typically require that all transactions (and their order of execution) be publicly available to all nodes in the system, which is at odds with common data sharing practices in the industry and prevents a centralized regulator from overseeing the full blockchain system. In this paper, we propose a novel blockchain architecture devised specifically to meet industrial standards. Our proposal leverages the notion of satellite chains that can privately run different consensus protocols in parallel, thereby considerably boosting the scalability premises of the system. Our solution also accounts for a hands-off regulator that oversees the entire network and enforces specific policies by means of smart contracts, among other capabilities. We implemented our solution and integrated it with Hyperledger Fabric v0.6.
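A toy sketch of the satellite-chain idea: chains commit transactions under their own (stubbed) consensus while a regulator hook observes every transaction and can enforce policy. All names and the example policy are invented:

```python
# Toy satellite chain with a pluggable consensus stub and a regulator hook.
class SatelliteChain:
    def __init__(self, name, consensus, regulator):
        self.name, self.consensus, self.regulator = name, consensus, regulator
        self.ledger = []

    def submit(self, tx):
        if self.consensus(tx) and self.regulator(self.name, tx):
            self.ledger.append(tx)

def consensus_stub(tx):            # stands in for a real consensus protocol
    return True

def regulator(chain, tx):          # example policy: cap transfer amounts
    return tx.get("amount", 0) <= 1000

chain_a = SatelliteChain("banks-eu", consensus_stub, regulator)
chain_a.submit({"from": "A", "to": "B", "amount": 500})
chain_a.submit({"from": "A", "to": "C", "amount": 5000})  # blocked by policy
print(len(chain_a.ledger))  # 1
```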
This paper considers the problem of running a long-term, on-demand service for executing actively-secure computations. We examined state-of-the-art tools and implementations for actively-secure computation and identified a set of key features indispensable for offering a meaningful service of this kind. Since no satisfactory tools exist for the purpose, we developed Pool, a new tool for building and executing actively-secure computation protocols at extreme scales with nearly zero offline delay. With Pool, we are able to obliviously execute, for the first time, reactive computations like ORAM in the malicious threat model. Many of Pool's technical benefits can be attributed to the concept of pool-based cut-and-choose. We show with experiments that this idea significantly improves the scalability and usability of JIMU, a state-of-the-art LEGO protocol.
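A toy illustration of pool-based cut-and-choose, greatly simplified from the actual protocol: precomputed components sit in a pool, a random subset is opened and checked per execution, and unopened ones are consumed, amortizing the checking cost across computations. The parameters and "component" contents are invented:

```python
# Toy pool-based cut-and-choose: check a random subset, consume the rest.
import random

pool = [{"id": i, "ok": True} for i in range(100)]  # pre-generated gadgets
random.seed(0)

def run_computation(n_eval=5, n_check=3):
    random.shuffle(pool)
    checked = [pool.pop() for _ in range(n_check)]   # open and verify these
    assert all(c["ok"] for c in checked), "cheating detected"
    return [pool.pop() for _ in range(n_eval)]       # consume these unopened

gadgets = run_computation()
print(f"consumed {len(gadgets)} gadgets, {len(pool)} left in pool")
```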