Biblio

Found 221 results

Filters: Keyword is visualization

2021-06-01

Chen, Zhenfang, Wang, Peng, Ma, Lin, Wong, Kwan-Yee K., Wu, Qi. 2020. Cops-Ref: A New Dataset and Task on Compositional Referring Expression Comprehension. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). :10083–10092.

Referring expression comprehension (REF) aims at identifying a particular object in a scene by a natural language expression. It requires joint reasoning over the textual and visual domains to solve the problem. Some popular referring expression datasets, however, fail to provide an ideal test bed for evaluating the reasoning ability of the models, mainly because 1) their expressions typically describe only some simple distinctive properties of the object and 2) their images contain limited distracting information. To bridge the gap, we propose a new dataset for visual reasoning in context of referring expression comprehension with two main features. First, we design a novel expression engine rendering various reasoning logics that can be flexibly combined with rich visual properties to generate expressions with varying compositionality. Second, to better exploit the full reasoning chain embodied in an expression, we propose a new test setting by adding additional distracting images containing objects sharing similar properties with the referent, thus minimising the success rate of reasoning-free cross-domain alignment. We evaluate several state-of-the-art REF models, but find none of them can achieve promising performance. A proposed modular hard mining strategy performs the best but still leaves substantial room for improvement.

2021-05-18

Sinhabahu, Nadun, Wimalaratne, Prasad, Wijesiriwardana, Chaman. 2020. Secure Codecity with Evolution: Visualizing Security Vulnerability Evolution of Software Systems. 2020 20th International Conference on Advances in ICT for Emerging Regions (ICTer). :1–2.

Analyzing large-scale software and finding security vulnerabilities while it is evolving is difficult without supplementary tools, because of the size and complexity of today's systems. However, a report alone does not convey the overall picture of the system in terms of security vulnerabilities and their evolution throughout the project lifecycle. Software visualization is a program comprehension technique used in the present context that explores large amounts of information precisely. Secure Codecity with Evolution is an interactive 3D visualization tool that can be utilized for the analysis of security vulnerabilities of complex software systems. It studies techniques and methods for graphically illustrating security aspects and the evolution of software. The main goal of the proposed framework is to uplift, simplify, and clarify the mental representation that a software engineer has of a software system and its evolution in terms of its security. Static code was visualised based on a city metaphor, which represents classes as buildings and packages as districts of a city. Identified vulnerabilities were represented in different colors according to their severity. A large variety of options were given to visualize a number of different aspects. Users can evaluate the evolution of the security vulnerabilities of a system across several versions using the matrices provided, which help users get an overall understanding of how security vulnerabilities vary across different versions of the software. This framework was implemented using SonarQube for software vulnerability detection and ThreeJs for implementing the city metaphor. The evaluation results evidently show that our framework surpasses the existing tools in terms of accuracy, efficiency and usability.

2021-04-08

Rhee, K. H.. 2020. Composition of Visual Feature Vector Pattern for Deep Learning in Image Forensics. IEEE Access. 8:188970–188980.

In image forensics, to determine whether an image has been impurely transformed, the features included in the suspicious image are extracted and examined. In general, the features extracted for the detection of forged images are based on numerical values, so it is somewhat unreasonable to use them in a CNN structure for image classification. In this paper, the feature vector is extracted using a least-squares solution: a suspicious image is treated as a matrix, and the solution coefficients are taken as the feature vector. Two solutions are obtained from two images, the original and its median filter residual (MFR). Subsequently, the two features are formed into a visualized pattern and then fed into a CNN to classify the various transformed images. A new CNN layer structure was also designed, as a hybrid of the inception module and the residual block, to classify visualized feature vector patterns. The performance of the proposed image forensics detection (IFD) scheme was measured with the seven transformed types of image: average filtered (window size: 3 × 3), gaussian filtered (window size: 3 × 3), JPEG compressed (quality factor: 90, 70), median filtered (window size: 3 × 3, 5 × 5), and unaltered. The visualized patterns are fed into the image input layer of the designed CNN hybrid model. Throughout the experiment, the accuracy of median filtering detection was over 98%. Also, the area under the curve (AUC) by sensitivity (TP: true positive rate) and 1-specificity (FP: false positive rate) of the proposed IFD scheme approached 1 on the designed CNN hybrid model. Experimental results show high efficiency and performance in classifying the various transformed images. Therefore, the grade evaluation of the proposed scheme is “Excellent (A)”.

2021-03-29

Fajri, M., Hariyanto, N., Gemsjaeger, B.. 2020. Automatic Protection Implementation Considering Protection Assessment Method of DER Penetration for Smart Distribution Network. 2020 International Conference on Technology and Policy in Energy and Electric Power (ICT-PEP). :323–328.

Due to the geographical location of Indonesia, some technologies such as hydro and solar photovoltaics (PV) are very attractive to use and develop. Distribution Energy Resources (DER) are an appropriate scheme for achieving optimal operation with respect to the location and capacity of the plant. The Gorontalo sub-system network was chosen as a case study, with both micro-hydro and PV contributing to supply the grid. A smart electrical system is needed to improve reliability, power quality, and adaptation to any circumstances during DER application. While the topology changes over time, the intermittency of DER output and bidirectional power flow can be overcome with smart grid systems. In this study, an automation algorithm has been developed to aid engineers in solving the protection problems caused by DER implementation. The Protection Security Assessment (PSA) method is used to evaluate the state of the protection system. Relay settings are determined using an adaptive rule-based method on expert systems. An application with a Graphical User Interface (GUI) has been developed to make it easier for users to get the specific relay settings and locations, which are sensitive, fast, reliable, and selective.

Grundy, J.. 2020. Human-centric Software Engineering for Next Generation Cloud- and Edge-based Smart Living Applications. 2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID). :1–10.

Humans are a key part of software development, including customers, designers, coders, testers and end users. In this keynote talk I explain why incorporating human-centric issues into software engineering for next-generation applications is critical. I use several examples from our recent and current work on handling human-centric issues when engineering various 'smart living' cloud- and edge-based software systems. This includes using human-centric, domain-specific visual models for non-technical experts to specify and generate data analysis applications; personality impact on aspects of software activities; incorporating end user emotions into software requirements engineering for smart homes; incorporating human usage patterns into emerging edge computing applications; visualising smart city-related data; reporting diverse software usability defects; and human-centric security and privacy requirements for smart living systems. I assess the usefulness of these approaches, highlight some outstanding research challenges, and briefly discuss our current work on new human-centric approaches to software engineering for smart living applications.

Li, J., Wang, X., Liu, S.. 2020. Hash Retrieval Method for Recaptured Images Based on Convolutional Neural Network. 2020 2nd World Symposium on Artificial Intelligence (WSAI). :79–83.

For the purpose of outdoor advertising market research, advertising images are recaptured and uploaded every day for statistics. But the quality of the recaptured advertising images is often affected by conditions such as angle, distance, and light during the shooting process, which consequently reduces either the speed or the accuracy of the retrieval algorithm. In this paper, we propose a hash retrieval method based on convolutional neural networks for recaptured images. The basic idea is to add a hash layer to the convolutional neural network and then extract the binary hash code output by the hash layer to perform image retrieval in low-dimensional Hamming space. Experimental results show that the retrieval performance is improved compared with the current commonly used hash retrieval methods.
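
As an illustration of retrieval in Hamming space, the following minimal sketch binarizes hash-layer activations and ranks database images by Hamming distance to a query. It is not the authors' implementation: the threshold of 0.5, the four-bit codes, the sample activations, and the function names are all assumptions made for the example.

```python
def binarize(activations, threshold=0.5):
    """Turn hash-layer activations (assumed to lie in [0, 1]) into a binary code."""
    return [1 if a >= threshold else 0 for a in activations]


def hamming_distance(code_a, code_b):
    """Count the positions at which two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(code_a, code_b))


def hamming_rank(query_code, database_codes):
    """Return database indices ordered by distance to the query, plus the distances."""
    distances = [hamming_distance(query_code, code) for code in database_codes]
    # sorted() is stable, so ties keep their original database order
    order = sorted(range(len(database_codes)), key=lambda i: distances[i])
    return order, distances


# Hypothetical hash-layer outputs for four database images and one query image
db_activations = [
    [0.9, 0.1, 0.8, 0.2],
    [0.2, 0.7, 0.1, 0.9],
    [0.8, 0.2, 0.9, 0.6],
    [0.1, 0.9, 0.2, 0.8],
]
query_activations = [0.95, 0.05, 0.7, 0.1]

db_codes = [binarize(a) for a in db_activations]
query_code = binarize(query_activations)
order, distances = hamming_rank(query_code, db_codes)
print(order, distances)
```

Comparing short binary codes only needs bit comparisons, which is why hashing approaches of this kind are attractive for large image collections.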

Al-Janabi, S. I. Ali, Al-Janabi, S. T. Faraj, Al-Khateeb, B.. 2020. Image Classification using Convolution Neural Network Based Hash Encoding and Particle Swarm Optimization. 2020 International Conference on Data Analytics for Business and Industry: Way Towards a Sustainable Economy (ICDABI). :1–5.

Image Retrieval (IR) has become one of the main problems facing computer society recently. To increase computing similarities between images, hashing approaches have become the focus of many programmers. Indeed, in the past few years, Deep Learning (DL) has been considered as a backbone for image analysis using Convolutional Neural Networks (CNNs). This paper aims to design and implement a high-performance image classifier that can be used in several applications such as intelligent vehicles, face recognition, marketing, and many others. This work considers experimentation to find the sequential model's best configuration for classifying images. The best performance has been obtained from two layers' architecture; the first layer consists of 128 nodes, and the second layer is composed of 32 nodes, where the accuracy reached up to 0.9012. The proposed classifier has been achieved using CNN and the data extracted from the CIFAR-10 dataset by the inception model, which are called the Transfer Values (TRVs). Indeed, the Particle Swarm Optimization (PSO) algorithm is used to reduce the TRVs. In this respect, the work focus is to reduce the TRVs to obtain high-performance image classifier models. Indeed, the PSO algorithm has been enhanced by using the crossover technique from genetic algorithms. This led to a reduction of the complexity of models in terms of the number of parameters used and the execution time.

2021-03-18

Khan, A., Chefranov, A. G.. 2020. A Captcha-Based Graphical Password With Strong Password Space and Usability Study. 2020 International Conference on Electrical, Communication, and Computer Engineering (ICECCE). :1–6.

Secure authentication is required to protect users' personal information. This paper presents a graphical password scheme for user authentication that balances security and ease of use. We integrate the concept of recognition with recall and cued-recall based schemes to offer superior security compared to existing schemes. Click Symbols (CS) Alphabet, which combines Alphanumeric (A) and Visual (V) symbols into one entity (CS-AV), is a Captcha-based password scheme; we integrate it with recall-based n × n grid points, where a user can draw a shape or pattern through the intersections of the grid points as a way to enter a graphical password. The next scheme, the combination of CS-AV with grid cells, allows a very large password space (2.4 × 10^4 bits of entropy) and provides reasonable usability results in an empirical study of memorable password space. The proposed schemes support most applicable platforms for input devices and promise strong resistance to shoulder-surfing attacks on a mobile device, which can occur while unlocking the smartphone with a pattern.

2021-03-01

Tan, R., Khan, N., Guan, L.. 2020. Locality Guided Neural Networks for Explainable Artificial Intelligence. 2020 International Joint Conference on Neural Networks (IJCNN). :1–8.

In current deep network architectures, deeper layers in networks tend to contain hundreds of independent neurons which makes it hard for humans to understand how they interact with each other. By organizing the neurons by correlation, humans can observe how clusters of neighbouring neurons interact with each other. In this paper, we propose a novel algorithm for back propagation, called Locality Guided Neural Network (LGNN) for training networks that preserves locality between neighbouring neurons within each layer of a deep network. Heavily motivated by Self-Organizing Map (SOM), the goal is to enforce a local topology on each layer of a deep network such that neighbouring neurons are highly correlated with each other. This method contributes to the domain of Explainable Artificial Intelligence (XAI), which aims to alleviate the black-box nature of current AI methods and make them understandable by humans. Our method aims to achieve XAI in deep learning without changing the structure of current models nor requiring any post processing. This paper focuses on Convolutional Neural Networks (CNNs), but can theoretically be applied to any type of deep learning architecture. In our experiments, we train various VGG and Wide ResNet (WRN) networks for image classification on CIFAR100. In depth analyses presenting both qualitative and quantitative results demonstrate that our method is capable of enforcing a topology on each layer while achieving a small increase in classification accuracy.

Taylor, E., Shekhar, S., Taylor, G. W.. 2020. Response Time Analysis for Explainability of Visual Processing in CNNs. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). :1555–1558.

Explainable artificial intelligence (XAI) methods rely on access to model architecture and parameters that is not always feasible for most users, practitioners, and regulators. Inspired by cognitive psychology, we present a case for response times (RTs) as a technique for XAI. RTs are observable without access to the model. Moreover, dynamic inference models performing conditional computation generate variable RTs for visual learning tasks depending on hierarchical representations. We show that MSDNet, a conditional computation model with early-exit architecture, exhibits slower RT for images with more complex features in the ObjectNet test set, as well as the human phenomenon of scene grammar, where object recognition depends on intrascene object-object relationships. These results cast light on MSDNet's feature space without opening the black box and illustrate the promise of RT methods for XAI.

Tao, J., Xiong, Y., Zhao, S., Xu, Y., Lin, J., Wu, R., Fan, C.. 2020. XAI-Driven Explainable Multi-view Game Cheating Detection. 2020 IEEE Conference on Games (CoG). :144–151.

Online gaming is one of the most successful applications, with a large number of players interacting in an online persistent virtual world through the Internet. However, some cheating players gain improper advantages over normal players by using illegal automated plugins, which has brought huge harm to game health and player enjoyment. Game industries have been devoting much effort to cheating detection with multi-view data sources and have achieved great accuracy improvements by applying artificial intelligence (AI) techniques. However, generating explanations for cheating detection from multiple views still remains a challenging task. To respond to the different purposes of explainability in AI models from different audience profiles, we propose EMGCD, the first explainable multi-view game cheating detection framework driven by explainable AI (XAI). It combines cheating explainers with cheating classifiers from different views to generate individual, local and global explanations, which contribute to evidence generation, reason generation, model debugging and model compression. EMGCD has been implemented and deployed in multiple game productions in NetEase Games, achieving remarkable and trustworthy performance. Our framework can also easily generalize to other types of related tasks in online games, such as explainable recommender systems, explainable churn prediction, etc.

2021-02-08

Moussa, Y., Alexan, W.. 2020. Message Security Through AES and LSB Embedding in Edge Detected Pixels of 3D Images. 2020 2nd Novel Intelligent and Leading Emerging Sciences Conference (NILES). :224–229.

This paper proposes an advanced scheme of message security in 3D cover images using multiple layers of security. Cryptography using AES-256 is implemented in the first layer. In the second layer, edge detection is applied. Finally, LSB steganography is executed in the third layer. The efficiency of the proposed scheme is measured using a number of performance metrics, for instance mean square error (MSE), peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), mean absolute error (MAE) and entropy.
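
To make the third layer and two of the listed metrics concrete, here is a minimal sketch of least-significant-bit (LSB) embedding together with MSE and PSNR. It is an assumption-laden illustration, not the paper's scheme: it omits the AES-256 and edge-detection layers, treats the cover as a flat list of 8-bit values, and uses hypothetical function names and sample data.

```python
import math


def embed_lsb(pixels, bits):
    """Write each message bit into the least significant bit of successive pixels."""
    if len(bits) > len(pixels):
        raise ValueError("message is longer than the cover")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # clear the LSB, then set it to the bit
    return stego


def extract_lsb(pixels, n_bits):
    """Read back the first n_bits least significant bits."""
    return [p & 1 for p in pixels[:n_bits]]


def mse(original, modified):
    """Mean square error between two equal-length pixel sequences."""
    return sum((x - y) ** 2 for x, y in zip(original, modified)) / len(original)


def psnr(original, modified, peak=255):
    """Peak signal-to-noise ratio in dB; infinite when the inputs are identical."""
    err = mse(original, modified)
    return math.inf if err == 0 else 10 * math.log10(peak ** 2 / err)


# Hypothetical 8-bit cover values and a 4-bit message
cover = [120, 37, 200, 255, 64, 13, 90, 181]
message_bits = [1, 0, 1, 1]

stego = embed_lsb(cover, message_bits)
print(stego, extract_lsb(stego, len(message_bits)))
print(mse(cover, stego), psnr(cover, stego))
```

Because each pixel changes by at most 1, the MSE stays small and the PSNR stays high, which is the usual argument for LSB embedding being hard to notice visually.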

2021-02-03

Velaora, M., Roy, R. van, Guéna, F.. 2020. ARtect, an augmented reality educational prototype for architectural design. 2020 Fourth World Conference on Smart Trends in Systems, Security and Sustainability (WorldS4). :110–115.

ARtect is an Augmented Reality application developed with Unity 3D, which envisions an educational interactive and immersive tool for architects, designers, researchers, and artists. This digital instrument renders the competency to visualize custom-made 3D models and 2D graphics in interior and exterior environments. The user-friendly interface offers an accurate insight before the materialization of any architectural project, enabling evaluation of the design proposal. This practice could be integrated into learning architectural design process, saving resources of printed drawings, and 3D carton models during several stages of spatial conception.

Bahaei, S. Sheikh. 2020. A Framework for Risk Assessment in Augmented Reality-Equipped Socio-Technical Systems. 2020 50th Annual IEEE-IFIP International Conference on Dependable Systems and Networks-Supplemental Volume (DSN-S). :77–78.

New technologies, such as augmented reality (AR) are used to enhance human capabilities and extend human functioning; nevertheless they may cause distraction and incorrect human functioning. Systems including socio entities (such as human) and technical entities (such as augmented reality) are called socio-technical systems. In order to do risk assessment in such systems, considering new dependability threats caused by augmented reality is essential, for example failure of an extended human function is a new type of dependability threat introduced to the system because of new technologies. In particular, it is required to identify these new dependability threats and extend modeling and analyzing techniques to be able to uncover their potential impacts. This research aims at providing a framework for risk assessment in AR-equipped socio-technical systems by identifying AR-extended human failures and AR-caused faults leading to human failures. Our work also extends modeling elements in an existing metamodel for modeling socio-technical systems, to enable AR-relevant dependability threats modeling. This extended metamodel is expected to be used for extending analysis techniques to analyze AR-equipped socio-technical systems.

Sabu, R., Yasuda, K., Kato, R., Kawaguchi, S., Iwata, H.. 2020. Does visual search by neck motion improve hemispatial neglect?: An experimental study using an immersive virtual reality system. 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC). :262–267.

Unilateral spatial neglect (USN) is a higher cognitive dysfunction that can occur after a stroke. It is defined as an impairment in finding, reporting, reacting to, and directing stimuli opposite the damaged side of the brain. We have proposed a system to identify neglected regions in USN patients in three dimensions using three-dimensional virtual reality. The objectives of this study are twofold: first, to propose a system for numerically identifying the neglected regions using an object detection task in a virtual space, and second, to compare the neglected regions during object detection when the patient's neck is immobilized (‘fixed-neck’ condition) versus when the neck can be freely moved to search (‘free-neck’ condition). We performed the test using an immersive virtual reality system, once with the patient's neck fixed and once with the patient's neck free to move. Comparing the results of the study in two patients, we found that the neglected areas were similar in the fixed-neck condition. However, in the free-neck condition, one patient's neglect improved while the other patient’s neglect worsened. These results suggest that exploratory ability affects the symptoms of USN and is crucial for clinical evaluation of USN patients.

Kennard, M., Zhang, H., Akimoto, Y., Hirokawa, M., Suzuki, K.. 2020. Effects of Visual Biofeedback on Competition Performance Using an Immersive Mixed Reality System. 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC). :3793–3798.

This paper investigates the effects of real-time visual biofeedback for improving sports performance using a large-scale immersive mixed reality system in which users are able to play a simulated game of curling. The users slide custom curling stones across the floor onto a projected target whose size is dictated by the user’s stress-related physiological measure: heart rate (HR). The higher the player’s HR, the smaller the target will be, and vice versa. In the experiment, participants were asked to compete in three different conditions: baseline, with and without the proposed biofeedback. The results show that providing a visual representation of the player’s HR or "choking" in competition helped the player understand their condition and improve competition performance (P-value of 0.0391).

2021-02-01

Han, W., Schulz, H.-J.. 2020. Beyond Trust Building — Calibrating Trust in Visual Analytics. 2020 IEEE Workshop on TRust and EXpertise in Visual Analytics (TREX). :9–15.

Trust is a fundamental factor in how users engage in interactions with Visual Analytics (VA) systems. While the importance of building trust to this end has been pointed out in research, the aspect that trust can also be misplaced is largely ignored in VA so far. This position paper addresses this aspect by putting trust calibration in focus – i.e., the process of aligning the user’s trust with the actual trustworthiness of the VA system. To this end, we present the trust continuum in the context of VA, dissect important trust issues in both VA systems and users, as well as discuss possible approaches that can build and calibrate trust.

2021-01-25

Ghazo, A. T. Al, Ibrahim, M., Ren, H., Kumar, R.. 2020. A2G2V: Automatic Attack Graph Generation and Visualization and Its Applications to Computer and SCADA Networks. IEEE Transactions on Systems, Man, and Cybernetics: Systems. 50:3488–3498.

Securing cyber-physical systems (CPS) and Internet of Things (IoT) systems requires the identification of how interdependence among existing atomic vulnerabilities may be exploited by an adversary to stitch together an attack that can compromise the system. Therefore, accurate attack graphs play a significant role in systems security. Since manual construction of attack graphs is tedious and error-prone, this paper proposes a model-checking-based automated attack graph generator and visualizer (A2G2V). The proposed A2G2V algorithm uses existing model-checking tools, an architecture description tool, and our own code to generate an attack graph that enumerates the set of all possible sequences in which atomic-level vulnerabilities can be exploited to compromise system security. The architecture description tool captures a formal representation of the networked system, its atomic vulnerabilities, their pre- and post-conditions, and the security property of interest. A model-checker is employed to automatically identify an attack sequence in the form of a counterexample. Our own code integrated with the model-checker parses the counterexamples, encodes those for specification relaxation, and iterates until all attack sequences are revealed. Finally, a visualization tool has also been incorporated with A2G2V to generate a graphical representation of the generated attack graph. The results are illustrated through application to computer as well as control (SCADA) networks.

More, S., Jamadar, I., Kazi, F.. 2020. Security Visualization and Active Querying for OT Network. 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT). :1–6.

Traditionally, Industrial Control Systems (ICS) used an air-gap mechanism to protect Operational Technology (OT) networks from cyber-attacks. As the internet evolves, so do business models, and customer-supplier relationships and their needs are changing. Hence, many ICS are now connected to the internet, with levels of defense strategies between the OT network and the business network replacing the traditional air-gap mechanism. This upgrade made OT networks available and accessible through the internet. OT networks involve a number of physical objects and computer networks. Physical damage to systems has become rare, but the number of cyber-attacks is evidently increasing. To tackle cyber-attacks, we have a number of measures in place, like firewalls, Intrusion Detection Systems (IDS) and Intrusion Prevention Systems (IPS). To ensure that no attack on or suspicious behavior within the network takes place, we can use visual aids such as dashboards, which are able to flag any such activity and create a visual alert about it. This paper describes the creation of a parser object to convert Common Event Format (CEF) to Comma Separated Values (CSV) format and of a dashboard to extract the maximum amount of data and analyze network behavior. It also describes the working of active querying, which leverages packet-level data from the network to analyze network inclusion in real time. The methodology is verified on data collected from a waste water treatment plant, and results are presented.
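
The CEF-to-CSV conversion mentioned in this abstract can be sketched roughly as follows. This is an illustrative parser only, not the authors' parser object: it assumes the usual CEF layout of seven pipe-separated header fields followed by space-separated key=value extensions, it does not unescape backslash sequences, and the column names and sample event are hypothetical.

```python
import csv
import io
import re

# Column names chosen for this sketch; CEF itself only defines the field order
HEADER_FIELDS = ["cef_version", "device_vendor", "device_product",
                 "device_version", "signature_id", "name", "severity"]

# A value runs until the next " key=" or the end of the line, so it may contain spaces
EXTENSION_RE = re.compile(r"(\w+)=(.*?)(?=\s+\w+=|$)")


def parse_cef(line):
    """Parse one CEF event into a flat dict of header and extension fields."""
    body = line[line.index("CEF:") + len("CEF:"):]
    # Split on pipes that are not escaped with a backslash
    parts = re.split(r"(?<!\\)\|", body, maxsplit=7)
    if len(parts) < 7:
        raise ValueError("not a complete CEF header")
    record = dict(zip(HEADER_FIELDS, parts[:7]))
    extension = parts[7] if len(parts) > 7 else ""
    for key, value in EXTENSION_RE.findall(extension):
        record[key] = value
    return record


def cef_to_csv(lines):
    """Convert CEF events to CSV text; columns are the union of all keys seen."""
    records = [parse_cef(line) for line in lines]
    columns = list(HEADER_FIELDS)
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical event, made up for this example
sample = ("CEF:0|ExampleVendor|ExampleIDS|1.0|100|Port scan detected|5|"
          "src=10.0.0.1 dst=10.0.0.2 msg=Valve opened spt=502")
print(cef_to_csv([sample]))
```

Flattening every event to the same set of columns is what makes the output easy to load into a dashboard or spreadsheet tool.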

2021-01-15

Park, W.. 2020. A Study on Analytical Visualization of Deep Web. 2020 22nd International Conference on Advanced Communication Technology (ICACT). :81–83.

Nowadays, there is a flood of data such as naked body photos and child pornography, which is making people bloodless. In addition, people also distribute drugs through unknown dark channels. In particular, most transactions are being made through the Deep Web, the dark path. “Deep Web refers to an encrypted network that is not detected on search engine like Google etc. Users must use Tor to visit sites on the dark web” [4]. In other words, the Dark Web uses Tor's encryption client. Therefore, users can visit multiple sites on the Dark Web, but cannot know the initiator of a site. In this paper, we propose a key idea based on the current status of such crimes, and a crime information visual system for the Deep Web has been developed. The status of the Deep Web is analyzed and the data is visualized using Java. It is expected that the program will help make the management and monitoring of crime on unknown webs such as the Deep Web, torrents, etc. more efficient.

Matern, F., Riess, C., Stamminger, M.. 2019. Exploiting Visual Artifacts to Expose Deepfakes and Face Manipulations. 2019 IEEE Winter Applications of Computer Vision Workshops (WACVW). :83–92.

High quality face editing in videos is a growing concern and spreads distrust in video content. However, upon closer examination, many face editing algorithms exhibit artifacts that resemble classical computer vision issues that stem from face tracking and editing. As a consequence, we wonder how difficult it is to expose artificial faces from current generators? To this end, we review current facial editing methods and several characteristic artifacts from their processing pipelines. We also show that relatively simple visual artifacts can be already quite effective in exposing such manipulations, including Deepfakes and Face2Face. Since the methods are based on visual features, they are easily explicable also to non-technical experts. The methods are easy to implement and offer capabilities for rapid adjustment to new manipulation types with little data available. Despite their simplicity, the methods are able to achieve AUC values of up to 0.866.

Li, Y., Yang, X., Sun, P., Qi, H., Lyu, S.. 2020. Celeb-DF: A Large-Scale Challenging Dataset for DeepFake Forensics. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). :3204–3213.

AI-synthesized face-swapping videos, commonly known as DeepFakes, are an emerging problem threatening the trustworthiness of online information. The need to develop and evaluate DeepFake detection algorithms calls for datasets of DeepFake videos. However, current DeepFake datasets suffer from low visual quality and do not resemble DeepFake videos circulated on the Internet. We present a new large-scale challenging DeepFake video dataset, Celeb-DF, which contains 5,639 high-quality DeepFake videos of celebrities generated using an improved synthesis process. We conduct a comprehensive evaluation of DeepFake detection methods and datasets to demonstrate the escalated level of challenges posed by Celeb-DF.

Nguyen, H. M., Derakhshani, R.. 2020. Eyebrow Recognition for Identifying Deepfake Videos. 2020 International Conference of the Biometrics Special Interest Group (BIOSIG). :1–5.

Deepfake imagery that contains altered faces has become a threat to online content. Current anti-deepfake approaches usually work by detecting image anomalies, such as visible artifacts or inconsistencies. However, with deepfake advances, these visual artifacts are becoming harder to detect. In this paper, we show that one can use biometric eyebrow matching as a tool to detect manipulated faces. Our method could provide a 0.88 AUC and 20.7% EER for deepfake detection when applied to the highest quality deepfake dataset, Celeb-DF.

2021-01-11

YE, X., JI, B., Chen, X., QIAN, D., Zhao, Z.. 2020. Probability Boltzmann Machine Network for Face Detection on Video. 2020 13th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI). :138–147.

By the multi-layer nonlinear mapping and the semantic feature extraction of the deep learning, a deep learning network is proposed for video face detection to overcome the challenge of detecting faces rapidly and accurately in video with changeable background. Particularly, a pre-training procedure is used to initialize the network parameters to avoid falling into the local optimum, and the greedy layer-wise learning is introduced in the pre-training to avoid the training error transfer in layers. Key to the network is that the probability of neurons models the status of human brain neurons which is a continuous distribution from the most active to the least active and the hidden layer’s neuron number decreases layer-by-layer to reduce the redundant information of the input data. Moreover, the skin color detection is used to accelerate the detection speed by generating candidate regions. Experimental results show that, besides the faster detection speed and robustness against face rotation, the proposed method possesses lower false detection rate and lower missing detection rate than traditional algorithms.

2020-12-15

Reardon, C., Lee, K., Fink, J.. 2018. Come See This! Augmented Reality to Enable Human-Robot Cooperative Search. 2018 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR). :1–7.

Robots operating alongside humans in field environments have the potential to greatly increase the situational awareness of their human teammates. A significant challenge, however, is the efficient conveyance of what the robot perceives to the human in order to achieve improved situational awareness. We believe augmented reality (AR), which allows a human to simultaneously perceive the real world and digital information situated virtually in the real world, has the potential to address this issue. Motivated by the emerging prevalence of practical human-wearable AR devices, we present a system that enables a robot to perform cooperative search with a human teammate, where the robot can both share search results and assist the human teammate in navigation to the search target. We demonstrate this ability in a search task in an uninstrumented environment where the robot identifies and localizes targets and provides navigation direction via AR to bring the human to the correct target.