Biblio
This paper proposes a method for detecting anomalies in video data. A Variational Autoencoder (VAE) is used to reduce the dimensionality of video frames, generating latent-space information that is comparable to low-dimensional sensory data (e.g., positioning, steering angle) and making it feasible to develop a consistent multi-modal architecture for autonomous vehicles. An Adapted Markov Jump Particle Filter, defined by discrete and continuous inference levels, is employed to predict the following frames and detect anomalies in new video sequences. Our method is evaluated on different video scenarios where a semi-autonomous vehicle performs a set of tasks in a closed environment.
By analogy with nature, vision is a core component of robotic systems, including unmanned vehicles. Consequently, one of the urgent tasks in the development of unmanned vehicles is securing the new systems, algorithms, methods, and principles used for robot spatial navigation. In this paper, we present an approach to protecting machine vision systems based on deep learning. At the heart of the approach lies the “Feature Squeezing” method, which operates during the model's operation (inference) phase and allows us to detect “adversarial” examples. Given the urgency and importance of the target process, the characteristics of unmanned vehicle hardware platforms, and the need to detect objects in real time, we propose carrying out an additional, simple computational procedure for localizing and classifying the required objects whenever a predefined threshold of the “adversarial” test is crossed.
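The detection step described above can be illustrated with a minimal sketch of feature squeezing by bit-depth reduction: the model's prediction on the original input is compared with its prediction on a coarsely quantized copy, and a large L1 gap flags a possible adversarial example. The function names, the 4-bit setting, the threshold, and `toy_predict` (a deliberately brittle stand-in classifier) are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def reduce_bit_depth(image, bits):
    """Quantize an image with values in [0, 1] to 2**bits levels."""
    levels = 2 ** bits - 1
    return np.round(image * levels) / levels

def squeezing_score(predict, image, bits=4):
    """L1 distance between predictions on the original and squeezed input."""
    p_original = predict(image)
    p_squeezed = predict(reduce_bit_depth(image, bits))
    return float(np.abs(p_original - p_squeezed).sum())

def is_adversarial(predict, image, threshold, bits=4):
    """Flag the input when squeezing changes the prediction too much."""
    return squeezing_score(predict, image, bits) > threshold

def toy_predict(image):
    """Hypothetical two-class softmax that overreacts to tiny pixel values."""
    logits = np.array([1000.0 * image.mean(), 0.0])
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()
```

In a real system `predict` would be the deployed vision model, and the threshold would be chosen on held-out legitimate inputs so that the false-alarm rate stays acceptable.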
ROS has emerged as a game changer: it provides a uniform development platform that seamlessly combines hardware components and software architectures from developers across the globe, and it reduces the complexity of producing robots that help people in their daily activities. Yet the technology has seen disappointingly little adoption in verticals such as protection, defense, and security. By leveraging ROS in robotic automation and computer vision, this research paves the way for identifying suspicious activity with autonomously moving robots that run on ROS. The paper proposes and validates a flow in which ROS and computer vision algorithms such as YOLO work in sync to provide smarter and more accurate methods for indoor and limited outdoor patrolling. Identification of age, gender, weapons, and other elements that can disturb public harmony will be an integral part of the research and development process. The simulation and testing reflect the efficiency and speed of the designed software architecture.
In this study we propose a novel method for drone surveillance that can simultaneously analyze time-frequency responses in all pixels of a high-frame-rate video. The propellers of flying drones rotate at hundreds of Hz and their principal vibration frequency components are much higher than those of their background objects. To separate the pixels around a drone's propellers from its background, we utilize these time-series features for vibration source localization with pixel-level short-time Fourier transform (STFT). We verify the relationship between the number of taps in the STFT computation and the performance of our algorithm, including the execution time and the localization accuracy, by conducting experiments under various conditions, such as degraded appearance, weather, and defocused blur. The robustness of the proposed algorithm is also verified by localizing a flying multi-copter in real-time in an outdoor scenario.
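As a rough illustration of the pixel-level time-frequency idea, the sketch below takes one windowed Fourier transform per pixel over the most recent frames and marks pixels whose energy lies mostly above a cutoff frequency. It is a simplified single-window version, not the authors' pipeline. The function name, the Hann taper, the energy-ratio rule, and all numeric values are assumptions, and `window` plays the role of the number of taps.

```python
import numpy as np

def high_frequency_mask(frames, fps, window, cutoff_hz, ratio=0.5):
    """Return an (H, W) boolean mask of pixels dominated by fast vibration.

    frames: array of shape (T, H, W); only the last `window` frames are used.
    """
    segment = frames[-window:].astype(np.float64)
    segment = segment - segment.mean(axis=0)           # drop per-pixel DC level
    taper = np.hanning(window)[:, None, None]          # reduce spectral leakage
    power = np.abs(np.fft.rfft(segment * taper, axis=0)) ** 2
    freqs = np.fft.rfftfreq(window, d=1.0 / fps)
    high = power[freqs >= cutoff_hz].sum(axis=0)       # energy above the cutoff
    total = power.sum(axis=0) + 1e-12                  # avoid division by zero
    return (high / total) > ratio

# Synthetic clip: a slowly varying background plus one pixel vibrating at 200 Hz.
fps, window = 1000.0, 64
t = np.arange(window) / fps
video = np.full((window, 4, 4), 100.0)
video += 10.0 * np.sin(2 * np.pi * 5.0 * t)[:, None, None]
video[:, 1, 2] += 50.0 * np.sin(2 * np.pi * 200.0 * t)
mask = high_frequency_mask(video, fps, window, cutoff_hz=100.0)
```

A longer window gives finer frequency resolution at the price of more computation and latency, which is the trade-off between taps, execution time, and accuracy that the abstract mentions.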
To meet the high requirements of human-machine interaction, this paper studies quadruped robots with human recognition and tracking capability. We first introduce a marker recognition system that uses a multi-thread laser scanner and retro-reflective markers to distinguish the robot's leader from other objects. When the robot follows the leader autonomously, a variant of the A* algorithm in which obstacle grids are virtually extended (EA*) is used to plan the path. If instead the robot needs to track and follow the leader's path as closely as possible, it trusts that the path the leader has traveled is safe enough and uses the incremental form of the EA* algorithm (IEA*) to reproduce the trajectory. The simulation and experimental results illustrate the feasibility and effectiveness of the proposed algorithms.
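The abstract does not spell out how EA* extends obstacle grids, so the following is only a generic sketch of the underlying idea: obstacles on an occupancy grid are inflated by a safety margin before a standard A* search runs. The helper names, the 4-connected grid, and the Manhattan heuristic are assumptions, and the incremental variant (IEA*) is not shown.

```python
import heapq

def inflate(grid, margin):
    """Block every cell within `margin` cells (Chebyshev) of an obstacle."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c]:
                for rr in range(max(0, r - margin), min(rows, r + margin + 1)):
                    for cc in range(max(0, c - margin), min(cols, c + margin + 1)):
                        out[rr][cc] = 1
    return out

def astar(grid, start, goal):
    """4-connected A* with a Manhattan heuristic; returns a cell list or None."""
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]
    came_from = {start: None}
    cost = {start: 0}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:          # walk parents back to the start
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        if g > cost[cell]:
            continue                         # stale heap entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cell[0] + dr, cell[1] + dc)
            if 0 <= nxt[0] < rows and 0 <= nxt[1] < cols and not grid[nxt[0]][nxt[1]]:
                new_g = g + 1
                if new_g < cost.get(nxt, float("inf")):
                    cost[nxt] = new_g
                    came_from[nxt] = cell
                    heapq.heappush(open_heap, (new_g + h(nxt), new_g, nxt))
    return None
```

Calling `astar(inflate(grid, 1), start, goal)` plans on the inflated map, so the resulting path keeps at least one free cell between the robot and any obstacle, at the cost of a longer route.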
In the past few years, visual information collection and transmission has increased significantly for various applications. Smart vehicles, service robotic platforms, and surveillance cameras for smart city applications are collecting a large amount of visual data. Preserving the privacy of the people present in this data is an important factor in the storage, processing, sharing, and transmission of visual data across the Internet of Robotic Things (IoRT). In this paper, a novel anonymisation method for information security and privacy preservation in visual data in the sharing layer of the Web of Robotic Things (WoRT) is proposed. The proposed framework uses deep neural network based semantic segmentation to preserve privacy in video data based on the access level of the applications and users. The data is anonymised for applications with lower-level access, whereas applications with a higher legal access level can analyze and annotate the complete data. The experimental results show that the proposed method, while giving the required access to the authorities for legal applications of smart city surveillance, is capable of preserving the privacy of the people present in the data.
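A minimal sketch of access-level-dependent anonymisation, assuming a per-pixel label map has already been produced by some segmentation network. The class id `PERSON`, the integer access levels, and blanking pixels to zero are illustrative assumptions, not the paper's design.

```python
import numpy as np

PERSON = 1  # hypothetical class id for "person" in the segmentation map

def anonymise(frame, labels, access_level, required_level=2):
    """Return the frame with person pixels blanked for low-access callers.

    frame: (H, W, 3) image; labels: (H, W) integer segmentation map.
    """
    out = frame.copy()
    if access_level < required_level:
        out[labels == PERSON] = 0   # hide every pixel labelled as a person
    return out
```

Callers at or above `required_level` receive an unmodified copy, which mirrors the abstract's distinction between lower-access applications and those with a higher legal access level.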
As drones attract growing interest, the drone industry has opened its market to ordinary people, bringing drones into daily life. However, as drones have become easier for more people to use, safety and security issues have arisen because accidents are much more likely to happen, such as colliding with people after losing control or intruding on secured properties. For safety purposes, it is essential for observers and drones to be aware of an approaching drone. In this paper, we introduce a comprehensive drone detection system based on machine learning. The system is designed to operate on drones equipped with a camera. From the camera images, the system deduces the location of a drone in the image and its vendor model using machine classification. The system is built with the OpenCV library. We collected drone imagery and information for the learning process. The system's output shows about 89 percent accuracy.
Robots operating alongside humans in field environments have the potential to greatly increase the situational awareness of their human teammates. A significant challenge, however, is the efficient conveyance of what the robot perceives to the human in order to achieve improved situational awareness. We believe augmented reality (AR), which allows a human to simultaneously perceive the real world and digital information situated virtually in the real world, has the potential to address this issue. We propose to demonstrate that augmented reality can be used to enable human-robot cooperative search, where the robot can both share search results and assist the human teammate in navigating to a search target.
Explosive naval mines pose a threat to ocean and sea faring vessels, both military and civilian. This work applies deep neural network (DNN) methods to the problem of detecting minelike objects (MLO) on the seafloor in side-scan sonar imagery. We explored how the DNN depth, memory requirements, calculation requirements, and training data distribution affect detection efficacy. A visualization technique (class activation map) was incorporated that aids a user in interpreting the model's behavior. We found that modest DNN model sizes yielded better accuracy (98%) than very simple DNN models (93%) and a support vector machine (78%). The largest DNN models achieved <1% efficacy increase at a cost of a 17x increase of trainable parameter count and computation requirements. In contrast to DNNs popularized for many-class image recognition tasks, the models for this task require far fewer computational resources (0.3% of parameters), and are suitable for embedded use within an autonomous unmanned underwater vehicle.
This paper describes Smartpig, an algorithm for the iterative mosaicking of images of a planar surface using a unique parameterization which decomposes inter-image projective warps into camera intrinsics, fronto-parallel projections, and inter-image similarities. The constraints resulting from the inter-image alignments within an image set are stored in an undirected graph structure, allowing efficient optimization of image projections on the plane. Camera pose is also directly recoverable from the graph, making Smartpig a feasible solution to the problem of simultaneous localization and mapping (SLAM). Smartpig is demonstrated on a set of 144 high-resolution aerial images and evaluated with a number of metrics against ground control.
A number of blind Image Quality Evaluation Metrics (IQEMs) for Unmanned Aerial Vehicle (UAV) photography applications are presented. Nowadays, the visible light camera is widely used in UAV photography because of its vivid imaging effect; however, outdoor ambient light can have a strongly negative influence on its imaging output. In this paper, to overcome this problem, we design and reuse a series of blind IQEMs to analyze the imaging quality of UAV applications. The Human Visual System (HVS) based IQEMs, including the image brightness level, the image contrast level, the image noise level, the image edge blur level, the image texture intensity level, the image jitter level, and the image flicker level, are all considered in our application. Once these IQEMs are calculated, they can be used to provide a computational reference for subsequent image processing applications, such as image understanding and recognition. Some preliminary experiments on image enhancement have demonstrated the correctness and validity of our proposed technique.
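The abstract lists the metrics but not their formulas, so the sketch below uses common no-reference proxies for three of them: mean intensity for brightness, standard deviation (RMS contrast) for contrast, and variance of a 4-neighbour Laplacian as an inverse indicator of edge blur. These definitions and function names are assumptions and may differ from the paper's HVS-based formulations.

```python
import numpy as np

def brightness_level(gray):
    """Mean intensity of a grayscale image."""
    return float(np.asarray(gray, dtype=np.float64).mean())

def contrast_level(gray):
    """RMS contrast: standard deviation of the intensities."""
    return float(np.asarray(gray, dtype=np.float64).std())

def edge_sharpness(gray):
    """Variance of a 4-neighbour Laplacian; lower values suggest more blur."""
    g = np.asarray(gray, dtype=np.float64)
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())
```

Because none of these functions needs a reference image, they can be evaluated per frame on board and passed to later stages, for example to decide whether enhancement is needed before recognition.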
Mobile and aerial robots used in urban search and rescue (USAR) operations have shown the potential for allowing us to explore, survey and assess collapsed structures effectively at a safe distance. RGB-D cameras, such as the Microsoft Kinect, allow us to capture 3D depth data in addition to RGB images, providing a significantly richer user experience than flat video, which may provide improved situational awareness for first responders. However, the richer data comes at a higher cost in terms of data throughput and computing power requirements. In this paper we consider the problem of live streaming RGB-D data over wired and wireless communication channels, using low-power, embedded computing equipment. When assessing a disaster environment, a range camera is typically mounted on a ground or aerial robot along with the onboard computer system. Ground robots can use both wireless radio and tethers for communications, whereas aerial robots can only use wireless communication. We propose a hybrid lossless and lossy streaming compression format designed specifically for RGB-D data and investigate the feasibility and usefulness of live-streaming this data in disaster situations.