Biblio
Human action recognition in video is one of the most widely applied topics in the field of image and video processing, with many applications in surveillance (security, sports, etc.), activity detection, video-content-based monitoring, man-machine interaction, and health/disability care. Action recognition is a complex process that faces several challenges such as occlusion, camera movement, viewpoint move, background clutter, and brightness variation. In this study, we propose a novel human action recognition method using convolutional neural networks (CNN) and deep bidirectional LSTM (DB-LSTM) networks, using only raw video frames. First, deep features are extracted from video frames using a pre-trained CNN architecture called ResNet152. The sequential information of the frames is then learned using the DB-LSTM network, where multiple layers are stacked together in both forward and backward passes of DB-LSTM, to increase depth. The evaluation results of the proposed method using PyTorch, compared to the state-of-the-art methods, show a considerable increase in the efficiency of action recognition on the UCF 101 dataset, reaching 95% recognition accuracy. The choice of the CNN architecture, proper tuning of input parameters, and techniques such as data augmentation contribute to the accuracy boost in this study.
Today, as surveillance systems are widely used for indoor and outdoor monitoring applications, there is a growing interest in real-time generation detection and there are many different applications for real-time generation detection and analysis. Two-dimensional videos; It is used in multimedia content-based indexing, information acquisition, visual surveillance and distributed cross-camera surveillance systems, human tracking, traffic monitoring and similar applications. It is of great importance for the development of systems for national security by following a moving target within the scope of military applications. In this research, a more efficient solution is proposed in addition to the existing methods. Therefore, we present YOLO, a new approach to object detection for military applications.
Training a feed-forward network for the fast neural style transfer of images has proven successful, but the naive extension of processing videos frame by frame is prone to producing flickering results. We propose the first end-to-end network for online video style transfer, which generates temporally coherent stylized video sequences in near realtime. Two key ideas include an efficient network by incorporating short-term coherence, and propagating short-term coherence to long-term, which ensures consistency over a longer period of time. Our network can incorporate different image stylization networks and clearly outperforms the per-frame baseline both qualitatively and quantitatively. Moreover, it can achieve visually comparable coherence to optimization-based video style transfer, but is three orders of magnitude faster.
Abnormal crowd behavior detection is an important research issue in video processing and computer vision. In this paper we introduce a novel method to detect abnormal crowd behaviors in video surveillance based on interest points. A complex network-based algorithm is used to detect interest points and extract the global texture features in scenarios. The performance of the proposed method is evaluated on publicly available datasets. We present a detailed analysis of the characteristics of the crowd behavior in different density crowd scenes. The analysis of crowd behavior features and simulation results are also demonstrated to illustrate the effectiveness of our proposed method.