Biblio
Video summarization aims to improve the efficiency of large-scale video browsing through producting concise summaries. It has been popular among many scenarios such as video surveillance, video review and data annotation. Traditional video summarization techniques focus on filtration in image features dimension or image semantics dimension. However, such techniques can make a large amount of possible useful information lost, especially for many videos with rich text semantics like interviews, teaching videos, in that only the information relevant to the image dimension will be retained. In order to solve the above problem, this paper considers video summarization as a continuous multi-dimensional decision-making process. Specifically, the summarization model predicts a probability for each frame and its corresponding text, and then we designs reward methods for each of them. Finally, comprehensive summaries in two dimensions, i.e. images and semantics, is generated. This approach is not only unsupervised and does not rely on labels and user interaction, but also decouples the semantic and image summarization models to provide more usable interfaces for subsequent engineering use.
ISSN: 2693-9371
Fake news is a new phenomenon that promotes misleading information and fraud via internet social media or traditional news sources. Fake news is readily manufactured and transmitted across numerous social media platforms nowadays, and it has a significant influence on the real world. It is vital to create effective algorithms and tools for detecting misleading information on social media platforms. Most modern research approaches for identifying fraudulent information are based on machine learning, deep learning, feature engineering, graph mining, image and video analysis, and newly built datasets and online services. There is a pressing need to develop a viable approach for readily detecting misleading information. The deep learning LSTM and Bi-LSTM model was proposed as a method for detecting fake news, In this work. First, the NLTK toolkit was used to remove stop words, punctuation, and special characters from the text. The same toolset is used to tokenize and preprocess the text. Since then, GLOVE word embeddings have incorporated higher-level characteristics of the input text extracted from long-term relationships between word sequences captured by the RNN-LSTM, Bi-LSTM model to the preprocessed text. The proposed model additionally employs dropout technology with Dense layers to improve the model's efficiency. The proposed RNN Bi-LSTM-based technique obtains the best accuracy of 94%, and 93% using the Adam optimizer and the Binary cross-entropy loss function with Dropout (0.1,0.2), Once the Dropout range increases it decreases the accuracy of the model as it goes 92% once Dropout (0.3).
To exploit high temporal correlations in video frames of the same scene, the current frame is predicted from the already-encoded reference frames using block-based motion estimation and compensation techniques. While this approach can efficiently exploit the translation motion of the moving objects, it is susceptible to other types of affine motion and object occlusion/deocclusion. Recently, deep learning has been used to model the high-level structure of human pose in specific actions from short videos and then generate virtual frames in future time by predicting the pose using a generative adversarial network (GAN). Therefore, modelling the high-level structure of human pose is able to exploit semantic correlation by predicting human actions and determining its trajectory. Video surveillance applications will benefit as stored “big” surveillance data can be compressed by estimating human pose trajectories and generating future frames through semantic correlation. This paper explores a new way of video coding by modelling human pose from the already-encoded frames and using the generated frame at the current time as an additional forward-referencing frame. It is expected that the proposed approach can overcome the limitations of the traditional backward-referencing frames by predicting the blocks containing the moving objects with lower residuals. Our experimental results show that the proposed approach can achieve on average up to 2.83 dB PSNR gain and 25.93% bitrate savings for high motion video sequences compared to standard video coding.
ISSN: 2642-9357
A distributed denial-of-service (DDoS) is a malicious attempt by attackers to disrupt the normal traffic of a targeted server, service or network. This is done by overwhelming the target and its surrounding infrastructure with a flood of Internet traffic. The multiple compromised computer systems (bots or zombies) then act as sources of attack traffic. Exploited machines can include computers and other network resources such as IoT devices. The attack results in either degraded network performance or a total service outage of critical infrastructure. This can lead to heavy financial losses and reputational damage. These attacks maximise effectiveness by controlling the affected systems remotely and establishing a network of bots called bot networks. It is very difficult to separate the attack traffic from normal traffic. Early detection is essential for successful mitigation of the attack, which gives rise to a very important role in cybersecurity to detect the attacks and mitigate the effects. This can be done by deploying machine learning or deep learning models to monitor the traffic data. We propose using various machine learning and deep learning algorithms to analyse the traffic patterns and separate malicious traffic from normal traffic. Two suitable datasets have been identified (DDoS attack SDN dataset and CICDDoS2019 dataset). All essential preprocessing is performed on both datasets. Feature selection is also performed before detection techniques are applied. 8 different Neural Networks/ Ensemble/ Machine Learning models are chosen and the datasets are analysed. The best model is chosen based on the performance metrics (DEEP NEURAL NETWORK MODEL). An alternative is also suggested (Next best - Hypermodel). Optimisation by Hyperparameter tuning further enhances the accuracy. Based on the nature of the attack and the intended target, suitable mitigation procedures can then be deployed.