Biblio
Video streams acquired from thermal cameras are proven to be beneficial in diverse number of fields including military, healthcare, law enforcement, and security. Despite the hype, thermal imaging is increasingly affected by poor resolution, where it has expensive optical sensors and inability to attain optical precision. In recent years, deep learning based super-resolution algorithms are developed to enhance the video frame resolution at high accuracy. This paper presents a comparative analysis of super resolution (SR) techniques based on deep neural networks (DNN) that are applied on thermal video dataset. SRCNN, EDSR, Auto-encoder, and SRGAN are also discussed and investigated. Further the results on benchmark thermal datasets including FLIR, OSU thermal pedestrian database and OSU color thermal database are evaluated and analyzed. Based on the experimental results, it is concluded that, SRGAN has delivered a superior performance on thermal frames when compared to other techniques and improvements, which has the ability to provide state-of-the art performance in real time operations.
Increasing number of Internet-scale applications, such as video streaming, incur huge amount of wide area traffic. Such traffic over the unreliable Internet without bandwidth guarantee suffers unpredictable network performance. This result, however, is unappealing to the application providers. Fortunately, Internet giants like Google and Microsoft are increasingly deploying their private wide area networks (WANs) to connect their global datacenters. Such high-speed private WANs are reliable, and can provide predictable network performance. In this paper, we propose a new type of service-inter-datacenter network as a service (iDaaS), where traditional application providers can reserve bandwidth from those Internet giants to guarantee their wide area traffic. Specifically, we design a bandwidth trading market among multiple iDaaS providers and application providers, and concentrate on the essential bandwidth pricing problem. The involved challenging issue is that the bandwidth price of each iDaaS provider is not only influenced by other iDaaS providers, but also affected by the application providers. To address this issue, we characterize the interaction between iDaaS providers and application providers using a Stackelberg game model, and analyze the existence and uniqueness of the equilibrium. We further present an efficient bandwidth pricing algorithm by blending the advantage of a geometrical Nash bargaining solution and the demand segmentation method. For comparison, we present two bandwidth reservation algorithms, where each iDaaS provider's bandwidth is reserved in a weighted fair manner and a max-min fair manner, respectively. Finally, we conduct comprehensive trace-driven experiments. The evaluation results show that our proposed algorithms not only ensure the revenue of iDaaS providers, but also provide bandwidth guarantee for application providers with lower bandwidth price per unit.
With the rapid development of the Internet of vehicles, there is a huge amount of multimedia data becoming a hidden trouble in the Internet of Things. Therefore, it is necessary to process and store them in real time as a way of big data curation. In this paper, a method of real-time processing and storage based on CDN in vehicle monitoring system is proposed. The MPEG-DASH standard is used to process the multimedia data by dividing them into MPD files and media segments. A real-time monitoring system of vehicle on the basis of the method introduced is designed and implemented.
With the growing number of streaming services, internet providers are increasingly needing to be able to identify the types of data and content providers that are being used on their networks. Traditional methods, such as IP and port scanning, are not always available for clients using VPNs or with providers using varying IP addresses. As such, in this paper we explore a potential method using neural networks and Markov Decision Process in order to augment deep packet inspection techniques in identifying the source and class of video streaming services.
Nowadays, video streaming over HTTP is one of the most dominant Internet applications, using adaptive video techniques. Network assisted approaches have been proposed and are being standardized in order to provide high QoE for the end-users of such applications. SAND is a recent MPEG standard where DASH Aware Network Elements (DANEs) are introduced for this purpose. As web-caches are one of the main components of the SAND architecture, the location and the connectivity of these web-caches plays an important role in the user's QoE. The nature of SAND and DANE provides a good foundation for software controlled virtualized DASH environments, and in this paper, we propose a cache location algorithm and a cache migration algorithm for virtualized SAND deployments. The optimal locations for the virtualized DANEs is determined by an SDN controller and migrates it based on gathered statistics. The performance of the resulting system shows that, when SDN and NFV technologies are leveraged in such systems, software controlled virtualized approaches can provide an increase in QoE.
With the unprecedented prevalence of mobile network applications, cryptographic protocols, such as the Secure Socket Layer/Transport Layer Security (SSL/TLS), are widely used in mobile network applications for communication security. The proven methods for encrypted video stream classification or encrypted protocol detection are unsuitable for the SSL/TLS traffic. Consequently, application-level traffic classification based networking and security services are facing severe challenges in effectiveness. Existing encrypted traffic classification methods exhibit unsatisfying accuracy for applications with similar state characteristics. In this paper, we propose a multiple-attribute-based encrypted traffic classification system named Multi-Attribute Associated Fingerprints (MAAF). We develop MAAF based on the two key insights that the DNS traces generated during the application runtime contain classification guidance information and that the handshake certificates in the encrypted flows can provide classification clues. Apart from the exploitation of key insights, MAAF employs the context of the encrypted traffic to overcome the attribute-lacking problem during the classification. Our experimental results demonstrate that MAAF achieves 98.69% accuracy on the real-world traceset that consists of 16 applications, supports the early prediction, and is robust to the scale of the training traceset. Besides, MAAF is superior to the state-of-the-art methods in terms of both accuracy and robustness.
Traditional firewalls, Intrusion Detection Systems(IDS) and network analytics tools extensively use the `flow' connection concept, consisting of five `tuples' of source and destination IP, ports and protocol type, for classification and management of network activities. By analysing flows, information can be obtained from TCP/IP fields and packet content to give an understanding of what is being transferred within a single connection. As networks have evolved to incorporate more connections and greater bandwidth, particularly from ``always on'' IoT devices and video and data streaming, so too have malicious network threats, whose communication methods have increased in sophistication. As a result, the concept of the 5 tuple flow in isolation is unable to detect such threats and malicious behaviours. This is due to factors such as the length of time and data required to understand the network traffic behaviour, which cannot be accomplished by observing a single connection. To alleviate this issue, this paper proposes the use of additional, two tuple and single tuple flow types to associate multiple 5 tuple communications, with generated metadata used to profile individual connnection behaviour. This proposed approach enables advanced linking of different connections and behaviours, developing a clearer picture as to what network activities have been taking place over a prolonged period of time. To demonstrate the capability of this approach, an expert system rule set has been developed to detect the presence of a multi-peered ZeuS botnet, which communicates by making multiple connections with multiple hosts, thus undetectable to standard IDS systems observing 5 tuple flow types in isolation. Finally, as the solution is rule based, this implementation operates in realtime and does not require post-processing and analytics of other research solutions. This paper aims to demonstrate possible applications for next generation firewalls and methods to acquire additional information from network traffic.
Advancements in semiconductor domain gave way to realize numerous applications in Video Surveillance using Computer vision and Deep learning, Video Surveillances in Industrial automation, Security, ADAS, Live traffic analysis etc. through image understanding improves efficiency. Image understanding requires input data with high precision which is dependent on Image resolution and location of camera. The data of interest can be thermal image or live feed coming for various sensors. Composite(CVBS) is a popular video interface capable of streaming upto HD(1920x1080) quality. Unlike high speed serial interfaces like HDMI/MIPI CSI, Analog composite video interface is a single wire standard supporting longer distances. Image understanding requires edge detection and classification for further processing. Sobel filter is one the most used edge detection filter which can be embedded into live stream. This paper proposes Zynq FPGA based system design for video surveillance with Sobel edge detection, where the input Composite video decoded (Analog CVBS input to YCbCr digital output), processed in HW and streamed to HDMI display simultaneously storing in SD memory for later processing. The HW design is scalable for resolutions from VGA to Full HD for 60fps and 4K for 24fps. The system is built on Xilinx ZC702 platform and TVP5146 to showcase the functional path.
The world is witnessing a remarkable increase in the usage of video surveillance systems. Besides fulfilling an imperative security and safety purpose, it also contributes towards operations monitoring, hazard detection and facility management in industry/smart factory settings. Most existing surveillance techniques use hand-crafted features analyzed using standard machine learning pipelines for action recognition and event detection. A key shortcoming of such techniques is the inability to learn from unlabeled video streams. The entire video stream is unlabeled when the requirement is to detect irregular, unforeseen and abnormal behaviors, anomalies. Recent developments in intelligent high-level video analysis have been successful in identifying individual elements in a video frame. However, the detection of anomalies in an entire video feed requires incremental and unsupervised machine learning. This paper presents a novel approach that incorporates high-level video analysis outcomes with incremental knowledge acquisition and self-learning for autonomous video surveillance. The proposed approach is capable of detecting changes that occur over time and separating irregularities from re-occurrences, without the prerequisite of a labeled dataset. We demonstrate the proposed approach using a benchmark video dataset and the results confirm its validity and usability for autonomous video surveillance.
Video surveillance has been widely adopted to ensure home security in recent years. Most video encoding standards such as H.264 and MPEG-4 compress the temporal redundancy in a video stream using difference coding, which only encodes the residual image between a frame and its reference frame. Difference coding can efficiently compress a video stream, but it causes side-channel information leakage even though the video stream is encrypted, as reported in this paper. Particularly, we observe that the traffic patterns of an encrypted video stream are different when a user conducts different basic activities of daily living, which must be kept private from third parties as obliged by HIPAA regulations. We also observe that by exploiting this side-channel information leakage, attackers can readily infer a user's basic activities of daily living based on only the traffic size data of an encrypted video stream. We validate such an attack using two off-the-shelf cameras, and the results indicate that the user's basic activities of daily living can be recognized with a high accuracy.
Tracking and maintaining satisfactory QoE for video streaming services is becoming a greater challenge for mobile network operators than ever before. Downloading and watching video content on mobile devices is currently a growing trend among users, that is causing a demand for higher bandwidth and better provisioning throughout the network infrastructure. At the same time, popular demand for privacy has led many online streaming services to adopt end-to-end encryption, leaving providers with only a handful of indicators for identifying QoE issues. In order to address these challenges, we propose a novel methodology for detecting video streaming QoE issues from encrypted traffic. We develop predictive models for detecting different levels of QoE degradation that is caused by three key influence factors, i.e. stalling, the average video quality and the quality variations. The models are then evaluated on the production network of a large scale mobile operator, where we show that despite encryption our methodology is able to accurately detect QoE problems with 72\textbackslash%-92\textbackslash% accuracy, while even higher performance is achieved when dealing with cleartext traffic
The video streaming between the sender and the receiver involves multiple unsecured hops where the video data can be illegally copied if the nodes run malicious forwarding logic. This paper introduces a novel method to stream video data through dual channels using dual data paths. The frames' pixels are also scrambled. The video frames are divided into two frame streams. At the receiver side video is re-constructed and played for a limited time period. As soon as small chunk of merged video is played, it is deleted from video buffer. The approach has been tried to formalize and initial simulation has been done over MATLAB. Preliminary results are optimistic and a refined approach may lead to a formal designing of network layer routing protocol with corrections in transport layer.
Mobile and aerial robots used in urban search and rescue (USAR) operations have shown the potential for allowing us to explore, survey and assess collapsed structures effectively at a safe distance. RGB-D cameras, such as the Microsoft Kinect, allow us to capture 3D depth data in addition to RGB images, providing a significantly richer user experience than flat video, which may provide improved situational awareness for first responders. However, the richer data comes at a higher cost in terms of data throughput and computing power requirements. In this paper we consider the problem of live streaming RGB-D data over wired and wireless communication channels, using low-power, embedded computing equipment. When assessing a disaster environment, a range camera is typically mounted on a ground or aerial robot along with the onboard computer system. Ground robots can use both wireless radio and tethers for communications, whereas aerial robots can only use wireless communication. We propose a hybrid lossless and lossy streaming compression format designed specifically for RGB-D data and investigate the feasibility and usefulness of live-streaming this data in disaster situations.
Traffic from mobile wireless networks has been growing at a fast pace in recent years and is expected to surpass wired traffic very soon. Service providers face significant challenges at such scales including providing seamless mobility, efficient data delivery, security, and provisioning capacity at the wireless edge. In the Mobility First project, we have been exploring clean slate enhancements to the network protocols that can inherently provide support for at-scale mobility and trustworthiness in the Internet. An extensible data plane using pluggable compute-layer services is a key component of this architecture. We believe these extensions can be used to implement in-network services to enhance mobile end-user experience by either off-loading work and/or traffic from mobile devices, or by enabling en-route service-adaptation through context-awareness (e.g., Knowing contemporary access bandwidth). In this work we present details of the architectural support for in-network services within Mobility First, and propose protocol and service-API extensions to flexibly address these pluggable services from end-points. As a demonstrative example, we implement an in network service that does rate adaptation when delivering video streams to mobile devices that experience variable connection quality. We present details of our deployment and evaluation of the non-IP protocols along with compute-layer extensions on the GENI test bed, where we used a set of programmable nodes across 7 distributed sites to configure a Mobility First network with hosts, routers, and in-network compute services.
Recognizing activities in wide aerial/overhead imagery remains a challenging problem due in part to low-resolution video and cluttered scenes with a large number of moving objects. In the context of this research, we deal with two un-synchronized data sources collected in real-world operating scenarios: full-motion videos (FMV) and analyst call-outs (ACO) in the form of chat messages (voice-to-text) made by a human watching the streamed FMV from an aerial platform. We present a multi-source multi-modal activity/event recognition system for surveillance applications, consisting of: (1) detecting and tracking multiple dynamic targets from a moving platform, (2) representing FMV target tracks and chat messages as graphs of attributes, (3) associating FMV tracks and chat messages using a probabilistic graph-based matching approach, and (4) detecting spatial-temporal activity boundaries. We also present an activity pattern learning framework which uses the multi-source associated data as training to index a large archive of FMV videos. Finally, we describe a multi-intelligence user interface for querying an index of activities of interest (AOIs) by movement type and geo-location, and for playing-back a summary of associated text (ACO) and activity video segments of targets-of-interest (TOIs) (in both pixel and geo-coordinates). Such tools help the end-user to quickly search, browse, and prepare mission reports from multi-source data.
To reduce human efforts in browsing long surveillance videos, synopsis videos are proposed. Traditional synopsis video generation applying optimization on video tubes is very time consuming and infeasible for real-time online generation. This dilemma significantly reduces the feasibility of synopsis video generation in practical situations. To solve this problem, the synopsis video generation problem is formulated as a maximum a posteriori probability (MAP) estimation problem in this paper, where the positions and appearing frames of video objects are chronologically rearranged in real time without the need to know their complete trajectories. Moreover, a synopsis table is employed with MAP estimation to decide the temporal locations of the incoming foreground objects in the synopsis video without needing an optimization procedure. As a result, the computational complexity of the proposed video synopsis generation method can be significantly reduced. Furthermore, as it does not require prescreening the entire video, this approach can be applied on online streaming videos.