Visible to the public Biblio

Filters: Keyword is video streaming  [Clear All Filters]
2021-02-15
Rabieh, K., Mercan, S., Akkaya, K., Baboolal, V., Aygun, R. S..  2020.  Privacy-Preserving and Efficient Sharing of Drone Videos in Public Safety Scenarios using Proxy Re-encryption. 2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science (IRI). :45–52.
Unmanned Aerial Vehicles (UAVs) also known as drones are being used in many applications where they can record or stream videos. One interesting application is the Intelligent Transportation Systems (ITS) and public safety applications where drones record videos and send them to a control center for further analysis. These videos are shared by various clients such as law enforcement or emergency personnel. In such cases, the recording might include faces of civilians or other sensitive information that might pose privacy concerns. While the video can be encrypted and stored in the cloud that way, it can still be accessed once the keys are exposed to third parties which is completely insecure. To prevent such insecurity, in this paper, we propose proxy re-encryption based sharing scheme to enable third parties to access only limited videos without having the original encryption key. The costly pairing operations in proxy re-encryption are not used to allow rapid access and delivery of the surveillance videos to third parties. The key management is handled by a trusted control center, which acts as the proxy to re-encrypt the data. We implemented and tested the approach in a realistic simulation environment using different resolutions under ns-3. The implementation results and comparisons indicate that there is an acceptable overhead while it can still preserve the privacy of drivers and passengers.
2021-01-15
Brockschmidt, J., Shang, J., Wu, J..  2019.  On the Generality of Facial Forgery Detection. 2019 IEEE 16th International Conference on Mobile Ad Hoc and Sensor Systems Workshops (MASSW). :43—47.
A variety of architectures have been designed or repurposed for the task of facial forgery detection. While many of these designs have seen great success, they largely fail to address challenges these models may face in practice. A major challenge is posed by generality, wherein models must be prepared to perform in a variety of domains. In this paper, we investigate the ability of state-of-the-art facial forgery detection architectures to generalize. We first propose two criteria for generality: reliably detecting multiple spoofing techniques and reliably detecting unseen spoofing techniques. We then devise experiments which measure how a given architecture performs against these criteria. Our analysis focuses on two state-of-the-art facial forgery detection architectures, MesoNet and XceptionNet, both being convolutional neural networks (CNNs). Our experiments use samples from six state-of-the-art facial forgery techniques: Deepfakes, Face2Face, FaceSwap, GANnotation, ICface, and X2Face. We find MesoNet and XceptionNet show potential to generalize to multiple spoofing techniques but with a slight trade-off in accuracy, and largely fail against unseen techniques. We loosely extrapolate these results to similar CNN architectures and emphasize the need for better architectures to meet the challenges of generality.
2021-01-11
Gautam, A., Singh, S..  2020.  A Comparative Analysis of Deep Learning based Super-Resolution Techniques for Thermal Videos. 2020 Third International Conference on Smart Systems and Inventive Technology (ICSSIT). :919—925.

Video streams acquired from thermal cameras are proven to be beneficial in diverse number of fields including military, healthcare, law enforcement, and security. Despite the hype, thermal imaging is increasingly affected by poor resolution, where it has expensive optical sensors and inability to attain optical precision. In recent years, deep learning based super-resolution algorithms are developed to enhance the video frame resolution at high accuracy. This paper presents a comparative analysis of super resolution (SR) techniques based on deep neural networks (DNN) that are applied on thermal video dataset. SRCNN, EDSR, Auto-encoder, and SRGAN are also discussed and investigated. Further the results on benchmark thermal datasets including FLIR, OSU thermal pedestrian database and OSU color thermal database are evaluated and analyzed. Based on the experimental results, it is concluded that, SRGAN has delivered a superior performance on thermal frames when compared to other techniques and improvements, which has the ability to provide state-of-the art performance in real time operations.

2020-12-15
Nasser, B., Rabani, A., Freiling, D., Gan, C..  2018.  An Adaptive Telerobotics Control for Advanced Manufacturing. 2018 NASA/ESA Conference on Adaptive Hardware and Systems (AHS). :82—89.
This paper explores an innovative approach to the telerobotics reasoning architecture and networking, which offer a reliable and adaptable operational process for complex tasks. There are many operational challenges in the remote control for manufacturing that can be introduced by the network communications and Iatency. A new protocol, named compact Reliable UDP (compact-RUDP), has been developed to combine both data channelling and media streaming for robot teleoperation. The original approach ensures connection reliability by implementing a TCP-like sliding window with UDP packets. The protocol provides multiple features including data security, link status monitoring, bandwidth control, asynchronous file transfer and prioritizing transfer of data packets. Experiments were conducted on a 5DOF robotic arm where a cutting tool was mounted at its distal end. A light sensor was used to guide the robot movements, and a camera device to provide a video stream of the operation. The data communication reliability is evaluated using Round-Trip Time (RTT), and advanced robot path planning for distributed decision making between endpoints. The results show 88% correlation between the remotely and locally operated robots. The file transfers and video streaming were performed with no data loss or corruption on the control commands and data feedback packets.
2020-12-01
Li, W., Guo, D., Li, K., Qi, H., Zhang, J..  2018.  iDaaS: Inter-Datacenter Network as a Service. IEEE Transactions on Parallel and Distributed Systems. 29:1515—1529.

Increasing number of Internet-scale applications, such as video streaming, incur huge amount of wide area traffic. Such traffic over the unreliable Internet without bandwidth guarantee suffers unpredictable network performance. This result, however, is unappealing to the application providers. Fortunately, Internet giants like Google and Microsoft are increasingly deploying their private wide area networks (WANs) to connect their global datacenters. Such high-speed private WANs are reliable, and can provide predictable network performance. In this paper, we propose a new type of service-inter-datacenter network as a service (iDaaS), where traditional application providers can reserve bandwidth from those Internet giants to guarantee their wide area traffic. Specifically, we design a bandwidth trading market among multiple iDaaS providers and application providers, and concentrate on the essential bandwidth pricing problem. The involved challenging issue is that the bandwidth price of each iDaaS provider is not only influenced by other iDaaS providers, but also affected by the application providers. To address this issue, we characterize the interaction between iDaaS providers and application providers using a Stackelberg game model, and analyze the existence and uniqueness of the equilibrium. We further present an efficient bandwidth pricing algorithm by blending the advantage of a geometrical Nash bargaining solution and the demand segmentation method. For comparison, we present two bandwidth reservation algorithms, where each iDaaS provider's bandwidth is reserved in a weighted fair manner and a max-min fair manner, respectively. Finally, we conduct comprehensive trace-driven experiments. The evaluation results show that our proposed algorithms not only ensure the revenue of iDaaS providers, but also provide bandwidth guarantee for application providers with lower bandwidth price per unit.

2020-11-30
Xu, Y., Chen, H., Zhao, Y., Zhang, W., Shen, Q., Zhang, X., Ma, Z..  2019.  Neural Adaptive Transport Framework for Internet-scale Interactive Media Streaming Services. 2019 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB). :1–6.
Network dynamics, such as bandwidth fluctuation and unexpected latency, hurt users' quality of experience (QoE) greatly for media services over the Internet. In this work, we propose a neural adaptive transport (NAT) framework to tackle the network dynamics for Internet-scale interactive media services. The entire NAT system has three major components: a learning based cloud overlay routing (COR) scheme for the best delivery path to bypass the network bottlenecks while offering the minimal end-to-end latency simultaneously; a residual neural network based collaborative video processing (CVP) system to trade the computational capability at client-end for QoE improvement via learned resolution scaling; and a deep reinforcement learning (DRL) based adaptive real-time streaming (ARS) strategy to select the appropriate video bitrate for maximal QoE. We have demonstrated that COR could improve the user satisfaction from 5% to 43%, CVP could reduce the bandwidth consumption more than 30% at the same quality, and DRL-based ARS can maintain the smooth streaming with \textbackslashtextless; 50% QoE improvement, respectively.
2020-11-02
Xiong, Wenjie, Shan, Chun, Sun, Zhaoliang, Meng, Qinglei.  2018.  Real-time Processing and Storage of Multimedia Data with Content Delivery Network in Vehicle Monitoring System. 2018 6th International Conference on Wireless Networks and Mobile Communications (WINCOM). :1—4.

With the rapid development of the Internet of vehicles, there is a huge amount of multimedia data becoming a hidden trouble in the Internet of Things. Therefore, it is necessary to process and store them in real time as a way of big data curation. In this paper, a method of real-time processing and storage based on CDN in vehicle monitoring system is proposed. The MPEG-DASH standard is used to process the multimedia data by dividing them into MPD files and media segments. A real-time monitoring system of vehicle on the basis of the method introduced is designed and implemented.

2020-09-08
Perello, Jordi, Lopez, Albert, Careglio, Davide.  2019.  Experimenting with Real Application-specific QoS Guarantees in a Large-scale RINA Demonstrator. 2019 22nd Conference on Innovation in Clouds, Internet and Networks and Workshops (ICIN). :31–36.
This paper reports the definition, setup and obtained results of the Fed4FIRE + medium experiment ERASER, aimed to evaluate the actual Quality of Service (QoS) guarantees that the clean-slate Recursive InterNetwork Architecture (RINA) can deliver to heterogeneous applications at large-scale. To this goal, a 37-Node 5G metro/regional RINA network scenario, spanning from the end-user to the server where applications run in a datacenter has been configured in the Virtual Wall experimentation facility. This scenario has initially been loaded with synthetic application traffic flows, with diverse QoS requirements, thus reproducing different network load conditions. Next,their experienced QoS metrics end-to-end have been measured with two different QTA-Mux (i.e., the most accepted candidate scheduling policy for providing RINA with its QoS support) deployment scenarios. Moreover, on this RINA network scenario loaded with synthetic application traffic flows, a real HD (1080p) video streaming demonstration has also been conducted, setting up video streaming sessions to end-users at different network locations, illustrating the perceived Quality of Experience (QoE). Obtained results in ERASER disclose that, by appropriately deploying and configuring QTA-Mux, RINA can yield effective QoS support, which has provided perfect QoE in almost all locations in our demo when assigning video traffic flows the highest (i.e., Gold) QoS Cube.
2020-07-03
Shaout, Adnan, Crispin, Brennan.  2019.  Markov Augmented Neural Networks for Streaming Video Classification. 2019 International Arab Conference on Information Technology (ACIT). :1—7.

With the growing number of streaming services, internet providers are increasingly needing to be able to identify the types of data and content providers that are being used on their networks. Traditional methods, such as IP and port scanning, are not always available for clients using VPNs or with providers using varying IP addresses. As such, in this paper we explore a potential method using neural networks and Markov Decision Process in order to augment deep packet inspection techniques in identifying the source and class of video streaming services.

Adari, Suman Kalyan, Garcia, Washington, Butler, Kevin.  2019.  Adversarial Video Captioning. 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W). :24—27.
In recent years, developments in the field of computer vision have allowed deep learning-based techniques to surpass human-level performance. However, these advances have also culminated in the advent of adversarial machine learning techniques, capable of launching targeted image captioning attacks that easily fool deep learning models. Although attacks in the image domain are well studied, little work has been done in the video domain. In this paper, we show it is possible to extend prior attacks in the image domain to the video captioning task, without heavily affecting the video's playback quality. We demonstrate our attack against a state-of-the-art video captioning model, by extending a prior image captioning attack known as Show and Fool. To the best of our knowledge, this is the first successful method for targeted attacks against a video captioning model, which is able to inject 'subliminal' perturbations into the video stream, and force the model to output a chosen caption with up to 0.981 cosine similarity, achieving near-perfect similarity to chosen target captions.
2020-02-18
Kalan, Reza Shokri, Sayit, Muge, Clayman, Stuart.  2019.  Optimal Cache Placement and Migration for Improving the Performance of Virtualized SAND. 2019 IEEE Conference on Network Softwarization (NetSoft). :78–83.

Nowadays, video streaming over HTTP is one of the most dominant Internet applications, using adaptive video techniques. Network assisted approaches have been proposed and are being standardized in order to provide high QoE for the end-users of such applications. SAND is a recent MPEG standard where DASH Aware Network Elements (DANEs) are introduced for this purpose. As web-caches are one of the main components of the SAND architecture, the location and the connectivity of these web-caches plays an important role in the user's QoE. The nature of SAND and DANE provides a good foundation for software controlled virtualized DASH environments, and in this paper, we propose a cache location algorithm and a cache migration algorithm for virtualized SAND deployments. The optimal locations for the virtualized DANEs is determined by an SDN controller and migrates it based on gathered statistics. The performance of the resulting system shows that, when SDN and NFV technologies are leveraged in such systems, software controlled virtualized approaches can provide an increase in QoE.

2020-02-10
Chen, Yige, Zang, Tianning, Zhang, Yongzheng, Zhou, Yuan, Wang, Yipeng.  2019.  Rethinking Encrypted Traffic Classification: A Multi-Attribute Associated Fingerprint Approach. 2019 IEEE 27th International Conference on Network Protocols (ICNP). :1–11.

With the unprecedented prevalence of mobile network applications, cryptographic protocols, such as the Secure Socket Layer/Transport Layer Security (SSL/TLS), are widely used in mobile network applications for communication security. The proven methods for encrypted video stream classification or encrypted protocol detection are unsuitable for the SSL/TLS traffic. Consequently, application-level traffic classification based networking and security services are facing severe challenges in effectiveness. Existing encrypted traffic classification methods exhibit unsatisfying accuracy for applications with similar state characteristics. In this paper, we propose a multiple-attribute-based encrypted traffic classification system named Multi-Attribute Associated Fingerprints (MAAF). We develop MAAF based on the two key insights that the DNS traces generated during the application runtime contain classification guidance information and that the handshake certificates in the encrypted flows can provide classification clues. Apart from the exploitation of key insights, MAAF employs the context of the encrypted traffic to overcome the attribute-lacking problem during the classification. Our experimental results demonstrate that MAAF achieves 98.69% accuracy on the real-world traceset that consists of 16 applications, supports the early prediction, and is robust to the scale of the training traceset. Besides, MAAF is superior to the state-of-the-art methods in terms of both accuracy and robustness.

2020-01-02
Hagan, Matthew, Kang, BooJoong, McLaughlin, Kieran, Sezer, Sakir.  2018.  Peer Based Tracking Using Multi-Tuple Indexing for Network Traffic Analysis and Malware Detection. 2018 16th Annual Conference on Privacy, Security and Trust (PST). :1–5.

Traditional firewalls, Intrusion Detection Systems(IDS) and network analytics tools extensively use the `flow' connection concept, consisting of five `tuples' of source and destination IP, ports and protocol type, for classification and management of network activities. By analysing flows, information can be obtained from TCP/IP fields and packet content to give an understanding of what is being transferred within a single connection. As networks have evolved to incorporate more connections and greater bandwidth, particularly from ``always on'' IoT devices and video and data streaming, so too have malicious network threats, whose communication methods have increased in sophistication. As a result, the concept of the 5 tuple flow in isolation is unable to detect such threats and malicious behaviours. This is due to factors such as the length of time and data required to understand the network traffic behaviour, which cannot be accomplished by observing a single connection. To alleviate this issue, this paper proposes the use of additional, two tuple and single tuple flow types to associate multiple 5 tuple communications, with generated metadata used to profile individual connnection behaviour. This proposed approach enables advanced linking of different connections and behaviours, developing a clearer picture as to what network activities have been taking place over a prolonged period of time. To demonstrate the capability of this approach, an expert system rule set has been developed to detect the presence of a multi-peered ZeuS botnet, which communicates by making multiple connections with multiple hosts, thus undetectable to standard IDS systems observing 5 tuple flow types in isolation. Finally, as the solution is rule based, this implementation operates in realtime and does not require post-processing and analytics of other research solutions. This paper aims to demonstrate possible applications for next generation firewalls and methods to acquire additional information from network traffic.

2019-08-12
Eetha, S., Agrawal, S., Neelam, S..  2018.  Zynq FPGA Based System Design for Video Surveillance with Sobel Edge Detection. 2018 IEEE International Symposium on Smart Electronic Systems (iSES) (Formerly iNiS). :76–79.

Advancements in semiconductor domain gave way to realize numerous applications in Video Surveillance using Computer vision and Deep learning, Video Surveillances in Industrial automation, Security, ADAS, Live traffic analysis etc. through image understanding improves efficiency. Image understanding requires input data with high precision which is dependent on Image resolution and location of camera. The data of interest can be thermal image or live feed coming for various sensors. Composite(CVBS) is a popular video interface capable of streaming upto HD(1920x1080) quality. Unlike high speed serial interfaces like HDMI/MIPI CSI, Analog composite video interface is a single wire standard supporting longer distances. Image understanding requires edge detection and classification for further processing. Sobel filter is one the most used edge detection filter which can be embedded into live stream. This paper proposes Zynq FPGA based system design for video surveillance with Sobel edge detection, where the input Composite video decoded (Analog CVBS input to YCbCr digital output), processed in HW and streamed to HDMI display simultaneously storing in SD memory for later processing. The HW design is scalable for resolutions from VGA to Full HD for 60fps and 4K for 24fps. The system is built on Xilinx ZC702 platform and TVP5146 to showcase the functional path.

2018-04-04
Nawaratne, R., Bandaragoda, T., Adikari, A., Alahakoon, D., Silva, D. De, Yu, X..  2017.  Incremental knowledge acquisition and self-learning for autonomous video surveillance. IECON 2017 - 43rd Annual Conference of the IEEE Industrial Electronics Society. :4790–4795.

The world is witnessing a remarkable increase in the usage of video surveillance systems. Besides fulfilling an imperative security and safety purpose, it also contributes towards operations monitoring, hazard detection and facility management in industry/smart factory settings. Most existing surveillance techniques use hand-crafted features analyzed using standard machine learning pipelines for action recognition and event detection. A key shortcoming of such techniques is the inability to learn from unlabeled video streams. The entire video stream is unlabeled when the requirement is to detect irregular, unforeseen and abnormal behaviors, anomalies. Recent developments in intelligent high-level video analysis have been successful in identifying individual elements in a video frame. However, the detection of anomalies in an entire video feed requires incremental and unsupervised machine learning. This paper presents a novel approach that incorporates high-level video analysis outcomes with incremental knowledge acquisition and self-learning for autonomous video surveillance. The proposed approach is capable of detecting changes that occur over time and separating irregularities from re-occurrences, without the prerequisite of a labeled dataset. We demonstrate the proposed approach using a benchmark video dataset and the results confirm its validity and usability for autonomous video surveillance.

2017-11-20
Li, H., He, Y., Sun, L., Cheng, X., Yu, J..  2016.  Side-channel information leakage of encrypted video stream in video surveillance systems. IEEE INFOCOM 2016 - The 35th Annual IEEE International Conference on Computer Communications. :1–9.

Video surveillance has been widely adopted to ensure home security in recent years. Most video encoding standards such as H.264 and MPEG-4 compress the temporal redundancy in a video stream using difference coding, which only encodes the residual image between a frame and its reference frame. Difference coding can efficiently compress a video stream, but it causes side-channel information leakage even though the video stream is encrypted, as reported in this paper. Particularly, we observe that the traffic patterns of an encrypted video stream are different when a user conducts different basic activities of daily living, which must be kept private from third parties as obliged by HIPAA regulations. We also observe that by exploiting this side-channel information leakage, attackers can readily infer a user's basic activities of daily living based on only the traffic size data of an encrypted video stream. We validate such an attack using two off-the-shelf cameras, and the results indicate that the user's basic activities of daily living can be recognized with a high accuracy.

2017-05-18
Dimopoulos, Giorgos, Leontiadis, Ilias, Barlet-Ros, Pere, Papagiannaki, Konstantina.  2016.  Measuring Video QoE from Encrypted Traffic. Proceedings of the 2016 Internet Measurement Conference. :513–526.

Tracking and maintaining satisfactory QoE for video streaming services is becoming a greater challenge for mobile network operators than ever before. Downloading and watching video content on mobile devices is currently a growing trend among users, that is causing a demand for higher bandwidth and better provisioning throughout the network infrastructure. At the same time, popular demand for privacy has led many online streaming services to adopt end-to-end encryption, leaving providers with only a handful of indicators for identifying QoE issues. In order to address these challenges, we propose a novel methodology for detecting video streaming QoE issues from encrypted traffic. We develop predictive models for detecting different levels of QoE degradation that is caused by three key influence factors, i.e. stalling, the average video quality and the quality variations. The models are then evaluated on the production network of a large scale mobile operator, where we show that despite encryption our methodology is able to accurately detect QoE problems with 72\textbackslash%-92\textbackslash% accuracy, while even higher performance is achieved when dealing with cleartext traffic

2017-02-14
V. Mishra, K. Choudhary, S. Maheshwari.  2015.  "Video Streaming Using Dual-Channel Dual-Path Routing to Prevent Packet Copy Attack". 2015 IEEE International Conference on Computational Intelligence Communication Technology. :645-650.

The video streaming between the sender and the receiver involves multiple unsecured hops where the video data can be illegally copied if the nodes run malicious forwarding logic. This paper introduces a novel method to stream video data through dual channels using dual data paths. The frames' pixels are also scrambled. The video frames are divided into two frame streams. At the receiver side video is re-constructed and played for a limited time period. As soon as small chunk of merged video is played, it is deleted from video buffer. The approach has been tried to formalize and initial simulation has been done over MATLAB. Preliminary results are optimistic and a refined approach may lead to a formal designing of network layer routing protocol with corrections in transport layer.

2015-05-05
Coatsworth, M., Tran, J., Ferworn, A..  2014.  A hybrid lossless and lossy compression scheme for streaming RGB-D data in real time. Safety, Security, and Rescue Robotics (SSRR), 2014 IEEE International Symposium on. :1-6.

Mobile and aerial robots used in urban search and rescue (USAR) operations have shown the potential for allowing us to explore, survey and assess collapsed structures effectively at a safe distance. RGB-D cameras, such as the Microsoft Kinect, allow us to capture 3D depth data in addition to RGB images, providing a significantly richer user experience than flat video, which may provide improved situational awareness for first responders. However, the richer data comes at a higher cost in terms of data throughput and computing power requirements. In this paper we consider the problem of live streaming RGB-D data over wired and wireless communication channels, using low-power, embedded computing equipment. When assessing a disaster environment, a range camera is typically mounted on a ground or aerial robot along with the onboard computer system. Ground robots can use both wireless radio and tethers for communications, whereas aerial robots can only use wireless communication. We propose a hybrid lossless and lossy streaming compression format designed specifically for RGB-D data and investigate the feasibility and usefulness of live-streaming this data in disaster situations.
 

Bronzino, F., Chao Han, Yang Chen, Nagaraja, K., Xiaowei Yang, Seskar, I., Raychaudhuri, D..  2014.  In-Network Compute Extensions for Rate-Adaptive Content Delivery in Mobile Networks. Network Protocols (ICNP), 2014 IEEE 22nd International Conference on. :511-517.

Traffic from mobile wireless networks has been growing at a fast pace in recent years and is expected to surpass wired traffic very soon. Service providers face significant challenges at such scales including providing seamless mobility, efficient data delivery, security, and provisioning capacity at the wireless edge. In the Mobility First project, we have been exploring clean slate enhancements to the network protocols that can inherently provide support for at-scale mobility and trustworthiness in the Internet. An extensible data plane using pluggable compute-layer services is a key component of this architecture. We believe these extensions can be used to implement in-network services to enhance mobile end-user experience by either off-loading work and/or traffic from mobile devices, or by enabling en-route service-adaptation through context-awareness (e.g., Knowing contemporary access bandwidth). In this work we present details of the architectural support for in-network services within Mobility First, and propose protocol and service-API extensions to flexibly address these pluggable services from end-points. As a demonstrative example, we implement an in network service that does rate adaptation when delivering video streams to mobile devices that experience variable connection quality. We present details of our deployment and evaluation of the non-IP protocols along with compute-layer extensions on the GENI test bed, where we used a set of programmable nodes across 7 distributed sites to configure a Mobility First network with hosts, routers, and in-network compute services.

2015-05-01
Hammoud, R.I., Sahin, C.S., Blasch, E.P., Rhodes, B.J..  2014.  Multi-source Multi-modal Activity Recognition in Aerial Video Surveillance. Computer Vision and Pattern Recognition Workshops (CVPRW), 2014 IEEE Conference on. :237-244.

Recognizing activities in wide aerial/overhead imagery remains a challenging problem due in part to low-resolution video and cluttered scenes with a large number of moving objects. In the context of this research, we deal with two un-synchronized data sources collected in real-world operating scenarios: full-motion videos (FMV) and analyst call-outs (ACO) in the form of chat messages (voice-to-text) made by a human watching the streamed FMV from an aerial platform. We present a multi-source multi-modal activity/event recognition system for surveillance applications, consisting of: (1) detecting and tracking multiple dynamic targets from a moving platform, (2) representing FMV target tracks and chat messages as graphs of attributes, (3) associating FMV tracks and chat messages using a probabilistic graph-based matching approach, and (4) detecting spatial-temporal activity boundaries. We also present an activity pattern learning framework which uses the multi-source associated data as training to index a large archive of FMV videos. Finally, we describe a multi-intelligence user interface for querying an index of activities of interest (AOIs) by movement type and geo-location, and for playing-back a summary of associated text (ACO) and activity video segments of targets-of-interest (TOIs) (in both pixel and geo-coordinates). Such tools help the end-user to quickly search, browse, and prepare mission reports from multi-source data.

Chun-Rong Huang, Chung, P.-C.J., Di-Kai Yang, Hsing-Cheng Chen, Guan-Jie Huang.  2014.  Maximum a Posteriori Probability Estimation for Online Surveillance Video Synopsis. Circuits and Systems for Video Technology, IEEE Transactions on. 24:1417-1429.

To reduce human efforts in browsing long surveillance videos, synopsis videos are proposed. Traditional synopsis video generation applying optimization on video tubes is very time consuming and infeasible for real-time online generation. This dilemma significantly reduces the feasibility of synopsis video generation in practical situations. To solve this problem, the synopsis video generation problem is formulated as a maximum a posteriori probability (MAP) estimation problem in this paper, where the positions and appearing frames of video objects are chronologically rearranged in real time without the need to know their complete trajectories. Moreover, a synopsis table is employed with MAP estimation to decide the temporal locations of the incoming foreground objects in the synopsis video without needing an optimization procedure. As a result, the computational complexity of the proposed video synopsis generation method can be significantly reduced. Furthermore, as it does not require prescreening the entire video, this approach can be applied on online streaming videos.